# Streaming Endpoints

Streaming sends live output from a model to the client as a stream of server-sent events (SSE).
It works with any Python object that implements the iterator or generator protocol.

Each value the generator or iterator yields is sent downstream as it is produced, in a response with the `text/event-stream` Content-Type.
To send structured data, serialize each value to JSON and decode it on the client side.
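As a minimal sketch of the JSON case, the generator below serializes each chunk with `json.dumps` before yielding it. The `prompt` parameter and the payload fields `index` and `token` are hypothetical, chosen only for illustration; they are not part of any Cerebrium API.

```python
import json


def run(prompt: str):
    # Hypothetical payload: split the prompt into words and stream one per chunk.
    for index, token in enumerate(prompt.split()):
        # Serialize each chunk so the client can decode it with json.loads.
        yield json.dumps({"index": index, "token": token})


if __name__ == "__main__":
    # Local check: decode each chunk the same way a client would.
    for chunk in run("hello streaming world"):
        print(json.loads(chunk))
```

Yielding one self-contained JSON document per chunk means the client can decode every event independently, without buffering the whole stream.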

A minimal example:

```python
import time

def run(upper_range: int):
    for i in range(upper_range):
        yield f"Number {i} "
        time.sleep(1)
```

Deploy this snippet and call the endpoint. SSE events appear progressively, one per second:

```bash
curl -X POST https://api.aws.us-east-1.cerebrium.ai/v4/<YOUR-PROJECT-ID>/2-streaming-endpoint/run \
       -H 'Content-Type: application/json' \
       -H 'Accept: text/event-stream' \
       -H 'Authorization: Bearer <YOUR-JWT-TOKEN>' \
       --data '{"upper_range": 3}'
```

With response headers included (curl's `-i` flag), the output should look like this:

```text
HTTP/1.1 200 OK
cache-control: no-cache
content-encoding: gzip
content-type: text/event-stream; charset=utf-8
date: Tue, 28 May 2024 21:12:46 GMT
server: envoy
transfer-encoding: chunked
vary: Accept-Encoding
x-envoy-upstream-service-time: 198995
x-request-id: e6b55132-32af-96d7-a064-8915c4a42452

data: Number 0
...
```

The remaining events arrive one per second:

```
...
data: Number 1

data: Number 2
```
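The same stream can be read from Python. The sketch below is a small, standard-library-only parser that pulls the `data:` payloads out of a sequence of text lines; `iter_sse_data` is a hypothetical helper name, not part of any Cerebrium SDK. It handles only `data:` fields and ignores other SSE fields and comments.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the payload of each SSE event from an iterable of text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format drops a single space after the colon, if present.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
        elif line == "" and buffer:
            # A blank line ends the event; multiple data lines join with "\n".
            yield "\n".join(buffer)
            buffer = []
    if buffer:
        # Leniency of this sketch: flush a final event with no trailing blank line.
        yield "\n".join(buffer)


if __name__ == "__main__":
    # Lines shaped like the curl output above, without any network call.
    sample = ["data: Number 0 ", "", "data: Number 1 ", "", "data: Number 2 "]
    for payload in iter_sse_data(sample):
        print(repr(payload))
```

Assuming the `requests` library is installed, the lines from `requests.post(url, headers=..., json=..., stream=True).iter_lines(decode_unicode=True)` could be passed to this helper to process events as they arrive.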

Postman also supports SSE streams natively.

<img src="https://mintcdn.com/cerebrium/w6QrtZunT-SzaBze/images/cortex/streaming-postman.png?fit=max&auto=format&n=w6QrtZunT-SzaBze&q=85&s=8718df1936d620d2c7be03b5d9edb0a4" alt="Streaming" width="2600" height="1668" data-path="images/cortex/streaming-postman.png" />

For a Falcon-7B streaming example, see the [streaming endpoint example](https://github.com/CerebriumAI/examples/tree/master/5-large-language-models/2-streaming-endpoint).
