Working with streams
Each RTMS stream follows a predictable flow: it starts when at least one RTMS session starts; continues while your app connects to the required servers, receives stream data, and maintains its connections; and ends when the stream ends. A stream can contain multiple sessions, and each session typically corresponds to an individual user.
Sessions can exist in the following states:
| State | Description |
|---|---|
| INACTIVE | Default state. The session is not active yet. |
| INITIALIZE | The session is initializing. |
| STARTED | The session has started. |
| PAUSED | The session has been paused either by the consumer or the user. |
| RESUMED | The paused session has been resumed. |
| STOPPED | The session is being stopped by the consumer, user, or the engagement ending. |
Prerequisites
To process RTMS data, your app needs the appropriate event subscriptions, scopes, and, optionally, the REST APIs for starting and managing sessions. For more information, see Add RTMS features to your app.
Once your app is configured, RTMS sessions will follow these steps:
Step 1: RTMS is started
RTMS provides role-based controls to manage the streaming experience. You can set apps to auto-start streams when a user in a queue joins a voice engagement or to start manually by using the REST API.
The streaming experience is also controlled by the following roles:
- Admins can manage the auto-start settings for queues through the Zoom Contact Center (ZCC) Admin portal.
- Users can manually start, pause, and end their own in-call streams.
Admins can manage the auto-start settings for Queues through the Zoom Contact Center (ZCC) Admin portal. With auto-start enabled, RTMS will start streams when a queue user joins or initiates a voice engagement.
Apps can also manually control RTMS streams with button controls and Zoom JavaScript SDK APIs. User interaction is needed to let an application know that an engagement has started and RTMS needs to be started or stopped.
Note
Starting RTMS depends on an active engagement. Access the engagement context by using the Contact Center events available in the Zoom Apps SDK.
Endpoint: send API request to Update engagement Real-Time Media Streams (RTMS) app status
Method: PUT request
Authentication: Use a valid Zoom access token. Payloads must be securely signed.
Example request
{
  "action": "start",
  "settings": {
    "client_id": "YOUR_CLIENT_ID"
  }
}
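As a minimal sketch, the request above could be assembled in Python as follows. The URL is a placeholder, not the real endpoint path (take that from the API reference for Update engagement RTMS app status), and the function name and the Bearer authorization header are assumptions for illustration. The request is built but not sent.

```python
import json
import urllib.request


def build_rtms_status_request(url, access_token, client_id, action="start"):
    """Build (but do not send) the PUT request that updates RTMS app status."""
    body = {"action": action, "settings": {"client_id": client_id}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            # Assumption: the access token is passed as a Bearer token.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder URL: substitute the endpoint from the API reference.
req = build_rtms_status_request(
    "https://example.com/rtms-app-status", "ACCESS_TOKEN", "YOUR_CLIENT_ID"
)
# urllib.request.urlopen(req) would send the request.
```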
Admin-level apps can manually start and stop RTMS streams with a REST API request and contact center webhook events. This enables backend or external systems to start an RTMS session without consumer or agent interaction. For example, RTMS could be started from your company's scheduling software rather than from the active engagement.
Note
To access engagement context dynamically, an application must be an admin-level app so that it can subscribe to the contact_center.engagement_started and contact_center.engagement_ended events. Admin-level apps can't be surface apps.
Endpoint: send API request to Update engagement Real-Time Media Streams (RTMS) app status
Scopes: contact_center:update:engagement_rtms_app_status
Method: PUT request
Authentication: Use a valid Zoom access token. Payloads must be securely signed.
Step 2: App receives streaming notification
Zoom sends contact_center.voice_rtms_started webhook events when RTMS streaming starts. You can then use the information in the event to establish a signaling connection in the next step.
To receive notifications when new streams are available, create an HTTP POST handler in your web app. This handler acts as the endpoint for incoming webhook events. In your app settings, provide the URL of this endpoint. For more information, see Subscribe to RTMS started and stopped events.
After you receive an event, verify the event's signature to ensure it's from a trusted source.
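The signature check might look like the sketch below. It assumes Zoom's general webhook verification scheme: an HMAC-SHA256 over `v0:{timestamp}:{body}` keyed with your app's secret token, compared against the `x-zm-signature` header, with the timestamp taken from `x-zm-request-timestamp`. Confirm these details against the webhook documentation before relying on them. The function name is hypothetical.

```python
import hashlib
import hmac


def is_valid_webhook(secret_token: str, timestamp: str, raw_body: bytes, signature: str) -> bool:
    """Return True if the signature header matches the request body.

    Assumed scheme: "v0=" + hex(HMAC-SHA256(secret_token, "v0:{timestamp}:{body}")).
    """
    message = b"v0:" + timestamp.encode("utf-8") + b":" + raw_body
    digest = hmac.new(secret_token.encode("utf-8"), message, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest("v0=" + digest, signature)
```

Pass the raw request bytes rather than a re-serialized body, so that whitespace differences do not change the hash.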
Step 3: App establishes signaling connection
The signaling connection is a WebSocket connection between your app and the RTMS server. It begins with a signed handshake and carries messages for session readiness, state updates, and stream control.
The signal connection provides lifecycle updates for the media connection, such as when it starts, stops, or encounters an interruption event.
When you have the connection details and know a stream is available, you can start the connection to the signaling server.
- Create a signature that your app will use to securely connect to the RTMS server. Compute it as follows, replacing client_id and secret with your app's Client ID and Client Secret, and engagement_id and rtms_stream_id with the engagement_id and rtms_stream_id from the streaming notification event.
HMACSHA256(client_id + "," + engagement_id + "," + rtms_stream_id, secret);
- Your app sends a signaling handshake request with the engagement_id and rtms_stream_id from the streaming notification event and the signature you just created. If the handshake is successful, your app will receive a signaling handshake response confirming the connection and containing the media server locations.
If the handshake fails, the server responds with a SIGNALING_HAND_SHAKE_RESP message containing a status code and reason.
- (Optional) Your app can Subscribe to events for participant changes and active speaker updates.
Once your app has successfully established a signaling connection, it can now establish a media connection.
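The signature and handshake request from the steps above can be sketched in Python. Hex encoding of the HMAC digest is an assumption, as are the function names and the exact set of fields in the handshake message. Only msg_type, engagement_id, rtms_stream_id, and the signature are named in this guide, so check the message reference for any additional required fields.

```python
import hashlib
import hmac
import json

SIGNALING_HAND_SHAKE_REQ = 1  # integer value shown in the note in Step 4


def rtms_signature(client_id: str, engagement_id: str, rtms_stream_id: str, secret: str) -> str:
    """HMACSHA256(client_id + "," + engagement_id + "," + rtms_stream_id, secret)."""
    message = f"{client_id},{engagement_id},{rtms_stream_id}"
    # Assumption: the digest is sent hex-encoded.
    return hmac.new(secret.encode("utf-8"), message.encode("utf-8"), hashlib.sha256).hexdigest()


def signaling_handshake(client_id: str, secret: str, engagement_id: str, rtms_stream_id: str) -> str:
    """Serialize a signaling handshake request (field set is an assumption)."""
    return json.dumps({
        "msg_type": SIGNALING_HAND_SHAKE_REQ,
        "engagement_id": engagement_id,
        "rtms_stream_id": rtms_stream_id,
        "signature": rtms_signature(client_id, engagement_id, rtms_stream_id, secret),
    })
```

The returned string is what the app would send as the first text frame after opening the WebSocket to the signaling server.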
Step 4: App establishes the media connection
Use one of the URLs listed in media_server.server_urls in the signaling handshake response to establish a media WebSocket connection.
Use integers for message and event types
For message and event types, send the enum's integer value, not its name.
Example: Send msg_type: 1 ✓ not msg_type: SIGNALING_HAND_SHAKE_REQ ✗
- Your app sends a media handshake request with the engagement_id and rtms_stream_id from the streaming notification and the signature you created in Step 3 above. If the handshake is successful, your app will receive a media handshake response confirming the connection and containing information about the available media.
ZCC only supports AUDIO_MULTI_STREAMS (data_opt: 2) for audio. AUDIO_MIXED_STREAM is not supported.
If the handshake fails, the server responds with a SIGNALING_HAND_SHAKE_RESP message containing a status code and reason.
- Your app sends a client ready ACK message to the RTMS server media connection to indicate readiness to receive media.
- Your app sends a client ready ACK message to the RTMS server media connection to indicate readiness to receive media.
Now that the connection is made, your app can receive media data.
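The two media-connection messages might be built as in the sketch below, using integers for every type field. The msg_type values 3 and 7, the media_params nesting, and the function names are assumptions to verify against the data type definitions. Only data_opt: 2 for AUDIO_MULTI_STREAMS comes from this guide.

```python
import json
from enum import IntEnum


class MsgType(IntEnum):
    DATA_HAND_SHAKE_REQ = 3  # assumed value; confirm in the data type definitions
    CLIENT_READY_ACK = 7     # assumed value; confirm in the data type definitions


AUDIO_MULTI_STREAMS = 2  # data_opt value stated in this guide


def media_handshake(engagement_id: str, rtms_stream_id: str, signature: str) -> str:
    """Serialize a media handshake request that asks for multi-stream audio."""
    return json.dumps({
        "msg_type": int(MsgType.DATA_HAND_SHAKE_REQ),  # send the integer, not the name
        "engagement_id": engagement_id,
        "rtms_stream_id": rtms_stream_id,
        "signature": signature,
        # Assumed layout for the requested media options.
        "media_params": {"audio": {"data_opt": AUDIO_MULTI_STREAMS}},
    })


def client_ready_ack(rtms_stream_id: str) -> str:
    """Serialize the client ready ACK sent after a successful handshake."""
    return json.dumps({
        "msg_type": int(MsgType.CLIENT_READY_ACK),
        "rtms_stream_id": rtms_stream_id,
    })
```

Keeping the names in an IntEnum lets the code stay readable while the wire format carries only integers.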
Step 5: App receives media data
Once the connection is established, your app receives continuous streams of:
- Audio data from engagement participants
For more information about working with media data, see Handling media data.
Step 6: App maintains connections
Throughout the session, your app must:
- Respond to keep-alive requests to maintain stable connections (sent every 10 seconds when no data is flowing)
- Monitor session state updates for pauses, resumes, or interruptions
- Handle stream state changes for connection issues or termination
Connection maintenance is critical: If your app fails to respond to three consecutive keep-alive requests (sent every 10 seconds during idle periods), the RTMS server will terminate the connection.
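A keep-alive responder can be a small function that the WebSocket message loop calls for every incoming text frame. In this sketch the msg_type values 12 and 13 and the echoed timestamp field are assumptions, so confirm them against the message reference. The timing constants restate the figures given above.

```python
import json
from typing import Optional

KEEP_ALIVE_REQ = 12   # assumed value; confirm in the data type definitions
KEEP_ALIVE_RESP = 13  # assumed value; confirm in the data type definitions

KEEP_ALIVE_INTERVAL_S = 10   # from this guide: sent every 10 seconds when idle
MAX_MISSED_KEEP_ALIVES = 3   # from this guide: three misses end the connection
# Roughly how long an unresponsive app has before the server disconnects it.
IDLE_BUDGET_S = KEEP_ALIVE_INTERVAL_S * MAX_MISSED_KEEP_ALIVES


def keep_alive_reply(raw_message: str) -> Optional[str]:
    """Return the reply for a keep-alive request, or None for any other message."""
    message = json.loads(raw_message)
    if message.get("msg_type") != KEEP_ALIVE_REQ:
        return None
    # Assumption: the response echoes the request's timestamp.
    return json.dumps({"msg_type": KEEP_ALIVE_RESP, "timestamp": message.get("timestamp")})
```

Replying from the same loop that reads messages keeps the response off any slow processing path, which helps avoid missed keep-alives while audio is being handled elsewhere.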
If a connection has been interrupted, see Failover and reconnection for more information on reestablishing the connection.
Step 7: Stream ends
The RTMS stream ends when:
- The call concludes
- The user manually stops streaming
- Connection issues cause termination
- App users leave the engagement
Your app receives a contact_center.voice_rtms_stopped notification indicating the stream has ended.
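One way to track stream lifetimes is a small registry updated from the started and stopped webhooks, sketched below. The envelope shape (an event name plus a payload object carrying engagement_id and rtms_stream_id) is an assumption based on the fields this guide mentions, and the handler and registry names are hypothetical.

```python
from typing import Dict

# Streams the app currently considers live, keyed by rtms_stream_id.
active_streams: Dict[str, dict] = {}


def handle_rtms_webhook(event: dict) -> None:
    """Add a stream on the started event and remove it on the stopped event."""
    name = event.get("event")
    payload = event.get("payload", {})  # assumed envelope field
    stream_id = payload.get("rtms_stream_id")
    if name == "contact_center.voice_rtms_started":
        active_streams[stream_id] = {"engagement_id": payload.get("engagement_id")}
    elif name == "contact_center.voice_rtms_stopped":
        # pop with a default so a duplicate stopped event is harmless.
        active_streams.pop(stream_id, None)
```

A real handler would also close the signaling and media sockets for the stream when the stopped event arrives.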
Failover and reconnection
The connection between the RTMS server and the app can unexpectedly be interrupted.
Lost connection between the RTMS server and the app
The connection between the RTMS server and the app can be lost if either the RTMS server or the app has issues.
RTMS server issue
If the RTMS server goes down, a new RTMS server will send a new contact_center.voice_rtms_started event, which includes the engagement_id and rtms_stream_id. Your app must then establish new connections. For more information, see App establishes the signaling connection and App establishes the media connection.
App issue
If an app encounters downtime or network issues, it must establish new socket connections to the RTMS server.
- When the signal connection is down, the RTMS server will interrupt both the signal and data sockets with the app. The RTMS server will send a contact_center.voice_rtms_interrupted event, and the app must then re-establish the signaling and media connections. For more information, see App establishes the signaling connection and App establishes the media connection.
- When only the data socket connection(s) is down and the signaling socket remains active, the RTMS server will send an event update with event type MEDIA_CONNECTION_INTERRUPTED through the signaling connection to notify the app which media connection is disconnected. The app must then re-establish only the corresponding media connection. For more information, see App establishes the media connection.
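The recovery rules in this section reduce to a small decision function, sketched here. The function name and the string labels for the two connection kinds are hypothetical. The notification names are the ones used above.

```python
from typing import Tuple


def connections_to_rebuild(notification: str) -> Tuple[str, ...]:
    """Map a failover notification to the connections the app must re-establish."""
    if notification in (
        "contact_center.voice_rtms_started",      # a new RTMS server took over
        "contact_center.voice_rtms_interrupted",  # the signaling connection went down
    ):
        return ("signaling", "media")
    if notification == "MEDIA_CONNECTION_INTERRUPTED":
        # The signaling socket is still up; only the media socket needs rebuilding.
        return ("media",)
    return ()
```

For example, `connections_to_rebuild("MEDIA_CONNECTION_INTERRUPTED")` returns only the media connection, so the app can keep using its existing signaling socket.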