Working with streams

Each RTMS stream follows a predictable flow: it starts when at least one RTMS session is started; continues while your app connects to the required servers, receives stream data, and maintains its connections; and ends when the stream ends. A stream can contain multiple sessions, and each session typically corresponds to an individual user.

Sessions can exist in the following states:

State        Description
INACTIVE     Default state. The session is not active yet.
INITIALIZE   The session is initializing.
STARTED      The session has started.
PAUSED       The session has been paused either by the consumer or the user.
RESUMED      The paused session has been resumed.
STOPPED      The session is being stopped by the consumer, user, or the engagement ending.
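
As a rough illustration, the states listed above could be modeled in an app as an enumeration. The state names come from this page; the `SessionState` class, its string values, and the `is_streaming` helper are assumptions of this sketch, including the guess that media is only expected in the STARTED and RESUMED states.

```python
from enum import Enum


class SessionState(Enum):
    """RTMS session states, named as in the list above (string values are illustrative)."""
    INACTIVE = "INACTIVE"      # default; the session is not active yet
    INITIALIZE = "INITIALIZE"  # the session is initializing
    STARTED = "STARTED"        # the session has started
    PAUSED = "PAUSED"          # paused by the consumer or the user
    RESUMED = "RESUMED"        # a paused session has been resumed
    STOPPED = "STOPPED"        # stopped by the consumer, user, or engagement ending


def is_streaming(state: SessionState) -> bool:
    """Assumption of this sketch: media is expected only while STARTED or RESUMED."""
    return state in (SessionState.STARTED, SessionState.RESUMED)
```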

Prerequisites

To process RTMS data, your app needs the appropriate event subscriptions, scopes, and, optionally, the REST APIs for starting and managing sessions. For more information, see Add RTMS features to your app.

Once your app is configured, RTMS sessions will follow these steps:

Step 1: RTMS is started

RTMS provides role-based controls to manage the streaming experience. You can set apps to auto-start streams when a user in a queue joins a voice engagement or to start manually by using the REST API.

The streaming experience is also controlled by:

  • Admins, who can manage the auto-start settings for queues through the ZCC Admin portal
  • Users, who can manually start, pause, and end their own in-call streams

Admins can manage the auto-start settings for Queues through the Zoom Contact Center (ZCC) Admin portal. With auto-start enabled, RTMS will start streams when a queue user joins or initiates a voice engagement.

Apps can also manually control RTMS streams with button controls and Zoom JavaScript SDK APIs. User interaction is needed to let an application know that an engagement has started and RTMS needs to be started or stopped.

Note

Starting RTMS depends on being in an active engagement. To access engagement context manually, use the Contact Center events available in the Zoom Apps SDK.

Endpoint: send API request to Update engagement Real-Time Media Streams (RTMS) app status

Method: PUT request

Authentication: Use a valid Zoom access token. Payloads must be securely signed.

Example request

{
  "action": "start",
  "settings": {
    "client_id": "YOUR_CLIENT_ID"
  }
}
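
A minimal sketch of how the request above might be assembled in Python with only the standard library. The endpoint URL is deliberately left as a parameter because it is not given on this page; take it from the API reference for Update engagement RTMS app status. The `build_rtms_status_request` name is hypothetical, and sending the access token as a Bearer token in the Authorization header is an assumption based on how Zoom REST APIs are generally called.

```python
import json
import urllib.request


def build_rtms_status_request(endpoint_url: str, access_token: str,
                              client_id: str, action: str = "start") -> urllib.request.Request:
    """Build (but do not send) the PUT request that starts or stops RTMS.

    endpoint_url is supplied by the caller; it is not defined in this document.
    """
    body = {"action": action, "settings": {"client_id": client_id}}
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            # Assumption: the Zoom access token is sent as a Bearer token.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

The returned object can be sent with `urllib.request.urlopen(request)` once the real endpoint URL and a valid token are available.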

Admin-level apps can manually start and stop RTMS streams with a REST API request and contact center webhook events. This enables backend or external systems to start an RTMS session without consumer or agent interaction. For example, RTMS could be started from your company's scheduling software rather than from the active engagement.

Note

To access engagement context dynamically, an application must be an admin-level app so that it can subscribe to the contact_center.engagement_started and contact_center.engagement_ended events.

Admin-level apps can't be surface apps.

Endpoint: send API request to Update engagement Real-Time Media Streams (RTMS) app status

Scopes: contact_center:update:engagement_rtms_app_status

Method: PUT request

Authentication: Use a valid Zoom access token. Payloads must be securely signed.

Step 2: App receives streaming notification

Zoom sends contact_center.voice_rtms_started webhook events when RTMS streaming starts. You can then use the information in the event to establish a signaling connection in the next step.

To receive notifications when new streams are available, create an HTTP POST handler in your web app. This handler acts as the endpoint for incoming webhook events. In your app settings, provide the URL of this endpoint. For more information, see Subscribe to RTMS started and stopped events.

After you receive an event, verify the event's signature to ensure it's from a trusted source.
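
The signature check might look like the sketch below. This page does not spell out the verification scheme, so the sketch assumes Zoom's general webhook verification: the request carries `x-zm-request-timestamp` and `x-zm-signature` headers, and the signature is `v0=` followed by the hex HMAC-SHA256 of `v0:{timestamp}:{raw body}` keyed with your app's webhook secret token. Confirm these details against the webhook documentation before relying on them; the function name is hypothetical.

```python
import hashlib
import hmac


def verify_webhook_signature(secret_token: str, timestamp: str,
                             raw_body: bytes, signature_header: str) -> bool:
    """Return True if signature_header matches the HMAC of the raw request body.

    Assumed scheme (not stated on this page): "v0=" + HMAC-SHA256 hex digest
    of b"v0:<timestamp>:<raw body>", keyed with the webhook secret token.
    """
    message = b"v0:" + timestamp.encode("utf-8") + b":" + raw_body
    digest = hmac.new(secret_token.encode("utf-8"), message, hashlib.sha256).hexdigest()
    expected = "v0=" + digest
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature_header)
```

Verify against the raw request bytes, before any JSON parsing, because re-serializing the body can change its byte representation.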

Step 3: App establishes signaling connection

The signaling connection is a WebSocket connection between your app and the RTMS server. It begins with a signed handshake and carries messages for session readiness, state updates, and stream control.

The signaling connection also provides lifecycle updates for the media connection, such as when it starts, stops, or is interrupted.

When you have the connection details and know a stream is available, you can start the connection to the signaling server.

  1. Create a signature that your app will use to securely connect to the RTMS server, using the following formula. Replace client_id and secret with your app's Client ID and Client Secret, and engagement_id and rtms_stream_id with the values from the streaming notification event.

HMACSHA256(client_id + "," + engagement_id + "," + rtms_stream_id, secret);

  2. Your app sends a signaling handshake request with the engagement_id and rtms_stream_id from the streaming notification event and the signature you just created. If the handshake is successful, your app will receive a signaling handshake response confirming the connection and containing the media server locations.

If the handshake fails, the server responds with a SIGNALING_HAND_SHAKE_RESP message containing a status code and reason.

  3. (Optional) Your app can subscribe to events for participant changes and active speaker updates.
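
The signature formula and handshake request from the steps above can be sketched in Python as follows. The comma-joined message and the HMAC-SHA256 algorithm come from this page, as does msg_type 1 for SIGNALING_HAND_SHAKE_REQ. Hex encoding of the digest, the `signature` key name, and the function names are assumptions of this sketch, and the real handshake may require further fields that are not listed here.

```python
import hashlib
import hmac
import json

SIGNALING_HAND_SHAKE_REQ = 1  # integer message type given in this document


def rtms_signature(client_id: str, engagement_id: str,
                   rtms_stream_id: str, client_secret: str) -> str:
    """HMACSHA256(client_id + "," + engagement_id + "," + rtms_stream_id, secret)."""
    message = f"{client_id},{engagement_id},{rtms_stream_id}"
    # Assumption: the digest is sent hex-encoded.
    return hmac.new(client_secret.encode("utf-8"), message.encode("utf-8"),
                    hashlib.sha256).hexdigest()


def signaling_handshake(client_id: str, client_secret: str,
                        engagement_id: str, rtms_stream_id: str) -> str:
    """Serialize a signaling handshake request as a JSON text frame."""
    return json.dumps({
        "msg_type": SIGNALING_HAND_SHAKE_REQ,
        "engagement_id": engagement_id,
        "rtms_stream_id": rtms_stream_id,
        # "signature" as the key name is an assumption of this sketch.
        "signature": rtms_signature(client_id, engagement_id,
                                    rtms_stream_id, client_secret),
    })
```

The resulting string would be sent as the first message on the signaling WebSocket, using whichever WebSocket client library your app already depends on.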

Once your app has successfully established a signaling connection, it can now establish a media connection.

Step 4: App establishes the media connection

Use one of the URLs listed in media_server.server_urls in the signaling handshake response to establish a media WebSocket connection.

Use integers for message and event types

For data type definitions, use the representative enum integers.

Example: send msg_type: 1, not msg_type: SIGNALING_HAND_SHAKE_REQ.
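
One way to keep readable names in code while still sending integers is an `IntEnum`, as in this sketch. Only the value 1 for SIGNALING_HAND_SHAKE_REQ is taken from this page; the `MsgType` class name is hypothetical, and any other members should be filled in from the data type definitions rather than guessed.

```python
import json
from enum import IntEnum


class MsgType(IntEnum):
    """Message types by wire value; only the member stated on this page is listed."""
    SIGNALING_HAND_SHAKE_REQ = 1


# IntEnum members are ints, so json.dumps writes the number, not the name.
payload = json.dumps({"msg_type": MsgType.SIGNALING_HAND_SHAKE_REQ})
```

Incoming messages can be mapped back to names with `MsgType(value)`, which raises `ValueError` for integers the enum does not define.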

  1. Your app sends a media handshake request with the engagement_id and rtms_stream_id from the streaming notification and the signature you created in Step 3 above. If the handshake is successful, your app will receive a media handshake response confirming the connection and containing information about the available media.

ZCC only supports AUDIO_MULTI_STREAMS (data_opt: 2) for audio. AUDIO_MIXED_STREAM is not supported.

If the handshake fails, the server responds with a SIGNALING_HAND_SHAKE_RESP message containing a status code and reason.

  2. Your app sends a client ready ACK message over the media connection to indicate that it is ready to receive media.
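
A hedged sketch of building the media handshake request. This page confirms that the request carries the engagement_id, rtms_stream_id, and signature, and that audio must use AUDIO_MULTI_STREAMS (data_opt: 2). The integer message type is passed in by the caller because its value is not given here, and the `media_params.audio.data_opt` nesting, the `signature` key, and the function name are hypothetical; check the media handshake reference for the actual layout.

```python
import json

AUDIO_MULTI_STREAMS = 2  # data_opt value stated in this document


def media_handshake(msg_type: int, engagement_id: str,
                    rtms_stream_id: str, signature: str) -> str:
    """Serialize a media handshake request as a JSON text frame.

    msg_type: integer for the media handshake request, taken from the
    data type definitions (not stated on this page).
    """
    return json.dumps({
        "msg_type": msg_type,
        "engagement_id": engagement_id,
        "rtms_stream_id": rtms_stream_id,
        "signature": signature,
        # Hypothetical nesting: only the data_opt value itself is confirmed.
        "media_params": {"audio": {"data_opt": AUDIO_MULTI_STREAMS}},
    })
```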

Now that the connection is made, your app can receive media data.

Step 5: App receives media data

Once the connection is established, your app receives a continuous stream of media data over the media connection.

For more information about working with media data, see Handling media data.

Step 6: App maintains connections

Throughout the session, your app must respond to keep-alive requests from the RTMS server on its open connections.

Connection maintenance is critical: If your app fails to respond to three consecutive keep-alive requests (sent every 10 seconds during idle periods), the RTMS server will terminate the connection.
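
The timing rule above applies on the server side, but an app can mirror it locally to notice a dead connection early. The sketch below is a client-side heuristic, not part of the RTMS protocol: it treats a connection as stale when nothing has arrived for three keep-alive intervals. The 10-second interval and the count of three come from this page; the `KeepAliveWatchdog` class and its methods are hypothetical.

```python
import time

KEEP_ALIVE_INTERVAL_S = 10  # keep-alive period during idle, from this document
MAX_MISSED = 3              # consecutive misses before the server disconnects


class KeepAliveWatchdog:
    """Tracks how long a connection has been silent."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so the class can be tested
        self._last_seen = clock()

    def on_activity(self) -> None:
        """Call for every message received, including keep-alive requests."""
        self._last_seen = self._clock()

    def missed_intervals(self) -> int:
        """Whole keep-alive intervals elapsed since the last received message."""
        return int((self._clock() - self._last_seen) // KEEP_ALIVE_INTERVAL_S)

    def is_stale(self) -> bool:
        """True once three or more intervals have passed with no traffic."""
        return self.missed_intervals() >= MAX_MISSED
```

An app might poll `is_stale()` from a timer and start its reconnection logic when it returns True, while still answering every keep-alive request as soon as it arrives.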

If a connection has been interrupted, see Failover and reconnection for more information on reestablishing the connection.

Step 7: Stream ends

The RTMS stream ends when:

  • The call concludes
  • The user manually stops streaming
  • Connection issues cause termination
  • App users leave the engagement

Your app receives a contact_center.voice_rtms_stopped notification indicating the stream has ended.

Failover and reconnection

The connection between the RTMS server and the app can unexpectedly be interrupted.

Lost connection between the RTMS server and the app

The connection can be lost if either the RTMS server or the app has issues.

RTMS server issue

If the RTMS server goes down, a new RTMS server takes over and sends a new contact_center.voice_rtms_started event, which includes the engagement_id and rtms_stream_id. Your app must then establish new signaling and media connections. For more information, see App establishes signaling connection and App establishes the media connection.

App issue

If an app encounters downtime or network issues, it must establish new socket connections to the RTMS server.

  • When the signaling connection is down, the RTMS server interrupts both the signaling and media sockets with the app. The RTMS server sends a contact_center.voice_rtms_interrupted event, and the app must then re-establish both the signaling and media connections. For more information, see App establishes signaling connection and App establishes the media connection.

  • When only a media (data) socket connection is down and the signaling socket remains active, the RTMS server sends an event update message with event type MEDIA_CONNECTION_INTERRUPTED through the signaling connection, telling the app which media connection is disconnected. The app must then re-establish only that media connection. For more information, see App establishes the media connection.
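
The recovery rules in this section can be summarized as a small decision function. The event names and the required actions come from this page; the `Reconnect` enum and `reconnect_plan` function are hypothetical. For readability the sketch matches MEDIA_CONNECTION_INTERRUPTED by name, although on the wire event types are integers whose values are not given here.

```python
from enum import Enum


class Reconnect(Enum):
    NONE = "none"
    MEDIA_ONLY = "media_only"
    SIGNALING_AND_MEDIA = "signaling_and_media"


def reconnect_plan(event: str) -> Reconnect:
    """Map a received event name to the connections the app must re-establish."""
    if event in ("contact_center.voice_rtms_started",
                 "contact_center.voice_rtms_interrupted"):
        # New RTMS server, or signaling loss: rebuild both connections.
        return Reconnect.SIGNALING_AND_MEDIA
    if event == "MEDIA_CONNECTION_INTERRUPTED":
        # Signaling is still up: rebuild only the affected media connection.
        return Reconnect.MEDIA_ONLY
    return Reconnect.NONE
```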