# Get Contact Center audio and transcripts from Realtime Media Streams using WebSockets


In this guide, you'll build a server that uses WebSockets to get Zoom Contact Center audio and transcripts from Realtime Media Streams (RTMS).

The complete [sample app](https://github.com/zoom/RTMS-ZCC-Sample) is available on GitHub.

The server will:

1. Listen for incoming webhook events `contact_center.voice_rtms_started` and `contact_center.voice_rtms_stopped`
2. Generate a signature for handshake requests
3. Connect to the WebSocket endpoint for the engagement
4. Receive audio and transcripts in real time

**Prerequisites**

1. Create an app in the [Zoom Marketplace](https://marketplace.zoom.us/develop/create).
2. Add [Realtime Media Streams features](/docs/rtms/contact-center/add-features/) to the app.

## Get started

First, create a new Node.js project and install [express](https://expressjs.com/), [dotenv](https://www.npmjs.com/package/dotenv), and [ws](https://www.npmjs.com/package/ws) as dependencies.

```bash
npm init -y
```

```bash
npm install express dotenv ws
```

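Next, we'll create a basic server on `localhost:3000`. The code in this guide uses ES module `import` syntax, so add `"type": "module"` to the `package.json` that `npm init` generated. Then create a new file named `index.js` and add the following code to it: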

```javascript
import express from "express";
import dotenv from "dotenv";
import WebSocket from "ws";

// Load environment variables from .env
dotenv.config();

const app = express();

// Enable JSON body parsing
app.use(express.json());

// Basic root route for testing
app.get("/", (req, res) => {
    res.send("Zoom RTMS Server is up and running.");
});

// Listen on the port from .env, falling back to 3000
const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
    console.log(`Server is listening on http://localhost:${PORT}`);
});
```

### Set up environment variables

Create a `.env` file in your project root. Add the following environment variables:

```bash
ZOOM_APP_CLIENT_ID=your_client_id
ZOOM_APP_CLIENT_SECRET=your_client_secret
PORT=3000
```

Get your `ZOOM_APP_CLIENT_ID` and `ZOOM_APP_CLIENT_SECRET` from a [Zoom app with Realtime Media Streams features](/docs/rtms/contact-center/add-features/) (webhook subscriptions and scopes are required).
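A missing credential would otherwise only surface later as a failed handshake, so it can help to check the variables at startup. The following is a minimal sketch, not part of the sample app: `readConfig` is a hypothetical helper name, and it takes the environment as a parameter so it can be exercised without a real `.env` file.

```javascript
// Hypothetical helper: validate the variables this guide expects.
function readConfig(env = process.env) {
    const required = ["ZOOM_APP_CLIENT_ID", "ZOOM_APP_CLIENT_SECRET"];
    const missing = required.filter((name) => !env[name]);
    if (missing.length > 0) {
        throw new Error(
            `Missing environment variables: ${missing.join(", ")}`,
        );
    }
    return {
        clientId: env.ZOOM_APP_CLIENT_ID,
        clientSecret: env.ZOOM_APP_CLIENT_SECRET,
        // PORT is optional; fall back to the port used in this guide.
        port: Number(env.PORT) || 3000,
    };
}
```

Calling `readConfig()` right after `dotenv.config()` would report a typo in `.env` before the first webhook arrives.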

## Build the webhook receiver

When an RTMS session starts or stops, your app receives a webhook event with one of the following payloads.

When a stream starts: `contact_center.voice_rtms_started`

```json
{
    "event": "contact_center.voice_rtms_started",
    "event_ts": 1626230691572,
    "payload": {
        "engagement_id": "4444AAAiAAAAAiAiAiiAii==",
        "rtms_stream_id": "609340fb2a7946909659956c8aa9250c",
        "server_urls": "wss://127.0.0.1:443"
    }
}
```

When a stream stops: `contact_center.voice_rtms_stopped`

```json
{
    "event": "contact_center.voice_rtms_stopped",
    "event_ts": 1626230691572,
    "payload": {
        "engagement_id": "4444AAAiAAAAAiAiAiiAii==",
        "rtms_stream_id": "xxxxxxxxxxc",
        "stop_reason": 6
    }
}
```

To connect to the stream, our app needs the `engagement_id`, `rtms_stream_id`, and `server_urls` from the payload.
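As an illustration of pulling those three fields out of the webhook body, here is a minimal sketch. `parseStartedPayload` is a hypothetical helper, and it assumes `server_urls` is a single URL string, as in the sample payload above.

```javascript
// Hypothetical helper: extract the connection details from a
// contact_center.voice_rtms_started webhook body.
function parseStartedPayload(body) {
    const payload = (body && body.payload) || {};
    const fields = ["engagement_id", "rtms_stream_id", "server_urls"];
    for (const field of fields) {
        if (typeof payload[field] !== "string" || payload[field] === "") {
            throw new Error(`Webhook payload is missing ${field}`);
        }
    }
    return {
        engagementId: payload.engagement_id,
        rtmsStreamId: payload.rtms_stream_id,
        // Assumption: a single URL string, as in the sample payload.
        serverUrl: payload.server_urls,
    };
}
```

Validating up front means a malformed event fails with a clear message instead of an opaque WebSocket error later.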

To handle these webhook events, we'll build a simple webhook receiver: a `/webhook` route that receives the POST requests from our event subscriptions.

Add the following code to `index.js`.

```javascript
// JSON body parsing was already enabled with app.use(express.json()) above
app.post("/webhook", (req, res) => {
    const { event, payload } = req.body;
    console.log("Webhook received:", event);
    console.log("Payload:", JSON.stringify(payload, null, 2));
    res.sendStatus(200);
});
```

Next, we'll handle the `contact_center.voice_rtms_started` and `contact_center.voice_rtms_stopped` events inside the `/webhook` route.

When we receive the `contact_center.voice_rtms_started` event, we extract the engagement details and open a signaling WebSocket connection, which starts the RTMS handshake.

```javascript
// Handle RTMS start event
if (event === "contact_center.voice_rtms_started") {
    const { engagement_id, rtms_stream_id, server_urls } = payload;
    console.log(`Starting RTMS for engagement ${engagement_id}`);
    // Connect to signaling WebSocket to establish RTMS connection
    connectToSignalingWebSocket(engagement_id, rtms_stream_id, server_urls, {});
}
```

```javascript
// Handle RTMS stop event
if (event === "contact_center.voice_rtms_stopped") {
    const { engagement_id } = payload;
    console.log(`Stopping RTMS for engagement ${engagement_id}`);
}
```

Put together, the code for the webhook receiver looks like this:

```javascript
app.post("/webhook", (req, res) => {
    const { event, payload } = req.body;

    // Handle RTMS start event
    if (event === "contact_center.voice_rtms_started") {
        const { engagement_id, rtms_stream_id, server_urls } = payload;
        console.log(`Starting RTMS for engagement ${engagement_id}`);
        // Connect to signaling WebSocket to establish RTMS connection
        connectToSignalingWebSocket(
            engagement_id,
            rtms_stream_id,
            server_urls,
            {},
        );
        // Handle RTMS stop event
    } else if (event === "contact_center.voice_rtms_stopped") {
        const { engagement_id } = payload;
        console.log(`Stopping RTMS for engagement ${engagement_id}`);
    } else {
        console.log("Unknown event:", event);
    }
    res.sendStatus(200);
});
```
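The handler above passes an empty object as `engagementData` and only logs when the stream stops. Because the connection functions later in this guide store `signalingWs` and `mediaWs` on that object, one option is to keep it in a registry so the stop event can close both sockets. This is a sketch under that assumption; `activeEngagements`, `startEngagement`, and `stopEngagement` are hypothetical names.

```javascript
// Hypothetical registry of active engagements, keyed by engagement_id.
const activeEngagements = new Map();

// Create (or reuse) the state object that is passed to
// connectToSignalingWebSocket() as engagementData.
function startEngagement(engagementId) {
    if (!activeEngagements.has(engagementId)) {
        activeEngagements.set(engagementId, {});
    }
    return activeEngagements.get(engagementId);
}

// Close any sockets stored on the state object and forget the engagement.
// Returns false if the engagement was not being tracked.
function stopEngagement(engagementId) {
    const engagementData = activeEngagements.get(engagementId);
    if (!engagementData) return false;
    for (const key of ["signalingWs", "mediaWs"]) {
        const socket = engagementData[key];
        if (socket && typeof socket.close === "function") socket.close();
    }
    activeEngagements.delete(engagementId);
    return true;
}
```

In the webhook route, `startEngagement(engagement_id)` would replace the `{}` argument, and `stopEngagement(engagement_id)` would run in the `contact_center.voice_rtms_stopped` branch.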

## Create the signature generator

Next, we'll create a function that generates the signature for the signaling WebSocket connection. The signature is an HMAC-SHA256 digest of the string `client_id,engagement_id,rtms_stream_id`, keyed with the client secret. It authenticates the handshake request to the signaling server.

Add the following code to `index.js`.

```javascript
// ES module import, matching the other imports at the top of index.js
import crypto from "crypto";

const CLIENT_ID = process.env.ZOOM_APP_CLIENT_ID;
const CLIENT_SECRET = process.env.ZOOM_APP_CLIENT_SECRET;

function generateSignature(engagementId, rtmsStreamId) {
    const message = `${CLIENT_ID},${engagementId},${rtmsStreamId}`;
    return crypto
        .createHmac("sha256", CLIENT_SECRET)
        .update(message)
        .digest("hex");
}
```

The helper returns the signature as a hex-encoded string.

## Connect to the signaling server with WebSockets

Next, we'll use the signature inside a `connectToSignalingWebSocket()` function to establish the signaling connection.

Add the following code to `index.js`.

```javascript
// WebSocket was already imported from "ws" at the top of index.js

function connectToSignalingWebSocket(
    engagementId,
    rtmsStreamId,
    serverUrl,
    engagementData,
) {
    const ws = new WebSocket(serverUrl);

    ws.on("open", () => {
        console.log(
            `Signaling WebSocket opened for engagement ${engagementId}`,
        );

        const signature = generateSignature(engagementId, rtmsStreamId);

        const handshake = {
            msg_type: 1, // SIGNALING_HAND_SHAKE_REQ
            protocol_version: 1,
            engagement_id: engagementId,
            rtms_stream_id: rtmsStreamId,
            sequence: 0,
            signature: signature,
        };

        console.log("Sending handshake message:", handshake);
        ws.send(JSON.stringify(handshake));
    });

    engagementData.signalingWs = ws;
    return ws;
}
```

The `sequence` field starts at 0 and increments with each message you send on the signaling channel. Store a reference to the signaling WebSocket instance, because you'll need it later to send the `CLIENT_READY_ACK` after the media handshake completes.
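One way to keep the `sequence` value in step is a small counter that stamps each outgoing signaling message. This is a sketch only: `createSequencer` is a hypothetical helper, and it assumes one counter per signaling connection.

```javascript
// Hypothetical helper: returns a function that stamps each outgoing
// signaling message with the next sequence number, starting at 0.
function createSequencer() {
    let next = 0;
    return (message) => ({ ...message, sequence: next++ });
}

// Example: one sequencer per signaling connection.
const stamp = createSequencer();
const first = stamp({ msg_type: 1 });
const second = stamp({ msg_type: 7 });
console.log(first.sequence, second.sequence); // prints "0 1"
```

The stamped object can then be passed to `JSON.stringify()` and sent with `ws.send()` as in the handshake above.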

This function sends the signature and required handshake fields to the signaling server to authorize the connection.

### Handling keep-alive requests

When the signaling WebSocket connection is active, the RTMS server periodically sends keep-alive requests to check whether the client is still connected. To maintain the connection, the client must respond promptly with a keep-alive response that echoes the timestamp from the request.

Add the following message handler inside `connectToSignalingWebSocket()`, after the `open` handler, so that `ws` is in scope.

```javascript
ws.on("message", async (data) => {
    const message = JSON.parse(data.toString());

    if (message.msg_type === 2) {
        // Handshake response: covered in the next section
    } else if (message.msg_type === 12) {
        // KEEP_ALIVE_REQ: echo the timestamp back to the server
        ws.send(
            JSON.stringify({
                msg_type: 13, // KEEP_ALIVE_RESP
                timestamp: message.timestamp,
            }),
        );
    }
});
```

Failing to respond to keep-alive pings will cause the server to close the signaling connection.

## Connect to the media server with a WebSocket

When the signaling handshake is successful, the RTMS signaling server sends a handshake response with media server URLs in `media_server.server_urls`:

```json
{
    "msg_type": 2,
    "protocol_version": 1,
    "sequence": 0,
    "status_code": 0,
    "reason": "",
    "media_server": {
        "server_urls": {
            "audio": "wss://...",
            "transcript": "wss://...",
            "all": "wss://..."
        }
    }
}
```

Next, our app opens a WebSocket connection to one of these media URLs to receive media data. In this example, the handshake sets `media_type: 32`, which requests both audio and transcripts.
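The guide does not say which of the three URLs to open, so the following sketch makes that choice explicit. `selectMediaUrl` is a hypothetical helper, and defaulting to the `all` URL is an assumption; it treats `status_code: 0` as success, as in the sample response above.

```javascript
// Hypothetical helper: pick a media server URL from the signaling
// handshake response (msg_type 2). The default key "all" is an assumption.
function selectMediaUrl(message, kind = "all") {
    if (message.msg_type !== 2) return null;
    if (message.status_code !== 0) {
        throw new Error(
            `Signaling handshake failed: ${message.reason || message.status_code}`,
        );
    }
    const urls =
        (message.media_server && message.media_server.server_urls) || {};
    return urls[kind] || null;
}
```

Inside the `msg_type === 2` branch of the signaling message handler, a non-null result could be passed as the `mediaUrl` argument of `connectToMediaWebSocket()`.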

When the media WebSocket connection opens, we build and send a handshake request to the media server with our engagement details and signature.

Add the following code to `index.js`.

```javascript
function connectToMediaWebSocket(
    mediaUrl,
    engagementId,
    rtmsStreamId,
    signalingWs,
    engagementData,
) {
    const ws = new WebSocket(mediaUrl);

    ws.on("open", () => {
        const handshake = {
            msg_type: 3, // DATA_HAND_SHAKE_REQ
            protocol_version: 1,
            engagement_id: engagementId,
            rtms_stream_id: rtmsStreamId,
            signature: generateSignature(engagementId, rtmsStreamId),
            media_type: 32,
            payload_encryption: false,
            media_params: {
                audio: {
                    content_type: 2, // RAW_AUDIO
                    sample_rate: 1, // 16kHz
                },
                transcript: {
                    content_type: 5,
                    src_language: 9,
                    enable_lid: true,
                },
            },
        };
        ws.send(JSON.stringify(handshake));
    });

    engagementData.mediaWs = ws;
    return ws;
}
```

### Send the client ready acknowledgement (ACK)

To signal that our app is ready to receive media, we send a client ready ACK message over the signaling WebSocket once the media handshake succeeds. This tells the RTMS server that our client is ready to receive a stream on the media server.

Add the following code to the media WebSocket's `message` handler inside `connectToMediaWebSocket()`.

```javascript
// Media handshake response (msg_type 4) with status_code 0 means success
if (message.msg_type === 4 && message.status_code === 0) {
    signalingWs.send(
        JSON.stringify({
            msg_type: 7, // CLIENT_READY_ACK
            rtms_stream_id: rtmsStreamId,
        }),
    );
}
```

Be sure to send this acknowledgement over the signaling channel, not the media channel. The `CLIENT_READY_ACK` completes the two-channel handshake, and the RTMS server begins delivering media packets on the media WebSocket immediately afterward.

### Receive audio and transcript data

Once the `CLIENT_READY_ACK` is sent, the RTMS server will begin streaming the actual media data, in our case audio and transcripts, through the media WebSocket.

Incoming media packets have different `msg_type` values depending on the type of media you requested in your `DATA_HAND_SHAKE_REQ`.

Each transcript chunk arrives as a message with `msg_type: 17`, and each audio chunk arrives as a message with `msg_type: 14`.
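Since the guide refers to message types by number, a lookup table can make handlers and logs easier to read. The sketch below collects the values used in this guide. `MSG_TYPE` and `msgTypeName` are hypothetical names, and the two `*_RESP` handshake names are assumed, because the guide only shows their numeric values.

```javascript
// Message types used in this guide. The numeric values come from the
// guide's code; the two handshake *_RESP names are assumptions.
const MSG_TYPE = Object.freeze({
    SIGNALING_HAND_SHAKE_REQ: 1,
    SIGNALING_HAND_SHAKE_RESP: 2, // assumed name
    DATA_HAND_SHAKE_REQ: 3,
    DATA_HAND_SHAKE_RESP: 4, // assumed name
    CLIENT_READY_ACK: 7,
    KEEP_ALIVE_REQ: 12,
    KEEP_ALIVE_RESP: 13,
    MEDIA_DATA_AUDIO: 14,
    MEDIA_DATA_TRANSCRIPT: 17,
});

// Reverse lookup for log messages, e.g. 17 -> "MEDIA_DATA_TRANSCRIPT".
function msgTypeName(value) {
    const entry = Object.entries(MSG_TYPE).find(([, v]) => v === value);
    return entry ? entry[0] : `UNKNOWN(${value})`;
}
```

A handler could then compare against `MSG_TYPE.MEDIA_DATA_TRANSCRIPT` instead of the literal `17`.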

While the media WebSocket is open and the stream has started, handle these packets in the socket's `message` handler.

Add the following code inside that handler in `index.js`.

```javascript
// When receiving a MEDIA_DATA_TRANSCRIPT message
if (message.msg_type === 17) {
    console.log("Received transcript:", message.content);
}

// When receiving a MEDIA_DATA_AUDIO message
if (message.msg_type === 14) {
    // Audio arrives base64-encoded; decode it into raw bytes
    const audioBuffer = Buffer.from(message.content.data, "base64");
    const channelId = message.content.channel_id;
    console.log(
        `Received ${audioBuffer.length} bytes of audio on channel ${channelId}`,
    );
}
```
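To do something with the decoded audio beyond logging it, the chunks could be grouped by `channel_id`. The following is a minimal in-memory sketch that assumes the message shape shown above (`content.data` as base64, `content.channel_id`). `audioByChannel`, `appendAudio`, and `collectAudio` are hypothetical names, and a real app would likely stream to disk rather than buffer without limit.

```javascript
// Hypothetical per-channel audio store: channel_id -> array of Buffers.
const audioByChannel = new Map();

// Decode one MEDIA_DATA_AUDIO message (msg_type 14) and store its bytes.
// Returns the total number of bytes buffered for that channel so far.
function appendAudio(message) {
    const { channel_id: channelId, data } = message.content;
    const chunk = Buffer.from(data, "base64");
    const chunks = audioByChannel.get(channelId) || [];
    chunks.push(chunk);
    audioByChannel.set(channelId, chunks);
    return chunks.reduce((total, c) => total + c.length, 0);
}

// Join everything received for a channel into one Buffer,
// for example to write it to a file when the stream stops.
function collectAudio(channelId) {
    return Buffer.concat(audioByChannel.get(channelId) || []);
}
```

Keeping channels separate preserves whatever distinction `channel_id` carries in the stream.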

### Send a keep-alive message to the media WebSocket

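Like the signaling connection, the media connection must be kept alive, and the logic is the same: answer each keep-alive request with a response that echoes its timestamp.

Add the following code to the media WebSocket's `message` handler in `index.js`.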

```javascript
if (message.msg_type === 12) {
    // KEEP_ALIVE_REQ
    console.log("Received KEEP_ALIVE_REQ, responding with KEEP_ALIVE_RESP");
    ws.send(
        JSON.stringify({
            msg_type: 13, // KEEP_ALIVE_RESP
            timestamp: message.timestamp,
        }),
    );
}
```
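Because both sockets answer keep-alives the same way, the logic could be factored into one helper. This is a sketch: `keepAliveResponse` is a hypothetical function that builds the reply and leaves the sending to the caller.

```javascript
// Hypothetical helper shared by the signaling and media sockets.
// Returns the JSON string to send back, or null if the message
// is not a KEEP_ALIVE_REQ.
function keepAliveResponse(message) {
    if (message.msg_type !== 12) return null; // not a KEEP_ALIVE_REQ
    return JSON.stringify({
        msg_type: 13, // KEEP_ALIVE_RESP
        timestamp: message.timestamp, // echo the timestamp from the request
    });
}
```

In either `message` handler, `const reply = keepAliveResponse(message);` followed by `if (reply) ws.send(reply);` would replace the inline branch.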

### Start the app to receive data

Your app can now receive audio and transcript data from a Zoom Contact Center engagement. Start the server with `node index.js`, install your Zoom app, and start a stream in an engagement. Media data will begin appearing in your console.