Get meeting transcripts using native WebSockets with RTMS

At Zoom, we love giving our developers choices, and RTMS was built with that philosophy in mind. As a developer, you can choose the SDK for speed and simplicity, or WebSockets for full control.

In our earlier blog, Get Zoom transcripts in 5 lines of code, we demonstrated how to use the RTMS SDK to stream meeting transcripts. The SDK is still the fastest and simplest way to work with RTMS. It handles the complexity for you and gets you to useful data fast.

In this blog, we explore the native WebSockets approach. Most developers will not need this path, but if you want full control over connections, handshakes, buffering, or observability, WebSockets give you that flexibility, while staying simpler than running meeting bots or virtual clients. If you are new to RTMS, take a quick look at the documentation for Getting started with RTMS and Add RTMS features to your app before continuing. We will follow the same flow.

Video walkthrough

If you prefer to follow along visually, this step-by-step video shows how to stream live meeting transcripts using native WebSockets.

  • Configure your Zoom Marketplace app, including scopes, events, and domain allow lists
  • Handle the meeting.rtms_started webhook
  • Generate HMAC SHA256 signatures for authentication
  • Establish signaling and media WebSocket connections
  • Exchange keep alive requests and acknowledgements
  • Stream live transcript data in real time using Node.js

When to use WebSockets vs the SDK?

Our RTMS SDKs are the fastest way to get started because they include helpers for authentication, handshakes, retries, and payload handling.

Choose the native WebSockets flow when you need:

  • Portability and freedom to use any language or runtime that supports WebSockets such as Go, Rust, Python, or .NET
  • Granular control over handshakes, buffering, back pressure, routing, and connection lifecycle
  • Observability or compliance needs where you must log raw messages, collect metrics, and meet audit requirements without hidden layers

What you'll build

You will build a service that:

  • Handles the meeting.rtms_started and meeting.rtms_stopped webhooks
  • Computes the RTMS handshake signature
  • Opens the signaling WebSocket then the media WebSocket
  • Sends CLIENT_READY_ACK after successful handshakes
  • Prints transcript chunks as they arrive

Prefer to skip ahead? Jump to Full code.

RTMS connection process overview

  1. Zoom sends a webhook when an RTMS-enabled meeting starts (meeting.rtms_started).
  2. Your service opens the signaling WebSocket using the URL from the webhook.
  3. Your service generates a signature and completes the signaling handshake.
  4. Zoom returns media server URLs after the handshake succeeds.
  5. Your service opens the media WebSocket for the transcript or other media type you want.
  6. Your service completes the media handshake using the same signature.
  7. Your service sends CLIENT_READY_ACK on the signaling socket to begin streaming.
  8. Zoom sends live media data on the media WebSocket.
  9. Both sides exchange keep alive messages to maintain the signaling and media connections.
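
The ordering in the steps above matters: each step depends on the previous one succeeding. As a rough illustration, the sketch below encodes that order as a tiny step tracker. The step names and the advance function are hypothetical and exist only for this example; RTMS itself does not define them.

```javascript
// Hypothetical step names for illustration; they mirror the numbered list above.
const STEPS = [
    "webhook_received", // step 1: meeting.rtms_started arrives
    "signaling_open", // step 2: signaling WebSocket connected
    "signaling_handshake_ok", // steps 3-4: handshake done, media URLs returned
    "media_open", // step 5: media WebSocket connected
    "media_handshake_ok", // step 6: media handshake done
    "streaming", // steps 7-8: CLIENT_READY_ACK sent, media data flowing
];

// Moves from `current` to `next`, throwing if the steps arrive out of order.
function advance(current, next) {
    const index = STEPS.indexOf(current);
    if (index === -1) {
        throw new Error(`Unknown step: ${current}`);
    }
    if (STEPS[index + 1] !== next) {
        throw new Error(`Unexpected transition: ${current} -> ${next}`);
    }
    return next;
}
```

A tracker like this can make logs easier to read when a connection stalls, because you can see which step never completed.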

Prerequisites

If you’re new to RTMS, start here: Getting started and then Add RTMS features to your app.

  1. Create a Zoom Marketplace app and enable Realtime Media Streams.

  2. Subscribe to webhooks: meeting.rtms_started and meeting.rtms_stopped.

  3. Add the scope meeting:read:meeting_transcripts.

  4. Expose a publicly reachable HTTPS webhook endpoint (for development, ngrok is fine).

  5. App settings checklist (common gotchas):

    • Provide your Home URL.
    • Add appssdk.zoom.us to the domain allow list.
    • Add your own domain to the allow list.
    • Enable the Zoom Apps SDK (if you are embedding an app UI).
    • Install the app to your account and enable Auto‑start in the Zoom Apps settings. For details, see Host and admin tools and controls.

Get started (Node.js)

RTMS works with any language that can open a WebSocket. We’ll use Node.js for brevity. For the handshake sequence, keep the WebSockets quickstart handy.

Install dependencies:

npm init -y
npm install express dotenv ws

Create .env:

ZOOM_CLIENT_ID=your_client_id
ZOOM_CLIENT_SECRET=your_client_secret
PORT=3000
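
If either credential is missing, the signature step later in this post will fail at runtime. One option is to check for them at startup. The helper below is a minimal sketch; missingEnv is a hypothetical name and not part of dotenv or any Zoom library.

```javascript
// Hypothetical helper: returns the names of required variables that are
// missing or blank in the given environment object.
function missingEnv(env, required = ["ZOOM_CLIENT_ID", "ZOOM_CLIENT_SECRET"]) {
    return required.filter((name) => {
        const value = env[name];
        return typeof value !== "string" || value.trim() === "";
    });
}

// Possible usage at startup, after dotenv.config():
// const missing = missingEnv(process.env);
// if (missing.length > 0) {
//     console.error(`Missing environment variables: ${missing.join(", ")}`);
//     process.exit(1);
// }
```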

Create index.js (base server):

import express from "express";
import dotenv from "dotenv";
import WebSocket from "ws";
// Load environment variables from .env
dotenv.config();
const app = express();
// Enable JSON body parsing
app.use(express.json());
// Basic root route for testing
app.get("/", (req, res) => {
    res.send("Zoom RTMS Server is up and running.");
});
// Listen on the configured port (defaults to 3000)
const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
    console.log(`Server is listening on http://localhost:${PORT}`);
});

Build the webhook receiver

Zoom posts events when RTMS starts or stops. Add a /webhook route and trigger the connect flow (the event shapes are documented in the WebSockets quickstart).

For complete details on RTMS webhook events, including event payloads and verification, see the RTMS webhook events API reference.

app.post("/webhook", (req, res) => {
    const { event, payload } = req.body;
    console.log("Webhook received:", event);
    console.log("Payload:", JSON.stringify(payload, null, 2));
    res.sendStatus(200);
    if (event === "meeting.rtms_started") {
        const { meeting_uuid, rtms_stream_id, server_urls } = payload;
        console.log(`Starting RTMS for meeting ${meeting_uuid}`);
        connectToSignalingWebSocket(meeting_uuid, rtms_stream_id, server_urls);
    } else if (event === "meeting.rtms_stopped") {
        const { meeting_uuid } = payload;
        console.log(`Stopping RTMS for meeting ${meeting_uuid}`);
        // Clean up any sockets you opened for this meeting_uuid
    }
});
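
The meeting.rtms_stopped branch above leaves cleanup as a comment. One way to fill it in, sketched here under the assumption that you keep sockets in memory in a single process, is a small registry keyed by meeting_uuid. The names activeSockets, trackSocket, and closeSocketsFor are hypothetical.

```javascript
// Hypothetical in-memory registry: meeting_uuid -> array of open sockets.
const activeSockets = new Map();

// Remember a socket so it can be closed when the meeting stops.
function trackSocket(meetingUuid, socket) {
    const sockets = activeSockets.get(meetingUuid) ?? [];
    sockets.push(socket);
    activeSockets.set(meetingUuid, sockets);
}

// Close every tracked socket for a meeting and forget them.
// Returns how many sockets were tracked for that meeting.
function closeSocketsFor(meetingUuid) {
    const sockets = activeSockets.get(meetingUuid) ?? [];
    for (const socket of sockets) {
        try {
            socket.close();
        } catch (err) {
            console.error("Failed to close socket:", err);
        }
    }
    activeSockets.delete(meetingUuid);
    return sockets.length;
}
```

With this in place, you would call trackSocket right after creating each signaling or media WebSocket, and closeSocketsFor(meeting_uuid) in the meeting.rtms_stopped branch.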

Create the signature generator

Zoom requires an HMAC SHA256 signature during both the signaling and media handshakes. The signed message is your client ID, the meeting UUID, and the RTMS stream ID joined by commas. It is keyed with your client secret and hex encoded.

This is described in Add RTMS features to your app and used in the WebSockets quickstart.

import crypto from "crypto";
function generateSignature(meetingUuid, rtmsStreamId) {
    const message = `${process.env.ZOOM_CLIENT_ID},${meetingUuid},${rtmsStreamId}`;
    return crypto
        .createHmac("sha256", process.env.ZOOM_CLIENT_SECRET)
        .update(message)
        .digest("hex");
}
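
To see what the signature looks like without touching real credentials, here is a small standalone sketch. It uses the same HMAC construction as generateSignature above but takes the credentials as parameters. The signRtms name and all of the demo values are made up for illustration.

```javascript
import crypto from "node:crypto";

// Same construction as generateSignature, with credentials passed in
// explicitly so the example does not depend on process.env.
function signRtms(clientId, clientSecret, meetingUuid, rtmsStreamId) {
    const message = `${clientId},${meetingUuid},${rtmsStreamId}`;
    return crypto
        .createHmac("sha256", clientSecret)
        .update(message)
        .digest("hex");
}

// Placeholder values only; real values come from your app and the webhook.
const signature = signRtms(
    "demo_client_id",
    "demo_client_secret",
    "demo-meeting-uuid",
    "demo-stream-id",
);

// A hex-encoded SHA256 HMAC is always 64 characters long.
console.log(signature.length);
```

The same inputs always produce the same signature, which is why the signaling and media handshakes can both call the function with the same arguments.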

Connect to the signaling server

Your service opens the signaling WebSocket, sends the handshake request, and waits for Zoom to respond with available media server URLs.

function connectToSignalingWebSocket(meetingUuid, rtmsStreamId, serverUrls) {
    const signalingWs = new WebSocket(serverUrls);
    signalingWs.on("open", () => {
        const handshakeMsg = {
            msg_type: 1, // SIGNALING_HAND_SHAKE_REQ
            protocol_version: 1,
            sequence: 0,
            meeting_uuid: meetingUuid,
            rtms_stream_id: rtmsStreamId,
            signature: generateSignature(meetingUuid, rtmsStreamId),
        };
        signalingWs.send(JSON.stringify(handshakeMsg));
    });
    signalingWs.on("message", (data) => {
        const msg = JSON.parse(data.toString());
        // Successful handshake → pick media URL
        if (msg.msg_type === 2 && msg.status_code === 0) {
            const transcriptUrl = msg.media_server?.server_urls?.transcript;
            if (transcriptUrl) {
                connectToMediaWebSocket(
                    transcriptUrl,
                    meetingUuid,
                    rtmsStreamId,
                    signalingWs,
                );
            }
        }
        // Keep‑alive
        if (msg.msg_type === 12) {
            signalingWs.send(
                JSON.stringify({
                    msg_type: 13, // KEEP_ALIVE_RESP
                    timestamp: msg.timestamp,
                }),
            );
        }
    });
    signalingWs.on("error", (err) =>
        console.error("Signaling WebSocket error:", err),
    );
    signalingWs.on("close", (code, reason) =>
        // ws passes the close reason as a Buffer, so convert it for readable logs
        console.log("Signaling WebSocket closed:", code, reason.toString()),
    );
}
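
The handler above acts only when status_code is 0, so a rejected handshake passes silently. The helper below is a minimal sketch of one way to surface failures. isHandshakeOk is a hypothetical name, and the sketch assumes only what the code above shows: that a status_code of 0 means success.

```javascript
// Hypothetical helper: returns true when `msg` is a successful handshake
// response of the expected type, and logs any failed handshake response.
function isHandshakeOk(msg, expectedType, log = console.error) {
    if (msg.msg_type !== expectedType) {
        return false; // not a handshake response of this kind
    }
    if (msg.status_code === 0) {
        return true;
    }
    log(
        `Handshake failed (msg_type ${msg.msg_type}), status_code: ${msg.status_code}`,
    );
    return false;
}
```

In the signaling handler you could then write if (isHandshakeOk(msg, 2)) before reading the media URLs, and use isHandshakeOk(msg, 4) in the media handler.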

Connect to the media server

Your service opens a second WebSocket for the transcript media stream. After the media handshake succeeds, you send CLIENT_READY_ACK on the signaling socket and Zoom begins sending transcript packets.

function connectToMediaWebSocket(
    mediaUrl,
    meetingUuid,
    rtmsStreamId,
    signalingSocket,
) {
    const mediaWs = new WebSocket(mediaUrl);
    mediaWs.on("open", () => {
        const handshakeMsg = {
            msg_type: 3, // DATA_HAND_SHAKE_REQ
            protocol_version: 1,
            sequence: 0,
            meeting_uuid: meetingUuid,
            rtms_stream_id: rtmsStreamId,
            signature: generateSignature(meetingUuid, rtmsStreamId),
            media_type: 8, // transcripts
        };
        mediaWs.send(JSON.stringify(handshakeMsg));
    });
    mediaWs.on("message", (data) => {
        const msg = JSON.parse(data.toString());
        // Handshake OK → tell signaling we’re ready
        if (msg.msg_type === 4 && msg.status_code === 0) {
            signalingSocket.send(
                JSON.stringify({
                    msg_type: 7, // CLIENT_READY_ACK
                    rtms_stream_id: rtmsStreamId,
                }),
            );
            return;
        }
        // Transcript packets
        if (msg.msg_type === 17) {
            // MEDIA_DATA_TRANSCRIPT
            const who = msg.content?.user_name ?? "Speaker";
            const text = msg.content?.data ?? "";
            if (text) console.log(`${who}: ${text}`);
            return;
        }
        // Keep‑alive
        if (msg.msg_type === 12) {
            mediaWs.send(
                JSON.stringify({
                    msg_type: 13, // KEEP_ALIVE_RESP
                    timestamp: msg.timestamp,
                }),
            );
        }
    });
    mediaWs.on("error", (err) => console.error("Media WebSocket error:", err));
    mediaWs.on("close", (code, reason) =>
        // ws passes the close reason as a Buffer, so convert it for readable logs
        console.log("Media WebSocket closed:", code, reason.toString()),
    );
}
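
Both handlers answer keep alive requests in the same way, so the logic can be shared. The sketch below factors it into one function that works with any object exposing a send method, such as a ws socket. replyToKeepAlive is a hypothetical name.

```javascript
const KEEP_ALIVE_REQ = 12;
const KEEP_ALIVE_RESP = 13;

// Replies to a keep alive request on the socket that received it.
// Returns true if the message was a keep alive request and was answered.
function replyToKeepAlive(socket, msg) {
    if (msg.msg_type !== KEEP_ALIVE_REQ) {
        return false;
    }
    socket.send(
        JSON.stringify({
            msg_type: KEEP_ALIVE_RESP,
            timestamp: msg.timestamp, // echo the timestamp from the request
        }),
    );
    return true;
}
```

Each message handler could call replyToKeepAlive(signalingWs, msg) or replyToKeepAlive(mediaWs, msg) in place of its inline keep alive block.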

Understanding message types

RTMS uses a small set of message types for handshakes, keep alive exchanges, and media delivery. Handshake messages are specific to either the signaling or the media socket, while keep alive messages appear on both.

| Message Type | Value | Description |
| --- | --- | --- |
| SIGNALING_HAND_SHAKE_REQ | 1 | Initial handshake request to signaling server |
| SIGNALING_HAND_SHAKE_RESP | 2 | Response from signaling server with media URLs |
| DATA_HAND_SHAKE_REQ | 3 | Handshake request to media server |
| DATA_HAND_SHAKE_RESP | 4 | Response from media server confirming connection |
| CLIENT_READY_ACK | 7 | Sent to signaling server to start media stream |
| KEEP_ALIVE_REQ | 12 | Sent by server every 10 seconds to keep connection alive |
| KEEP_ALIVE_RESP | 13 | Your response to KEEP_ALIVE_REQ |
| MEDIA_DATA_TRANSCRIPT | 17 | Incoming transcript data packet |

For a complete reference of all RTMS message types, event types, status codes, and data structures, see the Event reference and Data type definitions documentation.
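
The code in this post uses the raw numbers with comments. If you prefer named constants, the sketch below collects the values from the table into one object and adds a reverse lookup for logging. MSG_TYPE and msgTypeName are hypothetical names, and the object covers only the types listed in this post.

```javascript
// Values copied from the message type table in this post.
const MSG_TYPE = Object.freeze({
    SIGNALING_HAND_SHAKE_REQ: 1,
    SIGNALING_HAND_SHAKE_RESP: 2,
    DATA_HAND_SHAKE_REQ: 3,
    DATA_HAND_SHAKE_RESP: 4,
    CLIENT_READY_ACK: 7,
    KEEP_ALIVE_REQ: 12,
    KEEP_ALIVE_RESP: 13,
    MEDIA_DATA_TRANSCRIPT: 17,
});

// Returns the constant name for a numeric msg_type, which is handy in logs.
function msgTypeName(value) {
    const entry = Object.entries(MSG_TYPE).find(([, v]) => v === value);
    return entry ? entry[0] : `UNKNOWN(${value})`;
}
```

A check such as msg.msg_type === MSG_TYPE.KEEP_ALIVE_REQ then reads more clearly than comparing against 12.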

Media types

Each RTMS media stream uses a numeric flag that identifies the type of media you want to receive.

| Value | Media Type |
| --- | --- |
| 1 | Audio |
| 2 | Video |
| 4 | Screen share |
| 8 | Transcript |
| 16 | Chat |
| 32 | All media types in one WebSocket connection |

For detailed information about configuring media parameters (codecs, resolutions, sample rates, etc.), see the Media parameter definitions documentation.
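
The values in the table are powers of two, which suggests they can be treated as bit flags. The sketch below shows how flags like these are usually combined and tested. Whether RTMS accepts a combined value in media_type is an assumption here, so confirm it in the Media parameter definitions before relying on it. MEDIA_TYPE, combineMedia, and includesMedia are hypothetical names.

```javascript
// Values copied from the media type table in this post.
const MEDIA_TYPE = Object.freeze({
    AUDIO: 1,
    VIDEO: 2,
    SCREEN_SHARE: 4,
    TRANSCRIPT: 8,
    CHAT: 16,
    ALL: 32, // its own value, not the bitwise OR of the others (which is 31)
});

// Combines individual flags with bitwise OR.
// Assumption: RTMS may or may not accept combined values.
function combineMedia(...types) {
    return types.reduce((flags, type) => flags | type, 0);
}

// Checks whether a combined value contains a given flag.
function includesMedia(flags, type) {
    return (flags & type) === type;
}

// Audio plus transcript: 1 | 8
console.log(combineMedia(MEDIA_TYPE.AUDIO, MEDIA_TYPE.TRANSCRIPT)); // 9
```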

Run it

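Start your server. If your package.json defines a start script (npm init -y does not add one), run:

npm run start

or run the file directly:

node --env-file=.env index.js

The --env-file flag requires Node.js 20.6 or later and is optional here, because dotenv.config() already loads .env. Since index.js uses import syntax, you may also need to set "type": "module" in package.json or rename the file to index.mjs, depending on your Node.js version.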

Then expose your /webhook endpoint publicly over HTTPS. For development, an ngrok tunnel works well:

ngrok http 3000

Paste the generated URL into your Marketplace app's webhook settings.

Finally:

  1. Install your app.
  2. Enable Auto-start in the Zoom Apps settings. For details, see Host and admin tools and controls.

     (Figure: Auto-start settings in Zoom Apps)

  3. Join a meeting that has RTMS enabled.
  4. Watch transcript lines appear in your console as Zoom sends live data.

If you do not see data, confirm that your webhook is reachable and that the app has the required scope and permissions.

For end‑to‑end handshake details, see the WebSockets quickstart. For first‑time setup, also consult Getting started and Add RTMS features to your app.

Full code

Get the complete runnable sample on GitHub: