Scribe API

The Scribe API delivers scalable, high-performance transcription across a broad range of media formats and use cases.

It enables organizations to convert audio and video into accurate text, whether they are processing large archives in bulk or handling real-time interactions at scale.

Key features

  • Supports high-volume transcription from cloud storage
  • Accepts multiple input formats, including common audio and video formats
  • Handles call recordings, podcasts, and long-form media
  • Adds timestamps for each segment
  • Applies punctuation and formatting for human-readable output
  • Separates speakers when multiple people are present
  • Filters profanity when enabled
  • Supports short command-and-control scenarios through fast mode

Processing modes

Fast mode

Fast mode provides synchronous, low-latency transcription for individual files.

  • Processes one audio file at a time
  • Returns the response as soon as transcription completes
  • Works best for short recordings

Note: Fast mode can handle one (mono) or two (stereo) audio channels. The API returns either a single combined transcript or separate transcripts for each channel.

Example workflow

To convert an audio recording into searchable text on demand:

  1. Use fast mode for near real-time transcription.
  2. Your app uploads the audio file.
  3. The backend generates a JWT with Build platform credentials.
  4. The backend sends a transcription request.
  5. The API returns a JSON transcript with timestamps.
  6. Your app displays the transcript to the user.

For details, see Fast mode.

Batch mode

Batch mode provides asynchronous transcription for large or complex jobs.

  • Processes many files in a single request
  • Runs in the background: submit jobs, then retrieve results when processing is complete

Batch mode is best for:

  • Long recordings
  • Large collections of files
  • Multi-speaker audio

The service writes a separate transcript for each audio file to the corresponding output location.

Example workflow

To transcribe call recordings stored in S3:

  1. Submit a batch job that specifies the input folder in your bucket and the output location for transcripts.
  2. The job runs asynchronously.
  3. Use batch job status endpoints or webhooks to monitor progress.
  4. The service writes transcripts to the specified output location.
  5. Retrieve per-file results when processing completes.
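
The client side of this workflow can be sketched in Python as below. This is a minimal illustration with invented names: the `input` and `output` payload fields and the `completed` and `failed` status values are assumptions, and `fetch_status` stands in for whatever call your code makes to the batch job status endpoint. If you use webhooks instead, the polling loop is unnecessary.

```python
import time
from typing import Callable

# Assumed terminal status values; the real API may use different names.
TERMINAL_STATES = {"completed", "failed"}


def build_batch_job(input_uri: str, output_uri: str) -> dict:
    """Assemble a batch job payload. Field names are hypothetical."""
    for uri in (input_uri, output_uri):
        if not uri.startswith("s3://"):
            raise ValueError(f"expected an s3:// URI, got {uri!r}")
    return {"input": input_uri, "output": output_uri}


def wait_for_job(
    fetch_status: Callable[[str], str],
    job_id: str,
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job reaches a terminal state, then return that state.

    fetch_status is supplied by the caller and should query the batch job
    status endpoint for the given job ID.
    """
    for _ in range(max_polls):
        status = fetch_status(job_id)
        if status in TERMINAL_STATES:
            return status
        sleep(poll_seconds)  # wait before asking again
    raise TimeoutError(f"job {job_id} still running after {max_polls} polls")
```

Passing `fetch_status` and `sleep` in as parameters keeps the loop independent of any HTTP client, which makes it straightforward to test with a stub before pointing it at the real status endpoint.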

For details, see Batch mode.