Long text

The Supertone TTS API caps text at 300 characters per request. Anything longer returns 400 Bad Request. To synthesize longer scripts you have two options:

Use the SDKs. Both the Python and TypeScript SDKs automatically split long input, generate each segment, and merge the audio for you. You get one final clip back even if the input was 2,000 characters.
Chunk yourself. If you call the REST API directly, split the text on sentence boundaries and concatenate the resulting audio.

The 300-character limit, at a glance

Layer	Behavior on >300 characters
REST API (`POST /v1/text-to-speech/{voice_id}`)	Returns `400 Bad Request`.
Python SDK (`create_speech`, `stream_speech`)	Auto-chunks at 300, generates in parallel for `create_speech` (sequential for streaming), merges audio.
TypeScript SDK (`createSpeech`, `streamSpeech`)	Auto-chunks at 300, generates sequentially, merges audio.
`predict_duration`	Not auto-chunked — same 300-character limit applies.

The threshold is configurable in both SDKs:

# Python — pass maxTextLength via SDK options
# (the default is 300; lower it to chunk earlier)

// TypeScript — pass maxTextLength in the options object
const response = await client.textToSpeech.createSpeech(
  { voiceId, apiConvertTextToSpeechUsingCharacterRequest: { text, language: "en" } },
  { maxTextLength: 250 },
);

SDK auto-chunking, end to end

Python

import os
from supertone import Supertone

VOICE_ID = "20160a4c5ba38967330c84"  # replace with your voice ID

LONG_TEXT = (
    "Once upon a time, in a faraway land, there lived a quiet librarian "
    "who collected stories of forgotten kingdoms. Every evening she would "
    "open a leather-bound notebook and continue writing the next chapter "
    "of a tale she had been telling herself for years. ...continue with "
    "many more sentences spanning over 300 characters..."
)

with Supertone(api_key=os.environ["SUPERTONE_API_KEY"]) as client:
    response = client.text_to_speech.create_speech(
        voice_id=VOICE_ID,
        text=LONG_TEXT,
        language="en",
    )

    with open("narration.wav", "wb") as f:
        f.write(response.result.read())

Internally, the SDK splits LONG_TEXT at sentence boundaries (then word boundaries, then character boundaries if a single word is too long), runs up to 3 parallel create_speech requests, and merges the resulting WAV/MP3 audio with intermediate file headers stripped.
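
To make the "up to 3 parallel requests, then merge" behavior concrete, here is a minimal sketch of bounded parallelism that keeps segment order. It is not the SDK's internal code: `synthesize_in_order` and `fake_synthesize` are hypothetical names, and `fake_synthesize` stands in for a real text-to-speech request.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def synthesize_in_order(
    segments: list[str],
    synthesize: Callable[[str], bytes],
    max_workers: int = 3,
) -> list[bytes]:
    """Run synthesize() on every segment with at most max_workers in flight.

    Executor.map yields results in input order, so the audio parts line up
    with the text even when a later request finishes first.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(synthesize, segments))


def fake_synthesize(segment: str) -> bytes:
    # Hypothetical stand-in for one text-to-speech request (assumption).
    return segment.upper().encode("utf-8")


parts = synthesize_in_order(["one. ", "two. ", "three. ", "four."], fake_synthesize)
merged = b"".join(parts)  # real audio would also need its file headers merged
```

Keeping the results in input order is the important property here: the merge step can then treat the parts as one continuous clip.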

TypeScript

import { Supertone } from "@supertone/supertone";
import * as fs from "node:fs";

const VOICE_ID = "20160a4c5ba38967330c84"; // replace with your voice ID

const LONG_TEXT = `
  Once upon a time, in a faraway land, there lived a quiet librarian
  who collected stories of forgotten kingdoms. Every evening she would
  open a leather-bound notebook and continue writing the next chapter
  of a tale she had been telling herself for years. ...continue with
  many more sentences spanning over 300 characters...
`.trim();

const client = new Supertone({ apiKey: process.env.SUPERTONE_API_KEY });

const response = await client.textToSpeech.createSpeech({
  voiceId: VOICE_ID,
  apiConvertTextToSpeechUsingCharacterRequest: {
    text: LONG_TEXT,
    language: "en",
  },
});

if (response.result instanceof Uint8Array) {
  fs.writeFileSync("narration.wav", response.result);
} else if (response.result && "getReader" in response.result) {
  const reader = (response.result as ReadableStream<Uint8Array>).getReader();
  const chunks: Uint8Array[] = [];
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    if (value) chunks.push(value);
  }
  fs.writeFileSync("narration.wav", Buffer.concat(chunks));
}

The TS SDK runs the segments sequentially to keep response handling deterministic.

Streaming long text

The SDKs also auto-chunk on stream_speech / streamSpeech. The audio is delivered to your iterator as if it were a single continuous stream — you don’t need to know how many segments were used. See Stream speech for the streaming pattern.

Chunking yourself (cURL or raw HTTP)

If you call the REST API directly, you need to split before sending. A reasonable strategy:

Split on sentence-ending punctuation (., !, ?, 。, ？).
If a sentence is still over 300 characters, split on commas, then on word boundaries.
For each segment, call POST /v1/text-to-speech/{voice_id} and append the returned audio to your output file.
For WAV concatenation, strip the WAV header (44 bytes for canonical PCM WAV) from every segment after the first, then update the size fields in the first header so the final file plays as one clip.
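
The splitting steps above (sentence punctuation first, then commas, then word boundaries) can be sketched in Python as follows. This is an illustrative splitter, not the SDK's implementation: `split_text`, `MAX_CHARS`, and `_LEVELS` are hypothetical names, the final fixed-size fallback for a single over-long word is an added assumption, and the sentence rule is naive (it also breaks on the dot in "3.14").

```python
import re

MAX_CHARS = 300  # per-request limit stated above

# One pattern per fallback level. Each match runs up to and including a
# delimiter run (or to the end of the text), so no characters are dropped.
_LEVELS = [
    re.compile(r".+?(?:[.!?。？]+|$)", re.DOTALL),  # sentences
    re.compile(r".+?(?:[,，]+|$)", re.DOTALL),      # comma-separated clauses
    re.compile(r"\S+\s*"),                          # words
]


def split_text(text: str, max_chars: int = MAX_CHARS, level: int = 0) -> list[str]:
    """Split text into segments of at most max_chars characters."""
    text = text.strip()
    if not text:
        return []
    if len(text) <= max_chars:
        return [text]
    if level == len(_LEVELS):
        # A single word longer than the limit: fall back to fixed-size slices.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    segments: list[str] = []
    buffer = ""
    for piece in _LEVELS[level].findall(text):
        # Pack consecutive pieces together while they still fit.
        if len((buffer + piece).strip()) <= max_chars:
            buffer += piece
            continue
        if buffer.strip():
            segments.append(buffer.strip())
        if len(piece.strip()) > max_chars:
            # This piece alone is too long: retry with the next finer level.
            segments.extend(split_text(piece, max_chars, level + 1))
            buffer = ""
        else:
            buffer = piece
    if buffer.strip():
        segments.append(buffer.strip())
    return segments
```

Packing short sentences into one segment keeps the request count low, which matters for rate limits; send each returned segment as the `text` of its own request.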

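# Split a script into sentences, synthesize each one, and merge the WAV files.
# Assumes GNU sed, jq, and ffmpeg are installed, and that no sentence exceeds 300 characters.
VOICE_ID="20160a4c5ba38967330c84"  # replace with your voice ID

# Naive splitter: start a new line after ., !, or ? followed by a space.
printf '%s\n' "$INPUT_TEXT" | sed -E 's/([.!?]) +/\1\n/g' > sentences.txt

i=0
: > chunks.txt
while IFS= read -r line; do
  [ -z "$line" ] && continue
  i=$((i + 1))
  chunk=$(printf 'chunk_%04d.wav' "$i")
  # jq builds the JSON body so quotes and backslashes in the text are escaped.
  jq -cn --arg text "$line" '{text: $text, language: "en"}' |
    curl -sS --fail -X POST "https://supertoneapi.com/v1/text-to-speech/$VOICE_ID" \
      -H "x-sup-api-key: $SUPERTONE_API_KEY" \
      -H "Content-Type: application/json" \
      -d @- -o "$chunk"
  printf "file '%s'\n" "$chunk" >> chunks.txt
done < sentences.txt

# ffmpeg's concat demuxer writes one WAV header for the merged output.
ffmpeg -f concat -safe 0 -i chunks.txt -c copy narration.wav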
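
If you would rather merge in code than with an external tool, Python's standard `wave` module can read each segment and write a single file with one correct header, which avoids hand-counting header bytes. The sketch below assumes every segment is uncompressed PCM WAV with the same channel count, sample width, and sample rate; `merge_wav_segments` is a hypothetical helper name.

```python
import io
import wave


def merge_wav_segments(segments: list[bytes]) -> bytes:
    """Concatenate PCM WAV files (given as bytes) into one WAV file."""
    if not segments:
        raise ValueError("no segments to merge")

    frames = []
    expected = None
    for data in segments:
        with wave.open(io.BytesIO(data), "rb") as reader:
            fmt = (reader.getnchannels(), reader.getsampwidth(), reader.getframerate())
            if expected is None:
                expected = fmt
            elif fmt != expected:
                raise ValueError("segments use different audio formats")
            # readframes returns only sample data, so headers are left behind.
            frames.append(reader.readframes(reader.getnframes()))

    out = io.BytesIO()
    with wave.open(out, "wb") as writer:
        writer.setnchannels(expected[0])
        writer.setsampwidth(expected[1])
        writer.setframerate(expected[2])
        # The size fields in the header are filled in when the writer closes.
        writer.writeframes(b"".join(frames))
    return out.getvalue()
```

Pass the response bodies in text order, then write the returned bytes to your output file.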

For most projects, prefer the SDKs — they handle these edge cases (cross-segment timing, header stripping, retry on partial failures) so you don’t have to.

Tips

Punctuation matters. Auto-chunking prefers sentence boundaries. Well-punctuated input produces cleaner cuts and more natural transitions.
Estimate cost first. predict_duration doesn’t auto-chunk, but you can split text yourself and sum durations to estimate total credits.
Watch rate limits. A single long input becomes multiple TTS requests — track your account’s rate limits and consider throttling in your own caller.
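
One simple way to throttle in your own caller is to space requests evenly. The sketch below is a generic minimum-interval throttle, not part of the Supertone SDKs; `MinIntervalThrottle` is a hypothetical name, and the requests-per-minute value you pass should come from your own tier's limit.

```python
import time


class MinIntervalThrottle:
    """Block so that consecutive calls are at least `interval` seconds apart."""

    def __init__(self, requests_per_minute: int, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / requests_per_minute
        self._clock = clock    # injectable so the throttle can be tested
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Sleep if needed before the next request; return the delay applied."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay:
            self._sleep(delay)
        # Schedule the following slot relative to when this request may start.
        self._next_allowed = max(now, self._next_allowed) + self.interval
        return delay
```

Create one throttle per API key and call `wait()` immediately before each text-to-speech request, including the requests you issue for each chunk of a long input.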

Related

Long-form narration

End-to-end example for generating a multi-paragraph narration.

Rate limits

Per-minute request limits by tier.
