Retries and backoff

Transient failures (429, 408, 500, network drops) are normal in any networked service. The right response is exponential backoff with jitter, applied only to retryable errors. Both SDKs ship with a configurable retry policy out of the box.

What to retry

Status / failure	Retryable?	Notes
`408 Request Timeout`	✅	Server side — wait briefly and retry.
`429 Too Many Requests`	✅	Rate limit; back off.
`500 Internal Server Error`	✅	Treat as transient.
`502`, `503`, `504`	✅	Network/upstream — retry with backoff.
Network errors (DNS, broken pipe, connect timeout)	✅	Retry — these are pure transport failures.
`400`, `401`, `402`, `403`, `404`, `413`, `415`	❌	Caller-side problem — fix the request first.
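
The table above can be condensed into a small predicate. This is a minimal sketch rather than part of either SDK: the names `RETRYABLE_STATUS` and `is_retryable` are hypothetical, and the set contains only the status codes the table marks as retryable.

```python
# Status codes the table above marks as retryable.
RETRYABLE_STATUS = frozenset({408, 429, 500, 502, 503, 504})


def is_retryable(status=None, *, network_error=False):
    """Return True if a failure falls into a retryable row of the table.

    status is the HTTP status code, or None when no response arrived.
    network_error marks pure transport failures (DNS, broken pipe,
    connect timeout), which the table always treats as retryable.
    """
    if network_error:
        return True
    return status in RETRYABLE_STATUS
```

Anything not in the set, including every 4xx other than 408 and 429, is reported as non-retryable, which matches the last row of the table.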

SDK configuration

Both SDKs accept a retry config at the client level and per-call.

Python

import os

from supertone import Supertone
from supertone.utils.retries import RetryConfig

client = Supertone(
    api_key=os.environ["SUPERTONE_API_KEY"],
    retry_config=RetryConfig(
        strategy="backoff",
        backoff={
            "initial_interval": 500,        # ms
            "max_interval": 60_000,         # ms
            "exponent": 1.5,
            "max_elapsed_time": 3_600_000,  # ms (1 hour cap)
        },
        retry_connection_errors=True,
    ),
)

TypeScript

import { Supertone } from "@supertone/supertone";

const client = new Supertone({
  apiKey: process.env.SUPERTONE_API_KEY,
  retryConfig: {
    strategy: "backoff",
    backoff: {
      initialInterval: 500,
      maxInterval: 60_000,
      exponent: 1.5,
      maxElapsedTime: 3_600_000,
    },
    retryConnectionErrors: true,
  },
});

This retries on the SDK’s default set of retryable status codes (429, 5xx) with exponential backoff, capped at 1 minute between retries and 1 hour total. Tune the numbers to your latency SLO.
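
To see what these numbers mean in practice, it can help to print the nominal wait schedule. The sketch below assumes the wait before retry n is `initial_interval * exponent ** n`, capped at `max_interval`. That formula is an assumption about how such a policy usually behaves, not a confirmed description of the SDK internals, which may also add jitter. `backoff_schedule` is a hypothetical helper.

```python
def backoff_schedule(initial_ms=500, exponent=1.5, max_interval_ms=60_000, retries=14):
    """Return the nominal wait before each retry, in milliseconds.

    Assumed formula: initial_ms * exponent ** n, capped at max_interval_ms.
    """
    return [min(initial_ms * exponent ** n, max_interval_ms) for n in range(retries)]


schedule = backoff_schedule()
print(schedule[:5])  # [500.0, 750.0, 1125.0, 1687.5, 2531.25]

# Index of the first retry whose wait hits the 60 s cap.
first_capped = next(n for n, ms in enumerate(schedule) if ms >= 60_000)
print(first_capped)  # 12
```

Under this assumption the first twelve waits add up to roughly 129 seconds, and every later retry waits the full minute until the one-hour budget runs out. If your latency SLO is measured in seconds, a much smaller `max_elapsed_time` is likely a better fit.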

Manual retry pattern

If you call the REST API directly or want to wrap the SDK with your own logic:

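import random
import time

# Assumed import path for the SDK's error classes; adjust to your SDK version.
from supertone import errors

# Errors worth retrying. Extend this tuple if your SDK version exposes
# separate classes for 502/503/504 or for connection failures.
RETRYABLE_ERRORS = (
    errors.TooManyRequestsErrorResponse,
    errors.InternalServerErrorResponse,
    errors.RequestTimeoutErrorResponse,
)

def call_with_backoff(fn, *, max_attempts=5, base_ms=500, max_ms=60_000):
    """Call fn(), retrying retryable errors with capped exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RETRYABLE_ERRORS:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error to the caller.
            # Double the delay on each attempt, up to max_ms.
            delay_ms = min(base_ms * (2 ** attempt), max_ms)
            # Jitter to avoid thundering herd: sleep 50-100% of the delay.
            delay_ms = delay_ms / 2 + random.uniform(0, delay_ms / 2)
            time.sleep(delay_ms / 1000)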
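
For direct REST calls there are no SDK error classes to catch, so the same loop can key off status codes and transport exceptions instead. The following is a sketch under stated assumptions: `send` is a hypothetical zero-argument callable that performs one HTTP request and returns `(status_code, body)`, and transport failures are assumed to surface as `OSError` (Python's `ConnectionError` and `TimeoutError` are subclasses of it). The `sleep` parameter exists only so the loop can be tested without waiting.

```python
import random
import time

# Same retryable statuses as the "What to retry" table.
RETRYABLE_STATUS = frozenset({408, 429, 500, 502, 503, 504})


def request_with_backoff(send, *, max_attempts=5, base_ms=500, max_ms=60_000,
                         sleep=time.sleep):
    """Call send() until it yields a non-retryable status or attempts run out.

    Returns the last (status, body) pair. If the final attempt fails with a
    transport error, that exception is re-raised.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        last = attempt == max_attempts - 1
        try:
            status, body = send()
        except OSError:
            if last:
                raise
        else:
            if status not in RETRYABLE_STATUS or last:
                return status, body
        # Exponential backoff, capped at max_ms.
        delay_ms = min(base_ms * (2 ** attempt), max_ms)
        # Equal jitter: keep half the delay, randomize the other half.
        delay_ms = delay_ms / 2 + random.uniform(0, delay_ms / 2)
        sleep(delay_ms / 1000)
```

Returning the final retryable response instead of raising leaves it to the caller to decide how to report a persistent 429 or 5xx.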

Choosing the right backoff

429 from rate limiting: start with a 500–1000 ms wait and double it on each retry, up to a ceiling of 30–60 s; cap retries at 3–5 attempts.
500/5xx from transient errors: same shape, but with a shorter initial wait (300 ms), since these errors usually clear quickly.
Streaming requests: retrying a partial stream is fine if you haven’t started playback yet; once playback has begun, it’s usually better to fail than to splice in fresh audio. Decide based on UX.
Long-text auto-chunking: both SDKs apply the same retry policy to each underlying segment, so a single long-text call effectively gets a separate retry budget for each segment.
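
The first two guidelines above can be written down as per-status policies. This is an illustrative sketch: `BackoffPolicy`, `policy_for`, and `nominal_delays_ms` are hypothetical names, and the concrete numbers are single picks from the ranges given above, not SDK defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackoffPolicy:
    initial_ms: int    # wait before the first retry
    max_ms: int        # ceiling for any single wait
    max_attempts: int  # total tries, including the first call


# Values chosen from the ranges in the list above (assumptions, tune as needed).
RATE_LIMIT = BackoffPolicy(initial_ms=1_000, max_ms=60_000, max_attempts=5)
SERVER_ERROR = BackoffPolicy(initial_ms=300, max_ms=30_000, max_attempts=5)


def policy_for(status):
    """Pick a policy for the two cases named above; None for anything else."""
    if status == 429:
        return RATE_LIMIT
    if 500 <= status <= 599:
        return SERVER_ERROR
    return None


def nominal_delays_ms(policy):
    """Doubling waits between attempts, before any jitter is applied."""
    return [min(policy.initial_ms * 2 ** n, policy.max_ms)
            for n in range(policy.max_attempts - 1)]


print(nominal_delays_ms(RATE_LIMIT))    # [1000, 2000, 4000, 8000]
print(nominal_delays_ms(SERVER_ERROR))  # [300, 600, 1200, 2400]
```

With only five attempts, doubling never reaches the 30–60 s ceiling. In this sketch the ceiling is a safety net that only matters if you raise the attempt count.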

Idempotency

Most Supertone API calls are idempotent in effect: the same request produces the same audio output (modulo small synthesis variability). Retrying a successful-but-aborted request won’t double-bill you for credits, as long as the original request didn’t produce billed audio.

Voice-cloning uploads are the exception: every upload that completes creates a separate voice. If you re-upload after a network blip, first verify whether the previous attempt actually finished; otherwise you’ll end up with duplicate voices in list_custom_voices.
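
One way to guard a voice-cloning retry is to check the existing voices before posting again. The sketch below is hedged throughout: `upload_voice_once` is a hypothetical helper, `list_voices` and `upload` are caller-supplied callables standing in for the SDK's list and upload calls, and voices are assumed to be dict-like with a `"name"` key that you keep unique. Check the actual response shape of your SDK version before relying on it.

```python
def upload_voice_once(name, list_voices, upload):
    """Upload a cloned voice only if no voice with this name exists yet.

    list_voices() -> iterable of voices (assumed dict-like with a "name" key).
    upload(name)  -> the newly created voice.
    Returns (voice, created), where created is False if an earlier attempt
    had already finished and its voice was reused.
    """
    for voice in list_voices():
        if voice["name"] == name:
            return voice, False
    return upload(name), True
```

Matching on name only works if you treat names as unique on your side. If your application cannot guarantee that, match on some other marker you control.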

Anti-patterns to avoid

Retrying 4xx (other than 429) — these don’t get better with time. Fix the request.
No upper bound — always cap the number of attempts and the total elapsed time.
No jitter — fixed-delay retries cause thundering-herd patterns under widespread failures.
Retrying inside a loop without backoff — burns through credits and rate-limit budget in seconds.
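
The "no upper bound" and "no jitter" points can be addressed together in one loop that caps both the attempt count and the total elapsed time. This is a minimal sketch, not SDK code: `retry_with_deadline` and its parameters are hypothetical, it retries on `OSError` by default as a stand-in for whatever error classes you consider retryable, and it uses full jitter (a random wait between zero and the current backoff ceiling). `sleep` and `clock` are injectable so the loop can be tested without real waiting.

```python
import random
import time


def retry_with_deadline(fn, *, retry_on=(OSError,), max_attempts=5,
                        max_elapsed_s=30.0, base_s=0.5, cap_s=60.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Call fn(), retrying with jittered backoff under two hard limits.

    Gives up when either max_attempts is used up or the next wait would
    push the total elapsed time past max_elapsed_s.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    start = clock()
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            # Full jitter: random wait in [0, capped exponential backoff].
            delay = random.uniform(0, min(base_s * 2 ** attempt, cap_s))
            out_of_attempts = attempt == max_attempts - 1
            out_of_time = (clock() - start) + delay > max_elapsed_s
            if out_of_attempts or out_of_time:
                raise
            sleep(delay)
```

Checking the deadline before sleeping, rather than after, avoids spending the last few seconds of the budget on a wait that can no longer be followed by a retry.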

Related

Rate limits: What triggers 429 in the first place.
Error handling: Full reference of error codes and SDK error classes.
