Transient failures (429, 408, 500, network drops) are normal in any networked service. The right response is exponential backoff with jitter, applied only to retryable errors. Both SDKs ship with a configurable retry policy out of the box.

What to retry

| Status / failure | Retryable? | Notes |
| --- | --- | --- |
| 408 Request Timeout | Yes | Server side; wait briefly and retry. |
| 429 Too Many Requests | Yes | Rate limit; back off. |
| 500 Internal Server Error | Yes | Treat as transient. |
| 502, 503, 504 | Yes | Network/upstream; retry with backoff. |
| Network errors (DNS, broken pipe, connect timeout) | Yes | Retry; these are pure transport failures. |
| 400, 401, 402, 403, 404, 413, 415 | No | Caller-side problem; fix the request first. |
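
For callers that hit the REST API directly, the table above can be collapsed into a small predicate. This is a minimal sketch; `RETRYABLE_STATUSES` and `is_retryable_status` are hypothetical names, not part of either SDK. Network errors carry no status code and have to be handled separately by catching the transport exception.

```python
# Hypothetical helper, not part of the Supertone SDKs.
# Status codes the table above lists as retryable.
RETRYABLE_STATUSES = frozenset({408, 429, 500, 502, 503, 504})


def is_retryable_status(status: int) -> bool:
    """Return True if a response with this status code is worth retrying."""
    return status in RETRYABLE_STATUSES
```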

SDK configuration

Both SDKs accept a retry config at the client level and per-call.
import os

from supertone import Supertone
from supertone.utils.retries import RetryConfig

client = Supertone(
    api_key=os.environ["SUPERTONE_API_KEY"],
    retry_config=RetryConfig(
        strategy="backoff",
        backoff={
            "initial_interval": 500,        # ms
            "max_interval": 60_000,         # ms
            "exponent": 1.5,
            "max_elapsed_time": 3_600_000,  # ms (1 hour cap)
        },
        retry_connection_errors=True,
    ),
)
This retries on the SDK’s default set of retryable status codes (429, 5xx) with exponential backoff, capped at 1 minute between retries and 1 hour total. Tune the numbers to your latency SLO.
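
To get a feel for what these numbers mean in practice, the nominal delay schedule can be computed by hand. The sketch below assumes the delay before retry n is `initial_interval * exponent ** n`, capped at `max_interval`, and ignores any jitter the SDK may add; the exact formula the SDK uses is not confirmed here, and `backoff_schedule` is a hypothetical helper.

```python
def backoff_schedule(initial_ms=500, exponent=1.5, max_interval_ms=60_000, retries=14):
    """Nominal delay (ms) before each retry, ignoring jitter.

    Assumes delay = initial_ms * exponent ** n, capped at max_interval_ms.
    """
    return [min(initial_ms * exponent ** n, max_interval_ms) for n in range(retries)]


schedule = backoff_schedule()
print(schedule[:5])  # [500.0, 750.0, 1125.0, 1687.5, 2531.25]
```

Under that assumption the 1-minute cap is first reached on the 13th retry (n = 12), so with an exponent of 1.5 most realistic retry sequences never hit the per-retry ceiling.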

Manual retry pattern

If you call the REST API directly or want to wrap the SDK with your own logic:
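import random
import time

from supertone import errors  # assumed import path for the SDK error classes

def call_with_backoff(fn, *, max_attempts=5, base_ms=500, max_ms=60_000):
    """Call fn(), retrying retryable SDK errors with capped exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except (
            errors.TooManyRequestsErrorResponse,
            errors.InternalServerErrorResponse,
            errors.RequestTimeoutErrorResponse,
        ):
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts - 1:
                raise
            # Double the delay on each attempt, capped at max_ms.
            delay_ms = min(base_ms * (2 ** attempt), max_ms)
            # Jitter to avoid thundering herd: sleep 50-100% of the delay.
            delay_ms = delay_ms / 2 + random.uniform(0, delay_ms / 2)
            time.sleep(delay_ms / 1000)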

Choosing the right backoff

  • 429 from rate limiting: start with a 500–1000 ms wait and double it on each retry, up to a ceiling of 30–60 s; cap retries at 3–5 attempts.
  • 500/5xx from transient errors: same shape, but with a shorter initial wait (300 ms), since these usually clear quickly.
  • Streaming requests: retrying a partial stream is fine if you haven’t started playback yet; once playback has begun, it’s usually better to fail than to splice in fresh audio. Decide based on UX.
  • Long-text auto-chunking: both SDKs apply the same retry policy to each underlying segment, so the retry budget of a single long-text call applies per segment rather than to the call as a whole.
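
One way to encode the first two recommendations is a small policy lookup keyed on status code. This is a sketch only: `BackoffPolicy` and `policy_for_status` are hypothetical names, the numbers are illustrative picks from the ranges above, and treating 408 like a 5xx is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BackoffPolicy:
    initial_ms: int    # first wait
    max_ms: int        # ceiling for any single wait
    max_attempts: int  # total tries, including the first


# Illustrative values taken from the ranges above; tune for your workload.
RATE_LIMIT_POLICY = BackoffPolicy(initial_ms=1_000, max_ms=60_000, max_attempts=5)
SERVER_ERROR_POLICY = BackoffPolicy(initial_ms=300, max_ms=30_000, max_attempts=5)


def policy_for_status(status: int) -> Optional[BackoffPolicy]:
    """Pick a backoff policy for a status code, or None if it should not be retried."""
    if status == 429:
        return RATE_LIMIT_POLICY
    # Assumption: 408 is handled like a transient server error.
    if status == 408 or 500 <= status <= 599:
        return SERVER_ERROR_POLICY
    return None
```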

Idempotency

Supertone API calls are idempotent in effect: the same request produces the same audio output (modulo small synthesis variability). Retrying a successful-but-aborted request won’t double-bill you for credits, as long as the original request didn’t produce billed audio. Voice-cloning uploads are the exception: every upload that completes creates a separate voice. If you re-upload after a network blip, first verify whether the previous attempt actually finished; otherwise you’ll end up with duplicate voices in list_custom_voices.
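
The check-before-reupload step can be sketched as a guard around the upload call. The code below is deliberately SDK-agnostic: `upload_voice_once` is a hypothetical helper, and `list_voice_names` and `upload` are placeholder callables you would wrap around the voice listing and upload calls. It also assumes a voice can be recognised by its name, which may not hold for your account.

```python
from typing import Callable, Iterable, Optional


def upload_voice_once(
    name: str,
    list_voice_names: Callable[[], Iterable[str]],
    upload: Callable[[], str],
) -> Optional[str]:
    """Upload only if no existing voice already has this name.

    Returns the new voice ID, or None when an earlier attempt already succeeded.
    """
    if name in set(list_voice_names()):
        # A previous attempt finished; uploading again would create a duplicate.
        return None
    return upload()
```

Run the guard before every retry of an upload, not just the first one, so that a late-completing earlier attempt is still caught.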

Anti-patterns to avoid

  • Retrying 4xx (other than 429) — these don’t get better with time. Fix the request.
  • No upper bound — always cap the number of attempts and the total elapsed time.
  • No jitter — fixed-delay retries cause thundering-herd patterns under widespread failures.
  • Retrying inside a loop without backoff — burns through credits and rate-limit budget in seconds.
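
A retry loop that avoids all four anti-patterns might look like the sketch below. `RetryableError` and `retry_bounded` are hypothetical names; in real code you would catch whichever exceptions you classify as retryable. Any other exception, such as a 4xx error, propagates immediately without a retry.

```python
import random
import time


class RetryableError(Exception):
    """Placeholder for whatever your code treats as retryable."""


def retry_bounded(fn, *, max_attempts=5, max_elapsed_s=30.0, base_s=0.5, max_s=60.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Retry fn() on RetryableError, bounded by attempt count and total elapsed time."""
    start = clock()
    for attempt in range(max_attempts):
        try:
            return fn()
        except RetryableError:
            # Exponential backoff, capped per wait, with 50-100% jitter.
            delay = min(base_s * 2 ** attempt, max_s)
            delay = random.uniform(delay / 2, delay)
            out_of_attempts = attempt == max_attempts - 1
            out_of_time = clock() - start + delay > max_elapsed_s
            if out_of_attempts or out_of_time:
                raise  # Give up: re-raise the last retryable error.
            sleep(delay)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised in tests without real waiting.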

Rate limits

What triggers 429 in the first place.

Error handling

Full reference of error codes and SDK error classes.