# Rate limits

> Per-minute request limits by account tier, and how to respond when limits are exceeded.

The Supertone API enforces request-rate limits to protect service stability. Limits scale with your plan; if you need higher capacity, an [enterprise plan](https://docs.google.com/forms/d/1YexQpjpK0ZEou12blTytkZLqvrV-Uv95GbhxoOQ54R8/edit) is available.

## Limits by tier

### Speech generation (`text_to_speech`, `stream_speech`)

| Tier           | Requests per minute |
| -------------- | :-----------------: |
| Free & Starter |        **20**       |
| Creator        |        **30**       |
| Pro            |        **60**       |
| Enterprise     |        Custom       |

### Voice cloning (`create_cloned_voice`)

| Tier                  | Requests per minute |
| --------------------- | :-----------------: |
| Starter, Creator, Pro |        **10**       |
| Free                  |    Not available    |
| Enterprise            |        Custom       |

Other endpoints (listing voices, usage queries, credit balance, predict-duration) do not count against the speech generation limit, but they may still be throttled if abused.

## When you exceed a limit

The API returns:

```
HTTP/1.1 429 Too Many Requests
```

In some cases the server may also delay or drop requests temporarily to absorb the spike. Treat any `429` response as a signal to pause and retry — see [Retries and backoff](/en/docs/production/retries-and-backoff).

## Handling rate limits in code

<Tabs>
  <Tab title="Python">
    ```python theme={"dark"}
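    from supertone import Supertone, errors

    # `client` is assumed to be an already-configured `Supertone` instance.
    try:
        response = client.text_to_speech.create_speech(...)
    except errors.TooManyRequestsErrorResponse:
        # Retry after a backoff (see the retries-and-backoff guide).
        # `wait_then_retry` is a placeholder for your own retry logic.
        wait_then_retry()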
    ```
  </Tab>

  <Tab title="TypeScript">
    ```typescript theme={"dark"}
    import * as errors from "@supertone/supertone/models/errors";

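    // `client` is assumed to be an already-configured Supertone SDK instance.
    try {
      const response = await client.textToSpeech.createSpeech({ /* ... */ });
    } catch (err) {
      if (err instanceof errors.TooManyRequestsErrorResponse) {
        // Retry after a backoff; `waitThenRetry` is a placeholder for your own retry logic.
        await waitThenRetry();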
      } else {
        throw err;
      }
    }
    ```
  </Tab>
</Tabs>

Both SDKs accept a retry option (`retry_config` in Python, `retryConfig` in TypeScript) that automatically retries `429` and transient `5xx` responses with exponential backoff. See [Retries and backoff](/en/docs/production/retries-and-backoff) for a tuned configuration.
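
If you handle `429` yourself instead of using the SDK retry option, the `wait_then_retry()` placeholder in the Python example could be filled in along these lines. This is a minimal sketch, not the SDK's own retry logic: `call_with_backoff`, its parameters, and the default delays are hypothetical names and values chosen for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    is_rate_limited: Callable[[Exception], bool],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying with exponential backoff while it raises a rate-limit error.

    All names and defaults here are illustrative assumptions.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            # Re-raise anything that is not a rate-limit error, and give up
            # once the attempt budget is spent.
            if not is_rate_limited(exc) or attempt == max_attempts - 1:
                raise
            # Exponential growth capped at max_delay, with full jitter so
            # concurrent clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
    raise RuntimeError("max_attempts must be at least 1")
```

With the SDK, `fn` would wrap the request (for example, a lambda around `client.text_to_speech.create_speech`) and `is_rate_limited` would check `isinstance(exc, errors.TooManyRequestsErrorResponse)`. The `sleep` parameter is injectable so the helper can be tested without real delays.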

## Designing for the limit

* **Batch upstream.** If your app generates many sentences per user action (e.g. translating a paragraph), serialize them through a queue rather than firing them all at once.
* **Throttle at the edge.** Apply your own per-user limit so a single user's burst can't consume your account's whole minute.
* **Long-text auto-chunking.** A single 2,000-character call becomes \~7 API calls under the hood. Count each of those calls against your per-minute budget.
* **Streaming chats.** Sentence-by-sentence streaming TTS uses one API call per sentence. A multi-paragraph response can exhaust the Free tier's 20 requests per minute within a few seconds.
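
The pacing and budgeting advice above can be sketched as a small client-side limiter. This is one possible approach under stated assumptions, not part of the Supertone SDK: `SlidingWindowLimiter` and `estimated_calls` are hypothetical names, and the 300-character chunk size is only inferred from the figure of 2,000 characters becoming roughly 7 calls. The server's exact windowing behavior is not documented on this page, so a rolling 60-second window is assumed as the conservative choice.

```python
import math
import time
from collections import deque
from typing import Callable, Deque


def estimated_calls(text_length: int, chunk_size: int = 300) -> int:
    """Estimate API calls for one long text (chunk_size is an assumption)."""
    return max(1, math.ceil(text_length / chunk_size))


class SlidingWindowLimiter:
    """Allow at most `limit` calls in any rolling `window` seconds.

    Illustrative sketch only; not thread-safe.
    """

    def __init__(
        self,
        limit: int,
        window: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.limit = limit
        self.window = window
        self._clock = clock
        self._sleep = sleep
        self._stamps: Deque[float] = deque()

    def acquire(self, calls: int = 1) -> None:
        """Block until `calls` more requests fit in the window, then record them."""
        if calls > self.limit:
            raise ValueError("calls exceeds the per-window limit")
        while True:
            now = self._clock()
            # Drop timestamps that have aged out of the window.
            while self._stamps and now - self._stamps[0] >= self.window:
                self._stamps.popleft()
            if len(self._stamps) + calls <= self.limit:
                self._stamps.extend([now] * calls)
                return
            # Wait until enough of the oldest requests expire to make room.
            needed = len(self._stamps) + calls - self.limit
            self._sleep(self._stamps[needed - 1] + self.window - now)
```

For example, on the Pro tier you might create `SlidingWindowLimiter(60)` once and call `limiter.acquire(estimated_calls(len(text)))` before each long-text request, so the whole chunked call is reserved up front. Keeping one limiter per end user, with a smaller limit, gives the per-user throttle described above.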

## Need higher limits?

If you're hitting the limit consistently or operating a high-traffic service, contact us for an enterprise plan with custom limits, dedicated capacity, and account-level support.

<Card title="Enterprise inquiry" icon="building" href="https://docs.google.com/forms/d/1YexQpjpK0ZEou12blTytkZLqvrV-Uv95GbhxoOQ54R8/edit">
  Share your use case and traffic shape — we'll respond with options.
</Card>
