Forecast credit cost, check your balance, monitor usage by voice or time bucket, and set up anomaly alerts.
The Supertone API uses a credit-based billing model that is shared with Supertone Play. This page covers how credits work, how to forecast cost before generating, and the endpoints you should integrate into your observability stack.
Credits are deducted per second of generated audio.
Credits are shared with Supertone Play: the same balance applies to both. Credits added to your balance in Play are immediately available to the API, and vice versa.
Both preset voices and custom voices deduct from the same balance.
predict_duration is free. No credits are deducted — use it for cost preview and pre-flighting.
predict_duration returns the expected length of generated speech for a given text, without producing audio. Use it to:
Show users a “this will take ~12 seconds” estimate before generation.
Forecast credit cost (no credits are deducted by this endpoint).
Decide whether a sentence fits a UI slot or budget.
predict_duration does not auto-chunk; the same 300-character limit applies. For longer scripts, split the text yourself and sum the predicted durations.
Python

```python
import os

from supertone import Supertone

VOICE_ID = "20160a4c5ba38967330c84"  # replace with your voice ID

with Supertone(api_key=os.environ["SUPERTONE_API_KEY"]) as client:
    response = client.text_to_speech.predict_duration(
        voice_id=VOICE_ID,
        text="This is a sentence to estimate.",
        language="en",
    )

print(f"Estimated duration: {response.duration:.2f}s")
```

TypeScript

```typescript
import { Supertone } from "@supertone/supertone";

const VOICE_ID = "20160a4c5ba38967330c84"; // replace with your voice ID

const client = new Supertone({ apiKey: process.env.SUPERTONE_API_KEY });

const response = await client.textToSpeech.predictDuration({
  voiceId: VOICE_ID,
  predictTTSDurationRequest: {
    text: "This is a sentence to estimate.",
    language: "en",
  },
});

console.log(`Estimated duration: ${response.duration?.toFixed(2)}s`);
```

cURL

```bash
VOICE_ID="20160a4c5ba38967330c84"

curl -X POST "https://supertoneapi.com/v1/predict-duration/$VOICE_ID" \
  -H "x-sup-api-key: $SUPERTONE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "This is a sentence to estimate.",
    "language": "en"
  }'
```
Response:
{ "duration": 2.18 }
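For scripts longer than the 300-character limit, the split-and-sum approach could look roughly like the sketch below. `split_text` and `predict_total_duration` are hypothetical helper names, not part of the SDK, and the `predict` callable is an assumption: it stands in for whatever wrapper you write around predict_duration that returns a duration in seconds. The sentence splitting is deliberately naive, and a single sentence over the limit is cut mid-word.

```python
import re
from typing import Callable, List

MAX_CHARS = 300  # per-request character limit stated in this page


def split_text(text: str, max_chars: int = MAX_CHARS) -> List[str]:
    """Greedily pack sentences into chunks of at most max_chars characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # Hard-split any single sentence that is itself over the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # current is non-empty here, so flush it and start a new chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def predict_total_duration(text: str, predict: Callable[[str], float]) -> float:
    """Sum the predicted duration (seconds) over every chunk of text."""
    return sum(predict(chunk) for chunk in split_text(text))
```

With the Python client shown earlier, `predict` could be something like `lambda chunk: client.text_to_speech.predict_duration(voice_id=VOICE_ID, text=chunk, language="en").duration`.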
The request body has the same shape as Create speech (text, language, style, model, voice_settings), minus output_format, include_phonemes, and normalized_text, none of which affect the predicted length.
Tips for accurate forecasts:
Speed changes the result. voice_settings.speed scales the duration. If you preview at one speed and generate at another, the actual length will differ. Use the same speed in both calls.
Match the model. predict_duration accepts the same model field as create_speech. If your generation will use a specific model, predict against it.
For pre-flighting batch jobs or per-user budgets, call predict_duration first and sum the durations to decide whether to proceed.
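A pre-flight check along these lines might be sketched as follows. `preflight` is a hypothetical helper, and `predict` is again an assumed wrapper around predict_duration that returns seconds. The budget is expressed in seconds of audio; converting it to credits depends on your plan's per-second rate, which this page does not state.

```python
from typing import Callable, Iterable, Tuple


def preflight(
    texts: Iterable[str],
    predict: Callable[[str], float],
    budget_seconds: float,
) -> Tuple[float, bool]:
    """Return (total predicted seconds, whether the batch fits the budget)."""
    total = sum(predict(text) for text in texts)
    return total, total <= budget_seconds


# Example with a stand-in predictor that reports 2 seconds per item.
total, ok = preflight(["one", "two", "three"], lambda text: 2.0, budget_seconds=5.0)
print(f"Predicted {total:.1f}s of audio; proceed: {ok}")
```

Each text passed in should already respect the 300-character limit, since predict_duration does not auto-chunk.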
A simple cron job that posts to your monitoring tool when the balance drops below a threshold gives you days of warning before a 402 Payment Required outage.
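The body of such a cron job could be as small as the sketch below. Both callables are assumptions supplied by you: `get_balance` stands in for a call to the balance endpoint, and `notify` for whatever posts to your monitoring tool. Neither name comes from the SDK.

```python
from typing import Callable


def check_balance(
    get_balance: Callable[[], float],
    notify: Callable[[str], None],
    threshold: float,
) -> bool:
    """Notify when the balance is below the threshold; return True if alerted."""
    balance = get_balance()
    if balance < threshold:
        notify(
            f"Supertone credit balance is {balance:g}, "
            f"below the alert threshold of {threshold:g}"
        )
        return True
    return False
```

Setting the threshold to a few days of typical spend is what turns this into advance warning instead of a notification that the outage has already started.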
Spend spike — compare today’s minutes-generated to the rolling 7-day median. Alert if today is more than (say) 3× the median.
New voice in production — if a voice not in your expected set appears in get_voice_usage, flag it. This catches accidental hard-coded test voices in production code.
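Both checks reduce to a few lines once the usage data is in hand. The sketch below assumes you have already extracted daily minutes-generated and the voice IDs seen from the usage endpoints; the function names and input shapes are hypothetical, and the 3× factor is the illustrative value from the text above.

```python
from statistics import median
from typing import Iterable, Sequence, Set


def is_spend_spike(
    today_minutes: float,
    previous_days: Sequence[float],
    factor: float = 3.0,
) -> bool:
    """True when today's minutes exceed factor times the median of previous days."""
    if not previous_days:
        return False  # no baseline yet, so nothing to compare against
    # Note: a baseline of 0 means any usage at all counts as a spike.
    return today_minutes > factor * median(previous_days)


def unexpected_voices(seen_voice_ids: Iterable[str], expected: Set[str]) -> Set[str]:
    """Return voice IDs that appear in usage but not in the expected set."""
    return set(seen_voice_ids) - expected
```

Passing the last seven daily totals as `previous_days` gives the rolling 7-day median described above. The median is less sensitive than the mean to a single unusually busy day in the baseline.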