Pick your language below and run the snippet. The Python and TypeScript SDKs handle authentication, retries, and chunking for long text out of the box.
The code uses an example voice_id. Once you’ve heard it work, swap it for any voice from the voice library.
import os

from supertone import Supertone

VOICE_ID = "20160a4c5ba38967330c84"  # example voice; replace with your own

with Supertone(api_key=os.environ["SUPERTONE_API_KEY"]) as client:
    response = client.text_to_speech.create_speech(
        voice_id=VOICE_ID,
        text="Hello from Supertone. This audio was generated with the Python SDK.",
        language="en",
        output_format="wav",
    )

    with open("speech.wav", "wb") as f:
        f.write(response.result.read())

print("Saved speech.wav")
Supertone(api_key=...) / new Supertone({ apiKey }): Creates a client. The key is sent in the x-sup-api-key header.
voice_id: Identifies which character speaks the text.
text: The script to synthesize. Max 300 characters per API call. SDKs auto-chunk longer text.
language: Language of the text. Required, and must be supported by the voice and model.
model: Defaults to sona_speech_1. See Models for trade-offs.
output_format: wav (default) or mp3.
The SDKs also expose typed enum constants (e.g. models.APIConvertTextToSpeechUsingCharacterRequestLanguage.EN) if you prefer type safety over plain strings. Both forms work.
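The 300-character limit on text means a longer script has to be split before it reaches the API. The SDKs do this for you, but if you call the API directly, a minimal sketch of the idea might look like the following. It is an illustration only and not the SDKs' actual chunking logic: the chunk_text helper and its sentence-boundary rule are hypothetical, and only the 300-character figure comes from the parameter list above.

```python
import re

MAX_CHARS = 300  # per-call limit on `text` stated in the parameter list


def chunk_text(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split text into pieces of at most max_chars, preferring sentence boundaries.

    Hypothetical helper for illustration; the SDKs may chunk differently.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-wrapped,
        # which may cut in the middle of a word.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        # Otherwise, pack sentences together while they still fit.
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be passed as text to its own create_speech call, with the resulting audio segments joined afterwards. Splitting on sentence boundaries rather than at a fixed offset is a design choice here: it keeps each request a complete phrase, which is likely to sound more natural at the seams.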