A custom voice is a voice clone tied to your account. Once registered, it behaves exactly like a preset voice — same TTS endpoint, same parameters — but it lives at a separate set of endpoints and is only callable by the account that created it. Custom voices can be created two ways:
  • In Supertone Play: upload a sample and clone through the UI.
  • Via the API: POST /v1/custom-voices/cloned-voice with an audio file.
Either path produces the same kind of voice. Voices cloned in Play show up in list_custom_voices, and voices cloned via the API show up in Play. There’s no sync step.
Voice cloning via the API is not available on the Free tier. Cloning in Play is available on all paid tiers.

Endpoint overview

| Endpoint | Purpose |
| --- | --- |
| POST /v1/custom-voices/cloned-voice | Create a new clone from an uploaded audio sample. |
| GET /v1/custom-voices | List all custom voices on your account. |
| GET /v1/custom-voices/search | Filter custom voices by name or description. |
| GET /v1/custom-voices/{voice_id} | Retrieve a single custom voice. |
| PATCH /v1/custom-voices/{voice_id} | Update name or description. |
| DELETE /v1/custom-voices/{voice_id} | Permanently delete a custom voice. |
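
If you call the REST endpoints directly rather than through the SDK, it can help to keep the method and path pairs in one place. The sketch below only builds the method and path for each operation. The operation labels (`create_clone`, `list`, and so on) and the `resolve` helper are hypothetical names chosen for illustration, and the base URL and authentication header are deliberately left out because they are not stated on this page.

```python
from urllib.parse import quote

# Method and path for each custom-voice operation, copied from the table above.
# The dictionary keys are illustrative labels, not official operation names.
ENDPOINTS = {
    "create_clone": ("POST", "/v1/custom-voices/cloned-voice"),
    "list": ("GET", "/v1/custom-voices"),
    "search": ("GET", "/v1/custom-voices/search"),
    "get": ("GET", "/v1/custom-voices/{voice_id}"),
    "update": ("PATCH", "/v1/custom-voices/{voice_id}"),
    "delete": ("DELETE", "/v1/custom-voices/{voice_id}"),
}


def resolve(operation, voice_id=None):
    """Return (method, path) for an operation, filling in voice_id when the path needs it."""
    method, template = ENDPOINTS[operation]
    if "{voice_id}" in template:
        if not voice_id:
            raise ValueError(f"operation {operation!r} requires a voice_id")
        # Percent-encode the ID so it is always a single, safe path segment.
        return method, template.format(voice_id=quote(voice_id, safe=""))
    return method, template
```

For example, `resolve("update", voice_id)` yields the `PATCH` method and the per-voice path, which you can then hand to whichever HTTP client you use.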

Create a clone

Upload a clean audio sample of the voice you want to clone:
import os
from supertone import Supertone

with Supertone(api_key=os.environ["SUPERTONE_API_KEY"]) as client:
    with open("voice_sample.wav", "rb") as f:
        response = client.custom_voices.create_cloned_voice(
            files={"file_name": "voice_sample.wav", "content": f.read()},
            name="Hana — narrator voice",
            description="Calm, mid-pitch female voice for audiobook narration.",
        )

    print("Created custom voice:", response.voice_id)
The response includes the new voice_id. Save it — that’s what you’ll pass to TTS calls.

Upload constraints

  • Format: WAV or MP3
  • Size: under 3 MB
  • Name: up to 100 characters
Clean, mono, single-speaker audio (5–30 seconds) yields the best clones. Avoid background music, multiple voices, or heavy room noise.
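
Checking a sample locally before uploading can save a failed request. The following is a minimal sketch of such a pre-flight check, not part of the SDK: `check_upload` is a hypothetical helper, "3 MB" is read conservatively as 3,000,000 bytes (an assumption, since the page does not say whether the limit is decimal or binary), and the mono and duration checks run only for WAV files that Python's standard `wave` module can parse.

```python
import os
import wave

MAX_BYTES = 3_000_000        # assumption: "under 3 MB" read as under 3,000,000 bytes
MAX_NAME_CHARS = 100
ALLOWED_EXTENSIONS = {".wav", ".mp3"}


def check_upload(path: str, name: str) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for a sample file and voice name.

    Errors correspond to the hard upload constraints; warnings correspond to
    the quality recommendations (mono, 5 to 30 seconds).
    """
    errors, warnings = [], []

    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        errors.append(f"unsupported format {ext or '(none)'}: use WAV or MP3")
    if os.path.getsize(path) >= MAX_BYTES:
        errors.append("file is not under 3 MB")
    if len(name) > MAX_NAME_CHARS:
        errors.append(f"name is {len(name)} characters; the limit is {MAX_NAME_CHARS}")

    if ext == ".wav":
        try:
            with wave.open(path, "rb") as wav:
                channels = wav.getnchannels()
                seconds = wav.getnframes() / wav.getframerate()
        except wave.Error:
            # Not a PCM WAV the standard library can read; skip the quality checks.
            return errors, warnings
        if channels != 1:
            warnings.append(f"audio has {channels} channels; mono is recommended")
        if not 5 <= seconds <= 30:
            warnings.append(f"audio is {seconds:.1f} s; 5 to 30 s is recommended")

    return errors, warnings
```

A typical call would be `errors, warnings = check_upload("voice_sample.wav", "Hana narrator voice")`, uploading only when `errors` is empty.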

List and search

# List all
result = client.custom_voices.list_custom_voices(page_size=20)
for voice in result.items or []:
    print(voice.voice_id, voice.name)

# Search by name or description
result = client.custom_voices.search_custom_voices(name="narrator")
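
This page does not specify how the search filter matches (exact or partial), so when you need one specific voice it may be safer to confirm the match on the client side. The sketch below assumes only what the listing code above shows, namely that each item exposes `voice_id` and `name`. The helper `find_voice_id` is a hypothetical name, not an SDK function.

```python
from typing import Iterable, Optional


def find_voice_id(voices: Optional[Iterable], name: str) -> Optional[str]:
    """Return the voice_id of the first voice whose name matches exactly.

    The comparison ignores case and surrounding whitespace. Returns None when
    nothing matches or when the list is empty or None.
    """
    wanted = name.strip().casefold()
    for voice in voices or []:
        if (voice.name or "").strip().casefold() == wanted:
            return voice.voice_id
    return None
```

With the result of either call above, `find_voice_id(result.items, "My narrator")` gives you the ID to pass to TTS, or `None` if no voice has that name.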

Use a custom voice in TTS

Once the clone is registered, call text_to_speech.create_speech exactly like you would with a preset voice — just pass the custom voice_id:
response = client.text_to_speech.create_speech(
    voice_id=CUSTOM_VOICE_ID,
    text="The first chapter begins on a quiet rainy morning.",
    language="en",
    model="sona_speech_2",
)
Custom voices support the same voice_settings, output_format, include_phonemes, and normalized_text fields as preset voices.
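
Because preset and custom voices take the same fields, one request builder can serve both. The sketch below is an illustration rather than SDK code: `speech_kwargs` is a hypothetical helper, and it assumes the four optional fields named above are accepted as keyword arguments by `create_speech`. The shapes of their values are not described on this page, so the helper passes them through unchanged.

```python
# Optional fields this page lists as shared by preset and custom voices.
OPTIONAL_FIELDS = {"voice_settings", "output_format", "include_phonemes", "normalized_text"}


def speech_kwargs(voice_id, text, language="en", model="sona_speech_2", **optional):
    """Build keyword arguments for a TTS call; works for any voice_id.

    Rejects optional fields that are not in the documented set, so a typo
    fails locally instead of being sent to the API.
    """
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise TypeError(f"unsupported fields: {sorted(unknown)}")
    return {
        "voice_id": voice_id,
        "text": text,
        "language": language,
        "model": model,
        **optional,
    }
```

Under those assumptions, `client.text_to_speech.create_speech(**speech_kwargs(CUSTOM_VOICE_ID, "Hello."))` and the same call with a preset voice ID differ only in the ID.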

Update or delete

# Rename / update description
client.custom_voices.edit_custom_voice(
    voice_id=CUSTOM_VOICE_ID,
    name="Hana — narrator (v2)",
    description="Improved take with cleaner room tone.",
)

# Permanently delete
client.custom_voices.delete_custom_voice(voice_id=CUSTOM_VOICE_ID)

The same calls in TypeScript:
// Rename / update description
await client.customVoices.editCustomVoice({
  voiceId: CUSTOM_VOICE_ID,
  name: "Hana — narrator (v2)",
  description: "Improved take with cleaner room tone.",
});

// Permanently delete
await client.customVoices.deleteCustomVoice({ voiceId: CUSTOM_VOICE_ID });

Important constraints

  • Account-scoped. A custom voice is callable only by the account that created it. Calling someone else’s custom voice — even if you know the ID — returns 403 Forbidden.
  • Same credits. Custom voice calls deduct credits at the same rate as preset voices.
  • Permissions and disclosure. Make sure you have rights to clone any voice you upload. Check your jurisdiction’s rules around AI-generated voices and end-user disclosure.
