POST
/
v1
/
text-to-speech
/
{voice_id}
curl --request POST \
  --url https://supertoneapi.com/v1/text-to-speech/{voice_id} \
  --header 'Content-Type: application/json' \
  --header 'x-sup-api-key: <x-sup-api-key>' \
  --data '{
  "text": "<string>",
  "language": "en",
  "style": "<string>",
  "model": "sona_speech_1",
  "voice_settings": {
    "pitch_shift": 0,
    "pitch_variance": 1,
    "speed": 1
  }
}'
This response does not have an example.

This is the TTS (Text-to-Speech) API that converts text to speech using a specified voice.
Through this API, you can generate natural speech from desired sentences.

Basic Usage

  • {voice_id}: Only character-based IDs can be used
  • Parameters like language, style, model are included in the Request Body

Request Body Items Description

ItemRequiredDescription
textโœ…Text to convert. Up to 300 characters can be input
languageโœ…Language of the text. One of ko, en, ja
styleโŒEmotional style. E.g., neutral, happy, sad, etc. If not specified, the characterโ€™s default style is applied
modelโŒModel to use. Default is sona_speech_1. Currently only this model is available
voice_settingsโŒPitch/speed adjustment. Includes pitch_shift, pitch_variance, speed fields (defaults: 0, 1, 1)

Request Example

POST /v1/text-to-speech/{voice_id}
Content-Type: application/json
x-sup-api-key: [YOUR_API_KEY]

{
  "text": "Thank you for calling.",
  "language": "en",
  "style": "happy",
  "model": "sona_speech_1",
  "voice_settings": {
    "pitch_shift": 0,
    "pitch_variance": 1,
    "speed": 1
  }
}

Response

  • Response body is returned as binary audio file, default format is wav
  • Can also respond in mp3 format by passing output_format=mp3 as query parameter
  • Voice length (seconds) can be checked through X-Audio-Length header

Important Notes

  • 400 error occurs when text length exceeds 300 characters.
  • Calls are possible even without style, but default styles may vary by character, so please call Get Voices API to check the default style (the first value in the styles array is the default).
  • The audio file in the response can be directly saved or played (appropriate handling required depending on client).

Headers

x-sup-api-key
string
required

API key for the service

Path Parameters

voice_id
string
required

Query Parameters

output_format
enum<string>
default:wav

The desired output format of the audio file (wav, mp3). Default is wav.

Available options:
wav,
mp3

Body

application/json

Response

200
audio/wav

Audio file converted from text. The response includes an X-Audio-Length header with the duration in seconds.

The response is of type file.