POST /v1/text-to-speech/{voice_id}/stream
Convert text to speech with streaming response
curl --request POST \
  --url https://supertoneapi.com/v1/text-to-speech/{voice_id}/stream \
  --header 'Content-Type: application/json' \
  --header 'x-sup-api-key: <x-sup-api-key>' \
  --data '{
  "text": "<string>",
  "language": "en",
  "style": "<string>",
  "model": "sona_speech_1",
  "output_format": "wav",
  "voice_settings": {
    "pitch_shift": 0,
    "pitch_variance": 1,
    "speed": 1
  }
}'
This is a streaming TTS (Text-to-Speech) API that converts the input text to speech and returns the result as an audio stream.

Basic Usage

  • {voice_id}: Only character-level IDs are supported
  • Parameters such as language, style, and model must be included in the Request Body

Request Body Field Descriptions

Field | Required | Description
text | ✅ | Text to be converted. Up to 300 characters allowed
language | ✅ | Language of the text. One of ko, en, or ja
style | ❌ | Emotion style. e.g., neutral, happy, sad, etc. Defaults to character's base style if unspecified
model | ❌ | Model to use. Default is sona_speech_1. Currently, only this model is supported
voice_settings | ❌ | Controls pitch/speed. Includes pitch_shift, pitch_variance, and speed (default: 0, 1, 1)
output_format | ❌ | Desired audio file format. wav or mp3. (Default: wav)
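
The field rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the official API: the function name build_tts_payload and its constants are hypothetical, and it assumes the 300-character limit can be approximated with Python's len(), which may not match how the server counts characters.

```python
ALLOWED_LANGUAGES = {"ko", "en", "ja"}
ALLOWED_FORMATS = {"wav", "mp3"}
MAX_TEXT_LENGTH = 300  # limit stated in the field table


def build_tts_payload(text, language, style=None, model="sona_speech_1",
                      output_format="wav", pitch_shift=0, pitch_variance=1,
                      speed=1):
    """Validate inputs against the field table and return the body as a dict.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if not text:
        raise ValueError("text is required")
    # len() counts Unicode code points; the server may count differently.
    if len(text) > MAX_TEXT_LENGTH:
        raise ValueError(f"text exceeds {MAX_TEXT_LENGTH} characters")
    if language not in ALLOWED_LANGUAGES:
        raise ValueError("language must be one of ko, en, ja")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError("output_format must be wav or mp3")

    payload = {
        "text": text,
        "language": language,
        "model": model,
        "output_format": output_format,
        "voice_settings": {
            "pitch_shift": pitch_shift,
            "pitch_variance": pitch_variance,
            "speed": speed,
        },
    }
    # style is optional; omitting it lets the character's base style apply.
    if style is not None:
        payload["style"] = style
    return payload
```

For example, build_tts_payload("Thank you for calling.", "en", style="happy") produces a body equivalent to the one in the usage example, with output_format set explicitly to wav.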

Usage Example

POST /v1/text-to-speech/{voice_id}/stream
Content-Type: application/json
x-sup-api-key: [YOUR_API_KEY]

{
  "text": "Thank you for calling.",
  "language": "en",
  "style": "happy",
  "model": "sona_speech_1",
  "voice_settings": {
    "pitch_shift": 0,
    "pitch_variance": 1,
    "speed": 1
  }
}

Response

  • The response body is streamed as binary audio data. The default format is wav
  • If you include output_format=mp3 as a query parameter, the response is returned in MP3 format
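
Because the body arrives as a binary stream, a client can write it to disk piece by piece instead of buffering the whole file. The sketch below uses only Python's standard library. The helper names save_stream and stream_tts_to_file, the 4096-byte chunk size, and the output path are assumptions for illustration; the base URL and headers are taken from the curl example on this page.

```python
import json
import urllib.request

BASE_URL = "https://supertoneapi.com"  # host taken from the curl example


def save_stream(response, out_file, chunk_size=4096):
    """Copy a file-like response to out_file in chunks; return bytes written."""
    total = 0
    while True:
        chunk = response.read(chunk_size)
        if not chunk:  # empty bytes means the stream has ended
            break
        out_file.write(chunk)
        total += len(chunk)
    return total


def stream_tts_to_file(voice_id, api_key, payload, path):
    """POST the JSON payload and stream the audio response into `path`.

    Hypothetical helper; urlopen raises urllib.error.HTTPError on 4xx/5xx,
    for example the 400 returned for over-long text.
    """
    url = f"{BASE_URL}/v1/text-to-speech/{voice_id}/stream"
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-sup-api-key": api_key,
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        return save_stream(response, out)
```

Keeping the copy loop in save_stream, separate from the network call, means the same loop could feed an audio player or another sink instead of a file.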

Notes

  • A 400 error will occur if the text length exceeds 300 characters.
  • The API can be called without specifying style, but the default style may vary by character.
    Please use the Get Voices API to check the default (the first value in the style array is the default).
  • The returned audio file can be saved or played directly. (Appropriate handling may be required depending on the client.)
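
One way to stay under the 300-character limit is to split longer text on the client and send each piece as its own request. This is only a sketch under stated assumptions: split_text is a hypothetical helper, the choice of sentence-ending marks is an assumption, and the cut is naive (it would also split after the decimal point in a number such as 3.14).

```python
MAX_TEXT_LENGTH = 300  # limit stated in the notes above
SENTENCE_MARKS = ".!?。！？"  # assumed sentence-ending marks for ko/en/ja text


def split_text(text, limit=MAX_TEXT_LENGTH):
    """Split text into pieces of at most `limit` characters.

    Prefers to cut after the last sentence-ending mark inside the window,
    then at the last space, and finally at the hard limit.
    """
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        window = remaining[:limit]
        # Index just past the last sentence mark in the window, or 0 if none.
        cut = max(window.rfind(mark) for mark in SENTENCE_MARKS) + 1
        if cut <= 0:
            cut = window.rfind(" ")  # fall back to the last space
        if cut <= 0:
            cut = limit  # no natural boundary: hard split
        chunks.append(remaining[:cut].strip())
        remaining = remaining[cut:].strip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Each returned piece can then be sent in a separate request and the resulting audio played or concatenated in order; how the pieces are joined is left to the client.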

Headers

x-sup-api-key
string
required

API key for the service

Path Parameters

voice_id
string
required

Query Parameters

output_format
enum<string>
default: wav

The desired output format of the audio file (wav, mp3). Default is wav.

Available options: wav, mp3
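
As an illustration of the query-parameter form described in this section, the request URL can be assembled as follows. The helper build_stream_url is hypothetical, and the base URL comes from the curl example. Note that this page also lists output_format among the request body fields; this sketch follows the query-parameter description only.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://supertoneapi.com"  # host taken from the curl example


def build_stream_url(voice_id, output_format="wav"):
    """Return the streaming endpoint URL with output_format as a query parameter.

    Hypothetical helper for illustration.
    """
    if output_format not in ("wav", "mp3"):
        raise ValueError("output_format must be wav or mp3")
    # Percent-encode the voice ID so unusual characters cannot break the path.
    path = f"/v1/text-to-speech/{quote(voice_id, safe='')}/stream"
    return f"{BASE_URL}{path}?{urlencode({'output_format': output_format})}"
```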

Body

application/json

Response

200
audio/wav

Streaming audio data in binary format

The response is of type file.