Basic Usage
- {voice_id}: Only character-based IDs can be used.
- Parameters such as language, style, and model are included in the request body.
Request Body Items Description
Item | Required | Description |
---|---|---|
text | ✅ | Text to convert. Maximum 300 characters |
language | ✅ | Language of the text. One of ko, en, ja |
style | ❌ | Emotional style, e.g. neutral, happy, sad. If not specified, the character’s default style is applied |
model | ❌ | Model to use. Default is sona_speech_1. Currently only this model is available |
voice_settings | ❌ | Pitch and speed adjustment. Contains the fields pitch_shift, pitch_variance, and speed (defaults: 0, 1, 1) |
output_format | ❌ | Desired audio file format: wav or mp3 (default: wav) |
Request Example
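As a rough illustration of the fields in the table above, the following sketch assembles and validates a JSON request body on the client side. The helper name build_tts_body is hypothetical, not part of the API, and the length check assumes the service counts characters the same way Python's len() does, which the source does not confirm. The endpoint URL and authentication header are deliberately left out because they are not given here.

```python
import json

MAX_TEXT_LENGTH = 300                   # limit stated in the table above
SUPPORTED_LANGUAGES = {"ko", "en", "ja"}


def build_tts_body(text, language, style=None, model=None, voice_settings=None):
    """Assemble a request body from the documented fields (hypothetical helper)."""
    # Assumption: the server counts characters like Python's len().
    if len(text) > MAX_TEXT_LENGTH:
        raise ValueError(f"text has {len(text)} characters; the limit is {MAX_TEXT_LENGTH}")
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"language must be one of {sorted(SUPPORTED_LANGUAGES)}")

    body = {"text": text, "language": language}
    # Optional fields are omitted so that the server-side defaults apply.
    if style is not None:
        body["style"] = style
    if model is not None:
        body["model"] = model
    if voice_settings is not None:
        body["voice_settings"] = voice_settings
    return body


payload = build_tts_body(
    "Hello, world.",
    "en",
    style="neutral",
    voice_settings={"pitch_shift": 0, "pitch_variance": 1, "speed": 1.2},
)
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

Validating the length locally avoids a round trip that would otherwise end in a 400 error.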
Response
- The response body is returned as a binary audio file; the default format is wav.
- The response can also be returned in mp3 format by passing output_format=mp3 as a query parameter.
- The audio length (in seconds) is available in the X-Audio-Length header.
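The response handling described above could be sketched as follows. Both helper names are hypothetical, and the code assumes the X-Audio-Length value is a plain decimal number of seconds; the source only says the header carries the length in seconds. The functions take the headers and body as ordinary Python values, so they work with whichever HTTP client is in use.

```python
from pathlib import Path


def audio_length_seconds(headers):
    """Return the X-Audio-Length value as a float, or None if absent or unparsable."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "x-audio-length":
            try:
                return float(value)
            except (TypeError, ValueError):
                return None
    return None


def save_audio(content, stem, output_format="wav"):
    """Write the binary response body to <stem>.<output_format> and return the path."""
    path = Path(f"{stem}.{output_format}")
    path.write_bytes(content)
    return path


# Stand-in values; a real call would take these from the HTTP response.
example_headers = {"Content-Type": "audio/wav", "X-Audio-Length": "2.48"}
print(audio_length_seconds(example_headers))
```

The file extension should match the output_format that was requested, since the body is raw audio in that format.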
Important Notes
- A 400 error occurs when the text length exceeds 300 characters.
- Calls are possible without style, but the default style varies by character, so call the Get Voices API to check it (the first value in the styles array is the default).
- The audio file in the response can be saved or played directly (appropriate handling is required depending on the client).
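Two of the notes above lend themselves to small client-side helpers: splitting long input so that each request stays within 300 characters, and reading a character's default style. This is only a sketch under assumptions. The function names are hypothetical, the splitting strategy (break at the last space, else cut hard) is one simple choice among many, and the voice record is assumed to be a dict with a styles list, since the exact shape of the Get Voices response is not shown here.

```python
def split_text(text, limit=300):
    """Split text into chunks of at most `limit` characters, preferring breaks at spaces."""
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        # Look for the last space that keeps the chunk within the limit.
        cut = remaining.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable space: fall back to a hard cut
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks


def default_style(voice):
    """Return the first entry of the styles array, or None if there is none."""
    styles = voice.get("styles") or []
    return styles[0] if styles else None


long_text = " ".join(["word"] * 100)  # 499 characters
print([len(chunk) for chunk in split_text(long_text)])
print(default_style({"styles": ["neutral", "happy"]}))
```

Each chunk would then be sent as its own request, and the resulting audio segments joined on the client.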
Headers
API key for the service
Choose TTS engine: "supercage" or "torchserve"
Path Parameters
Query Parameters
The desired output format of the audio file. Default is wav.
Available options: wav, mp3
Body
application/json
Response
Audio file converted from text. The response includes an X-Audio-Length header with the duration in seconds.
The response is of type file.