Text to Speech
Text To Speech
Guides the request structure and parameter usage methods, and error precautions for the TTS API that converts text to speech.
POST
This is the TTS (Text-to-Speech) API that converts text to speech using a specified voice.
Through this API, you can generate natural speech from desired sentences.
Basic Usage
{voice_id}
: Only character-based IDs can be used- Parameters like
language
,style
,model
are included in the Request Body
Request Body Items Description
Item | Required | Description |
---|---|---|
text | โ | Text to convert. Up to 300 characters can be input |
language | โ | Language of the text. One of ko , en , ja |
style | โ | Emotional style. E.g., neutral , happy , sad , etc. If not specified, the characterโs default style is applied |
model | โ | Model to use. Default is sona_speech_1 . Currently only this model is available |
voice_settings | โ | Pitch/speed adjustment. Includes pitch_shift , pitch_variance , speed fields (defaults: 0, 1, 1) |
Request Example
Response
- Response body is returned as binary audio file, default format is
wav
- Can also respond in mp3 format by passing
output_format=mp3
as query parameter - Voice length (seconds) can be checked through
X-Audio-Length
header
Important Notes
400 error occurs when text length exceeds 300 characters
.- Calls are possible even without
style
, but default styles may vary by character, so please call Get Voices API to check the default style (the first value in the styles array is the default). - The audio file in the response can be directly saved or played (appropriate handling required depending on client).
Headers
API key for the service
Path Parameters
Query Parameters
The desired output format of the audio file (wav, mp3). Default is wav.
Available options:
wav
, mp3
Body
application/json
Response
200
audio/wav
Audio file converted from text. The response includes an X-Audio-Length header with the duration in seconds.
The response is of type file
.