Text to Speech
Predict Duration
An API that returns only the expected length without generating speech, useful for billing prediction or text length adjustment.
POST
This API does not generate speech. It only returns the expected speech length (in seconds) for the input text.
It's useful for estimating credit consumption or adjusting text length before making TTS calls.
Request
- The calling method and Request Body are almost identical to those of the text-to-speech API.
- However, only the duration value is returned as the result, not audio.
- No credits are consumed when calling the Predict Duration API.
Request Body
Item | Required | Description |
---|---|---|
text | Yes | Text to analyze. Maximum 300 characters |
language | Yes | Text language. One of ko, en, ja |
style | No | Emotional style. The default style is used if not specified |
model | No | Default is sona_speech_1. Currently only this model is available |
voice_settings | No | Speech speed or pitch adjustment values. May affect the resulting length |
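Because text is capped at 300 characters per request, longer input has to be split before it can be measured. The following is a minimal sketch of one way to do that, not part of the API itself: split_text is a hypothetical helper, and it assumes the limit is counted in the same way as Python's len(), which this page does not confirm.

```python
import re

MAX_CHARS = 300  # per-request limit stated in the Request Body table


def split_text(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters.

    Sentence boundaries are preferred; a single sentence longer than
    `limit` is cut at the limit as a fallback.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?。！？])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
            continue
        # The sentence does not fit: close the current chunk first.
        if current:
            chunks.append(current)
        # Hard-slice a sentence that is itself longer than the limit.
        while len(sentence) > limit:
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own request, and the per-chunk durations summed for a rough total.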
Request Example
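The original request sample is not preserved on this page, so the following is only a sketch of how such a call might be assembled with Python's standard library. API_URL and API_KEY_HEADER are hypothetical placeholders (the real endpoint, its path parameters, and the header name are not given here), and build_request and predict_duration are illustrative helper names. The body fields come from the Request Body table.

```python
import json
import urllib.request

# Hypothetical placeholders: replace with the real endpoint and header name.
API_URL = "https://api.example.com/predict-duration"
API_KEY_HEADER = "x-api-key"


def build_request(text, language, api_key, style=None,
                  voice_settings=None, model="sona_speech_1"):
    """Build (but do not send) a POST request for a duration prediction."""
    if len(text) > 300:
        raise ValueError("text must be at most 300 characters")
    body = {"text": text, "language": language, "model": model}
    # Optional fields are only sent when the caller sets them.
    if style is not None:
        body["style"] = style
    if voice_settings is not None:
        body["voice_settings"] = voice_settings
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={API_KEY_HEADER: api_key, "Content-Type": "application/json"},
        method="POST",
    )


def predict_duration(request):
    """Send the request and return the predicted duration in seconds."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return float(json.load(response)["duration"])
```

Keeping request construction separate from sending makes it easy to inspect the payload before any network call is made.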
Response Example
This means that generating this text would create approximately 3.57 seconds of audio.
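As a sketch of how a client might read such a result, the snippet below parses a response body that contains only a duration field. The exact JSON shape is an assumption based on the description above, and parse_duration is a hypothetical helper name.

```python
import json


def parse_duration(body: str) -> float:
    """Extract the predicted duration (in seconds) from a JSON response body."""
    payload = json.loads(body)
    duration = payload["duration"]
    # Reject values that cannot be a valid audio length.
    if not isinstance(duration, (int, float)) or duration < 0:
        raise ValueError(f"unexpected duration value: {duration!r}")
    return float(duration)


# Assumed response shape, using the 3.57-second value mentioned above.
sample_body = '{"duration": 3.57}'
print(f"Predicted length: {parse_duration(sample_body):.2f} s")
```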
Tips
- No credits are deducted, because no speech is generated.
- The predicted duration is very close to the length of the audio actually generated from the same text.
- Since adjusting voice_settings.speed changes the length, it's better to test with a fixed speech speed.
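One way to use these tips for text length adjustment is to drop trailing sentences until the predicted duration fits a target. The sketch below is illustrative only: fit_to_duration is a hypothetical helper, and predict stands for any callable that returns the predicted duration of a text, for example a wrapper around a Predict Duration call made with a fixed speed setting.

```python
from typing import Callable


def fit_to_duration(sentences: list[str], max_seconds: float,
                    predict: Callable[[str], float]) -> str:
    """Return the longest leading run of sentences whose predicted
    duration does not exceed `max_seconds` (empty string if none fits)."""
    kept = list(sentences)
    while kept:
        text = " ".join(kept)
        if predict(text) <= max_seconds:
            return text
        kept.pop()  # drop the last sentence and try again
    return ""


# Stand-in predictor for local experiments: 0.1 s per character (assumption).
def fake_predict(text: str) -> float:
    return len(text) * 0.1


script = ["Hello there.", "How are you?", "Goodbye."]
print(fit_to_duration(script, 3.0, fake_predict))
```

Since no credits are consumed by predictions, repeating the call once per dropped sentence costs only request time.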
Headers
API key for the service
Path Parameters
Body
application/json
Response
200
application/json
Returns predicted duration of the audio in seconds
The response is of type object.