Introduction

Supertone API is a RESTful API that synthesizes emotionally expressive speech from text. It draws on the voice library shared with Play, so text input alone is enough to generate natural-sounding speech, and it covers the full workflow from voice exploration to usage management.

Features

Text-to-Speech

Converts text to speech. Along with the voice ID, a request can specify the language, style, model, and detailed voice settings such as pitch and speed.
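A minimal sketch of how such a request might be assembled is shown here. The host, URL path, header name, and JSON field names are all assumptions made for illustration; the actual values are defined in the Endpoint Reference. The function only builds the request and does not send it.

```python
import json
import urllib.request

# Assumed values for illustration only; confirm them in the Endpoint Reference.
BASE_URL = "https://api.example.com"   # placeholder host
API_KEY_HEADER = "x-api-key"           # hypothetical header name


def build_tts_request(api_key, voice_id, text, language="en", style="neutral",
                      model=None, voice_settings=None):
    """Build (but do not send) a text-to-speech POST request."""
    # Hypothetical field names mirroring the parameters described above.
    payload = {"text": text, "language": language, "style": style}
    if model is not None:
        payload["model"] = model
    if voice_settings:
        # e.g. {"pitch": 0, "speed": 1.0}; key names are assumptions.
        payload["voice_settings"] = dict(voice_settings)
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/text-to-speech/{voice_id}",  # hypothetical path
        data=json.dumps(payload).encode("utf-8"),
        headers={API_KEY_HEADER: api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Optional fields are left out of the body when not supplied, so the service can apply its own defaults.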

Voice Exploration

Retrieve every voice available to your account, or search by conditions such as language, style, and name.
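The search conditions could be expressed as query parameters, as in the following sketch. The host, path, and parameter names are hypothetical and stand in for whatever the Endpoint Reference specifies.

```python
from urllib.parse import urlencode

# Assumed values for illustration only.
BASE_URL = "https://api.example.com"   # placeholder host
SEARCH_PATH = "/v1/voices/search"      # hypothetical path


def build_voice_search_url(language=None, style=None, name=None):
    """Return a search URL containing only the filters that were supplied."""
    filters = {"language": language, "style": style, "name": name}
    # Drop unset filters so they are not sent as empty parameters.
    active = {key: value for key, value in filters.items() if value is not None}
    query = urlencode(active)
    url = BASE_URL + SEARCH_PATH
    return f"{url}?{query}" if query else url
```

Calling `build_voice_search_url(language="ko", style="happy")` yields a URL with two query parameters, while calling it with no arguments yields the bare path.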

Speech Duration Prediction

Before generating speech, you can predict how many seconds of audio the input text will produce. This call does not deduct credits.
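One way to use the prediction is as a guard before a synthesis call that does consume credits. The sketch below assumes the prediction response is a JSON object with a `duration` field measured in seconds; that field name is hypothetical, and the real response structure is given in the Endpoint Reference.

```python
import json


def should_synthesize(prediction_json, max_seconds):
    """Return True if the predicted audio length fits within max_seconds."""
    prediction = json.loads(prediction_json)
    # "duration" is an assumed field name for the predicted length in seconds.
    predicted_seconds = float(prediction["duration"])
    return predicted_seconds <= max_seconds
```

A caller could run this check on each text chunk and synthesize only the chunks that fit within its own length limit.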

Credits and Usage Check

Provides endpoints for checking your account's remaining credit balance and the voice time used through the API.

Documentation

Authentication: Explains the authentication methods required to use the API.
Endpoint Reference: Describes the request format and response structure for each feature.
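As a rough illustration of how a client might prepare its credentials, the sketch below reads an API key from an environment variable and returns a header dictionary. The environment variable name and the header name are both hypothetical; the Authentication guide defines the actual method.

```python
import os

# Assumed names for illustration only.
API_KEY_ENV = "SUPERTONE_API_KEY"   # hypothetical environment variable
API_KEY_HEADER = "x-api-key"        # hypothetical header name


def auth_headers(env=None):
    """Return request headers carrying the API key, or raise if it is missing."""
    env = os.environ if env is None else env
    api_key = env.get(API_KEY_ENV)
    if not api_key:
        raise RuntimeError(f"Set {API_KEY_ENV} before calling the API")
    return {API_KEY_HEADER: api_key}
```

Keeping the key in the environment rather than in source code avoids committing it to version control.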

Beyond simple API calls, Supertone API is designed to operate after its official launch as a scalable voice synthesis platform that connects Play, the console, and voice generation features.
