MCP - Supertone API Documentation

Integrate Supertone TTS with Claude, Cursor, and other MCP-compatible clients. The Supertone MCP server exposes the Text-to-Speech API as a set of composable Model Context Protocol tools, so an AI agent can discover voices, preview samples, estimate cost, clone voices, and synthesize speech — and chain those steps into multi-step workflows on its own. Source: supertone-inc/supertone-mcp.

What you can do

Synthesize speech with control over voice, language, speed, pitch, and emotion style.
Discover voices by language, gender, age, use case, or style — and preview samples before committing.
Clone and manage custom voices from a local audio file.
Track usage — check credit balance and usage history.
Stitch audio — merge multiple clips into one file, with optional silence gaps or crossfades.

Prerequisites

uv installed (provides uvx), or Python with pip.
A Supertone API key from the Developer Console.

Install

Every client runs the same server, uvx supertone-mcp, with your API key passed as an environment variable. Follow the steps for your client below: Cursor, Claude Desktop, Claude Code, VS Code, or Windsurf.

Cursor

Add to ~/.cursor/mcp.json (global) or .cursor/mcp.json (per-project), then fill in your API key:

{
  "mcpServers": {
    "supertone-tts": {
      "command": "uvx",
      "args": ["supertone-mcp"],
      "env": { "SUPERTONE_API_KEY": "your-api-key-here" }
    }
  }
}

Claude Desktop

Add the server to claude_desktop_config.json, then restart Claude Desktop:

{
  "mcpServers": {
    "supertone-tts": {
      "command": "uvx",
      "args": ["supertone-mcp"],
      "env": { "SUPERTONE_API_KEY": "your-api-key-here" }
    }
  }
}

Config file location:

OS	Path
macOS	`~/Library/Application Support/Claude/claude_desktop_config.json`
Windows	`%APPDATA%\Claude\claude_desktop_config.json`
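If you script your setup, the two locations in the table can be resolved programmatically. The following is a minimal sketch; the function name `claude_desktop_config_path` is hypothetical and not part of any Supertone or Claude tooling, and it covers only the two operating systems listed above.

```python
import os
import platform
from pathlib import Path


def claude_desktop_config_path(system=None, env=None):
    """Return the claude_desktop_config.json location for macOS or Windows.

    Hypothetical helper: only the two paths from the table are handled.
    """
    system = system or platform.system()
    env = os.environ if env is None else env
    filename = "claude_desktop_config.json"
    if system == "Darwin":  # platform.system() reports macOS as "Darwin"
        return Path.home() / "Library" / "Application Support" / "Claude" / filename
    if system == "Windows":
        # %APPDATA% is read from the environment instead of being hard-coded.
        return Path(env["APPDATA"]) / "Claude" / filename
    raise ValueError(f"no documented config location for {system!r}")


if __name__ == "__main__":
    try:
        print(claude_desktop_config_path())
    except ValueError as exc:
        print(exc)
```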

Claude Code

Run one command from your terminal:

claude mcp add supertone-tts -e SUPERTONE_API_KEY=your-api-key-here -- uvx supertone-mcp

Add -s user to make it available across all your projects.

VS Code

Add to .vscode/mcp.json in your workspace (note the top-level servers key):

{
  "servers": {
    "supertone-tts": {
      "command": "uvx",
      "args": ["supertone-mcp"],
      "env": { "SUPERTONE_API_KEY": "your-api-key-here" }
    }
  }
}

Then enable the server from the MCP view and use it in Copilot Chat’s Agent mode.

Windsurf

Add to ~/.codeium/windsurf/mcp_config.json:

{
  "mcpServers": {
    "supertone-tts": {
      "command": "uvx",
      "args": ["supertone-mcp"],
      "env": { "SUPERTONE_API_KEY": "your-api-key-here" }
    }
  }
}

Reload the server list in Windsurf’s Cascade panel.

Environment variables

Variable	Required	Default	Purpose
`SUPERTONE_API_KEY`	Yes	—	Authentication
`SUPERTONE_MCP_VOICE_ID`	No	Aiden (multilingual)	Default `voice_id` for `text_to_speech`
`SUPERTONE_OUTPUT_DIR`	No	`~/supertone-tts-output/`	Where generated audio files are saved
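The optional variables go in the same env block as the API key. As a rough sketch, the helpers below build a server entry that includes only the variables you set and merge it into an existing config dictionary. The names `build_server_entry` and `add_to_config` are hypothetical, and the voice ID and output directory in the demo are placeholders.

```python
import json


def build_server_entry(api_key, voice_id=None, output_dir=None):
    """Build the server entry; optional variables are added only when given."""
    env = {"SUPERTONE_API_KEY": api_key}
    if voice_id is not None:
        env["SUPERTONE_MCP_VOICE_ID"] = voice_id
    if output_dir is not None:
        env["SUPERTONE_OUTPUT_DIR"] = output_dir
    return {"command": "uvx", "args": ["supertone-mcp"], "env": env}


def add_to_config(config, entry, top_level_key="mcpServers", name="supertone-tts"):
    """Return a copy of config with the entry added, keeping other servers."""
    merged = dict(config)
    servers = dict(merged.get(top_level_key, {}))
    servers[name] = entry
    merged[top_level_key] = servers
    return merged


if __name__ == "__main__":
    entry = build_server_entry(
        "your-api-key-here",
        voice_id="your-voice-id",        # placeholder value
        output_dir="/path/to/audio",     # placeholder value
    )
    # VS Code uses "servers" as the top-level key; the other clients use "mcpServers".
    print(json.dumps(add_to_config({}, entry), indent=2))
```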

Tools

The server exposes its capabilities as composable building blocks the agent can chain.

Speech synthesis

Tool	Description
`text_to_speech`	Generate audio with control over speed, pitch, emotion style, and output format.
`predict_duration`	Estimate synthesis duration and credit cost before generating.

Voice discovery

Tool	Description
`search_voice`	Filter preset voices by language, gender, age, use case, or style.
`get_voice`	Retrieve full details for a voice.
`preview_voice`	Fetch sample audio URLs to evaluate a voice.

Voice cloning

Tool	Description
`clone_voice`	Create a cloned voice from a local WAV/MP3 (≤ 3 MB).
`search_custom_voice`	List and filter your cloned voices.
`get_custom_voice`	Fetch details for a cloned voice.
`edit_custom_voice`	Update a cloned voice’s name or description.
`delete_custom_voice`	Permanently remove a cloned voice (irreversible).

Usage & credits

Tool	Description
`get_credit_balance`	Check remaining credits.
`get_usage_history`	View usage over a time window.
`get_voice_usage`	Usage metrics for a specific voice.

Audio editing

Tool	Description
`merge_audio_files`	Merge two or more local audio files into one — plain concatenation, silence gaps (`gap_ms`), or crossfade blending (`crossfade_ms`). Useful for stitching multiple `text_to_speech` outputs.

Key `text_to_speech` parameters

text (required), voice_id, language, output_format (mp3 / wav)
model — e.g. sona_speech_2_flash, sona_speech_1
speed (0.5–2.0), pitch_shift (−24 to +24 semitones), style
output_mode (files / resources / both), autoplay (default false), streaming (sona_speech_1 only)

These are per-call parameters, so the agent controls output mode, autoplay, and model on each invocation.
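For illustration, here is roughly what one invocation looks like on the wire. MCP tool calls are JSON-RPC 2.0 requests with the method tools/call; your MCP client normally builds this message for you. The sketch checks only the ranges listed above before building the request. The function name `build_tts_call` is hypothetical, and requiring an explicit model when streaming is an assumption made here because the default model is not stated.

```python
import json


def build_tts_call(text, request_id=1, **options):
    """Check the documented ranges, then wrap the arguments in a tools/call request."""
    if not text:
        raise ValueError("text is required")
    speed = options.get("speed")
    if speed is not None and not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    pitch = options.get("pitch_shift")
    if pitch is not None and not -24 <= pitch <= 24:
        raise ValueError("pitch_shift must be between -24 and +24 semitones")
    if options.get("output_format") not in (None, "mp3", "wav"):
        raise ValueError("output_format must be mp3 or wav")
    if options.get("output_mode") not in (None, "files", "resources", "both"):
        raise ValueError("output_mode must be files, resources, or both")
    # Assumption: the model must be named explicitly when streaming is requested.
    if options.get("streaming") and options.get("model") != "sona_speech_1":
        raise ValueError("streaming is only available with sona_speech_1")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "text_to_speech", "arguments": {"text": text, **options}},
    }


if __name__ == "__main__":
    request = build_tts_call("Doors are closing.", speed=1.1, output_format="mp3")
    print(json.dumps(request, indent=2))
```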

Key `merge_audio_files` parameters

input_paths (required) — two or more local audio file paths, in order. (A single path is returned unchanged.)
gap_ms — silence inserted between clips, in milliseconds.
crossfade_ms — crossfade blend between clips, in milliseconds. Mutually exclusive with gap_ms.
output_format — override the output format. By default it’s auto-detected: all inputs sharing an extension → that extension; mixed → mp3. Differing sample rates or channel counts are normalized automatically before merging.

ffmpeg is bundled via imageio-ffmpeg, so merging works out of the box with uvx supertone-mcp — no system ffmpeg install required.
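The exclusivity rule and the format auto-detection described above can be sketched as follows. This illustrates the documented behavior and is not the server's own code. Both function names are hypothetical, and comparing extensions case-insensitively is an assumption.

```python
from pathlib import Path


def resolve_output_format(input_paths, output_format=None):
    """Mirror the documented default: a shared extension wins, mixed inputs give mp3."""
    if output_format:
        return output_format
    # Assumption: extensions are compared case-insensitively.
    extensions = {Path(p).suffix.lower().lstrip(".") for p in input_paths}
    if len(extensions) == 1 and "" not in extensions:
        return extensions.pop()
    return "mp3"


def build_merge_args(input_paths, gap_ms=None, crossfade_ms=None, output_format=None):
    """Assemble merge_audio_files arguments, rejecting gap_ms combined with crossfade_ms."""
    if not input_paths:
        raise ValueError("input_paths is required")
    if gap_ms is not None and crossfade_ms is not None:
        raise ValueError("gap_ms and crossfade_ms are mutually exclusive")
    args = {"input_paths": list(input_paths)}
    if gap_ms is not None:
        args["gap_ms"] = gap_ms
    if crossfade_ms is not None:
        args["crossfade_ms"] = crossfade_ms
    if output_format is not None:
        args["output_format"] = output_format
    return args


if __name__ == "__main__":
    clips = ["intro.wav", "body.mp3"]
    print(resolve_output_format(clips))
    print(build_merge_args(clips, gap_ms=300))
```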

Example workflows

Discover → preview → estimate → synthesize

“Find a calm Korean female voice, let me hear a sample, check the cost, then make this announcement as mp3.”

Chains search_voice() → preview_voice() → predict_duration() + get_credit_balance() → text_to_speech().

Clone and use immediately

“Create a cloned voice from ~/recordings/sample.wav named MyVoice, then read this greeting with it and play it.”

Chains clone_voice() → get_custom_voice() → text_to_speech(autoplay=true).

Narrate a script and stitch it together

“Generate each paragraph of this script, then merge them into one mp3 with a short pause between each.”

Chains text_to_speech() per segment → merge_audio_files(gap_ms=...).
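If you drive the server from your own script rather than through an agent, the narrate-and-stitch chain might look like the sketch below. It makes several assumptions. `call_tool` stands for any callable that sends a tool name and an arguments dictionary to the server, for example a thin wrapper around your MCP client. `extract_path` is supplied by you, because the shape of a `text_to_speech` result is not described here. `narrate_and_merge` is a hypothetical name.

```python
def narrate_and_merge(call_tool, paragraphs, extract_path, gap_ms=400):
    """Synthesize each non-empty paragraph, then merge the clips in order."""
    paths = []
    for paragraph in paragraphs:
        if not paragraph.strip():
            continue  # skip blank paragraphs instead of synthesizing silence
        result = call_tool(
            "text_to_speech",
            {"text": paragraph, "output_format": "mp3", "output_mode": "files"},
        )
        # Assumption: the caller knows how to pull a local file path out of the result.
        paths.append(extract_path(result))
    if not paths:
        raise ValueError("no non-empty paragraphs to narrate")
    return call_tool("merge_audio_files", {"input_paths": paths, "gap_ms": gap_ms})
```

Passing `call_tool` and `extract_path` in as parameters keeps the sequencing logic testable with a fake client and independent of any particular MCP client library.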

Troubleshooting

The client doesn't list the Supertone tools

Make sure the config file is valid JSON and the client was fully restarted. Most clients only load MCP servers at startup.

uvx: command not found

Install uv (which provides uvx); see the uv install guide. Alternatively, run pip install supertone-mcp and set the command in your client config to supertone-mcp.
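A quick way to see which of the two launch options is available is to look both executables up on your PATH. This is a minimal sketch with a hypothetical function name, `pick_launch_command`. The empty args list for the pip-installed command is an assumption, since the text above only says to change the command.

```python
import shutil


def pick_launch_command(which=shutil.which):
    """Prefer uvx; fall back to a pip-installed supertone-mcp executable."""
    if which("uvx"):
        return {"command": "uvx", "args": ["supertone-mcp"]}
    if which("supertone-mcp"):
        # Assumption: the installed executable needs no extra arguments.
        return {"command": "supertone-mcp", "args": []}
    return None  # neither is on PATH; install uv or the package first


if __name__ == "__main__":
    print(pick_launch_command())
```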

Authentication errors

Confirm SUPERTONE_API_KEY is set in the server’s env block (not just your shell) and is valid. Get a key from the Developer Console.
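Two of the checks above, valid JSON and an API key in the server's env block, are easy to automate. The sketch below takes the config file's text and returns a list of problems. `check_config` is a hypothetical helper, it cannot tell whether the key itself is valid, and you should pass `top_level_key="servers"` for a VS Code config.

```python
import json


def check_config(text, top_level_key="mcpServers", name="supertone-tts"):
    """Return a list of problems found in an MCP client config (empty if none)."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} (line {exc.lineno}, column {exc.colno})"]
    servers = config.get(top_level_key) if isinstance(config, dict) else None
    if not isinstance(servers, dict) or name not in servers:
        return [f"no '{name}' entry under '{top_level_key}'"]
    env = servers[name].get("env", {})
    key = env.get("SUPERTONE_API_KEY", "")
    if not key or key == "your-api-key-here":
        return ["SUPERTONE_API_KEY is missing or still set to the placeholder"]
    return []


if __name__ == "__main__":
    sample = '{"mcpServers": {"supertone-tts": {"command": "uvx", "args": ["supertone-mcp"], "env": {}}}}'
    print(check_config(sample))
```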

Where did my audio go?

With output_mode: files, audio is written to SUPERTONE_OUTPUT_DIR (default ~/supertone-tts-output/). Set autoplay: true to also play it immediately.
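To find recent output from a script, you can resolve the directory the same way and list the newest audio files. This is a rough sketch; both function names are hypothetical. It assumes the environment variable is visible to the script, which is not the case if you set it only in the client's env block, and that a leading ~ should be expanded.

```python
import os
from pathlib import Path


def resolve_output_dir(env=None):
    """Use SUPERTONE_OUTPUT_DIR if set, else the documented default."""
    env = os.environ if env is None else env
    # Assumption: a leading "~" refers to the current user's home directory.
    return Path(env.get("SUPERTONE_OUTPUT_DIR", "~/supertone-tts-output/")).expanduser()


def latest_audio(directory, limit=5):
    """Return up to `limit` mp3/wav files in the directory, newest first."""
    directory = Path(directory)
    if not directory.is_dir():
        return []
    files = [p for p in directory.iterdir() if p.suffix.lower() in {".mp3", ".wav"}]
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)[:limit]


if __name__ == "__main__":
    for path in latest_audio(resolve_output_dir()):
        print(path)
```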
