Python SDK

This document was automatically translated from the English original. Some phrasing may be unnatural; consult the English original for the exact content.

The official Python SDK is published on PyPI as supertone. Source: supertone-inc/supertone-python.

Installation

pip:
pip install supertone

uv:
uv add supertone

poetry:
poetry add supertone

Requires Python 3.9 or later.

Set your API key

The SDK does not read environment variables automatically. Pass api_key explicitly; by convention, it is read from SUPERTONE_API_KEY.

export SUPERTONE_API_KEY="Kp9mZ3xQ7v..."

import os
from supertone import Supertone

client = Supertone(api_key=os.environ["SUPERTONE_API_KEY"])
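Because the SDK will not pick up the key on its own, a missing variable surfaces as a bare KeyError from os.environ. The sketch below shows one way to fail with a clearer message. The helper name load_api_key is hypothetical and is not part of the SDK.

```python
import os


def load_api_key(env_var: str = "SUPERTONE_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; the SDK does not provide it.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it before creating the client."
        )
    return key
```

The result can then be passed straight to the constructor, for example Supertone(api_key=load_api_key()).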

A voice ID is not an environment variable. It changes per use case, so keep it as a plain string in your code or pass it in from the request payload.

Generate speech (synchronous)

The recommended pattern uses a context manager so that the underlying HTTP connection is closed cleanly.

import os
from supertone import Supertone

VOICE_ID = "20160a4c5ba38967330c84"  # replace with your voice ID

with Supertone(api_key=os.environ["SUPERTONE_API_KEY"]) as client:
    response = client.text_to_speech.create_speech(
        voice_id=VOICE_ID,
        text="Hello from the Python SDK.",
        language="en",
        output_format="wav",
    )

    with open("speech.wav", "wb") as f:
        f.write(response.result.read())

Generate speech (asynchronous)

Use the _async suffix and async with.

import asyncio
import os
from supertone import Supertone

VOICE_ID = "20160a4c5ba38967330c84"  # replace with your voice ID

async def main():
    async with Supertone(api_key=os.environ["SUPERTONE_API_KEY"]) as client:
        response = await client.text_to_speech.create_speech_async(
            voice_id=VOICE_ID,
            text="Hello from the async Python SDK.",
            language="en",
        )

        with open("speech.wav", "wb") as f:
            f.write(response.result.read())

asyncio.run(main())

Every resource method in the SDK comes in both forms: create_speech / create_speech_async, stream_speech / stream_speech_async, list_voices / list_voices_async, and so on.
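One reason to reach for the _async variants is to issue several requests concurrently. The following is a minimal sketch, assuming a client opened as in the async example above. The function synthesize_all and its max_concurrency parameter are hypothetical, and the limit of 3 is an illustrative value rather than an SDK setting or a documented rate limit.

```python
import asyncio
from typing import Any, List, Sequence


async def synthesize_all(
    client: Any,
    voice_id: str,
    texts: Sequence[str],
    language: str = "en",
    max_concurrency: int = 3,  # illustrative limit, not an SDK setting
) -> List[bytes]:
    """Synthesize each text concurrently; return audio bytes in input order."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def synthesize_one(text: str) -> bytes:
        # The semaphore caps how many requests are in flight at once.
        async with semaphore:
            response = await client.text_to_speech.create_speech_async(
                voice_id=voice_id,
                text=text,
                language=language,
            )
            # Same read pattern as the async example above.
            return response.result.read()

    # gather preserves the order of its arguments in the result list.
    return list(await asyncio.gather(*(synthesize_one(t) for t in texts)))
```

Call it from inside the async with block, for example audio_parts = await synthesize_all(client, VOICE_ID, ["First line.", "Second line."]).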

Stream speech

Streaming returns an iterator (or async iterator) of audio chunks.

import os
from supertone import Supertone

VOICE_ID = "20160a4c5ba38967330c84"  # replace with your voice ID

with Supertone(api_key=os.environ["SUPERTONE_API_KEY"]) as client:
    response = client.text_to_speech.stream_speech(
        voice_id=VOICE_ID,
        text="This response is streamed chunk by chunk.",
        language="en",
        model="sona_speech_1",
    )

    with open("streamed.wav", "wb") as f:
        for chunk in response.result.iter_bytes():
            f.write(chunk)

The async version uses async for chunk in response.result.aiter_bytes(). Streaming currently supports sona_speech_1 only.
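Put together, the async streaming variant might look like the sketch below. It assumes stream_speech_async is awaited in the same way as create_speech_async and accepts the same arguments as stream_speech. The wrapper stream_to_file is hypothetical and takes an already opened client.

```python
from typing import Any


async def stream_to_file(client: Any, voice_id: str, text: str, path: str) -> int:
    """Stream speech to `path` and return the number of bytes written."""
    response = await client.text_to_speech.stream_speech_async(
        voice_id=voice_id,
        text=text,
        language="en",
        model="sona_speech_1",  # the only model that supports streaming
    )
    written = 0
    with open(path, "wb") as f:
        # Write each chunk as soon as it arrives instead of buffering it all.
        async for chunk in response.result.aiter_bytes():
            f.write(chunk)
            written += len(chunk)
    return written
```

Inside an async with Supertone(...) as client block, this would be called as await stream_to_file(client, VOICE_ID, "Some text.", "streamed.wav").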

Automatic chunking for long text

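create_speech, create_speech_async, stream_speech, and stream_speech_async automatically split text longer than 300 characters. create_speech runs up to three segments in parallel and merges the audio; stream_speech runs the segments sequentially and forwards their chunks to the iterator.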

LONG_TEXT = "..."  # any length, including thousands of characters

response = client.text_to_speech.create_speech(
    voice_id=VOICE_ID,
    text=LONG_TEXT,
    language="en",
)

with open("narration.wav", "wb") as f:
    f.write(response.result.read())   # single merged file

predict_duration does not auto-chunk. Keep its input within 300 characters and, for longer scripts, sum the durations yourself. See Long text for details and tuning.
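Summing durations by hand could be sketched as follows. The splitter is a simple sentence-based approximation and is not the SDK's own chunking algorithm, so the total is an estimate. The predict argument is a hypothetical callable that you supply: it should wrap client.text_to_speech.predict_duration and return seconds as a float, since the shape of the response object is not shown here.

```python
import re
from typing import Callable, List

MAX_CHARS = 300  # per-request limit for predict_duration


def split_text(text: str, max_chars: int = MAX_CHARS) -> List[str]:
    """Split text into pieces of at most max_chars, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pieces: List[str] = []
    current = ""
    for sentence in sentences:
        # Hard-wrap any single sentence that is itself too long.
        while len(sentence) > max_chars:
            if current:
                pieces.append(current)
                current = ""
            pieces.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            pieces.append(current)
            current = sentence
    if current:
        pieces.append(current)
    return pieces


def total_duration(text: str, predict: Callable[[str], float]) -> float:
    """Sum predicted durations; `predict` returns seconds for one piece."""
    return sum(predict(piece) for piece in split_text(text))
```

Pauses that the service may insert between segments are not modeled, so treat the result as a lower-bound estimate.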

Common operations

# List voices with pagination
result = client.voices.list_voices(page_size=20)

# Search voices
result = client.voices.search_voices(language="ko,en", style="happy")

# Get a single voice
voice = client.voices.get_voice(voice_id=VOICE_ID)

# Predict duration (no credits deducted)
duration = client.text_to_speech.predict_duration(
    voice_id=VOICE_ID,
    text="How long will this be?",
    language="en",
)

# Get credit balance
balance = client.usage.get_credit_balance()

Type-safe enums (optional)

If you want type safety, the SDK exposes enum constants in supertone.models.

from supertone import models

models.APIConvertTextToSpeechUsingCharacterRequestLanguage.EN       # "en"
models.APIConvertTextToSpeechUsingCharacterRequestModel.SONA_SPEECH_1  # "sona_speech_1"

Plain strings ("en", "sona_speech_1") work as well. Use whichever style you prefer.

Error handling

Errors are defined in supertone.errors, and all of them inherit from SupertoneError.

from supertone import Supertone, errors

try:
    response = client.text_to_speech.create_speech(...)
except errors.TooManyRequestsErrorResponse as e:
    # 429 — back off and retry
    print("Rate limited:", e.message)
except errors.UnauthorizedErrorResponse as e:
    # 401 — bad or missing API key
    print("Auth failed:", e.message)
except errors.PaymentRequiredErrorResponse as e:
    # 402 — out of credits
    print("Buy more credits:", e.message)
except errors.SupertoneError as e:
    # Any other API error
    print(f"HTTP {e.status_code}: {e.message}")

Error class	HTTP status
`BadRequestErrorResponse`	400
`UnauthorizedErrorResponse`	401
`PaymentRequiredErrorResponse`	402
`ForbiddenErrorResponse`	403
`NotFoundErrorResponse`	404
`RequestTimeoutErrorResponse`	408
`PayloadTooLargeErrorResponse`	413
`UnsupportedMediaTypeErrorResponse`	415
`TooManyRequestsErrorResponse`	429
`InternalServerErrorResponse`	500

Network errors (DNS failures, broken pipes, and so on) are raised by httpx and do not inherit from SupertoneError.
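If you want manual control over retries, for instance to treat 429 responses and httpx network errors the same way, a generic wrapper is one option. This is a minimal sketch of exponential backoff with jitter. The helper call_with_backoff and its default values are hypothetical and independent of the SDK.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on `retry_on` with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt (0.5s, 1s, 2s, ...), capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Add up to 50% random jitter so concurrent clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

A call might pass retry_on=(errors.TooManyRequestsErrorResponse, httpx.TransportError) together with a lambda that wraps the create_speech call. Other errors, such as 401 or 402, propagate immediately because retrying them will not help.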

Configuration

import os

from supertone import Supertone
from supertone.utils.retries import RetryConfig

client = Supertone(
    api_key=os.environ["SUPERTONE_API_KEY"],
    timeout_ms=30_000,
    retry_config=RetryConfig(
        strategy="backoff",
        backoff={"initial_interval": 500, "max_interval": 60_000},
        retry_connection_errors=True,
    ),
)

See also

TypeScript SDK

The equivalent SDK for Node and Bun.

Samples

Recipes for common workflows.
