TypeScript SDK

This document was machine-translated from the English original, so some phrasing may read unnaturally. For the authoritative wording, also check the English original.

The official TypeScript SDK is published on npm as @supertone/supertone. Source: supertone-inc/supertone-ts.

Installation

npm add @supertone/supertone

pnpm add @supertone/supertone

bun add @supertone/supertone

With Yarn, you need to install the peer dependency manually.

yarn add @supertone/supertone zod

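The package ships both ESM and CommonJS entry points and bundles its TypeScript type definitions. Node 18 or later is required (for the global fetch and ReadableStream). It also runs on Bun and Deno. It is not designed for browsers (an API key should never be placed on the client side).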
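As a minimal sketch of the runtime requirement above, the check below reports which of the two named globals (fetch and ReadableStream) are absent. The helper names missingRuntimeGlobals and assertSupportedRuntime are hypothetical and not part of the SDK.

```typescript
// Returns the names of required globals that the given scope lacks.
// The list mirrors the two globals named in the text; it is not read from the SDK.
function missingRuntimeGlobals(
  scope: Record<string, unknown> = globalThis as unknown as Record<string, unknown>,
): string[] {
  const required = ["fetch", "ReadableStream"];
  return required.filter((name) => typeof scope[name] === "undefined");
}

// Fails fast with a readable message instead of a late ReferenceError.
function assertSupportedRuntime(): void {
  const absent = missingRuntimeGlobals();
  if (absent.length > 0) {
    throw new Error(`Unsupported runtime, missing globals: ${absent.join(", ")}`);
  }
}

assertSupportedRuntime();
```

Calling assertSupportedRuntime() once at startup, before constructing the client, keeps the failure close to its cause.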

Set your API key

export SUPERTONE_API_KEY="Kp9mZ3xQ7v..."

import { Supertone } from "@supertone/supertone";

const client = new Supertone({ apiKey: process.env.SUPERTONE_API_KEY });

A voice ID is not an environment variable. It changes from one use case to the next, so keep it as a plain string in your code (or pass it in from the request payload).
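One way to keep the two kinds of value apart is sketched below: the API key is read from the environment and rejected early when missing, while the voice ID comes from the request payload with a code-level fallback. The SpeakRequest shape and both helper names are hypothetical and only serve the illustration.

```typescript
// Reads the API key from an environment map and fails fast when it is missing.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env.SUPERTONE_API_KEY;
  if (!key) {
    throw new Error("SUPERTONE_API_KEY is not set");
  }
  return key;
}

// Hypothetical payload shape for an app that lets callers pick a voice.
interface SpeakRequest {
  text: string;
  voiceId?: string;
}

const DEFAULT_VOICE_ID = "20160a4c5ba38967330c84"; // replace with your voice ID

// Prefers the voice ID from the payload and falls back to the constant.
function resolveVoiceId(payload: SpeakRequest): string {
  return payload.voiceId ?? DEFAULT_VOICE_ID;
}
```

In a Node process, requireApiKey(process.env) would supply the value passed as apiKey to the client constructor.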

Generate speech

import { Supertone } from "@supertone/supertone";
import * as fs from "node:fs";

const VOICE_ID = "20160a4c5ba38967330c84"; // replace with your voice ID

const client = new Supertone({ apiKey: process.env.SUPERTONE_API_KEY });

const response = await client.textToSpeech.createSpeech({
  voiceId: VOICE_ID,
  apiConvertTextToSpeechUsingCharacterRequest: {
    text: "Hello from the TypeScript SDK.",
    language: "en",
    model: "sona_speech_1",
    outputFormat: "wav",
  },
});

if (response.result instanceof Uint8Array) {
  fs.writeFileSync("speech.wav", response.result);
} else if (response.result && "getReader" in response.result) {
  const reader = (response.result as ReadableStream<Uint8Array>).getReader();
  const chunks: Uint8Array[] = [];
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    if (value) chunks.push(value);
  }
  fs.writeFileSync("speech.wav", Buffer.concat(chunks));
}
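The two branches above can be folded into one helper. The sketch below assumes, as the example does, that the result is either a Uint8Array or a ReadableStream<Uint8Array>; toBytes and concatChunks are hypothetical names, not SDK functions.

```typescript
// Concatenates chunks into one contiguous Uint8Array.
function concatChunks(chunks: Uint8Array[]): Uint8Array {
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}

// Normalizes either result shape into a single byte array.
async function toBytes(
  result: Uint8Array | ReadableStream<Uint8Array>,
): Promise<Uint8Array> {
  if (result instanceof Uint8Array) return result;
  const reader = result.getReader();
  const chunks: Uint8Array[] = [];
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      if (value) chunks.push(value);
    }
  } finally {
    // Release the lock even if reading fails part-way through.
    reader.releaseLock();
  }
  return concatChunks(chunks);
}
```

With this in place, writing the file reduces to fs.writeFileSync("speech.wav", await toBytes(response.result)) once response.result has been checked for presence.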

All methods are asynchronous and return a Promise. There is no synchronous variant.

Stream speech

import { Supertone } from "@supertone/supertone";
import * as fs from "node:fs";

const VOICE_ID = "20160a4c5ba38967330c84"; // replace with your voice ID

const client = new Supertone({ apiKey: process.env.SUPERTONE_API_KEY });

const response = await client.textToSpeech.streamSpeech({
  voiceId: VOICE_ID,
  apiConvertTextToSpeechUsingCharacterRequest: {
    text: "This response is streamed chunk by chunk.",
    language: "en",
    model: "sona_speech_1",
  },
});

if (response.result && typeof response.result === "object" && "getReader" in response.result) {
  const reader = (response.result as ReadableStream<Uint8Array>).getReader();
  const chunks: Uint8Array[] = [];
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    if (value) chunks.push(value);
  }
  fs.writeFileSync("streamed.wav", Buffer.concat(chunks));
}

Streaming is currently supported only with sona_speech_1.
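The streaming example above collects every chunk in memory before writing the file. As an alternative sketch, the helper below writes each chunk to disk as soon as it arrives. The function name streamToFile is hypothetical, and it relies only on Node's fs/promises API and the standard ReadableStream reader.

```typescript
import { open } from "node:fs/promises";

// Writes each chunk to the file as it arrives and returns the total byte count.
async function streamToFile(
  stream: ReadableStream<Uint8Array>,
  path: string,
): Promise<number> {
  const file = await open(path, "w");
  const reader = stream.getReader();
  let total = 0;
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      if (value) {
        // Awaiting each write keeps memory use bounded to roughly one chunk.
        const { bytesWritten } = await file.write(value);
        total += bytesWritten;
      }
    }
  } finally {
    reader.releaseLock();
    await file.close();
  }
  return total;
}
```

This trades a little throughput for a flat memory profile, which matters mostly for long inputs.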

Automatic chunking for long text

Both createSpeech and streamSpeech automatically chunk text longer than 300 characters. Pass the text as-is: the SDK splits it, generates each segment, and joins (or streams) the results.

const response = await client.textToSpeech.createSpeech(
  {
    voiceId: VOICE_ID,
    apiConvertTextToSpeechUsingCharacterRequest: { text: longText, language: "en" },
  },
  {
    maxTextLength: 300, // optional override (default 300)
  },
);

predictDuration does not chunk automatically, so the 300-character limit is enforced. See Long text for details.
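Because predictDuration leaves splitting to the caller, one option is to split long text yourself and add up the per-segment predictions. The sketch below is illustrative only: splitText is a hypothetical helper whose rules are not the SDK's internal chunking logic, lengths are counted in UTF-16 code units, and the prediction call is injected as a function so the example stays self-contained.

```typescript
// Splits text into segments of at most maxLength characters, preferring to
// break after sentence punctuation, then at a space, then with a hard cut.
// A hard cut may split a surrogate pair; real code may want to guard that.
function splitText(text: string, maxLength = 300): string[] {
  const segments: string[] = [];
  let rest = text.trim();
  while (rest.length > maxLength) {
    const head = rest.slice(0, maxLength);
    const sentenceEnd = Math.max(
      head.lastIndexOf(". "),
      head.lastIndexOf("! "),
      head.lastIndexOf("? "),
    );
    const space = head.lastIndexOf(" ");
    const cut = sentenceEnd > 0 ? sentenceEnd + 1 : space > 0 ? space : maxLength;
    segments.push(rest.slice(0, cut).trim());
    rest = rest.slice(cut).trim();
  }
  if (rest.length > 0) segments.push(rest);
  return segments;
}

// Caller-supplied function that returns the predicted duration of one segment.
type PredictFn = (segment: string) => Promise<number>;

// Sums per-segment predictions, calling the predictor one segment at a time.
async function predictTotalDuration(
  text: string,
  predict: PredictFn,
  maxLength = 300,
): Promise<number> {
  let total = 0;
  for (const segment of splitText(text, maxLength)) {
    total += await predict(segment);
  }
  return total;
}
```

Here predict could wrap client.textToSpeech.predictDuration and return its duration field. The sum is an estimate, since the SDK may split the same text at different points when it generates audio.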

Common operations

// List voices with pagination
const list = await client.voices.listVoices({ pageSize: 20 });

// Search voices
const search = await client.voices.searchVoices({
  language: "ko,en",
  style: "happy",
});

// Get a single voice
const voice = await client.voices.getVoice({ voiceId: VOICE_ID });

// Predict duration (no credits deducted)
const { duration } = await client.textToSpeech.predictDuration({
  voiceId: VOICE_ID,
  predictTTSDurationRequest: {
    text: "How long will this be?",
    language: "en",
  },
});

// Get credit balance
const balance = await client.usage.getCreditBalance();

Type-safe enums (optional)

If you want type safety, the SDK exposes enum constants from @supertone/supertone/models.

import * as models from "@supertone/supertone/models";

models.APIConvertTextToSpeechUsingCharacterRequestLanguage.En;     // "en"
models.APIConvertTextToSpeechUsingCharacterRequestModel.SonaSpeech1; // "sona_speech_1"

Plain string literals ("en", "sona_speech_1") also work. Use whichever style you prefer.
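To illustrate why both styles can type-check against the same parameter, here is a general TypeScript pattern: a const object paired with a union type derived from its values. The Language object below is a hypothetical stand-in, and the SDK's actual enum definitions may be built differently.

```typescript
// Hypothetical stand-in for an SDK enum, written as a const object.
const Language = {
  En: "en",
  Ko: "ko",
} as const;

// Union of the object's values: "en" | "ko".
type Language = (typeof Language)[keyof typeof Language];

// Accepts either the named constant or the equivalent string literal.
function describeLanguage(language: Language): string {
  return `language=${language}`;
}

const viaConstant = describeLanguage(Language.En);
const viaLiteral = describeLanguage("en");
console.log(viaConstant === viaLiteral);
```

The constant gives autocompletion and a single place to rename, while the literal keeps call sites short. A misspelled literal is rejected at compile time in either case.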

Error handling

Errors are defined in @supertone/supertone/models/errors, and all of them extend SupertoneError.

import { Supertone } from "@supertone/supertone";
import * as errors from "@supertone/supertone/models/errors";

const client = new Supertone({ apiKey: process.env.SUPERTONE_API_KEY });

try {
  const response = await client.textToSpeech.createSpeech({ /* ... */ });
} catch (err) {
  if (err instanceof errors.TooManyRequestsErrorResponse) {
    console.log("Rate limited:", err.message);
  } else if (err instanceof errors.UnauthorizedErrorResponse) {
    console.log("Auth failed:", err.message);
  } else if (err instanceof errors.PaymentRequiredErrorResponse) {
    console.log("Out of credits:", err.message);
  } else if (err instanceof errors.SupertoneError) {
    console.log(`HTTP ${err.statusCode}: ${err.message}`);
  } else {
    throw err;
  }
}

Error class	HTTP status
`BadRequestErrorResponse`	400
`UnauthorizedErrorResponse`	401
`PaymentRequiredErrorResponse`	402
`ForbiddenErrorResponse`	403
`NotFoundErrorResponse`	404
`RequestTimeoutErrorResponse`	408
`PayloadTooLargeErrorResponse`	413
`UnsupportedMediaTypeErrorResponse`	415
`TooManyRequestsErrorResponse`	429
`InternalServerErrorResponse`	500

Network-layer errors (ConnectionError, RequestTimeoutError, RequestAbortedError, and so on) extend HTTPClientError rather than SupertoneError.
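When deciding whether a failed call is worth repeating, a small classifier over the status code can help. The sketch below assumes that 408, 429, and 5xx responses are transient, which is a common HTTP convention and not documented SDK behavior. It also relies only on the statusCode property used in the handler above, checked structurally so the example runs without the SDK.

```typescript
// Structural view of an error that carries an HTTP status code.
interface StatusError {
  statusCode: number;
}

// Type guard: true when the value is an object with a numeric statusCode.
function hasStatusCode(err: unknown): err is StatusError {
  return (
    typeof err === "object" &&
    err !== null &&
    typeof (err as { statusCode?: unknown }).statusCode === "number"
  );
}

// Assumption: 408, 429, and 5xx are treated as transient and worth retrying.
function isRetryable(err: unknown): boolean {
  if (!hasStatusCode(err)) return false;
  const status = err.statusCode;
  return status === 408 || status === 429 || status >= 500;
}
```

Network-layer errors carry no status code, so this check returns false for them. Handle those separately if connection failures should also be retried.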

Configuration

const client = new Supertone({
  apiKey: process.env.SUPERTONE_API_KEY,
  timeoutMs: 30_000,
  retryConfig: {
    strategy: "backoff",
    backoff: { initialInterval: 500, maxInterval: 60_000, exponent: 1.5, maxElapsedTime: 3_600_000 },
    retryConnectionErrors: true,
  },
});
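To get a feel for what the backoff values above imply, the sketch below computes a delay schedule under an assumed formula: each delay is initialInterval multiplied by exponent raised to the retry index, capped at maxInterval, and retries stop once the cumulative wait would exceed maxElapsedTime. The formula, the millisecond unit, and the absence of jitter are all assumptions for illustration, and the SDK's actual retry timing may differ.

```typescript
// Mirrors the backoff fields used in the configuration example.
interface BackoffOptions {
  initialInterval: number;
  maxInterval: number;
  exponent: number;
  maxElapsedTime: number;
}

// Assumed schedule: delay_n = min(initialInterval * exponent^n, maxInterval),
// stopping before the cumulative wait would exceed maxElapsedTime. No jitter.
function backoffDelays(options: BackoffOptions, attempts: number): number[] {
  const delays: number[] = [];
  let elapsed = 0;
  for (let n = 0; n < attempts; n++) {
    const delay = Math.min(
      options.initialInterval * Math.pow(options.exponent, n),
      options.maxInterval,
    );
    if (elapsed + delay > options.maxElapsedTime) break;
    delays.push(delay);
    elapsed += delay;
  }
  return delays;
}

const exampleBackoff: BackoffOptions = {
  initialInterval: 500,
  maxInterval: 60_000,
  exponent: 1.5,
  maxElapsedTime: 3_600_000,
};

console.log(backoffDelays(exampleBackoff, 4)); // [ 500, 750, 1125, 1687.5 ]
```

Under these assumptions the delay reaches the 60,000 cap after about a dozen retries, so maxElapsedTime is what ultimately bounds a long outage.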

See also

Python SDK

The equivalent SDK for Python.

Examples

Recipes for common workflows.
