Speak
Convert text to speech using Bland Beige
This endpoint only supports Bland’s Beige Voices.
Pricing
Text-to-speech pricing varies by plan:
- Start Plan: $0.02 per 100 characters (Default)
- Build Plan: $0.02 per 150 characters
- Scale Plan: $0.02 per 200 characters
- Enterprise Plan: $0.02 per 400 characters
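As a rough illustration of the rates above, the following sketch estimates the cost of a request from its character count. It assumes pro-rata billing (no rounding up to a full block), which the pricing list does not state; `estimate_cost` and the plan keys are hypothetical names used only for this example.

```python
# Characters covered by each $0.02 block, taken from the pricing list above.
CHARS_PER_BLOCK = {"start": 100, "build": 150, "scale": 200, "enterprise": 400}
PRICE_PER_BLOCK_USD = 0.02


def estimate_cost(text: str, plan: str = "start") -> float:
    """Estimate the USD cost of synthesizing `text` on the given plan.

    Assumption: cost scales linearly with character count. Actual billing
    may round to whole blocks or count characters differently.
    """
    if plan not in CHARS_PER_BLOCK:
        raise ValueError(f"unknown plan: {plan!r}")
    return len(text) / CHARS_PER_BLOCK[plan] * PRICE_PER_BLOCK_USD
```

Under this assumption, 1,000 characters would cost about $0.20 on the Start Plan and about $0.05 on the Enterprise Plan.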
Headers
Your API key for authentication.
Body Parameters
The ID or name of the Beige Voice to use for speech synthesis.
The text content to be converted to speech.
The audio output format for the generated speech.
Available formats:
- pcm_8000: PCM 8kHz, 16-bit, mono
- pcm_16000: PCM 16kHz, 16-bit, mono
- pcm_24000: PCM 24kHz, 16-bit, mono
- pcm_44100: PCM 44.1kHz, 16-bit, mono (default)
- ulaw_8000: μ-law 8kHz, 8-bit, mono
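The sample rate and bit depth of each format determine how many bytes one second of audio occupies, which is useful for sizing buffers or estimating duration. The sketch below encodes the formats listed above; the helper names are hypothetical, and the byte counts refer to raw audio data only, excluding any WAV header.

```python
# format name -> (sample rate in Hz, bytes per sample); every format is mono.
FORMATS = {
    "pcm_8000": (8000, 2),
    "pcm_16000": (16000, 2),
    "pcm_24000": (24000, 2),
    "pcm_44100": (44100, 2),
    "ulaw_8000": (8000, 1),
}


def bytes_per_second(fmt: str) -> int:
    """Raw audio bytes produced per second of speech in the given format."""
    rate, width = FORMATS[fmt]
    return rate * width


def duration_seconds(num_audio_bytes: int, fmt: str = "pcm_44100") -> float:
    """Playback duration of `num_audio_bytes` of raw audio data."""
    return num_audio_bytes / bytes_per_second(fmt)
```

For example, the default `pcm_44100` format yields 88,200 bytes per second, while `ulaw_8000` yields 8,000.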
Controls voice consistency and stability. Value between 0 and 1.
- 0: More varied, expressive speech
- 1: More consistent, stable speech
Controls voice expressiveness and emotion. Value between 0 and 1.
- 0: More monotone, neutral speech
- 1: More expressive, emotional speech
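Because both voice controls accept only values between 0 and 1, a client may want to validate them before sending a request. This is a minimal sketch; `check_unit_range` is a hypothetical helper, and the label passed to it is only used in the error message.

```python
def check_unit_range(label: str, value: float) -> float:
    """Return `value` as a float if it lies in [0, 1], else raise ValueError."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{label} must be between 0 and 1, got {value}")
    return float(value)
```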
Enable streaming audio response for real-time playback.
When true, audio is streamed as it’s generated. When false, the complete audio file is returned after generation.
Response
Non-Streaming Response
Complete WAV audio file with the specified output format containing the synthesized speech.
Streaming Response
Chunked WAV audio stream. The response starts with a WAV header followed by audio data chunks as they’re generated.
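Since the response begins with a WAV header, a client can read the audio parameters from the first bytes it receives before playing the rest. The sketch below assumes the common RIFF layout in which the `fmt ` chunk immediately follows the `WAVE` tag; this page does not specify the exact header layout, and `parse_wav_header` is a hypothetical helper. In streamed WAV data the size fields are often placeholders, so they are not read here.

```python
import struct


def parse_wav_header(data: bytes) -> dict:
    """Read audio parameters from the start of a WAV byte stream.

    Assumes "RIFF" <size> "WAVE" "fmt " <fmt size> followed by the
    standard 16-byte format fields (36 bytes in total).
    """
    if len(data) < 36:
        raise ValueError("need at least 36 bytes to parse the header")
    if data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    if data[12:16] != b"fmt ":
        raise ValueError("expected the fmt chunk right after WAVE")
    # Little-endian: format tag, channels, sample rate, byte rate,
    # block align, bits per sample.
    tag, channels, rate, byte_rate, _align, bits = struct.unpack_from(
        "<HHIIHH", data, 20
    )
    return {
        "format_tag": tag,  # 1 = PCM, 7 = mu-law
        "channels": channels,
        "sample_rate": rate,
        "byte_rate": byte_rate,
        "bits_per_sample": bits,
    }
```

A streaming client would buffer incoming chunks until it holds at least 36 bytes, call the parser once, and then pass the remaining audio data to its player.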
Response Headers
Time in milliseconds from request to first audio chunk/complete response.
Cost in USD for the text-to-speech conversion.
Always audio/x-wav for successful responses.
Size of the complete audio file in bytes (non-streaming only).
Set to chunked for streaming responses.
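A client can use these headers to tell streaming and non-streaming responses apart. The sketch below assumes the last three descriptions correspond to the standard HTTP headers `Content-Type`, `Content-Length`, and `Transfer-Encoding`; the header names are not shown on this page, so treat them as an inference. The timing and cost headers are left out because their names are not given. `summarize_response` is a hypothetical helper.

```python
def summarize_response(headers: dict) -> dict:
    """Classify a response from its headers (names matched case-insensitively)."""
    h = {k.lower(): v for k, v in headers.items()}
    # Ignore any parameters after the media type, e.g. "; charset=...".
    media_type = h.get("content-type", "").split(";")[0].strip().lower()
    streaming = h.get("transfer-encoding", "").strip().lower() == "chunked"
    size = None
    if not streaming and "content-length" in h:
        size = int(h["content-length"])
    return {
        "is_wav": media_type == "audio/x-wav",
        "streaming": streaming,
        "size_bytes": size,  # None when streaming or when the size is absent
    }
```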