Speak

curl -X POST "https://api.bland.ai/v1/speak" \
  -H "authorization: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "voice": "Maeve",
    "text": "Hello, this is a test of the text-to-speech system.",
    "output_format": "pcm_44100"
  }' \
  --output speech.wav

Example error response, returned when text is missing from the request body:

{
  "status": "error",
  "errors": [
    {
      "error": "Invalid Input",
      "message": "Missing text in request body - must be a string"
    }
  ]
}

POST /v1/speak

This endpoint only supports Bland’s Beige Voices.

Pricing

Text-to-speech pricing varies by plan:

Start Plan: $0.02 per 100 characters (Default)
Build Plan: $0.02 per 150 characters
Scale Plan: $0.02 per 200 characters
Enterprise Plan: $0.02 per 400 characters
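
As a rough illustration of the rates above, the sketch below estimates the charge for a piece of text. It assumes the cost scales linearly with character count; the rates do not say whether partial blocks are rounded up, so treat the result as an approximation and rely on the x-cost response header for the actual figure. The plan keys and the function name are hypothetical, not part of the API.

```python
from decimal import Decimal

# Characters covered by each $0.02 unit, taken from the plan list above.
CHARS_PER_UNIT = {"start": 100, "build": 150, "scale": 200, "enterprise": 400}
UNIT_PRICE_USD = Decimal("0.02")


def estimate_cost_usd(text: str, plan: str = "start") -> Decimal:
    """Estimate the cost of synthesizing `text`, assuming linear, unrounded billing."""
    if plan not in CHARS_PER_UNIT:
        raise ValueError(f"unknown plan: {plan!r}")
    return UNIT_PRICE_USD * len(text) / CHARS_PER_UNIT[plan]
```

Under these assumptions, 250 characters on the Start Plan come to 2.5 units of $0.02.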

Headers

authorization (string, required)

Your API key for authentication.

Body Parameters

voice (string, required)

The ID or name of the Beige Voice to use for speech synthesis.

text (string, required)

The text content to be converted to speech.

output_format (string, default: "pcm_44100")

The audio output format for the generated speech. Available formats:

pcm_8000 - PCM 8kHz, 16-bit, mono
pcm_16000 - PCM 16kHz, 16-bit, mono
pcm_24000 - PCM 24kHz, 16-bit, mono
pcm_44100 - PCM 44.1kHz, 16-bit, mono (default)
ulaw_8000 - μ-law 8kHz, 8-bit, mono
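
The format list above fixes the raw data rate of each option (sample rate times bytes per sample, one channel). The following sketch turns that into a size and duration calculation; the table and function names are hypothetical, and the figures cover audio data only, not the WAV header.

```python
# (sample_rate_hz, bytes_per_sample) for each documented mono format.
FORMATS = {
    "pcm_8000": (8000, 2),
    "pcm_16000": (16000, 2),
    "pcm_24000": (24000, 2),
    "pcm_44100": (44100, 2),
    "ulaw_8000": (8000, 1),
}


def bytes_per_second(output_format: str) -> int:
    """Raw audio bytes produced per second of speech in the given format."""
    rate, width = FORMATS[output_format]
    return rate * width


def duration_seconds(num_audio_bytes: int, output_format: str) -> float:
    """Playback length of `num_audio_bytes` of audio data in the given format."""
    return num_audio_bytes / bytes_per_second(output_format)
```

This can help when choosing between formats: pcm_44100 produces 88,200 bytes per second of speech, while ulaw_8000 produces 8,000.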

consistency (float)

Controls voice consistency and stability. Value between 0 and 1.

0 - More varied, expressive speech
1 - More consistent, stable speech

expressiveness (float)

Controls voice expressiveness and emotion. Value between 0 and 1.

0 - More monotone, neutral speech
1 - More expressive, emotional speech
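
Because both tuning parameters are limited to the range 0 to 1, it can be convenient to check them on the client before sending a request. The helper below is a minimal sketch with a hypothetical name; whether and how the API itself rejects out-of-range values is not stated here.

```python
def tuning_fields(consistency=None, expressiveness=None) -> dict:
    """Return the optional tuning fields, rejecting values outside 0..1."""
    fields = {}
    for name, value in (("consistency", consistency), ("expressiveness", expressiveness)):
        if value is None:
            continue  # omitted fields are left out of the request body
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
        fields[name] = float(value)
    return fields


# Merge the validated fields into a request body.
payload = {"voice": "Maeve", "text": "Hello.", **tuning_fields(0.3, 0.8)}
```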

stream (boolean, default: false)

Enable streaming audio response for real-time playback. When true, audio is streamed as it’s generated. When false, the complete audio file is returned after generation.
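
One way to consume a streamed response is to read it in fixed-size chunks and write each chunk out as it arrives. The sketch below uses only the Python standard library; the function names and the 4096-byte chunk size are illustrative choices, not part of the API, and stream_to_file performs a real, billable request when called.

```python
import json
import urllib.request

SPEAK_URL = "https://api.bland.ai/v1/speak"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the speak endpoint."""
    return urllib.request.Request(
        SPEAK_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"authorization": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def copy_chunks(source, sink, chunk_size: int = 4096) -> int:
    """Copy `source` to `sink` chunk by chunk; return the number of bytes copied."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        sink.write(chunk)
        total += len(chunk)
    return total


def stream_to_file(api_key: str, payload: dict, path: str) -> int:
    """Send the request and write audio to `path` as it arrives (network call)."""
    request = build_request(api_key, payload)
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        return copy_chunks(response, out)
```

A call such as stream_to_file("YOUR_API_KEY", {"voice": "Maeve", "text": "Hi.", "stream": True}, "audio_stream.wav") would then save the audio incrementally; a real-time player could replace the file with any object that has a write method.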

Response

Non-Streaming Response

audio (audio/x-wav)

Complete WAV audio file with the specified output format containing the synthesized speech.
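
To confirm that a complete response matches the requested output format, the returned bytes can be inspected with Python's standard wave module. This is a sketch with a hypothetical function name; note that the wave module reads uncompressed PCM only, so it is meant for the pcm_* formats rather than ulaw_8000.

```python
import io
import wave


def describe_wav(data: bytes) -> dict:
    """Read basic properties from a complete PCM WAV file held in memory."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_seconds": frames / rate,
        }
```

For a pcm_24000 request, one would expect 1 channel, a 2-byte sample width, and a 24000 Hz sample rate.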

Streaming Response

audio_stream (audio/x-wav)

Chunked WAV audio stream. The response starts with a WAV header followed by audio data chunks as they’re generated.
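
Since a streamed response begins with a WAV header, a client can read the audio parameters from the first bytes before any playback starts. The sketch below assumes the canonical 44-byte PCM header layout (RIFF, fmt, then data); the actual header layout used by this endpoint is not specified here, and the size fields in a streamed header may be placeholders because the total length is not known up front. The function name is hypothetical.

```python
import struct

# Canonical 44-byte WAV header: RIFF chunk, 16-byte fmt chunk, data chunk header.
HEADER = struct.Struct("<4sI4s4sIHHIIHH4sI")


def parse_stream_header(first_bytes: bytes) -> dict:
    """Extract audio parameters from the first 44 bytes of a WAV stream."""
    if len(first_bytes) < HEADER.size:
        raise ValueError(f"need at least {HEADER.size} bytes, got {len(first_bytes)}")
    (riff, _riff_size, wave_id, fmt_id, _fmt_size, audio_format, channels,
     sample_rate, _byte_rate, _block_align, bits, data_id, _data_size) = HEADER.unpack_from(first_bytes)
    if (riff, wave_id, fmt_id, data_id) != (b"RIFF", b"WAVE", b"fmt ", b"data"):
        raise ValueError("not a canonical 44-byte WAV header")
    return {
        "audio_format": audio_format,  # 1 = PCM, 7 = mu-law in the WAV format registry
        "channels": channels,
        "sample_rate_hz": sample_rate,
        "bits_per_sample": bits,
    }
```

Everything after the first 44 bytes would then be treated as raw audio data in the reported format.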

Response Headers

x-latency (string)

Time in milliseconds from the request to the first audio chunk (streaming) or to the complete response (non-streaming).

x-cost (string)

Cost in USD for the text-to-speech conversion.

Content-Type (string)

Always audio/x-wav for successful responses.

Content-Length (string)

Size of the complete audio file in bytes (non-streaming only).

Transfer-Encoding (string)

Set to chunked for streaming responses.
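
The response headers above arrive as strings, so a client will usually convert them before logging or aggregating. The following sketch assumes x-latency and x-cost contain plain numeric text with no unit suffix, which is not confirmed here; the function name and output keys are hypothetical.

```python
from decimal import Decimal


def summarize_headers(headers) -> dict:
    """Convert the documented response headers into typed values."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    latency = lowered.get("x-latency")
    cost = lowered.get("x-cost")
    length = lowered.get("content-length")
    return {
        "latency_ms": float(latency) if latency is not None else None,
        "cost_usd": Decimal(cost) if cost is not None else None,
        "content_length": int(length) if length is not None else None,
        "streaming": lowered.get("transfer-encoding", "").lower() == "chunked",
    }
```

Decimal is used for the cost so that summing many small charges does not accumulate floating-point error.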

Streaming example:

curl -X POST "https://api.bland.ai/v1/speak" \
  -H "authorization: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "voice": "Maeve",
    "text": "This is a streaming example that will return audio chunks as they are generated.",
    "output_format": "pcm_24000",
    "stream": true
  }' \
  --output audio_stream.wav

Custom voice parameters example:

curl -X POST "https://api.bland.ai/v1/speak" \
  -H "authorization: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "voice": "Maeve",
    "text": "This example uses custom voice parameters for more expressive speech.",
    "output_format": "pcm_16000",
    "consistency": 0.3,
    "expressiveness": 0.8
  }' \
  --output expressive.wav
