POST /v1/voices/clone
curl -X POST "https://api.bland.ai/v1/voices/clone" \
  -H "Authorization: your_api_key_here" \
  -F "name=Professional Voice" \
  -F "audio_samples=@voice_sample.wav" \
  -F "gender=female"
{
  "status": 200,
  "data": {
    "voice_id": "4cc89edf-c4b7-4df8-98c5-f8548fcd7c62",
    "name": "Blandie"
  },
  "errors": null
}

Authentication

Authorization
string
required
Your API key for authentication.

Request Body

The request must be sent as multipart/form-data with the following fields:
name
string
required
The name for your voice clone. Must be 30 characters or less.
Validation:
  • Required field
  • Maximum 30 characters
  • Will be used as the display name in your voice library
audio_samples
file[]
required
Audio file(s) containing voice samples for cloning.
Requirements:
  • At least one audio file is required
  • The default BTTS v3 engine only consumes a single sample; additional files are ignored
  • Supported formats: WAV, MP3, and other common audio formats
  • Maximum file size: 10MB per file
  • Maximum total upload size: 25MB
  • Audio duration: 1-60 seconds per sample
  • Higher quality audio samples produce better voice clones
  • Recommended: 10-15 seconds of clear speech
gender
string
The gender of the voice being cloned.
Valid values:
  • "male"
  • "female"
Usage:
  • Optional but recommended for better voice modeling
  • Helps the BTTS system optimize voice characteristics
description
string
Optional description for the voice clone.
Usage:
  • Gives the underlying model additional context about the voice, such as tone and style
isBTTS_V2
boolean
Opt into the production-grade BTTS v2 cloning engine instead of the default BTTS v3.
Usage:
  • Pass "true" (as a string) in the multipart form to use BTTS v2
  • V2 is the previous-generation stable engine; V3 is the default
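
The text fields above can be assembled and checked on the client before the request is sent. The following is a minimal sketch, not an official client: the helper name build_clone_form and its parameter use_btts_v2 are hypothetical, and only the field names, the 30-character limit, the two gender values, and the "true" string for isBTTS_V2 come from this page.

```python
from typing import Dict, Optional

MAX_NAME_LENGTH = 30                # limit stated in the "name" field description
VALID_GENDERS = {"male", "female"}  # values listed for the "gender" field


def build_clone_form(
    name: str,
    gender: Optional[str] = None,
    description: Optional[str] = None,
    use_btts_v2: bool = False,
) -> Dict[str, str]:
    """Build the text fields of the multipart form (hypothetical helper)."""
    if not name:
        raise ValueError("name is required")
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(f"name must be {MAX_NAME_LENGTH} characters or less")
    if gender is not None and gender not in VALID_GENDERS:
        raise ValueError('gender must be "male" or "female"')

    form = {"name": name}
    if gender is not None:
        form["gender"] = gender
    if description:
        form["description"] = description
    if use_btts_v2:
        # Multipart form values are strings, so the flag is sent as "true".
        form["isBTTS_V2"] = "true"
    return form
```

With an HTTP library such as requests, the returned dictionary would typically be passed as the form data alongside the audio_samples file part and the Authorization header; that wiring is left out here because it depends on the client you use.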

Audio Validation Limits

The following limits are enforced for audio file uploads:
  • Maximum file size: 10MB per individual file
  • Maximum total upload size: 25MB per request
  • Maximum files per upload: 5 files
  • Maximum samples per voice: 5 samples total (across all uploads)
  • Audio duration: 1 to 60 seconds per sample
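
These limits can be checked locally before uploading, which avoids a round trip for files the API would reject. The sketch below is illustrative only: check_audio_samples and wav_duration_seconds are hypothetical names, the duration check covers WAV files only (via the standard-library wave module; other formats would need a different decoder), and it assumes 1MB means 1024 * 1024 bytes, which this page does not specify.

```python
import os
import wave
from typing import List, Sequence

MB = 1024 * 1024            # assumption: the page does not define "MB" exactly
MAX_FILE_BYTES = 10 * MB    # maximum size per individual file
MAX_TOTAL_BYTES = 25 * MB   # maximum total upload size per request
MAX_FILES = 5               # maximum files per upload
MIN_SECONDS, MAX_SECONDS = 1.0, 60.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file from its frame count and sample rate."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_audio_samples(paths: Sequence[str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if not paths:
        problems.append("at least one audio file is required")
    if len(paths) > MAX_FILES:
        problems.append(f"at most {MAX_FILES} files per upload, got {len(paths)}")

    total = 0
    for path in paths:
        size = os.path.getsize(path)
        total += size
        if size > MAX_FILE_BYTES:
            problems.append(f"{path}: larger than 10MB")
        # Duration is only checked for WAV files in this sketch.
        if path.lower().endswith(".wav"):
            seconds = wav_duration_seconds(path)
            if not MIN_SECONDS <= seconds <= MAX_SECONDS:
                problems.append(f"{path}: duration {seconds:.1f}s is outside 1-60 seconds")

    if total > MAX_TOTAL_BYTES:
        problems.append("total upload size exceeds 25MB")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue with a batch of samples at once.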

Response

status
number
HTTP status code (200 for success)
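data
object
Details of the created voice clone: voice_id (the clone's unique identifier) and name (its display name).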
errors
array
Array of error objects (null on success)
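
A client can use the status and errors fields to decide whether data is safe to read. The following is a rough sketch under stated assumptions: VoiceCloneError and extract_voice are hypothetical names, and because this page does not document the shape of individual error objects, the errors value is passed through untouched rather than parsed.

```python
import json
from typing import Any, Dict


class VoiceCloneError(Exception):
    """Raised when the response envelope reports a failure (hypothetical type)."""


def extract_voice(body: str) -> Dict[str, Any]:
    """Parse the JSON response body and return the "data" object on success."""
    payload = json.loads(body)
    status = payload.get("status")
    errors = payload.get("errors")
    # Success is status 200 with errors set to null, per the field descriptions.
    if status != 200 or errors:
        raise VoiceCloneError(f"voice clone failed (status={status}): {errors}")
    return payload["data"]
```

On success, the returned dictionary holds the voice_id to store for later text-to-speech requests.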

Usage in Text-to-Speech

Once created, your BTTS voice clone can be used with the /speak endpoint by referencing the returned voice_id or name.
