POST /v1/voices/clone
curl -X POST "https://api.bland.ai/v1/voices/clone" \
  -H "Authorization: Bearer your_jwt_token_here" \
  -F "name=Professional Voice" \
  -F "audio_samples=@voice_sample1.wav" \
  -F "audio_samples=@voice_sample2.wav" \
  -F "gender=female"
{
  "status": 200,
  "data": {
    "voice_id": "4cc89edf-c4b7-4df8-98c5-f8548fcd7c62",
    "name": "Blandie"
  },
  "errors": null
}

Authentication

Authorization
string
required
Your API key for authentication.

Request Body

The request must be sent as multipart/form-data with the following fields:
name
string
required
The name for your voice clone. Must be 30 characters or fewer.
Validation:
  • Required field
  • Maximum 30 characters
  • Will be used as the display name in your voice library
audio_samples
file[]
required
Audio files containing voice samples for cloning. Multiple files can be uploaded.
Requirements:
  • At least one audio file is required
  • Maximum 5 files per upload
  • Supported formats: WAV, MP3, and other common audio formats
  • Maximum file size: 10MB per file
  • Maximum total upload size: 25MB
  • Audio duration: 1-60 seconds per sample
  • Maximum 5 samples per voice (total across all uploads)
  • Higher quality audio samples produce better voice clones
  • Recommended: 10-30 seconds of clear speech per sample
  • 1-2 samples are recommended for optimal results
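
The count and size limits listed above can be checked locally before uploading. The following is a minimal sketch, not part of the API: the function name check_sample_files is hypothetical, and treating "MB" as 1024 * 1024 bytes is an assumption, since the requirements do not say how the server counts bytes.

```python
import os

# Limits taken from the requirements list above. Treating "MB" as
# 1024 * 1024 bytes is an assumption; the API may count 10**6 bytes.
MAX_FILES = 5
MAX_FILE_BYTES = 10 * 1024 * 1024
MAX_TOTAL_BYTES = 25 * 1024 * 1024


def check_sample_files(paths):
    """Return a list of problems found; an empty list means the files pass."""
    problems = []
    if not paths:
        problems.append("at least one audio file is required")
    if len(paths) > MAX_FILES:
        problems.append(f"too many files: {len(paths)} > {MAX_FILES}")
    total = 0
    for path in paths:
        size = os.path.getsize(path)  # size on disk, in bytes
        total += size
        if size > MAX_FILE_BYTES:
            problems.append(f"{path}: {size} bytes exceeds the per-file limit")
    if total > MAX_TOTAL_BYTES:
        problems.append(f"total upload of {total} bytes exceeds the request limit")
    return problems
```

A client might call check_sample_files(["voice_sample1.wav", "voice_sample2.wav"]) and skip the upload if the returned list is not empty. The server remains the authority on what is accepted.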
gender
string
The gender of the voice being cloned.
Valid values:
  • "male"
  • "female"
Usage:
  • Optional but recommended for better voice modeling
  • Helps the BTTS system optimize voice characteristics
description
string
Optional description for the voice clone.
Usage:
  • Provides additional context to the underlying model, such as tone and style
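
To illustrate how the fields above fit together, here is a hedged sketch that assembles the form data for a multipart/form-data request. The helper build_clone_form and its constants are hypothetical names introduced for illustration. It only validates and arranges the fields documented above and does not send anything.

```python
ALLOWED_GENDERS = {"male", "female"}
MAX_NAME_LENGTH = 30


def build_clone_form(name, sample_paths, gender=None, description=None):
    """Return (fields, files) for a multipart/form-data request.

    fields is a dict of text fields; files is a list of
    ("audio_samples", path) pairs, one per sample, mirroring the repeated
    -F "audio_samples=@..." flags in the curl example.
    """
    if not name or len(name) > MAX_NAME_LENGTH:
        raise ValueError("name is required and must be 30 characters or less")
    if not sample_paths:
        raise ValueError("at least one audio sample is required")
    if gender is not None and gender not in ALLOWED_GENDERS:
        raise ValueError('gender must be "male" or "female"')

    fields = {"name": name}
    # Optional fields are only included when the caller supplies them.
    if gender is not None:
        fields["gender"] = gender
    if description is not None:
        fields["description"] = description
    files = [("audio_samples", path) for path in sample_paths]
    return fields, files
```

With the third-party requests library, the result could be sent by opening each path in binary mode and passing the fields as data= and the opened files as files= to requests.post, together with the same Authorization header shown in the curl example.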

Audio Validation Limits

The following limits are enforced for audio file uploads:
  • Maximum file size: 10MB per individual file
  • Maximum total upload size: 25MB per request
  • Maximum files per upload: 5 files
  • Maximum samples per voice: 5 samples total (across all uploads)
  • Audio duration range: 1-60 seconds per sample
  • Minimum duration: 1 second per sample
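
The duration limit can also be checked before uploading. This is a small sketch that assumes the sample is an uncompressed PCM WAV file, which is the only format Python's standard wave module reads; MP3 and other formats would need a different library. The function names are hypothetical.

```python
import wave

# Duration window from the limits listed above.
MIN_SECONDS = 1.0
MAX_SECONDS = 60.0


def wav_duration_seconds(path):
    """Return the duration of an uncompressed PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        # Duration is the number of audio frames divided by frames per second.
        return wav.getnframes() / wav.getframerate()


def duration_in_range(path):
    """True if the sample falls inside the documented 1-60 second window."""
    return MIN_SECONDS <= wav_duration_seconds(path) <= MAX_SECONDS
```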

Response

status
number
HTTP status code (200 for success)
data
object
Contains the voice_id and name of the newly created voice clone.
errors
array
Array of error objects (null on success)
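
As a sketch of how a client might read this response, the function below takes the already-parsed JSON body and returns the new voice_id. The name extract_voice_id is hypothetical, and treating any status other than 200 or any non-empty errors value as a failure is an assumption based on the field descriptions above.

```python
def extract_voice_id(body):
    """Return data.voice_id from a parsed response, or raise on failure."""
    status = body.get("status")
    errors = body.get("errors")
    # Assumption: success means status 200 and errors set to null (None).
    if status != 200 or errors:
        raise RuntimeError(f"voice clone failed: status={status} errors={errors}")
    return body["data"]["voice_id"]
```

Given the sample response shown in this document, extract_voice_id returns "4cc89edf-c4b7-4df8-98c5-f8548fcd7c62".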

Usage in Text-to-Speech

Once created, your BTTS voice clone can be used with the /speak endpoint by referencing the returned voice_id or name.