POST /v1/voices/clone
curl -X POST "https://api.bland.ai/v1/voices/clone" \
  -H "Authorization: Bearer your_jwt_token_here" \
  -F "name=Professional Voice" \
  -F "samples=@voice_sample1.wav" \
  -F "samples=@voice_sample2.wav" \
  -F "gender=female"

{
  "status": 200,
  "data": {
    "voice_id": "4cc89edf-c4b7-4df8-98c5-f8548fcd7c62",
    "name": "Blandie"
  },
  "errors": null
}

Authentication

Authorization
string
required

Your API key for authentication.

Request Body

The request must be sent as multipart/form-data with the following fields:

name
string
required

The name for your voice clone. Must be 30 characters or fewer.

Validation:

  • Required field
  • Maximum 30 characters
  • Will be used as the display name in your voice library
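The name rules above can be checked on the client before uploading. This is a minimal sketch, not part of the API: `check_voice_name` is a hypothetical helper, and it assumes that the 30-character limit counts Unicode code points and that a whitespace-only name counts as missing. The server may apply different rules.

```python
from typing import List

MAX_NAME_LENGTH = 30  # limit stated in the field description above


def check_voice_name(name: str) -> List[str]:
    """Return a list of problems with the name (empty list means it looks valid).

    Hypothetical helper: the API performs its own validation; this only
    avoids a round trip for obviously invalid input.
    """
    problems = []
    if not name or not name.strip():
        # Assumption: an empty or whitespace-only name is treated as missing.
        problems.append("name is required")
    elif len(name) > MAX_NAME_LENGTH:
        # Assumption: length is measured in Unicode code points.
        problems.append(f"name must be {MAX_NAME_LENGTH} characters or fewer")
    return problems
```

An empty result means the name passes these local checks; it does not guarantee the server will accept it.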
samples
file[]
required

Audio files containing voice samples for cloning. Multiple files can be uploaded.

Requirements:

  • At least one audio file is required
  • Maximum 5 files per upload
  • Supported formats: WAV, MP3, and other common audio formats
  • Maximum file size: 10MB per file
  • Maximum total upload size: 25MB
  • Audio duration: 1-60 seconds per sample
  • Maximum 5 samples per voice (total across all uploads)
  • Higher quality audio samples produce better voice clones
  • Recommended: 10-30 seconds of clear speech per sample
  • 1-2 samples are recommended for optimal results
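The file count and size requirements above can be verified locally before sending anything. The sketch below is illustrative only: the helper names are hypothetical, and it assumes "MB" means 1024 * 1024 bytes (the API may count 1,000,000 bytes instead). It does not check audio duration or the per-voice total across earlier uploads, since the latter depends on server-side state.

```python
import os
from typing import List, Tuple

MAX_FILES_PER_UPLOAD = 5
# Assumption: "MB" means 1024 * 1024 bytes; the API may use 10**6 instead.
MAX_FILE_BYTES = 10 * 1024 * 1024
MAX_TOTAL_BYTES = 25 * 1024 * 1024


def check_sample_sizes(samples: List[Tuple[str, int]]) -> List[str]:
    """Return a list of problems for (file name, size in bytes) pairs."""
    problems = []
    if not samples:
        problems.append("at least one audio file is required")
    if len(samples) > MAX_FILES_PER_UPLOAD:
        problems.append(f"at most {MAX_FILES_PER_UPLOAD} files per upload")
    for name, size in samples:
        if size > MAX_FILE_BYTES:
            problems.append(f"{name} exceeds the 10MB per-file limit")
    if sum(size for _, size in samples) > MAX_TOTAL_BYTES:
        problems.append("total upload exceeds the 25MB limit")
    return problems


def check_sample_paths(paths: List[str]) -> List[str]:
    """Convenience wrapper that reads each file's size from disk."""
    return check_sample_sizes([(path, os.path.getsize(path)) for path in paths])
```

For example, `check_sample_paths(["voice_sample1.wav", "voice_sample2.wav"])` returns an empty list when both files fit within the limits.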
gender
string

The gender of the voice being cloned.

Valid values:

  • "male"
  • "female"

Usage:

  • Optional but recommended for better voice modeling
  • Helps the BTTS system optimize voice characteristics
description
string

Optional description for the voice clone.

Usage:

  • Provides additional context to the underlying model about tone, style, and similar characteristics
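To show how the fields above fit together in one multipart/form-data body, here is a standard-library sketch that builds the request without sending it. The URL and Bearer header are taken from the curl example; `build_clone_request` is a hypothetical helper, and the `application/octet-stream` content type for each file is an assumption, since the expected per-file content type is not documented here. File names containing double quotes are not escaped in this sketch.

```python
import urllib.request
import uuid
from typing import List, Optional, Tuple

CLONE_URL = "https://api.bland.ai/v1/voices/clone"  # from the curl example
VALID_GENDERS = ("male", "female")


def build_clone_request(
    token: str,
    name: str,
    samples: List[Tuple[str, bytes]],
    gender: Optional[str] = None,
    description: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data clone request.

    samples is a list of (file name, file bytes) pairs.
    """
    if gender is not None and gender not in VALID_GENDERS:
        raise ValueError('gender must be "male" or "female"')

    # Optional fields are left out entirely when not provided.
    fields = {"name": name}
    if gender is not None:
        fields["gender"] = gender
    if description is not None:
        fields["description"] = description

    boundary = uuid.uuid4().hex
    parts = []
    for key, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{key}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    for filename, content in samples:
        # Each file repeats the "samples" field name, like the repeated -F flags.
        # Assumption: a generic content type is acceptable to the server.
        header = (
            f'--{boundary}\r\nContent-Disposition: form-data; name="samples"; '
            f'filename="{filename}"\r\nContent-Type: application/octet-stream\r\n\r\n'
        )
        parts.append(header.encode("utf-8") + content + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        CLONE_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

The returned object can be passed to `urllib.request.urlopen` to perform the upload. Building the request separately keeps the encoding step easy to inspect and test.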

Audio Validation Limits

The following limits are enforced for audio file uploads:

  • Maximum file size: 10MB per individual file
  • Maximum total upload size: 25MB per request
  • Maximum files per upload: 5 files
  • Maximum samples per voice: 5 samples total (across all uploads)
  • Audio duration range: 1-60 seconds per sample
  • Minimum duration: 1 second per sample
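The duration limit can be checked locally for uncompressed WAV samples with Python's standard `wave` module. This is a sketch under stated assumptions: the helper names are hypothetical, the 1-60 second bounds are treated as inclusive, and MP3 or other formats would need a different decoder.

```python
import wave
from typing import BinaryIO, Union

MIN_SECONDS = 1.0
MAX_SECONDS = 60.0


def wav_duration_seconds(source: Union[str, BinaryIO]) -> float:
    """Return the duration of an uncompressed WAV file in seconds.

    source may be a file path or a binary file object opened for reading.
    """
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def duration_in_range(seconds: float) -> bool:
    """True when a sample falls inside the documented 1-60 second window.

    Assumption: both bounds are inclusive.
    """
    return MIN_SECONDS <= seconds <= MAX_SECONDS
```

For example, `duration_in_range(wav_duration_seconds("voice_sample1.wav"))` reports whether a local WAV file is within the documented range.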

Response

status
number

HTTP status code (200 for success)

data
object

Contains the voice_id and name of the created voice clone.

errors
array

Array of error objects (null on success)
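A caller will typically want to pull `voice_id` out of the response and fail loudly otherwise. The sketch below works on an already decoded JSON body. `VoiceCloneError` and `extract_voice_id` are hypothetical names, and because the shape of individual error objects is not documented here, errors are reported verbatim rather than parsed.

```python
import json
from typing import Any, Dict


class VoiceCloneError(Exception):
    """Raised when the response does not describe a successful clone."""


def extract_voice_id(payload: Dict[str, Any]) -> str:
    """Return data.voice_id from a decoded response body, or raise."""
    errors = payload.get("errors")
    if payload.get("status") != 200 or errors:
        # Assumption: the error object shape is undocumented, so the
        # errors value is included as-is in the message.
        raise VoiceCloneError(
            f"clone failed: status={payload.get('status')} errors={errors!r}"
        )
    data = payload.get("data") or {}
    voice_id = data.get("voice_id")
    if not voice_id:
        raise VoiceCloneError("response is missing data.voice_id")
    return voice_id


# The success body from the example response on this page.
example = json.loads(
    '{"status": 200, "data": {"voice_id": "4cc89edf-c4b7-4df8-98c5-f8548fcd7c62",'
    ' "name": "Blandie"}, "errors": null}'
)
voice_id = extract_voice_id(example)
```

Checking both `status` and `errors` is a defensive choice; either one alone may be sufficient in practice.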

Usage in Text-to-Speech

Once created, your BTTS voice clone can be used with our /speak endpoint by referencing the returned voice_id or name.