Text to Speech
Clone
Create a custom voice clone using Bland’s BTTS (Beige Text-to-Speech)
POST
Authentication
Your API key for authentication.
Request Body
The request must be sent as multipart/form-data with the following fields:
The name for your voice clone. Must be 30 characters or less.
Validation:
- Required field
- Maximum 30 characters
- Will be used as the display name in your voice library
Audio files containing voice samples for cloning. Multiple files can be uploaded.
Requirements:
- At least one audio file is required
- Maximum 5 files per upload
- Supported formats: WAV, MP3, and other common audio formats
- Maximum file size: 10MB per file
- Maximum total upload size: 25MB
- Audio duration: 1-60 seconds per sample
- Maximum 5 samples per voice (total across all uploads)
- Higher quality audio samples produce better voice clones
- Recommended: 10-30 seconds of clear speech per sample
- 1-2 samples are recommended for optimal results
The gender of the voice being cloned.
Valid values:
"male"
"female"
Usage:
- Optional but recommended for better voice modeling
- Helps the BTTS system optimize voice characteristics
Optional description for the voice clone.
Usage:
- Provides additional context to the underlying model, such as tone and style
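The text-field rules above (required name of at most 30 characters, optional gender restricted to "male" or "female", optional description) can be checked locally before a request is sent. The following is a minimal sketch, not Bland's own client code. The form field names `name`, `gender`, and `description`, and the helper `build_clone_fields`, are assumptions made for illustration, since this page does not spell out the field names.

```python
from typing import Dict, Optional

MAX_NAME_LENGTH = 30                 # "Must be 30 characters or less"
VALID_GENDERS = {"male", "female"}   # valid values listed above


def build_clone_fields(name: str,
                       gender: Optional[str] = None,
                       description: Optional[str] = None) -> Dict[str, str]:
    """Validate and return the text fields of the multipart form.

    Field names used as dict keys are assumed, not confirmed by the docs.
    """
    if not name:
        raise ValueError("name is required")
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(f"name must be {MAX_NAME_LENGTH} characters or less")
    if gender is not None and gender not in VALID_GENDERS:
        raise ValueError('gender must be "male" or "female"')

    fields = {"name": name}
    if gender is not None:
        fields["gender"] = gender            # optional but recommended
    if description:
        fields["description"] = description  # optional context for the model
    return fields
```

The returned dictionary could then be passed as the form data of a multipart/form-data request, alongside the audio files, using whichever HTTP client you prefer.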
Audio Validation Limits
The following limits are enforced for audio file uploads:
- Maximum file size: 10MB per individual file
- Maximum total upload size: 25MB per request
- Maximum files per upload: 5 files
- Maximum samples per voice: 5 samples total (across all uploads)
- Audio duration range: 1-60 seconds per sample
- Minimum duration: 1 second per sample
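A client can pre-check these limits before uploading, which avoids a round trip for files that would be rejected. The sketch below is illustrative only: the function `check_upload` is hypothetical, and it assumes 1MB means 1,048,576 bytes and that both ends of the 1-60 second range are inclusive, neither of which is stated on this page.

```python
from typing import List, Tuple

MB = 1024 * 1024              # assumption: binary megabytes
MAX_FILE_BYTES = 10 * MB      # per individual file
MAX_TOTAL_BYTES = 25 * MB     # per request
MAX_FILES_PER_UPLOAD = 5
MAX_SAMPLES_PER_VOICE = 5     # total across all uploads
MIN_SECONDS, MAX_SECONDS = 1.0, 60.0


def check_upload(samples: List[Tuple[int, float]],
                 existing_samples: int = 0) -> List[str]:
    """Return a list of limit violations; an empty list means the upload passes.

    samples: one (size_in_bytes, duration_in_seconds) pair per audio file.
    existing_samples: samples already attached to this voice by earlier uploads.
    """
    problems = []
    if not samples:
        problems.append("at least one audio file is required")
    if len(samples) > MAX_FILES_PER_UPLOAD:
        problems.append("more than 5 files in one upload")
    if existing_samples + len(samples) > MAX_SAMPLES_PER_VOICE:
        problems.append("more than 5 samples for this voice")
    if sum(size for size, _ in samples) > MAX_TOTAL_BYTES:
        problems.append("total upload size exceeds 25MB")
    for index, (size, seconds) in enumerate(samples):
        if size > MAX_FILE_BYTES:
            problems.append(f"file {index} exceeds 10MB")
        if not MIN_SECONDS <= seconds <= MAX_SECONDS:
            problems.append(f"file {index} is outside the 1-60 second range")
    return problems
```

For example, three 9MB files each pass the per-file limit but together exceed the 25MB request limit, so `check_upload` reports a single total-size problem.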
Response
HTTP status code (200 for success)
Array of error objects (null on success)
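The two response fields described above suggest a simple success check: the status is 200 and the error array is null. The sketch below assumes the response body is JSON and that the fields are keyed `status` and `errors`. Both key names, and the helpers `clone_succeeded` and `clone_errors`, are assumptions, since the field names are not given on this page.

```python
import json
from typing import List


def clone_errors(body: str) -> List[dict]:
    """Return the error objects from a response body, or [] when there are none."""
    payload = json.loads(body)
    errors = payload.get("errors")  # assumed key; null on success
    return list(errors) if errors else []


def clone_succeeded(body: str) -> bool:
    """True when the body reports status 200 and carries no errors."""
    payload = json.loads(body)
    return payload.get("status") == 200 and not clone_errors(body)  # "status" is an assumed key
```

The structure of each error object is not documented here, so the sketch returns the objects unchanged rather than reading specific keys from them.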
Usage in Text-to-Speech
Once created, your BTTS voice clone can be used with our /speak endpoint by referencing the returned voice_id or name.