# Backfill Citation Schema

> Retroactively apply one or more citation schemas to extract data from existing call transcripts.

## Overview

The backfill endpoint allows you to apply citation schemas to calls and SMS conversations that have already been completed. This is useful when you:

* Create a new schema and want to extract data from historical calls or conversations
* Need to re-analyze calls or conversations with updated schemas
* Want to extract data from multiple calls or conversations at scale

Call backfill operations are **asynchronous** — you'll receive a `workflow_id` immediately and can poll the status endpoint to check progress. SMS conversation backfill is **synchronous** — results are returned directly in the response.

<Info>
  This is an **Enterprise-only feature**. Contact your Bland representative or reach out to sales to enable this functionality.
</Info>

<Warning>
  Backfilling updates citation variables for the specified call(s) and schema(s). Variables are merged with existing data: only the variables defined in the schema being backfilled are updated.
</Warning>

***

## Request

### Headers

<ParamField header="authorization" type="string" required>
  Your API key for authentication.
</ParamField>

### Body Parameters

#### Schema Parameters (Choose One)

<ParamField body="schema_id" type="string">
  A single citation schema ID from your organization to apply to the call(s).
</ParamField>

<ParamField body="schema_ids" type="string[]">
  An array of citation schema IDs to apply to the call(s). All schemas will be processed for each call.

  **Note**: Cannot be combined with `schema` (inline schema).
</ParamField>

<ParamField body="schema" type="object">
  An inline citation schema definition. Useful for one-time analysis without creating a schema in your account.

  Can be combined with `call_id`/`call_ids` or `conversation_id`/`conversation_ids` for multi-call or multi-conversation analysis.

  **Note**: Cannot be combined with `schema_ids`.
</ParamField>

#### Call Parameters

<ParamField body="call_id" type="string">
  A single call ID to analyze with the specified schema(s).
</ParamField>

<ParamField body="call_ids" type="string[]">
  An array of call IDs to analyze with the specified schema(s). All schemas will be applied to each call.
</ParamField>

#### SMS Conversation Parameters

<ParamField body="conversation_id" type="string">
  A single SMS conversation ID to analyze with the specified schema(s). Processed synchronously.
</ParamField>

<ParamField body="conversation_ids" type="string[]">
  An array of SMS conversation IDs to analyze. Maximum 50 conversations per request. All schemas will be applied to each conversation. Processed synchronously.
</ParamField>

***

## Response

Returns `202 Accepted` when calls are included (async processing), or `200 OK` when only SMS conversations are included (sync processing).

<ResponseField name="status" type="string">
  `"processing"` when calls are being processed asynchronously, or `"complete"` when only SMS conversations were processed.
</ResponseField>

<ResponseField name="workflow_id" type="string">
  Unique identifier for the call backfill workflow. Only present when calls are included. Use this to check status and retrieve results.
</ResponseField>

<ResponseField name="call_count" type="integer">
  Number of calls being processed.
</ResponseField>

<ResponseField name="conversation_count" type="integer">
  Number of SMS conversations processed.
</ResponseField>

<ResponseField name="schema_count" type="integer">
  Number of schemas being applied.
</ResponseField>

<ResponseField name="total_combinations" type="integer">
  Total number of call+schema extractions that will be performed (`call_count × schema_count`). Only present when calls are included.
</ResponseField>

<ResponseField name="conversations" type="array">
  Results for SMS conversation backfill. Only present when conversations are included. Each entry contains:

  * `conversation_id` — the conversation that was analyzed
  * `status` — `"success"`, `"skipped"`, or `"error"`
  * `variables_extracted` — count of unique variable names extracted across all schemas (same definition as `result.calls[].variables_extracted_count`)
  * `previous_variables` — the conversation's citation variables **before** this backfill operation
  * `newly_extracted_variables` — variables that were extracted or updated by **this backfill** operation
  * `final_variables` — the conversation's citation variables **after** this backfill operation (complete merged state)
  * `error` — error message if the extraction failed
</ResponseField>

<ResponseField name="message" type="string">
  Instructions on how to check the status of the call workflow.
</ResponseField>

***

## Checking Status

Use the workflow ID to poll the status endpoint:

```bash theme={null}
GET https://api.bland.ai/v1/citation_schemas/backfill/status/{workflow_id}
```
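As a rough illustration, the status request can be assembled with Python's standard library. This is a sketch, not an official client: `build_status_request` is a hypothetical helper name, and leaving the colons in the workflow ID unescaped is an assumption based on the curl examples on this page.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://api.bland.ai/v1/citation_schemas/backfill/status/"


def build_status_request(workflow_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for a backfill workflow."""
    # Assumption: colons in the workflow ID are sent as-is, as in the curl examples.
    path = urllib.parse.quote(workflow_id, safe=":")
    return urllib.request.Request(
        BASE_URL + path,
        headers={"authorization": api_key},
        method="GET",
    )
```

Passing the returned object to `urllib.request.urlopen` and parsing the body with `json.load` would yield the status object described in the next section.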

### Status Response

<ResponseField name="workflow_id" type="string">
  The workflow ID you're querying.
</ResponseField>

<ResponseField name="status" type="string">
  Current workflow status:

  * `RUNNING`: Still processing
  * `COMPLETED`: Finished successfully (may have partial failures)
  * `FAILED`: Workflow encountered an error
  * `TERMINATED`: Manually stopped
  * `TIMED_OUT`: Exceeded maximum execution time
</ResponseField>

<ResponseField name="result" type="object">
  Only present when `status` is `COMPLETED`. Contains the full backfill results.
</ResponseField>

### Result Object (When Completed)

<ResponseField name="result.status" type="string">
  Overall result status:

  * `success`: All extractions succeeded
  * `partial`: Some extractions succeeded, some failed
  * `failed`: All extractions failed
</ResponseField>

<ResponseField name="result.calls" type="array">
  Array of results, one entry per call processed.
</ResponseField>

<ResponseField name="result.calls[].call_id" type="string">
  The call that was analyzed.
</ResponseField>

<ResponseField name="result.calls[].status" type="string">
  Status for this call: `success`, `partial`, or `failed`.
</ResponseField>

<ResponseField name="result.calls[].schemas_processed" type="array">
  Details about each schema extraction for this call.
</ResponseField>

<ResponseField name="result.calls[].variables_extracted_count" type="integer">
  Total number of unique variables extracted across all schemas for this call.
</ResponseField>

<ResponseField name="result.calls[].previous_variables" type="object">
  The call's citation variables **before** this backfill operation.
</ResponseField>

<ResponseField name="result.calls[].newly_extracted_variables" type="object">
  Variables that were extracted or updated by **this backfill** operation.
</ResponseField>

<ResponseField name="result.calls[].final_variables" type="object">
  The call's citation variables **after** this backfill operation (complete merged state).
</ResponseField>

<ResponseField name="result.summary" type="object">
  Aggregate statistics across all calls and schemas:

  * `total_calls`: Number of calls processed
  * `successful_calls`: Calls where all schemas succeeded
  * `partial_calls`: Calls where some schemas failed
  * `failed_calls`: Calls where all schemas failed
  * `total_schemas_processed`: Total schema extractions attempted
  * `successful_schema_extractions`: Schema extractions that succeeded
  * `failed_schema_extractions`: Schema extractions that failed
</ResponseField>

***

## Usage Flow

1. **Send the backfill request** and receive a `workflow_id`
2. **Poll the status endpoint** periodically (every 5-10 seconds)
3. **When status is `COMPLETED`**, retrieve results from the `result` field
4. **Review the results** to see what was extracted for each call
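The flow above can be sketched as a small polling helper. This is a minimal sketch under stated assumptions: `wait_for_backfill` and `fetch_status` are hypothetical names, `fetch_status` is a callable you supply that performs the GET against the status endpoint and returns the parsed JSON body, and the interval and timeout defaults are illustrative rather than documented limits.

```python
import time
from typing import Any, Callable, Dict

# Statuses after which the documented workflow does not change again.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "TERMINATED", "TIMED_OUT"}


def wait_for_backfill(
    workflow_id: str,
    fetch_status: Callable[[str], Dict[str, Any]],
    interval_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the workflow reaches a terminal status, then return the body.

    `fetch_status` is a hypothetical caller-supplied function that requests
    the status endpoint for `workflow_id` and returns the parsed JSON.
    """
    waited = 0.0
    while True:
        body = fetch_status(workflow_id)
        if body.get("status") in TERMINAL_STATUSES:
            return body
        if waited >= timeout_seconds:
            raise TimeoutError(f"{workflow_id} still running after {waited:.0f}s")
        sleep(interval_seconds)
        waited += interval_seconds
```

The `sleep` parameter exists only so the loop can be exercised without real delays; in normal use the default `time.sleep` applies. The `result` field should be read only when the returned status is `COMPLETED`.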

***

## Examples

### Single Call, Single Schema

```bash Request theme={null}
curl -X POST https://api.bland.ai/v1/citation_schemas/backfill \
  -H "authorization: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "call_id": "call-123",
    "schema_id": "schema-abc"
  }'
```

```json Response (202 Accepted) theme={null}
{
  "status": "processing",
  "workflow_id": "backfill:call-123:1760635265843",
  "call_count": 1,
  "schema_count": 1,
  "total_combinations": 1,
  "message": "Citation backfill workflow started for 1 call(s) with 1 schema(s)..."
}
```

### Multiple Calls, Multiple Schemas

```bash Request theme={null}
curl -X POST https://api.bland.ai/v1/citation_schemas/backfill \
  -H "authorization: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "call_ids": ["call-123", "call-456", "call-789"],
    "schema_ids": ["schema-abc", "schema-def"]
  }'
```

```json Response (202 Accepted) theme={null}
{
  "status": "processing",
  "workflow_id": "backfill:batch:671b7623-f9ac-46db-95f1-9b504fe59a06:1760635265843",
  "call_count": 3,
  "schema_count": 2,
  "total_combinations": 6,
  "message": "Citation backfill workflow started for 3 call(s) with 2 schema(s)..."
}
```

### SMS Conversations

```bash Request theme={null}
curl -X POST https://api.bland.ai/v1/citation_schemas/backfill \
  -H "authorization: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "conversation_ids": ["conv-123", "conv-456"],
    "schema_id": "schema-abc"
  }'
```

```json Response (200 OK) theme={null}
{
  "status": "complete",
  "call_count": 0,
  "conversation_count": 2,
  "schema_count": 1,
  "conversations": [
    {
      "conversation_id": "conv-123",
      "status": "success",
      "variables_extracted": 2,
      "previous_variables": {
        "customer_name": "Jane"
      },
      "newly_extracted_variables": {
        "intent": "support",
        "resolved": true
      },
      "final_variables": {
        "customer_name": "Jane",
        "intent": "support",
        "resolved": true
      }
    },
    {
      "conversation_id": "conv-456",
      "status": "success",
      "variables_extracted": 2,
      "previous_variables": {},
      "newly_extracted_variables": {
        "intent": "billing",
        "resolved": false
      },
      "final_variables": {
        "intent": "billing",
        "resolved": false
      }
    }
  ]
}
```
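Because SMS results arrive in the same response, they can be sorted by outcome right away. The helper below is a sketch with a hypothetical name (`group_conversations_by_status`); it assumes only the `conversations[].conversation_id` and `conversations[].status` fields documented above.

```python
from typing import Any, Dict, List


def group_conversations_by_status(response_body: Dict[str, Any]) -> Dict[str, List[str]]:
    """Group conversation IDs by their per-conversation backfill status."""
    groups: Dict[str, List[str]] = {"success": [], "skipped": [], "error": []}
    for entry in response_body.get("conversations", []):
        # setdefault keeps the helper working if an undocumented status appears.
        groups.setdefault(entry["status"], []).append(entry["conversation_id"])
    return groups
```

For the response shown above, both `conv-123` and `conv-456` would land in the `success` group.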

### Single Call with Inline Schema

```bash Request theme={null}
curl -X POST https://api.bland.ai/v1/citation_schemas/backfill \
  -H "authorization: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "call_id": "call-123",
    "schema": {
      "name": "Customer Sentiment Analysis",
      "variables": [
        {
          "name": "customer_satisfied",
          "type": "boolean",
          "description": "Whether the customer expressed satisfaction"
        },
        {
          "name": "sentiment_score",
          "type": "number",
          "description": "Overall sentiment from -1 (negative) to 1 (positive)"
        }
      ]
    }
  }'
```

### Checking Status

```bash Request theme={null}
curl https://api.bland.ai/v1/citation_schemas/backfill/status/backfill:call-123:1760635265843 \
  -H "authorization: YOUR_API_KEY"
```

```json Response (Status: RUNNING) theme={null}
{
  "workflow_id": "backfill:call-123:1760635265843",
  "status": "RUNNING",
  "run_id": "0199ee0a-2343-7d98-8fbb-8d5c2d46ed4c",
  "start_time": "2025-10-16T17:21:05.859Z"
}
```

```json Response (Status: COMPLETED) theme={null}
{
  "workflow_id": "backfill:call-123:1760635265843",
  "status": "COMPLETED",
  "run_id": "0199ee0a-2343-7d98-8fbb-8d5c2d46ed4c",
  "start_time": "2025-10-16T17:21:05.859Z",
  "close_time": "2025-10-16T17:21:39.019Z",
  "result": {
    "status": "success",
    "calls": [
      {
        "call_id": "call-123",
        "status": "success",
        "schemas_processed": [
          {
            "schema_id": "schema-abc",
            "status": "success",
            "variables_extracted": 5
          }
        ],
        "variables_extracted_count": 5,
        "previous_variables": {
          "customer_name": "John Doe"
        },
        "newly_extracted_variables": {
          "order_id": "ORD-789456",
          "issue_type": "shipping",
          "products": [
            {
              "name": "Wireless Headphones",
              "issue": "damaged packaging",
              "quantity": 1
            }
          ],
          "customer_satisfied": true,
          "resolution": "refund_issued"
        },
        "final_variables": {
          "customer_name": "John Doe",
          "order_id": "ORD-789456",
          "issue_type": "shipping",
          "products": [
            {
              "name": "Wireless Headphones",
              "issue": "damaged packaging",
              "quantity": 1
            }
          ],
          "customer_satisfied": true,
          "resolution": "refund_issued"
        }
      }
    ],
    "summary": {
      "total_calls": 1,
      "successful_calls": 1,
      "partial_calls": 0,
      "failed_calls": 0,
      "total_schemas_processed": 1,
      "successful_schema_extractions": 1,
      "failed_schema_extractions": 0
    }
  }
}
```

***

## Error Responses

### 400 Bad Request

Returned when:

* No schema parameter provided (`schema_id`, `schema_ids`, or `schema` required)
* No identifier parameter provided (`call_id`, `call_ids`, `conversation_id`, `conversation_ids`, or `recording_url` required)
* Invalid parameter combinations (e.g., `schema_ids` with inline `schema`)
* Empty arrays provided
* More than 50 SMS conversations in a single request
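Most of these conditions can be caught before the request is sent. The following is a client-side sketch, not the server's actual validation logic: `validate_backfill_body` is a hypothetical helper, and it checks only the rules listed above.

```python
from typing import Any, Dict, List

SCHEMA_KEYS = ("schema_id", "schema_ids", "schema")
IDENTIFIER_KEYS = (
    "call_id", "call_ids", "conversation_id", "conversation_ids", "recording_url",
)
MAX_CONVERSATIONS = 50


def validate_backfill_body(body: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means no listed rule is violated."""
    problems = []
    if not any(key in body for key in SCHEMA_KEYS):
        problems.append("one of schema_id, schema_ids, or schema is required")
    if not any(key in body for key in IDENTIFIER_KEYS):
        problems.append("a call, conversation, or recording identifier is required")
    if "schema_ids" in body and "schema" in body:
        problems.append("schema_ids cannot be combined with an inline schema")
    for key in ("schema_ids", "call_ids", "conversation_ids"):
        if key in body and len(body[key]) == 0:
            problems.append(f"{key} must not be empty")
    if len(body.get("conversation_ids", [])) > MAX_CONVERSATIONS:
        problems.append("at most 50 conversation_ids per request")
    return problems
```

The server remains the source of truth; a check like this only avoids a round trip for obviously malformed bodies.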

### 403 Forbidden

Returned when:

* Account does not have enterprise features enabled
* Attempting to access another organization's calls, conversations, or schemas

### 404 Not Found

Returned when:

* Specified call ID(s) not found
* Specified schema ID(s) not found
* Resources don't belong to your organization
* None of the provided `schema_id`/`schema_ids` can be resolved (error code: `NO_RESOLVABLE_SCHEMAS`)

### 500 Internal Server Error

Returned when an unexpected error occurs while starting the workflow.

***

## Best Practices

### Polling Strategy

* Poll every **5-10 seconds** for single-call operations
* Poll every **10-30 seconds** for batch operations
* Implement exponential backoff if polling for extended periods
* Stop polling once status is `COMPLETED`, `FAILED`, `TERMINATED`, or `TIMED_OUT`
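One way to implement the backoff is a generator of wait times. This is a sketch: the function name is hypothetical, the doubling factor is an assumption, and the 5-second start and 30-second cap are taken from the intervals suggested above.

```python
from itertools import islice
from typing import Iterator


def backoff_intervals(
    initial: float = 5.0, factor: float = 2.0, cap: float = 30.0
) -> Iterator[float]:
    """Yield wait times that grow geometrically and then stay at `cap`."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


# First five waits with the defaults: 5, 10, 20, 30, 30 seconds.
print(list(islice(backoff_intervals(), 5)))
```

Capping the delay keeps long-running batch workflows from being checked too rarely while still reducing request volume.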

### Batch Processing

* Process up to **100 calls** at once for optimal performance
* For larger batches, split into multiple backfill requests
* Use multiple schemas in a single request rather than separate requests per schema
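Splitting a large set of call IDs into request bodies might look like the sketch below. `backfill_batches` is a hypothetical helper; the default batch size of 100 follows the recommendation above, and each body carries every schema so that schemas are not split across requests.

```python
from typing import Dict, Iterator, List


def backfill_batches(
    call_ids: List[str], schema_ids: List[str], batch_size: int = 100
) -> Iterator[Dict[str, List[str]]]:
    """Yield one request body per batch of at most `batch_size` calls."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(call_ids), batch_size):
        yield {
            "call_ids": call_ids[start:start + batch_size],
            "schema_ids": schema_ids,
        }
```

Each yielded body can be sent as its own backfill request, and each request returns its own `workflow_id` to poll.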

### Error Handling

* Check the `result.calls[].status` field for per-call results
* Review the `schemas_processed` array to see which schemas succeeded or failed
* Use the `error_summary` field for quick diagnosis of partial failures
* Failed extractions don't prevent successful ones from being stored
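The per-call checks above can be sketched as a helper that walks a completed status response. The name `failed_extractions` is hypothetical, and treating any `schemas_processed[].status` other than `"success"` as a failure is an assumption, since only the success value appears in the examples on this page.

```python
from typing import Any, Dict, List, Tuple


def failed_extractions(status_body: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (call_id, schema_id) pairs whose extraction did not succeed."""
    failures = []
    result = status_body.get("result") or {}
    for call in result.get("calls", []):
        for schema in call.get("schemas_processed", []):
            # Assumption: any status other than "success" counts as a failure.
            if schema.get("status") != "success":
                failures.append((call["call_id"], schema.get("schema_id", "")))
    return failures
```

The returned pairs could drive a follow-up backfill request that retries only the combinations that failed.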

### Variable Management

* **Newly extracted variables** overwrite previous values with the same name
* Variables from different schemas are merged together
* Only variables defined in the backfilled schema are updated
* Other existing variables remain unchanged
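These rules match the relationship between `previous_variables`, `newly_extracted_variables`, and `final_variables` in the responses. A minimal sketch of that relationship follows; `merge_variables` is a hypothetical name, and the shallow merge (nested values replaced as a whole) is an assumption rather than documented behavior.

```python
from typing import Any, Dict


def merge_variables(
    previous: Dict[str, Any], newly_extracted: Dict[str, Any]
) -> Dict[str, Any]:
    """Combine variables so that newly extracted values win on name clashes."""
    # Assumption: the merge is shallow, so nested objects are replaced whole.
    return {**previous, **newly_extracted}


previous = {"customer_name": "Jane", "intent": "sales"}
newly_extracted = {"intent": "support", "resolved": True}
print(merge_variables(previous, newly_extracted))
# {'customer_name': 'Jane', 'intent': 'support', 'resolved': True}
```

Here `customer_name` is untouched because the backfilled schema does not define it, while `intent` is overwritten by the new extraction.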

