Skip to main content

[BETA] Batches API

Covers Batches, Files

Supported Providers:

  • Azure OpenAI
  • OpenAI

Quick Start

  • Create File for Batch Completion

  • Create Batch Request

  • List Batches

  • Retrieve the Specific Batch and File Content

$ export OPENAI_API_KEY="sk-..."

$ litellm

# RUNNING on http://0.0.0.0:4000

Create File for Batch Completion

curl http://localhost:4000/v1/files \
-H "Authorization: Bearer sk-1234" \
-F purpose="batch" \
-F file="@mydata.jsonl"

Create Batch Request

curl http://localhost:4000/v1/batches \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"input_file_id": "file-abc123",
"endpoint": "/v1/chat/completions",
"completion_window": "24h"
}'

Retrieve the Specific Batch

curl http://localhost:4000/v1/batches/batch_abc123 \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \

List Batches

curl http://localhost:4000/v1/batches \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \

👉 Proxy API Reference

Azure Batches API

Just add the azure env vars to your environment.

export AZURE_API_KEY=""
export AZURE_API_BASE=""

AND use /azure/* for the Batches API calls

http://0.0.0.0:4000/azure/v1/batches

Usage

Setup

  • Add Azure API Keys to your environment

1. Upload a File

curl http://localhost:4000/azure/v1/files \
-H "Authorization: Bearer sk-1234" \
-F purpose="batch" \
-F file="@mydata.jsonl"

Example File

Note: model should be your azure deployment name.

{"custom_id": "task-0", "method": "POST", "url": "/chat/completions", "body": {"model": "REPLACE-WITH-MODEL-DEPLOYMENT-NAME", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "When was Microsoft founded?"}]}}
{"custom_id": "task-1", "method": "POST", "url": "/chat/completions", "body": {"model": "REPLACE-WITH-MODEL-DEPLOYMENT-NAME", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "When was the first XBOX released?"}]}}
{"custom_id": "task-2", "method": "POST", "url": "/chat/completions", "body": {"model": "REPLACE-WITH-MODEL-DEPLOYMENT-NAME", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "What is Altair Basic?"}]}}

2. Create a batch

curl http://0.0.0.0:4000/azure/v1/batches \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"input_file_id": "file-abc123",
"endpoint": "/v1/chat/completions",
"completion_window": "24h"
}'

3. Retrieve batch

curl http://0.0.0.0:4000/azure/v1/batches/batch_abc123 \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-H "Content-Type: application/json" \

4. Cancel batch

curl http://0.0.0.0:4000/azure/v1/batches/batch_abc123/cancel \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-H "Content-Type: application/json" \
-X POST

5. List Batch

curl http://0.0.0.0:4000/v1/batches?limit=2 \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-H "Content-Type: application/json"

👉 Health Check Azure Batch models

[BETA] Loadbalance Multiple Azure Deployments

In your config.yaml, set enable_loadbalancing_on_batch_endpoints: true

model_list:
- model_name: "batch-gpt-4o-mini"
litellm_params:
model: "azure/gpt-4o-mini"
api_key: os.environ/AZURE_API_KEY
api_base: os.environ/AZURE_API_BASE
model_info:
mode: batch

litellm_settings:
enable_loadbalancing_on_batch_endpoints: true # 👈 KEY CHANGE

Note: This works on {PROXY_BASE_URL}/v1/files and {PROXY_BASE_URL}/v1/batches. Note: Response is in the OpenAI-format.

  1. Upload a file

Just set model: batch-gpt-4o-mini in your .jsonl.

curl http://localhost:4000/v1/files \
-H "Authorization: Bearer sk-1234" \
-F purpose="batch" \
-F file="@mydata.jsonl"

Example File

Note: model should be your azure deployment name.

{"custom_id": "task-0", "method": "POST", "url": "/chat/completions", "body": {"model": "batch-gpt-4o-mini", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "When was Microsoft founded?"}]}}
{"custom_id": "task-1", "method": "POST", "url": "/chat/completions", "body": {"model": "batch-gpt-4o-mini", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "When was the first XBOX released?"}]}}
{"custom_id": "task-2", "method": "POST", "url": "/chat/completions", "body": {"model": "batch-gpt-4o-mini", "messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "What is Altair Basic?"}]}}

Expected Response (OpenAI-compatible)

{"id":"file-f0be81f654454113a922da60acb0eea6",...}
  1. Create a batch
curl http://0.0.0.0:4000/v1/batches \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"input_file_id": "file-f0be81f654454113a922da60acb0eea6",
"endpoint": "/v1/chat/completions",
"completion_window": "24h",
"model: "batch-gpt-4o-mini"
}'

Expected Response:

{"id":"batch_94e43f0a-d805-477d-adf9-bbb9c50910ed",...}
  1. Retrieve a batch
curl http://0.0.0.0:4000/v1/batches/batch_94e43f0a-d805-477d-adf9-bbb9c50910ed \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-H "Content-Type: application/json" \

Expected Response:

{"id":"batch_94e43f0a-d805-477d-adf9-bbb9c50910ed",...}
  1. List batch
curl http://0.0.0.0:4000/v1/batches?limit=2 \
-H "Authorization: Bearer $LITELLM_API_KEY" \
-H "Content-Type: application/json"

Expected Response:

{"data":[{"id":"batch_R3V...}