Connect Any LLM to Bytebot with LiteLLM
LiteLLM acts as a unified proxy that lets you use 100+ LLM providers with Bytebot, including Azure OpenAI, AWS Bedrock, Anthropic, Hugging Face, Ollama, and more. This guide shows you how to set up LiteLLM with Bytebot.
Why Use LiteLLM?
- 100+ LLM Providers: Use Azure, AWS, GCP, Anthropic, OpenAI, Cohere, and local models
- Cost Tracking: Monitor spending across all providers in one place
- Load Balancing: Distribute requests across multiple models and providers
- Fallback Models: Automatic failover when primary models are unavailable
Quick Start with Bytebot’s Built-in LiteLLM Proxy
Bytebot includes a pre-configured LiteLLM proxy service that makes it easy to use any LLM provider. Here’s how to set it up:
1. Use Docker Compose with Proxy
The easiest way is to use the proxy-enabled Docker Compose file (see the example command after the list below). This automatically:
- Starts the bytebot-llm-proxy service on port 4000
- Configures the agent to use the proxy via BYTEBOT_LLM_PROXY_URL
- Makes all configured models available through the proxy
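Assuming the repository ships a proxy-enabled Compose file (the path below is an assumption; check your checkout for the exact filename), starting the stack looks roughly like this:

```bash
# Start Bytebot together with the bundled LiteLLM proxy service
# (compose file path is an assumption -- use the proxy compose file from your checkout)
docker compose -f docker/docker-compose.proxy.yml up -d --build
```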
2. Customize Model Configuration
To add custom models or providers, edit the LiteLLM config and rebuild the proxy (example below):
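As a sketch of what such an edit might look like (the model name and API key variable are illustrative, not part of the default config), you could add an entry to the model_list in packages/bytebot-llm-proxy/litellm-config.yaml:

```yaml
# packages/bytebot-llm-proxy/litellm-config.yaml (excerpt)
# Adds an OpenAI model; the name and env var here are examples
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
```

Then rebuild the proxy service, for example with docker compose up -d --build bytebot-llm-proxy (add the -f flag if you started from a separate proxy compose file).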
3. Verify Models are Available
The Bytebot agent automatically queries the proxy for available models; you can check the same endpoint manually (example below). The UI will show all available models in the model selector.
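This sketch assumes the default port 4000 and that you configured a LITELLM_MASTER_KEY for the proxy:

```bash
# List the models the proxy advertises to the agent
curl -s http://localhost:4000/model/info \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY"
```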
How It Works
Architecture
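In simplified form, the request flow looks like this:

```
Bytebot Agent ──(BYTEBOT_LLM_PROXY_URL)──▶ bytebot-llm-proxy (LiteLLM, port 4000) ──▶ Anthropic / OpenAI / Google / other providers
```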
Key Components
- bytebot-llm-proxy Service: A LiteLLM instance running in Docker that:
  - Runs on port 4000 within the Bytebot network
  - Uses the config from packages/bytebot-llm-proxy/litellm-config.yaml
  - Inherits API keys from environment variables
- Agent Integration: The Bytebot agent:
  - Checks for the BYTEBOT_LLM_PROXY_URL environment variable
  - If set, queries the proxy at /model/info for available models
  - Routes all LLM requests through the proxy
- Pre-configured Models: Out-of-the-box support for:
  - Anthropic: Claude Opus 4, Claude Sonnet 4
  - OpenAI: GPT-4.1, GPT-4o
  - Google: Gemini 2.5 Pro, Gemini 2.5 Flash
Provider Configurations
Azure OpenAI
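A minimal sketch of an Azure OpenAI entry for litellm-config.yaml; the resource name, deployment name, and API version are placeholders you must replace:

```yaml
model_list:
  - model_name: azure-gpt-4o
    litellm_params:
      model: azure/my-gpt-4o-deployment        # your Azure deployment name
      api_base: https://my-resource.openai.azure.com/
      api_version: "2024-02-15-preview"
      api_key: os.environ/AZURE_API_KEY        # read from the container's environment
```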
AWS Bedrock
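A sketch for AWS Bedrock; the model ID and region are examples, and credentials can also come from an attached IAM role instead of environment variables:

```yaml
model_list:
  - model_name: bedrock-claude-sonnet
    litellm_params:
      model: bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0
      aws_region_name: us-east-1
      aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID
      aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY
```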
Google Vertex AI
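A sketch for Google Vertex AI; project and location are placeholders, and the proxy container needs Google application credentials available (for example via a mounted service-account key):

```yaml
model_list:
  - model_name: vertex-gemini-pro
    litellm_params:
      model: vertex_ai/gemini-2.5-pro
      vertex_project: my-gcp-project
      vertex_location: us-central1
```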
Local Models (Ollama)
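A sketch for a local model served by Ollama; the model name is an example and api_base assumes Ollama is running on the Docker host:

```yaml
model_list:
  - model_name: local-llama
    litellm_params:
      model: ollama/llama3
      api_base: http://host.docker.internal:11434   # adjust to wherever Ollama is reachable
```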
Hugging Face
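A sketch for Hugging Face Inference; the model ID is an example and the token variable is an assumption:

```yaml
model_list:
  - model_name: hf-llama
    litellm_params:
      model: huggingface/meta-llama/Meta-Llama-3-8B-Instruct
      api_key: os.environ/HUGGINGFACE_API_KEY
```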
Advanced Features
Load Balancing
Distribute requests across multiple providers:
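In LiteLLM this is done by giving several deployments the same model_name; the entries below are illustrative:

```yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: azure/my-gpt-4o-deployment
      api_base: https://my-resource.openai.azure.com/
      api_key: os.environ/AZURE_API_KEY
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

router_settings:
  routing_strategy: simple-shuffle   # spread requests across the matching deployments
```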
Fallback Models
Configure automatic failover:
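A sketch assuming your model_list contains entries named claude-sonnet-4 and gpt-4o:

```yaml
router_settings:
  num_retries: 2
  # If the primary model errors, retry the request on the fallback model
  fallbacks:
    - {"claude-sonnet-4": ["gpt-4o"]}
```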
Cost Controls
Set spending limits and track usage:
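A sketch of a global budget; persisted spend tracking needs a database the proxy can write to, and the variable names are assumptions:

```yaml
general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY
  database_url: os.environ/DATABASE_URL   # required for persisted spend tracking

litellm_settings:
  max_budget: 100        # total spend limit in USD
  budget_duration: 30d   # budget window before the counter resets
```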
Rate Limiting
Prevent API overuse:
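LiteLLM can cap requests and tokens per deployment; the numbers below are examples:

```yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
      rpm: 60        # requests per minute for this deployment
      tpm: 100000    # tokens per minute for this deployment
```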
Alternative Setup: External LiteLLM Proxy
If you prefer to run LiteLLM separately or have an existing LiteLLM deployment:
Option 1: Modify docker-compose.yml
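A sketch of the change; the agent service name is an assumption, so match it to your compose file:

```yaml
# docker-compose.yml (excerpt)
services:
  bytebot-agent:
    environment:
      - BYTEBOT_LLM_PROXY_URL=http://your-litellm-host:4000
```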
Option 2: Use Environment Variable
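For example, export the variable before starting the stack:

```bash
export BYTEBOT_LLM_PROXY_URL=http://your-litellm-host:4000
docker compose up -d
```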
Option 3: Run Standalone LiteLLM
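A sketch using LiteLLM's published Docker image; check the LiteLLM docs for the current image tag and flags:

```bash
# Run LiteLLM in its own container with your config mounted
docker run -d --name litellm -p 4000:4000 \
  -v $(pwd)/litellm-config.yaml:/app/config.yaml \
  -e OPENAI_API_KEY \
  -e ANTHROPIC_API_KEY \
  ghcr.io/berriai/litellm:main-latest \
  --config /app/config.yaml --port 4000
```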
Kubernetes Setup
Deploy with Helm:
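One option (chart location and values keys should be checked against the LiteLLM documentation) is to install LiteLLM's published Helm chart and then point the Bytebot agent at the resulting service via BYTEBOT_LLM_PROXY_URL:

```bash
helm install bytebot-llm-proxy oci://ghcr.io/berriai/litellm-helm \
  --namespace bytebot --create-namespace
```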
Monitoring & Debugging
LiteLLM Dashboard
Access metrics and logs:
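A quick health check against the bundled proxy; LiteLLM also serves an admin UI (spend, logs, keys) at /ui when a database is configured:

```bash
curl http://localhost:4000/health \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY"
```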
Debug Requests
Enable detailed logging:
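One way to get verbose logs is via litellm_settings (avoid leaving this on in production); the proxy also accepts a --detailed_debug flag when started directly:

```yaml
litellm_settings:
  set_verbose: True   # log full request/response details
```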
Common Issues
Model not found
Check that the model name matches the configuration exactly:
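For example, list the model names the proxy exposes and compare them with what the agent is requesting:

```bash
curl -s http://localhost:4000/v1/models \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY"
```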
Authentication errors
Verify the master key matches in both LiteLLM and Bytebot:
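A quick way to test the key against the proxy (200 means the key is accepted, 401 means the keys don't match):

```bash
curl -s -o /dev/null -w "%{http_code}\n" \
  http://localhost:4000/v1/models \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY"
```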
Slow responses
Check latency per provider:
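For example, time a single completion through the proxy; the model name is one of the pre-configured examples above:

```bash
time curl -s http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "ping"}]}' \
  > /dev/null
```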
Best Practices
Model Selection for Bytebot
Choose models with strong vision capabilities for best results.
Recommended:
- Claude 3.5 Sonnet (Best overall)
- GPT-4o (Good vision + reasoning)
- Gemini 1.5 Pro (Large context)
Budget options and local models (for example via the Ollama configuration above) can also be used through LiteLLM.
Performance Optimization
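One common optimization with LiteLLM is response caching, which avoids re-sending identical requests to the provider; the sketch below assumes a Redis instance is reachable from the proxy:

```yaml
litellm_settings:
  cache: True
  cache_params:
    type: redis
    host: redis    # hostname of a Redis instance on the same network
    port: 6379
```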
Security
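At a minimum, require a master key on the proxy, keep provider keys in environment variables rather than in the config file, and leave port 4000 on the internal Docker network instead of exposing it publicly. A minimal sketch:

```yaml
general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY   # every request to the proxy must present this key
```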
Next Steps
- Supported Models: Full list of 100+ providers
- LiteLLM Proxy Docs: Official LiteLLM proxy server documentation
- LiteLLM Docs: Complete LiteLLM documentation
Pro tip: Start with a single provider, then add more as needed. LiteLLM makes it easy to switch or combine models without changing Bytebot configuration.