> ## Documentation Index > Fetch the complete documentation index at: https://docs.bytebot.ai/llms.txt > Use this file to discover all available pages before exploring further. # LiteLLM Integration > Use any LLM provider with Bytebot through LiteLLM proxy # Connect Any LLM to Bytebot with LiteLLM LiteLLM acts as a unified proxy that lets you use 100+ LLM providers with Bytebot - including Azure OpenAI, AWS Bedrock, Anthropic, Hugging Face, Ollama, and more. This guide shows you how to set up LiteLLM with Bytebot. ## Why Use LiteLLM? Use Azure, AWS, GCP, Anthropic, OpenAI, Cohere, and local models Monitor spending across all providers in one place Distribute requests across multiple models and providers Automatic failover when primary models are unavailable ## Quick Start with Bytebot's Built-in LiteLLM Proxy Bytebot includes a pre-configured LiteLLM proxy service that makes it easy to use any LLM provider. Here's how to set it up: The easiest way is to use the proxy-enabled Docker Compose file: ```bash theme={null} # Clone Bytebot git clone https://github.com/bytebot-ai/bytebot.git cd bytebot # Set up your API keys in docker/.env cat > docker/.env << EOF # Add any combination of these keys ANTHROPIC_API_KEY=sk-ant-your-key-here OPENAI_API_KEY=sk-your-key-here GEMINI_API_KEY=your-key-here EOF # Start Bytebot with LiteLLM proxy docker-compose -f docker/docker-compose.proxy.yml up -d ``` This automatically: * Starts the `bytebot-llm-proxy` service on port 4000 * Configures the agent to use the proxy via `BYTEBOT_LLM_PROXY_URL` * Makes all configured models available through the proxy To add custom models or providers, edit the LiteLLM config: ```yaml theme={null} # packages/bytebot-llm-proxy/litellm-config.yaml model_list: # Add Azure OpenAI - model_name: azure-gpt-4o litellm_params: model: azure/gpt-4o-deployment api_base: https://your-resource.openai.azure.com/ api_key: os.environ/AZURE_API_KEY api_version: "2024-02-15-preview" # Add AWS Bedrock - model_name: claude-bedrock litellm_params: model: bedrock/anthropic.claude-3-5-sonnet aws_region_name: us-east-1 # Add local models via Ollama - model_name: local-llama litellm_params: model: ollama/llama3:70b api_base: http://host.docker.internal:11434 ``` Then rebuild: ```bash theme={null} docker-compose -f docker/docker-compose.proxy.yml up -d --build ``` The Bytebot agent automatically queries the proxy for available models: ```bash theme={null} # Check available models through Bytebot API curl http://localhost:9991/tasks/models # Or directly from LiteLLM proxy curl http://localhost:4000/model/info ``` The UI will show all available models in the model selector. ## How It Works ### Architecture ```mermaid theme={null} graph LR A[Bytebot UI] -->|Select Model| B[Bytebot Agent] B -->|BYTEBOT_LLM_PROXY_URL| C[LiteLLM Proxy :4000] C -->|Route Request| D[Anthropic API] C -->|Route Request| E[OpenAI API] C -->|Route Request| F[Google API] C -->|Route Request| G[Any Provider] ``` ### Key Components 1. **bytebot-llm-proxy Service**: A LiteLLM instance running in Docker that: * Runs on port 4000 within the Bytebot network * Uses the config from `packages/bytebot-llm-proxy/litellm-config.yaml` * Inherits API keys from environment variables 2. **Agent Integration**: The Bytebot agent: * Checks for `BYTEBOT_LLM_PROXY_URL` environment variable * If set, queries the proxy at `/model/info` for available models * Routes all LLM requests through the proxy 3. **Pre-configured Models**: Out of the box support for: * Anthropic: Claude Opus 4, Claude Sonnet 4 * OpenAI: GPT-4.1, GPT-4o * Google: Gemini 2.5 Pro, Gemini 2.5 Flash ## Provider Configurations ### Azure OpenAI ```yaml theme={null} model_list: - model_name: azure-gpt-4o litellm_params: model: azure/gpt-4o-deployment-name api_base: https://your-resource.openai.azure.com/ api_key: your-azure-key api_version: "2024-02-15-preview" - model_name: azure-gpt-4o-vision litellm_params: model: azure/gpt-4o-deployment-name api_base: https://your-resource.openai.azure.com/ api_key: your-azure-key api_version: "2024-02-15-preview" supports_vision: true ``` ### AWS Bedrock ```yaml theme={null} model_list: - model_name: claude-bedrock litellm_params: model: bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0 aws_region_name: us-east-1 # Uses AWS credentials from environment - model_name: llama-bedrock litellm_params: model: bedrock/meta.llama3-70b-instruct-v1:0 aws_region_name: us-east-1 ``` ### Google Vertex AI ```yaml theme={null} model_list: - model_name: gemini-vertex litellm_params: model: vertex_ai/gemini-1.5-pro vertex_project: your-gcp-project vertex_location: us-central1 # Uses GCP credentials from environment ``` ### Local Models (Ollama) ```yaml theme={null} model_list: - model_name: local-llama litellm_params: model: ollama/llama3:70b api_base: http://ollama:11434 - model_name: local-mixtral litellm_params: model: ollama/mixtral:8x7b api_base: http://ollama:11434 ``` ### Hugging Face ```yaml theme={null} model_list: - model_name: hf-llama litellm_params: model: huggingface/meta-llama/Llama-3-70b-chat-hf api_key: hf_your_token ``` ## Advanced Features ### Load Balancing Distribute requests across multiple providers: ```yaml theme={null} model_list: - model_name: gpt-4o litellm_params: model: gpt-4o api_key: sk-openai-key - model_name: gpt-4o # Same name for load balancing litellm_params: model: azure/gpt-4o api_base: https://azure.openai.azure.com/ api_key: azure-key router_settings: routing_strategy: "least-busy" # or "round-robin", "latency-based" ``` ### Fallback Models Configure automatic failover: ```yaml theme={null} model_list: - model_name: primary-model litellm_params: model: claude-3-5-sonnet-20241022 api_key: sk-ant-key - model_name: fallback-model litellm_params: model: gpt-4o api_key: sk-openai-key router_settings: model_group_alias: "smart-model": ["primary-model", "fallback-model"] # Use "smart-model" in Bytebot config ``` ### Cost Controls Set spending limits and track usage: ```yaml theme={null} general_settings: master_key: sk-litellm-master database_url: "postgresql://user:pass@localhost:5432/litellm" # Budget limits max_budget: 100 # $100 monthly limit budget_duration: "30d" # Per-model limits model_max_budget: gpt-4o: 50 claude-3-5-sonnet: 50 litellm_settings: callbacks: ["langfuse"] # For detailed tracking ``` ### Rate Limiting Prevent API overuse: ```yaml theme={null} model_list: - model_name: rate-limited-gpt litellm_params: model: gpt-4o api_key: sk-key rpm: 100 # Requests per minute tpm: 100000 # Tokens per minute ``` ## Alternative Setup: External LiteLLM Proxy If you prefer to run LiteLLM separately or have an existing LiteLLM deployment: ### Option 1: Modify docker-compose.yml ```yaml theme={null} # docker-compose.yml (without built-in proxy) services: bytebot-agent: environment: # Point to your external LiteLLM instance - BYTEBOT_LLM_PROXY_URL=http://your-litellm-server:4000 # ... rest of config ``` ### Option 2: Use Environment Variable ```bash theme={null} # Set the proxy URL before starting export BYTEBOT_LLM_PROXY_URL=http://your-litellm-server:4000 # Start normally docker-compose -f docker/docker-compose.yml up -d ``` ### Option 3: Run Standalone LiteLLM ```bash theme={null} # Run your own LiteLLM instance docker run -d \ --name litellm-external \ -p 4000:4000 \ -v $(pwd)/custom-config.yaml:/app/config.yaml \ -e ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY \ ghcr.io/berriai/litellm:main \ --config /app/config.yaml # Then start Bytebot with: export BYTEBOT_LLM_PROXY_URL=http://localhost:4000 docker-compose up -d ``` ## Kubernetes Setup Deploy with Helm: ```yaml theme={null} # litellm-values.yaml replicaCount: 2 image: repository: ghcr.io/berriai/litellm tag: main service: type: ClusterIP port: 4000 config: model_list: - model_name: claude-3-5-sonnet litellm_params: model: claude-3-5-sonnet-20241022 api_key: ${ANTHROPIC_API_KEY} general_settings: master_key: ${LITELLM_MASTER_KEY} # Then in Bytebot values.yaml: agent: openai: enabled: true apiKey: "${LITELLM_MASTER_KEY}" baseUrl: "http://litellm:4000/v1" model: "claude-3-5-sonnet" ``` ## Monitoring & Debugging ### LiteLLM Dashboard Access metrics and logs: ```bash theme={null} # Port forward to dashboard kubectl port-forward svc/litellm 4000:4000 # Access at http://localhost:4000/ui # Login with your master_key ``` ### Debug Requests Enable detailed logging: ```yaml theme={null} litellm_settings: debug: true detailed_debug: true general_settings: master_key: sk-key store_model_in_db: true # Store request history ``` ### Common Issues Check model name matches exactly: ```bash theme={null} curl http://localhost:4000/v1/models \ -H "Authorization: Bearer sk-key" ``` Verify master key in both LiteLLM and Bytebot: ```bash theme={null} # Test LiteLLM curl http://localhost:4000/v1/chat/completions \ -H "Authorization: Bearer sk-key" \ -H "Content-Type: application/json" \ -d '{"model": "your-model", "messages": [{"role": "user", "content": "test"}]}' ``` Check latency per provider: ```yaml theme={null} router_settings: routing_strategy: "latency-based" enable_pre_call_checks: true ``` ## Best Practices ### Model Selection for Bytebot Choose models with strong vision capabilities for best results: * Claude 3.5 Sonnet (Best overall) * GPT-4o (Good vision + reasoning) * Gemini 1.5 Pro (Large context) * Claude 3.5 Haiku (Fast + cheap) * GPT-4o mini (Good balance) * Gemini 1.5 Flash (Very fast) * LLaVA (Vision support) * Qwen-VL (Vision support) * CogVLM (Vision support) ### Performance Optimization ```yaml theme={null} # Optimize for Bytebot workloads router_settings: routing_strategy: "latency-based" cooldown_time: 60 # Seconds before retrying failed provider num_retries: 2 request_timeout: 600 # 10 minutes for complex tasks # Cache for repeated requests cache: true cache_params: type: "redis" host: "redis" port: 6379 ttl: 3600 # 1 hour ``` ### Security ```yaml theme={null} general_settings: master_key: ${LITELLM_MASTER_KEY} # IP allowlist allowed_ips: ["10.0.0.0/8", "172.16.0.0/12"] # Audit logging store_model_in_db: true # Encryption encrypt_keys: true # Headers to forward forward_headers: ["X-Request-ID", "X-User-ID"] ``` ## Next Steps Full list of 100+ providers Official LiteLLM proxy server documentation Complete LiteLLM documentation **Pro tip:** Start with a single provider, then add more as needed. LiteLLM makes it easy to switch or combine models without changing Bytebot configuration.