Architecture

Overview

Bytebot is a self-hosted AI desktop agent built with a modular architecture. It combines a Linux desktop environment with an LLM-driven agent to create an autonomous computer user that performs tasks from natural language instructions.

[Figure: Bytebot architecture diagram]

System Architecture

The system consists of four main components that work together:

1. Bytebot Desktop Container

The foundation of the system - a virtual Linux desktop that provides:

Ubuntu 22.04 LTS base for stability and compatibility
XFCE4 Desktop for a lightweight, responsive UI
bytebotd Daemon - The automation service, built on nut.js, that executes computer actions
Pre-installed Applications: Firefox ESR, Thunderbird, text editors, and development tools
noVNC for remote desktop access

Key Features:

Runs completely isolated from your host system
Consistent environment across different platforms
Can be customized with additional software
Accessible via REST API on port 9990
MCP SSE endpoint available at /mcp
Uses shared types from @bytebot/shared package
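
As a rough illustration of how a client might address the desktop's REST API, the sketch below builds a request descriptor without sending it. Port 9990 comes from the list above; the `/computer-use` path, the action names, and the payload shape are assumptions made for this example and are not confirmed by this page.

```typescript
// Hypothetical sketch: the "/computer-use" path and the payload shape are
// assumptions for illustration, not a documented contract.
type DesktopAction =
  | { action: "screenshot" }
  | { action: "move_mouse"; coordinates: { x: number; y: number } }
  | { action: "type_text"; text: string };

interface DesktopRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build (but do not send) an HTTP request for one computer action.
// The default base URL uses port 9990 as stated in the text.
function buildDesktopRequest(
  action: DesktopAction,
  baseUrl: string = "http://localhost:9990",
): DesktopRequest {
  return {
    url: `${baseUrl}/computer-use`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(action),
    },
  };
}

// With a desktop container running, the descriptor could be passed to fetch():
//   const req = buildDesktopRequest({ action: "screenshot" });
//   const res = await fetch(req.url, req.init);
```

Keeping request construction separate from the network call makes the payload easy to inspect and test without a running container.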

2. AI Agent Service

The brain of the system - orchestrates tasks using an LLM:

NestJS Framework for robust, scalable backend
LLM Integration supporting Anthropic Claude, OpenAI GPT, and Google Gemini models
WebSocket Support for real-time updates
Computer Use API Client to control the desktop
Prisma ORM for database operations
Tool definitions for computer actions (mouse, keyboard, screenshots)
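
To make "tool definitions for computer actions" more concrete, here is a minimal sketch of what one definition and a basic input check could look like. The `ToolDefinition` shape, the `click_mouse` tool, and `validateToolInput` are hypothetical names for illustration; the real definitions in the agent service may differ.

```typescript
// Hypothetical shape of a tool the LLM can call; not taken from the codebase.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string; description: string }>;
    required: string[];
  };
}

// Example tool: click at a screen coordinate.
const clickMouseTool: ToolDefinition = {
  name: "click_mouse",
  description: "Click the left mouse button at the given screen coordinates.",
  inputSchema: {
    type: "object",
    properties: {
      x: { type: "number", description: "Horizontal position in pixels" },
      y: { type: "number", description: "Vertical position in pixels" },
    },
    required: ["x", "y"],
  },
};

// Check model-produced input against the schema before acting on it.
// Returns a list of problems; an empty list means the input is acceptable.
function validateToolInput(
  tool: ToolDefinition,
  input: Record<string, unknown>,
): string[] {
  const errors: string[] = [];
  for (const field of tool.inputSchema.required) {
    if (!(field in input)) errors.push(`missing required field: ${field}`);
  }
  for (const key of Object.keys(input)) {
    const spec = tool.inputSchema.properties[key];
    if (spec === undefined) {
      errors.push(`unknown field: ${key}`);
    } else if (typeof input[key] !== spec.type) {
      errors.push(`field ${key} should be ${spec.type}`);
    }
  }
  return errors;
}
```

Validating tool input before forwarding it to the desktop is one way to keep malformed model output from turning into unintended mouse or keyboard actions.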

Responsibilities:

Interprets natural language requests
Plans sequences of computer actions
Manages task state and progress
Handles errors and retries
Provides real-time task updates via WebSocket
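
One way to picture "manages task state" together with "handles errors and retries" is a small state machine. The sketch below is illustrative only: the status names, the transition table, and the retry policy are assumptions, not the agent service's actual model.

```typescript
// Hypothetical task statuses; the real service may use different ones.
type TaskStatus = "pending" | "running" | "completed" | "failed";

// Which status changes are allowed from each status.
const allowedTransitions: Record<TaskStatus, TaskStatus[]> = {
  pending: ["running", "failed"],
  running: ["completed", "failed", "pending"], // "pending" = requeued for retry
  completed: [],
  failed: [],
};

interface TaskState {
  status: TaskStatus;
  attempts: number;
}

// Return the next status, or throw if the change is not allowed.
function transition(current: TaskStatus, next: TaskStatus): TaskStatus {
  if (allowedTransitions[current].indexOf(next) === -1) {
    throw new Error(`invalid transition: ${current} -> ${next}`);
  }
  return next;
}

// After an error: requeue the task while attempts remain, otherwise fail it.
function recordFailure(state: TaskState, maxAttempts: number): TaskState {
  const attempts = state.attempts + 1;
  const next: TaskStatus = attempts < maxAttempts ? "pending" : "failed";
  return { status: transition(state.status, next), attempts };
}
```

Making terminal statuses explicit (no outgoing transitions) prevents a finished task from being restarted by a late or duplicated event.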

3. Web Task Interface

The user interface for interacting with your AI agent:

Next.js 15 Application with TypeScript for type safety
Embedded VNC Viewer to watch the desktop in action
Task Management UI with status badges
WebSocket Connections for live updates
Reusable components for consistent UI
API utilities for streamlined server communication

Features:

Task creation and management interface
Desktop tab for direct manual control
Real-time desktop viewer with takeover mode
Task history and status tracking
Responsive design for all devices
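
The live updates described above can be modelled on the client as a pure function that folds incoming events into the task list. This is a sketch under assumptions: the event names (`task_created`, `task_updated`) and the `TaskSummary` fields are invented for illustration and do not describe the actual WebSocket messages.

```typescript
// Hypothetical client-side view of a task.
interface TaskSummary {
  id: string;
  description: string;
  status: string;
}

// Hypothetical events pushed by the agent service over WebSocket.
type TaskEvent =
  | { type: "task_created"; task: TaskSummary }
  | { type: "task_updated"; id: string; status: string };

// Apply one event to the current list and return a new list (no mutation),
// which suits React-style state updates in a Next.js UI.
function applyTaskEvent(tasks: TaskSummary[], event: TaskEvent): TaskSummary[] {
  if (event.type === "task_created") {
    return [...tasks, event.task];
  }
  return tasks.map((t) =>
    t.id === event.id ? { ...t, status: event.status } : t,
  );
}
```

A WebSocket message handler would parse each message and call `applyTaskEvent` with the current state, so status badges and task history update without polling.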

4. PostgreSQL Database

Persistent storage for the agent system:

Tasks Table: Stores task details, status, and metadata
Messages Table: Stores AI conversation history
Prisma ORM for type-safe database access
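
As an illustration of how the two tables relate, the sketch below models a task with its conversation history as plain TypeScript types. The field names are assumptions for this example; the actual columns are defined by the project's Prisma schema.

```typescript
// Hypothetical record shapes; real column names may differ.
interface TaskRecord {
  id: string;
  description: string;
  status: string;
  createdAt: Date;
}

interface MessageRecord {
  id: string;
  taskId: string; // assumed foreign key to TaskRecord.id
  role: "user" | "assistant";
  content: string;
  createdAt: Date;
}

// Return one task's conversation in chronological order.
// In the real system this would be a database query rather than an array scan.
function conversationFor(
  taskId: string,
  messages: MessageRecord[],
): MessageRecord[] {
  return messages
    .filter((m) => m.taskId === taskId)
    .sort((a, b) => a.createdAt.getTime() - b.createdAt.getTime());
}
```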

Data Flow

Task Execution Flow

1. User Input: The user describes a task in natural language via the chat UI.

2. Task Creation: The agent service creates a task record and adds it to the processing queue.

3. AI Planning: The LLM analyzes the task and generates a plan of computer actions.

4. Action Execution: The agent sends computer actions to bytebotd via REST API or MCP.

5. Desktop Automation: bytebotd executes the actions (mouse, keyboard, screenshots) on the desktop.

6. Result Processing: The agent receives results, updates the task status, and continues or completes.

7. User Feedback: Results and status updates are sent back to the user in real time.
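
The flow above is essentially a plan, execute, observe loop. The following sketch shows that loop with the LLM and the desktop replaced by injected functions. All names here (`Planner`, `Executor`, `runTask`) are hypothetical, and the loop is synchronous for brevity, whereas real LLM and desktop calls would be asynchronous.

```typescript
type Action =
  | { kind: "screenshot" }
  | { kind: "click"; x: number; y: number }
  | { kind: "type"; text: string };

interface Observation {
  ok: boolean;
  detail: string;
}

// Stand-in for the LLM: picks the next action, or null when the task is done.
type Planner = (task: string, history: Observation[]) => Action | null;
// Stand-in for bytebotd: performs one action and reports the result.
type Executor = (action: Action) => Observation;

interface RunResult {
  status: "completed" | "failed";
  steps: number;
}

// Alternate planning and execution until the planner finishes, an action
// fails, or the step budget runs out.
function runTask(
  task: string,
  plan: Planner,
  execute: Executor,
  maxSteps: number = 20,
): RunResult {
  const history: Observation[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const action = plan(task, history);
    if (action === null) return { status: "completed", steps: step };
    const result = execute(action);
    history.push(result);
    if (!result.ok) return { status: "failed", steps: step + 1 };
  }
  return { status: "failed", steps: maxSteps };
}
```

The step budget is a design choice in this sketch: it bounds cost and prevents a confused planner from looping forever.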

Communication Protocols

Security Architecture

Isolation Layers

Container Isolation
- Each desktop runs in its own Docker container
- No access to host filesystem by default
- Network isolation with explicit port mapping

Process Isolation
- bytebotd runs as non-root user
- Separate processes for different services
- Resource limits enforced by Docker

Network Security
- Services only accessible from localhost by default
- Can be configured with authentication
- HTTPS/WSS for external connections

API Security

Desktop API: No authentication by default (localhost only). Supports REST and MCP.
Agent API: Can be secured with API keys
Database: Password protected, not exposed externally

Default configuration is for development. For production:

Enable authentication on all APIs
Use HTTPS/WSS for all connections
Implement network policies
Rotate credentials regularly
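
The production checklist can be turned into an automated pre-deployment check. The sketch below flags services that are reachable beyond localhost without authentication or TLS. The `ServiceExposure` shape and its field names are hypothetical; they do not correspond to an actual Bytebot configuration file.

```typescript
// Hypothetical description of how one service is exposed.
interface ServiceExposure {
  name: string;
  bindAddress: string; // e.g. "127.0.0.1" or "0.0.0.0"
  authEnabled: boolean;
  tls: boolean; // HTTPS/WSS in front of the service
}

// Return human-readable warnings for services that are exposed beyond
// localhost without the protections recommended for production.
function productionWarnings(services: ServiceExposure[]): string[] {
  const warnings: string[] = [];
  for (const s of services) {
    const localOnly =
      s.bindAddress === "127.0.0.1" || s.bindAddress === "localhost";
    if (localOnly) continue;
    if (!s.authEnabled) warnings.push(`${s.name}: exposed without authentication`);
    if (!s.tls) warnings.push(`${s.name}: exposed without HTTPS/WSS`);
  }
  return warnings;
}
```

Running a check like this in CI or at startup makes it harder to ship the development defaults (no authentication, plain HTTP) to a public host by accident.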

Deployment Patterns

Single User (Development)

Services: All on one machine
Scale: 1 instance each
Use Case: Personal automation, development
Resources: 4GB RAM, 2 CPU cores

Production Deployment

Services: All services on dedicated hardware
Scale: Single instance (1 agent, 1 desktop)
Use Case: Business automation
Resources: 8GB+ RAM, 4+ CPU cores

Enterprise Deployment

Services: Kubernetes orchestration
Scale: Single instance with high availability
Use Case: Organization-wide automation
Resources: Dedicated nodes

Extension Points

Custom Tools

Add specialized software to the desktop:

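# Extend the Bytebot desktop image with additional packages
FROM bytebot/desktop:latest
# Replace your-custom-tools with the packages you need
RUN apt-get update && apt-get install -y \
    your-custom-tools \
    && rm -rf /var/lib/apt/lists/*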

AI Integrations

Extend agent capabilities:

Custom tools for the LLM
Additional AI models
Specialized prompts
Domain-specific knowledge

Performance Considerations

Resource Usage

Desktop Container: ~1GB RAM idle, 2GB+ active
Agent Service: ~256MB RAM
UI Service: ~128MB RAM
Database: ~256MB RAM
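
Adding up the figures above gives a quick sanity check against the RAM recommended in the deployment patterns. The sketch treats 1GB as 1024MB and uses the lower bound of "2GB+" for an active desktop, so the active total is a minimum rather than a ceiling.

```typescript
// Approximate memory use per service in MB, taken from the list above.
const idleUsageMb: Record<string, number> = {
  desktop: 1024,
  agent: 256,
  ui: 128,
  database: 256,
};

// Same as idle, but with the desktop at its stated active lower bound (2GB+).
const activeUsageMb: Record<string, number> = { ...idleUsageMb, desktop: 2048 };

// Sum the memory of all services.
function totalMb(usage: Record<string, number>): number {
  let total = 0;
  for (const service of Object.keys(usage)) total += usage[service];
  return total;
}

// Memory left on a host after all services are accounted for.
function headroomMb(hostRamMb: number, usage: Record<string, number>): number {
  return hostRamMb - totalMb(usage);
}

console.log(`idle total: ${totalMb(idleUsageMb)} MB`);
console.log(`active total: ${totalMb(activeUsageMb)} MB`);
console.log(`headroom on a 4GB host: ${headroomMb(4096, activeUsageMb)} MB`);
```

The totals come to 1664MB idle and at least 2688MB active, which leaves about 1408MB of headroom on the 4GB single-user configuration for the operating system and applications opened inside the desktop.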

Optimization Tips

Allocate sufficient resources to containers
Limit concurrent tasks to prevent overload
Monitor resource usage regularly
Use LiteLLM proxy for provider flexibility
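
Limiting concurrent tasks can be as simple as a counter that hands out a fixed number of slots. The `TaskSlots` class below is a hypothetical sketch, not part of Bytebot; with one desktop per agent, a limit of 1 would match the single-instance deployments described earlier.

```typescript
// Minimal slot counter: a task may start only if a slot is free.
class TaskSlots {
  private active = 0;
  private readonly limit: number;

  constructor(limit: number) {
    if (limit < 1) throw new Error("limit must be at least 1");
    this.limit = limit;
  }

  // Try to reserve a slot; returns false when the limit is reached.
  tryAcquire(): boolean {
    if (this.active >= this.limit) return false;
    this.active += 1;
    return true;
  }

  // Free a slot when a task completes or fails.
  release(): void {
    if (this.active > 0) this.active -= 1;
  }

  inUse(): number {
    return this.active;
  }
}
```

A task queue would call `tryAcquire()` before starting a task, leave the task queued when it returns false, and call `release()` when the task completes or fails.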

Next Steps

Agent System

Learn about the AI agent capabilities

Desktop Environment

Explore the virtual desktop environment

API Reference

Integrate with your applications

Deployment Guide

Deploy your own instance
