Execute actions like mouse movements, clicks, keyboard input, and screenshots in the Bytebot desktop environment.

Request

action (string, required)

The type of computer action to perform. Must be one of: move_mouse, trace_mouse, click_mouse, press_mouse, drag_mouse, scroll, type_keys, press_keys, type_text, wait, screenshot, cursor_position.
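
Because the action field accepts only the names listed above, a client can reject typos before sending anything. The following is a minimal sketch, not part of the Bytebot API: the helper name build_payload is hypothetical, and any extra keyword arguments are passed through unvalidated because the per-action parameters are not described in this section.

```python
import json

# Action names copied verbatim from the list above.
ALLOWED_ACTIONS = frozenset({
    "move_mouse", "trace_mouse", "click_mouse", "press_mouse", "drag_mouse",
    "scroll", "type_keys", "press_keys", "type_text", "wait", "screenshot",
    "cursor_position",
})


def build_payload(action: str, **params) -> str:
    """Return a JSON request body, rejecting unknown action names.

    Hypothetical helper: only the "action" field is checked; other
    parameters are forwarded as given.
    """
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    return json.dumps({"action": action, **params})
```

For example, build_payload("screenshot") yields a body containing only the action field, while build_payload("fly") raises ValueError.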

Response

Responses vary based on the action performed:

Default Response

Most actions return a simple success response:

{
  "success": true
}

Screenshot Response

Returns the screenshot as a base64-encoded string:

{
  "success": true,
  "data": {
    "image": "base64_encoded_image_data"
  }
}
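
To use the screenshot, a client has to decode the base64 string back into raw bytes. The sketch below assumes the response has already been parsed into a dict with the shape shown above. The function name decode_screenshot is hypothetical, and the image format is not stated in this section, so the bytes are returned without interpretation.

```python
import base64


def decode_screenshot(response: dict) -> bytes:
    """Return the raw image bytes from a parsed screenshot response.

    Hypothetical helper; expects the {"success", "data": {"image"}} shape
    shown above.
    """
    if not response.get("success"):
        # Failed calls carry an "error" message instead of "data".
        raise RuntimeError(response.get("error", "unknown error"))
    return base64.b64decode(response["data"]["image"])
```

The returned bytes can be written to a file opened in binary mode or handed to an image library.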

Cursor Position Response

Returns the current cursor position:

{
  "success": true,
  "data": {
    "x": 123,
    "y": 456
  }
}

Error Response

{
  "success": false,
  "error": "Error message"
}
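
Since every response shape above carries a success flag, one small function can handle all of them: raise on failure, otherwise return the data object if there is one. This is a sketch under that assumption. The names ComputerUseError and unwrap are hypothetical and not part of Bytebot.

```python
import json


class ComputerUseError(Exception):
    """Raised when a response reports "success": false (hypothetical name)."""


def unwrap(body: str) -> dict:
    """Parse a JSON response body and return its "data" object.

    Returns an empty dict for the default response, which has no "data".
    """
    parsed = json.loads(body)
    if not parsed.get("success"):
        raise ComputerUseError(parsed.get("error", "unknown error"))
    return parsed.get("data", {})
```

With the cursor position response shown earlier, unwrap returns a dict whose "x" and "y" keys hold the coordinates.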

Code Examples

curl -X POST http://localhost:9990/computer-use \
  -H "Content-Type: application/json" \
  -d '{"action": "move_mouse", "coordinates": {"x": 100, "y": 200}}'
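
The same request can be sketched in Python using only the standard library. The URL, header, and body mirror the curl example above. The function names build_request and send are hypothetical, and send assumes a Bytebot instance is reachable at that address. Note that urllib raises HTTPError for non-2xx status codes, so if the server pairs error responses with such a status, they surface as exceptions rather than as a parsed body.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9990"  # host and port taken from the curl example


def build_request(action: str, **params) -> urllib.request.Request:
    """Build (but do not send) a POST request to /computer-use."""
    body = json.dumps({"action": action, **params}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/computer-use",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Equivalent of the curl call: move the mouse to (100, 200).
    req = build_request("move_mouse", coordinates={"x": 100, "y": 200})
    print(send(req))
```

Keeping request construction separate from sending makes the payload easy to inspect or test without a running desktop environment.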