POST /computer-use
Execute actions like mouse movements, clicks, keyboard input, and screenshots in the Bytebot desktop environment.

Request

action
string
required
The type of computer action to perform. Must be one of: move_mouse, trace_mouse, click_mouse, press_mouse, drag_mouse, scroll, type_keys, press_keys, type_text, paste_text, wait, screenshot, cursor_position, application.

Mouse Actions

move_mouse

coordinates (object, required): The target coordinates to move to.
Example Request
{
  "action": "move_mouse",
  "coordinates": {
    "x": 100,
    "y": 200
  }
}

trace_mouse

path (array, required): Array of coordinate objects for the mouse path.
holdKeys (array, optional): Keys to hold while moving the mouse along the path.
Example Request
{
  "action": "trace_mouse",
  "path": [
    { "x": 100, "y": 100 },
    { "x": 150, "y": 150 },
    { "x": 200, "y": 200 }
  ],
  "holdKeys": ["shift"]
}
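
For longer paths, the `path` array can be generated instead of written by hand. The following is a minimal sketch that linearly interpolates between two points. The helper names `interpolate_path` and `trace_mouse_payload` are hypothetical and not part of the API, and rounding to integers assumes the endpoint expects whole-pixel coordinates, as in the examples on this page.

```python
def interpolate_path(start, end, steps):
    """Return steps + 1 evenly spaced points from start to end, inclusive.

    start and end are (x, y) tuples; each point is a dict with "x" and "y",
    matching the coordinate objects used in the request body.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    (x0, y0), (x1, y1) = start, end
    return [
        {
            "x": round(x0 + (x1 - x0) * i / steps),
            "y": round(y0 + (y1 - y0) * i / steps),
        }
        for i in range(steps + 1)
    ]


def trace_mouse_payload(start, end, steps=2, hold_keys=None):
    """Build a trace_mouse request body (hypothetical helper)."""
    payload = {
        "action": "trace_mouse",
        "path": interpolate_path(start, end, steps),
    }
    # holdKeys is optional, so it is only included when keys are given.
    if hold_keys:
        payload["holdKeys"] = list(hold_keys)
    return payload
```

Calling `trace_mouse_payload((100, 100), (200, 200), steps=2, hold_keys=["shift"])` builds the same body as the example request above.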

click_mouse

coordinates (object, optional): The coordinates to click (uses current cursor position if omitted).
button (string, required): Mouse button to click. Must be one of: left, right, middle.
clickCount (number, required): Number of clicks to perform.
holdKeys (array, optional): Keys to hold while clicking (e.g., ["ctrl", "shift"]).
Example Request
{
  "action": "click_mouse",
  "coordinates": {
    "x": 150,
    "y": 250
  },
  "button": "left",
  "clickCount": 2
}

press_mouse

coordinates (object, optional): The coordinates to press/release (uses current cursor position if omitted).
button (string, required): Mouse button to press/release. Must be one of: left, right, middle.
press (string, required): Whether to press or release the button. Must be one of: up, down.
Example Request
{
  "action": "press_mouse",
  "coordinates": {
    "x": 150,
    "y": 250
  },
  "button": "left",
  "press": "down"
}

drag_mouse

path (array, required): Array of coordinate objects for the drag path.
button (string, required): Mouse button to use for dragging. Must be one of: left, right, middle.
holdKeys (array, optional): Keys to hold while dragging.
Example Request
{
  "action": "drag_mouse",
  "path": [
    { "x": 100, "y": 100 },
    { "x": 200, "y": 200 }
  ],
  "button": "left"
}

scroll

coordinates (object, optional): The coordinates to scroll at (uses current cursor position if omitted).
direction (string, required): Scroll direction. Must be one of: up, down, left, right.
scrollCount (number, required): Number of scroll steps to perform.
holdKeys (array, optional): Keys to hold while scrolling.
Example Request
{
  "action": "scroll",
  "direction": "down",
  "scrollCount": 5
}

Keyboard Actions

type_keys

keys (array, required): Array of keys to type in sequence.
delay (number, optional): Delay between key presses in milliseconds.
Example Request
{
  "action": "type_keys",
  "keys": ["a", "b", "c", "enter"],
  "delay": 50
}

press_keys

keys (array, required): Array of keys to press or release.
press (string, required): Whether to press or release the keys. Must be one of: up, down.
Example Request
{
  "action": "press_keys",
  "keys": ["ctrl", "shift", "esc"],
  "press": "down"
}
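
Because `press_keys` takes an explicit `up` or `down`, a key combination is usually sent as a pair of requests: one to press the keys and one to release them. The sketch below builds that pair. The helper name `key_chord_payloads` is hypothetical, and releasing the keys in reverse order is a common convention assumed here, not something this page specifies.

```python
def key_chord_payloads(keys):
    """Build a (down, up) pair of press_keys request bodies.

    Assumption: keys pressed with "down" stay held until a matching "up"
    request is sent, so every "down" should be followed by an "up".
    """
    keys = list(keys)
    return [
        {"action": "press_keys", "keys": keys, "press": "down"},
        # Release in reverse order (convention, not required by the docs).
        {"action": "press_keys", "keys": keys[::-1], "press": "up"},
    ]
```

Sending the two bodies from `key_chord_payloads(["ctrl", "shift", "esc"])` one after the other would press the chord from the example above and then release it.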

type_text

text (string, required): The text string to type.
delay (number, optional): Delay between characters in milliseconds.
Example Request
{
  "action": "type_text",
  "text": "Hello, Bytebot!",
  "delay": 50
}

paste_text

text (string, required): The text to paste. Useful for special characters that aren’t on the standard keyboard.
Example Request
{
  "action": "paste_text",
  "text": "Special characters: ©®™€¥£ émojis 🎉"
}

System Actions

wait

duration (number, required): Wait duration in milliseconds.
Example Request
{
  "action": "wait",
  "duration": 2000
}

screenshot

No parameters required.
Example Request
{
  "action": "screenshot"
}

cursor_position

No parameters required.
Example Request
{
  "action": "cursor_position"
}

application

application (string, required): The application to switch to. Must be one of: firefox, 1password, thunderbird, vscode, terminal, desktop, directory.
Example Request
{
  "action": "application",
  "application": "firefox"
}
Available Applications:
  • firefox - Mozilla Firefox web browser
  • 1password - Password manager
  • thunderbird - Email client
  • vscode - Visual Studio Code editor
  • terminal - Terminal/console application
  • desktop - Switch to desktop
  • directory - File manager/directory browser

Response

Responses vary based on the action performed:

Default Response

Most actions return a simple success response:
{
  "success": true
}

Screenshot Response

Returns the screenshot as a base64-encoded string in data.image:
{
  "success": true,
  "data": {
    "image": "base64_encoded_image_data"
  }
}
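
To use the screenshot, the `image` string has to be decoded back into bytes. The following is a minimal sketch using Python's standard `base64` module. The helper name `decode_screenshot` is hypothetical, and it assumes `image` holds plain base64 with no data-URI prefix; the image format is not stated on this page.

```python
import base64


def decode_screenshot(response: dict) -> bytes:
    """Return the raw image bytes from a screenshot response.

    Raises RuntimeError if the response reports a failure.
    """
    if not response.get("success"):
        raise RuntimeError(response.get("error", "screenshot failed"))
    # Assumption: data.image is plain base64 without a "data:image/..." prefix.
    return base64.b64decode(response["data"]["image"])
```

The returned bytes can be written to disk with, for example, `pathlib.Path("screenshot.png").write_bytes(...)`; the `.png` extension is a guess and should match whatever format the server actually returns.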

Cursor Position Response

Returns the current cursor position:
{
  "success": true,
  "data": {
    "x": 123,
    "y": 456
  }
}

Error Response

{
  "success": false,
  "error": "Error message"
}
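
Since every response carries a `success` flag, client code can handle all four response shapes in one place. The sketch below returns `data` on success and raises on failure. `ComputerUseError` and `unwrap` are hypothetical names introduced for illustration, not part of any Bytebot client.

```python
class ComputerUseError(Exception):
    """Raised when the API reports success: false (hypothetical class)."""


def unwrap(response: dict):
    """Return response["data"] on success, or None if there is no data.

    Raises ComputerUseError with the server's error message on failure.
    """
    if response.get("success"):
        # Default responses have no "data" key, so this yields None.
        return response.get("data")
    raise ComputerUseError(response.get("error", "unknown error"))
```

With this helper, a default response yields `None`, a cursor position response yields the `{"x": ..., "y": ...}` object, and an error response raises an exception carrying the `error` string.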

Code Examples

curl -X POST http://localhost:9990/computer-use \
  -H "Content-Type: application/json" \
  -d '{"action": "move_mouse", "coordinates": {"x": 100, "y": 200}}'
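
The same request can be sent from Python using only the standard library. This is a minimal sketch: `build_request` and `computer_use` are hypothetical helper names, and the base URL is copied from the curl example, so adjust it for your deployment.

```python
import json
import urllib.request

# Taken from the curl example; change host and port to match your setup.
BASE_URL = "http://localhost:9990"


def build_request(payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a JSON POST request for the /computer-use endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/computer-use",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def computer_use(payload: dict, base_url: str = BASE_URL) -> dict:
    """Send the action and return the parsed JSON response."""
    with urllib.request.urlopen(build_request(payload, base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Against a running instance, `computer_use({"action": "move_mouse", "coordinates": {"x": 100, "y": 200}})` sends the same body as the curl command.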
