SDK Reference

The Bytebot SDK enables developers to automate web tasks by translating natural language prompts into executable actions on web pages. This guide provides detailed documentation on the BytebotClient class, its methods, and the various interfaces and enums that are part of the SDK.

BytebotClient

BytebotClient is the primary interface for interacting with the Bytebot SDK, offering methods to execute actions based on natural language prompts or predefined action sets.

Constructor

Instantiate BytebotClient with optional ClientOptions for customization:

import { BytebotClient } from "@bytebot/sdk";

// apiKey is passed as a string, matching the ClientOptions interface below.
const client = new BytebotClient({
  apiKey: "Your Bytebot API Key",
});

Act Method

The act method allows you to translate a natural language prompt into one or more actions for a specified page. The actions are returned as an array of BrowserAction objects.

Usage

const actions = await client.act("Enter your prompt here", page, {
  parameters: {
    /* Sensitive parameters */
  },
});

Parameters

  • prompt: A string containing the natural language instructions to be converted into actions.
  • page: The Page object representing the browser page where the actions will be executed.
  • options: Optional. An object of PromptOptions to further customize the prompt execution.

Extract Method

The extract method takes a schema describing the data to extract and translates it into an action for a specified page. A single action is returned in a BrowserAction array.

Usage

// Describe the table to extract: one Column per field, each with a short description.
const schema = Table([
  Column("Company Name", Text("The name of the company")),
  Column("Company Description", Text("The description of the company")),
]);

const actions = await client.extract(schema, page);

Parameters

  • schema: An object representing the structure of what to extract.
  • page: The Page object representing the browser page where the actions will be executed.
  • options: Optional. An object of PromptOptions to further customize the prompt execution.

Execute Method

The execute method directly executes a list of specified actions on a given page. Depending on the action type, the method will return null or a value.

const result = await client.execute(actions, page);
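
The act and execute methods are typically used together: act produces the actions and execute runs them. The following is a minimal sketch of that flow, not part of the SDK. The helper name actThenExecute and the structural interface ActExecuteClient are hypothetical, defined locally so the example compiles without the SDK installed, and returning null when no actions are produced is an assumption made for this sketch.

```typescript
// Minimal local shapes (assumptions) so the sketch is self-contained.
// In real code, BytebotClient and your browser library's Page would be used.
interface BrowserAction {
  type: string;
  xpath: string;
}

interface ActExecuteClient<Page> {
  act(prompt: string, page: Page): Promise<BrowserAction[]>;
  execute(actions: BrowserAction[], page: Page): Promise<unknown>;
}

// Hypothetical helper: translate a prompt into actions, then run them.
async function actThenExecute<Page>(
  client: ActExecuteClient<Page>,
  prompt: string,
  page: Page,
): Promise<unknown> {
  const actions = await client.act(prompt, page);
  if (actions.length === 0) {
    // Assumption for this sketch: nothing to run means nothing to return.
    return null;
  }
  return client.execute(actions, page);
}
```

Keeping the two steps separate also makes it possible to inspect or store the actions returned by act before anything is run on the page.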

Enums

BrowserActionType

An enumeration of possible action types that BytebotClient can perform, such as copying attributes, assigning attributes, clicking elements, copying text, and extracting tables.

export const BrowserActionType = {
  CopyAttribute: "CopyAttribute",
  AssignAttribute: "AssignAttribute",
  Click: "Click",
  CopyText: "CopyText",
  ExtractTable: "ExtractTable",
} as const;
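
Because the object is declared with `as const`, a union of its literal values can be derived with standard TypeScript, which is useful when checking action types that arrive as plain strings. The sketch below uses a local copy of the constant; the alias BrowserActionTypeValue and the guard isBrowserActionType are hypothetical names, not SDK exports.

```typescript
// Local copy of the constant shown above, so the sketch is self-contained.
const BrowserActionType = {
  CopyAttribute: "CopyAttribute",
  AssignAttribute: "AssignAttribute",
  Click: "Click",
  CopyText: "CopyText",
  ExtractTable: "ExtractTable",
} as const;

// Union of the literal values: "CopyAttribute" | "AssignAttribute" | ...
// (hypothetical alias, not an SDK export)
type BrowserActionTypeValue =
  (typeof BrowserActionType)[keyof typeof BrowserActionType];

// Hypothetical type guard for narrowing untrusted strings.
function isBrowserActionType(value: string): value is BrowserActionTypeValue {
  return (Object.values(BrowserActionType) as string[]).includes(value);
}
```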

Interfaces

ClientOptions

An interface for customizing the behavior of the BytebotClient:

interface ClientOptions {
  apiUrl?: string;
  apiKey?: string;
  logVerbose?: boolean;
}

Properties

  • apiUrl: Optional. A string for the URL of the Bytebot API. Defaults to ‘https://api.bytebot.ai’.
  • apiKey: Optional. A string for the API key to authenticate with the Bytebot API. If not provided, the API key must be set using the BYTEBOT_API_KEY environment variable.
  • logVerbose: Optional. A boolean indicating whether verbose logging should be enabled. Defaults to false.
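
The defaults described above can be illustrated with a small resolver. This is a sketch of the documented behavior, not the SDK's actual implementation: the function resolveClientOptions and the type ResolvedClientOptions are hypothetical, and the `env` argument stands in for the process environment so the example has no runtime dependencies.

```typescript
// Mirrors the ClientOptions interface documented above.
interface ClientOptions {
  apiUrl?: string;
  apiKey?: string;
  logVerbose?: boolean;
}

// Hypothetical fully resolved shape; not an SDK type.
interface ResolvedClientOptions {
  apiUrl: string;
  apiKey: string;
  logVerbose: boolean;
}

// Hypothetical helper illustrating the documented defaults.
function resolveClientOptions(
  options: ClientOptions = {},
  env: Record<string, string | undefined> = {},
): ResolvedClientOptions {
  // An explicit apiKey wins; otherwise fall back to BYTEBOT_API_KEY.
  const apiKey = options.apiKey ?? env["BYTEBOT_API_KEY"];
  if (!apiKey) {
    throw new Error("Set apiKey or the BYTEBOT_API_KEY environment variable");
  }
  return {
    apiUrl: options.apiUrl ?? "https://api.bytebot.ai",
    apiKey,
    logVerbose: options.logVerbose ?? false,
  };
}
```

In a Node.js program, `process.env` could be passed as the second argument.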

PromptOptions

An interface for customizing the behavior of the act and extract methods:

interface PromptOptions {
  parameters?: Record<string, string>;
}

Properties

  • parameters: Optional. An object containing key-value pairs for passing sensitive information securely.
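
One way to keep sensitive values out of source code and prompt text is to collect them from the environment and hand them over through parameters. The sketch below assumes that approach; the helper secretsToPromptOptions is hypothetical, and `env` is an ordinary key-value map standing in for the process environment.

```typescript
// Mirrors the PromptOptions interface documented above.
interface PromptOptions {
  parameters?: Record<string, string>;
}

// Hypothetical helper: build PromptOptions from named environment entries.
function secretsToPromptOptions(
  env: Record<string, string | undefined>,
  names: string[],
): PromptOptions {
  const parameters: Record<string, string> = {};
  for (const name of names) {
    const value = env[name];
    if (value === undefined) {
      // Fail early rather than sending an incomplete parameter set.
      throw new Error(`Missing required parameter: ${name}`);
    }
    parameters[name] = value;
  }
  return { parameters };
}
```

The returned object has the shape expected for the optional third argument of act.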

BrowserAction

Defines the structure for individual actions, including their type, target, and parameters:

interface BrowserAction {
  type: Bytebot.BrowserActionType; // Click, CopyText, CopyAttribute, AssignAttribute, ExtractTable
  xpath: string;
  // Action parameters
  attribute?: string; // Required for CopyAttribute and AssignAttribute
  value?: string; // Required for AssignAttribute
  rows?: Bytebot.ExtractTableColumn[][]; // Required for ExtractTable
}

Properties

  • type: A BrowserActionType value specifying the type of action, such as Click, CopyText, or ExtractTable.
  • xpath: The XPath expression targeting the element(s) the action is to be performed on.
  • attribute: Required for CopyAttribute and AssignAttribute. The name of the attribute to read or assign.
  • value: Required for AssignAttribute. The value to assign to the specified attribute.
  • rows: Required for ExtractTable. A table represented as a 2D array of ExtractTableColumn objects that specify the cells to be extracted.
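
Since the interface marks attribute, value, and rows as optional while the descriptions make them required for particular action types, a caller may want to check an action before executing it. The following sketch encodes the rules listed above; the function missingFields is hypothetical and the types are local copies without the Bytebot namespace.

```typescript
// Local copies of the documented shapes, so the sketch is self-contained.
type BrowserActionType =
  | "CopyAttribute"
  | "AssignAttribute"
  | "Click"
  | "CopyText"
  | "ExtractTable";

interface ExtractTableColumn {
  name: string;
  action: BrowserAction;
}

interface BrowserAction {
  type: BrowserActionType;
  xpath: string;
  attribute?: string;
  value?: string;
  rows?: ExtractTableColumn[][];
}

// Hypothetical validator: lists the required fields an action is missing.
function missingFields(action: BrowserAction): string[] {
  const missing: string[] = [];
  switch (action.type) {
    case "CopyAttribute":
      if (action.attribute === undefined) missing.push("attribute");
      break;
    case "AssignAttribute":
      if (action.attribute === undefined) missing.push("attribute");
      if (action.value === undefined) missing.push("value");
      break;
    case "ExtractTable":
      if (action.rows === undefined) missing.push("rows");
      break;
    case "Click":
    case "CopyText":
      // Only type and xpath are needed.
      break;
  }
  return missing;
}
```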

ExtractTableColumn

Details for individual table columns in an extraction action:

interface ExtractTableColumn {
  name: string;
  action: Bytebot.BrowserAction; // Only CopyAttribute and CopyText are supported
}

Properties

  • name: The name assigned to the column, which will be used as the key in the dictionary returned.
  • action: A BrowserAction specifying the action to perform on the column. Supported actions include CopyAttribute and CopyText.
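
To make the relationship between column names and the returned dictionary concrete, the sketch below turns one row of ExtractTableColumn objects into a keyed record. It is an illustration under assumptions, not the SDK's extraction logic: rowToRecord is a hypothetical helper, and the `read` callback stands in for whatever evaluates a single action against the page.

```typescript
// Local copies of the documented shapes, so the sketch is self-contained.
type BrowserActionType =
  | "CopyAttribute"
  | "AssignAttribute"
  | "Click"
  | "CopyText"
  | "ExtractTable";

interface BrowserAction {
  type: BrowserActionType;
  xpath: string;
  attribute?: string;
  value?: string;
  rows?: ExtractTableColumn[][];
}

interface ExtractTableColumn {
  name: string;
  action: BrowserAction;
}

// Hypothetical helper: build one dictionary per row, keyed by column name.
function rowToRecord(
  row: ExtractTableColumn[],
  read: (action: BrowserAction) => string | null,
): Record<string, string | null> {
  const record: Record<string, string | null> = {};
  for (const column of row) {
    const { type } = column.action;
    // Only CopyAttribute and CopyText are supported for columns.
    if (type !== "CopyText" && type !== "CopyAttribute") {
      throw new Error(`Unsupported column action: ${type}`);
    }
    record[column.name] = read(column.action);
  }
  return record;
}
```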