Using a Remote Browser

Create a Multi-Step Query

When creating multi-step queries with a remote Bytebot-managed browser, there are two rules to remember:

  1. The act function will produce one or more void actions. The extract function will produce exactly one extract action. You cannot extract multiple elements from a single extract call.
  2. Sequential actions (those that depend on each other) need to be split into separate act prompts. This is because the browser's state must be updated before a subsequent action can be generated. However, independent, non-sequential actions can be combined into a single prompt.

These rules are best explained through a few examples.
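Before the full examples, rule 2 can be sketched as a small grouping step. This is only an illustration: `PromptStep`, `dependsOnPrevious`, and `batchPrompts` are hypothetical names invented here and are not part of the Bytebot SDK. The helper works purely on strings and decides which prompts could share one act call.

```typescript
// Hypothetical helper, not part of the Bytebot SDK: it only groups prompt strings.
interface PromptStep {
  prompt: string;
  // True when this step needs the page state produced by an earlier step.
  dependsOnPrevious: boolean;
}

// Groups steps into batches; each returned string is meant for one act call.
function batchPrompts(steps: PromptStep[]): string[] {
  const batches: string[][] = [];
  for (const step of steps) {
    const current = batches[batches.length - 1];
    if (current === undefined || step.dependsOnPrevious) {
      // A dependent step starts a new batch so the browser state can update first.
      batches.push([step.prompt]);
    } else {
      // An independent step can share a prompt with the batch before it.
      current.push(step.prompt);
    }
  }
  return batches.map((batch) => batch.join(" "));
}

const prompts = batchPrompts([
  { prompt: "Filter by Top Companies.", dependsOnPrevious: false },
  { prompt: "Filter by B2B companies.", dependsOnPrevious: false },
  { prompt: "Select analytics companies from the B2B submenu.", dependsOnPrevious: true },
]);
// prompts has two entries: the two filters combined into one prompt,
// followed by the submenu selection as its own prompt.
```

Whether a step depends on an earlier one is a judgment the caller has to make; the helper does not inspect the page.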

Example 1: Independent actions through one prompt

If you have multiple actions that do not depend on each other, they can be combined into a single act call. For example, if you need to select two radio buttons that are already visible:

import { BytebotClient } from "./src/index.ts";
import "dotenv/config";

const bytebot = new BytebotClient({
  apiKey: process.env.BYTEBOT_API_KEY,
});

async function run() {
  const browser = await bytebot.browser.startSession(
    "https://www.ycombinator.com/companies"
  );

  if (browser.sessionId) {
    // The two filters are independent, so a single prompt can cover both.
    const prompt = "Filter by Top Companies and filter by B2B companies.";
    const actions = await bytebot.browser.act({
      sessionId: browser.sessionId,
      prompt: prompt,
    });
    console.log(actions);

    // Wait for the session to close before the function returns.
    await bytebot.browser.endSession(browser.sessionId);
  }
}

run().catch(console.error);

Here, the two filters can be applied in either order. Accordingly, they can be bundled into a single act call.

Example 2: Sequential actions through multiple prompts

Instead of clicking on two category filters, imagine clicking on a category filter and a subsequent subcategory filter. These actions need to be done sequentially. Therefore, they need to be executed in separate act calls.

To accomplish this cleanly, we can use a loop:

import { BytebotClient } from "./src/index.ts";
import "dotenv/config";

const bytebot = new BytebotClient({
  apiKey: process.env.BYTEBOT_API_KEY,
});

async function run() {
  const browser = await bytebot.browser.startSession("https://www.ycombinator.com/companies");

  if (browser.sessionId) {
    // Each prompt may rely on the page state left by the one before it.
    const prompts = [
      "Select the Top Companies",
      "Select B2B Companies",
      "Select analytics companies from the B2B submenu",
    ];

    // Await each act call so the prompts run strictly one after another.
    for (const prompt of prompts) {
      await bytebot.browser.act({
        sessionId: browser.sessionId,
        prompt: prompt,
      });
    }

    // Wait for the session to close before the function returns.
    await bytebot.browser.endSession(browser.sessionId);
  }
}

run().catch(console.error);
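The loop above ends the session only if every act call succeeds. A minimal sketch of a wrapper that always closes the session is shown below. The `BrowserApi` interface is an assumption that mirrors the three calls used in these examples, including the assumption that `endSession` returns a promise; the real SDK types may differ, and `actInOrder` is a hypothetical helper name.

```typescript
// Assumed shape of the calls used in the examples; not taken from the SDK's typings.
interface BrowserApi {
  startSession(url: string): Promise<{ sessionId?: string }>;
  act(args: { sessionId: string; prompt: string }): Promise<unknown>;
  endSession(sessionId: string): Promise<unknown>;
}

// Runs each prompt as its own act call, in order, and always ends the session.
async function actInOrder(
  browser: BrowserApi,
  url: string,
  prompts: string[]
): Promise<unknown[]> {
  const session = await browser.startSession(url);
  const sessionId = session.sessionId;
  if (!sessionId) {
    throw new Error("No session ID was returned");
  }

  const results: unknown[] = [];
  try {
    for (const prompt of prompts) {
      // Awaiting inside the loop keeps the calls strictly sequential.
      results.push(await browser.act({ sessionId, prompt }));
    }
  } finally {
    // Runs even if an act call throws, so the remote session is not left open.
    await browser.endSession(sessionId);
  }
  return results;
}
```

If `bytebot.browser` matches this shape, the call would look like `actInOrder(bytebot.browser, url, prompts)`. The `try`/`finally` block is the point of the sketch: a failed prompt still propagates its error, but only after the session has been ended.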

Example 3: Multiple extractions

Because extract can only create and execute a single extract browser action, a separate function call is needed for every extraction:

import { BytebotClient, Text, Attribute } from "./src/index.ts";
import "dotenv/config";

const bytebot = new BytebotClient({
  apiKey: process.env.BYTEBOT_API_KEY,
});

async function run() {
  const browser = await bytebot.browser.startSession("https://developer.chrome.com");

  // One schema per value to extract.
  const schemas = [
    Text("The main CTA button"),
    Attribute("The link of the main CTA"),
  ];

  if (browser.sessionId) {
    const extractions = [];
    for (const schema of schemas) {
      // Each extract call handles exactly one schema.
      const extractActions = await bytebot.browser.extract({
        sessionId: browser.sessionId,
        schema: schema,
      });

      // Collect what each call returns.
      extractions.push(extractActions);
    }
    console.log(extractions);

    // Wait for the session to close before the function returns.
    await bytebot.browser.endSession(browser.sessionId);
  }
}

run().catch(console.error);
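When several values are extracted, it can help to label each result. The sketch below is one possible way to do that and makes no claim about the SDK itself: `NamedSchema`, `ExtractOne`, and `extractAll` are hypothetical names, and `extractOne` stands in for whatever function issues a single extract call.

```typescript
// Hypothetical stand-in for a single extract call; S is whatever schema type is in use.
type ExtractOne<S> = (schema: S) => Promise<unknown>;

// A schema paired with a caller-chosen label for its result.
interface NamedSchema<S> {
  name: string;
  schema: S;
}

// Issues one extract call per schema, in order, and collects the results by name.
async function extractAll<S>(
  extractOne: ExtractOne<S>,
  schemas: NamedSchema<S>[]
): Promise<Record<string, unknown>> {
  const results: Record<string, unknown> = {};
  for (const { name, schema } of schemas) {
    // One schema per call, since each extract call yields a single extract action.
    results[name] = await extractOne(schema);
  }
  return results;
}
```

In the example above, `extractOne` could be a closure such as `(schema) => bytebot.browser.extract({ sessionId, schema })`, with the two schemas labeled, say, `ctaText` and `ctaLink`.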

Example 4: Mixing actions and extractions

Actions and extractions can be mixed. For example:

import { BytebotClient, Text } from "./src/index.ts";
import "dotenv/config";

const bytebot = new BytebotClient({
  apiKey: process.env.BYTEBOT_API_KEY,
});

async function run() {
  const browser = await bytebot.browser.startSession("https://www.ycombinator.com/companies");

  if (browser.sessionId) {
    // First, apply both independent filters with one act call.
    const prompt = "Filter by Top Companies and filter by B2B";
    await bytebot.browser.act({ sessionId: browser.sessionId, prompt: prompt });

    // Then extract a single value from the filtered page.
    await bytebot.browser.extract({
      sessionId: browser.sessionId,
      schema: Text("The first company's name"),
    });

    // Wait for the session to close before the function returns.
    await bytebot.browser.endSession(browser.sessionId);
  }
}

run().catch(console.error);
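For longer flows, the mix of actions and extractions can be written down as data and run in order. The following is a rough sketch under stated assumptions: `PlanStep`, `SessionOps`, and `runPlan` are hypothetical names, and `SessionOps` is a thin wrapper the caller would build around the act and extract calls for one session.

```typescript
// A plan step is either an act prompt or a named extraction (hypothetical types).
type PlanStep<S> =
  | { kind: "act"; prompt: string }
  | { kind: "extract"; name: string; schema: S };

// Assumed wrapper around one session's act and extract calls.
interface SessionOps<S> {
  act(prompt: string): Promise<unknown>;
  extract(schema: S): Promise<unknown>;
}

// Runs the steps one at a time and returns the extraction results by name.
async function runPlan<S>(
  ops: SessionOps<S>,
  steps: PlanStep<S>[]
): Promise<Record<string, unknown>> {
  const extracted: Record<string, unknown> = {};
  for (const step of steps) {
    if (step.kind === "act") {
      // Let the browser state update before moving on to the next step.
      await ops.act(step.prompt);
    } else {
      extracted[step.name] = await ops.extract(step.schema);
    }
  }
  return extracted;
}
```

Keeping the plan as an array makes the ordering explicit, which matters because an extraction usually depends on the actions that ran before it.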

Next Steps

Learn more about Bytebot's recommended tips, or explore Bytebot's BrowserAction types in depth.