Create an Extraction

In this guide, you will extract content from a webpage. Extracting content is particularly common in web scraping and testing. This tutorial uses a Bytebot-managed remote browser; to use a local Puppeteer instance instead, see the guide for local browsers.

Extract Schema Functions

Extract Schema Functions are a small set of functions that describe the structure and type of the data to extract. Bytebot provides four of them. Two are primary functions:

  • Attribute(prompt: string): This defines a natural language prompt that specifies an element’s attribute (e.g., src or value).
  • Text(prompt: string): This defines a natural language prompt that specifies an element’s innerText.

The other two are structure-based functions, specific to tables:

  • Table(columns: Column[]): This defines a table of data, organized into columns. Table requires an array of Column function calls.
  • Column(name: string, value: Attribute | Text): This defines a named table column, where the value is an Attribute or Text function call.

Extract Schema Functions do not extract data. Instead, they define the extracted data’s structure to Bytebot’s extract function.
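To make the "describe, don't extract" idea concrete, here is a minimal sketch of how schema functions of this kind could compose. The lowercase helpers attribute, text, column, and table, and the object shapes they return, are hypothetical stand-ins written for this illustration; they are not Bytebot's actual implementation, whose return values are not documented here.

```typescript
// Hypothetical stand-ins for Attribute, Text, Column, and Table.
// The real SDK's return values may look different.
type FieldSchema =
  | { kind: "attribute"; prompt: string }
  | { kind: "text"; prompt: string };
type ColumnSchema = { kind: "column"; name: string; value: FieldSchema };
type TableSchema = { kind: "table"; columns: ColumnSchema[] };

const attribute = (prompt: string): FieldSchema => ({ kind: "attribute", prompt });
const text = (prompt: string): FieldSchema => ({ kind: "text", prompt });
const column = (name: string, value: FieldSchema): ColumnSchema => ({
  kind: "column",
  name,
  value,
});
const table = (columns: ColumnSchema[]): TableSchema => ({ kind: "table", columns });

// Building a schema only describes the data; no page is visited here.
const companySchema = table([
  column("Name", text("The name of the company")),
  column("Profile Page", attribute("The link to the company's profile")),
]);

// List the column names the schema declares.
const columnNames = companySchema.columns.map((c) => c.name);
console.log(columnNames);
```

The point of the sketch is that a schema is plain data: it can be built, inspected, and reused before any browser session exists.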

Bytebot’s Extract Function

To extract data, you need to use the BytebotClient.browser.extract(options: Object) function. Like act, the function is async and accepts the following options:

| Attribute | Required | Description |
| --- | --- | --- |
| sessionId | Required | A String that identifies the targeted remote browser session |
| schema | Required | An Attribute, Text, or Table function call. Schema will not accept a direct Column function call, as all tabular data needs to be wrapped in a Table. |
| url | Optional | A String that indicates which destination the data should be extracted from. If omitted, Bytebot will default to the browser’s last URL. |
| pageId | Optional | A String that indicates which page (virtual tab) the data should be extracted from. If omitted, Bytebot will default to the last page interacted with. |

The return object of BytebotClient.browser.extract has a few optional attributes:

| Attribute | Description |
| --- | --- |
| data | Either a string or an Array of Record objects: a string when the schema is an Attribute or Text, an Array of Records when the schema is a Table |
| pages | An array of pages available on the browser session |
| actions | An array of extract BrowserAction objects |
| error | A string detailing any error |
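Because data can be either a string or an array of records, calling code usually needs to branch on its type. The sketch below shows one way to do that. The ExtractResult type is an assumption inferred from the attributes listed above (with string-valued records), not a type exported by the SDK, and toRows is a hypothetical helper.

```typescript
// Assumed result shape, inferred from the documented attributes;
// this is not the SDK's own type.
type ExtractResult = {
  data?: string | Record<string, string>[];
  error?: string;
};

// Hypothetical helper: normalize a result into rows, or throw on error.
function toRows(result: ExtractResult): Record<string, string>[] {
  if (result.error) {
    throw new Error(`Extraction failed: ${result.error}`);
  }
  if (result.data === undefined) {
    return []; // every attribute is optional, so data may be absent
  }
  // A string means the schema was Attribute or Text; wrap it as one row.
  if (typeof result.data === "string") {
    return [{ value: result.data }];
  }
  return result.data;
}

const rows = toRows({ data: [{ Name: "Airbnb" }] });
const single = toRows({ data: "https://developer.chrome.com/" });
console.log(rows.length, single[0]?.value);
```

Wrapping a single string in a one-row array is a design choice made for this example so that downstream code handles one shape; keeping the two cases separate works equally well.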

Extracting an element’s attribute

To extract an element’s attribute, we will need to import Attribute from Bytebot’s SDK:

import { Attribute, BytebotClient } from "./src/index.ts";

Attribute is an extract schema function and takes a natural language prompt as a string. This prompt does not need action words like “Get” or “Extract”. Instead, the prompt should strictly define what attribute is to be extracted from where:

const prompt = "The link from the 'Sign in' button";

Next, call extract, using Attribute to wrap the prompt and the sessionId:

import { Attribute, BytebotClient } from "./src/index.ts";
import "dotenv/config"; // loads BYTEBOT_API_KEY from the environment

const bytebot = new BytebotClient({
  apiKey: process.env.BYTEBOT_API_KEY,
});

async function run() {
  // Start a remote browser session on the target page.
  const browser = await bytebot.browser.startSession(
    "https://developer.chrome.com/"
  );

  if (browser.sessionId) {
    const prompt = "The link from the 'Sign in' button";
    const extractActions = await bytebot.browser.extract({
      sessionId: browser.sessionId,
      schema: Attribute(prompt),
    });
    console.log("Data", extractActions.data); // extracted data!
    console.log("Actions", extractActions.actions); // browser actions!

    // Close the remote session once the extraction is done.
    await bytebot.browser.endSession(browser.sessionId);
  }
}

run().catch(console.error);

When you run this code, the console output will look similar to:

Data https://developer.chrome.com/_d/signin?continue=https%3A%2F%2Fdeveloper.chrome.com%2F&prompt=select_account
Actions [
  {
    type: 'CopyAttribute',
    xpath: '/html/body/section/devsite-header/div/div[1]/div/div/devsite-user/div/a',
    attribute: 'href'
  }
]

This BrowserAction translates to Puppeteer code that will copy the button’s link.

Extracting an element’s text

Extracting text is similar to extracting an attribute. Instead of Attribute, you will use Text. First, import Text from Bytebot’s SDK:

import { Text, BytebotClient } from "./src/index.ts";

Text is an extract schema function and takes a natural language prompt as a string. Like Attribute, the prompt does not need action words like “Get” or “Extract”. Instead, the prompt should strictly define which element’s text is to be extracted:

import { Text, BytebotClient } from "./src/index.ts";
import "dotenv/config"; // loads BYTEBOT_API_KEY from the environment

const bytebot = new BytebotClient({
  apiKey: process.env.BYTEBOT_API_KEY,
});

async function run() {
  // Start a remote browser session on the target page.
  const browser = await bytebot.browser.startSession(
    "https://developer.chrome.com/"
  );

  if (browser.sessionId) {
    const prompt = "The header text";
    const extractActions = await bytebot.browser.extract({
      sessionId: browser.sessionId,
      schema: Text(prompt),
    });
    console.log("Actions", extractActions.actions); // browser actions!

    // Close the remote session once the extraction is done.
    await bytebot.browser.endSession(browser.sessionId);
  }
}

run().catch(console.error);

When you run this code, the console output will look similar to:

Actions [
  {
    type: 'CopyText',
    xpath: '/html/body/section/section/main/devsite-content/article/div[3]/section[1]/div/header/div[1]/h2/text()[1]'
  }
]

You can only extract the entire text block from an element. Any additional text manipulation (e.g., “Only the first sentence”) must be handled after the entire element’s text has been extracted.
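As an illustration of that kind of post-processing, the sketch below trims an extracted text block down to its first sentence. firstSentence is a hypothetical helper, not part of Bytebot's SDK, and its sentence detection is a simple heuristic: it stops at the first ".", "!", or "?" that is followed by whitespace or the end of the string, so abbreviations such as "e.g." will cut the result short.

```typescript
// Hypothetical helper: return the first sentence of an extracted text block,
// or the whole trimmed text if no sentence terminator is found.
function firstSentence(text: string): string {
  const trimmed = text.trim();
  // Lazily match up to the first terminator followed by whitespace or end of input.
  const match = trimmed.match(/^[\s\S]*?[.!?](?=\s|$)/);
  return match ? match[0] : trimmed;
}

// Stand-in for a string returned in the data attribute of an extract call.
const extracted = "  Build faster. Ship sooner.  ";
console.log(firstSentence(extracted));
```

In practice you would pass the data string returned by extract to a helper like this once the call has resolved.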

Extracting tabular data

To extract tabular data, you will need to leverage both Column and Table extract schema functions, as well as Attribute and/or Text depending on what you are extracting.

Bytebot does not require that tabular data be encapsulated in an HTML table. Instead, so long as data is visually presented in some tabular or list-like format, Bytebot can extract the data as a table.

First, import Column and Table from Bytebot’s SDK. You should also import Attribute and/or Text depending on what you are extracting:

import { Table, Column, Attribute, Text, BytebotClient } from "./src/index.ts";

To create an extract schema, call Table on an array of Column function calls, where each Column function call has a name and a value set to a Text or Attribute function call. Finally, call extract on that schema:

import { Table, Column, Attribute, Text, BytebotClient } from "./src/index.ts";
import "dotenv/config"; // loads BYTEBOT_API_KEY from the environment

const bytebot = new BytebotClient({
  apiKey: process.env.BYTEBOT_API_KEY,
});

async function run() {
  // Start a remote browser session on the page that lists the companies.
  const browser = await bytebot.browser.startSession(
    "https://www.ycombinator.com/companies"
  );

  if (browser.sessionId) {
    const extractActions = await bytebot.browser.extract({
      sessionId: browser.sessionId,
      // Each Column pairs a name with a Text or Attribute prompt.
      schema: Table([
        Column("Name", Text("The name of the company")),
        Column("Description", Text("The description of the company")),
        Column("Profile Page", Attribute("The link to the company's profile")),
      ]),
    });
    console.log("Extract Actions", extractActions);

    // Close the remote session once the extraction is done.
    await bytebot.browser.endSession(browser.sessionId);
  }
}

run().catch(console.error);

When executed, this snippet will print a full table of scraped companies:

Extract Actions {
  sessionId: '625a9bd7-0b4c-4d09-95e1-ec83d02cf3e4',
  actions: [ { type: 'ExtractTable', xpath: '', rows: [Array] } ],
  pages: [ { pageId: 0, url: 'https://www.ycombinator.com/companies' } ],
  data: [
    {
      Name: 'Airbnb',
      Description: 'Book accommodations around the world.',
      'Profile Page': '/companies/airbnb'
    },
    {
      Name: 'Amplitude',
      Description: 'Digital Analytics Platform',
      'Profile Page': '/companies/amplitude'
    },
    ...
    {
      Name: 'Mixpanel',
      Description: 'Mixpanel is event analytics for builders that need answers.',
      'Profile Page': '/companies/mixpanel'
    }
  ]
}
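The Profile Page values in this output are relative paths, because the Attribute prompt returns the link attribute as written in the page. If you need absolute links, one option is to resolve each path against the page URL with the standard URL constructor. The CompanyRow type and withAbsoluteLinks helper below are hypothetical names for this sketch, and the sample row is copied from the output above.

```typescript
// Row shape matching the three columns declared in the Table schema above.
type CompanyRow = {
  Name: string;
  Description: string;
  "Profile Page": string;
};

// Hypothetical helper: resolve each relative profile path against the page URL.
function withAbsoluteLinks(rows: CompanyRow[], baseUrl: string): CompanyRow[] {
  return rows.map((row) => ({
    ...row,
    "Profile Page": new URL(row["Profile Page"], baseUrl).href,
  }));
}

const sample: CompanyRow[] = [
  {
    Name: "Airbnb",
    Description: "Book accommodations around the world.",
    "Profile Page": "/companies/airbnb",
  },
];

const resolved = withAbsoluteLinks(sample, "https://www.ycombinator.com/companies");
console.log(resolved[0]?.["Profile Page"]);
// prints "https://www.ycombinator.com/companies/airbnb"
```

Values that are already absolute URLs pass through unchanged, since the URL constructor ignores the base in that case.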

Next Steps

After creating one-off actions or extractions, you may want to check the status of the browser at arbitrary points, especially if the browser actions are handled by a separate, asynchronous loop. You can do that with the status function.