browser automation support (#668)
* Add playwright for browser automation and update compile scripts to include as external

* Add support for browser cleanup and image definition from buffers in templates

* Add HTML to Markdown conversion support in BrowserPage and update dependencies

* Add auto-playwright and sanitize-html dependencies, implement HTMLToMarkdown conversion in core package

* add HTML global type

* Add installation instructions and script metadata for browse function

* Add exec and browse method documentation to ShellHost interface

* Remove HTMLToMarkdown import from playwright CLI module
pelikhan authored Aug 28, 2024
1 parent b5dfe25 commit 6d16b0b
Showing 35 changed files with 3,128 additions and 59 deletions.
199 changes: 197 additions & 2 deletions docs/genaisrc/genaiscript.d.ts


34 changes: 23 additions & 11 deletions docs/src/content/docs/index.mdx
@@ -49,8 +49,8 @@ using the ${schema} schema.`
Extension](/genaiscript/getting-started/installation/) to get started.
</Card>
<Card title="Configure your LLMs" icon="setting">
Configure the [secrets](/genaiscript/getting-started/configuration) to access your
LLMs.
Configure the [secrets](/genaiscript/getting-started/configuration) to
access your LLMs.
</Card>
<Card title="Write your first script" icon="pencil">
Follow [Getting
@@ -190,9 +190,19 @@ The quick brown fox jumps over the lazy dog.
Grep or fuzz search [files](/genaiscript/referen/script/files)

```js wrap
const { files } = await workspace.grep(
/[a-z][a-z0-9]+/,
"**/*.md")
const { files } = await workspace.grep(/[a-z][a-z0-9]+/, "**/*.md")
```

</Card>

<Card title="Browser automation" icon="document">

Browse and scrape the web with [Playwright](/genaiscript/reference/scripts/browse).

```js
const page = await host.browse("https://...")
const table = await page.locator("table[...]").innerHTML()
def("TABLE", HTML.convertToMarkdown(table))
```

</Card>
@@ -233,17 +243,19 @@ script({ ..., model: "ollama:phi3" })
<Card title="LLM Tools" icon="setting">

Register JavaScript functions as [LLM tools](/genaiscript/reference/scripts/tools/)

```js wrap
defTool("weather", "live weather",
{ city: "Paris" }, // schema
async ({ city }) => // callback
defTool("weather", "live weather",
{ city: "Paris" }, // schema
async ({ city }) => // callback
{ ... "sunny" }
)
```

or use built-in [@agentic tools](/genaiscript/guides/agentic-tools/)

```js wrap
import { WeatherClient }
from "@agentic/weather"
import { WeatherClient } from "@agentic/weather"
defTool(new WeatherClient())
```

@@ -254,7 +266,7 @@ defTool(new WeatherClient())
Let the LLM run code in a sandboxed execution environment.

```js wrap
script({ tools: ["python_code_interpreter"]})
script({ tools: ["python_code_interpreter"] })
```

</Card>
51 changes: 51 additions & 0 deletions docs/src/content/docs/reference/scripts/browse.md
@@ -0,0 +1,51 @@
---
title: Browse
sidebar:
order: 30
---

GenAIScript provides a simplified API to interact with a headless browser using [Playwright](https://playwright.dev/).
This allows you to interact with web pages, scrape data, and automate tasks.

```js
const page = await host.browse(
"https://github.com/microsoft/genaiscript/blob/main/packages/sample/src/penguins.csv"
)
const table = page.locator('table[data-testid="csv-table"]')
const csv = parsers.HTMLToMarkdown(await table.innerHTML())
def("DATA", csv)
$`Analyze DATA.`
```
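
To make the conversion step above more concrete, here is a minimal sketch in plain JavaScript of what turning an HTML table into Markdown can look like. It is an illustration only and not GenAIScript's converter: `tableToMarkdown` is a hypothetical helper, it relies on simple regular expressions, and it assumes a flat table whose first row is the header. It does not decode HTML entities or handle nested tables.

```js
// Illustrative only: a tiny HTML-table-to-Markdown converter.
// `tableToMarkdown` is a hypothetical helper, not part of GenAIScript or Playwright.
function tableToMarkdown(html) {
    // Collect every <tr> row, then every <th>/<td> cell inside that row.
    const rows = [...html.matchAll(/<tr\b[^>]*>([\s\S]*?)<\/tr>/gi)].map((row) =>
        [...row[1].matchAll(/<t[hd]\b[^>]*>([\s\S]*?)<\/t[hd]>/gi)].map((cell) =>
            cell[1]
                .replace(/<[^>]+>/g, "") // drop tags nested inside the cell
                .replace(/\|/g, "\\|") // escape pipes so they do not split columns
                .trim()
        )
    )
    if (rows.length === 0) return ""

    // Assumption: the first row is the header row.
    const [header, ...body] = rows
    const line = (cells) => `| ${cells.join(" | ")} |`
    const separator = line(header.map(() => "---"))
    return [line(header), separator, ...body.map(line)].join("\n")
}

const html = `<table>
  <tr><th>species</th><th>island</th></tr>
  <tr><td>Adelie</td><td>Torgersen</td></tr>
</table>`
const markdown = tableToMarkdown(html)
console.log(markdown)
```

Markdown tables are usually far more compact than the equivalent HTML, which is why the example above converts the scraped table before passing it to `def`.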

## Installation

You will need to install Playwright locally before using the `browse` function.

```bash
npx playwright install-deps chromium
```

## `host.browse`

This function launches a new browser instance and optionally navigates to the given URL.

```js
const page = await host.browse(url)
```

You can configure a number of options for the browser instance:

```js
const page = await host.browse(url, { incognito: true })
```

## (Advanced) Native Playwright APIs

The `page` instance returned is a native [Playwright Page](https://playwright.dev/docs/api/class-page) object.
You can import `playwright` and cast the instance back to the native Playwright object.

```js
import { Page } from "playwright"

const page = await host.browse(url) as Page
```