support for esm scripts, with import... (#320)
* adding import feature

* do import

* support for .mjs files

* retreival -> retrieval

* add test import

* reduce file size

* add docs

* path resolution
pelikhan authored Apr 3, 2024
1 parent ff9d285 commit 1929363
Showing 53 changed files with 376 additions and 251 deletions.
2 changes: 1 addition & 1 deletion .gitignore
@@ -11,7 +11,7 @@ packages/sample/*.slides.md
 vscode-extension-samples/
 .DS_Store
 .genaiscript/temp
-.genaiscript/retreival
+.genaiscript/retrieval
 results/
 .genaiscript/
 applications/EdgePeeringAI
applications/EdgePeeringAI
2 changes: 1 addition & 1 deletion .vscode/launch.json
@@ -10,7 +10,7 @@
 "type": "node",
 "cwd": "${workspaceFolder}",
 "preLaunchTask": "npm: compile-cli",
-"args": ["parse", "code", "packages/sample/src/tla/EWD998PCal.tla", "(interface_declaration) @i"]
+"args": ["run", "packages/sample/genaisrc/summarize-import.genai.mjs", "packages/sample/src/questions.md"]
 },
 {
 "name": "Run - sample",
2 changes: 2 additions & 0 deletions .vscode/settings.json
@@ -1,7 +1,9 @@
{
"devicescript.devtools.autoStart": false,
"cSpell.words": [
"AICI",
"genai",
"Genaiscript",
"gpspec",
"gpspecs",
"gptool",
8 changes: 4 additions & 4 deletions docs/genaisrc/genaiscript.d.ts

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion docs/src/content/docs/getting-started/installation.mdx
@@ -81,7 +81,7 @@ to bug fixes earlier than the marketplace release.
 - ...
 - .genaiscript/ folder created by the extension to store supporting files
 - cache/ various cache files
-- retreival/ retreival database caches
+- retrieval/ retrieval database caches
 - ... supporting files
 - **genaiscript.vsix**
2 changes: 1 addition & 1 deletion docs/src/content/docs/index.mdx
@@ -133,7 +133,7 @@ const { pages } = await parsers.PDF(env.files[0])

 ```js wrap
 // embedding vector index and search
-const { files } = await retreival.search("cats", env.files)
+const { files } = await retrieval.search("cats", env.files)
 ```

 </Card>
16 changes: 8 additions & 8 deletions docs/src/content/docs/reference/cli/commands.md
@@ -94,10 +94,10 @@ Options:
 -h, --help display help for command
 ```

-## `retreival`
+## `retrieval`

 ```
-Usage: genaiscript retreival [options] [command]
+Usage: genaiscript retrieval [options] [command]
 RAG support
@@ -111,10 +111,10 @@ Commands:
 help [command] display help for command
 ```

-### `retreival index`
+### `retrieval index`

 ```
-Usage: genaiscript retreival index [options] <file...>
+Usage: genaiscript retrieval index [options] <file...>
 Index a set of documents
@@ -131,10 +131,10 @@ Options:
 -h, --help display help for command
 ```

-### `retreival search`
+### `retrieval search`

 ```
-Usage: genaiscript retreival search [options] <query> [files...]
+Usage: genaiscript retrieval search [options] <query> [files...]
 Search index
@@ -145,10 +145,10 @@ Options:
 -h, --help display help for command
 ```

-### `retreival clear`
+### `retrieval clear`

 ```
-Usage: genaiscript retreival clear [options]
+Usage: genaiscript retrieval clear [options]
 Clear index to force re-indexing
4 changes: 2 additions & 2 deletions docs/src/content/docs/reference/scripts/embeddings-search.md
@@ -9,7 +9,7 @@ keywords: embeddings search, similarity search, vector database, indexing, LLM A
 The `retrieval.search` indexes the input files using [embeddings](https://platform.openai.com/docs/guides/embeddings) into a vector database that can be used for similarity search. This is commonly referred to as Retrieval Augmented Generation (RAG).

 ```js
-const { files, fragments } = await retreival.search("keyword", env.files)
+const { files, fragments } = await retrieval.search("keyword", env.files)
 ```

 The returned `files` object contains the file with
@@ -18,7 +18,7 @@ concatenated embeddings, and the `fragments` object contains each individual fil
 You can use the result of `files` in the `def` function.

 ```js
-const { files } = await retreival.search("keyword", env.files)
+const { files } = await retrieval.search("keyword", env.files)
 def("FILE", files)
 ```
2 changes: 1 addition & 1 deletion docs/src/content/docs/reference/scripts/functions.md
@@ -93,7 +93,7 @@ defFunction("current_weather", ...)
 Let's illustrate how functions come together with a question answering script.

 In the script below, we add the ``system.web_search` which registers the `web_search` function. This function
-will call into `retreival.webSearch` as needed.
+will call into `retrieval.webSearch` as needed.

 ```js file="answers.genai.js"
 script({
67 changes: 67 additions & 0 deletions docs/src/content/docs/reference/scripts/imports.mdx
@@ -0,0 +1,67 @@
+---
+title: Imports
+sidebar:
+  order: 20
+---
+
+import { Steps } from "@astrojs/starlight/components"
+import { FileTree } from "@astrojs/starlight/components"
+
+By default, the scripts cannot import modules (they are evaled) and are expected to execute the code directly.
+
+You can add support for imports by following these steps.
+
+## Converting to a module
+
+<Steps>
+
+<ol>
+
+<li>
+
+Rename the `.genai.js` file
+to [module file](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Modules#aside_%E2%80%94_.mjs_versus_.js) `.genai.mjs`.
+
+<FileTree>
+
+- genaisrc/
+    - **poem.genai.mjs** // .js -> .mjs
+
+</FileTree>
+
+</li>
+
+<li>
+
+Wrap the script code in a function and make it the default export. You can leave `script` in the main scope.
+
+```js title="poem.genai.mjs" "export default async function() {" "}"
+script(...)
+export default async function() {
+    $`Write a poem.`
+}
+```
+
+</li>
+
+</ol>
+
+</Steps>
+
+## Imports
+
+Once the file is converted, you can use static or dynamic imports as any other module file.
+
+```js
+import { parse } from "ini"
+...
+export default async function () {
+    // static import
+    const res = parse("x = 1\ny = 2")
+    console.log(res)
+
+    // dynamic import
+    const { stringify } = await import("ini")
+    console.log(stringify(res))
+}
+```
14 changes: 7 additions & 7 deletions docs/src/content/docs/reference/scripts/retreival.md
@@ -1,25 +1,25 @@
 ---
-title: Retreival
+title: Retrieval
 sidebar:
   order: 10
 ---

-GenAIScript provides various utilities to retreive content and augment the prompt. This technique is typically referred as **RAG** (Retreival-Augmentation-Generation) in the literature. GenAIScript uses [llamaindex-ts](https://ts.llamaindex.ai/api/classes/VectorIndexRetriever) which supports many vector database vendors.
+GenAIScript provides various utilities to retreive content and augment the prompt. This technique is typically referred as **RAG** (Retrieval-Augmentation-Generation) in the literature. GenAIScript uses [llamaindex-ts](https://ts.llamaindex.ai/api/classes/VectorIndexRetriever) which supports many vector database vendors.

 ## Search

 The `retreive.search` performs a embeddings search to find the most similar documents to the prompt. The search is performed using the [llamaindex-ts](https://ts.llamaindex.ai/api/classes/VectorIndexRetriever) library.

 ```js
-const { files, fragments } = await retreival.search("cat dog", env.files)
+const { files, fragments } = await retrieval.search("cat dog", env.files)
 def("RAG", files)
 ```

 The `files` variable contains a list of files, with concatenated fragments, that are most similar to the prompt. The `fragments` variable contains a list of fragments from the files that are most similar to the prompt.

 ### Indexing

-By default, the retreival uses [OpenAI text-embedding-ada-002](https://ts.llamaindex.ai/modules/embeddings/) embeddings. The first search might be slow as the files get indexed for the first time.
+By default, the retrieval uses [OpenAI text-embedding-ada-002](https://ts.llamaindex.ai/modules/embeddings/) embeddings. The first search might be slow as the files get indexed for the first time.

 You can index your project using the [CLI](/genaiscript/reference/cli).

@@ -29,7 +29,7 @@ genaiscript retreive index "src/**"

 :::tip

-You can simulate an indexing command in Visual Studio Code by right-clicking on a folder and selecting **Retreival** > **Index**. Once indexed, you can test search using **Retreival** > **Search**.
+You can simulate an indexing command in Visual Studio Code by right-clicking on a folder and selecting **Retrieval** > **Index**. Once indexed, you can test search using **Retrieval** > **Search**.

 :::

@@ -39,10 +39,10 @@ You can control the chunk size, overlap and model used for index files. You can

 ## Web Search

-The `retreival.webSearch` performs a web search using a search engine API. You will need to provide API keys for the search engine you want to use.
+The `retrieval.webSearch` performs a web search using a search engine API. You will need to provide API keys for the search engine you want to use.

 ```js
-const { webPages } = await retreival.webSearch("cat dog")
+const { webPages } = await retrieval.webSearch("cat dog")
 def("RAG", webPages)
 ```
6 changes: 3 additions & 3 deletions docs/src/content/docs/reference/scripts/web-search.md
@@ -6,7 +6,7 @@ sidebar:
   order: 15
 ---

-The `retreival.webSearch` executes a web search using the Bing Web Search API.
+The `retrieval.webSearch` executes a web search using the Bing Web Search API.

 ## Web Pages

@@ -15,7 +15,7 @@ as an array of files, similarly to `env.files`. The content contains
 the summary snippet returned by the search engine.

 ```js
-const { webPages } = await retreival.webSearch("microsoft")
+const { webPages } = await retrieval.webSearch("microsoft")
 def("PAGES", webPages)
 ```

@@ -31,7 +31,7 @@ BING_SEARCH_API_KEY="your-api-key"

 ## Function

-Add the [system.web_search](https://github.com/microsoft/genaiscript/blob/main/packages/core/src/genaisrc/system.web_search.genai.js) system script to register a [function](/genaiscript/reference/scripts/functions) that uses `retreival.webSearch`.
+Add the [system.web_search](https://github.com/microsoft/genaiscript/blob/main/packages/core/src/genaisrc/system.web_search.genai.js) system script to register a [function](/genaiscript/reference/scripts/functions) that uses `retrieval.webSearch`.

 ```js
 script({
10 changes: 5 additions & 5 deletions package.json
@@ -37,12 +37,12 @@
 "test:front-matter": "node packages/cli/built/genaiscript.cjs run front-matter SUPPORT.md",
 "test:pdf": "node packages/cli/built/genaiscript.cjs parse pdf packages/sample/src/rag/loremipsum.pdf",
 "test:docx": "node packages/cli/built/genaiscript.cjs parse docx packages/sample/src/rag/Document.docx",
-"test:index": "node packages/cli/built/genaiscript.cjs retreival index \"packages/sample/src/rag/*\"",
-"test:index:summary": "node packages/cli/built/genaiscript.cjs retreival index \"packages/sample/src/rag/*\" --summary",
-"test:search": "node packages/cli/built/genaiscript.cjs retreival search lorem \"packages/sample/src/rag/*\"",
-"test:search:summary": "node packages/cli/built/genaiscript.cjs retreival search lorem \"packages/sample/src/rag/*\" --summary",
+"test:index": "node packages/cli/built/genaiscript.cjs retrieval index \"packages/sample/src/rag/*\"",
+"test:index:summary": "node packages/cli/built/genaiscript.cjs retrieval index \"packages/sample/src/rag/*\" --summary",
+"test:search": "node packages/cli/built/genaiscript.cjs retrieval search lorem \"packages/sample/src/rag/*\"",
+"test:search:summary": "node packages/cli/built/genaiscript.cjs retrieval search lorem \"packages/sample/src/rag/*\" --summary",
 "test:codequery": "node packages/cli/built/genaiscript.cjs code query packages/core/src/progress.ts \"(interface_declaration) @i\"",
-"test:tokens": "node packages/cli/built/genaiscript.cjs retreival tokens packages/sample/src/rag/*",
+"test:tokens": "node packages/cli/built/genaiscript.cjs retrieval tokens packages/sample/src/rag/*",
 "serve": "node packages/cli/built/genaiscript.cjs serve",
 "docs": "cd docs && ./node_modules/.bin/astro dev --host",
 "build:docs": "cd docs && yarn build && yarn build:asw",
11 changes: 8 additions & 3 deletions packages/cli/src/build.ts
@@ -1,4 +1,9 @@
-import { GENAI_EXT, host, parseProject } from "genaiscript-core"
+import {
+    GENAI_JS_GLOB,
+    GPSPEC_GLOB,
+    host,
+    parseProject,
+} from "genaiscript-core"

 export async function buildProject(options?: {
     toolFiles?: string[]
@@ -9,8 +14,8 @@ export async function buildProject(options?: {
     const {
         toolFiles,
         specFiles,
-        toolsPath = "**/*" + GENAI_EXT,
-        specsPath = "**/*.gpspec.md",
+        toolsPath = GENAI_JS_GLOB,
+        specsPath = GPSPEC_GLOB,
     } = options || {}

     const gpspecFiles = specFiles?.length