Running builtin tests from vscode (#395)
* adding tests, better assertions

* updated buffering options

* add another test

* added test ui

* don't use array if not needed

* add server/client

* server impl

* test node 20, 22

* updated promptfoo

* [genai] front matter update

* so

* retry after parsing

* better help to configure keys

* indent body request

* handle multiple scripts

* start viewer

* better handling of starting the prompt

* fix cli

* updated commands

* more logging

* more docs

* more docs

* more docs updates

* updated docs, added javascript

* updated instructions

* updated landing page
pelikhan authored Apr 29, 2024
1 parent 35fb457 commit b506497
Showing 37 changed files with 936 additions and 259 deletions.
5 changes: 4 additions & 1 deletion .github/workflows/build.yml
@@ -10,14 +10,17 @@ on:
jobs:
build:
runs-on: ubuntu-latest
strategy:
matrix:
node-version: [20, 22]
steps:
- uses: actions/checkout@v3
with:
submodules: "recursive"
fetch-depth: 10
- uses: actions/setup-node@v3
with:
node-version: "20"
node-version: "${{ matrix.node-version }}"
cache: yarn
- run: yarn install --frozen-lockfile
- name: typecheck
10 changes: 10 additions & 0 deletions docs/genaisrc/frontmatter.genai.js
@@ -7,6 +7,16 @@ script({
maxTokens: 2000,
temperature: 0,
model: "gpt-4",
tests: [
{
files: "src/content/docs/reference/scripts/aici.md",
rubrics: [
"is a generated front matter",
"The generated frontmatter is SEO optimized.",
],
keywords: "aici",
},
],
})

defFileMerge((fn, label, before, generated) => {
52 changes: 41 additions & 11 deletions docs/genaisrc/genaiscript.d.ts

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

55 changes: 28 additions & 27 deletions docs/package.json
@@ -1,29 +1,30 @@
{
"name": "docs",
"type": "module",
"private": true,
"version": "1.22.1",
"license": "MIT",
"scripts": {
"dev": "astro dev --host",
"start": "astro dev --host",
"check": "astro check",
"build": "astro build",
"build:asw": "rm -Rf distasw && mkdir distasw && touch distasw/index.html && mkdir distasw/genaiscript && cp -r dist/* distasw/genaiscript",
"preview": "astro preview",
"astro": "astro",
"genai:frontmatter": "node .genaiscript/genaiscript.cjs batch frontmatter src/**/*.md --apply-edits",
"genai:technical": "node .genaiscript/genaiscript.cjs batch technical src/**/*.md --apply-edits",
"genai:alt-text": "node scripts/image-alt-text.mjs"
},
"dependencies": {
"@astrojs/check": "^0.5.9",
"@astrojs/starlight": "^0.21.1",
"astro": "^4.5.3",
"sharp": "0.32.6",
"typescript": "5.4.5"
},
"devDependencies": {
"zx": "^8.0.2"
}
"name": "docs",
"type": "module",
"private": true,
"version": "1.22.1",
"license": "MIT",
"scripts": {
"dev": "astro dev --host",
"start": "astro dev --host",
"check": "astro check",
"build": "astro build",
"build:asw": "rm -Rf distasw && mkdir distasw && touch distasw/index.html && mkdir distasw/genaiscript && cp -r dist/* distasw/genaiscript",
"preview": "astro preview",
"astro": "astro",
"genai:test": "node ../packages/cli/built/genaiscript.cjs test src/**/*.md",
"genai:frontmatter": "node ../packages/cli/built/genaiscript.cjs batch frontmatter src/**/*.md --apply-edits",
"genai:technical": "node ../packages/cli/built/genaiscript.cjs batch technical src/**/*.md --apply-edits",
"genai:alt-text": "node scripts/image-alt-text.mjs"
},
"dependencies": {
"@astrojs/check": "^0.5.9",
"@astrojs/starlight": "^0.21.1",
"astro": "^4.5.3",
"sharp": "0.32.6",
"typescript": "5.4.5"
},
"devDependencies": {
"zx": "^8.0.2"
}
}
22 changes: 15 additions & 7 deletions docs/src/content/docs/getting-started/testing-scripts.mdx
@@ -1,12 +1,13 @@
---
title: Testing scripts
sidebar:
order: 4.6
order: 4.6
description: Learn how to declare and run tests for your scripts to ensure their correctness and reliability.
keywords: testing, scripts, validation, GenAIScript CLI, automation
---

import providerSrc from "../../../../../packages/core/src/genaiscript-api-provider.mjs?raw"
import { Code } from '@astrojs/starlight/components';
import { Code } from "@astrojs/starlight/components"

It is possible to declare [tests](/genaiscript/reference/scripts/tests) in the `script` function
to validate the output of the script.
@@ -18,18 +19,25 @@ The tests are added as an array of objects in the `tests` key of the `script` fu
```js title="proofreader.genai.js" wrap
script({
...,
tests: [{
tests: {
files: "src/rag/testcode.ts",
rubrics: "is a report with a list of issues",
facts: `The report says that the input string
facts: `The report says that the input string
should be validated before use.`,
}]
}
})
```
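
The hunk above changes `tests` from an array to a single object ("don't use array if not needed"). The following is a minimal sketch of how a consumer could accept both shapes. The `normalizeTests` helper is hypothetical and is not taken from the GenAIScript source; it only illustrates the idea.

```javascript
// Hypothetical helper (an assumption, not part of GenAIScript):
// accepts a single test object or an array of tests and always returns an array.
function normalizeTests(tests) {
    if (tests === undefined || tests === null) return []
    return Array.isArray(tests) ? tests : [tests]
}

// A single test object, as in the updated example.
const single = normalizeTests({
    files: "src/rag/testcode.ts",
    rubrics: "is a report with a list of issues",
})

// An array of tests, as in the previous example.
const many = normalizeTests([{ files: "a.ts" }, { files: "b.ts" }])

console.log(single.length, many.length)
```

With a helper like this, a script author can write one test without wrapping it in an array and still add more tests later.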

## Running tests

You can use the cli to run the tests of your script and open the results.
### Visual Studio Code

- Open the script to test, in this example `proofreader.genai.js`.
- Right-click on the script and select **Run GenAIScript Tests**.

### Command Line

Run this command from the workspace root.

```sh
npx genaiscript test proofreader --view
@@ -40,4 +48,4 @@ npx genaiscript test proofreader --view
Currently, promptfoo treats the script source as the prompt text. Therefore, one cannot use assertions
that also rely on the input text, such as `answer_relevance`.

- Read more about [tests](/genaiscript/reference/scripts/tests) in the reference.
- Read more about [tests](/genaiscript/reference/scripts/tests) in the reference.
34 changes: 27 additions & 7 deletions docs/src/content/docs/index.mdx
@@ -15,25 +15,24 @@ hero:
link: https://github.com/microsoft/genaiscript/
icon: github
---
import { Image} from "astro:assets"

import { Image } from "astro:assets"
import { Card, CardGrid } from "@astrojs/starlight/components"
import { FileTree } from "@astrojs/starlight/components"

import vscodeSrc from "../../../public/images/visual-studio-code.png"

import debuggerSrc from '../../assets/debugger.png';
import debuggerSrc from "../../assets/debugger.png"
import debuggerAlt from "../../assets/debugger.png.txt?raw"

import sarifSrc from "../../assets/tla-ai-linter.png"
import sarifAlt from "../../assets/tla-ai-linter.png.txt?raw"


```js wrap title="extract-data.genai.js"
// define the context
def("FILE", env.files, { endsWith: ".pdf" })
// structure the data
const schema = defSchema("DATA",
{ type: "array", items: { type: "string" } })
const schema = defSchema("DATA", { type: "array", items: { type: "string" } })
// assign the task
$`Analyze FILE and extract data to JSON
using the ${schema} schema.`
@@ -61,7 +60,11 @@ using the ${schema} schema.`
</Card>
</CardGrid>

<Image src={vscodeSrc} alt="A screenshot of VSCode with a genaiscript opened" loading="lazy" />
<Image
src={vscodeSrc}
alt="A screenshot of VSCode with a genaiscript opened"
loading="lazy"
/>

## Features

@@ -82,12 +85,29 @@ $`Summarize FILE. Today is ${new Date()}.`

<Card title="Fast Development Loop" icon="rocket">

Edit, [debug](/genaiscript/getting-started/debugging-scripts/), [run](/genaiscript/getting-started/running-scripts/) your scripts in [Visual Studio Code](/genaiscript/getting-started/installation).
Edit, [Debug](/genaiscript/getting-started/debugging-scripts/), [Run](/genaiscript/getting-started/running-scripts/),
and [Test](/genaiscript/getting-started/testing-scripts/) your scripts in [Visual Studio Code](/genaiscript/getting-started/installation)
or with a [command line](/genaiscript/getting-started/installation).

<Image src={debuggerSrc} alt={debuggerAlt} loading="lazy" />

</Card>

<Card title="Builtin LLM Tests" icon="star">

Build reliable prompts using [LLM tests](/genaiscript/reference/scripts/tests)
powered by [promptfoo](https://promptfoo.dev/).

```js wrap
script({ ..., tests: {
files: "penguins.csv",
rubrics: "is a data analysis report",
facts: "The data refers to the penguin population in Antarctica.",
}})
```

</Card>

<Card title="Reuse and Share Scripts" icon="star">

Scripts are [files](/genaiscript/reference/scripts/)! They can be versioned, shared, forked, ...
6 changes: 2 additions & 4 deletions docs/src/content/docs/reference/cli/commands.md
@@ -88,12 +88,12 @@ Options:
## `test`

```
Usage: genaiscript test [options] [script]
Usage: genaiscript test [options] [script...]
Runs the tests for scripts
Arguments:
script Script id. If not provided, all scripts are
script Script ids. If not provided, all scripts are
tested
Options:
@@ -106,8 +106,6 @@ Options:
-tp, --test-provider <string> test provider
--view open test viewer once tests are executed
--no-cache disable LLM result cache
--no-run do not run the tests
--no-write Do not write results to promptfoo directory
-v, --verbose verbose output
-h, --help display help for command
```
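
The usage text above now accepts several script ids (`[script...]`). As a rough sketch, the command line could be assembled from Node.js as shown below. The `buildTestArgs` helper and its option names are hypothetical; only flags that appear in the help text (`--view`, `--no-cache`, `--verbose`) are used, and the script ids are placeholders.

```javascript
// Hypothetical helper (an assumption, not part of the GenAIScript CLI):
// builds the argument list for `npx genaiscript test [script...]`.
function buildTestArgs(scripts = [], { view = false, cache = true, verbose = false } = {}) {
    const args = ["genaiscript", "test", ...scripts]
    if (view) args.push("--view") // open the test viewer after the run
    if (!cache) args.push("--no-cache") // disable the LLM result cache
    if (verbose) args.push("--verbose") // verbose output
    return args
}

// Placeholder script ids; with no ids, all scripts are tested.
const args = buildTestArgs(["proofreader", "frontmatter"], { view: true })
console.log(["npx", ...args].join(" "))
```

The resulting array can be passed to `child_process.spawn("npx", args)` if the tests should be launched from a Node.js script rather than a shell.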