test group filter, run retries #638

Merged
5 commits merged on Aug 21, 2024
Changes from 3 commits
5 changes: 4 additions & 1 deletion docs/src/content/docs/reference/cli/commands.md
@@ -44,10 +44,11 @@
-se, --seed <number> seed for the run
-em, --embeddings-model <string> embeddings model for the run
--no-cache disable LLM result cache
-cn, --cache-name <name> custom cache file name

Check failure on line 47 in docs/src/content/docs/reference/cli/commands.md

GitHub Actions / build

Incorrect indentation before the csv-separator option.
pelikhan marked this conversation as resolved.
- --cs, --csv-separator <string> csv separator (default: "\t")
+ -cs, --csv-separator <string> csv separator (default: "\t")
-ae, --apply-edits apply file edits
--vars <namevalue...> variables, as name=value, stored in env.vars
-rr, --run-retry <number> number of retries for the entire run

Check failure on line 51 in docs/src/content/docs/reference/cli/commands.md

GitHub Actions / build

The run-retry option is missing a description.
pelikhan marked this conversation as resolved.
-h, --help display help for command
```

@@ -89,6 +90,8 @@
-v, --verbose verbose output
-pv, --promptfoo-version [version] promptfoo version, default is 0.78.0
-os, --out-summary <file> append output summary in file
--groups <groups...> groups to include or exclude. Use :!
prefix to exclude

Check failure on line 94 in docs/src/content/docs/reference/cli/commands.md

GitHub Actions / build

The description for the groups option is ambiguous and could be clarified.
pelikhan marked this conversation as resolved.
-h, --help display help for command
```

7 changes: 6 additions & 1 deletion packages/cli/src/cli.ts
@@ -142,12 +142,13 @@ export async function cli() {
)
pelikhan marked this conversation as resolved.
.option("--no-cache", "disable LLM result cache")
.option("-cn, --cache-name <name>", "custom cache file name")
- .option("--cs, --csv-separator <string>", "csv separator", "\t")
+ .option("-cs, --csv-separator <string>", "csv separator", "\t")
pelikhan marked this conversation as resolved.
.option("-ae, --apply-edits", "apply file edits")
.option(
"--vars <namevalue...>",
"variables, as name=value, stored in env.vars"
)
.option("-rr, --run-retry <number>", "number of retries for the entire run")

The option "-rr, --run-retry <number>" is missing a description.

generated by pr-review-commit missing_argument

.action(runScriptWithExitCode)

const test = program.command("test")
@@ -174,6 +175,10 @@ export async function cli() {
`promptfoo version, default is ${PROMPTFOO_VERSION}`
)
.option("-os, --out-summary <file>", "append output summary in file")
.option(
"--groups <groups...>",
"groups to include or exclude. Use :! prefix to exclude"
)
.action(scriptsTest)

The option "--groups <groups...>" is missing a description.

generated by pr-review-commit missing_argument


test.command("view")
14 changes: 13 additions & 1 deletion packages/cli/src/run.ts
@@ -25,6 +25,7 @@
ANNOTATION_ERROR_CODE,
GENAI_ANY_REGEX,
TRACE_CHUNK,
UNRECOVERABLE_ERROR_CODES,
} from "../../core/src/constants"
import { isCancelError, errorMessage } from "../../core/src/error"
import { Fragment, GenerationResult } from "../../core/src/generation"
@@ -47,6 +48,7 @@
normalizeInt,
logVerbose,
logError,
delay,
} from "../../core/src/util"
import { YAMLStringify } from "../../core/src/yaml"
import { PromptScriptRunOptions } from "../../core/src/server/messages"
@@ -79,7 +81,17 @@
TraceOptions &
CancellationOptions
) {
- const { exitCode } = await runScript(scriptId, files, options)
+ const runRetry = Math.max(1, normalizeInt(options.runRetry) || 1)
+ let exitCode = -1
+ for (let r = 0; r < runRetry; ++r) {
+ const res = await runScript(scriptId, files, options)
+ exitCode = res.exitCode
+ if (UNRECOVERABLE_ERROR_CODES.includes(exitCode)) break
+
+ const delayMs = 2000 * Math.pow(2, r)
+ console.error(`run failed, retry #${r + 1}/${runRetry} in ${delayMs}ms`)
+ await delay(delayMs)
+ }

Check failure on line 94 in packages/cli/src/run.ts

GitHub Actions / build

The retry logic in the `runScriptWithExitCode` function could potentially lead to an infinite loop if the `runRetry` option is set to a negative number. Consider adding a check to ensure that `runRetry` is a positive integer.
pelikhan marked this conversation as resolved.
process.exit(exitCode)
}
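The retry loop in this hunk can be read as the sketch below. It is an illustration under stated assumptions, not the merged code: `runWithRetry`, `clampRetries`, `backoffMs` and the injectable `sleep` are hypothetical names, and `normalizeInt` is approximated with `parseInt`. The `Math.max(1, ...)` clamp in the diff already maps missing, zero or negative `runRetry` values to a single attempt, which the sketch keeps. One deliberate difference: the sketch does not wait after the final attempt.

```typescript
// Hypothetical stand-ins; none of these names come from the code base.
type RunFn = () => Promise<{ exitCode: number }>

// Approximates `Math.max(1, normalizeInt(options.runRetry) || 1)` from the diff:
// missing, non-numeric, zero and negative inputs all collapse to one attempt.
function clampRetries(value: string | number | undefined): number {
    const n = typeof value === "number" ? value : parseInt(value ?? "", 10)
    return Math.max(1, Number.isFinite(n) ? Math.floor(n) : 1)
}

// Same schedule as the diff: 2000ms, 4000ms, 8000ms, ...
function backoffMs(attempt: number, baseMs = 2000): number {
    return baseMs * Math.pow(2, attempt)
}

async function runWithRetry(
    run: RunFn,
    runRetry: string | number | undefined,
    stopCodes: readonly number[],
    sleep: (ms: number) => Promise<void> = (ms) =>
        new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<number> {
    const attempts = clampRetries(runRetry)
    let exitCode = -1
    for (let r = 0; r < attempts; ++r) {
        exitCode = (await run()).exitCode
        // Stop on any code listed in stopCodes (success or a non-retryable error).
        if (stopCodes.includes(exitCode)) break
        // Assumption: no point in sleeping once the attempts are used up.
        if (r === attempts - 1) break
        const delayMs = backoffMs(r)
        console.error(`attempt ${r + 1}/${attempts} failed, retrying in ${delayMs}ms`)
        await sleep(delayMs)
    }
    return exitCode
}
```

Passing a no-op `sleep` keeps the loop fast in tests, while the default uses `setTimeout`.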

3 changes: 3 additions & 0 deletions packages/cli/src/test.ts
@@ -27,6 +27,7 @@ import {
logInfo,
logVerbose,
delay,
tagFilter,
} from "../../core/src/util"
import { YAMLStringify } from "../../core/src/yaml"
import {
@@ -83,6 +84,7 @@ export async function runPromptScriptTests(
const scripts = prj.templates
.filter((t) => arrayify(t.tests)?.length)
.filter((t) => !ids?.length || ids.includes(t.id))
.filter((t) => tagFilter(options?.groups, t.group))
if (!scripts.length)
return {
ok: false,
@@ -217,6 +219,7 @@ export async function scriptsTest(
promptfooVersion?: string
outSummary?: string
testDelay?: string
groups?: string[]
}
) {
const { status, value = [] } = await runPromptScriptTests(ids, options)
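To see what the new `.filter((t) => tagFilter(options?.groups, t.group))` step selects, the sketch below copies `tagFilter` from the `packages/core/src/util.ts` hunk of this same change (only the parameter types are loosened) and applies it to a made-up script list. The `ScriptInfo` type, the script ids and the `selectScripts` helper are hypothetical.

```typescript
// Body copied from the util.ts hunk of this diff; only the parameter types are loosened.
function tagFilter(tags: string[] | undefined, tag: string | undefined): boolean {
    if (!tags?.length || !tag) return true
    const ltag = tag.toLocaleLowerCase()
    for (const t of tags) {
        const lt = t.toLocaleLowerCase()
        if (lt.startsWith(":!") && ltag.startsWith(lt.slice(2))) return false
        else if (ltag.startsWith(t)) return true
    }
    return false
}

// Hypothetical script list; ids and groups are invented for illustration.
interface ScriptInfo {
    id: string
    group?: string
}

const scripts: ScriptInfo[] = [
    { id: "describe-image", group: "vision" },
    { id: "summarize", group: "text" },
    { id: "poem" }, // no group
]

function selectScripts(groups: string[] | undefined): string[] {
    return scripts.filter((s) => tagFilter(groups, s.group)).map((s) => s.id)
}

console.log(selectScripts(undefined)) // describe-image, summarize, poem
console.log(selectScripts(["vision"])) // describe-image, poem
console.log(selectScripts([":!vision"])) // poem
```

With the function as written in the diff, a script without a group always passes, and an exclude-only filter such as `:!vision` also drops grouped scripts that match no include entry, because the loop falls through to `return false`.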
9 changes: 9 additions & 0 deletions packages/core/src/constants.ts
@@ -76,6 +76,7 @@
export const EXEC_MAX_BUFFER = 64
export const DOT_ENV_FILENAME = ".env"

export const SUCCESS_ERROR_CODE = 0
export const UNHANDLED_ERROR_CODE = -1
export const ANNOTATION_ERROR_CODE = -2
export const FILES_NOT_FOUND_ERROR_CODE = -3
@@ -85,6 +86,14 @@
export const USER_CANCELLED_ERROR_CODE = -7
export const CONFIGURATION_ERROR_CODE = -8

export const UNRECOVERABLE_ERROR_CODES = Object.freeze([
0,
CONNECTION_CONFIGURATION_ERROR_CODE,
USER_CANCELLED_ERROR_CODE,
FILES_NOT_FOUND_ERROR_CODE,
ANNOTATION_ERROR_CODE,
])

Check failure on line 95 in packages/core/src/constants.ts

GitHub Actions / build

The `UNRECOVERABLE_ERROR_CODES` constant includes `0` which is typically a success error code. This could lead to incorrect behavior if a successful operation is treated as an unrecoverable error.

generated by pr-review-commit unrecoverable_error_codes
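
In the retry loop of `packages/cli/src/run.ts`, membership in `UNRECOVERABLE_ERROR_CODES` is the only condition that ends the loop early, so the `0` entry is what stops retrying after a successful run. One possible way to address the comment while keeping that behavior is to name the success case separately, as in the sketch below. `classifyExitCode` and `RetryDecision` are hypothetical names; the numeric values are the ones visible in this diff, and `CONNECTION_CONFIGURATION_ERROR_CODE` is left out because its value is not shown here.

```typescript
// Values as they appear in this diff.
const SUCCESS_ERROR_CODE = 0
const UNHANDLED_ERROR_CODE = -1
const ANNOTATION_ERROR_CODE = -2
const FILES_NOT_FOUND_ERROR_CODE = -3
const USER_CANCELLED_ERROR_CODE = -7

// Sketch only: success is no longer listed among the unrecoverable codes.
// CONNECTION_CONFIGURATION_ERROR_CODE is omitted; its value is not visible in the diff.
const UNRECOVERABLE_ERROR_CODES: readonly number[] = Object.freeze([
    USER_CANCELLED_ERROR_CODE,
    FILES_NOT_FOUND_ERROR_CODE,
    ANNOTATION_ERROR_CODE,
])

// Hypothetical helper, not part of the code base.
type RetryDecision = "done" | "give-up" | "retry"

function classifyExitCode(exitCode: number): RetryDecision {
    if (exitCode === SUCCESS_ERROR_CODE) return "done"
    if (UNRECOVERABLE_ERROR_CODES.includes(exitCode)) return "give-up"
    return "retry"
}

console.log(classifyExitCode(SUCCESS_ERROR_CODE)) // done
console.log(classifyExitCode(USER_CANCELLED_ERROR_CODE)) // give-up
console.log(classifyExitCode(UNHANDLED_ERROR_CODE)) // retry
```

A caller would stop the loop on both `"done"` and `"give-up"`, which matches what the merged list achieves by including `0`.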


export const DOT_ENV_REGEX = /\.env$/i
export const PROMPT_FENCE = "```"
export const MARKDOWN_PROMPT_FENCE = "`````"
1 change: 1 addition & 0 deletions packages/core/src/genaiscript-api-provider.mjs
@@ -41,6 +41,7 @@ class GenAIScriptApiProvider {

args.push("run", prompt)
if (files) args.push(...files)
args.push("--run-retry", 2)
if (testVars && typeof testVars === "object") {
args.push("--vars")
for (const [key, value] of Object.entries(testVars)) {
2 changes: 2 additions & 0 deletions packages/core/src/server/messages.ts
@@ -24,6 +24,7 @@ export interface ServerEnv extends RequestMessage {
export interface PromptScriptTestRunOptions {
testProvider?: string
models?: string[]
groups?: string[]
}

export interface PromptScriptTestRun extends RequestMessage {
@@ -44,6 +45,7 @@ export interface PromptScriptTestRunResponse extends ResponseStatus {
export interface PromptScriptRunOptions {
excludedFiles: string[]
excludeGitIgnore: boolean
runRetry: string
out: string
retry: string
retryDelay: string
15 changes: 13 additions & 2 deletions packages/core/src/util.ts
@@ -180,9 +180,9 @@
export function logError(msg: string | Error | SerializedError) {
const { message, ...e } = serializeError(msg)
if (message) host.log(LogLevel.Error, message)
- console.debug(msg)
+ console.debug(msg)
const se = YAMLStringify(e)
- if (!/^\s*\{\}\s*$/) host.log(LogLevel.Info, se)
+ if (!/^\s*\{\s*\}\s*$/) host.log(LogLevel.Info, se)
}
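Both versions of the condition in `logError` negate a regular-expression literal without calling `.test(...)`. A regex object is always truthy, so the negation is always `false` and the `host.log(LogLevel.Info, se)` branch never runs as written. The sketch below shows what the check presumably intends, namely skipping the log when the serialized remainder is an empty object. `isEmptyObjectText` is a hypothetical helper, and the sample strings are assumptions about what the YAML serializer might emit.

```typescript
// Pattern taken from the updated line in the diff.
const EMPTY_OBJECT_RX = /^\s*\{\s*\}\s*$/

// Hypothetical helper: true when the text is just "{}" plus optional whitespace.
function isEmptyObjectText(text: string): boolean {
    return EMPTY_OBJECT_RX.test(text)
}

// Negating the regex object itself ignores the text entirely.
const negatedLiteral = !EMPTY_OBJECT_RX

console.log(negatedLiteral) // false
console.log(isEmptyObjectText("{}\n")) // true
console.log(isEmptyObjectText("code: 1\n")) // false
```

Under that reading, the guard would be `if (!isEmptyObjectText(se)) ...`, so only non-empty error details are logged.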
export function concatArrays<T>(...arrays: T[][]): T[] {
if (arrays.length == 0) return []
@@ -285,3 +285,14 @@
}

export const HTMLEscape = HTMLEscape_

export function tagFilter(tags: string[], tag: string) {
if (!tags?.length || !tag) return true
const ltag = tag.toLocaleLowerCase()
for (const t of tags) {
const lt = t.toLocaleLowerCase()
if (lt.startsWith(":!") && ltag.startsWith(lt.slice(2))) return false
else if (ltag.startsWith(t)) return true
}
return false
pelikhan marked this conversation as resolved.
}

Check failure on line 298 in packages/core/src/util.ts

GitHub Actions / build

The `tagFilter` function uses `startsWith` to match tags which could lead to incorrect matches. For example, a tag "test" would match "testing". Consider using exact match instead.
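
A minimal sketch of the exact-match alternative the comment suggests follows. `tagFilterExact` is a hypothetical variant, not the merged implementation, and it changes behavior in three ways that are assumptions on my part: matching is exact rather than by prefix, exclusions win regardless of their position in the list, and an exclude-only filter lets every other group through. It also lower-cases the filter before comparing, whereas the include branch in the diff compares against the original `t`, so a filter containing upper-case letters would not match there.

```typescript
// Hypothetical exact-match variant; not the function merged in this change.
function tagFilterExact(tags: string[] | undefined, tag: string | undefined): boolean {
    // Same early exit as the diff: no filter or no tag means "keep".
    if (!tags?.length || !tag) return true
    const ltag = tag.toLocaleLowerCase()
    const filters = tags.map((t) => t.toLocaleLowerCase())
    const excluded = filters.filter((t) => t.startsWith(":!")).map((t) => t.slice(2))
    const included = filters.filter((t) => !t.startsWith(":!"))
    // Assumption: exclusions take precedence over inclusions.
    if (excluded.includes(ltag)) return false
    // Assumption: with no include entries, anything not excluded passes.
    return included.length === 0 || included.includes(ltag)
}

console.log(tagFilterExact(["test"], "testing")) // false
console.log(tagFilterExact(["Vision"], "vision")) // true
console.log(tagFilterExact([":!vision"], "text")) // true
console.log(tagFilterExact([":!vision"], "Vision")) // false
```

Whether prefix matching is a bug or a feature depends on how groups are named; prefix matching would, for example, let one filter cover a family of groups sharing a common prefix.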
@@ -1,6 +1,7 @@
script({
title: "Describe objects in each image",
model: "gpt-3.5-turbo",
group: "vision",
maxTokens: 4000,
system: [],
tests: {
1 change: 1 addition & 0 deletions packages/sample/genaisrc/describe-image.genai.js
@@ -1,6 +1,7 @@
script({
title: "Describe objects in image",
model: "gpt-4-turbo-v",
group: "vision",
maxTokens: 4000,
system: [],
tests: {
4 changes: 2 additions & 2 deletions packages/sample/genaisrc/summarize-max-tokens.genai.js
@@ -8,6 +8,6 @@ script({
},
})

- def("FILE", env.files, { maxTokens: 40 })
+ def("FILE", env.files, { maxTokens: 80 })

- $`Extract keywords for the contents of FILE.`
+ $`Extract 5 keywords for the contents of FILE.`
2 changes: 1 addition & 1 deletion packages/sample/package.json
@@ -11,7 +11,7 @@
"test:watch": "node --import tsx --watch --test-name-pattern=run --test src/**.test.ts",
"cache:clear": "node ../cli/built/genaiscript.cjs cache clear",
"run:script": "node ../cli/built/genaiscript.cjs run",
- "test:scripts": "node ../cli/built/genaiscript.cjs test -rmo -tp tnrllmproxy.azurewebsites.net",
+ "test:scripts": "node ../cli/built/genaiscript.cjs test --groups :!vision -rmo",
"test:scripts:view": "node ../cli/built/genaiscript.cjs test view"
},
"devDependencies": {
2 changes: 1 addition & 1 deletion packages/sample/src/vision/describe-card-schema.genai.js
@@ -1,7 +1,7 @@
script({
description:
"Given an image of a receipt, extract a csv of the receipt data",
- group: "image tools",
+ group: "vision",
model: "gpt-4-turbo-v",
maxTokens: 4000,
})
2 changes: 1 addition & 1 deletion packages/sample/src/vision/describe-card.genai.js
@@ -1,6 +1,6 @@
script({
description: "Given an image of business card, extract the details to a csv file",
- group: "image tools",
+ group: "vision",
model: "gpt-4-turbo-v",
maxTokens: 4000,
})
2 changes: 1 addition & 1 deletion packages/sample/src/vision/describe-image.genai.js
@@ -1,7 +1,7 @@
script({
description:
"Given an image of a receipt, extract a csv of the receipt data",
- group: "image tools",
+ group: "vision",
model: "gpt-4-turbo-v",
maxTokens: 4000,
})