test group filter, run retries #638

Merged
merged 5 commits into from
Aug 21, 2024
5 changes: 4 additions & 1 deletion docs/src/content/docs/reference/cli/commands.md
@@ -45,9 +45,10 @@ Options:
-em, --embeddings-model <string> embeddings model for the run
--no-cache disable LLM result cache
-cn, --cache-name <name> custom cache file name
--cs, --csv-separator <string> csv separator (default: "\t")
-cs, --csv-separator <string> csv separator (default: "\t")
-ae, --apply-edits apply file edits
--vars <namevalue...> variables, as name=value, stored in env.vars
-rr, --run-retry <number> number of retries for the entire run
-h, --help display help for command
```

@@ -89,6 +90,8 @@ Options:
-v, --verbose verbose output
-pv, --promptfoo-version [version] promptfoo version, default is 0.78.0
-os, --out-summary <file> append output summary in file
--groups <groups...> groups to include or exclude. Use :!
prefix to exclude
-h, --help display help for command
```

7 changes: 6 additions & 1 deletion packages/cli/src/cli.ts
@@ -142,12 +142,13 @@ export async function cli() {
)
.option("--no-cache", "disable LLM result cache")
.option("-cn, --cache-name <name>", "custom cache file name")
.option("--cs, --csv-separator <string>", "csv separator", "\t")
.option("-cs, --csv-separator <string>", "csv separator", "\t")
.option("-ae, --apply-edits", "apply file edits")
.option(
"--vars <namevalue...>",
"variables, as name=value, stored in env.vars"
)
.option("-rr, --run-retry <number>", "number of retries for the entire run")

The option "-rr, --run-retry <number>" is missing a description.

generated by pr-review-commit missing_argument

.action(runScriptWithExitCode)

const test = program.command("test")
@@ -174,6 +175,10 @@ export async function cli() {
`promptfoo version, default is ${PROMPTFOO_VERSION}`
)
.option("-os, --out-summary <file>", "append output summary in file")
.option(
"--groups <groups...>",
"groups to include or exclude. Use :! prefix to exclude"
)
.action(scriptsTest)

The option "--groups <groups...>" is missing a description.

generated by pr-review-commit missing_argument


test.command("view")
21 changes: 20 additions & 1 deletion packages/cli/src/run.ts
@@ -25,6 +25,8 @@ import {
ANNOTATION_ERROR_CODE,
GENAI_ANY_REGEX,
TRACE_CHUNK,
UNRECOVERABLE_ERROR_CODES,
SUCCESS_ERROR_CODE,
} from "../../core/src/constants"
import { isCancelError, errorMessage } from "../../core/src/error"
import { Fragment, GenerationResult } from "../../core/src/generation"
@@ -47,6 +49,7 @@ import {
normalizeInt,
logVerbose,
logError,
delay,
} from "../../core/src/util"
import { YAMLStringify } from "../../core/src/yaml"
import { PromptScriptRunOptions } from "../../core/src/server/messages"
@@ -79,7 +82,23 @@ export async function runScriptWithExitCode(
TraceOptions &
CancellationOptions
) {
const { exitCode } = await runScript(scriptId, files, options)
const runRetry = Math.max(1, normalizeInt(options.runRetry) || 1)
let exitCode = -1
for (let r = 0; r < runRetry; ++r) {
const res = await runScript(scriptId, files, options)
exitCode = res.exitCode
if (
exitCode === SUCCESS_ERROR_CODE ||
UNRECOVERABLE_ERROR_CODES.includes(exitCode)
)
break

const delayMs = 2000 * Math.pow(2, r)
console.error(
`error: run failed with ${exitCode}, retry #${r + 1}/${runRetry} in ${delayMs}ms`
)
await delay(delayMs)
}
process.exit(exitCode)
}
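
The retry loop in this hunk can be sketched in isolation as follows. This is a minimal sketch, not the PR's code: `runWithRetry`, `RunResult`, and the injectable `sleep` parameter are hypothetical names, and the numeric codes are copied from the constants shown elsewhere in this diff. One deliberate difference is labeled in the comments: the sketch skips the wait after the last attempt, whereas the hunk as written logs a retry message and delays once more before exiting.

```typescript
// Sketch only: runWithRetry and its parameters are hypothetical names,
// not part of the PR. The numeric codes mirror the constants in this diff.
const SUCCESS_ERROR_CODE = 0
const UNRECOVERABLE_ERROR_CODES: readonly number[] = Object.freeze([
    -6, // CONNECTION_CONFIGURATION_ERROR_CODE
    -7, // USER_CANCELLED_ERROR_CODE
    -3, // FILES_NOT_FOUND_ERROR_CODE
    -2, // ANNOTATION_ERROR_CODE
])

type RunResult = { exitCode: number }

async function runWithRetry(
    run: () => Promise<RunResult>,
    attempts: number,
    // Injectable so the schedule can be exercised without real waiting.
    sleep: (ms: number) => Promise<void> = (ms) =>
        new Promise<void>((resolve) => setTimeout(resolve, ms)),
    baseDelayMs = 2000
): Promise<{ exitCode: number; delays: number[] }> {
    const maxAttempts = Math.max(1, Math.floor(attempts) || 1)
    const delays: number[] = []
    let exitCode = -1
    for (let r = 0; r < maxAttempts; ++r) {
        exitCode = (await run()).exitCode
        // Stop on success or on a code that retrying cannot fix.
        if (
            exitCode === SUCCESS_ERROR_CODE ||
            UNRECOVERABLE_ERROR_CODES.includes(exitCode)
        )
            break
        // Assumption: unlike the hunk, do not wait after the final attempt.
        if (r === maxAttempts - 1) break
        // Exponential backoff: the delay doubles on every failed attempt.
        const delayMs = baseDelayMs * Math.pow(2, r)
        delays.push(delayMs)
        await sleep(delayMs)
    }
    return { exitCode, delays }
}
```

Passing a no-op `sleep` makes the backoff schedule observable through the returned `delays` array, which is why the sketch returns it alongside the exit code.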

3 changes: 3 additions & 0 deletions packages/cli/src/test.ts
@@ -27,6 +27,7 @@ import {
logInfo,
logVerbose,
delay,
tagFilter,
} from "../../core/src/util"
import { YAMLStringify } from "../../core/src/yaml"
import {
@@ -83,6 +84,7 @@ export async function runPromptScriptTests(
const scripts = prj.templates
.filter((t) => arrayify(t.tests)?.length)
.filter((t) => !ids?.length || ids.includes(t.id))
.filter((t) => tagFilter(options?.groups, t.group))
if (!scripts.length)
return {
ok: false,
@@ -217,6 +219,7 @@ export async function scriptsTest(
promptfooVersion?: string
outSummary?: string
testDelay?: string
groups?: string[]
}
) {
const { status, value = [] } = await runPromptScriptTests(ids, options)
8 changes: 8 additions & 0 deletions packages/core/src/constants.ts
@@ -76,6 +76,7 @@ export const FETCH_RETRY_MAX_DELAY_DEFAULT = 120000
export const EXEC_MAX_BUFFER = 64
export const DOT_ENV_FILENAME = ".env"

export const SUCCESS_ERROR_CODE = 0
export const UNHANDLED_ERROR_CODE = -1
export const ANNOTATION_ERROR_CODE = -2
export const FILES_NOT_FOUND_ERROR_CODE = -3
@@ -85,6 +86,13 @@ export const CONNECTION_CONFIGURATION_ERROR_CODE = -6
export const USER_CANCELLED_ERROR_CODE = -7
export const CONFIGURATION_ERROR_CODE = -8

export const UNRECOVERABLE_ERROR_CODES = Object.freeze([
CONNECTION_CONFIGURATION_ERROR_CODE,
USER_CANCELLED_ERROR_CODE,
FILES_NOT_FOUND_ERROR_CODE,
ANNOTATION_ERROR_CODE,
])

The UNRECOVERABLE_ERROR_CODES constant includes 0 which is typically a success error code. This could lead to incorrect behavior if a successful operation is treated as an unrecoverable error.

generated by pr-review-commit unrecoverable_error_codes
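
The claim in the comment above can be checked against the values visible in this hunk. The following is a minimal sketch that restates those constants locally; `isRetryable` is a hypothetical helper, not something the PR defines. With the values as shown, `0` is not a member of `UNRECOVERABLE_ERROR_CODES`, and `CONFIGURATION_ERROR_CODE` (`-8`) is not listed either, so under this sketch it would be retried.

```typescript
// Values copied from the constants.ts hunk in this diff.
const SUCCESS_ERROR_CODE = 0
const UNHANDLED_ERROR_CODE = -1
const ANNOTATION_ERROR_CODE = -2
const FILES_NOT_FOUND_ERROR_CODE = -3
const CONNECTION_CONFIGURATION_ERROR_CODE = -6
const USER_CANCELLED_ERROR_CODE = -7
const CONFIGURATION_ERROR_CODE = -8

const UNRECOVERABLE_ERROR_CODES: readonly number[] = Object.freeze([
    CONNECTION_CONFIGURATION_ERROR_CODE,
    USER_CANCELLED_ERROR_CODE,
    FILES_NOT_FOUND_ERROR_CODE,
    ANNOTATION_ERROR_CODE,
])

// Hypothetical helper: a run is worth retrying only when it failed
// with a code that is not in the unrecoverable list.
function isRetryable(exitCode: number): boolean {
    return (
        exitCode !== SUCCESS_ERROR_CODE &&
        !UNRECOVERABLE_ERROR_CODES.includes(exitCode)
    )
}

// Success is excluded by the explicit comparison, not by list membership.
const listContainsSuccess = UNRECOVERABLE_ERROR_CODES.includes(SUCCESS_ERROR_CODE)
```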


export const DOT_ENV_REGEX = /\.env$/i
export const PROMPT_FENCE = "```"
export const MARKDOWN_PROMPT_FENCE = "`````"
1 change: 1 addition & 0 deletions packages/core/src/genaiscript-api-provider.mjs
@@ -41,6 +41,7 @@ class GenAIScriptApiProvider {

args.push("run", prompt)
if (files) args.push(...files)
args.push("--run-retry", 2)
if (testVars && typeof testVars === "object") {
args.push("--vars")
for (const [key, value] of Object.entries(testVars)) {
2 changes: 2 additions & 0 deletions packages/core/src/server/messages.ts
@@ -24,6 +24,7 @@ export interface ServerEnv extends RequestMessage {
export interface PromptScriptTestRunOptions {
testProvider?: string
models?: string[]
groups?: string[]
}

export interface PromptScriptTestRun extends RequestMessage {
@@ -44,6 +45,7 @@ export interface PromptScriptTestRunResponse extends ResponseStatus {
export interface PromptScriptRunOptions {
excludedFiles: string[]
excludeGitIgnore: boolean
runRetry: string
out: string
retry: string
retryDelay: string
19 changes: 17 additions & 2 deletions packages/core/src/util.ts
@@ -180,9 +180,9 @@ export function logWarn(msg: string) {
export function logError(msg: string | Error | SerializedError) {
const { message, ...e } = serializeError(msg)
if (message) host.log(LogLevel.Error, message)
console.debug(msg)
console.debug(msg)
const se = YAMLStringify(e)
if (!/^\s*\{\}\s*$/) host.log(LogLevel.Info, se)
// A bare regex literal is always truthy, so the pattern must be tested against se.
if (!/^\s*\{\s*\}\s*$/.test(se)) host.log(LogLevel.Info, se)
}
export function concatArrays<T>(...arrays: T[][]): T[] {
if (arrays.length == 0) return []
@@ -285,3 +285,18 @@ export function renderWithPrecision(
}

export const HTMLEscape = HTMLEscape_

export function tagFilter(tags: string[], tag: string) {
if (!tags?.length || !tag) return true
const ltag = tag.toLocaleLowerCase()
let inclusive = false
for (const t of tags) {
const lt = t.toLocaleLowerCase()
const exclude = lt.startsWith(":!")
if (!exclude) inclusive = true

if (exclude && ltag.startsWith(lt.slice(2))) return false
else if (ltag.startsWith(lt)) return true
}
return !inclusive
}
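
The filtering rules in `tagFilter` are easier to see with concrete inputs. Below is a minimal sketch under stated assumptions: `matchesGroups` is a hypothetical stand-in written for illustration, and it compares the lowercased filter on both the include and exclude paths, which is an assumption about the intended case-insensitive behavior. Matching is by prefix, the first filter that matches decides the result, and a script without a group always passes.

```typescript
// Sketch only: matchesGroups is a hypothetical stand-in for tagFilter.
function matchesGroups(
    filters: string[] | undefined,
    group: string | undefined
): boolean {
    // No filters, or a script with no group: nothing to filter on.
    if (!filters?.length || !group) return true
    const lgroup = group.toLocaleLowerCase()
    let inclusive = false
    for (const f of filters) {
        const lf = f.toLocaleLowerCase()
        const exclude = lf.startsWith(":!")
        if (!exclude) inclusive = true

        // ":!name" rejects any group that starts with "name".
        if (exclude && lgroup.startsWith(lf.slice(2))) return false
        // A plain "name" accepts any group that starts with "name".
        else if (lgroup.startsWith(lf)) return true
    }
    // With only exclusions, unmatched groups pass; with at least one
    // inclusion, unmatched groups are rejected.
    return !inclusive
}

const examples = {
    excluded: matchesGroups([":!vision"], "vision"),
    notExcluded: matchesGroups([":!vision"], "summarize"),
    notIncluded: matchesGroups(["vision"], "summarize"),
    caseInsensitive: matchesGroups(["vision"], "Vision"),
    prefixMatch: matchesGroups(["vis"], "vision"),
    noGroup: matchesGroups(["vision"], undefined),
}
```

Under these rules, the `--groups :!vision` argument used by the `test:scripts` entry in `packages/sample/package.json` keeps every script except those whose group starts with `vision`.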
@@ -1,6 +1,7 @@
script({
title: "Describe objects in each image",
model: "gpt-3.5-turbo",
group: "vision",
maxTokens: 4000,
system: [],
tests: {
1 change: 1 addition & 0 deletions packages/sample/genaisrc/describe-image.genai.js
@@ -1,6 +1,7 @@
script({
title: "Describe objects in image",
model: "gpt-4-turbo-v",
group: "vision",
maxTokens: 4000,
system: [],
tests: {
4 changes: 2 additions & 2 deletions packages/sample/genaisrc/summarize-max-tokens.genai.js
@@ -8,6 +8,6 @@ script({
},
})

def("FILE", env.files, { maxTokens: 40 })
def("FILE", env.files, { maxTokens: 80 })

$`Extract keywords for the contents of FILE.`
$`Extract 5 keywords for the contents of FILE.`
2 changes: 1 addition & 1 deletion packages/sample/package.json
@@ -11,7 +11,7 @@
"test:watch": "node --import tsx --watch --test-name-pattern=run --test src/**.test.ts",
"cache:clear": "node ../cli/built/genaiscript.cjs cache clear",
"run:script": "node ../cli/built/genaiscript.cjs run",
"test:scripts": "node ../cli/built/genaiscript.cjs test -rmo -tp tnrllmproxy.azurewebsites.net",
"test:scripts": "node ../cli/built/genaiscript.cjs test --groups :!vision -rmo",
"test:scripts:view": "node ../cli/built/genaiscript.cjs test view"
},
"devDependencies": {
2 changes: 1 addition & 1 deletion packages/sample/src/vision/describe-card-schema.genai.js
@@ -1,7 +1,7 @@
script({
description:
"Given an image of a receipt, extract a csv of the receipt data",
group: "image tools",
group: "vision",
model: "gpt-4-turbo-v",
maxTokens: 4000,
})
2 changes: 1 addition & 1 deletion packages/sample/src/vision/describe-card.genai.js
@@ -1,6 +1,6 @@
script({
description: "Given an image of business card, extract the details to a csv file",
group: "image tools",
group: "vision",
model: "gpt-4-turbo-v",
maxTokens: 4000,
})
2 changes: 1 addition & 1 deletion packages/sample/src/vision/describe-image.genai.js
@@ -1,7 +1,7 @@
script({
description:
"Given an image of a receipt, extract a csv of the receipt data",
group: "image tools",
group: "vision",
model: "gpt-4-turbo-v",
maxTokens: 4000,
})