Commit

Refactor grep functionality and enhance documentation tools across codebase
pelikhan committed Oct 7, 2024
1 parent 641df9c commit 9daa2b2
Showing 39 changed files with 929 additions and 195 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -93,7 +93,7 @@ The quick brown fox jumps over the lazy dog.
Grep or fuzz search [files](https://microsoft.github.io/genaiscript/reference/scripts/files).

```js
-const { files } = await workspace.grep(/[a-z][a-z0-9]+/, "**/*.md")
+const { files } = await workspace.grep(/[a-z][a-z0-9]+/, { globs: "*.md" })
```

### LLM Tools
39 changes: 31 additions & 8 deletions docs/genaisrc/genaiscript.d.ts

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions docs/src/components/BuiltinAgents.mdx
@@ -6,6 +6,7 @@ import { LinkCard } from '@astrojs/starlight/components';

### Builtin Agents

<LinkCard title="agent docs" description="query the documentation files" href="/genaiscript/reference/scripts/system#systemagent_docs" />
<LinkCard title="agent fs" description="query files to accomplish tasks" href="/genaiscript/reference/scripts/system#systemagent_fs" />
<LinkCard title="agent git" description="query a repository using Git to accomplish tasks. Provide all the context information available to execute git queries." href="/genaiscript/reference/scripts/system#systemagent_git" />
<LinkCard title="agent github" description="query GitHub to accomplish tasks" href="/genaiscript/reference/scripts/system#systemagent_github" />
1 change: 1 addition & 0 deletions docs/src/components/BuiltinTools.mdx
@@ -28,6 +28,7 @@ import { LinkCard } from '@astrojs/starlight/components';
<LinkCard title="github_pulls_get" description="Get a single pull request by number." href="/genaiscript/reference/scripts/system#systemgithub_pulls" />
<LinkCard title="github_pulls_review_comments_list" description="Get review comments for a pull request." href="/genaiscript/reference/scripts/system#systemgithub_pulls" />
<LinkCard title="math_eval" description="Evaluates a math expression" href="/genaiscript/reference/scripts/system#systemmath" />
<LinkCard title="md_find_files" description="Get the file structure of the documentation markdown/MDX files. Retursn filename, title, description for each match. Use pattern to specify a regular expression to search for in the file content." href="/genaiscript/reference/scripts/system#systemmd_find_files" />

Check failure on line 31 in docs/src/components/BuiltinTools.mdx

GitHub Actions / build

The word 'Retursn' is misspelled; it should be 'Returns'.
<LinkCard title="md_read_frontmatter" description="Reads the frontmatter of a markdown or MDX file." href="/genaiscript/reference/scripts/system#systemmd_frontmatter" />
<LinkCard title="python_code_interpreter_run" description="Executes python 3.12 code for Data Analysis tasks in a docker container. The process output is returned. Do not generate visualizations. The only packages available are numpy, pandas, scipy. There is NO network connectivity. Do not attempt to install other packages or make web requests." href="/genaiscript/reference/scripts/system#systempython_code_interpreter" />
<LinkCard title="python_code_interpreter_copy_files" description="Copy files from the host file system to the container file system" href="/genaiscript/reference/scripts/system#systempython_code_interpreter" />
4 changes: 2 additions & 2 deletions docs/src/content/docs/guides/search-and-transform.mdx
@@ -69,9 +69,9 @@ that allows to efficiently search for a pattern in files (this is the same searc
that powers the Visual Studio Code search).

```js "workspace.grep"
-const { pattern, glob } = env.vars
+const { pattern, globs } = env.vars
const patternRx = new RegExp(pattern, "g")
-const { files } = await workspace.grep(patternRx, glob)
+const { files } = await workspace.grep(patternRx, { globs })
```

Check failure on line 75 in docs/src/content/docs/guides/search-and-transform.mdx

GitHub Actions / build

The variable name 'glob' is changed to 'globs' which may lead to a reference error if not updated everywhere.
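
One way to picture the concern raised in this annotation is a script that tolerates both variable names during the rename. The sketch below is plain JavaScript and does not call GenAIScript; `resolveGlobs` and its fallback value are hypothetical names introduced here for illustration, and a plain object stands in for `env.vars`.

```js
// Hypothetical helper (not part of GenAIScript): prefer the new `globs`
// variable, fall back to the old `glob` variable, then to a default.
// The result is always an array so callers handle one shape only.
function resolveGlobs(vars, fallback = "**/*.md") {
    const value = vars.globs ?? vars.glob ?? fallback
    return Array.isArray(value) ? value : [value]
}

// Plain objects standing in for env.vars
const fromOldName = resolveGlobs({ glob: "docs/**/*.mdx" })
const fromNewName = resolveGlobs({ globs: ["*.md", "*.mdx"] })
const fromDefault = resolveGlobs({})
```

Whether the `globs` option accepts an array as well as a single string is an assumption here; the diff only shows string values.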

## Compute Transforms
2 changes: 1 addition & 1 deletion docs/src/content/docs/index.mdx
@@ -249,7 +249,7 @@ The quick brown fox jumps over the lazy dog.
Grep or fuzz search [files](/genaiscript/referen/script/files)

```js wrap
-const { files } = await workspace.grep(/[a-z][a-z0-9]+/, "**/*.md")
+const { files } = await workspace.grep(/[a-z][a-z0-9]+/, { globs: "*.md" })
```

Check failure on line 253 in docs/src/content/docs/index.mdx

GitHub Actions / build

The second argument to 'workspace.grep' should be a string, not an object literal.

</Card>
136 changes: 134 additions & 2 deletions docs/src/content/docs/reference/scripts/system.mdx
@@ -96,6 +96,53 @@ $`- You are concise.
`````


### `system.agent_docs`

Agent that performs tasks on the documentation.





`````js wrap title="system.agent_docs"
system({
title: "Agent that performs tasks on the documentation.",
})

const docsRoot = env.vars.docsRoot || "docs"
const samplesRoot = env.vars.samplesRoot || "packages/sample/genaisrc/"

defAgent(
"docs",
"query the documentation files",
async (ctx) => {
ctx.$`You are a helpful LLM agent that is an expert at technical documentation. You can provide the best analysis for any query about the documentation.
Analyze QUERY and respond with the requested information.
## Tools
The 'md_find_files' can perform a grep search over the documentation files and return the title, description, and filename for each match.
To optimize search, convert the QUERY request into keywords or a regex pattern.
Try multiple searches if you cannot find relevant files.
## Context
- the documentation is stored in markdown/MDX files in the ${docsRoot} folder
${samplesRoot ? `- the code samples are stored in the ${samplesRoot} folder` : ""}
`
},
{
system: ["system.explanations", "system.github_info"],
tools: ["md_find_files", "fs_read_file"],
maxTokens: 10000,
}
)

`````


### `system.agent_fs`

Agent that can find, search or read files to accomplish tasks
@@ -616,7 +663,7 @@ defTool(
`ls ${glob} ${pattern ? `| grep ${pattern}` : ""} ${frontmatter ? "--frontmatter" : ""}`
)
const res = pattern
-? (await workspace.grep(pattern, glob, { readText: false })).files
+? (await workspace.grep(pattern, { glob, readText: false })).files
: await workspace.findFiles(glob, { readText: false })
if (!res?.length) return "No files found."

Check failure on line 669 in docs/src/content/docs/reference/scripts/system.mdx

GitHub Actions / build

The second argument to 'workspace.grep' should be a string, not an object literal.
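
The two call shapes in this hunk (a glob string followed by options, versus a single options object) can be reconciled with a small adapter. The following is a sketch only: `normalizeGrepOptions` is a hypothetical function, not a GenAIScript API, and it assumes the glob belongs under a `globs` key, although this hunk itself uses `glob`.

```js
// Hypothetical shim (not GenAIScript code): map the old call shape
//   grep(pattern, glob, options)
// and the new call shape
//   grep(pattern, options)
// onto one options object.
function normalizeGrepOptions(second, third) {
    if (typeof second === "string") {
        // Old shape: second argument is a glob string, third is options.
        return { ...(third ?? {}), globs: second }
    }
    // New shape: second argument is already an options object (or absent).
    return { ...(second ?? {}) }
}

const oldStyle = normalizeGrepOptions("**/*.md", { readText: false })
const newStyle = normalizeGrepOptions({ globs: "*.md", readText: false })
```

Both calls yield an object with `globs` and `readText`, so downstream code needs to handle only one shape.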
@@ -1106,7 +1153,7 @@ system({

const info = await github.info()
if (info?.owner) {
-const { auth, owner, repo, baseUrl } = info
+const { owner, repo, baseUrl } = info
$`- current github repository: ${owner}/${repo}`
if (baseUrl) $`- current github base url: ${baseUrl}`
}

Check failure on line 1159 in docs/src/content/docs/reference/scripts/system.mdx

GitHub Actions / build

The variable 'auth' is declared but its value is never read.
@@ -1448,6 +1495,91 @@ defTool(
`````
### `system.md_find_files`
Tools to help with documentation tasks
- tool `md_find_files`: Get the file structure of the documentation markdown/MDX files. Retursn filename, title, description for each match. Use pattern to specify a regular expression to search for in the file content.

Check failure on line 1504 in docs/src/content/docs/reference/scripts/system.mdx

GitHub Actions / build

The word 'Retursn' is misspelled; it should be 'Returns'.
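
To make the "filename, title, description" output concrete, here is a minimal sketch of pulling `title` and `description` out of a frontmatter block. It is not how `parsers.frontmatter` works internally (the diff does not show that); `readFlatFrontmatter` and `describeFile` are hypothetical names, and only flat `key: value` lines are handled.

```js
// Minimal sketch: read flat `key: value` pairs from a leading
// `---` ... `---` block. Real frontmatter is YAML; nested values,
// lists, and multi-line strings are NOT handled here.
function readFlatFrontmatter(content) {
    const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(content)
    if (!match) return undefined
    const result = {}
    for (const line of match[1].split(/\r?\n/)) {
        const sep = line.indexOf(":")
        if (sep < 0) continue
        const key = line.slice(0, sep).trim()
        // Strip one pair of surrounding quotes, if present.
        const value = line.slice(sep + 1).trim().replace(/^["']|["']$/g, "")
        if (key) result[key] = value
    }
    return result
}

// Build the per-file record described by the tool: filename, title, description.
function describeFile(filename, content) {
    const fm = readFlatFrontmatter(content) ?? {}
    return { filename, title: fm.title, description: fm.description }
}

const sample = "---\ntitle: Search\ndescription: Grep files\n---\n# Body"
const entry = describeFile("docs/search.md", sample)
```

Files without a frontmatter block still produce a record; `title` and `description` are simply left undefined.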
`````js wrap title="system.md_find_files"
system({
title: "Tools to help with documentation tasks",
})
const model = env.vars.mdSummaryModel || "gpt-4o-mini"
defTool(
"md_find_files",
"Get the file structure of the documentation markdown/MDX files. Retursn filename, title, description for each match. Use pattern to specify a regular expression to search for in the file content.",
{
type: "object",
properties: {
path: {
type: "string",
description: "root path to search for markdown/MDX files",
},
pattern: {
type: "string",
description:
"Optional regular expression pattern to search for in the file content.",
},
},
},
async (args) => {
const { path, pattern, context } = args
context.log(
`docs: ls ${path} ${pattern ? `| grep ${pattern}` : ""} --frontmatter`
)
const matches = pattern
? (await workspace.grep(pattern, { path, readText: true })).files
: await workspace.findFiles(path + "/**/*.{md,mdx}", {
readText: true,
})
if (!matches?.length) return "No files found."
const q = await host.promiseQueue(5)
const files = await q.mapAll(matches, async ({ filename, content }) => {
const file = {
filename,
}
try {
const fm = await parsers.frontmatter(content)
if (fm) {
file.title = fm.title
file.description = fm.description
}
const { text: summary } = await runPrompt(
(_) => {
_.def("CONTENT", content, { language: "markdown" })
_.$`As a professional summarizer, create a concise and comprehensive summary of the provided text, be it an article, post, conversation, or passage, while adhering to these guidelines:
* The summary is intended for an LLM, not a human.
* Craft a summary that is detailed, thorough, in-depth, and complex, while maintaining clarity and conciseness.
* Incorporate main ideas and essential information, eliminating extraneous language and focusing on critical aspects.
* Rely strictly on the provided text, without including external information.
* Format the summary in one single paragraph form for easy understanding. Keep it short.
* Generate a list of keywords that are relevant to the text.`
},
{
label: `summarize ${filename}`,
cache: "md_find_files_summary",
model,
}
)
file.summary = summary
} catch (e) {}
return file
})
const res = YAML.stringify(files)
context.log(res)
return res
},
{ maxTokens: 10000 }
)
`````
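
The tool above summarizes files through `host.promiseQueue(5)` and `q.mapAll`, which suggests a bounded-concurrency map. The sketch below shows that general technique in plain JavaScript; it is an illustration under that assumption, not GenAIScript's implementation, and `mapWithConcurrency` is a hypothetical name.

```js
// Sketch of a bounded-concurrency map: at most `limit` calls to `fn`
// are in flight at once, and results keep the input order.
async function mapWithConcurrency(items, limit, fn) {
    const results = new Array(items.length)
    let next = 0
    // Each worker repeatedly claims the next unprocessed index.
    async function worker() {
        while (next < items.length) {
            const index = next++
            results[index] = await fn(items[index], index)
        }
    }
    const count = Math.max(0, Math.min(limit, items.length))
    await Promise.all(Array.from({ length: count }, () => worker()))
    return results
}

// Usage: track how many callbacks run at the same time.
let active = 0
let peak = 0
const done = mapWithConcurrency([1, 2, 3, 4, 5, 6, 7], 3, async (n) => {
    active++
    peak = Math.max(peak, active)
    await new Promise((resolve) => setTimeout(resolve, 5))
    active--
    return n * 2
})
```

Unlike the tool's callback, which wraps its body in `try`/`catch`, this sketch lets a rejected callback reject the whole map; callers that want partial results would catch errors inside `fn`.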
### `system.md_frontmatter`
Markdown frontmatter reader
39 changes: 31 additions & 8 deletions genaisrc/genaiscript.d.ts

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
