Refactor grep functionality and enhance documentation tools across codebase #757

Merged
merged 22 commits into from
Oct 8, 2024
Changes from 4 commits
22 commits
9daa2b2
Refactor grep functionality and enhance documentation tools across co…
pelikhan Oct 7, 2024
d52641f
Refactor documentation querying and tool functionality, enhance loggi…
pelikhan Oct 7, 2024
e0893e3
Update description of system.agent_docs to clarify its querying capab…
pelikhan Oct 7, 2024
11bfc09
Add option to truncate text from the end in truncateTextToTokens func…
pelikhan Oct 7, 2024
e8c217b
Enhance token handling with new utilities for counting and truncating…
pelikhan Oct 7, 2024
cee33a6
Update script paths and descriptions, and revise git diff handling logic
pelikhan Oct 7, 2024
28346b7
Add maxTokens option and enhance tool usage for pull request reviewer
pelikhan Oct 7, 2024
8d7ad22
Update script references and improve error handling in tool calls
pelikhan Oct 7, 2024
8a8b7cc
Fix typos and add note on pull requests in GitHub agent documentation
pelikhan Oct 7, 2024
8228e4a
Fix typo by changing "globs" to "glob" in workspace.grep call
pelikhan Oct 7, 2024
7a00aa2
Remove excludedPaths in git diff config and update commit prompt with…
pelikhan Oct 7, 2024
a95d3ff
Add note about using the description for pull requests 📝
pelikhan Oct 7, 2024
bef76d7
Refactor error handling, update image model, and enhance logging 📦🛠️
pelikhan Oct 7, 2024
20a33c8
Update model references from gpt-4-turbo-v to gpt-4o throughout the p…
pelikhan Oct 7, 2024
63ed371
Remove debugger statement from cli.ts 🚀
pelikhan Oct 7, 2024
7ab8dcf
Remove `workspace grep` command and related functionality 🗑️
pelikhan Oct 7, 2024
877ea80
Add PR descriptor script for pull request summaries 📝✨
pelikhan Oct 7, 2024
6642f85
Add git branch default tool and enhance PR descriptor tools 🚀
pelikhan Oct 7, 2024
ea54423
Improve logging, add options handling, and fix typos in code and docs 📄🔧
pelikhan Oct 8, 2024
ab0d5ed
Update PR description generation: add GPT-4o model and increase token…
pelikhan Oct 8, 2024
9f925d3
Remove workspace grep tests from cli.test.ts 🧹
pelikhan Oct 8, 2024
0b2b74d
Update fs_diff_files description for clarity 📄✨
pelikhan Oct 8, 2024
2 changes: 1 addition & 1 deletion README.md
@@ -93,7 +93,7 @@ The quick brown fox jumps over the lazy dog.
Grep or fuzz search [files](https://microsoft.github.io/genaiscript/reference/scripts/files).

```js
- const { files } = await workspace.grep(/[a-z][a-z0-9]+/, "**/*.md")
+ const { files } = await workspace.grep(/[a-z][a-z0-9]+/, { globs: "*.md" })
```
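
The change above moves the glob from a positional string into an options object, and the review threads on this PR disagree on whether the key is `glob` or `globs`. A minimal sketch of how a caller could normalize both call shapes is shown here. `normalizeGrepOptions` is a hypothetical helper written for illustration and is not part of GenAIScript; treating `glob` as the canonical key is an assumption based only on the later commit titled "Fix typo by changing "globs" to "glob" in workspace.grep call".

```javascript
// Hypothetical helper (not a GenAIScript API): turns the second argument of a
// grep-style call into a single options object with a `glob` key.
function normalizeGrepOptions(globOrOptions, extra = {}) {
    // Legacy shape: grep(pattern, "**/*.md", { readText: false })
    if (typeof globOrOptions === "string")
        return { ...extra, glob: globOrOptions }
    // Object shape: grep(pattern, { glob, readText: false })
    const { globs, ...rest } = globOrOptions || {}
    // Assumption: accept `globs` as an alias, but an explicit `glob` wins
    if (globs !== undefined && rest.glob === undefined) rest.glob = globs
    return rest
}

console.log(normalizeGrepOptions("**/*.md", { readText: false }))
console.log(normalizeGrepOptions({ globs: "*.md" }))
```

Both calls produce an object with a single `glob` key, so downstream code only has to handle one shape.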

### LLM Tools
39 changes: 31 additions & 8 deletions docs/genaisrc/genaiscript.d.ts

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions docs/src/components/BuiltinAgents.mdx
@@ -6,6 +6,7 @@ import { LinkCard } from '@astrojs/starlight/components';

### Builtin Agents

<LinkCard title="agent docs" description="query the documentation" href="/genaiscript/reference/scripts/system#systemagent_docs" />

Link text should be descriptive and unique for accessibility. "agent docs" is not descriptive enough.

generated by pr-docs-review-commit link_text

The link provided is incorrect; it should point to 'system.agent_docs' instead of 'system#systemagent_docs'.

generated by pr-docs-review-commit broken_link

The link provided for "agent docs" is incorrect or outdated.

generated by pr-docs-review-commit link_error

A new LinkCard was added without a corresponding description in the documentation.

generated by pr-docs-review-commit link_card_missing

The link for "agent docs" is missing a title attribute in the frontmatter.

generated by pr-docs-review-commit missing_link

<LinkCard title="agent fs" description="query files to accomplish tasks" href="/genaiscript/reference/scripts/system#systemagent_fs" />
<LinkCard title="agent git" description="query a repository using Git to accomplish tasks. Provide all the context information available to execute git queries." href="/genaiscript/reference/scripts/system#systemagent_git" />
<LinkCard title="agent github" description="query GitHub to accomplish tasks" href="/genaiscript/reference/scripts/system#systemagent_github" />
1 change: 1 addition & 0 deletions docs/src/components/BuiltinTools.mdx
@@ -28,6 +28,7 @@ import { LinkCard } from '@astrojs/starlight/components';
<LinkCard title="github_pulls_get" description="Get a single pull request by number." href="/genaiscript/reference/scripts/system#systemgithub_pulls" />
<LinkCard title="github_pulls_review_comments_list" description="Get review comments for a pull request." href="/genaiscript/reference/scripts/system#systemgithub_pulls" />
<LinkCard title="math_eval" description="Evaluates a math expression" href="/genaiscript/reference/scripts/system#systemmath" />
<LinkCard title="md_find_files" description="Get the file structure of the documentation markdown/MDX files. Retursn filename, title, description for each match. Use pattern to specify a regular expression to search for in the file content." href="/genaiscript/reference/scripts/system#systemmd_find_files" />
pelikhan marked this conversation as resolved.

A new LinkCard was added without a corresponding description in the documentation.

generated by pr-docs-review-commit link_card_missing

The link for "md_find_files" is missing a title attribute in the frontmatter.

generated by pr-docs-review-commit missing_link

<LinkCard title="md_read_frontmatter" description="Reads the frontmatter of a markdown or MDX file." href="/genaiscript/reference/scripts/system#systemmd_frontmatter" />
<LinkCard title="python_code_interpreter_run" description="Executes python 3.12 code for Data Analysis tasks in a docker container. The process output is returned. Do not generate visualizations. The only packages available are numpy, pandas, scipy. There is NO network connectivity. Do not attempt to install other packages or make web requests." href="/genaiscript/reference/scripts/system#systempython_code_interpreter" />
<LinkCard title="python_code_interpreter_copy_files" description="Copy files from the host file system to the container file system" href="/genaiscript/reference/scripts/system#systempython_code_interpreter" />
4 changes: 2 additions & 2 deletions docs/src/content/docs/guides/search-and-transform.mdx
@@ -68,11 +68,11 @@
that allows to efficiently search for a pattern in files (this is the same search engine
that powers the Visual Studio Code search).

```js "workspace.grep"

The variable name 'globs' should be 'glob' to match the property name used in the 'workspace.grep' method.

generated by pr-docs-review-commit variable_name_mismatch

The variable 'globs' is incorrect; it should be 'glob' as used in the workspace.grep function.

generated by pr-docs-review-commit incorrect_variable

- const { pattern, glob } = env.vars
+ const { pattern, globs } = env.vars

The variable 'glob' has been renamed to 'globs', which may affect the script's functionality.

generated by pr-docs-review-commit variable_renamed

const patternRx = new RegExp(pattern, "g")
- const { files } = await workspace.grep(patternRx, glob)
+ const { files } = await workspace.grep(patternRx, { globs })

Incorrect API usage, 'workspace.grep' should be called with 'glob' instead of 'globs'.

generated by pr-docs-review-commit api_usage

The 'workspace.grep' function is incorrectly called with an object instead of separate arguments.

generated by pr-docs-review-commit incorrect_function_usage

The 'workspace.grep' function is called with an object containing 'globs' instead of a string 'glob' parameter.

generated by pr-docs-review-commit incorrect_argument

The 'workspace.grep' function is incorrectly called with an object instead of separate arguments.

generated by pr-docs-review-commit api_usage_error

The usage of the 'workspace.grep' method has changed, which may affect the script's functionality.

generated by pr-docs-review-commit method_usage_change

```

Check failure on line 75 in docs/src/content/docs/guides/search-and-transform.mdx

GitHub Actions / build

The variable name 'glob' has been changed to 'globs' which may cause issues if not updated everywhere it's used.
pelikhan marked this conversation as resolved.
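
The snippet in this thread builds a global `RegExp` from a string variable before searching. A minimal in-memory sketch of that step is given here, assuming files are plain `{ filename, content }` objects. `grepInMemory` is a hypothetical stand-in for illustration only; it does not call `workspace.grep` and makes no claim about how the real search is implemented.

```javascript
// Hypothetical in-memory stand-in for a grep call (not the GenAIScript API).
function grepInMemory(files, pattern, flags = "g") {
    const rx = new RegExp(pattern, flags)
    const matched = files.filter(({ content }) => {
        // A regex with the "g" flag remembers lastIndex between calls,
        // so reset it before testing each file.
        rx.lastIndex = 0
        return rx.test(content)
    })
    // Same result shape as the snippet destructures: { files }
    return { files: matched }
}

const sampleFiles = [
    { filename: "a.md", content: "hello world" },
    { filename: "b.md", content: "HELLO" },
    { filename: "c.md", content: "say hello again" },
]
const { files: found } = grepInMemory(sampleFiles, "hello")
console.log(found.map((f) => f.filename)) // [ 'a.md', 'c.md' ]
```

Resetting `lastIndex` matters whenever one global regex object is reused across several strings; without it, a match in one file can cause the next file to be skipped.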

## Compute Transforms

2 changes: 1 addition & 1 deletion docs/src/content/docs/index.mdx
@@ -248,9 +248,9 @@

Grep or fuzz search [files](/genaiscript/referen/script/files)

```js wrap
- const { files } = await workspace.grep(/[a-z][a-z0-9]+/, "**/*.md")
+ const { files } = await workspace.grep(/[a-z][a-z0-9]+/, { globs: "*.md" })

Incorrect API usage, 'workspace.grep' should be called with 'glob' instead of 'globs'.

generated by pr-docs-review-commit api_usage

The 'workspace.grep' function is incorrectly called with an object instead of separate arguments.

generated by pr-docs-review-commit incorrect_function_usage

The parameter for 'workspace.grep' should be 'glob' instead of 'globs'.

generated by pr-docs-review-commit parameter_error

The 'workspace.grep' function is called with an object containing 'globs' instead of a string 'glob' parameter.

generated by pr-docs-review-commit incorrect_argument

The 'workspace.grep' function is incorrectly called with an object instead of separate arguments.

generated by pr-docs-review-commit api_usage_error

The usage of the 'workspace.grep' method has changed, which may affect the script's functionality.

generated by pr-docs-review-commit method_usage_change

```

Check failure on line 253 in docs/src/content/docs/index.mdx

GitHub Actions / build

The structure of the parameter passed to 'workspace.grep' has been changed from a string to an object with a 'globs' property. This change needs to be consistent across all usage.
pelikhan marked this conversation as resolved.

The structure of the parameter passed to 'workspace.grep' has changed from a string to an object with a 'globs' property. This change needs to be reflected in all relevant code snippets.

generated by pr-docs-review-commit parameter_structure_change


</Card>

145 changes: 143 additions & 2 deletions docs/src/content/docs/reference/scripts/system.mdx
@@ -96,6 +96,58 @@
`````


### `system.agent_docs`

Agent that can query on the documentation.





`````js wrap title="system.agent_docs"
system({
title: "Agent that can query on the documentation.",
})

const docsRoot = env.vars.docsRoot || "docs"
const samplesRoot = env.vars.samplesRoot || "packages/sample/genaisrc/"

defAgent(

Check failure on line 115 in docs/src/content/docs/reference/scripts/system.mdx

GitHub Actions / build

The function name 'defAgent' does not follow the naming conventions used in the documentation. It should be camelCase or snake_case consistent with JavaScript and TypeScript standards.
"docs",
"query the documentation",
async (ctx) => {
ctx.$`You are a helpful LLM agent that is an expert at technical documentation. You can provide the best analysis of any query about the documentation.

Analyze QUERY and respond with the requested information.

## Tools

The 'md_find_files' can perform a grep search over the documentation files and return the title, description, and filename for each match.
To optimize search, convert the QUERY request into keywords or a regex pattern.

Try multiple searches if you cannot find relevant files.

## Context

- the documentation is stored in markdown/MDX files in the ${docsRoot} folder
${samplesRoot ? `- the code samples are stored in the ${samplesRoot} folder` : ""}
`
},
{
system: ["system.explanations", "system.github_info"],
tools: [
"md_find_files",
"md_read_frontmatterm",

Check failure on line 140 in docs/src/content/docs/reference/scripts/system.mdx

GitHub Actions / build

There is a typo in the tool name 'md_read_frontmatterm'; it should be 'md_read_frontmatter'.

Typo in the tool name "md_read_frontmatterm" which should be "md_read_frontmatter".

generated by pr-docs-review-commit typo

"fs_find_files",
"fs_read_file",
],
maxTokens: 5000,
}
)

The function name 'defAgent' does not follow the naming convention; it should be 'defineAgent' or similar.

generated by pr-docs-review-commit incorrect_function_name


The agent "docs" is incorrectly declared with a placeholder QUERY and a non-functional code block.

generated by pr-docs-review-commit incorrect_agent_declaration

`````

A new system prompt 'system.agent_docs' has been added without a corresponding description in the documentation.

generated by pr-docs-review-commit new_system_prompt
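
The `## Context` part of the `system.agent_docs` prompt is assembled from two variables with fallbacks. A minimal sketch of that assembly as a plain function is shown here, assuming the same default values as the script. `buildDocsContext` is a hypothetical name used for illustration; the real script writes these lines directly into the prompt template.

```javascript
// Hypothetical helper mirroring the "## Context" lines of the agent prompt.
function buildDocsContext(vars = {}) {
    // Same fallbacks as the script: `||` replaces any falsy value,
    // including an empty string, with the default.
    const docsRoot = vars.docsRoot || "docs"
    const samplesRoot = vars.samplesRoot || "packages/sample/genaisrc/"
    const lines = [
        `- the documentation is stored in markdown/MDX files in the ${docsRoot} folder`,
    ]
    // With the `||` fallback above, samplesRoot is always truthy here.
    if (samplesRoot)
        lines.push(`- the code samples are stored in the ${samplesRoot} folder`)
    return lines.join("\n")
}

console.log(buildDocsContext({ docsRoot: "site" }))
```

One consequence worth noting: because the fallback uses `||`, passing an empty `samplesRoot` does not suppress the samples line, so the conditional in the template appears to always take its first branch.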



### `system.agent_fs`

Agent that can find, search or read files to accomplish tasks
@@ -615,11 +667,11 @@
context.log(
`ls ${glob} ${pattern ? `| grep ${pattern}` : ""} ${frontmatter ? "--frontmatter" : ""}`
)
const res = pattern

The parameter for 'workspace.grep' should be 'glob' instead of 'glob'.

generated by pr-docs-review-commit parameter_error

The 'workspace.grep' function is called with an object containing 'glob' instead of a string 'glob' parameter.

generated by pr-docs-review-commit incorrect_argument

- ? (await workspace.grep(pattern, glob, { readText: false })).files
+ ? (await workspace.grep(pattern, { glob, readText: false })).files

The 'workspace.grep' method is incorrectly called with '{ glob, readText: false }' instead of 'glob' as the second parameter.

generated by pr-docs-review-commit parameter_mismatch

Incorrect API usage, 'workspace.grep' should be called with 'glob' instead of 'globs'.

generated by pr-docs-review-commit api_usage

The 'workspace.grep' function is incorrectly called with an object instead of separate arguments.

generated by pr-docs-review-commit incorrect_function_usage

: await workspace.findFiles(glob, { readText: false })
if (!res?.length) return "No files found."

Check failure on line 674 in docs/src/content/docs/reference/scripts/system.mdx

GitHub Actions / build

The structure of the parameter passed to 'workspace.grep' has been changed from separate arguments to an object. This change needs to be consistent across all usage.
pelikhan marked this conversation as resolved.

The structure of the parameter passed to 'workspace.grep' has changed from separate arguments to a single object. This change needs to be reflected in the code snippet.

generated by pr-docs-review-commit parameter_structure_change

The tool "fs_find_files" is incorrectly called with an object instead of separate arguments.

generated by pr-docs-review-commit incorrect_tool_usage

if (frontmatter) {
const files = []
for (const { filename } of res) {
@@ -1105,11 +1157,11 @@
})

const info = await github.info()
if (info?.owner) {
- const { auth, owner, repo, baseUrl } = info
+ const { owner, repo, baseUrl } = info
$`- current github repository: ${owner}/${repo}`
if (baseUrl) $`- current github base url: ${baseUrl}`
}

Check failure on line 1164 in docs/src/content/docs/reference/scripts/system.mdx

GitHub Actions / build

The variable 'auth' is declared but never used, which may indicate unnecessary code or a missing implementation.
pelikhan marked this conversation as resolved.

`````

@@ -1448,6 +1500,95 @@
`````


### `system.md_find_files`

Tools to help with documentation tasks



- tool `md_find_files`: Get the file structure of the documentation markdown/MDX files. Retursn filename, title, description for each match. Use pattern to specify a regular expression to search for in the file content.

Check failure on line 1509 in docs/src/content/docs/reference/scripts/system.mdx

GitHub Actions / build

The word 'Retursn' is a typo and should be corrected to 'Returns'.
pelikhan marked this conversation as resolved.

`````js wrap title="system.md_find_files"
system({
title: "Tools to help with documentation tasks",
})

// Use the configured summary model, falling back to gpt-4o-mini
const model = env.vars.mdSummaryModel || "gpt-4o-mini"

defTool(
"md_find_files",
"Get the file structure of the documentation markdown/MDX files. Retursn filename, title, description for each match. Use pattern to specify a regular expression to search for in the file content.",
{
type: "object",
properties: {
path: {
type: "string",
description: "root path to search for markdown/MDX files",
},
pattern: {
type: "string",
description:
"regular expression pattern to search for in the file content.",
},
question: {
type: "string",
description: "Question to ask when computing the summary",
},
},
},
async (args) => {
const { path, pattern, context, question } = args
context.log(
`docs: ls ${path} ${pattern ? `| grep ${pattern}` : ""} --frontmatter ${question ? `--ask ${question}` : ""}`
)
const matches = pattern
? (await workspace.grep(pattern, { path, readText: true })).files
: await workspace.findFiles(path + "/**/*.{md,mdx}", {

The 'path' parameter should be part of an object literal { path, readText: true }, not a separate argument.

generated by pr-docs-review-commit object_literal

readText: true,
})
if (!matches?.length) return "No files found."
const q = await host.promiseQueue(5)
const files = await q.mapAll(matches, async ({ filename, content }) => {
const file = {
filename,
}
try {
const fm = await parsers.frontmatter(content)
if (fm) {
file.title = fm.title
file.description = fm.description
}
const { text: summary } = await runPrompt(
(_) => {
_.def("CONTENT", content, { language: "markdown" })
_.$`As a professional summarizer, create a concise and comprehensive summary of the provided text, be it an article, post, conversation, or passage, while adhering to these guidelines:
${question ? `* ${question}` : ""}
* The summary is intended for an LLM, not a human.
* Craft a summary that is detailed, thorough, in-depth, and complex, while maintaining clarity and conciseness.
* Incorporate main ideas and essential information, eliminating extraneous language and focusing on critical aspects.
* Rely strictly on the provided text, without including external information.
* Format the summary in one single paragraph form for easy understanding. Keep it short.
* Generate a list of keywords that are relevant to the text.`
},
{
label: `summarize ${filename}`,
cache: "md_find_files_summary",
model,
}
)
file.summary = summary
} catch (e) {}
return file
})
const res = YAML.stringify(files)
return res
},
{ maxTokens: 20000 }
)

The description of the 'md_find_files' tool contains a typo "Retursn" which should be "Returns".

generated by pr-docs-review-commit typo

The function name 'defTool' does not follow the naming convention; it should be 'defineTool' or similar.

generated by pr-docs-review-commit incorrect_function_name


`````

A new tool 'md_find_files' has been added without a corresponding description in the documentation.

generated by pr-docs-review-commit new_tool_added
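
The `md_find_files` tool copies `title` and `description` from each file's frontmatter into the result it returns. A minimal sketch of that extraction step is shown here, assuming frontmatter made only of flat `key: value` lines between `---` markers. `readFrontmatter` and `describeFile` are hypothetical names for illustration; the script itself relies on `parsers.frontmatter`, whose behavior is not reproduced here.

```javascript
// Hypothetical minimal frontmatter reader: flat "key: value" pairs only,
// no nested YAML, lists, or multi-line values.
function readFrontmatter(content) {
    const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(content)
    if (!match) return undefined
    const result = {}
    for (const line of match[1].split(/\r?\n/)) {
        const idx = line.indexOf(":")
        if (idx < 0) continue
        const key = line.slice(0, idx).trim()
        // Strip one pair of surrounding quotes, if present
        const value = line.slice(idx + 1).trim().replace(/^["']|["']$/g, "")
        if (key) result[key] = value
    }
    return result
}

// Builds the per-file record the tool returns, minus the LLM summary.
function describeFile({ filename, content }) {
    const file = { filename }
    const fm = readFrontmatter(content)
    if (fm) {
        file.title = fm.title
        file.description = fm.description
    }
    return file
}

const doc = '---\ntitle: Search\ndescription: "Find files"\n---\n# Body'
console.log(describeFile({ filename: "search.mdx", content: doc }))
```

Files without a frontmatter block keep only their `filename`, which matches how the tool leaves `title` and `description` unset when parsing yields nothing.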



### `system.md_frontmatter`

Markdown frontmatter reader
39 changes: 31 additions & 8 deletions genaisrc/genaiscript.d.ts

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
