Add Coding Utils
svilupp authored Dec 4, 2023
2 parents 898d074 + 7b7307f commit 2240fc0
Showing 6 changed files with 600 additions and 2 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -7,6 +7,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]

### Added
- Introduced a set of utilities for working with generated Julia code (eg, extract code-fenced Julia code with `PromptingTools.extract_code_blocks`), or simply apply `AICode` to the AI messages. `AICode` tries to extract, parse, and eval Julia code; if it fails, both stdout and errors are captured. It is useful for generating Julia code and, in the future, creating self-healing code agents

### Fixed
- Changed type of global `PROMPT_SCHEMA::AbstractPromptSchema` for an easier switch to local models as a default option
5 changes: 5 additions & 0 deletions src/PromptingTools.jl
@@ -52,6 +52,11 @@ const TEMPLATE_METADATA = Vector{AITemplateMetadata}()
## Utilities to support structured extraction
include("extraction.jl")

## Utilities to support code generation
export AICode
# Not exported: extract_code_blocks, extract_function_name
include("code_generation.jl")

## Individual interfaces
include("llm_openai.jl")
include("llm_ollama_managed.jl")
349 changes: 349 additions & 0 deletions src/code_generation.jl
@@ -0,0 +1,349 @@
# These are utilities to support code generation
#
# Types defined:
# - AbstractCodeBlock (not exported)
# - AICode (exported)
#
# Functions defined (not exported!):
# - detect_pkg_operation, extract_julia_imports, detect_missing_packages
# - extract_code_blocks, extract_function_name
# - eval!
#
#
#
## # Types

abstract type AbstractCodeBlock end

"""
AICode(code::AbstractString; safe_eval::Bool=true, prefix::AbstractString="", suffix::AbstractString="")
A mutable structure representing a code block (received from the AI model) with automatic parsing, execution, and output/error capturing capabilities.
Upon instantiation with a string, the `AICode` object automatically runs a code parser and executor (via `PromptingTools.eval!()`), capturing any standard output (`stdout`) or errors.
This structure is useful for programmatically handling and evaluating Julia code snippets.
See also: `PromptingTools.extract_code_blocks`, `PromptingTools.eval!`
# Workflow
- Until `cb::AICode` has been evaluated, `cb.success` is set to `nothing` (and so are all other fields except `cb.code`).
- The text in `cb.code` is parsed (saved to `cb.expression`).
- The parsed expression is evaluated.
- Outputs of the evaluated expression are captured in `cb.output`.
- Any `stdout` outputs (e.g., from `println`) are captured in `cb.stdout`.
- If an error occurs during evaluation, it is saved in `cb.error`.
- After successful evaluation without errors, `cb.success` is set to `true`.
Otherwise, it is set to `false` and you can inspect the `cb.error` to understand why.
# Properties
- `code::AbstractString`: The raw string of the code to be parsed and executed.
- `expression`: The parsed Julia expression (set after parsing `code`).
- `stdout`: Captured standard output from the execution of the code.
- `output`: The result of evaluating the code block.
- `success::Union{Nothing, Bool}`: Indicates whether the code block executed successfully (`true`), unsuccessfully (`false`), or has yet to be evaluated (`nothing`).
- `error::Union{Nothing, Exception}`: Any exception raised during the execution of the code block.
# Keyword Arguments
- `safe_eval::Bool`: If set to `true`, the code block checks for package operations (e.g., installing new packages) and missing imports, and then evaluates the code inside a bespoke scratch module. This is to ensure that the evaluation does not alter any user-defined variables or the global state. Defaults to `true`.
- `prefix::AbstractString`: A string to be prepended to the code block before parsing and evaluation.
Useful to add some additional code definition or necessary imports. Defaults to an empty string.
- `suffix::AbstractString`: A string to be appended to the code block before parsing and evaluation.
Useful to check that tests pass or that an example executes. Defaults to an empty string.
# Methods
- `Base.isvalid(cb::AICode)`: Check if the code block has executed successfully. Returns `true` if `cb.success == true`.
# Examples
```julia
code = AICode("println(\"Hello, World!\")") # Auto-parses and evaluates the code, capturing output and errors.
isvalid(code) # Output: true
code.stdout # Output: "Hello, World!\n"
```
We try to evaluate "safely" by default (eg, inside a custom module, to avoid changing user variables).
You can avoid that with `safe_eval=false`:
```julia
code = AICode("new_variable = 1"; safe_eval=false)
isvalid(code) # Output: true
new_variable # Output: 1
```
You can also call AICode directly on an AIMessage, which will extract the Julia code blocks, concatenate them and evaluate them:
```julia
msg = aigenerate("In Julia, how do you create a vector of 10 random numbers?")
code = AICode(msg)
# Output: AICode(Success: True, Parsed: True, Evaluated: True, Error Caught: N/A, StdOut: True, Code: 2 Lines)
# show the code
code.code |> println
# Output:
# numbers = rand(10)
# numbers = rand(1:100, 10)
# or copy it to the clipboard
code.code |> clipboard
# or execute it in the current module (=Main)
eval(code.expression)
```
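The `prefix` and `suffix` keyword arguments can wrap the generated code with extra definitions or checks. A minimal sketch, assuming `PromptingTools` is loaded (the function `add_one` and the appended assertion are made up for illustration):
```julia
using PromptingTools
# `add_one` is a hypothetical function standing in for AI-generated code
code = AICode("add_one(x) = x + 1"; suffix = "@assert add_one(1) == 2")
isvalid(code) # expected to be `true` when the appended check passes
code.error    # expected to be `nothing` in that case
```
If the appended check fails, the exception should be stored in `code.error` and `isvalid(code)` returns `false`.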
"""
@kwdef mutable struct AICode <: AbstractCodeBlock
    code::AbstractString
    expression = nothing
    stdout = nothing
    output = nothing
    success::Union{Nothing, Bool} = nothing
    error::Union{Nothing, Exception} = nothing
end
# Eager evaluation if instantiated with a string
function (CB::Type{T})(md::AbstractString;
        safe_eval::Bool = true,
        prefix::AbstractString = "",
        suffix::AbstractString = "") where {T <: AbstractCodeBlock}
    cb = CB(; code = md)
    eval!(cb; safe_eval, prefix, suffix)
end
Base.isvalid(cb::AbstractCodeBlock) = cb.success == true
function Base.copy(cb::AbstractCodeBlock)
    AICode(cb.code, cb.expression, cb.stdout, cb.output, cb.success, cb.error)
end
function Base.show(io::IO, cb::AICode)
    success_str = cb.success === nothing ? "N/A" : titlecase(string(cb.success))
    expression_str = cb.expression === nothing ? "N/A" : "True"
    stdout_str = cb.stdout === nothing ? "N/A" : "True"
    output_str = cb.output === nothing ? "N/A" : "True"
    error_str = cb.error === nothing ? "N/A" : "True"
    count_lines = count(==('\n'), collect(cb.code)) + 1 # there is always at least one line

    print(io,
        "AICode(Success: $success_str, Parsed: $expression_str, Evaluated: $output_str, Error Caught: $error_str, StdOut: $stdout_str, Code: $count_lines Lines)")
end

## Overload for AIMessage - simply extracts the code blocks and concatenates them
function AICode(msg::AIMessage; kwargs...)
    code = extract_code_blocks(msg.content) |> Base.Fix2(join, "\n")
    return AICode(code; kwargs...)
end

## # Functions

# Utility to detect if Pkg.* is called in a string (for `safe` code evaluation)
function detect_pkg_operation(input::AbstractString)
    # The dot is escaped so that only a literal `Pkg.` followed by a lowercase letter matches (eg, `Pkg.add`)
    m = match(r"\bPkg\.[a-z]", input)
    return !isnothing(m)
end
# Utility to detect dependencies in a string (for `safe` code evaluation / understand when we don't have a necessary package)
# Eg, "using Test, LinearAlgebra" yields [:Test, :LinearAlgebra]
function extract_julia_imports(input::AbstractString)
    package_names = Symbol[]
    for line in split(input, "\n")
        if occursin(r"(^using |^import )"m, line)
            subparts = replace(replace(line, "using" => ""), "import" => "")
            ## TODO: add split on .
            subparts = map(x -> contains(x, ':') ? split(x, ':')[1] : x,
                split(subparts, ","))
            subparts = replace(join(subparts, ' '), ',' => ' ')
            packages = filter(!isempty, split(subparts, " ")) .|> Symbol
            append!(package_names, packages)
        end
    end
    return package_names
end

# Utility to pinpoint unavailable dependencies
# Returns a tuple: (any missing?, vector of the missing package names)
function detect_missing_packages(imports_required::AbstractVector{<:Symbol})
    available_packages = Base.loaded_modules |> values .|> Symbol
    missing_packages = filter(pkg -> !in(pkg, available_packages), imports_required)
    if length(missing_packages) > 0
        return true, missing_packages
    else
        return false, Symbol[]
    end
end

"""
extract_code_blocks(markdown_content::String) -> Vector{String}
Extract Julia code blocks from a markdown string.
This function searches through the provided markdown content, identifies blocks of code specifically marked as Julia code
(using the ```julia ... ``` code fence patterns), and extracts the code within these blocks.
The extracted code blocks are returned as a vector of strings, with each string representing one block of Julia code.
Note: Only the content within the code fences is extracted, and the code fences themselves are not included in the output.
# Arguments
- `markdown_content::String`: A string containing the markdown content from which Julia code blocks are to be extracted.
# Returns
- `Vector{String}`: A vector containing strings of extracted Julia code blocks. If no Julia code blocks are found, an empty vector is returned.
# Examples
Example with a single Julia code block
```julia
markdown_single = \"""
```julia
println("Hello, World!")
```
\"""
extract_code_blocks(markdown_single)
# Output: ["println(\\"Hello, World!\\")"]
```
```julia
# Example with multiple Julia code blocks
markdown_multiple = \"""
```julia
x = 5
```
Some text in between
```julia
y = x + 2
```
\"""
extract_code_blocks(markdown_multiple)
# Output: ["x = 5", "y = x + 2"]
```
"""
function extract_code_blocks(markdown_content::AbstractString)
    # Define the pattern for Julia code blocks
    pattern = r"```julia\n(.*?)\n```"s

    # Find all matches and extract the code
    matches = eachmatch(pattern, markdown_content)

    # Extract and clean the code blocks
    code_blocks = String[m.captures[1] for m in matches]

    return code_blocks
end

"""
extract_function_name(code_block::String) -> Union{String, Nothing}
Extract the name of a function from a given Julia code block. The function searches for two patterns:
- The explicit function declaration pattern: `function name(...) ... end`
- The concise function declaration pattern: `name(...) = ...`
If a function name is found, it is returned as a string. If no function name is found, the function returns `nothing`.
# Arguments
- `code_block::String`: A string containing Julia code.
# Returns
- `Union{String, Nothing}`: The extracted function name or `nothing` if no name is found.
# Example
```julia
code = \"""
function myFunction(arg1, arg2)
# Function body
end
\"""
extract_function_name(code)
# Output: "myFunction"
```
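The concise (one-line) definition form is handled by the second pattern. A short sketch, assuming `PromptingTools` is loaded (`add_one` is a hypothetical name); note that this pattern is anchored with `^`, so the definition has to start the string:
```julia
using PromptingTools
# `add_one` is a made-up function used only for illustration
code = "add_one(x) = x + 1"
PromptingTools.extract_function_name(code)
# Output: "add_one"
```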
"""
function extract_function_name(code_block::AbstractString)
    # Regular expression for the explicit function declaration
    pattern_explicit = r"function\s+(\w+)\("
    # Regular expression for the concise function declaration
    pattern_concise = r"^(\w+)\(.*\)\s*="

    # Searching for the explicit function declaration
    match_explicit = match(pattern_explicit, code_block)
    if match_explicit !== nothing
        return match_explicit.captures[1]
    end

    # Searching for the concise function declaration
    match_concise = match(pattern_concise, code_block)
    if match_concise !== nothing
        return match_concise.captures[1]
    end

    # Return nothing if no function name is found
    return nothing
end

"""
eval!(cb::AICode; safe_eval::Bool=true, prefix::AbstractString="", suffix::AbstractString="")
Evaluates a code block `cb` in-place. It runs automatically when AICode is instantiated with a String.
Check the outcome of evaluation with `Base.isvalid(cb)`. If it returns `true`, the provided code block has executed successfully.
Steps:
- If `cb::AICode` has not been evaluated, `cb.success = nothing`.
After the evaluation it will be either `true` or `false` depending on the outcome
- Parse the text in `cb.code`
- Evaluate the parsed expression
- Capture outputs of the evaluated expression in `cb.output`
- Capture any stdout outputs (eg, test failures) in `cb.stdout`
- If any error exception is raised, it is saved in `cb.error`
- Finally, if all steps were successful, success is set to `cb.success = true`
# Keyword Arguments
- `safe_eval::Bool`: If `true`, we first check for any Pkg operations (eg, installing new packages) and missing imports,
then the code will be evaluated inside a bespoke scratch module (not to change any user variables)
- `prefix::AbstractString`: A string to be prepended to the code block before parsing and evaluation.
Useful to add some additional code definition or necessary imports. Defaults to an empty string.
- `suffix::AbstractString`: A string to be appended to the code block before parsing and evaluation.
Useful to check that tests pass or that an example executes. Defaults to an empty string.
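# Example
A minimal sketch of deferred evaluation, assuming `PromptingTools` is loaded and that the keyword constructor `AICode(; code=...)` is used so that nothing runs on construction:
```julia
using PromptingTools
cb = AICode(; code = "println(1 + 1)") # keyword constructor, not evaluated yet
cb.success === nothing # true, nothing has run so far
PromptingTools.eval!(cb) # parse and evaluate in-place
isvalid(cb) # expected: true
cb.stdout # the captured output of `println`
```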
"""
function eval!(cb::AbstractCodeBlock;
        safe_eval::Bool = true,
        prefix::AbstractString = "",
        suffix::AbstractString = "")
    (; code) = cb
    code_extra = string(prefix, "\n", code, "\n", suffix)
    ## Safety checks on `code` only
    if safe_eval
        # `error` already throws an `ErrorException`, so no extra `throw` is needed
        detect_pkg_operation(code) &&
            error("Error: Use of package manager (`Pkg.*`) detected! Please verify the safety of the code or disable the safety check (`safe_eval=false`)")
        detected, missing_packages = detect_missing_packages(extract_julia_imports(code))
        detected &&
            error("Error: Failed package import. Missing packages: $(join(string.(missing_packages),", ")). Please add them or disable the safety check (`safe_eval=false`)")
    end
    ## Parse into an expression
    try
        ex = Meta.parseall(code_extra)
        cb.expression = ex
    catch e
        cb.error = e
        cb.success = false
        return cb
    end

    ## Eval
    safe_module = gensym("SafeCustomModule")
    # Prepare to catch any stdout
    pipe = Pipe()
    redirect_stdout(pipe) do
        try
            # eval in Main module to have access to std libs, but inside a custom module for safety
            if safe_eval
                cb.output = @eval(Main, module $safe_module
                using Test # just in case unit tests are provided
                $(cb.expression)
                end)
            else
                # Evaluate the code directly into Main
                cb.output = @eval(Main, begin
                    using Test # just in case unit tests are provided
                    $(cb.expression)
                end)
            end
            cb.success = true
        catch e
            cb.error = e
            cb.success = false
        end
    end
    close(Base.pipe_writer(pipe))
    cb.stdout = read(pipe, String)
    return cb
end
2 changes: 0 additions & 2 deletions src/extraction.jl
@@ -1,7 +1,5 @@
# These are utilities to support structured data extraction tasks through the OpenAI function calling interface (wrapped by `aiextract`)
#
# TODOs:
# - add support for enums
to_json_type(s::Type{<:AbstractString}) = "string"
to_json_type(n::Type{<:Real}) = "number"
to_json_type(n::Type{<:Integer}) = "integer"