
Commit

Add preference system (#17)
Fixes #13
svilupp authored Apr 19, 2024
1 parent c3d2c44 commit 6cbbfd6
Showing 18 changed files with 898 additions and 63 deletions.
4 changes: 2 additions & 2 deletions Artifacts.toml
@@ -84,7 +84,7 @@ lazy = true

[["tidier__nomicembedtext-0-Bool".download]]
sha256 = "b3f604f382a7191c657c77017e9d3e455e6eacc1fa6a59e6202e8a83498d4269"
url = "https://github.com/svilupp/AIHelpMeArtifacts/raw/main/artifacts/tidier__v20240407__nomicembedtext-1024-Bool__v1.0.tar.gz"
url = "https://github.com/svilupp/AIHelpMeArtifacts/raw/main/artifacts/tidier__v20240407__nomicembedtext-0-Bool__v1.0.tar.gz"

["makie__nomicembedtext-0-Float32"]
git-tree-sha1 = "4b0ff243278fcbda1491db94e4082c1050a44d80"
@@ -100,4 +100,4 @@ lazy = true

[["makie__nomicembedtext-0-Bool".download]]
sha256 = "c62a0d3d27fc417498d2449b59c58d18dba34164094a08206b51e88d05fa7fa7"
url = "https://github.com/svilupp/AIHelpMeArtifacts/raw/main/artifacts/makie__v20240330__nomicembedtext-1024-Bool__v1.0.tar.gz"
url = "https://github.com/svilupp/AIHelpMeArtifacts/raw/main/artifacts/makie__v20240330__nomicembedtext-0-Bool__v1.0.tar.gz"
15 changes: 12 additions & 3 deletions CHANGELOG.md
@@ -7,11 +7,20 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]

### Added
- (Preliminary) Knowledge packs available for Julia docs (`:julia`), Tidier ecosystem (`:tidier`), Makie ecosystem (`:makie`). Load with `load_index!(:julia)` or several with `load_index!([:julia, :tidier])`.

### Fixed

## [0.1.0]

### Added
- (Preliminary) Knowledge packs available for Julia docs (`:julia`), Tidier ecosystem (`:tidier`), and Makie ecosystem (`:makie`). Load with `load_index!(:julia)` or several with `load_index!([:julia, :tidier])`.
- Preferences.jl-based persistence for chat model, embedding model, embedding dimension, and which knowledge packs to load on start. See `AIHelpMe.PREFERENCES` for more details.
- Precompilation statements to improve TTFX.
- First Q&A evaluation dataset in folder `evaluations/`.

### Changed
- Bumped up PromptingTools to v0.20 (brings new RAG capabilities, pretty-printing, etc.)
- Changed default model to be GPT-4 Turbo to improve answer quality
- Bumped up PromptingTools to v0.21 (brings new RAG capabilities, pretty-printing, etc.)
- Changed default model to be GPT-4 Turbo to improve the answer quality (you can quickly change to "gpt3t" if you want something simple)
- Documentation moved to Vitepress

### Fixed
4 changes: 2 additions & 2 deletions Project.toml
@@ -1,7 +1,7 @@
name = "AIHelpMe"
uuid = "01402e1f-dc83-4213-a98b-42887d758baa"
authors = ["J S <[email protected]> and contributors"]
version = "0.0.1-DEV"
version = "0.1.0"

[deps]
HDF5 = "f67ccb44-e63f-5c2f-98bd-6dc0ccc4ba2f"
@@ -25,7 +25,7 @@ LazyArtifacts = "<0.0.1, 1"
LinearAlgebra = "<0.0.1, 1"
PrecompileTools = "1"
Preferences = "1"
PromptingTools = "=0.20.1"
PromptingTools = "0.21"
REPL = "1"
SHA = "0.7"
Serialization = "<0.0.1, 1"
16 changes: 13 additions & 3 deletions README.md
@@ -121,6 +121,9 @@ All setup should take less than 5 minutes!
> [!TIP]
> Your results will significantly improve if you enable re-ranking of the context to be provided to the model (eg, `aihelp(..., rerank=true)`) or change the pipeline to `update_pipeline!(:silver)`. It requires setting up a Cohere API key, but it's free for community use.
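
For illustration, here is a minimal sketch of both options (the question string is just a hypothetical placeholder; it assumes your Cohere API key is already configured):

```julia
using AIHelpMe: aihelp, update_pipeline!

# Option A: re-rank the context for a single query only
aihelp("How do I add a method to an existing function?"; rerank=true)

# Option B: switch the whole pipeline to the re-ranking preset
update_pipeline!(:silver)
```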

> [!TIP]
> Do you want to safely execute the generated code? Use `AICode` from `PromptingTools.Experimental.AgentTools`. It can execute the code in a scratch module and catch errors if they happen (eg, apply it directly to an `AIMessage` response like `AICode(msg)`).

Noticed some weird answers? Please let us know! See the section "Help Us Improve and Debug" in the Advanced section of the docs!

## How to Obtain API Keys
@@ -158,7 +161,7 @@ f(Int8(2))
# we get: ERROR: MethodError: no method matching f(::Int8)
# Help is here:
aihelp"What does this error mean? $err" # Note the $err to interpolate the stacktrace
aihelp"What does this error mean? \$err" # Note the $err to interpolate the stacktrace
```

```plaintext
@@ -221,15 +224,22 @@ A: Not at the moment. It might be possible in the future, as PromptingTools.jl s
**Q: Why do we need Cohere API Key?**
A: Cohere's API is used to re-rank the best matching snippets from the documentation. It's free to use in limited quantities (ie, ~thousand requests per month), which should be enough for most users. Re-ranking improves the quality and accuracy of the answers.

**Q: Why do we need Tavily API Key?**
A: Tavily's API is used for web search results to augment your answers. It's free to use in limited quantities (ie, ~thousand requests per month), which should be enough for most users. Web search improves the quality and accuracy of the answers.

**Q: Can we use Ollama (locally-hosted) models?**
A: Yes, see the Advanced section in the docs.

## Future Directions

AIHelpMe is continuously evolving. Future updates may include:
- Tools to trace the provenance of answers (ie, where did the answer come from?).
- GUI interface (I'm looking at you, Stipple.jl!)
- Better tools to trace the provenance of answers (ie, where did the answer come from?).
- Creation of a gold standard Q&A dataset for evaluation.
- Refinement of the RAG ingestion pipeline for optimized chunk processing and deduplication.
- Introduction of context filtering to focus on specific modules.
- Transition to a more sophisticated multi-turn conversation design.
- Enhancement of the marginal information provided by the RAG context.
- Expansion of content sources beyond docstrings, potentially including documentation sites and community resources like Discourse or Slack posts.

Please note that this is merely a pre-release to gauge the interest in this project.
Please note that this is merely a pre-release - we still have a long way to go...
2 changes: 1 addition & 1 deletion docs/Project.toml
@@ -4,4 +4,4 @@ Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
DocumenterVitepress = "4710194d-e776-4893-9690-8d956a29c365"

[compat]
DocumenterVitepress = "0.0.7"
DocumenterVitepress = "0.0.18"
113 changes: 104 additions & 9 deletions docs/src/advanced.md
@@ -1,26 +1,26 @@
# Advanced

## Using Ollama Models
AIHelpMe can use Ollama models (locally-hosted models), but the knowledge packs are available for only one embedding model: "nomic-embed-text"!

You must set `model_embedding="nomic-embed-text"` and `truncate_dimension=0` (maximum dimension available) for everything to work correctly!
AIHelpMe can use Ollama (locally-hosted) models, but the knowledge packs are available for only one embedding model: "nomic-embed-text"!

You must set `model_embedding="nomic-embed-text"` and `embedding_dimension=0` (maximum dimension available) for everything to work correctly!

Example:

```julia
using PromptingTools: register_model!, OllamaSchema
using AIHelpMe: update_pipeline!, load_index!

# register model names with the Ollama schema
register_model!(; name="mistral:7b-instruct-v0.2-q4_K_M",schema=OllamaSchema())
register_model!(; name="nomic-embed-text",schema=OllamaSchema())
# register model names with the Ollama schema - if needed!
# eg, register_model!(; name="mistral:7b-instruct-v0.2-q4_K_M",schema=OllamaSchema())

# you can use whichever chat model you like!
update_pipeline!(:bronze; model_chat = "mistral:7b-instruct-v0.2-q4_K_M",model_embedding="nomic-embed-text", truncate_dimension=0)
# you can use whichever chat model you like! Llama 3 8b is the best trade-off right now and it's already known to PromptingTools.
update_pipeline!(:bronze; model_chat = "llama3",model_embedding="nomic-embed-text", embedding_dimension=0)


# You must download the corresponding knowledge packs via `load_index!` (because you changed the embedding model)
load_index!(:julia) # or whichever other packs you want!
load_index!()
```

Let's ask a question:
@@ -35,6 +35,27 @@ PromptingTools.AIMessage("In Julia, you can create a named tuple by enclosing ke
...continues
```

You can use the Preference setting mechanism (`?PREFERENCES`) to change the default settings.
```julia
AIHelpMe.set_preferences!("MODEL_CHAT" => "llama3", "MODEL_EMBEDDING" => "nomic-embed-text", "EMBEDDING_DIMENSION" => 0)
```

## Code Execution

Use `AICode` on the generated answers.

For example:
```julia
using PromptingTools.Experimental.AgentTools: AICode

msg = aihelp"Write a code to sum up 1+1"
cb = AICode(msg)
# Output: AICode(Success: True, Parsed: True, Evaluated: True, Error Caught: N/A, StdOut: True, Code: 1 Lines)
```

If you want to access the extracted code, simply use `cb.code`. If there is an error, you can see it in `cb.error`.

See the docstrings to learn more about it!

## Extending the Knowledge Base

@@ -56,10 +77,84 @@ To use your newly created index as the main source for queries, execute `load_in

The main index for queries is held in the global variable `AIHelpMe.MAIN_INDEX[]`.
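
For illustration, a minimal sketch of swapping in a custom index (`my_index` is a hypothetical, previously built index, eg, from `update_index`):

```julia
import AIHelpMe
using AIHelpMe: load_index!

# make the custom index the main source for subsequent `aihelp` queries
load_index!(my_index)

AIHelpMe.MAIN_INDEX[]   # the currently active index lives in this global reference
```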

## Save Your Preferences

You can now leverage the amazing Preferences.jl mechanism to save your default choices of the chat model, embedding model, embedding dimension, and which knowledge packs to load on start.

For example, if you wanted to switch to an Ollama-based pipeline, you could persist the configuration with:

```julia
AIHelpMe.set_preferences!("MODEL_CHAT" => "llama3", "MODEL_EMBEDDING" => "nomic-embed-text", "EMBEDDING_DIMENSION" => 0)
```
This will create a file `LocalPreferences.toml` in your project directory that will remember these choices across sessions.

See `AIHelpMe.PREFERENCES` for more details.

## Debugging Poor Answers

We're using a Retrieval-Augmented Generation (RAG) pipeline, which consists of two major steps (plus an optional third):

1. Retrieve the "context" (the best-matching snippets from the knowledge packs) for your question.
2. Generate an "answer" from that context.
3. (Optional) Generate a refined answer (only some pipelines will have this).

If you're not getting good answers (or not the answers you would expect), you need to understand whether the problem is in step 1 or step 2.

First, get the `RAGResult` (the full detail of the pipeline and its intermediate steps):
```julia
using AIHelpMe: pprint, last_result

# This lets you inspect how the last answer was generated
result = last_result()

# You can pretty-print the result
pprint(result)
```

**Checking the Context**
Check the `result.context` - is it using the right snippets ("knowledge") from the knowledge packs?
Alternatively, you can add it directly to the pretty-printed answer:
```julia
pprint(result; add_context=true, add_scores=false)
```

How is the context produced?

It's a function of your pipeline settings (see below) and the knowledge packs loaded (`AIHelpMe.LOADED_PACKS`).

A quick way to diagnose the context pipeline:

- See a quick overview of the embedding configuration with `AIHelpMe.get_config_key()` (it captures the embedding model, dimension, and element type).
- See the pipeline configuration (which steps) with `AIHelpMe.RAG_CONFIG[]`.
- See the individual kwargs provided to the pipeline with `AIHelpMe.RAG_KWARGS[]`.
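
A hedged sketch of what that inspection might look like in the REPL (the exact output depends on your configuration):

```julia
import AIHelpMe

AIHelpMe.get_config_key()   # eg, embedding model + dimension + element type in one string
AIHelpMe.RAG_CONFIG[]       # which steps the pipeline runs
AIHelpMe.RAG_KWARGS[]       # the keyword arguments passed to those steps
```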

The simplest fix for "poor" context is to enable reranking (`rerank=true`).

Other angles to consider:
- Are you sure the source information is present in the knowledge pack? Could it be added (see `?update_index`)?
- Was the question phrased poorly? Can we ask the same in a more sensible way?
- Try different models, embedding dimensions, or element types (Bool is less precise than Float32, etc.).

**Checking the First Answer**

Check the original answer (`result.answer`) - is it correct?

Assuming the context was good but the answer is not, we need to find out whether the problem lies with the model or with the prompt template.
- Is it that you're using a weak chat model? Try different models. Does it improve?
- Is the prompt template not suitable for your questions? Experiment with tweaking the prompt template.
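
For instance, a quick sketch of trying a different chat model and re-asking the same question (the model alias "gpt3t" and the question are just examples; use whatever you prefer):

```julia
using AIHelpMe: aihelp, update_pipeline!

# swap only the chat model; the rest of the pipeline stays as configured
update_pipeline!(:bronze; model_chat = "gpt3t")
aihelp("How do I create a named tuple in Julia?")
```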

**Checking the Refined Answer**

You always see the final answer by default (`result.final_answer`).
If it differs from `result.answer`, the `refiner` step in the RAG pipeline has been used.
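
A tiny sketch of how you might check this (reusing `last_result` from above):

```julia
result = last_result()
if result.final_answer != result.answer
    println("The refiner changed the answer - debug the refinement step as well.")
end
```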

Debug it in the same way as the first answer:
- Is the problem in the context provided to the refiner?
- Or in the refiner's answer itself? If so, is it caused by the model or by the prompt template?

## Help Us Improve and Debug

It would be incredibly helpful if you could share examples when the pipeline fails.
It would be incredibly helpful if you could share examples when the RAG pipeline fails.
We're particularly interested in cases where the answer you get is wrong because of the "bad" context provided.

Let's say you ran a question and got a wrong answer (or some other aspect is worth reporting).
5 changes: 4 additions & 1 deletion docs/src/faq.md
@@ -10,7 +10,6 @@ Each query incurs only a fraction of a cent, depending on the length and chosen

## Can I use the Cohere Trial API Key for commercial projects?
No, a trial key is only for testing purposes. But it takes only a few clicks to switch to Production API. The cost is only $1 per 1000 searches (!!!) and has many other benefits.
Alternatively, set a different `rerank_strategy` in `aihelp` calls to avoid using Cohere API.

## How accurate are the answers?
As with any other Generative AI answers, it depends, and you should always double-check.
@@ -24,3 +23,7 @@ Cohere's API is used to re-rank the best matching snippets from the documentatio
## Why do we need Tavily API Key?
Tavily is used for the web search results to augment your answers. It's free to use in limited quantities.

## Can we use Ollama (locally-hosted) models?
Yes! See the [Using Ollama Models](@ref) section.


2 changes: 1 addition & 1 deletion docs/src/index.md
@@ -45,7 +45,7 @@ AI models, while powerful, often produce inaccurate or outdated information, kno

AIHelpMe addresses these challenges by incorporating "knowledge packs" filled with preprocessed, up-to-date Julia information. This ensures that you receive not only faster but also more reliable and contextually accurate coding assistance.

With AIHelpMe, you benefit from enhanced AI reliability tailored specifically to Julia’s unique environment, leading to better, more informed coding decisions.
Most importantly, AIHelpMe is designed to be uniquely yours! You can customize the RAG pipeline however you want, bring in any additional knowledge (eg, your currently loaded packages), and use it to get more accurate answers on something that's not even public!

## Getting Started

8 changes: 7 additions & 1 deletion docs/src/introduction.md
@@ -4,10 +4,11 @@ Welcome to AIHelpMe.jl, your go-to for getting answers to your Julia coding ques

AIHelpMe is a simple wrapper around the RAG functionality in PromptingTools.

It provides two extras:
It provides three extras:

- (hopefully) a simpler interface to handle RAG configurations (there are thousands of possible configurations)
- pre-computed embeddings for key “knowledge” in the Julia ecosystem (we refer to them as “knowledge packs”)
- ability to quickly incorporate any additional knowledge (eg, your currently loaded packages) into the "assistant"

> [!CAUTION]
> This is only a prototype! We have not tuned it yet, so your mileage may vary! Always check your results from LLMs!
@@ -137,10 +138,15 @@ All setup should take less than 5 minutes!
> [!TIP]
> Your results will significantly improve if you enable re-ranking of the context to be provided to the model (eg, `aihelp(..., rerank=true)`) or change the pipeline to `update_pipeline!(:silver)`. It requires setting up a Cohere API key, but it's free for community use.

> [!TIP]
> Do you want to safely execute the generated code? Use `AICode` from `PromptingTools.Experimental.AgentTools`. It can execute the code in a scratch module and catch errors if they happen (eg, apply it directly to an `AIMessage` response like `AICode(msg)`).

Noticed some weird answers? Please let us know! See [Help Us Improve and Debug](@ref).

If you want to use locally-hosted models, see the [Using Ollama Models](@ref) section.

If you want to customize your setup, see `AIHelpMe.PREFERENCES`.

## How to Obtain API Keys

### OpenAI API Key:

2 comments on commit 6cbbfd6

@svilupp
Owner Author


@JuliaRegistrator


Registration pull request created: JuliaRegistries/General/105262

Tip: Release Notes

Did you know you can add release notes too? Just add markdown formatted text underneath the comment after the text
"Release notes:" and it will be added to the registry PR, and if TagBot is installed it will also be added to the
release that TagBot creates. i.e.

```
@JuliaRegistrator register

Release notes:

## Breaking changes

- blah
```

To add them here just re-invoke and the PR will be updated.

Tagging

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via:

```
git tag -a v0.1.0 -m "<description of version>" 6cbbfd6a36b7f9b366c6969005cc36a06d951204
git push origin v0.1.0
```
