Add support for cached tokens in cost calculation #222

Open
wants to merge 115 commits into base: main

Commits
ac751a9
Revert "update deps"
svilupp Feb 26, 2024
423faef
remove GoogleGenAI (#83)
svilupp Feb 26, 2024
320b9b7
fix docs
svilupp Feb 27, 2024
18e1b71
Templating utilities (#84)
svilupp Feb 28, 2024
5200985
update docs + version (#85)
svilupp Feb 29, 2024
846aa37
Add image generation with DALL-E 3 (#86)
svilupp Mar 1, 2024
37fa1f4
update changelog (#87)
svilupp Mar 1, 2024
b8e888d
Update docs to Vitepress (#88)
svilupp Mar 1, 2024
4d26a4f
update docs/make.jl
svilupp Mar 1, 2024
a581d55
fix CI
svilupp Mar 1, 2024
83cefd9
Fix CI nr2
svilupp Mar 1, 2024
db91e60
Fix CI nr3
svilupp Mar 1, 2024
2d45b8e
Fix CI nr4
svilupp Mar 1, 2024
a84f230
Fix CI nr5
svilupp Mar 1, 2024
81198f5
Fix CI nr.6
svilupp Mar 1, 2024
0d6954e
Fix CI nr7
svilupp Mar 1, 2024
37603a4
Fix CI nr8
svilupp Mar 1, 2024
e97f571
Update README.md
svilupp Mar 2, 2024
cd681c8
Add support annotations (#90)
svilupp Mar 7, 2024
b3756b2
Update Documentation (#91)
svilupp Mar 9, 2024
94911cf
Add Prompt Templates to the Docs (#92)
svilupp Mar 9, 2024
846d069
fix typo on set_preferences! examples, fixes #93 (#94)
ceferisbarov Mar 17, 2024
3b30d66
RAG Interface Rewrite (#95)
svilupp Mar 20, 2024
2509e0b
Update Google AI tutorial (#103)
svilupp Mar 21, 2024
e347081
Allow HTMLStyler in node annotations (#105)
svilupp Mar 21, 2024
96357ce
update diagram in the docs (#108)
svilupp Mar 22, 2024
ce4a2ed
A little README.md correction (#107)
Muhammad-saad-2000 Mar 24, 2024
7cab975
Add support for Claude API (#109)
svilupp Mar 25, 2024
5146a95
Enable GoogleGenAI extension (#111)
svilupp Mar 26, 2024
4925b70
Update CHANGELOG.md (#104)
svilupp Mar 26, 2024
5d9d132
Add ShareGPT template (#113)
svilupp Mar 26, 2024
82d87db
Increase compat for GoogleGenAI v0.3
svilupp Mar 26, 2024
069b6f6
Update html printing (#115)
svilupp Mar 27, 2024
7eb32b1
Fix bug in `print_html` (#116)
svilupp Mar 27, 2024
ee3459d
Add Binary embeddings to RAGTools (#117)
svilupp Apr 3, 2024
024562d
Add data extraction for Anthropic models (#122)
svilupp Apr 5, 2024
fc0073e
Register mistral tiny (#123)
svilupp Apr 6, 2024
84f68cc
Add new GPT-4 Turbo (#124)
svilupp Apr 10, 2024
5054f4c
Update code fences in the hero cards in the docs
cpfiffer Apr 11, 2024
49f3f5b
Update Docs Hero page
svilupp Apr 11, 2024
eb94d1a
Add TraceMessage for observability (#133)
svilupp Apr 15, 2024
23e4f0b
Update binary RAG pipeline (#136)
svilupp Apr 17, 2024
a81bd76
Fix truncate_dimension (#137)
svilupp Apr 18, 2024
790b4e2
Add Llama 3 (#138)
svilupp Apr 19, 2024
02b7cf1
Add support for groq (#139)
svilupp Apr 20, 2024
3b18f02
Update Project.toml (#140)
svilupp Apr 20, 2024
ebaa2c5
Update new OpenAI pre-paid credit requirements (#135)
KronosTheLate Apr 20, 2024
4f28fb6
Add model providers and Supported functions (#134)
adarshpalaskar1 Apr 20, 2024
382638d
Add templates and minor improvements (#142)
svilupp Apr 27, 2024
4655422
Add DeepSeek models (#147)
svilupp May 7, 2024
36aa9cb
Update Changelog (#148)
svilupp May 7, 2024
64f51cb
Add GPT4-Omni
svilupp May 13, 2024
f29b99e
Improvements to aiclassify and aitemplates (#150)
svilupp May 18, 2024
2549722
Improve tracer schema / automated logging (#151)
svilupp May 19, 2024
d244680
Add BitPacked embeddings for RAG retrieval (#152)
svilupp May 19, 2024
39647b3
Add more tracer kwargs for logging
svilupp May 20, 2024
d8f3fca
Fix LCS utility
svilupp May 21, 2024
dee5263
Template file parsing fix
svilupp May 22, 2024
bf1eb85
Add BM25 Index (#157)
svilupp May 28, 2024
608bdf0
E2E Hybrid retrieval
svilupp May 28, 2024
6ec6456
Register Mistral Codestral
svilupp May 29, 2024
0f45fde
add FlashRank.jl package extension
svilupp Jun 11, 2024
f3e3994
Update deps
svilupp Jun 14, 2024
223107f
Update RAG performance
svilupp Jun 18, 2024
a4f191a
Improve unpack_bits (#165)
svilupp Jun 18, 2024
03029ee
Update FlashRank to use only unique documents (#166)
svilupp Jun 18, 2024
bbb1c38
fix formatting of changelog
svilupp Jun 18, 2024
f63935d
Add Anthropic Sonnet 3.5 (#167)
svilupp Jun 20, 2024
be91226
Rag Tools fix + relaxing `const` for API key loading
svilupp Jun 26, 2024
971e926
Update CHANGELOG.md
svilupp Jun 26, 2024
4f0cfd7
Add back API keys
svilupp Jun 26, 2024
af4f67f
Add RankGPT (#172)
svilupp Jul 1, 2024
44450c0
Reciprocal Rank Fusion
svilupp Jul 1, 2024
b5f089f
Update rankGPT
svilupp Jul 2, 2024
6e9f0ea
Compat-flashrank-v04
svilupp Jul 7, 2024
988f3d3
Fix CohereReranker bug
svilupp Jul 9, 2024
dd3fbbc
Drafter update
svilupp Jul 13, 2024
dfb88a1
Add AllTagFilter (#178)
svilupp Jul 16, 2024
53ac0b8
Add GPT-4o-mini + set as default (#180)
svilupp Jul 19, 2024
fcd7509
Add SubChunkIndex (view of index)
svilupp Jul 21, 2024
89d4c43
Add SubDocumentTermMatrix (#181)
svilupp Jul 22, 2024
0f1a334
Register Llama3.1 + minor retrieval improvements
svilupp Jul 23, 2024
c02bd43
Llama 3.1 Fireworks.ai
svilupp Jul 24, 2024
886bdd1
Add Mistral Large 2 and Nemo
svilupp Jul 24, 2024
f6cb37d
Fix wrap string
svilupp Jul 30, 2024
0916bd7
version fix
svilupp Jul 30, 2024
e2553b8
Updates embedding concatenation (#186)
svilupp Aug 4, 2024
1473799
Fix getindex
svilupp Aug 4, 2024
1f941f9
Update RAGTools docstrings
svilupp Aug 5, 2024
a53fbfe
Add GPT4o registry entry
svilupp Aug 7, 2024
7d6a8d8
Add DTM specialized method
svilupp Aug 8, 2024
8f697f6
Add OpenAI structured outputs
svilupp Aug 9, 2024
a2cde30
Add ChatGPT 4o (#195)
svilupp Aug 14, 2024
8e880f8
Add experimental prompt cache support for Anthropic
svilupp Aug 16, 2024
06aa438
Improved docs styling with DocumenterVitepress
lazarusA Aug 20, 2024
271ff08
Add assertion for empty docs in `get_embeddings` (#200)
iuliadmtru Aug 23, 2024
5fe770a
Fix documentation for building RAG (#201)
iuliadmtru Aug 23, 2024
1c8fb7d
Update Structured Extraction
svilupp Sep 3, 2024
e1a3d23
Add stream callbacks
svilupp Sep 7, 2024
1ef3ba3
Update Anthropic kwargs + docs
svilupp Sep 9, 2024
a125fcd
Updated DocumentTermMatrix implementation
svilupp Sep 10, 2024
7038175
OpenAI JSON mode extraction
svilupp Sep 15, 2024
9a75d2c
OpenRouter support, new OpenAI o1 models (#207)
svilupp Sep 15, 2024
fe23115
Update reserved keywords
svilupp Sep 15, 2024
51a46bc
Tidy up streaming
svilupp Sep 17, 2024
2a01bb5
Enable Streaming for OpenAI-compatible models
svilupp Sep 18, 2024
0745925
Fix logging
svilupp Sep 22, 2024
cd3f891
Support for Azure OpenAI API
pabvald Sep 24, 2024
ea8d51a
Version up
svilupp Sep 24, 2024
a66da99
Add Cerebras API + node validation for airetry!
svilupp Oct 9, 2024
f945ab9
Tool interface update
svilupp Oct 19, 2024
5280ee8
Add a new models, clean up tools (#219)
svilupp Oct 20, 2024
c57d209
Update VERSION
svilupp Oct 20, 2024
a8c1799
Multi-turn tool fix
svilupp Oct 20, 2024
2117e1c
Add support for cached tokens in cost calculation
devin-ai-integration[bot] Oct 22, 2024
36 changes: 17 additions & 19 deletions .github/workflows/CI.yml
@@ -23,6 +23,7 @@ jobs:
version:
- '1.9'
- '1.10'
# - '1.11'
# - 'nightly'
os:
- ubuntu-latest
@@ -49,24 +50,21 @@ jobs:
permissions:
contents: write
statuses: write
pages: write
id-token: write
actions: write
steps:
- uses: actions/checkout@v3
- uses: julia-actions/setup-julia@v1
with:
version: '1'
- name: Configure doc environment
run: |
julia --project=docs/ -e '
using Pkg
Pkg.develop(PackageSpec(path=pwd()))
Pkg.instantiate()'
- uses: julia-actions/julia-buildpkg@v1
- uses: julia-actions/julia-docdeploy@v1
- name: Checkout
uses: actions/checkout@v4
- name: Setup Julia
uses: julia-actions/setup-julia@v1
- name: Pull Julia cache
uses: julia-actions/cache@v1
- name: Install documentation dependencies
run: julia --project=docs -e 'using Pkg; pkg"dev ."; Pkg.instantiate(); Pkg.precompile(); Pkg.status()'
- name: Build and deploy docs
uses: julia-actions/julia-docdeploy@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- run: |
julia --project=docs -e '
using Documenter: DocMeta, doctest
using PromptingTools
DocMeta.setdocmeta!(PromptingTools, :DocTestSetup, :(using PromptingTools); recursive=true)
doctest(PromptingTools)'
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} # For authentication with GitHub Actions token
GKSwstype: "100" # for Plots.jl plots (if you have them)
JULIA_DEBUG: "Documenter"
9 changes: 8 additions & 1 deletion .gitignore
@@ -6,4 +6,11 @@
/docs/build/

**/.DS_Store
**/.vscode
**/.vscode

# exclude scratch files
**/_*
docs/package-lock.json

# Ignore Cursor rules
.cursorrules
450 changes: 448 additions & 2 deletions CHANGELOG.md

Large diffs are not rendered by default.

22 changes: 18 additions & 4 deletions Project.toml
@@ -1,35 +1,47 @@
name = "PromptingTools"
uuid = "670122d1-24a8-4d70-bfce-740807c42192"
authors = ["J S @svilupp and contributors"]
version = "0.13.0"
version = "0.59.1"

[deps]
AbstractTrees = "1520ce14-60c1-5f80-bbc7-55ef81b5835c"
Base64 = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f"
Dates = "ade2ca70-3891-5945-98fb-dc099432e06a"
HTTP = "cd3eb016-35fb-5094-929b-558a96fad6f3"
JSON3 = "0f8b85d8-7281-11e9-16c2-39a750bddbf1"
Logging = "56ddb016-857b-54e1-b83d-db4d58db5568"
OpenAI = "e9f21f70-7185-4079-aca2-91159181367c"
Pkg = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
PrecompileTools = "aea7be01-6a6a-4083-8856-8a6e6704d82a"
Preferences = "21216c6a-2e73-6563-6e65-726566657250"
REPL = "3fa0cd96-eef1-5676-8a61-b3b8758bbffb"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

[weakdeps]
FlashRank = "22cc3f58-1757-4700-bb45-2032706e5a8d"
GoogleGenAI = "903d41d1-eaca-47dd-943b-fee3930375ab"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
Markdown = "d6f4376e-aef5-505a-96c1-9c027394607a"
Snowball = "fb8f903a-0164-4e73-9ffe-431110250c3b"
SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
Unicode = "4ec0a83e-493e-50e2-b9ac-8f72acf5a8f5"

[extensions]
FlashRankPromptingToolsExt = ["FlashRank"]
GoogleGenAIPromptingToolsExt = ["GoogleGenAI"]
MarkdownPromptingToolsExt = ["Markdown"]
RAGToolsExperimentalExt = ["SparseArrays", "LinearAlgebra"]
RAGToolsExperimentalExt = ["SparseArrays", "LinearAlgebra", "Unicode"]
SnowballPromptingToolsExt = ["Snowball"]

[compat]
AbstractTrees = "0.4"
Aqua = "0.7"
Base64 = "<0.0.1, 1"
HTTP = "1"
Dates = "<0.0.1, 1"
FlashRank = "0.4"
GoogleGenAI = "0.3"
HTTP = "1.10.8"
JSON3 = "1"
LinearAlgebra = "<0.0.1, 1"
Logging = "<0.0.1, 1"
@@ -38,6 +50,7 @@ OpenAI = "0.9"
Pkg = "<0.0.1, 1"
PrecompileTools = "1"
Preferences = "1"
REPL = "<0.0.1, 1"
Random = "<0.0.1, 1"
SparseArrays = "<0.0.1, 1"
Statistics = "<0.0.1, 1"
@@ -49,6 +62,7 @@ Aqua = "4c88cf16-eb10-579e-8560-4a9242c79595"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
Unicode = "4ec0a83e-493e-50e2-b9ac-8f72acf5a8f5"

[targets]
test = ["Aqua", "SparseArrays", "Statistics", "LinearAlgebra", "Markdown"]
test = ["Aqua", "FlashRank", "SparseArrays", "Statistics", "LinearAlgebra", "Markdown", "Snowball", "Unicode"]
114 changes: 106 additions & 8 deletions README.md
@@ -2,6 +2,7 @@

[![Stable](https://img.shields.io/badge/docs-stable-blue.svg)](https://svilupp.github.io/PromptingTools.jl/stable/)
[![Dev](https://img.shields.io/badge/docs-dev-blue.svg)](https://svilupp.github.io/PromptingTools.jl/dev/)
[![Slack](https://img.shields.io/badge/slack-%23generative--ai-brightgreen.svg?logo=slack)](https://julialang.slack.com/archives/C06G90C697X)
[![Build Status](https://github.com/svilupp/PromptingTools.jl/actions/workflows/CI.yml/badge.svg?branch=main)](https://github.com/svilupp/PromptingTools.jl/actions/workflows/CI.yml?query=branch%3Amain)
[![Coverage](https://codecov.io/gh/svilupp/PromptingTools.jl/branch/main/graph/badge.svg)](https://codecov.io/gh/svilupp/PromptingTools.jl)
[![Aqua](https://raw.githubusercontent.com/JuliaTesting/Aqua.jl/master/badge.svg)](https://github.com/JuliaTesting/Aqua.jl)
@@ -11,17 +12,22 @@ Streamline your life using PromptingTools.jl, the Julia package that simplifies

PromptingTools.jl is not meant for building large-scale systems. It's meant to be the go-to tool in your global environment that will save you 20 minutes every day!

> [!TIP]
> Jump to the **[docs](https://svilupp.github.io/PromptingTools.jl/dev/)**

## Quick Start with `@ai_str` and Easy Templating

Getting started with PromptingTools.jl is as easy as importing the package and using the `@ai_str` macro for your questions.

Note: You will need to set your OpenAI API key as an environment variable before using PromptingTools.jl (see the [Creating OpenAI API Key](#creating-openai-api-key) section below).
Note: You will need to set your OpenAI API key as an environment variable before using PromptingTools.jl (see the [Creating OpenAI API Key](#creating-openai-api-key) section below).

Following the introduction of [Prepaid Billing](https://help.openai.com/en/articles/8264644-what-is-prepaid-billing), you'll need to buy some credits to get started ($5 minimum).
For a quick start, simply set it via `ENV["OPENAI_API_KEY"] = "your-api-key"`

Install PromptingTools:
```julia
using Pkg
Pkg.add("PromptingTools.jl")
Pkg.add("PromptingTools")
```

And we're ready to go!
@@ -76,6 +82,7 @@ For more practical examples, see the `examples/` folder and the [Advanced Exampl
- [Table of Contents](#table-of-contents)
- [Why PromptingTools.jl](#why-promptingtoolsjl)
- [Advanced Examples](#advanced-examples)
- [`ai*` Functions Overview](#ai-functions-overview)
- [Seamless Integration Into Your Workflow](#seamless-integration-into-your-workflow)
- [Advanced Prompts / Conversations](#advanced-prompts--conversations)
- [Templated Prompts](#templated-prompts)
@@ -89,10 +96,12 @@ For more practical examples, see the `examples/` folder and the [Advanced Exampl
- [Experimental Agent Workflows / Output Validation with `airetry!`](#experimental-agent-workflows--output-validation-with-airetry)
- [Using Ollama models](#using-ollama-models)
- [Using MistralAI API and other OpenAI-compatible APIs](#using-mistralai-api-and-other-openai-compatible-apis)
- [Using Anthropic Models](#using-anthropic-models)
- [More Examples](#more-examples)
- [Package Interface](#package-interface)
- [Frequently Asked Questions](#frequently-asked-questions)
- [Why OpenAI](#why-openai)
- [What if I cannot access OpenAI?](#what-if-i-cannot-access-openai)
- [Data Privacy and OpenAI](#data-privacy-and-openai)
- [Creating OpenAI API Key](#creating-openai-api-key)
- [Setting OpenAI Spending Limits](#setting-openai-spending-limits)
@@ -102,6 +111,7 @@ For more practical examples, see the `examples/` folder and the [Advanced Exampl
- [Instant Access from Anywhere](#instant-access-from-anywhere)
- [Open Source Alternatives](#open-source-alternatives)
- [Setup Guide for Ollama](#setup-guide-for-ollama)
- [How would I fine-tune a model?](#how-would-i-fine-tune-a-model)
- [Roadmap](#roadmap)

## Why PromptingTools.jl
@@ -118,12 +128,59 @@ Some features:

## Advanced Examples

TODOs:
### `ai*` Functions Overview

Noteworthy functions: `aigenerate`, `aiembed`, `aiclassify`, `aiextract`, `aiscan`, `aiimage`, `aitemplates`

All `ai*` functions have the same basic structure:

`ai*(<optional schema>, <prompt or conversation>; <optional keyword arguments>)`,

but they differ in purpose:

- `aigenerate` is the general-purpose function to generate any text response with LLMs, ie, it returns `AIMessage` with field `:content` containing the generated text (eg, `ans.content isa AbstractString`)
- `aiembed` is designed to extract embeddings from the AI model's response, ie, it returns `DataMessage` with field `:content` containing the embeddings (eg, `ans.content isa AbstractArray`)
- `aiextract` is designed to extract structured data from the AI model's response and return them as a Julia struct (eg, if we provide `return_type=Food`, we get `ans.content isa Food`). You need to define the return type first and then provide it as a keyword argument.
- `aitools` is designed for agentic workflows with a mix of tool calls and user inputs. It can work with simple functions and execute them.
- `aiclassify` is designed to classify the input text into (or simply respond within) a set of discrete `choices` provided by the user. It can be very useful as an LLM Judge or a router for RAG systems, as it uses the "logit bias trick" and generates exactly 1 token. It returns `AIMessage` with field `:content`, but the `:content` can be only one of the provided `choices` (eg, `ans.content in choices`)
- `aiscan` is for working with images and vision-enabled models (as an input), but it returns `AIMessage` with field `:content` containing the generated text (eg, `ans.content isa AbstractString`) similar to `aigenerate`.
- `aiimage` is for generating images (eg, with OpenAI DALL-E 3). It returns a `DataMessage`, where the field `:content` might contain either the URL to download the image from or the Base64-encoded image depending on the user-provided kwarg `api_kwargs.response_format`.
- `aitemplates` is a helper function to discover available templates and see their details (eg, `aitemplates("some keyword")` or `aitemplates(:AssistantAsk)`)

If you're using a known `model`, you do NOT need to provide a `schema` (the first argument).

Optional keyword arguments in `ai*` tend to be:

- `model::String` - Which model you want to use
- `verbose::Bool` - Whether you want to see INFO logs around AI costs
- `return_all::Bool` - Whether you want the WHOLE conversation or just the AI answer (ie, whether you want to include your inputs/prompt in the output)
- `api_kwargs::NamedTuple` - Specific parameters for the model, eg, `temperature=0.0` to be NOT creative (and have more similar output in each run)
- `http_kwargs::NamedTuple` - Parameters for the HTTP.jl package, eg, `readtimeout = 120` to time out in 120 seconds if no response was received.
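For illustration, a minimal sketch combining these keyword arguments (the model name is just an example; it assumes a configured `OPENAI_API_KEY`):

```julia
using PromptingTools

# A deterministic call; keep the whole conversation, not just the answer
conv = aigenerate("What is the capital of France?";
    model = "gpt-4o-mini",               # which model to use
    api_kwargs = (; temperature = 0.0),  # less creative, more repeatable
    http_kwargs = (; readtimeout = 120), # give up after 120s of silence
    return_all = true)                   # include the prompt messages too

last(conv).content  # the AI answer as a String
```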

**Experimental: AgentTools**

In addition to the above list of `ai*` functions, you can also use the **"lazy" counterparts** of these functions from the experimental AgentTools module.
```julia
using PromptingTools.Experimental.AgentTools
```

For example, `AIGenerate()` will create a lazy instance of `aigenerate`. It is an instance of `AICall` with `aigenerate` as its ai function.
It uses exactly the same arguments and keyword arguments as `aigenerate` (see `?aigenerate` for details).

"lazy" refers to the fact that it does NOT generate any output when instantiated (only when `run!` is called).

Or said differently, the `AICall` struct and all its flavors (`AIGenerate`, ...) are designed to facilitate a deferred execution model (lazy evaluation) for AI functions that interact with a Large Language Model (LLM). It stores the necessary information for an AI call and executes the underlying AI function only when supplied with a `UserMessage` or when the `run!` method is applied. This allows us to remember user inputs and trigger the LLM call repeatedly if needed, which enables automatic fixing (see `?airetry!`).

If you would like a powerful auto-fixing workflow, you can use `airetry!`, which leverages Monte-Carlo tree search to pick the optimal trajectory of conversation based on your requirements.
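A short sketch of the lazy workflow described above (the condition and feedback are hypothetical examples; `last_output` is assumed to return the latest AI answer):

```julia
using PromptingTools
using PromptingTools.Experimental.AgentTools

# Lazy: nothing is sent to the LLM until `run!` is called
out = AIGenerate("Say hi three times!"; model = "gpt-4o-mini")
run!(out)

# Re-run with feedback until the condition passes (or max_retries is hit)
airetry!(x -> occursin("hi", lowercase(last_output(x))), out,
    "Your answer must contain the word 'hi'.")
```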

**Experimental: RAGTools**

- [ ] Add more practical examples (with DataFrames!)
- [ ] Add an example of how to build a RAG app in 50 lines
Lastly, we provide a set of tools to build RAG applications (Retrieve, Answer, Generate).

Noteworthy functions: `aigenerate`, `aiembed`, `aiclassify`, `aiextract`, `aitemplates`
It can be as simple as two calls: `build_index` and `airag`.

If you then use pretty-printing with `PromptingTools.pprint`, we highlight the generated text vs the text likely sourced from the context, and we score how strongly the generated answer is supported by the context.
In addition, we annotate each generated chunk with a reference to which source document it likely came from (including the confidence score between 0 and 1).
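The two calls mentioned above can be sketched as follows (toy documents; exact keyword arguments may differ, see the RAGTools docs):

```julia
using PromptingTools
using PromptingTools.Experimental.RAGTools

# Build an in-memory index over a few toy documents, then ask against it
docs = ["PromptingTools is a Julia package for working with LLMs.",
    "RAGTools provides build_index and airag for RAG pipelines."]
index = build_index(docs)
answer = airag(index; question = "What does RAGTools provide?")
```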

### Seamless Integration Into Your Workflow
Google search is great, but it's a context switch. You often have to open a few pages and read through the discussion to find the answer you need. Same with the ChatGPT website.
@@ -414,7 +471,7 @@ run!(out)
Expand Down Expand Up @@ -414,7 +471,7 @@ run!(out)
How is it useful? We can use the same "inputs" for repeated calls, eg, when we want to validate
or regenerate some outputs. We have a function `airetry` to help us with that.

The signature of `airetry` is `airetry(condition_function, aicall::AICall, feedback_function)`.
The signature of `airetry!` is `airetry!(condition_function, aicall::AICall, feedback_function)`.
It evaluates the condition `condition_function` on the `aicall` object (eg, we evaluate `f_cond(aicall) -> Bool`). If it fails, we call `feedback_function` on the `aicall` object to provide feedback for the AI model (eg, `f_feedback(aicall) -> String`) and repeat the process until it passes or until `max_retries` value is exceeded.

We can catch API failures (no feedback needed, so none is provided)
@@ -528,6 +585,30 @@ As you can see, it also works for any local models that you might have running o

Note: At the moment, we only support `aigenerate` and `aiembed` functions for MistralAI and other OpenAI-compatible APIs. We plan to extend the support in the future.

### Using Anthropic Models

Make sure the `ANTHROPIC_API_KEY` environment variable is set to your API key.

```julia
# claudeh is an alias for Claude 3 Haiku
ai"Say hi!"claudeh
```

Preset model aliases are `claudeo`, `claudes`, and `claudeh`, for Claude 3 Opus, Sonnet, and Haiku, respectively.

The corresponding schema is `AnthropicSchema`.

There are several prompt templates with `XML` in the name, suggesting that they use Anthropic-friendly XML formatting for separating sections.
Find them with `aitemplates("XML")`.

```julia
# claudeo is an alias for Claude 3 Opus
msg = aigenerate(
    :JuliaExpertAskXML, ask = "How to write a function to convert Date to Millisecond?",
    model = "claudeo")
```


### More Examples

TBU...
@@ -599,6 +680,13 @@ There will be situations when you cannot or do not want to use it (eg, privacy, cost, etc.). In that
Expand Down Expand Up @@ -599,6 +680,13 @@ There will be situations not or cannot use it (eg, privacy, cost, etc.). In that

Note: To get started with [Ollama.ai](https://ollama.ai/), see the [Setup Guide for Ollama](#setup-guide-for-ollama) section below.

### What if I cannot access OpenAI?

There are many alternatives:

- **Other APIs**: MistralAI, Anthropic, Google, Together, Fireworks, Voyager (the latter ones tend to give free credits upon joining!)
- **Locally-hosted models**: Llama.cpp/Llama.jl, Ollama, vLLM (see the examples and the corresponding docs)

### Data Privacy and OpenAI

At the time of writing, OpenAI does NOT use the API calls for training their models.
@@ -681,7 +769,7 @@ A better way:
- On a Mac, add the configuration line to your terminal's configuration file (eg, `~/.zshrc`). It will get automatically loaded every time you launch the terminal
- On Windows, set it as a system variable in "Environment Variables" settings (see the Resources)

We also support Preferences.jl, so you can simply run: `PromptingTools.set_preferences!("OPENAI_API_KEY"="your-api-key")` and it will be persisted across sessions.
We also support Preferences.jl, so you can simply run: `PromptingTools.set_preferences!("OPENAI_API_KEY"=>"your-api-key")` and it will be persisted across sessions.
To see the current preferences, run `PromptingTools.get_preferences("OPENAI_API_KEY")`.

Be careful NOT TO COMMIT `LocalPreferences.toml` to GitHub, as it would show your API Key to the world!
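The Preferences.jl route can be sketched in two lines (the key value is a placeholder):

```julia
using PromptingTools

# Persisted to LocalPreferences.toml -- do not commit that file!
PromptingTools.set_preferences!("OPENAI_API_KEY" => "your-api-key")
PromptingTools.get_preferences("OPENAI_API_KEY")  # inspect what is stored
```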
@@ -729,6 +817,16 @@ Show currently available models with `ollama list`.

See [Ollama.ai](https://ollama.ai/) for more information.

### How would I fine-tune a model?

Fine-tuning is a powerful technique to adapt a model to your specific use case (mostly the format/syntax/task). It requires a dataset of examples, which you can now easily generate with PromptingTools.jl!

1. You can save any conversation (vector of messages) to a file with `PT.save_conversation("filename.json", conversation)`.

2. Once the finetuning time comes, create a bundle of ShareGPT-formatted conversations (common finetuning format) in a single `.jsonl` file. Use `PT.save_conversations("dataset.jsonl", [conversation1, conversation2, ...])` (note the plural "conversationS" in the function name).

For an example of an end-to-end finetuning process, check out our sister project [JuliaLLMLeaderboard Finetuning experiment](https://github.com/svilupp/Julia-LLM-Leaderboard/blob/main/experiments/cheater-7b-finetune/README.md). It shows the process of finetuning for half a dollar with [Jarvislabs.ai](https://jarvislabs.ai/templates/axolotl) and [Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl).
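The two steps above can be sketched as follows (the prompt and file names are illustrative; it assumes a configured API key):

```julia
using PromptingTools
const PT = PromptingTools

# 1) Capture a full conversation (prompt + answer) by asking for everything back
conversation = aigenerate("Hello!"; return_all = true)
PT.save_conversation("conversation1.json", conversation)

# 2) Bundle many conversations into one ShareGPT-formatted .jsonl dataset
PT.save_conversations("dataset.jsonl", [conversation])
```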

## Roadmap

This is a list of features that I'd like to see in the future (in no particular order):
4 changes: 4 additions & 0 deletions docs/.gitignore
@@ -0,0 +1,4 @@
build/
node_modules/
package-lock.json
Manifest.toml
7 changes: 7 additions & 0 deletions docs/Project.toml
@@ -1,11 +1,18 @@
[deps]
DataFramesMeta = "1313f7d8-7da2-5740-9ea0-a2ca25f37964"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
DocumenterVitepress = "4710194d-e776-4893-9690-8d956a29c365"
FlashRank = "22cc3f58-1757-4700-bb45-2032706e5a8d"
GoogleGenAI = "903d41d1-eaca-47dd-943b-fee3930375ab"
HTTP = "cd3eb016-35fb-5094-929b-558a96fad6f3"
JSON3 = "0f8b85d8-7281-11e9-16c2-39a750bddbf1"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
Literate = "98b081ad-f1c9-55d3-8b20-4c87d4299306"
LiveServer = "16fef848-5104-11e9-1b77-fb7a48bbb589"
Markdown = "d6f4376e-aef5-505a-96c1-9c027394607a"
PromptingTools = "670122d1-24a8-4d70-bfce-740807c42192"
Snowball = "fb8f903a-0164-4e73-9ffe-431110250c3b"
SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"

[compat]
DocumenterVitepress = "0.0.7"
Loading