diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json index 6ee948701..25451708f 100644 --- a/dev/.documenter-siteinfo.json +++ b/dev/.documenter-siteinfo.json @@ -1 +1 @@ -{"documenter":{"julia_version":"1.10.1","generation_timestamp":"2024-03-01T19:41:26","documenter_version":"1.2.1"}} \ No newline at end of file +{"documenter":{"julia_version":"1.10.1","generation_timestamp":"2024-03-01T19:44:38","documenter_version":"1.2.1"}} \ No newline at end of file diff --git a/dev/examples/building_RAG/index.html b/dev/examples/building_RAG/index.html index f20d3079f..b192f2506 100644 --- a/dev/examples/building_RAG/index.html +++ b/dev/examples/building_RAG/index.html @@ -65,4 +65,4 @@ results = filter(x->!isnothing(x.answer_score), results);
Note: You could also use the vectorized version, results = run_qa_evals(evals), to evaluate all items at once.
# Let's take a simple average to calculate our score
@info "RAG Evals: $(length(results)) results, Avg. score: $(round(mean(x->x.answer_score, results);digits=1)), Retrieval score: $(round(Int,100*mean(x->x.retrieval_score,results)))%"
[ Info: RAG Evals: 10 results, Avg. score: 4.6, Retrieval score: 100%
-
Note: The retrieval score is 100% only because we have two small documents and are evaluating only 10 items. In practice, you would have a much larger document set and a much larger eval set, which would give a more representative retrieval score.
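As a minimal sketch of the averaging step, the snippet below recomputes the two summary numbers on made-up data. The `mock_results` named tuples and all of their values are hypothetical; only the field names `answer_score` and `retrieval_score` come from the example above. It scales the retrieval score to a percentage before rounding, so a partial hit rate is not collapsed to 0% or 100%.

```julia
using Statistics: mean

# Hypothetical stand-ins for the evaluation results (values are invented).
mock_results = [
    (answer_score = 5.0, retrieval_score = 1.0),
    (answer_score = 4.0, retrieval_score = 1.0),
    (answer_score = 4.5, retrieval_score = 0.0),
    (answer_score = nothing, retrieval_score = 1.0),  # evaluation that produced no score
]

# Drop items without an answer score, mirroring the filter step above.
scored = filter(x -> !isnothing(x.answer_score), mock_results)

# Simple average of the answer scores, rounded to one decimal place.
avg_score = round(mean(x -> x.answer_score, scored); digits = 1)

# Scale to percent first, then round to an integer.
retrieval_pct = round(Int, 100 * mean(x -> x.retrieval_score, scored))

@info "RAG Evals: $(length(scored)) results, Avg. score: $(avg_score), Retrieval score: $(retrieval_pct)%"
```

With a larger and more varied eval set, the same two lines would report a retrieval percentage somewhere between 0 and 100 rather than a saturated value.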
You can also analyze the results in a DataFrame:
df = DataFrame(results)
Row | source | context | question | answer | retrieval_score | retrieval_rank | answer_score | parameters |
---|---|---|---|---|---|---|---|---|
 | String | String | String | SubStrin… | Float64 | Int64 | Float64 | Dict… |
1 | examples/data/database_style_joins.txt | Database-Style Joins\nIntroduction to joins\nWe often need to combine two or more data sets together to provide a complete picture of the topic we are studying. For example, suppose that we have the following two data sets:\n\njulia> using DataFrames | What is the purpose of joining two or more data sets together? | The purpose of joining two or more data sets together is to combine the data sets based on a common key and provide a complete picture of the topic being studied. | 1.0 | 1 | 5.0 | Dict(:top_k=>3) |
2 | examples/data/database_style_joins.txt | julia> people = DataFrame(ID=[20, 40], Name=["John Doe", "Jane Doe"])\n2×2 DataFrame\n Row │ ID Name\n │ Int64 String\n─────┼─────────────────\n 1 │ 20 John Doe\n 2 │ 40 Jane Doe | What is the DataFrame called 'people' composed of? | The DataFrame called 'people' consists of two columns: 'ID' and 'Name'. The 'ID' column contains integers, and the 'Name' column contains strings. | 1.0 | 1 | 4.0 | Dict(:top_k=>3) |
3 | examples/data/database_style_joins.txt | julia> jobs = DataFrame(ID=[20, 40], Job=["Lawyer", "Doctor"])\n2×2 DataFrame\n Row │ ID Job\n │ Int64 String\n─────┼───────────────\n 1 │ 20 Lawyer\n 2 │ 40 Doctor | What are the jobs and IDs listed in the dataframe? | The jobs and IDs listed in the dataframe are as follows:\n\nID: 20\nJob: Lawyer\n\nID: 40\nJob: Doctor | 1.0 | 1 | 4.67 | Dict(:top_k=>3) |
4 | examples/data/database_style_joins.txt | We might want to work with a larger data set that contains both the names and jobs for each ID. We can do this using the innerjoin function: | How can we combine the names and jobs for each ID in a larger data set? | We can use the `innerjoin` function to combine the names and jobs for each ID in a larger data set. | 1.0 | 1 | 4.33333 | Dict(:top_k=>3) |
5 | examples/data/database_style_joins.txt | julia> innerjoin(people, jobs, on = :ID)\n2×3 DataFrame\n Row │ ID Name Job\n │ Int64 String String\n─────┼─────────────────────────\n 1 │ 20 John Doe Lawyer\n 2 │ 40 Jane Doe Doctor | What is the name of the person with the ID 40 and their job? | The name of the person with the ID 40 is Jane Doe and their job is Doctor. | 1.0 | 2 | 4.67 | Dict(:top_k=>3) |
6 | examples/data/database_style_joins.txt | In relational database theory, this operation is generally referred to as a join. The columns used to determine which rows should be combined during a join are called keys.\n\nThe following functions are provided to perform seven kinds of joins: | What are the different kinds of joins? | The different kinds of joins are:\n\n1. Inner Join: Returns only the rows that have matching values in both data frames.\n2. Left Join: Returns all rows from the left data frame and the matching rows from the right data frame.\n3. Right Join: Returns all rows from the right data frame and the matching rows from the left data frame.\n4. Full Outer Join: Returns all rows from both data frames and fills in missing values with null.\n5. Cross Join: Returns the cartesian product of the rows from both data frames.\n6. Semi Join: Returns only the rows from the left data frame that have matching values in the right data frame.\n7. Anti Join: Returns only the rows from the left data frame that do not have matching values in the right data frame. | 1.0 | 1 | 4.66667 | Dict(:top_k=>3) |
7 | examples/data/database_style_joins.txt | innerjoin: the output contains rows for values of the key that exist in all passed data frames. | What does the output of the inner join operation contain? | The output of the inner join operation contains only the rows for values of the key that exist in all passed data frames. | 1.0 | 1 | 5.0 | Dict(:top_k=>3) |
8 | examples/data/database_style_joins.txt | leftjoin: the output contains rows for values of the key that exist in the first (left) argument, whether or not that value exists in the second (right) argument. | What is the purpose of the left join operation? | The purpose of the left join operation is to combine data from two tables based on a common key, where all rows from the left (first) table are included in the output, regardless of whether there is a match in the right (second) table. | 1.0 | 1 | 4.66667 | Dict(:top_k=>3) |
9 | examples/data/database_style_joins.txt | rightjoin: the output contains rows for values of the key that exist in the second (right) argument, whether or not that value exists in the first (left) argument. | What is the purpose of the right join operation? | The purpose of the right join operation is to include all the rows from the second (right) argument, regardless of whether a match is found in the first (left) argument. | 1.0 | 1 | 4.67 | Dict(:top_k=>3) |
10 | examples/data/database_style_joins.txt | outerjoin: the output contains rows for values of the key that exist in any of the passed data frames.\nsemijoin: Like an inner join, but output is restricted to columns from the first (left) argument. | What is the difference between outer join and semi join? | The difference between outer join and semi join is that outer join includes rows for values of the key that exist in any of the passed data frames, whereas semi join is like an inner join but only outputs columns from the first argument. | 1.0 | 1 | 4.66667 | Dict(:top_k=>3) |
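A table like this is mostly useful for spotting weak items. The sketch below shows one possible way to triage it, assuming DataFrames.jl is installed (the example above already uses it). `mock_df` is a hypothetical miniature of `df` with invented values; only the column names are taken from the table above.

```julia
using DataFrames

# Hypothetical miniature of `df`; the values are invented for illustration.
mock_df = DataFrame(
    question = ["Q1", "Q2", "Q3"],
    retrieval_rank = [1, 2, 1],
    answer_score = [5.0, 4.67, 4.0],
)

# Questions whose source chunk was not the top-ranked retrieval hit.
not_top_ranked = subset(mock_df, :retrieval_rank => ByRow(>(1)))

# Lowest-scoring answers first, to prioritize manual review.
worst_first = sort(mock_df, :answer_score)
```

Applied to the real `df`, the first query would surface rows such as row 5 above, where the source chunk was retrieved at rank 2.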
We're done for today!
Possible next steps include setting extract_metadata=true in build_index and using the rerank function (you can use the Cohere ReRank API)... and much more! See some ideas in the Anyscale RAG tutorial.
This page was generated using Literate.jl.
This document was generated with Documenter.jl version 1.2.1 on Friday 1 March 2024. Using Julia version 1.10.1.