Commit

update page
btyu committed Nov 12, 2024
1 parent 359cfd4 commit a80f570
Showing 2 changed files with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion index.html
@@ -275,7 +275,7 @@ <h1 class="title is-1 mmmu">
<div class="container is-max-desktop">
<div class="content has-text-centered">
<p style="text-align:left">The following table shows the overall performance on the specialized tasks (the SMolInstruct dataset) and the general questions (the MMLU-Chemistry and GPQA-Chemistry dataset).</p>
-<img style="width: 90%;" src="./static/images/tables/overall_performance.png" class="center">
+<img style="width:80%;" src="./static/images/tables/overall_performance.png" class="center">
<p></p>
<p style="text-align:left">To understand the errors that ChemAgent makes, for all the samples where ChemAgent (GPT) fails, we engage a chemistry expert to analyze the problem solving process and identify the errors, which are categorized into three types: <strong>reasoning error</strong>, <strong>grounding error</strong>, and <strong>tool error</strong>. The following bar charts show the error distribution.</p>
<img style="width: 100%;" src="./static/images/error_analysis.png" class="center">
Binary file modified static/images/error_analysis.png