Pinecone metadata to include confidence & summary #168
Comments
I'm not sure I understand what exactly that would look like.
Right now, it looks like the arXiv abstracts are stored in summaries and not included in the text. I think that's a key missing component. Initially, we can just split the abstracts/summaries like any other text, with the same header. Eventually, we might want a separate namespace where all summaries are embedded without being split into chunks. This could help retrieve sources to answer questions like "What was the paper/blog post about...", which is an often-requested feature. We should discuss before implementing this.
For thumbs down, I would set confidence=0 in the metadata, then make sure those entries don't get returned in Pinecone search results. As long as the results are filtered out, it matters less whether the data remains in Pinecone or not... maybe leave it for now until we decide otherwise. Obviously, anything already low confidence in MySQL should NOT be added to Pinecone.
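As a rough sketch of the "don't add low-confidence entries" rule, assuming each MySQL row arrives as a dict with hypothetical keys `id`, `embedding`, `confidence`, and `summary` (none of these names come from the project), the records passed to a Pinecone upsert could be built like this. The `id` / `values` / `metadata` shape is the usual Pinecone vector format; the threshold is an assumption.

```python
from typing import Any, Dict, Iterable, List

# Assumption: confidence <= 0 means "low confidence" (e.g. thumbs down).
MIN_CONFIDENCE = 0


def build_records(
    rows: Iterable[Dict[str, Any]],
    min_confidence: int = MIN_CONFIDENCE,
) -> List[Dict[str, Any]]:
    """Turn database rows into upsert records, skipping low-confidence rows.

    The row keys ("id", "embedding", "confidence", "summary") are hypothetical.
    """
    records = []
    for row in rows:
        if row["confidence"] <= min_confidence:
            # Low-confidence entries never reach the index.
            continue
        records.append(
            {
                "id": row["id"],
                "values": row["embedding"],
                "metadata": {
                    "confidence": row["confidence"],
                    # Fall back to an empty string when there is no summary.
                    "summary": row.get("summary") or "",
                },
            }
        )
    return records
```

The resulting list could then be handed to the client's upsert call in batches.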
Ok, perfect. To have the chatbot not use those, we just need to add a filter that doesn't accept confidence=0:
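A filter along these lines might look like the sketch below. It assumes the Pinecone Python client's `index.query` with a metadata `filter` and the `$ne` operator; the helper names `confident_only_filter` and `query_confident` are hypothetical, and `index` is whatever index object the project already holds.

```python
from typing import Any, Dict, List


def confident_only_filter() -> Dict[str, Any]:
    """Metadata filter that rejects entries marked confidence=0."""
    return {"confidence": {"$ne": 0}}


def query_confident(index: Any, vector: List[float], top_k: int = 5) -> Any:
    """Query the index while excluding confidence=0 entries.

    `index` is assumed to expose a Pinecone-style `query` method.
    """
    return index.query(
        vector=vector,
        top_k=top_k,
        filter=confident_only_filter(),
        include_metadata=True,  # so confidence and summary come back with each match
    )
```

Keeping the filter in one helper means the chatbot and any other search path apply the same rule.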
fixed #171
In the Pinecone metadata we need to include confidence & summary too.