ROUGE measures? #28

Open
cmacdonald opened this issue Nov 26, 2021 · 2 comments
Comments

@cmacdonald
Collaborator

If QA were a stage of the pipeline, how could we compute ROUGE or similar metrics at the end of a PyTerrier pipeline?

@seanmacavaney
Collaborator

That would be a bit of an undertaking, given that the current architecture is built around the many-to-many nature of search result lists ((query_id, doc_id) -> score mappings) and qrels ((query_id, doc_id) -> relevance mappings). ROUGE and similar measures operate over query_id -> text mappings and query_id -> [possible text answers] mappings.
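To make the two text-oriented mappings concrete, here is a minimal sketch of a simplified ROUGE-1 F1 computed over a `query_id -> text` mapping and a `query_id -> [possible text answers]` mapping. The function names `rouge1_f1` and `rouge1_per_query` are hypothetical and not part of this tool or PyTerrier. It assumes lowercased whitespace tokens with no stemming, and it assumes the best score across references is the one reported, which is only one of several conventions.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap on lowercased whitespace tokens."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count per token (clipped overlap).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def rouge1_per_query(answers: dict, references: dict) -> dict:
    """answers: query_id -> text; references: query_id -> list of reference texts.

    Assumption: score each query against its best-matching reference.
    Queries with no references get 0.0.
    """
    return {
        qid: max((rouge1_f1(text, ref) for ref in references.get(qid, [])), default=0.0)
        for qid, text in answers.items()
    }


# Hypothetical data, for illustration only.
answers = {"q1": "the cat sat on the mat"}
references = {"q1": ["a cat sat on the mat", "the dog barked"]}
scores = rouge1_per_query(answers, references)
```

Note that neither input carries a `doc_id`, which is why these measures do not fit the (query_id, doc_id) keyed structures used for run and qrel data.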

But I could see how it could work. The structure already allows for various input data formats, so this new type of mapping would just be another one. If you request a qrel-oriented measure but provide text mappings instead (or vice versa), it would just throw an error.
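A rough sketch of what that error check could look like, assuming records are plain dicts. Everything here is hypothetical: the `QREL_MEASURES` and `TEXT_MEASURES` sets, the field names, and both functions are invented for illustration and do not reflect the tool's actual API.

```python
# Hypothetical registries of measure names by the kind of input they need.
QREL_MEASURES = {"nDCG", "AP"}
TEXT_MEASURES = {"ROUGE-1"}


def detect_input_kind(record: dict) -> str:
    """Classify a record as a ranking row or a text row from its keys."""
    keys = record.keys()
    if {"query_id", "doc_id"} <= keys:
        return "ranking"  # (query_id, doc_id) -> score or relevance
    if {"query_id", "text"} <= keys:
        return "text"  # query_id -> text
    raise ValueError(f"unrecognised record format: {sorted(keys)}")


def check_compatible(measure_name: str, record: dict) -> None:
    """Raise ValueError if the measure cannot be computed from this kind of input."""
    kind = detect_input_kind(record)
    if measure_name in QREL_MEASURES and kind != "ranking":
        raise ValueError(f"{measure_name} needs (query_id, doc_id) data, got {kind} data")
    if measure_name in TEXT_MEASURES and kind != "text":
        raise ValueError(f"{measure_name} needs query_id -> text data, got {kind} data")


# A qrel-oriented measure with ranking data passes silently.
check_compatible("nDCG", {"query_id": "q1", "doc_id": "d1", "score": 1.0})
```

Calling `check_compatible("ROUGE-1", {"query_id": "q1", "doc_id": "d1", "score": 1.0})` would raise a `ValueError` in this sketch, since a text-oriented measure was requested with ranking data.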

I'd need to familiarise myself with the landscape of these measures too. IIRC there's a ton of fragmentation there as well.


It's worth also considering limiting the scope of this tool to only qrel-oriented measures.

@cmacdonald
Collaborator Author

I think with longer QA pipelines involving retrieval and other NLP techniques, e.g. conversational QA, there might be something interesting in supporting such measures as part of pt.Experiment().
