documentation
seanmacavaney committed Nov 25, 2024
1 parent da7d1f2 commit dcde8fa
Showing 3 changed files with 32 additions and 4 deletions.
12 changes: 8 additions & 4 deletions pyterrier_dr/prf.py
@@ -14,8 +14,9 @@ class VectorPrf(pt.Transformer):
- beta: weight of doc_vec
- k: number of pseudo-relevant feedback documents
Expected Input: ['qid', 'query_vec', 'docno', 'doc_vec']
Output: ['qid', 'query_vec']
Expected Input Columns: ``['qid', 'query_vec', 'docno', 'doc_vec']``
Output Columns: ``['qid', 'query_vec']`` (Any other query columns from the input are also included in the output.)
Example::
@@ -56,6 +57,7 @@ def __init__(self,

    @pta.transform.by_query(add_ranks=False)
    def transform(self, inp: pd.DataFrame) -> pd.DataFrame:
        """Performs Vector PRF on the input dataframe."""
        pta.validate.result_frame(inp, extra_columns=['query_vec', 'doc_vec'])

        query_cols = [col for col in inp.columns if col.startswith('q') and col != 'query_vec']
@@ -79,8 +81,9 @@ class AveragePrf(pt.Transformer):
Arguments:
- k: number of pseudo-relevant feedback documents
Expected Input: ['qid', 'query_vec', 'docno', 'doc_vec']
Output: ['qid', 'query_vec']
Expected Input Columns: ``['qid', 'query_vec', 'docno', 'doc_vec']``
Output Columns: ``['qid', 'query_vec']`` (Any other query columns from the input are also included in the output.)
Example::
@@ -117,6 +120,7 @@ def __init__(self,

    @pta.transform.by_query(add_ranks=False)
    def transform(self, inp: pd.DataFrame) -> pd.DataFrame:
        """Performs Average PRF on the input dataframe."""
        pta.validate.result_frame(inp, extra_columns=['query_vec', 'doc_vec'])

        query_cols = [col for col in inp.columns if col.startswith('q') and col != 'query_vec']
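
The ``query_cols`` comprehension shown in both ``transform`` methods is what lets other query columns pass through to the output, as the updated docstrings note. A small standalone illustration follows; the column list is made up for this example and is not taken from a real result frame:

```python
# Hypothetical column names for a result frame that carries vectors.
columns = ['qid', 'query', 'query_vec', 'docno', 'doc_vec', 'score', 'rank']

# Same selection rule as in the diff: keep columns whose name starts
# with 'q', except 'query_vec', which the transformer rewrites itself.
query_cols = [col for col in columns if col.startswith('q') and col != 'query_vec']

print(query_cols)  # ['qid', 'query']
```

Document-level columns such as ``docno``, ``doc_vec``, ``score`` and ``rank`` do not match the prefix, so only the query-level columns are selected.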
8 changes: 8 additions & 0 deletions pyterrier_dr/pt_docs/index.rst
@@ -34,3 +34,11 @@ Import ``pyterrier_dr``, load a pre-built index and model, and retrieve:
997 82.551933 10064 10063 997 1 chemical reactions
998 82.546890 4417 4416 998 1 chemical reactions
999 82.545776 7120 7119 999 1 chemical reactions
.. rubric:: Table of Contents

.. toctree::
   :maxdepth: 1

   prf
16 changes: 16 additions & 0 deletions pyterrier_dr/pt_docs/prf.rst
@@ -0,0 +1,16 @@
Pseudo Relevance Feedback (PRF)
===============================

Dense Pseudo Relevance Feedback (PRF) is a technique to improve the effectiveness of a retrieval system by expanding the
original query vector with the vectors from the top-ranked documents. The idea is that the top-ranked documents are
likely to be relevant, so their vectors can move the query vector closer to other relevant documents.

PyTerrier-DR provides two dense PRF implementations: :class:`pyterrier_dr.AveragePrf` and :class:`pyterrier_dr.VectorPrf`.
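
As a rough illustration of the two strategies, the sketch below combines a query vector with the vectors of the top-``k`` documents using plain Python lists. The function names, the ``alpha`` parameter and the default values are assumptions made for this example. It is not the ``pyterrier_dr`` implementation, which operates on the ``query_vec`` and ``doc_vec`` columns of a result frame.

```python
def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(dims) / len(vectors) for dims in zip(*vectors)]


def average_prf(query_vec, doc_vecs, k=3):
    """Illustrative only: average the query vector with the top-k document vectors."""
    return mean_vector([query_vec] + doc_vecs[:k])


def vector_prf(query_vec, doc_vecs, alpha=1.0, beta=0.2, k=3):
    """Illustrative only: Rocchio-style weighted sum of the query vector
    (weight alpha, an assumed name) and the mean top-k document vector (weight beta)."""
    feedback = mean_vector(doc_vecs[:k])
    return [alpha * q + beta * d for q, d in zip(query_vec, feedback)]


# Toy 2-dimensional vectors; documents are listed best-ranked first.
query = [1.0, 0.0]
docs = [[0.0, 1.0], [0.0, 1.0], [1.0, 1.0]]

avg = average_prf(query, docs, k=2)                         # roughly [0.333, 0.667]
rocchio = vector_prf(query, docs, alpha=1.0, beta=0.5, k=2)  # [1.0, 0.5]
```

In both cases only the first ``k`` documents contribute, so the third document in the toy ranking has no effect on the new query vector.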

API Documentation
-----------------

.. autoclass:: pyterrier_dr.AveragePrf
   :members:

.. autoclass:: pyterrier_dr.VectorPrf
   :members:
