Source: (read-in-spreadsheet branch) PPUC/PxPUC/views.py in ResearcherSearchList
Description: To sort sentences by a fuzz ratio rank assigned to each one, the loop from the original algorithm has been removed, and the new algorithm works on the full user query (after stopwords are removed). The current structure of the algorithm:
1. Prefetch_queryset is created from the sentences that contain the current query.
2. Each sentence is looped over and annotated with a new score field.
3. This score is the fuzz.token_set_ratio between the current sentence text and the tokenized user query.
4. The prefetch is ordered by these scores before being passed to the location_queryset.
5. A count of the sentences containing the user query is created for each location.
6. The location_queryset annotates a new field for the count and connects locations to their corresponding sentences, excluding locations where the count is 0.
7. This queryset is the one returned to the frontend.
Issue: Steps 1 and 5 may not return the best results, because they now work with the full query rather than the fragmented one used in the original algorithm. Another problem comes from the loop in step 2: annotations don't seem to work that way, so another way to give each sentence a unique rank needs to be found.
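One possible way around the step 2 problem is to do the scoring in plain Python after the sentences have been fetched, instead of through queryset annotations. The sketch below is only an illustration under assumptions, not the code in views.py: `Sentence` is a hypothetical stand-in for the project's sentence model, `rank_sentences` and `stand_in_ratio` are made-up names, and the default scorer is a standard-library placeholder so the example runs on its own. In the project, `fuzz.token_set_ratio` could be passed as `scorer` instead.

```python
from collections import defaultdict
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Sentence:
    # Hypothetical stand-in for the project's sentence model.
    location: str
    text: str
    score: int = 0


def stand_in_ratio(a, b):
    """Stdlib placeholder scorer (0-100); swap in fuzz.token_set_ratio."""
    return round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())


def rank_sentences(sentences, query, scorer=stand_in_ratio):
    """Score each sentence against the full query and group by location."""
    by_location = defaultdict(list)
    for sentence in sentences:
        # A plain attribute on the object, not an ORM annotation.
        sentence.score = scorer(sentence.text, query)
        by_location[sentence.location].append(sentence)
    # Highest-scoring sentences first within each location.
    for group in by_location.values():
        group.sort(key=lambda s: s.score, reverse=True)
    # Locations with no matching sentences never appear as keys.
    return dict(by_location)
```

Because the grouping is built only from sentences that were passed in, a location with a count of 0 is excluded without a separate filter. The trade-off is that the ordering happens in Python rather than in the database, so the result is a dict of lists rather than a queryset.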
Additional Notes: https://www.jashds.com/blog/2019/05/13/fuzzy-stringmatching-python#:~:text=This%20ratio%20uses%20a%20simple,differences%20existing%20between%20both%20strings.