Elasticsearch Search API parameters and grounding accuracy. #162
jvwong
started this conversation in
Show and tell
The grounding-search system ranks the set of results from an initial call to the Elasticsearch Search API. There are several relevant parameters:
fuzziness: (Optional, string) Maximum edit distance allowed for matching. See Fuzziness for valid values and more information. See Fuzziness in the match query for an example. Set in grounding-search by MAX_FUZZ_ES (default: 2).
min_score: (Optional, float) Minimum _score for matching documents. Documents with a lower _score are not included in the search results. Set in grounding-search by ES_MIN_SCORE (default: 0).
Given the introduction of test cases where the target entity does not in fact exist ('out of dictionary') (#160), there is a desire to reduce spurious matches (#161). One way to achieve this is to apply stricter criteria to ES results, such as filtering out hits with a low ES _score or reducing fuzziness.
Test Configurations
The following analysis examines grounding-search test results with these parameters altered alone or in combination:
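As a rough illustration of how the two parameters enter a Search API request and how they might be varied alone or in combination, here is a minimal sketch. It only builds request bodies and does not contact Elasticsearch. The field name `name`, the helper names, and the non-default values (`1` and `5.0`) are assumptions made for illustration; they are not the configurations tested here, and the real grounding-search query may be structured differently.

```python
from itertools import product

# Defaults taken from the post: MAX_FUZZ_ES = 2, ES_MIN_SCORE = 0.
DEFAULT_FUZZINESS = 2
DEFAULT_MIN_SCORE = 0.0

# Hypothetical alternative values, chosen only to show the grid shape.
FUZZINESS_VALUES = [DEFAULT_FUZZINESS, 1]
MIN_SCORE_VALUES = [DEFAULT_MIN_SCORE, 5.0]


def build_search_body(text, fuzziness=DEFAULT_FUZZINESS,
                      min_score=DEFAULT_MIN_SCORE, field="name"):
    """Build a Search API request body using a fuzzy match query.

    `min_score` is a top-level search parameter; `fuzziness` belongs
    to the match query. `field` is a hypothetical index field.
    """
    return {
        "min_score": min_score,
        "query": {
            "match": {
                field: {
                    "query": text,
                    "fuzziness": fuzziness,
                }
            }
        },
    }


def configurations():
    """Enumerate every (fuzziness, min_score) pair in the grid."""
    return [
        {"fuzziness": f, "min_score": s}
        for f, s in product(FUZZINESS_VALUES, MIN_SCORE_VALUES)
    ]


if __name__ == "__main__":
    for config in configurations():
        print(config, build_search_body("tp53", **config))
```

The grid includes the default pair, each parameter changed on its own, and both changed together, which mirrors the "alone or in combination" design.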
Test Results
Figure 1. Search accuracy over different configurations (N=868). A test case fails when the expected ground is not the top search result returned from the grounding-search.
Figure 2. Search errors grouped into different classes based on rank. Runner up is second hit; OOD is 'out of dictionary' meaning a ground does not exist but a (non-empty) search hit is returned.
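The pass/fail rule and the rank-based error classes in the two captions can be sketched as follows. This is a hedged reconstruction, not the project's test code. The function names, the use of `None` for a case with no ground, and the `lower_rank` and `missing` labels are assumptions. So is the choice to count an out-of-dictionary case as correct when no hits are returned.

```python
from collections import Counter


def classify(expected_id, hit_ids):
    """Classify one test case from its ranked list of hit identifiers.

    expected_id is None when the entity has no ground (out of dictionary).
    """
    if expected_id is None:
        # OOD error: no ground exists, yet a non-empty result came back.
        return "ood" if hit_ids else "correct"
    if hit_ids and hit_ids[0] == expected_id:
        return "correct"      # expected ground is the top hit
    if len(hit_ids) > 1 and hit_ids[1] == expected_id:
        return "runner_up"    # expected ground is the second hit
    if expected_id in hit_ids:
        return "lower_rank"   # assumed label: found below second place
    return "missing"          # assumed label: not returned at all


def summarize(cases):
    """Return (accuracy, per-class counts) for (expected_id, hit_ids) pairs."""
    counts = Counter(classify(expected, hits) for expected, hits in cases)
    total = sum(counts.values())
    accuracy = counts["correct"] / total if total else 0.0
    return accuracy, counts


if __name__ == "__main__":
    toy_cases = [            # made-up identifiers, not real test data
        ("a", ["a", "b"]),
        ("b", ["a", "b"]),
        (None, ["a"]),
        (None, []),
    ]
    print(summarize(toy_cases))
```

Keeping classification separate from the summary makes it straightforward to compare error classes across configurations.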