Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#6089] Fix typeahead searches #6120

Merged
merged 4 commits into from
Feb 16, 2021
Merged

[#6089] Fix typeahead searches #6120

merged 4 commits into from
Feb 16, 2021

Conversation

gbp
Copy link
Member

@gbp gbp commented Feb 15, 2021

Relevant issue(s)

Fixes #6089

What does this do?

Update expected exception message when wildcard searches error allowing
fallback to an non-wildcard search to continue.

Why was this needed?

Poor search performance

Implementation notes

Locally in dev and test env all was working fine. I could only replicate in production, this is due to the size of the index. A large index can't always do a wildcard search, for example, when searching for Cabinet Office it will raise an exception with the message like:
WildcardError: Wildcard office* expands to more than 1000 terms

This message seems to have changed at some point when upgrading Xapian in the past.

@gbp
Copy link
Member Author

gbp commented Feb 15, 2021

It appearsQueryParserError still exists in the Xapian so maybe we need to catch both QueryParserError and WildcardError

@garethrees garethrees self-assigned this Feb 15, 2021
@gbp
Copy link
Member Author

gbp commented Feb 15, 2021

Exception message matching was first added in 22b2db4

Copy link
Member

@garethrees garethrees left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense in principle!

Would be worth expanding the details into the commit message, and ideally link to the upstream code where these errors exist so that we have a better future reference.

It appears QueryParserError still exists in the Xapian so maybe we need to catch both QueryParserError and WildcardError

Yeah maybe they split these into two distinct concepts. We reference QueryParserError elsewhere so I assume both still exist, but whether we need to rescue from both here IDK.

I could only replicate in production

I think it's worth trying to test this fix against the prod DB somehow. Worst case just temporarily patch the live file and run the query in the console.

Still defaults to 1000 expansions but this can now be overriden by
adding a `config/xapian.yml` file or by calling
`ActsAsXapian.max_wildcard_expansion = n` for example in a before spec
callback hook.
@gbp gbp force-pushed the 6089-fix-typeahead-searches branch from d2ad899 to 99dc45b Compare February 16, 2021 09:47
@gbp
Copy link
Member Author

gbp commented Feb 16, 2021

@garethrees I've made some changes:

  1. managed to replicate this error in a spec by adding a configuration option to ActsAsXapian. So this is now tested.
  2. moved the RuntimeError rescue into ActsAsXapian and now re-raising as exception within the ActsAsXapian model to allow us to catch know exception classes in the core Alaveteli code.
  3. added exception notifications for us

@gbp
Copy link
Member Author

gbp commented Feb 16, 2021

In trying to write a spec to show the original QueryParserError exception message I've worked out that this only is raised when using the Xapian::QueryParser::FLAG_BOOLEAN flag which we don't use in typeahead search.

[1] pry(main)> ActsAsXapian.readable_init
=> snip
[2] pry(main)> ActsAsXapian.query_parser.parse_query('dog AND', Xapian::QueryParser::FLAG_BOOLEAN)
RuntimeError: QueryParserError: Syntax: <expression> AND <expression>
from (pry):2:in `parse_query'
[3] pry(main)> ActsAsXapian.query_parser.parse_query('dog AND', Xapian::QueryParser::FLAG_LOVEHATE | Xapian::QueryParser::FLAG_SPELLING_CORRECTION | Xapian::QueryParser::FLAG_WILDCARD)
=> #<Xapian::Query:0x00007fd9082652d0 @__swigtype__="_p_Xapian__Query">

@gbp
Copy link
Member Author

gbp commented Feb 16, 2021

In production we are hitting the wildcard expansion limit, which is currently set to 1000. We should consider increasing this limit. We can do this with 9d57051 by adding a config/xapian.yml file, something like:

test:
  max_wildcard_expansion: 10

development:
  max_wildcard_expansion: 500

production:
  max_wildcard_expansion: 1000

From my tests in the production Rails console, increasing to 1075 allows the wildcard query for 'Cabinet Office' to succeed. We would also need to consider server memory usage /cc @sagepe

Copy link
Member

@garethrees garethrees left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Amazing work 🦸‍♂️

Couple of tiny changes suggested but no need for re-review.

lib/acts_as_xapian/acts_as_xapian.rb Outdated Show resolved Hide resolved
lib/typeahead_search.rb Outdated Show resolved Hide resolved
lib/typeahead_search.rb Outdated Show resolved Hide resolved
gbp added 3 commits February 16, 2021 11:43
Re-raise with are own exception class to allow us to handle gracefully
downstream.
Use new ActsAsXapian exception instead of relying matching the exception
message which has/may changed in the past/future.

Adds a spec, using the new `max_wildcard_expansion` config to prove
non-wildcard searches will happen.
It's unclear if this is a big problem or if we could mitigate failed
wildcard searches by adjusting Xapian's max expansion limit.

This is also changes the  Rails logging level to warn.
@gbp gbp force-pushed the 6089-fix-typeahead-searches branch from 99dc45b to 0e0fe30 Compare February 16, 2021 11:54
@gbp gbp merged commit cb3c714 into develop Feb 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Authority search very poor
2 participants