-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Support custom analysers #111
base: master
Are you sure you want to change the base?
Conversation
I think there should be symmetry in names, I agree it doesn't make sense to have the settings defined on the models so let's just not. Maybe we should configure it on the es object. Finally, the natural conclusion of this to me would be to use metaclass techniques (or have an API on the es object that such metaprogramming would use) to register models. That kind of magic fits with magic properties, like How far do we want to go? I'd be fine to leave it all alone, TBH, because I'd like to invest instead in looking toward other storage. |
I was thinking that I'm not sure what you mean with metaclass/metaprogramming techniques. But I also think that creating nice abstractions and APIs is just a waste of effort. Creating abstractions with less effort simply leads to more bugs, maintenance, leaky abstractions and nasty edge cases, so I would rather just make things simple and stay close to the structure Elasticsearch provides for now. I'll look how clean things become if |
To support custom analysers, it seems best to abandon model.create_all in favour of an index-wide method (es.create_models) that sets all mappings (and custom analysis settings) at once. Code is largely transplanted from hypothesis/h#1825 Details: Custom analysers(&filters&tokenisers) are shared in Elasticsearch among all document types (models). To update a model's mappings one needs to make sure the custom analysers are defined first, but updating those just for one model is inelegant; The index would have to be closed for each update, and one cannot check for duplicate definitions of analysers as it is not visible which model defined a particular analyser. Treating index-wide settings as per-model settings creates leaky abstractions.
The function deletes the whole index, not just one type of documents, so this makes much more sense.
Note that also when the index is an alias index_exists will be true.
2d00201
to
fe6e900
Compare
The approach where each model can define custom analysers did not match Elasticsearch's structure well, and created more complexity than it was worth.
fe6e900
to
458789c
Compare
To support custom analysers, it seems best to abandon model.create_all
in favour of an index-wide method (es.create_models) that sets all
mappings (and custom analysis settings) at once.
Code is largely transplanted from hypothesis/h#1825
Details:
Custom analysers(&filters&tokenisers) are shared in Elasticsearch among
all document types (models). To update a model's mappings one needs to
make sure the custom analysers are defined first, but updating those
just for one model is inelegant; The index would have to be closed for
each update, and one cannot check for duplicate definitions of analysers
as it is not visible which model defined a particular analyser. Treating
index-wide settings as per-model settings creates leaky abstractions.
@tilgovi, @nickstenning (and anyone concerned): Please let me know if you disagree with abandoning the model-centric abstraction in this case. I think it is better this way because Elasticsearch does not group analysers by document type, so if we do pretend that it does we either get leaky abstractions with possibly unreliable edge cases, or we need to do more effort than it's worth to pertain the model-centric abstraction.
We could even consider to just define analysers centrally instead of in each model, to simplify things even further (no merging of settings required, it would just match ES's structure), but I think I'm okay with this approach.
I will do tests, add tests and polish things when the approach is considered satisfactory.