-
-
Notifications
You must be signed in to change notification settings - Fork 389
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Full text search #757
base: master
Are you sure you want to change the base?
Full text search #757
Conversation
I have trouble reviewing this PR because of vendoring I suppose: my IDE just refuses to show a list of files. Would you be so kind as to split your repo into a branch with vendoring and branch on top of it with just the code changes, create a PR with just the code changes and ping me for review there? |
@paskal I removed vendor from this PR and pushed it to separate branch full-text-search-vendor. I haven't create PR for this separate branch yet, because maybe it is better to keep one PR and merge changes back after reviews? Or I'll create another PR later if we decide to do it. |
Sorry for not being clear, I wanted to have PR like one we have here to be a separate one on top of vendoring branch in your fork, so I would review it there. I'll try to look into this by tomorrow evening. |
I expect to have the ability to renew it this weekend. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, didn't have time for proper review yet. Comments are small stuff about the code style rather than about architecture, I'll look into the idea behind the code shortly after.
Also if you don't mind I'll recommend rebasing against current master as it will simplify the review a little.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another round, I reviewed everything but search
module as it's a beast on its own. @vdimir please let me know if what I wrote something which makes sense or I'm looking into wrong things and that's not what you were looking for.
Thanks for feedback! |
I should have time tonight to review it thoughtfully. |
Thanks! I think main issue that I haven't figure out is wrapping store.Interface. Wrapping store with proxy is simple way to handle all store calls and perform additional work to support search without modifying store. But I understand that this solution not so clear, but I don't know how to do it better. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll look into search module and the architecture tomorrow. It would be great if you would fix small things and move unrelated changes out of PR, that would really make things simpler for me.
Very busy work week, I hope I'll have time for the review either on Friday or Sunday. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good, aside from TODO stuff which I didn't check. Also, I didn't check test coverage as the code is not 100% complete at the moment, and I didn't run it yet as I want to run when the code review from Umputun will be passed.
I propose to fix review comments, ping me directly if you'll need any clarifications, and after these comments are resolved ping Umputun for final review. It's unlikely we'll jump on your todo items so I propose to add them as comment threads in review and ping me/Umputun directly on them if you want our reaction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added issue to my TODOs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent job! Few very minor things to fix plus few tests to write, and also please check that it compiles as it doesn't now. After fixing that review notes ping Umputun directly, I don't think I have any more things I'm conserned about other than ones written in that review comments.
Thanks! I hope I am finishing fixing latest issues soon. |
I finally run service after all modification, load bunch of comments like in first post in this PR, seems it works. |
@umputun could you review the code, please? |
Sorry for the long delay and thx for the impressive work and detailed reviews. I have tried to understand the general structure/flow before I dig deeper into implementation details and it got me confused a little bit on multiple levels. Somehow the overall structure doesn't look intuitive to me and I found myself looking at things implementing unexpected logic in unexpected places. Part of my confusion represented in the comments I've made, but I'm not really sure if my comments make much sense due to the general confusion. Fundamentally all of this feels to me as two levels of abstraction, Servcie and Engine:
and just to reiterate the general flow from the consumer point of view: the only thing consumers care about and use are interfaces extracted from the service struct. Some consumers may need a different subset of those methods. let me know what you think |
@umputun , thank you for review! I think I've got general idea and I'm going to rewrite PR reusing my code with respect to your notes. If some additional question will occur during implementation, I'll ask here. I hope I'll start soon :) |
@vdimir I would love to have your work incorporated into the next release (1.7.0), please let us know if there is any way to assist you. |
Thanks for the concern. I'm going to work on it this weekend, I think after that we can discuss it more detailed. |
After these changes + documentation ready, + some UI changes if we'll find someone to make them in time. |
1f792f9
to
185145b
Compare
185145b
to
f1a4736
Compare
f1a4736
to
24b402d
Compare
29d260c
to
f9e7345
Compare
@paskal I updated the code to have synchronous indexing for a cold start in Tested on 8cpu/16gb digitalocean droplet, dataset with 70k comments, took 12 sec: Logs
|
Can you please test on the cheapest droplet? |
I tried on a small droplet (1gb ram) and faced memory issues, I will investigate. UPD: on 2GB ram droplet succeeded in 2 minues (100k comments). But I set num workers to Details
|
Tested on latest commit on cheapest 512ram droplet 100k comments in 20 topics:
It seems it was an issue with too big batches passed to |
Are default concurrency settings good enough not to overuse the machine's memory? |
I think so, peak usage was about 300-400mb. I also tested on another workload with 1000 topics and ~100 comments each. It took more time - 4 min on 512mb ram single-core droplet.
|
Sounds acceptable to me for a single-time index build. @umputun please take a final look. |
I haven't found a linter for these, so I had to catch these manually. I found umputun#757 to fix one of these, and I thought it would be good to fix everything at once.
Resolves #734
Uses bleve.