Performance/scalability issue in Folder::searchByMime (et al) #11244
Comments
GitMate.io thinks possibly related issues are #9524 (Huge security issue when sharing folder), #4407 (Performance issues e.g. with favourites), #553 (Issues after renaming an already shared folder...), #3021 (performance issue when deleting folder of user with many files), and #3613 (Shared folder download issues).
cc @icewind1991 @nextcloud/sharing
The search backend was changed in Nextcloud 23. Are you still seeing this issue there, @paulijar?
It took some time to set up tests for this, but now I have some results. The TL;DR is that there has indeed been a significant improvement since I reported the issue, but the improvement had already happened before NC23. My test setup was:
With this setup, I measured the time taken to execute
On NC18, I got:
On NC22, I got:
On NC25, I got:
I also tested Conclusions: Still in NC25, the
Hi, please update to 24.0.8 or, better, 25.0.2 and report back if it fixes the issue. Thank you!
The execution time of the function `Folder::searchByMime()` depends on the number of matching files in the whole user storage. That is, the matching files don't have to be within the targeted folder; just having them anywhere in the user storage slows down the search in all folders.

I ran into this when investigating the issue owncloud/music#664. There, the user had some 1.5 million image files (including the generated thumbnails) in the storage, and running `Folder::searchByMime('image')` took 10 minutes even on an empty folder!

The reason for this behavior is in the function `Folder::searchCommon()` in https://github.com/nextcloud/server/blob/master/lib/private/Files/Node/Folder.php#L245. There, the function first fetches all the matching files in the whole storage and only then filters the result set according to the target path. So in our example case, 1.5 million rows were retrieved from the DB table `oc_filecache`, a new `CacheEntry` object was instantiated for each row, and then all the 1.5 million objects were discarded.

As I see it, the filtering by path could just as well happen when making the SQL query in https://github.com/nextcloud/server/blob/master/lib/private/Files/Cache/Cache.php#L658. That should be at least an order of magnitude faster than filtering afterwards in a PHP `foreach` loop.

This performance issue is in code which is identical in ownCloud and Nextcloud. I have reported the same issue for ownCloud in owncloud/core#32720.
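
To illustrate the difference between the two strategies, here is a minimal, self-contained sketch in Python with an in-memory SQLite database. It is not the Nextcloud or ownCloud code: the `filecache` table, its `path` and `mimepart` columns, and the functions `make_cache`, `search_then_filter`, and `search_in_query` are all hypothetical stand-ins, and the real `oc_filecache` schema is more involved. Each function returns the hits plus the number of rows the database actually handed back, which is the cost this issue is about.

```python
import sqlite3


def make_cache(rows):
    """Build a toy file cache. Simplified, hypothetical schema (not oc_filecache)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE filecache ("
        "fileid INTEGER PRIMARY KEY, path TEXT, mimepart TEXT)"
    )
    conn.executemany(
        "INSERT INTO filecache (path, mimepart) VALUES (?, ?)", rows
    )
    return conn


def search_then_filter(conn, folder, mimepart):
    """Fetch every match in the storage, then filter by path in application code."""
    fetched = conn.execute(
        "SELECT path FROM filecache WHERE mimepart = ? ORDER BY path",
        (mimepart,),
    ).fetchall()
    prefix = folder.rstrip("/") + "/"
    hits = [path for (path,) in fetched if path.startswith(prefix)]
    return hits, len(fetched)


def search_in_query(conn, folder, mimepart):
    """Let the database apply the path restriction as part of the query."""
    prefix = folder.rstrip("/") + "/"
    # substr() gives an exact, case-sensitive prefix match without
    # having to escape LIKE wildcards; a real schema might use an index instead.
    fetched = conn.execute(
        "SELECT path FROM filecache "
        "WHERE mimepart = ? AND substr(path, 1, ?) = ? ORDER BY path",
        (mimepart, len(prefix), prefix),
    ).fetchall()
    return [path for (path,) in fetched], len(fetched)


ROWS = [
    ("files/photos/a.jpg", "image"),
    ("files/photos/b.png", "image"),
    ("files/music/cover.jpg", "image"),
    ("files/music/song.mp3", "audio"),
    ("files/empty/readme.txt", "text"),
]

if __name__ == "__main__":
    conn = make_cache(ROWS)
    for folder in ("files/music", "files/empty"):
        print(folder, search_then_filter(conn, folder, "image"))
        print(folder, search_in_query(conn, folder, "image"))
```

Both functions return the same hits for a given folder, but the first one always pulls every image row in the storage, even for a folder that contains no images, while the second one pulls only the rows under the target path.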