Performance #24
What's your code look like?
Yup, pretty much a known problem, and work to be done. I won't be personally happy until I'm bottlenecked by actual IO for large files. I'm personally seeing ~75 MB/s writes and 15 MB/s reads on a fairly complex app, so anything below that is your own fault ;) I've been meaning to write a simpler benchmark, but profiling currently puts the blame pretty hard on the FUSE kernel communication. Avoiding OpenDirectIO helps, as it lets the kernel manage a writeback cache. I wrote quick notes on this in the Bazil group (see item 5 in https://groups.google.com/d/msg/bazil-dev/z-PtgA84f-o/BP9u2_Ko0VQJ). To be improved.
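For context, here's a minimal sketch of what "avoiding OpenDirectIO" looks like in an Open handler, written against the current bazil.org/fuse API (the signatures differed somewhat in 2014). The `File` type is illustrative; only the flag itself comes from the library:

```go
// A minimal sketch, assuming the current bazil.org/fuse API: leaving
// fuse.OpenDirectIO unset in the Open response lets the kernel keep a
// writeback/page cache for the handle instead of forcing every read and
// write straight through to user space.
package fusesketch

import (
	"context"

	"bazil.org/fuse"
	"bazil.org/fuse/fs"
)

type File struct{}

var _ fs.NodeOpener = (*File)(nil)

func (f *File) Open(ctx context.Context, req *fuse.OpenRequest, resp *fuse.OpenResponse) (fs.Handle, error) {
	// Deliberately NOT setting resp.Flags |= fuse.OpenDirectIO:
	// with direct I/O off, the kernel can batch and cache writes.
	return f, nil
}
```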
Humm, here's the code I'm running: Again, what I'm copying in is a directory of 3 MB images (about 1.4 GB worth). Thanks for the help. If I could get it up to 75 MB/s, that would be great.
I should also mention that even if I disable the actual writing to the disk (which is an SSD), it still only does about 20 MB/s.
Hmmm... things look pretty normal. Have you tried doing a CPU profile?
@hesamrabeti Humm. Was the 75 MB/s on OS X?
My measurements, and development, are primarily on Linux.
@tv42 So do you think the performance could be OS X-related? Also, would it make any difference that I'm not calling methods with pointers? Thanks for the help.
@tv42 @hesamrabeti One other question. So I read in a few places that for larger files (which is what I'm dealing with), increasing the block size will improve performance quite a bit. I noticed, though, when setting -fuse.debug=true that everything's coming in as 4096-byte blocks. I see that for OS X the block size seems to be hard-coded as 4096: https://github.com/bazillion/fuse/blob/9802bb510ca4cd1c18ffc840cf6fe9ef5d1546a8/mount_darwin.go#L63 Would someone mind making this an option passed to mount? Thanks!
That can't be changed unilaterally there; the receiving buffer also needs to be resized, see 9802bb5. The incoming buffer management needs to change; once that's better, the size from there can update iosize= too. And for the incoming buffer management, the Linux side really wants to switch to vmsplice, so there's a bit more work to be done. I'm writing simple benchmarks right now, just to get an idea of what the current state is, and to be able to measure any potential improvements.
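To illustrate the coupling being described, here's a purely hypothetical sketch (invented names, not the library's code) of why iosize= and the daemon's receive buffer have to move together:

```go
// Hypothetical sketch of the coupling: the iosize= value handed to osxfuse
// at mount time and the buffer the daemon reads each kernel request into
// have to agree, or full-size writes from the kernel won't fit.
package fusesketch

// ioSize is what would be passed as iosize= in the mount arguments.
const ioSize = 128 * 1024

// newRequestBuffer allocates the buffer an incoming kernel request is read
// into; it has to cover the request header plus one full-size write.
func newRequestBuffer() []byte {
	const headerOverhead = 4096 // generous room for the request header
	return make([]byte, ioSize+headerOverhead)
}
```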
@tv42 Cool, thanks. I would love to use bazillion/fuse for a project, but we're using it to store photos (on OS X), so I need it to be faster in order for it to make sense. Thanks a bunch for the help.
@tv42 Hopefully this helps incentivize some work on it. Thanks a bunch:
Commit 0f430c9 adds simple benchmarks, to keep track of any improvements. See the commit message for current expected numbers. OS X is currently a mystery; I don't know if the problem is just my Mac Mini, or all of OS X / OSXFUSE.
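For anyone who wants a quick number without the repository's benchmarks, here's a rough, self-contained throughput check in the same spirit (this is not the code from 0f430c9; the mount path is a placeholder):

```go
// Write 1 GiB in 128 KiB chunks to a path inside the mount, fsync, and
// report MB/s. Adjust mountPath to your own mountpoint.
package main

import (
	"fmt"
	"log"
	"os"
	"time"
)

func main() {
	const (
		mountPath = "/mnt/fusetest/bench.dat" // placeholder
		chunkSize = 128 * 1024
		total     = 1 << 30 // 1 GiB
	)

	buf := make([]byte, chunkSize)
	f, err := os.Create(mountPath)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	start := time.Now()
	for written := 0; written < total; written += chunkSize {
		if _, err := f.Write(buf); err != nil {
			log.Fatal(err)
		}
	}
	if err := f.Sync(); err != nil { // count the flush, not just buffered writes
		log.Fatal(err)
	}
	elapsed := time.Since(start)
	fmt.Printf("%.1f MB/s\n", float64(total)/1e6/elapsed.Seconds())
}
```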
Recording here for posterity: OS X performance work is hindered by kernel hangs:
@ryanstout Can you try eccde64 and see if that helps? It bumps up the maximum write size, and it dropped the syscall overhead for my workload significantly. This is nowhere near the end of the story, but a good start.

eccde64 (HEAD, github/master, master) Use the 128 KiB kernel receive buffer; set fuse.InitBigWrites
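Back-of-the-envelope on why that helps: the same amount of data now moves in far fewer kernel round trips.

```go
// Same data, far fewer requests once writes can be 128 KiB instead of 4 KiB;
// that's where the dropped syscall overhead comes from.
package fusesketch

const gib = 1 << 30

const (
	requestsAt4KiB   = gib / (4 * 1024)   // 262,144 write requests per GiB
	requestsAt128KiB = gib / (128 * 1024) // 8,192 write requests per GiB
)
```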
Here are 3 concurrent writers writing into a full-blown filesystem where most of the CPU cost is in hashing and crypto:
650 MB/s ain't too bad ;) (Truth in advertising: that's probably cheating by not pushing all the data down to FUSE in time, before the benchmark ends. Slapping an fsync at the end gives a more modest 200 MB/s.) I don't have
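For a rough reproduction of that kind of measurement (not the actual benchmark; paths and sizes are placeholders), something like this works, with the clock stopped only after every file has been fsynced so data still sitting in the page cache doesn't inflate the number:

```go
// Three goroutines each write their own file inside the mount; aggregate
// throughput is reported only after all files are synced.
package main

import (
	"fmt"
	"log"
	"os"
	"path/filepath"
	"sync"
	"time"
)

func main() {
	const (
		mountDir  = "/mnt/fusetest" // placeholder
		writers   = 3
		perWriter = 512 << 20 // 512 MiB each
		chunkSize = 128 * 1024
	)

	buf := make([]byte, chunkSize)
	start := time.Now()

	var wg sync.WaitGroup
	for i := 0; i < writers; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			f, err := os.Create(filepath.Join(mountDir, fmt.Sprintf("bench-%d.dat", i)))
			if err != nil {
				log.Fatal(err)
			}
			defer f.Close()
			for written := 0; written < perWriter; written += chunkSize {
				if _, err := f.Write(buf); err != nil {
					log.Fatal(err)
				}
			}
			if err := f.Sync(); err != nil { // the fsync mentioned above
				log.Fatal(err)
			}
		}(i)
	}
	wg.Wait()

	totalMB := float64(writers*perWriter) / 1e6
	fmt.Printf("%.1f MB/s aggregate\n", totalMB/time.Since(start).Seconds())
}
```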
@tv42 Yeah, a third of the time it took before. Nice work. That puts it where I need it performance-wise. Thanks a bunch for working on that. Feel free to claim the Bountysource if you want.
Alright, since this ticket never really stated specific numbers or specific changes, and the original reporter seems happy enough, I'm gonna close this as good enough. Ongoing work with a specific goal is still left in #35 (Linux only). As part of that, I may introduce a sync.Pool for the buffers used, even on OS X; that'll help a little bit more. FUSE still has plenty of overhead, but its role is definitely shrinking in my CPU profiles, to the point where I personally get more gains from optimizing the FS logic itself. Once again, that's Linux, and afaik OS X CPU profiling is busted.
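As a rough illustration of the sync.Pool idea (illustrative names, not the library's code), buffer reuse would look roughly like this:

```go
// Reuse request buffers instead of allocating a fresh 128 KiB slice per
// incoming kernel request, which takes pressure off the garbage collector
// on both Linux and OS X.
package fusesketch

import "sync"

const requestBufSize = 128*1024 + 4096 // max write payload + header room

var requestBufPool = sync.Pool{
	New: func() interface{} { return make([]byte, requestBufSize) },
}

// serveOne reads one request into a pooled buffer, hands it to dispatch,
// and returns the buffer for reuse. Anything that outlives the call must
// be copied out by dispatch.
func serveOne(read func([]byte) (int, error), dispatch func([]byte) error) error {
	buf := requestBufPool.Get().([]byte)
	defer requestBufPool.Put(buf)

	n, err := read(buf)
	if err != nil {
		return err
	}
	return dispatch(buf[:n])
}
```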
Sorry to be this guy, but I just wanted to get some thoughts from someone with more experience with FUSE. I've done a mirror file system (as a test), and I'm seeing write performance at about 1/10th of what it is when writing directly to the disk (for a folder of 3 MB images). Would direct_io help here? I saw where it got disabled. I tried increasing the block size, but that didn't seem to make much difference. I'm on OS X using osxfuse.
Thanks,
Ryan
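One easy way to put a number on the "1/10th of raw disk" observation is to copy the same image directory to the local disk and to the FUSE mount and compare the rates. A minimal sketch follows; the paths are placeholders and it assumes a flat directory of images:

```go
// Copy every regular file under src into each dst and report MB/s;
// no subdirectories are recreated.
package main

import (
	"fmt"
	"io"
	"log"
	"os"
	"path/filepath"
	"time"
)

func copyTree(src, dst string) (int64, error) {
	var total int64
	err := filepath.Walk(src, func(path string, info os.FileInfo, err error) error {
		if err != nil || info.IsDir() {
			return err
		}
		in, err := os.Open(path)
		if err != nil {
			return err
		}
		defer in.Close()
		out, err := os.Create(filepath.Join(dst, info.Name()))
		if err != nil {
			return err
		}
		defer out.Close()
		n, err := io.Copy(out, in)
		total += n
		return err
	})
	return total, err
}

func main() {
	src := "/Users/ryan/images" // the directory of ~3 MB images (placeholder)
	for _, dst := range []string{"/tmp/direct-copy", "/Volumes/fusemount/copy"} {
		if err := os.MkdirAll(dst, 0755); err != nil {
			log.Fatal(err)
		}
		start := time.Now()
		n, err := copyTree(src, dst)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Printf("%-28s %.1f MB/s\n", dst, float64(n)/1e6/time.Since(start).Seconds())
	}
}
```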