
Possible memory leak in Disks::refresh_list #1282

Closed
keabarnes opened this issue Jun 5, 2024 · 22 comments · Fixed by #1344

Comments

@keabarnes

Describe the bug

System: macOS Sonoma 14.1.2 (23B92)
sysinfo version: 0.30.12

When calling Disks::refresh_list() repeatedly in a loop, the application's memory usage grows without bound. Variations are shown in the minimal reproducible example below:

To Reproduce

use std::thread::sleep;
use sysinfo::{Disks, MINIMUM_CPU_UPDATE_INTERVAL};

fn main() {
    // Memory leak when using `new_with_refreshed_list`:
    // loop {
    //     let _ = Disks::new_with_refreshed_list();
    //     sleep(MINIMUM_CPU_UPDATE_INTERVAL);
    // }

    // Memory leak when using `refresh_list`:
    // let mut disks = Disks::new_with_refreshed_list();
    // loop {
    //     disks.refresh_list();
    //     disks.refresh();
    //     sleep(MINIMUM_CPU_UPDATE_INTERVAL);
    // }

    // No memory leak, but `available_space` isn't updated when the space on my computer is
    // actively changing:
    let mut disks = Disks::new_with_refreshed_list();
    loop {
        disks.refresh();

        let first_available_space = disks.first().map(|disk| disk.available_space()).unwrap();
        println!("Available space: {first_available_space}");

        sleep(MINIMUM_CPU_UPDATE_INTERVAL);
    }
}
@keabarnes keabarnes added the bug label Jun 5, 2024
@keabarnes
Author

keabarnes commented Jun 5, 2024

Using cargo instruments I ran an Allocations template for my program and was able to capture the following memory allocation trace:
[image: Instruments allocation trace]

That queryCache just grows over time

@GuillaumeGomez
Owner

It means that CFURLCopyResourcePropertiesForKeys is allocating memory that is never freed. Sounds pretty bad.

Is it on mac M* or an older one? If it's on mac M*, I can't test it as I don't have one.

@keabarnes
Author

I'm experiencing it on an M*, but I think others have experienced it on Intel. Let me try to find an Intel machine to run the example above on and see if it persists.

In the meantime, I downgraded to 0.29 and the issue persists on that version too.

@keabarnes
Author

I've been able to confirm that this behaviour does still occur on an Intel-based Mac.

@GuillaumeGomez
Owner

Then I'll be able to check what's going on locally. Thanks for the information!

@GuillaumeGomez
Owner

I just checked and... I have absolutely no idea where the leak comes from... To be honest, I'm pretty bad at using mac tools to check memory issues since they don't have valgrind and their other tools are confusing. So for now, I'll consider it "ok". Don't hesitate to give it a try though; help to fix this issue would be very welcome!

@keabarnes
Author

Ok, if you leave the issue open, I will try to circle back to it and see what I can find. Do you know of some way that I can work around the issue for now? Is there some way that I can instantiate the library in a way that forces it to clear the memory?

@GuillaumeGomez
Owner

It seems to be specific to the running process (or maybe the current thread? To be confirmed). So unless you get this information from another process/thread, I don't see how.

@keabarnes
Author

Good suggestion: spawning a new thread does the trick. The underlying code still needs to be fixed, but the code below prevents the memory from growing past one invocation's worth:

use std::thread::sleep;
use sysinfo::{Disks, MINIMUM_CPU_UPDATE_INTERVAL};

fn main() {
    // Workaround: run each refresh on a short-lived thread so its memory is released when the thread exits.
    loop {
        let handle = std::thread::spawn(move || {
            let mut disks = Disks::new_with_refreshed_list();
            disks.refresh_list();
            disks.refresh();
            disks.first().map(|disk| {
                let available_space = disk.available_space();
                println!("{available_space}");
            });
            sleep(MINIMUM_CPU_UPDATE_INTERVAL);
        });
        let _ = handle.join();
    }
}

@GuillaumeGomez
Owner

It'd be much simpler if we could find out where the leaked memory comes from. If anyone knows a good memory debugger for mac...

@keabarnes
Author

keabarnes commented Jun 10, 2024

In my earlier comment, I used cargo instruments (`cargo install cargo-instruments`), which uses the standard macOS tools, running the minimal example with `cargo instruments -t Allocations --time-limit 10000`.

In the Instruments application, on the left side, you can change from Statistics to Call Trees, which is much more helpful.
[images: the left-hand pane switched from the Statistics view to the Call Trees view]

Then, you can traverse the tree and see where the memory is being allocated from. In the comment above, you can see that in my case the lowest-level user-space call was sysinfo::unix::apple::disk::get_disk_properties.
[image: call tree showing allocations under get_disk_properties]

I imagine the solution would involve calling release on one of the underlying framework entities, maybe the result stored from CFURLCopyResourcePropertiesForKeys or something like that.
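
For context, the Core Foundation convention that suggestion is based on: functions with Copy or Create in their name return a reference the caller owns and must balance with CFRelease. The sketch below only illustrates that rule, with simplified type aliases and a hypothetical call site; it is not sysinfo's actual code.

use std::ffi::c_void;

// Simplified aliases; the real definitions live in core-foundation-sys.
type CFTypeRef = *const c_void;
type CFURLRef = *const c_void;
type CFArrayRef = *const c_void;
type CFErrorRef = *mut c_void;
type CFDictionaryRef = *const c_void;

#[link(name = "CoreFoundation", kind = "framework")]
extern "C" {
    fn CFURLCopyResourcePropertiesForKeys(
        url: CFURLRef,
        keys: CFArrayRef,
        error: *mut CFErrorRef,
    ) -> CFDictionaryRef;
    fn CFRelease(cf: CFTypeRef);
}

// The "Copy" in the name means the caller owns the returned dictionary,
// so it must be balanced with CFRelease or the allocation is never freed.
unsafe fn copy_disk_properties(url: CFURLRef, keys: CFArrayRef) {
    let mut error: CFErrorRef = std::ptr::null_mut();
    let props = CFURLCopyResourcePropertiesForKeys(url, keys, &mut error);
    if !props.is_null() {
        // ... read the wanted values out of `props` here ...
        CFRelease(props);
    }
}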

@GuillaumeGomez
Owner

Thanks for the detailed explanations! I'll give it another try when I have time.

@gongzhengyang
Contributor

System: macOS Sonoma 14.5
sysinfo version: 0.30.12

It seems to be no problem for me.

If I run the code below for a long time, I notice the memory grows to about 50 MB, then drops back to about 10 MB, and then increases and drops again.

loop {
    let _ = Disks::new_with_refreshed_list();
    sleep(MINIMUM_CPU_UPDATE_INTERVAL);
}

@complexspaces
Contributor

complexspaces commented Aug 10, 2024

Hi everyone. I'm the author of the code in question here. During its original development I spent a lot of time making sure there were no CoreFoundation-induced leaks, so I took a look at this one today to triple-check. And: I've concluded it's not a memory leak.

I actually had to deal with a very similar problem to this in my crate arboard. In that case, the OS frameworks were caching resources as a result of Objective-C calls. The same thing is happening here (as can be seen from the multiple caching references in the Instruments call stacks). Note that the Objective-C calls in the stack trace are the ones with [] around them.

The root cause here is that Objective-C implicitly autoreleases a lot, and caches will often take advantage of this for automatic cleanup. The part that tipped me off was this example spawning threads to remove the leak. On macOS, every thread has its own autorelease pool stack, which it's responsible for draining. In "normal" macOS apps, the main thread's event loop does this for you implicitly. In Rust... it does not.

This means that if you have a single thread calling the same function extremely fast with autoreleasing objects somewhere in the callstack, you get what appears to be a memory leak. But, once the main thread exits (or each temporary thread in a loop), the thread's pools are cleaned up automatically by the OS. This is also why the leaks CLI doesn't report any leaks on the original reproducer: by the time main returns, the memory has been freed.

To demonstrate this, you can use one of the original reproducers and wrap it in a new autorelease pool per loop, which will be drained when the scope ends (at the end of each loop iteration):

let mut disks = Disks::new_with_refreshed_list();
loop {
    objc2::rc::autoreleasepool(|_pool| {
        disks.refresh_list();
        disks.refresh();
        std::thread::sleep(sysinfo::MINIMUM_CPU_UPDATE_INTERVAL);
    });
}

See the with/without autorelease pool for allocation traces:
[image: allocation traces with and without the autorelease pool]

With the explanation done, how should this be fixed? IMO, refreshing disks fast enough to incur this is pathological behavior and not something everyone will do, so adding an autorelease pool to sysinfo is the wrong direction. This is because the autorelease pool is unnecessary for other use cases, like if it's compiled into a proper Cocoa app or used at a slower pace. The same was true in arboard, except that was actually using Objective-C and had a tighter contract to uphold because it dealt with many objects autoretained from AppKit. Given all that, I would lean towards saying it's the job of more "specialized" applications to handle this on their own and add autorelease pools if the temporary memory spikes are not tolerable. Of course, this is up to @GuillaumeGomez as the crate maintainer.

@GuillaumeGomez
Owner

Thanks for the very detailed explanations!

So if I understand correctly, getting disk information in a thread would allow for the extra resources to be cleared once the thread is removed, right? That would be an acceptable solution.

Otherwise, not doing anything would be acceptable too.

@complexspaces
Contributor

complexspaces commented Aug 10, 2024

Yes, that is correct 👍. At the cost of some unsafe code (or an objc2 dependency), a manual autorelease pool would incur less overhead than a full thread spawn in sysinfo, but someone would need to benchmark it to see if it makes a noticeable difference.
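
As a rough sketch of what that could look like (illustrative only, not sysinfo's implementation; objc2::rc::autoreleasepool is essentially a safe wrapper around the same two libobjc calls):

use std::ffi::c_void;

#[link(name = "objc", kind = "dylib")]
extern "C" {
    fn objc_autoreleasePoolPush() -> *mut c_void;
    fn objc_autoreleasePoolPop(pool: *mut c_void);
}

// RAII guard: pushes a pool on creation and drains it (releasing every
// object autoreleased while it was active) on drop.
struct AutoreleasePool(*mut c_void);

impl AutoreleasePool {
    fn new() -> Self {
        Self(unsafe { objc_autoreleasePoolPush() })
    }
}

impl Drop for AutoreleasePool {
    fn drop(&mut self) {
        unsafe { objc_autoreleasePoolPop(self.0) }
    }
}

// Hypothetical helper: run `f` with a fresh pool that is drained on return.
fn with_autorelease_pool<T>(f: impl FnOnce() -> T) -> T {
    let _pool = AutoreleasePool::new();
    f()
}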

@GuillaumeGomez
Owner

I'd rather just use some unsafe code directly. To benchmark it, you can add a new benchmark for it in benches so that this function in particular gets called, and then compare before/after. We can think about it once we have the results. Since I can't trigger the memory bug on my computer, I'm not sure it'd be worth checking here. Wanna give it a try?
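
A benchmark along those lines could look roughly like this, assuming the nightly #[bench] harness used by the files in benches; the benchmark name is illustrative:

#![feature(test)]
extern crate test;

use sysinfo::Disks;

#[bench]
fn bench_disks_refresh_list(b: &mut test::Bencher) {
    let mut disks = Disks::new_with_refreshed_list();
    b.iter(|| {
        // The call whose cost (with or without a manual pool) we want to compare.
        disks.refresh_list();
    });
}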

@keabarnes
Author

keabarnes commented Aug 28, 2024

Thanks @complexspaces for the detailed response!

Just to add a response to this portion of the comment:

With the explanation done, how should this be fixed? IMO, refreshing disks fast enough to incur this is pathological behavior and not something everyone will do, so adding an autorelease pool to sysinfo is the wrong direction. This is because the autorelease pool is unnecessary for other use cases, like if it's compiled into a proper Cocoa app or used at a slower pace. The same was true in arboard, except that was actually using Objective-C and had a tighter contract to uphold because it dealt with many objects autoretained from AppKit. Given all that, I would lean towards saying it's the job of more "specialized" applications to handle this on their own and add autorelease pools if the temporary memory spikes are not tolerable. Of course, this is up to @GuillaumeGomez as the crate maintainer.

I only used a tight loop to exacerbate the issue; in our application it would normally be called only every 15 minutes, and the application would crash after a couple of days. I don't think this use case for the library is too foreign. For example, a toolbar application that has a thread to check disk usage seems like it would be fairly common.

Either way, we will pursue the objc2::rc::autoreleasepool approach in our codebase to mitigate the issue for now. Thank you for highlighting that solution; I wasn't aware of the method or that it could be called this way outside of the underlying Objective-C code :)

@GuillaumeGomez
Owner

The fact that the application crashes after a few days is not acceptable to me. Rust now provides thread::scope, which would allow the data to be fetched on a thread whose memory is freed when it exits. If that sounds fine to both of you, I can implement it and make a release in the next few hours.
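
A rough sketch of that idea (illustrative only, not necessarily the code that would land): the refresh runs on a short-lived scoped thread, and the OS tears down that thread's autorelease pools when it exits.

use std::thread;
use sysinfo::Disks;

// Hypothetical helper: build and refresh the disk list on a scoped thread,
// returning only plain data to the caller.
fn disk_available_spaces() -> Vec<u64> {
    thread::scope(|s| {
        s.spawn(|| {
            let disks = Disks::new_with_refreshed_list();
            disks.iter().map(|d| d.available_space()).collect()
        })
        .join()
        .expect("disk refresh thread panicked")
    })
}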

@complexspaces
Contributor

I agree, a crash every few days is more problematic. Please have a look at #1344, which implements the autorelease pool solution I proposed earlier.

I would also appreciate a smoke test of that branch if anyone has time.

@GuillaumeGomez
Owner

Published the new version with this fix.

@keabarnes
Author

Thanks for the work on this! I'll put it through testing on our side 🙂
