
Memory Leak #9

Open
weixiyen opened this issue Apr 7, 2018 · 9 comments · Fixed by #10

Comments

@weixiyen commented Apr 7, 2018

I'm processing about 2000 images per minute across 8 nodes in production and have noticed that system memory keeps climbing.

:erlang.memory() is stable, but system memory grows by roughly 50 MB per hour per node until the VM crashes.

It seems the NIF bindings in exmagick might be the issue, with memory not being freed somewhere.

The functions I'm using are init!, image_load!, image_dump!, and size!

Version: 0.0.5

@dgvncsz0f (Contributor)

Looking at the code, my guess is image_dump!. enif_make_resource_binary requires a matching enif_release_resource, which is missing. I'll write a test to verify this, and if that's the case I'll put together a patch.

Thanks!
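For reference, a minimal sketch of the allocate/release pattern described above, assuming a hypothetical image_res struct and module name rather than exmagick's actual code: enif_alloc_resource returns a resource whose reference is owned by the NIF, enif_make_resource_binary gives the returned binary its own hold on that resource, and enif_release_resource drops the NIF's reference so the destructor can eventually run.

```c
#include <erl_nif.h>
#include <string.h>

/* Hypothetical resource wrapping an image blob; not exmagick's real struct. */
typedef struct {
    unsigned char *blob;
    size_t         size;
} image_res;

static ErlNifResourceType *IMAGE_RES_TYPE;

/* Runs when the VM drops the last reference to the resource. */
static void image_res_dtor(ErlNifEnv *env, void *obj) {
    image_res *img = (image_res *) obj;
    if (img->blob != NULL)
        enif_free(img->blob);
}

static int load(ErlNifEnv *env, void **priv, ERL_NIF_TERM info) {
    IMAGE_RES_TYPE = enif_open_resource_type(env, NULL, "image_res",
                                             image_res_dtor,
                                             ERL_NIF_RT_CREATE, NULL);
    return IMAGE_RES_TYPE == NULL ? -1 : 0;
}

static ERL_NIF_TERM image_dump(ErlNifEnv *env, int argc,
                               const ERL_NIF_TERM argv[]) {
    /* enif_alloc_resource hands back a resource with one reference
       owned by this NIF. */
    image_res *img = enif_alloc_resource(IMAGE_RES_TYPE, sizeof(image_res));
    img->size = 4;
    img->blob = enif_alloc(img->size);        /* stand-in for real image data */
    memcpy(img->blob, "IMG!", img->size);

    /* The binary term keeps its own reference to the resource, so the
       data stays alive for as long as the binary is reachable. */
    ERL_NIF_TERM bin = enif_make_resource_binary(env, img, img->blob, img->size);

    /* Drop the NIF's reference. Without this call the refcount never
       reaches zero, the destructor never runs, and every dump leaks
       one image-sized buffer: the bug suspected here. */
    enif_release_resource(img);

    return bin;
}

static ErlNifFunc nif_funcs[] = {
    {"image_dump", 0, image_dump}
};

ERL_NIF_INIT(Elixir.ImageLeakDemo, nif_funcs, load, NULL, NULL, NULL)
```

This matches the strategy the erl_nif documentation suggests for enif_make_resource_binary: release the resource right after creating the binary and let the binary's own reference keep it alive for as long as needed.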

@dgvncsz0f (Contributor)

@weixiyen could you test #10 and let me know if it fixes your problem?

@weixiyen (Author) commented Apr 11, 2018

I added the config to my deps and it works fine locally.

{:exmagick, git: "https://github.com/Xerpa/exmagick.git", ref: "b9162a1bd0908d0e3d5d501a07764c0e2936eaa2"},

In prod I'm getting this crash on startup with that commit hash, whereas hex.pm's version 0.0.5 works fine.

2018-04-11 03:36:29 crash_report
    initial_call: {supervisor,kernel,['Argument__1']}
    pid: <0.3408.0>
    registered_name: []
    error_info: {exit,{on_load_function_failed,'Elixir.ExMagick'},[{gen_server,init_it,6,[{file,"gen_server.erl"},{line,352}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]}
    ancestors: [kernel_sup,<0.3383.0>]
    messages: []
    links: [<0.3384.0>]
    dictionary: []
    trap_exit: true
    status: running
    heap_size: 376
    stack_size: 27
    reductions: 117
2018-04-11 03:36:29 supervisor_report
    supervisor: {local,kernel_sup}
    errorContext: start_error
    reason: {on_load_function_failed,'Elixir.ExMagick'}
    offender: [{pid,undefined},{id,kernel_safe_sup},{mfargs,{supervisor,start_link,[{local,kernel_safe_sup},kernel,safe]}},{restart_type,permanent},{shutdown,infinity},{child_type,supervisor}]
2018-04-11 03:36:30 crash_report
    initial_call: {application_master,init,['Argument__1','Argument__2','Argument__3','Argument__4']}
    pid: <0.3382.0>
    registered_name: []
    error_info: {exit,{{shutdown,{failed_to_start_child,kernel_safe_sup,{on_load_function_failed,'Elixir.ExMagick'}}},{kernel,start,[normal,[]]}},[{application_master,init,4,[{file,"application_master.erl"},{line,134}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]}
    ancestors: [<0.3381.0>]
    messages: [{'EXIT',<0.3383.0>,normal}]
    links: [<0.3381.0>,<0.3380.0>]
    dictionary: []
    trap_exit: true
    status: running
    heap_size: 376
    stack_size: 27
    reductions: 152
2018-04-11 03:36:30 std_info
    application: kernel
    exited: {{shutdown,{failed_to_start_child,kernel_safe_sup,{on_load_function_failed,'Elixir.ExMagick'}}},{kernel,start,[normal,[]]}}
    type: permanent
{"Kernel pid terminated",application_controller,"{application_start_failure,kernel,{{shutdown,{failed_to_start_child,kernel_safe_sup,{on_load_function_failed,'Elixir.ExMagick'}}},{kernel,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,kernel,{{shutdown,{failed_to_start_child,kernel_safe_sup,{on_load_function_failed,'Elixir.ExMagick'}}},{kernel,start,[normal,

@dgvncsz0f (Contributor)

The master branch of exmagick includes flags that enable dirty schedulers, which requires a fairly recent Erlang version; that is probably why. We've been testing this in production for a while now and it is stable, but we haven't made a release yet to avoid breaking compatibility. Let me fix that and then we can test again.

@dgvncsz0f (Contributor) commented Apr 11, 2018

OK, I've made the dirty scheduler disabled by default (opt-in via an env var). Could you try it again?
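As a rough illustration of what an opt-in dirty-scheduler flag can look like on the C side (the EXM_USE_DIRTY_SCHEDULERS macro, the function, and the module name are hypothetical, not exmagick's actual build setup; the env var would simply control whether the macro is defined when the NIF is compiled):

```c
#include <erl_nif.h>

static ERL_NIF_TERM image_dump(ErlNifEnv *env, int argc,
                               const ERL_NIF_TERM argv[]) {
    /* ... long-running image work would go here ... */
    return enif_make_atom(env, "ok");
}

/* When the (hypothetical) EXM_USE_DIRTY_SCHEDULERS macro is defined at build
   time, the NIF is registered to run on a dirty CPU scheduler. Loading a NIF
   flagged this way on an ERTS without dirty-scheduler support fails with
   on_load_function_failed, which is consistent with the crash above; with the
   flag left at 0, the library loads as a plain NIF. */
#ifdef EXM_USE_DIRTY_SCHEDULERS
#define EXM_DIRTY_FLAGS ERL_NIF_DIRTY_JOB_CPU_BOUND
#else
#define EXM_DIRTY_FLAGS 0
#endif

static ErlNifFunc nif_funcs[] = {
    {"image_dump", 1, image_dump, EXM_DIRTY_FLAGS}
};

ERL_NIF_INIT(Elixir.DirtySchedulerDemo, nif_funcs, NULL, NULL, NULL, NULL)
```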

@weixiyen (Author)

Yep, trying it again.

I just updated my Erlang distribution to erts-9.3 and have the env flag turned on.

I'll get back to you in roughly 12 hours with some graphs comparing before and after.

Thank you!

@weixiyen (Author) commented Apr 12, 2018

Hi @dgvncsz0f - just checking back in.

I think there is still a memory leak, and it's hard to tell how much the fix changed anything.

After running in prod for roughly 24 hours, there is still a big difference between what Erlang thinks it is using and what the Linux kernel reports.

For example, on one node, Erlang thinks it is using 300 MB of memory while the Linux kernel thinks it is using 5 GB.

According to Fred Hebert, this type of symptom could be caused by a NIF:
http://erlang.org/pipermail/erlang-questions/2013-September/075401.html

I can't be 100% sure, though.

Is it possible there are minor leaks in the init!, image_load!, or size! functions?

I'll try to do some more testing on my end as well to see if I can get more helpful information.
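To illustrate the kind of small leak being asked about, here is a generic sketch, not exmagick's actual code (load_image_blob and blob_dimensions are hypothetical stand-ins for calls into the C imaging library): buffers obtained from plain malloc inside a NIF are invisible to :erlang.memory/0, which only tracks the VM's own allocators, so an error path that forgets to free one grows the OS-level RSS without Erlang noticing.

```c
#include <erl_nif.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in: copy the input into a malloc'd buffer, the way a C
   imaging library might hold a decoded blob. malloc'd memory is not tracked
   by erlang:memory/0. */
static unsigned char *load_image_blob(const ErlNifBinary *in, size_t *out_size) {
    unsigned char *buf = malloc(in->size);
    if (buf != NULL) {
        memcpy(buf, in->data, in->size);
        *out_size = in->size;
    }
    return buf;
}

/* Hypothetical stand-in for querying dimensions; always "succeeds" here. */
static int blob_dimensions(const unsigned char *blob, size_t size,
                           unsigned long *w, unsigned long *h) {
    (void) blob; (void) size;
    *w = 100;
    *h = 100;
    return 1;
}

static ERL_NIF_TERM size_nif(ErlNifEnv *env, int argc, const ERL_NIF_TERM argv[]) {
    ErlNifBinary in;
    size_t blob_size;
    unsigned long w, h;

    if (!enif_inspect_binary(env, argv[0], &in))
        return enif_make_badarg(env);

    unsigned char *blob = load_image_blob(&in, &blob_size);
    if (blob == NULL)
        return enif_make_atom(env, "error");

    if (!blob_dimensions(blob, blob_size, &w, &h)) {
        /* The classic "minor" leak: returning on this error path without
           free(blob) leaks one image-sized buffer per failed call, and only
           OS-level RSS shows the growth. */
        free(blob);
        return enif_make_atom(env, "error");
    }

    free(blob);
    return enif_make_tuple2(env, enif_make_ulong(env, w),
                                 enif_make_ulong(env, h));
}

static ErlNifFunc nif_funcs[] = {
    {"size", 1, size_nif}
};

ERL_NIF_INIT(Elixir.SizeLeakDemo, nif_funcs, NULL, NULL, NULL, NULL)
```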

@dgvncsz0f (Contributor)

@weixiyen thanks. Any additional information you can provide will be valuable. Later today I'll continue investigating and look for other possible leaks. I'll get back to you as soon as I have something up.

Thanks!

@dgvncsz0f (Contributor)

Sorry, I accidentally closed this when merging the PR...

@dgvncsz0f dgvncsz0f reopened this Apr 13, 2018