
Memory Leak in Parsing #54

Open
jkrot opened this issue Aug 3, 2016 · 22 comments

jkrot commented Aug 3, 2016

https://github.com/chafey/dicomParser/blob/34c02a7cec6b71a68ae21856029a8f5d957e900e/src/readSequenceElementExplicit.js

So I have been doing tests parsing 1000+ files, and I keep running into a memory leak / inefficient garbage collection with these functions. It is maxing out my heap and causing the parsing to crawl while waiting for garbage collection to finish. I was wondering if anyone else has run into this problem. Also, I think I may be able to fix the underlying issue, but I might need to talk it through with @chafey.

yagni (Collaborator) commented Aug 3, 2016

What's your use case and what browser? I've been using the library to parse a 2GB directory (about 3200 files) with less than 150MB memory usage. Once I parse the files and extract the field values I need, I unset the byteArrays and that keeps the memory around 130MB. Are you sure it's not due to how you're using the objects?
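For example, roughly what I do after each parse (the tag and variable names here are just illustrative):

    var dataSet = dicomParser.parseDicom(byteArray);
    var patientName = dataSet.string('x00100010'); // extract the fields you need
    // then drop the references so the big buffers can be collected
    dataSet.byteArray = undefined;
    byteArray = undefined;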

chafey (Collaborator) commented Aug 3, 2016

I checked the source file you linked and don't see any unsetting of variables - did you mean to link to another file?

jkrot (Author) commented Aug 3, 2016

So I used that as an example of the earlier discussion. I think what is happening is that a fileReader is getting created for each file, rather than one fileReader being reused to parse them one at a time.

chafey (Collaborator) commented Aug 5, 2016

OK so you think your leak is outside the library then? If so, please close

jkrot (Author) commented Aug 5, 2016

I am still confirming that is the case. I am not sure, but I am still seeing some memory issues even with my changes.


chafey (Collaborator) commented Aug 18, 2016

@jkrot any update on this?

jkrot (Author) commented Aug 18, 2016

I have made headway on this. In https://github.com/chafey/dicomParser/blob/master/src/parseDicomDataSet.js, lines 36 & 70, declaring the variable inside the loop was causing heap growth for me. I am still poking around, because it is hard to find all the places where this type of leak is happening, but I am making progress. I will get a commit up by this weekend.

chafey (Collaborator) commented Aug 18, 2016

OK, this isn't really a "leak" per se, but more a case of "using more heap than necessary". The issue is that JavaScript hoists variable declarations but not their initializations. If we move the declaration out of the loop, or declare in the loop without assigning, we should be good. I am going to be working on dicomParser today and will see if I can fix this. Thanks
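For example, a sketch of the pattern I mean (readElement, byteStream, and count are stand-ins, not the actual dicomParser code):

    // Before: initialization written inside the loop body
    for (var i = 0; i !== count; i++) {
        var element = readElement(byteStream); // hypothetical read helper
        dataSet.elements[element.tag] = element;
    }

    // After: declare once, unassigned, so the declaration is not
    // repeated as part of each iteration
    var element;
    for (var i = 0; i !== count; i++) {
        element = readElement(byteStream);
        dataSet.elements[element.tag] = element;
    }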

chafey (Collaborator) commented Aug 18, 2016

jkrot (Author) commented Aug 18, 2016

Correct. The problem is that I am maxing out my heap, and when that happens the parsing slows to an absolute crawl while it waits for the garbage collector to free memory. Part of this was my fault in how I was using the reader, which I fixed, but part of it calls for some optimization in the parsing code for better heap management, so it isn't stuck waiting on the collector.

I guess you never expected to parse several thousand files from disk in parallel.

jkrot (Author) commented Aug 18, 2016

chafey (Collaborator) commented Aug 18, 2016

OK, I'll wait for your PR; please rebase with latest first.

jkrot (Author) commented Aug 22, 2016

So the only way I was able to resolve this was to set up workers and destroy them when done. I tried forcing collection to happen in multiple places while tracing through the stack, and just couldn't get it to happen. I suspect this has more to do with the fileReader than with dicomParser, but I tried setting my fileReader = undefined; in its onloadend function and that didn't help. I traced back from the dicomParser.explicitElementToString util all the way to my promise chain for processing these files and just couldn't get it to free up the memory along the way.

I still feel like there is room for improvement somewhere, but I can't seem to narrow it down. I do know this happens because the max heap size is being hit with 1000+ files, and those files get parsed in under 30 seconds, so collection doesn't have time to run.
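The shape of what ended up working for me (the worker script name and the callbacks are simplified stand-ins):

    // main thread: parse each file in a dedicated worker, then terminate it
    var worker = new Worker('parseWorker.js'); // hypothetical worker script
    worker.onmessage = function (e) {
        saveResults(e.data);   // whatever you do with the extracted fields
        worker.terminate();    // reclaims the worker's entire heap at once
    };
    // transfer the buffer into the worker instead of copying it
    worker.postMessage(byteArray.buffer, [byteArray.buffer]);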

chafey (Collaborator) commented Sep 8, 2016

Any update on this? Please close unless you have evidence of a real issue

jkrot (Author) commented Sep 8, 2016

So, still working on this. I am now classifying it as a heap management issue. The library does a ton of string creation everywhere, and that is all fine and dandy because strings are easy to garbage collect once they are no longer referenced. However, if you are processing a large number of files at a time, they may not be garbage collected fast enough.

The two workarounds I have found are to make everything an object, because you can delete object properties yourself and free that memory, or to push the work into a web worker and terminate the worker when done, because its memory is then cleaned up. I have tried changing this in a few places to free some heap and it has helped a little, but I figure it would take a ton of modification across the entire library, using objects and properties and deleting the properties when done, to fix the heap issue I am encountering.

I have done this in my own code by making the fileReader a property on an object and deleting it after I am done, to force the memory to be cleaned up faster. That has helped some, but there are still strings in the library holding onto their references longer than I expected. This is a very hard issue to track, because I have to inspect heap snapshots and work out why specific memory items could not be collected by garbage collection.
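For example, the object-property approach I described (the names here are illustrative):

    // Keep the FileReader as a deletable property so it can be released
    // explicitly instead of waiting for the collector
    var job = { reader: new FileReader() };
    job.reader.onloadend = function () {
        var byteArray = new Uint8Array(job.reader.result);
        // ... parse byteArray, extract fields ...
        delete job.reader; // drop the reference as soon as we're done
    };
    job.reader.readAsArrayBuffer(file);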

chafey (Collaborator) commented Sep 8, 2016

Interesting information! I would say that your use case is atypical, so making significant changes to the library might be hard to justify, especially if they have any negative side effects for the more common use cases (e.g. slower performance). I hate to say it, but your use case might be better served by a C++ DICOM parsing library like dcmtk or gdcm.

jkrot (Author) commented Sep 14, 2016

I will have my first pass at heap reduction done by this weekend. It won't be complete, but I am aiming for a 10% reduction.

jkrot (Author) commented Sep 16, 2016

#56 Pull request for improvements.

jkrot (Author) commented Jan 24, 2017

Hey @chafey, I know you have been busy; any chance of you looking at the pull request for the heap improvements? I have been running it in a forked branch for a while now.

chafey (Collaborator) commented Feb 16, 2017

@jkrot we will get to this soon, sorry for the delay!

jkrot (Author) commented Mar 6, 2017

So I have noticed something else as well. After the initial parse, when using the untilTag option, consider the byte array returned from

    try {
        var options = {
            omitPrivateAttributes: true,
            untilTag: 'x0020000e',
            maxElementLength: 128
        };
        var dataSet = dicomParser.parseDicom(byteArray, options);
    } catch (ex) {
        console.log('parsing failed: ' + ex);
    }

I was expecting the byteArray returned to be smaller, since we were trimming results to a specific tag, but that was not the case. I need to pre-slice the byteArray before I send it to the parser, because I am not interested in the media in the DICOM file, just the header data itself. Occasionally I am getting parsing errors in this endeavor. I thought 4k of byte stream would be more than enough in most cases, but files with lots of private tags, or files that have been anonymized in a strange manner, sometimes break the parsing. Any ideas/thoughts on how best to trim down the size?
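What I mean by pre-slicing (the 4k cutoff is the guess that sometimes breaks):

    // parse only the head of the file; this fails when private tags or odd
    // anonymization push the tags I want past the cutoff
    var headBytes = byteArray.subarray(0, 4096); // a view, no copy
    var dataSet = dicomParser.parseDicom(headBytes, { untilTag: 'x0020000e' });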
