
Worker exceeded 95% memory budget #476

Open
beyucel opened this issue Jun 2, 2020 · 6 comments

beyucel commented Jun 2, 2020

I just wanted to discuss the memory usage issue with this notebook.
When the chunk size is above 25 (>250 MB), a single worker climbs to 6.3 GB of memory usage and the kernel restarts. When the chunk size is 25 or below, there is no problem.

My question is: why do 300 MB chunks cause such high memory usage?
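
(For context, the message in the title is dask.distributed terminating and restarting a worker once it crosses 95% of its memory limit.) Here is a minimal sketch, with placeholder shapes rather than the notebook's real data, of how I'm checking what a given chunk size means in bytes before the array reaches the workers:

```python
# Hypothetical shapes, just to illustrate inspecting dask chunk sizes.
import dask.array as da

chunk = 25  # samples per chunk; the threshold where the workers start dying
x = da.random.random((1000, 101, 101), chunks=(chunk, 101, 101))

print(x.chunksize)                                      # (25, 101, 101)
print(x.nbytes / x.npartitions / 1e6, "MB per chunk")   # bytes each task holds
```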


wd15 commented Jun 2, 2020

We need to profile the memory usage. Let's check the delta first to see if that makes sense.
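
As a sanity check on what a "sensible" delta looks like, something along these lines (the sample count is hypothetical; cutoff=15 matches the pipeline below) gives a back-of-envelope bound for the two-point statistics step:

```python
# Back-of-envelope size of the two-point correlation output.
n_samples = 1000                     # hypothetical
cutoff = 15                          # matches the pipeline below
voxels = (2 * cutoff + 1) ** 3       # 31**3 points per correlation
mb = n_samples * voxels * 8 / 1e6    # float64 -> bytes -> MB
print(f"expected two-point stats: ~{mb:.0f} MB")
```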

@wd15 wd15 added this to the 0.4 milestone Jun 2, 2020

beyucel commented Jun 3, 2020

I initially tried memory_profiler and the %memit magic function. I'm not sure whether it does what we want or whether I used it incorrectly (I am investigating that), because it reports a notebook peak memory of 219.80 MiB with an increment of 14.20 MiB, which does not seem reasonable. Watching htop, the memory usage for those lines is a lot higher. I will try the other two memory profilers as well.
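
For what it's worth, here is roughly what I'm trying next (a sketch; if the big allocations happen inside separate dask worker processes, %memit only watching the kernel process would explain the small numbers):

```python
# Sketch: measure the kernel *and* its child processes, since plain
# %memit only watches the notebook kernel itself.
from memory_profiler import memory_usage

def run():
    return HomogenizationPipeline(x)  # pipeline and data from the notebook

trace = memory_usage((run, (), {}), interval=0.1, include_children=True)
print(f"peak (kernel + children): {max(trace):.1f} MiB")
```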


wd15 commented Jun 3, 2020

I would do all the memory profiling outside of the notebook for starters, as the notebook can confuse things. Also, start with only one process to get a good baseline and make sure you understand the delta between each step in the code. Furthermore, breaking the code down into imperative steps might help.
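
Something like this skeleton is what I have in mind (the names and the stand-in steps are placeholders): one imperative step per line, decorated with @profile, run outside the notebook so each line gets its own increment.

```python
# Skeleton for line-by-line profiling outside the notebook.
# Run with:  python -m memory_profiler memory_try.py
import numpy as np
from memory_profiler import profile

@profile
def pipeline(x):
    a1 = x.astype("float64")       # stand-in for the first transform
    a2 = a1.reshape(len(a1), -1)   # stand-in for flattening
    return a2.sum(axis=1)          # stand-in for the final reduction

if __name__ == "__main__":
    pipeline(np.random.randint(0, 2, (100, 51, 51, 51)))
```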


beyucel commented Jun 3, 2020

Thanks, Daniel. That is what I am trying to do right now. I will share the delta values for each step.


beyucel commented Jun 4, 2020

```
Filename: memory_try.py

Line #    Mem usage      Increment    Line Contents
================================================
    39    183.129 MiB    183.129 MiB   @profile
    40                                 def HomogenizationPipeline(x):
    41    183.215 MiB      0.086 MiB       a1 = PrimitiveTransformer(n_state=2, min_=0.0, max_=1.0).transform(x)
    42    183.762 MiB      0.547 MiB       a2 = TwoPointCorrelation(periodic_boundary=True, cutoff=15, correlations=[(1, 1)]).transform(a1)
    43    183.762 MiB      0.000 MiB       a3 = FlattenTransformer().transform(a2)
    44  10015.367 MiB   9831.605 MiB       a4 = PCA(n_components=3).fit_transform(a3)
    45  10015.367 MiB      0.000 MiB       return a4
```
This is the non-compute version; it does not tell us much because the first three lines are lazy and all of the computation happens in the PCA fit_transform. I will add the compute version as well for discussion. This still uses the same notebook as above (I just moved the notebook code into a separate .py file and used the notebook as a shell).
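
The compute version I'm putting together looks roughly like this (a sketch: I'm assuming the first three transformers return lazy dask arrays, which the zero increments above suggest, and that they are importable from pymks as in the notebook; np.asarray forces evaluation either way):

```python
import numpy as np
from memory_profiler import profile
from sklearn.decomposition import PCA
from pymks import (FlattenTransformer, PrimitiveTransformer,
                   TwoPointCorrelation)

@profile
def HomogenizationPipeline_compute(x):
    # np.asarray materializes each lazy intermediate, so the memory
    # delta shows up on the line that actually does the work.
    a1 = np.asarray(PrimitiveTransformer(n_state=2, min_=0.0, max_=1.0).transform(x))
    a2 = np.asarray(TwoPointCorrelation(periodic_boundary=True, cutoff=15,
                                        correlations=[(1, 1)]).transform(a1))
    a3 = np.asarray(FlattenTransformer().transform(a2))
    return PCA(n_components=3).fit_transform(a3)
```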

@beyucel beyucel modified the milestones: 0.4, 0.4.1 Jul 2, 2020
@wd15 wd15 modified the milestones: 0.4.1, 0.4.2 Aug 17, 2020
@wd15 wd15 modified the milestones: 0.4.2, 0.5 Aug 2, 2021

wd15 commented Aug 3, 2021

@beyucel is this still an issue? Can this be closed? Please close it if you think this isn't something we can act on.

@wd15 wd15 modified the milestones: 0.5, 0.5.1 Aug 3, 2021