Reduced CStack limit in Built Image #712

Open
Ramachandran-CHEENIYIL opened this issue Oct 11, 2023 · 7 comments
Labels: bug (Something isn't working), needs more info (Further information is requested), pre-built images (Related to pre-built images)

Comments

@Ramachandran-CHEENIYIL

Ramachandran-CHEENIYIL commented Oct 11, 2023

Container image name

rocker/shiny-verse:4.1.3

Container image digest

rocker/shiny-verse@sha256:cef2c8eb1202797a224dfad9cd01312ce13041dee43d542a57a5e7c67c9e753f

What operating system are you seeing the problem on?

Linux

System information

- Docker version 20.10.13, build a224086

Bug description

The C stack limit in images built from this base image is half of what is available on my local Windows system. This causes certain resource-intensive processes in our engine to fail with the error message:
Warning: Error in : C stack usage 7969508 is too close to the limit

The failing cases differ from the working ones by as little as the input CSV file having 100 columns instead of 50.
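For scale, here is a rough sketch of the arithmetic behind that error message. It assumes the 8192 KiB soft stack limit that `ulimit -s` reports on Linux further down this thread; the variable names are only illustrative.

```shell
# Usage figure copied from the error message above (bytes).
usage_bytes=7969508
# Assumed soft stack limit; `ulimit -s` reports this value in KiB.
limit_kib=8192

# Convert the limit to bytes so both figures use the same unit.
limit_bytes=$((limit_kib * 1024))
# Integer percentage of the limit that was in use when R raised the error.
usage_pct=$((usage_bytes * 100 / limit_bytes))

echo "limit: ${limit_bytes} bytes, usage: ${usage_pct}% of the limit"
```

Under that assumption the process was at about 95% of an 8 MiB stack when the error was raised.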

It is not possible to change this limit from within R, and the environment where the image is hosted has security measures that prevent us from tampering with the base image by writing custom commands in the Dockerfile.

How to reproduce this bug?

Run the following function in an R console on a local machine:

Cstack_info()

Afterwards, run the same function in any image built from rocker/shiny-verse, or in the base image itself, as shown below:

docker run -ti rocker/shiny-verse:4.1.3 R -q -e "Cstack_info()"

You will likely find that the size value reported by Cstack_info() (that is, Cstack_info()["size"]) is half of what you see on a local Windows system.

Ramachandran-CHEENIYIL added the bug label Oct 11, 2023
@eitsupi
Member

eitsupi commented Oct 11, 2023

Thanks for explaining the problem. But is this a problem with the Rocker project?

This seems to be the default value for Linux.
https://support.posit.co/hc/en-us/articles/15137303824279-C-Stack-usage-limit

If this is coming from Linux, I think the change should be made carefully.
(Or I don't think it should be done here.)

@eddelbuettel @cboettig Any thoughts?

eitsupi added the needs more info and pre-built images labels Oct 11, 2023
@eddelbuettel
Member

Can you start from a bash shell, try ulimit -s someval for some suitable value, and then call R?
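A minimal sketch of that suggestion follows. `run_with_stack` is a hypothetical helper name, not something provided by Rocker or R, and 16384 is an arbitrary example value. Raising the soft limit only succeeds up to the hard limit that the container was started with.

```shell
# Hypothetical helper: run a command with the given soft stack limit (in KiB).
run_with_stack() {
  stack_kib=$1
  shift
  # The subshell keeps the changed limit from leaking into the calling shell;
  # the command is only started if setting the limit succeeded.
  ( ulimit -S -s "$stack_kib" && "$@" )
}

# Example (requires R on the PATH):
# run_with_stack 16384 R -q -e 'Cstack_info()'
```

Because child processes inherit the limit of the shell that starts them, R launched this way should see the new value in Cstack_info().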

@eddelbuettel
Member

Also, for what it is worth, on my Linux host I see the same value outside of Docker and inside:

$ ulimit -s
8192
$ docker run --rm -ti rocker/r2u bash -c 'ulimit -s'
8192
$ 

@Ramachandran-CHEENIYIL
Author

Ramachandran-CHEENIYIL commented Oct 13, 2023

Hello @eddelbuettel! :-) It makes sense that on your system you see the same size both inside and outside Docker, since you are on Linux and the container closely resembles your host environment.
As you suggested, I tried to increase the stack limit in my container by adding the following line to my Dockerfile, and hence to the build:

# Increase stack size
RUN ulimit -s 16384

However, this did not increase the stack limit, which remained at less than half that amount. I did manage to fix the code, though, by splitting the wider data frames passed to the problematic UDF column-wise into two "leaner" sets of data frames, calling the same UDF on each set, and then combining the results.

myUDF <- function(df1, df2, repeatOperation = TRUE){
  
  #...some data wrangling which will either already have a return call, or return df1 and df2 as data frames of the exact same width...
  
  if(ncol(df1) > someValueA && isTRUE(repeatOperation)){
    splitDF1_One <- df1[,1:(someValueC)]
    splitDF1_Two <- cbind(df1[,1:(someValueB)], df1[,(someValueC + 1):ncol(df1)])
    
    splitDF2_One <- df2[,1:(someValueC)]
    splitDF2_Two <- cbind(df2[,1:(someValueB)], df2[,(someValueC + 1):ncol(df2)])
    
    Set1Result <- myUDF(splitDF1_One, splitDF2_One, repeatOperation = FALSE)
    
    Set2Result <- myUDF(splitDF1_Two, splitDF2_Two, repeatOperation = FALSE)
    
    #...some operations to combine the results Set1Result and Set2Result to give the final result of the function...
    
    return(finalResults)
    
  }
  
  #...some calculations...
  
  return(results)
  
}

Perhaps not the best approach, but this is holding up in all of my tests so far. I would still argue, though, that ideally the limit should be increased in the base image. I'll leave that decision to the developers.
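On why the `RUN ulimit -s 16384` line had no effect: a resource limit is an attribute of a running process, inherited by its children, and is not stored in the image. Each RUN step executes in its own shell, which exits when the step finishes. The sketch below imitates that with a plain child shell instead of a Docker build; the value 4096 is arbitrary.

```shell
# Soft stack limit of the current shell before anything is changed.
before=$(ulimit -S -s)

# Change the limit inside a separate, short-lived shell, much as a RUN step would.
# The change is visible inside that child only.
inner=$(sh -c 'ulimit -S -s 4096; ulimit -S -s')

# Back in the original shell the limit is exactly what it was.
after=$(ulimit -S -s)

echo "before=${before} inside-child=${inner} after=${after}"
```

The limit therefore has to be set by whatever starts the container, or by the process that launches R inside it.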

@CoolShades

CoolShades commented Oct 16, 2023

Got a solution for you @Ramachandran-CHEENIYIL

It seems docker allows you to set ulimits on run.

If you are using docker-compose:

services:
  rocker:
    image: rocker/ml
    ulimits:
      nofile:
        soft: 20000
        hard: 40000
      stack:
        soft: 67108864 # 64MB in bytes
        hard: 134217728 # 128MB in bytes

If you are using plain docker run from the CLI, it's:

docker run \
  --ulimit nofile=20000:40000 \
  --ulimit stack=67108864:134217728 \
  rocker/ml

I have tested it, and it works beautifully.
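The stack values above are given in bytes, whereas `ulimit -s` inside the container reports KiB. A small sketch of the conversion, assuming 1 MB here means 1024 * 1024 bytes (the helper names are hypothetical):

```shell
# Hypothetical helpers for converting between the units involved.
mib_to_bytes() { echo $(( $1 * 1024 * 1024 )); }
bytes_to_kib() { echo $(( $1 / 1024 )); }

soft=$(mib_to_bytes 64)    # soft limit used in the examples above
hard=$(mib_to_bytes 128)   # hard limit used in the examples above

# Flag in the form taken by docker run.
echo "--ulimit stack=${soft}:${hard}"
# Soft limit expressed in the unit that `ulimit -s` reports.
echo "soft limit in KiB: $(bytes_to_kib "$soft")"
```

With these settings, `ulimit -s` in the container would be expected to show 65536 rather than the default 8192.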

@cboettig
Member

@CoolShades thanks, that's awesome. (I've occasionally been bitten by the open-file limits.)

@Ramachandran-CHEENIYIL
Author

@CoolShades Thank you for the idea :-) However, the run command is not where I can set the limit. I am not very familiar with Docker Compose, but from what I have read, it is not recommended for production environments. Is there a justification to the contrary here? If so, I will try it. Also, the image will not be run from the CLI. It will be pushed to a container registry, from where a web application will pick it up and host it.
