
Possibility to add a timeshifted cronjob and distribute concurrency #147

Closed
SimonScholl opened this issue Apr 9, 2019 · 6 comments

@SimonScholl

First, thank you for this awesome plugin; it is really helpful for our company, reducing our runtime by several seconds.

So while using it, we noticed a logical problem. To keep our lambdas warm, we set the trigger to every 15 or 20 minutes, depending on the function. A cold start can still occur when the warmup gets triggered, and it takes around 1 minute and 30 seconds. E.g.: we have a concurrency of 2; if a request is triggered while those 2 concurrent functions are executing, the demand increases by 1 to a concurrency of 3, which results in an additional container being instantiated by AWS.

The solution could be to set the trigger to a lower time interval, or to have a timeshift property we can set.
E.g.: we have a lambda to keep warm with concurrency 4. There could be a timeshifted cronjob running 5 minutes after the first one. While the actual cronjob is set to 15 minutes and gets triggered at 9:15, 9:30, 9:45, 10:00, etc., the second one could be triggered at 9:20, 9:35, 9:50, 10:05. If the concurrency were distributed so that each cronjob warms up a concurrency of 2, demand peaks could be caught and reduced.

I think this would be a good addition, especially with big concurrency values. Let me know if this feature makes sense to you.

Best Regards

@juanjoDiaz
Owner

Hi @SimonScholl,

I see your point.
However, timeshift as you propose would warm up 2 lambdas at a time, not 4.
It's actually the same as simply using a more specific schedule.
For your example: 0,5,15,20,30,35,45,50 * * * *.

In any case, it's pretty much impossible to avoid cold starts 100% of the time because we have zero control over how AWS decides to spawn containers. All we can do is make a best guess.

Wdyt? Did I miss anything?

@SimonScholl
Author

Yeah, the problem is that we have no reference to the container instance itself, so of course there is no easy solution for this. Maybe an idea would be to use Serverless to build two variants of the same lambda; I already use this mechanism for parallel execution in AWS Step Functions. Serverless can create different lambdas based on the same code base, like one normal and one shifted version of the same function. Incoming calls would then have to be routed to whichever of those functions was called most recently, but I can see there would be a lot of redundant stuff, and the solution would not look that good.

But anyway, thank you for your effort and your fast response :)

@juanjoDiaz
Owner

There used to be a company called LambdaCult which offered a solution that did just that: it created two aliases for each function and switched from A to B to warm A, and from B to A to warm B, ensuring that the warming process didn't impact users. They also used CloudWatch logs to ensure that the correct number of containers was instantiated.
However, I never saw any proof of the system actually working, and the company seems to be out of business now.

This plugin has a "best attempt" approach which is much simpler but also less precise indeed.
AFAIK, people are using this plugin and making cold starts a bit less of a problem... (I am 🙂)

You can see more discussion related to this at #24.

I'd be happy to have a discussion about a possibly better approach and to accept PRs for it, as long as they are proven to work and don't complicate things for the users who prefer the simple approach.
The "double alias" approach should be doable. But I have no idea whether it would actually work any better than the current one, or how much it would increase costs. Also, I'm not entirely sure whether the plugin would need to reconfigure the API Gateway to point to one alias or the other, or whether that could be done by changing the lambda itself, which would require extra permissions and make things a bit more error-prone.

@Vadorequest

Vadorequest commented Sep 9, 2019

I had a talk with someone at an AWS conference, and I remember he told me his company had done something quite smart to handle cold starts and rehydrate a pool of pre-warmed lambdas.

Unfortunately, I don't remember exactly what it was, but it went something like this:

  1. Sending X calls on the lambda, all at once
    • This forces the lambda to spawn multiple containers to handle the load

  2. Analyse (using CloudWatch) how many of those were cold starts (based on the lambda's duration)
    • This gives some insight into how many containers are running (with 80% precision)

  3. Configure warmups using Y calls on the lambda, all at once
    • This forces the existing pre-warmed containers to handle the load, and be warmed up again

  4. Repeat this pattern and adjust X/Y based on how many containers you want to be warmed

The precision of this method was 80% (so he said), meaning that if they aimed to warm up 10 containers for a lambda, they usually ended up with 8-12 containers.
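For illustration, steps 1 and 2 could look roughly like the sketch below. The function name and probe size are hypothetical, and instead of analysing CloudWatch durations as he described, this simplified version assumes the handler itself returns a `coldStart` flag.

```typescript
// Rough sketch of steps 1-2: blast X concurrent invocations, then count
// how many reported a cold start. Assumes (hypothetically) that the
// handler returns { coldStart: boolean }, e.g. by checking a module-level
// flag; the original description used CloudWatch durations instead.
import { LambdaClient, InvokeCommand } from "@aws-sdk/client-lambda";

const lambda = new LambdaClient({});

async function probe(functionName: string, x: number): Promise<number> {
  // Fire all X calls at once so Lambda is forced to fan out containers.
  const calls = Array.from({ length: x }, () =>
    lambda.send(
      new InvokeCommand({
        FunctionName: functionName,
        InvocationType: "RequestResponse", // wait for each result
        Payload: Buffer.from(JSON.stringify({ source: "warmup-probe" })),
      })
    )
  );
  const results = await Promise.all(calls);

  // Count how many of the X responses were flagged as cold starts;
  // roughly (x - coldStarts) containers were already warm beforehand.
  let coldStarts = 0;
  for (const r of results) {
    const body = JSON.parse(Buffer.from(r.Payload!).toString());
    if (body.coldStart) coldStarts += 1;
  }
  return coldStarts;
}
```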

I thought that was quite smart; they handled the issue in a very particular way, which fits well when you want to be able to absorb some load without any cold start.

I'm not saying the plugin should do that, but I thought about that while reading this thread 😄


I spent the last 25 min reading other issues and saw that it was already implemented here, great job!
#24

@juanjoDiaz
Owner

Hi @Vadorequest,

We have indeed implemented the multiple-lambda warming. One thing that has been discussed but never implemented is adding more intelligence to the warming process by inspecting CloudWatch logs in order to find out how many containers are actually running.
It's a bit tricky, and it needs some extra permissions.
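For anyone who wants to pick this up: one known heuristic is that each Lambda container writes to its own CloudWatch log stream, so counting streams with recent events approximates the number of live containers. A rough sketch (the log group follows the standard `/aws/lambda/<name>` convention; the 15-minute window is an arbitrary pick):

```typescript
// Sketch: estimate how many containers served a function recently by
// counting CloudWatch log streams with recent events (each container
// writes to exactly one log stream). Needs the extra
// logs:DescribeLogStreams permission mentioned above; only the first
// page of streams is inspected, which is good enough for a sketch.
import {
  CloudWatchLogsClient,
  DescribeLogStreamsCommand,
} from "@aws-sdk/client-cloudwatch-logs";

const logs = new CloudWatchLogsClient({});

async function estimateContainers(functionName: string): Promise<number> {
  const windowStart = Date.now() - 15 * 60 * 1000; // last 15 minutes

  const res = await logs.send(
    new DescribeLogStreamsCommand({
      logGroupName: `/aws/lambda/${functionName}`,
      orderBy: "LastEventTime",
      descending: true,
    })
  );

  // Streams whose last event falls inside the window were (probably)
  // backed by a container that is still alive.
  return (res.logStreams ?? []).filter(
    (s) => (s.lastEventTimestamp ?? 0) >= windowStart
  ).length;
}
```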
PRs are welcomed 😄

@juanjoDiaz
Owner

Closing in favour of #181.
As stated above, the timeshifted option wouldn't really work. So the only real proposal here was using CloudWatch logs.
