I have 20K message in SQS queue. I also have a lambda will process the SQS messages, and put data into ElasticSearch server.
I have configured SQS as the lambda’s trigger, and limited the Lambda’s SQS batch size to be 10. I also limited the only one instance of the lambda can be run at a giving time.
However, sometime I see over 10K
in-flight messages from the AWS console. Should it be max at 10 in-flight messages?
Because of this, the lambdas will only able to process 9K of the SQS message properly.
Below is a screen capture to show that I have limited the lambda to have only 1 instance running at a giving time.
I’ve been doing some testings and contacting AWS tech support at the same time.
What I do believe at the moment is that:
Amazon Simple Queue Service supports an initial burst of 5 concurrent function invocations and increases concurrency by 60 concurrent invocations per minute. Doc
1/ The thing that does that polling, is a separate entity. It is most likely to be a lambda function that will long-poll the SQS and then, invoke our lambda functions.
2/ That polling Lambda does not take into account any of our Receiver-Lambda at all. It does not care whether the function is running at max capacity or not, or how many max concurrency is available for the Receiver-Lambda
3/ Due to that combination. The behavior is not what we expected from the Lambda-SQS integration. And worse, If you have suddenly, millions of message burst in your queue. The Receiver-Lambda concurrency can never catch up with the amount of messages that the polling Lambda is sending, result in loss of work
- Create one Lambda function that takes 30 seconds to return true;
- Set that function’s concurrency to 50;
- Push 300 messages into the queue ( Visibility timeout : 10 Minutes, batch message count: 1, no re-drive )
- Amount of messages available just increase gradually
- At first, there are few enough messages to be processed by Receiver-Lambda
- After half a minute, there are more messages available than what Receiver-Lambda can handle
- These message would be discarded to dead queue. Due to polling Lambda unable to invoke Receiver-Lambda
I will update this answer as soon as I got the confirmation from AWS support
Support answer. As of Q1 2019, TL;DR version
1/ The assumption was correct, there was a “Poller”
2/ That Poller do not take into consideration of reserved concurrency
as part of its algorithm
3/ That poller have hard limit of 1000
The above information need to be updated. Support said that the poller correctly consider reserved concurrency but it should be at least 5. The SQS-Lambda integration is still being updated and this answer will not. So please consult AWS if you get into some weird issues