AWS Lambda Polling from SQS: in-flight messages count

Question:

I have 20K messages in an SQS queue. I also have a Lambda that processes the SQS messages and puts the data into an Elasticsearch server.

I have configured SQS as the Lambda’s trigger and limited the Lambda’s SQS batch size to 10. I also limited the Lambda so that only one instance can run at any given time.

However, sometimes I see over 10K in-flight messages in the AWS console. Shouldn’t there be at most 10 in-flight messages?

Because of this, the Lambda is only able to process 9K of the SQS messages properly.

Below is a screen capture showing that I have limited the Lambda to only one running instance at any given time.

[Screen capture: Lambda concurrency limited to 1]

Answer:

I’ve been doing some testing and contacting AWS tech support at the same time.

What I believe at the moment is this:

From the documentation: “Amazon Simple Queue Service supports an initial burst of 5 concurrent function invocations and increases concurrency by 60 concurrent invocations per minute.” (Doc)
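To make the quoted scaling rule concrete, here is a minimal sketch of the ramp it implies. The function name `poller_concurrency` is hypothetical, and the linear interpolation within a minute is my assumption, not AWS’s documented algorithm. The optional `cap` is just a parameter for experimenting.

```python
def poller_concurrency(elapsed_minutes, initial_burst=5, per_minute=60, cap=None):
    """Approximate how many concurrent invocations the SQS integration attempts.

    Assumption: a linear ramp of initial_burst + per_minute * elapsed_minutes.
    """
    if elapsed_minutes < 0:
        raise ValueError("elapsed_minutes must be non-negative")
    level = initial_burst + int(per_minute * elapsed_minutes)
    # Apply an upper bound only if the caller supplies one.
    return level if cap is None else min(level, cap)


if __name__ == "__main__":
    for minutes in (0, 0.5, 1, 2):
        print(minutes, poller_concurrency(minutes))
```

Under this model the attempted concurrency starts at 5 and reaches 65 after one minute, regardless of what the receiving function is allowed to run.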

1/ The thing that does the polling is a separate entity. It is most likely a Lambda function that long-polls SQS and then invokes our Lambda functions.

2/ That polling Lambda does not take our Receiver-Lambda into account at all. It does not care whether the function is running at max capacity, or how much concurrency is available to the Receiver-Lambda.

3/ Because of that combination, the behavior is not what we expected from the Lambda-SQS integration. Worse, if a burst of millions of messages suddenly arrives in your queue, the Receiver-Lambda’s concurrency can never catch up with the number of messages the polling Lambda is sending, resulting in lost work.

The test:

  • Create one Lambda function that takes 30 seconds to return true;
  • Set that function’s concurrency to 50;
  • Push 300 messages into the queue (visibility timeout: 10 minutes, batch size: 1, no re-drive)
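For reference, the Receiver-Lambda in a test like this can be as small as the sketch below. The `DELAY_SECONDS` environment variable is hypothetical, added only so the delay can be shortened when trying the handler locally. The 30-second default matches the test setup above.

```python
import os
import time


def handler(event, context):
    """Simulate a fixed amount of work per invocation, then report success."""
    # Hypothetical override so the function can be exercised quickly outside AWS.
    delay = float(os.environ.get("DELAY_SECONDS", "30"))
    time.sleep(delay)
    # Returning without raising marks the batch as successfully processed.
    return True
```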

The result:

  • The number of messages available just increases gradually
  • At first, there are few enough messages for the Receiver-Lambda to process
  • After half a minute, there are more messages available than the Receiver-Lambda can handle
  • These messages are discarded to the dead-letter queue, because the polling Lambda is unable to invoke the Receiver-Lambda
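A rough back-of-the-envelope model of this result, assuming the poller ramps linearly from 5 concurrent invocations at 60 more per minute (the documented figures) and ignores the receiver’s limit. The function `seconds_until_throttling` is hypothetical and only illustrates the arithmetic.

```python
def seconds_until_throttling(reserved_concurrency, initial_burst=5, per_minute=60):
    """Seconds until the assumed poller ramp reaches the receiver's concurrency limit.

    Assumption: attempted concurrency = initial_burst + per_minute * minutes.
    """
    if reserved_concurrency <= initial_burst:
        return 0.0  # the initial burst alone already meets or exceeds the limit
    minutes = (reserved_concurrency - initial_burst) / per_minute
    return minutes * 60.0


if __name__ == "__main__":
    # Receiver-Lambda concurrency of 50, as in the test above.
    print(seconds_until_throttling(50))
```

With a concurrency of 50 this simple model puts the crossover at 45 seconds, the same order of magnitude as the roughly half a minute observed in the test. With a concurrency of 1, as in the question, the initial burst alone already exceeds the limit.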

I will update this answer as soon as I get confirmation from AWS support.

Support answer, as of Q1 2019 (TL;DR version):

1/ The assumption was correct: there was a “Poller”.

2/ That poller does not take reserved concurrency into consideration as part of its algorithm.

3/ That poller has a hard limit of 1000.

Q2 2019:

The above information needs to be updated. Support said that the poller does correctly consider reserved concurrency, but that it should be at least 5. The SQS-Lambda integration is still being updated and this answer will not be, so please consult AWS if you run into any weird issues.
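As an illustration of the “at least 5” guidance, the sketch below builds the arguments for Lambda’s `PutFunctionConcurrency` API and refuses values under that floor. The helper `concurrency_request`, the constant name, and the function name `receiver-lambda` are hypothetical. The floor of 5 is only what support reported at the time and may not hold today.

```python
MIN_RESERVED_CONCURRENCY = 5  # floor reported by AWS support in Q2 2019 (assumption)


def concurrency_request(function_name, reserved):
    """Build keyword arguments for the Lambda PutFunctionConcurrency call."""
    if reserved < MIN_RESERVED_CONCURRENCY:
        raise ValueError(
            f"reserved concurrency {reserved} is below {MIN_RESERVED_CONCURRENCY}"
        )
    return {
        "FunctionName": function_name,
        "ReservedConcurrentExecutions": reserved,
    }


# Usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# boto3.client("lambda").put_function_concurrency(
#     **concurrency_request("receiver-lambda", 50)
# )
```

Note that a concurrency of 1, as configured in the question, would be rejected by this check.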
