AWS Lambda worrisome behaviour after an accidental infinite loop


I accidentally deployed some Java code to AWS Lambda that contains the following obviously buggy getter:
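A getter that ends in a StackOverflowError usually calls itself instead of returning its field. The following is a minimal sketch of that pattern, not the original code; the class and field names (`Item`, `name`) are hypothetical.

```java
// Hypothetical class illustrating a self-recursive getter; not the original code.
class Item {
    private String name;

    public String getName() {
        // Bug: calls itself instead of returning the field, so every call
        // recurses until the JVM throws a StackOverflowError.
        return getName();
    }
}
```

The intended body would presumably have been `return name;`.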

The Lambda function is configured with a timeout of 15 seconds and 320 MB of memory, and it is triggered by a DynamoDB stream. After deploying the problematic code, I modified my DynamoDB table around 22:17, which executed the code. I checked the logs and, as you would expect from the getter above, found a classic StackOverflowError with a very long stack trace. However, I was surprised to see that this did not stop the function, which kept executing and reporting more stack overflow errors in CloudWatch. I was even more worried when I realized that the function would not stop even after the 15-second limit. I could not find any way to stop it manually, so I deleted it from the Lambda console around 22:30, which finally killed it.

The following graph (from CloudWatch) shows the execution duration over time. You can see several tests I performed before deploying the bug (runs of more than 10 s), then the consecutive quick runs (about 200 ms) between 22:17 and 22:30.

[Image: CloudWatch graph of the function's execution duration over time]

I am also sure that I did not touch my DynamoDB table again (no one else has access to it) and did not try to execute the Lambda function in any other way. How come it kept executing for several minutes until I deleted it? I certainly should have been more careful and done some local testing first, but isn't the duration limit supposed to guarantee that nothing is executed once it is reached?

Thank you for your help.


I finally figured out the origin of this behaviour. The official AWS Lambda documentation states:

Depending on the event source, AWS Lambda may retry the failed Lambda
function. For example, if Amazon Kinesis is the event source for the
Lambda function, AWS Lambda retries the failed function until the
Lambda function succeeds or the records in the stream expire.

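Records in a DynamoDB stream expire after 24 hours, so Lambda would have kept retrying my function until then. The 15-second limit applies to each invocation separately, and each failing run lasted only about 200 ms, so the timeout never came into play.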
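The retry behaviour quoted above can be modelled with a small simulation. This is only a sketch of the documented semantics (retry until success or record expiry), not how AWS implements its stream poller. `RetrySimulation`, `invocationsUntilDone`, and all parameters are hypothetical names, and time is simulated rather than measured.

```java
import java.util.function.Predicate;

// Hypothetical model of "retry the failed function until it succeeds
// or the records in the stream expire". Not AWS code.
class RetrySimulation {

    /**
     * Counts how many times the handler is invoked for one batch.
     *
     * @param handler          returns true on success; may throw on failure
     * @param batch            stand-in for the stream records
     * @param retentionMillis  simulated time after which the records expire
     * @param invocationMillis simulated duration of a single invocation
     */
    static int invocationsUntilDone(Predicate<String> handler, String batch,
                                    long retentionMillis, long invocationMillis) {
        long elapsed = 0;
        int invocations = 0;
        while (elapsed < retentionMillis) {
            invocations++;
            elapsed += invocationMillis;
            boolean succeeded;
            try {
                succeeded = handler.test(batch);
            } catch (Throwable t) {
                // Any failure, including StackOverflowError, counts as a failed run.
                succeeded = false;
            }
            if (succeeded) {
                break; // the batch is done and the stream can advance
            }
        }
        return invocations;
    }
}
```

In this model, a handler that always throws is re-invoked with the same batch until the simulated retention period runs out, while a handler that returns normally is invoked once. That is consistent with the graph in the question: many short runs in a row, each well under the per-invocation timeout.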
