Intermittent timeouts between AWS Lambda and RDS

Question:

We are experiencing random, intermittent timeouts between AWS Lambda and RDS. After we deploy our functions and run them successfully, they can suddenly start timing out without any configuration change. Importantly, we are also monitoring the DB connections and can confirm that we are not hitting the max connection limit.

Here are the details on our setup:

Code being executed (using Node.js v6.10):

We are using the Node.js mysql library, v2.14.1.

From a networking perspective:

  • The Lambda function is in the same VPC as our RDS instance
  • The Lambda function has subnets assigned, and those subnets are associated with a route table that has no internet access (no route to an internet gateway)
  • The RDS database is not publicly accessible.
  • A security group has been created and associated with the Lambda function that has wide open access on all ports (for now – once DB connectivity is reliable, that will change).
  • The above security group has been whitelisted on port 3306 within a security group associated with the RDS instance.

CloudWatch error:

Amongst the references already reviewed:

In summary, the fact that these timeouts are intermittent makes this an issue that is totally confusing. AWS support has stated that NodeJS-mysql is a third-party tool, and is technically not supported, but I know folks are using this technique.

Any help is greatly appreciated!

Answer:

Since the RDS connections are not exhausted, it is possible that Lambda invocations running in one particular subnet always fail to connect to the DB, while invocations in the other subnets succeed; that would look like an intermittent timeout. I am assuming that the RDS instance and the Lambda functions run in separate subnets. One way to investigate this is to check the VPC flow logs.

Go to EC2 -> Network Interfaces, search for the Lambda function name, and copy the ENI ID. Then go to VPC -> Subnets, select the subnet of the Lambda function, open Flow Logs, and search by that ENI ID.

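If you see “REJECT OK” in your flow logs for your DB port, the traffic is being blocked, which most likely means a rule is missing in the network ACLs (a security group that does not allow the traffic also produces REJECT records).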
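
As a rough illustration of the flow log check described above, here is a minimal Node.js sketch. It assumes the flow log records have been exported as plain text in the default flow log format (space-separated fields ending in the action and the log status). The function names, ENI IDs, addresses and sample records are hypothetical and made up for illustration; they are not taken from the question. The sketch counts REJECT records on the DB port per ENI, so a single failing subnet should stand out.

```javascript
// Field order of the default VPC flow log record format.
const FIELDS = [
  'version', 'accountId', 'interfaceId', 'srcAddr', 'dstAddr',
  'srcPort', 'dstPort', 'protocol', 'packets', 'bytes',
  'start', 'end', 'action', 'logStatus'
];

// Parse one space-separated flow log record into an object.
// Returns null when the line does not have the expected number of fields.
function parseFlowLogRecord(line) {
  const parts = line.trim().split(/\s+/);
  if (parts.length !== FIELDS.length) return null;
  const record = {};
  FIELDS.forEach((name, i) => { record[name] = parts[i]; });
  return record;
}

// Count REJECT records that involve the DB port, grouped by ENI ID.
function countRejectsByEni(lines, dbPort) {
  const port = String(dbPort);
  const counts = {};
  lines.forEach((line) => {
    const r = parseFlowLogRecord(line);
    if (!r || r.action !== 'REJECT') return;
    if (r.dstPort !== port && r.srcPort !== port) return;
    counts[r.interfaceId] = (counts[r.interfaceId] || 0) + 1;
  });
  return counts;
}

// Hypothetical sample records (not real log data).
const sampleLines = [
  '2 123456789012 eni-0aaa1111 10.0.1.15 10.0.3.20 49152 3306 6 3 180 1500000000 1500000060 REJECT OK',
  '2 123456789012 eni-0bbb2222 10.0.2.15 10.0.3.20 49153 3306 6 10 840 1500000000 1500000060 ACCEPT OK',
  '2 123456789012 eni-0aaa1111 10.0.1.15 10.0.3.20 49154 3306 6 3 180 1500000060 1500000120 REJECT OK',
  '2 123456789012 eni-0aaa1111 - - - - - - - 1500000120 1500000180 - NODATA'
];

console.log(countRejectsByEni(sampleLines, 3306));
```

In this made-up sample, every reject on port 3306 comes from the same ENI. If your real logs show the same pattern, look at the network ACL and route table of the subnet that ENI lives in.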
