Question:
We are currently experiencing what I can only describe as random intermittent timeouts between AWS Lambda and RDS. After deploying our functions and running them successfully, they can randomly switch to a state of timing out with no configuration changes. Important to note, we are also monitoring the DB connections and can confirm that we aren’t running into a max connection issue.
Here are the details on our setup:
Code being executed (using Node.js v6.10):
```javascript
const mysql = require('mysql');

exports.dbWrite = (events, context, callback) => {

    const db = mysql.createConnection({
        host:
        user:
        password:
        database:
    });

    db.connect(function (err) {
        if (err) {
            console.error('error connecting: ' + err.stack);
            return;
        }
        console.log('connected !');
    });

    db.end();
};
```
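One thing worth noting about the snippet above: `db.end()` runs synchronously, immediately after `db.connect()` is issued, and the Lambda `callback` is never invoked. A minimal sketch of the intended sequencing (close and signal completion from inside the callbacks), using a hypothetical stub in place of `mysql.createConnection` so the control flow is visible without a live database:

```javascript
// Sketch of the handler flow: end the connection and invoke the Lambda
// callback from inside the connect callback, not synchronously after it.
// `createConnection` is a hypothetical stand-in for mysql.createConnection.
function dbWrite(createConnection, callback) {
    const db = createConnection();

    db.connect(function (err) {
        if (err) {
            console.error('error connecting: ' + err.stack);
            return callback(err);
        }
        console.log('connected !');
        // ... queries would go here ...
        db.end(function (endErr) {
            callback(endErr, 'done');
        });
    });
}

// Stub connection that always "connects" successfully, for illustration.
const fakeConnection = () => ({
    connect: (cb) => setImmediate(cb),
    end: (cb) => setImmediate(cb),
});

dbWrite(fakeConnection, (err, result) => {
    console.log(err, result);
});
```

In the real handler the stub would be replaced by `mysql.createConnection({...})`; the point is only the ordering of `connect`, `end`, and `callback`.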
We are using the Node.js mysql library, v2.14.1.
From a networking perspective:
- The Lambda function is in the same VPC as our RDS instance
- The Lambda function has subnets assigned, which are associated with a routing table that does not have internet access (not associated with an internet gateway)
- The RDS database is not publicly accessible.
- A security group has been created and associated with the Lambda function that has wide open access on all ports (for now – once DB connectivity is reliable, that will change).
- The above security group has been whitelisted on port 3306 within a security group associated with the RDS instance.
CloudWatch error:
```json
{
    "errorMessage": "connect ETIMEDOUT",
    "errorType": "Error",
    "stackTrace": [
        "Connection._handleConnectTimeout (/var/task/node_modules/mysql/lib/Connection.js:419:13)",
        "Socket.g (events.js:292:16)",
        "emitNone (events.js:86:13)",
        "Socket.emit (events.js:185:7)",
        "Socket._onTimeout (net.js:338:8)",
        "ontimeout (timers.js:386:14)",
        "tryOnTimeout (timers.js:250:5)",
        "Timer.listOnTimeout (timers.js:214:5)",
        "    --------------------",
        "Protocol._enqueue (/var/task/node_modules/mysql/lib/protocol/Protocol.js:145:48)",
        "Protocol.handshake (/var/task/node_modules/mysql/lib/protocol/Protocol.js:52:23)",
        "Connection.connect (/var/task/node_modules/mysql/lib/Connection.js:130:18)",
        "Connection._implyConnect (/var/task/node_modules/mysql/lib/Connection.js:461:10)",
        "Connection.query (/var/task/node_modules/mysql/lib/Connection.js:206:8)",
        "/var/task/db-write-lambda.js:52:12",
        "getOrCreateEventTypeId (/var/task/db-write-lambda.js:51:12)",
        "exports.dbWrite (/var/task/db-write-lambda.js:26:9)"
    ]
}
```
Amongst the references already reviewed:
- https://forums.aws.amazon.com/thread.jspa?threadID=221928 (the invocation ID in CloudWatch is different in all timeout cases)
- pretty much every post in this list: https://stackoverflow.com/search?q=aws+lambda+timeouts+to+RDS
In summary, the fact that these timeouts are intermittent makes this issue thoroughly confusing. AWS support has stated that the Node.js mysql library is a third-party tool and technically not supported, but I know folks are using this technique.
Any help is greatly appreciated!
Answer:
Considering that the RDS connections are not exhausted, one possibility is that a Lambda running in a particular subnet consistently fails to connect to the DB. I am assuming that the RDS instances and the Lambdas are running in separate subnets. One way to investigate this is to check the VPC Flow Logs.
Go to EC2 -> Network Interfaces -> search for the Lambda name -> copy the ENI reference, then go to VPC -> Subnets -> select the Lambda's subnet -> Flow Logs -> search by that ENI reference.
If you see “REJECT OK” entries in the flow logs for your DB port, it means that a rule is missing in your Network ACLs.
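Flow log records in the default format are lines of 14 space-separated fields, where the destination port, the action (ACCEPT/REJECT), and the log status appear at fixed positions. A small sketch that filters exported records for rejected traffic to the DB port, assuming the default flow log format; the sample records are fabricated for illustration:

```javascript
// Filter default-format VPC Flow Log lines (14 space-separated fields) for
// REJECTed traffic to a given destination port, e.g. 3306 for MySQL.
function findRejectsForPort(lines, port) {
    return lines.filter((line) => {
        const f = line.trim().split(/\s+/);
        // fields: version account-id interface-id srcaddr dstaddr srcport
        //         dstport protocol packets bytes start end action log-status
        return f.length >= 14 && f[6] === String(port) && f[12] === 'REJECT';
    });
}

// Hypothetical sample records for illustration:
const sample = [
    '2 123456789012 eni-0a1b2c3d 10.0.1.5 10.0.2.9 44321 3306 6 3 180 1512345678 1512345738 REJECT OK',
    '2 123456789012 eni-0a1b2c3d 10.0.1.5 10.0.2.9 44322 3306 6 10 840 1512345678 1512345738 ACCEPT OK',
];
console.log(findRejectsForPort(sample, 3306).length); // → 1
```

If such a filter turns up REJECT lines only for the ENIs in one subnet, that points at that subnet's NACL configuration.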