Question:
I’m trying to create a Glue Job that enumerates all tables in a database in my catalog. In order to do so I use the following code snippet:
    import boto3

    session = boto3.Session(region_name='us-east-2')
    glue = session.client('glue')
    tables = glue.get_tables(
        DatabaseName='customer1'
    )
    print(tables)
The job hangs for about 15 minutes and the connection appears to be refused, because I eventually get the following error:
botocore.vendored.requests.exceptions.ConnectTimeout:
HTTPSConnectionPool(host='glue.us-east-2.amazonaws.com', port=443):
Max retries exceeded with url: / (Caused by
ConnectTimeoutError(, 'Connection to glue.us-east-2.amazonaws.com timed
out. (connect timeout=60)'))
This issue is specific to the Glue API. I can use the S3 API with no problems.
I’ve gone through all my security groups and opened up all the ports to traffic from anywhere. I’ve even added self-referencing rules, but to no avail.
I can’t figure out what could be causing the connection to be blocked. Is AWS specifically blocking glue requests?
Answer:
I was facing the same problem: boto3 calls to glue or s3 were hanging and eventually timing out.
I fixed it by changing the subnet ID I passed when creating the dev endpoint.
Initially I was using a subnet that routed traffic to an Internet Gateway.
I switched to a subnet that routes traffic through an internal NAT gateway. Hope this helps.
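A likely explanation for why this works is that Glue attaches network interfaces with private IP addresses only, so a default route to an Internet Gateway gives them no working path to the public Glue endpoint. In case it helps anyone check their own setup, here is a rough sketch of how you might see where a subnet's default route points. The function names are my own, and it assumes the subnet has an explicit route table association (a subnet that falls back to the VPC's main route table returns nothing for this filter).

```python
def default_route_target(route_table):
    """Classify the 0.0.0.0/0 route of one route table.

    `route_table` is a dict shaped like an entry of the "RouteTables"
    list returned by EC2 describe_route_tables.
    Returns "nat", "igw", "other", or "none".
    """
    for route in route_table.get("Routes", []):
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue
        if route.get("NatGatewayId"):
            return "nat"
        if route.get("GatewayId", "").startswith("igw-"):
            return "igw"
        return "other"
    return "none"


def subnet_default_route(subnet_id, region_name):
    """Look up the route table explicitly associated with a subnet.

    Hypothetical helper: needs EC2 read permissions and network access.
    """
    import boto3  # imported here so the classifier above runs without boto3

    ec2 = boto3.client("ec2", region_name=region_name)
    response = ec2.describe_route_tables(
        Filters=[{"Name": "association.subnet-id", "Values": [subnet_id]}]
    )
    tables = response["RouteTables"]
    if not tables:
        # No explicit association: the subnet uses the VPC's main route table.
        return "no explicit association"
    return default_route_target(tables[0])
```

If `subnet_default_route` reports "igw" for the subnet you gave the job or dev endpoint, that matches the situation I had before switching subnets. Run it from somewhere that can actually reach the EC2 API, such as your own machine.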