How to execute AWS Glue scripts using Python 2.7 from a local machine?


I have the AWS CLI and boto3 installed in my Python 2.7 environment. I want to perform various operations, such as getting schema information and database details for all the tables shown in the AWS Glue console. I tried the sample scripts below:

I got the error ImportError: No module named awsglue.transforms, which is to be expected, since boto3 contains no such package, as I confirmed using dir(boto3). I found that boto3 exposes the AWS service APIs through clients, which can be created with client = boto3.client('glue'). So, to get the schema information as above, I tried the sample code below:

But then I get this error:
AccessDeniedException: An error occurred (AccessDeniedException) when calling the GetDatabases operation: Cross account access is not allowed.

I am fairly sure that one or both of these approaches is the right way to get what I want, but something is not falling into place. Any ideas on how to get the schema and table details from AWS Glue using Python 2.7 locally, as I tried above?


The following code works for me; I am using a locally set up Zeppelin notebook with a dev endpoint. The printSchema call reads the schema from the Data Catalog.
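As a rough sketch of what such a notebook paragraph might look like (not necessarily the exact code used here), the function below reads a table from the Data Catalog and prints its schema. The database and table names are placeholders you would replace with your own, and the function name print_catalog_schema is only illustrative. As far as I know, the awsglue package comes with the Glue environment (for example a dev endpoint) rather than with boto3, which would explain the ImportError when running the same imports in a plain local Python.

```python
def print_catalog_schema(database, table_name):
    """Print the schema of a Data Catalog table from a Glue dev endpoint session.

    `database` and `table_name` are placeholders for names that exist in
    your own Data Catalog.
    """
    # awsglue and pyspark are provided by the Glue environment (for example a
    # dev endpoint), not by boto3, so they are imported only when this runs.
    from pyspark.context import SparkContext
    from awsglue.context import GlueContext

    # Reuse the Spark context the notebook interpreter already started, if any.
    glue_context = GlueContext(SparkContext.getOrCreate())

    # Build a DynamicFrame from the table definition stored in the Data Catalog.
    dynamic_frame = glue_context.create_dynamic_frame.from_catalog(
        database=database, table_name=table_name
    )
    dynamic_frame.printSchema()
    return dynamic_frame
```

In the notebook you would then call something like print_catalog_schema("my_database", "my_table"), with both names being hypothetical.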

I hope you have enabled SSH tunnelling as well.

You may also need to change the Spark interpreter settings: tick the Connect to existing process option at the top, then set the host to localhost and the port to 9007.

For the second part, you need to install boto3, run aws configure, and then create a Glue client. After that, check your proxy settings if you are behind a firewall or a company network.
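A minimal sketch of the boto3 route, assuming credentials and a default region have already been stored with aws configure. The helper names (make_glue_client, list_databases, table_schemas) are my own, and passing region_name or profile_name explicitly is only a suggestion for making sure the client talks to the account and region that actually hold your catalog. The code avoids Python 3 only syntax, so it should also run on Python 2.7.

```python
def make_glue_client(region_name=None, profile_name=None):
    """Create a Glue client from the credentials stored by `aws configure`."""
    # Imported here so the helpers below can also be tried with a stub client.
    import boto3

    session = boto3.Session(profile_name=profile_name, region_name=region_name)
    return session.client("glue")


def list_databases(glue):
    """Return the names of all databases in the Glue Data Catalog."""
    names = []
    # GetDatabases is paginated, so walk every page instead of a single call.
    for page in glue.get_paginator("get_databases").paginate():
        for database in page["DatabaseList"]:
            names.append(database["Name"])
    return names


def table_schemas(glue, database_name):
    """Return {table name: [(column name, column type), ...]} for one database."""
    schemas = {}
    paginator = glue.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database_name):
        for table in page["TableList"]:
            # Some tables have no StorageDescriptor or no column list.
            columns = table.get("StorageDescriptor", {}).get("Columns", [])
            schemas[table["Name"]] = [
                (column["Name"], column.get("Type")) for column in columns
            ]
    return schemas
```

Typical use would be glue = make_glue_client(region_name="us-east-1") followed by a loop over list_databases(glue) that calls table_schemas(glue, name) for each database; the region shown is just an example value.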

To be clear, the boto3 client is useful for all client-side AWS API calls, while for the server side, the Zeppelin approach is the best.

Hope this helps.

