Question:
I have uploaded the MySQL csv file / MYSQL zip file of all tables in Amazon S3 bucket. Now I want to link the Amazon Athena with S3 bucket file. But when I write the schema for different tables is showing the same result for the select query of each table. I have search a lot but not able to understand the exact /correct way to do this.
I want to create/update different table schema in Athena with the help of one csv /sql zip file from S3 bucket.
Answer:
Amazon Athena will look in a defined directory for data. All data files within that directory will be treated as containing data for a given table.
You use a CREATE TABLE
command to both define the schema and direct Athena to the directory, eg:
1 2 3 4 5 6 7 8 |
CREATE EXTERNAL TABLE test1 ( f1 string, s2 string ) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' WITH SERDEPROPERTIES ("separatorChar" = ",", "escapeChar" = "\\") LOCATION 's3://my-bucket/data-directory/' |
You will need to run a CREATE EXTERNAL TABLE
command for each table, and the data for each table should be in a separate directory. The CSV files can be in ZIP format (which makes it faster and cheaper to query).
As an alternative to writing these table definitions yourself, you can create a crawler in AWS Glue. Point the crawler to the data directory, supply a name and the crawler will examine the data files and create a table definition that matches the files.