Question:
I am trying to create a table in the Glue catalog with an S3 path as its location, from Spark running on EMR with Hive support. I have tried the following commands, but I get this error:
pyspark.sql.utils.AnalysisException: u'java.lang.IllegalArgumentException: Can not create a Path from an empty string;'
sparksession.sql("CREATE TABLE IF NOT EXISTS abc LOCATION 's3://my-bucket/test/' as (SELECT * from my_table)")
sparksession.sql("CREATE TABLE abc STORED AS PARQUET LOCATION 's3://my-bucket/test/' AS select * from my_table")
sparksession.sql("CREATE TABLE abc as SELECT * from my_table USING PARQUET LOCATION 's3://my-bucket/test/'")
Can someone please suggest the parameters that I am missing?
Answer:
The issue happens when the database was created without a specified location:
CREATE DATABASE db_name;
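To confirm that this is the cause, you can inspect the database before changing anything. The following is a minimal sketch, not a definitive check: it assumes that `DESCRIBE DATABASE` returns two-column rows, one of which is labelled `Location`, and that a database created without a location shows an empty value there. The helper name `database_location` is hypothetical.

```python
def database_location(describe_rows):
    """Return the 'Location' value from DESCRIBE DATABASE rows, or '' if missing.

    Each row is read positionally (label, value), because the column names
    of the DESCRIBE DATABASE output differ between Spark versions.
    """
    for row in describe_rows:
        if row[0] == "Location":
            return row[1] or ""
    return ""


# Usage on the cluster (assumes an existing SparkSession named `sparksession`):
#   rows = sparksession.sql("DESCRIBE DATABASE db_name").collect()
#   if not database_location(rows):
#       print("db_name has no location set")
```

If the location comes back empty, the error in the question is the likely result when a table is created in that database.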
To fix the issue, specify a location when creating the database:
CREATE DATABASE db_name LOCATION 's3://my-bucket/db_path';
Then, create a table:
USE db_name;
CREATE TABLE IF NOT EXISTS abc LOCATION 's3://my-bucket/db_path/abc' as (SELECT * from my_table)
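Since the question runs these statements from PySpark, the same steps can be sketched in Python. This is an illustration under assumptions, not the only way to do it: the helper names `build_statements` and `run_statements` are hypothetical, `IF NOT EXISTS` on the database is an addition for repeatable runs, and the table location is assumed to sit directly under the database path as in the SQL above.

```python
def build_statements(db_name, db_location, table_name, source_table):
    """Build the SQL statements from the answer, in execution order."""
    # Place the table directly under the database path, e.g. .../db_path/abc
    table_location = db_location.rstrip("/") + "/" + table_name
    return [
        # Give the database an explicit location so table paths are never empty
        "CREATE DATABASE IF NOT EXISTS {} LOCATION '{}'".format(db_name, db_location),
        "USE {}".format(db_name),
        "CREATE TABLE IF NOT EXISTS {} LOCATION '{}' AS (SELECT * FROM {})".format(
            table_name, table_location, source_table
        ),
    ]


def run_statements(spark, statements):
    """Run each statement on an existing SparkSession, one sql() call each."""
    for statement in statements:
        spark.sql(statement)


# Usage on the cluster (assumes an existing SparkSession named `sparksession`):
#   run_statements(
#       sparksession,
#       build_statements("db_name", "s3://my-bucket/db_path", "abc", "my_table"),
#   )
```

Each statement is sent in its own `sql()` call because `sql()` takes a single statement. Note that `CREATE DATABASE IF NOT EXISTS` does not change the location of a database that already exists, and that after `USE db_name` an unqualified `my_table` is resolved in `db_name`, so qualify it if it lives in another database.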