Question:
My S3 file (s3://MY_s3/stage/my_s3_file
) has a partition on load_dt
:
S3 Structure:
1 2 3 4 5 6 7 8 |
-MY_S3 --stage ---my_s3_file ----load_dt=2016-02-01 ----000024 ----load_dt=2016-02-02 ----000036 |
The actual files are under load_dt partitions like 000024 & 000036.
The COPY command works fine if I am not keeping load_dt in Redshift table, but when I am adding load_dt in Redshift table the COPY command fails due to data error as the input layout & target layout are not matching (with extra load_dt on target).
The hive table which creates the S3 file shows partitioned load_dt at the end.
How to make this COPY command work with load_dt being on target Redshift?
I even tried changing S3 locations to s3://MY_s3/stage/my_s3_file/load_dt
but no luck.
Answer:
I think I found the answer for my case.
I was not able to load the hive partition because Hive store that partition value in Hive metadata , that’s why the partition column was not there in S3 file.
Now I am adding a new column to S3 through Hive as Load_Dt_New so the S3 file would have the required column for my Redshift COPY command to work.