Question:
In my S3 bucket there are many files in different formats. I would like to copy all the files with the .JSON extension from every subfolder into another folder.
Current Structure:
S3://mybucket/f1/file.JPG
S3://mybucket/f1/newfile.JSON
S3://mybucket/f2/Oldfile.JSON
The JSON files should be copied to the arrange folder:
S3://mybucket/arrange/newfile.JSON
S3://mybucket/arrange/Oldfile.JSON
I tried this snippet from Stack Overflow, but it has no filter for JSON files:
import boto3

old_bucket_name = 'SRC'
old_prefix = 'A/B/C/'
new_bucket_name = 'TGT'
new_prefix = 'L/M/N/'

s3 = boto3.resource('s3')
old_bucket = s3.Bucket(old_bucket_name)
new_bucket = s3.Bucket(new_bucket_name)

for obj in old_bucket.objects.filter(Prefix=old_prefix):
    old_source = {'Bucket': old_bucket_name, 'Key': obj.key}
    # replace the prefix
    new_key = obj.key.replace(old_prefix, new_prefix)
    new_obj = new_bucket.Object(new_key)
    new_obj.copy(old_source)
Answer:
You can add a filter for the .JSON extension like below:
for obj in old_bucket.objects.filter(Prefix=old_prefix):
    if obj.key.endswith('.JSON'):
        old_source = {'Bucket': old_bucket_name, 'Key': obj.key}
        # replace the prefix
        new_key = obj.key.replace(old_prefix, new_prefix)
        new_obj = new_bucket.Object(new_key)
        new_obj.copy(old_source)
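Two details are worth noting when adapting this to the layout in the question: `endswith('.JSON')` is case-sensitive, so keys ending in lowercase `.json` would be skipped, and replacing the prefix keeps the subfolder structure instead of flattening everything into `arrange/`. A minimal sketch of both adjustments (the helper names `is_json_key` and `arrange_key` are illustrative, not part of the boto3 API):

```python
import posixpath

def is_json_key(key: str) -> bool:
    """True for keys ending in .json, regardless of case."""
    return key.lower().endswith('.json')

def arrange_key(key: str, target_prefix: str = 'arrange/') -> str:
    """Flatten a key into the target folder, keeping only the filename."""
    return target_prefix + posixpath.basename(key)

# Applied to the example keys from the question:
keys = ['f1/file.JPG', 'f1/newfile.JSON', 'f2/Oldfile.JSON']
copies = [(k, arrange_key(k)) for k in keys if is_json_key(k)]
print(copies)
# [('f1/newfile.JSON', 'arrange/newfile.JSON'),
#  ('f2/Oldfile.JSON', 'arrange/Oldfile.JSON')]
```

Inside the loop above, `new_key` would then be `arrange_key(obj.key)` instead of the prefix replacement.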