Question:
I am trying to download data from one of Amazon’s public buckets.
Here is a description of the bucket in question.
The bucket has web-accessible folders, and I would like to download, say, all the files listed in one of those folders.
There will be a long list of suitable tiles identified, and the goal is to get all the files in a folder in one go rather than downloading each one individually from the HTTP site.
From other StackOverflow questions I realize I need to use the REST endpoint with a tool like the AWS CLI or Cyberduck, but I have not been able to get these to work yet.
I think the issue may be authentication. I don’t have an AWS account, and I was hoping to stick with guest / anonymous access.
Does anyone have a good solution / tool to traverse a public bucket and grab the contents as a guest? Could a different approach using curl or wget work for this type of task?
Thanks.
Answer:
For the AWS CLI, you need to provide the --no-sign-request flag to skip signing. Example:
    > aws s3 ls landsat-pds
    Unable to locate credentials. You can configure credentials by running "aws configure".
    > aws s3 ls landsat-pds --no-sign-request
                               PRE L8/
                               PRE landsat-pds_stats/
                               PRE runs/
                               PRE tarq/
                               PRE tarq_corrupt/
                               PRE test/
    2015-01-28 10:13:53      23764 index.html
    2015-04-14 10:43:22         25 robots.txt
    2016-07-13 12:53:31         38 run_info.json
    2016-07-13 12:53:30   23971821 scene_list.gz
To download that entire bucket into a directory, you would do something like this:
    > mkdir landsat-pds
    > aws s3 sync s3://landsat-pds landsat-pds --no-sign-request
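Since the question is about grabbing a single folder rather than the whole bucket, the same flag should work when you point sync at a key prefix. The following is only a rough sketch: the sync_prefix function, the AWS_BIN variable, and the landsat-pds-L8 destination directory are names made up for illustration (they are not part of the AWS CLI), and L8/ is used only because it is one of the prefixes in the listing above.

```shell
#!/bin/sh
# Sketch: anonymously sync one "folder" (key prefix) of a public bucket.
# AWS_BIN is a hypothetical override hook for this script, not an AWS CLI setting.
AWS_BIN="${AWS_BIN:-aws}"

# sync_prefix BUCKET PREFIX DEST  (hypothetical helper)
sync_prefix() {
    bucket="$1"
    prefix="$2"
    dest="$3"

    # Create the local target directory if it does not exist yet.
    mkdir -p "$dest" || return 1

    # --no-sign-request skips credential lookup and request signing,
    # so no AWS account is needed for a public bucket.
    "$AWS_BIN" s3 sync "s3://$bucket/$prefix" "$dest" --no-sign-request
}

# Example call, left commented out so that sourcing this file downloads nothing:
# sync_prefix landsat-pds L8/ landsat-pds-L8
```

With a long list of tiles, you could call sync_prefix once per prefix in a loop. Running the aws command with --dryrun added first shows what would be transferred without downloading anything.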