I am trying to download data from one of Amazon’s public buckets.
Here is a description of the bucket in question
The bucket has web accessible folders for example.
I would want to download say all the listed files in that folder.
There will a long list of suitable tiles identified, and the goal would be to get all files in a folder in one go rather than downloading each individually from the http site.
From other StackOverflow questions I realize I need to use the REST endpoint and use a tool like the AWS CLI or Cyberduck, but I cannot get these to work as yet.
I think the issue may be authentication. I don’t have an AWS account, and I was hoping to stick with guest / anonymous access.
Does anyone have a good solution / tool to traverse a public bucket and grab the contents as a guest? Could a different approach using curl or wget work for this type of task?
For the AWS CLI, you need to provide the
--no-sign-request flag to skip signing. Example:
> aws s3 ls landsat-pds
Unable to locate credentials. You can configure credentials by running "aws configure".
> aws s3 ls landsat-pds --no-sign-request
2015-01-28 10:13:53 23764 index.html
2015-04-14 10:43:22 25 robots.txt
2016-07-13 12:53:31 38 run_info.json
2016-07-13 12:53:30 23971821 scene_list.gz
To download that entire bucket into a directory, you would do something like this:
> mkdir landsat-pds
> aws s3 sync s3://landsat-pds landsat-pds --no-sign-request