Question:
Doing something like the following:
    s3 = boto3.resource('s3')
    bucket = s3.Bucket('a_dummy_bucket')
    bucket.objects.all()
will return all the objects in the 'a_dummy_bucket' bucket, like:
    test1/blah/blah/afile45645.zip
    test1/blah/blah/afile23411.zip
    test1/blah/blah/afile23411.zip
    [...] 2500 files
    test2/blah/blah/afile.zip
    [...] 2500 files
    test3/blah/blah/afile.zip
    [...] 2500 files
Is there any way of getting, in this case, 'test1', 'test2', 'test3', etc. without paginating over all results?
To reach 'test2' I need 3 paginated calls of 1000 keys each just to learn that 'test2' exists, then another 3 calls of 1000 keys each to reach 'test3', and so on.
How can I get all these prefixes without paginating over all results?
Thanks
Answer:
I believe CommonPrefixes is what you are looking for. You can retrieve them as in this example:
    import boto3

    client = boto3.client('s3')
    paginator = client.get_paginator('list_objects')
    # With a delimiter, keys that share everything up to the first '/' are rolled up into CommonPrefixes.
    result = paginator.paginate(Bucket='my-bucket', Delimiter='/')
    for prefix in result.search('CommonPrefixes'):
        print(prefix.get('Prefix'))
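If you also want to drill into one level (for example, list what sits directly under 'test1/'), the same idea can be wrapped in a small helper. This is only a sketch: the name `list_prefixes` is made up for illustration, it assumes the `list_objects_v2` paginator, and it takes the client as an argument so it can be tried without touching AWS. It reads each page with `.get('CommonPrefixes', [])`, on the assumption that a page with no rolled-up keys simply lacks that entry.

```python
def list_prefixes(client, bucket, prefix="", delimiter="/"):
    """Return the rolled-up prefixes found directly under `prefix` in `bucket`.

    `client` is expected to behave like boto3.client('s3').
    The function name and defaults are illustrative, not part of boto3.
    """
    paginator = client.get_paginator("list_objects_v2")
    pages = paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter=delimiter)
    found = []
    for page in pages:
        # A page without rolled-up keys may have no 'CommonPrefixes' entry at all.
        for entry in page.get("CommonPrefixes", []):
            found.append(entry["Prefix"])
    return found
```

With a real client, `list_prefixes(boto3.client('s3'), 'a_dummy_bucket')` should give entries such as 'test1/', 'test2/' and 'test3/' for the bucket in the question, and passing `prefix='test1/'` should list the next level down.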
The AWS documentation for GET Bucket says the following regarding CommonPrefixes:
A response can contain CommonPrefixes only if you specify a delimiter. When you do, CommonPrefixes contains all (if there are any) keys between Prefix and the next occurrence of the string specified by delimiter. In effect, CommonPrefixes lists keys that act like subdirectories in the directory specified by Prefix. For example, if prefix is notes/ and delimiter is a slash (/), in notes/summer/july, the common prefix is notes/summer/. All of the keys rolled up in a common prefix count as a single return when calculating the number of returns. See MaxKeys.
Type: String
Ancestor: ListBucketResult
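To make the rolling-up described in that quote concrete, here is a small local simulation in plain Python. It does not call S3 and is not how S3 is implemented; `roll_up` is a hypothetical helper, and every key name other than notes/summer/july is invented for the example.

```python
def roll_up(keys, prefix="", delimiter="/"):
    """Split keys into (plain keys, common prefixes) as the quoted text describes."""
    contents, common = [], []
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue  # outside the requested "directory"
        rest = key[len(prefix):]
        cut = rest.find(delimiter)
        if cut == -1:
            # No delimiter after the prefix: the key is listed as itself.
            contents.append(key)
        else:
            # Everything up to and including the next delimiter is rolled up.
            rolled = prefix + rest[:cut + len(delimiter)]
            if rolled not in common:
                common.append(rolled)
    return contents, common


keys = ["notes/summer/july", "notes/summer/august", "notes/todo.txt", "photos/cat.jpg"]
contents, common = roll_up(keys, prefix="notes/")
print(contents)  # ['notes/todo.txt']
print(common)    # ['notes/summer/']
```

Both keys under notes/summer/ collapse into one entry, which is why a bucket with thousands of files per top-level folder can return its top-level prefixes in very few calls.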