S3 multipart upload using AWS CLI with example

Hello Everyone

Welcome to CloudAffaire and this is Debjeet.

Today we are going to discuss how to split a large file into multiple parts and upload them to an S3 bucket using the multipart upload feature. We will also compare the speed of different S3 upload options: the native AWS CLI commands aws s3 cp and aws s3api put-object, multipart upload, and finally S3 Transfer Acceleration. Then we will find out which one is fastest. So, without further ado, let's get started.

What is S3 multipart upload?

Multipart upload allows you to upload a single object as a set of parts. Each part is a contiguous portion of the object’s data. You can upload these object parts independently and in any order. If transmission of any part fails, you can retransmit that part without affecting other parts. After all parts of your object are uploaded, Amazon S3 assembles these parts and creates the object. In general, when your object size reaches 100 MB, you should consider using multipart uploads instead of uploading the object in a single operation.

Multipart upload process:

Multipart upload is a three-step process: You initiate the upload, you upload the object parts, and after you have uploaded all the parts, you complete the multipart upload. Upon receiving the complete multipart upload request, Amazon S3 constructs the object from the uploaded parts, and you can then access the object just as you would any other object in your bucket.

What is S3 Transfer Acceleration?

Amazon S3 Transfer Acceleration is a bucket-level feature that enables fast, easy, and secure transfers of files over long distances between your client and an S3 bucket. Transfer Acceleration is designed to optimize transfer speeds from across the world into S3 buckets. Transfer Acceleration takes advantage of the globally distributed edge locations in Amazon CloudFront. As the data arrives at an edge location, the data is routed to Amazon S3 over an optimized network path.

S3 multipart upload using AWS CLI with example:

Prerequisites:

AWS CLI installed and configured.

Warning: Additional cost is associated with this demo (data transfer, S3 storage, and S3 Transfer Acceleration usage); please refer to the S3 pricing documentation for details.

Step 1: Create a large file that will be used to test S3 upload speed.
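One way to generate a test file is with dd; the 100 MB size and the file name large_file are just illustrative choices:

```shell
# Create a 100 MiB test file filled with zeros (name and size are arbitrary).
dd if=/dev/zero of=large_file bs=1M count=100

# Confirm the size.
ls -lh large_file
```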

Step 2: Split the large file into multiple chunks that will be used for the multipart upload.
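The split utility can cut the file into fixed-size chunks; the 20 MB chunk size and the part_ prefix are assumptions for this sketch:

```shell
# Assumes large_file exists (created in Step 1); recreate it here if needed.
[ -f large_file ] || dd if=/dev/zero of=large_file bs=1M count=100

# Split into 20 MiB chunks named part_aa, part_ab, ... part_ae.
split -b 20M large_file part_

ls -lh part_*
```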

Step 3: Create an S3 bucket with a bucket policy that allows you to upload S3 objects.
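A minimal sketch of bucket creation; the bucket name cloudaffaire-multipart-demo and the region are placeholders (bucket names must be globally unique, and regions other than us-east-1 also need a LocationConstraint):

```shell
# Create the demo bucket (replace the name and region with your own).
aws s3api create-bucket \
  --bucket cloudaffaire-multipart-demo \
  --region us-east-1

# For regions other than us-east-1, add for example:
#   --create-bucket-configuration LocationConstraint=eu-west-1
```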

Now we are ready to test the S3 upload speed.

Step 4: Using aws s3 cp command:
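The single-operation upload can be timed with the shell's time built-in; the bucket name is the placeholder from Step 3:

```shell
# Time a single-operation upload with the high-level s3 cp command.
time aws s3 cp large_file s3://cloudaffaire-multipart-demo/large_file_cp
```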

Step 5: Using aws s3api put-object command:
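The same upload with the low-level put-object call, again timed for comparison (bucket and key names are assumptions):

```shell
# Time the same upload with the low-level s3api put-object command.
time aws s3api put-object \
  --bucket cloudaffaire-multipart-demo \
  --key large_file_putobject \
  --body large_file
```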

So the old-generation s3 cp command seems to be faster than the newer s3api put-object. Next, we will create a multipart upload, split the large file into multiple parts, and upload them using the AWS CLI.

Step 6: Create an S3 multipart upload using AWS CLI.
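Initiating the multipart upload might look like this; the response contains an UploadId that every later part upload must reference:

```shell
# Initiate the multipart upload; note the UploadId in the response.
aws s3api create-multipart-upload \
  --bucket cloudaffaire-multipart-demo \
  --key large_file_multipart
```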

Step 7: Upload the parts to the multipart upload using AWS CLI.
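A sketch of uploading each chunk as a numbered part; replace the <upload_id> placeholder with the UploadId returned in Step 6. Each call returns an ETag, which S3 needs later to assemble the object:

```shell
# UploadId from the create-multipart-upload response (placeholder).
UPLOAD_ID="<upload_id>"

# Upload each chunk from Step 2 as a numbered part.
PART=1
for f in part_*; do
  aws s3api upload-part \
    --bucket cloudaffaire-multipart-demo \
    --key large_file_multipart \
    --part-number "$PART" \
    --body "$f" \
    --upload-id "$UPLOAD_ID"
  PART=$((PART + 1))
done
```

Running the upload-part calls in the background (for example with & and wait) is one way to parallelize this loop.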

Observe: The old-generation aws s3 cp is still faster. Of course, you can upload the parts in parallel, which reduces the total time to around 12 to 15 seconds.

Next, we need to combine the multiple files into a single file.

Step 8: Combine the multiple parts into a single object.
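Completing the upload requires the part numbers and ETags of all uploaded parts; one way to collect them is with list-parts and a JMESPath query, then feed the resulting JSON to complete-multipart-upload (bucket, key, and <upload_id> are the placeholders used above):

```shell
# Collect part numbers and ETags into the JSON shape
# that complete-multipart-upload expects.
aws s3api list-parts \
  --bucket cloudaffaire-multipart-demo \
  --key large_file_multipart \
  --upload-id "<upload_id>" \
  --query '{Parts: Parts[*].{PartNumber: PartNumber, ETag: ETag}}' \
  > parts.json

# Ask S3 to assemble the parts into a single object.
aws s3api complete-multipart-upload \
  --bucket cloudaffaire-multipart-demo \
  --key large_file_multipart \
  --upload-id "<upload_id>" \
  --multipart-upload file://parts.json
```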

Next, let's try S3 Transfer Acceleration, which uploads your S3 objects through the nearest edge location; from there the data travels over the AWS backbone network, reducing the time spent on the public internet.

Note: If you have a dedicated connection (AWS Direct Connect), the improvement may be significantly smaller. I am using a public network to upload the object.

Step 9: Enable S3 transfer acceleration for your S3 bucket using AWS CLI.
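Transfer acceleration is enabled per bucket; a sketch with the demo bucket name assumed above:

```shell
# Enable transfer acceleration on the bucket.
aws s3api put-bucket-accelerate-configuration \
  --bucket cloudaffaire-multipart-demo \
  --accelerate-configuration Status=Enabled

# Verify the setting.
aws s3api get-bucket-accelerate-configuration \
  --bucket cloudaffaire-multipart-demo
```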

Step 10: Update your AWS CLI configuration to use S3 transfer acceleration.
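The CLI can be told to route S3 commands through the accelerate endpoint for the default profile:

```shell
# Route S3 commands for the default profile through the accelerate endpoint.
aws configure set default.s3.use_accelerate_endpoint true
```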

Step 11: Upload a large object using S3 transfer acceleration.
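With the configuration from Step 10 in place, the same cp command now goes through the accelerate endpoint, so it can be timed directly:

```shell
# Time the upload again; it now uses the accelerate endpoint.
time aws s3 cp large_file s3://cloudaffaire-multipart-demo/large_file_accelerated
```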

Observe: S3 transfer acceleration seems to be the fastest option to upload a large file.

Note: You can combine parallel S3 multipart uploads with S3 Transfer Acceleration to reduce the time even further. Of course, if you have petabyte-scale data to upload, you should use other AWS services such as Snowball or Snowmobile.

Step 12: Clean up.
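A possible cleanup sketch, removing the demo objects, the bucket, the accelerate setting, and the local test files (names match the placeholders used throughout):

```shell
# Delete all objects, then the bucket itself.
aws s3 rm s3://cloudaffaire-multipart-demo --recursive
aws s3api delete-bucket --bucket cloudaffaire-multipart-demo

# Revert the CLI accelerate setting from Step 10.
aws configure set default.s3.use_accelerate_endpoint false

# Remove the local test files.
rm -f large_file part_* parts.json
```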

Hope you have enjoyed this article. For more details on AWS S3, please refer to the documentation below.

https://docs.aws.amazon.com/s3/index.html
