(I’m new to Amazon AWS/S3, so please bear with me)
My ultimate goal is to allow my users to upload files to S3 using their web browser; my requirements are:
- I must handle large files (2GB+)
- I must support pause/resume with a progress indicator
- (Optional but desirable!) Ability to resume upload if connection temporarily drops out
My two-part question is:
- I’ve read about the S3 multipart upload, but it’s not clear how I can
implement pause/resume for browser-based uploads.
Is it even possible to do this for large files? If so, how?
- Should I upload files to EC2 and then move them to S3 once I’m done? Can
I (securely) upload files directly to S3 instead of going through a temporary web server?
If it’s possible to upload directly to S3, how can I handle pause/resume?
PS. I’m using PHP 5.2+
My initial answer apparently missed the main point, so to clarify:
If you want to do browser based upload via simple HTML forms, you are constrained to using the POST Object operation, which adds an object to a specified bucket using HTML forms:
POST is an alternate form of PUT that enables browser-based uploads as
a way of putting objects in buckets. Parameters that are passed to PUT
via HTTP Headers are instead passed as form fields to POST in the
multipart/form-data encoded message body. […]
The upload is handled in a single operation here, so it doesn’t support pause/resume and restricts you to objects of 5 gigabytes (GB) or less (the original maximum object size).
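For completeness, a minimal sketch of what the server side of such a form-based POST upload looks like: the browser form must carry a base64-encoded policy document and its HMAC-SHA1 signature. The bucket name, key prefix and credentials below are placeholders, and this uses the (signature version 2) scheme documented for POST Object:

```php
<?php
// Sketch: signing a browser-based POST upload policy for S3.
// Bucket name, credentials and key prefix are placeholders.
$bucket       = 'my-bucket';
$awsAccessKey = 'AKIA...';
$awsSecretKey = '...';

$expiration = gmdate('Y-m-d\TH:i:s\Z', time() + 3600);
$policy = json_encode(array(
    'expiration' => $expiration,
    'conditions' => array(
        array('bucket' => $bucket),
        array('starts-with', '$key', 'uploads/'),
        array('acl' => 'private'),
        // POST caps objects at 5 GB in a single operation
        array('content-length-range', 0, 5368709120),
    ),
));

$policyB64 = base64_encode($policy);
$signature = base64_encode(hash_hmac('sha1', $policyB64, $awsSecretKey, true));

// Embed $policyB64 and $signature as hidden "policy" and "signature"
// fields in the multipart/form-data form, alongside AWSAccessKeyId,
// key and acl fields matching the conditions above.
```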
You can only overcome both limitations by Using the REST API for Multipart Upload instead, which is in turn used by SDKs like the AWS SDK for PHP to implement this functionality.
This obviously requires a server (e.g. on EC2) to handle the operation initiated via the browser (which also lets you easily apply S3 Bucket Policies and/or IAM Policies for access control).
If it’s possible to upload [large files] directly to S3, how can I handle pause/resume?
The AWS SDK for PHP supports uploading large files to Amazon S3 by means of the Low-Level PHP API for Multipart Upload:
The AWS SDK for PHP exposes a low-level API that closely resembles the
Amazon S3 REST API for multipart upload (see Using the REST API for
Multipart Upload ). Use the low-level API when you need to pause and
resume multipart uploads, vary part sizes during the upload, or do not
know the size of the data in advance. Use the high-level API (see
Using the High-Level PHP API for Multipart Upload) whenever you don’t
have these requirements. [emphasis mine]
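To make this concrete, here is a rough sketch of the low-level flow with the AWS SDK for PHP 1.x — the bucket/object names and file path are placeholders, error handling is omitted, and the exact option names should be checked against the SDK documentation. The key to pause/resume is persisting the `UploadId` and asking S3 which parts it already holds:

```php
<?php
// Sketch: low-level multipart upload with pause/resume (AWS SDK for PHP 1.x).
require_once 'sdk.class.php';

$s3      = new AmazonS3();
$bucket  = 'my-bucket';
$keyname = 'uploads/large-file.bin';

// 1. Initiate the upload and keep the UploadId around - persisting it
//    (e.g. in a database or the user's session) is what enables resuming.
$response = $s3->initiate_multipart_upload($bucket, $keyname);
$uploadId = (string) $response->body->UploadId;

// 2. Upload parts (minimum 5 MB each, except the last). To pause, simply
//    stop sending parts; to resume, list the parts S3 already has and
//    continue with the first missing part number.
$alreadyUploaded = $s3->list_parts($bucket, $keyname, $uploadId);
$s3->upload_part($bucket, $keyname, $uploadId, array(
    'fileUpload' => '/path/to/large-file.bin',
    'partNumber' => 1,
    'seekTo'     => 0,
    'length'     => 5242880, // 5 MB
));

// 3. Once every part is uploaded, complete the multipart upload.
$s3->complete_multipart_upload($bucket, $keyname, $uploadId,
    $s3->list_parts($bucket, $keyname, $uploadId));
```

Your browser-side code would report progress per part to the server, and a dropped connection only costs you the part that was in flight.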
Amazon S3 can handle objects from 1 byte all the way to 5 terabytes (TB), see the respective introductory post Amazon S3 – Object Size Limit Now 5 TB:
[…] Now customers can store extremely
large files as single objects, which greatly simplifies their storage
experience. Amazon S3 does the bookkeeping behind the scenes for our
customers, so you can now GET that large object just like you would
any other Amazon S3 object.
In order to store larger objects you would use the new Multipart Upload API that I blogged about last month to upload the object in parts. […]