Question:
I’m unsuccessfully trying to do a multipart upload with pre-signed part URLs.
This is the procedure I follow (steps 1-3 are on the server side, step 4 is on the client side):
- Instantiate boto client.
    import boto3
    from botocore.client import Config

    s3 = boto3.client(
        "s3",
        region_name=aws.default_region,
        aws_access_key_id=aws.access_key_id,
        aws_secret_access_key=aws.secret_access_key,
        config=Config(signature_version="s3v4")
    )
- Initiate multipart upload.
    upload = s3.create_multipart_upload(
        Bucket=AWS_S3_BUCKET,
        Key=key,
        Expires=datetime.now() + timedelta(days=2),
    )

    upload_id = upload["UploadId"]
- Create a pre-signed URL for the part upload.
    part = generate_part_object_from_client_submited_data(...)

    part.presigned_url = s3.generate_presigned_url(
        ClientMethod="upload_part",
        Params={
            "Bucket": AWS_S3_BUCKET,
            "Key": upload_key,
            "UploadId": upload_id,
            "PartNumber": part.no,
            "ContentLength": part.size,
            "ContentMD5": part.md5,
        },
        ExpiresIn=3600,  # 1h
        HttpMethod="PUT",
    )
Return the pre-signed URL to the client.
- On the client, try to upload the part using requests.
    part = receive_part_object_from_server(...)

    with io.open(filename, "rb") as f:
        f.seek(part.offset)
        buffer = io.BytesIO(f.read(part.size))

    r = requests.put(
        part.presigned_url,
        data=buffer,
        headers={
            "Content-Length": str(part.size),
            "Content-MD5": part.md5,
            "Host": "AWS_S3_BUCKET.s3.amazonaws.com",
        },
    )
And when I try to upload I either get:
    urllib3.exceptions.ProtocolError:
    ('Connection aborted.', BrokenPipeError(32, 'Broken pipe'))
Or:
    NoSuchUpload
    The specified upload does not exist. The upload ID may be invalid, or the upload may have been aborted or completed.
Even though the upload still exists and I can list it.
Can anyone tell me what I am doing wrong?
Answer:
Here is a command-line utility that does exactly the same thing; you might want to give it a try and see if it works. If it does, it will be easy to find the difference between your code and theirs. If it doesn't, I would double-check the whole process. Here is an example of how to upload a file using the aws command line:
https://aws.amazon.com/premiumsupport/knowledge-center/s3-multipart-upload-cli/?nc1=h_ls
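For double-checking the whole process, a stripped-down baseline in Python can be handy to compare against. The following is only a sketch under assumptions, not a diagnosis of the code in the question: the function name upload_in_parts and its parameters are made up for illustration, s3 is assumed to be a boto3 S3 client, and http_put is assumed to behave like requests.put. It uses one and the same key for every call and signs no extra headers.

```python
def upload_in_parts(s3, http_put, bucket, key, chunks, expires_in=3600):
    """Run a multipart upload where each part goes through a pre-signed URL.

    s3       -- assumed to be a boto3 S3 client
    http_put -- assumed to behave like requests.put
    chunks   -- iterable of bytes objects, one per part (S3 requires every
                part except the last to be at least 5 MiB)
    """
    # 1. Initiate the upload and remember its ID.
    upload_id = s3.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]

    parts = []
    for number, chunk in enumerate(chunks, start=1):
        # 2. Pre-sign one URL per part, with the same bucket, key and upload ID.
        url = s3.generate_presigned_url(
            ClientMethod="upload_part",
            Params={
                "Bucket": bucket,
                "Key": key,
                "UploadId": upload_id,
                "PartNumber": number,
            },
            ExpiresIn=expires_in,
        )
        # 3. PUT the raw bytes; the HTTP library sets Content-Length and Host.
        response = http_put(url, data=chunk)
        response.raise_for_status()
        # The ETag response header is needed to complete the upload.
        parts.append({"ETag": response.headers["ETag"], "PartNumber": number})

    # 4. Tell S3 which parts make up the final object.
    s3.complete_multipart_upload(
        Bucket=bucket,
        Key=key,
        UploadId=upload_id,
        MultipartUpload={"Parts": parts},
    )
    return upload_id, parts

# Hypothetical usage (bucket and key are placeholders):
#   import boto3, requests
#   upload_in_parts(boto3.client("s3"), requests.put, "my-bucket", "my-key", chunks)
```

If this baseline works against your bucket, the differences between it and your own code (key, signed headers, extra parameters) are the first places to look.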
Actually, if it does work, i.e. you can replicate the upload using aws s3 commands, then we need to focus on the use of the presigned URL. You can check what the URL should look like here:
https://github.com/aws/aws-sdk-js/issues/468
https://github.com/aws/aws-sdk-js/issues/1603
These are for the JS SDK, but the people there discuss the raw URLs and parameters, so you should be able to spot the difference between your URLs and the URLs that are working.
Another option is to try this script; it uses JS to upload a file from a web browser using presigned URLs.
https://github.com/prestonlimlianjie/aws-s3-multipart-presigned-upload
If it works you can inspect the communication and observe the exact URLs that are being used to upload each part, which you can compare with the urls your system is generating.
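To make that comparison less error-prone than eyeballing two long URLs, something like the following sketch could help. It only uses the Python standard library; the function names and the set of ignored parameters are my own choices for illustration, not part of any AWS tool. It splits both URLs into host, path and query parameters and reports what differs, skipping X-Amz-Signature and X-Amz-Date, which change on every signing run.

```python
from urllib.parse import parse_qs, urlsplit

# Query parameters that differ on every signing run, so a mismatch there
# says little by itself (an assumption you may want to adjust).
ALWAYS_DIFFERENT = frozenset({"X-Amz-Signature", "X-Amz-Date"})


def split_presigned_url(url):
    """Return (host, path, {query parameter: decoded value})."""
    parts = urlsplit(url)
    query = parse_qs(parts.query, keep_blank_values=True)
    # Keep the last value if a parameter is repeated.
    return parts.netloc, parts.path, {name: values[-1] for name, values in query.items()}


def diff_presigned_urls(mine, working, ignore=ALWAYS_DIFFERENT):
    """Return {field: (value in mine, value in working)} for every difference."""
    host_a, path_a, query_a = split_presigned_url(mine)
    host_b, path_b, query_b = split_presigned_url(working)

    diffs = {}
    if host_a != host_b:
        diffs["host"] = (host_a, host_b)
    if path_a != path_b:
        diffs["path"] = (path_a, path_b)
    for name in sorted(set(query_a) | set(query_b)):
        if name in ignore:
            continue
        if query_a.get(name) != query_b.get(name):
            # None means the parameter is missing from that URL.
            diffs[name] = (query_a.get(name), query_b.get(name))
    return diffs
```

Differences in the path (the object key), in uploadId or partNumber, or in X-Amz-SignedHeaders are the ones worth a closer look, since the last of these lists the headers the client is expected to send with the request.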
Btw., once you have a working URL for the multipart upload, you can use the aws s3 presign command to obtain the presigned URL; this should let you finish the upload using just curl and keep full control over the upload process.
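As a rough sketch of the curl step, the small Python helper below builds a curl command line for uploading one part to a presigned URL. The helper name, the file name and the URL are placeholders of mine; the curl options used are --upload-file (which makes curl send a PUT over HTTP) and --include (which prints the response headers, so you can read the ETag of the uploaded part). The part file is assumed to contain exactly the bytes of that one part.

```python
import shlex


def curl_put_command(presigned_url, part_path, headers=None):
    """Build the argument list for uploading one part file with curl."""
    # --upload-file sends the file body with PUT; --include shows response headers.
    command = ["curl", "--include", "--upload-file", part_path]
    # Only add headers that were part of the signature (if any).
    for name, value in (headers or {}).items():
        command += ["--header", "%s: %s" % (name, value)]
    command.append(presigned_url)
    return command


# Placeholder URL and file name, for illustration only.
example = curl_put_command(
    "https://example-bucket.s3.amazonaws.com/my-key?partNumber=1&uploadId=EXAMPLE",
    "part1.bin",
)
# shlex.join (Python 3.8+) quotes the URL so the shell does not interpret "&" or "?".
print(shlex.join(example))
```

Running the printed command by hand and reading the status line and headers of the response usually tells you more than the exception raised by an HTTP library.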