Question:
What are the right content types for the different file types of a static site hosted on AWS, and how can I set them in a smart way via boto3?
I use the upload_file method:
```python
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('allecijfers.nl')
bucket.upload_file(
    'C:/Hugo/Sites/allecijfers/public/test/index.html',
    'test/index.html',
    ExtraArgs={'ACL': 'public-read', 'ContentType': 'text/html'},
)
```
This works well for the HTML files. I initially left out the ExtraArgs, which resulted in the files being downloaded instead of displayed (probably because the content type then defaults to binary?). I found a page that lists several content types, but I am not sure how to apply it.
E.g. the CSS files should probably be uploaded with `'ContentType': 'text/css'`.
But what about the .js files, index.xml, etc.? And how can I do this in a smart way? FYI, this is my current script to upload from Windows to AWS; it requires `string.replace("\\", "/")`, which probably isn't the smartest approach either:
```python
for root, dirs, files in os.walk(local_root + local_dir):
    for filename in files:
        # construct the full local path
        local_path = os.path.join(root, filename).replace("\\", "/")

        # construct the full S3 path
        relative_path = os.path.relpath(local_path, local_root)
        s3_path = relative_path.replace("\\", "/")

        bucket.upload_file(local_path, s3_path,
                           ExtraArgs={'ACL': 'public-read', 'ContentType': 'text/html'})
```
I uploaded my complete Hugo site from the same source to the same S3 bucket using the AWS CLI, and that works perfectly without specifying content types. Is this also possible via boto3?
Many thanks in advance for your help!
Answer:
There is a built-in Python library, `mimetypes`, for guessing MIME types.
So you can just look up each filename first. It works like this:
```python
import mimetypes

print(mimetypes.guess_type('filename.html'))
```
Result:
```
('text/html', None)
```
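The same lookup covers the other file types mentioned in the question. A quick sketch (the exact result for .js and .xml can vary slightly between platforms and Python versions, since `mimetypes` consults the system's MIME registry):

```python
import mimetypes

# Typical static-site assets and the types mimetypes guesses for them.
for name in ['style.css', 'script.js', 'index.xml']:
    print(name, '->', mimetypes.guess_type(name)[0])
# style.css resolves to 'text/css'; script.js to 'text/javascript' or
# 'application/javascript' depending on the platform/Python version.
```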
Here it is applied in your code. I also slightly improved the portability of the script with respect to the Windows paths: it does the same thing, but is now portable to a Unix platform, because it looks up the platform-specific separator (`os.path.sep`) used in local paths.
```python
import os
import mimetypes

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('allecijfers.nl')

for root, dirs, files in os.walk(local_root + local_dir):
    for filename in files:
        # Construct the full local path. (No need to convert it to a Unix-style
        # path here; os.path works with the native separator.)
        local_path = os.path.join(root, filename)

        # Construct the full S3 path, which always uses forward slashes.
        relative_path = os.path.relpath(local_path, local_root)
        s3_path = relative_path.replace(os.path.sep, "/")

        # Guess the content type; fall back to a generic binary type for
        # extensions mimetypes doesn't know (guess_type returns None there,
        # which boto3 would reject as a ContentType).
        content_type = mimetypes.guess_type(filename)[0] or 'application/octet-stream'

        bucket.upload_file(
            local_path, s3_path,
            ExtraArgs={'ACL': 'public-read', 'ContentType': content_type}
        )
```
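One caveat worth handling explicitly: for extensions the registry does not know, `guess_type` returns `None`, and boto3 will reject `ContentType: None`. A small sketch of how you might deal with that (the helper name `content_type_for` is hypothetical, and whether your site actually contains a `.webmanifest` file is an assumption):

```python
import mimetypes

def content_type_for(filename, default='application/octet-stream'):
    # Hypothetical helper: fall back to a generic binary type when the
    # extension is not in the mimetypes registry (guess_type returns None).
    guessed, _ = mimetypes.guess_type(filename)
    return guessed or default

# Extensions the registry may not know can be registered up front;
# a web app manifest is one example you might find in a Hugo site.
mimetypes.add_type('application/manifest+json', '.webmanifest')

print(content_type_for('index.html'))        # text/html
print(content_type_for('site.webmanifest'))  # application/manifest+json
```

With a fallback like this in place, the upload loop never passes `None` to `ExtraArgs`, and any truly unknown file is served as a download rather than failing the upload.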