Question:
I’m using s3fs to mount an S3 bucket and copy a lot of files to it. It works fine, except that my local disk usage is also growing a lot (the contents of the /tmp directory).
My command is:
$ su ec2-user -c '/usr/bin/s3fs my-bucket-name -o use_cache=/tmp /home/ec2-user/dir'
I’m using the use_cache parameter, but what is actually cached? Are these files that still need to be uploaded to S3 and are held on my local machine in the meantime? Can I just delete them during an upload or while the bucket is mounted? And would my upload go faster if I turned the cache off (if it serves some other purpose)?
Answer:
From the s3fs wiki (which is a bit hard to find):
If enabled via “use_cache” option, s3fs automatically maintains a local cache of files in the folder specified by use_cache. Whenever s3fs needs to read or write a file on s3 it first downloads the entire file locally to the folder specified by use_cache and operates on it. When fuse release() is called, s3fs will re-upload the file to s3 if it has been changed. s3fs uses md5 checksums to minimize downloads from s3. Note: this is different from the stat cache (see below). Local file caching works by calculating and comparing md5 checksums (ETag HTTP header).
The folder specified by use_cache is just a local cache. It can be deleted at any time. s3fs re-builds it on demand. Note: this directory grows unbounded and can fill up a file system dependent upon the bucket and reads to that bucket.
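Since the cache grows unbounded but can be deleted at any time, one way to keep it in check is to prune old files periodically. The following is only a minimal sketch, not part of s3fs: prune_cache is a hypothetical helper name, the 60-minute threshold is arbitrary, and the assumption that the cached files live in a subdirectory named after the bucket (here /tmp/my-bucket-name) should be checked against your own /tmp before relying on it. It also uses -mmin and -delete, which are available in GNU and BSD find but are not POSIX.

```shell
#!/bin/sh
# prune_cache DIR MINUTES
# Hypothetical helper: delete regular files under DIR whose last
# modification is more than MINUTES minutes old. s3fs re-downloads
# anything it needs again, so this only costs extra transfer later.
prune_cache() {
    cache_dir=$1
    max_age_min=$2

    # Refuse to run if the directory does not exist, so a typo in the
    # path is reported instead of silently doing nothing.
    [ -d "$cache_dir" ] || return 1

    # -type f  : only regular files, directories are left in place
    # -mmin +N : modified more than N minutes ago
    # -delete  : remove each match (GNU/BSD find extension)
    find "$cache_dir" -type f -mmin +"$max_age_min" -delete
}

# Example call (assumed cache location, adjust to your setup):
# prune_cache /tmp/my-bucket-name 60
```

Running the example call from a cron job every so often would cap how long stale copies stay on disk. The trade-off is that a file removed from the cache has to be downloaded in full again the next time it is read.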