What does s3fs cache in /tmp?

Question:

I’m using s3fs to mount an S3 bucket and upload a lot of files to it. It works fine, except that my local disk usage is also growing a lot (the content of the /tmp directory).

My command is:

I’m using the use_cache parameter, but what is actually cached? Are these files that still need to be uploaded to S3 and are cached on my local machine? Can I just delete them during the upload/mount or not? And will my upload go quicker if I turn it off (if it’s for other purposes)?

Answer:

From the s3fs wiki (which is a bit hard to find):

If enabled via “use_cache” option, s3fs automatically maintains a local cache of files in the folder specified by use_cache. Whenever s3fs needs to read or write a file on s3 it first downloads the entire file locally to the folder specified by use_cache and operates on it. When fuse release() is called, s3fs will re-upload the file to s3 if it has been changed. s3fs uses md5 checksums to minimize downloads from s3. Note: this is different from the stat cache (see below).

Local file caching works by calculating and comparing md5 checksums (ETag HTTP header).
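
To make the checksum comparison concrete, here is a minimal Python sketch of the idea, not s3fs's actual code. The helper names local_md5 and cached_copy_is_current are made up for illustration. It assumes the object's ETag is a plain MD5 hex digest, which generally holds only for objects uploaded in a single part without SSE-KMS encryption.

```python
import hashlib


def local_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_copy_is_current(path, etag):
    """Return True if the cached file's MD5 matches the object's ETag.

    Assumption: the ETag is a plain MD5 hex digest (not a multipart ETag).
    """
    # S3 returns the ETag wrapped in double quotes, so strip them first.
    return local_md5(path) == etag.strip('"').lower()
```

If the two values match, the cached copy can be reused and the download skipped; if they differ, the object has changed on S3 and has to be fetched again.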

The folder specified by use_cache is just a local cache. It can be deleted at any time. s3fs re-builds it on demand. Note: this directory grows unbounded and can fill up a file system dependent upon the bucket and reads to that bucket.
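
Since the cache grows without limit and can be deleted at any time, one option is to trim it periodically instead of wiping it completely. The following is a rough Python sketch under a few assumptions: prune_cache is a hypothetical helper (not part of s3fs), the size limit is arbitrary, and the file system records access times (on mounts with noatime the ordering may be less meaningful).

```python
import os


def prune_cache(cache_dir, max_bytes):
    """Delete the least recently accessed files until the cache fits in max_bytes.

    Returns the list of removed paths, oldest first.
    """
    entries = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished while we were scanning
            entries.append((st.st_atime, st.st_size, path))

    total = sum(size for _atime, size, _path in entries)
    removed = []
    # Oldest access time first, so recently used files are kept.
    for _atime, size, path in sorted(entries):
        if total <= max_bytes:
            break
        try:
            os.remove(path)
        except OSError:
            continue  # could not delete; leave it and move on
        total -= size
        removed.append(path)
    return removed
```

For example, prune_cache("/tmp/s3fs-cache", 5 * 1024**3) would cap a cache directory at roughly 5 GiB; the path here is only a placeholder for whatever directory was passed to use_cache. It could be run from cron or a similar scheduler. Anything removed is simply downloaded again the next time s3fs needs it.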
