I have a Python script that, in a loop:
- Downloads video chunks from AWS S3 to /filename.
- Sorts files in order and concatenates them.
- Uploads entire processed video file to AWS S3.
- Deletes folder /filename.
It then repeats until the AWS SQS queue is empty.
The script works great! I have run it for months. Hard drive usage varies with the size of the video but never gets above 5%.
I decided to put this script in a Docker container and run it with docker-compose so I could run a bunch of them at a time.
The problem is the hard drive fills up! I know that with 5 running the disk usage will be higher, but when each one is done processing, its files get deleted.
But with Docker, there seems to be a cache or something. I exec into each container and they are running fine, deleting old files and all.
I have no clue what difference running in a Docker container, rather than as a service, would make to hard drive usage.
To add to this, when I “rm” the Docker containers, the hard drive space frees up. When I run “docker ps -s”, the size of the containers is not crazy. It just seems like when you “rm” a file inside the Docker container, it never really gets removed.
If you’re downloading the files to a directory that is NOT volume-mapped from the host, the Docker container will not release the used disk space until the container is removed. Anything done in the container is ephemeral, but the HOST doesn’t know the state of what’s going on inside the container.
In this sense it’s a lot like a virtual machine image, backed by a file that just grows as needed, but never shrinks. Docker has a directory for a running container tracking changes. On the host you can find the files backing the running container in
If you need your containers to share disk space, I’d recommend mapping a shared volume from the host into each Docker container.
Try the following:
docker run -ti -v /host/dir:/container/dir ubuntu bash
The above runs the ubuntu image in interactive terminal mode, mounting the host’s directory /host/dir at /container/dir inside the running container. Anything the container writes to /container/dir will appear in the host’s /host/dir, and any other containers mounting it will see the changes as well.
Just remember anything done in the shared volume is seen by all containers that mount it, so be careful when adding and deleting files/directories from it!
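One way to stay out of each other's way on a shared mount is to give every job its own scratch directory and remove only that directory when the job finishes. Below is a rough Python sketch of the idea, not your actual script: process_job, concatenate_chunks, the chunks dict and the upload callback are all made-up names standing in for your S3 download and upload steps, and shared_root is assumed to be the mounted path (for example /container/dir).

```python
from __future__ import annotations

import shutil
import tempfile
from pathlib import Path
from typing import Callable


def concatenate_chunks(chunk_dir: Path, output_path: Path) -> None:
    """Append every file in chunk_dir to output_path in sorted filename order.

    Plain string sorting is used, so this assumes zero-padded chunk names.
    """
    with output_path.open("wb") as out:
        for chunk in sorted(chunk_dir.iterdir()):
            with chunk.open("rb") as src:
                shutil.copyfileobj(src, out)


def process_job(
    shared_root: Path,
    chunks: dict[str, bytes],
    upload: Callable[[Path], None],
) -> None:
    """Process one job in a private scratch directory under the shared mount.

    `chunks` (name -> bytes) is a hypothetical stand-in for the download step,
    and `upload` is a hypothetical stand-in for the upload step.
    """
    # mkdtemp picks a unique name, so jobs sharing the mount do not collide.
    job_dir = Path(tempfile.mkdtemp(prefix="job-", dir=shared_root))
    try:
        chunk_dir = job_dir / "chunks"
        chunk_dir.mkdir()
        for name, data in chunks.items():
            (chunk_dir / name).write_bytes(data)

        output_path = job_dir / "output.bin"
        concatenate_chunks(chunk_dir, output_path)
        upload(output_path)
    finally:
        # Remove only this job's directory, never the shared root itself.
        shutil.rmtree(job_dir, ignore_errors=True)
```

Because the cleanup sits in a finally block, the scratch directory is removed even if the concatenation or upload raises, and no container ever deletes files that belong to another job.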