Containerized server workloads are becoming more popular, and it is increasingly common to see web server deployments running in containers. Can the same benefits be applied to databases?
Docker can handle stateful workloads
It is best to start by asking another question: can you even run a database in Docker? In general, Docker is not designed for stateful services. One of the main selling points of containers is that they can be stopped and started at will, usually connecting to an authoritative data source, such as a database, to store their state. All data inside the container is ephemeral and is destroyed when the container is removed.
This makes running stateful workloads particularly challenging, but fortunately Docker has some tools for managing state: volumes and bind mounts. These let you mount a location on the host machine to a location in the container, so the data persists even after the container is stopped or removed. This way, you can run containers long-term without worrying about data being lost.
Volume mounts are the preferred way to handle most scenarios. They let you create a volume managed by Docker …
docker volume create my-volume
… and then mount that volume to a destination inside the container:
docker run --mount source=my-volume,target=/app <image>
Bind mounts are simpler. They are what volumes use under the hood, but they let you set the location on the host disk manually instead of having it managed by Docker.
docker run -v ~/nginxlogs:/var/log/nginx nginx
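To make the persistence claim concrete, here is a minimal sketch that writes a file through one container and reads it back from a second one. It assumes a local Docker daemon and the public alpine image. The function name, the volume name, and the /data path are hypothetical and chosen only for illustration.

```shell
# Hypothetical demo: data written to a named volume outlives the container.
# Assumes a local Docker daemon and the public "alpine" image.
volume_persistence_demo() {
  vol="${1:-demo-volume}"   # hypothetical volume name

  # Create the Docker-managed volume.
  docker volume create "$vol" >/dev/null

  # The first container writes into the volume and is removed on exit (--rm).
  docker run --rm --mount type=volume,source="$vol",target=/data alpine \
    sh -c 'echo "still here" > /data/note.txt'

  # A brand-new container mounts the same volume and reads the file back.
  docker run --rm --mount type=volume,source="$vol",target=/data alpine \
    cat /data/note.txt
}
```

Calling volume_persistence_demo should show the second container reading the line that the first one wrote, even though the first container no longer exists. The volume can be cleaned up afterwards with docker volume rm demo-volume.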
In practice, it can be a little more complicated to use these mounts. Many managed Docker services, such as AWS ECS or managed Kubernetes, do not give you direct access to the underlying server, so you will not be able to set up bind mounts directly. This is usually solved with a service such as EFS, which can be mounted into ECS containers, or with an external data store, such as a database.
Should you choose Docker for your database?
Docker containers are usually not good at handling state, and Docker-based workloads usually offload this problem to a database. With a database as the solution to the problem, does it make sense to put the database itself in Docker?
By and large, the answer is “usually not.” Docker has come a long way since its inception, and it’s not a horrible or “wrong” idea to containerize databases anymore. It can certainly be done, and it comes with some benefits. For most general workloads, however, the benefits do not outweigh the complications.
To see why, let’s look at the benefits that Docker brings to the table:
- Easy scaling: servers can be created and destroyed quickly to meet demand
- Simpler tooling for CI/CD: automated builds are trivial
- Codification of your infrastructure: all underlying libraries and installation steps are managed in the Dockerfile
Most of these do not translate well to database workloads, which are often long-running and care primarily about data integrity. You generally do not want to autoscale most databases, and they do not usually receive regular code updates themselves, so they do not benefit as much from running in containers. And if you are only mounting a local storage device anyway, why not run the database outside Docker?
If your goal is to free yourself from the complexity of managing databases, Docker is not the tool for the job. It is simply an unnecessary complication for a workload that can easily run on a regular VPS. You will probably be much better off using a fully managed database service, such as AWS’s RDS. This provides much of the automation that Docker is good for, without the headaches of doing it yourself.
The main place where Docker can be useful for database workloads is in development environments. Docker makes it easy to spin up new databases with different configurations, which makes testing quick. In production, however, the requirements are generally stricter.
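As an illustration of the development use case, the sketch below starts and tears down a throwaway database. It assumes the official postgres image from Docker Hub. The helper names, the container name, the image tag, the password, and the port mapping are placeholders rather than recommendations.

```shell
# Hypothetical helpers for a throwaway development database.
# Assumes the official "postgres" image; all names and values are placeholders.
start_dev_db() {
  name="${1:-dev-postgres}"   # hypothetical container name
  docker run -d \
    --name "$name" \
    -e POSTGRES_PASSWORD=devonly \
    -p 5432:5432 \
    --mount type=volume,source="${name}-data",target=/var/lib/postgresql/data \
    postgres:16
}

# Remove the container and its data volume once the experiment is over.
stop_dev_db() {
  name="${1:-dev-postgres}"
  docker rm -f "$name"
  docker volume rm "${name}-data"
}
```

Running start_dev_db followed by stop_dev_db gives each test run a clean database. Keeping the data in a named volume means the container can also be recreated with a different configuration without losing the data, as long as the volume is not removed.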