Reusing Packages with Docker
Docker images are a convenient artifact format for backend applications. They are agnostic to both the Linux distribution and the programming language, so you can have a single artifacts storage for the whole company (a Docker Registry) and happily deploy everything with Docker.
But once you start using Docker for packaging all your applications, sooner or later you will face the problem of reusing a single artifact (hereinafter referred to as a “library”) across different Docker images.
List of Approaches You Might Take
There’s no silver bullet here, so you will have to consider several options and pick what works best for you.
Library as a Base Image
Build a single image with the library, and use that as a base image in your other Docker images via FROM:
# `mylibrary:latest` Dockerfile:
FROM python:3.8
# ... build your library
# `project1` Dockerfile:
FROM mylibrary:latest
# ... build your project on top of the library
# `project2` Dockerfile:
FROM mylibrary:latest
# ... build your project on top of the library
Key problems of this approach:
- Docker doesn’t support multiple inheritance, so there’s no way to add a second library the same way
- All the inheriting images must use the same base Docker image (i.e. you can’t start one image with `FROM python:3.6-alpine` and another with `FROM python:3.8`; the library’s base image is used everywhere)
Otherwise this is a straightforward solution, if you have only one library and can live with a single base image. Nothing stops you from bundling multiple libraries into the same image, though (except that you would probably regret it later).
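For illustration, here is a minimal sketch of this approach, assuming a hypothetical pure-Python package `mylib` whose sources live in the build context:
# `mylibrary:latest` Dockerfile (hypothetical `mylib` package):
FROM python:3.8
# Copy the library sources and install the library into the base image
COPY . /src/mylib
RUN pip install /src/mylib
# `project1` Dockerfile:
FROM mylibrary:latest
# The library is already installed; add the project on top of it
COPY . /app
RUN pip install /app
Note that `project1` is now tied to `python:3.8`, because that is what `mylibrary:latest` was built from.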
Build the Library from Sources
Don’t store the built artifact, and build the library from sources in all project images:
# `project1` Dockerfile:
FROM python:3.6-alpine
# ... build your library
# ... build your project on top of the library
# `project2` Dockerfile:
FROM python:3.8
# ... build your library
# ... build your project on top of the library
Key problems of this approach:
- Each project image must prepare the proper environment for building the library’s artifact (i.e. install dependencies with the required versions).
- Normally the built artifact should be tested (a wrong build environment can produce a broken artifact without any errors during the build), but the project images will most likely focus on testing the projects themselves, so the library’s artifact may end up not covered by tests.
- The build environment will inevitably go out of sync between the project images (e.g. dependency versions will drift).
Otherwise this is a quick and working solution, which can get you started.
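A minimal sketch, assuming the library’s sources are fetched from a Git repository (the repository URL and the tag below are hypothetical):
# `project1` Dockerfile:
FROM python:3.6-alpine
# Install build prerequisites and build the library from sources
# (the repository URL and the tag are hypothetical)
RUN apk add --no-cache git \
 && pip install git+https://git.example.com/mylib.git@v1.2.3
# ... build your project on top of the library
Each project image repeats this step, so any build dependencies (compilers, headers, etc.) have to be installed in every image as well.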
Specialized Artifacts Storage
Python wheels, npm packages and Debian dpkg packages are all different package formats, and usually each of them needs its own specialized package storage in the company (e.g. devpi for Python, aptly for Debian).
There are universal storages like Artifactory or Nexus, but you still need to run one (both can replace a Docker Registry as well!).
But if you do have a repository where you can upload the built artifact of your library, then just do that, and install the library in the project images by downloading it from the repository:
# `project1` Dockerfile:
FROM python:3.6-alpine
# ... install the library from repository
# ... build your project on top of the library
# `project2` Dockerfile:
FROM python:3.8
# ... install the library from repository
# ... build your project on top of the library
This approach also allows the library to be used outside of Docker.
The only problem is that it requires an additional artifacts storage besides the Docker Registry. But if you can publish your library to the Internet, you can simply use the public repositories, such as Python’s PyPI or npm.
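For example, with a private Python index (devpi or any other compatible index; the index URL and the version pin below are hypothetical) the installation step might look like this:
# `project1` Dockerfile:
FROM python:3.8
# Install the library from an internal package index
# (the index URL and the version pin are hypothetical)
RUN pip install --index-url https://devpi.example.com/root/prod/+simple/ mylib==1.2.3
# ... build your project on top of the library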
Copy Artifact from a Built Docker Image
The idea is to build the library artifact, store it in a Docker Registry inside a thin image, and then copy that artifact into the project images from that library image.
Docker multi-stage builds allow copying files from other images:
# `mylibrary:latest` Dockerfile:
FROM debian:stable as builder
# ... build your library
# Assuming that the built artifact is called `/mylib.pkg`,
# create a thin Docker Image which will be pushed to the registry:
FROM scratch
COPY --from=builder /mylib.pkg /
# `project1` Dockerfile:
FROM mylibrary:latest as mylibrary_pkg
FROM mylibrary2:latest as mylibrary2_pkg
FROM python:3.6-alpine
COPY --from=mylibrary_pkg /mylib.pkg /
COPY --from=mylibrary2_pkg /mylib2.pkg /
# ... install the just copied libraries
# ... build your project on top of the library
# `project2` Dockerfile:
FROM mylibrary:latest as mylibrary_pkg
FROM mylibrary2:latest as mylibrary2_pkg
FROM python:3.8
COPY --from=mylibrary_pkg /mylib.pkg /
COPY --from=mylibrary2_pkg /mylib2.pkg /
# ... install the just copied libraries
# ... build your project on top of the library
This approach is definitely the hardest one to grasp, but it seems to be the best option if you want to stick with just a Docker Registry.
The main drawback is that you cannot easily use the built library outside of Docker.
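As a concrete (purely illustrative) variant, assume the library is a Python package shipped as a wheel; the builder base image, file name and version below are hypothetical:
# `mylibrary:latest` Dockerfile:
FROM python:3.8 as builder
COPY . /src/mylib
# Build a wheel; the exact file name is hypothetical
RUN pip wheel --no-deps /src/mylib -w /dist
# Thin image holding only the artifact:
FROM scratch
COPY --from=builder /dist/mylib-1.2.3-py3-none-any.whl /
# `project1` Dockerfile:
FROM mylibrary:latest as mylibrary_pkg
FROM python:3.6-alpine
COPY --from=mylibrary_pkg /mylib-1.2.3-py3-none-any.whl /tmp/
RUN pip install /tmp/mylib-1.2.3-py3-none-any.whl
# ... build your project on top of the library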
Summary
If you have a specialized artifacts storage, consider just using it.
If you want to limit artifacts storage to just Docker Registry, then choose between simplicity (first two approaches) and flexibility (the last one).
| Quality | Base Image | Build from Sources | Specialized Artifacts Storage | Copy from Image |
|---|---|---|---|---|
| Simple | + | + | + | - |
| Only Docker Registry is required | + | + | - | + |
| Built library can be installed without Docker | - | - | + | - |
| Different base Docker images could be used | - | + | + | + |
| Multiple different libraries could be installed | - | + | + | + |
| The library is built just once | + | - | + | + |