At Logilab we use Docker more and more, both for testing and production purposes.

For a CubicWeb application I had to write a Dockerfile that runs a pip install and compiles JavaScript code using npm.

The first version of the Dockerfile was:

FROM debian:stretch
RUN apt-get update && apt-get -y install \
    wget gnupg ca-certificates apt-transport-https \
    python-pip python-all-dev libgecode-dev g++
RUN echo "deb https://deb.nodesource.com/node_9.x stretch main" > /etc/apt/sources.list.d/nodesource.list
RUN wget https://deb.nodesource.com/gpgkey/nodesource.gpg.key -O - | apt-key add -
RUN apt-get update && apt-get -y install nodejs
COPY . /sources
RUN pip install /sources
RUN cd /sources/frontend && npm install && npm run build && \
    mv /sources/frontend/bundles /app/bundles/
# ...

The resulting image size was about 1.3GB, which caused issues when uploading it to registries and with the disk space required on production servers.

So I looked into how to reduce this image size. An important thing to know about Dockerfiles is that each instruction results in a new Docker layer, so removing useless files in a later step will not reduce the image size.
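To see where the space actually goes, the per-layer sizes can be inspected with docker history (the image name below is just an example):

```shell
# Show one line per layer: the instruction that created it
# and the disk space that layer adds to the image.
docker history myapp:latest

# Show the total size of the image.
docker image ls myapp:latest
```

This makes it easy to spot which RUN or COPY instruction is responsible for a fat layer before trying to optimize it.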

https://www.logilab.org/file/10128049/raw/docker-multi-stage.png

The first change was to use debian:stretch-slim as the base image instead of debian:stretch; this reduced the image size by 18MB.

Also, by default apt-get pulls in a lot of extra optional packages that are "Suggests" or "Recommends". We can simply disable this globally using:

RUN echo 'APT::Install-Recommends "0";' >> /etc/apt/apt.conf && \
    echo 'APT::Install-Suggests "0";' >> /etc/apt/apt.conf

This reduced the image size by 166MB.
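If a global setting is not wanted, the same effect can be obtained per invocation with the --no-install-recommends option of apt-get (Suggests are not installed by default anyway, only Recommends):

```dockerfile
# Per-call form: skip "Recommends" packages for this install only
RUN apt-get update && apt-get -y install --no-install-recommends \
    wget gnupg ca-certificates apt-transport-https
```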

Then I looked at the content of the image and saw a lot of space used in /root/.pip/cache. By default pip builds and caches Python packages (as wheels); this can be disabled by adding --no-cache-dir to pip install calls. This reduced the image size by 26MB.

The image also contains a full nodejs build toolchain, which is useless once the bundles are generated. The old workaround is to install nodejs, build the files, remove the useless build artifacts (node_modules) and uninstall nodejs in a single RUN instruction, but this results in an ugly Dockerfile and is not an optimal use of the layer cache. Instead, we can set up a multi-stage build.

The idea behind multi-stage builds is to be able to build multiple images within a single Dockerfile (only the last one is tagged) and to copy files from one image to another within the same build using COPY --from= (we can also use a previous stage as the base image in a FROM clause).
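As a side note, since each stage is named with "as", a single stage can also be built on its own using the --target option of docker build; the stage name below matches the example that follows, the tag names are hypothetical:

```shell
# Build only the node-builder stage, e.g. to debug the frontend build
docker build --target node-builder -t myapp-frontend .

# A plain build runs all stages; only the final one gets the tag
docker build -t myapp .
```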

Let's extract the javascript build into a separate stage:

FROM debian:stretch-slim as base
RUN echo 'APT::Install-Recommends "0";' >> /etc/apt/apt.conf && \
    echo 'APT::Install-Suggests "0";' >> /etc/apt/apt.conf

FROM base as node-builder
RUN apt-get update && apt-get -y install \
    wget gnupg ca-certificates apt-transport-https
RUN echo "deb https://deb.nodesource.com/node_9.x stretch main" > /etc/apt/sources.list.d/nodesource.list
RUN wget https://deb.nodesource.com/gpgkey/nodesource.gpg.key -O - | apt-key add -
RUN apt-get update && apt-get -y install nodejs
COPY . /sources
RUN cd /sources/frontend && npm install && npm run build

FROM base
RUN apt-get update && apt-get -y install python-pip python-all-dev libgecode-dev g++
COPY . /sources
RUN pip install --no-cache-dir /sources
COPY --from=node-builder /sources/frontend/bundles /app/bundles

This reduced the image size by 252MB.

The Gecode build toolchain is required to build rql with its gecode extension (which is a lot faster than the native Python implementation). My next idea was to build rql as a wheel inside a staging image, so that the resulting image would only need the gecode library:

FROM debian:stretch-slim as base
RUN echo 'APT::Install-Recommends "0";' >> /etc/apt/apt.conf && \
    echo 'APT::Install-Suggests "0";' >> /etc/apt/apt.conf

FROM base as node-builder
RUN apt-get update && apt-get -y install \
    wget gnupg ca-certificates apt-transport-https
RUN echo "deb https://deb.nodesource.com/node_9.x stretch main" > /etc/apt/sources.list.d/nodesource.list
RUN wget https://deb.nodesource.com/gpgkey/nodesource.gpg.key -O - | apt-key add -
RUN apt-get update && apt-get -y install nodejs
COPY . /sources
RUN cd /sources/frontend && npm install && npm run build

FROM base as wheel-builder
RUN apt-get update && apt-get -y install python-pip python-all-dev libgecode-dev g++ python-wheel
RUN pip wheel --no-cache-dir --no-deps --wheel-dir /wheels rql

FROM base
RUN apt-get update && apt-get -y install python-pip
COPY --from=wheel-builder /wheels /wheels
COPY . /sources
RUN pip install --no-cache-dir --find-links /wheels /sources
# a test to be sure that rql is built with gecode extension
RUN python -c 'import rql.rql_solve'
COPY --from=node-builder /sources/frontend/bundles /app/bundles

This reduced the image size by 297MB.

So the final image size went from 1.3GB to 300MB, which is much more suitable for use in a production environment.

Unfortunately I didn't find a way to copy packages from a staging image, install them and remove them in a single Docker layer; a possible workaround is to use an intermediate HTTP server.
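A rough sketch of that workaround, under the assumption that a helper container serving the wheels is reachable from the build (the wheel-server hostname and port are hypothetical): serve the /wheels directory over HTTP, then download, install and discard the wheels within a single layer so they never land in the final image:

```dockerfile
# Hypothetical: wheel-server serves the wheels built in the staging
# image, e.g. started from it with `python -m SimpleHTTPServer 8000`
# run in /wheels. Downloaded wheels are installed and discarded
# within this single RUN layer.
RUN pip install --no-cache-dir --find-links http://wheel-server:8000/ rql
```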

The next step could be to build a Debian package inside a staging image, so that the build process would be separated from the Dockerfile and we could provide Debian packages alongside the Docker image.

Another approach could be to use Alpine as the base image instead of Debian.
