Docker: Taking advantage of the build cache

Prerequisites

FROM debian
# Copy application files
COPY . /app
# Install required system packages
RUN apt-get update
RUN apt-get -y install imagemagick curl software-properties-common gnupg vim ssh
RUN curl -sL https://deb.nodesource.com/setup_10.x | bash -
RUN apt-get -y install nodejs
# Install NPM dependencies
RUN npm install --prefix /app
EXPOSE 80
CMD ["npm", "start", "--prefix", "app"]

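Docker caches the result of each build step as a layer, and a cached layer can be reused only if its own step and every step before it are unchanged. Keep this in mind when ordering instructions, so that rebuilds reuse as many existing layers as possible.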

To see how the cache works, let's simulate rebuilding your app's image after a code change. Edit the message passed to console.log in server.js and rebuild the image using the command below:

$ docker build . -t express-image:0.0.2

It takes 114.8 seconds to build the image.

With the current Dockerfile, any change to the application's code, however small, invalidates the cache for the COPY step and every step after it, so the system packages are reinstalled on each build. If you move the COPY step below the package installation, those layers can be reused:

FROM debian
- # Copy application files
- COPY . /app
# Install required system packages
RUN apt-get update
...
RUN apt-get -y install nodejs
+ # Copy application files
+ COPY . /app
# Install NPM dependencies
...

Rebuild the image using the same command. This time the system packages are not reinstalled, and the build takes 5.8 seconds instead of 114.8. The improvement is huge!

But what would happen if a single character changed in the README.md file (or in any other file that is in the repository but is not related to the application)? Because you are copying the whole directory into the image, that change would invalidate the cache again.

You should be more specific about the files you copy to make sure that you are not invalidating the cache with changes that do not affect the application.

...
# Copy application files
- COPY . /app
+ COPY package.json server.js /app/
# Install NPM dependencies
...
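Putting the steps above together, the resulting Dockerfile might look like the sketch below. It is assembled from the snippets in this article rather than taken from a tested build. Note the trailing slash on the COPY destination: when COPY is given more than one source, Docker requires the destination to be a directory ending in "/".

```dockerfile
FROM debian
# Install required system packages first: these steps rarely change,
# so their layers can be reused across rebuilds
RUN apt-get update
RUN apt-get -y install imagemagick curl software-properties-common gnupg vim ssh
RUN curl -sL https://deb.nodesource.com/setup_10.x | bash -
RUN apt-get -y install nodejs
# Copy only the files the application needs, so unrelated changes
# (for example, to README.md) do not invalidate the cache
COPY package.json server.js /app/
# Install NPM dependencies
RUN npm install --prefix /app
EXPOSE 80
CMD ["npm", "start", "--prefix", "app"]
```

With this ordering, a change to server.js only invalidates the COPY step and the steps after it. A possible further refinement, not covered above, would be to copy package.json and run npm install before copying server.js, so that code-only changes also reuse the dependency layer.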

Tip

Use "COPY" instead of "ADD" when possible. Both commands do basically the same thing, but "ADD" is more complex: it has extra features such as extracting local archives and fetching files from remote sources. "ADD" also carries a security risk: if the remote source you are using is unverified or insecure, it increases the chance of malware being injected into your image.
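As a rough illustration of the difference, the Dockerfile fragment below contrasts the two instructions. The archive name and the URL are hypothetical placeholders, and the ADD lines are left commented out so the fragment does not depend on files that do not exist.

```dockerfile
# Preferred for local files: COPY performs a plain copy and nothing else
COPY package.json server.js /app/

# ADD would additionally unpack a local tar archive into the destination.
# "vendor.tar.gz" is a hypothetical file used only for illustration.
# ADD vendor.tar.gz /app/vendor/

# ADD can also download from a URL. The content is not verified unless
# you check it yourself. The URL below is a placeholder, not a real source.
# ADD https://example.com/tool.tar.gz /tmp/
```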