Commit 064e83be authored by Frere, Jonathan (FWCC) - 142176

Finish the text off for part 2

parent 580a3f47
......@@ -48,6 +48,112 @@ RUN pip install -r requirements.txt
ENTRYPOINT [ "python3", "" ]
# Building Our Example Project
First let's figure out how to turn this Dockerfile into a container that we can run.
To begin, you'll need the code --
you can find it in this repository, so you can clone it and follow along.
With the code in place, the first step to getting it ready to run is `docker build`.
To build an image, you need a Dockerfile, a name for the image, and a context.
The Dockerfile is what tells Docker how to build the image,
the name is what Docker will use to reference this image later (e.g. `python` or `hello-world`),
and the context is the set of files from your file system that Docker will have access to when it tries to build the project.
The context is usually the project directory, which is typically also the directory that the build command is run from.
Likewise, by convention, a Dockerfile is generally called `Dockerfile` (with no extension),
and lives in the project's root directory.
If this isn't the case, you can pass the `-f` flag to `docker build` to specify where the Dockerfile is located.
The name is given with the `-t` flag, along with any tag that you want to provide (as always, the tag defaults to `:latest`).
The `-t` flag can be provided multiple times, so you can tag one build with multiple tags,
for example if your current build should carry both the `latest` tag and a fixed tag for this release version.
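As a rough illustration of how these pieces fit together, the following Python sketch assembles (but does not run) a `docker build` command line. The helper names `with_default_tag` and `build_command` and the `1.0.0` version tag are hypothetical, and the tag handling is deliberately simplified: it assumes names without digests.

```python
from typing import List, Optional, Sequence


def with_default_tag(name: str) -> str:
    """Append ':latest' when the image name carries no explicit tag."""
    # Only look after the last '/', so that a registry port such as
    # 'localhost:5000/my-analyser' is not mistaken for a tag.
    last_component = name.rsplit("/", 1)[-1]
    return name if ":" in last_component else f"{name}:latest"


def build_command(
    names: Sequence[str],
    context: str = ".",
    dockerfile: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for `docker build` without running it."""
    argv = ["docker", "build"]
    if dockerfile is not None:
        # -f points Docker at a Dockerfile outside the default location.
        argv += ["-f", dockerfile]
    for name in names:
        # -t may be repeated: one flag per name (and optional tag).
        argv += ["-t", with_default_tag(name)]
    # The build context always comes last.
    argv.append(context)
    return argv


# '1.0.0' is a made-up release tag, used purely for illustration.
print(" ".join(build_command(["my-analyser", "my-analyser:1.0.0"])))
# prints: docker build -t my-analyser:latest -t my-analyser:1.0.0 .
```

The resulting list could be handed to `subprocess.run` to actually perform the build, although typing the command directly in a shell is usually simpler.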
Having cloned the example repository, you can run this build process like this:
$ # builds the file at ./Dockerfile, with the current working directory as the context,
$ # with the name `my-analyser`.
$ docker build -t my-analyser .
Sending build context to Docker daemon 20.48kB
Step 1/5 : FROM python:3.8.5
3.8.5: Pulling from library/python
d6ff36c9ec48: Pull complete
c958d65b3090: Pull complete
edaf0a6b092f: Pull complete
80931cf68816: Pull complete
7dc5581457b1: Pull complete
87013dc371d5: Pull complete
dbb5b2d86fe3: Pull complete
4cb6f1e38c2d: Pull complete
0b3d7b2fc317: Pull complete
Digest: sha256:4c62d8c5ef331e485143c7a664fd6deeea4595ac17008ef5c10cc470d259e39f
Status: Downloaded newer image for python:3.8.5
---> 62aa40094bb1
Step 2/5 : WORKDIR /opt/my-project
Removing intermediate container 3e718c528a63
---> f6845bcf9e20
Step 3/5 : COPY . /opt/my-project
---> 8977a9a29d1c
Step 4/5 : RUN pip install -r requirements.txt
---> Running in 8da06d6427d0
Collecting numpy==1.19.1
Downloading numpy-1.19.1-cp38-cp38-manylinux2010_x86_64.whl (14.5 MB)
Collecting click==7.1.2
Downloading click-7.1.2-py2.py3-none-any.whl (82 kB)
Installing collected packages: numpy, click
Successfully installed click-7.1.2 numpy-1.19.1
Removing intermediate container 8da06d6427d0
---> ba22084bd57e
Step 5/5 : ENTRYPOINT [ "python3", "" ]
---> Running in d1c9dc9bc09f
Removing intermediate container d1c9dc9bc09f
---> d12d76ae371b
Successfully built d12d76ae371b
Successfully tagged my-analyser:latest
There are a few things to notice here.
Firstly, Docker sends the build context (that's the `.` part) to the Docker daemon.
We'll discuss the role of the Docker daemon a bit in the next post, but for now, the daemon is the process that actually does the work here.
After that, we start going through the steps defined in the Dockerfile
(you'll notice the five steps each match up to the five commands).
We'll go through what each command is actually doing in a moment,
although it might be interesting to get an idea for what each line is doing before reading onwards.
Before we explore the individual commands, however, we should figure out how to actually run the image we've just built.
The Python script that we're running is a fairly simple one --
it has two commands, one to tell us how many items of data we've got, and another to give us the average values from that data.
We can run it like this:
$ docker run my-analyser
--help Show this message and exit.
$ docker run my-analyser count-datapoints
My Custom Application
datapoint count = 100
$ docker run my-analyser analyse-data
My Custom Application
height = 1.707529904338
height = 76.956408654431
This is very similar to the `hello-world` container that we ran,
except without any need to download anything (because the image has already been built on our system).
We'll look at transferring the container to other computers in the next post,
but, in principle, this is all we need to do to get a completely self-sufficient container containing all the code
that we need to run our project.
For now, let's go through the Dockerfile step-by-step and clarify what each command given there is doing.
# Step-by-step Through The Dockerfile
The first thing **(1)** a Dockerfile needs is a parent image.
In our case, we're using one of the pre-built Python images.
This is an official image provided by Docker that starts with a basic Debian Linux installation,
......@@ -100,12 +206,67 @@ Part of the idea of Docker is that each Docker container does one thing, and it
(You might recognise the UNIX philosophy here.)
As a result, a Docker container should generally contain one application,
and only the dependencies that that application needs to run.
The `ENTRYPOINT` command, along with the `CMD` command, tells Docker which application should run.
The difference between `ENTRYPOINT` and `CMD` is a bit subtle, but it roughly comes down to how you use the `docker run` command.
When we ran it in the previous post, we generally used the default commands set by the containers --
for `hello-world`, the default command was the executable that printed out the welcome message,
while in `python`, the default command was the Python REPL.
However, it's possible to override this command from the `docker run` command.
For example, we can run the Python container to jump straight into a bash shell, skipping the Python process completely:
$ docker run -it python:3.8.5 bash # note the addition of 'bash' here to specify a different command to run
This ability to replace the default command comes from using `CMD`.
In the Python Dockerfile, there is a line that looks like `CMD python`, which essentially tells Docker
"if nobody has a better plan, just run the Python executable".
On the other hand, the arguments to `ENTRYPOINT` will just be put before whatever this command ends up being.
(It is possible to override this as well, but it's not as common.)
For example, consider the following Dockerfile:
FROM ubuntu:20.04
# using `echo` allows us to "debug" what arguments get
# passed to the ENTRYPOINT command
ENTRYPOINT [ "echo" ]
# this command can be overridden
CMD [ "Hello, World" ]
When we run this container, we get the following options:
$ docker run echotest # should print the default CMD value
Hello, World
$ docker run echotest override arguments # should print the overridden arguments
override arguments
$ docker run -it --entrypoint bash echotest # overrides the entrypoint
As a rule, I would recommend using `ENTRYPOINT` when building a container for a custom application,
and `CMD` when you're building a container that you expect to be a base layer,
or an environment in which you expect people to run a lot of other commands.
In our case, using `ENTRYPOINT` allows us to add subcommands to the `` script that can be run easily from the command line,
as demonstrated in the opening examples.
If we'd used `CMD` instead of `ENTRYPOINT`,
then running `docker run my-analyser count-datapoints` would have just tried to run the `count-datapoints` command in the system,
which doesn't exist, and would have caused an error.
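The rules described in this section can be summarised in a small Python model. This is only a sketch of the behaviour, not how Docker is implemented: the function name `resolve_argv` is made up for illustration, `<script>` is a placeholder for the script name, and the model assumes that both `ENTRYPOINT` and `CMD` are written in the JSON-array form used in this post (the shell form behaves differently).

```python
from typing import List, Optional, Sequence


def resolve_argv(
    entrypoint: Sequence[str],
    cmd: Sequence[str],
    run_args: Sequence[str] = (),
    entrypoint_override: Optional[str] = None,
) -> List[str]:
    """Model which process a container starts for a given `docker run`."""
    if entrypoint_override is not None:
        # --entrypoint replaces ENTRYPOINT, and the image's default CMD
        # is no longer applied; only the arguments given to `docker run`
        # are passed on.
        return [entrypoint_override, *run_args]
    # Arguments given to `docker run` replace CMD; otherwise CMD is used.
    trailing = list(run_args) if run_args else list(cmd)
    # ENTRYPOINT, if any, is always put in front.
    return [*entrypoint, *trailing]


# The `echotest` image: ENTRYPOINT [ "echo" ], CMD [ "Hello, World" ]
print(resolve_argv(["echo"], ["Hello, World"]))
# prints: ['echo', 'Hello, World']
print(resolve_argv(["echo"], ["Hello, World"], ["override", "arguments"]))
# prints: ['echo', 'override', 'arguments']
print(resolve_argv(["echo"], ["Hello, World"], entrypoint_override="bash"))
# prints: ['bash']

# A hypothetical variant of `my-analyser` that used CMD instead of ENTRYPOINT:
print(resolve_argv([], ["python3", "<script>"], ["count-datapoints"]))
# prints: ['count-datapoints']
```

In the last case, nothing is left of the original `python3` invocation, which is why Docker would go looking for an executable called `count-datapoints`.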
# Next: Part 3 -- Practical Applications in Science
In this second of three parts, we've looked at an example project with an example Dockerfile.
We explored how to build and run this Dockerfile,
and we explored some of the most important commands needed to set up the Dockerfile for a project.
In the final part, I want to explore some of the different ways that someone might use Docker as part of research.
For example,
how to distribute Docker containers to other places,
how to run Docker containers on HPC systems,
how to build Docker images via Continuous Integration,
and where else you might see Docker being used.