How to Write a Dockerfile (FROM, COPY, RUN, CMD, ENTRYPOINT)

A Dockerfile is the recipe for an image. Nine instructions cover almost every real-world Dockerfile you will write or read: FROM, WORKDIR, COPY, RUN, ENV, EXPOSE, USER, CMD, and ENTRYPOINT. This article is a working walkthrough of them: what each does, what order they belong in, the difference between CMD and ENTRYPOINT that catches people, and the layer-cache rule that decides whether your build takes 5 seconds or 5 minutes.

For optimizing the result further once it works, see Docker Image Size Optimization and the per-language guides (Node.js, Python, Next.js, Go, PHP).

How do I write a Dockerfile?

A working Dockerfile has five parts. First, pick a base image with FROM. Second, set a working directory inside the image with WORKDIR. Third, copy in only the dependency manifest (package.json, requirements.txt, go.mod) and install dependencies with RUN. Keep this step separate from the source copy, because that order is what keeps the dependency layer cached between builds. Fourth, copy the rest of the source with COPY. Fifth, declare what runs when the container starts with CMD (or ENTRYPOINT for fixed binaries with overridable args). The whole thing for a Node app fits in eight lines:

dockerfile

FROM node:22-alpine
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
COPY . .
EXPOSE 3000
USER node
CMD ["node", "server.js"]

The rest of the article is what each of those lines is doing and where the variations come in.

Jump to:

FROM — pick the base image
WORKDIR, COPY, ADD
RUN — install things, and the layer-cache rule
ENV and ARG
EXPOSE
USER — drop root
CMD vs ENTRYPOINT
HEALTHCHECK
A complete example
Common pitfalls
FAQ

FROM — pick the base image

dockerfile

FROM node:22-alpine

FROM declares the base image. The tag matters: node:22-alpine is Node 22 on Alpine (small, musl libc), node:22-slim is Debian-slim (small, glibc, native modules work cleanly), node:22 is the full Debian image (bigger, includes more build tools). Pinning to a specific major version (22) is the minimum sane choice; pinning to a specific point release (22.7.0) is more deterministic.

FROM scratch is the empty base — used for compiled static binaries (Go, Rust) where you do not need a userland at all. The result is a 5-15 MB image instead of 200 MB.

Multi-stage builds chain FROM lines:

dockerfile

FROM node:22 AS build
# ... build steps ...

FROM node:22-alpine
COPY --from=build /app/dist /app

The final image is the last FROM plus anything copied in from the previous stages. Everything in the build stages is discarded. This is how you get small production images even when the build needs heavy tooling. Full treatment in Docker Image Size Optimization.
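
As a rough illustration of the FROM scratch case, the sketch below builds a static Go binary in one stage and copies only that binary into an empty final image. It assumes a Go module with go.mod and go.sum at the project root and a main package in the same directory; the golang:1.22 tag and the /out/app path are illustrative choices, not requirements.

```dockerfile
# Build stage: full Go toolchain (the tag is an illustrative assumption)
FROM golang:1.22 AS build
WORKDIR /src
# Dependency manifests first, so module downloads stay cached
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# CGO_ENABLED=0 produces a statically linked binary that needs no libc
RUN CGO_ENABLED=0 go build -o /out/app .

# Final stage: empty base, only the binary is copied in
FROM scratch
COPY --from=build /out/app /app
ENTRYPOINT ["/app"]
```

Because scratch contains no shell and no CA certificates, docker run --entrypoint sh will not work on such an image, and a program that makes outbound TLS calls would also need a certificate bundle copied in from the build stage.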

WORKDIR, COPY, ADD

dockerfile

WORKDIR /app
COPY package.json package-lock.json ./
COPY . .

WORKDIR sets the current directory inside the image for every following instruction. It also creates the directory if it does not exist. Always set one explicitly; defaulting to / ends up dropping files into root.

COPY src dst copies from the build context (your project directory on the host) into the image. A relative dst is resolved against WORKDIR; an absolute dst is used as-is. Use the dependency-first pattern: COPY package.json package-lock.json ./, then RUN npm ci, then COPY . . so that the dependency layer stays cached when only your source changes. See the cache rule section below.

ADD is COPY plus two extra behaviors: it can fetch from URLs, and it auto-extracts local tar archives (archives fetched from a URL are not extracted). Both behaviors are usually surprising rather than helpful. Use COPY unless you specifically want tar extraction.
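
The difference between the two is easiest to see with the same archive passed to both. This is a small sketch, assuming a hypothetical vendor.tar.gz sitting in the build context; the directory names are made up for the example.

```dockerfile
FROM debian:bookworm-slim
WORKDIR /opt

# vendor.tar.gz is a hypothetical archive in the build context.
# COPY keeps it as a single file: /opt/archive/vendor.tar.gz
COPY vendor.tar.gz archive/

# ADD unpacks the same local archive into /opt/unpacked/
ADD vendor.tar.gz unpacked/
```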

.dockerignore controls what COPY . . actually copies. Without it, you ship node_modules, .git, .env, build artifacts, and the rest. Full pattern in .dockerignore Best Practices.

RUN — install things, and the layer-cache rule

dockerfile

RUN apt-get update && apt-get install -y --no-install-recommends \
      curl ca-certificates && \
    rm -rf /var/lib/apt/lists/*

RUN executes a shell command at build time and bakes the result into a new layer. Three rules turn RUN from "slow nightmare" into "5-second cached rebuild":

1. Chain related commands with &&. Each RUN is one layer. Putting apt-get update and apt-get install in separate RUNs means the update layer can stay cached with a stale package index while the install runs again, producing Unable to locate package or 404 failures on rebuild.

2. Clean up in the same RUN. Removing files in a later RUN does not shrink the earlier layer; the file is still there in the previous layer's tarball. The rm -rf /var/lib/apt/lists/* belongs in the same RUN as the install.

3. The layer cache invalidates from the first change downward. If line 7 of your Dockerfile changes, lines 7+ rebuild and everything before stays cached. So order matters: put the slowest, rarely-changing layers (system packages, dependency install) at the top, and the often-changing layers (your source code) at the bottom. The "copy package.json, install deps, then copy source" pattern is exactly this principle applied to Node/Python/Ruby/PHP.

dockerfile

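# Order that keeps deps cached when only source changes.
# Comments go on their own lines: a trailing "# ..." after COPY
# would be parsed as extra arguments.
COPY package.json package-lock.json ./
# Cached unless package.json or package-lock.json changes
RUN npm ci
# Invalidated whenever any source file changes
COPY . .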
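
Rules 1 and 2 are easier to judge side by side. The sketch below puts both variants in one file as two stages, so each can be built and compared with docker image history; the stage names wasteful and lean are made up for this example, and debian:bookworm-slim is just one base that uses apt.

```dockerfile
# Build one variant at a time, for example:
#   docker build --target wasteful -t demo:wasteful .
#   docker build --target lean -t demo:lean .

# Three layers: the package index deleted in the third RUN is still
# stored in the first layer, so the image does not get smaller.
FROM debian:bookworm-slim AS wasteful
RUN apt-get update
RUN apt-get install -y --no-install-recommends curl ca-certificates
RUN rm -rf /var/lib/apt/lists/*

# One layer: update, install, and cleanup happen together, so the
# package index never ends up in any layer.
FROM debian:bookworm-slim AS lean
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl ca-certificates && \
    rm -rf /var/lib/apt/lists/*
```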

RUN defaults to shell form (/bin/sh -c "..."). Exec form (RUN ["npm", "ci"]) skips the shell. Shell form is more common; exec form is occasionally useful when the image has no shell.

ENV and ARG

dockerfile

ARG NODE_ENV=production
ENV NODE_ENV=$NODE_ENV
ENV PORT=3000

ENV sets an environment variable that exists at runtime inside the container. Anything you would normally set with docker run -e can also be baked into the image with ENV. Keep secrets out of ENV — anyone who pulls the image can read them with docker image inspect.

ARG is build-time only: it is available to subsequent RUN, COPY, and other instructions during docker build, and gone at runtime. Pass values with docker build --build-arg KEY=VALUE. Critically, ARG values end up in the image's build history (visible with docker history), so they are not a place for secrets either. Use BuildKit's --secret mount for actual secrets.

Full breakdown of when to use each (and the order-of-precedence rules in Compose) is in Docker Environment Variables.
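
For the secret-mount case, one possible shape is sketched below. It assumes a private npm registry whose credentials live in an .npmrc file on the build machine; the secret id npmrc is an arbitrary name chosen for this example.

```dockerfile
# syntax=docker/dockerfile:1
FROM node:22-alpine
WORKDIR /app
COPY package.json package-lock.json ./
# The secret is mounted at /root/.npmrc for this RUN only and is not
# written to any layer. The id "npmrc" is an arbitrary name chosen here.
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc \
    npm ci --omit=dev
COPY . .
CMD ["node", "server.js"]
```

The matching build command would be docker build --secret id=npmrc,src=$HOME/.npmrc -t my-app . with the src path adjusted to wherever the credentials file actually lives.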

EXPOSE

dockerfile

EXPOSE 3000

EXPOSE is documentation. It declares which port the container listens on. It does not publish the port to the host; docker run -p (or a ports entry in Compose) does that. The one place it has an effect is docker run -P, which publishes every EXPOSEd port on a random host port. The declared ports also show up in docker inspect for tooling to read, but at runtime EXPOSE is essentially a comment.

You can omit it and everything still works. I include it because it documents the contract.

USER — drop root

dockerfile

USER node

By default, processes inside containers run as root. Unless user-namespace remapping or rootless mode is enabled, that is the same UID 0 as the host's root; if a process escapes the container (rare but possible), it owns the host. Dropping to a non-root user shrinks that blast radius. Most official images ship a non-root user ready for this: Node has node, Postgres has postgres, Nginx has nginx.

The catch: anything that runs after the USER line, at build time or in the container, can no longer write to root-owned paths inside the image. Set ownership or permissions on directories you need to write to (/app, /data) before the USER switch:

dockerfile

WORKDIR /app
COPY --chown=node:node . .
USER node

The --chown flag on COPY sets ownership at copy time, avoiding a separate RUN chown layer. Full security-baseline article: Running Docker Containers as Non-Root.
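
When the base image does not ship a non-root user, the same idea needs one extra step. The following is a minimal sketch on a bare Debian base; appuser is a hypothetical name, and /data stands in for whatever directory the app writes to.

```dockerfile
FROM debian:bookworm-slim
# appuser is a hypothetical name. While still root, create the user
# and hand over the directories the app must be able to write to.
RUN useradd --create-home appuser && \
    mkdir -p /app /data && \
    chown appuser:appuser /app /data
WORKDIR /app
COPY --chown=appuser:appuser . .
# Every later build step and the container process run as appuser
USER appuser
```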

CMD vs ENTRYPOINT

This is the one that catches people. Both define what runs when the container starts; they interact in a specific way.

CMD sets the default command. It is overridable at run time by passing a new command:

dockerfile

CMD ["node", "server.js"]

bash

docker run my-image                  # runs node server.js
docker run my-image bash             # overrides CMD; runs bash instead

ENTRYPOINT sets the executable. It is not overridable at run time without the --entrypoint flag:

dockerfile

ENTRYPOINT ["node", "server.js"]

bash

docker run my-image                  # runs node server.js
docker run my-image bash             # runs: node server.js bash (bash becomes an argument, not a shell)
docker run --entrypoint sh my-image  # replace the entrypoint to get a shell

Used together, ENTRYPOINT is the binary and CMD is the default arguments:

dockerfile

ENTRYPOINT ["node"]
CMD ["server.js"]

bash

docker run my-image                  # runs node server.js
docker run my-image worker.js        # runs node worker.js (CMD is overridden, ENTRYPOINT stays)

Rule of thumb: CMD alone for apps where you might want a shell or different command later (most web apps). ENTRYPOINT + CMD for CLI-like images where the image is a binary (docker run ffmpeg-image -i input.mp4 output.mp4 style — the image always runs ffmpeg, you supply the args).
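
A CLI-style image along those lines might look like the sketch below. It assumes ffmpeg is installed from Alpine's package repositories; the alpine:3.20 tag and the -version default are illustrative choices.

```dockerfile
FROM alpine:3.20
# Install ffmpeg from Alpine's package repositories
RUN apk add --no-cache ffmpeg
# The image always runs ffmpeg...
ENTRYPOINT ["ffmpeg"]
# ...and these default arguments apply only when docker run passes none
CMD ["-version"]
```

Built as ffmpeg-image, docker run ffmpeg-image prints the version, while docker run ffmpeg-image -i input.mp4 output.mp4 replaces the CMD with the supplied arguments (the files themselves would still need to be made available through a volume mount).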

Use the exec form (the JSON array syntax: ["a", "b"]) for both, not the shell form. Shell form wraps your process in /bin/sh -c, so the shell runs as PID 1 and signals like SIGTERM stop at the shell instead of reaching your app. That is why graceful shutdowns don't fire and docker stop waits its full timeout (10 seconds by default) before killing the container. Exec form runs your process directly, so it receives the signal itself.
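
The two forms differ by only a few characters, which is part of why the mistake is common. A minimal sketch, assuming a server.js in the build context that registers its own SIGTERM handler:

```dockerfile
FROM node:22-alpine
WORKDIR /app
# server.js is assumed to exist in the build context
COPY server.js ./

# Shell form (avoid): Docker runs /bin/sh -c "node server.js",
# so the shell is PID 1 and node is its child.
# CMD node server.js

# Exec form: node itself is PID 1, so the SIGTERM sent by docker stop
# is delivered to it and the app's own handler can act on it.
CMD ["node", "server.js"]
```

When a shell is genuinely needed, for example to expand an environment variable, prefixing the command with exec (CMD ["sh", "-c", "exec node server.js"]) replaces the shell with the node process, so signals still arrive.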

HEALTHCHECK

dockerfile

HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD curl -f http://localhost:3000/healthz || exit 1

HEALTHCHECK tells Docker how to check whether the app inside is healthy. docker ps then shows (healthy) or (unhealthy) next to the container, and Compose can wait for the service_healthy condition (set under depends_on) before starting dependent services. The command runs inside the container, so the tool it calls (curl here) must exist in the image. Useful in Compose stacks; less useful in one-off docker run. Full picture in Docker Restart Policies and Health Checks.

A complete example

A working Dockerfile for a Node.js Express app, using a multi-stage build to keep the final image small:

dockerfile

# Build stage
FROM node:22 AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# Runtime stage
FROM node:22-alpine
WORKDIR /app
ENV NODE_ENV=production
COPY --from=build --chown=node:node /app/package.json /app/package-lock.json ./
RUN npm ci --omit=dev
COPY --from=build --chown=node:node /app/dist ./dist
USER node
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD wget -q -O - http://localhost:3000/healthz || exit 1
CMD ["node", "dist/server.js"]

Build and run:

bash

docker build -t my-app .
docker run -d --name my-app -p 3000:3000 my-app

For app-specific Dockerfile patterns: Node.js, Python, Next.js, Go, PHP / Laravel, static sites.

Common pitfalls

No .dockerignore, so COPY . . ships node_modules / .git / build artifacts. Add a .dockerignore (best practices). Builds get faster, images get smaller, secrets stay out.
Source copied before dependency install. Every source edit invalidates the dependency layer; every build re-installs everything. Copy the manifest first, install, then copy source.
Cleanup in a separate RUN from the install. The files are still in the earlier layer; the image is the same size. Combine into one RUN with &&.
Shell form for CMD/ENTRYPOINT. Signals do not reach your app; docker stop always waits the full timeout. Use exec form (JSON array).
latest tag in FROM. Non-deterministic builds. Pin to a major or specific version (node:22-alpine not node:latest).
Running as root for no reason. Add a USER line. Most official images ship a non-root user ready to use.
ADD for plain files. Use COPY. Save ADD for when you actually want auto-extract of a tar archive.

FAQ

What is the difference between RUN, CMD, and ENTRYPOINT?

RUN executes at build time and bakes the result into the image (installing packages, compiling code). CMD and ENTRYPOINT define what executes at run time when the container starts. Between those two: CMD is the default command, overridable at docker run; ENTRYPOINT is the fixed executable, only overridable with --entrypoint.

Should I use Alpine, slim, or the full image as my base?

Default to slim (Debian-slim). It's small, uses glibc so native modules work cleanly, and has good package support. Use Alpine when you want the smallest possible image and you have verified your dependencies work on musl libc. Use the full image only when you genuinely need its extra tools at runtime; most use cases don't. See Docker Image Size Optimization.

How do I make my Docker builds faster?

Order instructions from least-changing to most-changing, use .dockerignore to shrink the build context, copy dependency manifests separately from source, prefer npm ci over npm install (pip install --no-cache-dir similarly keeps pip's download cache out of the layer, which helps size more than speed), and turn on BuildKit (default in Docker 23.0+). For builds shared across machines or CI, BuildKit cache export gives you persistent caching too.

Why is my image still huge after FROM alpine?

Your dependencies are big, or your source includes large files that should have been in .dockerignore, or you ran apt install without cleaning up package lists, or you copied a build toolchain in and never removed it. Run docker image history my-app, and the size column shows which layer is fat. Multi-stage builds with a final FROM alpine or FROM scratch typically fix the "I needed build tools but only at build time" case.

What does USER node mean, and where does that user come from?

It refers to a user that exists in the image's /etc/passwd. Official images like node, postgres, and nginx all ship with a corresponding non-root user. If you are starting from a base that does not (a bare Debian image, say), create one yourself: RUN useradd -m appuser then USER appuser.

Ishan Karunaratne