TechEarl

How to Write a Dockerfile (FROM, COPY, RUN, CMD, ENTRYPOINT)

A Dockerfile that actually builds, line by line: the instructions, the order that controls layer caching, the difference between CMD and ENTRYPOINT, and the small habits that keep builds fast and images small.

By Ishan Karunaratne · 10 min read

A Dockerfile is the recipe for an image. Nine instructions cover almost every real-world Dockerfile you will write or read: FROM, WORKDIR, COPY, RUN, ENV, EXPOSE, USER, CMD, ENTRYPOINT. This article is the working walkthrough of those: what each does, what order they belong in, the difference between CMD and ENTRYPOINT that catches people, and the layer-cache rule that decides whether your build takes 5 seconds or 5 minutes.

For optimizing the result further once it works, see Docker Image Size Optimization and the per-language guides (Node.js, Python, Next.js, Go, PHP).

How do I write a Dockerfile?

A working Dockerfile has five parts. First, pick a base image with FROM. Second, set a working directory inside the image with WORKDIR. Third, copy in only the dependency manifest (package.json, requirements.txt, go.mod) and install dependencies with RUN. Keep this step separate from the source copy, because that order is what keeps the dependency layer cached between builds. Fourth, copy the rest of the source with COPY. Fifth, declare what runs when the container starts with CMD (or ENTRYPOINT for fixed binaries with overridable args). The whole thing for a Node app fits in eight lines:

dockerfile
FROM node:22-alpine
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
COPY . .
EXPOSE 3000
USER node
CMD ["node", "server.js"]

The rest of the article is what each of those lines is doing and where the variations come in.


FROM — pick the base image

dockerfile
FROM node:22-alpine

FROM declares the base image. The tag matters: node:22-alpine is Node 22 on Alpine (small, musl libc), node:22-slim is Debian-slim (small, glibc, native modules work cleanly), node:22 is the full Debian image (bigger, includes more build tools). Pinning to a specific major version (22) is the minimum sane choice; pinning to a specific point release (22.7.0) is more deterministic.

FROM scratch is the empty base — used for compiled static binaries (Go, Rust) where you do not need a userland at all. The result is a 5-15 MB image instead of 200 MB.

Multi-stage builds chain FROM lines:

dockerfile
FROM node:22 AS build
# ... build steps ...

FROM node:22-alpine
COPY --from=build /app/dist /app

The final image is the last FROM plus anything copied in from the previous stages. Everything in the build stages is discarded. This is how you get small production images even when the build needs heavy tooling. Full treatment in Docker Image Size Optimization.
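As a rough sketch of how multi-stage builds and FROM scratch combine, here is what a two-stage build for a Go program might look like. The golang:1.22 tag, the /src and /out paths, the binary name app, and the presence of a go.sum file are all assumptions for illustration, not something this article prescribes.

```dockerfile
# Build stage: full Go toolchain (the golang:1.22 tag is an assumption; pin your own)
FROM golang:1.22 AS build
WORKDIR /src
# Dependency manifest first, so the download layer stays cached
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# CGO_ENABLED=0 produces a statically linked binary that needs no libc
RUN CGO_ENABLED=0 go build -o /out/app .

# Final stage: empty base, only the compiled binary is copied in
FROM scratch
COPY --from=build /out/app /app
ENTRYPOINT ["/app"]
```

Because scratch contains nothing at all, there is no shell and no CA certificate bundle in the final image; a program that makes HTTPS calls would need the certificates copied in from the build stage as well.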

WORKDIR, COPY, ADD

dockerfile
WORKDIR /app
COPY package.json package-lock.json ./
COPY . .

WORKDIR sets the current directory inside the image for every following instruction. It also creates the directory if it does not exist. Always set one explicitly; defaulting to / ends up dropping files into root.

COPY src dst copies from the build context (your project directory on the host) into the image. A relative dst is resolved against WORKDIR; an absolute one is used as-is. Use the dependency-first pattern: COPY package.json package-lock.json ./, then RUN npm ci, then COPY . . so that the dependency layer stays cached when only your source changes. See the cache rule section below.

ADD is COPY plus two extra behaviors: it can fetch from URLs and it auto-extracts local tar archives (archives fetched from a URL are not extracted). Both behaviors are usually surprising rather than helpful. Use COPY unless you specifically want tar extraction.
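To make the difference concrete, the fragment below contrasts the two instructions on the same local archive. The file name vendor.tar.gz and the /opt paths are hypothetical.

```dockerfile
# COPY: the archive lands in the image unchanged, as /opt/archive/vendor.tar.gz
COPY vendor.tar.gz /opt/archive/

# ADD: a local tar archive is unpacked, so /opt/vendor/ receives its contents
ADD vendor.tar.gz /opt/vendor/
```

If extraction is what you want, ADD saves a RUN tar step; otherwise COPY keeps the behavior obvious to the next reader.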

.dockerignore controls what COPY . . actually copies. Without it, you ship node_modules, .git, .env, build artifacts, and the rest. Full pattern in .dockerignore Best Practices.

RUN — install things, and the layer-cache rule

dockerfile
RUN apt-get update && apt-get install -y --no-install-recommends \
      curl ca-certificates && \
    rm -rf /var/lib/apt/lists/*

RUN executes a shell command at build time and bakes the result into a new layer. Three rules turn RUN from "slow nightmare" into "5-second cached rebuild":

1. Chain related commands with &&. Each RUN is one layer. Putting apt-get update and apt-get install in separate RUNs means the update layer can be served from cache with a stale package index while a changed install line runs against it, producing weird Unable to locate package failures on rebuild.

2. Clean up in the same RUN. Removing files in a later RUN does not shrink the earlier layer; the file is still there in the previous layer's tarball. The rm -rf /var/lib/apt/lists/* belongs in the same RUN as the install.

3. The layer cache invalidates from the first change downward. If line 7 of your Dockerfile changes, lines 7+ rebuild and everything before stays cached. So order matters: put the slowest, rarely-changing layers (system packages, dependency install) at the top, and the often-changing layers (your source code) at the bottom. The "copy package.json, install deps, then copy source" pattern is exactly this principle applied to Node/Python/Ruby/PHP.

dockerfile
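# Order that keeps deps cached when only source changes
COPY package.json package-lock.json ./
# Cached unless package.json or package-lock.json changes
RUN npm ci
# Invalidated whenever any copied source file changes
# (comments need their own line: a trailing # after COPY is read as more arguments)
COPY . .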

RUN defaults to shell form (/bin/sh -c "..."). Exec form (RUN ["npm", "ci"]) skips the shell. Shell form is more common; exec form is occasionally useful when the image has no shell.

ENV and ARG

dockerfile
ARG NODE_ENV=production
ENV NODE_ENV=$NODE_ENV
ENV PORT=3000

ENV sets an environment variable that exists at runtime inside the container. Anything you would normally set with docker run -e can also be baked into the image with ENV. Keep secrets out of ENV — anyone who pulls the image can read them with docker image inspect.

ARG is build-time only: it is available to the instructions that follow it during docker build and is gone at runtime. Pass values with docker build --build-arg KEY=VALUE. Critically, ARG values end up in the image's build history, so they are not a place for secrets either. For actual secrets, use a BuildKit secret mount (RUN --mount=type=secret, supplied with docker build --secret).
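A minimal sketch of a BuildKit secret mount, assuming the secret is an .npmrc file holding a private registry token. The secret id npmrc and the target path /root/.npmrc are illustrative choices, not fixed names.

```dockerfile
# syntax=docker/dockerfile:1
FROM node:22-alpine
WORKDIR /app
COPY package.json package-lock.json ./
# The secret file exists only while this RUN executes and is not written to any layer
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc npm ci --omit=dev
```

The matching build command would be along the lines of docker build --secret id=npmrc,src=$HOME/.npmrc -t my-app . where the src path is wherever the file lives on your machine.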

Full breakdown of when to use each (and the order-of-precedence rules in Compose) is in Docker Environment Variables.

EXPOSE

dockerfile
EXPOSE 3000

EXPOSE is documentation. It declares which port the container listens on. It does not publish the port to the host; only docker run -p does that (or -P, which publishes every EXPOSEd port on a random host port). Tools like docker inspect and Compose can read EXPOSE as metadata, but at runtime it is essentially a comment.

You can omit it and everything still works. I include it because it documents the contract.

USER — drop root

dockerfile
USER node

By default, processes inside containers run as root. Unless user-namespace remapping is enabled, that root is UID 0 on the host as well; if a process escapes the container (rare but possible), it has root privileges on the host. Dropping to a non-root user shrinks that blast radius. Most official images ship a non-root user ready for this: Node has node, Postgres has postgres, Nginx has nginx.

The catch: anything that runs after the USER line can no longer write to root-owned paths inside the image. Set permissions on directories you need to write to (/app, /data) before the USER switch:

dockerfile
WORKDIR /app
COPY --chown=node:node . .
USER node

The --chown flag on COPY sets ownership at copy time, avoiding a separate RUN chown layer. Full security-baseline article: Running Docker Containers as Non-Root.
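COPY --chown only covers the files being copied. For a directory the app writes to at runtime, one possible approach is to create it and hand it over while the build is still running as root. The /data path here is a hypothetical example.

```dockerfile
FROM node:22-alpine
WORKDIR /app
# Still root at this point: create the writable directory and give it to the node user
RUN mkdir -p /data && chown node:node /data
COPY --chown=node:node . .
# Everything from here on, including the running container, is the non-root user
USER node
```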

CMD vs ENTRYPOINT

This is the one that catches people. Both define what runs when the container starts; they interact in a specific way.

CMD sets the default command. It is overridable at run time by passing a new command:

dockerfile
CMD ["node", "server.js"]
bash
docker run my-image                  # runs node server.js
docker run -it my-image sh           # overrides CMD; runs sh instead (Alpine-based images have no bash)

ENTRYPOINT sets the executable. It is not overridable at run time without the --entrypoint flag:

dockerfile
ENTRYPOINT ["node", "server.js"]
bash
docker run my-image                      # runs node server.js
docker run my-image sh                   # runs: node server.js sh (sh becomes an argument, not a shell)
docker run -it --entrypoint sh my-image  # override the entrypoint to get a shell in a new container

Used together, ENTRYPOINT is the binary and CMD is the default arguments:

dockerfile
ENTRYPOINT ["node"]
CMD ["server.js"]
bash
docker run my-image                  # runs node server.js
docker run my-image worker.js        # runs node worker.js (CMD is overridden, ENTRYPOINT stays)

Rule of thumb: CMD alone for apps where you might want a shell or different command later (most web apps). ENTRYPOINT + CMD for CLI-like images where the image is a binary (docker run ffmpeg-image -i input.mp4 output.mp4 style — the image always runs ffmpeg, you supply the args).

Use the exec form (the JSON array syntax: ["a", "b"]) for both, not the shell form. Exec form makes signal handling work correctly; shell form wraps your process in /bin/sh -c and signals like SIGTERM get eaten by the shell instead of reaching your app. That is why graceful shutdowns don't fire and docker stop waits the full 10 seconds before killing.
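The two forms look almost identical in the file, which is why the difference is easy to miss. A side-by-side sketch, with the shell form commented out so the fragment stays valid:

```dockerfile
# Shell form: Docker runs /bin/sh -c "node server.js", so sh is PID 1
# and node is its child, which is the setup that leaves SIGTERM undelivered
# CMD node server.js

# Exec form: node itself is PID 1 and receives SIGTERM directly from docker stop
CMD ["node", "server.js"]
```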

HEALTHCHECK

dockerfile
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD curl -f http://localhost:3000/healthz || exit 1

Tells Docker how to check whether the app inside is healthy. docker ps then shows (healthy) or (unhealthy) next to the container, and Compose can wait for service_healthy before starting dependent services. Useful in Compose stacks; less useful in one-off docker run. Full picture in Docker Restart Policies and Health Checks.

A complete example

A working Dockerfile for a Node.js Express app, using a multi-stage build to keep the final image small:

dockerfile
# Build stage
FROM node:22 AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# Runtime stage
FROM node:22-alpine
WORKDIR /app
ENV NODE_ENV=production
COPY --from=build --chown=node:node /app/package.json /app/package-lock.json ./
RUN npm ci --omit=dev
COPY --from=build --chown=node:node /app/dist ./dist
USER node
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD wget -q -O - http://localhost:3000/healthz || exit 1
CMD ["node", "dist/server.js"]

Build and run:

bash
docker build -t my-app .
docker run -d --name my-app -p 3000:3000 my-app

For app-specific Dockerfile patterns: Node.js, Python, Next.js, Go, PHP / Laravel, static sites.

Common pitfalls

  • No .dockerignore, so COPY . . ships node_modules / .git / build artifacts. Add a .dockerignore (best practices). Builds get faster, images get smaller, secrets stay out.
  • Source copied before dependency install. Every source edit invalidates the dependency layer; every build re-installs everything. Copy the manifest first, install, then copy source.
  • Cleanup in a separate RUN from the install. The files are still in the earlier layer; the image is the same size. Combine into one RUN with &&.
  • Shell form for CMD/ENTRYPOINT. Signals do not reach your app; docker stop always waits the full timeout. Use exec form (JSON array).
  • latest tag in FROM. Non-deterministic builds. Pin to a major or specific version (node:22-alpine not node:latest).
  • Running as root for no reason. Add a USER line. Most official images ship a non-root user ready to use.
  • ADD for plain files. Use COPY. Save ADD for when you actually want auto-extract of a tar archive.

