TechEarl

Docker Image Size Optimization: Multi-Stage Builds, Alpine, Slim, Scratch, and Distroless

Why your Docker image is 1.5 GB and how to get it under 100 MB. Multi-stage builds, the choice between alpine and slim, when to reach for scratch or distroless, and the docker history command for finding the bloat.

Ishan Karunaratne · 8 min read
A naïve Docker image for a Node.js app is 1.5 GB. The same app, built thoughtfully, is 80 MB — about 20x smaller, faster to push, faster to pull, smaller attack surface. The techniques are not exotic: multi-stage builds, an Alpine or slim base, a strict .dockerignore, and using docker history to find the layer that's bigger than it should be.

Why image size matters

  • Push and pull time. A 1.5 GB image on a 50 Mbps connection takes 4 minutes per pull. On a CI host that pulls 50 times a day, that's 200 minutes. Cut the image to 80 MB and each pull takes about 13 seconds, or roughly 11 minutes a day.
  • Disk space. A production host running 30 services at 1 GB per image needs 30 GB of disk just to hold them. The same services at 100 MB each need 3 GB.
  • Cold-start latency. On serverless runtimes (Cloud Run, Fargate, Lambda containers), a cold start often has to pull the image first. A 1.5 GB image can take 30-60 seconds to cold-start; an 80 MB image, 2-3 seconds.
  • Attack surface. Every binary in the image is a potential exploit. A scratch or distroless image has no shell, no package manager, and no curl, so an attacker who gets in has far less to work with. Smaller usually means simpler, and simpler usually means safer.
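
The pull-time figures are simple arithmetic: image size in megabits divided by link speed. A minimal sketch, assuming the network link is the only bottleneck (no registry throttling, no layer caching); pull_seconds is a hypothetical helper name, not a Docker command:

```shell
# pull_seconds SIZE_MB LINK_MBPS
# Rough transfer time in whole seconds: megabytes -> megabits, divided by link speed.
# Integer arithmetic, so the result rounds down.
pull_seconds() {
  echo $(( $1 * 8 / $2 ))
}

pull_seconds 1500 50   # 1.5 GB image on a 50 Mbps link: prints 240 (4 minutes)
pull_seconds 80 50     # 80 MB image on the same link: prints 12 (12.8 rounded down)
```

Multiply by the number of pulls per day to get the daily cost on a CI host.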

The two techniques that do 95% of the work

1. Multi-stage builds. Compile, build, test, install in a heavy stage; copy only the artifacts into a slim final stage. Build toolchains, devDependencies, and intermediate files never reach the final image.

2. A slim base image. node:22-alpine instead of node:22. python:3.13-slim instead of python:3.13. The savings are immediate, often several hundred MB.

That's it for most apps. Everything else in this article is the edges.

Multi-stage builds, properly

A Node.js app, before:

dockerfile
FROM node:22
WORKDIR /app
COPY . .
RUN npm install
RUN npm run build
EXPOSE 3000
CMD ["node", "dist/server.js"]

That image is around 1.5 GB. It includes the full Debian base image, all of node_modules (including devDependencies), the source tree, build artifacts, and possibly cached test data.

After:

dockerfile
# Build stage
FROM node:22 AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# Runtime stage
FROM node:22-alpine
WORKDIR /app
ENV NODE_ENV=production
COPY --from=build /app/package.json /app/package-lock.json ./
RUN npm ci --omit=dev
COPY --from=build /app/dist ./dist
USER node
EXPOSE 3000
CMD ["node", "dist/server.js"]

That image is around 100 MB. Everything in the build stage (build tools, devDependencies, source) is discarded; only the runtime essentials make it into the final image.

Alpine vs slim vs full

| Base | Approx. size | When to use |
| --- | --- | --- |
| Full (node:22, python:3.13) | around 1 GB uncompressed | Build stage where you need compilers. Almost never as a runtime base. |
| Slim (python:3.13-slim, node:22-slim) | 75-150 MB | Default for runtime. Debian-slim, glibc, most native modules work cleanly. |
| Alpine (node:22-alpine, python:3.13-alpine) | 40-90 MB | When size matters and your dependencies tolerate musl libc. |
| Distroless (gcr.io/distroless/...) | 20-40 MB + your binary | For locked-down runtime. No shell, no package manager. |
| Scratch | 0 MB + your binary | Static binaries only (Go, Rust). Absolute smallest. |

Alpine caveats: Alpine uses musl libc and BusyBox instead of glibc and GNU coreutils. Python packages with C extensions (NumPy, pandas, and SciPy are the classic examples) don't always have a musl wheel for the version you pin; when one is missing, pip falls back to compiling from source, which takes minutes and may fail. Some Node native modules have similar issues. Test before committing.

Distroless is Google's "just the runtime, nothing else" image family, with variants for static binaries (Go, Rust), Java, Node, and Python. There is no shell, so debugging is harder and security is better.

Scratch is a completely empty image. It only works for languages that produce static binaries with no runtime dependencies. See How to Dockerize a Go App.
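
As a sketch of the distroless option for the Node.js app used earlier: the runtime stage cannot run npm (there is no shell or package manager), so dependencies are pruned in the build stage and copied across. The file name Dockerfile.distroless and the tag gcr.io/distroless/nodejs22-debian12 are assumptions here; check the distroless project for the tag that matches your Node version.

```shell
# Write a hypothetical Dockerfile.distroless next to the existing Dockerfile.
cat > Dockerfile.distroless <<'EOF'
# Build stage: full image with npm and compilers
FROM node:22 AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
# Build, then drop devDependencies so only runtime packages remain
RUN npm run build && npm prune --omit=dev

# Runtime stage: no shell and no npm, so node_modules is copied, not installed
FROM gcr.io/distroless/nodejs22-debian12
WORKDIR /app
ENV NODE_ENV=production
COPY --from=build /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist
# The image's entrypoint is already node, so CMD is only the script path
CMD ["dist/server.js"]
EOF

# Build it (requires Docker; my-app is a placeholder name):
#   docker build -f Dockerfile.distroless -t my-app:distroless .
```

Because there is no shell in the final image, docker exec with sh will not work; plan on logs and external tooling for debugging.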

Use docker history to find the fat layer

bash
docker history my-image

This prints each layer with its size and the instruction that created it. The bloat is usually obvious:

code
IMAGE          CREATED        CREATED BY                            SIZE
abc123         2 hours ago    /bin/sh -c npm install                412MB   ← here
def456         2 hours ago    /bin/sh -c apt-get install -y curl    98MB    ← and here
ghi789         3 hours ago    COPY . /app                           340MB   ← oh

That's where to look. COPY . /app shipping hundreds of MB means a missing .dockerignore. npm install taking 412 MB usually means devDependencies got installed in the runtime stage.
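
On images with many layers it can help to filter the output. A minimal sketch, assuming docker history is run with --human=false so sizes come out in bytes; big_layers is a hypothetical helper and the 50 MB default threshold is arbitrary:

```shell
# big_layers [MIN_BYTES]
# Reads "SIZE_IN_BYTES<TAB>CREATED_BY" lines on stdin and prints the layers
# at or above MIN_BYTES, largest first, with the size converted to MB.
big_layers() {
  min_bytes="${1:-50000000}"
  awk -F '\t' -v min="$min_bytes" \
    '$1 + 0 >= min { printf "%d MB\t%s\n", $1 / 1000000, $2 }' | sort -rn
}

# Example with Docker available (my-image is a placeholder):
#   docker history --human=false --no-trunc \
#     --format '{{.Size}}\t{{.CreatedBy}}' my-image | big_layers
```

Small metadata layers (WORKDIR, ENV, CMD) drop out, leaving only the instructions worth investigating.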

.dockerignore

A missing .dockerignore lets COPY . . ship node_modules, .git, .env, build artifacts, test data, and everything else in the project directory. Even with multi-stage and Alpine, this single file can be the difference between 50 MB and 500 MB.

A solid baseline for most projects:

code
node_modules
.git
.env
.env.local
*.log
coverage
dist
build
.next
out
__pycache__
*.pyc
.pytest_cache
.vscode
.idea
Dockerfile
.dockerignore
README.md

Full pattern in .dockerignore Best Practices.
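
To decide what belongs in .dockerignore, it helps to see what is large in the project directory in the first place. A rough sketch; context_top is a hypothetical helper, and it looks at the raw directory without applying any existing .dockerignore rules:

```shell
# context_top [DIR] [COUNT]
# Lists the COUNT largest top-level entries under DIR (sizes in KB), largest first.
context_top() {
  dir="${1:-.}"
  count="${2:-10}"
  find "$dir" -mindepth 1 -maxdepth 1 -exec du -sk {} + | sort -rn | head -n "$count"
}

# Example: context_top . 5
```

Anything near the top that the build does not need (node_modules, .git, old build output) is a candidate for the ignore file.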

Layer ordering for cache reuse

Each Dockerfile instruction is a cacheable build step, and most produce a layer. Changing one invalidates the cache for every instruction after it, so order matters:

dockerfile
# Bad: source change re-runs npm install
COPY . .
RUN npm ci

# Good: source change doesn't bust the install layer
COPY package.json package-lock.json ./
RUN npm ci
COPY . .

Slow-changing things go first (system packages, dependency manifests, dependency installs). Fast-changing things go last (your source code). Re-building when you've only edited source then takes seconds instead of minutes.

Combine RUN steps and clean up in the same RUN

Each RUN is a layer. Cleaning up in a later RUN does not shrink the earlier layer:

dockerfile
# Bad — install layer keeps the package lists
RUN apt-get update && apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*

# Good — install + cleanup in one layer
RUN apt-get update && apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*

The result of the bad version: the install layer has the package lists baked in, the rm layer adds an empty "I deleted those" diff. Both layers ship.

Real-world targets

Rough sizes after applying these techniques:

| App type | Optimized image |
| --- | --- |
| Static site (Nginx + dist/) | 25-30 MB |
| Go web service | 10-15 MB |
| Node.js Express app | 60-100 MB |
| Next.js (standalone output) | 150-200 MB |
| Python Flask/Django/FastAPI | 100-150 MB |
| PHP Laravel (php-fpm) | 90-130 MB |
| Java Spring Boot | 150-250 MB |

If your image is 2-3x larger than these, run docker history and look for the heavy layer.
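
That rule of thumb can be checked mechanically, for example in CI. A minimal sketch; over_budget is a hypothetical helper, and the target you pass in is whatever row of the table fits your app:

```shell
# over_budget SIZE_BYTES TARGET_MB [FACTOR]
# Succeeds (exit 0) when the image is larger than FACTOR x TARGET_MB.
# FACTOR defaults to 2, matching the low end of the "2-3x larger" rule of thumb.
over_budget() {
  size_mb=$(( $1 / 1000000 ))
  limit_mb=$(( $2 * ${3:-2} ))
  [ "$size_mb" -gt "$limit_mb" ]
}

# Example with Docker available (my-image is a placeholder):
#   size=$(docker image inspect --format '{{.Size}}' my-image)
#   over_budget "$size" 100 && echo "larger than expected; check docker history"
```

docker image inspect reports the uncompressed size in bytes, so compare it against on-disk targets rather than registry download sizes.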

What not to optimize

  • Building from scratch without good reason. Scratch is great for Go binaries; trying it for Node forces you to ship Node and you lose the ergonomics for marginal size savings.
  • Squashing layers manually. Options like docker build --squash (experimental) and docker export | docker import exist, but they break layer reuse for everyone pulling your image. The cache wins from layers usually outweigh the size saving from squashing.
  • Custom-built minimal images to save a few MB versus Alpine. Maintenance cost is real. Alpine is good enough for almost everything.

Ishan Karunaratne

Tech Architect · Software Engineer · AI/DevOps

Tech architect and software engineer with 20+ years building software, Linux systems, and DevOps infrastructure, and lately working AI into the stack. Currently Chief Technology Officer at a healthcare tech startup, which is where most of these field notes come from.
