Docker Best Practices in 2026: What I Stopped Doing in Production

I shipped a Dockerfile last March that I was quietly proud of. 47 lines, three build stages, non-root user, every base image pinned to a digest, and a 38MB final image. I felt clever.

Then a teammate looked at it for three seconds and said “why are you still doing apt-get update on a distroless image?” Reader, I was not, but I had two other things in there that were just as silly. So I spent a weekend going through every production Dockerfile I owned and ripping out habits I’d kept since 2021. Some were fine in 2021 and not fine now. Others were never fine, I just hadn’t noticed.

This post is the list. Not “complete” best practices, because I find those exhausting. Just the changes that actually moved a needle in production: build time, image size, security posture, or my sleep at 2 AM.

I stopped optimizing image size below 50MB

I’ve watched developers spend a full day shaving 8MB off an image that ships once a week. It feels like work. It is not work.

At our pull rates, going from 60MB to 52MB saved roughly 11 seconds a day across the whole fleet, and added an hour of Dockerfile complexity. The trade was bad and I should have known.

What I actually optimize for now: a sensible base image, multi-stage builds where they cost nothing, and one good .dockerignore. If the image is under 150MB and CI pulls it in under five seconds, I stop. Done.

Where size still matters: edge functions, Lambda containers, anything that pays per-image-MB on cold start. For a long-running server, it almost never does.

One small heuristic. If your runtime image isn’t already a gcr.io/distroless variant or an alpine/slim flavor, switching is usually worth a 20-minute experiment. Going from alpine to scratch is usually not, unless you genuinely have no shared libraries.

Multi-stage builds are still the best habit I have

If you take one thing from this post, take this: separate build dependencies from runtime dependencies. Always.

Here’s what I used to write for a Go service:

FROM golang:1.22
WORKDIR /app
COPY . .
RUN go build -o /server ./cmd/server
CMD ["/server"]

About 1.1GB. Compilers, source code, the Go module cache, plus a layer of “I’ll clean this up later” that I never did.

Here’s what I write now:

# Build stage
FROM golang:1.22 AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 go build -ldflags="-s -w" -o /out/server ./cmd/server

# Runtime stage
FROM gcr.io/distroless/static-debian12
COPY --from=build /out/server /server
USER nonroot:nonroot
EXPOSE 8080
ENTRYPOINT ["/server"]

Roughly 18MB. No shell, no package manager, no curl. If an attacker breaks in, there is nothing for them to run.

The same pattern works for Rust (build on rust:1.75, copy the binary into distroless/cc), Python (python:3.12-slim plus a venv copy), and Node (node:20-alpine plus a node_modules copy). I covered similar production trade-offs in my Go web frameworks post, where the deploy story is almost always “containerize and ship”.

I stopped writing :latest anywhere

I used :latest for years. The argument was convenience: never pin anything, get updates automatically. The reality was that two of our environments would run different versions of the same image because they pulled at different times. That bug took me four hours to find. I haven’t trusted :latest since.

The fix is boring. Pin by digest in production:

FROM golang:1.22.3@sha256:e5d6326abc...

For local dev, version tags like :1.22-bookworm are fine. The official Docker best practices guide has been saying this for years and I ignored it because pinning felt like extra work. It is extra work. It is also the only way “works on my machine” stops being a sentence anyone has to say.

Renovate or Dependabot handles the actual updates. I just review the PRs. Total time cost: about ten minutes a week. Time saved when an incident isn’t caused by a silent base-image bump: incalculable, mostly because I haven’t had one of those incidents since.

Healthchecks I now write differently

For a long time my healthchecks looked like this:

HEALTHCHECK --interval=30s --timeout=3s \
  CMD curl -f http://localhost:8080/health || exit 1

Two problems. First, curl isn’t in distroless images, and neither is the /bin/sh that a shell-form CMD needs, so this check either fails on every run or forces me to add both back, which defeats the purpose. Second, /health for most of my services was a trivial endpoint that returned 200 whether the database was reachable or not. Docker thought the container was healthy while the app was returning 503s to real traffic.

Now I do two things. I ship a tiny healthcheck binary inside the image: a small Go program that calls the real /ready endpoint over a unix socket. And I make /ready actually check the things that matter: database connection, downstream service ping, queue connectivity. Liveness stays cheap. Readiness gets honest.
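A probe along those lines could look roughly like the sketch below. Treat it as an illustration rather than the exact binary from my images: the socket path /tmp/health.sock and the HEALTH_SOCKET variable are made-up names, and it assumes the app serves /ready on a unix socket in addition to its normal port.

```go
// healthcheck: a sketch of a tiny probe binary for images without curl or a shell.
package main

import (
	"context"
	"fmt"
	"net"
	"net/http"
	"os"
	"time"
)

// probe sends GET <path> over the unix socket at socketPath and returns nil
// only if a 2xx response arrives within timeout.
func probe(socketPath, path string, timeout time.Duration) error {
	client := &http.Client{
		Timeout: timeout,
		Transport: &http.Transport{
			// Ignore the host in the URL and always dial the unix socket.
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				var d net.Dialer
				return d.DialContext(ctx, "unix", socketPath)
			},
		},
	}
	// "unix" is a placeholder host; only the path matters on this transport.
	resp, err := client.Get("http://unix" + path)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	return nil
}

func main() {
	// Hypothetical default path; HEALTH_SOCKET is an assumed override.
	socket := os.Getenv("HEALTH_SOCKET")
	if socket == "" {
		socket = "/tmp/health.sock"
	}
	// A non-zero exit code is what tells Docker the container is unhealthy.
	if err := probe(socket, "/ready", 2*time.Second); err != nil {
		fmt.Fprintln(os.Stderr, "unhealthy:", err)
		os.Exit(1)
	}
}
```

Built with CGO_ENABLED=0 and copied into the runtime stage next to the server, a binary like this can be called from an exec-form HEALTHCHECK such as HEALTHCHECK CMD ["/healthcheck"], which needs no shell.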

If you’re running on Kubernetes, the Dockerfile HEALTHCHECK is mostly ignored anyway in favor of liveness and readiness probes. The principle still applies. Make readiness check the things whose failure means “stop sending traffic here”, and keep liveness dumb.
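To make the split concrete, here is a minimal sketch of the two handlers. The Check type, the check names, and the stand-in checks in main are all hypothetical; real checks would ping the actual database, downstream service, and queue. The checks run one after another under a single shared deadline, which is the simplest thing that works, not the only option.

```go
package main

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// Check probes one dependency and returns nil when it is usable.
type Check func(ctx context.Context) error

// failures runs every check under one shared deadline and returns the
// error text of each check that failed, keyed by check name.
func failures(ctx context.Context, timeout time.Duration, checks map[string]Check) map[string]string {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()
	failed := map[string]string{}
	for name, check := range checks {
		if err := check(ctx); err != nil {
			failed[name] = err.Error()
		}
	}
	return failed
}

// statusFor maps check results to the HTTP status readiness should return.
func statusFor(failed map[string]string) int {
	if len(failed) > 0 {
		return http.StatusServiceUnavailable
	}
	return http.StatusOK
}

// readyHandler is honest: it returns 503 when any dependency check fails.
func readyHandler(checks map[string]Check) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		failed := failures(r.Context(), 2*time.Second, checks)
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(statusFor(failed))
		json.NewEncoder(w).Encode(map[string]any{"failed": failed})
	}
}

// liveHandler is deliberately dumb: the process is up and serving HTTP.
func liveHandler(w http.ResponseWriter, _ *http.Request) {
	w.WriteHeader(http.StatusOK)
}

func main() {
	// Stand-in checks for illustration only.
	checks := map[string]Check{
		"database": func(ctx context.Context) error { return nil },
		"queue":    func(ctx context.Context) error { return errors.New("connection refused") },
	}
	// Exercise both handlers in-process, without opening a port.
	for path, h := range map[string]http.Handler{
		"/live":  http.HandlerFunc(liveHandler),
		"/ready": readyHandler(checks),
	} {
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
		fmt.Println(path, rec.Code)
	}
}
```

With the failing stand-in queue check, /ready answers 503 while /live still answers 200. That is the behavior I want from a real service: traffic stops, but the container is not killed for a problem that a restart would not fix.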

docker-compose anti-patterns I dropped

Compose is where I had the most embarrassing habits. Three I burned:

Bind-mounting node_modules from the host. I did this thinking it’d speed up dev. It mostly produced platform mismatches, because native modules built on my Mac ended up mounted into Linux containers. The fix: use a named volume for node_modules and bind-mount only the source. Faster, fewer surprises.

restart: always on services that should not restart always. Migrations, init scripts, seed jobs. These should be restart: "no" (quoted, because YAML reads a bare no as a boolean) or a one-shot service. I had a migration container in a restart loop for an hour once. Cost me a Saturday afternoon and a chunk of my dignity.

Not using profiles for optional services. I’d comment out the mailpit service whenever I didn’t need email locally, then forget to comment it back in. Profiles solve this:

services:
  mailpit:
    image: axllent/mailpit
    profiles: ["email"]

docker compose up skips it. docker compose --profile email up includes it. No more commented-out blocks rotting in version control.

Security scanning is cheap, and I should have started years ago

For ages I had no scanning in my pipeline. The reason was that the tooling felt heavy, and I was scared of what I’d find. Both excuses aged poorly.

In 2026, a basic scan is one line of CI:

- name: Scan image
  run: docker scout cves --exit-code --only-severity critical,high "$IMAGE"

Docker Scout ships with Docker Desktop as a CLI plugin; on a bare CI runner you may need to install the plugin first. Snyk, Trivy, and Grype all work too. The Snyk Docker security cheat sheet is a decent primer if you’ve never thought about this layer.

I won’t pretend otherwise: most scan findings on a typical image are low-impact. CVE noise is real. The ones I care about are CRITICAL with a known exploit, and HIGH on packages that actually run in the request path. Everything else goes in a tracking issue and gets batched into a base-image bump.

What I’d do if I were starting fresh

If you’re setting up Docker for a service in 2026 and want one config to copy, here’s the short version: multi-stage Dockerfile, non-root user, distroless or slim runtime, a real readiness endpoint, digest-pinned base images, Scout in CI. Six things. You’re past 90% of teams I’ve audited, and you didn’t have to think about it more than once.

If you want one thing to try this week: pick your most-deployed service, run docker scout cves --only-severity critical,high against the current image, and see what falls out. I bet it’s either nothing, in which case you’ve earned a small celebration, or one obvious issue you can fix in an afternoon. You can see more of the production trade-offs I tend to chew on in the rest of my work.

The point isn’t a perfect Dockerfile. It’s stopping the habits that don’t pay rent.
