Most engineers I work with learn Docker by copy-pasting Dockerfiles off Stack Overflow and learn Kubernetes by copy-pasting YAML off whatever tutorial Google served them. That works — right up until something breaks at 2 a.m. and the abstraction leaks. Then suddenly you need the mental model nobody taught you, and you're Googling "what is a pod actually" during an incident.
This post is the ground-up version. What Docker actually is. What Kubernetes actually is. When each is the right call. And the specific ways I keep watching teams shoot themselves in the foot with both.
It's long. Read it in chunks. Bookmark the decision framework at the end.
Docker is not "lightweight VMs." That phrase has done more damage to engineers' mental models than almost any other in our field. Let me kill it properly.
A virtual machine runs a full guest operating system on top of a hypervisor. The hypervisor emulates hardware; the guest OS thinks it's on a real machine. That's why VMs are heavy — you're booting Linux inside Linux.
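A container doesn't do any of that. A container is just a normal Linux process, but the kernel has been told to lie to it. Specifically, the kernel uses two features: namespaces, which restrict what the process can see (its own process tree, network interfaces, mounts, and hostname), and cgroups, which restrict how much CPU, memory, and I/O it can use.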
That's it. That's the whole trick. A container is a process in a carefully-constructed box, sharing the host kernel with every other container.
graph TB
subgraph VM["Virtual Machine Model"]
H1[Hardware] --> HV[Hypervisor]
HV --> GO1[Guest OS + App 1]
HV --> GO2[Guest OS + App 2]
HV --> GO3[Guest OS + App 3]
end
subgraph CT["Container Model"]
H2[Hardware] --> HK[Host Kernel]
HK --> C1[App 1]
HK --> C2[App 2]
HK --> C3[App 3]
end
Everything else about Docker — images, registries, the CLI — is tooling around that one kernel trick.
Four nouns you need locked down before anything else makes sense:
Images are the build artifact. Containers are the runtime. Registries are the distribution. Keep those straight and 80% of Docker makes sense.
Here's a production-shaped Dockerfile for a Node app. Read the annotations — every line is doing something deliberate.
# Multi-stage: build stage has toolchain, final stage does not
FROM node:22.11-alpine AS build
WORKDIR /app
# Copy dependency manifests FIRST so npm install layer caches independently
COPY package*.json ./
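# Install ALL dependencies here: the build step typically needs devDependencies
RUN npm ci
# Then copy source (changes more often, so it's the last layer to invalidate)
COPY . .
# Build, then drop devDependencies so only production modules reach the final stage
RUN npm run build && npm prune --omit=dev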
# Final stage: tiny runtime image, no build tools, non-root user
FROM node:22.11-alpine
WORKDIR /app
# Don't run as root
RUN addgroup -S app && adduser -S app -G app
USER app
COPY --from=build --chown=app:app /app/dist ./dist
COPY --from=build --chown=app:app /app/node_modules ./node_modules
COPY --from=build --chown=app:app /app/package.json ./
# Let orchestrators detect a dead container
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s \
CMD wget --quiet --tries=1 --spider http://localhost:3000/health || exit 1
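EXPOSE 3000
# Assumed entry point: adjust the path to whatever your build actually emits
CMD ["node", "dist/index.js"]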
Four things this gets right that most Dockerfiles get wrong:
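1. Multi-stage build. The final image doesn't contain your build toolchain, your source, or your test fixtures. Smaller image, smaller attack surface, faster pulls.
2. Layer caching. Dependency manifests are copied and installed first, so the install layer is rebuilt only when package.json changes. Source copy is the last step. This turns a 90-second rebuild into a 5-second one.
3. Non-root user. The process runs as app, not root. A kernel escape no longer hands the attacker root on the host.
4. Healthcheck. Docker can tell a live process from a working one, and marks the container unhealthy when /health stops answering.
People skip past this. Don't. A huge percentage of workloads never need Kubernetes, and trying to introduce it prematurely has killed more projects than it has helped.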
Docker (single-host) is the right answer when:
Docker Compose (multi-container on one host) is the right answer when:
That covers a shocking percentage of real-world apps. Side projects. Internal tools. B2B SaaS with a dozen customers. Homelabs. Early-stage startups. If this is you, stop reading tutorials about Kubernetes and ship.
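For a sense of how little that takes, here is a minimal sketch of a Compose file for an app plus a database on one host. The service names, the Postgres tag, the db.env file, and port 3000 are illustrative assumptions, not something this post prescribes.

```yaml
# compose.yaml: a hypothetical two-service stack on a single host
services:
  api:
    build: .                       # built from the Dockerfile in this directory
    ports:
      - "127.0.0.1:3000:3000"      # assumed app port, published on loopback only
    depends_on:
      db:
        condition: service_healthy # wait until the db healthcheck passes
    restart: unless-stopped
  db:
    image: postgres:16-alpine      # illustrative tag; pin more tightly in practice
    env_file:
      - ./db.env                   # assumed file that defines POSTGRES_PASSWORD
    volumes:
      - db-data:/var/lib/postgresql/data   # named volume outlives the container
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 3s
      retries: 5
    restart: unless-stopped

volumes:
  db-data:
```

With a file like this, docker compose up -d brings up both containers on a private network where the app reaches the database at the hostname db.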
These are the antipatterns I see over and over. If any of them sound familiar, fix it this week.
1. FROM node:latest (or :latest anything). The latest tag floats. Your build that worked yesterday will break tomorrow when the upstream image gets a new default. Pin to a specific version, ideally by digest.
2. COPY . . as the first instruction. This invalidates your entire layer cache on every source change. Copy dependency manifests first, install, then copy source.
3. Running as root. The default USER is root. That means a container escape is a host-root escape. Add a non-root user unless you genuinely need privileged operations.
4. Secrets in the image. Every ENV API_KEY=... or COPY .env . is now permanently embedded in an image layer. Even if you rm it in a later layer, the earlier layer still has it. Anyone with docker pull access has your secrets. Use runtime injection (--env-file, Docker secrets, or your orchestrator's secret store).
5. One container, many processes. Running Nginx + app + cron + log shipper in one container so it "feels like a VM." You've just recreated a VM, badly, with none of the advantages of containers. One process per container, connected to the others over a Docker network.
6. No healthcheck. Without HEALTHCHECK, "the container is running" means "PID 1 hasn't exited." Your deadlocked app is technically running. Orchestrators can't help you.
7. Ignoring .dockerignore. Without it, COPY . . ships your .git directory, node_modules from the host, your editor configs, and any .env files sitting in the repo. Big images, leaked secrets. Write a .dockerignore the same day you write a .gitignore.
8. Exposing ports you didn't mean to. In Compose, ports: "8080:80" binds on every network interface, including the public one. Docker manipulates iptables directly, so a published port can bypass host firewalls such as ufw. Bind to 127.0.0.1 explicitly when only the host itself (a local reverse proxy, say) needs access, and publish no port at all when the service only has to be reachable by other containers.
Sources: Capital One, Codefresh, Docker Anti-Patterns (twistezo).
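Antipatterns 4 and 8 are the two that are easiest to fix directly in a Compose file. A minimal sketch, assuming a hypothetical api image, an internal cache, and a secret file kept out of version control (every name, path, and port here is illustrative):

```yaml
# compose.yaml fragment: runtime secret injection and deliberate port exposure
services:
  api:
    image: ghcr.io/example/api:1.4.2   # hypothetical image, pinned to a version
    ports:
      - "127.0.0.1:8080:3000"          # host loopback only, e.g. behind a local reverse proxy
    secrets:
      - api_key                        # mounted in the container at /run/secrets/api_key
  cache:
    image: redis:7-alpine
    # No "ports:" entry at all: services on the same Compose network can
    # still reach it at cache:6379, but nothing is published on the host.

secrets:
  api_key:
    file: ./secrets/api_key.txt        # assumed path; list it in .gitignore and .dockerignore
```

The app then reads the key from the mounted file at startup, so the value never lands in an image layer.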
Kubernetes is a declarative orchestrator for containerized workloads. That's a mouthful. Unpack it:
Here's the official Kubernetes cluster diagram straight from the project's documentation — control plane components on the left, worker nodes on the right:
Source: Kubernetes.io — Cluster Architecture, licensed under CC BY 4.0.
Same idea, my simplified version:
graph TB
subgraph CP["Control Plane"]
API[API Server]
ETCD[(etcd — cluster state)]
SCH[Scheduler]
CM[Controller Manager]
API --- ETCD
API --- SCH
API --- CM
end
subgraph W1["Worker Node 1"]
K1[kubelet]
P1[Pod A]
P2[Pod B]
K1 --- P1
K1 --- P2
end
subgraph W2["Worker Node 2"]
K2[kubelet]
P3[Pod C]
K2 --- P3
end
API -.-> K1
API -.-> K2
The control plane (API server, etcd, scheduler, controller manager) decides what should run where. Worker nodes run a kubelet agent that actually launches containers. The API server is the only component that talks to etcd; everything else talks to the API server.
More nouns, but each one earns its keep:
Ingress: the HTTP routing layer in front of your Services. It's what makes api.yourdomain.com work.
The relationship:
graph LR
USER[External traffic] --> ING[Ingress]
ING --> SVC[Service: api]
SVC --> P1[Pod 1]
SVC --> P2[Pod 2]
SVC --> P3[Pod 3]
DEP[Deployment: api] -.manages.-> P1
DEP -.manages.-> P2
DEP -.manages.-> P3
CM[ConfigMap] -.mounted into.-> P1
SEC[Secret] -.mounted into.-> P1
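The Ingress, Service, and Pod chain in the diagram can be sketched as two manifests. This is an illustration under assumptions: the app: api label, the production namespace, container port 3000, and an NGINX ingress controller are all choices made for the example.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: api
  namespace: production
spec:
  selector:
    app: api              # the Service finds its Pods by label, not by name
  ports:
    - name: http
      port: 80            # port other workloads use to reach the Service
      targetPort: 3000    # assumed container port
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api
  namespace: production
spec:
  ingressClassName: nginx   # assumption: an NGINX ingress controller is installed
  rules:
    - host: api.yourdomain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api
                port:
                  number: 80
```

Nothing here names an individual Pod. The Service selects on labels, which is what lets a Deployment replace Pods underneath it without anyone updating routing.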
Production-shaped Deployment + Service for the same Node app:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: ghcr.io/ryan/api@sha256:abc123...
Five things this gets right:
Rolling updates with maxUnavailable: 0. Deploys never drop below the desired replica count.
Kubernetes earns its (considerable) complexity budget when you genuinely have:
If you have three of those, take K8s seriously. If you have five, you're already in Kubernetes whether you've installed it or not — you're just emulating it badly with bash scripts.
If you have zero or one of those, Kubernetes is a liability. You will spend more time feeding it than shipping.
Sources: Distr, Spacelift, The New Stack.
The antipatterns here are more expensive because K8s mistakes compound across services, teams, and environments.
1. Adopting it too early. By far the most common mistake. A three-person team with one service does not need Kubernetes. They need Compose on a reasonable VM and a backup plan. Every hour spent on K8s is an hour not spent on product.
2. image: myapp:latest in production. Same lesson as Docker, higher stakes. You now have no idea what version is running. Rollbacks become guesswork. Pin by digest.
3. No resource requests and limits. Without requests, the scheduler assumes your pod needs nothing and packs nodes accordingly. Then under real load, pods starve each other of CPU, or the node runs out of memory and starts OOM-killing and evicting them, and you have no idea why. Every container gets requests and limits. Non-negotiable.
4. Running bare pods instead of Deployments. Bare pods don't get rescheduled when a node dies. They just stay dead. Use Deployments, StatefulSets, or DaemonSets — almost never bare pods.
5. Liveness and readiness pointed at the same endpoint. A bad liveness probe will put your app in a crash loop. A bad readiness probe will just take it out of rotation until it recovers. They have different jobs — write them differently.
6. Running a stateful database inside the cluster without understanding what you're signing up for. "Just run Postgres as a StatefulSet" sounds reasonable and usually ends with lost data. Storage, backups, failover, and upgrades for stateful workloads on K8s are a whole discipline. Most teams should use a managed database (RDS, Cloud SQL, Neon) and let K8s do stateless.
7. Flat RBAC. Everyone has cluster-admin. A single compromised service account can now destroy everything. Follow least-privilege — per-service-account roles bound to specific namespaces.
8. No NetworkPolicies. By default, every pod can talk to every other pod, cluster-wide. One compromised pod, full lateral movement. Default-deny with explicit allow rules.
9. No PodDisruptionBudgets. The cluster admin drains a node for maintenance and takes out all three replicas of your service simultaneously. A PDB says "you can evict pods, but leave at least 2 of mine running at all times."
10. Treating nodes as pets. SSHing into a worker node to fix something is a sign that the automation has failed. Nodes should be disposable. If one is misbehaving, cordon it, drain it, destroy it, let the autoscaler replace it.
11. Config baked into images per environment. The same image should run in dev, staging, and prod — only the ConfigMap/Secret changes. If you rebuild the image per environment, you are not actually testing what you deploy.
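Several of these fixes live in the same place: the container spec. A minimal sketch covering items 3, 5, and 11, where the image name, the /livez and /readyz paths, the api-config ConfigMap, and the resource numbers are all hypothetical placeholders to be replaced with measured values:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: ghcr.io/example/api:1.4.2   # hypothetical; pin by digest in practice
          ports:
            - containerPort: 3000
          envFrom:
            - configMapRef:
                name: api-config     # per-environment settings live here, not in the image
          resources:
            requests:                # what the scheduler reserves on a node
              cpu: 100m
              memory: 128Mi
            limits:                  # ceiling enforced at runtime
              cpu: 500m
              memory: 256Mi
          livenessProbe:             # failing this restarts the container
            httpGet:
              path: /livez
              port: 3000
            periodSeconds: 10
            failureThreshold: 3
          readinessProbe:            # failing this only removes the Pod from Service endpoints
            httpGet:
              path: /readyz
              port: 3000
            periodSeconds: 5
            failureThreshold: 2
```

One reasonable split is for the liveness endpoint to check only that the process can respond, while the readiness endpoint also checks dependencies such as the database, so a slow dependency pulls the Pod out of rotation instead of restarting it.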
Sources: Codefresh Antipatterns, Sedai 2026, Teleport.
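Items 8 and 9 are cluster-level objects rather than container settings. A minimal sketch, assuming the app: api label, the production namespace, container port 3000, and an ingress controller running in a namespace named ingress-nginx:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api
  namespace: production
spec:
  minAvailable: 2          # voluntary evictions (e.g. node drains) must leave 2 running
  selector:
    matchLabels:
      app: api
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}          # empty selector: every Pod in the namespace
  policyTypes:
    - Ingress              # no ingress rules listed, so all inbound traffic is denied
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-controller-to-api
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx   # assumed controller namespace
      ports:
        - protocol: TCP
          port: 3000
```

NetworkPolicies only take effect when the cluster's network plugin enforces them, so it is worth verifying that a denied connection actually fails before relying on these.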
Strip away the hype. Here's the flowchart I actually use:
graph TD
START[Do I need containers at all?] -->|No, one process, one host| BARE[Just run the binary. Seriously.]
START -->|Yes| COUNT[How many services?]
COUNT -->|1-5| HOSTS[How many hosts can I tolerate running on?]
COUNT -->|6-20| MULTI[Multi-team or multi-env?]
COUNT -->|20+| K8S1[Kubernetes]
HOSTS -->|1 is fine| COMPOSE[Docker Compose on a VM]
HOSTS -->|Need failover, 2-3 nodes| SWARM[Nomad or Docker Swarm, or managed container service like ECS/Cloud Run]
MULTI -->|Single team, single env| COMPOSE2[Docker Compose or managed container service]
MULTI -->|Multi-team or multi-env| K8S2[Kubernetes]
Rough translation in prose:
The mistake I see most is the jump from "I have three services" to "we need Kubernetes" — skipping over the managed-container tier entirely. That tier is where most companies should live.
If you're starting from scratch, the order matters. People who learn in the wrong order build broken mental models they have to unlearn later.
Run a container: docker run -it ubuntu bash. Look around. Exit. Realize it's gone. Now you know what "ephemeral" means.
Learn Kubernetes with kind or minikube locally. Then use a managed cluster (GKE, EKS, DigitalOcean, Linode). Don't run your own control plane unless you have a very specific reason.
Skipping step 4, the Compose step, is the single biggest learning mistake I see. People jump to K8s before they've felt the pain Compose solves, so they cargo-cult K8s patterns without understanding what problem each pattern is solving.
Docker is how you package and run a single containerized process; Kubernetes is how you orchestrate many of them across many machines — and the right answer for most teams is neither extreme, but something in the middle you were told wasn't "real" infrastructure.
If this was useful, let me know what you want next — I'm thinking a deep dive on Kubernetes networking, or one on the "middle tier" (ECS, Cloud Run, Fly.io, Nomad) nobody talks about.
Sources used throughout: Capital One Tech on Docker antipatterns; Codefresh's Kubernetes antipattern series; Sedai's 2026 Kubernetes anti-patterns writeup; Teleport's Kubernetes production patterns; Distr, Spacelift, and The New Stack on the Compose vs K8s decision. Linked inline where cited.