Most developers I work with use Postgres without knowing why. It's in the company's stack. It was picked by someone who left two years ago. It runs, so nobody asks questions. They treat it like a smarter MySQL — a place to put rows and pull rows out, and the only time it gets attention is when a query is slow.
That's a shame, because Postgres is not a database, it's a platform. And the gap between how most developers use it and what it's actually capable of is enormous. A sponsor on YouTube called this the "99% of developers don't get PostgreSQL" problem, and after watching the video and thinking about how much of my own career I spent in that 99%, I wanted to write the grounded-up lesson I wish someone had handed me on day one.
If you're reading this and you've been shipping code for years on top of Postgres without ever using an index properly, without reading an EXPLAIN ANALYZE, without knowing what MVCC is or why your table keeps getting mysteriously slow — this one's for you. No shame. I was there. Let's fix it.
Before any features, get this shape in your head. Postgres has, at its core, three things that work together to make it trustworthy:
On top of that transactional core, Postgres layers an enormous amount of platform: a sophisticated query planner, rich data types (JSON, arrays, ranges, geometry), full-text search, and an extension system that lets third parties add features to the database itself — vector search (pgvector), time-series (TimescaleDB), even turning Postgres into a message queue (pgmq).
graph TB
subgraph PLATFORM["Platform Layer — Extensions, Query Planner, Data Types"]
EXT[Extensions: pgvector, PostGIS, pg_stat_statements, TimescaleDB, ...]
PLAN[Query Planner & Optimizer]
TYPES[Rich Types: JSONB, Arrays, Ranges, Geometry, Full-Text]
end
subgraph CORE["Transactional Core — The thing that makes it trustworthy"]
ACID[ACID Transactions]
MVCC[MVCC — concurrent readers + writers]
WAL[WAL — crash recovery]
end
PLATFORM --> CORE
Most developers spend their careers in the top box and never learn the bottom box. But the bottom box is what determines whether your production database survives a bad deploy, a disk failure, or a concurrent update collision. Let's do the bottom box first.
You've seen the acronym a hundred times. Here's what each letter means in practice:
INSERT a customer and then INSERT their order, and the second insert fails, the customer insert is rolled back automatically. You never end up with a customer who has no order, or an order that points at a nonexistent customer.These are not free, and they're not automatic for every database. MongoDB didn't have multi-document transactions until 4.0. MySQL's default storage engine was non-transactional for over a decade. Postgres has had real ACID since the beginning, and it's why serious applications keep landing on it.
MVCC stands for Multi-Version Concurrency Control, and understanding it will unlock 80% of Postgres behavior that feels weird.
The naive way to handle concurrent transactions is with locks: if someone is writing, nobody else can read. That's how older databases worked, and it's miserable under load — readers pile up behind writers.
Postgres does something smarter: it never overwrites a row. When you UPDATE a row, Postgres doesn't modify the existing row in place. It writes a new version of the row and marks the old version as dead. Each transaction sees a consistent snapshot of the database as it existed when the transaction started, because the old versions are still there.
sequenceDiagram
participant T1 as Transaction 1 (reader)
participant DB as Postgres
participant T2 as Transaction 2 (writer)
T1->>DB: BEGIN
T2->>DB: BEGIN
T1->>DB: SELECT balance FROM accounts WHERE id=1
DB-->>T1: 100 (version A)
T2->>DB: UPDATE accounts SET balance=50 WHERE id=1
Note over DB: Writes version B, marks A as expired at T2's commit time
T2->>DB: COMMIT
T1->>DB: SELECT balance FROM accounts WHERE id=1
DB-->>T1: 100 (still sees version A — T1's snapshot is older than T2's commit)
T1->>DB: COMMIT
Three consequences you need to know:
1. Writers don't block readers, and readers don't block writers. This is the thing that makes Postgres scale under mixed workloads. A long analytical query won't lock your checkout flow.
2. Dead row versions accumulate, and something has to clean them up. That something is VACUUM. If you never vacuum (or autovacuum isn't keeping up), dead rows pile up, your tables get bigger on disk than they should be, and queries slow down because the planner has to skip over all the tombstones. This is called table bloat, and it's one of the most common "my Postgres is inexplicably slow" causes in production.
3. Transaction IDs are a finite resource. Postgres uses a 32-bit transaction ID space. If you never vacuum, you eventually hit transaction ID wraparound, and Postgres will shut down writes to protect your data. Autovacuum normally handles this. When it doesn't, you find out the hard way.
If you take one thing from this section: VACUUM is not optional, it is load-bearing. Autovacuum is on by default, but on busy systems you need to actually check whether it's keeping up. pg_stat_user_tables.n_dead_tup tells you.
The Write-Ahead Log is the mechanism behind durability. The rule is simple: before Postgres writes your change to the actual data files, it writes the change to the WAL first, and flushes the WAL to disk.
graph LR
APP[Your app: COMMIT] --> PG[Postgres]
PG --> WAL[Write to WAL]
WAL --> FSYNC[fsync WAL to disk]
FSYNC --> ACK[Return COMMIT success to app]
ACK -.async, later.-> PAGE[Write changed pages to data files]
PAGE -.async, later.-> CHECK[Checkpoint — WAL can be truncated]
The WAL is append-only and sequential, so writing to it is fast. The data files are written asynchronously later. If Postgres crashes mid-write, on restart it replays the WAL from the last checkpoint and brings the data files back to a consistent state.
Three things the WAL gives you beyond crash recovery:
You don't usually configure WAL directly, but you should know it exists, and you should know what synchronous_commit means — setting it to off makes commits faster but means a crash can lose the last few milliseconds of committed transactions. That tradeoff is yours to make, and you can make it per-transaction.
Here's the myth I want to dismantle. A lot of developers write SQL, it runs, and they move on. If it's slow, they add an index somewhere based on vibes. They treat the planner as unknowable.
It's not. Postgres will tell you exactly what it's doing, step by step, every time you ask.
EXPLAIN ANALYZE
SELECT u.email, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > NOW() - INTERVAL '30 days'
GROUP BY u.email
ORDER BY order_count DESC
LIMIT 10;
That query, with EXPLAIN ANALYZE in front of it, returns something like:
Limit (cost=1842.13..1842.16 rows=10 width=40) (actual time=23.411..23.415 rows=10 loops=1)
-> Sort (cost=1842.13..1867.04 rows=9964 width=40) (actual time=23.410..23.412 rows=10 loops=1)
Sort Key: (count(o.id)) DESC
Sort Method: top-N heapsort Memory: 26kB
-> HashAggregate (cost=1527.23..1626.87 rows=9964 width=40) (actual time=18.912..21.103 rows=9964 loops=1)
Group Key: u.email
-> Hash Right Join (cost=382.14..1427.59 rows=19928 width=36) (actual time=2.104..14.223 rows=19928 loops=1)
Hash Cond: (o.user_id = u.id)
-> Seq Scan on orders o (cost=0.00..892.00 rows=50000 width=8) (actual time=0.012..5.123 rows=50000 loops=1)
-> Hash (cost=257.00..257.00 rows=9964 width=36) (actual time=2.012..2.013 rows=9964 loops=1)
-> Index Scan using users_created_at_idx on users u (cost=0.29..257.00 rows=9964 width=36) (actual time=0.014..1.602 rows=9964 loops=1)
Index Cond: (created_at > (now() - '30 days'::interval))
Planning Time: 0.412 ms
Execution Time: 23.502 ms
Read from the inside out, bottom to top. Every node tells you:
The gaps are where you find bugs. If the planner estimates 100 rows and actually returns 1,000,000, your statistics are stale (ANALYZE will fix it) or your query is defeating the planner's assumptions (correlated subquery, function in a WHERE clause, etc.).
A few quick reads:
work_mem or add an index that provides the sort order.The lesson: write a query, EXPLAIN ANALYZE it, read the plan. Do this every single time you're about to ship a query that runs on a large table. It takes 10 seconds and saves you from shipping an accidental full table scan.
Here's where Postgres stops being "just a database" and starts being a platform.
Store structured but schemaless data, query it with SQL, index it.
CREATE TABLE events (
id BIGSERIAL PRIMARY KEY,
payload JSONB NOT NULL
);
-- Insert arbitrary JSON
INSERT INTO events (payload) VALUES
('{"type": "signup", "user": {"id": 42, "plan": "pro"}}'),
('{"type": "purchase", "amount": 99.00, "currency": "USD"}');
-- Query into it like it's structured
SELECT payload->'user'->>'plan' AS plan, COUNT(*)
FROM events
WHERE payload->>'type' = 'signup'
GROUP BY plan;
-- Index a path for fast lookups
CREATE INDEX events_type_idx ON events ((payload->>'type'));
-- Or a full GIN index for arbitrary queries into the JSON
CREATE INDEX
This is your alternative to reaching for MongoDB when you have semi-structured data. You get SQL, joins, foreign keys, transactions, and flexible documents. This combination has eliminated "I need a document DB" for most workloads.
Native array columns. Not a workaround, a real type.
CREATE TABLE posts (
id BIGSERIAL PRIMARY KEY,
title TEXT,
tags TEXT[]
);
INSERT INTO posts (title, tags) VALUES
('Postgres lesson', ARRAY['postgres', 'databases', 'learning']);
-- Query it
SELECT * FROM posts WHERE 'postgres' = ANY(tags);
-- Index it for fast membership checks
CREATE INDEX posts_tags_gin ON posts USING GIN (tags);
A lot of tagging systems, preference lists, and simple many-to-many relationships that would become a join table in a textbook are cleaner and faster as an array column.
A first-class range of values, with a single operator for overlap detection. This is killer for scheduling and pricing.
CREATE TABLE reservations (
id BIGSERIAL PRIMARY KEY,
room_id BIGINT NOT NULL,
during TSRANGE NOT NULL,
EXCLUDE USING GIST (room_id WITH =, during WITH &&)
);
-- The EXCLUDE constraint means: no two reservations for the same room
-- can overlap in time. Postgres enforces it at the database level.
INSERT INTO reservations (room_id, during) VALUES
(1, '[2026-05-01 09:00, 2026-05-01 10:00)'),
(1, '[2026-05-01 09:30, 2026-05-01 11:00)'); -- This will fail.
Think about how much application code you'd write to enforce that constraint manually. Postgres does it with two lines.
CREATE INDEX posts_search_idx ON posts USING GIN (to_tsvector('english', title || ' ' || body));
SELECT title, ts_rank(to_tsvector('english', title || ' ' || body), query) AS rank
FROM posts, plainto_tsquery('english', 'postgres mental model') query
WHERE to_tsvector('english', title || ' ' || body) @@ query
ORDER BY rank DESC;
Good-enough search with stemming, ranking, and highlighting, without installing Elasticsearch. Most sites don't need Elastic. Postgres full-text search gets you to "this is fine" for a long time.
A Postgres extension is a package of SQL + C code that plugs into the server. Once installed, it's just part of your database. No new service to run, no new protocol to learn.
The ones worth knowing about:
Enabling one is usually a single SQL statement:
CREATE EXTENSION vector;
This is the thing that should change how you think about Postgres. A lot of architecture decisions — "we need a vector DB," "we need a time-series store," "we need a queue" — can become "we need an extension" instead. Fewer services, fewer failure modes, one backup strategy.
Sources: PGXN, the official Postgres extension network, lists hundreds more.
The antipatterns I keep finding in real production databases:
1. No indexes, or the wrong indexes. Running a WHERE clause on an unindexed column means a sequential scan of the whole table. Run EXPLAIN on your common queries. If you see Seq Scan on a large table, you need an index.
2. Wrong index type. B-tree (the default) is great for equality and range queries. GIN is for JSONB, arrays, and full-text search. GiST is for ranges and geometry. Using B-tree where GIN would work, and vice versa, is a common mistake.
3. Indexes on every column, "just in case." Every index has a write cost. Each INSERT or UPDATE has to update every index. Measure which queries are slow, add indexes deliberately, drop unused ones (pg_stat_user_indexes.idx_scan = 0 tells you).
4. Running long transactions. Every open transaction is an anchor point that prevents VACUUM from cleaning up dead rows past its snapshot. A connection that sat in idle in transaction for three hours is the root cause of a shocking number of "why is my database bloated" incidents. Set idle_in_transaction_session_timeout.
5. Using SELECT * in application code. Two problems: you fetch columns you don't use (network + memory), and your code breaks in surprising ways when someone adds a column. Name the columns.
6. Not using EXPLAIN ANALYZE before shipping queries that hit big tables. This is free. Do it.
7. Storing JSON as TEXT or JSON instead of JSONB. Use JSONB unless you specifically need to preserve key order and whitespace. JSONB is indexable and much faster to query.
8. Ignoring connection pooling. Each Postgres connection is a full OS process. A few hundred connections and your server is spending all its time context-switching. Use PgBouncer or your framework's built-in pool. Know the difference between transaction-level and session-level pooling.
9. Running ORMs with no inspection. ORMs are great until they generate a 40-line SQL query with four joins you didn't know about. Log slow queries. Look at what your ORM is actually sending to the database.
10. Treating backups as "the thing the hosting provider does." Test your restore. A backup you haven't restored is not a backup.
Sources: Postgres Wiki — Don't Do This, pganalyze's common mistakes series, the Postgres documentation's performance chapter.
In this order:
EXPLAIN ANALYZE on your ten most common queries. Read the plans. Don't optimize anything yet — just build intuition for what shape the planner picks.pg_stat_statements in your dev or staging database. Leave it on for a few days of normal use. Then query it: SELECT query, calls, total_exec_time FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 10;. Now you know what your actual hot queries are.EXPLAIN ANALYZE, confirm the plan changed.BEGIN / COMMIT, intentionally fail the second one, and verify the first didn't persist. You now understand atomicity viscerally, not abstractly.Each of those is a half-day of work at most, and they're ordered so that each one makes the next one easier.
Postgres is an ACID-compliant, MVCC-based, extensible platform that happens to speak SQL — and learning the platform, not just the SQL dialect, is what separates developers who use Postgres from developers who get Postgres.
If this was useful, next in the queue: a deep dive on Postgres indexes (when to use B-tree vs. GIN vs. BRIN, and how to read an index-only scan), or a practical walkthrough of setting up pgvector for an AI side project. Let me know which one.
Inspiration: The Coding Gopher — "99% of Developers Don't Get PostgreSQL" on YouTube. Sources for specific claims: PostgreSQL documentation, PostgreSQL Wiki, PGXN for extensions, pganalyze for performance patterns. All diagrams rendered with Mermaid.