RYAN.SYS·SESSION_OK·PROXMOX_NODE: ONLINE·128_ACTIVE THREADS·4_CONCURRENT VENTURES·HOMELAB: R730XD·LOCATION: DALLAS_TX·RANK: E-7_CPO·ROLE: CTO·NET: 1_GBPS·MEM: 128_GB_DDR4·STATUS: BUILDING·RYAN.SYS·SESSION_OK·PROXMOX_NODE: ONLINE·128_ACTIVE THREADS·4_CONCURRENT VENTURES·HOMELAB: R730XD·LOCATION: DALLAS_TX·RANK: E-7_CPO·ROLE: CTO·NET: 1_GBPS·MEM: 128_GB_DDR4·STATUS: BUILDING·

loading…

[OK] dns resolved

[OK] tcp handshake

[..] waiting on payload

The Postgres You Inherited vs. The Postgres You Could Be Using

Most developers I work with use Postgres without knowing why. It's in the company's stack. It was picked by someone who left two years ago. It runs, so nobody asks questions. They treat it like a smarter MySQL — a place to put rows and pull rows out, and the only time it gets attention is when a query is slow.

That's a shame, because Postgres is not a database, it's a platform. And the gap between how most developers use it and what it's actually capable of is enormous. A sponsor on YouTube called this the "99% of developers don't get PostgreSQL" problem, and after watching the video and thinking about how much of my own career I spent in that 99%, I wanted to write the grounded-up lesson I wish someone had handed me on day one.

If you're reading this and you've been shipping code for years on top of Postgres without ever using an index properly, without reading an EXPLAIN ANALYZE, without knowing what MVCC is or why your table keeps getting mysteriously slow — this one's for you. No shame. I was there. Let's fix it.

The mental model: Postgres is a transactional engine with a platform on top

Before any features, get this shape in your head. Postgres has, at its core, three things that work together to make it trustworthy:

ACID transactions — the guarantees about what happens when multiple things try to change the same data at once.
MVCC (Multi-Version Concurrency Control) — how it achieves those guarantees without locking everyone out.
WAL (Write-Ahead Log) — how it survives crashes, power loss, and disk failure without losing your data.

On top of that transactional core, Postgres layers an enormous amount of platform: a sophisticated query planner, rich data types (JSON, arrays, ranges, geometry), full-text search, and an extension system that lets third parties add features to the database itself — vector search (pgvector), time-series (TimescaleDB), even turning Postgres into a message queue (pgmq).

graph TB
    subgraph PLATFORM["Platform Layer — Extensions, Query Planner, Data Types"]
        EXT[Extensions: pgvector, PostGIS, pg_stat_statements, TimescaleDB, ...]
        PLAN[Query Planner & Optimizer]
        TYPES[Rich Types: JSONB, Arrays, Ranges, Geometry, Full-Text]
    end

    subgraph CORE["Transactional Core — The thing that makes it trustworthy"]
        ACID[ACID Transactions]
        MVCC[MVCC — concurrent readers + writers]
        WAL[WAL — crash recovery]
    end

    PLATFORM --> CORE

Most developers spend their careers in the top box and never learn the bottom box. But the bottom box is what determines whether your production database survives a bad deploy, a disk failure, or a concurrent update collision. Let's do the bottom box first.

ACID, in one paragraph that actually makes sense

You've seen the acronym a hundred times. Here's what each letter means in practice:

Atomicity — a transaction either fully commits or leaves no trace. If you INSERT a customer and then INSERT their order, and the second insert fails, the customer insert is rolled back automatically. You never end up with a customer who has no order, or an order that points at a nonexistent customer.
Consistency — the database moves from one valid state to another. Constraints (foreign keys, NOT NULL, CHECK) are enforced at commit time.
Isolation — two transactions happening at the same time don't see each other's half-finished work.
Durability — once Postgres says "committed," that data will survive a power loss. That promise is the WAL's entire job.

These are not free, and they're not automatic for every database. MongoDB didn't have multi-document transactions until 4.0. MySQL's default storage engine was non-transactional for over a decade. Postgres has had real ACID since the beginning, and it's why serious applications keep landing on it.

MVCC: the concept that explains everything confusing about Postgres

MVCC stands for Multi-Version Concurrency Control, and understanding it will unlock 80% of Postgres behavior that feels weird.

The naive way to handle concurrent transactions is with locks: if someone is writing, nobody else can read. That's how older databases worked, and it's miserable under load — readers pile up behind writers.

Postgres does something smarter: it never overwrites a row. When you UPDATE a row, Postgres doesn't modify the existing row in place. It writes a new version of the row and marks the old version as dead. Each transaction sees a consistent snapshot of the database as it existed when the transaction started, because the old versions are still there.

sequenceDiagram
    participant T1 as Transaction 1 (reader)
    participant DB as Postgres
    participant T2 as Transaction 2 (writer)

    T1->>DB: BEGIN
    T2->>DB: BEGIN
    T1->>DB: SELECT balance FROM accounts WHERE id=1
    DB-->>T1: 100 (version A)
    T2->>DB: UPDATE accounts SET balance=50 WHERE id=1
    Note over DB: Writes version B, marks A as expired at T2's commit time
    T2->>DB: COMMIT
    T1->>DB: SELECT balance FROM accounts WHERE id=1
    DB-->>T1: 100 (still sees version A — T1's snapshot is older than T2's commit)
    T1->>DB: COMMIT

Three consequences you need to know:

1. Writers don't block readers, and readers don't block writers. This is the thing that makes Postgres scale under mixed workloads. A long analytical query won't lock your checkout flow.

2. Dead row versions accumulate, and something has to clean them up. That something is VACUUM. If you never vacuum (or autovacuum isn't keeping up), dead rows pile up, your tables get bigger on disk than they should be, and queries slow down because the planner has to skip over all the tombstones. This is called table bloat, and it's one of the most common "my Postgres is inexplicably slow" causes in production.

3. Transaction IDs are a finite resource. Postgres uses a 32-bit transaction ID space. If you never vacuum, you eventually hit transaction ID wraparound, and Postgres will shut down writes to protect your data. Autovacuum normally handles this. When it doesn't, you find out the hard way.

If you take one thing from this section: VACUUM is not optional, it is load-bearing. Autovacuum is on by default, but on busy systems you need to actually check whether it's keeping up. pg_stat_user_tables.n_dead_tup tells you.

WAL: why your database doesn't lose data when the power goes out

The Write-Ahead Log is the mechanism behind durability. The rule is simple: before Postgres writes your change to the actual data files, it writes the change to the WAL first, and flushes the WAL to disk.

graph LR
    APP[Your app: COMMIT] --> PG[Postgres]
    PG --> WAL[Write to WAL]
    WAL --> FSYNC[fsync WAL to disk]
    FSYNC --> ACK[Return COMMIT success to app]
    ACK -.async, later.-> PAGE[Write changed pages to data files]
    PAGE -.async, later.-> CHECK[Checkpoint — WAL can be truncated]

The WAL is append-only and sequential, so writing to it is fast. The data files are written asynchronously later. If Postgres crashes mid-write, on restart it replays the WAL from the last checkpoint and brings the data files back to a consistent state.

Three things the WAL gives you beyond crash recovery:

Replication. Streaming replication ships the WAL to standby servers in near real-time. That's how high-availability Postgres works.
Point-in-time recovery (PITR). Back up the data files, archive the WAL, and you can restore your database to any point in time, not just the last nightly backup. Lifesaver when someone drops a production table at 3 p.m.
Logical replication and change data capture. Tools like Debezium read the WAL to stream row-level changes to Kafka, Elasticsearch, or another database. Your database becomes an event source, for free.

You don't usually configure WAL directly, but you should know it exists, and you should know what synchronous_commit means — setting it to off makes commits faster but means a crash can lose the last few milliseconds of committed transactions. That tradeoff is yours to make, and you can make it per-transaction.

The query planner is not a black box

Here's the myth I want to dismantle. A lot of developers write SQL, it runs, and they move on. If it's slow, they add an index somewhere based on vibes. They treat the planner as unknowable.

It's not. Postgres will tell you exactly what it's doing, step by step, every time you ask.

EXPLAIN ANALYZE
SELECT u.email, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > NOW() - INTERVAL '30 days'
GROUP BY u.email
ORDER BY order_count DESC
LIMIT 10;

That query, with EXPLAIN ANALYZE in front of it, returns something like:

Limit  (cost=1842.13..1842.16 rows=10 width=40) (actual time=23.411..23.415 rows=10 loops=1)
  ->  Sort  (cost=1842.13..1867.04 rows=9964 width=40) (actual time=23.410..23.412 rows=10 loops=1)
        Sort Key: (count(o.id)) DESC
        Sort Method: top-N heapsort  Memory: 26kB
        ->  HashAggregate  (cost=1527.23..1626.87 rows=9964 width=40) (actual time=18.912..21.103 rows=9964 loops=1)
              Group Key: u.email
              ->  Hash Right Join  (cost=382.14..1427.59 rows=19928 width=36) (actual time=2.104..14.223 rows=19928 loops=1)
                    Hash Cond: (o.user_id = u.id)
                    ->  Seq Scan on orders o  (cost=0.00..892.00 rows=50000 width=8) (actual time=0.012..5.123 rows=50000 loops=1)
                    ->  Hash  (cost=257.00..257.00 rows=9964 width=36) (actual time=2.012..2.013 rows=9964 loops=1)
                          ->  Index Scan using users_created_at_idx on users u  (cost=0.29..257.00 rows=9964 width=36) (actual time=0.014..1.602 rows=9964 loops=1)
                                Index Cond: (created_at > (now() - '30 days'::interval))
 Planning Time: 0.412 ms
 Execution Time: 23.502 ms

Read from the inside out, bottom to top. Every node tells you:

What operation (Seq Scan, Index Scan, Hash Join, Sort, etc.)
Estimated cost vs. actual time
Estimated rows vs. actual rows

The gaps are where you find bugs. If the planner estimates 100 rows and actually returns 1,000,000, your statistics are stale (ANALYZE will fix it) or your query is defeating the planner's assumptions (correlated subquery, function in a WHERE clause, etc.).

A few quick reads:

Seq Scan on a large table is a red flag unless you genuinely want all the rows.
Hash Join is usually good for large joins. Nested Loop is good for small joins. If Postgres picks Nested Loop for a million-row join, something is off.
Sort Method: external merge Disk: 120MB means Postgres had to spill to disk to sort. Bump work_mem or add an index that provides the sort order.

The lesson: write a query, EXPLAIN ANALYZE it, read the plan. Do this every single time you're about to ship a query that runs on a large table. It takes 10 seconds and saves you from shipping an accidental full table scan.

The data types nobody told you about

Here's where Postgres stops being "just a database" and starts being a platform.

JSONB

Store structured but schemaless data, query it with SQL, index it.

CREATE TABLE events (
  id BIGSERIAL PRIMARY KEY,
  payload JSONB NOT NULL
);

-- Insert arbitrary JSON
INSERT INTO events (payload) VALUES
  ('{"type": "signup", "user": {"id": 42, "plan": "pro"}}'),
  ('{"type": "purchase", "amount": 99.00, "currency": "USD"}');

-- Query into it like it's structured
SELECT payload->'user'->>'plan' AS plan, COUNT(*)
FROM events
WHERE payload->>'type' = 'signup'
GROUP BY plan;

-- Index a path for fast lookups
CREATE INDEX events_type_idx ON events ((payload->>'type'));

-- Or a full GIN index for arbitrary queries into the JSON
CREATE INDEX

This is your alternative to reaching for MongoDB when you have semi-structured data. You get SQL, joins, foreign keys, transactions, and flexible documents. This combination has eliminated "I need a document DB" for most workloads.

Arrays

Native array columns. Not a workaround, a real type.

CREATE TABLE posts (
  id BIGSERIAL PRIMARY KEY,
  title TEXT,
  tags TEXT[]
);

INSERT INTO posts (title, tags) VALUES
  ('Postgres lesson', ARRAY['postgres', 'databases', 'learning']);

-- Query it
SELECT * FROM posts WHERE 'postgres' = ANY(tags);

-- Index it for fast membership checks
CREATE INDEX posts_tags_gin ON posts USING GIN (tags);

A lot of tagging systems, preference lists, and simple many-to-many relationships that would become a join table in a textbook are cleaner and faster as an array column.

Range types

A first-class range of values, with a single operator for overlap detection. This is killer for scheduling and pricing.

CREATE TABLE reservations (
  id BIGSERIAL PRIMARY KEY,
  room_id BIGINT NOT NULL,
  during TSRANGE NOT NULL,
  EXCLUDE USING GIST (room_id WITH =, during WITH &&)
);

-- The EXCLUDE constraint means: no two reservations for the same room
-- can overlap in time. Postgres enforces it at the database level.
INSERT INTO reservations (room_id, during) VALUES
  (1, '[2026-05-01 09:00, 2026-05-01 10:00)'),
  (1, '[2026-05-01 09:30, 2026-05-01 11:00)');  -- This will fail.

Think about how much application code you'd write to enforce that constraint manually. Postgres does it with two lines.

Full-text search

CREATE INDEX posts_search_idx ON posts USING GIN (to_tsvector('english', title || ' ' || body));

SELECT title, ts_rank(to_tsvector('english', title || ' ' || body), query) AS rank
FROM posts, plainto_tsquery('english', 'postgres mental model') query
WHERE to_tsvector('english', title || ' ' || body) @@ query
ORDER BY rank DESC;

Good-enough search with stemming, ranking, and highlighting, without installing Elasticsearch. Most sites don't need Elastic. Postgres full-text search gets you to "this is fine" for a long time.

Extensions: the actual superpower

A Postgres extension is a package of SQL + C code that plugs into the server. Once installed, it's just part of your database. No new service to run, no new protocol to learn.

The ones worth knowing about:

pgvector — turns Postgres into a vector database for embeddings. If you're building anything with LLMs, you probably don't need a separate vector DB.
PostGIS — full geospatial support. The best GIS solution in any database, full stop.
pg_stat_statements — query performance tracking. Tells you which queries are eating your CPU. Turn this on in every production database.
TimescaleDB — turns Postgres into a time-series database with automatic partitioning and compression.
pgmq — a message queue inside Postgres. For many workloads, this is simpler than Kafka or SQS.
pg_cron — cron jobs scheduled inside the database.
Citus — distributes Postgres across multiple nodes for horizontal scaling.

Enabling one is usually a single SQL statement:

CREATE EXTENSION vector;

This is the thing that should change how you think about Postgres. A lot of architecture decisions — "we need a vector DB," "we need a time-series store," "we need a queue" — can become "we need an extension" instead. Fewer services, fewer failure modes, one backup strategy.

Sources: PGXN, the official Postgres extension network, lists hundreds more.

Where developers go wrong

The antipatterns I keep finding in real production databases:

1. No indexes, or the wrong indexes. Running a WHERE clause on an unindexed column means a sequential scan of the whole table. Run EXPLAIN on your common queries. If you see Seq Scan on a large table, you need an index.

2. Wrong index type. B-tree (the default) is great for equality and range queries. GIN is for JSONB, arrays, and full-text search. GiST is for ranges and geometry. Using B-tree where GIN would work, and vice versa, is a common mistake.

3. Indexes on every column, "just in case." Every index has a write cost. Each INSERT or UPDATE has to update every index. Measure which queries are slow, add indexes deliberately, drop unused ones (pg_stat_user_indexes.idx_scan = 0 tells you).

4. Running long transactions. Every open transaction is an anchor point that prevents VACUUM from cleaning up dead rows past its snapshot. A connection that sat in idle in transaction for three hours is the root cause of a shocking number of "why is my database bloated" incidents. Set idle_in_transaction_session_timeout.

5. Using SELECT * in application code. Two problems: you fetch columns you don't use (network + memory), and your code breaks in surprising ways when someone adds a column. Name the columns.

6. Not using EXPLAIN ANALYZE before shipping queries that hit big tables. This is free. Do it.

7. Storing JSON as TEXT or JSON instead of JSONB. Use JSONB unless you specifically need to preserve key order and whitespace. JSONB is indexable and much faster to query.

8. Ignoring connection pooling. Each Postgres connection is a full OS process. A few hundred connections and your server is spending all its time context-switching. Use PgBouncer or your framework's built-in pool. Know the difference between transaction-level and session-level pooling.

9. Running ORMs with no inspection. ORMs are great until they generate a 40-line SQL query with four joins you didn't know about. Log slow queries. Look at what your ORM is actually sending to the database.

10. Treating backups as "the thing the hosting provider does." Test your restore. A backup you haven't restored is not a backup.

Sources: Postgres Wiki — Don't Do This, pganalyze's common mistakes series, the Postgres documentation's performance chapter.

A lesson path for going from "I can write SQL" to "I actually get Postgres"

In this order:

Run EXPLAIN ANALYZE on your ten most common queries. Read the plans. Don't optimize anything yet — just build intuition for what shape the planner picks.
Enable pg_stat_statements in your dev or staging database. Leave it on for a few days of normal use. Then query it: SELECT query, calls, total_exec_time FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 10;. Now you know what your actual hot queries are.
Pick one slow query and index your way out of it. Add an index, rerun EXPLAIN ANALYZE, confirm the plan changed.
Write a transaction that does two inserts, wrap it in BEGIN / COMMIT, intentionally fail the second one, and verify the first didn't persist. You now understand atomicity viscerally, not abstractly.
Stand up a replica with streaming replication. You've now touched the WAL. Congratulations, you're no longer in the 99%.
Install one extension — I'd suggest pgvector or pg_stat_statements if you haven't already — and use it in a real project. You've now made Postgres do something most developers don't know it can do.

Each of those is a half-day of work at most, and they're ordered so that each one makes the next one easier.

The one-sentence version

Postgres is an ACID-compliant, MVCC-based, extensible platform that happens to speak SQL — and learning the platform, not just the SQL dialect, is what separates developers who use Postgres from developers who get Postgres.

If this was useful, next in the queue: a deep dive on Postgres indexes (when to use B-tree vs. GIN vs. BRIN, and how to read an index-only scan), or a practical walkthrough of setting up pgvector for an AI side project. Let me know which one.

Inspiration: The Coding Gopher — "99% of Developers Don't Get PostgreSQL" on YouTube. Sources for specific claims: PostgreSQL documentation, PostgreSQL Wiki, PGXN for extensions, pganalyze for performance patterns. All diagrams rendered with Mermaid.

The Postgres You Inherited vs. The Postgres You Could Be Using — Ryan · ryanxf.com