The cost of a thousand pods
Horizontal scaling solves a lot of problems and creates one big one: every pod is a tiny machine with its own opinions about your database.
The first time I ran a service at a thousand pods I did not feel powerful. I felt like I had built a thousand small misbehaving children who all wanted to talk to the same database at once. Each one was reasonable on its own. Together they were a denial-of-service attack on my own infrastructure.
Connections are a global resource
The thing nobody warns you about — or, they do, in a footnote you skim — is that Postgres connections are not a per-pod concern. They are a global resource shared across the entire fleet, and your pods do not know about each other. They will happily open the maximum they are allowed, locally, until the database tips over.
// the well-meaning default that scales linearly into a wall
import { Pool } from "pg"; // assuming node-postgres, whose Pool accepts these options
export const pool = new Pool({
  max: 20, // a per-pod cap; the database sees this multiplied by the pod count
  connectionString: env.DATABASE_URL,
});
Twenty connections per pod times a thousand pods is twenty thousand connections. Your database will refuse most of them and you will spend a Tuesday afternoon discovering that fact in production.
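The arithmetic is worth writing down once, because it runs in both directions: what the fleet can demand, and what each pod could fairly be given. Here is a minimal sketch, assuming a hypothetical database limit of 500 connections with 10 held back for admin and maintenance sessions. The function names and those numbers are illustrative, not from any library.

```typescript
// Hypothetical helpers for reasoning about a fleet-wide connection budget.

/** Worst-case connections the fleet can open if every pod fills its pool. */
export function fleetDemand(perPodMax: number, podCount: number): number {
  return perPodMax * podCount;
}

/**
 * Largest per-pod pool size that keeps the whole fleet under the database
 * limit, after holding back `reserved` connections for non-application use.
 * Returns 0 when there are more pods than usable connections.
 */
export function perPodPoolMax(
  maxConnections: number,
  reserved: number,
  podCount: number,
): number {
  if (podCount <= 0) {
    throw new RangeError("podCount must be positive");
  }
  const usable = maxConnections - reserved;
  return Math.max(0, Math.floor(usable / podCount));
}

// The default from the pool config: 20 per pod across 1000 pods.
console.log(fleetDemand(20, 1000)); // 20000

// Assumed limit of 500 with 10 reserved: fine at 20 pods, impossible at 1000.
console.log(perPodPoolMax(500, 10, 20)); // 24
console.log(perPodPoolMax(500, 10, 1000)); // 0
```

A budget of zero is the honest result at a thousand pods: there is no per-pod number small enough, which is why the fix has to live outside the pods.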
Pooling is the boring answer
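PgBouncer in transaction mode in front of Postgres is the answer almost every time. It feels old-fashioned because it is. What you give up is session state: session-level SET, session advisory locks, LISTEN, and, on PgBouncer versions before 1.21, protocol-level prepared statements. SET LOCAL still works, because it is scoped to the transaction rather than the session. The rest turns out to be a thing you were not really using on purpose anyway.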
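In transaction mode a server connection is only yours between BEGIN and COMMIT, so any setting you rely on has to travel inside the transaction. The sketch below shows one way to do that. `Queryable` and `inTransaction` are hypothetical names introduced here; a node-postgres client checked out of a pool is assumed to fit the interface, and the timeout is only an example of a transaction-scoped setting.

```typescript
// Hypothetical minimal client shape, so the helper can be tried without a database.
export interface Queryable {
  query(sql: string, params?: unknown[]): Promise<unknown>;
}

/**
 * Runs `work` inside one transaction with a transaction-scoped statement
 * timeout. Every statement goes through the same client, so a
 * transaction-mode pooler keeps them on a single server connection.
 */
export async function inTransaction<T>(
  client: Queryable,
  timeoutMs: number,
  work: (tx: Queryable) => Promise<T>,
): Promise<T> {
  if (!Number.isInteger(timeoutMs) || timeoutMs < 0) {
    throw new RangeError("timeoutMs must be a non-negative integer");
  }
  await client.query("BEGIN");
  try {
    // SET does not take bind parameters, so the validated integer is inlined.
    await client.query(`SET LOCAL statement_timeout = ${timeoutMs}`);
    const result = await work(client);
    await client.query("COMMIT");
    return result;
  } catch (err) {
    // Undo the transaction, then surface the original failure to the caller.
    await client.query("ROLLBACK");
    throw err;
  }
}
```

The setting disappears at COMMIT or ROLLBACK, so the next request that borrows the same server connection starts clean. That is the property a session-level SET cannot give you behind a transaction-mode pooler.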
Observability is what makes it cheap
The cost of a thousand pods is not really money. It is the cost of not being able to see them. Tracing every database call back to the request that caused it is the difference between a five-minute investigation and an afternoon.
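One low-tech way to tie a query back to its request is to carry a request ID through the async call chain and prepend it to the SQL as a comment, so it shows up wherever the statement text is logged. This is a sketch of that idea, not a substitute for a real tracing system. `runWithRequest` and `tagSql` are hypothetical names, and it assumes a Node.js runtime, where `AsyncLocalStorage` is available.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

interface RequestContext {
  requestId: string;
}

// Holds the current request's context for everything called beneath it.
const requestContext = new AsyncLocalStorage<RequestContext>();

/** Runs `handler` with `requestId` visible to all code it calls, sync or async. */
export function runWithRequest<T>(requestId: string, handler: () => T): T {
  return requestContext.run({ requestId }, handler);
}

/**
 * Prepends the current request ID to a statement as a SQL comment.
 * Outside a request the statement is returned unchanged.
 */
export function tagSql(sql: string): string {
  const ctx = requestContext.getStore();
  if (!ctx) {
    return sql;
  }
  // Keep only characters that cannot close the comment or break the statement.
  const safeId = ctx.requestId.replace(/[^A-Za-z0-9_.-]/g, "");
  return `/* request_id=${safeId} */ ${sql}`;
}

const tagged = runWithRequest("req-42", () => tagSql("SELECT 1"));
console.log(tagged); // /* request_id=req-42 */ SELECT 1
```

A comment travels with the statement itself, so it does not depend on session state and survives a transaction-mode pooler. In practice you would call `runWithRequest` once in request middleware and `tagSql` inside whatever thin wrapper already sits around your pool.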