The one thing to understand first
PostgreSQL scales reads by adding replicas you can query, but in the standard configuration of every mainstream managed offering, writes still funnel through a single primary. So “scaling” almost always means “scaling reads.” How well that works, and how stale the reads are, depends on whether replicas keep their own copy of the data or share the primary’s storage.
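To make the read/write split concrete, here is a minimal Python sketch of a router that sends read statements to replicas in round-robin order and everything else to the single primary. The host names, the `Router` class, and the prefix-based read detection are all illustrative assumptions, not any provider's API. The classification is deliberately crude: a `SELECT ... FOR UPDATE` or a `SELECT` that calls a writing function would need to go to the primary too.

```python
from itertools import cycle

# Hypothetical endpoints; substitute your own connection targets.
PRIMARY = "primary.example.internal"
REPLICAS = ["replica-1.example.internal", "replica-2.example.internal"]

# Statement prefixes treated as read-only in this sketch.
READ_PREFIXES = ("select", "show")


class Router:
    """Send reads to replicas round-robin and everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # With no replicas configured, reads fall back to the primary.
        self._replicas = cycle(replicas) if replicas else None

    def target_for(self, sql):
        is_read = sql.lstrip().lower().startswith(READ_PREFIXES)
        if is_read and self._replicas is not None:
            return next(self._replicas)
        # Writes (and anything not recognized as a read) go to the single primary.
        return self.primary
```

Adding replicas to the list spreads reads further, but every statement that is not recognized as a read still lands on the one primary, which is the point of the paragraph above.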
Family one: asynchronous streaming replicas
RDS, Cloud SQL, and Azure Database for PostgreSQL Flexible Server read replicas are ordinary PostgreSQL streaming replicas: each keeps its own full copy of the data on its own disk and receives the primary’s WAL asynchronously. They can scale reads across regions, but they lag. Because the primary commits without waiting for replicas, a read replica reflects the primary as of some point in the recent past, and that gap grows under heavy write load or high network latency. Each replica also costs a full instance plus its own storage.
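One way to live with that lag is to measure it and only read from a replica when it is fresh enough. The sketch below pairs a standard PostgreSQL query, run on the replica, with a small decision function. The function name `choose_read_target` and the staleness budget are assumptions for illustration; how you actually run the query depends on your driver.

```python
# Run on a replica: seconds since the commit time of the last replayed transaction.
# Caveat: when the primary is idle this value keeps growing even though
# the replica is fully caught up, so treat it as an upper bound.
LAG_QUERY = "SELECT EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp())"


def choose_read_target(replica_lag_seconds, max_staleness_seconds):
    """Return 'replica' if its measured lag fits the staleness budget, else 'primary'."""
    # None models a NULL result (for example, nothing replayed yet): lag is unknown.
    if replica_lag_seconds is None:
        return "primary"
    if replica_lag_seconds <= max_staleness_seconds:
        return "replica"
    return "primary"
```

For example, `choose_read_target(0.4, 2.0)` picks the replica, while a replica 30 seconds behind with the same 2-second budget sends the read back to the primary.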
Family two: shared-storage replicas and read pools
Aurora replicas and AlloyDB read pools attach to the same distributed storage as the writer. Because they do not maintain a separate copy, they start quickly and stay much closer to current: their lag is the time it takes to apply changes to their own cache, not the time to copy data durably to another disk. They are also cheaper to add, since each one brings compute only rather than another full copy of the data. AlloyDB formalizes this as a read pool you scale by node count; Aurora lets you add up to 15 replicas behind a reader endpoint that load-balances connections across them.
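The cost difference between the two families can be sketched with a toy model. The functions and the unit prices below are hypothetical placeholders, not real pricing, and the model ignores I/O charges, cross-region transfer, and the fact that per-unit prices differ between services. It only captures the structural point: a streaming replica multiplies storage, a shared-storage reader does not.

```python
def streaming_fleet_cost(replicas, compute_cost, storage_cost):
    """Primary plus N replicas, each paying for compute and its own full data copy."""
    return (1 + replicas) * (compute_cost + storage_cost)


def shared_storage_fleet_cost(replicas, compute_cost, storage_cost):
    """Writer plus N readers on one shared storage layer; readers add compute only."""
    return (1 + replicas) * compute_cost + storage_cost


# Hypothetical monthly units: 100 per compute node, 40 per copy of the data.
streaming = streaming_fleet_cost(3, 100, 40)      # 4 nodes, 4 data copies
shared = shared_storage_fleet_cost(3, 100, 40)    # 4 nodes, 1 data copy
```

In this model the gap widens with every replica added and with every increase in data size, since each streaming replica pays the storage term again.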