Pagination That Survives Real Data

Pagination looks like a solved problem. LIMIT and OFFSET, page numbers, done. It works perfectly in development with fifty rows, ships, and then a year later someone notices page 5000 takes eight seconds and occasionally shows the same post twice. Both of those are the same root cause, and both go away with a different approach that's only slightly more code.

How offset pagination breaks

The familiar version:

SELECT * FROM posts ORDER BY created_at DESC OFFSET 100000 LIMIT 20;

It has two problems, and they show up at different times.

It gets slow as you go deeper. OFFSET 100000 does not skip to row 100,000 cheaply. The database has to produce and then throw away the first 100,000 rows to reach yours. The cost grows with the offset, so early pages are fast and deep pages crawl. The work is proportional to how far in you are, not how many rows you asked for.

It shows duplicates and skips when data changes. Offset counts positions in a list that's shifting underneath you. If someone inserts a new post while you're paging, every later row shifts down by one, so the first row of your next page is one you already saw. Delete a row and you skip one. On a busy feed this is constant.

DecisionOffset is fine for small, static, page-numbered data; wrong for deep or live data.

Offset's one real advantage is that you can jump straight to "page 47," which keyset can't. So it's acceptable for an admin table of a few hundred rows, or anything where you genuinely need numbered pages and the data barely changes. For an infinite-scroll feed, a large dataset, or anything that's actively being written, its slowness-with-depth and its duplicate/skip behaviour make it the wrong tool.

Keyset (cursor) pagination

Instead of counting positions, you remember the value where you stopped and ask for rows after it. "Give me 20 posts created before this timestamp" rather than "skip 100,000 rows."

SELECT * FROM posts
WHERE (created_at, id) < ('2026-05-26T10:00:00Z', 8842)
ORDER BY created_at DESC, id DESC
LIMIT 20;

This fixes both problems at once. It's fast at any depth, because with an index on the sort key the database jumps straight to your position and reads 20 rows; the cost is the same whether you're on page 1 or page 50,000. And it's stable under inserts and deletes, because the cursor is a value, not a position. A row added before your cursor doesn't shift anything you'll see next; you simply continue from the value you remembered.

Here it is end to end: paging through a list with a cursor, surviving an insert between pages.

Keyset pagination that doesn't duplicate or skiprun · edit · saved to you

Loading editor…

Building a stable cursor

The one subtlety: the sort key must be unique, or pagination breaks at ties. If you sort by created_at alone and three posts share a timestamp, a cursor at that timestamp can't tell which of the three you already saw. The fix is to make the sort key unique by appending a tiebreaker, almost always the primary key: sort by (created_at, id) and put both in the cursor.

The cursor is opaque, and that's deliberate

You encode the cursor (base64 of the sort values, say) so clients treat it as a meaningless token to pass back, not a structure to parse or construct. This lets you change what's inside it later (add a tiebreaker, switch sort keys) without breaking clients. A cursor is "where I was," not "page 6," and keeping it opaque keeps that contract clean.

What to return

A keyset-paginated response carries the rows plus how to get the next page:

{
  "data": [ /* 20 rows */ ],
  "nextCursor": "eyJjcmVhdGVkQXQiOiIuLi4iLCJpZCI6ODg0Mn0",
  "hasMore": true
}

hasMore is cheap to compute: ask for LIMIT 21, and if you get 21 rows, there's a next page (return the first 20 and set hasMore: true). No count query needed. When there's no next page, nextCursor is null and the client stops.

The total-count problem

Clients often want "showing 20 of 4,328." That total is surprisingly expensive: an exact COUNT(*) with a filter can be as costly as the query itself, because the database must examine every matching row. Options, cheapest to most exact:

Don't show a total

Infinite scroll with hasMore needs no count at all. This is what most modern feeds do, and it's the cheapest correct answer. If the UI doesn't truly need a precise total, don't pay for one.

Approximate or cap it

Show "1,000+" using Postgres's cheap row estimate (reltuples from the planner's statistics), or count up to a cap (COUNT of the first 1,000 matches) and display "999+". Good enough for "roughly how many," at a fraction of the cost.

The rule: don't return a total count unless the client genuinely needs an exact one, because counting is often slower than fetching the page you're showing.

The one idea to take away

Offset pagination counts positions, which makes deep pages slow and makes results duplicate or skip when data changes underneath you. Keyset pagination remembers the last value seen and reads forward from it, so it's fast at any depth and stable under writes. Make the sort key unique with a tiebreaker, return an opaque cursor plus a hasMore flag (via LIMIT n+1), and skip exact total counts unless you truly need them. Use offset only for small, static, genuinely page-numbered data.

Test yourself

Questions· say the answer out loud before you open it. If you can't, the chapter isn't done.

QWhy does OFFSET 100000 get slow?+

Because the database can't jump to row 100,000; it must produce and discard the first 100,000 rows to reach your page. The cost grows with the offset, so deep pages crawl while early pages are fast. The work is proportional to how far in you are, not how many rows you requested.

QWhy does offset pagination show duplicates or skips?+

Because it counts positions in a list that shifts as data changes. Insert a row before your current position and every later row shifts down, so the first row of your next page is one you already saw; delete a row and you skip one. On a busy, frequently-written dataset this happens constantly.

QHow does keyset pagination fix both problems?+

It remembers the value where you stopped (a cursor) and reads rows after it, rather than counting positions. With an index on the sort key it jumps straight to your spot at any depth, so cost is constant, and because the cursor is a value rather than a position, inserts and deletes elsewhere don't cause duplicates or skips.

QWhy must a keyset cursor's sort key be unique?+

Because ties break it. If you sort by created_at alone and several rows share a timestamp, a cursor at that timestamp can't distinguish which of them you already returned. Append a unique tiebreaker, usually the primary key, and sort by (created_at, id) so every cursor position is unambiguous.

QHow do you compute hasMore without a count query?+

Ask for one more row than the page size (LIMIT 21 for a page of 20). If you get 21 rows there's a next page, so return the first 20 and set hasMore true; if you get 20 or fewer, there isn't. This avoids a separate, expensive count entirely.

QA client wants 'showing 20 of N'. Why is N expensive and what are your options?+

An exact filtered COUNT(*) must examine every matching row, so it can cost as much as the query itself. Options, cheapest first: don't show a total (infinite scroll with hasMore), show an approximate total from Postgres's row estimate, or count up to a cap and show '999+'. Only pay for an exact count if the UI truly needs one.