Idempotency and Exactly-Once That Actually Works

Here is a true story that happens to every team eventually: a customer's payment goes through, their network drops before the confirmation arrives, the app retries, and they're charged twice. The logs show two identical requests. Nobody did anything wrong, exactly, and yet money was taken twice. This deep dive is about why that happens and the small set of patterns that make it impossible.

The uncomfortable truth: you can't prevent duplicates

The instinct is to stop the duplicate request. You can't, reliably. The network can drop a response after the work succeeded, so the client genuinely doesn't know if it worked and must retry to be safe. Load balancers retry. Mobile apps retry. Message queues redeliver. Users double-tap. Duplicates aren't an edge case; they're the normal weather of a distributed system.

So you stop trying to prevent the second arrival and instead make it harmless. An operation is idempotent if doing it twice has the same effect as doing it once. GET is naturally idempotent. DELETE usually is (deleting an already-deleted thing is fine). POST is the dangerous one, because "create a payment" run twice creates two payments, unless you design it not to.

Idempotency keys

The standard solution, popularised by Stripe: the client generates a unique key per operation and sends it in a header. The server records the result against that key. A retry carrying the same key returns the stored result instead of doing the work again.

POST /payments
Idempotency-Key: 7b8c-3f2a-...

{ "amount": 5000, "currency": "INR" }

The core logic is tiny: before doing the side effect, check whether you've seen this key; if so, replay the saved response; if not, do the work and save it. The second call then replays the stored response instead of re-charging.
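
As a minimal sketch of that logic, assuming an in-memory dictionary stands in for the server's key store: the names `create_payment`, `_responses`, and `charges` are hypothetical, and this is deliberately the simple check-then-act version, suitable for a single thread only.

```python
import itertools

# Hypothetical in-memory store: idempotency key -> saved response.
_responses = {}
_payment_ids = itertools.count(1)
charges = []  # every real charge is appended here, so duplicates are visible


def create_payment(idempotency_key, amount, currency):
    """Charge once per key; replay the saved response on a retry."""
    if idempotency_key in _responses:
        return _responses[idempotency_key]  # seen before: replay, no new charge

    # First time this key is seen: perform the side effect.
    payment_id = f"pay_{next(_payment_ids)}"
    charges.append((payment_id, amount, currency))

    response = {"payment_id": payment_id, "amount": amount, "currency": currency}
    _responses[idempotency_key] = response  # save the result against the key
    return response


first = create_payment("key-123", 5000, "INR")
retry = create_payment("key-123", 5000, "INR")  # same key: replayed
print(first == retry, len(charges))  # True 1
```

The retry gets the same payment ID back and `charges` still holds a single entry, which is the whole point of the key.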

The race nobody mentions

The naive version has a bug under concurrency. Two retries with the same key arrive at almost the same instant. Both check "have I seen this key?", both see no, both proceed, both charge. You've reintroduced the duplicate you were trying to kill.

The fix is to make "claim the key" atomic. You insert the key into a table with a unique constraint before doing the work. The first request inserts successfully and proceeds; the second request's insert fails on the unique constraint, which tells it "someone else is already handling this key," so it either waits for the first request to finish or returns the first request's stored result.

Decision: Reserve the idempotency key atomically before doing the work.

Checking-then-acting in two steps leaves a gap where two concurrent retries both slip through. Inserting the key under a unique constraint first collapses the check and the claim into one atomic database operation, so exactly one request can ever own a key. The cost is a little extra bookkeeping (a keys table, a state to track) in exchange for a guarantee that holds under concurrent retries, which is exactly when you need it.
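
The atomic claim can be sketched with SQLite, where a primary key supplies the unique constraint. This is an illustration under assumptions, not a prescribed schema: the table `idempotency_keys`, its columns, and the function `try_claim` are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE idempotency_keys (idem_key TEXT PRIMARY KEY, state TEXT NOT NULL)"
)


def try_claim(key):
    """Return True if this caller now owns the key, False if it was already claimed."""
    try:
        with conn:  # commits on success, rolls back if the insert fails
            conn.execute(
                "INSERT INTO idempotency_keys (idem_key, state) VALUES (?, 'reserved')",
                (key,),
            )
        return True
    except sqlite3.IntegrityError:
        # The unique constraint rejected the insert: another request owns this key.
        return False


print(try_claim("key-123"))  # True: the first request wins the insert
print(try_claim("key-123"))  # False: the retry hits the unique constraint
```

There is no separate "have I seen this key?" query here. The insert itself is both the check and the claim, so the database decides the winner.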

The state machine that makes it safe

A key isn't just "seen / not seen." The operation behind it moves through states, and you persist that state so any retry, at any moment, does the right thing.

Reserved: The key is claimed (unique insert) but the work hasn't finished. A retry now waits or returns "in progress" rather than starting a second operation.
Completed: The work succeeded and the response is stored against the key. Any retry replays that exact response.
Failed: The work failed in a way that's safe to retry. The key is released or marked so a genuine retry can try again, rather than being stuck.

Storing the response, not just the fact of completion, matters: the retrying client needs the same answer (the same payment ID, the same confirmation) it would have gotten the first time. Idempotency-key records typically expire after a day or so, since retries don't arrive a week later.
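
A rough sketch of those three states, with the stored response and the expiry, might look like the following. Everything named here (`Record`, `records`, `handle`, `TTL_SECONDS`) is hypothetical, the dictionary stands in for a database table, and the sketch assumes every exception is a retry-safe failure. It is single-threaded; in a real service the claim step would be the atomic unique insert.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional

TTL_SECONDS = 24 * 60 * 60  # assumed retention of about a day


@dataclass
class Record:
    state: str                 # "reserved", "completed", or "failed"
    response: Optional[dict]   # stored only once the work has completed
    created_at: float


records: dict = {}


def handle(key: str, work: Callable[[], dict], now: Optional[float] = None) -> dict:
    now = time.time() if now is None else now
    rec = records.get(key)
    if rec is not None and now - rec.created_at > TTL_SECONDS:
        rec = None  # expired record: treat the key as unseen
    if rec is not None and rec.state == "completed":
        return rec.response  # replay the exact stored response
    if rec is not None and rec.state == "reserved":
        return {"status": "in_progress"}  # do not start a second operation
    # Unseen, expired, or failed: claim the key and run the work.
    records[key] = Record("reserved", None, now)
    try:
        response = work()
    except Exception:
        records[key].state = "failed"  # assumed retry-safe: allow another attempt
        raise
    records[key].state = "completed"
    records[key].response = response
    return response


calls = []


def charge():
    calls.append(1)
    return {"payment_id": "pay_1"}


a = handle("key-123", charge, now=0)
b = handle("key-123", charge, now=10)
print(a == b, len(calls))  # True 1
```

Passing `now` explicitly is only a convenience for showing expiry without waiting a day.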

At-least-once + idempotent = exactly-once (the only kind you get)

People chase "exactly-once delivery" and it doesn't exist over an unreliable network. What does exist, and is operationally equivalent, is at-least-once delivery plus idempotent handlers. The system guarantees a message arrives at least once (retrying until acknowledged), and because the handler is idempotent, processing it more than once is harmless. The user experiences exactly-once. Chasing true network-level exactly-once is a trap; this combination is the real, working answer, and it shows up everywhere.
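
To make the combination concrete, here is a small simulation, assuming a sender that redelivers until an acknowledgement gets through and a handler that dedupes by message ID. The names `deliver`, `apply_once`, and the message fields are hypothetical, and the lost acknowledgements are simulated with a seeded random generator.

```python
import random

processed_ids = set()  # dedupe store owned by the handler
balance = 0            # the effect that must be applied once per message


def apply_once(message):
    """Idempotent handler: apply the effect only the first time an ID is seen."""
    global balance
    if message["id"] in processed_ids:
        return  # duplicate delivery: harmless no-op
    balance += message["amount"]
    processed_ids.add(message["id"])


def deliver(message, rng, ack_loss=0.5):
    """At-least-once delivery: redeliver until an ack gets through."""
    attempts = 0
    while True:
        attempts += 1
        apply_once(message)            # the handler always runs...
        if rng.random() >= ack_loss:   # ...but the ack may be lost in transit
            return attempts


rng = random.Random(42)
attempts = [deliver({"id": f"msg-{i}", "amount": 100}, rng) for i in range(5)]
print(balance)  # 500, however many redeliveries happened
```

The number of delivery attempts varies with the simulated network, but the balance does not, and that is what the user sees.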

Where this applies (everywhere money or side effects live)

APIs and webhooks

Every state-changing endpoint that matters should accept an idempotency key. Inbound webhooks (from Stripe, a payment gateway, GitHub) are delivered at-least-once by the sender, so dedupe them by the event ID they carry; a replayed webhook must not re-run its effect.
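
For webhooks, the dedupe can be as small as one table of event IDs. The sketch below assumes SQLite and a hypothetical event shape with `id` and `order` fields; `on_webhook`, `seen_events`, and `orders_paid` are illustrative names, not any provider's API.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE seen_events (event_id TEXT PRIMARY KEY)")
orders_paid = []  # stand-in for the webhook's real effect


def on_webhook(event):
    """Run the effect only for event IDs that have not been recorded before."""
    with db:  # if the effect raises, the recorded ID is rolled back too
        cur = db.execute(
            "INSERT OR IGNORE INTO seen_events (event_id) VALUES (?)",
            (event["id"],),
        )
        if cur.rowcount == 0:
            return "duplicate"  # ID already recorded: skip the effect
        orders_paid.append(event["order"])
    return "processed"


event = {"id": "evt_1", "order": "order-42"}
print(on_webhook(event))  # processed
print(on_webhook(event))  # duplicate
```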

Background jobs and queues

Queues deliver at-least-once: a worker can crash after doing the work but before acknowledging, so the job runs again. Every job handler must be idempotent, deduping by a job or operation key, or you'll double-send emails and double-create records.

This is why the payments ledger, the booking system, and the chat delivery in the system-design deep dives all lean on the same idea. It's one pattern wearing different clothes.

Idempotent is not the same as 'safe to retry blindly'

An operation can be idempotent at the data level but still have non-idempotent side effects you forgot about: sending an email, calling a third party that charges per call, incrementing a metric. When you make a handler idempotent, account for all its effects, not just the database write. The email send must also be guarded by the key, or you've made the row safe and the inbox spammed.
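
One way to picture "guard every effect by the key" is to record each effect separately under the same idempotency key. This is a sketch under assumptions: `once`, `confirm_order`, and the in-memory stand-ins are hypothetical, and it ignores the window where a crash between running an effect and recording it would still repeat that effect.

```python
rows = {}                  # stand-in for the database table
sent_emails = []           # stand-in for the mail provider
completed_effects = set()  # (idempotency key, effect name) pairs already done


def once(key, effect, action):
    """Run action only if this effect has not already run for this key."""
    marker = (key, effect)
    if marker in completed_effects:
        return
    action()
    completed_effects.add(marker)


def confirm_order(key, order_id, email):
    def update_row():
        rows[order_id] = "confirmed"

    def send_email():
        sent_emails.append(email)

    # Both effects are gated by the same key, not just the row write.
    once(key, "update_row", update_row)
    once(key, "send_email", send_email)


confirm_order("key-123", "order-42", "a@example.com")
confirm_order("key-123", "order-42", "a@example.com")  # retry
print(len(sent_emails))  # 1
```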

The one idea to take away

You can't prevent duplicate requests, so make them harmless. The client sends an idempotency key; the server claims it atomically (a unique constraint, not a check-then-act), records the operation's state and response, and replays that response on any retry. At-least-once delivery plus idempotent handlers is the only "exactly-once" you can actually have, and it's enough. Apply it to every endpoint, webhook, and job that has a side effect that would hurt if it ran twice.

Test yourself

Questions: say the answer out loud before you read it. If you can't, the chapter isn't done.

Q: Why can't you just prevent duplicate requests instead of handling them?

Because the network can drop a response after the work succeeded, so the client can't know whether it worked and must retry. Load balancers, queues, mobile apps, and double-tapping users all produce duplicates too. Duplicates are the normal state of a distributed system, so the only reliable approach is to make the second arrival harmless rather than to prevent it.

Q: How does an idempotency key make a POST safe?

The client generates a unique key per operation and sends it in a header. The server records the operation's result against that key, and a retry carrying the same key returns the stored result instead of redoing the work. So POSTing a payment twice with the same key creates one payment and replays the same response.

Q: What's the concurrency bug in a naive idempotency check, and how do you fix it?

If you check 'have I seen this key?' and then act, two concurrent retries can both see 'no' and both proceed, duplicating the operation. Fix it by claiming the key atomically first: insert it under a unique constraint before doing the work, so the second request's insert fails and it returns or waits for the first result.

Q: Why store the response against the key, not just 'completed'?

Because the retrying client needs the same answer it would have gotten originally, like the same payment ID or confirmation. If you only record that it completed, a retry can't be given the correct result. Store the actual response and replay it; expire these records after a day or so since retries don't arrive much later.

Q: Is exactly-once delivery achievable? What do you build instead?

Not over an unreliable network. You build at-least-once delivery plus idempotent handlers: the system retries until a message is acknowledged, and because handlers are idempotent, processing more than once is harmless. The user perceives exactly-once. Chasing true network-level exactly-once is a trap.

Q: Why must every background job handler be idempotent?

Because queues deliver at-least-once: a worker can crash after doing the work but before acknowledging, so the job is redelivered and runs again. Retries on transient errors do the same. Without idempotency (deduping by a job or operation key) you eventually double-send emails or double-create records.

Q: An operation updates a row idempotently but also sends an email. Is it safe to retry?

Not unless the email is also guarded by the key. Idempotency must cover all side effects, not just the database write. A retry would re-send the email even though the row update is safe. Gate the email (and any third-party call or metric) behind the same idempotency key so the whole operation, effects included, runs once.