Can Retried Events Introduce Inconsistent State?
Type: DeepDive
Category: Data
Audience: Engineers working on async pipelines, retries, and eventual consistency
The Real Question
When your event fires again,
are you confident it won't do something wrong?
Or are you just hoping the system "should handle it"?
- Idempotency: enforced where? request layer? domain? DB?
- Event replays: do they mutate state again, or just confirm idempotence?
- Side effects: are they guarded by delivery guarantees, or just "probably won't happen twice"?
The Silent Disaster
Retries feel safe, until they aren't.
- A retry re-applies the same state change → duplicate mutation
- A side effect (email, payment) gets triggered again → user chaos
- A compensating action runs twice → data corruption
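The first two failure modes can be reproduced in a few lines. The sketch below is illustrative only: `Account`, `handle_deposit`, and the event shape are hypothetical names, not part of any real system. It shows a handler that is correct for a single delivery but wrong under a retry.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    balance: int = 0
    emails_sent: list = field(default_factory=list)


def handle_deposit(account: Account, event: dict) -> None:
    """Naive handler: applies the event every time it is delivered."""
    account.balance += event["amount"]                    # state mutation
    account.emails_sent.append(f"receipt:{event['id']}")  # side effect


account = Account()
event = {"id": "evt-1", "amount": 100}

handle_deposit(account, event)  # first delivery
handle_deposit(account, event)  # retry of the same event

print(account.balance)      # 200, although only one deposit happened
print(account.emails_sent)  # two receipts for a single event
```

Nothing in the handler is buggy in isolation. The problem is that it has no way to tell a retry from a new event.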
Safer Designs
- Use idempotency keys at system boundaries
- Store event execution history to detect duplicates
- Make side effects part of a transactional outbox, not best-effort fire-and-forget
- Prefer "confirm success" over "assume failure and retry"
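One possible way to combine the first three points is sketched here with Python's standard `sqlite3` module. The table names (`accounts`, `processed_events`, `outbox`), the event shape, and `handle_deposit` are assumptions made for illustration. The event ID acts as the idempotency key, the `processed_events` table is the execution history, and the side effect is recorded in an outbox row inside the same transaction as the state change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL);
    CREATE TABLE processed_events (event_id TEXT PRIMARY KEY);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        event_id TEXT NOT NULL,
        kind TEXT NOT NULL,
        sent INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO accounts (id, balance) VALUES ('acct-1', 0);
""")


def handle_deposit(conn: sqlite3.Connection, event: dict) -> bool:
    """Apply the event at most once. Returns False for a duplicate delivery."""
    try:
        with conn:  # one transaction: commit on success, roll back on error
            # The primary key on event_id rejects a second delivery.
            conn.execute(
                "INSERT INTO processed_events (event_id) VALUES (?)",
                (event["id"],),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (event["amount"], event["account"]),
            )
            # Record the side effect instead of performing it here.
            conn.execute(
                "INSERT INTO outbox (event_id, kind) VALUES (?, 'receipt_email')",
                (event["id"],),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate: nothing was mutated, nothing was queued


event = {"id": "evt-1", "account": "acct-1", "amount": 100}
first = handle_deposit(conn, event)
retry = handle_deposit(conn, event)

balance = conn.execute(
    "SELECT balance FROM accounts WHERE id = 'acct-1'"
).fetchone()[0]
pending = conn.execute("SELECT COUNT(*) FROM outbox WHERE sent = 0").fetchone()[0]
print(first, retry, balance, pending)  # True False 100 1
```

In this sketch a separate relay process (not shown) would read unsent outbox rows and deliver them. That relay can itself deliver more than once, so the receiver of the side effect would likely still need to deduplicate on `event_id`.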
Principle
Retries must preserve truth, not just hope for it.
If you can't replay it safely, you never controlled it.
FAQ
- Q: Aren't retries necessary for reliability?
  A: Yes. But reliability without state integrity is just faster failure.
- Q: What if retries are part of the domain logic?
  A: Then encode them explicitly. Don't hide them behind infra.
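As a rough illustration of what "encode them explicitly" might look like, the sketch below models retries as state on a domain object rather than as behavior of a queue or client library. `PaymentAttempt`, `Status`, and the limit of three attempts are hypothetical choices, not a prescribed design.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


@dataclass
class PaymentAttempt:
    payment_id: str
    max_attempts: int = 3
    attempts: int = 0
    status: Status = Status.PENDING
    history: list = field(default_factory=list)

    def record_result(self, ok: bool) -> None:
        """Record the outcome of one attempt; settled payments never change."""
        if self.status is not Status.PENDING:
            return  # a late or duplicate result is ignored
        self.attempts += 1
        self.history.append("ok" if ok else "error")
        if ok:
            self.status = Status.SUCCEEDED
        elif self.attempts >= self.max_attempts:
            self.status = Status.FAILED

    def can_retry(self) -> bool:
        return self.status is Status.PENDING


payment = PaymentAttempt("pay-1")
payment.record_result(False)  # first attempt fails
payment.record_result(True)   # retry succeeds
payment.record_result(True)   # duplicate result arrives late and is ignored

print(payment.status, payment.attempts, payment.can_retry())
```

The attempt count, the limit, and the terminal states are all visible and testable in the domain model, instead of living in infrastructure configuration.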