Skip to content

Is a Rollback Strategy in Place for Critical Changes?

Type: DeepDive
Category: Release
Audience: SREs, senior engineers, tech leads involved in high-risk deployments


🔍 What This Perspective Covers

You don’t truly own a system until you can safely undo its changes.

This perspective examines whether rollback paths are designed, validated, and executable—not just documented.

Real-World Risk Cases

  • DB schema migration that drops columns
  • Batch jobs that transform or delete data
  • Feature rollout that modifies shared state
  • Deployment that bypasses blue/green isolation

⚠️ Failure Patterns

  • Rollback scripts exist but have never been tested
  • Infra rollback is possible, but data state is irreversible
  • Team assumes “fix forward” without safeguards
  • No plan to rollback partially completed operations

✅ Smarter Rollback Strategy

  • Identify irreversible operations early (schema, state, external calls)
  • Separate schema rollout and feature exposure if possible
  • Test rollback in staging—including data rollback
  • Automate rollback as a first-class pipeline step
  • Pair with alerting: know when rollback is needed and whether it worked

🧠 Principle

Rollback is not backup.
It’s an engineered return path.


❓ FAQ

  • Q: Do we always need rollback?
    A: No. But if rollback isn’t feasible, pre-checks and isolation become more critical.

  • Q: What’s the difference between rollback and revert?
    A: Rollback implies cleanup and recovery, not just code diff reversal.