title: Checkpoint Commit slug: checkpoint-commit category: workflow-pattern status: battle-tested difficulty: beginner tags: [git, safety, rollback, long-tasks, risk-management] prerequisites: [basic-cli-usage, basic-git] estimated_time: 5min to learn, immediate use cost_per_use: "$0.00 (workflow habit)"

Checkpoint Commit

Problem

Long agent tasks are risky. The agent works well for 10 steps, then makes a bad decision on step 11 that corrupts files modified in steps 7-10. Without checkpoints, your only options are to manually untangle the damage or start over entirely. You need rollback points so that agent mistakes are cheap to recover from.

Solution

Instruct the agent to commit at natural breakpoints during multi-step tasks. Each commit is a save point you can revert to.

Step-by-Step

  1. Include checkpoint instructions in your prompt or CLAUDE.md.
  2. Define breakpoints: after each file, after each logical step, or after each passing test.
  3. Use descriptive commit messages so you can identify good rollback points.
  4. If something goes wrong: git log --oneline to find the last good checkpoint, then git reset.

When to Use

  • Any task the agent will spend more than 5 minutes on
  • Multi-file refactors
  • Migrations or upgrades
  • Any task where you have said "I wish I could undo that"
  • When running agents in headless/background mode

When NOT to Use

  • Quick single-file edits
  • Exploratory/throwaway work on a scratch branch
  • When you prefer to squash everything into one commit at the end (but still checkpoint on a temp branch)

Example: Claude Code

# Option 1: Instruct in the prompt
claude -p "Migrate all API endpoints from Express to Fastify. \
  Work through one route file at a time. \
  After each file is migrated and its tests pass, make a git commit \
  with the message 'checkpoint: migrate <filename> to Fastify'. \
  Do not move to the next file until the current one is committed."

# Option 2: Add to CLAUDE.md for all tasks
cat >> CLAUDE.md << 'EOF'

## Git Workflow
- During multi-step tasks, commit after each logical step.
- Use the prefix "checkpoint:" for intermediate commits.
- Always run tests before committing.
- Never amend a previous checkpoint commit.
EOF

# Recovery when something goes wrong
git log --oneline -10
# a1b2c3d checkpoint: migrate payments.ts to Fastify   <-- last good
# d4e5f6g checkpoint: migrate orders.ts to Fastify      <-- broken
git reset --soft a1b2c3d   # keep changes staged so you can inspect
git diff --cached           # see what the bad step did
git reset --hard a1b2c3d   # discard the bad changes entirely
# Interactive session with manual checkpoints
claude

# > Refactor the auth module. Start with src/auth/middleware.ts.
# (agent finishes middleware.ts)
# > Commit this as a checkpoint before moving on.
# (agent commits)
# > Now refactor src/auth/tokens.ts.
# (agent makes a mess)
# > Stop. Revert to the last checkpoint and try tokens.ts again
#   with a different approach.

Example: Codex CLI

# Codex with checkpoint instructions
codex -q "Migrate route files in src/routes/ from Express to Fastify. \
  Process one file at a time. After each file, run tests and commit \
  with message 'checkpoint: migrate <file>'."

# Recovery
git log --oneline -10
git reset --hard <last-good-commit>

Cost Estimate

ActivityCost
Checkpoint commits$0.00
Recovery (git reset)$0.00
Re-running failed step~$0.05-$0.20

Checkpoints are free. The cost savings come from not re-running the entire task when only the last step failed.

Maturity Notes

Status: Battle-tested. This is the single most important safety pattern for long-running agent tasks. Every experienced CLI agent user adopts some form of checkpointing. The main pitfall is agents that commit broken code — always include "run tests before committing" in your instructions. Some teams use a dedicated checkpoint/ branch prefix and squash-merge to main when the full task succeeds.