Long-term fixes address root causes to prevent recurrence. Unlike mitigation, which restores service quickly, a long-term fix implements a sustainable solution.

Process

1. Understand root cause

Confirm the root cause with high confidence before designing a fix:
  • Root cause identified and supported by strong evidence
  • Evidence chain established from symptom to cause
  • Root cause validated
  • Root cause documented

2. Design solution

Design a fix that:
  • Addresses root cause, not symptoms
  • Prevents recurrence
  • Is sustainable and maintainable
  • Minimizes risk of new issues
Solution types:
  • Code fixes — Bug fixes, error handling, resource management
  • Architecture changes — Resilience, scalability, dependencies
  • Process improvements — Testing, deployment, monitoring

3. Implement fix

  1. Make code changes
  2. Test thoroughly
  3. Get code review
  4. Update documentation
  5. Deploy carefully

4. Validate fix

  • Verify fix works
  • Check for regressions
  • Monitor performance
  • Roll out gradually if possible
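
A gradual rollout can be sketched as a percentage gate that assigns each caller to a stable bucket. This is a minimal illustration, not a prescribed mechanism: the helper names `rolloutBucket` and `isInRollout` are hypothetical, and in practice a feature-flag service or deployment tool may already provide this.

```javascript
const crypto = require('crypto');

// Hypothetical helper: map an ID to a stable bucket in the range [0, 100).
// Hashing keeps the assignment deterministic across processes and restarts.
function rolloutBucket(id) {
  const digest = crypto.createHash('sha256').update(String(id)).digest();
  return digest.readUInt32BE(0) % 100;
}

// Hypothetical helper: true when the ID falls inside the current rollout
// percentage. Raising the percentage only ever adds IDs, never removes them.
function isInRollout(id, percent) {
  return rolloutBucket(id) < percent;
}
```

Because the bucket depends only on the ID, an ID that received the fix at 10% keeps receiving it at 50%, which makes regressions easier to attribute while monitoring.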

Example

Root cause: PR #1847 introduced a connection leak in error handling.
Fix:
// Before
async findPendingOrders() {
  const connection = await pool.getConnection();
  // ... batch queries
  // Missing: connection release on error
}

// After
async findPendingOrders() {
  const connection = await pool.getConnection();
  try {
    // ... batch queries
  } finally {
    connection.release(); // Always release
  }
}
Validation:
  • Unit tests for error paths
  • Integration tests with connection pool
  • Code review approval
  • Gradual production rollout

Long-term fix vs. mitigation

  • Mitigation — Quick, temporary, reduces impact, restores service
  • Long-term fix — Thorough, permanent, prevents recurrence, improves system

Best practices

  • Address root cause, not symptoms
  • Test thoroughly before deploying
  • Get code review
  • Monitor after deployment
  • Document what changed and why