
Sagas & compensation

Undo the side effects of a partially-completed run with per-step compensate callbacks that run in reverse on failure, compensationRetries for transient undos, compensate:<step> events, and compensating cancellation via engine.cancel(runId, { compensate: true }).

A durable run often performs several irreversible side effects in sequence — reserve inventory, charge a card, allocate a shipment. If a later step fails, the earlier effects are still out there in the world, and "the run failed" is not an acceptable end state when money has moved. The saga pattern handles this: alongside each step that does something, you register how to undo it, and when the run fails the engine runs those undos in reverse order. nestjs-durable builds this in via a compensate option on ctx.step.

Registering a compensation

Attach a compensate callback to a local step. The callback is registered when the step completes; if the run later fails, the engine runs every registered compensation in reverse order (last completed step undone first), undoing the completed side effects before it fails the run.

@Workflow({ name: 'checkout', version: '1' })
export class CheckoutWorkflow {
  constructor(
    private readonly inventory: InventoryService,
    private readonly payments: PaymentsService,
    private readonly shipping: ShippingService,
  ) {}

  async run(ctx: WorkflowCtx, order: Order) {
    const reservation = await ctx.step(
      'reserve-inventory',
      () => this.inventory.reserve(order.items),
      { compensate: () => this.inventory.release(order.items) },
    );

    const charge = await ctx.step(
      'charge-card',
      () => this.payments.charge(order.customerId, order.totalCents),
      { compensate: () => this.payments.refund(order.customerId, order.totalCents) },
    );

    // If this step throws, the engine runs the two compensations above in reverse:
    // refund the card, then release the inventory — then fails the run.
    const label = await ctx.step('allocate-shipment', () =>
      this.shipping.allocate(order, reservation.id),
    );

    return { chargeId: charge.id, tracking: label.tracking };
  }
}

If allocate-shipment fails permanently (it exhausts its retries or throws a FatalError), the run fails, but first the engine refunds the card and releases the inventory. The saga is reconstructed from the run's history on replay, so it works correctly even after a crash: the steps that completed are the ones whose compensations get registered.
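
To make the unwind order concrete, here is a minimal in-memory sketch of the pattern. It is not the engine's implementation: SagaSketch and runCheckoutSketch are hypothetical names, the model is synchronous for brevity, and it ignores retries, history, and replay. It only shows that an undo is registered once its step completes and that undos run last-in, first-out.

```typescript
// Illustrative model only; SagaSketch is a hypothetical name, not a nestjs-durable API.
// Synchronous for brevity: real steps and undos would return promises.
type Registered = { step: string; undo: () => void };

class SagaSketch {
  private readonly registered: Registered[] = [];

  // Run a step and register its undo only after the step has completed.
  step<T>(name: string, fn: () => T, undo?: () => void): T {
    const result = fn(); // if fn throws, nothing is registered for this step
    if (undo) this.registered.push({ step: name, undo });
    return result;
  }

  // Run the registered undos in reverse completion order.
  unwind(): string[] {
    const undone: string[] = [];
    for (const { step, undo } of [...this.registered].reverse()) {
      undo();
      undone.push(step);
    }
    return undone;
  }
}

function runCheckoutSketch(): string[] {
  const log: string[] = [];
  const saga = new SagaSketch();
  try {
    saga.step('reserve-inventory', () => { log.push('reserve'); }, () => { log.push('release'); });
    saga.step('charge-card', () => { log.push('charge'); }, () => { log.push('refund'); });
    // This step has no undo and fails, which triggers the unwind below.
    saga.step('allocate-shipment', () => { throw new Error('carrier unavailable'); });
  } catch {
    saga.unwind();
    log.push('run failed');
  }
  return log;
}
```

In this model the failing step never registers anything, which mirrors the rule above: only steps that completed contribute a compensation.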

Compensations attach to local steps only. A remote step is already deduplicated by its deterministic stepId (runId:seq), which workers can use as an idempotency key. To compensate remote work, wrap the orchestration in a local step, or model the undo as its own remote step invoked from a compensating local step.

Retrying a transient undo

A compensation can itself fail transiently: the refund API might be momentarily unreachable. The engine attempts each compensation up to compensationRetries times. This is an engine/module-level option (it applies to every compensation), and it defaults to 1, i.e. a single attempt with no retry:

DurableModule.forRoot({
  store,
  transport,
  compensationRetries: 5, // attempt each saga undo up to 5 times before giving up on it
});

Because a compensation may run more than once, compensations should be idempotent — releasing an already-released reservation or refunding an already-refunded charge must be a no-op. A compensation that keeps failing past compensationRetries is skipped rather than allowed to throw: a permanently-failing undo must not mask the original failure or strand the remaining compensations.
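
The attempt-then-skip behaviour and the idempotency requirement can be sketched as follows. This is an assumption-laden model, not the engine's code: runCompensation, UndoOutcome, and makeIdempotentRefund are hypothetical names, the count is read as a total number of attempts (consistent with the default of 1 meaning a single attempt), and no backoff between attempts is modelled.

```typescript
// Hypothetical helper, not a nestjs-durable API: attempt an undo up to
// maxAttempts times, and report failure instead of throwing.
type UndoOutcome = { step: string; status: 'completed' | 'failed'; attempts: number };

function runCompensation(step: string, undo: () => void, maxAttempts: number): UndoOutcome {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      undo();
      return { step, status: 'completed', attempts: attempt };
    } catch {
      // Treat the error as transient and fall through to the next attempt.
    }
  }
  // Attempts exhausted: skip this undo so the remaining ones still run.
  return { step, status: 'failed', attempts: maxAttempts };
}

// An idempotent undo: refunding an already-refunded charge is a no-op.
function makeIdempotentRefund(): { refund: (chargeId: string) => void; issued: string[] } {
  const issued: string[] = [];
  const refund = (chargeId: string): void => {
    if (issued.includes(chargeId)) return; // already refunded, nothing to do
    issued.push(chargeId);
  };
  return { refund, issued };
}

// Unwind a list of undos (already in reverse completion order); a failing
// undo never prevents the ones after it from running.
function unwindAll(undos: Array<{ step: string; undo: () => void }>, maxAttempts: number): UndoOutcome[] {
  return undos.map(({ step, undo }) => runCompensation(step, undo, maxAttempts));
}
```

Because runCompensation catches every error, a permanently failing refund yields a failed outcome for that step while the inventory release that follows it still runs.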

Compensations are visible

Every compensation surfaces as a compensate:<step> event, emitted as a step.completed (the undo ran) or step.failed (it exhausted its retries) lifecycle event. The dashboard and the Telescope integration render these, so a stranded undo is visible rather than silently swallowed. For the checkout above you'd see compensate:charge-card and compensate:reserve-inventory appear in the timeline as the saga unwinds.
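
As a rough illustration of the naming, the sketch below derives the timeline entries for a saga unwind. The CompensationEvent shape (a type plus a name field) is an assumption made for this example; the text only establishes the compensate:<step> name and the step.completed / step.failed event kinds, not the actual payload.

```typescript
// Assumed event shape for illustration; the real payload may differ.
type CompensationEvent = {
  type: 'step.completed' | 'step.failed';
  name: string; // compensate:<step>
};

// completedSteps is in completion order; events come out in unwind order.
function compensationEvents(
  completedSteps: string[],
  undoSucceeded: (step: string) => boolean,
): CompensationEvent[] {
  return [...completedSteps].reverse().map((step): CompensationEvent => ({
    type: undoSucceeded(step) ? 'step.completed' : 'step.failed',
    name: `compensate:${step}`,
  }));
}

// Checkout example where the refund exhausts its attempts but the release succeeds.
const checkoutTimeline = compensationEvents(
  ['reserve-inventory', 'charge-card'],
  (step) => step !== 'charge-card',
);
```

Here checkoutTimeline would hold a step.failed entry for compensate:charge-card followed by a step.completed entry for compensate:reserve-inventory, which is the kind of stranded undo the timeline is meant to expose.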

Compensating cancellation

The saga also runs when you deliberately cancel a run with compensation. A plain engine.cancel(runId) is immediate: it marks the run cancelled right away and broadcasts the cancellation so a worker actually running it can abort cooperatively — but it does not undo completed steps. Passing { compensate: true } instead runs the saga first:

// Immediate cancel — mark cancelled, abort in-flight work, but leave completed side effects in place:
await engine.cancel(runId);

// Compensating cancel — undo the completed steps in reverse, THEN mark the run cancelled:
await engine.cancel(runId, { compensate: true });

A compensating cancel works by resuming the run with a cancellation pending: the replay re-registers the saga from history, and at the run's suspension point the engine runs the compensations in reverse and marks the run cancelled (rather than re-suspending). For the checkout example, passing { compensate: true } when cancelling a run that had already reserved inventory and charged the card issues the refund and releases the reservation before the run becomes cancelled, leaving the world clean, exactly as a failure would.
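
The resume-then-unwind sequence can be modelled in a few lines. This is a simplified sketch under stated assumptions: HistoryEntry, RunStatus, and resumeSketch are hypothetical names, history is reduced to a list of step names with a completed flag, and the function is synchronous. It shows only the decision made at the suspension point, not how the engine stores or replays history.

```typescript
// Hypothetical model of resuming a suspended run; not the engine's code.
type HistoryEntry = { step: string; completed: boolean };
type RunStatus = 'suspended' | 'cancelled';

function resumeSketch(
  history: HistoryEntry[],
  undos: Record<string, () => void>,
  cancelPending: boolean,
): RunStatus {
  // Replay: re-register the undo of every completed step that declared one.
  const saga: Array<() => void> = [];
  for (const entry of history) {
    const undo = undos[entry.step];
    if (entry.completed && undo) saga.push(undo);
  }

  // Suspension point: without a pending cancellation the run suspends again.
  if (!cancelPending) return 'suspended';

  // With one pending, unwind in reverse, then mark the run cancelled.
  for (const undo of [...saga].reverse()) undo();
  return 'cancelled';
}

function cancelCheckoutSketch(cancelPending: boolean): { status: RunStatus; log: string[] } {
  const log: string[] = [];
  const status = resumeSketch(
    [
      { step: 'reserve-inventory', completed: true },
      { step: 'charge-card', completed: true },
    ],
    {
      'reserve-inventory': () => { log.push('release'); },
      'charge-card': () => { log.push('refund'); },
    },
    cancelPending,
  );
  return { status, log };
}
```

Calling cancelCheckoutSketch(true) refunds and then releases before reporting cancelled, while cancelCheckoutSketch(false) leaves both side effects in place and reports suspended.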
