Durable

Durable workflows for NestJS — write a workflow as plain code; every step is checkpointed, so it survives crashes and deploys. Steps can run across apps and languages, with a built-in control plane.

@dudousxd/nestjs-durable brings durable execution to NestJS. You write a workflow as ordinary async code — call a step, use its result, call the next — and the engine records every step's output. If the process crashes or you deploy mid-run, the workflow resumes from the last checkpoint instead of starting over. Steps can run locally in NestJS or on a remote worker (even a Python one), but it stays one workflow, with one source of truth, and one end-to-end timeline.

The one rule

The engine recovers a run by replaying the workflow function from the top. Completed steps return their saved result instead of executing again, but the code between steps does run again on every replay. Your run body must therefore be deterministic: no Date.now(), Math.random(), or direct I/O outside a step. See Durability & replay.
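
To make the rule concrete, here is a toy model of replay in plain TypeScript. ToyCtx, its step method, and the in-memory Map are stand-ins invented for this sketch, not the library's engine; it only illustrates why reading the clock inside a step keeps a replayed run consistent.

```typescript
// Toy replay model: NOT the library's engine, just an illustration of the rule.
type Checkpoints = Map<string, unknown>;

class ToyCtx {
  // Names of the steps that actually executed in this context (not replayed).
  readonly executed: string[] = [];
  private readonly checkpoints: Checkpoints;

  constructor(checkpoints: Checkpoints) {
    this.checkpoints = checkpoints;
  }

  // Return the saved result if the step already completed; otherwise run and save it.
  async step<T>(name: string, fn: () => Promise<T> | T): Promise<T> {
    if (this.checkpoints.has(name)) return this.checkpoints.get(name) as T;
    const result = await fn();
    this.checkpoints.set(name, result);
    this.executed.push(name);
    return result;
  }
}

// Deterministic body: the clock is read inside a step, so replay sees the saved value.
async function run(ctx: ToyCtx): Promise<{ createdAt: number; label: string }> {
  const createdAt = await ctx.step('now', () => Date.now());
  const label = await ctx.step('label', () => `order-${createdAt}`);
  return { createdAt, label };
}

// Run once, then replay against the same checkpoints to simulate a restart.
async function demo() {
  const checkpoints: Checkpoints = new Map();
  const first = new ToyCtx(checkpoints);
  const a = await run(first);
  const replay = new ToyCtx(checkpoints); // fresh context, same saved state
  const b = await run(replay);
  return { a, b, firstExecuted: first.executed, replayExecuted: replay.executed };
}
```

In this sketch the replayed run executes no steps and returns the same createdAt as the first run. Calling Date.now() directly in run, outside a step, would produce a different value on replay, which is the kind of divergence the rule forbids.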

The problem it solves

Today, multi-service flows are scattered: a queue here, a queue there, a piece in Python, and no single place to read or watch the whole thing. When a process dies halfway through, you are left reconstructing which steps already ran. nestjs-durable collapses that into three guarantees:

  • The flow becomes code, in one place. Read the workflow function and you understand the whole sequence — even when steps execute in different apps or languages.
  • Durability by replay. A crash or deploy never re-runs completed work. Each step is checkpointed; on recovery, finished steps replay their saved result and only unfinished work executes.
  • End-to-end visibility. Because one orchestrator owns the state, it knows about every step — including the remote ones — so a full-flow trace, dashboard, and Telescope view come almost for free.

Quickstart

The minimal loop is: install, write a workflow, register the module, start a run. It needs zero infrastructure, because it uses the in-process event-emitter transport and an in-memory store. Swap those for BullMQ and an ORM store when you go to production. For the full walkthrough, see Getting Started.

Install the core packages, the zero-infra transport, and its peers:

pnpm add @dudousxd/nestjs-durable @dudousxd/nestjs-durable-core @dudousxd/nestjs-durable-transport-event-emitter @nestjs/event-emitter zod

Write a workflow as plain async code. Every ctx.step is checkpointed; ctx.waitForSignal suspends the run until a signal arrives:

checkout.workflow.ts
import { Workflow } from '@dudousxd/nestjs-durable';
import type { WorkflowCtx } from '@dudousxd/nestjs-durable-core';

@Workflow({ name: 'checkout', version: '1' })
export class CheckoutWorkflow {
  async run(ctx: WorkflowCtx, order: { id: string; total: number }) {
    await ctx.step('reserveStock', async () => ({ reserved: true }));
    const approval = await ctx.waitForSignal<{ approved: boolean }>(`approve:${order.id}`);
    if (!approval.approved) return { status: 'rejected' };
    await ctx.step('ship', async () => ({ shipped: true }));
    return { status: 'shipped' };
  }
}

Register the module with the event-emitter transport and an in-memory store, then provide the workflow:

app.module.ts
import { DurableModule } from '@dudousxd/nestjs-durable';
import { InMemoryStateStore } from '@dudousxd/nestjs-durable-core';
import { EventEmitterTransport } from '@dudousxd/nestjs-durable-transport-event-emitter';
import { Module } from '@nestjs/common';
import { EventEmitter2, EventEmitterModule } from '@nestjs/event-emitter';
import { CheckoutWorkflow } from './checkout.workflow';

@Module({
  imports: [
    EventEmitterModule.forRoot(),
    DurableModule.forRootAsync({
      inject: [EventEmitter2],
      useFactory: (emitter: EventEmitter2) => ({
        store: new InMemoryStateStore(),
        transport: new EventEmitterTransport(emitter),
      }),
    }),
  ],
  providers: [CheckoutWorkflow],
})
export class AppModule {}

Start a run and resume it later. Calling start enqueues the run and returns immediately, so the HTTP handler never blocks on workflow logic:

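import { Injectable } from '@nestjs/common';
// Assumption: WorkflowService is exported from the main package.
import { WorkflowService } from '@dudousxd/nestjs-durable';

@Injectable()
export class CheckoutService { // class name is illustrative
  constructor(private readonly workflows: WorkflowService) {}

  async checkout(order: { id: string; total: number }) {
    const { runId } = await this.workflows.start('checkout', order); // → { status: 'pending' }
    return runId; // respond now; a worker runs the workflow body
  }

  // Later, from your approval webhook: deliver the signal the run is waiting on,
  // so the workflow continues to the ship step and completes.
  async approve(orderId: string) {
    await this.workflows.signal(`approve:${orderId}`, { approved: true });
  }
}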

Need the outcome inline instead? await this.workflows.waitForRun(runId) resolves once the run settles.
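
As a sketch of the inline variant, the helper below starts a run and waits for it to settle. RunsApi is a hypothetical minimal interface covering only the two calls named on this page, and fakeWorkflows is a stand-in for trying the helper out; the real WorkflowService typings and return shapes may differ.

```typescript
// Hypothetical minimal shape of the two calls used here; the real
// WorkflowService typings may differ.
interface RunsApi {
  start(name: string, input: unknown): Promise<{ runId: string }>;
  waitForRun(runId: string): Promise<unknown>;
}

// Start a run, then wait inline until it settles.
async function checkoutAndWait(
  workflows: RunsApi,
  order: { id: string; total: number },
): Promise<{ runId: string; outcome: unknown }> {
  const { runId } = await workflows.start('checkout', order);
  const outcome = await workflows.waitForRun(runId);
  return { runId, outcome };
}

// In-memory fake, only for exercising the helper without the library.
const fakeWorkflows: RunsApi = {
  async start() {
    return { runId: 'run-1' };
  },
  async waitForRun() {
    return { status: 'shipped' };
  },
};
```

With the checkout workflow above, waitForRun would not resolve until the approve signal has been delivered, so the inline form suits short runs and tests better than request handlers.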

Steps across apps and languages

A step does not have to run in NestJS. Declare a remote step with a typed, validated contract and call it with ctx.remote — the engine dispatches it over a pluggable transport to wherever its handler lives:

checkout.steps.ts
import { remoteStep } from '@dudousxd/nestjs-durable-core';
import { z } from 'zod';

export const chargeCard = remoteStep({
  name: 'payments.charge-card',
  input: z.object({ orderId: z.string(), amountCents: z.number().int() }),
  output: z.object({ chargeId: z.string() }),
  retries: 3,
});

The handler is just a provider method decorated with @DurableStep, decoupled from the workflow. With the event-emitter transport it runs in the same process; swap the transport for BullMQ to move it to a separate process — or implement it in a Python worker — without changing the workflow code.

The split goes both ways: a remote worker can implement a step the NestJS workflow calls, or author the whole workflow itself and call back into NestJS. Either way, the engine stays the single owner of durable state.
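
The idea of a contract dispatched over a pluggable transport can be sketched without the library. Everything below (ToyTransport, InProcessTransport, ToyRemoteStep, callRemote, and the hand-written validators standing in for the zod schemas) is hypothetical and written for illustration; it is not the library's real interface, and treating retries: 3 as "up to three extra attempts" is an assumption.

```typescript
// Hypothetical shapes for illustration; not the library's real interfaces.
type Handler = (input: unknown) => Promise<unknown>;

interface ToyTransport {
  dispatch(stepName: string, input: unknown): Promise<unknown>;
}

// In-process transport: handlers live in the same process as the caller.
class InProcessTransport implements ToyTransport {
  private readonly handlers = new Map<string, Handler>();

  register(stepName: string, handler: Handler): void {
    this.handlers.set(stepName, handler);
  }

  async dispatch(stepName: string, input: unknown): Promise<unknown> {
    const handler = this.handlers.get(stepName);
    if (!handler) throw new Error(`no handler for ${stepName}`);
    return handler(input);
  }
}

interface ToyRemoteStep<I, O> {
  name: string;
  retries: number;
  parseInput(value: unknown): I; // throws on invalid input
  parseOutput(value: unknown): O; // throws on invalid output
}

// Validate the input once, then dispatch; a failed dispatch or an invalid
// output counts as a failed attempt. Assumes retries means extra attempts.
async function callRemote<I, O>(transport: ToyTransport, step: ToyRemoteStep<I, O>, input: I): Promise<O> {
  const validInput = step.parseInput(input);
  let lastError: unknown;
  for (let attempt = 0; attempt <= step.retries; attempt++) {
    try {
      return step.parseOutput(await transport.dispatch(step.name, validInput));
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

// Toy counterpart of the chargeCard contract, with hand-written validators.
const toyChargeCard: ToyRemoteStep<{ orderId: string; amountCents: number }, { chargeId: string }> = {
  name: 'payments.charge-card',
  retries: 3,
  parseInput(value) {
    const { orderId, amountCents } = (value ?? {}) as Record<string, unknown>;
    if (typeof orderId !== 'string' || typeof amountCents !== 'number' || !Number.isInteger(amountCents)) {
      throw new Error('invalid input');
    }
    return { orderId, amountCents };
  },
  parseOutput(value) {
    const { chargeId } = (value ?? {}) as Record<string, unknown>;
    if (typeof chargeId !== 'string') throw new Error('invalid output');
    return { chargeId };
  },
};
```

The caller only depends on ToyTransport, so replacing InProcessTransport with a queue-backed implementation would leave callRemote and its callers unchanged, which mirrors the transport swap described above.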

What you get

  • Crash-proof by replay. Each step is checkpointed, and a completed step never executes again; only unfinished work runs after a restart.
  • Durable sleep and signals. Pause for minutes or months with ctx.sleep (no compute while waiting), or wait on a human approval or webhook with ctx.waitForSignal. Both survive restarts.
  • Bring your ORM, any SQL database. State lives in Postgres, MySQL, or SQLite through a StateStore interface, with MikroORM, TypeORM, Prisma, and Drizzle adapters and auto-schema on boot.
  • See the whole flow. A built-in control plane renders each run as a graph; OpenTelemetry and a Telescope watcher give you two more views of the same event log.
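
The "no compute while waiting" property of durable sleep can be illustrated with a toy model. This is an assumption about the general technique (persist a wake-up time, stop the run, replay later), not a description of the library's internals; Suspended, ToySleepCtx, reminder, and tick are names invented for the sketch, and sleep is synchronous here only for brevity.

```typescript
// Toy model of durable sleep; illustrates the general technique only.
// Thrown to stop a run that is not yet due; carries the time to retry at.
class Suspended {
  readonly wakeAt: number;
  constructor(wakeAt: number) {
    this.wakeAt = wakeAt;
  }
}

class ToySleepCtx {
  private readonly saved: Map<string, number>;
  private readonly now: number;
  private counter = 0;

  constructor(saved: Map<string, number>, now: number) {
    this.saved = saved;
    this.now = now;
  }

  // Persist the wake-up time once; on replay, compare it with the current time.
  sleep(ms: number): void {
    const key = `sleep:${this.counter++}`;
    if (!this.saved.has(key)) this.saved.set(key, this.now + ms);
    const wakeAt = this.saved.get(key)!;
    if (this.now < wakeAt) throw new Suspended(wakeAt); // stop; nothing stays in memory
  }
}

// A run body that waits one minute before finishing.
function reminder(ctx: ToySleepCtx): string {
  ctx.sleep(60000);
  return 'reminder sent';
}

// Drive the run at a given time: either it finishes or it reports when to retry.
function tick(
  saved: Map<string, number>,
  now: number,
): { done: true; result: string } | { done: false; wakeAt: number } {
  try {
    return { done: true, result: reminder(new ToySleepCtx(saved, now)) };
  } catch (err) {
    if (err instanceof Suspended) return { done: false, wakeAt: err.wakeAt };
    throw err;
  }
}
```

Because the wake-up time lives in the saved map rather than in a timer, a restart between two calls to tick loses nothing: the next replay reads the same wakeAt and either suspends again or continues.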
