Capture & correlation
How watchers, batches, and AsyncLocalStorage turn scattered events into one navigable flow — the request and everything it caused, in capture order.
The unit of value in Telescope is the batch — one entry-point (a request, a queue job, a scheduled tick) and everything it caused. Correlation is the differentiator: it's the view metrics and logs can't give you, where a single request expands into the exact queries, jobs, exceptions, and mails it produced, in the order they happened.
Watchers
A watcher captures one kind of activity and records it as an Entry. Watchers come in two flavours:
- Entry-point watchers open a batch: the HTTP request watcher, the queue job watcher, the schedule watcher, and manual Telescope.batch().
- Sub-watchers record into whatever batch is already active: query, mail, cache, event, log, Redis-command, model, and outbound-HTTP watchers.
Request and exception capture are wired automatically by TelescopeModule.forRoot(). Every other watcher is a value you add to the watchers array (most live in their own package). The same Watcher SPI backs the built-ins and your own watchers — community watchers are first-class, not second-class hooks.
interface Watcher {
  readonly type: string; // entry type it produces
  register(ctx: WatcherContext): void | Promise<void>; // wire NestJS hooks
  shouldRecord?(candidate: unknown): boolean; // cheap pre-filter
}
Watchers never touch storage and never block. They call ctx.record(...), which returns immediately.
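As an illustration of the SPI, here is a minimal sketch of a custom watcher. Only `type`, `register`, `shouldRecord`, and `ctx.record(...)` come from the text above; the `RecordInput` shape, the `FakeFeatureFlagClient`, and the `feature-flag` entry type are hypothetical and exist only so the example runs on its own.

```typescript
// Assumed shapes: the real WatcherContext and record() input may differ.
interface RecordInput {
  type: string;
  content: unknown;
  durationMs?: number;
}

interface WatcherContext {
  record(input: RecordInput): void;
}

interface Watcher {
  readonly type: string;
  register(ctx: WatcherContext): void | Promise<void>;
  shouldRecord?(candidate: unknown): boolean;
}

// Hypothetical stand-in for whatever library the watcher hooks into.
type EvaluateListener = (payload: { name: string; ms: number }) => void;

class FakeFeatureFlagClient {
  private listeners: EvaluateListener[] = [];

  onEvaluate(listener: EvaluateListener): void {
    this.listeners.push(listener);
  }

  evaluate(name: string, ms: number): void {
    for (const listener of this.listeners) listener({ name, ms });
  }
}

class FeatureFlagWatcher implements Watcher {
  readonly type = "feature-flag";
  private readonly client: FakeFeatureFlagClient;

  constructor(client: FakeFeatureFlagClient) {
    this.client = client;
  }

  // Cheap pre-filter: skip internal flags before any entry is built.
  shouldRecord(candidate: unknown): boolean {
    return typeof candidate === "string" && !candidate.startsWith("internal.");
  }

  // Wire the hook once; the callback only calls ctx.record and returns.
  register(ctx: WatcherContext): void {
    this.client.onEvaluate(({ name, ms }) => {
      if (!this.shouldRecord(name)) return;
      ctx.record({ type: this.type, content: { flag: name }, durationMs: ms });
    });
  }
}
```

The watcher holds no reference to storage: everything it knows about persistence is the `record` call on the context it was handed.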
Batches and AsyncLocalStorage
A TelescopeContext lives in AsyncLocalStorage. When an entry-point watcher calls beginBatch(origin), it seeds a Batch { id, origin, startedAt } into the ALS store for the duration of that async flow. Every record() made inside that flow then inherits the batch's id and gets a monotonic sequence — so entries reassemble in capture order with no manual plumbing.
This is why adapters like MikroORM and TypeORM correlate each query to its request: their loggers run inside the query's async context, so the active batch is already there. (Prisma is the exception — its query events fire detached from the caller's context, so Prisma queries are captured but orphaned. See the Prisma package.)
Non-HTTP entry points get their own batch:
- a queue worker opens a batch per job,
- @nestjs/schedule opens one per cron / interval / timeout tick,
- Telescope.batch(origin, fn) wraps an arbitrary script or CLI run.
Entries recorded outside any batch get a synthetic per-entry batch, so nothing is ever lost.
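The mechanism can be sketched with Node's `AsyncLocalStorage`. This is an illustration under assumptions, not Telescope's implementation: `runInBatch`, `record`, and the `nextSequence` counter are hypothetical names, `randomUUID` (uuid v4) stands in for the uuid v7 ids the real entries use, and the `manual` origin given to the synthetic batch is a guess.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";
import { randomUUID } from "node:crypto";

type BatchOrigin = "http" | "queue" | "schedule" | "cli" | "manual";

interface Batch {
  id: string;
  origin: BatchOrigin;
  startedAt: Date;
  nextSequence: number; // assumed counter backing the monotonic sequence
}

interface RecordedEntry {
  batchId: string;
  origin: BatchOrigin;
  sequence: number;
  type: string;
}

const batchStore = new AsyncLocalStorage<Batch>();

function newBatch(origin: BatchOrigin): Batch {
  return { id: randomUUID(), origin, startedAt: new Date(), nextSequence: 0 };
}

// Entry-point side: seed a batch for the duration of fn's async flow.
function runInBatch<T>(origin: BatchOrigin, fn: () => T): T {
  return batchStore.run(newBatch(origin), fn);
}

// Sub-watcher side: inherit the active batch, or fall back to a
// synthetic per-entry batch when nothing is active.
function record(type: string): RecordedEntry {
  const batch: Batch = batchStore.getStore() ?? newBatch("manual");
  return {
    batchId: batch.id,
    origin: batch.origin,
    sequence: batch.nextSequence++,
    type,
  };
}
```

Because the store follows the async flow, a `record()` call made after an `await` inside `fn` still sees the same batch, which is what removes the need to pass a correlation id by hand.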
traceId / spanId from an active OpenTelemetry span are stamped onto entries (via the -otel provider), so a Telescope batch maps 1:1 to a trace and the correlation survives across the OTel bridge.
The Entry
Every watcher produces the same universal record. Type-specific data lives in content; everything else is uniform, so the API, dashboard, pruner, and OTel bridge treat all entry types identically:
interface Entry<TContent = unknown> {
  id: string; // uuid v7: time-sortable, globally unique
  batchId: string; // correlation key; all entries in one batch share it
  type: string; // 'request' | 'query' | 'job' | 'exception' | 'mail' | <custom>
  familyHash: string | null; // groups "the same thing" (query template, exception class+message)
  content: TContent; // redacted, type-specific payload
  tags: string[]; // cross-cutting filters: 'status:500', 'user:42', 'slow'
  sequence: number; // order within the batch (capture order)
  durationMs: number | null;
  origin: BatchOrigin; // 'http' | 'queue' | 'schedule' | 'cli' | 'manual'
  instanceId: string; // hostname / pod id, for multi-instance aggregation
  createdAt: Date;
}
familyHash is what powers "show me every occurrence of this exception" and the slow-query / duplicate-query views without scanning content. Queries normalize to a SQL template before hashing; exceptions hash on class + message.
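To make the idea of a family concrete, here is a rough sketch of template normalization followed by hashing. The normalization rules (literals replaced with `?`, whitespace collapsed), the SHA-256 digest, and the 16-character truncation are all assumptions chosen for illustration; the text does not specify how Telescope normalizes or hashes.

```typescript
import { createHash } from "node:crypto";

// Assumed normalization: strip literal values so that two executions of
// the same statement with different parameters produce the same template.
function toSqlTemplate(sql: string): string {
  return sql
    .replace(/'(?:[^']|'')*'/g, "?") // quoted string literals
    .replace(/\b\d+(?:\.\d+)?\b/g, "?") // numeric literals
    .replace(/\s+/g, " ") // collapse whitespace and newlines
    .trim();
}

// Join parts with a separator that cannot appear in normal text, then hash.
function hashParts(parts: string[]): string {
  return createHash("sha256").update(parts.join("\u0000")).digest("hex").slice(0, 16);
}

function queryFamilyHash(sql: string): string {
  return hashParts(["query", toSqlTemplate(sql)]);
}

function exceptionFamilyHash(className: string, message: string): string {
  return hashParts(["exception", className, message]);
}
```

With a stable hash stored on each entry, grouping every occurrence of a query or exception becomes an equality lookup on `familyHash` rather than a scan over `content`.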
Exception capture
Exceptions thrown out of a route handler are captured automatically (no watcher to register) and recorded as exception entries — which is what opens an error family, drives the new-exception alert, and feeds AI diagnosis.
By default, expected 4xx control flow is not recorded as an exception. A NestJS HttpException whose status is a 4xx — a 403 ForbiddenException, a 404 NotFoundException, a 400 from the validation pipe — is the framework doing its job (permission denied, resource missing, bad input), not an incident. Recording each one would open a new exception family (the family hash keys on class + message + top frame, so every call site is distinct), fire the new-exception alert, and — in AI auto-mode — spend model tokens diagnosing intended behaviour. In production every permission denial would page on-call and burn a diagnosis. (This default changed after exactly that incident: Telescope's own client-errors authorize gate threw a 403, which was captured as a brand-new family and paged Slack.)
The 4xx is not lost — the request watcher still records the 4xx statusCode (and a status:NNN tag, e.g. status:404) on its own request entry. You still see the 4xx in the dashboard and in error-rate metrics; it just doesn't spawn an exception family, can't fire new-exception, and can't trigger diagnosis.
Always recorded: 5xx HttpExceptions (real server errors) and any non-HttpException error (a plain Error, TypeError, etc.). Untouched: browser-reported client_exception entries — those are deliberate reports recorded directly by the ingestion endpoint, never through this filter.
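The rules in this section reduce to one predicate, sketched below. The `HttpException` class here is a minimal local stand-in for the NestJS class (only `getStatus()` is mirrored) so the example runs without the framework, and `shouldRecordException` is a hypothetical name. The `captureHttp4xx` parameter mirrors the option of the same name. How statuses outside 4xx and 5xx are treated is not stated in the text; recording them is an assumption.

```typescript
// Stand-in for NestJS's HttpException; only getStatus() is modelled.
class HttpException extends Error {
  private readonly status: number;

  constructor(message: string, status: number) {
    super(message);
    this.status = status;
  }

  getStatus(): number {
    return this.status;
  }
}

function shouldRecordException(error: unknown, captureHttp4xx = false): boolean {
  // Plain Error, TypeError, and anything else that is not an HttpException.
  if (!(error instanceof HttpException)) return true;

  const status = error.getStatus();
  if (status >= 500) return true; // real server errors are always recorded
  if (status >= 400) return captureHttp4xx; // expected control flow unless opted in
  return true; // assumption: unusual statuses are recorded rather than hidden
}
```

A skipped 4xx still shows up on the request entry through its statusCode and status tag, so the predicate only decides whether an exception entry is created.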
To opt 4xx back in (restore the pre-change behaviour), set exceptions.captureHttp4xx:
TelescopeModule.forRoot({
  exceptions: { captureHttp4xx: true }, // default false: 4xx is control flow, not an incident
});
The Recorder pipeline
ctx.record(input) hands the entry to the Recorder — a bounded, async, backpressure-safe pipe between watchers and storage. The application thread never waits on it:
record(input)
  → enrich (attach batchId from ALS, instanceId, sequence, createdAt)
  → tag (run registered Taggers; built-in + user-provided)
  → redact (deep-redact configured paths + default sensitive keys)
  → sample (per-type sampling + filter() hook; drop early)
  → buffer (push to a bounded ring buffer)
  → flush (drain in batches on a timer / size threshold → StorageProvider.store)
The guarantees that make this safe to run in production:
- Non-blocking. record() is synchronous and O(1); all I/O is deferred to the flush timer.
- Bounded memory. The ring buffer has a hard cap; on overflow it drops the oldest entry and increments a dropped counter. It never grows unboundedly and never blocks the app.
- Batched writes. Flushes coalesce many entries into one store() call.
- Graceful shutdown. onApplicationShutdown drains the buffer with a timeout.
- Failure isolation. A storage error is logged and the batch is dropped, so a broken telescope never breaks the host app.
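The buffer and flush guarantees above can be sketched as follows. This is one possible shape under assumptions, not the Recorder's actual code: `BoundedRingBuffer`, `flushOnce`, and the `store` callback are hypothetical names, and the timer and shutdown wiring are left out.

```typescript
class BoundedRingBuffer<T> {
  private readonly slots: (T | undefined)[];
  private head = 0; // index of the oldest item
  private size = 0;
  dropped = 0; // how many entries were overwritten on overflow

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity < 1) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.slots = new Array<T | undefined>(capacity).fill(undefined);
  }

  get length(): number {
    return this.size;
  }

  // O(1): when full, overwrite the oldest slot instead of growing.
  push(item: T): void {
    const capacity = this.slots.length;
    if (this.size === capacity) {
      this.slots[this.head] = item;
      this.head = (this.head + 1) % capacity;
      this.dropped++;
      return;
    }
    this.slots[(this.head + this.size) % capacity] = item;
    this.size++;
  }

  // Remove and return up to max items, oldest first.
  drain(max: number): T[] {
    const out: T[] = [];
    const capacity = this.slots.length;
    while (this.size > 0 && out.length < max) {
      out.push(this.slots[this.head] as T);
      this.slots[this.head] = undefined; // release the reference
      this.head = (this.head + 1) % capacity;
      this.size--;
    }
    return out;
  }
}

// One flush cycle: many entries, one store() call, failures contained.
async function flushOnce<T>(
  buffer: BoundedRingBuffer<T>,
  store: (batch: T[]) => Promise<void>,
  batchSize: number,
): Promise<void> {
  const batch = buffer.drain(batchSize);
  if (batch.length === 0) return;
  try {
    await store(batch);
  } catch (error) {
    // Log and drop: a storage failure must not propagate to the host app.
    console.error("telescope: store failed, batch dropped", error);
  }
}
```

Dropping the oldest entry on overflow favours recent activity, which is usually what someone debugging a live incident wants to see.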
Redaction is load-bearing
The synchronous redact() step is not just for privacy: it snapshots each entry's content into a plain, reference-free object at record() time. That releases live object graphs (e.g. a hydrated ORM entity captured off req.user, which references its EntityManager and identity map). Keeping redaction synchronous doubles as a detach that bounds memory — deferring it retains those graphs until flush and can OOM the host. See Performance.
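A minimal sketch of what "snapshot into a plain, reference-free object" can look like. The key list, the `[REDACTED]` / `[Circular]` / `[Truncated]` markers, and the depth limit are assumptions made for the example; Telescope's actual defaults are not given in the text.

```typescript
const REDACTED = "[REDACTED]";

// Assumed default key list, compared case-insensitively.
const DEFAULT_SENSITIVE_KEYS: ReadonlySet<string> = new Set([
  "password",
  "authorization",
  "token",
  "secret",
]);

// Deep-copies value into plain objects, arrays, and primitives, masking
// sensitive keys along the way. The result shares no references with the
// input, so the original object graph can be garbage-collected.
function snapshot(
  value: unknown,
  sensitiveKeys: ReadonlySet<string> = DEFAULT_SENSITIVE_KEYS,
  maxDepth = 6,
  ancestors: WeakSet<object> = new WeakSet<object>(),
): unknown {
  if (typeof value === "function" || typeof value === "symbol") return undefined;
  if (typeof value === "bigint") return value.toString();
  if (value === null || typeof value !== "object") return value;
  if (value instanceof Date) return value.toISOString();
  if (ancestors.has(value)) return "[Circular]";
  if (maxDepth <= 0) return "[Truncated]";

  ancestors.add(value);
  let result: unknown;
  if (Array.isArray(value)) {
    result = value.map((item) => snapshot(item, sensitiveKeys, maxDepth - 1, ancestors));
  } else {
    const plain: Record<string, unknown> = {};
    for (const [key, child] of Object.entries(value)) {
      plain[key] = sensitiveKeys.has(key.toLowerCase())
        ? REDACTED
        : snapshot(child, sensitiveKeys, maxDepth - 1, ancestors);
    }
    result = plain;
  }
  ancestors.delete(value); // only the current path counts as a cycle
  return result;
}
```

Running this synchronously inside record() means the buffer only ever holds the small copy, never the hydrated entity and whatever it references.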
The request flow, end to end:
GET /orders/42 → 5 queries (1 flagged slow, 2 duplicates) → 1 job dispatched (SendReceipt) → 1 outbound HTTP call → 1 exception, all sharing one batchId and reassembled in sequence order when you open the request in the dashboard.