17 Iterations to a Working PR Scanner: A Vercel Lambda Postmortem

By Caleb Gates

If you are trying to run a TypeScript static analyzer inside a Vercel serverless function and the install step keeps failing, you are probably solving the wrong problem. The fastest path to a green pull request scanner is to delete the install step entirely and fetch the analyzer fresh from npm at scan time. This is the story of how we got there after 17 iterations on the Nark bot, including why silent failures cost us months of "0 violations" lies, why stderr tail beats stderr head, and why Lambda's missing primitives are a class of bug rather than a single bug.

Quick Answer: Static analyzers that read package names from import string literals do not need node_modules to scan a project. To run a full Nark scan on any TypeScript repo without installing dependencies, run npx nark --tsconfig ./tsconfig.json from a directory that has a package.json and a tsconfig.json. No pnpm install. No npm ci. No 158-line bootstrap.

What is Nark? Nark is an open-source TypeScript scanner that checks projects against Nark Profiles for 165+ npm packages and reports missing error handling, unguarded calls, and resource cleanup gaps. Run it locally with npx nark or wire it into CI with the GitHub Action.

Today the Nark PR bot went green. It scanned a pull request, found a single new violation in a changed file, and posted a dual-mode comment showing the new violation alongside a count of 142 pre-existing ones. Score: 143 total. That comment took 17 iterations over five hours, with three architectural pivots. Every iteration revealed a layer of failure that the previous one had been hiding.

This is the postmortem.


The starting state: silently lying for months

Nark is a TypeScript scanner that checks codebases against Nark Profiles. Each Profile is a small YAML spec describing how a specific npm package can throw, time out, or fail. When someone opens a PR against a repo connected to our SaaS, an Inngest function clones the PR, runs Nark, and posts a comment listing violations.

For months, that comment had been "0 new violations, 0 resolved, 0 pre-existing." Always. On every PR. Including PRs whose diffs touched code we knew had violations.

The function ran. It returned cleanly. Inngest showed green status. The bot posted comments. Nothing was wrong, except nothing was right.

The bug was a single line in scan-pr.ts:

execSync(`npx nark --diff ${baseSha}..${headSha} ... 2>/dev/null || true`)

2>/dev/null discarded stderr. || true masked any non-zero exit. The function captured what it took to be successful scan output (actually an empty string), parsed zero violations from it, and posted "0 violations, clean scan."

We had been silently passing for months.


The first fix: make it crash

The first round was the smallest code change and the largest mindset shift. We removed the silencing and made the function throw on real failures. This immediately produced a bot comment saying "Scan failed: Nark exited 2: Permission denied writing to /dev/stdout."

That single visible error opened the door. We could see things now. Over the next 16 rounds we made more errors visible, found what was broken, and fixed it, usually with code changes much smaller than the diagnostic improvements that exposed them.
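A minimal sketch of the "make it crash" change, assuming a Node.js child-process wrapper. The name runOrThrow is hypothetical and this is not the actual scan-pr.ts code. It also assumes that a non-zero exit always means failure; if the tool uses exit codes to signal "violations found", that case would need separate handling.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical wrapper: run a command and throw when it fails,
// instead of appending `2>/dev/null || true` to the command line.
function runOrThrow(command: string, args: string[], cwd?: string): string {
  const result = spawnSync(command, args, {
    cwd,
    encoding: "utf8",
    maxBuffer: 64 * 1024 * 1024, // allow large scan output
  });

  // The process could not be spawned at all (e.g. binary not on PATH).
  if (result.error) {
    throw new Error(`${command} could not be started: ${result.error.message}`);
  }

  // Non-zero exit or killed by a signal: surface stderr instead of hiding it.
  if (result.status !== 0) {
    const reason = result.status ?? result.signal;
    const stderrTail = (result.stderr ?? "").slice(-4000);
    throw new Error(`${command} exited ${reason}: ${stderrTail}`);
  }

  return result.stdout;
}
```

The thrown message carries the exit status and a slice of stderr, so whatever posts the bot comment has something real to report.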


The shape of the journey

Here is a compressed version of the iteration table:

| # | What we thought was wrong | What was actually wrong |
|---|---|---|
| 1 | Silencing was hiding failures | True. Bot reported 0 violations on every PR for months. |
| 2 | Nark's /dev/stdout mode | Nark's setupOutputLogging wrote a sibling /dev/output.txt log file. /dev is read-only. |
| 3 | Parser read wrong JSON keys | Schema contract had decayed across a version bump nobody had snapshot-tested. |
| 4 | Webpack rejected nark/bin/nark.js | Nark's exports field only exposed ./dist/index.js. Webpack honored the field strictly. |
| 5 | createRequire(import.meta.url) broke | Next.js's webpack rewrites import.meta.url to virtual URLs. The base path for createRequire ended up wrong. |
| 6 | "Static" analysis needs node_modules | True at the time. Or so we thought. |
| 7 | pnpm: command not found | Vercel Lambda has node and npm on PATH. Not pnpm. Not yarn. Not git. |
| 8 | corepack honors packageManager field strictly | The saas had pnpm@8.15.1 in packageManager but a v9.0 lockfile. corepack installed pnpm 8, which can't read v9 lockfiles. |
| 9 | pnpm@latest rejects v9 lockfiles | pnpm 10 tightened compatibility checks even though both 9 and 10 nominally use lockfile v9.0. |
| 10 | Real errors hide at the START of stderr, not the end | Node deprecation warnings + progress dump masked the actual error message under our slice(-4000) tail. |
| 11 | The whole install step shouldn't exist | Nark's V2 analyzer reads package names from import string literals via AST text extraction. It doesn't need node_modules. We had been building the install step to solve a problem we created by misunderstanding the tool. |
| 12 | Vercel build cache preserves broken state | Stale nark@1.9.1 was sitting in the Lambda's .pnpm/ virtual store with its dep symlinks pruned. Our resolver picked it first alphabetically. |
| 13 | Fetch Nark from npm at scan time | Bypass the build entirely. Install Nark@latest into /tmp/.nark-cache/ per scan. Always-latest with zero overhead. |
| 14 | Nark needed HOME again | Telemetry first-run notice writes ~/.nark/config.json. Lambda has no writable HOME. Same lesson, fifth instance. |
| 15 | Real errors come at BOTH ends of stderr | Sometimes the throw site (top) and sometimes the diagnostic flood (bottom). Capture both, slice with marker between. |
| 16 | Nark itself shells out to git diff for its --diff flag | The same Lambda-has-no-git constraint that we had fixed in our own clone step weeks ago, hitting us from inside the tool. |
| 17 | Stop using --diff. Compute the diff ourselves. | Pull patches from the GitHub Compare API, parse added lines per file, post-tag violations after the scan. |

The last commit landed. The bot now scans full projects without node_modules, fetches Nark fresh from npm every scan, and reports real violations with populated package names and messages.


The pivot that mattered

If we had to pick one moment that turned the trajectory, it was iteration 11.

Up to that point, we had been iteratively improving how we install dependencies in a serverless function. We had moved from npm to corepack, from corepack to a /tmp/.nark-tools bootstrap via npm, from pnpm@latest to pnpm@9, from --frozen-lockfile to --no-frozen-lockfile. Every fix made the install step a little more robust against a different class of customer project configuration.

Then a parallel research spike came back with a finding: Nark's V2 analyzer, which had been the default since version 2.0.0, reads package names directly from moduleSpecifier.text on import nodes. Pure AST text extraction. No call to TypeChecker.getSymbolAtLocation(). The premise we had been operating under, that Nark requires resolved imports, was true for V1, and we had been treating it as if it were still true for V2.

We ran a quick empirical test: created a minimal TypeScript file with a single import Stripe from "stripe" and an unsafe stripe.charges.create(...) call. Wrote a 14-line tsconfig. Crucially: no node_modules. Ran Nark.

files_analyzed: 1
callsites_by_package: { stripe: 1 }
violations: 1, error stripe, test.ts:7

2.2 seconds. The premise was wrong. We had been debugging an install step that did not need to exist.

That same commit deleted 158 lines of install logic from scan-pr.ts. The Vercel /tmp size constraint, the pnpm version skew, the corepack packageManager-field strict mode, the lockfile compatibility checks, the build cache pollution. All of those failure classes vanished in a single delete. They were never structural. They were artifacts of solving the wrong problem.
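To illustrate what "pure AST text extraction" means in practice, here is a minimal sketch using the TypeScript compiler API. It is not Nark's code: importedPackages is a hypothetical helper, and the scoped-package handling is an assumption about how one might normalize specifiers. The point is that only a parsed SourceFile is needed, with no Program, no TypeChecker, and no node_modules on disk.

```typescript
import * as ts from "typescript";

// Hypothetical helper: collect npm package names from import declarations
// by reading moduleSpecifier.text directly off the syntax tree.
function importedPackages(fileName: string, sourceText: string): string[] {
  // createSourceFile only parses; it never resolves modules on disk.
  const sourceFile = ts.createSourceFile(
    fileName,
    sourceText,
    ts.ScriptTarget.Latest,
    true
  );

  const packages = new Set<string>();
  for (const statement of sourceFile.statements) {
    if (!ts.isImportDeclaration(statement)) continue;
    const specifier = statement.moduleSpecifier;
    if (!ts.isStringLiteral(specifier)) continue;

    const text = specifier.text;
    // Skip relative and absolute paths; they are not npm packages.
    if (text.startsWith(".") || text.startsWith("/")) continue;

    // "@scope/name/sub" -> "@scope/name"; "name/sub" -> "name"
    const parts = text.split("/");
    packages.add(text.startsWith("@") ? parts.slice(0, 2).join("/") : parts[0]);
  }
  return Array.from(packages).sort();
}
```

Because nothing here touches the file system beyond the source text passed in, the same function gives the same answer whether or not dependencies are installed.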


The lesson about visibility

Looking back, the iterations divide cleanly into two categories.

Diagnostic improvements were the ones that did not fix the bug, but made the next bug visible. These were small code changes, usually 5 to 30 lines, but they paid compound interest:

  • Keep the tail of stderr rather than the head, because tools emit deprecation warnings at the start and real errors at the end
  • Capture HEAD + omission marker + TAIL, because TypeScript diagnostic dumps push the real error past the head window
  • Pull resolver calls out of template literals so their throws do not get misclassified by the surrounding catch block
  • Lazily compute things that might throw, so module-load errors do not cascade into 500s for the entire HTTP route

Each of these revealed a class of failures that the previous opacity had been hiding. They were investments in observability, not in functionality. None of them moved the bot one inch closer to posting a working comment, but without them, we could not have known what to fix.
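The "lazily compute things that might throw" item can be sketched as a small memoizing wrapper. This is a minimal illustration under assumptions, not the code from the bot: lazy is a hypothetical helper, and the choice not to cache failures (so a later call can retry) is a design assumption.

```typescript
// Hypothetical helper: defer a computation that might throw until first use,
// so a failure surfaces at the call site rather than at module load.
function lazy<T>(compute: () => T): () => T {
  let cached: { value: T } | undefined;
  return () => {
    if (!cached) {
      // If compute() throws, nothing is cached and the next call retries.
      cached = { value: compute() };
    }
    return cached.value;
  };
}
```

A resolver wrapped this way runs inside the request handler's own try/catch, and calling it on its own line before building a command string keeps its throw separate from errors raised by the command itself.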

Functional improvements were the ones that actually fixed something, but they only worked once the diagnostic improvements made the failure visible. The pattern repeated: something fails silently, we spend an iteration making the failure visible, see what is actually wrong, then spend a smaller iteration fixing it.

If we had skipped the diagnostic improvements and tried to brute-force the functional ones, every fix would have been a guess. Some of them would have been right. Most of them would have been wrong, and we would have shipped them, and the silent failure would have continued, just with a different shape.


What the architecture looks like now

The shipped pipeline:

GitHub webhook -> POST /api/webhooks/github (Vercel Lambda) ->
  Inngest event "nark/scan.requested" ->
  Inngest Cloud queues + retries ->
  POST /api/inngest (Vercel Lambda) ->
    1. clone-repo: isomorphic-git into /tmp (no git binary)
    2. get-changed-files: GitHub Compare API + parse added-line numbers from each patch
    3. ensureLatestNarkBin: fetch nark@latest version from registry.npmjs.org,
       npm install into /tmp/.nark-cache/v<version>/ if not cached
    4. run-nark-scan: full project scan, output to /tmp tempfile
    5. parse-results: strict v.package / v.description / v.contract_clause field mapping
    6. tag-diff-introduced: for each violation, isDiffIntroduced = addedLines.has(violation.line)
    7. compute-buckets: new / resolved / pre-existing
    8. persist-scan: Scan + ViolationInstance rows
    9. report-results: dual-mode bot comment via GitHub Issues API
    10. post-inline-review: line-anchored review comments on the PR diff

The whole flow takes 60 to 120 seconds on a cold Lambda (with the Nark install) and 25 to 45 seconds when warm. The saas no longer bundles Nark. Every scan picks up the latest published version automatically. Publishing a new Nark version requires no PR, no Vercel rebuild, no manual bump. The next cold-start Lambda fetches it from the registry and the scan uses it.
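Step 3 of the pipeline could look roughly like the sketch below. It is an illustration under several assumptions rather than the shipped code: the function body, the cacheDirFor helper, the node_modules/.bin/nark binary path, the extra npm flags, and the HOME and npm cache overrides are all guesses at one reasonable shape. It relies on the global fetch available in Node 18 and later.

```typescript
import { execFileSync } from "node:child_process";
import { existsSync, mkdirSync } from "node:fs";
import * as path from "node:path";

const CACHE_ROOT = "/tmp/.nark-cache";

// One directory per published version, e.g. /tmp/.nark-cache/v2.1.0
function cacheDirFor(version: string): string {
  return path.posix.join(CACHE_ROOT, `v${version}`);
}

// Sketch: look up the latest version, install it into /tmp if missing,
// and return the path to the installed binary.
async function ensureLatestNarkBin(): Promise<string> {
  const res = await fetch("https://registry.npmjs.org/nark/latest");
  if (!res.ok) {
    throw new Error(`registry lookup failed: HTTP ${res.status}`);
  }
  const { version } = (await res.json()) as { version: string };

  const dir = cacheDirFor(version);
  // Assumption: the package exposes a bin named "nark".
  const bin = path.posix.join(dir, "node_modules", ".bin", "nark");

  if (!existsSync(bin)) {
    mkdirSync(dir, { recursive: true });
    execFileSync(
      "npm",
      ["install", "--prefix", dir, "--no-audit", "--no-fund", `nark@${version}`],
      {
        // Lambda has no writable HOME, so point npm at /tmp.
        env: { ...process.env, HOME: "/tmp", npm_config_cache: "/tmp/.npm" },
        stdio: ["ignore", "pipe", "pipe"],
      }
    );
  }
  return bin;
}
```

Keying the cache directory on the version means a warm Lambda reuses the install, while a newly published version lands in a fresh directory on the next lookup.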


Five takeaways for serverless TypeScript scanning

1. Silent failures are the worst failure mode

A loud crash with a confusing error is strictly better than a clean "everything's fine" that is lying. Build for failures to be visible by default, then optimize for noise reduction only after you can see what is happening.
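The same principle applies to parsing tool output. A minimal sketch of a strict mapper, assuming each raw violation is a JSON object carrying the package, description, and contract_clause fields named in the pipeline above; parseViolation and the ParsedViolation shape are hypothetical. Instead of defaulting a missing key to an empty value, it throws, so schema drift shows up as a failed scan rather than a blank comment.

```typescript
interface ParsedViolation {
  package: string;
  description: string;
  contractClause: string;
}

// Hypothetical strict mapper: fail loudly when an expected field is absent.
function parseViolation(raw: unknown, index: number): ParsedViolation {
  if (typeof raw !== "object" || raw === null) {
    throw new Error(`violation[${index}] is not an object`);
  }
  const record = raw as Record<string, unknown>;

  const requireString = (key: string): string => {
    const value = record[key];
    if (typeof value !== "string" || value.length === 0) {
      throw new Error(`violation[${index}] is missing string field "${key}"`);
    }
    return value;
  };

  return {
    package: requireString("package"),
    description: requireString("description"),
    contractClause: requireString("contract_clause"),
  };
}
```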

2. Tail of stderr, not head

Tools that fail emit progress lines, then deprecation warnings, then sometimes a file listing, and finally the error. If you only see the first N characters of stderr, you will see the wrong thing every time. Better: capture both head and tail with a marker between, so the real error is in the slice no matter which end it lives at.

3. Periodically question whether each step is necessary

Six iterations of "make pnpm work in this serverless function" were less valuable than one iteration of "do we even need to install dependencies." The cost was not in the eventual fix. It was in not asking the question sooner.

4. Lambda's missing primitives are a class, not a bug

Vercel Lambda does not have a writable HOME, does not have pnpm / yarn / git on PATH, and has a 512 MB /tmp. Every layer of your stack that assumes a Unix-like environment will hit this independently. When you fix one instance, search aggressively for the others. We hit the missing-git constraint twice: once in our own clone step, and once inside Nark's --diff flag, weeks apart.
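One way to search for the other instances is a preflight check that reports which expected binaries are absent before any step shells out. The sketch below is an assumption about how such a check might look: missingTools is a hypothetical helper, and it targets Linux-style PATH lookup only (no Windows extensions).

```typescript
import { accessSync, constants } from "node:fs";
import * as path from "node:path";

// Hypothetical preflight: return the tools that are not executable
// from any directory on PATH.
function missingTools(
  tools: string[],
  pathVar: string = process.env.PATH ?? ""
): string[] {
  const dirs = pathVar.split(path.delimiter).filter(Boolean);
  return tools.filter((tool) => {
    const found = dirs.some((dir) => {
      try {
        accessSync(path.join(dir, tool), constants.X_OK);
        return true;
      } catch (e) {
        return false; // not present or not executable in this directory
      }
    });
    return !found;
  });
}
```

Logging the result of missingTools(["git", "pnpm", "yarn"]) at the start of a function run turns "command not found" deep inside a dependency into a single line at the top of the log.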

5. Fetch dependencies at runtime when you want to track upstream

Bundling a tool with your service ties its version to your deploy cycle. Fetching the tool fresh at runtime, into an ephemeral /tmp cache, gives you always-latest with zero overhead per upgrade. This works particularly well for short-lived analyzer tools where cold-start install cost is acceptable in the overall scan budget.


Frequently asked questions

Does Nark really not need node_modules to scan a TypeScript project?

Correct, since version 2.0.0. The V2 analyzer reads package names from moduleSpecifier.text on each ImportDeclaration node, which is just AST text extraction. It does not call TypeChecker.getSymbolAtLocation() to resolve symbols against installed packages. You still need a valid tsconfig.json and a package.json, but no install step is required.

Why does Vercel Lambda not have pnpm or yarn on PATH?

Vercel Lambda runtimes ship a minimal Node.js environment with node and npm only. Package managers like pnpm, yarn, and bun are not preinstalled. You can install them at runtime via npm install -g pnpm into /tmp, but the version you get is whatever latest happens to be, which may not match your lockfile's expected version.

What is the best way to capture stderr in a Node.js exec wrapper?

Capture both ends with a marker. A pattern that works:

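const HEAD_BYTES = 2000;
const TAIL_BYTES = 4000;

// Keep both ends of stderr with a marker between them.
// Note: string slice counts UTF-16 code units, so these limits only approximate bytes.
function captureStderr(stderr: string): string {
  return stderr.length > HEAD_BYTES + TAIL_BYTES
    ? stderr.slice(0, HEAD_BYTES) + "\n...[truncated]...\n" + stderr.slice(-TAIL_BYTES)
    : stderr;
}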

This catches Node deprecation warnings at the head and the actual throw site at the tail, which is how Node.js processes typically emit their failures.

How does Nark scan PRs without git on the Lambda?

Two ways. First, the Lambda uses isomorphic-git to clone the PR into /tmp without shelling out. Second, instead of relying on Nark's --diff base..head flag (which shells out to git diff internally), the Lambda pulls patches from the GitHub Compare API, parses added-line numbers per file, runs Nark in full-project mode, and tags each violation as "new" if its line number is in the added set.
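The second approach can be sketched as follows, assuming each changed file comes with a unified-diff patch string that starts at its first @@ hunk header. The function names and the Violation shape are hypothetical; only the isDiffIntroduced name comes from the pipeline description above.

```typescript
interface Violation {
  file: string;
  line: number;
}

// Walk a unified-diff patch and collect new-file line numbers of added lines.
function addedLinesFromPatch(patch: string): Set<number> {
  const added = new Set<number>();
  let newLine = 0;
  let inHunk = false;

  for (const line of patch.split("\n")) {
    // Hunk header: @@ -oldStart,oldCount +newStart,newCount @@
    const hunk = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@/.exec(line);
    if (hunk) {
      newLine = Number(hunk[1]);
      inHunk = true;
      continue;
    }
    if (!inHunk) continue;

    if (line.startsWith("+")) {
      added.add(newLine); // added line: exists only in the new file
      newLine++;
    } else if (line.startsWith("-") || line.startsWith("\\")) {
      // removed line or "\ No newline at end of file": no new-file line
    } else {
      newLine++; // context line: present in both versions
    }
  }
  return added;
}

// Tag each violation by whether its line was added in the PR.
function tagDiffIntroduced<V extends Violation>(
  violations: V[],
  patchesByFile: Map<string, string>
): Array<V & { isDiffIntroduced: boolean }> {
  const addedByFile = new Map<string, Set<number>>();
  patchesByFile.forEach((patch, file) => {
    addedByFile.set(file, addedLinesFromPatch(patch));
  });
  return violations.map((v) => ({
    ...v,
    isDiffIntroduced: addedByFile.get(v.file)?.has(v.line) ?? false,
  }));
}
```

Violations in files that do not appear in the diff fall through to false, which places them in the pre-existing bucket.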


Try it on your codebase

npx nark --tsconfig ./tsconfig.json

Nark checks 165+ npm packages, including axios, prisma, stripe, redis, and openai, for unhandled error paths, missing guards, and incorrect usage patterns. To wire it into your own CI, see "Add Nark to GitHub Actions". To compare Nark to other static analyzers, see "ESLint vs Semgrep vs Nark".

The scanner is open source at github.com/nark-sh/nark. False positives are real and they make the scanner better. Report them.