A critical, source-verified architecture comparison — deep on observability, cost, agents, durability, feedback loops, quality gates, and storage. Every claim below was checked against cloned source by independent recon agents on 2026-06-12; this supersedes the first-pass audit (PR #102) and corrects three material errors in it.
state.sqlite (v13, WAL) + Supabase mirror; Next.js/Supabase-Realtime dashboard
Below, each dimension Mitchell flagged gets a mechanics-level side-by-side. The pattern repeats: we have the deeper, production implementation; they have the cleaner seam. Both lessons matter.
structlog JSON events throughout; append-only campaign.jsonl with stage.completed {rows_out, duration_ms}
generateRetro() emits a funnel table + anomalies (row-drop, verifier-fail)
agents/retro.ts
stage_status/checkpoints
storage/cli.py list_summary: per-stage counts on demand
email_waterfall_log / phone_waterfall_log columns (which providers were tried)
cost_accumulator + lead_cost tables: per-stage and per-lead token/USD attribution
state/store.py:98 · adapters/runtime.py
llm_rate_cards.yaml pre-flight projections; costs/estimate CLI
campaign_budget_exceeded fires only after exceeding
OpenAIPromptRunner writes costUsd: 0 on every call; TS-native runs report $0.00
agents/openai-prompt-runner.ts:115
--estimate mode returning credit cost without spending
verified in selftest.sh
costUsd:0 bug is a one-line prerequisite so the gate has real numbers to read.
maxRevisions=2) → escalate
inner-loop.ts
gpt-4.1-mini, copy on gpt-4.1; no systematic cheap-model routing for high-volume row work
ThreadPoolExecutor (≤8) for personalize/copy; domain-caching (1 call per unique domain)
state.sqlite per-row row_state; done|cached skipped on resume, failed re-enqueued
state/store.py
rows_in = kept + rejected or throw
row_conservation.py + agents/row-conservation.ts
SamplingGate equivalent; TS CostTracker is in-memory (crash loses cost)
CopySubRunner._query_nexus_intelligence() queries the KG for "copy angles · campaign performance · segment" before writing copy
copy_sub_runner.py:433-452
score-replies → deposit_retrospective_intelligence() → TierEngine → Nexus KG closes the loop with reply outcomes
scoring/retrospective.py:171
NEXUS_API_KEY; TS harness has zero Nexus wiring; signal-bank runs are manual + disconnected
master_contacts (you sync it from your sender)
exit 7 hard block
stages/scorecard.py
SamplingGate pre-fan-out quality floor (grade C+); launch gate needs operator_confirms + signature
decisions.md (full audit trail)

| Dimension | LeadGrow | External |
|---|---|---|
| Primary store | Per-campaign state.sqlite (v13, WAL, in-code migrations) | JSONL (contacts.jsonl + index.json) or Postgres |
| Shared / cloud | Supabase leadgrow_knowledge: global_companies, global_contacts, campaign_contacts, enrichments + dashboard tables | Optional Postgres; master_contacts xref (manually synced) |
| Row tracking | row_state per stage (pending/failed/done/skipped + error) | stage field per contact via advance_stage |
| Cross-campaign dedup | Supabase mirror, cross-client | crossref_master (local backend always returns "new") |
| Crash recovery | SQLite WAL + row-state retry + StateReconciler | Atomic file writes; re-run stage |
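
The row-tracking and crash-recovery rows above can be illustrated with a minimal sketch. It assumes a simplified `row_state` table (the real v13 schema is not reproduced here), and the helper names `open_state`, `run_stage`, and `assert_row_conservation` are hypothetical, chosen only for illustration. It shows the two behaviours the comparison credits to the LeadGrow side: completed rows are skipped on resume while failed rows are retried, and a stage throws if `rows_in` does not equal `kept + rejected`.

```python
import sqlite3
from typing import Callable, Iterable, List

# Statuses that a resumed run does not reprocess (assumed from the table above).
SKIP_ON_RESUME = {"done", "skipped"}


def open_state(path: str = ":memory:") -> sqlite3.Connection:
    """Open a per-campaign state database with a simplified row_state table."""
    conn = sqlite3.connect(path)
    # WAL applies to file-backed databases; in-memory databases ignore it.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS row_state ("
        " stage TEXT NOT NULL, row_id TEXT NOT NULL,"
        " status TEXT NOT NULL, error TEXT,"
        " PRIMARY KEY (stage, row_id))"
    )
    return conn


def run_stage(
    conn: sqlite3.Connection,
    stage: str,
    row_ids: Iterable[str],
    handler: Callable[[str], None],
) -> List[str]:
    """Run handler per row; return the row ids attempted in this pass."""
    attempted = []
    for row_id in row_ids:
        found = conn.execute(
            "SELECT status FROM row_state WHERE stage = ? AND row_id = ?",
            (stage, row_id),
        ).fetchone()
        if found and found[0] in SKIP_ON_RESUME:
            continue  # finished in an earlier run
        try:
            handler(row_id)
            status, error = "done", None
        except Exception as exc:  # record the failure so a later run retries it
            status, error = "failed", str(exc)
        conn.execute(
            "INSERT OR REPLACE INTO row_state (stage, row_id, status, error)"
            " VALUES (?, ?, ?, ?)",
            (stage, row_id, status, error),
        )
        conn.commit()  # commit per row so a crash loses at most one row of work
        attempted.append(row_id)
    return attempted


def assert_row_conservation(rows_in: int, kept: int, rejected: int) -> None:
    """Raise if a stage silently dropped or duplicated rows."""
    if rows_in != kept + rejected:
        raise ValueError(
            f"row conservation violated: {rows_in} in, {kept} kept, {rejected} rejected"
        )
```

Committing after every row trades some throughput for a tighter recovery point; batching commits would be a reasonable alternative if per-row latency matters.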
CATALOG: 12 EnrichmentSpec entries (clay | runtime | apify), tunable via spec fields + config/enrichment_providers.yaml
stages/enrichments/registry.py
executeProvider() shared retry/backoff exists, but only millionverifier is ported
manifest.yaml per provider: auth.env, request_template, field_map (provider field names live only here, never in agent prompts)
providers/*/manifest.yaml
gtm.config.yaml waterfalls = the only place a provider is chosen; missing key → silently skipped (BYOK shaping)
adapter.py --capability contract
CATALOG), not a rewrite. Rec #2.

| Capability | LeadGrow | External | Winner |
|---|---|---|---|
| End-to-end → launched campaigns | ● | ○ (CSV/draft) | LeadGrow |
| Copy generation framework | ● | ○ | LeadGrow |
| QA + iterative fixes | ● | ○ | LeadGrow |
| 11-dim scorecard hard block | ● | ○ | LeadGrow |
| Per-lead personalization loop | ● | ○ | LeadGrow |
| Per-row resume + conservation invariant | ● | ◐ | LeadGrow |
| Cross-run learning loop | ● (Nexus) | ○ | LeadGrow ⬅ #102 had this inverted |
| Live observability UI | ● | ○ | LeadGrow |
| Per-lead cost attribution (Python) | ● | ○ | LeadGrow |
| Provider manifest / config-swap | ◐ (registry) | ● | External |
| Model-routing by stage economics | ○ | ● | External |
| Role/segment title expansion | ◐ (literal-ish) | ● | External |
| Pre-spend cost gate | ◐ (reactive) | ● | External |
| BYOK graceful pipeline shaping | ○ | ● | External |
| Zero-infra portability | ○ | ● | External |
Filtered to what's verified real in the external repo and genuinely missing/weaker in ours. The #102 "cross-run intelligence" recommendation is removed — we already have it.
"Target CFOs" → an explicit equivalence class (CFO / Chief Financial Officer / VP Finance / Head of Finance / Controller / Treasurer). The external pipeline infers it once at brief time, shows it at Gate #1 with provenance (inferred vs from-context), the operator edits it, and it's frozen into the run — no stage re-infers. Prevents silent under-sourcing from literal title matching.
How: expansion step in brief capture → frozen expansion.yaml read by 03-qualify + 04-people; surface at the first checkpoint. ~200 lines + a module.
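
A minimal sketch of the frozen expansion, assuming it is held as an immutable record after the operator approves it. The `TitleExpansion` class and its `matches` method are hypothetical names, and serialising to `expansion.yaml` is left out. The point is that downstream stages match against the approved equivalence class instead of the literal role string.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


def _tokens(text: str) -> List[str]:
    """Lower-case alphanumeric tokens, so punctuation and casing do not matter."""
    return re.findall(r"[a-z0-9]+", text.lower())


@dataclass(frozen=True)
class TitleExpansion:
    """An operator-approved equivalence class for one target role (hypothetical shape)."""

    role: str
    titles: Tuple[str, ...]
    provenance: str  # "inferred" or "from-context", as shown at the first gate

    def matches(self, candidate_title: str) -> bool:
        """True if any approved title appears as a whole-word run in the candidate."""
        candidate = _tokens(candidate_title)
        for title in self.titles:
            wanted = _tokens(title)
            width = len(wanted)
            if width and any(
                candidate[i:i + width] == wanted
                for i in range(len(candidate) - width + 1)
            ):
                return True
        return False


cfo = TitleExpansion(
    role="CFO",
    titles=(
        "CFO",
        "Chief Financial Officer",
        "VP Finance",
        "Head of Finance",
        "Controller",
        "Treasurer",
    ),
    provenance="inferred",
)
```

Because the dataclass is frozen, a stage that tries to mutate the expansion mid-run fails loudly, which mirrors the "no stage re-infers" rule.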
Add a field_map + request_template manifest layer on top of our existing CATALOG registry so adding/swapping a provider is a YAML file, not a code change + a Python class. Keep waterfalls as ordered name lists in config. This is the external repo's strongest idea, and we already have the registry to build it on.
How: manifest parser + registry loader; refactor the 12 specs to register via manifests; preserve the TS executeProvider() retry core. Backward-compatible wrapping + waterfall test coverage is the risk to manage.
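
To make the manifest layer concrete, here is a sketch of what a loader might do once a manifest has been parsed into a dict. The provider name, environment variable, URL, and field names below are all invented for illustration; only the three keys `auth.env`, `request_template`, and `field_map` come from the external repo's manifest shape described above.

```python
from string import Template
from typing import Any, Dict
from urllib.parse import quote

# Hypothetical manifest, shown as the dict a YAML parser would produce.
EXAMPLE_MANIFEST: Dict[str, Any] = {
    "name": "example_verifier",
    "auth": {"env": "EXAMPLE_VERIFIER_API_KEY"},
    "request_template": {
        "method": "GET",
        "url": "https://api.example.invalid/verify?email=$email",
    },
    # provider field name -> our canonical field name
    "field_map": {"result": "email_status", "quality_score": "email_confidence"},
}


def build_request(manifest: Dict[str, Any], row: Dict[str, Any]) -> Dict[str, str]:
    """Render the request template with URL-encoded values from one row."""
    template = manifest["request_template"]
    encoded = {key: quote(str(value), safe="") for key, value in row.items()}
    return {
        "method": template["method"],
        "url": Template(template["url"]).substitute(encoded),
    }


def map_response(manifest: Dict[str, Any], payload: Dict[str, Any]) -> Dict[str, Any]:
    """Translate provider field names to canonical ones; drop unmapped fields."""
    return {
        ours: payload[theirs]
        for theirs, ours in manifest["field_map"].items()
        if theirs in payload
    }
```

With this shape, provider-specific names stay inside the manifest, so swapping a provider means editing one file while callers keep reading `email_status` and `email_confidence`.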
Route by what each stage actually needs: expensive reasoning (research, copy) on a strong model; high-volume row classification (qualify, segment) on a cheap fast model in batches; keep intermediate output out of the main context. The external repo does Sonnet-research / Haiku-15-per-batch-scoring — a verified 3–5× cost delta on the heaviest stages. Most under-valued idea in the first pass.
How: per-stage target_model already exists in prompt YAML — formalize a routing policy + batch the classifier stages; fix the TS costUsd:0 bug first so savings are measurable.
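
A routing policy of this kind can be sketched as a lookup table plus a batching helper. The model labels `strong-model` and `cheap-model` are placeholders, not real model identifiers, and the stage names are taken from the paragraph above. The batch size of 15 follows the external repo's reported scoring batch.

```python
from typing import Dict, Iterator, List, Sequence, TypeVar

Row = TypeVar("Row")

# Placeholder labels; a real policy would hold actual model identifiers.
ROUTING_POLICY: Dict[str, str] = {
    "research": "strong-model",
    "copy": "strong-model",
    "qualify": "cheap-model",
    "segment": "cheap-model",
}
DEFAULT_MODEL = "strong-model"  # unknown stages fall back to the safer choice
CLASSIFIER_BATCH_SIZE = 15


def model_for(stage: str) -> str:
    """Return the model label a stage should run on."""
    return ROUTING_POLICY.get(stage, DEFAULT_MODEL)


def batches(rows: Sequence[Row], size: int = CLASSIFIER_BATCH_SIZE) -> Iterator[List[Row]]:
    """Yield consecutive fixed-size batches; the last one may be shorter."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for start in range(0, len(rows), size):
        yield list(rows[start:start + size])
```

Defaulting unknown stages to the stronger model is a deliberate choice in this sketch: a misrouted stage then costs more instead of silently degrading quality.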
We enforce the budget only after it has been exceeded; they estimate cost before spending (Gate #3 + --estimate). Wire our existing llm_rate_cards.yaml / enrichment cost estimates into a pre-stage gate that raises CheckpointPending when projected spend crosses a threshold. Prerequisite: fix OpenAIPromptRunner writing costUsd:0.
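
The pre-stage gate could look roughly like the sketch below. The `CheckpointPending` class here is a stand-in defined for the example, since the real exception's constructor is not shown in this review. The rate card values are made-up round numbers, not the contents of llm_rate_cards.yaml, and the function names are hypothetical.

```python
from typing import Dict


class CheckpointPending(Exception):
    """Stand-in for the pipeline's checkpoint exception (signature assumed)."""


# Made-up USD rates per 1,000 tokens, for illustration only.
HYPOTHETICAL_RATE_CARD: Dict[str, Dict[str, float]] = {
    "example-model": {"input": 1.0, "output": 2.0},
}


def projected_cost_usd(
    model: str,
    rows: int,
    input_tokens_per_row: int,
    output_tokens_per_row: int,
    rate_card: Dict[str, Dict[str, float]],
) -> float:
    """Estimate a stage's spend from row count and per-row token assumptions."""
    rates = rate_card[model]
    per_row = (
        input_tokens_per_row / 1000 * rates["input"]
        + output_tokens_per_row / 1000 * rates["output"]
    )
    return rows * per_row


def pre_stage_gate(
    stage: str, projected_usd: float, spent_usd: float, budget_usd: float
) -> None:
    """Pause before the stage runs if it would push spend past the budget."""
    if spent_usd + projected_usd > budget_usd:
        raise CheckpointPending(
            f"{stage}: projected ${projected_usd:.2f} on top of "
            f"${spent_usd:.2f} spent exceeds budget ${budget_usd:.2f}"
        )
```

The gate only works if `spent_usd` is trustworthy, which is why the costUsd:0 fix is listed as a prerequisite: a tracker that always reports zero makes every projection look affordable.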
Two cheap wins: (a) persist a per-contact waterfall_log (which provider answered) for fill-rate analysis; (b) at init, probe which provider keys exist and light up only those stages (their BYOK shaping) instead of failing on a missing key.
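
Both wins fit in a few lines. This sketch assumes each provider is a callable returning a value or `None`, and that a mapping from provider name to environment variable name is available. The function names and the log entry shape are hypothetical.

```python
import os
from typing import Any, Callable, Dict, List, Mapping, Optional, Sequence, Tuple

Lookup = Callable[[Dict[str, Any]], Optional[str]]


def available_providers(
    waterfall: Sequence[str],
    key_env: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Keep only providers whose API key is set, preserving waterfall order."""
    env = os.environ if environ is None else environ
    return [name for name in waterfall if env.get(key_env.get(name, ""))]


def run_waterfall(
    contact: Dict[str, Any],
    providers: Sequence[str],
    lookups: Mapping[str, Lookup],
) -> Tuple[Optional[str], List[Dict[str, Any]]]:
    """Try providers in order; return the first hit plus a per-contact log."""
    log: List[Dict[str, Any]] = []
    for name in providers:
        value = lookups[name](contact)
        log.append({"provider": name, "hit": value is not None})
        if value is not None:
            return value, log
    return None, log
```

Persisting the returned log alongside the contact is what enables fill-rate analysis later: counting `hit` entries per provider shows which ones actually answer.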
decisions.md audit trail.
operator_confirms + signature, exit 7 hard blocks: irreversibility handled with code, not vibes.
These aren't carry-overs; they're things to fix in our own house, found while reading our source for this comparison.
| Gap | Detail | Evidence |
|---|---|---|
| TS cost tracker writes $0 | OpenAIPromptRunner records costUsd:0 every call; TS-native runs report $0.00 cost | openai-prompt-runner.ts:115 |
| Nexus not wired into TS harness | TS-native stages never read/write the KG — the feedback loop only fires on the Python path | integrations/nexus.py |
| 3 stages bridge-only | copy (07), scorecard (08.5), launch (09) have no TS handler — always spawn the Python subprocess, no migration target tracked | flows/registry.ts:22-23 |
| Signal-bank disconnected | Manual run order, no cron/Trigger.dev; a stale RUNNING.lock suggests a crashed run never cleaned up | signal-bank/runs/.../RUNNING.lock |
Verification provenance. Both external repos were shallow-cloned and read in full; gtm-orchestrator was inspected read-only on this machine. All file:line citations were produced by three independent recon passes on 2026-06-12 and reconciled against the first-pass audit (PR #102), whose three material errors are corrected above. Items the recon could not confirm against live APIs (e.g. the external repo's synthetic example run, exact copy-stage model) are flagged in-line rather than asserted.
LeadGrow GTM — internal architecture review · 2026-06-12