Build Tracking
Database build, not vibes.
Execution tracker for the Dig implementation plan. This page is a manual progress snapshot for ingest, transforms, gates, and the next actions needed to move from data loading to retrieval/API work.
Live Build Snapshot
Manual update from overnight runs / latest operator report
Raw entities
24,025,633
All 4 entity types ingested into `ingest.raw_entities`
raw_entities size
81 GB
Main disk pressure driver during coexisting raw + catalog transforms
DB size (post-transform)
192 GB
Observed after full releases transform + FTS population
Search benchmark
0 errors / 96
Run 8 (full corpus, Fly.io): 0 errors. p50 108ms. 7/7 warm SLOs pass
| Process | Status | Progress | Rate / note |
|---|---|---|---|
| Full restore to Fly | done | ~555M rows across 12 tables, row counts verified | pg_restore -j4, ~14h total. Disk: 156GB / 300GB |
| Releases ingest | done | 18,876,362 releases loaded | 4,770s (80 min), ~3,958/s |
| Artists transform | done | Complete | 48s |
| Labels transform | done | Complete | 626s (~10 min) |
| Masters transform | done | Complete | 1,822s (~30 min) |
| Releases transform | done | 18,876,362 transformed | Cursor-pagination resume path validated; idempotency rerun pass |
| Gate B checklist | closed w/ caveats | 6/6 checklist items checked | CAVEAT: partial artists dump + recalibrated estimate-derived thresholds |
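
The cursor-pagination resume path and idempotency rerun noted for the releases transform can be illustrated with a minimal in-memory sketch. The types, the function names, and the `Map` standing in for the canonical table are all hypothetical; the tracker does not show the real transform code.

```typescript
// Hypothetical shapes for illustration only.
interface RawRow {
  id: number;
  title: string;
}

interface TransformState {
  cursor: number; // last raw id fully processed (the checkpoint)
  catalog: Map<number, string>; // stand-in for the canonical table, keyed by id
}

// Process one batch of rows with id > cursor, in id order (keyset pagination).
// Returns the number of rows handled; 0 means nothing is left to do.
function runBatch(raw: RawRow[], state: TransformState, batchSize: number): number {
  const batch = raw
    .filter((row) => row.id > state.cursor)
    .sort((a, b) => a.id - b.id)
    .slice(0, batchSize);

  for (const row of batch) {
    // Upsert keyed by id: replaying a row writes the same value again,
    // so a rerun cannot create duplicates.
    state.catalog.set(row.id, row.title.trim());
  }

  const last = batch[batch.length - 1];
  if (last !== undefined) {
    state.cursor = last.id; // advance the checkpoint only after the batch is written
  }
  return batch.length;
}

// Resume from whatever cursor is stored and run until no rows remain.
function runToCompletion(raw: RawRow[], state: TransformState, batchSize: number): void {
  while (runBatch(raw, state, batchSize) > 0) {
    // Each iteration picks up from the stored cursor.
  }
}
```

Resetting `cursor` to 0 and running again leaves `catalog` unchanged, which is the property an idempotency rerun checks.
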
Full-corpus restore complete. Run 8 benchmark passed (7/7 warm SLOs). EN-A enrichment schema applied. Frontend migrated from Vercel to Fly.io (always-on, no cold starts). Cover Art Archive integrated: 1.77M crosswalks from MusicBrainz, cover proxy with Redis cache, vinyl placeholder fallback. Gate E: GO for soft alpha. Master page perf fixed (10.3s → 0.9s via idx_releases_master).
Search Benchmark
Run 8: full corpus (18.9M releases), Fly.io production, `06b5c58`. 96 requests, 0 errors.

| Category | p50 | p95 | Warm SLO | Status |
|---|---|---|---|---|
| Release FTS | 115ms | 272ms | p95 < 500ms | pass |
| Common-term | 111ms | 177ms | p99 < 1,000ms | pass |
| Fuzzy | 201ms | 347ms | p95 < 500ms | pass (warm) |
| Filtered | 171ms | 298ms | p95 < 300ms | pass (warm) |
| Multi-entity | 104ms | 246ms | p95 < 500ms | pass (warm) |
| Unicode | 100ms | 173ms | p95 < 100ms | borderline (network) |
| Retrieval | 98ms | 184ms | p95 < 200ms | pass |
| Traversal | 94ms | 170ms | p95 < 200ms | pass |
108ms
Overall p50
347ms
Overall p95
7 / 7
Warm SLOs pass
Run 8 (`06b5c58`): First benchmark on the full 18.9M-release corpus on Fly.io (shared-cpu-2x, 4GB RAM). 0 errors across 96 requests. Cold-cache spikes appear on Run 1 only (fuzzy 1.2s, filtered 711ms, multi-entity 1.4s); all resolve to <350ms by Run 2. Warm SLOs pass in all 7 categories. Running pg_prewarm on deploy is expected to remove these cold-start SLO misses. See `docs/phase2-search-benchmark-results.md` for full detail.
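
A minimal sketch of how a warm SLO check could be computed from raw latency samples. It assumes "warm" means every pass after the first (cold) one and uses the nearest-rank percentile; neither choice is confirmed by the benchmark docs, and the function names are hypothetical.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1]!;
}

// Assumption: the first pass is the cold one and is excluded from warm stats.
function warmSamples(passes: number[][]): number[] {
  return passes.slice(1).reduce<number[]>((all, pass) => all.concat(pass), []);
}

// True when the warm percentile is strictly under the threshold,
// matching targets written as "p95 < 500ms".
function meetsWarmSlo(passes: number[][], p: number, thresholdMs: number): boolean {
  return percentile(warmSamples(passes), p) < thresholdMs;
}
```

With made-up passes `[[1200, 300], [210, 190], [205, 180]]`, the warm p95 is 210ms and a 500ms target passes, while the same percentile over all three passes would be 1200ms.
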
Benchmark Progression
p50 latency across 8 benchmark runs — from Docker baseline to full-corpus production
[Chart: p50 latency by run. Series: Local (Docker / native PG) and Production (Fly.io, internet). Labeled runs: Run 5 (two-path rewrite), Run 6 (+stop-word fix), Run 7 (staging, 50k releases), Run 8 (full corpus, 18.9M).]
p50 improved from Run 7 (117ms, 50k releases) to Run 8 (108ms, 18.9M releases) — full corpus is faster at p50 due to warm cache from the larger 4GB RAM allocation. Network overhead (~80ms) dominates; DB query time is 20-30ms for most operations.
First MCP Result
Willie Nelson via Claude Code → dig-mcp.fly.dev → Fly Postgres — 2026-02-28
First live MCP tool call — Claude Code querying search_catalog for Willie Nelson 1980s releases, returning structured data from Fly Postgres via SSE transport.
Dig (Full Corpus) vs Discogs API
Run 8 — Both over internet, full 18.9M releases, p50 latency (lower is better)
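Series: Dig (Fly.io Virginia, 18.9M releases) vs Discogs API (remote CDN)

| Category | Faster side (p50 ratio) |
|---|---|
| Release FTS | Dig 2.1x |
| Common-term | Dig 1.7x |
| Fuzzy | Even |
| Filtered | Dig 1.3x |
| Multi-entity | Dig 2.1x |
| Unicode | Dig 1.9x |
| Retrieval | Dig 2.3x |
| Traversal | Dig 2.4x |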
108ms
Dig p50 (full corpus)
212ms
Discogs p50 (CDN)
7 / 8
Categories Dig wins
Full-corpus internet comparison. Dig on Fly.io (shared-cpu-2x, 4GB RAM, Virginia, 18.9M releases) vs Discogs API (CDN). Both measured from the same macOS client. Dig is faster in 7 of 8 categories at p50, even with the full 18.9M-release dataset. Fuzzy is roughly even (201ms vs 194ms) — pg_trgm label/master scan adds ~100ms vs the Run 7 staging set. Retrieval and traversal are 2x+ faster. Network overhead (~80ms) dominates both sides.
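
The per-category labels ("Dig 2.1x", "Even") read as p50 ratios. A small sketch of that arithmetic follows; the 10% band for "Even" is an assumption, since the tracker does not state its cutoff, and `speedupLabel` is a hypothetical name.

```typescript
// Assumption: two p50 values within 10% of each other count as "Even".
const EVEN_TOLERANCE = 0.1;

function speedupLabel(digP50Ms: number, discogsP50Ms: number): string {
  if (digP50Ms <= 0 || discogsP50Ms <= 0) {
    throw new Error("latencies must be positive");
  }
  const ratio = discogsP50Ms / digP50Ms; // > 1 means Dig is faster
  if (Math.abs(ratio - 1) <= EVEN_TOLERANCE) return "Even";
  return ratio > 1
    ? `Dig ${ratio.toFixed(1)}x`
    : `Discogs ${(1 / ratio).toFixed(1)}x`;
}
```

Applied to the overall p50 values above, `speedupLabel(108, 212)` yields "Dig 2.0x"; the fuzzy pair, `speedupLabel(201, 194)`, yields "Even".
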
Mitigations Applied
Search fixes shipped in Phase 2 hardening
- ✓Statement timeout enforcement: pinned `db.connection()` with `SET statement_timeout = '3s'`. Max query now bounded per entity type.
- ✓Broad query detection: 30-term high-frequency list + short token heuristic. "Love": 12s → 5ms, "Remix": 15s → 7ms
- ✓Degraded response path: broad release queries return unranked recent matches with `meta.degraded: true` and refinement hint
- ✓websearch_to_tsquery: stricter FTS matching for releases, reducing candidate sets on common terms
- ✓Rank threshold: filter `ts_rank_cd > 0.0001`, eliminating noise results
- ✓Max page size: reduced from 100 to 50, limiting sort cost
- ✓Per-entity-type timeout handling — if one entity type times out, remaining types still return results
- ✓Two-path release search rewrite — Path A ranked + Path B guarded degraded for broad/filtered release queries
- ✓Stop-word empty tsquery short-circuit — "The", "A", "An" etc. return instantly instead of full-table scan. "The": 3s→1ms
- ✓`degraded_reason` observability: tracked in `meta` for all degradation paths (5 reason codes)
- ✓Fuzzy threshold tuning: labels/masters 0.45→0.5, cap 5 results. Warm: 87ms. Cold spikes are cache eviction
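
The broad-query detection and stop-word short-circuit above can be sketched as a routing step that runs before any database call. The term lists here are small illustrative subsets (the real 30-term list is not shown on this page), and the 3-character cutoff for the short-token heuristic and the `planReleaseQuery` name are assumptions.

```typescript
// Illustrative subsets only; not the project's actual lists.
const HIGH_FREQUENCY_TERMS = new Set(["love", "remix", "live", "best", "hits"]);
const STOP_WORDS = new Set(["the", "a", "an", "of", "and"]);

type QueryPlan =
  | { path: "empty"; degradedReason: "empty_tsquery" }
  | { path: "degraded"; degradedReason: "broad_query" }
  | { path: "ranked" };

// Decide how to serve a release query before touching the database.
// minTokenLength is an assumed cutoff for the short-token heuristic.
function planReleaseQuery(q: string, minTokenLength = 3): QueryPlan {
  const tokens = q
    .toLowerCase()
    .split(/\s+/)
    .filter((t) => t.length > 0);
  const meaningful = tokens.filter((t) => !STOP_WORDS.has(t));

  // Only stop words (or nothing): the tsquery would be empty, so return at once.
  if (meaningful.length === 0) {
    return { path: "empty", degradedReason: "empty_tsquery" };
  }

  // Every remaining token is high-frequency or very short: use the degraded path.
  const allBroad = meaningful.every(
    (t) => HIGH_FREQUENCY_TERMS.has(t) || t.length < minTokenLength,
  );
  if (allBroad) {
    return { path: "degraded", degradedReason: "broad_query" };
  }
  return { path: "ranked" };
}
```

Under these assumptions "The" routes to the empty path, "Love" to the degraded path, and "Willie Nelson" to the ranked path. The sketch splits on whitespace only, so punctuation handling is left out.
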
Roadmap & Checklist
Implementation plan execution tracker (manual)
Phase 0A / 0B + Gate A
- ✓System scaffold (monorepo, Fastify, Kysely, migrations, local Postgres/Redis, CI)
- ✓Full profiling for artists/labels/masters + 500k release sample, sizing report, image absence confirmation in sampled releases
- ✓Normalization Dictionary v1 + Preserve/Normalize matrix + QA Gate Spec + Image Strategy v1
- ✓Parser fixtures/tests and LEGAL draft completed; Gate A checklist closed
Phase 1 + Gate B
- ✓7.1 Ingest infra tables + catalog schema + indexes + FTS columns
- ✓7.2 Full-tree parser and ingest pipeline hardening; 52 tests passing
- ✓Raw ingest complete for all 4 entity types into `ingest.raw_entities`
- ✓7.3 Canonical upserts complete for releases, including child fanout tables
- ✓7.5 QA/reconciliation report completed and thresholds recalibrated with evidence
- ✓7.6 / 7.6A Idempotency and restart behavior validated with cursor-based rerun
- ✓7.7 FTS vectors populated (all 18,876,362 releases)
- ✓Gate B closed with caveats documented in implementation plan and handoff snapshot
- ◐Known limitation: release-title `pg_trgm` fuzzy search p99 above target (Phase 2 mitigation)
Phase 2
- ✓Query envelope locked (`docs/phase2-query-envelope.md`): filters, sorts, fuzzy policy, timeout budget, broad query spec
- ✓Response contracts locked (`docs/phase2-response-contracts.md`): search, entity detail, traversal, errors, `meta.degraded`
- ✓Search mitigation plan (`docs/phase2-search-mitigation.md`): release fuzzy disabled in v1
- ✓Multi-entity FTS search with filters (genre/style/year/country) + fuzzy fallback on artist/label/master
- ✓Entity retrieval services: artist, label, master, release (all child tables joined)
- ✓Traversal services: 5 link types (artist→releases, artist→masters, label→releases, master→releases, release→credits)
- ✓Fastify /v1 routes wired: search, entities, traversal — all live-tested against 192GB database
- ✓Benchmark runner (`pnpm benchmark:search`): 32-query suite, 8 categories, acceptance criteria
- ✓`statement_timeout` enforcement via pinned `db.connection()`: 3s per-statement, graceful per-entity-type fallback
- ✓Broad query detection + degraded response path: "Love" 12s→5ms, "Remix" 15s→7ms, max query 19.7s→2.0s
- ✓Query envelope tightening: `websearch_to_tsquery` for releases, rank threshold 0.0001, max page 50
- ✓Benchmark Run 3 (Docker): 4/7 pass. Warm p50 well under all targets
- ✓Native Postgres migration — 58GB catalog cloned from Docker PG to native PG 14 (25 tables, 204M+ rows)
- ✓Benchmark Run 4 (native PG): common-term now passes. Filtered queries confirmed genuinely broken (not Docker artifact)
- ✓Discogs API comparison benchmark: Dig faster in 7/7 categories, 38/46 queries. Overall p50: 36ms vs 223ms
- ✓Filtered query fix (P0): two-path rewrite (`a16df00`), migration indexes + docs sync (`bd00be3`)
- ✓Benchmark Run 5: 0 errors / 96 queries. All filtered queries returning results. Retrieval p95 36ms, Unicode p95 80ms
- ✓Stop-word empty tsquery fix (`0b6df75`): "The" 3s→1ms. Client-side short-circuit before DB hit
- ✓`degraded_reason` observability: tracked in `meta` for all paths (`empty_tsquery`, `broad_query`, `filtered`, `filtered_capped`, `statement_timeout`)
- ✓Warm/cold SLO framework documented in `docs/phase2-search-benchmark-results.md`
- ✓Benchmark Run 6 (`0c03bb9`): 0 errors / 96. Common-term now passes. 4/8 warm SLOs pass
- ✓Fuzzy threshold tuned: labels/masters 0.45→0.5 (warm 87ms in isolation; benchmark spike is cache eviction)
- ✓Phase 2 release decision block locked in benchmark docs + implementation plan gate criteria
- ✓Rate limit middleware + ops hooks shipped in Phase 3 API protection pass
- ✓Startup warmup (`pg_prewarm`): 8 indexes, 325k blocks, verified on Fly
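
The graceful per-entity-type fallback and `degraded_reason` tracking listed above could be combined roughly as follows. The five reason codes come from this checklist; the `Outcome` and `SearchResponse` shapes and the "first reason wins" rule are assumptions, not the locked response contract.

```typescript
type DegradedReason =
  | "empty_tsquery"
  | "broad_query"
  | "filtered"
  | "filtered_capped"
  | "statement_timeout";

type EntityType = "artist" | "label" | "master" | "release";

// Hypothetical per-entity-type result: rows on success, a reason on failure.
type Outcome =
  | { type: EntityType; ok: true; rows: string[] }
  | { type: EntityType; ok: false; reason: DegradedReason };

interface SearchResponse {
  results: Partial<Record<EntityType, string[]>>;
  meta: { degraded: boolean; degraded_reason?: DegradedReason };
}

// Merge per-type outcomes so that one failed type does not drop the others.
function mergeOutcomes(outcomes: Outcome[]): SearchResponse {
  const response: SearchResponse = { results: {}, meta: { degraded: false } };
  for (const outcome of outcomes) {
    if (outcome.ok) {
      response.results[outcome.type] = outcome.rows;
    } else {
      response.meta.degraded = true;
      // Assumption: report the first reason encountered.
      response.meta.degraded_reason = response.meta.degraded_reason ?? outcome.reason;
    }
  }
  return response;
}
```

If the release search hits the 3s statement timeout while artists and labels succeed, the merged response still carries the artist and label rows and sets `meta.degraded` to `true` with reason `statement_timeout`.
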
Phase 3
- ✓REST API hardening: two-tier rate limiting (60/min anon, 300/min keyed), CORS, structured JSON logging
- ✓MCP server: 6 tools (`search_catalog`, `get_artist`, `get_label`, `get_master`, `get_release`, `traverse_links`)
- ✓MCP SSE transport via Express (port 3001), 18 contract tests + 47 smoke tests passing
- ✓Deployed to Fly.io: dig-api + dig-mcp, Fly Postgres (2.5M masters, 50k releases), Upstash Redis
- ✓Rollback drill executed (v2→v1→latest, health verified)
- ✓Production benchmark Run 7: 32 queries, 0 errors, p50 117ms (internet round trip)
- ✓Gate D: GO (staging alpha) — all required criteria met
- ✓Claude Desktop + Claude Code MCP verified (both environments confirmed working)
- ✓API quickstart doc with curl examples, MCP setup, error codes (`docs/quickstart.md`)
- ✓Ops runbook: 4 incident types, deployment, rollback, DB access (`docs/ops-runbook.md`)
- ✓Alpha invite brief with staging limitations + usage policy (`docs/alpha-invite.md`)
- ✓Phase 4 prerequisites: migration plan, capacity plan, cost estimates (`docs/phase4-prerequisites.md`)
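
The two-tier limits (60/min anonymous, 300/min keyed) could be enforced with something like the fixed-window counter below. This is a sketch only: the shipped middleware is not shown here, the fixed-window algorithm and in-memory `Map` are assumptions, and `FixedWindowLimiter` is a hypothetical name.

```typescript
// Per-minute limits taken from the checklist above.
const LIMITS = { anon: 60, keyed: 300 } as const;
const WINDOW_MS = 60_000;

type Tier = keyof typeof LIMITS;

class FixedWindowLimiter {
  // One counter per client; clientId would be an IP for anon or an API key for keyed.
  private windows = new Map<string, { start: number; count: number }>();

  // Returns true when the request is allowed. The clock is passed in
  // so the behavior is deterministic and easy to test.
  allow(clientId: string, tier: Tier, nowMs: number): boolean {
    const current = this.windows.get(clientId);
    if (current === undefined || nowMs - current.start >= WINDOW_MS) {
      // First request, or the previous window has expired: start a new one.
      this.windows.set(clientId, { start: nowMs, count: 1 });
      return true;
    }
    if (current.count >= LIMITS[tier]) {
      return false; // over the tier's limit for this window
    }
    current.count += 1;
    return true;
  }
}
```

A fixed window allows short bursts at window boundaries; a sliding window or token bucket would smooth that out at the cost of more state.
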
Phase 4
- ✓Full releases dataset migration complete (~555M rows, 12 tables, row counts verified)
- ✓ANALYZE + search_vector verification — all 18.9M releases populated
- ✓Run 8 benchmark: 0/96 errors, p50 108ms, 7/7 warm SLOs pass
- ✓Dump cleanup (22GB freed) + DB scaled to serving profile (shared-cpu-2x, 4GB)
- ✓Next.js frontend scaffold: search + release detail pages, CSS Modules design system, server-side API fetch with timeout + runtime guards
- ✓Deployed to Fly.io (`dig-web`, always-on), migrated from Vercel: no cold starts
- ✓Master-first search IA: grouped result sections, duplicate release collapse under master releases
- ✓New entity pages: `/master/[id]` and `/artist/[id]`, linked from search + release views
- ✓Release page upgraded with per-track expandable credits (performer/engineer/producer roles)
- ✓Cover Art Archive integration — 1.77M crosswalks, cover proxy + Redis cache, frontend display
- ✓Search warmup (`pg_prewarm`): 8 indexes, 325k blocks, all warm queries <200ms
- ✓Soft alpha invite: 5 keys issued, invite brief updated with web UI + cold-start caveats
- ✓Step 9: filtered release query hardening under concurrency (c100) — 0 timeouts/errors after index + fallback changes
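
For the `pg_prewarm` warmup step, a startup hook might build one statement per index, as sketched below. `idx_releases_master` is the only index named on this page; the full list of 8 is not shown, so anything else passed in is a placeholder. The `{ text, values }` shape follows node-postgres conventions, and no database client is imported here.

```typescript
// A parameterized statement in the { text, values } shape used by
// node-postgres style clients.
interface PrewarmStatement {
  text: string;
  values: string[];
}

// Build one pg_prewarm call per index name. Names resolve through the
// session search_path, so schema-qualify them if needed (assumption).
function buildPrewarmStatements(indexNames: string[]): PrewarmStatement[] {
  return indexNames.map((name) => ({
    // pg_prewarm(regclass) loads the relation into the buffer cache and
    // returns the number of blocks it prewarmed.
    text: "SELECT pg_prewarm($1::regclass) AS blocks",
    values: [name],
  }));
}

// Sum per-index block counts. The function returns bigint, which
// node-postgres delivers as a string by default, so convert before summing.
function totalBlocks(blockCounts: number[]): number {
  return blockCounts.reduce((sum, n) => sum + n, 0);
}
```

The `pg_prewarm` extension has to be installed in the database (`CREATE EXTENSION pg_prewarm`) before these statements will run.
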
Phase 4A (EN-A)
- ✓Step 1 (plan locked): scope, non-goals, gates EN-A→EN-D, and source trust policy documented (`docs/enrichment-implementation-plan.md`).
- ✓Step 2 (schema spec ready): EN-A DDL with keys, indexes, constraints, and upsert strategy drafted (`docs/enrichment-migration-spec-en-a.md`).
- ✓Step 3 (staged prep complete): source manifests, field mappings, crosswalk candidate templates, and parser/adaptor skeletons are in repo.
- ✓Step 4 (migration applied): `006_enrich_schema.ts`, 8 tables in `enrich.*` schema, applied local + Fly staging.
- •Step 5 (sample ingest): load a small MB/Wikidata slice into `enrich.*`, generate crosswalks, and verify additive-only behavior (no `catalog.*` overwrite).
- •Step 6 (EN-A gate): pass precision sampling + latency delta checks, then enable enrichment in traversal behind `include_enrichment` and `min_confidence`.
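
A rough sketch of what "additive-only" enrichment behind `include_enrichment` and `min_confidence` could look like at the response level. The two option names come from Step 6; the record shapes and the `buildArtistView` helper are hypothetical and do not reflect the `enrich.*` DDL.

```typescript
// Hypothetical shapes for illustration.
interface CatalogArtist {
  id: number;
  name: string;
}

interface EnrichmentFact {
  artistId: number;
  field: string;
  value: string;
  confidence: number; // 0..1
  source: string;
}

interface TraversalOptions {
  include_enrichment: boolean;
  min_confidence: number;
}

interface ArtistView {
  catalog: CatalogArtist; // always the unmodified catalog record
  enrichment: EnrichmentFact[]; // extra facts, kept in a separate key
}

function buildArtistView(
  artist: CatalogArtist,
  facts: EnrichmentFact[],
  options: TraversalOptions,
): ArtistView {
  const enrichment = options.include_enrichment
    ? facts.filter(
        (f) => f.artistId === artist.id && f.confidence >= options.min_confidence,
      )
    : [];
  // Additive-only: catalog fields are copied as-is and never overwritten,
  // even when an enrichment fact targets the same field name.
  return { catalog: { ...artist }, enrichment };
}
```

Keeping enrichment under its own key means a low-quality crosswalk can be filtered or switched off without touching catalog data.
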
Phase 5
- ✓Day 1-6 complete: SLO baseline, filtered-query hardening, track-level credits, search IA upgrade, telemetry, alpha ops gate
- •Day 7: soft-alpha invites + first 24h triage loop
- •Media embeds spike planned for release/version pages (version-first, release-level fallback): `docs/media-embeds-release-version-plan.md`
- •User auth + collections remain post-alpha scope
Data layer: 18.9M releases + 2.5M masters + 2.3M labels + 584k artists — full Discogs CC0 February 2026 dump, now live on Fly.io. Disk: 156GB / 300GB.
Search: Postgres FTS (`tsvector`) for all entity types + pg_trgm fuzzy for artist/label/master. Release fuzzy disabled in v1. Run 8: 7/7 warm SLOs pass.
Live: dig-api.fly.dev (REST) + dig-mcp.fly.dev (MCP SSE) + dig-web.fly.dev (frontend, Fly.io). Full corpus serving. Cover art live via CAA. DNS cutover to app.dig.baby pending.
Next milestone: Wider alpha rollout (Gate E broader), enrichment data ingest, search refinement.