Strata — Database Archaeology Pipeline
Strata is the sibling methodology to Project Phoenix. Where Phoenix extracts intent from application logic and rebuilds from zero, Strata excavates database complexity, classifies what it finds, and produces a migration brief.
Why Strata Exists
Phoenix assumes a transformation target — rebuild a system in a modern stack. That assumption holds for application code. It breaks on enterprise databases.
A SQL Server installation is not one thing. It is layered:
| Layer | Contents |
|---|---|
| Schema | Tables, views, constraints, indexes, relationships |
| Logic | Stored procedures, functions, triggers |
| Jobs | SQL Agent jobs, schedules, step dependencies |
| Dependencies | Linked servers, synonyms, external aliases, cross-database references |
Each layer has a different relationship to transformation. Some objects rebuild. Some preserve. Some retire. Forcing all four layers through a single rebuild pipeline produces the wrong outputs for three of them.
Strata keeps the concerns separated.
What Strata Is Not
- Not a rebuild pipeline — Strata does not produce application code
- Not a schema migrator — it classifies, it does not execute migration
- Not a subset of Phoenix — it is a parallel methodology with different agents, different artifacts, and a different terminal decision
- Not limited to SQL Server — the methodology applies to any enterprise database with layered complexity
The Five Agents
S-00 Schema Signal → data model map
S-01 Logic Extractor → routine traces
S-02 Job Mapping → operational layer
S-03 Dependency Tracer → external surface
S-04 Classification → terminal brief| Agent | Name | Reads | Produces |
|---|---|---|---|
| S-00 | Schema Signal | Table definitions, ERD, constraints | schema artifacts — one per domain |
| S-01 | Logic Extractor | Stored procedures, functions, triggers | routine artifacts — one per routine |
| S-02 | Job Mapper | SQL Agent jobs, step definitions, schedules | job artifacts — one per job |
| S-03 | Dependency Tracer | Linked servers, synonyms, cross-db refs, aliases | dependency artifacts — one per external surface |
| S-04 | Classifier | All prior artifacts | decision artifact — transform / preserve / retire per object |
No build pass. No certifier. The terminal artifact is the classification map and migration brief — not a production codebase.
The Non-Transformative Scope
The key distinction in Strata is the non-transformative classification.
Not every database object is a rebuild candidate. SQL Agent Jobs, linked servers, and synonyms are operational infrastructure — they schedule, they route, they abstract. The question for these objects is not how to rebuild them but what to do with them:
CONSTRUCT decision
ID billing.nightly-rollup
VERSION 1
─────────────────────────────────────────
object: job
layer: sql-agent
decision: preserve
reason:
Runs nightly billing rollup — no application equivalent.
Must remain operational during migration.
Reconsider after Phase 2 app layer is stable.A decision construct with preserve status says: this object is accounted for, its scope is understood, and the decision is explicit. It does not fall through the cracks. It does not get rebuilt by accident. It does not get forgotten.
EMBER Construct Types
Strata introduces five new construct types to EMBER:
| Construct | Produced by | Purpose |
|---|---|---|
schema | S-00 | Data model signal — tables, relationships, constraints per domain |
routine | S-01 | Stored procedure or function trace — inputs, outputs, dependencies, business logic |
job | S-02 | SQL Agent job definition — steps, schedule, what it touches |
dependency | S-03 | External surface map — linked server, synonym, alias, cross-db reference |
decision | S-04 | Classification result — transform / preserve / retire with explicit reasoning |
All five use the standard EMBER header and separator. The decision construct is the only one unique to Strata — the others could theoretically appear in a Phoenix engagement too.
Example Artifacts
Schema signal — one per domain:
CONSTRUCT schema
ID billing.core
VERSION 1
─────────────────────────────────────────
tables:
invoices → 2.3M rows, partitioned by year
invoice_lines → 18.4M rows, FK to invoices
payment_events → 4.1M rows, append-only
relationships:
invoices → invoice_lines (1:many, CASCADE DELETE)
invoices → payment_events (1:many, no cascade)
constraints:
invoices.status CHECK (status IN ('draft','sent','paid','void'))
invoice_lines.qty CHECK (qty > 0)
indexes:
invoices (customer_id, created_at) — high-read composite
payment_events (invoice_id, event_type) — covering
confidence: highRoutine trace — one per stored procedure:
CONSTRUCT routine
ID billing.sp_generate_invoice
VERSION 1
─────────────────────────────────────────
type: stored-procedure
entry: EXEC billing.sp_generate_invoice @order_id, @issued_by
reads: orders, order_lines, customers, tax_rates
writes: invoices, invoice_lines, audit_log
logic:
- Calculates line totals before tax
- Applies customer-specific tax rate if exists, else default
- Inserts invoice header then lines in single transaction
- Writes audit entry on completion
boundaries:
→ linked server REPORTING for audit mirror (write-through)
gaps:
- No rollback on audit write failure — silent data loss possible
- Tax rate lookup has no fallback if customer record missing
confidence: highJob definition — one per SQL Agent job:
CONSTRUCT job
ID billing.nightly-rollup
VERSION 1
─────────────────────────────────────────
schedule: daily 02:00 UTC
steps:
1 → exec billing.sp_close_day_invoices
2 → exec reporting.sp_refresh_billing_summary
3 → exec dbo.sp_archive_processed_events (linked server)
dependencies:
→ sp_close_day_invoices must complete before step 2
→ step 3 requires REPORTING linked server available
outputs:
billing_summary table refreshed
processed_events archived to REPORTING
failure-behaviour:
step 1 failure → alert DBA, abort
step 2 failure → continue, flag in log
step 3 failure → silent (non-critical archive)
decision: preserveDecision — the terminal classification:
CONSTRUCT decision
ID billing.classification
VERSION 1
─────────────────────────────────────────
scope: billing domain — 4 tables, 12 routines, 3 jobs, 1 linked server
transform:
sp_generate_invoice → extract to application service layer
sp_apply_credit → extract to application service layer
sp_void_invoice → extract to application service layer
preserve:
nightly-rollup job → operational, no app equivalent
REPORTING linked server → archive target, retain during migration
retire:
sp_legacy_batch_import → superseded by API ingestion (2019)
sp_fix_tax_rounding → one-time correction, never called post-2021
open:
sp_calculate_arrears → business logic unclear, requires domain expert reviewHow Strata Feeds Phoenix
Strata and Phoenix are siblings. On a full engagement — an application backed by an enterprise database — both run. The sequence:
Strata S-00 → S-04 (database excavation, 1–2 days)
↓
decision artifact → feeds Phoenix A-04 (Solution Architect)
→ informs Phoenix A-05 (Builder) on data access layer
→ scopes what A-06 (Certifier) needs to verifyThe Strata decision artifact answers the question Phoenix A-04 would otherwise have to guess: what does the database layer require, what can change, and what must stay intact. Without it, the architect is making assumptions. With it, the assumptions are named decisions.
What Strata Does Not Produce
- No production code — classification only, not implementation
- No schema migration scripts — that is execution, not archaeology
- No application rebuild — use Phoenix for the application layer
- No runtime — Strata-runtime is on the roadmap; the methodology ships first
Relationship to Phoenix
Phoenix Strata
────────────────────────────── ──────────────────────────────────
Application layer Database layer
Transformation pipeline Archaeology pipeline
Terminal artifact: codebase Terminal artifact: classification map
7 agents 5 agents
Constructs: signal → cert Constructs: schema → decision
Rebuild from zero Classify and decideBoth use EMBER. Both produce .sil artifacts. Both are read by humans and agents without a manual.
Status
Strata is in specification. The methodology is defined. The runtime (strata-runtime) is on the roadmap.
- Methodology: this document
- Runtime: not yet built
- EMBER constructs:
schema,routine,job,dependency,decision— specified, not yet in@semanticintent/ember
Strata v0.1 — Project Phoenix family — phoenix.semanticintent.dev/strata/