---
title: Verify Session Artifacts Against Canonical Sources Before Creating Compounding Docs
date: 2026-04-24
category: docs/solutions/workflow-issues
module: ce-compound
problem_type: workflow_issue
component: documentation
severity: high
applies_when:
  - Merging session ledgers into docs/solutions
  - Creating compounding documentation from transient artifacts
  - Extracting labels, config values, or data points from session files
symptoms:
  - Documentation contains outdated information
  - Agent creates docs without cross-checking canonical sources
  - Inaccurate labels propagated to durable documentation
root_cause: missing_workflow_step
resolution_type: workflow_improvement
related_components:
  - ce-compound
  - analysis
tags: [ce-compound, session-ledgers, canonical-sources, verification, documentation-quality, svd-labels]
---

# Verify Session Artifacts Against Canonical Sources Before Creating Compounding Docs

## Context

During a `ce-compound` ledger-to-docs merge, an agent read an old session ledger (`ses_2b9f`) from `thoughts/ledgers/` and extracted SVD component labels. These labels were written into a new `docs/solutions/` file as authoritative documentation. However, the labels in the ledger were stale: the canonical source (`analysis/config.py` `SVD_THEMES`) had since been updated. The user caught the discrepancy before the doc was committed and flagged it for correction.

Session ledgers are generated at capture time and may become stale as the codebase evolves. They are snapshots, not authorities, so using their content directly risks propagating outdated information into durable docs.

## Guidance

When merging session artifacts into compounding documentation:

1. **Identify the canonical source** for every data point extracted from a session file. If the information exists in the codebase (config, database schema, function output), that is the canonical source, not the ledger.
2. **Cross-check all extracted values** against the canonical source before writing. For SVD labels, verify against `analysis/config.py` `SVD_THEMES`. For quantitative claims, run the pipeline function that produces them. For schema details, check the model or migration.
3. **When in doubt, ask the user** which source to use. Do not assume a ledger file is current unless you have confirmed it.
4. **Tag the doc with relevant components** (e.g., `analysis`) so future sweeps can detect drift.
5. **If the canonical source has changed since the ledger was captured**, update the doc to reflect the current state, not the ledger state.

## Why This Matters

Session ledgers are transient artifacts. They capture what was true at a point in time, not what is true now. Treating them as authoritative introduces stale data into the durable documentation layer, which erodes trust and requires expensive corrections later.

This is the same class of problem as hardcoding blog numbers from memory. The fix is to route every data point through its canonical source. Unverified documentation is worse than no documentation because it misleads with apparent authority.

## When to Apply

- When `ce-compound` extracts labels, values, or claims from a session ledger
- When creating any `docs/solutions/` doc whose content depends on codebase state (config values, function outputs, schema)
- When a session file references code or config that has been modified since the session was recorded

## Examples

**Actual incident (outdated SVD labels):**

A ledger from an old session contained SVD component labels that described motion patterns. These labels had been revised in `analysis/config.py` (the `SVD_THEMES` dict) as the voting analysis matured.

- ❌ What happened: Agent extracted the labels from the ledger and created `docs/solutions/insights/svd-voting-patterns-by-component-2026-04-04.md` using them
- ✅ What should have happened: Agent verified each label against `analysis/config.py` `SVD_THEMES`, found that the canonical source had updated values, and used the current values instead (or flagged the discrepancy to the user)

## Related

- `docs/solutions/best-practices/blog-numbers-from-pipeline-outputs-2026-04-16.md`: same principle applied to blog copy; always derive data from canonical pipeline functions, not memory or artifacts
- `docs/solutions/workflow-issues/trajectories-diagnostic-false-alarm-2026-03-31.md`: another instance of trusting an intermediary artifact (diagnostic JSON) without verifying against the canonical database state
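
## Verification Sketch

The cross-check in Guidance steps 1 and 2 can be sketched as a small comparison between values extracted from a ledger and the values in the canonical source. This is a minimal illustration, not the `ce-compound` implementation: the `Discrepancy` class, the `find_discrepancies` function, and the sample labels are all hypothetical, and the sketch assumes `SVD_THEMES` maps a component identifier to a label string (the source only states that it is a dict).

```python
from dataclasses import dataclass
from typing import Any, List, Mapping, Optional


@dataclass(frozen=True)
class Discrepancy:
    """One key whose ledger value disagrees with the canonical value."""
    key: Any
    ledger_value: Optional[str]      # None if the key is missing from the ledger
    canonical_value: Optional[str]   # None if the key is missing from the canonical source


def find_discrepancies(
    ledger_values: Mapping[Any, str],
    canonical_values: Mapping[Any, str],
) -> List[Discrepancy]:
    """Return every key whose ledger value differs from the canonical one.

    Keys present on only one side are reported too, because a label that
    was added or removed since capture is also a sign of a stale ledger.
    """
    discrepancies = []
    all_keys = set(ledger_values) | set(canonical_values)
    for key in sorted(all_keys, key=str):
        ledger_value = ledger_values.get(key)
        canonical_value = canonical_values.get(key)
        if ledger_value != canonical_value:
            discrepancies.append(Discrepancy(key, ledger_value, canonical_value))
    return discrepancies


# Hypothetical sample data for illustration only. In the real workflow the
# canonical mapping would be loaded from the codebase instead, for example:
#     from analysis.config import SVD_THEMES
canonical_labels = {1: "label A (revised)", 2: "label B"}
ledger_labels = {1: "label A (as captured)", 2: "label B"}

for d in find_discrepancies(ledger_labels, canonical_labels):
    print(f"component {d.key}: ledger={d.ledger_value!r} canonical={d.canonical_value!r}")
```

If the returned list is non-empty, the doc should use the canonical values or the discrepancy should be flagged to the user, as in Guidance steps 3 and 5. An empty list only shows that the compared keys agree. It does not confirm that the rest of the ledger is current.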