| title | date | category | module | problem_type | component | severity | applies_when | symptoms | root_cause | resolution_type | related_components | tags |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Verify Session Artifacts Against Canonical Sources Before Creating Compounding Docs | 2026-04-24 | docs/solutions/workflow-issues | ce-compound | workflow_issue | documentation | high | [Merging session ledgers into docs/solutions; Creating compounding documentation from transient artifacts; Extracting labels, config values, or data points from session files] | [Documentation contains outdated information; Agent creates docs without cross-checking canonical sources; Inaccurate labels propagated to durable documentation] | missing_workflow_step | workflow_improvement | [ce-compound, analysis] | [ce-compound, session-ledgers, canonical-sources, verification, documentation-quality, svd-labels] |
Verify Session Artifacts Against Canonical Sources Before Creating Compounding Docs
Context
During a ce-compound ledger-to-docs merge, an agent read an old session ledger (ses_2b9f) from thoughts/ledgers/ and extracted SVD component labels. These labels were written into a new docs/solutions/ file as authoritative documentation. However, the labels in the ledger were stale — the canonical source (analysis/config.py SVD_THEMES) had since been updated. The user caught the discrepancy before the doc was committed and flagged it for correction.
Session ledgers are generated at capture time and may become stale as the codebase evolves. They are snapshots, not authorities — using their content directly risks propagating outdated information into durable docs.
Guidance
When merging session artifacts into compounding documentation:
- Identify the canonical source for every data point extracted from a session file. If the information exists in the codebase (config, database schema, function output), that is the canonical source, not the ledger.
- Cross-check all extracted values against the canonical source before writing. For SVD labels, verify against analysis/config.py SVD_THEMES. For quantitative claims, run the pipeline function that produces them. For schema details, check the model or migration.
- When in doubt, ask the user which source to use. Do not assume a ledger file is current unless you have confirmed it.
- Tag the doc with relevant components (e.g., analysis) so future sweeps can detect drift.
- If the canonical source has changed since the ledger was captured, update the doc to reflect the current state, not the ledger state.
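The cross-check step above can be sketched as a small comparison helper. This is a minimal illustration, not the ce-compound implementation: find_stale_labels is a hypothetical name, and the dict shape and label values below are placeholders, since the real structure of SVD_THEMES is not shown in this doc.

```python
from typing import Dict, Optional, Tuple


def find_stale_labels(
    ledger_labels: Dict[int, str],
    canonical_labels: Dict[int, str],
) -> Dict[int, Tuple[str, Optional[str]]]:
    """Return every ledger entry that disagrees with the canonical source.

    Maps component -> (ledger value, canonical value). The canonical value is
    None when the component no longer exists in the canonical source.
    """
    stale = {}
    for component, ledger_value in ledger_labels.items():
        canonical_value = canonical_labels.get(component)
        if canonical_value != ledger_value:
            stale[component] = (ledger_value, canonical_value)
    return stale


# Stand-in for `from analysis.config import SVD_THEMES`. The keys and values
# here are placeholders (assumption), not the project's real labels.
SVD_THEMES = {0: "current label A", 1: "current label B"}

# Placeholder for labels extracted from a session ledger.
ledger_labels = {0: "current label A", 1: "old label B"}

stale = find_stale_labels(ledger_labels, SVD_THEMES)
for component, (ledger_value, canonical_value) in stale.items():
    # Any output here means the doc must use the canonical value,
    # or the discrepancy should be raised with the user.
    print(f"component {component}: ledger={ledger_value!r} canonical={canonical_value!r}")
```

An empty result means the ledger values still match the canonical source and can be written as-is. Any non-empty result should block the write until the values are corrected or the user confirms which source to use.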
Why This Matters
Session ledgers are transient artifacts. They capture what was true at a point in time, not what is true now. Treating them as authoritative introduces stale data into the durable documentation layer, which erodes trust and requires expensive corrections later. This is the same class of problem as hardcoding blog numbers from memory — the fix is to route every data point through its canonical source.
Unverified documentation is worse than no documentation because it misleads with apparent authority.
When to Apply
- When ce-compound extracts labels, values, or claims from a session ledger
- When creating any docs/solutions/ doc whose content depends on codebase state (config values, function outputs, schema)
- When a session file references code or config that has been modified since the session was recorded
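One rough way to spot the last case is to compare file modification times. The sketch below assumes filesystem mtimes are meaningful in your checkout, which is often not true after a fresh clone; commit timestamps from version control would be a more reliable signal. modified_since is a hypothetical helper, and the ledger filename is a placeholder.

```python
from pathlib import Path


def modified_since(canonical_path: Path, ledger_path: Path) -> bool:
    """Return True if the canonical file was modified after the ledger was written."""
    return canonical_path.stat().st_mtime > ledger_path.stat().st_mtime


canonical = Path("analysis/config.py")
# Hypothetical ledger filename, used only for illustration.
ledger = Path("thoughts/ledgers/example-ledger.md")

# Guarded so the sketch is a no-op when run outside the repository.
if canonical.exists() and ledger.exists() and modified_since(canonical, ledger):
    print(f"{canonical} changed after {ledger} was captured; verify before use")
```

A positive result does not prove that the extracted values are stale, only that they need to be checked against the canonical source. A negative result is also weak evidence, so it should not replace the cross-check.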
Examples
Actual incident — outdated SVD labels:
A ledger from an old session contained SVD component labels that described motion patterns. These labels had been revised in analysis/config.py (the SVD_THEMES dict) as the voting analysis matured.
- ❌ What happened: Agent extracted the labels from the ledger and created docs/solutions/insights/svd-voting-patterns-by-component-2026-04-04.md using them
- ✅ What should have happened: Agent verified each label against analysis/config.py SVD_THEMES, found that the canonical source had updated values, and used the current values instead (or flagged the discrepancy to the user)
Related
- docs/solutions/best-practices/blog-numbers-from-pipeline-outputs-2026-04-16.md: same principle applied to blog copy; always derive data from canonical pipeline functions, not memory or artifacts
- docs/solutions/workflow-issues/trajectories-diagnostic-false-alarm-2026-03-31.md: another instance of trusting an intermediary artifact (diagnostic JSON) without verifying against the canonical database state