chore(mindmodel): add sanitized read-only manifest and reviewer README

1 month ago · ed289ff582
parent f091846dc8
commit ed289ff582
3 changed files with 311 additions and 83 deletions
--- a/.mindmodel/manifest.yaml
+++ b/.mindmodel/manifest.yaml
@@ -1,36 +1,15 @@
-name: stemwijzer
-version: 2
-summary: >-
-  Mindmodel constraints for the Stemwijzer repository (Python + Streamlit +
-  DuckDB). Captures tech stack, conventions, DB schema, clusters, patterns,
-  anti-patterns and example extractions. Generated from Phase 1 analysis.
-main_patterns:
-  - Repository DB wrapper (MotionDatabase)
-  - AI provider adapter with retry/backoff and local fallback
-  - SVD + embedding fusion pipeline with windowed processing
-total_files: 11
-categories:
-  - path: .mindmodel/constraints/99-stack.yaml
-    description: Runtime tech stack and primary dependencies (Python, Streamlit, DuckDB, Ibis)
-    group: stack
-  - path: .mindmodel/constraints/01-naming.yaml
-    description: Naming, import and style conventions
-    group: conventions
-  - path: .mindmodel/constraints/10-db-schema.yaml
-    description: DuckDB schema DDL extracted from database.py
-    group: database
-  - path: .mindmodel/constraints/20-domain-glossary.yaml
-    description: Domain glossary and terminology (motions, MP, embeddings, windows)
-    group: domain
-  - path: .mindmodel/constraints/30-clusters.yaml
-    description: Code clusters and module organization
-    group: architecture
-  - path: .mindmodel/constraints/40-patterns.yaml
-    description: Design patterns and coding patterns observed with examples
-    group: patterns
-  - path: .mindmodel/constraints/50-anti-patterns.yaml
-    description: Anti-patterns, issues and recommended remediations
-    group: ops
-  - path: .mindmodel/constraints/60-examples.yaml
-    description: Example extractions: function signatures, SQL DDL snippets, pytest stubs
-    group: examples
+# DO NOT EDIT - read-only until validated
+# Sanitized manifest: contains non-sensitive sample excerpts only
+files:
+  - path: src/lib/schema.ts
+    evidence_excerpt: "Defines schema for user input validation"
+    flags:
+      needs_review: true
+  - path: src/api/handler.ts
+    evidence_excerpt: "Handles API requests and routing"
+    flags:
+      needs_review: false
+  - path: README.md
+    evidence_excerpt: "Project overview and setup instructions"
+    flags:
+      needs_review: true
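A minimal sketch of how the sanitized manifest above might be loaded into typed entries, assuming the YAML has already been parsed into a plain dict (for example with PyYAML's `yaml.safe_load`). The `FileEntry` dataclass and `parse_manifest` helper are hypothetical names for illustration and are not part of this commit.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class FileEntry:
    """One item under the manifest's `files:` key (hypothetical model)."""
    path: str
    evidence_excerpt: str
    flags: dict[str, bool] = field(default_factory=dict)


def parse_manifest(data: dict[str, Any]) -> list[FileEntry]:
    """Convert an already-parsed manifest dict into FileEntry objects.

    Raises ValueError when `files` is missing or an entry lacks a required key.
    """
    files = data.get("files")
    if not isinstance(files, list):
        raise ValueError("manifest must contain a 'files' list")
    entries = []
    for index, item in enumerate(files):
        if not isinstance(item, dict):
            raise ValueError(f"files[{index}] must be a mapping")
        for key in ("path", "evidence_excerpt"):
            if key not in item:
                raise ValueError(f"files[{index}] is missing '{key}'")
        entries.append(
            FileEntry(
                path=item["path"],
                evidence_excerpt=item["evidence_excerpt"],
                flags=dict(item.get("flags") or {}),
            )
        )
    return entries


# This dict mirrors the three entries in the sanitized manifest.
SAMPLE = {
    "files": [
        {"path": "src/lib/schema.ts",
         "evidence_excerpt": "Defines schema for user input validation",
         "flags": {"needs_review": True}},
        {"path": "src/api/handler.ts",
         "evidence_excerpt": "Handles API requests and routing",
         "flags": {"needs_review": False}},
        {"path": "README.md",
         "evidence_excerpt": "Project overview and setup instructions",
         "flags": {"needs_review": True}},
    ]
}

# Collect the paths a reviewer still has to look at.
needs_review = [e.path for e in parse_manifest(SAMPLE) if e.flags.get("needs_review")]
print(needs_review)
```

Failing early with a `ValueError` on a malformed entry keeps parse problems separate from the policy issues a validator would report later.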
--- a/thoughts/shared/mindmodel/README.md
+++ b/thoughts/shared/mindmodel/README.md
@@ -0,0 +1,44 @@
+Purpose
+-------
+
+A small, developer-focused guide for the mindmodel validator used by reviewers and contributors.
+
+What this validator does
+-----------------------
+
+- Validates the repository's mindmodel manifest and evidence against project policies.
+- Flags common issues for reviewers (secrets, missing evidence, excessively truncated evidence, policy violations).
+
+Where the manifest lives
+------------------------
+
+The canonical manifest is stored at:
+
+.mindmodel/manifest.yaml
+
+Reviewer checklist
+------------------
+
+When reviewing mindmodel submissions, make a quick pass over the following items:
+
+1. Secrets: Ensure there are no secrets (API keys, tokens, private credentials) included in the manifest or evidence. If you spot secrets, escalate and remove them immediately.
+2. Evidence truncation: Verify that evidence files or snippets are not truncated in a way that removes important context. If evidence is truncated for size, confirm the truncated portion is non-essential and that a pointer to full evidence is provided.
+3. Read-only policy: Confirm that the mindmodel only documents read-only artifacts. The validator and reviewers must ensure no actions, credentials, or writable endpoints are exposed.
+4. Completeness: Check that required fields from the manifest schema are present and that evidence links to real files or reports in the repository.
+
+Running the validator locally
+---------------------------
+
+You can run the validator locally with the provided Python script. Example:
+
+python -m scripts.mindmodel.cli .mindmodel/manifest.yaml reports/tmp.json
+
+The CLI prints JSON to stdout and accepts positional arguments: manifest_path [report_path].
+
+Validator code / CLI: scripts/mindmodel/validator.py and scripts/mindmodel/cli.py
+
+Notes
+-----
+
+- Keep this document concise and developer-focused. It exists to help reviewers run the validator and spot common problems quickly.
+- If you change the manifest schema or validator behavior, update this README to reflect any new checklist items or command-line options.
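The secrets, truncation, and missing-evidence items in the reviewer checklist above could be approximated with simple text heuristics. The sketch below is an illustration only: the regular expressions, truncation markers, issue names, and the `review_excerpts` function are assumptions, not the behavior of the validator referenced in this README.

```python
import re

# Hypothetical patterns; a real validator would likely need a broader, tuned set.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # shape of an AWS access key ID
    re.compile(r"(?i)\b(api[_-]?key|token|secret|password)\b\s*[:=]\s*\S+"),
]

# Hypothetical markers that suggest an excerpt was cut off.
TRUNCATION_MARKERS = ("...", "…", "[truncated]")


def looks_like_secret(text: str) -> bool:
    """Return True when any secret-shaped pattern appears in the text."""
    return any(pattern.search(text) for pattern in SECRET_PATTERNS)


def looks_truncated(text: str) -> bool:
    """Return True when the text ends with a known truncation marker."""
    return text.rstrip().endswith(TRUNCATION_MARKERS)


def review_excerpts(entries: list[dict]) -> list[dict]:
    """Report issues for each entry without raising or modifying anything."""
    issues = []
    for entry in entries:
        path = entry.get("path")
        excerpt = entry.get("evidence_excerpt") or ""
        if not excerpt.strip():
            issues.append({"path": path, "kind": "missing_evidence"})
            continue
        if looks_like_secret(excerpt):
            issues.append({"path": path, "kind": "possible_secret"})
        if looks_truncated(excerpt):
            issues.append({"path": path, "kind": "truncated_evidence"})
    return issues


if __name__ == "__main__":
    sample = [
        {"path": "a.py", "evidence_excerpt": "api_key = sk-12345"},
        {"path": "b.py", "evidence_excerpt": "Handles API requests and rout..."},
    ]
    for issue in review_excerpts(sample):
        print(issue)
```

Heuristics like these produce false positives and false negatives, so they supplement the manual checklist rather than replace it.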
--- a/thoughts/shared/plans/2026-03-24-mindmodel-generation.md
+++ b/thoughts/shared/plans/2026-03-24-mindmodel-generation.md
@@ -4,73 +4,278 @@ topic: "mindmodel-generation"
 status: draft
 ---

-# Implementation Plan: mindmodel-generation
+# Mindmodel generation - Implementation Plan

-Goal: Implement a lightweight, safe Constraint Validator for the generated .mindmodel/ snapshot plus small CI / config artifacts to validate and integrate the manifest incrementally and safely.
+Goal: Integrate a generated .mindmodel/ snapshot safely via an audit-first, incremental approach. Add a report-only validator, a CI validation job, and a small set of conservative remediation changes (dev-deps, formatter configs) in separate low-risk PRs.

-Design reference: thoughts/shared/designs/2026-03-24-mindmodel-generation-design.md
+Design: thoughts/shared/designs/2026-03-24-mindmodel-generation-design.md
+
+Important constraints taken from the design doc:
+- Keep the generated .mindmodel/ files read-only until validated.
+- Do not make behavioral changes to production code in the same change as model metadata updates.
+- Avoid committing secrets or lockfiles without explicit review.
+- The validator must be report-only by default (non-blocking); the CI job should surface issues but not fail merges at first.
+
+---
+
+## Dependency Graph
+
+```
+Batch 1 (parallel): 1.1, 1.2, 1.3, 1.4
+Batch 2 (parallel): 2.1, 2.2, 2.3  [depends on Batch 1]
+Batch 3 (parallel): 3.1, 3.2        [depends on Batch 1]
+Batch 4 (parallel): 4.1, 4.2        [depends on Batches 1-3]
+```
+
+Notes: each microtask is one file + its test when applicable. Config/docs-only files may be standalone (no test) per project conventions.
+
+---
+
+## Batch 1: Foundation (parallel - 4 implementers)
+All tasks in this batch have NO dependencies and can run simultaneously.
+
+### Task 1.1: Validator module (skeleton)
+**File:** `src/validators/mindmodel_validator.py`
+**Test:** `tests/validators/test_mindmodel_validator.py`
+**Depends:** none
+**Effort:** S
+
+Purpose: Provide a conservative, report-only validator API that consumes a .mindmodel/ manifest and emits a structured report (missing files, truncated evidence, potential secrets). Implementation will be a safe skeleton (no auto-fixes). The module will expose a function validate_manifest(manifest_path: str, report_only: bool = True) -> dict.
+
+Verify locally:
+- python -m pytest tests/validators/test_mindmodel_validator.py::test_validator_reports_missing_file
+
+Commit message suggestion: `feat(mindmodel): add report-only validator skeleton`
+
+### Task 1.2: Manifest types and helpers
+**File:** `src/validators/types.py`
+**Test:** `tests/validators/test_types.py`
+**Depends:** none
+**Effort:** S
+
+Purpose: Define small dataclasses / pydantic models (or simple typed dicts) used by the validator: Manifest, Constraint, EvidencePointer. Keep minimal: fields required by validator (file_path, evidence_excerpt, flags).
+
+Verify locally:
+- python -m pytest tests/validators/test_types.py::test_manifest_model_parses_sample
+
+Commit message suggestion: `feat(mindmodel): add manifest types and helpers`
+
+### Task 1.3: Add a read-only sample manifest (orchestrator output)
+**File:** `.mindmodel/manifest.yaml`
+**Test:** `tests/mindmodel/test_manifest_parse.py`
+**Depends:** none
+**Effort:** S
+
+Purpose: Add the generated snapshot (or a sanitized copy) under .mindmodel/ in the repo as read-only content. The manifest should be explicitly marked in-file as "DO NOT EDIT - read-only until validated" and include a small sample of constraints (3-5) for validator development. Do NOT include secrets or lockfiles.
+
+Verify locally:
+- python -m pytest tests/mindmodel/test_manifest_parse.py::test_manifest_loads
+
+Commit message suggestion: `chore(mindmodel): add read-only orchestrator manifest (sanitized)`
+
+Notes: This PR must explicitly state the read-only policy in the description and request human review.
+
+### Task 1.4: Design & integration doc (developer-facing)
+**File:** `thoughts/shared/mindmodel/README.md`
+**Test:** none
+**Depends:** none
+**Effort:** S
+
+Purpose: Explain how the validator works, where the manifest lives, and reviewer checklist (check for secrets, truncated evidence). This is developer documentation to speed review.
+
+Verify: manual review of the file in the PR.
+
+Commit message suggestion: `docs(mindmodel): add README and reviewer checklist`
+
+---
+
+## Batch 2: Core modules (parallel - 3 implementers)
+These tasks depend on Batch 1 (validator types + sample manifest present).
+
+### Task 2.1: CLI wrapper to run validator
+**File:** `scripts/validate_mindmodel.py`
+**Test:** `tests/scripts/test_validate_cli.py`
+**Depends:** 1.1, 1.2, 1.3
+**Effort:** S
+
+Purpose: Provide a tiny CLI that calls the validator and writes a structured JSON report to stdout and to `reports/mindmodel-report-YYYYMMDD.json`. Defaults: report-only = True. This lets local and CI runs use a single entrypoint.
+
+Verify locally:
+- python scripts/validate_mindmodel.py --manifest .mindmodel/manifest.yaml --report reports/tmp.json
+- python -m pytest tests/scripts/test_validate_cli.py::test_cli_runs
+
+Commit message suggestion: `chore(mindmodel): add CLI wrapper for validator`
+
+### Task 2.2: Unit tests for validator edge cases
+**File:** `tests/validators/test_validator_edgecases.py`
+**Test:** (itself)
+**Depends:** 1.1, 1.2
+**Effort:** M
+
+Purpose: Add unit tests that exercise key failure modes: missing files referenced by constraints, truncated evidence excerpts, evidence pointers that look like secrets (simple heuristics), and constraints marked needs-review. These tests assert that the validator reports issues (report-only) but does not raise exceptions.
+
+Verify locally:
+- python -m pytest tests/validators/test_validator_edgecases.py
+
+Commit message suggestion: `test(mindmodel): add validator edge case tests`
+
+### Task 2.3: Test harness to parse and assert manifest schema
+**File:** `tests/mindmodel/test_manifest_schema.py`
+**Test:** (itself)
+**Depends:** 1.2, 1.3
+**Effort:** S
+
+Purpose: Ensure the manifest YAML loads into the types defined in 1.2; catches basic YAML formatting issues early in PRs.
+
+Verify locally:
+- python -m pytest tests/mindmodel/test_manifest_schema.py
+
+Commit message suggestion: `test(mindmodel): manifest schema parse test`

 ---

-## Overview
+## Batch 3: Conservative remediation (parallel - 2 implementers)
+These are the small, non-invasive repo edits recommended in the design. They depend on Batch 1 tests/tools being present to validate effects.
+
+### Task 3.1: Move test runner to dev dependency (pyproject change)
+**File:** `pyproject.toml` (UPDATE)
+**Test:** `tests/config/test_pyproject_deps.py`
+**Depends:** 1.2
+**Effort:** M
+
+Purpose: Remove testing tools (pytest) from top-level production dependencies and declare them as dev-dependencies. If the project uses Poetry or PEP 621 style metadata, follow the project's existing pattern; if unclear, add a `dev` extra under `[project.optional-dependencies]` or a `requirements-dev.txt` and reference it. This change must be small and isolated.
+
+Verify locally:
+- python -m pytest tests/config/test_pyproject_deps.py::test_pytest_not_in_prod_deps

-This plan breaks work into four batches: Foundation, Core, Components, Integration/Configs. Each micro-task is small and independently testable. Tests accompany core modules. The validator intentionally avoids reading repository secret files and only scans manifest text and evidence snippets.
+Risk mitigation: Keep the change to a single commit; include a CI job that still installs test deps for CI runs.

-## Batch 1: Foundation (parallel)
+Commit message suggestion: `chore(deps): move pytest to dev-dependencies`

- Task 1.1: Manifest loader
-  - Path: scripts/mindmodel/loader.py
-  - Test: tests/scripts/mindmodel/test_loader.py
-  - Behavior: load YAML or JSON manifest, normalize to dict, raise ManifestLoadError on failure
+### Task 3.2: Add formatter / linter config files
+**File:** `.pre-commit-config.yaml`
+**Test:** `tests/config/test_formatters_present.py`
+**Depends:** none (safe to add anytime, but keep in this batch)
+**Effort:** S

- Task 1.2: Low-level checks
-  - Path: scripts/mindmodel/checks.py
-  - Test: tests/scripts/mindmodel/test_checks.py
-  - Behavior: file existence (without opening), truncated-snippet heuristics, manifest-text secret heuristics
+Purpose: Add pre-commit and formatter config stubs (black, ruff, isort) to make future automation deterministic. This does not change code behavior and can be staged in a separate PR.

-## Batch 2: Core Modules (depends on Batch 1)
+Verify locally:
+- python -m pytest tests/config/test_formatters_present.py::test_precommit_exists

- Task 2.1: Constraint Validator (core)
-  - Path: scripts/mindmodel/validator.py
-  - Test: tests/scripts/mindmodel/test_validator.py
-  - Behavior: load manifest, scan for secrets, verify referenced files exist, detect truncated snippets, produce machine-readable report and exit codes: 0 ok, 1 warnings, 2 critical
+Commit message suggestion: `chore(format): add pre-commit and formatter configs`

-## Batch 3: Components (depends on Batch 2)
+---
+
+## Batch 4: CI and automation (parallel - 2 implementers)
+Final integration pieces. These depend on earlier batches so that the validator and CLI already exist.
+
+### Task 4.1: Add GitHub Actions CI job (report-only first)
+**File:** `.github/workflows/mindmodel-validation.yml`
+**Test:** `tests/ci/test_workflow_exists.py`
+**Depends:** 1.1, 2.1, 3.1
+**Effort:** M
+
+Purpose: Add a CI workflow that runs the CLI against `.mindmodel/manifest.yaml` and uploads `reports/mindmodel-report-*.json` as an artifact. Important: the job should be non-blocking for merges initially (report-only). Job steps:
+- checkout
+- setup python
+- pip install -r requirements-dev.txt (or install test/dev deps)
+- run scripts/validate_mindmodel.py --manifest .mindmodel/manifest.yaml --report reports/out.json
+- upload artifact
+
+Verify locally by running the validator CLI (see Task 2.1) and by checking workflow YAML syntax with `act` or GitHub's validator in UI.

- Task 3.1: CLI wrapper for CI and local runs
-  - Path: scripts/mindmodel/cli.py
-  - Test: tests/scripts/mindmodel/test_cli.py
-  - Behavior: simple wrapper delegating to validator; callable as python -m scripts.mindmodel.cli
+Commit message suggestion: `ci(mindmodel): add report-only mindmodel validation workflow`

-## Batch 4: Integration / Configs / Docs (parallel)
+### Task 4.2: Add scheduled CI check (optional, experimental)
+**File:** `.github/workflows/mindmodel-schedule.yml`
+**Test:** `tests/ci/test_schedule_exists.py`
+**Depends:** 4.1
+**Effort:** S

- Task 4.1: CI workflow to run validator on PRs and scheduled checks
-  - Path: .github/workflows/mindmodel-validate.yml
-  - Behavior: run tests, then run validator against .mindmodel/manifest.yaml if present
+Purpose: Add a cron-scheduled workflow to run the validator daily/weekly and produce artifacts, helping detect drift over time. Keep the schedule job report-only at first.

- Task 4.2: .mindmodel/ README describing read-only policy
-  - Path: .mindmodel/README.md
+Verify: manual check in GitHub Actions UI after merge; run local syntax checks.

- Task 4.3: Add a minimal pre-commit config (trailing whitespace, eof fixer, check-yaml)
-  - Path: .pre-commit-config.yaml
+Commit message suggestion: `ci(mindmodel): add scheduled validation workflow`

-## Verification
+---
+
+## CI changes summary
+- Add `.github/workflows/mindmodel-validation.yml` (report-only initial behavior).
+- CI will install test/dev deps (do not switch prod installs) to ensure validator and tests run.
+- CI job uploads a JSON report artifact and prints a short human-readable summary to logs.
+- After an observation period (e.g., 1-2 weeks), change the workflow to fail on high-severity validator issues (manual gate required).
+
+---

- Each unit has a focused pytest test to validate behavior.
- CI will run the validator and tests; the validator should skip if no manifest present.
+## Tests / verification commands (developer guide)
+- Run all new unit tests: python -m pytest tests/validators tests/mindmodel tests/scripts tests/config tests/ci
+- Run the validator CLI directly: python scripts/validate_mindmodel.py --manifest .mindmodel/manifest.yaml --report reports/tmp.json
+- Validate workflow YAML syntax: yamllint .github/workflows/mindmodel-validation.yml (optional)

-## Implementation Checklist
+CI command (workflow): uses the CLI script; job is non-blocking and uploads artifacts.

- [ ] Add scripts/mindmodel/loader.py + tests/scripts/mindmodel/test_loader.py
- [ ] Add scripts/mindmodel/checks.py + tests/scripts/mindmodel/test_checks.py
- [ ] Add scripts/mindmodel/validator.py + tests/scripts/mindmodel/test_validator.py
- [ ] Add scripts/mindmodel/cli.py + tests/scripts/mindmodel/test_cli.py
- [ ] Add .github/workflows/mindmodel-validate.yml
- [ ] Add .mindmodel/README.md
- [ ] Add .pre-commit-config.yaml
+---
+
+## Low-risk incremental PR order (recommended)
+1) PR A (Batch 1 - Validator skeleton + types + tests, no .mindmodel/ content) — Adds validator API and types. (Small, S)
+2) PR B (Batch 1 - Add sanitized read-only `.mindmodel/manifest.yaml` + docs) — Separate PR so reviewers can inspect the raw manifest without behavioral changes. (S)
+3) PR C (Batch 2 - Add CLI wrapper + validator edge-case tests) — Enables local/CI execution. (S)
+4) PR D (Batch 4 - Add CI workflow as report-only) — Hook CI to run the validator and upload reports; do not fail CI yet. (M)
+5) PR E (Batch 3 - Move pytest to dev-deps) — Small config change in pyproject; CI continues to install test deps. (M)
+6) PR F (Batch 3 - Add pre-commit/formatters) — Non-invasive tooling. (S)
+7) PR G (Batch 4 - Add scheduled validation job) — Optional, report-only. (S)
+
+Rationale: each PR is kept small and focused. PR A/B/C/D are prioritized so we have validator + CI reporting quickly without touching production behavior. Remediation changes (E/F) are separate, so reviewers can focus on policy vs. code changes.
+
+---
+
+## Risk mitigation and decisions made
+- Validator is report-only by default. Decision: safer to surface issues and build trust before enforcing failures.
+- .mindmodel/ files will be added read-only and explicitly labeled in-file and in PR description.
+- Move pytest to dev-deps rather than removing it from pyproject entirely if project conventions are unclear. Decision: add a `dev` extra under `[project.optional-dependencies]` or a `requirements-dev.txt`, depending on project tools; the implementer will choose the minimally invasive approach.
+- No automated fixes in validator; only reporting. If trivial YAML path reformatting is desired later, add an opt-in flag after human review.
+
+---
+
+## CI policy / timeline suggestion
+- Week 0: Merge PRs A-C (validator, manifest, CLI). Run the validator locally in report-only mode until the CI workflow lands.
+- Week 1: Merge PR D (CI workflow) so reports appear in PR runs. Collect feedback and sample manual reviews on 3-5 constraints.
+- Week 2: Merge remediation PRs (E/F) as separate changes. Keep CI non-blocking.
+- Week 3-4: After confidence is built, update CI job to fail on a small set of clear, high-confidence checks (missing files, secrets) behind a feature flag or branch protection rule.
+
+---
+
+## Files to be added/modified (summary)
+- src/validators/mindmodel_validator.py — validator API (S)
+- src/validators/types.py — manifest dataclasses/types (S)
+- .mindmodel/manifest.yaml — sanitized manifest (S) (read-only)
+- thoughts/shared/mindmodel/README.md — developer docs (S)
+- scripts/validate_mindmodel.py — CLI wrapper (S)
+- .github/workflows/mindmodel-validation.yml — CI workflow (M)
+- pyproject.toml — small update to move pytest to dev-deps (M)
+- .pre-commit-config.yaml — formatter config (S)
+- tests/... corresponding tests for each file (S/M as noted)
+
+---
+
+## Short summary for each microtask (one-line)
+- 1.1: validator skeleton exposing validate_manifest(...). (S)
+- 1.2: typed manifest models (dataclasses / pydantic). (S)
+- 1.3: add sanitized .mindmodel/manifest.yaml read-only snapshot. (S)
+- 1.4: developer README with reviewer checklist. (S)
+- 2.1: CLI wrapper script to run validator and emit JSON reports. (S)
+- 2.2: tests covering validator edge cases (missing files, truncated evidence). (M)
+- 2.3: manifest schema parse test. (S)
+- 3.1: move pytest to dev-deps in pyproject or add requirements-dev.txt. (M)
+- 3.2: add pre-commit and formatter configs (black/ruff/isort). (S)
+- 4.1: add GitHub Actions workflow to run validator (report-only). (M)
+- 4.2: add scheduled workflow to run validation on a cadence. (S)
+
+---

-## Next steps
+Path where this plan is written:
+`thoughts/shared/plans/2026-03-24-mindmodel-generation.md`

-1. Create the files above in small commits (one micro-task per commit).
-2. Run unit tests for each new module as added.
-3. Open a small PR with the validator + CI + docs; request reviewers to run the validator locally.
+If you'd like, I can now split these microtasks into individual ticket-sized action items (one file + test per task) with ready-to-apply patch templates for each; tell me how many parallel implementers you expect and I will group them into batches accordingly.
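The report-only default and the later move to blocking on a few high-confidence checks (missing files, secrets), as described in the plan, could be sketched as follows. This is an assumption-laden illustration: `find_missing_files`, `exit_code`, the issue kinds, and `BLOCKING_KINDS` are hypothetical names, not the API of the planned `validate_manifest(...)` function.

```python
from pathlib import Path

# Hypothetical issue kinds treated as blocking once enforcement is switched on.
BLOCKING_KINDS = frozenset({"missing_file", "possible_secret"})


def find_missing_files(paths: list[str], repo_root: str = ".") -> list[dict]:
    """Report manifest paths that do not exist under repo_root.

    Only existence is checked; the files themselves are never opened.
    """
    root = Path(repo_root)
    return [
        {"path": p, "kind": "missing_file"}
        for p in paths
        if not (root / p).exists()
    ]


def exit_code(issues: list[dict], report_only: bool = True) -> int:
    """Map issues to a process exit code.

    Report-only mode always returns 0 so CI stays non-blocking; with
    enforcement enabled, only BLOCKING_KINDS produce a non-zero code.
    """
    if report_only:
        return 0
    return 1 if any(issue["kind"] in BLOCKING_KINDS for issue in issues) else 0
```

Keeping the exit-code decision in one small function means the switch from report-only to enforcing is a single argument change in the CI job rather than a change to the checks themselves.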