Sven Geboers
8af27bbf04
feat: implement agent-native architecture (U1-U6)
Implements the agent-native architecture plan (docs/plans/2026-05-01-002-agent-native-architecture-plan.md):
- U1: Database query primitives (agent_tools/database.py)
  - query_motions, query_votes, query_svd_vectors, query_party_positions, query_pipeline_status
- U2: Pipeline control primitives (agent_tools/pipeline.py)
  - pipeline_run_stage, pipeline_run_full, pipeline_check_health, pipeline_get_logs, pipeline_validate_output
- U3: Analysis & report generation (agent_tools/analysis.py, reports.py)
  - analyze_party_shift, analyze_axis_stability, validate_svd_labels, generate_report
- U4: Content validation primitives (agent_tools/content.py)
  - validate_motion_coverage, validate_layman_explanations, suggest_svd_label, check_embedding_quality
- U5: System prompt & context injection (SYSTEM_PROMPT.md, context.py, context.md)
- U6: Parity verification tests (tests/agent_tools/test_parity.py)
Tests: 238 passed, 2 skipped
AGENTS.md updated to surface agent_tools/
4 weeks ago
Sven Geboers
07dd393533
cleanup: remove stale .mindmodel, old venvs, orphaned code, and transient artifacts
Removes:
- .mindmodel/ directory and related CI workflows (mindmodel-schedule.yml, mindmodel-validation.yml)
- scripts/mindmodel/ and scripts/validate_mindmodel.py
- src/types/ and src/validators/ (orphaned type modules, only used by mindmodel)
- tests/ci/, tests/scripts/mindmodel/, tests/types/, tests/validators/ (mindmodel-only tests)
- thoughts/ledgers/ and thoughts/shared/ (stale transient directories)
- .venv_axis and .venv_plotly (orphaned virtual environments, ~1.1 GB)
- outputs/blog-charts/ (stale generated HTML files)
- data/*.json sidecars (empty cache artifacts)
- __pycache__ and *.pyc files across repo
Updates:
- .gitignore: remove thoughts/shared/analyses/ entry
Space reclaimed: at least ~1.1 GB
4 weeks ago
Sven Geboers
375955dbc4
cleanup: merge session ledgers into docs/solutions and delete artifacts
- Remove stale thoughts/ledgers/ and thoughts/shared/ artifacts
- Fix .gitignore duplicate .worktrees entry
- Move pyright to [dependency-groups] dev
- Replace hardcoded blog correlation with reproducible metric reference
- Add docs: verify-session-artifacts, fusion-vector-dimensions,
  working-tree-hygiene
- Update blog-numbers-from-pipeline-outputs with correlation example
4 weeks ago
Sven Geboers
eb71328967
chore: commit remaining modified files from refactoring
2 months ago
Sven Geboers
10c9b78d16
Add .worktrees/ to .gitignore
2 months ago
Sven Geboers
35f4667982
chore(secrets): stop tracking .env and add to .gitignore
2 months ago
Sven Geboers
2891e9ee70
feat: add StemAtlas Streamlit app, explorer, Docker deployment, blog charts
2 months ago
Sven Geboers
daa22c5e2b
feat: complete parliamentary embedding pipeline with full historical coverage
- Add fused (SVD + text) embedding pipeline for annual windows 2016-2026
- Fix store_fused_embedding duplicate bug: DELETE before INSERT (idempotent)
- Add --text-batch-size CLI flag to run_pipeline.py (default 200)
- Add explicit --start-date/--end-date to download_past_year.py
- Backfill mp_votes for all motions (party-level votes, 111k new rows)
- Add similarity cache recompute: 212k rows across 9 annual windows
- Improve ai_provider retry logic, text_pipeline batching
- Improve analysis/political_axis PCA handling and visualizations
- Add diagnostic/utility scripts: compare_svd, generate_compass, inspect_axis, etc.
- Untrack data/motions.db (3.6 GB binary), add to .gitignore with outputs/
- Update continuity ledger with full session state
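
The store_fused_embedding fix above (DELETE before INSERT) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the commit does not show the function's real signature or the fused_embeddings columns, and the standard-library sqlite3 module stands in for DuckDB so the example runs without extra dependencies.

```python
import sqlite3


def make_demo_connection():
    """Create an in-memory database with a hypothetical fused_embeddings table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fused_embeddings ("
        "motion_id INTEGER, window_id TEXT, vector TEXT)"
    )
    return conn


def store_fused_embedding(conn, motion_id, window_id, vector_json):
    """Store one embedding so that repeated calls never create duplicate rows.

    Hypothetical signature: the existing row for (motion_id, window_id) is
    deleted first, then the new row is inserted, both in one transaction.
    """
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "DELETE FROM fused_embeddings WHERE motion_id = ? AND window_id = ?",
            (motion_id, window_id),
        )
        conn.execute(
            "INSERT INTO fused_embeddings (motion_id, window_id, vector) "
            "VALUES (?, ?, ?)",
            (motion_id, window_id, vector_json),
        )


def count_rows(conn):
    """Return the total number of stored embeddings."""
    return conn.execute("SELECT COUNT(*) FROM fused_embeddings").fetchone()[0]


if __name__ == "__main__":
    conn = make_demo_connection()
    store_fused_embedding(conn, 1, "2016", "[0.1, 0.2]")
    store_fused_embedding(conn, 1, "2016", "[0.3, 0.4]")  # re-run of the same key
    print(count_rows(conn))
```

Running the delete and the insert in a single transaction means a re-run of the pipeline replaces a row rather than appending a second copy, which is what makes the write idempotent.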
2 months ago
Sven Geboers
a36e6cba4e
feat(pipeline): implement parliamentary embedding pipeline MVP
- Add 4 migration files: mp_votes, mp_metadata, svd_vectors, fused_embeddings
- Extend database.py with 5 new helper methods and table init
- Add pipeline/ package: extract_mp_votes, fetch_mp_metadata, text_pipeline,
  svd_pipeline (with Procrustes alignment), fusion
- Add full test suite (17 tests) covering all pipeline modules and migrations
- Fix Procrustes alignment bug: scipy scale is a norm value, not a multiplier
- Fix DuckDB date type handling in test assertions (datetime.date vs string)
- Remove duckdb.py shim; tests now run against real duckdb + scipy via uv
Ref: thoughts/shared/plans/2026-03-21-parliamentary-embedding-pipeline-plan.md
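
The Procrustes fix above ("scale is a norm value, not a multiplier") can be spelled out as a short derivation. It assumes the pipeline calls scipy.linalg.orthogonal_procrustes, which the commit does not state; A denotes the matrix being aligned and B the reference, both symbols introduced here for illustration.

```latex
\begin{aligned}
A^{\top} B &= U \Sigma V^{\top}, \qquad
R = U V^{\top} = \arg\min_{R^{\top} R = I} \lVert A R - B \rVert_F, \\
\text{scale} &= \operatorname{tr}(\Sigma) = \lVert A^{\top} B \rVert_{*}, \\
s^{\star} &= \arg\min_{s} \lVert s\, A R - B \rVert_F
  = \frac{\operatorname{tr}(\Sigma)}{\lVert A \rVert_F^{2}}.
\end{aligned}
```

Under that assumption, the returned scale is the sum of the singular values of A^T B (a nuclear norm). It coincides with the least-squares scaling factor s* only when A has unit Frobenius norm, so multiplying A R by scale directly would generally over- or under-scale the aligned vectors.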
2 months ago