The mp_votes table contains both party-aggregate rows (e.g. 'PVV', 'NSC')
and individual MP rows (e.g. 'Aardema, M.'). Running SVD on both together
produces a block-diagonal vote matrix in which party codes and individual
MPs occupy disjoint SVD dimensions, so dim 0 is zero for all 421 MPs.
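
A toy illustration of the failure mode, not the real data: the matrix below is
hypothetical, with party rows and MP rows voting on disjoint sets of motions
(0 = no record). The leading singular vector then lives entirely in the party
block, so every MP row gets a zero on dim 0.

```python
import numpy as np

# Hypothetical toy matrix: rows are voters, columns are motions.
# Rows 0-2 are party aggregates with records on motions 0-2 only;
# rows 3-4 are individual MPs with records on motions 3-5 only.
votes = np.array([
    [ 1.0,  1.0,  1.0,  0.0,  0.0,  0.0],   # party row
    [ 1.0,  1.0,  1.0,  0.0,  0.0,  0.0],   # party row
    [-1.0, -1.0, -1.0,  0.0,  0.0,  0.0],   # party row
    [ 0.0,  0.0,  0.0,  1.0, -1.0,  1.0],   # individual MP
    [ 0.0,  0.0,  0.0, -1.0,  1.0,  1.0],   # individual MP
])

U, S, Vt = np.linalg.svd(votes, full_matrices=False)

# The largest singular value comes from the party block, so the
# first left singular vector has no weight on the MP rows.
mp_dim0 = U[3:, 0]
party_dim0 = U[:3, 0]
print("dim 0 for MP rows:   ", np.round(mp_dim0, 6))
print("dim 0 for party rows:", np.round(party_dim0, 6))
```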
Fix: _build_expanded_rows() expands every party-level vote into one vote per
MP who was active on the motion date, using the mp_metadata date ranges.
Motions that already have individual MP records are kept as-is. A party name
mapping handles NSC/Nieuw Sociaal Contract and other canonical name variants.
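
A minimal sketch of the expansion idea, not the actual _build_expanded_rows()
code: the helper names, the record layout, the alias table and the MP names
below are all hypothetical stand-ins for what mp_metadata provides.

```python
from datetime import date

# Assumed alias table: maps name variants to one canonical party code.
PARTY_ALIASES = {"Nieuw Sociaal Contract": "NSC"}


def canonical_party(name):
    """Return the canonical party code for a party name variant."""
    return PARTY_ALIASES.get(name, name)


def expand_party_vote(party, vote, motion_date, mp_metadata):
    """Turn one party-level vote into (mp_name, vote) rows for the
    MPs of that party who were active on motion_date."""
    party = canonical_party(party)
    rows = []
    for mp in mp_metadata:
        if canonical_party(mp["party"]) != party:
            continue
        started = mp["start"] <= motion_date
        # end is None for MPs who are still seated.
        not_ended = mp["end"] is None or motion_date <= mp["end"]
        if started and not_ended:
            rows.append((mp["name"], vote))
    return rows


# Hypothetical metadata rows, for illustration only.
metadata = [
    {"name": "MP A", "party": "NSC",
     "start": date(2023, 12, 6), "end": None},
    {"name": "MP B", "party": "Nieuw Sociaal Contract",
     "start": date(2023, 12, 6), "end": date(2024, 6, 30)},
    {"name": "MP C", "party": "PVV",
     "start": date(2017, 3, 23), "end": None},
]

expanded = expand_party_vote("NSC", "voor", date(2024, 9, 1), metadata)
print(expanded)
```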
Results for current_parliament: 517 individual MPs, all 8732 motions covered,
dim 0 std=23.1 (was 0.0 for all MPs). PVV/NSC/BBB fall on the positive end and
SP/GL/PvdD on the negative end, which matches the expected left-right
political axis.
All 11 annual windows (2016-2026) were re-run with the new pipeline.
- Add 4 migration files: mp_votes, mp_metadata, svd_vectors, fused_embeddings
- Extend database.py with 5 new helper methods and table init
- Add pipeline/ package: extract_mp_votes, fetch_mp_metadata, text_pipeline,
  svd_pipeline (with Procrustes alignment), fusion
- Add full test suite (17 tests) covering all pipeline modules and migrations
- Fix Procrustes alignment bug: the scale returned by scipy is a norm value
  (sum of singular values), not a multiplier for the aligned vectors
- Fix DuckDB date type handling in test assertions (datetime.date vs string)
- Remove duckdb.py shim; tests now run against real duckdb + scipy via uv
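
To illustrate the Procrustes scale fix above: a numpy-only sketch that mirrors
what scipy.linalg.orthogonal_procrustes is documented to return (rotation R
plus scale = sum of singular values of A.T @ B). The helper and the random
data are illustrative assumptions, not the svd_pipeline code.

```python
import numpy as np


def orthogonal_procrustes(A, B):
    """Numpy stand-in for the scipy routine: returns the orthogonal R
    minimising ||A @ R - B||_F and the sum of singular values of A.T @ B."""
    u, w, vt = np.linalg.svd(A.T @ B)
    return u @ vt, w.sum()


rng = np.random.default_rng(0)
A = rng.normal(size=(20, 3))       # hypothetical embedding window

theta = 0.7                         # B is A rotated about the z axis
Q = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           1.0],
])
B = A @ Q

R, scale = orthogonal_procrustes(A, B)

aligned = A @ R                     # correct: apply the rotation only
wrong = scale * (A @ R)             # the bug: treating scale as a multiplier

print("aligned matches B:", np.allclose(aligned, B))
print("scaled matches B: ", np.allclose(wrong, B))
```

Here scale equals the squared Frobenius norm of A, which is why multiplying by
it inflates the aligned vectors.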
Ref: thoughts/shared/plans/2026-03-21-parliamentary-embedding-pipeline-plan.md