SVD sign and rotation are arbitrary per window. Without alignment, drift was
dominated by basis flips (~1.9 per step, max=2.0) rather than real political movement.
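
Illustration of why a basis flip alone saturates the drift metric. This is a
minimal sketch, assuming drift is the Euclidean distance between unit-length
vectors (one reading of the max=2.0 figure above); unit() and distance() are
hypothetical helpers, not functions from this repo.

```python
import math


def unit(v):
    """Scale a vector to unit L2 length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Same entity in two windows where the only change is an SVD sign flip.
v = unit([0.6, -0.3, 0.2])
flipped = [-x for x in v]

# The entity has not moved, yet the measured drift is the largest possible
# distance between two unit vectors.
flip_drift = distance(v, flipped)
```

Under that assumption, a per-step drift close to 2.0 says more about the basis
than about the entity.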
- _procrustes_align_windows(): aligns each window to the previous using
orthogonal Procrustes on common entities (scipy, falls back gracefully)
- compute_trajectories(): builds aligned window dict before per-MP drift calc,
adds normalize=True (L2-normalise) to remove cross-window magnitude differences
caused by varying numbers of motions per quarter
- Results now in sensible range: NSC=2.28, DENK=1.90, ... PVV=0.82, FVD=0.70
- NSC's large late jump (1.39 in Q4→Q1 2026) matches its parliamentary fracture
- Add outputs/trajectories_party_aligned.html with cleaned-up drift chart
- fetch_mp_metadata: use real OData URL with pagination (1200 records, 5 pages);
match abbreviations on Fractie.Afkorting rather than NaamNL;
skip records with Verwijderd=true
- upsert_mp_metadata: keep the most recent membership (prefer active over ended,
then the later Van date) so current party affiliations are not overwritten by
historical ones
- compute_anchor_axis: anchor directly on party-level SVD entities (GroenLinks-PvdA etc)
before falling back to mp_metadata individual MP lookup
- test_fetch_mp_metadata: fix mock for timeout kwarg + pagination + Afkorting field
- Generated anchor axis HTML for 2025-Q2 through 2026-Q1 in outputs/
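
Sketch of the alignment and normalised drift steps described above. This is not
the code in _procrustes_align_windows() or compute_trajectories(); the function
names, the dict-of-vectors layout, and the "return unchanged" fallback are
assumptions made for illustration. Only scipy.linalg.orthogonal_procrustes is a
real library call.

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes


def align_to_previous(prev, curr):
    """Rotate curr's vectors into prev's basis using entities present in both.

    prev, curr: dict mapping entity name -> 1-D numpy vector of equal length.
    Returns a new dict; curr is returned unrotated if fewer than two entities
    are shared (assumed fallback, the real one may differ).
    """
    common = sorted(set(prev) & set(curr))
    if len(common) < 2:
        return dict(curr)
    A = np.vstack([curr[e] for e in common])  # vectors to rotate
    B = np.vstack([prev[e] for e in common])  # target vectors
    # R minimises ||A @ R - B||_F over orthogonal matrices. The second return
    # value is a norm-like quantity and is deliberately not applied.
    R, _scale = orthogonal_procrustes(A, B)
    return {e: v @ R for e, v in curr.items()}


def drift(prev, curr, entity, normalize=True):
    """Euclidean drift of one entity between two windows.

    With normalize=True both vectors are L2-normalised first, which removes
    magnitude differences between windows. Assumes non-zero vectors.
    """
    a, b = prev[entity], curr[entity]
    if normalize:
        a = a / np.linalg.norm(a)
        b = b / np.linalg.norm(b)
    return float(np.linalg.norm(b - a))


# Second window is the first one sign-flipped and scaled: no real movement.
prev = {k: np.array(v, dtype=float)
        for k, v in {"A": [1, 0], "B": [0, 1], "C": [1, 1]}.items()}
curr = {k: -2.0 * v for k, v in prev.items()}

raw_drift = drift(prev, curr, "A")
aligned_drift = drift(prev, align_to_previous(prev, curr), "A")
```

In this toy case the unaligned drift for "A" is at the 2.0 ceiling, while the
aligned, normalised drift is zero up to floating-point error.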
New extract_mp_votes behavior inserts all actors (party + individual MPs),
not only comma-name MPs. Test now validates both types and their party column.
Also adds generated HTML visualizations (political axis x5 windows + trajectories).
- Add 4 migration files: mp_votes, mp_metadata, svd_vectors, fused_embeddings
- Extend database.py with 5 new helper methods and table init
- Add pipeline/ package: extract_mp_votes, fetch_mp_metadata, text_pipeline,
svd_pipeline (with Procrustes alignment), fusion
- Add full test suite (17 tests) covering all pipeline modules and migrations
- Fix Procrustes alignment bug: scipy scale is a norm value, not a multiplier
- Fix DuckDB date type handling in test assertions (datetime.date vs string)
- Remove duckdb.py shim; tests now run against real duckdb + scipy via uv
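
Illustration of the Procrustes scale pitfall fixed above. A minimal sketch with
made-up 2x2 inputs: scipy.linalg.orthogonal_procrustes returns the sum of the
singular values of A.T @ B as its second value, so multiplying the rotated
matrix by it distorts the result.

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes

# Two identical "windows": the correct alignment is the identity rotation.
B = np.array([[3.0, 0.0], [0.0, 3.0]])
A = B.copy()

R, scale = orthogonal_procrustes(A, B)

# scale is the sum of singular values of A.T @ B (here 9 + 9), not a factor
# that maps A onto B.
aligned = A @ R            # correct: rotation only
wrong = scale * (A @ R)    # the bug: treats the norm value as a multiplier
```

Here aligned reproduces B, while wrong is B inflated by a factor of 18.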
Ref: thoughts/shared/plans/2026-03-21-parliamentary-embedding-pipeline-plan.md