Pure numpy function that computes bootstrap confidence intervals for
party centroid vectors. Handles N>=2 (bootstrap), N=1 (degenerate CI),
and N=0 (excluded) cases. Uses np.random.default_rng for reproducibility.
classify_axes() correlates per-party PCA positions against party_ideologies.csv
to assign honest dynamic labels (Links-Rechts, Coalitie-Oppositie, etc.)
instead of always assuming the first PCA axis is left-right.
The global orientation check using party centroids averaged across all
windows was insufficient — individual windows (notably 2023) could still
have conservative parties above progressive ones on the Y-axis.
Added a per-window flip in compute_2d_axes (PCA branch) that checks
prog_avg_y vs cons_avg_y for each window independently and negates all
Y values in that window when cons > prog. Flipped window IDs are stored
in axis_def['y_flipped_windows'] for diagnostics.
Moved the canonical party set definitions outside the orientation try-
block so they are always in scope for the per-window correction.
Added test_per_window_y_orientation to cover the case where one window
is globally fine but locally inverted.
- Rename app to 'Motief: de stematlas' in Home.py
- Remove PCA variance caption from compass tab
- Hardcode db_path and window_size; remove sidebar inputs
- Change trajectories default to [CDA, D66, VVD]
- Move quiz to pages/1_Stemwijzer.py; wrap in st.form
- Remove quiz tab from main explorer
- Add pytest dev dep + fix test fixtures (_load_mp_vectors_for_window)
- Add test_pca_axis_orientation with proper PCA variance dominance
- fetch_mp_metadata: use real OData URL with pagination (1200 records, 5 pages)
uses Fractie.Afkorting not NaamNL for abbreviation matching
skips Verwijderd=true records
- upsert_mp_metadata: keep most recent membership (prefer active over ended,
then higher Van date) so current party affiliations are not overwritten by historical
- compute_anchor_axis: anchor directly on party-level SVD entities (GroenLinks-PvdA etc)
before falling back to mp_metadata individual MP lookup
- test_fetch_mp_metadata: fix mock for timeout kwarg + pagination + Afkorting field
- Generated anchor axis HTML for 2025-Q2 through 2026-Q1 in outputs/
New extract_mp_votes behavior inserts all actors (party + individual MPs),
not only comma-name MPs. Test now validates both types and their party column.
Also adds generated HTML visualizations (political axis x5 windows + trajectories).
- Add 4 migration files: mp_votes, mp_metadata, svd_vectors, fused_embeddings
- Extend database.py with 5 new helper methods and table init
- Add pipeline/ package: extract_mp_votes, fetch_mp_metadata, text_pipeline,
svd_pipeline (with Procrustes alignment), fusion
- Add full test suite (17 tests) covering all pipeline modules and migrations
- Fix Procrustes alignment bug: scipy scale is a norm value, not a multiplier
- Fix DuckDB date type handling in test assertions (datetime.date vs string)
- Remove duckdb.py shim; tests now run against real duckdb + scipy via uv
Ref: thoughts/shared/plans/2026-03-21-parliamentary-embedding-pipeline-plan.md