- Add compute_nd_axes() for N-component PCA with Procrustes alignment
- Add _get_aligned_party_scores() helper in explorer.py
- Update build_svd_components_tab to use aligned scores for all components
- Compute flip direction from aligned score centroids using CANONICAL_LEFT/RIGHT
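
For reference, a minimal sketch of the Procrustes step, not the actual compute_nd_axes() code. It assumes each window yields an (n_entities, k) score matrix whose rows are the same entities in the same order as the reference window; procrustes_rotation and align_to_reference are illustrative names.

```python
import numpy as np


def procrustes_rotation(A, B):
    """Orthogonal R minimizing ||A @ R - B||_F (orthogonal Procrustes)."""
    # SVD of the cross-covariance matrix gives the optimal rotation/reflection.
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt


def align_to_reference(window_scores, reference_scores):
    """Rotate one window's scores onto the reference window's axes."""
    return window_scores @ procrustes_rotation(window_scores, reference_scores)
```

Because R may include a reflection, the sign of each aligned component is still arbitrary, which is why a separate flip check against the left/right centroids is needed afterwards.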
Previously the compass (political_axis.py) used hardcoded party sets that
excluded Volt and PvdD, while the SVD components tab (svd_labels.py) used
CANONICAL_LEFT/RIGHT, which include them. This caused inconsistencies in
axis orientation where Volt appeared most left on the compass but PvdD
appeared most left in the SVD components visualization.
Changes:
- Import CANONICAL_LEFT/RIGHT from config in political_axis.py
- Replace hardcoded party sets with CANONICAL_LEFT/RIGHT for axis orientation
- Update tests to match new SVD_THEMES labels
Allow analysis modules to be imported in lightweight test environments
without duckdb installed. Modules that need duckdb for actual queries
still require it at runtime, but import-time failures are handled gracefully.
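
A sketch of one common way to do this, not necessarily how the analysis modules do it. require_duckdb is a hypothetical helper name.

```python
import importlib

try:
    duckdb = importlib.import_module("duckdb")
except ImportError:
    # Lightweight test environment: keep the module importable.
    duckdb = None


def require_duckdb():
    """Return the duckdb module, or fail with a clear message at call time."""
    if duckdb is None:
        raise RuntimeError("duckdb is required for this query but is not installed")
    return duckdb
```

Functions that run queries would call require_duckdb() first, so the failure surfaces at query time rather than at import time.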
Move rng initialization before the party loop so each party gets a
unique segment of the random stream instead of identical sequences.
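
Illustration of the difference, using hypothetical function names and a toy draw instead of the real bootstrap:

```python
import numpy as np


def draw_per_party(parties, seed=0, n=8):
    rng = np.random.default_rng(seed)  # created once, before the loop
    # Each party consumes the next segment of the same random stream.
    return {p: rng.random(size=n) for p in parties}


def draw_per_party_buggy(parties, seed=0, n=8):
    # Re-creating the generator per party restarts the stream every time,
    # so every party receives the identical sequence.
    return {p: np.random.default_rng(seed).random(size=n) for p in parties}
```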
Replace Python bootstrap loop with vectorized numpy indexing.
Pure numpy function that computes bootstrap confidence intervals for
party centroid vectors. Handles N>=2 (bootstrap), N=1 (degenerate CI),
and N=0 (excluded) cases. Uses np.random.default_rng for reproducibility.
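
A minimal sketch of such a function, combining the three cases with the vectorized resampling. The name bootstrap_centroid_ci, the percentile interval, and the default n_boot and alpha are assumptions, not the committed implementation.

```python
import numpy as np


def bootstrap_centroid_ci(vectors, n_boot=1000, alpha=0.05, rng=None):
    """Return (centroid, lower, upper), or None when there are no vectors."""
    vectors = np.asarray(vectors, dtype=float)
    n = vectors.shape[0]
    if n == 0:
        return None  # N=0: party is excluded
    centroid = vectors.mean(axis=0)
    if n == 1:
        # N=1: degenerate interval collapsed onto the single vector
        return centroid, centroid.copy(), centroid.copy()
    rng = np.random.default_rng() if rng is None else rng
    # One (n_boot, n) index matrix replaces the Python resampling loop.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_means = vectors[idx].mean(axis=1)  # shape (n_boot, d)
    lower, upper = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2], axis=0)
    return centroid, lower, upper
```

Passing in a shared rng (rather than a seed) lets the caller create the generator once before looping over parties.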
Previously load_scree_data computed L2-norms per dimension on current_parliament
vectors only, giving ~11% for PC1. This was inconsistent with the compass which
uses all windows + Procrustes alignment and gets PC1=24.1%.
Added compute_svd_spectrum() helper to political_axis.py that reuses the same
alignment pipeline. load_scree_data now delegates to it. _render_scree_plot
no longer re-normalizes (inputs are already EVR percentages). Hover label
updated to 'verklaarde variantie'.
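
For context, explained variance ratio (EVR) percentages can be derived from the singular values of the centered matrix. This sketch shows only that last step, on an already aligned matrix M; explained_variance_percent is an illustrative name and not the body of compute_svd_spectrum().

```python
import numpy as np


def explained_variance_percent(M):
    """EVR per component, in percent, for an (n_samples, d) matrix."""
    centered = M - M.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    var = s ** 2  # variance captured per component, up to a constant factor
    return 100.0 * var / var.sum()
```

Since the values already sum to 100, a plot consuming them should not normalize again.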
Quarterly windows (29 of 41 total) diluted PC1 explained variance ratio
from ~20% down to ~14.6%. The fix splits the vector collection loop into:
- pca_vecs: annual windows only (window IDs matching r'^\d{4}$' via re.match) -> M_pca used for SVD
- all_vecs: every window -> M used for projections onto derived axes
Centering for SVD and global_mean for projection both now use M_pca.mean(axis=0)
so axes are consistent. Falls back to all windows if no annual windows exist.
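
A rough sketch of the split, assuming a dict that maps window IDs such as '2023' or '2023-Q1' to (n_entities, d) arrays. The function name and the two-component projection are illustrative only.

```python
import re

import numpy as np

ANNUAL = re.compile(r"^\d{4}$")


def project_windows(window_vectors):
    """Derive axes from annual windows, then project every window onto them."""
    all_vecs = list(window_vectors.values())
    pca_vecs = [v for w, v in window_vectors.items() if ANNUAL.match(w)]
    if not pca_vecs:
        pca_vecs = all_vecs  # fallback: no annual windows exist
    M = np.vstack(all_vecs)
    M_pca = np.vstack(pca_vecs)
    mean = M_pca.mean(axis=0)  # same mean for SVD centering and projection
    _, _, vt = np.linalg.svd(M_pca - mean, full_matrices=False)
    return (M - mean) @ vt[:2].T
```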
The global PCA X-axis flip uses centroids averaged across all windows,
which can leave individual windows with left/right inverted (e.g. PvdA
appearing right of VVD in 2020). Mirror the existing per-window Y-axis
correction to also check and flip X values per window.
The global orientation check using party centroids averaged across all
windows was insufficient: individual windows (notably 2023) could still
have conservative parties above progressive ones on the Y-axis.
Added a per-window flip in compute_2d_axes (PCA branch) that checks
prog_avg_y vs cons_avg_y for each window independently and negates all
Y values in that window when cons > prog. Flipped window IDs are stored
in axis_def['y_flipped_windows'] for diagnostics.
Moved the canonical party set definitions outside the orientation try
block so they are always in scope for the per-window correction.
Added test_per_window_y_orientation to cover the case where one window
is globally fine but locally inverted.
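
The per-window check can be sketched as follows. This is a simplified stand-in for the logic in compute_2d_axes, using plain dicts and placeholder party names (P1, C1) instead of the real canonical sets.

```python
def flip_inverted_windows(y_by_window, progressive, conservative):
    """Negate all Y values in windows where conservatives sit above progressives.

    y_by_window maps window_id -> {party: y}. Returns (corrected, flipped_ids).
    """
    corrected, flipped = {}, []
    for window_id, scores in y_by_window.items():
        prog = [y for p, y in scores.items() if p in progressive]
        cons = [y for p, y in scores.items() if p in conservative]
        if prog and cons and sum(cons) / len(cons) > sum(prog) / len(prog):
            scores = {p: -y for p, y in scores.items()}
            flipped.append(window_id)  # kept for diagnostics
        corrected[window_id] = scores
    return corrected, flipped
```

The same pattern applies to a per-window X-axis check with left and right party sets.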
- fetch_mp_metadata: use real OData URL with pagination (1200 records, 5 pages);
  use Fractie.Afkorting, not NaamNL, for abbreviation matching;
  skip Verwijderd=true records
- upsert_mp_metadata: keep most recent membership (prefer active over ended,
  then later Van date) so current party affiliations are not overwritten by
  historical ones
- compute_anchor_axis: anchor directly on party-level SVD entities (GroenLinks-PvdA etc)
  before falling back to mp_metadata individual MP lookup
- test_fetch_mp_metadata: fix mock for timeout kwarg + pagination + Afkorting field
- Generated anchor axis HTML for 2025-Q2 through 2026-Q1 in outputs/
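
The membership preference described for upsert_mp_metadata above can be sketched as a sort key. The dict layout and the field names 'van' and 'tot_en_met' are assumptions for illustration; an end date of None stands for a still-active membership.

```python
from datetime import date


def pick_current_membership(memberships):
    """Prefer active memberships, then the one with the latest start date."""

    def key(m):
        # (True, ...) sorts above (False, ...), so active rows win first.
        return (m["tot_en_met"] is None, m["van"])

    return max(memberships, key=key)
```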