- analysis/explorer_data.py: add AND window_id NOT LIKE '%-Q%' to
_UNIFORM_DIM_SQL so quarterly windows are filtered at the source
- explorer.py: remove stale comment justifying quarterly inclusion;
remove redundant '-Q' guard in SVD tab trajectory view
- scripts/recompute_svd.py: replace quarter_bounds() with year_bounds()
that handles annual window IDs like '2024'; filter window list to
annual-only before recomputing SVD
Revise SVD_THEMES labels based on TF-IDF analysis of top 50 motions
per component (pool size: current_parliament). Manual review of motion
titles ensures labels reflect actual parliamentary content rather than
party position semantics.
Key corrections:
- Axis 1: fiscal/economic policy vs social welfare + international rights
- Axis 4: active international engagement vs restraint
- Axis 5: pragmatic financial support vs progressive individual rights
- Axis 6: fossil fuels/financial incentives vs climate/intl rights
- Axis 7: practical-administrative vs idealistico-procedural (kept)
- Axis 8: European defense cooperation vs domestic socioeconomic policy
- Axis 9: concrete-administrative vs systemic reform
- Axis 10: citizen protection vs government regulation
Subagent analysis caught that axes 5 and 6 are NOT the same
(Nationale soevereiniteit) — manual motion review confirms distinct
content for each. Axes 1, 5, 6 had completely wrong labels.
Refs: thoughts/explorer/svd_label_review.md
See also: docs/brainstorms/2026-04-13-topic-derived-svd-labels-requirements.md
- Add compute_nd_axes() for N-component PCA with Procrustes alignment
- Add _get_aligned_party_scores() helper in explorer.py
- Update build_svd_components_tab to use aligned scores for all components
- Compute flip direction from aligned score centroids using CANONICAL_LEFT/RIGHT
Previously the compass (political_axis.py) used hardcoded party sets that
excluded Volt and PvdD, while the SVD components tab (svd_labels.py) used
CANONICAL_LEFT/RIGHT which includes them. This caused inconsistencies in
axis orientation where Volt appeared most left on the compass but PvdD
appeared most left in the SVD components visualization.
Changes:
- Import CANONICAL_LEFT/RIGHT from config in political_axis.py
- Replace hardcoded party sets with CANONICAL_LEFT/RIGHT for axis orientation
- Update tests to match new SVD_THEMES labels
Redo theme analysis after pool-based motion assignment change.
New labels reflect actual motion content per component:
1. Economische sectorbelangen versus sociale welvaart
2. Nationalistische versus multilateralistische oriëntatie
3. Verzorgingsstaat versus defensie en nationale veiligheid
4. Internationale instituties en multilateralisme versus nationale soevereiniteit
5. Gemeenschapszin versus individuele rechten
6. Ecologische transitie versus economische conservatie
7. Praktisch-bestuurlijk versus idealistisch-proceduraal
8. Internationale samenwerking versus nationale soevereiniteit
9. Pragmatische probleemoplossing versus regulering
10. Minder overheidsbemoeienis versus meer handhaving
Previously, components 1-2 in the SVD tab used Procrustes-aligned PCA
coordinates (from load_positions), which meant the SVD tab showed PCA
dimensions of the 50D aligned space rather than the actual raw SVD
components. This was a fundamental inconsistency — the SVD tab's component 2
showed completely different party ordering than the raw SVD component 2.
Changes:
- explorer.py: Unified all components 1-10 to use raw SVD values via
load_party_axis_scores_for_window(). Removed the separate
load_positions() path for components 1-2. Now all components use the
same data source (50D vectors from svd_vectors table).
- explorer.py: Updated flip computation to cover ALL components 1-10
(was range 3-11 for components 3-10 only). The compute_flip_direction
function correctly determines sign for each component.
- explorer.py: Unified rendering to always use _render_party_axis_chart_1d
(was _render_party_axis_chart for components 1-2 using 2D coords).
- explorer.py: Unified trajectory to always use load_party_scores_all_windows.
- analysis/config.py: Updated component 1 label (simplified explanation,
removed coalition-specific policy references).
- analysis/config.py: Updated component 2 label to "Nationalistisch versus
kosmopolitisch" matching raw SVD data (PVV/FVD at positive extreme,
Volt/DENK/GL-PvdA at negative extreme).
- tests: Updated test assertions to match new labels.
- scripts/validate_svd_themes.py: Verified all components pass right-wing
alignment check, config flip consistency, and theme pole consistency.
Fixes the core inconsistency: SVD tab component 2 now uses the same raw
SVD data as components 3-10, with consistent party ordering and labels.
The compass remains a separate PCA-based visualization.
Allow analysis modules to be imported in lightweight test environments
without duckdb installed. Modules that need duckdb for actual queries
still require it at runtime, but import-time failures are handled gracefully.
Two related bugs fixed:
1. Label alignment: Removed static left_pole/right_pole from SVD_THEMES
entries. These labels assumed a fixed flip direction but could mismatch
with runtime flip computation, causing right-wing parties to appear on
the wrong side. Labels are now always derived from positive_pole,
negative_pole, and the runtime flip direction.
2. Score mismatch: Changed tijdtraject view for components 3-10 from
load_party_scores_all_windows_aligned() to load_party_scores_all_windows().
Procrustes alignment rotates the full 50-dim vector space to align
components 1-2, but this also transforms components 3-10, making their
scores incomparable with the single-window view. Per-window flip
computation already handles orientation alignment for these components.
Also updated svd_labels.py to prefer analysis.config as the canonical
source for SVD_THEMES, falling back to explorer only when config is
unavailable.
- Add CANONICAL_RIGHT (PVV, FVD, JA21, SGP) and CANONICAL_LEFT frozensets
to analysis/config.py as the canonical source of truth
- Update analysis/svd_labels.py to import from config; re-export as
RIGHT_PARTIES/LEFT_PARTIES for backward compatibility
- Add build_window_party_scores helper to analysis/explorer_data.py
- Add 7 integration tests in tests/test_axis_political_orientation.py
validating that canonical right parties appear on the right side of SVD
axes (x=component 1, y=component 2) using real DuckDB data
Move rng initialization before the party loop so each party gets a
unique segment of the random stream instead of identical sequences.
Replace Python bootstrap loop with vectorized numpy indexing.
Pure numpy function that computes bootstrap confidence intervals for
party centroid vectors. Handles N>=2 (bootstrap), N=1 (degenerate CI),
and N=0 (excluded) cases. Uses np.random.default_rng for reproducibility.
Both _load_window_ids and _load_mp_vectors_for_window only read from the DB.
Opening without read_only=True caused an IOException when Streamlit already held
a read-only lock, silently returning an empty scree plot.
Previously load_scree_data computed L2-norms per dimension on current_parliament
vectors only, giving ~11% for PC1. This was inconsistent with the compass which
uses all windows + Procrustes alignment and gets PC1=24.1%.
Added compute_svd_spectrum() helper to political_axis.py that reuses the same
alignment pipeline. load_scree_data now delegates to it. _render_scree_plot
no longer re-normalizes (inputs are already EVR percentages). Hover label
updated to 'verklaarde variantie'.
Quarterly windows (29 of 41 total) diluted PC1 explained variance ratio
from ~20% down to ~14.6%. The fix splits the vector collection loop into:
- pca_vecs: annual windows only (re.match r'^\d{4}$') -> M_pca used for SVD
- all_vecs: every window -> M used for projections onto derived axes
Centering for SVD and global_mean for projection both now use M_pca.mean(axis=0)
so axes are consistent. Falls back to all windows if no annual windows exist.
The global PCA X-axis flip uses centroids averaged across all windows,
which can leave individual windows with left/right inverted (e.g. PvdA
appearing right of VVD in 2020). Mirror the existing per-window Y-axis
correction to also check and flip X values per window.
classify_axes() correlates per-party PCA positions against party_ideologies.csv
to assign honest dynamic labels (Links-Rechts, Coalitie-Oppositie, etc.)
instead of always assuming the first PCA axis is left-right.
The global orientation check using party centroids averaged across all
windows was insufficient — individual windows (notably 2023) could still
have conservative parties above progressive ones on the Y-axis.
Added a per-window flip in compute_2d_axes (PCA branch) that checks
prog_avg_y vs cons_avg_y for each window independently and negates all
Y values in that window when cons > prog. Flipped window IDs are stored
in axis_def['y_flipped_windows'] for diagnostics.
Moved the canonical party set definitions outside the orientation try-
block so they are always in scope for the per-window correction.
Added test_per_window_y_orientation to cover the case where one window
is globally fine but locally inverted.
SVD sign/rotation is arbitrary per window. Without alignment, drift was
dominated by basis flips (~1.9/step max=2.0) rather than real political movement.
- _procrustes_align_windows(): aligns each window to the previous using
orthogonal Procrustes on common entities (scipy, falls back gracefully)
- compute_trajectories(): builds aligned window dict before per-MP drift calc,
adds normalize=True (L2-normalise) to remove cross-window magnitude differences
caused by varying numbers of motions per quarter
- Results now in sensible range: NSC=2.28, DENK=1.90, ... PVV=0.82, FVD=0.70
- NSC large late jump (1.39 in Q4→Q1 2026) matches its parliamentary fracture
- Add outputs/trajectories_party_aligned.html with cleaned-up drift chart
- fetch_mp_metadata: use real OData URL with pagination (1200 records, 5 pages)
uses Fractie.Afkorting not NaamNL for abbreviation matching
skips Verwijderd=true records
- upsert_mp_metadata: keep most recent membership (prefer active over ended,
then higher Van date) so current party affiliations are not overwritten by historical
- compute_anchor_axis: anchor directly on party-level SVD entities (GroenLinks-PvdA etc)
before falling back to mp_metadata individual MP lookup
- test_fetch_mp_metadata: fix mock for timeout kwarg + pagination + Afkorting field
- Generated anchor axis HTML for 2025-Q2 through 2026-Q1 in outputs/