Revise SVD_THEMES labels based on TF-IDF analysis of top 50 motions
per component (pool size: current_parliament). Manual review of motion
titles ensures labels reflect actual parliamentary content rather than
party position semantics.
Key corrections:
- Axis 1: fiscal/economic policy vs social welfare + international rights
- Axis 4: active international engagement vs restraint
- Axis 5: pragmatic financial support vs progressive individual rights
- Axis 6: fossil fuels/financial incentives vs climate/intl rights
- Axis 7: practical-administrative vs idealistic-procedural (kept)
- Axis 8: European defense cooperation vs domestic socioeconomic policy
- Axis 9: concrete-administrative vs systemic reform
- Axis 10: citizen protection vs government regulation
Subagent analysis caught that axes 5 and 6 are NOT the same theme
("Nationale soevereiniteit", i.e. national sovereignty); manual motion
review confirms distinct content for each. Axes 1, 5, and 6 had
completely wrong labels.
Refs: thoughts/explorer/svd_label_review.md
See also: docs/brainstorms/2026-04-13-topic-derived-svd-labels-requirements.md
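As a rough illustration of the TF-IDF step mentioned above, the sketch below
treats the motion titles of each component as one document and ranks terms
by TF-IDF. The function name, the whitespace tokenizer, and the exact
weighting are assumptions for illustration; the actual analysis may differ.

```python
import math
from collections import Counter


def top_tfidf_terms(titles_by_component, top_n=5):
    """Rank terms per component by TF-IDF (hypothetical helper).

    titles_by_component: {component_id: [motion title, ...]}
    Each component's titles are pooled into a single document.
    """
    tokenized = {
        comp: [word for title in titles for word in title.lower().split()]
        for comp, titles in titles_by_component.items()
    }
    n_docs = len(tokenized)

    # Document frequency: in how many components does each term occur?
    doc_freq = Counter()
    for words in tokenized.values():
        doc_freq.update(set(words))

    ranked = {}
    for comp, words in tokenized.items():
        counts = Counter(words)
        total = len(words) or 1
        scores = {
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
        # Highest score first; alphabetical order breaks ties.
        ranked[comp] = sorted(scores, key=lambda w: (-scores[w], w))[:top_n]
    return ranked
```

Terms that occur in every component get a weight of zero, so only the
vocabulary that distinguishes one component from the others surfaces.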
Previously the compass (political_axis.py) used hardcoded party sets that
excluded Volt and PvdD, while the SVD components tab (svd_labels.py) used
CANONICAL_LEFT/RIGHT, which include them. This caused inconsistencies in
axis orientation where Volt appeared most left on the compass but PvdD
appeared most left in the SVD components visualization.
Changes:
- Import CANONICAL_LEFT/RIGHT from config in political_axis.py
- Replace hardcoded party sets with CANONICAL_LEFT/RIGHT for axis orientation
- Update tests to match new SVD_THEMES labels
- Added --pool-size argument (default 50) to control pool size
- Pool mode is now default; use --no-exclusive for old behavior
- Algorithm: for each component, claim top 5 positive + 5 negative from pool
- All 10 SVD components now have exactly 10 representative motions
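The pool-based claiming described above could look roughly like the sketch
below. It assumes the pool of a component is the pool_size motions with the
largest absolute loading, and that "exclusive" means a motion claimed by an
earlier component is unavailable to later ones. The function and parameter
names are hypothetical, not the script's actual API.

```python
import numpy as np


def claim_representatives(scores, pool_size=50, per_pole=5, exclusive=True):
    """Pick representative motions per component (illustrative sketch).

    scores: array of shape (n_motions, n_components) with motion loadings.
    Returns {component_index: [motion indices]}, positive pole first.
    """
    claimed = set()
    result = {}
    for comp in range(scores.shape[1]):
        col = scores[:, comp]
        # Assumed pool definition: largest absolute loadings on this component.
        pool = [int(i) for i in np.argsort(-np.abs(col))[:pool_size]]
        if exclusive:
            pool = [i for i in pool if i not in claimed]
        positive = sorted((i for i in pool if col[i] > 0), key=lambda i: -col[i])
        negative = sorted((i for i in pool if col[i] < 0), key=lambda i: col[i])
        picks = positive[:per_pole] + negative[:per_pole]
        claimed.update(picks)
        result[comp] = picks
    return result
```

With exclusive=False every component draws from its full pool, which would
correspond to the old behavior under the same assumptions.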
Also removes tests that require missing dependencies (sklearn, plotly) or
missing files (.mindmodel/manifest.yaml):
- tests/mindmodel/ (2 files)
- tests/test_diagnose_no_plot_trajectories.py
- tests/test_explorer_chart.py
- tests/test_motion_drift.py
- tests/test_trajectories_pipeline_integration.py
- tests/test_trajectory_*.py (4 files)
Refs: thoughts/shared/plans/2026-04-12-svd-axis-label-alignment.md
Previously, components 1-2 in the SVD tab used Procrustes-aligned PCA
coordinates (from load_positions), which meant the SVD tab showed PCA
dimensions of the 50D aligned space rather than the actual raw SVD
components. This was a fundamental inconsistency — the SVD tab's component 2
showed completely different party ordering than the raw SVD component 2.
Changes:
- explorer.py: Unified all components 1-10 to use raw SVD values via
load_party_axis_scores_for_window(). Removed the separate
load_positions() path for components 1-2. Now all components use the
same data source (50D vectors from svd_vectors table).
- explorer.py: Updated flip computation to cover ALL components 1-10
(previously range(3, 11), i.e. components 3-10 only). The
compute_flip_direction function determines the sign for each component.
- explorer.py: Unified rendering to always use _render_party_axis_chart_1d
(was _render_party_axis_chart for components 1-2 using 2D coords).
- explorer.py: Unified trajectory to always use load_party_scores_all_windows.
- analysis/config.py: Updated component 1 label (simplified explanation,
removed coalition-specific policy references).
- analysis/config.py: Updated component 2 label to "Nationalistisch versus
kosmopolitisch" matching raw SVD data (PVV/FVD at positive extreme,
Volt/DENK/GL-PvdA at negative extreme).
- tests: Updated test assertions to match new labels.
- scripts/validate_svd_themes.py: Verified all components pass right-wing
alignment check, config flip consistency, and theme pole consistency.
Fixes the core inconsistency: SVD tab component 2 now uses the same raw
SVD data as components 3-10, with consistent party ordering and labels.
The compass remains a separate PCA-based visualization.
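For orientation, a minimal sketch of a centroid-based flip check follows. It
is not the actual compute_flip_direction; the function name is hypothetical,
and the left set below contains only the two parties named in these notes
(the real CANONICAL_LEFT is assumed to be larger).

```python
from statistics import mean

CANONICAL_RIGHT = frozenset({"PVV", "FVD", "JA21", "SGP"})
# Assumption: only Volt and PvdD are confirmed members in these notes.
CANONICAL_LEFT_ASSUMED = frozenset({"Volt", "PvdD"})


def flip_sign_sketch(party_scores, right=CANONICAL_RIGHT,
                     left=CANONICAL_LEFT_ASSUMED):
    """Return -1 if the axis must be negated so right parties score higher."""
    right_vals = [s for p, s in party_scores.items() if p in right]
    left_vals = [s for p, s in party_scores.items() if p in left]
    if not right_vals or not left_vals:
        return 1  # not enough information, leave the axis untouched
    return 1 if mean(right_vals) >= mean(left_vals) else -1


raw = {"PVV": -0.8, "FVD": -0.6, "Volt": 0.7, "PvdD": 0.5, "CDA": 0.0}
sign = flip_sign_sketch(raw)
oriented = {party: sign * score for party, score in raw.items()}
```

Applying the same check to every component 1-10 is what keeps the party
ordering consistent between components in this sketch.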
Add four test files covering:
- test_config.py: SVD_THEMES structure validation
- test_explorer_labels.py: label derivation from positive/negative poles and flip
- test_svd_axis_alignment.py: right-wing centroid on RIGHT side for all axes
- test_validate_svd_themes.py: theme validation script tests
- Replace Procrustes-based stability with Ridge regression on fused embeddings
- For each SVD axis, fit Ridge: SVD_score ~ fused_embedding per window
- Compare weight vectors via max(cosine similarity, Jaccard top-100)
- Add --regression-alpha CLI argument (default 1.0)
- Keep party-based fallback for windows with < 50 motions
- Update tests for new regression-based approach
Key finding: regression weights show moderate stability (0.06-0.51)
but no axes exceed 0.7 threshold — semantic features defining each
axis shift significantly across windows
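The regression-based comparison above can be sketched with plain numpy. The
closed-form Ridge solution without an intercept, the signed cosine, and the
use of absolute weights for the top-k Jaccard set are assumptions made for
illustration; the names below are hypothetical.

```python
import numpy as np


def ridge_weights(X, y, alpha=1.0):
    """Closed-form Ridge fit: solve (X^T X + alpha * I) w = X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)


def weight_similarity(w_a, w_b, top_k=100):
    """max(cosine similarity, Jaccard overlap of the top_k heaviest features)."""
    norm = np.linalg.norm(w_a) * np.linalg.norm(w_b)
    cosine = float(w_a @ w_b / norm) if norm > 0 else 0.0
    top_a = set(np.argsort(-np.abs(w_a))[:top_k].tolist())
    top_b = set(np.argsort(-np.abs(w_b))[:top_k].tolist())
    union = top_a | top_b
    jaccard = len(top_a & top_b) / len(union) if union else 0.0
    return max(cosine, jaccard)
```

In this reading, one weight vector is fitted per axis and window (fused
embedding as X, SVD score as y), and adjacent windows are compared with
weight_similarity against the 0.7 stability threshold.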
- Add scripts/motion_drift.py: analyzes SVD axis stability, semantic drift,
and cross-ideological voting patterns across annual windows
- Add analysis/motion_drift.py: core analysis functions with Procrustes
alignment fallback using party-based sign consistency
- Add matplotlib dependency for static chart generation
- Add tests/test_motion_drift.py: 12 tests covering all analysis functions
- Report output: markdown with embedded PNG charts
Key findings from real data:
- No axes are fully stable (>0.7) across 2019-2026
- All axes show moderate consistency (0.40-0.47) — stable within periods
but flip between cabinet periods (2019/2022/2026 vs 2023/2024/2025)
- Party voting analysis detects cross-ideological voting patterns
- Add CANONICAL_RIGHT (PVV, FVD, JA21, SGP) and CANONICAL_LEFT frozensets
to analysis/config.py as the canonical source of truth
- Update analysis/svd_labels.py to import from config; re-export as
RIGHT_PARTIES/LEFT_PARTIES for backward compatibility
- Add build_window_party_scores helper to analysis/explorer_data.py
- Add 7 integration tests in tests/test_axis_political_orientation.py
validating that canonical right parties appear on the right side of SVD
axes (x=component 1, y=component 2) using real DuckDB data
- Changed _render_party_axis_chart_1d from horizontal bar chart to scatter plot
- Same format as components 1-2: markers on horizontal line with axis arrows
- Axis labels now show correct direction with arrows (← left | right →)
- Ensures consistent visualization across all SVD components
- Lock x_label/y_label to Links-Rechts / Progressief-Conservatief after
classify_axes; Procrustes sign-fixing in compute_2d_axes already ensures
the correct orientation so the heuristic _should_swap_axes call is removed
- Remove visual error bars from party axis chart; 95% CI is now shown in
hover text (party: score, N=n, 95%-BI: [low, high]) to keep the 1D
scatter clean
- Remove show_ci checkbox and parameter — CI is always accessible on hover
- Update tests to match new hover format and absence of error_x
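A small sketch of the hover format described above. The helper name and the
two-decimal rounding are assumptions; only the overall layout
(party: score, N=n, 95%-BI: [low, high]) comes from the notes.

```python
def format_hover(party, score, n, ci_low, ci_high):
    """Build the hover string for one party marker (hypothetical helper)."""
    # Assumption: two decimals; the real chart may round differently.
    return (
        f"{party}: {score:.2f}, N={n}, "
        f"95%-BI: [{ci_low:.2f}, {ci_high:.2f}]"
    )
```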
- Add load_party_mp_vectors() to return raw per-MP SVD vectors by party
- Extract _build_party_axis_figure() as pure function for testability
- Modify _render_party_axis_chart to accept bootstrap_data and delegate
to the new builder
- When bootstrap_data present: show error_x bars, diamond markers for
N=1 parties, and N=count in hover text
- Wire up bootstrap computation in build_svd_components_tab via cached
_cached_bootstrap_cis wrapper
- Add 6 tests covering figure construction, bootstrap rendering, flip
behavior, and importability
Enable backfilling body_text for existing motions that lack it (2016-2018 data).
New extract_besluit_id() and update_existing_motions() helpers support the
--update-existing mode, while --no-skip-details enables detail fetching during
normal downloads. Includes 7 tests covering URL parsing, DB update flow, and
argparse wiring.
Pure numpy function that computes bootstrap confidence intervals for
party centroid vectors. Handles N>=2 (bootstrap), N=1 (degenerate CI),
and N=0 (excluded) cases. Uses np.random.default_rng for reproducibility.
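A minimal sketch of such a function, assuming a percentile bootstrap over
MPs; the name, the default resample count, and the return shape are
illustrative rather than the actual signature.

```python
import numpy as np


def bootstrap_centroid_ci(vectors, n_boot=1000, ci=0.95, seed=0):
    """Percentile-bootstrap CI for a party centroid (illustrative sketch).

    vectors: array-like of shape (n_mps, n_dims).
    Returns (centroid, low, high), or None when there are no MPs.
    """
    vectors = np.asarray(vectors, dtype=float)
    if vectors.size == 0:
        return None  # N=0: party is excluded
    centroid = vectors.mean(axis=0)
    n = vectors.shape[0]
    if n == 1:
        # N=1: degenerate interval that collapses onto the single MP
        return centroid, centroid.copy(), centroid.copy()

    rng = np.random.default_rng(seed)  # seeded for reproducibility
    # Resample MPs with replacement: index array of shape (n_boot, n)
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_means = vectors[idx].mean(axis=1)  # shape (n_boot, n_dims)
    tail = (1.0 - ci) / 2.0
    low, high = np.quantile(boot_means, [tail, 1.0 - tail], axis=0)
    return centroid, low, high
```

Fixing the seed makes repeated calls return identical intervals, which is
convenient when the result is cached.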
classify_axes() correlates per-party PCA positions against party_ideologies.csv
to assign honest dynamic labels (Links-Rechts, Coalitie-Oppositie, etc.)
instead of always assuming the first PCA axis is left-right.
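The idea can be sketched as picking, per axis, the reference dimension with
the strongest absolute Pearson correlation. The function name, the input
layout, and the 0.5 cutoff are assumptions; the columns of
party_ideologies.csv are not specified here.

```python
import numpy as np


def label_axis(axis_positions, ideology_scores, min_abs_r=0.5):
    """Choose the best-correlating label for one axis (hypothetical helper).

    axis_positions: {party: position on this PCA axis}
    ideology_scores: {label: {party: reference score}}
    Returns (label or None, correlation).
    """
    best_label, best_r = None, 0.0
    for label, reference in ideology_scores.items():
        parties = sorted(set(axis_positions) & set(reference))
        if len(parties) < 3:
            continue  # too few shared parties for a meaningful correlation
        x = np.array([axis_positions[p] for p in parties])
        y = np.array([reference[p] for p in parties])
        r = float(np.corrcoef(x, y)[0, 1])
        if abs(r) > abs(best_r):
            best_label, best_r = label, r
    if abs(best_r) < min_abs_r:
        return None, best_r  # no dimension explains this axis well
    return best_label, best_r
```

The sign of the returned correlation also indicates whether the axis runs in
the same direction as the reference dimension.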
The global orientation check using party centroids averaged across all
windows was insufficient — individual windows (notably 2023) could still
have conservative parties above progressive ones on the Y-axis.
Added a per-window flip in compute_2d_axes (PCA branch) that checks
prog_avg_y vs cons_avg_y for each window independently and negates all
Y values in that window when cons > prog. Flipped window IDs are stored
in axis_def['y_flipped_windows'] for diagnostics.
Moved the canonical party set definitions outside the orientation try
block so they are always in scope for the per-window correction.
Added test_per_window_y_orientation to cover the case where one window
is globally fine but locally inverted.
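The per-window correction described above can be sketched as follows. The
data layout, the function name, and the party sets passed in are
assumptions; the real logic lives inside compute_2d_axes.

```python
from statistics import mean


def flip_windows_y(positions, progressive, conservative):
    """Negate Y per window where conservatives sit above progressives.

    positions: {window_id: {party: (x, y)}}
    Returns (corrected positions, list of flipped window ids).
    """
    corrected, flipped = {}, []
    for window_id, parties in positions.items():
        prog_y = [y for p, (_, y) in parties.items() if p in progressive]
        cons_y = [y for p, (_, y) in parties.items() if p in conservative]
        # Flip only when both groups are present and cons_avg_y > prog_avg_y.
        needs_flip = bool(prog_y and cons_y) and mean(cons_y) > mean(prog_y)
        corrected[window_id] = {
            p: (x, -y if needs_flip else y) for p, (x, y) in parties.items()
        }
        if needs_flip:
            flipped.append(window_id)
    return corrected, flipped
```

Each window is checked independently, so a window such as 2023 can be
corrected even when the average across all windows looks fine.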
- Rename app to 'Motief: de stematlas' in Home.py
- Remove PCA variance caption from compass tab
- Hardcode db_path and window_size; remove sidebar inputs
- Change trajectories default to [CDA, D66, VVD]
- Move quiz to pages/1_Stemwijzer.py; wrap in st.form
- Remove quiz tab from main explorer
- Add pytest dev dep + fix test fixtures (_load_mp_vectors_for_window)
- Add test_pca_axis_orientation with proper PCA variance dominance