Revise SVD_THEMES labels based on TF-IDF analysis of top 50 motions
per component (pool size: current_parliament). Manual review of motion
titles ensures labels reflect actual parliamentary content rather than
party position semantics.
Key corrections:
- Axis 1: fiscal/economic policy vs social welfare + international rights
- Axis 4: active international engagement vs restraint
- Axis 5: pragmatic financial support vs progressive individual rights
- Axis 6: fossil fuels/financial incentives vs climate/intl rights
- Axis 7: practical-administrative vs idealistic-procedural (kept)
- Axis 8: European defense cooperation vs domestic socioeconomic policy
- Axis 9: concrete-administrative vs systemic reform
- Axis 10: citizen protection vs government regulation
Subagent analysis caught that axes 5 and 6 are NOT the same theme
(Nationale soevereiniteit, "national sovereignty"); manual motion review
confirms distinct content for each. Axes 1, 5, 6 had completely wrong labels.
Refs: thoughts/explorer/svd_label_review.md
See also: docs/brainstorms/2026-04-13-topic-derived-svd-labels-requirements.md
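
A minimal sketch of the per-component TF-IDF step described above, assuming
each component's top motion titles are pooled into one document. The function
name and the toy titles are hypothetical; this is not the project's script.

```python
import math
from collections import Counter

def top_tfidf_terms(titles_by_component, top_n=2):
    """Pool each component's motion titles into one document and rank terms by TF-IDF."""
    tokens_by_component = {
        comp: " ".join(titles).lower().split()
        for comp, titles in titles_by_component.items()
    }
    n_docs = len(tokens_by_component)
    # Document frequency: in how many components does a term occur at all?
    doc_freq = Counter()
    for tokens in tokens_by_component.values():
        doc_freq.update(set(tokens))
    ranked = {}
    for comp, tokens in tokens_by_component.items():
        counts = Counter(tokens)
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        # Highest score first; ties broken alphabetically for stable output.
        ranked[comp] = sorted(scores, key=lambda t: (-scores[t], t))[:top_n]
    return ranked

# Invented titles, for illustration only.
toy_titles = {
    1: ["budget tax reform", "tax relief budget"],
    2: ["asylum policy reform", "asylum intake limits"],
}
label_hints = top_tfidf_terms(toy_titles)
```

Terms shared by every component (here "reform") score zero, which is why the
top terms still need a manual read of the motion titles before labeling.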
Key findings:
- Coalition started losing votes structurally from 2019
- The shift is not that the 'right' won, but that the government lost
- Add government_win_rate.png visualization
- Update analysis with party vote counts
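
A rough sketch of how a government win rate per year could be computed,
assuming each vote is reduced to (year, coalition voted for, motion adopted).
That record shape and the toy rows are assumptions, not the project's data.

```python
from collections import defaultdict

def government_win_rate(votes):
    """Share of votes per year whose outcome matched the coalition's position."""
    wins = defaultdict(int)
    totals = defaultdict(int)
    for year, coalition_for, adopted in votes:
        totals[year] += 1
        # The coalition "wins" when the outcome equals the side it voted for.
        wins[year] += int(coalition_for == adopted)
    return {year: wins[year] / totals[year] for year in sorted(totals)}

# Made-up rows with placeholder year keys; not real results.
toy_votes = [
    ("year_a", True, True),
    ("year_a", False, False),
    ("year_b", False, True),
    ("year_b", True, True),
]
win_rates = government_win_rate(toy_votes)
```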
- Add polarization_analysis.png: spread over time for all axes
- Add axis1_deep_dive.png: focus on Axis 1 (coalition vs opposition)
- Add Dutch blog post on parliamentary polarization findings
- Add script to find motions closest to semantic gravity per axis/window
- Document Axis 1 semantic shift: from administrative law (2016)
  to migration/asylum policy (2026)
- Shows that 'coalition' votes on different topics over time
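
One possible reading of "semantic gravity" is the centroid of the motion
embeddings in a window. Under that assumption, a sketch of finding the motions
nearest to it could look like this (names and vectors are hypothetical):

```python
import math

def closest_to_centroid(embeddings, k=1):
    """Return the k motion ids whose vectors lie nearest to the mean vector."""
    vectors = list(embeddings.values())
    dims = len(vectors[0])
    centroid = [sum(vec[d] for vec in vectors) / len(vectors) for d in range(dims)]
    # Euclidean distance to the centroid; smaller means more "central".
    return sorted(embeddings, key=lambda mid: math.dist(embeddings[mid], centroid))[:k]

# Toy 2-D vectors for a single axis/window.
toy_window = {
    "m1": [0.0, 0.0],
    "m2": [1.0, 1.0],
    "m3": [4.0, 4.0],
    "m4": [1.5, 2.0],
}
nearest = closest_to_centroid(toy_window, k=2)
```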
- refactoring-streamlit-data-loading.md: update test count
  164/164 → 173/173 (7 new axis validation tests added)
- svd-component-labels-mismatch.md: SVD_THEMES moved from
  explorer.py:434-611 → analysis/config.py:67+ per the
  refactoring that extracted constants to analysis/config.py
- Add CANONICAL_RIGHT (PVV, FVD, JA21, SGP) and CANONICAL_LEFT frozensets
  to analysis/config.py as the canonical source of truth
- Update analysis/svd_labels.py to import from config; re-export as
  RIGHT_PARTIES/LEFT_PARTIES for backward compatibility
- Add build_window_party_scores helper to analysis/explorer_data.py
- Add 7 integration tests in tests/test_axis_political_orientation.py
  validating that canonical right parties appear on the right side of SVD
  axes (x=component 1, y=component 2) using real DuckDB data
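
A sketch of the orientation check these tests imply, assuming "right side"
means a positive mean score for the canonical right parties. The helper name
and the toy scores are hypothetical; the real tests read from DuckDB.

```python
# Party set as listed in this commit message. CANONICAL_LEFT members are not
# listed here, so a placeholder name is used below instead of guessing.
CANONICAL_RIGHT = frozenset({"PVV", "FVD", "JA21", "SGP"})
RIGHT_PARTIES = CANONICAL_RIGHT  # backward-compatible alias (re-export pattern)

def right_parties_on_right(scores, right_parties=CANONICAL_RIGHT):
    """True if the canonical right parties have a positive mean score on the axis."""
    present = [scores[p] for p in right_parties if p in scores]
    if not present:
        raise ValueError("no canonical right party found in scores")
    return sum(present) / len(present) > 0

# Toy axis scores; values are invented.
toy_scores = {"PVV": 0.9, "SGP": 0.4, "PLACEHOLDER_LEFT": -0.7}
flipped_scores = {party: -value for party, value in toy_scores.items()}
```

Keeping the alias as the same object means older imports of RIGHT_PARTIES see
any later change to the canonical set.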
- Add AGENTS.md with documented solutions reference
- Include SVD label convention (right-wing parties on right side)
- Document SVD insight: labels reflect voting patterns, not semantics
- Fix SQL verification example to use Python approach
Replaces static ideology CSV as primary axis classification signal with
per-year motion projection + Dutch keyword classifier. Adds axis-swap
logic so left-right is conventionally on X when present. Adds Option C
UI expander showing top motions per axis pole.
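
The axis-swap rule can be sketched as follows, assuming each axis carries a
classification under a "kind" key. The key and the kind strings are
hypothetical names chosen for this example.

```python
def order_axes(x_axis, y_axis, left_right="left_right"):
    """Return (x, y) with the left-right axis on X whenever only Y carries it."""
    if y_axis["kind"] == left_right and x_axis["kind"] != left_right:
        return y_axis, x_axis
    # Left-right is already on X, or absent: keep the original order.
    return x_axis, y_axis

first = {"component": 1, "kind": "coalition_opposition"}
second = {"component": 2, "kind": "left_right"}
x_axis, y_axis = order_axes(first, second)
```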
Add design for honest PCA axis labeling: validate each compass axis
against a party ideology reference CSV and label it dynamically
(Links–Rechts, Coalitie–Oppositie, or fallback) instead of always
hardcoding Left–Right.
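
A minimal sketch of the dynamic labeling idea, assuming an axis is validated
by Pearson correlation against reference scores. The 0.7 threshold, the
coalition reference, the fallback text and the party names A-D are all
assumptions made for illustration.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 when either side is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0

def label_axis(axis_scores, left_right_ref, coalition_ref, threshold=0.7):
    """Pick a label only when the axis actually tracks a reference dimension."""
    parties = sorted(set(axis_scores) & set(left_right_ref) & set(coalition_ref))
    xs = [axis_scores[p] for p in parties]
    if abs(pearson(xs, [left_right_ref[p] for p in parties])) >= threshold:
        return "Links–Rechts"
    if abs(pearson(xs, [coalition_ref[p] for p in parties])) >= threshold:
        return "Coalitie–Oppositie"
    return "Fallback"  # placeholder text; the real fallback label is not specified

# Toy reference data for four placeholder parties.
left_right_ref = {"A": -2.0, "B": -1.0, "C": 1.0, "D": 2.0}
coalition_ref = {"A": 1.0, "B": 0.0, "C": 1.0, "D": 0.0}

label_1 = label_axis({"A": -1.0, "B": -0.5, "C": 0.5, "D": 1.0}, left_right_ref, coalition_ref)
label_2 = label_axis({"A": 1.0, "B": -1.0, "C": 1.0, "D": -1.0}, left_right_ref, coalition_ref)
label_3 = label_axis({"A": 1.0, "B": -1.0, "C": -1.0, "D": 1.0}, left_right_ref, coalition_ref)
```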
- Add 4 migration files: mp_votes, mp_metadata, svd_vectors, fused_embeddings
- Extend database.py with 5 new helper methods and table init
- Add pipeline/ package: extract_mp_votes, fetch_mp_metadata, text_pipeline,
  svd_pipeline (with Procrustes alignment), fusion
- Add full test suite (17 tests) covering all pipeline modules and migrations
- Fix Procrustes alignment bug: scipy scale is a norm value, not a multiplier
- Fix DuckDB date type handling in test assertions (datetime.date vs string)
- Remove duckdb.py shim; tests now run against real duckdb + scipy via uv
Ref: thoughts/shared/plans/2026-03-21-parliamentary-embedding-pipeline-plan.md
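
To illustrate the Procrustes fix: a sketch assuming the pipeline uses
scipy.linalg.orthogonal_procrustes (an assumption; the commit only says
"scipy"). That function returns a rotation matrix plus a `scale` equal to the
sum of singular values of A.T @ B, which is not a factor to multiply by. The
"wrongly scaled" line is a guess at the kind of misuse that was fixed.

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(0)
reference = rng.normal(size=(6, 3))  # e.g. vectors from a reference window

# Build a second window that is the reference under an arbitrary
# orthogonal transform (rotation and possibly reflection).
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
current = reference @ q

# R minimises ||current @ R - reference||_F over orthogonal matrices.
R, scale = orthogonal_procrustes(current, reference)

aligned = current @ R             # rotation only: this recovers the reference
wrongly_scaled = scale * aligned  # treating `scale` as a multiplier (assumed bug)

error_aligned = float(np.linalg.norm(aligned - reference))
error_scaled = float(np.linalg.norm(wrongly_scaled - reference))
```

In this toy setup `scale` equals the squared Frobenius norm of the reference,
so multiplying by it inflates the aligned vectors instead of aligning them.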