Investigation of GroenLinks-PvdA merger dynamics in SVD space:
- Finding 1: Pre-merger, GL and PvdA were 2.8-10.5% of the average inter-party distance apart
- Finding 2: The merged party started as the most cohesive (#1 in 2023), but its spread is now 55% above average
- Finding 3: The GL-PvdA distance converged to 4.5% by Q3 2023, essentially indistinguishable
- Finding 4: GL/PvdA were the most stable parties (10-25% drift), while VVD/D66 moved 70-177%
Revise SVD_THEMES labels based on TF-IDF analysis of the top 50 motions
per component (window: current_parliament). Manual review of motion
titles ensures the labels reflect actual parliamentary content rather
than party position semantics.
Key corrections:
- Axis 1: fiscal/economic policy vs social welfare + international rights
- Axis 4: active international engagement vs restraint
- Axis 5: pragmatic financial support vs progressive individual rights
- Axis 6: fossil fuels/financial incentives vs climate/international rights
- Axis 7: practical-administrative vs idealistic-procedural (kept)
- Axis 8: European defense cooperation vs domestic socioeconomic policy
- Axis 9: concrete-administrative vs systemic reform
- Axis 10: citizen protection vs government regulation
Subagent analysis caught that axes 5 and 6 are NOT the same theme
(Nationale soevereiniteit, "national sovereignty"); manual motion review
confirms distinct content for each. Axes 1, 5, and 6 had completely wrong labels.
Refs: thoughts/explorer/svd_label_review.md
See also: docs/brainstorms/2026-04-13-topic-derived-svd-labels-requirements.md
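
The TF-IDF step behind the label revision can be sketched as follows. This is a minimal pure-Python illustration, not the project's implementation: the function name `top_terms_per_component`, the whitespace tokenizer, and the input shape (component id mapped to a list of motion titles) are assumptions made for the example. Each component's titles are treated as one document, so terms shared by every component score zero.

```python
import math
from collections import Counter


def top_terms_per_component(titles_by_component, k=3):
    """Rank terms per component by TF-IDF (hypothetical helper).

    titles_by_component: dict of component id -> list of motion titles.
    Each component's titles together form one "document".
    """
    # Term counts per component, using a naive lowercase whitespace tokenizer.
    tf = {
        comp: Counter(word for title in titles for word in title.lower().split())
        for comp, titles in titles_by_component.items()
    }
    n_docs = len(tf)
    # Document frequency: the number of components in which each term occurs.
    df = Counter(term for counts in tf.values() for term in counts)

    result = {}
    for comp, counts in tf.items():
        total = sum(counts.values())
        scores = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
        # Highest score first; ties broken alphabetically for a stable order.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        result[comp] = ranked[:k]
    return result
```

The ranked terms are only a starting point; the manual review of motion titles described above remains the deciding step.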
- Added --pool-size argument (default 50) to control pool size
- Pool mode is now default; use --no-exclusive for old behavior
- Algorithm: for each component, claim top 5 positive + 5 negative from pool
- All 10 SVD components now have exactly 10 representative motions
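
A minimal sketch of the pool-mode selection, under stated assumptions: the source does not say how the pool is built, so this example assumes the pool for a component is the `pool_size` motions with the largest absolute score on that component, and that components claim in order. The function name `claim_from_pool` and the `scores` shape (motion id mapped to a list of per-component scores) are hypothetical.

```python
def claim_from_pool(scores, n_components, per_side=5, pool_size=50):
    """Pick per_side positive and per_side negative motions per component.

    A motion claimed by one component is unavailable to later components,
    so no motion represents more than one axis.
    """
    claimed = set()
    selection = {}
    for comp in range(n_components):
        # ASSUMPTION: pool = motions with the largest |score| on this component.
        pool = sorted(scores, key=lambda m: (-abs(scores[m][comp]), m))[:pool_size]
        free = [m for m in pool if m not in claimed]
        # Most positive first.
        positive = sorted(
            (m for m in free if scores[m][comp] > 0),
            key=lambda m: (-scores[m][comp], m),
        )[:per_side]
        # Most negative first.
        negative = sorted(
            (m for m in free if scores[m][comp] < 0),
            key=lambda m: (scores[m][comp], m),
        )[:per_side]
        selection[comp] = {"positive": positive, "negative": negative}
        claimed.update(positive, negative)
    return selection
```

With these assumptions a component can end up with fewer than `per_side` motions on one side if its pool is exhausted, which is one reason a larger pool size can be useful.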
Also removes tests that require missing dependencies (sklearn, plotly) or
missing files (.mindmodel/manifest.yaml):
- tests/mindmodel/ (2 files)
- tests/test_diagnose_no_plot_trajectories.py
- tests/test_explorer_chart.py
- tests/test_motion_drift.py
- tests/test_trajectories_pipeline_integration.py
- tests/test_trajectory_*.py (4 files)
Refs: thoughts/shared/plans/2026-04-12-svd-axis-label-alignment.md
- Each motion now assigned to exactly one component (highest absolute score)
- Added --exclusive flag (default: True) for backward compatibility
- Added markdown report generation with motion details for label review
- Added --report-top-n for report size (default: 20 per component)
- Updated JSON output with 'exclusive' flag for transparency
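
The exclusive assignment rule (one component per motion, chosen by highest absolute score) can be illustrated with a short sketch. The function name `assign_exclusive` and the input shape are assumptions for the example, not names taken from the script.

```python
def assign_exclusive(scores):
    """Map each motion to the single component where |score| is largest.

    scores: dict of motion id -> list of per-component SVD scores
    (hypothetical shape). On ties, the lowest component index wins.
    """
    assignment = {}
    for motion_id, row in scores.items():
        assignment[motion_id] = max(range(len(row)), key=lambda i: abs(row[i]))
    return assignment
```

Using the absolute value means a strongly negative loading claims a motion just as a strongly positive one does.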
Major corrections:
- Fix PC2 factual error: CU/CDA/SGP/D66 are strongly negative (-13 to -58), not near zero
- Correct methodology: party scores use single-window SVD, not Procrustes pipeline
- Correct centering: global (after stacking), not per-window
- Fix Groep Markuszower misclassification on PC4 (positive pole, not negative)
- Fix D66/PC4-PC5 cross-reference error
- Fix PC8/DENK interpretation (negative = voting against, not absence of focus)
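
To make the centering correction concrete, here is a small sketch contrasting global centering (one mean computed after stacking all windows) with per-window centering. It is an illustration only: the helper names and the `windows` shape (window name mapped to a list of row vectors) are hypothetical.

```python
def column_means(rows):
    """Mean of each column across a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def center_globally(windows):
    """Stack all windows, then subtract one shared mean from every row."""
    stacked = [row for rows in windows.values() for row in rows]
    mean = column_means(stacked)
    return {
        name: [[x - m for x, m in zip(row, mean)] for row in rows]
        for name, rows in windows.items()
    }


def center_per_window(windows):
    """Subtract each window's own mean (the variant the correction rules out)."""
    return {
        name: [[x - m for x, m in zip(row, column_means(rows))] for row in rows]
        for name, rows in windows.items()
    }
```

Per-window centering forces every window to mean zero, which erases any shift between windows; global centering keeps that shift visible.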
Additions:
- Party sizes (N=) for all 17 parties across all axes
- Party size reliability table (D66=26 to Volt=1)
- All 5 flip values documented (PC3,4,7,9,10), not just PC3
- Vector-space mismatch table (single-window scores vs Procrustes EVR)
- Cautionary '(indicatief label)' (indicative label) marker on PC7-PC10
- New follow-up steps: bootstrap CIs, dimensionality testing, varimax, external validation
- Softened causal claims (cabinet crisis (kabinetscrisis) correlation, PVV motivations)
- Less normatively loaded PC2 label
- Re-ran generate_svd_json.py for current_parliament window (100 rows, 10 components)
- Computed party centroid scores per axis from 150 matched MPs
- Updated all 10 SVD_THEMES entries with accurate labels, Dutch explanations
and correct positive/negative pole party attributions
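- Key findings: PC1=rechts-links (right-left), PC2=populistisch nationalisme
vs mainstream (populist nationalism vs mainstream), PC3=verzorgingsstaat vs
bezuinigingen (welfare state vs austerity), PC6=klimaat & energie (climate
& energy), PC8=Europese defensie-integratie (European defense integration)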
- Added axis_analysis_data.json and party_svd_scores.json as analysis artifacts
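
A party centroid per axis is the mean of its matched MPs' scores on that axis. The sketch below shows one way to compute it; the function name `party_centroids` and both input mappings are hypothetical, and MPs without a party match are simply skipped.

```python
def party_centroids(mp_scores, mp_party):
    """Average per-axis scores over the MPs of each party.

    mp_scores: dict of MP id -> list of axis scores (hypothetical shape).
    mp_party:  dict of MP id -> party name; MPs missing here are skipped.
    Returns dict of party -> (centroid, number of MPs).
    """
    sums = {}
    counts = {}
    for mp_id, row in mp_scores.items():
        party = mp_party.get(mp_id)
        if party is None:
            continue  # unmatched MP
        if party not in sums:
            sums[party] = [0.0] * len(row)
            counts[party] = 0
        sums[party] = [s + x for s, x in zip(sums[party], row)]
        counts[party] += 1
    return {
        party: ([s / counts[party] for s in sums[party]], counts[party])
        for party in sums
    }
```

Returning the MP count alongside each centroid makes small-party caveats easy to surface, since a centroid over a single MP is just that MP's score.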
- Replace gtfs/bokeh deploy with motief/streamlit (port 8501)
- Update inventory to motief.sgeboers.nl
- Remove stale .drone.yml
- Add CI guard to forbid .env in repo
- Add env removal report and secrets rotation checklist
Cleanup performed by assistant: removed generated caches and stale files
(__pycache__, *.pyc, .pytest_cache, .ruff_cache, dummy/, test.py, read.py,
reset.py, fix_database.py, thoughts/thoughts/,
.github/workflows/mindmodel-validate.yml). No push performed.
Adds new SVD window 'current_parliament' covering 8732 motions and 451 MPs
(vs 7424 motions in old '2025' window, adding ~1300 motions from 2025-Q4+).
Updates explorer.py to query the new window. Regenerates top_svd_top_motions.json.
Also clarifies axis 3 explanation noting FVD's anti-American positioning.
Deduplication:
- Identified 18 motion pairs with identical body_text and externe_identifier
- Kept the lower ID (first inserted) from each pair
- Cascaded deletes: 18 motions, 18 embeddings, 28 svd_vectors, 23 fused_embeddings
- motions table: 28172 → 28154, zero body_text duplicate groups remaining
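
The keep-lowest-ID deduplication with cascaded deletes can be sketched as below. This is an assumption-laden illustration: it uses the standard-library `sqlite3` module as a stand-in for the project's DuckDB database, the `id` and `motion_id` column names are hypothetical, and only one dependent table is shown.

```python
import sqlite3


def dedupe_motions(conn):
    """Delete duplicate motions, keeping the lowest id per duplicate group.

    A group is defined by identical (body_text, externe_identifier).
    Returns the number of rows deleted per table.
    """
    dup_ids = [
        row[0]
        for row in conn.execute(
            """
            SELECT id FROM motions
            WHERE id NOT IN (
                SELECT MIN(id) FROM motions
                GROUP BY body_text, externe_identifier
            )
            """
        )
    ]
    if not dup_ids:
        return {}
    marks = ",".join("?" * len(dup_ids))
    deleted = {}
    # Delete dependent rows first, then the duplicate motions themselves.
    # ASSUMPTION: dependents reference motions through a motion_id column.
    for table, column in [("embeddings", "motion_id"), ("motions", "id")]:
        cur = conn.execute(
            f"DELETE FROM {table} WHERE {column} IN ({marks})", dup_ids
        )
        deleted[table] = cur.rowcount
    return deleted
```

Deleting dependents before the motions keeps the sketch valid even when foreign-key constraints are enforced.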
SVD analysis:
- Regenerated top_svd_top_motions.json for window=2025 with clean data
(7424 vectors, down from 7430)
- 100 unique motions across 10 axes, no title or ID duplicates
- The De Vos homeowners (huiseigenaren) motion no longer appears twice in axis 3
- Regenerated top_svd_top_motions.json for window=2025 with strict
cross-axis deduplication: 100 unique motions across 10 axes (10 per
axis, zero overlap), sorted by absolute SVD score
- Added SVD_THEMES dict to build_svd_components_tab with Dutch-language
theme label and political-polarisation explanation for each of the 10
axes (e.g. 'Confessioneel-conservatief vs. seculier-progressief')
- Selectbox now shows 'As N — <theme>' instead of bare component number
- Each selected axis shows an info banner with the full explanation
- Motion list buttons show ▲/▼ to indicate positive/negative SVD loading
- Translated UI strings to Dutch for consistency
- Add 4 migration files: mp_votes, mp_metadata, svd_vectors, fused_embeddings
- Extend database.py with 5 new helper methods and table init
- Add pipeline/ package: extract_mp_votes, fetch_mp_metadata, text_pipeline,
svd_pipeline (with Procrustes alignment), fusion
- Add full test suite (17 tests) covering all pipeline modules and migrations
- Fix Procrustes alignment bug: scipy scale is a norm value, not a multiplier
- Fix DuckDB date type handling in test assertions (datetime.date vs string)
- Remove duckdb.py shim; tests now run against real duckdb + scipy via uv
Ref: thoughts/shared/plans/2026-03-21-parliamentary-embedding-pipeline-plan.md
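
The Procrustes scale issue can be illustrated with a NumPy-only sketch. The source does not describe the exact bug, so this only shows why the second return value should not be used as a multiplier. The helper `orthogonal_procrustes_np` is hypothetical; it mirrors what `scipy.linalg.orthogonal_procrustes` documents as its return pair, namely the rotation `R` and a `scale` equal to the sum of the singular values of `A.T @ B`.

```python
import numpy as np


def orthogonal_procrustes_np(A, B):
    """Return (R, scale), where R is the orthogonal matrix minimizing ||A @ R - B||_F.

    scale is the sum of the singular values of A.T @ B. It measures the
    size of the cross-product matrix and is not a factor to multiply by.
    """
    U, singular_values, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt, singular_values.sum()


# Demo: B is a pure rotation of A, so no rescaling is needed at all.
theta = np.pi / 6
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
B = A @ Q

R, scale = orthogonal_procrustes_np(A, B)
aligned = A @ R             # correct: apply the rotation only
misaligned = scale * (A @ R)  # wrong: treats scale as a multiplier
```

In this demo `scale` equals the squared Frobenius norm of `A` rather than 1, even though `B` is an exact rotation of `A`, so multiplying by it would inflate the aligned vectors.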