Add GL-PvdA merger SVD analysis design with findings

Investigation of GroenLinks-PvdA merger dynamics in SVD space:
- Finding 1: GL-PvdA were 2.8-10.5% of avg inter-party distance apart pre-merger
- Finding 2: Merged party started most cohesive (#1 in 2023) but now 55% above avg spread
- Finding 3: Converged to 4.5% by Q3 2023, essentially indistinguishable
- Finding 4: GL/PvdA were most stable parties (mostly 10-26% drift, GL 54.5% in 2022→2023) while VVD/D66 moved 70-177%
main
Sven Geboers 2 weeks ago
parent cf549dcc1c
commit 025617a7b8
thoughts/shared/designs/2026-04-16-glpvda-merger-svd-analysis-design.md (113 lines added)

@ -0,0 +1,113 @@
---
date: 2026-04-16
topic: "GroenLinks-PvdA Merger Dynamics in SVD Space"
status: validated
---
## Problem Statement
We need concrete, data-driven findings about the GroenLinks-PvdA merger for a blog post. Four questions:
1. How similar were GL and PvdA in SVD space before the merger?
2. How cohesive is the merged party compared to others?
3. When did GL and PvdA converge in SVD space?
4. Which parties shifted the most, and how do GL/PvdA compare?
## Constraints
- Investigation-only: query the database, report findings, no code changes
- Database: `data/motions.db` (DuckDB), SVD vectors in `svd_vectors` table
- Party affiliation: use `mp_votes.party` for historical labels, not `mp_metadata`, which only tracks the current party
- SVD vectors are at MP level (`entity_type='mp'`); party centroids must be computed from MP vectors
- Cross-window scale differences require normalization (distances as fraction of avg inter-party distance)
## Approach
**Method**: Compute party centroids from MP-level SVD vectors, grouped by `mp_votes.party` for historical party labels. Normalize all distances as fractions of the average inter-party distance in each window to make cross-window comparisons meaningful.
**Data source**: `svd_vectors` table (MP vectors per window) joined with `mp_votes`, which records a party label per vote; each MP is assigned the party that appears most often in their votes for that year.
**Key discovery**: `mp_metadata` only tracks the current party, so MPs who sat for GL or PvdA before the merger are now labeled "GroenLinks-PvdA" there. We must use `mp_votes.party` for historical accuracy.
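The method above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the queries actually run: the function names, the dictionary inputs, and the choice to average over all party pairs (including the GL↔PvdA pair itself) are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def party_centroids(mp_vectors, mp_party):
    """Group MP vectors by party label and average each group.

    mp_vectors: {mp_id: [float, ...]}  (hypothetical input shape)
    mp_party:   {mp_id: party_label}   (historical label for this window)
    """
    grouped = defaultdict(list)
    for mp, vec in mp_vectors.items():
        grouped[mp_party[mp]].append(vec)
    return {party: centroid(vecs) for party, vecs in grouped.items()}


def avg_inter_party_distance(centroids):
    """Mean Euclidean distance over all unordered pairs of party centroids."""
    pairs = list(combinations(sorted(centroids), 2))
    return sum(math.dist(centroids[p], centroids[q]) for p, q in pairs) / len(pairs)


def normalized_distance(centroids, a, b):
    """Distance between parties a and b as a fraction of the window average."""
    return math.dist(centroids[a], centroids[b]) / avg_inter_party_distance(centroids)
```

Because the result is a ratio within one window, it can be compared across windows even when the absolute scale of the SVD vectors differs.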
## Architecture
N/A — this is a data investigation, not a system design.
## Components
### Finding 1: GL-PvdA Pre-Merger Similarity
GL and PvdA were already remarkably close in SVD space well before the merger:
| Year | GL↔PvdA Distance | As % of Avg Inter-Party Distance | Nearest Other Party | Distance to Nearest |
|------|------------------|----------------------------------|---------------------|---------------------|
| 2019 | 2.10 | 10.5% | PvdD | 9.6 |
| 2020 | 2.23 | 5.0% | CU | 28.7 |
| 2021 | 1.46 | 4.4% | FVD | 11.6 |
| 2022 | 1.16 | 2.8% | FVD | 6.5 |
The nearest non-PvdA party to GL was always several times further away than PvdA itself: about 4.6x in 2019, 12.9x in 2020, 7.9x in 2021, and 5.6x in 2022. GL and PvdA also converged over time, from 10.5% of the average inter-party distance in 2019 down to 2.8% in 2022.
### Finding 2: Post-Merger Cohesion
| Year | GL-PvdA Spread | Avg Other Spread | Ratio | Cohesion Rank |
|------|---------------|-------------------|-------|---------------|
| 2023 | 1.50 | 19.95 | 0.08 | #1 most cohesive |
| 2024 | 14.05 | 18.47 | 0.76 | Mid-pack |
| 2025 | 28.09 | 18.09 | 1.55 | Below average |
| Current | 43.30 | 28.05 | 1.54 | Below average |
The merged party started as the most cohesive party in parliament in 2023 (spread ratio 0.08), but by 2025 its internal spread was 55% above the average of the other parties (ratio 1.55). On this measure, the merged party is now internally more diverse than the average Dutch party.
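The Ratio column can be recomputed directly from the two spread columns. The sketch below also shows one plausible definition of "spread" (mean distance of members to their party centroid); that definition is an assumption, since the document does not state how spread was measured.

```python
import math


def spread(vectors):
    """Assumed definition: mean Euclidean distance from each MP to the party centroid."""
    n = len(vectors)
    center = [sum(col) / n for col in zip(*vectors)]
    return sum(math.dist(v, center) for v in vectors) / n


# (GL-PvdA spread, average spread of other parties), copied from the table above
rows = {
    "2023": (1.50, 19.95),
    "2024": (14.05, 18.47),
    "2025": (28.09, 18.09),
    "Current": (43.30, 28.05),
}

# Ratio > 1 means the merged party is less cohesive than the average other party
ratios = {year: round(own / others, 2) for year, (own, others) in rows.items()}
```

With these inputs, `ratios` matches the table: 0.08, 0.76, 1.55, and 1.54.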
### Finding 3: Merger Convergence Timeline
| Window | GL↔PvdA Distance | Normalized Ratio |
|--------|------------------|------------------|
| 2019-Q3 | 0.98 | 25.5% |
| 2020-Q1 | 1.38 | 18.6% |
| 2021-Q1 | 1.58 | 19.3% |
| 2022-Q3 | 0.86 | 9.4% |
| 2023-Q1 | 0.58 | 7.1% |
| **2023-Q3** | **0.37** | **4.5%** |
| 2023-Q4 | 0.46 | 5.5% |
By Q3 2023, just before the formal merger, the GL and PvdA centroids were only 4.5% of the average inter-party distance apart, making them essentially indistinguishable in voting-pattern space.
### Finding 4: Large Positional Shifts
GL and PvdA were the most stable parties in parliament (normalized drift per year):
| Period | GL Drift | PvdA Drift | VVD Drift | D66 Drift | PVV Drift |
|--------|----------|------------|-----------|-----------|-----------|
| 2019→2020 | 14.5% | 16.6% | 140.2% | 145.6% | 121.9% |
| 2020→2021 | 21.8% | 25.8% | 115.8% | 82.2% | 207.3% |
| 2021→2022 | 11.6% | 10.8% | 70.8% | 91.2% | 51.9% |
| 2022→2023 | 54.5% | 23.3% | 109.7% | 177.3% | 222.1% |
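While VVD and D66 moved 70-177% per year, PvdA drifted only 11-26% and GL 12-22% in three of the four periods; the exception is GL's 54.5% shift in 2022→2023, which was still smaller than the VVD, D66, and PVV shifts in that period. The merger partners stayed largely anchored while the rest of the landscape shifted.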
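A hedged sketch of how a normalized drift figure like those in the table could be computed. It assumes that centroids from consecutive windows are directly comparable and that the earlier window's average inter-party distance is the normalizer; neither detail is spelled out in this document, and the function names are hypothetical.

```python
import math
from itertools import combinations


def avg_inter_party_distance(centroids):
    """Mean Euclidean distance over all unordered pairs of party centroids."""
    pairs = list(combinations(sorted(centroids), 2))
    return sum(math.dist(centroids[p], centroids[q]) for p, q in pairs) / len(pairs)


def normalized_drift(prev, curr, party):
    """How far a party centroid moved between two windows, as a fraction
    of the earlier window's average inter-party distance (an assumption)."""
    moved = math.dist(prev[party], curr[party])
    return moved / avg_inter_party_distance(prev)


# Toy centroids for two consecutive windows (not real data)
prev = {"A": [0.0, 0.0], "B": [4.0, 0.0], "C": [0.0, 3.0]}
curr = {"A": [1.0, 0.0], "B": [4.0, 4.0], "C": [0.0, 3.0]}
drift = {party: normalized_drift(prev, curr, party) for party in prev}
```

In the toy data, party A moves a quarter of the average inter-party distance, B moves a full average distance, and C does not move at all.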
## Data Flow
1. Query `svd_vectors` for MP vectors per window
2. Join with `mp_votes` to determine each MP's majority party in that year
3. Compute party centroids as mean of member vectors
4. Compute pairwise distances and normalize by average inter-party distance
5. Track convergence timeline using quarterly windows
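Step 2 of the flow above, the majority-party assignment, might look like the following. This is only a sketch: the tuple layout `(mp, year, party)` is a hypothetical stand-in for rows read from `mp_votes`, whose column names other than `party` are not given here.

```python
from collections import Counter, defaultdict


def majority_party(vote_rows):
    """Assign each MP the party label that appears most often in a given year.

    vote_rows: iterable of (mp, year, party) tuples, one per recorded vote.
    Returns {(mp, year): party}. On a tie, the label seen first wins.
    """
    counts = defaultdict(Counter)
    for mp, year, party in vote_rows:
        counts[(mp, year)][party] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


# Toy rows (not real data): mp1 voted twice as GL and once under the merged label
rows = [
    ("mp1", 2023, "GL"),
    ("mp1", 2023, "GroenLinks-PvdA"),
    ("mp1", 2023, "GL"),
    ("mp2", 2023, "PvdA"),
]
assignment = majority_party(rows)
```

Here `mp1` is assigned "GL" for 2023, which is the behaviour the transition-year handling relies on.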
## Error Handling
- Windows with insufficient MPs (<3 per party) are excluded from centroid calculations
- The `mp_votes.party` column uses multiple label variants ("GroenLinks", "GL", "GroenLinks-PvdA") — normalized in queries
- The 2023 transition year has mixed labels (some GL, some PvdA, some GL-PvdA) — handled by majority-vote assignment per MP
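The first two rules above could be applied as follows. The alias table is an assumption (the document does not say which variant is canonical), and this sketch drops an under-sized party from a window rather than dropping the whole window; "GroenLinks-PvdA" is deliberately left as its own label.

```python
from collections import Counter

MIN_MPS = 3  # threshold taken from the rule above

# Assumed canonical form: fold the long GroenLinks label into "GL"
LABEL_ALIASES = {"GroenLinks": "GL"}


def normalize_label(label):
    """Trim whitespace and map known label variants to one canonical label."""
    cleaned = label.strip()
    return LABEL_ALIASES.get(cleaned, cleaned)


def parties_with_enough_mps(mp_party, min_mps=MIN_MPS):
    """Return the set of parties that have at least min_mps MPs in this window.

    mp_party: {mp_id: raw party label} for a single window (hypothetical shape).
    """
    counts = Counter(normalize_label(p) for p in mp_party.values())
    return {party for party, n in counts.items() if n >= min_mps}
```

Only parties in the returned set would then get a centroid for that window.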
## Testing Strategy
N/A — data investigation. Key validation checks:
- Cross-reference MP counts with known parliament compositions
- Verify that GL + PvdA MP counts match expected seat counts per year
- Confirm that convergence timeline aligns with known political events (merger announcement Oct 2023)
## Open Questions
- Should we compute cosine similarity instead of Euclidean distance for cross-window normalization?
- The 2025 and current_parliament windows show very different absolute scales — should we normalize vectors before computing distances?
- The few remaining "GL" (8) and "PvdA" (5) labeled MPs in 2025 may be artifacts — should they be included in the GL-PvdA group?
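To make the first two questions concrete, the toy example below shows why the choice matters: Euclidean distance grows with the overall scale of the vectors, while cosine distance and distances between unit-normalized vectors do not. It is an illustration only and says nothing about which option suits this dataset.

```python
import math


def cosine_distance(u, v):
    """1 minus the cosine of the angle between u and v; ignores vector length."""
    dot = sum(a * b for a, b in zip(u, v))
    return 1.0 - dot / (math.hypot(*u) * math.hypot(*v))


def unit(v):
    """Scale v to length 1."""
    norm = math.hypot(*v)
    return [x / norm for x in v]


u, v = [3.0, 4.0], [4.0, 3.0]
u_big, v_big = [30.0, 40.0], [40.0, 30.0]  # same directions, 10x the scale

euclid_small = math.dist(u, v)           # grows with scale
euclid_big = math.dist(u_big, v_big)     # 10x euclid_small
cos_small = cosine_distance(u, v)        # unchanged by scale
cos_big = cosine_distance(u_big, v_big)
unit_small = math.dist(unit(u), unit(v))  # unchanged by scale
unit_big = math.dist(unit(u_big), unit(v_big))
```

Either scale-free option would remove the need for per-window normalization of the raw distances, at the cost of discarding any information carried by vector magnitude.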