<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Mapping Dutch Democracy: Building a Political Compass</title>
<style>
body{font-family:'Inter',-apple-system,BlinkMacSystemFont,'Segoe UI',Roboto,sans-serif;max-width:900px;margin:40px auto;line-height:1.7;color:#c9d1d9;background:#0d1117;padding:0 20px}
pre{background:#161b22;padding:14px;border-radius:8px;overflow:auto;border:1px solid #30363d}
code{font-family:'JetBrains Mono','Fira Code',monospace;background:#1c2128;padding:2px 6px;border-radius:4px;font-size:0.9em}
pre code{background:none;padding:0;font-size:0.85em;line-height:1.5}
h1,h2,h3{color:#58a6ff;font-weight:600}
h1{font-size:1.8em;margin-top:1.2em}
h2{font-size:1.4em;margin-top:1.8em}
h3{font-size:1.15em;margin-top:1.4em}
a{color:#58a6ff;text-decoration:none}
a:hover{text-decoration:underline}
ul{margin-left:1.2rem}
strong{color:#e6edf3}
.callout{background:#161b22;border-left:4px solid #58a6ff;padding:12px 16px;margin:20px 0;border-radius:0 8px 8px 0}
.finding{background:#0d1f0d;border-left:4px solid #3fb950;padding:12px 16px;margin:20px 0;border-radius:0 8px 8px 0}
table{border-collapse:collapse;width:100%;margin:16px 0}
th,td{border:1px solid #30363d;padding:8px 12px;text-align:left}
th{background:#161b22;color:#58a6ff}
td{color:#c9d1d9}
em{color:#8b949e}
</style>
</head>
<body>
<h1>Mapping Dutch Democracy: Building a Political Compass from 28,000 Parliamentary Votes</h1>
<p><em>What if you could take every motion voted on in the Dutch Parliament over the past decade and automatically plot parties and MPs on a political map — with zero manual labeling?</em></p>
<p>That's exactly what this project does. Here's how I built it, what I had to solve along the way, and what it revealed about Dutch political dynamics.</p>
<h2>The Starting Point: Open Data, Hidden Structure</h2>
<p>The Dutch Parliament publishes every vote — every <em>motie</em>, every <em>amendement</em>, every <em>besluit</em> — in an open OData API. We're talking over <strong>28,000 distinct motions</strong> spanning 2016 to 2026, each with a record of how every individual MP voted: <em>voor</em> (for), <em>tegen</em> (against), <em>onthouden</em> (abstained), or <em>afwezig</em> (absent). That's over 500,000 individual vote records.</p>
<div class="callout">
<strong>A note on the numbers:</strong> The 28,000 figure counts distinct parliamentary decisions (motions, amendments, legislative proposals). The 500,000+ figure counts individual MP votes — each motion generates roughly 18 vote records (one per voting MP or party bloc). At ~3,000–4,000 motions per year and 70–80 parliamentary sitting days, that's roughly 50 votes per sitting day. The Dutch Second Chamber is prolific.
</div>
<p>This is an extraordinary dataset. But in raw form it's just a table of votes. The interesting question is: can we extract <em>structure</em> — left vs. right, progressive vs. conservative, governing vs. opposition — purely from the pattern of who votes with whom?</p>
<p>The answer is yes, and the method is surprisingly elegant.</p>
<h2>Step 1: Turning Votes into Geometry</h2>
<p>Each motion is a snapshot of political alignment. For each motion, we know which MPs voted together and which voted apart. If every PvdA and GroenLinks MP votes the same way almost every time, that tells us something. If PVV and CDA MPs diverge consistently, that tells us something too.</p>
<p>I represent this with <strong>Singular Value Decomposition (SVD)</strong> on the MP × motion matrix:</p>
<ul>
<li>Rows: individual MPs (and party actors for collective votes)</li>
<li>Columns: motions</li>
<li>Values: +1 (voor), −1 (tegen), 0 (absent/abstain)</li>
</ul>
<p>SVD finds the dominant axes of variation — the directions along which the chamber disagrees most. The first component almost always corresponds to a left-right axis. The second typically captures something like progressive-traditionalist or libertarian-authoritarian. The key point: <strong>the axes emerge from the math, not from any labeling on my part.</strong></p>
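<p>To make the encoding concrete, here is a minimal sketch on made-up data. The record layout, the MP and motion names, and the choice to scale coordinates by the singular values are assumptions for illustration, not the project's actual code; no centering is applied.</p>
<pre><code>import numpy as np

# Toy (mp, motion, vote) records; names and votes are invented.
records = [
    ("mp_a", "m1", "voor"), ("mp_a", "m2", "tegen"), ("mp_a", "m3", "voor"),
    ("mp_b", "m1", "voor"), ("mp_b", "m2", "tegen"), ("mp_b", "m3", "afwezig"),
    ("mp_c", "m1", "tegen"), ("mp_c", "m2", "voor"), ("mp_c", "m3", "tegen"),
]
# Encoding from the list above: +1 voor, -1 tegen, 0 for abstain/absent.
VOTE_VALUE = {"voor": 1.0, "tegen": -1.0, "onthouden": 0.0, "afwezig": 0.0}

mps = sorted({r[0] for r in records})
motions = sorted({r[1] for r in records})
row = {name: i for i, name in enumerate(mps)}
col = {name: j for j, name in enumerate(motions)}

# MP x motion matrix; cells without a record stay at 0.
M = np.zeros((len(mps), len(motions)))
for mp, motion, vote in records:
    M[row[mp], col[motion]] = VOTE_VALUE[vote]

# Thin SVD: M = U @ diag(s) @ Vt, singular values in descending order.
U, s, Vt = np.linalg.svd(M, full_matrices=False)

k = 2                              # keep the two dominant axes
mp_coords = U[:, :k] * s[:k]       # one k-dimensional point per MP
motion_coords = Vt[:k].T * s[:k]   # one k-dimensional point per motion</code></pre>
<p>In this toy chamber, <code>mp_a</code> and <code>mp_b</code> end up on the same side of the first axis and <code>mp_c</code> on the opposite side, which is the whole idea in miniature.</p>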
<h3>Making Windows Comparable: Procrustes Alignment</h3>
<p>Running SVD independently per time window creates a subtle problem: SVD axes are <strong>arbitrarily oriented</strong>. The "left-right" axis from 2020-Q3 and the "left-right" axis from 2021-Q1 might point in completely different directions — even if the underlying politics barely changed. You can't just stack the coordinates and call it a trajectory.</p>
<p>The fix is <strong>Procrustes alignment</strong>: given two sets of party/MP positions across consecutive windows, find the orthogonal matrix R (a rotation, possibly combined with a reflection) that best maps one onto the other by minimizing the Frobenius norm of the difference, using MPs who appear in both windows as anchors:</p>
<pre><code>R = argmin_R ||A − B @ R||_F, subject to R'R = I</code></pre>
<p>This is solved cleanly via SVD of the cross-covariance matrix (a nice piece of mathematical symmetry — SVD to build the space, SVD to align it). The result: a continuous track for every party from 2019 to 2026, where position changes reflect genuine political movement rather than axis flips.</p>
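<p>A minimal sketch of that solution, assuming the anchor MPs are already matched row by row in both windows. The function name and the synthetic data are illustrative only.</p>
<pre><code>import numpy as np

def procrustes_rotation(A, B):
    """Orthogonal R minimizing ||A - B @ R||_F; rows are shared anchors."""
    # SVD of the cross-covariance between the two point sets.
    U, _, Vt = np.linalg.svd(B.T @ A)
    return U @ Vt

rng = np.random.default_rng(0)
A = rng.normal(size=(30, 2))       # anchor positions in the earlier window

# Simulate the later window: same politics, arbitrary rotation plus axis flip.
theta = 2.0
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
flip = np.diag([1.0, -1.0])
B = A @ Q @ flip

R = procrustes_rotation(A, B)
B_aligned = B @ R                  # later window expressed in the earlier frame
disparity = np.linalg.norm(A - B_aligned)   # residual after alignment</code></pre>
<p>Because the second window here is an exact rotation and flip of the first, the residual is zero up to floating-point error. On real data the leftover disparity is what remains once axis orientation has been factored out.</p>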
<h2>Step 2: Finding Similar Motions</h2>
<p>Once we have SVD vectors for every motion in a window, we can find the most politically similar motions. Two motions are close if they produce a similar split in the chamber — same parties voting the same way.</p>
<p>The similarity computation is pure NumPy: load all SVD vectors for a window, L2-normalize, compute cosine similarity via a single matrix multiply, then extract top-k neighbors. For a 4,000-motion quarter, that's a 4000×4000 matrix — fast enough without batching.</p>
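<p>The steps in that paragraph can be sketched as follows. The function name and the toy vectors are assumptions; a full sort is used for simplicity, which is fine at a few thousand motions.</p>
<pre><code>import numpy as np

def top_k_similar(vectors, k):
    """Indices and cosine similarities of the k nearest motions per row."""
    vectors = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.maximum(norms, 1e-12)   # L2-normalize, guard zeros
    sim = unit @ unit.T                         # full cosine similarity matrix
    np.fill_diagonal(sim, -np.inf)              # a motion is not its own neighbor
    idx = np.argsort(-sim, axis=1)[:, :k]       # most similar first
    return idx, np.take_along_axis(sim, idx, axis=1)

# Four toy motion vectors in a 2-dimensional SVD space.
vectors = [[1.0, 0.0], [2.0, 0.1], [0.0, 1.0], [-1.0, 0.0]]
neighbors, scores = top_k_similar(vectors, k=2)</code></pre>
<p>Here the first two motions point in nearly the same direction, so each is the other's closest neighbor, while the fourth (the mirror image of the first) ranks last for it.</p>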
<h2>The Numbers: What We're Working With</h2>
<table>
<thead><tr><th>Year</th><th>Motions</th><th>Breakdown</th></tr></thead>
<tbody>
<tr><td>2016</td><td>162</td><td>Mostly legislative proposals (data incomplete)</td></tr>
<tr><td>2017</td><td>126</td><td>Mostly legislative proposals (data incomplete)</td></tr>
<tr><td>2018</td><td>124</td><td>Mostly legislative proposals (data incomplete)</td></tr>
<tr><td>2019</td><td>3,374</td><td>2,058 moties + 350 amendementen</td></tr>
<tr><td>2020</td><td>4,223</td><td>3,141 moties + 354 amendementen</td></tr>
<tr><td>2021</td><td>4,283</td><td>3,395 moties + 236 amendementen</td></tr>
<tr><td>2022</td><td>4,115</td><td>3,255 moties + 290 amendementen</td></tr>
<tr><td>2023</td><td>3,272</td><td>2,557 moties + 217 amendementen</td></tr>
<tr><td>2024</td><td>3,965</td><td>3,007 moties + 359 amendementen</td></tr>
<tr><td>2025</td><td>3,712</td><td>2,900 moties + 251 amendementen</td></tr>
<tr><td>2026</td><td>948</td><td>849 moties + 21 amendementen (partial year)</td></tr>
</tbody>
</table>
<p>Early years (2016–2018) are incomplete: the API data for this period is sparse and mostly contains legislative proposals rather than parliamentary motions. From 2019 onwards, the data is comprehensive. The pipeline runs on quarterly windows, 41 in total.</p>
<div class="callout">
<strong>The 2020–2022 plateau is striking.</strong> Each of those years saw over 4,000 motions, peaking at 4,283 in 2021. In 2022 the Rutte IV coalition governed amid intense debates on energy prices, housing, the war in Ukraine, and the ongoing nitrogen crisis. 2023 culminated in the November election that brought PVV to its historic first-place finish with 37 seats.
</div>
<h2>Finding 1: The Merger That Was Already Written in the Votes</h2>
<div class="finding">
<strong>The GroenLinks–PvdA merger wasn't a surprise to the data.</strong> In the raw SVD vectors, they appear as separate parties from 2019 through 2023 — but their coordinates were already converging. By late 2022, the distance between them was smaller than the internal variation within most other parties. By 2023-Q3 — the last quarter before the formal merger — GroenLinks and PvdA agreed on <strong>99.8%</strong> of recorded votes.
<img src="../docs/research/party_agreement_2023Q3.png" alt="Party agreement matrix — 2023-Q3" style="width:100%;max-width:700px;border-radius:8px;margin:12px 0;display:block">
</div>
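<p>How the agreement percentage was computed isn't spelled out here, so the following is only one plausible definition: the share of motions on which two parties cast the same vote, counted over motions where both have a recorded vote. The matrix values are made up.</p>
<pre><code>import numpy as np

def agreement_matrix(V):
    """Pairwise agreement share. V is parties x motions: +1, -1, or 0 (no vote)."""
    V = np.asarray(V, dtype=float)
    voted = (V != 0).astype(float)
    both_voted = voted @ voted.T          # motions where both parties voted
    # V @ V.T equals agreements minus disagreements on those motions.
    same = (V @ V.T + both_voted) / 2.0
    with np.errstate(invalid="ignore", divide="ignore"):
        return same / both_voted          # NaN where a pair shares no motions

# Three toy parties, four motions.
V = np.array([
    [ 1,  1, -1,  1],
    [ 1,  1, -1, -1],
    [-1,  0,  1, -1],
])
agree = agreement_matrix(V)</code></pre>
<p>In the toy data the first two parties agree on three of four motions (0.75), and the first and third never agree on the three motions where both voted.</p>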
<p>The raw data preserves the distinction carefully. From 2019 through mid-2023, the <code>svd_vectors</code> table lists <strong>GroenLinks</strong> and <strong>PvdA</strong> as separate entries per window. From late 2023 onwards — when the merger formally took effect in parliament — a single <strong>GroenLinks-PvdA</strong> entity appears. The pipeline tracks this faithfully: you can literally watch two separate points on the political compass drift together and then merge into one.</p>
<p>What's striking is <em>how early</em> the convergence is visible. By 2021 — two full years before the merger announcement — GroenLinks and PvdA coordinates in the SVD space are nearly overlapping. At the individual MP level, there was occasional divergence on defense and security votes (GroenLinks MPs pulling slightly away from the PvdA centroid), but at the party level they were practically indistinguishable.</p>
<p>This created an interesting pipeline challenge: the party normalization step has a mapping that folds both names into <code>GroenLinks-PvdA</code> across the <em>entire</em> dataset. For the post-merger period that's correct; for the pre-merger period it's a simplification that hides the convergence story. The raw vectors still capture it — you just have to know to look.</p>
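<p>One way to keep the pre-merger distinction in a normalization step is to make the mapping window-aware. This is a hypothetical sketch, not the pipeline's code: the function name is invented, and the cut-over window <code>2023-Q4</code> is an assumption read off the "2023-Q3 was the last quarter before the merger" statement above.</p>
<pre><code># Assumed first window in which the merged group appears.
MERGER_WINDOW = "2023-Q4"
MERGED = {"GroenLinks": "GroenLinks-PvdA", "PvdA": "GroenLinks-PvdA"}

def normalize_party(name, window):
    """Fold GroenLinks and PvdA together only from the merger window onward."""
    # "YYYY-Qn" labels sort chronologically as plain strings.
    if name in MERGED and window >= MERGER_WINDOW:
        return MERGED[name]
    return name

before = normalize_party("PvdA", "2021-Q2")   # stays "PvdA"
after = normalize_party("PvdA", "2024-Q1")    # becomes "GroenLinks-PvdA"</code></pre>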
<p>After the formal merger, GroenLinks-PvdA became one of the most <strong>cohesive</strong> parties in parliament. Their internal voting discipline rivals SGP and ChristenUnie — near-perfect blocs. VVD, by contrast, shows the most internal variation, which tracks with what you'd expect from a large centrist party managing conflicting wings.</p>
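<p>Cohesion can be quantified in several ways. The sketch below uses one simple option, the mean distance of a party's MPs to the party centroid in SVD space; whether this matches the measure behind the comparison above is an assumption, and the coordinates are toy values.</p>
<pre><code>import numpy as np

def party_spread(coords, party_of):
    """Mean distance of each party's MPs to that party's centroid."""
    coords = np.asarray(coords, dtype=float)
    labels = np.asarray(party_of)
    spread = {}
    for party in np.unique(labels):
        pts = coords[labels == party]
        centroid = pts.mean(axis=0)
        distances = np.linalg.norm(pts - centroid, axis=1)
        spread[str(party)] = float(distances.mean())
    return spread

# Two toy parties: "A" is spread out, "B" votes as a perfect bloc.
coords = [[0.0, 0.0], [0.0, 2.0], [5.0, 5.0], [5.0, 5.0]]
spread = party_spread(coords, ["A", "A", "B", "B"])</code></pre>
<p>A lower value means a tighter bloc: the toy party "B" scores 0.0, while "A" scores 1.0.</p>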
<h2>Finding 2: When Left and Right Unite Against the Center</h2>
<div class="finding">
<strong>The most surprising pattern in the data isn't left vs. right — it's left <em>and</em> right vs. the governing coalition.</strong>
</div>
<p>During the Rutte IV cabinet (2022–2023), a recurring pattern emerged: PVV, FvD, and JA21 (right-wing) would vote with SP, GroenLinks-PvdA, PvdD, DENK, and Volt (left-wing) <strong>against</strong> the governing parties VVD, D66, CDA, and ChristenUnie. This isn't a one-off — it happened on dozens of motions.</p>
<p>The topics tell the story:</p>
<ul>
<li><strong>Disability care bureaucracy</strong> — motions to reduce administrative burden in disability care. The populist right and the progressive left both opposed the coalition's market-oriented approach.</li>
<li><strong>Respite care for intensive caregivers</strong> — same coalition of radical left and radical right, opposing centrist fiscal restraint.</li>
<li><strong>Anti-fraud budget retention</strong> — the coalition wanted to maintain the anti-fraud apparatus (think: the toeslagenaffaire aftermath); both flanks pushed back.</li>
<li><strong>Education funding</strong> — motions to increase fundamental education budgets. VVD and D66 voted against; PVV and SP voted together.</li>
<li><strong>Regional infrastructure</strong> — train stations, Eindhoven connectivity, regional investment. Left+right voted for; coalition voted against.</li>
</ul>
<p>This is the classic "horseshoe" pattern in political science — the extremes converging against the center — but it's remarkable to see it so clearly in the voting geometry. It's not ideological agreement between left and right; it's a shared opposition to the governing consensus.</p>
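<p>A pattern like this can be pulled out of a party × motion matrix with a simple mask. The sketch below is illustrative: the party subset, the strict "every party in the group" rule, and the toy votes are all assumptions.</p>
<pre><code>import numpy as np

# Row order of the toy party x motion matrix below.
parties = ["VVD", "D66", "CDA", "ChristenUnie", "PVV", "FvD", "JA21", "SP", "PvdD"]
coalition = ["VVD", "D66", "CDA", "ChristenUnie"]
right_flank = ["PVV", "FvD", "JA21"]
left_flank = ["SP", "PvdD"]

def rows(names):
    return [parties.index(name) for name in names]

def flanks_vs_coalition(V):
    """Mask over motions: both flanks vote for, the whole coalition against."""
    coalition_against = (V[rows(coalition)] == -1).all(axis=0)
    right_for = (V[rows(right_flank)] == 1).all(axis=0)
    left_for = (V[rows(left_flank)] == 1).all(axis=0)
    return np.logical_and.reduce([coalition_against, right_for, left_for])

V = np.array([
    [-1,  1, -1],   # VVD
    [-1,  1,  1],   # D66
    [-1,  1, -1],   # CDA
    [-1,  1, -1],   # ChristenUnie
    [ 1,  1, -1],   # PVV
    [ 1,  1, -1],   # FvD
    [ 1,  1, -1],   # JA21
    [ 1,  1,  1],   # SP
    [ 1,  1,  1],   # PvdD
])
mask = flanks_vs_coalition(V)</code></pre>
<p>Only the first toy motion matches: the second is unanimous and the third is an ordinary left-right split. In practice a softer threshold (say, most rather than all parties in each group) may be more useful.</p>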
<h2>Finding 3: BBB's Geometric Arrival</h2>
<p>When BBB (BoerBurgerBeweging) entered parliament after the 2021 general election, their SVD position placed them between PVV and CDA, consistent with their policy profile: agrarian-nationalist populism with Catholic-provincial roots. New parties don't get to pick their geometric location; the voting record places them. That BBB landed exactly where you'd expect is a good validity check.</p>
<p>What the geometry also shows: BBB started close to PVV on the nationalist axis, but drifted toward the CDA cluster over their first year in parliament — visible as a curved trajectory rather than a fixed point.</p>
<h2>Finding 4: The Closest Votes in a Decade</h2>
<p>The controversy score (<code>1 − winning_margin</code>) reveals the knife-edge votes. In the current fragmented parliament, the tightest split is a perfect <strong>8–8 party-line tie</strong>: eight parties on each side. These happened on:</p>
<ul>
<li><strong>Family reunification for AMV status holders</strong> (Boomsma motion, 2025) — immigration policy at its most contested</li>
<li><strong>Nuclear weapons and NATO</strong> (Dobbe motion, 2025) — whether to push for nuclear disarmament within the alliance</li>
<li><strong>Long COVID research funding</strong> (Kostic motion, 2025) — healthcare commitments that split parties along unexpected lines</li>
<li><strong>Cormorant population management</strong> (Kostic motion, 2025) — agricultural vs. ecological interests in a literal bird-counting exercise</li>
</ul>
<p>The narrowest non-tie votes are razor-thin too: Wilders' asylum emergency stop motion lost by the slimmest margin (5 parties for, 16 tied — effectively blocked), while Marijnissen's motion against private equity in GP practices nearly flipped the other way (16 for, 5 tied). On a different day, a different MP showing up, Dutch immigration and healthcare policy could have shifted.</p>
<p>More broadly, over <strong>15,000 motions</strong> had winning margins below 55% — these are the genuinely contested decisions, not the rubber stamps. At the other extreme, about 3,700 motions passed with 95%+ support: the uncontroversial consensus items that rarely make headlines.</p>
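<p>For concreteness, here is the score as a function. The definition of <code>winning_margin</code> is an assumption: it is taken to be the winning side's share of the votes cast, which is how the "below 55%" threshold above reads. Under that reading the score runs from 0 (unanimous) to 0.5 (dead tie).</p>
<pre><code>def controversy(votes_for, votes_against):
    """1 - winning_margin, with winning_margin assumed to be the winner's share."""
    total = votes_for + votes_against
    if total == 0:
        return 0.0                       # nothing recorded, nothing contested
    winning_margin = max(votes_for, votes_against) / total
    return 1.0 - winning_margin

tie = controversy(8, 8)                  # 0.5, the maximum possible
unanimous = controversy(150, 0)          # 0.0
narrow = controversy(76, 74)             # just under 0.5</code></pre>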
<h2>The Pipeline Architecture</h2>
<p>Single DuckDB database, modular Python pipeline, no cloud infrastructure:</p>
<pre><code>API (Tweede Kamer OData)
→ download_past_year.py → motions table (28,304 rows)
motions
→ extract_mp_votes.py → mp_votes table (508,765 rows)
→ sync_motion_content.py → body_text enrichment (~94% coverage)
→ svd_pipeline.py → svd_vectors table (73,165 rows, 41 windows)
svd_vectors
→ similarity/compute.py → similarity_cache (top-10 per window)</code></pre>
<p>The similarity step is the computation described in Step 2, with one extra detail: vectors are padded to uniform length before L2-normalization. At 4,000 motions per quarter, the full 4000×4000 cosine similarity matrix is fast enough that batching isn't needed.</p>
<p>The database sits at ~18 GB on disk — the full parliamentary text for 26,000+ motions accounts for most of that.</p>
<h2>What the Axes Actually Mean</h2>
<p>One of the trickiest problems was labeling the SVD axes. The first component reliably captures left-right economics. But components 3 through 10? The mathematical procedure is sound — SVD finds the directions of maximum variance — but the <em>meaning</em> of each axis has to be derived from the actual motions that load heavily on it.</p>
<p>I solved this by extracting the top 50 motions per component (by absolute loading score), then analyzing their content. Some clear patterns emerged:</p>
<ul>
<li><strong>Component 1</strong>: Fiscal-economic policy vs. social welfare and international rights — the classic left-right split.</li>
<li><strong>Component 2</strong>: Nationalist vs. multilateralist orientation — PVV/FvD on one side, Volt/GroenLinks-PvdA on the other.</li>
<li><strong>Component 3</strong>: Welfare state vs. defense spending — flip of the usual axis (with SP/PvdD on the pro-welfare side, VVD/SGP on the pro-defense side).</li>
</ul>
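<p>The extraction step described above (top motions per component by absolute loading) can be sketched like this. It assumes the loadings are the rows of <code>Vt</code> from the SVD; the function name, motion ids, and values are hypothetical.</p>
<pre><code>import numpy as np

def top_loading_motions(Vt, motion_ids, component, n=50):
    """Motions with the largest absolute loading on one component."""
    loadings = np.asarray(Vt)[component]
    order = np.argsort(-np.abs(loadings))[:n]      # strongest first
    # Keep the sign: it says which pole of the axis the motion sits on.
    return [(motion_ids[j], float(loadings[j])) for j in order]

Vt = np.array([[0.1, -0.9, 0.4],
               [0.7,  0.2, -0.68]])
motion_ids = ["m1", "m2", "m3"]
top = top_loading_motions(Vt, motion_ids, component=0, n=2)</code></pre>
<p>Reading the text of the motions at both ends of the sign range is what turns an anonymous component into a labeled axis.</p>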
<div class="callout">
<strong>How much do the first two axes actually capture?</strong> In a single-window SVD (current parliament), PC1 explains ~29% of the variance and PC2 explains ~11.5%, together accounting for <strong>~41%</strong> of all voting variation. PC3 adds another 8.6% and PC4 is also under 9%, while components 5–8 each contribute 3–6%. The classic "scree plot" elbow is clear: the first two dimensions carry the signal, the rest is real but diminishing. When looking across <em>all</em> time windows with Procrustes alignment, the picture flattens considerably: PC1 and PC2 explain ~14.6% and ~13.1% respectively, because aligning 41 different windows distributes variance more evenly. The multi-window perspective is more conservative, but the message is the same: Dutch politics is largely two-dimensional.
<img src="../docs/research/scree_multiwindow.png" alt="Scree plot — multi-window Procrustes-aligned SVD" style="width:100%;max-width:600px;border-radius:8px;margin:12px 0;display:block">
<em style="font-size:0.85em">Scree plot across 41 aligned quarterly windows. PC1 = 14.6%, PC2 = 13.1%.</em>
</div>
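<p>Percentages like these are usually derived from the singular values. A minimal sketch, assuming the standard squared-singular-value ratio; the input values are made up, and without centering the matrix this is strictly a share of total squared magnitude rather than of statistical variance.</p>
<pre><code>import numpy as np

def explained_ratio(singular_values):
    """Share of total squared magnitude carried by each component."""
    s = np.asarray(singular_values, dtype=float)
    energy = s ** 2
    return energy / energy.sum()

ratios = explained_ratio([5.0, 3.0, 1.0])   # 25/35, 9/35, 1/35
cumulative = np.cumsum(ratios)              # running total for a scree plot</code></pre>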
<h2>What's Next</h2>
<p><strong>Motion explorer</strong>: Given a motion, retrieve the 10 most politically similar ones from across the decade. Trace how a policy debate evolved — who championed it, how the coalitions shifted.</p>
<p><strong>Party trajectory animation</strong>: Procrustes-aligned positions, animated year by year. Watch GroenLinks-PvdA's pre-merger convergence, watch PVV consolidate its flank, watch new parties arrive and find their geometric home.</p>
<p><strong>Cross-party coalition patterns</strong>: Which topics produce unusual coalition configurations — motions where the normal left-right split breaks down and unexpected alliances form.</p>
<p><strong>Cabinet crisis detection</strong>: Track coalition cohesion over time. When do coalition parties start voting against each other? The Procrustes disparity between consecutive windows is itself a signal of structural political shifts.</p>
<h2>Reproducibility</h2>
<pre><code># Download historical data
python scripts/download_past_year.py --start-date 2016-01-01 --end-date 2026-01-01
# Run full pipeline (SVD, similarity cache)
python -m pipeline.run_pipeline --db-path data/motions.db \
--start-date 2016-01-01 --end-date 2026-01-01 \
--window-size quarterly --text-batch-size 200
# Enrich with full motion body text
python scripts/sync_motion_content.py --db-path data/motions.db</code></pre>
<p>All computation — SVD, similarity — runs locally on a single machine. No cloud services, no GPU required.</p>
<p>Democracy is more legible than it looks.</p>
</body>
</html>