# Tweede Kamer Parliamentary Embedding Analysis

## Goal

Track how MPs shift politically over time and map motions onto a meaningful ideological axis, by embedding both MPs and motions into a shared vector space.

## Data

|Source|Content|
|------|-------|
|MP × motion vote matrix|yes / no / abstain per MP per motion|
|Motion text|Dutch-language motion descriptions|
|MP metadata|name, party, entry/exit dates|
|Timestamps|date of each vote|

## Approach: Late Fusion

Two independent embedding signals, combined per motion.

### 1. Vote embeddings (SVD)

- Build a sparse MP × motion matrix per time window
- Apply SVD to get latent vectors for both MPs and motions
- Encodes political alignment from actual voting behavior

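
The steps above can be sketched as follows. This is a minimal sketch, not a fixed design: the vote encoding (+1 yes, -1 no, 0 for abstain or absent), the function name `vote_embeddings`, and the choice to split the singular values evenly between the MP and motion factors are all assumptions made for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds


def vote_embeddings(mp_idx, motion_idx, votes, n_mps, n_motions, k=2):
    """Factor a sparse MP x motion vote matrix into k-dimensional vectors.

    Assumed encoding: +1 = yes, -1 = no; abstentions and absences are
    simply not stored, so they count as 0.
    """
    M = csr_matrix((votes, (mp_idx, motion_idx)),
                   shape=(n_mps, n_motions), dtype=float)
    U, s, Vt = svds(M, k=k)            # requires k < min(M.shape)
    order = np.argsort(s)[::-1]        # sort components by descending singular value
    U, s, Vt = U[:, order], s[order], Vt[order]
    scale = np.sqrt(s)                 # share the singular values between both factors
    mp_vecs = U * scale                # one row per MP
    motion_vecs = Vt.T * scale         # one row per motion
    return mp_vecs, motion_vecs, s
```

With this scaling, `mp_vecs @ motion_vecs.T` is the rank-`k` approximation of the vote matrix, so an MP and a motion with a large positive dot product correspond to a predicted "yes".
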
### 2. Text embeddings (Qwen3-0.6B)

- Embed each motion's text using Qwen3-0.6B (multilingual, Dutch supported)
- Encodes semantic/policy topic of the motion
- Use a task instruction in English, e.g. `"Retrieve semantically similar Dutch parliamentary motions"`

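
A hedged sketch of the embedding call via `sentence-transformers`. Several things here are assumptions rather than part of the plan: that the intended checkpoint is the embedding variant published as `Qwen/Qwen3-Embedding-0.6B`, that the instruction is passed in the `Instruct: ...\nQuery:` prefix style, and that the same instruction is applied to every motion because the comparison is symmetric. The helper names are hypothetical.

```python
import numpy as np

TASK = "Retrieve semantically similar Dutch parliamentary motions"


def build_prompt(task):
    # Instruction prefix in the "Instruct: ...\nQuery:" style (assumed format).
    return f"Instruct: {task}\nQuery:"


def load_model(name="Qwen/Qwen3-Embedding-0.6B"):
    # Imported lazily because it is a heavy dependency; the model id is an assumption.
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(name)


def embed_motions(texts, model, task=TASK):
    """Return one L2-normalised embedding row per motion text."""
    vecs = model.encode(list(texts),
                        prompt=build_prompt(task),
                        normalize_embeddings=True)
    return np.asarray(vecs, dtype=np.float32)
```

Usage would look like `embed_motions(motion_texts, load_model())`, where `motion_texts` is a list of Dutch motion descriptions.
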
### 3. Fusion

Concatenate (or take a weighted sum of) the SVD motion vector and the text vector to form a single motion embedding. MPs retain their SVD vectors only.

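
One way to realise the concatenation variant is sketched below. The weight `alpha` and the function names are illustrative assumptions. A plain weighted sum would require both vectors to have the same dimension, so this sketch uses a weighted concatenation instead, with each part L2-normalised first so that neither signal dominates purely because of its scale.

```python
import numpy as np


def l2_normalize(X, eps=1e-12):
    """Scale each row to unit length (rows of zeros are left as zeros)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, eps)


def fuse(svd_vecs, text_vecs, alpha=0.5):
    """Weighted concatenation of vote-based and text-based motion vectors.

    alpha weights the vote signal, (1 - alpha) the text signal.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if svd_vecs.shape[0] != text_vecs.shape[0]:
        raise ValueError("both inputs need one row per motion")
    a = np.sqrt(alpha) * l2_normalize(svd_vecs)
    b = np.sqrt(1.0 - alpha) * l2_normalize(text_vecs)
    return np.hstack([a, b])
```

Because of the square-root weights, the cosine similarity between two fused vectors equals `alpha` times the vote-space cosine plus `1 - alpha` times the text-space cosine, which makes the weight easy to interpret.
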
## Temporal Tracking

### Time windows

- Default: **quarterly** (flexible: can be per half-year or per N votes)
- Adaptive option: fixed number of votes per window (e.g. 200) for stable SVD regardless of parliamentary rhythm

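
Both windowing options can be sketched with pandas. The column names `motion_id` and `date` are hypothetical, and "N votes" is interpreted here as N motions put to a vote, which is an assumption about the intended meaning.

```python
import pandas as pd


def quarterly_windows(votes):
    """Label each vote row with its calendar quarter, e.g. 2023Q1."""
    return votes["date"].dt.to_period("Q")


def fixed_size_windows(votes, motions_per_window=200):
    """Label each vote row with a window index of N chronologically ordered motions."""
    # Date of each motion, oldest first (stable sort keeps ties in a fixed order).
    first_date = votes.groupby("motion_id")["date"].min().sort_values(kind="stable")
    # Chronological rank of every motion: 0, 1, 2, ...
    rank = pd.Series(range(len(first_date)), index=first_date.index)
    return votes["motion_id"].map(rank) // motions_per_window
```

Either label can then be used as a `groupby` key to build one vote matrix per window.
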
### Procrustes alignment

SVD axes are only determined up to sign flips and rotations, and each window is factorized separately, so vectors from different windows cannot be compared directly. Orthogonal Procrustes alignment finds the rotation (or reflection) that best maps one window's space onto the previous one, using MPs present in both windows as anchors.

```
R = argmin over orthogonal R of || W1[common] - W2[common] @ R ||_F
W2_aligned = W2 @ R # applied to all MPs, including newcomers
```

- Only overlapping MPs are needed to estimate R (their rows must be in the same order in both matrices)
- New MPs are placed into the aligned space via their voting pattern
- A high Procrustes disparity (the residual left after alignment) suggests a structural political shift, not just individual drift

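
The alignment can be sketched with `scipy.linalg.orthogonal_procrustes`, which returns the orthogonal matrix `R` directly. The function name `align_window`, the use of MP id lists to find the overlap, and the relative squared residual used as a disparity measure are assumptions for illustration; that residual is not the same quantity as the disparity reported by `scipy.spatial.procrustes`, which standardises both inputs first.

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes


def align_window(W1, ids1, W2, ids2):
    """Map window-2 MP vectors into window-1's space using shared MPs as anchors.

    W1, W2: arrays with one row per MP; ids1, ids2: MP ids in row order.
    """
    pos1 = {mp: i for i, mp in enumerate(ids1)}
    idx2 = [i for i, mp in enumerate(ids2) if mp in pos1]   # anchors in window 2
    idx1 = [pos1[ids2[i]] for i in idx2]                    # same MPs in window 1
    # R minimises || W2[idx2] @ R - W1[idx1] ||_F over orthogonal matrices.
    R, _ = orthogonal_procrustes(W2[idx2], W1[idx1])
    W2_aligned = W2 @ R                                     # newcomers included
    residual = np.linalg.norm(W1[idx1] - W2_aligned[idx2]) ** 2
    disparity = residual / np.linalg.norm(W1[idx1]) ** 2    # relative misfit
    return W2_aligned, R, disparity
```

Chaining windows means aligning window 2 to window 1, then window 3 to the already aligned window 2, and so on, so every window ends up in the coordinate system of the first.
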
### Election transitions

At term boundaries (~60% MP overlap), alignment is noisier. Mitigation: chain alignments via the last quarter of the old term and first quarter of the new term, using only returning MPs.

## Analysis

|Question|Method|
|--------|------|
|MP drift over time|trajectory of MP vector across aligned windows|
|Political axis|first SVD component, or defined by anchor parties (e.g. VVD vs SP)|
|Swing voters|MPs closest to the boundary between party clusters|
|Thematic clustering|UMAP on fused motion embeddings|
|Cross-party coalitions|motions where party cluster boundaries blur|
|Party cohesion|variance of MP vectors within a party per window|

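
Two of these rows, the anchor-party axis and party cohesion, can be sketched directly on aligned MP vectors. The function names, the midpoint-centred projection, and the use of total variance (summed over dimensions) as the cohesion measure are assumptions; the table does not fix these details.

```python
import numpy as np


def anchor_axis_scores(vecs, parties, neg_party="SP", pos_party="VVD"):
    """Project MPs onto the axis running between two anchor-party centroids.

    Scores are centred on the midpoint: negative values lie towards
    neg_party, positive values towards pos_party.
    """
    vecs = np.asarray(vecs, dtype=float)
    parties = np.asarray(parties)
    c_neg = vecs[parties == neg_party].mean(axis=0)
    c_pos = vecs[parties == pos_party].mean(axis=0)
    direction = (c_pos - c_neg) / np.linalg.norm(c_pos - c_neg)
    midpoint = (c_pos + c_neg) / 2.0
    return (vecs - midpoint) @ direction


def party_cohesion(vecs, parties):
    """Total within-party variance per party (lower means more cohesive)."""
    vecs = np.asarray(vecs, dtype=float)
    parties = np.asarray(parties)
    return {str(p): float(vecs[parties == p].var(axis=0).sum())
            for p in np.unique(parties)}
```

Running both per aligned window gives one score per MP and one cohesion value per party over time, which is the input the trajectory plots would need.
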
## Stack

|Component|Tool|
|---------|----|
|Matrix factorization|`scipy.sparse.linalg.svds`|
|Procrustes alignment|`scipy.linalg.orthogonal_procrustes` (returns the rotation R); `scipy.spatial.procrustes` (returns only the standardized matrices and a disparity score)|
|Text embeddings|Qwen3-0.6B via `sentence-transformers` or vLLM|
|Dimensionality reduction|UMAP|
|Visualization|Plotly (interactive trajectories)|
|Data handling|ibis / pandas|