Tweede Kamer Parliamentary Embedding Analysis

Goal

Track how MPs shift politically over time and map motions onto a meaningful ideological axis, by embedding both MPs and motions into a shared vector space.

Data

| Source | Content |
|---|---|
| MP × motion vote matrix | yes / no / abstain per MP per motion |
| Motion text | Dutch-language motion descriptions |
| MP metadata | name, party, entry/exit dates |
| Timestamps | date of each vote |

Approach: Late Fusion

Two independent embedding signals, combined per motion.

1. Vote embeddings (SVD)

  • Build a sparse MP × motion matrix per time window
  • Apply SVD to get latent vectors for both MPs and motions
  • Encodes political alignment from actual voting behavior
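The factorisation step could look roughly like the sketch below. The vote encoding (yes = +1, no = -1, abstain = 0), the function name `vote_embeddings`, and the choice to split the singular values evenly between MP and motion vectors are assumptions made for illustration; the text does not fix any of them.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds

# Assumed encoding (not specified in the text): yes = +1, no = -1, abstain = 0.
VOTE_CODE = {"yes": 1.0, "no": -1.0, "abstain": 0.0}

def vote_embeddings(mp_idx, motion_idx, votes, n_mps, n_motions, k=2):
    """Return (mp_vecs, motion_vecs), each of width k, for one time window."""
    values = np.array([VOTE_CODE[v] for v in votes])
    # Pairs (MP, motion) that are not listed stay empty, i.e. the MP did not vote.
    X = csr_matrix((values, (mp_idx, motion_idx)), shape=(n_mps, n_motions))
    U, s, Vt = svds(X, k=k)          # requires k < min(n_mps, n_motions)
    order = np.argsort(s)[::-1]      # put the strongest component first
    U, s, Vt = U[:, order], s[order], Vt[order]
    root = np.sqrt(s)
    # Splitting s symmetrically means mp_vecs @ motion_vecs.T approximates X.
    return U * root, Vt.T * root
```

The sign of each component is arbitrary, which is one reason the windows need alignment before they can be compared.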

2. Text embeddings (Qwen3-0.6B)

  • Embed each motion's text using Qwen3-0.6B (multilingual, Dutch supported)
  • Encodes semantic/policy topic of the motion
  • Use a task instruction in English, e.g. "Retrieve semantically similar Dutch parliamentary motions"
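A hedged sketch of the text-embedding step with `sentence-transformers`. The checkpoint name `Qwen/Qwen3-Embedding-0.6B` and the `Instruct: ...\nQuery:` prompt layout are assumptions, not taken from this document; check both against the model card before relying on them. The helper accepts any object with an `encode` method, so it can be tried without downloading a model.

```python
import numpy as np

TASK = "Retrieve semantically similar Dutch parliamentary motions"

def embed_motions(texts, model, task=TASK):
    """Embed motion texts with an English task instruction; returns an (n, d) array."""
    # Assumed prompt layout; the instruction is prepended to every text.
    prompt = f"Instruct: {task}\nQuery:"
    vecs = model.encode(list(texts), prompt=prompt, normalize_embeddings=True)
    return np.asarray(vecs, dtype=np.float32)

def load_model(name="Qwen/Qwen3-Embedding-0.6B"):
    """Load the encoder; the default checkpoint name is an assumption."""
    # Imported lazily so embed_motions can be used without this dependency.
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(name)
```

Typical use would be `embed_motions(motion_texts, load_model())`, where `motion_texts` is a hypothetical list of Dutch motion descriptions.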

3. Fusion

Concatenate the SVD motion vector and the text vector into a single motion embedding. A weighted sum is an alternative, but it requires first projecting both vectors to the same dimensionality. MPs retain their SVD vectors only.
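One way to sketch the concatenation variant: normalise each block so neither signal dominates because of its scale, then weight the blocks. The weight `w_vote` and the function names are hypothetical; the right balance between vote and text signal would need to be tuned.

```python
import numpy as np

def l2_normalize(X, eps=1e-12):
    """Scale each row to unit length (rows of zeros are left at zero)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, eps)

def fuse(vote_vecs, text_vecs, w_vote=0.5):
    """Concatenate normalised vote and text vectors per motion.

    w_vote is the share of squared norm given to the vote block, so each
    fused row has unit length and cosine similarity becomes
    w_vote * cos(vote) + (1 - w_vote) * cos(text).
    """
    v = l2_normalize(vote_vecs) * np.sqrt(w_vote)
    t = l2_normalize(text_vecs) * np.sqrt(1.0 - w_vote)
    return np.hstack([v, t])
```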

Temporal Tracking

Time windows

  • Default: quarterly (configurable; half-yearly or per-N-votes windows are alternatives)
  • Adaptive option: fixed number of votes per window (e.g. 200) for stable SVD regardless of parliamentary rhythm
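Both windowing schemes can be sketched with pandas. The function names are hypothetical, and the input is assumed to be a Series holding one vote date per motion.

```python
import numpy as np
import pandas as pd

def quarterly_windows(dates: pd.Series) -> pd.Series:
    """Label each vote date with its calendar quarter, e.g. '2023Q1'."""
    return pd.to_datetime(dates).dt.to_period("Q").astype(str)

def fixed_size_windows(dates: pd.Series, n_per_window: int = 200) -> pd.Series:
    """Number windows 0, 1, 2, ... so each holds n_per_window motions in date order."""
    order = np.argsort(pd.to_datetime(dates).to_numpy(), kind="stable")
    window = np.empty(len(dates), dtype=int)
    window[order] = np.arange(len(dates)) // n_per_window
    return pd.Series(window, index=dates.index)
```

With fixed-size windows the final window may hold fewer than `n_per_window` motions; merging it into the previous one is one option if the SVD becomes unstable.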

Procrustes alignment

SVD axes are arbitrary per window and cannot be compared directly. Procrustes alignment finds the optimal rotation mapping one window's space onto the previous, using overlapping MPs as anchors.

R = argmin_R || W1[common] - W2[common] @ R ||_F   subject to R.T @ R = I
W2_aligned = W2 @ R  # applied to all MPs, including newcomers

  • Only overlapping MPs are needed to estimate R
  • New MPs are placed into the aligned space via their voting pattern
  • A high Procrustes disparity (the residual left after alignment) suggests a structural political shift rather than individual drift alone
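A minimal sketch of the alignment step. It uses `scipy.linalg.orthogonal_procrustes` rather than the `scipy.spatial.procrustes` listed under Stack, because the former returns the rotation `R` explicitly, which is needed to place newcomers. The relative residual returned here is a stand-in for the disparity score, not the same quantity that `scipy.spatial.procrustes` reports; the function and argument names are hypothetical.

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes

def align_window(W1, W2, common1, common2):
    """Rotate window-2 MP vectors into window-1's space.

    common1[i] and common2[i] are the row indices of the same MP in W1 and W2.
    Returns (W2_aligned, R, relative_residual).
    """
    A = W2[common2]                      # anchors in the new window
    B = W1[common1]                      # the same MPs in the reference window
    R, _ = orthogonal_procrustes(A, B)   # minimises ||A @ R - B||_F over orthogonal R
    W2_aligned = W2 @ R                  # applied to all MPs, including newcomers
    residual = np.linalg.norm(W2_aligned[common2] - B) / np.linalg.norm(B)
    return W2_aligned, R, residual
```

Because `R` is orthogonal, distances between MPs within a window are preserved; only the orientation of the axes changes.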

Election transitions

At term boundaries only about 60% of MPs carry over, so alignment is noisier. Mitigation: chain the alignment through the last quarter of the old term and the first quarter of the new term, using only returning MPs as anchors.
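Chaining can be sketched as aligning each window to the already-aligned previous one, with the anchors being whichever MPs appear in both. The sketch assumes each window is a DataFrame of MP vectors indexed by a stable MP identifier; `chain_align` and the `min_anchors` threshold are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy.linalg import orthogonal_procrustes

def chain_align(windows, min_anchors=3):
    """Align a chronological list of MP-vector frames into the first window's space."""
    aligned = [windows[0]]
    for cur in windows[1:]:
        prev = aligned[-1]
        # Returning MPs: present in both the previous and the current window.
        common = prev.index.intersection(cur.index)
        if len(common) < min_anchors:
            raise ValueError("too few returning MPs to estimate a rotation")
        R, _ = orthogonal_procrustes(cur.loc[common].to_numpy(),
                                     prev.loc[common].to_numpy())
        aligned.append(pd.DataFrame(cur.to_numpy() @ R,
                                    index=cur.index, columns=cur.columns))
    return aligned
```

Errors accumulate along the chain, so the residual at an election boundary is worth inspecting before trusting trajectories that cross it.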

Analysis

| Question | Method |
|---|---|
| MP drift over time | trajectory of MP vector across aligned windows |
| Political axis | first SVD component, or defined by anchor parties (e.g. VVD vs SP) |
| Swing voters | MPs closest to the boundary between party clusters |
| Thematic clustering | UMAP on fused motion embeddings |
| Cross-party coalitions | motions where party cluster boundaries blur |
| Party cohesion | variance of MP vectors within a party per window |
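Two of these methods, the anchor-party axis and party cohesion, reduce to a few lines of NumPy. The sketch below assumes `vecs` is an (n_mps, k) array of aligned MP vectors for one window and `parties` is a NumPy array of party labels in the same row order; the function names are hypothetical.

```python
import numpy as np

def anchor_axis(vecs, parties, pos="VVD", neg="SP"):
    """Unit vector pointing from the neg-party centroid to the pos-party centroid."""
    d = vecs[parties == pos].mean(axis=0) - vecs[parties == neg].mean(axis=0)
    return d / np.linalg.norm(d)

def axis_scores(vecs, axis):
    """Project every MP onto the axis; higher means closer to the pos party."""
    return vecs @ axis

def party_cohesion(vecs, parties):
    """Mean squared distance to the party centroid (lower = more cohesive)."""
    out = {}
    for p in np.unique(parties):
        members = vecs[parties == p]
        out[str(p)] = float(((members - members.mean(axis=0)) ** 2).sum(axis=1).mean())
    return out
```

Tracking `axis_scores` for one MP across aligned windows gives a one-dimensional drift trajectory.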

Stack

| Component | Tool |
|---|---|
| Matrix factorization | |
| Procrustes alignment | `scipy.spatial.procrustes` |
| Text embeddings | Qwen3-0.6B via `sentence-transformers` or vLLM |
| Dimensionality reduction | UMAP |
| Visualization | Plotly (interactive trajectories) |
| Data handling | ibis / pandas |