- Add script to find motions closest to semantic gravity per axis/window - Document Axis 1 semantic shift: from administrative law (2016) to migration/asylum policy (2026) - Shows that 'coalition' votes on different topics over timemain
parent
67eda93cb1
commit
2c46e21acc
@ -0,0 +1,68 @@ |
|||||||
|
# Semantic Content Shift: Axis 1 Over Time |
||||||
|
|
||||||
|
## What Changed: "Coalition vs Opposition" Axis Content |
||||||
|
|
||||||
|
| Year | Positive Pole (Coalition) | Negative Pole (Opposition) | Key Theme | |
||||||
|
|------|-------------------------|---------------------------|-----------| |
||||||
|
| **2016** | Tax law changes, international treaties | — | Administrative law | |
||||||
|
| **2018** | Budget modifications, infrastructure, social affairs | — | Government spending | |
||||||
|
| **2019** | Working conditions, monitoring issues | — | Administrative oversight | |
||||||
|
| **2022** | Local government info, digital accounts | Digital governance, privacy | Digital transformation | |
||||||
|
| **2023** | Welfare policy, parental support | Social services | Social policy | |
||||||
|
| **2024** | Nuclear weapons, housing, Israel boycott | — | Foreign policy / Justice | |
||||||
|
| **2025** | EU sanctions on Israel, asylum policies | — | Migration / Foreign affairs | |
||||||
|
| **2026** | **Asylum stops**, Syrian permit revocations, Ukraine returns | IND backlog | **Migration dominates** | |
||||||
|
|
||||||
|
## Key Observations |
||||||
|
|
||||||
|
### 1. The "Coalition" Side Evolved Significantly |
||||||
|
|
||||||
|
| Period | Coalition Motions Focused On | |
||||||
|
|--------|---------------------------| |
||||||
|
| 2016-2019 | Administrative law, tax, budgets, infrastructure | |
||||||
|
| 2022-2023 | Digital governance, welfare, social services | |
||||||
|
| 2024-2025 | Foreign policy (Israel sanctions), migration | |
||||||
|
| **2026** | **Asylum restriction**, Syria, Ukraine returns | |
||||||
|
|
||||||
|
### 2. Axis 1 Became Migration-Centric by 2026 |
||||||
|
|
||||||
|
In 2026, the **extreme positive motions** are ALL about asylum/migration: |
||||||
|
- "Motie van het lid Vondeling over een totale asielstop" (total asylum stop) |
||||||
|
- "Motie van het lid Vondeling over alle tijdelijke asielvergunningen van Syriërs intrekken" (revoke Syrian permits) |
||||||
|
- "Motie van het lid Vondeling over een actief terugkeerbeleid voor alle Oekraïners" (active return policy for Ukrainians) |
||||||
|
|
||||||
|
This suggests the coalition/opposition dynamic in 2026 is increasingly defined by **migration policy** rather than the traditional left-right economic divide. |
||||||
|
|
||||||
|
### 3. The "Typical" Motion Changed |
||||||
|
|
||||||
|
Semantic gravity represents the "typical" motion on the axis. Its content shifted: |
||||||
|
|
||||||
|
| Year | Typical Motion Theme | |
||||||
|
|------|---------------------| |
||||||
|
| 2016 | Tax law, health law, financial administration | |
||||||
|
| 2019 | Bureaucracy reduction, Kamer control, administrative burden | |
||||||
|
| 2023 | Student finance, volunteer work, housing | |
||||||
|
| 2024 | Fossil fuel phase-out, whistleblower protection, youth care | |
||||||
|
| 2026 | Asylum, IND backlog, Ukraine, social grievances | |
||||||
|
|
||||||
|
## Implications |
||||||
|
|
||||||
|
1. **Axis label is temporally bounded**: "Rechts kabinetsbeleid versus links oppositiebeleid" works for 2016-2026 as a whole, but in 2026 it's increasingly about migration policy. |
||||||
|
|
||||||
|
2. **Party voting structure is stable** (0.83 stability), but **what parties vote on** has shifted from economics to migration. |
||||||
|
|
||||||
|
3. **Axis 6 (Migration/Culture)** low stability (0.35) may now be overlapping with Axis 1 — migration has become a coalition-defining issue. |
||||||
|
|
||||||
|
## Example: Concrete Before/After |
||||||
|
|
||||||
|
**2016 - "Coalition" side:** |
||||||
|
> "Wijziging van enkele belastingwetten en enige andere wetten (Fiscale vereenvoudigingswet 2017)" |
||||||
|
|
||||||
|
**2026 - "Coalition" side:** |
||||||
|
> "Motie van het lid Vondeling over een totale asielstop" |
||||||
|
|
||||||
|
Same axis (coalition votes FOR), but semantically completely different topics. |
||||||
|
|
||||||
|
--- |
||||||
|
|
||||||
|
*Generated by `scripts/semantic_gravity_examples.py`* |
||||||
@ -0,0 +1,286 @@ |
|||||||
|
"""semantic_gravity_examples.py — Show concrete motion examples for SVD axes across windows. |
||||||
|
|
||||||
|
For each axis and window, finds motions closest to the semantic gravity vector, |
||||||
|
providing concrete examples of what the axis "means" in that period. |
||||||
|
|
||||||
|
Usage: |
||||||
|
uv run python scripts/semantic_gravity_examples.py --db data/motions.db --axis 1 |
||||||
|
uv run python scripts/semantic_gravity_examples.py --db data/motions.db --all |
||||||
|
""" |
||||||
|
|
||||||
|
from __future__ import annotations |
||||||
|
|
||||||
|
import argparse |
||||||
|
import json |
||||||
|
import os |
||||||
|
import sys |
||||||
|
from typing import Dict, List, Tuple |
||||||
|
|
||||||
|
import duckdb |
||||||
|
import numpy as np |
||||||
|
|
||||||
|
|
||||||
|
def _load_fused_embeddings_with_titles( |
||||||
|
con: duckdb.DuckDBPyConnection, window_id: str |
||||||
|
) -> List[Tuple[int, np.ndarray, str]]: |
||||||
|
"""Load fused embeddings with motion titles for a window.""" |
||||||
|
rows = con.execute( |
||||||
|
""" |
||||||
|
SELECT f.motion_id, f.vector, m.title |
||||||
|
FROM fused_embeddings f |
||||||
|
JOIN motions m ON f.motion_id = m.id |
||||||
|
WHERE f.window_id = ? |
||||||
|
""", |
||||||
|
[window_id], |
||||||
|
).fetchall() |
||||||
|
|
||||||
|
result = [] |
||||||
|
for motion_id, raw_vec, title in rows: |
||||||
|
if isinstance(raw_vec, str): |
||||||
|
vec = json.loads(raw_vec) |
||||||
|
elif isinstance(raw_vec, (bytes, bytearray)): |
||||||
|
vec = json.loads(raw_vec.decode()) |
||||||
|
elif isinstance(raw_vec, list): |
||||||
|
vec = raw_vec |
||||||
|
else: |
||||||
|
vec = list(raw_vec) |
||||||
|
result.append( |
||||||
|
( |
||||||
|
motion_id, |
||||||
|
np.array([float(v) if v is not None else 0.0 for v in vec]), |
||||||
|
title or "", |
||||||
|
) |
||||||
|
) |
||||||
|
return result |
||||||
|
|
||||||
|
|
||||||
|
def _load_motion_scores( |
||||||
|
con: duckdb.DuckDBPyConnection, window_id: str |
||||||
|
) -> Dict[int, np.ndarray]: |
||||||
|
"""Load SVD scores for a window. Returns {motion_id: score_array}.""" |
||||||
|
rows = con.execute( |
||||||
|
"SELECT entity_id, vector FROM svd_vectors WHERE window_id = ? AND entity_type = 'motion'", |
||||||
|
[window_id], |
||||||
|
).fetchall() |
||||||
|
|
||||||
|
result = {} |
||||||
|
for entity_id, raw_vec in rows: |
||||||
|
if isinstance(raw_vec, str): |
||||||
|
vec = json.loads(raw_vec) |
||||||
|
elif isinstance(raw_vec, (bytes, bytearray)): |
||||||
|
vec = json.loads(raw_vec.decode()) |
||||||
|
elif isinstance(raw_vec, list): |
||||||
|
vec = raw_vec |
||||||
|
else: |
||||||
|
vec = list(raw_vec) |
||||||
|
result[int(entity_id)] = np.array( |
||||||
|
[float(v) if v is not None else 0.0 for v in vec] |
||||||
|
) |
||||||
|
return result |
||||||
|
|
||||||
|
|
||||||
|
def compute_semantic_gravity_examples( |
||||||
|
con: duckdb.DuckDBPyConnection, |
||||||
|
windows: List[str], |
||||||
|
axis: int, |
||||||
|
n_examples: int = 5, |
||||||
|
n_components: int = 10, |
||||||
|
) -> Dict: |
||||||
|
"""Find motions closest to semantic gravity for an axis across windows.""" |
||||||
|
comp_idx = axis - 1 |
||||||
|
results = {} |
||||||
|
|
||||||
|
for w in windows: |
||||||
|
# Load data |
||||||
|
motion_scores = _load_motion_scores(con, w) |
||||||
|
embeddings_data = _load_fused_embeddings_with_titles(con, w) |
||||||
|
|
||||||
|
if not motion_scores or not embeddings_data: |
||||||
|
continue |
||||||
|
|
||||||
|
# Build motion_id -> embedding mapping |
||||||
|
embeddings_by_id = {mid: (vec, title) for mid, vec, title in embeddings_data} |
||||||
|
|
||||||
|
# Find common motions |
||||||
|
common = [m for m in motion_scores if m in embeddings_by_id] |
||||||
|
if len(common) < 10: |
||||||
|
continue |
||||||
|
|
||||||
|
# Compute semantic gravity (weighted mean by absolute SVD score on this axis) |
||||||
|
valid_embeddings = [] |
||||||
|
weights = [] |
||||||
|
for m_id in common: |
||||||
|
scores = motion_scores[m_id] |
||||||
|
if comp_idx < len(scores): |
||||||
|
valid_embeddings.append(embeddings_by_id[m_id][0]) |
||||||
|
weights.append(abs(scores[comp_idx])) |
||||||
|
|
||||||
|
if not valid_embeddings or sum(weights) == 0: |
||||||
|
continue |
||||||
|
|
||||||
|
# Align dimensions |
||||||
|
dim = min(len(v) for v in valid_embeddings) |
||||||
|
vectors = np.array([v[:dim] for v in valid_embeddings]) |
||||||
|
weights = np.array(weights[: len(vectors)]) |
||||||
|
gravity = np.average(vectors, axis=0, weights=weights) |
||||||
|
|
||||||
|
# Find motions closest to gravity (highest cosine similarity) |
||||||
|
similarities = [] |
||||||
|
for m_id in common: |
||||||
|
vec, title = embeddings_by_id[m_id] |
||||||
|
vec = vec[:dim] |
||||||
|
norm_g = np.linalg.norm(gravity) |
||||||
|
norm_v = np.linalg.norm(vec) |
||||||
|
if norm_g > 0 and norm_v > 0: |
||||||
|
sim = np.dot(gravity, vec) / (norm_g * norm_v) |
||||||
|
similarities.append((sim, m_id, title)) |
||||||
|
|
||||||
|
# Sort by similarity and get top examples |
||||||
|
similarities.sort(reverse=True) |
||||||
|
top_positive = [s for s in similarities if s[0] > 0][:n_examples] |
||||||
|
top_negative = [s for s in similarities if s[0] < 0][-n_examples:][::-1] |
||||||
|
|
||||||
|
# Get extreme motions (highest absolute loading on this axis) |
||||||
|
extreme = sorted( |
||||||
|
common, key=lambda m: abs(motion_scores[m][comp_idx]), reverse=True |
||||||
|
)[:n_examples] |
||||||
|
extreme_motions = [] |
||||||
|
for m_id in extreme: |
||||||
|
score = motion_scores[m_id][comp_idx] |
||||||
|
title = embeddings_by_id.get(m_id, (None, ""))[1] |
||||||
|
extreme_motions.append((score, m_id, title)) |
||||||
|
|
||||||
|
results[w] = { |
||||||
|
"gravity": gravity, |
||||||
|
"top_similar": top_positive, |
||||||
|
"top_dissimilar": top_negative, |
||||||
|
"extreme": extreme_motions, |
||||||
|
} |
||||||
|
|
||||||
|
return results |
||||||
|
|
||||||
|
|
||||||
|
def _get_annual_windows(con: duckdb.DuckDBPyConnection) -> List[str]: |
||||||
|
"""Get list of annual windows that have fused embeddings, sorted by year.""" |
||||||
|
rows = con.execute( |
||||||
|
""" |
||||||
|
SELECT DISTINCT f.window_id |
||||||
|
FROM fused_embeddings f |
||||||
|
JOIN svd_vectors s ON f.window_id = s.window_id AND s.entity_type = 'motion' |
||||||
|
WHERE f.window_id NOT LIKE '%-Q%' |
||||||
|
ORDER BY f.window_id |
||||||
|
""" |
||||||
|
).fetchall() |
||||||
|
return [r[0] for r in rows] |
||||||
|
|
||||||
|
|
||||||
|
def format_results(results: Dict, axis: int) -> str: |
||||||
|
"""Format results as markdown.""" |
||||||
|
lines = [ |
||||||
|
f"# Semantic Gravity Examples for Axis {axis}", |
||||||
|
"", |
||||||
|
f"Shows motions closest to semantic gravity (weighted mean embedding) for each window.", |
||||||
|
"This represents the 'typical' motion on this axis.", |
||||||
|
"", |
||||||
|
"---", |
||||||
|
"", |
||||||
|
] |
||||||
|
|
||||||
|
for window in sorted(results.keys()): |
||||||
|
data = results[window] |
||||||
|
gravity = data["gravity"] |
||||||
|
|
||||||
|
lines.append(f"## {window}") |
||||||
|
lines.append("") |
||||||
|
|
||||||
|
# Positive-pole extreme motions |
||||||
|
lines.append("### Extreme Positive Motions (high positive loading)") |
||||||
|
for score, m_id, title in data["extreme"]: |
||||||
|
if score > 0: |
||||||
|
lines.append( |
||||||
|
f"- **[{score:+.3f}]** {title[:100]}{'...' if len(title) > 100 else ''}" |
||||||
|
) |
||||||
|
lines.append("") |
||||||
|
|
||||||
|
# Negative-pole extreme motions |
||||||
|
lines.append("### Extreme Negative Motions (high negative loading)") |
||||||
|
for score, m_id, title in data["extreme"]: |
||||||
|
if score < 0: |
||||||
|
lines.append( |
||||||
|
f"- **[{score:+.3f}]** {title[:100]}{'...' if len(title) > 100 else ''}" |
||||||
|
) |
||||||
|
lines.append("") |
||||||
|
|
||||||
|
# Motions closest to semantic gravity |
||||||
|
lines.append("### Most Representative Motions (closest to semantic gravity)") |
||||||
|
for sim, m_id, title in data["top_similar"]: |
||||||
|
lines.append( |
||||||
|
f"- **[{sim:.3f}]** {title[:100]}{'...' if len(title) > 100 else ''}" |
||||||
|
) |
||||||
|
lines.append("") |
||||||
|
|
||||||
|
return "\n".join(lines) |
||||||
|
|
||||||
|
|
||||||
|
def main(argv: List[str] | None = None) -> int: |
||||||
|
p = argparse.ArgumentParser( |
||||||
|
description="Find semantic gravity examples for SVD axes" |
||||||
|
) |
||||||
|
p.add_argument("--db", default="data/motions.db", help="Path to motions database") |
||||||
|
p.add_argument("--axis", type=int, default=1, help="SVD axis to analyze (1-10)") |
||||||
|
p.add_argument( |
||||||
|
"--windows", nargs="+", help="Specific windows (default: all annual windows)" |
||||||
|
) |
||||||
|
p.add_argument( |
||||||
|
"--n-examples", |
||||||
|
type=int, |
||||||
|
default=5, |
||||||
|
help="Number of example motions per category", |
||||||
|
) |
||||||
|
p.add_argument("--output", help="Output file (default: print to stdout)") |
||||||
|
|
||||||
|
args = p.parse_args(argv) |
||||||
|
|
||||||
|
if not os.path.exists(args.db): |
||||||
|
print(f"Error: Database not found: {args.db}", file=sys.stderr) |
||||||
|
return 1 |
||||||
|
|
||||||
|
con = duckdb.connect(database=args.db, read_only=True) |
||||||
|
try: |
||||||
|
# Determine windows |
||||||
|
if args.windows: |
||||||
|
windows = args.windows |
||||||
|
else: |
||||||
|
windows = _get_annual_windows(con) |
||||||
|
print(f"Found {len(windows)} annual windows: {windows}", file=sys.stderr) |
||||||
|
|
||||||
|
if len(windows) < 2: |
||||||
|
print("Need at least 2 windows for analysis", file=sys.stderr) |
||||||
|
return 1 |
||||||
|
|
||||||
|
# Run analysis |
||||||
|
print( |
||||||
|
f"Computing semantic gravity examples for Axis {args.axis}...", |
||||||
|
file=sys.stderr, |
||||||
|
) |
||||||
|
results = compute_semantic_gravity_examples( |
||||||
|
con, windows, args.axis, args.n_examples |
||||||
|
) |
||||||
|
|
||||||
|
# Format output |
||||||
|
output = format_results(results, args.axis) |
||||||
|
|
||||||
|
if args.output: |
||||||
|
with open(args.output, "w") as f: |
||||||
|
f.write(output) |
||||||
|
print(f"Results written to {args.output}", file=sys.stderr) |
||||||
|
else: |
||||||
|
print(output) |
||||||
|
|
||||||
|
return 0 |
||||||
|
finally: |
||||||
|
con.close() |
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__": |
||||||
|
raise SystemExit(main()) |
||||||
Loading…
Reference in new issue