# Motion-Driven Axis Labeling Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Replace static ideology-CSV axis labeling with motion-projection-based labeling, add an axis swap when the Y axis ends up as "Links–Rechts", and expose the top motions per axis to the user.

**Architecture:** `political_axis.py` exposes `global_mean` in the `axes` dict; `axis_classifier.py` gains motion-loading helpers and a keyword classifier as the primary label source (falling back to the existing Pearson-r); `explorer.py` swaps axes when needed and renders a new expander showing the top motions.

**Tech Stack:** Python, NumPy, DuckDB, Streamlit, pytest (no new dependencies)

---

## File Map

| File | Change |
|---|---|
| `analysis/political_axis.py` | Add `axes["global_mean"] = global_mean` (one line) |
| `analysis/axis_classifier.py` | Add `_KEYWORDS`, motion helpers, restructure `classify_axes` |
| `explorer.py` | Add `_swap_axes`, `_should_swap_axes`, wire swap, add motion expander |
| `tests/test_political_compass.py` | Add 5 new unit tests |

---
## Task 1: Expose `global_mean` from `compute_2d_axes`

**Files:**
- Modify: `analysis/political_axis.py` (line 362)

- [ ] **Step 1: Write the failing test**

Add this test at the bottom of `tests/test_political_compass.py` (it assumes the test file already imports `sys`, `types`, and `numpy as np` at module level):

```python
def test_compute_2d_axes_exposes_global_mean(monkeypatch):
    """axes dict returned by compute_2d_axes must contain 'global_mean'."""
    fake_traj = types.SimpleNamespace()
    fake_traj._load_window_ids = lambda db: ["w1"]
    aligned = {
        "w1": {
            "Alice": np.array([1.0, 0.0, 0.0]),
            "Bob": np.array([-1.0, 0.5, 0.0]),
        }
    }
    fake_traj._load_mp_vectors_for_window = lambda db, w: aligned.get(w, {})
    fake_traj._procrustes_align_windows = lambda x: aligned
    monkeypatch.setitem(sys.modules, "analysis.trajectory", fake_traj)

    from analysis.political_axis import compute_2d_axes

    _, axis_def = compute_2d_axes(db_path="dummy", window_ids=["w1"], method="pca")
    assert "global_mean" in axis_def
    assert isinstance(axis_def["global_mean"], np.ndarray)
```

- [ ] **Step 2: Run test to verify it fails**

```bash
pytest tests/test_political_compass.py::test_compute_2d_axes_exposes_global_mean -v
```

Expected: FAIL — `AssertionError: assert 'global_mean' in {…}` (key not yet present)

- [ ] **Step 3: Add `global_mean` to axes dict in `political_axis.py`**

In `analysis/political_axis.py`, the code at ~line 362 reads:

```python
global_mean = M.mean(axis=0)
positions_by_window: Dict[str, Dict[str, Tuple[float, float]]] = {
```

Add `axes["global_mean"] = global_mean` immediately after that assignment:

```python
global_mean = M.mean(axis=0)
axes["global_mean"] = global_mean
positions_by_window: Dict[str, Dict[str, Tuple[float, float]]] = {
```

- [ ] **Step 4: Run test to verify it passes**

```bash
pytest tests/test_political_compass.py::test_compute_2d_axes_exposes_global_mean -v
```

Expected: PASS

- [ ] **Step 5: Run full test suite to confirm no regressions**

```bash
pytest tests/test_political_compass.py -v
```

Expected: all previously passing tests still pass + the new test passes.

- [ ] **Step 6: Commit**

```bash
git add analysis/political_axis.py tests/test_political_compass.py
git commit -m "feat: expose global_mean in compute_2d_axes axes dict"
```
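Why expose `global_mean`? Motions must be centered by the same mean that centered the MP vectors before projecting, otherwise their compass coordinates shift by a constant offset. A toy illustration with made-up numbers (not the module's data):

```python
import numpy as np

x_axis = np.array([1.0, 0.0])       # first PCA direction (illustrative)
mp_mean = np.array([0.5, 0.0])      # mean used when fitting the axes
motion = np.array([2.0, 0.0])       # a new motion vector to place on the map

correct = float((motion - mp_mean) @ x_axis)  # centered by the shared mean
naive = float(motion @ x_axis)                # uncentered: offset by the mean's projection

print(correct, naive)  # 1.5 2.0
```

Without the stored mean, every motion would be displaced by the mean's projection onto each axis relative to the MP positions.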
---
## Task 2: Add keyword classifier helper `_classify_from_titles`

**Files:**
- Modify: `analysis/axis_classifier.py`
- Test: `tests/test_political_compass.py`

- [ ] **Step 1: Write the three failing tests**

Add to `tests/test_political_compass.py`:

```python
def test_classify_from_titles_left_right():
    """Titles dominated by left-right keywords → 'Links–Rechts'."""
    from analysis.axis_classifier import _classify_from_titles

    titles = [
        "Motie over asielbeleid",
        "Motie over minimumloon verhoging",
        "Motie over vluchtelingen opvang",
        "Motie over belastingverlaging",
        "Motie over bijstandsuitkering",
    ]
    label, confidence = _classify_from_titles(titles)
    assert label == "Links\u2013Rechts"
    assert confidence >= 0.4


def test_classify_from_titles_progressive():
    """Titles dominated by progressive/conservative keywords → 'Progressief–Conservatief'."""
    from analysis.axis_classifier import _classify_from_titles

    titles = [
        "Motie over klimaatdoelstellingen",
        "Motie over stikstofbeleid",
        "Motie over duurzame energie",
        "Motie over co2 uitstoot",
        "Motie over energietransitie",
    ]
    label, confidence = _classify_from_titles(titles)
    assert label == "Progressief\u2013Conservatief"
    assert confidence >= 0.4


def test_classify_from_titles_low_confidence():
    """Mixed/irrelevant titles → None (fallback triggered)."""
    from analysis.axis_classifier import _classify_from_titles

    titles = [
        "Motie over sportsubsidie",
        "Motie over bibliotheekregeling",
        "Motie over verkeersveiligheid",
    ]
    label, confidence = _classify_from_titles(titles)
    assert label is None
    assert confidence < 0.4
```

- [ ] **Step 2: Run tests to verify they fail**

```bash
pytest tests/test_political_compass.py::test_classify_from_titles_left_right tests/test_political_compass.py::test_classify_from_titles_progressive tests/test_political_compass.py::test_classify_from_titles_low_confidence -v
```

Expected: FAIL — `ImportError: cannot import name '_classify_from_titles'`

- [ ] **Step 3: Add `_KEYWORDS` constant and `_classify_from_titles` to `axis_classifier.py`**

Add after the `_INTERPRETATION_TEMPLATES` block (after line 42) and before `_load_ideology`:

```python
_KEYWORD_THRESHOLD = 0.4

_KEYWORDS: Dict[str, List[str]] = {
    "Links\u2013Rechts": [
        # economic
        "belasting",
        "uitkering",
        "bijstand",
        "minimumloon",
        "cao",
        "vakbond",
        "bezuiniging",
        "privatisering",
        "subsidie",
        "pensioen",
        "aow",
        "zorg",
        # immigration
        "asiel",
        "asielaanvraag",
        "migratie",
        "vreemdeling",
        "vluchtelingen",
        "terugkeer",
        "grenzen",
        "opvang",
        "statushouder",
    ],
    "Progressief\u2013Conservatief": [
        # environment
        "klimaat",
        "stikstof",
        "duurzaam",
        "duurzaamheid",
        "co2",
        "energietransitie",
        "biodiversiteit",
        # social
        "euthanasie",
        "abortus",
        "lgbtq",
        "transgender",
        "diversiteit",
        "traditi",
        "gezin",
        "religie",
        "geloof",
    ],
    "Nationaal\u2013Internationaal": [
        "navo",
        "nato",
        "europees",
        "europese",
        # " eu " and " vn " are space-padded so they only match as separate
        # words (note: this misses titles that start or end with them)
        " eu ",
        "verdrag",
        " vn ",
        "internationaal",
    ],
}


def _classify_from_titles(titles: List[str]) -> Tuple[Optional[str], float]:
    """Classify a list of motion titles into an axis category using keyword matching.

    Returns (category_label, confidence) where confidence = fraction of titles
    containing at least one keyword from the winning category.
    Returns (None, confidence) if confidence is below _KEYWORD_THRESHOLD.
    """
    if not titles:
        return None, 0.0

    counts: Dict[str, int] = {cat: 0 for cat in _KEYWORDS}
    for title in titles:
        lower = title.lower()
        for cat, keywords in _KEYWORDS.items():
            if any(kw in lower for kw in keywords):
                counts[cat] += 1

    best_cat = max(counts, key=lambda c: counts[c])
    best_count = counts[best_cat]
    confidence = best_count / len(titles)

    if confidence < _KEYWORD_THRESHOLD:
        return None, confidence

    return best_cat, confidence
```

- [ ] **Step 4: Run the three tests to verify they pass**

```bash
pytest tests/test_political_compass.py::test_classify_from_titles_left_right tests/test_political_compass.py::test_classify_from_titles_progressive tests/test_political_compass.py::test_classify_from_titles_low_confidence -v
```

Expected: all 3 PASS

- [ ] **Step 5: Run full suite to confirm no regressions**

```bash
pytest tests/test_political_compass.py -v
```

- [ ] **Step 6: Commit**

```bash
git add analysis/axis_classifier.py tests/test_political_compass.py
git commit -m "feat: add _classify_from_titles keyword classifier to axis_classifier"
```
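The confidence rule is simply "fraction of titles that hit at least one keyword of the winning category". A minimal standalone sketch of that rule (abridged keyword table and hypothetical names, not the module's API):

```python
from typing import Dict, List, Optional, Tuple

# Abridged keyword table for illustration only; the real _KEYWORDS is larger.
KEYWORDS: Dict[str, List[str]] = {
    "Links\u2013Rechts": ["belasting", "asiel", "minimumloon"],
    "Progressief\u2013Conservatief": ["klimaat", "stikstof", "duurzaam"],
}
THRESHOLD = 0.4


def classify(titles: List[str]) -> Tuple[Optional[str], float]:
    if not titles:
        return None, 0.0
    counts = {cat: 0 for cat in KEYWORDS}
    for title in titles:
        lower = title.lower()
        for cat, kws in KEYWORDS.items():
            if any(kw in lower for kw in kws):
                counts[cat] += 1
    # Ties resolve to whichever category is listed first in the dict.
    best = max(counts, key=counts.get)
    confidence = counts[best] / len(titles)
    return (best, confidence) if confidence >= THRESHOLD else (None, confidence)


# 2 of 3 titles hit "Links–Rechts" keywords → confidence 2/3, above threshold.
print(classify(["Motie over asielbeleid", "Motie over belastingdruk", "Motie over sport"]))
```

Note the substring matching: "belastingdruk" counts because it contains "belasting", which is why some real keywords are stems (e.g. "traditi") or space-padded (e.g. " eu ").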
---
## Task 3: Add motion-loading helpers to `axis_classifier.py`

**Files:**
- Modify: `analysis/axis_classifier.py`

These helpers have DB dependencies, so they don't get new unit tests here — they are exercised indirectly once `classify_axes` is wired up. Error handling is the main concern.

- [ ] **Step 1: Add `import json` at the top of `axis_classifier.py`**

After `import numpy as np` (line 12), add:

```python
import json
```

- [ ] **Step 2: Add the four motion helpers after `_classify_from_titles`**

```python
def _load_motion_vectors(db_path: str, window_id: str) -> Dict[int, np.ndarray]:
    """Load SVD motion vectors for a given window from DuckDB.

    Returns {motion_id: vector_array}. Returns {} on any error.
    """
    try:
        import duckdb

        conn = duckdb.connect(db_path, read_only=True)
        rows = conn.execute(
            "SELECT entity_id, vector FROM svd_vectors "
            "WHERE entity_type = 'motion' AND window_id = ?",
            [window_id],
        ).fetchall()
        conn.close()
        result = {}
        for entity_id, vector_raw in rows:
            try:
                mid = int(entity_id)
                vec = np.array(json.loads(vector_raw), dtype=float)
                result[mid] = vec
            except Exception:
                continue
        return result
    except Exception as exc:
        _logger.debug("Failed to load motion vectors for window %s: %s", window_id, exc)
        return {}


def _project_motions(
    motion_vecs: Dict[int, np.ndarray],
    x_axis: np.ndarray,
    y_axis: np.ndarray,
    global_mean: np.ndarray,
) -> Dict[int, Tuple[float, float]]:
    """Project motion vectors onto the PCA axes after centering by global_mean.

    Returns {motion_id: (x_score, y_score)}.
    """
    projections: Dict[int, Tuple[float, float]] = {}
    for mid, vec in motion_vecs.items():
        try:
            centered = vec - global_mean
            x_score = float(np.dot(centered, x_axis))
            y_score = float(np.dot(centered, y_axis))
            projections[mid] = (x_score, y_score)
        except Exception:
            continue
    return projections


def _top_motion_ids(
    projections: Dict[int, Tuple[float, float]],
    axis: str,
    n: int = 5,
) -> Dict[str, List[int]]:
    """Return the top-n motion IDs at each pole of the given axis.

    axis: 'x' or 'y'
    Returns {'+': [motion_ids], '-': [motion_ids]} (most positive first in the
    '+' list, most negative first in the '-' list).
    """
    idx = 0 if axis == "x" else 1
    sorted_ids = sorted(projections, key=lambda mid: projections[mid][idx])
    neg_ids = sorted_ids[:n]  # most negative
    pos_ids = sorted_ids[-n:][::-1]  # most positive
    return {"+": pos_ids, "-": neg_ids}


def _fetch_motion_titles(
    db_path: str,
    motion_ids: List[int],
) -> Dict[int, Tuple[str, str]]:
    """Fetch (title, date) for a list of motion IDs from DuckDB.

    Returns {motion_id: (title, date_str)}. Missing IDs are omitted.
    Returns {} on any DB error.
    """
    if not motion_ids:
        return {}
    try:
        import duckdb

        placeholders = ", ".join("?" * len(motion_ids))
        conn = duckdb.connect(db_path, read_only=True)
        rows = conn.execute(
            f"SELECT id, title, date FROM motions WHERE id IN ({placeholders})",
            motion_ids,
        ).fetchall()
        conn.close()
        return {int(row[0]): (str(row[1]), str(row[2])) for row in rows}
    except Exception as exc:
        _logger.debug("Failed to fetch motion titles: %s", exc)
        return {}
```

- [ ] **Step 3: Run full test suite to confirm nothing broke**

```bash
pytest tests/test_political_compass.py -v
```

Expected: all previously passing tests still pass.

- [ ] **Step 4: Commit**

```bash
git add analysis/axis_classifier.py
git commit -m "feat: add motion-loading helpers to axis_classifier"
```
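Taken together, the projection and ranking steps amount to: center each motion vector by the shared `global_mean`, dot it with each axis, and keep the extremes. A small NumPy sketch with made-up vectors (names are illustrative, not the module's API):

```python
import numpy as np

# Toy motion vectors in a 3-D SVD space, keyed by motion id.
motion_vecs = {1: np.array([2.0, 0.0, 0.0]),
               2: np.array([-1.0, 1.0, 0.0]),
               3: np.array([0.5, -2.0, 0.0])}
global_mean = np.array([0.5, 0.0, 0.0])   # same mean used to center MP vectors
x_axis = np.array([1.0, 0.0, 0.0])        # first PCA direction
y_axis = np.array([0.0, 1.0, 0.0])        # second PCA direction

# Center, then project onto both axes.
proj = {mid: (float((v - global_mean) @ x_axis), float((v - global_mean) @ y_axis))
        for mid, v in motion_vecs.items()}

# Rank along x: most negative first, most positive last.
order = sorted(proj, key=lambda mid: proj[mid][0])
neg_pole, pos_pole = order[0], order[-1]
print(proj)                # {1: (1.5, 0.0), 2: (-1.5, 1.0), 3: (0.0, -2.0)}
print(neg_pole, pos_pole)  # 2 1
```

One caveat worth keeping in mind: with fewer than `2 * n` motions in a window, the `+` and `-` pole lists from `_top_motion_ids` will share IDs.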
---
## Task 4: Restructure `classify_axes` to use motion projection as primary

**Files:**
- Modify: `analysis/axis_classifier.py`

- [ ] **Step 1: Replace the body of `classify_axes`**

Replace the entire function (lines 180–269 in the current file) with the version below.
Key changes from the old version:
- Remove the `if not ideology: return axes` early return (the motion path doesn't need ideology).
- New early return only if BOTH the motion path AND the ideology path are unavailable.
- Motion classification runs first per window; a confident keyword result overrides Pearson-r.
- New accumulators: `x_top_motions`, `y_top_motions`, `x_label_confidence`, `y_label_confidence`.

```python
def classify_axes(
    positions_by_window: Dict[str, Dict[str, Tuple[float, float]]],
    axes: dict,
    db_path: str,
) -> dict:
    """Classify compass axes using motion projection (primary) and ideology CSV (fallback).

    Motion projection path:
      - Requires axes["global_mean"], axes["x_axis"], axes["y_axis"].
      - Loads motion SVD vectors per window, projects onto PCA axes,
        ranks top 5+5 motions, applies keyword classifier → label.

    Fallback path (unchanged):
      - Pearson-r against party_ideologies.csv (left_right, progressive).
      - Pearson-r against coalition_membership.csv dummy.

    Enriches axes with:
      x_label, y_label — global modal label across annual windows
      x_quality, y_quality — {window_id: float} max |r|
      x_interpretation — {window_id: str}
      y_interpretation — {window_id: str}
      x_top_motions, y_top_motions — {window_id: {'+': [(title, date), ...], '-': [...]}}
      x_label_confidence — {window_id: float}
      y_label_confidence — {window_id: float}
    """
    data_dir = Path(db_path).parent
    ideology = _load_ideology(data_dir / "party_ideologies.csv")
    coalition = _load_coalition(data_dir / "coalition_membership.csv")

    # Determine whether motion projection is possible.
    global_mean = axes.get("global_mean")
    x_axis_arr = np.array(axes.get("x_axis", []))
    y_axis_arr = np.array(axes.get("y_axis", []))
    motion_path_available = (
        global_mean is not None
        and x_axis_arr.ndim == 1
        and x_axis_arr.size > 0
        and y_axis_arr.size > 0
    )

    if not ideology and not motion_path_available:
        return axes  # nothing to classify with

    x_quality: Dict[str, float] = {}
    y_quality: Dict[str, float] = {}
    x_interpretation: Dict[str, str] = {}
    y_interpretation: Dict[str, str] = {}
    x_top_motions: Dict[str, Dict] = {}
    y_top_motions: Dict[str, Dict] = {}
    x_label_confidence: Dict[str, float] = {}
    y_label_confidence: Dict[str, float] = {}
    annual_x_labels: List[str] = []
    annual_y_labels: List[str] = []

    for wid, pos_dict in positions_by_window.items():
        year = _window_year(wid)
        is_annual = wid != "current_parliament" and "-" not in wid

        # ── Ideology / coalition Pearson-r (unchanged logic) ──────────────
        x_lbl_fallback: Optional[str] = None
        y_lbl_fallback: Optional[str] = None
        x_q = 0.0
        y_q = 0.0
        x_int = ""
        y_int = ""

        if ideology:
            parties = [p for p in pos_dict if p in ideology]
            if len(parties) >= 5:
                party_x = [pos_dict[p][0] for p in parties]
                party_y = [pos_dict[p][1] for p in parties]
                ref_lr = [ideology[p]["left_right"] for p in parties]
                ref_pc = [ideology[p]["progressive"] for p in parties]

                if year and coalition and year in coalition:
                    gov_set = coalition[year]
                    ref_co = [1.0 if p in gov_set else -1.0 for p in parties]
                else:
                    ref_co = [0.0] * len(parties)

                r_lr_x = _pearsonr(party_x, ref_lr)
                r_co_x = _pearsonr(party_x, ref_co)
                r_pc_x = _pearsonr(party_x, ref_pc)
                x_lbl_fallback, x_int, x_q = _assign_label(r_lr_x, r_co_x, r_pc_x, "x")

                r_lr_y = _pearsonr(party_y, ref_lr)
                r_co_y = _pearsonr(party_y, ref_co)
                r_pc_y = _pearsonr(party_y, ref_pc)
                y_lbl_fallback, y_int, y_q = _assign_label(r_lr_y, r_co_y, r_pc_y, "y")

        # ── Motion projection (primary) ───────────────────────────────────
        x_lbl = x_lbl_fallback
        y_lbl = y_lbl_fallback
        x_conf = 0.0
        y_conf = 0.0
        x_tops: Dict[str, List] = {"+": [], "-": []}
        y_tops: Dict[str, List] = {"+": [], "-": []}

        if motion_path_available:
            motion_vecs = _load_motion_vectors(db_path, wid)
            if motion_vecs:
                projections = _project_motions(motion_vecs, x_axis_arr, y_axis_arr, global_mean)
                x_ids = _top_motion_ids(projections, "x", n=5)
                y_ids = _top_motion_ids(projections, "y", n=5)

                all_x_ids = x_ids["+"] + x_ids["-"]
                all_y_ids = y_ids["+"] + y_ids["-"]
                titles_map = _fetch_motion_titles(db_path, list(set(all_x_ids + all_y_ids)))

                x_title_list = [
                    titles_map[mid][0] for mid in all_x_ids if mid in titles_map
                ]
                y_title_list = [
                    titles_map[mid][0] for mid in all_y_ids if mid in titles_map
                ]

                x_kw_lbl, x_conf = _classify_from_titles(x_title_list)
                y_kw_lbl, y_conf = _classify_from_titles(y_title_list)

                if x_kw_lbl is not None:
                    x_lbl = x_kw_lbl
                if y_kw_lbl is not None:
                    y_lbl = y_kw_lbl

                # Build display lists: [(title, date), ...]
                for pole, ids in x_ids.items():
                    x_tops[pole] = [
                        titles_map[mid] for mid in ids if mid in titles_map
                    ]
                for pole, ids in y_ids.items():
                    y_tops[pole] = [
                        titles_map[mid] for mid in ids if mid in titles_map
                    ]

        # ── Final label resolution ────────────────────────────────────────
        # If both motion and ideology paths produced nothing, use generic fallback.
        if x_lbl is None:
            x_lbl = _LABELS["fallback_x"]
            x_int = _INTERPRETATION_TEMPLATES["fallback"].format(orientation="horizontale")
        if y_lbl is None:
            y_lbl = _LABELS["fallback_y"]
            y_int = _INTERPRETATION_TEMPLATES["fallback"].format(orientation="verticale")

        x_quality[wid] = x_q
        y_quality[wid] = y_q
        x_interpretation[wid] = x_int
        y_interpretation[wid] = y_int
        x_top_motions[wid] = x_tops
        y_top_motions[wid] = y_tops
        x_label_confidence[wid] = x_conf
        y_label_confidence[wid] = y_conf

        if is_annual:
            annual_x_labels.append(x_lbl)
            annual_y_labels.append(y_lbl)

    def _modal(labels: List[str], fallback: str) -> str:
        if not labels:
            return fallback
        return Counter(labels).most_common(1)[0][0]

    enriched = dict(axes)
    enriched["x_label"] = _modal(annual_x_labels, "Links\u2013Rechts")
    enriched["y_label"] = _modal(annual_y_labels, "Progressief\u2013Conservatief")
    enriched["x_quality"] = x_quality
    enriched["y_quality"] = y_quality
    enriched["x_interpretation"] = x_interpretation
    enriched["y_interpretation"] = y_interpretation
    enriched["x_top_motions"] = x_top_motions
    enriched["y_top_motions"] = y_top_motions
    enriched["x_label_confidence"] = x_label_confidence
    enriched["y_label_confidence"] = y_label_confidence
    return enriched
```

- [ ] **Step 2: Run full test suite**

```bash
pytest tests/test_political_compass.py -v
```

Expected: all existing tests plus the new tests from Tasks 1–4 pass. In particular, verify the 3 classifier tests from Task 2 and `test_compute_2d_axes_exposes_global_mean` from Task 1 still pass.

- [ ] **Step 3: Commit**

```bash
git add analysis/axis_classifier.py
git commit -m "feat: restructure classify_axes — motion projection as primary label source"
```
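The per-window precedence described above (confident keyword label, else the Pearson-r fallback label, else the generic default) can be stated as a tiny pure function. Names here are illustrative, not the module's:

```python
from typing import Optional


def resolve_label(
    keyword_label: Optional[str],
    fallback_label: Optional[str],
    generic_label: str,
) -> str:
    """Keyword classifier wins when it produced a confident label, then the
    ideology/coalition Pearson-r label, then a generic default."""
    if keyword_label is not None:
        return keyword_label
    if fallback_label is not None:
        return fallback_label
    return generic_label


print(resolve_label("Links\u2013Rechts", "Progressief\u2013Conservatief", "As 1"))  # Links–Rechts
print(resolve_label(None, "Progressief\u2013Conservatief", "As 1"))  # Progressief–Conservatief
print(resolve_label(None, None, "As 1"))  # As 1
```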
---
## Task 5: Add axis-swap logic and tests in `explorer.py`

**Files:**
- Modify: `explorer.py`
- Test: `tests/test_political_compass.py`

- [ ] **Step 1: Write the two failing tests**

Add to `tests/test_political_compass.py`:

```python
def test_axis_swap_when_y_is_left_right():
    """When y_label is 'Links–Rechts' and x_label is not, positions must be swapped."""
    from explorer import _swap_axes

    positions_by_window = {
        "2023": {
            "VVD": (0.5, 0.8),
            "PvdA": (-0.3, -0.6),
        }
    }
    axis_def = {
        "x_label": "Progressief\u2013Conservatief",
        "y_label": "Links\u2013Rechts",
        "x_quality": {"2023": 0.7},
        "y_quality": {"2023": 0.8},
        "x_interpretation": {"2023": "prog interpretation"},
        "y_interpretation": {"2023": "lr interpretation"},
        "x_top_motions": {"2023": {"+": [], "-": []}},
        "y_top_motions": {"2023": {"+": [], "-": []}},
        "x_label_confidence": {"2023": 0.5},
        "y_label_confidence": {"2023": 0.7},
    }

    new_pos, new_ax = _swap_axes(positions_by_window, axis_def)

    # Positions swapped: (x, y) → (y, x)
    assert new_pos["2023"]["VVD"] == (0.8, 0.5)
    assert new_pos["2023"]["PvdA"] == (-0.6, -0.3)

    # Labels swapped
    assert new_ax["x_label"] == "Links\u2013Rechts"
    assert new_ax["y_label"] == "Progressief\u2013Conservatief"

    # Quality swapped
    assert new_ax["x_quality"] == {"2023": 0.8}
    assert new_ax["y_quality"] == {"2023": 0.7}


def test_axis_swap_not_applied_when_x_is_left_right():
    """When x_label is already 'Links–Rechts', no swap should occur."""
    from explorer import _should_swap_axes

    axis_def = {
        "x_label": "Links\u2013Rechts",
        "y_label": "Progressief\u2013Conservatief",
    }
    assert _should_swap_axes(axis_def) is False

    axis_def2 = {
        "x_label": "Links\u2013Rechts",
        "y_label": "Links\u2013Rechts",  # both LR — no swap
    }
    assert _should_swap_axes(axis_def2) is False
```

- [ ] **Step 2: Run tests to verify they fail**

```bash
pytest tests/test_political_compass.py::test_axis_swap_when_y_is_left_right tests/test_political_compass.py::test_axis_swap_not_applied_when_x_is_left_right -v
```

Expected: FAIL — `ImportError: cannot import name '_swap_axes'` / `'_should_swap_axes'`

- [ ] **Step 3: Add `_swap_axes` and `_should_swap_axes` to `explorer.py`**

Add these two functions near the top of `explorer.py`, just before `load_positions` (the function that starts around line 184). A good place is after any existing module-level helpers.

```python
def _should_swap_axes(axis_def: dict) -> bool:
    """Return True if the Y axis is 'Links–Rechts' and the X axis is not.

    When true, the caller should swap x/y positions and metadata so left-right
    is conventionally on the horizontal axis.
    """
    lr = "Links\u2013Rechts"
    return axis_def.get("y_label") == lr and axis_def.get("x_label") != lr


def _swap_axes(
    positions_by_window: dict,
    axis_def: dict,
) -> tuple:
    """Swap x and y in all positions and axis metadata.

    Pure function — returns (new_positions_by_window, new_axis_def).
    """
    new_positions: dict = {}
    for wid, pos_dict in positions_by_window.items():
        new_positions[wid] = {ent: (y, x) for ent, (x, y) in pos_dict.items()}

    new_ax = dict(axis_def)
    # Swap paired scalar keys
    new_ax["x_label"] = axis_def.get("y_label")
    new_ax["y_label"] = axis_def.get("x_label")

    # Swap paired dict keys
    for x_key, y_key in [
        ("x_quality", "y_quality"),
        ("x_interpretation", "y_interpretation"),
        ("x_top_motions", "y_top_motions"),
        ("x_label_confidence", "y_label_confidence"),
    ]:
        new_ax[x_key] = axis_def.get(y_key)
        new_ax[y_key] = axis_def.get(x_key)

    return new_positions, new_ax
```

- [ ] **Step 4: Wire the swap in `load_positions`**

In `explorer.py`, after the `classify_axes` try/except block (currently lines 202–211, ending at `axis_def = classify_axes(...)`), add:

```python
if _should_swap_axes(axis_def):
    positions_by_window, axis_def = _swap_axes(positions_by_window, axis_def)
```

Place this immediately before the `# Filter displayed windows by window_size` comment (currently ~line 213).

- [ ] **Step 5: Run tests to verify they pass**

```bash
pytest tests/test_political_compass.py::test_axis_swap_when_y_is_left_right tests/test_political_compass.py::test_axis_swap_not_applied_when_x_is_left_right -v
```

Expected: both PASS

- [ ] **Step 6: Run full suite**

```bash
pytest tests/test_political_compass.py -v
```

Expected: all tests pass.

- [ ] **Step 7: Commit**

```bash
git add explorer.py tests/test_political_compass.py
git commit -m "feat: add axis swap — left-right goes on horizontal axis when detected"
```
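Because `_swap_axes` is a pure function, a useful sanity property is that swapping twice is the identity. A self-contained sketch of the same transposition (hypothetical `swap`, reduced to two metadata key pairs; it does not import the app):

```python
def swap(positions, axis_def):
    """Transpose (x, y) positions and exchange paired x_*/y_* metadata keys."""
    new_pos = {wid: {ent: (y, x) for ent, (x, y) in d.items()}
               for wid, d in positions.items()}
    new_ax = dict(axis_def)
    for xk, yk in [("x_label", "y_label"), ("x_quality", "y_quality")]:
        new_ax[xk], new_ax[yk] = axis_def.get(yk), axis_def.get(xk)
    return new_pos, new_ax


pos = {"2023": {"VVD": (0.5, 0.8)}}
ax = {"x_label": "Progressief\u2013Conservatief", "y_label": "Links\u2013Rechts",
      "x_quality": {"2023": 0.7}, "y_quality": {"2023": 0.8}}
once = swap(pos, ax)
twice = swap(*once)
print(once[1]["x_label"])   # Links–Rechts
assert twice == (pos, ax)   # swapping twice restores the original
```

This round-trip property would make a cheap extra assertion in the Task 5 tests if desired.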
---
## Task 6: Add motion expander UI in `build_compass_tab`

**Files:**
- Modify: `explorer.py`

No new unit tests for this task — it's pure Streamlit rendering and cannot be unit-tested without a browser. Verify visually after implementation.

- [ ] **Step 1: Add the expander block after `st.plotly_chart`**

In `explorer.py`, find the `st.plotly_chart` call (line ~974) inside `with col1:`. After the two `st.caption` calls (lines ~981–986), add:

```python
# Motion expander — show which motions define each axis for this window
x_top = axis_def.get("x_top_motions", {}).get(window_idx, {})
y_top = axis_def.get("y_top_motions", {}).get(window_idx, {})
x_conf = axis_def.get("x_label_confidence", {}).get(window_idx)
y_conf = axis_def.get("y_label_confidence", {}).get(window_idx)
evr = axis_def.get("explained_variance_ratio", [None, None])
evr0 = evr[0] if evr else None

_has_motion_data = bool(
    x_top.get("+") or x_top.get("-") or y_top.get("+") or y_top.get("-")
)
if _has_motion_data:
    with st.expander("\U0001f50d Wat bepaalt deze assen?"):
        x_conf_pct = f" (vertrouwen: {x_conf:.0%})" if x_conf is not None else ""
        y_conf_pct = f" (vertrouwen: {y_conf:.0%})" if y_conf is not None else ""

        st.markdown(f"**Horizontale as: {_x_label}**{x_conf_pct}")
        x_pos_titles = x_top.get("+", [])
        x_neg_titles = x_top.get("-", [])
        if x_pos_titles:
            labels_pos = " · ".join(f"{t} ({d})" for t, d in x_pos_titles[:3])
            st.markdown(f"➕ {labels_pos}")
        if x_neg_titles:
            labels_neg = " · ".join(f"{t} ({d})" for t, d in x_neg_titles[:3])
            st.markdown(f"➖ {labels_neg}")

        st.markdown(f"**Verticale as: {_y_label}**{y_conf_pct}")
        y_pos_titles = y_top.get("+", [])
        y_neg_titles = y_top.get("-", [])
        if y_pos_titles:
            labels_pos = " · ".join(f"{t} ({d})" for t, d in y_pos_titles[:3])
            st.markdown(f"➕ {labels_pos}")
        if y_neg_titles:
            labels_neg = " · ".join(f"{t} ({d})" for t, d in y_neg_titles[:3])
            st.markdown(f"➖ {labels_neg}")

        if evr0 is not None:
            st.caption(
                f"As 1 verklaart {evr0:.1%} van de variantie in stemgedrag."
            )
```

Note: `_x_label` and `_y_label` are already defined earlier in `build_compass_tab` from `axis_def.get("x_label", …)`, and `window_idx` is the currently selected window string. Confirm those variable names match the existing code before inserting.

- [ ] **Step 2: Check that `explained_variance_ratio` is stored in `axis_def`**

Search `analysis/political_axis.py` for where `axes["explained_variance_ratio"]` is set. If it isn't stored yet, add it: in `compute_2d_axes`, after `axes["global_mean"] = global_mean` (Task 1), find where the per-component explained-variance ratios are computed (sklearn PCA's `explained_variance_ratio_`, or the equivalent from the NumPy SVD path) and store the first two as a list. The names `evr1` and `evr2` below are placeholders for whatever the function actually calls them:

```python
axes["explained_variance_ratio"] = [float(evr1), float(evr2)]
```

If it's already stored under a different key, use that key in the expander code instead.

- [ ] **Step 3: Run full test suite (sanity check)**

```bash
pytest tests/test_political_compass.py -v
```

Expected: all tests pass (the expander is UI-only, no test required).

- [ ] **Step 4: Commit**

```bash
git add explorer.py
git commit -m "feat: add motion expander to compass tab — shows top motions per axis"
```

---
## Final Verification

- [ ] **Run all tests one last time**

```bash
pytest tests/test_political_compass.py -v
```

Expected output summary: 13+ tests passing (8 existing + 5 new), 0 failing.

- [ ] **Smoke-test the app** (if the DB is available)

```bash
streamlit run explorer.py
```

Navigate to the compass tab, select a window, and verify:
1. Axis labels show e.g. "Links–Rechts" on X and "Progressief–Conservatief" on Y
2. The "🔍 Wat bepaalt deze assen?" expander appears and shows motions
3. No Python exceptions in the terminal

- [ ] **Final commit (if any cleanup is needed)**

```bash
git add -u
git commit -m "fix: address any issues found during smoke test"
```