parent 9dcf6201bb · commit 93a2287c04

# Motion-Driven Axis Labeling Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Replace static ideology-CSV axis labeling with motion-projection-based labeling, add an axis swap when Y ends up as "Links–Rechts", and expose the top motions per axis to the user.

**Architecture:** `political_axis.py` exposes `global_mean` in the `axes` dict; `axis_classifier.py` gains motion-loading helpers and a keyword classifier as the primary label source (falling back to the existing Pearson-r); `explorer.py` swaps axes when needed and renders a new expander showing the top motions.

**Tech Stack:** Python, NumPy, DuckDB (no new dependencies), Streamlit, pytest

---

## File Map

| File | Change |
|---|---|
| `analysis/political_axis.py` | Add `axes["global_mean"] = global_mean` (one line) |
| `analysis/axis_classifier.py` | Add `_KEYWORDS`, motion helpers, restructure `classify_axes` |
| `explorer.py` | Add `_swap_axes`, `_should_swap_axes`, wire swap, add motion expander |
| `tests/test_political_compass.py` | Add 6 new unit tests |

---

## Task 1: Expose `global_mean` from `compute_2d_axes`

**Files:**
- Modify: `analysis/political_axis.py` (line 362)

- [ ] **Step 1: Write the failing test**

Add this test at the bottom of `tests/test_political_compass.py`:

```python
def test_compute_2d_axes_exposes_global_mean(monkeypatch):
    """axes dict returned by compute_2d_axes must contain 'global_mean'."""
    fake_traj = types.SimpleNamespace()
    fake_traj._load_window_ids = lambda db: ["w1"]
    aligned = {
        "w1": {
            "Alice": np.array([1.0, 0.0, 0.0]),
            "Bob": np.array([-1.0, 0.5, 0.0]),
        }
    }
    fake_traj._load_mp_vectors_for_window = lambda db, w: aligned.get(w, {})
    fake_traj._procrustes_align_windows = lambda x: aligned
    monkeypatch.setitem(sys.modules, "analysis.trajectory", fake_traj)

    from analysis.political_axis import compute_2d_axes

    _, axis_def = compute_2d_axes(db_path="dummy", window_ids=["w1"], method="pca")
    assert "global_mean" in axis_def
    assert isinstance(axis_def["global_mean"], np.ndarray)
```

- [ ] **Step 2: Run test to verify it fails**

```bash
pytest tests/test_political_compass.py::test_compute_2d_axes_exposes_global_mean -v
```

Expected: FAIL — `AssertionError: assert 'global_mean' in {…}` (key not yet present)

- [ ] **Step 3: Add `global_mean` to axes dict in `political_axis.py`**

In `analysis/political_axis.py`, the line at ~362 reads:

```python
global_mean = M.mean(axis=0)
positions_by_window: Dict[str, Dict[str, Tuple[float, float]]] = {
```

Add `axes["global_mean"] = global_mean` immediately after that assignment:

```python
global_mean = M.mean(axis=0)
axes["global_mean"] = global_mean
positions_by_window: Dict[str, Dict[str, Tuple[float, float]]] = {
```

- [ ] **Step 4: Run test to verify it passes**

```bash
pytest tests/test_political_compass.py::test_compute_2d_axes_exposes_global_mean -v
```

Expected: PASS

- [ ] **Step 5: Run full test suite to confirm no regressions**

```bash
pytest tests/test_political_compass.py -v
```

Expected: all previously passing tests still pass + new test passes.

- [ ] **Step 6: Commit**

```bash
git add analysis/political_axis.py tests/test_political_compass.py
git commit -m "feat: expose global_mean in compute_2d_axes axes dict"
```

---

## Task 2: Add keyword classifier helper `_classify_from_titles`

**Files:**
- Modify: `analysis/axis_classifier.py`
- Test: `tests/test_political_compass.py`

- [ ] **Step 1: Write the three failing tests**

Add to `tests/test_political_compass.py`:

```python
def test_classify_from_titles_left_right():
    """Titles dominated by left-right keywords → 'Links–Rechts'."""
    from analysis.axis_classifier import _classify_from_titles

    titles = [
        "Motie over asielbeleid",
        "Motie over minimumloon verhoging",
        "Motie over vluchtelingen opvang",
        "Motie over belastingverlaging",
        "Motie over bijstandsuitkering",
    ]
    label, confidence = _classify_from_titles(titles)
    assert label == "Links\u2013Rechts"
    assert confidence >= 0.4


def test_classify_from_titles_progressive():
    """Titles dominated by progressive/conservative keywords → 'Progressief–Conservatief'."""
    from analysis.axis_classifier import _classify_from_titles

    titles = [
        "Motie over klimaatdoelstellingen",
        "Motie over stikstofbeleid",
        "Motie over duurzame energie",
        "Motie over co2 uitstoot",
        "Motie over energietransitie",
    ]
    label, confidence = _classify_from_titles(titles)
    assert label == "Progressief\u2013Conservatief"
    assert confidence >= 0.4


def test_classify_from_titles_low_confidence():
    """Mixed/irrelevant titles → None (fallback triggered)."""
    from analysis.axis_classifier import _classify_from_titles

    titles = [
        "Motie over sportsubsidie",
        "Motie over bibliotheekregeling",
        "Motie over verkeersveiligheid",
    ]
    label, confidence = _classify_from_titles(titles)
    assert label is None
    assert confidence < 0.4
```

- [ ] **Step 2: Run tests to verify they fail**

```bash
pytest tests/test_political_compass.py::test_classify_from_titles_left_right tests/test_political_compass.py::test_classify_from_titles_progressive tests/test_political_compass.py::test_classify_from_titles_low_confidence -v
```

Expected: FAIL — `ImportError: cannot import name '_classify_from_titles'`

- [ ] **Step 3: Add `_KEYWORDS` constant and `_classify_from_titles` to `axis_classifier.py`**

Add after the `_INTERPRETATION_TEMPLATES` block (after line 42) and before `_load_ideology`:

```python
_KEYWORD_THRESHOLD = 0.4

_KEYWORDS: Dict[str, List[str]] = {
    "Links\u2013Rechts": [
        # economic
        "belasting",
        "uitkering",
        "bijstand",
        "minimumloon",
        "cao",
        "vakbond",
        "bezuiniging",
        "privatisering",
        "subsidie",
        "pensioen",
        "aow",
        "zorg",
        # immigration
        "asiel",
        "asielaanvraag",
        "migratie",
        "vreemdeling",
        "vluchtelingen",
        "terugkeer",
        "grenzen",
        "opvang",
        "statushouder",
    ],
    "Progressief\u2013Conservatief": [
        # environment
        "klimaat",
        "stikstof",
        "duurzaam",
        "duurzaamheid",
        "co2",
        "energietransitie",
        "biodiversiteit",
        # social
        "euthanasie",
        "abortus",
        "lgbtq",
        "transgender",
        "diversiteit",
        "traditi",
        "gezin",
        "religie",
        "geloof",
    ],
    "Nationaal\u2013Internationaal": [
        "navo",
        "nato",
        "europees",
        "europese",
        " eu ",  # padded with spaces to avoid matching inside words; misses titles that start or end with "eu"
        "verdrag",
        " vn ",  # same padding caveat as " eu "
        "internationaal",
    ],
}


def _classify_from_titles(titles: List[str]) -> Tuple[Optional[str], float]:
    """Classify a list of motion titles into an axis category using keyword matching.

    Returns (category_label, confidence) where confidence = fraction of titles
    containing at least one keyword from the winning category.
    Returns (None, confidence) when confidence is below _KEYWORD_THRESHOLD
    (and (None, 0.0) for an empty title list).
    """
    if not titles:
        return None, 0.0

    counts: Dict[str, int] = {cat: 0 for cat in _KEYWORDS}
    for title in titles:
        lower = title.lower()
        for cat, keywords in _KEYWORDS.items():
            if any(kw in lower for kw in keywords):
                counts[cat] += 1

    best_cat = max(counts, key=lambda c: counts[c])
    best_count = counts[best_cat]
    confidence = best_count / len(titles)

    if confidence < _KEYWORD_THRESHOLD:
        return None, confidence

    return best_cat, confidence
```

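One boundary worth calling out: the cutoff check is strict (`confidence < _KEYWORD_THRESHOLD`), so a confidence of exactly 0.4, e.g. 2 keyword hits across 5 titles, still yields a label. A standalone sketch of just the threshold arithmetic (the `classify` helper here is illustrative, not part of the plan's code):

```python
# Standalone illustration of the thresholding in _classify_from_titles:
# a hit rate exactly at the 0.4 threshold still produces a label.
_KEYWORD_THRESHOLD = 0.4

def classify(hit_count: int, n_titles: int, best_cat: str):
    confidence = hit_count / n_titles
    if confidence < _KEYWORD_THRESHOLD:  # strict comparison: 0.4 itself passes
        return None, confidence
    return best_cat, confidence

print(classify(2, 5, "Links-Rechts"))  # ('Links-Rechts', 0.4), at the threshold, still labeled
print(classify(1, 5, "Links-Rechts"))  # (None, 0.2), below it
```

This is why the tests in Step 1 assert `confidence >= 0.4` rather than `> 0.4`.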

- [ ] **Step 4: Run the three tests to verify they pass**

```bash
pytest tests/test_political_compass.py::test_classify_from_titles_left_right tests/test_political_compass.py::test_classify_from_titles_progressive tests/test_political_compass.py::test_classify_from_titles_low_confidence -v
```

Expected: all 3 PASS

- [ ] **Step 5: Run full suite to confirm no regressions**

```bash
pytest tests/test_political_compass.py -v
```

- [ ] **Step 6: Commit**

```bash
git add analysis/axis_classifier.py tests/test_political_compass.py
git commit -m "feat: add _classify_from_titles keyword classifier to axis_classifier"
```

---

## Task 3: Add motion-loading helpers to `axis_classifier.py`

**Files:**
- Modify: `analysis/axis_classifier.py`

Most of these helpers have DB dependencies, so they don't get dedicated unit tests here — they are exercised indirectly once `classify_axes` is wired up. Error handling is the main concern.

- [ ] **Step 1: Add `import json` at top of `axis_classifier.py`**

After `import numpy as np` (line 12), add:

```python
import json
```

- [ ] **Step 2: Add the four motion helpers after `_classify_from_titles`**

```python
def _load_motion_vectors(db_path: str, window_id: str) -> Dict[int, np.ndarray]:
    """Load SVD motion vectors for a given window from DuckDB.

    Returns {motion_id: vector_array}. Returns {} on any error.
    """
    try:
        import duckdb

        conn = duckdb.connect(db_path, read_only=True)
        rows = conn.execute(
            "SELECT entity_id, vector FROM svd_vectors "
            "WHERE entity_type = 'motion' AND window_id = ?",
            [window_id],
        ).fetchall()
        conn.close()
        result = {}
        for entity_id, vector_raw in rows:
            try:
                mid = int(entity_id)
                vec = np.array(json.loads(vector_raw), dtype=float)
                result[mid] = vec
            except Exception:
                continue
        return result
    except Exception as exc:
        _logger.debug("Failed to load motion vectors for window %s: %s", window_id, exc)
        return {}


def _project_motions(
    motion_vecs: Dict[int, np.ndarray],
    x_axis: np.ndarray,
    y_axis: np.ndarray,
    global_mean: np.ndarray,
) -> Dict[int, Tuple[float, float]]:
    """Project motion vectors onto the PCA axes after centering by global_mean.

    Returns {motion_id: (x_score, y_score)}.
    """
    projections: Dict[int, Tuple[float, float]] = {}
    for mid, vec in motion_vecs.items():
        try:
            centered = vec - global_mean
            x_score = float(np.dot(centered, x_axis))
            y_score = float(np.dot(centered, y_axis))
            projections[mid] = (x_score, y_score)
        except Exception:
            continue
    return projections


def _top_motion_ids(
    projections: Dict[int, Tuple[float, float]],
    axis: str,
    n: int = 5,
) -> Dict[str, List[int]]:
    """Return the top-n motion IDs at each pole of the given axis.

    axis: 'x' or 'y'
    Returns {'+': [motion_ids], '-': [motion_ids]} (highest positive first,
    most negative first in the '-' list).
    """
    idx = 0 if axis == "x" else 1
    sorted_ids = sorted(projections, key=lambda mid: projections[mid][idx])
    neg_ids = sorted_ids[:n]  # most negative
    pos_ids = sorted_ids[-n:][::-1]  # most positive
    return {"+": pos_ids, "-": neg_ids}


def _fetch_motion_titles(
    db_path: str,
    motion_ids: List[int],
) -> Dict[int, Tuple[str, str]]:
    """Fetch (title, date) for a list of motion IDs from DuckDB.

    Returns {motion_id: (title, date_str)}. Missing IDs are omitted.
    Returns {} on any DB error.
    """
    if not motion_ids:
        return {}
    try:
        import duckdb

        placeholders = ", ".join("?" * len(motion_ids))
        conn = duckdb.connect(db_path, read_only=True)
        rows = conn.execute(
            f"SELECT id, title, date FROM motions WHERE id IN ({placeholders})",
            motion_ids,
        ).fetchall()
        conn.close()
        return {int(row[0]): (str(row[1]), str(row[2])) for row in rows}
    except Exception as exc:
        _logger.debug("Failed to fetch motion titles: %s", exc)
        return {}
```

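While the DB-backed helpers are only exercised indirectly, the two pure ones can still be smoke-checked in isolation. A minimal standalone sketch of the projection-and-ranking logic above (local copies with toy 2-D vectors, not the real module functions):

```python
import numpy as np

def project(motion_vecs, x_axis, y_axis, global_mean):
    # center each vector, then dot it against both PCA axes
    return {
        mid: (float(np.dot(v - global_mean, x_axis)),
              float(np.dot(v - global_mean, y_axis)))
        for mid, v in motion_vecs.items()
    }

def top_ids(projections, axis, n=2):
    # same ranking semantics as _top_motion_ids: '+' holds the most
    # positive IDs first, '-' the most negative first
    idx = 0 if axis == "x" else 1
    s = sorted(projections, key=lambda mid: projections[mid][idx])
    return {"+": s[-n:][::-1], "-": s[:n]}

vecs = {
    1: np.array([2.0, 0.0]),
    2: np.array([-2.0, 0.0]),
    3: np.array([0.0, 1.0]),
}
proj = project(vecs, np.array([1.0, 0.0]), np.array([0.0, 1.0]),
               global_mean=np.zeros(2))
tops = top_ids(proj, "x", n=1)
print(tops)  # {'+': [1], '-': [2]}: motion 1 most positive on x, motion 2 most negative
```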

- [ ] **Step 3: Run full test suite to confirm nothing broke**

```bash
pytest tests/test_political_compass.py -v
```

Expected: all previously passing tests still pass.

- [ ] **Step 4: Commit**

```bash
git add analysis/axis_classifier.py
git commit -m "feat: add motion-loading helpers to axis_classifier"
```

---

## Task 4: Restructure `classify_axes` to use motion projection as primary

**Files:**
- Modify: `analysis/axis_classifier.py`

- [ ] **Step 1: Replace the body of `classify_axes`**

Replace the entire function (lines 180–269 in the current file) with the version below.
Key changes from the old version:
- Remove the `if not ideology: return axes` early return (the motion path doesn't need ideology).
- New early return only if BOTH the motion path AND the ideology path are unavailable.
- Motion classification runs first per window; the keyword result overrides Pearson-r when confident.
- New accumulators: `x_top_motions`, `y_top_motions`, `x_label_confidence`, `y_label_confidence`.

```python
def classify_axes(
    positions_by_window: Dict[str, Dict[str, Tuple[float, float]]],
    axes: dict,
    db_path: str,
) -> dict:
    """Classify compass axes using motion projection (primary) and ideology CSV (fallback).

    Motion projection path:
      - Requires axes["global_mean"], axes["x_axis"], axes["y_axis"].
      - Loads motion SVD vectors per window, projects onto PCA axes,
        ranks top 5+5 motions, applies keyword classifier → label.

    Fallback path (unchanged):
      - Pearson-r against party_ideologies.csv (left_right, progressive).
      - Pearson-r against coalition_membership.csv dummy.

    Enriches axes with:
      x_label, y_label — global modal label across annual windows
      x_quality, y_quality — {window_id: float} max |r|
      x_interpretation — {window_id: str}
      y_interpretation — {window_id: str}
      x_top_motions, y_top_motions — {window_id: {'+': [(title, date), ...], '-': [...]}}
      x_label_confidence — {window_id: float}
      y_label_confidence — {window_id: float}
    """
    data_dir = Path(db_path).parent
    ideology = _load_ideology(data_dir / "party_ideologies.csv")
    coalition = _load_coalition(data_dir / "coalition_membership.csv")

    # Determine whether motion projection is possible.
    global_mean = axes.get("global_mean")
    x_axis_arr = np.array(axes.get("x_axis", []))
    y_axis_arr = np.array(axes.get("y_axis", []))
    motion_path_available = (
        global_mean is not None
        and x_axis_arr.ndim == 1
        and x_axis_arr.size > 0
        and y_axis_arr.size > 0
    )

    if not ideology and not motion_path_available:
        return axes  # nothing to classify with

    x_quality: Dict[str, float] = {}
    y_quality: Dict[str, float] = {}
    x_interpretation: Dict[str, str] = {}
    y_interpretation: Dict[str, str] = {}
    x_top_motions: Dict[str, Dict] = {}
    y_top_motions: Dict[str, Dict] = {}
    x_label_confidence: Dict[str, float] = {}
    y_label_confidence: Dict[str, float] = {}
    annual_x_labels: List[str] = []
    annual_y_labels: List[str] = []

    for wid, pos_dict in positions_by_window.items():
        year = _window_year(wid)
        is_annual = wid != "current_parliament" and "-" not in wid

        # ── Ideology / coalition Pearson-r (unchanged logic) ──────────────────
        x_lbl_fallback: Optional[str] = None
        y_lbl_fallback: Optional[str] = None
        x_q = 0.0
        y_q = 0.0
        x_int = ""
        y_int = ""

        if ideology:
            parties = [p for p in pos_dict if p in ideology]
            if len(parties) >= 5:
                party_x = [pos_dict[p][0] for p in parties]
                party_y = [pos_dict[p][1] for p in parties]
                ref_lr = [ideology[p]["left_right"] for p in parties]
                ref_pc = [ideology[p]["progressive"] for p in parties]

                if year and coalition and year in coalition:
                    gov_set = coalition[year]
                    ref_co = [1.0 if p in gov_set else -1.0 for p in parties]
                else:
                    ref_co = [0.0] * len(parties)

                r_lr_x = _pearsonr(party_x, ref_lr)
                r_co_x = _pearsonr(party_x, ref_co)
                r_pc_x = _pearsonr(party_x, ref_pc)
                x_lbl_fallback, x_int, x_q = _assign_label(r_lr_x, r_co_x, r_pc_x, "x")

                r_lr_y = _pearsonr(party_y, ref_lr)
                r_co_y = _pearsonr(party_y, ref_co)
                r_pc_y = _pearsonr(party_y, ref_pc)
                y_lbl_fallback, y_int, y_q = _assign_label(r_lr_y, r_co_y, r_pc_y, "y")

        # ── Motion projection (primary) ───────────────────────────────────────
        x_lbl = x_lbl_fallback
        y_lbl = y_lbl_fallback
        x_conf = 0.0
        y_conf = 0.0
        x_tops: Dict[str, List] = {"+": [], "-": []}
        y_tops: Dict[str, List] = {"+": [], "-": []}

        if motion_path_available:
            motion_vecs = _load_motion_vectors(db_path, wid)
            if motion_vecs:
                projections = _project_motions(motion_vecs, x_axis_arr, y_axis_arr, global_mean)
                x_ids = _top_motion_ids(projections, "x", n=5)
                y_ids = _top_motion_ids(projections, "y", n=5)

                all_x_ids = x_ids["+"] + x_ids["-"]
                all_y_ids = y_ids["+"] + y_ids["-"]
                titles_map = _fetch_motion_titles(db_path, list(set(all_x_ids + all_y_ids)))

                x_title_list = [
                    titles_map[mid][0] for mid in all_x_ids if mid in titles_map
                ]
                y_title_list = [
                    titles_map[mid][0] for mid in all_y_ids if mid in titles_map
                ]

                x_kw_lbl, x_conf = _classify_from_titles(x_title_list)
                y_kw_lbl, y_conf = _classify_from_titles(y_title_list)

                if x_kw_lbl is not None:
                    x_lbl = x_kw_lbl
                if y_kw_lbl is not None:
                    y_lbl = y_kw_lbl

                # Build display lists: [(title, date), ...]
                for pole, ids in x_ids.items():
                    x_tops[pole] = [
                        titles_map[mid] for mid in ids if mid in titles_map
                    ]
                for pole, ids in y_ids.items():
                    y_tops[pole] = [
                        titles_map[mid] for mid in ids if mid in titles_map
                    ]

        # ── Final label resolution ────────────────────────────────────────────
        # If both motion and ideology paths produced nothing, use generic fallback.
        if x_lbl is None:
            x_lbl = _LABELS["fallback_x"]
            x_int = _INTERPRETATION_TEMPLATES["fallback"].format(orientation="horizontale")
        if y_lbl is None:
            y_lbl = _LABELS["fallback_y"]
            y_int = _INTERPRETATION_TEMPLATES["fallback"].format(orientation="verticale")

        x_quality[wid] = x_q
        y_quality[wid] = y_q
        x_interpretation[wid] = x_int
        y_interpretation[wid] = y_int
        x_top_motions[wid] = x_tops
        y_top_motions[wid] = y_tops
        x_label_confidence[wid] = x_conf
        y_label_confidence[wid] = y_conf

        if is_annual:
            annual_x_labels.append(x_lbl)
            annual_y_labels.append(y_lbl)

    def _modal(labels: List[str], fallback: str) -> str:
        if not labels:
            return fallback
        return Counter(labels).most_common(1)[0][0]

    enriched = dict(axes)
    enriched["x_label"] = _modal(annual_x_labels, "Links\u2013Rechts")
    enriched["y_label"] = _modal(annual_y_labels, "Progressief\u2013Conservatief")
    enriched["x_quality"] = x_quality
    enriched["y_quality"] = y_quality
    enriched["x_interpretation"] = x_interpretation
    enriched["y_interpretation"] = y_interpretation
    enriched["x_top_motions"] = x_top_motions
    enriched["y_top_motions"] = y_top_motions
    enriched["x_label_confidence"] = x_label_confidence
    enriched["y_label_confidence"] = y_label_confidence
    return enriched
```

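One subtlety in the `_modal` helper above: `Counter.most_common` is documented to behave like a stable sort on counts, so on an exact tie the label that was inserted first (i.e. the earliest annual window's label) wins. A quick standalone check:

```python
from collections import Counter

# 2-2 tie between the two labels; the first-inserted one wins
labels = ["Links-Rechts", "Progressief-Conservatief",
          "Progressief-Conservatief", "Links-Rechts"]
print(Counter(labels).most_common(1)[0][0])  # Links-Rechts
```

If a different tie-break (e.g. preferring the most recent window) is ever wanted, `_modal` is the place to change it.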

- [ ] **Step 2: Run full test suite**

```bash
pytest tests/test_political_compass.py -v
```

Expected: all existing tests plus the 4 new tests from Tasks 1–2 pass. In particular, verify the 3 classifier tests from Task 2 and `test_compute_2d_axes_exposes_global_mean` from Task 1 still pass.

- [ ] **Step 3: Commit**

```bash
git add analysis/axis_classifier.py
git commit -m "feat: restructure classify_axes — motion projection as primary label source"
```

---

## Task 5: Add axis-swap logic and tests in `explorer.py`

**Files:**
- Modify: `explorer.py`
- Test: `tests/test_political_compass.py`

- [ ] **Step 1: Write the two failing tests**

Add to `tests/test_political_compass.py`:

```python
def test_axis_swap_when_y_is_left_right():
    """When y_label is 'Links–Rechts' and x_label is not, positions must be swapped."""
    from explorer import _swap_axes

    positions_by_window = {
        "2023": {
            "VVD": (0.5, 0.8),
            "PvdA": (-0.3, -0.6),
        }
    }
    axis_def = {
        "x_label": "Progressief\u2013Conservatief",
        "y_label": "Links\u2013Rechts",
        "x_quality": {"2023": 0.7},
        "y_quality": {"2023": 0.8},
        "x_interpretation": {"2023": "prog interpretation"},
        "y_interpretation": {"2023": "lr interpretation"},
        "x_top_motions": {"2023": {"+": [], "-": []}},
        "y_top_motions": {"2023": {"+": [], "-": []}},
        "x_label_confidence": {"2023": 0.5},
        "y_label_confidence": {"2023": 0.7},
    }

    new_pos, new_ax = _swap_axes(positions_by_window, axis_def)

    # Positions swapped: (x, y) → (y, x)
    assert new_pos["2023"]["VVD"] == (0.8, 0.5)
    assert new_pos["2023"]["PvdA"] == (-0.6, -0.3)

    # Labels swapped
    assert new_ax["x_label"] == "Links\u2013Rechts"
    assert new_ax["y_label"] == "Progressief\u2013Conservatief"

    # Quality swapped
    assert new_ax["x_quality"] == {"2023": 0.8}
    assert new_ax["y_quality"] == {"2023": 0.7}


def test_axis_swap_not_applied_when_x_is_left_right():
    """When x_label is already 'Links–Rechts', no swap should occur."""
    from explorer import _should_swap_axes

    axis_def = {
        "x_label": "Links\u2013Rechts",
        "y_label": "Progressief\u2013Conservatief",
    }
    assert _should_swap_axes(axis_def) is False

    axis_def2 = {
        "x_label": "Links\u2013Rechts",
        "y_label": "Links\u2013Rechts",  # both LR — no swap
    }
    assert _should_swap_axes(axis_def2) is False
```

- [ ] **Step 2: Run tests to verify they fail**

```bash
pytest tests/test_political_compass.py::test_axis_swap_when_y_is_left_right tests/test_political_compass.py::test_axis_swap_not_applied_when_x_is_left_right -v
```

Expected: FAIL — `ImportError: cannot import name '_swap_axes'` / `'_should_swap_axes'`

- [ ] **Step 3: Add `_swap_axes` and `_should_swap_axes` to `explorer.py`**

Add these two functions near the top of `explorer.py`, just before `load_positions` (i.e. before the function that starts around line 184). A good place is after any existing module-level helpers.

```python
def _should_swap_axes(axis_def: dict) -> bool:
    """Return True if the Y axis is 'Links–Rechts' and the X axis is not.

    When true, the caller should swap x/y positions and metadata so left-right
    is conventionally on the horizontal axis.
    """
    lr = "Links\u2013Rechts"
    return axis_def.get("y_label") == lr and axis_def.get("x_label") != lr


def _swap_axes(
    positions_by_window: dict,
    axis_def: dict,
) -> tuple:
    """Swap x and y in all positions and axis metadata.

    Pure function — returns (new_positions_by_window, new_axis_def).
    """
    new_positions: dict = {}
    for wid, pos_dict in positions_by_window.items():
        new_positions[wid] = {ent: (y, x) for ent, (x, y) in pos_dict.items()}

    new_ax = dict(axis_def)
    # Swap paired scalar keys
    new_ax["x_label"] = axis_def.get("y_label")
    new_ax["y_label"] = axis_def.get("x_label")

    # Swap paired dict keys
    for x_key, y_key in [
        ("x_quality", "y_quality"),
        ("x_interpretation", "y_interpretation"),
        ("x_top_motions", "y_top_motions"),
        ("x_label_confidence", "y_label_confidence"),
    ]:
        new_ax[x_key] = axis_def.get(y_key)
        new_ax[y_key] = axis_def.get(x_key)

    return new_positions, new_ax
```

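A useful property to keep in mind when reviewing this: `_swap_axes` is an involution, so applying it twice returns the original data. A standalone check against a trimmed local copy of the function (only two metadata pairs shown, for brevity):

```python
def swap_axes(positions_by_window, axis_def):
    # pure swap of x/y in positions and in paired metadata keys
    new_positions = {
        wid: {ent: (y, x) for ent, (x, y) in pos.items()}
        for wid, pos in positions_by_window.items()
    }
    new_ax = dict(axis_def)
    new_ax["x_label"], new_ax["y_label"] = axis_def.get("y_label"), axis_def.get("x_label")
    for xk, yk in [("x_quality", "y_quality"), ("x_top_motions", "y_top_motions")]:
        new_ax[xk], new_ax[yk] = axis_def.get(yk), axis_def.get(xk)
    return new_positions, new_ax

pos = {"2023": {"VVD": (0.5, 0.8)}}
ax = {"x_label": "A", "y_label": "B", "x_quality": {"2023": 0.7},
      "y_quality": {"2023": 0.9}, "x_top_motions": {}, "y_top_motions": {}}
once_pos, once_ax = swap_axes(pos, ax)
twice_pos, twice_ax = swap_axes(once_pos, once_ax)
print(twice_pos == pos and twice_ax == ax)  # True
```

This invariant also makes a cheap extra assertion if the swap logic is ever extended with more key pairs.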

- [ ] **Step 4: Wire the swap in `load_positions`**

In `explorer.py`, after the `classify_axes` try/except block (currently lines 202–211, ending at `axis_def = classify_axes(...)`), add:

```python
if _should_swap_axes(axis_def):
    positions_by_window, axis_def = _swap_axes(positions_by_window, axis_def)
```

Place this immediately before the `# Filter displayed windows by window_size` comment (currently ~line 213).

- [ ] **Step 5: Run tests to verify they pass**

```bash
pytest tests/test_political_compass.py::test_axis_swap_when_y_is_left_right tests/test_political_compass.py::test_axis_swap_not_applied_when_x_is_left_right -v
```

Expected: both PASS

- [ ] **Step 6: Run full suite**

```bash
pytest tests/test_political_compass.py -v
```

Expected: all tests pass.

- [ ] **Step 7: Commit**

```bash
git add explorer.py tests/test_political_compass.py
git commit -m "feat: add axis swap — left-right goes on horizontal axis when detected"
```

---

## Task 6: Add motion expander UI in `build_compass_tab`

**Files:**
- Modify: `explorer.py`

No new unit tests for this task — it is pure Streamlit rendering, which this suite does not cover. Verify visually after implementation.

- [ ] **Step 1: Add the expander block after `st.plotly_chart`**

In `explorer.py`, find the `st.plotly_chart` call (line ~974) inside `with col1:`. After the two `st.caption` calls (lines ~981–986), add:

```python
# Motion expander — show which motions define each axis for this window
x_top = axis_def.get("x_top_motions", {}).get(window_idx, {})
y_top = axis_def.get("y_top_motions", {}).get(window_idx, {})
x_conf = axis_def.get("x_label_confidence", {}).get(window_idx)
y_conf = axis_def.get("y_label_confidence", {}).get(window_idx)
evr = axis_def.get("explained_variance_ratio", [None, None])
evr0 = evr[0] if evr else None

_has_motion_data = bool(
    x_top.get("+") or x_top.get("-") or y_top.get("+") or y_top.get("-")
)
if _has_motion_data:
    with st.expander("\U0001f50d Wat bepaalt deze assen?"):
        x_conf_pct = f" (vertrouwen: {x_conf:.0%})" if x_conf is not None else ""
        y_conf_pct = f" (vertrouwen: {y_conf:.0%})" if y_conf is not None else ""

        st.markdown(f"**Horizontale as: {_x_label}**{x_conf_pct}")
        x_pos_titles = x_top.get("+", [])
        x_neg_titles = x_top.get("-", [])
        if x_pos_titles:
            labels_pos = " · ".join(
                f"{t} ({d})" for t, d in x_pos_titles[:3]
            )
            st.markdown(f" ➕ {labels_pos}")
        if x_neg_titles:
            labels_neg = " · ".join(
                f"{t} ({d})" for t, d in x_neg_titles[:3]
            )
            st.markdown(f" ➖ {labels_neg}")

        st.markdown(f"**Verticale as: {_y_label}**{y_conf_pct}")
        y_pos_titles = y_top.get("+", [])
        y_neg_titles = y_top.get("-", [])
        if y_pos_titles:
            labels_pos = " · ".join(
                f"{t} ({d})" for t, d in y_pos_titles[:3]
            )
            st.markdown(f" ➕ {labels_pos}")
        if y_neg_titles:
            labels_neg = " · ".join(
                f"{t} ({d})" for t, d in y_neg_titles[:3]
            )
            st.markdown(f" ➖ {labels_neg}")

        if evr0 is not None:
            st.caption(
                f"As 1 verklaart {evr0:.1%} van de variantie in stemgedrag."
            )
```

Note: `_x_label` and `_y_label` are already defined earlier in `build_compass_tab` from `axis_def.get("x_label", …)`. `window_idx` is the currently selected window string. Confirm those variable names match the existing code before inserting.

- [ ] **Step 2: Check that `explained_variance_ratio` is stored in `axis_def`**

Search `analysis/political_axis.py` for where `axes["explained_variance_ratio"]` is set. If it isn't stored, add it:

In `compute_2d_axes`, after `axes["global_mean"] = global_mean` (Task 1), find where the explained variance ratios are computed (the `explained_variance_ratio_` from sklearn PCA, or derived from the numpy SVD). Store the first two:

```python
# evr1 and evr2 stand for the first two explained-variance ratios
# computed earlier in compute_2d_axes (the exact variable names may differ).
axes["explained_variance_ratio"] = [float(evr1), float(evr2)]
```

If it's already stored under a different key, use that key in the expander code instead.

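If the ratio has to be derived from raw SVD output rather than taken from sklearn's `explained_variance_ratio_`, the standard relation is that component *i* explains s_i² / Σ_j s_j² of the variance, where *s* are the singular values of the centered matrix. A standalone numpy sketch (the toy matrix here is illustrative, not the project's data):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(50, 6))   # toy stand-in for the MP-vector matrix
Mc = M - M.mean(axis=0)        # center, as compute_2d_axes does before PCA

s = np.linalg.svd(Mc, compute_uv=False)
evr = s**2 / np.sum(s**2)      # explained variance ratio per component

# evr[0] and evr[1] are what the expander's caption would report
print(evr[:2])
```

The ratios sum to 1 and come out sorted in descending order, matching sklearn's convention.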

- [ ] **Step 3: Run full test suite (sanity check)**

```bash
pytest tests/test_political_compass.py -v
```

Expected: all tests pass (the expander is UI-only, no test required).

- [ ] **Step 4: Commit**

```bash
git add explorer.py
git commit -m "feat: add motion expander to compass tab — shows top motions per axis"
```

---

## Final Verification

- [ ] **Run all tests one last time**

```bash
pytest tests/test_political_compass.py -v
```

Expected output summary: 14+ tests passing (8 existing + 6 new), 0 failing.

- [ ] **Smoke-test the app** (if DB is available)

```bash
streamlit run explorer.py
```

Navigate to the compass tab, select a window, verify:
1. Axis labels show e.g. "Links–Rechts" on X and "Progressief–Conservatief" on Y
2. The "🔍 Wat bepaalt deze assen?" expander appears and shows motions
3. No Python exceptions in the terminal

- [ ] **Final commit (if any cleanup needed)**

```bash
git add -u
git commit -m "fix: address any issues found during smoke test"
```