# Axis Classification Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add `analysis/axis_classifier.py`, which dynamically labels the political compass axes by correlating per-party PCA positions against a party ideology reference CSV, replacing the hardcoded "Links–Rechts" / "Progressief–Conservatief" labels.

**Architecture:** A new module, `analysis/axis_classifier.py`, exposes `classify_axes()`. The function reads two static CSVs (`data/party_ideologies.csv`, `data/coalition_membership.csv`) and returns an enriched copy of the `axes` dict produced by `compute_2d_axes`. `load_positions()` in `explorer.py` calls it after PCA; the compass and trajectories renderers use the resulting `x_label`/`y_label` keys instead of hardcoded strings. The CSVs are committed to git and baked into the Docker image.

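
As a rough illustration of the data flow, the dict handed to the renderers might look like the sketch below. The key names come from this plan; every value is invented for illustration and is not output from the real pipeline.

```python
# Illustrative shape of the dict returned by classify_axes().
# Key names follow this plan; all values below are made-up examples.
enriched_axes = {
    # Keys passed through unchanged from compute_2d_axes (as used in the tests).
    "x_axis": None,
    "y_axis": None,
    "method": "pca",
    # Global labels: the most common label across annual windows.
    "x_label": "Links\u2013Rechts",
    "y_label": "Progressief\u2013Conservatief",
    # Per-window quality: max |r| over the reference dimensions.
    "x_quality": {"2022": 0.91},
    "y_quality": {"2022": 0.72},
    # Per-window Dutch explanations.
    "x_interpretation": {
        "2022": "De horizontale as weerspiegelt de klassieke links-rechts tegenstelling."
    },
    "y_interpretation": {
        "2022": "De verticale as weerspiegelt de progressief-conservatieve tegenstelling."
    },
}

# Renderers read labels with a hardcoded fallback, so a dict without the
# new keys (classifier skipped) still yields a usable axis title.
x_title = enriched_axes.get("x_label", "Links\u2013Rechts")
legacy_title = {"method": "pca"}.get("x_label", "Links\u2013Rechts")
```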

**Tech Stack:** Python stdlib (`pathlib`; the CSVs are parsed by hand with `str.split` rather than the `csv` module), NumPy (already present), Streamlit (already present). No new runtime dependencies.

---

## File Map

| Action | Path | Responsibility |
|---|---|---|
| Create | `data/party_ideologies.csv` | Party `left_right` + `progressive` reference scores |
| Create | `data/coalition_membership.csv` | Per-year coalition party membership |
| Create | `analysis/axis_classifier.py` | `classify_axes()`: correlate positions against reference |
| Modify | `tests/test_political_compass.py` | Add 3 tests for classifier behaviour |
| Modify | `explorer.py:194-209` | Call `classify_axes` inside `load_positions` |
| Modify | `explorer.py:927-928` | Dynamic labels in party-level scatter |
| Modify | `explorer.py:946` | Dynamic labels in MP-level scatter |
| Modify | `explorer.py:1050` | Accept `axis_def` from `load_positions` in trajectories tab |
| Modify | `explorer.py:1120-1121` | Dynamic titles in trajectories chart |

---

### Task 1: Write the three failing tests

**Files:**
- Modify: `tests/test_political_compass.py`

- [ ] **Step 1: Open `tests/test_political_compass.py` and append the three test functions below**

Add this block at the end of the file:

```python
# ---------------------------------------------------------------------------
# Tests for analysis.axis_classifier
# ---------------------------------------------------------------------------

import importlib


def _fresh_classifier(monkeypatch):
    """Import axis_classifier with cleared module-level caches."""
    import analysis.axis_classifier as _cls
    monkeypatch.setattr(_cls, "_ideology_cache", None)
    monkeypatch.setattr(_cls, "_coalition_cache", None)
    return _cls


def test_axis_label_left_right(tmp_path, monkeypatch):
    """Positions that closely correlate with left_right scores → label 'Links–Rechts'."""
    _cls = _fresh_classifier(monkeypatch)

    (tmp_path / "party_ideologies.csv").write_text(
        "party,left_right,progressive\n"
        "VVD,0.65,0.10\n"
        "PvdA,-0.70,0.75\n"
        "SP,-0.90,0.50\n"
        "PVV,0.90,-0.50\n"
        "D66,-0.10,0.85\n"
        "CDA,0.25,-0.45\n"
    )
    (tmp_path / "coalition_membership.csv").write_text("window_id,party\n")

    # X values are the party's left_right scores — perfect correlation
    positions_by_window = {
        "2022": {
            "VVD": (0.65, 0.10),
            "PvdA": (-0.70, 0.20),
            "SP": (-0.90, 0.30),
            "PVV": (0.90, -0.10),
            "D66": (-0.10, 0.40),
            "CDA": (0.25, -0.20),
        }
    }
    axes = {"x_axis": None, "y_axis": None, "method": "pca"}

    result = _cls.classify_axes(
        positions_by_window, axes, str(tmp_path / "motions.db")
    )

    assert result["x_label"] == "Links\u2013Rechts"
    assert result["x_quality"]["2022"] >= 0.65


def test_axis_label_coalition_dominant(tmp_path, monkeypatch):
    """Positions that match coalition pattern but NOT left-right → 'Coalitie–Oppositie'."""
    _cls = _fresh_classifier(monkeypatch)

    (tmp_path / "party_ideologies.csv").write_text(
        "party,left_right,progressive\n"
        "VVD,0.65,0.10\n"
        "PvdA,-0.70,0.75\n"
        "SP,-0.90,0.50\n"
        "PVV,0.90,-0.50\n"
        "D66,-0.10,0.85\n"
        "CDA,0.25,-0.45\n"
    )
    # 2016: Rutte II coalition = VVD + PvdA
    (tmp_path / "coalition_membership.csv").write_text(
        "window_id,party\n"
        "2016,VVD\n"
        "2016,PvdA\n"
    )

    # Coalition parties (VVD + PvdA) at x ≈ +1, opposition at x ≈ -1.
    # VVD (right) and PvdA (left) are both near +1 → low left_right correlation
    # but high coalition correlation.
    positions_by_window = {
        "2016": {
            "VVD": (0.95, 0.10),
            "PvdA": (0.90, 0.20),
            "SP": (-0.85, 0.30),
            "PVV": (-0.95, -0.10),
            "D66": (-0.80, 0.40),
            "CDA": (-0.75, -0.20),
        }
    }
    axes = {"x_axis": None, "y_axis": None, "method": "pca"}

    result = _cls.classify_axes(
        positions_by_window, axes, str(tmp_path / "motions.db")
    )

    assert result["x_label"] == "Coalitie\u2013Oppositie"
    assert "coalitie" in result["x_interpretation"]["2016"].lower()


def test_axis_classifier_missing_csv(tmp_path, monkeypatch):
    """Missing party_ideologies.csv → returns axes dict unchanged, no exception."""
    _cls = _fresh_classifier(monkeypatch)

    # No CSVs written — directory exists but files do not
    positions_by_window = {"2022": {"VVD": (1.0, 0.5), "PvdA": (-1.0, 0.3)}}
    axes = {"x_axis": None, "y_axis": None, "method": "pca"}

    result = _cls.classify_axes(
        positions_by_window, axes, str(tmp_path / "motions.db")
    )

    # Must not crash and must return the original axes dict unchanged
    assert result is axes
    assert "x_label" not in result
```
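
The `_fresh_classifier` helper exists because the planned loaders cache at module level. A toy sketch of why that matters is shown below; `load_scores` and `_cache` are hypothetical stand-ins, not the real loader or cache names.

```python
from pathlib import Path
from typing import Dict, Optional

# Hypothetical stand-in for a module-level cache such as _ideology_cache.
_cache: Optional[Dict[str, str]] = None


def load_scores(csv_path: Path) -> Dict[str, str]:
    """Toy loader that mirrors the caching pattern (it does not read any file)."""
    global _cache
    if _cache is not None:
        return _cache  # csv_path is ignored once the cache is populated
    _cache = {"loaded_from": str(csv_path)}
    return _cache


first = load_scores(Path("a.csv"))
second = load_scores(Path("another.csv"))  # different path, same cached dict
stale = second is first

# Resetting the cache is what monkeypatch.setattr(module, "_cache", None)
# achieves in a test; afterwards the new path is honoured.
_cache = None
fresh = load_scores(Path("another.csv"))
```

Without the reset, a test that writes its own CSVs into `tmp_path` could silently receive data cached by an earlier test.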

- [ ] **Step 2: Run the tests to confirm they fail (module doesn't exist yet)**

```bash
uv run pytest tests/test_political_compass.py::test_axis_label_left_right tests/test_political_compass.py::test_axis_label_coalition_dominant tests/test_political_compass.py::test_axis_classifier_missing_csv -v
```

Expected: 3 failures like `ModuleNotFoundError: No module named 'analysis.axis_classifier'`

---

### Task 2: Create the reference data files

**Files:**
- Create: `data/party_ideologies.csv`
- Create: `data/coalition_membership.csv`

- [ ] **Step 1: Create `data/party_ideologies.csv`**

```
party,left_right,progressive
VVD,0.65,0.10
PvdA,-0.70,0.75
SP,-0.90,0.50
CDA,0.25,-0.45
D66,-0.10,0.85
GroenLinks,-0.70,0.90
GL,-0.70,0.90
GroenLinks-PvdA,-0.70,0.82
ChristenUnie,0.10,-0.55
SGP,0.35,-0.95
PVV,0.90,-0.50
DENK,-0.40,0.55
50Plus,-0.05,-0.10
FVD,0.90,-0.75
PvdD,-0.60,0.85
Volt,-0.20,0.80
JA21,0.70,-0.30
BBB,0.50,-0.35
NSC,0.20,-0.20
Nieuw Sociaal Contract,0.20,-0.20
BVNL,0.85,-0.55
Bij1,-0.90,0.90
```
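
Because the reference scores are hand-maintained, a quick sanity check before committing may be worthwhile. The sketch below is not part of the plan: `check_ideology_rows` is a hypothetical helper, it uses the `csv` module for brevity (the planned loader parses by hand), and the `[-1, 1]` score range is an assumption inferred from the values above.

```python
import csv
import io

# A three-row excerpt of the reference file, inlined so the sketch runs standalone.
SAMPLE = """party,left_right,progressive
VVD,0.65,0.10
PvdA,-0.70,0.75
SGP,0.35,-0.95
"""


def check_ideology_rows(text: str) -> list:
    """Return a list of problems found; an empty list means the rows look sane."""
    problems = []
    seen = set()
    for row in csv.DictReader(io.StringIO(text)):
        party = row["party"].strip()
        if party in seen:
            problems.append(f"duplicate party: {party}")
        seen.add(party)
        for col in ("left_right", "progressive"):
            try:
                score = float(row[col])
            except (TypeError, ValueError):
                problems.append(f"{party}: {col} is not a number")
                continue
            # Assumed convention (not stated in the plan): scores lie in [-1, 1].
            if not -1.0 <= score <= 1.0:
                problems.append(f"{party}: {col}={score} outside [-1, 1]")
    return problems


problems = check_ideology_rows(SAMPLE)
out_of_range = check_ideology_rows("party,left_right,progressive\nX,2.0,0.1\n")
```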

- [ ] **Step 2: Create `data/coalition_membership.csv`**

```
window_id,party
2012,VVD
2012,PvdA
2013,VVD
2013,PvdA
2014,VVD
2014,PvdA
2015,VVD
2015,PvdA
2016,VVD
2016,PvdA
2017,VVD
2017,CDA
2017,D66
2017,ChristenUnie
2018,VVD
2018,CDA
2018,D66
2018,ChristenUnie
2019,VVD
2019,CDA
2019,D66
2019,ChristenUnie
2020,VVD
2020,CDA
2020,D66
2020,ChristenUnie
2021,VVD
2021,CDA
2021,D66
2021,ChristenUnie
2022,VVD
2022,D66
2022,CDA
2022,ChristenUnie
2023,VVD
2023,D66
2023,CDA
2023,ChristenUnie
2024,PVV
2024,VVD
2024,NSC
2024,BBB
2025,PVV
2025,VVD
2025,NSC
2025,BBB
2026,PVV
2026,VVD
2026,NSC
2026,BBB
```
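
The classifier only correlates parties that appear in the ideology reference, so a coalition party spelled differently in the two files would drop out of the correlation without any error. A minimal cross-check could look like the sketch below; `coalition_by_window` and `unknown_parties` are hypothetical helper names, and the inlined data is an excerpt rather than the full files.

```python
import csv
import io
from collections import defaultdict

# Excerpt of party names from party_ideologies.csv (not the full list).
IDEOLOGY_PARTIES = {"VVD", "PvdA", "CDA", "D66", "ChristenUnie"}

COALITION_SAMPLE = """window_id,party
2016,VVD
2016,PvdA
2017,VVD
2017,CDA
2017,D66
2017,ChristenUnie
"""


def coalition_by_window(text: str) -> dict:
    """Group rows into {window_id: set_of_party_names}."""
    grouped = defaultdict(set)
    for row in csv.DictReader(io.StringIO(text)):
        grouped[row["window_id"].strip()].add(row["party"].strip())
    return dict(grouped)


def unknown_parties(grouped: dict, known: set) -> dict:
    """Return {window_id: parties missing from the ideology reference}."""
    return {
        wid: parties - known
        for wid, parties in grouped.items()
        if parties - known
    }


grouped = coalition_by_window(COALITION_SAMPLE)
missing = unknown_parties(grouped, IDEOLOGY_PARTIES)
```

An empty `missing` dict means every coalition party in the sample has a matching ideology row.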

- [ ] **Step 3: Verify the files are NOT excluded by .gitignore**

```bash
git check-ignore -v data/party_ideologies.csv data/coalition_membership.csv
```

Expected: no output. The files are not ignored, because `.gitignore` only excludes `data/*.db`, `data/*.bak`, and `data/*.json`.

---

### Task 3: Implement `analysis/axis_classifier.py`

**Files:**
- Create: `analysis/axis_classifier.py`

- [ ] **Step 1: Create the file with this full implementation**

```python
"""Axis classifier: correlate per-party PCA positions against ideology reference data
to assign honest, dynamic labels to political compass axes.

Public API: classify_axes(positions_by_window, axes, db_path) -> dict
"""
import logging
from collections import Counter
from pathlib import Path
from typing import Dict, List, Optional, Tuple

import numpy as np

_logger = logging.getLogger(__name__)

# Module-level caches — loaded once per process lifetime.
_ideology_cache: Optional[Dict[str, Dict[str, float]]] = None
_coalition_cache: Optional[Dict[str, set]] = None

# Correlation threshold above which we consider an axis "explained" by a dimension.
_THRESHOLD = 0.65

_LABELS = {
    "lr": "Links\u2013Rechts",
    "co": "Coalitie\u2013Oppositie",
    "pc": "Progressief\u2013Conservatief",
    "fallback_x": "Stempatroon As 1",
    "fallback_y": "Stempatroon As 2",
}

_INTERPRETATION_TEMPLATES = {
    "lr": "De {orientation} as weerspiegelt de klassieke links-rechts tegenstelling.",
    "co": (
        "De {orientation} as weerspiegelt stemgedrag van coalitie- versus "
        "oppositiepartijen (r={r:.2f}). Links-rechts is minder dominant dit jaar."
    ),
    "pc": "De {orientation} as weerspiegelt de progressief-conservatieve tegenstelling.",
    "fallback": (
        "De {orientation} as weerspiegelt een empirisch stempatroon "
        "zonder duidelijke ideologische richting."
    ),
}


def _load_ideology(csv_path: Path) -> Dict[str, Dict[str, float]]:
    """Load party ideology scores from CSV.

    Returns {party_name: {"left_right": float, "progressive": float}}.
    Returns {} on any error (caller should treat empty as 'skip classification').
    """
    global _ideology_cache
    if _ideology_cache is not None:
        return _ideology_cache
    result: Dict[str, Dict[str, float]] = {}
    try:
        with open(csv_path, encoding="utf-8") as fh:
            lines = fh.read().splitlines()
        header = [h.strip() for h in lines[0].split(",")]
        lr_idx = header.index("left_right")
        pc_idx = header.index("progressive")
        for line in lines[1:]:
            if not line.strip():
                continue
            parts = [p.strip() for p in line.split(",")]
            if len(parts) <= max(lr_idx, pc_idx):
                continue
            result[parts[0]] = {
                "left_right": float(parts[lr_idx]),
                "progressive": float(parts[pc_idx]),
            }
    except FileNotFoundError:
        _logger.warning("party_ideologies.csv not found at %s — axis labels will be generic", csv_path)
        return {}
    except Exception as exc:
        _logger.warning("Failed to load party_ideologies.csv: %s", exc)
        return {}
    _ideology_cache = result
    return result


def _load_coalition(csv_path: Path) -> Dict[str, set]:
    """Load coalition membership from CSV.

    Returns {window_id: set_of_party_names}.
    Returns {} on any error (coalition dimension will be skipped).
    """
    global _coalition_cache
    if _coalition_cache is not None:
        return _coalition_cache
    result: Dict[str, set] = {}
    try:
        with open(csv_path, encoding="utf-8") as fh:
            lines = fh.read().splitlines()
        for line in lines[1:]:
            if not line.strip():
                continue
            parts = [p.strip() for p in line.split(",")]
            if len(parts) < 2:
                continue
            wid, party = parts[0], parts[1]
            result.setdefault(wid, set()).add(party)
    except FileNotFoundError:
        _logger.warning(
            "coalition_membership.csv not found at %s — coalition axis detection disabled", csv_path
        )
        return {}
    except Exception as exc:
        _logger.warning("Failed to load coalition_membership.csv: %s", exc)
        return {}
    _coalition_cache = result
    return result


def _window_year(window_id: str) -> Optional[str]:
    """Extract year string from window_id.

    Returns None for 'current_parliament'.
    '2016' → '2016', '2016-Q3' → '2016'.
    """
    if window_id == "current_parliament":
        return None
    return window_id.split("-")[0]


def _pearsonr(x: List[float], y: List[float]) -> float:
    """Pearson r; returns 0.0 for degenerate input (< 3 points or zero variance)."""
    if len(x) < 3:
        return 0.0
    xa = np.array(x, dtype=float)
    ya = np.array(y, dtype=float)
    if xa.std() < 1e-12 or ya.std() < 1e-12:
        return 0.0
    return float(np.corrcoef(xa, ya)[0, 1])


def _assign_label(
    r_lr: float,
    r_co: float,
    r_pc: float,
    axis: str,
) -> Tuple[str, str, float]:
    """Assign label, interpretation and quality score for one axis.

    Priority: left-right > coalition > progressive > fallback.
    Returns (label, interpretation_string, quality_score).
    """
    orientation = "horizontale" if axis == "x" else "verticale"
    fallback_label = _LABELS["fallback_x"] if axis == "x" else _LABELS["fallback_y"]
    quality = max(abs(r_lr), abs(r_co), abs(r_pc))

    if abs(r_lr) >= _THRESHOLD:
        return (
            _LABELS["lr"],
            _INTERPRETATION_TEMPLATES["lr"].format(orientation=orientation),
            quality,
        )
    if abs(r_co) >= _THRESHOLD:
        return (
            _LABELS["co"],
            _INTERPRETATION_TEMPLATES["co"].format(orientation=orientation, r=r_co),
            quality,
        )
    if abs(r_pc) >= _THRESHOLD:
        return (
            _LABELS["pc"],
            _INTERPRETATION_TEMPLATES["pc"].format(orientation=orientation),
            quality,
        )
    return (
        fallback_label,
        _INTERPRETATION_TEMPLATES["fallback"].format(orientation=orientation),
        quality,
    )


def classify_axes(
    positions_by_window: Dict[str, Dict[str, Tuple[float, float]]],
    axes: dict,
    db_path: str,
) -> dict:
    """Classify compass axes by correlating per-party positions against ideology reference data.

    Enriches ``axes`` with:
      x_label, y_label — global label (modal across annual windows)
      x_quality, y_quality — {window_id: float} max |r| for each window
      x_interpretation — {window_id: str} Dutch explanation per window
      y_interpretation — {window_id: str} Dutch explanation per window

    Returns the original ``axes`` dict unchanged if reference data is unavailable.
    """
    data_dir = Path(db_path).parent
    ideology = _load_ideology(data_dir / "party_ideologies.csv")
    if not ideology:
        return axes  # no reference data — preserve existing behaviour

    coalition = _load_coalition(data_dir / "coalition_membership.csv")

    x_quality: Dict[str, float] = {}
    y_quality: Dict[str, float] = {}
    x_interpretation: Dict[str, str] = {}
    y_interpretation: Dict[str, str] = {}
    annual_x_labels: List[str] = []
    annual_y_labels: List[str] = []

    for wid, pos_dict in positions_by_window.items():
        year = _window_year(wid)
        is_current = wid == "current_parliament"
        is_annual = not is_current and "-" not in wid  # e.g. "2016" not "2016-Q3"

        # Only use parties present in both the positions and the ideology reference.
        parties = [p for p in pos_dict if p in ideology]
        if len(parties) < 5:
            _logger.debug(
                "Skipping axis classification for %s: only %d reference parties (need 5)",
                wid,
                len(parties),
            )
            continue

        party_x = [pos_dict[p][0] for p in parties]
        party_y = [pos_dict[p][1] for p in parties]
        ref_lr = [ideology[p]["left_right"] for p in parties]
        ref_pc = [ideology[p]["progressive"] for p in parties]

        # Coalition dummy: +1 if in government that year, -1 otherwise.
        # current_parliament and windows with no coalition data use a neutral vector.
        if year and coalition and year in coalition:
            gov_set = coalition[year]
            ref_co = [1.0 if p in gov_set else -1.0 for p in parties]
        else:
            ref_co = [0.0] * len(parties)  # neutral — will never exceed threshold

        r_lr_x = _pearsonr(party_x, ref_lr)
        r_co_x = _pearsonr(party_x, ref_co)
        r_pc_x = _pearsonr(party_x, ref_pc)
        x_lbl, x_int, x_q = _assign_label(r_lr_x, r_co_x, r_pc_x, "x")

        r_lr_y = _pearsonr(party_y, ref_lr)
        r_co_y = _pearsonr(party_y, ref_co)
        r_pc_y = _pearsonr(party_y, ref_pc)
        y_lbl, y_int, y_q = _assign_label(r_lr_y, r_co_y, r_pc_y, "y")

        x_quality[wid] = x_q
        y_quality[wid] = y_q
        x_interpretation[wid] = x_int
        y_interpretation[wid] = y_int

        # Only annual windows vote on the global label (not quarterly, not current_parliament).
        if is_annual:
            annual_x_labels.append(x_lbl)
            annual_y_labels.append(y_lbl)

    def _modal(labels: List[str], fallback: str) -> str:
        if not labels:
            return fallback
        return Counter(labels).most_common(1)[0][0]

    enriched = dict(axes)
    enriched["x_label"] = _modal(annual_x_labels, "Links\u2013Rechts")
    enriched["y_label"] = _modal(annual_y_labels, "Progressief\u2013Conservatief")
    enriched["x_quality"] = x_quality
    enriched["y_quality"] = y_quality
    enriched["x_interpretation"] = x_interpretation
    enriched["y_interpretation"] = y_interpretation
    return enriched
```
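
To make the coalition dummy and the label priority concrete, here is a dependency-free sketch that reuses the six positions from the coalition test in Task 1. `pearson_r` is a hand-rolled stand-in for the NumPy-based `_pearsonr` above, and the branch at the end is simplified: it checks only the left-right and coalition dimensions, not the progressive one.

```python
import math
from typing import List


def pearson_r(x: List[float], y: List[float]) -> float:
    """Plain Pearson correlation; returns 0.0 when either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    denom = math.sqrt(sum(d * d for d in dx) * sum(d * d for d in dy))
    if denom == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / denom


# Party order: VVD, PvdA, SP, PVV, D66, CDA (values from the coalition test).
x_positions = [0.95, 0.90, -0.85, -0.95, -0.80, -0.75]
left_right = [0.65, -0.70, -0.90, 0.90, -0.10, 0.25]
coalition_dummy = [1.0, 1.0, -1.0, -1.0, -1.0, -1.0]  # VVD + PvdA govern

r_lr = pearson_r(x_positions, left_right)
r_co = pearson_r(x_positions, coalition_dummy)
r_neutral = pearson_r(x_positions, [0.0] * 6)  # the "no coalition data" vector

THRESHOLD = 0.65
if abs(r_lr) >= THRESHOLD:
    label = "Links\u2013Rechts"
elif abs(r_co) >= THRESHOLD:
    label = "Coalitie\u2013Oppositie"
else:
    label = "Stempatroon As 1"
```

With these inputs `r_lr` lands close to zero while `r_co` is close to 1, so the coalition label wins even though left-right has priority. The neutral all-zero vector has no variance and therefore always yields 0.0.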

- [ ] **Step 2: Run the three new tests**

```bash
uv run pytest tests/test_political_compass.py::test_axis_label_left_right tests/test_political_compass.py::test_axis_label_coalition_dominant tests/test_political_compass.py::test_axis_classifier_missing_csv -v
```

Expected: all 3 PASS

- [ ] **Step 3: Run the full test suite to confirm no regressions**

```bash
uv run pytest tests/test_political_compass.py -v
```

Expected: all tests PASS (5 original + 3 new = 8 total)

- [ ] **Step 4: Commit**

```bash
git add data/party_ideologies.csv data/coalition_membership.csv analysis/axis_classifier.py tests/test_political_compass.py
git commit -m "feat: add axis classifier with party ideology reference data

classify_axes() correlates per-party PCA positions against party_ideologies.csv
to assign honest dynamic labels (Links-Rechts, Coalitie-Oppositie, etc.)
instead of always assuming the first PCA axis is left-right."
```

---

### Task 4: Wire classify_axes into load_positions

**Files:**
- Modify: `explorer.py:194-209`

- [ ] **Step 1: In `load_positions()`, add the classify_axes call after `compute_2d_axes` returns**

Find this block (lines 194–209):

```python
positions_by_window, axis_def = compute_2d_axes(
    db_path,
    window_ids=all_available,
    method="pca",
    pca_residual=True,
    normalize_vectors=True,
)

# Filter displayed windows by window_size AFTER PCA computation.
if window_size == "annual":
```

Replace with:

```python
positions_by_window, axis_def = compute_2d_axes(
    db_path,
    window_ids=all_available,
    method="pca",
    pca_residual=True,
    normalize_vectors=True,
)

try:
    from analysis.axis_classifier import classify_axes
    axis_def = classify_axes(positions_by_window, axis_def, db_path)
except Exception:
    import logging
    logging.getLogger(__name__).exception("classify_axes failed; using generic axis labels")

# Filter displayed windows by window_size AFTER PCA computation.
if window_size == "annual":
```
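
The guard above is meant to keep the compass usable even if classification fails. The sketch below isolates that pattern so it can be exercised on its own; `enrich_or_keep` and `failing_classifier` are hypothetical names used only for this illustration.

```python
import logging


def failing_classifier(positions, axes, db_path):
    """Stand-in for classify_axes that always fails."""
    raise ValueError("malformed reference data")


def enrich_or_keep(positions, axis_def, db_path, classifier):
    """Return the classifier's result, or the untouched axis_def on any error."""
    try:
        return classifier(positions, axis_def, db_path)
    except Exception:
        # Log the traceback but do not let labelling break the page.
        logging.getLogger("explorer").exception(
            "classify_axes failed; using generic axis labels"
        )
        return axis_def


axis_def = {"method": "pca"}
result = enrich_or_keep({}, axis_def, "data/motions.db", failing_classifier)
```

Because `result` is the original dict, later lookups such as `axis_def.get("x_label", ...)` fall back to their defaults.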

- [ ] **Step 2: Run the full test suite**

```bash
uv run pytest tests/test_political_compass.py -v
```

Expected: all 8 tests PASS

---

### Task 5: Use dynamic labels in the compass scatter plots

**Files:**
- Modify: `explorer.py:927-928` and `explorer.py:946`

The `axis_def` variable is already in scope in `build_compass_tab`; it is returned by `load_positions` at line 817.

- [ ] **Step 1: Add helper variables just before the first `px.scatter` call**

In `build_compass_tab`, locate the block that starts building the figure: the `if level == "Partijen":` branch, whose party-level scatter contains the line `title=f"Politiek Kompas — {_window_label(window_idx)} (partijen)",` (around line 925). Add the two helper variables immediately before that `if`, which is after `axis_def` becomes available at line 817:

```python
_x_label = axis_def.get("x_label", "Links\u2013Rechts")
_y_label = axis_def.get("y_label", "Progressief\u2013Conservatief")
```

- [ ] **Step 2: Replace the hardcoded label in the party-level scatter (around line 927–928)**

Find:
```python
labels={
    "x": "Links \u2190 \u2192 Rechts",
    "y": "Progressief / Conservatief",
    "n": "Kamerleden",
},
```

Replace with:
```python
labels={
    "x": _x_label,
    "y": _y_label,
    "n": "Kamerleden",
},
```

- [ ] **Step 3: Replace the hardcoded label in the MP-level scatter (around line 946)**

Find:
```python
labels={"x": "Links \u2190 \u2192 Rechts", "y": "Progressief / Conservatief"},
```

Replace with:
```python
labels={"x": _x_label, "y": _y_label},
```

- [ ] **Step 4: Add the per-year interpretation caption after the chart is rendered**

Find (around line 955–959):
```python
_add_y_direction_annotations(fig)

with col1:
    st.plotly_chart(fig, use_container_width=True)
```

Replace with:
```python
_add_y_direction_annotations(fig)

with col1:
    st.plotly_chart(fig, use_container_width=True)
    _x_interp = axis_def.get("x_interpretation", {}).get(window_idx, "")
    _y_interp = axis_def.get("y_interpretation", {}).get(window_idx, "")
    if _x_interp and axis_def.get("x_quality", {}).get(window_idx, 1.0) < _THRESHOLD:
        st.caption(_x_interp)
    if _y_interp and axis_def.get("y_quality", {}).get(window_idx, 1.0) < _THRESHOLD:
        st.caption(_y_interp)
```

Also add the constant `_THRESHOLD = 0.65` (the same value as `_THRESHOLD` in `analysis/axis_classifier.py`) near the top of `explorer.py`, alongside the other module-level constants after the imports. Search for an existing constant such as `_SPARSE_YEARS` to find the right location. If no suitable spot exists, add it immediately before `build_compass_tab`.
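
The caption rule from Step 4 can be sketched as a small pure function, which makes the condition easier to reason about outside Streamlit. `captions_for_window` is a hypothetical helper, not something the plan adds to `explorer.py`, and the `axis_def` values are invented.

```python
from typing import Dict, List

THRESHOLD = 0.65  # mirrors _THRESHOLD in the plan


def captions_for_window(axis_def: Dict, window_id: str) -> List[str]:
    """Return the interpretation texts that Step 4 would show for one window."""
    captions = []
    for axis in ("x", "y"):
        text = axis_def.get(f"{axis}_interpretation", {}).get(window_id, "")
        # A missing quality defaults to 1.0, which suppresses the caption.
        quality = axis_def.get(f"{axis}_quality", {}).get(window_id, 1.0)
        if text and quality < THRESHOLD:
            captions.append(text)
    return captions


# Invented example: the x axis is well explained, the y axis is not.
example_axis_def = {
    "x_quality": {"2022": 0.91},
    "y_quality": {"2022": 0.40},
    "x_interpretation": {"2022": "x-uitleg"},
    "y_interpretation": {"2022": "y-uitleg"},
}

shown = captions_for_window(example_axis_def, "2022")
shown_without_classifier = captions_for_window({"method": "pca"}, "2022")
```

Only the weakly explained axis produces a caption, and a dict without classifier keys produces none.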

- [ ] **Step 5: Run the full test suite**

```bash
uv run pytest tests/test_political_compass.py -v
```

Expected: all 8 tests PASS

---

### Task 6: Update the trajectories chart labels

**Files:**
- Modify: `explorer.py:1050` and `explorer.py:1120-1121`

- [ ] **Step 1: In `build_trajectories_tab`, capture `axis_def` from `load_positions`**

Find (around line 1050):
```python
positions_by_window, _ = load_positions(db_path, window_size)
```

Replace with:
```python
positions_by_window, axis_def = load_positions(db_path, window_size)
```

- [ ] **Step 2: Replace hardcoded axis titles in the trajectories chart (around line 1120–1121)**

Find:
```python
xaxis_title="Links \u2190 \u2192 Rechts",
yaxis_title="Progressief / Conservatief",
```

Replace with:
```python
xaxis_title=axis_def.get("x_label", "Links\u2013Rechts"),
yaxis_title=axis_def.get("y_label", "Progressief\u2013Conservatief"),
```

- [ ] **Step 3: Run the full test suite one final time**

```bash
uv run pytest tests/test_political_compass.py -v
```

Expected: all 8 tests PASS

- [ ] **Step 4: Final commit**

```bash
git add explorer.py
git commit -m "feat: use dynamic axis labels in compass and trajectories UI

Replace hardcoded 'Links-Rechts' / 'Progressief-Conservatief' axis labels
with values from classify_axes(). Add per-year interpretation caption when
axis quality score is below the 0.65 correlation threshold."
```

---

## Self-Review

### Spec coverage check

| Spec requirement | Covered by |
|---|---|
| `analysis/axis_classifier.py` with `classify_axes()` | Task 3 |
| CSV paths derived from `Path(db_path).parent` | Task 3 (line in implementation) |
| Pearson r for left_right, progressive, coalition dimensions | Task 3 (`_pearsonr`, `_assign_label`) |
| Priority: lr > coalition > progressive > fallback | Task 3 (`_assign_label`) |
| Global label = modal across annual windows | Task 3 (`_modal`, `is_annual` flag) |
| `current_parliament` excluded from modal vote | Task 3 (`is_current`, `is_annual` check) |
| Quarterly windows excluded from modal vote | Task 3 (`is_annual` = no `-` in wid) |
| Backward-compatible when CSVs missing | Task 3 (`_load_ideology` returns `{}`; `classify_axes` returns original `axes`) |
| `data/party_ideologies.csv` committed to git | Task 2 |
| `data/coalition_membership.csv` committed to git | Task 2 |
| `load_positions` calls `classify_axes` | Task 4 |
| Dynamic x/y labels in compass scatter | Task 5 Steps 2–3 |
| Per-year caption when quality < 0.65 | Task 5 Step 4 |
| Dynamic labels in trajectories chart | Task 6 |
| 3 tests: left_right, coalition, missing CSV | Task 1 |

All spec requirements covered. No gaps.
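
The "modal across annual windows" requirement can be illustrated in isolation. This sketch copies the logic of `_modal` into a standalone function named `modal_label` (a name chosen here for illustration) and uses invented per-year labels.

```python
from collections import Counter
from typing import List


def modal_label(labels: List[str], fallback: str) -> str:
    """Most common label, or the fallback when no window voted."""
    if not labels:
        return fallback
    return Counter(labels).most_common(1)[0][0]


# Invented per-year labels: one coalition-dominated year among left-right years.
annual = ["Links\u2013Rechts", "Coalitie\u2013Oppositie", "Links\u2013Rechts"]

global_label = modal_label(annual, "Links\u2013Rechts")
no_votes = modal_label([], "Links\u2013Rechts")
tie = modal_label(["A", "B"], "fallback")
```

On a tie, `Counter.most_common` keeps the order in which elements were first encountered, so the label of the earliest window in iteration order wins.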

### Placeholder scan

No TBDs, TODOs, or vague steps present.

### Type consistency

- `classify_axes` returns `dict` with keys `x_label` (str), `y_label` (str), `x_quality` (dict), `y_quality` (dict), `x_interpretation` (dict), `y_interpretation` (dict); this is consistent across Tasks 3, 4, 5, 6.
- `_THRESHOLD` is used in Task 5 Step 4; the constant is introduced in that same step.
- `axis_def.get("x_label", "Links–Rechts")` matches the key name `"x_label"` set in Task 3.