| date | topic | status |
|---|---|---|
| 2026-03-29 | Bootstrap confidence intervals and data enrichment | validated |
Bootstrap Confidence Intervals & Data Enrichment
Problem Statement
The SVD axis charts show party centroid scores as point estimates with no indication of reliability. Volt (N=1) and D66 (N=49) look equally confident. Additionally:
- 2016–2018 motions lack body text, weakening embedding quality for those windows
- party_svd_scores.json is a stale ad-hoc file that is missing NSC and should be deleted
Constraints
- No re-SVD per bootstrap replicate — too expensive, only centroid uncertainty needed
- Single-window bootstrap only: party scores come from current_parliament raw SVD vectors, not the Procrustes pipeline
- Functional Python, using existing patterns (uv, duckdb, numpy)
- Don't break existing Streamlit rendering — error bars are additive
- Fixed random seed for reproducibility
Approach
Single-window centroid bootstrap. For each party, resample its N MPs with replacement 1000×, recompute centroid per replicate, take percentile CIs. Cheap (no re-SVD needed), directly answers "how reliable is this score?".
Rejected alternatives:
- Multi-window Procrustes bootstrap: 1000× SVD cost, requires orientation canonicalization. Overkill.
- Analytical SE (std/sqrt(N)): assumes normality, misses skewed distributions.
Components
A. Download Script Enhancement (scripts/download_past_year.py)
Add two CLI flags:
- --skip-details (default: True, matching current hardcoded behavior): when False, fetches body text via _get_motion_details → _fetch_body_text
- --update-existing (default: False): when True, re-processes motions already in the DB to fetch missing body_text and update the record
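The two flags could be wired up roughly as follows. This is a minimal sketch that assumes the script uses argparse; build_parser and should_fetch_details are hypothetical names, and using argparse.BooleanOptionalAction (Python 3.9+) is one possible way to let a default-True flag be turned off via --no-skip-details.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser with the two proposed flags (illustrative sketch)."""
    parser = argparse.ArgumentParser(description="Download motions for a date range")
    parser.add_argument("--start-date", required=True, help="YYYY-MM-DD")
    parser.add_argument("--end-date", required=True, help="YYYY-MM-DD")
    # BooleanOptionalAction also generates --no-skip-details, which is how a
    # flag that defaults to True can be switched off from the command line.
    parser.add_argument(
        "--skip-details",
        action=argparse.BooleanOptionalAction,
        default=True,
        help="Skip fetching body text (matches current hardcoded behavior)",
    )
    parser.add_argument(
        "--update-existing",
        action="store_true",
        help="Re-process motions already in the DB that lack body_text",
    )
    return parser


def should_fetch_details(args: argparse.Namespace) -> bool:
    # Hypothetical helper: --update-existing always fetches details for the
    # targeted rows, regardless of --skip-details.
    return args.update_existing or not args.skip_details
```

With this shape, the one-off backfill run only needs --update-existing; --skip-details keeps its default and is ignored for the targeted rows.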
The update-existing flow:
- Query motions table for rows WHERE date BETWEEN start_date AND end_date AND (body_text IS NULL OR body_text = '')
- Extract besluit_id from the URL column (format: https://www.tweedekamer.nl/kamerstukken/stemmingsuitslagen/{besluit_id}; take the last path segment)
- For each such motion, call api._get_motion_details(besluit_id) to fetch body_text
- UPDATE the motions row with the new body_text (and title/description if also missing)
Note: the motions table has no besluit_id column — it's only embedded in the URL. The update flow must parse it from the URL.
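The URL parsing step might look like the sketch below. The helper name extract_besluit_id and the column names in the query (id, url, date, body_text) are assumptions for illustration; only the URL format and the selection criteria come from the description above.

```python
from typing import Optional
from urllib.parse import urlparse

# Selection query mirroring the criteria above. Column names are assumed;
# the two placeholders are start_date and end_date.
MISSING_BODY_SQL = """
    SELECT id, url
    FROM motions
    WHERE date BETWEEN ? AND ?
      AND (body_text IS NULL OR body_text = '')
"""


def extract_besluit_id(url: Optional[str]) -> Optional[str]:
    """Return the last path segment of a motion URL, or None if untraceable."""
    if not url:
        return None
    # Drop empty segments so a trailing slash does not produce an empty id.
    segments = [s for s in urlparse(url).path.split("/") if s]
    # Expect .../stemmingsuitslagen/{besluit_id}; anything else is skipped.
    if len(segments) < 2 or segments[-2] != "stemmingsuitslagen":
        return None
    return segments[-1]
```

Returning None for URLs that do not match the expected shape lets the update loop skip and log those motions instead of calling the API with a bogus id.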
Run once after implementation: --start-date 2016-01-01 --end-date 2018-12-31 --update-existing
(No need for --skip-details when using --update-existing — it always fetches details for the targeted rows.)
B. Bootstrap Computation (analysis/political_axis.py)
New function:
    compute_party_bootstrap_cis(
        party_vectors: Dict[str, List[np.ndarray]],
        n_boot: int = 1000,
        ci: float = 95.0,
        seed: int = 42
    ) -> Dict[str, Dict]
Input: party_vectors is a dict mapping party name → list of individual MP vectors (each a numpy array of length 50). The caller (explorer.py) builds this from DB queries using existing mp→party mapping logic.
Returns per-party:
    {
        "PVV": {
            "centroid": [50 floats],
            "ci_lower": [50 floats],
            "ci_upper": [50 floats],
            "std": [50 floats],
            "n_mps": 19
        },
        ...
    }
Algorithm:
- Receive pre-grouped party_vectors from caller
- For each party with N >= 2:
  - Create numpy Generator with fixed seed
  - For each of n_boot replicates: sample N indices with replacement, compute mean vector
  - Compute percentile CIs at (alpha/2, 100 - alpha/2), where alpha = 100 - ci (2.5 and 97.5 for ci = 95), and std across replicates per dimension
- For parties with N = 1: set ci_lower == ci_upper == centroid, std = 0, flag n_mps = 1
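The steps above can be sketched as follows. This is one possible reading of the algorithm, not a final implementation: it vectorizes the replicate loop with a single index matrix, re-seeds the generator per party (an interpretation of "fixed seed" that keeps each party's result independent of dict order), and silently skips parties with an empty vector list, which the spec does not address.

```python
from typing import Dict, List

import numpy as np


def compute_party_bootstrap_cis(
    party_vectors: Dict[str, List[np.ndarray]],
    n_boot: int = 1000,
    ci: float = 95.0,
    seed: int = 42,
) -> Dict[str, Dict]:
    """Percentile bootstrap CIs for party centroids (illustrative sketch)."""
    alpha = 100.0 - ci
    results: Dict[str, Dict] = {}
    for party, vectors in party_vectors.items():
        if not vectors:
            continue  # assumption: parties without vectors are omitted
        data = np.vstack(vectors)  # shape (N, n_dims)
        n_mps = data.shape[0]
        centroid = data.mean(axis=0)
        if n_mps < 2:
            # A single MP cannot be resampled meaningfully: collapse the CI.
            lower, upper = centroid, centroid
            std = np.zeros_like(centroid)
        else:
            rng = np.random.default_rng(seed)
            # Each row holds the N resampled MP indices of one replicate.
            idx = rng.integers(0, n_mps, size=(n_boot, n_mps))
            replicates = data[idx].mean(axis=1)  # shape (n_boot, n_dims)
            lower, upper = np.percentile(
                replicates, [alpha / 2, 100 - alpha / 2], axis=0
            )
            std = replicates.std(axis=0)
        results[party] = {
            "centroid": centroid.tolist(),
            "ci_lower": lower.tolist(),
            "ci_upper": upper.tolist(),
            "std": std.tolist(),
            "n_mps": n_mps,
        }
    return results
```

The index matrix for the largest party is n_boot × N integers and the gathered array is n_boot × N × 50 floats, which stays small at the sizes discussed here (N up to a few dozen).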
Dependencies: numpy, duckdb (read_only), json.
Import issue: _PARTY_NORMALIZE and CURRENT_PARLIAMENT_PARTIES live in explorer.py (a Streamlit app). The bootstrap function in analysis/political_axis.py can't import from there. Solution: the bootstrap function accepts party_vectors: Dict[str, List[np.ndarray]] as input — the caller (explorer.py) handles the mp→party mapping and passes grouped vectors in. This keeps the analysis module independent of Streamlit app constants and avoids duplicating the normalization logic.
Alternatively, the caller can pass the already-computed party_scores dict from load_party_axis_scores plus raw per-party MP vector lists. The simplest approach: add a helper in explorer.py that loads grouped MP vectors per party (reusing existing mapping logic) and pass that to the bootstrap function.
C. Chart Enhancement (explorer.py)
Modify _render_party_axis_chart to accept optional bootstrap_data: Dict[str, Dict] = None.
When bootstrap_data is provided:
- For each party, compute error magnitude: (ci_upper[axis_idx] - ci_lower[axis_idx]) / 2
- When flip is True, error magnitude stays the same (symmetric around the negated centroid)
- Add error_x=dict(type="data", array=error_array, visible=True) to the party marker Scatter trace
- Parties with N=1: render with a distinct marker (diamond shape instead of circle) as visual unreliability warning
- Add N={n_mps} to hover text for all parties
The bootstrap computation should be cached alongside party scores using @st.cache_data.
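As a rough illustration of the chart-side logic, the helper below derives the values that would be passed to the Scatter trace. build_error_bar_spec is a hypothetical name, the returned dict layout is an assumption, and the hover string format is only a guess at the final wording; the function deliberately avoids importing Plotly so the arithmetic can be checked in isolation.

```python
from typing import Any, Dict, List


def build_error_bar_spec(
    parties: List[str],
    bootstrap_data: Dict[str, Dict[str, Any]],
    axis_idx: int,
    flip: bool = False,
) -> Dict[str, Any]:
    """Derive x positions, error bars, marker symbols and hover text."""
    scores: List[float] = []
    errors: List[float] = []
    symbols: List[str] = []
    hover: List[str] = []
    for party in parties:
        entry = bootstrap_data[party]
        centroid = entry["centroid"][axis_idx]
        lower = entry["ci_lower"][axis_idx]
        upper = entry["ci_upper"][axis_idx]
        # Negating the axis mirrors the interval, so its half-width is unchanged.
        scores.append(-centroid if flip else centroid)
        errors.append((upper - lower) / 2)
        n_mps = entry["n_mps"]
        if n_mps == 1:
            # Single-MP parties get a diamond and an explicit warning.
            symbols.append("diamond")
            hover.append(f"{party} (N=1, geen betrouwbaarheidsinterval)")
        else:
            symbols.append("circle")
            hover.append(f"{party} (N={n_mps})")
    return {
        "x": scores,
        "error_x": dict(type="data", array=errors, visible=True),
        "marker_symbol": symbols,
        "hover": hover,
    }
```

Using the half-width draws a symmetric bar around the centroid even when the percentile interval is skewed. If that approximation proves too coarse, Plotly's error_x also accepts arrayminus together with symmetric=False for asymmetric bars.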
D. Delete Stale JSON File
Remove thoughts/explorer/party_svd_scores.json. The app never reads this file — load_party_axis_scores always computes live from the DB. The file was generated ad-hoc during analysis and is missing NSC.
Also remove thoughts/explorer/axis_analysis_data.json — same situation, ad-hoc analysis artifact not used by the app.
Data Flow
    DB (svd_vectors, mp_metadata)
      │
      ├──→ load_party_axis_scores()
      │      returns Dict[str, List[float]] (party → 50-dim centroid)
      │
      └──→ load_party_mp_vectors() [NEW helper in explorer.py]
             returns Dict[str, List[np.ndarray]] (party → list of individual MP vectors)
             reuses same mp→party mapping as load_party_axis_scores
               │
               ↓
           compute_party_bootstrap_cis(party_vectors, n_boot=1000, ci=95, seed=42)
               │  returns Dict[str, Dict] (party → {centroid, ci_lower, ci_upper, std, n_mps})
               ↓
           _render_party_axis_chart(party_scores, comp_sel, theme, bootstrap_data=None)
               │  indexes [comp_sel - 1] from centroid and CIs
               │  applies flip (negate score AND CI bounds)
               │  adds error_x to Plotly Scatter trace
               ↓
           Streamlit renders chart with error bars
Both functions cached via @st.cache_data with same TTL.
Error Handling
- N=1 parties (Volt, Lid Keijzer): Return centroid as both CI bounds, std=0. Chart renders diamond marker. Hover says "N=1, geen betrouwbaarheidsinterval".
- N=2 parties (50PLUS): CIs will be wide — that's correct, let data speak.
- SVD vector parsing failures: Skip MP, log warning (same as existing pattern).
- Download/scraping failures: Per-chunk try/except already handles this. _fetch_body_text returns None on failure (existing behavior).
- update-existing with no besluit_id: Skip motion, log. Not all motions have a besluit_id traceable to body text.
Testing Strategy
Unit Tests
test_bootstrap_fixed_seed: Synthetic data (5 parties, varying N), fixed seed. Verify:
- Output shape matches expected structure
- CI bounds bracket centroid for all parties
- N=1 party has ci_lower == ci_upper == centroid
- Same seed produces identical output
- Larger N produces narrower CIs
Integration Tests
test_bootstrap_real_db: Run against actual DB, verify:
- Returns data for all 17 current parliament parties (+NSC)
- n_mps values match known party sizes
- CI width for D66 (N=49) << CI width for SP (N=3)
Visual Validation
- Run Streamlit app, verify error bars appear on SVD axis charts
- Verify N=1 parties have distinct marker style
- Verify hover text includes party size
Open Questions
None — design is straightforward. The only future enhancement would be multi-window bootstrap for axis stability testing, but that's a separate project.