From 0183bbc8a33ca8d03baa554f731b418709b3920d Mon Sep 17 00:00:00 2001
From: Sven Geboers <geboers.sven@gmail.com>
Date: Sun, 31 May 2026 23:18:50 +0200
Subject: [PATCH] docs(solutions): compound extended Overton analysis
 methodology

---
 ...xtended-analysis-methodology-2026-05-26.md | 229 ++++++++++++++++++
 ...ton-window-shift-methodology-2026-05-24.md |   7 +
 2 files changed, 236 insertions(+)
 create mode 100644 docs/solutions/best-practices/overton-extended-analysis-methodology-2026-05-26.md

diff --git a/docs/solutions/best-practices/overton-extended-analysis-methodology-2026-05-26.md b/docs/solutions/best-practices/overton-extended-analysis-methodology-2026-05-26.md
new file mode 100644
index 0000000..706024d
--- /dev/null
+++ b/docs/solutions/best-practices/overton-extended-analysis-methodology-2026-05-26.md
@@ -0,0 +1,229 @@
+---
+module: analysis/right_wing
+date: 2026-05-26
+problem_type: best_practice
+tags:
+  - overton-window
+  - voting-margin
+  - mechanism-classification
+  - party-differentiation
+  - left-wing-response
+  - 2d-extremity
+---
+
+# Analytical Extensions for Overton Window Shift Analysis
+
+## Context
+
+The core methodology in `overton-window-shift-methodology-2026-05-24.md` established the 7-step framework (strict centrist definition, Procrustes-aligned SVD, centrist support fraction, opposition control, extremity stratification, 2D audit). Those steps are prerequisites; this document covers extensions built on that foundation for richer, more defensible analysis.
+
+Domain decomposition (`domain-decomposition-overton-analysis.md`) must also be applied before interpretation to avoid conflating strategic moderation with genuine acceptance expansion.
+
+## Guidance
+
+### 1. Replace Binary Pass Rate with Voting Margin
+
+The Dutch Tweede Kamer passes 96%+ of motions, so pass rate is ceiling-capped and carries almost no signal. **Always compute `voting_margin` as the primary success metric** before drawing conclusions about motion effectiveness.
+
+**How to compute:**
+```
+margin = (voor - tegen) / (voor + tegen + afwezig)
+```
+
+This produces a continuous [-1, +1] scale from unanimous rejection to unanimous support. A motion passing 14-1 (margin=+0.87) and one passing 8-7 (margin=+0.07) are both "passed" — the margin exposes the real signal.
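+
+A minimal sketch of the per-party margin, assuming each motion's votes arrive as a mapping from party to one of `voor`, `tegen`, or `afwezig`. The real shape of `motions.voting_results` may differ, and `voting_margin()` plus the party names `P0`..`P14` are illustrative, not the code in `voting_margin.py`:
+
+```python
+from collections import Counter
+
+
+def voting_margin(party_votes: dict[str, str]) -> float:
+    """Per-party margin: (voor - tegen) / (voor + tegen + afwezig)."""
+    counts = Counter(party_votes.values())
+    voor, tegen, afwezig = counts["voor"], counts["tegen"], counts["afwezig"]
+    total = voor + tegen + afwezig
+    if total == 0:
+        raise ValueError("motion has no recorded party votes")
+    return (voor - tegen) / total
+
+
+# Hypothetical 15-party chamber: 14 parties in favour, 1 against.
+landslide = {f"P{i}": "voor" for i in range(14)} | {"P14": "tegen"}
+# Same chamber: 8 parties in favour, 7 against.
+narrow = {f"P{i}": "voor" for i in range(8)} | {f"P{i}": "tegen" for i in range(8, 15)}
+
+print(round(voting_margin(landslide), 2))  # 0.87
+print(round(voting_margin(narrow), 2))     # 0.07
+```
+
+Because absent parties sit in the denominator, a motion only reaches ±1 when every party casts a vote.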
+
+**Validation:** Correlate voting margin against centrist support. If Spearman ρ > 0.7, margin captures a meaningful success gradient that pass rate cannot. In the Dutch data, ρ=0.812.
+
+**Implementation:** `analysis/right_wing/voting_margin.py` — reads `motions.voting_results` (per-party JSON), computes per-motion margin, produces quartile-stratified statistics, and computes pre/post Cohen's d.
+
+**Key design decision:** Per-party aggregation (1 party = 1 vote) rather than seat-weighted. This measures *breadth of cross-spectrum support* — exactly the Overton concept — without conflating coalition size effects.
+
+### 2. Coalition Coding Must Split 2024 at July 1
+
+Treating all 2024 as the Schoof cabinet (PVV/VVD/NSC/BBB) overestimates the coalition effect. The year contains two governments:
+
+- **Jan–Jun 2024:** Rutte IV (VVD/D66/CDA/CU) — caretaker, limited legislative agenda
+- **Jul–Dec 2024:** Schoof I (PVV/VVD/NSC/BBB) — right-wing coalition
+
+**Method:** When filtering opposition-only motions, use submission date (not year) against a date-aware coalition dictionary. Motions from Jan–Jun 2024 whose lead submitter is in the Rutte IV coalition must be filtered as government motions, not opposition.
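+
+A sketch of a date-aware lookup, assuming the two 2024 periods listed above and the July 1 split. `COALITION_PERIODS` and `is_government_motion()` are hypothetical names for illustration; the actual `COALITION` dict in the scripts may be structured differently:
+
+```python
+from datetime import date
+
+# Illustrative periods covering 2024 only, using the July 1 split convention.
+COALITION_PERIODS = [
+    (date(2024, 1, 1), date(2024, 6, 30), {"VVD", "D66", "CDA", "CU"}),    # Rutte IV
+    (date(2024, 7, 1), date(2024, 12, 31), {"PVV", "VVD", "NSC", "BBB"}),  # Schoof I
+]
+
+
+def is_government_motion(lead_party: str, submitted: date) -> bool:
+    """True if the lead submitter's party was in the coalition on that date."""
+    for start, end, parties in COALITION_PERIODS:
+        if start <= submitted <= end:
+            return lead_party in parties
+    raise KeyError(f"no coalition period covers {submitted}")
+
+
+print(is_government_motion("D66", date(2024, 3, 15)))  # True
+print(is_government_motion("D66", date(2024, 9, 15)))  # False
+print(is_government_motion("PVV", date(2024, 9, 15)))  # True
+```
+
+Raising on an uncovered date, instead of defaulting to "opposition", keeps a missing period from silently misclassifying motions.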
+
+**Impact:** In the Dutch data, this reclassified 32 PVV/BBB motions from "opposition" to "government," correcting an artificial inflation of the opposition-only shift. Without this fix, the opposition-only Cohen's d overstates the shift by attributing coalition-party motions to opposition.
+
+**Implementation:** Embedded in all opposition-filter scripts (`temporal_trajectory.py`, `causal_timing.py`, `predictive_model.py`) via `COALITION` dict. Scripts that only use year-level filtering contain a fragility note.
+
+### 3. Party Differentiation: Disaggregate Right-Wing Bloc
+
+Right-wing parties are not monolithic. Before attributing any Overton shift to "the right," disaggregate by party to identify the actual driver.
+
+**Required dimensions per party:**
+- Volume (motion count) over time
+- Centrist support over time
+- Material impact over time
+- Pre/post CS delta and volume delta
+
+**Method:** Parse submitter party from motion title prefixes using regex patterns matching Dutch motion conventions (e.g., `"Motie van het lid Wilders ..."`). Use `mp_metadata` table as authority for last-name-to-party mapping. Normalize party names via `_PARTY_NORMALIZE` (e.g., `Groep Markuszower → PVV`).
+
+**What to look for:**
+- A party whose centrist support rose *without* filing more motions: not the driver
+- A party whose volume rose *without* CS increase: also not the driver
+- A party with both volume AND CS gains: the primary driver
+- A party that entered government (reduced motions, milder content): follower, not leader
+
+**Surprising finding from Dutch data:** PVV (largest right-wing party) moderated significantly after entering government — *fewer, milder motions*. JA21 (smaller opposition party) was the primary driver of both volume and support gains. The aggregate "right-wing shift" narrative masks a succession story: PVV moderated, JA21 filled the vacated extreme space.
+
+**Implementation:** `analysis/right_wing/party_differentiation.py` — parses submitter parties, builds yearly aggregates, produces 4-panel figure (volume × CS × material impact × pre/post bars).
+
+### 4. Left-Wing Response Analysis: Verify Asymmetry
+
+An Overton window shift is not symmetric. Rising centrist support for right-wing motions could theoretically be driven by left-wing parties hardening their opposition (making centrist support appear to rise relative to a higher denominator). Always verify the left-wing side by computing centrist support for *left-wing* motions.
+
+**Method:**
+1. Identify left-wing motions using the same submitter parsing approach
+2. Compute centrist support for left motions pre/post
+3. Compare Cohen's d for centrist → right shift vs centrist → left shift
+4. Compute per-party left-wing softening scores (Δ in CS for each left party)
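+
+Step 3 compares effect sizes. One common form of Cohen's d uses the pooled standard deviation; the sketch below assumes that form (the scripts may use a different variant), and the sample values are made up:
+
+```python
+from statistics import fmean, variance
+
+
+def cohens_d(pre: list[float], post: list[float]) -> float:
+    """Cohen's d (post minus pre) with a pooled standard deviation."""
+    n_pre, n_post = len(pre), len(post)
+    if n_pre < 2 or n_post < 2:
+        raise ValueError("each group needs at least two observations")
+    pooled_var = (
+        (n_pre - 1) * variance(pre) + (n_post - 1) * variance(post)
+    ) / (n_pre + n_post - 2)
+    return (fmean(post) - fmean(pre)) / pooled_var ** 0.5
+
+
+# Made-up centrist support values for a handful of motions.
+pre_support = [0.2, 0.3, 0.4]
+post_support = [0.5, 0.6, 0.7]
+print(round(cohens_d(pre_support, post_support), 2))  # 3.0
+```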
+
+**Interpretation guard:** If the centrist shift toward right-wing motions is 10×+ larger than any left-wing hardening, the asymmetry confirms the shift is genuine centrist acceptance, not artifact of denominator change. If both sides shifted, investigate coalition mechanics.
+
+**Dutch finding:** Centrist shift toward right (d=+1.89) was 18.3× larger than left-wing hardening (d=−0.75). Left-wing opposition support barely moved (21.3%→20.2%). Volt was the only left party that softened (+12.9pp). This extreme asymmetry eliminates denominator artifacts as alternative explanations.
+
+**Implementation:** `analysis/right_wing/left_wing_response.py` — loads motions with `CANONICAL_LEFT` submitter filtering, computes both centrist support and left-party support for right-wing motions, produces asymmetry figures.
+
+### 5. Mechanism Validation with Second Classifier
+
+LLM-based mechanism classifications are inherently subjective. Always validate inter-rater reliability before drawing conclusions from mechanism distributions.
+
+**Method:**
+1. Classify motion mechanism with primary classifier (inline subagent, reading full text)
+2. Run independent second classifier with a **different** prompt:
+   - Use English wording (vs Dutch for original)
+   - Present mechanisms in **reversed** order
+   - Include explicit definitions for each mechanism
+   - Require strict JSON output with confidence scores
+3. Compute Cohen's κ on overlapping classifications
+4. For disagreements, prefer the classification with confidence ≥ 4, otherwise retain original
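+
+Steps 3 and 4 can be sketched as follows. This assumes one label per motion from each classifier and reads step 4 as "switch to the second label only when the second classifier reports confidence ≥ 4"; that reading, the confidence scale, and the `consensus_framing` identifier are assumptions:
+
+```python
+from collections import Counter
+
+
+def cohens_kappa(labels_a: list[str], labels_b: list[str]) -> float:
+    """Cohen's kappa for two raters labelling the same items."""
+    if len(labels_a) != len(labels_b) or not labels_a:
+        raise ValueError("need two equally long, non-empty label lists")
+    n = len(labels_a)
+    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
+    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
+    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
+    if expected == 1.0:
+        return 1.0  # both raters used one identical label throughout
+    return (observed - expected) / (1 - expected)
+
+
+def resolve(original: str, second: str, second_confidence: int) -> str:
+    """Keep the original label unless the second classifier is confident."""
+    if original != second and second_confidence >= 4:
+        return second
+    return original
+
+
+primary = ["consensus_framing", "targeted_restriction",
+           "targeted_restriction", "institutional_rule_of_law"]
+secondary = ["consensus_framing", "targeted_restriction",
+             "institutional_rule_of_law", "institutional_rule_of_law"]
+print(round(cohens_kappa(primary, secondary), 3))  # 0.636
+```
+
+If scikit-learn is available, `sklearn.metrics.cohen_kappa_score` computes the same unweighted statistic.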
+
+**Agreement thresholds:**
+- κ < 0.2: taxonomy is unusable, scrap and rebuild
+- 0.2 ≤ κ < 0.4: taxonomy needs major revision
+- 0.4 ≤ κ < 0.6: taxonomy needs targeted refinement (merge ambiguous pairs)
+- κ ≥ 0.6: taxonomy is sufficiently reliable
+
+**When κ is moderate (0.4–0.6):** Analyze the disagreement pairs. In the Dutch data, the primary confusion was `institutional_rule_of_law ↔ targeted_restriction` — many motions addressing migration enforcement could be classified as either. This revealed a taxonomy ambiguity, not a classification failure: institutional and targeted mechanisms can coexist in the same motion. The fix is not to discard the taxonomy but to note this ambiguity and consider a multi-label classification approach for future iterations.
+
+**Implementation:** `analysis/right_wing/mechanism_validation.py` — runs second classifier via parallel LLM calls, computes κ, confusion matrix, resolves disagreements, generates validated distribution.
+
+### 6. Predictive Model: Feature Importance, Not Just Accuracy
+
+A predictive model's primary value in this domain is not prediction but feature importance — identifying which motion characteristics drive centrist acceptance.
+
+**Method:**
+1. Build features: submitter party (one-hot), category, 2D extremity scores, year, word count, sentiment, coalition membership
+2. Target: high centrist support (CS > 0.5) as binary classification
+3. Train logistic regression (interpretable coefficients) and random forest (captures non-linearities)
+4. Run stratified 5-fold cross-validation with AUC-ROC as primary metric
+5. Report top-10 feature importance rankings from both models
+
+**What models cannot do:** Predict individual motion outcomes in a new parliament. Coalition dynamics, the specific political moment, and the text of the motion interact in ways no tabular model captures. The models validate that the factors we *hypothesize* as important (party, extremity, category) are indeed predictive — this is methodological triangulation, not forecasting.
+
+**Dutch finding:** AUC-ROC=0.81 (logistic), 0.84 (RF). Top predictors: submitter party (FVD = −1.33, hard to get support; SGP = +0.99, easy), category (asylum/migration hardest), and `stijl_extremiteit` (higher stylistic extremity predicts lower support). These coefficients are consistent with the analytical framework.
+
+**Implementation:** `analysis/right_wing/predictive_model.py` — sklearn logistic regression + random forest, cross-validated, with SHAP-style coefficient interpretation.
+
+### 7. SVD Spatial Divergence: Center of Gravity Trajectories
+
+The core methodology established Procrustes-aligned SVD drift. The extension decomposes this into group-level center-of-gravity trajectories to measure *divergence*.
+
+**Method:**
+1. Compute Procrustes-aligned party coordinates per window (from core methodology)
+2. For each window, compute the center of gravity for the centrist bloc and right-wing bloc as the component-wise mean
+3. Plot both trajectories on the same axes, with year labels
+4. Compute Euclidean distance between the two centers of gravity at each window
+5. Test for trend: is the inter-bloc distance increasing?
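+
+Steps 2 and 4 can be sketched with the standard library alone. The coordinates below are invented for illustration, the right-wing set is an arbitrary subset, and `center_of_gravity()` and `bloc_distance()` are hypothetical helpers, not functions from `svd_trajectory_viz.py`:
+
+```python
+from math import dist
+from statistics import fmean
+
+CENTRIST = {"D66", "CDA", "CU", "NSC"}
+RIGHT = {"PVV", "JA21"}  # illustrative subset of the right-wing bloc
+
+
+def center_of_gravity(points):
+    """Component-wise mean of a list of (x, y) coordinates."""
+    xs, ys = zip(*points)
+    return (fmean(xs), fmean(ys))
+
+
+def bloc_distance(coords, bloc_a, bloc_b):
+    """Euclidean distance between the centers of gravity of two blocs."""
+    cog_a = center_of_gravity([xy for party, xy in coords.items() if party in bloc_a])
+    cog_b = center_of_gravity([xy for party, xy in coords.items() if party in bloc_b])
+    return dist(cog_a, cog_b)
+
+
+# Made-up Procrustes-aligned coordinates per window.
+windows = {
+    2022: {"D66": (-0.2, 0.1), "CDA": (0.0, 0.1), "PVV": (0.8, -0.3), "JA21": (0.6, -0.1)},
+    2024: {"D66": (-0.4, 0.2), "CDA": (-0.2, 0.2), "PVV": (0.8, -0.3), "JA21": (0.6, -0.1)},
+}
+distances = {
+    year: bloc_distance(coords, CENTRIST, RIGHT)
+    for year, coords in sorted(windows.items())
+}
+print({year: round(d, 3) for year, d in distances.items()})
+```
+
+In this toy input only the centrist coordinates move, so the inter-bloc distance grows between the two windows; with more windows, a rank correlation of distance against time is one way to test the trend in step 5.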
+
+**What divergence means:** If centrists moved LEFT while right-wing stayed put or moved right, the distance between groups grew. This is acceptance without conversion — the Overton window widened without centrists converting to right-wing positions.
+
+**Implementation:** `analysis/right_wing/svd_trajectory_viz.py` — two-panel figure showing individual party arrow trajectories and group center-of-gravity movement.
+
+### 8. Temporal Trajectory: Quarterly Granularity
+
+Binary pre/post analysis is too coarse to distinguish an electoral shock from a gradual trend. Always decompose into quarters.
+
+**Method:**
+1. Assign each motion to a quarter based on submission date
+2. Compute quarterly mean centrist support, with 95% CI
+3. Compute quarter-over-quarter Cohen's d to identify the largest single-step change
+4. Run a breakpoint detection method (e.g., PELT or binary segmentation) to find structural changes
+5. Annotate with known political events (elections, coalition changes, EU-level shifts)
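+
+Steps 1 and 2 can be sketched as below. The input is assumed to be `(submission_date, centrist_support)` pairs, and the interval uses a normal approximation, which is only a rough guide for quarters with few motions; `quarter_label()` and `quarterly_support()` are hypothetical names:
+
+```python
+from collections import defaultdict
+from datetime import date
+from statistics import NormalDist, fmean, stdev
+
+
+def quarter_label(d: date) -> str:
+    """Map a date to a label such as '2024-Q1'."""
+    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"
+
+
+def quarterly_support(motions):
+    """Return {quarter: (mean, ci_low, ci_high)} with a 95% normal-approximation CI."""
+    buckets = defaultdict(list)
+    for submitted, support in motions:
+        buckets[quarter_label(submitted)].append(support)
+    z = NormalDist().inv_cdf(0.975)
+    result = {}
+    for quarter in sorted(buckets):
+        values = buckets[quarter]
+        mean = fmean(values)
+        # A single observation has no sample standard deviation.
+        half = z * stdev(values) / len(values) ** 0.5 if len(values) > 1 else float("nan")
+        result[quarter] = (mean, mean - half, mean + half)
+    return result
+
+
+# Made-up motions for illustration.
+sample = [
+    (date(2023, 11, 2), 0.30), (date(2023, 12, 5), 0.40),
+    (date(2024, 1, 20), 0.50), (date(2024, 2, 14), 0.60),
+]
+for quarter, (mean, low, high) in quarterly_support(sample).items():
+    print(quarter, round(mean, 2), round(low, 2), round(high, 2))
+```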
+
+**What to look for:**
+- Immediate jump: the shift was an electoral/political event, not a trend
+- Gradual ramp: the shift was a cultural/social trend, harder to attribute to specific events
+- Peak and revert: if support peaked and returned to baseline, the shift may be temporary
+
+**Dutch finding:** The shift was an immediate electoral jump (2023-Q4→2024-Q1: +0.18), suggesting that the right-wing electoral victory, not a gradual normalization process, drove it. The peak at 2024-Q4 (0.648) followed by reversion to 0.334 by 2026-Q1 raises the possibility of a temporary "coalition honeymoon" rather than a permanent shift.
+
+**Implementation:** `analysis/right_wing/temporal_trajectory.py` — quarterly binning from submission dates, computes rolling statistics, CIs, and event annotations. `causal_timing.py` adds political event overlay and breakpoint detection.
+
+### 9. 2D Extremity Divergence: Separate Stylistic from Material
+
+Single-dimension extremity scores masked diverging trends. Always decompose into stylistic and material dimensions.
+
+**Method:**
+1. Use the two-dimensional rescoring pipeline from the core methodology to score both `stijl_extremiteit` and `materiele_impact`
+2. Compute yearly means separately for each dimension
+3. Test for divergence: are the two dimensions moving in opposite directions?
+4. Compute paired Wilcoxon signed-rank test between pre and post periods for each dimension
+5. If significant divergence is found, single-score trend analysis is structurally misleading
+
+**When to suspect divergence:** If the single-score trend is flat (d≈0) but the underlying dimensions are correlated with centrist support in opposite directions, the flat trend masks real change. A decrease in material impact (motions got substantively milder) can cancel out an increase in stylistic extremity (language got sharper), producing a misleading "no change" signal.
+
+**Dutch finding:** Material impact decreased (−0.146) while stylistic extremity increased (+0.097), with significant Wilcoxon (p=0.002). The single-score "flat trend" was an artifact of these cancelling directions.
+
+**Implementation:** `analysis/right_wing/extremity_2d_temporal.py` — joins `extremity_scores_2d` with `right_wing_motions`, computes per-dimension yearly means, runs Wilcoxon, generates dual-axis time series figure.
+
+### 10. Mechanism Classification: Stratified Sampling with Chi-Squared
+
+When testing whether a specific mechanism (e.g., consensus framing) drives centrist support, use stratified sampling and formal hypothesis testing.
+
+**Method:**
+1. Stratify the motion population into 2×2 buckets: period (pre/post) × support level (high/low CS)
+2. Sample deterministically (fixed motion IDs) for reproducibility
+3. Classify mechanism through manual/subagent review of full title + body text
+4. Test H0: mechanism distribution is independent of support level (chi-squared contingency test)
+5. Test H0: consensus framing is equally common in high-CS and low-CS post-2024 motions (2×2 Fisher/chi-squared)
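+
+The 2×2 test in step 5 can be sketched without external dependencies. This computes the Pearson chi-squared statistic without continuity correction and derives the p-value for one degree of freedom from the complementary error function. The counts are invented, and `chi2_2x2()` is a hypothetical helper, not the code in `mechanism_classification.py`:
+
+```python
+from math import erfc, sqrt
+
+
+def chi2_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
+    """Pearson chi-squared (no Yates correction) for the table [[a, b], [c, d]]."""
+    n = a + b + c + d
+    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
+    if 0 in (row1, row2, col1, col2):
+        raise ValueError("every row and column needs a non-zero total")
+    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
+    # Survival function of chi-squared with 1 degree of freedom.
+    p_value = erfc(sqrt(chi2 / 2))
+    return chi2, p_value
+
+
+# Made-up counts: rows = high-CS / low-CS, columns = consensus framing / other.
+chi2, p_value = chi2_2x2(12, 38, 4, 46)
+print(round(chi2, 2))  # 4.76
+print(p_value < 0.05)  # True
+```
+
+With small expected cell counts, an exact test such as `scipy.stats.fisher_exact` is the safer choice; note that `scipy.stats.chi2_contingency` applies Yates' correction to 2×2 tables by default, so its result will differ from this uncorrected sketch.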
+
+**Support threshold:** CS > 0.5 for "high" support. This threshold is arbitrary but produces clear separation between mechanisms that appeal broadly (consensus framing) vs those that polarize (targeted restriction, system dismantling).
+
+**When to reject the taxonomy:** If κ < 0.4 from independent validation, the mechanism labels themselves are unreliable and chi-squared results are misleading. In this case, report κ rather than mechanism findings.
+
+**Dutch finding:** Consensus framing accounts for 24% of high-CS post-2024 motions vs 8% of low-CS (significant). This validates the mechanism hypothesis even with moderate κ=0.41 — the signal survives the noise.
+
+**Implementation:** `analysis/right_wing/mechanism_classification.py` — deterministic sample of 200 motions, inline classifications, chi-squared tests, generated report.
+
+## Why This Matters
+
+Without these extensions, Overton window analysis produces results that are correct in direction but misleading in detail:
+- Binary pass rate creates false nulls (no detectable change)
+- Undifferentiated "right-wing" analysis misattributes effects to the wrong party
+- Missing left-wing verification cannot rule out denominator artifacts
+- Single-dimension extremity masks opposite trajectories in stylistic vs material content
+- Missing mechanism validation overstates confidence in classification-based findings
+- Annual aggregation cannot distinguish shock vs trend
+
+## When to Apply
+
+- When pass rate exceeds 90% and therefore provides no signal (typical of multi-party parliaments with high passage rates)
+- When right-wing parties have heterogeneous trajectories — some entering government, some remaining in opposition
+- When the analysis must rule out left-wing denominator effects as alternative explanations
+- When LLM-scored extremity metrics are used as independent variables
+- When mechanism classification forms part of the explanatory framework
+- When temporal resolution matters: quarterly decomposition is necessary to distinguish electoral shock from gradual shift
+- When 2D extremity data is available and single-score trends appear flat
+
+## Related
+
+- `overton-window-shift-methodology-2026-05-24.md` — core 7-step methodology (prerequisite)
+- `domain-decomposition-overton-analysis.md` — per-domain decomposition (apply before interpretation)
+- `svd-labels-voting-patterns-not-semantics.md` — foundational: SVD captures voting patterns
+- `.opencode/skills/score-extremity/SKILL.md` — two-dimensional extremity scoring pipeline
diff --git a/docs/solutions/best-practices/overton-window-shift-methodology-2026-05-24.md b/docs/solutions/best-practices/overton-window-shift-methodology-2026-05-24.md
index 570b017..059d80a 100644
--- a/docs/solutions/best-practices/overton-window-shift-methodology-2026-05-24.md
+++ b/docs/solutions/best-practices/overton-window-shift-methodology-2026-05-24.md
@@ -134,8 +134,15 @@ The "acceptance without conversion" finding has implications beyond Dutch politi
 - Strict centrist definition (D66, CDA, CU, NSC) → unmasked the full effect size
 - Runtime flip correction produces correct axis convention → centrists moved LEFT on both axes, diverging from right
 
+## Extensions
+
+The core methodology above is extended by:
+
+- `overton-extended-analysis-methodology-2026-05-26.md` — voting margin, party differentiation, coalition date coding, mechanism validation, left-wing response, predictive modeling, SVD divergence trajectories, quarterly temporal decomposition, and 2D extremity divergence
+
 ## Related
 
+- `domain-decomposition-overton-analysis.md` — domain decomposition reveals hidden variance in aggregate shifts
 - `docs/solutions/best-practices/svd-labels-voting-patterns-not-semantics.md` — foundational: SVD captures voting patterns, not semantic content
 - `docs/solutions/ui-bugs/svd-axis-pole-labels-incorrect-after-flip.md` — sign convention discovery via runtime flip
 - `docs/solutions/insights/quantifying-political-extremity.md` — voting extremity vs policy extremity are independent