docs(solutions): compound extended Overton analysis methodology

3 weeks ago · 0183bbc8a3
parent 28b24084f6
commit 0183bbc8a3
2 changed files with 236 additions and 0 deletions
--- a/docs/solutions/best-practices/overton-extended-analysis-methodology-2026-05-26.md
+++ b/docs/solutions/best-practices/overton-extended-analysis-methodology-2026-05-26.md
@@ -0,0 +1,229 @@
 ---
 module: analysis/right_wing
 date: 2026-05-26
 problem_type: best_practice
 tags:
  - overton-window
  - voting-margin
  - mechanism-classification
  - party-differentiation
  - left-wing-response
  - 2d-extremity
 ---
 # Analytical Extensions for Overton Window Shift Analysis
 ## Context
 The core methodology in `overton-window-shift-methodology-2026-05-24.md` established the 7-step framework (strict centrist definition, Procrustes-aligned SVD, centrist support fraction, opposition control, extremity stratification, 2D audit). These are prerequisites — this document covers extensions built on that foundation for richer, more defensible analysis.
 Domain decomposition (`domain-decomposition-overton-analysis.md`) must also be applied before interpretation to avoid conflating strategic moderation with genuine acceptance expansion.
 ## Guidance
 ### 1. Replace Binary Pass Rate with Voting Margin
 The Dutch Tweede Kamer passes 96%+ of motions, so pass rate sits at its ceiling and carries almost no signal. **Always compute `voting_margin` as the primary success metric** before drawing conclusions about motion effectiveness.
 **How to compute:**
 ```
 margin = (voor - tegen) / (voor + tegen + afwezig)
 ```
 This produces a continuous [-1, +1] scale from unanimous rejection to unanimous support. Because `afwezig` sits in the denominator, absences pull the margin toward 0. A motion passing 14-1 (margin=+0.87) and one passing 8-7 (margin=+0.07) are both "passed", but the margin exposes the real signal.
 **Validation:** Correlate voting margin against centrist support. If Spearman ρ > 0.7, margin captures a meaningful success gradient that pass rate cannot. In the Dutch data, ρ=0.812.
 **Implementation:** `analysis/right_wing/voting_margin.py` — reads `motions.voting_results` (per-party JSON), computes per-motion margin, produces quartile-stratified statistics, and computes pre/post Cohen's d.
 **Key design decision:** Per-party aggregation (1 party = 1 vote) rather than seat-weighted. This measures *breadth of cross-spectrum support* — exactly the Overton concept — without conflating coalition size effects.
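A minimal sketch of the per-party margin, assuming `voting_results` decodes to a flat JSON object that maps each party to one of `voor`, `tegen`, or `afwezig`. That payload shape and the names `voting_margin` and `make_votes` are assumptions for illustration; the actual schema read by `voting_margin.py` may differ.

```python
import json

def voting_margin(voting_results_json):
    """Per-party margin in [-1, +1]; each party counts once, regardless of seats."""
    votes = json.loads(voting_results_json)  # assumed shape: {"PVV": "voor", ...}
    voor = sum(1 for v in votes.values() if v == "voor")
    tegen = sum(1 for v in votes.values() if v == "tegen")
    afwezig = sum(1 for v in votes.values() if v == "afwezig")
    total = voor + tegen + afwezig
    if total == 0:
        raise ValueError("motion has no recorded party votes")
    return (voor - tegen) / total

def make_votes(n_voor, n_tegen, n_afwezig=0):
    """Build a hypothetical per-party JSON payload for demonstration."""
    choices = ["voor"] * n_voor + ["tegen"] * n_tegen + ["afwezig"] * n_afwezig
    return json.dumps({f"party_{i}": c for i, c in enumerate(choices)})

broad = voting_margin(make_votes(14, 1))   # passes 14-1
narrow = voting_margin(make_votes(8, 7))   # passes 8-7
print(round(broad, 2), round(narrow, 2))   # 0.87 0.07
```

Both motions pass, yet the margins differ by more than an order of magnitude, which is the gradient the pass rate hides.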
 ### 2. Coalition Coding Must Split 2024 at July 1
 Treating all of 2024 as the Schoof cabinet (PVV/VVD/NSC/BBB) overestimates the coalition effect. The year contains two governments:
 - **Jan–Jun 2024:** Rutte IV (VVD/D66/CDA/CU) — caretaker, limited legislative agenda
 - **Jul–Dec 2024:** Schoof I (PVV/VVD/NSC/BBB) — right-wing coalition
 **Method:** When filtering opposition-only motions, use submission date (not year) against a date-aware coalition dictionary. Motions from Jan–Jun 2024 whose lead submitter is in the Rutte IV coalition must be filtered as government motions, not opposition.
 **Impact:** In the Dutch data, this reclassified 32 PVV/BBB motions from "opposition" to "government," correcting an artificial inflation of the opposition-only shift. Without this fix, the opposition-only Cohen's d overstates the shift by attributing coalition-party motions to opposition.
 **Implementation:** Embedded in all opposition-filter scripts (`temporal_trajectory.py`, `causal_timing.py`, `predictive_model.py`) via `COALITION` dict. Scripts that only use year-level filtering contain a fragility note.
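One way a date-aware lookup could be laid out, as a sketch only. The structure `COALITION_PERIODS` and the function `is_government_motion` are hypothetical; the scripts' real `COALITION` dict may be organized differently. The periods cover 2024 only and use the July 1 split described above.

```python
from datetime import date

# Hypothetical layout: (start inclusive, end exclusive, coalition parties).
COALITION_PERIODS = [
    (date(2024, 1, 1), date(2024, 7, 1), {"VVD", "D66", "CDA", "CU"}),   # Rutte IV
    (date(2024, 7, 1), date(2025, 1, 1), {"PVV", "VVD", "NSC", "BBB"}),  # Schoof I
]

def is_government_motion(lead_party, submitted_on):
    """True if the lead submitter's party was in the coalition on that date."""
    for start, end, parties in COALITION_PERIODS:
        if start <= submitted_on < end:
            return lead_party in parties
    raise KeyError(f"no coalition period covers {submitted_on}")

def opposition_only(motions):
    """Keep (party, date) pairs whose party was outside the coalition on that date."""
    return [m for m in motions if not is_government_motion(m[0], m[1])]

sample = [
    ("D66", date(2024, 3, 15)),   # coalition under Rutte IV
    ("D66", date(2024, 9, 1)),    # opposition under Schoof I
    ("PVV", date(2024, 9, 1)),    # coalition under Schoof I
]
print(opposition_only(sample))
```

Keying on the submission date rather than the year is what lets the same party land on different sides of the filter within 2024.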
 ### 3. Party Differentiation: Disaggregate Right-Wing Bloc
 Right-wing parties are not monolithic. Before attributing any Overton shift to "the right," disaggregate by party to identify the actual driver.
 **Required dimensions per party:**
 - Volume (motion count) over time
 - Centrist support over time
 - Material impact over time
 - Pre/post CS delta and volume delta
 **Method:** Parse submitter party from motion title prefixes using regex patterns matching Dutch motion conventions (e.g., `"Motie van het lid Wilders ..."`). Use `mp_metadata` table as authority for last-name-to-party mapping. Normalize party names via `_PARTY_NORMALIZE` (e.g., `Groep Markuszower → PVV`).
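A sketch of the title parsing, under stated assumptions: the regex below is one plausible pattern for titles of the form `Motie van het lid ...` or `Motie van de leden ...`, and `LAST_NAME_TO_PARTY` is a hypothetical stand-in for the `mp_metadata` lookup. Only the `Groep Markuszower → PVV` normalization comes from the text above.

```python
import re

# Hypothetical stand-in for the mp_metadata last-name-to-party lookup.
LAST_NAME_TO_PARTY = {
    "Wilders": "PVV",
    "Eerdmans": "JA21",
    "Van der Plas": "BBB",
    "Markuszower": "Groep Markuszower",
}
_PARTY_NORMALIZE = {"Groep Markuszower": "PVV"}

# Capture the first-named member, stopping before " c.s.", " en ", " over ", or a comma.
_LEAD_SUBMITTER = re.compile(
    r"^Motie van (?:het lid|de leden) (.+?)(?: c\.s\.| en | over |,|$)"
)

def lead_submitter_party(title):
    """Return the normalized party of the lead submitter, or None if unparseable."""
    match = _LEAD_SUBMITTER.match(title.strip())
    if match is None:
        return None
    party = LAST_NAME_TO_PARTY.get(match.group(1))
    if party is None:
        return None
    return _PARTY_NORMALIZE.get(party, party)

print(lead_submitter_party("Motie van het lid Van der Plas c.s. over stikstof"))
```

Returning `None` for unmatched titles or unknown names keeps parse failures visible, so they can be counted instead of being silently assigned to a party.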
 **What to look for:**
 - A party whose centrist support rose *without* filing more motions: not the driver
 - A party whose volume rose *without* CS increase: also not the driver
 - A party with both volume AND CS gains: the primary driver
 - A party that entered government (reduced motions, milder content): follower, not leader
 **Surprising finding from Dutch data:** PVV (largest right-wing party) moderated significantly after entering government — *fewer, milder motions*. JA21 (smaller opposition party) was the primary driver of both volume and support gains. The aggregate "right-wing shift" narrative masks a succession story: PVV moderated, JA21 filled the vacated extreme space.
 **Implementation:** `analysis/right_wing/party_differentiation.py` — parses submitter parties, builds yearly aggregates, produces 4-panel figure (volume × CS × material impact × pre/post bars).
 ### 4. Left-Wing Response Analysis: Verify Asymmetry
 An Overton window shift is not symmetric. Rising centrist support for right-wing motions could theoretically be driven by left-wing parties hardening their opposition (making centrist support appear to rise relative to a higher denominator). Always verify the left-wing side by computing centrist support for *left-wing* motions.
 **Method:**
 1. Identify left-wing motions using the same submitter parsing approach
 2. Compute centrist support for left motions pre/post
 3. Compare Cohen's d for centrist → right shift vs centrist → left shift
 4. Compute per-party left-wing softening scores (Δ in CS for each left party)
 **Interpretation guard:** If the centrist shift toward right-wing motions is 10× or more larger than any left-wing hardening, the asymmetry confirms the shift is genuine centrist acceptance, not an artifact of denominator change. If both sides shifted, investigate coalition mechanics.
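The guard can be sketched as a comparison of absolute effect sizes. This assumes Cohen's d with a pooled standard deviation and reads "10× larger" as a ratio of absolute d values; both are assumptions, and the input values are invented for illustration, not taken from the Dutch data.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(pre, post):
    """Cohen's d (post minus pre) using the pooled sample standard deviation."""
    n1, n2 = len(pre), len(post)
    pooled_var = ((n1 - 1) * variance(pre) + (n2 - 1) * variance(post)) / (n1 + n2 - 2)
    return (mean(post) - mean(pre)) / sqrt(pooled_var)

def is_asymmetric(d_right, d_left, factor=10.0):
    """True when the centrist-to-right shift dwarfs the left-side shift."""
    return abs(d_right) >= factor * abs(d_left)

# Hypothetical centrist-support values for right-wing motions, pre and post.
d_example = cohens_d([0.2, 0.3, 0.4], [0.5, 0.6, 0.7])
print(round(d_example, 2))           # 3.0
print(is_asymmetric(2.0, -0.1))      # True: 20x larger
print(is_asymmetric(1.0, -0.5))      # False: both sides moved
```

When `is_asymmetric` returns `False`, the second branch of the guard applies and coalition mechanics need a closer look.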
 **Dutch finding:** Centrist shift toward right (d=+1.89) was 18.3× larger than left-wing hardening (d≈−0.10). Left-wing opposition support barely moved (21.3%→20.2%). Volt was the only left party that softened (+12.9pp). This extreme asymmetry eliminates denominator artifacts as alternative explanations.
 **Implementation:** `analysis/right_wing/left_wing_response.py` — loads motions with `CANONICAL_LEFT` submitter filtering, computes both centrist support and left-party support for right-wing motions, produces asymmetry figures.
 ### 5. Mechanism Validation with Second Classifier
 LLM-based mechanism classifications are inherently subjective. Always validate inter-rater reliability before drawing conclusions from mechanism distributions.
 **Method:**
 1. Classify motion mechanism with primary classifier (inline subagent, reading full text)
 2. Run independent second classifier with a **different** prompt:
   - Use English wording (vs Dutch for original)
   - Present mechanisms in **reversed** order
   - Include explicit definitions for each mechanism
   - Require strict JSON output with confidence scores
 3. Compute Cohen's κ on overlapping classifications
 4. For disagreements, prefer the second classifier's label when its confidence is ≥ 4; otherwise retain the original
 **Agreement thresholds:**
 - κ < 0.2: taxonomy is unusable, scrap and rebuild
 - 0.2 ≤ κ < 0.4: taxonomy needs major revision
 - 0.4 ≤ κ < 0.6: taxonomy needs targeted refinement (merge ambiguous pairs)
 - κ ≥ 0.6: taxonomy is sufficiently reliable
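For reference, Cohen's κ and the threshold bands above can be sketched with the standard library alone. The function names and the example labels are illustrative assumptions; `mechanism_validation.py` may compute κ differently, for example via a statistics library.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of each rater's marginal rate, summed over labels.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)

def taxonomy_verdict(kappa):
    """Map kappa onto the agreement bands listed above."""
    if kappa < 0.2:
        return "unusable"
    if kappa < 0.4:
        return "major revision"
    if kappa < 0.6:
        return "targeted refinement"
    return "sufficiently reliable"

primary = ["consensus", "consensus", "restriction", "restriction"]
second = ["consensus", "restriction", "restriction", "restriction"]
kappa = cohens_kappa(primary, second)
print(kappa, taxonomy_verdict(kappa))  # 0.5 targeted refinement
```

In this toy example the raters agree on 3 of 4 items, but chance agreement is already 0.5, so κ lands at 0.5 rather than 0.75.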
 **When κ is moderate (0.4–0.6):** Analyze the disagreement pairs. In the Dutch data, the primary confusion was `institutional_rule_of_law ↔ targeted_restriction` — many motions addressing migration enforcement could be classified as either. This revealed a taxonomy ambiguity, not a classification failure: institutional and targeted mechanisms can coexist in the same motion. The fix is not to discard the taxonomy but to note this ambiguity and consider a multi-label classification approach for future iterations.
 **Implementation:** `analysis/right_wing/mechanism_validation.py` — runs second classifier via parallel LLM calls, computes κ, confusion matrix, resolves disagreements, generates validated distribution.
 ### 6. Predictive Model: Feature Importance, Not Just Accuracy
 A predictive model's primary value in this domain is not prediction but feature importance — identifying which motion characteristics drive centrist acceptance.
 **Method:**
 1. Build features: submitter party (one-hot), category, 2D extremity scores, year, word count, sentiment, coalition membership
 2. Target: high centrist support (CS > 0.5) as binary classification
 3. Train logistic regression (interpretable coefficients) and random forest (captures non-linearities)
 4. Run stratified 5-fold cross-validation with AUC-ROC as primary metric
 5. Report top-10 feature importance rankings from both models
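Steps 2 and 4 above can be illustrated without any modelling library. The sketch below binarizes centrist support at 0.5, assigns stratified fold indices, and computes AUC-ROC from its rank definition. Model fitting is deliberately left out; the function names and sample values are assumptions, and the actual script relies on sklearn for these steps.

```python
import random

def binarize(centrist_support, threshold=0.5):
    """Target from step 2: 1 when CS exceeds the threshold, else 0."""
    return [int(cs > threshold) for cs in centrist_support]

def stratified_folds(labels, k=5, seed=0):
    """Assign a fold index to each sample, spreading each class evenly over folds."""
    rng = random.Random(seed)
    folds = [0] * len(labels)
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[i] = pos % k
    return folds

def auc_roc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = binarize([0.9] * 10 + [0.1] * 20)   # 10 high-CS, 20 low-CS motions
folds = stratified_folds(labels, k=5)
print(auc_roc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

Stratifying matters when high-CS motions are the minority class: an unstratified split can leave a fold with too few positives for a stable AUC.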
 **What models cannot do:** Predict individual motion outcomes in a new parliament. Coalition dynamics, the specific political moment, and the text of the motion interact in ways no tabular model captures. The models validate that the factors we *hypothesize* as important (party, extremity, category) are indeed predictive — this is methodological triangulation, not forecasting.
 **Dutch finding:** AUC-ROC=0.81 (logistic), 0.84 (RF). Top predictors: submitter party (FVD=−1.33 → hard to get support), SGP (+0.99 → easy), category (asylum/migration hardest), and `stijl_extremiteit` (higher stylistic extremity predicts lower support). These coefficients validate the analytical framework.
 **Implementation:** `analysis/right_wing/predictive_model.py` — sklearn logistic regression + random forest, cross-validated, with SHAP-style coefficient interpretation.
 ### 7. SVD Spatial Divergence: Center of Gravity Trajectories
 The core methodology established Procrustes-aligned SVD drift. The extension decomposes this into group-level center-of-gravity trajectories to measure *divergence*.
 **Method:**
 1. Compute Procrustes-aligned party coordinates per window (from core methodology)
 2. For each window, compute the center of gravity for the centrist bloc and right-wing bloc as the component-wise mean
 3. Plot both trajectories on the same axes, with year labels
 4. Compute Euclidean distance between the two centers of gravity at each window
 5. Test for trend: is the inter-bloc distance increasing?
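The steps above can be sketched as follows, assuming aligned 2D coordinates are already available per window. The coordinates and the bloc memberships used here are invented for illustration, and the least-squares slope is only a descriptive trend indicator, not a significance test.

```python
from math import dist
from statistics import mean

def center_of_gravity(points):
    """Component-wise mean of a list of (x, y) coordinates."""
    return tuple(mean(axis) for axis in zip(*points))

def inter_bloc_distances(windows, centrist, right):
    """Euclidean distance between bloc centers for each window, in window order."""
    out = {}
    for label, positions in windows.items():
        c = center_of_gravity([positions[p] for p in centrist if p in positions])
        r = center_of_gravity([positions[p] for p in right if p in positions])
        out[label] = dist(c, r)
    return out

def trend_slope(values):
    """Ordinary least-squares slope of values against their window index."""
    xs = list(range(len(values)))
    x_bar, y_bar = mean(xs), mean(values)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den

# Hypothetical aligned coordinates per window.
windows = {
    "2022": {"D66": (0.0, 0.0), "CDA": (0.2, 0.0), "PVV": (1.0, 0.0), "JA21": (1.2, 0.0)},
    "2024": {"D66": (-0.2, 0.0), "CDA": (0.0, 0.0), "PVV": (1.0, 0.0), "JA21": (1.2, 0.0)},
}
distances = inter_bloc_distances(windows, centrist=["D66", "CDA"], right=["PVV", "JA21"])
print(distances, trend_slope(list(distances.values())))
```

In this toy data the right-wing bloc does not move at all; the distance grows only because the centrist center shifts, which is the pattern the divergence reading depends on.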
 **What divergence means:** If centrists moved LEFT while right-wing stayed put or moved right, the distance between groups grew. This is acceptance without conversion — the Overton window widened without centrists converting to right-wing positions.
 **Implementation:** `analysis/right_wing/svd_trajectory_viz.py` — two-panel figure showing individual party arrow trajectories and group center-of-gravity movement.
 ### 8. Temporal Trajectory: Quarterly Granularity
 Binary pre/post analysis is too coarse to distinguish an electoral shock from a gradual trend. Always decompose into quarters.
 **Method:**
 1. Assign each motion to a quarter based on submission date
 2. Compute quarterly mean centrist support, with 95% CI
 3. Compute quarter-over-quarter Cohen's d to identify the largest single-step change
 4. Run a breakpoint detection method (e.g., PELT or binary segmentation) to find structural changes
 5. Annotate with known political events (elections, coalition changes, EU-level shifts)
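A sketch of steps 1 and 2, assuming each motion is available as a `(submission_date, centrist_support)` pair. The confidence interval uses a normal approximation (mean ± 1.96 standard errors), which is an assumption; breakpoint detection and event annotation are not covered here, and the sample values are invented.

```python
from collections import defaultdict
from datetime import date
from math import sqrt
from statistics import mean, stdev

def quarter_label(d):
    """Map a date to a sortable label such as '2024-Q1'."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"

def quarterly_support(motions):
    """Return {quarter: (mean, ci_low, ci_high)} from (date, centrist_support) pairs."""
    buckets = defaultdict(list)
    for submitted_on, cs in motions:
        buckets[quarter_label(submitted_on)].append(cs)
    result = {}
    for quarter in sorted(buckets):
        values = buckets[quarter]
        m = mean(values)
        if len(values) > 1:
            half = 1.96 * stdev(values) / sqrt(len(values))  # normal approximation
        else:
            half = float("nan")  # a CI is undefined for a single motion
        result[quarter] = (m, m - half, m + half)
    return result

# Hypothetical motions around a year boundary.
motions = [
    (date(2023, 11, 2), 0.30), (date(2023, 12, 14), 0.40),
    (date(2024, 1, 18), 0.50), (date(2024, 3, 5), 0.60),
]
for quarter, (m, lo, hi) in quarterly_support(motions).items():
    print(quarter, round(m, 3), round(lo, 3), round(hi, 3))
```

Quarters with few motions produce wide intervals, so a large quarter-over-quarter step should be read against the interval width before it is called a jump.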
 **What to look for:**
 - Immediate jump: the shift was an electoral/political event, not a trend
 - Gradual ramp: the shift was a cultural/social trend, harder to attribute to specific events
 - Peak and revert: if support peaked and returned to baseline, the shift may be temporary
 **Dutch finding:** The shift was an immediate jump (2023-Q4→2024-Q1: +0.18), which points to the right-wing electoral victory rather than a gradual normalization process. The peak at 2024-Q4 (0.648) followed by reversion to 0.334 by 2026-Q1 raises the possibility of a temporary "coalition honeymoon" rather than a permanent shift.
 **Implementation:** `analysis/right_wing/temporal_trajectory.py` — quarterly binning from submission dates, computes rolling statistics, CIs, and event annotations. `causal_timing.py` adds political event overlay and breakpoint detection.
 ### 9. 2D Extremity Divergence: Separate Stylistic from Material
 Single-dimension extremity scores masked diverging trends. Always decompose into stylistic and material dimensions.
 **Method:**
 1. Use the two-dimensional rescoring pipeline from the core methodology to score both `stijl_extremiteit` and `materiele_impact`
 2. Compute yearly means separately for each dimension
 3. Test for divergence: are the two dimensions moving in opposite directions?
 4. Compute paired Wilcoxon signed-rank test between pre and post periods for each dimension
 5. If significant divergence is found, single-score trend analysis is structurally misleading
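The divergence check in steps 2 and 3 reduces to comparing the sign of each dimension's pre/post change. The sketch below covers only that part; the significance test in step 4 is omitted, the equal-weight composite is an assumption about how a single score might be formed, and all scores are invented.

```python
from statistics import mean

def period_delta(pre, post):
    """Change in the mean score from the pre period to the post period."""
    return mean(post) - mean(pre)

def divergence_report(stijl_pre, stijl_post, materieel_pre, materieel_post):
    """Compare the direction of change in the two extremity dimensions."""
    delta_stijl = period_delta(stijl_pre, stijl_post)
    delta_materieel = period_delta(materieel_pre, materieel_post)
    return {
        "delta_stijl": delta_stijl,
        "delta_materieel": delta_materieel,
        # Opposite signs mean the dimensions moved in opposite directions.
        "diverging": delta_stijl * delta_materieel < 0,
        # Change in a naive equal-weight composite, to expose cancellation.
        "delta_composite": (delta_stijl + delta_materieel) / 2,
    }

# Hypothetical scores: style sharpens while material impact softens.
report = divergence_report(
    stijl_pre=[2.0, 2.2, 2.4], stijl_post=[2.3, 2.5, 2.7],
    materieel_pre=[3.0, 3.2, 3.4], materieel_post=[2.7, 2.9, 3.1],
)
print(report)
```

Here each dimension moves by about 0.3 in opposite directions and the composite change is roughly zero, which is how a flat single-score trend can coexist with real movement.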
 **When to suspect divergence:** If the single-score trend is flat (d≈0) but the underlying dimensions are correlated with centrist support in opposite directions, the flat trend masks real change. A decrease in material impact (motions got substantively milder) can cancel out an increase in stylistic extremity (language got sharper), producing a misleading "no change" signal.
 **Dutch finding:** Material impact decreased (−0.146) while stylistic extremity increased (+0.097), with significant Wilcoxon (p=0.002). The single-score "flat trend" was an artifact of these cancelling directions.
 **Implementation:** `analysis/right_wing/extremity_2d_temporal.py` — joins `extremity_scores_2d` with `right_wing_motions`, computes per-dimension yearly means, runs Wilcoxon, generates dual-axis time series figure.
 ### 10. Mechanism Classification: Stratified Sampling with Chi-Squared
 When testing whether a specific mechanism (e.g., consensus framing) drives centrist support, use stratified sampling and formal hypothesis testing.
 **Method:**
 1. Stratify the motion population into 2×2 buckets: period (pre/post) × support level (high/low CS)
 2. Sample deterministically (fixed motion IDs) for reproducibility
 3. Classify mechanism through manual/subagent review of full title + body text
 4. Test H0: mechanism distribution is independent of support level (chi-squared contingency test)
 5. Test H0: consensus framing is equally common in high-CS and low-CS post-2024 motions (2×2 Fisher/chi-squared)
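For the 2×2 case in step 5, the Pearson chi-squared statistic has a closed form, and with one degree of freedom its p-value equals `erfc(sqrt(chi2 / 2))`. The sketch below uses that identity without a continuity correction. The counts are invented to mirror a 24% versus 8% split and are not the Dutch sample; with cells this small, Fisher's exact test may be the safer choice.

```python
from math import erfc, sqrt

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = erfc(sqrt(chi2 / 2))  # survival function of chi-squared, df = 1
    return chi2, p_value

# Hypothetical counts. Rows: high-CS, low-CS. Columns: consensus framing, other.
chi2, p = chi2_2x2(12, 38, 4, 46)
print(round(chi2, 3), round(p, 3))
```

The test only says whether framing and support level are associated in the sample; it inherits whatever noise the mechanism labels carry, which is why the κ check comes first.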
 **Support threshold:** CS > 0.5 for "high" support. This threshold is arbitrary but produces clear separation between mechanisms that appeal broadly (consensus framing) vs those that polarize (targeted restriction, system dismantling).
 **When to reject the taxonomy:** If κ < 0.4 from independent validation, the mechanism labels themselves are unreliable and chi-squared results are misleading. In this case, report κ rather than mechanism findings.
 **Dutch finding:** Consensus framing accounts for 24% of high-CS post-2024 motions vs 8% of low-CS (significant). This validates the mechanism hypothesis even with moderate κ=0.41 — the signal survives the noise.
 **Implementation:** `analysis/right_wing/mechanism_classification.py` — deterministic sample of 200 motions, inline classifications, chi-squared tests, generated report.
 ## Why This Matters
 Without these extensions, Overton window analysis produces results that are correct in direction but misleading in detail:
 - Binary pass rate creates false nulls (no detectable change)
 - Undifferentiated "right-wing" analysis misattributes effects to the wrong party
 - Missing left-wing verification cannot rule out denominator artifacts
 - Single-dimension extremity masks opposite trajectories in stylistic vs material content
 - Missing mechanism validation overstates confidence in classification-based findings
 - Annual aggregation cannot distinguish shock vs trend
 ## When to Apply
 - When pass rate exceeds 90% and therefore provides little signal (typical of multi-party parliaments with high passage rates)
 - When right-wing parties have heterogeneous trajectories — some entering government, some remaining in opposition
 - When the analysis must rule out left-wing denominator effects as alternative explanations
 - When LLM-scored extremity metrics are used as independent variables
 - When mechanism classification forms part of the explanatory framework
 - When temporal resolution matters: quarterly decomposition is necessary to distinguish electoral shock from gradual shift
 - When 2D extremity data is available and single-score trends appear flat
 ## Related
 - `overton-window-shift-methodology-2026-05-24.md` — core 7-step methodology (prerequisite)
 - `domain-decomposition-overton-analysis.md` — per-domain decomposition (apply before interpretation)
 - `svd-labels-voting-patterns-not-semantics.md` — foundational: SVD captures voting patterns
 - `.opencode/skills/score-extremity/SKILL.md` — two-dimensional extremity scoring pipeline
--- a/docs/solutions/best-practices/overton-window-shift-methodology-2026-05-24.md
+++ b/docs/solutions/best-practices/overton-window-shift-methodology-2026-05-24.md
@@ -134,8 +134,15 @@ The "acceptance without conversion" finding has implications beyond Dutch politi
 - Strict centrist definition (D66, CDA, CU, NSC) → unmasked the full effect size
 - Runtime flip correction produces correct axis convention → centrists moved LEFT on both axes, diverging from right
 ## Extensions
 The core methodology above is extended by:
 - `overton-extended-analysis-methodology-2026-05-26.md` — voting margin, party differentiation, coalition date coding, mechanism validation, left-wing response, predictive modeling, SVD divergence trajectories, quarterly temporal decomposition, and 2D extremity divergence
 ## Related
 - `domain-decomposition-overton-analysis.md` — domain decomposition reveals hidden variance in aggregate shifts
 - `docs/solutions/best-practices/svd-labels-voting-patterns-not-semantics.md` — foundational: SVD captures voting patterns, not semantic content
 - `docs/solutions/ui-bugs/svd-axis-pole-labels-incorrect-after-flip.md` — sign convention discovery via runtime flip
 - `docs/solutions/insights/quantifying-political-extremity.md` — voting extremity vs policy extremity are independent