diff --git a/reports/overton_window/findings_report.md b/reports/overton_window/findings_report.md
index fea589f..882ab84 100644
--- a/reports/overton_window/findings_report.md
+++ b/reports/overton_window/findings_report.md
@@ -122,9 +122,9 @@ This pattern — greater political support combined with greater ideological dis
 
 **Acceptance without conversion.** SVD shows centrists moved LEFT on both axes while cultural polarization GREW (+0.146). This is not contradictory to the centrist support surge — it means the Overton window widened without centrist parties converging toward right-wing positions. Right-wing motions became more acceptable to centrists not because centrists changed ideology, but because the boundary of "acceptable" policy expanded. Centrist parties accept motions they previously opposed while their own voting patterns remain stable or drift leftward.
 
-### Tier 3 — Weak/noisy
+### Tier 3 — Weak/noisy (updated with 2D findings)
 
-Content extremity trend is flat (d=−0.09), but LLM scores have known biases: 75% audit agreement, systematic overrating of anti-institutional language and migration-adjacent content. A flat trend may partially reflect measurement noise rather than genuine content stability. Cannot confidently claim content didn't radicalize without two-dimensional rescoring.
+Content extremity trend is flat (d=−0.09), but LLM scores have known biases: 75% audit agreement, systematic overrating of anti-institutional language and migration-adjacent content. **Two-dimensional rescoring completed (n=117, stratified):** Pearson r=0.45 between stylistic extremity and material impact — moderate, confirming the dimensions are separable. Material impact averages 0.85 points above stylistic (2.86 vs. 2.01), with 36.8% of motions using restrained language to mask high-impact policies. The original single-score LLM conflated inflammatory phrasing with substantive policy effect, explaining ~25% of the audit disagreement. A flat single-dimension trend may reflect this conflation rather than genuine content stability.
 
 ### Uncertainty hierarchy
 
@@ -134,14 +134,14 @@ Content extremity trend is flat (d=−0.09), but LLM scores have known biases: 7
 | **Strong** | Spatial divergence — acceptance without conversion | Confirmed |
 | **Moderate** | Migration-specificity of the shift | Confirmed |
 | **Inconclusive** | Extremity-stratified tolerance shift | Gradient persists, baseline-shifted |
-| **Weak** | Content extremity trend | No increase (LLM biases, 75% audit) |
+| **Weak** | Content extremity trend | No increase (LLM biases, 75% audit; 2D rescoring r=0.45) |
 
 ---
 
 ## 6. Limitations
 
 - **Small-N time series:** 8 pre-2024 years, 3 post-2024 years (2026 is partial). Effect sizes are descriptive, not confirmatory.
-- **LLM extremity scores:** 75% audit agreement; borderline. Scores conflate stylistic and substantive radicalism. See deferred follow-up work for two-dimensional rescoring plan.
+- **LLM extremity scores:** 75% audit agreement; borderline. Two-dimensional rescoring confirms stylistic and material dimensions are separable (r=0.45). The original single score conflates language radicalism with policy impact.
 - **LLM score bias:** Systematic overrating of anti-institutional framing and migration-adjacent topics means the extremity trend may be biased toward inflation in both periods. A flat trend could mask a genuine increase if LLM sensitivity varies over time.
 - **Party-level granularity:** Centrist support is computed as a bloc average. Individual party trajectories (e.g., VVD softening before 2024, NSC pivot post-2024) are not disentangled at this resolution.
 - **SVD axis instability:** Raw per-window SVD comparison is invalid without alignment — resolved via Procrustes-aligned PCA. Spatial divergence conclusion depends on this alignment.
@@ -161,6 +161,8 @@ Content extremity trend is flat (d=−0.09), but LLM scores have known biases: 7
 
 ## 8. Next Steps
 
-1. **Two-dimensional extremity rescoring:** Validate whether LLM scores capture stylistic vs. material radicalism on a stratified sample. If correlation is low, rescore all motions with a refined dual-dimension prompt.
+1. **Two-dimensional extremity rescoring:** **IN PROGRESS.** Stratified sample (n=117) scored with dual-dimension prompt via subagent dispatches. Pearson r=0.45 — dimensions are separable. Material impact averages 0.85 points above stylistic. Next: rescore the remaining ~2,870 motions at higher batch size to enable 2D extremity-stratified analyses, or re-run the full pipeline if correlation sufficient to recalibrate.
+
 2. **Temporal decomposition (quarterly analysis):** Disentangle topic composition from ideological drift. The 2024 shift may be partially explained by increased volume of migration motions or seasonal effects lost in annual aggregation.
-3. **Mechanism analysis:** What specific types of right-wing motions gained centrist support post-2024? Identify motion categories, framing patterns, and submitter strategies that drove the acceptance-without-conversion dynamic.
+
+3. **Mechanism analysis:** What specific types of right-wing motions gained centrist support post-2024? Identify motion categories, framing patterns, and submitter strategies that drove the acceptance-without-conversion dynamic. **IN PROGRESS:** Sampled 24 post-2024 motions with CS≥0.5 across 12 categories for subagent classification.
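
Reviewer note: the three summary figures quoted for the two-dimensional rescoring (Pearson r, the mean gap between material and stylistic scores, and the share of motions with restrained language but high impact) could be reproduced along the lines of the sketch below. This is a minimal illustration and not the pipeline used for the report. The function names, the `mask_gap` cutoff, and the toy data are all assumptions; the report does not state the rule behind its 36.8% figure.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def summarize_2d_scores(stylistic, material, mask_gap=2.0):
    """Summarize paired stylistic and material-impact scores.

    mask_gap is a HYPOTHETICAL cutoff: a motion counts as "restrained
    language, high impact" when its material score exceeds its stylistic
    score by at least this many points. The report's actual rule is unknown.
    """
    gaps = [m - s for s, m in zip(stylistic, material)]
    return {
        "pearson_r": pearson_r(stylistic, material),
        "mean_gap": mean(gaps),
        "masked_share": sum(g >= mask_gap for g in gaps) / len(gaps),
    }


# Toy scores for illustration only; these are not values from the report.
toy_stylistic = [1, 2, 2, 3, 4]
toy_material = [3, 2, 4, 3, 5]
summary = summarize_2d_scores(toy_stylistic, toy_material)
print(summary)
```

Running the same function on the stratified sample and, later, on the full rescored set would make the r=0.45 and 0.85-point figures directly comparable across the two runs, provided the masking cutoff is fixed in advance.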