# Predictive Model: Centrist Support
**Generated:** 2026-06-15 21:10
## Data Summary
- Total classified right-wing motions with 2D extremity scores: **3030**
- Valid for modeling (right-wing submitter party + valid category): **965**
- High centrist support (> 0.5): 120 motions
- Low centrist support (<= 0.5): 845 motions
- Class imbalance ratio: 7.0:1 (low:high)
- Features: 19
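
The imbalance ratio, and the accuracy a trivial majority-class predictor would reach, follow directly from the counts above. A minimal sketch; the variable names are illustrative and not taken from the original pipeline:

```python
# Class counts as reported in the Data Summary.
n_high = 120  # motions with centrist support > 0.5
n_low = 845   # motions with centrist support <= 0.5
n_total = n_high + n_low

# Imbalance ratio (low:high).
imbalance_ratio = n_low / n_high

# Accuracy of a baseline that always predicts "low support".
majority_baseline_accuracy = n_low / n_total

print(f"n_total = {n_total}")
print(f"imbalance ratio = {imbalance_ratio:.1f}:1")
print(f"majority baseline accuracy = {majority_baseline_accuracy:.3f}")
```

The baseline comes out at roughly 0.876, so accuracy alone says little under this imbalance; precision, recall, and AUC-ROC are the more informative columns.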
## Model Performance
### Test Set (80/20 stratified split)
| Model | Accuracy | Precision | Recall | AUC-ROC |
|-------|----------|-----------|--------|---------|
| Logistic Regression | 0.746 | 0.302 | 0.792 | 0.791 |
| Random Forest | 0.855 | 0.400 | 0.333 | 0.805 |
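
Because the two models trade precision against recall in opposite directions, a single combined score can help compare them. The sketch below derives F1 from the precision and recall reported in the table; F1 is not part of the original report, and the inputs are rounded to three decimals, so the results are approximate:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall copied from the test-set table.
reported = {
    "Logistic Regression": (0.302, 0.792),
    "Random Forest": (0.400, 0.333),
}

f1_by_model = {name: f1_score(p, r) for name, (p, r) in reported.items()}
for name, f1 in f1_by_model.items():
    print(f"{name}: F1 = {f1:.3f}")
```

On these numbers the logistic regression scores higher on F1 (about 0.44 versus about 0.36) despite its lower accuracy.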
### 5-Fold Cross-Validation
| Model | Mean Accuracy | Std Accuracy | Mean AUC-ROC | Std AUC-ROC |
|-------|---------------|-------------|--------------|-------------|
| Logistic Regression | 0.718 | 0.026 | 0.816 | 0.026 |
| Random Forest | 0.861 | 0.017 | 0.845 | 0.039 |
## Feature Importance
### Logistic Regression Coefficients (Top 10 by absolute magnitude)
| Feature | Coefficient | Odds Ratio |
|---------|-------------|------------|
| `party_FVD` | -0.9773 | 0.3763 |
| `cat_zorg/gezondheid` | -0.9527 | 0.3857 |
| `party_JA21` | 0.8807 | 2.4127 |
| `party_SGP` | 0.8254 | 2.2828 |
| `cat_economie` | 0.7537 | 2.1248 |
| `party_PVV` | -0.7346 | 0.4797 |
| `stijl_extremiteit` | -0.7192 | 0.4871 |
| `materiele_impact` | -0.6077 | 0.5446 |
| `cat_landbouw/natuur` | 0.5100 | 1.6654 |
| `cat_onderwijs/wetenschap` | 0.4733 | 1.6052 |
*Positive coefficient = higher feature value increases odds of high centrist support.*
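
The odds ratio column is the exponential of the coefficient column. A short check on three rows of the table, using only the standard library; small differences in the last digit are expected because the coefficients shown are rounded:

```python
import math

# Coefficients copied from the table above.
coefficients = {
    "party_FVD": -0.9773,
    "party_JA21": 0.8807,
    "cat_economie": 0.7537,
}

# Odds ratio = exp(coefficient).
odds_ratios = {name: math.exp(coef) for name, coef in coefficients.items()}

for name, odds_ratio in odds_ratios.items():
    print(f"{name}: OR = {odds_ratio:.4f}")
```

An odds ratio below 1 (for example `party_FVD`) means the feature multiplies the odds of high centrist support by that factor, with the other features held fixed.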
### Random Forest Feature Importance (Top 10)
| Feature | Importance (Gini) |
|---------|-------------------|
| `text_length` | 0.2241 |
| `year` | 0.1866 |
| `stijl_extremiteit` | 0.1684 |
| `materiele_impact` | 0.1007 |
| `party_SGP` | 0.0508 |
| `party_PVV` | 0.0381 |
| `party_FVD` | 0.0366 |
| `cat_veiligheid/justitie` | 0.0310 |
| `cat_buitenland/europa` | 0.0256 |
| `party_JA21` | 0.0215 |
## Interpretation
### Top 5 Most Important Features
**Logistic Regression (coefficient magnitude):**
1. `party_FVD` (coef=-0.9773, OR=0.3763) — decreases odds of high centrist support
2. `cat_zorg/gezondheid` (coef=-0.9527, OR=0.3857) — decreases odds of high centrist support
3. `party_JA21` (coef=0.8807, OR=2.4127) — increases odds of high centrist support
4. `party_SGP` (coef=0.8254, OR=2.2828) — increases odds of high centrist support
5. `cat_economie` (coef=0.7537, OR=2.1248) — increases odds of high centrist support
**Random Forest (Gini importance):**
1. `text_length` (importance=0.2241)
2. `year` (importance=0.1866)
3. `stijl_extremiteit` (importance=0.1684)
4. `materiele_impact` (importance=0.1007)
5. `party_SGP` (importance=0.0508)
### Which features best predict centrist support?
The models agree only in part. In the logistic regression, **submitter party** and
**category** carry the largest coefficients: `party_FVD`, `party_PVV`, and
`cat_zorg/gezondheid` lower the odds of high centrist support, while `party_JA21`,
`party_SGP`, and `cat_economie` raise them. The random forest instead ranks
`text_length` and `year` highest; neither appears in the logistic regression top 10,
and Gini importance tends to favor continuous features with many distinct values, so
this ranking should be read with care.
Both extremity scores matter in both models, and both have negative coefficients:
motions that score higher receive less centrist support. Assuming the two scores share
a scale, **stylistic extremity (stijl_extremiteit)** is the stronger of the two
(coef=-0.7192, importance=0.1684), ahead of **material impact (materiele_impact)**
(coef=-0.6077, importance=0.1007), so these results do not show that centrist parties
respond more to substantive content than to rhetorical framing.
The **is_opposition** flag appears in neither top-10 list, so the tables above do not
establish how opposition-submitted motions differ from coalition-submitted ones.
### Caveats
- Only motions with LLM-annotated 2D extremity scores are eligible (n=3030); of these, 965 have a right-wing submitter party and a valid category and are used for modeling.
- Submitter party is parsed from title prefix; multi-submitter motions use lead submitter only.
- Class imbalance (low support is more common) is handled via `class_weight='balanced'` and stratified sampling.
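
To make the last caveat concrete: scikit-learn documents the `'balanced'` setting as weighting each class by `n_samples / (n_classes * class_count)`. The sketch below reproduces that formula in plain Python on the full-dataset counts from the Data Summary. This is an illustration only; in the actual pipeline the weights would be computed on whatever training subset the model is fitted on, so the exact values there will differ slightly.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Per-class weight: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Labels reconstructed from the reported class counts (1 = high support).
# Illustrative only: the real label vector is not part of this report.
labels = [1] * 120 + [0] * 845

weights = balanced_class_weights(labels)
for cls, weight in sorted(weights.items()):
    print(f"class {cls}: weight = {weight:.3f}")
```

The minority class receives a weight about seven times that of the majority class, mirroring the 7.0:1 imbalance ratio.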