---
title: Right-Wing Motion Analysis Over Time
type: feat
status: active
date: 2026-05-05
---

# Right-Wing Motion Analysis Over Time

## Summary

Build a pipeline to identify, classify, and analyze right-wing motions across parliamentary history. Track how their volume, policy extremity, and cross-party support have evolved over time. Output includes a derived keyword taxonomy, a hybrid motion classifier, temporal aggregations, policy extremity scores, sentiment trajectories, and time-series visualizations.

---

## Problem Frame

The user hypothesizes that, while the **volume** of right-wing motions has remained stable, the motions have become **more extreme** in their policy demands and **centrist parties** increasingly vote in favor of them. This would indicate an Overton window shift that simple vote-counting cannot detect.

The existing codebase has party-level ideological positioning (SVD/PCA) but lacks motion-level ideological scoring, keyword-based right-wing detection, or temporal analysis of how motion content has radicalized.

---

## Requirements

- **R1.** Derive a right-wing keyword taxonomy from the motions themselves (not hand-curated)
- **R2.** Build a hybrid classifier that identifies "actually right-wing" motions using both keywords and voting patterns
- **R3.** Aggregate and analyze trends per year (volume, support, cross-party adoption)
- **R4.** Score policy extremity (what the motion demands, not just sentiment)
- **R5.** Track sentiment/emotional tone over time as a proxy for radicalization
- **R6.** Produce time-series visualizations for exploration (not yet integrated into the Streamlit app)
- **R7.** Validate classifier accuracy to avoid false positives (e.g., left-wing parties discussing migration neutrally)

---

## Scope Boundaries

- **In scope:** Keyword derivation, motion classification, temporal aggregation, extremity scoring, sentiment analysis, static visualization output
- **Out of scope:** Integration into the Streamlit explorer app (deferred to follow-up)
- **Out of scope:** Real-time updates or pipeline automation
- **Out of scope:** Predictive modeling (we describe trends, we do not forecast)

### Deferred to Follow-Up Work

- **Streamlit tab integration:** Add an interactive "Right-Wing Trends" tab to the explorer (`analysis/tabs/right_wing_trends.py`) in a separate PR after this plan is executed
- **Motion-level ideological embedding:** Train or fine-tune a motion-specific embedding model to improve classification beyond keywords + votes

---

## Context & Research

### Relevant Code and Patterns

- **`analysis/config.py`** — Defines `CANONICAL_RIGHT = {"PVV", "FVD", "JA21", "SGP"}` and `CANONICAL_LEFT`; these sets are the ground truth for right/left party identification
- **`scripts/derive_svd_labels.py`** — Already uses TF-IDF on motion titles with Dutch stopwords; this pattern should be reused and extended for keyword derivation
- **`scripts/motion_drift.py`** — Implements cross-ideological voting detection and semantic drift measurement; its `compute_party_voting()` logic is directly relevant for validating the hybrid classifier
- **`analysis/axis_classifier.py`** — Contains `_classify_from_titles()` with Dutch keyword regexes for 4 ideological categories; not sufficient alone but a useful reference for post-processing
- **`pipeline/text_pipeline.py`** — Generates text embeddings via API; can be reused if we need embeddings for the sentiment/extremity analysis
- **`analysis/explorer_data.py`** — `load_motions_df()` loads the full motions table with year parsing; primary data access pattern
- **`database.py`** / **`agent_tools/database.py`** — Provide DuckDB access to `motions` and `mp_votes` tables

### Institutional Learnings

- **`docs/solutions/best-practices/svd-labels-voting-patterns-not-semantics.md`** — Right-wing parties must appear on the RIGHT side of all axes; same principle applies here: classification must reflect voting behavior, not just semantic content

### External References

- Dutch parliamentary motion data is already present in DuckDB (`data/motions.db`)
- `motion_drift.py` already uses Ridge/Lasso regression for axis stability; similar regression techniques can be applied to temporal trend fitting

---

## Key Technical Decisions

- **Keyword derivation method:** TF-IDF on motion titles/body_text, restricted to motions where `CANONICAL_RIGHT` parties vote predominantly *voor*. This avoids hand-curating and captures the actual language used by right-wing parties.
- **Hybrid classifier:** Two-stage: (1) keyword match for initial filtering, (2) voting-pattern confirmation requiring at least 60% support from right-wing parties AND at least 40% opposition from left-wing parties (the thresholds applied in U2). The opposition requirement excludes motions that left-wing parties also support, such as motions that use migration keywords neutrally.
- **Policy extremity:** Use an LLM (via `ai_provider.py` subagent pattern) to answer "What concrete policy change does this motion demand?" and rate the radicalism of that demand on a 1-5 scale. This captures policy substance, not emotional language.
- **Sentiment analysis:** Use a lightweight Dutch sentiment model (e.g., `pysentimiento` or similar) rather than an LLM, for cost and speed. If no good Dutch model exists, fall back to LLM batch calls.
- **Temporal granularity:** Annual buckets (`YYYY`), aligning with existing SVD window conventions (`2024`, `2025`, etc.)
- **Visualization:** Static Plotly charts exported to HTML/PNG, not yet integrated into Streamlit. This decouples analysis from UI work.

---

## Open Questions

### Resolved During Planning

- **Q: Should we analyze titles only or full body_text?**
  - **A:** Start with titles (fast, already cleaned), validate on a sample using body_text, and upgrade if precision is insufficient.
- **Q: How do we define "centrist parties" for cross-party support tracking?**
  - **A:** Treat as the complement of `CANONICAL_RIGHT` and `CANONICAL_LEFT` within `KNOWN_MAJOR_PARTIES`: VVD, D66, CDA, NSC, BBB, CU.

### Deferred to Implementation

- **Q: What is the optimal TF-IDF threshold for keyword inclusion?** — Depends on corpus distribution; will be determined by inspecting the ranked keyword list.
- **Q: Which Dutch sentiment model performs best on parliamentary language?** — Requires empirical testing; candidate models to be evaluated during U5.
- **Q: Does policy extremity scoring need few-shot examples for consistency?** — Will be determined during U4 implementation; if LLM outputs are inconsistent, add exemplars.

---

## High-Level Technical Design

> *This illustrates the intended approach and is directional guidance for review, not implementation specification. The implementing agent should treat it as context, not code to reproduce.*

```
┌─────────────────────────────────────────────────────────────┐
│  PHASE 1: KEYWORD DERIVATION (U1)                           │
│  - Filter motions where right-wing parties vote >60% voor   │
│  - Run TF-IDF on titles/body_text                           │
│  - Extract top-N distinctive terms                          │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│  PHASE 2: HYBRID CLASSIFICATION (U2)                        │
│  - Keyword filter: motion contains any top-N term           │
│  - Voting filter: right >=60% voor AND left >=40% tegen     │
│  - Output: right_wing_motions table with year, score        │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│  PHASE 3: TEMPORAL AGGREGATION (U3)                         │
│  - Group by year                                            │
│  - Compute: count, % of total motions, avg support          │
│  - Track: centrist party support over time                  │
│  - Output: yearly_summary DataFrame                         │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│  PHASE 4: POLICY EXTREMITY (U4)                             │
│  - Sample motions per year (stratified by keyword density)  │
│  - LLM prompt: "What concrete policy does this demand?"     │
│  - Rate radicalism 1-5                                      │
│  - Output: extremity_scores table                           │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│  PHASE 5: SENTIMENT ANALYSIS (U5)                           │
│  - Dutch sentiment model on motion titles/body_text         │
│  - Aggregate: avg sentiment per year                        │
│  - Output: sentiment_by_year DataFrame                      │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│  PHASE 6: VISUALIZATION (U6)                                │
│  - Time-series: volume, extremity, sentiment, centrist vote │
│  - Export: static HTML/PNG charts                           │
└─────────────────────────────────────────────────────────────┘
```

---

## Execution Strategy: Agent Assignments & Parallelization

| Unit | Agents | Can Parallelize With | Notes |
|------|--------|---------------------|-------|
| U1 | `ce-best-practices-researcher` (TF-IDF best practices, Dutch NLP), `general` (implementation) | None | Foundation unit; must complete before all downstream work. Keyword derivation is research-heavy; the researcher can pre-fetch Dutch political TF-IDF patterns while `general` implements the vote aggregation. DuckDB batch aggregation is I/O-bound and fast. |
| U2 | `general` (DuckDB SQL + Python) | None | Depends on U1. Pure computation; no external research needed. Produces the `right_wing_motions` table that U3, U4, and U5 all read. |
| U3 | `general` (DuckDB SQL + Python) | U4, U5 (after U2) | Depends on U2. Reads `right_wing_motions` and is independent of U4 and U5 until their yearly averages are merged into the summary. |
| U4 | `general` (batch LLM scoring) | U3, U5 (after U2) | Depends on U2. The LLM-based extremity scorer can be delegated to `general` in batch mode; no real-time LLM calls are needed. |
| U5 | `general` (implementation), `ce-framework-docs-researcher` (if exploring Dutch sentiment models) | U3, U4 (after U2) | Depends on U2. Model selection requires empirical testing before the full run. |
| U6 | `ce-best-practices-researcher` (time-series visualization patterns), `general` (Plotly implementation) | None | Depends on U3, U4, U5. Visualization is straightforward once the data shape is known. Browser tests of a Streamlit tab (`ce-test-browser`) belong to the deferred follow-up, not to this plan. |

**Parallelization summary:**
- **U1** and **U2** are strictly sequential (U2 needs the keyword list from U1).
- **U3, U4, and U5** can run in parallel once U2 completes; merging the U4 and U5 yearly averages into the U3 summary happens after all three finish.
- **U6** is sequential after U3, U4, and U5.
- Total wall-clock phases: **4** (U1 → U2 → U3∥U4∥U5 → U6).

---

## Implementation Units

- **U1. Keyword Derivation**

  **Goal:** Extract the most distinctive terms used in right-wing motions without hand-curating.

  **Requirements:** R1

  **Dependencies:** None

  **Files:**
  - Create: `analysis/right_wing/derive_keywords.py`
  - Reference: `scripts/derive_svd_labels.py` (read only, not modified)
  - Test: `tests/test_derive_keywords.py`

  **Approach:**
  1. Query `motions` joined with `mp_votes` to identify motions where right-wing parties vote predominantly *voor* (>60%).
  2. Collect a control group: motions where left-wing parties vote predominantly *voor* (>60%).
  3. Run TF-IDF on titles (and optionally body_text) for both groups, using Dutch stopwords.
  4. Compute differential TF-IDF: terms with high scores in right-group and low in left-group.
  5. Manually inspect top 50 terms and filter out generic parliamentary terms (e.g., "motie", "kamer").
  6. Persist the final keyword list to `analysis/right_wing/right_wing_keywords.json`.

  **Patterns to follow:**
  - `scripts/derive_svd_labels.py` for TF-IDF pipeline
  - `analysis/config.py` for `CANONICAL_RIGHT` / `CANONICAL_LEFT`

  **Test scenarios:**
  - Happy path: right-wing motions contain terms like "migratie", "asiel", "grenzen" in top results
  - Edge case: control group does NOT have these terms in top 50
  - Edge case: generic terms ("motie", "regering") are filtered out
  - Integration: keyword list can be loaded as JSON and used for regex matching

  **Verification:**
  - `right_wing_keywords.json` exists and contains >= 20 distinct terms
  - Manual inspection confirms terms are politically distinctive
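
The differential TF-IDF idea in steps 3 and 4 of the U1 approach can be sketched with the standard library alone. This is a rough illustration, not the intended implementation: the real script would presumably reuse the TF-IDF setup from `scripts/derive_svd_labels.py`. The stopword set, the function names, and the smoothed IDF formula are all assumptions made for this example.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stopword set; the real pipeline would reuse the Dutch
# stopwords already used in scripts/derive_svd_labels.py.
DUTCH_STOPWORDS = {"de", "het", "een", "van", "over", "en", "voor", "motie", "kamer"}

TOKEN_RE = re.compile(r"[^\W\d_]+")  # runs of Unicode letters


def tokenize(title: str) -> list[str]:
    """Lowercase a title and drop stopwords and very short tokens."""
    return [t for t in TOKEN_RE.findall(title.lower())
            if t not in DUTCH_STOPWORDS and len(t) > 2]


def mean_tfidf(group: list[str], all_docs: list[str]) -> dict[str, float]:
    """Average TF-IDF weight per term within one group of titles.

    IDF is computed over the combined corpus so both groups share one scale.
    """
    doc_freq: Counter = Counter()
    for doc in all_docs:
        doc_freq.update(set(tokenize(doc)))
    n_docs = len(all_docs)

    totals: dict[str, float] = defaultdict(float)
    for doc in group:
        tokens = tokenize(doc)
        if not tokens:
            continue
        for term, count in Counter(tokens).items():
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0  # smoothed IDF
            totals[term] += (count / len(tokens)) * idf
    return {term: total / len(group) for term, total in totals.items()}


def differential_keywords(right: list[str], left: list[str], top_n: int = 50) -> list[str]:
    """Rank terms by mean TF-IDF in the right group minus the left group."""
    all_docs = right + left
    right_scores = mean_tfidf(right, all_docs)
    left_scores = mean_tfidf(left, all_docs)
    diff = {t: s - left_scores.get(t, 0.0) for t, s in right_scores.items()}
    ranked = sorted(diff, key=lambda t: (-diff[t], t))
    return [t for t in ranked if diff[t] > 0][:top_n]


# Toy titles, invented purely for illustration.
right_titles = ["Motie over het beperken van asiel",
                "Motie over strengere grenzen en asiel"]
left_titles = ["Motie over het verhogen van het minimumloon",
               "Motie over betaalbare woningen"]
keywords = differential_keywords(right_titles, left_titles)
```

Subtracting the control-group score is what pushes generic parliamentary vocabulary down the ranking; the manual filter in step 5 then only has to catch what the control group does not.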

---

- **U2. Hybrid Motion Classifier**

  **Goal:** Identify "actually right-wing" motions using both keywords and voting patterns.

  **Requirements:** R2, R7

  **Dependencies:** U1

  **Files:**
  - Create: `analysis/right_wing/classify_motions.py`
  - Test: `tests/test_classify_motions.py`

  **Approach:**
  1. **Keyword filter:** Match motion title/body_text against `right_wing_keywords.json` (case-insensitive, whole-word regex).
  2. **Voting filter:** For each candidate motion, compute:
     - `right_support` = % of `CANONICAL_RIGHT` parties voting *voor*
     - `left_opposition` = % of `CANONICAL_LEFT` parties voting *tegen*
     - Pass if `right_support >= 0.60` AND `left_opposition >= 0.40`
  3. Output a DuckDB table `right_wing_motions` with columns: `motion_id`, `year`, `title`, `right_support`, `left_opposition`, `keyword_matches`.
  4. Run a validation sample: manually inspect 20 random classified motions and 20 random non-classified motions to estimate precision/recall.

  **Patterns to follow:**
  - `scripts/motion_drift.py` for cross-ideological voting logic
  - `analysis/explorer_data.py` for data loading patterns

  **Test scenarios:**
  - Happy path: a PVV motion about "asielzoekers" with 80% right support and 60% left opposition is classified
  - Edge case: a PvdA motion mentioning "migratie" neutrally with 20% right support is NOT classified
  - Edge case: motion with right support 60% but left opposition only 10% is NOT classified
  - Error path: motion with no votes is skipped gracefully
  - Integration: `right_wing_motions` table is queryable and has expected row count

  **Verification:**
  - Validation sample shows >80% precision and >70% recall
  - `right_wing_motions` table contains >100 rows (non-empty)
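
The two-stage rule in the U2 approach can be sketched as follows. The thresholds and `CANONICAL_RIGHT` come from this plan; everything else is an assumption for illustration: the function names, the `votes` mapping from party to vote, the stand-in left-party set (the real `CANONICAL_LEFT` lives in `analysis/config.py`), and the choice to compute shares only over parties with a recorded vote.

```python
import re
from dataclasses import dataclass
from typing import Optional

CANONICAL_RIGHT = {"PVV", "FVD", "JA21", "SGP"}  # as quoted from analysis/config.py
ILLUSTRATIVE_LEFT = {"SP", "PvdA"}  # stand-in only; not the real CANONICAL_LEFT


def build_keyword_pattern(keywords: set[str]) -> re.Pattern:
    """Compile a case-insensitive, whole-word alternation of the keywords."""
    if not keywords:
        raise ValueError("keyword list is empty")
    alternation = "|".join(re.escape(k) for k in sorted(keywords, key=len, reverse=True))
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)


def share(votes: dict[str, str], parties: set[str], choice: str) -> Optional[float]:
    """Fraction of the given parties (with a recorded vote) that voted `choice`."""
    cast = [votes[p] for p in parties if p in votes]
    if not cast:
        return None  # no vote data: caller should skip the motion
    return sum(v == choice for v in cast) / len(cast)


@dataclass
class Classification:
    is_right_wing: bool
    right_support: Optional[float]
    left_opposition: Optional[float]
    keyword_matches: list[str]


def classify(title: str, votes: dict[str, str], pattern: re.Pattern,
             right_parties: set[str], left_parties: set[str],
             min_right_support: float = 0.60,
             min_left_opposition: float = 0.40) -> Classification:
    """Stage 1: keyword match. Stage 2: voting-pattern confirmation."""
    matches = sorted({m.lower() for m in pattern.findall(title)})
    right_support = share(votes, right_parties, "voor")
    left_opposition = share(votes, left_parties, "tegen")
    passes = (
        bool(matches)
        and right_support is not None
        and left_opposition is not None
        and right_support >= min_right_support
        and left_opposition >= min_left_opposition
    )
    return Classification(passes, right_support, left_opposition, matches)


pattern = build_keyword_pattern({"asielzoekers", "migratie"})
result = classify(
    "Motie over de opvang van asielzoekers",
    {"PVV": "voor", "FVD": "voor", "JA21": "voor", "SGP": "tegen",
     "SP": "tegen", "PvdA": "tegen"},
    pattern, CANONICAL_RIGHT, ILLUSTRATIVE_LEFT,
)
```

One consequence of whole-word matching worth checking during validation: a keyword such as "asiel" does not match the compound "asielzoekers", so Dutch compounds only match if they appear in the keyword list themselves.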

---

- **U3. Temporal Aggregation**

  **Goal:** Compute yearly trends in right-wing motion volume, support, and cross-party adoption.

  **Requirements:** R3

  **Dependencies:** U2

  **Files:**
  - Create: `analysis/right_wing/temporal_analysis.py`
  - Test: `tests/test_temporal_analysis.py`

  **Approach:**
  1. Group `right_wing_motions` by `year`.
  2. For each year, compute:
     - `total_right_wing`: count of right-wing motions
     - `pct_of_total`: % of all motions that year
     - `avg_right_support`: average right-party support
     - `avg_left_opposition`: average left-party opposition
     - `centrist_support`: % of centrist parties (VVD, D66, CDA, NSC, BBB, CU) voting *voor*
     - `avg_extremity_score`: placeholder for the yearly mean of U4 scores (NULL until backfilled)
  3. Compute year-over-year deltas for each metric.
  4. Persist to `yearly_right_wing_summary` table or DataFrame export.

  **Patterns to follow:**
  - `analysis/trajectory.py` for time-windowed aggregations
  - `analysis/explorer_data.py` for DuckDB-to-pandas patterns

  **Test scenarios:**
  - Happy path: each year has a row with all metrics computed
  - Edge case: year with 0 right-wing motions shows 0 count and NULL centrist support
  - Edge case: year with missing vote data for some parties still computes available metrics
  - Integration: output DataFrame can be merged with U4 extremity scores

  **Verification:**
  - One row per year from earliest to latest motion
  - `pct_of_total` lies between 0% and 100% for every year and matches a manual count for a sample year
  - Centrist support shows a plausible trend (not all 0% or 100%)
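
The per-year metrics and year-over-year deltas from the U3 approach can be sketched in plain Python. The actual unit would more likely use DuckDB SQL or pandas, following `analysis/explorer_data.py`; the row layout (one dict per classified motion with a per-motion `centrist_support` value) and the function names are assumptions, and the U4 placeholder column is left out here.

```python
from statistics import mean
from typing import Optional


def _mean_or_none(rows: list[dict], column: str) -> Optional[float]:
    """Mean of a column, ignoring missing values; None if nothing is available."""
    values = [r[column] for r in rows if r.get(column) is not None]
    return mean(values) if values else None


def yearly_summary(right_wing_rows: list[dict],
                   total_motions_by_year: dict[int, int]) -> list[dict]:
    """One summary row per year, including years with zero right-wing motions."""
    by_year: dict[int, list[dict]] = {}
    for row in right_wing_rows:
        by_year.setdefault(row["year"], []).append(row)

    summary = []
    for year in sorted(total_motions_by_year):
        rows = by_year.get(year, [])
        total = total_motions_by_year[year]
        summary.append({
            "year": year,
            "total_right_wing": len(rows),
            "pct_of_total": 100.0 * len(rows) / total if total else None,
            "avg_right_support": _mean_or_none(rows, "right_support"),
            "avg_left_opposition": _mean_or_none(rows, "left_opposition"),
            "centrist_support": _mean_or_none(rows, "centrist_support"),
        })
    return summary


def add_yoy_deltas(summary: list[dict], metrics: list[str]) -> list[dict]:
    """Add `<metric>_delta` columns; None when either year lacks the metric."""
    previous = None
    for row in summary:
        for metric in metrics:
            if previous is None or row[metric] is None or previous[metric] is None:
                row[f"{metric}_delta"] = None
            else:
                row[f"{metric}_delta"] = row[metric] - previous[metric]
        previous = row
    return summary


# Toy input, invented for illustration.
rows = [
    {"year": 2022, "right_support": 1.0, "left_opposition": 0.5, "centrist_support": 0.25},
    {"year": 2022, "right_support": 0.5, "left_opposition": 1.0, "centrist_support": 0.75},
]
summary = add_yoy_deltas(
    yearly_summary(rows, {2022: 10, 2023: 20}),
    ["total_right_wing", "centrist_support"],
)
```

Iterating over the years of the full motions table, rather than over the years present in `right_wing_motions`, is what makes the zero-motion edge case produce a row at all.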

---

- **U4. Policy Extremity Scoring**

  **Goal:** Score how radical the policy demand of each right-wing motion is.

  **Requirements:** R4

  **Dependencies:** U2

  **Files:**
  - Create: `analysis/right_wing/extremity_scorer.py`
  - Test: `tests/test_extremity_scorer.py`

  **Approach:**
  1. For each right-wing motion, build a prompt:
     > "Dit is een motie in het Nederlandse parlement. Wat vraagt de motie concreet? Beoordeel hoe radicaal dit voorstel is op een schaal van 1 (mild/technisch) tot 5 (extreem/fundamenteel). Geef alleen het cijfer en een korte verklaring in het Nederlands."
  2. Use `ai_provider.chat_completion_json()` with a JSON schema enforcing integer 1-5 + explanation string.
  3. Batch process motions (parallel API calls, 10-15 per batch) to keep wall-clock time down; cost is controlled by the sampling option described under Risks.
  4. Store results in `extremity_scores` table: `motion_id`, `score`, `explanation`.
  5. Compute yearly average and merge into U3's summary.

  **Execution note:** Start with a sample of 50 motions to validate scoring consistency before running the full set.

  **Patterns to follow:**
  - `summarizer.py` for batch LLM processing and parallel API calls
  - `ai_provider.py` for JSON-mode chat completions

  **Test scenarios:**
  - Happy path: a motion demanding "sluit alle AZC's" scores 5/5
  - Happy path: a motion requesting "rapporteer cijfers" scores 1/5
  - Edge case: LLM returns invalid JSON → retry, then mark the score as NULL if it still fails
  - Error path: API failure → motion is skipped, not blocking
  - Integration: extremity scores correlate with keyword intensity (sanity check)

  **Verification:**
  - Sample of 50 shows test-retest consistency (same motion re-scored twice gets same score)
  - Score distribution is not all 1s or all 5s (has variance)
  - `extremity_scores` table covers >= 90% of right-wing motions
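
The validation and fallback behavior described for U4 (integer 1-5 plus explanation, retry on invalid JSON, NULL on failure) might look like the sketch below. The plan does not give the signature of `ai_provider.chat_completion_json()`, so the LLM call is injected as a hypothetical `call_llm` callable that takes a prompt and returns raw text; the `score`/`explanation` field names and the default of two attempts are also assumptions.

```python
import json
from typing import Callable, Optional


def parse_extremity(raw: str) -> Optional[dict]:
    """Validate an LLM reply: integer score in 1..5 and a non-empty explanation."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return None
    if not isinstance(data, dict):
        return None
    score, explanation = data.get("score"), data.get("explanation")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or not 1 <= score <= 5:
        return None
    if not isinstance(explanation, str) or not explanation.strip():
        return None
    return {"score": score, "explanation": explanation.strip()}


def score_motion(motion_id: str, prompt: str,
                 call_llm: Callable[[str], str],
                 max_attempts: int = 2) -> dict:
    """Score one motion; returns NULL (None) fields instead of raising."""
    for _ in range(max_attempts):
        try:
            parsed = parse_extremity(call_llm(prompt))
        except Exception:  # API failure must not block the batch
            parsed = None
        if parsed is not None:
            return {"motion_id": motion_id, **parsed}
    return {"motion_id": motion_id, "score": None, "explanation": None}


# Fake LLM for illustration: first reply is malformed, second is valid.
replies = iter(["not json", '{"score": 4, "explanation": "Vraagt een wetswijziging."}'])
row = score_motion("m-1", "prompt text", lambda prompt: next(replies))
```

Keeping the parser separate from the retry loop makes the invalid-JSON and API-failure test scenarios testable without any network access.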

---

- **U5. Sentiment Analysis Pipeline**

  **Goal:** Track emotional tone of right-wing motions over time as a proxy for radicalization.

  **Requirements:** R5

  **Dependencies:** U2

  **Files:**
  - Create: `analysis/right_wing/sentiment_analysis.py`
  - Test: `tests/test_sentiment_analysis.py`

  **Approach:**
  1. Evaluate candidate Dutch sentiment models:
     - `pysentimiento` (only if it ships a Dutch model; confirm language support first)
     - `transformers` pipeline with Dutch BERT (`wietsedv/bert-base-dutch-cased` fine-tuned for sentiment)
     - Fallback: LLM batch calls if models are poor
  2. Run on motion titles + first 200 chars of body_text (avoiding noise).
  3. Map outputs to [-1, 1] scale (negative = hostile/aggressive, positive = constructive).
  4. Aggregate by year: avg sentiment, std deviation, % strongly negative.
  5. Merge into U3 summary.

  **Patterns to follow:**
  - `pipeline/text_pipeline.py` for text preprocessing
  - Lightweight model evaluation script similar to `test_mistral.py`

  **Test scenarios:**
  - Happy path: motion with "stop de immigratie" scores negative
  - Happy path: motion with "verbeter de procedure" scores neutral/positive
  - Edge case: very short title (< 5 words) is handled gracefully
  - Error path: model fails to load → fall back to LLM or skip
  - Integration: sentiment trend correlates with extremity trend (sanity check)

  **Verification:**
  - Sentiment scores show variance (not all identical)
  - Manual inspection of 10 random motions confirms direction is plausible
  - Model inference time is < 100ms per motion (acceptable for batch)
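
Steps 3 and 4 of the U5 approach (mapping model output to [-1, 1] and aggregating per year) can be sketched independently of whichever model is chosen. The label names, the sign-times-confidence mapping, and the -0.5 cutoff for "strongly negative" are assumptions; real models may emit different labels, so the mapping table would need adjusting per candidate.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Assumed label names; adjust to the labels of the model actually selected.
LABEL_SIGN = {"POS": 1.0, "NEU": 0.0, "NEG": -1.0}


def to_signed_score(label: str, confidence: float) -> float:
    """Map a (label, confidence) pair to a score in [-1, 1]."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return LABEL_SIGN[label.upper()] * confidence  # KeyError on unknown labels


def aggregate_by_year(records: list[tuple[int, float]],
                      strongly_negative_below: float = -0.5) -> dict[int, dict]:
    """Per-year mean, population std deviation, and % strongly negative."""
    by_year: dict[int, list[float]] = defaultdict(list)
    for year, score in records:
        by_year[year].append(score)
    return {
        year: {
            "avg_sentiment": mean(scores),
            "std_sentiment": pstdev(scores),
            "pct_strongly_negative": 100.0
            * sum(s < strongly_negative_below for s in scores) / len(scores),
        }
        for year, scores in sorted(by_year.items())
    }


# Toy records, invented for illustration.
records = [
    (2022, to_signed_score("NEG", 0.75)),
    (2022, to_signed_score("POS", 0.25)),
    (2023, to_signed_score("NEU", 0.9)),
]
sentiment_by_year = aggregate_by_year(records)
```

The population standard deviation is used so that a year with a single motion yields 0.0 rather than an error.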

---

- **U6. Time-Series Visualization**

  **Goal:** Produce static charts showing volume, extremity, sentiment, and centrist support over time.

  **Requirements:** R6

  **Dependencies:** U3, U4, U5

  **Files:**
  - Create: `analysis/right_wing/visualize_trends.py`
  - Output: `output/right_wing_trends.html` and/or `.png` files

  **Approach:**
  1. Load `yearly_right_wing_summary` DataFrame.
  2. Generate 4 charts:
     - **Volume:** Line chart of `total_right_wing` + `pct_of_total` (dual axis)
     - **Extremity:** Line chart of `avg_extremity_score` with error bars (std)
     - **Sentiment:** Line chart of `avg_sentiment` + % strongly negative (stacked area)
     - **Centrist Support:** Line chart of `centrist_support` over time with party breakdown
  3. Use Plotly (consistent with `analysis/visualize.py`) with dark theme colors.
  4. Export to `output/right_wing_trends.html` (interactive) and `.png` (static).

  **Patterns to follow:**
  - `analysis/visualize.py` for Plotly setup and theming
  - `analysis/tabs/_rendering.py` for dark theme color constants

  **Test scenarios:**
  - Happy path: HTML file is generated and opens without errors
  - Happy path: all 4 charts render with data spanning multiple years
  - Edge case: missing data for some years → chart shows gaps, not crashes
  - Integration: charts make it possible to assess the user's hypothesis (stable volume, rising extremity/centrist support), whether or not the data supports it

  **Verification:**
  - `output/right_wing_trends.html` exists and is > 100KB
  - Manual inspection confirms the charts are legible and that trends, or their absence, can be read from them
  - Charts are suitable for sharing (static PNGs are readable)
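
For the "missing data for some years" edge case in U6, one option is to expand the summary to a continuous year range before plotting, with `None` for missing values. Plotly line traces leave a gap at `None` values unless `connectgaps` is enabled, so the charts would show a break rather than interpolate. The helper below is a hypothetical sketch; its name and the list-of-dicts input shape are assumptions.

```python
def fill_year_gaps(summary_rows: list[dict], metrics: list[str]) -> dict[str, list]:
    """Return column lists over a continuous year range, None where data is missing."""
    if not summary_rows:
        return {"year": [], **{m: [] for m in metrics}}
    by_year = {row["year"]: row for row in summary_rows}
    years = list(range(min(by_year), max(by_year) + 1))
    columns: dict[str, list] = {"year": years}
    for metric in metrics:
        columns[metric] = [by_year.get(y, {}).get(metric) for y in years]
    return columns


# Toy summary with 2022 missing entirely, invented for illustration.
columns = fill_year_gaps(
    [{"year": 2021, "centrist_support": 0.2},
     {"year": 2023, "centrist_support": 0.4}],
    ["centrist_support"],
)
```

The resulting lists can be passed as the x and y values of a trace directly.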

---

## System-Wide Impact

- **New tables:** `right_wing_motions`, `extremity_scores`, and `yearly_right_wing_summary` (if the summary is persisted as a table rather than exported). These are derived analysis tables, not core schema, and can be regenerated.
- **No changes to existing tables:** `motions`, `mp_votes`, `svd_vectors` are read-only for this feature.
- **No API surface changes:** This is an offline analysis script, not integrated into the app yet.
- **Performance:** TF-IDF on ~29K titles is trivial. LLM calls for U4 are the bottleneck; batching keeps cost manageable.

---

## Risks & Dependencies

| Risk | Mitigation |
|------|------------|
| Keyword list overfits to current parliamentary period | Validate across multiple years; include historical data in TF-IDF corpus |
| LLM extremity scoring is inconsistent | Add few-shot examples; validate on sample before full run; allow NULL scores |
| Dutch sentiment model performs poorly on parliamentary language | Evaluate multiple models; fall back to LLM if needed |
| Classification has false positives (left-wing motions caught) | Hybrid voting filter mitigates this; validation sample checks precision |
| LLM API costs for extremity scoring exceed budget | Batch aggressively; score a stratified sample (e.g., 30 per year) instead of all motions |
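
The cost mitigation in the last row (scoring a fixed-size sample per year) can be sketched as a seeded per-year draw. This version stratifies by year only; the additional stratification by keyword density mentioned in the technical design is not covered. The function name, the `year` key, and the default seed are assumptions.

```python
import random
from collections import defaultdict


def sample_per_year(motions: list[dict], per_year: int = 30, seed: int = 0) -> list[dict]:
    """Draw up to `per_year` motions from each year, reproducibly."""
    rng = random.Random(seed)  # fixed seed so the sample can be regenerated
    by_year: dict[int, list[dict]] = defaultdict(list)
    for motion in motions:
        by_year[motion["year"]].append(motion)

    sample: list[dict] = []
    for year in sorted(by_year):
        group = by_year[year]
        sample.extend(rng.sample(group, min(per_year, len(group))))
    return sample


# Toy data, invented for illustration: 100 motions in 2022, 5 in 2023.
motions = ([{"motion_id": f"a{i}", "year": 2022} for i in range(100)]
           + [{"motion_id": f"b{i}", "year": 2023} for i in range(5)])
sample = sample_per_year(motions, per_year=30)
```

Years with fewer motions than the quota are taken in full, so the yearly averages for sparse years rest on fewer scores and should be read with that in mind.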

---

## Documentation / Operational Notes

- Add a README in `analysis/right_wing/` explaining how to regenerate the analysis
- Document the keyword list and classification thresholds for reproducibility
- Note the LLM model used for extremity scoring and its version/date

---

## Sources & References

- **Related code:** `scripts/motion_drift.py`, `scripts/derive_svd_labels.py`, `analysis/axis_classifier.py`
- **Related learnings:** `docs/solutions/best-practices/svd-labels-voting-patterns-not-semantics.md`
- **Origin:** User request — analyze right-wing motion trends over time