---
module: llm-classification
tags: [polarization, nlp, prompt-design, democratic-norms]
problem_type: classification-schema-design
date: 2026-04-05
reviewed_by:
  - correctness-reviewer
  - domain-expert (Dutch politics)
  - clarity-reviewer
---

# LLM Motion Classification: Prompt Design Lessons

## Problem

The goal was to classify 28,000 Dutch parliamentary motions by "extremity" in order to measure polarization over time. The initial prompt conflated several distinct concepts:

- Democratic norm erosion
- Populist rhetoric style
- Group targeting
- Restrictiveness vs permissiveness

## Initial v1 Design (Flawed)

```
EXTREMITY_SCORE (1-5):
- 1: Mainstream
- 5: "Undermines checks & balances, threatens rule of law, discriminates groups, populist rhetoric"
```

**Problems identified:**

1. Populist rhetoric is style, not substance, so it shouldn't share a score with democratic erosion
2. "Extreme" is undefined: compared to what baseline?
3. Score 4/5 boundary unclear
4. TARGETED_GROUP redundant with EXTREMITY_SCORE
5. EU deviation always = score 5 (too broad)
6. Missing Dutch-specific patterns (Nexit, referendum abolition)

## Refined v2 Design (Four Orthogonal Dimensions)

### 1. DEMOCRATIC_EROSION (0-4): Substance only

| Score | Label | Criteria |
|-------|-------|----------|
| 0 | None | No impact on democratic norms |
| 1 | Minor | Small procedural deviations |
| 2 | Moderate | Substantial policy change that stays within the constitutional framework |
| 3 | Significant | Fundamental change to checks & balances |
| 4 | Critical | Undermines rule of law or press freedom, or introduces systematic discrimination |

**Decision rules:**

- Score 4 ONLY if: (a) direct attack on judiciary/press, OR (b) systematic discrimination in law, OR (c) call to violate international treaties
- Score 3 if: (a) abolish referendum, OR (b) fundamentally question EU cooperation, OR (c) significantly expand executive powers

### 2. POPULIST_STYLE (0-1): Style only

Independent of democratic impact. A motion can be populist in style (POPULIST_STYLE = 1) without eroding democratic norms (DEMOCRATIC_EROSION = 0).


**Indicators:**

- "Het volk" (the people) vs "de elite/Den Haag" (the elite / The Hague)
- "Wij vs zij" (us vs them) framing
- Call for "direct democracy" without checks
- Emotionally charged language

### 3. GROUP_TARGETING (0-2): Targeting only

| Score | Label | Criteria |
|-------|-------|----------|
| 0 | Universal | General policy |
| 1 | Indirect | General policy that disproportionately affects groups |
| 2 | Direct | Explicitly targets a specific population group |

### 4. RESTRICTIVENESS (-1 to +1): Direction only

| Score | Label |
|-------|-------|
| -1 | Expansive |
| 0 | Neutral |
| +1 | Restrictive |

## Key Lessons Learned

### 1. Separate Style from Substance

Populist rhetoric ≠ democratic erosion. A mainstream party using strong language isn't anti-democratic for that reason alone. Conflating the two causes false positives.

### 2. Make Dimensions Orthogonal

- DEMOCRATIC_EROSION × RESTRICTIVENESS: A policy can be erosive AND restrictive, or erosive AND expansive
- POPULIST_STYLE × DEMOCRATIC_EROSION: A motion can be populist (1) with no erosion (0), or non-populist (0) with high erosion
- GROUP_TARGETING × RESTRICTIVENESS: Restrictive ≠ targeted (and vice versa)

### 3. Add Decision Rules for Boundaries

Vague transitions ("significant" → "critical") cause inconsistent scoring. Define specific triggers:

```
Score 4 ONLY when: (a) OR (b) OR (c)
Score 3 when: (a) OR (b) OR (c)
```

### 4. Gradate EU Deviation

Not all EU deviation is equal:

- Dutch implementation of EU policy → erosion 0-1
- Violate EU rules → erosion 2-3
- Nexit / leave EU → erosion 3-4

### 5. Include Domain-Specific Patterns

Dutch context matters:

- Referendum abolition = DEMOCRATIC_EROSION 3
- "Den Haag" / "establishment" attacks = check for POPULIST_STYLE
- Nexit = DEMOCRATIC_EROSION 3-4 depending on framing

### 6. Define Reference Baselines

"Abnormal" compared to what? Candidate baselines:

- 2016 consensus
- EU norms
- Historical Dutch practice
- International standards

## Testing Recommendations

1. **Calibration set**: Annotate 50 motions with experts before running in production
2. **Boundary cases**: Test score 3/4 transitions explicitly
3. **Cross-rater reliability**: Run multiple classifiers on the same motions and measure their agreement
4. **Domain-specific test cases**: Migration, EU, constitutional reform

## Files

- `scripts/classify_motions.py`: Implementation with v2 prompt
- `docs/research/motion-classification-prompt-v2.md`: Full prompt documentation
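
As a rough illustration of the v2 schema and the cross-rater check from the testing recommendations, the sketch below validates the four score ranges and computes Cohen's kappa per dimension for two classifiers. It is not taken from `scripts/classify_motions.py`; the names `RANGES`, `MotionLabel`, `cohen_kappa`, and `agreement_by_dimension` are hypothetical, and unweighted kappa is an assumption about how agreement might be measured.

```python
from collections import Counter
from dataclasses import dataclass

# Score ranges copied from the v2 design; all names here are hypothetical.
RANGES = {
    "democratic_erosion": (0, 4),
    "populist_style": (0, 1),
    "group_targeting": (0, 2),
    "restrictiveness": (-1, 1),
}


@dataclass(frozen=True)
class MotionLabel:
    """One classifier's scores for a single motion."""

    democratic_erosion: int
    populist_style: int
    group_targeting: int
    restrictiveness: int

    def __post_init__(self):
        # Reject non-integer or out-of-range scores as early as possible.
        for name, (low, high) in RANGES.items():
            value = getattr(self, name)
            if isinstance(value, bool) or not isinstance(value, int):
                raise ValueError(f"{name} must be an int, got {value!r}")
            if not low <= value <= high:
                raise ValueError(f"{name}={value} outside [{low}, {high}]")


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length score sequences."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal frequencies.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used the same single score throughout
    return (observed - expected) / (1.0 - expected)


def agreement_by_dimension(labels_a, labels_b):
    """Kappa per dimension for two lists of MotionLabel over the same motions."""
    return {
        name: cohen_kappa(
            [getattr(label, name) for label in labels_a],
            [getattr(label, name) for label in labels_b],
        )
        for name in RANGES
    }
```

Reporting agreement per dimension rather than as one pooled number keeps the dimensions separate, in line with the orthogonality lesson. Unweighted kappa treats a 3-vs-4 disagreement on DEMOCRATIC_EROSION the same as 0-vs-4, so a weighted variant may be a better fit for that ordinal scale.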