| module | tags | problem_type | date | reviewed_by |
|---|---|---|---|---|
| llm-classification | [polarization nlp prompt-design democratic-norms] | classification-schema-design | 2026-04-05 | [correctness-reviewer domain-expert (Dutch politics) clarity-reviewer] |
LLM Motion Classification: Prompt Design Lessons
Problem
The goal was to classify 28,000 Dutch parliamentary motions by "extremity" in order to measure polarization over time.
Initial prompt conflated multiple concepts:
- Democratic norm erosion
- Populist rhetoric style
- Group targeting
- Restrictiveness vs permissiveness
Initial v1 Design (Flawed)
EXTREMITY_SCORE (1-5):
- 1: Mainstream
- 5: "Undermines checks & balances, threatens rule of law,
discriminates groups, populist rhetoric"
Problems identified:
- Populist rhetoric is style, not substance — shouldn't be in same score as democratic erosion
- "Extreme" undefined — compared to what baseline?
- Score 4/5 boundary unclear
- TARGETED_GROUP redundant with EXTREMITY_SCORE
- EU deviation always = score 5 (too broad)
- Missing Dutch-specific patterns (Nexit, referendum abolition)
Refined v2 Design (Four Orthogonal Dimensions)
1. DEMOCRATIC_EROSION (0-4) — Substance only
| Score | Label | Criteria |
|---|---|---|
| 0 | None | No impact on democratic norms |
| 1 | Minor | Small procedural deviations |
| 2 | Moderate | Significant policy change, within constitutional framework |
| 3 | Significant | Fundamental change to checks & balances |
| 4 | Critical | Undermines rule of law or press freedom, or introduces systematic discrimination |
Decision rules:
- Score 4 ONLY if: (a) direct attack on judiciary/press, OR (b) systematic discrimination in law, OR (c) call to violate international treaties
- Score 3 if: (a) abolish referendum, OR (b) fundamentally question EU cooperation, OR (c) significantly expand executive powers
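The decision rules above can also be checked mechanically against a model's output. The following is a minimal sketch, not the project's implementation: the flag names and the `validate_erosion_score` helper are hypothetical, and it assumes the triggers have already been extracted as booleans (for example by a separate prompt or by an annotator). It reads "Score 4 ONLY if" as a necessary condition, and a score-3 trigger as setting a floor of 3.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ErosionFlags:
    """Hypothetical boolean triggers mirroring the decision rules."""
    # Score-4 triggers
    attacks_judiciary_or_press: bool = False
    systematic_discrimination_in_law: bool = False
    calls_to_violate_treaties: bool = False
    # Score-3 triggers
    abolishes_referendum: bool = False
    questions_eu_cooperation: bool = False
    expands_executive_powers: bool = False


def validate_erosion_score(score: int, flags: ErosionFlags) -> List[str]:
    """Return a list of rule violations; an empty list means consistent."""
    problems = []
    if not 0 <= score <= 4:
        problems.append("score outside 0-4")

    has_score4_trigger = (
        flags.attacks_judiciary_or_press
        or flags.systematic_discrimination_in_law
        or flags.calls_to_violate_treaties
    )
    has_score3_trigger = (
        flags.abolishes_referendum
        or flags.questions_eu_cooperation
        or flags.expands_executive_powers
    )

    # "Score 4 ONLY if": a 4 without any score-4 trigger breaks the rule.
    if score == 4 and not has_score4_trigger:
        problems.append("score 4 without a score-4 trigger")
    # A score-3 trigger is treated here as a floor of 3 (an assumption).
    if has_score3_trigger and score < 3:
        problems.append("score-3 trigger present but score below 3")
    return problems
```

A check like this does not replace the prompt; it only flags outputs that contradict the stated rules so they can be reviewed.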
2. POPULIST_STYLE (0-1) — Style only
Independent of democratic impact. A motion can be populist in style (POPULIST_STYLE = 1) without eroding democratic norms (DEMOCRATIC_EROSION = 0).
Indicators:
- "Het volk" vs "de elite/den Haag"
- "Wij vs zij" framing
- Call for "direct democracy" without checks
- Emotionally charged language
3. GROUP_TARGETING (0-2) — Targeting only
| Score | Label |
|---|---|
| 0 | Universal — general policy |
| 1 | Indirect — general policy that disproportionately affects groups |
| 2 | Direct — explicitly targets specific population group |
4. RESTRICTIVENESS (-1 to +1) — Direction only
| Score | Label |
|---|---|
| -1 | Expansive |
| 0 | Neutral |
| +1 | Restrictive |
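One way to keep the four dimensions separate downstream is to parse each model response into a typed record that rejects out-of-range values. The sketch below is illustrative only: it assumes the model returns a JSON object keyed by the dimension names, which the original does not specify, and `MotionScores` and `parse_scores` are hypothetical names.

```python
import json
from dataclasses import dataclass

# Allowed values per dimension, taken from the v2 tables above.
ALLOWED = {
    "democratic_erosion": range(0, 5),   # 0-4
    "populist_style": range(0, 2),       # 0-1
    "group_targeting": range(0, 3),      # 0-2
    "restrictiveness": range(-1, 2),     # -1 to +1
}


@dataclass(frozen=True)
class MotionScores:
    democratic_erosion: int
    populist_style: int
    group_targeting: int
    restrictiveness: int

    def __post_init__(self):
        # Reject non-integers (including booleans) and out-of-range values.
        for name, allowed in ALLOWED.items():
            value = getattr(self, name)
            is_int = isinstance(value, int) and not isinstance(value, bool)
            if not is_int or value not in allowed:
                raise ValueError(
                    f"{name}={value!r} is outside "
                    f"{allowed.start}..{allowed.stop - 1}"
                )


def parse_scores(raw: str) -> MotionScores:
    """Parse an assumed JSON response such as {"DEMOCRATIC_EROSION": 2, ...}."""
    data = json.loads(raw)
    # Missing or unexpected keys raise TypeError from the constructor.
    return MotionScores(**{key.lower(): value for key, value in data.items()})
```

Failing loudly on a malformed response is a design choice here: a silently clipped score would be indistinguishable from a genuine boundary case.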
Key Lessons Learned
1. Separate Style from Substance
Populist rhetoric ≠ democratic erosion. A mainstream party using strong language isn't anti-democratic. Conflating them causes false positives.
2. Make Dimensions Orthogonal
- DEMOCRATIC_EROSION × RESTRICTIVENESS: A policy can be erosive AND restrictive, or erosive AND expansive
- POPULIST_STYLE × DEMOCRATIC_EROSION: A motion can score POPULIST_STYLE 1 with DEMOCRATIC_EROSION 0, and vice versa
- GROUP_TARGETING × RESTRICTIVENESS: Restrictive ≠ targeted (and vice versa)
3. Add Decision Rules for Boundaries
Vague transitions ("significant" → "critical") cause inconsistency. Define specific triggers:
Score 4 ONLY when: (a) OR (b) OR (c)
Score 3 when: (a) OR (b) OR (c)
4. Gradate EU Deviation
Not all EU deviation is equal:
- Dutch implementation of EU policy → erosion 0-1
- Violate EU rules → erosion 2-3
- Nexit / leave EU → erosion 3-4
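The gradation above can be encoded as a lookup for sanity-checking scores on EU-related motions. This is a sketch under assumptions: the category keys and the `erosion_in_expected_range` helper are hypothetical, and it presumes each motion has already been assigned one of the three EU categories.

```python
# Expected DEMOCRATIC_EROSION ranges per EU-deviation category (inclusive),
# copied from the list above. The category keys are illustrative names.
EU_EROSION_RANGE = {
    "national_implementation": (0, 1),  # Dutch implementation of EU policy
    "rule_violation": (2, 3),           # violate EU rules
    "nexit": (3, 4),                    # Nexit / leave EU
}


def erosion_in_expected_range(category: str, score: int) -> bool:
    """Return True if the score falls in the expected range for the category."""
    low, high = EU_EROSION_RANGE[category]  # KeyError for unknown categories
    return low <= score <= high
```

For example, `erosion_in_expected_range("national_implementation", 4)` returns `False`, which would flag the v1 behaviour of scoring every EU deviation at the top of the scale.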
5. Include Domain-Specific Patterns
Dutch context matters:
- Referendum abolition = score 3
- "Den Haag" / "establishment" attacks = check for populist style
- Nexit = score 3-4 depending on framing
6. Define Reference Baselines
"Extreme" compared to what? Candidate baselines:
- 2016 consensus
- EU norms
- Historical Dutch practice
- International standards
Testing Recommendations
- Calibration set: 50 motions with expert annotations before production
- Boundary cases: Test score 3/4 transitions explicitly
- Cross-rater reliability: Multiple classifiers on same motions
- Domain-specific test cases: Migration, EU, constitutional reform
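For the cross-rater reliability check, agreement between two classifiers (or a classifier and an expert) on the calibration set can be summarized with Cohen's kappa. The sketch below is a plain-Python version of the standard unweighted formula; the function name and the example labels are illustrative, not project data. Because unweighted kappa treats a 3-vs-4 disagreement the same as 0-vs-4, a weighted variant may suit the ordinal DEMOCRATIC_EROSION scale better.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item set")

    n = len(rater_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivial.
        return 1.0
    return (observed - expected) / (1 - expected)


# Illustrative labels only, not results from the project.
model_scores = [0, 1, 2, 3, 4, 4]
expert_scores = [0, 1, 2, 3, 4, 4]
kappa = cohens_kappa(model_scores, expert_scores)
```

Computing kappa per dimension, rather than over all four pooled together, keeps a well-behaved dimension from masking an unreliable one.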
Files
- scripts/classify_motions.py: Implementation with v2 prompt
- docs/research/motion-classification-prompt-v2.md: Full prompt documentation