---
module: llm-classification
tags: [polarization, nlp, prompt-design, democratic-norms]
problem_type: classification-schema-design
date: 2026-04-05
reviewed_by:
  - correctness-reviewer
  - domain-expert (Dutch politics)
  - clarity-reviewer
---

# LLM Motion Classification: Prompt Design Lessons

## Problem

The goal was to classify 28,000 Dutch parliamentary motions by "extremity" in order to measure polarization over time.

The initial prompt conflated four distinct concepts in a single score:
- Democratic norm erosion
- Populist rhetoric style
- Group targeting
- Restrictiveness vs permissiveness

## Initial v1 Design (Flawed)

```text
EXTREMITY_SCORE (1-5):
- 1: Mainstream
- 5: "Undermines checks & balances, threatens rule of law, discriminates groups, populist rhetoric"
```

**Problems identified:**
1. Populist rhetoric is style, not substance, so it should not share a score with democratic erosion
2. "Extreme" was undefined: compared to what baseline?
3. The boundary between scores 4 and 5 was unclear
4. TARGETED_GROUP was redundant with EXTREMITY_SCORE
5. Any deviation from EU policy was automatically scored 5, which is too broad
6. Dutch-specific patterns (Nexit, referendum abolition) were missing

## Refined v2 Design (Four Orthogonal Dimensions)

### 1. DEMOCRATIC_EROSION (0-4): substance only

| Score | Label | Criteria |
|-------|-------|----------|
| 0 | None | No impact on democratic norms |
| 1 | Minor | Small procedural deviations |
| 2 | Moderate | Significant policy change, within constitutional framework |
| 3 | Significant | Fundamental change to checks & balances |
| 4 | Critical | Undermines rule of law or press freedom, or introduces systematic discrimination |

**Decision rules:**
- Score 4 ONLY if: (a) direct attack on judiciary/press, OR (b) systematic discrimination in law, OR (c) call to violate international treaties
- Score 3 if: (a) abolish referendum, OR (b) fundamentally question EU cooperation, OR (c) significantly expand executive powers

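The decision rules above can also be checked mechanically against a model's output. The following is a minimal sketch, not the code in `scripts/classify_motions.py`: the `ErosionFlags` fields and the `is_consistent` helper are hypothetical names, and it assumes the six triggers have already been extracted as yes/no findings. It reads "ONLY if" as a necessary condition for score 4 and "if" as a sufficient condition for at least score 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErosionFlags:
    """Hypothetical yes/no findings for one motion (names are illustrative)."""
    attacks_judiciary_or_press: bool = False
    discrimination_in_law: bool = False
    calls_to_violate_treaties: bool = False
    abolishes_referendum: bool = False
    questions_eu_cooperation: bool = False
    expands_executive_powers: bool = False


def is_consistent(score: int, flags: ErosionFlags) -> bool:
    """Check a DEMOCRATIC_EROSION score against the two decision rules."""
    if not 0 <= score <= 4:
        return False
    score4_trigger = (flags.attacks_judiciary_or_press
                      or flags.discrimination_in_law
                      or flags.calls_to_violate_treaties)
    score3_trigger = (flags.abolishes_referendum
                      or flags.questions_eu_cooperation
                      or flags.expands_executive_powers)
    # "Score 4 ONLY if": a 4 without a score-4 trigger is inconsistent.
    if score == 4 and not score4_trigger:
        return False
    # "Score 3 if": a score-3 trigger rules out anything below 3.
    if score3_trigger and score < 3:
        return False
    return True
```

A check like this only flags contradictions between a score and the rules; scores 0-2 still depend on the criteria table.
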
### 2. POPULIST_STYLE (0-1): style only

Scored independently of democratic impact. A motion can be populist in style (POPULIST_STYLE = 1) without eroding democratic norms (DEMOCRATIC_EROSION = 0).

**Indicators:**
- "Het volk" vs "de elite/Den Haag" ("the people" vs "the elite/The Hague")
- "Wij vs zij" ("us vs them") framing
- Call for "direct democracy" without checks
- Emotionally charged language

### 3. GROUP_TARGETING (0-2): targeting only

| Score | Label | Criteria |
|-------|-------|----------|
| 0 | Universal | General policy |
| 1 | Indirect | General policy that disproportionately affects groups |
| 2 | Direct | Explicitly targets a specific population group |

### 4. RESTRICTIVENESS (-1 to +1): direction only

| Score | Label |
|-------|-------|
| -1 | Expansive |
| 0 | Neutral |
| +1 | Restrictive |

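Because each dimension has its own integer range, out-of-range model output is easy to catch at parse time. The sketch below is one possible way to encode the four scales; `MotionScores`, `RANGES`, and `parse_scores` are hypothetical names, and it assumes the model's answer has already been parsed into a dict keyed by the upper-case dimension names.

```python
from dataclasses import dataclass

# Allowed (min, max) per dimension, taken from the four scales above.
RANGES = {
    "democratic_erosion": (0, 4),
    "populist_style": (0, 1),
    "group_targeting": (0, 2),
    "restrictiveness": (-1, 1),
}


@dataclass(frozen=True)
class MotionScores:
    democratic_erosion: int
    populist_style: int
    group_targeting: int
    restrictiveness: int

    def __post_init__(self) -> None:
        for name, (low, high) in RANGES.items():
            value = getattr(self, name)
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                raise TypeError(f"{name} must be an int, got {value!r}")
            if not low <= value <= high:
                raise ValueError(f"{name}={value} outside [{low}, {high}]")


def parse_scores(raw: dict) -> MotionScores:
    """Build MotionScores from a dict keyed by upper-case dimension names."""
    return MotionScores(**{name: raw[name.upper()] for name in RANGES})
```

Keeping the ranges in one table means the validation follows the schema if a scale is later widened or narrowed.
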
## Key Lessons Learned

### 1. Separate Style from Substance

Populist rhetoric ≠ democratic erosion. A mainstream party that uses strong language is not thereby anti-democratic, and scoring the two together produces false positives for erosion.

### 2. Make Dimensions Orthogonal

Each pair of dimensions must be able to vary independently:
- DEMOCRATIC_EROSION × RESTRICTIVENESS: a policy can be erosive and restrictive, or erosive and expansive (permissive)
- POPULIST_STYLE × DEMOCRATIC_EROSION: a motion can be populist (1) without erosion (0), or erosive without populist style
- GROUP_TARGETING × RESTRICTIVENESS: restrictive ≠ targeted (and vice versa)

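One rough way to see whether two dimensions separate in practice is to look for score combinations that never occur in labelled data. The sketch below is an illustrative check, not part of the original pipeline: `empty_cells` is a hypothetical helper and the labels are made up. An empty cell in a large sample may mean the prompt still conflates the two dimensions, or simply that the combination is rare.

```python
from collections import Counter
from itertools import product


def empty_cells(pairs, values_a, values_b):
    """Return the (a, b) combinations that never occur in `pairs`."""
    counts = Counter(pairs)
    return [cell for cell in product(values_a, values_b) if counts[cell] == 0]


# Made-up (POPULIST_STYLE, DEMOCRATIC_EROSION) labels, for illustration only.
labels = [(0, 0), (1, 0), (0, 3), (1, 4), (0, 1), (1, 2)]
missing = empty_cells(labels, values_a=[0, 1], values_b=range(5))
print(missing)  # [(0, 2), (0, 4), (1, 1), (1, 3)]
```
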
### 3. Add Decision Rules for Boundaries

Vague transitions ("significant" → "critical") cause inconsistent scoring. Define specific triggers for each boundary:

```
Score 4 ONLY when: (a) OR (b) OR (c)
Score 3 when: (a) OR (b) OR (c)
```

### 4. Gradate EU Deviation

Not all EU deviation is equal. From least to most erosive:
- Dutch implementation of EU policy → erosion 0-1
- Violate EU rules → erosion 2-3
- Nexit / leave EU → erosion 3-4

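The gradation above can be kept as a small lookup so that a DEMOCRATIC_EROSION score on an EU-related motion can be sanity-checked against its expected range. This is a sketch under assumptions: the category keys and the `within_eu_range` helper are hypothetical, and how a motion gets assigned to a category is left open.

```python
# Expected DEMOCRATIC_EROSION range per kind of EU deviation (ranges from the list above).
# The category keys are illustrative names, not identifiers from the original prompt.
EU_EROSION_RANGE = {
    "implementation": (0, 1),   # Dutch implementation of EU policy
    "rule_violation": (2, 3),   # violate EU rules
    "nexit": (3, 4),            # Nexit / leave EU
}


def within_eu_range(category: str, erosion: int) -> bool:
    """True if `erosion` falls inside the expected range for `category`."""
    low, high = EU_EROSION_RANGE[category]
    return low <= erosion <= high
```
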
### 5. Include Domain-Specific Patterns

Dutch context matters:
- Referendum abolition → DEMOCRATIC_EROSION 3
- Attacks on "Den Haag" or "the establishment" → check for POPULIST_STYLE
- Nexit → DEMOCRATIC_EROSION 3-4, depending on framing

### 6. Define Reference Baselines

Labels such as "extreme" or "abnormal" only mean something relative to a stated baseline. Candidate baselines:
- 2016 consensus
- EU norms
- Historical Dutch practice
- International standards

## Testing Recommendations

1. **Calibration set**: have experts annotate 50 motions before running in production
2. **Boundary cases**: test score 3/4 transitions explicitly
3. **Cross-rater reliability**: run multiple classifiers on the same motions and compare their labels
4. **Domain-specific test cases**: migration, EU, constitutional reform

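Agreement between two raters (two classifiers, or a classifier and the expert calibration set) is often summarised with Cohen's kappa, which corrects raw agreement for chance. The source does not say which statistic was used, so treat this as one reasonable option; `cohens_kappa` is a hypothetical helper implementing the standard unweighted formula.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(rater_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used the same single label throughout
    return (observed - expected) / (1 - expected)
```

Unweighted kappa treats every disagreement as equally bad. For an ordinal scale such as DEMOCRATIC_EROSION, a weighted variant that penalises a 0-vs-4 disagreement more than a 3-vs-4 one may be a better fit.
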
## Files

- `scripts/classify_motions.py`: implementation with v2 prompt
- `docs/research/motion-classification-prompt-v2.md`: full prompt documentation