module: llm-classification
tags: [polarization, nlp, prompt-design, democratic-norms]
problem_type: classification-schema-design
date: 2026-04-05
reviewed_by: [correctness-reviewer, domain-expert (Dutch politics), clarity-reviewer]

LLM Motion Classification: Prompt Design Lessons

Problem

The goal was to classify 28,000 Dutch parliamentary motions by "extremity" in order to measure polarization over time.

The initial prompt conflated four distinct concepts in a single score:

  • Democratic norm erosion
  • Populist rhetoric style
  • Group targeting
  • Restrictiveness vs permissiveness

Initial v1 Design (Flawed)

EXTREMITY_SCORE (1-5):
  - 1: Mainstream
  - 5: "Undermines checks & balances, threatens rule of law, 
        discriminates groups, populist rhetoric"

Problems identified:

  1. Populist rhetoric is style, not substance — shouldn't be in same score as democratic erosion
  2. "Extreme" undefined — compared to what baseline?
  3. Score 4/5 boundary unclear
  4. TARGETED_GROUP redundant with EXTREMITY_SCORE
  5. EU deviation always = score 5 (too broad)
  6. Missing Dutch-specific patterns (Nexit, referendum abolition)

Refined v2 Design (Four Orthogonal Dimensions)

1. DEMOCRATIC_EROSION (0-4) — Substance only

| Score | Label | Criteria |
|---|---|---|
| 0 | None | No impact on democratic norms |
| 1 | Minor | Small procedural deviations |
| 2 | Moderate | Significant policy change, within constitutional framework |
| 3 | Significant | Fundamental change to checks & balances |
| 4 | Critical | Undermines rule of law, press freedom, systematic discrimination |

Decision rules:

  • Score 4 ONLY if: (a) direct attack on judiciary/press, OR (b) systematic discrimination in law, OR (c) call to violate international treaties
  • Score 3 if: (a) abolish referendum, OR (b) fundamentally question EU cooperation, OR (c) significantly expand executive powers
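The decision rules above can be written down as a small consistency check. This is a minimal sketch, not the logic in scripts/classify_motions.py: the `ErosionFlags` fields, `rule_based_score`, and `check_score` are hypothetical names introduced here, and it assumes each rule clause can be captured as a boolean flag.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ErosionFlags:
    """Hypothetical boolean triggers, one per decision-rule clause."""
    attacks_judiciary_or_press: bool = False        # score 4, clause (a)
    systematic_discrimination_in_law: bool = False  # score 4, clause (b)
    calls_to_violate_treaties: bool = False         # score 4, clause (c)
    abolishes_referendum: bool = False              # score 3, clause (a)
    questions_eu_cooperation: bool = False          # score 3, clause (b)
    expands_executive_powers: bool = False          # score 3, clause (c)


def rule_based_score(flags: ErosionFlags) -> Optional[int]:
    """Return 4 or 3 when a trigger fires, else None (no explicit rule for 0-2)."""
    if (flags.attacks_judiciary_or_press
            or flags.systematic_discrimination_in_law
            or flags.calls_to_violate_treaties):
        return 4
    if (flags.abolishes_referendum
            or flags.questions_eu_cooperation
            or flags.expands_executive_powers):
        return 3
    return None


def check_score(score: int, flags: ErosionFlags) -> bool:
    """Check a DEMOCRATIC_EROSION score against the decision rules."""
    expected = rule_based_score(flags)
    if expected is None:
        # "Score 4 ONLY if" forbids a 4 without a trigger; 0-3 stay possible.
        return 0 <= score <= 3
    return score == expected
```

A check like this could flag model outputs that assign a 4 without any of the three listed triggers, which is the boundary the rules are meant to pin down.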

2. POPULIST_STYLE (0-1) — Style only

Independent of democratic impact. A motion can be populist in style (POPULIST_STYLE = 1) while having no effect on democratic norms (DEMOCRATIC_EROSION = 0).

Indicators:

  • "Het volk" vs "de elite/den Haag"
  • "Wij vs zij" framing
  • Call for "direct democracy" without checks
  • Emotionally charged language

3. GROUP_TARGETING (0-2) — Targeting only

| Score | Label | Description |
|---|---|---|
| 0 | Universal | General policy |
| 1 | Indirect | General policy that disproportionately affects groups |
| 2 | Direct | Explicitly targets specific population group |

4. RESTRICTIVENESS (-1 to +1) — Direction only

| Score | Label |
|---|---|
| -1 | Expansive |
| 0 | Neutral |
| +1 | Restrictive |
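The four dimensions and their ranges can be captured in one validated record. The sketch below is an assumption about how a result might be represented, not the structure used in scripts/classify_motions.py; `MotionScores` and `RANGES` are hypothetical names.

```python
from dataclasses import dataclass

# Allowed (min, max) per dimension, taken from the v2 design.
RANGES = {
    "democratic_erosion": (0, 4),
    "populist_style": (0, 1),
    "group_targeting": (0, 2),
    "restrictiveness": (-1, 1),
}


@dataclass(frozen=True)
class MotionScores:
    """One classification result; the name and fields are illustrative."""
    democratic_erosion: int
    populist_style: int
    group_targeting: int
    restrictiveness: int

    def __post_init__(self) -> None:
        # Reject non-integer or out-of-range values as soon as the record is built.
        for name, (lo, hi) in RANGES.items():
            value = getattr(self, name)
            if isinstance(value, bool) or not isinstance(value, int):
                raise ValueError(f"{name}={value!r} is not an integer")
            if not lo <= value <= hi:
                raise ValueError(f"{name}={value!r} outside [{lo}, {hi}]")
```

Validating at construction time means a malformed model response fails loudly instead of silently entering the polarization time series.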

Key Lessons Learned

1. Separate Style from Substance

Populist rhetoric ≠ democratic erosion. A mainstream party using strong language isn't anti-democratic. Conflating them causes false positives.

2. Make Dimensions Orthogonal

  • DEMOCRATIC_EROSION × RESTRICTIVENESS: A policy can be erosive AND restrictive (+1), or erosive AND expansive (-1)
  • POPULIST_STYLE × DEMOCRATIC_EROSION: A motion can be populist (POPULIST_STYLE = 1) with no erosion (DEMOCRATIC_EROSION = 0), and vice versa
  • GROUP_TARGETING × RESTRICTIVENESS: Restrictive ≠ targeted (and vice versa)
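One rough way to sanity-check that two dimensions behave independently in practice is to cross-tabulate them on a labeled sample and list the combinations that never occur. This is only a sketch: `empty_cells` is a hypothetical helper, the rows in the usage note are made-up placeholders rather than real results, and an empty cell on a small sample does not by itself show the dimensions are confounded.

```python
from collections import Counter
from itertools import product


def empty_cells(rows, dim_a, values_a, dim_b, values_b):
    """Return the (a, b) value combinations that never occur in rows.

    rows is a list of dicts mapping dimension names to integer scores.
    """
    counts = Counter((row[dim_a], row[dim_b]) for row in rows)
    return [cell for cell in product(values_a, values_b) if counts[cell] == 0]
```

For example, with the placeholder sample `[{"populist_style": 1, "democratic_erosion": 0}, {"populist_style": 0, "democratic_erosion": 0}]`, calling `empty_cells(sample, "populist_style", [0, 1], "democratic_erosion", [0, 1])` reports that no motion with erosion 1 was seen at either style value, which is a prompt to look for such cases in the calibration set.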

3. Add Decision Rules for Boundaries

Vague transitions ("significant" → "critical") cause inconsistency. Define specific triggers:

Score 4 ONLY when: (a) OR (b) OR (c)
Score 3 when: (a) OR (b) OR (c)

4. Gradate EU Deviation

Not all EU deviation is equal:

  • Dutch implementation of EU policy → erosion 0-1
  • Nexit / leave EU → erosion 3-4
  • Violate EU rules → erosion 2-3
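The gradation above can be expressed as a lookup of allowed erosion ranges per kind of EU deviation. This is a sketch under the assumption that each motion has already been assigned one of three categories; the category keys and the function name are hypothetical.

```python
# Allowed DEMOCRATIC_EROSION range per kind of EU deviation (keys are illustrative).
EU_EROSION_RANGES = {
    "national_implementation": (0, 1),  # Dutch implementation of EU policy
    "rule_violation": (2, 3),           # violate EU rules
    "exit": (3, 4),                     # Nexit / leave EU
}


def erosion_in_range(category: str, score: int) -> bool:
    """Check that an erosion score falls inside the range for its EU category."""
    lo, hi = EU_EROSION_RANGES[category]
    return lo <= score <= hi
```

A check like this would catch the v1 failure mode, where every EU-related motion landed at the top of the scale regardless of what it actually proposed.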

5. Include Domain-Specific Patterns

Dutch context matters:

  • Referendum abolition = score 3
  • "Den Haag" / "establishment" attacks = check for populist style
  • Nexit = score 3-4 depending on framing

6. Define Reference Baselines

"Abnormal" compared to what?

  • 2016 consensus
  • EU norms
  • Historical Dutch practice
  • International standards

Testing Recommendations

  1. Calibration set: 50 motions with expert annotations before production
  2. Boundary cases: Test score 3/4 transitions explicitly
  3. Cross-rater reliability: Multiple classifiers on same motions
  4. Domain-specific test cases: Migration, EU, constitutional reform
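For the cross-rater reliability step, one common agreement statistic is Cohen's kappa, which corrects raw agreement between two raters for agreement expected by chance. The choice of kappa is an assumption here, not something the original workflow specifies, and `cohens_kappa` is a hypothetical helper. It treats scores as unordered categories, so for the ordinal 0-4 erosion scale a weighted variant may be more appropriate.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labeling the same items (nominal labels)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items where both raters gave the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)
```

The two label lists could come from an expert annotator and the LLM on the calibration set, or from two prompt variants run on the same motions.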

Files

  • scripts/classify_motions.py — Implementation with v2 prompt
  • docs/research/motion-classification-prompt-v2.md — Full prompt documentation