# Motion Extremity Classification with LLMs

## Implementation Status

**Script**: `scripts/classify_motions.py` - Ready to run

**Requirements**:
- Valid OpenRouter API key in `.env` (current key returns "User not found")
- ~28,000 motions to classify

**Usage**:
```bash
# Classify all motions (will take hours)
.venv/bin/python scripts/classify_motions.py --delay 0.5

# Test with small sample first
.venv/bin/python scripts/classify_motions.py --limit 10 --delay 2

# Analyze existing classifications
.venv/bin/python scripts/classify_motions.py --analyze-only
```

## Why LLMs?

Rule-based keyword matching is too crude:
- Flags only 3-4% of motions as "high extremity"
- Can't handle nuance ("verbod" also appears in mundane contexts)
- Can't assess the magnitude of a policy's impact

LLMs can:
- Understand policy context and implications
- Assess deviation from consensus/norms
- Interpret Dutch political terminology

## Proposed LLM Classification Schema

### Output Format

```json
{
  "extremity_score": 1-5,
  "policy_domain": "migration|identity|economy|social|climate|foreign_policy|justice|education|health|other",
  "policy_direction": "restrictive|permissive|neutral",
  "deviation_type": "procedural|semantic|structural",
  "consensus_level": "broad|partial|narrow|opposition",
  "rationale": "1-2 sentence explanation"
}
```

### Extremity Scale (1-5)

| Score | Label | Description | Examples |
|-------|-------|-------------|----------|
| 1 | Mainstream | Standard governance, routine | Budget adjustments, procedural changes |
| 2 | Minor deviation | Small policy tweaks within consensus | Minor fee changes, small program adjustments |
| 3 | Moderate deviation | Meaningful but within coalition consensus | Immigration processing changes, targeted regulations |
| 4 | Major deviation | Challenges status quo meaningfully | Tighter migration rules, significant policy reversals |
| 5 | Extreme | Fundamental/populist, outside consensus | Complete bans, anti-democratic motions |

### Policy Direction

- **restrictive**: Limits freedoms, tightens rules, reduces access
- **permissive**: Expands freedoms, loosens rules, increases access
- **neutral**: Procedural, administrative, technical

### Consensus Level

- **broad**: Passed with 80% or more of parties voting the same way
- **partial**: Passed with 60% to under 80% agreement
- **narrow**: Passed with 50% to under 60% agreement (close vote)
- **opposition**: Coalition parties voted against

## LLM Prompt

```
SYSTEM:
You are an expert on Dutch parliamentary politics. Classify parliamentary motions on policy extremity using the provided schema.

CLASSIFICATION_RUBRIC:
- Score 1 (Mainstream): Routine governance, budget adjustments, procedural changes
- Score 2 (Minor): Small policy tweaks within consensus
- Score 3 (Moderate): Meaningful changes but within coalition consensus
- Score 4 (Major): Challenges status quo, significant policy shifts
- Score 5 (Extreme): Fundamental changes, populist, outside consensus

Consider:
- Policy impact magnitude
- Deviation from current norms/policies
- Coalition/opposition dynamics
- Dutch political context

USER:
Classify this motion:

Title: {title}
Description: {description}
Voting result: {passed/rejected}, {party_coalition} parties voted for

Respond in JSON format.
```

## Batch Processing Strategy

```python
import asyncio
import json

from openai import AsyncOpenAI

BATCH_SIZE = 50  # number of concurrent requests per batch


async def classify_motion_batch(motions: list[dict], model: str = "gpt-4o") -> list[dict]:
    """Process motions in parallel batches."""
    # AsyncOpenAI() reads OPENAI_API_KEY from the environment by default;
    # pass api_key/base_url explicitly when routing through another provider
    # such as OpenRouter.
    client = AsyncOpenAI()

    async def classify_one(motion: dict) -> dict:
        # build_prompt and SYSTEM_PROMPT are assumed to be defined elsewhere.
        prompt = build_prompt(motion)
        response = await client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": prompt},
            ],
            response_format={"type": "json_object"},
        )
        result = json.loads(response.choices[0].message.content)
        result["motion_id"] = motion["id"]
        return result

    # Process BATCH_SIZE motions at a time; each batch finishes before the next starts.
    results = []
    for i in range(0, len(motions), BATCH_SIZE):
        batch = motions[i:i + BATCH_SIZE]
        batch_results = await asyncio.gather(*[classify_one(m) for m in batch])
        results.extend(batch_results)

    return results


async def main():
    motions = load_motions()  # Load from database (helper assumed to exist)
    classifications = await classify_motion_batch(motions)
    save_to_database(classifications)  # helper assumed to exist


if __name__ == "__main__":
    asyncio.run(main())
```

## Cost Estimate

| Dataset Size | Model | Est. Cost | Est. Time |
|--------------|-------|-----------|-----------|
| 35,000 motions | gpt-4o-mini | ~$5-10 | 30-60 min |
| 35,000 motions | gpt-4o | ~$50-100 | 2-4 hours |

`gpt-4o-mini` should be sufficient for this classification task.

## Analysis After Classification

Once the motions are classified, we can analyze:

```python
# df is assumed to be a pandas DataFrame of classified motions with the columns
# period, passed, motion_id, policy_domain, policy_direction, extremity_score.

# Extremity by period
df.groupby(['period', 'extremity_score']).size().unstack(fill_value=0)

# Domain-Extremity heatmap
pivot = df.pivot_table(values='motion_id', index='policy_domain',
                       columns='extremity_score', aggfunc='count')

# Passed vs rejected extremity
df.groupby('passed')['extremity_score'].mean()

# Coalition shift analysis
df[df['policy_domain'] == 'migration'].groupby(['period', 'policy_direction']).size()
```

## Expected Insights

1. **Extremity distribution over time** - Has the share of motions scoring 4-5 increased?
2. **Domain-extremity correlation** - Which domains produce extreme policies?
3. **Direction-extremity** - Restrictive vs permissive extremity by period
4. **Consensus-extremity** - Are extreme policies passing with broad or narrow consensus?
5. **Coalition voting** - Which parties support extreme policies?
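
Because the model returns free-form JSON, each response is worth checking against the output schema before it is saved. The following is a minimal sketch of such a check, not the script's actual validation: the function name `validate_classification` and the constant `ALLOWED_VALUES` are hypothetical, and the allowed values are copied from the Output Format section.

```python
# Hypothetical helper: field names and allowed values follow the Output Format
# schema; the function and constant names are illustrative assumptions.
ALLOWED_VALUES = {
    "policy_domain": {
        "migration", "identity", "economy", "social", "climate",
        "foreign_policy", "justice", "education", "health", "other",
    },
    "policy_direction": {"restrictive", "permissive", "neutral"},
    "deviation_type": {"procedural", "semantic", "structural"},
    "consensus_level": {"broad", "partial", "narrow", "opposition"},
}


def validate_classification(result: dict) -> list[str]:
    """Return a list of problems; an empty list means the result fits the schema."""
    problems = []

    # bool is a subclass of int in Python, so it is excluded explicitly.
    score = result.get("extremity_score")
    if not isinstance(score, int) or isinstance(score, bool) or not 1 <= score <= 5:
        problems.append(f"extremity_score must be an integer 1-5, got {score!r}")

    # Each categorical field must be one of the schema's listed strings.
    for field, allowed in ALLOWED_VALUES.items():
        value = result.get(field)
        if not isinstance(value, str) or value not in allowed:
            problems.append(f"{field} has unexpected value {value!r}")

    rationale = result.get("rationale")
    if not isinstance(rationale, str) or not rationale.strip():
        problems.append("rationale must be a non-empty string")

    return problems
```

A response with a non-empty problem list could be logged and retried rather than written to the database, so that one malformed reply does not skew the later analysis.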
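
The consensus levels in the schema are defined by vote shares, so they could also be derived directly from voting data and compared with the model's answer. The sketch below is one possible reading of those definitions under stated assumptions: `consensus_level`, `share_for`, and `coalition_opposed` are hypothetical names, "opposition" is assumed to take priority over the percentage bands, a share of exactly 80% counts as broad and exactly 60% as partial, and shares below 50% return `None` because the definitions only cover motions that passed.

```python
from typing import Optional


def consensus_level(share_for: float, coalition_opposed: bool = False) -> Optional[str]:
    """Map the fraction of parties voting in favor (0.0-1.0) to a consensus level.

    Assumption: coalition opposition overrides the percentage bands.
    """
    if not 0.0 <= share_for <= 1.0:
        raise ValueError("share_for must be between 0 and 1")
    if coalition_opposed:
        return "opposition"
    if share_for >= 0.8:
        return "broad"
    if share_for >= 0.6:
        return "partial"
    if share_for >= 0.5:
        return "narrow"
    # Below 50%: the schema does not define a level for this case.
    return None
```

For example, `consensus_level(0.85)` gives `"broad"` and `consensus_level(0.55)` gives `"narrow"`. Disagreement between this derived value and the LLM's `consensus_level` field would be a cheap signal that a classification deserves a second look.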