Sven Geboers
|
d3dfb0ce2f
|
feat(right-wing): hybrid motion classifier using keywords + votes
Implements U2: classify_motions.py loads the keywords from U1 and classifies
a motion as right-wing when all of the following hold:
- right_support >= 60% (CANONICAL_RIGHT parties voting 'voor')
- left_opposition >= 40% (CANONICAL_LEFT parties voting 'tegen')
- at least 1 right-wing keyword match in title/body_text
Outputs DuckDB table with:
- motion_id, year, title, right_support, left_opposition, centrist_support
- right_keyword_matches, left_keyword_matches, classified flag
Classified 2986 of 28331 motions (10.5%) as right-wing.
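The hybrid rule can be sketched roughly as follows. This is a minimal illustration, not the code in classify_motions.py: the helper names (vote_share, count_keyword_matches, is_right_wing), the party labels, the per-party share calculation, and the case-insensitive substring matching are all assumptions; only the 60% / 40% / 1-keyword thresholds come from the message above.

```python
# Thresholds taken from the commit message; everything else is illustrative.
RIGHT_SUPPORT_MIN = 0.60
LEFT_OPPOSITION_MIN = 0.40
MIN_KEYWORD_MATCHES = 1


def vote_share(votes, group, stance):
    """Fraction of parties in `group` with a recorded vote equal to `stance`.

    Assumption: shares are computed per party, ignoring parties with no vote.
    """
    cast = [votes[party] for party in group if party in votes]
    if not cast:
        return 0.0
    return sum(1 for vote in cast if vote == stance) / len(cast)


def count_keyword_matches(text, keywords):
    """Count keywords occurring in `text` (case-insensitive substring match)."""
    lowered = text.lower()
    return sum(1 for keyword in keywords if keyword.lower() in lowered)


def is_right_wing(right_support, left_opposition, title, body_text, keywords):
    """All three conditions must hold for a motion to be flagged."""
    matches = count_keyword_matches(f"{title} {body_text}", keywords)
    return (
        right_support >= RIGHT_SUPPORT_MIN
        and left_opposition >= LEFT_OPPOSITION_MIN
        and matches >= MIN_KEYWORD_MATCHES
    )


# Hypothetical party labels standing in for CANONICAL_RIGHT / CANONICAL_LEFT.
votes = {"R1": "voor", "R2": "voor", "R3": "tegen", "L1": "tegen", "L2": "voor"}
right_support = vote_share(votes, {"R1", "R2", "R3"}, "voor")
left_opposition = vote_share(votes, {"L1", "L2"}, "tegen")
flagged = is_right_wing(
    right_support, left_opposition, "Motie over een asielstop", "", ["asielstop"]
)
```

Requiring a keyword match on top of the vote pattern keeps motions that split along party lines for procedural reasons from being flagged on votes alone.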
|
2 months ago |
Sven Geboers
|
c6f8540671
|
feat(right-wing): derive right-wing keywords via differential TF-IDF
Implements U1: derive_keywords.py uses party voting patterns to classify
motions as right-wing vs left-wing, then computes differential TF-IDF on
cleaned motion titles to surface policy terms distinctive to right-wing
motions.
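The message does not spell out how "differential TF-IDF" is computed. One plausible reading, sketched here under that assumption, is to score each term by its mean TF-IDF weight across right-wing titles minus its mean weight across left-wing titles. The function names, the tokenizer, and the smoothed IDF formula are illustrative choices, not taken from derive_keywords.py.

```python
import math
import re
from collections import Counter


def tokenize(title, stopwords=frozenset()):
    """Lowercase, keep runs of letters only, and drop stopwords."""
    tokens = re.findall(r"[^\W\d_]+", title.lower())
    return [token for token in tokens if token not in stopwords]


def differential_tfidf(right_titles, left_titles, stopwords=frozenset()):
    """Rank terms by mean TF-IDF in right-wing titles minus left-wing titles."""
    right_docs = [tokenize(title, stopwords) for title in right_titles]
    left_docs = [tokenize(title, stopwords) for title in left_titles]
    all_docs = right_docs + left_docs

    # Document frequency over both groups, with a smoothed IDF (assumption).
    doc_freq = Counter(term for doc in all_docs for term in set(doc))
    n_docs = len(all_docs)
    idf = {
        term: math.log((1 + n_docs) / (1 + freq)) + 1
        for term, freq in doc_freq.items()
    }

    def mean_tfidf(docs):
        totals = Counter()
        for doc in docs:
            if not doc:
                continue
            for term, count in Counter(doc).items():
                totals[term] += (count / len(doc)) * idf[term]
        return {term: total / max(len(docs), 1) for term, total in totals.items()}

    right_scores = mean_tfidf(right_docs)
    left_scores = mean_tfidf(left_docs)
    diff = {
        term: right_scores.get(term, 0.0) - left_scores.get(term, 0.0)
        for term in idf
    }
    return sorted(diff.items(), key=lambda item: item[1], reverse=True)


ranked = differential_tfidf(
    ["asielstop invoeren", "asielstop nu"],
    ["klimaat beleid", "klimaat nu"],
)
```

Terms that appear equally often in both groups score near zero, so the top of the ranking holds terms distinctive to the right-wing group and the bottom holds terms distinctive to the left-wing group.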
Key design choices:
- Vote threshold: at least 60% of the parties in a group must vote 'voor'
- Text cleaning strips motion prefixes aggressively (handles multi-word
surnames, plural 'leden', t.v.v. parentheticals)
- Expanded Dutch stopword list filters procedural and generic noise
- Results written to analysis/right_wing/right_wing_keywords.json
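As a rough illustration of the prefix stripping, the sketch below assumes titles follow the common "Motie van het lid X over ..." pattern. The regular expressions, the clean_title name, and the example surnames and document number are hypothetical; the actual cleaning rules in derive_keywords.py may differ.

```python
import re

# Assumed title shape: "[(Nader) gewijzigde] motie van het lid|de leden <names> over <topic>".
# The lazy ".+?" lets multi-word surnames ("Van den Berg") run up to the first " over ".
PREFIX_RE = re.compile(
    r"^(?:(?:nader\s+)?gewijzigde\s+)?motie\s+van\s+"
    r"(?:het\s+lid|de\s+leden)\s+.+?\s+over\s+",
    re.IGNORECASE,
)

# Parentheticals such as "(t.v.v. 12345-67)" that refer to a replaced document.
TVV_RE = re.compile(r"\s*\(t\.v\.v\.[^)]*\)", re.IGNORECASE)


def clean_title(title):
    """Strip the t.v.v. parenthetical and the leading motion prefix."""
    without_tvv = TVV_RE.sub("", title)
    return PREFIX_RE.sub("", without_tvv, count=1).strip()


examples = [
    clean_title("Motie van het lid Van den Berg over het stikstofbeleid"),
    clean_title(
        "Gewijzigde motie van de leden Jansen en Van den Berg "
        "over een asielstop (t.v.v. 12345-67)"
    ),
]
```

Titles that do not match the assumed prefix pass through unchanged apart from whitespace trimming.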
Produces ~50 filtered terms including: asielzoekers, defensie, kernenergie,
boeren, vreemdelingenbeleid, stikstof, asielstop, strafrecht.
|
2 months ago |