motief

Author	SHA1	Message	Date
Sven Geboers	76b499cdc0	feat(analysis): Overton window breakpoint analysis with opposition control and SVD drift Quantify 2024 breakpoint in centrist support (d=+0.68 overall, d=+0.85 opposition-only), domain decomposition, extremity-stratified pass rates, and manual LLM audit (75% agreement). SVD center drift aborted due to axis instability (9/10 consecutive window pairs fail stability threshold).	1 month ago
Sven Geboers	d170444bda	feat(analysis): add migration anti-democratic overlap analysis	1 month ago
Sven Geboers	fbf92c82cf	feat(right-wing): dual-scoring extremity/sentiment + derived categories Extremity Scorer (U4 enhanced): - Now scores BOTH original motion text AND layman explanation separately - Schema: text_score, text_explanation, layman_score, layman_explanation - Text scores: 1→7, 2→33, 3→5, 4→5 (mild-to-moderate) - Layman scores: 1→12, 2→20, 3→17, 4→1 (slightly milder) Sentiment Analysis (U5 enhanced): - Now scores BOTH original motion text AND layman explanation separately - Schema: text_score, text_explanation, layman_score, layman_explanation - Text sentiment avg: 0.294 (slightly positive) - Layman sentiment avg: 0.416 (more positive - summaries tone down hostility) Category Derivation (new): - Two-phase LLM approach: derive taxonomy from sample, then apply to all - Discovered 7 categories from 30-motion sample: veiligheid/justitie, corona/pandemie, economie/belasting, klimaat/milieu, defensie/buitenland, asiel/vreemdelingen, overig - Applied to 50 motions with distribution shown in DB - Adds category + category_explanation columns to right_wing_motions	2 months ago
Sven Geboers	f94edc3d04	feat(right-wing): sentiment analysis pipeline for right-wing motions Implements U5: sentiment_analysis.py uses LLM batch calls (fallback when no local Dutch sentiment model is available) to score motion sentiment on [-1, 1] scale. Design: - Prompt asks for sentiment from -1 (hostile/aggressive) to 1 (constructive) - JSON schema enforces numeric score + Dutch explanation - Batch size 10, max_workers 5 for parallel API calls - Stores results in table - Updates with avg_sentiment, sentiment_std, pct_strongly_negative per year Sample validation (50 motions): good variance across [-0.9, 1.0] range.	2 months ago
Sven Geboers	d2310edfc4	feat(right-wing): LLM-based policy extremity scoring Implements U4: extremity_scorer.py uses ai_provider.chat_completion_json_parallel with a JSON schema enforcing integer 1-5 + Dutch explanation. Design: - Batch size 10, max_workers 5 for parallel API calls - Prompt asks for concrete policy + radicalism score in Dutch - Stores results in table (motion_id, score, explanation, error) - Updates with yearly averages - Default sample=50 for validation; --sample -1 scores all motions Sample validation (50 motions): scores distributed 1→2, 2→34, 3→7, 4→7, yearly averages ~2.0-2.5 (mild-to-moderate radicalism).	2 months ago
Sven Geboers	1bc83c4384	feat(right-wing): temporal aggregation of right-wing motion trends Implements U3: temporal_analysis.py computes yearly_summary from the right_wing_motions table (U2 output). Metrics per year: - total_right_wing, pct_of_total, total_motions - avg_right_support, avg_left_opposition, centrist_support - avg_right_keyword_matches, extremity_index (U4 placeholder) - yoy_right_wing_delta, yoy_pct_delta Key finding: right-wing motions grew from ~4% (2018) to ~12% (2024-2025) of all motions, with rising centrist support over time.	2 months ago
Sven Geboers	d3dfb0ce2f	feat(right-wing): hybrid motion classifier using keywords + votes Implements U2: classify_motions.py loads keywords from U1 and classifies motions as right-wing when: - right_support >= 60% (CANONICAL_RIGHT parties voting 'voor') - left_opposition >= 40% (CANONICAL_LEFT parties voting 'tegen') - AND at least 1 right-wing keyword match in title/body_text Outputs DuckDB table with: - motion_id, year, title, right_support, left_opposition, centrist_support - right_keyword_matches, left_keyword_matches, classified flag Classified 2986 of 28331 motions (10.5%) as right-wing.	2 months ago
Sven Geboers	c6f8540671	feat(right-wing): derive right-wing keywords via differential TF-IDF Implements U1: derive_keywords.py uses party voting patterns to classify motions as right-wing vs left-wing, then computes differential TF-IDF on cleaned motion titles to surface policy terms distinctive to right-wing motions. Key design choices: - Vote threshold: 60% of parties in group must vote 'voor' - Text cleaning strips motion prefixes aggressively (handles multi-word surnames, plural 'leden', t.v.v. parentheticals) - Expanded Dutch stopword list filters procedural and generic noise - Results written to analysis/right_wing/right_wing_keywords.json Produces ~50 filtered terms including: asielzoekers, defensie, kernenergie, boeren, vreemdelingenbeleid, stikstof, asielstop, strafrecht.	2 months ago
Sven Geboers	3a46485067	added ansible again	2 months ago
Sven Geboers	272d839a42	feat: agent-native refactor, SVD consistency fixes, UX cleanup, mobile support - Refactor agent_tools to atomic primitives (24 tools, delete workflows) - Fix SVD component score inconsistency between single-window and trajectory views (same PCA basis, same flip handling, same active-MP filter for current_parliament) - Fix Dutch spelling: Huidig parliament -> Huidig parlement - Remove all decorative emojis from UI (app.py, explorer.py, analysis tabs) - Add dark theme matching sgeboers.nl (mint accent on dark background) - Remove browser tab favicon and Streamlit chrome (deploy button, running status) - Remove trajectories debug UI and EMA settings (hardcoded smooth_alpha=0.35) - Switch layout to centered for mobile readability - Add responsive CSS for mobile (touch targets, font sizing, overflow prevention) - Update AGENTS.md and SYSTEM_PROMPT.md with active tool instructions - Add compound docs for SVD consistency bug - Update tests: 214 passed, 3 skipped	2 months ago
Sven Geboers	efb3a8fbd2	fix: agent-native audit — parameterize thresholds, add CRUD tests, tool discovery Audit fixes for agent-native architecture gaps: - agent_tools/content.py: parameterize healthy_threshold in check_embedding_quality - agent_tools/__init__.py: add __all__ exports and list_tools() runtime discovery - agent_tools/database.py: add CRUD primitives (create_motion, update_motion, delete_report) plus query_embeddings, query_similar_motions, query_compass_positions - tests/agent_tools/test_database_tools.py: add CRUD tool tests - tests/agent_tools/test_content_tools.py: add parameterized threshold test - tests/agent_tools/test_package.py: test list_tools() and package imports Tests: 245 passed, 3 skipped	2 months ago
Sven Geboers	8af27bbf04	feat: implement agent-native architecture (U1-U6) Implements the agent-native architecture plan (docs/plans/2026-05-01-002-agent-native-architecture-plan.md): - U1: Database query primitives (agent_tools/database.py) - query_motions, query_votes, query_svd_vectors, query_party_positions, query_pipeline_status - U2: Pipeline control primitives (agent_tools/pipeline.py) - pipeline_run_stage, pipeline_run_full, pipeline_check_health, pipeline_get_logs, pipeline_validate_output - U3: Analysis & report generation (agent_tools/analysis.py, reports.py) - analyze_party_shift, analyze_axis_stability, validate_svd_labels, generate_report - U4: Content validation primitives (agent_tools/content.py) - validate_motion_coverage, validate_layman_explanations, suggest_svd_label, check_embedding_quality - U5: System prompt & context injection (SYSTEM_PROMPT.md, context.py, context.md) - U6: Parity verification tests (tests/agent_tools/test_parity.py) Tests: 238 passed, 2 skipped AGENTS.md updated to surface agent_tools/	2 months ago
Sven Geboers	98358344a0	docs: add STRATEGY.md with product strategy Captures target problem, approach, primary persona, key metrics, tracks of work, and explicit non-goals for the Stemwijzer project.	2 months ago
Sven Geboers	a634ceba2d	cleanup: remove Docker, Ansible, and deployment infrastructure Removes unused deployment and packaging infrastructure: - Dockerfile and docker-compose.yml (Docker deployment not used) - ansible/ directory (playbooks, inventory, config) - packages/@ansible/example/ (npm package for Ansible example) - docs/deployment/ansible-package-deploy.md - docs/plans/2026-04-24-002-fix-docker-compose-scheduler-plan.md (obsolete) - .github/workflows/publish-ansible-example.yml - .github/workflows/ci-node-packages.yml (only tested packages/) Updates: - README.md: remove Deployment section - docs/plans/2026-04-24-ROADMAP-stemwijzer-improvements.md: mark P1-002 as removed, update sprint 1	2 months ago
Sven Geboers	1f053f7d91	refactor: simplify Explorer to 3 focused tabs — compass, trajectories, SVD Removes embedding-heavy search and generic browser tabs from the Explorer. The project does not currently use embeddings meaningfully, so the similarity search and browser features were dead weight. Changes: - explorer.py: Remove search and browser tabs, keep compass/trajectories/SVD - explorer.py: Remove fused_embeddings and similarity_cache stats from sidebar - Home.py: Update Explorer description to match new focused layout - analysis/tabs/__init__.py: Remove search and browser exports - tests: Update decomposition and import tests for new tab set Result: Explorer now has 3 focused analytical tabs instead of 5.	2 months ago
Sven Geboers	2c60f41f29	cleanup: archive stale scripts and delete orphaned generate_extra_charts Archives 8 one-off/backfill/research scripts to scripts/archive/: - compare_svd_exclude_parties.py (diagnostic) - compute_test_batch.py (test utility) - fill_mp_votes_parties.py (backfill) - generate_compass.py (generates to deleted outputs/) - inspect_axis.py (diagnostic) - qa_similarity.py (QA script, references deleted thoughts/ledgers/) - recompute_svd.py (one-off recompute) - semantic_gravity_examples.py (research) Deletes: - generate_extra_charts.py (0 references, generates to deleted outputs/) - tests/test_qa_similarity.py (test for archived script) Adds: - scripts/archive/README.md explaining archive purpose - docs/plans/2026-05-01-001-scripts-audit-cleanup-plan.md	2 months ago
Sven Geboers	07dd393533	cleanup: remove stale .mindmodel, old venvs, orphaned code, and transient artifacts Removes: - .mindmodel/ directory and related CI workflows (mindmodel-schedule.yml, mindmodel-validation.yml) - scripts/mindmodel/ and scripts/validate_mindmodel.py - src/types/ and src/validators/ (orphaned type modules, only used by mindmodel) - tests/ci/, tests/scripts/mindmodel/, tests/types/, tests/validators/ (mindmodel-only tests) - thoughts/ledgers/ and thoughts/shared/ (stale transient directories) - .venv_axis and .venv_plotly (orphaned virtual environments, ~1.1 GB) - outputs/blog-charts/ (stale generated HTML files) - data/.json sidecars (empty cache artifacts) - __pycache__ and .pyc files across repo Updates: - .gitignore: remove thoughts/shared/analyses/ entry Space reclaimed: ~1.1 GB+	2 months ago
Sven Geboers	6e36fa2604	feat: persist and load explained variance for scree plots - compute_svd_for_window now computes explained variance ratio (s²/sum(s²)) and appends it as a metadata row (entity_type='metadata', entity_id='explained_variance') to motion_rows - load_scree_data reads this metadata row from svd_vectors instead of querying the non-existent sv_metadata column - run_svd_for_window counts only entity_type='motion' rows in stored_motion so metadata rows don't inflate the count - Added 5 TDD tests covering load, compute, store, and round-trip All 227 tests pass.	2 months ago
Sven Geboers	121c32ae8a	fix: make scree and party-axis functions resilient to missing schema artifacts - load_scree_data: return [] with TODO until schema stores EVR metadata - load_party_axis_scores: compute from vectors instead of missing table - load_party_axis_scores_for_window: same vector-based fallback - load_party_scores_all_windows[_aligned]: check table existence, fall back to computing from load_positions when absent All functions predated decomposition (`5afbad1`, 2026-04-05) and relied on party_axis_scores / sv_metadata columns that were never created.	2 months ago
Sven Geboers	09bb99658f	docs: compound code review findings - Add verify-lint-rule-scope-before-relying-on-it: guidance on confirming lint rule coverage before trusting it for enforcement. Documents the P2-002 incident where ruff BLE only catches bare not . - Update working-tree-hygiene: add dev-tool-in-venv check and ruff dependency example.	2 months ago
Sven Geboers	a566221753	fix: remove duplicate import and add ruff to dev deps	2 months ago
Sven Geboers	3bdb43f162	refactor: decompose explorer.py into analysis/tabs/ and add scheduler - Extract 6 tab functions from explorer.py (3097 → 543 lines) - Create analysis/tabs/_rendering.py with shared plotly helpers - Move data logic to analysis/explorer_data.py - Add lazy-import wrappers in explorer.py for backward compat - Add scheduler.py with PipelineScheduler for daily pipeline runs - Add test_explorer_decomposition.py (5 tests, all pass) - Add test_scheduler.py (13 tests, all pass) - Full test suite: 222 passed, 2 skipped	2 months ago
Sven Geboers	203ae178ca	chore: add compound-engineering config example Commit the example config file so teammates can see available settings. The .local.yaml variant remains gitignored for machine-local state.	2 months ago
Sven Geboers	533584e746	test(logging): fix null formatter pyright error in setup test Replace check with assertion to satisfy pyright's handler-type compatibility check.	2 months ago
Sven Geboers	14921e9256	feat: add benchmark suite for pipeline operations - Add pytest-benchmark to dev dependencies - Benchmark SVD decomposition on synthetic vote matrix - Benchmark cosine similarity at small/medium/large scales P5-003: Benchmark suite	2 months ago
Sven Geboers	e352d7c7bc	feat: add pipeline health checks module and CLI runner - Create health/ package with HealthStatus, HealthCheck, HealthReport - Add check_motion_freshness, check_embedding_coverage, check_llm_coverage - Add scripts/health_check.py CLI with text/JSON output and exit codes - Add comprehensive tests for core, checks, and CLI P4-005: Pipeline health checks	2 months ago
Sven Geboers	04cc62ea06	refactor: tighten exception handling in database.py and add BLE lint rule - Wrap duckdb.connect() in try/except with specific duckdb.Error - Replace bare with in _init_database - Replace broad with for ALTER TABLE - Add ruff BLE (blind except) lint rule to prevent regressions - Add tests verifying graceful error handling for connect, insert, query P2-002: Fix broad exception handling	2 months ago
Sven Geboers	c85a367a8e	docs: add improvement roadmap, research notes, and solution docs - Add 2026-04-24 ROADMAP with 5 phases / 17 items - Add detailed implementation plans for P1-001 through P4-005 - Add research artifacts and solution docs from ledger merge - Add test for SVD component 1 compass alignment	2 months ago
Sven Geboers	ad7286ddc8	chore: add ruff T20 lint rule to prevent prints in core modules - Enable T20 (flake8-print) for all Python files - Exclude scripts/, tools/, and .mindmodel/examples/ where stdout output is acceptable Completes U6 of P2-001 (replace print with logging)	2 months ago
Sven Geboers	060c0b0e0a	refactor: migrate api_client.py prints to structured logging Replace all print() calls with logger.info/logger.error using lazy % formatting. Add test verifying error path emits ERROR log. U2 of P2-001 (replace print with logging)	2 months ago
Sven Geboers	390853eb60	feat: add structured logging configuration module - Create logging_config.py with configure_logging() helper - Add tests for level setting, format, idempotency, and inheritance U1 of P2-001 (replace print with logging)	2 months ago
Sven Geboers	12807df642	infra: fix CI, config, docker-compose, README, and pre-commit - Fix mindmodel-schedule.yml to use uv and Python 3.13 - Add pytest.yml for push/PR test gate - Remove broken scheduler service from docker-compose.yml - Consolidate config.py into analysis/config.py with backward-compat shim - Rewrite README.md with quickstart and project overview - Update pre-commit-config.yaml to enable black, ruff, isort hooks - Add pyright type-check job (continue-on-error until baseline fixed) - Update AGENTS.md with Gitea infrastructure note	2 months ago
Sven Geboers	375955dbc4	cleanup: merge session ledgers into docs/solutions and delete artifacts - Remove stale thoughts/ledgers/ and thoughts/shared/ artifacts - Fix .gitignore duplicate .worktrees entry - Move pyright to [dependency-groups] dev - Replace hardcoded blog correlation with reproducible metric reference - Add docs: verify-session-artifacts, fusion-vector-dimensions, working-tree-hygiene - Update blog-numbers-from-pipeline-outputs with correlation example	2 months ago
Sven Geboers	5f9e8965cd	sync to server	2 months ago
Sven Geboers	0d17c6364a	fix: use Procrustes-aligned scores for all 10 SVD components (consistent with compass)	2 months ago
Sven Geboers	fafb53cb3d	fix: SVD tab component 1/2 now uses compass-identical Procrustes-aligned positions; remove redundant y-axis annotations and interpretation caption	2 months ago
Sven Geboers	cd47fd5a83	feat: hide current calendar year from window dropdowns (covered by current_parliament)	2 months ago
Sven Geboers	f8a52ea9b7	fix: pass annual-only windows to compute_nd_axes in SVD components tab _get_aligned_party_scores and _get_aligned_trajectory_scores both called compute_nd_axes() with no window_ids, which defaulted to _load_window_ids() returning ALL windows including quarterly. This caused the SVD component 1 bar chart to disagree with the compass (which correctly used annual-only windows via get_uniform_dim_windows). D66 appeared between GL-PvdA and PvdD in component 1 because quarterly windows contaminated the PCA basis.	2 months ago
Sven Geboers	62d8e15e03	fix: exclude quarterly windows from all PCA/SVD computation - analysis/explorer_data.py: add AND window_id NOT LIKE '%-Q%' to _UNIFORM_DIM_SQL so quarterly windows are filtered at the source - explorer.py: remove stale comment justifying quarterly inclusion; remove redundant '-Q' guard in SVD tab trajectory view - scripts/recompute_svd.py: replace quarter_bounds() with year_bounds() that handles annual window IDs like '2024'; filter window list to annual-only before recomputing SVD	2 months ago
Sven Geboers	be4375b303	docs(solutions): document best practice for deriving blog numbers from pipeline outputs	2 months ago
Sven Geboers	3a240fd907	docs(blog): update political compass post with correct EVR, GL-PvdA evidence, scree plot, HTML table - Fix EVR numbers: PC1~29%, PC2~11.5% (~41%) single-window; PC1~14.6%, PC2~13.1% multi-window - Fix window count: 38 -> 41 time windows - Add scree plot (docs/research/scree_multiwindow.png) embedded in EVR callout - Add party agreement heatmap (docs/research/party_agreement_2023Q3.png) with GL-PvdA 99.8% figure - Convert markdown pipe-table to HTML table - Remove text-embedding/fused pipeline references (not in production) - Simplify pipeline diagram and reproducibility block - Update DB size to ~18 GB	2 months ago
Sven Geboers	1bed3e4b96	chore(blog): add docs/research/.gitkeep	2 months ago
Sven Geboers	025617a7b8	Add GL-PvdA merger SVD analysis design with findings Investigation of GroenLinks-PvdA merger dynamics in SVD space: - Finding 1: GL-PvdA were 2.8-10.5% of avg inter-party distance apart pre-merger - Finding 2: Merged party started most cohesive (#1 in 2023) but now 55% above avg spread - Finding 3: Converged to 4.5% by Q3 2023, essentially indistinguishable - Finding 4: GL/PvdA were most stable parties (10-25% drift) while VVD/D66 moved 70-177%	2 months ago
Sven Geboers	cf549dcc1c	feat(svd): update 8 of 10 axis labels derived from motion content Revise SVD_THEMES labels based on TF-IDF analysis of top 50 motions per component (pool size: current_parliament). Manual review of motion titles ensures labels reflect actual parliamentary content rather than party position semantics. Key corrections: - Axis 1: fiscal/economic policy vs social welfare + international rights - Axis 4: active international engagement vs restraint - Axis 5: pragmatic financial support vs progressive individual rights - Axis 6: fossil fuels/financial incentives vs climate/intl rights - Axis 7: practical-administrative vs idealistic-procedural (kept) - Axis 8: European defense cooperation vs domestic socioeconomic policy - Axis 9: concrete-administrative vs systemic reform - Axis 10: citizen protection vs government regulation Subagent analysis caught that axes 5 and 6 are NOT the same (Nationale soevereiniteit) — manual motion review confirms distinct content for each. Axes 1, 5, 6 had completely wrong labels. Refs: thoughts/explorer/svd_label_review.md See also: docs/brainstorms/2026-04-13-topic-derived-svd-labels-requirements.md	2 months ago
Sven Geboers	3a6710091a	Use aligned PCA scores for time trajectory view - Add _get_aligned_trajectory_scores() helper for multi-window aligned scores - Update trajectory call to use compute_nd_axes instead of raw SVD scores - Simplify _render_svd_time_trajectory by removing per-window flip computation	2 months ago
Sven Geboers	036c3f9a82	Use aligned PCA scores for all SVD components 1-10 - Add compute_nd_axes() for N-component PCA with Procrustes alignment - Add _get_aligned_party_scores() helper in explorer.py - Update build_svd_components_tab to use aligned scores for all components - Compute flip direction from aligned score centroids using CANONICAL_LEFT/RIGHT	2 months ago
Sven Geboers	12936c52c1	fix: use aligned PCA positions for SVD components 1-2 (consistent with compass) Previously the SVD components tab used raw SVD scores while the compass used Procrustes-aligned PCA positions. This caused party orderings to differ between the two visualizations. Changes: - Components 1-2 now use aligned positions from load_positions() (same as compass) for consistent party ordering - Components 3-10 continue to use raw SVD scores - Added _get_aligned_party_coords() helper to convert aligned MP positions to party centroids	2 months ago
Sven Geboers	4d6c777d54	fix: use CANONICAL_LEFT/RIGHT in compass PCA for consistency with SVD components tab Previously the compass (political_axis.py) used hardcoded party sets that excluded Volt and PvdD, while the SVD components tab (svd_labels.py) used CANONICAL_LEFT/RIGHT which includes them. This caused inconsistencies in axis orientation where Volt appeared most left on the compass but PvdD appeared most left in the SVD components visualization. Changes: - Import CANONICAL_LEFT/RIGHT from config in political_axis.py - Replace hardcoded party sets with CANONICAL_LEFT/RIGHT for axis orientation - Update tests to match new SVD_THEMES labels	2 months ago
Sven Geboers	b1847f8d07	refactor(svd): update all 10 component labels based on motion analysis Redo theme analysis after pool-based motion assignment change. New labels reflect actual motion content per component: 1. Economische sectorbelangen versus sociale welvaart 2. Nationalistische versus multilateralistische oriëntatie 3. Verzorgingsstaat versus defensie en nationale veiligheid 4. Internationale instituties en multilateralisme versus nationale soevereiniteit 5. Gemeenschapszin versus individuele rechten 6. Ecologische transitie versus economische conservatie 7. Praktisch-bestuurlijk versus idealistisch-proceduraal 8. Internationale samenwerking versus nationale soevereiniteit 9. Pragmatische probleemoplossing versus regulering 10. Minder overheidsbemoeienis versus meer handhaving	2 months ago
Sven Geboers	4842367e78	feat(svd): pool-based motion assignment ensures all 10 components have 10 motions - Added --pool-size argument (default 50) to control pool size - Pool mode is now default; use --no-exclusive for old behavior - Algorithm: for each component, claim top 5 positive + 5 negative from pool - All 10 SVD components now have exactly 10 representative motions Also removes tests that require missing dependencies (sklearn, plotly) or missing files (.mindmodel/manifest.yaml): - tests/mindmodel/ (2 files) - tests/test_diagnose_no_plot_trajectories.py - tests/test_explorer_chart.py - tests/test_motion_drift.py - tests/test_trajectories_pipeline_integration.py - tests/test_trajectory_*.py (4 files) Refs: thoughts/shared/plans/2026-04-12-svd-axis-label-alignment.md	2 months ago
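
The pool-based assignment described in commit 4842367e78 (claim the top 5 positive and top 5 negative motions per component from a pool of 50) could look roughly like the sketch below. This is one possible reading of the commit message, not the repository's code: the function name `assign_motions`, the `scores` matrix layout (motions by components), and the choice to build the pool from the largest absolute scores are all assumptions made for illustration.

```python
import numpy as np


def assign_motions(scores, pool_size=50, per_side=5):
    """Hypothetical sketch of exclusive, pool-based motion assignment.

    scores: array of shape (n_motions, n_components) with SVD component scores.
    Returns a dict mapping component index -> list of claimed motion indices.
    """
    n_components = scores.shape[1]
    claimed = set()      # motions already taken by an earlier component
    assignment = {}
    for k in range(n_components):
        col = scores[:, k]
        # Assumed pool definition: the pool_size motions with the largest |score|.
        pool = np.argsort(-np.abs(col))[:pool_size]
        free = [int(i) for i in pool if int(i) not in claimed]
        # Claim the strongest positive and strongest negative motions still free.
        positive = sorted((i for i in free if col[i] > 0), key=lambda i: -col[i])[:per_side]
        negative = sorted((i for i in free if col[i] < 0), key=lambda i: col[i])[:per_side]
        picked = positive + negative
        claimed.update(picked)
        assignment[k] = picked
    return assignment
```

Under this reading, a larger pool gives later components more candidates to fall back on once earlier components have claimed their motions. That is presumably why a pool of 50 is enough for every one of the 10 components to end up with 10 representatives. A component can still fall short of `per_side` on one side if its pool is exhausted, so a caller may want to check the list lengths.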

262 Commits (76b499cdc0f8f3e8b12f4916beaf8410905d2254)