Extremity Scorer (U4 enhanced):
- Now scores BOTH original motion text AND layman explanation separately
- Schema: text_score, text_explanation, layman_score, layman_explanation
- Text score distribution (score→count): 1→7, 2→33, 3→5, 4→5 (mild-to-moderate)
- Layman score distribution (score→count): 1→12, 2→20, 3→17, 4→1 (slightly milder)
Sentiment Analysis (U5 enhanced):
- Now scores BOTH original motion text AND layman explanation separately
- Schema: text_score, text_explanation, layman_score, layman_explanation
- Text sentiment avg: 0.294 (slightly positive)
- Layman sentiment avg: 0.416 (more positive - summaries tone down hostility)
Category Derivation (new):
- Two-phase LLM approach: derive taxonomy from sample, then apply to all
- Discovered 7 categories from 30-motion sample:
veiligheid/justitie, corona/pandemie, economie/belasting, klimaat/milieu,
defensie/buitenland, asiel/vreemdelingen, overig
- Applied to 50 motions; the resulting distribution is stored in the DB
- Adds category + category_explanation columns to right_wing_motions
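A minimal sketch of the two-phase flow described above, not the actual implementation. The names derive_and_apply, derive_taxonomy and classify are hypothetical; the two callables stand in for the LLM calls, and random sampling of the 30-motion sample is an assumption.

```python
import random


def derive_and_apply(motions, derive_taxonomy, classify, sample_size=30, seed=0):
    """Phase 1: derive categories from a sample. Phase 2: label every motion.

    `derive_taxonomy(sample) -> list[str]` and
    `classify(motion, categories) -> (category, explanation)` are hypothetical
    stand-ins for the LLM calls.
    """
    rng = random.Random(seed)
    # Assumption: the sample is drawn at random; the commit does not say how.
    sample = rng.sample(motions, min(sample_size, len(motions)))
    categories = list(derive_taxonomy(sample))
    if "overig" not in categories:
        categories.append("overig")  # catch-all bucket, as in the taxonomy above

    rows = []
    for motion in motions:
        category, explanation = classify(motion, categories)
        if category not in categories:
            # Never store a label outside the derived taxonomy.
            category = "overig"
        rows.append({
            "motion_id": motion["id"],
            "category": category,
            "category_explanation": explanation,
        })
    return categories, rows
```

Keeping the taxonomy fixed after phase 1 means every motion is labeled against the same category list, so the distribution stays comparable across runs.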
Implements U5: sentiment_analysis.py uses LLM batch calls (fallback when no
local Dutch sentiment model is available) to score motion sentiment on a
[-1, 1] scale.
Design:
- Prompt asks for sentiment from -1 (hostile/aggressive) to 1 (constructive)
- JSON schema enforces numeric score + Dutch explanation
- Batch size 10, max_workers 5 for parallel API calls
- Stores results in table
- Updates with avg_sentiment, sentiment_std,
pct_strongly_negative per year
Sample validation (50 motions): good variance across [-0.9, 1.0] range.
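The batching design above could look roughly like this. It is a sketch only: chunked, fake_score_batch and score_all are hypothetical names, and fake_score_batch replaces the real LLM call with a constant result.

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 10   # from the design notes above
MAX_WORKERS = 5   # from the design notes above


def chunked(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def fake_score_batch(batch):
    """Hypothetical stand-in for the LLM call: one result per motion."""
    return [{"motion_id": m["id"], "score": 0.0, "explanation": ""} for m in batch]


def score_all(motions, score_batch=fake_score_batch):
    """Score every motion by sending batches to `score_batch` in parallel."""
    batches = chunked(motions, BATCH_SIZE)
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        # Executor.map() yields results in input order, so rows line up
        # with the original motion list.
        results = [row for rows in pool.map(score_batch, batches) for row in rows]
    for row in results:
        # Enforce the documented [-1, 1] range before anything is stored.
        if not -1.0 <= row["score"] <= 1.0:
            raise ValueError(f"score out of range for motion {row['motion_id']}")
    return results
```

Threads fit here because the work is I/O-bound API calls; the range check catches responses that pass the JSON schema but fall outside the scale.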
Implements U3: temporal_analysis.py computes yearly_summary from the
right_wing_motions table (U2 output).
Metrics per year:
- total_right_wing, pct_of_total, total_motions
- avg_right_support, avg_left_opposition, centrist_support
- avg_right_keyword_matches, extremity_index (U4 placeholder)
- yoy_right_wing_delta, yoy_pct_delta
Key finding: right-wing motions grew from ~4% (2018) to ~12% (2024-2025)
of all motions, with rising centrist support over time.
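A sketch of how the share and year-over-year columns might be derived. add_yoy_deltas is a hypothetical helper, and it assumes yoy_pct_delta is a difference in percentage points; the real temporal_analysis.py may define it differently.

```python
def add_yoy_deltas(rows):
    """Add pct_of_total and year-over-year deltas to per-year count rows.

    Each input row needs `year`, `total_right_wing` and `total_motions`.
    Assumption: yoy_pct_delta is a percentage-point difference.
    """
    out = []
    prev = None
    for row in sorted(rows, key=lambda r: r["year"]):
        total = row["total_motions"]
        pct = 100.0 * row["total_right_wing"] / total if total else 0.0
        new = dict(row, pct_of_total=pct)
        if prev is None:
            # The first year has nothing to compare against.
            new["yoy_right_wing_delta"] = None
            new["yoy_pct_delta"] = None
        else:
            new["yoy_right_wing_delta"] = row["total_right_wing"] - prev["total_right_wing"]
            new["yoy_pct_delta"] = pct - prev["pct_of_total"]
        out.append(new)
        prev = new
    return out
```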
Removes embedding-heavy search and generic browser tabs from the Explorer.
The project does not currently use embeddings meaningfully, so the similarity
search and browser features were dead weight.
Changes:
- explorer.py: Remove search and browser tabs, keep compass/trajectories/SVD
- explorer.py: Remove fused_embeddings and similarity_cache stats from sidebar
- Home.py: Update Explorer description to match new focused layout
- analysis/tabs/__init__.py: Remove search and browser exports
- tests: Update decomposition and import tests for new tab set
Result: Explorer now has 3 focused analytical tabs instead of 5.
- compute_svd_for_window now computes explained variance ratio (s²/sum(s²))
and appends it as a metadata row (entity_type='metadata',
entity_id='explained_variance') to motion_rows
- load_scree_data reads this metadata row from svd_vectors instead of
querying the non-existent sv_metadata column
- run_svd_for_window counts only entity_type='motion' rows in stored_motion
so metadata rows don't inflate the count
- Added 5 TDD tests covering load, compute, store, and round-trip
All 227 tests pass.
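For reference, a minimal sketch of the s²/sum(s²) computation and the metadata row. The helper names and the `vector` and `window_id` keys are assumptions for illustration; only entity_type and entity_id come from the change above.

```python
def explained_variance_ratio(singular_values):
    """Return s_i^2 / sum_j(s_j^2) for each singular value."""
    squares = [s * s for s in singular_values]
    total = sum(squares)
    if total == 0:
        return [0.0 for _ in squares]
    return [sq / total for sq in squares]


def evr_metadata_row(window_id, singular_values):
    """Build the metadata row; the `vector` and `window_id` keys are assumed."""
    return {
        "window_id": window_id,
        "entity_type": "metadata",
        "entity_id": "explained_variance",
        "vector": explained_variance_ratio(singular_values),
    }


def count_motions(rows):
    """Count only real motion rows so metadata does not inflate the total."""
    return sum(1 for r in rows if r["entity_type"] == "motion")
```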
- load_scree_data: return [] with TODO until schema stores EVR metadata
- load_party_axis_scores: compute from vectors instead of missing table
- load_party_axis_scores_for_window: same vector-based fallback
- load_party_scores_all_windows[_aligned]: check table existence,
fall back to computing from load_positions when absent
All functions predated decomposition (5afbad1, 2026-04-05) and relied on
party_axis_scores / sv_metadata columns that were never created.
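The check-then-fall-back pattern, sketched with the standard-library sqlite3 module as a stand-in for DuckDB so the example runs without extra dependencies. table_exists and party_scores are hypothetical names, and the `party` and `score` columns are assumed.

```python
import sqlite3


def table_exists(conn, name):
    """Return True if a table with this name exists (SQLite catalog query)."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None


def party_scores(conn, positions):
    """Read stored party scores, or compute them when the table is absent.

    `positions` maps party -> list of per-MP axis scores and is a
    hypothetical stand-in for what load_positions provides.
    """
    if table_exists(conn, "party_axis_scores"):
        return dict(conn.execute("SELECT party, score FROM party_axis_scores").fetchall())
    # Fallback: party score is the mean of its members' scores.
    return {p: sum(v) / len(v) for p, v in positions.items() if v}
```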
- Add verify-lint-rule-scope-before-relying-on-it: guidance on
confirming lint rule coverage before trusting it for enforcement.
Documents the P2-002 incident where ruff BLE only catches bare
not .
- Update working-tree-hygiene: add dev-tool-in-venv check and
ruff dependency example.
- Add pytest-benchmark to dev dependencies
- Benchmark SVD decomposition on synthetic vote matrix
- Benchmark cosine similarity at small/medium/large scales
P5-003: Benchmark suite
- Wrap duckdb.connect() in try/except with specific duckdb.Error
- Replace bare with in _init_database
- Replace broad with for ALTER TABLE
- Add ruff BLE (blind except) lint rule to prevent regressions
- Add tests verifying graceful error handling for connect, insert, query
P2-002: Fix broad exception handling
- Add 2026-04-24 ROADMAP with 5 phases / 17 items
- Add detailed implementation plans for P1-001 through P4-005
- Add research artifacts and solution docs from ledger merge
- Add test for SVD component 1 compass alignment
- Enable T20 (flake8-print) for all Python files
- Exclude scripts/, tools/, and .mindmodel/examples/ where
stdout output is acceptable
Completes U6 of P2-001 (replace print with logging)
Replace all print() calls with logger.info/logger.error using
lazy % formatting. Add test verifying error path emits ERROR log.
U2 of P2-001 (replace print with logging)
- Create logging_config.py with configure_logging() helper
- Add tests for level setting, format, idempotency, and inheritance
U1 of P2-001 (replace print with logging)
- Fix mindmodel-schedule.yml to use uv and Python 3.13
- Add pytest.yml for push/PR test gate
- Remove broken scheduler service from docker-compose.yml
- Consolidate config.py into analysis/config.py with backward-compat shim
- Rewrite README.md with quickstart and project overview
- Update pre-commit-config.yaml to enable black, ruff, isort hooks
- Add pyright type-check job (continue-on-error until baseline fixed)
- Update AGENTS.md with Gitea infrastructure note
- Remove stale thoughts/ledgers/ and thoughts/shared/ artifacts
- Fix .gitignore duplicate .worktrees entry
- Move pyright to [dependency-groups] dev
- Replace hardcoded blog correlation with reproducible metric reference
- Add docs: verify-session-artifacts, fusion-vector-dimensions,
working-tree-hygiene
- Update blog-numbers-from-pipeline-outputs with correlation example
_get_aligned_party_scores and _get_aligned_trajectory_scores both called
compute_nd_axes() with no window_ids, which defaulted to _load_window_ids()
returning ALL windows including quarterly. This caused the SVD component 1
bar chart to disagree with the compass (which correctly used annual-only
windows via get_uniform_dim_windows). D66 appeared between GL-PvdA and PvdD
in component 1 because quarterly windows contaminated the PCA basis.
- analysis/explorer_data.py: add AND window_id NOT LIKE '%-Q%' to
_UNIFORM_DIM_SQL so quarterly windows are filtered at the source
- explorer.py: remove stale comment justifying quarterly inclusion;
remove redundant '-Q' guard in SVD tab trajectory view
- scripts/recompute_svd.py: replace quarter_bounds() with year_bounds()
that handles annual window IDs like '2024'; filter window list to
annual-only before recomputing SVD
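The annual-only filter and year_bounds() might look like the sketch below. is_annual is a hypothetical helper mirroring the SQL predicate, and the (first day, last day) return shape of year_bounds is an assumption.

```python
from datetime import date


def is_annual(window_id):
    """Python equivalent of the SQL predicate window_id NOT LIKE '%-Q%'."""
    return "-Q" not in window_id


def year_bounds(window_id):
    """Return (first day, last day) for an annual window ID like '2024'.

    Assumption: bounds are inclusive calendar dates.
    """
    year = int(window_id)
    return date(year, 1, 1), date(year, 12, 31)


def annual_windows(window_ids):
    """Keep only annual windows, preserving their original order."""
    return [w for w in window_ids if is_annual(w)]
```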
Investigation of GroenLinks-PvdA merger dynamics in SVD space:
- Finding 1: GL and PvdA were 2.8-10.5% of avg inter-party distance apart pre-merger
- Finding 2: Merged party started most cohesive (#1 in 2023) but now 55% above avg spread
- Finding 3: Converged to 4.5% by Q3 2023, essentially indistinguishable
- Finding 4: GL/PvdA were most stable parties (10-25% drift) while VVD/D66 moved 70-177%
Revise SVD_THEMES labels based on TF-IDF analysis of top 50 motions
per component (pool size: current_parliament). Manual review of motion
titles ensures labels reflect actual parliamentary content rather than
party position semantics.
Key corrections:
- Axis 1: fiscal/economic policy vs social welfare + international rights
- Axis 4: active international engagement vs restraint
- Axis 5: pragmatic financial support vs progressive individual rights
- Axis 6: fossil fuels/financial incentives vs climate/intl rights
- Axis 7: practical-administrative vs idealistic-procedural (kept)
- Axis 8: European defense cooperation vs domestic socioeconomic policy
- Axis 9: concrete-administrative vs systemic reform
- Axis 10: citizen protection vs government regulation
Subagent analysis caught that axes 5 and 6 are NOT the same
(Nationale soevereiniteit) — manual motion review confirms distinct
content for each. Axes 1, 5, 6 had completely wrong labels.
Refs: thoughts/explorer/svd_label_review.md
See also: docs/brainstorms/2026-04-13-topic-derived-svd-labels-requirements.md
- Add _get_aligned_trajectory_scores() helper for multi-window aligned scores
- Update trajectory call to use compute_nd_axes instead of raw SVD scores
- Simplify _render_svd_time_trajectory by removing per-window flip computation
- Add compute_nd_axes() for N-component PCA with Procrustes alignment
- Add _get_aligned_party_scores() helper in explorer.py
- Update build_svd_components_tab to use aligned scores for all components
- Compute flip direction from aligned score centroids using CANONICAL_LEFT/RIGHT
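The flip-direction step can be sketched as follows. The party sets are placeholders rather than the real CANONICAL_LEFT/RIGHT values from config, and placing the left on the negative side is an assumption.

```python
# Placeholder sets: the real CANONICAL_LEFT / CANONICAL_RIGHT live in config.
CANONICAL_LEFT = {"GL-PvdA", "PvdD"}
CANONICAL_RIGHT = {"VVD"}


def _mean(values):
    return sum(values) / len(values)


def flip_sign(party_scores):
    """Return -1 if the axis must be mirrored, else 1.

    Assumption: the left centroid should end up on the negative side.
    """
    left = [s for p, s in party_scores.items() if p in CANONICAL_LEFT]
    right = [s for p, s in party_scores.items() if p in CANONICAL_RIGHT]
    if not left or not right:
        return 1  # not enough anchor parties to decide; leave axis as-is
    return -1 if _mean(left) > _mean(right) else 1


def orient(party_scores):
    """Apply the flip so every window shares the same left/right direction."""
    sign = flip_sign(party_scores)
    return {party: sign * score for party, score in party_scores.items()}
```

SVD and PCA components have arbitrary sign, so some anchoring step like this is needed before scores are compared across windows.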
Previously the SVD components tab used raw SVD scores while the compass
used Procrustes-aligned PCA positions. This caused party orderings to
differ between the two visualizations.
Changes:
- Components 1-2 now use aligned positions from load_positions()
(same as compass) for consistent party ordering
- Components 3-10 continue to use raw SVD scores
- Added _get_aligned_party_coords() helper to convert aligned MP
positions to party centroids
Previously the compass (political_axis.py) used hardcoded party sets that
excluded Volt and PvdD, while the SVD components tab (svd_labels.py) used
CANONICAL_LEFT/RIGHT which includes them. This caused inconsistencies in
axis orientation where Volt appeared most left on the compass but PvdD
appeared most left in the SVD components visualization.
Changes:
- Import CANONICAL_LEFT/RIGHT from config in political_axis.py
- Replace hardcoded party sets with CANONICAL_LEFT/RIGHT for axis orientation
- Update tests to match new SVD_THEMES labels
Redo theme analysis after pool-based motion assignment change.
New labels reflect actual motion content per component:
1. Economische sectorbelangen versus sociale welvaart
2. Nationalistische versus multilateralistische oriëntatie
3. Verzorgingsstaat versus defensie en nationale veiligheid
4. Internationale instituties en multilateralisme versus nationale soevereiniteit
5. Gemeenschapszin versus individuele rechten
6. Ecologische transitie versus economische conservatie
7. Praktisch-bestuurlijk versus idealistisch-proceduraal
8. Internationale samenwerking versus nationale soevereiniteit
9. Pragmatische probleemoplossing versus regulering
10. Minder overheidsbemoeienis versus meer handhaving
- Added --pool-size argument (default 50) to control pool size
- Pool mode is now default; use --no-exclusive for old behavior
- Algorithm: for each component, claim top 5 positive + 5 negative from pool
- All 10 SVD components now have exactly 10 representative motions
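One possible reading of the pool-based algorithm, as a sketch. assign_representatives is a hypothetical name, and two details are assumptions: the pool is the top pool_size motions by absolute score per component, and exclusive mode skips motions already claimed by an earlier component.

```python
def assign_representatives(loadings, pool_size=50, per_side=5, exclusive=True):
    """Pick representative motions per component from a bounded pool.

    `loadings` maps component -> {motion_id: score}.
    """
    claimed = set()
    result = {}
    for comp in sorted(loadings):
        scores = loadings[comp]
        # Assumption: the pool is the strongest motions by absolute score.
        pool = sorted(scores, key=lambda m: abs(scores[m]), reverse=True)[:pool_size]
        free = [m for m in pool if not (exclusive and m in claimed)]
        positive = sorted(
            (m for m in free if scores[m] > 0), key=lambda m: scores[m], reverse=True
        )[:per_side]
        negative = sorted(
            (m for m in free if scores[m] < 0), key=lambda m: scores[m]
        )[:per_side]
        claimed.update(positive + negative)
        result[comp] = {"positive": positive, "negative": negative}
    return result
```

In this sketch a pool larger than the number of motions needed per component leaves room to skip already-claimed motions while still filling both sides.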
Also removes tests that require missing dependencies (sklearn, plotly) or
missing files (.mindmodel/manifest.yaml):
- tests/mindmodel/ (2 files)
- tests/test_diagnose_no_plot_trajectories.py
- tests/test_explorer_chart.py
- tests/test_motion_drift.py
- tests/test_trajectories_pipeline_integration.py
- tests/test_trajectory_*.py (4 files)
Refs: thoughts/shared/plans/2026-04-12-svd-axis-label-alignment.md