motief/.mindmodel/system.md

System Overview

Project: Stemwijzer (Dutch Political Voting Compass)

Purpose: A web application that maps the Dutch Tweede Kamer (House of Representatives) based on real parliamentary votes, helping citizens discover which political party aligns best with their views.

Architecture Summary

Data Flow

TweedeKamer OData API
        ↓
  API Client (api_client.py)
        ↓
  DuckDB Database (database.py)
        ↓
  Pipeline Processing (pipeline/)
        ├── fetch_mp_metadata     # MP party + tenure
        ├── extract_mp_votes      # voting_results → mp_votes
        ├── svd_pipeline          # SVD on vote matrix + Procrustes
        ├── text_pipeline         # AI embeddings via OpenRouter
        └── fusion                # Combine SVD + text vectors
        ↓
  Streamlit Web App (Home.py, pages/)
        ├── Home.py               # Landing page
        ├── 1_Stemwijzer.py       # Voting quiz
        └── 2_Explorer.py         # Political compass explorer

Key Components

Component         Purpose                                  File(s)
Database          Motion storage, MP votes, embeddings     database.py
API Client        TweedeKamer OData API integration        api_client.py
AI Provider       OpenRouter API for embeddings/summaries  ai_provider.py
Pipeline          Orchestrated data processing             pipeline/run_pipeline.py
Analysis          SVD, clustering, trajectory computation  analysis/*.py
Explorer Helpers  Pure functions, chart builders           explorer_helpers.py
Web App           Streamlit UI                             Home.py, pages/*.py

Tech Stack

  • Language: Python 3.13+
  • Web Framework: Streamlit (multi-page app)
  • Database: DuckDB via Ibis (a dataframe expression library rather than an ORM; DuckDB-native implementation)
  • ML/Analytics: scipy (SVD, Procrustes), scikit-learn (KMeans, cosine_similarity), umap-learn (optional)
  • AI/LLM: OpenRouter-compatible API (Qwen embeddings + chat)
  • Visualization: Plotly (interactive charts), matplotlib (optional)
  • HTTP: requests with Session pooling and retry
  • Parsing: beautifulsoup4, lxml

Key Patterns

  1. Module-Level Singletons: db = MotionDatabase(), config = Config()
  2. Repository Pattern: MotionDatabase class with method-per-query
  3. Service Layer: TweedeKamerAPI, ai_provider with retry/backoff
  4. Pipeline Orchestration: ThreadPoolExecutor for parallel SVD
  5. Short-Lived Connections: DuckDB connections in try/finally blocks
  6. Graceful Degradation: try/except around optional dependencies
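
Patterns 4 and 6 can be sketched together as follows. This is a minimal illustration under stated assumptions, not the project's code: compute_window, run_windows, HAS_UMAP, and the window labels are hypothetical names, and the per-window work is a trivial stand-in for the real SVD step. Only the umap import (from umap-learn) and ThreadPoolExecutor come from the stack listed above.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

logger = logging.getLogger(__name__)

# Pattern 6: optional dependency. umap-learn is listed as optional, so a
# failed import should disable a feature rather than crash the app.
try:
    import umap  # noqa: F401
    HAS_UMAP = True
except Exception:  # ImportError, or a broken native extension at import time
    HAS_UMAP = False
    logger.info("umap-learn not available; UMAP projections disabled")


def compute_window(window: str) -> tuple[str, int]:
    """Hypothetical stand-in for the per-window SVD computation."""
    return window, len(window)


def run_windows(windows: list[str], max_workers: int = 4) -> dict[str, int]:
    """Pattern 4: fan independent windows out over a thread pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order and re-raises worker exceptions.
        return dict(pool.map(compute_window, windows))
```

Callers would check HAS_UMAP before offering a UMAP view. Which unit of work the real pipeline parallelizes is not stated here; independent time windows are assumed only for illustration.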

Domain Invariants

CRITICAL RULES (from AGENTS.md):

  1. Right-wing parties on RIGHT: PVV, FVD, JA21, SGP must appear on RIGHT side of all axes in visualizations
  2. SVD labels = voting patterns: SVD labels reflect voting patterns, NOT semantic content
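
The sign of each SVD component is arbitrary, so rule 1 generally has to be enforced after the decomposition. A minimal sketch of one way to do that, assuming positions are held as a mapping from party name to coordinates; orient_axes, the mean-based criterion, and the sample coordinates are hypothetical and not taken from the project:

```python
RIGHT_WING = ("PVV", "FVD", "JA21", "SGP")  # parties named in rule 1


def orient_axes(positions: dict[str, list[float]]) -> dict[str, list[float]]:
    """Flip any axis on which the right-wing parties average below zero."""
    anchors = [positions[p] for p in RIGHT_WING if p in positions]
    if not anchors:
        return positions  # nothing to orient against
    signs = []
    for axis in range(len(anchors[0])):
        mean = sum(vec[axis] for vec in anchors) / len(anchors)
        signs.append(-1.0 if mean < 0 else 1.0)
    # Apply the same sign to every entity so relative positions are unchanged.
    return {
        name: [s * x for s, x in zip(signs, vec)]
        for name, vec in positions.items()
    }


# Made-up coordinates: the first axis comes out mirrored, the second does not.
sample = {"PVV": [-0.8, 0.2], "SGP": [-0.4, 0.1], "OTHER": [0.7, -0.3]}
oriented = orient_axes(sample)
```

Flipping an axis is a global mirror, so it changes only the orientation of the plot and not the distances between parties. A mean-based test does not guarantee that every listed party individually lands on the right; a stricter check would need to be added if the rule is meant per party.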

Database Tables

Table             Purpose
motions           Parliamentary motions with id, title, date, category
mp_votes          Individual MP votes on motions (Voor/Tegen/Onthouden, i.e. for/against/abstain)
mp_metadata       MP names, parties, tenure info
svd_vectors       2D SVD-computed political positions per entity
fused_embeddings  Combined SVD + text embeddings
embeddings        Text embeddings for motions
user_sessions     Voting session tracking
party_results     Party match results per session

Conventions

  • Error Handling: Catch Exception, return safe fallbacks (False/[]/None)
  • Logging: Use logging.getLogger(__name__); never use print()
  • Imports: stdlib → 3rd party → local (3 groups)
  • Type Hints: Required on public functions, importing helper types from the typing module where needed
  • DuckDB: Short-lived connections, always closed with conn.close() in a finally block
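
A function that follows these conventions might look like the sketch below. It is illustrative only: get_motion_title is a hypothetical helper, and sqlite3 from the standard library stands in for duckdb so the example runs without extra dependencies. With DuckDB, the connect call would be duckdb.connect(db_path), whose connections offer the same execute(...).fetchone() and close() calls used here.

```python
import logging
import sqlite3  # stand-in for duckdb so the sketch needs only the stdlib
from typing import Optional

logger = logging.getLogger(__name__)


def get_motion_title(db_path: str, motion_id: int) -> Optional[str]:
    """Return a motion's title, or None if it is missing or the query fails."""
    conn = None
    try:
        # Short-lived connection: opened per call, never cached on the module.
        conn = sqlite3.connect(db_path)
        row = conn.execute(
            "SELECT title FROM motions WHERE id = ?", (motion_id,)
        ).fetchone()
        return row[0] if row else None
    except Exception:
        # Log with traceback instead of printing, then return a safe fallback.
        logger.exception("Failed to load motion %s", motion_id)
        return None
    finally:
        if conn is not None:
            conn.close()
```

Because every failure collapses to None, callers cannot tell a missing motion from a database error; the log record is the only place that distinction survives.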