# System Overview

## Project: Stemwijzer (Dutch Political Voting Compass)

**Purpose**: A web application that maps the Dutch Tweede Kamer (House of Representatives) based on real parliamentary votes, helping citizens discover which political party aligns best with their views.

## Architecture Summary

### Data Flow

```
TweedeKamer OData API
    ↓
API Client (api_client.py)
    ↓
DuckDB Database (database.py)
    ↓
Pipeline Processing (pipeline/)
    ├── fetch_mp_metadata    # MP party + tenure
    ├── extract_mp_votes     # voting_results → mp_votes
    ├── svd_pipeline         # SVD on vote matrix + Procrustes
    ├── text_pipeline        # AI embeddings via OpenRouter
    └── fusion               # Combine SVD + text vectors
    ↓
Streamlit Web App (app.py, pages/)
    ├── Home.py              # Landing page
    ├── 1_Stemwijzer.py      # Voting quiz
    └── 2_Explorer.py        # Political compass explorer
```

### Key Components

| Component | Purpose | File(s) |
|-----------|---------|---------|
| **Database** | Motion storage, MP votes, embeddings | `database.py` |
| **API Client** | TweedeKamer OData API integration | `api_client.py` |
| **AI Provider** | OpenRouter API for embeddings/summaries | `ai_provider.py` |
| **Pipeline** | Orchestrated data processing | `pipeline/run_pipeline.py` |
| **Analysis** | SVD, clustering, trajectory computation | `analysis/*.py` |
| **Similarity** | Motion similarity search | `similarity/*.py` |
| **Web App** | Streamlit UI | `app.py`, `pages/*.py` |

### Data Models

**Core Entities**:

- `Motion`: Parliamentary motion with voting results
- `MP` / `MPMetadata`: Member of Parliament with party/tenure
- `MPVote`: Individual vote record (Voor/Tegen/Onthouden/Geen stem/Afwezig)
- `Party`: Political party
- `UserSession` / `UserVote`: Voting session tracking
- `SVDVector`: Dimensionality-reduced vote vectors
- `FusedEmbedding`: Combined SVD + text embedding
- `SimilarityCache`: Pre-computed motion similarities

### Technical Decisions

1. **DuckDB over SQLite**: Chosen for OLAP performance with complex analytical queries
2. **ibis ORM**: Database-agnostic query building (currently using DuckDB backend)
3. **SVD + Procrustes**: Aligns voting vectors across time windows
4. **UMAP for visualization**: Non-linear dimensionality reduction for compass display
5. **OpenRouter API**: Abstraction layer for AI embeddings (currently using Qwen)
6. **Module-level singletons**: `db = MotionDatabase()` pattern for shared state

### Key Conventions

- **DuckDB connections**: Short-lived per method, always close
- **Error handling**: Catch `Exception`, return safe fallbacks (False/[]/None)
- **Logging**: Use `logging.getLogger(__name__)` - avoid print()
- **Type hints**: Required on public functions with typing module imports
- **Config**: Dataclass `Config` in `config.py`, accessed as `from config import config`
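
The Key Conventions can be illustrated with a minimal sketch. It is not the project's actual code: the `database_path` field, the `get_motion_titles` method, and the `motions` table layout are hypothetical, and the standard-library `sqlite3` module stands in for `duckdb` so the example runs without third-party dependencies (`duckdb.connect` offers a similar connect/execute/fetchall/close shape).

```python
import logging
import sqlite3  # stand-in for duckdb; assumption made so the sketch is stdlib-only
from dataclasses import dataclass
from typing import List, Tuple

# Convention: module logger instead of print()
logger = logging.getLogger(__name__)


@dataclass
class Config:
    # Hypothetical field; the real Config in config.py may look different.
    database_path: str = ":memory:"


# Convention: module-level singleton, as in `from config import config`
config = Config()


class MotionDatabase:
    """Reduced, hypothetical stand-in for the class in database.py."""

    def __init__(self, path: str = config.database_path) -> None:
        self.path = path

    def get_motion_titles(self, limit: int = 10) -> List[Tuple[int, str]]:
        """Return (id, title) rows, or [] if anything goes wrong."""
        conn = None
        try:
            # Convention: open a short-lived connection per method call
            conn = sqlite3.connect(self.path)
            query = "SELECT id, title FROM motions ORDER BY id LIMIT ?"
            return conn.execute(query, (limit,)).fetchall()
        except Exception:
            # Convention: log the failure and return a safe fallback
            logger.exception("Could not load motions from %s", self.path)
            return []
        finally:
            # Convention: always close the connection
            if conn is not None:
                conn.close()


# Convention: shared module-level instance
db = MotionDatabase()
```

Because the default path in this sketch points at an empty in-memory database, `db.get_motion_titles()` hits a missing table, logs the exception, and returns `[]` rather than raising, which is the fallback behaviour the conventions describe.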
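
To make the step from `MPVote` records to the vote matrix consumed by `svd_pipeline` more concrete, here is a rough sketch in plain Python. The numeric encoding (+1 for Voor, -1 for Tegen, 0 for everything else) and the `build_vote_matrix` helper are assumptions for illustration; the overview above does not state how the project encodes votes.

```python
from typing import Dict, List, Tuple

# Assumed numeric encoding of the five MPVote values; the real mapping may differ.
VOTE_VALUES: Dict[str, float] = {
    "Voor": 1.0,
    "Tegen": -1.0,
    "Onthouden": 0.0,
    "Geen stem": 0.0,
    "Afwezig": 0.0,
}


def build_vote_matrix(
    records: List[Tuple[str, str, str]],
) -> Tuple[List[str], List[str], List[List[float]]]:
    """Turn (mp, motion, vote) records into an MP x motion matrix.

    Returns the sorted MP names (row labels), the sorted motion ids
    (column labels), and the matrix itself. Missing votes stay at 0.0.
    """
    mps = sorted({mp for mp, _, _ in records})
    motions = sorted({motion for _, motion, _ in records})
    row = {mp: i for i, mp in enumerate(mps)}
    col = {motion: j for j, motion in enumerate(motions)}

    matrix = [[0.0] * len(motions) for _ in mps]
    for mp, motion, vote in records:
        # Unknown vote labels fall back to 0.0 instead of raising
        matrix[row[mp]][col[motion]] = VOTE_VALUES.get(vote, 0.0)
    return mps, motions, matrix
```

A matrix of this shape (one row per MP, one column per motion) is the kind of input an SVD step would factorize; treating abstentions and absences as 0 is only one possible choice and affects the resulting vectors.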