- Fix mindmodel-schedule.yml to use uv and Python 3.13
- Add pytest.yml for push/PR test gate
- Remove broken scheduler service from docker-compose.yml
- Consolidate config.py into analysis/config.py with backward-compat shim
- Rewrite README.md with quickstart and project overview
- Update pre-commit-config.yaml to enable black, ruff, isort hooks
- Add pyright type-check job (continue-on-error until baseline fixed)
- Update AGENTS.md with Gitea infrastructure note

branch: main
parent: 375955dbc4
commit: 12807df642
@@ -0,0 +1,53 @@
+name: Pytest
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@v5
+        with:
+          version: "0.6.x"
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.13"
+
+      - name: Install dependencies
+        run: uv sync --locked
+
+      - name: Run tests
+        run: uv run pytest tests/ -q
+
+  typecheck:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@v5
+        with:
+          version: "0.6.x"
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.13"
+
+      - name: Install dependencies
+        run: uv sync --locked
+
+      - name: Run pyright
+        continue-on-error: true
+        run: uv run pyright
@@ -1,17 +1,18 @@
-# Minimal pre-commit config stub
-# This file is intentionally minimal and does not enable hooks by installing them.
 repos:
   - repo: https://github.com/psf/black
-    rev: 23.9.1
+    rev: 25.1.0
     hooks:
       - id: black
+        language_version: python3.13
 
-  - repo: https://github.com/charliermarsh/ruff
+  - repo: https://github.com/charliermarsh/ruff-pre-commit
     rev: v0.11.1
     hooks:
       - id: ruff
+        args: [--fix]
 
   - repo: https://github.com/PyCQA/isort
-    rev: 5.12.0
+    rev: 6.0.1
     hooks:
       - id: isort
+        args: [--profile, black]
@@ -1,22 +1,81 @@
-# stemwijzer
+# Stemwijzer
 
-A small project that uses QWEN embeddings for semantic features. The codebase includes an example Ansible package under packages/@ansible/example and helper scripts for deployment.
+A Dutch parliamentary voting compass that lets you vote on real Tweede Kamer motions and see which parties match your positions.
 
-Embeddings
-- This project uses QWEN embeddings (model: `qwen/qwen3-embedding-4b`) via OpenRouter-compatible APIs.
-- Preferred environment variable: `OPENROUTER_API_KEY` with a fallback to `OPENAI_API_KEY`.
+![Explorer Screenshot](docs/explorer-screenshot.png)
 
-Publishing and deploying the Ansible package
+## What is Stemwijzer?
 
-- Package location: `packages/@ansible/example` — this contains the Ansible playbooks and packaging used by CI.
-- To publish the package (CI): create a git tag for the version and provide `NPM_TOKEN` as a secret to the CI runner so it can publish to npm.
-- To deploy the package (CI): set the following repository secrets in your CI pipeline:
-  - `DEPLOY_HOST` (default: `motief.sgeboers.nl`)
-  - `DEPLOY_SSH_KEY` (private key for the `webapps` user)
-  - `DEPLOY_USER` (default: `webapps`)
+Stemwijzer ingests motions and voting records from the Dutch House of Representatives (Tweede Kamer), stores them in DuckDB, generates AI-powered explanations with an LLM, and presents a Streamlit UI where users can vote on real motions and explore party positions through SVD visualizations, trajectory analysis, and embedding-based similarity search.
 
-Defaults
-- DEPLOY_HOST: `motief.sgeboers.nl`
-- DEPLOY_USER: `webapps`
+## Features
 
-See docs/deployment/ansible-package-deploy.md for more detailed deploy instructions and defaults.
+- **Voting Compass** — Vote on real parliamentary motions and see which parties align with your choices
+- **Explorer** — Interactive SVD visualizations, party trajectories over time, motion browser, and semantic search
+- **Analytics** — SVD decomposition of voting patterns, UMAP projections, clustering, and drift analysis
+- **LLM Enrichment** — Automatic generation of layman-friendly motion explanations using QWEN via OpenRouter
+
+## Prerequisites
+
+- Python >= 3.13
+- [uv](https://docs.astral.sh/uv/) for dependency management
+- (Optional) `OPENROUTER_API_KEY` for LLM enrichment
+
+## Quickstart
+
+```bash
+# Clone and enter the repository
+git clone <your-gitea-url>/sgeboers/stemwijzer.git
+cd stemwijzer
+
+# Install dependencies
+uv sync
+
+# Run the Streamlit app
+uv run streamlit run Home.py
+
+# Run the data pipeline (fetch motions, compute embeddings, etc.)
+uv run python pipeline/run_pipeline.py
+
+# Run tests
+uv run pytest tests/ -q
+```
+
+The app will be available at http://localhost:8501.
+
+## Project Structure
+
+```
+├── app.py              # Streamlit UI entrypoint
+├── database.py         # DuckDB schema and queries
+├── api_client.py       # Tweede Kamer OData API client
+├── explorer.py         # Explorer page with SVD visualizations
+├── pipeline/           # Data ingestion and analysis pipelines
+├── analysis/           # SVD, clustering, trajectory modules
+├── tests/              # pytest test suite
+├── docs/               # Documentation, research, and plans
+└── data/motions.db     # DuckDB database (~18 GB)
+```
+
+## Documentation
+
+- **[ARCHITECTURE.md](ARCHITECTURE.md)** — Comprehensive architecture overview, tech stack, and contributor guidance
+- **[CODE_STYLE.md](CODE_STYLE.md)** — Coding conventions, naming, typing, and testing standards
+- **[docs/solutions/](docs/solutions/)** — Documented solutions to past bugs and best practices
+
+## Tech Stack
+
+- **Language:** Python 3.13+
+- **Data:** DuckDB via ibis-framework
+- **UI:** Streamlit + Plotly
+- **ML/Analysis:** scipy, scikit-learn, umap-learn
+- **LLM:** QWEN via OpenRouter (OpenAI-compatible)
+- **Package Manager:** uv
+
+## Deployment
+
+See [docs/deployment/ansible-package-deploy.md](docs/deployment/ansible-package-deploy.md) for server deployment instructions using the Ansible package.
+
+## License
+
+[Your license here]
@@ -1,51 +1,2 @@
-# config.py (complete updated version)
-import os
-from dataclasses import dataclass
-from typing import List
-
-
-@dataclass
-class Config:
-    # Database settings
-    DATABASE_PATH = "data/motions.db"
-
-    # API settings (updated)
-    TWEEDE_KAMER_ODATA_API = "https://gegevensmagazijn.tweedekamer.nl/OData/v4/2.0"
-    API_TIMEOUT = 30
-    API_BATCH_SIZE = 250  # Increased based on API capabilities
-    API_MAX_LIMIT = 250
-
-    # AI settings
-    OPENROUTER_API_KEY = os.getenv("OPENROUTER_API_KEY")
-    OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"
-    QWEN_MODEL = "qwen/qwen-2.5-72b-instruct"
-
-    # App settings
-    DEFAULT_MOTION_COUNT = 10
-    DEFAULT_WINNING_MARGIN_MIN = (
-        0  # % - include all, filter by layman_explanation instead
-    )
-    DEFAULT_WINNING_MARGIN_MAX = 100  # %
-    SESSION_TIMEOUT_DAYS = 30
-
-    # Policy areas
-    POLICY_AREAS = [
-        "Alle",
-        "Economie",
-        "Klimaat",
-        "Immigratie",
-        "Zorg",
-        "Onderwijs",
-        "Defensie",
-        "Sociale Zaken",
-        "Algemeen",
-    ]
-
-    # Scraper defaults (previously missing)
-    BASE_URL = (
-        "https://www.tweedekamer.nl/zoeken/zoekresultaten"  # base for scraping motions
-    )
-    SCRAPING_DELAY = int(os.getenv("SCRAPING_DELAY", "5"))
-
-
-config = Config()
+# Backward-compatibility shim — root config now lives in analysis.config
+from analysis.config import Config, config  # noqa: F401
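
Note on the config.py shim: the sketch below illustrates why a two-line re-export can keep existing `import config` call sites working. It assumes `analysis/config.py` defines `Config` and a module-level `config` instance, as the shim's import line implies; the real contents of that module are not part of this diff. The in-memory modules here are hypothetical stand-ins (two fields copied from the removed root config) so the snippet runs without the repository.

```python
import sys
import types
from dataclasses import dataclass


# Hypothetical stand-in for analysis/config.py, trimmed to two settings
# that appear in the removed root config.py.
@dataclass
class Config:
    DATABASE_PATH: str = "data/motions.db"
    API_TIMEOUT: int = 30


# Register an in-memory "analysis" package with a "config" submodule.
analysis_pkg = types.ModuleType("analysis")
analysis_pkg.__path__ = []  # an empty __path__ marks the module as a package
analysis_config = types.ModuleType("analysis.config")
analysis_config.Config = Config
analysis_config.config = Config()
analysis_pkg.config = analysis_config
sys.modules["analysis"] = analysis_pkg
sys.modules["analysis.config"] = analysis_config

# Stand-in for the new root config.py: it runs only the shim's import line.
shim = types.ModuleType("config")
exec("from analysis.config import Config, config  # noqa: F401", shim.__dict__)
sys.modules["config"] = shim

# Legacy call sites and new call sites now resolve to the same objects.
import config as legacy
from analysis.config import config as canonical

same_object = legacy.config is canonical

# A change made through the canonical module is visible via the legacy name.
canonical.API_TIMEOUT = 60
legacy_timeout = legacy.config.API_TIMEOUT
```

Because the shim re-exports names rather than copying values, both import paths share one `Config` instance, which is what lets the consolidation land without touching every caller in the same commit.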