- Fix mindmodel-schedule.yml to use uv and Python 3.13
- Add pytest.yml for push/PR test gate
- Remove broken scheduler service from docker-compose.yml
- Consolidate config.py into analysis/config.py with backward-compat shim
- Rewrite README.md with quickstart and project overview
- Update pre-commit-config.yaml to enable black, ruff, isort hooks
- Add pyright type-check job (continue-on-error until baseline fixed)
- Update AGENTS.md with Gitea infrastructure note

Branch: main
Parent: 375955dbc4
Commit: 12807df642
@@ -0,0 +1,53 @@
name: Pytest

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Install uv
        uses: astral-sh/setup-uv@v5
        with:
          version: "0.6.x"

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.13"

      - name: Install dependencies
        run: uv sync --locked

      - name: Run tests
        run: uv run pytest tests/ -q

  typecheck:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Install uv
        uses: astral-sh/setup-uv@v5
        with:
          version: "0.6.x"

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.13"

      - name: Install dependencies
        run: uv sync --locked

      - name: Run pyright
        continue-on-error: true
        run: uv run pyright
@@ -1,17 +1,18 @@
# Minimal pre-commit config stub
# This file is intentionally minimal and does not enable hooks by installing them.
repos:
  - repo: https://github.com/psf/black
    rev: 23.9.1
    rev: 25.1.0
    hooks:
      - id: black
        language_version: python3.13

  - repo: https://github.com/charliermarsh/ruff
  - repo: https://github.com/charliermarsh/ruff-pre-commit
    rev: v0.11.1
    hooks:
      - id: ruff
        args: [--fix]

  - repo: https://github.com/PyCQA/isort
    rev: 5.12.0
    rev: 6.0.1
    hooks:
      - id: isort
        args: [--profile, black]
@@ -1,22 +1,81 @@
# stemwijzer
# Stemwijzer

A small project that uses QWEN embeddings for semantic features. The codebase includes an example Ansible package under packages/@ansible/example and helper scripts for deployment.
A Dutch parliamentary voting compass that lets you vote on real Tweede Kamer motions and see which parties match your positions.

Embeddings
- This project uses QWEN embeddings (model: `qwen/qwen3-embedding-4b`) via OpenRouter-compatible APIs.
- Preferred environment variable: `OPENROUTER_API_KEY` with a fallback to `OPENAI_API_KEY`.


Publishing and deploying the Ansible package
## What is Stemwijzer?

- Package location: `packages/@ansible/example` — this contains the Ansible playbooks and packaging used by CI.
- To publish the package (CI): create a git tag for the version and provide `NPM_TOKEN` as a secret to the CI runner so it can publish to npm.
- To deploy the package (CI): set the following repository secrets in your CI pipeline:
  - `DEPLOY_HOST` (default: `motief.sgeboers.nl`)
  - `DEPLOY_SSH_KEY` (private key for the `webapps` user)
  - `DEPLOY_USER` (default: `webapps`)
Stemwijzer ingests motions and voting records from the Dutch House of Representatives (Tweede Kamer), stores them in DuckDB, generates AI-powered explanations with an LLM, and presents a Streamlit UI where users can vote on real motions and explore party positions through SVD visualizations, trajectory analysis, and embedding-based similarity search.

Defaults
- DEPLOY_HOST: `motief.sgeboers.nl`
- DEPLOY_USER: `webapps`
## Features

See docs/deployment/ansible-package-deploy.md for more detailed deploy instructions and defaults.
- **Voting Compass** — Vote on real parliamentary motions and see which parties align with your choices
- **Explorer** — Interactive SVD visualizations, party trajectories over time, motion browser, and semantic search
- **Analytics** — SVD decomposition of voting patterns, UMAP projections, clustering, and drift analysis
- **LLM Enrichment** — Automatic generation of layman-friendly motion explanations using QWEN via OpenRouter

## Prerequisites

- Python >= 3.13
- [uv](https://docs.astral.sh/uv/) for dependency management
- (Optional) `OPENROUTER_API_KEY` for LLM enrichment

## Quickstart

```bash
# Clone and enter the repository
git clone <your-gitea-url>/sgeboers/stemwijzer.git
cd stemwijzer

# Install dependencies
uv sync

# Run the Streamlit app
uv run streamlit run Home.py

# Run the data pipeline (fetch motions, compute embeddings, etc.)
uv run python pipeline/run_pipeline.py

# Run tests
uv run pytest tests/ -q
```

The app will be available at http://localhost:8501.

## Project Structure

```
├── app.py            # Streamlit UI entrypoint
├── database.py       # DuckDB schema and queries
├── api_client.py     # Tweede Kamer OData API client
├── explorer.py       # Explorer page with SVD visualizations
├── pipeline/         # Data ingestion and analysis pipelines
├── analysis/         # SVD, clustering, trajectory modules
├── tests/            # pytest test suite
├── docs/             # Documentation, research, and plans
└── data/motions.db   # DuckDB database (~18 GB)
```

## Documentation

- **[ARCHITECTURE.md](ARCHITECTURE.md)** — Comprehensive architecture overview, tech stack, and contributor guidance
- **[CODE_STYLE.md](CODE_STYLE.md)** — Coding conventions, naming, typing, and testing standards
- **[docs/solutions/](docs/solutions/)** — Documented solutions to past bugs and best practices

## Tech Stack

- **Language:** Python 3.13+
- **Data:** DuckDB via ibis-framework
- **UI:** Streamlit + Plotly
- **ML/Analysis:** scipy, scikit-learn, umap-learn
- **LLM:** QWEN via OpenRouter (OpenAI-compatible)
- **Package Manager:** uv

## Deployment

See [docs/deployment/ansible-package-deploy.md](docs/deployment/ansible-package-deploy.md) for server deployment instructions using the Ansible package.

## License

[Your license here]
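The rewritten README describes a voting compass that shows "which parties align with your choices", but this commit does not show how alignment is scored. As a rough illustration only, the sketch below computes a simple agreement fraction per party. The function names, the `"for"`/`"against"` vote encoding, the motion IDs and the party names are all hypothetical and are not taken from the repository.

```python
from typing import Dict, List, Tuple

# Assumed encoding: each vote is the string "for" or "against".
Votes = Dict[str, str]


def agreement(user: Votes, party: Votes) -> float:
    """Fraction of motions voted on by both sides where the votes coincide."""
    shared = [motion for motion in user if motion in party]
    if not shared:
        return 0.0
    matches = sum(1 for motion in shared if user[motion] == party[motion])
    return matches / len(shared)


def rank_parties(user: Votes, parties: Dict[str, Votes]) -> List[Tuple[str, float]]:
    """Return (party, score) pairs sorted from best to worst match."""
    scores = {name: agreement(user, votes) for name, votes in parties.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data: three motions, two parties.
user_votes = {"motion-1": "for", "motion-2": "against", "motion-3": "for"}
party_votes = {
    "Party A": {"motion-1": "for", "motion-2": "against", "motion-3": "against"},
    "Party B": {"motion-1": "against", "motion-2": "for", "motion-3": "against"},
}

ranking = rank_parties(user_votes, party_votes)
for name, score in ranking:
    print(f"{name}: {score:.0%} agreement")
```

A plain agreement fraction ignores abstentions and motion weighting; the SVD-based analysis mentioned in the README would presumably capture more structure than this.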
@@ -1,51 +1,2 @@
# config.py (complete updated version)
import os
from dataclasses import dataclass
from typing import List


@dataclass
class Config:
    # Database settings
    DATABASE_PATH = "data/motions.db"

    # API settings (updated)
    TWEEDE_KAMER_ODATA_API = "https://gegevensmagazijn.tweedekamer.nl/OData/v4/2.0"
    API_TIMEOUT = 30
    API_BATCH_SIZE = 250  # Increased based on API capabilities
    API_MAX_LIMIT = 250

    # AI settings
    OPENROUTER_API_KEY = os.getenv("OPENROUTER_API_KEY")
    OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"
    QWEN_MODEL = "qwen/qwen-2.5-72b-instruct"

    # App settings
    DEFAULT_MOTION_COUNT = 10
    DEFAULT_WINNING_MARGIN_MIN = (
        0  # % - include all, filter by layman_explanation instead
    )
    DEFAULT_WINNING_MARGIN_MAX = 100  # %
    SESSION_TIMEOUT_DAYS = 30

    # Policy areas
    POLICY_AREAS = [
        "Alle",
        "Economie",
        "Klimaat",
        "Immigratie",
        "Zorg",
        "Onderwijs",
        "Defensie",
        "Sociale Zaken",
        "Algemeen",
    ]

    # Scraper defaults (previously missing)
    BASE_URL = (
        "https://www.tweedekamer.nl/zoeken/zoekresultaten"  # base for scraping motions
    )
    SCRAPING_DELAY = int(os.getenv("SCRAPING_DELAY", "5"))


config = Config()
# Backward-compatibility shim — root config now lives in analysis.config
from analysis.config import Config, config  # noqa: F401
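The config.py hunk replaces the old module with a two-line shim that re-exports `Config` and `config` from `analysis.config`. The sketch below checks that kind of shim in isolation: it writes a throwaway project tree to a temporary directory and confirms that the legacy import path and the new one resolve to the same objects. The stand-in `analysis/config.py` body is an assumption, since the real file is not part of this diff.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for analysis/config.py; the real module is not shown in the diff.
ANALYSIS_CONFIG = '''
class Config:
    DATABASE_PATH = "data/motions.db"


config = Config()
'''

# The shim line as it appears in the commit.
SHIM = "from analysis.config import Config, config  # noqa: F401\n"

MODULE_NAMES = ("config", "analysis.config", "analysis")


def build_layout(root: Path) -> None:
    """Create analysis/config.py plus a root-level config.py shim under root."""
    package = root / "analysis"
    package.mkdir()
    (package / "__init__.py").write_text("")
    (package / "config.py").write_text(ANALYSIS_CONFIG)
    (root / "config.py").write_text(SHIM)


def load_both(root: Path):
    """Import the legacy module and the new module from the temporary tree."""
    sys.path.insert(0, str(root))
    try:
        for name in MODULE_NAMES:
            sys.modules.pop(name, None)  # avoid picking up stale modules
        importlib.invalidate_caches()
        legacy = importlib.import_module("config")
        new = importlib.import_module("analysis.config")
        return legacy, new
    finally:
        sys.path.remove(str(root))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_layout(root)
    legacy, new = load_both(root)
    same_object = legacy.config is new.config
    same_class = legacy.Config is new.Config
    db_path = legacy.config.DATABASE_PATH

# Drop the temporary modules so later imports are unaffected.
for name in MODULE_NAMES:
    sys.modules.pop(name, None)

print(same_object, same_class, db_path)
```

Because the shim re-exports the same objects instead of copying values, existing `from config import config` callers see the same instance as code that imports from `analysis.config`.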