# StemAtlas Deployment — Implementation Plan

**Design:** `thoughts/shared/designs/2026-03-22-stematlas-deployment-design.md`
**Date:** 2026-03-22

---

## Overview

Four batches. Batches A and B are independent of each other and can run in parallel. Batch C requires the pipeline to finish first. Batch D is VPS infrastructure (manual steps, done once).

```
Batch A: stemwijzer repo - Streamlit multi-page + Docker
Batch B: sgeboers.nl repo - blog/, nav, blog post HTML skeleton
Batch C: Charts - generate + embed (after pipeline finishes)
Batch D: VPS infrastructure - Nginx vhost + Certbot + /srv/stematlas/
```

---

## Batch A — stemwijzer repo: Streamlit multi-page + Docker

### A1. Check Dockerfile
Read the existing `Dockerfile` and verify that it installs all dependencies from `pyproject.toml` and sets `CMD` to start the app. Note the current entrypoint (probably `streamlit run app.py`).

### A2. Create `Home.py`
New file at project root. Streamlit landing/about page:
- Title: "StemAtlas"
- Brief description of the two pages (quiz + explorer)
- Links (Streamlit sidebar nav handles the rest automatically)
- `st.page_link()` cards pointing to the two pages

### A3. Create `pages/1_Stemwijzer.py`
Thin wrapper that imports and calls `app.main()`:
- Import `from app import main`, then call `main()`
- Remove the `if __name__ == "__main__": main()` guard from `app.py`, or keep it: Python skips the guarded call when `app` is imported as a module
- The page title shown in Streamlit nav comes from the filename: `1_Stemwijzer` → "Stemwijzer"

### A4. Create `pages/2_Explorer.py`
Same pattern:
- Import `from explorer import run_app`
- Call `run_app()`
- Filename → nav label: "Explorer"

### A5. Update Dockerfile CMD
Change entrypoint from `streamlit run app.py` to `streamlit run Home.py --server.port 8501 --server.address 0.0.0.0`.

### A6. Create `docker-compose.yml`
Two services in the stemwijzer repo:

```yaml
version: "3.9"
services:
  stematlas:
    image: ${DOCKER_REGISTRY}/sgeboers/stemwijzer:latest
    ports:
      - "127.0.0.1:8501:8501"
    volumes:
      - /srv/stematlas/data:/app/data
    restart: unless-stopped
    environment:
      - DB_PATH=/app/data/motions.db

  scheduler:
    image: ${DOCKER_REGISTRY}/sgeboers/stemwijzer:latest
    command: python scheduler.py
    volumes:
      - /srv/stematlas/data:/app/data
    restart: unless-stopped
    environment:
      - DB_PATH=/app/data/motions.db
```

`127.0.0.1:8501`: the port is published on localhost only; Nginx proxies external traffic to it.

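One way to check the loopback binding once the containers are up is a plain TCP probe. The sketch below is not part of the plan: `port_is_open` is a hypothetical helper name, and it only tests reachability, not that Streamlit is the process answering.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Run on the VPS, `port_is_open("127.0.0.1", 8501)` should return `True`. Run from another machine against the VPS's public address, it should return `False` if the binding works as intended.
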
### A7. Smoke test for `Home.py`
Add `tests/test_home_import.py`, following the same pattern as `test_explorer_import.py`. Verify that the `Home` module is importable and that `run_app` or an equivalent callable exists.

### A8. Run tests
`.venv/bin/python -m pytest -q`: all existing and new smoke tests must pass.

### Verification
Run `docker build -t stematlas-local .` locally to confirm the image builds without errors.

---

## Batch B — sgeboers.nl repo: blog/ + nav

> This batch requires access to the sgeboers.nl repo on git.sgeboers.nl.
> Steps below assume the repo is cloned locally.

### B1. Inspect existing site structure
Read `index.html` and any existing CSS files to understand:
- Current nav structure (header? sidebar? footer?)
- CSS class conventions for links/sections
- Any existing page patterns to copy for the blog post

### B2. Create `blog/` directory
Add `blog/index.html`, a minimal blog listing page:
- Title: "Blog"
- One entry: "StemAtlas — Mapping Dutch Democracy" → `blog/stematlas.html`
- Matches existing site style

### B3. Add nav link to main site
Update `index.html` (or whichever file contains the nav) to add a "Blog" link pointing to `/blog/`.

### B4. Create `blog/stematlas.html` skeleton
Full blog post HTML based on `thoughts/blog-post-political-compass.md`:
- Convert markdown to HTML (headings, paragraphs, code blocks, tables)
- Add Plotly CDN `<script>` in `<head>`
- **Chart placeholders**: `<!-- CHART: compass_latest -->`, `<!-- CHART: trajectories -->`, to be filled in Batch C
- Add two CTAs linking to `stematlas.sgeboers.nl`:
  - After compass chart: *"Explore every window interactively →"*
  - At bottom: *"Try the Stemwijzer quiz →"*
- Match existing site CSS (link the same stylesheet)

### B5. Update Drone pipeline (sgeboers.nl repo)
Confirm that the existing `.drone.yml` in sgeboers.nl picks up new files under `blog/` automatically (it should, if it deploys the whole repo root). No changes are needed if the deploy step is already an `rsync` or `cp -r` of the repo.

### Verification
Open `blog/stematlas.html` locally in a browser: the post renders correctly with the chart placeholders still unfilled, and the nav works.

---

## Batch C — Charts: generate + embed (after pipeline finishes ~21:40)

> Requires `data/motions.db` to be unlocked (pipeline complete).

### C1. Run tests
`.venv/bin/python -m pytest -q`: confirm all tests pass now that the DB is free.

### C2. Run fusion pass
```
.venv/bin/python -m pipeline.run_pipeline \
  --db-path data/motions.db \
  --start-date 2019-01-01 --end-date 2025-01-01 \
  --window-size quarterly \
  --skip-metadata --skip-extract --skip-svd --skip-text
```
Fusion only: fills `fused_embeddings` for the new 2019–2021 and 2024 windows.

### C3. Recompute similarity cache
```
.venv/bin/python -c "
from similarity.compute import compute_similarities
import duckdb

# Collect the window ids, then close the read-only connection before recomputing.
conn = duckdb.connect('data/motions.db', read_only=True)
windows = [r[0] for r in conn.execute(\"SELECT DISTINCT window_id FROM fused_embeddings ORDER BY 1\").fetchall()]
conn.close()
for w in windows:
    print(f'Computing {w}...')
    compute_similarities('data/motions.db', w, top_k=20)
"
```

### C4. Generate compass HTML files
```
.venv/bin/python scripts/generate_compass.py \
  --db data/motions.db \
  --out outputs/blog-charts \
  --method pca --pca-residual
```

This produces `outputs/blog-charts/compass_*.html` and `outputs/blog-charts/trajectories_*.html`.

### C5. Extract Plotly snippets
For each chart file, extract the embeddable snippet:
```python
# Run once per chart to get embeddable HTML
import plotly.io as pio
# OR: just strip everything outside <div id="..."> and its <script>
# The generate_compass.py output is self-contained, so use BeautifulSoup or
# manual extraction to get just the div+script block
```

Simpler: modify `generate_compass.py` to add a `--partial` flag that calls `fig.to_html(include_plotlyjs=False, full_html=False)` and writes `.partial.html` files alongside the full ones.

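The internals of `generate_compass.py` are not shown in this plan, so the following is only a sketch of how a `--partial` flag might be wired in. `build_parser` and `write_chart` are hypothetical names, and `fig` is assumed to be any object with a Plotly-style `to_html()` method.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Write compass charts as HTML")
    parser.add_argument("--out", type=Path, required=True, help="output directory")
    parser.add_argument(
        "--partial",
        action="store_true",
        help="also write <name>.partial.html (div+script only, no Plotly bundle)",
    )
    return parser


def write_chart(fig, out_dir: Path, name: str, partial: bool = False) -> list[Path]:
    """Write the full page and, optionally, the embeddable fragment."""
    out_dir.mkdir(parents=True, exist_ok=True)
    full_path = out_dir / f"{name}.html"
    full_path.write_text(fig.to_html(), encoding="utf-8")
    written = [full_path]
    if partial:
        # The fragment relies on the Plotly CDN <script> in the blog post's <head>.
        fragment = fig.to_html(include_plotlyjs=False, full_html=False)
        partial_path = out_dir / f"{name}.partial.html"
        partial_path.write_text(fragment, encoding="utf-8")
        written.append(partial_path)
    return written
```

Writing the fragment next to the full file keeps the standalone charts usable for local inspection while giving C6 a ready-made block to paste.
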
### C6. Fill chart placeholders in blog post
Replace `<!-- CHART: compass_latest -->` and `<!-- CHART: trajectories -->` in `blog/stematlas.html` with the extracted Plotly div+script blocks.

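The replacement can be scripted rather than pasted by hand. This is a minimal sketch, assuming the placeholders keep the `<!-- CHART: name -->` form from B4; `fill_placeholders` is a hypothetical helper, not an existing script in either repo.

```python
import re

# Matches <!-- CHART: compass_latest --> and captures the chart name.
PLACEHOLDER = re.compile(r"<!--\s*CHART:\s*([\w-]+)\s*-->")


def fill_placeholders(html: str, fragments: dict[str, str]) -> str:
    """Replace each chart placeholder comment with its HTML fragment.

    Raises KeyError if the page names a chart that has no fragment, so a
    missing chart fails loudly instead of shipping an empty post.
    """
    def substitute(match: re.Match) -> str:
        return fragments[match.group(1)]

    # A function replacement is inserted literally, so backslashes in the
    # Plotly script are not interpreted as regex escapes.
    return PLACEHOLDER.sub(substitute, html)
```

Called with a mapping such as `{"compass_latest": ..., "trajectories": ...}` built from the `.partial.html` files, it returns the finished page text, which can then be written back to `blog/stematlas.html`.
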
### C7. Update motion count table in blog post
Run SQL to get authoritative counts:
```sql
SELECT strftime(date, '%Y') AS year, COUNT(*) AS motions
FROM motions
GROUP BY year ORDER BY year;
```
Replace the placeholder numbers in the `blog/stematlas.html` table.

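To avoid retyping the numbers, the query result can be turned into table rows directly. The sketch below assumes the DuckDB file and `motions` table used elsewhere in this plan; `fetch_counts` and `rows_to_html` are hypothetical helpers, and the row markup is a guess that should be matched to the real table in the post.

```python
from html import escape

COUNT_QUERY = """
SELECT strftime(date, '%Y') AS year, COUNT(*) AS motions
FROM motions
GROUP BY year ORDER BY year
"""


def fetch_counts(db_path: str) -> list[tuple[str, int]]:
    """Run the count query against the DuckDB file in read-only mode."""
    import duckdb  # imported lazily so the formatter works without it

    conn = duckdb.connect(db_path, read_only=True)
    try:
        return conn.execute(COUNT_QUERY).fetchall()
    finally:
        conn.close()


def rows_to_html(rows) -> str:
    """Render (year, motions) pairs as one <tr> per year."""
    return "\n".join(
        f"<tr><td>{escape(str(year))}</td><td>{count}</td></tr>"
        for year, count in rows
    )
```

`print(rows_to_html(fetch_counts("data/motions.db")))` then yields rows that can be pasted into the table body.
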
### C8. Push sgeboers.nl repo
Commit and push `blog/stematlas.html`, `blog/index.html`, and the nav changes to git.sgeboers.nl; Drone then deploys.

---

## Batch D — VPS infrastructure (manual, one-time)

> SSH into the VPS. Steps are sequential.

### D1. Create data directory
```bash
sudo mkdir -p /srv/stematlas/data
sudo chown $USER:$USER /srv/stematlas/data
```

### D2. Copy `motions.db` to VPS
From local machine:
```bash
rsync -avz --progress data/motions.db user@vps:/srv/stematlas/data/motions.db
```
~3.6 GB transfer; expect it to take a few minutes, depending on upload bandwidth.

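For a file this size it can be worth confirming that the copy arrived intact by comparing checksums on both machines. This is an optional extra, not a step in the original plan; `sha256_of` is a hypothetical helper.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so a multi-GB database never sits in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Run `sha256_of(Path("data/motions.db"))` locally and `sha256_of(Path("/srv/stematlas/data/motions.db"))` on the VPS; the two hex strings should match.
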
### D3. Add Nginx vhost
New file `/etc/nginx/sites-available/stematlas`:
```nginx
server {
    listen 80;
    server_name stematlas.sgeboers.nl;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl;
    server_name stematlas.sgeboers.nl;

    # Let's Encrypt certs (Certbot fills these in)
    ssl_certificate     /etc/letsencrypt/live/stematlas.sgeboers.nl/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/stematlas.sgeboers.nl/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:8501;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_read_timeout 86400;
    }
}
```

Enable: `sudo ln -s /etc/nginx/sites-available/stematlas /etc/nginx/sites-enabled/`

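### D4. Get Let's Encrypt cert
```bash
sudo certbot --nginx -d stematlas.sgeboers.nl
```
(Assumes Certbot is already installed and working for other subdomains on this VPS.)

The 443 block from D3 references certificate files that do not exist until this step succeeds, and nginx cannot load that block without them; check with `sudo nginx -t` before reloading.
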
### D5. First deploy
The Drone pipeline for the stemwijzer repo will handle future deploys. For the first deploy, either:
- Push a commit to trigger Drone, OR
- Manually on VPS: `cd /srv/stematlas && docker-compose pull && docker-compose up -d`

### D6. Verify
- `https://stematlas.sgeboers.nl` → Streamlit loads, shows Home.py
- Both pages accessible from Streamlit nav
- `docker-compose logs stematlas`: no errors

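The first check above can also be scripted. The sketch below only confirms that the URL answers over HTTP; it does not exercise the WebSocket connection Streamlit needs, so a browser check is still required. `http_status` is a hypothetical helper.

```python
import urllib.error
import urllib.request


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status for url, following redirects."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; report their code instead.
        return err.code
```

`http_status("https://stematlas.sgeboers.nl")` should return `200`. Connection failures and TLS errors are deliberately left to raise, since they point at the Nginx or Certbot steps rather than at the app.
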
---

## Dependencies Between Batches

```
A (stemwijzer repo)  ──► D5 (first deploy) ──► D6 (verify)
B (sgeboers.nl repo) ──► C8 (push blog)
C (charts)           ──► C8 (push blog)
D1-D4 (VPS infra)    ──► D5 (first deploy)

Pipeline finish (~21:40) ──► C1 (tests) ──► C2-C7 (charts)
```

Batches A and B are fully independent and can start now.
Batch C waits for the pipeline to finish; C6 onward also needs the B4 skeleton, since that is where the placeholders live.
D1-D4 are VPS-side and independent of code changes; D5 needs the Batch A image.

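The diagram can be read as a small dependency graph. As an illustration only, the sketch below encodes the arrows shown above and groups the steps into waves that can run in parallel; the node labels and the `execution_waves` helper are made up for this example.

```python
from graphlib import TopologicalSorter

# Each key lists the steps that must finish before it can start.
DEPENDS_ON = {
    "D5": {"A", "D1-D4"},
    "D6": {"D5"},
    "C1": {"pipeline"},
    "C2-C7": {"C1"},
    "C8": {"B", "C2-C7"},
}


def execution_waves(graph):
    """Group steps into waves; steps within one wave can run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises CycleError if the plan contradicts itself
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

With the graph above, the first wave contains A, B, D1-D4 and the pipeline run, and C8 is the only step in the last wave.
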
---

## Estimated Effort

| Batch | Tasks | Est. Time |
|-------|-------|-----------|
| A | Multi-page Streamlit + docker-compose | 45 min |
| B | Blog HTML + nav (after inspecting site) | 60 min |
| C | Charts + embed (after pipeline) | 30 min |
| D | VPS infra (manual SSH) | 30 min |
| **Total** | | **~2.75 hours** |