---
date: 2026-04-16
topic: "political-compass-blog-update"
status: draft
---

## Problem Statement

We need the "political compass" blog post under thoughts/ to show figures and numbers that exactly match the repository's canonical pipeline outputs. That requires producing reproducible assets (scree plots, party-agreement CSVs and heatmaps) from the codebase, placing them in docs/research, and making minimal edits to the blog HTML to reference those files.

**Key constraint:** All numbers and figures must come from the canonical functions or the authoritative DB (data/motions.db). No invented values.

## Constraints

**Non-negotiables:**

- Use canonical functions (analysis.political_axis.compute_svd_spectrum, analysis.explorer_data.load_scree_data) as data sources.
- Place generated files under **docs/research/** with reproducible, deterministic filenames.
- Keep blog edits minimal and reversible: swap the markdown table for an HTML table and insert image references and CSV links.

**Operational constraints:**

- Plotly SVG export requires kaleido; provide a reliable matplotlib fallback.
- data/motions.db must contain the required rows (e.g. singular_values); otherwise we must run compute_svd_spectrum first.

## Approach (chosen)

I'm choosing a single, pragmatic approach that balances reproducibility, low-risk changes, and minimal new dependencies.

**Chosen approach:** write a small export script (scripts/export_blog_assets.py) that:

- Calls **analysis.political_axis.compute_svd_spectrum(db_path)** for the multi-window scree and **analysis.explorer_data.load_scree_data(db_path)** for the current_parliament scree fallback.
- Re-uses the explorer._render_scree_plot logic (or extracts the Plotly-building code into a helper) to build a Plotly Figure and export SVG via **fig.write_image(..., format='svg')** when kaleido is available.
- Falls back to matplotlib-based rendering if fig.write_image fails.
- Computes pairwise party agreement / GL–PvdA trajectory using SQL and the logic from scripts/generate_extra_charts.py, writes CSV with pandas.DataFrame.to_csv(...), and writes a heatmap SVG to docs/research.
- Writes assets with deterministic filenames into **docs/research/** and prints/returns the exact paths and the key numeric values (EVR% for caption).

Why this approach:

- It uses the canonical functions already present in the codebase, so numbers match the UI and tests.
- It keeps edits limited to a single script and the blog HTML, making review and rollback trivial.
- It provides a clear fallback for environments without kaleido.

Alternatives considered (brief):

1) Modify existing scripts (scripts/generate_extra_charts.py) to write into docs/research.
   - Pro: reuses plotting code directly.
   - Con: those scripts are opinionated about output layout and write HTML, not SVG/CSV; harder to keep the change minimal.
2) Recompute everything via pipeline.run_pipeline and copy pipeline outputs to docs/research.
   - Pro: purely canonical pipeline outputs.
   - Con: heavier; a pipeline run may be slow and more intrusive, and needs more environment setup.

I rejected them because the export-script approach is lighter, reproducible, and gives explicit control over filenames and fallbacks.

## Architecture

High-level: a small command-line script (scripts/export_blog_assets.py) driven by the canonical DB, the analysis layer, and the visualize helpers.

**Major pieces:**

- **Exporter script**: orchestrates reads from DB, computes metrics, builds figures, writes CSV/SVG into docs/research.
- **Canonical analysis functions**: analysis.political_axis and analysis.explorer_data (data source only, no side effects).
- **Plot builders**: reuse of explorer._render_scree_plot / analysis.visualize helpers to produce Plotly Figure objects.
- **Fallback renderer**: minimal matplotlib routines producing PNG/SVG if Plotly image export fails.
- **Blog edit**: minimal HTML changes in thoughts/blog-post-political-compass.html to reference the generated assets.

## Components and Responsibilities

**scripts/export_blog_assets.py** (new)

- Inputs: path to DB (default data/motions.db), optional --window (e.g. 2023Q3 or 'current_parliament'), output directory (default docs/research).
- Responsibilities:
  - Run compute_svd_spectrum(db_path) and/or load_scree_data(db_path).
  - Build scree Plotly figures and export SVGs (multi-window and current_parliament).
  - Compute party agreement matrices, export CSVs and heatmap SVGs for requested window(s).
  - Print the EVR numbers and paths for copying into blog captions.
  - Exit non-zero on fatal errors (missing DB, empty results) with clear messages.

**Explorer / analysis helpers**

- analysis.political_axis.compute_svd_spectrum(db_path): canonical EVR source for multi-window scree.
- analysis.explorer_data.load_scree_data(db_path): canonical loader for current_parliament scree (fallback).
- explorer._render_scree_plot(importances): returns a Plotly figure in Streamlit; reuse the building logic to return a Figure for export.

**Fallback renderer**

- Minimal matplotlib code that takes the EVR vector, draws a bar/scree-like chart, and saves it as SVG/PNG.

**Blog file edits**

- thoughts/blog-post-political-compass.html: replace the markdown pipe table with an HTML table and insert image references plus CSV links.

## Data Flow

1. Exporter reads data from **data/motions.db**.
2. Calls compute_svd_spectrum(db_path) to get multi-window EVR arrays.
3. Calls load_scree_data(db_path) to get 'current_parliament' singular values if available.
4. Builds Plotly Figures for scree plots (multi-window and current_parliament).
5. Exports Figures to **docs/research/*.svg** (uses fig.write_image when kaleido is present, otherwise matplotlib fallback).
6. Computes party agreement matrices via the SQL used in scripts/generate_extra_charts.py, writes CSVs to **docs/research/**.
7. Writes a party-heatmap SVG to **docs/research/**.
8. The blog HTML references those files via relative paths (../docs/research/...).

## Error Handling Strategy

**Fail early with informative messages.**

- If the DB is missing or unreadable: exit with a clear error and a suggestion to run the pipeline or point --db to a valid file.
- If compute_svd_spectrum returns empty / no windows: print guidance to run scripts/recompute_svd.py or pipeline.run_pipeline and exit non-zero.
- If Plotly image export fails (kaleido missing): log the error, attempt the matplotlib fallback, and continue.
- If a CSV or SVG write fails due to IO permissions: log the path and permission error and exit non-zero (don't silently drop assets).

All non-fatal warnings are printed with suggested remediation steps.

## Testing Strategy

Local verification steps (automated script + manual checks):

- Unit smoke: run scripts/export_blog_assets.py --db data/motions.db --dry-run to verify the functions produce non-empty arrays and print expected output paths.
- Functional: run the script to produce assets and assert files exist: docs/research/scree_multiwindow.svg, docs/research/scree_current_parliament.svg, docs/research/party_agreement_.csv, docs/research/party_agreement_.svg.
- Sanity numbers: the script prints the top EVR values used in captions. Cross-check the printed EVR against explorer UI numbers (run explorer locally if needed).
- Blog preview: open thoughts/blog-post-political-compass.html in a browser (file://) and confirm images render and captions match the printed numbers.

Add a basic test under tests/ that runs the exporter against a small fixture DB (or a tmp DB produced from tests/test_political_compass.py fixtures) to assert the script creates at least the CSV and a PNG/SVG.

## Effort Estimate & Schedule

- Draft exporter script and fallback renderer: 2–3 hours.
- Wire up SQL for party agreement and CSV export: 1 hour.
- Run and verify assets locally (including possible compute_svd if DB missing): 30–60 minutes.
- Blog HTML edits and quick preview: 30 minutes.
- Add a minimal test + docs: 1 hour.

Total: ~5–6 hours of focused work (assuming data/motions.db is present and reasonably up-to-date). If compute_svd must be run across many windows or pipeline.run_pipeline is required, add 30–90 minutes.

## Risks & Mitigations

- **Missing singular_values row for current_parliament.** Mitigation: script detects and runs compute_svd_spectrum or instructs operator to run scripts/recompute_svd.py.
- **Kaleido not installed causing fig.write_image to fail.** Mitigation: implement matplotlib fallback and print clear message recommending pip install kaleido.
- **DB schema drift or missing party ids.** Mitigation: script validates expected tables/columns and fails with actionable message.
- **Assets not committed to git.** Mitigation: recommend the maintainer commit the generated files; optionally script can print a git add/commit suggestion but must not auto-commit without user request.

## Open Questions

- Which specific window id(s) do we want for the GL–PvdA CSV/heatmap? (I'll default to 'current_parliament' and allow an explicit --window flag.)
- Should the script auto-commit generated assets to git, or should it stop and ask human to commit? (I recommend manual commit.)

---

I'm proceeding to create the design doc. Interrupt if you want changes.
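
For reference, the export-with-fallback behaviour described under Approach and Error Handling could look roughly like the sketch below. It assumes the EVR vector is already available as a plain list of fractions (for example 0.41 for 41%); how compute_svd_spectrum actually shapes its return value is not specified here. The names export_scree_svg, render_with_plotly and render_with_matplotlib are hypothetical and are not existing helpers in the repo.

```python
from pathlib import Path


def render_with_plotly(evr, out_path):
    """Hypothetical primary renderer: needs plotly plus kaleido."""
    # Imported lazily so a missing package triggers the fallback
    # instead of breaking the whole script at import time.
    import plotly.graph_objects as go

    components = list(range(1, len(evr) + 1))
    fig = go.Figure(go.Bar(x=components, y=[100 * v for v in evr]))
    fig.update_layout(xaxis_title="Component",
                      yaxis_title="Explained variance (%)")
    # Raises when the image-export backend (kaleido) is unavailable.
    fig.write_image(str(out_path), format="svg")


def render_with_matplotlib(evr, out_path):
    """Hypothetical fallback renderer: plain matplotlib bar chart."""
    import matplotlib
    matplotlib.use("Agg")  # headless backend, no display needed
    import matplotlib.pyplot as plt

    components = list(range(1, len(evr) + 1))
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.bar(components, [100 * v for v in evr])
    ax.set_xlabel("Component")
    ax.set_ylabel("Explained variance (%)")
    fig.savefig(str(out_path), format="svg")
    plt.close(fig)


def export_scree_svg(evr, out_path,
                     renderers=(render_with_plotly, render_with_matplotlib)):
    """Try each renderer in order; return the name of the one that worked."""
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    failures = []
    for render in renderers:
        try:
            render(evr, out_path)
            return render.__name__
        except Exception as exc:  # ImportError, missing kaleido, etc.
            failures.append(f"{render.__name__}: {exc}")
            print(f"warning: {failures[-1]}")
    # Every renderer failed: surface all reasons so the caller can exit non-zero.
    raise RuntimeError("all renderers failed: " + "; ".join(failures))
```

Returning the renderer name lets the script report which path produced the file next to the output path. Passing the renderers as a parameter also makes the fallback order testable without installing either plotting library.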