Use one DuckDB write connection for the entire update loop instead of
opening/closing per row, wrapped in try/finally for proper cleanup.
Move 'import duckdb' to module level with other imports.
Enable backfilling body_text for existing motions that lack it (2016-2018 data).
New extract_besluit_id() and update_existing_motions() helpers support the
--update-existing mode, while --no-skip-details enables detail fetching during
normal downloads. Includes 7 tests covering URL parsing, DB update flow, and
argparse wiring.
Removes the raw_title[:80] cap on expander labels so full titles show.
Adds scripts/generate_svd_json.py to regenerate top_svd_top_motions.json
from any SVD window after a recompute.
- Add 4 migration files: mp_votes, mp_metadata, svd_vectors, fused_embeddings
- Extend database.py with 5 new helper methods and table init
- Add pipeline/ package: extract_mp_votes, fetch_mp_metadata, text_pipeline,
svd_pipeline (with Procrustes alignment), fusion
- Add full test suite (17 tests) covering all pipeline modules and migrations
- Fix Procrustes alignment bug: scipy scale is a norm value, not a multiplier
- Fix DuckDB date type handling in test assertions (datetime.date vs string)
- Remove duckdb.py shim; tests now run against real duckdb + scipy via uv
Ref: thoughts/shared/plans/2026-03-21-parliamentary-embedding-pipeline-plan.md