Sven Geboers
a78bee9b0a
feat(similarity): add precomputed similarity cache, fix fusion N+1, add 429 retry
- Add similarity/ package (compute.py, lookup.py) with numpy-based
pairwise cosine similarity and cached lookup
- database.py: create embeddings + similarity_cache tables in _init_database(),
add store_similarity_batch/get_cached_similarities/clear_similarity_cache helpers
- pipeline/fusion.py: replace N+1 per-motion embedding SELECT with single
bulk JOIN using DuckDB QUALIFY window function
- ai_provider.py: retry HTTP 429 with Retry-After header support
- migrations/2026-03-22-add-similarity-cache.sql: make executable
- Add tests for similarity compute, db helpers, and 429 retry (34 pass, 2 skip)
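
The 429 handling described in the commit message above could look roughly like the following. This is a minimal sketch, not the code in ai_provider.py: the names `parse_retry_after` and `send_with_retry`, the `(status, headers, body)` response shape, and the exponential fallback delay are all assumptions made for illustration. It uses only the standard library and handles both forms of `Retry-After` (a number of seconds or an HTTP date).

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Callable, Mapping, Optional, Tuple

# Hypothetical response shape: (status code, headers, body).
Response = Tuple[int, Mapping[str, str], bytes]


def parse_retry_after(value: Optional[str],
                      now: Optional[datetime] = None) -> Optional[float]:
    """Return the delay in seconds requested by a Retry-After value, or None."""
    if not value:
        return None
    value = value.strip()
    if value.isdigit():
        # Delay-seconds form, e.g. "Retry-After: 120".
        return float(value)
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def send_with_retry(send: Callable[[], Response],
                    max_retries: int = 3,
                    base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() and retry on HTTP 429, honouring Retry-After when present."""
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        if status != 429 or attempt == max_retries:
            return status, headers, body
        # Plain dict lookup is case-sensitive; real HTTP clients usually
        # expose case-insensitive header mappings.
        delay = parse_retry_after(headers.get("Retry-After"))
        if delay is None:
            # Assumed fallback: exponential backoff when no usable header.
            delay = base_delay * (2 ** attempt)
        sleep(delay)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the retry loop testable without real waiting; a test can substitute `list.append` and inspect the recorded delays.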
Latest commit: 1 month ago

| File | Last commit | Updated |
| --- | --- | --- |
| __init__.py | feat(pipeline): implement parliamentary embedding pipeline MVP | 1 month ago |
| extract_mp_votes.py | fix(pipeline): fix API pagination, add skip_details fast path, bulk mp_votes insert | 1 month ago |
| fetch_mp_metadata.py | feat(analysis): fetch real MP metadata, fix anchor axis for party-level actors | 1 month ago |
| fusion.py | feat(similarity): add precomputed similarity cache, fix fusion N+1, add 429 retry | 1 month ago |
| run_pipeline.py | fix(pipeline): fix API pagination, add skip_details fast path, bulk mp_votes insert | 1 month ago |
| svd_pipeline.py | feat(pipeline): implement parliamentary embedding pipeline MVP | 1 month ago |
| text_pipeline.py | feat(pipeline): implement parliamentary embedding pipeline MVP | 1 month ago |