motief/.mindmodel/constraints/20-domain-glossary.yaml

# Domain Glossary
## Rules
- Use consistent domain terms across code and DB: Motion, MP, Party, embedding, window, svd_vector, fused_embedding, similarity_cache, session_id.
## Terms
- Motion: parliamentary motion stored in the `motions` table. Evidence: `CREATE TABLE motions` in database.py (lines ~40-110)
- MP (Member of Parliament): individual whose votes are stored in `mp_votes`. Evidence: `CREATE TABLE mp_votes` in database.py
- Embedding: text embedding stored in the `embeddings` table; fused vectors are stored in `fused_embeddings`.
- SVD vector: reduced-dimensional vector stored in the `svd_vectors` table.
- Window: time window identifier (e.g., "2024-Q1") used across the SVD and fusion pipelines. Evidence: `_generate_windows` in pipeline/run_pipeline.py
- Controversy score: derived field stored on motions as `controversy_score`. Evidence: `insert_motion` in database.py sets `controversy_score`
## Examples / Usage
- pipeline.run_pipeline._generate_windows produces window ids used when storing svd_vectors and fused_embeddings. Evidence: pipeline/run_pipeline.py lines ~1-120
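
A minimal sketch of how window ids of the "2024-Q1" form could be produced. The real `_generate_windows` is not reproduced here, so its signature and behavior may differ: the function names `quarter_window_id` and `generate_quarter_windows`, the date-based arguments, and the assumption that windows are calendar quarters are all hypothetical and inferred only from the example id above.

```python
from datetime import date
from typing import List


def quarter_window_id(d: date) -> str:
    """Return the id of the calendar quarter containing d, e.g. '2024-Q1'.

    Hypothetical helper: assumes windows are calendar quarters.
    """
    quarter = (d.month - 1) // 3 + 1
    return f"{d.year}-Q{quarter}"


def generate_quarter_windows(start: date, end: date) -> List[str]:
    """List window ids from start's quarter through end's quarter, inclusive.

    Hypothetical stand-in for _generate_windows, not the project's code.
    """
    if end < start:
        raise ValueError("end must not precede start")
    windows: List[str] = []
    year, quarter = start.year, (start.month - 1) // 3 + 1
    end_key = (end.year, (end.month - 1) // 3 + 1)
    # Walk quarter by quarter, rolling over to Q1 of the next year after Q4.
    while (year, quarter) <= end_key:
        windows.append(f"{year}-Q{quarter}")
        quarter += 1
        if quarter > 4:
            year, quarter = year + 1, 1
    return windows
```

Generating the id in one place and reusing the same string when writing to `svd_vectors` and `fused_embeddings` keeps the Window term consistent across both tables.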
## Evidence pointers
- database.py: motions, mp_votes, embeddings, fused_embeddings tables (file: database.py)
- pipeline/run_pipeline.py: window generation and pipeline phases (file: pipeline/run_pipeline.py)
## Anti-patterns
- Inconsistent naming of domain terms across modules (e.g., `mp_vote_parties` vs `mp_votes` in database.insert_motion and pipeline extraction). Prefer canonical names that match the DB columns, and use small adapter functions when converting between representations.
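
One possible shape for such an adapter, as a sketch only. It assumes, purely for illustration, that `mp_vote_parties` is an alternate key for the data canonically named `mp_votes`; the mapping `LEGACY_TO_CANONICAL` and the function `to_canonical_keys` are hypothetical names, not part of the codebase.

```python
from typing import Any, Dict

# Hypothetical mapping from non-canonical keys to canonical names.
# Assumption: both keys refer to the same data; verify before using.
LEGACY_TO_CANONICAL: Dict[str, str] = {
    "mp_vote_parties": "mp_votes",
}


def to_canonical_keys(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of record with non-canonical keys renamed.

    Raises ValueError if two input keys map to the same canonical name,
    so conflicting data is never silently overwritten.
    """
    out: Dict[str, Any] = {}
    for key, value in record.items():
        canonical = LEGACY_TO_CANONICAL.get(key, key)
        if canonical in out:
            raise ValueError(f"duplicate data for canonical key {canonical!r}")
        out[canonical] = value
    return out
```

Calling the adapter once at a module boundary (for example, just before a record is handed to the database layer) lets the rest of the code use only the canonical term.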