motief/.mindmodel/constraints/30-clusters.yaml

# Code Clusters / Organization
## Rules
- The repository organizes code into the following clusters (observed):
  - UI / Streamlit: Home.py, pages/, app.py, explorer.py
  - Database & persistence: database.py, config.py
  - ETL / pipeline: pipeline/ (run_pipeline.py, svd_pipeline, text_pipeline, fusion)
  - AI provider & summarization: ai_provider.py, pipeline/..., analysis/
  - Similarity & caching: similarity/*, similarity_cache table in the database
  - API client & scraping: api_client.py, pipeline/fetch_mp_metadata
  - Analysis & visualization: analysis/visualize.py, explorer.py
  - CLI & scheduler: scheduler.py, pipeline/run_pipeline.py
  - Tests & migrations: tests/ (pytest) and database reset helpers
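
The cluster list above can be treated as a lookup from file path to cluster, which is handy when deciding where a new file belongs. The sketch below is illustrative only: the path patterns are copied from the list, but the `CLUSTERS` layout and the `clusters_for` helper are assumptions, not code from the repository. The elided `pipeline/...` entry under AI provider & summarization is left out because its exact paths are not stated.

```python
from fnmatch import fnmatchcase

# Patterns copied from the cluster list; the dict layout is an assumption.
# "similarity_cache table" is a DB object, not a path, so it is not listed.
CLUSTERS = {
    "UI / Streamlit": ["Home.py", "pages/*", "app.py", "explorer.py"],
    "Database & persistence": ["database.py", "config.py"],
    "ETL / pipeline": ["pipeline/*"],
    "AI provider & summarization": ["ai_provider.py", "analysis/*"],
    "Similarity & caching": ["similarity/*"],
    "API client & scraping": ["api_client.py", "pipeline/fetch_mp_metadata*"],
    "Analysis & visualization": ["analysis/visualize.py", "explorer.py"],
    "CLI & scheduler": ["scheduler.py", "pipeline/run_pipeline.py"],
    "Tests & migrations": ["tests/*"],
}


def clusters_for(path: str) -> list[str]:
    """Return every cluster whose patterns match the given repo-relative path."""
    return [
        name
        for name, patterns in CLUSTERS.items()
        if any(fnmatchcase(path, pattern) for pattern in patterns)
    ]


# A file can sit in more than one cluster, as the list itself shows.
explorer_clusters = clusters_for("explorer.py")
orchestrator_clusters = clusters_for("pipeline/run_pipeline.py")
```

Returning a list rather than a single name reflects the overlap in the observed clusters: explorer.py appears under both UI / Streamlit and Analysis & visualization.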
## Examples
### Pipeline orchestrator (cluster: CLI & pipeline)
```python
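from database import MotionDatabase

db_path = "motions.db"  # placeholder path (assumption); use the project's configured location
db = MotionDatabase(db_path)
# The orchestrator then runs these phases in order:
# fetch_mp_metadata, extract_mp_votes, compute svd, ensure_text_embeddings, fuse_for_window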
```
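
The phase order named in the comment above could be driven by a small sequential runner. This is a minimal sketch under stated assumptions, not the actual contents of pipeline/run_pipeline.py: the `run_phases` helper, the `context` dictionary, and the stub steps are all hypothetical, and `compute_svd` is only a label standing in for the "compute svd" phase.

```python
from typing import Callable

# A phase is a (name, callable) pair; the callable receives a shared context dict.
Phase = tuple[str, Callable[[dict], None]]


def run_phases(phases: list[Phase], context: dict) -> list[str]:
    """Run each phase in order and return the names of the phases that completed.

    An exception in any phase propagates and stops the run, so later
    phases never see partial input.
    """
    completed = []
    for name, step in phases:
        step(context)
        completed.append(name)
    return completed


def make_stub(name: str) -> Callable[[dict], None]:
    """Build a placeholder step that only records that it ran (hypothetical)."""
    def step(context: dict) -> None:
        context.setdefault("log", []).append(name)
    return step


# Phase names taken from the example above; the real steps are not shown here.
PHASE_NAMES = [
    "fetch_mp_metadata",
    "extract_mp_votes",
    "compute_svd",
    "ensure_text_embeddings",
    "fuse_for_window",
]

context: dict = {}
completed = run_phases([(n, make_stub(n)) for n in PHASE_NAMES], context)
```

Keeping the phases in an explicit ordered list means a new pipeline stage is added in one place, which is the kind of guidance a contributor note could point to.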
## Remediations
- Add a brief CONTRIBUTING.md that describes where to add new pipeline stages and how to run the tests locally. Include notes about the optional duckdb dependency and the JSON fallback used by tests.
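
One way a CONTRIBUTING.md could illustrate the optional duckdb dependency and JSON fallback is with a small detection pattern. How the repository actually implements the fallback is not shown in this file, so everything below is an assumption: `choose_backend`, `JsonStore`, and the `motions.json` file name are hypothetical names used only to show the general pattern.

```python
import importlib.util
import json
import tempfile
from pathlib import Path


def choose_backend() -> str:
    """Return "duckdb" when the optional package is importable, else "json"."""
    return "duckdb" if importlib.util.find_spec("duckdb") is not None else "json"


class JsonStore:
    """Hypothetical fallback store that keeps all rows in a single JSON file."""

    def __init__(self, path) -> None:
        self.path = Path(path)

    def save(self, rows: list) -> None:
        self.path.write_text(json.dumps(rows), encoding="utf-8")

    def load(self) -> list:
        # A missing file is treated as an empty store.
        if not self.path.exists():
            return []
        return json.loads(self.path.read_text(encoding="utf-8"))


# Round-trip a row through the JSON fallback in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    store = JsonStore(Path(tmp) / "motions.json")
    store.save([{"id": 1, "title": "example"}])
    roundtrip = store.load()
```

Checking with `importlib.util.find_spec` avoids importing duckdb just to learn whether it is installed, so tests can pick the JSON path without a try/except around the import.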
## Evidence pointers
- pipeline/run_pipeline.py: orchestrator and cluster boundaries (file: pipeline/run_pipeline.py)
- ai_provider.py: AI adapter for embeddings and chat (file: ai_provider.py)
- analysis/visualize.py: visualization cluster (file: analysis/visualize.py)