---
date: 2026-04-04
topic: explorer-refactor
---

# Explorer.py Refactor: Extract to analysis/

## Problem Frame

`explorer.py` is 3715 lines long and contains 39 functions that mix three concerns:
- Data loading (DuckDB queries)
- Business logic (SVD projections, trajectory alignment)
- UI rendering (Streamlit components)

This makes the file:
- Hard to navigate (no clear boundaries)
- Hard to test (requires Streamlit + DuckDB)
- Hard to review (changes affect everything)

**Goal**: Improve navigability by extracting computation-heavy logic to `analysis/`, leaving `explorer.py` as a UI orchestration layer.

## Requirements

### Data Layer

- **R1.1**: Create `analysis/explorer_data.py` containing all data loading functions currently in `explorer.py`:
  - `get_available_windows()`
  - `get_uniform_dim_windows()`
  - `load_positions()`
  - `load_party_map()`
  - `load_active_mps()`
  - `load_party_axis_scores()`
  - `load_party_scores_all_windows()`
  - `load_party_scores_all_windows_aligned()`
  - `load_party_mp_vectors()`
  - `load_scree_data()`
  - `load_motions_df()`

- **R1.2**: All extracted functions must be callable without Streamlit imports (no `@st.cache_data`, no `st.*` calls)

- **R1.3**: Functions return plain data structures (DataFrames, dicts, lists), not Plotly figures

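As a rough illustration of R1.2 and R1.3, a loader in `analysis/explorer_data.py` could look like the sketch below. The `conn` parameter, the `positions` table, and the `window_id` column are all assumptions made for this example; the real function may obtain its connection from `MotionDatabase` and keep its current signature. An in-memory SQLite database stands in for DuckDB here, since both expose `execute(...).fetchall()`.

```python
import sqlite3
from typing import Any


def get_available_windows(conn: Any) -> list[str]:
    """Return window identifiers in sorted order, with no Streamlit dependency.

    `conn` is any object exposing `execute(sql).fetchall()`. The `positions`
    table and `window_id` column are hypothetical names for this sketch.
    """
    rows = conn.execute(
        "SELECT DISTINCT window_id FROM positions ORDER BY window_id"
    ).fetchall()
    # Unpack single-column rows into a plain list (R1.3: plain data, no figures).
    return [row[0] for row in rows]


# Demo: an in-memory SQLite database stands in for DuckDB.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE positions (window_id TEXT, mp_id INTEGER)")
demo_conn.executemany(
    "INSERT INTO positions VALUES (?, ?)",
    [("2023-Q2", 1), ("2023-Q1", 1), ("2023-Q1", 2)],
)
windows = get_available_windows(demo_conn)
```

Because the function never touches `st.*`, it can be imported and called from a plain Python process or a test runner.
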
### Business Logic Layer

- **R2.1**: Move computation functions to `analysis/` modules based on domain:
  - `_should_swap_axes()`, `_swap_axes()` → `analysis/axis_utils.py` (new)
  - `compute_party_discipline()` → `analysis/trajectories.py`
  - Trajectory computation functions → `analysis/trajectories.py`
  - SVD projection functions → `analysis/svd_labels.py` or new `analysis/projections.py`

- **R2.2**: Computations must be pure functions (no IO, deterministic outputs)

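To make R2.2 concrete, here is a minimal sketch of what "pure" means for an axis helper. The actual signature of `_swap_axes()` is not shown in this document, so the function below is a hypothetical stand-in that operates on a list of `(x, y)` tuples.

```python
from typing import Sequence

Point = tuple[float, float]


def swap_axes(points: Sequence[Point]) -> list[Point]:
    """Return a new list with x and y exchanged; the input is left untouched."""
    # No IO and no hidden state: the output depends only on the argument.
    return [(y, x) for x, y in points]


original = [(1.0, 2.0), (3.0, 4.0)]
swapped = swap_axes(original)
```

The function builds a new list instead of mutating its argument, so calling it twice with the same input always yields the same result and it can be tested without a database.
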
### UI Layer (explorer.py)

- **R3.1**: `explorer.py` becomes a thin orchestration layer:
  - Imports from `analysis/explorer_data.py` for data
  - Imports from `analysis/` modules for computations
  - Contains only Streamlit UI code and `@st.cache_data` wrappers

- **R3.2**: Render functions (`_render_*`) stay in `explorer.py` (they're UI-only)

- **R3.3**: Tab-building functions (`build_*_tab()`) stay in `explorer.py` but delegate to imported functions

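The wrapper pattern in R3.1 might look roughly like the following. To keep the sketch runnable without Streamlit, `functools.lru_cache` is swapped in for `@st.cache_data`; the function names, the `window_id` parameter, and the returned data are assumptions for illustration only.

```python
from functools import lru_cache


def load_party_map_uncached(window_id: str) -> dict[str, str]:
    """Stand-in for a function imported from analysis/explorer_data.py."""
    # Hypothetical data; the real function would query the database.
    return {"mp-1": "Party A", "mp-2": "Party B"}


# In explorer.py this decorator would be @st.cache_data. lru_cache is used
# here only so that the sketch runs without Streamlit installed.
@lru_cache(maxsize=None)
def load_party_map(window_id: str) -> dict[str, str]:
    """Thin cached wrapper: no logic of its own, it only delegates."""
    return load_party_map_uncached(window_id)


first = load_party_map("2023-Q1")   # computed
second = load_party_map("2023-Q1")  # served from the cache
```

One difference worth keeping in mind: `lru_cache` hands back the same object on every hit, whereas `st.cache_data` returns a copy of the cached value, so the stand-in is only a structural illustration.
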
### Import Safety

- **R4.1**: New `analysis/` modules must not import from `explorer.py` (no circular dependencies)

- **R4.2**: `analysis/explorer_data.py` may import from `database.py` (already exists)

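One possible way to check R4.1 mechanically is to scan a module's source for imports of `explorer` using the standard `ast` module. This is only a sketch: the helper name `imports_module` is made up for this example, and it does not resolve relative imports.

```python
import ast


def imports_module(source: str, forbidden: str) -> bool:
    """Return True if `source` imports `forbidden` or one of its submodules."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # `node.module` is None for `from . import x`; treat it as empty.
            names = [node.module or ""]
        else:
            continue
        if any(n == forbidden or n.startswith(forbidden + ".") for n in names):
            return True
    return False


clean = "import pandas as pd\nfrom database import MotionDatabase\n"
circular = "from explorer import load_positions\n"
helper_only = "import explorer_helpers\n"
```

Here `imports_module(circular, "explorer")` is true, while the `clean` and `helper_only` sources pass, because the match is on the exact module name and not on a prefix such as `explorer_helpers`.
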
### Testing

- **R5.1**: Extracted data functions should be testable with mocked DuckDB connections

- **R5.2**: Extracted computation functions should be pure and testable without database

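A test in the spirit of R5.1 could replace the connection with a `unittest.mock.MagicMock`. The loader below is a hypothetical version of `load_active_mps()` that accepts the connection as an argument; the `conn` parameter and the `active_mps` table are assumptions of this sketch, not the existing code.

```python
from unittest.mock import MagicMock


def load_active_mps(conn) -> list[str]:
    """Hypothetical loader: return the identifiers of active MPs."""
    rows = conn.execute("SELECT mp_id FROM active_mps").fetchall()
    return [row[0] for row in rows]


# The mock mimics the `execute(...).fetchall()` chain of a DuckDB connection.
mock_conn = MagicMock()
mock_conn.execute.return_value.fetchall.return_value = [("mp-1",), ("mp-2",)]

active = load_active_mps(mock_conn)
```

No database file or Streamlit session is involved, and the mock also records the query, so a test can assert that `execute` was called exactly once.
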
## Success Criteria

- `explorer.py` reduced to under 1500 lines (from 3715)
- No function in `explorer.py` exceeds 100 lines
- Clear module boundaries: data → computation → UI
- All extracted functions have docstrings with type hints
- No circular imports between `analysis/` and `explorer.py`

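The 100-line limit could be checked with a small script built on `ast`. The helper below is a sketch with a made-up name (`long_functions`); it measures each function from its `def` line to its last line, which is one reasonable reading of the criterion.

```python
import ast


def long_functions(source: str, max_lines: int = 100) -> list[tuple[str, int]]:
    """Return (name, length) for every function longer than `max_lines`."""
    offenders = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # lineno is the `def` line; end_lineno is the last line of the body.
            length = node.end_lineno - node.lineno + 1
            if length > max_lines:
                offenders.append((node.name, length))
    return offenders


sample = (
    "def short():\n"
    "    return 1\n"
    "\n"
    "def longer():\n"
    "    a = 1\n"
    "    b = 2\n"
    "    return a + b\n"
)
# A limit of 3 is used only to make the tiny sample trigger the check.
report = long_functions(sample, max_lines=3)
```

Run against the text of `explorer.py` with the default limit, an empty result would indicate that the criterion is met.
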
## Scope Boundaries

**Included:**
- Data loading functions
- Computation/transformation logic
- Clear separation of concerns

**Excluded:**
- UI rendering functions (they can stay in `explorer.py`)
- Database schema changes
- New features or behavior changes
- Test suite updates (handled separately)

## Key Decisions

- **Domain-based splitting**: Computation goes to the relevant `analysis/` module, not all to one file
- **Import direction**: `explorer.py` imports from `analysis/`, never vice versa
- **Preserve function signatures**: Refactoring shouldn't change public APIs

## Dependencies / Assumptions

- `database.py` provides the `MotionDatabase` singleton; data functions will use it
- The `explorer_helpers.py` pattern is already established; follow its conventions
- Streamlit caching (`@st.cache_data`) stays in `explorer.py` as the orchestration layer

## Outstanding Questions

### Deferred to Planning
- [ ] [Implementation] Should `_load_mp_vectors_by_party()` and variants be merged or kept separate?
- [ ] [Implementation] Should we create `analysis/projections.py` or extend existing `analysis/axis_classifier.py`?
- [ ] [Implementation] How should `_cached_bootstrap_cis()` be handled: move it to `analysis/` or keep it as a cache wrapper?

## Next Steps

→ `/ce:plan` for structured implementation planning