| date | topic |
|---|---|
| 2026-04-04 | explorer-refactor |
Explorer.py Refactor: Extract to analysis/
Problem Frame
explorer.py is 3715 lines with 39 functions mixing:
- Data loading (DuckDB queries)
- Business logic (SVD projections, trajectory alignment)
- UI rendering (Streamlit components)
This makes the file:
- Hard to navigate (no clear boundaries)
- Hard to test (requires Streamlit + DuckDB)
- Hard to review (changes affect everything)
Goal: Improve navigability by extracting computation-heavy logic to analysis/, leaving explorer.py as a UI orchestration layer.
Requirements
Data Layer
- R1.1: Create `analysis/explorer_data.py` containing all data loading functions currently in `explorer.py`:
  - `get_available_windows()`
  - `get_uniform_dim_windows()`
  - `load_positions()`
  - `load_party_map()`
  - `load_active_mps()`
  - `load_party_axis_scores()`
  - `load_party_scores_all_windows()`
  - `load_party_scores_all_windows_aligned()`
  - `load_party_mp_vectors()`
  - `load_scree_data()`
  - `load_motions_df()`
- R1.2: All extracted functions must be callable without Streamlit imports (no `@st.cache_data`, no `st.*` calls)
- R1.3: Functions return pure Python data structures (DataFrames, dicts, lists) - no Plotly figures
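
A minimal sketch of what a loader satisfying R1.1 to R1.3 could look like. The table name `mp_positions`, its columns, and the `(con, window_id)` signature are assumptions made for illustration only; the real `load_positions()` signature is not shown in this document and should be preserved as it is. The connection is typed as anything exposing `execute(query, parameters).fetchall()`, which is the shape a DuckDB Python connection provides.

```python
from typing import Any, Protocol, Sequence


class QueryResult(Protocol):
    def fetchall(self) -> list[tuple]: ...


class Connection(Protocol):
    """Minimal connection shape; a DuckDB connection satisfies it."""

    def execute(self, query: str, parameters: Sequence[Any] = ...) -> QueryResult: ...


def load_positions(con: Connection, window_id: str) -> list[dict[str, Any]]:
    """Return one dict per MP for the given window.

    Hypothetical schema: mp_positions(window_id, mp_id, x, y).
    No Streamlit import, no caching, no plotting: the caller decides
    how to cache and render the result.
    """
    rows = con.execute(
        "SELECT mp_id, x, y FROM mp_positions WHERE window_id = ? ORDER BY mp_id",
        [window_id],
    ).fetchall()
    return [{"mp_id": mp_id, "x": x, "y": y} for mp_id, x, y in rows]
```

Passing the connection in as an argument is one possible design; it keeps the function importable without Streamlit and makes it easy to substitute a fake connection in tests.
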
Business Logic Layer
- R2.1: Move computation functions to `analysis/` modules based on domain:
  - `_should_swap_axes()`, `_swap_axes()` → `analysis/axis_utils.py` (new)
  - `compute_party_discipline()` → `analysis/trajectories.py`
  - Trajectory computation functions → `analysis/trajectories.py`
  - SVD projection functions → `analysis/svd_labels.py` or new `analysis/projections.py`
- R2.2: Computations must be pure functions (no IO, deterministic outputs)
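
To illustrate what "pure" means in R2.2, here is a small sketch. The `Position` type and the `swap_xy()` function are hypothetical and are not the actual `_swap_axes()` implementation, whose behavior this document does not describe. The point is only the shape: data in, new data out, no database access, no mutation of the input.

```python
from typing import NamedTuple


class Position(NamedTuple):
    """Hypothetical 2D position of one MP."""

    mp_id: str
    x: float
    y: float


def swap_xy(positions: list[Position]) -> list[Position]:
    """Return a new list with x and y exchanged; the input list is left untouched."""
    return [Position(p.mp_id, p.y, p.x) for p in positions]
```

Because the function depends only on its arguments, applying it twice returns the original positions, which is the kind of property a database-free unit test can check directly.
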
UI Layer (explorer.py)
- R3.1: `explorer.py` becomes a thin orchestration layer:
  - Imports from `analysis/explorer_data.py` for data
  - Imports from `analysis/` modules for computations
  - Contains only Streamlit UI code and `@st.cache_data` wrappers
- R3.2: Render functions (`_render_*`) stay in `explorer.py` (they're UI-only)
- R3.3: Tab-building functions (`build_*_tab()`) stay in `explorer.py` but delegate to imported functions
Import Safety
- R4.1: New `analysis/` modules must not import from `explorer.py` (no circular dependencies)
- R4.2: `analysis/explorer_data.py` may import from `database.py` (already exists)
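
R4.1 can be checked mechanically. The sketch below uses the standard-library `ast` module to list imports whose top-level package matches a forbidden name. The function name `forbidden_imports()` is hypothetical, and the check only covers absolute imports; relative imports such as `from . import explorer` would need separate handling.

```python
import ast


def forbidden_imports(source: str, forbidden_root: str = "explorer") -> list[str]:
    """Return imported module names in `source` whose top-level package is `forbidden_root`."""
    hits: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import a, b.c` yields one alias per module.
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Absolute `from x.y import z`; relative imports (level > 0) are skipped.
            names = [node.module]
        else:
            continue
        hits.extend(name for name in names if name.split(".")[0] == forbidden_root)
    return hits
```

One way to use it would be to read each `.py` file under `analysis/` and assert that the returned list is empty. Comparing only the first dotted component means `explorer_helpers` is not flagged.
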
Testing
- R5.1: Extracted data functions should be testable with mocked DuckDB connections
- R5.2: Extracted computation functions should be pure and testable without database
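
As a sketch of R5.1, the test below replaces the connection with `unittest.mock.MagicMock`. The loader body, the `windows` table, and the window ids are made up for the example; only the function name comes from the list in R1.1, and its real signature may differ.

```python
from unittest.mock import MagicMock


def get_available_windows(con) -> list[str]:
    """Hypothetical loader: return window ids in ascending order."""
    rows = con.execute(
        "SELECT DISTINCT window_id FROM windows ORDER BY window_id"
    ).fetchall()
    return [row[0] for row in rows]


def test_get_available_windows_reads_from_connection() -> None:
    con = MagicMock()
    # con.execute(...) returns a mock whose fetchall() yields canned rows.
    con.execute.return_value.fetchall.return_value = [("2019-Q1",), ("2019-Q2",)]

    assert get_available_windows(con) == ["2019-Q1", "2019-Q2"]
    con.execute.assert_called_once()
```

No DuckDB file and no Streamlit runtime are needed to run this test, which is what the extraction is meant to enable.
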
Success Criteria
- explorer.py reduced to under 1500 lines (from 3715)
- No function in explorer.py exceeds 100 lines
- Clear module boundaries: data → computation → UI
- All extracted functions have docstrings with type hints
- No circular imports between `analysis/` and `explorer/`
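
The 100-line limit above can be measured rather than eyeballed. The following is a rough sketch using the standard-library `ast` module; `functions_over_limit()` is a hypothetical helper, and it counts from the `def` line to the last line of the body, so decorators are not included in the length.

```python
import ast


def functions_over_limit(source: str, max_lines: int = 100) -> list[tuple[str, int]]:
    """Return (name, length) for every function in `source` longer than `max_lines`."""
    offenders: list[tuple[str, int]] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # lineno is the `def` line; end_lineno is the last line of the body.
            length = node.end_lineno - node.lineno + 1
            if length > max_lines:
                offenders.append((node.name, length))
    return offenders
```

The total line count of the file can be checked alongside it with `len(source.splitlines())` against the 1500-line target.
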
Scope Boundaries
Included:
- Data loading functions
- Computation/transformation logic
- Clear separation of concerns
Excluded:
- UI rendering functions (they can stay in explorer.py)
- Database schema changes
- New features or behavior changes
- Test suite updates (handled separately)
Key Decisions
- Domain-based splitting: Computation goes to relevant `analysis/` module, not all to one file
- Import direction: `explorer.py` imports from `analysis/`, never vice versa
- Preserve function signatures: Refactoring shouldn't change public APIs
Dependencies / Assumptions
- `database.py` provides `MotionDatabase` singleton - data functions will use this
- `explorer_helpers.py` pattern is already established - follow its conventions
- Streamlit caching (`@st.cache_data`) stays in `explorer.py` as the orchestration layer
Outstanding Questions
Deferred to Planning
- [Implementation] Should `_load_mp_vectors_by_party()` and variants be merged or kept separate?
- [Implementation] Should we create `analysis/projections.py` or extend existing `analysis/axis_classifier.py`?
- [Implementation] How to handle the `_cached_bootstrap_cis()` function - move to analysis or keep as cache wrapper?
Next Steps
→ /ce:plan for structured implementation planning