title: refactor: Extract business logic from explorer.py to analysis/
type: refactor
status: active
date: 2026-04-04
origin: docs/brainstorms/2026-04-04-explorer-refactor-requirements.md

Refactor: Extract Business Logic from explorer.py to analysis/

Overview

Split the 3715-line explorer.py into clear layers: data loading, business logic, and UI. This improves navigability and testability while preserving all existing behavior.

Problem Frame

explorer.py mixes three concerns (data loading, computation, and UI), which makes it:

  • Hard to navigate — no clear boundaries
  • Hard to test — requires Streamlit + DuckDB
  • Hard to review — changes affect everything

Requirements Trace

  • R1.1: Create analysis/explorer_data.py with data loading functions
  • R1.2: Data functions callable without Streamlit imports
  • R1.3: Functions return plain data structures (built-in containers, pandas DataFrames, NumPy arrays)
  • R2.1: Move computation to domain-appropriate analysis/ modules
  • R2.2: Computations are pure functions
  • R3.1: explorer.py becomes thin orchestration layer
  • R3.2: _render_* functions stay in explorer.py
  • R3.3: build_*_tab() functions delegate to imported functions
  • R4.1: No circular imports
  • R5.1: Data functions testable with mocked DuckDB
  • R5.2: Computation functions pure and testable

Key Technical Decisions

  • Domain-based splitting: Computation goes to relevant analysis/ module
  • Import direction: explorer.py imports from analysis/, never vice versa
  • Preserve signatures: Refactoring doesn't change public APIs
  • _load_mp_vectors_by_party variants: Keep separate (serve different use cases)
  • analysis/projections.py: Create new file (distinct from axis_classifier.py)
  • _cached_bootstrap_cis(): Keep as cache wrapper in explorer.py, move computation to analysis/

Open Questions

Resolved During Planning

  • _load_mp_vectors_by_party variants: Keep separate — they have different signatures and use cases
  • analysis/projections.py: Create new file — projections are distinct from axis classification
  • _cached_bootstrap_cis(): Keep wrapper in explorer.py, move computation to analysis/trajectories.py

Deferred to Implementation

  • Exact function grouping within analysis/explorer_data.py — will be refined during extraction
  • Whether to add __all__ exports — decide based on usage patterns after extraction

Implementation Units

  • Unit 1: Create analysis/explorer_data.py skeleton

Goal: Create the data loading module with extracted functions

Requirements: R1.1, R1.2, R1.3

Dependencies: None

Files:

  • Create: analysis/explorer_data.py

Approach:

  1. Create module with docstring and imports
  2. Add stub functions with original signatures (no implementation)
  3. Copy docstrings and type hints from explorer.py

Functions to extract:

  • get_available_windows(db_path: str) -> List[str]
  • get_uniform_dim_windows(db_path: str) -> List[str]
  • load_positions(db_path: str, window_size: str) -> pd.DataFrame
  • load_party_map(db_path: str) -> Dict[str, str]
  • load_active_mps(db_path: str) -> set
  • load_party_axis_scores(db_path: str) -> Dict[str, List[float]]
  • load_party_axis_scores_for_window(db_path: str, window: str) -> Dict[str, List[float]]
  • load_party_scores_all_windows(db_path: str) -> Dict[str, List[List[float]]]
  • load_party_scores_all_windows_aligned(db_path: str) -> Dict[str, List[List[float]]]
  • load_party_mp_vectors(db_path: str) -> Dict[str, List[np.ndarray]]
  • load_scree_data(db_path: str) -> List[float]
  • load_motions_df(db_path: str) -> pd.DataFrame

Patterns to follow:

  • explorer_helpers.py conventions (pure functions, no IO side effects)
  • database.py for DuckDB connection patterns

Verification:

  • Module imports without errors
  • All functions have correct signatures

  • Unit 2: Create analysis/projections.py

Goal: Create module for SVD projection and axis utilities

Requirements: R2.1, R2.2

Dependencies: Unit 1

Files:

  • Create: analysis/projections.py

Approach:

  1. Extract _should_swap_axes() and _swap_axes() from explorer.py
  2. Add pure projection computation functions

Functions to extract:

  • _should_swap_axes(axis_def: dict) -> bool
  • _swap_axes(axis_def: dict) -> dict
  • project_motions_onto_axis(motion_ids, scores) -> List[Tuple[int, float]] (stub)

Patterns to follow:

  • Pure function conventions from explorer_helpers.py

Verification:

  • Functions work without Streamlit/DuckDB imports

  • Unit 3: Update analysis/trajectories.py

Goal: Add trajectory computation functions from explorer.py

Requirements: R2.1, R2.2

Dependencies: Unit 1

Files:

  • Modify: analysis/trajectories.py

Approach:

  1. Add compute_party_discipline() and related functions
  2. Add compute_2d_trajectories() and compute_aligned_trajectories() (pure computation)

Functions to add:

  • compute_party_discipline(mp_scores: Dict[str, List[float]]) -> Dict[str, float]
  • compute_2d_trajectories(positions_by_window, party_axis_scores) (stub)
  • compute_aligned_trajectories(positions_by_window, party_scores_all) (stub)

Verification:

  • Functions are pure (no IO)
  • Existing trajectories.py tests pass
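
A pure version of `compute_party_discipline` might look like the sketch below. The plan gives only the signature, so the formula is an assumption: discipline is taken here as 1 / (1 + population standard deviation) of a party's MP scores. The real definition in explorer.py should be carried over unchanged during extraction.

```python
from statistics import pstdev
from typing import Dict, List


def compute_party_discipline(mp_scores: Dict[str, List[float]]) -> Dict[str, float]:
    """Map each party to a discipline value in (0, 1].

    Assumed formula for this sketch: 1 / (1 + population std. dev. of scores),
    so identical scores give 1.0 and a wider spread gives a lower value.
    """
    discipline: Dict[str, float] = {}
    for party, scores in mp_scores.items():
        if not scores:
            continue  # a party with no MPs has nothing to measure
        discipline[party] = 1.0 / (1.0 + pstdev(scores))
    return discipline
```

Because the function takes plain dicts and lists and performs no IO, it can be unit-tested with literal inputs.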

  • Unit 4: Wire up imports in explorer.py

Goal: Update explorer.py to import from new modules

Requirements: R3.1, R3.3, R4.1

Dependencies: Units 1, 2, 3

Files:

  • Modify: explorer.py

Approach:

  1. Replace local function definitions with imports
  2. Keep wrapper functions where needed for @st.cache_data
  3. Verify no circular imports

Verification:

  • explorer.py imports work
  • No circular import errors
  • Streamlit app runs correctly
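
The "no circular imports" check can be partly automated. The following is a minimal sketch, assuming the rule is simply that nothing under analysis/ imports a module named `explorer`; the function name `find_reverse_imports` is hypothetical. It inspects import statements statically with `ast`, so it does not execute any project code.

```python
import ast
from pathlib import Path
from typing import List


def find_reverse_imports(package_dir: str, forbidden: str = "explorer") -> List[str]:
    """Return the .py files under package_dir that import `forbidden`."""
    offenders = []
    for path in sorted(Path(package_dir).rglob("*.py")):
        tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom):
                # "from x import y" names module x; "from . import y" names y.
                names = [node.module] if node.module else [a.name for a in node.names]
            else:
                continue
            # Exact match or submodule only, so explorer_helpers is not flagged.
            if any(n == forbidden or n.startswith(forbidden + ".") for n in names):
                offenders.append(str(path))
                break
    return offenders
```

Calling `find_reverse_imports("analysis")` from the repository root and asserting that the result is empty would turn the one-way import rule into a regression test.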

  • Unit 5: Final cleanup and verification

Goal: Ensure explorer.py meets success criteria

Requirements: All

Dependencies: Unit 4

Approach:

  1. Count lines in explorer.py — target under 1500
  2. Check no function exceeds 100 lines
  3. Verify all extracted functions have docstrings
  4. Run existing tests

Verification:

  • wc -l explorer.py reports fewer than 1500 lines
  • All functions under 100 lines
  • Tests pass
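
The function-length check could be scripted rather than done by eye. This is a minimal sketch: `long_functions` is a hypothetical helper, and it measures from the `def` line to the last line of the body, so decorators are not counted.

```python
import ast
from typing import List, Tuple


def long_functions(source: str, limit: int = 100) -> List[Tuple[str, int]]:
    """Return (name, line_count) for every function longer than `limit` lines."""
    tree = ast.parse(source)
    too_long = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # end_lineno is available on AST nodes from Python 3.8 onward.
            length = node.end_lineno - node.lineno + 1
            if length > limit:
                too_long.append((node.name, length))
    return sorted(too_long)
```

Reading explorer.py into a string and passing it to `long_functions` lists the functions that still need splitting; an empty list means the criterion is met.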

System-Wide Impact

  • Interaction graph: explorer.py imports from analysis/ — no reverse imports
  • Error propagation: Data functions raise exceptions on DB errors (same as before)
  • API surface parity: All function signatures preserved
  • Unchanged invariants: UI behavior identical, no new features

Risks & Dependencies

Risk | Mitigation
Breaking existing function signatures | Preserve exact signatures, update in place
Circular imports | One-way import direction (explorer → analysis only)
Regression in UI behavior | Test after each unit, verify Streamlit app runs

Documentation / Operational Notes

  • Update ARCHITECTURE.md to document new analysis/explorer_data.py module
  • No changes to deployment or configuration needed

Sources & References

  • Requirements doc: docs/brainstorms/2026-04-04-explorer-refactor-requirements.md
  • Related code: explorer.py, explorer_helpers.py, analysis/trajectories.py
  • Pattern reference: explorer_helpers.py (pure function conventions)