| date | topic | focus |
|---|---|---|
| 2026-04-04 | code-quality-architecture-ideation | code quality and architecture improvements |
Ideation: Code Quality & Architecture Improvements
Codebase Context
- explorer.py: 3715 lines, monolithic Streamlit app with 65+ `except Exception:` handlers
- database.py: 1366 lines, `MotionDatabase` class with similar exception patterns
- explorer_helpers.py: 317 lines, pure functions, import-safe, well-testable (the pattern)
- Anti-patterns: 208 instances of bare/broad exception handling, nested try-except blocks
- Tests: Well-organized in `tests/` with good coverage of helpers
Ranked Ideas
1. Systematic Exception Handler Audit & Refactor
Description: Audit all 208 except Exception: blocks across the codebase. Categorize by failure mode (missing dependency, data validation, network, IO) and replace with specific exceptions. Add error context propagation.
Rationale: The current pattern silently swallows errors, which hides the original failure and makes debugging far harder than it needs to be. Refactoring to specific exceptions enables proper error handling, logging, and user feedback. This compounds: each fix can remove 2-3 nested exception handlers.
Downsides: High volume of changes requires careful regression testing.
Confidence: 90%
Complexity: High
Status: Unexplored
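A minimal sketch of the kind of rewrite this audit would produce, shown on a made-up loader rather than real project code. `ConfigLoadError` and `load_settings` are hypothetical names; the point is only the shape: one specific `except` per failure mode, with `raise ... from exc` to keep the original cause attached.

```python
import json
from pathlib import Path


class ConfigLoadError(Exception):
    """Hypothetical project-level error that carries context about the failure."""


def load_settings(path: Path) -> dict:
    """Load a JSON settings file, translating low-level failures into one typed error."""
    try:
        text = path.read_text(encoding="utf-8")
    except FileNotFoundError as exc:
        # IO failure mode: name the missing file and keep the original cause.
        raise ConfigLoadError(f"settings file not found: {path}") from exc
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        # Data validation failure mode: report where parsing stopped.
        raise ConfigLoadError(f"invalid JSON in {path} at line {exc.lineno}") from exc
    if not isinstance(data, dict):
        raise ConfigLoadError(f"expected a JSON object in {path}, got {type(data).__name__}")
    return data
```

Callers can then catch `ConfigLoadError` alone and still reach the underlying error through `__cause__` when logging.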
2. Extract Business Logic from explorer.py into Pure Functions
Description: Identify and extract computation-heavy sections from the 3715-line explorer.py. Move to pure functions in a new module (e.g., explorer_logic.py), keeping Streamlit UI glue in the main file.
Rationale: explorer.py mixes UI code with business logic, making it untestable and hard to reason about. The existing explorer_helpers.py proves this pattern works — same approach applied more broadly enables unit testing of core algorithms.
Downsides: Requires careful interface design to avoid breaking the Streamlit page.
Confidence: 85%
Complexity: Medium
Status: Unexplored
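As an illustration of the split, the sketch below shows what an extracted function might look like. The function name, the row shape (`party` and `choice` keys), and the computation itself are assumptions made for the example, not code from explorer.py.

```python
from collections import Counter
from typing import Iterable, Mapping


def vote_share_by_party(votes: Iterable[Mapping[str, str]]) -> dict[str, float]:
    """Pure computation: fraction of 'yes' votes per party. No Streamlit imports."""
    totals: Counter = Counter()
    yes: Counter = Counter()
    for vote in votes:
        party = vote["party"]
        totals[party] += 1
        if vote["choice"] == "yes":
            yes[party] += 1
    return {party: yes[party] / totals[party] for party in totals}


# In explorer.py the UI glue would shrink to a call such as
#   shares = vote_share_by_party(rows)
# followed by whatever Streamlit rendering call the page already uses.
```

Because the function takes plain data and returns plain data, it can be unit-tested without starting a Streamlit session.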
3. Create Typed Data Transfer Objects (DTOs) for Database Layer
Description: Replace dictionary-based data passing between database.py and consumers with typed dataclasses or Pydantic models. Define MotionDTO, PartyResultDTO, SessionDTO.
Rationale: The 208 exception handlers often mask type mismatches that a static type checker (or validation at construction time, with Pydantic) would catch once the data is carried in typed DTOs. The src/validators/types.py shows existing type awareness; extend this systematically to the data layer.
Downsides: Migration effort; some duckdb results may not serialize cleanly.
Confidence: 75%
Complexity: Medium
Status: Unexplored
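A sketch of one such DTO using a standard-library dataclass. Only the name `MotionDTO` comes from the description above; the three fields and the `from_row` helper are assumptions chosen for illustration, and the real column names would come from the schema in database.py.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Mapping


@dataclass(frozen=True)
class MotionDTO:
    """Typed view of one motion row. Field names are assumptions for illustration."""

    motion_id: int
    title: str
    submitted_on: date

    @classmethod
    def from_row(cls, row: Mapping[str, Any]) -> "MotionDTO":
        """Validate and convert a dict-like row at the boundary of the database layer."""
        try:
            return cls(
                motion_id=int(row["motion_id"]),
                title=str(row["title"]),
                submitted_on=date.fromisoformat(str(row["submitted_on"])),
            )
        except KeyError as exc:
            # A missing column becomes one clear error instead of a late AttributeError.
            raise ValueError(f"motion row is missing column {exc.args[0]!r}") from exc
```

Converting at the boundary means a malformed row fails once, at a known place, rather than somewhere downstream inside a broad handler.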
4. Establish Explicit Error Recovery Strategies
Description: Rather than catch-all exception handling, implement explicit recovery strategies per failure mode: retry with backoff for transient failures, fallback to cached data for missing dependencies, graceful degradation for optional features.
Rationale: The anti-pattern exists because there's no systematic recovery approach. Explicit strategies replace 208 silent catches with intentional behavior — this is the "compound leverage" angle.
Downsides: Requires identifying which failures are transient vs. permanent per operation.
Confidence: 80%
Complexity: Medium
Status: Unexplored
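The retry and fallback strategies could be sketched roughly as below. `TransientError` and `retry_with_backoff` are hypothetical names; deciding which real exceptions count as transient is exactly the open question listed under Downsides.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (for example a timeout)."""


def retry_with_backoff(
    operation: Callable[[], T],
    *,
    attempts: int = 3,
    base_delay: float = 0.1,
    fallback: Optional[Callable[[], T]] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry only TransientError, doubling the delay each time; other errors propagate."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                if fallback is not None:
                    # Graceful degradation, e.g. serve previously cached data.
                    return fallback()
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter is injectable so tests can run without real delays. Permanent failures (anything that is not a `TransientError`) are deliberately not caught.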
5. Modularize database.py into Focused Modules
Description: Split database.py (1366 lines) into: db_connection.py (connection lifecycle), db_motions.py (motion queries), db_sessions.py (session management), db_migrations.py (schema updates).
Rationale: Single-responsibility violation — database.py handles connection, schema, queries, and migrations. Splitting enables independent testing and clearer ownership. The pipeline/ modular structure shows this is already the project's convention.
Downsides: Breaking changes for any existing imports.
Confidence: 70%
Complexity: Medium
Status: Unexplored
6. Add Comprehensive Type Hints to Core Modules
Description: Run mypy on explorer.py, database.py, analysis/*.py. Fix missing type hints and enable strict type checking in CI.
Rationale: Type hints, once checked by mypy, can catch some of the errors that the 208 exception handlers are currently masking. The src/types/motion_types.py shows the project already has some type investment; this extends it to the pain points.
Downsides: May require cast() in some duckdb interop scenarios.
Confidence: 85%
Complexity: Low
Status: Unexplored
7. Create Code Climate Metrics & Monitoring
Description: Add radon or lizard to measure cyclomatic complexity per module. Set thresholds that fail CI if exceeded. Track over time.
Rationale: Quantitative baseline for refactoring impact. Currently no way to measure if the 3715-line explorer.py is improving or degrading. Compounds: each refactor can be measured.
Downsides: Tool overhead; thresholds may need tuning.
Confidence: 60%
Complexity: Low
Status: Unexplored
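To make the threshold idea concrete without depending on either tool, here is a rough standard-library stand-in. It is not radon's or lizard's algorithm: it simply counts a handful of branching node types per function, which is enough to show how a CI gate could be wired. Both function names are hypothetical.

```python
import ast

# Node types that add a decision point in this rough approximation.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp, ast.IfExp)


def approximate_complexity(source: str) -> dict[str, int]:
    """Return {function name: 1 + number of branch nodes} for each function in source."""
    tree = ast.parse(source)
    scores: dict[str, int] = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            branches = sum(isinstance(child, _BRANCH_NODES) for child in ast.walk(node))
            scores[node.name] = 1 + branches
    return scores


def functions_over_threshold(source: str, threshold: int) -> list[str]:
    """Names of functions whose approximate score exceeds the threshold, sorted."""
    scores = approximate_complexity(source)
    return sorted(name for name, score in scores.items() if score > threshold)
```

A CI step could read each module, call `functions_over_threshold`, and exit non-zero when the list is not empty. A real setup would more likely delegate the measurement to radon or lizard.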
8. Extract Static Analysis Rule for Bare Except Detection
Description: Add a flake8 plugin or ruff rule that flags except: and except Exception: without re-raising or logging. Document the project-specific exception hierarchy.
Rationale: Prevents the anti-pattern from re-entering. The project has 208 violations — a custom lint rule catches new violations and encodes the team's error-handling philosophy. This is the "assumption-breaking" angle: stop fixing cases, fix the system.
Downsides: Requires defining what specific exceptions ARE allowed per context.
Confidence: 70%
Complexity: Low
Status: Unexplored
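Ruff already ships rules in this area (E722 for bare `except:` and BLE001 for blind `except Exception:`), so enabling those is probably the first step. The "without re-raising or logging" condition is project-specific, and the sketch below shows one way it might be checked with the standard-library `ast` module. The heuristic for "logging" (a method call on a name containing `log`) is an assumption, as are all function names.

```python
import ast

_BROAD_NAMES = {"Exception", "BaseException"}


def _is_broad(handler: ast.ExceptHandler) -> bool:
    """True for `except:` and `except Exception:` (or BaseException)."""
    if handler.type is None:
        return True
    return isinstance(handler.type, ast.Name) and handler.type.id in _BROAD_NAMES


def _reraises_or_logs(handler: ast.ExceptHandler) -> bool:
    """True if the handler contains a `raise` or a method call on a name containing 'log'."""
    for node in ast.walk(handler):
        if isinstance(node, ast.Raise):
            return True
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            receiver = node.func.value
            if isinstance(receiver, ast.Name) and "log" in receiver.id.lower():
                return True
    return False


def find_silent_broad_excepts(source: str) -> list[int]:
    """Line numbers of broad handlers that neither re-raise nor log."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler)
        and _is_broad(node)
        and not _reraises_or_logs(node)
    )
```

Run over explorer.py and database.py, a check like this would also give the audit in idea 1 a starting list of line numbers.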
Rejection Summary
| # | Idea | Reason Rejected |
|---|---|---|
| 1 | Add docstrings to all functions | Too obvious; not leverage-focused |
| 2 | Migrate to async database operations | Premature optimization; duckdb is sync |
| 3 | Add logging library (structured logging) | Tool-focused, not addressing root cause |
| 4 | Replace Streamlit with another framework | Out of scope for this codebase |
| 5 | Add caching layer for database queries | Already exists via Streamlit caching; not addressing architecture |
Session Log
- 2026-04-04: Initial ideation — 13 generated, 8 survived