---
date: 2026-04-04
topic: code-quality-architecture-ideation
focus: code quality and architecture improvements
---
# Ideation: Code Quality & Architecture Improvements
## Codebase Context
- **explorer.py**: 3715 lines — monolithic Streamlit app with 65+ `except Exception:` handlers
- **database.py**: 1366 lines — `MotionDatabase` class with similar exception patterns
- **explorer_helpers.py**: 317 lines — pure functions, import-safe, well-testable (the pattern)
- **Anti-patterns**: 208 instances of bare/broad exception handling, nested try-except blocks
- **Tests**: Well-organized in `tests/` with good coverage of helpers
## Ranked Ideas
### 1. Systematic Exception Handler Audit & Refactor
**Description:** Audit all 208 bare or broad exception handlers (`except:` and `except Exception:`) across the codebase. Categorize each by failure mode (missing dependency, data validation, network, IO) and replace it with handlers for specific exceptions. Add error context propagation.
**Rationale:** The current pattern silently swallows errors, which makes failures hard to diagnose. Refactoring to specific exceptions enables proper error handling, logging, and user feedback. This compounds: each fix removes 2-3 nested exception handlers.
**Downsides:** High volume of changes requires careful regression testing.
**Confidence:** 90%
**Complexity:** High
**Status:** Unexplored
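
As a rough sketch of what one refactored handler might look like, the example below replaces a catch-all with two specific exceptions and chains the original error for context. `MotionDataError` and `load_motion_file` are hypothetical names invented for illustration; they are not taken from the codebase.

```python
import json
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


class MotionDataError(Exception):
    """Hypothetical project-level error that carries context about the failure."""


def load_motion_file(path: Path) -> dict:
    """Load one JSON record, raising a specific error with context on failure."""
    try:
        text = path.read_text(encoding="utf-8")
    except FileNotFoundError as exc:
        # IO failure mode: the caller can decide whether to skip or abort.
        raise MotionDataError(f"motion file missing: {path}") from exc
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        # Data validation failure mode: the original error stays on __cause__.
        logger.warning("invalid JSON in %s", path)
        raise MotionDataError(f"invalid JSON in {path}: {exc.msg}") from exc
```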
---
### 2. Extract Business Logic from explorer.py into Pure Functions
**Description:** Identify and extract computation-heavy sections from the 3715-line explorer.py. Move to pure functions in a new module (e.g., `explorer_logic.py`), keeping Streamlit UI glue in the main file.
**Rationale:** explorer.py mixes UI code with business logic, making it untestable and hard to reason about. The existing `explorer_helpers.py` proves this pattern works — same approach applied more broadly enables unit testing of core algorithms.
**Downsides:** Requires careful interface design to avoid breaking the Streamlit page.
**Confidence:** 85%
**Complexity:** Medium
**Status:** Unexplored
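
A minimal sketch of the extraction pattern, assuming a computation over (party, vote) pairs currently lives inline in the Streamlit page. The function name `vote_share_by_party` and the input shape are assumptions for illustration only. The point is that the function imports nothing from Streamlit, so it can be unit tested directly.

```python
from collections import Counter
from typing import Iterable


def vote_share_by_party(votes: Iterable[tuple[str, str]]) -> dict[str, float]:
    """Return the fraction of 'for' votes per party. Pure: no Streamlit, no IO."""
    total: Counter = Counter()
    in_favour: Counter = Counter()
    for party, vote in votes:
        total[party] += 1
        if vote == "for":
            in_favour[party] += 1
    return {party: in_favour[party] / total[party] for party in total}


# The Streamlit page would then shrink to UI glue, roughly:
#   shares = vote_share_by_party(rows)
#   st.write(shares)
```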
---
### 3. Create Typed Data Transfer Objects (DTOs) for Database Layer
**Description:** Replace dictionary-based data passing between `database.py` and consumers with typed dataclasses or Pydantic models. Define `MotionDTO`, `PartyResultDTO`, `SessionDTO`.
**Rationale:** The 208 exception handlers often mask type mismatches that a static type checker would catch before runtime if the data were passed as typed DTOs. The `src/validators/types.py` shows existing type awareness; extend this systematically to the data layer.
**Downsides:** Migration effort; some duckdb results may not serialize cleanly.
**Confidence:** 75%
**Complexity:** Medium
**Status:** Unexplored
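
One possible shape for `MotionDTO`, sketched with a frozen dataclass. The field names (`motion_id`, `title`, `session_date`) are assumptions for illustration and do not reflect the real schema. Conversion happens once at the database boundary, so a malformed row fails there with a specific `KeyError` or `ValueError` rather than deep inside a consumer.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Mapping


@dataclass(frozen=True)
class MotionDTO:
    # Field names are assumptions for illustration, not the real schema.
    motion_id: int
    title: str
    session_date: date

    @classmethod
    def from_row(cls, row: Mapping[str, Any]) -> "MotionDTO":
        """Validate and convert one dictionary-shaped row at the database boundary."""
        return cls(
            motion_id=int(row["motion_id"]),
            title=str(row["title"]),
            # str() lets this accept either an ISO string or a date object.
            session_date=date.fromisoformat(str(row["session_date"])),
        )
```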
---
### 4. Establish Explicit Error Recovery Strategies
**Description:** Rather than catch-all exception handling, implement explicit recovery strategies per failure mode: retry with backoff for transient failures, fallback to cached data for missing dependencies, graceful degradation for optional features.
**Rationale:** The anti-pattern exists because there's no systematic recovery approach. Explicit strategies replace 208 silent catches with intentional behavior — this is the "compound leverage" angle.
**Downsides:** Requires identifying which failures are transient vs. permanent per operation.
**Confidence:** 80%
**Complexity:** Medium
**Status:** Unexplored
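
The retry and fallback strategies could be sketched as a single helper. `TransientError` and `retry_with_backoff` are hypothetical names, and which real exceptions count as transient would still need to be decided per operation. Only the marked transient error is retried; anything else propagates immediately.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a timeout)."""


def retry_with_backoff(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    fallback: Optional[Callable[[], T]] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry transient failures with exponential backoff, then fall back if possible."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                if fallback is not None:
                    return fallback()  # e.g. serve cached data
                raise
            # Delay doubles each round: base_delay, 2 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```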
---
### 5. Modularize database.py into Focused Modules
**Description:** Split `database.py` (1366 lines) into: `db_connection.py` (connection lifecycle), `db_motions.py` (motion queries), `db_sessions.py` (session management), `db_migrations.py` (schema updates).
**Rationale:** Single-responsibility violation — database.py handles connection, schema, queries, and migrations. Splitting enables independent testing and clearer ownership. The `pipeline/` modular structure shows this is already the project's convention.
**Downsides:** Breaking changes for any existing imports.
**Confidence:** 70%
**Complexity:** Medium
**Status:** Unexplored
---
### 6. Add Comprehensive Type Hints to Core Modules
**Description:** Run mypy on `explorer.py`, `database.py`, `analysis/*.py`. Add the missing type hints, fix the errors mypy reports, and enable strict type checking in CI.
**Rationale:** Type hints catch many of the type errors that the 208 exception handlers currently mask. The `src/types/motion_types.py` shows the project already has some type investment; this extends it to the pain points.
**Downsides:** May require `cast()` in some duckdb interop scenarios.
**Confidence:** 85%
**Complexity:** Low
**Status:** Unexplored
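
A small illustration of the kind of error annotations surface, and of where `cast()` might appear at an untyped database boundary. The row shape and both function names are assumptions made for this example.

```python
from typing import Any, Optional, Sequence, cast

Row = tuple[Any, ...]  # untyped row shape, as a database driver might return it


def latest_title(rows: Sequence[Row]) -> Optional[str]:
    """Return the second column of the last row, or None when there are no rows."""
    if not rows:
        return None
    # cast() records an assumption for the checker; it performs no runtime check.
    return cast(str, rows[-1][1])


def title_length(rows: Sequence[Row]) -> int:
    """Length of the latest title, treating 'no rows' as zero."""
    title = latest_title(rows)
    # Without this branch, a checker such as mypy reports that `title` may be None.
    if title is None:
        return 0
    return len(title)
```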
---
### 7. Create Code Climate Metrics & Monitoring
**Description:** Add radon or lizard to measure cyclomatic complexity per module. Set thresholds that fail CI if exceeded. Track over time.
**Rationale:** Quantitative baseline for refactoring impact. Currently no way to measure if the 3715-line explorer.py is improving or degrading. Compounds: each refactor can be measured.
**Downsides:** Tool overhead; thresholds may need tuning.
**Confidence:** 60%
**Complexity:** Low
**Status:** Unexplored
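
To make the threshold idea concrete without depending on radon or lizard, the sketch below uses the standard library `ast` module as a stand-in. It is a rough approximation (1 plus the number of branching nodes per function), not the metric those tools compute, and the function names and default limit are assumptions. A CI script could fail the build when `over_threshold` returns a non-empty list.

```python
import ast

# Node types counted as one decision point each (a simplification).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp, ast.BoolOp)


def rough_complexity(source: str) -> dict[str, int]:
    """Map each function name to 1 + the number of branching nodes inside it."""
    scores: dict[str, int] = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Branches in nested functions also count toward the outer function.
            branches = sum(isinstance(child, _BRANCH_NODES) for child in ast.walk(node))
            scores[node.name] = 1 + branches
    return scores


def over_threshold(source: str, limit: int = 10) -> list[str]:
    """Names of functions whose rough score exceeds the limit, sorted."""
    return sorted(name for name, score in rough_complexity(source).items() if score > limit)
```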
---
### 8. Add Static Analysis Rule for Bare Except Detection
**Description:** Add a flake8 plugin or ruff rule that flags `except:` and `except Exception:` without re-raising or logging. Document the project-specific exception hierarchy.
**Rationale:** Prevents the anti-pattern from re-entering. The project has 208 violations — a custom lint rule catches new violations and encodes the team's error-handling philosophy. This is the "assumption-breaking" angle: stop fixing cases, fix the system.
**Downsides:** Requires defining what specific exceptions ARE allowed per context.
**Confidence:** 70%
**Complexity:** Low
**Status:** Unexplored
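
The check itself can be prototyped with the standard library before committing to a flake8 plugin or linter configuration. This is a sketch under stated assumptions: "logging" is approximated as any method call named like a `logging` level, and the function names are hypothetical. It returns the line numbers of bare or broad handlers that neither re-raise nor log.

```python
import ast

# Method names treated as "logging" (an approximation, not a real type check).
_LOG_METHODS = {"debug", "info", "warning", "error", "exception", "critical"}


def _is_broad(handler: ast.ExceptHandler) -> bool:
    """True for `except:` and for `except Exception:` / `except BaseException:`."""
    if handler.type is None:
        return True
    return isinstance(handler.type, ast.Name) and handler.type.id in {"Exception", "BaseException"}


def _reraises_or_logs(handler: ast.ExceptHandler) -> bool:
    """True when the handler body contains a raise or a logging-style call."""
    for stmt in handler.body:
        for node in ast.walk(stmt):
            if isinstance(node, ast.Raise):
                return True
            if (
                isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr in _LOG_METHODS
            ):
                return True
    return False


def find_silent_broad_excepts(source: str) -> list[int]:
    """Line numbers of broad handlers that swallow the error silently."""
    return sorted(
        node.lineno
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.ExceptHandler)
        and _is_broad(node)
        and not _reraises_or_logs(node)
    )
```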
---
## Rejection Summary
| # | Idea | Reason Rejected |
|---|------|-----------------|
| 1 | Add docstrings to all functions | Too obvious; not leverage-focused |
| 2 | Migrate to async database operations | Premature optimization; duckdb is sync |
| 3 | Add logging library (structured logging) | Tool-focused, not addressing root cause |
| 4 | Replace Streamlit with another framework | Out of scope for this codebase |
| 5 | Add caching layer for database queries | Already exists via Streamlit caching; not addressing architecture |
## Session Log
- 2026-04-04: Initial ideation — 13 generated, 8 survived