date: 2026-04-04
topic: code-quality-architecture-ideation
focus: code quality and architecture improvements

Ideation: Code Quality & Architecture Improvements

Codebase Context

  • explorer.py: 3715 lines, a monolithic Streamlit app with 65+ except Exception: handlers
  • database.py: 1366 lines, the MotionDatabase class with similar exception patterns
  • explorer_helpers.py: 317 lines of pure functions, import-safe and well-testable (the pattern to replicate)
  • Anti-patterns: 208 instances of bare/broad exception handling, nested try-except blocks
  • Tests: Well-organized in tests/ with good coverage of helpers

Ranked Ideas

1. Systematic Exception Handler Audit & Refactor

Description: Audit all 208 except Exception: blocks across the codebase. Categorize by failure mode (missing dependency, data validation, network, IO) and replace with specific exceptions. Add error context propagation.

Rationale: The current pattern silently swallows errors, which hides the root cause and makes debugging far harder than it needs to be. Refactoring to specific exceptions enables proper error handling, logging, and user feedback. This compounds: each fix removes 2-3 nested exception handlers.

Downsides: High volume of changes requires careful regression testing.

Confidence: 90%

Complexity: High

Status: Unexplored
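
As an illustration of the refactor described in idea 1, here is a minimal sketch of one handler after conversion. The names `load_motion_file` and `DataLoadError` are hypothetical and do not come from the codebase; the point is only the shape: catch the specific failure, attach context, and chain the original exception with `raise ... from`.

```python
import json
from pathlib import Path


class DataLoadError(Exception):
    """Hypothetical project-level error carrying context about a failed load."""


def load_motion_file(path: Path) -> dict:
    """Load one JSON file, translating low-level failures into DataLoadError."""
    try:
        text = path.read_text(encoding="utf-8")
    except FileNotFoundError as exc:
        # IO failure mode: the file is missing.
        raise DataLoadError(f"motion file not found: {path}") from exc
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        # Data validation failure mode: the file exists but is not valid JSON.
        raise DataLoadError(f"invalid JSON in {path}: {exc.msg}") from exc
    if not isinstance(data, dict):
        raise DataLoadError(
            f"expected a JSON object in {path}, got {type(data).__name__}"
        )
    return data
```

Because the original exception is preserved as `__cause__`, a caller (or a log line) still sees the underlying failure, while the UI layer only needs to handle one project-level type.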


2. Extract Business Logic from explorer.py into Pure Functions

Description: Identify and extract computation-heavy sections from the 3715-line explorer.py. Move to pure functions in a new module (e.g., explorer_logic.py), keeping Streamlit UI glue in the main file.

Rationale: explorer.py mixes UI code with business logic, making it hard to test and hard to reason about. The existing explorer_helpers.py proves this pattern works; applying the same approach more broadly would enable unit testing of core algorithms.

Downsides: Requires careful interface design to avoid breaking the Streamlit page.

Confidence: 85%

Complexity: Medium

Status: Unexplored
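
To make the split in idea 2 concrete, here is a sketch of what an extracted pure function might look like. `agreement_rate` and its inputs (mappings from motion ID to vote) are assumptions made up for illustration; the actual computations in explorer.py are not shown in this document.

```python
from typing import Mapping, Optional


def agreement_rate(
    votes_a: Mapping[str, str], votes_b: Mapping[str, str]
) -> Optional[float]:
    """Share of motions both parties voted on where they cast the same vote.

    Returns None when the parties share no motions, so the caller decides
    how to present "no data" instead of the function guessing.
    """
    shared = votes_a.keys() & votes_b.keys()
    if not shared:
        return None
    same = sum(1 for motion_id in shared if votes_a[motion_id] == votes_b[motion_id])
    return same / len(shared)


# The Streamlit side would then shrink to thin glue, roughly:
#     rate = agreement_rate(votes_a, votes_b)
#     st.metric("Agreement", "n/a" if rate is None else f"{rate:.0%}")
```

The function imports nothing from Streamlit, so it can be unit-tested the same way the explorer_helpers.py functions are.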


3. Create Typed Data Transfer Objects (DTOs) for Database Layer

Description: Replace dictionary-based data passing between database.py and consumers with typed dataclasses or Pydantic models. Define MotionDTO, PartyResultDTO, SessionDTO.

Rationale: Many of the 208 exception handlers mask type mismatches that typed DTOs would surface earlier, either when the object is constructed or under a static type checker such as mypy. The src/validators/types.py shows existing type awareness; extend this systematically to the data layer.

Downsides: Migration effort; some DuckDB result types may not map cleanly onto dataclass or Pydantic fields.

Confidence: 75%

Complexity: Medium

Status: Unexplored
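
A possible shape for one of the DTOs named in idea 3, sketched with a standard-library dataclass. The field names (`motion_id`, `title`, `session_date`) are assumptions, since the real schema is not given here; only the class name `MotionDTO` comes from the description.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Mapping


@dataclass(frozen=True)
class MotionDTO:
    # Field names are illustrative; the real schema is not shown in this document.
    motion_id: int
    title: str
    session_date: date

    @classmethod
    def from_row(cls, row: Mapping[str, Any]) -> "MotionDTO":
        """Build a DTO from a dict-like row, failing loudly on a missing column."""
        try:
            raw_date = row["session_date"]
            return cls(
                motion_id=int(row["motion_id"]),
                title=str(row["title"]),
                # Accept either a date object or an ISO "YYYY-MM-DD" string.
                session_date=raw_date
                if isinstance(raw_date, date)
                else date.fromisoformat(raw_date),
            )
        except KeyError as exc:
            raise ValueError(f"row is missing column {exc.args[0]!r}") from exc
```

Converting at the database boundary means a malformed row fails once, at a known place, rather than deep inside UI code behind a broad handler.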


4. Establish Explicit Error Recovery Strategies

Description: Rather than catch-all exception handling, implement explicit recovery strategies per failure mode: retry with backoff for transient failures, fallback to cached data for missing dependencies, graceful degradation for optional features.

Rationale: The anti-pattern exists because there's no systematic recovery approach. Explicit strategies replace 208 silent catches with intentional behavior — this is the "compound leverage" angle.

Downsides: Requires identifying which failures are transient vs. permanent per operation.

Confidence: 80%

Complexity: Medium

Status: Unexplored
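
The retry and fallback strategies in idea 4 could be combined in a small helper along these lines. This is a sketch under assumptions: `call_with_recovery` is a hypothetical name, and which exception types count as transient would have to be decided per operation, as the downside above notes.

```python
import time
from typing import Callable, Optional, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_recovery(
    operation: Callable[[], T],
    *,
    transient: Tuple[Type[Exception], ...] = (ConnectionError, TimeoutError),
    attempts: int = 3,
    base_delay: float = 0.5,
    fallback: Optional[Callable[[], T]] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry transient failures with exponential backoff, then fall back.

    Exceptions not listed in `transient` propagate immediately, so permanent
    failures are never retried or hidden.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except transient:
            if attempt < attempts - 1:
                # Wait base_delay, 2*base_delay, 4*base_delay, ... between tries.
                sleep(base_delay * 2 ** attempt)
                continue
            if fallback is None:
                raise  # Out of retries and nothing to degrade to.
            return fallback()  # For example, serve cached data.
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so that tests can run without real delays.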


5. Modularize database.py into Focused Modules

Description: Split database.py (1366 lines) into: db_connection.py (connection lifecycle), db_motions.py (motion queries), db_sessions.py (session management), db_migrations.py (schema updates).

Rationale: Single-responsibility violation — database.py handles connection, schema, queries, and migrations. Splitting enables independent testing and clearer ownership. The pipeline/ modular structure shows this is already the project's convention.

Downsides: Breaking changes for any existing imports.

Confidence: 70%

Complexity: Medium

Status: Unexplored


6. Add Comprehensive Type Hints to Core Modules

Description: Run mypy on explorer.py, database.py, analysis/*.py. Fix missing type hints and enable strict type checking in CI.

Rationale: Type hints catch the errors that 208 exception handlers are currently masking. The src/types/motion_types.py shows the project already has some type investment — this extends it to the pain points.

Downsides: May require cast() in some duckdb interop scenarios.

Confidence: 85%

Complexity: Low

Status: Unexplored


7. Create Code Climate Metrics & Monitoring

Description: Add radon or lizard to measure cyclomatic complexity per module. Set thresholds that fail CI if exceeded. Track over time.

Rationale: Provides a quantitative baseline for refactoring impact. There is currently no way to measure whether the 3715-line explorer.py is improving or degrading. This compounds: the effect of each refactor can be measured.

Downsides: Tool overhead; thresholds may need tuning.

Confidence: 60%

Complexity: Low

Status: Unexplored


8. Extract Static Analysis Rule for Bare Except Detection

Description: Add a flake8 plugin or ruff rule that flags except: and except Exception: handlers that neither re-raise nor log. Document the project-specific exception hierarchy.

Rationale: Prevents the anti-pattern from re-entering. The project has 208 violations — a custom lint rule catches new violations and encodes the team's error-handling philosophy. This is the "assumption-breaking" angle: stop fixing cases, fix the system.

Downsides: Requires defining what specific exceptions ARE allowed per context.

Confidence: 70%

Complexity: Low

Status: Unexplored


Rejection Summary

| # | Idea | Reason Rejected |
|---|------|-----------------|
| 1 | Add docstrings to all functions | Too obvious; not leverage-focused |
| 2 | Migrate to async database operations | Premature optimization; DuckDB is sync |
| 3 | Add logging library (structured logging) | Tool-focused, not addressing root cause |
| 4 | Replace Streamlit with another framework | Out of scope for this codebase |
| 5 | Add caching layer for database queries | Already exists via Streamlit caching; not addressing architecture |

Session Log

  • 2026-04-04: Initial ideation — 13 generated, 8 survived