You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
4.9 KiB
4.9 KiB
Title: Make modules import-safe (duckdb/plotly)
Why
- Enable lightweight unit tests and imports in environments without heavy runtime deps (duckdb, plotly) without changing runtime behaviour when those deps are present.
Scope
- Primary focus: library modules that are commonly imported by tests or other modules (not CLI scripts that are only executed).
- Initial rollout: small, reviewable batches. Do not push or change remote branches.
Non-goals
- Remove duckdb/plotly dependency from runtime environments.
- Refactor functionality beyond import-time safety.
Approach
- Two safe patterns (apply conservatively):
-
Pattern A — module-level guard
try: import duckdb except Exception: # pragma: no cover duckdb = None # type: ignoreUse when multiple functions in the module call duckdb and adding the guard is least invasive.
-
Pattern B — function-local import (preferred for DB helpers) Move
import duckdbinto the function that uses it and raise a clear RuntimeError when invoked without duckdb:def open_conn(path): try: import duckdb except Exception: raise RuntimeError("duckdb is required for open_conn") from None return duckdb.connect(path)
-
Targets (first batch — high impact)
database.py— Pattern B (move DB imports into helpers; provide clear RuntimeError when called without duckdb). Tests importdatabase.pyso make this robust.app.py— Pattern B (app modules often get imported during test runs; delay duckdb until handlers that need it).pipeline/svd_pipeline.py— Pattern A (guard top-level import; pipeline code is heavy and module-level guard is fine).
Expanded target list (subsequent batches)
pipeline/text_pipeline.py,pipeline/fusion.py— Pattern Apipeline/extract_mp_votes.py— Pattern Bsimilarity/compute.py,summarizer.py— Pattern A or B after inspection- scripts under
scripts/only if tests import them (prefer moving tomain())
Step-by-step rollout
- Prepare patches for batch 1 (three files). Create one patch file per file so changes are atomic and easy to revert.
- Apply edits in a feature branch or local commit. Run focused tests:
pytest tests/test_database_audit.py -qpytest tests/test_political_compass.py::test_* -q(if applicable)
- If focused tests pass, run full suite in .venv:
.venv/bin/python -m pytest tests/ -q
- If failures occur, inspect tracebacks for missing duckdb at runtime and either revert the specific change or convert Pattern A ↔ Pattern B as needed.
- Repeat for next batches until all targeted modules are covered.
Verification
- After each file change, run the focused tests that touch the module. After the batch, run full test suite (local
.venvrecommended):.venv/bin/python -m pytest tests/ -q— expect no new failures.
- Confirm importability in empty environment (simulate by temporarily renaming
.venvor running in environment without duckdb):python -c "import analysis; print('ok')"— should not raise ImportError for guarded modules.
Rollback strategy
- Make one file change per commit. If tests fail, revert the last commit and open an issue with the failure trace.
Patch previews (examples)
-
Pattern A (top-level guard):
- replace
import duckdbwith:try: import duckdb except Exception: # pragma: no cover duckdb = None # type: ignore
- replace
-
Pattern B (function-local import + clear error):
- before:
import duckdb def open_conn(path): return duckdb.connect(path) - after:
def open_conn(path): try: import duckdb except Exception: raise RuntimeError("duckdb is required for open_conn") from None return duckdb.connect(path)
- before:
Risks & mitigations
- Risk: hiding missing dependency until runtime. Mitigation: when using Pattern B raise descriptive RuntimeError at call site so failures are explicit.
- Risk: tests that intentionally require duckdb may break if we change behavior incorrectly. Mitigation: run focused tests that import duckdb intentionally and keep those files unchanged.
Owner & next actions for me
- I can generate the exact patch diffs for batch 1 (three files) and present them for review before applying. This is recommended to keep the change small and reviewable.
- Reply with:
prepare— I will create the patch diffs fordatabase.py,app.py,pipeline/svd_pipeline.pyand show them to you (no files modified yet), orapply— I will apply the first batch now and run focused tests locally.
Notes
- I already applied import-guards in several
analysis/modules (trajectory, explorer_data, clustering, political_axis) during earlier review; this plan continues that conservative approach.
References
- Examples:
analysis/explorer_data.py,analysis/trajectory.py,analysis/visualize.py