date	topic	status
2026-03-30	diagnose-no-plot-trajectories	draft

Problem Statement

The Trajectories tab currently shows no Plotly chart at all (not just an empty chart). We need a low-risk way to determine exactly which runtime gate or swallowed exception is preventing any plot from being rendered and fix it so the chart appears or we surface a clear error message.

Key observation: upstream code contains multiple early returns (on no data) and broad except/pass handlers that can silently swallow exceptions. Either can cause the UI to skip calling st.plotly_chart entirely.

Constraints

Keep changes small and reversible.
Do not change user-facing defaults unless gated by an explicit debug toggle or environment variable.
Prefer adding diagnostics and logging over big refactors; short-term changes must be removable after diagnosis.
Preserve public function locations and names used by other code/tests.

Chosen approach (what I'll do)

I'm choosing a focused instrumentation strategy: add a temporary, opt-in debug mode that surfaces the exact runtime decisions made, and any exceptions raised, along the Trajectories rendering path, and un-silence the key broad excepts so we can observe stack traces.

Why: It's the fastest, lowest-risk way to get definitive evidence of why the plot doesn't render, and it avoids changing production logic except under an explicit debug toggle.

High-level changes:

Add a DEBUG toggle (UI checkbox + env var EXPLORER_DEBUG_TRAJECTORIES) that enables verbose diagnostics in the Trajectories UI.
When debug is enabled, show step-by-step status for each early-return gate: result of load_positions, axis_def presence, length of positions_by_window, centroids size, mp_positions size, helper returns (fig/trace_count) and any exception tracebacks.
Replace the helper-call swallow (except Exception: pass) around select_trajectory_plot_data with a handler that logs and displays the exception (only when debug is enabled) and increments a visible diagnostic counter.
Add compact, structured diagnostics to the existing DEBUG expander (windows_count, party_map_count, centroids_sample, mp_positions_sample, helper_trace_count, helper_exception_string).
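
A minimal sketch of how the debug toggle could be resolved, assuming the checkbox and the environment variable are OR-ed together. The function name `debug_enabled` and the set of accepted truthy strings are illustrative assumptions; only the variable name EXPLORER_DEBUG_TRAJECTORIES comes from this document.

```python
import os

# Assumption: which strings count as "on" is not specified in the design.
_TRUTHY = {"1", "true", "yes", "on"}


def debug_enabled(checkbox_value=False, environ=None):
    """Return True when either the UI checkbox or the env var enables debug.

    `environ` defaults to os.environ; passing a plain dict keeps the
    function easy to unit test.
    """
    env = os.environ if environ is None else environ
    raw = env.get("EXPLORER_DEBUG_TRAJECTORIES", "")
    return bool(checkbox_value) or raw.strip().lower() in _TRUTHY
```

In the Streamlit UI the first argument would presumably be the return value of a checkbox such as `st.checkbox("Debug trajectories", value=False)`, so the default stays off unless someone opts in.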

Alternatives considered (brief)

Force-show MP fallback unconditionally. Pros: quickly confirm plotting plumbing works. Cons: noisy, may mask root cause and changes production behaviour.
Heavy refactor to move pure plotting logic into an import-safe separate module and run offline tests. Pros: clean separation and easier tests. Cons: slower and higher-risk for this urgent diagnosis.

I rejected both for the immediate work: the first can mask the root cause and changes production behaviour, and the second is heavier than necessary to learn the root cause.

Architecture (where changes live)

Explorer UI (explorer.py) — add debug checkbox and diagnostic panel wiring inside build_trajectories_tab.
Diagnostics collector (small helper in explorer_helpers.py or local helper) — produce structured status dicts (counts, samples) used by the UI.
Error surfacer — modify the select_trajectory_plot_data call-site to log exceptions (logger.exception) and, when debug enabled, call st.exception(...) or st.text_area(...) with the traceback.

Components and responsibilities

Debug toggle UI: checkbox + env var binding; enables/disables verbose diagnostics.
Diagnostic collector: pure helper that inspects positions_by_window, party_map, centroids, mp_positions and returns compact samples and counts.
Exception handler change: convert the broad except Exception: pass at the helper boundary into except Exception as e: logger.exception(e); diagnostic['select_helper_exception'] = traceback.format_exc(); if debug: st.exception(e).
Temporary UX: display a compact, clearly labeled diagnostics block inside the DEBUG expander. Make it obvious this is a temporary troubleshooting aid.
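
The diagnostic collector described above could look roughly like the following. This is a sketch under stated assumptions: it treats positions_by_window, party_map, centroids and mp_positions as dicts (or None), which this document does not confirm, and the names `collect_diagnostics` and `_sample` are hypothetical. The returned keys mirror the ones listed for the DEBUG expander.

```python
from itertools import islice


def _sample(mapping, limit=3):
    """Return up to `limit` items of a dict-like object as a plain dict."""
    if not mapping:
        return {}
    return dict(islice(mapping.items(), limit))


def collect_diagnostics(positions_by_window, party_map, centroids,
                        mp_positions, sample_size=3):
    """Summarise the Trajectories inputs without importing Streamlit.

    Assumption: every argument is a dict or None. Other container types
    (for example a DataFrame) would need their own handling.
    """
    return {
        "windows_count": len(positions_by_window or {}),
        "party_map_count": len(party_map or {}),
        "centroids_count": len(centroids or {}),
        "mp_positions_count": len(mp_positions or {}),
        "centroids_sample": _sample(centroids, sample_size),
        "mp_positions_sample": _sample(mp_positions, sample_size),
        # Filled in later by the call site around the plotting helper.
        "helper_trace_count": None,
        "helper_exception_string": None,
    }
```

Because the helper is pure, the UI can render its result with something like `st.json(...)` inside the DEBUG expander, and the unit tests can call it with synthetic dicts.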

Data flow (quick)

load_positions(db) -> positions_by_window, axis_def
diagnostic collector inspects positions_by_window and party_map
build_trajectories_tab calls select_trajectory_plot_data(...) inside a try/except
on success: use returned fig and trace_count to decide whether to call st.plotly_chart
on exception: diagnostic collector records traceback and UI shows it if debug enabled
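
To make the flow above observable, each early-return gate can be recorded in order so the debug panel shows which one stopped rendering. The sketch below is illustrative only: the gate names and the `trace_gates` function are assumptions, and the real gates in build_trajectories_tab may differ.

```python
def trace_gates(positions_by_window, axis_def, fig, trace_count):
    """Return (ordered gate results, name of the first failing gate or None).

    Each gate mirrors one point where the tab might return early before
    reaching st.plotly_chart.
    """
    gates = [
        ("positions_loaded", bool(positions_by_window)),
        ("axis_def_present", axis_def is not None),
        ("helper_returned_fig", fig is not None),
        ("helper_has_traces", bool(trace_count)),
    ]
    first_failed = next((name for name, ok in gates if not ok), None)
    return gates, first_failed
```

When `first_failed` is None, every gate passed and the chart call should have been reached. Otherwise the name points at the first condition to inspect.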

Error handling strategy

Do not swallow exceptions silently at the helper boundary. Always log with logger.exception(...).
Only surface full tracebacks to the Streamlit UI when debug mode is enabled.
Keep production behaviour unchanged when debug mode is off.
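
The strategy above might translate into a call-site wrapper along these lines. It is a sketch, not the actual code in explorer.py: it assumes the helper returns a (fig, trace_count) pair, and `call_plot_helper`, `surface` and the `helper_exception_count` key are hypothetical names. In the UI, `surface` would typically be `st.exception`.

```python
import logging
import traceback

logger = logging.getLogger(__name__)


def call_plot_helper(helper, diagnostics, debug=False, surface=None, **kwargs):
    """Call `helper`, recording any failure instead of swallowing it.

    Returns (fig, trace_count), or (None, 0) when the helper raised.
    """
    try:
        fig, trace_count = helper(**kwargs)
    except Exception as exc:
        # Always log, whether or not debug mode is on.
        logger.exception("select_trajectory_plot_data failed")
        diagnostics["select_helper_exception"] = traceback.format_exc()
        diagnostics["helper_exception_count"] = (
            diagnostics.get("helper_exception_count", 0) + 1
        )
        # Only show the error in the UI when debug mode is enabled.
        if debug and surface is not None:
            surface(exc)
        return None, 0
    diagnostics["helper_trace_count"] = trace_count
    return fig, trace_count
```

With debug off, the caller still receives (None, 0) after a failure, which keeps the existing "no chart" behaviour while leaving a traceback in the log.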

Testing approach

Unit tests for the diagnostic collector with synthetic positions_by_window covering: empty data, partial centroids, and full centroids.
Unit test that simulates the helper raising an exception (monkeypatch) and asserts that the exception is logged and (when debug enabled) that the diagnostics struct contains the exception string.
Manual reproduction: run Streamlit locally with EXPLORER_DEBUG_TRAJECTORIES=1 and the same DB used in production to capture the diagnostics panel and fix the underlying issue.
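
The exception-path unit test could be shaped like this. It is a self-contained sketch using only the standard library: `render` and the local `select_trajectory_plot_data` are stand-ins for the real call site and helper, and the logger name "explorer" is an assumption. The real test would patch the helper where explorer.py looks it up.

```python
import logging
import unittest
from unittest import mock

logger = logging.getLogger("explorer")


def select_trajectory_plot_data():
    """Stand-in for the real helper; the failure test replaces it."""
    return "fig", 1


def render(diagnostics):
    """Minimal stand-in for the call site in build_trajectories_tab."""
    try:
        return select_trajectory_plot_data()
    except Exception as exc:
        logger.exception("select_trajectory_plot_data failed")
        diagnostics["helper_exception_string"] = repr(exc)
        return None, 0


class HelperFailureTest(unittest.TestCase):
    def test_exception_is_logged_and_recorded(self):
        diagnostics = {}
        failing = mock.Mock(side_effect=RuntimeError("boom"))
        # Swap the module-level helper for one that raises.
        with mock.patch.dict(globals(), {"select_trajectory_plot_data": failing}):
            with self.assertLogs("explorer", level="ERROR") as captured:
                result = render(diagnostics)
        self.assertEqual(result, (None, 0))
        self.assertIn("boom", diagnostics["helper_exception_string"])
        self.assertIsNotNone(captured.records[0].exc_info)

    def test_success_leaves_diagnostics_untouched(self):
        diagnostics = {}
        self.assertEqual(render(diagnostics), ("fig", 1))
        self.assertEqual(diagnostics, {})
```

Saved as a test module, this can be run with `uv run python -m unittest`, in line with the project's environment conventions.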

Open questions

Can you reproduce the issue locally (same DB and same command to start Streamlit)? I assume yes and will base debug advice on that.
Are we allowed to enable a short-lived debug toggle in production logs if needed, or will you only run this locally?

I'm proceeding to create the design doc. Interrupt if you want changes.

Environment management (use uv, not pip)

We will not use pip directly. Use the project's uv tool to manage dependencies and run scripts so the environment is reproducible and follows local project conventions.

Recommended commands:

Add duckdb to the project virtual environment:
- uv add duckdb
Run the diagnostic CLI with debug enabled:
- EXPLORER_DEBUG_TRAJECTORIES=1 uv run python scripts/diagnose_trajectories_cli.py
Start Streamlit inside the uv-managed environment (example):
- uv run streamlit run pages/2_Explorer.py

Notes:

If the planner or any follow-up steps need to install or run packages, they should use uv add and uv run rather than pip install or direct interpreter calls.
If uv is not on PATH in a particular environment, prefer python -m uv or consult the project README/ARCHITECTURE.md for local developer environment instructions.
