| date | topic | status |
|---|---|---|
| 2026-03-30 | fix-missing-trajectories | draft |
Problem Statement
We're seeing empty/absent party trajectories in the Explorer "Partij Trajectories" tab despite compute_2d_axes producing windows and many parties having centroids. The UI shows no visible traces for selected parties in some runs, making the feature unreliable for end users.
Root-cause hypotheses: (A) the selected parties have only missing/None centroid values at plot time, (B) a runtime exception (e.g. float(None)) silently aborts trace creation, or (C) a label/party normalization mismatch filters out the traces. We need a low-risk, diagnostic-first fix that reveals which of these is happening and restores visible traces quickly.
Constraints
- Preserve public function names and locations: compute_2d_axes, classify_axes, load_positions, _build_party_axis_figure, build_trajectories_tab, build_compass_tab, _spline_smooth.
- Avoid large refactors; prefer reversible, minimal changes that surface diagnostics.
- Do not expose internal modal tokens ("As 1"/"As 2") to end users; use axis_classifier.display_label_for_modal(...) or choose_trajectory_title() where appropriate.
- Visual traces should remain smoothed; hover must include raw centroid values for auditability.
Chosen Approach (what we'll implement)
I'm choosing a minimal triage-first approach: add precise diagnostics and defensive conversions around plotting, so we either restore visible traces immediately or produce deterministic diagnostics that reveal the real data mismatch.
Why: low risk, fastest feedback loop. This will either fix simple runtime errors (safe float conversion, exceptions while adding traces) or provide clear evidence that deeper normalization changes are required.
Key changes:
- Add a small helper, safe_float(x): converts numeric-like values to floats and maps None/NaN/invalid -> float('nan') without raising.
- In build_trajectories_tab/_build_party_axis_figure:
  - Wrap per-party fig.add_trace(...) in try/except and log the exception with party id/name to the DEBUG expander instead of aborting the whole plot.
  - Emit per-selected-party diagnostics into the existing DEBUG expander: number of raw centroids, counts of non-NaN coordinates, example first 5 raw xs/ys, and lengths per window.
  - Replace direct float(...) casts on raw centroid values used in hover/customdata with safe_float.
- Ensure per-MP fallback plotting path still exists and can be forced via EXPLORER_FORCE_SHOW_TRAJECTORIES for diagnosis.
- Add unit tests for safe_float and targeted integration tests that assert traces are created when centroids contain NaNs and when party_map exists.
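A minimal sketch of what safe_float could look like. The `coerce_strings` flag is hypothetical and not part of the plan above; it is included only because string handling is still an open question, and it defaults to the no-coercion behaviour. Treating booleans as invalid is likewise an assumption.

```python
def safe_float(value, coerce_strings=False):
    """Return value as a float, or NaN when it cannot be used as a number.

    Sketch only: `coerce_strings` is a hypothetical switch, off by default,
    so numeric strings map to NaN unless explicitly enabled.
    """
    # None is the main offender: float(None) raises TypeError.
    # bool is technically an int; rejecting it is an assumption.
    if value is None or isinstance(value, bool):
        return float("nan")
    # Strings and bytes only pass through when coercion is requested.
    if isinstance(value, (str, bytes)) and not coerce_strings:
        return float("nan")
    try:
        return float(value)
    except (TypeError, ValueError, OverflowError):
        # Lists, dicts, non-numeric strings, oversized ints, etc.
        return float("nan")
```

NaN inputs pass through unchanged, so callers can apply the helper to every raw centroid value without checking first.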
Alternatives Considered
- Full normalization sweep: align party centroids to global windows (fill missing with NaN) and accept parties with at least one non-NaN value.
  - Pros: robust long-term fix, canonical data shape.
  - Cons: larger change surface, higher risk, slower to validate in production data.
- Refactor plotting pipeline to use a normalized DataFrame (rows=windows, cols=parties) and build traces from that canonical shape.
  - Pros: clearer data flow, easier testing.
  - Cons: larger refactor, touches many modules.
I considered both but rejected them for immediate work because we need quick deterministic diagnostics to determine if these larger efforts are warranted.
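For reference, the alignment step of the first alternative could be sketched roughly as follows. The function names and the input shape (a dict mapping window label to an `(x, y)` pair, with values that are numeric or None) are assumptions for illustration, not the project's actual data structures.

```python
import math


def align_to_windows(windows, centroids_by_window):
    """Return one (x, y) pair per global window, using NaN for gaps.

    Assumes centroids_by_window maps window label -> (x, y), where either
    coordinate may be None. Windows absent from the dict become (nan, nan).
    """
    nan = float("nan")
    aligned = []
    for window in windows:
        point = centroids_by_window.get(window)
        if point is None:
            aligned.append((nan, nan))
            continue
        x, y = point
        aligned.append((
            nan if x is None else float(x),
            nan if y is None else float(y),
        ))
    return aligned


def has_any_valid_point(aligned):
    """Accept a party if at least one window has both coordinates present."""
    return any(not math.isnan(x) and not math.isnan(y) for x, y in aligned)
```

With this shape every party has exactly `len(windows)` points, so a party with gaps is still plotted as long as `has_any_valid_point` is true.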
Architecture (high-level)
Inputs: positions_by_window (from compute_2d_axes), party_map, selected_parties.
Flow:
- compute_2d_axes -> positions_by_window
- load_positions / helpers -> party-centroid dicts keyed by party
- build_trajectories_tab calls _build_party_axis_figure to build per-party traces
- _build_party_axis_figure uses smoothing helpers (_spline_smooth) to produce visible traces and also builds hover customdata with raw centroid values (smoothed coords for the trace, raw values in customdata)
Intervention points: build_trajectories_tab and _build_party_axis_figure (small helper additions and safe conversion), plus tests and diagnostic output in the DEBUG expander.
Components and Responsibilities
- safe_float helper: convert inputs to float or return float('nan') safely. Centralized to avoid repeated float(None) errors.
- Diagnostic emitter: small utility used by build_trajectories_tab to format and write per-party diagnostic rows to the DEBUG expander.
- Plotly trace wrapper: per-party try/except around fig.add_trace that writes exception details to diagnostics instead of failing silently.
- Unit + integration tests: verify hover customdata creation, safe_float behaviour, trajectories rendered with partial centroids, and UI label mapping does not emit "As 1"/"As 2".
Data Flow (detailed)
- compute_2d_axes produces windows (time labels) and canonical positions_by_window.
- load_positions consumes positions_by_window and returns a mapping party -> list of centroids (one per window) where centroids may contain None/NaN for missing windows.
- build_trajectories_tab selects parties and for each party calls _build_party_axis_figure, which:
  - extracts raw xs_raw, ys_raw arrays aligned to windows
  - computes smoothed xs_plot, ys_plot via _spline_smooth
  - builds Plotly trace using xs_plot/ys_plot for the line and includes xs_raw/ys_raw in customdata with safe_float conversion
  - adds the trace inside a try/except and emits any exception + raw samples to debug
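The trace-building step above can be sketched as a plain dict in Plotly's scatter-trace layout. `build_party_trace` and `_to_float_or_nan` are hypothetical names; the latter is only a compact stand-in for safe_float so the example is self-contained. The sketch also assumes the smoothed arrays keep one point per window; if _spline_smooth resamples to a denser grid, the raw values would have to be mapped onto it differently.

```python
def _to_float_or_nan(value):
    # Stand-in for the safe_float helper; replace with the real one.
    try:
        return float(value)
    except (TypeError, ValueError):
        return float("nan")


def build_party_trace(party, xs_plot, ys_plot, xs_raw, ys_raw):
    """Build a scatter-trace dict: smoothed line, raw centroids in customdata."""
    if not (len(xs_plot) == len(ys_plot) == len(xs_raw) == len(ys_raw)):
        raise ValueError(f"length mismatch for party {party!r}")
    # One [raw_x, raw_y] row per plotted point; missing values become NaN.
    customdata = [
        [_to_float_or_nan(x), _to_float_or_nan(y)]
        for x, y in zip(xs_raw, ys_raw)
    ]
    return {
        "type": "scatter",
        "mode": "lines+markers",
        "name": party,
        "x": list(xs_plot),          # smoothed coordinates drive the line
        "y": list(ys_plot),
        "customdata": customdata,    # raw coordinates are shown on hover
        "hovertemplate": (
            party
            + "<br>raw x: %{customdata[0]:.3f}"
            + "<br>raw y: %{customdata[1]:.3f}<extra></extra>"
        ),
    }
```

Plotly's `fig.add_trace` accepts such a dict as long as it carries a `type` key, so the same structure can be asserted on in tests without rendering a figure.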
Error Handling
- Use safe_float to prevent float(None) and similar runtime TypeErrors when building hover/customdata.
- Use per-party try/except to avoid a single-party failure blanking the whole chart; log the error and continue plotting other parties.
- Show structured diagnostics in the existing DEBUG expander with these fields: party name, windows_count, raw_centroid_count, non_nan_count, sample_raw_xs, sample_raw_ys, exception (if any).
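One possible shape for the per-party loop and the diagnostic rows listed above. All names here are hypothetical, and `build_trace` and `add_trace` are passed in as callables (in the app, `add_trace` would presumably be `fig.add_trace`) so the sketch stays independent of Plotly and Streamlit.

```python
import math


def _is_valid(value):
    """True for real numbers that are not NaN; None and other types count as missing."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return not math.isnan(value)


def party_diagnostics(party, windows, xs_raw, ys_raw, exception=None):
    """Build one diagnostic row with the fields named in this section."""
    pairs = list(zip(xs_raw, ys_raw))
    return {
        "party": party,
        "windows_count": len(windows),
        "raw_centroid_count": len(pairs),
        "non_nan_count": sum(1 for x, y in pairs if _is_valid(x) and _is_valid(y)),
        "sample_raw_xs": list(xs_raw[:5]),
        "sample_raw_ys": list(ys_raw[:5]),
        "exception": repr(exception) if exception is not None else None,
    }


def add_traces_with_diagnostics(parties, windows, raw_by_party, build_trace, add_trace):
    """Add one trace per party; a failure is recorded and the loop continues."""
    rows = []
    for party in parties:
        xs_raw, ys_raw = raw_by_party.get(party, ([], []))
        error = None
        try:
            add_trace(build_trace(party, xs_raw, ys_raw))
        except Exception as exc:  # deliberately broad: one party must not blank the chart
            error = exc
        rows.append(party_diagnostics(party, windows, xs_raw, ys_raw, error))
    return rows
```

The returned rows are plain dicts, so they can be written to the DEBUG expander as a table and compared directly in tests.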
Testing Strategy
- Unit tests:
  - safe_float: None -> nan, '1.23' -> 1.23 (only if string coercion is adopted; see Open Questions), invalid -> nan
  - UI label helpers: axis_classifier.display_label_for_modal(...) and choose_trajectory_title() do not return raw "As 1"/"As 2"
- Integration tests (lightweight):
  - Build a synthetic positions_by_window with some None / NaN holes and assert that _build_party_axis_figure returns a Plotly trace object (or equivalent structure) and that customdata contains numeric or NaN values, with no exception raised.
  - Test that build_trajectories_tab's DEBUG expander receives the expected diagnostic entries for a party with missing centroids.
- Manual verification steps (later): run full Streamlit with duckdb/plotly installed and open Explorer -> Trajectories to confirm traces are visible for typical parties and inspect the DEBUG expander.
Open Questions
- Are there other UI locations still exposing raw modal labels? We should sweep the repo; the tests already added help with this, but they may not be exhaustive.
- Do we want safe_float to try to coerce numeric strings? My proposal is no coercion (only pass-through numeric types and map others -> nan) unless tests show string encodings exist in centroid data.
- If diagnostics show that many parties are missing centroids entirely, we'll need the full normalization sweep (alternative #1).