---
date: 2026-04-05
topic: motion-semantic-drift-over-time
---
# Motion Semantic Drift Analysis Over Time
## Problem Frame
The SVD explorer shows *where* parties and motions sit on axes at a point in time, but doesn't reveal *how the semantic content of motions on each axis evolves*. Users can see that axis 1 separates right-wing from left-wing motions, but can't answer questions like:
- Did "right-wing" motions on axis 1 become more extreme over time, or did the framing shift?
- Are the SVD axes themselves stable across windows, or do they capture different dimensions in different periods?
- How did individual parties' motion sponsorship patterns shift along axes over time?
This analysis would add a temporal dimension to the SVD analysis, revealing how political discourse evolved.
## Requirements
**Axis Stability**
- R1. Compute cosine similarity between SVD component vectors (or motion projection patterns) across all annual windows (2016-2024)
- R2. Generate a stability heatmap showing which axes are comparable across time (e.g., axis 1 in 2020 ≈ axis 1 in 2024)
- R3. Detect axis reordering — cases where axis N in window A ≈ axis M in window B where N ≠ M
- R4. Flag unstable axes where no clear match exists across windows (may indicate the dimension dissolved or merged)
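A minimal sketch of R1, R3 and R4, assuming each window's component vectors are available as rows of an array over a shared feature space (the document notes these are not currently persisted). The function names, the `min_sim` threshold and the toy arrays are hypothetical. Taking the absolute value of the cosine is an added assumption to handle the arbitrary sign of SVD axes.

```python
import numpy as np


def axis_similarity(components_a, components_b):
    """Absolute cosine similarity between every axis (row) of two windows."""
    a = components_a / np.linalg.norm(components_a, axis=1, keepdims=True)
    b = components_b / np.linalg.norm(components_b, axis=1, keepdims=True)
    # Assumption: the sign of an SVD axis is arbitrary, so |cos| is compared.
    return np.abs(a @ b.T)


def match_axes(sim, min_sim=0.5):
    """Classify each axis of window A by its best match in window B.

    min_sim is a placeholder threshold, not a value from the analysis.
    Returns (axis_a, axis_b, similarity, status) tuples with 1-based axes.
    """
    results = []
    for i, row in enumerate(sim):
        j = int(np.argmax(row))
        if row[j] < min_sim:
            status = "unstable"    # R4: no clear match in the other window
        elif j != i:
            status = "reordered"   # R3: axis N matches axis M with N != M
        else:
            status = "stable"
        results.append((i + 1, j + 1, float(row[j]), status))
    return results


# Toy data: 4 axes over 6 shared features (not real component vectors).
window_a = np.eye(4, 6)
window_b = np.array([
    [1, 0, 0, 0, 0, 0],    # same direction as axis 1 of window A
    [0, 0, 1, 0, 0, 0],    # axis 3 of window A, now at position 2
    [0, -1, 0, 0, 0, 0],   # axis 2 of window A, sign flipped, at position 3
    [0, 0, 0, 0, 0, 1],    # a direction with no counterpart in window A
], dtype=float)

matches = match_axes(axis_similarity(window_a, window_b))
for match in matches:
    print(match)
```

The full similarity matrix returned by `axis_similarity` is what a stability heatmap (R2) would plot for one pair of windows.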
**Semantic Drift**
- R5. For each stable axis, compute the average fused embedding centroid of top N motions per window (ranked by absolute SVD loading on that axis)
- R6. Track semantic drift over time using cosine distance between consecutive window centroids
- R7. Identify inflection points where drift accelerated (statistical change-point detection or threshold-based)
- R8. For each inflection point, show example motions before/after to illustrate framing shifts
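R5 and R6 could be sketched as follows, assuming each window provides a matrix of fused embeddings (one row per motion) and a vector of SVD loadings on one axis. The function names and the two-dimensional toy data are hypothetical. Real fused embeddings would have far more dimensions.

```python
import numpy as np


def axis_centroid(embeddings, loadings, top_n):
    """Mean embedding of the top_n motions ranked by absolute loading (R5)."""
    top = np.argsort(-np.abs(loadings))[:top_n]
    return embeddings[top].mean(axis=0)


def cosine_distance(u, v):
    """1 - cosine similarity; 0 means same direction, 2 means opposite."""
    return 1.0 - float(u @ v) / float(np.linalg.norm(u) * np.linalg.norm(v))


def drift_series(centroids):
    """Cosine distance between consecutive window centroids (R6)."""
    years = sorted(centroids)
    return {(a, b): cosine_distance(centroids[a], centroids[b])
            for a, b in zip(years, years[1:])}


# Toy data: (embeddings, loadings) per window, not real motions.
windows = {
    2016: (np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]),
           np.array([0.9, -0.1, 0.8])),
    2017: (np.array([[0.0, 1.0], [0.0, 2.0], [5.0, 5.0]]),
           np.array([0.7, -0.9, 0.1])),
}
centroids = {year: axis_centroid(emb, load, top_n=2)
             for year, (emb, load) in windows.items()}
drift = drift_series(centroids)
print(drift)
```

Ranking by absolute loading means motions from both poles of an axis contribute to the same centroid. Whether the poles should instead be tracked separately is a design choice this sketch does not settle.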
**Party-Level Drift**
- R9. For each party, compute their voting centroid per window along each stable axis (weighted average of motions they voted "voor" on)
- R10. Track how parties move along axes over time (party trajectory plots)
- R11. Detect cross-ideological voting — cases where left-wing parties vote "voor" on right-wing motions (or vice versa), and how this pattern changes over time
- R12. Show concrete examples: motions where parties voted against their ideological alignment, with before/after comparisons across windows
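A rough sketch of R9 and R10, assuming votes are available as `(party, motion_id, vote)` tuples and each window has a mapping from motion ID to its loading on one axis. These shapes, the function names and the party labels are hypothetical. The sketch uses an unweighted mean, since the weighting scheme for R9 is not specified here.

```python
from collections import defaultdict


def party_axis_positions(votes, loadings):
    """Mean axis loading of the motions each party voted 'voor' on."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for party, motion_id, vote in votes:
        if vote == "voor" and motion_id in loadings:
            sums[party] += loadings[motion_id]
            counts[party] += 1
    return {party: sums[party] / counts[party] for party in sums}


def party_trajectories(votes_by_window, loadings_by_window):
    """Per-party positions keyed by window, ready for a trajectory plot."""
    trajectories = defaultdict(dict)
    for window in sorted(votes_by_window):
        positions = party_axis_positions(votes_by_window[window],
                                         loadings_by_window[window])
        for party, position in positions.items():
            trajectories[party][window] = position
    return dict(trajectories)


# Toy data with placeholder party names and motion IDs.
votes_by_window = {
    2016: [("Party A", 1, "voor"), ("Party A", 2, "tegen"),
           ("Party A", 3, "voor"), ("Party B", 2, "voor")],
    2017: [("Party A", 4, "voor"), ("Party B", 4, "tegen"),
           ("Party B", 5, "voor")],
}
loadings_by_window = {
    2016: {1: 0.8, 2: -0.6, 3: 0.4},
    2017: {4: -0.2, 5: -0.5},
}
trajectories = party_trajectories(votes_by_window, loadings_by_window)
print(trajectories)
```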
**Output**
- R13. Script produces a markdown report with embedded charts (static, shareable)
- R14. Report includes: axis stability heatmap, semantic drift timelines, party trajectory plots, inflection point analysis with motion examples
- R15. Script is parameterized: `--db`, `--windows`, `--top-n`, `--output` for flexibility
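The flags in R15 could be wired up with `argparse` roughly as below. The flag names come from R15; the defaults, help strings and the `build_parser` helper are assumptions for illustration, not the script's actual interface.

```python
import argparse


def build_parser():
    """Command-line interface sketch for the drift script (defaults assumed)."""
    parser = argparse.ArgumentParser(
        description="Motion semantic drift analysis over annual windows")
    parser.add_argument("--db", required=True,
                        help="path to the database")
    parser.add_argument("--windows", nargs="+",
                        default=[str(year) for year in range(2016, 2025)],
                        help="annual windows to analyse")
    parser.add_argument("--top-n", type=int, default=20,
                        help="motions per axis used for each centroid")
    parser.add_argument("--output", default="reports/drift/report.md",
                        help="path of the generated markdown report")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Under these assumptions, an invocation might look like `python scripts/motion_drift.py --db <path> --top-n 50`.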
## Success Criteria
- Axis stability heatmap clearly shows which of the 10 SVD components are comparable across available annual windows (2016-2024)
- Semantic drift timelines reveal and characterize drift patterns; where inflection points exist, the analysis identifies at least 2-3 per affected axis
- Party-level analysis reveals cross-ideological voting patterns — cases where parties vote against their expected position on the axis
- Report is readable and actionable — a political analyst could use it to understand discourse evolution
- Script runs in under 2 minutes on the full dataset
## Scope Boundaries
- Annual windows only (2016-2024) — quarterly windows are too sparse for meaningful drift analysis. Note: 2025-2026 data may be incomplete; script should validate coverage before analysis.
- Fused embeddings (SVD + text) for semantic analysis, not raw SVD vectors alone
- No UI/explorer integration in this phase — script + report only
- No statistical significance testing beyond basic change-point detection (keep it interpretable)
## Key Decisions
- **Fused embeddings for semantic drift, SVD vectors for axis stability:** Fused embeddings (50 SVD + 2560 text dims) capture both voting patterns and semantic content, making them better for drift analysis. SVD component vectors are used for axis stability comparison (comparing vector directions, not semantic content). Note: SVD component vectors (V^T matrix) are not currently stored — axis stability must be computed indirectly.
- **Annual windows only:** Quarterly windows have too few motions (some have <100) for stable centroid computation
- **Cosine similarity for axis stability:** Standard metric for comparing vector directions; interpretable (1.0 = identical, 0.0 = orthogonal)
- **Centroid-based drift:** Average embedding of top motions per window is more stable than individual motion tracking and captures the axis's semantic center
## Dependencies / Assumptions
- SVD component vectors (V^T matrix) must be stored during SVD pipeline runs; they are currently not persisted. A new table or metadata field is needed for axis stability comparison
- Motion text is available in `motions` table for example extraction
- Party metadata is available in `mp_metadata`; motions can be linked to parties via voting records (`mp_votes`), not sponsorship data
## Outstanding Questions
### Resolve Before Planning
- [Affects R7][Technical] What change-point detection method? Simple threshold on drift rate (recommended: simpler and more interpretable, no new dependencies).
- [Affects R9][Technical] How to link motions to parties? Use voting records from `mp_votes` (party voted "voor" on motion). This measures party alignment patterns, not sponsorship.
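The threshold option for R7 could look roughly like this: flag a window transition when drift rises by more than a fixed amount relative to the previous transition. The function name, the threshold value and the drift numbers are placeholders, not measured results.

```python
def inflection_points(drift, threshold):
    """Return labels of transitions where drift accelerated past a threshold.

    drift: list of (label, cosine_distance) pairs in chronological order.
    threshold: minimum increase over the previous transition (assumed value).
    """
    flagged = []
    for (_, previous), (label, current) in zip(drift, drift[1:]):
        if current - previous > threshold:
            flagged.append(label)
    return flagged


# Toy drift series, not real results.
drift = [
    ("2016-2017", 0.10),
    ("2017-2018", 0.12),
    ("2018-2019", 0.45),
    ("2019-2020", 0.40),
]
flagged = inflection_points(drift, threshold=0.2)
print(flagged)
```

A fixed threshold keeps the rule easy to explain in the report, at the cost of needing a sensible value per axis or a shared one chosen by inspection.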
### Deferred to Planning
- [Affects R5][Needs research] What's the optimal N for top motions per axis? Too few = noisy, too many = dilutes the signal. Should test N=10, 20, 50.
- [Affects R13][Technical] Charting library: matplotlib recommended for static markdown embedding (Plotly already in stack but produces HTML; matplotlib produces static images suitable for markdown).
- [Affects R11][Needs research] What threshold counts as "cross-ideological voting"? Provisional: a party voting "voor" on motions where the canonical opposite-wing parties have high absolute loadings. E.g., left-wing party voting "voor" on motions where PVV/FVD/JA21/SGP have high positive scores.
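One simplified reading of the provisional R11 rule is sketched below: a "voor" vote counts as cross-ideological when the motion loads strongly on the side of the axis opposite to the party's expected side. The `party_side` mapping, the `min_abs_loading` cutoff, the party labels and the sign convention (positive meaning the right-wing pole) are all assumptions for illustration.

```python
def cross_ideological_votes(votes, loadings, party_side, min_abs_loading=0.5):
    """Flag 'voor' votes on motions that load strongly on the opposite side.

    votes: iterable of (party, motion_id, vote) tuples.
    loadings: {motion_id: loading on one axis}.
    party_side: {party: +1 or -1}, the party's expected side (assumed input).
    """
    flagged = []
    for party, motion_id, vote in votes:
        loading = loadings.get(motion_id)
        side = party_side.get(party)
        if vote != "voor" or loading is None or side is None:
            continue
        # Opposite signs mean the motion sits on the other pole of the axis.
        if abs(loading) >= min_abs_loading and loading * side < 0:
            flagged.append((party, motion_id))
    return flagged


# Toy data with placeholder party names and motion IDs.
party_side = {"Left Party": -1, "Right Party": 1}
loadings = {101: 0.9, 102: -0.7, 103: 0.2}
votes = [
    ("Left Party", 101, "voor"),
    ("Left Party", 102, "voor"),
    ("Left Party", 103, "voor"),
    ("Right Party", 102, "voor"),
    ("Right Party", 101, "tegen"),
]
flagged_votes = cross_ideological_votes(votes, loadings, party_side)
print(flagged_votes)
```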
## Results (2026-04-05)
### Key Finding: Stability and Overtone Shift are Independent Phenomena
Analysis of 9 annual windows (2016-2026) revealed that **axis stability** and **overtone shift** measure fundamentally different aspects of SVD axes over time:
| Metric | Measures | Typical Values | Implication |
|--------|----------|---------------|-------------|
| **Axis Stability** | Structural consistency of which motions/embeddings load on axis | 0.70-0.83 for stable axes | The same semantic pattern persists |
| **Overtone Shift** | How motion content evolves over time | 1.30-1.97 cosine distance | The specific topics change substantially |
**Critical insight**: Axes can be structurally stable (parties vote similarly across years) while their semantic content drifts dramatically (different motions define the axis).
### Axis Stability Results
| Axis | Stability | Status | Interpretation |
|------|-----------|--------|----------------|
| 1 | 0.83 | Stable | Coalition vs opposition pattern consistent |
| 2 | 0.75 | Stable | PVV/FVD populist positioning stable |
| 3 | 0.78 | Stable | Welfare vs market liberalism consistent |
| 4 | 0.72 | Stable | NSC/BBB vs D66/CDA/JA21 |
| 5 | 0.70 | Stable | Christian-social vs progressive-individual |
| **6** | **0.35** | **Reordered** | Migration/culture axis most volatile |
| 7 | 0.77 | Stable | Administrative pragmatism |
| 8 | 0.79 | Stable | Healthcare/education/regional housing |
| 9 | 0.76 | Stable | System reform vs practical governance |
| 10 | 0.74 | Stable | Regulation vs deregulation |
### Overtone Shift Results
All stable axes show **high overtone shift** (average 1.30-1.47, maximum 1.72-1.97), indicating significant semantic drift:
| Axis | Avg Shift | Max Shift | Inflection Points |
|------|-----------|-----------|------------------|
| 1 | 1.47 | 1.97 | 0 |
| 2 | 1.42 | 1.79 | 0 |
| 3 | 1.38 | 1.83 | 0 |
| 4 | 1.39 | 1.89 | 0 |
| 5 | 1.43 | 1.93 | 0 |
| 7 | 1.31 | 1.84 | 0 |
| 8 | 1.30 | 1.89 | 0 |
| 9 | 1.38 | 1.93 | 0 |
| 10 | 1.30 | 1.72 | 0 |
**No inflection points detected:** drift is gradual and continuous, not sudden.
### Methodological Notes
1. **Lasso regression (alpha=0.1)** was used for axis stability because:
   - The initial approach (Jaccard similarity of motion IDs) failed: motions are unique per window
   - Embedding centroid similarity failed: near-zero similarity due to varying dimensions
   - Ridge regression was less interpretable; Lasso concentrates signal
2. **Overtone shift** uses semantic gravity (weighted mean fused embedding) to track how the "center of mass" of motion content moves.
3. **No cross-ideological voting detected:** parties consistently vote within their ideological alignment.
### Implications
1. **Axis labels are temporally bounded:** Themes accurately describe 2016-2026, but specific motions differ
2. **Coalition/opposition dimension is remarkably stable:** Despite cabinet changes
3. **Axis 6 (Migration/Culture)** warrants monitoring: low stability suggests potential drift during axis updates
4. **Gradual drift, not sudden shifts:** Policy discourse evolves incrementally
### Deliverables
- `scripts/motion_drift.py`: Analysis script
- `reports/drift/report.md`: Generated report with charts
- `docs/research/2026-04-05-svd-overtone-shift-deep-dive.md`: Deep analysis document
## Next Steps
`/ce:plan` for structured implementation planning