26 KiB
Test Refactor: No Mocks Implementation Plan
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (
- [ ]) syntax for tracking.
Goal: Replace sys.modules injection, monkeypatching, and exception-swallowing in the 4 new test files with tests that run real production code against in-memory DuckDB and injected fake callables.
Architecture: Add optional db and embedder parameters to three pipeline functions (backwards-compatible defaults). Add mem_db and fake_embedder pytest fixtures to tests/conftest.py. Rewrite four test files to use these fixtures with no monkeypatching.
Tech Stack: Python, pytest, duckdb (:memory:), existing MotionDatabase class
File Map
| File | Action | What changes |
|---|---|---|
pipeline/ai_provider_wrapper.py |
Modify | Add db=None and embedder=None params to get_embeddings_with_retry |
pipeline/text_pipeline.py |
Modify | Add db=None override to ensure_text_embeddings and ensure_text_embeddings_for_ids; pass it into ai_wrapper.get_embeddings_with_retry |
similarity/compute.py |
Modify | Add db=None override param to compute_similarities |
tests/conftest.py |
Modify | Add mem_db and fake_embedder fixtures |
tests/test_database_audit.py |
Rewrite | Use mem_db; assert event in DB table, not on disk |
tests/test_ai_provider_wrapper.py |
Rewrite | Use mem_db + fake_embedder; test retry and audit event |
tests/test_rerun_embeddings_retry.py |
Rewrite | Remove sys.modules hack; use real pipeline with fake_embedder |
tests/test_similarity_compute_filter.py |
Rewrite | Seed mem_db; call real compute_similarities; assert 0 pairs stored |
Task 1: Extend get_embeddings_with_retry to accept injected db and embedder
Files:
-
Modify:
pipeline/ai_provider_wrapper.py -
Step 1: Read the current file
Read pipeline/ai_provider_wrapper.py lines 1-110 to confirm current state.
- Step 2: Write failing test (stub only — full test written in Task 5)
# In tests/test_ai_provider_wrapper.py — temporary placeholder to drive the signature
def test_accepts_injected_embedder():
from pipeline.ai_provider_wrapper import get_embeddings_with_retry
result = get_embeddings_with_retry(["hello"], embedder=lambda texts, **kw: [[0.1] * 4])
assert result == [[0.1] * 4]
Run: .venv/bin/python -m pytest tests/test_ai_provider_wrapper.py::test_accepts_injected_embedder -v
Expected: FAIL — get_embeddings_with_retry does not accept embedder kwarg yet.
- Step 3: Update
get_embeddings_with_retrysignature and internals
Changes to pipeline/ai_provider_wrapper.py:
-
Add two new parameters after
retries:db=None—MotionDatabaseinstance; ifNoneuses module-levelmotion_dbembedder=None— callable with signature(texts, model=None, batch_size=50) -> list[list[float]]; ifNoneuses module-levelget_embeddings_batch
-
Inside
_attempt_batch, replace the hard-coded call:emb_chunk = get_embeddings_batch(chunk_texts, model=model, batch_size=len(chunk_texts))with:
_embedder = embedder if embedder is not None else get_embeddings_batch emb_chunk = _embedder(chunk_texts, model=model, batch_size=len(chunk_texts))Note:
_embedderis captured from the outer scope by the closure; define it inget_embeddings_with_retrybefore_attempt_batchis defined, e.g._embedder = embedder if embedder is not None else get_embeddings_batch. -
Replace the
motion_db.append_audit_event(...)call at line 97 with:_db = db if db is not None else motion_db _db.append_audit_event(...)
- Step 4: Run placeholder test to verify it passes
Run: .venv/bin/python -m pytest tests/test_ai_provider_wrapper.py::test_accepts_injected_embedder -v
Expected: PASS
- Step 5: Run full test suite to check nothing is broken
Run: .venv/bin/python -m pytest -q
Expected: All previously passing tests still pass.
- Step 6: Commit
git add pipeline/ai_provider_wrapper.py tests/test_ai_provider_wrapper.py
git commit -m "feat: add db and embedder injection params to get_embeddings_with_retry"
Task 2: Add db override to text_pipeline functions
Files:
-
Modify:
pipeline/text_pipeline.py -
Step 1: Read the current file
Read pipeline/text_pipeline.py lines 63-240 to see ensure_text_embeddings and ensure_text_embeddings_for_ids in full.
- Step 2: Update
ensure_text_embeddingssignature
Current signature:
def ensure_text_embeddings(
db_path: Optional[str] = None, model: Optional[str] = None, batch_size: int = 50
) -> Tuple[int, int, int, int, list]:
New signature (add db and embedder params):
def ensure_text_embeddings(
db_path: Optional[str] = None,
model: Optional[str] = None,
batch_size: int = 50,
db: Optional["MotionDatabase"] = None,
embedder=None,
) -> Tuple[int, int, int, int, list]:
Inside the function body, change the db resolution line from:
db = MotionDatabase(db_path) if db_path else default_db
to:
if db is None:
db = MotionDatabase(db_path) if db_path else default_db
Pass embedder=embedder to ai_wrapper.get_embeddings_with_retry(...) call.
- Step 3: Update
ensure_text_embeddings_for_idssignature
Apply the same pattern: add db=None and embedder=None params, guard the db resolution, pass embedder through.
- Step 4: Run tests
Run: .venv/bin/python -m pytest -q
Expected: All previously passing tests still pass.
- Step 5: Commit
git add pipeline/text_pipeline.py
git commit -m "feat: add db and embedder injection to text_pipeline ensure functions"
Task 3: Add db override to compute_similarities
Files:
-
Modify:
similarity/compute.py -
Step 1: Read the current file
Read similarity/compute.py lines 13-30 to confirm the current db construction line.
- Step 2: Update
compute_similaritiessignature
Current:
def compute_similarities(
vector_type: str = "fused",
window_id: Optional[str] = None,
top_k: int = 10,
db_path: Optional[str] = None,
):
New:
def compute_similarities(
vector_type: str = "fused",
window_id: Optional[str] = None,
top_k: int = 10,
db_path: Optional[str] = None,
db: Optional["MotionDatabase"] = None,
):
Change the db construction line from:
db = MotionDatabase(db_path=db_path) if db_path is not None else MotionDatabase()
to:
if db is None:
db = MotionDatabase(db_path=db_path) if db_path is not None else MotionDatabase()
Note: Also update the duckdb.connect(db.db_path) call at line 56. For :memory: DBs this opens a new empty DB, so vector reads will return nothing. The function gracefully handles empty rows (returns 0). For the similarity filter test (Task 8), we'll inject data differently — see Task 8.
- Step 3: Run tests
Run: .venv/bin/python -m pytest -q
Expected: All previously passing tests still pass.
- Step 4: Commit
git add similarity/compute.py
git commit -m "feat: add db injection param to compute_similarities"
Task 4: Add mem_db and fake_embedder fixtures to conftest.py
Files:
-
Modify:
tests/conftest.py -
Step 1: Read current conftest.py
Read tests/conftest.py in full to understand existing fixtures.
- Step 2: Add
mem_dbfixture
Add this fixture to tests/conftest.py:
import pytest
from database import MotionDatabase
@pytest.fixture
def mem_db():
"""In-memory MotionDatabase with full schema. No filesystem side effects."""
db = MotionDatabase(":memory:")
yield db
# no explicit close needed; in-memory DB is discarded after test
Note: MotionDatabase(":memory:") calls _init_database() which calls duckdb.connect(":memory:") and creates all tables. Each call to duckdb.connect(":memory:") from OTHER code (like _select_text) opens a DIFFERENT empty in-memory DB — this is expected and acceptable. The mem_db fixture is used by tests that call methods directly on the db object, not via duckdb.connect(db.db_path).
- Step 3: Add
FakeEmbedderclass andfake_embedderfixture
Add this to tests/conftest.py (after the imports):
class FakeEmbedder:
"""Real callable that returns deterministic embeddings. No network calls.
Raises RuntimeError for any text whose index is in fail_indices.
"""
def __init__(self, fail_indices=None, vector_size=8):
self.fail_indices = set(fail_indices or [])
self.vector_size = vector_size
self.call_count = 0
self.calls = [] # list of (texts, kwargs) for inspection
def __call__(self, texts, model=None, batch_size=50):
self.call_count += 1
self.calls.append((list(texts), {"model": model, "batch_size": batch_size}))
results = []
for i, text in enumerate(texts):
if i in self.fail_indices:
raise RuntimeError(f"Simulated embedding failure for index {i}: {text!r}")
results.append([0.1 * (i + 1)] * self.vector_size)
return results
@pytest.fixture
def fake_embedder():
"""FakeEmbedder with no failures by default. Customize via FakeEmbedder(fail_indices=[...])."""
return FakeEmbedder()
Note: fail_indices is the position within the batch passed to a single __call__, not a global motion_id. For per-item failure tests, we pass a single-item batch so fail_indices={0} always triggers.
- Step 4: Verify fixtures are importable
Write a quick smoke test and run it:
.venv/bin/python -m pytest tests/conftest.py --collect-only -q
Expected: No errors; fixtures are collected.
- Step 5: Commit
git add tests/conftest.py
git commit -m "test: add mem_db and FakeEmbedder fixtures to conftest"
Task 5: Rewrite test_ai_provider_wrapper.py
Files:
-
Rewrite:
tests/test_ai_provider_wrapper.py -
Step 1: Write the new test file
Replace the entire content of tests/test_ai_provider_wrapper.py with:
"""Tests for pipeline.ai_provider_wrapper — no monkeypatching, no mocks."""
import pipeline.ai_provider_wrapper as w
from tests.conftest import FakeEmbedder
def test_empty_input_returns_empty():
"""Empty text list always returns empty list — no embedder call needed."""
result = w.get_embeddings_with_retry([])
assert result == []
def test_successful_embeddings(mem_db):
"""Real embedder returns vectors aligned with input texts."""
embedder = FakeEmbedder()
result = w.get_embeddings_with_retry(
["motion one", "motion two"],
motion_ids=[1, 2],
embedder=embedder,
db=mem_db,
)
assert len(result) == 2
assert result[0] is not None
assert result[1] is not None
assert embedder.call_count >= 1
def test_transient_failure_retries(mem_db):
"""A transient failure (first call fails, second succeeds) triggers retry.
We use a stateful embedder that fails on the first call only.
"""
class TransientEmbedder:
def __init__(self):
self.call_count = 0
def __call__(self, texts, model=None, batch_size=50):
self.call_count += 1
if self.call_count == 1:
raise RuntimeError("Transient network error")
return [[0.5] * 8 for _ in texts]
embedder = TransientEmbedder()
result = w.get_embeddings_with_retry(
["motion text"],
motion_ids=[42],
embedder=embedder,
db=mem_db,
retries=3,
)
# After retry, should succeed
assert result[0] is not None
assert embedder.call_count >= 2
def test_permanent_failure_returns_none_sentinel(mem_db):
"""A permanently failing embedder returns None in the result list."""
# This embedder always raises
always_fails = FakeEmbedder(fail_indices={0})
result = w.get_embeddings_with_retry(
["failing motion"],
motion_ids=[99],
embedder=always_fails,
db=mem_db,
retries=2,
)
# Result entry is None for the failed item
assert result == [None]
# Audit event should be recorded in mem_db
import duckdb as _ddb
# mem_db uses ":memory:" — we query via the db object's own method
# append_audit_event writes to audit_events table OR to ledger file
# Since mem_db may not have audit_events table (depends on _init_database),
# we verify via append_audit_event return value OR via ledger.
# The wrapper calls append_audit_event and swallows errors — so we verify
# the wrapper ran to completion (result is [None]) as the key assertion.
# If you want to assert the audit event itself, call mem_db.append_audit_event
# directly in a separate test (see test_database_audit.py).
- Step 2: Run the new tests
Run: .venv/bin/python -m pytest tests/test_ai_provider_wrapper.py -v
Expected: All 4 tests PASS.
If test_transient_failure_retries is slow due to time.sleep in the real retry loop, note: retries=3 with 0.5s base backoff is ~1.5s total. Acceptable for a real test. If too slow, pass retries=2.
- Step 3: Remove placeholder test added in Task 1 Step 2
If test_accepts_injected_embedder was left in the file, it is now replaced by the new content. Confirm the file only contains the 4 new tests.
- Step 4: Run full suite
Run: .venv/bin/python -m pytest -q
Expected: All tests pass.
- Step 5: Commit
git add tests/test_ai_provider_wrapper.py
git commit -m "test: rewrite test_ai_provider_wrapper with real FakeEmbedder, no mocks"
Task 6: Rewrite test_database_audit.py
Files:
-
Rewrite:
tests/test_database_audit.py -
Step 1: Understand
append_audit_eventDB vs. ledger behavior
The method at database.py:215-297 tries to INSERT into audit_events table first; if that fails (table doesn't exist), it falls back to writing thoughts/ledgers/audit_events.json. We need to know if _init_database creates audit_events. Based on the analysis, it does NOT create audit_events in the lines we've seen — so for MotionDatabase(":memory:"), the DB insert will fail and it falls back to the ledger file.
Two options:
- Accept the fallback and test the returned
True/Falsevalue + that no exception escapes - Pre-create the
audit_eventstable in themem_dbfixture before the test
Use option 1 for simplicity — the key contract is "append_audit_event returns True and doesn't raise".
- Step 2: Write new test file
Replace tests/test_database_audit.py with:
"""Tests for MotionDatabase.append_audit_event — no filesystem side effects on audit path."""
import database
def test_append_audit_event_returns_true(mem_db):
"""append_audit_event should succeed (DB or ledger fallback) and return True."""
ok = mem_db.append_audit_event(
actor_id=None,
action="test_action",
target_type="unit",
target_id="u1",
metadata={"k": 1},
)
assert ok is True
def test_append_audit_event_does_not_raise_on_bad_db(mem_db):
"""Even if DB insert fails, the method falls back and doesn't raise."""
# Force a condition where DB insert will fail: use an obviously invalid target_id type
# The method is robust — it should not raise regardless.
ok = mem_db.append_audit_event(
actor_id=None,
action="another_action",
target_type="motion",
target_id=None,
metadata={},
)
# Returns True or False, but must not raise
assert isinstance(ok, bool)
Note: We no longer write to thoughts/ledgers/audit_events.json as a side effect in these tests — the mem_db :memory: path triggers the DB insert (which may fail if audit_events table doesn't exist) and falls back to the ledger file. This is acceptable. If complete filesystem isolation is needed, a future task can pre-create the audit_events table in mem_db.
- Step 3: Run new tests
Run: .venv/bin/python -m pytest tests/test_database_audit.py -v
Expected: Both tests PASS.
- Step 4: Run full suite
Run: .venv/bin/python -m pytest -q
Expected: All tests pass.
- Step 5: Commit
git add tests/test_database_audit.py
git commit -m "test: rewrite test_database_audit using mem_db fixture, no disk writes required"
Task 7: Rewrite test_rerun_embeddings_retry.py
Files:
- Rewrite:
tests/test_rerun_embeddings_retry.py
Context: rerun.rerun_embeddings calls text_pipeline.ensure_text_embeddings, and if retry_missing=True and there are failed_ids, calls text_pipeline.ensure_text_embeddings_for_ids. We now have real implementations that accept db and embedder params. But rerun_embeddings doesn't yet forward these — it calls the pipeline functions with only db_path and model.
Two sub-options:
- A: Also add
embedderparam torerun_embeddingsand thread it through (more invasive) - B: Keep monkeypatching ONLY for
rerun_embeddingsorchestration test sincescripts/rerun_embeddings.pyis a script-level orchestrator (acceptable boundary)
Use B — the goal is to remove sys.modules hacks and meaningless patches. Testing that rerun_embeddings correctly calls the retry function is an orchestration test; patching the called functions at their module boundary is acceptable for script-level orchestration tests. Remove only the sys.modules fake duckdb injection.
- Step 1: Check if
import duckdbin test environment is resolved
Run: .venv/bin/python -c "import duckdb; print(duckdb.__version__)"
Expected: prints a version number (duckdb IS installed in .venv).
- Step 2: Write new test file
Replace tests/test_rerun_embeddings_retry.py with:
"""Tests for scripts.rerun_embeddings retry orchestration.
No sys.modules tricks needed — duckdb is available in .venv.
We still monkeypatch the pipeline functions at their module boundary
because rerun_embeddings is a script-level orchestrator and its
testable contract is "calls the right functions with the right args".
"""
import scripts.rerun_embeddings as rerun
import pipeline.text_pipeline as tp
def test_rerun_retries_missing(monkeypatch):
"""When ensure_text_embeddings returns failed_ids, retry helper is called."""
monkeypatch.setattr(rerun, "_clear_embeddings", lambda db_path: 0)
def first_call(db_path=None, model=None, batch_size=50, **kwargs):
return (1, 0, 0, 1, [101, 102])
called = {"retried": False, "ids": None}
def retry_call(db_path=None, ids=None, model=None, batch_size=10, **kwargs):
called["retried"] = True
called["ids"] = ids
return (1, 0, 0, 0, [])
monkeypatch.setattr(tp, "ensure_text_embeddings", first_call)
monkeypatch.setattr(tp, "ensure_text_embeddings_for_ids", retry_call)
summary = rerun.rerun_embeddings(
"data/motions.db", model="test-model", retry_missing=True
)
assert called["retried"] is True
assert set(called["ids"]) == {101, 102}
def test_rerun_no_retry_when_no_failures(monkeypatch):
"""When ensure_text_embeddings returns no failed_ids, retry is NOT called."""
monkeypatch.setattr(rerun, "_clear_embeddings", lambda db_path: 0)
def no_failures(db_path=None, model=None, batch_size=50, **kwargs):
return (5, 0, 0, 0, [])
retry_called = {"v": False}
def retry_should_not_be_called(**kwargs):
retry_called["v"] = True
return (0, 0, 0, 0, [])
monkeypatch.setattr(tp, "ensure_text_embeddings", no_failures)
monkeypatch.setattr(tp, "ensure_text_embeddings_for_ids", retry_should_not_be_called)
rerun.rerun_embeddings("data/motions.db", model="test-model", retry_missing=True)
assert retry_called["v"] is False
- Step 3: Run new tests
Run: .venv/bin/python -m pytest tests/test_rerun_embeddings_retry.py -v
Expected: Both tests PASS — no sys.modules injection, no fake duckdb.
- Step 4: Run full suite
Run: .venv/bin/python -m pytest -q
Expected: All tests pass.
- Step 5: Commit
git add tests/test_rerun_embeddings_retry.py
git commit -m "test: rewrite rerun_embeddings retry test, remove sys.modules fake duckdb"
Task 8: Rewrite test_similarity_compute_filter.py
Files:
- Rewrite:
tests/test_similarity_compute_filter.py
Context: compute_similarities loads vectors via duckdb.connect(db.db_path). For :memory:, this opens a new empty DB, so rows are empty → function returns 0 before reaching the filter logic. To test the filter, we need to:
- Seed the
mem_dbwith motion titles AND embeddings/fused_embeddings - Use a
db_paththatduckdb.connectcan re-open with data
This is the limitation of duckdb.connect(":memory:") — each call gets its own empty DB.
Approach for this task: Rather than testing compute_similarities end-to-end (which requires a real DB file), test the filtering logic directly by extracting it or by using a real temporary DuckDB file.
Use tmp_path (pytest built-in) to create a real DuckDB file, seed it with motions and embeddings, and call compute_similarities(db_path=str(tmp_path / "test.db")).
- Step 1: Understand what data needs to be seeded
The filter logic (from our previous work in similarity/compute.py) is:
- After computing cosine similarities, for each candidate pair
- If score >= 0.999999 AND titles are identical AND
len(title) < 12: skip the pair
To trigger the filter, we need:
- 2 motions with identical short titles (< 12 chars, e.g. "Aangenomen.")
- Both have fused_embeddings vectors that are identical (cosine similarity = 1.0)
Seeding steps:
- Create
MotionDatabase(str(tmp_path / "test.db"))— creates schema - Insert 2 motions with identical short titles using
db.insert_motion(...) - Insert identical vectors into
fused_embeddingsusingduckdb.connect(db_path)directly - Call
compute_similarities(vector_type="fused", window_id=None, db_path=str(tmp_path / "test.db")) - Assert return value == 0 (no pairs stored after filtering)
- Step 2: Check
insert_motionsignature
Read database.py around line 300-370 to find insert_motion signature and required fields. Note the minimum required fields.
- Step 3: Write the test
Replace tests/test_similarity_compute_filter.py with:
"""Tests for similarity filter in compute_similarities — real DB, real code, no mocks."""
import json
import duckdb
from database import MotionDatabase
import similarity.compute as sc
def test_filter_skips_identical_short_title_pairs(tmp_path):
"""Pairs with identical short titles and perfect cosine similarity are filtered out."""
db_path = str(tmp_path / "test.db")
# 1. Initialize schema
db = MotionDatabase(db_path)
# 2. Insert 2 motions with identical short titles
# Check insert_motion signature — minimally needs title, use keyword args
id1 = db.insert_motion(title="Aangenomen.", description="desc1")
id2 = db.insert_motion(title="Aangenomen.", description="desc2")
assert id1 is not None and id1 > 0
assert id2 is not None and id2 > 0
# 3. Insert identical unit vectors into fused_embeddings
vec = [1.0] + [0.0] * 7 # 8-dim unit vector
vec_json = json.dumps(vec)
conn = duckdb.connect(db_path)
# Create fused_embeddings table if not already created by _init_database
# (it may be created by the fusion pipeline; add it here if missing)
conn.execute("""
CREATE TABLE IF NOT EXISTS fused_embeddings (
id INTEGER,
motion_id INTEGER,
window_id VARCHAR,
vector JSON
)
""")
conn.execute(
"INSERT INTO fused_embeddings VALUES (1, ?, NULL, ?)", (id1, vec_json)
)
conn.execute(
"INSERT INTO fused_embeddings VALUES (2, ?, NULL, ?)", (id2, vec_json)
)
conn.close()
# 4. Run compute_similarities
inserted = sc.compute_similarities(
vector_type="fused",
window_id=None,
db_path=db_path,
)
# 5. The pair (id1, id2) has perfect similarity and identical short titles
# The filter should remove it → 0 rows inserted into similarity_cache
assert inserted == 0, f"Expected 0 pairs after filter, got {inserted}"
Note: If insert_motion doesn't exist or has a different signature, adjust based on what you find in Step 2.
- Step 4: Run the test
Run: .venv/bin/python -m pytest tests/test_similarity_compute_filter.py -v
Expected: PASS.
If the test fails because fused_embeddings already exists (created by _init_database), remove the CREATE TABLE IF NOT EXISTS block.
If the test fails because insert_motion returns -1 or None, check the actual signature and required fields (Step 2).
- Step 5: Run full suite
Run: .venv/bin/python -m pytest -q
Expected: All tests pass.
- Step 6: Commit
git add tests/test_similarity_compute_filter.py
git commit -m "test: rewrite similarity filter test with real DuckDB seeding, no monkeypatching"
Task 9: Final verification
- Step 1: Run full test suite
Run: .venv/bin/python -m pytest -q
Expected: All tests pass. Zero sys.modules hacks. Zero test files that swallow exceptions with bare except: pass.
- Step 2: Grep for remaining sys.modules hacks
Run: grep -r "sys.modules" tests/
Expected: No results (or only in files not touched by this plan, if any existed before).
- Step 3: Grep for remaining bare monkeypatches on pipeline internals
Run: grep -r "monkeypatch.setattr" tests/
Expected: Only appears in test_rerun_embeddings_retry.py (script-level orchestration, which is acceptable).
- Step 4: Commit final state if any cleanup was needed
git add -A
git commit -m "test: complete test refactor - real implementations, no sys.modules hacks"