You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
 
 
 
motief/.mindmodel/patterns/duckdb-access.md

2.2 KiB

title category
DuckDB Access Pattern patterns

DuckDB Access Pattern

Rules

  • Prefer using read_only=True for compute-only subprocesses (e.g., SVD compute) to allow concurrent readers.
  • Prefer "with duckdb.connect(db_path, read_only=True) as conn" for scoped connections so conn.close() is automatic.
  • If a long-lived connection is created at module level, provide explicit close() or ensure operation is safe for Streamlit's lifecycle.
  • Prefer parameterizing db_path in pipelines and creating connections locally (avoid global connections that cross threads).

Examples

database.py - Explicit connect/close for schema init

conn = duckdb.connect(self.db_path)
...
conn.execute("""
    CREATE TABLE IF NOT EXISTS fused_embeddings (
        id INTEGER DEFAULT nextval('fused_embeddings_id_seq'),
        motion_id INTEGER NOT NULL,
        window_id TEXT NOT NULL,
        vector JSON NOT NULL,
        svd_dims INTEGER NOT NULL,
        text_dims INTEGER NOT NULL,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        PRIMARY KEY (id)
    )
""")
conn.close()

pipeline/svd_pipeline.py - Read-only connection

conn = duckdb.connect(db_path, read_only=True)
try:
    rows = conn.execute(
        "SELECT motion_id, mp_name, vote FROM mp_votes WHERE date BETWEEN ? AND ?",
        (start_date, end_date),
    ).fetchall()
finally:
    conn.close()

similarity/compute.py - Preferred 'with' context

try:
    import duckdb
except Exception:
    logger.exception("duckdb import failed; cannot load vectors")
    return 0

with duckdb.connect(db.db_path) as conn:
    rows = conn.execute(query, params).fetchall()

Anti-Patterns

Bad: Connection without closure

# BAD: connection may leak if exception occurs before explicit close
conn = duckdb.connect(db_path)
rows = conn.execute("SELECT ...").fetchall()
# missing finally/close

Remediation: Use "with" context or ensure conn.close() in finally block.

Bad: Parallel write connections

Problem: Opening write connections from many parallel workers without coordination.

Remediation: Open read_only for compute processes and centralize writes via short-lived connections or a single writer worker.