# Database Schema (DuckDB) — extracted DDL

## Rules

- Use DuckDB for persistent storage when it is available; fall back to JSON files when the `duckdb` package is not installed (database.py).
- Keep schema migrations additive (database.py uses `ALTER TABLE ... ADD COLUMN IF NOT EXISTS`).

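The additive-migration rule can be sketched as follows. This is a minimal illustration, not the code in database.py: the helper names, the `ADDITIVE_COLUMNS` list, and the choice of which columns count as later additions are all assumptions. The recording class stands in for a real DuckDB connection so the sketch runs without `duckdb` installed.

```python
# Hypothetical list of columns added after the first release: (table, column, SQL type).
# Which columns database.py actually adds this way is an assumption.
ADDITIVE_COLUMNS = [
    ("motions", "body_text", "TEXT"),
    ("motions", "externe_identifier", "TEXT"),
]


def additive_migration_statements(columns=ADDITIVE_COLUMNS):
    """Build idempotent ALTER TABLE statements that only ever add columns."""
    return [
        f"ALTER TABLE {table} ADD COLUMN IF NOT EXISTS {column} {sql_type}"
        for table, column, sql_type in columns
    ]


def apply_additive_migrations(con, columns=ADDITIVE_COLUMNS):
    """Run the statements on any object exposing execute(sql)."""
    statements = additive_migration_statements(columns)
    for sql in statements:
        con.execute(sql)
    return statements


class RecordingConnection:
    """Hypothetical stand-in for a DuckDB connection; it only records SQL."""

    def __init__(self):
        self.executed = []

    def execute(self, sql):
        self.executed.append(sql)


con = RecordingConnection()
applied = apply_additive_migrations(con)
```

Because every statement carries `IF NOT EXISTS`, running the helper repeatedly against the same database should be harmless, which is what makes additive migrations safe to apply at startup.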
## Examples (DDL snippets extracted from database.py)

### motions table

```sql
CREATE TABLE IF NOT EXISTS motions (
    id INTEGER DEFAULT nextval('motions_id_seq'),
    title TEXT NOT NULL,
    description TEXT,
    date DATE,
    policy_area TEXT,
    voting_results JSON,
    winning_margin FLOAT,
    controversy_score FLOAT,
    layman_explanation TEXT,
    externe_identifier TEXT,
    body_text TEXT,
    url TEXT UNIQUE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id)
);
```

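The `id` default calls `nextval('motions_id_seq')`, so the sequence has to exist before the table is created. The sketch below shows one way to order the statements. The sequence names come from the DDL in this document; the helper names and the recording class are hypothetical, and the real ordering logic in database.py may differ.

```python
# Sequence names as they appear in the DDL snippets of this document.
SEQUENCES = [
    "motions_id_seq",
    "mp_votes_id_seq",
    "embeddings_id_seq",
    "fused_embeddings_id_seq",
]


def sequence_statements(names=SEQUENCES):
    """Build one CREATE SEQUENCE statement per sequence name."""
    return [f"CREATE SEQUENCE IF NOT EXISTS {name}" for name in names]


def init_schema(con, table_ddl):
    """Run the sequence DDL first, then the table DDL, on any object with execute(sql)."""
    statements = sequence_statements() + list(table_ddl)
    for sql in statements:
        con.execute(sql)
    return statements


class RecordingConnection:
    """Hypothetical stand-in for a DuckDB connection; it only records SQL."""

    def __init__(self):
        self.executed = []

    def execute(self, sql):
        self.executed.append(sql)


# Shortened table DDL, for illustration only.
MOTIONS_DDL = (
    "CREATE TABLE IF NOT EXISTS motions ("
    "id INTEGER DEFAULT nextval('motions_id_seq'), "
    "title TEXT NOT NULL, "
    "PRIMARY KEY (id))"
)

con = RecordingConnection()
init_schema(con, [MOTIONS_DDL])
```

With a real connection in place of `RecordingConnection`, the same call order would create all four sequences before any table that references them.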
### mp_votes table

```sql
CREATE TABLE IF NOT EXISTS mp_votes (
    id INTEGER DEFAULT nextval('mp_votes_id_seq'),
    motion_id INTEGER NOT NULL,
    mp_name TEXT NOT NULL,
    party TEXT,
    vote TEXT NOT NULL,
    date DATE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id)
);
```

### embeddings / fused_embeddings

```sql
CREATE TABLE IF NOT EXISTS embeddings (
    id INTEGER DEFAULT nextval('embeddings_id_seq'),
    motion_id INTEGER NOT NULL,
    model TEXT,
    vector JSON NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id)
);

CREATE TABLE IF NOT EXISTS fused_embeddings (
    id INTEGER DEFAULT nextval('fused_embeddings_id_seq'),
    motion_id INTEGER NOT NULL,
    window_id TEXT NOT NULL,
    vector JSON NOT NULL,
    svd_dims INTEGER NOT NULL,
    text_dims INTEGER NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id)
);
```

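Both tables store `vector` in a `JSON` column, so a vector has to be serialized before it is written and parsed after it is read. A minimal round-trip sketch follows. The function names are hypothetical, and it assumes the vector is a flat list of numbers and that the driver hands the JSON value back as text; how database.py actually serializes vectors is not shown in this document.

```python
import json


def encode_vector(vector):
    """Serialize a flat numeric vector to a JSON array string."""
    return json.dumps([float(x) for x in vector])


def decode_vector(payload):
    """Parse a JSON array string back into a list of floats."""
    return [float(x) for x in json.loads(payload)]


stored = encode_vector([0.25, -1.0, 3])
restored = decode_vector(stored)
```

Coercing every element to `float` on both sides keeps integer inputs and float inputs from producing different stored forms for the same vector.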
## Anti-patterns

- Broad try/except around the `duckdb` import (top of database.py). This is acceptable for an optional dependency, but the code should explicitly log that the dependency is missing, and the resulting test behavior should be documented.

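A narrower import guard might look like the sketch below. It catches only `ImportError` and logs the fallback explicitly. The logger name, the message text, and the `storage_backend` helper are assumptions for illustration, not the code in database.py.

```python
import logging

logger = logging.getLogger("database")  # hypothetical logger name

try:
    import duckdb  # optional dependency
except ImportError:
    # Catch only the missing-package case so that other errors still surface.
    duckdb = None
    logger.warning("duckdb is not installed; falling back to JSON file storage")


def storage_backend():
    """Report which backend is active: 'duckdb' if importable, else 'json'."""
    return "duckdb" if duckdb is not None else "json"


backend = storage_backend()
```

Tests can then assert on `storage_backend()` to make clear which path a given environment is exercising.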
## Remediations

- Add a simple migration/versioning table (`schema_version`) to track schema changes and apply migrations deterministically.
- Add tests that exercise both the DuckDB-backed and the JSON-fallback database paths. Evidence: database.py contains JSON fallback logic (lines ~1-80).

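One possible shape for the `schema_version` remediation is sketched below. The migration list, the function names, and the table layout are all hypothetical. The demo runs against the standard-library `sqlite3` module purely so that it executes without `duckdb` installed; the helper only relies on `execute(...)` and `fetchone()`, which a DuckDB connection also provides, but it has not been verified against database.py.

```python
import sqlite3  # stand-in for the demo only; not the project's storage engine

# Hypothetical ordered migrations: (version, SQL). Not taken from database.py.
MIGRATIONS = [
    (1, "CREATE TABLE IF NOT EXISTS motions (id INTEGER, title TEXT NOT NULL)"),
    (2, "ALTER TABLE motions ADD COLUMN body_text TEXT"),
]


def current_version(con):
    """Return the highest applied version, creating the tracking table if needed."""
    con.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = con.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def migrate(con, migrations=MIGRATIONS):
    """Apply pending migrations in version order and record each one."""
    applied = []
    version = current_version(con)
    for target, sql in sorted(migrations):
        if target > version:
            con.execute(sql)
            con.execute("INSERT INTO schema_version (version) VALUES (?)", [target])
            applied.append(target)
    return applied


con = sqlite3.connect(":memory:")
first_run = migrate(con)   # applies versions 1 and 2
second_run = migrate(con)  # nothing left to apply
```

Recording each version as it is applied makes the outcome depend only on the stored version number, so a second run is a no-op rather than a repeat of every `ALTER TABLE`.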
## Evidence pointers

- database.py: DDL strings and sequences (lines ~1-300 and further). See the `CREATE TABLE` blocks for motions, mp_votes, embeddings, and fused_embeddings.