CODE STYLE
Purpose
This document records the conventions already in use in the codebase so new contributors and AI agents can produce code that fits the repository's existing style.
General
- Language: Python (3.x)
- Project uses one file-per-module with descriptive snake_case filenames (e.g., api_client.py, database.py)
- Top-level module singletons are exposed when a single shared instance is desired (e.g., db = MotionDatabase())
- Keep code synchronous unless you introduce async consistently across modules (none currently use async/await)
Naming
- Files / modules: snake_case.py (e.g., motion_scraper -> scraper.py, api_client.py)
- Classes: PascalCase (e.g., MotionDatabase, MotionSummarizer, TweedeKamerAPI)
- Functions and methods: snake_case (including private helpers with a single leading underscore)
- Constants / config fields: UPPER_SNAKE_CASE (placed in config.py and referenced via from config import config)
File organization
- Keep top-level domain modules in the repository root (this repo uses a flat layout)
- Each module should contain one primary responsibility (e.g., database.py for DB logic)
- Module-level singletons: create at module bottom and import from other modules (pattern used widely)
Imports
- Group imports in this order, with a blank line between groups:
  - Standard library (datetime, json, typing)
  - Third-party libraries (requests, duckdb, ibis, streamlit)
  - Local imports (from config import config, from database import db)
- Use absolute imports (module name) rather than relative imports
Typing
- Add type hints to public function signatures where helpful (project uses typing in several places).
- Use typing.Dict, typing.List, typing.Optional for simple container annotations.
Error handling & logging
- Current pattern: functions catch broad Exception and print error messages, then return a safe default (False, [], None). Examples in database.py and api_client.py.
- When updating code, prefer to:
  - Keep the existing behavior (return a safe fallback) to avoid breaking call sites
  - Consider structured logging (the logging module) rather than print, but keep the same high-level error flow unless you are refactoring intentionally.
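One way the guidance above could look in practice: the broad catch and the safe fallback stay the same, and only the reporting moves from print to the logging module. The function parse_motion_ids is a hypothetical example, not code from the repository.

```python
import json
import logging
from typing import List

logger = logging.getLogger(__name__)


def parse_motion_ids(raw: str) -> List[str]:
    """Parse a JSON array of motion ids; return [] on any failure."""
    try:
        data = json.loads(raw)
        return [str(item) for item in data]
    except Exception as e:  # broad catch mirrors the existing convention
        logger.error("Error in parse_motion_ids: %s", e)
        return []  # same safe fallback a print-based version would return
```

Because the return values are unchanged, existing call sites that check for an empty list keep working.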
LLM / external API calls
- OpenAI-compatible client usage is in summarizer.py. Environment variables are read from config.py.
- Do NOT commit API keys or secrets. Use environment variables (OPENROUTER_API_KEY, etc.) and reference them by name.
- Network calls are synchronous using requests. Keep request timeouts and error handling consistent with existing patterns (catch requests.exceptions.RequestException and return safe fallback values).
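A rough sketch of reading secrets by name from the environment. The Config class below is a hypothetical stand-in for whatever config.py actually defines, and REQUEST_TIMEOUT_SECONDS is an assumed field name used only for illustration.

```python
import os
from typing import Optional


class Config:
    """Hypothetical stand-in for the object exposed by config.py."""

    def __init__(self) -> None:
        # Secrets come from the environment, never from source control.
        self.OPENROUTER_API_KEY: Optional[str] = os.environ.get("OPENROUTER_API_KEY")
        # Assumed field name (not from the repo): one shared timeout for network calls.
        self.REQUEST_TIMEOUT_SECONDS: float = float(
            os.environ.get("REQUEST_TIMEOUT_SECONDS", "30")
        )

    def has_api_key(self) -> bool:
        """Report whether a key is configured without exposing its value."""
        return bool(self.OPENROUTER_API_KEY)


config = Config()
```

Code that needs the key can check config.has_api_key() and log only that boolean, so the secret itself never reaches output.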
Database patterns
- Database is DuckDB stored at data/motions.db. The MotionDatabase class opens short-lived duckdb connections inside methods (conn = duckdb.connect(self.db_path)). This pattern is used widely.
- Queries and schema initialization happen inside MotionDatabase._init_database(). Keep DDL grouped there.
- When writing methods that modify DB, follow the try/except + conn.close() pattern to guarantee cleanup.
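The connection pattern above can be sketched as follows. To keep the example runnable with the standard library alone, it uses sqlite3 as a stand-in for duckdb; the connect / execute / close flow is the same shape. ExampleStore and its table are hypothetical.

```python
import sqlite3  # stand-in for duckdb so the sketch needs no third-party package


class ExampleStore:
    """Hypothetical class mirroring the MotionDatabase connection pattern."""

    def __init__(self, db_path: str) -> None:
        self.db_path = db_path
        self._init_database()

    def _init_database(self) -> None:
        """Keep all DDL grouped in one place."""
        conn = sqlite3.connect(self.db_path)
        try:
            conn.execute("CREATE TABLE IF NOT EXISTS motions (title TEXT)")
            conn.commit()
        finally:
            conn.close()

    def insert_title(self, title: str) -> bool:
        conn = None
        try:
            # Short-lived connection opened inside the method.
            conn = sqlite3.connect(self.db_path)
            conn.execute("INSERT INTO motions (title) VALUES (?)", (title,))
            conn.commit()
            return True
        except Exception as e:
            print(f"Error in insert_title: {e}")
            return False
        finally:
            # Runs on both success and failure, so the connection is always closed.
            if conn is not None:
                conn.close()
```

Placing conn.close() in a finally clause is one way to meet the cleanup requirement without repeating the call in both the success and error branches.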
Testing
- Currently the project uses ad-hoc test scripts (test.py). If adding tests, follow pytest conventions:
  - Place tests in a tests/ directory
  - Use filenames test_*.py and functions test_* with assertions
  - Mock external APIs (requests, LLM client) via monkeypatch or unittest.mock
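A small sketch of what such a test file might contain, for example tests/test_titles.py (a hypothetical filename). The function fetch_titles and the client's get_motions method are invented for illustration; in a real test the function under test would be imported from its module.

```python
from typing import List
from unittest.mock import Mock


def fetch_titles(client) -> List[str]:
    """Hypothetical function under test: returns motion titles, or [] on error."""
    try:
        return [m["title"] for m in client.get_motions()]
    except Exception as e:
        print(f"Error in fetch_titles: {e}")
        return []


def test_fetch_titles_returns_titles():
    # The mock replaces the external API client, so no network call is made.
    client = Mock()
    client.get_motions.return_value = [{"title": "Motie A"}, {"title": "Motie B"}]
    assert fetch_titles(client) == ["Motie A", "Motie B"]


def test_fetch_titles_returns_empty_list_on_error():
    # side_effect makes the mocked call raise, exercising the fallback path.
    client = Mock()
    client.get_motions.side_effect = RuntimeError("network down")
    assert fetch_titles(client) == []
```

Testing the error path explicitly pins down the safe-fallback value, so a later refactor cannot change it unnoticed.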
Patterns observed (use these when adding new code)
- Singletons: expose a module-level instance (e.g., db = MotionDatabase()) and import it elsewhere
- Private helpers: name with a single leading underscore (e.g., _get_voting_records)
- Config: centralize in config.py and reference via from config import config (don't hardcode paths)
Do's and Don'ts
Do:
- Follow existing naming: snake_case for files/functions
- Add simple type hints for clarity
- Return the same safe fallback values used in existing functions on error
- Use module-level singletons for shared services if helpful
Don't:
- Don't add async/await in a single module without broader coordination
- Don't print secret values or commit .env files
- Don't create circular imports (be careful when modules instantiate singletons at import time)
Example snippets
Conformant class and method:
    import typing

    import duckdb

    from config import config


    class ExampleService:
        def __init__(self, param: str = config.DATABASE_PATH):
            self.param = param

        def do_work(self, items: typing.List[dict]) -> bool:
            conn = None
            try:
                # Short-lived DB/HTTP usage
                conn = duckdb.connect(config.DATABASE_PATH)
                # ... perform work
                return True
            except Exception as e:
                print(f"Error in do_work: {e}")
                return False
            finally:
                # Close the connection on both the success and error paths
                if conn is not None:
                    conn.close()
Adding a new module
- Create snake_case file (e.g., new_service.py)
- Add a PascalCase class implementing the behavior and small helper functions prefixed with _
- If you need a shared instance, create service = NewService() at the module bottom
- Import via from new_service import service in other modules