CODE STYLE
==========

Purpose
-------
This document records the conventions already in use in the codebase so new contributors and AI
agents can produce code that fits the repository's existing style.

General
-------
- Language: Python (3.x)
- The project uses one file per module, with descriptive snake_case filenames (e.g., api_client.py, database.py)
- Top-level module singletons are exposed when a single shared instance is desired (e.g. `db = MotionDatabase()`)
- Keep code synchronous unless you introduce async consistently across modules (none currently use async/await)

Naming
------
- Files / modules: snake_case.py (e.g., motion_scraper -> scraper.py, api_client.py)
- Classes: PascalCase (e.g., MotionDatabase, MotionSummarizer, TweedeKamerAPI)
- Functions and methods: snake_case (including private helpers with a single leading underscore)
- Constants / config fields: UPPER_SNAKE_CASE (placed in config.py and referenced via `from config import config`)

File organization
-----------------
- Keep top-level domain modules in the repository root (this repo uses a flat layout)
- Each module should contain one primary responsibility (e.g., database.py for DB logic)
- Module-level singletons: create at module bottom and import from other modules (pattern used widely)

Imports
-------
- Group imports in this order with a blank line between groups:
  1. Standard library (datetime, json, typing)
  2. Third-party libraries (requests, duckdb, ibis, streamlit)
  3. Local imports (from config import config, from database import db)
- Use absolute imports (module name) rather than relative imports

Typing
------
- Add type hints to public function signatures where helpful (project uses typing in several places).
- Use typing.Dict, typing.List, typing.Optional for simple container annotations.
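
As a minimal sketch of this annotation style, the function below puts typing.Dict, typing.List and typing.Optional on a public signature. The name `find_motion` and the `id` / `title` record keys are hypothetical and chosen only for illustration; they are not taken from the codebase.

```python
import typing


def find_motion(
    motions: typing.List[typing.Dict[str, typing.Any]],
    motion_id: str,
) -> typing.Optional[typing.Dict[str, typing.Any]]:
    """Return the first motion whose 'id' matches, or None if there is none."""
    # NOTE: find_motion and the 'id' key are hypothetical, for illustration only.
    for motion in motions:
        if motion.get("id") == motion_id:
            return motion
    return None
```

Returning typing.Optional here makes the "not found" case visible to callers in the signature itself.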

Error handling & logging
------------------------
- Current pattern: functions catch broad Exception and print error messages, then return a safe default
  (False, [], None). Examples in database.py and api_client.py.
- When updating code, prefer to:
  - Keep the existing behavior (return safe fallback) to avoid breaking call sites
  - Consider adding structured logging (use logging module) rather than print, but maintain similar
    high-level error flows unless refactoring intentionally.
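
One possible way to move from print to the logging module while keeping the same high-level flow (catch, report, return a safe fallback) is sketched below. The helper `parse_motion_ids` is hypothetical and exists only to show the shape of the pattern.

```python
import json
import logging
import typing

logger = logging.getLogger(__name__)


def parse_motion_ids(raw: str) -> typing.List[str]:
    """Parse a JSON array of ids; return [] on any failure (safe fallback)."""
    # NOTE: parse_motion_ids is a hypothetical helper, for illustration only.
    try:
        data = json.loads(raw)
        return [str(item) for item in data]
    except Exception as e:
        # Same flow as the print-based pattern, reported through logging instead.
        logger.error("Error in parse_motion_ids: %s", e)
        return []
```

Call sites are unaffected by this change because the return type and the fallback value stay the same.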

LLM / external API calls
------------------------
- OpenAI-compatible client usage is in summarizer.py. Environment variables are read from config.py.
- Do NOT commit API keys or secrets. Use environment variables (OPENROUTER_API_KEY, etc.) and
  reference them by name.
- Network calls are synchronous using requests. Keep request timeouts and error handling consistent with
  existing patterns (catch requests.exceptions.RequestException and return safe fallback values).
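
The rule about secrets could look roughly like the sketch below, which reads OPENROUTER_API_KEY from the environment by name. The `Config` class and its `has_api_key` method are assumptions made for illustration; the actual layout of config.py may differ.

```python
import os
import typing


class Config:
    """Hypothetical stand-in for the object exposed by config.py."""

    def __init__(self) -> None:
        # Read the secret by name; its value never appears in source control.
        self.OPENROUTER_API_KEY: typing.Optional[str] = os.environ.get(
            "OPENROUTER_API_KEY"
        )

    def has_api_key(self) -> bool:
        # Report presence only, so the secret itself is never printed.
        return bool(self.OPENROUTER_API_KEY)


# Module-level instance, imported elsewhere via `from config import config`.
config = Config()
```

Checking `config.has_api_key()` before a call gives a place to return a safe fallback without echoing the key in an error message.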

Database patterns
-----------------
- Database is DuckDB stored at data/motions.db. The MotionDatabase class opens short-lived duckdb
  connections inside methods (conn = duckdb.connect(self.db_path)). This pattern is used widely.
- Queries and schema initialization happen inside MotionDatabase._init_database(). Keep DDL grouped there.
- When writing methods that modify DB, follow the try/except + conn.close() pattern to guarantee cleanup.
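
The short-lived connection pattern can be sketched as follows. To keep the example runnable with the standard library alone, sqlite3 stands in for duckdb here; in the real code the connection would come from duckdb.connect(self.db_path). The class `ExampleDatabase`, the method `insert_motion` and the `motions` table are hypothetical.

```python
import sqlite3  # stand-in for duckdb so the sketch needs no third-party package


class ExampleDatabase:
    """Hypothetical class illustrating the short-lived connection pattern."""

    def __init__(self, db_path: str):
        self.db_path = db_path
        self._init_database()

    def _init_database(self) -> None:
        # DDL stays grouped in one place.
        conn = sqlite3.connect(self.db_path)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS motions (id TEXT PRIMARY KEY, title TEXT)"
        )
        conn.commit()
        conn.close()

    def insert_motion(self, motion_id: str, title: str) -> bool:
        try:
            # Open a connection for this call only.
            conn = sqlite3.connect(self.db_path)
            conn.execute("INSERT INTO motions VALUES (?, ?)", (motion_id, title))
            conn.commit()
            conn.close()
            return True
        except Exception as e:
            print(f"Error in insert_motion: {e}")
            # Close the connection if it was opened before the failure.
            if 'conn' in locals():
                conn.close()
            return False
```

A duplicate primary key makes `insert_motion` return False rather than raise, which matches the safe-fallback convention described earlier.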

Testing
-------
- Currently the project uses ad-hoc test scripts (test.py). If adding tests, follow pytest conventions:
  - Place tests in tests/ directory
  - Use filenames test_*.py and functions test_* with assertions
  - Mock external APIs (requests, LLM client) via monkeypatch or unittest.mock
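
A test file following these conventions might look like the sketch below, which uses unittest.mock to replace an external client. The function under test, `fetch_titles`, and the client method `get_motions` are hypothetical; in a real test the function would be imported from its module instead of being defined in the test file.

```python
from unittest import mock


def fetch_titles(client) -> list:
    """Hypothetical function under test: returns titles, or [] on failure."""
    try:
        return [item["title"] for item in client.get_motions()]
    except Exception as e:
        print(f"Error in fetch_titles: {e}")
        return []


def test_fetch_titles_returns_titles():
    # The mock replaces the external API client, so no network call is made.
    client = mock.Mock()
    client.get_motions.return_value = [{"title": "A"}, {"title": "B"}]
    assert fetch_titles(client) == ["A", "B"]


def test_fetch_titles_falls_back_on_error():
    # side_effect makes the mocked call raise, exercising the fallback path.
    client = mock.Mock()
    client.get_motions.side_effect = RuntimeError("network down")
    assert fetch_titles(client) == []
```

Saved as tests/test_fetch_titles.py, both functions would be collected by pytest through the test_* naming rule.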

Patterns observed (use these when adding new code)
--------------------------------------------------
- Singletons: expose module-level instance (e.g. `db = MotionDatabase()`), import it elsewhere
- Private helpers: name with a single leading underscore (e.g., _get_voting_records)
- Config: centralize in config.py and reference via `from config import config` (don't hardcode paths)

Do's and Don'ts
---------------
Do:
- Follow existing naming: snake_case for files/functions
- Add simple type hints for clarity
- Return the same safe fallback values used in existing functions on error
- Use module-level singletons for shared services if helpful

Don't:
- Don't add async/await in a single module without broader coordination
- Don't print secret values or commit .env files
- Don't create circular imports (be careful when modules instantiate singletons at import time)

Example snippets
----------------
Conformant class and method:

    import typing

    import duckdb

    from config import config


    class ExampleService:
        def __init__(self, param: str = config.DATABASE_PATH):
            self.param = param

        def do_work(self, items: typing.List[dict]) -> bool:
            try:
                # short-lived DB/HTTP usage
                conn = duckdb.connect(config.DATABASE_PATH)
                # ... perform work
                conn.close()
                return True
            except Exception as e:
                print(f"Error in do_work: {e}")
                # close the connection if it was opened before the failure
                if 'conn' in locals():
                    conn.close()
                return False

Adding a new module
-------------------
1. Create snake_case file (e.g., new_service.py)
2. Add a PascalCase class implementing the behavior and small helper functions prefixed with _
3. If you need a shared instance, create `service = NewService()` at the module bottom
4. Import via `from new_service import service` in other modules