Development
CaseGraph uses strict validation from the first bootstrap slice. Run the full gate before finishing substantive implementation work.
For the runnable local product path, start with Quickstart. For command and route details, use CLI Reference and API Reference.
Python
uv sync --all-groups --python 3.13
docker compose up -d postgres
uv run --all-groups --python 3.13 alembic upgrade head
uv run --all-groups --python 3.13 ruff format --check .
uv run --all-groups --python 3.13 ruff check .
uv run --all-groups --python 3.13 mypy src examples tests scripts
uv run --all-groups --python 3.13 pyright
uv run --all-groups --python 3.13 pytest tests/unit tests/integration --cov=casegraph --cov-report=term-missing --cov-fail-under=95
Default tests do not call OpenAI. The optional live provider smoke is skipped unless explicitly enabled and an API key is configured:
export OPENAI_API_KEY=sk-...
CASEGRAPH_RUN_LIVE_OPENAI=1 \
uv run --all-groups --python 3.13 pytest tests/integration/test_model_calls_live_openai.py -q
The live test uses the configured default model (gpt-5.4-mini unless overridden by
CASEGRAPH_OPENAI_DEFAULT_MODEL) and records normal model-run provenance in the local database.
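The opt-in gate and the model override described above can be pictured as two small pure functions. This is a minimal sketch, not CaseGraph's actual implementation: the function names are hypothetical, and it assumes that only the literal value 1 enables the live test and that an empty override falls back to the default.

```python
from collections.abc import Mapping

# Default model name quoted in this section.
FALLBACK_MODEL = "gpt-5.4-mini"


def live_openai_enabled(env: Mapping[str, str]) -> bool:
    """Return True only when the opt-in flag and an API key are both present."""
    # Assumption: only the literal value "1" turns the live test on.
    flag_on = env.get("CASEGRAPH_RUN_LIVE_OPENAI") == "1"
    has_key = bool(env.get("OPENAI_API_KEY", "").strip())
    return flag_on and has_key


def resolve_default_model(env: Mapping[str, str]) -> str:
    """Use the override when it is set and non-empty, else the default."""
    override = env.get("CASEGRAPH_OPENAI_DEFAULT_MODEL", "").strip()
    return override or FALLBACK_MODEL
```

In a test module, a marker such as `pytest.mark.skipif(not live_openai_enabled(os.environ), reason="live OpenAI smoke is opt-in")` would be one way to apply this kind of gate.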
Model evaluation is separate from graph provenance. The fake eval suite is safe for local and CI
checks and writes ignored reports under .casegraph/model-evals/:
Live model evals are opt-in and require the explicit live flag plus a server-side OpenAI API key:
Dogfood checks are the broader confidence path before changing provider-facing code:
For local dogfooding, prefer putting OPENAI_API_KEY=... in the git-ignored .env.local file; real
process environment variables still take precedence.
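The precedence rule can be illustrated with a small merge function. This is a sketch under stated assumptions: the helper names are hypothetical, the parser handles only simple KEY=VALUE lines, and CaseGraph's real loader may behave differently in edge cases.

```python
from collections.abc import Mapping


def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding single or double quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def effective_env(file_values: Mapping[str, str],
                  process_env: Mapping[str, str]) -> dict[str, str]:
    """Merge both sources; process environment variables win on conflict."""
    merged = dict(file_values)
    merged.update(process_env)
    return merged
```

With this ordering, a key exported in the shell replaces the same key from .env.local, while keys that appear only in the file are still picked up.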
Scenario checks are local end-to-end dogfood tests for CaseGraph implementation and UX. They let the main developer try the workflow as a user would, so implementation bugs, confusing flows, weak defaults, missing docs, and poor errors are caught early. They are development validation, not a user-facing CaseGraph capability. The local suite is deterministic and either provider-free or backed by a fake provider:
Live provider scenarios are opt-in release gates:
Use these checks when adding or changing workflow-level behavior.
Adapter scenarios are dependency-specific so native CaseGraph work does not require every
framework package. The Pydantic-AI adapter check is provider-free, but it requires the
casegraph[pydantic-ai] extra, which is installed in the repo dev group:
If the optional dependency is missing, adapter code should fail with the install command, not with a raw import error.
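One common way to meet this requirement is to wrap the optional import and re-raise with an actionable message. The sketch below is illustrative only: `MissingOptionalDependency` and `require_optional` are hypothetical names, and the pip command in the message is an assumed form of the install hint rather than CaseGraph's actual wording.

```python
import importlib
from types import ModuleType


class MissingOptionalDependency(RuntimeError):
    """Raised when an adapter's optional package is not installed."""


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional module, or fail with the install command."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Assumption: the extra is installable as casegraph[<extra>].
        raise MissingOptionalDependency(
            f"{module_name} is not installed. "
            f"Install it with: pip install 'casegraph[{extra}]'"
        ) from exc
```

An adapter module could then call `require_optional("pydantic_ai", "pydantic-ai")` at the point where it first needs the framework, so native CaseGraph code paths never trigger the import.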
The local suite includes the checked-in minimal transactional pack scenario. To run only that path:
uv run --all-groups --python 3.13 casegraph scenarios run \
--scenario custom_pack_transactional_loop \
--json
Expected result: status=passed and replay in_sync.
The native worker authoring scenario starts from a generated transactional scaffold, edits the
generated decorated WorkerOutput function, validates the pack, runs author-check, runs the
generated demo, and checks replay:
uv run --all-groups --python 3.13 casegraph scenarios run \
--scenario greenfield_native_worker_loop \
--json
The decorated worker scenario is the current greenfield authoring dogfood path. It verifies that
generated workers use @worker(...) plus explicit context.step(...) instrumentation:
uv run --all-groups --python 3.13 casegraph scenarios run \
--scenario greenfield_decorated_worker_loop \
--json
The cover inspection scenario is the first existing-system migration dogfood path. It creates a temporary fake agentic codebase, runs static inspection, and checks that prompts, model calls, tools, structured outputs, unsafe external writes, and tests/examples are reported without importing the target:
uv run --all-groups --python 3.13 casegraph scenarios run \
--scenario cover_existing_system_inspection_loop \
--json
The worker-step scenario starts from a generated transactional scaffold, edits a
context.step(...) worker, verifies the synthetic step marker provenance, runs the generated demo,
and checks replay:
uv run --all-groups --python 3.13 casegraph scenarios run \
--scenario greenfield_worker_step_loop \
--json
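When several of the scenarios above need to run in one pass, a thin wrapper can build the same command line for each name. This is a convenience sketch, not part of CaseGraph: `scenario_command` and `run_scenarios` are hypothetical helpers, and treating a non-zero exit code as failure is an assumption this page does not confirm.

```python
import subprocess
from collections.abc import Sequence

# Scenario names taken from the commands in this section.
SCENARIOS = (
    "custom_pack_transactional_loop",
    "greenfield_native_worker_loop",
    "greenfield_decorated_worker_loop",
    "cover_existing_system_inspection_loop",
    "greenfield_worker_step_loop",
)


def scenario_command(name: str) -> list[str]:
    """Build the documented CLI invocation for one scenario."""
    return [
        "uv", "run", "--all-groups", "--python", "3.13",
        "casegraph", "scenarios", "run",
        "--scenario", name,
        "--json",
    ]


def run_scenarios(names: Sequence[str] = SCENARIOS) -> dict[str, int]:
    """Run each scenario in turn and map its name to the exit code."""
    results: dict[str, int] = {}
    for name in names:
        completed = subprocess.run(scenario_command(name), check=False)
        results[name] = completed.returncode
    return results
```

Calling `run_scenarios()` from the repository root runs the five scenarios sequentially; inspecting the JSON output of each run is left to the caller because its schema is not described here.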
For generated read-only packs, use the authoring check before running the generated demo against Postgres:
make pack-author-check PACK_ROOT=.casegraph/my_readonly_pack
uv run --all-groups --python 3.13 casegraph packs author-check \
--pack-root .casegraph/my_readonly_pack \
--json
author-check is provider-free and database-free. It validates pack structure, renders prompt
expectation files, and dry-runs declared workers through the native WorkerOutput path or the
advanced read-only provenance fixture contract.
Web
The Playwright e2e command expects Postgres migrations to be applied and starts real FastAPI and
Vite dev servers for the browser smoke path. Use make demo-check for the full local demo gate; it
defaults to e2e ports 8120 and 5174, which can be overridden with CASEGRAPH_E2E_API_PORT and
CASEGRAPH_E2E_WEB_PORT.
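The port defaults and overrides can be sketched as a small resolver. The helper names are hypothetical, and the range check is an added safeguard rather than documented CaseGraph behavior.

```python
from collections.abc import Mapping


def _port(env: Mapping[str, str], name: str, default: int) -> int:
    """Read one port from the environment, falling back to the default."""
    raw = env.get(name, "").strip()
    port = int(raw) if raw else default
    if not 1 <= port <= 65535:
        raise ValueError(f"{name} must be between 1 and 65535, got {port}")
    return port


def e2e_ports(env: Mapping[str, str]) -> tuple[int, int]:
    """Return (api_port, web_port) using the documented defaults."""
    api_port = _port(env, "CASEGRAPH_E2E_API_PORT", 8120)
    web_port = _port(env, "CASEGRAPH_E2E_WEB_PORT", 5174)
    return api_port, web_port
```

Passing `os.environ` gives the ports a local run would use; overriding only one variable leaves the other at its default.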
To regenerate browser screenshots used by the documentation:
The command seeds the deterministic support reference-pack case and writes screenshots under
docs/assets/screenshots/.
Documentation
The documentation site is built with MkDocs Material and deployed to GitHub Pages from the
dedicated Docs workflow.
uv run --all-groups --python 3.13 mkdocs build --strict
uv run --all-groups --python 3.13 mkdocs serve
Keep mkdocs.yml navigation aligned with checked-in documentation whenever files are added,
renamed, or removed. Internal docs navigation should point to local MkDocs pages such as
README.md, quickstart.md, or api-reference.md, not GitHub blob URLs.
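The navigation rule lends itself to a small lint over the parsed nav structure. The sketch below is hypothetical and not part of the repository: it assumes the nav section of mkdocs.yml has already been loaded into Python lists and dicts, and that the caller supplies the set of checked-in page paths relative to the docs directory.

```python
from collections.abc import Iterator


def iter_nav_targets(nav: list) -> Iterator[str]:
    """Yield every page target from a nested MkDocs nav list."""
    for item in nav:
        if isinstance(item, str):
            yield item  # bare path entry
        elif isinstance(item, dict):
            for value in item.values():
                if isinstance(value, str):
                    yield value  # "Title: path" entry
                elif isinstance(value, list):
                    yield from iter_nav_targets(value)  # nested section


def nav_problems(nav: list, existing_pages: set[str]) -> list[str]:
    """Report GitHub blob URLs and local targets with no matching page."""
    problems: list[str] = []
    for target in iter_nav_targets(nav):
        if "github.com" in target and "/blob/" in target:
            problems.append(f"GitHub blob URL in nav: {target}")
        elif "://" not in target and target not in existing_pages:
            problems.append(f"nav entry has no matching page: {target}")
    return problems
```

An empty result means every nav entry points at a local page that exists; `mkdocs build --strict` remains the authoritative check for the site as a whole.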