ADR-005: Metadata-fetch and render deps go in main, not extras
Status: Accepted (P3/P4, 2026-04-23).
Context
The P0 scaffolding stratified optional dependencies aggressively: hf, render, meta, pdf, url, vllm, llamacpp, verify. The rationale was “let users pick what they need.” In P3/P4 we reassessed:
- Rendering (citeproc-py, or eventually our own; see ADR-004) is a core value prop of citeformer. You don’t use this library without wanting rendered references.
- Metadata adapters (httpx, diskcache, readability-lxml, lxml, pypdf) are small, pure-Python-or-wheel, and the most common user workflow goes through at least one of them.
- The heavy ML deps (torch, transformers, xgrammar, accelerate) are genuinely optional: users who only ever call API-provider backends (future v0.2+) don’t need them. And even today, a user who brings their own HF-loaded model + tokenizer could hypothetically skip the hf extra.
Decision
Move httpx, diskcache, readability-lxml, lxml, and pypdf into main dependencies. Remove the now-redundant render, pdf, url, and meta extras.

(Originally citeproc-py was also promoted here, but the home-grown render rewrite removed it entirely. The six built-in formatters have no runtime citeproc-py dependency.)

Keep:

- hf (torch + transformers + xgrammar + llguidance + accelerate): heavy ML stack.
- vllm (Linux/CUDA only; see ADR-006).
- llamacpp (the llama-cpp-python wheel).
- verify (NLI; lands in P6).
- docs, examples, all, dev: developer / user-convenience bundles.
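Because the heavy stacks stay behind extras, code paths that touch them need to fail with a clear install hint when the extra is absent. The following is a minimal sketch of one way to do that, not citeformer's actual implementation: the helper names `missing_modules` and `require_extra` and the `EXTRA_MODULES` mapping are hypothetical, and the import names listed per extra are assumptions based on the packages named above.

```python
import importlib.util

# Hypothetical mapping: extra name -> top-level modules it is assumed to provide.
# These import names are illustrative assumptions, not taken from citeformer.
EXTRA_MODULES = {
    "hf": ("torch", "transformers", "xgrammar", "llguidance", "accelerate"),
    "vllm": ("vllm",),
    "llamacpp": ("llama_cpp",),
}


def missing_modules(extra: str) -> list[str]:
    """Return the modules of `extra` that cannot be found in this environment."""
    return [
        name
        for name in EXTRA_MODULES[extra]
        # find_spec returns None for a top-level module that is not installed,
        # without importing the (potentially very heavy) module itself.
        if importlib.util.find_spec(name) is None
    ]


def require_extra(extra: str) -> None:
    """Raise ImportError with an install hint if `extra` is not fully installed."""
    missing = missing_modules(extra)
    if missing:
        raise ImportError(
            f"Missing optional modules: {', '.join(missing)}. "
            f"Install them with: pip install 'citeformer[{extra}]'"
        )
```

A backend module could call `require_extra("hf")` at the top of its entry point, so users on the base install see an actionable message instead of a bare `ModuleNotFoundError` from deep inside an import chain.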
Consequences
- pip install citeformer is ~30 MB of deps (vs. the original ~10 MB of just pydantic/typer/rich). Tolerable trade for a working out-of-the-box render + metadata pipeline.
- Users who genuinely want the minimal base can no longer get it. Feedback so far suggests no one wants that.
- CI install shrinks: the dev extra stopped pulling in hf-level ML deps in P3, so uv sync --extra dev installs only docs + test tooling + main deps. Fast.
- With ADR-004, citeproc-py left the dependency tree entirely. Main deps shrank accordingly; the rewrite is fully self-contained.
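The resulting split between main and extra-gated dependencies can be inspected from installed package metadata. The sketch below assumes the distribution is published under the name `citeformer`; the helpers `split_by_extra` and `installed_layout` are hypothetical. It uses a simple regular expression on the `extra == "..."` marker rather than a full PEP 508 parser, so treat it as an approximation.

```python
import re
from importlib.metadata import PackageNotFoundError, requires

# Matches the `extra == "name"` clause that marks a requirement as extra-gated.
_EXTRA_RE = re.compile(r"""extra\s*==\s*['"]([^'"]+)['"]""")


def split_by_extra(requirements):
    """Group Requires-Dist strings into (main_deps, {extra: deps})."""
    main, by_extra = [], {}
    for req in requirements:
        # Everything after the first ';' is the environment marker, if any.
        spec, _, marker = req.partition(";")
        match = _EXTRA_RE.search(marker)
        if match:
            by_extra.setdefault(match.group(1), []).append(spec.strip())
        else:
            main.append(spec.strip())
    return main, by_extra


def installed_layout(dist_name="citeformer"):
    """Return the split for an installed distribution, or None if it is absent."""
    try:
        reqs = requires(dist_name) or []
    except PackageNotFoundError:
        return None
    return split_by_extra(reqs)
```

Under the decision recorded here, one would expect httpx, diskcache, readability-lxml, lxml, and pypdf to land in the main list and torch to appear only under the hf key. A check like this could run in CI to catch a heavy dependency drifting into the base install.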