ADR-004: Replace citeproc-py with a home-grown formatter
Status: Accepted and implemented (2026-04-23).
Context
P3 shipped rendering via citeproc-py. It works but has accumulated friction:
- Chicago page-range bug: `UnboundLocalError` in citeproc-py’s `minimal-two` page-range formatter for multi-page citations. We worked around it by using single-page test data.
- APA double period: `"Poe, E. A.. (1845). …"`. citeproc-py appends a period after the given-name initials regardless of existing punctuation.
- Noisy warnings: every render prints a `UserWarning` about unsupported CSL-JSON fields (`indexed`, `reference-count`, …) that Crossref’s transform response includes.
- Vancouver gap: there is no canonical `vancouver.csl` in the upstream styles repo (see ADR-003). Implementing Vancouver ourselves is trivial; translating a CSL style file for it isn’t.
- Maintenance concentration: citeproc-py’s CSL test suite passes at ~60%, and a volunteer team of three maintains the project. Our ability to fix edge cases upstream is limited.
- Home-grown feel: the rest of citeformer is small, owned code. A render layer built on an external library with known quirks is inconsistent with that ethos.
Decision
Rewrite `citeformer.render` as a home-grown formatter. Specifically:
- Remove the `citeproc-py` dependency entirely, from both the main deps and the initial `citeproc-compat` extra. Users who want the “any of 10,000 CSL files” escape hatch install `citeproc-py` themselves and feed it our CSL-JSON-compliant `Source.metadata`; no shim is needed from us.
- Implement six styles procedurally in Python: APA 7, MLA 9, Chicago (author-date), IEEE, Nature, and Vancouver. Each style is a `CitationFormatter` subclass with `inline(item: CSLItem, number: int) -> str` and `bibliography(item: CSLItem, number: int) -> str` methods.
- The styles still consume CSL-JSON as the input shape (`Source.metadata`). §10.2 doesn’t change. Users can keep feeding us Crossref / arXiv output verbatim.
- Author a `.claude/skills/add-citation-format/SKILL.md` that documents the exact template, test matrix, and edge cases to cover when implementing a new style. Adding a seventh style becomes a 30-minute skill-driven task.
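The formatter contract described in this decision could look roughly like the sketch below. It is not the actual citeformer code: only the class name `CitationFormatter` and the two method signatures come from this ADR. The `CSLItem` alias, the abstract-base layout, and the `ToyNumericFormatter` style are assumptions made purely for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any

# Assumption: CSLItem is treated here as a plain CSL-JSON mapping.
CSLItem = dict[str, Any]


class CitationFormatter(ABC):
    """Contract named in this ADR: one subclass per style, two methods."""

    @abstractmethod
    def inline(self, item: CSLItem, number: int) -> str: ...

    @abstractmethod
    def bibliography(self, item: CSLItem, number: int) -> str: ...


class ToyNumericFormatter(CitationFormatter):
    """Hypothetical numeric style; NOT one of the six shipped styles."""

    def inline(self, item: CSLItem, number: int) -> str:
        return f"[{number}]"

    def bibliography(self, item: CSLItem, number: int) -> str:
        authors = ", ".join(
            f"{a.get('family', '')} {a.get('given', '')}".strip()
            for a in item.get("author", [])
        )
        # CSL-JSON stores dates as {"date-parts": [[year, month, day]]}.
        date_parts = item.get("issued", {}).get("date-parts", [[None]])
        year = date_parts[0][0] if date_parts and date_parts[0] else None
        pieces = [
            p for p in (authors, item.get("title"), str(year) if year else None) if p
        ]
        return f"{number}. " + ". ".join(pieces) + "."


item: CSLItem = {
    "type": "article-journal",
    "author": [{"family": "Poe", "given": "Edgar Allan"}],
    "title": "The Raven",
    "issued": {"date-parts": [[1845]]},
}
fmt = ToyNumericFormatter()
print(fmt.inline(item, 1))
print(fmt.bibliography(item, 1))
```

Because each style is ordinary Python, a quirk in one style's punctuation or name handling can be fixed in that one class without touching a shared style-file interpreter.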
Consequences
- Full control over output. The Chicago page-range crash, the APA double period, and the noisy warnings are all fixed by construction. The six built-ins go through zero citeproc-py code paths.
- Vancouver joined the bundle (six styles total: APA 7, MLA 9, Chicago author-date, IEEE, Nature, Vancouver).
- Initially we shipped a `citeproc-compat` extra as a placeholder for a future compat module, but since it had no implementation we removed it to keep the promised surface honest. Anyone wanting arbitrary CSL files today can `pip install citeproc-py` themselves; `Reference` / `Source.metadata` stay CSL-JSON compliant.
- Loss of “10,000 styles for free” out of the box. Mitigated by the `add-citation-format` skill at `.claude/skills/add-citation-format/SKILL.md` (adding a new style is a 30-minute skill-driven task) and by the option to drop down to `citeproc-py` directly.
- Implementation landed at `src/citeformer/render/formatters/` (one file per style; `_base.py` for shared helpers: `Author`, `parse_authors`, `parse_year`, `ensure_period`, `format_page_range`, `format_doi`, `get_str`, `get_title`). Total ~1,200 LOC across formatters, 60+ granular tests, 6 snapshot tests, 1 skill file.
- `citeproc-py` removed from main dependencies. Bundled `.csl` files deleted (`apa.csl`, `modern-language-association.csl`, `chicago-author-date.csl`, `ieee.csl`, `nature.csl`).
- Public-API changes: `load_style()` and `resolve_style_path()` are gone (they were citeproc-py-specific). Callers use `get_formatter(name)` to obtain a formatter object.
- §10.2 (`Source.metadata` as CSL-JSON) unchanged. §10.3 (output schemas) unchanged. Apart from the removed loader functions, this is a pure implementation-layer refactor from users’ perspective.
- ADR-003 marked superseded in this round.
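The "fixed by construction" claim rests on small shared helpers such as `ensure_period` and `format_page_range`. This ADR names them but does not give their signatures or behaviour, so the sketch below is an assumption about how such helpers might work, not the code in `_base.py`. It shows one way the APA double period and multi-page ranges could be handled.

```python
import re


def ensure_period(text: str) -> str:
    """Append a period only if the text lacks terminal punctuation.

    Assumed behaviour; the real helper in _base.py may differ.
    """
    stripped = text.rstrip()
    if not stripped:
        return stripped
    return stripped if stripped[-1] in ".!?" else stripped + "."


def format_page_range(pages: str, separator: str = "\u2013") -> str:
    """Normalise a CSL-JSON `page` value such as "123-130".

    Assumed behaviour: single pages pass through unchanged, and the
    full end page is kept (no abbreviation such as 123-30).
    """
    match = re.fullmatch(r"\s*(\w+)\s*[-\u2013\u2014]+\s*(\w+)\s*", pages)
    if match is None:
        return pages.strip()
    first, last = match.groups()
    return f"{first}{separator}{last}"


print(ensure_period("Poe, E. A."))   # already ends in a period, left alone
print(ensure_period("The Raven"))    # period appended
print(format_page_range("123-130"))  # hyphen replaced by an en dash
print(format_page_range("45"))       # single page, unchanged
```

Keeping the punctuation check in one helper means every style that ends a name list or title calls the same guard, rather than each style appending its own period.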