citeformer v0 — frozen spec¶
This is the frozen genesis spec that v1.0 is built against. Living docs (contract updates, architecture revisions) go in ../reference/; this file is the historical source.
Context¶
citeformer is a bulletproof way to generate verifiably cited text from language models — a philosophical successor to jsonformer (dormant since 2024) aimed at RAG-citation fabrication, not JSON schema compliance.
The problem: 14–95% of LLM-generated citations are fabricated in 2026 benchmarks (GhostCite); RAG systems hallucinate 3–13% of cited URLs; NeurIPS 2025 accepted ~50 papers with AI-generated fake references. No open-source library in 2026 provides citation markers that are structurally impossible to fabricate via constrained decoding. That’s the gap.
v0.1 target¶
Load any HF transformers model, pass retrieved chunks with metadata, get back well-cited text where referring to a non-existent source is literally impossible at the logit level, with claim-level NLI verification attached, and a deterministic APA/MLA/Chicago/IEEE/Vancouver reference list rendered by citeproc-py (not the model).
Positioning: provider-agnostic for local models only (HF, vLLM, llama.cpp). API providers land in v0.2+.
Decisions locked at spec time¶
Backends for v0.1: Local only — HF, vLLM, llama.cpp via XGrammar/llguidance. No OpenAI, Gemini, or Anthropic in v0.1.
v0.1 scope: Includes NLI-based
verify(). Ship when the full loop (generate → render → verify) is tight.Multi-language: Python-only library. Any future TS/Rust surface is a sibling repo, not a binding.
House style: Match
jellycell— src/ layout, hatchling + uv, Sphinx + furo, GitHub Actions + OIDC PyPI, pre-commit enabled.
Three §10 contracts¶
§10.1 — Citation marker grammar (
cite-id ::= "[" <digits> "]"), policy semantics.§10.2 —
Source.metadataCSL-JSON shape.§10.3 —
GenerationResult+VerificationReportpydantic schema_version.
Full ceremony in ../reference/contracts.md.
Phase plan — v0.1 deliverable¶
See ../reference/architecture.md for the full six-phase plan (P0 scaffolding → P6 verification + release). Exit criterion for each phase is mergeable; v0.1 lands at the end of P6.
Post-v0.1 roadmap (not in scope here)¶
API-provider backends (OpenAI, Gemini, Mistral) at schema-level enforcement tier.
Anthropic Citations-API adapter — a thin wrapper so users can compare; not “citeformer enforcing citations on Claude” since Anthropic’s native API already does that.
Advanced policies: streaming, user-supplied grammars, multi-lingual citation markers.
citeformer-tssibling — only if TypeScript demand materializes. Shares conventions, not code.Rust core extraction — unlikely unless profiling shows real grammar-compilation bottlenecks. Hot paths already live in XGrammar (C++) and llguidance (Rust).