citeformer.citeformer

The Citeformer orchestrator — the public entry point for generation.

Composes a Backend with a citation policy and a CSL style. Calls the backend to produce raw text with inline [N] markers, parses those markers into Citation objects, renders the reference list, and packages the whole thing as a GenerationResult.

Reference rendering uses the home-grown formatters in citeformer.render.formatters — see ADR-004 for the rationale.

Streaming:

  • Citeformer.stream() returns a StreamingResult — iterate to consume chunks as they’re decoded, then call .finalize() to get the full GenerationResult. See the class docstring below for the usage pattern.

Module Contents

Classes

Citeformer

High-level orchestrator for generating citation-backed text.

StreamingResult

Iterable wrapper over a backend’s streaming output.

AsyncStreamingResult

Async-iterable wrapper over a backend’s async streaming output.

Functions

deduplicate_adjacent_cites

Collapse runs of adjacent [N] markers to the unique ids only.

API

citeformer.citeformer.deduplicate_adjacent_cites(text: str) → str

Collapse runs of adjacent [N] markers to the unique ids only.

The REQUIRED policy’s grammar allows cite-group ::= cite-id (ws cite-id)* — more than one citation between content and sent-end. Small instruction-tuned models under REQUIRED often emit runs like [1] [2] [3] [1] [2] [3] [1] when closing a sentence where they wanted to cite something out-of-scope: the grammar forces progress, but the model fills the cite-group by cycling valid ids.

This helper rewrites each such run to contain each cite id at most once, preserving order of first appearance. For example, [1] [2] [3] [1] [2] becomes [1] [2] [3]. Single markers are untouched.

Args: text: The generated text (GenerationResult.text).

Returns: The same string with adjacent-cite runs deduplicated. Non-citation content is copied verbatim.

Example:

>>> deduplicate_adjacent_cites("Foo [1] [2] [3] [1] [2]. Bar [4].")
'Foo [1] [2] [3]. Bar [4].'
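The run-collapsing behaviour described above can be approximated with a regular expression. The following is a minimal sketch, not the library's implementation: the name dedupe_adjacent_cites_sketch and both patterns are hypothetical, and it assumes bracket-style markers and that a collapsed run is re-joined with single spaces.

```python
import re

# Assumption: a "run" is two or more [N] markers separated only by whitespace.
_CITE_RUN = re.compile(r"\[\d+\](?:\s*\[\d+\])+")
_CITE_ID = re.compile(r"\[(\d+)\]")


def dedupe_adjacent_cites_sketch(text: str) -> str:
    """Collapse each run of adjacent [N] markers to its unique ids, first-seen order."""

    def _collapse(match):
        seen = []
        for cite_id in _CITE_ID.findall(match.group(0)):
            if cite_id not in seen:
                seen.append(cite_id)
        # Assumption: markers in a collapsed run are joined by one space.
        return " ".join(f"[{cite_id}]" for cite_id in seen)

    # Text outside matched runs is copied through unchanged by re.sub.
    return _CITE_RUN.sub(_collapse, text)


if __name__ == "__main__":
    print(dedupe_adjacent_cites_sketch("Foo [1] [2] [3] [1] [2]. Bar [4]."))
```

Because only runs of two or more markers match, a lone marker such as [4] never enters the callback and is left as written.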

class citeformer.citeformer.Citeformer(backend: citeformer.backends.base.Backend, style: str = 'apa-7', citation_policy: citeformer.core.Policy = Policy.REQUIRED, marker_style: citeformer.core.MarkerStyle = MarkerStyle.BRACKET)

High-level orchestrator for generating citation-backed text.

Wraps a Backend with a citation policy and a CSL style. References are rendered deterministically by the home-grown formatters in citeformer.render.formatters — the model never emits bibliography text.

Example:

>>> from citeformer import Citeformer, Source
>>> from citeformer.backends import MockBackend
>>> sources = [Source(metadata={"id": "a", "type": "book"}, content="…")]
>>> cf = Citeformer(backend=MockBackend())
>>> result = cf.generate(prompt="hi", sources=sources)
>>> "[1]" in result.text
True

Attributes:
  backend: The backend used to generate raw text.
  style: CSL style identifier (e.g. "apa-7") for the reference renderer.
  citation_policy: Default citation enforcement policy.

Initialization

Construct a Citeformer.

Args:
  backend: Backend instance to delegate generation to.
  style: CSL style identifier for reference rendering (see bundled_style_names() for available styles).
  citation_policy: Default citation enforcement policy for generate().
  marker_style: Shape of inline citation markers. Default is MarkerStyle.BRACKET ([N]). Swap for PAREN ((N)), CURLY ({N}), or CARET (^N) when the downstream pipeline already reserves square brackets. The structural guarantee (digit enum bounded by len(sources)) holds identically across styles.
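To illustrate why the bound on ids is independent of the marker shape, here is a small sketch that extracts ids for each of the four shapes named above. The MarkerStyleSketch enum, its regex values, and extract_ids are hypothetical stand-ins written for this example; they are not the library's MarkerStyle or its parser.

```python
import re
from enum import Enum


class MarkerStyleSketch(Enum):
    """Hypothetical stand-in: one regex per marker shape, capturing the digits."""

    BRACKET = r"\[(\d+)\]"  # [N]
    PAREN = r"\((\d+)\)"    # (N)
    CURLY = r"\{(\d+)\}"    # {N}
    CARET = r"\^(\d+)"      # ^N


def extract_ids(text: str, style: MarkerStyleSketch, num_sources: int) -> list[int]:
    """Return the in-scope citation ids found in text for the given marker shape."""
    ids = [int(digits) for digits in re.findall(style.value, text)]
    # The same 1..num_sources bound applies whatever the marker shape is.
    return [i for i in ids if 1 <= i <= num_sources]


if __name__ == "__main__":
    # Square brackets are used for something else here, so CURLY avoids a clash.
    print(extract_ids("See [link] and {2}, also {7}.", MarkerStyleSketch.CURLY, 3))
```

Only the delimiters change between styles; the captured digits are filtered against the same range in every case.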

generate(prompt: str, sources: list[citeformer.core.Source], policy: citeformer.core.Policy | None = None, **options: Any) → citeformer.core.GenerationResult

Generate text with constrained citations.

Args:
  prompt: User prompt. The orchestrator passes it through to the backend verbatim; retrieval-augmented stitching is the caller’s job until a prompt-assembly helper is added in a later phase.
  sources: Evidence chunks in scope. Position (1-indexed) determines the citation id the model emits and the id used in Citation.source_id and Reference.source_id.
  policy: Override the default citation_policy for this call.
  **options: Backend-specific options forwarded to Backend.generate(). Pass marker_style= to override the orchestrator’s default for a single call.

Returns: A GenerationResult with text, parsed citations, and rendered references.
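The position-to-id rule can be made concrete with a short sketch of the parsing step. Everything here is illustrative: CitationSketch and parse_bracket_markers are hypothetical names, the start and end offsets are an assumed record layout, and only the source_id field name is taken from the description above.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CitationSketch:
    """Hypothetical stand-in for a parsed citation record."""

    source_id: int  # 1-indexed position of the cited source in `sources`
    start: int      # offset of the marker's first character (assumed field)
    end: int        # offset one past the marker's last character (assumed field)


def parse_bracket_markers(text: str, num_sources: int) -> list[CitationSketch]:
    """Turn each in-scope [N] marker into a CitationSketch."""
    citations = []
    for match in re.finditer(r"\[(\d+)\]", text):
        source_id = int(match.group(1))
        # Ids outside 1..num_sources do not correspond to any source; skip them.
        if 1 <= source_id <= num_sources:
            citations.append(CitationSketch(source_id, match.start(), match.end()))
    return citations


if __name__ == "__main__":
    sources = ["chunk A", "chunk B", "chunk C"]
    for cite in parse_bracket_markers("Water boils at 100 C [2].", len(sources)):
        # Subtract 1 to go from the 1-indexed id back to a list index.
        print(cite.source_id, sources[cite.source_id - 1])
```

Keeping ids 1-indexed in the record and converting only at lookup time matches how the markers read in the generated text.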

stream(prompt: str, sources: list[citeformer.core.Source], policy: citeformer.core.Policy | None = None, **options: Any) → citeformer.citeformer.StreamingResult

Stream generation as text chunks while tracking state for finalization.

Returns a StreamingResult which is both iterable (yielding decoded chunks in order) and finalizable (building a full GenerationResult from the accumulated text after the stream completes).

Typical usage::

stream = cf.stream(prompt="...", sources=sources)
for chunk in stream:
    print(chunk, end="", flush=True)
result = stream.finalize()  # full GenerationResult with refs + verify()

If you call .finalize() without consuming the iterator first, it will exhaust the iterator internally so you get a complete result either way.

Args:
  prompt: User prompt, same semantics as generate().
  sources: Sources in scope (position → citation id).
  policy: Override the default citation_policy for this call.
  **options: Forwarded to Backend.stream().

Returns: A StreamingResult wrapping the backend’s chunk iterator.

async agenerate(prompt: str, sources: list[citeformer.core.Source], policy: citeformer.core.Policy | None = None, **options: Any) → citeformer.core.GenerationResult

Async counterpart of generate(). See ADR-014.

Calls the backend’s agenerate() (which is concrete on every backend — the ABC default uses asyncio.to_thread; OpenAI / Anthropic backends override with native async clients). Parsing, rendering, and usage-threading are identical to the sync path; the only difference is the await on the backend call.

Returns: Same as generate().
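The asyncio.to_thread fallback mentioned above follows a common pattern, sketched below. BackendSketch and its single-argument signatures are hypothetical and far simpler than a real backend; the point is only how a blocking generate() can back a default agenerate().

```python
import asyncio


class BackendSketch:
    """Hypothetical stand-in for a backend with only a blocking generate()."""

    def generate(self, prompt: str) -> str:
        # A real backend would call a model here; this just echoes the prompt.
        return f"echo: {prompt} [1]"

    async def agenerate(self, prompt: str) -> str:
        # Default async path: run the blocking call in a worker thread so the
        # event loop stays free. A backend with a native async client would
        # override this method instead.
        return await asyncio.to_thread(self.generate, prompt)


async def main() -> str:
    return await BackendSketch().agenerate("hi")


if __name__ == "__main__":
    print(asyncio.run(main()))
```

asyncio.to_thread requires Python 3.9 or later.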

astream(prompt: str, sources: list[citeformer.core.Source], policy: citeformer.core.Policy | None = None, **options: Any) → citeformer.citeformer.AsyncStreamingResult

Async counterpart of stream(). See ADR-014.

Returns an AsyncStreamingResult: async-iterable for chunk consumption and await-finalizable to a complete GenerationResult. The orchestrator method itself is sync (no await); the async work happens when you iterate or finalize.

Typical usage::

stream = cf.astream(prompt="...", sources=sources)
async for chunk in stream:
    print(chunk, end="", flush=True)
result = await stream.finalize()

Args:
  prompt: User prompt, same semantics as generate().
  sources: Sources in scope (position → citation id).
  policy: Override the default citation_policy for this call.
  **options: Forwarded to Backend.astream().

Returns: An AsyncStreamingResult wrapping the backend’s async chunk iterator.

class citeformer.citeformer.StreamingResult(*, chunks: collections.abc.Iterator[str], sources: list[citeformer.core.Source], style: str, marker_style: citeformer.core.MarkerStyle = MarkerStyle.BRACKET, backend: citeformer.backends.base.Backend | None = None)

Iterable wrapper over a backend’s streaming output.

Acts as an Iterator[str], yielding decoded chunks in order. Internally accumulates the full text so that, once iteration completes, finalize() can parse citations, render references, and return a GenerationResult identical to what Citeformer.generate() would have produced for the same (prompt, sources, policy).

Idempotent: calling finalize() multiple times returns the same GenerationResult instance. Calling it before exhausting the iterator consumes the remaining chunks so the result is complete — partial finalize isn’t supported by design; if you want the partial text at any point, use the text property.

Attributes:
  sources: Sources passed to Citeformer.stream().
  style: CSL style used to render references.

Initialization

Wrap a backend chunk iterator. Not for direct construction by users.

property text: str

The text consumed so far. Updates as iteration progresses.

finalize() → citeformer.core.GenerationResult

Exhaust the iterator (if needed) and build the full GenerationResult.

Safe to call multiple times — the first call caches the result and subsequent calls return the same instance.
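The accumulate-then-finalize contract can be sketched in a few lines. StreamingSketch below is a hypothetical stand-in, not the real StreamingResult: it returns a plain dict where the real class builds a GenerationResult, and it omits citation parsing and reference rendering entirely.

```python
from collections.abc import Iterator
from typing import Optional


class StreamingSketch:
    """Hypothetical stand-in showing iteration, the text property, and finalize()."""

    def __init__(self, chunks: Iterator[str]) -> None:
        self._chunks = iter(chunks)
        self._parts: list[str] = []
        self._result: Optional[dict] = None

    def __iter__(self) -> "StreamingSketch":
        return self

    def __next__(self) -> str:
        chunk = next(self._chunks)  # StopIteration propagates to end the loop
        self._parts.append(chunk)   # remember every chunk handed to the caller
        return chunk

    @property
    def text(self) -> str:
        # Partial text while streaming, full text once the iterator is exhausted.
        return "".join(self._parts)

    def finalize(self) -> dict:
        if self._result is None:
            for _ in self:  # drain whatever the caller has not consumed yet
                pass
            # Assumption: a dict stands in for the full result object.
            self._result = {"text": self.text}
        return self._result  # cached, so repeated calls return the same object


if __name__ == "__main__":
    stream = StreamingSketch(iter(["A ", "[1]", "."]))
    print(next(stream), "|", stream.text)
    print(stream.finalize()["text"])
```

Caching the first result is what makes repeated finalize() calls cheap and lets callers compare results by identity.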

class citeformer.citeformer.AsyncStreamingResult(*, chunks: collections.abc.AsyncIterator[str], sources: list[citeformer.core.Source], style: str, marker_style: citeformer.core.MarkerStyle = MarkerStyle.BRACKET, backend: citeformer.backends.base.Backend | None = None)

Async-iterable wrapper over a backend’s async streaming output.

The async parallel of StreamingResult. Same surface: chunks arrive in order, accumulated text is exposed via text, and finalize() builds the full GenerationResult. The difference is that iteration uses async for and finalize() is awaitable.

Idempotent: calling await stream.finalize() multiple times returns the same GenerationResult instance. Calling it before exhausting the iterator awaits the remaining chunks so the result is complete.

Typical usage::

stream = cf.astream(prompt="...", sources=sources)
async for chunk in stream:
    print(chunk, end="", flush=True)
result = await stream.finalize()

Attributes:
  sources: Sources passed to Citeformer.astream().
  style: CSL style used to render references.

Initialization

Wrap a backend async-chunk iterator. Not for direct construction by users.

property text: str

The text consumed so far. Updates as iteration progresses.

async finalize() → citeformer.core.GenerationResult

Exhaust the async iterator (if needed) and build the full result.

Safe to call multiple times — the first call caches the result and subsequent calls return the same instance.
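The async variant follows the same accumulate-then-finalize shape, with async iteration and an awaitable finalize(). As before, this is a sketch under assumptions: AsyncStreamingSketch, fake_chunks, and demo are hypothetical names, and a dict stands in for the GenerationResult.

```python
import asyncio
from collections.abc import AsyncIterator
from typing import Optional


class AsyncStreamingSketch:
    """Hypothetical stand-in for an async-iterable, await-finalizable stream."""

    def __init__(self, chunks: AsyncIterator[str]) -> None:
        self._chunks = chunks
        self._parts: list[str] = []
        self._result: Optional[dict] = None

    def __aiter__(self) -> "AsyncStreamingSketch":
        return self

    async def __anext__(self) -> str:
        # StopAsyncIteration from the wrapped iterator ends `async for` loops.
        chunk = await self._chunks.__anext__()
        self._parts.append(chunk)
        return chunk

    @property
    def text(self) -> str:
        return "".join(self._parts)

    async def finalize(self) -> dict:
        if self._result is None:
            async for _ in self:  # await any chunks not yet consumed
                pass
            self._result = {"text": self.text}  # assumption: dict as the result
        return self._result


async def fake_chunks():
    """Hypothetical chunk source standing in for a backend's async stream."""
    for chunk in ["A ", "[1]", "."]:
        await asyncio.sleep(0)  # yield control, as a network read would
        yield chunk


async def demo() -> tuple:
    stream = AsyncStreamingSketch(fake_chunks())
    first = await stream.__anext__()
    partial = stream.text
    result = await stream.finalize()
    again = await stream.finalize()
    return first, partial, result["text"], result is again


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Constructing the wrapper does no I/O; nothing is awaited until the caller iterates or finalizes, which mirrors the note that astream() itself is a sync method.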