citeformer.backends.gemini

Gemini backend — schema-level cite-id enforcement via response_schema.

Gemini’s generate_content accepts a response_mime_type="application/json" + response_schema=<schema> pair; the model is constrained to emit a JSON object that validates against the schema. The supported schema subset is OpenAPI-ish, not full JSON Schema, but type, enum, items, properties, and required are all honoured, which is everything we need to constrain each citations[*] entry to 1..N.
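As a rough illustration of how such a constraint could be expressed, the sketch below builds a plain-dict schema using only the keywords named above. The helper name build_citation_schema is hypothetical, and the segments/text/citations layout follows the payload shape documented for GeminiBackend; the exact schema citeformer sends, and whether the service accepts this dict unchanged, is not confirmed here.

```python
from typing import Any


def build_citation_schema(n_sources: int) -> dict[str, Any]:
    """Return an OpenAPI-style schema limiting citation ids to 1..n_sources.

    Hypothetical helper for illustration; not part of citeformer's API.
    """
    if n_sources < 1:
        raise ValueError("at least one source is required")
    return {
        "type": "object",
        "properties": {
            "segments": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "text": {"type": "string"},
                        "citations": {
                            "type": "array",
                            # Each id must be one of the listed source positions.
                            "items": {
                                "type": "integer",
                                "enum": list(range(1, n_sources + 1)),
                            },
                        },
                    },
                    "required": ["text", "citations"],
                },
            }
        },
        "required": ["segments"],
    }


schema = build_citation_schema(3)
```

With three sources, the only citation ids the schema admits are 1, 2, and 3; an id such as 4 has no enum entry and cannot validate.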

Tier honesty (same story as OpenAI):

  • Local backends enforce at the logit layer — a fabricated cite id is token-impossible to sample.

  • This backend enforces at the schema layer. Gemini validates the assistant’s response against the schema server-side; fabrication is structurally impossible in the returned payload.

Requires the gemini extra: pip install citeformer[gemini]. The extra pulls in google-genai (the unified SDK that replaced google-generativeai in 2025). Pass a live client via client=… to override the default genai.Client() construction (which reads GEMINI_API_KEY / GOOGLE_API_KEY from the environment).

Model requirements: any gemini-1.5 or gemini-2.x family model that supports structured output. gemini-2.0-flash is a good default.

Module Contents

Classes

GeminiBackend

Gemini backend with schema-level cite enforcement.

API

class citeformer.backends.gemini.GeminiBackend(model: str = _DEFAULT_MODEL, *, client: Any | None = None, **client_kwargs: Any)

Bases: citeformer.backends.base.Backend

Gemini backend with schema-level cite enforcement.

Requests the same structured payload shape as OpenAIBackend:

{
  "segments": [
    {"text": "A sentence.", "citations": [1, 2]},
    {"text": "Another one.", "citations": [3]}
  ]
}

where citations[*] integers are enum-constrained to 1..N. After validation, segments are flattened into citation-marked plain text so downstream consumers (Citeformer / verify / render) see the same shape as local-backend output.
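The flattening step can be pictured roughly as follows. This is a minimal sketch, assuming bracket-style markers of the form [n] appended after each segment's text; the function name flatten_segments and the exact marker placement are assumptions, not the library's confirmed behaviour.

```python
from typing import Any


def flatten_segments(payload: dict[str, Any]) -> str:
    """Join segments into one string, appending "[n]" markers after each segment.

    Hypothetical helper; marker format and placement are assumed.
    """
    parts = []
    for segment in payload["segments"]:
        markers = "".join(f"[{cite_id}]" for cite_id in segment["citations"])
        # Add a separating space only when the segment actually has markers.
        parts.append(f"{segment['text']} {markers}" if markers else segment["text"])
    return " ".join(parts)


payload = {
    "segments": [
        {"text": "A sentence.", "citations": [1, 2]},
        {"text": "Another one.", "citations": [3]},
    ]
}
flat = flatten_segments(payload)
print(flat)  # A sentence. [1][2] Another one. [3]
```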

Attributes:

  • model: Gemini model identifier (gemini-2.0-flash default).

  • client: The authenticated google.genai.Client.

  • last_usage: Token-usage payload from the most recent generate() call. None before the first call.

Initialization

Construct a Gemini backend.

Args:

  • model: Gemini model id (gemini-2.0-flash, gemini-1.5-pro …).

  • client: Pre-built genai.Client. If None, one is built from env (picks up GEMINI_API_KEY / GOOGLE_API_KEY).

  • **client_kwargs: Forwarded to genai.Client(**kwargs) when client is None (api_key, vertexai, …).

model: str

None

client: Any

None

last_usage: citeformer.core.TokenUsage | None

None

generate(prompt: str, sources: list[citeformer.core.Source], policy: citeformer.core.Policy, **options: Any) → str

Generate text with schema-level citation constraint.

Args:

  • prompt: User prompt.

  • sources: Sources in scope. Each source’s 1-indexed position becomes its enum entry.

  • policy: Citation policy. Shapes the system instruction and the schema’s minItems on the citations array (REQUIRED → 1, AUTO/QUOTES_ONLY → 0).

  • **options: max_tokens (default 1024), temperature (default 0.7), marker_style (default BRACKET), system_prompt (extra system content).

Returns: Flattened text carrying marker_style markers for every cited source, in document order.

Raises: ValueError: If sources is empty.
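To make the policy-to-minItems mapping and the empty-sources check concrete, here is a small sketch. PolicySketch is a hypothetical stand-in for citeformer.core.Policy (only the three member names come from this page), and citations_array_schema is an illustrative helper rather than the backend's actual code.

```python
from enum import Enum
from typing import Any


class PolicySketch(Enum):
    """Hypothetical stand-in for citeformer.core.Policy."""

    REQUIRED = "required"
    AUTO = "auto"
    QUOTES_ONLY = "quotes_only"


def citations_array_schema(n_sources: int, policy: PolicySketch) -> dict[str, Any]:
    """Build the schema for one segment's citations array (illustrative only)."""
    if n_sources == 0:
        # Mirrors the documented ValueError for an empty sources list.
        raise ValueError("sources must not be empty")
    return {
        "type": "array",
        # REQUIRED forces at least one citation per segment; other policies allow none.
        "minItems": 1 if policy is PolicySketch.REQUIRED else 0,
        "items": {"type": "integer", "enum": list(range(1, n_sources + 1))},
    }


required = citations_array_schema(2, PolicySketch.REQUIRED)
optional = citations_array_schema(2, PolicySketch.AUTO)
```

Under this sketch, required["minItems"] is 1 and optional["minItems"] is 0, while both share the same 1..2 enum.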

stream(prompt: str, sources: list[citeformer.core.Source], policy: citeformer.core.Policy, **options: Any) → collections.abc.Iterator[str]

Yield sentence-level chunks. Wraps generate() plus naive sentence splitting.

Gemini’s true streaming surface emits partial JSON tokens which aren’t safe to flatten per-chunk — we’d risk yielding a citation fragment. Slicing the validated response on sentence boundaries gives callers progressive output without violating the schema contract.
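One way to slice an already-flattened response on sentence boundaries is sketched below. It assumes bracket markers such as [1] trail the sentence's closing punctuation and keeps them attached to that sentence; the regular expression and the name iter_sentence_chunks are assumptions, and the library's actual splitter may differ.

```python
import re
from collections.abc import Iterator

# A sentence: text up to terminal punctuation, plus any trailing "[n]" markers.
_SENTENCE = re.compile(r"[^.!?]+[.!?]+(?:\s*\[\d+\])*")


def iter_sentence_chunks(text: str) -> Iterator[str]:
    """Yield sentence-sized chunks, keeping citation markers with their sentence.

    Hypothetical helper; deliberately naive (it will split on "3.5", for example).
    """
    end = 0
    for match in _SENTENCE.finditer(text):
        chunk = match.group().strip()
        if chunk:
            yield chunk
        end = match.end()
    # Emit any trailing text that has no terminal punctuation.
    tail = text[end:].strip()
    if tail:
        yield tail


chunks = list(iter_sentence_chunks("A sentence. [1][2] Another one. [3]"))
print(chunks)  # ['A sentence. [1][2]', 'Another one. [3]']
```

Because the chunks are cut from the complete, validated text, no chunk can end partway through a marker.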