citeformer.backends.gemini
Gemini backend — schema-level cite-id enforcement via response_schema.
Gemini’s generate_content accepts a response_mime_type="application/json" /
response_schema=<schema> pair; the model is constrained to emit a JSON object that validates against the schema. The supported schema subset is OpenAPI-ish rather than full JSON Schema, but type, enum, items, properties, and required are all honoured, which is everything we need to express citations[*] ∈ {1..N}.
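As an illustration of the constraint described above, the sketch below builds a schema dict that uses only type, enum, items, properties, and required (plus minItems, which the generate() documentation mentions). The helper name build_response_schema and its min_items parameter are hypothetical, and the exact schema citeformer sends, including whether the service accepts integer enum values in this form, is an assumption rather than something this page confirms.

```python
from typing import Any


def build_response_schema(n_sources: int, min_items: int = 0) -> dict[str, Any]:
    """Hypothetical helper: schema whose citation ids are limited to 1..n_sources."""
    if n_sources < 1:
        raise ValueError("at least one source is required")

    # Each cite id must be one of the 1-indexed source positions.
    cite_id = {"type": "integer", "enum": list(range(1, n_sources + 1))}

    # One segment = a piece of text plus the ids of the sources backing it.
    segment = {
        "type": "object",
        "properties": {
            "text": {"type": "string"},
            "citations": {"type": "array", "items": cite_id, "minItems": min_items},
        },
        "required": ["text", "citations"],
    }

    return {
        "type": "object",
        "properties": {"segments": {"type": "array", "items": segment}},
        "required": ["segments"],
    }


schema = build_response_schema(3, min_items=1)
```

With three sources, the only ids the schema admits are 1, 2, and 3, so an id such as 7 cannot appear in a payload that validates.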
Tier honesty (same story as OpenAI):
Local backends enforce at the logit layer — a fabricated cite id is token-impossible to sample.
This backend enforces at the schema layer. Gemini validates the assistant’s response against the schema server-side; fabrication is structurally impossible in the returned payload.
Requires the gemini extra: pip install citeformer[gemini]. The
extra pulls in google-genai (the unified SDK that replaced
google-generativeai in 2025). Pass a live client via client=… to
override the default genai.Client() construction (which reads
GEMINI_API_KEY / GOOGLE_API_KEY from the environment).
Model requirements: any gemini-1.5 or gemini-2.x family model
that supports structured output. gemini-2.0-flash is a good default.
Module Contents
Classes
GeminiBackend | Gemini backend with schema-level cite enforcement.
API
- class citeformer.backends.gemini.GeminiBackend(model: str = _DEFAULT_MODEL, *, client: Any | None = None, **client_kwargs: Any)
Bases: citeformer.backends.base.Backend
Gemini backend with schema-level cite enforcement.
Requests the same structured payload shape as OpenAIBackend:

    {
      "segments": [
        {"text": "A sentence.", "citations": [1, 2]},
        {"text": "Another one.", "citations": [3]}
      ]
    }

where citations[*] integers are enum-constrained to 1..N. After validation, segments are flattened into citation-marked plain text so downstream consumers (Citeformer / verify / render) see the same shape as local-backend output.
Attributes:
    model: Gemini model identifier (gemini-2.0-flash default).
    client: The authenticated google.genai.Client.
    last_usage: Token-usage payload from the most recent generate() call. None before the first call.
Initialization
Construct a Gemini backend.
Args:
    model: Gemini model id (gemini-2.0-flash, gemini-1.5-pro…).
    client: Pre-built genai.Client. If None, one is built from env (picks up GEMINI_API_KEY / GOOGLE_API_KEY).
    **client_kwargs: Forwarded to genai.Client(**kwargs) when client is None (api_key, vertexai, …).
- last_usage: citeformer.core.TokenUsage | None = None
- generate(prompt: str, sources: list[citeformer.core.Source], policy: citeformer.core.Policy, **options: Any) -> str
Generate text with schema-level citation constraint.
Args:
    prompt: User prompt.
    sources: Sources in scope. Position (1-indexed) becomes the enum entry.
    policy: Citation policy. Shapes the system instruction and the schema’s minItems on the citations array (REQUIRED → 1, AUTO/QUOTES_ONLY → 0).
    **options: max_tokens (default 1024), temperature (default 0.7), marker_style (default BRACKET), system_prompt (extra system content).
Returns:
    Flattened text carrying marker_style markers for every cited source, in document order.
Raises:
    ValueError: If sources is empty.
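The flattening step that turns validated segments into marked text could look roughly like the sketch below. The function name flatten_segments is hypothetical, and both the "[n]" marker format for the BRACKET style and the placement of markers after each segment's text are assumptions; the library's real output may differ.

```python
from typing import Any


def flatten_segments(payload: dict[str, Any], n_sources: int) -> str:
    """Hypothetical helper: join segments into text with bracket markers."""
    parts: list[str] = []
    for segment in payload["segments"]:
        ids = segment["citations"]
        # Defensive re-check of the 1..N range that the schema already imposes.
        for cite_id in ids:
            if not 1 <= cite_id <= n_sources:
                raise ValueError(f"cite id {cite_id} is outside 1..{n_sources}")
        markers = "".join(f"[{cite_id}]" for cite_id in ids)
        text = segment["text"].strip()
        # Assumed layout: markers follow the segment text, separated by a space.
        parts.append(f"{text} {markers}" if markers else text)
    return " ".join(parts)


payload = {
    "segments": [
        {"text": "A sentence.", "citations": [1, 2]},
        {"text": "Another one.", "citations": [3]},
    ]
}
flat = flatten_segments(payload, n_sources=3)
# flat == "A sentence. [1][2] Another one. [3]"
```

Segments are emitted in the order they arrive, which keeps the markers in document order as the Returns entry describes.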
- stream(prompt: str, sources: list[citeformer.core.Source], policy: citeformer.core.Policy, **options: Any) -> collections.abc.Iterator[str]
Yield sentence-level chunks. Wraps generate() plus naive sentence splitting.
Gemini’s true streaming surface emits partial JSON tokens, which aren’t safe to flatten per chunk: we’d risk yielding a citation fragment. Slicing the validated response on sentence boundaries gives callers progressive output without violating the schema contract.
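A minimal sketch of naive sentence splitting over already-flattened text follows. It assumes bracket markers such as "[1]" that trail each sentence, and the name iter_sentences is hypothetical; the splitter the library actually uses is not shown on this page.

```python
import re
from collections.abc import Iterator

# A chunk is text up to terminal punctuation plus any bracket markers that
# directly follow it; the second alternative catches trailing text that has
# no terminal punctuation.
_SENTENCE = re.compile(r"[^.!?]*[.!?]+(?:\s*\[\d+\])*|[^.!?]+$")


def iter_sentences(text: str) -> Iterator[str]:
    """Hypothetical helper: yield sentence chunks with their markers attached."""
    for match in _SENTENCE.finditer(text):
        chunk = match.group().strip()
        if chunk:
            yield chunk


chunks = list(iter_sentences("A sentence. [1][2] Another one. [3]"))
# chunks == ["A sentence. [1][2]", "Another one. [3]"]
```

Because the pattern consumes markers together with the sentence they follow, no chunk begins with an orphaned marker. The split is still naive: a period inside a number or abbreviation would end a chunk early.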