citeformer.backends.mistral
Mistral backend — schema-level cite-id enforcement via response_format.
Mistral’s chat.complete supports a response_format={"type": "json_schema", "json_schema": {...}} parameter that constrains the
assistant to produce a JSON object validating against the supplied
schema. When strict: true is set, the Mistral server rejects any
response whose citation integers fall outside the supplied enum —
structurally equivalent to what XGrammar does at the logit layer for
local backends.
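As a rough illustration of the enum-bounded schema described above, the sketch below builds a `response_format` dict for a given number of sources. The function name `build_response_format`, the schema name `cited_answer`, and the `segments` / `text` / `citations` field names are assumptions made for this example; the schema citeformer actually sends may differ.

```python
from __future__ import annotations

from typing import Any


def build_response_format(num_sources: int, min_citations: int = 0) -> dict[str, Any]:
    """Build a response_format dict whose cite ids are limited to 1..num_sources."""
    if num_sources < 1:
        raise ValueError("at least one source is required")
    allowed_ids = list(range(1, num_sources + 1))  # 1-indexed source positions
    schema = {
        "type": "object",
        "properties": {
            "segments": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "text": {"type": "string"},
                        "citations": {
                            "type": "array",
                            # Only ids of supplied sources are valid values.
                            "items": {"type": "integer", "enum": allowed_ids},
                            "minItems": min_citations,
                        },
                    },
                    "required": ["text", "citations"],
                    "additionalProperties": False,
                },
            }
        },
        "required": ["segments"],
        "additionalProperties": False,
    }
    return {
        "type": "json_schema",
        "json_schema": {"name": "cited_answer", "schema": schema, "strict": True},
    }


response_format = build_response_format(num_sources=3, min_citations=1)
```

With three sources, the only integers the schema admits as citations are 1, 2 and 3, so a reference to a non-existent fourth source cannot validate.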
Tier honesty (same story as OpenAI):
- Local backends enforce at the logit layer: a fabricated cite id can never be sampled in the first place.
- This backend enforces at the schema layer: a fabricated cite id can never appear in the returned payload.
Requires the mistral extra: pip install citeformer[mistral].
Model requirements: the strict: true JSON-schema mode is supported
on mistral-large-2411 (Nov 2024) and every Mistral model released
after, including mistral-small-latest and mistral-large-latest.
SDK version: pins mistralai>=2.0. The 2.x line switched to a
namespace-package layout (from mistralai.client import Mistral);
1.x used a different entry-point name and isn’t supported here.
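To check an environment against the `>=2.0` pin before importing the backend, a small version comparison is enough. This is a minimal sketch, not part of citeformer: `meets_minimum` is a hypothetical helper, and it only compares the major and minor components.

```python
from __future__ import annotations

import re


def meets_minimum(version: str, minimum: tuple[int, int] = (2, 0)) -> bool:
    """Return True if a version string such as '2.1.0' is at least `minimum`."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    # Tuples compare element by element, so (2, 1) >= (2, 0) and (1, 9) < (2, 0).
    return (int(match.group(1)), int(match.group(2))) >= minimum
```

The installed version string can be read with the standard library call `importlib.metadata.version("mistralai")` and passed to the helper.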
Module Contents
Classes
MistralBackend | Mistral Chat Completions backend with schema-level cite enforcement.
API
- class citeformer.backends.mistral.MistralBackend(model: str = _DEFAULT_MODEL, *, client: Any | None = None, **client_kwargs: Any)
Bases: citeformer.backends.base.Backend
Mistral Chat Completions backend with schema-level cite enforcement.
Request shape mirrors OpenAIBackend (segments + citations with enum-bounded integers), so downstream code is identical.
Attributes:
    model: Mistral model id (mistral-large-latest by default).
    client: The authenticated mistralai.Mistral client.
    last_usage: Token-usage payload from the most recent generate() call. None before the first call.
Initialization
Construct a Mistral backend.
Args:
    model: Mistral model id supporting strict JSON schema (mistral-large-2411 or later, or *-latest aliases).
    client: Pre-built mistralai.Mistral client. If None, one is constructed from env (picks up MISTRAL_API_KEY).
    **client_kwargs: Forwarded to Mistral(**kwargs) when client is None.
- last_usage: citeformer.core.TokenUsage | None = None
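The client-resolution rule in the constructor (use a pre-built client if given, otherwise build one from the environment) can be sketched as below. This is an illustration under assumptions, not the backend's code: `resolve_client` and `FakeClient` are hypothetical names, and the `api_key` keyword is assumed for the stand-in client only.

```python
from __future__ import annotations

import os
from typing import Any, Callable


def resolve_client(
    client: Any | None,
    factory: Callable[..., Any],
    **client_kwargs: Any,
) -> Any:
    """Return `client` if given, else build one via `factory` with env defaults."""
    if client is not None:
        return client  # a pre-built client wins; client_kwargs are not used
    # Assumption: the key comes from MISTRAL_API_KEY unless passed explicitly.
    client_kwargs.setdefault("api_key", os.environ.get("MISTRAL_API_KEY"))
    return factory(**client_kwargs)


class FakeClient:
    """Stand-in for the real SDK client, used only to keep the sketch runnable."""

    def __init__(self, api_key: str | None = None, **extra: Any) -> None:
        self.api_key = api_key
        self.extra = extra
```

Passing a pre-built client is mainly useful in tests, where a stub can replace the network-backed client without touching environment variables.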
- generate(prompt: str, sources: list[citeformer.core.Source], policy: citeformer.core.Policy, **options: Any) → str
Generate text with schema-level citation constraint.
Args:
    prompt: User prompt.
    sources: Sources in scope. Position (1-indexed) becomes the enum entry.
    policy: Citation policy; shapes the system prompt and the schema’s minItems (REQUIRED → 1; AUTO / QUOTES_ONLY → 0).
    **options: max_tokens (default 1024), temperature (default 0.7), marker_style (default BRACKET), system_prompt (extra system content).
Returns:
    Flattened text carrying marker_style markers for every cited source, in document order.
Raises:
    ValueError: If sources is empty.
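The flattening step that turns a validated payload into marked-up text might look like the following. This is a minimal sketch assuming a `segments` list whose items carry `text` and `citations` fields and a bracket marker of the form `[n]`; the helper name `flatten` and the payload layout are assumptions, not the backend's confirmed internals.

```python
from __future__ import annotations

from typing import Any


def flatten(payload: dict[str, Any], num_sources: int) -> str:
    """Join segment texts, appending a [n] marker for each cited source id."""
    parts: list[str] = []
    for segment in payload["segments"]:
        ids = segment["citations"]
        # Defensive re-check; a strict schema should already guarantee the range.
        for cite_id in ids:
            if not 1 <= cite_id <= num_sources:
                raise ValueError(f"cite id {cite_id} is outside 1..{num_sources}")
        markers = "".join(f"[{cite_id}]" for cite_id in ids)
        parts.append(f"{segment['text']} {markers}" if markers else segment["text"])
    return " ".join(parts)


payload = {
    "segments": [
        {"text": "Water boils at a lower temperature at altitude.", "citations": [1]},
        {"text": "Cooking times grow as a result.", "citations": [2, 3]},
        {"text": "Recipes often note this.", "citations": []},
    ]
}
print(flatten(payload, num_sources=3))
```

Segments are emitted in the order they appear in the payload, so markers follow document order without any extra sorting.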
- stream(prompt: str, sources: list[citeformer.core.Source], policy: citeformer.core.Policy, **options: Any) → collections.abc.Iterator[str]
Yield sentence-level chunks by slicing generate’s output.
Same rationale as OpenAI / Gemini: Mistral’s streaming surface emits partial JSON, which isn’t safe to flatten before the full response validates.
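Slicing a finished string into sentence-level chunks can be approximated with a regular expression. The sketch below is an assumption about how such slicing could work, not the backend's actual splitter: `sentence_chunks` is a hypothetical helper, it keeps trailing `[n]` markers attached to the sentence they follow, and it will mis-split on decimals or abbreviations.

```python
from __future__ import annotations

import re
from collections.abc import Iterator

# A sentence is text up to ., ! or ?, plus any [n] markers that follow it.
# The second alternative catches a final fragment with no end punctuation.
_SENTENCE = re.compile(r"[^.!?]+[.!?]+(?:\s*\[\d+\])*|[^.!?]+$")


def sentence_chunks(text: str) -> Iterator[str]:
    """Yield sentence-sized chunks from an already complete string."""
    for match in _SENTENCE.finditer(text):
        chunk = match.group(0).strip()
        if chunk:
            yield chunk


for chunk in sentence_chunks("A is true. [1] B holds! [2][3] C."):
    print(chunk)
```

Because the input is the complete, already validated output, each chunk is safe to show as soon as it is yielded; the trade-off is that nothing is emitted until the full response has arrived.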