citeformer.backends.mistral

Mistral backend — schema-level cite-id enforcement via response_format.

Mistral’s chat.complete supports a response_format={"type": "json_schema", "json_schema": {...}} parameter that constrains the assistant to produce a JSON object validating against the supplied schema. When strict: true is set, the Mistral server rejects any response whose citation integers fall outside the supplied enum — structurally equivalent to what XGrammar does at the logit layer for local backends.
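A minimal sketch of what such a response_format payload could look like. The helper name build_cite_response_format, the schema name cited_answer, and the text field are hypothetical; only the segments / citations shape and the enum-bounded integers come from this page, and the schema the backend actually sends may differ.

```python
from typing import Any


def build_cite_response_format(n_sources: int, min_items: int = 0) -> dict[str, Any]:
    """Build a response_format dict whose citation ids are enum-bounded.

    Hypothetical helper for illustration; not part of citeformer's API.
    """
    if n_sources < 1:
        raise ValueError("at least one source is required")

    # Source positions are 1-indexed, so the enum is 1..n_sources.
    cite_ids = list(range(1, n_sources + 1))

    schema = {
        "type": "object",
        "properties": {
            "segments": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "text": {"type": "string"},  # assumed field name
                        "citations": {
                            "type": "array",
                            "minItems": min_items,
                            # Any integer outside this enum fails validation.
                            "items": {"type": "integer", "enum": cite_ids},
                        },
                    },
                    "required": ["text", "citations"],
                    "additionalProperties": False,
                },
            },
        },
        "required": ["segments"],
        "additionalProperties": False,
    }
    return {
        "type": "json_schema",
        "json_schema": {"name": "cited_answer", "strict": True, "schema": schema},
    }
```

With three sources in scope, the only integers the schema admits in a citations array are 1, 2, and 3.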

Tier honesty (same story as OpenAI):

  • Local backends enforce at the logit layer — fabrication is token-impossible to sample.

  • This backend enforces at the schema layer — fabrication is structurally impossible in the returned payload.
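The schema-layer guarantee can be read as an invariant on the returned payload: every cited integer is a valid 1-indexed source position. A hedged sketch of a defensive check for that invariant follows; the function name and the segments / citations field names are assumptions for illustration, not part of the library.

```python
from typing import Any


def out_of_range_citations(payload: dict[str, Any], n_sources: int) -> list[int]:
    """Return every cited id that is not a valid 1-indexed source position.

    Hypothetical helper; an empty result means the invariant holds.
    """
    allowed = set(range(1, n_sources + 1))
    return [
        cite_id
        for segment in payload.get("segments", [])
        for cite_id in segment.get("citations", [])
        if cite_id not in allowed
    ]
```

A payload that passed strict schema validation should always produce an empty list here; a non-empty list would point at a fabricated source id.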

Requires the mistral extra: pip install citeformer[mistral].

Model requirements: the strict: true JSON-schema mode is supported on mistral-large-2411 (Nov 2024) and every Mistral model released after, including mistral-small-latest and mistral-large-latest.

SDK version: pins mistralai>=2.0. The 2.x line switched to a namespace-package layout (from mistralai.client import Mistral); 1.x used a different entry-point name and isn’t supported here.

Module Contents

Classes

MistralBackend

Mistral Chat Completions backend with schema-level cite enforcement.

API

class citeformer.backends.mistral.MistralBackend(model: str = _DEFAULT_MODEL, *, client: Any | None = None, **client_kwargs: Any)

Bases: citeformer.backends.base.Backend

Mistral Chat Completions backend with schema-level cite enforcement.

Request shape mirrors OpenAIBackend (segments plus citations with enum-bounded integers), so downstream code is identical.

Attributes:

  • model: Mistral model id (mistral-large-latest by default).

  • client: The authenticated mistralai.Mistral client.

  • last_usage: Token-usage payload from the most recent generate() call. None before the first call.

Initialization

Construct a Mistral backend.

Args:

  • model: Mistral model id supporting strict JSON schema (mistral-large-2411 or later, or *-latest aliases).

  • client: Pre-built mistralai.Mistral client. If None, one is constructed from the environment (picks up MISTRAL_API_KEY).

  • **client_kwargs: Forwarded to Mistral(**kwargs) when client is None.

model: str

None

client: Any

None

last_usage: citeformer.core.TokenUsage | None

None

generate(prompt: str, sources: list[citeformer.core.Source], policy: citeformer.core.Policy, **options: Any) -> str

Generate text with schema-level citation constraint.

Args:

  • prompt: User prompt.

  • sources: Sources in scope. Position (1-indexed) becomes the enum entry.

  • policy: Citation policy. It shapes the system prompt and the schema’s minItems (REQUIRED → 1; AUTO / QUOTES_ONLY → 0).

  • **options: max_tokens (default 1024), temperature (default 0.7), marker_style (default BRACKET), system_prompt (extra system content).

Returns: Flattened text carrying marker_style markers for every cited source, in document order.

Raises: ValueError: If sources is empty.
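The flattening step described under Returns can be sketched as follows. This is an illustration under stated assumptions, not the backend's actual code: the function name flatten_segments, the text field, the [n] rendering of BRACKET markers, and the placement of markers after each segment are all hypothetical.

```python
from typing import Any


def flatten_segments(payload: dict[str, Any]) -> str:
    """Join segments into one string, appending [n] markers per segment.

    Hypothetical helper; assumes BRACKET style renders source 2 as "[2]".
    """
    parts = []
    for segment in payload["segments"]:
        # Markers follow the order in which the model listed the citations.
        markers = "".join(f"[{cite_id}]" for cite_id in segment["citations"])
        text = segment["text"].rstrip()
        parts.append(f"{text} {markers}" if markers else text)
    return " ".join(parts)
```

Segments without citations pass through unchanged, which is what a policy with minItems of 0 would permit.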

stream(prompt: str, sources: list[citeformer.core.Source], policy: citeformer.core.Policy, **options: Any) -> collections.abc.Iterator[str]

Yield sentence-level chunks by slicing the output of generate().

Same rationale as OpenAI / Gemini: Mistral’s streaming surface emits partial JSON which isn’t safe to flatten before the full response validates.
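One way to slice already-validated text into sentence-level chunks is sketched below. The function name sentence_chunks and the regular expression are assumptions for illustration; the backend's real splitting rule is not documented here. The pattern keeps trailing [n] markers attached to the sentence they follow.

```python
import re
from collections.abc import Iterator

# A sentence: non-space start, lazily up to ., ! or ?, then any trailing
# bracket markers such as " [1][2]", ending at whitespace or end of text.
_SENTENCE = re.compile(r"\S.*?[.!?](?:\s*\[\d+\])*(?=\s|$)", re.DOTALL)


def sentence_chunks(text: str) -> Iterator[str]:
    """Yield sentence-sized pieces of text (hypothetical helper)."""
    end = 0
    for match in _SENTENCE.finditer(text):
        yield match.group()
        end = match.end()
    # Emit any remainder that lacks terminal punctuation.
    tail = text[end:].strip()
    if tail:
        yield tail
```

Because the chunks are cut from a complete, validated response, a consumer never sees a partially formed citation marker.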