citeformer.backends.together
Together AI backend — provider-runtime enforcement on the OpenAI wire format.
Together AI’s chat-completions endpoint is OpenAI-API-compatible and
exposes a response_format={"type": "json_schema", "json_schema": {...}}
mode (docs: https://docs.together.ai/docs/json-mode) that runs
constrained decoding inside the Together runtime. This is the same mechanism
OpenAI and Mistral use, applied to Together-hosted open-weight models. Together
also supports a response_format={"type": "regex", "pattern": "..."}
mode, which would be a natural fit for marker-only output, but the
json_schema path is what every existing API backend uses; reusing it
keeps the conformance contract uniform.
Tier honesty: this is provider-runtime constrained sampling — a fabricated cite id is token-impossible to sample inside the Together runtime. Same guarantee as OpenAI’s strict mode, just with open-weight upstream models (Llama, Qwen, DeepSeek, …) instead of closed ones.
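To illustrate why a fabricated cite id cannot be sampled, the sketch below builds a response_format payload whose JSON Schema restricts cite ids to an enum of known sources. The helper name build_cite_response_format, the field names (segments, text, cite_ids), and the inner name/schema/strict layout (borrowed from OpenAI's convention) are assumptions for illustration only; the schema citeformer actually sends is not shown on this page.

```python
from typing import Any, Dict, List


def build_cite_response_format(allowed_ids: List[str]) -> Dict[str, Any]:
    """Hypothetical helper: wrap a cite-constrained schema in a response_format payload."""
    if not allowed_ids:
        raise ValueError("at least one cite id is required")

    # Deduplicate and sort so the enum is stable across calls.
    cite_enum = sorted(set(allowed_ids))

    # Assumed schema shape: a list of text segments, each tagged with cite ids.
    schema = {
        "type": "object",
        "properties": {
            "segments": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "text": {"type": "string"},
                        # The enum is the constraint: under constrained decoding,
                        # ids outside this list cannot be emitted.
                        "cite_ids": {
                            "type": "array",
                            "items": {"type": "string", "enum": cite_enum},
                        },
                    },
                    "required": ["text", "cite_ids"],
                    "additionalProperties": False,
                },
            }
        },
        "required": ["segments"],
        "additionalProperties": False,
    }

    # Outer layout follows the response_format shape quoted above; the inner
    # keys (name, schema, strict) are assumed from OpenAI's convention.
    return {
        "type": "json_schema",
        "json_schema": {"name": "cited_answer", "schema": schema, "strict": True},
    }


payload = build_cite_response_format(["src-2", "src-1"])
```

The resulting dict would be passed as the response_format argument of a chat-completions request; whether a given Together model honours every schema keyword should be checked against the provider documentation.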
Implementation note: like OpenRouter, Together is a thin OpenAIBackend subclass: schema construction, segment flattening, streaming pseudo-stream, and last_usage extraction are all inherited unchanged. We only override __init__ to point the SDK at Together’s base URL and pick up TOGETHER_API_KEY.
Requires the together extra: pip install citeformer[together]
(re-uses the openai SDK; no Together-specific client needed).
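The thin-subclass pattern described in the implementation note can be sketched roughly as follows. This is not the library's source: StandInOpenAIBackend is a hypothetical stand-in for citeformer's OpenAIBackend, SketchTogetherBackend and describe are illustrative names, and the DEFAULT_BASE_URL value and full default model id are assumptions, since this page does not show them.

```python
import os
from typing import Any, Optional

# Assumed value for illustration; this page does not show the real constant.
DEFAULT_BASE_URL = "https://api.together.xyz/v1"


class StandInOpenAIBackend:
    """Hypothetical stand-in for citeformer's OpenAIBackend (not the real class)."""

    def __init__(self, model: str, **client_kwargs: Any) -> None:
        self.model = model
        # The real backend would presumably build an openai.OpenAI client here;
        # the stand-in only records the arguments it was given.
        self.client_kwargs = client_kwargs

    def describe(self) -> str:
        # Placeholder for inherited behaviour (schema construction, streaming, ...).
        return f"{self.model} via {self.client_kwargs['base_url']}"


class SketchTogetherBackend(StandInOpenAIBackend):
    """Overrides only __init__; every other method is inherited unchanged."""

    def __init__(
        self,
        model: str = "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",  # assumed full id
        *,
        api_key: Optional[str] = None,
        base_url: str = DEFAULT_BASE_URL,
        **client_kwargs: Any,
    ) -> None:
        # Pick up the key from the environment when none is passed explicitly.
        key = api_key or os.environ.get("TOGETHER_API_KEY")
        super().__init__(model, api_key=key, base_url=base_url, **client_kwargs)


backend = SketchTogetherBackend(api_key="example-key", timeout=30)
```

Because only the constructor differs, any fix made in the parent class reaches the Together backend without further changes, which is the main appeal of keeping the subclass this small.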
Module Contents
Classes
TogetherBackend | Together AI backend with strict json_schema cite enforcement.
Data
DEFAULT_BASE_URL
API
- citeformer.backends.together.DEFAULT_BASE_URL
- class citeformer.backends.together.TogetherBackend(model: str = _DEFAULT_MODEL, *, client: Any | None = None, async_client: Any | None = None, api_key: str | None = None, base_url: str = DEFAULT_BASE_URL, **client_kwargs: Any)
Bases: citeformer.backends.openai.OpenAIBackend
Together AI backend with strict json_schema cite enforcement.
Attributes:
    model: Together model id (meta-llama/..., Qwen/..., …).
    client: The underlying openai.OpenAI client, configured with Together’s base URL.
Initialization
Construct a Together backend.
Args:
    model: Together model id. See https://api.together.xyz/models for the live catalogue. Default is Meta-Llama-3.1-8B-Instruct-Turbo (small + cheap + supports json_schema constrained decoding).
    client: Pre-built openai.OpenAI client (already pointed at Together). When None, one is constructed using the other arguments.
    api_key: Together API key. When None, falls back to TOGETHER_API_KEY from the environment, then to client_kwargs["api_key"] if supplied.
    base_url: Together API base URL. Override only for staging / proxy testing.
    **client_kwargs: Forwarded to openai.OpenAI(**kwargs) when client is None (timeout, max_retries, …).
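The api_key fallback order described above can be sketched as a small pure function. The name resolve_api_key and its env parameter are hypothetical, introduced only to make the order explicit and easy to test; the backend's actual resolution code is not shown on this page.

```python
import os
from typing import Any, Mapping, Optional


def resolve_api_key(
    api_key: Optional[str],
    client_kwargs: Mapping[str, Any],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Hypothetical helper mirroring the documented fallback order."""
    # Default to the real process environment; tests can pass a plain dict.
    env = os.environ if env is None else env

    # 1. An explicit api_key argument wins.
    if api_key is not None:
        return api_key

    # 2. Otherwise use TOGETHER_API_KEY from the environment, if set.
    env_key = env.get("TOGETHER_API_KEY")
    if env_key:
        return env_key

    # 3. Finally fall back to client_kwargs["api_key"], which may be absent.
    return client_kwargs.get("api_key")


explicit = resolve_api_key("k-arg", {"api_key": "k-kw"}, env={"TOGETHER_API_KEY": "k-env"})
from_env = resolve_api_key(None, {"api_key": "k-kw"}, env={"TOGETHER_API_KEY": "k-env"})
from_kwargs = resolve_api_key(None, {"api_key": "k-kw"}, env={})
```

Taking the environment as a parameter is a design choice of this sketch: it keeps the function deterministic under test without mutating os.environ.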