citeformer.prompts

Prompt assembly helpers for RAG-style generation with citations.

Most users of citeformer stitch their own prompt before calling Citeformer.generate() — they add a system message, list the sources, drop in a few citation-density hints, and append the task. That boilerplate is easy to get subtly wrong (misnumber the sources, forget to show the [N] example, bury the task under too much preamble), so we ship a canonical builder here.

build_rag_prompt is intentionally string-in / string-out — the returned string is the exact prompt to feed to Citeformer.generate(). No chat templates applied: that’s the caller’s job (model-specific). If you need to format with a specific model’s chat template, wrap the string returned here as a user message and apply the template yourself.

The design goal is “helpful default, trivially overridable”. Every section is optional; pass None / empty to skip.

Module Contents

Functions

build_rag_prompt

Assemble a RAG-style prompt with numbered source context.

API

citeformer.prompts.build_rag_prompt(*, query: str, sources: list[citeformer.core.Source], system: str | None = None, cite_hint: str | None = _DEFAULT_CITE_HINT, example: str | None = None, answer_prefix: str | None = 'Answer:') str

Assemble a RAG-style prompt with numbered source context.

Args: query: The user-facing task or question. Required. sources: Sources in scope. Their 1-indexed position in the list becomes the citation id the model is allowed to emit. Must be non-empty. system: Optional top-of-prompt system message (role framing, style guidance). Rendered verbatim at the top of the prompt. cite_hint: Short instruction telling the model how to cite. Defaults to a reasonable generic hint; pass None to omit (useful if your system already explains citation conventions). example: Optional one-line example sentence showing the [N] pattern in context. Helps small models imitate the shape. answer_prefix: Trailing token(s) that invite the model to start the answer (e.g. "Answer:", "Survey:", "Summary:"). Set to None to omit — useful when your chat template adds its own assistant-turn prefix.

Returns: A string suitable for passing to Citeformer.generate(prompt=…).

Raises: ValueError: If query is empty or sources is empty.

Example: >>> prompt = build_rag_prompt( … query=”Explain self-attention.”, … sources=[Source(metadata={“id”: “vaswani”, “type”: “article-journal”, … “title”: “Attention Is All You Need”, … “author”: [{“family”: “Vaswani”}]}, … content=”…”)], … ) >>> “[1]” in prompt True