citeformer.integrations.llamaindex

LlamaIndex ↔ citeformer adapter.

LlamaIndex retrievers return List[NodeWithScore] (each wraps a TextNode with text + metadata attributes and a relevance score). To feed those into Citeformer.generate we convert each to a Source with CSL-JSON-shaped metadata.

Duck-typed: we don’t import LlamaIndex at module load. Any object with a text: str attribute and a metadata: dict attribute works, whether it’s a llama_index.core.schema.TextNode, a NodeWithScore (the adapter unwraps .node transparently), a pydantic model, or a plain namespace.
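The duck-typing contract described above can be pictured as a structural check. This is only a minimal sketch and not citeformer's code: TextNodeLike, MyChunk, and looks_like_node are hypothetical names introduced for illustration, and the real adapter may inspect attributes differently.

```python
from dataclasses import dataclass, field
from types import SimpleNamespace
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class TextNodeLike(Protocol):
    """Hypothetical structural type: anything exposing text and metadata."""

    text: str
    metadata: dict[str, Any]


@dataclass
class MyChunk:
    """A user-defined class with no LlamaIndex dependency (illustrative)."""

    text: str
    metadata: dict[str, Any] = field(default_factory=dict)


def looks_like_node(obj: object) -> bool:
    """Check for the two attributes, looking through a .node wrapper first."""
    inner = getattr(obj, "node", obj)
    # isinstance on a runtime-checkable Protocol only tests attribute presence.
    return isinstance(inner, TextNodeLike)


# A plain namespace and a NodeWithScore-shaped wrapper both qualify.
plain = SimpleNamespace(text="hello", metadata={"title": "Greeting"})
wrapped = SimpleNamespace(node=MyChunk(text="world"), score=0.9)
```

Under these assumptions, looks_like_node accepts both plain and wrapped, while an object missing either attribute is rejected.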

Typical usage::

    from citeformer import Citeformer
    from citeformer.integrations.llamaindex import sources_from_nodes

    nodes = index.as_retriever().retrieve(query)
    sources = sources_from_nodes(nodes)

    cf = Citeformer(backend=...)
    result = cf.generate(prompt=query, sources=sources)

Module Contents

Functions

default_metadata_converter

Fallback conversion from LlamaIndex-style metadata to CSL-JSON.

source_from_node

Convert one LlamaIndex-shaped node into a citeformer Source.

sources_from_nodes

Convert an iterable of LlamaIndex nodes to citeformer sources.

Data

API

citeformer.integrations.llamaindex.MetadataConverter

None

citeformer.integrations.llamaindex.default_metadata_converter(metadata: dict[str, Any]) -> dict[str, Any]

Fallback conversion from LlamaIndex-style metadata to CSL-JSON.

Pulls common keys LlamaIndex loaders use (title, file_name, url, page_label, document_title) and packages them as a CSL-JSON {id, type: 'webpage', title} item. Richer structured metadata set by a loader (e.g. SimpleDirectoryReader’s file_path) is stashed under _llamaindex_metadata so callers keep visibility.
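To make the shape of the result concrete, here is a rough sketch of a converter that behaves along these lines. It is not the library's implementation: sketch_metadata_converter is a hypothetical name, and the key precedence for the title, the way the id is derived, and the "Untitled" fallback are all assumptions, since the description above does not specify them.

```python
from typing import Any

# Keys the description lists as commonly set by LlamaIndex loaders.
_COMMON_KEYS = ("title", "document_title", "file_name", "url", "page_label")
# Assumed precedence when picking a title; the real order is not documented here.
_TITLE_KEYS = ("title", "document_title", "file_name")


def sketch_metadata_converter(metadata: dict[str, Any]) -> dict[str, Any]:
    """Illustrative stand-in for default_metadata_converter."""
    title = next(
        (str(metadata[key]) for key in _TITLE_KEYS if metadata.get(key)),
        "Untitled",  # assumed fallback
    )
    item: dict[str, Any] = {
        # Assumption: reuse the URL or file name as the id when available.
        "id": str(metadata.get("url") or metadata.get("file_name") or title),
        "type": "webpage",
        "title": title,
    }
    # Anything outside the common keys stays visible under a private key.
    extras = {k: v for k, v in metadata.items() if k not in _COMMON_KEYS}
    if extras:
        item["_llamaindex_metadata"] = extras
    return item


item = sketch_metadata_converter(
    {"file_name": "report.pdf", "file_path": "/data/report.pdf"}
)
```

In this sketch, file_name supplies both the title and the id, and file_path ends up under _llamaindex_metadata.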

citeformer.integrations.llamaindex.source_from_node(node: citeformer.integrations.llamaindex._TextNodeLike | citeformer.integrations.llamaindex._NodeWithScoreLike, *, metadata_converter: citeformer.integrations.llamaindex.MetadataConverter | None = None) -> citeformer.core.Source

Convert one LlamaIndex-shaped node into a citeformer Source.

Accepts either a bare TextNode-like object (has text + metadata) or a NodeWithScore-like wrapper (has .node with the above attributes). The adapter unwraps the latter automatically, so callers don’t have to reach into .node themselves.

Args:
    node: The LlamaIndex node or node-with-score to convert.
    metadata_converter: Optional override for CSL-JSON conversion.

Returns:
    A Source with content from node.text and CSL-JSON metadata.

Raises:
    TypeError: If the object doesn’t have the expected attributes.
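The unwrap-then-convert flow, including the metadata_converter override, might look roughly like the sketch below. Everything here is an assumption made for illustration: SketchSource stands in for citeformer.core.Source (its field names are guessed from the Returns description), sketch_source_from_node is not the real function, the converter is assumed to be a callable from dict to dict, and a plain copy is used where the real code would fall back to default_metadata_converter.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, Callable, Optional


@dataclass
class SketchSource:
    """Hypothetical stand-in for citeformer.core.Source; field names assumed."""

    content: str
    metadata: dict[str, Any]


def sketch_source_from_node(
    node: Any,
    *,
    metadata_converter: Optional[Callable[[dict[str, Any]], dict[str, Any]]] = None,
) -> SketchSource:
    # Unwrap a NodeWithScore-like wrapper; bare nodes pass through unchanged.
    inner = getattr(node, "node", node)
    text = getattr(inner, "text", None)
    metadata = getattr(inner, "metadata", None)
    if not isinstance(text, str) or not isinstance(metadata, dict):
        raise TypeError("expected an object with text: str and metadata: dict")
    # Placeholder default: copy the metadata as-is (the real default differs).
    convert = metadata_converter or (lambda m: dict(m))
    return SketchSource(content=text, metadata=convert(metadata))


def book_converter(metadata: dict[str, Any]) -> dict[str, Any]:
    """Example override emitting a CSL-JSON 'book' item (hypothetical keys)."""
    return {"id": metadata["isbn"], "type": "book", "title": metadata["title"]}


hit = SimpleNamespace(
    node=SimpleNamespace(
        text="Call me Ishmael.",
        metadata={"isbn": "978-0", "title": "Moby-Dick"},
    ),
    score=0.87,
)
source = sketch_source_from_node(hit, metadata_converter=book_converter)
```

Passing an object without text and metadata, such as object(), raises TypeError in this sketch, which matches the documented failure mode.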

citeformer.integrations.llamaindex.sources_from_nodes(nodes: collections.abc.Iterable[citeformer.integrations.llamaindex._TextNodeLike | citeformer.integrations.llamaindex._NodeWithScoreLike], *, metadata_converter: citeformer.integrations.llamaindex.MetadataConverter | None = None) -> list[citeformer.core.Source]

Convert an iterable of LlamaIndex nodes to citeformer sources.

Preserves order; LlamaIndex retrievers return relevance-sorted results, and the citation-id assigned by citeformer (1-indexed position) mirrors that order.
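The relationship between retrieval order and citation ids can be illustrated with plain Python. This is a sketch under assumptions: the hits list is made-up data shaped like retriever output, and number_in_order is a hypothetical helper, not part of citeformer.

```python
from types import SimpleNamespace

# Made-up retriever output, already sorted by descending relevance score.
hits = [
    SimpleNamespace(node=SimpleNamespace(text="most relevant", metadata={}), score=0.92),
    SimpleNamespace(node=SimpleNamespace(text="second best", metadata={}), score=0.71),
    SimpleNamespace(node=SimpleNamespace(text="least relevant", metadata={}), score=0.40),
]


def number_in_order(texts: list[str]) -> dict[int, str]:
    """Assign 1-indexed ids by position, without re-sorting the input."""
    return dict(enumerate(texts, start=1))


texts = [hit.node.text for hit in hits]  # conversion keeps the input order
citation_ids = number_in_order(texts)
```

Because nothing is re-sorted, citation 1 refers to the top-ranked retrieval result, so reordering the nodes before conversion would also renumber the citations.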