citeformer.metadata.zotero

Zotero CSL-JSON library loader.

Zotero’s “Export → CSL JSON” option produces an array of CSL-JSON items, which is already the shape that citeformer.core.Source consumes. This module provides a thin loader plus a few ergonomic niceties:

  • De-duplication of items whose id collides on export (Zotero sometimes emits colliding itemKey values when the same record appears in multiple collections).

  • Optional filtering by a user-supplied predicate (e.g. “only papers from 2020 onward”, “only AI/ML tags”).

  • Graceful handling of minor Zotero quirks (issued.date-parts with stringified year, stray null fields).
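The clean-up steps listed above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the module's actual implementation: the helper names normalise_item and dedupe_by_id are hypothetical, and the sample records are made up for the example.

```python
from collections.abc import Iterable
from typing import Any


def normalise_item(item: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical clean-up: drop null fields, coerce digit strings in date-parts to int."""
    cleaned = {key: value for key, value in item.items() if value is not None}
    issued = cleaned.get("issued")
    if isinstance(issued, dict) and isinstance(issued.get("date-parts"), list):
        cleaned["issued"] = {
            **issued,
            "date-parts": [
                [
                    int(part) if isinstance(part, str) and part.strip().isdigit() else part
                    for part in parts
                ]
                for parts in issued["date-parts"]
            ],
        }
    return cleaned


def dedupe_by_id(items: Iterable[dict[str, Any]]) -> list[dict[str, Any]]:
    """Hypothetical de-duplication: keep the first item seen for each id, preserving order."""
    seen: set[Any] = set()
    unique: list[dict[str, Any]] = []
    for item in items:
        item_id = item.get("id")
        if item_id is not None and item_id in seen:
            continue  # a later copy of a record we already kept
        seen.add(item_id)
        unique.append(item)
    return unique


# Made-up sample data: one record exported twice, with a stringified year and a null field.
raw_items = [
    {"id": "ABCD1234", "title": "First copy",
     "issued": {"date-parts": [["2021", 5]]}, "abstract": None},
    {"id": "ABCD1234", "title": "Second copy"},
    {"id": "EFGH5678", "title": "Other", "issued": {"date-parts": [[2019]]}},
]

items = dedupe_by_id(normalise_item(item) for item in raw_items)
```

In this sketch the second "ABCD1234" record is dropped, the year "2021" becomes the integer 2021, and the null abstract field disappears from the first record.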

The Better BibTeX plugin’s CSL-JSON export is also supported: it is substantially identical to stock Zotero output, so the same loader handles both.

Module Contents

Functions

load_zotero_csl

Load a Zotero CSL-JSON export → list of normalised CSL-JSON items.

API

citeformer.metadata.zotero.load_zotero_csl(source: str | pathlib.Path | collections.abc.Iterable[dict[str, Any]], *, filter_fn: collections.abc.Callable[[dict[str, Any]], bool] | None = None, dedupe: bool = True) → list[dict[str, Any]]

Load a Zotero CSL-JSON export → list of normalised CSL-JSON items.

Args:

  • source: Path to a .json CSL-JSON export, a raw CSL-JSON string, or an iterable of items (which lets you compose with in-memory data).

  • filter_fn: Optional predicate; items for which it returns False are dropped. It is called on each item after normalisation, so it sees the final shape that downstream code will see.

  • dedupe: If True (default), items with duplicate id values are collapsed into one, keeping the first occurrence. Zotero’s CSL export sometimes emits colliding keys when the same record lives in multiple collections.

Returns: Normalised CSL-JSON items in document order.
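
As an illustration of the filter_fn argument, the following sketch defines a predicate for the “only papers from 2020 onward” case. The predicate name from_2020_onward and the sample items are made up for the example. To keep the snippet runnable without citeformer installed, a plain list comprehension stands in for the loader's filtering step.

```python
from typing import Any


def from_2020_onward(item: dict[str, Any]) -> bool:
    """Keep items whose first issued date-part (the year) is an int >= 2020."""
    # Assumes years are already integers, as they would be after normalisation.
    date_parts = (item.get("issued") or {}).get("date-parts") or [[]]
    first = date_parts[0]
    return bool(first) and isinstance(first[0], int) and first[0] >= 2020


# Made-up in-memory items in CSL-JSON shape.
items = [
    {"id": "a", "title": "Recent", "issued": {"date-parts": [[2021, 5]]}},
    {"id": "b", "title": "Older", "issued": {"date-parts": [[2017]]}},
    {"id": "c", "title": "Undated"},
]

# Stand-in for the loader's filtering step, so the sketch runs on its own.
kept = [item for item in items if from_2020_onward(item)]

# With citeformer installed, the documented call would be:
#     load_zotero_csl(items, filter_fn=from_2020_onward)
```

Undated items are rejected here by design; a predicate that should keep them would return True when no date-parts are present.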