citeformer.metadata.bibtex

Minimal BibTeX → CSL-JSON parser.

Small, dependency-free BibTeX adapter for the common case: @article{...}, @book{...}, @inproceedings{...}, etc., with a handful of familiar fields (author, title, year, journal, pages, doi, …). Complex BibTeX corners (@string macro substitution, @preamble blocks, LaTeX accent escapes, crossref inheritance) are out of scope. Users whose libraries rely on these features should first run the file through bibtexparser (https://pypi.org/project/bibtexparser/), or export from Zotero and load the result with citeformer.metadata.zotero.load_zotero_csl instead.

What we do support:

  • An entry header @type{key, followed by fields written as field = {value} or field = "value", separated by commas.

  • Balanced braces inside values (title = {The {B}ook}).

  • Author / editor splitting on " and " (case-insensitive), accepting both the "Family, Given" and the "Given Family" name conventions.

  • A mapping of common fields and entry types to CSL 1.0.

Unknown fields are preserved under the custom key to avoid silent data loss.
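The author / editor splitting rule described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the function name split_names is hypothetical, and splitting on any whitespace around "and" (rather than single spaces only) is an assumption.

```python
import re


def split_names(field: str) -> list[dict[str, str]]:
    """Split a BibTeX author/editor field into CSL-style name dicts.

    Hypothetical helper for illustration; the real adapter may differ.
    """
    names = []
    # Split on the word "and" surrounded by whitespace, ignoring case.
    for raw in re.split(r"\s+and\s+", field.strip(), flags=re.IGNORECASE):
        raw = raw.strip()
        if not raw:
            continue
        if "," in raw:
            # "Family, Given" convention.
            family, given = (part.strip() for part in raw.split(",", 1))
        else:
            # "Given Family" convention: the last token is the family name.
            given, _, family = raw.rpartition(" ")
        name = {"family": family}
        if given:
            name["given"] = given
        names.append(name)
    return names


authors = split_names("Knuth, Donald E. AND Leslie Lamport")
```

Here authors holds one dict per person, each with a family key and, when present, a given key. Multi-word family names written in "Given Family" order (for example "Ludwig van Beethoven") are not handled by this sketch; the "Family, Given" form avoids that ambiguity.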

Public API:

  • parse_bibtex: parse a BibTeX string into a list of entries.

  • bibtex_to_csl_json: convert one parsed entry to CSL-JSON.

  • load_bibtex: parse a file path or string and return a list of CSL-JSON dicts ready to hand to citeformer.core.Source.

Module Contents

Functions

load_bibtex

Parse a BibTeX file path or string, return a list of CSL-JSON dicts.

parse_bibtex

Parse BibTeX text into a list of raw-field dicts.

bibtex_to_csl_json

Convert one parsed BibTeX entry (from parse_bibtex) to CSL-JSON.

Data

BIBTEX_TYPE_MAP

Mapping from BibTeX entry types to CSL item types.

API

citeformer.metadata.bibtex.BIBTEX_TYPE_MAP: dict[str, str]

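Mapping from BibTeX entry types to CSL item types. Entry types without a mapping are converted to document (see bibtex_to_csl_json).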

citeformer.metadata.bibtex.load_bibtex(source: str | pathlib.Path) → list[dict[str, Any]]

Parse a BibTeX file path or string, return a list of CSL-JSON dicts.

Args: source: Either a path-like object pointing to a .bib file or the BibTeX text itself. Detection uses Path.is_file, a filesystem check, so a string that looks like a path but does not exist on disk is treated as BibTeX source.
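The path-versus-text detection can be sketched like this. The helper name read_bibtex_source is hypothetical, and the try/except around the filesystem check is an assumption added for robustness (very long strings can make the check itself raise on some platforms); whether the real function guards against that is not stated here.

```python
from pathlib import Path
from typing import Union


def read_bibtex_source(source: Union[str, Path]) -> str:
    """Return BibTeX text, reading from disk only if `source` is an existing file.

    Hypothetical helper for illustration; not part of the documented API.
    """
    try:
        if Path(source).is_file():
            return Path(source).read_text(encoding="utf-8")
    except (OSError, ValueError):
        # Assumption: if the filesystem check itself fails (for example on a
        # string too long to be a file name), treat the input as BibTeX text.
        pass
    return str(source)
```

One consequence of this design is that a mistyped file name is not reported as a missing file: the string is parsed as BibTeX and typically yields an empty list.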

Returns: A list of CSL-JSON item dicts in document order. Each item has id (the BibTeX cite key), type (mapped from the BibTeX entry type), and whichever fields we recognised.

citeformer.metadata.bibtex.parse_bibtex(text: str) → list[dict[str, Any]]

Parse BibTeX text into a list of raw-field dicts.

Each dict has keys __type (entry type, lowercased), __key (cite key), and one entry per field. Values are strings (curly braces and quotes stripped). Use bibtex_to_csl_json to map to CSL-JSON.

Args: text: Full BibTeX file content.

Returns: List of entries in source order. Entries that fail to parse are skipped (no exception raised).
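To make the shape of the raw-field dicts concrete, here is a minimal sketch of a brace-balancing parser that skips entries it cannot read. It is not the module's implementation: the names parse_entries_sketch and read_value are hypothetical, only the outer delimiters of a value are stripped (inner braces are kept, which is an assumption), and field names are lowercased (also an assumption).

```python
import re

ENTRY_START = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
FIELD_NAME = re.compile(r"\s*(\w+)\s*=\s*")
COMMA = re.compile(r"\s*,")


def read_value(text: str, pos: int) -> tuple[str, int]:
    """Read one {braced} or "quoted" value at pos; return (value, end index)."""
    opener = text[pos]
    if opener == "{":
        depth = 0
        for i in range(pos, len(text)):
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
                if depth == 0:
                    return text[pos + 1 : i], i + 1
        raise ValueError("unbalanced braces")
    if opener == '"':
        end = text.index('"', pos + 1)  # ValueError if the quote never closes
        return text[pos + 1 : end], end + 1
    raise ValueError("unsupported value syntax")


def parse_entries_sketch(text: str) -> list[dict[str, str]]:
    """Hypothetical stand-in that mimics the documented output shape."""
    entries = []
    pos = 0
    while (match := ENTRY_START.search(text, pos)) is not None:
        pos = match.end()
        entry = {"__type": match.group(1).lower(), "__key": match.group(2)}
        try:
            while (field := FIELD_NAME.match(text, pos)) is not None:
                value, pos = read_value(text, field.end())
                entry[field.group(1).lower()] = value  # lowercasing is an assumption
                comma = COMMA.match(text, pos)
                if comma:
                    pos = comma.end()
        except (ValueError, IndexError):
            continue  # skip the malformed entry instead of raising
        entries.append(entry)
    return entries


sample = '''
@Article{knuth84,
  title = {The {B}ook},
  journal = "The Computer Journal",
}
@book{broken, title = {unclosed
'''
entries = parse_entries_sketch(sample)
```

In this sketch the second entry is dropped because its braces never balance, so entries contains a single dict for knuth84 with __type, __key, title and journal keys.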

citeformer.metadata.bibtex.bibtex_to_csl_json(entry: dict[str, Any]) → dict[str, Any]

Convert one parsed BibTeX entry (from parse_bibtex) to CSL-JSON.

The BibTeX cite key becomes the CSL id; the BibTeX entry type is mapped via BIBTEX_TYPE_MAP (unmapped types become document). Known fields are renamed; unknown fields land in custom so round-tripping through the adapter is lossless.

Args: entry: A dict from parse_bibtex with __type and __key bookkeeping keys plus raw field values.

Returns: A CSL-JSON item dict.
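The conversion rules can be illustrated with a small sketch. The two maps below are hypothetical excerpts chosen for the example (the real BIBTEX_TYPE_MAP and field map may contain different entries), and the function name to_csl_sketch is likewise made up; the CSL variable names used (container-title, page, DOI, issued) are standard CSL-JSON names.

```python
from typing import Any

# Hypothetical excerpts for illustration; the module's real maps may differ.
TYPE_MAP = {
    "article": "article-journal",
    "book": "book",
    "inproceedings": "paper-conference",
}
FIELD_MAP = {
    "title": "title",
    "journal": "container-title",
    "pages": "page",
    "doi": "DOI",
}


def to_csl_sketch(entry: dict[str, Any]) -> dict[str, Any]:
    """Map one raw-field dict to a CSL-JSON-like item (illustrative only)."""
    item: dict[str, Any] = {
        "id": entry["__key"],
        # Entry types missing from the map fall back to "document".
        "type": TYPE_MAP.get(entry["__type"], "document"),
    }
    custom: dict[str, Any] = {}
    for name, value in entry.items():
        if name.startswith("__"):
            continue  # bookkeeping keys, already consumed above
        if name == "year" and str(value).isdigit():
            item["issued"] = {"date-parts": [[int(value)]]}
        elif name in FIELD_MAP:
            item[FIELD_MAP[name]] = value
        else:
            custom[name] = value  # unknown fields are kept, not dropped
    if custom:
        item["custom"] = custom
    return item


item = to_csl_sketch({
    "__type": "article",
    "__key": "knuth84",
    "title": "Literate Programming",
    "journal": "The Computer Journal",
    "year": "1984",
    "keywords": "tex",
})
```

In this example the keywords field has no CSL counterpart in the sketch's field map, so it ends up under item["custom"] rather than being discarded.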