Docling And OpenAlex Walkthrough

This tutorial walks through the two strongest enrichment steps in biblio: converting local PDFs with Docling and resolving bibliography entries against OpenAlex.

Goal

Starting from bib/srcbib/*.bib and local PDFs, produce:

  • Docling markdown and sidecars under bib/derivatives/docling/
  • OpenAlex resolution output under bib/derivatives/openalex/

Merge and fetch

biblio bibtex merge
biblio bibtex fetch-pdfs

Run Docling for all known papers

biblio docling run --all

Expected outputs per citekey:

  • markdown
  • structured JSON
  • provenance sidecar
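
To check that the Docling run produced something for each output type, a small inventory script can help. The sketch below only assumes the bib/derivatives/docling/ directory named above; it makes no assumption about file names or per-citekey layout and simply counts files by extension. The function name summarize_derivatives is hypothetical and not part of biblio.

```python
from collections import Counter
from pathlib import Path


def summarize_derivatives(root: str = "bib/derivatives/docling") -> Counter:
    """Count every file under `root` (recursively) by its file suffix.

    Hypothetical helper, not part of biblio. The directory layout below
    `root` is not assumed; files without a suffix are counted as "(none)".
    """
    base = Path(root)
    if not base.is_dir():
        # Nothing has been generated yet (or the path is wrong).
        return Counter()
    return Counter(
        path.suffix or "(none)"
        for path in base.rglob("*")
        if path.is_file()
    )
```

Calling summarize_derivatives() from the project root and printing the result gives a quick view of how many files of each type exist; an empty result suggests the Docling step has not run yet or wrote to a different location.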

Resolve OpenAlex

biblio openalex resolve

Expected outputs:

  • bib/derivatives/openalex/resolved.jsonl
  • optional CSV summary
  • cached API responses under bib/derivatives/openalex/cache/
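
Because resolved.jsonl is a JSON Lines file, it can be inspected with a few lines of Python. The sketch below assumes only that each non-empty line holds one JSON object; the field names inside each record are not documented here, so the code discovers them rather than hard-coding any. Both function names are hypothetical.

```python
import json
from pathlib import Path


def load_jsonl(path="bib/derivatives/openalex/resolved.jsonl"):
    """Read a JSON Lines file into a list, one parsed value per line.

    Hypothetical helper, not part of biblio. Blank lines are skipped;
    a malformed line raises ValueError with the offending line number.
    """
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_number}: invalid JSON") from exc
    return records


def field_names(records):
    """Return the sorted union of keys across all dict records."""
    names = set()
    for record in records:
        if isinstance(record, dict):
            names.update(record)
    return sorted(names)
```

Running field_names(load_jsonl()) from the project root lists the keys that actually appear in the resolution output, which is a safer starting point than guessing at the schema.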

Next step

Use these derivatives to build a browsable portal:

biblio site build
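
The commands in this walkthrough can also be chained from a script. The following is a minimal sketch that runs them in the order shown on this page and stops at the first failure. It assumes biblio is on the PATH and that each command exits non-zero on failure; the run_pipeline function and its runner parameter are hypothetical conveniences, not part of biblio.

```python
import subprocess

# The commands from this walkthrough, in the order they appear.
STEPS = [
    ["biblio", "bibtex", "merge"],
    ["biblio", "bibtex", "fetch-pdfs"],
    ["biblio", "docling", "run", "--all"],
    ["biblio", "openalex", "resolve"],
    ["biblio", "site", "build"],
]


def run_pipeline(steps=STEPS, runner=subprocess.run):
    """Run each command in order and return how many completed.

    With the default runner, check=True makes subprocess.run raise
    CalledProcessError on a non-zero exit code, so later steps are
    skipped after a failure. `runner` is injectable for dry runs.
    """
    completed = 0
    for command in steps:
        print("running:", " ".join(command))
        runner(command, check=True)
        completed += 1
    return completed
```

Calling run_pipeline() from the project root executes all five steps. Passing a custom runner, for example one that only logs the command, gives a dry run without touching any files.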