PDF Discovery Candidates

Date: 2026-03-09

Context

biblio currently supports:

It does not yet have a true PDF discovery layer for finding accessible full text from metadata such as DOI, OpenAlex IDs, or related identifiers.

Priority order:

1. DOI resolution is the most natural first step because biblio already supports DOI ingestion.
2. Unpaywall-style OA lookup is a legal and reliable source of accessible PDFs.
3. OpenAlex is already a core metadata and graph backend for biblio, so it is a natural companion signal source.
4. arXiv and PMC are high-value special cases with relatively deterministic full-text locations.
5. Generic publisher scraping should remain a fallback rather than the primary architecture.
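
The priority order above could be sketched as a function that turns whatever identifiers are known into an ordered list of lookups. This is only an illustration, not biblio code: `Candidate`, `discovery_candidates`, and the `email` parameter are hypothetical names, and the URL patterns (especially the PMC one) are assumptions to check against each service's current documentation. Nothing is fetched; the function only builds the ordered plan.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import quote


@dataclass(frozen=True)
class Candidate:
    """One lookup to try (hypothetical type, not part of biblio)."""
    source: str   # which step of the priority order produced this entry
    url: str      # URL to resolve, query, or fetch
    direct: bool  # True if the URL is expected to lead straight to a PDF


def discovery_candidates(
    doi: Optional[str] = None,
    openalex_id: Optional[str] = None,
    arxiv_id: Optional[str] = None,
    pmcid: Optional[str] = None,
    email: str = "you@example.org",  # assumed contact address for Unpaywall
) -> list[Candidate]:
    """Return lookups in priority order; performs no network access."""
    out: list[Candidate] = []
    if doi:
        d = quote(doi, safe="/")
        # 1. DOI resolution: follow the resolver to the landing page.
        out.append(Candidate("doi", f"https://doi.org/{d}", False))
        # 2. Unpaywall-style OA lookup keyed by DOI.
        out.append(Candidate(
            "unpaywall",
            f"https://api.unpaywall.org/v2/{d}?email={quote(email)}",
            False,
        ))
    # 3. OpenAlex: prefer its own ID, otherwise fall back to the DOI.
    if openalex_id:
        out.append(Candidate(
            "openalex", f"https://api.openalex.org/works/{openalex_id}", False))
    elif doi:
        out.append(Candidate(
            "openalex", f"https://api.openalex.org/works/doi:{d}", False))
    # 4. arXiv and PMC: full-text locations derived from the identifier.
    if arxiv_id:
        out.append(Candidate("arxiv", f"https://arxiv.org/pdf/{arxiv_id}", True))
    if pmcid:
        # Assumed URL pattern; verify before relying on it.
        out.append(Candidate(
            "pmc", f"https://www.ncbi.nlm.nih.gov/pmc/articles/{pmcid}/pdf/", True))
    # 5. Generic publisher scraping is deliberately absent: it would only
    #    run as a fallback after every candidate above has failed.
    return out
```

A caller would walk the returned list in order and stop at the first candidate that yields an accessible PDF. Keeping the plan separate from the fetching makes the ordering easy to test without network access.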

Possible future command:

biblio pdf discover

Suggested behavior:

Avoid: