Skip to content

biblio_pdf_fetch_oa: html_fallback should be a clear failure

Symptom (pixecog 2026-05-05): Calling biblio_pdf_fetch_oa(citekeys=["dvorak_2021_DentateSpikes"]) returned {"total":1, "openalex":0, "unpaywall":0, "ezproxy":0, "pool_linked":0, "html_fallback":1, "error":0}. Looks like a success (error=0). But subsequent biblio_status showed pdf: false — no actual PDF was fetched. Required manual placement of a downloaded PDF at bib/articles/<citekey>/<citekey>.pdf to unblock docling/grobid.

Expected: html_fallback outcome should make it obvious to the caller that no usable PDF was obtained. Options: 1. Reclassify html_fallback as a failure (count it under error or a dedicated failed bucket) when the resulting file is HTML rather than PDF. 2. Don't write the HTML to bib/articles/.../<citekey>.pdf at all — leave the slot empty so biblio_status and the cascade output agree. 3. At minimum return an explicit next_action: "manual_pdf_drop" hint with the canonical target path (bib/articles/<citekey>/<citekey>.pdf).

The current behavior creates an asymmetry: cascade output says "html_fallback:1" (sounds like a tier of the cascade succeeded), biblio_status says pdf: false (correct). An agent has to know to cross-check both.

Reference investigation: see pixecog 2026-05-05 dvorak_2021_DentateSpikes ingest.


Source context: pixecog

PixEcog (pixecog): Neuropixels and ECoG dataset and analysis

Recent commits:

37dc64d fix author: Arash Sal Moslehian → Arash Shahidi
8bcc47f calibrate_ieeg_clean: Plan B coherence-test failure + handoff
9bf2e76 spikesorting docs: park date_taskgroup mode, document UnitMatch path

README:


type: readme


Quick Start for Collaborators

Follow this checklist to get started with Pixecog documentation and workflows.

🐀 Pixecog Project — Compact Overview

Core principles

  • One immutable BIDS raw dataset (raw/) as the canonical baseline
  • Each analysis pipeline ha