biblio_pdf_fetch_oa: html_fallback should be a clear failure¶
Symptom (pixecog 2026-05-05): Calling biblio_pdf_fetch_oa(citekeys=["dvorak_2021_DentateSpikes"]) returned {"total":1, "openalex":0, "unpaywall":0, "ezproxy":0, "pool_linked":0, "html_fallback":1, "error":0}. Looks like a success (error=0). But subsequent biblio_status showed pdf: false — no actual PDF was fetched. Required manual placement of a downloaded PDF at bib/articles/<citekey>/<citekey>.pdf to unblock docling/grobid.
Expected: html_fallback outcome should make it obvious to the caller that no usable PDF was obtained. Options:
1. Reclassify html_fallback as a failure (count it under error or a dedicated failed bucket) when the resulting file is HTML rather than PDF.
2. Don't write the HTML to bib/articles/.../<citekey>.pdf at all — leave the slot empty so biblio_status and the cascade output agree.
3. At minimum return an explicit next_action: "manual_pdf_drop" hint with the canonical target path (bib/articles/<citekey>/<citekey>.pdf).
The current behavior creates an asymmetry: cascade output says "html_fallback:1" (sounds like a tier of the cascade succeeded), biblio_status says pdf: false (correct). An agent has to know to cross-check both.
Reference investigation: see pixecog 2026-05-05 dvorak_2021_DentateSpikes ingest.
Source context: pixecog¶
PixEcog (pixecog): Neuropixels and ECoG dataset and analysis
Recent commits:
37dc64d fix author: Arash Sal Moslehian → Arash Shahidi
8bcc47f calibrate_ieeg_clean: Plan B coherence-test failure + handoff
9bf2e76 spikesorting docs: park date_taskgroup mode, document UnitMatch path
README:
type: readme
Quick Start for Collaborators¶
Follow this checklist to get started with Pixecog documentation and workflows.
🐀 Pixecog Project — Compact Overview¶
Core principles
- One immutable BIDS raw dataset (
raw/) as the canonical baseline - Each analysis pipeline ha