title: 'biblio_pdf_fetch_oa reports "html_fallback: 15" but downloads 0 files'
status: open
created: 2026-04-09
updated: 2026-04-09
timestamp: 20260409-231641-242830
tags: [issue]
source: agent-observation
project_primary: projio
capture_id: 20260409-231640-9187cf
confidence: 1.0
transcript_file: /storage2/arash/worklog/workflow/captures/20260409-231640-9187cf/transcript.txt

`biblio_pdf_fetch_oa` reports "html_fallback: 15" but downloads 0 files

During the pixecog batch fetch, `biblio_pdf_fetch_oa` reported `html_fallback: 15` as successes, but the article directories were created empty (no PDF inside). `pdf_validate` then checked only the 25 pre-existing PDFs and skipped the 22 new empty directories.

Issues

The `html_fallback` count is misleading: it counted directory creation, not actual PDF downloads
`pdf_validate` doesn't scan all article directories, only those that already contain files
The result has no summary of "these N papers still need PDFs"
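The gap can be confirmed without either tool by scanning the article directories directly. This is a minimal sketch, assuming each article has its own directory under a common root and that a successful download is any non-empty `*.pdf` file inside it; the function name and directory layout are hypothetical and not taken from the biblio code.

```python
from pathlib import Path


def find_dirs_missing_pdf(articles_root: Path) -> list[Path]:
    """Return article directories that contain no non-empty PDF file.

    Assumption: one subdirectory per article directly under articles_root.
    """
    missing = []
    for article_dir in sorted(p for p in articles_root.iterdir() if p.is_dir()):
        # A zero-byte file is treated as "no PDF", same as an empty directory.
        has_pdf = any(
            f.is_file() and f.stat().st_size > 0
            for f in article_dir.glob("*.pdf")
        )
        if not has_pdf:
            missing.append(article_dir)
    return missing
```

Run against the pixecog article root, a scan like this would be expected to list the new empty directories that `pdf_validate` skipped.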

Suggested fix

Report actual download success vs directory-only creation separately
pdf_validate should flag empty article directories as "missing PDF"
Add a needs_pdf count to the result summary
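One possible shape for the separated counts is sketched below. It assumes the fetch step knows, per article, whether a PDF was actually written; `FetchSummary`, `record`, and every field name other than `needs_pdf` are illustrative and do not reflect the current `biblio_pdf_fetch_oa` result format.

```python
from dataclasses import dataclass, field


@dataclass
class FetchSummary:
    """Hypothetical result summary that separates real downloads from empty dirs."""

    downloaded: list[str] = field(default_factory=list)      # PDF actually written
    directory_only: list[str] = field(default_factory=list)  # directory created, no PDF

    @property
    def needs_pdf(self) -> int:
        # Every directory-only article still needs a PDF.
        return len(self.directory_only)

    def as_dict(self) -> dict:
        return {
            "downloaded": len(self.downloaded),
            "directory_only": len(self.directory_only),
            "needs_pdf": self.needs_pdf,
            "needs_pdf_keys": sorted(self.directory_only),
        }


def record(summary: FetchSummary, key: str, pdf_written: bool) -> None:
    """Classify one article by whether a PDF was actually written to disk."""
    target = summary.downloaded if pdf_written else summary.directory_only
    target.append(key)
```

Listing the affected keys alongside the count would give the "these N papers still need PDFs" summary directly, rather than leaving it to be reconstructed from the filesystem.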

Source context: pixecog

PixEcog (pixecog): Neuropixels and ECoG dataset and analysis

Recent commits:

8dc0d9d Pipeline docs: gitignore docs/pipelines/, relocate hand-authored files
96cd1ec Refactor sharpwaveripple/contracts: extract generic helpers to utils/io, remove pipelines __init__.py
36f9326 Add result note directory and sample note

README:

type: readme

Quick Start for Collaborators

Follow this checklist to get started with Pixecog documentation and workflows.

🐀 Pixecog Project — Compact Overview

Core principles

One immutable BIDS raw dataset (raw/) as the canonical baseline
Each analysis pipeline ha

issue-arash-20260409-231618-516346.md — Same pattern: biblio tool reports misleading success count while silently producing no real output
issue-arash-20260404-021628-584751.md — Study on Unpaywall/oadoi OA PDF cascade — directly relevant to html_fallback behavior in biblio_pdf_fetch_oa
issue-arash-20260409-231546-838942.md — Missing biblio_openalex_resolve tool — part of the same OA PDF fetch/resolution workflow
issue-arash-20260404-014926-649273.md — Biblio CLI implementation including pool promote — downstream of PDF fetch; empty dirs affect pool promotion
