
title: "## biblio_pdf_fetch_oa reports "html_fallback: 15" but downloads 0 files

Duri"
status: open
created: 2026-04-09
updated: 2026-04-09
timestamp: 20260409-231641-242830
tags: [issue]
source: agent-observation
project_primary: projio
capture_id: 20260409-231640-9187cf
confidence: 1.0
transcript_file: /storage2/arash/worklog/workflow/captures/20260409-231640-9187cf/transcript.txt


biblio_pdf_fetch_oa reports "html_fallback: 15" but downloads 0 files

During the pixecog batch fetch, the result reported html_fallback: 15 as successes, but the article directories were created empty (no PDF inside). The pdf_validate tool then checked only the 25 pre-existing PDFs and skipped the 22 new empty directories.

Issues

  1. html_fallback count is misleading: it counted directory creation, not actual PDF downloads
  2. pdf_validate doesn't scan all article dirs, only those that already contain files
  3. No summary of "these N papers still need PDFs" in the result

Suggested fix

  • Report actual download success vs directory-only creation separately
  • pdf_validate should flag empty article directories as "missing PDF"
  • Add a needs_pdf count to the result summary
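
The suggested fix could be sketched roughly as follows. This is a minimal illustration, not the actual biblio implementation: it assumes one subdirectory per article under a common root and PDFs stored with a .pdf extension, and the names is_pdf and summarize_article_dirs are hypothetical. It walks every article directory (including empty ones), counts a download only when a file really starts with the PDF magic bytes, and reports a needs_pdf count.

```python
from pathlib import Path


def is_pdf(path: Path) -> bool:
    """Return True if the file starts with the PDF magic bytes (%PDF-)."""
    with path.open("rb") as fh:
        return fh.read(5) == b"%PDF-"


def summarize_article_dirs(root: Path) -> dict:
    """Classify every article directory under root, including empty ones.

    Assumed layout (hypothetical): root/<article_key>/<something>.pdf
    """
    downloaded, missing = [], []
    for article_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # A directory only counts as a success if it holds a real PDF,
        # not merely because the directory was created.
        pdfs = [f for f in article_dir.glob("*.pdf") if f.is_file() and is_pdf(f)]
        (downloaded if pdfs else missing).append(article_dir.name)
    return {
        "downloaded": downloaded,    # directories with at least one valid PDF
        "missing_pdf": missing,      # empty dirs, or dirs with no valid PDF
        "needs_pdf": len(missing),   # count for the result summary
    }
```

Checking the magic bytes rather than the file extension also catches the case where an HTML page was saved under a .pdf name, which seems plausible for an HTML fallback path.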

Source context: pixecog

PixEcog (pixecog): Neuropixels and ECoG dataset and analysis

Recent commits:

8dc0d9d Pipeline docs: gitignore docs/pipelines/, relocate hand-authored files
96cd1ec Refactor sharpwaveripple/contracts: extract generic helpers to utils/io, remove pipelines __init__.py
36f9326 Add result note directory and sample note

README:


type: readme


Quick Start for Collaborators

Follow this checklist to get started with Pixecog documentation and workflows.

🐀 Pixecog Project — Compact Overview

Core principles

  • One immutable BIDS raw dataset (raw/) as the canonical baseline
  • Each analysis pipeline ha