pipeio docs_collect: adopt Sphinx-inspired explicit manifest and source-only convention
Overview
Redesign pipeio docs_collect to follow Sphinx-inspired principles: explicit page manifests, source-tree immutability, and composable includes. Currently docs_collect uses implicit convention-based discovery that silently drops files it doesn't know about (e.g., overview.md, top-level cross-flow docs).
Lessons from Sphinx
- Explicit toctree, not implicit collection. Sphinx requires you to declare pages in `toctree` directives. `docs_collect` auto-discovers mod docs and notebooks but silently ignores anything outside its conventions. A flow-level `docs.yml` manifest (analogous to `notebook.yml`) would let flows declare their pages explicitly while still auto-discovering mods/notebooks.
- Index vs content separation. Sphinx's `index.rst` is a navigational page (toctree links), not a content page. `docs_collect` conflates these by using `overview.md` as the index. Keeping them separate allows both a navigational index and a content overview.
- Glob with explicit ordering. Sphinx `toctree` supports `:glob:` for auto-discovery but allows explicit entries first for ordering control. `docs_collect` could auto-discover mod docs / notebooks but allow a `docs_order.yml` or similar to control page ordering and include hand-authored pages.
- Source tree is read-only during build. Sphinx writes only to `_build/`, never back to the source tree. `docs_collect` currently regenerates stub `index.md` files in the source `code/pipelines/{flow}/docs/` dir when they're missing. Generated output should go only to `docs/pipelines/`; the source tree under `code/pipelines/` should be hand-authored only.
- Include directives for composability. Sphinx `.. include::` lets you compose pages from fragments. If `docs_collect` supported an `{include: overview.md}` directive in index templates, you could compose pages without duplicating content.
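The include idea from the last lesson can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `{include: name}` syntax is the one floated above, while the function name `expand_includes`, the rule that nested includes resolve against the flow's docs directory, and the circular-include check are hypothetical choices, not existing pipeio behavior.

```python
import re
from pathlib import Path

# Hypothetical directive syntax taken from the note: {include: overview.md}
INCLUDE_RE = re.compile(r"\{include:\s*([^}\s]+)\s*\}")


def expand_includes(text, base_dir, _active=frozenset()):
    """Replace each {include: name} with the content of base_dir/name.

    Assumption: every include, nested or not, is resolved relative to
    base_dir (the flow's docs directory). Nothing is written to disk.
    """
    base_dir = Path(base_dir)

    def _replace(match):
        target = (base_dir / match.group(1)).resolve()
        if target in _active:
            # A fragment that (indirectly) includes itself would recurse forever.
            raise ValueError(f"circular include: {target}")
        # A missing fragment raises FileNotFoundError instead of being dropped silently.
        fragment = target.read_text(encoding="utf-8").rstrip("\n")
        return expand_includes(fragment, base_dir, _active | {target})

    return INCLUDE_RE.sub(_replace, text)
```

Given an index template containing `{include: overview.md}`, the function returns the template with the overview text spliced in, so the overview stays a single hand-authored file while the generated index can reuse it.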
Proposed design
- Add optional `docs.yml` per flow (sibling to `notebook.yml`) listing pages, ordering, and includes
- When absent, fall back to current auto-discovery (mods, scripts, notebooks)
- When present, merge auto-discovered content with explicit entries
- Never write to `code/pipelines/{flow}/docs/` during collection; write only to `docs/pipelines/`
- Support top-level `code/pipelines/*.md` as ecosystem-level docs (architecture, cross-flow DAG)
- Convention: `overview.md` in the source becomes the flow index content; `index.md` is auto-generated navigation only
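A minimal sketch of the merge and write-location rules above, assuming the manifest has already been parsed into a list of page paths (the `docs.yml` schema itself is not specified here). The function names `order_pages` and `output_path` and the flow name `example_flow` are hypothetical; only the directory names come from the proposal.

```python
from pathlib import Path


def order_pages(explicit, discovered):
    """Explicit manifest entries first, in declared order, then the
    remaining auto-discovered pages in sorted order.

    An empty `explicit` list models a flow without docs.yml, so the
    result falls back to plain auto-discovery.
    """
    seen = set()
    ordered = []
    for page in list(explicit) + sorted(discovered):
        if page not in seen:
            seen.add(page)
            ordered.append(page)
    return ordered


def output_path(flow, page,
                source_root=Path("code/pipelines"),
                output_root=Path("docs/pipelines")):
    """Return the destination for a collected page under output_root.

    Raises ValueError if the destination would land inside the source
    tree, which the proposal treats as hand-authored and read-only.
    """
    dest = output_root / flow / page
    if dest.resolve().is_relative_to(source_root.resolve()):  # Python 3.9+
        raise ValueError(f"refusing to write into source tree: {dest}")
    return dest
```

With `explicit = ["overview.md", "mods/b.md"]` and `discovered = ["mods/b.md", "mods/a.md"]`, `order_pages` yields `overview.md`, `mods/b.md`, `mods/a.md`: hand-authored pages keep their declared position and nothing discovered is dropped. Routing every write through a single guard like `output_path` is one way to make the source-only convention enforceable rather than advisory.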
Source context: pixecog
PixEcog (pixecog): Neuropixels and ECoG dataset and analysis
Recent commits:
8dc0d9d Pipeline docs: gitignore docs/pipelines/, relocate hand-authored files
96cd1ec Refactor sharpwaveripple/contracts: extract generic helpers to utils/io, remove pipelines __init__.py
36f9326 Add result note directory and sample note
README:
type: readme
Quick Start for Collaborators
Follow this checklist to get started with Pixecog documentation and workflows.
🐀 Pixecog Project: Compact Overview
Core principles
- One immutable BIDS raw dataset (`raw/`) as the canonical baseline
- Each analysis pipeline ha
Related Notes
- idea-arash-20260407-171834-423514.md — Directly related — configurable docs paths for subsystem-owned docs is a prerequisite concern for any explicit manifest approach in docs_collect
- idea-arash-20260330-162752-883803.md — pipeio mod_context and notebook metadata tools are the read-side counterpart to docs_collect's write-side conventions — both hinge on how docs are discovered and structured
- deep-research-pipeio-scope.md — Broad pipeio scope research likely covers docs tooling design decisions relevant to a docs_collect redesign
- idea-arash-20260330-174518-164647.md — pipeio v2 roadmap sets the design direction that a Sphinx-inspired docs_collect overhaul would need to align with