Commit pipeio find_registry fix + add input stage resolution to PipelineContext

Goal

(promoted from note)

Context

(see source note)

Prompt

Fix the issue described below (source: /storage2/arash/projects/projio/docs/log/issue/issue-arash-20260407-050947-246557.md). Understand the problem, then implement the proposed fix.


Commit pipeio fixes + add input stage resolution to PipelineContext

Part 1: Commit existing fixes (already applied, just need commit)

Changes already made in this session, uncommitted:

packages/pipeio/src/pipeio/registry.py

  • Added find_registry(root) — public function that checks .projio/pipeio/registry.yml first, then .pipeio/registry.yml
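
The lookup order described above can be sketched as follows. This is a minimal illustration, not the code in `registry.py`: the name `find_registry_sketch` and the choice to return `None` when neither file exists are assumptions.

```python
from pathlib import Path
from typing import Optional

# Candidate registry locations relative to the project root, in priority order.
REGISTRY_CANDIDATES = (
    Path(".projio") / "pipeio" / "registry.yml",
    Path(".pipeio") / "registry.yml",
)


def find_registry_sketch(root) -> Optional[Path]:
    """Return the first registry file found under root, or None if neither exists."""
    root = Path(root)
    for rel in REGISTRY_CANDIDATES:
        candidate = root / rel
        if candidate.is_file():
            return candidate
    # Assumption: a missing registry is reported as None rather than raised.
    return None
```

Keeping the candidates in a single tuple is what lets the three former duplicates collapse into one import: the priority order lives in exactly one place.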

packages/pipeio/src/pipeio/resolver.py

  • PipelineContext.from_registry() now uses find_registry() instead of hardcoded .pipeio/registry.yml

packages/pipeio/src/pipeio/mcp.py, cli.py, docs.py

  • Three duplicate _find_registry() implementations replaced with imports from pipeio.registry.find_registry

Verify all changes are correct, run tests if available, then commit.

Part 2: Add input stage resolution to PipelineContext

The Stage.resolve(sess, prefer) method only works with output registry groups, but notebooks also need to resolve input data. For example, in stage_prefer = ("raw", "interpolate", "linenoise", "preprocess"), "raw" is the input_dir, not an output group.

What needs to happen:

  1. PipelineContext should expose input stages (from input_dir + pybids_inputs in config) alongside output registry groups
  2. Stage.resolve() should be able to check input stage paths on disk
  3. The raw input stage should resolve paths using input_dir and the pybids_inputs filters (suffix, extension, datatype)

Design constraint:

  • Keep it simple: a "raw" stage only needs to resolve {root}/{input_dir}/sub-{subject}/ses-{session}/{datatype}/ paths using the pybids_inputs member definitions
  • Full pybids integration is not needed; path construction from entities + filters is enough
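
A rough sketch of path construction from entities and filters, with no pybids dependency, might look like this. The function name `input_paths`, the dictionary shapes, the filename pattern, and the assumption that `extension` carries its leading dot are all hypothetical; they are not taken from pipeio.

```python
from pathlib import Path


def input_paths(root, input_dir, entities, filters):
    """List files under {root}/{input_dir}/sub-*/ses-*/{datatype}/ matching the filters.

    entities: dict with "subject" and "session", optionally "task" (assumed shape).
    filters:  dict with "datatype", "suffix", "extension" (extension includes the dot).
    """
    subject, session = entities["subject"], entities["session"]
    directory = (
        Path(root) / input_dir / f"sub-{subject}" / f"ses-{session}" / filters["datatype"]
    )
    if not directory.is_dir():
        return []

    # BIDS-style filename stem; further entities may sit between the stem and the suffix.
    stem = f"sub-{subject}_ses-{session}"
    if "task" in entities:
        stem += f"_task-{entities['task']}"
    pattern = f"{stem}*_{filters['suffix']}{filters['extension']}"
    return sorted(directory.glob(pattern))
```

A `have()`-style check then reduces to `bool(input_paths(...))`, which keeps the raw stage free of any dataset indexing.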

Test:

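# PipelineContext lives in pipeio/resolver.py (see Part 1).
# repo_abs() is assumed to be available in the notebook and to return the absolute repo root.
from pipeio.resolver import PipelineContext

ctx = PipelineContext.from_registry("preprocess_ieeg", root=repo_abs())
sess = ctx.session(subject="01", session="04", task="free")
# This should work:
stg = ctx.stage("raw")  # input stage built from input_dir
stg.have(sess)  # checks whether raw files exist on disk
stg.resolve(sess, ("raw", "interpolate", "preprocess"))  # returns the first stage in the preference order that exists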

After implementing, run the pipeio test suite to verify nothing breaks.
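
The "returns first existing" behaviour expected from `resolve` in the test above can be illustrated with a small standalone helper. This is only a sketch: `resolve_first`, the mapping of stage names to path-listing callables, and the decision to skip unknown stage names instead of raising are assumptions, not pipeio's actual behaviour.

```python
from pathlib import Path
from typing import Callable, List, Mapping, Optional, Sequence, Tuple


def resolve_first(
    stages: Mapping[str, Callable[[], List[Path]]],
    prefer: Sequence[str],
) -> Optional[Tuple[str, List[Path]]]:
    """Walk prefer in order and return (name, paths) for the first stage with files on disk."""
    for name in prefer:
        lister = stages.get(name)
        if lister is None:
            # Assumption: names with no matching stage are skipped.
            continue
        existing = [p for p in lister() if Path(p).exists()]
        if existing:
            return name, existing
    return None
```

Because input and output stages share the same lookup table here, a preference tuple such as `("raw", "interpolate", "preprocess")` can mix both kinds without special cases.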


Source context: pixecog

PixEcog (pixecog): Neuropixels and ECoG dataset and analysis

Recent commits:

9b2f6fa Scaffold ecephys TTL removal mod, flow overview + mod docs, demo notebook
80194af Add TTL characterization & removal demo notebook (preprocess_ieeg)
dc93496 Update mkdocs pipeline nav

README:


type: readme


Quick Start for Collaborators

Follow this checklist to get started with Pixecog documentation and workflows.

🐀 Pixecog Project — Compact Overview

Core principles

  • One immutable BIDS raw dataset (raw/) as the canonical baseline
  • Each analysis pipeline ha

Acceptance Criteria

  • [ ]

Result

Completed in 466.6s (1 steps, model: opus)

Session: e610359f-1e38-4004-895f-322bde7bf93e

Result note: /storage2/arash/worklog/workflow/captures/20260407-051933-d8b32e/note.md

Output

Done. Here's what was committed to pipeio (commit `3b0f772`):

**Part 1 — Consolidation & fixes (already applied, now committed):**
- `find_registry()` extracted to `registry.py` as a public function; three duplicates in `cli.py`, `docs.py`, `mcp.py` replaced with imports
- `resolver.py` uses `find_registry()` instead of hardcoded `.pipeio/registry.yml`
- Workspace-aware notebook path resolution in `mcp.py` (explore/demo `.src/` layouts)
- `python_bin` now accepts `str | list[str]` for conda-wrapped commands
- `mcp_nb_exec` always uses `sys.executable` for jupytext/papermill; kernel handles cell execution

**Part 2 — InputStage for input resolution (new):**
- Added `InputStage` dataclass in `resolver.py` with BIDS-style path construction from `pybids_inputs` config
- `PipelineContext.stage()` now resolves input stages (from `input_dir` + `pybids_inputs`) after output registry groups
- `PipelineContext.input_stages()` returns available input stage names
- `InputStage` supports `paths()`, `have()`, `resolve()` — same interface as `Stage`
- `stage_aliases` work for input stages too
- 14 new tests covering all InputStage functionality
- All 269 tests pass
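
To make the lookup order in Part 2 concrete, here is a hedged sketch of how a stage name could be resolved: aliases first, then output registry groups, then input stages. The function `lookup_stage`, its arguments, and the point at which aliases are applied are illustrative assumptions; the committed `PipelineContext.stage()` may differ.

```python
from typing import Any, Mapping, Optional, Tuple


def lookup_stage(
    name: str,
    output_groups: Mapping[str, Any],
    input_stages: Mapping[str, Any],
    aliases: Optional[Mapping[str, str]] = None,
) -> Tuple[str, Any]:
    """Resolve a stage name to ("output", stage) or ("input", stage)."""
    # Assumption: aliases are expanded once, before any lookup.
    canonical = (aliases or {}).get(name, name)
    # Output registry groups take precedence over input stages.
    if canonical in output_groups:
        return "output", output_groups[canonical]
    if canonical in input_stages:
        return "input", input_stages[canonical]
    raise KeyError(f"unknown stage: {name!r}")
```

Checking output groups first means an existing flow whose output group happens to share a name with an input stage keeps its previous behaviour.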

## Next Steps
1. Update the issue status to closed: `note_update` on the issue file to set `status: done`.
2. Add InputStage usage examples to pipeio's CLAUDE.md or a notebook — e.g. `ctx.stage("raw").resolve(sess, ("raw", "interpolate", "preprocess"))`.
3. Test InputStage against a real pixecog flow config to verify BIDS path resolution matches actual raw data layout.