Flatten pipe/flow to flow-only: registry schema and core model

Goal

Change pipeio's addressing model from (pipe, flow) to flow-only. Remove the pipe concept entirely — flow names are globally unique and the derivative directory is always derivatives/{flow}/.

Context

Currently all pipeio tools require (pipe, flow) to address a flow. In practice the two-level hierarchy is redundant. Projects should use descriptive flow names (e.g. preprocess_ieeg instead of preprocess/ieeg). The derivative directory is always derivatives/{flow_name}/ — no separate derivative field needed.

Design decisions (validated):

  • Registry key changes from "pipe/flow" to just "flow" (flow names must be globally unique)
  • FlowEntry.pipe is removed entirely (not renamed to derivative)
  • Derivative directory = derivatives/{flow_name}/ (implicit, no field needed)
  • FlowEntry.code_path, config_path, doc_path remain as-is
  • No physical file moves; only the addressing model changes
  • list_pipes() is removed (no concept of pipes anymore)
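As a rough illustration of these decisions, the flattened model might look like the sketch below. It assumes a plain dataclass (the real FlowEntry may use a different base class), guesses the field types and defaults, and introduces a hypothetical helper, derivative_dir, that is not part of pipeio. The code_path value is a placeholder.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional


@dataclass
class FlowEntry:
    """Sketch of a flow-only entry. Field types and defaults are assumptions."""

    name: str
    code_path: str
    config_path: Optional[str] = None
    doc_path: Optional[str] = None
    mods: dict = field(default_factory=dict)
    app_type: str = ""
    # Note: there is no `pipe` field and no `derivative` field.


def derivative_dir(flow_name: str, root: Path = Path(".")) -> Path:
    """Hypothetical helper: the derivative directory is implied by the flow name."""
    return root / "derivatives" / flow_name


# Placeholder values for illustration only.
entry = FlowEntry(name="preprocess_ieeg", code_path="preprocess/ieeg")
print(derivative_dir(entry.name))
```

Because the directory is computed from the name, nothing about it has to be stored or kept in sync in the registry file.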

Prompt

Refactor the pipeio registry schema to use flow-only addressing. Remove the pipe concept entirely.

Files to modify

packages/pipeio/src/pipeio/registry.py:

  1. FlowEntry: remove the pipe field entirely. The model becomes: name, code_path, config_path, doc_path, mods, app_type.
  2. PipelineRegistry.flows dict: change key from "pipe/flow" to just the flow name
  3. Remove list_pipes() entirely
  4. list_flows(pipe=None) → list_flows() (no filter param, or add an optional prefix filter if useful)
  5. get(pipe, flow) → get(flow); remove auto-selection logic
  6. remove(pipe, flow) → remove(flow)
  7. scan(): update _discover_flows to key by flow name. For nested dirs like preprocess/ieeg/, the flow name should be the deepest directory name (e.g. ieeg). For flat dirs like brainstate/, flow name = dir name.
  8. to_yaml() / from_yaml(): update serialization. Backward compat: if old YAML has "pipe/flow" keys, extract the flow part as key and discard pipe. If entries have a pipe field, ignore it.
  9. validate(): check flow name uniqueness
  10. _discover_flows(pipe_dir, pipe, docs_dir): the pipe parameter is no longer set on FlowEntry. The function discovers flows within a directory and returns them keyed by flow name.
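The backward-compatibility rule (item 8) and the uniqueness check (item 9) interact: once the pipe prefix is discarded, two legacy entries that share a flow name under different pipes would map to the same key. The sketch below shows one way that normalization could work on a plain dict as it might come out of a YAML loader. normalize_legacy_flows is a hypothetical name, the on-disk layout of the registry is an assumption, and raising on a collision is only one possible policy.

```python
def normalize_legacy_flows(raw_flows: dict) -> dict:
    """Convert legacy "pipe/flow" keys to flow-only keys.

    Hypothetical helper, not pipeio's API. `raw_flows` is assumed to map
    registry keys to plain dicts of entry fields.
    """
    flows = {}
    for key, entry in raw_flows.items():
        # Keep only the part after the last "/"; flow-only keys pass through unchanged.
        flow_name = key.rsplit("/", 1)[-1]
        if flow_name in flows:
            # Two legacy pipes contained a flow with the same name.
            raise ValueError(f"duplicate flow name after flattening: {flow_name!r}")
        # Drop a legacy "pipe" field if the entry still carries one.
        cleaned = {k: v for k, v in entry.items() if k != "pipe"}
        cleaned.setdefault("name", flow_name)
        flows[flow_name] = cleaned
    return flows


# Illustrative legacy data; the paths are placeholders.
legacy = {
    "preprocess/ieeg": {"name": "ieeg", "pipe": "preprocess", "code_path": "preprocess/ieeg"},
    "brainstate": {"name": "brainstate", "code_path": "brainstate"},
}
print(sorted(normalize_legacy_flows(legacy)))
```

Failing loudly on a collision keeps an old registry from silently losing an entry; a real implementation might instead collect the conflicts and report them through validate().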

Important constraints

  • Run PYTHONPATH=src python -m pytest tests/ -q to verify all tests pass (update test assertions as needed)
  • Do NOT modify files outside packages/pipeio/ — later tasks handle other layers
  • Save with datalad_save when done

Acceptance Criteria

  • [ ] FlowEntry has no pipe field
  • [ ] Registry keys are flow names only
  • [ ] get(flow) works, old get(pipe, flow) signature removed
  • [ ] list_pipes() removed
  • [ ] scan() produces flow-keyed registry
  • [ ] Backward compat: old YAML with pipe/flow keys and pipe field loads correctly
  • [ ] All pipeio tests pass
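For the scan() criterion, a flow-keyed discovery pass could be sketched as follows. This is not pipeio's _discover_flows: discover_flow_names is a hypothetical stand-in, and it assumes that any leaf directory (one with no subdirectories) is a flow, which the task description does not specify.

```python
import tempfile
from pathlib import Path


def discover_flow_names(root: Path) -> dict:
    """Map flow name -> directory relative to `root`.

    Hypothetical stand-in for scan(). Assumption: a leaf directory is a flow,
    and its own name (the deepest path component) is the flow name.
    """
    flows = {}
    for path in sorted(root.rglob("*")):
        if not path.is_dir():
            continue
        if any(child.is_dir() for child in path.iterdir()):
            continue  # has subdirectories, so treat it as a container, not a flow
        if path.name in flows:
            raise ValueError(f"flow name {path.name!r} is not globally unique")
        flows[path.name] = path.relative_to(root)
    return flows


# Build a throwaway layout with one nested and one flat flow directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "preprocess" / "ieeg").mkdir(parents=True)
    (root / "brainstate").mkdir()
    found = discover_flow_names(root)
    print({name: path.as_posix() for name, path in found.items()})
```

Keying by the deepest directory name means that two nested directories with the same leaf name collide, so the sketch raises rather than overwriting an earlier entry.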

Result

(Filled in after execution)

Batch Result

  • status: failed
  • batch queue_id: 48c064cff89c
  • batch duration: 1800.1s