Update pipeio specs and CLAUDE.md docs for v2 architectural decisions
Goal
Update pipeio's documentation to reflect the v2 architectural decisions:
- One flow = one snakebids app = one derivative directory
- pipeio is an agent-facing authoring + discovery layer (not an execution engine)
- Execution delegated to snakebids/snakemake, provenance to DataLad
- pipe is a category label, not a hierarchy level
Context
See roadmap: docs/log/idea/idea-arash-20260330-174518-164647.md
See research: docs/log/idea/deep-research-pipeio-scope.md
Key docs to update:
- packages/pipeio/CLAUDE.md — package-level guidance
- packages/pipeio/docs/specs/pipeio/overview.md — architecture overview
- packages/pipeio/docs/specs/pipeio/registry.md — registry spec
- packages/pipeio/docs/specs/pipeio/flow-config.md — config spec
- packages/pipeio/docs/specs/pipeio/mcp-tools.md — tool documentation
- CLAUDE.md (projio root) — ecosystem description
Prompt
Update pipeio documentation to reflect v2 architectural decisions. This is a docs-only task — do NOT change any Python code.
Key decisions to document:
North star: pipeio is an agent-facing authoring + discovery layer. It does not compete with execution engines (Snakemake), provenance systems (DataLad), app lifecycle (snakebids), or path resolution (snakebids bids()). It makes pipeline knowledge queryable and actionable for AI agents.
One flow = one derivative: Each flow is a self-contained snakebids app producing one derivative directory. The pipe/flow hierarchy is being flattened: `pipe` becomes a category tag.
Delegation model:
- Execution: snakebids run.py → Snakemake
- Provenance: DataLad run records
- Path resolution: snakebids bids() + generate_inputs()
- App lifecycle: snakebids deployment modes
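Under this delegation model, a pipeio tool only assembles a command line and hands it to Snakemake; it does no scheduling itself. The sketch below illustrates that idea and is not pipeio's actual code. The helper name `build_snakemake_argv`, the `Snakefile` location inside the flow directory, and the example flow path are assumptions made for illustration. The `--snakefile`, `--directory`, `--use-conda`, and `--rulegraph` flags and `conda run -n` are existing Snakemake and conda options.

```python
from __future__ import annotations

from pathlib import Path
from typing import Optional, Sequence


def build_snakemake_argv(
    flow_dir: Path,
    extra_args: Sequence[str] = (),
    conda_env: Optional[str] = None,
    use_conda: bool = False,
) -> list[str]:
    """Assemble a Snakemake command line; nothing is executed here."""
    argv = [
        "snakemake",
        # Assumed layout: the Snakefile sits at the top of the flow directory.
        "--snakefile", (flow_dir / "Snakefile").as_posix(),
        # --directory makes Snakemake resolve config relative to the flow.
        "--directory", flow_dir.as_posix(),
    ]
    if use_conda:
        argv.append("--use-conda")
    argv.extend(extra_args)
    if conda_env is not None:
        # `conda run -n <env>` runs the command inside the named environment.
        argv = ["conda", "run", "-n", conda_env, *argv]
    return argv


# Hypothetical flow path; "cogpy" is the env name mentioned in this task.
argv = build_snakemake_argv(Path("flows/preproc"), ["--rulegraph"], conda_env="cogpy")
print(" ".join(argv))
```

The caller decides whether to pass the list to `subprocess.run` or just show it to the user, so the execution step stays outside pipeio.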
pipeio's unique value: registry/discovery, AI-safe authoring (rule_insert, config_patch, mod_create), contract semantics (I/O validation, cross-flow wiring), notebook lifecycle, documentation.
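To make the registry/discovery role concrete, here is a minimal sketch of a flat registry in which each flow maps to exactly one derivative directory and the former pipe survives only as a category tag. The `FlowEntry` class, its field names, and the example flows and paths are all hypothetical; the real registry schema is defined in registry.md.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class FlowEntry:
    name: str        # flow identifier, used as the registry key
    category: str    # the former "pipe", now only a tag
    app_dir: Path    # the snakebids app that implements this flow
    deriv_dir: Path  # the single derivative directory the flow produces


# Hypothetical entries; a flat mapping, with no pipe level above the flows.
REGISTRY = {
    entry.name: entry
    for entry in [
        FlowEntry("denoise", "preproc", Path("code/flows/denoise"), Path("derivatives/denoise")),
        FlowEntry("spectra", "analysis", Path("code/flows/spectra"), Path("derivatives/spectra")),
    ]
}


def flows_in_category(category: str) -> list[str]:
    """Discovery by tag: filter the flat registry instead of walking a hierarchy."""
    return sorted(name for name, entry in REGISTRY.items() if entry.category == category)


print(flows_in_category("preproc"))
```

Because the category is just a field, a flow can be re-tagged without moving its app or its derivative directory.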
Tool categories going forward: keep (27 unique), thin-to-adapter (4: dag, completion, log_parse, config_init), stop/replace (4: run, run_status, run_dashboard, run_kill).
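The three categories can be written down as a small lookup, which may help when annotating the MCP tools spec. This is only a sketch: the tool names come from the list above, while the `disposition` helper and the default-to-keep rule are assumptions, because the 27 kept tools are not enumerated here.

```python
THIN_TO_ADAPTER = {"dag", "completion", "log_parse", "config_init"}
STOP_OR_REPLACE = {"run", "run_status", "run_dashboard", "run_kill"}


def disposition(tool: str) -> str:
    """Return the v2 category for a tool name (short name, no prefix)."""
    if tool in STOP_OR_REPLACE:
        return "stop/replace"
    if tool in THIN_TO_ADAPTER:
        return "thin-to-adapter"
    # Assumption: any tool not listed above belongs to the "keep" group.
    return "keep"


for name in ("dag", "run_kill", "rule_insert"):
    print(name, "->", disposition(name))
```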
New features to document:
- `pipeio_dag_export`: thin adapter over `snakemake --rulegraph/--dag/--d3dag`, supports dot/mermaid/svg/json output
- `pipeio_report`: thin adapter over `snakemake --report`, supports targeting a specific rule (e.g. `report`) for flows with partial outputs
- `pipeio_nb_update`: update notebook metadata (kind, description, status) in notebook.yml
- `pipeio_mod_context`: bundled read tool returning rules, scripts, doc, config params, bids signatures for a mod in one call
- `pf` shell helper (`bin/pf.sh`): flow navigator modeled on `wg` (project navigator). Source from bashrc. Commands: `pf` (list flows), `pf <flow>` (cd into flow dir), `pf <flow> smk [args]` (snakemake with auto-resolved --snakefile/--directory/conda env), `pf <flow> deriv` (cd into derivative dir), `pf <flow> config` (print config path). Tab completion from registry.
- CLI subcommands: `pipeio flow ids`, `pipeio flow path <flow>`, `pipeio flow config <flow>`, `pipeio flow deriv <flow>`, `pipeio flow smk <flow> [args]`
- `pipeio_run` now resolves snakemake via conda env wrapping (cogpy) and passes `--directory` for config resolution. `use_conda` flag passes `--use-conda` to snakemake CLI.
- NotebookEntry now has `kind`, `description`, `status` fields for lifecycle tracking

Read each doc file listed above. Update them to reflect these decisions and new features. Be concise; don't bloat the docs. Remove or mark as deprecated any sections describing pipeio as an execution engine. Add a clear "Delegation" or "Ecosystem" section explaining what pipeio owns vs delegates.
For CLAUDE.md files, update the architecture description, tool routing tables, and CLI surface section.
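When documenting the NotebookEntry lifecycle fields and pipeio_nb_update, a short illustration may help readers picture the behaviour. The following is a sketch under stated assumptions, not the real class: only the field names kind, description, and status come from this task. The `name` field, the string types, the empty defaults, the example values, and the `update_metadata` helper are hypothetical.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class NotebookEntry:
    name: str              # hypothetical identifier for the notebook
    kind: str = ""         # lifecycle fields named in this task
    description: str = ""
    status: str = ""


UPDATABLE = {"kind", "description", "status"}


def update_metadata(entry: NotebookEntry, **changes: str) -> NotebookEntry:
    """Return a copy with only the lifecycle metadata changed."""
    unknown = set(changes) - UPDATABLE
    if unknown:
        raise ValueError(f"not updatable: {sorted(unknown)}")
    return replace(entry, **changes)


# Example values are invented for illustration.
entry = NotebookEntry(name="qc_overview", kind="exploration", status="draft")
updated = update_metadata(entry, status="active", description="QC summary plots")
print(asdict(updated))
```

Restricting updates to the three metadata fields keeps a metadata edit from renaming or otherwise restructuring the entry.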
Acceptance Criteria
- [ ] pipeio CLAUDE.md updated with v2 architecture and new features
- [ ] Overview spec reflects delegation model
- [ ] Registry spec notes pipe→category evolution
- [ ] MCP tools spec notes which tools are being thinned/replaced and lists new tools
- [ ] CLI surface documented (pf shell helper, flow subcommands)
- [ ] projio root CLAUDE.md updated
Result
(Filled in after execution)
Batch Result
- status: done
- batch queue_id: d52d4b497700
- session: dd9c6c81-6b88-4a7d-87b4-d0280355ac87
- batch duration: 379.5s