pipeio Specifications
Design specifications for pipeio — an agent-facing authoring and discovery layer for computational pipelines in research repositories.
pipeio makes pipeline knowledge (registry, configs, rules, contracts, notebooks) queryable and actionable for AI agents. It delegates execution to Snakemake, provenance to DataLad, path resolution to snakebids, and app lifecycle to snakebids deployment modes.
Spec Documents
| Spec | Domain | Status |
|---|---|---|
| Ontology | Concepts, entity relationships, directory conventions, naming | Current |
| Overview & Architecture | Package scope, design principles, ecosystem fit | Implemented |
| Registry | Pipe/flow/mod hierarchy, YAML schema, scan & validation | Implemented |
| Flow Config | Per-flow config.yml schema, output registry (data contracts) | Implemented |
| Path Resolution | PathResolver protocol, PipelineContext, Session, Stage | Implemented (SimpleResolver + BidsResolver) |
| Notebook Lifecycle | Pair, sync, execute, publish — replacing Makefile shell scripts | Implemented |
| Scaffolding | Flow and mod creation from templates | Implemented (flow new + mod_create) |
| Contracts | Declarative input/output validation framework | Implemented (models + validation) |
| CLI | Command-line interface design | Implemented (full surface + pf helper) |
| MCP Tools | Agent-facing tools via projio MCP server (38 tools) | Implemented |
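To make the Registry row more concrete, here is a minimal sketch of a flow registry that can be validated and queried. It is not pipeio's actual schema or API: the `Flow` class, the `load_registry`, `validate_registry`, and `find_flows` helpers, the `flows`/`pipe`/`mods` key layout, and the example flow names are all hypothetical. Mods are treated simply as named entries under a flow; the Registry spec defines what they really are. The input is an already-parsed mapping, standing in for the YAML file.

```python
from dataclasses import dataclass, field


@dataclass
class Flow:
    """Hypothetical registry entry: one flow, tagged with a pipe category."""
    name: str
    pipe: str                                  # category tag, not a container
    mods: list = field(default_factory=list)   # named entries under the flow


def load_registry(raw):
    """Build Flow objects from a parsed mapping (assumed layout, not pipeio's)."""
    flows = {}
    for name, entry in raw.get("flows", {}).items():
        flows[name] = Flow(
            name=name,
            pipe=entry.get("pipe", ""),
            mods=list(entry.get("mods", [])),
        )
    return flows


def validate_registry(flows):
    """Return human-readable problems instead of raising on the first one."""
    errors = []
    for flow in flows.values():
        if not flow.pipe:
            errors.append(f"{flow.name}: missing 'pipe' tag")
        if len(set(flow.mods)) != len(flow.mods):
            errors.append(f"{flow.name}: duplicate mod names")
    return errors


def find_flows(flows, pipe=None, mod=None):
    """Return sorted names of flows matching the optional pipe and mod filters."""
    hits = []
    for flow in flows.values():
        if pipe is not None and flow.pipe != pipe:
            continue
        if mod is not None and mod not in flow.mods:
            continue
        hits.append(flow.name)
    return sorted(hits)


# Made-up example data; in practice this would come from a YAML file.
RAW = {
    "flows": {
        "preprocess_signals": {"pipe": "preprocess", "mods": ["filter", "rereference"]},
        "spectral": {"pipe": "analysis", "mods": ["psd"]},
        "broken": {"mods": ["a", "a"]},
    }
}

registry = load_registry(RAW)
problems = validate_registry(registry)
matches = find_flows(registry, pipe="preprocess")
```

Collecting every problem in a list, rather than stopping at the first failure, is one way an agent could get a full validation report in a single call.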
Reference Implementation
These specs are derived from an audit of the pixecog project's pipeline infrastructure (code/utils/io/, code/pipelines/, workflow/). The audit document lives at pixecog/prompts/plan/pipeio-audit-and-design.md.
Design Principles
- Agent-facing authoring layer: pipeio makes pipeline knowledge queryable and provides safe authoring operations; it does not own execution, provenance, or path resolution
- One flow = one derivative: each flow is a self-contained snakebids app producing one derivative directory; `pipe` is a category tag
- Delegation over duplication: execution → Snakemake, provenance → DataLad, paths → snakebids `bids()`, app lifecycle → snakebids
- Declarative over imperative: registries and configs are YAML; validation is schema-driven
- Graceful degradation: pipeio works without optional extras (`[bids]`, `[notebook]`)
- Search before creation: registry queries help discover existing flows before scaffolding new ones
- Notebook as first-class artifact: the lifecycle (pair/sync/exec/publish) is managed, not ad-hoc
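The graceful-degradation principle can be illustrated with a small sketch that probes for optional dependencies without importing them. This is one common pattern, not necessarily how pipeio does it. The `available_extras` and `pick_resolver` helpers are hypothetical, and the mapping from extras to modules (`bids` to `snakebids`, `notebook` to `jupytext`) is an assumption for illustration.

```python
import importlib.util


def available_extras(extras):
    """Map each extra name to True if all of its modules can be located.

    find_spec() only looks the module up; it does not import it, so the
    probe stays cheap and has no side effects from the optional package.
    """
    return {
        extra: all(importlib.util.find_spec(mod) is not None for mod in modules)
        for extra, modules in extras.items()
    }


def pick_resolver(have_bids):
    """Hypothetical fallback: use a simple resolver when the bids extra is absent."""
    return "bids" if have_bids else "simple"


# Assumed extra -> module mapping; the real extras may pull in other packages.
EXTRAS = {"bids": ["snakebids"], "notebook": ["jupytext"]}

status = available_extras(EXTRAS)
resolver_kind = pick_resolver(status["bids"])
```

Features that need a missing extra can then be disabled, or reported with an install hint, while the core registry and config operations keep working.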