Delegation Model: Engineering vs. Science
projio's ecosystem splits ownership of a research project into two surfaces:
- Engineering — the machinery that produces data: flows, DAGs, configs, rules, notebooks. Owned by pipeio.
- Science — the findings those machines enable, the questions they answer, and the narratives that tell the story. Owned by questio, notio result notes, and deliverables.
Each subsystem has its own spec; this document is the shared boundary statement that explains how they interact. Read this first when authoring new flows, results, or reports — it keeps you from duplicating content across layers.
Who owns what
| Object | Owns | Does not own |
|---|---|---|
| pipeio flow (code/pipelines/<flow>/) | Snakefile, config, rules, scripts, notebooks, docs/index.md (engineering overview), CHANGELOG.md (design history), DAG, snakemake report, run state | Scientific findings, hypothesis evaluation, narrative prose |
| questio (plan/questions/) | Research questions, hypothesis tracking, prior art aggregation, question-to-result binding | Data pipelines, individual findings, narrative documents |
| notio result note (docs/log/result/) | One scientific finding: plot + sentence of interpretation + metric + confidence | Pipeline engineering, narrative aggregation, hypothesis tracking |
| deliverable (docs/deliverables/) | Narrative artifact for an external audience: reports, slide decks, posters | Atomic findings, pipeline details, hypothesis state |
The rule: no subsystem embeds content it doesn't own. A flow page does not embed a result plot; a result note does not duplicate engineering rationale; a deliverable does not re-derive the DAG.
Linking direction
Cross-references use a consistent rule: downstream objects reference upstream objects via frontmatter. Backlinks (upstream → downstream) are computed at render time by scanning descendants — never stored.
flow (upstream substrate)
  ↑
  │ source_flow: preprocess_ieeg
  │
result note                      question (upstream science substrate)
  ↑                                ↑
  │ results: [result-...]          │ questions: [q-ieeg-artifacts]
  │                                │
  └────────── deliverable ─────────┘
At authoring time each downstream object sets the references it knows about:
| Note type | Upstream references (set by author) |
|---|---|
| result | question (required), source_flow (when produced by a pipeio flow), series |
| deliverable | questions, results, source_flows |
A new flow sets no references — flows are the substrate. Questions
similarly don't track downstream references; questio_status computes them
at render time by scanning results.
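The authoring-time rules in the table above can be sketched as a small check. This is an illustration only: the field names come from the table, but `check_upstream_refs`, its lookup tables, and its messages are hypothetical and not part of questio, notio, or pipeio.

```python
from __future__ import annotations

# Hypothetical lookup tables built from the "Upstream references" table.
# Flows and questions are substrate, so they carry no reference fields.
ALLOWED = {
    "result": {"question", "source_flow", "series"},
    "deliverable": {"questions", "results", "source_flows"},
    "flow": set(),
    "question": set(),
}
REQUIRED = {"result": ["question"]}
REFERENCE_FIELDS = set().union(*ALLOWED.values())


def check_upstream_refs(note_type: str, frontmatter: dict) -> list[str]:
    """Return a list of problems with a note's upstream references."""
    problems = []
    # Required references must be present and non-empty.
    for field in REQUIRED.get(note_type, []):
        if not frontmatter.get(field):
            problems.append(f"{note_type}: missing required '{field}'")
    # Reference fields that belong to another note type are flagged,
    # which also catches backlinks stored on a flow or question.
    for field in frontmatter:
        if field in REFERENCE_FIELDS and field not in ALLOWED[note_type]:
            problems.append(f"{note_type}: '{field}' is not set on this type")
    return problems
```

A result note that names its question and flow passes with an empty list; a flow that tries to store a `results` list is flagged, since backlinks are derived rather than stored.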
Why this direction
- Curation happens downstream. The result author knows which question and which flow their finding came from; encoding that at authoring time is cheap and accurate.
- Upstream objects don't churn. Adding a result never requires editing a flow or a question. Deleting a result removes its backlinks on the next docs_collect run.
- Backlinks stay current. Since they're derived at render time by scanning descendants, there's no stale reference problem.
- It matches what already works. questio already binds results to questions this way via the question and milestone frontmatter fields. Extending to flows (source_flow) and deliverables (source_flows) is one more field, not a new model.
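To make the "derived, never stored" point concrete, here is a minimal sketch of computing backlinks by inverting downstream frontmatter. It assumes frontmatter has already been parsed into dicts and that ids are unique across flows, questions, and results; `compute_backlinks` and the ids `result-demo` and `report-demo` are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict


def compute_backlinks(notes: dict[str, dict]) -> dict[str, list[str]]:
    """Invert downstream references into {upstream id: [downstream ids]}."""
    backlinks = defaultdict(list)
    for note_id, fm in sorted(notes.items()):
        # Single-valued references set by result notes.
        for field in ("question", "source_flow"):
            if fm.get(field):
                backlinks[fm[field]].append(note_id)
        # List-valued references set by deliverables.
        for field in ("questions", "results", "source_flows"):
            for target in fm.get(field, []):
                backlinks[target].append(note_id)
    return dict(backlinks)


# Hypothetical parsed frontmatter, keyed by note id.
notes = {
    "result-demo": {"question": "q-ieeg-artifacts", "source_flow": "preprocess_ieeg"},
    "report-demo": {
        "questions": ["q-ieeg-artifacts"],
        "results": ["result-demo"],
        "source_flows": ["preprocess_ieeg"],
    },
}
backlinks = compute_backlinks(notes)
```

Because nothing is written back to the flow or the question, dropping `result-demo` from `notes` and recomputing is all it takes for its backlinks to disappear.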
Flow pages are an engineering surface
When you open docs/pipelines/<flow>/ on the site, you see:
- Purpose, Input/Output, Mod Chain, DAG, Report, CHANGELOG — the engineering view
- Results and Deliverables backlink sections (generated by a ResultsLinkCollector that scans notio and docs/deliverables/ for references to this flow)
You do not see embedded plots, full result interpretations, or narrative prose. Those live in the results and deliverables directories and the flow page links to them.
The new collector is not yet implemented; until it lands, flow authors can
hand-maintain a ## Related work section in docs/index.md. The
convention above is the contract the collector will honor when built.
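Since the collector does not exist yet, the following is only a guess at the shape of its output, not its design. It assumes result and deliverable frontmatter is available as dicts keyed by note id; `render_flow_backlinks` and the section layout are hypothetical. Note that it emits links by id only and embeds no plots or interpretation.

```python
from __future__ import annotations


def render_flow_backlinks(
    flow: str,
    results: dict[str, dict],
    deliverables: dict[str, dict],
) -> str:
    """Render Results and Deliverables backlink sections for one flow."""
    # Results point at a single flow via source_flow.
    result_ids = sorted(
        rid for rid, fm in results.items() if fm.get("source_flow") == flow
    )
    # Deliverables may cite several flows via source_flows.
    deliverable_ids = sorted(
        did for did, fm in deliverables.items() if flow in fm.get("source_flows", [])
    )
    lines = []
    for title, ids in (("Results", result_ids), ("Deliverables", deliverable_ids)):
        lines.append(f"## {title}")
        if ids:
            lines.extend(f"- {note_id}" for note_id in ids)
        else:
            lines.append("- (none yet)")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

Until something like this lands, the same two lists are what a hand-maintained Related work section would contain.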
Science pages are a curation surface
Question pages (questio), result notes, and deliverables each show different slices of the science, driven by their own spec files:
- Question page: status, milestones, evidence (results), prior art, and the deliverables that answer it. Rendered by scanning the frontmatter of results and deliverables.
- Result note: one finding, its question, its source flow, metric, confidence. Atomic and self-contained.
- Deliverable: an audience-facing document. Cites results and questions; may cite source_flows for engineering provenance.
None of these embed pipeline engineering. If a reader wants to know how a
result was produced, they click through to the source flow via the
source_flow frontmatter link.
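The click-through described above can be sketched as a lookup that follows references upstream instead of copying engineering detail into the science pages. This assumes parsed frontmatter dicts; `source_flows_for` and the ids used in the usage note are hypothetical.

```python
from __future__ import annotations


def source_flows_for(deliverable: dict, results: dict[str, dict]) -> list[str]:
    """Collect the flows a deliverable reaches, directly or via its results."""
    # Flows the deliverable cites itself for engineering provenance.
    flows = set(deliverable.get("source_flows", []))
    # Flows reached one hop upstream through each cited result note.
    for result_id in deliverable.get("results", []):
        flow = results.get(result_id, {}).get("source_flow")
        if flow:
            flows.add(flow)
    return sorted(flows)
```

For a deliverable citing a hypothetical `result-demo` whose `source_flow` is `preprocess_ieeg`, this returns `["preprocess_ieeg"]`; results that are missing or have no source flow are skipped.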
Where to go next
- pipeio pipeline-docs spec: flow documentation structure and docs/index.md template
- questio explanation: research question workflow and data model
- deliverables spec: narrative artifact convention and indexing
- ecosystem overview: which subsystem owns which domain
If you're about to add content to a flow, question, result, or deliverable and you aren't sure whether it belongs there, re-read the "Who owns what" table above. The rule is: put the content where it's owned, then reference it from anywhere else that needs to mention it.