
Delegation Model — Engineering vs. Science

projio's ecosystem splits ownership of a research project into two surfaces:

  • Engineering — the machinery that produces data: flows, DAGs, configs, rules, notebooks. Owned by pipeio.
  • Science — the findings those machines enable, the questions they answer, and the narratives that tell the story. Owned by questio, notio result notes, and deliverables.

Each subsystem has its own spec; this document is the shared boundary statement that explains how they interact. Read this first when authoring new flows, results, or reports — it keeps you from duplicating content across layers.

[Figure: engineering vs. science]

Who owns what

| Object | Owns | Does not own |
| --- | --- | --- |
| pipeio flow (code/pipelines/<flow>/) | Snakefile, config, rules, scripts, notebooks, docs/index.md (engineering overview), CHANGELOG.md (design history), DAG, snakemake report, run state | Scientific findings, hypothesis evaluation, narrative prose |
| questio (plan/questions/) | Research questions, hypothesis tracking, prior art aggregation, question-to-result binding | Data pipelines, individual findings, narrative documents |
| notio result note (docs/log/result/) | One scientific finding: plot + sentence of interpretation + metric + confidence | Pipeline engineering, narrative aggregation, hypothesis tracking |
| deliverable (docs/deliverables/) | Narrative artifact for an external audience: reports, slide decks, posters | Atomic findings, pipeline details, hypothesis state |

The rule: no subsystem embeds content it doesn't own. A flow page does not embed a result plot; a result note does not duplicate engineering rationale; a deliverable does not re-derive the DAG.

Linking direction

Cross-references use a consistent rule: downstream objects reference upstream objects via frontmatter. Backlinks (upstream → downstream) are computed at render time by scanning descendants — never stored.

flow (upstream substrate)
  ↑
  │ source_flow: preprocess_ieeg
  │
result note                       question (upstream science substrate)
  ↑                                 ↑
  │ results: [result-...]           │ questions: [q-ieeg-artifacts]
  │                                 │
  └───────── deliverable ───────────┘

At authoring time each downstream object sets the references it knows about:

| Note type | Upstream references (set by author) |
| --- | --- |
| result | question (required), source_flow (when produced by a pipeio flow), series |
| deliverable | questions, results, source_flows |

A new flow sets no references — flows are the substrate. Questions similarly don't track downstream references; questio_status computes them at render time by scanning results.
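The authoring rules above can be expressed as a small check. This is a minimal sketch, not a projio API: the field names come from the table, while `check_references`, `ALLOWED_REFS`, and `REQUIRED_REFS` are hypothetical names introduced here for illustration, and frontmatter is assumed to be already parsed into a dict.

```python
from typing import Any

# Upstream reference fields each note type may set (taken from the table above).
# The helper and constant names are hypothetical, not part of projio.
ALLOWED_REFS: dict[str, set[str]] = {
    "result": {"question", "source_flow", "series"},
    "deliverable": {"questions", "results", "source_flows"},
    "flow": set(),      # flows are the substrate: no upstream references
    "question": set(),  # downstream links are computed, never stored
}
REQUIRED_REFS: dict[str, set[str]] = {"result": {"question"}}

# Every reference field known to the model, used to spot misplaced ones.
ALL_REF_FIELDS: set[str] = set().union(*ALLOWED_REFS.values())


def check_references(note_type: str, frontmatter: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means the note is fine."""
    problems = []
    allowed = ALLOWED_REFS[note_type]
    # Required fields must be present and non-empty.
    for field in sorted(REQUIRED_REFS.get(note_type, set())):
        if not frontmatter.get(field):
            problems.append(f"{note_type}: missing required field '{field}'")
    # Reference fields that belong to another note type are flagged.
    for field in sorted(ALL_REF_FIELDS.intersection(frontmatter)):
        if field not in allowed:
            problems.append(f"{note_type}: field '{field}' is not allowed here")
    return problems
```

For example, `check_references("flow", {"results": ["r1"]})` reports that a flow is trying to store a downstream reference, which is exactly what the linking direction forbids.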

Why this direction

  • Curation happens downstream. The result author knows which question and which flow their finding came from; encoding that at authoring time is cheap and accurate.
  • Upstream objects don't churn. Adding a result never requires editing a flow or a question. Deleting a result removes its backlinks on the next docs_collect run.
  • Backlinks stay current. Since they're derived at render time by scanning descendants, there's no stale reference problem.
  • It matches what already works. questio already binds results to questions this way via the question and milestone frontmatter fields. Extending to flows (source_flow) and deliverables (source_flows) is one more field, not a new model.
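The render-time derivation described in these points amounts to inverting the downstream references. The sketch below illustrates the idea only; it is not how questio_status or docs_collect are actually implemented. The note IDs `result-a` and `result-b` and the function `compute_backlinks` are made up for illustration, and frontmatter is assumed to be already parsed into dicts.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for parsed result-note frontmatter.
# Note IDs are invented; field names follow this document.
results = {
    "result-a": {"question": "q-ieeg-artifacts", "source_flow": "preprocess_ieeg"},
    "result-b": {"question": "q-ieeg-artifacts"},
}


def compute_backlinks(notes: dict[str, dict], field: str) -> dict[str, list[str]]:
    """Invert a downstream -> upstream field into upstream -> [downstream IDs]."""
    backlinks: dict[str, list[str]] = defaultdict(list)
    for note_id, frontmatter in sorted(notes.items()):
        value = frontmatter.get(field)
        if value is None:
            continue
        # Accept both scalar fields (question) and list fields (questions).
        targets = value if isinstance(value, list) else [value]
        for target in targets:
            backlinks[target].append(note_id)
    return dict(backlinks)


by_question = compute_backlinks(results, "question")
by_flow = compute_backlinks(results, "source_flow")

# Deleting a result and recomputing drops its backlinks; nothing upstream is edited.
del results["result-a"]
by_flow_after_delete = compute_backlinks(results, "source_flow")
```

Because nothing is stored on the upstream side, the mapping is rebuilt from scratch on every run, so a stale backlink cannot survive a deleted note.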

Flow pages are an engineering surface

When you open docs/pipelines/<flow>/ on the site, you see:

  • Purpose, Input/Output, Mod Chain, DAG, Report, CHANGELOG — the engineering view
  • Results and Deliverables backlink sections, to be generated by a planned ResultsLinkCollector that scans notio result notes (docs/log/result/) and docs/deliverables/ for references to this flow

You do not see embedded plots, full result interpretations, or narrative prose. Those live in the results and deliverables directories and the flow page links to them.

The ResultsLinkCollector is not yet implemented; until it lands, flow authors can hand-maintain a ## Related work section in the flow's docs/index.md. The convention above is the contract the collector will honor once it is built.
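Since the collector does not exist yet, the following is only a sketch of what the contract implies, under several assumptions: notes are `*.md` files with `---`-delimited frontmatter, they sit directly in `log/result/` and `deliverables/` under a docs root, and the note ID is the file stem. The function names are hypothetical, and the frontmatter reader handles just enough syntax for the example; a real implementation would use a proper YAML parser.

```python
from pathlib import Path


def read_frontmatter(path: Path) -> dict[str, object]:
    """Parse a deliberately small subset of YAML frontmatter.

    Handles only `key: value` and `key: [a, b]` lines between the first two
    `---` delimiters. This is a simplification for the sketch.
    """
    lines = path.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, object] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, raw = line.partition(":")
        if not sep:
            continue
        raw = raw.strip()
        if raw.startswith("[") and raw.endswith("]"):
            meta[key.strip()] = [v.strip() for v in raw[1:-1].split(",") if v.strip()]
        else:
            meta[key.strip()] = raw
    return meta


def collect_flow_backlinks(docs_root: Path, flow: str) -> dict[str, list[str]]:
    """Return result notes and deliverables whose frontmatter references `flow`."""
    found: dict[str, list[str]] = {"results": [], "deliverables": []}
    for section, subdir, field in (
        ("results", "log/result", "source_flow"),          # scalar field
        ("deliverables", "deliverables", "source_flows"),  # list field
    ):
        for path in sorted((docs_root / subdir).glob("*.md")):
            value = read_frontmatter(path).get(field, [])
            refs = value if isinstance(value, list) else [value]
            if flow in refs:
                found[section].append(path.stem)
    return found
```

Calling `collect_flow_backlinks(Path("docs"), "preprocess_ieeg")` would return the IDs to list under the flow page's Results and Deliverables sections, without the flow page storing any of them.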

Science pages are a curation surface

Question pages (questio), result notes, and deliverables each show different slices of the science, driven by their own spec files:

  • Question page — status, milestones, evidence (results), prior art, deliverables that answer it. Rendered from scanning results and deliverables frontmatter.
  • Result note — one finding, its question, its source flow, metric, confidence. Atomic and self-contained.
  • Deliverable — an audience-facing document. Cites results and questions; may cite source_flows for engineering provenance.

None of these embed pipeline engineering. If a reader wants to know how a result was produced, they click through to the source flow via the source_flow frontmatter link.

Where to go next

If you're about to add content to a flow, question, result, or deliverable and you aren't sure whether it belongs there, re-read the "Who owns what" table above. The rule is: put the content where it's owned, then reference it from anywhere else that needs to mention it.