Code Tiers: Libraries, Utils, and Flow Scripts¶
Overview¶
A projio project organizes code in three tiers. Each tier has a different scope, development model, and relationship to the pipeline system. pipeio's scaffolding and audit tools must be tier-aware.
Tier Definitions¶
Core Library¶
Location: code/lib/{name}/ (one or more per project)
Properties:
- Dataset-agnostic: no project paths, no snakemake references, no registry lookups
- Pure functions: numpy/scipy/xarray in → same out
- Independently testable with synthetic data
- Versioned, may be published to PyPI
- Actively developed alongside the project pipelines
Examples: cogpy.preprocess.filter_bandpass(), cogpy.spectral.compute_spectrogram()
codio role: core — agents can add code here via nb_promote or manual extraction.
A project may have multiple core libraries (e.g., cogpy for signal processing, vizlib for plotting).
Project Utils¶
Location: code/utils/ (one per project)
Properties:
- Project-specific: references project conventions, data layouts, derivative structures
- Cross-flow reusable: shared by multiple pipelines but too project-coupled for a core library
- Not independently publishable (coupled to the project)
- May read project-specific file formats, coordinate systems, or derivative outputs
Examples: utils.atlas.plot_dorsal_cortex_ecog_overlay(), utils.io.brainstate.read_state_epochs()
projio config: code.project_utils: code/utils
Flow Scripts¶
Location: code/pipelines/{flow}/scripts/{script}.py
Properties:
- Snakemake wiring: snakemake.input, snakemake.output, snakemake.params
- Flow-specific: one script per rule (may be shared across rules within a flow)
- Imports from core library and project utils
- Not reusable across flows — if the same logic appears in two flows, it should be promoted
pipeio tools: mod_create, script_create, nb_promote
Configuration¶
projio config¶
# .projio/config.yml
code:
  project_utils: code/utils  # project glue layer path
Core libraries are auto-discovered from code/lib/*/ and registered in codio with role: core.
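As an illustration of how the `code.project_utils` field might be modelled, here is a minimal sketch. It assumes the YAML file has already been parsed into a plain mapping. `CodeConfig` and `load_code_config` are hypothetical names, not part of projio.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class CodeConfig:
    """Hypothetical model of the `code:` section of .projio/config.yml."""

    project_utils: Optional[PurePosixPath] = None


def load_code_config(raw: Mapping[str, Any]) -> CodeConfig:
    """Build a CodeConfig from an already-parsed config mapping.

    A missing or empty `code:` section yields project_utils=None.
    """
    code = raw.get("code") or {}
    value = code.get("project_utils")
    return CodeConfig(project_utils=PurePosixPath(value) if value else None)


cfg = load_code_config({"code": {"project_utils": "code/utils"}})
print(cfg.project_utils)
```

Treating the field as optional keeps projects without a `code/utils` directory valid. Whether projio should fall back to a default path instead is left open here.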
codio catalog¶
# .projio/codio/catalog.yml
libraries:
  cogpy:
    kind: internal
    role: core  # actively developed, promote target
    runtime_import: cogpy
    source_path: code/lib/cogpy
  labbox:
    kind: internal
    role: shared  # installed, not developed here
    runtime_import: labbox
    source_path: code/lib/labbox
Role values:
- core — project's own compute library. Agents can add functions here.
- shared — lab/org library, used as-is. Agents should not modify.
- external — PyPI package. Never modified.
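The role values above amount to a small write-permission policy. The sketch below shows one way a tool could encode it. `Role` and `agent_may_modify` are hypothetical names used for illustration only.

```python
from enum import Enum


class Role(str, Enum):
    """The three catalog roles listed above."""

    CORE = "core"
    SHARED = "shared"
    EXTERNAL = "external"


def agent_may_modify(role: str) -> bool:
    """Return True only for libraries that agents are allowed to edit.

    Unknown role strings raise ValueError rather than silently passing.
    """
    return Role(role) is Role.CORE


for role in ("core", "shared", "external"):
    print(role, agent_may_modify(role))
```

Rejecting unknown roles is a design choice in this sketch: a typo in the catalog then surfaces as an error instead of being treated as read-only or writable.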
Tool Interactions¶
Scaffolding (downward flow)¶
When pipeio scaffolds code, it queries projio config + codio to generate tier-aware templates:
┌─────────────┐ codio: role=core libraries ┌──────────────────┐
│ pipeio │ ──────────────────────────────→ │ import cogpy │
│ nb_create │ projio: project_utils path │ from utils.atlas │
│ script_create│ ──────────────────────────────→ │ import ... │
│ mod_create │ │ │
└─────────────┘ └──────────────────┘
nb_create — kind-aware + tier-aware¶
investigate/explore notebook:
# %% Setup
from pathlib import Path
import cogpy # ← core library (from codio, role=core)
from pipeio import PipelineContext # ← pipeio (pipeline context + path resolution)
from sutil.repo_root import repo_abs # ← lab utils (repo root resolution)
ctx = PipelineContext.from_registry("<flow>", root=repo_abs())
sess = ctx.session(subject="01", session="04", task="free")
# Available registry groups: ctx.groups()
demo notebook:
# %% Setup
from pathlib import Path
import cogpy # ← core library
output_dir = Path("derivatives/<flow>")
# %% Load Outputs
# ...
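To make the tier-aware part of the notebook templates concrete, the following sketch renders the import lines of a setup cell from catalog entries. It is an assumption-laden illustration: `render_setup_cell` is a hypothetical helper, the catalog is passed in as a plain dict, and the project utils module name (`utils`) is assumed to follow from the `code/utils` path.

```python
def render_setup_cell(core_imports, project_utils_module=None):
    """Return the text of a '# %% Setup' cell for the given tier imports."""
    lines = ["# %% Setup", "from pathlib import Path"]
    for name in sorted(core_imports):
        lines.append(f"import {name}  # core library (codio role=core)")
    if project_utils_module:
        lines.append(f"import {project_utils_module}  # project utils")
    return "\n".join(lines)


# Catalog entries mirroring the codio catalog example (only the fields used here).
catalog = {
    "cogpy": {"role": "core", "runtime_import": "cogpy"},
    "labbox": {"role": "shared", "runtime_import": "labbox"},
}
core = [e["runtime_import"] for e in catalog.values() if e["role"] == "core"]
cell = render_setup_cell(core, project_utils_module="utils")
print(cell)
```

Only `role: core` libraries are imported by default in this sketch. Shared libraries are left for the notebook author to add when needed.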
script_create / mod_create¶
"""Description"""
from pathlib import Path
import cogpy # ← core library (from codio)

def main(snakemake):
    # --- Inputs ---
    lfp = Path(snakemake.input.lfp)
    # --- Outputs ---
    out = Path(snakemake.output.out)
    # --- Processing ---
    result = cogpy.preprocess.filter_bandpass(...)  # ← agent fills this
    ...
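Because the template takes `snakemake` as a parameter, the script body can be exercised outside a Snakemake run by passing a stand-in object. The sketch below assumes nothing about cogpy: the processing step is a placeholder (upper-casing text), and the `SimpleNamespace` stub only mimics the `input`, `output`, and `params` attributes the template reads.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace


def main(snakemake):
    # --- Inputs ---
    src = Path(snakemake.input.lfp)
    # --- Outputs ---
    dst = Path(snakemake.output.out)
    # --- Processing ---
    # Placeholder standing in for a core-library call.
    dst.write_text(src.read_text().upper())


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "lfp.txt"
    src.write_text("signal")
    # Stand-in for the object Snakemake would inject (illustrative only).
    fake = SimpleNamespace(
        input=SimpleNamespace(lfp=str(src)),
        output=SimpleNamespace(out=str(Path(tmp) / "out.txt")),
        params=SimpleNamespace(),
    )
    main(fake)
    result = (Path(tmp) / "out.txt").read_text()

print(result)
```

Keeping all `snakemake.*` access at the top of `main` keeps the wiring separate from the processing, which is what makes later promotion of the processing code to a library straightforward.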
Promotion (upward flow)¶
When code moves between tiers, pipeio classifies it:
┌─────────────────────────────────┐
notebook cell → │ pipeio_code_classify / nb_promote│
└──────────┬──────────────────────┘
│
┌───────────────┼───────────────────┐
▼ ▼ ▼
core_library project_utils flow_script
(cogpy) (code/utils) (scripts/)
Classification heuristic¶
| Signal in code | Tier | Reason |
|---|---|---|
| References snakemake.* | flow_script | Coupled to snakemake execution |
| Reads project-specific file formats, derivative structures, coordinate systems | project_utils | Project-coupled but cross-flow reusable |
| Pure function: ndarray/xarray in → same out, no project deps | core_library | Dataset-agnostic, reusable |
| Uses project-specific constants, naming conventions, atlas registrations | project_utils | Project-coupled but not flow-coupled |
| Hardcoded subject/session values | flow_script (or refactor) | Too specific for any library |
AST-based detection¶
def classify_function(func_ast, project_utils_imports, core_lib_imports):
    names_used = extract_all_names(func_ast)
    if "snakemake" in names_used:
        return "flow_script"
    if names_used & project_utils_imports:  # PipelineContext, Session, etc.
        return "project_utils"
    if is_pure_function(func_ast):  # no global state, no IO side effects
        return "core_library"
    return "project_utils"  # conservative default
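The heuristic above calls two helpers, `extract_all_names` and `is_pure_function`, that are not defined in this document. The sketch below shows one possible way to fill them in with the standard `ast` module. The purity check is a crude assumption: it only looks for `global`/`nonlocal` statements and a small, hand-picked set of I/O calls. `classify_function` is repeated so the block runs on its own.

```python
import ast

# Assumed markers of I/O side effects; deliberately incomplete.
IO_CALLS = {"open", "print", "input"}
IO_METHODS = {"read_text", "write_text", "savefig", "to_csv", "to_netcdf"}


def extract_all_names(func_ast):
    """Collect bare identifiers, including the root of attribute chains."""
    return {n.id for n in ast.walk(func_ast) if isinstance(n, ast.Name)}


def is_pure_function(func_ast):
    """Heuristic: no global/nonlocal statements and no known I/O calls."""
    for node in ast.walk(func_ast):
        if isinstance(node, (ast.Global, ast.Nonlocal)):
            return False
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name) and func.id in IO_CALLS:
                return False
            if isinstance(func, ast.Attribute) and func.attr in IO_METHODS:
                return False
    return True


def classify_function(func_ast, project_utils_imports, core_lib_imports):
    names_used = extract_all_names(func_ast)
    if "snakemake" in names_used:
        return "flow_script"
    if names_used & project_utils_imports:
        return "project_utils"
    if is_pure_function(func_ast):
        return "core_library"
    return "project_utils"  # conservative default


source = '''
def run(snakemake):
    return snakemake.input.lfp

def overlay(coords):
    return utils.atlas.register(coords)

def zscore(x):
    return (x - x.mean()) / x.std()

def dump(x, path):
    with open(path, "w") as fh:
        fh.write(str(x))
'''
tiers = {
    fn.name: classify_function(fn, {"utils"}, {"cogpy"})
    for fn in ast.parse(source).body
    if isinstance(fn, ast.FunctionDef)
}
print(tiers)
```

In this sketch `core_lib_imports` is accepted but unused, matching the original signature. A stricter version might require that a `core_library` candidate imports nothing outside the core libraries and their dependencies.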
Audit (lateral / validation)¶
mod_audit enhancement — import health¶
Add to existing mod_audit checks:
- Deprecated import: Script imports utils.io.PipelineRegistry (superseded by pipeio)
- Missing import: Script imports cogpy.preprocess.filter_v2 but the function was renamed or removed
- Tier violation: Script in code/lib/cogpy/ imports from utils.io (core library should not depend on project utils)
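The tier-violation check can be sketched with the standard `ast` module: parse a core-library source file and report absolute imports whose top-level package belongs to project utils. `find_tier_violations` and its `forbidden_roots` parameter are hypothetical names, and the assumption that project utils are imported as `utils` follows the examples in this document.

```python
import ast


def find_tier_violations(source, forbidden_roots=("utils",)):
    """Return (line, module) pairs for imports a core library should not make."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            modules = [node.module]
        else:
            continue  # not an import, or a relative import inside the package
        for module in modules:
            if module.split(".")[0] in forbidden_roots:
                violations.append((node.lineno, module))
    return sorted(violations)


core_source = (
    "import numpy as np\n"
    "from utils.io import read_state_epochs\n"
    "from . import helpers\n"
    "import utils.atlas\n"
)
violations = find_tier_violations(core_source)
print(violations)
```

Relative imports are skipped on purpose, since they stay inside the library and cannot cross a tier boundary.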
script_audit — new tool, cross-flow¶
pipeio_script_audit(flow=None) # None = all flows
Returns:
- Duplicate functions: Same function body in scripts across different flows
- Common patterns: Import patterns shared by 3+ scripts → candidate for utils/library
- Promotion candidates: Pure functions in flow scripts that belong in core library
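One possible way to detect duplicate function bodies across flows is to compare normalised AST dumps, which ignore comments and formatting, using `difflib.SequenceMatcher`. This is only a sketch of the idea. The document does not say how similarity is computed, and `function_bodies`, `find_duplicates`, and the 0.9 threshold are assumptions.

```python
import ast
from difflib import SequenceMatcher
from itertools import combinations


def function_bodies(source):
    """Map each top-level function name to a normalised dump of its body."""
    return {
        node.name: "\n".join(ast.dump(stmt) for stmt in node.body)
        for node in ast.parse(source).body
        if isinstance(node, ast.FunctionDef)
    }


def find_duplicates(scripts_by_flow, threshold=0.9):
    """scripts_by_flow maps a flow name to the source text of one script."""
    entries = [
        (flow, name, body)
        for flow, src in scripts_by_flow.items()
        for name, body in function_bodies(src).items()
    ]
    found = []
    for (f1, n1, b1), (f2, n2, b2) in combinations(entries, 2):
        if f1 == f2:
            continue  # only cross-flow duplicates are of interest
        similarity = SequenceMatcher(None, b1, b2).ratio()
        if similarity >= threshold:
            found.append(
                {"name": n1, "flows": sorted([f1, f2]), "similarity": round(similarity, 2)}
            )
    return found


scripts = {
    "preprocess_ieeg": "def load_lfp(path):\n    data = open(path).read()\n    return data.split()\n",
    "sharpwaveripple": (
        "def load_lfp(path):\n    # same logic, different comment\n"
        "    data = open(path).read()\n    return data.split()\n"
        "def other(x):\n    return x + 1\n"
    ),
}
duplicates = find_duplicates(scripts)
print(duplicates)
```

Comparing every pair of functions is quadratic in the number of functions, which is likely acceptable for a handful of flows but would need bucketing (for example by body hash) on larger projects.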
New Tools¶
pipeio_code_classify (embedded in nb_promote)¶
Enhance nb_promote to return tier classification for each extracted function:
result = pipeio_nb_promote(flow, name, mod)
# result now includes:
{
    "classifications": [
        {"name": "bandpass_filter", "tier": "core_library",
         "reason": "pure function, numpy in/out, no project deps",
         "suggested_module": "cogpy.preprocess"},
        {"name": "plot_session_atlas_overlay", "tier": "project_utils",
         "reason": "reads project-specific atlas registrations and electrode coordinates",
         "suggested_module": "utils.atlas"},
        {"name": "run_pipeline", "tier": "flow_script",
         "reason": "references snakemake.input"}
    ]
}
pipeio_script_audit (new)¶
Cross-flow script analysis:
result = pipeio_script_audit()
{
    "duplicate_functions": [
        {"name": "load_lfp", "flows": ["preprocess_ieeg", "sharpwaveripple"],
         "similarity": 0.95, "suggestion": "promote to utils.io or cogpy.io"}
    ],
    "promotion_candidates": [
        {"name": "robust_zscore", "flow": "preprocess_ieeg", "script": "badlabel.py",
         "tier": "core_library", "reason": "pure function, used in 3 scripts"}
    ],
    "import_patterns": {
        "cogpy.preprocess": 12,  # used in 12 scripts
        "utils.atlas": 4,  # used in 4 scripts
        "numpy": 25,  # universal
    }
}
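The `import_patterns` counts in the result above could be gathered roughly as follows. This is a sketch under assumptions: `count_import_patterns` is a hypothetical helper, each module is counted once per script, and the "3+ scripts" cut-off is taken from the description of common patterns earlier in this document.

```python
import ast
from collections import Counter


def count_import_patterns(sources):
    """Count, per module, how many scripts import it (once per script)."""
    counts = Counter()
    for source in sources:
        modules = set()
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.Import):
                modules.update(alias.name for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                modules.add(node.module)
        counts.update(modules)
    return counts


script_sources = [
    "import numpy as np\nfrom cogpy.preprocess import filter_bandpass\n",
    "import numpy\nimport cogpy.preprocess\n",
    "import numpy as np\nfrom utils.atlas import plot\nfrom cogpy.preprocess import x\n",
]
patterns = count_import_patterns(script_sources)
shared = sorted(name for name, n in patterns.items() if n >= 3)
print(dict(patterns), shared)
```

Third-party packages such as numpy will dominate these counts, so a real audit would probably filter them out before suggesting promotion candidates. That filtering is not shown here.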
mod_audit enhancement — import checks¶
Add to existing mod_audit:
# New check category: "import_health"
{
    "severity": "warning",
    "check": "import_health",
    "message": "Script badlabel.py imports utils.io.PipelineContext (deprecated, use pipeio.PipelineContext)"
}
projio responsibilities¶
projio init / projio sync — auto-register libraries¶
When projio initializes or syncs a workspace:
- Scan code/lib/*/ for Python packages
- For each, check if registered in codio catalog
- If not, auto-register with:
  - kind: internal
  - role: core if path matches code.core_library config (or first found)
  - role: shared otherwise
  - runtime_import: {package_name}
- Register code/utils/ path in projio config if it exists
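The scan-and-register steps above can be sketched as a pure function over the filesystem and the existing catalog. Several details are assumptions made for illustration: `discover_libraries` is a hypothetical name, a directory counts as a Python package if it contains `__init__.py` or `pyproject.toml`, and "first found" is read as the alphabetically first package. Registering `code/utils/` in the projio config is not shown.

```python
import tempfile
from pathlib import Path

# Assumed markers for "is a Python package"; the document does not define this.
PACKAGE_MARKERS = ("__init__.py", "pyproject.toml")


def discover_libraries(root, catalog, core_library=None):
    """Return catalog entries for code/lib/* packages not yet registered."""
    lib_dir = Path(root) / "code" / "lib"
    if not lib_dir.is_dir():
        return {}
    packages = sorted(
        p for p in lib_dir.iterdir()
        if p.is_dir() and any((p / marker).exists() for marker in PACKAGE_MARKERS)
    )
    # Configured core library wins; otherwise fall back to the first package found.
    core_path = core_library or (f"code/lib/{packages[0].name}" if packages else None)
    entries = {}
    for pkg in packages:
        if pkg.name in catalog:
            continue  # already registered, leave untouched
        source_path = f"code/lib/{pkg.name}"
        entries[pkg.name] = {
            "kind": "internal",
            "role": "core" if source_path == core_path else "shared",
            "runtime_import": pkg.name,
            "source_path": source_path,
        }
    return entries


with tempfile.TemporaryDirectory() as tmp:
    lib = Path(tmp) / "code" / "lib"
    (lib / "cogpy").mkdir(parents=True)
    (lib / "cogpy" / "__init__.py").write_text("")
    (lib / "labbox").mkdir()
    (lib / "labbox" / "pyproject.toml").write_text("")
    (lib / "notes").mkdir()  # no package marker, so it is ignored
    existing = {"labbox": {"kind": "internal", "role": "shared"}}
    new_entries = discover_libraries(tmp, existing)

print(new_entries)
```

Returning only the new entries, rather than mutating the catalog, keeps the function easy to test and lets the caller decide how to write them back.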
project_context() enhancement¶
Include code tier info in project context response:
{
    "code_tiers": {
        "core_libraries": [
            {"name": "cogpy", "import": "cogpy", "path": "code/lib/cogpy", "role": "core"}
        ],
        "project_utils": "code/utils",
        "flow_count": 9
    }
}
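A response of this shape could be assembled from the codio catalog and the projio config along these lines. The helper name `build_code_tiers` is hypothetical, and the catalog is passed in as a plain dict mirroring the catalog example earlier in this document.

```python
def build_code_tiers(catalog, project_utils, flow_count):
    """Assemble the code_tiers block from catalog entries and config values."""
    core_libraries = [
        {
            "name": name,
            "import": entry["runtime_import"],
            "path": entry["source_path"],
            "role": entry["role"],
        }
        for name, entry in catalog.items()
        if entry.get("role") == "core"
    ]
    return {
        "code_tiers": {
            "core_libraries": core_libraries,
            "project_utils": project_utils,
            "flow_count": flow_count,
        }
    }


catalog = {
    "cogpy": {"kind": "internal", "role": "core",
              "runtime_import": "cogpy", "source_path": "code/lib/cogpy"},
    "labbox": {"kind": "internal", "role": "shared",
               "runtime_import": "labbox", "source_path": "code/lib/labbox"},
}
context = build_code_tiers(catalog, "code/utils", 9)
print(context)
```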
Entity Management Matrix (updated)¶
| Entity | Create | Read | Update | Audit | Promote to |
|---|---|---|---|---|---|
| Flow | flow_new | flow_list, flow_status | agent edits | registry_validate | - |
| Mod | mod_create | mod_list, mod_context | agent edits | mod_audit | - |
| Rule | rule_stub, rule_insert | rule_list | rule_update | mod_audit | - |
| Config | config_init | config_read | config_patch | contracts_validate | - |
| Script | script_create | mod_context | agent edits | mod_audit + import checks | core_library, project_utils |
| Notebook | nb_create (tier-aware) | nb_read, nb_analyze | nb_update | nb_audit | flow_script via nb_promote |
| Mod docs | mod_create | mod_context | mod_doc_refresh | mod_audit | - |
| Core library | codio_add (via projio sync) | codio_get, codio_func_doc | agent edits | script_audit reverse-deps | - |
| Project utils | projio config | agent reads | agent edits | mod_audit import checks | core_library |
Implementation Order¶
- projio config model: add code.project_utils field, parse in _load_config()
- projio sync: auto-discover code/lib/*/, register in codio with role
- codio catalog: add role field to catalog schema
- pipeio nb_create: query projio config for project_utils, generate tier-aware bootstrap
- pipeio script_create / mod_create: include project_utils imports when relevant
- pipeio nb_promote: add classification output (AST-based tier detection)
- pipeio mod_audit: add import health checks (deprecated utils, missing functions)
- pipeio script_audit: new tool, cross-flow duplicate/pattern detection