cogpy Design Specification
This document is the single authoritative reference for cogpy’s design: the data model, processing abstractions, and module contracts. It consolidates decisions from earlier per-version specs into one living document.
1. Purpose
cogpy exists to provide composable, file-agnostic compute primitives for ECoG / iEEG signal analysis, backed by a structured I/O layer and reproducible pipelines. It also serves as the backend API for visualization frontends (TensorScope, a standalone React + TypeScript application).
2. Data Model
All core compute operates on xarray.DataArray (or xarray.Dataset) objects
with standardized dimension names and metadata. This section defines the
canonical schemas.
2.1 Signal Schemas (cogpy.base.ECoGSchema)
| Schema | Dims | Use case |
|---|---|---|
| Grid ECoG | | Spatial analysis on 2D electrode arrays |
| Flat ECoG | | Channel-indexed analysis without grid semantics |
| Multichannel | | Generic multichannel (no grid) |
Required metadata:
- fs (float): sampling rate in Hz. Accessible as a 0-D coordinate named "fs" or as attrs["fs"]. Validated by cogpy.base.ensure_fs().
- time coordinate: seconds, 1-D, strictly increasing.
Optional metadata:
- units (str): e.g. "uV"
- AP, ML coordinates: physical positions (mm), 1-D
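The metadata rules above can be checked mechanically. The following is a minimal sketch that uses plain mappings as stand-ins for a DataArray's coords and attrs. The helper names resolve_fs and check_time are hypothetical and are not cogpy's ensure_fs(); the precedence of the coordinate over the attribute is also an assumption, since the spec accepts both locations.

```python
from typing import Any, Mapping, Sequence


def resolve_fs(coords: Mapping[str, Any], attrs: Mapping[str, Any]) -> float:
    """Return the sampling rate in Hz from a coordinate "fs" or attrs["fs"]."""
    # Assumption: the coordinate wins when both locations are present.
    for source in (coords, attrs):
        if "fs" in source:
            fs = float(source["fs"])
            if fs <= 0:
                raise ValueError(f"fs must be positive, got {fs}")
            return fs
    raise ValueError('missing sampling rate: set a 0-D coordinate "fs" or attrs["fs"]')


def check_time(time: Sequence[float]) -> None:
    """Raise ValueError unless the time coordinate is strictly increasing."""
    for i in range(1, len(time)):
        if time[i] <= time[i - 1]:
            raise ValueError(f"time must be strictly increasing (violated at index {i})")


# Stand-ins for da.coords and da.attrs of a signal sampled at 1 kHz.
coords = {"time": [0.0, 0.001, 0.002]}
attrs = {"fs": 1000.0, "units": "uV"}
fs = resolve_fs(coords, attrs)
check_time(coords["time"])
```

Both helpers raise ValueError with a message that names the offending field, which matches the "hints on mismatch" behavior expected of boundary validators.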
2.2 Spectrogram Schema
| Entity | Dims | Notes |
|---|---|---|
| | | For orthoslicer / spatial-spectral views |
Coordinates: ml, ap (physical), time (seconds), freq (Hz, increasing).
2.3 Event Representations
| Entity | Type | Required fields |
|---|---|---|
| | | |
| | | |
EventCatalog optional columns:
- Interval: t0, t1, duration
- Spatial: channel, AP, ML
- Spectral: freq, f0, f1, bandwidth
- Provenance: label, score, detector, pipeline
Converters: to_events(), to_intervals(), to_point_intervals(),
to_event_stream() (for visualization).
2.4 Coercion and Validation
Boundary functions in cogpy.datasets.schemas enforce schemas at entry points:
- validate_*(): raise ValueError with hints on mismatch.
- coerce_*(): fix "almost-right" inputs (dim permutations, missing fs) before validation.
Core compute functions should not coerce internally; validation happens at boundaries (I/O, CLI, GUI entry points).
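The split between coercion and validation can be illustrated on dimension names alone. This is a sketch under stated assumptions: validate_grid_dims and coerce_grid_dims are hypothetical names, not functions from cogpy.datasets.schemas, and the canonical order used is the spec's stated target for grid signals.

```python
from typing import Sequence, Tuple

# Assumption: the spec's target dim order for grid signals.
CANONICAL_GRID_DIMS: Tuple[str, ...] = ("time", "AP", "ML")


def validate_grid_dims(dims: Sequence[str]) -> None:
    """Raise ValueError with a hint unless dims match the canonical order exactly."""
    if tuple(dims) != CANONICAL_GRID_DIMS:
        hint = ""
        if sorted(dims) == sorted(CANONICAL_GRID_DIMS):
            hint = "; dims are a permutation, try coercing first"
        raise ValueError(f"expected dims {CANONICAL_GRID_DIMS}, got {tuple(dims)}{hint}")


def coerce_grid_dims(dims: Sequence[str]) -> Tuple[int, ...]:
    """Return the axis permutation that reorders dims into canonical order."""
    if sorted(dims) != sorted(CANONICAL_GRID_DIMS):
        raise ValueError(
            f"cannot coerce {tuple(dims)}: not a permutation of {CANONICAL_GRID_DIMS}"
        )
    return tuple(list(dims).index(name) for name in CANONICAL_GRID_DIMS)


dims = ("time", "ML", "AP")
order = coerce_grid_dims(dims)          # axis order to transpose with
fixed = tuple(dims[i] for i in order)   # dim names after the transpose
validate_grid_dims(fixed)               # passes once coerced
```

On a real xarray.DataArray the same reordering is a single call to transpose with the canonical dim names; the sketch only makes the permutation explicit. Validation stays strict, so a compute function that receives a validated array never needs to reorder dims itself.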
3. Processing Framework
3.1 Layering
cogpy.io Load / save / sidecar management
↓ xarray
cogpy.* Pure compute (filtering, spectral, detection, …)
↓ xarray + EventCatalog
cogpy.cli / wf Thin orchestration (Snakemake, argparse)
↓ API
Frontend TensorScope (standalone React + TS app), notebooks
Compute functions never touch the filesystem. Functions in cogpy.io
never do heavy compute. Pipelines compose both.
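The layering rule can be sketched with three small functions. Everything here is illustrative: the JSON file layout and the names load_samples, rms, and run_pipeline are assumptions made for the example and do not correspond to cogpy modules.

```python
import json
import math
import tempfile
from pathlib import Path
from typing import List


def load_samples(path: Path) -> List[float]:
    """I/O layer: read samples from disk, no analysis."""
    return json.loads(path.read_text())["samples"]


def rms(samples: List[float]) -> float:
    """Compute layer: pure function, never touches the filesystem."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def run_pipeline(path: Path) -> float:
    """Orchestration layer: composes I/O and compute."""
    return rms(load_samples(path))


# Write a tiny recording to a temporary file, then run the composed pipeline.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "recording.json"
    path.write_text(json.dumps({"samples": [3.0, -3.0, 3.0, -3.0]}))
    result = run_pipeline(path)
```

Because rms takes in-memory data only, it can be tested and reused without any file fixtures, which is the practical payoff of keeping compute file-agnostic.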
3.2 Preprocessing Stack
Canonical modules (under cogpy.preprocess):
| Module | Responsibility |
|---|---|
| | Temporal/spatial filtering (xarray + Dask aware) |
| | Per-channel feature extraction |
| | Spatial normalization (neighborhood statistics) |
| | Sliding-window feature-map orchestration |
| | Outlier labeling (DBSCAN) |
| | Line-noise removal |
| | Downsampling / decimation |
| | Bad-channel interpolation |
Legacy modules (channel_feature_functions, channel_feature, detect_bads)
remain for backward compatibility but are not the target for new code.
3.3 Spectral Analysis
| Function | Module | Description |
|---|---|---|
| | | Power spectral density (Welch / multitaper) |
| | | Time–frequency spectrogram |
| | | Coherence between channels |
All accept xarray.DataArray and return xarray.DataArray with appropriate
frequency/time-frequency dimensions.
3.4 Detection Framework
Detection is built on three abstractions:
EventDetector (cogpy.detect.base):
- detect(data) -> EventCatalog
- can_accept(data) -> bool
- needs_transform(data) -> bool (smart transform: accept raw or precomputed)
- Serializable via to_dict() / from_dict()
Concrete detectors:
| Detector | Input | Output | Wraps |
|---|---|---|---|
| | spectrogram or raw signal | point events | |
| | 1-D signal | interval events | contiguous-run finder |
| | raw signal | interval events | bandpass → envelope → z-score → dual threshold |
| | raw signal | interval events | |
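The last step of the "bandpass → envelope → z-score → dual threshold" chain combines a contiguous-run finder with two thresholds. The sketch below shows one common reading, assumed here rather than taken from cogpy: a run is bounded by the low threshold and kept only if its peak reaches the high threshold. The function name and the exclusive end time are illustrative choices.

```python
from typing import List, Optional, Sequence, Tuple


def dual_threshold_intervals(
    z: Sequence[float], fs: float, low: float, high: float
) -> List[Tuple[float, float]]:
    """Return (t0, t1) in seconds for runs above `low` whose peak reaches `high`."""
    intervals: List[Tuple[float, float]] = []
    start: Optional[int] = None
    peak = float("-inf")
    # A trailing sentinel below any threshold closes a run that reaches the end.
    for i, value in enumerate(list(z) + [float("-inf")]):
        if value > low:
            if start is None:
                start, peak = i, value  # a new run begins
            peak = max(peak, value)
        elif start is not None:
            if peak >= high:
                intervals.append((start / fs, i / fs))  # t1 is the exclusive end
            start = None
    return intervals


# Z-scored envelope sampled at 10 Hz: three runs above 1.0, two of which peak above 3.0.
z = [0.1, 1.5, 3.2, 2.5, 0.3, 1.2, 1.8, 0.2, 2.1, 4.0]
events = dual_threshold_intervals(z, fs=10.0, low=1.0, high=3.0)
# events == [(0.1, 0.4), (0.8, 1.0)]
```

The middle run (samples 5 and 6) crosses the low threshold but never reaches the high one, so it is dropped. Each returned pair maps naturally onto the t0 and t1 interval columns.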
Transform (cogpy.detect.transforms):
- compute(data) -> xr.DataArray
- Concrete: BandpassTransform, HighpassTransform, LowpassTransform, SpectrogramTransform, HilbertTransform, ZScoreTransform
DetectionPipeline (cogpy.detect.pipeline):
Chains transforms + detector into a single reproducible unit.
- run(data) -> EventCatalog
- Adds provenance to output metadata.
- Serializable via to_dict() / from_dict().
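To make the pipeline contract concrete, here is a toy sketch of chaining, provenance, and a to_dict() / from_dict() round trip. It operates on plain lists rather than xarray objects, and every name in it (TRANSFORMS, ThresholdDetector, Pipeline, the serialized layout) is hypothetical; it does not reproduce cogpy's classes.

```python
from typing import Any, Callable, Dict, List, Sequence

# Hypothetical registry mapping serialized names to transform functions.
TRANSFORMS: Dict[str, Callable[[Sequence[float]], List[float]]] = {
    "abs": lambda x: [abs(v) for v in x],
    "demean": lambda x: [v - sum(x) / len(x) for v in x],
}


class ThresholdDetector:
    """Toy detector: one point event per sample above a threshold."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def detect(self, data: Sequence[float]) -> List[Dict[str, Any]]:
        return [{"index": i, "value": v} for i, v in enumerate(data) if v > self.threshold]

    def to_dict(self) -> Dict[str, Any]:
        return {"threshold": self.threshold}


class Pipeline:
    """Chains named transforms and a detector; stamps provenance on each event."""

    def __init__(self, name: str, transforms: List[str], detector: ThresholdDetector) -> None:
        self.name, self.transforms, self.detector = name, transforms, detector

    def run(self, data: Sequence[float]) -> List[Dict[str, Any]]:
        for key in self.transforms:
            data = TRANSFORMS[key](data)
        events = self.detector.detect(data)
        for event in events:
            event["pipeline"] = self.name  # provenance column
        return events

    def to_dict(self) -> Dict[str, Any]:
        return {
            "name": self.name,
            "transforms": list(self.transforms),
            "detector": self.detector.to_dict(),
        }

    @classmethod
    def from_dict(cls, spec: Dict[str, Any]) -> "Pipeline":
        return cls(spec["name"], list(spec["transforms"]), ThresholdDetector(**spec["detector"]))


pipeline = Pipeline("toy_burst", ["demean", "abs"], ThresholdDetector(2.0))
restored = Pipeline.from_dict(pipeline.to_dict())  # round trip through a plain dict
signal = [0.0, 0.0, 6.0, 0.0, 0.0, 0.0]
events = restored.run(signal)
```

Storing transforms by name keeps the serialized form a plain dict of strings and numbers, so a pipeline definition can be written to a JSON sidecar and rebuilt later with identical behavior.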
Pre-built pipelines (cogpy.detect.pipelines):
BURST_PIPELINE, RIPPLE_PIPELINE, FAST_RIPPLE_PIPELINE, GAMMA_BURST_PIPELINE
4. I/O Layer
| Module | Formats | Key functions |
|---|---|---|
| | Binary | |
| | BIDS-iEEG | |
| | NWB-style ecephys | |
| | Zarr, DAT | |
| | JSON metadata | |
I/O is also responsible for constructing valid xarray.DataArray objects with
correct schemas from raw file data.
5. Datasets & Fixtures
cogpy.datasets provides deterministic synthetic data for testing and GUI
development:
- Entity generators (datasets.entities): single arrays matching schemas above.
- Bundles (datasets.gui_bundles): coordinated collections for GUI dev:
  - ieeg_grid_bundle() → grid signal + stacked view + RMS scalar + atlas hook
  - spectrogram_bursts_bundle() → 4D spectrogram + burst peaks
- Modes: "small" (fast debug) and "large" (stress-test rendering).
- All accept seed for reproducibility.
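The combination of a size mode and a seed can be sketched with the standard library alone. The generator below is a hypothetical stand-in, not a function from cogpy.datasets; the sample counts per mode and the 10 Hz test signal are assumptions chosen for illustration.

```python
import math
import random
from typing import List

# Assumption: sizes per mode are illustrative, not cogpy's actual values.
MODE_SAMPLES = {"small": 100, "large": 10_000}


def synthetic_channel(mode: str = "small", seed: int = 0, fs: float = 1000.0) -> List[float]:
    """Return a noisy 10 Hz sine; identical output for identical (mode, seed, fs)."""
    if mode not in MODE_SAMPLES:
        raise ValueError(f"mode must be one of {sorted(MODE_SAMPLES)}, got {mode!r}")
    rng = random.Random(seed)  # local generator, leaves global random state untouched
    return [
        math.sin(2 * math.pi * 10.0 * i / fs) + rng.gauss(0.0, 0.1)
        for i in range(MODE_SAMPLES[mode])
    ]


a = synthetic_channel("small", seed=42)
b = synthetic_channel("small", seed=42)  # same seed: identical samples
c = synthetic_channel("small", seed=7)   # different seed: different noise
```

Using a generator instance scoped to the call, rather than seeding a global one, keeps fixtures deterministic even when tests run in a different order or in parallel.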
6. Open Design Questions
- Finalize canonical dim order: ("time", "AP", "ML") vs ("time", "ML", "AP"). Current code has both; spec target is ("time", "AP", "ML").
- Standardize fs as coordinate vs attribute (currently both are accepted).
- Define when from_file() should return DataArray vs Dataset.
- Define a public API surface that the TensorScope frontend should depend on.