Compose artifact analysis from cogpy primitives

This guide shows how to assemble cogpy’s atomic primitives for artifact analysis in an external project (notebook, Snakemake pipeline, or standalone script). cogpy provides the operators; your project provides the orchestration.

For the full primitive catalog and design philosophy, see Atomic Primitives — Design Philosophy.

Setup

All examples assume:

import numpy as np
import xarray as xr

1. Detect events in a reference signal

Use when you have a signal (e.g. TTL channel, sync pulse, stimulus trace) that contains discrete events you want to locate.

from cogpy.detect import ThresholdDetector

detector = ThresholdDetector(
    threshold=3.0,          # in signal units (or z-scored units)
    direction="positive",   # "positive", "negative", or "both"
    min_duration=0.001,     # seconds — reject transients shorter than this
    merge_gap=0.002,        # seconds — merge events closer than this
)
catalog = detector.detect(signal)       # signal: xr.DataArray with "time" dim
event_times = catalog.df["t"].values    # ndarray of event times in seconds

Alternatively, for a continuous score signal (e.g. envelope, power trace):

from cogpy.detect.utils import score_to_bouts

bouts = score_to_bouts(
    score, times,
    low=2.0, high=3.0,       # dual-threshold: enter at high, exit at low
    min_duration=0.02,
    merge_gap=0.01,
)
# bouts: list of dicts with keys {t0, t, t1, duration, value}

2. Match events across streams

Use when you have events detected in two different signals (e.g. TTL pulses recorded on two systems) and need to align them.

from cogpy.events.match import (
    match_nearest_symmetric,
    estimate_lag,
    estimate_drift,
    event_lag_histogram,
)

# One-to-one matching
idx_a, idx_b, lags = match_nearest_symmetric(times_a, times_b, max_lag=0.05)

# Global lag estimate
lag = estimate_lag(times_a, times_b, max_lag=0.05, method="median")

# Drift over time (polynomial fit to lag vs time)
coeffs = estimate_drift(times_a, times_b, max_lag=0.05, degree=1)
# coeffs[0] = drift rate (s/s), coeffs[1] = offset (s)
# predicted lag at time t: np.polyval(coeffs, t)

# Visualize lag distribution
counts, bin_edges = event_lag_histogram(times_a, times_b, max_lag=0.05, bin_width=0.001)

3. Extract event-locked epochs

Use when you want to cut windows around events from a continuous signal.

from cogpy.brainstates.intervals import perievent_epochs

epochs = perievent_epochs(
    signal,              # xr.DataArray with "time" dim
    event_times,         # 1-D array of event times (seconds)
    fs=1000.0,           # sampling rate
    pre=0.010,           # seconds before event
    post=0.010,          # seconds after event
)
# epochs: DataArray with dims (event, <original non-time dims>, lag)
# lag coordinate: relative time from -pre to +post

4. Compute triggered statistics

from cogpy.triggered import (
    triggered_average,
    triggered_std,
    triggered_median,
    triggered_snr,
)

avg = triggered_average(epochs)    # mean across events
med = triggered_median(epochs)     # robust median
std = triggered_std(epochs)        # variability
snr = triggered_snr(epochs)        # mean / std ratio

5. Estimate and subtract a template

Use when each event produces a stereotyped waveform you want to remove.

from cogpy.triggered import estimate_template, fit_scaling, subtract_template

# Estimate template (median is robust to outlier events)
template = estimate_template(epochs, method="median")

# Optional: fit per-event amplitude scaling
alpha = fit_scaling(epochs.values, template.values)

# Subtract from the continuous signal
event_samples = (event_times * fs).astype(int)
cleaned = subtract_template(
    signal, event_samples, template.values, scaling=alpha,
)

6. Regression-based component removal

Use when the artifact is correlated with a reference signal but the waveform shape varies (e.g. different propagation delays per channel).

from cogpy.regression import (
    lagged_design_matrix,
    event_design_matrix,
    ols_fit, ols_predict, ols_residual,
)

# Option A: lagged reference regression
X = lagged_design_matrix(reference_1d, lags=range(-10, 11), intercept=True)
beta = ols_fit(X, target_multichannel)      # (n_features, n_channels)
cleaned = ols_residual(X, target_multichannel, beta)

# Option B: event-based template regression
X = event_design_matrix(n_time, event_samples, template_1d, intercept=True)
beta = ols_fit(X, target_multichannel)
cleaned = ols_residual(X, target_multichannel, beta)

7. Spectral diagnostics

Use to characterize contamination or verify cleaning.

from cogpy.spectral.psd import psd_multitaper
from cogpy.spectral.specx import spectrogramx, psdx
from cogpy.spectral.features import (
    band_power,
    spectral_peak_freqs,
    ftest_line_scan,
    narrowband_ratio,
    line_noise_ratio,
    am_artifact_score,
)

# PSD (numpy)
psd, freqs = psd_multitaper(signal_1d, fs, NW=4)

# PSD (xarray, preserves coords)
psd_da = psdx(signal_xr, method="multitaper", bandwidth=4.0)

# Spectrogram (xarray)
spec = spectrogramx(signal_xr, bandwidth=4.0, nperseg=256)

# Detect spectral peaks
peaks_hz = spectral_peak_freqs(psd, freqs, prominence=2.0)

# F-test for sinusoidal lines (more sensitive than peak detection)
fstat, freqs, sig_mask = ftest_line_scan(signal_1d, fs, NW=4.0)

# Per-bin narrowband ratio (values >> 1 = line artifact)
ratio = narrowband_ratio(psd, freqs, flank_hz=5.0)

# Amplitude modulation sidebands (e.g. carrier=fc, modulator=fm)
score = am_artifact_score(psd, freqs, fc=120.0, fm=60.0)

# Band power
power = band_power(psd, freqs, band=(100, 140))

8. Spatial analysis

Use for grid-electrode recordings to characterize spatial structure.

from cogpy.measures.spatial import (
    moran_i,
    gradient_anisotropy,
    spatial_kurtosis,
    marginal_energy_outlier,
    csd_power,
)
from cogpy.preprocess.filtering.reference import cmrx
from cogpy.preprocess.filtering.spatial import gaussian_spatialx
from cogpy.decomposition.pca import erpPCA

# Spatial autocorrelation
I = moran_i(grid_map, adjacency="queen")   # +1 smooth, 0 random, -1 checker

# Directional anisotropy (detect row/column striping)
aniso = gradient_anisotropy(grid_map)

# Row/column energy outliers
outliers = marginal_energy_outlier(grid_map, robust=True)

# Common median reference
rereferenced = cmrx(signal_grid)

# Spatial smoothing
smoothed = gaussian_spatialx(signal_grid, sigma=1.0)

# PCA decomposition
pca = erpPCA(nfac=5)
pca.fit(design_matrix)
scores = pca.FSr       # factor scores
loadings = pca.LR      # rotated loadings

9. Validate before vs after

Use to quantify whether cleaning improved the signal.

from cogpy.measures.comparison import (
    snr_improvement,
    residual_energy_ratio,
    bandpower_change,
    waveform_residual_rms,
)

# SNR change in dB
delta_snr = snr_improvement(
    psd_before, psd_after, freqs,
    signal_band=(1, 40), noise_band=(100, 140),
)

# Fraction of energy remaining (< 1 means energy was removed)
ratio = residual_energy_ratio(original, cleaned)

# Fractional bandpower change in a specific band
delta_bp = bandpower_change(psd_before, psd_after, freqs, band=(100, 140))

# Template waveform difference (for event-locked QC)
rms = waveform_residual_rms(template_before, template_after)

Composition patterns

These primitives are designed to be mixed freely. Common patterns:

  • Detect → epoch → template → subtract → validate (event-triggered template removal)

  • Detect → design matrix → regress → validate (regression-based removal)

  • Detect → epoch → spatial measures (characterize artifact spatial footprint)

  • Before PSD → clean → after PSD → compare (spectral validation loop)

The choice of pattern depends on your data. cogpy provides the building blocks; your project provides the assembly.