# How to build a Snakemake preprocessing pipeline

cogpy ships Snakemake workflows as package data. You can use the built-in
pipeline or compose your own from cogpy's building blocks.

## Using the built-in pipeline

```bash
# Run the full preprocessing pipeline
cogpy-preproc all --config /path/to/config.yml

# Run individual steps
cogpy-preproc lowpass --config /path/to/config.yml
cogpy-preproc feature --config /path/to/config.yml
cogpy-preproc badlabel --config /path/to/config.yml
```

The pipeline steps are:

1. **raw_zarr** — convert raw data to Zarr format
2. **lowpass** — lowpass filter
3. **downsample** — decimate to target sampling rate
4. **feature** — extract channel features (windowed)
5. **badlabel** — label bad channels (DBSCAN)
6. **plot_feature_maps** — generate QC visualizations
7. **interpolate** — interpolate bad channels

## Writing a custom Snakefile

A custom pipeline composes `cogpy.io` (load/save) with `cogpy` compute
subpackages:

```python
# scripts/my_step.py
import cogpy.io.ecog_io as ecog_io
from cogpy.preprocess.filtering import bandpassx, notchesx

# Load
sig = ecog_io.from_file(snakemake.input[0])

# Compute (no file I/O here)
sig = notchesx(sig, freqs=[60.0, 120.0, 180.0])
sig = bandpassx(sig, wl=0.5, wh=300.0, order=4, axis="time")

# Save
ecog_io.to_zarr(sig, snakemake.output[0])
```

```python
# Snakefile
rule filter_and_denoise:
    input: "{subject}/raw.zarr"
    output: "{subject}/filtered.zarr"
    script: "scripts/my_step.py"
```

## Design principles

- **Rules are thin orchestrators.** Heavy logic belongs in `cogpy` compute subpackages.
- **Use `cogpy.io` for all file operations.** Do not read/write files directly
  in core functions.
- **Sidecar management** (updating JSON metadata after resampling, etc.)
  happens in `cogpy.io`, not in Snakemake rules.

## Configuration

Pipelines use YAML configuration:

```yaml
# config.yml
subjects: ["sub-01", "sub-02"]
fs_target: 500.0
lowpass_freq: 200.0
line_freq: 60.0
badchannel:
  window_size: 2048
  window_step: 1024
  dbscan_eps: 1.5
  dbscan_min_samples: 5
```