Feature request: pipeio_nb_report (extract notebook results into a structured report)

Problem

Explore notebooks accumulate many cells (imports, data loading, reshaping, intermediate debugging) mixed with the actual findings (figures, metrics, interpretations). Reading through a 500+ line notebook to find the key results is slow, especially for sharing with collaborators or preparing meeting notes.

Proposed tool: pipeio_nb_report(flow, name)

An MCP tool that extracts the meaningful outputs from an executed notebook and produces a structured report.

Implementation approach

Use nbconvert's MarkdownExporter + ExtractOutputPreprocessor (already installed, no new deps):

from nbconvert import MarkdownExporter
from nbconvert.preprocessors import ExtractOutputPreprocessor

exporter = MarkdownExporter()
exporter.register_preprocessor(ExtractOutputPreprocessor, enabled=True)
body, resources = exporter.from_filename("notebook.ipynb")
# body = markdown with image references
# resources['outputs'] = dict of {filename: bytes} for extracted PNGs
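
As a minimal sketch of how the extracted outputs might be persisted, the helper below writes a `resources['outputs']`-style mapping to a report directory. The function name `save_outputs` and its signature are hypothetical and not part of pipeio; only the `{filename: bytes}` shape comes from nbconvert.

```python
from pathlib import Path


def save_outputs(outputs: dict[str, bytes], dest_dir: Path) -> list[Path]:
    """Write a {filename: bytes} mapping to dest_dir and return the written paths.

    `outputs` has the same shape as nbconvert's resources['outputs'].
    The function name and signature are hypothetical, not part of pipeio.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for filename, data in sorted(outputs.items()):
        # Keep only the base name so an unexpected key cannot escape dest_dir.
        target = dest_dir / Path(filename).name
        target.write_bytes(data)
        written.append(target)
    return written
```

With the snippet above, a call such as `save_outputs(resources["outputs"], Path("docs/assets/reports") / flow / name)` would place the figures in the directory named under Tool behavior, assuming `flow` and `name` are the tool's two arguments.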

Tool behavior

1. Read the executed .ipynb (from the workspace dir, after pipeio_nb_exec)
2. Run nbconvert MarkdownExporter + ExtractOutputPreprocessor
3. Save extracted PNGs to docs/assets/reports/<flow>/<name>/
4. Return structured payload:
    - markdown_cells: the narrative text (what's being computed, why)
    - figures: [{path, cell_index, caption_if_any}]
    - text_outputs: [{cell_index, content}] (print statements with metrics)
    - code_cells: optional, for context
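
The payload described above could be assembled roughly as follows. This is a sketch only: it reads the executed `.ipynb` as plain JSON (nbformat v4 layout) with the standard library instead of going through nbconvert, and the function name `build_report_payload`, the figure file naming scheme, and the omission of captions are assumptions rather than the proposed implementation.

```python
import base64
import json
from pathlib import Path


def _text(value) -> str:
    """nbformat stores text either as one string or as a list of lines."""
    return value if isinstance(value, str) else "".join(value)


def build_report_payload(nb_path: Path, assets_dir: Path, include_code: bool = False) -> dict:
    """Collect narrative text, PNG figures, and stdout text from an executed notebook."""
    nb = json.loads(nb_path.read_text(encoding="utf-8"))
    payload = {"markdown_cells": [], "figures": [], "text_outputs": []}
    if include_code:
        payload["code_cells"] = []

    for cell_index, cell in enumerate(nb.get("cells", [])):
        if cell["cell_type"] == "markdown":
            payload["markdown_cells"].append(
                {"cell_index": cell_index, "content": _text(cell["source"])}
            )
            continue
        if cell["cell_type"] != "code":
            continue  # skip raw cells
        if include_code:
            payload["code_cells"].append(
                {"cell_index": cell_index, "content": _text(cell["source"])}
            )
        for out_index, output in enumerate(cell.get("outputs", [])):
            if output["output_type"] == "stream" and output.get("name") == "stdout":
                payload["text_outputs"].append(
                    {"cell_index": cell_index, "content": _text(output["text"])}
                )
            png = output.get("data", {}).get("image/png")
            if png is not None:
                # image/png payloads are base64-encoded strings in the notebook JSON.
                assets_dir.mkdir(parents=True, exist_ok=True)
                fig_path = assets_dir / f"cell{cell_index}_{out_index}.png"
                fig_path.write_bytes(base64.b64decode(_text(png)))
                payload["figures"].append(
                    {"path": str(fig_path), "cell_index": cell_index}
                )
    return payload
```

Keeping `cell_index` on every entry lets a consumer interleave narrative, figures, and printed metrics in notebook order.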

Companion slash command: /report

Takes the pipeio_nb_report output and asks the agent to write a result note (in docs/log/result/) with:

- Concise description of each analysis step (the math, not the code)
- Embedded figures with captions and interpretation
- Key metrics highlighted
- Overall conclusions

Cell tagging (optional enhancement)

Allow notebook authors to tag cells for inclusion in the report using a comment marker (e.g., # %% [report] or # REPORT: prefix in markdown cells). The tool would then only extract tagged cells, producing a curated subset rather than the full notebook dump.
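
A minimal sketch of such a marker-based filter, assuming the two markers suggested above and cells represented as nbformat-style dicts. The names `REPORT_MARKERS`, `is_report_cell`, and `select_report_cells` are hypothetical, as is the rule that the marker must open the first non-blank line of the cell.

```python
REPORT_MARKERS = ("# %% [report]", "# REPORT:")


def _source_text(cell: dict) -> str:
    """Cell source may be a single string or a list of lines."""
    source = cell.get("source", "")
    return source if isinstance(source, str) else "".join(source)


def is_report_cell(cell: dict, markers: tuple = REPORT_MARKERS) -> bool:
    """True if the first non-blank line of the cell starts with a report marker."""
    for line in _source_text(cell).splitlines():
        stripped = line.strip()
        if stripped:
            return stripped.startswith(markers)
    return False  # empty cell


def select_report_cells(cells: list, markers: tuple = REPORT_MARKERS) -> list:
    """Return (cell_index, cell) pairs for tagged cells, keeping original indices."""
    return [(i, c) for i, c in enumerate(cells) if is_report_cell(c, markers)]
```

Returning the original indices keeps `cell_index` values in the payload consistent whether or not filtering is enabled.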

Longer-term: MyST embed

We already generate .myst paired files via jupytext. MyST-NB supports labeling cells (#| label: my-figure) and embedding their outputs in any other page. This would let docs/plan/ or docs/manuscript/ pages embed notebook figures directly, without copying. However, it requires adopting MyST-NB rendering in the MkDocs build.

References

nbconvert ExtractOutputPreprocessor: https://nbconvert.readthedocs.io/en/latest/nbconvert_library.html
MyST embed/reuse: https://mystmd.org/guide/reuse-jupyter-outputs
Junix (simpler CLI alternative): https://pypi.org/project/junix/
Motivated by pixecog TTL characterization notebook (30+ cells, key findings buried in 5-6 plots)

Source context: pixecog

PixEcog (pixecog): Neuropixels and ECoG dataset and analysis

Recent commits:

6b295b2 Update badlabel audit note: full pipeline comparison (85 new vs 7 legacy vs 71 TTL), zarr fix confirmed, 97.2% TTL catch rate
a532476 Fix zarr write bug in feature.py: include coordinate variables in chunk encoding
9cb9de5 Update badlabel audit result note with pipeline run findings: LNR useless after lowpass, zarr v3 write bug

README:

Quick Start for Collaborators

Follow this checklist to get started with Pixecog documentation and workflows.

🐀 Pixecog Project: Compact Overview

Core principles

One immutable BIDS raw dataset (raw/) as the canonical baseline
Each analysis pipeline ha

idea-arash-20260408-035035-245990.md — Both concern notebooks producing structured result outputs; Auto-QC idea is a direct use case for pipeio_nb_report
idea-arash-20260330-162752-883803.md — pipeio smart read tools / notebook metadata extraction is the same design space as nb_report
idea-arash-20260409-135130-379286.md — pipeio docs_collect manifest idea overlaps with saving extracted PNGs/markdown to docs/assets/reports
idea-arash-20260407-171834-423514.md — Configurable docs paths is directly relevant to where nb_report writes its output assets