Feature request: pipeio_nb_report — extract notebook results into a structured report¶
Problem¶
Explore notebooks accumulate many cells (imports, data loading, reshaping, intermediate debugging) mixed with the actual findings (figures, metrics, interpretations). Reading through a 500+ line notebook to find the key results is slow, especially for sharing with collaborators or preparing meeting notes.
Proposed tool: pipeio_nb_report(flow, name)¶
An MCP tool that extracts the meaningful outputs from an executed notebook and produces a structured report.
Implementation approach¶
Use nbconvert's MarkdownExporter + ExtractOutputPreprocessor (already installed, no new deps):
from nbconvert import MarkdownExporter
from nbconvert.preprocessors import ExtractOutputPreprocessor
exporter = MarkdownExporter()
exporter.register_preprocessor(ExtractOutputPreprocessor, enabled=True)
body, resources = exporter.from_filename("notebook.ipynb")
# body = markdown with image references
# resources['outputs'] = dict of {filename: bytes} for extracted PNGs
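The `resources['outputs']` mapping still has to be written to disk. A minimal sketch of that step, using only the standard library, is shown below. The function name `save_extracted_outputs` and the `root` parameter are hypothetical; only the `docs/assets/reports/<flow>/<name>/` layout comes from this note.

```python
from pathlib import Path
from typing import Dict, List


def save_extracted_outputs(
    outputs: Dict[str, bytes],
    flow: str,
    name: str,
    root: Path = Path("docs/assets/reports"),  # assumed default, taken from the layout in this note
) -> List[Path]:
    """Write a {filename: bytes} mapping into <root>/<flow>/<name>/ and return the written paths."""
    dest = Path(root) / flow / name
    dest.mkdir(parents=True, exist_ok=True)

    written = []
    for filename, data in sorted(outputs.items()):
        # Keep only the base name so a key can never escape the report directory.
        target = dest / Path(filename).name
        target.write_bytes(data)
        written.append(target)
    return written
```

Sorting the keys keeps the returned list stable between runs, which makes the payload easier to diff.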
Tool behavior¶
- Read the executed .ipynb (from the workspace dir, after pipeio_nb_exec)
- Run nbconvert MarkdownExporter + ExtractOutputPreprocessor
- Save extracted PNGs to docs/assets/reports/<flow>/<name>/
- Return structured payload:
  - markdown_cells: the narrative text (what's being computed, why)
  - figures: [{path, cell_index, caption_if_any}]
  - text_outputs: [{cell_index, content}] (print statements with metrics)
  - code_cells: optional, for context
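One way the payload above could be assembled is sketched here. As an assumption for illustration, it reads the .ipynb JSON directly with the standard library rather than going through nbconvert, because that keeps the cell_index of every output explicit. The function name, the `include_code` flag, the figure file naming, and the shape of the markdown_cells entries are all hypothetical. `caption_if_any` is left as None because this note does not yet define where captions come from.

```python
import base64
import json
from pathlib import Path
from typing import Any, Dict


def _text(value: Any) -> str:
    """Notebook JSON stores multi-line text either as one string or as a list of lines."""
    return "".join(value) if isinstance(value, list) else str(value)


def build_report_payload(nb_path: Path, assets_dir: Path, include_code: bool = False) -> Dict[str, Any]:
    """Collect narrative text, PNG figures and stream outputs from an executed notebook."""
    nb = json.loads(Path(nb_path).read_text(encoding="utf-8"))
    assets_dir = Path(assets_dir)
    assets_dir.mkdir(parents=True, exist_ok=True)
    payload: Dict[str, Any] = {
        "markdown_cells": [], "figures": [], "text_outputs": [], "code_cells": [],
    }

    for cell_index, cell in enumerate(nb.get("cells", [])):
        source = _text(cell.get("source", ""))
        if cell.get("cell_type") == "markdown":
            payload["markdown_cells"].append({"cell_index": cell_index, "content": source})
            continue
        if cell.get("cell_type") != "code":
            continue  # skip raw cells
        if include_code:
            payload["code_cells"].append({"cell_index": cell_index, "content": source})

        for out_index, output in enumerate(cell.get("outputs", [])):
            if output.get("output_type") == "stream":
                # print() output, typically where metrics end up
                payload["text_outputs"].append(
                    {"cell_index": cell_index, "content": _text(output.get("text", ""))}
                )
                continue
            png = output.get("data", {}).get("image/png")
            if png is not None:
                # image/png is stored base64-encoded in the notebook JSON
                path = assets_dir / f"cell{cell_index}_{out_index}.png"
                path.write_bytes(base64.b64decode(_text(png)))
                payload["figures"].append(
                    {"path": str(path), "cell_index": cell_index, "caption_if_any": None}
                )
    return payload
```

If the nbconvert route is preferred, the same payload shape could be filled from `body` and `resources['outputs']` instead. The direct JSON walk is only meant to show the data involved.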
Companion slash command: /report¶
Takes the pipeio_nb_report output and asks the agent to write a result note (in docs/log/result/) with:
- Concise description of each analysis step (the math, not the code)
- Embedded figures with captions and interpretation
- Key metrics highlighted
- Overall conclusions
Cell tagging (optional enhancement)¶
Allow notebook authors to tag cells for inclusion in the report using a comment marker (e.g., # %% [report] or # REPORT: prefix in markdown cells). The tool would then only extract tagged cells, producing a curated subset rather than the full notebook dump.
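The marker-based selection could look roughly like the sketch below. It assumes a cell counts as tagged when its first non-empty line starts with one of the two example markers from this note. The helper names are hypothetical, and the exact marker syntax is still open.

```python
from typing import Any, Dict, List, Tuple

# Example markers from this note; the final syntax is not decided.
REPORT_MARKERS = ("# %% [report]", "# REPORT:")


def _first_line(cell: Dict[str, Any]) -> str:
    """Return the first non-empty line of a cell's source, stripped of whitespace."""
    source = cell.get("source", "")
    text = "".join(source) if isinstance(source, list) else str(source)
    for line in text.splitlines():
        if line.strip():
            return line.strip()
    return ""


def select_report_cells(cells: List[Dict[str, Any]]) -> List[Tuple[int, Dict[str, Any]]]:
    """Return (cell_index, cell) pairs for cells whose first line carries a report marker."""
    return [
        (index, cell)
        for index, cell in enumerate(cells)
        if _first_line(cell).startswith(REPORT_MARKERS)
    ]
```

Returning the original index alongside each cell keeps cell_index in the payload consistent with the full notebook, even when only a subset is extracted.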
Longer-term: MyST embed¶
We already generate .myst paired files via jupytext. MyST-NB supports labeling cells (#| label: my-figure) and embedding their outputs in any other page. This would allow docs/plan/ or docs/manuscript/ pages to embed notebook figures directly without copying, but it requires adopting MyST-NB rendering in the mkdocs build.
References¶
- nbconvert ExtractOutputPreprocessor: https://nbconvert.readthedocs.io/en/latest/nbconvert_library.html
- MyST embed/reuse: https://mystmd.org/guide/reuse-jupyter-outputs
- Junix (simpler CLI alternative): https://pypi.org/project/junix/
- Motivated by pixecog TTL characterization notebook (30+ cells, key findings buried in 5-6 plots)
Source context: pixecog¶
PixEcog (pixecog): Neuropixels and ECoG dataset and analysis
Recent commits:
6b295b2 Update badlabel audit note: full pipeline comparison (85 new vs 7 legacy vs 71 TTL), zarr fix confirmed, 97.2% TTL catch rate
a532476 Fix zarr write bug in feature.py: include coordinate variables in chunk encoding
9cb9de5 Update badlabel audit result note with pipeline run findings: LNR useless after lowpass, zarr v3 write bug
README:
Quick Start for Collaborators¶
Follow this checklist to get started with Pixecog documentation and workflows.
🐀 Pixecog Project — Compact Overview¶
Core principles
- One immutable BIDS raw dataset (raw/) as the canonical baseline
- Each analysis pipeline ha
Related Notes¶
- idea-arash-20260408-035035-245990.md — Both concern notebooks producing structured result outputs; Auto-QC idea is a direct use case for pipeio_nb_report
- idea-arash-20260330-162752-883803.md — pipeio smart read tools / notebook metadata extraction is the same design space as nb_report
- idea-arash-20260409-135130-379286.md — pipeio docs_collect manifest idea overlaps with saving extracted PNGs/markdown to docs/assets/reports
- idea-arash-20260407-171834-423514.md — Configurable docs paths is directly relevant to where nb_report writes its output assets