
Handbook + Workshop initiative — architecture, conceptual frame, tools

Initiative

Public handbook + blog + workshop in the spirit of goodresearch.dev (Patrick Mineault), xcorr.net (also Mineault), and cartesian.app (Elias Yilma — interactive DSA handbook). Two audiences: Arash himself (living workflow doc) and other researchers.

Anchor deliverable: 4-day practical workshop in September 2026, department-facing.

  • Days 1–3: lecture + hands-on, 6 h/day
  • Day 4: participant presentations (~20 min each)
  • Scope: BIDS + DataLad + Snakemake (open-science foundation) + projio (agentic toolchain) + worklog (orchestration)
  • Working title: The Agentic Research Workflow — final TBD.

Architecture decision (2026-05-07)

Two distinct workspaces, not one:

| Surface | Location | Lifecycle |
| --- | --- | --- |
| Handbook + blog | projio/docs/handbook/ and projio/docs/blog/ (this workspace) | Living, evergreen, projio's existing docs site |
| Workshop course materials | /storage2/arash/teaching/agentic-workshop/2026-09/ (NEW workspace, not yet provisioned) | Time-bounded; iterations as sibling dirs (2027-XX/, etc.) |

Why the split:

  • Handbook = textbook (curated, polished, evergreen). Workshop = course (scaffolded for first exposure, time-stamped, has datasets/exercises/participant artifacts).
  • Mixing pollutes handbook with workshop ephemera (rosters, dry-run notes, post-mortems).
  • Workshop iteration N+1 should be a sibling dir, not branch history.

Cross-reference, don't co-locate. Workshop materials link into handbook chapters; original workshop content (exercise specs, datasets, learning outcomes, day-4 rubric) lives only in teaching/agentic-workshop/.

The Rust Book pattern (separate book repo, branded under language site) was discussed but rejected for now: Arash is sole author of both projio and the handbook → no community/cadence reason to fork repos. Revisit only if projio becomes a public/open-source tool that other researchers adopt.

Sirocampus is NOT involved. It's the lab-shared repo, not a generic publish surface.

Workshop directory layout (planned, not yet created)

teaching/agentic-workshop/
├── README.md                       # iteration scheme
├── 2026-09/
│   ├── _quarto.yml                 # multi-output: website + book + revealjs slides
│   ├── announcement.md             # see existing draft (in conversation)
│   ├── syllabus.qmd
│   ├── pre-workshop-setup.qmd
│   ├── day-1-foundations/          # BIDS + DataLad + Snakemake
│   │   ├── lecture.qmd             # → revealjs
│   │   ├── handout.qmd             # → website + book chapter
│   │   └── exercises/              # marimo notebooks
│   ├── day-2-agentic/              # Claude Code, MCP, projio
│   ├── day-3-orchestration/        # projio ecosystem + worklog
│   ├── day-4-presentations/        # rubric + participant template
│   ├── participants/               # roster, submissions
│   └── post-mortem.qmd
└── shared/                         # cross-iteration: scaffolds, datasets, slide templates
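
Since the workspace is planned but not yet provisioned, the tree above could be scaffolded with a short script. The following is a minimal sketch, not an existing projio or worklog tool: `scaffold_iteration` is a hypothetical helper, it only creates empty placeholders, and the demo writes into a temporary directory rather than the real /storage2 path.

```python
from pathlib import Path
import tempfile

# Planned layout, copied from the tree above; nothing here exists on disk yet.
ITERATION_DIRS = [
    "day-1-foundations/exercises",
    "day-2-agentic",
    "day-3-orchestration",
    "day-4-presentations",
    "participants",
]
ITERATION_FILES = [
    "_quarto.yml",
    "announcement.md",
    "syllabus.qmd",
    "pre-workshop-setup.qmd",
    "day-1-foundations/lecture.qmd",
    "day-1-foundations/handout.qmd",
    "post-mortem.qmd",
]


def scaffold_iteration(root: Path, iteration: str) -> Path:
    """Create empty placeholders for one workshop iteration under root.

    Hypothetical helper: safe to re-run, never overwrites file contents.
    """
    (root / "shared").mkdir(parents=True, exist_ok=True)
    (root / "README.md").touch()
    base = root / iteration
    for rel in ITERATION_DIRS:
        (base / rel).mkdir(parents=True, exist_ok=True)
    for rel in ITERATION_FILES:
        path = base / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    return base


if __name__ == "__main__":
    # Demo in a throwaway directory, not the real teaching workspace.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "agentic-workshop"
        base = scaffold_iteration(root, "2026-09")
        for p in sorted(base.rglob("*")):
            print(p.relative_to(root))
```

Calling the same function with a later iteration name (for example a 2027 directory) would produce the sibling-dir structure described in the architecture decision, leaving `shared/` untouched.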

Tooling decisions

Quarto project (not Quarto book) for the workshop workspace. Single source → website + book + slides + executable notebooks. Don't force the book metaphor on heterogeneous content; render book as one output among several.

mkdocs-material stays for the handbook (projio's existing docs site uses it; pipeio_mkdocs_nav_patch MCP tool exists). Don't migrate projio docs to Quarto.

Marimo (already in Arash's stack; marimo-pair skill exists):

  • Workshop hands-on notebooks (each afternoon session).
  • Handbook explorables: marimo export html-wasm produces standalone interactive HTML, no backend, embeddable per chapter.
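
To make the per-chapter export concrete, here is a small sketch that builds the `marimo export html-wasm` command line for one notebook. The helper name, the notebook path, and the output directory are assumptions for illustration only. The `-o` and `--mode` flags reflect the marimo CLI as I understand it, so check them against `marimo export html-wasm --help` for the installed version.

```python
from pathlib import Path


def wasm_export_command(notebook: Path, out_dir: Path, mode: str = "run") -> list[str]:
    """Build the argv for exporting one marimo notebook to standalone WASM HTML.

    Hypothetical helper; each notebook gets its own output folder named
    after the notebook file so chapters can embed them independently.
    """
    if mode not in {"run", "edit"}:
        raise ValueError(f"unsupported mode: {mode}")
    return [
        "marimo", "export", "html-wasm",
        str(notebook),
        "-o", str(out_dir / notebook.stem),
        "--mode", mode,
    ]


# Illustrative paths only; neither exists in the handbook yet.
cmd = wasm_export_command(Path("explorables/dag_state.py"), Path("site/explorables"))
print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from a docs build step. Building the argv separately keeps the command easy to inspect before anything is executed.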

Manim — invest in one asset: 3–5 min opening animation for the workshop showing data → pipeline → dispatch → result through projio architecture vs. manual workflow. High-leverage, reusable across iterations and as blog/social asset.

Observable — for blog essays only (interactive arguments à la Bret Victor / Nicky Case), not handbook. One worked essay (e.g., "How a worklog goal becomes a dispatched task") would teach more than a static post.

Conceptual frame (borrowed from the Deep Research PDF)

Reference: docs/reference/research/Interactive Mathematics Beyond the Static Page.pdf (22-page ChatGPT Deep Research, 2026-05-07). Migrated from worklog 2026-05-11.

Key conceptual transfers:

  1. The paradigm taxonomy (domain coloring, reactive notebooks, executable notebooks, symbolic-numeric hybrids, diagrammatic reasoning, geometry viewers, expository animation, proof-aware docs) translates to research workflow. Seven of the eight paradigms listed have an analog so far; symbolic-numeric hybrids is still unmapped:

| Math paradigm | Research-workflow analog | What it makes visible |
| --- | --- | --- |
| Domain coloring | Pipeline state heatmap | Where in DAG; what's stale |
| Reactive notebook | Marimo / Observable for analyses | Param → cascade |
| Executable notebook | Jupyter / Marimo narratives | Code + prose + figure together |
| Diagrammatic | Snakemake DAG, projio dependency graph | Composition of computations |
| Proof-aware doc | Provenance-aware doc (claim ↔ data ↔ commit) | Reproducibility as navigable object |
| Expository animation | Manim explainer | Why a workflow is shaped a certain way |
| Geometry viewer | Project topology viewer (multi-repo, cross-link) | What touches what |

This frame should anchor handbook chapter 1 ("Why interactivity matters for research practice") and provide vocabulary used throughout.

  2. The gap argument: the PDF identifies operator theory as math's interactivity gap. Analog: research workflow itself has no mature interactive layer. Tools exist for individual computations (Jupyter, Snakemake, DataLad); nothing makes the whole research workflow a manipulable object the way complex-analysis.com makes complex functions manipulable. projio + worklog + handbook are filling that gap. Handbook should say so explicitly.

  3. Invent reusable "media forms" intentionally. The PDF observes that Sanderson, IMAGINARY, and JuliaDynamics didn't just make content; they invented forms others reuse. Candidate forms for projio's contribution to the open-science vocabulary:

     • Workflow trace: visual representation of a research session (notes → tasks → dispatched runs → outputs).
     • Permission scope diagram: what a project's sandbox can/can't do.
     • Goal critical path: worklog already computes this; render as reusable component.

  4. Fragility warning. The PDF cites Quantomatic ("elegant research software can become historically important before it becomes infrastructurally stable"). Direct application: projio + worklog are single-creator. The PDF's prescription: code + docs + examples + community governance. The handbook is the docs+examples leg; the workshop is the start of community. This makes the handbook+workshop pair part of projio's survival strategy, not just outreach.
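
For the "goal critical path" media form mentioned in the list above, it may help to pin down what is being rendered. The sketch below is a generic longest-path computation over a task dependency graph. It is an illustration under stated assumptions, not worklog's actual implementation or data model; the task names and durations are made up and only loosely follow the notes → tasks → dispatched runs → outputs trace.

```python
from __future__ import annotations

from graphlib import TopologicalSorter


def critical_path(
    durations: dict[str, float], deps: dict[str, set[str]]
) -> tuple[float, list[str]]:
    """Return (total duration, task names) of the longest dependency chain.

    durations maps task -> estimated effort; deps maps task -> prerequisites.
    Every prerequisite must also appear in durations.
    """
    graph = {task: deps.get(task, set()) for task in durations}
    finish: dict[str, float] = {}
    prev: dict[str, str | None] = {}
    # Visit tasks so that prerequisites are always processed first.
    for task in TopologicalSorter(graph).static_order():
        best = max(graph[task], key=lambda d: finish[d], default=None)
        finish[task] = durations[task] + (finish[best] if best is not None else 0.0)
        prev[task] = best
    # Walk back from the task that finishes last.
    node: str | None = max(finish, key=lambda t: finish[t])
    total = finish[node]
    path: list[str] = []
    while node is not None:
        path.append(node)
        node = prev[node]
    return total, path[::-1]


# Hypothetical goal with five tasks (durations in hours).
durations = {
    "collect-notes": 1,
    "define-tasks": 2,
    "dispatch-runs": 5,
    "write-figures": 3,
    "review-outputs": 1,
}
deps = {
    "define-tasks": {"collect-notes"},
    "dispatch-runs": {"define-tasks"},
    "write-figures": {"define-tasks"},
    "review-outputs": {"dispatch-runs", "write-figures"},
}
total, path = critical_path(durations, deps)
print(total, " -> ".join(path))
```

A reusable component would take the returned path and highlight it on the goal's task graph; the side branch (`write-figures` here) is what a viewer would show as slack.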

Inspirations / reading list

Beyond goodresearch / xcorr / cartesian:

  • Solo handbooks: Jenny Bryan (Happy Git with R), Hadley Wickham (R Packages, R for Data Science), Karpathy ("Recipe for Training Neural Networks"), Stas Bekman (ml-engineering), Google DL Tuning Playbook, Vince Buffalo (Bioinformatics Data Skills), The Turing Way.
  • Personal essay-blogs: Simon Willison (best living example of cadence + note-to-blog flow), Julia Evans, Lilian Weng, Jay Alammar, Chris Olah, Andy Matuschak, Eugene Yan, Chip Huyen, Maggie Appleton, Dan Luu.
  • Interactive / explorable: Bartosz Ciechanowski (gold standard), Amit Patel (Red Blob Games), Nicky Case, Bret Victor, Distill.pub, Setosa.io, Seeing Theory, Immersive Linear Algebra.
  • Neuroscience-specific: Mike X Cohen (Analyzing Neural Time Series Data), Russ Poldrack (Statistical Thinking for the 21st Century), Neuromatch Academy.

Top three to study as templates (per session synthesis):

  1. Bartosz Ciechanowski: ceiling of solo-author site.
  2. Simon Willison: sustainable cadence + note-to-blog publishing flow.
  3. Hadley Wickham's R books: multi-surface single-source pattern (handbook + blog + book + workshops).

Workshop structure (sketch — needs deepening)

| Day | Morning (3h) | Afternoon (3h) |
| --- | --- | --- |
| 1: Reproducible foundations | BIDS + DataLad | Snakemake |
| 2: Agentic AI for code | Claude Code, MCP model, projio scaffold | Working with the agent: routing, context, cost/safety |
| 3: Workflow orchestration | projio ecosystem (codio, biblio, figio, notio, manuscript) | worklog: goals, tasks, scheduling, dispatch |
| 4: Participant presentations | ~20-min presentations + group discussion | (cont.) |

Not yet decided. These are syllabus-level decisions for teaching/agentic-workshop/2026-09/syllabus.qmd:

  • Per-session learning outcomes
  • Per-session hands-on exercise specs (which dataset → which end state)
  • Pre-workshop setup checklist for participants
  • Backup plans (no internet, agent rate-limited, weird BYO data)
  • Day-4 assessment rubric
  • Post-workshop deliverables (certificate? template repo? mailing list?)

Workshop announcement draft

Drafted in this session under the working title The Agentic Research Workflow. TODO: move into teaching/agentic-workshop/2026-09/announcement.md once that workspace is provisioned (see task note). Pull from session log if not yet captured.

Things to fill in before announcing: department + room + exact dates, capacity (suggest 8–12), hardware/Claude credits policy, prereq strictness, audience scope (PhD only? postdocs? PIs?).
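
The suggested capacity can be checked against the day-4 format with simple arithmetic. The sketch below assumes day 4 runs the same 6 hours as days 1–3 (3 h morning + 3 h afternoon) and that every participant presents for the full ~20 minutes; both are assumptions, since the final schedule is not fixed.

```python
def day4_minutes(participants: int, talk_min: int = 20, day_hours: int = 6) -> tuple[int, int]:
    """Return (minutes of talks, minutes left for discussion and breaks)."""
    talks = participants * talk_min
    return talks, day_hours * 60 - talks


for n in (8, 12):
    talks, slack = day4_minutes(n)
    print(f"{n} participants: {talks} min of talks, {slack} min left")
```

At the upper bound of 12 participants, talks take 240 of the 360 available minutes, leaving 120 minutes for group discussion and breaks. Capacity much above 12 would start to squeeze the discussion time.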

Open questions deferred

  1. Will projio become a public open-source tool with external adopters? Decision affects whether the handbook ever needs to extract to its own repo (Rust Book pattern). Not blocking today.
  2. Is the blog a section of the handbook site, or eventually its own surface? "Same site for now" — revisit after first ~10 essays.
  3. Final public title: coherence, personal handle (Mineault/Wickham-style), or something else. Decide closer to first launch.

Conversation provenance

Session 2026-05-07. Earlier conversation steps remembered in Claude memory:

  • project_handbook_blog.md: initiative meta
  • project_research_priorities.md: competing pull from TAC III (May) and lab progress reports
  • feedback_lab_deck_publish_flow.md: sirocampus is meeting/lab-shared repo, not generic publish surface