Handbook + Workshop initiative — architecture, conceptual frame, tools
Initiative
Public handbook + blog + workshop in the spirit of goodresearch.dev (Patrick Mineault), xcorr.net (also Mineault), and cartesian.app (Elias Yilma — interactive DSA handbook). Two audiences: Arash himself (living workflow doc) and other researchers.
Anchor deliverable: 4-day practical workshop in September 2026, department-facing.
- Days 1–3: lecture + hands-on, 6 h/day
- Day 4: participant presentations (~20 min each)
- Scope: BIDS + DataLad + Snakemake (open-science foundation) + projio (agentic toolchain) + worklog (orchestration)
- Working title: The Agentic Research Workflow — final TBD.
Architecture decision (2026-05-07)
Two distinct workspaces, not one:
| Surface | Location | Lifecycle |
|---|---|---|
| Handbook + blog | projio/docs/handbook/ and projio/docs/blog/ (this workspace) | Living, evergreen, projio's existing docs site |
| Workshop course materials | /storage2/arash/teaching/agentic-workshop/2026-09/ (NEW workspace, not yet provisioned) | Time-bounded; iterations as sibling dirs (2027-XX/, etc.) |
Why the split:
- Handbook = textbook (curated, polished, evergreen). Workshop = course (scaffolded for first exposure, time-stamped, has datasets/exercises/participant artifacts).
- Mixing pollutes handbook with workshop ephemera (rosters, dry-run notes, post-mortems).
- Workshop iteration N+1 should be a sibling dir, not branch history.
Cross-reference, don't co-locate. Workshop materials link into handbook chapters; original workshop content (exercise specs, datasets, learning outcomes, day-4 rubric) lives only in teaching/agentic-workshop/.
The Rust Book pattern (separate book repo, branded under language site) was discussed but rejected for now: Arash is sole author of both projio and the handbook → no community/cadence reason to fork repos. Revisit only if projio becomes a public/open-source tool that other researchers adopt.
Sirocampus is NOT involved. It's the lab-shared repo, not a generic publish surface.
Workshop directory layout (planned, not yet created)
teaching/agentic-workshop/
├── README.md # iteration scheme
├── 2026-09/
│ ├── _quarto.yml # multi-output: website + book + revealjs slides
│ ├── announcement.md # see existing draft (in conversation)
│ ├── syllabus.qmd
│ ├── pre-workshop-setup.qmd
│ ├── day-1-foundations/ # BIDS + DataLad + Snakemake
│ │ ├── lecture.qmd # → revealjs
│ │ ├── handout.qmd # → website + book chapter
│ │ └── exercises/ # marimo notebooks
│ ├── day-2-agentic/ # Claude Code, MCP, projio
│ ├── day-3-orchestration/ # projio ecosystem + worklog
│ ├── day-4-presentations/ # rubric + participant template
│ ├── participants/ # roster, submissions
│ └── post-mortem.qmd
└── shared/ # cross-iteration: scaffolds, datasets, slide templates
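Since the workspace is not yet provisioned, the planned tree above could be created with a small script. The following is a minimal sketch, not an existing projio or worklog tool: the function name `scaffold_iteration` and the choice to create empty placeholder files are assumptions made for illustration, and the file list simply mirrors the layout above.

```python
from pathlib import Path

# Files and directories copied from the planned layout; all are empty placeholders.
ITERATION_FILES = [
    "_quarto.yml",
    "announcement.md",
    "syllabus.qmd",
    "pre-workshop-setup.qmd",
    "post-mortem.qmd",
    "day-1-foundations/lecture.qmd",
    "day-1-foundations/handout.qmd",
]
ITERATION_DIRS = [
    "day-1-foundations/exercises",
    "day-2-agentic",
    "day-3-orchestration",
    "day-4-presentations",
    "participants",
]


def scaffold_iteration(root: Path, iteration: str) -> Path:
    """Create the placeholder tree for one workshop iteration under `root`.

    Hypothetical helper: existing files are left untouched, so re-running is safe.
    """
    root.mkdir(parents=True, exist_ok=True)
    (root / "README.md").touch(exist_ok=True)  # documents the iteration scheme
    (root / "shared").mkdir(exist_ok=True)     # cross-iteration assets
    base = root / iteration
    for rel in ITERATION_DIRS:
        (base / rel).mkdir(parents=True, exist_ok=True)
    for rel in ITERATION_FILES:
        path = base / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch(exist_ok=True)
    return base
```

Calling `scaffold_iteration(Path("teaching/agentic-workshop"), "2026-09")` would produce the first iteration; a later iteration is the same call with a different name, which keeps iterations as sibling directories rather than branch history.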
Tooling decisions
Quarto project (not Quarto book) for the workshop workspace. Single source → website + book + slides + executable notebooks. Don't force the book metaphor on heterogeneous content; render book as one output among several.
mkdocs-material stays for the handbook (projio's existing docs site uses it; pipeio_mkdocs_nav_patch MCP tool exists). Don't migrate projio docs to Quarto.
Marimo (already in Arash's stack — marimo-pair skill exists):
- Workshop hands-on notebooks (each afternoon session).
- Handbook explorables: marimo export html-wasm produces standalone interactive HTML, no backend, embeddable per chapter.
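The per-chapter export mentioned above could be batched with a short script. This is a sketch under stated assumptions: the directory names are hypothetical, the helper names are invented for illustration, and the `-o` and `--mode` options should be checked against the installed marimo version before relying on them.

```python
import subprocess
from pathlib import Path


def wasm_export_command(notebook: Path, out_dir: Path, mode: str = "run") -> list[str]:
    """Build the `marimo export html-wasm` command for one notebook.

    Each notebook gets its own output folder named after the notebook file.
    """
    if mode not in {"run", "edit"}:
        raise ValueError(f"mode must be 'run' or 'edit', got {mode!r}")
    return [
        "marimo", "export", "html-wasm", str(notebook),
        "-o", str(out_dir / notebook.stem),
        "--mode", mode,
    ]


def export_explorables(notebook_dir: Path, out_dir: Path, dry_run: bool = True) -> list[list[str]]:
    """Collect export commands for every notebook in `notebook_dir`.

    With dry_run=True (the default) nothing is executed; the commands are
    only returned so they can be inspected first.
    """
    commands = [
        wasm_export_command(nb, out_dir)
        for nb in sorted(notebook_dir.glob("*.py"))
    ]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # raises if marimo exits non-zero
    return commands
```

Keeping command construction separate from execution means the list can be reviewed, or wired into a docs build step, without marimo being installed on the machine that plans the build.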
Manim — invest in one asset: 3–5 min opening animation for the workshop showing data → pipeline → dispatch → result through projio architecture vs. manual workflow. High-leverage, reusable across iterations and as blog/social asset.
Observable — for blog essays only (interactive arguments à la Bret Victor / Nicky Case), not handbook. One worked essay (e.g., "How a worklog goal becomes a dispatched task") would teach more than a static post.
Conceptual frame (borrowed from the Deep Research PDF)
Reference: docs/reference/research/Interactive Mathematics Beyond the Static Page.pdf (22-page ChatGPT Deep Research, 2026-05-07). Migrated from worklog 2026-05-11.
Key conceptual transfers:
- The paradigm taxonomy (domain coloring, reactive notebooks, executable notebooks, symbolic-numeric hybrids, diagrammatic reasoning, geometry viewers, expository animation, proof-aware docs) translates to research workflow, with symbolic-numeric hybrids not yet mapped:
| Math paradigm | Research-workflow analog | What it makes visible |
|---|---|---|
| Domain coloring | Pipeline state heatmap | Where in DAG; what's stale |
| Reactive notebook | Marimo / Observable for analyses | Param → cascade |
| Executable notebook | Jupyter / Marimo narratives | Code + prose + figure together |
| Diagrammatic | Snakemake DAG, projio dependency graph | Composition of computations |
| Proof-aware doc | Provenance-aware doc (claim ↔ data ↔ commit) | Reproducibility as navigable object |
| Expository animation | Manim explainer | Why a workflow is shaped a certain way |
| Geometry viewer | Project topology viewer (multi-repo, cross-link) | What touches what |
This frame should anchor handbook chapter 1 ("Why interactivity matters for research practice") and provide vocabulary used throughout.
- The gap argument: PDF identifies operator theory as math's interactivity gap. Analog: research workflow itself has no mature interactive layer. Tools exist for individual computations (Jupyter, Snakemake, DataLad); nothing makes the whole research workflow a manipulable object the way complex-analysis.com makes complex functions manipulable. projio + worklog + handbook are filling that gap. Handbook should say so explicitly.
- Invent reusable "media forms" intentionally. PDF observes that Sanderson, IMAGINARY, JuliaDynamics didn't just make content; they invented forms others reuse. Candidate forms for projio's contribution to the open-science vocabulary:
    - Workflow trace: visual rep of a research session (notes → tasks → dispatched runs → outputs).
    - Permission scope diagram: what a project's sandbox can/can't do.
    - Goal critical path: worklog already computes this; render as reusable component.
- Fragility warning. PDF: Quantomatic ("elegant research software can become historically important before it becomes infrastructurally stable"). Direct application: projio + worklog are single-creator. PDF prescription: code + docs + examples + community governance. The handbook is the docs+examples leg; the workshop is the start of community. This makes the handbook+workshop pair part of projio's survival strategy, not just outreach.
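To make the "goal critical path" form from the list above concrete, here is a minimal sketch of a critical-path computation over a task dependency graph. It is not worklog's implementation, which this note does not describe: the task names, durations, and the `critical_path` function are hypothetical, and the technique is a plain longest-path pass over a DAG in topological order.

```python
from graphlib import TopologicalSorter


def critical_path(durations, depends_on):
    """Return (total duration, ordered task list) of the longest dependency chain.

    durations:  {task: duration}, one entry for every task
    depends_on: {task: set of tasks that must finish first}
    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    graph = {task: depends_on.get(task, set()) for task in durations}
    finish = {}     # earliest finish time of each task
    best_pred = {}  # predecessor on the longest chain into each task
    for task in TopologicalSorter(graph).static_order():
        preds = depends_on.get(task, set())
        # The slowest prerequisite determines when this task can start.
        prev = max(preds, key=lambda p: finish[p], default=None)
        best_pred[task] = prev
        start = finish[prev] if prev is not None else 0.0
        finish[task] = start + durations[task]
    # Walk back from the task that finishes last.
    node = max(finish, key=finish.get)
    total = finish[node]
    path = []
    while node is not None:
        path.append(node)
        node = best_pred[node]
    return total, path[::-1]


# Hypothetical goal: durations in hours, invented for illustration only.
durations = {"notes": 1, "preprocess": 3, "fit": 5, "figures": 2, "write": 2}
depends_on = {
    "preprocess": {"notes"},
    "fit": {"preprocess"},
    "figures": {"preprocess"},
    "write": {"fit", "figures"},
}
total, path = critical_path(durations, depends_on)
print(total, path)
```

A reusable component would render the returned path as the highlighted chain in the goal's task graph; tasks off the path (here `figures`) have slack and can slip without delaying the goal.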
Inspirations / reading list
Beyond goodresearch / xcorr / cartesian:
- Solo handbooks: Jenny Bryan (Happy Git with R), Hadley Wickham (R Packages, R for Data Science), Karpathy ("Recipe for Training Neural Networks"), Stas Bekman (ml-engineering), Google DL Tuning Playbook, Vince Buffalo (Bioinformatics Data Skills), The Turing Way.
- Personal essay-blogs: Simon Willison (best living example of cadence + note-to-blog flow), Julia Evans, Lilian Weng, Jay Alammar, Chris Olah, Andy Matuschak, Eugene Yan, Chip Huyen, Maggie Appleton, Dan Luu.
- Interactive / explorable: Bartosz Ciechanowski (gold standard), Amit Patel (Red Blob Games), Nicky Case, Bret Victor, Distill.pub, Setosa.io, Seeing Theory, Immersive Linear Algebra.
- Neuroscience-specific: Mike X Cohen (Analyzing Neural Time Series Data), Russ Poldrack (Statistical Thinking for the 21st Century), Neuromatch Academy.
Top three to study as templates (per session synthesis):
1. Bartosz Ciechanowski: ceiling of solo-author site.
2. Simon Willison: sustainable cadence + note-to-blog publishing flow.
3. Hadley Wickham's R books: multi-surface single-source pattern (handbook + blog + book + workshops).
Workshop structure (sketch, needs deepening)
| Day | Morning (3h) | Afternoon (3h) |
|---|---|---|
| 1 — Reproducible foundations | BIDS + DataLad | Snakemake |
| 2 — Agentic AI for code | Claude Code, MCP model, projio scaffold | Working with the agent: routing, context, cost/safety |
| 3 — Workflow orchestration | projio ecosystem (codio, biblio, figio, notio, manuscript) | worklog: goals, tasks, scheduling, dispatch |
| 4 — Participant presentations | ~20-min presentations + group discussion | (cont.) |
Not yet decided — these are syllabus-level decisions for teaching/agentic-workshop/2026-09/syllabus.qmd:
- Per-session learning outcomes
- Per-session hands-on exercise specs (which dataset → which end state)
- Pre-workshop setup checklist for participants
- Backup plans (no internet, agent rate-limited, weird BYO data)
- Day-4 assessment rubric
- Post-workshop deliverables (certificate? template repo? mailing list?)
Workshop announcement draft
Drafted in this session under the working title The Agentic Research Workflow. TODO: move into teaching/agentic-workshop/2026-09/announcement.md once that workspace is provisioned (see task note). Pull from session log if not yet captured.
Things to fill in before announcing: department + room + exact dates, capacity (suggest 8–12), hardware/Claude credits policy, prereq strictness, audience scope (PhD only? postdocs? PIs?).
Open questions deferred
- Will projio become a public open-source tool with external adopters? Decision affects whether the handbook ever needs to extract to its own repo (Rust Book pattern). Not blocking today.
- Is the blog a section of the handbook site, or eventually its own surface? "Same site for now" — revisit after first ~10 essays.
- Final public title: coherence, personal handle (Mineault/Wickham-style), or something else. Decide closer to first launch.
Conversation provenance
Session 2026-05-07. Earlier conversation steps remembered in Claude memory:
- project_handbook_blog.md — initiative meta
- project_research_priorities.md — competing pull from TAC III (May) and lab progress reports
- feedback_lab_deck_publish_flow.md — sirocampus is meeting/lab-shared repo, not generic publish surface