OSF integration: manifest, push/pull, registration, preprint

Context

Projio and OSF are complementary: projio is the local-first working environment, while OSF is a cloud façade for publication, registration, and archiving that offers free DataCite DOIs, immutable registrations, preprints, and ORCID-anchored contributors. The guiding principle mirrors OSF's own add-on model: link, don't copy. The repo stays the source of truth.

Full spec: docs/specs/osf-integration.md.

Goal

One-command publication of a projio repo to OSF with a DOI, without duplicating metadata and without compromising the local-first model.

Phased plan

Phase 1 — Manifest + derived citation artifacts

  • [ ] .projio/osf.yml schema (title, description, category, SPDX license, tags, contributors w/ ORCID+CRediT, funders, related_identifiers, osf.include/exclude, components:)
  • [ ] Loader at src/projio/osf/manifest.py
  • [ ] projio sync generates CITATION.cff, codemeta.json, datacite.yaml from the manifest (header comment: generated, do not edit)
  • [ ] projio osf validate CLI + osf_manifest_validate MCP tool
  • [ ] How-to doc docs/how-to/osf-publish.md
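
The Phase 1 items above could be backed by a check and a derivation along these lines. This is only a sketch under stated assumptions: the function names (validate_manifest, to_codemeta), the set of required keys, and the name and orcid keys on each contributor are hypothetical, not taken from docs/specs/osf-integration.md. The ORCID check uses the published ISO 7064 MOD 11-2 checksum, and the derived document follows the CodeMeta 2.0 context.

```python
from __future__ import annotations

import json
import re

# Assumed required keys; the authoritative schema is docs/specs/osf-integration.md.
REQUIRED_KEYS = ("title", "description", "category", "license", "contributors")
ORCID_RE = re.compile(r"^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$")


def orcid_checksum_ok(orcid: str) -> bool:
    """Check the final character of an ORCID iD (ISO 7064 MOD 11-2)."""
    if not ORCID_RE.match(orcid):
        return False
    digits = orcid.replace("-", "")
    total = 0
    for ch in digits[:-1]:
        total = (total + int(ch)) * 2
    check = (12 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return digits[-1] == expected


def validate_manifest(manifest: dict) -> list[str]:
    """Return human-readable problems; an empty list means the manifest passed."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in manifest]
    for i, person in enumerate(manifest.get("contributors", [])):
        orcid = person.get("orcid")
        if orcid is not None and not orcid_checksum_ok(orcid):
            problems.append(f"contributors[{i}]: invalid ORCID {orcid!r}")
    return problems


def to_codemeta(manifest: dict) -> dict:
    """Derive a codemeta.json document from the manifest (never edited by hand)."""
    authors = []
    for person in manifest.get("contributors", []):
        author = {"@type": "Person", "name": person["name"]}
        if "orcid" in person:
            author["@id"] = "https://orcid.org/" + person["orcid"]
        authors.append(author)
    return {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": manifest["title"],
        "description": manifest.get("description", ""),
        "license": "https://spdx.org/licenses/" + manifest["license"],
        "keywords": manifest.get("tags", []),
        "author": authors,
    }


# Illustrative manifest content only; a real one would be loaded from .projio/osf.yml.
example = {
    "title": "Example project",
    "description": "Illustrative manifest only.",
    "category": "project",
    "license": "MIT",
    "tags": ["demo"],
    "contributors": [{"name": "Josiah Carberry", "orcid": "0000-0002-1825-0097"}],
}

if not validate_manifest(example):
    codemeta_text = json.dumps(to_codemeta(example), indent=2)
```

Returning a list of problems rather than raising on the first one lets a CLI and an MCP tool report everything in a single pass.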

Phase 2 — Push / pull

  • [ ] projio osf push — create/update OSF project from manifest via v2 API + osfclient for uploads; preview-first with --yes to execute (mirror datalad module pattern)
  • [ ] projio osf pull populates .projio/osf/state.json (DOI, contributors, public flag, registrations, preprints)
  • [ ] MCP tools: osf_status, osf_push, osf_pull, osf_doi_get in src/projio/mcp/osf.py
  • [ ] Register in src/projio/mcp/server.py
  • [ ] Auth via OSF_TOKEN env var, configurable in ~/.config/projio/config.yml under osf.token_env
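
The auth item above might resolve the token roughly as follows. This is a minimal sketch: resolve_token and auth_headers are hypothetical names, and config stands for the already-parsed contents of ~/.config/projio/config.yml (parsing is left out). The bearer-token header is how OSF personal access tokens are normally sent to the v2 API.

```python
import os

DEFAULT_TOKEN_ENV = "OSF_TOKEN"


def resolve_token(config: dict, environ=os.environ) -> str:
    """Read the token from the env var named by osf.token_env (default OSF_TOKEN)."""
    env_name = config.get("osf", {}).get("token_env", DEFAULT_TOKEN_ENV)
    token = environ.get(env_name)
    if not token:
        raise RuntimeError(f"OSF token not found; set the {env_name} environment variable")
    return token


def auth_headers(token: str) -> dict:
    """Headers for an authenticated OSF v2 API request."""
    return {"Authorization": f"Bearer {token}"}
```

Only the variable name is stored in the config file, so the token itself never lands in a file that might be committed or shared.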

Phase 3 — Components mapping

  • [ ] Process the manifest's components: block on push
  • [ ] Per-kind handlers: manuscript, pipeline, bibliography, code (GitHub add-on link), data (S3 add-on link), figures attached to manuscript components
  • [ ] Per-component metadata propagation
  • [ ] Agents use osf_push --preview to see the planned component tree
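
One way to picture the per-kind handlers and the preview tree is a pure planning step that touches no network. Everything here is an assumption made for illustration: the PlannedComponent type, the kind, title, and figures keys, the action wording, and the choice to model figures as children of a manuscript node.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PlannedComponent:
    title: str
    kind: str
    action: str  # description of what a push would do; nothing is executed here
    children: list[PlannedComponent] = field(default_factory=list)


# Assumed mapping from component kind to the planned action.
ACTIONS = {
    "manuscript": "upload built PDF",
    "pipeline": "upload curated files",
    "bibliography": "upload compiled bibliography",
    "code": "link GitHub add-on",
    "data": "link S3 add-on",
}


def plan_components(components: list[dict]) -> list[PlannedComponent]:
    """Turn the manifest's components list into a tree of planned actions."""
    planned = []
    for comp in components:
        kind = comp["kind"]
        if kind not in ACTIONS:
            raise ValueError(f"unknown component kind: {kind}")
        node = PlannedComponent(comp["title"], kind, ACTIONS[kind])
        if kind == "manuscript":
            # Figures hang off the manuscript component they belong to.
            for fig in comp.get("figures", []):
                node.children.append(PlannedComponent(fig, "figures", "attach to manuscript"))
        planned.append(node)
    return planned


def render(nodes: list[PlannedComponent], indent: int = 0) -> list[str]:
    """Flatten the tree into indented lines suitable for a preview."""
    lines = []
    for node in nodes:
        lines.append(" " * indent + f"{node.title} [{node.kind}]: {node.action}")
        lines.extend(render(node.children, indent + 2))
    return lines
```

Keeping planning separate from execution means the preview an agent sees and the push that later runs are built from the same tree.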

Phase 4 — Registration + preprint

  • [ ] .projio/osf-registration.yml schema (schema name + answers)
  • [ ] projio osf register: git tag osf/reg/<timestamp> + file-hash manifest + draft via v2 API; default schema OSF-Standard Pre-Data Collection; embargo support (up to 4 years)
  • [ ] projio osf preprint <manuscript>: build PDF via manuscript_build, upload to chosen preprint server (PsyArXiv/SocArXiv/…), fill metadata from notio frontmatter + osf.yml
  • [ ] MCP tools: osf_register, osf_preprint
  • [ ] Write returned DOIs back into manifest related_identifiers / manuscript frontmatter

Phase 5 — Exploratory (only on demand)

  • [ ] OSF Storage as a DataLad sibling via WebDAV
  • [ ] Custom OSF add-on surfacing .projio/ natively (only if an institutional partner asks)

Open questions

  1. osfclient (file ops only) + direct requests for metadata/components/registrations — confirmed approach.
  2. Preprint server default: per-manuscript frontmatter with user-global fallback.
  3. Start with OSF-Standard Pre-Data Collection registration schema; add others on demand.
  4. DOI ownership: osf.yml is DataCite-shaped so a future projio zenodo can share it. Consider renaming to .projio/citation.yml once Zenodo lands.
  5. Contributor sync: default one-way (projio → OSF); two-way only via explicit osf_contributors_sync --pull.

Non-goals

  • Re-implementing OSF storage locally
  • Making OSF the source of truth for code / data / bib
  • Uploading every file; uploads are curated via osf.include/exclude
  • Running a custom OSF instance
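
The curated-upload rule above (osf.include/exclude) might reduce to a filter like this one. The pattern semantics are assumed, not specified: shell-style globs matched against repo-relative POSIX paths, with exclude taking precedence over include. The name select_uploads is hypothetical.

```python
from __future__ import annotations

from fnmatch import fnmatchcase


def select_uploads(paths: list[str], include: list[str], exclude: list[str]) -> list[str]:
    """Keep paths that match any include pattern and no exclude pattern."""
    selected = []
    for path in paths:
        included = any(fnmatchcase(path, pattern) for pattern in include)
        excluded = any(fnmatchcase(path, pattern) for pattern in exclude)
        if included and not excluded:
            selected.append(path)
    return selected
```

With an empty include list nothing is selected, which matches the non-goal of uploading every file by default.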

Definition of done (Phase 1–2 MVP)

  • .projio/osf.yml validates; projio sync produces citation artifacts
  • projio osf push --yes creates a live OSF project with DOI, contributors, license, and uploaded manuscript PDF + compiled.bib
  • projio osf pull populates .projio/osf/state.json
  • MCP tools callable by agents with preview-first semantics
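
Preview-first semantics, as required of push and the MCP tools, can be illustrated with a small wrapper. This is a sketch rather than the datalad module's actual pattern: run_preview_first and the (description, callable) action shape are assumptions.

```python
from __future__ import annotations

from typing import Callable


def run_preview_first(
    actions: list[tuple[str, Callable[[], None]]], yes: bool = False
) -> list[str]:
    """Describe every planned action; execute them only when yes is True."""
    report = []
    for description, execute in actions:
        if yes:
            execute()
            report.append(f"done: {description}")
        else:
            report.append(f"would: {description}")
    return report
```

Called without yes=True, the wrapper has no side effects, so an agent can inspect the report and decide whether to rerun with execution enabled.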