
Issue: pipeio_config_patch apply=True destroys YAML comments, anchors, and formatting

Problem

pipeio_config_patch(apply=True) rewrites the entire config.yml via a standard YAML dump, which:

  • Strips all comments (section headers, inline explanations)
  • Resolves YAML anchors/aliases (&json_default, *ieeg_bundle, <<: merge keys) into inlined duplicates
  • Converts flow-style mappings ({suffix: "ieeg", extension: ".lfp"}) to block style
  • Removes blank lines and section separators
  • Changes quoting style (quoted strings → unquoted)

This makes apply=True unusable on any real config.yml. The diff preview is useful but the apply path is broken.

Root cause

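The apply path appears to round-trip the file through PyYAML (or a similar library): load into plain Python objects, patch, dump. PyYAML discards comments and formatting at load time, so the dump cannot restore them.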

Proposed fix: ruamel.yaml round-trip mode

Switch to ruamel.yaml with YAML(typ='rt') (round-trip). This preserves:

  • Comments (inline and block)
  • Anchor/alias syntax
  • Flow vs block style per node
  • Key ordering
  • Blank lines

Additional improvements needed

Surgical insertion operations

Instead of a full-file rewrite, provide targeted operations:

  1. pipeio_config_add_registry(pipe, flow, group_name, group_dict, after=None)
       • Inserts a new registry group at a specific position
       • Uses ruamel to insert a CommentedMap node into the registry: mapping
       • Preserves all surrounding content

  2. pipeio_config_add_params(pipe, flow, section_name, params_dict)
       • Appends a params section at the end of the file
       • Adds a blank line separator before the new section

Anchor-aware member generation

When creating a new registry group, check _member_sets for matching anchors:

  • If requested members == ieeg_bundle → emit <<: *ieeg_bundle
  • If a subset matches (e.g. json + lfp) → emit individual anchor refs
  • Otherwise inline the members

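This requires ruamel's anchor-handling support on round-trip nodes (for example the anchor attribute and yaml_set_anchor() on CommentedMap, plus its merge-key handling) rather than plain dict manipulation.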
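The decision logic itself does not depend on ruamel and can be sketched in plain Python. Everything here is an assumption for illustration: the function plan_members, the split of _member_sets into "bundles" and single-member "anchors", and the lfp_default anchor name are not taken from the pipeio code.

```python
from typing import Any

Spec = dict[str, Any]


def plan_members(
    requested: dict[str, Spec],
    bundles: dict[str, dict[str, Spec]],
    anchors: dict[str, Spec],
) -> dict[str, Any]:
    """Decide how a new group's members should be emitted.

    Returns a plan of kind "merge" (use <<: *bundle), "refs" (per-member
    aliases, with any leftovers inlined), or "inline" (no anchor matches).
    """
    # 1. Exact match with a whole bundle: emit a single merge key.
    for bundle_name, members in bundles.items():
        if members == requested:
            return {"kind": "merge", "bundle": bundle_name}

    # 2. Per-member match against individually anchored specs.
    refs: dict[str, str] = {}
    for member, spec in requested.items():
        for anchor_name, anchored_spec in anchors.items():
            if anchored_spec == spec:
                refs[member] = anchor_name
                break
    if refs:
        inline = {m: s for m, s in requested.items() if m not in refs}
        return {"kind": "refs", "refs": refs, "inline": inline}

    # 3. Nothing matches: write the members out in full.
    return {"kind": "inline", "members": requested}


# Illustrative member sets (specs and the lfp_default name are made up).
json_spec = {"suffix": "ieeg", "extension": ".json"}
lfp_spec = {"suffix": "ieeg", "extension": ".lfp"}
chan_spec = {"suffix": "channels", "extension": ".tsv"}

bundles = {"ieeg_bundle": {"json": json_spec, "lfp": lfp_spec, "channels": chan_spec}}
anchors = {"json_default": json_spec, "lfp_default": lfp_spec}

full_plan = plan_members(dict(bundles["ieeg_bundle"]), bundles, anchors)
subset_plan = plan_members({"json": json_spec, "lfp": lfp_spec}, bundles, anchors)
other_plan = plan_members({"mask": {"suffix": "mask"}}, bundles, anchors)
```

Keeping the plan separate from emission means the matching rules can be unit-tested without a YAML library, and the emitter only has to translate a plan into alias or merge nodes.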

Position-aware rule insertion

pipeio_rule_insert(after_rule=...) should:

  • If after_rule not specified, use pipeio_dag to find the logical insertion point based on the rule's inputs
  • Support inserting into .smk include files (not just the main Snakefile)
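A text-level sketch of the two behaviours, under stated assumptions: pick_anchor_rule stands in for whatever pipeio_dag would return (here just a mapping of rule name to outputs, in file order), and insert_rule_after treats a rule body as the indented lines following its header. Both function names and the sample rules are hypothetical.

```python
import re

# Matches an unindented "rule name:" or "checkpoint name:" header.
RULE_HEADER = re.compile(r"^(?:rule|checkpoint)\s+(\w+)\s*:")


def pick_anchor_rule(new_inputs, outputs_by_rule):
    """Return the last rule (in file order) that produces one of new_inputs."""
    wanted = set(new_inputs)
    anchor = None
    for name, outputs in outputs_by_rule.items():
        if wanted & set(outputs):
            anchor = name
    return anchor


def insert_rule_after(snakefile_text, new_rule_text, after_rule):
    """Insert new_rule_text directly after the body of after_rule."""
    lines = snakefile_text.splitlines()
    start = None
    for i, line in enumerate(lines):
        match = RULE_HEADER.match(line)
        if match and match.group(1) == after_rule:
            start = i
            break
    if start is None:
        raise ValueError(f"rule not found: {after_rule}")

    # The body is every following line that is blank or indented.
    end = start + 1
    while end < len(lines) and (not lines[end].strip() or lines[end][0] in " \t"):
        end += 1
    # Leave trailing blank lines in place as the separator before the next rule.
    while end > start + 1 and not lines[end - 1].strip():
        end -= 1

    block = [""] + new_rule_text.strip("\n").splitlines()
    return "\n".join(lines[:end] + block + lines[end:]) + "\n"


# Illustrative rules (names and paths are made up).
SNAKEFILE = '''\
rule a:
    input: "x"
    output: "y"

rule b:
    input: "y"
    output: "z"
'''
NEW_RULE = '''\
rule new_step:
    input: "y"
    output: "y_clean"
'''

anchor = pick_anchor_rule(["y"], {"a": ["y"], "b": ["z"]})
patched_rules = insert_rule_after(SNAKEFILE, NEW_RULE, anchor)
print(patched_rules)
```

Since the insertion works on text rather than on a parsed workflow, the same function can be pointed at a .smk include file as well as the main Snakefile.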

Workaround (current)

Use apply=False to get the diff preview, then manually edit config.yml. This is what we did for the ttl_removal mod — the diff was informative but the apply was unusable.


Source context: pixecog

PixEcog (pixecog): Neuropixels and ECoG dataset and analysis

Recent commits:

c309f45 Fix pipeline doc naming drift, populate registry doc_path, close 3 issues
84d605b Migrate 43 scripts from utils.smk.smk_log
5808910 [DATALAD] removed content

README:

Quick Start for Collaborators

Follow this checklist to get started with Pixecog documentation and workflows.

🐀 Pixecog Project — Compact Overview

Core principles

  • One immutable BIDS raw dataset (raw/) as the canonical baseline
  • Each analysis pipeline ha
  • [[issue-arash-20260326-232732-019412.md]] — Implements pipeio config functionality — directly related to pipeio_config_patch improvements
  • [[issue-arash-20260326-191057-730671.md]] — pipeio_mod_resolve work — same pipeio subsystem, config/registry contracts affected by patch behavior