Maintenance and health workflows
Org2 treats corpus maintenance as a core compiler capability: cleanup, graph repair, and review become part of the normal workflow.
A knowledge corpus has source files, derived artifacts, links, IDs, provenance, publish outputs, editor indexes, and generated views. The compiler model gives Org2 one place to parse those files, check trust boundaries, and report or repair drift before it reaches publishing or daily planning.
Core passes Org2 should own
The maintenance story groups existing and planned checks into a few explicit passes:
Format normalization with org2 fmt: keep syntax and prose layout stable enough for clean diffs and generated edits.
Corpus lint / health with org2 lint: validate artifact metadata, IDs, provenance references, and corpus-flow conventions.
Link graph health with org2 backlinks, org2 query, and org2 roam graph: surface broken or ambiguous knowledge links before they become stale navigation.
Generated artifact maintenance with org2 publish and export options: rebuild HTML/views from source files instead of editing generated outputs by hand.
Editor index maintenance with org2 roam db-sync and LSP features: keep file IDs and editor intelligence aligned with the corpus model.
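To make the lint pass concrete, the sketch below shows one way duplicate-ID and unresolved-provenance checks could work over an in-memory corpus. The NoteRecord and Finding types, the derivedFrom field, and the checkCorpus function are hypothetical names chosen for illustration; they are not Org2's actual data model or API.

```typescript
// Hypothetical, simplified note record; Org2's real model is not shown here.
interface NoteRecord {
  path: string;
  id: string;
  derivedFrom: string[]; // IDs of the source notes this artifact claims as provenance
}

interface Finding {
  kind: "duplicate-id" | "unresolved-provenance";
  path: string;
  detail: string;
}

function checkCorpus(notes: NoteRecord[]): Finding[] {
  const findings: Finding[] = [];
  const firstSeen = new Map<string, string>(); // id -> path of the first note using it

  // Pass 1: every ID should belong to exactly one note.
  for (const note of notes) {
    const earlier = firstSeen.get(note.id);
    if (earlier !== undefined) {
      findings.push({
        kind: "duplicate-id",
        path: note.path,
        detail: `id ${note.id} is already used by ${earlier}`,
      });
    } else {
      firstSeen.set(note.id, note.path);
    }
  }

  // Pass 2: every provenance reference should point at a known ID.
  for (const note of notes) {
    for (const ref of note.derivedFrom) {
      if (!firstSeen.has(ref)) {
        findings.push({
          kind: "unresolved-provenance",
          path: note.path,
          detail: `no note with id ${ref}`,
        });
      }
    }
  }
  return findings;
}
```

Collecting all IDs before resolving references means a derived file can appear earlier in the list than the source it points to without producing a false positive.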
Recommended maintenance loop
Run a lightweight check before committing note changes:
npm run org2 -- fmt --dir ~/notes --recursive --check
npm run org2 -- lint --dir ~/notes --recursive
Run the repair-style passes intentionally, review their diffs, then commit the result:
npm run org2 -- fmt --dir ~/notes --recursive --apply
npm run org2 -- roam db-sync --dir ~/notes --recursive --apply
npm run org2 -- publish docs-site --config org2.json --preview
Use org2 roam graph --format report for an inspectable local maintenance view: it lists orphan and high-degree nodes, alias collisions, unresolved or ambiguous links, and linkify suggestions with source locations. Use --format json on lint, backlink, query, graph, and roam workflows when wiring these checks into scripts or CI; graph JSON carries the same maintenance payload under the maintenance key.
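As a rough illustration of what the graph maintenance view reports, the following sketch computes orphan nodes, high-degree nodes, and unresolved links from a plain edge list. The Link and GraphReport types, the summarizeGraph function, and the degree threshold are assumptions made for the example; Org2's own report may define these categories differently.

```typescript
// Hypothetical shapes for illustration only.
interface Link {
  from: string;
  to: string;
}

interface GraphReport {
  orphans: string[];    // known nodes with no resolved links in or out
  highDegree: string[]; // known nodes whose total degree meets the threshold
  unresolved: Link[];   // links whose target is not a known node
}

function summarizeGraph(nodeIds: string[], links: Link[], highDegreeThreshold: number): GraphReport {
  const known = new Set(nodeIds);
  const degree = new Map<string, number>();
  for (const id of nodeIds) degree.set(id, 0);

  const unresolved: Link[] = [];
  for (const link of links) {
    if (!known.has(link.to)) {
      // Broken links are reported separately and do not count toward degree.
      unresolved.push(link);
      continue;
    }
    degree.set(link.from, (degree.get(link.from) ?? 0) + 1);
    degree.set(link.to, (degree.get(link.to) ?? 0) + 1);
  }

  const orphans = nodeIds.filter((id) => (degree.get(id) ?? 0) === 0);
  const highDegree = nodeIds.filter((id) => (degree.get(id) ?? 0) >= highDegreeThreshold);
  return { orphans, highDegree, unresolved };
}
```

In this sketch a note whose only outgoing link is broken still counts as an orphan, which keeps the two problems visible side by side rather than letting one mask the other.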
Repository generated artifact hygiene
The Org2 repository intentionally keeps a small set of generated outputs under version control so releases, docs hosting, and spec conformance are inspectable from a checkout:
| Path | Source-controlled? | Why |
| dist/ | yes | TypeScript build output used by the package bin entrypoints and npm distribution. |
| site/ | yes | Published HTML docs generated from docs/site/*.org for static hosting. |
| spec/v0/tests/*.json | yes | Expected canonical AST fixtures paired with sibling *.org inputs. |
| ad hoc compiled/, views/, local reports, temp exports | no | Disposable corpus outputs unless a maintainer explicitly promotes them into docs, fixtures, or another reviewed artifact. |
Before committing changes that affect parser output, docs publishing, or TypeScript build output, run:
npm run check:generated
This rebuilds dist/, validates fixture pairs with tools/fixture-runner.mjs --e2e, republishes the docs site, and then runs git diff --exit-code -- dist site spec/v0/tests. If it fails, review the generated diff. Commit intentional changes alongside the source change that caused them; if the diff is unexpected, fix the source of nondeterminism before committing.
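The idea behind the final diff step can be sketched without git: regenerate the outputs, compare them with what is committed, and treat any difference as drift. The example below is only a model of that comparison. The FileSet type and findDrift function are hypothetical, and the real check relies on git diff as described above.

```typescript
// Hypothetical in-memory stand-in for a directory tree: path -> file contents.
type FileSet = Record<string, string>;

// Returns the sorted list of paths whose committed and regenerated contents differ,
// including files that exist on only one side.
function findDrift(committed: FileSet, regenerated: FileSet): string[] {
  const allPaths = Object.keys(committed).concat(Object.keys(regenerated));
  const drifted: string[] = [];
  for (const path of allPaths) {
    // A missing file reads as undefined, so additions and removals also count as drift.
    if (committed[path] !== regenerated[path] && drifted.indexOf(path) === -1) {
      drifted.push(path);
    }
  }
  return drifted.sort();
}
```

An empty result corresponds to a clean check. If two consecutive regenerations from the same sources produce drift against each other, the generator itself is nondeterministic, which is the case the workflow asks you to fix at the source.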
Trust boundaries
Org2 should preserve a clear distinction between source files and generated artifacts.
Source notes are edited by people and tools.
Compiled views, publish outputs, and generated reports should carry artifact role/provenance metadata when they live in the corpus.
Lint should flag missing or stale provenance before derived content is trusted as source.
Repair passes should default to preview/check modes where possible and require --apply before writing.
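The preview-by-default rule can be sketched as a repair function that always computes its planned edits but only writes them when explicitly asked. Everything named here (PlannedEdit, RepairOptions, runRepair, and the whitespace-trimming normalize helper) is hypothetical and stands in for whatever a real pass would do; the in-memory Map replaces the file system so the example stays self-contained.

```typescript
interface PlannedEdit {
  path: string;
  before: string;
  after: string;
}

interface RepairOptions {
  apply?: boolean; // mirrors the --apply flag: absent or false means preview only
}

interface RepairResult {
  planned: PlannedEdit[]; // always populated, so a preview shows what would change
  written: string[];      // paths actually modified; empty unless apply is true
}

// Hypothetical normalizer used only for this example: strips trailing whitespace per line.
function normalize(text: string): string {
  return text
    .split("\n")
    .map((line) => line.replace(/\s+$/, ""))
    .join("\n");
}

function runRepair(files: Map<string, string>, options: RepairOptions = {}): RepairResult {
  const planned: PlannedEdit[] = [];
  files.forEach((before, path) => {
    const after = normalize(before);
    if (after !== before) planned.push({ path, before, after });
  });

  const written: string[] = [];
  if (options.apply === true) {
    // Writes happen only after the full plan is known and only on explicit opt-in.
    for (const edit of planned) {
      files.set(edit.path, edit.after);
      written.push(edit.path);
    }
  }
  return { planned, written };
}
```

Because planning and writing are separate steps, the same plan can back both a check mode and an apply mode, so what a reviewer sees in preview is what gets written.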
Current concrete workflow
Today, org2 lint is the main corpus-health command. It checks artifact metadata, duplicate IDs, unresolved provenance references, graph link health (broken id: links, unresolved wiki links, and ambiguous wiki labels), and conventional corpus-flow directories such as raw/, notes/, compiled/, views/, and publish/.
The same diagnostics are available in --format json for CI/editor consumers, so VS Code and scripts can surface corpus trust-boundary problems without scraping report text.
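A script consuming the JSON output might reduce it to an exit code and a few printable lines, along the lines of the sketch below. The report shape used here (a diagnostics array with severity, code, path, and message fields) is an assumption made for illustration and is not taken from Org2's documentation; inspect the real --format json output before relying on any field name.

```typescript
// ASSUMPTION: this shape is hypothetical, not Org2's documented schema.
type Severity = "error" | "warning" | "info";

interface Diagnostic {
  severity: Severity;
  code: string;
  path: string;
  message: string;
}

interface LintReport {
  diagnostics: Diagnostic[];
}

// Turns a JSON lint report into a CI-friendly summary: a non-zero exit code
// when any error-level diagnostic is present, plus one line per error.
function summarizeLint(jsonText: string): { exitCode: number; lines: string[] } {
  const report = JSON.parse(jsonText) as LintReport;
  const errors = report.diagnostics.filter((d) => d.severity === "error");
  const lines = errors.map((d) => `${d.path}: [${d.code}] ${d.message}`);
  return { exitCode: errors.length > 0 ? 1 : 0, lines };
}
```

Parsing structured output keeps the gate stable even if the human-readable report wording changes, which is the benefit the JSON format is meant to provide.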
For command details, see Tooling reference: Corpus lint.