This RFC uses note (NL: notitie) for the feature, following the terminology
decision in RFC-005. The W3C technical term Annotation is kept only
where it refers to the data model itself. “Annotation” as a Dutch legal genre
(annotatie, noot, “m.nt.”) is deliberately not what this feature is called.
RFC-005 defines the note format: the W3C Web Annotation Data Model with TextQuoteSelector for version-resilient text anchoring. It specifies what a note looks like (selector, motivation, body, resolution states) but is explicitly storage-agnostic. It does not say where notes live, how they are loaded, who creates them, or how they are displayed.
This RFC addresses the implementation questions RFC-005 leaves open. They map directly onto the open design questions raised in interpretation discussions:
- machine_readable execution logic.
- machine_readable elements, making the interpretation chain explicit and auditable.

Two other RFCs shape the infrastructure:
RFC-009 (Multi-Organization Execution) introduces
competent_authority (bevoegd gezag) as the boundary that determines who can
authoritatively execute what. The same concept applies to notes: the competent
authority’s note linking text to machine_readable represents the official
interpretation (gezaghebbende interpretatie). Other organizations, legal
experts, and citizens can bring their own notes alongside, as advisory
perspectives, not as the authoritative reading.
RFC-010 (Federated Corpus) introduces a “bring your own regulations” model where municipalities (gemeenten), provinces (provincies), and other organizations maintain laws in their own repositories. Notes need the same federated model: any organization can annotate any law from their own perspective, in their own repository.
Nothing is implemented on main. A historical Python proof-of-concept exists on
the feature/annotation-resolver branch (TextQuoteSelector resolution, fuzzy
matching, BDD tests), but the stack has moved to Rust (engine) and Vue 3
(frontend). The Python code is not reused; the BDD scenarios are ported.
This RFC has been renumbered twice. An earlier draft circulated as PR #328
numbered “RFC-013”; that number was already taken on main (Execution
Provenance), so it became RFC-016. RFC-016 in turn collided with the open
PR #510, which uses that number for Collection Operations (foreach). PR #510
holds the older claim on RFC-016, so the note infrastructure RFC takes the next
free number, RFC-018 (rfc-011 was never used; rfc-016 and rfc-017 are claimed by
PR #510). PR #328 is closed in favor of this RFC.
Ten interconnected decisions form the note infrastructure.
Notes are stored in separate YAML files alongside law files, not embedded in the law YAML. This preserves the verbatim legal text (RFC-005 requirement: notes must not modify the source) and enables independent versioning.
Directory convention:
The annotation directory is keyed by the law’s $id (e.g. zorgtoeslagwet),
not by its filesystem path under regulation/ (e.g. wet_op_de_zorgtoeslag).
The $id is the stable identifier laws reference each other by.
- annotations.yaml: law-level notes that apply across all versions. The TextQuoteSelector resolves on whichever version is being viewed.
- {valid_from}.annotations.yaml: notes specific to a particular law version. Used when the annotated text only exists in that version (e.g., a percentage that was changed in a later amendment).

The directory is named annotations/ (not notes/) because it stores W3C
Annotation objects; the storage path follows the data model, the feature name
follows RFC-005.
Discovery: convention-based. Given a law with $id: zorgtoeslagwet, the
engine looks for notes at corpus/annotations/zorgtoeslagwet/annotations.yaml. No
registry manifest is needed for local notes.
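The convention can be sketched as a pure path computation. This is a minimal illustration, not the engine's loader: the function name `note_file_candidates`, the `corpus_root` parameter, and the ISO date format used for `valid_from` are assumptions made for the example.

```rust
use std::path::{Path, PathBuf};

/// Candidate note files for a law, in convention order: the law-level
/// file first, then the version-specific file when a `valid_from`
/// value is supplied. Nothing is read from disk here.
pub fn note_file_candidates(
    corpus_root: &Path,
    law_id: &str,
    valid_from: Option<&str>,
) -> Vec<PathBuf> {
    // The directory is keyed by the law's $id, not its filesystem path.
    let dir = corpus_root.join("annotations").join(law_id);
    let mut files = vec![dir.join("annotations.yaml")];
    if let Some(date) = valid_from {
        // Assumption: `valid_from` is already formatted as it appears in the file name.
        files.push(dir.join(format!("{}.annotations.yaml", date)));
    }
    files
}
```

A caller would check which of the returned paths exist and load those; a missing file simply means the law has no notes of that kind.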
Validation: note files conform to schema/v0.5.2/annotation-schema.json,
validated by just validate-annotations.
Example note file:
Notes follow the same “bring your own” pattern as laws in RFC-010. Any organization can maintain notes in their own repository, alongside their regulations or in a dedicated note repository.
Required structure in a source repository:
When a source in corpus-registry.yaml (RFC-010) includes an annotations/
directory, the engine discovers and loads those notes alongside the source’s
regulations.
Example: a municipality (gemeente) annotating a national law:
Amsterdam’s note on the Healthcare Allowance Act (Wet op de zorgtoeslag) might link text about “toeslagpartner” to their local implementation of the partner check, or add a comment explaining how Amsterdam interprets a provision in the context of their municipal social assistance (bijstand) policy.
Personal notes: a .local.annotations.yaml file (gitignored) follows the
RFC-010 .local.yaml override pattern. Personal notes are loaded alongside public
notes but not committed.
Loading order: the engine loads notes from all registered sources. Notes from different sources are layered, not prioritized (see Decision 7).
Each note carries a creator field (part of the W3C Web Annotation Data Model)
identifying who created it. The relationship between the creator and the annotated
law determines the note’s authority level.
| Creator | Relationship to competent_authority | Authority level | Example |
|---|---|---|---|
| Competent authority (bevoegd gezag) | Same org | Authoritative (gezaghebbend) | Allowances Service (Dienst Toeslagen) annotates Healthcare Allowance Act |
| Other government org | Different org | Advisory (adviserend) | Municipality (gemeente) annotates national law from local perspective |
| Legal expert / researcher | No org | Personal (persoonlijk) | Law professor adds explanatory comment |
| Automated tooling | Generated | Generated (gegenereerd) | LLM-produced linking note |
The authority level is derived at display time, not stored. It follows the same principle as RFC-009’s execute/accept boundary: the law determines authority, the engine derives behavior.
Derivation logic:
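A minimal sketch of such a derivation, following the table above. The `Creator` classification is hypothetical (the MVP schema stores creator as a plain string), and `competent_authority` is assumed to be available as an optional string taken from the law. The type and function names are illustrative, not the engine's API.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuthorityLevel {
    Authoritative,
    Advisory,
    Personal,
    Generated,
}

/// Hypothetical classification of a note's creator.
#[derive(Debug, Clone)]
pub enum Creator {
    Organization(String),
    Person(String),
    Tool(String),
}

/// Derives the authority level at display time; nothing is stored on the note.
pub fn derive_authority(creator: &Creator, competent_authority: Option<&str>) -> AuthorityLevel {
    match creator {
        Creator::Tool(_) => AuthorityLevel::Generated,
        Creator::Person(_) => AuthorityLevel::Personal,
        Creator::Organization(name) => match competent_authority {
            // Same organization as the law's competent authority.
            Some(authority) if authority == name.as_str() => AuthorityLevel::Authoritative,
            // Different organization, or the law declares no competent authority.
            _ => AuthorityLevel::Advisory,
        },
    }
}
```

In this sketch a law without a declared competent_authority can never yield an authoritative note, so every organizational note on it is treated as advisory.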
For the MVP, creator is a string field (e.g., "Dienst Toeslagen",
"LLM-generated", "J. de Vries"). When RFC-009’s identity model
(EngineIdentity with OIN) is implemented, creator can carry a structured
identity with signature verification. The schema accommodates both:
Provenance: the note file’s Git history provides when each note was created
and by whom (git log, git blame). The schema optionally includes created and
modified timestamps for systems that do not use Git.
Each scope dimension maps to a concrete mechanism:
| Scope dimension | Mechanism | How it works |
|---|---|---|
| Entire law, all versions (hele wet, alle versies) | Law-level annotations.yaml | TextQuoteSelector resolves on whichever version is being viewed. A note on “zorgtoeslag” finds the word in both the 2020 and 2025 versions. |
| Specific version (specifieke versie) | Version-specific {valid_from}.annotations.yaml | A note on a percentage changed by Staatsblad 2008, 516 only makes sense on the pre-2008 version. |
| Personal / public (persoonlijk / publiek) | Public: in Git. Personal: .local.annotations.yaml (gitignored) | An expert’s draft notes before publishing; a student’s study notes. |
| Structure or content (structuur of inhoud) | Notes target text content via TextQuoteSelector | The note on “zorgtoeslag” finds the word regardless of which article it is in. If article 2 is renumbered to article 3, the note follows the text. Article numbers appear only as performance hints (non-authoritative). |
The W3C Web Annotation vocabulary defines 13 motivation types. Four are primary for regelrecht.
linking: Connects text to a machine_readable element. This is the most critical type: it
makes the interpretation chain from law text to executable logic explicit and
auditable.
The body.source uses a regelrecht:// URI (as defined in
packages/engine/src/uri.rs, grammar regelrecht://{law_id}/{output}#{field})
pointing to the specific execution element. The path segment after the law id
(hoogte_zorgtoeslag) names the output. An optional #{field} fragment extracts
a single field from that output (for example #value); omit it to target the
whole output, as here.
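The grammar regelrecht://{law_id}/{output}#{field} can be illustrated with a small parser. This is a sketch written from the grammar as quoted above, not a copy of packages/engine/src/uri.rs; the struct and function names are assumptions, and the real module may accept or reject inputs differently.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct RegelrechtUri {
    pub law_id: String,
    pub output: String,
    /// `None` targets the whole output; `Some` extracts a single field.
    pub field: Option<String>,
}

/// Parses `regelrecht://{law_id}/{output}` with an optional `#{field}` fragment.
pub fn parse_regelrecht_uri(uri: &str) -> Option<RegelrechtUri> {
    let rest = uri.strip_prefix("regelrecht://")?;
    // Split off the optional fragment before looking at the path.
    let (path, field) = match rest.split_once('#') {
        Some((path, field)) if !field.is_empty() => (path, Some(field.to_string())),
        Some(_) => return None, // trailing '#' without a field name
        None => (rest, None),
    };
    let (law_id, output) = path.split_once('/')?;
    if law_id.is_empty() || output.is_empty() || output.contains('/') {
        return None;
    }
    Some(RegelrechtUri {
        law_id: law_id.to_string(),
        output: output.to_string(),
        field,
    })
}
```

For example, regelrecht://zorgtoeslagwet/hoogte_zorgtoeslag parses with field set to None, while appending #value yields Some("value").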
commenting: Human explanation of a legal concept or provision.

tagging: Classification of legal concepts for search, analysis, and cross-referencing. Also the carrier for ambiguity state (Decision 9).

questioning: Open question raised during interpretation. Tracked via the workflow field. This
type carries the ambiguity-tracking use case (Decision 9).
This decision answers the original interpretation-research need: labelling ambiguity in work-in-progress laws, where we start from existing law with (possibly unknown) implementation policy and want to reach a validated executable rule set, knowing we have imperfect information.
The states are not a fixed list. Today there are a handful (“open norm partially filled”, “open norm not yet filled”, “needs explanation by implementation policy”, “document still being searched for”); the set will grow as the interpretation process matures. Freezing it into a validated schema enum would mean every new state is a schema version bump plus a migration of existing note files plus an RFC amendment. That is exactly the brittleness RFC-005 avoids for text anchoring; we avoid it here too.
Model. Ambiguity is expressed with the existing W3C dimensions, no schema extension:
- motivation: questioning: there is an open interpretation issue here.
- workflow: open | resolved: has the issue been addressed?
- tagging body whose value is the specific ambiguity state, drawn from a controlled vocabulary (Decision 9).

A note can carry both a questioning body (the question in prose) and a tagging
body (the machine-readable state). The W3C model permits multiple bodies.
Example: an open norm that is only partially filled in.
Missing documents. A note can be about a document that does not exist in the corpus yet but is needed to resolve a provision. The target is the text fragment that triggers the search; the body describes what is missing. This makes “what are we still looking for” a queryable property of the corpus, not tribal knowledge.
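To illustrate what "queryable" could mean here, the sketch below models a note with several bodies and collects the ambiguity states of all notes that are still open. The Rust types are a simplification invented for this example (the real notes are W3C annotation objects in YAML), and the tag value used in the usage line is the one named elsewhere in this RFC.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Workflow {
    Open,
    Resolved,
}

/// Simplified body: either the question in prose or a machine-readable tag.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Body {
    Questioning(String),
    Tagging(String),
}

pub struct Note {
    /// The quoted text the note is anchored to.
    pub exact: String,
    pub workflow: Workflow,
    /// The W3C model permits multiple bodies on one note.
    pub bodies: Vec<Body>,
}

/// Ambiguity states (tagging body values) of every note whose workflow is still open.
pub fn open_ambiguity_states(notes: &[Note]) -> Vec<&str> {
    notes
        .iter()
        .filter(|note| note.workflow == Workflow::Open)
        .flat_map(|note| note.bodies.iter())
        .filter_map(|body| match body {
            Body::Tagging(state) => Some(state.as_str()),
            Body::Questioning(_) => None,
        })
        .collect()
}
```

A corpus-wide report of unresolved open norms or missing documents is then a filter over the returned states, for instance counting occurrences of open-norm-partial.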
The TextQuoteSelector resolver is implemented in Rust as a module within the engine crate. It is exposed to the frontend via WASM.
Module structure:
Types (in types.rs):
Resolution algorithm (in resolver.rs):
The hint is checked first so it provides an actual fast path: searching one article is O(article length) versus O(law length) for the full scan. The hint is non-authoritative, so a miss falls through to the full search rather than failing.
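The hint-first control flow can be sketched as follows. This is a simplified illustration under stated assumptions: matching is exact only (fuzzy matching is left out), prefix and suffix must agree literally with the surrounding text, positions are byte offsets, and all type and function names are hypothetical rather than the contents of types.rs or resolver.rs.

```rust
pub struct Selector<'a> {
    pub exact: &'a str,
    pub prefix: &'a str,
    pub suffix: &'a str,
}

pub struct Article<'a> {
    pub number: &'a str,
    pub text: &'a str,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Resolution {
    Found { article: String, start: usize, end: usize },
    Orphaned,
}

/// Byte range of the first occurrence of `exact` whose context agrees
/// with the selector's prefix and suffix (empty context always agrees).
fn find_in_text(text: &str, sel: &Selector) -> Option<(usize, usize)> {
    if sel.exact.is_empty() {
        return None;
    }
    text.match_indices(sel.exact).find_map(|(start, matched)| {
        let end = start + matched.len();
        let context_ok =
            text[..start].ends_with(sel.prefix) && text[end..].starts_with(sel.suffix);
        context_ok.then_some((start, end))
    })
}

pub fn resolve(articles: &[Article], sel: &Selector, hint: Option<&str>) -> Resolution {
    // Fast path: search only the hinted article.
    if let Some(number) = hint {
        if let Some(article) = articles.iter().find(|a| a.number == number) {
            if let Some((start, end)) = find_in_text(article.text, sel) {
                return Resolution::Found { article: article.number.to_string(), start, end };
            }
        }
    }
    // The hint is non-authoritative: a miss falls through to the full scan.
    for article in articles {
        if let Some((start, end)) = find_in_text(article.text, sel) {
            return Resolution::Found { article: article.number.to_string(), start, end };
        }
    }
    Resolution::Orphaned
}
```

A stale hint (for example after renumbering) costs one extra article search and then resolves through the full scan, so the note still follows the text.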
Dependency: strsim crate for Levenshtein distance.
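How Levenshtein distance might drive a fuzzy fallback is sketched below. To keep the block free of dependencies the distance function is written out by hand; in the engine the strsim crate would supply it. The sliding-window strategy, the normalized score, and the threshold parameter are assumptions made for illustration, not the resolver's actual algorithm.

```rust
/// Classic two-row Levenshtein edit distance over characters.
fn levenshtein(a: &[char], b: &[char]) -> usize {
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            let best = (prev[j] + cost).min(prev[j + 1] + 1).min(cur[j] + 1);
            cur.push(best);
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Slides a window of the needle's length over `text` and returns the
/// character index and similarity of the best window, provided the
/// similarity reaches `threshold` (1.0 means identical).
pub fn fuzzy_find(text: &str, exact: &str, threshold: f64) -> Option<(usize, f64)> {
    let text: Vec<char> = text.chars().collect();
    let needle: Vec<char> = exact.chars().collect();
    if needle.is_empty() || needle.len() > text.len() {
        return None;
    }
    let mut best: Option<(usize, f64)> = None;
    for start in 0..=(text.len() - needle.len()) {
        let window = &text[start..start + needle.len()];
        let distance = levenshtein(window, &needle);
        let score = 1.0 - distance as f64 / needle.len() as f64;
        if score >= threshold && best.map_or(true, |(_, s)| score > s) {
            best = Some((start, score));
        }
    }
    best
}
```

With a threshold of 0.8, a selector quoting "zorgtoeslag bedraagd" still lands on "zorgtoeslag bedraagt" in the law text, while an unrelated quote finds nothing and the note would be reported as orphaned.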
WASM bindings (additions to packages/engine/src/wasm.rs):
BDD tests: port the 8 scenarios from
feature/annotation-resolver:features/annotation.feature into
features/notes.feature:
Plus ambiguity scenarios (Decision 6): a questioning note with
workflow: open and a tagging body open-norm-partial resolving correctly, and
a workflow: resolved variant.
Notes from different sources layer: they do not conflict. This is
fundamentally different from laws, where RFC-010 uses priority to resolve $id
collisions. Two organizations can both annotate the same text fragment; both notes
are valid and visible.
Layering model:
All three coexist. The editor displays them layered, with authoritative notes visually distinguished (solid highlight vs dashed border).
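Layering, as opposed to priority resolution, can be sketched as a plain concatenation that never drops a note, even when two sources annotate the same text. The types below are a simplification invented for this example; real notes carry full W3C annotation objects.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LayeredNote {
    pub source: String,
    pub exact: String,
    pub body: String,
}

/// One registered source and the (exact, body) pairs of the notes it contributes.
pub struct SourceNotes<'a> {
    pub source: &'a str,
    pub notes: Vec<(&'a str, &'a str)>,
}

/// Collects notes from all sources in registration order.
/// No deduplication by target and no priority: notes layer, they do not conflict.
pub fn layer_notes(sources: &[SourceNotes]) -> Vec<LayeredNote> {
    sources
        .iter()
        .flat_map(|s| {
            s.notes.iter().map(move |(exact, body)| LayeredNote {
                source: s.source.to_string(),
                exact: exact.to_string(),
                body: body.to_string(),
            })
        })
        .collect()
}
```

The display layer can then derive an authority level per note from its source and style each layer accordingly.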
When law text changes:
- resolution: found (no action needed)
- resolution: orphaned

Tooling-generated notes:
- creator: "{tool-name}" and authority level generated

Tagging bodies (including ambiguity states from Decision 6) draw their value
from a controlled vocabulary stored as a plain YAML list:
just validate-annotations warns (does not fail) when a tagging body’s value
is not in the vocabulary. This catches typos and keeps the set queryable, without
forking the W3C standard and without a schema bump when a state is added. The
warning-not-error stance mirrors how orphaned notes are reported (Decision 8).
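The warn-but-do-not-fail behaviour can be sketched as a function that returns warnings instead of an error. The function name and message wording are assumptions, and the vocabulary is passed in as a set rather than read from the YAML list.

```rust
use std::collections::HashSet;

/// Returns one warning per tag value that is missing from the vocabulary.
/// An empty result means every tag is known; the caller never fails the
/// validation run on these warnings.
pub fn vocabulary_warnings(tags: &[&str], vocabulary: &HashSet<&str>) -> Vec<String> {
    tags.iter()
        .filter(|tag| !vocabulary.contains(*tag))
        .map(|tag| format!("warning: tag '{}' is not in the controlled vocabulary", tag))
        .collect()
}
```

A typo such as open-norm-partail produces a warning line that names the offending value, while adding a new state only requires a new entry in the vocabulary list.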
frontend/src/composables/useNotes.js:
- /data/annotations/{lawId}/annotations.yaml
- WasmEngine.resolveAnnotations() to get match positions per article
- { note, match } objects, filtered by the selected article

frontend/src/components/AnnotatedText.vue:
A variant of ArticleText.vue for the editor’s Tekst pane. Takes article text and
resolved positions, renders <mark> spans. Each span carries:

- AnnotatedText.vue
- useTextSelection.js captures the selection, extracts exact, computes prefix (30-50 chars before) and suffix (30-50 chars after)
- WasmEngine.resolveAnnotation(): if the selector matches multiple locations, asks for more context
- NoteCreator.vue opens as a popover: pick motivation, select target machine_readable element (linking), type text (commenting/questioning), or pick an ambiguity tag from the vocabulary (questioning)
- localStorage

Amended after implementation. Step 6 is no longer the only way out. The editor can also write notes back through
editor-api. The manual export stays for the offline case.The write runs through the active traject, exactly like law and scenario edits since the traject concept landed (PR #632). A traject is a named, member-scoped editing project with its own federated corpus config; its
write_target_for_sourcemap decides which backend (and branch) a given law’s edits land in. The notes sidecar forlaw_idis written toannotations/{law_id}/annotations.yamlthrough that same backend, so a note and a law edit made in one session ride the same branch and PR. There is no separate per-session branch and noX-Editor-Sessionheader: the session cookie carries the active traject. With no active traject the save returns403, the same rule the law and scenario writes follow.This also settles the federated-target question from Decision 2 without a per-note source picker. An org annotating another org’s law into its own repo configures that as its traject’s writable source; the traject’s routing map then sends the notes there. Routing is a property of the traject, decided once when the traject is set up, not a free per-request override. The earlier
?source=override is gone: a second routing mechanism that could disagree with the traject’s own config was a privilege gap, not federation.The write is append-only, and this matters. The browser sends only the notes it just created, not a rebuilt file.
editor-apireads the current sidecar from the traject branch, appends the new notes (deduplicated by content), and validates the merged document against the schema. The non-shrink property is structural, not a post-hoc size check: the normal path keeps the existing file’s bytes verbatim and only appends, so it cannot drop a note. The one path that rebuilds the file (a base whoseannotationsis not a readable block sequence, e.g. flow style or non-LF) carries an explicit destructive-shrink guard that refuses rather than rebuild over content it could not parse. An earlier sketch had the browser send the whole file rebuilt from the static/datamirror; that mirror is the corpus at deploy time, not the branch, so a save silently discarded notes other contributors had added. Append-only follows directly from Decision 8: notes layer, they do not conflict, so the write path must never overwrite, only add.Known limitation: read-your-writes holds within a single
editor-apiprocess (the traject’s checkout accumulates its own commits on disk), but the checkout is cloned once per traject and not pulled before each read. Under horizontal scaling, or if the traject branch is updated out-of-band, a replica can read a stale base. Append-only bounds the blast radius, the worst case is a dedup miss or a non-fast-forward push that fails loudly, not the silent loss of others’ notes the/datamirror caused, but a strict cross-replica guarantee needs a pull-before-read (or a single-writer assumption) and is not yet built.Every note’s
target.sourcemust resolve to the law the path names. The schema allows any string there, so this is checked explicitly: a note whose source is absent or not aregelrecht://{law_id}URI is rejected, not skipped. It is the note-side counterpart of the$id/path guard onsave_law.The content-review half of Decision 8’s two-layer model relies on branch protection requiring code-owner review on
main. ACODEOWNERSentry forcorpus/annotations/exists, but on its own it only requests a reviewer; it gates a merge only once the repository enablesrequire_code_owner_reviews. Until then the schema/resolve checks in CI are the only enforced layer.Known limitation: schema drift on an existing sidecar. The merged file is validated against the current schema before it is written. If the sidecar already on the branch contains a note that no longer satisfies the schema (a version bump landed without migrating that file), the append cannot proceed: writing it would commit a file CI then rejects. The write path validates the existing file separately and returns a distinct
409(“the file is itself invalid, this is not your note”) rather than the generic note-invalid400, so the author is not sent chasing a fault in their own valid note. The fix is to repair or migrate the offending file; this is rare and a hard, clearly-attributed stop is preferred over silently appending into a file that will fail the gate anyway.
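The append-only merge described above can be sketched on an abstract level. Notes are modelled here as opaque strings compared by content; the real write path works on the sidecar file's bytes and also runs schema validation, both of which this illustration leaves out. The function name is an assumption.

```rust
/// Merges newly created notes into the existing sidecar content.
/// Existing notes are kept verbatim and in order; a new note is appended
/// only when no note with the same content is already present.
pub fn append_notes(existing: &[String], incoming: &[String]) -> Vec<String> {
    let mut merged: Vec<String> = existing.to_vec();
    for note in incoming {
        // Deduplicate by content, against both the base and earlier appends.
        if !merged.contains(note) {
            merged.push(note.clone());
        }
    }
    merged
}
```

Because the result always starts with the existing notes unchanged, the merged document can grow or stay the same size but cannot shrink, which is the structural non-shrink property in miniature.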
The right pane’s segmented control gains a third option, labelled Notities:
Git gives full note history for free. Every change is a commit with author,
timestamp, and diff. git blame shows who added each note and when.
Sidecar files preserve verbatim legal text. The law YAML is never modified (RFC-005 requirement). Essential for legal integrity: the source must be identical to the official publication.
Federated model lets every org bring their own notes. A municipality can annotate national laws from their perspective without touching the central corpus. This mirrors RFC-010.
Authority model reflects legal reality. The competent authority’s
interpretation carries more weight, as in administrative law. Authority is
derived from existing schema fields (competent_authority), following RFC-009.
Linking notes make interpretation auditable. Every connection between text
and machine_readable logic is an explicit, reviewable note. This supports the
reasoning requirement (motiveringsplicht, Awb 3:46): the chain from law text to
computation can be inspected by citizens, courts, and oversight bodies.
Ambiguity is tracked without freezing a vocabulary. Interpretation research can label open norms and missing documents today and refine the categories later, without schema migrations.
W3C standard enables interoperability. The format works with Hypothesis, Apache Annotator, and Recogito. External parties can produce notes without the RegelRecht editor.
Rust resolver is the single source of truth. The same resolver runs in the engine (CI validation, server-side) and the browser (WASM). No divergence between what the editor shows and what the engine validates.
Two places to look. A law’s notes live in a separate directory from its YAML source. Tooling must load both. Convention-based discovery keeps this straightforward but it is an extra step.
No real-time collaboration. Notes are Git-based. Concurrent editing requires branches and pull requests. Acceptable for the MVP; production may need a collaboration layer.
WASM adds build complexity. The wasm-pack build step is additional. The
engine already supports WASM compilation, so the incremental cost is low.
Authority derivation requires competent_authority in the law. Laws without
a declared competent_authority cannot distinguish authoritative from advisory
notes. This is an existing gap (RFC-009 also depends on it).
Tag vocabulary is soft-validated. A typo in a tagging value produces a warning, not a hard failure. Deliberate: it keeps the vocabulary growable without ceremony.
Notes inside law YAML. Embed an annotations array in the law file. Rejected:
modifies the verbatim source (RFC-005 forbids this) and creates noisy Git diffs
where note changes obscure law changes.
Database storage. Store notes in PostgreSQL (the pipeline already uses it). Rejected for MVP: introduces a backend dependency where none exists, and loses Git’s built-in versioning, blame, and merge workflow. A database layer can be added later without changing the format.
JavaScript-only resolver. Implement TextQuoteSelector resolution in JS. Rejected: duplicates the fuzzy matching algorithm. The engine needs resolution for CI validation; a separate JS implementation would diverge. WASM compiles the same Rust code for the browser.
Centralized note authority. Designate one organization as the note authority per law. Rejected: does not match RFC-010’s federated model or RFC-009’s multi-org reality. The layering model accommodates multiple legitimate annotators.
Dedicated regelrecht:ambiguity enum field. Add a validated enum to the
schema for ambiguity states. Rejected: the state set is still emerging; an enum
makes every new state a schema version bump plus migration plus RFC amendment. The
tagging-body plus controlled-vocabulary approach gives queryability and typo
detection without forking the W3C model.
Implementation is planned in six phases (1 to 6), each delivering a working whole, preceded by an RFC review (Phase 0).
Phase 0: RFC review
RFC-005 and this RFC accepted and merged before code lands.
Phase 1: Rust resolver + BDD tests
Implement the annotation module (types.rs, resolver.rs). Port the 8 BDD
scenarios. Add strsim to Cargo.toml.
Phase 2: Note schema + first notes
Create schema/v0.5.2/annotation-schema.json. Write the first real linking and
commenting notes for the Healthcare Allowance Act article 2 plus the ambiguity
vocabulary. Add validate-annotations to the Justfile.
Phase 3: WASM bindings
Add resolve_annotation() and resolve_annotations() to WasmEngine.
Phase 4: Frontend display
Create useNotes.js and AnnotatedText.vue. Integrate behind a feature flag.
Update the build script to copy note files.
Phase 5: Frontend creation
Create useTextSelection.js and NoteCreator.vue.
Phase 6: CI validation + ambiguity use case
Add validate-annotations to the quality gate (orphaned notes as warnings). Add
ambiguity BDD scenarios and one concrete ambiguity note in the corpus.
| File/Directory | Change |
|---|---|
| packages/engine/src/annotation/ | New: resolver module (mod.rs, types.rs, resolver.rs) |
| packages/engine/src/lib.rs | Add pub mod annotation and re-exports |
| packages/engine/src/wasm.rs | Add resolve_annotation(), resolve_annotations() |
| packages/engine/Cargo.toml | Add strsim dependency |
| schema/v0.5.2/annotation-schema.json | New: JSON Schema for note files |
| corpus/annotations/ | New: note sidecar files + _vocabulary/ambiguity.yaml |
| features/notes.feature | New: BDD scenarios (ported + ambiguity) |
| packages/engine/tests/bdd/steps/ | New: note step definitions |
| frontend/src/components/AnnotatedText.vue | New: text with highlights |
| frontend/src/components/NoteCreator.vue | New: note creation form |
| frontend/src/composables/useNotes.js | New: note loading and resolution |
| frontend/src/composables/useTextSelection.js | New: text selection capture |
| frontend/src/EditorApp.vue | Add Notities tab, use AnnotatedText behind a flag |
| frontend/scripts/copy-laws.js | Copy note files to public/data/annotations/ |
| Justfile | Add validate-annotations recipe |
open_terms and implements

An exploration by Bureau Architectuur of the Dutch Ministry of the Interior into the possibilities of transparent, executable legislation.
Bureau Architectuur
Ministry of the Interior and Kingdom Relations