Description
Idea from @jdesrosiers on Slack (with minor tweaks from me, and probably a misinterpretation or two, but this is at least good enough to record the general concept):
Instead of discussing $id as primarily assigning URIs to schema objects, shift the focus to schema documents. For reasons that will be apparent later, also say that $id URI references MUST NOT contain a fragment.
The key idea is that schema documents can be embedded in other schema documents. $id is used to indicate an embedded document, and the schema object containing that $id is considered to be the root schema of the embedded document. Whether it is standalone or embedded, a schema document's base URI is the value of $id in its root schema.
An embedded document's $id can be a relative URI reference, in which case it is resolved against the base URI of the containing schema document.
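As a rough illustration of that resolution step, here is a minimal sketch using Python's standard urllib.parse.urljoin, which performs RFC 3986 style reference resolution. It uses the same example.com URIs as the rest of this issue; the variable names are only illustrative.

```python
from urllib.parse import urljoin

# Base URI of the containing schema document (the "$id" of its root schema).
outer_base = "https://example.com/outer"

# A relative "$id" on an embedded document resolves against that base.
inner_base = urljoin(outer_base, "inner")
print(inner_base)  # https://example.com/inner
```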
The contents of embedded documents cannot be referenced with a JSON Pointer fragment attached to the containing document's base URI:
{
  "$id": "https://example.com/outer",
  "additionalProperties": {
    "$id": "inner",
    "items": {...}
  }
}
In this example, the schema that is the value of "items" can be referenced as https://example.com/inner#/items. It cannot be referenced as https://example.com/outer#/additionalProperties/items, which is a change from the current behavior.
The reason for this may be more intuitive when considering this functionally identical schema:
{
"$id": "https://example.com/outer",
"additionalProperties": {"$ref": "inner"}
}
This is essentially the same schema, but with "inner" included by reference rather than directly embedded. In this example, it is clear that https://example.com/outer#/additionalProperties/items is meaningless. There is no such location. Embedding the schema does not make that URI meaningful; essentially, JSON Pointer fragment evaluation cannot cross into an embedded document.
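To make the "cannot cross into an embedded document" rule concrete, here is a minimal sketch of how an implementation might index documents by base URI and refuse to follow a JSON Pointer past an embedded root. The function names index_documents and resolve are hypothetical, the {"type": "string"} value stands in for the elided {...}, and several simplifications are assumed: every nested object is treated as a subschema, percent-decoding of the fragment is skipped, and only JSON Pointer fragments are handled.

```python
from urllib.parse import urljoin, urldefrag


def index_documents(schema, base_uri, index=None):
    """Map each document base URI to the root schema of that document.

    Simplification: every nested dict is treated as a subschema, which is
    not true for keywords such as "enum" or "const".
    """
    if index is None:
        index = {}
    if isinstance(schema, dict):
        if "$id" in schema:
            # "$id" starts a new (possibly embedded) document.
            base_uri = urljoin(base_uri, schema["$id"])
            index[base_uri] = schema
        for value in schema.values():
            index_documents(value, base_uri, index)
    elif isinstance(schema, list):
        for item in schema:
            index_documents(item, base_uri, index)
    return index


def resolve(index, uri):
    """Resolve a URI whose fragment is empty or a JSON Pointer."""
    base, fragment = urldefrag(uri)
    root = index[base]
    node = root
    if fragment and not fragment.startswith("/"):
        raise LookupError("only JSON Pointer fragments are handled here")
    for token in fragment.split("/")[1:]:
        # Refuse to step into an embedded document from its container.
        if node is not root and isinstance(node, dict) and "$id" in node:
            raise LookupError("pointer crosses into an embedded document")
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


outer = {
    "$id": "https://example.com/outer",
    "additionalProperties": {
        "$id": "inner",
        "items": {"type": "string"},  # placeholder for the elided {...}
    },
}
index = index_documents(outer, "")

print(resolve(index, "https://example.com/inner#/items"))
try:
    resolve(index, "https://example.com/outer#/additionalProperties/items")
except LookupError as exc:
    print("rejected:", exc)
```

In this sketch the pointer through the outer document is rejected as soon as it tries to descend below the object carrying "$id": "inner", while the same location is reachable through the embedded document's own base URI.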
In this approach, $id always indicates the base URI of a document. As fragments are stripped from base URIs, it does not make sense to allow fragments in $id when used this way. In fact, RFC 3986 section 4.3 states:
Some protocol elements allow only the absolute form of a URI without
a fragment identifier. For example, defining a base URI for later
use by relative references calls for an absolute-URI syntax rule that
does not allow a fragment.
Therefore, the only time fragments make sense in $id is in the plain name fragment definition form: {"$id": "#foo"}.
This form does not work when considering that $id indicates an embedded document. Because fragments are removed from a URI before it is used as a base, the base URI of such an embedded document would be identical to that of its containing schema document. This is obviously an incorrect usage of URIs, and should not be allowed.
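The collision can be shown with a short sketch, again using Python's urllib.parse as a stand-in for RFC 3986 resolution (the variable names are illustrative only):

```python
from urllib.parse import urljoin, urldefrag

outer_base = "https://example.com/outer"

# Resolve the fragment-only "$id" against the containing document's base.
resolved = urljoin(outer_base, "#foo")

# A base URI never carries a fragment, so strip it before reuse as a base.
embedded_base, fragment = urldefrag(resolved)

print(resolved)
print(embedded_base == outer_base)
```

After the fragment is stripped, the would-be embedded document and its container end up with the same base URI, which is the conflict described above.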
While all of the behavior of $id as specified in draft-07 is simply a result of applying RFC 3986 rules to the hierarchical schema structure, most users seem to view the fragment definition form and the base URI change form as separate features. Since the fragment definition form is not compatible with $id as an embedded document identifier, and many users view it as a different feature anyway, let's drop it.
In its place, the "$anchor" keyword defines plain name fragments. Note that its value is simply the plain name, without the leading #: {"$anchor": "foo"} is the equivalent of the former {"$id": "#foo"}.
This also allows for
{"$id": "https://example.com/foo", "$anchor": "bar"}
to replace
{"$id": "https://example.com/foo#bar"}
which, as far as I can tell, is currently a valid use of $id, although apparently I put a CREF in draft-07 wondering whether or how it should actually work. But if we split that function off into an $anchor keyword and outright forbid fragments in $id, this is no longer a weird corner case. Each keyword functions separately and unambiguously.
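A minimal sketch of how the two keywords could be processed separately under this proposal is shown below. The function name collect_uris is hypothetical, and the sketch assumes that any "#" in $id (including an empty fragment) is rejected and that every nested object is a subschema.

```python
from urllib.parse import urljoin


def collect_uris(schema, base_uri, uris=None):
    """Map every URI defined by "$id" or "$anchor" to its schema object."""
    if uris is None:
        uris = {}
    if isinstance(schema, dict):
        if "$id" in schema:
            # Assumption: the proposal forbids any fragment in "$id".
            if "#" in schema["$id"]:
                raise ValueError("$id must not contain a fragment")
            base_uri = urljoin(base_uri, schema["$id"])
            uris[base_uri] = schema
        if "$anchor" in schema:
            # "$anchor" holds the plain name only, without the "#".
            uris[base_uri + "#" + schema["$anchor"]] = schema
        for value in schema.values():
            collect_uris(value, base_uri, uris)
    elif isinstance(schema, list):
        for item in schema:
            collect_uris(item, base_uri, uris)
    return uris


uris = collect_uris({"$id": "https://example.com/foo", "$anchor": "bar"}, "")
print(sorted(uris))
```

With this split, the combined form {"$id": "https://example.com/foo#bar"} is simply an error, and the same two URIs are produced by the $id and $anchor pair instead.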