Description
IMPORTANT: This changes how meta-schemas are organized but not really how they work.
Relevant to this discussion:
- whether optional vocabs make sense
- URI naming scheme
- adding keyword URIs
- including that keyword URI in the output format
- vocabulary-defining vocabulary concept PR
- defining keyword meta-data
- keyword dependencies definition and its PR
- Precisely this exact topic already raised 🤦
- ideas specifically around machine-readable vocab descriptions that have been expressed to me in Slack DMs over the past few years
I've been thinking about all of these ☝️ things together to get a larger picture of where vocabularies could go. The discussions I've been a part of have all described a vocabulary definition file as serving several purposes:
- enumerating the keywords the vocab defines
- assigning each keyword an ID
- syntactically defining them and providing assertion functionality (i.e. schemas that validate their values) ⭐
- categorizing them into their function (e.g. assertion, annotation, applicator)
  - multiple categories may apply per keyword, e.g. `properties` functions as all of these
Impact to the Meta-Schema
The ⭐ in particular is where the meta-schema is changed. Currently the schema for a keyword's value is contained in the meta-schema body, generally under a `properties` keyword. However, if the vocabulary definition file carries and enforces the schema for a keyword's value, then the meta-schema's entry is redundant. This means that the entire `properties` keyword for a meta-schema could be removed as it's all in the vocab files.
I don't think this is a breaking change, however. A significant reorganization, sure, but the functionality is all still there. Moreover, we can make this change iteratively.
Suppose the only change we make to how the meta-schema is processed is that `$vocabulary` acquires some validation behavior, applying the keyword schemas from all of the vocabularies it lists (it becomes an in-place applicator similar to `properties`). Ideally, those keyword schemas would be the same as what's already in the meta-schema. However, even if they're not, the meta-schema is defining a dialect by virtue of declaring a set of vocabularies. In doing so, it's free to apply additional constraints to keywords.
For example, consider a modified Validation meta-schema where I've required that `enum` have unique values (which isn't a current requirement):
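A minimal sketch of such a meta-schema, assuming the 2020-12 Validation meta-schema as the starting point. Only the `enum` entry is shown; the `$id` and title are hypothetical placeholders, and the remaining keyword entries are assumed to be unchanged from the published meta-schema.

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/meta/validation-unique-enum",
  "$comment": "Hypothetical sketch. Only the enum entry is shown; all other keyword entries are assumed unchanged.",
  "$vocabulary": {
    "https://json-schema.org/draft/2020-12/vocab/validation": true
  },
  "$dynamicAnchor": "meta",
  "title": "Validation vocabulary meta-schema with unique enum values",
  "type": ["object", "boolean"],
  "properties": {
    "enum": {
      "type": "array",
      "items": true,
      "uniqueItems": true
    }
  }
}
```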
`enum`, as defined in the vocabulary, doesn't have the uniqueness constraint. This is actually possible now: the above meta-schema should be supported without any issues.
Now consider adding in-place-applicator / assertion functionality to `$vocabulary` which (for `enum`) enforces the `type` and `items` constraints but not `uniqueItems`. The functionality of this meta-schema is unchanged.
Going further, we could change the original Validation meta-schema to this:
```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://json-schema.org/draft/2020-12/meta/validation",
  "$vocabulary": {
    "https://json-schema.org/draft/2020-12/vocab/validation": true
  },
  "$dynamicAnchor": "meta",
  "title": "Validation vocabulary meta-schema",
  "type": ["object", "boolean"]
}
```
We don't need `properties` because that's only defining the keywords, which are now defined in the vocabulary document identified by `https://json-schema.org/draft/2020-12/vocab/validation`, and we don't need `$defs` because that was only used to support the subschemas in `properties`.
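By the same reasoning, a dialect that wants the stricter `enum` from the earlier example would presumably only need to state the extra constraint itself. A sketch, assuming `$vocabulary` has gained the applicator behavior described above; the `$id` and title are hypothetical:

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/meta/validation-unique-enum",
  "$comment": "Hypothetical sketch. Assumes $vocabulary applies the vocabulary's own keyword schemas, so only the additional constraint is stated here.",
  "$vocabulary": {
    "https://json-schema.org/draft/2020-12/vocab/validation": true
  },
  "$dynamicAnchor": "meta",
  "title": "Validation vocabulary meta-schema with unique enum values",
  "type": ["object", "boolean"],
  "properties": {
    "enum": { "uniqueItems": true }
  }
}
```

Since `uniqueItems` only constrains arrays, this version relies on the vocabulary's keyword schema to enforce that `enum` is an array in the first place.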
In fact we may not even need the vocab meta-schemas anymore. Because the top-level meta-schema lists all of the vocabularies, it would automatically perform all of the validation that the vocab meta-schemas currently provide. We could remove the `allOf`, making it just:
```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://json-schema.org/draft/2020-12/schema",
  "$vocabulary": {
    "https://json-schema.org/draft/2020-12/vocab/core": true,
    "https://json-schema.org/draft/2020-12/vocab/applicator": true,
    "https://json-schema.org/draft/2020-12/vocab/unevaluated": true,
    "https://json-schema.org/draft/2020-12/vocab/validation": true,
    "https://json-schema.org/draft/2020-12/vocab/meta-data": true,
    "https://json-schema.org/draft/2020-12/vocab/format-annotation": true,
    "https://json-schema.org/draft/2020-12/vocab/content": true
  },
  "$dynamicAnchor": "meta",
  "title": "Core and Validation specifications meta-schema",
  "type": ["object", "boolean"]
}
```
(I've also removed the deprecated keywords listing.)
Adoption
First of all, we've agreed that vocabularies and the `$vocabulary` keyword are (at best) unstable, so modifying them (even in a breaking way) isn't out of the question.
Adding in-place-applicator / assertion behavior to `$vocabulary` in the way described above isn't a breaking change as long as we copy the keyword schemas correctly.
Later, once `$vocabulary` is promoted to being a stable feature, we can update the meta-schemas to remove the redundancies.
Readability and Accessibility
There is an issue of readability and accessibility when all of the keywords are defined in vocab files. While most people would be used to just looking in the meta-schema to see what keywords are available and how they're defined, now they'd have to follow another file reference to get that same information.
I don't think this is a big issue, though, and people will eventually get used to it.
On the other hand, creating a new meta-schema is immensely easier: you just list the vocabularies you want, and everything else is taken care of.
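As an illustration, a custom dialect that only wants the Core, Applicator, and Validation vocabularies might look like the sketch below. The `$id` and title are hypothetical, and the sketch assumes `$vocabulary` supplies the keyword schemas for every vocabulary it lists:

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/meta/minimal-dialect",
  "$comment": "Hypothetical dialect. Assumes $vocabulary supplies the keyword schemas for every vocabulary listed.",
  "$vocabulary": {
    "https://json-schema.org/draft/2020-12/vocab/core": true,
    "https://json-schema.org/draft/2020-12/vocab/applicator": true,
    "https://json-schema.org/draft/2020-12/vocab/validation": true
  },
  "$dynamicAnchor": "meta",
  "title": "Minimal dialect meta-schema",
  "type": ["object", "boolean"]
}
```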
Automatic Support for Undefined Keyword Checking
With this in place, implementations will be able to look at the vocab files to see if and how a keyword is defined.
Further, the implementation would be able to detect attempts to circumvent the "keywords must be defined in vocabs" requirement by defining a new keyword directly in the meta-schema. Currently, detecting this is troublesome for implementations (annoying but not impossible).
(There may be some intersection here with `x-` keywords, but I haven't thought about it too hard.)
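To make the circumvention case concrete, here is a sketch of a meta-schema that defines a keyword directly instead of through a vocabulary. The `$id` and the `discount` keyword are invented for illustration and are not defined by any vocabulary listed:

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/meta/sneaky-dialect",
  "$comment": "Hypothetical. 'discount' is an invented keyword that none of the listed vocabularies define.",
  "$vocabulary": {
    "https://json-schema.org/draft/2020-12/vocab/core": true,
    "https://json-schema.org/draft/2020-12/vocab/validation": true
  },
  "$dynamicAnchor": "meta",
  "type": ["object", "boolean"],
  "properties": {
    "discount": { "type": "number" }
  }
}
```

An implementation that can read the vocab files could, in principle, compare the names under `properties` against the keywords those vocabularies define and flag `discount`.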
`$vocabulary` Requires Special Treatment
Currently `$vocabulary` is only to be processed when the schema that contains it is being processed as a meta-schema. I don't think this should change as it only defines what keywords the instance (another schema) can use.
In this way, maybe it does break the nice symmetry we have around "a meta-schema validating a schema" is just "a schema validating an instance." But it could be argued that such symmetry was broken when `$vocabulary` was introduced.
It may have an impact on the Test Suite since we do have a number of tests that validate schemas based on the meta-schema, and they'd need to be updated to pass along the context of "this is a meta-schema evaluation" in order to get the validation result from `$vocabulary`.
Out of scope
I haven't addressed
- what the file might look like, specifically, only that it should contain the things I listed above
- how the value of `$vocabulary` might change (which depends on whether optional vocabs are still worth having, see link at top)
- how the referencing of a vocab file works (would it be an implicit reference or do we need `$ref` in some capacity?)
I'd like to get the concept defined before we start considering mechanics.