Description
This ticket tracks the work remaining to complete adding source location information into sqlparser
Background
- @Nyrox and @iffyio introduced the foundations for storing source location information in the AST nodes in Implement
Spanned
to retrieve source locations on AST nodes #1435. This information can be used to provide more specific error messages, and potentially syntax highlighting among other great things. - In order to 1. minimize the disruption to downstream projects that use
sqlparser-rs
and 2. avoid a single massive PR and 3. work together as a community, we are implementing this feature incrementally over several releases.
Let's use this ticket to organize needed / remaining work. If you find additional features are needed / issues, please leave a comment on this ticket
Source Span Contributing Guidelines
For contributing source spans improvement in addition to the general
contribution guidelines, please make sure to pay attention to the
following:
-
Ident
always have correct source spans -
We try to minimize downstream breaking changes
-
Consider using
Span::union
in favor of storing spans on all nodes -
Any metadata added to compute spans must not change semantics (
Eq
,Ord
,Hash
, etc.). SeeAttachedToken
for more information.
When adding support for source spans on a type, consider the impact to consumers of that type and whether your change would require a consumer to do non-trivial changes to their code.
Example of a trivial change
match node {
ast::Query {
field1,
field2,
location: _, // add a new line to ignored location
}
If adding source spans to a type would require a significant change like wrapping the type, please open an issue to discuss.
# AST Node Equality and Hashes
When adding tokens to AST nodes, make sure to store them using the [AttachedToken](https://docs.rs/sqlparser/latest/sqlparser/ast/helpers/struct.AttachedToken.html) (TODO UPDATE SOURCE REFERENCE)to ensure that semantically equivalent AST nodes compare as equal and hash to the same value. i.e. `select 5` and `SELECT 5` would compare as different `Select` nodes, if the select token was stored directly. f.e.
```rust
struct Select {
select_token: AttachedToken, // only used for spans
/// remaining fields
field1,
field2,
...
}
Some high level work (list from #1435)
- Store keyword
TokenWithLocation
for expressions that currently don't have them - Implement spans for the rest of the AST, namely
Statement
s
Tasks
- Create test pattern for Spans #1563
- Look into reducing AST size with smaller offset sizes (e.g.
u32
rather thanusize
) - Store spans for ast::value::Value