Skip to content

The Gordian Knot #277

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 8, 2023
Merged

Conversation

susliko
Copy link
Collaborator

@susliko susliko commented Jun 5, 2023

A startling discovery

changing $.lambda_expression body from $._block to $._indentable_expression results in

  • 11-fold decrease in parser generation times (from 110s to 10s on my laptop)
  • parser size reduction from 41M to 22M
  • resolution of Allow block before lambda_expression #54
  • approximately the same smoke test accuracies (86% in scala2 compiler, 82% in dotty)

However, it brings the following problems:

Problem1: conflict of $.lambda_expression and $.self_type

Self-type in

trait A {
  self =>
  val a = 1
}

is parsed as lambda expression. Seems to be not much of a problem, as it can be resolved by disabling $.lambda_expression in $.templabe_body (that's also how it's in the ebnf)

Problem2: multi-line lambdas without newline before closing paren

x.map((x) => 
if a then b.c)

fails to be parsed, for $._indent token is emmitted and $._outdent is not.

x.map((x) => 
if a then b.c
)

is, on contrary, parsed correctly because of the newline before the closing paren.
For this issue - I can't come up with a solution and will be glad of any ideas

@susliko susliko requested review from eed3si9n, keynmol and ckipp01 June 5, 2023 21:37
@susliko susliko marked this pull request as draft June 5, 2023 21:37
@eed3si9n
Copy link
Collaborator

eed3si9n commented Jun 5, 2023

scala> trait A
// defined trait A

scala> foo(new A:
     |   def x = 1)
val res0: A = anon$1@50df37eb

so at least in this case, the closing parenthesis works as an outdent, so maybe we should support that in a situation where the parser is looking for an outdent.

@eed3si9n
Copy link
Collaborator

eed3si9n commented Jun 6, 2023

Here's my PR for the outdent - #279

@SethTisue
Copy link

😮

@susliko susliko force-pushed the rework-lambda-expressions branch 14 times, most recently from fa01307 to fb9fd67 Compare June 7, 2023 20:55
@susliko
Copy link
Collaborator Author

susliko commented Jun 7, 2023

@eed3si9n Thank you for the fix of outdents! It helped to solve remaining issues in this PR, it's ready for review
Key results from the initial description remained:

  • ~11-fold decrease in parser generation times with 1Gb max memory usage ceiling
  • parser size reduction from 41M to 22M with 1300 as grammar complexity ceiling

@susliko susliko marked this pull request as ready for review June 7, 2023 21:07
@susliko susliko force-pushed the rework-lambda-expressions branch from fb9fd67 to a889c3c Compare June 8, 2023 14:20
Summary
-------
`$.lambda_expression` body was changed from `$._block` to
`$._indentable_expression`. This had the following effects:
* x10 faster parser generation
* parser size reduced from 41M to 24M
* conflict with `$.self_type`, which was resolved by matching
  indent-tokens in `$.template_body`. This change, in its turn required
  scanner.c to stop emitting INDENT and OUTDENT tokens when encountering
  comments
Copy link
Collaborator

@eed3si9n eed3si9n left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @susliko!

@eed3si9n eed3si9n merged commit d244679 into tree-sitter:master Jun 8, 2023
@ckipp01
Copy link
Collaborator

ckipp01 commented Jun 9, 2023

Really great job on this @susliko!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants