Skip to content

[flang][OpenMP] Make parsing of trait properties more context-sensitive #122900

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jan 29, 2025

Conversation

kparzysz
Copy link
Contributor

A trait poperty can be one of serveral alternatives (name, expression, etc.), and each property in a list was parsed as if it could be any of these alternatives independently from other properties. This made the parsing vulnerable to certain ambiguities in the trait grammar (provided in the OpenMP spec).
At the same time the OpenMP spec gives the expected types of properties for almost every trait: all properties listed for a given trait are usually of the same type, e.g. names, clauses, etc.

Incorporate these restrictions into the parser, and additionally use property extensions as the fallback if the parsing of the expected property type failed. This is intended to allow the parser to succeed, and instead let the semantic-checking code emit a more user-friendly message.

A trait poperty can be one of serveral alternatives, and each property
in a list was parsed as if it could be any of these alternatives
independently from other properties. This made the parsing vulnerable
to certain ambiguities in the trait grammar (provided in the OpenMP
spec).
At the same time the OpenMP spec gives the expected types of properties
for almost every trait: all properties listed for a given trait are
usually of the same type, e.g. names, clauses, etc.

Incorporate these restrictions into the parser, and additionally use
property extensions as the fallback if the parsing of the expected
property type failed. This is intended to allow the parser to succeed,
and instead let the semantic-checking code emit a more user-friendly
message.
@llvmbot llvmbot added flang Flang issues not falling into any other category flang:parser labels Jan 14, 2025
@llvmbot
Copy link
Member

llvmbot commented Jan 14, 2025

@llvm/pr-subscribers-flang-parser

Author: Krzysztof Parzyszek (kparzysz)

Changes

A trait poperty can be one of serveral alternatives (name, expression, etc.), and each property in a list was parsed as if it could be any of these alternatives independently from other properties. This made the parsing vulnerable to certain ambiguities in the trait grammar (provided in the OpenMP spec).
At the same time the OpenMP spec gives the expected types of properties for almost every trait: all properties listed for a given trait are usually of the same type, e.g. names, clauses, etc.

Incorporate these restrictions into the parser, and additionally use property extensions as the fallback if the parsing of the expected property type failed. This is intended to allow the parser to succeed, and instead let the semantic-checking code emit a more user-friendly message.


Full diff: https://github.com/llvm/llvm-project/pull/122900.diff

4 Files Affected:

  • (modified) flang/include/flang/Parser/dump-parse-tree.h (+1-1)
  • (modified) flang/include/flang/Parser/parse-tree.h (+13-27)
  • (modified) flang/lib/Parser/openmp-parsers.cpp (+99-26)
  • (modified) flang/lib/Parser/unparse.cpp (+3-2)
diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h
index 11725991e9c9a9..49eeed0e7b4393 100644
--- a/flang/include/flang/Parser/dump-parse-tree.h
+++ b/flang/include/flang/Parser/dump-parse-tree.h
@@ -479,7 +479,7 @@ class ParseTreeDumper {
   NODE(parser, OmpTraitPropertyName)
   NODE(parser, OmpTraitScore)
   NODE(parser, OmpTraitPropertyExtension)
-  NODE(OmpTraitPropertyExtension, ExtensionValue)
+  NODE(OmpTraitPropertyExtension, Complex)
   NODE(parser, OmpTraitProperty)
   NODE(parser, OmpTraitSelectorName)
   NODE_ENUM(OmpTraitSelectorName, Value)
diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h
index 00d85aa05fb3a5..f8175ea1de679e 100644
--- a/flang/include/flang/Parser/parse-tree.h
+++ b/flang/include/flang/Parser/parse-tree.h
@@ -3505,37 +3505,22 @@ struct OmpTraitScore {
 };
 
 // trait-property-extension ->
-//    trait-property-name (trait-property-value, ...)
-// trait-property-value ->
 //    trait-property-name |
-//    scalar-integer-expression |
-//    trait-property-extension
-//
-// The grammar in OpenMP 5.2+ spec is ambiguous, the above is a different
-// version (but equivalent) that doesn't have ambiguities.
-// The ambiguity is in
-//   trait-property:
-//      trait-property-name          <- (a)
-//      trait-property-clause
-//      trait-property-expression    <- (b)
-//      trait-property-extension     <- this conflicts with (a) and (b)
-//   trait-property-extension:
-//      trait-property-name          <- conflict with (a)
-//      identifier(trait-property-extension[, trait-property-extension[, ...]])
-//      constant integer expression  <- conflict with (b)
+//    scalar-expr |
+//    trait-property-name (trait-property-extension, ...)
 //
 struct OmpTraitPropertyExtension {
   CharBlock source;
-  TUPLE_CLASS_BOILERPLATE(OmpTraitPropertyExtension);
-  struct ExtensionValue {
+  UNION_CLASS_BOILERPLATE(OmpTraitPropertyExtension);
+  struct Complex {  // name (prop-ext, prop-ext, ...)
     CharBlock source;
-    UNION_CLASS_BOILERPLATE(ExtensionValue);
-    std::variant<OmpTraitPropertyName, ScalarExpr,
-        common::Indirection<OmpTraitPropertyExtension>>
-        u;
+    TUPLE_CLASS_BOILERPLATE(Complex);
+    std::tuple<OmpTraitPropertyName,
+        std::list<common::Indirection<OmpTraitPropertyExtension>>>
+        t;
   };
-  using ExtensionList = std::list<ExtensionValue>;
-  std::tuple<OmpTraitPropertyName, ExtensionList> t;
+
+  std::variant<OmpTraitPropertyName, ScalarExpr, Complex> u;
 };
 
 // trait-property ->
@@ -3568,9 +3553,10 @@ struct OmpTraitProperty {
 //    UID |               T        // unique-string-id /impl-defined
 //    VENDOR |            I        // name-list (vendor-id /add-def-doc)
 //    EXTENSION |         I        // name-list (ext_name /impl-defined)
-//    ATOMIC_DEFAULT_MEM_ORDER I | // value of admo
+//    ATOMIC_DEFAULT_MEM_ORDER I | // clause-list (value of admo)
 //    REQUIRES |          I        // clause-list (from requires)
 //    CONDITION           U        // logical-expr
+//    <other name>        I        // treated as extension
 //
 // Trait-set-selectors:
 //    [D]evice, [T]arget_device, [C]onstruct, [I]mplementation, [U]ser.
@@ -3579,7 +3565,7 @@ struct OmpTraitSelectorName {
   UNION_CLASS_BOILERPLATE(OmpTraitSelectorName);
   ENUM_CLASS(Value, Arch, Atomic_Default_Mem_Order, Condition, Device_Num,
       Extension, Isa, Kind, Requires, Simd, Uid, Vendor)
-  std::variant<Value, llvm::omp::Directive> u;
+  std::variant<Value, llvm::omp::Directive, std::string> u;
 };
 
 // trait-selector ->
diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp
index 5ff91da082c852..a7b3986845d985 100644
--- a/flang/lib/Parser/openmp-parsers.cpp
+++ b/flang/lib/Parser/openmp-parsers.cpp
@@ -158,31 +158,23 @@ static TypeDeclarationStmt makeIterSpecDecl(std::list<ObjectName> &&names) {
 static std::string nameToString(Name &&name) { return name.ToString(); }
 
 TYPE_PARSER(sourced(construct<OmpTraitPropertyName>( //
-    (space >> charLiteralConstantWithoutKind) ||
-    applyFunction(nameToString, Parser<Name>{}))))
+    construct<OmpTraitPropertyName>(space >> charLiteralConstantWithoutKind) ||
+    construct<OmpTraitPropertyName>(
+        applyFunction(nameToString, Parser<Name>{})))))
 
 TYPE_PARSER(sourced(construct<OmpTraitScore>( //
-    "SCORE" >> parenthesized(scalarIntExpr))))
+    "SCORE"_tok >> parenthesized(scalarIntExpr))))
 
-TYPE_PARSER(sourced(construct<OmpTraitPropertyExtension::ExtensionValue>(
-    // Parse nested extension first.
-    construct<OmpTraitPropertyExtension::ExtensionValue>(
-        indirect(Parser<OmpTraitPropertyExtension>{})) ||
-    construct<OmpTraitPropertyExtension::ExtensionValue>(
-        Parser<OmpTraitPropertyName>{}) ||
-    construct<OmpTraitPropertyExtension::ExtensionValue>(scalarExpr))))
-
-TYPE_PARSER(sourced(construct<OmpTraitPropertyExtension>( //
+TYPE_PARSER(sourced(construct<OmpTraitPropertyExtension::Complex>(
     Parser<OmpTraitPropertyName>{},
     parenthesized(nonemptySeparated(
-        Parser<OmpTraitPropertyExtension::ExtensionValue>{}, ","_tok)))))
+        indirect(Parser<OmpTraitPropertyExtension>{}), ","_tok)))))
 
-TYPE_PARSER(sourced(construct<OmpTraitProperty>(
-    // Try clause first, then extension before OmpTraitPropertyName.
-    construct<OmpTraitProperty>(indirect(Parser<OmpClause>{})) ||
-    construct<OmpTraitProperty>(Parser<OmpTraitPropertyExtension>{}) ||
-    construct<OmpTraitProperty>(Parser<OmpTraitPropertyName>{}) ||
-    construct<OmpTraitProperty>(scalarExpr))))
+TYPE_PARSER(sourced(construct<OmpTraitPropertyExtension>(
+    construct<OmpTraitPropertyExtension>(
+        Parser<OmpTraitPropertyExtension::Complex>{}) ||
+    construct<OmpTraitPropertyExtension>(Parser<OmpTraitPropertyName>{}) ||
+    construct<OmpTraitPropertyExtension>(scalarExpr))))
 
 TYPE_PARSER(construct<OmpTraitSelectorName::Value>(
     "ARCH" >> pure(OmpTraitSelectorName::Value::Arch) ||
@@ -201,15 +193,96 @@ TYPE_PARSER(construct<OmpTraitSelectorName::Value>(
 TYPE_PARSER(sourced(construct<OmpTraitSelectorName>(
     // Parse predefined names first (because of SIMD).
     construct<OmpTraitSelectorName>(Parser<OmpTraitSelectorName::Value>{}) ||
-    construct<OmpTraitSelectorName>(OmpDirectiveNameParser{}))))
+    construct<OmpTraitSelectorName>(OmpDirectiveNameParser{}) ||
+    // identifier-or-string for extensions
+    construct<OmpTraitSelectorName>(
+        applyFunction(nameToString, Parser<Name>{})) ||
+    construct<OmpTraitSelectorName>(space >> charLiteralConstantWithoutKind))))
+
+// Parser for OmpTraitSelector::Properties
+template <typename... PropParser>
+static constexpr auto propertyListParser(PropParser... pp) {
+  // Parse the property list "(score(expr): item1...)" in three steps:
+  // 1. Parse the "("
+  // 2. Parse the optional "score(expr):"
+  // 3. Parse the "item1, ...)", together with the ")".
+  // The reason for including the ")" in the 3rd step is to force parsing
+  // the entire list in each of the alternative property parsers. Otherwise,
+  // the name parser could stop after "foo" in "(foo, bar(1))", without
+  // allowing the next parser to give the list a try.
+
+  using P = OmpTraitProperty;
+  return maybe("(" >>
+      construct<OmpTraitSelector::Properties>(
+          maybe(Parser<OmpTraitScore>{} / ":"_tok),
+          (attempt(nonemptySeparated(construct<P>(pp), ","_tok) / ")"_tok) ||
+              ...)));
+}
+
+// Parser for OmpTraitSelector
+struct TraitSelectorParser {
+  using resultType = OmpTraitSelector;
+
+  constexpr TraitSelectorParser(Parser<OmpTraitSelectorName> p) : np(p) {}
+
+  std::optional<resultType> Parse(ParseState &state) const {
+    auto name{attempt(np).Parse(state)};
+    if (!name.has_value()) {
+      return std::nullopt;
+    }
+
+    // Default fallback parser for lists that cannot be parser using the
+    // primary property parser.
+    auto extParser{Parser<OmpTraitPropertyExtension>{}};
+
+    if (auto *v{std::get_if<OmpTraitSelectorName::Value>(&name->u)}) {
+      switch (*v) {
+      // name-list properties
+      case OmpTraitSelectorName::Value::Arch: // [6.0:319:18]
+      case OmpTraitSelectorName::Value::Extension: // [6.0:319:30]
+      case OmpTraitSelectorName::Value::Isa:  // [6.0:319:15]
+      case OmpTraitSelectorName::Value::Kind: // [6.0:319:10]
+      case OmpTraitSelectorName::Value::Uid: // [6.0:319:23](*)
+      case OmpTraitSelectorName::Value::Vendor: { // [6.0:319:27]
+        auto pp{propertyListParser(Parser<OmpTraitPropertyName>{}, extParser)};
+        return OmpTraitSelector(std::move(*name), std::move(*pp.Parse(state)));
+      }
+      // clause-list
+      case OmpTraitSelectorName::Value::Atomic_Default_Mem_Order:
+        // [6.0:321:26-29](*)
+      case OmpTraitSelectorName::Value::Requires: // [6.0:319:33]
+      case OmpTraitSelectorName::Value::Simd: { // [6.0:318:31]
+        auto pp{propertyListParser(indirect(Parser<OmpClause>{}), extParser)};
+        return OmpTraitSelector(std::move(*name), std::move(*pp.Parse(state)));
+      }
+      // expr-list
+      case OmpTraitSelectorName::Value::Condition: // [6.0:321:33](*)
+      case OmpTraitSelectorName::Value::Device_Num: { // [6.0:321:23-24](*)
+        auto pp{propertyListParser(scalarExpr, extParser)};
+        return OmpTraitSelector(std::move(*name), std::move(*pp.Parse(state)));
+      }
+      // (*) The spec doesn't assign any list-type to these traits, but for
+      // convenience they can be treated as if they were.
+      } // switch
+    } else {
+      // The other alternatives are `llvm::omp::Directive`, and `std::string`.
+      // The former doesn't take any properties[1], the latter is a name of an
+      // extension[2].
+      // [1] [6.0:319:1-2]
+      // [2] [6.0:319:36-37]
+      auto pp{propertyListParser(extParser)};
+      return OmpTraitSelector(std::move(*name), std::move(*pp.Parse(state)));
+    }
 
-TYPE_PARSER(construct<OmpTraitSelector::Properties>(
-    maybe(Parser<OmpTraitScore>{} / ":"_tok),
-    nonemptySeparated(Parser<OmpTraitProperty>{}, ","_tok)))
+    llvm_unreachable("Unhandled trait name?");
+  }
+
+private:
+  const Parser<OmpTraitSelectorName> np;
+};
 
-TYPE_PARSER(sourced(construct<OmpTraitSelector>( //
-    Parser<OmpTraitSelectorName>{}, //
-    maybe(parenthesized(Parser<OmpTraitSelector::Properties>{})))))
+TYPE_PARSER(construct<OmpTraitSelector>(
+    TraitSelectorParser(Parser<OmpTraitSelectorName>{})))
 
 TYPE_PARSER(construct<OmpTraitSetSelectorName::Value>(
     "CONSTRUCT" >> pure(OmpTraitSetSelectorName::Value::Construct) ||
diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp
index 7bf404bba2c3e4..af5259e1daec43 100644
--- a/flang/lib/Parser/unparse.cpp
+++ b/flang/lib/Parser/unparse.cpp
@@ -2075,10 +2075,11 @@ class UnparseVisitor {
     Walk(x.v);
     Put(")");
   }
-  void Unparse(const OmpTraitPropertyExtension &x) {
+  void Unparse(const OmpTraitPropertyExtension::Complex &x) {
+    using PropList = std::list<common::Indirection<OmpTraitPropertyExtension>>;
     Walk(std::get<OmpTraitPropertyName>(x.t));
     Put("(");
-    Walk(std::get<OmpTraitPropertyExtension::ExtensionList>(x.t), ",");
+    Walk(std::get<PropList>(x.t), ",");
     Put(")");
   }
   void Unparse(const OmpTraitSelector &x) {

Copy link

github-actions bot commented Jan 14, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@kparzysz
Copy link
Contributor Author

Ping... This is a part of METADIRECTIVE support.

Comment on lines +180 to +191
"ARCH"_id >> pure(OmpTraitSelectorName::Value::Arch) ||
"ATOMIC_DEFAULT_MEM_ORDER"_id >>
pure(OmpTraitSelectorName::Value::Atomic_Default_Mem_Order) ||
"CONDITION" >> pure(OmpTraitSelectorName::Value::Condition) ||
"DEVICE_NUM" >> pure(OmpTraitSelectorName::Value::Device_Num) ||
"EXTENSION" >> pure(OmpTraitSelectorName::Value::Extension) ||
"ISA" >> pure(OmpTraitSelectorName::Value::Isa) ||
"KIND" >> pure(OmpTraitSelectorName::Value::Kind) ||
"REQUIRES" >> pure(OmpTraitSelectorName::Value::Requires) ||
"SIMD" >> pure(OmpTraitSelectorName::Value::Simd) ||
"UID" >> pure(OmpTraitSelectorName::Value::Uid) ||
"VENDOR" >> pure(OmpTraitSelectorName::Value::Vendor)))
"CONDITION"_id >> pure(OmpTraitSelectorName::Value::Condition) ||
"DEVICE_NUM"_id >> pure(OmpTraitSelectorName::Value::Device_Num) ||
"EXTENSION"_id >> pure(OmpTraitSelectorName::Value::Extension) ||
"ISA"_id >> pure(OmpTraitSelectorName::Value::Isa) ||
"KIND"_id >> pure(OmpTraitSelectorName::Value::Kind) ||
"REQUIRES"_id >> pure(OmpTraitSelectorName::Value::Requires) ||
"SIMD"_id >> pure(OmpTraitSelectorName::Value::Simd) ||
"UID"_id >> pure(OmpTraitSelectorName::Value::Uid) ||
"VENDOR"_id >> pure(OmpTraitSelectorName::Value::Vendor)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was using the _id combinator required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By default the parser will recognize a given string when it's a prefix of another string. For example "ATOMIC" >> ... will trigger for ATOMIC_DEFAULT_MEM_ORDER, causing a syntax error which is rather non-obvious (the parser then looks at _DEFAULT_MEM_ORDER). Adding "_id" to the string ensures that it will only trigger when it's delimited.

I don't remember what exactly I ran into here, but to eliminate the risk of these kinds of issues, I added "_id" to the strings in these parsers.

Do you have concerns with this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually, it is used only when a token is a prefix of another token in the same or group.


// Parser for OmpTraitSelector::Properties
template <typename... PropParser>
static constexpr auto propertyListParser(PropParser... pp) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we document some of these custom openmp parsers separately in the docs page?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parsers related to context selectors are still somewhat work-in-progress. I've been trying to keep them as general as possible (i.e. parsing unknown values in a reasonable manner), but once we handle the details (like arch names, etc.) the structure of these parsers may change a bit. We should document it somewhere once they stabilize.

Comment on lines +292 to +297
"CONSTRUCT"_id >> pure(OmpTraitSetSelectorName::Value::Construct) ||
"DEVICE"_id >> pure(OmpTraitSetSelectorName::Value::Device) ||
"IMPLEMENTATION"_id >>
pure(OmpTraitSetSelectorName::Value::Implementation) ||
"TARGET_DEVICE"_id >> pure(OmpTraitSetSelectorName::Value::Target_Device) ||
"USER"_id >> pure(OmpTraitSetSelectorName::Value::User)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same _id question here as well.

Copy link
Contributor

@kiranchandramohan kiranchandramohan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG.

@kparzysz kparzysz merged commit c859323 into main Jan 29, 2025
8 checks passed
@kparzysz kparzysz deleted the users/kparzysz/spr/m02-context-parse branch January 29, 2025 17:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:parser flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants