[clang][bytecode] Implement __builtin_{wcscmp,wcsncmp} #132723

Merged (1 commit) on Mar 24, 2025

Conversation

tbaederr
Contributor

No description provided.

@llvmbot added the labels clang (Clang issues not falling into any other category) and clang:frontend (Language frontend issues, e.g. anything involving "Sema") on Mar 24, 2025
@llvmbot
Member

llvmbot commented Mar 24, 2025

@llvm/pr-subscribers-clang

Author: Timm Baeder (tbaederr)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/132723.diff

2 Files Affected:

  • (modified) clang/lib/AST/ByteCode/InterpBuiltin.cpp (+26-2)
  • (modified) clang/test/AST/ByteCode/builtin-functions.cpp (+46)
diff --git a/clang/lib/AST/ByteCode/InterpBuiltin.cpp b/clang/lib/AST/ByteCode/InterpBuiltin.cpp
index 285ea7151a9cf..5ea1d36a148ae 100644
--- a/clang/lib/AST/ByteCode/InterpBuiltin.cpp
+++ b/clang/lib/AST/ByteCode/InterpBuiltin.cpp
@@ -212,11 +212,13 @@ static bool interp__builtin_strcmp(InterpState &S, CodePtr OpPC,
   const Pointer &A = getParam<Pointer>(Frame, 0);
   const Pointer &B = getParam<Pointer>(Frame, 1);
 
-  if (ID == Builtin::BIstrcmp || ID == Builtin::BIstrncmp)
+  if (ID == Builtin::BIstrcmp || ID == Builtin::BIstrncmp ||
+      ID == Builtin::BIwcscmp || ID == Builtin::BIwcsncmp)
     diagnoseNonConstexprBuiltin(S, OpPC, ID);
 
   uint64_t Limit = ~static_cast<uint64_t>(0);
-  if (ID == Builtin::BIstrncmp || ID == Builtin::BI__builtin_strncmp)
+  if (ID == Builtin::BIstrncmp || ID == Builtin::BI__builtin_strncmp ||
+      ID == Builtin::BIwcsncmp || ID == Builtin::BI__builtin_wcsncmp)
     Limit = peekToAPSInt(S.Stk, *S.getContext().classify(Call->getArg(2)))
                 .getZExtValue();
 
@@ -231,6 +233,9 @@ static bool interp__builtin_strcmp(InterpState &S, CodePtr OpPC,
   if (A.isDummy() || B.isDummy())
     return false;
 
+  bool IsWide = ID == Builtin::BIwcscmp || ID == Builtin::BIwcsncmp ||
+                ID == Builtin::BI__builtin_wcscmp ||
+                ID == Builtin::BI__builtin_wcsncmp;
   assert(A.getFieldDesc()->isPrimitiveArray());
   assert(B.getFieldDesc()->isPrimitiveArray());
 
@@ -248,6 +253,21 @@ static bool interp__builtin_strcmp(InterpState &S, CodePtr OpPC,
         !CheckRange(S, OpPC, PB, AK_Read)) {
       return false;
     }
+
+    if (IsWide)
+      INT_TYPE_SWITCH(
+          *S.getContext().classify(S.getASTContext().getWCharType()), {
+            T A = PA.deref<T>();
+            T B = PB.deref<T>();
+            if (A < B) {
+              pushInteger(S, -1, Call->getType());
+              return true;
+            } else if (A > B) {
+              pushInteger(S, 1, Call->getType());
+              return true;
+            }
+          });
+
     uint8_t CA = PA.deref<uint8_t>();
     uint8_t CB = PB.deref<uint8_t>();
 
@@ -2120,6 +2140,10 @@ bool InterpretBuiltin(InterpState &S, CodePtr OpPC, const Function *F,
   case Builtin::BIstrcmp:
   case Builtin::BI__builtin_strncmp:
   case Builtin::BIstrncmp:
+  case Builtin::BI__builtin_wcsncmp:
+  case Builtin::BIwcsncmp:
+  case Builtin::BI__builtin_wcscmp:
+  case Builtin::BIwcscmp:
     if (!interp__builtin_strcmp(S, OpPC, Frame, F, Call))
       return false;
     break;
diff --git a/clang/test/AST/ByteCode/builtin-functions.cpp b/clang/test/AST/ByteCode/builtin-functions.cpp
index 8cba1ec2e5b3b..f42ff49456af6 100644
--- a/clang/test/AST/ByteCode/builtin-functions.cpp
+++ b/clang/test/AST/ByteCode/builtin-functions.cpp
@@ -22,6 +22,8 @@ extern "C" {
   extern char *strchr(const char *s, int c);
   extern wchar_t *wmemchr(const wchar_t *s, wchar_t c, size_t n);
   extern wchar_t *wcschr(const wchar_t *s, wchar_t c);
+  extern int wcscmp(const wchar_t *s1, const wchar_t *s2);
+  extern int wcsncmp(const wchar_t *s1, const wchar_t *s2, size_t n);
 }
 
 namespace strcmp {
@@ -66,6 +68,50 @@ namespace strcmp {
   static_assert(__builtin_strncmp("abab\0banana", "abab\0canada", 100) == 0);
 }
 
+namespace WcsCmp {
+  constexpr wchar_t kFoobar[6] = {L'f',L'o',L'o',L'b',L'a',L'r'};
+  constexpr wchar_t kFoobazfoobar[12] = {L'f',L'o',L'o',L'b',L'a',L'z',L'f',L'o',L'o',L'b',L'a',L'r'};
+
+  static_assert(__builtin_wcscmp(L"abab", L"abab") == 0);
+  static_assert(__builtin_wcscmp(L"abab", L"abba") == -1);
+  static_assert(__builtin_wcscmp(L"abab", L"abaa") == 1);
+  static_assert(__builtin_wcscmp(L"ababa", L"abab") == 1);
+  static_assert(__builtin_wcscmp(L"abab", L"ababa") == -1);
+  static_assert(__builtin_wcscmp(L"abab\0banana", L"abab") == 0);
+  static_assert(__builtin_wcscmp(L"abab", L"abab\0banana") == 0);
+  static_assert(__builtin_wcscmp(L"abab\0banana", L"abab\0canada") == 0);
+#if __WCHAR_WIDTH__ == 32
+  static_assert(__builtin_wcscmp(L"a\x83838383", L"a") == (wchar_t)-1U >> 31);
+#endif
+  static_assert(__builtin_wcscmp(0, L"abab") == 0); // both-error {{not an integral constant}} \
+                                                    // both-note {{dereferenced null}}
+  static_assert(__builtin_wcscmp(L"abab", 0) == 0); // both-error {{not an integral constant}} \
+                                                    // both-note {{dereferenced null}}
+
+  static_assert(__builtin_wcscmp(kFoobar, kFoobazfoobar) == -1);
+  static_assert(__builtin_wcscmp(kFoobar, kFoobazfoobar + 6) == 0); // both-error {{not an integral constant}} \
+                                                                    // both-note {{dereferenced one-past-the-end}}
+
+  static_assert(__builtin_wcsncmp(L"abaa", L"abba", 5) == -1);
+  static_assert(__builtin_wcsncmp(L"abaa", L"abba", 4) == -1);
+  static_assert(__builtin_wcsncmp(L"abaa", L"abba", 3) == -1);
+  static_assert(__builtin_wcsncmp(L"abaa", L"abba", 2) == 0);
+  static_assert(__builtin_wcsncmp(L"abaa", L"abba", 1) == 0);
+  static_assert(__builtin_wcsncmp(L"abaa", L"abba", 0) == 0);
+  static_assert(__builtin_wcsncmp(0, 0, 0) == 0);
+  static_assert(__builtin_wcsncmp(L"abab\0banana", L"abab\0canada", 100) == 0);
+#if __WCHAR_WIDTH__ == 32
+  static_assert(__builtin_wcsncmp(L"a\x83838383", L"aa", 2) ==
+                (wchar_t)-1U >> 31);
+#endif
+
+  static_assert(__builtin_wcsncmp(kFoobar, kFoobazfoobar, 6) == -1);
+  static_assert(__builtin_wcsncmp(kFoobar, kFoobazfoobar, 7) == -1);
+  static_assert(__builtin_wcsncmp(kFoobar, kFoobazfoobar + 6, 6) == 0);
+  static_assert(__builtin_wcsncmp(kFoobar, kFoobazfoobar + 6, 7) == 0); // both-error {{not an integral constant}} \
+                                                                        // both-note {{dereferenced one-past-the-end}}
+}
+
 /// Copied from constant-expression-cxx11.cpp
 namespace strlen {
 constexpr const char *a = "foo\0quux";
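
The comparison the new tests exercise can be sketched outside of Clang as follows. This is a minimal illustration, not the interpreter's code: `compareWide` is a hypothetical helper, and `std::int32_t` / `std::uint32_t` stand in for a signed and an unsigned 32-bit `wchar_t`. It shows the -1/0/1 results, the effect of the length limit, and why the tests write the expected value for `L"a\x83838383"` as `(wchar_t)-1U >> 31`: that expression is -1 when the character type is signed and 1 when it is unsigned, which matches the ordering of 0x83838383 relative to the terminating zero in each case.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of Clang): compares two zero-terminated
// sequences element by element, looking at no more than `limit` elements.
// Returns -1, 0, or 1, the values the tests in the patch expect.
template <typename CharT>
constexpr int compareWide(const CharT *a, const CharT *b,
                          std::size_t limit = static_cast<std::size_t>(-1)) {
  for (std::size_t i = 0; i < limit; ++i) {
    if (a[i] < b[i])
      return -1;
    if (a[i] > b[i])
      return 1;
    if (a[i] == 0) // the elements are equal here, so both strings ended
      return 0;
  }
  return 0; // limit reached without finding a difference
}

// Same shape as the wcscmp / wcsncmp cases in the test file.
static_assert(compareWide(L"abab", L"abab") == 0, "equal strings");
static_assert(compareWide(L"abab", L"abba") == -1, "first difference is smaller");
static_assert(compareWide(L"ababa", L"abab") == 1, "longer string is greater");
static_assert(compareWide(L"abab\0banana", L"abab\0canada") == 0,
              "comparison stops at the first terminator");
static_assert(compareWide(L"abaa", L"abba", 3) == -1, "difference inside limit");
static_assert(compareWide(L"abaa", L"abba", 2) == 0, "difference beyond limit");

// Stand-ins for a signed and an unsigned 32-bit wchar_t (an assumption made
// for illustration; the real type depends on the target).
constexpr std::int32_t kSignedHigh[] = {
    'a', static_cast<std::int32_t>(0x83838383u), 0};
constexpr std::int32_t kSignedA[] = {'a', 0};
constexpr std::uint32_t kUnsignedHigh[] = {'a', 0x83838383u, 0};
constexpr std::uint32_t kUnsignedA[] = {'a', 0};

// Signed: 0x83838383 is negative, so it sorts before the terminator.
static_assert(compareWide(kSignedHigh, kSignedA) == -1, "signed ordering");
static_assert((std::int32_t)-1U >> 31 == -1, "expected value, signed case");

// Unsigned: 0x83838383 is a large positive value, so it sorts after it.
static_assert(compareWide(kUnsignedHigh, kUnsignedA) == 1, "unsigned ordering");
static_assert((std::uint32_t)-1U >> 31 == 1u, "expected value, unsigned case");
```

Because everything is checked with `static_assert`, compiling the snippet is enough to confirm the expectations; the same helper can also be called at run time.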

@tbaederr merged commit f7aea4d into llvm:main on Mar 24, 2025
14 checks passed
@rorth
Collaborator

rorth commented Mar 24, 2025

This patch broke the Solaris/sparcv9 bot.

@tbaederr
Contributor Author

I saw. I hope #132762 fixes the builder.

@rorth
Collaborator

rorth commented Mar 24, 2025

I fear it won't, unfortunately: the Solaris builders (and no doubt others) are configured to support only the native target (sparc* in the current case), not necessarily x86.

@tbaederr
Contributor Author

If the target is the problem, it should still work though, right? If all RUN lines have a triple set, the native triple doesn't matter. For example, https://github.com/llvm/llvm-project/blob/main/clang/test/AST/ByteCode/builtin-bit-cast-bitfields.cpp does the same thing.
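
As a rough illustration of that point, a test that pins its target might start like the sketch below. The RUN line, the chosen triple, and the `static_assert` are hypothetical; they are not taken from builtin-functions.cpp or from #132762.

```cpp
// Hypothetical lit header: the explicit -triple makes the test evaluate
// target-dependent properties for x86-64 Linux rather than for the host.
// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -std=c++17 -fexperimental-new-constant-interpreter -verify %s
// expected-no-diagnostics

// The guard keeps this snippet compilable on other hosts when it is built
// directly instead of through the RUN line above.
#if defined(__x86_64__) && defined(__linux__)
static_assert(sizeof(wchar_t) == 4, "wchar_t is 32 bits wide on x86-64 Linux");
#endif
```

A pinned triple only helps when the clang binary under test was built with support for that target.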

@rorth
Collaborator

rorth commented Mar 24, 2025

Usually the SPARC-only clang will reject any non-sparc* triple as unsupported. However, there are cases where they still work; I never understood when this works and when it doesn't.

tbaederr added a commit that referenced this pull request Mar 25, 2025
@rorth
Collaborator

rorth commented Mar 25, 2025

Thanks for the reversal. I've meanwhile determined in a local sparcv9-sun-solaris2.11 all-targets build that the test still FAILed even with the X86 target configured.
