Commit 62f2f3a

Merged main:522c1d0eeaa2 into amd-gfx:286fb304cde7
Local branch amd-gfx 286fb30 Merged main:71c83fb8b618 into amd-gfx:71157ae85e25
Remote branch main 522c1d0 [mlir][gpu][bufferization] Implement BufferDeallocationOpInterface for gpu.terminator (llvm#66880)
Change-Id: Ia9139d64b9c0614f8597c2ba63023ae3a7a10051
2 parents 286fb30 + 522c1d0 commit 62f2f3a

788 files changed: +27069 -7243 lines changed

.github/workflows/issue-release-workflow.yml

Lines changed: 1 addition & 1 deletion
@@ -70,7 +70,7 @@ jobs:
     if: >-
       (github.repository == 'llvm/llvm-project') &&
       !startswith(github.event.comment.body, '<!--IGNORE-->') &&
-      contains(github.event.comment.body, '/branch')
+      contains(github.event.comment.body, '/branch ')

     steps:
       - name: Fetch LLVM sources

bolt/runtime/CMakeLists.txt

Lines changed: 5 additions & 1 deletion
@@ -1,4 +1,5 @@
 cmake_minimum_required(VERSION 3.20.0)
+include(CheckCXXCompilerFlag)
 include(CheckIncludeFiles)
 include(GNUInstallDirs)

@@ -33,7 +34,10 @@ if (CMAKE_SYSTEM_PROCESSOR MATCHES "x86_64")
   set(BOLT_RT_FLAGS ${BOLT_RT_FLAGS} "-mno-sse")
 endif()
 if (CMAKE_SYSTEM_PROCESSOR MATCHES "aarch64")
-  set(BOLT_RT_FLAGS ${BOLT_RT_FLAGS} "-mno-outline-atomics")
+  check_cxx_compiler_flag("-mno-outline-atomics" CXX_SUPPORTS_OUTLINE_ATOMICS)
+  if (CXX_SUPPORTS_OUTLINE_ATOMICS)
+    set(BOLT_RT_FLAGS ${BOLT_RT_FLAGS} "-mno-outline-atomics")
+  endif()
 endif()

 # Don't let the compiler think it can create calls to standard libs
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+// RUN: %check_clang_tidy %s bugprone-inc-dec-in-conditions %t
+
+_BitInt(8) v_401_0() {
+  0 && ({
+    _BitInt(5) y = 0;
+    16777215wb ?: ++y;
+  });
+}
+// CHECK-MESSAGES: warning

clang/docs/OpenMPSupport.rst

Lines changed: 14 additions & 45 deletions
@@ -16,15 +16,12 @@
 OpenMP Support
 ==============

-Clang fully supports OpenMP 4.5. Clang supports offloading to X86_64, AArch64,
-PPC64[LE] and has `basic support for Cuda devices`_.
-
-* #pragma omp declare simd: :part:`Partial`. We support parsing/semantic
-  analysis + generation of special attributes for X86 target, but still
-  missing the LLVM pass for vectorization.
+Clang fully supports OpenMP 4.5, almost all of 5.0 and most of 5.1/2.
+Clang supports offloading to X86_64, AArch64, PPC64[LE], NVIDIA GPUs (all models) and AMD GPUs (all models).

 In addition, the LLVM OpenMP runtime `libomp` supports the OpenMP Tools
 Interface (OMPT) on x86, x86_64, AArch64, and PPC64 on Linux, Windows, and macOS.
+OMPT is also supported for NVIDIA and AMD GPUs.

 For the list of supported features from OpenMP 5.0 and 5.1
 see `OpenMP implementation details`_ and `OpenMP 51 implementation details`_.
@@ -36,42 +33,16 @@ General improvements
   collapse clause by replacing the expensive remainder operation with
   multiplications and additions.

-- The default schedules for the `distribute` and `for` constructs in a
-  parallel region and in SPMD mode have changed to ensure coalesced
-  accesses. For the `distribute` construct, a static schedule is used
-  with a chunk size equal to the number of threads per team (default
-  value of threads or as specified by the `thread_limit` clause if
-  present). For the `for` construct, the schedule is static with chunk
-  size of one.
-
-- Simplified SPMD code generation for `distribute parallel for` when
-  the new default schedules are applicable.
-
 - When using the collapse clause on a loop nest the default behavior
   is to automatically extend the representation of the loop counter to
   64 bits for the cases where the sizes of the collapsed loops are not
   known at compile time. To prevent this conservative choice and use
   at most 32 bits, compile your program with the
   `-fopenmp-optimistic-collapse`.

-.. _basic support for Cuda devices:
-
-Cuda devices support
-====================

-Directives execution modes
---------------------------
-
-Clang code generation for target regions supports two modes: the SPMD and
-non-SPMD modes. Clang chooses one of these two modes automatically based on the
-way directives and clauses on those directives are used. The SPMD mode uses a
-simplified set of runtime functions thus increasing performance at the cost of
-supporting some OpenMP features. The non-SPMD mode is the most generic mode and
-supports all currently available OpenMP features. The compiler will always
-attempt to use the SPMD mode wherever possible. SPMD mode will not be used if:
-
-   - The target region contains user code (other than OpenMP-specific
-     directives) in between the `target` and the `parallel` directives.
+GPU devices support
+===================

 Data-sharing modes
 ------------------
@@ -82,8 +53,9 @@ performance and can be activated using the `-fopenmp-cuda-mode` flag. In
 `Generic` mode all local variables that can be shared in the parallel regions
 are stored in the global memory. In `Cuda` mode local variables are not shared
 between the threads and it is user responsibility to share the required data
-between the threads in the parallel regions.
-
+between the threads in the parallel regions. Often, the optimizer is able to
+reduce the cost of `Generic` mode to the level of `Cuda` mode, but the flag,
+as well as other assumption flags, can be used for tuning.

 Features not supported or with limited support for Cuda devices
 ---------------------------------------------------------------
@@ -96,9 +68,6 @@ Features not supported or with limited support for Cuda devices

 - Nested parallelism: inner parallel regions are executed sequentially.

-- Automatic translation of math functions in target regions to device-specific
-  math functions is not implemented yet.
-
 - Debug information for OpenMP target regions is supported, but sometimes it may
   be required to manually specify the address class of the inspected variables.
   In some cases the local variables are actually allocated in the global memory,
@@ -139,7 +108,7 @@ implementation.
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
 | memory management | allocate directive and allocate clause | :good:`done` | r355614,r335952 |
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
-| OMPD | OMPD interfaces | :part:`done` | https://reviews.llvm.org/D99914 (Supports only HOST(CPU) and Linux |
+| OMPD | OMPD interfaces | :good:`done` | https://reviews.llvm.org/D99914 (Supports only HOST(CPU) and Linux |
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
 | OMPT | OMPT interfaces (callback support) | :good:`done` | |
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
@@ -171,7 +140,7 @@ implementation.
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
 | device | infer target functions from initializers | :part:`worked on` | |
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
-| device | infer target variables from initializers | :part:`done` | D146418 |
+| device | infer target variables from initializers | :good:`done` | D146418 |
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
 | device | OMP_TARGET_OFFLOAD environment variable | :good:`done` | D50522 |
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
@@ -217,7 +186,7 @@ implementation.
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
 | device | support close modifier on map clause | :good:`done` | D55719,D55892 |
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
-| device | teams construct on the host device | :part:`done` | r371553 |
+| device | teams construct on the host device | :good:`done` | r371553 |
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
 | device | support non-contiguous array sections for target update | :good:`done` | |
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
@@ -235,15 +204,15 @@ implementation.
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
 | misc | library shutdown (omp_pause_resource[_all]) | :good:`done` | D55078 |
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
-| misc | metadirectives | :part:`worked on` | D91944 |
+| misc | metadirectives | :part:`mostly done` | D91944 |
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
 | misc | conditional modifier for lastprivate clause | :good:`done` | |
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
 | misc | iterator and multidependences | :good:`done` | |
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
 | misc | depobj directive and depobj dependency kind | :good:`done` | |
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
-| misc | user-defined function variants | :part:`worked on` | D67294, D64095, D71847, D71830, D109635 |
+| misc | user-defined function variants | :good:`done`. | D67294, D64095, D71847, D71830, D109635 |
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
 | misc | pointer/reference to pointer based array reductions | :good:`done` | |
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
@@ -298,7 +267,7 @@ implementation.
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
 | device | indirect clause on declare target directive | :none:`unclaimed` | |
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
-| device | allow virtual functions calls for mapped object on device | :none:`unclaimed` | |
+| device | allow virtual functions calls for mapped object on device | :part:`partial` | |
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
 | device | interop construct | :part:`partial` | parsing/sema done: D98558, D98834, D98815 |
 +------------------------------+--------------------------------------------------------------+--------------------------+-----------------------------------------------------------------------+
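
Note on the `-fopenmp-optimistic-collapse` paragraph in the OpenMPSupport.rst hunks above: the following is a minimal C++ sketch of why a collapsed loop counter may need 64 bits when the loop sizes are only known at run time. The helper names (`collapsedTripCount`, `fitsIn32Bits`, `sumCollapsed`) are hypothetical and are not part of Clang or the OpenMP runtime; the sketch only illustrates the arithmetic and makes no claim about how Clang generates code. When compiled without `-fopenmp`, the pragma is skipped and the loop nest runs serially.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: iteration count of a collapsed two-deep loop nest,
// computed in 64 bits so that the product of two 32-bit bounds cannot wrap.
inline std::int64_t collapsedTripCount(std::int32_t N, std::int32_t M) {
  return static_cast<std::int64_t>(N) * static_cast<std::int64_t>(M);
}

// Hypothetical helper: true when a 32-bit signed counter could hold Count.
inline bool fitsIn32Bits(std::int64_t Count) {
  return Count <= std::numeric_limits<std::int32_t>::max();
}

// A collapsed loop nest whose bounds are only known at run time.
// Each iteration adds its flattened index I * M + J to the sum.
inline std::int64_t sumCollapsed(std::int32_t N, std::int32_t M) {
  std::int64_t Sum = 0;
#ifdef _OPENMP
#pragma omp parallel for collapse(2) reduction(+ : Sum)
#endif
  for (std::int32_t I = 0; I < N; ++I)
    for (std::int32_t J = 0; J < M; ++J)
      Sum += static_cast<std::int64_t>(I) * M + J;
  return Sum;
}
```

With both bounds equal to 70000, the collapsed iteration space has 4,900,000,000 iterations, which does not fit in a signed 32-bit counter. The optimistic flag is therefore only safe when the programmer knows the product stays small.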

clang/docs/ReleaseNotes.rst

Lines changed: 6 additions & 0 deletions
@@ -287,6 +287,9 @@ Bug Fixes to C++ Support
   a non-template inner-class between the function and the class template.
   (`#65810 <https://github.com/llvm/llvm-project/issues/65810>`_)

+- Fix a crash when calling a non-constant immediate function
+  in the initializer of a static data member.
+  (`#65985 <https://github.com/llvm/llvm-project/issues/65985>`_).
 - Clang now properly converts static lambda call operator to function
   pointers on win32.
   (`#62594 <https://github.com/llvm/llvm-project/issues/62594>`_)
@@ -305,6 +308,9 @@ Bug Fixes to C++ Support
   that contains a `return`.
   (`#48527 <https://github.com/llvm/llvm-project/issues/48527>`_)

+- Clang now no longer asserts when an UnresolvedLookupExpr is used as an
+  expression requirement. (`#66612 <https://github.com/llvm/llvm-project/issues/66612>`_)
+
 Bug Fixes to AST Handling
 ^^^^^^^^^^^^^^^^^^^^^^^^^
 - Fixed an import failure of recursive friend class template.

clang/include/clang/AST/Type.h

Lines changed: 1 addition & 1 deletion
@@ -6642,7 +6642,7 @@ class BitIntType final : public Type, public llvm::FoldingSetNode {
   bool isSugared() const { return false; }
   QualType desugar() const { return QualType(this, 0); }

-  void Profile(llvm::FoldingSetNodeID &ID) {
+  void Profile(llvm::FoldingSetNodeID &ID) const {
     Profile(ID, isUnsigned(), getNumBits());
   }

clang/include/clang/Analysis/Analyses/ThreadSafetyCommon.h

Lines changed: 1 addition & 1 deletion
@@ -361,7 +361,7 @@ class SExprBuilder {
     unsigned NumArgs = 0;

     // Function arguments
-    const Expr *const *FunArgs = nullptr;
+    llvm::PointerUnion<const Expr *const *, til::SExpr *> FunArgs = nullptr;

     // is Self referred to with -> or .?
     bool SelfArrow = false;

clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h

Lines changed: 15 additions & 3 deletions
@@ -215,6 +215,7 @@ class DependencyScanningFilesystemLocalCache {
 public:
   /// Returns entry associated with the filename or nullptr if none is found.
   const CachedFileSystemEntry *findEntryByFilename(StringRef Filename) const {
+    assert(llvm::sys::path::is_absolute_gnu(Filename));
     auto It = Cache.find(Filename);
     return It == Cache.end() ? nullptr : It->getValue();
   }
@@ -224,6 +225,7 @@ class DependencyScanningFilesystemLocalCache {
   const CachedFileSystemEntry &
   insertEntryForFilename(StringRef Filename,
                          const CachedFileSystemEntry &Entry) {
+    assert(llvm::sys::path::is_absolute_gnu(Filename));
     const auto *InsertedEntry = Cache.insert({Filename, &Entry}).first->second;
     assert(InsertedEntry == &Entry && "entry already present");
     return *InsertedEntry;
@@ -282,13 +284,14 @@ class DependencyScanningWorkerFilesystem : public llvm::vfs::ProxyFileSystem {
 public:
   DependencyScanningWorkerFilesystem(
       DependencyScanningFilesystemSharedCache &SharedCache,
-      IntrusiveRefCntPtr<llvm::vfs::FileSystem> FS)
-      : ProxyFileSystem(std::move(FS)), SharedCache(SharedCache) {}
+      IntrusiveRefCntPtr<llvm::vfs::FileSystem> FS);

   llvm::ErrorOr<llvm::vfs::Status> status(const Twine &Path) override;
   llvm::ErrorOr<std::unique_ptr<llvm::vfs::File>>
   openFileForRead(const Twine &Path) override;

+  std::error_code setCurrentWorkingDirectory(const Twine &Path) override;
+
   /// Returns entry for the given filename.
   ///
   /// Attempts to use the local and shared caches first, then falls back to
@@ -304,8 +307,11 @@ class DependencyScanningWorkerFilesystem : public llvm::vfs::ProxyFileSystem {
   /// For a filename that's not yet associated with any entry in the caches,
   /// uses the underlying filesystem to either look up the entry based in the
   /// shared cache indexed by unique ID, or creates new entry from scratch.
+  /// \p FilenameForLookup will always be an absolute path, and different than
+  /// \p OriginalFilename if \p OriginalFilename is relative.
   llvm::ErrorOr<const CachedFileSystemEntry &>
-  computeAndStoreResult(StringRef Filename);
+  computeAndStoreResult(StringRef OriginalFilename,
+                        StringRef FilenameForLookup);

   /// Scan for preprocessor directives for the given entry if necessary and
   /// returns a wrapper object with reference semantics.
@@ -388,6 +394,12 @@ class DependencyScanningWorkerFilesystem : public llvm::vfs::ProxyFileSystem {
   /// The local cache is used by the worker thread to cache file system queries
   /// locally instead of querying the global cache every time.
   DependencyScanningFilesystemLocalCache LocalCache;
+
+  /// The working directory to use for making relative paths absolute before
+  /// using them for cache lookups.
+  llvm::ErrorOr<std::string> WorkingDirForCacheLookup;
+
+  void updateWorkingDirForCacheLookup();
 };

 } // end namespace dependencies
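
Note on the DependencyScanningFilesystem.h hunks above: the new comments say that relative paths are made absolute against a working directory before they are used for cache lookups, and the new asserts require absolute keys. The following is a minimal sketch of that idea in standalone C++, assuming POSIX-style paths. `isAbsolutePosix`, `makeLookupKey` and `LocalCacheSketch` are hypothetical names, not LLVM APIs, and the absoluteness check is a simplification rather than a reimplementation of `llvm::sys::path::is_absolute_gnu`.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical, POSIX-only simplification: a path is absolute when it
// starts with '/'. LLVM's is_absolute_gnu accepts more forms than this.
inline bool isAbsolutePosix(const std::string &Path) {
  return !Path.empty() && Path.front() == '/';
}

// Builds the key used for cache lookups. Absolute filenames are returned
// unchanged; relative ones are prefixed with the working directory.
inline std::string makeLookupKey(const std::string &OriginalFilename,
                                 const std::string &WorkingDir) {
  if (isAbsolutePosix(OriginalFilename))
    return OriginalFilename;
  std::string Key = WorkingDir;
  if (Key.empty() || Key.back() != '/')
    Key += '/';
  Key += OriginalFilename;
  return Key;
}

// Toy cache keyed by absolute path, mirroring the asserts added in the diff.
class LocalCacheSketch {
public:
  // Returns the cached contents for Key, or nullptr if none is stored.
  const std::string *find(const std::string &Key) const {
    assert(isAbsolutePosix(Key) && "cache keys must be absolute");
    auto It = Entries.find(Key);
    return It == Entries.end() ? nullptr : &It->second;
  }

  // Stores Contents under Key; an existing entry is left untouched.
  void insert(const std::string &Key, std::string Contents) {
    assert(isAbsolutePosix(Key) && "cache keys must be absolute");
    Entries.emplace(Key, std::move(Contents));
  }

private:
  std::unordered_map<std::string, std::string> Entries;
};
```

Keying on the absolute form means the same relative name seen from two different working directories maps to two distinct entries, so a change of working directory cannot return a stale result for another file.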

clang/lib/AST/StmtProfile.cpp

Lines changed: 7 additions & 1 deletion
@@ -1333,7 +1333,13 @@ void StmtProfiler::VisitPredefinedExpr(const PredefinedExpr *S) {
 void StmtProfiler::VisitIntegerLiteral(const IntegerLiteral *S) {
   VisitExpr(S);
   S->getValue().Profile(ID);
-  ID.AddInteger(S->getType()->castAs<BuiltinType>()->getKind());
+
+  QualType T = S->getType();
+  ID.AddInteger(T->getTypeClass());
+  if (auto BitIntT = T->getAs<BitIntType>())
+    BitIntT->Profile(ID);
+  else
+    ID.AddInteger(T->castAs<BuiltinType>()->getKind());
 }

 void StmtProfiler::VisitFixedPointLiteral(const FixedPointLiteral *S) {
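
Note on the StmtProfile.cpp hunk above: the new code profiles an integer literal by its value, then its type class, then either the `_BitInt` details or the builtin kind. The following is a minimal standalone sketch of that scheme. `ProfileID`, `TypeClass`, `LiteralType` and `profileIntegerLiteral` are hypothetical stand-ins, not Clang types, and a plain vector stands in for `llvm::FoldingSetNodeID`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a folding-set ID: an ordered list of integers.
using ProfileID = std::vector<std::uint64_t>;

// Hypothetical, reduced model of the two type classes the hunk handles.
enum class TypeClass : std::uint64_t { Builtin = 0, BitInt = 1 };

struct LiteralType {
  TypeClass Class;
  unsigned BuiltinKind; // only meaningful when Class == Builtin
  bool IsUnsigned;      // only meaningful when Class == BitInt
  unsigned NumBits;     // only meaningful when Class == BitInt
};

// Mirrors the order used in the diff: value, type class, then either the
// bit-precise details (signedness and width) or the builtin kind.
inline ProfileID profileIntegerLiteral(std::uint64_t Value,
                                       const LiteralType &T) {
  ProfileID ID;
  ID.push_back(Value);
  ID.push_back(static_cast<std::uint64_t>(T.Class));
  if (T.Class == TypeClass::BitInt) {
    ID.push_back(T.IsUnsigned ? 1u : 0u);
    ID.push_back(T.NumBits);
  } else {
    ID.push_back(T.BuiltinKind);
  }
  return ID;
}
```

In this sketch, two literals with the same value but different `_BitInt` widths get different profiles, and a literal of a bit-precise type no longer has to be treated as a builtin type in order to be profiled.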
