[libomptarget][tests] Maximum assumed device heap size. #71747

Open
@Meinersbur

Although the GPU of the openmp-offload-cuda-project and openmp-offload-cuda-runtime buildbots has 4 GiB of memory, the default device heap size returned by cuCtxGetLimit(CU_LIMIT_MALLOC_HEAP_SIZE, ...) is only 8388608 bytes (8 MiB). offloading/malloc.c requires ~55 MB, and offloading/malloc_parallel.c even more.

I don't know how CUDA determines the default heap size limit, but I assume it is constant and inherited from the earliest days of CUDA.

To fix this, we can either

  1. limit the amount of heap allocated by any test to 8 MiB (e.g. by reducing the number of teams in malloc_parallel.c to 48), or
  2. set LIBOMPTARGET_HEAP_SIZE to at least the maximum heap size allocated by any test. The following patch fixes the two malloc tests, reducing the number of failed tests to 30:
diff --git a/openmp/libomptarget/test/lit.cfg b/openmp/libomptarget/test/lit.cfg
index 6dab31bd35a9..e288827c50f6 100644
--- a/openmp/libomptarget/test/lit.cfg
+++ b/openmp/libomptarget/test/lit.cfg
@@ -31,6 +31,8 @@ if 'LIBOMPTARGET_LOCK_MAPPED_HOST_BUFFERS' in os.environ:
 if 'OMP_TARGET_OFFLOAD' in os.environ:
     config.environment['OMP_TARGET_OFFLOAD'] = os.environ['OMP_TARGET_OFFLOAD']

+config.environment['LIBOMPTARGET_HEAP_SIZE'] = '134217728' # 128 MiB
+
 # set default environment variables for test
 if 'CHECK_OPENMP_ENV' in os.environ:
     test_env = os.environ['CHECK_OPENMP_ENV'].split()

A 64 MiB heap is sufficient for the malloc.c test, but not for malloc_parallel.c.
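
As a possible variation on the patch above, lit.cfg could honor a LIBOMPTARGET_HEAP_SIZE already set in the caller's environment and only fall back to 128 MiB otherwise, mirroring how OMP_TARGET_OFFLOAD is passed through. This is only a sketch: the helper name heap_size_for_tests and the constant DEFAULT_HEAP_SIZE are hypothetical and not part of lit.cfg.

```python
import os

# 128 MiB, the value used in the patch above (128 * 1024 * 1024 == 134217728).
DEFAULT_HEAP_SIZE = 128 * 1024 * 1024


def heap_size_for_tests(environ):
    """Return the LIBOMPTARGET_HEAP_SIZE string to hand to the tests.

    Hypothetical helper: prefers a value the caller already set and
    falls back to the 128 MiB default otherwise.
    """
    return environ.get('LIBOMPTARGET_HEAP_SIZE', str(DEFAULT_HEAP_SIZE))


if __name__ == '__main__':
    # In lit.cfg this would be used roughly as:
    #   config.environment['LIBOMPTARGET_HEAP_SIZE'] = heap_size_for_tests(os.environ)
    print(heap_size_for_tests(os.environ))
```

With this, a builder whose GPU needs a different limit could export LIBOMPTARGET_HEAP_SIZE before running the tests without editing lit.cfg.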
