Description
There is an inconsistency in the reported GPU free memory between the Intel Compute Runtime and tools such as xpu-smi. When using the Intel Compute Runtime on Intel Arc(TM) A770 Graphics, the reported free memory value is incorrect, consistently showing the same value as the total memory, even when memory is being consumed. This issue was observed in both Python (dpctl) and a standalone C++ executable.
Steps to Reproduce
- Set up an environment with the Intel Compute Runtime and xpu-smi installed.
- Save the following C++ code as, say, mem.cpp:
#include <iostream>
#include <vector>
#include <string>
#include <sycl/sycl.hpp>

int main(void) {
    // Pick the default device and print its name and driver version.
    sycl::queue q{sycl::default_selector_v};
    const sycl::device &dev = q.get_device();
    const std::string &dev_name = dev.get_info<sycl::info::device::name>();
    const std::string &driver_ver = dev.get_info<sycl::info::device::driver_version>();
    std::cout << "Device: " << dev_name << " [" << driver_ver << "]" << std::endl;

    // Total global memory of the device, in bytes.
    auto global_mem_size = dev.get_info<sycl::info::device::global_mem_size>();
    std::cout << "Global device memory size: " << global_mem_size << " bytes" << std::endl;

    // Free memory is an Intel extension; query it only if the device supports it.
    if (dev.has(sycl::aspect::ext_intel_free_memory)) {
        auto free_memory = dev.get_info<sycl::ext::intel::info::device::free_memory>();
        std::cout << "Free memory: " << free_memory << " bytes" << std::endl;
        std::cout << "Implied memory in use: " << global_mem_size - free_memory << " bytes" << std::endl;
    } else {
        std::cout << "Free memory descriptor is not available" << std::endl;
    }
    return 0;
}
- Compile the code to obtain the binary:
icpx -fsycl mem.cpp -o mem.x
- Execute the compiled binary with the environment variable ZES_ENABLE_SYSMAN set to 1:
export ZES_ENABLE_SYSMAN=1
./mem.x
- Compare the output with the results from xpu-smi:
xpu-smi stats -d 0
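To make the comparison in the last step easier, the byte counts printed by mem.cpp can be converted to MiB. This is only a minimal sketch: it assumes xpu-smi reports memory usage in MiB, and the helpers `implied_used_bytes` and `to_mib_string` are hypothetical names introduced here for illustration, not part of SYCL or xpu-smi. The subtraction also saturates at zero, because both values are unsigned and would wrap around if the runtime ever reported more free memory than total memory.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not part of SYCL or xpu-smi): bytes in use implied by
// the total and free values. Returns 0 instead of wrapping around if the
// reported free memory exceeds the total, since both values are unsigned.
inline std::uint64_t implied_used_bytes(std::uint64_t total_bytes,
                                        std::uint64_t free_bytes) {
    return free_bytes > total_bytes ? 0 : total_bytes - free_bytes;
}

// Hypothetical helper: format a byte count as MiB with two decimal places,
// assuming that is the unit used by the tool being compared against.
inline std::string to_mib_string(std::uint64_t bytes) {
    std::ostringstream os;
    os << std::fixed << std::setprecision(2)
       << static_cast<double>(bytes) / (1024.0 * 1024.0) << " MiB";
    return os.str();
}
```

In mem.cpp, the "Implied memory in use" line could then print `to_mib_string(implied_used_bytes(global_mem_size, free_memory))` instead of the raw difference.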
Observed Behavior
The C++ code consistently reports the same value for global_mem_size and free_memory, implying 0 bytes of used memory, even when memory is being consumed by the GPU. In contrast, xpu-smi correctly reports non-zero GPU memory usage.
Expected Behavior
The free_memory value reported by the Intel Compute Runtime should reflect the actual free memory, showing a decrease when GPU memory is used, consistent with the output from xpu-smi.
Environment Details
- OS: HiveOS (Based on Ubuntu 20.04 and 22.04)
- GPU: Intel(R) Arc(TM) A770 Graphics
- GPU driver versions tested:
  - 1.3.27642
  - 1.3.29735
- Intel Compute Runtime: Relevant versions for the above drivers
- Compiler: Intel DPC++/C++ Compiler (icpx)
Additional Information
This issue is tracked in the dpctl repository here. The problem appears to stem from the GPU driver or the Intel Compute Runtime itself rather than from dpctl, since a standalone C++ executable reproduces it.
Please let me know if further information or testing is required. Thank you for investigating this issue.