Description
Updates according to @mregmi's and @vbedida79's comments
Summary
GPU workloads cannot access the GPU devices from the container environment unless setsebool container_use_devices on is run on the host.
Detail
GPU workload pods requesting the gpu.intel.com/i915 resource cannot run until they have access to the DRM device nodes under /dev/dri on the GPU node. This access can be granted by running setsebool container_use_devices on on the host node, but doing so manually is not feasible when a cluster has multiple GPU nodes, because the boolean has to be set on each node individually.
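For reference, below is a minimal sketch of a workload pod that requests this resource. The pod name and image are placeholders chosen for illustration, not taken from this issue:

$ oc apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload-example    # hypothetical name
spec:
  restartPolicy: Never
  containers:
  - name: workload
    image: registry.example.com/clinfo:latest    # placeholder image
    command: ["clinfo"]
    resources:
      limits:
        gpu.intel.com/i915: 1    # one Intel GPU from the device plugin
EOF

A pod like this is scheduled onto a GPU node, but without the SELinux change its workload cannot open the DRM device nodes inside the container.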
Root cause
Permission to access the DRM device nodes has not been added to the container_device_t SELinux policy, so SELinux blocks access to the devices and the workload application cannot reach the GPU device node files from the container environment.
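The following commands are a quick way to confirm the state on an affected node. They are standard SELinux tooling (not taken from this issue) and are assumed to be run from a host shell on the GPU node, for example after chroot /host:

$ getsebool container_use_devices           # reports "container_use_devices --> off" on an affected node
$ ls -lZ /dev/dri                           # shows the SELinux type on the GPU device nodes
$ ausearch -m avc -ts recent | grep -i dri  # look for related AVC denials from the workload container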
Solution
- Work with container-selinux upstream to add the needed permission, and make sure a container-selinux release containing the fix is merged into an OCP release.
- Until it is merged into an OCP release, distribute the new policy through the user-container-policy project.
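As an illustration only, an interim local policy module could look roughly like the sketch below. The type names follow the wording of the root cause above (container_t, container_device_t) and should be verified against the actual labels on the node (ls -lZ /dev/dri) before building anything like this; the module name is made up for this example.

$ cat > user_container_devices.te <<'EOF'
module user_container_devices 1.0;

require {
    type container_t;
    type container_device_t;
    class chr_file { getattr open read write ioctl map };
}

# Allow containers to use device nodes labeled container_device_t
allow container_t container_device_t:chr_file { getattr open read write ioctl map };
EOF
$ checkmodule -M -m -o user_container_devices.mod user_container_devices.te
$ semodule_package -o user_container_devices.pp -m user_container_devices.mod
$ semodule -i user_container_devices.pp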
Workaround
To ensure all GPU workloads (clinfo, AI inference) work properly, complete the following steps.
- Find all nodes with an Intel Data Center GPU card using the following command:
$ oc get nodes -l intel.feature.node.kubernetes.io/gpu=true
Example output:
NAME         STATUS   ROLES    AGE   VERSION
icx-dgpu-1   Ready    worker   30d   v1.25.4+18eadca
- Navigate to the node terminal on the web console (Compute -> Nodes -> Select a node -> Terminal). Run the following commands in the terminal. Repeat this step for each remaining node with an Intel Data Center GPU card.
$ chroot /host
$ setsebool container_use_devices on
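If terminal access through the web console is not convenient, the same boolean can be set from the CLI. This is a sketch, assuming cluster-admin access; it reuses the node label from the first command above:

$ for node in $(oc get nodes -l intel.feature.node.kubernetes.io/gpu=true -o jsonpath='{.items[*].metadata.name}'); do
      oc debug node/$node -- chroot /host setsebool container_use_devices on
  done

Note that setsebool without -P does not persist the setting across a node reboot; add -P if it should survive restarts.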