Description
Comments
- I'd like to open this issue to discuss and track the SCC and SELinux settings needed on the OpenShift 4.9 platform to enable the intel-device-plugin framework and the SGX plugin.
- If I mount the directory with :z and keep SELinux in enforcing mode, the issue below is resolved, but the plugin then runs into an access-denied error on the host devices (/dev/sgx_x). For details, see the section "The proper way to access shared directory in pod".
- After I run the device plugin as a privileged container with the privileged SCC, the issue is resolved. Please see the section "The proper way to access host devices from the container".
The issue is:
If I set SELinux to enforcing mode on my worker node, as shown below:
sh-4.4# sestatus
SELinux status: enabled
SELinuxfs mount: /sys/fs/selinux
SELinux root directory: /etc/selinux
Loaded policy name: targeted
Current mode: enforcing
Mode from config file: enforcing
Policy MLS status: enabled
Policy deny_unknown status: allowed
Memory protection checking: actual (secure)
Max kernel policy version: 33
then my init container runs into a "permission access denied" error on all the volumes mounted in the pod.
If I switch SELinux to permissive mode, as shown below:
sh-4.4# sestatus
SELinux status: enabled
SELinuxfs mount: /sys/fs/selinux
SELinux root directory: /etc/selinux
Loaded policy name: targeted
Current mode: permissive
Mode from config file: enforcing
Policy MLS status: enabled
Policy deny_unknown status: allowed
Memory protection checking: actual (secure)
Max kernel policy version: 33
then the operator comes up and runs properly.
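In the second output, `Current mode` is `permissive` while `Mode from config file` is still `enforcing`, which is what a runtime change such as `setenforce 0` produces; such a change is not persistent across reboots. As a minimal sketch for spotting this on a node, the snippet below parses the `sestatus` text quoted above. The helper names `parse_sestatus` and `runtime_mode_differs` are hypothetical and not part of any existing tool.

```python
from typing import Dict

# The second sestatus output from this issue, copied verbatim.
SESTATUS_OUTPUT = """\
SELinux status: enabled
SELinuxfs mount: /sys/fs/selinux
SELinux root directory: /etc/selinux
Loaded policy name: targeted
Current mode: permissive
Mode from config file: enforcing
Policy MLS status: enabled
Policy deny_unknown status: allowed
Memory protection checking: actual (secure)
Max kernel policy version: 33
"""


def parse_sestatus(text: str) -> Dict[str, str]:
    """Turn each 'key: value' line of sestatus output into a dict entry."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore lines without a colon
            fields[key.strip()] = value.strip()
    return fields


def runtime_mode_differs(fields: Dict[str, str]) -> bool:
    """True when the running mode is not the mode set in the config file."""
    return fields.get("Current mode") != fields.get("Mode from config file")


if __name__ == "__main__":
    parsed = parse_sestatus(SESTATUS_OUTPUT)
    print("current:", parsed["Current mode"])
    print("configured:", parsed["Mode from config file"])
    print("runtime override:", runtime_mode_differs(parsed))
```

On a real node the text would come from running `sestatus` rather than from the embedded sample.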
You can reproduce the issue with the steps below.
Reproduce Steps
First, I have to apply the patches below to set up the SCC according to these documents:
SCC in OCP-4.9
Guide to UID, GID
Author: MartinXu <[email protected]>
Date: Thu Nov 18 22:29:51 2021 -0500
Add SCC hostaccess to manager-role on OpenShift
So the default SA (Service Account) can have the privilige to create pod to access to all
host namespaces but still requires pods to be run with a UID and SELinux context that are
allocated to the namespace.
For detail
see https://docs.openshift.com/container-platform/4.9/authentication/managing-security-context-constraints.html
diff --git a/deployments/operator/rbac/role.yaml b/deployments/operator/rbac/role.yaml
index 8d19b7a..dd93674 100644
--- a/deployments/operator/rbac/role.yaml
+++ b/deployments/operator/rbac/role.yaml
@@ -176,3 +176,11 @@ rules:
- get
- list
- watch
+- apiGroups:
+ - security.openshift.io
+ resources:
+ - securitycontextconstraints
+ resourceNames:
+ - hostmount-anyuid
+ verbs:
+ - use
commit 9e3106cef687a7f83ed7daed90575f7e16b16993
Author: Xu <[email protected]>
Date: Thu Nov 18 19:27:25 2021 -0500
Dropoff securityContext from manager deployment
OpenShift SCC (Security Context Constraints) is used to manage security
context. See
https://cloud.redhat.com/blog/a-guide-to-openshift-and-uids
https://docs.openshift.com/container-platform/4.9/authentication/managing-security-context-constraints.html
By default restricted SCC is used to Ensure that pods cannot be run as privileged.
So this commit drops off the securityconttext to run as non-root user
diff --git a/deployments/operator/default/manager_auth_proxy_patch.yaml b/deployments/operator/default/manager_auth_proxy_patch.yaml
index 8ba668c..082782f 100644
--- a/deployments/operator/default/manager_auth_proxy_patch.yaml
+++ b/deployments/operator/default/manager_auth_proxy_patch.yaml
@@ -19,11 +19,11 @@ spec:
ports:
- containerPort: 8443
name: https
- securityContext:
- runAsNonRoot: true
- runAsUser: 1000
- runAsGroup: 1000
- readOnlyRootFilesystem: true
+ #securityContext:
+ #runAsNonRoot: true
+ #runAsUser: 1000
+ #runAsGroup: 1000
+ #readOnlyRootFilesystem: true
- name: manager
args:
- "--metrics-addr=127.0.0.1:8080"
diff --git a/deployments/operator/manager/manager.yaml b/deployments/operator/manager/manager.yaml
index db335d3..9ee0a94 100644
--- a/deployments/operator/manager/manager.yaml
+++ b/deployments/operator/manager/manager.yaml
@@ -33,11 +33,11 @@ spec:
requests:
cpu: 100m
memory: 20Mi
- securityContext:
- runAsNonRoot: true
- runAsUser: 65532
- runAsGroup: 65532
- readOnlyRootFilesystem: true
+ #securityContext:
+ #runAsNonRoot: true
+ #runAsUser: 65532
+ #runAsGroup: 65532
+ #readOnlyRootFilesystem: true
env:
- name: DEVICEPLUGIN_NAMESPACE
valueFrom:
commit fbf8bd8b120ab65fc456d4778fb156214230ffac
Author: MartinXu <[email protected]>
Date: Thu Nov 18 20:45:51 2021 -0500
Backport https://github.com/intel/intel-device-plugins-for-kubernetes/pull/756
diff --git a/deployments/operator/rbac/role.yaml b/deployments/operator/rbac/role.yaml
index 3e490e5..8d19b7a 100644
--- a/deployments/operator/rbac/role.yaml
+++ b/deployments/operator/rbac/role.yaml
@@ -143,6 +143,12 @@ rules:
- patch
- update
- watch
+- apiGroups:
+ - deviceplugin.intel.com
+ resources:
+ - sgxdeviceplugins/finalizers
+ verbs:
+ - update
- apiGroups:
- deviceplugin.intel.com
resources:
Run operator manually
Then start the Intel device plugins framework with the command
$ oc apply -k intel-device-plugins-for-kubernetes/deployments/operator/default/
and start the SGX plugin DaemonSet (DS) with
oc apply -f intel-device-plugins-for-kubernetes/deployments/operator/samples/deviceplugin_v1_sgxdeviceplugin.yaml
The Intel device plugins framework comes up and runs, and the SGX plugin DS is also created.
But the init container in the pod runs into the "permission access denied" issue when it tries to access the directory
/etc/kubernetes/node-feature-discovery/source.d/
Run operator through OLM
You can also run the operator through OLM:
operator-sdk run bundle docker.io/walnuxdocker/intel-device-plugins-operator-bundle:0.22.0
The result is the same as when running it manually.
These are the volumes mounted in the pod:
nodeSelector:
  feature.node.kubernetes.io/custom-intel.sgx: 'true'
  kubernetes.io/arch: amd64
restartPolicy: Always
initContainers:
  - name: intel-sgx-initcontainer
    image: 'intel/intel-sgx-initcontainer:0.22.0'
    resources: {}
    volumeMounts:
      - name: nfd-source-hooks
        mountPath: /etc/kubernetes/node-feature-discovery/source.d/
      - name: kube-api-access-nkpq6
        readOnly: true
        mountPath: /var/run/secrets/kubernetes.io/serviceaccount
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    imagePullPolicy: IfNotPresent
    securityContext:
      capabilities:
        drop:
          - MKNOD
      readOnlyRootFilesystem: true
Analysis:
You can see that I assigned the hostmount-anyuid SCC.
After I put SELinux into permissive mode on worker node 1 with the command
$ sudo setenforce 0
the operator comes up and runs on this node.
But I left SELinux in enforcing mode on worker node 0,
and the "permission access denied" issue is still there on that node.
After I set the SCC to hostaccess, the "permission access denied" issue always happens, whether or not SELinux is enforcing.
The proper way to access shared directory in pod
With mountPath: '/etc/kubernetes/node-feature-discovery/source.d/:z' and the hostmount-anyuid SCC, the above issue appears to be resolved: the init container works with SELinux in enforcing mode.
According to https://www.redhat.com/sysadmin/user-namespaces-selinux-rootless-containers
the root cause might be:
The container engine, Podman, launches each container with a unique process SELinux label (usually container_t) and labels all of the container content with a single label (usually container_file_t). We have rules that state that container_t can read and write all content labeled container_file_t. This simple idea has blocked major file system exploits.
Everything works perfectly until the user attempts a volume mount. The problem with volumes is that they usually only bind mounts on the host. They bring in the labels from the host, which the SELinux policy does not allow the process label to interact with, and the container blows up.
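The rule described in the quote can be illustrated with a toy model. This is only a sketch of the idea, not the real SELinux policy engine: the allow-list holds the single rule the article mentions, the host label `etc_t` is a hypothetical example of a label a bind mount could bring in, and `relabel_shared` only mimics the type change that a `:z` relabel performs in Podman (MCS level handling is left out).

```python
from typing import Set, Tuple

# Toy allow-list holding the one rule quoted above:
# processes of type container_t may use files of type container_file_t.
ALLOWED: Set[Tuple[str, str]] = {("container_t", "container_file_t")}


def selinux_type(context: str) -> str:
    """Return the type field of a 'user:role:type:level' SELinux context."""
    return context.split(":")[2]


def can_access(process_context: str, file_context: str) -> bool:
    """Check the (process type, file type) pair against the toy allow-list."""
    pair = (selinux_type(process_context), selinux_type(file_context))
    return pair in ALLOWED


def relabel_shared(file_context: str) -> str:
    """Mimic a ':z' relabel by switching the type to container_file_t."""
    user, role, _old_type, level = file_context.split(":", 3)
    return ":".join([user, role, "container_file_t", level])


if __name__ == "__main__":
    process = "system_u:system_r:container_t:s0:c1,c2"
    host_dir = "system_u:object_r:etc_t:s0"  # hypothetical host label
    print("bind mount with host label:", can_access(process, host_dir))
    print("after relabel:", can_access(process, relabel_shared(host_dir)))
```

In this model the bind-mounted directory is refused until it carries the container file type, which matches the behavior the article describes for volumes.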
However, with the init container configured as below, the SGX plugin container runs into a permission-denied issue:
initContainers:
  - name: intel-sgx-initcontainer
    image: 'intel/intel-sgx-initcontainer:0.22.0'
    resources: {}
    volumeMounts:
      - name: nfd-source-hooks
        mountPath: '/etc/kubernetes/node-feature-discovery/source.d/:z'
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    imagePullPolicy: IfNotPresent
    securityContext:
      readOnlyRootFilesystem: false
The error is:
E1130 05:11:07.898395 1 sgx_plugin.go:75] No SGX enclave file available: stat /dev/sgx_enclave: permission denied
Try to resolve the above issue
Using a similar approach, I mounted /dev/sgx_enclave with :z:
containers:
  - resources: {}
    terminationMessagePath: /dev/termination-log
    name: intel-sgx-plugin
    securityContext:
      readOnlyRootFilesystem: false
    imagePullPolicy: IfNotPresent
    volumeMounts:
      - name: sgxdevices
        mountPath: /dev/sgx
      - name: sgx-enclave
        mountPath: '/dev/sgx_enclave:z'
It runs into the error below:
sgx_plugin.go:75] No SGX enclave file available: stat /dev/sgx_enclave: no such file or directory
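The two log lines report different failures: `permission denied` means the `stat` call was refused, while `no such file or directory` means nothing exists at that path inside the container. A small probe like the one below, run inside the container, can tell the two cases apart for any path. It is a sketch only; the function name `probe_path` and the returned strings are hypothetical, and it is not how sgx_plugin.go performs its check.

```python
import os


def probe_path(path: str) -> str:
    """Classify a path as 'present', 'missing' or 'denied' using os.stat."""
    try:
        os.stat(path)
    except FileNotFoundError:
        # ENOENT: nothing exists at this path in the current mount namespace.
        return "missing"
    except PermissionError:
        # EACCES/EPERM: the stat call itself was refused.
        return "denied"
    return "present"


if __name__ == "__main__":
    # Paths taken from this issue; results depend on where the script runs.
    for candidate in ("/dev/sgx_enclave", "/dev/sgx_enclave:z"):
        print(candidate, "->", probe_path(candidate))
```

Running it with both paths from the pod spec shows whether the device node is missing from the container or present but blocked.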
The proper way to access host devices from the container
After I use the privileged SCC and set privileged: true, as below,
containers:
  - resources: {}
    terminationMessagePath: /dev/termination-log
    name: intel-sgx-plugin
    securityContext:
      privileged: true
the above issue is resolved.
According to https://kubernetes.io/docs/concepts/policy/pod-security-policy/:
a "privileged" container is given access to all devices on the host. This allows the container nearly all the same access as processes running on the host. This is useful for containers that want to use linux capabilities like manipulating the network stack and accessing devices.
I am concerned about granting this level of privilege.
Others have a similar concern and have requested a new feature in Kubernetes.
See kubernetes/kubernetes#60748
However, since the SGX device plugin has to access the SGX devices of the host, it looks like we can only use a privileged container.
@mythi What are your comments? :)
Reference to a similar project, SRO
In the Special Resource Operator (SRO), it looks like a similar security policy is applied:
https://github.com/openshift/special-resource-operator/blob/master/charts/xilinx/fpga-xrt-driver-4.7.11/templates/1000-driver-container.yaml#L17