Description
What happened (please include outputs or screenshots):
After exactly 300 seconds of idle time, exec commands streamed with `stream` lose their connection, even though:
- the pod is still running,
- the process that was exec'd in the container is still running (checked with `kubectl exec POD_NAME ps aux`),
- the network otherwise appears to be healthy,
- no warnings are logged by kube-apiserver.
The following stack trace is given:

```
File "kubernetes/stream/ws_client.py", line 141, in readline_stderr
    return self.readline_channel(STDERR_CHANNEL, timeout=timeout)
File "kubernetes/stream/ws_client.py", line 104, in readline_channel
    self.update(timeout=(timeout - time.time() + start))
File "kubernetes/stream/ws_client.py", line 192, in update
    op_code, frame = self.sock.recv_data_frame(True)
File "websocket/_core.py", line 401, in recv_data_frame
    frame = self.recv_frame()
File "websocket/_core.py", line 440, in recv_frame
    return self.frame_buffer.recv_frame()
File "websocket/_abnf.py", line 338, in recv_frame
    self.recv_header()
File "websocket/_abnf.py", line 294, in recv_header
    header = self.recv_strict(2)
File "websocket/_abnf.py", line 373, in recv_strict
    bytes_ = self.recv(min(16384, shortage))
File "websocket/_core.py", line 524, in _recv
    return recv(self.sock, bufsize)
File "websocket/_socket.py", line 122, in recv
    raise WebSocketConnectionClosedException(
websocket._exceptions.WebSocketConnectionClosedException: Connection to remote host was lost.
```
What you expected to happen:
- The connection is not lost.
- If it is lost, the connection is automatically reconnected (up to some configured timeout limit).
Reading https://github.com/websocket-client/websocket-client#long-lived-connection, there is a description of a built-in way in the websocket library to keep a connection alive. It appears that the `stream` command does not use this method.
How to reproduce it (as minimally and precisely as possible):
(Mostly inspired by https://github.com/kubernetes-client/python/blob/master/examples/pod_exec.py)
```python
from kubernetes.stream import stream

...

s = stream(
    _client.connect_get_namespaced_pod_exec,
    _pod.metadata.name,
    _pod.metadata.namespace,
    command=["sleep", "100000"],
    async_req=False,
    stderr=True,
    stdin=False,
    stdout=True,
    tty=False,
    _preload_content=False,
)

def tail_logs(s):
    if (ret_stdout := s.readline_stdout(timeout=1)) is not None:
        print(ret_stdout)

while (retcode := s.returncode) is None:
    tail_logs(s)
tail_logs(s)
```
Anything else we need to know?:
I've experimented with adding `s.sock.ping()` in each iteration of the `tail_logs` function, and it appears to improve the situation, but sometimes it still gets stuck inside `readline_stdout` despite the `timeout=1` parameter being set. 🤨 It gets stuck for so long that the ping never has a chance to run within the next 300-second window.

Does one need to run `ping` in a separate thread for this to work? That's how it is done inside `WebSocketApp`, as used by the long-lived connection example from the websocket library. Can one use `WebSocketApp` together with `stream`, or do I need to implement a similar ping feature myself when using `stream`?

I'm also curious where the 300-second limit comes from: is there a way to discover this value through the connection handshake, or is it a setting in my cluster/Azure?
Environment:
- Kubernetes version: Azure AKS v1.23.5
- Python version: 3.9.7
- Python client version: 23.6.0