Deadlock between consumer and heartbeat thread during coordinator lookup #1626

Closed
@zhgjun

After a broker restart the consumer stops receiving messages and appears to be dead, possibly deadlocked. I captured the following stack trace:

========================================================================
====                         Green Threads                          ====
========================================================================
------                        Green Thread                        ------

/opt/cloud/services/network-agent/venv/lib/python2.7/site-packages/eventlet/green/thread.py:41 in __thread_body
    `func(*args, **kwargs)`

/usr/lib64/python2.7/threading.py:784 in __bootstrap
    `self.__bootstrap_inner()`

/usr/lib64/python2.7/threading.py:811 in __bootstrap_inner
    `self.run()`

/opt/cloud/services/network-agent/venv/lib/python2.7/site-packages/kafka/coordinator/base.py:964 in run
    `self._run_once()`

/opt/cloud/services/network-agent/venv/lib/python2.7/site-packages/kafka/coordinator/base.py:999 in _run_once
    `self.coordinator._client.poll(timeout_ms=0)`

/opt/cloud/services/network-agent/venv/lib/python2.7/site-packages/kafka/client_async.py:560 in poll
    `responses.extend(self._fire_pending_completed_requests())`

/opt/cloud/services/network-agent/venv/lib/python2.7/site-packages/kafka/client_async.py:659 in _fire_pending_completed_requests
    `future.success(response)`

/opt/cloud/services/network-agent/venv/lib/python2.7/site-packages/kafka/future.py:36 in success
    `self._call_backs('callback', self._callbacks, self.value)`

/opt/cloud/services/network-agent/venv/lib/python2.7/site-packages/kafka/future.py:79 in _call_backs
    `f(value)`

/opt/cloud/services/network-agent/venv/lib/python2.7/site-packages/kafka/coordinator/base.py:716 in _handle_group_coordinator_response
    `with self._client._lock, self._lock:`

/usr/lib64/python2.7/threading.py:173 in acquire
    `rc = self.__block.acquire(blocking)`

/opt/cloud/services/network-agent/venv/lib/python2.7/site-packages/eventlet/semaphore.py:113 in acquire
    `hubs.get_hub().switch()`

/opt/cloud/services/network-agent/venv/lib/python2.7/site-packages/eventlet/hubs/hub.py:294 in switch
    `return self.greenlet.switch()`



------                        Green Thread                        ------
    `messages = self.consumer.poll(timeout * 1000.0)`

/opt/cloud/services/network-agent/venv/lib/python2.7/site-packages/kafka/consumer/group.py:614 in poll
    `records = self._poll_once(remaining, max_records)`

/opt/cloud/services/network-agent/venv/lib/python2.7/site-packages/kafka/consumer/group.py:634 in _poll_once
    `self._coordinator.poll()`

/opt/cloud/services/network-agent/venv/lib/python2.7/site-packages/kafka/coordinator/consumer.py:262 in poll
    `self.ensure_coordinator_ready()`

/opt/cloud/services/network-agent/venv/lib/python2.7/site-packages/kafka/coordinator/base.py:259 in ensure_coordinator_ready
    `self._client.poll(future=future)`

/opt/cloud/services/network-agent/venv/lib/python2.7/site-packages/kafka/client_async.py:556 in poll
    `self._poll(timeout)`

/opt/cloud/services/network-agent/venv/lib/python2.7/site-packages/kafka/client_async.py:574 in _poll
    `ready = self._selector.select(timeout)`

/opt/cloud/services/network-agent/venv/lib/python2.7/site-packages/kafka/vendor/selectors34.py:342 in select
    `r, w, _ = self._select(self._readers, self._writers, [], timeout)`

/opt/cloud/services/network-agent/venv/lib/python2.7/site-packages/eventlet/green/select.py:86 in select
    `return hub.switch()`

/opt/cloud/services/network-agent/venv/lib/python2.7/site-packages/eventlet/hubs/hub.py:294 in switch
    `return self.greenlet.switch()`
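
One possible reading of the two traces above: the heartbeat thread has received the coordinator response but is blocked acquiring the locks in `_handle_group_coordinator_response`, while the consumer thread sits in `ensure_coordinator_ready` polling for the future that only that callback can complete. The sketch below reproduces that shape with plain Python 3 `threading` (not eventlet, not kafka-python). It assumes the waiting thread holds the same lock while it waits, which the traces do not show directly. The names `simulate`, `lock`, and `done` are hypothetical stand-ins, and timeouts are used so the example terminates instead of hanging.

```python
import threading


def simulate(wait_while_holding_lock, timeout=0.2):
    """Return True if the waiting thread saw the 'future' complete in time."""
    lock = threading.RLock()    # hypothetical stand-in for the coordinator/client lock
    done = threading.Event()    # hypothetical stand-in for the coordinator lookup future

    def heartbeat():
        # Callback path: the lock is required before the future can be completed.
        if lock.acquire(timeout=timeout):
            try:
                done.set()
            finally:
                lock.release()

    worker = threading.Thread(target=heartbeat)
    if wait_while_holding_lock:
        # Waiter holds the lock while waiting, so the callback cannot run.
        with lock:
            worker.start()
            completed = done.wait(timeout / 2)
    else:
        # Waiter does not hold the lock, so the callback completes the future.
        worker.start()
        completed = done.wait(timeout)
    worker.join()
    return completed


if __name__ == "__main__":
    print("waiting while holding the lock:", simulate(True))
    print("waiting without the lock:", simulate(False))
```

With the lock held during the wait, the wait is expected to time out; without it, the callback should complete the event almost immediately. In the real process there is no timeout on either side, which would match both green threads being parked in `hub.switch()`.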

Can anyone help? @dpkp @tvoinarovskyi @jeffwidman, any ideas?
