Skip to content

WiFi (specifically WifiGeneric) handling of events is subject to race conditions #6947

Closed
@egnor

Description

@egnor

Board

any board (with networking)

Device Description

any device (with networking)

Hardware Configuration

no special configuration

Version

latest development Release Candidate (RC-X)

IDE Name

any IDE

Operating System

any OS

Flash frequency

any frequency

PSRAM enabled

yes

Upload speed

any speed

Description

It's hard to write a specific repro case for this since it depends on race conditions. (Also, my understanding may be off, please correct!)

The WiFi instance allows user code to register for events with WiFi.onEvent(). This isn't documented, but it is present in most example code, described in various tutorials and copied into many/most network-using sketches. This is implemented the WiFiGeneric class (one of the base classes for WiFiClass, of which WiFi is an instance) using an independently running FreeRTOS task (pinned to core 0(?) by default):

static void _arduino_event_task(void * arg){

That free-running task reads from a queue and invokes WiFiGenericClass::_eventCallback():

esp_err_t WiFiGenericClass::_eventCallback(arduino_event_t *event)

After translating the event, _eventCallback() loops through all registered callbacks and invokes them serially:

for(uint32_t i = 0; i < cbEventList.size(); i++) {

Nowhere in this process is locking or other synchronization done, that I can figure out. I see these implications?

  1. If code adds/removes callbacks at the same time as callbacks are being invoked, arbitrarily bad undefined behavior can result due to conflict over cbEventList. Most code will add callbacks at the start and leave them there forever, but if that's the expectation it seems worth documenting?
  2. User callbacks are invoked on a separate task/thread (the "arduino event thread") from the user's main loop. That means the user should be careful about synchronization between those callbacks and the user's main code. However, this isn't documented, and example code (e.g. WiFiClientEvents.ino, WiFiUDPClient.ino, ETH_LAN8720, etc) make lots of unsynchronized calls and global variable changes in ways that seem unsafe.

(Hopefully you'll tell me why this is all wrong and secretly it's all serialized so the callbacks only happen between calls to loop() or something like that...?)

Sketch

(many of the example sketches, see above)

Debug Message

(no specific debug logs)

Other Steps to Reproduce

No response

I have checked existing issues, online documentation and the Troubleshooting Guide

  • I confirm I have checked existing issues, online documentation and Troubleshooting guide.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions