light sleep mode & WiFi & millis() overflow #8913

Open
@Bruno-Muller

Hi,

Issue:
WiFi stops working after a few hours when using the light sleep mode.

Description:
I am not sure the Arduino library copes well with the time drift that light sleep mode causes, since the clock is idle during sleep. I also wonder whether the millis() overflow can cause a deadlock in the timeout conditions used to exit loops.

I'm not an expert, but I suspect that the WiFi API fails to renew or renegotiate some WiFi token in time, because millis() is not a reliable time source when light sleep mode is used.

Steps to reproduce:
Periodically put the ESP8266 into light sleep mode.
In my tests, the device stays awake for about 5 s and sleeps for about 15 s per cycle.

In setup():

WiFi.mode(WIFI_STA);

// I've seen that the library later overrides this value with MODEM_SLEEP,
// so I work around that with the custom sleep functions below.
WiFi.setSleepMode(WiFiSleepType::WIFI_LIGHT_SLEEP);

WiFi.begin(WIFI_SSID, WIFI_PASSWORD);

Go to light sleep and wake up functions:

volatile bool wifi_sleeping = false;

// call this function to go to light sleep mode
void start_light_sleep(uint32_t time_ms) {

  // Not sure if this is needed, but I've read that all timer interrupts should be
  // disabled so that they do not interfere with sleep. Clearing the timer list seems to make sense:
  extern os_timer_t *timer_list; 
  timer_list = nullptr; 

  wifi_station_disconnect();
  wifi_set_opmode(NULL_MODE);
  wifi_fpm_set_sleep_type(LIGHT_SLEEP_T);
  wifi_fpm_open(); 

  Serial.println("ENTER LIGHT SLEEP MODE");
  Serial.flush();
  Serial.end();

  wifi_sleeping = true;
  
  wifi_fpm_set_wakeup_cb(light_sleep_wakeup);
  wifi_fpm_do_sleep(time_ms*1000);  // argument is in us; this is where light sleep is actually requested
  
  // I've read online that the delay should be sleep_time_ms + 1 ms,
  // but I'm not sure that figure is exact.
  // I've measured that the ESP8266 needs around 460 ms to enter light sleep,
  // so I assume the delay has to be at least 460 ms.
  esp_delay(time_ms+1, [](){ return wifi_sleeping; }); 
}

// This function is called from the wake-up callback. Do not call it yourself.
void light_sleep_wakeup(void) { 

  // I've read that some time-consuming functions should be called here; I'm not sure why.
  // Removing them might break the wake-up.
  // I think the underlying functions may call an ESP yield or similar.
  Serial.begin(BAUDRATE);
  Serial.println("EXIT LIGHT SLEEP MODE");
  Serial.flush();

  wifi_fpm_close();  
  wifi_set_opmode(STATION_MODE); 
  wifi_station_connect();   

  wifi_sleeping = false;
}

Analysis:

I notice that the millis() function is poorly understood online: many people assume that it overflows after 2^32 ms (i.e. 49.7 days).
I did some measurements, and in my setup millis() clearly wraps after 2^32/1000 ms (i.e. 2^32 us, or 71.6 min).
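
To make the two figures above easy to check, here is a small arithmetic sketch. It only computes how long a 32-bit counter takes to wrap at a given tick rate; the names `kCounterRange` and `wrap_period_seconds` are made up for illustration and are not part of the Arduino core.

```cpp
#include <cstdint>

// Number of distinct values a 32-bit counter can hold.
constexpr uint64_t kCounterRange = 1ULL << 32;

// Seconds until a 32-bit counter wraps, given how many ticks it counts per second.
constexpr double wrap_period_seconds(double ticks_per_second) {
  return static_cast<double>(kCounterRange) / ticks_per_second;
}

// Counting milliseconds (1000 ticks/s): roughly 49.7 days.
constexpr double kMsWrapDays = wrap_period_seconds(1000.0) / 86400.0;

// Counting microseconds (1,000,000 ticks/s): roughly 71.6 minutes.
constexpr double kUsWrapMinutes = wrap_period_seconds(1000000.0) / 60.0;

static_assert(kMsWrapDays > 49.7 && kMsWrapDays < 49.72, "2^32 ms is about 49.7 days");
static_assert(kUsWrapMinutes > 71.5 && kUsWrapMinutes < 71.6, "2^32 us is about 71.6 min");
```

A wrap after about 71.6 min is therefore what one would expect if the value were effectively limited by a 32-bit microsecond counter rather than a 32-bit millisecond counter.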

In addition, I noticed in the Arduino ESP8266 library source code that it compensates for some clock drift (maybe 826 us per some interval; I don't recall exactly).

I am not sure whether the clock drift compensation or the overflow can cause side effects in the timeout loops or exit conditions used in the library.
For example, the following line is from the WiFi API. If millis() ends up less than last_sent (after an overflow, for instance), it could lead to unexpected behavior:
if (millis() - last_sent > (uint32_t) max_wait_ms) {...}
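
To make the concern about `millis() - last_sent` concrete, here is a host-side sketch that does not call the Arduino API at all. It assumes plain `uint32_t` arithmetic; `elapsed_ms`, `timed_out` and the sample values are hypothetical. As far as I can tell, the subtraction is safe when the counter wraps at exactly 2^32, but gives a huge bogus elapsed time if the counter wraps earlier (for example at 2^32/1000, as measured above).

```cpp
#include <cstdint>

// Elapsed time between two readings of a free-running 32-bit counter.
// Unsigned subtraction is modulo 2^32, so the result stays correct
// across a single wrap at 2^32.
constexpr uint32_t elapsed_ms(uint32_t now, uint32_t start) {
  return now - start;
}

// Same shape as the library condition quoted above.
constexpr bool timed_out(uint32_t now, uint32_t last_sent, uint32_t max_wait_ms) {
  return elapsed_ms(now, last_sent) > max_wait_ms;
}

// Case 1: counter wraps at 2^32. last_sent was 16 ms before the wrap,
// now is 5 ms after it, so 21 ms really elapsed and no timeout fires.
static_assert(elapsed_ms(5u, 0xFFFFFFF0u) == 21u, "wrap at 2^32 is handled");
static_assert(!timed_out(5u, 0xFFFFFFF0u, 100u), "no spurious timeout");

// Case 2 (assumed scenario): counter wraps early, at 4294967 (2^32 / 1000).
// last_sent was 7 ms before that wrap, now is 5 ms after it. Only 12 ms
// elapsed, but the subtraction reports about 4.29e9 ms, so a 100 ms
// timeout fires immediately.
static_assert(elapsed_ms(5u, 4294960u) == 4290672341u, "bogus elapsed time");
static_assert(timed_out(5u, 4294960u, 100u), "spurious timeout");
```

If this reading is right, an early wrap would more likely show up as a premature timeout than as a loop that never exits, but I have not confirmed which of the two the library actually hits.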

I also noticed that light sleep mode is poorly understood as well. I've seen people measure current with an oscilloscope and do some arithmetic to guess whether light sleep mode works, then make assumptions about how it has to be done without good reasons.
Or maybe Google just does not help to find the reliable information lost somewhere on the Internet.
By looking at the source code of the Arduino ESP8266 library and the official ESP8266 documentation, I found a workable way to use light sleep mode (shown above).

As a consequence, because the clock is idle during light sleep mode, the millis() function can no longer be used to measure elapsed time.
I wonder whether millis() is the time source the WiFi API uses to trigger the renegotiation of some WiFi token.
That could explain why the WiFi stops working.
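
For application code, one possible workaround is to keep track of the time spent asleep and add it back to the raw counter. The sketch below is only an illustration and rests on two assumptions: that the millis() counter really does not advance during light sleep, and that each sleep lasts its full requested duration (no early wake-up). `SleepAwareClock` and `simulate_cycles` are hypothetical names, and this does not change what the WiFi stack itself sees.

```cpp
#include <cstdint>

// Hypothetical helper: accumulates requested sleep durations and adds them
// to the raw millisecond counter to estimate wall-clock time since boot.
class SleepAwareClock {
 public:
  // Call once per sleep with the duration passed to the sleep request.
  constexpr void note_sleep(uint32_t requested_sleep_ms) {
    slept_ms_ += requested_sleep_ms;
  }

  // raw_millis is the value of the (assumed paused-during-sleep) counter.
  constexpr uint64_t now_ms(uint32_t raw_millis) const {
    return static_cast<uint64_t>(raw_millis) + slept_ms_;
  }

 private:
  uint64_t slept_ms_ = 0;
};

// Simulates the duty cycle from the reproduction steps: the raw counter only
// advances while awake, and the sleep time is recorded separately.
constexpr uint64_t simulate_cycles(uint32_t cycles, uint32_t awake_ms, uint32_t sleep_ms) {
  SleepAwareClock clock;
  uint32_t raw_millis = 0;
  for (uint32_t i = 0; i < cycles; ++i) {
    raw_millis += awake_ms;      // counter runs while awake
    clock.note_sleep(sleep_ms);  // counter assumed frozen while asleep
  }
  return clock.now_ms(raw_millis);
}

// Three cycles of 5 s awake + 15 s asleep: the raw counter shows 15 s,
// the compensated clock shows the full 60 s.
static_assert(simulate_cycles(3, 5000, 15000) == 60000, "compensated time");
```

With the 5 s / 15 s cycle used here, an uncompensated counter would run at a quarter of real time under these assumptions, which is the kind of drift I suspect the WiFi timing does not tolerate.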
