Using mgos_uptime in ISR corrupts uptime and freezes system in some situations

Hello,

If we call mgos_uptime in ISR handler then uptime during at least network operations is being shifted. Also during some operations (eg VFS access) makes the system freeze. Our mos version is 2.15.0.

Please consider following code:

static void cb2(void *arg) {
  (void) arg;
  LOG(LL_INFO, ("UT %f", mgos_uptime()));
}

volatile int i = 0;

static IRAM void cb1(void *a) {
  mgos_uptime();

  if (i++ % 100 == 0) {
    mgos_invoke_cb(cb2, NULL, true);
  }
}

enum mgos_app_init_result mgos_app_init(void) {
// in our code we use mgos_gpio_set_int_handler_isr where it is reproducible too
  mgos_set_hw_timer(1000, MGOS_TIMER_REPEAT, cb1, NULL);
  return MGOS_APP_INIT_SUCCESS;
}

If I run it together with configured wifi and Azure IoT connection then I see this:

[Nov  5 22:31:37.482] load 0x40100000, len 1540, room 16 
[Nov  5 22:31:37.486] tail 4
[Nov  5 22:31:37.487] chksum 0x62
[Nov  5 22:31:37.488] load 0x3ffe8000, len 748, room 4 
[Nov  5 22:31:37.492] tail 8
[Nov  5 22:31:37.492] chksum 0x85
[Nov  5 22:31:37.494] csum 0x85    u  jUJ  +V+RU    K  ZkVVU   U (TT U          TT U+UZ I U TT U UT  QUT E+V P Z V E        Q Z V UP  XQU T    p  N   p  o|  l   {$ c      | {  l o   o   l    r l $   : softAP(2e:f4:32:08
[Nov  5 22:31:37.573] esp_main.c:172          lamp 1.0.0 (20191105-212410/g23ec741-master-dirty)
[Nov  5 22:31:37.579] esp_main.c:174          Mongoose OS 2.15.0 (20191105-193302/2.15.0-g787ac38)
[Nov  5 22:31:37.586] esp_main.c:178          CPU: ESP8266EX, 160 MHz, RAM: 47008 total, 44396 free
[Nov  5 22:31:37.593] esp_main.c:180          SDK 2.2.1(1247cc5); flash: 2M
[Nov  5 22:31:37.598] esp_exc.c:211           Reset cause: 0 (power on)
[Nov  5 22:31:37.602] mg_lwip_ev_mgr.c:93     Mongoose 6.15, LwIP 1.4.1
[Nov  5 22:31:37.607] mg_ssl_if_mbedtls.c:57  mbed TLS 2.16.0-cesanta4
[Nov  5 22:31:37.611] mgos_vfs_dev.c:73       sfl0: sysflash (), size 2097152
[Nov  5 22:31:37.617] mgos_vfs_dev.c:73       root: part ({"dev": "sfl0", "offset": 897024, "size": 131072}), size 131072
[Nov  5 22:31:37.626] mgos_vfs.c:147          /: SPIFFS @ root, opts 
[Nov  5 22:31:37.650] mgos_vfs.c:320          /: size 113201, used: 30622, free: 82579
[Nov  5 22:31:37.708] mgos_sys_config.c:368   MAC: 2e:f4:32:08:8d:4c
[Nov  5 22:31:37.713] mgos_sys_config.c:376   WDT: 30 seconds
[Nov  5 22:31:37.717] mgos_deps_init.c:102    init i2c...
[Nov  5 22:31:37.720] mgos_deps_init.c:102    init atca...
[Nov  5 22:31:37.723] mgos_deps_init.c:102    init mqtt...
[Nov  5 22:31:37.727] mgos_deps_init.c:102    init shadow...
[Nov  5 22:31:37.730] mgos_deps_init.c:102    init sntp...
[Nov  5 22:31:37.733] mgos_deps_init.c:102    init azure...
[Nov  5 22:31:37.737] mgos_azure.c:166        Azure IoT Hub client for dev-willow-iot-hub-we.azure-devices.net/livisi-willow-lamp1-088D4C (azure.crt.pem)
[Nov  5 22:31:37.749] mgos_azure_shadow.c:203 Azure Device Twin enabled, device livisi-willow-lamp1-088D4C
[Nov  5 22:31:37.757] mgos_deps_init.c:102    init mbedtls...
[Nov  5 22:31:37.761] mgos_deps_init.c:102    init ota-http-client...
[Nov  5 22:31:37.765] mgos_deps_init.c:102    init rpc-common...
[Nov  5 22:31:37.769] mgos_deps_init.c:102    init rpc-service-config...
[Nov  5 22:31:37.774] mgos_deps_init.c:102    init rpc-service-fs...
[Nov  5 22:31:37.779] mgos_deps_init.c:102    init rpc-service-ota...
[Nov  5 22:31:37.784] mgos_deps_init.c:102    init rpc-uart...
[Nov  5 22:31:37.787] mgos_rpc_channel_ua:313 0x3fff237c UART0
[Nov  5 22:31:37.791] mg_rpc.c:539            0x3fff237c '' UART
[Nov  5 22:31:37.795] mg_rpc.c:470            0x3fff237c CHAN OPEN (UART UART0)
[Nov  5 22:31:37.800] mgos_deps_init.c:102    init wifi...
[Nov  5 22:31:37.804] bcn 0
[Nov  5 22:31:37.805] del if1
[Nov  5 22:31:37.805] usl
[Nov  5 22:31:37.806] mode : null
[Nov  5 22:31:37.807] mgos_wifi.c:88          WiFi STA: Using config 0 (bono)
[Nov  5 22:31:37.812] mgos_wifi.c:468         WiFi mode: STA
[Nov  5 22:31:37.815] esp_wifi.c:106          WiFi mode: STA
[Nov  5 22:31:37.819] mode : sta(2c:f4:32:08:8d:4c)
[Nov  5 22:31:37.822] add if0
[Nov  5 22:31:37.822] sleep disable
[Nov  5 22:31:37.824] WPA2 ENTERPRISE VERSION: [v2.0] disable
[Nov  5 22:31:37.827] mgos_wifi.c:270         WiFi STA: Connecting to bono
[Nov  5 22:31:37.833] mgos_init.c:36          Init done, RAM: 47008 total, 39844 free, 38320 min free
[Nov  5 22:31:37.841] mgos_event.c:135        ev MOS0 triggered 0 handlers
[Nov  5 22:31:37.846] mgos_ota_core.c:967     1 0
[Nov  5 22:31:37.850] mgos_ota_core.c:1308    UID: 03493635109898f3, license: none
[Nov  5 22:31:37.856] mgos_event.c:135        ev WFI1 triggered 0 handlers
[Nov  5 22:31:37.861] ir.c:74                 UT 0.382260
[Nov  5 22:31:37.865] mgos_net.c:85           WiFi STA: connecting
[Nov  5 22:31:37.869] mgos_event.c:135        ev NET1 triggered 3 handlers
[Nov  5 22:31:37.934] ir.c:74                 UT 0.455324
[Nov  5 22:31:38.034] ir.c:74                 UT 0.555354
[Nov  5 22:31:38.134] ir.c:74                 UT 0.655339
[Nov  5 22:31:38.234] ir.c:74                 UT 0.755344
[Nov  5 22:31:38.334] ir.c:74                 UT 0.855350
[Nov  5 22:31:38.434] ir.c:74                 UT 0.955353
[Nov  5 22:31:38.534] ir.c:74                 UT 1.055358
[Nov  5 22:31:38.634] ir.c:74                 UT 1.155362
[Nov  5 22:31:38.734] ir.c:74                 UT 1.255355
[Nov  5 22:31:38.834] ir.c:74                 UT 1.355334
[Nov  5 22:31:38.934] ir.c:74                 UT 1.455342
[Nov  5 22:31:39.034] ir.c:74                 UT 1.555353
[Nov  5 22:31:39.038] mgos_wifi.c:345         WiFi STA: Connect timeout
[Nov  5 22:31:39.043] mgos_wifi.c:88          WiFi STA: Using config 0 (bono)
[Nov  5 22:31:39.048] scandone
[Nov  5 22:31:39.049] WPA2 ENTERPRISE VERSION: [v2.0] disable
[Nov  5 22:31:39.053] mgos_wifi.c:270         WiFi STA: Connecting to bono
[Nov  5 22:31:39.059] mgos_event.c:135        ev WFI1 triggered 0 handlers
[Nov  5 22:31:39.064] mgos_net.c:85           WiFi STA: connecting
[Nov  5 22:31:39.069] mgos_event.c:135        ev NET1 triggered 3 handlers
[Nov  5 22:31:39.088] mgos_wifi.c:345         WiFi STA: Connect timeout
[Nov  5 22:31:39.093] mgos_wifi.c:88          WiFi STA: Using config 0 (bono)
[Nov  5 22:31:39.098] scandone
[Nov  5 22:31:39.099] WPA2 ENTERPRISE VERSION: [v2.0] disable
[Nov  5 22:31:39.103] mgos_wifi.c:270         WiFi STA: Connecting to bono
[Nov  5 22:31:39.109] mgos_event.c:135        ev WFI1 triggered 0 handlers
[Nov  5 22:31:39.114] mgos_net.c:85           WiFi STA: connecting
[Nov  5 22:31:39.118] mgos_event.c:135        ev NET1 triggered 3 handlers
[Nov  5 22:31:39.134] ir.c:74                 UT 8591.589905
[Nov  5 22:31:39.234] ir.c:74                 UT 8591.689909
[Nov  5 22:31:39.334] ir.c:74                 UT 8591.789927
[Nov  5 22:31:39.434] ir.c:74                 UT 8591.889937
[Nov  5 22:31:39.534] ir.c:74                 UT 8591.989928
[Nov  5 22:31:39.634] ir.c:74                 UT 8592.089935
[Nov  5 22:31:39.734] ir.c:74                 UT 8592.189936
[Nov  5 22:31:39.834] ir.c:74                 UT 8592.289924
[Nov  5 22:31:39.934] ir.c:74                 UT 8592.389935
[Nov  5 22:31:40.034] ir.c:74                 UT 8592.489944
[Nov  5 22:31:40.134] ir.c:74                 UT 8592.589922
[Nov  5 22:31:40.234] ir.c:74                 UT 8592.689943
[Nov  5 22:31:40.334] ir.c:74                 UT 8592.789950
[Nov  5 22:31:40.434] ir.c:74                 UT 8592.889922
[Nov  5 22:31:40.534] ir.c:74                 UT 8592.989943
[Nov  5 22:31:40.634] ir.c:74                 UT 8593.089952
[Nov  5 22:31:40.734] ir.c:74                 UT 8593.189935
[Nov  5 22:31:40.834] ir.c:74                 UT 8593.289945
[Nov  5 22:31:40.934] ir.c:74                 UT 8593.389940
[Nov  5 22:31:41.034] ir.c:74                 UT 8593.489931
[Nov  5 22:31:41.134] ir.c:74                 UT 8593.589934
[Nov  5 22:31:41.234] ir.c:74                 UT 8593.689949
[Nov  5 22:31:41.334] ir.c:74                 UT 8593.789947
[Nov  5 22:31:41.434] ir.c:74                 UT 8593.889938
[Nov  5 22:31:41.534] ir.c:74                 UT 8593.989939
[Nov  5 22:31:41.634] ir.c:74                 UT 8594.089954
[Nov  5 22:31:41.734] ir.c:74                 UT 8594.189954
[Nov  5 22:31:41.834] ir.c:74                 UT 8594.289939
[Nov  5 22:31:41.934] ir.c:74                 UT 8594.389954
[Nov  5 22:31:41.961] scandone
[Nov  5 22:31:42.852] state: 0 -> 2 (b0)
[Nov  5 22:31:42.854] ir.c:74                 UT 8595.309659
[Nov  5 22:31:42.858] ir.c:74                 UT 8595.313945
[Nov  5 22:31:42.862] state: 2 -> 3 (0)
[Nov  5 22:31:42.864] ir.c:74                 UT 8595.318303
[Nov  5 22:31:42.868] state: 3 -> 5 (10)
[Nov  5 22:31:42.869] add 0
[Nov  5 22:31:42.870] aid 1
[Nov  5 22:31:42.870] cnt 
[Nov  5 22:31:42.871] ir.c:74                 UT 8595.323978
[Nov  5 22:31:42.874] ir.c:74                 UT 8595.330039
[Nov  5 22:31:42.878] ir.c:74                 UT 8595.333654
[Nov  5 22:31:42.882] ir.c:74                 UT 8595.337259
[Nov  5 22:31:42.886] ir.c:74                 UT 8595.341175
[Nov  5 22:31:42.889] ir.c:74                 UT 8595.344964
[Nov  5 22:31:42.934] ir.c:74                 UT 8595.389949
[Nov  5 22:31:43.034] ir.c:74                 UT 8595.489962
[Nov  5 22:31:43.134] ir.c:74                 UT 8595.589959
[Nov  5 22:31:43.234] ir.c:74                 UT 8595.689961
[Nov  5 22:31:43.334] ir.c:74                 UT 8595.789962
[Nov  5 22:31:43.434] ir.c:74                 UT 8595.889963
[Nov  5 22:31:43.487] 
[Nov  5 22:31:43.487] connected with bono, channel 4
[Nov  5 22:31:43.490] dhcp client start...
[Nov  5 22:31:43.492] mgos_wifi.c:136         WiFi STA: Connected, BSSID 18:d6:c7:ba:d9:5b ch 4 RSSI -63
[Nov  5 22:31:43.499] mgos_event.c:135        ev WFI2 triggered 0 handlers
[Nov  5 22:31:43.506] mgos_net.c:89           WiFi STA: connected
[Nov  5 22:31:43.511] mgos_event.c:135        ev NET2 triggered 3 handlers
[Nov  5 22:31:43.534] ir.c:74                 UT 8595.989958
[Nov  5 22:31:43.634] ir.c:74                 UT 8596.089972
[Nov  5 22:31:43.734] ir.c:74                 UT 8596.189965
[Nov  5 22:31:43.834] ir.c:74                 UT 8596.289974
[Nov  5 22:31:43.934] ir.c:74                 UT 8596.389965
[Nov  5 22:31:44.034] ir.c:74                 UT 8596.489984
[Nov  5 22:31:44.134] ir.c:74                 UT 8596.589974
[Nov  5 22:31:44.234] ir.c:74                 UT 8596.689973
[Nov  5 22:31:44.334] ir.c:74                 UT 8596.789984
[Nov  5 22:31:44.434] ir.c:74                 UT 8596.889979
[Nov  5 22:31:44.534] ir.c:74                 UT 8596.989977
[Nov  5 22:31:44.634] ir.c:74                 UT 8597.089978
[Nov  5 22:31:44.734] ir.c:74                 UT 8597.189983
[Nov  5 22:31:44.753] ip:192.168.0.108,mask:255.255.255.0,gw:192.168.0.1
[Nov  5 22:31:44.758] mgos_event.c:135        ev WFI3 triggered 0 handlers
[Nov  5 22:31:44.763] mgos_net.c:101          WiFi STA: ready, IP 192.168.0.108, GW 192.168.0.1, DNS 192.168.0.1
[Nov  5 22:31:44.772] mgos_sntp.c:127         SNTP next query in 914 ms
[Nov  5 22:31:44.777] mgos_mqtt.c:431         MQTT connecting to dev-willow-iot-hub-we.azure-devices.net:8883
[Nov  5 22:31:44.785] mg_net.c:928            0x3fff0984 dev-willow-iot-hub-we.azure-devices.net:8883 azure.crt.pem,azure.key.pem,ca.pem
[Nov  5 22:31:44.796] mgos_vfs.c:256          azure.crt.pem -> /azure.crt.pem pl 1 -> 1 0x3fff1564 (refs 1)
[Nov  5 22:31:44.804] mgos_vfs.c:350          open azure.crt.pem 0x0 0x1b6 => 0x3fff1564 azure.crt.pem 1 => 257 (refs 1)
[Nov  5 22:31:44.812] mgos_vfs.c:509          fstat 257 => 0x3fff1564:1 => 0 (size 485)
[Nov  5 22:32:38.450] 
[Nov  5 22:32:38.450]  ets Jan  8 2013,rst cause:4, boot mode:(3,7)
[Nov  5 22:32:38.454] 
[Nov  5 22:32:38.455] wdt reset

Is this a bug? What I am trying to do is to implement IR remote controller receiver simmilar to https://github.com/mongoose-os-libs/ir but for different vendor.

Using system_get_time instead of mgos_uptime in our ISR fixes the problem regarding uptime shifting. Unfortunately it doesn’t fix VFS access freezing.

Still we consider that uptime shifting a bug in mos as their IR code uses mgos_uptime in their ISR. The shift seem to be like falsy increasing of num_overflows in mgos_uptime_micros.

I couldn’t reproduce the uptime shift, but when connecting to mdash I’ve got the wdt reset. Replacing mgos_uptime with mgos_uptime_micros in the ISR solved it.

Starting the timer after receiving MGOS_EVENT_CLOUD_CONNECTED might help too.

Thanks for your reply.

Replacing mgos_uptime with mgos_uptime_micros in the ISR solved it

Could it be casued by mgos_uptime missing IRAM? If yes then maybe it should be either fixed there or in IR implementation.

For time shift I’ll try to prepare a minimal testcase.

Meanwhile we found one more freezing scenario:

static IRAM void cb(int pin, void *arg) {
}

enum mgos_app_init_result mgos_app_init(void) {
  mgos_gpio_set_int_handler_isr(2, MGOS_GPIO_INT_EDGE_ANY, cb, NULL);
  mgos_system_restart_after(500);

  return MGOS_APP_INIT_SUCCESS;
}

After restart it hangs & wdt reboots:

[Nov  6 16:00:59.818]  ets Jan  8 2013,rst cause:2, boot mode:(3,7)
[Nov  6 16:00:59.818] 
[Nov  6 16:00:59.818] load 0x40100000, len 1540, room 16 
[Nov  6 16:00:59.818] tail 4
[Nov  6 16:00:59.818] chksum 0x62
[Nov  6 16:00:59.818] load 0x3ffe8000, len 748, room 4 
[Nov  6 16:00:59.818] tail 8
[Nov  6 16:00:59.818] chksum 0x85
[Nov  6 16:00:59.818] csum 0x85
[Nov  6 16:00:59.818] rBoot v1.2.1-cesanta1 - richardaburton@gmail.com
[Nov  6 16:00:59.818] Flash Size:   16 Mbit
[Nov  6 16:00:59.818] Flash Mode:   DIO
[Nov  6 16:00:59.818] Flash Speed:  80 MHz
[Nov  6 16:00:59.818] rBoot Option: Big flash
[Nov  6 16:00:59.818] 
[Nov  6 16:00:59.818] Booting rom 0 (0x8000).
[Nov  6 16:00:59.818]       {  n|  l   {$ #   l N   n   l    r l l  [   : softAP(2e:f4:32:0
[Nov  6 16:00:59.818] esp_main.c:172          lamp 1.0.0 (20191106-145928/g21d1398-master-dirty)
[Nov  6 16:00:59.818] esp_main.c:174          Mongoose OS 2.16.0 (20191106-145928/2.16.0-g9b779cd)
[Nov  6 16:00:59.818] esp_main.c:178          CPU: ESP8266EX, 160 MHz, RAM: 52856 total, 50244 free
[Nov  6 16:00:59.818] esp_main.c:180          SDK 2.2.1(1247cc5); flash: 2M
[Nov  6 16:00:59.818] esp_exc.c:211           Reset cause: 4 (soft reset)
[Nov  6 16:00:59.818] mg_lwip_ev_mgr.c:93     Mongoose 6.15, LwIP 1.4.1
[Nov  6 16:00:59.818] mg_ssl_if_mbedtls.c:57  mbed TLS 2.16.3-cesanta2
[Nov  6 16:00:59.818] mgos_vfs_dev.c:73       sfl0: sysflash (), size 2097152
[Nov  6 16:00:59.818] mgos_vfs_dev.c:73       root: part ({"dev": "sfl0", "offset": 897024, "size": 131072}), size 131072
[Nov  6 16:00:59.818] mgos_vfs.c:147          /: SPIFFS @ root, opts 
[Nov  6 16:00:59.818] mgos_vfs.c:320          /: size 113201, used: 31626, free: 81575
[Nov  6 16:01:25.797] 
[Nov  6 16:01:25.798] HW WDT @ 0x40106430
[Nov  6 16:01:25.799]  A0: 0x4000050c  A1: 0x3ffff910  A2: 0x00000000  A3: 0x3ffee61d
[Nov  6 16:01:25.805]  A4: 0x3fffc200  A5: 0x00000531  A6: 0xffffffff  A7: 0x00000020
[Nov  6 16:01:25.811]  A8: 0x4000dc3c  A9: 0x000000ce A10: 0x00000000 A11: 0x000000ce
[Nov  6 16:01:25.816] A12: 0x00000003 A13: 0x00000008 A14: 0x00000004 A15: 0x00000008
[Nov  6 16:01:25.822] 
[Nov  6 16:01:25.822] (exc SP: 0x3ffff750)
[Nov  6 16:01:25.824] 
[Nov  6 16:01:25.824] --- BEGIN CORE DUMP ---
[Nov  6 16:01:25.826] mos: catching core dump
[Nov  6 16:01:28.671] ....
[Nov  6 16:01:37.378] ---- END CORE DUMP ----
[Nov  6 16:01:37.382] mos: wrote to /home/martin/m/lamp/core-lamp-esp8266-20191106-160137.644602059 (133133 bytes)
[Nov  6 16:01:37.383] mos: analyzing core dump
Core dump by lamp/esp8266 1.0.0 20191106-145928/g21d1398-master-dirty
Using ELF file at: /home/martin/m/lamp/build/objs/lamp.elf
Using Docker image: docker.io/mgos/esp8266-build:2.2.1-1.5.0-r5
Running docker run --rm -v /home/martin/m/lamp/build/objs/lamp.elf:/fw.elf -v /home/martin/m/lamp/core-lamp-esp8266-20191106-160137.644602059:/core -v /home/martin/m/lamp:/home/martin/m/lamp docker.io/mgos/esp8266-build:2.2.1-1.5.0-r5 bash -c /usr/local/bin/serve_core.py --rom=/opt/Espressif/rom/rom.bin --rom_addr=0x40000000 /fw.elf /core & $MGOS_TARGET_GDB /fw.elf -ex 'target remote 127.0.0.1:1234' -ex 'set confirm off' -ex bt -ex quit
GNU gdb (crosstool-NG crosstool-ng-1.22.0-60-g37b07f6f) 7.10
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=x86_64-build_pc-linux-gnu --target=xtensa-lx106-elf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /fw.elf...done.
Remote debugging using 127.0.0.1:1234
0x40106430 in esp8266_gpio_isr (arg=<optimized out>)
    at /mongoose-os/platforms/esp8266/src/esp_gpio.c:194
194	/mongoose-os/platforms/esp8266/src/esp_gpio.c: No such file or directory.
#0  0x40106430 in esp8266_gpio_isr (arg=<optimized out>)
    at /mongoose-os/platforms/esp8266/src/esp_gpio.c:194
#1  0x4000050c in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Detaching from program: /fw.elf, Remote target
Waiting for gdb on 1234
Found core at 23 - 133096
Mapping DRAM: 98304 @ 0x3ffe8000
Mapping /opt/Espressif/rom/rom.bin at 0x40000000
Mapping /fw.elf .data: 2508 @ 0x3ffe8000
Mapping /fw.elf .rodata: 8 @ 0x3ffe89cc
Mapping /fw.elf .bss: 26544 @ 0x3ffe89d8
Mapping /fw.elf .text: 29026 @ 0x40100000
Mapping /fw.elf .irom0.text: 583451 @ 0x40208010
Loaded core dump from last snippet in  /core
GDB closed the connection
Ending remote debugging.
 Nov  6 16:01:38.371] rebooting

@nliviu could you plese try to reproduce this? Workaround is to call mgos_gpio_disable_int(2); before restart.

Yes, it freezes when using GPIO2. It doesn’t with GPIO5. GPIO2 has some special meaning for ESP8266 and maybe that’s why it needs more attention. The freezing appears only after a software reset (Reset cause: 4 (soft reset)). According to the Espressif docs, after a software reset the GPIO state is unchanged.

You can capture MGOS_EVENT_REBOOT and disable the interrupt there:

static void system_restart_cb(int ev, void *ev_data, void *userdata) {
  mgos_gpio_disable_int((int) (userdata));

  (void) ev;
  (void) ev_data;
}

enum mgos_app_init_result mgos_app_init(void) {
  const int pin = 2;
  mgos_event_add_handler(MGOS_EVENT_REBOOT, system_restart_cb, (void *) pin);
  mgos_gpio_set_int_handler_isr(pin, MGOS_GPIO_INT_EDGE_ANY, cb, NULL);


  mgos_system_restart_after(500);

  return MGOS_APP_INIT_SUCCESS;
}

You can capture MGOS_EVENT_REBOOT and disable the interrupt there

Thanks @nliviu . This will help for controlled reboot. But what about WDT reboot when firmware crashes? In that case it won’t be called and the device will freeze on next boot.

Is there a way to add an early boot hook (before vfs initialization) where I would call mgos_gpio_disable_int(2);?

Starting the timer after receiving MGOS_EVENT_CLOUD_CONNECTED might help too.

I used ISR timer just to demonstrate the problem. In our case we use ISR GPIO handler and it is used for remote controlling the device witg IR. We can’t disable it until MGOS_EVENT_CLOUD_CONNECTED because it should work even if user won’t connect it to cloud. Also I am not sure if there won’t be other places in mos where it could freeze. The best would be to find a root problem of it and try to fix it. Do you have any trace for us please?

Can’t you change GPIO2 to something else which does not have this behavior?

Can’t you change GPIO2 to something else which does not have this behavior?

Unfortunately not, the devices has already been manufactured.

Working code with GPIO2

The device connects to mDash and reports state every 10 seconds if restart_after is long enough:

[Nov  7 13:57:58.394] main.c:181              uptime: 4684392, state: 0
[Nov  7 13:57:58.494] main.c:181              uptime: 4784456, state: 0
[Nov  7 13:57:58.496] main.c:181              uptime: 4784669, state: 1
[Nov  7 13:57:58.502] main.c:181              uptime: 4784669, state: 1
[Nov  7 13:57:58.505] main.c:181              uptime: 4784669, state: 1
[Nov  7 13:57:58.511] main.c:181              uptime: 4784669, state: 1
[Nov  7 13:58:06.531] ip:***,mask:255.255.255.0,gw:***
[Nov  7 13:58:06.534] mgos_net.c:102          WiFi STA: ready, IP ***, GW ***, DNS ***
[Nov  7 13:58:06.543] mgos_mqtt.c:435         MQTT connecting to mdash.net:8883
[Nov  7 13:58:06.554] mgos_mongoose.c:66      New heap free LWM: 39544
[Nov  7 13:58:06.607] mgos_mongoose.c:66      New heap free LWM: 39296
[Nov  7 13:58:06.697] mg_ssl_if_mbedtls.c:35  0x3fff0f54 ciphersuite: TLS-RSA-WITH-AES-128-GCM-SHA256
[Nov  7 13:58:06.704] mgos_mongoose.c:66      New heap free LWM: 38528
[Nov  7 13:58:07.258] mgos_mongoose.c:66      New heap free LWM: 26672
[Nov  7 13:58:07.309] mgos_mqtt.c:141         MQTT TCP connect ok (0)
[Nov  7 13:58:07.365] mgos_mqtt.c:185         MQTT CONNACK 0
[Nov  7 13:58:07.446] mgos_sntp.c:95          SNTP query to time.google.com
[Nov  7 13:58:07.515] mgos_sntp.c:59          SNTP reply from 216.239.35.0: time 1573127888.285617, local 13.803936, delta 1573127874.481681
[Nov  7 13:58:07.798] pm open,type:0 0
[Nov  7 13:58:14.068] main.c:83               report_state - mgos_mqtt_pub ({"state":{"reported":{"uptime": 20.357, "ram_min_free": 41636, "ram_free": 26672}}}) -> success
[Nov  7 13:58:24.068] main.c:83               report_state - mgos_mqtt_pub ({"state":{"reported":{"uptime": 30.357, "ram_min_free": 41636, "ram_free": 26672}}}) -> success
[Nov  7 13:58:32.898] main.c:181              uptime: 39187553, state: 0
[Nov  7 13:58:33.003] main.c:181              uptime: 39292416, state: 1
...
[Nov  7 14:00:37.904] mg_rpc.c:293            sys.reboot via WS_in ***:43832
[Nov  7 14:00:37.910] mgos_utils.c:38         Rebooting in 100 ms
[Nov  7 14:00:38.014] main.c:196              system_restart_cb - mgos_gpio_disable_int(2)
[Nov  7 14:00:38.020] mgos_vfs.c:990          Unmounting filesystems

Thanks for the code. Unfortunately system_restart_cb won’t be called if system freezes for whatever other reason and WDT restarts it. Then it freezes during the next boot (see the logs).

I’ve also tried to add following code for such cases but it didn’t help:

void mgos_app_preinit(void) {
  mgos_gpio_disable_int(2);
}

Logs:

[Nov  7 14:36:47.430] load 0x40100000, len 1540, room 16 
[Nov  7 14:36:47.430] tail 4
[Nov  7 14:36:47.430] chksum 0x62
[Nov  7 14:36:47.430] load 0x3ffe8000, len 748, room 4 
[Nov  7 14:36:47.430] tail 8
[Nov  7 14:36:47.430] chksum 0x85
[Nov  7 14:36:47.430] csum 0x85
[Nov  7 14:36:47.430] rBoot v1.2.1-cesanta1 - richardaburton@gmail.com
[Nov  7 14:36:47.430] Flash Size:   16 Mbit
[Nov  7 14:36:47.430] Flash Mode:   DIO
[Nov  7 14:36:47.430] Flash Speed:  80 MHz
[Nov  7 14:36:47.430] rBoot Option: Big flash
[Nov  7 14:36:47.430] 
[Nov  7 14:36:47.430] Booting rom 0 (0x8000).
| r  l N   N   $`   r p  l  [   : softAP(2e:f4:32:0
[Nov  7 14:36:47.430] esp_main.c:172          test 1.0 (20191107-133321)
[Nov  7 14:36:47.430] esp_main.c:174          Mongoose OS 2.16.0 (20191107-131839/2.16.0-g9b779cd)
[Nov  7 14:36:47.430] esp_main.c:178          CPU: ESP8266EX, 160 MHz, RAM: 52856 total, 50292 free
[Nov  7 14:36:47.430] esp_main.c:180          SDK 2.2.1(1247cc5); flash: 2M
[Nov  7 14:36:47.430] esp_exc.c:211           Reset cause: 4 (soft reset)
[Nov  7 14:36:47.430] mg_lwip_ev_mgr.c:93     Mongoose 6.15, LwIP 1.4.1
[Nov  7 14:36:47.430] mg_ssl_if_mbedtls.c:57  mbed TLS 2.16.3-cesanta2
[Nov  7 14:36:47.430] mgos_vfs_dev.c:73       sfl0: sysflash (), size 2097152
[Nov  7 14:36:47.430] mgos_vfs_dev.c:73       root: part ({"dev": "sfl0", "offset": 897024, "size": 131072}), size 131072
[Nov  7 14:36:47.430] mgos_vfs.c:147          /: SPIFFS @ root, opts 
[Nov  7 14:36:47.430] mgos_vfs.c:320          /: size 113201, used: 502, free: 112699
[Nov  7 14:37:13.410] 
[Nov  7 14:37:13.411] HW WDT @ 0x40106430
[Nov  7 14:37:13.413]  A0: 0x4000050c  A1: 0x3ffff910  A2: 0x00000000  A3: 0x3ffee62d
[Nov  7 14:37:13.418]  A4: 0x3fffc200  A5: 0x00000531  A6: 0xffffffff  A7: 0x00000020
[Nov  7 14:37:13.424]  A8: 0x4000dc3c  A9: 0x000000ce A10: 0x00000000 A11: 0x000000ce
[Nov  7 14:37:13.429] A12: 0x0000000b A13: 0x00000800 A14: 0x00000004 A15: 0x00000008
[Nov  7 14:37:13.435] 
[Nov  7 14:37:13.435] (exc SP: 0x3ffff750)
[Nov  7 14:37:13.437] 
[Nov  7 14:37:13.437] --- BEGIN CORE DUMP ---
[Nov  7 14:37:13.439] mos: catching core dump
[Nov  7 14:37:16.282] ....
[Nov  7 14:37:24.989] ---- END CORE DUMP ----
[Nov  7 14:37:24.994] mos: wrote to /home/martin/m/test/core-test-esp8266-20191107-143724.015077713 (133108 bytes)
[Nov  7 14:37:24.994] mos: analyzing core dump
Core dump by test/esp8266 1.0 20191107-133321
Using ELF file at: /home/martin/m/test/build/objs/test.elf
Using Docker image: docker.io/mgos/esp8266-build:2.2.1-1.5.0-r5
Running docker run --rm -v /home/martin/m/test/build/objs/test.elf:/fw.elf -v /home/martin/m/test/core-test-esp8266-20191107-143724.015077713:/core -v /home/martin/m/test:/home/martin/m/test docker.io/mgos/esp8266-build:2.2.1-1.5.0-r5 bash -c /usr/local/bin/serve_core.py --rom=/opt/Espressif/rom/rom.bin --rom_addr=0x40000000 /fw.elf /core & $MGOS_TARGET_GDB /fw.elf -ex 'target remote 127.0.0.1:1234' -ex 'set confirm off' -ex bt -ex quit
GNU gdb (crosstool-NG crosstool-ng-1.22.0-60-g37b07f6f) 7.10
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=x86_64-build_pc-linux-gnu --target=xtensa-lx106-elf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /fw.elf...done.
194	/mongoose-os/platforms/esp8266/src/esp_gpio.c: No such file or directory.
Remote debugging using 127.0.0.1:1234
0x40106430 in esp8266_gpio_isr (arg=<optimized out>)
    at /mongoose-os/platforms/esp8266/src/esp_gpio.c:194
#0  0x40106430 in esp8266_gpio_isr (arg=<optimized out>)
    at /mongoose-os/platforms/esp8266/src/esp_gpio.c:194
#1  0x4000050c in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Detaching from program: /fw.elf, Remote target
Waiting for gdb on 1234
Found core at 23 - 133071
Mapping DRAM: 98304 @ 0x3ffe8000
Mapping /opt/Espressif/rom/rom.bin at 0x40000000
Mapping /fw.elf .data: 2508 @ 0x3ffe8000
Mapping /fw.elf .rodata: 8 @ 0x3ffe89cc
Mapping /fw.elf .bss: 26544 @ 0x3ffe89d8
Mapping /fw.elf .text: 29281 @ 0x40100000
Mapping /fw.elf .irom0.text: 584815 @ 0x40208010
Loaded core dump from last snippet in  /core
GDB closed the connection
Ending remote debugging.
 Nov  7 14:37:26.010] rebooting

We got into the state when accessing VFS during handling GPIO event of GPIO2 was causing WDT. The fix in our case was to move in our GPIO handler one case from switch to if and the problem was gone. I don’t understand how this could fix it but I can reliably reproduce it. I hesitate to blame the compiler…

I have managed to fix boot freezing by including following lines:

void mgos_app_preinit(void) {
  mgos_gpio_enable_int(2); // right, _enable_
}

This works even if the system is forcibly restarted eg by WDT. No need to call mgos_gpio_disable_int on MGOS_EVENT_REBOOT.