I have several hundred field devices that have been running 2.18 for the last couple of years. I recently did some maintenance updates and also migrated to 2.20 for this particular hardware version. Most of them seem to be OK; however, I’ve now had a handful that go offline a few days after the upgrade and don’t come back online.
I managed to get my hands on one that was fairly local and realised that it was core dumping immediately whenever a lock closed (so probably a power issue, but one that has only shown up since the firmware update).
Here’s the core dump:
Loaded core dump from last snippet in /core
0x4008fdfa in multi_heap_malloc_impl (heap=0x3f800000, size=20) at /opt/Espressif/esp-idf/components/heap/multi_heap.c:432
432 MULTI_HEAP_ASSERT(is_free(b), b); // block should be free
#0 0x4008fdfa in multi_heap_malloc_impl (heap=0x3f800000, size=20) at /opt/Espressif/esp-idf/components/heap/multi_heap.c:432
#1 0x400838e6 in heap_caps_malloc (size=20, caps=5120) at /opt/Espressif/esp-idf/components/heap/heap_caps.c:111
#2 0x40083b15 in heap_caps_calloc (n=<optimized out>, size=20, caps=5120)
at /opt/Espressif/esp-idf/components/heap/heap_caps.c:329
#3 0x40083b7a in heap_caps_calloc_prefer (n=1, size=20, num=1) at /opt/Espressif/esp-idf/components/heap/heap_caps.c:231
#4 0x400861fd in wifi_calloc (n=1, size=20) at /opt/Espressif/esp-idf/components/esp32/esp_adapter.c:92
#5 0x4008623c in wifi_zalloc_wrapper (size=20) at /opt/Espressif/esp-idf/components/esp32/esp_adapter.c:100
#6 0x40115cbb in esp_wifi_sta_get_ap_info ()
#7 0x4010ceec in mgos_wifi_sta_get_rssi () at /Users/gadget-man/Documents/iParcelBox/3-Firmware/iParcelBox_firmware/deps/wifi/src/esp32/esp32_wifi.c:599
#8 0x401020fd in ffi_call (func=0x4010cee4 <mgos_wifi_sta_get_rssi>, nargs=0, res=0x3ffb82f8 <mgos_task_stack+15376>, args=0x3ffb8220 <mgos_task_stack+15160>)
at /Users/gadget-man/Documents/iParcelBox/3-Firmware/iParcelBox_firmware/deps/modules/mjs/mjs.c:7434
#9 0x40106859 in mjs_ffi_call2 (mjs=0x3ffdd870) at /Users/gadget-man/Documents/iParcelBox/3-Firmware/iParcelBox_firmware/deps/modules/mjs/mjs.c:10892
#10 0x40108d49 in mjs_execute (mjs=0x3ffdd870, off=<optimized out>, res=0x3ffb8448 <mgos_task_stack+15712>)
at /Users/gadget-man/Documents/iParcelBox/3-Firmware/iParcelBox_firmware/deps/modules/mjs/mjs.c:9639
#11 0x40109c54 in mjs_apply (mjs=0x3ffdd870, res=0x3ffb84a8 <mgos_task_stack+15808>, func=4611389493885322285, this_val=<optimized out>, nargs=nargs@entry=1, args=args@entry=0x3ffde7b4)
at /Users/gadget-man/Documents/iParcelBox/3-Firmware/iParcelBox_firmware/deps/modules/mjs/mjs.c:9955
#12 0x40109e89 in ffi_cb_impl_generic (param=<optimized out>, data=<optimized out>) at /Users/gadget-man/Documents/iParcelBox/3-Firmware/iParcelBox_firmware/deps/modules/mjs/mjs.c:10365
#13 0x4010a099 in ffi_cb_impl_wpwwwww (w0=1073604396, w1=1074281972, w2=0, w3=1073600032, w4=1073447888, w5=1)
at /Users/gadget-man/Documents/iParcelBox/3-Firmware/iParcelBox_firmware/deps/modules/mjs/mjs.c:10430
#14 0x400e66d9 in mgos_timer_ev (nc=<optimized out>, ev=<optimized out>, ev_data=<optimized out>, user_data=0x3ffcd7e4) at /mongoose-os/src/mgos_timers.c:101
#15 0x40187de5 in mg_call (nc=0x3ffcd7f4, ev_handler=0x400e6600 <mgos_timer_ev>, user_data=0x3ffcd7e4, ev=6, ev_data=0x3ffb85a0 <mgos_task_stack+16056>) at src/mg_net.c:78
#16 0x401893d5 in mg_timer (now=1716575984.1110661, c=0x3ffcd7f4) at src/mg_net.c:100
#17 mg_if_poll (nc=0x3ffcd7f4, now=1716575984.1110661) at src/mg_net.c:139
#18 0x4018c4b3 in mg_lwip_if_poll (iface=<optimized out>, timeout_ms=<optimized out>) at src/common/platforms/lwip/mg_lwip_ev_mgr.c:119
#19 0x4019ee68 in mg_mgr_poll (m=0x3ffbdd0c <s_mgr>, timeout_ms=0) at src/mg_net.c:283
#20 0x40184440 in mongoose_poll (ms=0) at /data/tmp/mos_prebuild/tmp/cesanta/mos-libs/mongoose/src/mgos_mongoose.c:61
#21 0x400859ee in mgos_mg_poll_cb (arg=<optimized out>) at /Users/gadget-man/Documents/iParcelBox/3-Firmware/iParcelBox_firmware/deps/freertos/src/mgos_freertos.c:103
#22 0x40085bb0 in mgos_task (arg=<optimized out>) at /Users/gadget-man/Documents/iParcelBox/3-Firmware/iParcelBox_firmware/deps/freertos/src/mgos_freertos.c:222
Can any of you lot much smarter than me help me work out from this:
a) what’s causing the core dump (other than just that it’s a memory allocation error)?
b) why it’s only doing it on 2.20, when it was absolutely rock solid on 2.18?
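In case it helps anyone reading the trace: the caps=5120 value in frames #1 and #2, and the heap=0x3f800000 address in frame #0, can be decoded with a few lines of Python. This is only a sketch. The capability bit values are what I understand esp_heap_caps.h to define, and the external RAM address window is what I understand the ESP32 memory map to be; both are assumptions that should be checked against the ESP-IDF version in the build. The helper names (decode_caps, in_ext_ram) are made up for illustration.

```python
# Capability bits as I understand them from ESP-IDF's esp_heap_caps.h.
# ASSUMPTION: verify these against the IDF version used by the firmware.
MALLOC_CAPS = {
    "MALLOC_CAP_EXEC": 1 << 0,
    "MALLOC_CAP_32BIT": 1 << 1,
    "MALLOC_CAP_8BIT": 1 << 2,
    "MALLOC_CAP_DMA": 1 << 3,
    "MALLOC_CAP_SPIRAM": 1 << 10,
    "MALLOC_CAP_INTERNAL": 1 << 11,
    "MALLOC_CAP_DEFAULT": 1 << 12,
}

# ASSUMPTION: ESP32 external (SPI) RAM data window, end address exclusive.
EXT_RAM_START = 0x3F800000
EXT_RAM_END = 0x3FC00000


def decode_caps(caps):
    """Return (names of known capability bits set in caps, leftover unknown bits)."""
    names = [name for name, bit in MALLOC_CAPS.items() if caps & bit]
    known_mask = 0
    for bit in MALLOC_CAPS.values():
        known_mask |= bit
    return names, caps & ~known_mask


def in_ext_ram(addr):
    """True if addr falls inside the assumed external RAM window."""
    return EXT_RAM_START <= addr < EXT_RAM_END


names, unknown = decode_caps(5120)         # caps value from frames #1 and #2
print(" | ".join(names), hex(unknown))     # MALLOC_CAP_SPIRAM | MALLOC_CAP_DEFAULT 0x0
print(in_ext_ram(0x3F800000))              # heap pointer from frame #0 -> True
```

If those values are right, the 20-byte allocation that wifi_calloc requested in frame #4 was being attempted in external SPI RAM rather than internal RAM, which may or may not be relevant to a) and b).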