OTA update fails occasionally

1. My goal is:

To solve a problem with OTA updates that makes it impossible to update devices.

2. My actions are:

Hardware: Olimex ESP32-EVB
Mongoose OS: 2.12.1 (20190409-161926/2.12.1-gdd403d6)

After a device in the field could no longer be updated, I ran several OTA updates through mDash in which only the version and one log output were changed.

The mos.yml:

libs_version: ${mos.version}
modules_version: ${mos.version}
mongoose_os_version: ${mos.version}

tags:
  - c

sources:
  - src

filesystem:
  - fs

# Config
config_schema:
  - ["shadow", "o", {title: "Device shadow settings"}]
  - ["shadow.enable", "b", true, {title: "Enable device shadow functionality"}]
  - ["shadow.lib", "s", "dash", {title: "Preferred shadow lib, e.g. aws, dash, gcp"}]
  - ["update", "o", {title: "Firmware updater"}]
  - ["update.timeout", "i", 600, {title: "Update will be aborted if it does not finish within this time"}]
  - ["update.commit_timeout", "i", 600, {title: "After applying update, wait for commit up to this long"}]
  - ["shadow.ota_enable", "b", true, {title: "Enable OTA via shadow"}]
  - ["shadow.autocommit", "b", false, {title: "Autocommit OTA if the shadow connection is successful"}]

libs:
  - origin: https://github.com/mongoose-os-libs/boards
  - origin: https://github.com/mongoose-os-libs/dash
  - origin: https://github.com/mongoose-os-libs/shadow
  - origin: https://github.com/mongoose-os-libs/ca-bundle
  - origin: https://github.com/mongoose-os-libs/rpc-common
  - origin: https://github.com/mongoose-os-libs/rpc-service-config
  - origin: https://github.com/mongoose-os-libs/rpc-service-fs
  - origin: https://github.com/mongoose-os-libs/rpc-uart
  - origin: https://github.com/mongoose-os-libs/rpc-service-ota
  - origin: https://github.com/mongoose-os-libs/ota-common
  - origin: https://github.com/mongoose-os-libs/ota-http-client
  - origin: https://github.com/mongoose-os-libs/ota-shadow
  - origin: https://github.com/mongoose-os-libs/file-logger

manifest_version: 2017-09-29

3. The result I see is:

The update works several times in a row, and then the device gets stuck on one version. OTA updates to newer versions no longer take effect. During the OTA the boot partition is set correctly, but the device restarts from the old partition.

The following is a brief history of the current update problem. Notable lines are marked with a leading #.

OTA Step 1 - Version 0.3 to 0.4 (okay)

mgos_ota_core.c:488     FW: device esp32 0.4 20190409-161926
esp32_ota_backend.c:197 App: device.bin -> app_1, FS: fs.img -> fs_1
esp32_ota_backend.c:265 Writing app image @ 0x1d0000
esp32_ota_backend.c:383 app_1 verified (1a5f6fff5a760258749facdc04c927fd1ebd989e)
esp32_ota_backend.c:448 Setting boot partition to app_1
# esp_ota_ops: New OTA data 1: seq 0x00000002, st 0x11, CRC 0xddd0de27
mgos_system.c:42        Restarting
boot: OTA data 0: seq 0x00000001, st 0x10, CRC 0x157a2b85, valid? 1
boot: OTA data 1: seq 0x00000002, st 0x11, CRC 0xddd0de27, valid? 1
# boot: Loaded app from partition at offset 0x1d0000
mgos_hal_freertos.c:177 device 0.4 (20190409-161926)
esp32_ota_backend.c:448 Setting boot partition to app_0
# esp_ota_ops: New OTA data 0: seq 0x00000001, st 0x10, CRC 0x157a2b85
esp32_main.c:119        Boot partition: app_1; flash: 4M
mg_rpc.c:293            OTA.Commit via WSS_out 148.251.54.236:443
esp32_ota_backend.c:448 Setting boot partition to app_1
esp32_ota_backend.c:565 Committed slot 1

View full log

OTA Step 2 - Version 0.4 to 0.2 (failed)

mgos_ota_core.c:488     FW: device esp32 0.2 20190328-173727
esp32_ota_backend.c:197 App: device.bin -> app_0, FS: fs.img -> fs_0
esp32_ota_backend.c:265 Writing app image @ 0x10000
esp32_ota_backend.c:383 app_0 verified (8699599c57bbd2896333c1d9a746dad36df77c3b)
esp32_ota_backend.c:448 Setting boot partition to app_0
# esp_ota_ops: New OTA data 0: seq 0x00000001, st 0x10, CRC 0x157a2b85
mgos_system.c:42        Restarting
boot: OTA data 0: seq 0x00000001, st 0x10, CRC 0x157a2b85, valid? 1
boot: OTA data 1: seq 0x00000002, st 0x11, CRC 0xddd0de27, valid? 1
# boot: Loaded app from partition at offset 0x1d0000
mgos_hal_freertos.c:177 device 0.4 (20190409-161926)
esp32_main.c:119        Boot partition: app_1; flash: 4M
# esp32_ota_backend.c:448 Setting boot partition to app_1
mg_rpc.c:293            OTA.Commit via WSS_out 148.251.54.236:443
esp32_ota_backend.c:448 Setting boot partition to app_0
# esp_ota_ops: New OTA data 0: seq 0x00000001, st 0x10, CRC 0x157a2b85
esp32_ota_backend.c:565 Committed slot 0

view full log

OTA Step 3 - Version 0.4 to 0.4.1 (failed)

mgos_ota_core.c:488     FW: device esp32 0.4.1 20190423-124216
esp32_ota_backend.c:197 App: device.bin -> app_0, FS: fs.img -> fs_0
esp32_ota_backend.c:265 Writing app image @ 0x10000
esp32_ota_backend.c:383 app_0 verified (79d66685f4e7812b5a299187547e96023a4cb760)
esp32_ota_backend.c:448 Setting boot partition to app_0
# esp_ota_ops: New OTA data 0: seq 0x00000001, st 0x10, CRC 0x157a2b85
mgos_system.c:42        Restarting
boot: OTA data 0: seq 0x00000001, st 0x10, CRC 0x157a2b85, valid? 1
boot: OTA data 1: seq 0x00000002, st 0x11, CRC 0xddd0de27, valid? 1
# boot: Loaded app from partition at offset 0x1d0000
mgos_hal_freertos.c:177 device 0.4 (20190409-161926)
esp32_main.c:119        Boot partition: app_1; flash: 4M
# esp32_ota_backend.c:448 Setting boot partition to app_1
mg_rpc.c:293            OTA.Commit via WSS_out 148.251.54.236:443
esp32_ota_backend.c:448 Setting boot partition to app_0
# esp_ota_ops: New OTA data 0: seq 0x00000001, st 0x10, CRC 0x157a2b85
esp32_ota_backend.c:565 Committed slot 0

view full log
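
One way to read the lines marked with # is against the slot selection rule of the stock ESP-IDF bootloader: of the valid otadata entries, the one with the highest sequence number wins, and the slot is (seq - 1) modulo the number of app slots. The sketch below only models that rule. It is an assumption that it applies here, since Mongoose OS ships its own modified esp-idf and the st field in the log is not modeled; the function names are made up for illustration.

```python
UNSET = 0xFFFFFFFF  # erased otadata entry


def select_slot(seqs, app_count=2):
    """Slot index chosen from otadata sequence numbers (stock ESP-IDF rule).

    Simplification: with no valid entry this returns slot 0, whereas a real
    bootloader would first look for a factory partition.
    """
    valid = [s for s in seqs if s != UNSET]
    if not valid:
        return 0
    return (max(valid) - 1) % app_count


def next_seq_for(target_slot, seqs, app_count=2):
    """Smallest sequence number above the current maximum that selects target_slot."""
    valid = [s for s in seqs if s != UNSET]
    seq = max(valid, default=0) + 1
    while (seq - 1) % app_count != target_slot:
        seq += 1
    return seq


# Sequence numbers taken from the "boot: OTA data" lines in steps 2 and 3.
print("selected slot:", select_slot([0x00000001, 0x00000002]))
print("seq needed to switch to app_0:", next_seq_for(0, [0x00000001, 0x00000002]))
```

Under this rule, the entries with seq 1 and seq 2 select app_1, which matches "Loaded app from partition at offset 0x1d0000". Switching to app_0 would need a sequence number of 3, while the log shows entry 0 being rewritten with seq 0x00000001 again.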

Another example can be seen here:
Full log of another OTA failure
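
To check further logs for the same pattern, a small script can compare the slot requested before the restart with the slot reported after it. This is only a sketch: it assumes the log line formats shown in the excerpts above, and the function name and sample list are hypothetical.

```python
import re

SET_RE = re.compile(r"Setting boot partition to (app_\d+)")
BOOT_RE = re.compile(r"Boot partition: (app_\d+)")


def check_ota_log(lines):
    """Return (requested, booted) for one OTA log excerpt.

    requested: last slot set before the "Restarting" line
    booted:    slot from the first "Boot partition:" line after the restart
    Either value is None if the corresponding line is missing.
    """
    requested = None
    booted = None
    restarted = False
    for line in lines:
        if "Restarting" in line:
            restarted = True
            continue
        if not restarted:
            match = SET_RE.search(line)
            if match:
                requested = match.group(1)
        elif booted is None:
            match = BOOT_RE.search(line)
            if match:
                booted = match.group(1)
    return requested, booted


# Lines copied from OTA step 2 above.
STEP2 = [
    "esp32_ota_backend.c:448 Setting boot partition to app_0",
    "mgos_system.c:42        Restarting",
    "# boot: Loaded app from partition at offset 0x1d0000",
    "esp32_main.c:119        Boot partition: app_1; flash: 4M",
]

requested, booted = check_ota_log(STEP2)
if requested != booted:
    print("OTA did not take effect: requested", requested, "but booted", booted)
```

Only the first "Boot partition:" line after the restart is considered, so the later "Setting boot partition" lines written during the commit do not affect the result.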

4. My expectation & question is:

What can be the cause and how can it be solved?

There was an OTA bug in 2.12.
Release 2.13 has it fixed. Please use 2.13 or higher; OTA should then succeed.

I am facing exactly the same problem. I updated the mos tool (current version: 201905031521, build ID: 20190503-152306/gb952383-master), but the OTA problem still persists.

@cpq but what was the bug? I don’t see anything mentioned in the changelog of Mongoose OS releases … so if there was a bug, why is it not mentioned?


Did you ever find anything about this? I’m seeing about 40% OTA failures. It does appear to be fixed in 2.18.0 though.