Understanding Mongoose as an ESP32 FreeRTOS task

jstevewood · July 15, 2021, 6:49pm

My application consists of

index.html with RPC connected buttons and sliders
init.js which has software timer and RPC ffi calls to main.c
main.c which sets up hardware timer callbacks (1 msec sensor read and process) and interrupts (generated by real world randomly timed events).

My application is built and OTA updated using the current version of mos

Everything in this structure works very well! However…

To be certain my app will work in all circumstances, I need to understand the relationship between the above mentioned entities and the task structure in ESP’s freertos regime, with respect to the following questions:

How often does the Mongoose task get called?
What other Espressif generated tasks live in a factory delivered WROOM module?
What is the Mongoose task (i.e. init.js? software timer callbacks? HTTP? main.c?)
In what task do hardware timer callbacks and hardware ISRs live?
Do I have to worry about any ESP32 ISR/process/task taking priority over my ISRs?

I have tried to find a document that describes ESP32/Mongoose interaction in sufficient detail but without success. Please point me in the right direction.

Thanks for your help,

JSW

scaprile · July 15, 2021, 9:05pm

I guess you actually mean Mongoose-OS, the event-driven framework, which is … an event-driven framework… on top of (for the ESP32) Espressif’s IDF, which uses FreeRTOS. (Mongoose is an HTTP server and network stack.)
The only “bad news” that I can give you is that (almost ?) everything here uses dynamic memory allocation, which I don’t like 'cause I’m a die hard bare metal old dog.
FreeRTOS is not hard real-time (AFAIK), I wouldn’t mess with priorities, and all callbacks belong to a single context, except for hardware timer interrupts which I’m not 100% sure but I bet they are dispatched via the hardware and FreeRTOS is not aware of that. Those of course have a different context with respect to mOS and can preempt mOS and its callbacks.

jstevewood · July 20, 2021, 12:52am

Thanks for the info.

Questions still remain…

Does Mongoose’s JavaScript engine exist in the same context as the mos-compiled main.c?
Where is the HTTP server installed?
Where are the RPC handlers installed?
How does the Mongoose context begin and end?
Are long software js timer events (I have timers at 1, 10, 60, 300 and 3600 seconds) guaranteed to happen over any number of context switches?
Are main.c software timer events (similar interval size as above) guaranteed to happen over any number of context switches?
How are software timers preserved over a context switch?
What are the other tasks that exist in a plain vanilla ESP-32 system? (I don’t intend to play with priorities, or add or subtract tasks. I think that it is essential to understand what is going on in the system).

I have looked through lots of Espressif and Mongoose documents, but I can’t find any specific material. A block diagram showing how Mongoose plays with the IDF would be really useful.

Thanks for your help,

JSW

scaprile · July 21, 2021, 3:22pm

1: All mOS is one context, it is an event-driven framework and since there is no preemption all events are in the same context.
2: define “where”, Mongoose is HTTP+MQTT+… Mongoose-OS is Mongoose + event-driven engine
3: define “where”, your handlers are callbacks that are called when an event fires
4: I don’t know what you mean by “begin” and “end”
5: there are no context switches as far as you and me are concerned, for us mortals everything is in one context.
6: there are no separate js timer events and software timers; there are timers, and there are timer-expired events which fire callbacks. Those callbacks are caught by handlers that can be mJS scripts or C functions. See 5
7: See 5
8: See my previous answer. I guess the guys at Cesanta have a block diagram but I don’t think they will give it to you for free; this product has a commercial license that pays their bills, and thanks to them we have a nice piece of software with an Apache license. As far as I can see, you have the hw, on top of that you have Espressif’s IDF that uses FreeRTOS, on top of that you have mOS (Mongoose-OS). If you program in mJS you only see the mJS interpreter, if you program in C you can also call the IDF and if you are brave you can also call FreeRTOS. As long as you don’t use a hardware interrupt everything for you is one context.
On the other side, depending on the chip, there is WiFi handling which in some micros requires lotsa CPU time, and on the ESP32 used to devote a big chunk of one of the cores. By definition, on a networked system, unless you have a hard-realtime OS and perhaps suitable hw to allow it to do its job, you don’t have guaranteed timing unless of course you run it in parallel via hw interrupts.
As far as I’ve tested, if you set a repeatable timer as a callback, in C, for <10ms on an ESP32, you’ll run into timing issues.
Hope it helps. If you need to control everything you’ll need an ARM Cortex-R5 and a certified TCP/IP stack and hard-realtime RTOS (or write everything bare metal yourself).

jstevewood · July 27, 2021, 1:37am

I am probably not using the right terminology. Is context the same as task?

All of this is inspired by a tight loop that writes 200 characters to flash. Sometimes (actually all the time when the flash is more than half full), the WDT on the Mongoose task times out, and the system re-starts. So, the Mongoose task does indeed come to an end, because in all cases where the WDT does not trigger, the Mongoose task has finished before the WDT expires. After the Mongoose task finishes, then what happens? Does another ESP originated task start? Is it just an idle task that re-starts the Mongoose task?

Hardware timers triggering interrupts (and I assume also hardware originated events) live outside any task and schedule structure, so I am not worried about them. But are js software timers different from C originated software timers?

My real concern is to ensure that software timers (either in js or in C) that are on really long intervals (i. e. minutes and hours) will be guaranteed to invoke their designated callbacks. This reliability of software timers appears to be actually happening, but I don’t think that I can finish my application without understanding how and why.

And we would be willing to pay a reasonable price for that block diagram.

Thanks for your help,

JSW

scaprile · July 27, 2021, 3:53pm

In an event-driven architecture, you are not supposed to loop. You write event handlers that do what they need to do and return. If you have more stuff to do, you fire an event to resume later. All you write in mOS are event handlers, functions that are called in response to an event occurring, you handle the event and return. Whatever the microcontroller does when no event happens belongs to the internals of the framework, the event engine. 99.9% of the time you write applications you don’t need to dive there.
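
A minimal sketch of the "do a chunk, return, resume later" pattern described above, written in plain C so it can be tried off-target. The names `write_job`, `write_job_step` and `CHUNK_BYTES` are hypothetical, and `memcpy` only stands in for the real flash write. On a device, the handler would presumably re-arm itself (for example with a short mOS timer) whenever the step function reports that work remains; that wiring is an assumption and is not shown here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical chunk size: small enough that one step returns quickly. */
#define CHUNK_BYTES 32

/* State that survives between handler invocations. */
struct write_job {
  const char *src; /* data still to be written */
  size_t len;      /* total number of bytes */
  size_t done;     /* bytes written so far */
  char *dst;       /* stand-in for the flash destination */
};

/*
 * Write at most CHUNK_BYTES and return.
 * Returns 1 if more work remains (the caller should schedule another step),
 * 0 when the job is finished.
 */
int write_job_step(struct write_job *job) {
  size_t left = job->len - job->done;
  size_t n = left < CHUNK_BYTES ? left : CHUNK_BYTES;
  memcpy(job->dst + job->done, job->src + job->done, n); /* "flash write" */
  job->done += n;
  return job->done < job->len;
}
```

Each call does a bounded amount of work, so the event loop gets control back between chunks and can service timers, networking and the watchdog.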

If you get WDT resets then post the code where you are experiencing that, so others can help you. Your description suggests a seek operation taking longer than expected. The WDT should be properly handled by the framework if you don’t loop so this should not happen if you are doing things in an event-driven-oriented way. The framework will not preempt your callback/handler/“task”/code, it can only run again when your function returns.

Hardware timers have a different context than software timers. All you write with mOS functions is code that, as far as you are concerned, will run to completion and will not be preempted by other code you write. The exception is hardware timer handlers, which will interrupt other code: processor registers will be saved and the stack might also be switched (I don’t know the particulars of the exception handling hardware or software involved), so you have a different context. If you share variables between generic mOS code and hardware timer interrupts, you have to take atomicity into consideration or protect accesses; use mutexes or semaphores or whatever you like. With all other code, you can safely share static variables, since all handlers belong to the same context.
There are no mJS timers nor C timers. mOS provides you with the possibility to set a timer that once it expires will invoke a callback, the function to be called you can write it in C or you can write it in mJS. If it is written in C, it has been compiled and the callback will be to processor code performing your function. If it is written in mJS, the framework will call the mJS interpreter to interpret your code, which of course is slower.
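
To make the atomicity point concrete, here is a minimal sketch of one way to share a counter between an interrupt handler and ordinary handlers using C11 atomics. The function names are hypothetical, and it assumes the toolchain provides lock-free 32-bit atomics for the target; it is an illustration, not how mOS itself does it.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Incremented in interrupt context, consumed in the normal (single) context. */
static atomic_uint_fast32_t s_pending_events;

/* Hypothetical body of a hardware timer or GPIO interrupt handler. */
void on_hw_interrupt(void) {
  atomic_fetch_add_explicit(&s_pending_events, 1, memory_order_relaxed);
}

/*
 * Called from an ordinary event handler.
 * Reads and clears the counter in one atomic step, so an interrupt arriving
 * between the "read" and the "clear" cannot be lost.
 */
uint32_t take_pending_events(void) {
  return (uint32_t) atomic_exchange_explicit(&s_pending_events, 0,
                                             memory_order_relaxed);
}
```

A plain `count = s_pending; s_pending = 0;` pair would lose any interrupt that fires between the two statements, which is the kind of problem the paragraph above warns about.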

jstevewood · July 28, 2021, 3:27am

OK so as I now understand things:

There is no other Espressif written ESP-32 task in a plain vanilla mos compiled application
The only other items that could possibly pre-empt code that is busy attempting to run to completion are: a) hardware interrupts and b) callbacks called by hardware timers. In both of these cases, the code attempting to complete will be suspended for the duration of that ISR or callback (which will typically be a very short time, and I should not worry about it).

However, this generates 3 new questions:

When all of my code in this event generated architecture has completed, and the system is just idling, what code/hardware/thing resets the WDT?
How are software timers implemented? (They can’t be implemented as “code running to completion” because you can program an unlimited number of them, yet, unlike a hardware timer, they are prevented from triggering a callback when, for example, my code is busy trying to finish writing to flash.)
If my flash writing takes 20 seconds to complete, what happens to a software timer that is programmed to repeat at 60 seconds? (Does it expire at approximately 60 seconds from when it is called? or will it expire at closer to 80 seconds?)

Thanks for your help!

JSW

scaprile · July 28, 2021, 3:34pm

First paragraph

1: I don’t know, there is FreeRTOS and there is the IDF. As far as I care, my code can not interrupt my code. I write event handlers and return control to mOS. Whatever happens below that is not my concern; I know this is not a hard realtime system and I know my callbacks are usually fired within milliseconds of the time they are expected to be fired, or better.
2: Same as 1. Busy waiting and event-driven are usually mutually exclusive.
Second paragraph
“Idling” means you are not doing anything, but from mOS downwards there is housekeeping (the IDF is handling WiFi, lwIP is handling TCP/IP, FreeRTOS is doing what an OS does…). mOS is what calls you, and it also calls the IDF to kick the WD when it considers it has to do so. What the IDF does, I don’t know; it might do it itself or call FreeRTOS, I guess it kicks the dog itself.
I don’t know how software timers are implemented internally, though parts of the source code are free for you to check. Your timer handler callback will be called when it is due time to do so. Usually an event-driven OS or framework will have a tick interrupt, count time, then queue events. The main event queue handler will dequeue events and call the proper user/system handlers registered for that event when it runs.
It doesn’t matter how long you take to write your flash, you can be writing all day, just don’t loop and return control. An event handler handles an event and returns, it does not while(1) nor for(;;); work has to be done in reasonable chunks using reasonable time.
If you set a timer to fire an event every 60 seconds, it will fire an event every 60 seconds. It means that mOS will take care of queuing an event at the proper time. mOS then will dequeue that event and call the appropriate handler when you let it do it by properly returning from your handlers that were called in response to other events. A repeatable timer will be requeued on expiration, so the events will fire at that time interval, but if you don’t return then mOS won’t have a chance to call your callback.
If you are “writing to flash” that is because there was an event that called the callback you are using to write to flash (even if it is just the init function), and until you return from that callback, mOS can not do anything but queue other events. If it takes 20 seconds for you to do your job, you must partition your job in small pieces that can finish in a reasonable time, where “reasonable time” means milliseconds. If for some reason you can’t release control, you have to kick the WD yourself, but be aware that you probably are also disrupting TCP, WiFi, and system timing. While you loop those 20 seconds, even though the timer event is queued, mOS can not call your callback (run to completion, not interruptible, single context) until you return.
If you can’t live with this then write a FreeRTOS task and run in parallel with the framework.

I suggest you post some code to check what you are not doing right.
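
The behaviour described in this post (timers become due, but their callbacks only run when the loop gets control back) can be illustrated with a small off-target simulation. This is not the mOS implementation: every name below is hypothetical, the clock is simulated, and the sketch assumes a repeating timer is re-armed relative to its previous due time rather than to the time the callback actually ran.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*timer_cb_t)(void *arg);

struct soft_timer {
  uint64_t due_ms;    /* absolute time of the next expiry */
  uint64_t period_ms; /* 0 means one-shot */
  timer_cb_t cb;
  void *arg;
  int active;
};

#define MAX_TIMERS 8
static struct soft_timer s_timers[MAX_TIMERS];
static uint64_t s_now_ms; /* simulated clock */

/* Log of callback invocation times, used only for the demonstration. */
uint64_t g_fired_at[16];
int g_fired_count;

void advance_clock(uint64_t ms) { s_now_ms += ms; }

/* Register a timer; returns its slot index, or -1 if the table is full. */
int set_timer(uint64_t period_ms, int repeat, timer_cb_t cb, void *arg) {
  for (int i = 0; i < MAX_TIMERS; i++) {
    if (!s_timers[i].active) {
      s_timers[i] = (struct soft_timer){ .due_ms = s_now_ms + period_ms,
                                         .period_ms = repeat ? period_ms : 0,
                                         .cb = cb,
                                         .arg = arg,
                                         .active = 1 };
      return i;
    }
  }
  return -1;
}

/* One pass of the event loop: run every timer whose due time has passed. */
void poll_once(void) {
  for (int i = 0; i < MAX_TIMERS; i++) {
    struct soft_timer *t = &s_timers[i];
    if (!t->active || t->due_ms > s_now_ms) continue;
    if (t->period_ms) {
      t->due_ms += t->period_ms; /* re-arm from the previous due time */
    } else {
      t->active = 0;
    }
    t->cb(t->arg); /* runs to completion; nothing else is dispatched meanwhile */
  }
}

/* Example callback: record when it was actually invoked. */
void record_fire(void *arg) {
  (void) arg;
  if (g_fired_count < 16) g_fired_at[g_fired_count++] = s_now_ms;
}
```

With a 60 s repeating timer, if some handler holds the loop from t = 50 s to t = 70 s (simulated by calling `advance_clock` without `poll_once`), the callback runs at 70 s instead of 60 s. Under the re-arm assumption above, the next expiry is still due at 120 s, so the delay does not accumulate.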

nliviu · July 28, 2021, 5:32pm

If I understood correctly, you are using hardware timers. In this case the callback is executed in ISR context.

The only safe operations in ISR context are read/write GPIOs and mgos_invoke_cb.
The same applies to mgos_gpio_set_int_handler_isr.
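
One common way to keep work out of ISR context is to have the interrupt handler only hand data over and let an ordinary callback process it. The sketch below is a generic single-producer, single-consumer ring buffer in C11, not mOS code; all names are hypothetical. In mOS, the ISR would presumably follow the push with `mgos_invoke_cb` to schedule the draining callback, which is not shown here.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u /* must be a power of two */

static uint16_t s_ring[RING_SIZE];
static atomic_uint s_head; /* advanced only by the producer (ISR) */
static atomic_uint s_tail; /* advanced only by the consumer (normal context) */

/* Producer side, intended for ISR context. Returns 0 if the ring is full. */
int isr_push_sample(uint16_t sample) {
  unsigned head = atomic_load_explicit(&s_head, memory_order_relaxed);
  unsigned tail = atomic_load_explicit(&s_tail, memory_order_acquire);
  if (head - tail == RING_SIZE) return 0; /* full: drop the sample */
  s_ring[head % RING_SIZE] = sample;
  /* Publish the slot only after the sample has been stored. */
  atomic_store_explicit(&s_head, head + 1, memory_order_release);
  return 1;
}

/* Consumer side, intended for an ordinary callback. Returns 0 if empty. */
int main_pop_sample(uint16_t *out) {
  unsigned tail = atomic_load_explicit(&s_tail, memory_order_relaxed);
  unsigned head = atomic_load_explicit(&s_head, memory_order_acquire);
  if (head == tail) return 0; /* nothing queued */
  *out = s_ring[tail % RING_SIZE];
  atomic_store_explicit(&s_tail, tail + 1, memory_order_release);
  return 1;
}
```

Because each index has exactly one writer, no lock is needed, and the ISR never blocks or allocates memory.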

As I said in another thread, it’s a good idea to call mongoose_poll(0) from time to time during a lengthy operation.

Timers source code.

I second @scaprile’s suggestion to post some code that reproduces your issues.

jstevewood · July 30, 2021, 10:23pm

I will post the code. Since the program is quite large, I will have to extract the relevant part in an isolated module. Early next week.

My questions in this thread really have to do with gaining a more complete understanding of how everything plays together in an ESP-32 application. I have been working on this project for almost 2 years, and at the moment, even without a complete understanding, my application appears to be working very well. (That is, other than the file writing issue that is the genesis of the thread; but if I have to, I have a workaround for this issue.)

In looking into the timers source code, I think that I now understand enough about how this code works to ensure that I will never have a problem with software timers. Thanks for the direction.

However, I don’t understand the statement:
"The only safe operations in ISR context are read/write GPIOs and mgos_invoke_cb"

What is “safe”?
My current architecture uses a hardware timer at 1msec repeat intervals. The callback associated with this hardware timer reads 4 A/D channels and does some arithmetic with the derived results, and exits. The A/D conversions at about 35 microseconds each account for the majority of the time consumed by this callback ( total duration measured at about 200 microseconds).
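
As a rough budget from the figures above (35 microseconds per conversion and 200 microseconds in total are the measured values quoted; nothing else is assumed):

```latex
\begin{align*}
t_{\text{ADC}}   &= 4 \times 35\,\mu\text{s} = 140\,\mu\text{s} \\
t_{\text{other}} &= 200\,\mu\text{s} - 140\,\mu\text{s} = 60\,\mu\text{s} \\
\text{load}      &= \frac{200\,\mu\text{s}}{1000\,\mu\text{s}} = 20\%
\end{align*}
```

So roughly one fifth of every millisecond is spent inside this callback, during which the code it interrupts does not run.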

At the moment, this hardware timer - call back construction appears to be working perfectly.

Are you saying that this is not a proper way to perform this type of function? Should I put mgos_invoke_cb as the only instruction in the existing hardware timer callback, and then put all of the A/D and calculation in a different callback invoked by mgos_invoke_cb?

Thanks for your help,

JSW

scaprile · August 2, 2021, 10:46pm

I will not continue this thread, I’m sorry, I just don’t have the time.

jstevewood · August 3, 2021, 4:50pm

Sorry to keep at this. I have re-read all of the Mongoose-OS docs, and I can’t find any explanation of ISR context. Direction to where this info exists would be most helpful. I am sure that all ESP-32 apps that use hardware timers are affected by this. Your explanation so far has been very helpful. The only thing that I cannot grasp is:

"The only safe operations in ISR context are read/write GPIOs and mgos_invoke_cb. "

Please explain what is meant by “safe”?

Thanks as always for the excellent assistance.

JSW

nliviu · August 3, 2021, 5:00pm

Found it here. That’s a message from one of Mongoose OS’s developers.

jstevewood · August 5, 2021, 1:45am

Thanks!

I think mgos_clear_timer(int) is also safe. It is implemented in IRAM.

Hopefully also adc1_get_raw(int) from the IDF is safe.

Both of these appear to have been functioning flawlessly over thousands of cycles.