Pulled in the 5.1+5.3 docs from the esp8266 branch.

With minor modifications to drop ESP8266 specific information not
applicable to the ESP32 series. Further corrections welcome.
This commit is contained in:
Johny Mattsson 2021-08-22 19:25:39 +10:00
parent 17df207a5f
commit e52e0a8e84
7 changed files with 1044 additions and 0 deletions

195
docs/lfs.md Normal file
View File

@ -0,0 +1,195 @@
# Lua Flash Store (LFS)
## Some Acronym Definitions
Before I begin, I think it useful to define some acronyms in the context used in this paper. There are others like [IoT](https://en.wikipedia.org/wiki/Internet_of_things) which you might know and on first use are hyperlinked to an external (typically Wikipedia) definition.
Acronym | What it means in this paper
-----|--------
**LFS** | _Lua Flash Store_, and if you want to understand more then read on.
**RTS** | The Lua environment on the ESP chipset include a core [Runtime System](https://en.wikipedia.org/wiki/Runtime_system) includes a [VM](https://en.wikipedia.org/wiki/Virtual_machine) that executes compiled Lua code, handle resource management, and provides and [API](https://en.wikipedia.org/wiki/Application_programming_interface) for the integration of library modules.
**ESP** | [Espressif](https://www.espressif.com/) Systems Processor. A range of [SoC](https://en.wikipedia.org/wiki/System_on_a_chip) devices based on an [Xtensa core](https://en.wikipedia.org/wiki/Tensilica#Xtensa_configurable_cores). NodeMCU currently offers Lua firmware builds for its ESP8266 and ESP32 families.
**RAM** | [Random Access Memory](https://en.wikipedia.org/wiki/Random-access_memory) that can be directly read or written by the [CPU](https://en.wikipedia.org/wiki/Central_processing_unit) and can be used to hold both data and machine code.
**ROM** | [Read Only Memory](https://en.wikipedia.org/wiki/Read-only_memory), though for the purposes of this paper, this also includes a region of [Flash memory](https://en.wikipedia.org/wiki/Flash_memory).
**GC** | [Garbage Collection](https://www.lua.org/manual/5.3/manual.html#2.5) refers to the automatic management resources to collect dead objects and make any associated memory available for reuse.
## Background
Lua was originally designed as a general purpose embedded extension language for use in applications run on a conventional computer such as a PC, where the processor is mounted on a motherboard together with multiple Gb of RAM and a lot of other chips providing CPU and I/O support to connect to other devices.
An ESP8266 module is on a very different scale: it costs a few dollars; it is postage stamp-sized and only mounts two main components, an ESP SoC and a flash memory chip. The SoC includes on-chip RAM and also provides hardware support to map part of the external flash memory into a ROM addressable region so that the [firmware](https://en.wikipedia.org/wiki/Firmware) code can be executed via an [L1 cache](https://en.wikipedia.org/wiki/CPU_cache) out of this memory.
ESP SoCs also adopt a type of [modified Harvard architecture](https://en.wikipedia.org/wiki/Modified_Harvard_architecture) found on many IoT devices where separate address regions are used for RAM code and RAM data and ROM code. ESPs also allow data constants to be loaded from code memory. The ESP8266 has 96 Kb of data RAM, but half of this is used by the operating system, for stack and for device drivers such as for WiFi support; typically **44 Kb** RAM is available as heap space for embedded applications. By contrast, the mapped flash ROM region can be up to **960 Kb**, that is over twenty times larger. Even though flash ROM is read-only for normal execution, there is also a "back-door" file-like API for erasing flash pages and overwriting them (though some care has to be taken to ensure cache integrity after update).
Lua's design goals of speed, portability, small kernel size, extensibility and ease-of-use make it a good choice for embedded use on an IoT platform, but with one major limitation: the standard Lua RTS assumes that both Lua data _and_ code are stored in RAM, and this is a material constraint on a device with perhaps 44Kb free RAM and 512Kb free program ROM.
The LFS feature modifies the Lua RTS to support a modified Harvard architecture by allowing the Lua code and its associated constant data to be executed directly out of flash ROM (just as the NoceMCU firmware is itself executed). This enables NodeMCU Lua developers to create Lua applications with a region of flash ROM allocated to Lua code and read-only constants. The entire RAM heap is then available for Lua read-write variables and data for applications where all Lua is executed from LFS.
The ESP architecture provides very restricted write operations to flash memory: any updates use [NAND flash rules](https://en.wikipedia.org/wiki/Flash_memory#Writing_and_erasing), so any updates typically first require bulk erasing of complete 4Kb memory pages. This makes it impractical to modify Lua code pages on the fly safely, and hence LFS uses a "reflash and restart" paradigm for reloading the LFS region.
If you are just interested in learning how to quickly get started with LFS, then please read the chapters referenced in the [Getting Started](getting-started.md) overview. The remainder of this paper is for those who want to understand a little of how this magic happens, and gives more details on the technical issues that were addressed in order to implement this feature in the following sections:
- [Configuring LFS](#configuring-lfs)
- [An Overview of LFS Internals](#an-overview-of-lfs-internals)
- [Programming Techniques](#programming-techniques)
- [Compiling and Loading LFS Images](#compiling-and-loading-lfs-images)
## Configuring LFS
The LFS region can be allocated during firmware build:
- For Lua developers that prefer the convenience of our [Cloud Build Service](https://nodemcu-build.com/), the menu dialogue offers a range of drop down options to select the size of LFS region required.
- Some advanced developers might want to use [Docker](https://www.docker.com/) or their own build environment as per our [Building the firmware](build.md) documentation, and in this case the file [`app/include/user_config.h`](../app/include/user_config.h) is used and includes inline documentation on how to select the configuration options to make an LFS firmware build.
## An Overview of LFS Internals
### LFS Internal Structure
Whilst memory capacity isn't a material constraint on most conventional machines, the Lua RTS still includes some features to minimise overall memory usage. In particular:
- The more resource intensive data types are know as _collectable objects_, and the RTS includes a GC which regularly scans these collectable resources to determine which are no longer in use, so that their associated memory can be reclaimed and reused.
- The Lua RTS also treats strings and compiled function code as collectable objects, so that these can also be GCed when no longer referenced
The compiled Lua code which is executed by the RTS internally comprises one or more function prototypes (which use a `Proto` structure type) plus their associated vectors (constants, instructions and meta data for debug). Most of these compiled constant types are basic (e.g. numbers); strings are the only collectable constant data type. Other collectable types such as arrays are actually created at runtime by executing Lua compiled code to build each resource dynamically.
Overlay techniques can be used ensure that active functions are loaded into RAM on a just-in time basis and thus mitigate RAM limitations. Moving Lua "program" resources into flash ROM typically at least doubles the effective RAM available, and typically removes any need to complicate applications code by implementing overlaying.
When any Lua file is loaded (without LFS) into an ESP application, the RTS loads the corresponding compiled version into RAM. Each compiled function has an associated own Proto structure hierarchy, but this hierarchy is not exposed directly to the running application; instead the compiler generates `CLOSURE` instruction which is executed at runtime to bind the Proto to a Lua function value thus creating a [closure](https://en.wikipedia.org/wiki/Closure_(computer_programming)). Since this occurs at runtime, any Proto can be bound to multiple closures. A Lua closure can also have multiple [Upvalues](https://www.lua.org/manual/5.3/manual.html#3.5) bound to it, and so function value is an updatable object in that it is referring to something that can contain writeable state, even though the Proto hierarchy itself is intrinsically read-only.
As read-only resources that are store in flash ROM clearly cannot be collected by the GC, such ROM based resources cannot reference any volatile RAM-based data elements, though the converse can apply: updatable resources in RAM can reference read-only ones in ROM.
All strings are stored internally in Lua with a header structure known as a `TString`, In Lua 5.1, all TStrings were [interned](https://en.wikipedia.org/wiki/String_interning), so that only one copy of any string is kept in memory, and most string manipulation uses the address of this single copy as a unique reference. Lua 5.3 divides strings into **Short** and **Long** subtypes with short strings being handled in the same way as Lua 5.1. In contrast, long strings are created and copied by reference, but are _not_ guaranteed to be stored uniquely. Guaranteeing uniqueness requires the string to be hashed and this can have a large runtime overhead for long strings, yet identical long strings are rarely generated by other than by copy-reference; hence the general runtime savings for not hashing long strings exceed the small chance of storage duplication.
All of this complexity is hidden from running Lua applications, but does impact our LFS implementation. The Lua global state links to two (short) string tables: `ROstrt` in LFS for short strings stored in LFS; and `strt` for short strings in RAM. Maintaining integrity across these two string tables at runtime is simple and low-cost, with LFS resolution process extended across both the RAM and ROM string tables. Hence any strings already in the ROM string table can be reused and this avoids the need to add an additional entry in the RAM table. This significantly reduces the size of the RAM string table, and also removes a lot of strings from the LCG scanning.
Lua GC of both types is essentially the same (and skipped for LFS strings) except that collection of long strings does not need to update the `strt`.
LFS internal layouts are different for our Lua 5.1 and Lua 5.3 version execution environments, with the Lua 5.3 version:
- Reflecting the changes to the Lua core that support the new Lua 5.2 and 5.3 language features (such the two string subtypes);
- Facilitating the additions of a second LFS region to allow separate System and Application LFS regions; this will also add a second `RO2strt`;
- Facilitates the on-ESP building of LFS regions at runtime, thus optionally removing the need for host-based code compilation;
- Improving the scalability of LFS function resolution.
Any Lua file compiled into the LFS image includes its main function prototype and all the child resources that are linked in its `Proto` structure, so all of these resources are compiled into the LFS image with this entire hierarchy self-consistently within the flash memory:
- A TString reference to the name of the function
- The vector of Lua VM instructions used to execute the function codes
- (optional) A packed map of instruction offset -> line number used for error tracebacks
- A vector of constants used by the function
- (optional) metadata on local variables
- metadata on local variables (some fields optional)
- a vector of references to other Protos compiled within the current one.
The optional fields are include or not depending on the [stripdebug](modules/node.md#nodestripdebug) setting at the time of compilation.
Lua 5.1 LFS images include a (hidden) Lua index function which has an `if n == "moduleN" then return moduleN end` chain to resolve functions within the LFS. This as been replaced by a [ROTable](nodemcu-lrm.md#values-and-types) in Lua 5.3 significantly reducing lookup times for larger LFS images.
Lua uses a binary tokenised format for dumping compiled Lua code into a file based "compiled" format. Unlike the loading of Lua source which involves on-demand compilation at runtime, "undumping" compiled Lua code is the reverse operation to "dump" and is 5-10× faster.
With Lua 5.3 LFS image file formats are a minor extension to the dump format and loading the image file into an LFS is a variant of the "undump" process (which shares the same internal functions).
### Impact on the Lua Garbage Collector
The GC applies to what the Lua VM classifies as collectable objects (strings, tables, functions, userdata, threads -- known collectively as `GCObjects`). A simple two "colour" GC was used in previous Lua versions, but Lua 5.1 introduced the Dijkstra's 3-colour (*white*, *grey*, *black*) variant that enabled the GC to operate in an incremental mode. This permits smaller GC steps interspersed by pauses, and this is very useful for larger scale Lua implementations. Even though this incremental mode is less useful for small RAM IoT devices, NodeMCU retains this standard Lua implementation.
In fact, two *white* flavours are used to support incremental working (so this 3-colour algorithm really uses 4). All newly allocated collectable objects are marked as the current *white*, and a link in `GCObject` header enables scanning through all such Lua objects. Collectable objects can be referenced directly or indirectly via one of the Lua application's *roots*: .
The standard GC algorithm is quite complex and assumes that all GCObjects are in RAM and updatable. It operates in two broad phases: **mark** and **sweep** where the mark phase does a recursive walk starting at the GC _roots_ (the global environment, the Lua registry and the stack) and marks the collectable objects in use. This process is complicated by the fact that the collector is incremental, that is its processing can be broken into packets of collection that can be interleaved within normal memory allocation actions. Once a mark phase has been completed, the sweep phase chains down the linked list of GCobjects to detect unmarked objects and reclaim them. Because LFS ROM GCObjects are treated as immutable, GC processing can still maintain overall object integrity, with the LFS modifications preventing the marking updates to ROM GCObjects (since any attempt to update any ROM-base structure during GC will result in the firmware crashing with a memory exception).
There are all sorts of nuances needed because of the incremental nature and to allow Lua finalizers to take an active role in collection, and these finalizers can themselves trigger allocation actions. Not for the faint hearted. However, ROM GCOjects can still be handled robustly within this scheme because:
- Whilst a RAM object can refer to a ROM object, the converse in _not_true: a ROM object can never refer to a RAM one. Hence the recursive mark phase can safe abort its recursive walk at any node when a ROM object is detected.
- ROM objects are in a separate linked list to that used by the sweep process on RAM objects and so are never swept.
Lastly note that the cost of GC is directly related to the size of the GC sweep lists. Therefore moving resources into LFS ROM removes them from the GC scope, and therefore reduces GC runtime accordingly.
### Programming Techniques
A number of developers have commented that moving applications into LFS simplifies coding style, as you need to be far less concerned about dynamic loading of code just-in-time to mitigate RAM use, and source modules can be larger if that suits application structure. This makes it easier to adopt a clearer coding style, so ESP Lua code can now looks more similar to host-based Lua development. Also Lua code can still be loaded from SPIFFS, so you still have the option to keep test code in SPIFFS, and only move modules into LFS once they are stable.
#### Accessing LFS functions and loading LFS modules
The main interface to LFS is encapsulated in the `node.LFS` table.
See [`lua_examples/lfs/_init.lua`](../lua_examples/lfs/_init.lua) for the code that I use in my `_init` module to do create a simple access API for LFS. There are two parts to this.
The first sets up a table in the global variable `LFS` with the `__index` and `__newindex` metamethods. The main purpose of the `__index()` is to resolve any names against the LFS using a `node.flashindex()` call, so that `LFS.someFunc(params)` does exactly what you would expect it to do: this will call `someFunc` with the specified parameters, if it exists in in the LFS. The LFS properties `_time`, `_config` and `_list` can be used to access the other LFS metadata that you need. See the code to understand what they do, but `LFS._list` is the array of all module names in the LFS. The `__newindex` method makes `LFS` readonly.
The second part uses standard Lua functionality to add the LFS to the require [package.loaders](http://pgl.yoyo.org/luai/i/package.loaders) list. (Read the link if you want more detail). There are four standard loaders, which the require loader searches in turn. NodeMCU only uses the second of these (the Lua loader from the file system), and since loaders 1,3 and 4 aren't used, we can simply replace the 1st or the 3rd by code to use `node.flashindex()` to return the LFS module. The supplied `_init` puts the LFS loader at entry 3, so if the module is in both SPIFFS and LFS, then the SPIFFS version will be loaded. One result of this has burnt me during development: if there is an out of date version in SPIFFS, then it will still get loaded instead of the one if LFS.
If you want to swap this search order so that the LFS is searched first, then SET `package.loaders[1] = loader_flash` in your `_init` code. If you need to swap the search order temporarily for development or debugging, then do this after you've run the `_init` code:
```Lua
do local pl = package.loaders; pl[1],pl[3] = pl[3],pl[1]; end
```
#### Moving common string constants into LFS
LFS is mainly used to store compiled modules, but it also includes its own string table and any strings loaded into this can be used in your Lua application without taking any space in RAM. Hence, you might also want to preload any other frequently used strings into LFS as this will both save RAM use and reduced the Lua Garbage Collector (**GC**) overheads.
The new debug function `debug.getstrings()` can help you determine what strings are worth adding to LFS. It takes an optional string argument `'RAM'` (the default) or `'ROM'`, and returns a list of the strings in the corresponding table. So the following example can be used to get a listing of the strings in RAM.
```Lua
do
local a=debug.getstrings'RAM'
for i =1, #a do a[i] = ('%q'):format(a[i]) end
print ('local preload='..table.concat(a,','))
end
```
You can do this at the interactive prompt or call it as a debug function during a running application in order to generate this string list, (but note that calling this still creates the overhead of an array in RAM, so you do need to have enough "head room" to do the call).
You can then create a file, say `LFS_dummy_strings.lua`, and insert these `local preload` lines into it. By including this file in your `luac.cross` compile, then the cross compiler will also include all strings referenced in this dummy module in the generated ROM string table. Note that you don''t need to call this module; it's inclusion in the LFS build is enough to add the strings to the ROM table. Once in the ROM table, then you can use them subsequently in your application without incurring any RAM or GC overhead.
A useful starting point may be found in [lua_examples/lfs/dummy_strings.lua](../lua_examples/lfs/dummy_strings.lua); this saves about 4Kb of RAM by moving a lot of common compiler and Lua VM strings into ROM.
Another good use of this technique is when you have resources such as CSS, HTML and JS fragments that you want to output over the internet. Instead of having lots of small resource files, you can just use string assignments in an LFS module and this will keep these constants in LFS instead.
### Choosing your development life-cycle
The build environment for generating the firmware images is Linux-based, but you can still develop NodeMCU applications on pretty much any platform including Windows and MacOS, as you can use our cloud build service to generate these images. Unfortunately LFS images must be built off-ESP on a host platform, so you must be able to run the `luac.cross` cross compiler on your development machine to build LFS images.
- For Windows 10 developers, one method of achieving this is to install the [Windows Subsystem for Linux](https://en.wikipedia.org/wiki/Windows_Subsystem_for_Linux). The default installation uses the GNU `bash` shell and includes the core GNU utilities. WSL extends the NT kernel to support the direct execution of Linux ELF images, and it can directly run the `luac.cross` and `spiffsimg` that are build as part of the firmware. You will also need the `esptool.py` tool but `python.org` already provides Python releases for Windows. Of course all Windows developers can use the Cygwin environment as this runs on all Windows versions and it also takes up less than ½Gb HDD (WSL takes up around 5Gb).
- Linux users can just use these tools natively. Windows users can also to do this in a linux VM or use our standard Docker image. Another alternative is to get yourself a Raspberry Pi or equivalent SBC and use a package like [DietPi](http://www.dietpi.com/) which makes it easy to install the OS, a Webserver and Samba and make the RPi look like a NAS to your PC. It is also straightforward to write a script to automatically recompile a Samba folder after updates and to make the LFS image available on the webservice so that your ESP modules can update themselves OTA using the new `HTTP_OTA.lua` example.
- In principle, only the environment component needed to support application development is `luac.cross`, built by the `app/lua/lua_cross` make. (Some developers might also use the `spiffsimg` exectable, made in the `tools/spifsimg` subdirectory). Both of these components use the host toolchain (that is the compiler and associated utilities), rather than the Xtensa cross-compiler toolchain, so it is therefore straightforward to make under any environment which provides POSIX runtime support, including WSL, MacOS and Cygwin.
Most Lua developers seem to start with the [ESPlorer](https://github.com/4refr0nt/ESPlorer) tool, a 'simple to use' IDE that enables beginning Lua developers to get started. ESPlorer can be slow cumbersome for larger ESP application, and it requires a direct UART connection. So many experienced Lua developers switch to a rapid development cycle where they use a development machine to maintain your master Lua source. Going this route will allow you use your favourite program editor and source control, with one of various techniques for compiling the lua on-host and downloading the compiled code to the ESP:
- If you use a fixed SPIFFS image (I find 128Kb is enough for most of my applications) and are developing on a UART-attached ESP module, then you can also recompile any LC files and LFS image, then rebuild a SPIFFS file system image before loading it onto the ESP using `esptool.py`; if you script this you will find that this cycle takes less than a minute. You can either embed the LFS.img in the SPIFFS. You can also use the `luac.cross -a` option to build an absolute address format image that you can directly flash into the LFS region within the firmware.
- If you only need to update the Lua components, then you can work over-the-air (OTA). For example see my
[HTTP_OTA.lua](../lua_examples/lfs/HTTP_OTA.lua), which pulls a new LFS image from a webservice and reloads it into the LFS region. This only takes seconds, so I often use this in preference to UART-attached loading.
- Another option would be to include the FTP and Telnet modules in the base LFS image and to use telnet and FTP to update your system. (Given that a 64Kb LFS can store thousands of lines of Lua, doing this isn't much of an issue.)
My current practice is to use a small bootstrap `init.lua` file in SPIFFS to connect to WiFi, and also load the `_init` module from LFS to do all of the actual application initialisation. There is a few sec delay whilst connecting to the Wifi, and this delay also acts as a "just in case" when I am developing, as it is enough to allow me to paste a `file.remove('init.lua')` into the UART if my test application is stuck into a panic loop, or set up a different development path for debugging.
Under rare circumstances, for example a power fail during the flashing process, the flash can be left in a part-written state following a `flashreload()`. The Lua RTS start-up sequence will detect this and take the failsafe option of resetting the LFS to empty, and if this happens then the LFS `_init` function will be unavailable. Your `init.lua` should therefore not assume that the LFS contains any modules (such as `_init`), and should contain logic to detect if LFS reset has occurred and if necessary reload the LFS again. Calling `node.flashindex("_init")()` directly will result in a panic loop in these circumstances. Therefore first check that `node.flashindex("_init")` returns a function or protect the call, `pcall(node.flashindex("_init"))`, and decode the error status to validate that initialisation was successful.
No doubt some standard usecase / templates will be developed by the community over the next six months.
A LFS image can be loaded in the LFS store by either during provisioning of the initial firmware image or programmatically at runtime as discussed further in [Compiling and Loading LFS Images](#compiling-and-loading-lfs-images) below.
one of two mechanisms:
- The image can be build on the host and then copied into SPIFFS. Calling the `node.flashreload()` API with this filename will load the image, and then schedule a restart to leave the ESP in normal application mode, but with an updated flash block. This sequence is essentially atomic. Once called, and the format of the LFS image has been valiated, then the only exit is the reboot.
- The second option is to build the LFS image using the `-a` option to base it at the correct absolute address of the LFS store for a given firmware image. The LFS can then be flashed to the ESP along with the firmware image.
The LFS store is a fixed size for any given firmware build (configurable by the application developer through `user_config.h`) and is at a build-specific base address within the `ICACHE_FLASH` address space. This is used to store the ROM string table and the set of `Proto` hierarchies corresponding to a list of Lua files in the loaded image.
A separate `node.flashindex()` function creates a new Lua closure based on a module loaded into LFS and more specfically its flash-based prototype; whilst this access function is not transparent at a coding level, this is no different functionally than already having to handle `lua` and `lc` files and the existing range of load functions (`load`,`loadfile`, `loadstring`). Either way, creating a closure on flash-based prototype is _fast_ in terms of runtime. (It is basically a single instruction rather than a compile, and it has minimal RAM impact.)
### General comments
- **Reboot implementation**. Whilst the application initiated LFS reload might seem an overhead, it typically only adds a few seconds per reboot.
- **Typical Usecase**. The rebuilding of a store is an occasional step in the development cycle. (Say up to 10-20 times a day in a typical intensive development process). Modules and source files under development can also be executed from SPIFFS in `.lua` format. The developer is free to reorder the `package.loaders` and load any SPIFFS files in preference to Flash ones. And if stable code is moved into Flash, then there is little to be gained in storing development Lua code in SPIFFS in `lc` compiled format.
- **Flash caching coherency**. The ESP chipset employs hardware enabled caching of the `ICACHE_FLASH` address space, and writing to the flash does not flush this cache. However, in this restart model, the CPU is always restarted before any updates are read programmatically, so this (lack of) coherence isn't an issue.
- **Failsafe reversion**. Since the entire image is precompiled and validated before loading into LFS, the chances of failure during reload are small. The loader uses the Flash NAND rules to write the flash header flag in two parts: one at start of the load and again at the end. If on reboot, the flag in on incostent state, then the LFS is cleared and disabled until the next reload.

261
docs/lua53.md Normal file
View File

@ -0,0 +1,261 @@
## Background and Objectives
The NodeMCU firmware was historically based on the Lua 5.1.4 core with the `eLua` patch and other NodeMCU specific enhancements and optimisations ("**Lua51**"). This paper discusses the rebaselining of NodeMCU to the latest production Lua version 5.3.5 ("**Lua53**"). Our goals in this upgrade were:
- NodeMCU now offers a current Lua version, 5.3.5 which is as functionally complete as practical.
- Lua53 adopts a minimum change strategy against the standard Lua source code base, that is changes to the VM and runtime system will only be made where there is a compelling reasons for any change, for example Lua53 preserves some valuable NodeMCU enhancements previously introduced to Lua51, for example the addition of Lua VM support for constant Lua program and data being executable directly from Flash ROM in order to free up RAM for application use to mitigate the RAM limitations of ESP-class IoT devices
- NodeMCU provides a clear and stable migration path for both existing hardware libraries and ESP Lua applications being migrated from Lua51 to Lua53.
- The Lua53 implementation provides a common code base for ESP8266 and ESP32 architectures. (The Lua51 implementation was forked with variant code bases for the two architectures.)
## Specific Design Decisions
- The NodeMCU C module API is built on the standard Lua C API that is common across the Lua51 and Lua53 build environments but with limited changes needs to reflect our IoT changes. Note that standard Lua 5.3 introduced some core functional and C API changes v.v 5.1; however, the use the standard Lua 5.3 compatibility modes largely hides these changes, though modules can make use of the `LUA_VERSION_NUM` define should version-specific code variants be needed.
- The historic NodeMCU C module API (following `eLua` precedent and) added some extensions that somewhat compromised the orthogonal design principles of the standard Lua API; these are that modules should only access the Lua runtime via the `lua_XXX` macros and and calls exported through the `lua.h` header (or the wrapped helper `luaL_XXXX` versions exported through the `lauxlib.h` header). Such inconsistencies will be removed from the existing NodeMCU API and modules, so that all modules can be compiled and executed within either the Lua51 or the Lua53 environment.
- Lua publishes a [Lua Reference Manual](https://www.lua.org/manual/5.3/)(**LRM**) for the Language specification, core libraries and C APIs for each Lua version. The Lua53 implementation includes a supplemental [NodeMCU Reference Manual](nodemcu-lrm.md)(**NRM**) to document the NodeMCU extensions to the core libraries and C APIs. As this API is unified across Lua51 and Lua53, this also provides a common reference that can also be used for developing both Lua51 and Lua53 modules.
- The two Lua code bases are maintained within a common Git branch (in parallel NodeMCU sub-directories `app/lua` and `app/lua53`). An optional make parameter `LUA=53` selects a Lua53 build based on `app/lua53`, thus generating a Lua 5.3 firmware image. (At a later stage once Lua53 is proven and stable, we will swap the default to Lua53 and move the Lua51 tree into frozen support.)
- Many of the important features of the `eLua` changes to Lua51 used by NodeMCU have now been incorporated into core Lua53 and can continue to be used 'out of the box'. Other NodeMCU **LFS**, **ROTable** and **LCD** functionality has been rewritten for NodeMCU, so the Lua53 code base no longer uses the `eLua` patch.
- Lua53 will ultimately support three build targets that correspond to the ESP8266, ESP32 and native host targets using a _common Lua53 source directory_. The ESP build targets generate a firmware image for the corresponding ESP chip class, and the host target generates a host based `luac.cross` executable. This last can either be built standalone or as a sub-make of the ESP builds.
- The Lua53 host build `luac.cross` executable will continue to extend the standard functionality by adding support for LFS image compilation and include a Lua runtime execution environment that can be invoked with the `-e` option. An optional make target also adds the [Lua Test Suite](https://www.lua.org/tests/) to this environment to enable use of this test suite.
- Lua 5.3 introduces the concept of **subtypes** which are used for Numbers and Functions, and Lua53 follows this model adding an additional ROTables subtype for Tables.
- **Lua numbers** have separate **Integer** and **Floating point** subtypes. There is therefore no advantage in having separate Integer and Floating point build variants. Lua53 therefore ignores the `LUA_NUMBER_INTEGRAL` build option. However it will provide option to use 32 or 64 bit numeric values with floating point numbers being stored as single or double precision respectively as well as the current hybrid where integers are 32-bit and `double` for floating point. Hence 32-bit integer only applications will have similar memory use and runtime performance as existing Lua51 Integer builds.
- **Lua tables** have separate **Table** and **ROTable** subtypes. The Lua53 implementation of this subtyping has been backported to Lua51 where these were previously separate types, so that the table access API is the same for both Lua51 and Lua53.
- **Lua Functions** have separate **Lua**, **C** and **lightweight C** subtypes, with this last being a special case where the C function has no upvals and so it doesn't need an associated `Closure` structure. It is also the same TValue type as our lightweight functions.
- Lua 5.3 also introduces another small number of other but significant Lua language and core library / API changes. Our Lua53 implementation does not limit these, though we have enabled the appropriate compatibility modes to limit the impact at a Lua developer level. These are discussed in further detail in the following compatibility sections.
- Many standard OS services aren't available on a embedded IoT device, so Lua53 follows the Lua51 precedent by omitting the `os` library for target builds as the relevant functionally is largely replaced by the `node` library.
- Flash based code execution can incur runtime performance impacts if not mitigated. Some small code changes were required as the current GCC toolchain for the Xtensa processors doesn't handle all of these flash access mitigations during code generation. For example, the ESP CPU only supports aligned word access to flash memory, so 'hot' byte constant accesses have been encapsulated in an inline assembler macro to remove non-aligned exceptions at a cost of one extra Xtensa instruction per access; the remaining byte accesses to flash use a software exception handler.
- NodeMCU employs a single threaded event loop model (somewhat akin to Node.js), and this is supported by some task extensions to the C API that facilitate use of a callback mechanism.
## Detailed Implementation Notes
We follow two broad principles (1) everything other than the Lua source directory is common, and (2) only change the Lua core when there is a compelling reason. The following sub-sections describe these issues / changes in detail.
### Build variants and includes
Lua53 supports three build targets which correspond to:
- The ESP8266 (and derivative ESP8385) architectures use the Espressif non-OS SDK and its GCC Xtensa tool-chain. The macros `LUA_USE_ESP8266` and `LUA_USE_ESP` are defined for this target.
- The ESP32 architecture using the Espressif IDF and its GCC Xtensa toolchain. The macros `LUA_USE_ESP32` and `LUA_USE_ESP` are defined for this target.
- A host architecture using the standard host C toolchain. The macro `LUA_USE_HOST` is defined for this target. We currently support any POSIX environment that supports the GCC toolchain and Windows builds using native MSVC, WSL, Cygwin and MinGW.
`LUA_USE_HOST` and `LUA_USE_ESP` are in effect mutually exclusive: `LUA_USE_ESP` is defined for a target firmware build, and `LUA_USE_HOST` for a build of the host-based `luac.cross` executable for the host environment. Note that `LUA_USE_HOST` also defines `LUA_CROSS_COMPILER` as used in Lua51.
Our Lua51 source has been migrated to use `newlib` conformant headers and C runtime calls. An example of this is that the standard SDK headers supplied by Espressif include `"c_string.h"` and use `c_strcmp()` as the string comparison function. NodeMCU source files use `<string.h>` and `strcmp()`.
_**Caution**: The NodeMCU source is _compliant_ rather than fully _conformant_ with the standard headers such as `<string.h>`; that is the current subset of these APIs used by the code will successfully compile, link and execute as an image, but code additions which attempt to use extra functions defined in the APIs might not; this at least minimises the need to change standard source code to compile and run on the ESP8266._
As with Lua51, Lua53 heavily customises the `linit.c`, `lua.c` and `luac.c` files because of the demands of an embedded runtime environment. Given the amount of change, these files are stripped of the functionally dead code.
### TString types and implementation
The Lua 5.3 version includes a significant modification to the treatment of strings, by dividing them into two separate subtypes based on the string length (at `LUAI_MAXSHORTLEN`, 40 in the current implementation). This decision reflects two empirical observations based on a broad range of practical Lua applications: the longer the string, the less likely the application is to recreate it independently; and the cost of ensuring uniqueness increases linearly with the length of the string. Hence Lua53 now treats the string type differently:
- **Short Strings** are stored uniquely using the `strt` and `ROstrt` string table as discussed below. Two short TStrings are identical if and only if their addresses are the same.
- **Long Strings** are created and copied by reference, but are not guaranteed to be stored uniquely.
Since short strings are stored uniquely, identity comparison is based comparing their TString address. For long TStrings identify comparison is a little more complex:
- They are identical if their addresses are the same
- They are different if their lengths are different.
- Failing these short circuits, a full `memcmp()` must be carried out.
Lua GC of both types is essentially the same, except that collection of long strings does not need to update the `strt`.
Note that for real applications, identical long strings are rarely generated by other than by copy-reference and hence in general the runtime savings benefits exceed the small chance of storage duplication. Also note that this and other sub-typing is hidden at the Lua C API level and is handled privately inside the Lua VM implementation.
Whilst running Lua applications make heavy use of TStrings, the Lua VM itself makes little use of TStrings and typically pushes any string literals as CStrings. The Lua53 VM introduced a key cache to avoid the runtime cost of hashing string and doing the `strt` lookup for this type of CString constant. The NodeMCU implementation shares this key cache with ROTable field resolution.
Short strings in the standard Lua VM are bound at runtime into the the RAM-based `strt` string table. The LFS implementation adds a second LFS-based readonly `ROstrt` string table that is created during LFS image load and then referenced on subsequent CPU restarts. The Lua VM and C API resolves each new short string first against the `strt`, then the `ROstrt` string table, before adding any unresolved strings into the `strt`. Hence short strings are interned across these two `strt` and `ROstrt` string tables. Any runtime reference to strings already in the LFS `ROstrt` therefore do not create additional entries in RAM. So applications are free include dummy resource functions (such as `dummy_strings.lua` in `lua_examples/lfs`) to preload additional strings into `ROstrt` and avoid needing RAM for these string constants. Such dummy functions don't need to called; simply inclusion in the LFS build is sufficient.
An LFS section below discusses further implementation details.
### ROTables
The ROTables concept was introduced in `eLua`, with the ROTable format designed to be compiled by being declarable with C source and so statically included in the firmware at built-time, rather taking up RAM. This essential functionality has been preserved across both Lua51 and Lua53. At an API level ROTables are handled as a table subtype within the Lua VM except that:
- ROTables are declared statically in C code.
- Only a subset of key and value types is supported for ROTables.
- Attempting to write to a ROTable field will raise an error.
- The C API provides a method to push a ROTable reference direct to the Lua stack, but other than this, the Lua API to read ROTables and Tables is the same.
We have now completely replaced the `eLua` implementation for Lua53, and this implementation has been back-ported to Lua51. Tables are now declared using `LROT` macros with the `LROT_END()` macro also generating a `ROTable` structure, which is a variant of the standard `Table` header and linking to the `luaR_entry` vector declared using the various `LROT_XXXXENTRY()` macros. This has new implementation has some major advantages:
- ROTables are a separate subtype of Table, and so only minor code changes are needed within `ltable.c` to implement this, with the implementation now effectively hidden from the rest of the runtime and any library modules; this has enabled the removal of most of the ROTable code patches introduced by `eLua` into Lua51.
- The `luaR_entry` vector is a linear list, so (unlike a standard RAM Table) any ROTable has no associate hash table for fast key lookup. However we have introduced a unified ROTable key cache to provide direct access into ROTable entries with a typical hit rate over 99% (key cache misses still require a linear key scan), and so average ROTable access is perhaps 30% slower than RAM Table access, unlike the `eLua` implementation which was 30-50 times slower.
- The `ROTable` structure variants are not GC collectable and so one of its fields is tagged to allow the Lua GC to short-circuit GC sweeps across such RO nodes. The `ROTable` structure variant also drops unused fields to save space, and again this is handled internally within `ltable.c`.
- As all tables have a header record that includes a valid `flags` field, the `fasttm()` optimisations for common metamethods now work for both `ROTables` and `Tables`.
The same Lua51 ROTable functionality and limitations also apply to Lua53 in order to minimise migration impact for C module libraries:
- ROTables can only have string keys and a limited set of Lua value types (Numeric, Light CFunc, Light UserData, ROTable, string and Nil). In Lua 5.3 `Integer` and `Float` are now separate numeric subtypes, so `LROT_INTENTRY()` takes an integer value. The new `LROT_FLOATENTRY()` is used for a non-integer values. This isn't a migration issue as none of the current NodeMCU modules use floating point constants in ROTables declared in them. (There is the only one currently used and that is `math.PI`.)
- For 5.1 builds, `LROT_FLOATENTRY()` is a synonym of `LROT_NUMENTRY()`.
- For 5.3 builds, `LROT_NUMENTRY()` is a synonym of `LROT_INTENTRY()`.
- Some ordering limitations apply: `luaR_entry` vectors can be unordered _except for any metafields_: Any entry with a key name starting in "_" _must_ must be ordered and placed at the start of the vector.
- The `LROT_BEGIN()` and `LROT_END()` take the same three parameters. (These are ignored in the case of the `LROT_BEGIN()` macro, but by convention these are the same to facilitate the begin / end pairing).
- The first field is the table name.
- The second field is used to reference the ROTable's metatable (or `NULL` if it doesn't have one).
- The third field is 0 unless the table is a metatable, in which case it is a bit mask used to define the `fasttm()` `flags` field. This must match any metafield entries for metafield lookup to work correctly.
### Proto Structures
Standard Lua 5.3 contains a new peep hole optimisation relating to closures: the Proto structure now contains one RW field pointing to the last closure created, and the GC adopts a lazy approach to recovering these closures. When a new closure is created, if the old one exists _and the upvals are the same_ then it is reused instead of creating a new one. This allows peephole optimisation of a usecase where a function closure is embedded in a do loop, so the higher cost closure creation is done once rather than `n` times.
This reduces runtime at the cost of RAM overhead. However for RAM limited IoTs this change introduced two major issues: first, LFS relies on Protos being read-only and this RW `cache` field breaks this assumption; second closures can now exist past their lifetime, and this delays their GC. Memory constrained NodeMCU applications rely on the fact that dead closed upvals can be GCed once the closure is complete. This optimisation changes this behaviour. Not good.
Lua53 **_removes_** this optimisation for all prototypes.
### Locale support
Standard Lua 5.3 introduces localisation support. NodeMCU Lua53 disables this because IoT implementation doesn't have the appropriate OS support.
### Memory Optimisations
Some Lua structures have `double` fields which are align(8) by default. There is no reason or performance benefit for doing align(8) on ESPs so `#pragma pack(4)` is used for the value struct.
Lua53 also reimplements the Lua51 LCD (Lua Compact Debug) patch. This replaces the `sizecode` `ìnt` vector giving line info with a packed byte array that is typically 15-30× smaller. See the [LCD whitepaper](lcd.md) for more information on this algorithm.
### Unaligned exception avoidance
By default the GCC compiler emits a `l8ui` instruction to access byte fields on the ESP8266 and ESP32 Xtensa processors. This instruction will generate an unaligned fetch exception when this byte field is in Flash memory (as will accessing short fields). These exceptions are handled by emulating the instruction in software using an unaligned access handler; this allows execution to continue albeit with the runtime cost of handling the exception in software. We wish to avoid the performance hit of executing this handler for such exceptions.
`lobject.h` now defines a `GET_BYTE_FN(name,t,wo,bo)` macro. In the case of host targets this macro generates the normal field access, but in the case of Xtensa targets uses of this macro define an `static inline` access function for each field. These functions at the default `-O2` optimisation level cause the code generator to emit a pair of `l32i.n` + `extui` instructions replacing the single `l8ui` instruction. This has the cost of an extra instruction execution for accessing RAM data, but also removes the 200+ clock overhead of the software exception handler in the case of flash memory accesses.
There are 9 byte fields in the `GCObject`,`TString`, `Proto`, `ROTable` structures that can either be statically compiled as `const struct` into library code space or generated by the Lua cross compiler and loaded into the LFS region; the `GET_BYTE_FN` macro is used to create inline access functions for these fields, and read references of the form `(o)->tt` (for example) have been recoded using the access macro form `gettt(o)`. There are 44 such changed access references in the source which together represent perhaps 99% of potential sources of this software exception within the Lua VM.
The access macro hasn't been used where access is guarded by a conditional that implies the field in a RAM structure and therefore the `l8ui` instruction is executed correctly in hardware. Another exclusion is in modules such as `lcode.c` which are only used in compilation, and where the addition runtime penalty is acceptable.
A wider review of `const char` initialisers and `-S` asm output from the compiler confirms that there are few other cases of character loads of constant data, largely because inline character constants such as '@' are loaded into a register as an immediate parameter to a `movi.n` instruction. Ditto use of short fields.
### Modulus and division operation avoidance
The Lua runtime uses the modulus (`%`) and divide (`/`) operators in a number of computations. This isn't an issue for most uses where the divisor is an integer power of 2 since the **gcc** optimiser substitutes a fast machine code equivalent which typically executes 1-4 inline Xtensa instructions (ditto for many constant multiplies). The compiler will also fold any used in constant expressions to avoid runtime evaluation. However the ESP Xtensa CPU doesn't implement modulus and divide operations in hardware, so these generate a call to a subroutine such as `_udivsi3()` which typically involves 500 instructions or so to evaluate. A couple of frequent uses have been replaced. (I have ensured that such uses are space delimited, so searching for " % " will locate these. `grep -P " (%|/) (?!(2|4|8|16))" app/lua53/*.[hc]` will list them off.)
### Key cache
Standard Lua 5.3 introduced a string key cache for constant CString to TString lookup. In NodeMCU we also introduced a lookaside cache for ROTable fields access, and in practice this provides single probe access for over 99% of key hit accesses to ROTable entries.
In Lua53 these two caching functions (for CString and ROTable key lookup) have been unified into a common key cache to provide both caching functions with the runtime overhead of a single cache table in RAM. Folding these two lookups into a single key cache isn't ideal, but given our limited RAM this allows the cache use to be rebalanced at runtime reflecting the relative use of CString and ROTable key lookups.
### Flash image generation and loading
The current Lua51 `app/lua` implementation has two variants for dumping and loading Lua bytecode: (1) `ldump.c` + `lundump.c`; (2) `lflashimg.c` + `lflash.c`. These have been unified into a single load / unload mechanism in Lua53. However, this mechanism must facilitate sequential loading into flash storage, which is straight forward _if_ with some small changes to the standard internal ordering of the LC file format. The reason for this is that any `Proto` can embed other Proto definitions internally, creating a Proto hierarchy. The standard Lua dump algorithm dumps some Proto header components, then recurses into any sub-Protos before completing the wrapping Proto dump. As a result each Proto's resources get interleaved with those of its subordinate Proto hierarchy. This means that resources get to written to RAM non-serially, which is bad news for writing serially to the LFS region.
The NodeMCU Lua53 dump reorders the proto hierarchy tree walk, so that resources of the lowest protos in the hierarchy are loaded first:
```
dump_proto(p)
foreach subp in p
dump_proto(subp)
dump proto content
end
```
This results in Proto references now being backwards references to Protos that have already been loaded, and this in turn enables the Proto resources to be allocated as a sequential contiguous allocation units, so the same code can be used for loading compiled Lua code into RAM and into LFS.
The standard Lua 5.3 dump format embeds string constants in each proto as a `len`+`byte string` definition. NodeMCU needs to separate the collection of strings into an `ROstrt` for LFS loading, and this requires an extra processing pass either on dump or load. By doing a preliminary Proto scan to collect tracking the strings used then dumping these as a prologue makes the load process on the ESP a single pass and avoids any need for string resolution tables in the ESP's RAM. The extra memory resources needed for this two-pass dump aren't a material issue in a PC environment.
Changes to `lundump.c` facilitate the addition of LFS mode. Writing to flash uses a record oriented write-once API. Once the flash cache has been flushed when updating the LFS region, this data can be directly accesses using the memory-mapped RO flash window, the resources are written directly to Flash without any allocation in RAM.
Both the `dump.c` and `lundump.c` are compiled into both the ESP firmware and the host-based luac cross compiler. Both the host and ESP targets use the same integer and float formats (e.g. 32 bit, 32-bit IEEE) which simplifies loading and unloading. However one complication is that the host `luac.cross` application might be compiled on either a 32 or 64 bit environment and must therefore accommodate either 4 or 8 byte address constants. This is not an issue with the compiled Lua format since this uses the grammatical structure of the file format to derive resource relationships, rather than offsets or pointers.
We also have a requirement to generate binary compatible absolute LFS images for linking into firmware builds. The host mode is tweaked to achieve this. In this case the write buffer function returns the correct absolute ESP address which are 32-bit; this doesn't cause any execution issue in luac since these addressed are never used for access within `luac.cross`. On 64-bit execution environments, it also repacks the Proto and other record formats on copy by discarding the top 32-bits of any address reference.
#### Handling embedded integers in the dump format
A typical dump contains a lot of integer fields, not only for Integer constants, but also for repeat count and lengths. Most of these integers are small, so rather than using a fixed 4-byte field in the file stream all integers are unsigned and represented by a big-endian multi-byte encoding, 7 bits per byte, with the high-bit used as a continuation flag. This means that integers 0..127 encode in 1 byte, 128..32,767 in 2 etc. This multi-byte scheme has minimal overhead but reduces the size of typical `.lc` and `.img` by 10% with minimal extra processing and less than the cost of reading that extra 10% of bytes from the file system.
A separate dump type is used for negative integer constants where the constant `-x` is stored as `-(x+1)`. Note that endianness isn't an issue since the stream is processed byte-wise, but using big-endian simplifies the load algorithm.
#### Handling LFS-based strings
The dump function for a individual Protos hierarchy for loading as an `.lc` file follows the standard convention of embedding strings inline as a `\<len>\<byte sequence>`. Any LFS image contains a dump of all of the strings used in the LFS image Protos as an "all-strings" prologue; the Protos are then dumped into the image with string references using an index into the all-strings header. This approach enables a fast one-pass algorithm for loading the LFS image; it is also a compact encoding strategy as string references typically use 1 or 2 byte integer offset in the image file.
One complication here is that in the standard Lua runtime start-up adds a set of special fixed strings to the `strt` that are also tagged to prevent GC. This could cause problems with the LFS image if any of these constants is used in the code. To remove this conflict the LFS image loader _always_ automatically includes these fixed strings in the `ROstrt`. (This also moves an extra ~2Kb string constants from RAM to Flash as a side-effect.) These fixed strings are omitted from the "all-strings prologue", even though the code itself can still use them. The `llex.c` and `ltm.c` initialisers loop over internal `char *` lists to register these fixed strings. NodeMCU adds a couple of access methods to `llex.c` and `ltm.c` to enable the dump and load functions to process these lists and resolve strings against them.
#### Handling LFS top level functions
Lua functions in standard Lua 5.1 are represented by two variant Closure headers (for C and Lua functions). In the case of Lua functions with upvals, the internal Protos can validly be bound to multiple function instances. `eLua` and Lua 5.3 introduced the concept of lightweight C functions as a separate function subtype that doesn't require a Closure record. Note that a function variable in a Lua exists as a TValue referencing either the C function address or a Closure record; this Closure is not the same as the CallInfo records which are chained to track the current call chain and stack usage.
Whilst lightweight C functions can be declared statically as TValues in ROTables, there isn't a corresponding mechanism for declaring a ROTable containing LFS functions. This is because a Lua function TValue can only be created at runtime by executing a `CLOSURE` opcode within the Lua VM. The Lua51 implementation avoids this issue by generating a top level Lua dispatch function that does the equivalent of emitting `if name == "moduleN" then return moduleN end` for each entry, and this takes 4 Lua opcodes per module entry. This lookup has an O(N) cost which becomes non-trivial as N grows large, and so Lua51 has a somewhat arbitrary limit of 50 for the maximum number modules in a LFS image.
In the Lua53 LFS implementation, the undump loader appends a `ROTable` to the LFS region which contains a set of entries `"module name"= Proto_address`. These table values are store as lightweight userdata pointer and as such are not directly accessible via Lua but the NodeMCU C function that does LFS lookup can still retrieve the required Proto address, execute the `CLOSURE` and return the corresponding TValue. Since this approach uses the standard table access API, which is a lot more efficient than the 4×N opcode `if` chain implementation.
### Garbage collection
Lua51 includes the eLua emergency GC, plus the various EGC tuning parameters that seem to be rarely used. The default setting (which most users use) is `node.egc.ALWAYS` which triggers a full GC before every memory allocation so the VM spends maybe 90% of its time doing full GC sweeps.
Standard Lua 5.3 has adopted the eLua EGC but without the EGC tuning parameters. (I have raised a separate GitHub issue to discuss this.) We extend the EGC with the functional equivalent of the `ON_MEM_LIMIT` setting with a negative parameter, that is only trigger the EGC with less than a preset free heap left. The runtime spends far less time in the GC and code typically runs perhaps 5× faster.
### Panic Handling
Standard Lua includes a throw / catch framework for handling errors. (This has been slightly modified to enable yielding to work across C API calls, but this can be modification can ignored for the discussion of Panic handling.) All calls to Lua execution are handled by `ldo.c` through one of two mechanisms:
- All protected calls are handled via `luaD_rawrunprotected()` which links its C stack frame into the `struct lua_longjmp` chain updating the head pointer at `L->errorJmp`. Any `luaD_throw()` will `longjmp` up to this entry in the C stack, hence as long as there is at least one protected call in the call chain, the C call stack can be properly unrolled to the correct frame.
- If no protected calls are on the Lua call stack, then `L->errorJmp` will be null and there is no established C stack level to unroll to. In this case the `luaD_throw()` will directly call the `at_panic()` handler. Since there is no valid stack frame to unroll to and execution cannot safely continue, so the only safe next step is to abort, which in our case restarts the processor.
Any Lua calls directly initiated through `lua.c` interpreter loop or through `luac.cross` are protected. However NodeMCU applications can also establish C callbacks that are called directly by the SDK / event dispatcher. The current practice is that these invoke their associated Lua CB using an unprotected call and hence the only safe option is to restart the processor on error. With Lua53 we have introduced a new `luaL_pcallx()` call variant as a NodeMCU extension; this is new call is designed to be used within library CBs that execute Lua CB functions, and it is argument compatible with `lua_call()`, except that in the case of caught errors it will also return a negative call status. This function establishes an error handler to protect the called function, and this is invoked on error at the error's stack level to provide a stack trace.
This error handler posts a task to a panic error handler (with the error string as an upval) before returning control to the invoking routine. If the Lua registry entry `onerror` exists and is set to a function, then the handler calls this with the error string as an argument otherwise it calls standard `print` function by default. This function can return `false` in which case the handler exits, otherwise it restarts the processor. The application can use `node.setonerror()` to override the default "always restart" action if wanted (for example to write an error to a logfile or to a network syslog before restarting). Note that `print` returns `nil` and this has the effect of printing the full error traceback before restarting the processor.
Currently all (bar 1) of the cases of such Lua callbacks within the NodeMCU C modules used a simple `lua_call()`, with the result that any runtime error executed a panic on error and reboots the processor. These call have all been replaced by the `luaL_pcallx()` variant, so control is always returned to the C routine, and a later post task report the error and restarts the processor. Note that substituting library uses of `lua_call()` by `luaL_pcallx()` does changes processing paths in the case of thrown errors. If the library CB function immediately returns control to the SDK/event scheduler after the call, then this is the correct behaviour. However, in a few cases, the routine performs post-call clean-up and this adapt the logic depending on the return status.
### The `luac.cross` execution environment
As with Lua51, the Lua53 host-build `luac.cross` executable extends the standard functionality by adding support for LFS image compilation and also includes a Lua runtime execution environment that can be invoked with the `-e` option. This environment was added primarily to facilitate in host testing (albeit with some limitations) of the NodeMCU.
This execution environment also emulates LFS loading and execution using the `-F` option to load an LFS image before running the `-e` script. On POSIX environments this allocates the LFS region using a kernel extension, a page-aligned allocator and it also uses the kernel API to turn off write access to this region except during the simulated write to flash operations. In this way unintended writes to the LFS region throw a H/W exception in a manner parallel to the ESP environment.
### API Compatibility for NodeMCU modules
The Lua public API has largely been preserved across both Lua versions. Having done a difference analysis of the two API and in particular the standard `lua.h` and `lauxlib.h` headers which contain the public API as documented in the LRM 5.1 and 5.3, these differences are grouped into the following categories:
- NodeMCU features that were in Lua51 and have also been added to Lua53 as part of this migration.
- Differences (additions / removals / API changes) that we are not using in our modules and which can therefore be effectively ignored for the purposes of migration.
- Some very nice feature enhancements in Lua53, for example the table access functions are now type `int` instead of type `void` and return the type of the accessed TValue. These features have been back-ported to Lua51 so that modules can be coded using Lua53 best practice and still work in a Lua51 runtime environment.
- Source differences which can be encapsulated through common macros will be removed by updating module code to use this common macro set. In a very small number of cases module functionality will be recoded to employ this common API base.
Both the LRM and PiL make quite clear that the public API for C modules is as documented in `lua.h` and all its definitions start with `lua_`. This API strives for economy and orthogonality. The supplementary functions provided by the auxiliary library (`lauxlib.c`) access Lua services and functions through the `lua.h` interface and without other reference to the internals of Lua; this is exposed through `lauxlib.h` and all its definitions start with `luaL_`. All Lua library modules (such as `lstrlib.c` which implements `string`) conform to this policy and only access the Lua runtime via the `lua.h` and `lauxlib.h` interfaces.
There are significant changes to internal APIs between Lua51 and Lua53 and as exposed in the other "private" headers within the Lua source directory, and so any code using these APIs may fail to work across the two versions.
Both Lua51 and Lua53 have a concept of Lua core files, and these set the `LUA_CORE` define. In order to enforce limited access to the 'private' internal APIs, `#ifdef LUA_CORE` guards have been added to all private Lua headers effectively hiding them from application library and other non-core access.
One thing that this analysis has underline is that the project has been lax in the past about how we allow our modules to be implemented. All existing modules have now been updated to use only the public API. If any new or changed module required any of the 'internal' Lua headers to compile, then it is implemented incorrectly.
### Lua Language and Libary Compatibility for NodeMCU Lua modules
For the immediate future we will be supporting both builds based on both language variants, so Lua module writers either:
- Avoid using Lua 5.3 language features and implement their module in the common subset (this is currently our preferred approach);
- Or explicitly state any language constraints and include a test for `_VERSION=='Lua.5.3'` (or 5.1) in the module startup and explicitly error if incompatible.
### Other Implementation Notes
- **Use of Linker Magic**. Lua51 introduced a set of linker-aware macros to allow NodeMCU C library modules to be marshalled by the GNU linker for firmware builds; Lua53 target builds maintain these to ensuring cross-version compatibility. However, the lua53 `luac.cross` build does all library marshalling in `linit.c` and this removes the need to try to emulate this strategy on the diverse host toolchains that we support for compiling `luac.cross`.
- **Host / ESP interoperability**. Our strategy is to build the firmware for the ESP target and `luac.cross` in the same make process. This requires the host to use a little endian ANSI floating point host architecture such as x68, AMD64 or ARM, so that the LC binary formats are compatible. This ain't a material constraint in practice.
- **Emergency GC.** The Lua VM takes a more aggressive stance than the standard Lua version on triggering a GC sweep on heap exhaustion. This is because we run in a small RAM size environment. This means that any resource allocation within the Lua API can trigger a GC sweep which can call __GC metamethods which in turn can require to stack to be resized.

View File

@ -166,6 +166,18 @@ none
#### Returns
flash ID (number)
## node.flashindex() --deprecated
Deprecated synonym for [`node.LFS.get()`](#nodelfsget) to return an LFS function reference.
Note that this returns `nil` if the function does not exist in LFS.
## node.flashreload() --deprecated
Deprecated synonym for [`node.LFS.reload()`](#nodelfsreload) to reload [LFS (Lua Flash Store)](../lfs.md) with the named flash image provided.
## node.heap()
Returns the current available heap size in bytes. Note that due to fragmentation, actual allocations of this size may not be possible.
@ -230,6 +242,65 @@ sk:on("receive", function(conn, payload) node.input(payload) end)
#### See also
[`node.output()`](#nodeoutput)
## node.LFS
Sub-table containing the API for [Lua Flash Store](../lfs.md)(**LFS**) access. Programmers might prefer to map this to a global or local variable for convenience for example:
```lua
local LFS = node.LFS
```
This table contains the following methods and properties:
Property/Method | Description
-------|---------
`config` | A synonym for [`node.info('lfs')`](#nodeinfo). Returns the properties `lfs_base`, `lfs_mapped`, `lfs_size`, `lfs_used`.
`get()` | See [node.LFS.get()](#nodelfsget).
`list()` | See [node.LFS.list()](#nodelfslist).
`reload()` |See [node.LFS.reload()](#nodelfsreload).
`time` | Returns the Unix timestamp at time of image creation.
## node.LFS.get()
Returns the function reference for a function in LFS.
Note that unused `node.LFS` properties map onto the equialent `get()` call so for example: `node.LFS.mySub1` is a synonym for `node.LFS.get('mySub1')`.
#### Syntax
`node.LFS.get(modulename)`
#### Parameters
`modulename` The name of the module to be loaded.
#### Returns
- If the LFS is loaded and the `modulename` is a string that is the name of a valid module in the LFS, then the function is returned in the same way the `load()` and the other Lua load functions do
- Otherwise `nil` is returned.
## node.LFS.list()
List the modules in LFS.
#### Returns
- If no LFS image IS LOADED then `nil` is returned.
- Otherwise an sorted array of the name of modules in LFS is returned.
## node.LFS.reload()
Reload LFS with the flash image provided. Flash images can be generated on the host machine using the `luac.cross`command.
#### Syntax
`node.LFS.reload(imageName)`
#### Parameters
`imageName` The name of a image file in the filesystem to be loaded into the LFS.
#### Returns
- In the case when the `imagename` is a valid LFS image, this is expanded and loaded into flash, and the ESP is then immediately rebooted, _so control is not returned to the calling Lua application_ in the case of a successful reload.
- The reload process internally makes multiple passes through the LFS image file. The first pass validates the file and header formats and detects many errors. If any is detected then an error string is returned.
## node.key() --deprecated
Defines action to take on button press (on the old devkit 0.9), button connected to GPIO 16.

139
docs/modules/pipe.md Normal file
View File

@ -0,0 +1,139 @@
# pipe Module
| Since | Origin / Contributor | Maintainer | Source |
| :----- | :-------------------- | :---------- | :------ |
| 2019-07-18 | [Terry Ellison](https://github.com/TerryE) | [Terry Ellison](https://github.com/TerryE) | [pipe.c](../../app/modules/pipe.c)|
The pipe module provides a RAM-efficient means of passing character stream of records from one Lua
task to another.
## pipe.create()
Create a pipe.
#### Syntax
`pobj = pipe.create([CB_function],[task_priority])`
#### Parameters
- `CB_function` optional reader callback which is called through the `node.task.post()` when the pipe is written to. If the CB returns a boolean, then the reposting action is forced: it is reposted if true and not if false. If the return is nil or omitted then the deault is to repost if a pipe write has occured since the last call.
- `task_priority` See `ǹode.task.post()`
#### Returns
A pipe resource.
## pobj:read()
Read a record from a pipe object.
Note that the recommended method of reading from a pipe is to use a reader function as described below.
#### Syntax
`pobj:read([size/end_char])`
#### Parameters
- `size/end_char`
- If numeric then a string of `size` length will be returned from the pipe.
- If a string then this is a single character delimiter, followed by an optional "+" flag. The delimiter is used as an end-of-record to split the character stream into separate records. If the flag "+" is specified then the delimiter is also returned at the end of the record, otherwise it is discarded.
- If omitted, then this defaults to `"\n+"`
Note that if the last record in the pipe is missing a delimiter or is too short, then it is still returned, emptying the pipe.
#### Returns
A string or `nil` if the pipe is empty
#### Example
```lua
line = pobj:read('\n')
line = pobj:read(50)
```
## pobj:reader()
Returns a Lua **iterator** function for a pipe object. This is as described in the
[Lua Language: For Statement](http://www.lua.org/manual/5.1/manual.html#2.4.5). \(Note that the
`state` and `object` variables mentioned in 2.5.4 are optional and default to `nil`, so this
conforms to the`for` iterator syntax and works in a for because it maintains the state and `pobj`
internally as upvalues.
An emptied pipe takes up minimal RAM resources (an empty Lua array), and just like any other array
this is reclaimed if all variables referencing it go out of scope or are over-written. Note
that any reader iterators that you have created also refer to the pipe as an upval, so you will
need to discard these to descope the pipe array.
#### Syntax
`myFunc = pobj:reader([size/end_char])`
#### Parameters
- `size/end_char` as for `pobj:read()`
#### Returns
- `myFunc` iterator function
#### Examples
- used in `for` loop:
```lua
for rec in p:reader() do print(rec) end
-- or
fp = p:reader()
-- ...
for rec in fp do print(rec) end
```
- used in callback task:
```Lua
do
local pipe_reader = p:reader(1400)
local function flush(sk) -- Upvals flush, pipe_reader
local next = pipe_reader()
if next then
sk:send(next, flush)
else
sk:on('sent') -- dereference to allow GC
flush = nil
end
end
flush()
end
```
## pobj:unread()
Write a string to the head of a pipe object. This can be used to back-out a previous read.
#### Syntax
`pobj:unread(s)`
#### Parameters
`s` Any input string. Note that with all Lua strings, these may contain all character values including "\0".
#### Returns
Nothing
#### Example
```Lua
a=p:read()
p:unread(a) -- restores pipe to state before the read
```
## pobj:write()
Write a string to a pipe object.
#### Syntax
`pobj:write(s)`
#### Parameters
`s` Any input string. Note that with all Lua strings, these may contain all character values including "\0".
#### Returns
Nothing
## pobj:nrec()
Return the number of internal records in the pipe. Each record ranges from 1
to 256 bytes in length, with full chunks being the most common case. As
extracting from a pipe only to `unread` if too few bytes are available, it may
be useful to have a quickly estimated upper bound on the length of the string
that would be returned.

330
docs/nodemcu-lrm.md Normal file
View File

@ -0,0 +1,330 @@
# NodeMCU Reference Manual
## Introduction
NodeMCU firmware is an IoT project ("the Project") which implements a Lua-based runtime for SoC modules based on the Espressif ESP8266 and ESP32 architectures. This NodeMCU Reference Manual (**NRM**) specifically addresses how the NodeMCU Lua implementation relates to standard Lua as described in the two versions of the Lua language that we currently support:
- The [Lua 5.1 Reference Manual](https://www.lua.org/manual/5.1/) and
- The [Lua 5.3 Reference Manual](https://www.lua.org/manual/5.3/) (**LRM**)
Developers using the NodeMCU environment should familiarise themselves with the 5.3 LRM.
The Project provides a wide range of standard library modules written in both C and Lua to support many ESP hardware modules and chips, and these are documented in separate sections in our [online documentation](index.md).
The NRM supplements LRM content and module documentation by focusing on a complete description of the _differences_ between NodeMCU Lua and standard Lua 5.3 in use. It adopts the same structure and style as the LRM. As NodeMCU provides a full implementation of the Lua language there is little content herein relating to Lua itself. However, what NodeMCU does is to offer a number of enhancements that enable resources to be allocated in constant program memory &mdash; resources in standard Lua that are allocated in RAM; where this does impact is in the coding of C library modules and the APIs used to do this. Hence the bulk of the differences relate to these APIs.
One of our goals in introducing Lua 5.3 support was to maintain the continuity of our existing C modules by ensuring that they can be successfully compiled and executed in both the Lua 5.1 and 5.3 environments. This goal was achieved by a combination of:
- enabling relevant compatibility options for standard Lua libraries;
- back porting some Lua 5.3 API enhancements back into our Lua 5.1 implementation, and
- making some small changes to the module source to ensure that incompatible API use is avoided.
Further details are given in the [Lua compatibility](#lua-compatibility) section below. Notwithstanding this, the Project has now deprecated Lua 5.1 and will soon be moving this version into frozen support.
As well as providing the ability to building runtime firmware environments for ESP chipsets, the Project also offers a `luac.cross` cross compiler that can be built for common platforms such as Windows 10 and Linux, and this enables developers to compile source modules into a binary file format for download to ESP targets and loading from those targets.
## Basic Concepts
The NodeNCU runtime offers a full implementation of the [LRM §2](https://www.lua.org/manual/5.3/manual.html#2) core concepts, with the following adjustments:
### Values and Types
- The firmware is compiled to use 32-bit integers and single-precision (32-bit) floats by default. Address pointers are also 32-bit, and this allows all Lua variables to be encoded in RAM as an 8-byte "`TValue`" (compared to the 12-byte `TValue` used in Lua 5.1).
- C modules can statically declare read-only variant of the `Table` structure type known as a "`ROTable`". There are some limitations to the types for ROTable keys and value, in order to ensure that these are consistent with static declaration. ROTables are stored in code space (in flash memory on the ESPs), and hence do not take up RAM resources. However these are still represented by the Lua type _table_ and a Lua application can treat them during execution the same as any other read-only table. Any attempt to write to a `ROTable` or to set its metatable will throw an error.
- NodeMCU also introduces a concept known as **Lua Flash Store (LFS)**. This enables Lua code (and any string constants used in this code) to be compiled and stored in code space, and hence without using RAM resources. Such LFS functions are still represented by the type _function_ and can be executed just like any other Lua function.
### Environments and the Global Environment
These are implemented as per the LRM. Note that the Lua 5.1 and Lua 5.3 language implementations are different and can introduce breaking incompatibilities when moving between versions, but this is a Lua issue rather than a NodeMCU one.
### Garbage Collection
All LFS functions, any string constants used in these functions, and any ROTables are stored in static code space. These are ignored by the Lua Garbage collector (LGC). The LGC only scans RAM-based resources and recovers unused ones. The NodeMCU LGC has slightly modified configuration settings that increase its aggressiveness as heap usage approaches RAM capacity.
### Coroutines
The firmware includes the full coroutine implementation, but note that there are some slight differences between the standard Lua 5.1 and Lua 5.3 C API implementations. (See [Feature breaks](#feature-breaks) below.)
## Lua Language
The NodeNCU runtime offers a full implementation of the Lua language as defined in [LRM §3](https://www.lua.org/manual/5.3/manual.html#3) and its subsections.
## The Application Program Interface
[LRM §4](https://www.lua.org/manual/5.3/manual.html#4) describes the C API for Lua. This is used by NodeMCU modules written in the C source language to interface to the the Lua runtime. The header file `lua.h` is used to define all API functions, together with their related types and constants. This section 4 forms the primary reference, though the NodeMCU makes minor changes as detailed below to support LFS and ROTable resources.
### Error Handling in C
[LRM §4.6](https://www.lua.org/manual/5.3/manual.html#4.6) describes how errors are handled within the runtime.
The normal practice within the runtime and C modules is to throw any detected errors -- that is to unroll the call stack until the error is acquired by a routine that has declared an error handler. Such an environment can be established by [lua_pcall](https://www.lua.org/manual/5.3/manual.html#lua_pcall) and related API functions within C and by the Lua function [pcall](https://www.lua.org/manual/5.3/manual.html#pdf-pcall); this is known as a _protected environment_. Errors which occur outside any protected environment are not caught by the Lua application and by default trigger a "panic". By default NodeMCU captures the error traceback and posts a new SDK task to print the error before restarting the processor.
The NodeMCU runtime implements a non-blocking threaded model that is similar to that of `node.js`, and hence most Lua execution is initiated from event-triggered callback (CB) routines within C library modules. NodeMCU enables full recovery of error diagnostics from otherwise unprotected Lua execution by adding an additional auxiliary library function [`luaL_pcallx`](#luaL_pcallx). All event-driven Lua callbacks within our library modules use `luaL_pcallx` instead of `lua_call`. This has the same behaviour if no uncaught error occurs. However, in the case of an error that would have previously resulted in a panic, a new SDK task is posted with the error traceback as an upvalue to process this error information.
The default action is to print a full traceback and then trigger processor restart, that is a similar outcome as before but with recovery of the full error traceback. However the `node.onerror()` library function is available to override this default action; for example developers might wish to print the error without restarting the processor, so that the circumstances which triggered the error can be investigated.
### Additional API Functions and Types
These are available for developers coding new library modules in C. Note that if you compare the standard and NodeMCU versions of `lua.h` you will find a small number of entries not listed below. This is because the Lua 5.1 and Lua 5.3 variants are incompatible owing to architectural differences. However, `laubxlib.h` includes equivalent wrapper version-compatible functions that may be used safely for both versions.
#### ROTable and ROTable_entry
Extra structure types used in `LROT` macros to declare static RO tables. See detailed section below.
#### lua_createrotable
` void (lua_createrotable) (lua_State *L, ROTable *t, const ROTable_entry *e, ROTable *mt);`         [-0, +1, -]
Create a RAM based `ROTable` pointing to the `ROTable_entry` vector `e`, and metatable `mt`.
#### lua_debugbreak and ASSERT
` void (lua_debugbreak)(void);`
`lua_debugbreak()` and `ASSERT(condition)` are available for development debugging. If `DEVELOPMENT_USE_GDB` is defined then these will respectively trigger a debugger break and evaluate a conditional assert prologue on the same. If not, then these are effectively ignored and generate no executable code.
#### lua_dump
` int lua_dump (lua_State *L, lua_Writer writer, void *data, int strip);`         [-0, +0, ]
Dumps function at the top of the stack function as a binary chunk as per LRM. However the last argument `strip` is now an integer is in the range -1..2, rather a boolean as per standard Lua:
- -1, use the current default strip level (which can be set by [`lua_stripdebug`](#lua_stripdebug))
- 0, keep all debug info
- 1, discard Local and Upvalue debug info; keep line number info
- 2, discard Local, Upvalue and line number debug info
The internal NodeMCU `Proto` encoding of debug line number information is typically 15× more compact than in standard Lua; the intermediate `strip=1` argument allows the removal of most of the debug information whilst retaining the ability to produce a proper line number traceback on error.
#### lua_freeheap
` int lua_freeheap (void);`         [-0, +0, ]
returns the amount of free heap available to the Lua memory allocator.
#### lua_gc
` int lua_gc (lua_State *L, int what, int data);`         [-0, +0, m]
provides an option for
- `what`:
- `LUA_GCSETMEMLIMIT` sets the available heap threshold (in bytes) at which aggressive sweeping starts.
#### lua_getlfsconfig
` void lua_getlfsconfig (lua_State *L, int *conf);`         [-0, +0, -]
if `conf` is not `NULL`, then this returns an int[5] summary of the LFS configuration. The first 3 items are the mapped and flash address of the LFS region, and its allocated size. If the LFS is loaded then the 4th is the current size used and the 5th (for Lua 5.3 only) the date-timestamp of loaded LFS image.
#### lua_getstate
` lua_State * lua_getstate();`         [-0, +0, -]
returns the main thread `lua_State` record. Used in CBs to initialise `L` for subsequent API call use.
#### lua_pushrotable
` void lua_pushrotable (lua_State *L, ROTable *p);`         [-0, +1, -]
Pushes a ROTable onto the stack.
#### lua_stripdebug
` int lua_stripdebug (lua_State *L, int level);`         [-1, +0, e]
This function has two modes. A value is popped off the stack.
- If this value is `nil`, then the default strip level is set to given `level` if this is in the range 0 to 2. Returns the current default level.
- If this value is a Lua function (in RAM rather than in LFS), then the prototype hierarchy within the function is stripped of debug information to the specified level. Returns an _approximate_ estimate of the heap freed (this can be a lot higher if some strings can be garbage collected).
#### lua_writestring
` void lua_writestring(const char *s, size_t l); /* macro */`
Writes a string `s` of length `l` to `stdout`. Note that any output redirection will be applied to the string.
#### lua_writestringerror
` void lua_writestringerror(const char *s, void *p). /* macro */`
Writes an error with CString format specifier `s` and parameter `p` to `stderr`. Note on the ESP devices this error will always be sent to `UART0`; output redirection will not be applied.
### The Debug Interface
#### lua_pushstringsarray
` int lua_pushstringsarray (lua_State *L, int opt);`         [-0, +1, m]
Pushes an array onto the stack containing all strings in the specified strings table. If `opt` = 0 then the RAM table is used, else if `opt` = 1 and LFS is loaded then the LFS table is used, else `nil` is pushed onto the stack.
Returns a status boolean; true if a table has been pushed.
### Auxiliary Library Functions and Types
Note that the LRM defines an auxiliary library which contains a set of functions that assist in coding convenience and economy. These are strictly built on top of the basic API, and are defined in `lauxlib.h`. By convention all auxiliary functions have the prefix `luaL_` so module code should only contain `lua_XXXX()` and `luaL_XXXX()` data declarations and functions. And since the `lauxlib.h` itself incudes `lua.h`, all C modules should only need to `#include "lauxlib.h"` in their include preambles.
NodeMCU adds some extra auxiliary functions above those defined in the LRM.
#### luaL_lfsreload
` int lua_lfsreload (lua_State *L);`         [-1, +1, -]
This function pops the LFS image name from the stack, and if it exists and contains the correct image header then it reloads LFS with the specified image file, and immediately restarts with the new LFS image loaded, so control it not returned to the calling function. If the image is missing or the header is invalid then an error message is pushed onto the stack and control is returned. Note that if the image has a valid header but its contents are invalid then the result is undetermined.
#### luaL_pcallx
` int luaL_pcallx (lua_State *L, int narg, int nresults);`         [-(nargs + 1), +(nresults|1), ]
Calls a function in protected mode and providing a full traceback on error.
Both `nargs` and `nresults` have the same meaning as in [lua_call](https://www.lua.org/manual/5.3/manual.html#lua_call). If there are no errors during the call, then `luaL_pcallx` behaves exactly like `lua_call`. However, if there is any error, `lua_pcallx` has already established an traceback error handler for the call that catches the error. It cleans up the stack and returns the negative error code.
Any caught error is posted to a separate NodeMCU task which which calls the error reporter as defined in the registry entry `onerror` with the traceback text as its argument. The default action is to print the error and then set a 1 sec one-shot timer to restart the CPU. (One second is enough time to allow the error to be sent over the network if redirection to a telnet session is in place.) If the `onerror` entry is set to `print` for example, then the error is simply printed without restarting the CPU.
Note that the Lua runtime does not call the error handler if the error is an out-of memory one, so in this case the out-of-memory error is posted to the error reporter without a traceback.
#### luaL_posttask
` int luaL_posttask (lua_State* L, int prio);`         [-1, +0, e]
Posts a task to execute the function popped from the stack at the specified user task priority
- `prio` one of:
- `LUA_TASK_LOW`
- `LUA_TASK_MEDIUM`
- `LUA_TASK_HIGH`
Note that the function is invoked with the priority as its parameter.
#### luaL_pushlfsmodule
` int luaL_pushlfsmodule ((lua_State *L);`         [-1, +1, -]
This function pops a module name from the stack. If this is a string, LFS is loaded and it contains the named module then its closure is pushed onto the stack as a function value, otherwise `nil` is pushed.
Returns the type of the pushed value.
#### luaL_pushlfsmodules
` int luaL_pushlfsmodules (lua_State *L);`         [-0, +1, m]
If LFS is loaded then an array of the names of all of the modules in LFS is pushed onto the stack as a function value, otherwise `nil` is pushed.
Returns the type of the pushed value.
#### luaL_pushlfsdts
` int luaL_pushlfsdts (lua_State *L);`         [-0, +1, m-]
If LFS is loaded then the Unix-style date-timestamp for the compile time of the image is pushed onto the stack as a integer value, otherwise `nil` is pushed. Note that the primary use of this stamp is to act as a unique identifier for the image version.
Returns the type of the pushed value.
#### luaL_reref
` void (luaL_reref) (lua_State *L, int t, int *ref);`         [-1, +0, m]
Variant of [luaL_ref](https://www.lua.org/manual/5.3/manual.html#luaL_ref). If `*ref` is a valid reference in the table at index t, then this is replaced by the object at the top of the stack (and pops the object), otherwise it creates and returns a new reference using the `luaL_ref` algorithm.
#### luaL_rometatable
` int (luaL_rometatable) (lua_State *L, const char* tname, const ROTable *p);`         [-0, +1, e]
Equivalent to `luaL_newmetatable()` for ROTable metatables. Adds key / ROTable entry to the registry `[tname] = p`, rather than using a new RAM table.
#### luaL_unref2
` luaL_unref2(l,t,r)`
This macro executes `luaL_unref(L, t, r)` and then assigns `r = LUA_NOREF`.
### Declaring modules and ROTables in NodeMCU
All NodeMCU C library modules should include the standard header "`module.h`". This internally includes `lnodemcu.h` and these together provide the macros to enable declaration of NodeMCU modules and ROTables within them. All ROtable support macros are either prefixed by `LRO_` (Lua Read Only) or in the case of table entries `LROT_`.
#### NODEMCU_MODULE
` NODEMCU_MODULE(sectionname, libraryname, map, initfunc)`
This macro enables the module to be statically declared and linked in the `ROM` ROTable. The global environment's metafield `__index` is set to the ROTable `ROM` hence any entries in the ROM table are resolved as read-only entries in the global environment.
- `sectionname`. This is the linker section for the module and by convention this is the uppercased library name (e.g. `FILE`). Behind the scenes `_module_selected` is appended to this section name if corresponding "use module" macro (e.g. `LUA_USE_MODULES_FILE`) is defined in the configuration. Only the modules sections `*_module_selected` are linked into the firmware image and those not selected are ignored.
- `libraryname`. This is the name of the module (e.g. `file`) and is the key for the entry in the `ROM` ROTable.
- `map`. This is the ROTable defining the functions and constants for the module, and this is the corresponding value for the entry in the `ROM` ROTable.
- `initfunc`. If this is not NULL, it should be a valid C function and is call during Lua initialisation to carry out one-time initialisation of the module.
#### LROT_BEGIN and LROT_END
` LROT_BEGIN(rt,mt,flags)`
` LROT_END(rt,mt,flags)`
These macros start and end a ROTable definition. The three parameters must be the same in both declarations.
- `rt`. ROTable name.
- `mt`. ROTable's metatable. This should be of the form `LROT_TABLEREF(tablename)` if the metatable is used and `NULL` otherwise.
- `flags`. The Lua VM table access routines use a `flags` field to short-circuit where the access needs to honour metamethods during access. In the case of a static ROTable this flag bit mask must be declared statically during compile rather than cached dynamically at runtime. Hence if the table is a metatable and it includes the metamethods `__index`, `__newindex`, `__gc`, `__mode`, `__len` or `__eq`, then the mask field should or-in the corresponding mask for example if `__index` is used then the flags should include `LROT_MASK_INDEX`. Note that index and GC are a very common combination, so `LROT_MASK_GC_INDEX` is also defined to be `(LROT_MASK_GC | LROT_MASK_INDEX)`.
#### LROT_xxxx_ENTRY
ROTables only support static declaration of string keys and value types: C function, Lightweight userdata, Numeric, ROTable. These are entries are declared by means of the `LROT_FUNCENTRY`, `LROT_LUDENTRY`, `LROT_NUMENTRY`, `LROT_INTENTRY`, `LROT_FLOATENTRY` and `LROT_TABENTRY` macros. All take two parameters: the name of the key and the value. For Lua 5.1 builds `LROT_NUMENTRY` and `LROT_INTENTRY` both generate a numeric `TValue`, but in the case of Lua 5.3 these are separate numeric subtypes so these macros generate the appropriate subtype.
Note that ROTable entries can be declared in any order except that keys starting with "`_`" must be declared at the head of the list. This is a pragmatic constraint for runtime efficiency. A lookaside cache is used to optimise key searches and results in a direct table probe in over 95% of ROTable accesses. A table miss (that is the key doesn't exist) still requires a full scan of the list, and the main source of table misses are scans for metafield values. Forcing these to be at the head of the ROTable allows the scan to abort on reading the first non-"`_`" key.
ROTables can still support other key and value types by using an index metamethod to point at an C index access function. For example this technique is used in the `utf8` library to return `utf8.charpattern`
```C
LROT_BEGIN(utf8_meta, NULL, LROT_MASK_INDEX)
LROT_FUNCENTRY( __index, utf8_lookup )
LROT_END(utf8_meta, NULL, LROT_MASK_INDEX)
LROT_BEGIN(utf8, LROT_TABLEREF(utf8_meta, 0)
LROT_FUNCENTRY( offset, byteoffset )
LROT_FUNCENTRY( codepoint, codepoint )
LROT_FUNCENTRY( char, utfchar )
LROT_FUNCENTRY( len, utflen )
LROT_FUNCENTRY( codes, iter_codes )
LROT_END(utf8, LROT_TABLEREF(utf8_meta), 0)
```
Any reference to `utf8.charpattern` will call the `__index` method function (`utf8_lookup()`); this returns the UTF8 character pattern if the name equals `"charpattern"` and `nil` otherwise. Hence `utf8` works as standard even though ROTables don't natively support a string value type.
### Standard Libraries
- Basic Lua functions, coroutine support, Lua module support, string and table manipulation are as per the standard Lua implementation. However, note that there are some breaking changes in the standard Lua string implementation as discussed in the LRM, e.g. the `\z` end-of-line separator; no string functions exhibit a CString behaviour (that is treat `"\0"` as a special character).
- The modulus operator is implemented for string data types so `str % var` is a synonym for `string.format(str, var)` and `str % tbl` is a synonym for `string.format(str, table.unpack(tbl))`. This python-like formatting functionality is a very common extension to the string library, but is awkward to implement with `string` being a `ROTable`.
- The `string.dump()` `strip` parameter can take integer values 1,2,3 (the [`lua_stripdebug`](#lua_stripdebug) strip parameter + 1). `false` is synonymous to `1`, `true` to `3` and omitted takes the default strip level.
- The `string` library does not offer locale support.
- The 5.3 `math` library is expanded compared to the 5.1 one, and specifically:
- Included: ` abs`, ` acos`, ` asin`, ` atan`, ` ceil`, ` cos`, ` deg`, ` exp`, ` tointeger`, ` floor`, ` fmod`, ` ult`, ` log`, ` max`, ` min`, ` modf`, ` rad`, ` random`, ` randomseed`, ` sin`, ` sqrt`, ` tan` and ` type`
- Not implemented: ` atan2`, ` cosh`, ` sinh`, ` tanh`, ` pow`, ` frexp`, ` ldexp` and ` log10`
- Input, output OS Facilities (the `io` and `os` libraries) are not implement for firmware builds because of the minimal OS supported offered by the embedded run-time. The separately documented `file` and `node` libraries provide functionally similar analogues. The host execution environment implemented by `luac.cross` does support the `io` and `os` libraries.
- The full `debug` library is implemented less the `debug.debug()` function.
- An extra function `debug.getstrings(type)` has been added; `type` is one of `'ROM'`, or `'RAM'` (the default). Returns a sorted array of the strings returned from the [`lua_getstrings`](#lua_getstrings) function.
## Lua compatibility
Standard Lua has a number of breaking incompatibilities that require conditional code to enable modules using these features to be compiled against both Lua 5.1 and Lua 5.3. See [Lua 5.2 §8](https://www.lua.org/manual/5.2/manual.html#8) and [Lua 5.3 §8](https://www.lua.org/manual/5.3/manual.html#8) for incompatibilities with Lua 5.1.
A key strategy in our NodeMCU migration to Lua 5.3 is that all NodeMCU application modules must be compilable and work under both Lua versions. This has been achieved by three mechanisms
- The standard Lua build has conditionals to enable improved compatibility with earlier versions. In general the Lua 5.1 compatibilities have been enabled in the Lua 5.3 builds.
- Regressing new Lua 5.3 features into the Lua 5.1 API.
- For a limited number of features the Project accepts that the two versions APIs are incompatible and hence modules should either avoid their use or use `#if LUA_VERSION_NUM == 501` conditional compilation.
The following subsections detail how NodeMCU Lua versions deviate from standard Lua in order to achieve these objectives.
### Enabling Compatibility modes
The following Compatibility modes are enabled:
- `LUA_COMPAT_APIINTCASTS`. These `ìnt`cast are still used within NodeMCU modules.
- `LUA_COMPAT_UNPACK`. This retains `ROM.unpack` as a global synonym for `table.unpack`
- `LUA_COMPAT_LOADERS`. This keeps `package.loaders` as a synonym for `package.seachers`
- `LUA_COMPAT_LOADSTRING`. This keeps `loadstring(s)` as a synonym for `load(s)`.
### New Lua 5.3 features back-ported into the Lua 5.1 API
- Table access routines in Lua 5.3 are now type `int` rather than `void` and return the type of the value pushed onto the stack. The 5.1 routines `lua_getfield`, `lua_getglobal`, `lua_geti`, `lua_gettable`, `lua_rawget`, `lua_rawgeti`, `lua_rawgetp` and `luaL_getmetatable` have been updated to mirror this behaviour and return the type of the value pushed onto the stack.
- There is a general numeric comparison API function `lua_compare()` with macros for `lua_equal()` and `lua_lessthan()` whereas 5.1 only support the `==` and `<` tests through separate API calls. 5.1 has been update to mirror the 5.3. implementation.
- Lua 5.3 includes a `lua_absindex(L, idx)` which converts ToS relative (e.g. `-1`) indices to stack base relative and hence independent of further push/pop operations. This makes using down-stack indexes a lot simpler. 5.1 has been updated to mirror this 5.3 function.
### Feature breaks
- `\0` is now a valid pattern character in search patterns, and the `%z` pattern is no longer supported. We suggest that modules either limit searching to non-null strings or accept that the source will require version variants.
- The stack pseudo-index `LUA_GLOBALSINDEX` has been removed. Modules must either get the global environment from the registry entry `LUA_RIDX_GLOBALS` or use the `lua_getglobal` API call. All current global references in the NodeMCU library modules now use `lua_getglobal` where necessary.
- Shared userdata and table upvalues. Lua 5.3 now supports the sharing of GCObjects such as userdata and tables as upvalue between C functions using `lua_getupvalue` and `lua_setupvalue`. This has changed from 5.1 and we suggest that modules either avoid using these API calls or accept that the source will require version variants.
- The environment support has changed from Lua 5.1 to 5.3. We suggest that modules either avoid using these API calls or accept that the source will require version variants.
- The coroutine yield / resume API support has changed in Lua 5.3 to support yield and resume across C function calls. We suggest that modules either avoid using these API calls or accept that the source will require version variants.

42
docs/nodemcu-pil.md Normal file
View File

@ -0,0 +1,42 @@
# Programming in NodeMCU
The standard Lua runtime offers support for both Lua modules that can define multiple Lua functions and properties in an encapsulating table as described in the [Lua 5.3 Reference Manual](https://www.lua.org/manual/5.3/) \("**LRM**") and specifically in [LRM Section 6.3](https://www.lua.org/manual/5.3/manual.html#6.3). Lua also provides a C API to allow developers to implement modules in compiled C.
NodeMCU developers are also able to develop and incorporate their own C modules into their own firmware build using this standard API, although we encourage developers to download the standard [Lua Reference Manual](https://www.lua.org/manual/) and also buy of copy of Roberto Ierusalimschy's Programming in Lua edition 4 \("**PiL**"). The NodeMCU implementation extends standard Lua as documented in the [NodeMCU Reference Manual](nodemcu-lrm.md) \("**NRM**").
Those developers who wish to develop or to modify existing C modules should have access to the LRM, PiL and NRM and familiarise themselves with these references. These are the primary references; and this document does not repeat this content, but rather provide some NodeMCU-specific information to supplement it.
From a perspective of developing C modules, there is very little difference from that of developing modules in standard Lua. All of the standard Lua library modules (`bit`, `coroutine`, `debug`, `math`, `string`, `table`, `utf8`) use the C API for Lua and the NodeMCU versions have been updated to use NRM extensions. so their source code is available for browsing and using as style template (see the corresponding `lXXXlib.c` file in GitHub [NodeMCU lua53](../app/lua53) folder).
The main functional change is that NodeMCU supports a read-only subclass of the `Table` type, known as a **`ROTable`**, which can be statically declared within the module source using static `const` declarations. There are also limitations on the valid types for ROTable keys and value in order to ensure that these are consistent with static declaration; and hence ROTables are stored in code space (and therefore in flash memory on the IoT device). Hence unlike standard Lua tables, ROTables do not take up RAM resources.
Also unlike standard Lua, two global ROTables are used for the registration of C modules. Again, static declaration macros plus linker "magic" (use of make filters plus linker _section_ directives) result in the marshalling of these ROTables during the make process, and because this is all ROTable based, the integration of modules into the firmware builds and their access from executing Lua applications depends on code space rather than RAM-based data structures.
Note that dynamic loading of C modules is not supported within the ESP SDK, so any library registration must be compiled into the source used in the firmware build. Our approach is simple, flexible and avoids the RAM overheads of the standard Lua approach. The special ROTable **`ROM`** is core to this approach. The global environment table has an `__index` metamethod referencing this ROM table. Hence, any non-raw lookups against the global table will also resolve against ROM. All base Lua functions (such as `print`) and any C libraries (written to NodeMCU standards) have an entry in the ROM table and hence have global visibility. This approach does not prevent developers use of standard Lua mechanisms, but rather it offers a simple low RAM use alternative.
The `NODEMCU_MODULE` macro is used in each module to register it in an entry in the **`ROM`** ROTable. It also adds a entry in the second (hidden) **`ROMentry`** ROTable.
- All `ROM` entries will resolve globally
- The Lua runtime scans the `ROMentry` ROTable during its start up, and it will execute any non-NULL `CFunction` values in this table. This enables C modules to hook in any one-time start-up functions if they are needed.
Note that the standard `make` will include any modules found in the `app/modules` folder within a firmware build _if_ the corresponding `LUA_USE_MODULES_modname` macro has been defined. These defines are conventionally set in a common include file `user_modules.h`, and this practice is mandated for any user-submitted modules that are added to to the NodeMCU distribution. However, this does not prevent developers adding their own local modules to the `app/modules` folder and simply defining the corresponding `LUA_USE_MODULES_modname` inline.
This macro + linker approach renders the need for `luaL_reg` declarations and use of `luaL_openlib()` unnecessary, and these are not permitted in project-adopted `app/modules` files.
Hence a NodeMCU C library module typically has a standard layout that parallels that of the standard Lua library modules and uses the same C API to access the Lua runtime:
- A `#ìnclude` block to resolve access to external resources. All modules will include entries for `"module.h"`, and `"lauxlib.h"`. They should _not_ reference any other `lXXX.h` includes from the Lua source directory as these are private to the Lua runtime. These may be followed by C standard runtime includes, external application libraries and any SDK resource headers needed.
- The only external interface to a C module should be via the Lua runtime and its `NODEMCU_MODULE` hooks. Therefore all functions and resources should be declared `static` and be private to the module. These can be ordered as the developer wishes, subject of course to the need for appropriate forward declarations to comply with C scoping rules.
- Module methods will typically employ a Lua standard `static int somefunc (lua_State *L) { ... }` template.
- ROTables are typically declared at the end of the module to minimise the need for forward references and use the `LROT` macros described in the NRM. Note that ROTables only support static declaration of string keys and the value types: C function, Lightweight userdata, Numeric, ROTable. ROTables can also have ROTable metatables.
- Whilst the ROTable search algorithm is a simply linear scan of the ROTable entries, the runtime also maintains a LRU cache of ROTable accesses, so typically over 95% of ROTable accesses bypass the linear scan and do a direct access to the appropriate entry.
- ROTables are also reasonable lightweight and well integrated into the Lua runtime, so the normal metamethod processing works well. This means that developers can use the `__index` method to implement other key and value typed entries through an index function.
- NodeMCU modules are intended to be compilable against both our Lua 5.1 and Lua 5.3 runtimes. The NRM discusses the implications and constraints here. However note that:
- We have back-ported many new Lua 5.3 features into the NodeMCU Lua 5.1 API, so in general you can use the 5.3 API to code your modules. Again the NRM notes the exceptions where you will either need variant code or to decide to limit yourself to the the 5.3 runtime. In this last case the simplest approach is to `#if LUA_VERSION_NUM != 503` to disable the 5.3 content so that 5.1 build can compile and link. Note that all modules currently in the `app/modules` folder will compile against and execute within both the Lua 5.1 and the 5.3 environments.
- Lua 5.3 uses a 32-bit representation for all numerics with separate subtypes for integer (stored as a 32 bit signed integer) and float (stored as 32bit single precision float). This achieves the same RAM storage density as Lua 5.1 integer builds without the loss of use of floating point when convenient. We have therefore decided that there is no benefit in having a separate Integer 5.3 build variant.
- We recommend that developers make use of the full set of `luaL_` API calls to minimise code verbosity. We have also added a couple of registry access optimisations that both simply and improve runtime performance when using the Lua registry for callback support.
- `luaL_reref()` replaces an existing registry reference in place (or creates a new one if needed). Less code and faster execution than a `luaL_unref()` plus `luaL_ref()` construct.
- `luaL_unref2()` does the unref and set the static int hook to `LUA_NOREF`.
Rather than include simple examples of module templates, we suggest that you review the modules in our GitHub repository, such as the [`utf8`](../app/lua53/lutf8lib.c) library. Note that whilst all of the existing modules in `app/modules` folder compile and work, we plan to do a clean up of the core modules to ensure that they conform to best practice.

View File

@ -27,10 +27,15 @@ pages:
- Uploading code: 'upload.md'
- JTAG debugging: 'debug.md'
- Support: 'support.md'
- Reference:
- NodeMCU Language Reference Manual: 'nodemcu-lrm.md'
- Programming in NodeMCU: 'nodemcu-pil.md'
- FAQs:
- Lua Developer FAQ: 'lua-developer-faq.md'
- Extension Developer FAQ: 'extn-developer-faq.md'
- Whitepapers:
- Lua 5.3 Support: 'lua53.md'
- Lua Flash Store (LFS): 'lfs.md'
- Filesystem on SD card: 'sdcard.md'
- Writing external C modules: 'modules/extmods.md'
- C Modules:
@ -54,6 +59,7 @@ pages:
- 'node': 'modules/node.md'
- 'ota': 'modules/otaupgrade.md'
- 'ow (1-Wire)': 'modules/ow.md'
- 'pipe': 'modules/pipe.md'
- 'pulsecnt': 'modules/pulsecnt.md'
- 'qrcodegen': 'modules/qrcodegen.md'
- 'sdmmc': 'modules/sdmmc.md'