A Challenge called boot time
IoT DEVICES REQUIRE BOOT TIME OPTIMIZATION
Building a Linux image for almost any processor, with graphical libraries such as Qt, Cairo or Enlightenment, is getting easier and easier. These builds are provided by the silicon vendors and work out of the box. But they are also loaded with features that may or may not be necessary for your use case, and that drastically delay the boot time of your system.
Building an IoT device on top of those Linux images raises several challenges.
One of them is the boot time of the device. But talking about boot time optimization without any specific use case in mind doesn’t make much sense.
So let’s list a few requirements to define our use case more precisely:
- Graphical User Interface displayed (Screen resolution: 480x272)
- Sensor communication (UART)
- GPIO that could be used to drive a relay
The hardware selected for this use case is an IMX6ULEVK board, built around an iMX6UL from NXP. This processor is based on an ARM Cortex-A7 core and sits in a price range that is still acceptable for an IoT project.
The basic Linux image provided by NXP boots in 22.3 seconds. To shrink this time to below 2 seconds, we will focus on four key aspects:
- Developing your application
- Tweaking the initialization scripts
- Adapting the Kernel
- Adapting the bootloader
DEVELOPING YOUR APPLICATION
Why should we start by optimizing the application even before the bootloader or the kernel?
Because those later optimizations involve, among other things, removing features. Some of those features can be extremely helpful while debugging, for instance mounting a Network File System while you optimize your application. Therefore we always start by optimizing the application and start-up scripts, then the kernel, and finally the bootloader.
Our target is a boot time below 2s, so the choice of the graphical library used to develop our application is critical.
As usual, using Qt would be great, but since the iMX6UL does not include a GPU it is probably not our best choice. There are several options available on the market, but we decided to focus on two:
- Cairo is a vector-based graphics library. It is extremely light, but since it is only a drawing library you have to do everything yourself. It might not suit a complex GUI, but for our use case it looks like a very good fit.
- Enlightenment Foundation Libraries (EFL) is a graphical framework that comes with a set of widgets. It lets you develop a UI using Edje Data Collection files, which describe the position, size, and other parameters of the graphical elements that make up the visual aspect of your application. If you use it without a display server or compositor (Wayland, X server…) it is pretty light and can be used without any hardware acceleration.
So the choice is between EFL, which could be compared to Qt, and Cairo, which only gives you the basics to design your GUI. We ran some performance tests:
- Flipping the first surface with Cairo: 90ms
- Flipping the first surface with EFL: 1.5s
Therefore, even though EFL would make designing the GUI easier, for our use case, which is all about boot time, we will use Cairo.
Still, EFL is a pretty good option for a processor without a GPU, and optimizing its loading time will be the topic of a separate blog article.
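Timings like the first-surface figures quoted in this section can be collected with a very small helper. The sketch below is an illustration, not the code used for those measurements: it reads CLOCK_MONOTONIC before and after a call and returns the difference in milliseconds. `fake_first_frame` is a hypothetical stand-in that only sleeps for 20ms; in a real test it would be replaced by the code that creates the surface and flips the first frame.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Milliseconds elapsed between two CLOCK_MONOTONIC readings. */
double elapsed_ms(struct timespec start, struct timespec end)
{
    return (end.tv_sec - start.tv_sec) * 1000.0
         + (end.tv_nsec - start.tv_nsec) / 1e6;
}

/* Run `work` once and return how long it took, in milliseconds. */
double time_call_ms(void (*work)(void))
{
    struct timespec start, end;

    clock_gettime(CLOCK_MONOTONIC, &start);
    work();
    clock_gettime(CLOCK_MONOTONIC, &end);
    return elapsed_ms(start, end);
}

/* Hypothetical stand-in for "create the surface and flip the first frame":
 * it only sleeps for 20 ms so the helper can be tried without a display. */
void fake_first_frame(void)
{
    struct timespec pause = { 0, 20 * 1000 * 1000 };

    nanosleep(&pause, NULL);
}
```

Calling `time_call_ms(fake_first_frame)` should return about 20ms or slightly more. Using a monotonic clock keeps the measurement unaffected by wall-clock adjustments during boot.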
TWEAKING THE INIT
In our case, since our application is completely independent of the rest of the system, we decided to write our own initialization program, which starts our application before calling the standard initialization. The different services are therefore initialized only after our application has started. That simple trick allowed us to start our application 14s before the complete system has finished starting.
OPTIMIZING THE KERNEL
The bigger the kernel, the longer the bootloader takes to copy it. So, to reach our boot time goal, we started by stripping the kernel of every feature our system does not need.
We managed to reduce its size from 8.2MB to 2.7MB. This reduced the boot time by 473ms.
Features that the user needs but that are not required for booting were moved out of the kernel and into the file system by compiling them as modules.
In our quest to reduce boot time, we then analyzed what was happening during kernel initialization. To do so, we printed the debug messages with their timestamps. By analyzing the log, we isolated the drivers taking more than 50ms to start and investigated what we could do about them. In our case we managed to reduce the boot time by almost 750ms by reworking the following three drivers:
- UART: When loading the UART driver we observed a delay of 500ms. It came from printing all the debug messages over the serial line, so we set the debug output to silent.
- MMC: For the MMC driver, there was nothing we could do to drastically decrease its loading time. But we realized that it was one of the last drivers to be loaded, and that the system was waiting for it to be ready before mounting the file system. So we initialized the MMC driver earlier, so that mounting the file system no longer had to wait for it. This gained us 125ms.
- SCREEN: Starting the display driver took 250ms; a large part of that time was spent initializing the first image to display on the screen, which was done with a memcopy. We commented the memcopy out, as we did not need it, and gained another 125ms.
In the end, the start-up time of the kernel went from 2.7s to 700ms.
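The log analysis described in this section can be approximated with a small filter. The sketch below assumes log lines that start with the usual `[seconds.microseconds]` kernel timestamp, and flags every line that appears more than a given number of milliseconds after the previous one. A gap only says that something slow ran in between, so it is a starting point for investigation rather than a precise per-driver measurement. The function names are illustrative.

```c
#include <stddef.h>
#include <stdio.h>

/* Read the "[    1.234567]" timestamp (in seconds) at the start of a
 * kernel log line. Returns 1 on success, 0 if the line has none. */
int parse_timestamp(const char *line, double *seconds)
{
    return sscanf(line, " [%lf]", seconds) == 1;
}

/* Print every line that follows a gap longer than threshold_ms and
 * return how many such gaps were found. Lines without a timestamp
 * are skipped. */
size_t report_slow_gaps(const char *const lines[], size_t count,
                        double threshold_ms)
{
    size_t slow = 0;
    double prev = 0.0, cur = 0.0;
    int have_prev = 0;

    for (size_t i = 0; i < count; i++) {
        if (!parse_timestamp(lines[i], &cur))
            continue;

        double gap_ms = (cur - prev) * 1000.0;
        if (have_prev && gap_ms > threshold_ms) {
            printf("%8.1f ms before: %s\n", gap_ms, lines[i]);
            slow++;
        }
        prev = cur;
        have_prev = 1;
    }
    return slow;
}
```

Called with a threshold of 50.0, it returns the number of gaps longer than 50ms and prints the line that follows each one.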
ADAPTING THE BOOTLOADER
By default, the boot process of the board support package is:
At that point, none of the U-Boot features used for debugging were necessary anymore. So, to reduce the boot time, we decided to use the SPL in Falcon mode.
In that mode, the SPL loads the Linux kernel directly. This saved us another 3.6 seconds. On top of Falcon mode, we made two additional optimizations:
- Malloc pool: By default the SPL allocates a memory pool of 32MB, and initializing it takes several hundred milliseconds. We reduced this value to 128KB, which saved 346ms.
- Cache management: The data cache is not enabled by default in the SPL. We introduced that feature, which gained us another 300ms of boot time.
One last modification consisted of increasing the clock frequency used to access the SD card from 25MHz to 50MHz, which allowed us to reach a final boot time of 1.2s.
1.2 seconds seems pretty long for the bootloader, but this time includes the power-up of the board. The power supply design includes some reset chips that add an incompressible delay of 500ms.
By replacing these reset chips we could reduce the boot time by 400ms.
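To get a feel for why the SD card clock matters, here is a back-of-the-envelope helper. It assumes a 4-bit data bus transferring one bit per line per clock cycle, no protocol overhead, and 1MB = 10^6 bytes. None of these figures come from measurements on the board, so treat the result as a lower bound.

```c
/* Ideal time, in milliseconds, to move `bytes` over a bus clocked at
 * clock_hz with bus_width_bits data lines, assuming one bit per line
 * per clock cycle and no protocol overhead. */
double ideal_transfer_ms(double bytes, double clock_hz, int bus_width_bits)
{
    double bits_per_second = clock_hz * bus_width_bits;

    return bytes * 8.0 / bits_per_second * 1000.0;
}
```

Under those assumptions, `ideal_transfer_ms(2.7e6, 25e6, 4)` gives about 216ms for the 2.7MB kernel, and `ideal_transfer_ms(2.7e6, 50e6, 4)` about 108ms, so doubling the clock can save on the order of 100ms on the kernel copy alone.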
We reached our goal: our boot time is now below 2s. Great!
Please note that boot time optimization only makes sense if you have a clear idea of what your system needs. If you need a GUI, the graphics library has to be selected first, and even tested in advance, to make sure it will not be the bottleneck preventing you from reaching your boot time requirement.
Also, there is one topic that we did not cover in this article: security for IoT devices, a serious and ongoing topic.
One aspect of security is making sure that no one can reprogram your target with, for instance, a compromised kernel. One cool feature of the i.MX6 processor is its secure boot mode. This feature is included in the ROM code and checks the signature of the programmed images. But using it means computing a SHA hash of your images, and even though the i.MX6 includes a hardware accelerator for those operations, what would be the impact on your boot time?
I will cover this subject in a future article.
If you have any questions about boot time optimization, or would like to hear about our progress on security, don't hesitate to get in touch.
Tools used for the investigation