From controlling elevators and stoplights to running the valves of your washing machine, or even deciding how long your toaster toasts, embedded computing is everywhere.
Real-time embedded systems are a subset of embedded computing that must complete critical processes within a rigid deadline. Think of a car’s airbag deployment system. If the microcontroller doesn’t detect a collision or electronically trigger the airbag within a fraction of a second, the result is catastrophic.
Similarly, real-time embedded systems are associated with applications that cannot afford to fail, like autonomous driving, flight control, or rocket launch instruments. However, they are also at the heart of every storage device. And their uncompromising nature poses both grueling and exciting challenges for engineers.
Embedded in embedded
Udi Shnitzer is a senior manager of real-time embedded firmware engineering at Western Digital in Israel. He explained that every flash storage device must support a certain data rate in real-time, measured by the number of transactions to the memory per second or how many megabits per second can be moved through the device.
“If I commit that I can write four gigabits per second into the flash and I can’t write each megabit in an average of [1/4,000] of a second, the system will not hold up,” said Shnitzer.
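The arithmetic behind that kind of commitment can be sketched in a few lines. The example below is only an illustration, not Western Digital's method: it assumes decimal units (1 gigabit = 1,000 megabits), and the function names `per_megabit_budget_seconds` and `meets_commitment` are hypothetical. At four gigabits per second, the average time available per megabit works out to 1/4,000 of a second, or 0.25 milliseconds.

```python
from fractions import Fraction

# Assumption: decimal (SI) units, so 1 gigabit = 1,000 megabits.
MEGABITS_PER_GIGABIT = 1000


def per_megabit_budget_seconds(committed_gigabits_per_second: int) -> Fraction:
    """Average time allowed to write one megabit at the committed rate."""
    megabits_per_second = committed_gigabits_per_second * MEGABITS_PER_GIGABIT
    return Fraction(1, megabits_per_second)


def meets_commitment(measured_seconds_per_megabit: Fraction,
                     committed_gigabits_per_second: int) -> bool:
    """True if the measured average write time fits inside the budget."""
    budget = per_megabit_budget_seconds(committed_gigabits_per_second)
    return measured_seconds_per_megabit <= budget


budget = per_megabit_budget_seconds(4)
print(budget)         # 1/4000
print(float(budget))  # 0.00025
```

A device averaging 1/5,000 of a second per megabit would meet the commitment in this sketch; one averaging 1/400 of a second would miss it by a factor of ten.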
A two-decade veteran of real-time software, Shnitzer believes few crafts rival its engineering difficulty. “Real-time embedded systems create some of the biggest challenges for hardware programming in the industry because they must unfailingly complete time-bound tasks using extremely limited resources,” he said.
While it’s relatively easy to increase storage and memory capacity in computers and servers by adding high-capacity hard drives or increasing RAM, embedded systems don’t have that luxury. They operate with set and limited resources.
“That requires a lot of creativity,” explained Shnitzer. “The memory allocated is extremely limited so we can’t write applications with many code lines. Yet, we still have to deliver very complex operations. We have to be extremely creative, original, and efficient with how we code.”
Engineering Sherlocks
Shnitzer and his team work on Western Digital’s iNAND products, the embedded flash found in everything from automobiles to smartphones and IoT devices. Yet the products’ omnipresent, embedded, and global character also bears some unique challenges, particularly for debugging.
While general software can be debugged by running it on a computer, or a mobile device if it’s an app, that doesn’t quite work for embedded software. If a car in Korea experiences an issue in its embedded flash component, Shnitzer can’t just go to the carmaker to pick it up. Solving these issues requires the flair of detective work.
“We can’t understand a problem if we have no information about it,” said Shnitzer. “That means you need to know in advance what data to keep on the device in real-time so that you can have some clues to solve when you need it.”
Yet, clues like thermal stress or electronic interference still need to fit within that initial, exacting memory constraint. Finding efficient ways to collect the data “is a task where skill and experience prevail,” said Shnitzer.
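One common way to keep clues on a device without letting them grow past a fixed memory budget is a ring buffer, where the newest events overwrite the oldest. The sketch below illustrates that general technique only; the article does not say which scheme the team uses. It is written in Python for readability, whereas real firmware would typically be written in C, and the names `Event` and `EventLog` and the event fields are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    timestamp_ms: int
    code: int   # hypothetical event code, e.g. 1 = temperature reading
    value: int  # hypothetical payload, e.g. degrees Celsius


class EventLog:
    """Fixed-capacity ring buffer: the newest events overwrite the oldest."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        # All storage is allocated once, up front; it never grows.
        self._slots: List[Optional[Event]] = [None] * capacity
        self._next = 0   # index of the slot that will be written next
        self._count = 0  # number of valid events currently stored

    def record(self, event: Event) -> None:
        self._slots[self._next] = event
        self._next = (self._next + 1) % len(self._slots)
        self._count = min(self._count + 1, len(self._slots))

    def dump(self) -> List[Optional[Event]]:
        """Return the stored events, oldest first."""
        start = (self._next - self._count) % len(self._slots)
        return [self._slots[(start + i) % len(self._slots)]
                for i in range(self._count)]


log = EventLog(capacity=3)
for t in range(5):
    log.record(Event(timestamp_ms=t, code=1, value=70 + t))
print([e.timestamp_ms for e in log.dump()])  # [2, 3, 4]
```

The trade-off is that only the most recent history survives, which is why deciding in advance what to record matters so much.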
A labor of logic and reasoning
For Sowjanya Sunkavelli, a Western Digital firmware engineering technologist based in Bangalore, debugging real-time embedded firmware is much like solving a logic puzzle. “Unlike software programming, we cannot visually see what exactly is happening,” she said. “So, if something goes wrong, we should, based on our logic and clues collected during runtime, derive and come to a conclusion.”
Sunkavelli works on the real-time embedded systems of the company’s flash USB devices and SD card products. Compared to Shnitzer’s iNAND products or SSDs, USB devices often have less rigorous real-time bounds. However, these products present other distinct challenges that require no less magnificent engineering artistry.
“A tiny microSD card, which is the size of a fingernail, has millions of lines of code running in the background just to copy a photo from your mobile phone,” said Sunkavelli.
But it gets even more complex. USB devices and SD cards have extremely small RAM. Because these devices are intended to be affordable for every pocket, a great deal of scrutiny and calculation goes into each added byte of RAM to keep costs as low as possible.
Additionally, these devices often use lower-grade memory, which requires additional error and memory handling on top of other real-time operations.
“This requires innovation that doesn’t exist in any other flash product,” said Sunkavelli. All those extra lines of code checking that data was written correctly to the NAND need to somehow fit into the limited amount of RAM.
Sunkavelli described the heart of real-time embedded system design: a puzzle of perpetual constraints requiring choosy, clever algorithms. “We need algorithms that, with the minimum amount of work, can achieve the maximum result,” she said. “You’d be surprised how many processes need to happen before a device performs an actual write and how many patents go into that.”
The concurrent nature of real-time embedded systems
The concurrent nature of tasks and layers within a real-time embedded system is the most arduous part of system design. Engineers must consider everything: boot firmware, device drivers, multiple clocks, timers and interfaces, embedded operating systems, adaptation layers, and hardware accelerators.
“We need to decide what our highest priority is,” said Sunkavelli. “But we need to keep in mind that another layer and user commands can interrupt a process at any point in time. So, every layer should have a buffer to check requests from the other and create a system of priorities working on how to honor those requests. If just one layer doesn’t honor a request, we will see issues arise.”
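The idea of buffering requests and honoring them by priority can be illustrated with a small priority queue. This is a sketch under stated assumptions, not the company's design: the class name `RequestQueue`, the numeric priority levels (lower number means more urgent), and the request labels are all hypothetical. Requests with equal priority are served in the order they arrived.

```python
import heapq
import itertools
from typing import List, Tuple


class RequestQueue:
    """Buffers requests and hands them back most-urgent first."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        # Arrival counter: breaks ties so equal priorities stay first-in, first-out.
        self._arrival = itertools.count()

    def submit(self, priority: int, request: str) -> None:
        # Assumption: a lower number means a higher priority.
        heapq.heappush(self._heap, (priority, next(self._arrival), request))

    def next_request(self) -> str:
        _priority, _arrival, request = heapq.heappop(self._heap)
        return request

    def __len__(self) -> int:
        return len(self._heap)


queue = RequestQueue()
queue.submit(2, "background maintenance")  # hypothetical request labels
queue.submit(0, "host write")
queue.submit(1, "internal bookkeeping")
queue.submit(0, "host read")

while len(queue) > 0:
    print(queue.next_request())
# host write
# host read
# internal bookkeeping
# background maintenance
```

A real system would also have to handle a request arriving in the middle of another operation, which this sketch does not attempt to model.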
For Shnitzer, multi-threaded programming is the monkey wrench of real-time embedded systems. “It’s like when two people reach for the same piece of cake on a plate,” he explained. While in life, a resolution will be found through gesture or words, in a real-time embedded system, it’s called a collision.
“If there’s no clear definition of which task has the priority to reach the resource first, the system will come to a stop, or even worse, it will perform an unexpected action or behavior that may be the exact opposite of the intended one,” said Shnitzer.
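A standard guard against two tasks reaching for the same resource at once is a mutual-exclusion lock. The sketch below uses Python's `threading.Lock` to show the general idea; it is not taken from Western Digital's firmware, and the names `SharedCounter` and `run_tasks` are hypothetical. Note that a lock alone does not decide which task goes first. It only guarantees that one task finishes its update before the other begins; deciding priority is a separate design choice.

```python
import threading


class SharedCounter:
    """A shared resource that only one task may update at a time."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.value = 0

    def increment(self) -> None:
        # The read-modify-write below must not interleave with another task's.
        with self._lock:
            current = self.value
            self.value = current + 1


def run_tasks(task_count: int, increments_per_task: int) -> int:
    """Run several tasks that all update the same counter; return the total."""
    counter = SharedCounter()

    def task() -> None:
        for _ in range(increments_per_task):
            counter.increment()

    threads = [threading.Thread(target=task) for _ in range(task_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


print(run_tasks(task_count=2, increments_per_task=10000))  # 20000
```

Without the lock, the two tasks could both read the same old value and one update could be lost, the software equivalent of two hands landing on the same piece of cake.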
The stakes of reliability are high for these products. “Mobile, computer, artificial intelligence, and other industries are founded on [Western Digital] systems. If our memory chip has issues that affect the user, an entire product line could fail,” said Shnitzer.
Failure is not an option
Whether it’s an airbag system, a flight controller, or a flash device, every scenario of a real-time embedded system needs to be designed with unyielding predictability.
At Western Digital, hundreds of real-time embedded software engineers, like Sunkavelli and Shnitzer, work on creating new algorithms, data schemes, and memory-handling functions. Invisible to the consumer, their ingenuity contributes to creating remarkable products.