dev_stories

In this article, I will explore:

How Rust ensures memory safety in no_std environments
What memory allocation strategies are viable on modern embedded targets
Which patterns help minimize memory usage without sacrificing reliability
The current state of Rust adoption in trustworthy embedded systems

🟠 If you work with hardware, write firmware, or are simply interested in the topic of memory safety, then this text is for you.

The cost of a defect

The better we become as developers, the less we are willing to tolerate bugs. Users even more so. Today, no one is willing to wait, especially if money has been paid for the software. Everything must work stably, instantly and always.

But the reality is more complicated. The world of software, electronics and network communications is a huge, multi-layered organism. Errors in it are almost inevitable, especially when there is a person in the chain. And there is nothing more dangerous than self-confidence: “everything is stable, there are no defects.” In practice, they always exist. Some of them just haven’t fired yet.

The hardest thing to catch and predict are memory errors. They can be microscopic — imperceptible leaks that after 100,000 iterations lead to a crash or, worse, to undefined behavior.

And here the question becomes relevant: what the cost of one bug?
For a startup, it could be a microservice failure in the middle of a sale on a new mainstream toy, leading to the company's closure.
For a corporation, it could be millions in losses due to a vulnerability in production.
And for an embedded system in an airplane with a glitched flap firmware, it could all end in a fatal crash.

We are used to seeing bugs in applications, the web, interfaces. But this is just the tip of the iceberg.
Under the hood of the modern world are complex systems invisible to the user: digital electronics, POS terminals, escalator controllers, microcontrollers in medicine, transport, and defense. They work quietly and ensure our everyday life. We simply do not notice them — until a failure occurs.

Developing for such systems is not just “another type of programming.” It is working with limited resources, strict standards, and zero tolerance for errors.
For many years, C/C++ reigned here: fast, flexible, but also extremely dangerous when working with memory. Segfault, use-after-free, race conditions — all these are the consequences of unsafe memory work, the price of which can be human lives.

Rust is a game changer.
It provides low-level control and deterministic behavior, but at the same time does not allow the developer to shoot himself in the foot.
Memory safety is built into the language architecture itself.
Therefore, Rust's unique value proposition fully meets the needs for safe and modern software for critical systems.

1. A quick overview of Rust's memory fundamentals


Concept	Description	Key Layout	Key Benefit for Embedded Systems
Ownership	Each value has a single owner; memory is automatically freed when the owner goes out of scope.	Compile Time	Eliminates memory leaks, double frees, use-after-frees.
Borrow	Allows temporary access to data (immutable or mutable) without transferring ownership; compiler ensures safety.	Compile Time	Prevents race conditions; ensures thread safety.
Lifetime	Ensures that references are valid as long as the data they point to.	Compile Time	Prevents dangling pointers.
Stack	Memory for local variables and function call frames; LIFO; fixed size, known at compile time.	Random Access Memory (RAM)	Extremely fast and predictable allocation; minimizes fragmentation.
Heap	Memory for dynamically allocated data; size determined at run time.	Random Access Memory (RAM)	Flexibility for dynamic structures (if explicitly managed); slower than stack.

2. `no_std`: the foundation for embedded Rust

The no_std attribute specifies that the crate will reference libcore instead of libstd.

📓 no_std highlights
The libcore is a platform-independent subset of libstd that provides some language primitives and processor features (atomic operations, SIMD), but has no API for platform integration, including a default memory allocator. This means that std features such as file I/O, networking, stack overflow protection, command line argument handling, and starting the main thread and dynamic memory allocation (Vec, String) are not available by default. This requires developers to manually handle these aspects or use specialized crates.

Module	Available?	Notes
`core`	✅	Core - arithmetic, Option, Result, iterators
`core::fmt`	✅	Formatting via `core::write!` and `Debug`
`core::ops`	✅	Operators: `Add`, `Deref`, `Drop`, etc.
`core::cmp`	✅	Comparison, `PartialEq`, `Ord`, etc.
`core::ptr`	✅	Working with pointers
`core::mem`	✅	Sizes, allocation, `MaybeUninit`
`core::slice`	✅	Working with slices
`core::str`	✅	Support for string slices

The no_std runtime is not just a technical constraint - it serves as a strict contract that enforces a design philosophy focused on minimal overhead and explicit resource management.

Category	In `std`	In `no_std`	Possible alternative
Prelude	Full `std::prelude`	`core::prelude`, limited	by `core` and `alloc::prelude`
Panic (`panic!`)	Yes, calls `std::panic()`	Requires custom implementation	`#[panic_handler]`
Hip/dynamics	Yes `Box`, `Vec`, `String`	No - no `alloc`	Add `alloc` and allocator
Default allocator	Yes	No	`wee_alloc`, `dlmalloc`, `buddy_alloc`, etc.
Files and File System	`std::fs`	❌ No	Not supported
Network	`std::net`	❌ No	Use HAL/RTOS or `smoltcp`
Threads	`std::thread`	❌ No	RTOS / `embassy` / `RTIC`
Synchronization	`Mutex`, `RwLock`, `Condvar`	❌ No	`spin`, `critical-section`, `cortex-m`
Clocks and Timers	`std::time`	❌ Almost everything	Hardware timers, HAL
Standard Output	`println!`, `std::io`	❌ No	`defmt`, UART/ITM over HAL
I/O	`std::io::Read/Write`	❌ No	HAL drivers, `embedded-hal`
Formatting	`format!`, `std::fmt`	✅ Partial (`core::fmt`)	`core::fmt` without allocation
Standard collections	`HashMap`, `Vec`, `String`	❌ No	`heapless`, `arrayvec`, `alloc`
Error trait)	`std::error::Error`	❌ No	Custom traits or `core::fmt::Debug`
C FFI	✅ Yes	✅ Yes	`extern "C"` works
Signals, processes, env	`std::env`, `std::process`	❌ No	Not applicable
Randomness	`rand`, `std::rand`	❌ No	`rand_core`, hardware generator

Optional allocator

Although the no_std environment removes the default allocator, it does not completely disable dynamic memory allocation capabilities (e.g. Vec, String).
To reintroduce this capability, the controlled alloc gateway from no_std must be used.
After hooking into this gateway, a custom allocator must be implemented and passed to it alloc from the existing crates specifically designed for this purpose, dlmalloc, buddy_alloc or wee_alloc. These allocators typically implement the core::alloc::GlobalAlloc trait.

3. Memory allocation strategies in constrained environments

3.1 Static vs Dynamic Memory Allocation

To ensure deterministic memory usage and minimize fragmentation, static allocation (global and stack variables) is preferred, since the memory for them is known and allocated at compile time.
On the other hand, using dynamic allocation may introduce non-determinism. Using the heap carries the risk of some memory allocation failures and fragmentation problems.
The community recommends using dynamic allocation only in exceptional cases, otherwise the "deterministic by default" sub-approach should be followed.
This design approach 🔻reduces the flexibility of the solution, but significantly 🔺increases the predictability of its behavior and overall reliability.

3.2 Heap in `no_std`

⭐ Best practices for heap management:

Minimize or avoid: The general best practice is to avoid dynamic allocations entirely if possible.
Pre-allocate: If dynamically sized data is needed, allocate memory once at startup and reuse it. For Vecs v.clear() can clear them while preserving the allocated memory to prevent repeat allocations.
heapless: heapless collections (e.g. heapless::Vec) when dynamic resizing is needed but heap usage should be avoided.
The core principle behind heapless is that its data structures are backed by a static memory allocation. For example, you can think of heapless::Vec as an alternative version of std::Vec with fixed capacity and that can’t be re-allocated on the fly (e.g. via push()).
Avoid in interrupts: It is strongly recommended to avoid allocating or freeing memory in interrupt handlers due to potential deadlocks with non-blocking allocators.

3.3 Stack in `no_std`

The most common problem that happens with the stack is stack overflows, which causes the program to overwrite adjacent memory, which can lead to UB or crashes.

⭐ Best practices for stack management:

Linker scripts: Linker scripts determine the memory layout, including the RAM areas for the stack and static data. They can specify stack boundaries, and the build will fail if the segments do not fit.
flip-link: A zero-cost stack overflow protection tool for bare-metal Rust.
Hardware Features (MPU/Stack Limit Registers): Some microcontrollers (e.g. Cortex-M) provide memory protection units (MPU) or dedicated stack limit registers that can detect stack overflows by throwing a crash exception when the stack pointer drops below a certain limit. This changes undefined behavior to a deterministic exception.

⭐ Monitoring stack memory usage:

Static Analysis: framehop. cargo-call-stack can report or inspect stack usage, but last update at oct 2024.
Runtime Monitoring: This can be done by filling the remaining free stack space with some data pattern, and checking at runtime that the pattern is still in the same place at the end of the stack.
Interrupt Stack Usage Considerations: Interrupt handlers have their own stack frames. Their memory usage should be carefully considered to avoid overflows, especially since they can preempt the main program. Static mutable variables in ISRs can be used to safely store state.

A general recommendation for managing problems like stack overflows is to bring these errors into the realm of complete determinism, from UB to guaranteed and predictable failure.

4. Hardware Memory Architectures and Protection

4.1 Microcontrollers Memory


Memory Type	Volatility	Primary Usage	Typical Write Cycles	Access Speed (Relative)
Flash	Non-Volatile	Program Code, Constants	1,000 - 10,000	Fast Reads, Slow Writes
SRAM	Volatile	Stack, Variables	Billions	Very Fast
EEPROM	Non-volatile	Configuration data, rarely changed	100,000	Slower than SRAM

4.2 Memory Layout and Sections

While Rust provides high-level memory safety, the linker script is the critical low-level component that translates these abstractions into physical memory addresses.
In the absence of OS, linker dictates where code and data segments are located in Flash and RAM, and provides the ability to precisely define stack, heap, and static data regions to ensure that Rust's memory model is correctly implemented on the target hardware.

With #[export_name], #[no_mangle], and #[link_section], you literally control how the compiled Rust code fits into memory, and what becomes the entry point for the hardware. Without this, the firmware won't work.

You can 📖 read more about the role of the linker and the distribution of data across memory sections in my recent article.

4.3 Memory Protection Units (MPUs) and Memory Management Units (MMUs)

MPU is an optional component in Cortex-M processors that protects memory regions by defining access permissions (read-only, read-write, no execute).
It can detect stack overflows by making the bottom of the stack inaccessible, prevent code injections by marking regions as non-executable (XN), and isolate application tasks.
MMUs are more advanced, typically found in Cortex-A processors and some RISC-V SoCs.
They allow physical memory to be mapped to virtual memory addresses, enabling full OS functionality such as user applications, process isolation, and paging.

Rust's memory safety guarantees are mostly provided at compile time.
However, hardware features such as the MPU and stack limit registers provide a crucial runtime layer of protection, especially against logic errors that can occur in unsafe code.

5. Concurrency and Shared State Management

5.1 `Send` and `Sync` concurrency Primitives.

Send marks a type as safe to share ownership between threads (i.e. &T is Send)
Sync marks a type as safe to share by reference between threads (i.e. T is Sync)

These are marker traits with no methods, which are automatically inferred if the type consists entirely of Send or Sync types.

They report thread safety information at the type system level.
However, there are exceptions to the rule:

raw pointers are neither Send nor Sync.
UnsafeCell (and thus Cell, RefCell) is not Sync.
Rc is neither Send nor Sync.

In embedded systems, concurrency often arises from interrupts (ISRs) and real-time tasks.
Applying Send and Sync to static mut variables and shared resources, in combination with critical sections or atomic operations, ensures that even in bare-metal or RTOS environments, race conditions are prevented at compile time.

5.2 Smart pointers in embedded systems

Smart pointers provide abstractions for managing memory and shared state at the cost of some overhead.
In resource-constrained embedded systems, these overheads require careful consideration.


Primitive	Thread-Safe	Memory Overhead	Performance Impact	Primary Use Case	Key Limitation/Feature
`Rc<T>`	None	Low (reference count)	Fast	Shared ownership in single-threaded scenarios	Not thread-safe
`RefCell<T>`	None	Low	Fast	Internal mutability in single-threaded scenarios	Not thread-safe
`Arc<T>`	Yes	Moderate (atomic count)	Fast (for cloning), low (for dereferencing)	Shared ownership in multi-threaded scenarios	Default data immutability
`Mutex<T>`	Yes	Moderate (lock structure)	Potential contention/blocking	Exclusive access to shared mutable data	May lead to deadlocks; locking overhead

The preference for heapless collections and static memory allocation over heap-based Arc and Box in many embedded contexts indicates a conscious trade-off: raw efficiency is prioritized unless the complexity of shared ownership requires absolutely smart pointers.

5.3 Critical Sections and Atomic Operations

Critical sections are a basic primitive for concurrency control, literally a global mutex that can only be acquired by one thread/context at a time.

On bare-metal single-core systems, this is typically implemented by disabling interrupts.
On multi-core systems, this involves disabling interrupts on the current core and acquiring a hardware spinlock.

The critical-section crate provides a generic API for critical sections across various target platforms.
It provides safe access to shared mutable data (e.g. Mutex<Cell<u32>>) in an interrupt-safe context.


Primitive	Thread-Safe	Memory Overhead	Performance Impact	Primary Use Case	Key Limitation/Feature
Critical Section (disable interrupts)	Yes (on single core)	Low	High interrupt latency, jitter	Protects shared data from interrupts	Does not guarantee exclusivity on multi-core systems
Atomic Operations	Yes	Low	Minimal overhead	Safe read-modify-write operations; inter-core synchronization	Depends on hardware support; more complex memory model

ARM Cortex-M:
- thumbv6 (M0/M0+) provides atomic load/store operations.
- thumbv7 (M3+) provides full Compare and Swap (CAS) instructions. They are safe even across multiple cores.
RISC-V: Extension A provides atomic read-modify-write (AMO) and load-reserved/store-conditional (LR/SC) instructions for synchronization. They support different memory coherence orders (out-of-order, acquire, release, sequentially coherent). The portable-atomic crate provides atomic types for various targets, including RISC-V without extension A.

The choice between disabling interrupts and using fine-grained atomic operations reflects a tradeoff between simplicity/guarantees and performance/latency.

Critical sections are simpler to implement, but incur higher interrupt latency and jitter.
Atomic operations offer finer control and better performance for specific operations, especially on multi-core systems.

5.4 RTOS and concurrency frameworks

Rust supports integration with real-time kernels and concurrency frameworks for multitasking.

RTIC (Real-Time Interrupt-driven Concurrency): Framework focuses on shared resource management, message passing, and task scheduling.
- Uses the NVIC (Nested Vectored Interrupt Controller) of the Cortex-M for scheduling.
- Ideal for hard-RTOS and interrupt-driven systems. Allows safe sharing of peripherals and data between interrupts and tasks without unsafe.
Embassy: Framework based on async/await with executors, priorities, timers, network and USB support.
- Provide effortless async approach while maintaining bare-metal simplicity.
- async fn moves part of its stack into a Future type, which has a size known at compile time.
- Futures only store variables held through .await points, not the entire stack frame.
TockOS: A safety-focused RTOS for Cortex-M and RISC-V, with an emphasis on application isolation via MPU support.
- The kernel is written in Rust, processes can be in any language.
- Has no shared heap in the kernel.
- The RTOS allocates a stack for each task. Proper sizing of these stacks is critical to avoid wastage or overflow.
- RTOSes often use memory block pools for deterministic dynamic allocation, which is preferable to shared heaps that can become fragmented.
- Each RTOS object (semaphores, mutexes, queues) has a control block with memory overhead; minimizing their use can save RAM.

6. Some approaches to memory optimization

6.1. DST and fat pointers

References or raw pointers to a DST are fat pointers, and take up 2x more memory than regular pointers (data + vtable).

Trait objects (dyn Trait) are a common DST. They provide dynamic dispatch (runtime method resolution via vtables). Dynamic dispatch via vtables also incurs a small runtime overhead compared to static dispatch.

In resource-constrained embedded systems, the cost of flexibility from using DSTs and fat pointers becomes more obvious. The choice involves a tradeoff between

dyn Trait (dynamic dispatch) - smaller binary but slower runtimme prformance.
impl Trait (static dispatch/monomorphization) - larger binary but faster runtime.

6.2. `Move` vs `Copy`

In Rust, by default values are "moved" when assigned or passed to a function.

For types that own memory on the heap (such as Vec or String), Copy cannot be implemented to prevent accidental deep copies and double frees.
- A move is bassically a shallow bitwise copy, with the compiler statically checking that the original variable cannot be used afterwards.
- Only moves or explicit clone() are allowed.
- A move of this type copies only a group of pointers on the stack (ptr, cap, len), not the contents of the heap.
Copy types (such as primitives or fixed-size arrays) are implicitly copied, and the original variable may be used afterwards.
- For Copy-compatible types, move and copy operations are identical at runtime and are subject to the same optimizations.
- For small types (such as i8), pass-by-value (copy) may be faster than pass-by-reference, due to passing through registers.

Although Copy and Move are often equivalent in performance for small types.
For large fixed-size data structures that could be Copy (e.g. [u8; 1024]), the decision whether to implement Copy directly affects whether a full memcpy will occur on assignment.

6.3. Memory Alignment Considerations

Data alignment ensures that data structures are allocated at memory addresses that are multiples of their size or the specified alignment.

This is critical for performance (CPUs often read data in machine word sized chunks) and correctness (unaligned accesses can throw exceptions or be slower).
The Rust (and LLVM) compilers handle alignment automatically. Struct fields are padded to align with the largest alignment of any of their fields.

sum of sizes of types = 9
sum of size including alignment = 12
denominator of alignment = 4
9\4 = 2.25 to the upper integer = 3
aligned size = 4 * 3 = 12

size_of::<(char, u8, i32)>();   // 12
align_of::<(char, u8, i32)>();  // 4

By deliberately designing structs with alignment in mind (e.g. by ordering fields by size or using #[repr(align)]), to ensure optimal cache usage and single-cycle memory accesses, even though the compiler will eventually align the data anyway.

6.4. Best practices for minimizing memory usage and optimizing performance

Prioritize stack allocation: Prefere stack memory whenever possible, due to its speed and predictability.
Stack depth: Minimize the depth of function stack calls.
Avoid dynamic allocation: If dynamic resizing is required, chose heapless collections or pre-allocate memory once at startup.
RAII principles: Tie allocation and deallocation to the lifetime of objects using RAII and the Drop trait to ensure timely resource release and minimize leaks.
Borrow instead of clone: Prefere borrowing to reduce unnecessary cloning, saving memory and processing power.
Profiling and Testing: Regularly profile memory usage and conduct continuous testing to identify and optimize allocation patterns and leaks.
Static Analysis Tools: Use tools like Clippy to identify common memory issues at compile time.
With alignment in mind: Check data types alignment to ensure consistent and efficient access.

7. Mission Critical Systems and Certification

7.1. Rust for Mission Critical

Traditional approaches to security in C/C++ and coding standards like MISRA and CERT are often involve reactive measures (testing, static analysis, runtime checks) to find vulnerabilities.
Rust, by design, prevents many of these vulnerabilities from occurring. Significant portion of the "rules" are enforced by the compiler itself, potentially reducing the need for extensive manual checks or complex rejection processes.
This proactive security model aligns with government initiatives to use safe-memory languages (e.g., CISA report), positioning Rust as a strategic choice for the future of mission-critical software development.


Aspect	Rust Approach (Built-in Guarantees)	C/C++ Approach (Optional Safeguards)
Memory Safety and Ownership	Guaranteed by compiler ownership model	Manually managed; depends on developer/tooling discipline
Garbage Collector	None (managed at compile time)	None (manually managed)
Buffer Overflow Prevention	Prevented at compile time	Depends on developer discipline, tools, and runtime checks
Use-After-Free Prevention	Guaranteed by ownership model	Depends on developer discipline, tools, and runtime checks
Race Condition Prevention	Enforced by type system (`Send`/`Sync`)	Manually managed thread safety; error prone
Stack and Heap Exploits	Mitigated by structured memory safety mechanisms	Governed by coding conventions and checks
`unsafe` code annotation	Explicitly marked (`unsafe` blocks)	Undefined behavior is governed by coding conventions and checks

7.2. ISome Industry initiatives

Safety Critical Rust Consortium: The Rust Foundation, in partnership with AdaCore, Arm, Ferrous Systems, and Woven by Toyota, formed this consortium to support the responsible use of the Rust language in mission-critical software. Their focus includes developing guidelines, linters, libraries, static analysis tools, formal methods, and language subsets to meet industry and legal requirements.

Ferrocene: A key development is Ferrocene, a qualified open-source Rust compiler toolchain for mission-critical systems.
It is qualified for automotive (ISO 26262 ASIL D), industrial (IEC 61508 SIL 4) and medical (IEC 62304 Class C) design.

Rust is no longer just a promising language — it is rapidly becoming a practical foundation for mission-critical software. With its proactive safety guarantees, strong community initiatives, and growing ecosystem of certified tools, Rust enables developers to build systems that are not only efficient and performant but also fundamentally safer by design.
As industries shift toward more secure and verifiable software, Rust stands out as a forward-looking, standards-aligned choice — ready to meet the demands of tomorrow’s most critical applications.

Building Trustworthy Embedded Systems with Rust