Description
This has been a fun one to track down.
tl;dr: if enabling instruction and/or data caches on your M7 is causing weird crashes, turn on the cortex_m
inline-asm
feature and make sure you're compiling at --release
. Oh, and you're on nightly now. Sorry.
Details
I've got an STM32H7B3 here, and shortly after enabling the icache I would sometimes get hard faults that looked suspiciously like it trying to execute random data. (I'm doing this early enough in boot that I don't have configurable faults broken out, but they were either BF or MMF.)
The key here is sometimes -- some builds worked okay. Until I tried turning on dcache. Then things went squirrely.
It came down to: how many instructions did the compiler decide to insert between turning on the cache and the dsb
/ isb
sequence? If the answer is not zero and contains a branch or memory access, weird things will happen.
The current implementation of enable_icache
reads as follows:
#[inline]
pub fn enable_icache(&mut self) {
// Don't do anything if I-cache is already enabled
if Self::icache_enabled() {
return;
}
// NOTE(unsafe): No races as all CBP registers are write-only and stateless
let mut cbp = unsafe { CBP::new() };
// Invalidate I-cache
cbp.iciallu();
// Enable I-cache
// NOTE(unsafe): We have synchronised access by &mut self
unsafe { self.ccr.modify(|r| r | SCB_CCR_IC_MASK) };
crate::asm::dsb();
crate::asm::isb();
}
The tail end of that function is almost right.
- The
modify
will end in astr
. - The barriers will follow.
However,
- There's no use of
compiler_fence
or equivalent to prevent instructions from being rearranged above the barriers. (Perhaps we can assume this won't happen because they'reextern "C"
? I'm fairly certain I just saw it happen but I didn't save the listing.) dsb
andisb
default to being function calls
This means the actual dynamic instruction trace here is
str r1, [r0, #0]
bl __dsb
dsb sy
bx lr
bl __isb
isb sy
bx lr
In other words, we've got one branch in the pipeline before the write even takes effect, and no fewer than three before the isb
causes a prefetch/pipeline flush. These instructions often execute incorrectly. (On a reasonably fast M7, "often" becomes "always" in my testing.)
Enabling the inline-asm
feature fixes this by changing the dynamic trace to be what you'd expect:
str r1, [r0, #0]
dsb sy
isb sy
This is probably an issue with any existing code attempting to use isb
to synchronize with memory map or cache changes. To do this correctly, one might consider making enable_icache
(and enable_dcache
though it crashes less reliably) contingent on the inline-asm
feature, but that only produces the right instruction sequence at --release
.
These routines probably need to be written in assembly to be correct on stable toolchains (and on debug builds on nightly toolchains, for that matter, to bind the compiler's hands). I haven't thought about the equivalent disable
cases in detail, but they and the clean/invalidate routines probably bear review as well.