Description
The contract for `delay` currently states:

> Blocks the program for at least `n` instruction cycles

The current implementation is:

```rust
llvm_asm!("1:
      nop
      subs $0, $$1
      bne.n 1b"
     : "+r"(_n / 4 + 1)
     :
     : "cpsr"
     : "volatile");
```

This is approximately an assembly gloss of the code:

```rust
let mut i = n / 4 + 1;
loop {
    i -= 1;
    if i == 0 { break }
}
```

Why the divide-by-four? Well, the code appears to be assuming that the minimum time the instruction sequence could take is four cycles per iteration, and so it divides by four to minimize the amount of over-delaying. That cycle count is approximately accurate on M3/M4 (it's an underestimate in practice, even in zero-wait-state RAM), but it is wrong on M7 or any other part with any combination of nop-folding, branch prediction, and multiple issue. As a result, on such CPUs this function delays for substantially fewer cycles than advertised -- by roughly a factor of 2, depending on cache and memory system configuration, branch alignment, etc. (It also appears to be wrong on M33, but M33 is not superscalar, so it's probably just an instruction timing issue.)
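
For a rough sense of scale, here is a back-of-the-envelope calculation; the per-iteration cycle costs are illustrative assumptions, not measurements of any particular part:

```rust
// Back-of-the-envelope only: the per-iteration costs below are assumed for illustration.
fn main() {
    let n: u32 = 1_000;          // requested delay, in cycles
    let iterations = n / 4 + 1;  // iteration count the current implementation runs

    // If the body (nop + subs + taken bne.n) costs about 4 cycles, as on M3/M4,
    // the delay roughly meets the contract.
    let m3_like = iterations * 4; // ~1004 cycles

    // If nop-folding and dual issue shrink the body to about 2 cycles, as on M7,
    // the delay comes in at roughly half of what was requested.
    let m7_like = iterations * 2; // ~502 cycles

    println!("requested {}, M3-ish {}, M7-ish {}", n, m3_like, m7_like);
}
```
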
As CPU core complexity grows, hand-tuned cycle-delay loops need to be specialized for the particular microarchitecture being used, much like in the 486/Pentium era. Alternatively, you could use a conservative implementation that, when asked to delay for at least `n` cycles, repeats the loop `n` times -- that should be safe for now, but I wouldn't recommend it, particularly because we've got the zero-overhead-loop extension to ARMv8-M on the horizon.
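
To make that concrete, here is a minimal sketch of what such a conservative variant could look like; `delay_conservative` is a hypothetical name rather than anything in the crate, and it assumes stable inline `asm!` is available:

```rust
use core::arch::asm;

/// Hypothetical conservative delay: one loop iteration per requested cycle.
/// Each iteration costs at least one cycle on any core, so this can only
/// over-delay, never under-delay, at the price of running several times
/// longer than necessary on simple, in-order parts.
pub fn delay_conservative(n: u32) {
    for _ in 0..n {
        // A volatile nop keeps the compiler from deleting the empty loop.
        unsafe { asm!("nop", options(nomem, nostack, preserves_flags)) };
    }
}
```
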
Shout out to @labbott for noticing this.