AVR: interrupt code broken by unused alloca #75504

Closed
@couchand

I've run into an issue with code that uses interrupts on AVR. I'm not sure if this applies to any other platform.

I've attempted to minimize the example, but there's still a fair amount here, largely because I'm having trouble coming up with good, simple ways to observe the chip's behavior. It's written for the ATmega2560 (and specifically the Arduino Mega2560 board), but should be easily adapted to other AVR MCUs.

This code works fine, flashing the built-in LED on my Arduino Mega2560:

#![no_std]
#![no_main]
#![feature(abi_avr_interrupt, llvm_asm)]

extern crate panic_halt;
use core::cell::UnsafeCell;
use avr_device::atmega2560;

static FAIL: RacyUnsafeCell<Result<(), ()>> = RacyUnsafeCell::new(Err(()));

#[repr(transparent)]
pub struct RacyUnsafeCell<T>(UnsafeCell<T>);

unsafe impl<T: Sync> Sync for RacyUnsafeCell<T> {}

impl<T> RacyUnsafeCell<T> {
    pub const fn new(x: T) -> Self {
        RacyUnsafeCell(UnsafeCell::new(x))
    }

    pub fn get(&self) -> *mut T {
        self.0.get()
    }
}

#[no_mangle]
pub unsafe extern "avr-interrupt" fn __vector_9() {
    match *FAIL.get() {
        Ok(t) => t,
        Err(_) => unwrap_failed(&()),
    }
}

fn unwrap_failed(error: &dyn core::fmt::Debug) -> ! {
    panic!("{:?}", error)
}

#[export_name = "main"]
pub extern "C" fn main() -> ! {
    unsafe {
        *FAIL.get() = Ok(());
    }

    let dp = atmega2560::Peripherals::take().unwrap();

    // output for LED & PCINT0
    dp.PORTB.ddrb.write(|w| unsafe { w.bits(0b1000_0001) });
    dp.PORTB.portb.write(|w| unsafe { w.bits(0b1000_0001) });

    // enable PCINT0
    dp.EXINT.pcicr.write(|w| w.pcie().bits(1));
    dp.EXINT.pcmsk0.write(|w| w.pcint().bits(1));

    unsafe {
        llvm_asm!("sei" :::: "volatile");
    }

    loop {
        for _ in 0..50 {
            let mut u = 0xffff;
            unsafe {
                llvm_asm!(
                    "1: sbiw $0,1\n\tbrne 1b"
                    : "=w"(u)
                    : "0"(u)
                    :
                    : "volatile"
                );
            }
        }

        dp.PORTB.portb.write(|w| unsafe { w.bits(0) });

        for _ in 0..50 {
            let mut u = 0xffff;
            unsafe {
                llvm_asm!(
                    "1: sbiw $0,1\n\tbrne 1b"
                    : "=w"(u)
                    : "0"(u)
                    :
                    : "volatile"
                );
            }
        }

        dp.PORTB.portb.write(|w| unsafe { w.bits(0b1000_0001) });
    }
}

Making a single change to unwrap the actual error causes it to fail. It now turns the LED on once and then makes no further discernible progress. Notably, that means it's not even reaching the first update to the PCINT0 pin, which is mystifying.

$ diff examples/min-repro-working.rs examples/min-repro-broken.rs 
30c30
<         Err(_) => unwrap_failed(&()),
---
>         Err(e) => unwrap_failed(&e),

The function unwrap_failed should never be called, since interrupts are not enabled until after Ok(()) is written to the static. (RacyUnsafeCell is used here just to prove that there are no shenanigans.)
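The reasoning above can be sanity-checked on a host machine. The following is only a sketch: it reuses RacyUnsafeCell and FAIL from the repro, while `isr_would_fail` and `init` are hypothetical stand-ins for the ISR body and the first statement of main. It runs on the host target, so it exercises none of the AVR code generation and cannot reproduce the failure itself; it only shows that, at the source level, the Err arm is not taken once Ok(()) has been stored.

```rust
use core::cell::UnsafeCell;

#[repr(transparent)]
pub struct RacyUnsafeCell<T>(UnsafeCell<T>);

unsafe impl<T: Sync> Sync for RacyUnsafeCell<T> {}

impl<T> RacyUnsafeCell<T> {
    pub const fn new(x: T) -> Self {
        RacyUnsafeCell(UnsafeCell::new(x))
    }

    pub fn get(&self) -> *mut T {
        self.0.get()
    }
}

static FAIL: RacyUnsafeCell<Result<(), ()>> = RacyUnsafeCell::new(Err(()));

/// Hypothetical stand-in for the ISR body: reports which match arm
/// would run instead of calling unwrap_failed.
fn isr_would_fail() -> bool {
    // Same shape as the broken variant: the error is bound by name.
    match unsafe { *FAIL.get() } {
        Ok(()) => false,
        Err(_e) => true,
    }
}

/// Hypothetical stand-in for the first statement of `main` in the repro.
fn init() {
    unsafe {
        *FAIL.get() = Ok(());
    }
}

fn main() {
    println!("error arm taken before init: {}", isr_would_fail());
    init();
    println!("error arm taken after init: {}", isr_would_fail());
}
```

Since the repro only executes sei after the equivalent of `init`, the ISR should always observe Ok(()), which is why the hang is surprising.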

I don't have any fancy in-circuit debugging tools, so it's hard for me to know what's happening on the metal, but I do have diff, which I've run against the LLVM IR from these two programs. As far as I can tell, the only meaningful difference is that the broken program issues an alloca at the start of the ISR:

; Function Attrs: nofree norecurse nounwind optsize
define avr_signalcc  void @__vector_9() unnamed_addr addrspace(1) #0 {
start:
  %e = alloca {}, align 1
  %.b = load i1, i1* @_ZN16min_repro_broken4FAIL17h1f0c05d9b688f3e0E.0.0, align 1
  br i1 %.b, label %bb4, label %bb2

bb2:                                              ; preds = %start
; call min_repro_broken::unwrap_failed
  call fastcc addrspace(1) void @_ZN16min_repro_broken13unwrap_failed17h382387c4357ef48cE({}* nonnull align 1 %e)
  unreachable

bb4:                                              ; preds = %start
  ret void
}

versus

; Function Attrs: nofree norecurse nounwind optsize
define avr_signalcc  void @__vector_9() unnamed_addr addrspace(1) #0 {
start:
  %.b = load i1, i1* @_ZN17min_repro_working4FAIL17h3f0200d99301bf56E.0.0, align 1
  br i1 %.b, label %bb4, label %bb2

bb2:                                              ; preds = %start
; call min_repro_working::unwrap_failed
  tail call fastcc addrspace(1) void @_ZN17min_repro_working13unwrap_failed17h0e47ae32571d11baE()
  unreachable

bb4:                                              ; preds = %start
  ret void
}
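One plausible reading of the IR difference, offered as an assumption rather than a confirmed diagnosis: in the working variant `&()` borrows a constant expression, which Rust can promote to a 'static reference, so the ISR needs no stack slot. In the broken variant `e` is a named local, so `&e` needs an address in the ISR's frame even though the type is zero-sized, which matches `alloca {}, align 1`. The host-side sketch below illustrates only that language-level distinction; `promoted` and `format_local` are hypothetical names, not part of the repro.

```rust
use core::fmt::Debug;
use core::mem::{align_of, size_of};

/// Mirrors the working variant: `&()` borrows a constant expression,
/// so the reference can be promoted to 'static and outlive the call.
fn promoted() -> &'static () {
    &()
}

/// Mirrors the broken variant: `e` is a local, so `&e` borrows a slot
/// in this function's frame and cannot escape it.
fn format_local(e: ()) -> String {
    let r: &dyn Debug = &e;
    format!("{:?}", r)
}

fn main() {
    // The unit type is zero-sized with alignment 1, like `{}` in the IR.
    println!("size = {}, align = {}", size_of::<()>(), align_of::<()>());

    let p: &dyn Debug = promoted();
    println!("promoted: {:?}", p);
    println!("local:    {}", format_local(()));
}
```

If this reading is right, the alloca reserves no bytes, so any misbehavior would have to come from how the backend sets up the ISR's frame when an alloca is present, not from the data itself. That part is untested here.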

I've pretty much run out of places I know to look for what could be causing this issue, or ways to productively continue investigating. Any pointers would be greatly appreciated!

My nightly was previously a few weeks old; I've updated to the latest and the issue persists.


Labels: C-bug, O-AVR, T-compiler, requires-nightly
