Description
Cortex-M55 (or Cortex-M33 I assume)
with -mcmse -gdwarf-4 -fomit-frame-pointer
The DWARF FDE for a function with attribute __attribute__((cmse_nonsecure_entry))
that calls via a function pointer with attribute __attribute__((cmse_nonsecure_call))
does not describe all movements to register SP (stack pointer) which is the Canonical Frame Address
register (Dwarf_regnum_13) in the CIE.
This means it is not possible for a debugger to correctly unwind and show the call stack from callee to caller
(and beyond) because it looks in (Secure) stack memory for the pushed LR (link register) contents at the CFA offset the
DWARF describes, which is incorrect.
Compile with
clang -c -march=armv8.1m.main+mve -target arm-none-eabi -Os -mcmse -gdwarf-4 -fomit-frame-pointer -mfloat-abi=hard -o sec_to_nonsec_call.o sec_to_nonsec_call.c
(or -mcpu=cortex-m55, maybe also -I or -isystem for the path to arm_cmse.h header with a strange install)
sec_to_nonsec_call.c:
#include <arm_cmse.h>
#define CALL_ATTRIB __attribute__((cmse_nonsecure_call))
#define ENTRY_ATTRIB __attribute__((cmse_nonsecure_entry))
typedef void (*NS_ENTRY_POINT)(unsigned int) CALL_ATTRIB;
volatile NS_ENTRY_POINT nonsec_func = 0;
ENTRY_ATTRIB
void my_sec_func(unsigned int a_param)
{
if (nonsec_func != 0)
{
nonsec_func(a_param);
}
}
then disassemble
llvm-objdump -d --mcpu=cortex-m55 sec_to_nonsec_call.o > sec_to_nonsec_call.o.disasm.txt
and decode the DWARF
llvm-objdump --dwarf=frames --mcpu=cortex-m55 sec_to_nonsec_call.o > sec_to_nonsec_call.o.objdump_frames.txt
[ or in newer llvm versions
llvm-dwarfdump.exe --debug-frame --mcpu=cortex-m55 sec_to_nonsec_call.o > sec_to_nonsec_call.o.dwarfdump_frames.txt
]
See there is no corresponding DW_CFA_def_cfa_offset:+32 and DW_CFA_def_cfa_offset:+136 (== 0x88)
(CIE Data alignment is -4)
for instructions that move the SP register matching the function call 'nonsec_func(a_param);'
push.w {r4, r5, r6, r7, r8, r9, r10, r11}
bic r1, r1, #0x1
sub sp, #0x88
vlstm sp
clrm {r2, r3, r4, r5, r6, r7, r8, r9, r10, r11, r12, apsr}
blxns r1
The problem can be more easily observed here
Compiler Explorer 9Yx3c7hnT
Looking at the directives in the emitted assembler, neither the
push.w {r4, r5, r6, r7, r8, r9, r10, r11}
nor the
sub sp, #136
- which both move the CFA/SP to lower memory addresses -
have matching directives, where one could reasonably expect
.cfi_def_cfa_offset 32 ;// 8*4
and
.cfi_def_cfa_offset 136 ;// 34 *4
respectively.
This is using trunk (as I haven't tried an llvm 19.1.0 rc1 yet).
It also reproduces with llvm 17.0.6 and llvm 18.1.8.
Using Compiler Explorer it also exists for versions back to llvm armv7a-clang 11.0.0
Version 10.0.1 does not support the cmse attributes.
Using -mfix-cmse-cve-2021-35465 for VLLDM, where supported, makes no difference as that is the restore part of the code.
I believe the bug is caused by ARMExpandPseudoInsts.cpp
ARMExpandPseudoInsts.cpp line 1555ff
where function ARMExpandPseudo::CMSEPushCalleeSaves()
uses ARM::t2STMDB_UPD
and also where function ARMExpandPseudo::CMSESaveClearFPRegsV81() makes space for the later VLSTM
near
BuildMI(MBB, MBBI, DL, TII->get(ARM::tSUBspi), ARM::SP)
.addReg(ARM::SP)
.addImm(CMSE_FP_SAVE_SIZE >> 2)
.add(predOps(ARMCC::AL));
does not use sequences to add Call Frame Information
(as used by ARMBaseInstrInfo.cpp line 6468ff
)
MachineFunction &MF = *MBB.getParent();
int64_t StackPosEntry =
MF.addFrameInst(MCCFIInstruction::cfiDefCfaOffset(nullptr, new_8byte_aligned_offset));
BuildMI(MBB, It, DebugLoc(), get(ARM::CFI_INSTRUCTION))
.addCFIIndex(StackPosEntry)
.setMIFlags(MachineInstr::FrameSetup);
Similarly ARMExpandPseudo::CMSERestoreFPRegsV81() does not adjust the CFA, despite the SP register moving back to higher memory addresses, where I'd expect similar statements but with negative value and MachineInstr::FrameDestroy.
CMSEPopCalleeSaves() also makes no CFA adjustments.
Note there are also ARMExpandPseudo::CMSESaveClearFPRegsV8() and CMSERestoreFPRegsV8() similarly lacking.
PARTIAL WORKAROUND?
Remove -fomit-frame-pointer (or use {}-fno-omit-frame-pointer{}).
This means the SP is copied to register r7 and a .setfp r7 directive
.setfp r7, sp
mov r7, sp
.cfi_def_cfa_register r7
This in turn leads to the corresponding DWARF changing the CFA base register with
DW_CFA_def_cfa_register: reg7
so adjustments to SP by the setup code are of course possible and the link register's value can be found in memory, in theory...
BUT the later CLRM wipes out the R7 contents in order to leak nothing at the blxns
transition from secure to none-secure.
I think this is just as Requirement 54 of
[https://github.com/ARM-software/acle/releases/download/r2024Q2/cmse-1.4.pdf]
expects.
So, sadly, it is not even a partial workaround.
Moreover, use of framepointer grows code size so it cannot be a project default.
This being for embedded target we need to be able to debug exactly what we will ship.
Our debugger is able to recognise the FNC_RETURN pattern created by blxns instruction and has privileged access to the banked Secure stack memory in house.
And there's no requirement to clear register SP because it is banked, so does not leak information as section 4.3.1 'Information leakage' of cmse-1.4 describes.
Indeed for none secure calls with stack based arguments or return value the toolchain needs to take significant steps to validate and copy these from one stack to another.
Metadata
Metadata
Assignees
Type
Projects
Status