
don't leave parts of the bootloader in the kernel's address space #239

Open
@Freax13


While implementing finer-grained ASLR I came across this comment:

used.entry_state[0] = true; // TODO: Can we do this dynamically?

We mark the first 512 GiB of the address space as unusable for dynamically generated addresses. I think we do this because we identity-map the context-switch code into kernel memory, and this code most likely resides within the first 512 GiB of the address space:
```rust
// identity-map context switch function, so that we don't get an immediate pagefault
// after switching the active page table
let context_switch_function = PhysAddr::new(context_switch as *const () as u64);
let context_switch_function_start_frame: PhysFrame =
    PhysFrame::containing_address(context_switch_function);
for frame in PhysFrame::range_inclusive(
    context_switch_function_start_frame,
    context_switch_function_start_frame + 1,
) {
    match unsafe {
        kernel_page_table.identity_map(frame, PageTableFlags::PRESENT, frame_allocator)
    } {
        Ok(tlb) => tlb.flush(),
        Err(err) => panic!("failed to identity map frame {:?}: {:?}", frame, err),
    }
}
```
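To make the 512 GiB figure concrete, here is a standalone sketch (not bootloader code) of how a level-4 page-table index is derived from an address: each level-4 entry spans 2^39 bytes, so any identity-mapped bootloader address below 512 GiB pins entry 0.

```rust
// Sketch (not bootloader code): each of the 512 level-4 entries covers
// 2^39 bytes = 512 GiB of virtual address space.
fn level_4_index(addr: u64) -> u64 {
    (addr >> 39) & 0o777 // bits 39..48 select the level-4 entry
}

fn main() {
    // A low address, typical for bootloader code (hypothetical value),
    // identity-maps into entry 0:
    assert_eq!(level_4_index(0x10_0000), 0);
    // Only addresses at or above 512 GiB fall into a different entry:
    assert_eq!(level_4_index(1u64 << 39), 1);
}
```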

This causes a number of (admittedly small and unlikely) problems:

  • The identity mapped pages could overlap with the kernel or other mappings
  • We don't expose the identity mapped addresses to the kernel in Mappings
  • An attacker could make use of the identity mapped pages to defeat ASLR
  • We mark a lot of usable memory as unusable, and because of that we can't check for overlaps: there would be too many false positives. We currently just ignore overlaps.

We could probably work around those problems while still mapping parts of the bootloader into the kernel's address space, but I'd like to propose another solution: we use another, very short-lived page table to do the context switch. This page table would only map a few pages containing code that switches to the kernel's page table. Importantly, we would set the page table up in such a way that the kernel's entry point is just after the page-table-switch instruction, so we don't need any code to jump to the kernel; it would simply be the next instruction.
I don't think we could reliably map such code into the bootloader's address space, because we'd have to map the code just before the kernel's entry point, which could be close to the bootloader's code; that's why I want to use a short-lived page table.
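To illustrate the fall-through idea: if the page-table-switch instruction sits immediately before the kernel's entry point, no jump is needed. A sketch of the required address arithmetic (the instruction length and entry address below are hypothetical, not taken from the bootloader):

```rust
// The trampoline would contain just the page-table switch, e.g. on x86_64:
//     mov cr3, rax    ; load the kernel's level-4 table (3 bytes)
// followed directly by the kernel's first instruction.
const SWITCH_CODE_LEN: u64 = 3; // length of `mov cr3, rax` (hypothetical trampoline)

/// Virtual address the switch code must be mapped at so that execution
/// falls through into the kernel entry point.
fn trampoline_addr(kernel_entry: u64) -> u64 {
    kernel_entry - SWITCH_CODE_LEN
}

fn main() {
    let kernel_entry = 0xffff_8000_0000_0000u64; // hypothetical entry point
    let trampoline = trampoline_addr(kernel_entry);
    // the instruction after the switch is exactly the kernel entry point:
    assert_eq!(trampoline + SWITCH_CODE_LEN, kernel_entry);
}
```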

We also identity-map a GDT into the kernel's address space:

```rust
// create, load, and identity-map GDT (required for working `iretq`)
let gdt_frame = frame_allocator
    .allocate_frame()
    .expect("failed to allocate GDT frame");
gdt::create_and_load(gdt_frame);
match unsafe {
    kernel_page_table.identity_map(gdt_frame, PageTableFlags::PRESENT, frame_allocator)
} {
    Ok(tlb) => tlb.flush(),
    Err(err) => panic!("failed to identity map frame {:?}: {:?}", gdt_frame, err),
}
```
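For reference, the GDT loaded here only needs to be valid long enough for `iretq` to work; a minimal flat long-mode GDT looks roughly like this (the descriptor values are the standard flat 64-bit ones, not copied from the bootloader's `gdt` module):

```rust
// Standard flat long-mode descriptors (assumed, not bootloader constants):
const NULL_DESC: u64 = 0;
const KERNEL_CODE: u64 = 0x00af_9b00_0000_ffff; // 64-bit code segment
const KERNEL_DATA: u64 = 0x00cf_9300_0000_ffff; // data segment

/// The L bit (bit 53) marks a descriptor as 64-bit code.
fn is_long_mode_code(desc: u64) -> bool {
    (desc >> 53) & 1 == 1
}

fn main() {
    let gdt = [NULL_DESC, KERNEL_CODE, KERNEL_DATA];
    assert!(is_long_mode_code(gdt[1]));
    assert!(!is_long_mode_code(gdt[2]));
}
```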

We should probably make the GDT's location configurable and expose it in Mappings.
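Making the location configurable could follow the pattern of the existing mapping options. A hypothetical sketch (the `gdt` field and this simplified `Mapping` enum are assumptions, not the current `Mappings` API):

```rust
// Simplified stand-in for the bootloader's mapping options (assumed shape):
#[derive(Debug, PartialEq)]
enum Mapping {
    Dynamic,           // bootloader picks a (randomized) address
    FixedAddress(u64), // user-chosen virtual address
}

// Hypothetical: `Mappings` grows a `gdt` field, exposed to the kernel.
struct Mappings {
    gdt: Mapping,
}

fn main() {
    let mappings = Mappings { gdt: Mapping::FixedAddress(0x1000) };
    assert_eq!(mappings.gdt, Mapping::FixedAddress(0x1000));
}
```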

I'd be happy to work on a PR for this.
