incorrect run-time feature detection for targets between late-486 and pentium III #572

Open
@gnzlbg

The current run-time feature detection system has undefined behavior for targets between the late 486 and the Pentium III.

In the feature-detection code, we first correctly query the max basic CPUID leaf value, after which we correctly identify early-486 and bail out (https://github.com/rust-lang-nursery/stdsimd/blob/master/stdsimd/arch/detect/os/x86.rs#L69).

Right afterwards (https://github.com/rust-lang-nursery/stdsimd/blob/master/stdsimd/arch/detect/os/x86.rs#L95), we unconditionally query the max extended CPUID leaf value. That leaf is not implemented on all targets: the ones lacking it include later Intel 486, Pentium, Pentium Pro, Pentium II, Celeron, and Pentium III.

AFAICT, doing the query is undefined behavior. If we are lucky, the query just returns 0, and no code related to extended features gets executed.
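To make the "if we are lucky" case less of a gamble, one option is to sanity-check whatever the extended query returns before trusting it. The following is only a sketch, not the code in `x86.rs`: `max_extended_leaf` is a hypothetical helper, the `cpuid` closure stands in for executing the instruction and returning `[EAX, EBX, ECX, EDX]`, and the "reported maximum must lie in `0x8000_0000..=0x8000_FFFF`" range check is my assumption about what a plausible answer looks like, not something taken from the Intel document.

```rust
/// Hypothetical helper (not part of stdsimd): returns the max extended
/// CPUID leaf only if the value reported for leaf 0x8000_0000 looks sane.
///
/// `cpuid` is a stand-in for executing CPUID with the given EAX input
/// and returning the registers as [EAX, EBX, ECX, EDX].
fn max_extended_leaf(cpuid: impl Fn(u32) -> [u32; 4]) -> Option<u32> {
    const EXT_BASE: u32 = 0x8000_0000;
    // Assumed upper bound for a plausible answer; this is a heuristic.
    const EXT_LIMIT: u32 = 0x8000_FFFF;

    let eax = cpuid(EXT_BASE)[0];

    // A processor without extended leaves may return 0 or unrelated data
    // here, so anything outside the extended range is treated as
    // "no extended leaves".
    if (EXT_BASE..=EXT_LIMIT).contains(&eax) {
        Some(eax)
    } else {
        None
    }
}
```

With this shape, the extended-feature code would only run when the helper returns `Some(max)`, and only for leaves up to `max`. It does not address whether issuing the query at all is acceptable on these processors; it only avoids acting on a bogus result.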


We can use the proper identification sequence from the Intel® Processor Identification and the CPUID Instruction to detect these CPU families and avoid detecting and using extended features in them:

To identify the processor using the CPUID instructions, software should follow the following steps.

  1. Determine if the CPUID instruction is supported by modifying the ID flag in the EFLAGS register. If the ID flag cannot be modified, the processor cannot be identified using the CPUID instruction.
  2. Execute the CPUID instruction with EAX equal to 80000000h. CPUID function 80000000h is used to determine if Brand String is supported. If the CPUID function 80000000h returns a value in EAX greater than or equal to 80000004h the Brand String feature is supported and software should use CPUID functions 80000002h through 80000004h to identify the processor.
  3. If the Brand String feature is not supported, execute CPUID with EAX equal to 1. CPUID function 1 returns the processor signature in the EAX register, and the Brand ID in the EBX register bits 0 through 7. If the EBX register bits 0 through 7 contain a non-zero value, the Brand ID is supported. Software should scan the list of Brand IDs (see Table 7-1) to identify the processor.
  4. If the Brand ID feature is not supported, software should use the processor signature (see Table 5-3 and Table 5-4) in conjunction with the cache descriptors (see Section 5.1.3) to identify the processor.
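The quoted steps can be sketched as a small decision function. This is a rough illustration only: `IdMethod` and `identification_method` are names I made up, `has_cpuid` stands for the result of the EFLAGS ID-flag test in step 1, and the `cpuid` closure stands in for executing the instruction and returning `[EAX, EBX, ECX, EDX]`. It follows the steps literally, so it does not look up the Brand ID table, the signature tables, or the cache descriptors.

```rust
/// Hypothetical result type: which identification method the quoted
/// sequence ends up selecting.
#[derive(Debug, PartialEq)]
enum IdMethod {
    /// Step 1 failed: the ID flag in EFLAGS cannot be modified.
    NoCpuid,
    /// Step 2: leaves 0x8000_0002..=0x8000_0004 hold the Brand String.
    BrandString,
    /// Step 3: non-zero Brand ID from EBX bits 0 through 7 of leaf 1.
    BrandId(u8),
    /// Step 4: fall back to the processor signature (EAX of leaf 1).
    Signature(u32),
}

/// Sketch of the quoted sequence. `cpuid` is a stand-in for executing
/// CPUID with the given EAX input.
fn identification_method(has_cpuid: bool, cpuid: impl Fn(u32) -> [u32; 4]) -> IdMethod {
    // Step 1: without CPUID there is nothing to query.
    if !has_cpuid {
        return IdMethod::NoCpuid;
    }

    // Step 2: Brand String is supported if the max extended leaf
    // reported by 0x8000_0000 is at least 0x8000_0004.
    if cpuid(0x8000_0000)[0] >= 0x8000_0004 {
        return IdMethod::BrandString;
    }

    // Step 3: Brand ID lives in EBX bits 0 through 7 of leaf 1.
    let [signature, ebx, _, _] = cpuid(1);
    let brand_id = (ebx & 0xff) as u8;
    if brand_id != 0 {
        return IdMethod::BrandId(brand_id);
    }

    // Step 4: only the processor signature is left.
    IdMethod::Signature(signature)
}
```

For feature detection, the interesting part is that any outcome other than `BrandString` would be a hint that extended leaves should not be relied on, which is what the current code does not check.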

We are already doing something like this to identify buggy CPUs, but not all CPU identification features, such as the Brand String, are available on all architectures, so we should triple-check that we are currently doing things correctly here as well.
