APFloat's addOrSubtractSignificand still throws assert in subtler FMA cases. #63895

Closed
@eddyb

#include <math.h>

float foo() {
    return fmaf(
        -0.000000000000000000000000000000000000014728589,
        0.0000037105144,
        0.000000000000000000000000000000000000000000055
    );
}

results in (e.g. via godbolt):

llvm/lib/Support/APFloat.cpp:1805:
  llvm::lostFraction llvm::detail::IEEEFloat::addOrSubtractSignificand(const llvm::detail::IEEEFloat&, bool):
    Assertion `!carry' failed.

However, that's not how the bug was found. I noticed that 8-bit formats (like Float8E5M2 and Float8E4M3FN) had been added to APFloat, and decided to try brute-forcing all possible inputs for a few common operations (it only takes 8 seconds, and ~98% of that is FMA, because it's the only operation with 3 inputs).

So the first example I found was for Float8E4M3FN, namely: FMA(0.0254, 0.0781, -0.00195) (the encoded byte values being 0x0d, 0x1a, 0x81).

Then @solson helped me turn some formula sketches I had made while reading the code into Z3, which confirmed that the example I found is the only one across all possible 8-bit floats, while 16-bit and 32-bit formats have many more cases. (The only reason I'm not including all of that here is that it isn't set up to generate ready-to-use examples; it's really finicky as-is.)


My understanding is that e62555c fixed most of the previously problematic cases, but not the ones where both significands are equal before subtraction; those only work if lost_fraction == lfExactlyZero.

(cc @ekatz)

But if there is a lost fraction, subtracting equal significands fails in either direction. The code seems to rely on the assumption that only equal exponents can result in equal significands, but we know that's not true with FMA.

(Also, the lost fraction is always subtracted, regardless of the subtraction direction. Before e62555c, however, the source of lost_fraction was tied to reverse: a.subtractSignificand(b, lost_fraction != lfExactlyZero) was always performed with lost_fraction coming from lost_fraction = b.shiftSignificandRight(...);, regardless of how a and b mapped to *this and temp_rhs, respectively. This seems significant as well, but I'm not sure how to tell or test it.)
