-
Notifications
You must be signed in to change notification settings - Fork 13.6k
LangRef: Clarify llvm.minnum and llvm.maxnum about sNaN and signed zero #112852
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-selectiondag @llvm/pr-subscribers-llvm-ir Author: YunQiang Su (wzssyqa) ChangesThe documents claims that it ignores sNaN, while in the code it may be different.
Since we have introduced llvm.minimumnum and llvm.maximumnum, which follow IEEE 754-2019's minimumNumber/maximumNumber. So, it's time for use to clarify llvm.minnum and llvm.maxnum. Let's define it to the libc's defination. Full diff: https://github.com/llvm/llvm-project/pull/112852.diff 1 Files Affected:
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 566d0d4e4e81a3..247d520ac0997b 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -16464,21 +16464,13 @@ type.
Semantics:
""""""""""
+Follows the IEEE754 2008 semantics for minNum, except for handling of
++0.0 vs -0.0. This match's the behavior of libm's fmin.
-Follows the IEEE-754 semantics for minNum, except for handling of
-signaling NaNs. This match's the behavior of libm's fmin.
-
-If either operand is a NaN, returns the other non-NaN operand. Returns
-NaN only if both operands are NaN. If the operands compare equal,
-returns either one of the operands. For example, this means that
-fmin(+0.0, -0.0) returns either operand.
-
-Unlike the IEEE-754 2008 behavior, this does not distinguish between
-signaling and quiet NaN inputs. If a target's implementation follows
-the standard and returns a quiet NaN if either input is a signaling
-NaN, the intrinsic lowering is responsible for quieting the inputs to
-correctly return the non-NaN input (e.g. by using the equivalent of
-``llvm.canonicalize``).
+If either operand is a qNaN, returns the other non-NaN operand. Returns
+NaN only if both operands are NaN or either operand is sNaN.
+If the operands compare equal, returns either one of the operands.
+For example, this means that fmin(+0.0, -0.0) returns either operand.
.. _i_maxnum:
@@ -16515,20 +16507,13 @@ type.
Semantics:
""""""""""
-Follows the IEEE-754 semantics for maxNum except for the handling of
-signaling NaNs. This matches the behavior of libm's fmax.
+Follows the IEEE754 2008 semantics for maxNum, except for handling of
++0.0 vs -0.0. This match's the behavior of libm's fmax.
If either operand is a NaN, returns the other non-NaN operand. Returns
-NaN only if both operands are NaN. If the operands compare equal,
-returns either one of the operands. For example, this means that
-fmax(+0.0, -0.0) returns either -0.0 or 0.0.
-
-Unlike the IEEE-754 2008 behavior, this does not distinguish between
-signaling and quiet NaN inputs. If a target's implementation follows
-the standard and returns a quiet NaN if either input is a signaling
-NaN, the intrinsic lowering is responsible for quieting the inputs to
-correctly return the non-NaN input (e.g. by using the equivalent of
-``llvm.canonicalize``).
+NaN only if both operands are NaN or either operand is sNaN.
+If the operands compare equal, returns either one of the operands.
+For example, this means that fmin(+0.0, -0.0) returns either operand.
.. _i_minimum:
@@ -19407,12 +19392,12 @@ The '``llvm.vector.reduce.fmax.*``' intrinsics do a floating-point
matches the element-type of the vector input.
This instruction has the same comparison semantics as the '``llvm.maxnum.*``'
-intrinsic. That is, the result will always be a number unless all elements of
-the vector are NaN. For a vector with maximum element magnitude 0.0 and
-containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
+intrinsic. If the intrinsic call has the ``nnan`` fast-math flag, then the
+operation can assume that NaNs are not present in the input vector.
-If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
-assume that NaNs are not present in the input vector.
+It is deprecated, since the different order of inputs may produce different
+outputs, and it is hard to optimize with Vector or SIMD extensions.
+Use '``llvm.vector.reduce.fmaximum``' or '``llvm.vector.reduce.fmaximumnum``' instead.
Arguments:
""""""""""
@@ -19440,12 +19425,12 @@ The '``llvm.vector.reduce.fmin.*``' intrinsics do a floating-point
matches the element-type of the vector input.
This instruction has the same comparison semantics as the '``llvm.minnum.*``'
-intrinsic. That is, the result will always be a number unless all elements of
-the vector are NaN. For a vector with minimum element magnitude 0.0 and
-containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
+intrinsic. If the intrinsic call has the ``nnan`` fast-math flag, then the
+operation can assume that NaNs are not present in the input vector.
-If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
-assume that NaNs are not present in the input vector.
+It is deprecated, since the different order of inputs may produce different
+outputs, and it is hard to optimize with Vector or SIMD extensions.
+Use '``llvm.vector.reduce.fminimum``' or '``llvm.vector.reduce.fminimumnum``' instead.
Arguments:
""""""""""
@@ -22994,13 +22979,14 @@ result type. If only ``nnan`` is set then the neutral value is ``-Infinity``.
This instruction has the same comparison semantics as the
:ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>` intrinsic (and thus the
-'``llvm.maxnum.*``' intrinsic). That is, the result will always be a number
-unless all elements of the vector and the starting value are ``NaN``. For a
-vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
-``-0.0`` elements, the sign of the result is unspecified.
+'``llvm.maxnum.*``' intrinsic).
To ignore the start value, the neutral value can be used.
+It is deprecated, since the different order of inputs may produce different
+outputs, and it is hard to optimize with Vector or SIMD extensions.
+Use '``llvm.vp.vector.reduce.fmaximum``' or '``llvm.vp.vector.reduce.fmaximumnum``' instead.
+
Examples:
"""""""""
@@ -23064,13 +23050,14 @@ result type. If only ``nnan`` is set then the neutral value is ``+Infinity``.
This instruction has the same comparison semantics as the
:ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>` intrinsic (and thus the
-'``llvm.minnum.*``' intrinsic). That is, the result will always be a number
-unless all elements of the vector and the starting value are ``NaN``. For a
-vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
-``-0.0`` elements, the sign of the result is unspecified.
+'``llvm.minnum.*``' intrinsic).
To ignore the start value, the neutral value can be used.
+It is deprecated, since the different order of inputs may produce different
+outputs, and it is hard to optimize with Vector or SIMD extensions.
+Use '``llvm.vp.vector.reduce.fminimum``' or '``llvm.vp.vector.reduce.fminimumnum``' instead.
+
Examples:
"""""""""
|
See: llvm#112852 We have reclarify llvm.maxnum and llvm.minnum to follow libc's fmax(3) and fmin(3). So let's make APFloat::maxnum and APFloat::minnum to follow IEEE-754 2008's maxNum and minNum. maxNum is a more restircted version of fmax(3) with -0.0 < + 0.0. minNum is a more restircted version of fmin(3) with -0.0 < + 0.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced this change is a good idea, as opposed to fixing the backends that implement it differently. LLVM has a general principle that sNaNs are treated more or less equivalent to qNaN, and this is the main case where sNaN inputs do very different things from qNaN inputs. The original text is pretty explicit that targets are supposed to use llvm.canonicalize
to handle sNaN quieting if they're mapping to hardware that does distinguish between sNaN/qNaN, and I don't see any justification in this PR yet for why doing that is problematic.
Yes. We should fix the backends, while before we fix them, we need a standard that what extractly to do. For
As users may read manual from libc, where states that fmax(N, sNaN) => NaN, while we (llvm) produces N. |
I think we definitely need to do this. Taking the name of the IEEE operations and not implementing the same behavior is not an acceptable place to be. We will simplify the code, and save instructions by doing this. We now have minimumnum/maximumnum to provide the ignore-signalingness behavior.
Yes, the IEEE behavior absurdly inverts the behavior for signaling nans so we do not have the luxury of simply ignoring them like in every other operation.
This was me working around this absolute nonsense situation. The only reason it is this way is because of the OpenCL conformance test checked for signaling nans treated as quiet nans, and gnu libm's fmin/fmax had the ignore signaling behavior at the time. libm has been fixed, and OpenCL has a footnote on fmin/fmax that signaling nans "may" behave as quiet nans. The codegen for these was never consistent across targets. We never considered the host library versions for the buggy libm, and until recently I believe AMDGPU was the only target bothering with the canonicalize + ieee variant lowering. We have been wasting cycles on this for years, for no real reason. Signaling nans are not used in the real world. We should be able to just directly select this to the instruction that implements the IEEE 2008 behavior. |
See: llvm#112852 We will define llvm.minnum and llvm.maxnum with +0.0>-0.0, by default, while libc doesn't require it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add some mention of the interaction with FP arithmetic. (In particular, that arithmetic on an sNaN doesn't consistently produce a qNaN, so arithmetic feeding into a minnum/maxnum can produce inconsistent results.)
qNaN is always produced in the following cases:
Yes, for older libc implementations (2.24-) qNaN vs sNaN may produce sNaN, which I have mentioned. |
Say you write |
done |
See: llvm#112852 We will define llvm.minnum and llvm.maxnum with +0.0>-0.0, by default, while libc doesn't require it.
See: llvm#112852 We have reclarify llvm.maxnum and llvm.minnum to follow libc's fmax(3) and fmin(3). So let's make APFloat::maxnum and APFloat::minnum to follow IEEE-754 2008's maxNum and minNum. maxNum is a more restircted version of fmax(3) with -0.0 < + 0.0. minNum is a more restircted version of fmin(3) with -0.0 < + 0.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR as it stands has numerous issues that need to be addressed.
At this point, we are now changing the behavior of the intrinsic with respect to signed zero handling, and changing the behavior with respect to sNaN. This means that the commit message is no longer accurate and needs to be adjusted.
The deeper problem is the text. The edge case behavior of these intrinsics is confusing, and it's not helped by the fact that the source materials tend to bury the behavior in the important edge cases so that you need to collate normative text from several different places to even understand what IEEE 754:2008's rules for minNum or C23's rules for fmin are. Because of that, the semantics really need to be upfront and clear about what the edge-case behavior is, and the history of changes so far have served to bury those details rather than illuminate them.
An example opening paragraph that would work better is this:
Follows the IEEE-754 semantics for minNum, except that -0.0 < +0.0 for the purposes of this intrinsic. As for signaling NaNs, per the IEEE-754 semantics, if either operand is an sNaN, the result is always a qNaN. This matches the recommended behavior for the libm function
fmin
, although not all implementations have implemented these recommended behaviors.
After that opening paragraph, I think it's useful to keep the original second paragraph, except modified slightly to account for the changes to sNaN and -0.0/+0.0 semantics. And only after that does it make sense to start to discuss the interactions with nsz
, LLVM's general sNaN-quieting rules, and the caution about implementing it properly.
(It also might be worth adding notes to the minnum/maxnum intrinsics that mininum/minimumnum intrinsics should probably be preferred for their non-distinction of sNaN/qNaN handling).
With llvm#112852, we claimed that llvm.minnum and llvm.maxnum should treat +0.0>-0.0, while libc doesn't require fmin(3)/fmax(3) for it. To make llvm.minnum/llvm.maxnum easy to use, we define the builtin functions for them, include __builtin_minnum __builtin_elementwise_minnum __builtin_minnum __builtin_elementwise_minnum __builtin_minnum __builtin_elementwise_minnum __builtin_minnum __builtin_maxnum __builtin_elementwise_maxnum __builtin_maxnum __builtin_elementwise_maxnum __builtin_maxnum __builtin_elementwise_maxnum __builtin_maxnum All of them support _Float16, float, double, long double.
In llvm#112852, we claimed that llvm.minnum and llvm.maxnum should treat +0.0>-0.0, while libc doesn't require fmin(3)/fmax(3) for it. Let's add FMFSource parameter to CreateMaxNum and CreateMinNum, so that it can be used by CodeGenFunction::EmitBuiltinExpr of Clang.
…ro (llvm#112852) The documents claims that it ignores sNaN, while in the current code it may be different. - as the finally callback, it use libc call fmin(3)/fmax(3). while C23 clarifies that fmin(3)/fmax(3) should return NaN for sNaN vs NUM. - on some architectures, such as aarch64, it converts to `fmaxnm`, which returns qNaN for sNaN vs NUM. - on RISC-V (SPEC 2019+), it converts to `fmax`, which returns NUM for sNaN vs NUM. Since we have introduced llvm.minimumnum and llvm.maximumnum, which follow IEEE 754-2019's minimumNumber/maximumNumber. So, it's time for us to clarify llvm.minnum and llvm.maxnum. Since the final fallback of llvm.minnum and llvm.maxnum is fmin(3)/fmax(3), so that it is reasonable to follow the behaviors of fmin(3)/fmax(3). Although C23 clarified the behavior about sNaN and +0.0/-0.0: (NUM or NaN) vs sNaN -> qNaN +0.0 vs -0.0 -> either one of +0.0/-0.0 It is the same the IEEE754-2008's maxNUM and minNUM. Not all implementation work as expected. Since some architectures such as aarch64/MIPSr6/LoongArch, have instructions that implements +0.0>-0.0. So Let's define llvm.minnum and llvm.maxnum to IEEE754-2008 with +0.0>-0.0. The architectures without such instructions can implements `NSZ` flavor to speed up, and the frontend, such as clang, can call them with `nsz` attribute.
See: llvm#112852 Fixes: llvm#111991 We have reclarify llvm.maxnum and llvm.minnum to follow IEEE-754 2008's maxNum and minNum with +0.0>-0.0. So let's make APFloat::maxnum and APFloat::minnum to follow it, too.
) In #112852, we claimed that llvm.minnum and llvm.maxnum should treat +0.0>-0.0, while libc doesn't require fmin(3)/fmax(3) for it. Let's add FMFSource parameter to CreateMaxNum and CreateMinNum, so that they can be used by CodeGenFunction::EmitBuiltinExpr of Clang.
…inNum (#129173) In llvm/llvm-project#112852, we claimed that llvm.minnum and llvm.maxnum should treat +0.0>-0.0, while libc doesn't require fmin(3)/fmax(3) for it. Let's add FMFSource parameter to CreateMaxNum and CreateMinNum, so that they can be used by CodeGenFunction::EmitBuiltinExpr of Clang.
With llvm#112852, we claimed that llvm.minnum and llvm.maxnum should treat +0.0>-0.0, while libc doesn't require fmin(3)/fmax(3) for it. To make llvm.minnum/llvm.maxnum easy to use, we define the builtin functions for them, include __builtin_minnum __builtin_elementwise_minnum __builtin_minnum __builtin_elementwise_minnum __builtin_minnum __builtin_elementwise_minnum __builtin_minnum __builtin_maxnum __builtin_elementwise_maxnum __builtin_maxnum __builtin_elementwise_maxnum __builtin_maxnum __builtin_elementwise_maxnum __builtin_maxnum All of them support _Float16, float, double, long double.
See: llvm#112852 We will define llvm.minnum and llvm.maxnum with +0.0>-0.0, by default, while libc doesn't require it.
See: llvm#112852 We will define llvm.minnum and llvm.maxnum with +0.0>-0.0, by default, while libc doesn't require it.
With llvm#112852, we claimed that llvm.minnum and llvm.maxnum should treat +0.0>-0.0, while libc doesn't require fmin(3)/fmax(3) for it. To make llvm.minnum/llvm.maxnum easy to use, we define the builtin functions for them, include __builtin_minnum __builtin_elementwise_minnum __builtin_minnum __builtin_elementwise_minnum __builtin_minnum __builtin_elementwise_minnum __builtin_minnum __builtin_maxnum __builtin_elementwise_maxnum __builtin_maxnum __builtin_elementwise_maxnum __builtin_maxnum __builtin_elementwise_maxnum __builtin_maxnum All of them support _Float16, float, double, long double.
…#129173) In llvm#112852, we claimed that llvm.minnum and llvm.maxnum should treat +0.0>-0.0, while libc doesn't require fmin(3)/fmax(3) for it. Let's add FMFSource parameter to CreateMaxNum and CreateMinNum, so that they can be used by CodeGenFunction::EmitBuiltinExpr of Clang.
With llvm#112852, we claimed that llvm.minnum and llvm.maxnum should treat +0.0>-0.0, while libc doesn't require fmin(3)/fmax(3) for it. To make llvm.minnum/llvm.maxnum easy to use, we define the builtin functions for them, include __builtin_minnum __builtin_elementwise_minnum __builtin_minnum __builtin_elementwise_minnum __builtin_minnum __builtin_elementwise_minnum __builtin_minnum __builtin_maxnum __builtin_elementwise_maxnum __builtin_maxnum __builtin_elementwise_maxnum __builtin_maxnum __builtin_elementwise_maxnum __builtin_maxnum All of them support _Float16, float, double, long double.
With #112852, we claimed that llvm.minnum and llvm.maxnum should treat +0.0>-0.0, while libc doesn't require fmin(3)/fmax(3) for it. To make llvm.minnum/llvm.maxnum easy to use, we define the builtin functions for them, include __builtin_elementwise_minnum __builtin_elementwise_maxnum All of them support _Float16, __bf16, float, double, long double.
…29207) With llvm/llvm-project#112852, we claimed that llvm.minnum and llvm.maxnum should treat +0.0>-0.0, while libc doesn't require fmin(3)/fmax(3) for it. To make llvm.minnum/llvm.maxnum easy to use, we define the builtin functions for them, include __builtin_elementwise_minnum __builtin_elementwise_maxnum All of them support _Float16, __bf16, float, double, long double.
See: llvm#112852 We will define llvm.minnum and llvm.maxnum with +0.0>-0.0, by default, while libc doesn't require it. fix testcases -ffp-exception-behavior=strict add missing builtin test test auto vectorize fix test cases update testcase disable-llvm-passes fix elementswise fix some tests
If these intrinsics now define the order of -0.0/+0.0, but the fmax and fmin library calls don't, how are we allowed to expand these intrinsics the fmax/fmin libcalls? |
Yes. You are right. It is a bug yet not being fixed. |
See: llvm#112852 We will define llvm.minnum and llvm.maxnum with +0.0>-0.0, by default, while libc doesn't require it. fix testcases -ffp-exception-behavior=strict add missing builtin test test auto vectorize fix test cases update testcase disable-llvm-passes fix elementswise fix some tests
With llvm#112852, we claimed that llvm.minnum and llvm.maxnum should treat +0.0>-0.0, while libc doesn't require fmin(3)/fmax(3) for it. To make llvm.minnum/llvm.maxnum easy to use, we define the builtin functions for them, include __builtin_elementwise_minnum __builtin_elementwise_maxnum All of them support _Float16, __bf16, float, double, long double.
See: llvm#112852 We will define llvm.minnum and llvm.maxnum with +0.0>-0.0, by default, while libc doesn't require it. fix testcases -ffp-exception-behavior=strict add missing builtin test test auto vectorize fix test cases update testcase disable-llvm-passes fix elementswise fix some tests
See: llvm#112852 We will define llvm.minnum and llvm.maxnum with +0.0>-0.0, by default, while libc doesn't require it. fix testcases -ffp-exception-behavior=strict add missing builtin test test auto vectorize fix test cases update testcase disable-llvm-passes fix elementswise fix some tests
Please see #138303 |
…igned zero (llvm#112852)" This reverts commit 363b059. I don't agree with the change itself, but even if we want to do it, it needs to be phased in a lot more carefully than it was. You can't just change the semantics in LangRef without actually dealing with the consequences of the change. Revert to the status quo for now. Fixes llvm#138303.
The documents claims that it ignores sNaN, while in the current code it may be different.
fmaxnm
, which returns qNaN for sNaN vs NUM.fmax
, which returns NUM for sNaN vs NUM.Since we have introduced llvm.minimumnum and llvm.maximumnum, which follow IEEE 754-2019's minimumNumber/maximumNumber.
So, it's time for us to clarify llvm.minnum and llvm.maxnum. Since the final fallback of llvm.minnum and llvm.maxnum is
fmin(3)/fmax(3), so that it is reasonable to follow the behaviors of fmin(3)/fmax(3).
Although C23 clarified the behavior about sNaN and +0.0/-0.0:
(NUM or NaN) vs sNaN -> qNaN
+0.0 vs -0.0 -> either one of +0.0/-0.0
It is the same the IEEE754-2008's maxNUM and minNUM.
Not all implementation work as expected.
Since some architectures such as aarch64/MIPSr6/LoongArch, have instructions that implements +0.0>-0.0.
So Let's define llvm.minnum and llvm.maxnum to IEEE754-2008 with +0.0>-0.0.
The architectures without such instructions can implements
NSZ
flavor to speed up,and the frontend, such as clang, can call them with
nsz
attribute.