-
Notifications
You must be signed in to change notification settings - Fork 13.6k
LangRef: Clarify behaviors of nsz in fast math flag #137567
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
On min/max operations, now +0.0 is treated greater than -0.0. To ignore this behaivor (fmin/fmax of libc doesn't require it), let's use `nsz` for this case.
@llvm/pr-subscribers-llvm-ir Author: YunQiang Su (wzssyqa) ChangesOn min/max operations, now +0.0 is treated greater than -0.0. To ignore this behaivor (fmin/fmax of libc doesn't require it), let's use Full diff: https://github.com/llvm/llvm-project/pull/137567.diff 1 Files Affected:
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 5bd1d29487139..12d5a7fe8c419 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -3914,6 +3914,10 @@ following flags to enable otherwise unsafe floating-point transformations.
No Signed Zeros - Allow optimizations to treat the sign of a zero
argument or zero result as insignificant. This does not imply that -0.0
is poison and/or guaranteed to not exist in the operation.
+ Don't require +0.0>-0.0 for min/max operations - Allow optimizations of the
+ min/max operation not to treat +0.0>-0.0. Note the result should be either of
+ the operands. (max|min)(-0.0, -0.0) should never return +0.0, and vice versa.
+ Note: floating compare operations always imply -0.0 == +0.0.
Note: For :ref:`phi <i_phi>`, :ref:`select <i_select>`, and :ref:`call <i_call>`
instructions, the following return types are considered to be floating-point
|
See: #113133 |
ping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling this out as a special case for min/max isn't the right way to express this. Given that this is the one standardized area with fuzzy signed zero handling, I think we should try to fit the definition such that this works (i.e. it's not permitted to randomly introduce a 0 sign bit).
I'm not aware of any transforms taking advantage of producing a -0 where there wasn't one to begin with. That is, we are allowed to produce either sign zero if one of the operation inputs is a -0, or the operation would naturally produce a -0 based on its inputs
Any suggestions?
? |
Perhaps something like "If a -0 is given, the sign of the result is unspecified unless otherwise documented." ? Then we'd need to document those places where the sign of the zero is "specified". The downside of this I guess is that the nsz documentation would be scattered about. Or maybe that's an upside? |
Maybe I'm missing some context as to why this was needed, but I don't feel like the proposed changes clarifies anything. I find it to be less clear. If I am understanding it correctly, I think it would be sufficient to add just this:
|
I think the problem is alive2's interpretation of nsz maxnum(+0, +0) assumes -0 is a valid output |
It is a valid output, isn't it? |
No. It's permitted to return either operand if they have mixed signs, but it does not permit returning a value that isn't either of the original inputs |
I don't have any reason why any transformation would want to introduce -0 when none of the inputs were -0, but I don't understand why we need a special case where the sign of zero results is restricted. Or is the special case "calls to anything other than a min/max intrinsic, arithmetic instructions, select, and phi"? It's not clear to me from the proposed change what the rule is for "arithmetic instructions, select, and phi". |
In some case we may introduce +0 when both operand is -0.
max/min operations claim that it should return |
If we're saying that, in general, we're not respecting the sign of zero, I don't see why that should be a problem. |
It is not clear why some particular cases should be mentioned in documentation, if they don't clarify meaning of the flag or its usage. If some specific transformation suffers from this flag, it can simply ignore Also, |
The problem we are trying to solve is to say something more specific than this. I assert that if we cannot get the IEEE-754 2008 fuzzy minnum/maxnum behavior by taking a stronger intrinsic with ordered 0 handling, and adding nsz to it, the definition of nsz is broken |
Trying to get IEEE-754 semantics by adding fast-math flags seems wrong to me. If we want IEEE-754 2008 semantics for an intrinsic, why don't we just make an intrinsic with the semantics we want. |
Calling this semantics is too strong; it's a relaxation permitted by the standard. The superior QoI that it seems just about every ISA implemented is to properly order -0 < +0, and this is the required behavior for both minimum and minimumNumber in the 2019 version. This is a semantic relaxation, and as the one place with a standard has a fuzzy signed zero handling I think the goal should be to define NSZ around this. I don't see how it's useful to permit flipping the sign out of nothing
That would make the set of min/max intrinsics even more ridiculous than it is now, for what is only an optimization modifier. The current LangRef splits the difference between 2008 and 2019 and strengthens the signed zero treatment. I also view the lack of definitive signed zero handling is more of an optimization blocker than enabler. You cannot make a definitive statement about the sign for a clamp to zero. The fuzzy signed zero only reduces instructions in the software expansion |
I'm not claiming that it's useful to do that. I'd be surprised if that ever happened. What I'm saying is that I don't understand why we need the restriction in the definition to say that it can't happen. If it's not useful, it won't happen. If someone somehow came up with a scenario where it was useful (perhaps as part of a transformation involving a chain of instructions with a min/max intrinsic in the middle), we might end up revisiting this discussion. I agree that further proliferation of the min/max intrinsics is a bad idea, but I don't like the idea of changing the meaning of You want to It sounds to me like the problem here is that rather than using a fast-math flag because the user requested the associated fast-math property to be enabled, you want to introduce a fast-math flag on a specific operation to get specific behavior from an intrinsic. That sounds to me an awful lot like the x86 backend wanting to interpret the |
In fact it may happen. I had a try to expand |
Because it's useful to have that defined, and get alive2 to agree on the rules. If we want to verify optimization as correct we need to better define what those rules are. Getting alive2 to not complain is the only problem being solved here
Which is the goal of the PR, to not permit this.
It's more like the C spec has a fast math flag by default on
This is an entirely different situation. In that reassoc case, there's load bearing semantics in the flag, and an ambiguity in the interpretation of the reassoc flag between the internal expansion and the external uses. In this NSZ case it's a droppable, unambiguous optimization hint which is the intended use of fast math flags. |
Do you know why it was happening? |
It is OK to restrict the general rule to aligh the behavior with alive2 or to avoid some unwanted things. After all, there is no a specification for A more detailed comment in the code where nsz is handled with the problematic instructions could be more useful for the developer who would like to enhance the treatment of |
Alive2 needs for each instruction, a rule that indicates whether a given return value is valid or invalid. The proposed wording is, as far as I can tell, minimal in that regard. |
For fcmp, this has no effect.
For min/max intrinsics, this allows returning either +0 or -0 if both +0 and -0 are passed.
For calls to anything other than a min/max intrinsic, arithmetic instructions, select, and phi, if the result is zero, the returned value can be a zero of either sign.