Skip to content

LangRef: Clarify behaviors of nsz in fast math flag #137567

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

wzssyqa
Copy link
Contributor

@wzssyqa wzssyqa commented Apr 28, 2025

For fcmp, this has no effect.

For min/max intrinsics, this allows returning either +0 or -0 if both +0 and -0 are passed.

For calls to anything other than a min/max intrinsic, arithmetic instructions, select, and phi, if the result is zero, the returned value can be a zero of either sign.

On min/max operations, now +0.0 is treated greater than -0.0. To
ignore this behaivor (fmin/fmax of libc doesn't require it), let's
use `nsz` for this case.
@llvmbot
Copy link
Member

llvmbot commented Apr 28, 2025

@llvm/pr-subscribers-llvm-ir

Author: YunQiang Su (wzssyqa)

Changes

On min/max operations, now +0.0 is treated greater than -0.0. To ignore this behaivor (fmin/fmax of libc doesn't require it), let's use nsz for this case.


Full diff: https://github.com/llvm/llvm-project/pull/137567.diff

1 Files Affected:

  • (modified) llvm/docs/LangRef.rst (+4)
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 5bd1d29487139..12d5a7fe8c419 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -3914,6 +3914,10 @@ following flags to enable otherwise unsafe floating-point transformations.
    No Signed Zeros - Allow optimizations to treat the sign of a zero
    argument or zero result as insignificant. This does not imply that -0.0
    is poison and/or guaranteed to not exist in the operation.
+   Don't require +0.0>-0.0 for min/max operations - Allow optimizations of the
+   min/max operation not to treat +0.0>-0.0. Note the result should be either of
+   the operands. (max|min)(-0.0, -0.0) should never return +0.0, and vice versa.
+   Note: floating compare operations always imply -0.0 == +0.0.
 
 Note: For :ref:`phi <i_phi>`, :ref:`select <i_select>`, and :ref:`call <i_call>`
 instructions, the following return types are considered to be floating-point

@wzssyqa
Copy link
Contributor Author

wzssyqa commented Apr 28, 2025

See: #113133

@wzssyqa wzssyqa requested a review from efriedma-quic May 8, 2025 01:36
@wzssyqa wzssyqa changed the title LangRef: Clarify nsz on min/max operations for +0.0 vs -0.0 LangRef: Clarify behaviors of nsz in fast math flag May 8, 2025
@wzssyqa
Copy link
Contributor Author

wzssyqa commented May 14, 2025

ping

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling this out as a special case for min/max isn't the right way to express this. Given that this is the one standardized area with fuzzy signed zero handling, I think we should try to fit the definition such that this works (i.e. it's not permitted to randomly introduce a 0 sign bit).

I'm not aware of any transforms taking advantage of producing a -0 where there wasn't one to begin with. That is, we are allowed to produce either sign zero if one of the operation inputs is a -0, or the operation would naturally produce a -0 based on its inputs

@arsenm arsenm added the floating-point Floating-point math label May 14, 2025
@wzssyqa
Copy link
Contributor Author

wzssyqa commented May 14, 2025

Calling this out as a special case for min/max isn't the right way to express this. Given that this is the one standardized area with fuzzy signed zero handling, I think we should try to fit the definition such that this works (i.e. it's not permitted to randomly introduce a 0 sign bit).

I'm not aware of any transforms taking advantage of producing a -0 where there wasn't one to begin with. That is, we are allowed to produce either sign zero if one of the operation inputs is a -0, or the operation would naturally produce a -0 based on its inputs

Any suggestions?
Just remove

For calls to anything other than a min/max intrinsic, arithmetic instructions, select, and phi, if the result is zero, the returned value can be a zero of either sign.

?

@kpneal
Copy link
Member

kpneal commented May 14, 2025

Perhaps something like "If a -0 is given, the sign of the result is unspecified unless otherwise documented." ?

Then we'd need to document those places where the sign of the zero is "specified". The downside of this I guess is that the nsz documentation would be scattered about. Or maybe that's an upside?

@andykaylor
Copy link
Contributor

Maybe I'm missing some context as to why this was needed, but I don't feel like the proposed changes clarifies anything. I find it to be less clear. If I am understanding it correctly, I think it would be sufficient to add just this:

If the result is any zero, the returned value can be a zero of either sign.

@arsenm
Copy link
Contributor

arsenm commented May 14, 2025

Maybe I'm missing some context as to why this was needed

I think the problem is alive2's interpretation of nsz maxnum(+0, +0) assumes -0 is a valid output

@andykaylor
Copy link
Contributor

Maybe I'm missing some context as to why this was needed

I think the problem is alive2's interpretation of nsz maxnum(+0, +0) assumes -0 is a valid output

It is a valid output, isn't it?

@arsenm
Copy link
Contributor

arsenm commented May 14, 2025

It is a valid output, isn't it?

No. It's permitted to return either operand if they have mixed signs, but it does not permit returning a value that isn't either of the original inputs

@andykaylor
Copy link
Contributor

It is a valid output, isn't it?

No. It's permitted to return either operand if they have mixed signs, but it does not permit returning a value that isn't either of the original inputs

I don't have any reason why any transformation would want to introduce -0 when none of the inputs were -0, but I don't understand why we need a special case where the sign of zero results is restricted. Or is the special case "calls to anything other than a min/max intrinsic, arithmetic instructions, select, and phi"? It's not clear to me from the proposed change what the rule is for "arithmetic instructions, select, and phi".

@wzssyqa
Copy link
Contributor Author

wzssyqa commented May 15, 2025

It is a valid output, isn't it?

No. It's permitted to return either operand if they have mixed signs, but it does not permit returning a value that isn't either of the original inputs

I don't have any reason why any transformation would want to introduce -0 when none of the inputs were -0, but I don't

In some case we may introduce +0 when both operand is -0.
For example to expand some operation, we may use FADD, while FADD(-0.0, +0.0) will be +0.0.

understand why we need a special case where the sign of zero results is restricted. Or is the special case "calls to anything other than a min/max intrinsic, arithmetic instructions, select, and phi"? It's not clear to me from the proposed change what the rule is for "arithmetic instructions, select, and phi".

max/min operations claim that it should return either operand, so we shouldn't return +0 for max(-0, -0).

@andykaylor
Copy link
Contributor

max/min operations claim that it should return either operand, so we shouldn't return +0 for max(-0, -0).

If we're saying that, in general, we're not respecting the sign of zero, I don't see why that should be a problem.

@spavloff
Copy link
Collaborator

It is not clear why some particular cases should be mentioned in documentation, if they don't clarify meaning of the flag or its usage. If some specific transformation suffers from this flag, it can simply ignore nsz, a comment in code looks sufficient. Anyway, fast math flags allow some (unsafe) transformations, but not require to use them.

Also, nsz enables transformations that break FP semantics. Nothing surprizing if they produce the result that do not conform to the semantics of some functions.

@arsenm
Copy link
Contributor

arsenm commented May 15, 2025

If we're saying that, in general, we're not respecting the sign of zero, I don't see why that should be a problem.

The problem we are trying to solve is to say something more specific than this. I assert that if we cannot get the IEEE-754 2008 fuzzy minnum/maxnum behavior by taking a stronger intrinsic with ordered 0 handling, and adding nsz to it, the definition of nsz is broken

@andykaylor
Copy link
Contributor

The problem we are trying to solve is to say something more specific than this. I assert that if we cannot get the IEEE-754 2008 fuzzy minnum/maxnum behavior by taking a stronger intrinsic with ordered 0 handling, and adding nsz to it, the definition of nsz is broken

Trying to get IEEE-754 semantics by adding fast-math flags seems wrong to me. If we want IEEE-754 2008 semantics for an intrinsic, why don't we just make an intrinsic with the semantics we want.

@arsenm
Copy link
Contributor

arsenm commented May 15, 2025

Trying to get IEEE-754 semantics by adding fast-math flags seems wrong to me.

Calling this semantics is too strong; it's a relaxation permitted by the standard. The superior QoI that it seems just about every ISA implemented is to properly order -0 < +0, and this is the required behavior for both minimum and minimumNumber in the 2019 version. This is a semantic relaxation, and as the one place with a standard has a fuzzy signed zero handling I think the goal should be to define NSZ around this. I don't see how it's useful to permit flipping the sign out of nothing

If we want IEEE-754 2008 semantics for an intrinsic, why don't we just make an intrinsic with the semantics we want.

That would make the set of min/max intrinsics even more ridiculous than it is now, for what is only an optimization modifier. The current LangRef splits the difference between 2008 and 2019 and strengthens the signed zero treatment. I also view the lack of definitive signed zero handling is more of an optimization blocker than enabler. You cannot make a definitive statement about the sign for a clamp to zero. The fuzzy signed zero only reduces instructions in the software expansion

@andykaylor
Copy link
Contributor

Calling this semantics is too strong; it's a relaxation permitted by the standard. The superior QoI that it seems just about every ISA implemented is to properly order -0 < +0, and this is the required behavior for both minimum and minimumNumber in the 2019 version. This is a semantic relaxation, and as the one place with a standard has a fuzzy signed zero handling I think the goal should be to define NSZ around this. I don't see how it's useful to permit flipping the sign out of nothing

I'm not claiming that it's useful to do that. I'd be surprised if that ever happened. What I'm saying is that I don't understand why we need the restriction in the definition to say that it can't happen. If it's not useful, it won't happen. If someone somehow came up with a scenario where it was useful (perhaps as part of a transformation involving a chain of instructions with a min/max intrinsic in the middle), we might end up revisiting this discussion.

I agree that further proliferation of the min/max intrinsics is a bad idea, but I don't like the idea of changing the meaning of nsz to make one of them fit a desired characteristic.

You want to nsz llvm.minnum to be able to return either +0 or -0 if the arguments are one of each. I think the current definition allows that. You want to restrict it so that it can't return -0 if both arguments are +0. I don't think that's going to happen, but I do think the current definition of nsz allows it.

It sounds to me like the problem here is that rather than using a fast-math flag because the user requested the associated fast-math property to be enabled, you want to introduce a fast-math flag on a specific operation to get specific behavior from an intrinsic. That sounds to me an awful lot like the x86 backend wanting to interpret the reassoc flag when attached to llvm.vector.reduce.fadd to require a specific re-ordering.

@wzssyqa
Copy link
Contributor Author

wzssyqa commented May 16, 2025

You want to restrict it so that it can't return -0 if both arguments are +0. I don't think that's going to happen, but I do think the current definition of nsz allows it.

In fact it may happen. I had a try to expand FMINIMUM with fadd OP, 0.0.
And finally I found that FMINIMUM -0.0, -0.0 may return +0.0.
So I prefer to state explicitly that it is not allowed.

@arsenm
Copy link
Contributor

arsenm commented May 16, 2025

I'm not claiming that it's useful to do that. I'd be surprised if that ever happened. What I'm saying is that I don't understand why we need the restriction in the definition to say that it can't happen. If it's not useful, it won't happen.

Because it's useful to have that defined, and get alive2 to agree on the rules. If we want to verify optimization as correct we need to better define what those rules are. Getting alive2 to not complain is the only problem being solved here

You want to nsz llvm.minnum to be able to return either +0 or -0 if the arguments are one of each. I think the current definition allows that. You want to restrict it so that it can't return -0 if both arguments are +0. I don't think that's going to happen, but I do think the current definition of nsz allows it.

Which is the goal of the PR, to not permit this.

you want to introduce a fast-math flag on a specific operation to get specific behavior from an intrinsic

It's more like the C spec has a fast math flag by default on fmin and fmax. We don't strictly need to do this, it's taking the optimization hint permitted by the C spec.

That sounds to me an awful lot like the x86 backend wanting to interpret the reassoc flag when attached to llvm.vector.reduce.fadd to require a specific re-ordering.

This is an entirely different situation. In that reassoc case, there's load bearing semantics in the flag, and an ambiguity in the interpretation of the reassoc flag between the internal expansion and the external uses. In this NSZ case it's a droppable, unambiguous optimization hint which is the intended use of fast math flags.

@andykaylor
Copy link
Contributor

You want to restrict it so that it can't return -0 if both arguments are +0. I don't think that's going to happen, but I do think the current definition of nsz allows it.

In fact it may happen. I had a try to expand FMINIMUM with fadd OP, 0.0. And finally I found that FMINIMUM -0.0, -0.0 may return +0.0. So I prefer to state explicitly that it is not allowed.

Do you know why it was happening?

@spavloff
Copy link
Collaborator

I'm not claiming that it's useful to do that. I'd be surprised if that ever happened. What I'm saying is that I don't understand why we need the restriction in the definition to say that it can't happen. If it's not useful, it won't happen.

Because it's useful to have that defined, and get alive2 to agree on the rules. If we want to verify optimization as correct we need to better define what those rules are. Getting alive2 to not complain is the only problem being solved here

It is OK to restrict the general rule to aligh the behavior with alive2 or to avoid some unwanted things. After all, there is no a specification for nsz, which would demand strictly defined behavior. The concern is that the documentation includes wording that cannot help a developer to properly handle nsz. These limitations look arbitrary and no explanation for them is provided.

A more detailed comment in the code where nsz is handled with the problematic instructions could be more useful for the developer who would like to enhance the treatment of nsz.

@efriedma-quic
Copy link
Collaborator

It is OK to restrict the general rule to aligh the behavior with alive2 or to avoid some unwanted things. After all, there is no a specification for nsz, which would demand strictly defined behavior. The concern is that the documentation includes wording that cannot help a developer to properly handle nsz.

Alive2 needs for each instruction, a rule that indicates whether a given return value is valid or invalid. The proposed wording is, as far as I can tell, minimal in that regard.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
floating-point Floating-point math llvm:ir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants