LangRef: Clarify behaviors of nsz in fast math flag #137567

wzssyqa · 2025-04-28T01:00:36Z

For fcmp, this has no effect.

For min/max intrinsics, this allows returning either +0 or -0 if both +0 and -0 are passed.

For calls to anything other than a min/max intrinsic, arithmetic instructions, select, and phi, if the result is zero, the returned value can be a zero of either sign.

On min/max operations, now +0.0 is treated greater than -0.0. To ignore this behaivor (fmin/fmax of libc doesn't require it), let's use `nsz` for this case.

llvmbot · 2025-04-28T01:01:15Z

@llvm/pr-subscribers-llvm-ir

Author: YunQiang Su (wzssyqa)

Changes

On min/max operations, now +0.0 is treated greater than -0.0. To ignore this behaivor (fmin/fmax of libc doesn't require it), let's use nsz for this case.

Full diff: https://github.com/llvm/llvm-project/pull/137567.diff

1 Files Affected:

(modified) llvm/docs/LangRef.rst (+4)

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 5bd1d29487139..12d5a7fe8c419 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -3914,6 +3914,10 @@ following flags to enable otherwise unsafe floating-point transformations.
    No Signed Zeros - Allow optimizations to treat the sign of a zero
    argument or zero result as insignificant. This does not imply that -0.0
    is poison and/or guaranteed to not exist in the operation.
+   Don't require +0.0>-0.0 for min/max operations - Allow optimizations of the
+   min/max operation not to treat +0.0>-0.0. Note the result should be either of
+   the operands. (max|min)(-0.0, -0.0) should never return +0.0, and vice versa.
+   Note: floating compare operations always imply -0.0 == +0.0.
 
 Note: For :ref:`phi <i_phi>`, :ref:`select <i_select>`, and :ref:`call <i_call>`
 instructions, the following return types are considered to be floating-point

wzssyqa · 2025-04-28T01:01:23Z

See: #113133

llvm/docs/LangRef.rst

wzssyqa · 2025-05-14T02:00:48Z

ping

arsenm

Calling this out as a special case for min/max isn't the right way to express this. Given that this is the one standardized area with fuzzy signed zero handling, I think we should try to fit the definition such that this works (i.e. it's not permitted to randomly introduce a 0 sign bit).

I'm not aware of any transforms taking advantage of producing a -0 where there wasn't one to begin with. That is, we are allowed to produce either sign zero if one of the operation inputs is a -0, or the operation would naturally produce a -0 based on its inputs

wzssyqa · 2025-05-14T11:34:10Z

Calling this out as a special case for min/max isn't the right way to express this. Given that this is the one standardized area with fuzzy signed zero handling, I think we should try to fit the definition such that this works (i.e. it's not permitted to randomly introduce a 0 sign bit).

I'm not aware of any transforms taking advantage of producing a -0 where there wasn't one to begin with. That is, we are allowed to produce either sign zero if one of the operation inputs is a -0, or the operation would naturally produce a -0 based on its inputs

Any suggestions?
Just remove

For calls to anything other than a min/max intrinsic, arithmetic instructions, select, and phi, if the result is zero, the returned value can be a zero of either sign.

?

kpneal · 2025-05-14T18:00:00Z

Perhaps something like "If a -0 is given, the sign of the result is unspecified unless otherwise documented." ?

Then we'd need to document those places where the sign of the zero is "specified". The downside of this I guess is that the nsz documentation would be scattered about. Or maybe that's an upside?

andykaylor · 2025-05-14T19:11:49Z

Maybe I'm missing some context as to why this was needed, but I don't feel like the proposed changes clarifies anything. I find it to be less clear. If I am understanding it correctly, I think it would be sufficient to add just this:

If the result is any zero, the returned value can be a zero of either sign.

arsenm · 2025-05-14T19:13:31Z

Maybe I'm missing some context as to why this was needed

I think the problem is alive2's interpretation of nsz maxnum(+0, +0) assumes -0 is a valid output

andykaylor · 2025-05-14T19:27:10Z

Maybe I'm missing some context as to why this was needed

I think the problem is alive2's interpretation of nsz maxnum(+0, +0) assumes -0 is a valid output

It is a valid output, isn't it?

arsenm · 2025-05-14T19:49:42Z

It is a valid output, isn't it?

No. It's permitted to return either operand if they have mixed signs, but it does not permit returning a value that isn't either of the original inputs

andykaylor · 2025-05-14T20:26:37Z

It is a valid output, isn't it?

No. It's permitted to return either operand if they have mixed signs, but it does not permit returning a value that isn't either of the original inputs

I don't have any reason why any transformation would want to introduce -0 when none of the inputs were -0, but I don't understand why we need a special case where the sign of zero results is restricted. Or is the special case "calls to anything other than a min/max intrinsic, arithmetic instructions, select, and phi"? It's not clear to me from the proposed change what the rule is for "arithmetic instructions, select, and phi".

wzssyqa · 2025-05-15T03:41:04Z

It is a valid output, isn't it?

No. It's permitted to return either operand if they have mixed signs, but it does not permit returning a value that isn't either of the original inputs

I don't have any reason why any transformation would want to introduce -0 when none of the inputs were -0, but I don't

In some case we may introduce +0 when both operand is -0.
For example to expand some operation, we may use FADD, while FADD(-0.0， +0.0) will be +0.0.

understand why we need a special case where the sign of zero results is restricted. Or is the special case "calls to anything other than a min/max intrinsic, arithmetic instructions, select, and phi"? It's not clear to me from the proposed change what the rule is for "arithmetic instructions, select, and phi".

max/min operations claim that it should return either operand, so we shouldn't return +0 for max(-0, -0).

andykaylor · 2025-05-15T16:36:51Z

max/min operations claim that it should return either operand, so we shouldn't return +0 for max(-0, -0).

If we're saying that, in general, we're not respecting the sign of zero, I don't see why that should be a problem.

spavloff · 2025-05-15T16:47:14Z

It is not clear why some particular cases should be mentioned in documentation, if they don't clarify meaning of the flag or its usage. If some specific transformation suffers from this flag, it can simply ignore nsz, a comment in code looks sufficient. Anyway, fast math flags allow some (unsafe) transformations, but not require to use them.

Also, nsz enables transformations that break FP semantics. Nothing surprizing if they produce the result that do not conform to the semantics of some functions.

arsenm · 2025-05-15T17:28:48Z

If we're saying that, in general, we're not respecting the sign of zero, I don't see why that should be a problem.

The problem we are trying to solve is to say something more specific than this. I assert that if we cannot get the IEEE-754 2008 fuzzy minnum/maxnum behavior by taking a stronger intrinsic with ordered 0 handling, and adding nsz to it, the definition of nsz is broken

andykaylor · 2025-05-15T17:35:21Z

The problem we are trying to solve is to say something more specific than this. I assert that if we cannot get the IEEE-754 2008 fuzzy minnum/maxnum behavior by taking a stronger intrinsic with ordered 0 handling, and adding nsz to it, the definition of nsz is broken

Trying to get IEEE-754 semantics by adding fast-math flags seems wrong to me. If we want IEEE-754 2008 semantics for an intrinsic, why don't we just make an intrinsic with the semantics we want.

arsenm · 2025-05-15T20:06:08Z

Trying to get IEEE-754 semantics by adding fast-math flags seems wrong to me.

Calling this semantics is too strong; it's a relaxation permitted by the standard. The superior QoI that it seems just about every ISA implemented is to properly order -0 < +0, and this is the required behavior for both minimum and minimumNumber in the 2019 version. This is a semantic relaxation, and as the one place with a standard has a fuzzy signed zero handling I think the goal should be to define NSZ around this. I don't see how it's useful to permit flipping the sign out of nothing

If we want IEEE-754 2008 semantics for an intrinsic, why don't we just make an intrinsic with the semantics we want.

That would make the set of min/max intrinsics even more ridiculous than it is now, for what is only an optimization modifier. The current LangRef splits the difference between 2008 and 2019 and strengthens the signed zero treatment. I also view the lack of definitive signed zero handling is more of an optimization blocker than enabler. You cannot make a definitive statement about the sign for a clamp to zero. The fuzzy signed zero only reduces instructions in the software expansion

andykaylor · 2025-05-15T20:36:26Z

Calling this semantics is too strong; it's a relaxation permitted by the standard. The superior QoI that it seems just about every ISA implemented is to properly order -0 < +0, and this is the required behavior for both minimum and minimumNumber in the 2019 version. This is a semantic relaxation, and as the one place with a standard has a fuzzy signed zero handling I think the goal should be to define NSZ around this. I don't see how it's useful to permit flipping the sign out of nothing

I'm not claiming that it's useful to do that. I'd be surprised if that ever happened. What I'm saying is that I don't understand why we need the restriction in the definition to say that it can't happen. If it's not useful, it won't happen. If someone somehow came up with a scenario where it was useful (perhaps as part of a transformation involving a chain of instructions with a min/max intrinsic in the middle), we might end up revisiting this discussion.

I agree that further proliferation of the min/max intrinsics is a bad idea, but I don't like the idea of changing the meaning of nsz to make one of them fit a desired characteristic.

You want to nsz llvm.minnum to be able to return either +0 or -0 if the arguments are one of each. I think the current definition allows that. You want to restrict it so that it can't return -0 if both arguments are +0. I don't think that's going to happen, but I do think the current definition of nsz allows it.

It sounds to me like the problem here is that rather than using a fast-math flag because the user requested the associated fast-math property to be enabled, you want to introduce a fast-math flag on a specific operation to get specific behavior from an intrinsic. That sounds to me an awful lot like the x86 backend wanting to interpret the reassoc flag when attached to llvm.vector.reduce.fadd to require a specific re-ordering.

wzssyqa · 2025-05-16T04:08:00Z

You want to restrict it so that it can't return -0 if both arguments are +0. I don't think that's going to happen, but I do think the current definition of nsz allows it.

In fact it may happen. I had a try to expand FMINIMUM with fadd OP, 0.0.
And finally I found that FMINIMUM -0.0, -0.0 may return +0.0.
So I prefer to state explicitly that it is not allowed.

arsenm · 2025-05-16T08:23:38Z

I'm not claiming that it's useful to do that. I'd be surprised if that ever happened. What I'm saying is that I don't understand why we need the restriction in the definition to say that it can't happen. If it's not useful, it won't happen.

Because it's useful to have that defined, and get alive2 to agree on the rules. If we want to verify optimization as correct we need to better define what those rules are. Getting alive2 to not complain is the only problem being solved here

You want to nsz llvm.minnum to be able to return either +0 or -0 if the arguments are one of each. I think the current definition allows that. You want to restrict it so that it can't return -0 if both arguments are +0. I don't think that's going to happen, but I do think the current definition of nsz allows it.

Which is the goal of the PR, to not permit this.

you want to introduce a fast-math flag on a specific operation to get specific behavior from an intrinsic

It's more like the C spec has a fast math flag by default on fmin and fmax. We don't strictly need to do this, it's taking the optimization hint permitted by the C spec.

That sounds to me an awful lot like the x86 backend wanting to interpret the reassoc flag when attached to llvm.vector.reduce.fadd to require a specific re-ordering.

This is an entirely different situation. In that reassoc case, there's load bearing semantics in the flag, and an ambiguity in the interpretation of the reassoc flag between the internal expansion and the external uses. In this NSZ case it's a droppable, unambiguous optimization hint which is the intended use of fast math flags.

andykaylor · 2025-05-16T16:13:03Z

You want to restrict it so that it can't return -0 if both arguments are +0. I don't think that's going to happen, but I do think the current definition of nsz allows it.

In fact it may happen. I had a try to expand FMINIMUM with fadd OP, 0.0. And finally I found that FMINIMUM -0.0, -0.0 may return +0.0. So I prefer to state explicitly that it is not allowed.

Do you know why it was happening?

spavloff · 2025-05-16T17:47:34Z

I'm not claiming that it's useful to do that. I'd be surprised if that ever happened. What I'm saying is that I don't understand why we need the restriction in the definition to say that it can't happen. If it's not useful, it won't happen.

Because it's useful to have that defined, and get alive2 to agree on the rules. If we want to verify optimization as correct we need to better define what those rules are. Getting alive2 to not complain is the only problem being solved here

It is OK to restrict the general rule to aligh the behavior with alive2 or to avoid some unwanted things. After all, there is no a specification for nsz, which would demand strictly defined behavior. The concern is that the documentation includes wording that cannot help a developer to properly handle nsz. These limitations look arbitrary and no explanation for them is provided.

A more detailed comment in the code where nsz is handled with the problematic instructions could be more useful for the developer who would like to enhance the treatment of nsz.

efriedma-quic · 2025-05-23T00:23:00Z

It is OK to restrict the general rule to aligh the behavior with alive2 or to avoid some unwanted things. After all, there is no a specification for nsz, which would demand strictly defined behavior. The concern is that the documentation includes wording that cannot help a developer to properly handle nsz.

Alive2 needs for each instruction, a rule that indicates whether a given return value is valid or invalid. The proposed wording is, as far as I can tell, minimal in that regard.

LangRef: Clarify nsz on min/max operations for +0.0 vs -0.0

a25d5dc

On min/max operations, now +0.0 is treated greater than -0.0. To ignore this behaivor (fmin/fmax of libc doesn't require it), let's use `nsz` for this case.

llvmbot added the llvm:ir label Apr 28, 2025

wzssyqa requested review from efriedma-quic and arsenm April 28, 2025 01:00

efriedma-quic reviewed Apr 28, 2025

View reviewed changes

llvm/docs/LangRef.rst Show resolved Hide resolved

Rewording

918e5ac

wzssyqa requested a review from efriedma-quic May 8, 2025 01:36

wzssyqa changed the title ~~LangRef: Clarify nsz on min/max operations for +0.0 vs -0.0~~ LangRef: Clarify behaviors of nsz in fast math flag May 8, 2025

arsenm reviewed May 14, 2025

View reviewed changes

arsenm added the floating-point Floating-point math label May 14, 2025

arsenm requested review from spavloff, andykaylor, kpneal and jcranmer-intel May 14, 2025 08:23

LangRef: Clarify behaviors of nsz in fast math flag #137567

Are you sure you want to change the base?

LangRef: Clarify behaviors of nsz in fast math flag #137567

Conversation

wzssyqa commented Apr 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Apr 28, 2025

Uh oh!

wzssyqa commented Apr 28, 2025

Uh oh!

Uh oh!

wzssyqa commented May 14, 2025

Uh oh!

arsenm left a comment

Choose a reason for hiding this comment

Uh oh!

wzssyqa commented May 14, 2025

Uh oh!

kpneal commented May 14, 2025

Uh oh!

andykaylor commented May 14, 2025

Uh oh!

arsenm commented May 14, 2025

Uh oh!

andykaylor commented May 14, 2025

Uh oh!

arsenm commented May 14, 2025

Uh oh!

andykaylor commented May 14, 2025

Uh oh!

wzssyqa commented May 15, 2025

Uh oh!

andykaylor commented May 15, 2025

Uh oh!

spavloff commented May 15, 2025

Uh oh!

arsenm commented May 15, 2025

Uh oh!

andykaylor commented May 15, 2025

Uh oh!

arsenm commented May 15, 2025

Uh oh!

andykaylor commented May 15, 2025

Uh oh!

wzssyqa commented May 16, 2025

Uh oh!

arsenm commented May 16, 2025

Uh oh!

andykaylor commented May 16, 2025

Uh oh!

spavloff commented May 16, 2025

Uh oh!

efriedma-quic commented May 23, 2025

Uh oh!

Uh oh!

wzssyqa commented Apr 28, 2025 •

edited

Loading