Finite only math doesn't strip out most of body of powr implementation #64870

Open
@arsenm

With sufficient fast-math flags passed to an implementation of OpenCL's powr, the edge-case handling of infinite values is not pruned out. The nofpclass(nan inf) attributes on the arguments and return value should provide enough information to delete everything in this function except the first 4 instructions:

```llvm
define hidden noundef nofpclass(nan inf nzero nsub nnorm) float @test_powr(float noundef nofpclass(nan inf) %x, float noundef nofpclass(nan inf) %y) #0 {
entry:
  %i = tail call float @llvm.fabs.f32(float noundef %x)
  %i1 = tail call float @llvm.log2.f32(float noundef %i)
  %i2 = fmul float %i1, %y
  %i3 = tail call noundef nofpclass(ninf nzero nsub nnorm) float @llvm.exp2.f32(float noundef %i2)
  %i4 = fcmp olt float %y, 0.000000e+00
  %i5 = select i1 %i4, float 0x7FF0000000000000, float 0.000000e+00
  %i6 = fcmp oeq float %x, 0.000000e+00
  %i7 = select i1 %i6, float %i5, float %i3
  %i8 = fcmp oeq float %y, 0.000000e+00
  %i9 = select i1 %i6, float 0x7FF8000000000000, float 1.000000e+00
  %i10 = select i1 %i8, float %i9, float %i7
  %i11 = fcmp oeq float %x, 1.000000e+00
  %i12 = select i1 %i11, float 1.000000e+00, float %i10
  %i13 = fcmp olt float %x, 0.000000e+00
  %i14 = select i1 %i13, float 0x7FF8000000000000, float %i12
  ret float %i14
}

declare float @llvm.fabs.f32(float) #1
declare float @llvm.log2.f32(float) #1
declare float @llvm.exp2.f32(float) #1
declare float @llvm.trunc.f32(float) #1
declare float @llvm.copysign.f32(float, float) #1

attributes #0 = { mustprogress nofree norecurse nosync nounwind willreturn memory(none) }
attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
```
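
For comparison, the hoped-for result would look roughly like the following. This is a hand-written sketch of "only the first 4 instructions survive", not actual `opt` output. The name `@test_powr_expected` is hypothetical, and the declarations that are no longer used are dropped.

```llvm
; Hand-written sketch of the desired result (assumption: not produced by opt).
; Only the fabs/log2/fmul/exp2 chain remains; all compares and selects are gone.
define hidden noundef nofpclass(nan inf nzero nsub nnorm) float @test_powr_expected(float noundef nofpclass(nan inf) %x, float noundef nofpclass(nan inf) %y) #0 {
entry:
  %i = tail call float @llvm.fabs.f32(float noundef %x)
  %i1 = tail call float @llvm.log2.f32(float noundef %i)
  %i2 = fmul float %i1, %y
  %i3 = tail call noundef nofpclass(ninf nzero nsub nnorm) float @llvm.exp2.f32(float noundef %i2)
  ret float %i3
}

declare float @llvm.fabs.f32(float) #1
declare float @llvm.log2.f32(float) #1
declare float @llvm.exp2.f32(float) #1

attributes #0 = { mustprogress nofree norecurse nosync nounwind willreturn memory(none) }
attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
```

The reasoning behind this shape: each select either picks an inf/nan constant that the return attribute rules out, or picks a constant that the exp2 chain already produces for that input (for example, x == 1.0 gives log2 of 0.0, so exp2 returns 1.0).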

I think this requires a "simplify demanded fpclass" type of handling, similar to SimplifyDemandedBits.
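
As a rough illustration of the kind of fold such handling might perform, here is a hand-reduced case modeled on the last select in the function above. The function names are hypothetical and this is only a sketch of the idea, not the behavior of any existing pass. Because the return value is marked nofpclass(nan), the nan arm of the select can never be a valid result, so the select can be replaced by its other operand.

```llvm
; Before: one arm of the select is a nan constant, but the return value
; is declared nofpclass(nan), so returning that arm would yield poison.
define nofpclass(nan) float @select_nan_arm(float %x, float %v) {
  %is.neg = fcmp olt float %x, 0.000000e+00
  %r = select i1 %is.neg, float 0x7FF8000000000000, float %v
  ret float %r
}

; After (hypothetical output): only the non-nan arm is demanded, so the
; compare and select fold away. This is a refinement of the function above.
define nofpclass(nan) float @select_nan_arm.simplified(float %x, float %v) {
  ret float %v
}
```

Working backwards from the return attribute in this way, one use at a time, is what makes the approach resemble SimplifyDemandedBits.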
