Improve codegen of large Copy parameters under CopyProp to match DestinationPropagation #108068

Closed
@saethlin

This will probably bitrot, but this is a working godbolt link at the moment: https://godbolt.org/z/v446e66jG

Currently this program:

type T = [u8; 256];
pub fn f(a: T, b: fn(_: T, _: T)) {
    b(a, a)
}

Compiles to this IR:

define void @f(ptr noalias nocapture noundef readonly dereferenceable(256) %a, ptr nocapture noundef nonnull readonly %b) unnamed_addr #0 !dbg !6 {
start:
  %_5 = alloca [256 x i8], align 1
  %_4 = alloca [256 x i8], align 1
  call void @llvm.lifetime.start.p0(i64 256, ptr nonnull %_4), !dbg !11
  call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 1 dereferenceable(256) %_4, ptr noundef nonnull align 1 dereferenceable(256) %a, i64 256, i1 false), !dbg !11
  call void @llvm.lifetime.start.p0(i64 256, ptr nonnull %_5), !dbg !12
  call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 1 dereferenceable(256) %_5, ptr noundef nonnull align 1 dereferenceable(256) %a, i64 256, i1 false), !dbg !12
  call void %b(ptr noalias nocapture noundef nonnull dereferenceable(256) %_4, ptr noalias nocapture noundef nonnull dereferenceable(256) %_5), !dbg !13
  call void @llvm.lifetime.end.p0(i64 256, ptr nonnull %_5), !dbg !14
  call void @llvm.lifetime.end.p0(i64 256, ptr nonnull %_4), !dbg !14
  ret void, !dbg !15
}

But if we pass -Zmir-enable-passes=+DestinationPropagation, we eliminate a memcpy:

define void @f(ptr noalias nocapture noundef dereferenceable(256) %a, ptr nocapture noundef nonnull readonly %b) unnamed_addr #0 !dbg !6 {
start:
  %_3 = alloca [256 x i8], align 1
  call void @llvm.lifetime.start.p0(i64 256, ptr nonnull %_3), !dbg !11
  call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 1 dereferenceable(256) %_3, ptr noundef nonnull align 1 dereferenceable(256) %a, i64 256, i1 false), !dbg !11
  call void %b(ptr noalias nocapture noundef nonnull dereferenceable(256) %_3, ptr noalias nocapture noundef nonnull dereferenceable(256) %a), !dbg !12
  call void @llvm.lifetime.end.p0(i64 256, ptr nonnull %_3), !dbg !13
  ret void, !dbg !14
}

With CopyProp enabled, however, we do not manage this optimization. This is probably a coordination problem between codegen, MIR optimizations, and MIR semantics: #105813 (comment)
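For context on why one memcpy remains even in the DestinationPropagation output: both pointer arguments to `b` are marked `noalias`, and a callee is free to write to its by-value arguments, so `%a` cannot simply be passed in both positions. The sketch below is my own illustration of that constraint, not something from the linked discussion; `bump_first` and `demo` are hypothetical names, and the claim about why the copy is needed is my reading of the IR above.

```rust
type T = [u8; 256];

// Same shape as `f` in the issue: a large `Copy` value passed twice by value.
pub fn f(a: T, b: fn(T, T)) {
    b(a, a)
}

// Hypothetical callee (not from the issue): it writes to its first by-value
// argument and checks that the second one did not change with it.
pub fn bump_first(mut x: T, y: T) {
    let before = y[0];
    x[0] = x[0].wrapping_add(1);
    // If both arguments were backed by the same memory, these would fail.
    assert_eq!(y[0], before);
    assert_ne!(x[0], y[0]);
}

// Small driver; call this from `main` to run the check.
pub fn demo() {
    f([0u8; 256], bump_first);
}
```

If that reading is right, the best this signature allows is one copy rather than two, which is what the DestinationPropagation output achieves.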

Labels: A-codegen (Area: Code generation), A-mir-opt (Area: MIR optimizations), C-enhancement (Category: An issue proposing an enhancement or a PR with one), I-slow (Issue: Problems and improvements with respect to performance of generated code)
