[MISched] Fix off-by-one error in debug output with -misched-cutoff=<n> flag #137988

c-rhodes · 2025-04-30T16:30:51Z

This flag instructs the scheduler to stop scheduling after N instructions, but
in the debug output it appears as if it's scheduling N+1 instructions, e.g.

$ llc -misched-cutoff=10 -debug-only=machine-scheduler \
    example.ll 2>&1 | grep "^Scheduling SU" | wc -l
11

as it calls pickNode before calling checkSchedLimit.

mshockwave

The "Scheduling SU" message you grepped for comes from pickNode. You saw 11 messages with a cut-off of 10 because checkSchedLimit is called after pickNode:

llvm-project/llvm/lib/CodeGen/MachineScheduler.cpp

Lines 1007 to 1012 in fa76965

    SUnit *SU = SchedImpl->pickNode(IsTopNode);
    if (!SU) break;

    assert(!SU->isScheduled && "Node already scheduled");
    if (!checkSchedLimit())
      break;

So although the output showed 11 "Scheduling SU" messages, the scheduler really did schedule only 10 instructions, because it stopped right before making any actual change for the 11th.

Therefore, following the same logic, I think your current patch will actually schedule one fewer instruction than the cut-off value provided by the user, which is probably undesired.

I think a better solution might be putting checkSchedLimit before pickNode.
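
The ordering argument above can be checked with a small toy model. This is only a sketch: ToyScheduler, runPickFirst and runLimitFirst are hypothetical names, not LLVM APIs, and the limit check is assumed to return false once its counter reaches the cut-off and to increment the counter otherwise. Picking a node stands in for printing one "Scheduling SU" line.

```cpp
#include <cassert>
#include <utility>

// Toy model of the scheduling loop; every name here is hypothetical and
// only mirrors the roles of pickNode and checkSchedLimit in the thread.
struct ToyScheduler {
  unsigned Cutoff;            // value given to the cut-off flag
  unsigned Ready;             // nodes still waiting to be scheduled
  unsigned Counted = 0;       // counter consulted by the limit check
  unsigned PickMessages = 0;  // "Scheduling SU" lines that would be printed
  unsigned Scheduled = 0;     // nodes actually committed

  ToyScheduler(unsigned CutoffValue, unsigned ReadyNodes)
      : Cutoff(CutoffValue), Ready(ReadyNodes) {}

  // Assumed behaviour: refuse once the counter reaches the cut-off,
  // otherwise count one more instruction.
  bool checkLimit() {
    if (Counted == Cutoff)
      return false;
    ++Counted;
    return true;
  }

  // Picking a node is what emits the debug message in this model.
  bool pickNode() {
    if (Ready == 0)
      return false;
    ++PickMessages;
    return true;
  }

  // Commit the picked node; the cut-off must never be exceeded.
  void commit() {
    --Ready;
    ++Scheduled;
    assert(Scheduled <= Cutoff && "cut-off exceeded");
  }
};

// Both functions return {debug messages printed, nodes actually scheduled}.

// Old order: pick first, then consult the limit.
std::pair<unsigned, unsigned> runPickFirst(unsigned Cutoff, unsigned Nodes) {
  ToyScheduler S(Cutoff, Nodes);
  while (true) {
    if (!S.pickNode()) break;    // message is printed here...
    if (!S.checkLimit()) break;  // ...before the limit stops the loop
    S.commit();
  }
  return {S.PickMessages, S.Scheduled};
}

// Proposed order: consult the limit before picking.
std::pair<unsigned, unsigned> runLimitFirst(unsigned Cutoff, unsigned Nodes) {
  ToyScheduler S(Cutoff, Nodes);
  while (true) {
    if (!S.checkLimit()) break;  // limit is consulted before any message
    if (!S.pickNode()) break;
    S.commit();
  }
  return {S.PickMessages, S.Scheduled};
}
```

Under these assumptions, runPickFirst(10, 20) reports 11 messages but 10 scheduled nodes, while runLimitFirst(10, 20) reports 10 of each, so moving the check changes only the message count and not the number of instructions scheduled.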

c-rhodes · 2025-05-01T06:43:38Z

ah, well spotted! Thanks that makes more sense, updated 👍

mshockwave

generally looks good, but could you add a simple test?

llvmbot · 2025-05-02T09:18:45Z

@llvm/pr-subscribers-backend-aarch64

Author: Cullen Rhodes (c-rhodes)

Changes

This flag instructs the scheduler to stop scheduling after N instructions, but
in the debug output it appears as if it's scheduling N+1 instructions, e.g.

$ llc -misched-cutoff=10 -debug-only=machine-scheduler \
    example.ll 2>&1 | grep "^Scheduling SU" | wc -l
11

as it calls pickNode before calling checkSchedLimit.

Full diff: https://github.com/llvm/llvm-project/pull/137988.diff

2 Files Affected:

(modified) llvm/lib/CodeGen/MachineScheduler.cpp (+6-4)
(added) llvm/test/CodeGen/AArch64/misched-cutoff.mir (+55)

diff --git a/llvm/lib/CodeGen/MachineScheduler.cpp b/llvm/lib/CodeGen/MachineScheduler.cpp
index 97f27277aface..77cf46178aabd 100644
--- a/llvm/lib/CodeGen/MachineScheduler.cpp
+++ b/llvm/lib/CodeGen/MachineScheduler.cpp
@@ -1003,13 +1003,14 @@ void ScheduleDAGMI::schedule() {
 
   bool IsTopNode = false;
   while (true) {
+    if (!checkSchedLimit())
+      break;
+
     LLVM_DEBUG(dbgs() << "** ScheduleDAGMI::schedule picking next node\n");
     SUnit *SU = SchedImpl->pickNode(IsTopNode);
     if (!SU) break;
 
     assert(!SU->isScheduled && "Node already scheduled");
-    if (!checkSchedLimit())
-      break;
 
     MachineInstr *MI = SU->getInstr();
     if (IsTopNode) {
@@ -1637,13 +1638,14 @@ void ScheduleDAGMILive::schedule() {
 
   bool IsTopNode = false;
   while (true) {
+    if (!checkSchedLimit())
+      break;
+
     LLVM_DEBUG(dbgs() << "** ScheduleDAGMILive::schedule picking next node\n");
     SUnit *SU = SchedImpl->pickNode(IsTopNode);
     if (!SU) break;
 
     assert(!SU->isScheduled && "Node already scheduled");
-    if (!checkSchedLimit())
-      break;
 
     scheduleMI(SU, IsTopNode);
 
diff --git a/llvm/test/CodeGen/AArch64/misched-cutoff.mir b/llvm/test/CodeGen/AArch64/misched-cutoff.mir
new file mode 100644
index 0000000000000..a61eb64318a39
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/misched-cutoff.mir
@@ -0,0 +1,55 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=aarch64 -passes=machine-scheduler -o - %s | FileCheck %s
+# RUN: llc -mtriple=aarch64 -passes=machine-scheduler -misched-cutoff=1 -debug-only=machine-scheduler -o - %s 2>&1 | FileCheck %s --check-prefix=CHECK-CUTOFF
+
+# REQUIRES: asserts
+
+# CHECK-CUTOFF-COUNT-1: Scheduling SU
+
+# NOTE: copied from machine-scheduler.mir
+
+--- |
+  define i64 @load_imp-def(ptr nocapture %P, i32 %v) {
+  entry:
+    %0 = bitcast ptr %P to ptr
+    %1 = load i32, ptr %0
+    %conv = zext i32 %1 to i64
+    %arrayidx19 = getelementptr inbounds i64, ptr %P, i64 1
+    %arrayidx1 = bitcast ptr %arrayidx19 to ptr
+    store i32 %v, ptr %arrayidx1
+    %2 = load i64, ptr %arrayidx19
+    %and = and i64 %2, 4294967295
+    %add = add nuw nsw i64 %and, %conv
+    ret i64 %add
+  }
+...
+---
+name: load_imp-def
+tracksRegLiveness: true
+body: |
+  bb.0.entry:
+    liveins: $w1, $x0
+    ; CHECK-LABEL: name: load_imp-def
+    ; CHECK: liveins: $w1, $x0
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: $w8 = LDRWui $x0, 1, implicit-def $x8 :: (load (s32) from %ir.0)
+    ; CHECK-NEXT: $w9 = LDRWui $x0, 0, implicit-def $x9 :: (load (s32) from %ir.arrayidx19, align 8)
+    ; CHECK-NEXT: STRWui $w1, $x0, 2 :: (store (s32) into %ir.arrayidx1)
+    ; CHECK-NEXT: $x0 = ADDXrr killed $x9, killed $x8
+    ; CHECK-NEXT: RET_ReallyLR implicit $x0
+    ;
+    ; CHECK-CUTOFF-LABEL: name: load_imp-def
+    ; CHECK-CUTOFF: liveins: $w1, $x0
+    ; CHECK-CUTOFF-NEXT: {{  $}}
+    ; CHECK-CUTOFF-NEXT: $w8 = LDRWui $x0, 1, implicit-def $x8 :: (load (s32) from %ir.0)
+    ; CHECK-CUTOFF-NEXT: STRWui $w1, $x0, 2 :: (store (s32) into %ir.arrayidx1)
+    ; CHECK-CUTOFF-NEXT: $w9 = LDRWui $x0, 0, implicit-def $x9 :: (load (s32) from %ir.arrayidx19, align 8)
+    ; CHECK-CUTOFF-NEXT: $x0 = ADDXrr killed $x9, killed $x8
+    ; CHECK-CUTOFF-NEXT: RET_ReallyLR implicit $x0
+    $w8 = LDRWui $x0, 1, implicit-def $x8  :: (load (s32) from %ir.0)
+    STRWui killed $w1, $x0, 2 :: (store (s32) into %ir.arrayidx1)
+    $w9 = LDRWui killed $x0, 0, implicit-def $x9  :: (load (s32) from %ir.arrayidx19, align 8)
+    $x0 = ADDXrr killed $x9, killed $x8
+    RET_ReallyLR implicit $x0
+...
+

c-rhodes · 2025-05-02T09:19:43Z

put a test up for precommit here #138243

mshockwave

LGTM thanks

llvm-ci · 2025-05-06T10:24:25Z

LLVM Buildbot has detected a new failure on builder mlir-nvidia-gcc7 running on mlir-nvidia while building llvm at step 7 "test-build-check-mlir-build-only-check-mlir".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/116/builds/12455

Here is the relevant piece of the build log for the reference

Step 7 (test-build-check-mlir-build-only-check-mlir) failure: test (failure)
******************** TEST 'MLIR :: Integration/GPU/CUDA/async.mlir' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 1
/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir  | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -gpu-kernel-outlining  | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -pass-pipeline='builtin.module(gpu.module(strip-debuginfo,convert-gpu-to-nvvm),nvvm-attach-target)'  | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -gpu-async-region -gpu-to-llvm -reconcile-unrealized-casts -gpu-module-to-binary="format=fatbin"  | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -async-to-async-runtime -async-runtime-ref-counting  | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -convert-async-to-llvm -convert-func-to-llvm -convert-arith-to-llvm -convert-cf-to-llvm -reconcile-unrealized-casts  | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-runner    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_cuda_runtime.so    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_async_runtime.so    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_runner_utils.so    --entry-point-result=void -O0  | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/FileCheck /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -gpu-kernel-outlining
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt '-pass-pipeline=builtin.module(gpu.module(strip-debuginfo,convert-gpu-to-nvvm),nvvm-attach-target)'
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -gpu-async-region -gpu-to-llvm -reconcile-unrealized-casts -gpu-module-to-binary=format=fatbin
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -async-to-async-runtime -async-runtime-ref-counting
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-opt -convert-async-to-llvm -convert-func-to-llvm -convert-arith-to-llvm -convert-cf-to-llvm -reconcile-unrealized-casts
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/mlir-runner --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_cuda_runtime.so --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_async_runtime.so --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/lib/libmlir_runner_utils.so --entry-point-result=void -O0
# .---command stderr------------
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventSynchronize(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# `-----------------------------
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.obj/bin/FileCheck /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# .---command stderr------------
# | /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir:68:12: error: CHECK: expected string not found in input
# |  // CHECK: [84, 84]
# |            ^
# | <stdin>:1:1: note: scanning from here
# | Unranked Memref base@ = 0x5b9abce964d0 rank = 1 offset = 0 sizes = [2] strides = [1] data = 
# | ^
# | <stdin>:2:1: note: possible intended match here
# | [42, 42]
# | ^
# | 
# | Input file: <stdin>
# | Check file: /vol/worker/mlir-nvidia/mlir-nvidia-gcc7/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |             1: Unranked Memref base@ = 0x5b9abce964d0 rank = 1 offset = 0 sizes = [2] strides = [1] data =  
# | check:68'0     X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
# |             2: [42, 42] 
# | check:68'0     ~~~~~~~~~
# | check:68'1     ?         possible intended match
...

c-rhodes requested a review from david-arm April 30, 2025 16:30

mshockwave self-requested a review April 30, 2025 18:27

mshockwave reviewed Apr 30, 2025

c-rhodes changed the title from "[MISched] Fix off-by-one error with -misched-cutoff=<n> flag" to "[MISched] Fix off-by-one error in debug output with -misched-cutoff=<n> flag" May 1, 2025

mshockwave reviewed May 1, 2025

c-rhodes added a commit to c-rhodes/llvm-project that referenced this pull request May 2, 2025

[MISched][NFC] Precommit test for llvm#137988

10cf588

c-rhodes force-pushed the misched-cutoff-fix-off-by-one-error branch from 0b1319c to dbfbfeb Compare May 2, 2025 09:18

llvmbot added the backend:AArch64 label May 2, 2025

mshockwave approved these changes May 2, 2025

c-rhodes added a commit that referenced this pull request May 6, 2025

[MISched][NFC] Precommit test for #137988 (#138243)

91ad90b

c-rhodes added 3 commits May 6, 2025 07:51

address comments

ade1dae

address comments

fa90da2

c-rhodes force-pushed the misched-cutoff-fix-off-by-one-error branch from dbfbfeb to fa90da2 Compare May 6, 2025 08:06

c-rhodes merged commit 8ea5eac into llvm:main May 6, 2025
11 checks passed

GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025

[MISched][NFC] Precommit test for llvm#137988 (llvm#138243)

b202dc0
