TL;DR: -fwrapv resolves the regression.
548.exchange2_r built using flang-new-18:
548.exchange2_r: time (ms) = 181666
548.exchange2_r: clock freq (MHz) = 5701.6834
548.exchange2_r: instructions = 4.05239e+12
548.exchange2_r: branch instructions = 5.80567e+11
548.exchange2_r: ipc = 3.9123
548.exchange2_r: misprediction rate (%) = 0.8200
548.exchange2_r: mpki = 1.1747
548.exchange2_r built using flang-new-19:
548.exchange2_r: time (ms) = 182113
548.exchange2_r: clock freq (MHz) = 5701.3884
548.exchange2_r: instructions = 4.03508e+12
548.exchange2_r: branch instructions = 5.50304e+11
548.exchange2_r: ipc = 3.8863
548.exchange2_r: misprediction rate (%) = 0.7708
548.exchange2_r: mpki = 1.0513
548.exchange2_r built using flang-new-20:
548.exchange2_r: time (ms) = 240537
548.exchange2_r: clock freq (MHz) = 5700.9669
548.exchange2_r: instructions = 5.15502e+12
548.exchange2_r: branch instructions = 5.42299e+11
548.exchange2_r: ipc = 3.7592
548.exchange2_r: misprediction rate (%) = 0.7719
548.exchange2_r: mpki = 0.8121
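A performance regression of ~30% is observed with flang-new-20: run time rises from 182113 ms (flang-new-19) to 240537 ms, and the instruction count grows from 4.03508e+12 to 5.15502e+12, while IPC and the misprediction rate stay roughly the same.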
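For reference, the size of the regression can be recomputed from the figures above. This is only a small sketch: the numbers are copied from the measurements in this report, and the dictionary layout and the `pct_change` helper are illustrative names, not part of any benchmark tooling.

```python
# Figures copied from the measurements in this report
# (time in ms, retired instruction counts).
runs = {
    "flang-new-18": {"time_ms": 181666, "instructions": 4.05239e12},
    "flang-new-19": {"time_ms": 182113, "instructions": 4.03508e12},
    "flang-new-20": {"time_ms": 240537, "instructions": 5.15502e12},
}


def pct_change(new, old):
    """Relative change of `new` versus `old`, in percent (hypothetical helper)."""
    return (new - old) / old * 100.0


baseline = runs["flang-new-19"]
current = runs["flang-new-20"]

time_delta = pct_change(current["time_ms"], baseline["time_ms"])
insn_delta = pct_change(current["instructions"], baseline["instructions"])

print(f"time: {time_delta:+.1f}%")
print(f"instructions: {insn_delta:+.1f}%")
```

With these inputs the run time comes out roughly 32% higher and the instruction count roughly 28% higher, which suggests the slowdown is mostly due to extra instructions being executed rather than lower IPC.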
Hardware: Intel Core i9-14900K 5.7 GHz
Full results and compilation flags are available at https://jia.je/benchmark/ (in Chinese).