LICM: add an optimization to move multiple loads and stores from/to the same memory location out of a loop. #27849

eeckstein · 2019-10-23T15:27:15Z

This is a combination of load hoisting and store sinking, e.g.

  preheader:
    br header_block
  header_block:
    %x = load %not_aliased_addr
    // use %x and define %y
    store %y to %not_aliased_addr
    ...
  exit_block:

is transformed to:

  preheader:
    %x = load %not_aliased_addr
    br header_block
  header_block:
    // use %x and define %y
    ...
  exit_block:
    store %y to %not_aliased_addr

This optimization is important for optimizing inout arguments, especially with COW support in SIL, which relies on values being held in SSA values rather than in memory.
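
As a rough source-level picture of the transformation above (a sketch only: the pass works on SIL, not on Swift source, and the function names here are hypothetical), the first function touches the inout location on every iteration, while the second has the shape that results once the load is hoisted and the store is sunk:

```swift
// Hypothetical example: `total` plays the role of %not_aliased_addr.
func accumulate(_ total: inout Int, _ values: [Int]) {
    for v in values {
        // Conceptually: load total, add v, store the result back,
        // all inside the loop.
        total += v
    }
}

// Hand-written equivalent of the optimized shape.
func accumulateHoisted(_ total: inout Int, _ values: [Int]) {
    var x = total          // load hoisted into the preheader
    for v in values {
        x += v             // loop body works on a value, not on memory
    }
    total = x              // store sunk into the exit block
}

var a = 10
var b = 10
accumulate(&a, [1, 2, 3])
accumulateHoisted(&b, [1, 2, 3])
```

Both calls leave the same value in the argument; the only difference is how often memory is touched inside the loop. The rewrite is valid only because nothing else in the loop may alias the location.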

This PR is for testing and review for now. If everything goes well, I'll merge it next week.

eeckstein · 2019-10-23T15:27:55Z

@swift-ci benchmark

eeckstein · 2019-10-23T15:28:16Z

@swift-ci test

swift-ci · 2019-10-23T16:37:15Z

Build failed
Swift Test Linux Platform
Git Sha - 9ff63b01eb90c6e92b34d4fe15bc755ef9afee5b

swift-ci · 2019-10-23T16:38:21Z

Build failed
Swift Test OS X Platform
Git Sha - 9ff63b01eb90c6e92b34d4fe15bc755ef9afee5b

Because the set includes all side-effect instructions, including may-reads. NFC

eeckstein · 2019-10-29T09:22:25Z

@swift-ci test

eeckstein · 2019-10-29T09:22:47Z

@swift-ci test

eeckstein · 2019-10-29T09:23:00Z

@swift-ci benchmark

eeckstein · 2019-10-29T09:23:17Z

@swift-ci benchmark

swift-ci · 2019-10-29T10:06:43Z

Performance: -O

Regression	OLD	NEW	DELTA	RATIO
PrefixAnyCollection	58	76	+31.0%	0.76x
DropFirstAnyCollection	59	76	+28.8%	0.78x
SuffixAnyCollection	22	28	+27.3%	0.79x
DropLastAnyCollection	22	28	+27.3%	0.79x
DropWhileAnyCollection	76	94	+23.7%	0.81x
PrefixWhileAnyCollection	111	129	+16.2%	0.86x

Improvement	OLD	NEW	DELTA	RATIO
ExclusivityGlobal	8	5	-37.5%	1.60x
NSStringConversion.LongUTF8	589	542	-8.0%	1.09x (?)
NSStringConversion.UTF8	835	777	-6.9%	1.07x (?)

Code size: -O

Performance: -Osize

Improvement	OLD	NEW	DELTA	RATIO
ExclusivityGlobal	8	5	-37.5%	1.60x
ArrayAppendOptionals	2020	1300	-35.6%	1.55x (?)

Code size: -Osize

Regression	OLD	NEW	DELTA	RATIO
RandomShuffle.o	3533	3581	+1.4%	0.99x

Performance: -Onone

Regression	OLD	NEW	DELTA	RATIO
ClassArrayGetter2	3940	4260	+8.1%	0.92x

Code size: -swiftlibs

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).
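
The DELTA and RATIO columns can be reproduced from OLD and NEW. The formulas below are inferred from the rows in the tables above (for example PrefixAnyCollection, 58 to 76), not taken from the benchmark tooling, and the helper names are hypothetical:

```swift
// Hypothetical helpers; formulas inferred from the table rows.
func deltaPercent(old: Double, new: Double) -> Double {
    // Relative change of NEW against OLD, in percent.
    return (new - old) / old * 100
}

func ratio(old: Double, new: Double) -> Double {
    // Values below 1 mean NEW is larger than OLD (a regression).
    return old / new
}

// PrefixAnyCollection at -O: OLD = 58, NEW = 76.
let delta = deltaPercent(old: 58, new: 76)   // about +31.0
let r = ratio(old: 58, new: 76)              // about 0.76
```

Under these formulas a regression shows up as a positive DELTA with a RATIO below 1x, and an improvement as a negative DELTA with a RATIO above 1x.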

Hardware Overview

  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

swift-ci · 2019-10-29T10:14:28Z

Build failed
Swift Test OS X Platform
Git Sha - 10d96a86768c8cc4f1610c42ea91c7d2e2c5ece3

swift-ci · 2019-10-29T10:30:25Z

Build failed
Swift Test Linux Platform
Git Sha - 10d96a86768c8cc4f1610c42ea91c7d2e2c5ece3

eeckstein · 2019-10-29T15:55:46Z

@swift-ci test

eeckstein · 2019-10-29T15:55:59Z

@swift-ci test

swift-ci · 2019-10-29T17:23:32Z

Build failed
Swift Test Linux Platform
Git Sha - 584581e

swift-ci · 2019-10-29T17:57:03Z

Build failed
Swift Test OS X Platform
Git Sha - 584581e

eeckstein · 2019-10-30T08:21:52Z

@swift-ci test

atrick

This is really good. I have a few comments below though.

And I just realized this PR isn't the code that I reviewed, so I'll add another comment here
#27990

atrick · 2019-11-12T18:57:43Z