Inline constants in composite graphs #361
Conversation
Interesting: scalar constants were never inlined in the compiled thunks (the C functions are simply given a pointer to the Python list that will hold those constants). That reduces the number of compiled thunks, at the expense of compiler-level optimizations like the one mentioned in this PR: pytensor/pytensor/tensor/rewriting/math.py (Line 2083 in b9c4f20).
The optimization in this PR allows inlining some of those cases via the Composite graph.
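A quick way to see where a constant ends up is to inspect the compiled graph; a minimal sketch (the expression is illustrative, not taken from the PR's tests):

import numpy as np
import pytensor
import pytensor.tensor as pt

x = pt.vector("x")
out = x ** 2.5                      # 2.5 is a scalar constant feeding the fused Elemwise
fn = pytensor.function([x], out)

# The debug print shows whether the constant appears as a separate outer input
# of the Composite/Elemwise or has been inlined into its inner scalar graph.
pytensor.dprint(fn)
fn(np.ones(3))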
@aseyboldt I think the functionality is working. Would you be interested in doing a quick Numba benchmark?
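For reference, a rough sketch of the kind of quick benchmark being asked for here (the expression and array size are made up; FAST_RUN and NUMBA are the standard pytensor compilation modes):

import timeit
import numpy as np
import pytensor
import pytensor.tensor as pt

x = pt.vector("x")
out = pt.exp(x) * 2.5 + 1.0          # fused Elemwise carrying scalar constants
xv = np.random.standard_normal(100_000)

for mode in ("FAST_RUN", "NUMBA"):
    fn = pytensor.function([x], out, mode=mode)
    fn(xv)                           # warm-up call (triggers numba compilation)
    print(mode, timeit.timeit(lambda: fn(xv), number=100))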
Codecov Report
Additional details and impacted files

@@            Coverage Diff            @@
##             main     #361      +/-  ##
=========================================
+ Coverage   80.40%   80.44%   +0.03%
=========================================
  Files         156      156
  Lines       45401    45483      +82
  Branches    11106    11138      +32
=========================================
+ Hits        36505    36588      +83
+ Misses       6689     6687       -2
- Partials     2207     2208       +1
new_outputs = Elemwise(new_composite_op).make_node(*new_outer_inputs).outputs

# Some of the inlined constants were broadcasting the output shape
if node.outputs[0].type.broadcastable != new_outputs[0].type.broadcastable:
How could this happen if we only changed scalar inputs?
get_unique_constant_value works for homogeneous constants of any rank.
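A minimal sketch of the idea in plain NumPy (not pytensor's actual helper, whose exact signature may differ): a constant of any rank whose entries are all equal can be collapsed to a single scalar value.

import numpy as np

def unique_constant_value(data):
    # Return the repeated value if all entries of `data` are equal, else None.
    flat = np.asarray(data).ravel()
    if flat.size and np.all(flat == flat[0]):
        return flat[0]
    return None

assert unique_constant_value(np.ones((3, 4))) == 1.0   # rank-2 homogeneous constant
assert unique_constant_value(np.arange(3)) is None     # not homogeneous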
In cases where the constant influenced the final shape of the Elemwise operation,
we broadcast (via Alloc) the new Elemwise result:

Elemwise(row, ones((3, 4))) -> Alloc(Elemwise(row, ones((1, 1))), 3, 4)
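A hedged illustration of that shape handling in pytensor terms (the variables below are made up; this is not the rewrite's own code):

import pytensor.tensor as pt

row = pt.row("row")                                # broadcastable pattern (True, False)
before = row * pt.ones((3, 4))                     # the constant forces the (3, 4) output shape
after = pt.alloc(row * pt.ones((1, 1)), 3, 4)      # constant inlined, shape restored via Alloc

assert before.type.broadcastable == after.type.broadcastable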
In this case the post-rewrite version might actually be slower than the pre-rewrite version.
After the rewrite we add an extra allocation and copy operation. I guess which one is faster in practice depends on how expensive the scalar_op of the Elemwise is...
Maybe we just shouldn't do that rewrite if we'd have to broadcast the output?
We can make Alloc a view so it doesn't re-allocate (or, as things stand now, use a BroadcastTo instead of Alloc).
I think the solution to #367 will fix it. Everything is an Alloc, and then during the inplace optimizations we take care of finding out what actually needs to allocate an output and what doesn't.
Note that for the common cheap Elemwise-broadcasted cases 1*x or 0+y, we already rewrite it like this via the AlgebraicCanonizer. That's why I parametrized the tests with an odd pow(x, 2.5) as well.
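Roughly the shape of that parametrization (a sketch, not the PR's actual test):

import numpy as np
import pytensor
import pytensor.tensor as pt
import pytest

# 1 * x and 0 + x are already simplified by the algebraic canonicalization rewrites,
# so x ** 2.5 is the case that actually exercises the constant-inlining path.
@pytest.mark.parametrize("expr_fn", [lambda x: 1 * x, lambda x: 0 + x, lambda x: x ** 2.5])
def test_inlined_constant(expr_fn):
    x = pt.vector("x")
    fn = pytensor.function([x], expr_fn(x))
    xv = np.ones(3)
    np.testing.assert_allclose(fn(xv), expr_fn(xv))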
But do we really want to use broadcasting to go from a (1,) array to an (n,) array without copying? We can do that, but the resulting array would use the same storage location multiple times. And if a downstream op works inplace and doesn't take that possibility into account, then we'd end up with incorrect results.
That's why we have a daddy called Supervisor:

pytensor/pytensor/compile/function/types.py (Line 135 in 5c87d74)
class Supervisor(Feature):

And why we don't let users give us Functions with inplace Ops.
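For context, a conceptual sketch of what a Supervisor-style Feature guards against (the class name and error handling here are invented for illustration; see the linked types.py for the real implementation):

from pytensor.graph.features import Feature

class ProtectInputs(Feature):  # hypothetical stand-in for the real Supervisor
    # Reject graph states in which any Op would destroy a protected input in place.
    def __init__(self, protected):
        self.protected = set(protected)

    def validate(self, fgraph):
        for node in fgraph.apply_nodes:
            destroy_map = getattr(node.op, "destroy_map", None) or {}
            for destroyed_inputs in destroy_map.values():
                for idx in destroyed_inputs:
                    if node.inputs[idx] in self.protected:
                        raise Exception("rewrite would destroy a protected input in place")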
We shouldn't have to worry about that at the level of pre-inplace rewrites
My point was more that I think we have a whole bunch of logic that just silently assumes we don't use any weird arrays like broadcast(np.ones(1), (3,)) at all. NumPy will even return a read-only array for that, probably to make sure users don't shoot themselves in the foot:
import numpy as np

vals = np.ones(1)
bc = np.broadcast_to(vals, (3,))
# NumPy marks the broadcast view read-only because all entries share one storage location
assert not bc.flags.writeable
A later rewrite that would be totally valid with a normal array could turn into something that errors or even gives silently incorrect results.
Let's say we have a graph like

inc_subtensor(pow(ones(3), 1.)[0, 1], 1)
-> inc_subtensor(Alloc(pow(ones(1), 1), (3,), inplace=False)[0, 1], 1, inplace=False)
-> inc_subtensor(Alloc(pow(ones(1), 1), (3,), inplace=True)[0, 1], 1, inplace=False)
-> inc_subtensor(Alloc(pow(ones(1), 1), (3,), inplace=True)[0, 1], 1, inplace=True)

The two inplace rewrites seem perfectly innocent in isolation, but because the first inplace rewrite changed how the array is stored, the second one won't work. There is only one storage location for all three entries of the array, so it will return an array like [3, 3, 3] instead of [2, 2, 1].
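The same hazard can be shown with plain NumPy (using as_strided to build a writeable aliased view, since broadcast_to would hand back a read-only one):

import numpy as np

base = np.ones(1)
# All three "entries" of `aliased` point at the single element of `base`.
aliased = np.lib.stride_tricks.as_strided(base, shape=(3,), strides=(0,))
np.add.at(aliased, [0, 1], 1)   # intended result: [2., 2., 1.]
print(aliased)                  # actual result:   [3., 3., 3.]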
I think the most sensible solution to this is to just never ever create arrays that refer to the same memory location multiple times. But that means we can't implement Alloc(ones(1), (3,), inplace=True) at all.
So maybe that rewrite is still fine, and we think that there are more cases where it helps than cases where it hurts (I think that's probably true). But at least we should make sure we disallow Alloc(ones(1), 3, 4, inplace=True).
So maybe that rewrite is still fine, and we think that there are more cases where it helps than cases where it hurts (I think that's probably true). But at least we should make sure we disallow Alloc(ones(1), 3, 4, inplace=True).
I'll add this to the #367 issue
After some discussion: we agree that the rewrite is fine, but we shouldn't rewrite the Alloc to a broadcast, to avoid creating arrays that alias memory. Doing this might be fine (there are some checks in the blas wrappers already), but we thought it was not worth the trouble with the possible bugs it could introduce.
LGTM :-)
I removed the Elemwise rewrite for now, as it raises other questions that I have to think about for a bit longer.
This PR inlines constants in composite graphs, which reduces the number of variables that must be iterated over in the fused Elemwise graphs. I noticed some speedups when combined with openmp, but not really otherwise. It would be interesting to check whether numba benefits more from this.
Regardless, this seems like a reasonable rewrite to include, if for nothing else than that it also simplifies the debug_print of the final compiled graphs.
There is no further constant folding triggered in the inner Composite graph. This shouldn't be needed for automatically introduced Composites, as those happen after constant folding.
If users create their own Composite and pass constants, then they may have useless operations on constants inside the graph, but that's hopefully something compilers can optimize themselves?
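For illustration, this is roughly what a user-built Composite with a constant baked into its inner graph looks like (import paths and the Composite signature are from memory and may vary between versions):

import pytensor.tensor as pt
from pytensor.scalar import Composite, float64
from pytensor.tensor.elemwise import Elemwise

xs = float64("xs")                                       # scalar variable for the inner graph
comp_op = Composite(inputs=[xs], outputs=[xs ** 2.5])    # the 2.5 lives inside the Composite

x = pt.vector("x")
out = Elemwise(comp_op)(x)                               # no separate constant input to iterate over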
Related to #107