Add avgPool3d & maxPool3d #1778

syt123450 · 2019-06-06T08:39:04Z

This PR adds avgPool3d op & maxPool3d op with CPU and WebGL implementation, supports inference and gradient. The APIs align with TensorFlow’s Python API tf.nn.avg_pool3d and tf.nn.max_pool3d. The avgPool3d & maxPool3d ops support tf.layers.averagePooling3d & tf.layers.maxPooling3d (feature requested in tensorflow/tfjs#1035).

As a checklist, features in this PR:

Add tf.avgPool3d to ops
Add tf.maxPool3d to ops
Add avgPool3d kernel to handle tf.avgPool3d’s prediction
Add maxPool3d kernel to handle tf.maxPool3d’s prediction
Add private helper pool3d function in CPU kernel as the implementation of avgPool3d kernel and maxPool3d kernel in CPU end
Add Pool3DProgram in WebGL kernel as the implementation of avgPool3d kernel and maxPool3d kernel in GPU end
Add avgPool3dBackprop kernel to handle tf.avgPool3d’s gradient
Add avgPool3dBackprop CPU kernel implementation
Add AvgPool3DBackpropProgram as the implementation of avgPool3dBackprop WebGL kernel
Add maxPool3dBackprop kernel to handle tf.maxPool3d’s gradient
Add maxPool3dBackprop CPU kernel implementation
Add a private helper function maxPool3dPositions for maxPool3dBackprop in CPU kernel
Add MaxPool3DBackpropProgram as the implementation of maxPool3dBackprop WebGL kernel
Integrate WebGL end’s maxPool3dPositions helper function into Pool3DProgram
Add a computePool3DInfo util function to compute operation information for avgPool3d & maxPool3d
Make check function eitherStridesOrDilationsAreOne support 3D input
Add 14 unit tests for avgPool3d (one test case failed in nodejs env as TF Backend doesn’t support NUMBER pad mode)
Add 7 unit tests for avgPool3dBackprop
Add 13 unit tests for maxPool3d (one test case failed in nodejs env as TF Backend doesn’t support NUMBER pad mode)
Add 10 unit tests for maxPool3dBackprop
Add 8 unit tests for util function computePool3DInfo
Export Tensor5D type
Add jsdocs and executable examples for website api documentation

I built a local website, if everything goes well, the avgPool3d and maxPool3d APIs would lie in Operations/Convolution section and look like the screen shot below:

Relative PRs:

Make nodejs kernel support avgPool3d & maxPool3d tensorflow/tfjs-node#256
Add averagePooling3d layer & maxPooling3d layer tensorflow/tfjs-layers#555
Add avgPool3d & maxPool3d ops for graph model tensorflow/tfjs-converter#375

Reference:

To see the logs from the Cloud Build CI, please join either
our discussion
or announcement mailing list.

This change is

…pool3d

dsmilkov

Really nice work!! I applaud the amazing PR description and how thorough the code changes are. Left a few comments (we use reviewable.io which makes it easy to track changes between revisions, but up to you how you;d like to reply).

Reviewed 1 of 11 files at r1, 10 of 10 files at r2.
Reviewable status: 0 of 1 approvals obtained (waiting on @syt123450)

src/backends/cpu/backend_cpu.ts, line 2615 at r2 (raw file):

                    const pixel =
                        dyBuf.get(batch, dyDepth, dyRow, dyCol, channel);

Let's avoid using tensor.get(....) and tensor.set(...) for performance reasons, and use flat indexing instead, just like you do for the pool3d above.

Not for this PR: We have older code that uses tensor.get(). If you have time and want it, feel free to convert those ops to flat index also, but do it in a new PR.

src/backends/cpu/backend_cpu.ts, line 2704 at r2 (raw file):

              }

              maxPositions.set(maxPosition, batch, yDepth, yRow, yCol, channel);

same here, avoid tensor.set() and tensor.get()

src/backends/cpu/backend_cpu.ts, line 2786 at r2 (raw file):

                }
              }
              dx.set(dotProd, batch, dxDepth, dxRow, dxCol, channel);

same here.

src/ops/conv_util.ts, line 539 at r2 (raw file):

export function eitherStridesOrDilationsAreOne(
    strides: number|[number, number]|[number, number, number],

at this point, let's simplify the type and make it number|number[]. same for dilations below.

src/ops/pool.ts, line 234 at r2 (raw file):

/**
 * Performs an 3D & 4D pooling operation

Since you didn't touch the code inside the pool_ function, let's revert this comment. The current behavior of tf.pool is that it only works for 2D pooling.

Internally, we will revisit how to truly generalize this to ND (which might lead to a small breaking change since we have to stop being relaxed about the input rank -> rank 4 will always mean pool2d, rank 5 will always mean pool3d.

src/ops/pool.ts, line 494 at r2 (raw file):

 * ```
 *
 * @param x The input tensor, of rank 5 of shape

let's also allow a 4d tensor (analogous to how we allow 3d and 4d for conv2d), where we append a batch size of 1 internally. This is not a problem for this op since it's specifically 3D pooling. If this op was a generic ND pooling, this would be a problem (which we have in pool_)

src/ops/pool.ts, line 723 at r2 (raw file):

    dimRoundingMode?: 'floor'|'round'|'ceil',
    dataFormat: 'NDHWC'|'NCDHW' = 'NDHWC'): Tensor5D {
  return maxPool3dImpl_(

no need to have a separate maxPool3DImpl_ function. Let's move that logic inside this method. Do the same for the other *Impl_ methods. This way you also only have 1 place for the jsdoc.

syt123450

Hi @dsmilkov , thanks for the review! I have made the relative changes, could you please take a look again?

Reviewable status: 0 of 1 approvals obtained (waiting on @dsmilkov)

src/backends/cpu/backend_cpu.ts, line 2615 at r2 (raw file):