Commit 7ef1458

Update Tensor & CUDA graph section
1 parent 26f257a commit 7ef1458

File tree

1 file changed: +10 −15 lines

recipes_source/recipes/tuning_guide.py

@@ -323,21 +323,15 @@ def gelu(x):
 # Enable Tensor cores
 # ~~~~~~~~~~~~~~~~~~~~~~~
 # Tensor cores are specialized hardware designed to compute matrix-matrix multiplication
-# operations, which neural network operations can take advantage of.
+# operations, primarily utilized in deep learning and AI workloads. Tensor cores have
+# specific precision requirements which can be adjusted manually or via the Automatic
+# Mixed Precision API.
 #
-# Hardware tensor core operations tend to use a different floating point format
-# which sacrifices precision at the expense of speed gains.
-# Prior to PyTorch 1.12 this functionality was enabled by default, but since that version
-# it must be explicitly set, as it can conflict with some operations which do not
-# benefit from Tensor core computations.
-
-## Tensor computation can be enabled "manually" by modifying the matrix multiplication precision
-## The default precision is "highest", which will perform the operation according to the dtype
-
-# precision "high" and "medium" can be hardware accelerated via tensor cores
-
-# Carefully consider the tradeoff between speed and precision when evaluating your models!
-torch.set_float32_matmul_precision("high")
+# In particular, tensor operations take advantage of lower-precision workloads,
+# which can be controlled via ``torch.set_float32_matmul_precision``.
+# The default precision is 'highest', which performs the operation in the float32 data type.
+# However, PyTorch offers alternative precision settings: 'high' and 'medium'.
+# These options prioritize computational speed over numerical precision.
 
 ###############################################################################
 # Use CUDA Graphs
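The new comment text names ``torch.set_float32_matmul_precision`` as the control knob. A minimal sketch of how the three precision levels are toggled (the tensor sizes are arbitrary, chosen only for illustration):

```python
import torch

# Default is "highest": float32 matmuls run in full float32 precision.
print(torch.get_float32_matmul_precision())  # prints "highest"

# "high" permits TF32 (or equivalent) tensor-core math for float32 matmuls;
# "medium" allows even lower internal precision for additional speed.
torch.set_float32_matmul_precision("high")

a = torch.randn(256, 256)  # arbitrary shapes, just for illustration
b = torch.randn(256, 256)
c = a @ b  # may now use tensor cores on supported GPUs

# Restore the default when full float32 precision matters.
torch.set_float32_matmul_precision("highest")
```

The setting is process-global, so it is worth resetting (or scoping it carefully) when mixing workloads with different accuracy requirements.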
@@ -353,7 +347,8 @@ def gelu(x):
 torch.compile(m, "max-autotune")
 
 ###############################################################################
-# Special care must be taken when using CUDA graphs, as they can lead to increased memory consumption and some models might not compile.
+# Support for CUDA graphs is in development; their use can incur increased
+# device memory consumption, and some models might not compile.
 
 ###############################################################################
 # Enable cuDNN auto-tuner
