@@ -323,21 +323,15 @@ def gelu(x):
# Enable Tensor cores
# ~~~~~~~~~~~~~~~~~~~~~~~
# Tensor cores are specialized hardware designed to compute matrix-matrix multiplication
- # operations, which neural network operations can take advantage of.
+ # operations, primarily utilized in deep learning and AI workloads. Tensor cores have
+ # specific precision requirements which can be adjusted manually or via the Automatic
+ # Mixed Precision API.
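#
# A minimal sketch of the AMP route is shown below; the linear layer and input
# shapes are illustrative assumptions, not code from this tutorial.

import torch

amp_model = torch.nn.Linear(1024, 1024).cuda()
amp_input = torch.randn(8, 1024, device="cuda")

# Matrix multiplications inside this context run in float16 and can be dispatched
# to Tensor cores on supported GPUs.
with torch.autocast(device_type="cuda", dtype=torch.float16):
    amp_output = amp_model(amp_input)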
#
- # Hardware tensor core operations tend to use a different floating point format
- # which sacrifices precision at expense of speed gains.
- # Prior to PyTorch 1.12 this functionality was enabled by default but since this version
- # it must be explicitly set as it can conflict with some operations which do not
- # benefit from Tensor core computations.
-
- ## Tensor computation can be enabled "manually" modifying the matrix multiplication precision
- ## The default precision is "highest" which will perform the operation according to the dtype
-
- # precision "high" and "medium" can be hardware accelerated via tensor cores
-
- # Carefully consider the tradeoff between speed and precision at the moment of evaluating your models!
- torch.set_float32_matmul_precision("high")
+ # In particular, float32 matrix multiplications can take advantage of lower-precision
+ # Tensor core operations, which can be controlled via ``torch.set_float32_matmul_precision``.
+ # The default setting is ``"highest"``, which performs the operation in the full float32
+ # data type. PyTorch also offers the alternative settings ``"high"`` and ``"medium"``,
+ # which trade numerical precision for computational speed.
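#
# A minimal sketch of the manual route is shown below; the tensor shapes and the
# choice of ``"high"`` are illustrative assumptions.

import torch

# "highest" (default) keeps full float32 precision; "high" and "medium" may use
# Tensor cores with reduced internal precision.
torch.set_float32_matmul_precision("high")

mat1 = torch.randn(2048, 2048, device="cuda")
mat2 = torch.randn(2048, 2048, device="cuda")
product = mat1 @ mat2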

###############################################################################
# Use CUDA Graphs
@@ -353,7 +347,8 @@ def gelu(x):
torch.compile(m, mode="max-autotune")

###############################################################################
- # Special care must be present when using cuda graphs as it can lead to increased memory consumption and some models might not compile.
+ # Support for CUDA graphs is still in development, and its use can incur increased
+ # device memory consumption; some models might not compile.
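#
# A minimal, self-contained sketch of compiling a model with ``mode="max-autotune"`` is
# shown below; the module and input shape are illustrative assumptions, not the model
# defined earlier in this tutorial.

import torch

demo_model = torch.nn.Sequential(torch.nn.Linear(512, 512), torch.nn.GELU()).cuda()
compiled_model = torch.compile(demo_model, mode="max-autotune")

demo_input = torch.randn(16, 512, device="cuda")
# The first call triggers compilation and autotuning; subsequent calls reuse the
# optimized kernels (and CUDA graphs, where applicable).
demo_output = compiled_model(demo_input)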

###############################################################################
# Enable cuDNN auto-tuner