Pyt 639 fx blog #875


Open · wants to merge 21 commits into base: site

Changes from all commits (21 commits):
- 801cba4 testing blog addition on local env (arielmoguillansky, Oct 20, 2021)
- 2d06d47 testing blog addition on local env v2 (arielmoguillansky, Oct 20, 2021)
- b52170e [PYT-636]-pyt-1.10.-release-blog v1 (arielmoguillansky, Oct 20, 2021)
- e47c6ab [PYT-637]-pyt-1.10-new-library-releases (arielmoguillansky, Oct 20, 2021)
- 116e461 Updated blog dates to 2021-10-21 (arielmoguillansky, Oct 20, 2021)
- c99fdfb testing bundle version for netlify (arielmoguillansky, Oct 21, 2021)
- 95c32f5 merge from site (arielmoguillansky, Oct 21, 2021)
- 1d1bf4c [PYT-635]-cuda-graphs-new-post (arielmoguillansky, Oct 21, 2021)
- b7f039a [PYT-635] table style correction (arielmoguillansky, Oct 21, 2021)
- 6386415 [pyt-635] removed site.baseUrl for testing purpose (arielmoguillansky, Oct 22, 2021)
- 138492a [PYT-635] site baseUrl rollback2 (arielmoguillansky, Oct 22, 2021)
- 37b14f2 pull from pyt fork (arielmoguillansky, Oct 25, 2021)
- e9986cf Merge branch 'site' into pyt-new-blogs (arielmoguillansky, Oct 27, 2021)
- 6baae18 Merge pull request #109 from shiftlab/pyt-new-blogs (arielmoguillansky, Oct 27, 2021)
- 0e20c3a Merge pull request #110 from shiftlab/pyt-634-previous-ptcv-episodes (arielmoguillansky, Oct 27, 2021)
- 230c983 merge from site (arielmoguillansky, Oct 27, 2021)
- fbb262c Merge pull request #111 from shiftlab/pyt-635-new-blog (arielmoguillansky, Oct 27, 2021)
- a94c693 [PYT-639] FX blog (arielmoguillansky, Oct 27, 2021)
- 80f16e2 [PYT-639] FX blog removed site.url for netlify img preview (arielmoguillansky, Oct 27, 2021)
- a08551a rollback (arielmoguillansky, Oct 27, 2021)
- 36abb58 [PYT-639] fx-based-feature post (arielmoguillansky, Oct 28, 2021)
Gemfile.lock (2 additions, 1 deletion)

@@ -273,4 +273,5 @@ DEPENDENCIES
RUBY VERSION
ruby 2.5.1p57


BUNDLED WITH
2.2.22
_layouts/blog.html (0 additions, 1 deletion)

@@ -33,7 +33,6 @@ <h1 class="blog-index-title">
<div class="main-content">
<div class="container">
<div class="row blog-vertical">

{% for post in posts %}
<div class="vertical-blog-container">
<div class="col-md-4">
_posts/2021-10-21-accelerating-pytorch-with-cuda-graphs.md (247 additions, 0 deletions)

Large diffs are not rendered by default.
_posts/2021-10-21-pytorch-1.10-main-release.md (104 additions, 0 deletions)

@@ -0,0 +1,104 @@
---
layout: blog_detail
title: 'PyTorch 1.10 Release, including CUDA Graphs APIs, TorchScript improvements'
author: Team PyTorch
---

We are excited to announce the release of PyTorch 1.10. This release is composed of around 3,400 commits since 1.9, made by 426 contributors. We want to sincerely thank our community for continuously improving PyTorch.

PyTorch 1.10 updates are focused on improving the training and performance of PyTorch, as well as developer usability. The full release notes are available [here](https://github.com/pytorch/pytorch/releases/tag/v1.10.0). Highlights include:
1. CUDA Graphs APIs are integrated to reduce CPU overheads for CUDA workloads
2. New features to optimize usability and performance of TorchScript - profile-directed typing in TorchScript & LLVM-based JIT Compiler for CPUs
3. Android NNAPI support now in beta

Alongside 1.10, we are also releasing major updates to TorchAudio and TorchVision, and introducing TorchX, a new SDK for quickly building and deploying ML applications from research to production. See [this blog post](https://pytorch.org/blog/pytorch-1.10-new-library-releases/) for details. Features in PyTorch releases are classified as Stable, Beta, and Prototype; you can learn more about these definitions in [this blog post](https://pytorch.org/blog/pytorch-feature-classification-changes/).

# Frontend APIs

### (Stable) Python code transformations with FX

FX provides a Pythonic platform for transforming and lowering PyTorch programs. It is a toolkit for pass writers that facilitates Python-to-Python transformation of functions and ```nn.Module``` instances. To keep transforms easy to implement, the toolkit supports a subset of Python language semantics rather than the whole language. With 1.10, FX is moving to stable.

You can learn more about FX in the [official documentation](https://pytorch.org/docs/master/fx.html) and in the [GitHub examples](https://github.com/pytorch/examples/tree/master/fx) of program transformations implemented using ```torch.fx```.
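
To make this concrete, here is a minimal sketch of an FX pass; the toy module and the relu-to-gelu rewrite below are illustrative examples, not part of the release notes:

```python
import torch
import torch.fx

class M(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x) + 1.0

# symbolic_trace captures the forward pass as a Graph of operations.
traced = torch.fx.symbolic_trace(M())
print(traced.graph)  # inspect the captured IR

# A minimal Python-to-Python pass: swap every relu call for gelu.
for node in traced.graph.nodes:
    if node.op == "call_function" and node.target is torch.relu:
        node.target = torch.nn.functional.gelu
traced.recompile()  # regenerate forward() from the edited graph

print(traced(torch.randn(3)))
```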

### (Stable) *torch.special*

A ```torch.special``` module, analogous to [SciPy’s special module](https://docs.scipy.org/doc/scipy/reference/special.html), is now available in stable. The module has 30 operations, including gamma, Bessel, and error functions. Refer to the [documentation](https://pytorch.org/docs/master/special.html) for more details.
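
For example, a few of the new operations can be called directly from the module (the inputs below are arbitrary):

```python
import torch

x = torch.linspace(0.1, 2.0, steps=4)

# A few of the SciPy-style special functions now available in stable.
print(torch.special.erf(x))      # error function
print(torch.special.gammaln(x))  # log of the absolute value of the gamma function
print(torch.special.i0(x))       # modified Bessel function of the first kind
```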

### (Stable) nn.Module Parametrization

```nn.Module``` parametrization, a feature that allows users to parametrize any parameter or buffer of an ```nn.Module``` without modifying the ```nn.Module``` itself, is available in stable. This release adds weight normalization (```weight_norm```), orthogonal parametrization (matrix constraints and part of pruning), and more flexibility when creating your own parametrizations.

Refer to this [tutorial](https://pytorch.org/tutorials/intermediate/parametrizations.html) and the general [documentation](https://pytorch.org/docs/master/generated/torch.nn.utils.parametrizations.spectral_norm.html?highlight=parametrize) for more details.
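
As a minimal sketch of the mechanism, mirroring the parametrization tutorial, a custom parametrization is just an `nn.Module` registered onto a tensor; the `Symmetric` constraint below is an illustrative example:

```python
import torch
import torch.nn as nn
import torch.nn.utils.parametrize as parametrize

class Symmetric(nn.Module):
    def forward(self, X):
        # Recomputed on every access: constrain the weight to be symmetric.
        return X.triu() + X.triu(1).transpose(-1, -2)

layer = nn.Linear(4, 4)
parametrize.register_parametrization(layer, "weight", Symmetric())

W = layer.weight               # goes through the parametrization
print(torch.allclose(W, W.T))  # True: the weight is symmetric
```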

### (Beta) CUDA Graphs APIs Integration

PyTorch now integrates CUDA Graphs APIs to reduce CPU overheads for CUDA workloads.

CUDA Graphs greatly reduce the CPU overhead for CPU-bound CUDA workloads and thus improve performance by increasing GPU utilization. For distributed workloads, CUDA Graphs also reduce jitter, and since parallel workloads have to wait for the slowest worker, reducing jitter improves overall parallel efficiency.

The integration allows seamless interop between the parts of a network captured by CUDA graphs and the parts that cannot be captured due to graph limitations.

Read the [note](https://pytorch.org/docs/master/notes/cuda.html#cuda-graphs) for more details and examples, and refer to the general [documentation](https://pytorch.org/docs/master/generated/torch.cuda.CUDAGraph.html#torch.cuda.CUDAGraph) for additional information.
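
To give a feel for the API, here is a minimal capture-and-replay sketch following the pattern in the CUDA semantics note; it assumes a CUDA-capable GPU and uses a toy model for illustration:

```python
import torch

model = torch.nn.Linear(64, 64).cuda()
static_input = torch.randn(8, 64, device="cuda")

# Warm up on a side stream before capture, as the docs recommend.
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
    for _ in range(3):
        static_output = model(static_input)
torch.cuda.current_stream().wait_stream(s)

# Capture one forward pass, then replay it with near-zero CPU overhead.
g = torch.cuda.CUDAGraph()
with torch.cuda.graph(g):
    static_output = model(static_input)

# Replays rerun the captured kernels on whatever is in static_input.
static_input.copy_(torch.randn(8, 64, device="cuda"))
g.replay()
print(static_output.shape)
```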

# Distributed Training

### Distributed Training Releases Now in Stable

In 1.10, there are a number of features that are moving from beta to stable in the distributed package:

* **(Stable) Remote Module**: This feature allows users to operate a module on a remote worker as if it were a local module; the RPCs are transparent to the user. Refer to this [documentation](https://pytorch.org/docs/master/rpc.html#remotemodule) for more details.

* **(Stable) DDP Communication Hook**: This feature allows users to override how DDP synchronizes gradients across processes. Refer to this [documentation](https://pytorch.org/docs/master/ddp_comm_hooks.html) for more details.

* **(Stable) ZeroRedundancyOptimizer**: This feature can be used in conjunction with DistributedDataParallel to reduce the size of per-process optimizer states. With this stable release, it can now handle uneven inputs to different data-parallel workers; check out this [tutorial](https://pytorch.org/tutorials/advanced/generic_join.html). We also improved the parameter partition algorithm to better balance memory and computation overhead across processes. Refer to this [documentation](https://pytorch.org/docs/master/distributed.optim.html) and this [tutorial](https://pytorch.org/tutorials/recipes/zero_redundancy_optimizer.html) to learn more. A minimal usage sketch follows this list.
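
As referenced above, here is a minimal sketch of wiring `ZeroRedundancyOptimizer` into a DDP setup; the toy model, learning rate, and process-group bootstrap are illustrative assumptions:

```python
import torch
import torch.distributed as dist
from torch.distributed.optim import ZeroRedundancyOptimizer
from torch.nn.parallel import DistributedDataParallel as DDP

# Assumes a process group is already initialized, e.g. via torchrun:
# dist.init_process_group("nccl")

def build(model: torch.nn.Module):
    ddp_model = DDP(model.cuda())
    # Optimizer state is sharded across ranks instead of replicated.
    optimizer = ZeroRedundancyOptimizer(
        ddp_model.parameters(),
        optimizer_class=torch.optim.Adam,
        lr=1e-3,
    )
    return ddp_model, optimizer
```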

# Performance Optimization and Tooling

### (Beta) Profile-directed typing in TorchScript

TorchScript has a hard requirement that source code have type annotations in order for compilation to be successful. For a long time, it was only possible to add missing or incorrect type annotations through trial and error (i.e., by fixing the type-checking errors generated by ```torch.jit.script``` one by one), which was inefficient and time-consuming.

Now, we have enabled profile-directed typing for ```torch.jit.script``` by leveraging existing tools like MonkeyType, which makes the process much easier, faster, and more efficient. For more details, refer to the [documentation](https://pytorch.org/docs/1.9.0/jit.html).
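
As a sketch of the workflow, assuming MonkeyType is installed and using the `example_inputs` argument this feature adds to `torch.jit.script`:

```python
import torch

def scale(x, factor):  # note: no type annotations
    return x * factor

# torch.jit.script runs the function on the example inputs (via
# MonkeyType) to infer the missing annotations, then compiles it.
scripted = torch.jit.script(scale, example_inputs=[(torch.randn(3), 2.0)])
print(scripted(torch.randn(3), 3.0))
```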

### (Beta) CPU Fusion

In PyTorch 1.10, we've added an LLVM-based JIT compiler for CPUs that can fuse together sequences of `torch` library calls to improve performance. While we've had this capability for some time on GPUs, this release is the first time we've brought compilation to the CPU.

You can check out a few performance results for yourself in this [Colab notebook](https://colab.research.google.com/drive/1xaH-L0XjsxUcS15GG220mtyrvIgDoZl6?usp=sharing).
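
As a minimal sketch, a scripted chain of pointwise ops is a natural fusion candidate; whether fusion actually fires depends on the executor's heuristics:

```python
import torch

@torch.jit.script
def pointwise_chain(x: torch.Tensor) -> torch.Tensor:
    # A chain of elementwise ops the CPU fuser can compile into
    # a single loop instead of several separate kernel calls.
    return (x.sin() + x.cos()).mul(2.0).tanh()

x = torch.randn(1024)
for _ in range(3):  # the profiling executor fuses after warm-up runs
    y = pointwise_chain(x)
print(y.shape)
```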

### (Beta) PyTorch Profiler

The objective of PyTorch Profiler is to target the execution steps that are the most costly in time and/or memory, and visualize the workload distribution between GPUs and CPUs. PyTorch 1.10 includes the following key features:

* **Enhanced Memory View**: This helps you understand your memory usage better. This tool will help you avoid Out of Memory errors by showing active memory allocations at various points of your program run.

* **Enhanced Automated Recommendations**: This provides automated performance recommendations to help optimize your model, suggesting changes to batch size, Tensor Core usage, memory-reduction techniques, and more.

* **Distributed Training**: Gloo is now supported for distributed training jobs.

* **Correlate Operators in the Forward & Backward Pass**: This helps map the operators found in the forward pass to the backward pass, and vice versa, in a trace view.

* **TensorCore**: This tool shows the Tensor Core (TC) usage and provides recommendations for data scientists and framework developers.

Refer to this [documentation](https://pytorch.org/docs/stable/profiler.html) for details. Check out this [tutorial](https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html) to learn how to get started with this feature.
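
A typical profiling loop looks like the following sketch; the toy model and the `./log` output directory are illustrative:

```python
import torch
from torch.profiler import (
    ProfilerActivity, profile, schedule, tensorboard_trace_handler,
)

model = torch.nn.Sequential(torch.nn.Linear(128, 128), torch.nn.ReLU())
inputs = torch.randn(32, 128)

with profile(
    activities=[ProfilerActivity.CPU],             # add ProfilerActivity.CUDA on GPU
    schedule=schedule(wait=1, warmup=1, active=3), # skip, warm up, then record
    on_trace_ready=tensorboard_trace_handler("./log"),
    profile_memory=True,                           # feeds the enhanced memory view
) as prof:
    for _ in range(5):
        model(inputs)
        prof.step()  # mark a step boundary for the scheduler
```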

# PyTorch Mobile

### (Beta) Android NNAPI Support in Beta

Last year we [released prototype support](https://medium.com/pytorch/pytorch-mobile-now-supports-android-nnapi-e2a2aeb74534) for Android’s Neural Networks API (NNAPI). NNAPI allows Android apps to run computationally intensive neural networks on the most powerful and efficient parts of the chips that power mobile phones, including GPUs (Graphics Processing Units) and NPUs (specialized Neural Processing Units).

Try out this feature using the [tutorial](https://pytorch.org/tutorials/prototype/nnapi_mobilenetv2.html). Please provide your feedback or ask questions on [the forum](https://discuss.pytorch.org/c/mobile/18). You can also check out [this presentation](https://www.youtube.com/watch?v=B-2spa3UCTU) to learn more.
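
The conversion flow from the prototype tutorial looks roughly like the sketch below; note that `convert_model_to_nnapi` lives under a private namespace and may change between releases, and the toy model is illustrative:

```python
import torch
# Helper from the NNAPI prototype tutorial; the import path may change.
from torch.backends._nnapi.prepare import convert_model_to_nnapi

model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).eval()
example = torch.randn(1, 3, 224, 224).contiguous(memory_format=torch.channels_last)
example.nnapi_nhwc = True  # tag NHWC layout for NNAPI, per the tutorial

traced = torch.jit.trace(model, example)
nnapi_model = convert_model_to_nnapi(traced, example)
nnapi_model._save_for_lite_interpreter("model_nnapi.ptl")  # load on Android
```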

### (Beta) PyTorch Bundle Inputs

PyTorch now provides a utility that allows TorchScript models to have example inputs bundled directly with them, streamlining the process of shipping runnable inputs alongside a model. These inputs can be used to actually run the model in benchmarking applications, or to trace the operators used by something like mobile’s upcoming tracing-based selective build. They can also be used simply to specify input shapes for certain pipelines.
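
A minimal sketch of the utility (the toy model and input shapes are illustrative):

```python
import torch
import torch.utils.bundled_inputs

model = torch.jit.script(torch.nn.Linear(10, 2))

# Attach example inputs directly to the TorchScript model.
torch.utils.bundled_inputs.augment_model_with_bundled_inputs(
    model, [(torch.randn(1, 10),)]
)

# Later, e.g. in a benchmark harness, retrieve and run them.
for args in model.get_all_bundled_inputs():
    print(model(*args))
```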

You can find a tutorial for this feature here [<waiting on pytorch recipes link to be live>], and provide your feedback on the [PyTorch Discussion Forum - Mobile](https://discuss.pytorch.org/c/mobile/18).

Thanks for reading. If you’re interested in these updates and want to join the PyTorch community, we encourage you to join the [discussion forums](https://discuss.pytorch.org/) and [open GitHub issues](https://github.com/pytorch/pytorch/issues). To get the latest news from PyTorch, follow us on [Facebook](https://www.facebook.com/pytorch/), [Twitter](https://twitter.com/PyTorch), [Medium](https://medium.com/pytorch), [YouTube](https://www.youtube.com/pytorch), or [LinkedIn](https://www.linkedin.com/company/pytorch).

Cheers!

Team PyTorch