| 1 | +# `TensorFlow (TF) Transformer with Intel® Advanced Matrix Extensions (Intel® AMX) bfloat16 Mixed Precision Learning` |
| 2 | + |
| 3 | +This sample code demonstrates optimizing a TensorFlow model with Intel® Advanced Matrix Extensions (Intel® AMX) using bfloat16 (Brain Floating Point) on 4th Gen Intel® Xeon® Scalable Processors (Sapphire Rapids). |
| 4 | + |
| 5 | +| Area | Description |
| 6 | +|:--- |:-- |
| 7 | +| What you will learn | How to use AMX bfloat16 mixed precision learning on a TensorFlow model |
| 8 | +| Time to complete | 15 minutes |
| 9 | + |
| 10 | +> **Note**: The sample is based on the [*Text classification with Transformer*](https://keras.io/examples/nlp/text_classification_with_transformer/) Keras sample. |
| 11 | + |
| 12 | + |
| 13 | +## Purpose |
| 14 | + |
| 15 | +In this sample, you will run a transformer classification model with bfloat16 mixed precision learning on the Intel® AMX ISA and compare its performance against AVX512. You should see that Intel® AMX improves performance over AVX512 while retaining the expected precision. |
| 16 | + |
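The AMX versus AVX512 comparison described above can be sketched as follows. This is a minimal illustration, not the notebook's own code: it assumes the ISA is capped through oneDNN's `ONEDNN_MAX_CPU_ISA` environment variable with the values `AVX512_CORE_BF16` and `AVX512_CORE_AMX`, and that each configuration runs in its own process so the cap is in place before the library initializes. The script name `train_transformer.py` and the helper names are hypothetical.

```python
import os
import subprocess
import sys
import time

# Assumed ISA caps for oneDNN's ONEDNN_MAX_CPU_ISA variable; the notebook
# itself may select the ISA differently.
ISA_CAPS = {"avx512": "AVX512_CORE_BF16", "amx": "AVX512_CORE_AMX"}


def env_for_isa(isa, base_env=None):
    """Return a copy of the environment with the oneDNN ISA cap applied."""
    env = dict(os.environ if base_env is None else base_env)
    env["ONEDNN_MAX_CPU_ISA"] = ISA_CAPS[isa]
    return env


def timed_run(script, isa):
    """Run `script` in a fresh Python process under the given ISA cap.

    Returns the wall-clock time in seconds, which includes process start-up.
    """
    start = time.perf_counter()
    subprocess.run([sys.executable, script], env=env_for_isa(isa), check=True)
    return time.perf_counter() - start


# Hypothetical usage, assuming a training script named train_transformer.py:
#   avx512_s = timed_run("train_transformer.py", "avx512")
#   amx_s = timed_run("train_transformer.py", "amx")
#   print(f"AVX512: {avx512_s:.1f} s, AMX: {amx_s:.1f} s")
```

Running each configuration in a separate process keeps the two measurements independent, at the cost of including start-up time in each figure.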
| 17 | +## Prerequisites |
| 18 | + |
| 19 | +This sample code works on **Sapphire Rapids** only. |
| 20 | + |
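Because the sample depends on AMX hardware, it can help to check the CPU before starting. The sketch below is an illustrative helper that is not part of the sample: it assumes a Linux system that reports the `amx_tile` and `amx_bf16` feature flags in `/proc/cpuinfo`, and the function names are hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # All logical CPUs normally report the same flags, so the first
            # "flags" line is enough for this check.
            return set(value.split())
    return set()


def supports_amx_bf16(cpuinfo_text):
    """True if both the AMX tile and AMX bfloat16 flags are present."""
    return {"amx_tile", "amx_bf16"} <= cpu_flags(cpuinfo_text)


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the cpuinfo file, returning an empty string if it is unavailable."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""


print("AMX bfloat16 available:", supports_amx_bf16(read_cpuinfo()))
```

If the check prints `False`, the AMX portion of the comparison is not expected to run on that machine.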
| 21 | +| Optimized for | Description |
| 22 | +|:--- |:--- |
| 23 | +| OS | Ubuntu* 20.04 |
| 24 | +| Hardware | Sapphire Rapids |
| 25 | +| Software | Intel® AI Analytics Toolkit (AI Kit) |
| 26 | + |
| 27 | +The sample assumes Intel® Optimization for TensorFlow is installed. (See the [Intel® Optimization for TensorFlow* Installation Guide](https://www.intel.com/content/www/us/en/developer/articles/guide/optimization-for-TensorFlow-installation-guide.html) for more information.) |
| 28 | + |
| 29 | +### For Local Development Environments |
| 30 | + |
| 31 | +You will need to download and install the following toolkits, tools, and components to use the sample. |
| 32 | + |
| 33 | +- **Intel® AI Analytics Toolkit (AI Kit)** |
| 34 | + |
| 35 | + You can get the AI Kit from [Intel® oneAPI Toolkits](https://www.intel.com/content/www/us/en/developer/tools/oneapi/toolkits.html#analytics-kit). <br> See [*Get Started with the Intel® AI Analytics Toolkit for Linux**](https://www.intel.com/content/www/us/en/develop/documentation/get-started-with-ai-linux) for AI Kit installation information and post-installation steps and scripts. |
| 36 | + |
| 37 | +- **Jupyter Notebook** |
| 38 | + |
| 39 | + Install using pip: `pip install notebook`. <br> Alternatively, see [*Installing Jupyter*](https://jupyter.org/install) for detailed installation instructions. |
| 40 | + |
| 41 | + |
| 42 | +- **Intel® oneAPI Data Analytics Library** |
| 43 | + |
| 44 | + You might need some parts of the [Intel® oneAPI Data Analytics Library](https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onedal.html). |
| 45 | + |
| 46 | + |
| 47 | +### For Intel® DevCloud |
| 48 | + |
| 49 | +The necessary tools and components are already installed in the environment. You do not need to install additional components. See [Intel® DevCloud for oneAPI](https://devcloud.intel.com/oneapi/get_started/) for information. |
| 50 | + |
| 51 | + |
| 52 | +## Key Implementation Details |
| 53 | + |
| 54 | +The sample code is written in Python and targets Sapphire Rapids only. |
| 55 | + |
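As background for the bfloat16 mixed precision that the sample relies on, the sketch below shows what the format does to a value. bfloat16 keeps the sign bit and 8 exponent bits of a float32 but only 7 mantissa bits, so it is the upper 16 bits of the float32 encoding. This is a standard-library illustration for finite inputs, using round-to-nearest-even. It is not how TensorFlow or oneDNN perform the conversion, and the function names are hypothetical.

```python
import struct


def float32_to_bfloat16_bits(x):
    """Return the 16-bit bfloat16 encoding of a finite float (nearest-even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Adding 0x7FFF plus the lowest kept bit rounds ties to the even mantissa.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bfloat16_bits_to_float(b):
    """Expand a 16-bit bfloat16 encoding back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]


def round_to_bfloat16(x):
    """Round a finite float to the nearest value representable in bfloat16."""
    return bfloat16_bits_to_float(float32_to_bfloat16_bits(x))


print(round_to_bfloat16(3.14159))  # 3.140625
```

The result keeps the full float32 exponent range but only two to three significant decimal digits, which is why mixed precision typically keeps some values in float32.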
| 56 | + |
| 57 | +## Run the Sample |
| 58 | + |
| 59 | +### On Linux* |
| 60 | + |
| 61 | +> **Note**: If you have not already done so, set up your CLI |
| 62 | +> environment by sourcing the `setvars` script in the root of your oneAPI installation. |
| 63 | +> |
| 64 | +> Linux*: |
| 65 | +> - For system-wide installations: `. /opt/intel/oneapi/setvars.sh` |
| 66 | +> - For private installations: `. ~/intel/oneapi/setvars.sh` |
| 67 | +> - For non-POSIX shells, like csh, use the following command: `bash -c 'source <install-dir>/setvars.sh ; exec csh'` |
| 68 | +> |
| 69 | +> For more information on configuring environment variables, see [Use the setvars Script with Linux* or macOS*](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html). |
| 70 | + |
| 71 | +#### Activate Conda |
| 72 | + |
| 73 | +1. Activate the Conda environment. |
| 74 | + |
| 75 | + ``` |
| 76 | + conda activate tensorflow |
| 77 | + ``` |
| 78 | + |
| 79 | + By default, the AI Kit is installed in the `/opt/intel/oneapi` folder and requires root privileges to manage it. |
| 80 | + |
| 81 | + You can also activate a Conda environment without root access. To do so, clone the desired Conda environment and activate the clone using commands similar to the following. |
| 82 | + |
| 83 | + ``` |
| 84 | + conda create --name usr_tensorflow --clone tensorflow |
| 85 | + conda activate usr_tensorflow |
| 86 | + ``` |
| 87 | + |
| 88 | +#### Run the Notebook |
| 89 | + |
| 90 | +1. Launch Jupyter Notebook. |
| 91 | + ``` |
| 92 | + jupyter notebook --ip=0.0.0.0 |
| 93 | + ``` |
| 94 | +2. Follow the instructions to open the URL with the token in your browser. |
| 95 | +3. Locate and select the Notebook. |
| 96 | + ``` |
| 97 | + IntelTensorFlow_Transformer_AMX_bfloat16_MixedPrecision.ipynb |
| 98 | + ``` |
| 99 | +4. Run every cell in the Notebook in sequence. |
| 100 | + |
| 101 | + |
| 102 | +#### Troubleshooting |
| 103 | + |
| 104 | +If you receive an error message, troubleshoot the problem using the **Diagnostics Utility for Intel® oneAPI Toolkits**. The diagnostic utility provides configuration and system checks to help find missing dependencies, permissions errors, and other issues. See the [Diagnostics Utility for Intel® oneAPI Toolkits User Guide](https://www.intel.com/content/www/us/en/develop/documentation/diagnostic-utility-user-guide/top.html) for more information on using the utility. |
| 105 | + |
| 106 | + |
| 107 | +### Run the Sample on Intel® DevCloud |
| 108 | + |
| 109 | +1. If you do not already have an account, request an Intel® DevCloud account at [*Create an Intel® DevCloud Account*](https://intelsoftwaresites.secure.force.com/DevCloud/oneapi). |
| 110 | +2. On a Linux* system, open a terminal. |
| 111 | +3. SSH into Intel® DevCloud. |
| 112 | + ``` |
| 113 | + ssh DevCloud |
| 114 | + ``` |
| 115 | + > **Note**: You can find information about configuring your Linux* system and connecting to Intel® DevCloud at Intel® DevCloud for oneAPI [Get Started](https://devcloud.intel.com/oneapi/get_started). |
| 116 | + |
| 117 | +4. Locate and select the Notebook. |
| 118 | + ``` |
| 119 | + IntelTensorFlow_Transformer_AMX_bfloat16_MixedPrecision.ipynb |
| 120 | + ``` |
| 121 | +5. Run every cell in the Notebook in sequence. |
| 122 | + |
| 123 | + |
| 124 | +## Example Output |
| 125 | + |
| 126 | +You should see pie charts that break down JIT kernel time by kernel type for both AVX512 and AMX. |
| 127 | + |
| 128 | +The following image shows a typical example of the JIT kernel time breakdown diagrams. |
| 129 | + |
| 130 | + |
| 131 | + |
| 132 | +## Further Reading |
| 133 | + |
| 134 | +Explore [Get Started with the Intel® AI Analytics Toolkit for Linux*](https://www.intel.com/content/www/us/en/develop/documentation/get-started-with-ai-linux/top.html) to find out how you can achieve performance gains for popular deep-learning and machine-learning frameworks through Intel optimizations. |
| 135 | + |
| 136 | +## License |
| 137 | + |
| 138 | +Code samples are licensed under the MIT license. See [License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) |
| 139 | +for details. |
| 140 | + |
| 141 | +Third party program Licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt). |