Commit 61bd0f7

jkinsky, jimmytwei, krzeszew, alexsin368, ZhaoqiongZ
authored
Ai and analytics features and functionality intel tensor flow inference optimization (#1462)
* Fixes for 2023.1 AI Kit (#1409)
* Intel Python Numpy Numba_dpes kNN sample (#1292)
* *.py and *.ipynb files with implementation
* README.md and sample.json files with documentation
* License and thir party programs
* Adding PyTorch Training Optimizations with AMX BF16 oneAPI sample (#1293)
* add IntelPytorch Quantization code samples (#1301)
* add IntelPytorch Quantization code samples
* fix the spelling error in the README file
* use john's README with grammar fix and title change
* Rename third-party-grograms.txt to third-party-programs.txt

  Co-authored-by: Jimmy Wei <[email protected]>
* AMX bfloat16 mixed precision learning TensorFlow Transformer sample (#1317)
* [New Sample] Intel Extension for TensorFlow Getting Started (#1313)
* first draft
* Update README.md
* remove redunant file
* [New Sample] [oneDNN] Benchdnn tutorial (#1315)
* New Sample: benchDNN tutorial
* Update readme: new sample
* Rename sample to benchdnn_tutorial
* Name fix
* Add files via upload (#1320)
* [New Sample] oneCCL Bindings for PyTorch Getting Started (#1316)
* Update README.md
* [New Sample] oneCCL Bindings for PyTorch Getting Started
* Update README.md
* add torch-ccl version check
* [New Sample] Intel Extension for PyTorch Getting Started (#1314)
* add new ipex GSG notebook for dGPU
* Update sample.json for expertise field
* Update requirements.txt

  Update package versions to comply with Snyk tool
* Updated title field in sample.json in TF Transformer AMX bfloat16 Mixed Precision sample to fit within character length range (#1327)
* add arch checker class (#1332)
* change gpu.patch to convert the code samples from cpu to gpu correctly (#1334)
* Fixes for spelling in AMX bfloat16 transformer sample and printing error in python code in numpy vs numba sample (#1335)
* 2023.1 ai kit itex get started example fix (#1338)
* Fix the typo
* Update ResNet50_Inference.ipynb
* fix resnet inference demo link (#1339)
* Fix printing issue in numpy vs numba AI sample (#1356)
* Fix Invalid Kmeans parameters on oneAPI 2023 (#1345)
* Update README to add new samples into the list (#1366)
* PyTorch AMX BF16 Training sample: remove graphs and performance numbers (#1408)
* Adding PyTorch Training Optimizations with AMX BF16 oneAPI sample
* remove performance graphs, update README
* remove graphs from README and folder
* update top README in Features and Functionality

---------

Co-authored-by: krzeszew <[email protected]>
Co-authored-by: alexsin368 <[email protected]>
Co-authored-by: ZhaoqiongZ <[email protected]>
Co-authored-by: Louie Tsai <[email protected]>
Co-authored-by: Orel Yehuda <[email protected]>
Co-authored-by: yuning <[email protected]>
Co-authored-by: Wang, Kai Lawrence <[email protected]>
Co-authored-by: xiguiw <[email protected]>

* Optimize TensorFlow Pre-trained Model for Inference sample readme update

  Restructured section to match new template. Updated name in readme to match name in sample.json. Updated sample.json file for title information. Updated lots of formatting issues. Added information related to DevCloud. Added information on setting vars. Corrected spelling and grammar errors in top-level readme and readme in the "scripts" subfolder. Added link from main readme to subfolder readme.

---------

Co-authored-by: Jimmy Wei <[email protected]>
Co-authored-by: krzeszew <[email protected]>
Co-authored-by: alexsin368 <[email protected]>
Co-authored-by: ZhaoqiongZ <[email protected]>
Co-authored-by: Louie Tsai <[email protected]>
Co-authored-by: Orel Yehuda <[email protected]>
Co-authored-by: yuning <[email protected]>
Co-authored-by: Wang, Kai Lawrence <[email protected]>
Co-authored-by: xiguiw <[email protected]>
1 parent 37c1ef4 commit 61bd0f7

File tree

3 files changed

+151
-117
lines changed

Original file line numberDiff line numberDiff line change
@@ -1,102 +1,136 @@
1-
# Tutorial : Optimize TensorFlow pre-trained model for inference
2-
This tutorial will guide you how to optimize a pre-trained model for a better inference performance, and also
3-
analyze the model pb files before and after the inference optimizations.
4-
Both the Intel® Low Precision Optimization Tool (Intel® LPOT) and TensorFlow optimization tools are used in this tutorial, and Intel LPOT is the preferred tool for inference optimization on Intel Architectures.
5-
6-
| Optimized for | Description
7-
|:--- |:---
8-
| OS | Ubuntu* 18.04
9-
| Hardware | Intel® Xeon® Scalable processor family or newer
10-
| Software | [Intel® AI Analytics Toolkit (AI Kit)](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit.html)
11-
| What you will learn | Optimize a pre-trained model for a better inference performance
12-
| Time to complete | 30 minutes
1+
# `Optimize TensorFlow Pre-trained Model for Inference` Sample
2+
3+
The `Optimize TensorFlow Pre-trained Model for Inference` sample demonstrates how to optimize a pre-trained model for better inference performance and how to analyze the model PB files before and after the inference optimizations.
4+
5+
| Area | Description
6+
|:--- |:---
7+
| What you will learn | Optimize a pre-trained model for a better inference performance
8+
| Time to complete | 30 minutes
9+
| Category | Code Optimization
10+
11+
## Prerequisites
12+
13+
| Optimized for | Description
14+
|:--- |:---
15+
| OS | Ubuntu* 18.04
16+
| Hardware | Intel® Xeon® Scalable processor family or newer
17+
| Software | [Intel® AI Analytics Toolkit (AI Kit)](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit.html)
18+
19+
### For Local Development Environments
20+
21+
- **Intel® AI Analytics Toolkit (AI Kit)**
22+
23+
You can get the AI Kit from [Intel® oneAPI Toolkits](https://www.intel.com/content/www/us/en/developer/tools/oneapi/toolkits.html#analytics-kit). <br> See [*Get Started with the Intel® AI Analytics Toolkit for Linux*](https://www.intel.com/content/www/us/en/develop/documentation/get-started-with-ai-linux) for AI Kit installation information and post-installation steps and scripts.
24+
25+
- **Jupyter Notebook**
26+
27+
Install using PIP: `pip install notebook`. <br> Alternatively, see [*Installing Jupyter*](https://jupyter.org/install) for detailed installation instructions.
28+
29+
The sample uses both the Intel® Low Precision Optimization Tool (Intel® LPOT) and TensorFlow* optimization tools; Intel® LPOT is the preferred tool for inference optimization on Intel® architectures.
30+
31+
### For Intel® DevCloud
32+
33+
The necessary tools and components are already installed in the environment. You do not need to install additional components. See [Intel® DevCloud for oneAPI](https://devcloud.intel.com/oneapi/get_started/) for information.
1334

1435
## Purpose
15-
Show users the importance of inference optimization on performance, and also analyze TensorFlow ops difference in pre-trained models before/after the optimizations.
16-
Those optimizations include:
17-
* Converting variables to constants.
18-
* Removing training-only operations like checkpoint saving.
19-
* Stripping out parts of the graph that are never reached.
20-
* Removing debug operations like CheckNumerics.
21-
* Folding batch normalization ops into the pre-calculated weights.
22-
* Fusing common operations into unified versions.
23-
24-
## Key implementation details
25-
This tutorial contains one Jupyter notebook and three python scripts listed below.
36+
37+
This sample shows the performance benefit of inference optimization and analyzes the differences in TensorFlow ops in pre-trained models before and after the optimizations. Those optimizations include:
38+
39+
- Converting variables to constants.
40+
- Removing training-only operations like checkpoint saving.
41+
- Stripping out parts of the graph that are never reached.
42+
- Removing debug operations like CheckNumerics.
43+
- Folding batch normalization ops into the pre-calculated weights.
44+
- Fusing common operations into unified versions.
45+
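Most of these transforms are graph rewrites. As a conceptual sketch of one of them, stripping out parts of the graph that are never reached, consider pruning a toy graph represented as a dict (the node names and dict structure here are illustrative stand-ins for a TensorFlow `GraphDef`, not the sample's actual implementation):

```python
# Conceptual sketch: strip graph parts that are never reached from the
# inference output. The dict maps each node to its list of input nodes;
# node names are illustrative, not from the sample.
def prune_unreachable(graph, output):
    """Keep only nodes reachable from `output` by walking input edges."""
    keep, stack = set(), [output]
    while stack:
        node = stack.pop()
        if node in keep:
            continue
        keep.add(node)
        stack.extend(graph[node])
    return {name: inputs for name, inputs in graph.items() if name in keep}

graph = {
    "input": [],
    "conv": ["input"],
    "save/checkpoint": ["conv"],  # training-only op, not on the inference path
    "softmax": ["conv"],
}
# The training-only checkpoint node is stripped; the inference path survives.
print(prune_unreachable(graph, "softmax"))
```

The real tools apply the same reachability idea (plus constant folding and op fusion) directly to the serialized graph.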
46+
## Key Implementation Details
47+
48+
This tutorial contains one Jupyter Notebook and three Python scripts listed below.
49+
2650
### Jupyter Notebooks
2751

28-
| Notebook | Notes|
29-
| ------ | ------ |
30-
| tutorial_optimize_TensorFlow_pretrained_model.ipynb | Optimize a pre-trained model for a better inference performance, and also analyze the model pb files |
52+
| Notebook | Description
53+
|:--- |:---
54+
|`tutorial_optimize_TensorFlow_pretrained_model.ipynb` | Optimize a pre-trained model for a better inference performance, and also analyze the model pb files
3155

3256
### Python Scripts
33-
| Scripts | Notes|
34-
| ------ | ------ |
35-
| tf_pb_utils.py | This script parses a pre-trained TensorFlow model PB file. |
36-
| freeze_optimize_v2.py | This script optimizes a pre-trained TensorFlow model PB file. |
37-
| profile_utils.py | This script helps on output processing of the Jupyter Notebook. |
38-
3957

40-
## License
41-
Code samples are licensed under the MIT license. See
42-
[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details.
58+
| Script | Description
59+
|:--- |:---
60+
|`tf_pb_utils.py` | Parses a pre-trained TensorFlow model PB file. <br> (See [TensorFlow* PB File Parser README](scripts/README.md) for more information.)
61+
|`freeze_optimize_v2.py` | Optimizes a pre-trained TensorFlow model PB file.
62+
|`profile_utils.py` | Helps process the output of the Jupyter Notebook.
4363

44-
Third party program Licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt)
64+
## Set Environment Variables
4565

46-
## Build and Run the Sample
66+
When working with the command-line interface (CLI), you should configure the oneAPI toolkits using environment variables. Set up your CLI environment by sourcing the `setvars` script every time you open a new terminal window. This practice ensures that your compiler, libraries, and tools are ready for development.
4767

48-
### Pre-requirement
68+
## Run the `Optimize TensorFlow Pre-trained Model for Inference` Sample
4969

50-
> NOTE: No action is required if users use Intel DevCloud as their environment.
51-
Please refer to [Intel® DevCloud for oneAPI](https://intelsoftwaresites.secure.force.com/devcloud/oneapi) for Intel DevCloud.
70+
### On Linux*
5271

53-
1. **Intel® AI Analytics Toolkit**
54-
You can refer to the oneAPI [main page](https://software.intel.com/en-us/oneapi) for toolkit installation,
55-
and the Toolkit [Getting Started Guide for Linux](https://software.intel.com/en-us/get-started-with-intel-oneapi-linux-get-started-with-the-intel-ai-analytics-toolkit) for post-installation steps and scripts.
72+
> **Note**: If you have not already done so, set up your CLI
73+
> environment by sourcing the `setvars` script in the root of your oneAPI installation.
74+
>
75+
> Linux*:
76+
> - For system wide installations: `. /opt/intel/oneapi/setvars.sh`
77+
> - For private installations: ` . ~/intel/oneapi/setvars.sh`
78+
> - For non-POSIX shells, like csh, use the following command: `bash -c 'source <install-dir>/setvars.sh ; exec csh'`
79+
>
80+
> For more information on configuring environment variables, see *[Use the setvars Script with Linux* or macOS*](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html)*.
5681
57-
2. **Jupyter Notebook**
58-
Users can install via PIP by `$pip install notebook`.
59-
Users can also refer to the [installation link](https://jupyter.org/install) for details.
6082

83+
#### Open Jupyter Notebook
6184

85+
1. Launch Jupyter Notebook.
86+
```
87+
jupyter notebook --ip=0.0.0.0
88+
```
89+
2. Follow the instructions to open the URL with the token in your browser.
90+
3. Locate and select the Notebook.
91+
```
92+
tutorial_optimize_TensorFlow_pretrained_model.ipynb
93+
```
94+
4. Change your Jupyter Notebook kernel to **tensorflow** or **intel-tensorflow**.
95+
5. Run every cell in the Notebook in sequence.
6296

63-
### Running the Sample
97+
#### Troubleshooting
6498

65-
1. Launch Jupyter notebook: `$jupyter notebook --ip=0.0.0.0`
99+
If you receive an error message, troubleshoot the problem using the **Diagnostics Utility for Intel® oneAPI Toolkits**. The diagnostic utility provides configuration and system checks to help find missing dependencies, permissions errors, and other issues. See the [Diagnostics Utility for Intel® oneAPI Toolkits User Guide](https://www.intel.com/content/www/us/en/develop/documentation/diagnostic-utility-user-guide/top.html) for more information on using the utility.
66100

101+
### Run the Sample on Intel® DevCloud (Optional)
67102

68-
2. Follow the instructions to open the URL with the token in your browser
69-
3. Click the `tutorial_optimize_TensorFlow_pretrained_model.ipynb` file
70-
4. Change your Jupyter notebook kernel to "tensorflow" or "intel-tensorflow"
71-
5. Run through every cell of the notebook one by one
103+
1. If you do not already have an account, request an Intel® DevCloud account at [*Create an Intel® DevCloud Account*](https://intelsoftwaresites.secure.force.com/DevCloud/oneapi).
104+
2. On a Linux* system, open a terminal.
105+
3. SSH into Intel® DevCloud.
106+
```
107+
ssh DevCloud
108+
```
109+
> **Note**: You can find information about configuring your Linux system and connecting to Intel DevCloud at Intel® DevCloud for oneAPI [Get Started](https://devcloud.intel.com/oneapi/get_started).
110+
111+
4. Follow the instructions to open the URL with the token in your browser.
112+
5. Locate and select the Notebook.
113+
```
114+
tutorial_optimize_TensorFlow_pretrained_model.ipynb
115+
```
116+
6. Change the kernel to **tensorflow** or **intel-tensorflow**.
117+
7. Run every cell in the Notebook in sequence.
72118

119+
## Example Output
73120

121+
You should see diagrams for performance comparison and analysis. The following is one example of a performance comparison diagram:
74122

75-
### Example of Output
76-
Users should be able to see some diagrams for performance comparison and analysis.
77-
One example of performance comparison diagrams:
78-
<br><img src="images/perf_comparison.png" width="500" height="400"><br>
123+
![speed up example](images/perf_comparison.png)
79124

80125
For performance analysis, you can also see pie charts for different TensorFlow* operations in the analyzed pre-trained model PB file.
81-
One example of model pb file analysis diagrams:
82-
<br><img src="images/saved_model_pie.png" width="800" height="600"><br>
83-
84-
If an error occurs, troubleshoot the problem using the Diagnostics Utility for Intel® oneAPI Toolkits.
85-
[Learn more](https://software.intel.com/content/www/us/en/develop/documentation/diagnostic-utility-user-guide/top.html)
86126

87-
### Using Visual Studio Code* (Optional)
127+
The following is one example of a model PB file analysis diagram:
88128

89-
You can use Visual Studio Code (VS Code) extensions to set your environment, create launch configurations,
90-
and browse and download samples.
129+
![inception example](images/saved_model_pie.png)
91130

92-
The basic steps to build and run a sample using VS Code include:
93-
- Download a sample using the extension **Code Sample Browser for Intel oneAPI Toolkits**.
94-
- Configure the oneAPI environment with the extension **Environment Configurator for Intel oneAPI Toolkits**.
95-
- Open a Terminal in VS Code (**Terminal>New Terminal**).
96-
- Run the sample in the VS Code terminal using the instructions below.
97-
- (Linux only) Debug your GPU application with GDB for Intel® oneAPI toolkits using the Generate Launch Configurations extension.
131+
## License
98132

99-
To learn more about the extensions, see
100-
[Using Visual Studio Code with Intel® oneAPI Toolkits](https://software.intel.com/content/www/us/en/develop/documentation/using-vs-code-with-intel-oneapi/top.html).
133+
Code samples are licensed under the MIT license. See
134+
[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details.
101135

102-
After learning how to use the extensions for Intel oneAPI Toolkits, return to this readme for instructions on how to build and run a sample.
136+
Third party program Licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt).

AI-and-Analytics/Features-and-Functionality/IntelTensorFlow_InferenceOptimization/sample.json

+1-1
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
{
22
"guid": "9d32f194-8667-41d3-865d-d43e9983c471",
3-
"name": "Optimize TensorFlow pre-trained model for inference",
3+
"name": "Optimize TensorFlow Pre-trained Model for Inference",
44
"categories": ["Toolkit/oneAPI AI And Analytics/Features And Functionality"],
55
"description": "This tutorial will guide you how to optimize a pre-trained model for a better inference performance, and also analyze the model pb files before and after the inference optimizations.",
66
"builder": ["cli"],
Original file line numberDiff line numberDiff line change
@@ -1,74 +1,74 @@
1-
# TensorFlow PB file parser
1+
# TensorFlow* PB File Parser
22

3+
## Prerequisites
34

4-
## prerequisites
5+
You must have a PB file and a TensorFlow* environment first.
56

6-
* users need to have a PB file and TensorFlow environment first.
7+
## How to Parse a PB File
78

8-
## How to parse a pb file
9-
A TensorFlow graph might be composed from many subgraphs.
10-
Therefore, users might see many layers in a PB file due to those subgraphs.
9+
A TensorFlow graph might be composed of many subgraphs; therefore, users might see many layers in a PB file due to those subgraphs.
10+
11+
### (Optional) 1. Understand the Structure of a PB File
1112

12-
### ( Optional ) 1. Understand structures of a PB file
1313
This section is optional for users who already fully understand the structure of a PB file.
1414
If you are investigating a new PB file, you might need to go through this section to understand its structure.
1515

16-
The tf_pb_utils.py will parse op.type and op.name from a graph_def into a CSV file.
16+
The `tf_pb_utils.py` script parses `op.type` and `op.name` from a `graph_def` into a CSV file.
1717

18-
op.name might contains a layer structure.
18+
op.name might contain a layer structure.
1919

2020
Below is an example.
21+
2122
ex: `FeatureExtractor\InceptionV2\InceptionV2\Mixed_5c\Branch_2\Conv2d_0b_3x3\Conv2D`
2223
The first layer is FeatureExtractor, and the second layer is InceptionV2. The last layer is Conv2D.
2324

2425
Here is another example.
26+
2527
ex: `BoxPredictor_4\BoxEncodingPredictor\Conv2D`
2628
Even though the last layer is Conv2D, it has different first and second layers.
2729
Moreover, this Conv2D is not related to the InceptionV2 layer, so we do not want to count this Conv2D as an InceptionV2 op.
2830

2931
Therefore, we still need the layer information to focus on the ops that are important to us.
3032

31-
we parse op.type and op.name into a CSV file "out.csv", and below is a mapping table between CSV column and op.type & op.name. op.name[i] represnt the i layer of this op.name.
33+
We parse `op.type` and `op.name` into a CSV file, `out.csv`. The following table maps each CSV column to `op.type` and `op.name`; `op.name[i]` represents the i-th layer of `op.name`.
3234

3335
|op_type|op_name|op1|op2|op3|op4|op5|op6|
3436
|:-----|:----|:-----|:-----|:-----|:-----|:-----|:-----|
3537
|op.type| op.name[-1] |op.name[0] | op.name[1] | op.name[2] |op.name[3] |op.name[4] |op.name[5] |
3638
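The column mapping above can be sketched in a few lines of Python. This helper is illustrative only, not the actual `tf_pb_utils.py` implementation; it assumes `/`-separated scope paths, which is how TensorFlow names its ops:

```python
# Illustrative sketch of the op.name -> CSV column mapping; not the
# actual tf_pb_utils.py code. TensorFlow op names are '/'-separated
# scope paths such as "scope1/scope2/OpName".
def to_row(op_type, op_name, n_layers=6):
    parts = op_name.split("/")
    layers = parts[:-1][:n_layers]
    layers += [""] * (n_layers - len(layers))   # pad out op1..op6
    return [op_type, parts[-1]] + layers        # op_name column is op.name[-1]

row = to_row(
    "Conv2D",
    "FeatureExtractor/InceptionV2/InceptionV2/Mixed_5c/Branch_2/Conv2d_0b_3x3/Conv2D",
)
print(row)
```

For the InceptionV2 example above, this yields `op_type` and `op_name` of `Conv2D`, `op1` of `FeatureExtractor`, and `op2` of `InceptionV2`, matching the table.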

39+
The following two subsections show you how to focus on the `op.type` values of a layer of interest, such as InceptionV2.
3740

38-
Following two sub-sections will show you how to focus on op.type of a interested layer such as InceptionV2.
41+
### (Optional) 2. Find the Column Containing the Layer of Interest
3942

40-
### ( Optional ) 2. Find the column which contain the interested layer such as InceptionV2
4143
The command below groups rows by the values in the selected column of the `out.csv` file.
42-
Check which column contains the interested layer.
43-
Below is column 3 of ssd_inception_v2 case, it contains InceptionV2 as second row.
44-
45-
== Dump column : 3 ==
46-
op2
47-
BatchMultiClassNonMaxSuppression 5307
48-
InceptionV2 1036
49-
0 263
50-
map 63
51-
Decode 63
52-
ClassPredictor 36
53-
BoxEncodingPredictor 36
54-
Meshgrid_14 34
55-
Meshgrid_1 34
56-
Meshgrid_10 34
57-
dtype: int64
58-
59-
60-
61-
Both indexs of column and row start from 0.
62-
Therefore, we could access second row by index 1.
63-
By using column index 3 and row index 1, we could access InceptionV2 related op.name.
64-
65-
### 3. Parse a interested subgraph of a PB file
66-
67-
With understanding of interested subgraph of a pb file,
68-
users can parse that subgraph by assigning related column and row index
69-
Ex : column 3 and row 1 from Step 2 above.
70-
71-
`python tf_pb_utils.py model.pb -c 3 -r 1`
44+
Check which column contains the layer of interest. Below is column 3 of the ssd_inception_v2 case; it contains InceptionV2 in the second row.
45+
46+
```
47+
== Dump column : 3 ==
48+
op2
49+
BatchMultiClassNonMaxSuppression 5307
50+
InceptionV2 1036
51+
0 263
52+
map 63
53+
Decode 63
54+
ClassPredictor 36
55+
BoxEncodingPredictor 36
56+
Meshgrid_14 34
57+
Meshgrid_1 34
58+
Meshgrid_10 34
59+
dtype: int64
60+
```
61+
Both column and row indexes start from 0; therefore, we can access the second row by index 1.
62+
By using column index 3 and row index 1, we can access the InceptionV2-related `op.name` values.
63+
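The grouping in this step can be mimicked with stdlib Python. The rows below are illustrative stand-ins for `out.csv` content; the real script reads the full CSV produced by `tf_pb_utils.py`:

```python
from collections import Counter

# Illustrative rows in the out.csv column layout
# [op_type, op_name, op1, op2, ...]; not real parser output.
rows = [
    ["Conv2D", "Conv2D", "FeatureExtractor", "InceptionV2"],
    ["Relu",   "Relu",   "FeatureExtractor", "InceptionV2"],
    ["Conv2D", "Conv2D", "BoxPredictor_4",   "BoxEncodingPredictor"],
]
# Group rows by the values in column 3 (0-based), like "Dump column : 3".
counts = Counter(r[3] for r in rows)
print(counts.most_common())
```

Each grouped value is then addressed by its 0-based row index in this sorted dump, which is what the `-r` option selects.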
64+
### 3. Parse a Subgraph of Interest in a PB File
65+
66+
Once you understand the subgraph of interest in a PB file, you can parse that subgraph by specifying the related column and row index, for example, column 3 and row 1 from Step 2 above.
67+
68+
```
69+
python tf_pb_utils.py model.pb -c 3 -r 1
70+
```
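In spirit, the `-c 3 -r 1` selection filters rows whose column-3 value is the row-1 entry of the grouped dump from Step 2, then counts op types within that subgraph. A stdlib sketch over illustrative rows (not the real parser's code or data):

```python
from collections import Counter

# Illustrative rows in the out.csv layout [op_type, op_name, op1, op2, ...];
# not output of the real parser.
rows = (
    [["Greater", "Greater", "Postprocessor", "BatchMultiClassNonMaxSuppression"]] * 4
    + [["Conv2D", "Conv2D", "FeatureExtractor", "InceptionV2"]] * 2
    + [["Relu", "Relu", "FeatureExtractor", "InceptionV2"]]
    + [["Conv2D", "Conv2D", "BoxPredictor_4", "BoxEncodingPredictor"]]
)
column, row_index = 3, 1  # mirrors `-c 3 -r 1`
# Pick the row_index-th most common value in the chosen column (Step 2),
# then break down op types within that subgraph.
target = Counter(r[column] for r in rows).most_common()[row_index][0]
breakdown = Counter(r[0] for r in rows if r[column] == target)
print(target, breakdown.most_common())
```

Here row 1 of the grouped column is `InceptionV2`, so the breakdown counts only InceptionV2-related op types, which is what the pie chart below visualizes.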
7271

7372
One example of a TensorFlow operations breakdown diagram from a PB file:
74-
<br><img src="breakdown.png" width="800" height="800"><br>
73+
74+
![operations breakdown](breakdown.png)
