Commit 61bd0f7

jkinsky, jimmytwei, krzeszew, alexsin368, ZhaoqiongZ
authored
Ai and analytics features and functionality intel tensor flow inference optimization (#1462)
* Fixes for 2023.1 AI Kit (#1409)
* Intel Python Numpy Numba_dpes kNN sample (#1292)
* *.py and *.ipynb files with implementation
* README.md and sample.json files with documentation
* License and thir party programs
* Adding PyTorch Training Optimizations with AMX BF16 oneAPI sample (#1293)
* add IntelPytorch Quantization code samples (#1301)
* add IntelPytorch Quantization code samples
* fix the spelling error in the README file
* use john's README with grammar fix and title change
* Rename third-party-grograms.txt to third-party-programs.txt

  Co-authored-by: Jimmy Wei <[email protected]>
* AMX bfloat16 mixed precision learning TensorFlow Transformer sample (#1317)
* [New Sample] Intel Extension for TensorFlow Getting Started (#1313)
* first draft
* Update README.md
* remove redunant file
* [New Sample] [oneDNN] Benchdnn tutorial (#1315)
* New Sample: benchDNN tutorial
* Update readme: new sample
* Rename sample to benchdnn_tutorial
* Name fix
* Add files via upload (#1320)
* [New Sample] oneCCL Bindings for PyTorch Getting Started (#1316)
* Update README.md
* [New Sample] oneCCL Bindings for PyTorch Getting Started
* Update README.md
* add torch-ccl version check
* [New Sample] Intel Extension for PyTorch Getting Started (#1314)
* add new ipex GSG notebook for dGPU
* Update sample.json for expertise field
* Update requirements.txt

  Update package versions to comply with Snyk tool
* Updated title field in sample.json in TF Transformer AMX bfloat16 Mixed Precision sample to fit within character length range (#1327)
* add arch checker class (#1332)
* change gpu.patch to convert the code samples from cpu to gpu correctly (#1334)
* Fixes for spelling in AMX bfloat16 transformer sample and printing error in python code in numpy vs numba sample (#1335)
* 2023.1 ai kit itex get started example fix (#1338)
* Fix the typo
* Update ResNet50_Inference.ipynb
* fix resnet inference demo link (#1339)
* Fix printing issue in numpy vs numba AI sample (#1356)
* Fix Invalid Kmeans parameters on oneAPI 2023 (#1345)
* Update README to add new samples into the list (#1366)
* PyTorch AMX BF16 Training sample: remove graphs and performance numbers (#1408)
* Adding PyTorch Training Optimizations with AMX BF16 oneAPI sample
* remove performance graphs, update README
* remove graphs from README and folder
* update top README in Features and Functionality

---------

Co-authored-by: krzeszew <[email protected]>
Co-authored-by: alexsin368 <[email protected]>
Co-authored-by: ZhaoqiongZ <[email protected]>
Co-authored-by: Louie Tsai <[email protected]>
Co-authored-by: Orel Yehuda <[email protected]>
Co-authored-by: yuning <[email protected]>
Co-authored-by: Wang, Kai Lawrence <[email protected]>
Co-authored-by: xiguiw <[email protected]>

* Optimize TensorFlow Pre-trained Model for Inference sample readme update

  Restructured section to match new template. Updated name in readme to match name in sample.json. Updated sample.json file for title information. Updated lots of formatting issues. Added information related to DevCloud. Added information on setting vars. Corrected spelling and grammar errors in top-level readme and readme in the "scripts" subfolder. Added link from main readme to subfolder readme.

---------

Co-authored-by: Jimmy Wei <[email protected]>
Co-authored-by: krzeszew <[email protected]>
Co-authored-by: alexsin368 <[email protected]>
Co-authored-by: ZhaoqiongZ <[email protected]>
Co-authored-by: Louie Tsai <[email protected]>
Co-authored-by: Orel Yehuda <[email protected]>
Co-authored-by: yuning <[email protected]>
Co-authored-by: Wang, Kai Lawrence <[email protected]>
Co-authored-by: xiguiw <[email protected]>
1 parent 37c1ef4 commit 61bd0f7

File tree

3 files changed

+151
-117
lines changed

Original file line numberDiff line numberDiff line change
@@ -1,102 +1,136 @@
1-
# Tutorial : Optimize TensorFlow pre-trained model for inference
2-
This tutorial will guide you how to optimize a pre-trained model for a better inference performance, and also
3-
analyze the model pb files before and after the inference optimizations.
4-
Both the Intel® Low Precision Optimization Tool (Intel® LPOT) and TensorFlow optimization tools are used in this tutorial, and Intel LPOT is the preferred tool for inference optimization on Intel Architectures.
5-
6-
| Optimized for | Description
7-
|:--- |:---
8-
| OS | Ubuntu* 18.04
9-
| Hardware | Intel® Xeon® Scalable processor family or newer
10-
| Software | [Intel® AI Analytics Toolkit (AI Kit)](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit.html)
11-
| What you will learn | Optimize a pre-trained model for a better inference performance
12-
| Time to complete | 30 minutes
1+
# `Optimize TensorFlow Pre-trained Model for Inference` Sample
2+
3+
The `Optimize TensorFlow Pre-trained Model for Inference` sample demonstrates how to optimize a pre-trained model for better inference performance and how to analyze the model PB files before and after the inference optimizations.
4+
5+
| Area | Description
6+
|:--- |:---
7+
| What you will learn | Optimize a pre-trained model for a better inference performance
8+
| Time to complete | 30 minutes
9+
| Category | Code Optimization
10+
11+
## Prerequisites
12+
13+
| Optimized for | Description
14+
|:--- |:---
15+
| OS | Ubuntu* 18.04
16+
| Hardware | Intel® Xeon® Scalable processor family or newer
17+
| Software | [Intel® AI Analytics Toolkit (AI Kit)](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit.html)
18+
19+
### For Local Development Environments
20+
21+
- **Intel® AI Analytics Toolkit (AI Kit)**
22+
23+
You can get the AI Kit from [Intel® oneAPI Toolkits](https://www.intel.com/content/www/us/en/developer/tools/oneapi/toolkits.html#analytics-kit). <br> See [*Get Started with the Intel® AI Analytics Toolkit for Linux*](https://www.intel.com/content/www/us/en/develop/documentation/get-started-with-ai-linux) for AI Kit installation information and post-installation steps and scripts.
24+
25+
- **Jupyter Notebook**
26+
27+
Install using PIP: `pip install notebook`. <br> Alternatively, see [*Installing Jupyter*](https://jupyter.org/install) for detailed installation instructions.
28+
29+
The sample uses both the Intel® Low Precision Optimization Tool (Intel® LPOT) and TensorFlow* optimization tools; Intel® LPOT is the preferred tool for inference optimization on Intel® architectures.
30+
31+
### For Intel® DevCloud
32+
33+
The necessary tools and components are already installed in the environment. You do not need to install additional components. See [Intel® DevCloud for oneAPI](https://devcloud.intel.com/oneapi/get_started/) for information.
1334

1435
## Purpose
15-
Show users the importance of inference optimization on performance, and also analyze TensorFlow ops difference in pre-trained models before/after the optimizations.
16-
Those optimizations include:
17-
* Converting variables to constants.
18-
* Removing training-only operations like checkpoint saving.
19-
* Stripping out parts of the graph that are never reached.
20-
* Removing debug operations like CheckNumerics.
21-
* Folding batch normalization ops into the pre-calculated weights.
22-
* Fusing common operations into unified versions.
23-
24-
## Key implementation details
25-
This tutorial contains one Jupyter notebook and three python scripts listed below.
36+
37+
This sample shows the performance benefit of inference optimization and analyzes the differences in TensorFlow ops in pre-trained models before and after the optimizations. Those optimizations include:
38+
39+
- Converting variables to constants.
40+
- Removing training-only operations like checkpoint saving.
41+
- Stripping out parts of the graph that are never reached.
42+
- Removing debug operations like CheckNumerics.
43+
- Folding batch normalization ops into the pre-calculated weights.
44+
- Fusing common operations into unified versions.
45+
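Most of these transforms are graph rewrites. As a conceptual sketch of one of them, stripping out parts of the graph that are never reached, consider pruning a toy graph represented as a dict (the node names and dict structure here are illustrative stand-ins for a TensorFlow `GraphDef`, not the sample's actual implementation):

```python
# Conceptual sketch: strip graph parts that are never reached from the
# inference output. The dict maps each node to its list of input nodes;
# node names are illustrative, not from the sample.
def prune_unreachable(graph, output):
    """Keep only nodes reachable from `output` by walking input edges."""
    keep, stack = set(), [output]
    while stack:
        node = stack.pop()
        if node in keep:
            continue
        keep.add(node)
        stack.extend(graph[node])
    return {name: inputs for name, inputs in graph.items() if name in keep}

graph = {
    "input": [],
    "conv": ["input"],
    "save/checkpoint": ["conv"],  # training-only op, not on the inference path
    "softmax": ["conv"],
}
# The training-only checkpoint node is stripped; the inference path survives.
print(prune_unreachable(graph, "softmax"))
```

The real tools apply the same reachability idea (plus constant folding and op fusion) directly to the serialized graph.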
46+
## Key Implementation Details
47+
48+
This tutorial contains one Jupyter Notebook and three Python scripts listed below.
49+
2650
### Jupyter Notebooks
2751

28-
| Notebook | Notes|
29-
| ------ | ------ |
30-
| tutorial_optimize_TensorFlow_pretrained_model.ipynb | Optimize a pre-trained model for a better inference performance, and also analyze the model pb files |
52+
| Notebook | Description
53+
|:--- |:---
54+
|`tutorial_optimize_TensorFlow_pretrained_model.ipynb` | Optimize a pre-trained model for a better inference performance, and also analyze the model pb files
3155

3256
### Python Scripts
33-
| Scripts | Notes|
34-
| ------ | ------ |
35-
| tf_pb_utils.py | This script parses a pre-trained TensorFlow model PB file. |
36-
| freeze_optimize_v2.py | This script optimizes a pre-trained TensorFlow model PB file. |
37-
| profile_utils.py | This script helps on output processing of the Jupyter Notebook. |
38-
3957

40-
## License
41-
Code samples are licensed under the MIT license. See
42-
[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details.
58+
| Script | Description
59+
|:--- |:---
60+
|`tf_pb_utils.py` | Parses a pre-trained TensorFlow model PB file. <br> (See [TensorFlow* PB File Parser README](scripts/README.md) for more information.)
61+
|`freeze_optimize_v2.py` | Optimizes a pre-trained TensorFlow model PB file.
62+
|`profile_utils.py` | Helps process the output of the Jupyter Notebook.
4363

44-
Third party program Licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt)
64+
## Set Environment Variables
4565

46-
## Build and Run the Sample
66+
When working with the command-line interface (CLI), you should configure the oneAPI toolkits using environment variables. Set up your CLI environment by sourcing the `setvars` script every time you open a new terminal window. This practice ensures that your compiler, libraries, and tools are ready for development.
4767

48-
### Pre-requirement
68+
## Run the `Optimize TensorFlow Pre-trained Model for Inference` Sample
4969

50-
> NOTE: No action is required if users use Intel DevCloud as their environment.
51-
Please refer to [Intel® DevCloud for oneAPI](https://intelsoftwaresites.secure.force.com/devcloud/oneapi) for Intel DevCloud.
70+
### On Linux*
5271

53-
1. **Intel® AI Analytics Toolkit**
54-
You can refer to the oneAPI [main page](https://software.intel.com/en-us/oneapi) for toolkit installation,
55-
and the Toolkit [Getting Started Guide for Linux](https://software.intel.com/en-us/get-started-with-intel-oneapi-linux-get-started-with-the-intel-ai-analytics-toolkit) for post-installation steps and scripts.
72+
> **Note**: If you have not already done so, set up your CLI
73+
> environment by sourcing the `setvars` script in the root of your oneAPI installation.
74+
>
75+
> Linux*:
76+
> - For system wide installations: `. /opt/intel/oneapi/setvars.sh`
77+
> - For private installations: ` . ~/intel/oneapi/setvars.sh`
78+
> - For non-POSIX shells, like csh, use the following command: `bash -c 'source <install-dir>/setvars.sh ; exec csh'`
79+
>
80+
> For more information on configuring environment variables, see *[Use the setvars Script with Linux* or macOS*](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html)*.
5681
57-
2. **Jupyter Notebook**
58-
Users can install via PIP by `$pip install notebook`.
59-
Users can also refer to the [installation link](https://jupyter.org/install) for details.
6082

83+
#### Open Jupyter Notebook
6184

85+
1. Launch Jupyter Notebook.
86+
```
87+
jupyter notebook --ip=0.0.0.0
88+
```
89+
2. Follow the instructions to open the URL with the token in your browser.
90+
3. Locate and select the Notebook.
91+
```
92+
tutorial_optimize_TensorFlow_pretrained_model.ipynb
93+
```
94+
4. Change your Jupyter Notebook kernel to **tensorflow** or **intel-tensorflow**.
95+
5. Run every cell in the Notebook in sequence.
6296

63-
### Running the Sample
97+
#### Troubleshooting
6498

65-
1. Launch Jupyter notebook: `$jupyter notebook --ip=0.0.0.0`
99+
If you receive an error message, troubleshoot the problem using the **Diagnostics Utility for Intel® oneAPI Toolkits**. The diagnostic utility provides configuration and system checks to help find missing dependencies, permissions errors, and other issues. See the [Diagnostics Utility for Intel® oneAPI Toolkits User Guide](https://www.intel.com/content/www/us/en/develop/documentation/diagnostic-utility-user-guide/top.html) for more information on using the utility.
66100

101+
### Run the Sample on Intel® DevCloud (Optional)
67102

68-
2. Follow the instructions to open the URL with the token in your browser
69-
3. Click the `tutorial_optimize_TensorFlow_pretrained_model.ipynb` file
70-
4. Change your Jupyter notebook kernel to "tensorflow" or "intel-tensorflow"
71-
5. Run through every cell of the notebook one by one
103+
1. If you do not already have an account, request an Intel® DevCloud account at [*Create an Intel® DevCloud Account*](https://intelsoftwaresites.secure.force.com/DevCloud/oneapi).
104+
2. On a Linux* system, open a terminal.
105+
3. SSH into Intel® DevCloud.
106+
```
107+
ssh DevCloud
108+
```
109+
> **Note**: You can find information about configuring your Linux system and connecting to Intel DevCloud at Intel® DevCloud for oneAPI [Get Started](https://devcloud.intel.com/oneapi/get_started).
110+
111+
4. Follow the instructions to open the URL with the token in your browser.
112+
5. Locate and select the Notebook.
113+
```
114+
tutorial_optimize_TensorFlow_pretrained_model.ipynb
115+
```
116+
6. Change the kernel to **tensorflow** or **intel-tensorflow**.
117+
7. Run every cell in the Notebook in sequence.
72118

119+
## Example Output
73120

121+
You should see diagrams for performance comparison and analysis. The following is one example of a performance comparison diagram:
74122

75-
### Example of Output
76-
Users should be able to see some diagrams for performance comparison and analysis.
77-
One example of performance comparison diagrams:
78-
<br><img src="images/perf_comparison.png" width="500" height="400"><br>
123+
![speed up example](images/perf_comparison.png)
79124

80125
For performance analysis, you can also see pie charts for different TensorFlow* operations in the analyzed pre-trained model PB file.
81-
One example of model pb file analysis diagrams:
82-
<br><img src="images/saved_model_pie.png" width="800" height="600"><br>
83-
84-
If an error occurs, troubleshoot the problem using the Diagnostics Utility for Intel® oneAPI Toolkits.
85-
[Learn more](https://software.intel.com/content/www/us/en/develop/documentation/diagnostic-utility-user-guide/top.html)
86126

87-
### Using Visual Studio Code* (Optional)
127+
The following is one example of a model PB file analysis diagram:
88128

89-
You can use Visual Studio Code (VS Code) extensions to set your environment, create launch configurations,
90-
and browse and download samples.
129+
![inception example](images/saved_model_pie.png)
91130

92-
The basic steps to build and run a sample using VS Code include:
93-
- Download a sample using the extension **Code Sample Browser for Intel oneAPI Toolkits**.
94-
- Configure the oneAPI environment with the extension **Environment Configurator for Intel oneAPI Toolkits**.
95-
- Open a Terminal in VS Code (**Terminal>New Terminal**).
96-
- Run the sample in the VS Code terminal using the instructions below.
97-
- (Linux only) Debug your GPU application with GDB for Intel® oneAPI toolkits using the Generate Launch Configurations extension.
131+
## License
98132

99-
To learn more about the extensions, see
100-
[Using Visual Studio Code with Intel® oneAPI Toolkits](https://software.intel.com/content/www/us/en/develop/documentation/using-vs-code-with-intel-oneapi/top.html).
133+
Code samples are licensed under the MIT license. See
134+
[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details.
101135

102-
After learning how to use the extensions for Intel oneAPI Toolkits, return to this readme for instructions on how to build and run a sample.
136+
Third party program Licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt).

AI-and-Analytics/Features-and-Functionality/IntelTensorFlow_InferenceOptimization/sample.json

+1-1
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
{
22
"guid": "9d32f194-8667-41d3-865d-d43e9983c471",
3-
"name": "Optimize TensorFlow pre-trained model for inference",
3+
"name": "Optimize TensorFlow Pre-trained Model for Inference",
44
"categories": ["Toolkit/oneAPI AI And Analytics/Features And Functionality"],
55
"description": "This tutorial will guide you how to optimize a pre-trained model for a better inference performance, and also analyze the model pb files before and after the inference optimizations.",
66
"builder": ["cli"],
Original file line numberDiff line numberDiff line change
@@ -1,74 +1,74 @@
1-
# TensorFlow PB file parser
1+
# TensorFlow* PB File Parser
22

3+
## Prerequisites
34

4-
## prerequisites
5+
You must have a PB file and a TensorFlow* environment first.
56

6-
* users need to have a PB file and TensorFlow environment first.
7+
## How to Parse a PB File
78

8-
## How to parse a pb file
9-
A TensorFlow graph might be composed from many subgraphs.
10-
Therefore, users might see many layers in a PB file due to those subgraphs.
9+
A TensorFlow graph might be composed of many subgraphs; therefore, users might see many layers in a PB file due to those subgraphs.
10+
11+
### (Optional) 1. Understand the Structure of a PB File
1112

12-
### ( Optional ) 1. Understand structures of a PB file
1313
This section is optional for users who already fully understand the structure of a PB file.
1414
If you are investigating a new PB file, you might need to go through this section to understand its structure.
1515

16-
The tf_pb_utils.py will parse op.type and op.name from a graph_def into a CSV file.
16+
The `tf_pb_utils.py` script parses `op.type` and `op.name` from a `graph_def` into a CSV file.
1717

18-
op.name might contains a layer structure.
18+
op.name might contain a layer structure.
1919

2020
Below is an example.
21+
2122
ex: `FeatureExtractor\InceptionV2\InceptionV2\Mixed_5c\Branch_2\Conv2d_0b_3x3\Conv2D`
2223
The first layer is FeatureExtractor, and the second layer is InceptionV2. The last layer is Conv2D.
2324

2425
Here is another example.
26+
2527
ex: `BoxPredictor_4\BoxEncodingPredictor\Conv2D`
2628
Even though the last layer is Conv2D, it has different first and second layers.
2729
Moreover, this Conv2D is not related to the InceptionV2 layer, so we do not want to count this Conv2D as an InceptionV2 op.
2830

2931
Therefore, we still need the layer information to focus on the ops that are important to us.
3032

31-
we parse op.type and op.name into a CSV file "out.csv", and below is a mapping table between CSV column and op.type & op.name. op.name[i] represnt the i layer of this op.name.
33+
We parse `op.type` and `op.name` into a CSV file, `out.csv`. The following table maps each CSV column to `op.type` and `op.name`; `op.name[i]` represents the i-th layer of `op.name`.
3234

3335
|op_type|op_name|op1|op2|op3|op4|op5|op6|
3436
|:-----|:----|:-----|:-----|:-----|:-----|:-----|:-----|
3537
|op.type| op.name[-1] |op.name[0] | op.name[1] | op.name[2] |op.name[3] |op.name[4] |op.name[5] |
3638
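The column mapping above can be sketched in a few lines of Python. This helper is illustrative only, not the actual `tf_pb_utils.py` implementation; it assumes `/`-separated scope paths, which is how TensorFlow names its ops:

```python
# Illustrative sketch of the op.name -> CSV column mapping; not the
# actual tf_pb_utils.py code. TensorFlow op names are '/'-separated
# scope paths such as "scope1/scope2/OpName".
def to_row(op_type, op_name, n_layers=6):
    parts = op_name.split("/")
    layers = parts[:-1][:n_layers]
    layers += [""] * (n_layers - len(layers))   # pad out op1..op6
    return [op_type, parts[-1]] + layers        # op_name column is op.name[-1]

row = to_row(
    "Conv2D",
    "FeatureExtractor/InceptionV2/InceptionV2/Mixed_5c/Branch_2/Conv2d_0b_3x3/Conv2D",
)
print(row)
```

For the InceptionV2 example above, this yields `op_type` and `op_name` of `Conv2D`, `op1` of `FeatureExtractor`, and `op2` of `InceptionV2`, matching the table.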

39+
The following two subsections show you how to focus on the `op.type` values of a layer of interest, such as InceptionV2.
3740

38-
Following two sub-sections will show you how to focus on op.type of a interested layer such as InceptionV2.
41+
### (Optional) 2. Find the Column Containing the Layer of Interest
3942

40-
### ( Optional ) 2. Find the column which contain the interested layer such as InceptionV2
4143
The command below groups rows by the values in the selected column of the `out.csv` file.
42-
Check which column contains the interested layer.
43-
Below is column 3 of ssd_inception_v2 case, it contains InceptionV2 as second row.
44-
45-
== Dump column : 3 ==
46-
op2
47-
BatchMultiClassNonMaxSuppression 5307
48-
InceptionV2 1036
49-
0 263
50-
map 63
51-
Decode 63
52-
ClassPredictor 36
53-
BoxEncodingPredictor 36
54-
Meshgrid_14 34
55-
Meshgrid_1 34
56-
Meshgrid_10 34
57-
dtype: int64
58-
59-
60-
61-
Both indexs of column and row start from 0.
62-
Therefore, we could access second row by index 1.
63-
By using column index 3 and row index 1, we could access InceptionV2 related op.name.
64-
65-
### 3. Parse a interested subgraph of a PB file
66-
67-
With understanding of interested subgraph of a pb file,
68-
users can parse that subgraph by assigning related column and row index
69-
Ex : column 3 and row 1 from Step 2 above.
70-
71-
`python tf_pb_utils.py model.pb -c 3 -r 1`
44+
Check which column contains the layer of interest. Below is column 3 of the ssd_inception_v2 case; it contains InceptionV2 in the second row.
45+
46+
```
47+
== Dump column : 3 ==
48+
op2
49+
BatchMultiClassNonMaxSuppression 5307
50+
InceptionV2 1036
51+
0 263
52+
map 63
53+
Decode 63
54+
ClassPredictor 36
55+
BoxEncodingPredictor 36
56+
Meshgrid_14 34
57+
Meshgrid_1 34
58+
Meshgrid_10 34
59+
dtype: int64
60+
```
61+
Both column and row indexes start from 0; therefore, we can access the second row by index 1.
62+
By using column index 3 and row index 1, we can access the InceptionV2-related `op.name` values.
63+
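The grouping in this step can be mimicked with stdlib Python. The rows below are illustrative stand-ins for `out.csv` content; the real script reads the full CSV produced by `tf_pb_utils.py`:

```python
from collections import Counter

# Illustrative rows in the out.csv column layout
# [op_type, op_name, op1, op2, ...]; not real parser output.
rows = [
    ["Conv2D", "Conv2D", "FeatureExtractor", "InceptionV2"],
    ["Relu",   "Relu",   "FeatureExtractor", "InceptionV2"],
    ["Conv2D", "Conv2D", "BoxPredictor_4",   "BoxEncodingPredictor"],
]
# Group rows by the values in column 3 (0-based), like "Dump column : 3".
counts = Counter(r[3] for r in rows)
print(counts.most_common())
```

Each grouped value is then addressed by its 0-based row index in this sorted dump, which is what the `-r` option selects.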
64+
### 3. Parse a Subgraph of Interest in a PB File
65+
66+
Once you understand the subgraph of interest in a PB file, you can parse that subgraph by specifying the related column and row index, for example, column 3 and row 1 from Step 2 above.
67+
68+
```
69+
python tf_pb_utils.py model.pb -c 3 -r 1
70+
```
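In spirit, the `-c 3 -r 1` selection filters rows whose column-3 value is the row-1 entry of the grouped dump from Step 2, then counts op types within that subgraph. A stdlib sketch over illustrative rows (not the real parser's code or data):

```python
from collections import Counter

# Illustrative rows in the out.csv layout [op_type, op_name, op1, op2, ...];
# not output of the real parser.
rows = (
    [["Greater", "Greater", "Postprocessor", "BatchMultiClassNonMaxSuppression"]] * 4
    + [["Conv2D", "Conv2D", "FeatureExtractor", "InceptionV2"]] * 2
    + [["Relu", "Relu", "FeatureExtractor", "InceptionV2"]]
    + [["Conv2D", "Conv2D", "BoxPredictor_4", "BoxEncodingPredictor"]]
)
column, row_index = 3, 1  # mirrors `-c 3 -r 1`
# Pick the row_index-th most common value in the chosen column (Step 2),
# then break down op types within that subgraph.
target = Counter(r[column] for r in rows).most_common()[row_index][0]
breakdown = Counter(r[0] for r in rows if r[column] == target)
print(target, breakdown.most_common())
```

Here row 1 of the grouped column is `InceptionV2`, so the breakdown counts only InceptionV2-related op types, which is what the pie chart below visualizes.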
7271

7372
One example of a TensorFlow operations breakdown diagram from a PB file:
74-
<br><img src="breakdown.png" width="800" height="800"><br>
73+
74+
![operations breakdown](breakdown.png)
