Commit 6bd7a2c

improvements for Tf perf analysis (#1504)
* add new notebook and the analysis script for TF profiling analysis
* Update README.md
* add back a image file
1 parent 3b05b0f commit 6bd7a2c

22 files changed, +1392 -20 lines changed
@@ -0,0 +1,117 @@
# Analyze TensorFlow Trace JSON Files
This analysis tool helps users analyze TensorFlow trace JSON files and generates an HTML report that contains statistical charts and a timeline chart.


## Prerequisites

* Users need to enable the TensorFlow Profiler in their workloads first. Please refer to [TF_PerfAnalysis.ipynb](../TF_PerfAnalysis.ipynb) for more details.

## How to analyze TensorFlow trace json.gz files

### Analyze a TensorFlow trace json.gz file
Users can also pass a file path instead of a file name.
* Parse a trace from a workload: `$ ./analyze A.trace.json.gz`


### Compare and Analyze two TensorFlow trace json.gz files
Users can also pass file paths instead of file names.
* Compare two json.gz files, "A.trace.json.gz" and "B.trace.json.gz": `$ ./analyze A.trace.json.gz B.trace.json.gz`

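As a rough illustration of what such a trace file contains, the sketch below assumes the trace follows the Chrome trace event layout: a top-level `traceEvents` list in which complete events have `ph` equal to `"X"` and a `dur` field in microseconds. It is not the tool's implementation. `op_durations` and `write_sample_trace` are hypothetical helpers, and the sample data is synthetic so the sketch runs without TensorFlow.

```python
import gzip
import json
import os
import tempfile
from collections import defaultdict


def op_durations(trace_path):
    """Sum the duration (microseconds) of complete events per event name."""
    with gzip.open(trace_path, "rt", encoding="utf-8") as f:
        trace = json.load(f)
    totals = defaultdict(float)
    for event in trace.get("traceEvents", []):
        # Assumption: "X" marks a complete event that carries its own "dur".
        if event.get("ph") == "X" and "dur" in event:
            totals[event["name"]] += event["dur"]
    return dict(totals)


def write_sample_trace(path):
    """Write a tiny synthetic trace (hypothetical data, not a real profile)."""
    sample = {
        "traceEvents": [
            {"ph": "X", "name": "FusedConv2D", "ts": 0, "dur": 30},
            {"ph": "X", "name": "FusedConv2D", "ts": 40, "dur": 50},
            {"ph": "X", "name": "Add", "ts": 100, "dur": 20},
            {"ph": "M", "name": "process_name"},  # metadata event, no duration
        ]
    }
    with gzip.open(path, "wt", encoding="utf-8") as f:
        json.dump(sample, f)


sample_path = os.path.join(tempfile.mkdtemp(), "A.trace.json.gz")
write_sample_trace(sample_path)
durations = op_durations(sample_path)
print(durations)
```

A per-op total like this is the kind of quantity the charts and tables in the report summarize; the real tool may group or filter events differently.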
## Understand Reports

<details>
<summary> oneDNN overall usage </summary>

Users can see what percentage of its time this workload spends on oneDNN computations.
In this example diagram, more than 94% of CPU time is spent on oneDNN computations, which is good.
<br><img src="report_template/mkl_percentage_tf_op_duration_pie.png" width="400" height="300"><br>
</details>

<details>
<summary> Bar Chart for TF ops Elapsed time comparison </summary>

Users can compare the elapsed time of TF ops between the Base run and the Compare run.
Here is an example diagram.
Yellow bars are from the Base run and blue bars are from the Compare run.
Overall, lower is better for the elapsed time.
<br><img src="report_template/compared_tf_op_duration_bar.png" width="400" height="300"><br>
</details>

<details>
<summary> Bar Chart for TF ops speedup comparison </summary>

Users can compare the speedup of TF ops between the Base run and the Compare run.
Here is an example diagram.
Yellow bars are Eigen ops and blue bars are oneDNN ops.
Each bar shows the speedup of the Compare run relative to the Base run, so higher is better.
<br><img src="report_template/compared_tf_op_duration_ratio_bar.png" width="400" height="300"><br>
</details>

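The speedup shown per op can be sketched as a simple ratio. This is a minimal illustration, assuming speedup is defined as Base elapsed time divided by Compare elapsed time (so values above 1 mean the Compare run is faster) and that only ops present in both runs are compared. The function name `speedups` and the timings are hypothetical.

```python
def speedups(base_us, compare_us):
    """Return base/compare elapsed-time ratios for ops present in both runs."""
    common = base_us.keys() & compare_us.keys()
    return {
        op: base_us[op] / compare_us[op]
        for op in sorted(common)
        if compare_us[op] > 0  # skip ops with no measured time
    }


# Hypothetical per-op elapsed times in microseconds.
base_us = {"FusedConv2D": 800.0, "MatMul": 100.0, "Add": 150.0}
compare_us = {"FusedConv2D": 400.0, "MatMul": 125.0, "PadWithFusedConv2D": 60.0}

ratios = speedups(base_us, compare_us)
for op, ratio in ratios.items():
    print(f"{op}: {ratio:.2f}x")
```

With these made-up numbers, FusedConv2D is twice as fast in the Compare run while MatMul is slightly slower. Ops that appear in only one run are left out of the ratio.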
<details>
<summary> Pie Chart for Base Run TF Ops hotspots </summary>

Users can see what percentage of its time this workload spends on different TF ops.
In this example diagram, more than 73% of CPU time is spent on FusedConv2D.
Users can start by optimizing the top hotspot to improve performance.
<br><img src="report_template/base_tf_op_duration_pie.png" width="420" height="300"><br>
</details>

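The hotspot percentages in a pie chart like this can be approximated as each op's share of the total elapsed time. The sketch below is only an illustration under that assumption; `hotspot_shares` is a hypothetical helper and the timings are invented to mirror the 73% example.

```python
def hotspot_shares(durations_us):
    """Return (op, percent of total time) pairs, largest share first."""
    total = sum(durations_us.values())
    if total == 0:
        return []
    shares = [(op, 100.0 * t / total) for op, t in durations_us.items()]
    return sorted(shares, key=lambda item: item[1], reverse=True)


# Hypothetical per-op elapsed times in microseconds.
durations_us = {"MatMul": 170.0, "FusedConv2D": 730.0, "Softmax": 100.0}

ranked = hotspot_shares(durations_us)
for op, percent in ranked:
    print(f"{op}: {percent:.1f}%")
```

The first entry of `ranked` is the top hotspot, which is the natural place to start optimizing.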
<details>
<summary> Pie Chart for Base Run Unique TF Ops hotspots </summary>

Users can see what percentage of its time this workload spends on unique TF ops that are used only in the Base run.
In this example diagram, the unique TF op "Add" takes ~15% of total CPU time.
<br><img src="report_template/unique_base_tf_op_duration_pie.png" width="480" height="300"><br>
</details>

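One way to picture "unique" ops is as a set difference between the op names of the two runs. The sketch below assumes that definition and assumes each share is measured against the total elapsed time of the run the op belongs to. `unique_op_shares` is a hypothetical helper and the timings are invented.

```python
def unique_op_shares(run_us, other_us):
    """Percent of this run's total time spent in ops absent from the other run."""
    total = sum(run_us.values())
    if total == 0:
        return {}
    only_here = run_us.keys() - other_us.keys()
    return {op: 100.0 * run_us[op] / total for op in sorted(only_here)}


# Hypothetical per-op elapsed times in microseconds.
base_us = {"FusedConv2D": 700.0, "Add": 150.0, "MatMul": 150.0}
compare_us = {"FusedConv2D": 600.0, "MatMul": 140.0, "PadWithFusedConv2D": 80.0}

base_only = unique_op_shares(base_us, compare_us)
compare_only = unique_op_shares(compare_us, base_us)
print(base_only)
print(compare_only)
```

An op that shows up in only one run often points to a graph rewrite or fusion, for example an op that is separate in one run and fused into another op in the other.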
<details>
<summary> Pie Chart for Compare Run TF Ops hotspots </summary>

Users can see what percentage of its time this workload spends on different TF ops.
In this example diagram, more than 86% of CPU time is spent on FusedConv2D.
Users can start by optimizing the top hotspot to improve performance.
<br><img src="report_template/compare_tf_op_duration_pie.png" width="450" height="300"><br>
</details>

<details>
<summary> Pie Chart for Compare Run Unique TF Ops hotspots </summary>

Users can see what percentage of its time this workload spends on unique TF ops that are used only in the Compare run.
In this example diagram, the unique TF op "PadWithFusedConv2D" takes ~10% of total CPU time.
<br><img src="report_template/unique_compare_tf_op_duration_pie.png" width="450" height="300"><br>
</details>

<details>
<summary> Table for Base Run TF ops Elapsed time </summary>

Users can see the exact elapsed time of each TF op from the Base run.
Here is an example table.
<br><img src="report_template/tf_ops_1_dataframe.png" width="800" height="500"><br>
</details>

<details>
<summary> Table for Compare Run TF ops Elapsed time </summary>

Users can see the exact elapsed time of each TF op from the Compare run.
Here is an example table.
<br><img src="report_template/tf_ops_2_dataframe.png" width="800" height="500"><br>
</details>

<details>
<summary> Table for Compare Run TF ops Elapsed time with shape info </summary>

Users can see the exact elapsed time of each TF op, together with its shape info, from the Compare run.
Here is an example table.
<br><img src="report_template/tf_ops_shape_2_dataframe.png" width="800" height="500"><br>
</details>


<details>
<summary> Table for TF ops Elapsed time comparison </summary>

Users can see the exact elapsed time of each TF op from both runs, along with the related speedup number.
If a TF op is accelerated with oneDNN, mkl_op is marked as True.
If a TF op is accelerated with native format, native_op is marked as True.
Here is an example table.
<br><img src="report_template/common_ops_dataframe.png" width="800" height="500"><br>
</details>
@@ -0,0 +1,10 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import re
import sys

from analyze_utils import main

if __name__ == '__main__':
    # Drop a trailing "-script.py", "-script.pyw", or ".exe" from the program name.
    sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
    sys.exit(main())
