recipes_source/profile_with_itt.rst
+33-3
@@ -58,6 +58,10 @@ Launch Intel® VTune™ Profiler
58
58
59
59
To verify the functionality, you need to start an Intel® VTune™ Profiler instance. Please check the `Intel® VTune™ Profiler User Guide <https://www.intel.com/content/www/us/en/develop/documentation/vtune-help/top/launch.html>`__ for steps to launch Intel® VTune™ Profiler.
60
60
61
+
.. note::
62
+
You can also use the web server UI by following the `Intel® VTune™ Profiler Web Server UI Guide <https://www.intel.com/content/www/us/en/docs/vtune-profiler/user-guide/2024-1/web-server-ui.html>`__.
63
+
For example: ``vtune-backend --web-port=8080 --allow-remote-access --enable-server-profiling``
64
+
61
65
Once you get the Intel® VTune™ Profiler GUI launched, you should see a user interface as below:
@@ -66,8 +70,8 @@ Once you get the Intel® VTune™ Profiler GUI launched, you should see a user i
66
70
67
71
Three sample results are available in the left-side navigation bar under the `sample (matrix)` project. If you do not want profiling results to appear in this default sample project, you can create a new project with the `New Project...` button under the blue `Configure Analysis...` button. To start a new profiling session, click the blue `Configure Analysis...` button to begin configuring it.
68
72
69
-
Configure Profiling
70
-
~~~~~~~~~~~~~~~~~~~
73
+
Configure Profiling for CPU
74
+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
71
75
72
76
Once you click the `Configure Analysis...` button, you should see the screen below:
73
77
@@ -77,6 +81,16 @@ Once you click the `Configure Analysis...` button, you should see the screen bel
77
81
78
82
The right side of the window is split into three parts: `WHERE` (top left), `WHAT` (bottom left), and `HOW` (right). With `WHERE`, you select the machine on which to run the profiling. With `WHAT`, you set the path of the application that you want to profile. To profile a PyTorch script, it is recommended to wrap all manual steps, including activating a Python environment and setting required environment variables, into a bash script, and then profile this bash script. In the screenshot above, we wrapped all steps into the `launch.sh` bash script and profile `bash` with `<path_of_launch.sh>` as its parameter. On the right side, under `HOW`, you choose the type of analysis to run. Intel® VTune™ Profiler provides many profiling types to choose from. Details can be found in the `Intel® VTune™ Profiler User Guide <https://www.intel.com/content/www/us/en/develop/documentation/vtune-help/top/analyze-performance.html>`__.
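The wrapper script described above might look like the following minimal sketch. The virtual environment path, the `OMP_NUM_THREADS` value, and the `sample.py` script name are hypothetical placeholders, not values taken from this recipe; substitute your own.

```shell
# Write a hypothetical launch.sh wrapper into the current directory.
cat > launch.sh <<'EOF'
#!/bin/bash
# Stop at the first failing step so a broken setup is not profiled.
set -e
# Activate the Python environment (hypothetical path; adjust to your setup).
source "$HOME/venvs/pytorch/bin/activate"
# Set any environment variables the workload needs (example value only).
export OMP_NUM_THREADS=4
# Run the PyTorch script to be profiled (hypothetical file name).
python "$HOME/workloads/sample.py"
EOF

# Make the wrapper executable so it can also be run directly.
chmod +x launch.sh
```

In the `WHAT` pane, the application would then be `bash`, with the path of this `launch.sh` as its parameter.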
79
83
84
+
85
+
Configure Profiling for XPU
86
+
~~~~~~~~~~~~~~~~~~~~~~~~~~~
87
+
Select the `GPU Offload` profiling type instead of `Hotspots`, then follow the same steps as for CPU to launch the application.
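This recipe configures the analysis through the GUI. As a rough command-line equivalent, a GPU Offload collection could be sketched as below, assuming the `vtune` command-line tool is on your `PATH`. The result directory and the `launch.sh` location are hypothetical names chosen for illustration.

```shell
# Hypothetical result directory and wrapper script; adjust to your setup.
RESULT_DIR="./vtune_gpu_offload_result"
LAUNCH_SCRIPT="./launch.sh"

if command -v vtune >/dev/null 2>&1; then
    # Collect a GPU Offload analysis for the wrapped PyTorch workload.
    vtune -collect gpu-offload -result-dir "$RESULT_DIR" -- bash "$LAUNCH_SCRIPT"
else
    # The CLI is not available in this shell; nothing is collected.
    echo "vtune not found on PATH; set up the VTune environment first" >&2
fi
```

Replacing `gpu-offload` with `hotspots` would correspond to the CPU configuration described earlier.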
@@ -101,6 +115,18 @@ As illustrated on the right side navigation bar, brown portions in the timeline
101
115
102
116
Intel® VTune™ Profiler provides many more profiling features to help you understand a performance issue. Once you understand the root cause of a performance issue, you can fix it. More detailed usage instructions are available in the `Intel® VTune™ Profiler User Guide <https://www.intel.com/content/www/us/en/develop/documentation/vtune-help/top/analyze-performance.html>`__.
103
117
118
+
Read XPU Profiling Result
119
+
~~~~~~~~~~~~~~~~~~~~~~~~~
120
+
121
+
After a successful profiling run with ITT, you can open the `Platform` tab of the profiling result to see labels in the Intel® VTune™ Profiler timeline.
The timeline shows the main thread as a `python` thread at the top. Labeled PyTorch operators and customized regions are shown in the main thread row. All operators starting with `aten::` are operators labeled implicitly by the ITT feature in PyTorch. The timeline also shows the GPU Computing Queue at the top, where you can see the different XPU kernels dispatched to the GPU queue.
129
+
104
130
A short sample code showcasing how to use PyTorch ITT APIs