AI and Analytics features and functionality: Intel Python NumPy numba-dpex k-NN #1424

Merged · 2 commits · Mar 13, 2023
@@ -311,6 +311,7 @@
},
{
"cell_type": "markdown",
"id": "5eea6ae7",
"metadata": {},
"source": [
"The training times for the 3 cases are printed out and shown in the figure above. Using BF16 should show significant reduction in training time. However, there is little to no change using AVX512 with BF16 and AMX with BF16 because the amount of computations required for one batch is too small with this dataset. "
@@ -348,15 +349,16 @@
"id": "b6ea2aeb",
"metadata": {},
"source": [
"This figure shows the relative performance speedup of AMX compared to FP32 and BF16 with AVX512. The expected behavior is that AMX with BF16 should have about a 1.5X improvement over FP32 and about the same performance as BF16 with AVX512. To see more performance improvement between AVX-512 BF16 and AMX BF16, increase the amount of required computations in one batch. This can be done by increasing the batch size with CIFAR10 or using another dataset. "
"This figure shows the relative performance speedup of AMX compared to FP32 and BF16 with AVX512."
]
},
{
"cell_type": "markdown",
"id": "0da073a6",
"id": "7bf01080",
"metadata": {},
"source": [
"This code sample shows how to enable and disable AMX during runtime, as well as the performance improvements using AMX BF16 for training the ResNet50 model. There will be additional significant performance improvements if AMX INT8 is used in inference, which is covered in a related oneAPI sample."
"## Conclusion\n",
"This code sample shows how to enable and disable AMX during runtime, as well as the performance improvements using AMX BF16 for training on the ResNet50 model. Performance will vary based on your hardware and software versions. To see more performance improvement between AVX-512 BF16 and AMX BF16, increase the amount of required computations in one batch. This can be done by increasing the batch size with CIFAR10 or using another dataset. For even more speedup, consider using the Intel® Extension for PyTorch* [Launch Script](https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/performance_tuning/launch_script.html). "
]
},
{
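A sketch of one way to gate AMX at runtime, assuming the oneDNN `ONEDNN_MAX_CPU_ISA` environment variable is the control used (an assumption, not confirmed by this diff; the variable must be set before the first oneDNN primitive runs):

```python
import os

# Cap the ISA oneDNN may dispatch to; must happen before PyTorch/oneDNN
# initializes. "AVX512_CORE_BF16" excludes AMX; "AVX512_CORE_AMX" allows it.
os.environ["ONEDNN_MAX_CPU_ISA"] = "AVX512_CORE_BF16"

import torch  # imported after the variable is set so oneDNN picks it up
print(torch.backends.mkldnn.is_available())
```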
@@ -148,11 +148,9 @@ If you receive an error message, troubleshoot the problem using the **Diagnostic

## Example Output

-If successful, the sample displays `[CODE_SAMPLE_COMPLETED_SUCCESSFULLY]`. Additionally, the sample generates performance and analysis diagrams for comparison.
+If successful, the sample displays `[CODE_SAMPLE_COMPLETED_SUCCESSFULLY]`. Additionally, the sample prints the runtimes and charts the relative performance, using the FP32 model without any optimizations as the baseline.

-The following image shows approximate performance speed increases using AMX BF16 with auto-mixed precision during training. To see more performance improvement between AVX-512 BF16 and AMX BF16, increase the amount of required computations in one batch. This can be done by increasing the batch size with CIFAR10 or using another dataset.
-
-![comparison images](assets/amx_relative_speedup.png)
+The performance speedups using AMX BF16 on ResNet50 are approximate. Performance will vary based on your hardware and software versions. To see more performance improvement between AVX-512 BF16 and AMX BF16, increase the amount of required computations in one batch. This can be done by increasing the batch size with CIFAR10 or using another dataset. For even more speedup, consider using the Intel® Extension for PyTorch* [Launch Script](https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/performance_tuning/launch_script.html).
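A minimal sketch of the batch-size suggestion above, assuming torchvision is available (the value 512 is illustrative, not from the sample):

```python
import torch
import torchvision
import torchvision.transforms as transforms

# More work per oneDNN call tends to widen the AVX-512 BF16 vs AMX BF16 gap.
train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True,
    transform=transforms.ToTensor())
train_loader = torch.utils.data.DataLoader(
    train_set, batch_size=512, shuffle=True)  # 512 instead of a small default
```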

## License

Binary file not shown.
@@ -1,26 +1,26 @@
-# `Intel® Python NumPy vs Numba_dpex` Sample
+# `Intel® Python NumPy vs numba-dpex` Sample

-The `Intel® Python NumPy vs Numba_dpex` sample shows how to achieve the same accuracy of the k-NN model classification while using numpy, numba, and numba_dpex.
+The `Intel® Python NumPy vs numba-dpex` sample shows how to achieve the same accuracy of k-NN model classification using NumPy, Numba, and the Data-parallel Extension for Numba* (numba-dpex).

| Area | Description
| :--- | :---
-| What you will learn | How to program using numba_dpex
+| What you will learn | How to program using the Data-parallel Extension for Numba* (numba-dpex)
| Time to complete | 5 minutes
-| Category | Component
+| Category | Code Optimization

>**Note**: The libraries used in this sample are available in Intel® Distribution for Python* as part of the [Intel® AI Analytics Toolkit (AI Kit)](https://software.intel.com/en-us/oneapi/ai-kit).

## Purpose

-In this sample, you will run a k-nearest neighbors algorithm using 3 different Intel® Distribution for Python* libraries: numpy, numba, and numba_dpex. You will learn how to use k-NN model and how to optimize it by numba_dpex operations without sacrificing accuracy.
+In this sample, you will run a k-nearest neighbors algorithm using 3 different Intel® Distribution for Python* libraries: NumPy, Numba, and numba-dpex. You will learn how to use the k-NN model and how to optimize it with numba-dpex operations without sacrificing accuracy.
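For orientation, a minimal NumPy-only k-NN classifier of the kind being compared, as a sketch (not the sample's actual code):

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=5):
    # Squared Euclidean distance from each test point to each training point
    d = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=-1)
    # Indices of the k nearest training points for every test point
    nearest = np.argsort(d, axis=1)[:, :k]
    # Majority vote over the neighbors' integer class labels
    return np.array([np.bincount(y_train[row]).argmax() for row in nearest])
```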

## Prerequisites

| Optimized for | Description
|:--- |:---
| OS | Ubuntu* 20.04
| Hardware | CPU
-| Software | Intel® AI Analytics Toolkit
+| Software | Intel® AI Analytics Toolkit (AI Kit)

### For Local Development Environments

@@ -40,11 +40,11 @@ The necessary tools and components are already installed in the environment. You

## Key Implementation Details

-This sample code is implemented for the CPU using Python. The sample assumes you have numba_dpex installed inside a Conda environment, similar to what is installed with the Intel® Distribution for Python*.
+This sample code is implemented for the CPU using Python. The sample assumes you have numba-dpex installed inside a Conda environment, similar to what is installed with the Intel® Distribution for Python*.
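A sketch of how the Numba variant can compile the same distance computation as an explicit parallel loop (assumed structure, not the sample's exact kernel; numba-dpex expresses the same loop as a data-parallel kernel):

```python
import numpy as np
from numba import njit, prange

@njit(parallel=True)
def pairwise_sq_dist(X_test, X_train):
    n, m, nfeat = X_test.shape[0], X_train.shape[0], X_train.shape[1]
    out = np.empty((n, m))
    for i in prange(n):             # test points, parallelized by Numba
        for j in range(m):          # every training point
            s = 0.0
            for f in range(nfeat):  # accumulate squared differences
                diff = X_test[i, f] - X_train[j, f]
                s += diff * diff
            out[i, j] = s
    return out
```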

>**Note**: Read *[Get Started with the Intel® AI Analytics Toolkit for Linux*](https://www.intel.com/content/www/us/en/develop/documentation/get-started-with-ai-linux/top.html)* to find out how you can achieve performance gains for popular deep-learning and machine-learning frameworks through Intel optimizations.

-## Run the `Intel® Python NumPy vs Numba_dpex` Sample
+## Run the `Intel® Python NumPy vs numba-dpex` Sample

### On Linux*

@@ -73,7 +73,15 @@ This sample code is implemented for the CPU using Python. The sample assumes you
conda activate usr_base
```

-#### Run the Jupyter Notebook
+#### Run the Python Script

+1. Change to the sample directory.
+2. Run the script.
+   ```
+   python IntelPython_Numpy_Numba_dpex_kNN.py
+   ```

+#### Run the Jupyter Notebook (Optional)

1. Launch Jupyter Notebook.
```
@@ -90,21 +98,31 @@ This sample code is implemented for the CPU using Python. The sample assumes you

If you receive an error message, troubleshoot the problem using the **Diagnostics Utility for Intel® oneAPI Toolkits**. The diagnostic utility provides configuration and system checks to help find missing dependencies, permissions errors, and other issues. See the [Diagnostics Utility for Intel® oneAPI Toolkits User Guide](https://www.intel.com/content/www/us/en/develop/documentation/diagnostic-utility-user-guide/top.html) for more information on using the utility.

-### Run the Sample on Intel® DevCloud
+### Build and Run the Sample on Intel® DevCloud (Optional)

>**Note**: For more information on using Intel® DevCloud, see the Intel® oneAPI [Get Started](https://devcloud.intel.com/oneapi/get_started/) page.

-1. If you do not already have an account, request an Intel® DevCloud account at [*Create an Intel® DevCloud Account*](https://intelsoftwaresites.secure.force.com/DevCloud/oneapi).
-2. On a Linux* system, open a terminal.
-3. SSH into Intel® DevCloud.
+1. Open a terminal on a Linux* system.
+2. Log in to the Intel® DevCloud.
```
-ssh DevCloud
+ssh devcloud
```
> **Note**: You can find information about configuring your Linux system and connecting to Intel DevCloud at Intel® DevCloud for oneAPI [Get Started](https://devcloud.intel.com/oneapi/get_started).

-4. Locate and select the Notebook.
-   ```
-   numba_numpy.ipynb
-   ```
-5. Run every cell in the Notebook in sequence.
+3. If the sample is not already available, download the samples from GitHub.
+   ```
+   git clone https://github.com/oneapi-src/oneAPI-samples.git
+   ```
+4. Change to the sample directory.
+5. Launch Jupyter Notebook.
+6. Locate and select the Notebook.
+   ```
+   IntelPython_Numpy_Numba_dpex_kNN.ipynb
+   ```
+7. Run every cell in the Notebook in sequence.
+8. Review the output.
+9. Disconnect from Intel® DevCloud.
+   ```
+   exit
+   ```

## Example Output
