# `Intel® Python XGBoost Daal4py Prediction` Sample

This sample illustrates how porting a pre-trained XGBoost model to Daal4py prediction with minimal code changes yields much faster prediction.

| Area                | Description |
| :---                | :--- |
| What you will learn | How to port a pre-trained XGBoost model to Daal4py prediction and measure the performance benefit |
| Time to complete    | 5-8 minutes |
| Category            | Code Optimization |

## Purpose

This sample demonstrates the performance benefit of porting pre-trained XGBoost models to Daal4py prediction with minimal code changes.

XGBoost is a widely used gradient boosting library in the classical machine learning (ML) area. Designed for flexibility, performance, and portability, XGBoost includes optimized distributed gradient boosting frameworks and implements ML algorithms underneath them. In addition, Daal4py can further accelerate gradient boosting prediction with XGBoost without modifying XGBoost models or learning an additional API.

## Prerequisites

| Optimized for | Description |
| :---          | :--- |
| OS            | Ubuntu* 18.04 or higher |
| Hardware      | Intel Atom® processors <br> Intel® Core™ processor family <br> Intel® Xeon® processor family <br> Intel® Xeon® Scalable processor family |
| Software      | XGBoost <br> Intel® AI Analytics Toolkit (AI Kit) |

This sample code is implemented for CPU using the Python language. The sample assumes you have XGBoost, Daal4py, and Matplotlib installed inside a conda environment, similar to what is delivered with the Intel® Distribution for Python* as part of the [Intel® AI Analytics Toolkit](https://software.intel.com/en-us/oneapi/ai-kit).

You can refer to the oneAPI [main page](https://software.intel.com/en-us/oneapi) for toolkit installation and the Toolkit [Getting Started Guide for Linux](https://software.intel.com/en-us/get-started-with-intel-oneapi-linux-get-started-with-the-intel-ai-analytics-toolkit) for post-installation steps and scripts.

## Key Implementation Details

In this sample, you will run an XGBoost model with both Daal4py prediction and native XGBoost API prediction to see the performance benefit of Daal4py gradient boosting prediction. You will also learn how to port a pre-trained XGBoost model to Daal4py prediction.

XGBoost* is ready for use once you finish the Intel® AI Analytics Toolkit installation and have run the post-installation script.

## Set Environment Variables

When working with the command-line interface (CLI), you should configure the oneAPI toolkits using environment variables. Set up your CLI environment by sourcing the `setvars` script every time you open a new terminal window. This practice ensures that your compiler, libraries, and tools are ready for development.

## Run the `Intel® Python XGBoost Daal4py Prediction` Sample

### On Linux*

> **Note**: If you have not already done so, set up your CLI
> environment by sourcing the `setvars` script in the root of your oneAPI installation.
>
> Linux*:
> - For system wide installations: `. /opt/intel/oneapi/setvars.sh`
> - For private installations: `. ~/intel/oneapi/setvars.sh`
> - For non-POSIX shells, like csh, use the following command: `bash -c 'source <install-dir>/setvars.sh ; exec csh'`
>
> For more information on configuring environment variables, see *[Use the setvars Script with Linux* or macOS*](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html)*.

#### Activate Conda with Root Access

By default, the AI Kit is installed in the `/opt/intel/oneapi` folder, which requires root privileges to manage. If you activated another environment, you can return to the base environment with the following command.

```
source activate base
```

#### Activate Conda without Root Access (Optional)

To bypass root access when managing your Conda environment, clone and then activate your desired Conda environment using commands similar to the following.

```
conda create --name usr_intelpython --clone base
source activate usr_intelpython
```

#### Install Jupyter Notebook

```
conda install jupyter nb_conda_kernels
```

#### Open Jupyter Notebook

> **Note**: This distributed sample cannot be executed from the Jupyter Notebook, but you can read the description and follow the program flow in the Notebook.

1. Change to the sample directory.
2. Launch Jupyter Notebook.
   ```
   jupyter notebook
   ```
3. Locate and select the Notebook.
   ```
   IntelPython_XGBoost_daal4pyPrediction.ipynb
   ```
4. Click the **Run** button to move through the cells in sequence.

#### Download and Run the Script

1. While still in Jupyter Notebook, select **Download as** > **python (py)**.
2. Locate the downloaded script.
3. Run the script.
   ```
   python IntelPython_XGBoost_daal4pyPrediction.py
   ```
   When it finishes, you will see two plots: one for prediction time and one for prediction accuracy. You might need to dismiss the first plot to view the second plot.

## Example Output

In addition to the plots for prediction time and prediction accuracy, you should see output similar to the following example.

```
XGBoost prediction results (first 10 rows):
 [4. 2. 2. 2. 3. 1. 3. 4. 3. 4.]

daal4py prediction results (first 10 rows):
 [4. 2. 2. 2. 3. 1. 3. 4. 3. 4.]

Ground truth (first 10 rows):
 [4. 2. 2. 2. 3. 1. 3. 4. 3. 4.]
XGBoost errors count: 10
XGBoost accuracy score: 0.99

daal4py errors count: 10
daal4py accuracy score: 0.99

 XGBoost Prediction Time: 0.03896141052246094

 daal4py Prediction Time: 0.10008668899536133

All looks good!
speedup: 0.3892766452116991
Accuracy Difference 0.0
[CODE_SAMPLE_COMPLETED_SUCCESFULLY]
```
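
The closing summary lines are simple derived quantities. As a sketch of how such a comparison could be computed (variable names here are illustrative, not taken from the sample; values are copied from the example output above):

```python
# Illustrative values copied from the example output above.
xgb_time = 0.03896141052246094   # seconds, XGBoost API prediction
daal_time = 0.10008668899536133  # seconds, Daal4py prediction
xgb_acc = 0.99
daal_acc = 0.99

# "speedup" is the ratio of XGBoost time to Daal4py time; values above 1.0
# mean Daal4py prediction ran faster on that machine for that run.
speedup = xgb_time / daal_time
accuracy_difference = abs(xgb_acc - daal_acc)

print(f"speedup: {speedup}")
print(f"Accuracy Difference {accuracy_difference}")
```

Note that in this particular run the reported speedup is below 1.0; measured results vary with hardware and dataset size.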

## License

Code samples are licensed under the MIT license. See
[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details.

Third-party program Licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt).