
Commit c457e68

jkinsky, jimmytwei, krzeszew, alexsin368, ZhaoqiongZ authored
Ai and analytics features and functionality intel python xg boost daal4py prediction (#1434)
* Fixes for 2023.1 AI Kit (#1409)
* Intel Python Numpy Numba_dpes kNN sample (#1292): *.py and *.ipynb files with implementation; README.md and sample.json files with documentation; license and third-party programs
* Adding PyTorch Training Optimizations with AMX BF16 oneAPI sample (#1293)
* Add IntelPytorch Quantization code samples (#1301): fix the spelling error in the README file; use John's README with grammar fix and title change; rename third-party-grograms.txt to third-party-programs.txt
* AMX bfloat16 mixed precision learning TensorFlow Transformer sample (#1317)
* [New Sample] Intel Extension for TensorFlow Getting Started (#1313): first draft; update README.md; remove redundant file
* [New Sample] [oneDNN] Benchdnn tutorial (#1315): new sample: benchDNN tutorial; update readme; rename sample to benchdnn_tutorial; name fix
* Add files via upload (#1320)
* [New Sample] oneCCL Bindings for PyTorch Getting Started (#1316): update README.md; add torch-ccl version check
* [New Sample] Intel Extension for PyTorch Getting Started (#1314): add new ipex GSG notebook for dGPU; update sample.json for expertise field; update requirements.txt package versions to comply with Snyk tool
* Updated title field in sample.json in TF Transformer AMX bfloat16 Mixed Precision sample to fit within character length range (#1327)
* Add arch checker class (#1332)
* Change gpu.patch to convert the code samples from cpu to gpu correctly (#1334)
* Fixes for spelling in AMX bfloat16 transformer sample and printing error in python code in numpy vs numba sample (#1335)
* 2023.1 AI Kit itex get started example fix (#1338): fix the typo; update ResNet50_Inference.ipynb
* Fix resnet inference demo link (#1339)
* Fix printing issue in numpy vs numba AI sample (#1356)
* Fix invalid Kmeans parameters on oneAPI 2023 (#1345)
* Update README to add new samples into the list (#1366)
* PyTorch AMX BF16 Training sample: remove graphs and performance numbers (#1408); update top README in Features and Functionality
* Update Python XGBoost Daal4py Prediction sample readme: restructured to match new template, with exceptions for AI kits; updated some formatting; corrected some branding; corrected python script name; added actual results from the script; removed the images as they made little difference in intent and referred to a different sample

Co-authored-by: Jimmy Wei <[email protected]>
Co-authored-by: krzeszew <[email protected]>
Co-authored-by: alexsin368 <[email protected]>
Co-authored-by: ZhaoqiongZ <[email protected]>
Co-authored-by: Louie Tsai <[email protected]>
Co-authored-by: Orel Yehuda <[email protected]>
Co-authored-by: yuning <[email protected]>
Co-authored-by: Wang, Kai Lawrence <[email protected]>
Co-authored-by: xiguiw <[email protected]>
1 parent 8b606a6 commit c457e68

File tree

3 files changed: +83 −75 lines changed
# `Intel® Python XGBoost Daal4py Prediction` Sample

This sample code illustrates how to analyze the performance benefit of porting a pre-trained XGBoost model to daal4py prediction with minimal code changes.

| Area                | Description
| :---                | :---
| What you will learn | How to analyze the performance benefit of porting a pre-trained XGBoost model to daal4py prediction with minimal code changes
| Time to complete    | 5-8 minutes
| Category            | Code Optimization

## Purpose

This sample illustrates how to port pre-trained XGBoost models to daal4py prediction with minimal code changes and how to analyze the resulting performance benefit.

XGBoost is a widely used gradient boosting library in the classical machine learning (ML) area. Designed for flexibility, performance, and portability, XGBoost includes optimized distributed gradient boosting frameworks and implements machine learning algorithms underneath. In addition, daal4py provides functionality to bring even more optimizations to gradient boosting prediction with XGBoost without modifying XGBoost models or learning an additional API.

## Prerequisites

| Optimized for | Description
| :---          | :---
| OS            | Ubuntu* 18.04 or higher
| Hardware      | Intel Atom® processors <br> Intel® Core™ processor family <br> Intel® Xeon® processor family <br> Intel® Xeon® Scalable processor family
| Software      | XGBoost <br> Intel® AI Analytics Toolkit (AI Kit)

This sample is implemented for CPU using the Python language. The sample assumes you have XGBoost, daal4py, and Matplotlib installed inside a conda environment. Installing the Intel® Distribution for Python* as part of the [Intel® AI Analytics Toolkit](https://software.intel.com/en-us/oneapi/ai-kit) should suffice in most cases.

You can refer to the oneAPI [main page](https://software.intel.com/en-us/oneapi) for toolkit installation and the toolkit [Getting Started Guide for Linux](https://software.intel.com/en-us/get-started-with-intel-oneapi-linux-get-started-with-the-intel-ai-analytics-toolkit) for post-installation steps and scripts.

## Key Implementation Details

In this sample, you will run an XGBoost model with daal4py prediction and XGBoost API prediction to see the performance benefit of daal4py gradient boosting prediction. You will also learn how to port a pre-trained XGBoost model to daal4py prediction.

XGBoost* is ready for use once you finish the Intel® AI Analytics Toolkit installation and have run the post-installation script.

## Set Environment Variables

When working with the command-line interface (CLI), you should configure the oneAPI toolkits using environment variables. Set up your CLI environment by sourcing the `setvars` script every time you open a new terminal window. This practice ensures that your compiler, libraries, and tools are ready for development.

## Run the `Intel® Python XGBoost Daal4py Prediction` Sample

### On Linux*

> **Note**: If you have not already done so, set up your CLI
> environment by sourcing the `setvars` script in the root of your oneAPI installation.
>
> Linux*:
> - For system wide installations: `. /opt/intel/oneapi/setvars.sh`
> - For private installations: `. ~/intel/oneapi/setvars.sh`
> - For non-POSIX shells, like csh, use the following command: `bash -c 'source <install-dir>/setvars.sh ; exec csh'`
>
> For more information on configuring environment variables, see *[Use the setvars Script with Linux* or macOS*](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html)*.

#### Activate Conda with Root Access

By default, the AI Kit is installed in the `/opt/intel/oneapi` folder and requires root privileges to manage it. If you activated another conda environment, you can return to the default environment with the following command.

```
source activate base
```

#### Activate Conda without Root Access (Optional)

To bypass root access when managing your conda environment, clone and activate your desired conda environment using commands similar to the following.

```
conda create --name usr_intelpython --clone base
source activate usr_intelpython
```
#### Install Jupyter Notebook

```
conda install jupyter nb_conda_kernels
```

#### Open Jupyter Notebook

> **Note**: This distributed sample cannot be executed from the Jupyter Notebook, but you can read the description and follow the program flow in the Notebook.

1. Change to the sample directory.
2. Launch Jupyter Notebook.
   ```
   jupyter notebook
   ```
3. Locate and select the Notebook.
   ```
   IntelPython_XGBoost_daal4pyPrediction.ipynb
   ```
4. Click the **Run** button to move through the cells in sequence.

#### Download and Run the Script

1. While still in Jupyter Notebook, select **File** > **Download as** > **python (py)**.
2. Locate the downloaded script.
3. Run the script.
   ```
   python IntelPython_XGBoost_daal4pyPrediction.py
   ```
   When it finishes, you will see two plots: one for prediction time and one for prediction accuracy. You might need to dismiss the first plot to view the second plot.

## Example Output

In addition to the plots for prediction time and prediction accuracy, you should see output similar to the following example.

```
XGBoost prediction results (first 10 rows):
[4. 2. 2. 2. 3. 1. 3. 4. 3. 4.]

daal4py prediction results (first 10 rows):
[4. 2. 2. 2. 3. 1. 3. 4. 3. 4.]

Ground truth (first 10 rows):
[4. 2. 2. 2. 3. 1. 3. 4. 3. 4.]

XGBoost errors count: 10
XGBoost accuracy score: 0.99

daal4py errors count: 10
daal4py accuracy score: 0.99

XGBoost Prediction Time: 0.03896141052246094
daal4py Prediction Time: 0.10008668899536133

All looks good!
speedup: 0.3892766452116991
Accuracy Difference 0.0
[CODE_SAMPLE_COMPLETED_SUCCESFULLY]
```
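
Note that in this particular run the reported `speedup` is below 1.0, meaning daal4py prediction was slower than XGBoost prediction; timings vary with hardware and model size. The `speedup` and `Accuracy Difference` lines appear to be simple derived values, which can be checked against the numbers printed above:

```python
# Reproduce the derived lines from the sample output above.
xgb_time = 0.03896141052246094   # "XGBoost Prediction Time" from the output
d4p_time = 0.10008668899536133   # "daal4py Prediction Time" from the output

# speedup is the XGBoost time divided by the daal4py time; values below 1.0
# mean daal4py was slower on this particular run.
speedup = xgb_time / d4p_time
print("speedup:", speedup)  # matches the speedup value printed above

# Accuracy Difference is the gap between the two accuracy scores.
accuracy_difference = 0.99 - 0.99
print("Accuracy Difference", accuracy_difference)
```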

## License

Code samples are licensed under the MIT license. See
[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details.

Third-party program licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt).
