|
1 |
| -# `Intel Python XGBoost Getting Started` Sample |
2 |
| -XGBoost* is a widely used gradient boosting library in the classical ML area. Designed for flexibility, performance, and portability, XGBoost* includes optimized distributed gradient boosting frameworks and implements Machine Learning algorithms underneath. Starting with 0.9 version of XGBoost, Intel has been upstreaming optimizations to the through the `hist` histogram tree-building method. Starting with 1.3.3 version of XGBoost and beyond, Intel has also begun upstreaming inference optimziations to XGBoost as well. |
| 1 | +# `Intel® Python XGBoost* Getting Started` Sample |
3 | 2 |
|
4 |
| -| Optimized for | Description |
5 |
| -| :--- | :--- |
6 |
| -| OS | 64-bit Linux: Ubuntu 18.04 or higher, 64-bit Windows 10, macOS 10.14 or higher |
7 |
| -| Hardware | Intel Atom® Processors; Intel® Core™ Processor Family; Intel® Xeon® Processor Family; Intel® Xeon® Scalable processor family |
8 |
| -| Software | XGBoost, Intel® AI Analytics Toolkit (AI Kit) |
9 |
| -| What you will learn | basic XGBoost programming model for Intel CPU |
10 |
| -| Time to complete | 5 minutes |
| 3 | +The `Intel® Python XGBoost* Getting Started` sample demonstrates how to set up and train an XGBoost model on datasets for prediction. |
| 4 | + |
| 5 | +| Area | Description |
| 6 | +| :--- | :--- |
| 7 | +| What you will learn | The basics of the XGBoost programming model for Intel CPUs |
| 8 | +| Time to complete | 5 minutes |
| 9 | +| Category | Getting Started |
11 | 10 |
|
12 | 11 | ## Purpose
|
13 |
| -In this code sample, you will learn how to use Intel optimizations for XGBoost published as part of Intel® AI Analytics Toolkit. The sample also illustrates how to set up and train an XGBoost* model on datasets for prediction. |
14 |
| -It also demonstrates how to use software products that can be found in the [Intel® AI Analytics Toolkit](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit.html). |
15 | 12 |
|
16 |
| -## Key Implementation Details |
17 |
| -This Getting Started sample code is implemented for CPU using the Python language. The example assumes you have XGboost installed inside a conda environment, similar to what is delivered with the installation of the Intel® Distribution for Python* as part of the [Intel® AI Analytics Toolkit](https://software.intel.com/en-us/oneapi/ai-kit). |
| 13 | +XGBoost* is a widely used gradient boosting library in the classical ML area. Designed for flexibility, performance, and portability, XGBoost* includes optimized distributed gradient boosting frameworks and implements Machine Learning algorithms underneath. Starting with version 0.9 of XGBoost, Intel has been upstreaming optimizations through the `hist` histogram tree-building method. Starting with version 1.3.3 of XGBoost, Intel has also begun upstreaming inference optimizations to XGBoost. |
18 | 14 |
|
19 |
| -## License |
20 |
| -Code samples are licensed under the MIT license. See |
21 |
| -[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details. |
| 15 | +In this code sample, you will learn how to use Intel optimizations for XGBoost published as part of Intel® AI Analytics Toolkit. The sample also illustrates how to set up and train an XGBoost* model on datasets for prediction. It also demonstrates how to use software products that can be found in the [Intel® AI Analytics Toolkit](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit.html). |
| 16 | + |
| 17 | +## Prerequisites |
| 18 | + |
| 19 | +| Optimized for | Description |
| 20 | +| :--- | :--- |
| 21 | +| OS | Ubuntu* 20.04 (or newer) |
| 22 | +| Hardware | Intel Atom® Processors <br> Intel® Core™ Processor Family <br> Intel® Xeon® Processor Family <br> Intel® Xeon® Scalable processor family |
| 23 | +| Software | XGBoost* <br> Intel® AI Analytics Toolkit (AI Kit) |
22 | 24 |
|
23 |
| -Third party program Licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt) |
| 25 | +## Key Implementation Details |
24 | 26 |
|
25 |
| -## Building XGBoost for CPU |
| 27 | +This Getting Started sample code is implemented for CPU using the Python language. The example assumes you have XGBoost installed inside a conda environment, similar to what is delivered with the installation of the Intel® Distribution for Python* as part of the [Intel® AI Analytics Toolkit](https://software.intel.com/en-us/oneapi/ai-kit). |
26 | 28 |
|
27 | 29 | XGBoost* is ready for use once you finish the Intel® AI Analytics Toolkit installation and have run the post installation script.
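
One way to confirm that XGBoost is visible to the active Python environment is a quick check like the sketch below. It uses only the Python standard library; the helper name `describe_package` is hypothetical and not part of the sample.

```python
import importlib.util
from importlib import metadata


def describe_package(name):
    """Return the installed version of `name`, or None if it cannot be imported.

    Returns the string "unknown" when the module is importable but no
    distribution metadata is found under the same name.
    """
    if importlib.util.find_spec(name) is None:
        return None
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "unknown"


version = describe_package("xgboost")
if version is None:
    print("xgboost is not available in this environment")
else:
    print("xgboost version:", version)
```

If the check reports that the package is missing, the conda environment delivered with the toolkit is probably not the active one.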
|
28 | 30 |
|
29 |
| -You can refer to the oneAPI [main page](https://software.intel.com/en-us/oneapi) for toolkit installation and the Toolkit [Getting Started Guide for Linux](https://software.intel.com/en-us/get-started-with-intel-oneapi-linux-get-started-with-the-intel-ai-analytics-toolkit) for post-installation steps and scripts. |
| 31 | +## Set Environment Variables |
30 | 32 |
|
| 33 | +When working with the command-line interface (CLI), you should configure the oneAPI toolkits using environment variables. Set up your CLI environment by sourcing the `setvars` script every time you open a new terminal window. This practice ensures that your compiler, libraries, and tools are ready for development. |
| 34 | + |
| 35 | +## Configure Environment |
31 | 36 |
|
32 | 37 | > **Note**: If you have not already done so, set up your CLI
|
33 |
| -> environment by sourcing the `setvars` script located in |
34 |
| -> the root of your oneAPI installation. |
35 |
| -> |
36 |
| -> Linux Sudo: . /opt/intel/oneapi/setvars.sh |
| 38 | +> environment by sourcing the `setvars` script in the root of your oneAPI installation. |
37 | 39 | >
|
38 |
| -> Linux User: . ~/intel/oneapi/setvars.sh |
| 40 | +> Linux*: |
| 41 | +> - For system-wide installations: `. /opt/intel/oneapi/setvars.sh` |
| 42 | +> - For private installations: `. ~/intel/oneapi/setvars.sh` |
| 43 | +> - For non-POSIX shells, like csh, use the following command: `bash -c 'source <install-dir>/setvars.sh ; exec csh'` |
39 | 44 | >
|
40 |
| -> Windows: C:\Program Files(x86)\Intel\oneAPI\setvars.bat |
41 |
| -> |
42 |
| ->For more information on environment variables, see Use the setvars Script for [Linux or macOS](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html), or [Windows](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-windows.html). |
| 45 | +> For more information on configuring environment variables, see *[Use the setvars Script with Linux* or macOS*](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html)*. |
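
To check from Python whether the `setvars` script has already been sourced in the current shell, a sketch like the following may help. It assumes that `setvars` exports an `ONEAPI_ROOT` variable pointing at the installation directory; treat that variable name as an assumption and verify it against your installation. The helper name `oneapi_environment_ready` is hypothetical.

```python
import os


def oneapi_environment_ready(environ=os.environ):
    """Return True if ONEAPI_ROOT is set and points at an existing directory.

    ONEAPI_ROOT is assumed to be exported by the setvars script.
    """
    root = environ.get("ONEAPI_ROOT")
    return bool(root) and os.path.isdir(root)


if oneapi_environment_ready():
    print("oneAPI environment detected at", os.environ["ONEAPI_ROOT"])
else:
    print("oneAPI environment not detected; source the setvars script first")
```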
43 | 46 |
|
44 |
| -### Activate conda environment With Root Access |
| 47 | +### Using Visual Studio Code* (Optional) |
45 | 48 |
|
46 |
| -However, if you activated another environment, you can return with the following command: |
| 49 | +You can use Visual Studio Code (VS Code) extensions to set your environment, create launch configurations, |
| 50 | +and browse and download samples. |
47 | 51 |
|
48 |
| -#### On a Linux* System |
49 |
| -``` |
50 |
| -source activate base |
51 |
| -``` |
| 52 | +The basic steps to build and run a sample using VS Code include: |
| 53 | + - Download a sample using the extension **Code Sample Browser for Intel oneAPI Toolkits**. |
| 54 | + - Configure the oneAPI environment with the extension **Environment Configurator for Intel oneAPI Toolkits**. |
| 55 | + - Open a Terminal in VS Code (**Terminal>New Terminal**). |
| 56 | + - Run the sample in the VS Code terminal using the instructions below. |
52 | 57 |
|
53 |
| -### Activate conda environment Without Root Access (Optional) |
| 58 | +To learn more about the extensions, see |
| 59 | +[Using Visual Studio Code with Intel® oneAPI Toolkits](https://software.intel.com/content/www/us/en/develop/documentation/using-vs-code-with-intel-oneapi/top.html). |
54 | 60 |
|
55 |
| -By default, the Intel® AI Analytics Toolkit is installed in the inteloneapi folder, which requires root privileges to manage it. If you would like to bypass using root access to manage your conda environment, then you can clone your desired conda environment using the following command: |
| 61 | +### Activate Conda with Root Access |
56 | 62 |
|
57 |
| -#### On a Linux* System |
| 63 | +If you activated another environment, you can return with the following command: |
58 | 64 | ```
|
59 |
| -conda create --name user_base --clone base |
| 65 | +source activate base |
60 | 66 | ```
|
| 67 | +### Activate Conda without Root Access (Optional) |
61 | 68 |
|
62 |
| -Then activate your conda environment with the following command: |
63 |
| - |
| 69 | +By default, the Intel® AI Analytics Toolkit is installed in the inteloneapi folder, which requires root privileges to manage. If you would like to manage your conda environment without root access, you can clone and activate your desired conda environment using the following commands: |
64 | 70 | ```
|
| 71 | +conda create --name user_base --clone base |
65 | 72 | source activate user_base
|
66 | 73 | ```
|
67 | 74 |
|
68 |
| -### Install Jupyter Notebook |
69 |
| - |
70 |
| -Launch Jupyter Notebook in the directory housing the code example |
| 75 | +## Run the `Intel® Python XGBoost* Getting Started` Sample |
71 | 76 |
|
72 |
| -``` |
73 |
| -conda install jupyter nb_conda_kernels |
74 |
| -``` |
75 |
| - |
76 |
| -#### View in Jupyter Notebook |
77 |
| - |
78 |
| -_Note: This distributed execution cannot be launched from the jupyter notebook version, but you can still view inside the notebook to follow the included write-up and description._ |
79 |
| - |
80 |
| -Launch Jupyter Notebook in the directory housing the code example |
81 |
| - |
82 |
| -``` |
83 |
| -jupyter notebook |
84 |
| -``` |
85 |
| -## Running the Sample |
86 |
| - |
87 |
| -### Running the Sample as a Jupyter Notebook |
| 77 | +### Install Jupyter Notebook |
88 | 78 |
|
89 |
| -Open .pynb file and run cells in Jupyter Notebook using the "Run" button (see the image using "Modin Getting Started" sample) |
| 79 | +1. Change to the sample directory. |
| 80 | +2. Install Jupyter Notebook with an appropriate kernel. |
| 81 | + ``` |
| 82 | + conda install jupyter nb_conda_kernels |
| 83 | + ``` |
| 84 | +### Open Jupyter Notebook |
90 | 85 |
|
91 |
| - |
| 86 | +>**Note**: You cannot execute the sample in Jupyter Notebook, but you can still view inside the notebook to follow the included write-up and description. |
92 | 87 |
|
93 |
| -##### Expected Printed Output for Cells (with similar numbers): |
94 |
| -``` |
95 |
| -RMSE: 11.113036205909719 |
96 |
| -[CODE_SAMPLE_COMPLETED_SUCCESFULLY] |
97 |
| -``` |
| 88 | +1. Change to the sample directory. |
| 89 | +2. Launch Jupyter Notebook. |
| 90 | + ``` |
| 91 | + jupyter notebook |
| 92 | + ``` |
| 93 | +3. Locate and select the Notebook. |
| 94 | + ``` |
| 95 | + IntelPython_XGBoost_GettingStarted.ipynb |
| 96 | + ``` |
| 97 | +4. Click the **Run** button to move through the cells in sequence. |
98 | 98 |
|
| 99 | +### Run the Python Script |
99 | 100 |
|
100 |
| -### Running the Sample as a Python File |
| 101 | +1. While still in Jupyter Notebook, keep the notebook open. |
101 | 102 |
|
102 |
| -Open notebook in Jupyter and download as python file (see the image using "daal4py Hello World" sample) |
| 103 | +2. Select **File** > **Download as** > **Python (py)**. |
| 104 | +3. Run the script. |
| 105 | + ``` |
| 106 | + python IntelPython_XGBoost_GettingStarted.py |
| 107 | + ``` |
| 108 | + The output files of the script will be saved in **models** and **result** directories. |
103 | 109 |
|
104 |
| - |
| 110 | +#### Troubleshooting |
105 | 111 |
|
106 |
| -Run the Program |
| 112 | +If you receive an error message, troubleshoot the problem using the **Diagnostics Utility for Intel® oneAPI Toolkits**. The diagnostic utility provides configuration and system checks to help find missing dependencies, permissions errors, and other issues. See the [Diagnostics Utility for Intel® oneAPI Toolkits User Guide](https://www.intel.com/content/www/us/en/develop/documentation/diagnostic-utility-user-guide/top.html) for more information on using the utility. |
107 | 113 |
|
108 |
| -`python IntelPython_XGBoost_GettingStarted.py` |
| 114 | +## Example Output |
109 | 115 |
|
110 |
| -The output files of the script will be saved in the included models and result directories. |
| 116 | +>**Note**: Your numbers might be different. |
111 | 117 |
|
112 |
| -##### Expected Printed Output (with similar numbers): |
113 | 118 | ```
|
114 | 119 | RMSE: 11.113036205909719
|
115 | 120 | [CODE_SAMPLE_COMPLETED_SUCCESFULLY]
|
116 | 121 | ```
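
The RMSE value in the output above is the root mean squared error between predicted and true targets. The following standard-library sketch shows how that metric is computed; the function name `rmse` and the three sample values are illustrative and are not taken from the sample.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length, non-empty sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sum(squared_errors) / len(y_true))


# Errors are 1, 0, and -2, so the mean squared error is 5/3.
print("RMSE:", rmse([3.0, 5.0, 7.0], [2.0, 5.0, 9.0]))
```

A lower RMSE indicates predictions closer to the true values, in the same units as the target.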
|
117 | 122 |
|
118 |
| -### Build and run additional samples |
119 |
| -Several sample programs are available for you to try, many of which can be compiled and run in a similar fashion. Experiment with running the various samples on different kinds of compute nodes or adjust their source code to experiment with different workloads. |
120 |
| - |
121 |
| -### Troubleshooting |
122 |
| -If an error occurs, troubleshoot the problem using the Diagnostics Utility for Intel® oneAPI Toolkits. |
123 |
| -[Learn more](https://software.intel.com/content/www/us/en/develop/documentation/diagnostic-utility-user-guide/top.html) |
124 |
| - |
125 |
| -### Using Visual Studio Code* (Optional) |
126 |
| - |
127 |
| -You can use Visual Studio Code (VS Code) extensions to set your environment, create launch configurations, |
128 |
| -and browse and download samples. |
129 |
| - |
130 |
| -The basic steps to build and run a sample using VS Code include: |
131 |
| - - Download a sample using the extension **Code Sample Browser for Intel oneAPI Toolkits**. |
132 |
| - - Configure the oneAPI environment with the extension **Environment Configurator for Intel oneAPI Toolkits**. |
133 |
| - - Open a Terminal in VS Code (**Terminal>New Terminal**). |
134 |
| - - Run the sample in the VS Code terminal using the instructions below. |
135 |
| - - (Linux only) Debug your GPU application with GDB for Intel® oneAPI toolkits using the Generate Launch Configurations extension. |
| 123 | +## License |
136 | 124 |
|
137 |
| -To learn more about the extensions, see |
138 |
| -[Using Visual Studio Code with Intel® oneAPI Toolkits](https://software.intel.com/content/www/us/en/develop/documentation/using-vs-code-with-intel-oneapi/top.html). |
| 125 | +Code samples are licensed under the MIT license. See |
| 126 | +[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details. |
139 | 127 |
|
140 |
| -After learning how to use the extensions for Intel oneAPI Toolkits, return to this readme for instructions on how to build and run a sample. |
| 128 | +Third-party program licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt). |