|
1 |
| -# `Intel® Extension for Scikit-learn: SVC for Adult dataset` Sample |
2 |
| -This sample code uses the [Adult dataset](https://archive.ics.uci.edu/ml/datasets/adult) to show how to train and predict with a SVC algorithm using Intel® Extension for Scikit-learn. It demonstrates how to use software products that can be found in the [Intel® oneAPI Data Analytics Library](https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onedal.html), [Intel(R) Extension for Scikit-learn](https://intel.github.io/scikit-learn-intelex/), and [Intel® AI Analytics Toolkit (AI Kit)](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit.html). |
| 1 | +# `Intel® Extension for Scikit-learn*: SVC for Adult Data Set` Sample |
3 | 2 |
|
4 |
| -| Optimized for | Description |
5 |
| -| :--- | :--- |
6 |
| -| OS | 64-bit Linux: Ubuntu 18.04 or higher, 64-bit Windows 10, macOS 10.14 or higher |
7 |
| -| Hardware | Intel Atom® Processors; Intel® Core™ Processor Family; Intel® Xeon® Processor Family; Intel® Xeon® Scalable processor family |
8 |
| -| Software | Intel® AI Analytics Toolkit |
9 |
| -| What you will learn | How to get started with Intel® Extension for Scikit-learn |
10 |
| -| Time to complete | 25 minutes |
| 3 | +The `Intel® Extension for Scikit-learn*: SVC for Adult Data Set` sample uses the [Adult dataset](https://archive.ics.uci.edu/ml/datasets/adult) to show how to train and predict with an SVC algorithm using Intel® Extension for Scikit-learn*. |
| 4 | + |
| 5 | +| Optimized for | Description |
| 6 | +| :--- | :--- |
| 7 | +| What you will learn | How to get started with Intel® Extension for Scikit-learn* |
| 8 | +| Time to complete | 25 minutes |
| 9 | +| Category | Concepts and Functionality |
| 10 | + |
| 11 | +The sample demonstrates how to use software products that can be found in the [Intel® oneAPI Data Analytics Library (oneDAL)](https://github.com/oneapi-src/oneDAL), [Intel® Extension for Scikit-learn*](https://intel.github.io/scikit-learn-intelex/), and the [Intel® AI Analytics Toolkit (AI Kit)](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit.html). |
11 | 12 |
|
12 | 13 | ## Purpose
|
13 | 14 |
|
14 |
| -Intel® Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application. The acceleration is achieved through the use of the Intel® oneAPI Data Analytics Library ([oneAPI Data Analytics Library (oneDAL)](https://github.com/oneapi-src/oneDAL)). Patching scikit-learn makes it a well-suited machine learning framework for dealing with real-life problems. |
| 15 | +In this sample, you will run an SVC algorithm with Intel® Extension for Scikit-learn* and compare its performance against the original stock version of scikit-learn. You will see that patching scikit-learn results in a significant increase in performance over the original scikit-learn while also maintaining the same precision. |
| 16 | + |
| 17 | +The acceleration is achieved through the use of oneDAL. Patching scikit-learn makes it a well-suited machine learning framework for dealing with real-life problems. |
| 18 | + |
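A minimal sketch of the patch-then-train flow described above, not the sample's actual code. It assumes the extension is importable as `sklearnex` and exposes `patch_sklearn()`. The helper names (`has_module`, `enable_patch`, `timed`), the synthetic data, and the `C`/`kernel` values are illustrative assumptions; the sample itself uses the Adult dataset.

```python
import importlib.util
import time


def has_module(name: str) -> bool:
    """Return True if the named package can be imported in this environment."""
    return importlib.util.find_spec(name) is not None


def enable_patch() -> bool:
    """Patch scikit-learn with the extension if it is installed (assumed API)."""
    if not has_module("sklearnex"):
        return False
    from sklearnex import patch_sklearn

    patch_sklearn()  # call before importing the scikit-learn estimators
    return True


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


patched = enable_patch()

if has_module("sklearn"):
    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    # Synthetic stand-in data; hyperparameters are illustrative assumptions.
    X, y = make_classification(n_samples=500, n_features=14, random_state=0)
    model, seconds = timed(SVC(C=1.0, kernel="rbf").fit, X, y)
    print(f"patched={patched} fit_time={seconds:.3f}s")
```

Running the same script once with the extension installed and once without gives two fit times to compare, which is the kind of comparison the sample performs.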
| 19 | +## Prerequisites |
| 20 | + |
| 21 | +| Optimized for | Description |
| 22 | +| :--- | :--- |
| 23 | +| OS | Ubuntu 20.04 (or newer) |
| 24 | +| Hardware | Intel Atom® Processors <br> Intel® Core™ Processor Family <br> Intel® Xeon® Processor Family <br> Intel® Xeon® Scalable processor family |
| 25 | +| Software | Intel® AI Analytics Toolkit (AI Kit) |
| 26 | + |
| 27 | +### For Local Development Environments |
| 28 | + |
| 29 | +You will need to download and install the following toolkits, tools, and components to use the sample. |
15 | 30 |
|
16 |
| -In this sample, you will run a SVC algorithm with Intel® Extension for Scikit-learn and compare its performance against the original stock version of scikit-learn. You will see that patching scikit-learn results in a significant increase in performance over the original scikit-learn while also maintaining the same precision. |
| 31 | +- **Intel® AI Analytics Toolkit (AI Kit)** |
| 32 | + |
| 33 | + You can get the AI Kit from [Intel® oneAPI Toolkits](https://www.intel.com/content/www/us/en/developer/tools/oneapi/toolkits.html#analytics-kit). <br> See [*Get Started with the Intel® AI Analytics Toolkit for Linux**](https://www.intel.com/content/www/us/en/develop/documentation/get-started-with-ai-linux) for AI Kit installation information and post-installation steps and scripts. |
| 34 | + |
| 35 | +- **Jupyter Notebook** |
| 36 | + |
| 37 | + Install using pip: `pip install notebook`. <br> Alternatively, see [*Installing Jupyter*](https://jupyter.org/install) for detailed installation instructions. |
| 38 | + |
| 39 | +### For Intel® DevCloud |
| 40 | + |
| 41 | +The necessary tools and components are already installed in the environment. You do not need to install additional components. See [Intel® DevCloud for oneAPI](https://devcloud.intel.com/oneapi/get_started/) for information. |
17 | 42 |
|
18 | 43 | ## Key Implementation Details
|
19 |
| -The sample code is written in Python and it targets CPU architecture. The example assumes you have Intel® Extension for Scikit-learn installed. |
20 | 44 |
|
21 |
| -## License |
22 |
| -Code samples are licensed under the MIT license. See |
23 |
| -[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details. |
| 45 | +The sample code is written in Python and targets CPU architectures. The example assumes you have Intel® Extension for Scikit-learn* installed. |
| 46 | + |
| 47 | +## Set Environment Variables |
24 | 48 |
|
25 |
| -Third party program Licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt) |
| 49 | +When working with the command-line interface (CLI), you should configure the oneAPI toolkits using environment variables. Set up your CLI environment by sourcing the `setvars` script every time you open a new terminal window. This practice ensures that your compiler, libraries, and tools are ready for development. |
26 | 50 |
|
27 |
| -## Build and Run the Sample |
| 51 | +## Run the Sample |
28 | 52 |
|
29 |
| -### Pre-requirement |
| 53 | +### On Linux* |
30 | 54 |
|
31 |
| -> NOTE: No action is required if you are using Intel DevCloud as your environment. |
32 |
| - Refer to [Intel® DevCloud for oneAPI](https://intelsoftwaresites.secure.force.com/devcloud/oneapi) for Intel DevCloud. |
| 55 | +> **Note**: If you have not already done so, set up your CLI |
| 56 | +> environment by sourcing the `setvars` script in the root of your oneAPI installation. |
| 57 | +> |
| 58 | +> Linux*: |
| 59 | +> - For system-wide installations: `. /opt/intel/oneapi/setvars.sh` |
| 60 | +> - For private installations: `. ~/intel/oneapi/setvars.sh` |
| 61 | +> - For non-POSIX shells, like csh, use the following command: `bash -c 'source <install-dir>/setvars.sh ; exec csh'` |
| 62 | +> |
| 63 | +> For more information on configuring environment variables, see *[Use the setvars Script with Linux* or macOS*](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html)*. |
33 | 64 |
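One way to confirm from Python that `setvars` has been sourced in the current shell is to look for an environment variable it exports. The sketch below assumes the script sets `ONEAPI_ROOT`; that variable name and the `oneapi_env_status` helper are assumptions for illustration, so verify them against your installation.

```python
import os


def oneapi_env_status(env=None):
    """Report whether the (assumed) ONEAPI_ROOT variable is present."""
    env = os.environ if env is None else env
    root = env.get("ONEAPI_ROOT")
    return {"configured": bool(root), "oneapi_root": root}


status = oneapi_env_status()
if status["configured"]:
    print(f"oneAPI environment found at {status['oneapi_root']}")
else:
    print("ONEAPI_ROOT is not set; source the setvars script first.")
```

Passing a dictionary instead of reading `os.environ` makes the check easy to exercise without changing the real environment.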
|
34 |
| - 1. **Intel® AI Analytics Toolkit** |
35 |
| - Install the toolkit from the [oneAPI main page](https://software.intel.com/en-us/oneapi) |
36 |
| - and refer to the [Toolkit Get Started Guide for Linux](https://software.intel.com/en-us/get-started-with-intel-oneapi-linux-get-started-with-the-intel-ai-analytics-toolkit) for post-installation steps and scripts. |
| 65 | +### Open Jupyter Notebook |
37 | 66 |
|
38 |
| - 2. **Jupyter Notebook** |
39 |
| - Install Jupyter Notebook via pip: `pip install notebook`. |
40 |
| - Refer to [Installing the Jupyter Software](https://jupyter.org/install) for details. |
| 67 | +1. Change to the sample directory. |
| 68 | +2. Launch Jupyter Notebook. |
| 69 | + ``` |
| 70 | + jupyter notebook |
| 71 | + ``` |
| 72 | +3. Locate and select the Notebook. |
| 73 | + ``` |
| 74 | + Intel_Extension_for_SKLearn_Performance_SVC_Adult.ipynb |
| 75 | + ``` |
| 76 | +4. Click the **Run** button to move through the cells in sequence. |
41 | 77 |
|
| 78 | +### Run the Python File |
42 | 79 |
|
43 |
| -### Running the Sample as a Jupyter Notebook |
| 80 | +1. Run the script. |
| 81 | + ``` |
| 82 | + python Intel_Extension_for_SKLearn_Performance_SVC_Adult.py |
| 83 | + ``` |
| 84 | +#### Troubleshooting |
44 | 85 |
|
45 |
| -1. Launch Jupyter notebook: `jupyter notebook --ip=0.0.0.0` |
46 |
| -2. Follow the instructions to open the URL with the token in your browser. |
47 |
| -3. Click the `Intel_Extension_for_SKLearn_Performance_SVC_Adult.ipynb` file. |
48 |
| -4. Run each cell of the notebook one by one. |
| 86 | +If you receive an error message, troubleshoot the problem using the **Diagnostics Utility for Intel® oneAPI Toolkits**. The diagnostic utility provides configuration and system checks to help find missing dependencies, permissions errors, and other issues. See the *[Diagnostics Utility for Intel® oneAPI Toolkits User Guide](https://www.intel.com/content/www/us/en/develop/documentation/diagnostic-utility-user-guide/top.html)* for more information on using the utility. |
49 | 87 |
|
50 |
| -### Running the Sample as a Python File |
51 | 88 |
|
52 |
| -1. `python Intel_Extension_for_SKLearn_Performance_SVC_Adult.py` |
| 89 | +### On Intel® DevCloud (Optional) |
53 | 90 |
|
54 |
| -### Example of Output |
| 91 | +>**Note**: For more information on using Intel® DevCloud, see the Intel® oneAPI [Get Started](https://devcloud.intel.com/oneapi/get_started/) page. |
| 92 | +
|
| 93 | +1. Open a terminal on a Linux* system. |
| 94 | +2. Log in to the Intel® DevCloud. |
| 95 | + ``` |
| 96 | + ssh devcloud |
| 97 | + ``` |
| 98 | +3. Change to the sample directory. |
| 99 | +4. Follow the same steps as you would on Linux*. |
| 100 | +5. Run the sample. |
| 101 | +6. Review the output. |
| 102 | +7. Disconnect from Intel® DevCloud. |
| 103 | + ``` |
| 104 | + exit |
| 105 | + ``` |
| 106 | +## Example Output |
55 | 107 |
|
56 | 108 | ```
|
57 | 109 | Intel(R) Extension for Scikit-learn* enabled (https://github.com/intel/scikit-learn-intelex)
|
@@ -79,23 +131,9 @@ Classification report for SVC trained with the original scikit-learn:
|
79 | 131 | weighted avg 0.82 0.82 0.82 9769
|
80 | 132 | ```
|
81 | 133 |
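The numeric columns in the classification report are precision, recall, F1-score, and support. As a rough illustration of what those numbers mean, the following pure-Python sketch computes them for one class on a tiny hand-made label set. The data and the `class_metrics` helper are hypothetical and are not taken from the sample.

```python
def class_metrics(y_true, y_pred, label):
    """Return (precision, recall, f1, support) for a single class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    support = tp + fn  # number of true instances of this class
    return precision, recall, f1, support


# Hypothetical labels for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 0, 1, 1]

precision, recall, f1, support = class_metrics(y_true, y_pred, label=1)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f} support={support}")
```

The "weighted avg" row in the report averages these per-class values, weighting each class by its support.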
|
| 134 | +## License |
82 | 135 |
|
83 |
| -If an error occurs, troubleshoot the problem using the Diagnostics Utility for Intel® oneAPI Toolkits. |
84 |
| -[Learn more](https://software.intel.com/content/www/us/en/develop/documentation/diagnostic-utility-user-guide/top.html) |
85 |
| - |
86 |
| - |
87 |
| -### Using Visual Studio Code* (VS Code) |
88 |
| - |
89 |
| -You can use VS Code extensions to set your environment, create launch configurations, |
90 |
| -and browse and download samples. |
91 |
| - |
92 |
| -The basic steps to build and run a sample using VS Code include: |
93 |
| - - Download a sample using the extension **Code Sample Browser for Intel oneAPI Toolkits**. |
94 |
| - - Configure the oneAPI environment with the extension **Environment Configurator for Intel oneAPI Toolkits**. |
95 |
| - - Open a Terminal in VS Code (**Terminal>New Terminal**). |
96 |
| - - Run the sample in the VS Code terminal using the instructions below. |
97 |
| - |
98 |
| -To learn more about the extensions and how to configure the oneAPI environment, see |
99 |
| -[Using Visual Studio Code with Intel® oneAPI Toolkits](https://software.intel.com/content/www/us/en/develop/documentation/using-vs-code-with-intel-oneapi/top.html). |
| 136 | +Code samples are licensed under the MIT license. See |
| 137 | +[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details. |
100 | 138 |
|
101 |
| -After learning how to use the extensions for Intel oneAPI Toolkits, return to this readme for instructions on how to build and run a sample. |
| 139 | +Third-party program licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt). |