
Commit e1a487c

Multiple Changes to readmes (#1510)

Updated formatting. Changed sample names to match sample.json files where needed. Moved images to images folder where needed.

1 parent: 6bd7a2c

File tree: 15 files changed (+569 −648 lines)

AI-and-Analytics/Getting-Started-Samples/IntelModin_GettingStarted/README.md (+102 −141)

Large diffs are not rendered by default.

AI-and-Analytics/Getting-Started-Samples/IntelModin_Vs_Pandas/README.md (+25 −12)

@@ -1,30 +1,37 @@

# `Intel® Modin* Vs. Pandas Performance` Sample

The `Intel® Modin* Vs. Pandas Performance` code illustrates how to use Modin* to replace the Pandas API. The sample compares the performance of Intel® Distribution of Modin* and the performance of Pandas for specific dataframe operations.

| Area                | Description
|:---                 |:---
| What you will learn | How to accelerate the Pandas API using Intel® Distribution of Modin*.
| Time to complete    | Less than 10 minutes
| Category            | Concepts and Functionality

## Purpose

Intel® Distribution of Modin* accelerates Pandas operations using the Ray or Dask execution engine. The distribution provides compatibility and integration with existing Pandas code. The sample code demonstrates how to perform some basic dataframe operations using Pandas and Intel® Distribution of Modin*, so you can compare the performance difference between the two methods.

You can run the sample locally or in Google Colaboratory (Colab).

## Prerequisites

| Optimized for | Description
|:---           |:---
| OS            | Ubuntu* 20.04 (or newer)
| Hardware      | Intel® Core™ Gen10 processor <br> Intel® Xeon® Scalable Performance processors
| Software      | Intel® AI Analytics Toolkit (AI Kit) <br> Intel® Distribution of Modin*

## Key Implementation Details

This code sample is implemented for CPU using the Python programming language. The sample requires the NumPy, Pandas, and Modin libraries, and the time module in Python.

## Run the `Intel® Modin* Vs. Pandas Performance` Sample Locally

If you want to run the sample on a local system using a command-line interface (CLI), you must install the Intel® Distribution of Modin* in a new Conda* environment first.

### Install the Intel® Distribution of Modin*

1. Create a Conda environment.
   ```
   conda create --name aikit-modin
   ```

@@ -51,14 +58,16 @@

   ```
   pip install ipython
   ```

### Run the Sample

1. Change to the directory containing the `IntelModin_Vs_Pandas.ipynb` notebook file on your local system.

2. Run the sample notebook.
   ```
   ipython IntelModin_Vs_Pandas.ipynb
   ```

## Run the `Intel® Modin* Vs. Pandas Performance` Sample in Google Colaboratory

1. Change to the directory containing the `IntelModin_Vs_Pandas.ipynb` notebook file on your local system.

2. Open the notebook file, and remove the prepended number sign (#) symbol from the following lines:

@@ -84,15 +93,19 @@

9. Select **Runtime** > **Run all**.

## Example Output

>**Note**: Your output might differ between runs of the notebook depending on the random generation of the dataset. On the first run, Modin may take longer than Pandas for certain operations because Modin performs some initialization in the first iteration.

```
CPU times: user 8.47 s, sys: 132 ms, total: 8.6 s
Wall time: 8.57 s
```

Example expected cell output is included in `IntelModin_Vs_Pandas.ipynb`.

## License

Code samples are licensed under the MIT license. See
[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details.

Third party program licenses are at [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt).
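The drop-in replacement this sample times can be sketched as follows. This is a minimal sketch, assuming Intel® Distribution of Modin* is installed (it falls back to plain Pandas when it is not); the dataframe shape and the groupby operation are illustrative, not the sample's actual workload:

```python
import time

import numpy as np
import pandas as pd

# Modin is meant as a drop-in Pandas replacement: only the import changes.
# Fall back to plain Pandas if Modin is not installed.
try:
    import modin.pandas as mpd
except ImportError:
    mpd = pd

def time_groupby(frame_module, rows=100_000):
    """Time a simple groupby-mean on a random dataframe."""
    df = frame_module.DataFrame(
        np.random.randint(0, 100, size=(rows, 4)), columns=list("abcd")
    )
    start = time.perf_counter()
    df.groupby("a").mean()
    return time.perf_counter() - start

pandas_seconds = time_groupby(pd)
modin_seconds = time_groupby(mpd)
print(f"pandas: {pandas_seconds:.4f} s, modin: {modin_seconds:.4f} s")
```

As the README notes, Modin's first operation includes one-time initialization, so a single run like this can make Modin look slower than it is in steady state.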
@@ -1,140 +1,128 @@

# `Intel® Python XGBoost* Getting Started` Sample

The `Intel® Python XGBoost* Getting Started` sample demonstrates how to set up and train an XGBoost* model on datasets for prediction.

| Area                | Description
| :---                | :---
| What you will learn | The basics of the XGBoost programming model for Intel CPUs
| Time to complete    | 5 minutes
| Category            | Getting Started

## Purpose

XGBoost* is a widely used gradient boosting library in the classical ML area. Designed for flexibility, performance, and portability, XGBoost* includes optimized distributed gradient boosting frameworks and implements Machine Learning algorithms underneath. Starting with version 0.9 of XGBoost, Intel has been upstreaming optimizations through the `hist` histogram tree-building method. Starting with version 1.3.3 of XGBoost, Intel has also begun upstreaming inference optimizations.

In this code sample, you will learn how to use Intel optimizations for XGBoost published as part of the Intel® AI Analytics Toolkit. The sample also illustrates how to set up and train an XGBoost* model on datasets for prediction. It also demonstrates how to use software products that can be found in the [Intel® AI Analytics Toolkit](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit.html).

## Prerequisites

| Optimized for | Description
| :---          | :---
| OS            | Ubuntu* 20.04 (or newer)
| Hardware      | Intel Atom® Processors <br> Intel® Core™ Processor Family <br> Intel® Xeon® Processor Family <br> Intel® Xeon® Scalable processor family
| Software      | XGBoost* <br> Intel® AI Analytics Toolkit (AI Kit)

## Key Implementation Details

This Getting Started sample code is implemented for CPU using the Python language. The example assumes you have XGBoost installed inside a conda environment, similar to what is delivered with the installation of the Intel® Distribution for Python* as part of the [Intel® AI Analytics Toolkit](https://software.intel.com/en-us/oneapi/ai-kit).

XGBoost* is ready for use once you finish the Intel® AI Analytics Toolkit installation and have run the post-installation script.

## Set Environment Variables

When working with the command-line interface (CLI), you should configure the oneAPI toolkits using environment variables. Set up your CLI environment by sourcing the `setvars` script every time you open a new terminal window. This practice ensures that your compiler, libraries, and tools are ready for development.

## Configure Environment

> **Note**: If you have not already done so, set up your CLI
> environment by sourcing the `setvars` script in the root of your oneAPI installation.
>
> Linux*:
> - For system wide installations: `. /opt/intel/oneapi/setvars.sh`
> - For private installations: `. ~/intel/oneapi/setvars.sh`
> - For non-POSIX shells, like csh, use the following command: `bash -c 'source <install-dir>/setvars.sh ; exec csh'`
>
> For more information on configuring environment variables, see *[Use the setvars Script with Linux* or macOS*](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html)*.

### Using Visual Studio Code* (Optional)

You can use Visual Studio Code (VS Code) extensions to set your environment, create launch configurations, and browse and download samples.

The basic steps to build and run a sample using VS Code include:
- Download a sample using the extension **Code Sample Browser for Intel oneAPI Toolkits**.
- Configure the oneAPI environment with the extension **Environment Configurator for Intel oneAPI Toolkits**.
- Open a terminal in VS Code (**Terminal** > **New Terminal**).
- Run the sample in the VS Code terminal using the instructions below.

To learn more about the extensions, see
[Using Visual Studio Code with Intel® oneAPI Toolkits](https://software.intel.com/content/www/us/en/develop/documentation/using-vs-code-with-intel-oneapi/top.html).

### Activate Conda with Root Access

If you activated another environment, you can return with the following command:
```
source activate base
```

### Activate Conda without Root Access (Optional)

By default, the Intel® AI Analytics Toolkit is installed in the inteloneapi folder, which requires root privileges to manage. If you would like to bypass using root access to manage your conda environment, you can clone and activate your desired conda environment using the following commands:
```
conda create --name user_base --clone base
source activate user_base
```

## Run the `Intel® Python XGBoost* Getting Started` Sample

### Install Jupyter Notebook

1. Change to the sample directory.
2. Install Jupyter Notebook with an appropriate kernel.
   ```
   conda install jupyter nb_conda_kernels
   ```

### Open Jupyter Notebook

>**Note**: You cannot execute the sample in Jupyter Notebook, but you can still view inside the notebook to follow the included write-up and description.

1. Change to the sample directory.
2. Launch Jupyter Notebook.
   ```
   jupyter notebook
   ```
3. Locate and select the Notebook.
   ```
   IntelPython_XGBoost_GettingStarted.ipynb
   ```
4. Click the **Run** button to move through the cells in sequence.

### Run the Python Script

1. While still in Jupyter Notebook, select **File** > **Download as** > **Python (py)**.
2. Run the script.
   ```
   python IntelPython_XGBoost_GettingStarted.py
   ```
   The output files of the script will be saved in the **models** and **result** directories.

#### Troubleshooting

If you receive an error message, troubleshoot the problem using the **Diagnostics Utility for Intel® oneAPI Toolkits**. The diagnostic utility provides configuration and system checks to help find missing dependencies, permissions errors, and other issues. See the [Diagnostics Utility for Intel® oneAPI Toolkits User Guide](https://www.intel.com/content/www/us/en/develop/documentation/diagnostic-utility-user-guide/top.html) for more information on using the utility.

## Example Output

>**Note**: Your numbers might be different.

```
RMSE: 11.113036205909719
[CODE_SAMPLE_COMPLETED_SUCCESFULLY]
```

## License

Code samples are licensed under the MIT license. See
[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details.

Third party program licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt).
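The train-and-score flow this sample walks through can be sketched as below. This is a minimal sketch on synthetic data, assuming `xgboost` is importable (the sketch degrades gracefully when it is not); the dataset, boosting rounds, and resulting RMSE are illustrative and do not reproduce the sample's numbers:

```python
import numpy as np

# Sketch of the sample's flow: train a boosted regressor and report RMSE.
# tree_method="hist" selects the histogram tree builder that carries
# Intel's upstreamed CPU optimizations (XGBoost 0.9 and later).
try:
    import xgboost as xgb
except ImportError:  # xgboost ships in the AI Kit conda environment
    xgb = None

def train_and_score(rows=1000, seed=0):
    """Train on a synthetic linear-plus-noise dataset; return training RMSE."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(rows, 5))
    y = X @ rng.normal(size=5) + rng.normal(scale=0.1, size=rows)
    if xgb is None:
        return None
    dtrain = xgb.DMatrix(X, label=y)
    params = {"objective": "reg:squarederror", "tree_method": "hist"}
    booster = xgb.train(params, dtrain, num_boost_round=50)
    pred = booster.predict(dtrain)
    return float(np.sqrt(np.mean((pred - y) ** 2)))

rmse = train_and_score()
print("RMSE:", rmse)
```

The sample's notebook follows the same shape but loads its own dataset and saves the trained model and results to the **models** and **result** directories.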
