
Intel® AI Analytics Toolkit (AI Kit) Container Getting Started sample readme update #1492


Merged 2 commits on Apr 10, 2023
@@ -311,6 +311,7 @@
},
{
"cell_type": "markdown",
"id": "5eea6ae7",
"metadata": {},
"source": [
"The training times for the 3 cases are printed out and shown in the figure above. Using BF16 should show significant reduction in training time. However, there is little to no change using AVX512 with BF16 and AMX with BF16 because the amount of computations required for one batch is too small with this dataset. "
@@ -348,15 +349,16 @@
"id": "b6ea2aeb",
"metadata": {},
"source": [
"This figure shows the relative performance speedup of AMX compared to FP32 and BF16 with AVX512. The expected behavior is that AMX with BF16 should have about a 1.5X improvement over FP32 and about the same performance as BF16 with AVX512. To see more performance improvement between AVX-512 BF16 and AMX BF16, increase the amount of required computations in one batch. This can be done by increasing the batch size with CIFAR10 or using another dataset. "
"This figure shows the relative performance speedup of AMX compared to FP32 and BF16 with AVX512."
]
},
{
"cell_type": "markdown",
"id": "0da073a6",
"id": "7bf01080",
"metadata": {},
"source": [
"This code sample shows how to enable and disable AMX during runtime, as well as the performance improvements using AMX BF16 for training the ResNet50 model. There will be additional significant performance improvements if AMX INT8 is used in inference, which is covered in a related oneAPI sample."
"## Conclusion\n",
"This code sample shows how to enable and disable AMX during runtime, as well as the performance improvements using AMX BF16 for training on the ResNet50 model. Performance will vary based on your hardware and software versions. To see more performance improvement between AVX-512 BF16 and AMX BF16, increase the amount of required computations in one batch. This can be done by increasing the batch size with CIFAR10 or using another dataset. For even more speedup, consider using the Intel® Extension for PyTorch* [Launch Script](https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/performance_tuning/launch_script.html). "
]
},
{
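The runtime enable/disable mechanism mentioned in the conclusion cell is typically driven through oneDNN's CPU dispatcher control, which caps the instruction set oneDNN may dispatch to. A minimal sketch, assuming a hypothetical training script named `train.py` (not a file in this sample):

```
# Cap oneDNN at AVX-512 BF16: BF16 kernels still run, but AMX tiles are not used.
ONEDNN_MAX_CPU_ISA=AVX512_CORE_BF16 python train.py

# Allow the full instruction set, including AMX, on supported processors.
ONEDNN_MAX_CPU_ISA=DEFAULT python train.py
```

oneDNN reads this variable once, when the first primitive is created, so set it before launching the process rather than mid-run.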
@@ -148,11 +148,9 @@ If you receive an error message, troubleshoot the problem using the **Diagnostics Utility for Intel® oneAPI Toolkits**.

## Example Output

If successful, the sample displays `[CODE_SAMPLE_COMPLETED_SUCCESSFULLY]`. Additionally, the sample generates performance and analysis diagrams for comparison.
If successful, the sample displays `[CODE_SAMPLE_COMPLETED_SUCCESSFULLY]`. Additionally, the sample prints the runtimes and charts of relative performance, using the unoptimized FP32 model as the baseline.

The following image shows approximate performance speed increases using AMX BF16 with auto-mixed precision during training. To see more performance improvement between AVX-512 BF16 and AMX BF16, increase the amount of required computations in one batch. This can be done by increasing the batch size with CIFAR10 or using another dataset.

![comparison images](assets/amx_relative_speedup.png)
The performance speedups shown for AMX BF16 on ResNet50 are approximate. Performance will vary based on your hardware and software versions. To see more performance improvement between AVX-512 BF16 and AMX BF16, increase the amount of required computations in one batch. This can be done by increasing the batch size with CIFAR10 or using another dataset. For even more speedup, consider using the Intel® Extension for PyTorch* [Launch Script](https://intel.github.io/intel-extension-for-pytorch/cpu/latest/tutorials/performance_tuning/launch_script.html).
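The launch script can be invoked through the launcher module that ships with Intel® Extension for PyTorch*, which configures thread affinity and the memory allocator automatically. A minimal sketch with a hypothetical script name (flags vary by version, so check `--help` for your install):

```
python -m intel_extension_for_pytorch.cpu.launch train.py
```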

## License

Binary file not shown.
@@ -1,208 +1,162 @@
# Intel® AI Analytics Toolkit (AI Kit) Container Sample
# `Intel® AI Analytics Toolkit (AI Kit) Container Getting Started` Sample

Containers allow you to set up and configure environments for
building, running, and profiling AI applications and distribute
them using images. You can also use Kubernetes* to automate the
deployment and management of containers in the cloud.
The `Intel® AI Analytics Toolkit (AI Kit) Container Getting Started` sample demonstrates how to use AI Kit containers.

This get started sample shows the easiest way to start using any of
the [Intel® AI Analytics
Toolkit (AI Kit)](https://www.intel.com/content/www/us/en/developer/tools/oneapi/ai-analytics-toolkit.html)
components without the hassle of installing the toolkit, configuring
networking and file sharing.
| Area | Description
|:--- |:---
| What you will learn | How to start using the Intel® oneapi-aikit container
| Time to complete | 10 minutes
| Category | Tutorial


| Optimized for | Description
|:--- |:---
| OS | Linux* Ubuntu* 18.04
| Hardware | Intel® Xeon® Scalable processor family or newer
| Software | Intel® AI Analytics Toolkit
| What you will learn | How to start using the Intel® oneapi-aikit container
| Time to complete | 10 minutes
For more information on the **oneapi-aikit** container, see the [Intel AI Analytics Toolkit](https://hub.docker.com/r/intel/oneapi-aikit) page on Docker Hub.

## Purpose

This sample provides a Bash script to help users configure their Intel® AI Analytics Toolkit
container environment. Developers can
quickly build and train deep learning models using this Docker*
environment.
This sample provides a Bash script to help you configure an AI Kit container environment. You can build and train deep learning models using this Docker* environment.

Containers allow you to set up and configure environments for building, running, and profiling AI applications and distribute them using images. You can also use Kubernetes* to automate the deployment and management of containers in the cloud.

Read the [Get Started with the Intel® AI Analytics Toolkit for Linux*](https://www.intel.com/content/www/us/en/develop/documentation/get-started-with-ai-linux/top.html) to find out how you can achieve performance gains for popular deep-learning and machine-learning frameworks through Intel optimizations.

This sample shows an easy way to start using any of the [Intel® AI Analytics Toolkit (AI Kit)](https://www.intel.com/content/www/us/en/developer/tools/oneapi/ai-analytics-toolkit.html) components without the hassle of installing the toolkit, configuring networking and file sharing.

For more information on the oneAPI AI Kit container, see [AI Kit
Container Repository](https://hub.docker.com/r/intel/oneapi-aikit).
## Prerequisites

| Optimized for | Description
|:--- |:---
| OS | Ubuntu* 20.04 (or newer)
| Hardware | Intel® Xeon® Scalable processor family
| Software | Intel® AI Analytics Toolkit (AI Kit)

## Key Implementation Details

The Bash script provided in this sample performs the following
configuration steps:

- Mounts the `/home` folder from host machine into the Docker
container. You can share files between the host machine and the
Docker container via the `/home` folder.
- Mounts the `/home` folder from host machine into the Docker container. You can share files between the host machine and the Docker container through the `/home` folder.

- Applies proxy settings from the host machine into the Docker
container.

- Uses the same IP addresses between the host machine and the Docker
container.
- Applies proxy settings from the host machine into the Docker container.

- Forwards ports 8888, 6006, 6543, and 12345 from the host machine to
the Docker container for some popular network services, such as
Jupyter notebook and TensorBoard.

- Uses the same IP addresses between the host machine and the Docker container.

## Run the Sample
- Forwards ports 8888, 6006, 6543, and 12345 from the host machine to the Docker container for some popular network services, such as Jupyter* Notebook and TensorFlow* TensorBoard.
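Together, these settings correspond roughly to the following `docker run` invocation. This is a sketch for orientation only; the `run_oneapi_docker.sh` script in the sample is the authoritative version, and the same-IP behavior would use `--net=host`, which makes the `-p` flags unnecessary.

```
# Share /home, pass host proxy settings through, and publish the ports
# used by Jupyter (8888), TensorBoard (6006), and other services.
docker run -it \
  -v /home:/home \
  -e http_proxy -e https_proxy -e no_proxy \
  -p 8888:8888 -p 6006:6006 -p 6543:6543 -p 12345:12345 \
  --name aikit_container \
  intel/oneapi-aikit
```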

## Run the `Intel® AI Analytics Toolkit (AI Kit) Container Getting Started` Sample

This sample uses a configuration script to automatically configure the
environment. This provides fast and less error prone setup. For
complete instructions for using the AI Kit containers see
the [Getting Started Guide](https://www.intel.com/content/www/us/en/develop/documentation/get-started-with-ai-linux/top/using-containers.html.)
This sample uses a configuration script to automatically configure the environment. This provides a fast and less error-prone setup. For complete instructions for using the AI Kit containers, see the [Getting Started Guide](https://www.intel.com/content/www/us/en/develop/documentation/get-started-with-ai-linux/top/using-containers.html).

To run the configuration script on Linux*, type the following command
in the terminal with [Docker](https://docs.docker.com/engine/install/)
installed:
### On Linux*

You must have [Docker](https://docs.docker.com/engine/install/)
installed.

1. Navigate to the directory with the IntelAIKitContainer sample and pull the oneapi-aikit docker image:

```
docker pull intel/oneapi-aikit
```
> Please apply the below command and login again if a permission denied error occurs.
```
sudo usermod -aG docker $USER
```

2. Use the `run_oneapi_docker.sh` Bash script to run the Docker image:
1. Open a terminal.
2. Change to the sample folder, and pull the oneapi-aikit Docker image.
```
docker pull intel/oneapi-aikit
```
>**Note**: If a permission denied error occurs, run the following command.
>```
>sudo usermod -aG docker $USER
>```

3. Run the Docker image using the `run_oneapi_docker.sh` Bash script.
```
./run_oneapi_docker.sh intel/oneapi-aikit
```

The script opens a Bash shell inside the Docker container.
> Note : Users could install additional packages by adding them into requirements.txt.
> Please copy the modified requirements.txt into /tmp folder, so the bash script will install those packages for you.

To create a new Bash session in the running container from outside
the Docker container, use the following:

> **Note**: Install additional packages by adding them to the requirements.txt file in the sample. Copy the modified requirements.txt into the /tmp folder so that the Bash script installs those packages for you.

To create a Bash session in the running container from outside the Docker container, enter a command similar to the following.
```
docker exec -it aikit_container /bin/bash
```

3. In the Bash shell inside the Docker container, activate the oneAPI
environment:

4. In the Bash shell inside the Docker container, activate the specialized environment.
```
source activate tensorflow
```

or

```
source activate pytorch
```

Now you can start using Intel® Optimization for TensorFlow* or Intel
Optimization for PyTorch inside the Docker container.

To verify the activated environment, navigate to the directory with
the IntelAIKitContainer sample and run the `version_check.py` script:

```bash
python version_check.py
```
You can start using Intel® Optimization for TensorFlow* or Intel® Optimization for PyTorch* inside the Docker container.
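To list every environment the image provides before activating one (conda is already on the PATH inside the container, as the `source activate` steps above rely on):

```
conda env list
```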

## Example of Output

Output from TensorFlow Environment
```
TensorFlow version: 2.6.0
MKL enabled : True
```

Output from PyTorch Environment
```
PyTorch Version: 1.8.0a0+37c1f4a
mkldnn : True, mkl : True, openmp : True
```



## Next Steps

Explore the [Get Started
Guide](https://www.intel.com/content/www/us/en/develop/documentation/get-started-with-ai-linux/top.html)
to find out how you can achieve performance gains for popular
deep-learning and machine-learning frameworks through Intel
optimizations.
>**Note**: You can verify the activated environment. Change to the directory with the IntelAIKitContainer sample and run the `version_check.py` script.
>```
>python version_check.py
>```

### Manage Docker* Images

You can install additional packages, upload the workloads via the
`/tmp` folder, and then commit your changes into a new Docker image,
for example, `intel/oneapi-aikit-v1`:


You can install additional packages, upload the workloads via the `/tmp` folder, and then commit your changes into a new Docker image, for example, `intel/oneapi-aikit-v1`.
```
docker commit -a "intel" -m "test" DOCKER_ID intel/oneapi-aikit-v1
```
>**Note**: Replace `DOCKER_ID` with the ID of your container. Use `docker ps` to get the DOCKER_ID of your Docker container.

**NOTE:** Replace `DOCKER_ID` with the ID of your container. Use
`docker ps` to get the DOCKER_ID of your Docker container.

You can then use the new image name to start Docker:

You can use the new image name to start Docker.
```
./run_oneapi_docker.sh intel/oneapi-aikit-v1
```

To save the Docker image as a tar file:

You can save the Docker image as a tar file.
```
docker save -o oneapi-aikit-v1.tar intel/oneapi-aikit-v1
```

To load the tar file on other machines:

You can load the tar file on other machines.
```
docker load -i oneapi-aikit-v1.tar
```
## Troubleshooting

### Docker Proxy

#### Ubuntu
For docker proxy related problem, you could follow below instructions to setup proxy for your docker client.
For Docker proxy-related problems, follow the instructions below to configure proxy settings for your Docker client.

1. Create a directory for the Docker service configurations.
```
sudo mkdir -p /etc/systemd/system/docker.service.d
```
2. Create a file called `proxy.conf` in the configuration directory.
```
sudo vi /etc/systemd/system/docker.service.d/proxy.conf
```
3. Add contents similar to the following to the `.conf` file. Change the values to match your environment.
```
[Service]
Environment="HTTP_PROXY=http://proxy-hostname:911/"
Environment="HTTPS_PROXY="http://proxy-hostname:911/
Environment="NO_PROXY="10.0.0.0/8,192.168.0.0/16,localhost,127.0.0.0/8,134.134.0.0/16"
```
4. Save your changes and exit the text editor.
5. Reload the daemon configuration.
```
sudo systemctl daemon-reload
```
6. Restart Docker to apply the changes.
```
sudo systemctl restart docker.service
```

## Example Output

### Output from TensorFlow* Environment

```
TensorFlow version: 2.6.0
MKL enabled : True
```

### Output from PyTorch* Environment

```
PyTorch Version: 1.8.0a0+37c1f4a
mkldnn : True, mkl : True, openmp : True
```

## License

Code samples are licensed under the MIT license. See
[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt)
for details.
[License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details.

Third party program Licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt)
Third party program Licenses can be found here: [third-party-programs.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/third-party-programs.txt).
@@ -1,6 +1,6 @@
{
"guid": "0F95DA9E-0A5D-4CF2-B791-885B09675004",
"name": "IntelAIKitContainer_GettingStarted",
"name": "Intel(R) AI Analytics Toolkit (AI Kit) Container Getting Started",
"categories": ["Toolkit/oneAPI AI And Analytics/AI Getting Started Samples"],
"description": "This sample illustrates how to utilize the oneAPI AI Kit container.",
"builder": ["cli"],