Private copies2 #1296

Merged
merged 3 commits into from
Jan 16, 2023
Original file line number Diff line number Diff line change
@@ -130,6 +130,10 @@ A typical design flow may be to:
```
make report
```
+* Compile for simulation (fast compile time, targets simulated FPGA device, reduced data size):
+```
+make fpga_sim
+```
* Compile for FPGA hardware (longer compile time, targets FPGA device):
```
make fpga
@@ -167,6 +171,10 @@ A typical design flow may be to:
```
nmake report
```
+* Compile for simulation (fast compile time, targets simulated FPGA device, reduced data size):
+```
+nmake fpga_sim
+```
* Compile for FPGA hardware (longer compile time, targets FPGA device):
```
nmake fpga
@@ -188,7 +196,18 @@ On the main report page, scroll down to the section titled "Estimated Resource U
./private_copies.fpga_emu (Linux)
private_copies.fpga_emu.exe (Windows)
```
-2. Run the sample on the FPGA device:
+2. Run the sample on the FPGA simulator device:
+* On Linux
+```bash
+CL_CONTEXT_MPSIM_DEVICE_INTELFPGA=1 ./private_copies.fpga_sim
+```
+* On Windows
+```bash
+set CL_CONTEXT_MPSIM_DEVICE_INTELFPGA=1
+private_copies.fpga_sim.exe
+set CL_CONTEXT_MPSIM_DEVICE_INTELFPGA=
+```
+3. Run the sample on the FPGA device:
```
./private_copies.fpga (Linux)
private_copies.fpga.exe (Windows)
@@ -218,7 +237,7 @@ When run on the Intel® PAC with Intel Arria10® 10 GX FPGA hardware board

Setting the `private_copies` attribute to 0 (or equivalently omitting the attribute entirely) produced good throughput, and the reports show us that the compiler selected 3 private copies. This does produce the optimal throughput, but in this case it probably makes sense to save some area in exchange for a very small throughput loss by specifying 2 private copies instead.

-When run on the FPGA emulator, the `private_copies` attribute has no effect on kernel time. You may actually notice that the emulator achieved higher throughput than the FPGA in this example. This is because this trivial example uses only a tiny fraction of the spatial compute resources available on the FPGA.
+When run on the FPGA emulator or simulator, the `private_copies` attribute has no effect on kernel time. You may actually notice that the emulator achieved higher throughput than the FPGA in this example. This is because this trivial example uses only a tiny fraction of the spatial compute resources available on the FPGA.

## License

Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
set(SOURCE_FILE private_copies.cpp)
set(TARGET_NAME private_copies)
set(EMULATOR_TARGET ${TARGET_NAME}.fpga_emu)
+set(SIMULATOR_TARGET ${TARGET_NAME}.fpga_sim)
set(FPGA_TARGET ${TARGET_NAME}.fpga)

# FPGA board selection
@@ -24,6 +25,8 @@ endif()
# For this reason, FPGA backend flags must be passed as link flags in CMake.
set(EMULATOR_COMPILE_FLAGS "-fsycl -fintelfpga -Wall ${WIN_FLAG} -DFPGA_EMULATOR")
set(EMULATOR_LINK_FLAGS "-fsycl -fintelfpga")
+set(SIMULATOR_COMPILE_FLAGS "-fsycl -fintelfpga -Wall ${WIN_FLAG} -Xssimulation -DFPGA_SIMULATOR")
+set(SIMULATOR_LINK_FLAGS "-fsycl -fintelfpga -Xssimulation -Xstarget=${FPGA_DEVICE} ${USER_HARDWARE_FLAGS}")
set(HARDWARE_COMPILE_FLAGS "-fsycl -fintelfpga -Wall ${WIN_FLAG} -DFPGA_HARDWARE")
set(HARDWARE_LINK_FLAGS "-fsycl -fintelfpga -Xshardware -Xstarget=${FPGA_DEVICE} ${USER_HARDWARE_FLAGS}")
# use cmake -D USER_HARDWARE_FLAGS=<flags> to set extra flags for FPGA backend compilation
@@ -56,6 +59,20 @@ set_target_properties(${FPGA_EARLY_IMAGE} PROPERTIES COMPILE_FLAGS "${HARDWARE_C
set_target_properties(${FPGA_EARLY_IMAGE} PROPERTIES LINK_FLAGS "${HARDWARE_LINK_FLAGS} -fsycl-link=early")
# fsycl-link=early stops the compiler after RTL generation, before invoking Quartus®

+###############################################################################
+### FPGA Simulator
+###############################################################################
+# To compile in a single command:
+# icpx -fsycl -fintelfpga -Xssimulation -DFPGA_SIMULATOR -Xstarget=<FPGA_DEVICE> private_copies.cpp -o private_copies.fpga_sim
+# CMake executes:
+# [compile] icpx -fsycl -fintelfpga -Xssimulation -DFPGA_SIMULATOR -o private_copies.cpp.o -c private_copies.cpp
+# [link] icpx -fsycl -fintelfpga -Xssimulation -Xstarget=<FPGA_DEVICE> private_copies.cpp.o -o private_copies.fpga_sim
+add_executable(${SIMULATOR_TARGET} ${SOURCE_FILE})
+target_include_directories(${SIMULATOR_TARGET} PRIVATE ../../../../include)
+add_custom_target(fpga_sim DEPENDS ${SIMULATOR_TARGET})
+set_target_properties(${SIMULATOR_TARGET} PROPERTIES COMPILE_FLAGS "${SIMULATOR_COMPILE_FLAGS}")
+set_target_properties(${SIMULATOR_TARGET} PROPERTIES LINK_FLAGS "${SIMULATOR_LINK_FLAGS}")
+
###############################################################################
### FPGA Hardware
###############################################################################
Original file line number Diff line number Diff line change
@@ -15,8 +15,14 @@

using namespace sycl;

-constexpr size_t kSize = 8192;
+#if defined(FPGA_SIMULATOR)
+// Smaller size to keep the runtime reasonable
+constexpr size_t kSize = 512; //2^9
+constexpr size_t kMaxIter = 100;
+#else
+constexpr size_t kSize = 8192; //2^13
constexpr size_t kMaxIter = 50000;
+#endif
constexpr size_t kTotalOps = 2 * kMaxIter * kSize;
constexpr size_t kMaxValue = 128;
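
To make the effect of the reduced simulator sizes concrete, the following minimal sketch evaluates the `kTotalOps` formula for both configurations. The `Config` struct, the `TotalOps` helper, and the two named configurations are hypothetical and exist only for illustration; the constants and the formula are the ones shown in the hunk above.

```cpp
#include <cstddef>

// Hypothetical helper type (not part of the sample): bundles the two
// constants that the FPGA_SIMULATOR guard switches between.
struct Config {
  std::size_t size;      // mirrors kSize
  std::size_t max_iter;  // mirrors kMaxIter
};

// Same formula as kTotalOps in the sample: 2 * kMaxIter * kSize.
constexpr std::size_t TotalOps(Config c) { return 2 * c.max_iter * c.size; }

constexpr Config kSimulatorConfig{512, 100};   // values under FPGA_SIMULATOR
constexpr Config kDefaultConfig{8192, 50000};  // values for every other build

// Compile-time checks of the arithmetic.
static_assert(TotalOps(kSimulatorConfig) == 102400u, "2 * 100 * 512");
static_assert(TotalOps(kDefaultConfig) == 819200000u, "2 * 50000 * 8192");
```

Under these values the simulator build performs 102,400 operations per kernel run instead of 819,200,000, a factor of 8,000 less work, which is consistent with the "reduced data size" note in the README hunk.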

@@ -33,10 +39,12 @@ template <int num_copies> class Kernel;
template <int num_copies, bool first_call = false>
void SimpleMathWithShift(const IntArray &array, int shift, IntScalar &result) {

-#if FPGA_HARDWARE
-auto selector = sycl::ext::intel::fpga_selector_v;
+#if FPGA_SIMULATOR
+auto selector = sycl::ext::intel::fpga_simulator_selector_v;
+#elif FPGA_HARDWARE
+auto selector = sycl::ext::intel::fpga_selector_v;
#else // #if FPGA_EMULATOR
-auto selector = sycl::ext::intel::fpga_emulator_selector_v;
+auto selector = sycl::ext::intel::fpga_emulator_selector_v;
#endif

double kernel_time = 0.0;
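
After this change the preprocessor chain picks exactly one selector per build, with the emulator as the fallback. A minimal sketch of the same control flow without the SYCL dependency is shown below; `SelectedTarget` and the returned strings are hypothetical placeholders standing in for the selector objects, and the macros are assumed to arrive through the `-DFPGA_SIMULATOR` and `-DFPGA_HARDWARE` compile flags set in the CMake file.

```cpp
#include <string>

// Hypothetical stand-in for the selector choice in SimpleMathWithShift.
// Each branch names the selector the real code would use at that point.
inline std::string SelectedTarget() {
#if FPGA_SIMULATOR
  return "simulator";  // sycl::ext::intel::fpga_simulator_selector_v
#elif FPGA_HARDWARE
  return "hardware";   // sycl::ext::intel::fpga_selector_v
#else  // FPGA_EMULATOR, or no target macro defined at all
  return "emulator";   // sycl::ext::intel::fpga_emulator_selector_v
#endif
}
```

Because an undefined macro evaluates to 0 inside `#if`, a build that passes none of the target macros falls through to the emulator branch. Checking the simulator first also keeps the result well defined if both `FPGA_SIMULATOR` and `FPGA_HARDWARE` were ever set at once.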