
Computing DTWs on CPU, GPU and FPGA with SYCL

EasyChair Preprint 14488, version 2

12 pages. Date: September 13, 2024

Abstract

One of the most time-consuming kernels of an epileptic seizure detection application is the computation of the Dynamic Time Warping (DTW) Distance Matrix. This kernel is a good candidate for heterogeneous CPU/GPU/FPGA execution. In this paper, we explore the design space of heterogeneous CPU, GPU, and FPGA implementations of this kernel. We start by optimizing the CPU implementation of the DTW Distance Matrix computation with the latest C++26 SIMD library and compare it with a SYCL implementation for CPU that also exploits the SIMD units. Next, we take advantage of the portability of SYCL to run the code on an on-chip GPU (iGPU) as well as on a discrete NVIDIA GPU (dGPU). Finally, we present the SYCL implementation of the kernel on an Intel Stratix 10 MX FPGA. Our evaluation suggests that SYCL is well suited to exploiting the SIMD capabilities of modern CPU cores, and it also shows promising results on the accelerators considered in this work.
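For readers unfamiliar with the kernel, the DTW Distance Matrix holds the pairwise DTW distances between a set of time series. The paper's SIMD and SYCL kernels are not reproduced on this page; the following is only a minimal scalar C++ sketch of the textbook recurrence. It assumes an absolute-difference local cost and no warping window, and the names `dtw_distance` and `dtw_distance_matrix` are hypothetical, chosen for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Textbook O(n*m) DTW distance between two series (assumed formulation:
// absolute-difference local cost, no warping window). Only two rows of the
// cumulative-cost table are kept at any time.
double dtw_distance(const std::vector<double>& a, const std::vector<double>& b) {
    const std::size_t n = a.size(), m = b.size();
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<double> prev(m + 1, inf), curr(m + 1, inf);
    prev[0] = 0.0;  // empty prefix aligned with empty prefix costs nothing
    for (std::size_t i = 1; i <= n; ++i) {
        curr[0] = inf;  // a non-empty prefix cannot align with an empty one
        for (std::size_t j = 1; j <= m; ++j) {
            const double cost = std::abs(a[i - 1] - b[j - 1]);
            // Cheapest of the three allowed predecessor cells.
            curr[j] = cost + std::min({prev[j], curr[j - 1], prev[j - 1]});
        }
        std::swap(prev, curr);
    }
    return prev[m];
}

// Pairwise DTW distance matrix. With a symmetric local cost the matrix is
// symmetric, so only the upper triangle is computed and then mirrored.
std::vector<std::vector<double>> dtw_distance_matrix(
    const std::vector<std::vector<double>>& series) {
    const std::size_t k = series.size();
    std::vector<std::vector<double>> d(k, std::vector<double>(k, 0.0));
    for (std::size_t i = 0; i < k; ++i) {
        for (std::size_t j = i + 1; j < k; ++j) {
            d[i][j] = d[j][i] = dtw_distance(series[i], series[j]);
        }
    }
    return d;
}
```

Each matrix entry is independent of the others, which is one reason a kernel of this shape lends itself to parallel execution across CPU cores, GPU work-items, or FPGA pipelines.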

Keyphrases: DTW, FPGA, GPU, SIMD, SYCL, energy efficiency, heterogeneous architecture

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:14488,
  author    = {Cristian Campos and Rafael Asenjo and Javier Hormigo and Angeles Navarro},
  title     = {Computing DTWs on CPU, GPU and FPGA with SYCL},
  howpublished = {EasyChair Preprint 14488},
  year      = {EasyChair, 2024}}