Patrick Damme, Matthias Boehm
Enabling Integrated Data Analysis Pipelines on Heterogeneous Hardware through Holistic Extensibility
2nd Workshop on Novel Data Management Ideas on Heterogeneous Hardware Architectures (NoDMC)
In this talk, we propose holistic extensibility for IDA pipelines to handle increasing specialization, ranging from operators for heterogeneous hardware, through the often co-designed data representations, to the corresponding optimization and scheduling techniques. We sketch the extensibility design of DAPHNE, which offers users great benefits while requiring low effort.
Chloe Alverti, Vasileios Karakostas, Nikhita Kunati, Georgios Goumas, Michael Swift
DaxVM: Stressing the Limits of Memory as a File Interface
MICRO 2022 – 55th IEEE/ACM International Symposium on Microarchitecture
We analyse the high overhead of the virtual memory operations involved in memory-mapped file I/O and propose DaxVM, an optimized POSIX-relaxed interface that provides byte-addressable, high-performance storage.
Patrick Damme, Marius Birkenbach, Constantinos Bitsakos, Matthias Boehm, Philippe Bonnet, Florina Ciorba, Mark Dokter, Pawel Dowgiallo, Ahmed Eleliemy, Christian Faerber, Georgios Goumas, Dirk Habich, Niclas Hedam, Marlies Hofer, Wenjun Huang, Kevin Innerebner, Vasileios Karakostas, Roman Kern, Tomaž Kosar, Alexander Krause, Daniel Krems, Andreas Laber, Wolfgang Lehner, Eric Mier, Marcus Paradies, Bernhard Peischl, Gabrielle Poerwawinata, Stratos Psomadakis, Tilmann Rabl, Piotr Ratuszniak, Pedro Silva, Nikolai Skuppin, Andreas Starzacher, Benjamin Steinwender, Ilin Tolovski, Pınar Tözün, Wojciech Ulatowski, Aristotelis Vontzalidis, Yuanyuan Wang, Izajasz Wrosz, Aleš Zamuda, Ce Zhang, Xiao Xiang Zhu
DAPHNE: An Open and Extensible System Infrastructure for Integrated Data Analysis Pipelines
Conference on Innovative Data Systems Research, CIDR, 2022
We describe the overall architecture and key design decisions of the DAPHNE system infrastructure as an open and extensible system for integrated data analysis pipelines, comprising query processing, ML, and HPC. This is the central publication to cite when referring to the DAPHNE project as a whole.
Quentin Guilloteau, Jonas H. Müller Korndörfer, Florina M. Ciorba
Seamlessly Scaling Applications with DAPHNE
COMPAS 2024
We present ongoing work on seamlessly scaling applications with DAPHNE. After describing the components of the distributed DAPHNE runtime, we compare a DaphneDSL implementation of the Connected Components algorithm against Python, Julia, and C++ implementations along several dimensions: external dependencies, effort to adapt the code for parallel and distributed execution, and performance.
Ahmed Eleliemy, Florina M. Ciorba
DaphneSched: A Scheduler for Integrated Data Analysis Pipelines
ISPDC 2023
DaphneSched provides a wide range of scheduling schemes for task partitioning and assignment, including self-scheduling and work-stealing. We show that the number of workers and the layout of the work queues have a significant impact on the performance achievable with DaphneSched.
Jonas H. Müller Korndörfer, Ahmed Eleliemy, Osman Seckin Simsek, Thomas Ilsche, Robert Schöne, Florina M. Ciorba
How Do OS and Application Schedulers Interact? An Investigation with Multithreaded Applications
Euro-Par 2023, 29th International European Conference on Parallel and Distributed Computing
This work investigates the interaction between OS-level and application thread-level scheduling to explain and quantify their precise roles in application and system performance.
Ahmed Eleliemy, Florina M. Ciorba
A Resourceful Coordination Approach for Multilevel Scheduling
International Conference on High Performance Computing & Simulation, HPCS, 2021
We propose a resourceful coordination approach (RCA) that allows application schedulers to cooperate by involving the batch scheduler. We implement the proposed approach in a two-level simulation using realistic and well-known simulators and evaluate it using the effective system performance (ESP) benchmark.
Aleš Zamuda, Mark Dokter
DAPHNE Computational Intelligence on EuroHPC Vega for Benchmarking Randomised Optimisation Algorithms
CobCom 2024
This paper presents a deployment of DAPHNE on EuroHPC Vega that runs randomized optimization algorithm (ROA) tasks with Slurm, and discusses an example ROA benchmarking scenario using the HappyCat function.
Nina Ihde, Paula Marten, Ahmed Eleliemy, Gabrielle Poerwawinata, Pedro Silva, Ilin Tolovski, Florina M. Ciorba, Tilmann Rabl
A Survey of Big Data, High Performance Computing, and Machine Learning Benchmarks
Technology Conference on Performance Evaluation & Benchmarking, TPC, published by Springer, 2021
We discuss the state of the art of BD, HPC, and ML benchmarks and summarize a representative selection of the classic and most widely used ones. We classify them in a feature space composed of purpose, stage, metric, and convergence, as well as from the perspective of a proposed integrated data analysis architecture.
Dirk Habich, Johannes Pietrzyk
SIMDified Data Processing – Foundations, Abstraction, and Advanced Techniques
SIGMOD Conference Companion 2024
This tutorial provided attendees with an opportunity to gain insights into the evolving topic of SIMDified data processing. It was designed with a database audience in mind, while taking into account that most attendees probably have a medium level of knowledge about SIMD.
Dirk Habich, Alexander Krause, Johannes Pietrzyk, Christian Faerber, Wolfgang Lehner
Simplicity done right for SIMDified query processing on CPU and FPGA
ACM SIGMOD/PODS
We show a combination of SIMD abstraction libraries to seamlessly port standard templated C++ SIMD host code to the FPGA without the necessity of complex FPGA-specific programming.