DSDSD - Upcoming Talks

Evaluating CXL Memory Performance

Marcel Weisgut (HPI)

The Compute Express Link (CXL) standard enables new forms of memory management and access across devices and servers. Built on PCIe, it provides cache-coherent access to remote memory. This widens the design space for database systems by expanding the available memory beyond the memory local to the CPU. To use CXL-attached memory efficiently, data systems must make deliberate decisions about data placement and management. In this work, we provide an in-depth analysis of database operation performance with data interleaved across multiple CXL memory devices. We evaluate memory access performance for basic access patterns, as well as the performance impact of placing data across multiple CXL memory devices for in-memory column scans and in-memory B+tree operations.
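
To make the idea of interleaving data across several memory devices concrete, here is a minimal sketch of round-robin placement. It is an illustration only: the 4 KiB granularity, the round-robin policy, and the function names `device_for_offset` and `bytes_per_device` are assumptions made for this example and are not taken from the talk.

```python
from collections import Counter
from typing import Dict


def device_for_offset(offset: int, num_devices: int, granularity: int = 4096) -> int:
    """Return the device index holding the byte at `offset`.

    Assumes round-robin interleaving at a fixed granularity (hypothetical policy).
    """
    return (offset // granularity) % num_devices


def bytes_per_device(scan_bytes: int, num_devices: int, granularity: int = 4096) -> Dict[int, int]:
    """Count how many bytes of a sequential scan fall on each device."""
    counts: Counter = Counter()
    offset = 0
    while offset < scan_bytes:
        # Process one interleaving chunk (or the remaining tail) at a time.
        chunk = min(granularity, scan_bytes - offset)
        counts[device_for_offset(offset, num_devices, granularity)] += chunk
        offset += chunk
    return dict(counts)


if __name__ == "__main__":
    # A sequential scan over ten 4 KiB chunks spread across four devices.
    print(bytes_per_device(10 * 4096, num_devices=4))
```

Under this assumed policy, a sequential column scan touches all devices in turn, while a small random lookup touches only the device that owns the accessed chunk.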

Marcel Weisgut is a PhD student at the Hasso Plattner Institute, specializing in data management utilizing modern hardware in the Data Engineering Systems group led by Tilmann Rabl. He received his master’s degree from HPI in 2021, focusing on in-memory data management in Hasso Plattner’s research group. During his master’s studies, he contributed to the columnar open-source in-memory database system Hyrise and interned with the SAP HANA development team at SAP Labs Korea. His current research focuses on utilizing memory attached to a CPU via the cache-coherent interconnect Compute Express Link (CXL) for database systems.

Alsatian - Optimizing Model Search for Deep Transfer Learning

Nils Strassenburg (HPI)

Transfer learning is an effective technique for tuning a deep learning model when training data or computational resources are limited. Instead of training a new model from scratch, the parameters of an existing “base model” are adjusted for a new task. The accuracy of such a fine-tuned model depends on choosing an appropriate base model. Model search automates the selection of such a base model by evaluating the suitability of candidate models for a specific task. This entails inference with each candidate model on task-specific data. With thousands of models available through model stores, the computational cost of model search is a major bottleneck for efficient transfer learning. In this work, we present Alsatian, a novel model search system. Based on the observation that many candidate models overlap to a significant extent and based on a careful bottleneck analysis, we propose optimization techniques that are applicable to many model search frameworks. These optimizations include: (i) splitting models into individual blocks that can be shared across models, (ii) caching of intermediate inference results and model blocks, and (iii) selecting a beneficial search order for models to maximize sharing of cached results. In our evaluation on state-of-the-art deep learning models from computer vision and natural language processing, we show that Alsatian outperforms baselines by up to ~14×.
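
The three optimizations named above can be illustrated with a toy sketch. This is not Alsatian's implementation: the blocks are plain arithmetic functions standing in for model layers, and the names `BLOCKS`, `MODELS`, and `search` are hypothetical. Sorting by block sequence is a simple stand-in for choosing a search order that keeps models with shared prefixes close together.

```python
from typing import Callable, Dict, Tuple

# Hypothetical toy blocks: each maps a number to a number, standing in for model layers.
BLOCKS: Dict[str, Callable[[float], float]] = {
    "stem": lambda v: v + 1.0,
    "mid": lambda v: v * 2.0,
    "head_a": lambda v: v - 3.0,
    "head_b": lambda v: v * v,
}

# Each candidate model is a sequence of block names; models with a common prefix share work.
MODELS: Dict[str, Tuple[str, ...]] = {
    "model_a": ("stem", "mid", "head_a"),
    "model_b": ("stem", "mid", "head_b"),
    "model_c": ("stem", "head_b"),
}


def search(models, blocks, x: float):
    """Run every model on input x, reusing cached intermediate results.

    Returns the per-model outputs and the number of block executions performed.
    """
    cache: Dict[Tuple[str, ...], float] = {}  # block-name prefix -> intermediate result
    executed = 0
    outputs: Dict[str, float] = {}
    # Sorting by block sequence places models with common prefixes next to each other.
    for name, seq in sorted(models.items(), key=lambda kv: kv[1]):
        # Find the longest prefix of this model whose result is already cached.
        start = len(seq)
        while start > 0 and seq[:start] not in cache:
            start -= 1
        value = cache[seq[:start]] if start > 0 else x
        # Execute only the remaining blocks and cache each new intermediate result.
        for i in range(start, len(seq)):
            value = blocks[seq[i]](value)
            executed += 1
            cache[seq[: i + 1]] = value
        outputs[name] = value
    return outputs, executed


if __name__ == "__main__":
    outputs, executed = search(MODELS, BLOCKS, x=2.0)
    naive = sum(len(seq) for seq in MODELS.values())
    print(outputs, f"block executions: {executed} (naive: {naive})")
```

In this toy setting the cache is unbounded, so the order of the models does not change the number of block executions. Ordering starts to matter once the cache is limited in size and entries have to be evicted.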

Nils is a PhD student in the Database Group at the Hasso Plattner Institute (HPI) in Potsdam, under the supervision of Tilmann Rabl. His research focuses on ML systems, particularly ML model management and search. In addition to his research, he contributes to the lecture on big data systems, leads seminars on ML systems, and supervises master’s theses. Before starting his PhD, he earned a master’s degree in IT-Systems Engineering from HPI and a bachelor’s degree in Computer Science from the University of Hamburg. As part of his studies, he completed a six-month internship at SAP Labs France in Sophia Antipolis and spent a semester at ETH Zurich.

32nd Edition Seminar