The Dutch Seminar
on Data Systems Design

An initiative to bring together research groups working on data systems in Dutch universities and research institutes.

Fridays 4–5:30 pm
bi-weekly

We hold bi-weekly talks on Fridays from 4:00 PM to 5:30 PM CET for and by researchers and practitioners designing (and implementing) data systems. The objective is to establish a new forum for the Dutch Data Systems community to come together, foster collaborations between its members, and bring in high-quality international speakers. We invite all researchers working on related topics, especially PhD students, to join the events. It is an excellent opportunity to receive early feedback from researchers in your field.

Upcoming talks

October 1, 2021, 4:00PM-5:30PM (CET)

12th Seminar

The 12th seminar of DSDSD will feature a talk by
Konstantinos Karanasos (Microsoft's Gray Systems Lab - Azure Data's applied research group).

Oct 01, 2021

Optimizing machine learning prediction queries and beyond on modern data engines

Konstantinos Karanasos (Microsoft's Gray Systems Lab - Azure Data's applied research group)

Prediction queries are widely used across industries to perform advanced analytics and draw insights from data. They include a data processing part (e.g., for joining, filtering, cleaning, featurizing the datasets) and a machine learning (ML) part invoking one or more trained models to perform predictions. These parts have so far been optimized in isolation, leaving significant opportunities for optimization unexplored.

Past talks

Jul 16, 2021

Data-Intensive Systems in the Microsecond Era

Pinar Tozun (ITU Copenhagen)

The late 2000s and early 2010s saw the rise of data-intensive systems optimized for in-memory execution. Today, it has become increasingly clear that just optimizing for main memory is neither economically viable nor strictly necessary for high performance. Modern SSDs, such as Z-NAND and Optane, can access data at a latency of around 10 microseconds.

Jul 16, 2021

Charting the Design Space of Query Execution using VOILA

Tim Gubner (CWI)

Database architecture, while having been studied for four decades now, has delivered only a few designs with well-understood properties. These few are followed by most actual systems. Acquiring more knowledge about the design space is a very time-consuming process that requires manually crafting prototypes with a low chance of generating material insight.

Jul 02, 2021

DuckDQ: Data Quality Validation for Machine Learning Pipelines

Till Döhmen (RWTH Aachen University and Fraunhofer FIT)

Data quality validation plays an important role in ensuring the correct behaviour of productive machine learning (ML) applications and services. Observing a lack of existing solutions for quality control in medium-sized production ML systems, we developed DuckDQ: A lightweight and efficient Python library for protecting machine learning pipelines from data errors.

