The Dutch Seminar
on Data Systems Design

An initiative to bring together research groups working on data systems in Dutch universities and research institutes.

Fridays 4–5:30 pm

We hold bi-weekly talks on Fridays from 4:00 PM to 5:30 PM CET for and by researchers and practitioners who design (and implement) data systems. The objective is to establish a new forum for the Dutch Data Systems community to come together, foster collaborations between its members, and bring in high-quality international speakers. We invite all researchers working on related topics, especially PhD students, to join the events. It is an excellent opportunity to receive early feedback from researchers in your field.

Upcoming talks

July 2, 2021, 4:00PM-5:30PM (CET)

10th Seminar

The 10th seminar of DSDSD will feature talks by
Arun Kumar (UCSD) and
Till Döhmen (RWTH Aachen University and Fraunhofer FIT).

Jul 02, 2021

Multi-Query Optimizations for Deep Learning Systems

Arun Kumar (UCSD)

Deep learning (DL) is growing in popularity for many advanced data analytics applications in enterprise, Web, scientific, and other domains. Naturally, resource efficiency of DL systems and the productivity of their users are pressing challenges to democratizing DL. In this talk, I present a new technical direction from my research that tackles such challenges with a database-inspired lens: higher-level specification and multi-query optimization (MQO).

Jul 02, 2021

DuckDQ: Data Quality Validation for Machine Learning Pipelines

Till Döhmen (RWTH Aachen University and Fraunhofer FIT)

Data quality validation plays an important role in ensuring the correct behaviour of production machine learning (ML) applications and services. Observing a lack of existing solutions for quality control in medium-sized production ML systems, we developed DuckDQ, a lightweight and efficient Python library for protecting machine learning pipelines from data errors.

Past talks

Jun 18, 2021

Teseo and the Analysis of Structural Dynamic Graphs

Dean De Leo (CWI)

Teseo is a new system for the storage and analysis of dynamic structural graphs in main memory, with the addition of transactional support. It introduces a novel design based on sparse arrays (large arrays interleaved with gaps) and a fat tree, where the graph is ultimately stored. Our design contrasts with earlier systems for the analysis of dynamic graphs, which often lack transactional support and are anchored to a vertex table as a primary index.

Jun 18, 2021

MxTasks: How to Make Efficient Synchronization and Prefetching Easy

Jens Teubner (TU Dortmund)

The hardware environment has changed rapidly in recent years: many cores, multiple sockets, and large amounts of main memory have become a commodity. To benefit from these highly parallel systems, software has to be adapted. Sophisticated latch-free data structures and algorithms are often meant to address the situation.

Jun 04, 2021

Evaluating Matching Techniques with Valentine

Christos Koutras (Delft University of Technology)

Data scientists today search large data lakes to discover and integrate datasets. In order to bring together disparate data sources, dataset discovery methods rely on some form of schema matching: the process of establishing correspondences between datasets. Traditionally, schema matching has been used to find matching pairs of columns between a source and a target schema. However, the use of schema matching in dataset discovery methods differs from its original use.
