The Dutch Seminar
on Data Systems Design

An initiative to bring together research groups working on data systems in Dutch universities and research institutes.

Fridays4–5:30 pm

We hold bi-weekly talks on Fridays from 4:00 PM to 5:30 PM CET for and by researchers and practitioners designing (and implementing) data systems. The objective is to establish a new forum for the Dutch Data Systems community to come together, foster collaborations between its members, and bring in high quality international speakers. We would like to invite all researchers, especially also PhD students, who are working on related topics to join the events. It is an excellent opportunity to receive feedback early on by researchers in your field.

Upcoming talks

October 15, 2021, 4:00PM-5:30PM (CET)

13th Seminar

The 13th seminar of DSDSD will feature a talk by
Raul Castro Fernandez (University of Chicago)
, as well as an ad-hoc panel on data discovery.

read more
Oct 15, 2021

Data Stations : Combining Data, Compute, and Market Forces

Raul Castro Fernandez (University of Chicago)

In this talk, I will present preliminary work on a new architecture (Data Station) to facilitate data sharing within and across organizations. Data Stations depart from modern data lakes in that both data and derived data products, such as machine learning models, are sealed and cannot be directly seen, accessed, or downloaded by anyone.

Oct 15, 2021

Ad-hoc panel on data discovery

Discovering datasets for data scientists: how can we enable the discovery of relevant datasets, as well as the relationships among datasets in a huge data repository? Dataset discovery is the basis on which we can build data augmentation, data integration, building better and more accurate ML models.

Past talks

Oct 01, 2021

Optimizing machine learning prediction queries and beyond on modern data engines

Konstantinos Karanasos (Microsoft's Gray Systems Lab - Azure Data's applied research group)

Prediction queries are widely used across industries to perform advanced analytics and draw insights from data. They include a data processing part (e.g., for joining, filtering, cleaning, featurizing the datasets) and a machine learning (ML) part invoking one or more trained models to perform predictions. These parts have so far been optimized in isolation, leaving significant opportunities for optimization unexplored.

read more
Oct 01, 2021

Optimisation of Inference Queries

Ziyu Li (TU Delft)

The wide adoption of machine learning (ML) in diverse application domains is resulting in an explosion of available models described by, and stored in model repositories. In application contexts where inference needs are dynamic and subject to strict execution constraints – such as in video processing – the manual selection of an optimal set of models from a large model repository is a nearly impossible task and practitioners typically settle for models with a good average accuracy and performance.

read more
Jul 16, 2021

Data-Intensive Systems in the Microsecond Era

Pinar Tozun (ITU Copenhagen)

Late 2000s and early 2010s have seen the rise of data-intensive systems optimized for in-memory execution. Today, it has been increasingly clear that just optimizing for main memory is neither economically viable nor strictly necessary for high performance. Modern SSDs, such as Z-NAND and Optane, can access data at a latency of around 10 microseconds.

read more

Tweets by @DSDSDNL