Past talks

May 07, 2021

Materialize: SQL Incremental View Maintenance

Frank McSherry (Materialize, Inc.)

While OLTP engines excel at maintaining invariants under transactional workloads, and OLAP engines excel at ad-hoc analytics, the relational database is not presently an excellent tool for maintaining the results of computations as data change. This space is currently occupied largely by microservices, bespoke tools that suffer from all the problems you might expect of systems that do not provide many of the ACID properties, and which anecdotally consume engineering departments with their maintenance overhead.

read more
May 07, 2021

Clonos: Consistent Causal Recovery for Highly-Available Streaming Dataflows

Pedro Silvestre (Imperial College London)

Stream processing lies in the backbone of modern businesses, being employed for mission critical applications such as real-time fraud detection, car-trip fare calculations, traffic management, and stock trading. Large-scale applications are executed by scale-out stream processing systems on thousands of long-lived operators, which are subject to failures.

read more
Apr 23, 2021

TED Learn: Towards Technology-Enabled Database Education

Sourav Bhowmick (NTU Singapore)

There is continuous demand for database-literate professionals in today’s market due to widespread usage of relational database management system (RDBMS) in the commercial world. Such commercial demand has played a pivotal role in the offering of database systems course as part of an undergraduate computer science (CS) degree program in major universities around the world. Furthermore, not all working adults dealing with RDBMS have taken an undergraduate database course. Hence, they often need to undergo on-the-job training or attend relevant courses in higher institutes of learning to acquire database literacy. Database courses in major institutes rely on textbooks, lectures, and off-the-shelf RDBMS to impart relevant knowledge such as how SQL queries are processed.

read more
Apr 23, 2021

PG-Keys: Keys for Property Graphs

George Fletcher (TU/e)

I report on a community effort between industry and academia to shape the future of property graph constraints. The standardization for a property graph query language is currently underway through the ISO Graph Query Language (GQL) project. Our position is that this project should pay close attention to schemas and constraints, and should focus next on key constraints.

read more
Apr 09, 2021

Factorization Matters in Large Graphs

Nikolay Yakovets (TU/e)

Evaluation of complex graph pattern queries on large graphs often leads to “explosion” of intermediate results (IR) which, in turn, considerably slows down query processing. In this talk, I will present WireFrame, our recent two-step factorization-based solution which aims to drastically reduce the IR during query processing.

read more
Apr 09, 2021

Aggregation Support for Modern Graph Analytics

Alin Deutsch (UC San Diego)

In this talk I will describe how GSQL, TigerGraph’s graph query language, supports the specification of aggregation in graph analytics. GSQL makes several unique design decisions with respect to both the expressive power, the semantics, and the evaluation complexity of the specified aggregation.

read more
Mar 26, 2021

LeanStore: A High-Performance Storage Engine for Modern Hardware

Viktor Leis (Friedrich Schiller University Jena)

LeanStore is a high-performance OLTP storage engine optimized for many-core CPUs and NVMe SSDs. The goal of the project is to achieve performance comparable to in-memory systems when the data set fits into RAM, while being able to fully exploit the bandwidth of fast NVMe SSDs for large data sets.

read more
Mar 26, 2021

Vectorized query processing over encrypted data with DuckDB and Intel SGX

Sam Ansmink (CWI, UvA, and VU)

Data confidentiality is an increasingly important requirement for customers outsourcing databases to the cloud. The common approach to achieve data confidentiality in this context is by using encryption. However, processing queries over encrypted data securely and efficiently remains an open issue. To this day, many different approaches to designing encrypted database management systems (EDBMS) have been suggested, for example by using homomorphic encryption or trusted execution environments such as Intel SGX.

read more
Mar 12, 2021

Three techniques for exploiting string compression in data systems

Peter Boncz (CWI & VU)

Actual data in real-life database often is in the form of strings. Strings take significantly more volume than fixed-size data, causing I/O, network traffic, memory traffic and cache traffic. Further, operations on strings tend to be significantly more expensive than operations on e.g. integers, which CPUs do support quite efficiently (let alone GPUs, TPUs - which even do not acknowledge the existense of string data).

read more
Mar 12, 2021

OtterTune : An Automatic Database Configuration Tuning Service

Andy Pavlo (Carnegie Mellon University)

Database management systems (DBMS) expose dozens of configurable knobs that control their runtime behavior. Setting these knobs correctly for an application’s workload can improve the performance and efficiency of the DBMS. But such tuning requires considerable efforts from experienced administrators, which is not scalable for large DBMS fleets. This problem has led to research on using machine learning (ML) to devise strategies to optimize DBMS knobs for any application automatically.

read more
Feb 26, 2021

Integrating Columnar Techniques and Factorization into GraphflowDB

Semih Salihoglu (University of Waterloo)

Graph database management systems (GDBMSs) in contemporary jargon refer to systems that adopt the property graph model and often power applications such as fraud detection and recommendations that require very fast joins of records that represent many-to-many relationships, often beyond the performance that existing relational systems generally provide. In this talk, I will give an overview of GraphflowDB, which is an in-memory GDBMS we are developing at University of Waterloo.

read more
Feb 26, 2021

1000 days of DuckDB - The Pareto Principle still holds

Hannes Mühleisen (CWI)

It has been almost 1000 days since we started work on DuckDB. In this talk, we reflect on the process and revisit the initial goals. We also describe major additions to the system such as automatic query parallelization.

read more