Stardog query optimiser - Join ordering and cardinality estimations for graph queries

Pavel Klinov

Stardog is a commercial knowledge graph platform at the heart of which lies a graph database. It manages graph data as RDF and natively implements the SPARQL 1.1 graph query language. This talk will briefly present the general architecture of the query engine and then delve into the internals of the query optimiser, in particular graph statistics and cardinality estimation for graph patterns. It will also briefly discuss how reliable cardinality estimates are for different kinds of graph patterns and how that reliability relates to robust query execution.

Unlike some early SPARQL systems, Stardog is not built on top of a relational database. Nonetheless, the talk will highlight how it takes advantage of many foundational aspects of relational query optimisation, such as rewriting algebraic expressions, cost-based optimisation, and join planning. At the same time, some aspects of graph data, such as the lack of a rigid schema (for example, no foreign key constraints or column data types), present unique challenges for the query optimiser.

Pavel Klinov has led the query engine team at Stardog since 2011 (with a short academic break in 2012-2015 to work on an automated reasoning project at the University of Ulm). He has overseen the evolution of Stardog’s query engine from a very simple heuristic optimiser in 2011 to a sophisticated cost-based optimiser in 2023, where most inputs for the cost model come from cardinality estimations. His team is responsible both for query optimisation work and for implementing new query language features, such as recursive path queries. Prior to joining Stardog, he earned his PhD at the University of Manchester, UK, with a thesis on the performance of reasoning algorithms for probabilistic logic.