Characteristic sets: Accurate cardinality estimation for RDF queries with multiple joins

Thomas Neumann, Guido Moerkotte

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

187 Scopus citations

Abstract

Accurate cardinality estimates are essential for successful query optimization. This is true not only for relational DBMSs but also for RDF stores. An RDF database consists of a set of triples and, hence, can be seen as a relational database with a single table with three attributes. This makes RDF rather special in that queries typically contain many self joins. We show that relational DBMSs are not well prepared to perform cardinality estimation in this context. Further, there are hardly any special cardinality estimation methods for RDF databases. To overcome this lack of appropriate cardinality estimation methods, we introduce characteristic sets together with new cardinality estimation methods based upon them. We then show experimentally that the new methods are, in the RDF context, highly superior to the estimation methods employed by commercial DBMSs and by the open-source RDF store RDF-3X.

Original language: English
Title of host publication: 2011 IEEE 27th International Conference on Data Engineering, ICDE 2011
Pages: 984-994
Number of pages: 11
State: Published - 2011
Event: 2011 IEEE 27th International Conference on Data Engineering, ICDE 2011 - Hannover, Germany
Duration: 11 Apr 2011 to 16 Apr 2011

Publication series

Name: Proceedings - International Conference on Data Engineering
ISSN (Print): 1084-4627

Conference

Conference: 2011 IEEE 27th International Conference on Data Engineering, ICDE 2011
Country/Territory: Germany
City: Hannover
Period: 11/04/11 to 16/04/11
