TY - GEN
T1 - Algebraic Query Optimization for Distributed Top-k Queries
AU - Neumann, Thomas
AU - Michel, Sebastian
N1 - Publisher Copyright:
© 2007 Gesellschaft fur Informatik (GI). All rights reserved.
PY - 2007
Y1 - 2007
N2 - Distributed top-k query processing is increasingly becoming an essential functionality in a large number of emerging application classes. This paper addresses the efficient algebraic optimization of top-k queries in wide-area distributed data repositories where the index lists for the attribute values (or text terms) of a query are distributed across a number of data peers and the computational costs include network latency, bandwidth consumption, and local peer work. We use a dynamic programming approach to find the optimal execution plan using compact data synopses for selectivity estimation that is the basis for our cost model. The optimized query is executed in a hierarchical way involving a small and fixed number of communication phases. We have performed experiments on real web data that show the benefits of distributed top-k query optimization both in network resource consumption and query response time.
AB - Distributed top-k query processing is increasingly becoming an essential functionality in a large number of emerging application classes. This paper addresses the efficient algebraic optimization of top-k queries in wide-area distributed data repositories where the index lists for the attribute values (or text terms) of a query are distributed across a number of data peers and the computational costs include network latency, bandwidth consumption, and local peer work. We use a dynamic programming approach to find the optimal execution plan using compact data synopses for selectivity estimation that is the basis for our cost model. The optimized query is executed in a hierarchical way involving a small and fixed number of communication phases. We have performed experiments on real web data that show the benefits of distributed top-k query optimization both in network resource consumption and query response time.
UR - http://www.scopus.com/inward/record.url?scp=85135863017&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85135863017
T3 - Lecture Notes in Informatics (LNI), Proceedings - Series of the Gesellschaft fur Informatik (GI)
SP - 324
EP - 343
BT - Datenbanksysteme in Business, Technologie und Web, BTW 2007, 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme", DBIS 2007, Proceedings
A2 - Kemper, Alfons
A2 - Schoning, Harald
A2 - Rose, Thomas
A2 - Jarke, Matthias
A2 - Seidl, Thomas
A2 - Brochhaus, Christoph
PB - Gesellschaft fur Informatik (GI)
T2 - Datenbanksysteme in Business, Technologie und Web, BTW 2007, 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme", DBIS 2007 - Database Systems for Business, Technology and Web, BTW 2007, 12th Conference of the GI Division "Databases and Information Systems", DBIS 2007
Y2 - 7 March 2007 through 9 March 2007
ER -