TY - GEN
T1 - Dynamic workload management for very large data warehouses - Juggling feathers and bowling balls
AU - Krompass, Stefan
AU - Kuno, Harumi
AU - Dayal, Umeshwar
AU - Kemper, Alfons
N1 - Publisher Copyright:
Copyright 2007 VLDB Endowment, ACM.
PY - 2007
Y1 - 2007
N2 - Workload management for business intelligence (BI) queries poses different challenges than those addressed in the online transaction processing (OLTP) context. The fundamental problem is that the execution times of BI queries can range from milliseconds to hours, and it is difficult to estimate these times accurately. Key challenges raised by this problem are how to identify queries that are not performing properly and what to do about them. We propose here a workload management system for controlling the execution of individual queries based on realistic customer service level objectives. In order to validate our proposal, we have implemented an experimental system that includes a dynamic execution controller that leverages fuzzy logic. We present results from a number of experiments that we ran using workloads based on actual industrial workloads and customer objectives that we gathered by interviewing industry practitioners. Our experiments show that even a handful of moderately mis-behaving problem queries can have a significant impact on a workload consisting of thousands of queries. We were surprised when our experiments also demonstrated that false positives - incorrectly identifying a normal query as a problem - can also have significant consequences. For those reasons, it is very important that an execution controller be as accurate as possible - avoiding both false positives and false negatives. Our experiments also validate that our execution controller can markedly improve the execution of a workload that includes problem queries.
AB - Workload management for business intelligence (BI) queries poses different challenges than those addressed in the online transaction processing (OLTP) context. The fundamental problem is that the execution times of BI queries can range from milliseconds to hours, and it is difficult to estimate these times accurately. Key challenges raised by this problem are how to identify queries that are not performing properly and what to do about them. We propose here a workload management system for controlling the execution of individual queries based on realistic customer service level objectives. In order to validate our proposal, we have implemented an experimental system that includes a dynamic execution controller that leverages fuzzy logic. We present results from a number of experiments that we ran using workloads based on actual industrial workloads and customer objectives that we gathered by interviewing industry practitioners. Our experiments show that even a handful of moderately mis-behaving problem queries can have a significant impact on a workload consisting of thousands of queries. We were surprised when our experiments also demonstrated that false positives - incorrectly identifying a normal query as a problem - can also have significant consequences. For those reasons, it is very important that an execution controller be as accurate as possible - avoiding both false positives and false negatives. Our experiments also validate that our execution controller can markedly improve the execution of a workload that includes problem queries.
UR - http://www.scopus.com/inward/record.url?scp=70349226991&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:70349226991
T3 - 33rd International Conference on Very Large Data Bases, VLDB 2007 - Conference Proceedings
SP - 1105
EP - 1115
BT - 33rd International Conference on Very Large Data Bases, VLDB 2007 - Conference Proceedings
A2 - Gehrke, Johannes
A2 - Koch, Christoph
A2 - Garofalakis, Minos
A2 - Aberer, Karl
A2 - Kanne, Carl-Christian
A2 - Neuhold, Erich J.
A2 - Ganti, Venkatesh
A2 - Klas, Wolfgang
A2 - Chan, Chee-Yong
A2 - Srivastava, Divesh
A2 - Florescu, Dana
A2 - Deshpande, Anand
PB - Association for Computing Machinery, Inc
T2 - 33rd International Conference on Very Large Data Bases, VLDB 2007
Y2 - 23 September 2007 through 27 September 2007
ER -