TY - GEN
T1 - Dynamic load balancing in data grids by global load estimation
AU - Rupprecht, Lukas
AU - Reiser, Angelika
AU - Kemper, Alfons
PY - 2012
Y1 - 2012
AB - Peer-to-Peer (P2P) technology can be utilized to combine remote resources and build distributed, high-performance database systems, called data grids, which help to handle the rapidly increasing volumes of data produced by disciplines like astrophysics, biology, or geology. One major challenge for data grids is skewed query patterns, which cause load imbalances and heavily diminish performance and availability. To avoid hot spots, sophisticated load balancing techniques are required. We present a dynamic replication strategy which prevents hot spots by dynamically replicating the hot data to different locations. The main questions of such a strategy are when to copy which data to what receivers and when to delete the copies. To answer these questions, we propose a low-overhead, decentralized method which is able to deliver a highly accurate estimate of the global load and the single peer loads to all clients. We use that information in an optimization problem to determine the data to be replicated and the optimal replica receivers. A simulated performance evaluation based on a real-world scenario demonstrates the effectiveness of the approach.
KW - data grids
KW - dynamic replication
KW - load balancing
UR - http://www.scopus.com/inward/record.url?scp=84870762262&partnerID=8YFLogxK
U2 - 10.1109/ISPDC.2012.40
DO - 10.1109/ISPDC.2012.40
M3 - Conference contribution
AN - SCOPUS:84870762262
SN - 9780769548050
T3 - Proceedings - 2012 11th International Symposium on Parallel and Distributed Computing, ISPDC 2012
SP - 243
EP - 250
BT - Proceedings - 2012 11th International Symposium on Parallel and Distributed Computing, ISPDC 2012
T2 - 2012 11th International Symposium on Parallel and Distributed Computing, ISPDC 2012
Y2 - 25 June 2012 through 29 June 2012
ER -