TY - GEN
T1 - Minimizing the Communication Cost of Aggregation in Publish/Subscribe Systems
AU - Pandey, Navneet Kumar
AU - Zhang, Kaiwen
AU - Weiss, Stephane
AU - Jacobsen, Hans Arno
AU - Vitenberg, Roman
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2015/7/22
Y1 - 2015/7/22
N2 - Modern applications for distributed publish/subscribe systems often require stream aggregation capabilities along with rich data filtering. When compared to other distributed systems, aggregation in pub/sub differentiates itself as a complex problem which involves dynamic dissemination paths that are difficult to predict and optimize for a priori, temporal fluctuations in publication rates, and the mixed presence of aggregated and non-aggregated workloads. In this paper, we propose a formalization for the problem of minimizing communication traffic in the context of aggregation in pub/sub. We present a solution to this minimization problem by using a reduction to the well-known problem of minimum vertex cover in a bipartite graph. This solution is optimal under the strong assumption of complete knowledge of future publications. We call the resulting algorithm 'Aggregation Decision, Optimal with Complete Knowledge' (ADOCK). We also show that under a dynamic setting without full knowledge, ADOCK can still be applied to produce a low, yet not necessarily optimal, communication cost. We also devise a computationally cheaper dynamic approach called 'Aggregation Decision with Weighted Publication' (WAD). We compare our solutions experimentally using two real datasets and explore the trade-offs with respect to communication and computation costs.
AB - Modern applications for distributed publish/subscribe systems often require stream aggregation capabilities along with rich data filtering. When compared to other distributed systems, aggregation in pub/sub differentiates itself as a complex problem which involves dynamic dissemination paths that are difficult to predict and optimize for a priori, temporal fluctuations in publication rates, and the mixed presence of aggregated and non-aggregated workloads. In this paper, we propose a formalization for the problem of minimizing communication traffic in the context of aggregation in pub/sub. We present a solution to this minimization problem by using a reduction to the well-known problem of minimum vertex cover in a bipartite graph. This solution is optimal under the strong assumption of complete knowledge of future publications. We call the resulting algorithm 'Aggregation Decision, Optimal with Complete Knowledge' (ADOCK). We also show that under a dynamic setting without full knowledge, ADOCK can still be applied to produce a low, yet not necessarily optimal, communication cost. We also devise a computationally cheaper dynamic approach called 'Aggregation Decision with Weighted Publication' (WAD). We compare our solutions experimentally using two real datasets and explore the trade-offs with respect to communication and computation costs.
KW - Aggregation
KW - Distributed event based system
KW - Publish/Subscribe system
UR - http://www.scopus.com/inward/record.url?scp=84944313204&partnerID=8YFLogxK
U2 - 10.1109/ICDCS.2015.54
DO - 10.1109/ICDCS.2015.54
M3 - Conference contribution
AN - SCOPUS:84944313204
T3 - Proceedings - International Conference on Distributed Computing Systems
SP - 462
EP - 473
BT - Proceedings - 2015 IEEE 35th International Conference on Distributed Computing Systems, ICDCS 2015
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 35th IEEE International Conference on Distributed Computing Systems, ICDCS 2015
Y2 - 29 June 2015 through 2 July 2015
ER -