Abstract
Graph analytics on social networks, Web data, and communication networks has been widely used in a plethora of applications. Many graph analytics algorithms are based on breadth-first search (BFS) graph traversal, which is not only time-consuming for large datasets but also involves much redundant computation when executed multiple times from different start vertices. In this paper, we propose Multi- Source BFS (MS-BFS), an algorithm that is designed to run multiple concurrent BFSs over the same graph on a single CPU core while scaling up as the number of cores increases. MS-BFS leverages the properties of small-world networks, which apply to many real-world graphs, and enables efficient graph traversal that: (i) shares common computation across concurrent BFSs; (ii) greatly reduces the number of random memory accesses; and (iii) does not incur synchronization costs. We demonstrate how a real graph analytics application-all-vertices closeness centrality-can be efficiently solved with MS-BFS. Furthermore, we present an extensive experimental evaluation with both synthetic and real datasets, including Twitter and Wikipedia, showing that MS-BFS provides almost linear scalability with respect to the number of cores and excellent scalability for increasing graph sizes, outperforming state-of-the-art BFS algorithms by more than one order of magnitude when running a large number of BFSs.
Original language | English |
---|---|
Pages (from-to) | 449-460 |
Number of pages | 12 |
Journal | Proceedings of the VLDB Endowment |
Volume | 8 |
Issue number | 4 |
DOIs | |
State | Published - Dec 2014 |
Event | 3rd Workshop on Spatio-Temporal Database Management, STDBM 2006, Co-located with the 32nd International Conference on Very Large Data Bases, VLDB 2006 - Seoul, Korea, Republic of Duration: 11 Sep 2006 → 11 Sep 2006 |