TY - GEN
T1 - STAR: Steiner-Tree Approximation in Relationship Graphs
T2 - 25th IEEE International Conference on Data Engineering, ICDE 2009
AU - Kasneci, Gjergji
AU - Ramanath, Maya
AU - Sozio, Mauro
AU - Suchanek, Fabian M.
AU - Weikum, Gerhard
PY - 2009
Y1 - 2009
N2 - Large graphs and networks are abundant in modern information systems: entity-relationship graphs over relational data or Web-extracted entities, biological networks, social online communities, knowledge bases, and many more. Often such data comes with expressive node and edge labels that allow an interpretation as a semantic graph, and edge weights that reflect the strengths of semantic relations between entities. Finding close relationships between a given set of two, three, or more entities is an important building block for many search, ranking, and analysis tasks. From an algorithmic point of view, this translates into computing the best Steiner trees between the given nodes, a classical NP-hard problem. In this paper, we present a new approximation algorithm, coined STAR, for relationship queries over large relationship graphs. We prove that for n query entities, STAR yields an O(log(n))-approximation of the optimal Steiner tree in pseudopolynomial run-time, and show that in practical cases the results returned by STAR are qualitatively comparable to or even better than the results returned by a classical 2-approximation algorithm. We then describe an extension to our algorithm to return the top-k Steiner trees. Finally, we evaluate our algorithm over both main-memory as well as completely disk-resident graphs containing millions of nodes. Our experiments show that in terms of efficiency STAR outperforms the best state-of-the-art database methods by a large margin, and also returns qualitatively better results.
AB - Large graphs and networks are abundant in modern information systems: entity-relationship graphs over relational data or Web-extracted entities, biological networks, social online communities, knowledge bases, and many more. Often such data comes with expressive node and edge labels that allow an interpretation as a semantic graph, and edge weights that reflect the strengths of semantic relations between entities. Finding close relationships between a given set of two, three, or more entities is an important building block for many search, ranking, and analysis tasks. From an algorithmic point of view, this translates into computing the best Steiner trees between the given nodes, a classical NP-hard problem. In this paper, we present a new approximation algorithm, coined STAR, for relationship queries over large relationship graphs. We prove that for n query entities, STAR yields an O(log(n))-approximation of the optimal Steiner tree in pseudopolynomial run-time, and show that in practical cases the results returned by STAR are qualitatively comparable to or even better than the results returned by a classical 2-approximation algorithm. We then describe an extension to our algorithm to return the top-k Steiner trees. Finally, we evaluate our algorithm over both main-memory as well as completely disk-resident graphs containing millions of nodes. Our experiments show that in terms of efficiency STAR outperforms the best state-of-the-art database methods by a large margin, and also returns qualitatively better results.
UR - http://www.scopus.com/inward/record.url?scp=67649663902&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2009.64
DO - 10.1109/ICDE.2009.64
M3 - Conference contribution
AN - SCOPUS:67649663902
SN - 9780769535456
T3 - Proceedings - International Conference on Data Engineering
SP - 868
EP - 879
BT - Proceedings - 25th IEEE International Conference on Data Engineering, ICDE 2009
Y2 - 29 March 2009 through 2 April 2009
ER -