TY - GEN
T1 - NAGA: Searching and Ranking Knowledge
T2 - 2008 IEEE 24th International Conference on Data Engineering, ICDE'08
AU - Kasneci, Gjergji
AU - Suchanek, Fabian M.
AU - Ifrim, Georgiana
AU - Ramanath, Maya
AU - Weikum, Gerhard
PY - 2008
Y1 - 2008
AB - The Web has the potential to become the world's largest knowledge base. In order to unleash this potential, the wealth of information available on the Web needs to be extracted and organized. There is a need for new querying techniques that are simple and yet more expressive than those provided by standard keyword-based search engines. Searching for knowledge rather than Web pages needs to consider inherent semantic structures like entities (person, organization, etc.) and relationships (isA, locatedIn, etc.). In this paper, we propose NAGA, a new semantic search engine. NAGA builds on a knowledge base, which is organized as a graph with typed edges, and consists of millions of entities and relationships extracted from Web-based corpora. A graph-based query language enables the formulation of queries with additional semantic information. We introduce a novel scoring model, based on the principles of generative language models, which formalizes several notions such as confidence, informativeness and compactness and uses them to rank query results. We demonstrate NAGA's superior result quality over state-of-the-art search engines and question answering systems.
UR - http://www.scopus.com/inward/record.url?scp=52649125614&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2008.4497504
DO - 10.1109/ICDE.2008.4497504
M3 - Conference contribution
AN - SCOPUS:52649125614
SN - 9781424418374
T3 - Proceedings - International Conference on Data Engineering
SP - 953
EP - 962
BT - Proceedings of the 2008 IEEE 24th International Conference on Data Engineering, ICDE'08
Y2 - 7 April 2008 through 12 April 2008
ER -