The YAGO-NAGA approach to knowledge discovery

Gjergji Kasneci, Maya Ramanath, Fabian Suchanek, Gerhard Weikum

Research output: Contribution to journal › Article › peer-review

70 Scopus citations

Abstract

This paper gives an overview of the YAGO-NAGA approach to information extraction for building a conveniently searchable, large-scale, highly accurate knowledge base of common facts. YAGO harvests the infoboxes and category names of Wikipedia for facts about individual entities, and it reconciles these with the taxonomic backbone of WordNet to ensure that all entities have proper classes and that the class system is consistent. Currently, the YAGO knowledge base contains about 19 million instances of binary relations for about 1.95 million entities. Based on intensive sampling, its accuracy is estimated to be above 95 percent. The paper presents the architecture of the YAGO extractor toolkit, its distinctive approach to consistency checking, its provisions for maintenance and further growth, and the query engine for YAGO, named NAGA. It also discusses ongoing work on extensions for integrating fact candidates extracted from natural-language text sources.
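To make the idea of binary-relation facts checked against a class taxonomy concrete, here is a minimal sketch in Python. It is not the YAGO extractor or its actual consistency-checking procedure, which the abstract does not spell out. It only illustrates one plausible form of such a check: accepting a candidate fact when its arguments fit an assumed domain and range for the relation. All entity names, class names, the `bornIn` relation, and the functions `is_a` and `accept` are hypothetical.

```python
# Toy taxonomy: class -> direct superclass (hypothetical, not taken from WordNet).
SUBCLASS_OF = {
    "physicist": "scientist",
    "scientist": "person",
    "city": "location",
}

# Toy entity typing: entity -> most specific class (hypothetical).
INSTANCE_OF = {
    "Albert_Einstein": "physicist",
    "Ulm": "city",
}

# Assumed relation signatures: relation -> (domain class, range class).
SIGNATURES = {
    "bornIn": ("person", "location"),
}


def is_a(entity, target_class):
    """Return True if the entity's class is target_class or one of its subclasses."""
    cls = INSTANCE_OF.get(entity)
    while cls is not None:
        if cls == target_class:
            return True
        cls = SUBCLASS_OF.get(cls)  # move one level up the taxonomy
    return False


def accept(fact):
    """Accept a (subject, relation, object) triple only if both arguments fit the signature."""
    subject, relation, obj = fact
    signature = SIGNATURES.get(relation)
    if signature is None:
        return False  # unknown relation: reject rather than guess
    domain, range_ = signature
    return is_a(subject, domain) and is_a(obj, range_)


candidates = [
    ("Albert_Einstein", "bornIn", "Ulm"),   # arguments fit the signature
    ("Ulm", "bornIn", "Albert_Einstein"),   # arguments swapped: a city is not a person
]
accepted = [fact for fact in candidates if accept(fact)]
```

In this sketch a fact is a plain triple, mirroring the abstract's "instances of binary relations", and the check rejects any candidate whose entity has no class reachable from the expected one.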

Original language: English
Pages (from-to): 41-47
Number of pages: 7
Journal: SIGMOD Record (ACM Special Interest Group on Management of Data)
Volume: 37
Issue number: 4
State: Published - Dec 2008
Externally published: Yes
