SiGMa: Simple greedy matching for aligning large knowledge bases

Simon Lacoste-Julien, Konstantina Palla, Alex Davies, Gjergji Kasneci, Thore Graepel, Zoubin Ghahramani

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

149 Scopus citations

Abstract

The Internet has enabled the creation of a growing number of large-scale knowledge bases in a variety of domains containing complementary information. Tools for automatically aligning these knowledge bases would make it possible to unify many sources of structured knowledge and answer complex queries. However, the efficient alignment of large-scale knowledge bases still poses a considerable challenge. Here, we present Simple Greedy Matching (SiGMa), a simple algorithm for aligning knowledge bases with millions of entities and facts. SiGMa is an iterative propagation algorithm that leverages both the structural information from the relationship graph and flexible similarity measures between entity properties in a greedy local search, which makes it scalable. Despite its greedy nature, our experiments indicate that SiGMa can efficiently match some of the world's largest knowledge bases with high accuracy. We provide additional experiments on benchmark datasets which demonstrate that SiGMa can outperform state-of-the-art approaches both in accuracy and efficiency.
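To make the idea of greedy, iteratively propagated matching concrete, here is a minimal Python sketch written in the spirit of the abstract. It is not the authors' implementation: the function names (`name_similarity`, `greedy_align`), the blend weight `alpha`, the `threshold`, the use of `difflib` for string similarity, and the untyped neighbour sets standing in for the relationship graph are all assumptions made for illustration. The sketch starts from seed pairs, always accepts the highest-scoring unmatched candidate pair, and then proposes the neighbours of each accepted pair as new candidates.

```python
import heapq
from difflib import SequenceMatcher


def name_similarity(a, b):
    """String similarity in [0, 1] between two entity names (illustrative choice)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def greedy_align(names1, nbrs1, names2, nbrs2, seeds, alpha=0.5, threshold=0.4):
    """Greedily grow a one-to-one matching from seed pairs.

    names1/names2: dict entity -> name; nbrs1/nbrs2: dict entity -> set of neighbours.
    seeds: iterable of (e1, e2) pairs assumed to be correct matches.
    alpha and threshold are hypothetical tuning parameters.
    """
    matched12, matched21 = {}, {}

    def score(e1, e2):
        # Structural part: share of neighbours that are already matched to each other.
        shared = sum(1 for n in nbrs1[e1] if matched12.get(n) in nbrs2[e2])
        denom = max(len(nbrs1[e1]), len(nbrs2[e2]), 1)
        text = name_similarity(names1[e1], names2[e2])
        return (1 - alpha) * text + alpha * shared / denom

    # heapq is a min-heap, so scores are negated to pop the best pair first.
    heap = [(-1.0, e1, e2) for e1, e2 in seeds]
    heapq.heapify(heap)

    while heap:
        neg_score, e1, e2 = heapq.heappop(heap)
        if e1 in matched12 or e2 in matched21:
            continue  # one side was already matched by a better pair
        if -neg_score < threshold:
            break  # every remaining candidate scores lower still
        matched12[e1], matched21[e2] = e2, e1
        # Propagation: neighbours of a newly matched pair become candidates.
        for n1 in nbrs1[e1]:
            if n1 in matched12:
                continue
            for n2 in nbrs2[e2]:
                if n2 not in matched21:
                    heapq.heappush(heap, (-score(n1, n2), n1, n2))
    return matched12
```

Because candidates are only generated around pairs that have already been matched, the work done per step depends on local neighbourhood sizes rather than on the full cross product of entities, which is the intuition behind the scalability claim. Scores in this sketch are computed once when a pair is pushed and are not refreshed as more neighbours get matched.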

Original language: English
Title of host publication: KDD 2013 - 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Editors: Rajesh Parekh, Jingrui He, Dhillon S. Inderjit, Paul Bradley, Yehuda Koren, Rayid Ghani, Ted E. Senator, Robert L. Grossman, Ramasamy Uthurusamy
Publisher: Association for Computing Machinery
Pages: 572-580
Number of pages: 9
ISBN (Electronic): 9781450321747
State: Published - 11 Aug 2013
Externally published: Yes
Event: 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2013 - Chicago, United States
Duration: 11 Aug 2013 – 14 Aug 2013

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Volume: Part F128815

Conference

Conference: 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2013
Country/Territory: United States
City: Chicago
Period: 11/08/13 – 14/08/13

Keywords

  • Alignment
  • Entity
  • Greedy algorithm
  • Knowledge base
  • Large-scale
  • Relationship
