Are girls neko or shojo? Cross-lingual alignment of non-isomorphic embeddings with iterative normalization

Mozhi Zhang, Keyulu Xu, Ken Ichi Kawarabayashi, Stefanie Jegelka, Jordan Boyd-Graber

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

24 Scopus citations

Abstract

Cross-lingual word embeddings (CLWE) underlie many multilingual natural language processing systems, often through orthogonal transformations of pre-trained monolingual embeddings. However, orthogonal mapping only works on language pairs whose embeddings are naturally isomorphic. For non-isomorphic pairs, our method (Iterative Normalization) transforms monolingual embeddings to make orthogonal alignment easier by simultaneously enforcing that (1) individual word vectors are unit length, and (2) each language's average vector is zero. Iterative Normalization consistently improves word translation accuracy of three CLWE methods, with the largest improvement observed on English-Japanese (from 2% to 44% test accuracy).

Original languageEnglish
Title of host publicationACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
PublisherAssociation for Computational Linguistics (ACL)
Pages3180-3189
Number of pages10
ISBN (Electronic)9781950737482
StatePublished - 2020
Externally publishedYes
Event57th Annual Meeting of the Association for Computational Linguistics, ACL 2019 - Florence, Italy
Duration: 28 Jul 20192 Aug 2019

Publication series

NameACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference

Conference

Conference57th Annual Meeting of the Association for Computational Linguistics, ACL 2019
Country/TerritoryItaly
CityFlorence
Period28/07/192/08/19

Fingerprint

Dive into the research topics of 'Are girls neko or shojo? Cross-lingual alignment of non-isomorphic embeddings with iterative normalization'. Together they form a unique fingerprint.

Cite this