Abstract
Word embeddings represent the semantic meanings of words in a high-dimensional vector space. Because of this capability, word embeddings can be used in a wide range of Natural Language Processing (NLP) tasks. While domain-specific monolingual word embeddings are common in the literature, domain-specific bilingual word embeddings are rare. In general, large text corpora are required to train high-quality word embeddings. Furthermore, training domain-specific word embeddings requires source texts from the relevant domain. To train bilingual domain-specific word embeddings, the domain-specific texts must also be available in two different languages. In this paper, we use a large dataset of engineering-related articles in German and English to train bilingual engineering-specific word embedding models using different approaches. We evaluate our trained models, identify the most promising approach, and demonstrate that the best-performing model is capable of representing semantic relationships between engineering-specific words and of mapping both languages into a shared vector space. Moreover, we show that the additional use of an engineering-specific learning dictionary can improve the quality of bilingual engineering-specific word embeddings.
Original language | English |
---|---|
Title of host publication | MSIE 2022 - 2022 4th International Conference on Management Science and Industrial Engineering |
Publisher | Association for Computing Machinery |
Pages | 407-413 |
Number of pages | 7 |
ISBN (Electronic) | 9781450395816 |
State | Published - 28 Apr 2022 |
Event | 4th International Conference on Management Science and Industrial Engineering, MSIE 2022 - Virtual, Online, Thailand. Duration: 28 Apr 2022 → 30 Apr 2022 |
Publication series
Name | ACM International Conference Proceeding Series |
---|
Conference
Conference | 4th International Conference on Management Science and Industrial Engineering, MSIE 2022 |
---|---|
Country/Territory | Thailand |
City | Virtual, Online |
Period | 28/04/22 → 30/04/22 |
Keywords
- Bilingual Word Embeddings
- Engineering
- Natural Language Processing
Fingerprint
Dive into the research topics of 'Towards Bilingual Word Embedding Models for Engineering: Evaluating Semantic Linking Capabilities of Engineering-Specific Word Embeddings Across Languages'. Together they form a unique fingerprint.
Projects
- 1 Finished
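- TSaaS: Technology Scouting as a Service - KI gestütztes Matching von Technologien mit Problemstellungen aus dem Maschinen- und Anlagenbau (AI-supported matching of technologies with problem statements from mechanical and plant engineering)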
Schopf, T. (PI), Braun, D. (PI), Matthes, F. (PI) & Klymenko, O. (CoI)
1/07/20 → 30/11/21
Project: Research