Automatic generation of German translation candidates for SNOMED CT textual descriptions

Andrea Prunotto, Stefan Schulz, Martin Boeker

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

4 Scopus citations

Abstract

We present an approach called MTP (multiple translation paths) that aims to assist human translators in SNOMED CT localisation projects, using free, web-based machine translation tools. For a chosen target language, MTP generates a scored list of translation candidates (TCs) for each input concept. This paper describes the basic idea of MTP and the distribution of its output TCs, and discusses typical examples with German as the target language. The MTP approach capitalises on the combinatorial growth that results from combining input languages, support languages, and translation engines. We applied MTP to the SNOMED CT Starter Set, using Google Translator, DeepL and Systran, together with the four source languages English, Spanish, Swedish and French, and with Danish, Dutch, Norwegian, Italian, Portuguese, Polish and Russian as support languages. The focus of this paper is a descriptive assessment of TC variety, together with an analysis of typical results. For each input concept, MTP defines translation paths (TPs) by combining input languages, support languages and translation engines, resulting in 91 translation results with various degrees of coincidence (cardinality). For most configurations, the average number of TCs per concept indicates that the same TC is often derived via different translation paths. Combinations of translation engines result in distributions with a higher number of distinct TCs per concept. This is work in progress on using machine translation (MT) for terminology translation by leveraging several free MT tools fed with different languages and language combinations. A first qualitative analysis was promising and supports our hypothesis that majority voting over many translation candidates yields higher-quality results than a single engine and a single input language.
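The path-and-vote idea described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names `enumerate_paths` and `score_candidates` are hypothetical, a path is assumed to be either direct (input language to target) or routed through one support language with the same engine, and the score is assumed to be the plain count of paths that yield the same string. The abstract does not specify which combinations make up the 91 translation results, so this simplified enumeration is not expected to reproduce that number. The German strings are invented examples, not actual engine output.

```python
from collections import Counter
from itertools import product


def enumerate_paths(input_langs, support_langs, engines):
    """List hypothetical translation paths as (engine, input_lang, support_lang).

    support_lang is None for a direct path (input -> target); otherwise the
    text is assumed to go input -> support -> target with the same engine.
    """
    paths = []
    for engine, src in product(engines, input_langs):
        paths.append((engine, src, None))          # direct path
        for pivot in support_langs:
            paths.append((engine, src, pivot))     # path via a support language
    return paths


def score_candidates(translations):
    """Score translation candidates by how many paths produced them.

    Returns (candidate, count) pairs, highest count first; ties are broken
    alphabetically so the ranking is deterministic.
    """
    counts = Counter(t.strip() for t in translations if t.strip())
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Toy configuration: 2 engines x 2 input languages x (1 direct + 2 pivots).
    paths = enumerate_paths(["en", "fr"], ["nl", "it"], ["engine_a", "engine_b"])
    print(len(paths), "paths")

    # Invented outputs for one concept, one string per path that returned a result.
    outputs = [
        "Femurfraktur",
        "Oberschenkelfraktur",
        "Femurfraktur ",
        "Fraktur des Femurs",
        "Femurfraktur",
    ]
    for candidate, count in score_candidates(outputs):
        print(count, candidate)
```

In this sketch the count plays the role of the cardinality mentioned in the abstract: a candidate reached via many independent paths ranks above one produced by a single path, which is the intuition behind majority voting over translation candidates.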

Original language: English
Title of host publication: Public Health and Informatics
Subtitle of host publication: Proceedings of MIE 2021
Publisher: IOS Press
Pages: 178-182
Number of pages: 5
ISBN (Electronic): 9781643681856
ISBN (Print): 9781643681849
State: Published - 1 Jul 2021
Externally published: Yes

Keywords

  • Machine Translation
  • NLP
  • SNOMED CT
