AspectCSE: Sentence Embeddings for Aspect-based Semantic Textual Similarity Using Contrastive Learning and Structured Knowledge

Tim Schopf, Emanuel Gerber, Malte Ostendorff, Florian Matthes

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Generic sentence embeddings provide a coarse-grained approximation of semantic textual similarity but ignore specific aspects that make texts similar. Conversely, aspect-based sentence embeddings provide similarities between texts based on certain predefined aspects. Thus, similarity predictions of texts are more targeted to specific requirements and more easily explainable. In this paper, we present AspectCSE, an approach for aspect-based contrastive learning of sentence embeddings. Results indicate that AspectCSE achieves an average improvement of 3.97% on information retrieval tasks across multiple aspects compared to the previous best results. We also propose using Wikidata knowledge graph properties to train models of multi-aspect sentence embeddings in which multiple specific aspects are simultaneously considered during similarity predictions. We demonstrate that multi-aspect embeddings outperform single-aspect embeddings on aspect-specific information retrieval tasks. Finally, we examine the aspect-based sentence embedding space and demonstrate that embeddings of semantically similar aspect labels are often close, even without explicit similarity training between different aspect labels.

Original language: English
Title of host publication: International Conference Recent Advances in Natural Language Processing, RANLP 2023
Subtitle of host publication: Large Language Models for Natural Language Processing - Proceedings
Editors: Galia Angelova, Maria Kunilovskaya, Ruslan Mitkov
Publisher: Incoma Ltd
Pages: 1054-1065
Number of pages: 12
ISBN (electronic): 9789544520922
DOIs
Publication status: Published - 2023
Event: 2023 International Conference Recent Advances in Natural Language Processing: Large Language Models for Natural Language Processing, RANLP 2023 - Varna, Bulgaria
Duration: 4 Sept. 2023 to 6 Sept. 2023

Publication series

Name: International Conference Recent Advances in Natural Language Processing, RANLP
ISSN (Print): 1313-8502

Conference

Conference: 2023 International Conference Recent Advances in Natural Language Processing: Large Language Models for Natural Language Processing, RANLP 2023
Country/Territory: Bulgaria
City: Varna
Period: 4/09/23 to 6/09/23
