Semantic Label Representations with Lbl2Vec: A Similarity-Based Approach for Unsupervised Text Classification

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Scopus citations

Abstract

In this paper, we evaluate the Lbl2Vec approach for unsupervised text document classification. Lbl2Vec requires only a small number of keywords describing the respective classes to create semantic label representations. For classification, Lbl2Vec uses cosine similarities between label and document representations, but no annotation information. We show that Lbl2Vec significantly outperforms common unsupervised text classification approaches and a widely used zero-shot text classification approach. Furthermore, we show that using more precise keywords can significantly improve the classification results of similarity-based text classification approaches.
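
The similarity-based classification step described in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation or the Lbl2Vec library API: the function names `cosine_similarity` and `classify` are hypothetical, and the three-dimensional label and document vectors are made-up toy values standing in for learned embeddings. It assumes label and document representations already live in the same vector space.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def classify(doc_vectors, label_vectors):
    """Assign each document the label whose vector is most similar to it.

    No annotated training labels are used: the decision depends only on
    the similarity between document and label representations.
    """
    predictions = {}
    for doc_id, doc_vec in doc_vectors.items():
        predictions[doc_id] = max(
            label_vectors,
            key=lambda label: cosine_similarity(doc_vec, label_vectors[label]),
        )
    return predictions


# Toy vectors (assumed values, for illustration only).
label_vectors = {
    "sports": [0.9, 0.1, 0.0],
    "politics": [0.1, 0.9, 0.2],
}
doc_vectors = {
    "doc_a": [0.8, 0.2, 0.1],
    "doc_b": [0.0, 0.7, 0.3],
}

predictions = classify(doc_vectors, label_vectors)
print(predictions)
```

In this sketch the quality of the result depends entirely on how well the label vectors capture their classes, which is consistent with the abstract's observation that more precise keywords can improve similarity-based classification.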

Original language: English
Title of host publication: Web Information Systems and Technologies - 16th International Conference, WEBIST 2020, and 17th International Conference, WEBIST 2021, Revised Selected Papers
Editors: Massimo Marchiori, Francisco José Domínguez Mayo, Joaquim Filipe
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 59-73
Number of pages: 15
ISBN (Print): 9783031241963
State: Published - 2023
Event: 16th International Conference on Web Information Systems and Technologies, WEBIST 2020 and 17th International Conference on Web Information Systems and Technologies, WEBIST 2021 - Virtual, Online
Duration: 26 Oct 2021 to 28 Oct 2021

Publication series

Name: Lecture Notes in Business Information Processing
Volume: 469 LNBIP
ISSN (Print): 1865-1348
ISSN (Electronic): 1865-1356

Conference

Conference: 16th International Conference on Web Information Systems and Technologies, WEBIST 2020 and 17th International Conference on Web Information Systems and Technologies, WEBIST 2021
City: Virtual, Online
Period: 26/10/21 to 28/10/21

Keywords

  • Natural language processing
  • Semantic label representations
  • Text representations
  • Text similarity
  • Unsupervised text classification
