
An Improved Method for Class-specific Keyword Extraction: A Case Study in the German Business Registry

  • Technical University of Munich
  • Fusionbase GmbH

Research output: Contribution to conference › Paper › peer-review

Abstract

The task of keyword extraction is often an important initial step in unsupervised information extraction, forming the basis for tasks such as topic modeling or document classification. While recent methods have proven to be quite effective in the extraction of keywords, the identification of class-specific keywords, or only those pertaining to a predefined class, remains challenging. In this work, we propose an improved method for class-specific keyword extraction, which builds upon the popular KeyBERT library to identify only keywords related to a class described by seed keywords. We test this method using a dataset of German business registry entries, where the goal is to classify each business according to an economic sector. Our results reveal that our method greatly improves upon previous approaches, setting a new standard for class-specific keyword extraction.

Original language: English
Pages: 159-165
Number of pages: 7
State: Published - 2024
Event: 20th Conference on Natural Language Processing, KONVENS 2024 - Vienna, Austria
Duration: 10 Sep 2024 to 13 Sep 2024

Conference

Conference: 20th Conference on Natural Language Processing, KONVENS 2024
Country/Territory: Austria
City: Vienna
Period: 10/09/24 to 13/09/24
