Crafting Tomorrow’s Headlines: Neural News Generation and Detection in English, Turkish, Hungarian, and Persian

Cem Üyük, Danica Rovó, Shaghayegh Kolli, Rabia Varol, Georg Groh, Daryna Dementieva

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

In an era of information overload, facilitated by Large Language Models (LLMs), the prevalence of misinformation poses a significant threat to public discourse and societal well-being. A critical concern at present is the identification of machine-generated news. In this work, we introduce a benchmark dataset designed for neural news detection in four languages: English, Turkish, Hungarian, and Persian. The dataset incorporates outputs from multiple multilingual generators, in both zero-shot and fine-tuned setups, such as BloomZ, LLaMa-2, Mistral, Mixtral, and GPT-4. Next, we experiment with a variety of classifiers, ranging from those based on linguistic features to advanced Transformer-based models and LLM prompting. We present the detection results and examine the interpretability and robustness of machine-generated text detectors across all target languages.

Original language: English
Title of host publication: NLP4PI 2024 - 3rd Workshop on NLP for Positive Impact, Proceedings of the Workshop
Editors: Daryna Dementieva, Oana Ignat, Zhijing Jin, Rada Mihalcea, Giorgio Piatti, Joel Tetreault, Steven Wilson, Jieyu Zhao
Publisher: Association for Computational Linguistics (ACL)
Pages: 271-307
Number of pages: 37
ISBN (Electronic): 9798891761759
State: Published - 2024
Event: 3rd Workshop on NLP for Positive Impact, NLP4PI 2024, held in conjunction with the 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024 - Miami, United States
Duration: 15 Nov 2024 → …
