TY - JOUR
T1 - The CLIN33 Shared Task on the Detection of Text Generated by Large Language Models
AU - Fivez, Pieter
AU - Daelemans, Walter
AU - Van de Cruys, Tim
AU - Kashnitsky, Yury
AU - Chamezopoulos, Savvas
AU - Mohammadi, Hadi
AU - Giachanou, Anastasia
AU - Bagheri, Ayoub
AU - Poelman, Wessel
AU - Vladika, Juraj
AU - Ploeger, Esther
AU - Bjerva, Johannes
AU - Matthes, Florian
AU - van Halteren, Hans
N1 - Publisher Copyright:
© 2024 Pieter Fivez et al.
PY - 2024/3/21
Y1 - 2024/3/21
N2 - The Shared Task for CLIN33 focuses on a relatively novel yet societally relevant task: the detection of text generated by Large Language Models (LLMs). We frame this detection task as a binary classification problem (LLM-generated or not), using test data from up to 6 different domains and text genres for both Dutch and English. Part of this test data was held out entirely from the contestants, including a “mystery genre” which belonged to an unknown domain (later revealed to be columns). Four teams submitted 11 runs with substantially different models and features. This paper gives an overview of our task setup and contains the evaluation and detailed descriptions of the participating systems. Notably, included in the winning systems are both deep learning models as well as more traditional machine learning models leveraging task-specific feature engineering.
AB - The Shared Task for CLIN33 focuses on a relatively novel yet societally relevant task: the detection of text generated by Large Language Models (LLMs). We frame this detection task as a binary classification problem (LLM-generated or not), using test data from up to 6 different domains and text genres for both Dutch and English. Part of this test data was held out entirely from the contestants, including a “mystery genre” which belonged to an unknown domain (later revealed to be columns). Four teams submitted 11 runs with substantially different models and features. This paper gives an overview of our task setup and contains the evaluation and detailed descriptions of the participating systems. Notably, included in the winning systems are both deep learning models as well as more traditional machine learning models leveraging task-specific feature engineering.
UR - http://www.scopus.com/inward/record.url?scp=85197756038&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85197756038
SN - 2211-4009
VL - 13
SP - 233
EP - 259
JO - Computational Linguistics in the Netherlands Journal
JF - Computational Linguistics in the Netherlands Journal
T2 - 33rd Meeting of Computational Linguistics in the Netherlands, CLIN 2023
Y2 - 22 September 2023
ER -