TY - GEN
T1 - Distinguishing Fact from Fiction: A Benchmark Dataset for Identifying Machine-Generated Scientific Papers in the LLM Era
T2 - 3rd Workshop on Trustworthy Natural Language Processing, TrustNLP 2023, co-located with ACL 2023
AU - Mosca, Edoardo
AU - Abdalla, Mohamed Hesham I.
AU - Basso, Paolo
AU - Musumeci, Margherita
AU - Groh, Georg
N1 - Publisher Copyright:
© 2023 Proceedings of the Annual Meeting of the Association for Computational Linguistics. All rights reserved.
PY - 2023
Y1 - 2023
N2 - As generative NLP can now produce content nearly indistinguishable from human writing, it becomes difficult to identify genuine research contributions in academic writing and scientific publications. Moreover, information in NLP-generated text can potentially be factually wrong or even entirely fabricated. This study introduces a novel benchmark dataset, containing human-written and machine-generated scientific papers from SCIgen, GPT-2, GPT-3, ChatGPT, and Galactica. After describing the generation and extraction pipelines, we also experiment with four distinct classifiers as a baseline for detecting the authorship of scientific text. A strong focus is put on generalization capabilities and explainability to highlight the strengths and weaknesses of detectors. We believe our work serves as an important step towards creating more robust methods for distinguishing between human-written and machine-generated scientific papers, ultimately ensuring the integrity of scientific literature.
UR - http://www.scopus.com/inward/record.url?scp=85174862576&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85174862576
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 190
EP - 207
BT - 3rd Workshop on Trustworthy Natural Language Processing, TrustNLP 2023 - Proceedings of the Workshop
A2 - Ovalle, Anaelia
A2 - Chang, Kai-Wei
A2 - Mehrabi, Ninareh
A2 - Pruksachatkun, Yada
A2 - Galstyan, Aram
A2 - Dhamala, Jwala
A2 - Verma, Apurv
A2 - Cao, Trista
A2 - Kumar, Anoop
A2 - Gupta, Rahul
PB - Association for Computational Linguistics (ACL)
Y2 - 14 July 2023
ER -