Abstract
As machine-written text becomes more prevalent, it is increasingly important to distinguish human- from machine-generated content, especially when such texts are not explicitly labeled. Current artificial intelligence (AI) detection methods focus primarily on human-like surface characteristics, such as emotionality and subjectivity. These features, however, can be easily modified through AI humanization, which alters word choice. In contrast, altering the underlying grammar without changing the conveyed information is considerably more difficult, so the grammatical characteristics of a text can serve as additional indicators of its origin. We therefore employ a newly developed rule-based parser to analyze the grammatical structures of human- and machine-written texts. Our findings reveal systematic grammatical differences between the two, providing a reliable criterion for determining a text's origin. We further examine the stability of this criterion under AI humanization and translation into other languages.
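The paper's parser itself is not reproduced here, but the underlying idea can be sketched: extract grammatical surface features from a text and compare their distributions across sources. The following is a minimal, hypothetical illustration (not the authors' method), using sentence length and subordinating-conjunction rate as crude stand-ins for the grammatical characteristics the abstract refers to.

```python
import re
from collections import Counter

# Hypothetical word list of common English subordinators; the actual
# grammatical features used in the paper are derived by its rule-based parser.
SUBORDINATORS = {"because", "although", "while", "whereas",
                 "since", "unless", "if", "that", "which", "who"}

def grammar_profile(text: str) -> dict:
    """Compute two crude grammatical surface features of a text."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s]
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Share of tokens that are subordinating conjunctions.
    sub_rate = sum(counts[w] for w in SUBORDINATORS) / max(len(words), 1)
    # Average sentence length in tokens.
    avg_len = len(words) / max(len(sentences), 1)
    return {"avg_sentence_len": avg_len, "subordinator_rate": sub_rate}

# Toy example texts (invented for illustration only).
human = ("Although the model wrote fluently, its clauses rarely nested. "
         "We noticed this because we parsed every sentence.")
machine = ("The model wrote fluently. The clauses did not nest. "
           "The parser checked every sentence.")

print(grammar_profile(human))
print(grammar_profile(machine))
```

In this toy pair, the text with subordinate clauses yields longer sentences and a higher subordinator rate; a real detector would aggregate many such features over a corpus rather than judge a single short sample.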
| Field | Value |
|---|---|
| Original language | English |
| Article number | 274 |
| Journal | Information (Switzerland) |
| Volume | 16 |
| Issue number | 4 |
| DOIs | |
| Publication status | Published - Apr 2025 |
Keywords
- AI detection
- AI humanization
- natural language grammar
- rule-based parser
Title
Comparison of Grammar Characteristics of Human-Written Corpora and Machine-Generated Texts Using a Novel Rule-Based Parser