TY - JOUR
T1 - A Rule-Based Parser in Comparison with Statistical Neuronal Approaches in Terms of Grammar Competence
AU - Strübbe, Simon M.
AU - Grünwald, Alexander T.D.
AU - Sidorenko, Irina
AU - Lampe, Renée
N1 - Publisher Copyright:
© 2024 by the authors.
PY - 2025/1
Y1 - 2025/1
N2 - The “Easy Language” standard was created to help individuals with cognitive disabilities understand texts more easily. Typically, text simplification is performed by language experts and is available for limited materials. We introduce a new software tool designed to analyze and simplify any text according to the “Easy Language” rules. This tool uses a rule-based system, conducting a full grammatical analysis of each sentence and then simplifying it into a grammatically correct form. Unlike neuronal approaches, which are based on statistics and are very popular today, our rule-based approach explicitly addresses language ambiguities by examining all possible interpretations and eliminating the incorrect ones. The purpose of the present study is to compare the performance of our rule-based parser with two state-of-the-art statistical parsers, one based on dependencies between words (SpaCy parser) and the other based on linguistic constituents (Stanford parser). Although large language models (LLMs), which are the technical basis of the software ChatGPT, were not designed specifically for grammatical parsing, because of their popularity, users, especially language learners, often ask them grammatical questions as well. Therefore, we use LLMs as supplementary models for comparison. LLMs produce grammatically correct text on any topic; however, their grammar knowledge is implicit within the trained weights. To evaluate how well state-of-the-art methods can perform a grammatical analysis, we parse ten sentences with our tool, the statistical parsers from SpaCy and Stanford, and ask two LLMs equivalent grammar questions. The results show that our rule-based method provides a more informative and reliable grammatical analysis compared to these two parsers and outperforms LLMs in that specific task.
AB - The “Easy Language” standard was created to help individuals with cognitive disabilities understand texts more easily. Typically, text simplification is performed by language experts and is available for limited materials. We introduce a new software tool designed to analyze and simplify any text according to the “Easy Language” rules. This tool uses a rule-based system, conducting a full grammatical analysis of each sentence and then simplifying it into a grammatically correct form. Unlike neuronal approaches, which are based on statistics and are very popular today, our rule-based approach explicitly addresses language ambiguities by examining all possible interpretations and eliminating the incorrect ones. The purpose of the present study is to compare the performance of our rule-based parser with two state-of-the-art statistical parsers, one based on dependencies between words (SpaCy parser) and the other based on linguistic constituents (Stanford parser). Although large language models (LLMs), which are the technical basis of the software ChatGPT, were not designed specifically for grammatical parsing, because of their popularity, users, especially language learners, often ask them grammatical questions as well. Therefore, we use LLMs as supplementary models for comparison. LLMs produce grammatically correct text on any topic; however, their grammar knowledge is implicit within the trained weights. To evaluate how well state-of-the-art methods can perform a grammatical analysis, we parse ten sentences with our tool, the statistical parsers from SpaCy and Stanford, and ask two LLMs equivalent grammar questions. The results show that our rule-based method provides a more informative and reliable grammatical analysis compared to these two parsers and outperforms LLMs in that specific task.
KW - GPT-3.5
KW - GPT-4
KW - natural language processing
KW - rule-based parsing
KW - SpaCy
KW - Stanford parser
UR - http://www.scopus.com/inward/record.url?scp=85214529812&partnerID=8YFLogxK
U2 - 10.3390/app15010087
DO - 10.3390/app15010087
M3 - Article
AN - SCOPUS:85214529812
SN - 2076-3417
VL - 15
JO - Applied Sciences (Switzerland)
JF - Applied Sciences (Switzerland)
IS - 1
M1 - 87
ER -