TY - GEN
T1 - Detecting Requirements Smells with Deep Learning
T2 - 29th IEEE International Requirements Engineering Conference Workshops, REW 2021
AU - Habib, Mohammad Kasra
AU - Wagner, Stefan
AU - Graziotin, Daniel
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/9
Y1 - 2021/9
N2 - Requirements Engineering (RE) is one of the initial phases when building a software system. The success or failure of a software project is firmly tied to this phase, which is based on communication among stakeholders using natural language. The problem with natural language is that it can easily lead to different understandings if the stakeholders involved do not express it precisely. This results in building a product that differs from the expected one. Previous work proposed to enhance the quality of software requirements by detecting language errors based on the ISO 29148 requirements language criteria. The existing solutions apply classical Natural Language Processing (NLP) to detect them. Classical NLP has some limitations, such as domain dependence, which results in poor generalization capability. Therefore, this work aims to improve on the previous work by creating a manually labeled dataset and using ensemble learning, Deep Learning (DL), and techniques such as word embeddings and transfer learning to overcome the generalization problem tied to classical NLP and to improve precision and recall. The current findings show that the dataset is unbalanced and indicate which classes need more examples. It is tempting to train algorithms even if the dataset is not sufficiently representative. Consequently, the results show that the models are overfitting; in Machine Learning this issue is addressed by adding more instances to the dataset, improving label quality, removing noise, and reducing the complexity of the learning algorithms, all of which is planned for this research.
AB - Requirements Engineering (RE) is one of the initial phases when building a software system. The success or failure of a software project is firmly tied to this phase, which is based on communication among stakeholders using natural language. The problem with natural language is that it can easily lead to different understandings if the stakeholders involved do not express it precisely. This results in building a product that differs from the expected one. Previous work proposed to enhance the quality of software requirements by detecting language errors based on the ISO 29148 requirements language criteria. The existing solutions apply classical Natural Language Processing (NLP) to detect them. Classical NLP has some limitations, such as domain dependence, which results in poor generalization capability. Therefore, this work aims to improve on the previous work by creating a manually labeled dataset and using ensemble learning, Deep Learning (DL), and techniques such as word embeddings and transfer learning to overcome the generalization problem tied to classical NLP and to improve precision and recall. The current findings show that the dataset is unbalanced and indicate which classes need more examples. It is tempting to train algorithms even if the dataset is not sufficiently representative. Consequently, the results show that the models are overfitting; in Machine Learning this issue is addressed by adding more instances to the dataset, improving label quality, removing noise, and reducing the complexity of the learning algorithms, all of which is planned for this research.
KW - Deep Learning
KW - Natural Language Processing
KW - RE
UR - http://www.scopus.com/inward/record.url?scp=85118447318&partnerID=8YFLogxK
U2 - 10.1109/REW53955.2021.00027
DO - 10.1109/REW53955.2021.00027
M3 - Conference contribution
AN - SCOPUS:85118447318
T3 - Proceedings of the IEEE International Conference on Requirements Engineering
SP - 153
EP - 156
BT - Proceedings - 29th IEEE International Requirements Engineering Conference Workshops, REW 2021
A2 - Yue, Tao
A2 - Mirakhorli, Mehdi
PB - IEEE Computer Society
Y2 - 20 September 2021 through 24 September 2021
ER -