TY - GEN
T1 - Multi-task deep learning for legal document translation, summarization and multi-label classification
AU - Elnaggar, Ahmed
AU - Gebendorfer, Christoph
AU - Glaser, Ingo
AU - Matthes, Florian
N1 - Publisher Copyright: © 2018 Association for Computing Machinery.
PY - 2018/12/21
Y1 - 2018/12/21
N2 - The digitalization of the legal domain has been ongoing for a couple of years. In that process, the application of different machine learning (ML) techniques is crucial. Tasks such as the classification of legal documents or contract clauses as well as the translation of those are highly relevant. On the other side, digitized documents are barely accessible in this field, particularly in Germany. Today, deep learning (DL) is one of the hot topics with many publications and various applications. Sometimes it provides results outperforming the human level. Hence this technique may be feasible for the legal domain as well. However, DL requires thousands of samples to provide decent results. A potential solution to this problem is multi-task DL to enable transfer learning. This approach may be able to overcome the data scarcity problem in the legal domain, specifically for the German language. We applied the state of the art multi-task model on three tasks: translation, summarization, and multi-label classification. The experiments were conducted on legal document corpora utilizing several task combinations as well as various model parameters. The goal was to find the optimal configuration for the tasks at hand within the legal domain. The multi-task DL approach outperformed the state of the art results in all three tasks. This opens a new direction to integrate DL technology more efficiently in the legal domain.
AB - The digitalization of the legal domain has been ongoing for a couple of years. In that process, the application of different machine learning (ML) techniques is crucial. Tasks such as the classification of legal documents or contract clauses as well as the translation of those are highly relevant. On the other side, digitized documents are barely accessible in this field, particularly in Germany. Today, deep learning (DL) is one of the hot topics with many publications and various applications. Sometimes it provides results outperforming the human level. Hence this technique may be feasible for the legal domain as well. However, DL requires thousands of samples to provide decent results. A potential solution to this problem is multi-task DL to enable transfer learning. This approach may be able to overcome the data scarcity problem in the legal domain, specifically for the German language. We applied the state of the art multi-task model on three tasks: translation, summarization, and multi-label classification. The experiments were conducted on legal document corpora utilizing several task combinations as well as various model parameters. The goal was to find the optimal configuration for the tasks at hand within the legal domain. The multi-task DL approach outperformed the state of the art results in all three tasks. This opens a new direction to integrate DL technology more efficiently in the legal domain.
KW - Classification
KW - Multi-label
KW - Multi-task Deep Learning
KW - Summarization
KW - Translation
UR - https://www.scopus.com/pages/publications/85062954184
U2 - 10.1145/3299819.3299844
DO - 10.1145/3299819.3299844
M3 - Conference contribution
AN - SCOPUS:85062954184
T3 - ACM International Conference Proceeding Series
SP - 9
EP - 15
BT - AICCC 2018 - Proceedings of 2018 Artificial Intelligence and Cloud Computing Conference
PB - Association for Computing Machinery
T2 - 2018 International Conference on Artificial Intelligence and Cloud Computing, AICCC 2018
Y2 - 21 December 2018 through 23 December 2018
ER -