Uncovering Trauma in Genocide Tribunals: An NLP Approach Using the Genocide Transcript Corpus

Miriam Schirmer, Isaac Misael Olguín Nolasco, Edoardo Mosca, Shanshan Xu, Jürgen Pfeffer

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

This paper applies Natural Language Processing (NLP) methods to analyze the exposure to trauma experienced by witnesses in international criminal tribunals when testifying in court. One major contribution of this study is the creation of a substantially extended version of the Genocide Transcript Corpus (GTC) that includes 52,845 text segments of transcripts from three different genocide tribunals. Based on this data, we first examine the prevalence of trauma-related content in witness statements. Second, we implement a binary classification algorithm to automatically detect potentially traumatic content. To this end, in a preparatory step, an Active Learning (AL) approach is applied to establish the ideal size for the training data set. Subsequently, this data is used to train a transformer model. Two models, BERTbase and HateBERT, are used for both steps, allowing for a comparison of a base-level model with a model that has already been pre-trained on data more relevant in the context of harmful vocabulary. In a third step, the study employs an Explainable Artificial Intelligence (XAI) model to gain a deeper understanding of the reasoning behind the model’s classifications. Our results suggest that both BERTbase and HateBERT perform comparably well on this classification task, with no model clearly outperforming the other. The classification outcomes further suggest that a reduced data set size can achieve equally high performance metrics and might be a preferable choice in certain use cases. The results can be used to establish more trauma-informed legal procedures in genocide-related tribunals, including the identification of potentially re-traumatizing examination approaches at an early stage.
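
The Active Learning step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which query strategy or stopping rule the authors used, so the following rests on assumptions: it uses uncertainty sampling (query the examples the current model is least sure about), a toy nearest-centroid classifier as a stand-in for a fine-tuned BERTbase or HateBERT model, and synthetic two-dimensional data instead of transcript segments. All function names are hypothetical. The point is only to show how tracking accuracy against labeled-set size can indicate where adding more training data stops helping.

```python
import random

def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit(X, y):
    """Toy nearest-centroid model (assumption: stands in for a transformer)."""
    return {c: centroid([x for x, lab in zip(X, y) if lab == c]) for c in (0, 1)}

def margin(model, x):
    """Positive when x is closer to the class-1 centroid than to class 0."""
    return dist2(x, model[0]) - dist2(x, model[1])

def accuracy(model, X, y):
    hits = sum((1 if margin(model, x) > 0 else 0) == lab for x, lab in zip(X, y))
    return hits / len(y)

def make_data(n, rng):
    """Synthetic binary data: two Gaussian blobs (not real transcript data)."""
    X, y = [], []
    for _ in range(n):
        lab = rng.randint(0, 1)
        center = 1.0 if lab else -1.0
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(lab)
    return X, y

def active_learning_curve(pool_X, pool_y, test_X, test_y,
                          seed_per_class=10, batch=10, rounds=6):
    """Return [(labeled_set_size, test_accuracy), ...], one entry per round."""
    # Seed set: a few examples of each class, as if labeled by an annotator.
    labeled = set()
    for c in (0, 1):
        labeled.update([i for i, lab in enumerate(pool_y) if lab == c][:seed_per_class])
    curve = []
    for _ in range(rounds):
        idx = sorted(labeled)
        model = fit([pool_X[i] for i in idx], [pool_y[i] for i in idx])
        curve.append((len(idx), accuracy(model, test_X, test_y)))
        # Uncertainty sampling: query the unlabeled points with the smallest
        # absolute margin, i.e. those nearest the current decision boundary.
        unlabeled = [i for i in range(len(pool_X)) if i not in labeled]
        unlabeled.sort(key=lambda i: abs(margin(model, pool_X[i])))
        labeled.update(unlabeled[:batch])
    return curve

rng = random.Random(0)
pool_X, pool_y = make_data(300, rng)
test_X, test_y = make_data(200, rng)
for size, acc in active_learning_curve(pool_X, pool_y, test_X, test_y):
    print(f"labeled={size:3d}  accuracy={acc:.3f}")
```

In a setting like the one the abstract describes, the printed curve would be inspected for the point at which accuracy plateaus; the labeled-set size at that point is a candidate for the reduced training set.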

Original language: English
Title of host publication: 19th International Conference on Artificial Intelligence and Law, ICAIL 2023 - Proceedings of the Conference
Publisher: Association for Computing Machinery, Inc
Pages: 257-266
Number of pages: 10
ISBN (Electronic): 9798400701979
State: Published - 19 Jun 2023
Event: 19th International Conference on Artificial Intelligence and Law, ICAIL 2023 - Braga, Portugal
Duration: 19 Jun 2023 to 23 Jun 2023

Keywords

  • BERT
  • XAI
  • classification
  • genocide
  • trauma
