Classifying semantic types of legal sentences: Portability of machine learning models

Ingo Glaser, Elena Scepankova, Florian Matthes

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

21 Scopus citations

Abstract

Legal contract analysis is an important research area. The classification of clauses or sentences enables valuable insights such as the extraction of rights and obligations. However, datasets consisting of contracts are quite rare, particularly for the German language. Therefore, this paper investigates the portability of machine learning (ML) models across different document types. We trained different ML classifiers on the tenancy law of the German Civil Code (BGB) and then applied the resulting models to a set of rental agreements. The performance of our models varies on the contract set: some models perform significantly worse, while certain settings prove portable. Additionally, we trained and evaluated the same classifiers on a dataset consisting solely of contracts to obtain a reference performance. We show that the performance of ML models may depend on the document type used for training, while certain setups result in portable models.

Original language: English
Title of host publication: Legal Knowledge and Information - JURIX 2018
Subtitle of host publication: 31st Annual Conference
Editors: Monica Palmirani
Publisher: IOS Press BV
Pages: 61-70
Number of pages: 10
ISBN (Electronic): 9781614999348
State: Published - 2018
Event: 31st International Conference on Legal Knowledge and Information Systems, JURIX 2018 - Groningen, Netherlands
Duration: 12 Dec 2018 to 14 Dec 2018

Publication series

Name: Frontiers in Artificial Intelligence and Applications
Volume: 313
ISSN (Print): 0922-6389
ISSN (Electronic): 1879-8314

Conference

Conference: 31st International Conference on Legal Knowledge and Information Systems, JURIX 2018
Country/Territory: Netherlands
City: Groningen
Period: 12/12/18 to 14/12/18

Keywords

  • Legal sentence classification
  • Natural language processing
  • Portability of machine learning models
  • Text mining
