Toward extracting information from public health statutes using text classification machine learning

Matthias Grabmair, Kevin D. Ashley, Rebecca Hwa, Patricia M. Sweeney

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

9 Scopus citations

Abstract

This paper presents preliminary results in extracting semantic information from US state public health legislative provisions using natural language processing techniques and machine learning classifiers. Challenges in the density and distribution of the data as well as the structure of the prediction task are described. Decision tree models trained on a unigram representation with TFIDF measures in most cases outperform the baselines by varying margins, leaving room for further improvement.
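To make the feature representation named in the abstract concrete, here is a minimal sketch of unigram TF-IDF weighting in plain Python. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not specify the tokenization, the exact TF-IDF variant, or the decision tree learner, so the function names (`tokenize`, `tfidf_vectors`), the length-normalized term frequency, the unsmoothed `log(N / df)` weighting, and the three toy provisions are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercased alphabetic unigrams (an assumed tokenization)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return (vocab, vectors) with one TF-IDF vector per document.

    TF is the term count divided by document length; IDF is log(N / df).
    Both are one common choice among several, assumed here for illustration.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    vocab = sorted(df)
    idf = {term: math.log(n_docs / df[term]) for term in vocab}

    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        vectors.append([counts[term] / total * idf[term] for term in vocab])
    return vocab, vectors


# Hypothetical provisions, invented for this example only.
provisions = [
    "The department shall report each outbreak to the governor.",
    "The hospital shall notify the department of each case.",
    "The governor may declare an emergency.",
]

vocab, vectors = tfidf_vectors(provisions)
for term, weight in zip(vocab, vectors[0]):
    if weight > 0:
        print(f"{term}: {weight:.3f}")
```

With this weighting, a term that occurs in every provision receives weight zero, while a term confined to a single provision receives the largest weight. Vectors of this kind are what a classifier such as a decision tree would take as input, one row per provision.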

Original language: English
Title of host publication: Legal Knowledge and Information Systems - JURIX 2011
Subtitle of host publication: The Twenty-Fourth Annual Conference
Publisher: IOS Press BV
Pages: 73-82
Number of pages: 10
ISBN (Print): 9781607509806
State: Published - 2011
Externally published: Yes

Publication series

Name: Frontiers in Artificial Intelligence and Applications
Volume: 235
ISSN (Print): 0922-6389
ISSN (Electronic): 1879-8314

Keywords

  • machine learning
  • natural language processing
  • semantic extraction
