Jusletter IT

Toward Extracting Information from Public Health Statutes using Text Classification and Machine Learning

  • Authors: Matthias Grabmair / Kevin D. Ashley / Rebecca Hwa / Patricia M. Sweeney
  • Category: Scientific Articles
  • Region: USA
  • Field of law: AI & Law
  • Citation: Matthias Grabmair / Kevin D. Ashley / Rebecca Hwa / Patricia M. Sweeney, Toward Extracting Information from Public Health Statutes using Text Classification and Machine Learning, in: Jusletter IT 12 September 2012
This paper presents preliminary results on extracting semantic information from US state public health legislative provisions using natural language processing techniques and machine learning classifiers. It describes challenges arising from the density and distribution of the data as well as from the structure of the prediction task. Decision tree models trained on a unigram representation with TFIDF measures outperform the baselines in most cases, by varying margins, leaving room for further improvement.

Table of Contents

  • 1. Introduction
  • 2. Task Description
  • 3. Our Framework
  • 3.1. Preprocessing
  • 3.2. Chunk Dataset
  • 3.3. Machine Learning Environment
  • 3.4. Bag-of-Words and TFIDF
  • 3.5. Code Ranking
  • 4. Experiments
  • 4.1. Experiment Setup
  • 4.2. Evaluation Metrics
  • 4.3. Results
  • 4.4. Discussion
  • 5. Relationship to Prior Work
  • 6. Conclusions
  • 7. Acknowledgements
  • 8. References
