Toward Extracting Information from Public Health Statutes using Text Classification and Machine Learning
This paper presents preliminary results on extracting semantic information from US state public health legislative provisions using natural language processing techniques and machine learning classifiers. It describes challenges posed by the density and distribution of the data and by the structure of the prediction task. Decision tree models trained on a unigram representation with TFIDF measures outperform the baselines in most cases, though by varying margins, leaving room for further improvement.
Table of Contents
- 1. Introduction
- 2. Task Description
- 3. Our Framework
- 3.1. Preprocessing
- 3.2. Chunk Dataset
- 3.3. Machine Learning Environment
- 3.4. Bag-of-Words and TFIDF
- 3.5. Code Ranking
- 4. Experiments
- 4.1. Experiment Setup
- 4.2. Evaluation Metrics
- 4.3. Results
- 4.4. Discussion
- 5. Relationship to Prior Work
- 6. Conclusions
- 7. Acknowledgements
- 8. References