Comparison of Neural Language Modeling Pipelines for Outcome Prediction From Unstructured Medical Text Notes

Machine learning techniques and algorithm-based approaches are becoming increasingly vital for supporting clinical decision-making. In the medical domain, natural language processing (NLP) techniques have shown the ability to extract useful information from electronic health records. A good document representation depends on two things: the choice of model (statistical, semantic, or contextualized word embedding-based) and the preprocessing approach. Using narratives from the Intensive Care Unit, we compared the most widely used methods and preprocessing approaches on an outcome prediction problem, with the aim of guiding researchers who build NLP pipelines in the medical domain. We used real data from the Medical Information Mart for Intensive Care-III (MIMIC-III) and selected all notes related to patients with pneumonia. We conducted an in-depth analysis of text preprocessing tasks, producing three datasets: raw data with minor preprocessing, meticulous preprocessing, and extreme preprocessing that keeps only medical terminology identified by Named Entity Recognition algorithms. We then used these three datasets with five models: two based on traditional noncontextual word embedding techniques and three using transformer-based contextualized word embeddings. We demonstrated that transformer-based models outperform the other word embedding models and that thorough preprocessing yielded an F1-score of 98.2. These results show that NLP predictive models are highly competitive with other models that use medical data. With an appropriate NLP pipeline, the information contained in medical narratives can be used to draw up a patient profile, and admission notes can help to ascertain the mortality risk of a patient admitted to the Intensive Care Unit.
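The three preprocessing levels described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact cleaning steps, so everything here is an assumption made for illustration: the function names, the stopword list, and in particular `MEDICAL_TERMS`, a tiny hand-written lexicon that stands in for the Named Entity Recognition step (a real pipeline would use a trained clinical NER model instead).

```python
import re

# Hypothetical stand-in for an NER model: a tiny hand-written lexicon.
MEDICAL_TERMS = {"pneumonia", "fever", "cough", "hypoxia", "intubated", "sepsis"}

# Hypothetical, deliberately short stopword list.
STOPWORDS = {"the", "a", "an", "with", "and", "was", "is", "of", "to", "on", "for", "in"}


def minor_preprocess(note: str) -> str:
    """Level 1: collapse whitespace only; keep case, punctuation, and numbers."""
    return re.sub(r"\s+", " ", note).strip()


def meticulous_preprocess(note: str) -> str:
    """Level 2: lowercase, drop non-letter characters, remove stopwords."""
    text = re.sub(r"[^a-z\s]", " ", note.lower())
    tokens = [tok for tok in text.split() if tok not in STOPWORDS]
    return " ".join(tokens)


def extreme_preprocess(note: str) -> str:
    """Level 3: keep only tokens found in the (assumed) medical lexicon."""
    tokens = meticulous_preprocess(note).split()
    return " ".join(tok for tok in tokens if tok in MEDICAL_TERMS)


def build_datasets(notes: list[str]) -> dict[str, list[str]]:
    """Produce the three dataset variants from the same list of raw notes."""
    return {
        "minor": [minor_preprocess(n) for n in notes],
        "meticulous": [meticulous_preprocess(n) for n in notes],
        "extreme": [extreme_preprocess(n) for n in notes],
    }


if __name__ == "__main__":
    example = "Pt  admitted with   pneumonia, fever 39.2 and cough.\nWas intubated on day 2."
    for level, texts in build_datasets([example]).items():
        print(f"{level}: {texts[0]}")
```

Each variant would then be fed to the same set of models, so that differences in the resulting scores can be attributed to preprocessing rather than to the data split.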

Keywords: outcome prediction; word embedding; language; text

Journal Title: IEEE Access
Year Published: 2022
