Suicide Note Classification Using Natural Language Processing: A Content Analysis

TLDR

Suicide is the second leading cause of death among 25–34 year olds and the third among 15–25 year olds in the United States, and risk assessment in emergency departments is typically left to clinical judgment. This study investigates whether machine‑learning algorithms can classify suicide notes as accurately as mental‑health professionals and psychiatric trainees, aiming to clarify the role of computational methods in interpreting suicidal patients’ thoughts. The authors developed natural‑language‑processing techniques to differentiate genuine from elicited notes, using a dataset of 33 completed and 33 elicited notes, and compared nine machine‑learning models to the judgments of 11 professionals and 31 trainees. Results show that trainees correctly classified 49 % of notes, professionals 63 %, and the best algorithm 78 %, demonstrating that NLP can improve identification of suicidal note types and support evidence‑based prediction of repeat attempts.

Abstract

Suicide is the second leading cause of death among 25–34 year olds and the third leading cause of death among 15–25 year olds in the United States. In the Emergency Department, where suicidal patients often present, estimating the risk of repeated attempts is generally left to clinical judgment. This paper presents our second attempt to determine the role of computational algorithms in understanding a suicidal patient's thoughts, as represented by suicide notes. We focus on developing natural language processing methods that distinguish between genuine and elicited suicide notes. We hypothesize that machine learning algorithms can categorize suicide notes as well as mental health professionals and psychiatric physician trainees do. The data comprise suicide notes from 33 suicide completers, matched to 33 elicited notes from healthy control group members. Eleven mental health professionals and 31 psychiatric trainees were asked to decide whether each note was genuine or elicited. Their decisions were compared to those of nine different machine learning algorithms. The results indicate that trainees accurately classified notes 49% of the time, mental health professionals accurately classified notes 63% of the time, and the best machine learning algorithm accurately classified the notes 78% of the time. This is an important step in developing an evidence-based predictor of repeated suicide attempts because it shows that natural language processing can aid in distinguishing between classes of suicide notes.
