Publication | Closed Access
Automated coding of diagnoses--three methods compared.
Citations: 33
References: 7
Year: 2000
Keywords: Engineering, Diagnosis, Diagnostics, Automated Coding, Medical Diagnosis, Corpus Linguistics, Text Mining, Natural Language Processing, Information Retrieval, SNOMED Codes, Computational Linguistics, Correct Diagnoses, Biostatistics, Public Health, Biomedical Text Mining, Disease Diagnosis, Machine Translation, Computational Lexicology, Terminology Extraction, Lexical Resource, Vector Space Model, Diagnostic System, Possible Diagnoses, Medicine, Linguistics, Health Informatics
In Germany, new legal requirements have raised the importance of accurately encoding admission and discharge diagnoses for in- and outpatients. In response to the emerging need for computer-supported tools, we examined three methods for the automated coding of German-language free-text diagnosis phrases. We compared a language-independent, lexicon-free n-gram approach with one that uses a dictionary of medical morphemes and refines the query by mapping it to SNOMED codes. Both techniques produced a ranked list of possible diagnoses within a vector space retrieval framework. The results did not reveal any significant difference: the correct diagnosis was found in approximately 40% of cases for three-digit codes and 30% for four-digit codes. The lexicon-based method was then modified by replacing the vector space ranking with a heuristic approach that capitalizes on the semantic structure of SNOMED, which raised the number of correct diagnoses significantly (to approximately 50% for three-digit codes and 40% for four-digit codes). We therefore claim that lexicon-based retrieval methods do not perform better than lexicon-free ones unless conceptual knowledge is added.
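The lexicon-free method described in the abstract can be illustrated with a minimal sketch: each phrase is turned into a bag of character n-grams, and candidate codes are ranked by cosine similarity in that vector space. This is not the authors' implementation. The trigram size, the raw-count weighting, the function names, and the three-entry toy catalogue are all assumptions made for illustration.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a lowercased, space-padded phrase."""
    padded = f" {text.lower().strip()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def vectorize(text, n=3):
    """Represent a phrase as a sparse vector of n-gram counts (assumed weighting)."""
    return Counter(char_ngrams(text, n))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_codes(query, catalogue, n=3):
    """Return catalogue codes ordered from most to least similar to the query."""
    query_vec = vectorize(query, n)
    scored = [
        (cosine(query_vec, vectorize(label, n)), code)
        for code, label in catalogue.items()
    ]
    # Sort by descending score; break ties by code for a stable order.
    return [code for _, code in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]


# Hypothetical toy catalogue of three-character codes with German labels,
# chosen only to exercise the ranking; it is not taken from the study.
CATALOGUE = {
    "J18": "Pneumonie, Erreger nicht näher bezeichnet",
    "I21": "Akuter Myokardinfarkt",
    "K35": "Akute Appendizitis",
}

if __name__ == "__main__":
    # A spelling variant still shares most trigrams with the catalogue label.
    print(rank_codes("akute Appendicitis", CATALOGUE))
```

Because the comparison works on character sequences alone, it needs no dictionary and tolerates spelling variants. It also has no notion of meaning, which fits the abstract's finding that adding conceptual knowledge is what improved the results.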