Publication | Closed Access

Automatic Term Identification and Classification in Biology Texts.

Citations: 94

References: 0

Year: 1999

Abstract

The rapid growth of collections in online academic databases has made it increasingly difficult for experts to access information in a timely and efficient way. We explore here the application of information extraction methods to the identification and classification of terms in biological abstracts from MEDLINE. We explore the use of a statistical method and a decision tree method for classification and term candidate identification, and also a method based on shallow parsing for identification. Experiments are run against a corpus of 100 expert-tagged abstracts, and the results indicate that while identifying term boundaries is non-trivial, a high success rate can be obtained in term classification and that a combination of methods will provide the best solution.
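As a rough illustration of what term candidate identification by shallow parsing can look like, the sketch below collects maximal runs of adjectives and nouns that end in a noun. It is not the authors' method: the function name `term_candidates`, the Penn Treebank style tag names, and the hand-tagged example sentence are all assumptions made for this example.

```python
from typing import List, Tuple

Tagged = Tuple[str, str]  # (token, part-of-speech tag); tag names are assumed

# Tags allowed inside a candidate term (assumed Penn Treebank style).
TERM_TAGS = {"JJ", "NN", "NNS", "NNP"}


def term_candidates(tagged: List[Tagged]) -> List[str]:
    """Return maximal adjective/noun runs that end in a noun."""
    candidates: List[str] = []
    run: List[Tagged] = []
    # A sentinel with a non-term tag flushes the final run.
    for token, tag in tagged + [("", "END")]:
        if tag in TERM_TAGS:
            run.append((token, tag))
            continue
        # Trim trailing adjectives so the candidate ends in a noun.
        while run and run[-1][1] == "JJ":
            run.pop()
        if run:
            candidates.append(" ".join(tok for tok, _ in run))
        run = []
    return candidates


# Hand-tagged toy sentence (hypothetical, not from the paper's corpus).
sentence = [
    ("The", "DT"), ("human", "JJ"), ("T", "NN"), ("cell", "NN"),
    ("receptor", "NN"), ("binds", "VBZ"), ("active", "JJ"),
    ("NF-kappa", "NN"), ("B", "NN"), (".", "."),
]
print(term_candidates(sentence))
```

A heuristic like this tends to over-extend candidates (here "active" is absorbed into the second one), which is consistent with the abstract's remark that identifying term boundaries is non-trivial.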