Publication | Open Access
Combining sequence and itemset mining to discover named entities in biomedical texts: a new type of pattern
Citations: 18 | References: 26 | Year: 2009
Engineering, Machine Learning, New Type, Itemset Mining, Pattern Discovery, Sequential Rule Mining, Pattern Mining, Corpus Linguistics, Text Mining, Sequential Pattern Mining, Natural Language Processing, Information Retrieval, Data Science, Data Mining, Pattern Recognition, Entity Recognition, Biostatistics, Public Health, Biomedical Text Mining, Named-entity Recognition, Sequence Modelling, Knowledge Discovery, Computer Science, Frequent Pattern Mining, Biomedical Texts, Health Informatics
Biomedical named entity recognition (NER) is a challenging problem. In this paper, we show that mining techniques, such as sequential pattern mining and sequential rule mining, can be useful to tackle this problem but present some limitations. We demonstrate and analyse these limitations and introduce a new kind of pattern called LSR pattern that offers an excellent trade-off between the high precision of sequential rules and the high recall of sequential patterns. We formalise the LSR pattern mining problem first. Then we show how LSR patterns enable us to successfully tackle biomedical NER problems. We report experiments carried out on real datasets that underline the relevance of our proposition.