Publication | Closed Access
EcForest: Extractive document summarization through enhanced sentence embedding and cascade forest
Citations: 11 | References: 30 | Year: 2019
Engineering, Machine Learning, Present Ecforest, Language Processing, Text Mining, Automatic Summarization, Word Embeddings, Natural Language Processing, Large Language Models, Data Science, Text Summarization, Computational Linguistics, Embeddings, Language Studies, Extractive Document Summarization, NLP Task, Pre-trained Models, Deep Learning, Information Extraction, Multi-modal Summarization, Enhanced Sentence Embedding, Cascade Forest, Linguistics
Summary: We present EcForest, an extractive summarization model built on Enhanced Sentence Embedding and Cascade Forest. Sentence representation is of great significance for many summarization methods. Bag-of-words representations mostly fail to grasp semantics, and typical embedding models cannot capture more complex semantic features, such as polysemy and the meaning of a phrase, which are usually lost when the word embeddings in a sentence are simply averaged. To this end, we propose the Enhanced Sentence Embedding (ESE) model, which addresses these drawbacks by mapping several valid features to dense vectors. Essentially, the enhanced sentence embedding is a novel model for improving the distributed representation of sentences. Our sentence embedding model is universally applicable and can be adapted to other NLP tasks. Moreover, deep forest is used as the sentence extraction algorithm because it is robust to hyper-parameter settings and trains more efficiently than deep neural networks. The evaluation of the variant models proposed in this work supports the validity of the enhanced sentence embedding. Comparisons between EcForest and several baselines on two different datasets show that the proposed summarization model outperforms, or is highly competitive with, the state of the art.
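To make the averaging drawback mentioned in the summary concrete, the following is a minimal sketch of the plain averaging baseline, not of the ESE model itself. The three-dimensional word vectors and the names `TOY_VECTORS` and `average_embedding` are hypothetical, invented purely for illustration. The sketch shows that averaging is insensitive to word order, so two sentences with different meanings can receive the same representation.

```python
from typing import Dict, List

# Hypothetical 3-dimensional word vectors, made up for illustration only.
TOY_VECTORS: Dict[str, List[float]] = {
    "dog": [1.0, 0.0, 2.0],
    "bites": [0.0, 3.0, 1.0],
    "man": [2.0, 1.0, 0.0],
}


def average_embedding(tokens: List[str], vectors: Dict[str, List[float]]) -> List[float]:
    """Return the element-wise mean of the vectors of all known tokens."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        raise ValueError("no token has a vector")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


# Same words, different order, different meaning.
a = average_embedding("dog bites man".split(), TOY_VECTORS)
b = average_embedding("man bites dog".split(), TOY_VECTORS)

# The averaged vectors are identical, so word order is lost.
print(a == b)
```

With these toy values the two sentences map to the same vector, which is the kind of information loss an enhanced sentence representation would aim to reduce.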