Publication | Open Access
COMET: Commonsense Transformers for Automatic Knowledge Graph Construction
Year: 2019
Citations: 22
References: 27
Topics: Engineering, Knowledge Extraction, Multilingual Pretraining, Semantic Web, Large Language Model, Corpus Linguistics, Language Processing, Text Mining, Natural Language Processing, Knowledge Graph Embeddings, Data Science, Computational Linguistics, Commonsense Knowledge, Language Studies, Commonsense Knowledge Graphs, Machine Translation, Knowledge Representation, Automatic Commonsense Completion, Knowledge Discovery, Pre-trained Models, Computer Science, Retrieval Augmented Generation, Commonsense Transformers, Domain Knowledge Modeling, Semantic Graph, Linguistics
Unlike many conventional knowledge bases, which store knowledge in canonical templates, commonsense knowledge graphs store loosely structured open-text descriptions. This study presents a comprehensive investigation of automatic knowledge base construction for ATOMIC and ConceptNet and proposes COMET, a generative transformer that transfers implicit knowledge from deep pre-trained language models to produce rich, diverse commonsense descriptions in natural language. Human raters judge COMET's novel generations to be high quality, with precision at top 1 of up to 77.5% on ATOMIC and 91.7% on ConceptNet, which approaches human performance for these resources. The findings suggest that generative commonsense models could soon be a plausible alternative to extractive methods for commonsense KB completion.
We present the first comprehensive study on automatic knowledge base construction for two prevalent commonsense knowledge graphs: ATOMIC (Sap et al., 2019) and ConceptNet (Speer et al., 2017). Contrary to many conventional KBs that store knowledge with canonical templates, commonsense KBs only store loosely structured open-text descriptions of knowledge. We posit that an important step toward automatic commonsense completion is the development of generative models of commonsense knowledge, and propose COMmonsEnse Transformers (COMET) that learn to generate rich and diverse commonsense descriptions in natural language. Despite the challenges of commonsense modeling, our investigation reveals promising results when implicit knowledge from deep pre-trained language models is transferred to generate explicit knowledge in commonsense knowledge graphs. Empirical results demonstrate that COMET is able to generate novel knowledge that humans rate as high quality, with up to 77.5% (ATOMIC) and 91.7% (ConceptNet) precision at top 1, which approaches human performance for these resources. Our findings suggest that using generative commonsense models for automatic commonsense KB completion could soon be a plausible alternative to extractive methods.
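To make the "precision at top 1" figure concrete, the following is a minimal sketch of how such a score could be computed from human validity judgments. It is an illustration under stated assumptions, not the evaluation code used in the paper: the function name `precision_at_1`, the prompt format, and the toy ratings are all hypothetical.

```python
from typing import Dict, List


def precision_at_1(ratings: Dict[str, List[bool]]) -> float:
    """Fraction of prompts whose top-ranked generation was judged valid.

    `ratings` maps each prompt to human judgments of its generated
    candidates, ordered from highest-ranked to lowest-ranked.
    Assumption: a prompt with no candidates counts as a miss.
    """
    if not ratings:
        raise ValueError("no prompts to score")
    hits = sum(1 for judged in ratings.values() if judged and judged[0])
    return hits / len(ratings)


# Hypothetical toy judgments (not data from the paper).
toy_ratings = {
    "PersonX goes to the store | xIntent": [True, False, True],
    "PersonX goes to the store | xEffect": [False, True],
    "bread | UsedFor": [True],
    "piano | AtLocation": [True, True],
}

score = precision_at_1(toy_ratings)
print(f"P@1 = {score:.2%}")  # prints "P@1 = 75.00%"
```

Only the first judgment per prompt affects the score, which is why the second prompt counts as a miss even though a lower-ranked candidate was rated valid.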