Publication | Open Access
VISTA: Visual-Textual Knowledge Graph Representation Learning
Citations: 27 | References: 36 | Year: 2023 | Venue: Unknown
Topics: Natural Language Processing, Knowledge Representation, Engineering, Knowledge Graph Embeddings, Data Science, Deep Learning, Visual Reasoning, Computational Linguistics, Vision Language Model, Entity Encoding, Visual Question Answering, Knowledge Graphs, Text Descriptions, Semantic Graph, Semantic Network, Representation Learning
Knowledge graphs represent human knowledge using triplets composed of entities and relations. While most existing knowledge graph embedding methods only consider the structure of a knowledge graph, a few recently proposed multimodal methods utilize images or text descriptions of entities in a knowledge graph. In this paper, we propose visual-textual knowledge graphs (VTKGs), where not only entities but also triplets can be explained using images, and both entities and relations can accompany text descriptions. By compiling visually expressible commonsense knowledge, we construct new benchmark datasets where triplets themselves are explained by images, and the meanings of entities and relations are described using text. We propose VISTA, a knowledge graph representation learning method for VTKGs, which incorporates the visual and textual representations of entities and relations using entity encoding, relation encoding, and triplet decoding transformers. Experiments show that VISTA outperforms state-of-the-art knowledge graph completion methods in real-world VTKGs.
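To make the idea concrete, here is a minimal, hypothetical sketch (not the paper's actual architecture) of the core pattern the abstract describes: each entity or relation carries a structural embedding plus optional visual and textual feature vectors, the available modalities are fused into one representation, and a decoder scores the triplet. The fusion by mean-pooling and the DistMult-style decoder are illustrative stand-ins for VISTA's transformer-based encoders and triplet decoder.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # shared embedding dimension (illustrative)

def fuse(*modalities):
    """Fuse whichever modality vectors are present (None = modality missing).

    Mean-pooling is a placeholder for VISTA's transformer-based
    entity/relation encoding.
    """
    present = [m for m in modalities if m is not None]
    return np.mean(present, axis=0)

def score(h, r, t):
    """DistMult-style plausibility score, a stand-in triplet decoder."""
    return float(np.sum(h * r * t))

# Toy features for one (head, relation, tail) triplet.
# Entities may have structural, visual, and textual features;
# relations here have structural and textual ones.
head = fuse(rng.normal(size=D), rng.normal(size=D), rng.normal(size=D))
rel  = fuse(rng.normal(size=D), None, rng.normal(size=D))
tail = fuse(rng.normal(size=D), rng.normal(size=D), None)

print(score(head, rel, tail))
```

In this sketch a missing modality (e.g., an entity with no image) is simply skipped during fusion, which mirrors the practical situation that not every entity or triplet in a VTKG has all modalities available.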