Research on Image Recognition Technology Based on Multimodal Deep Learning

Abstract

This study presents a multimodal human behavior recognition algorithm based on deep neural networks. The algorithm assigns a separate deep neural network to each type of modal information in the video data and integrates the outputs of these networks to identify behaviors across modalities. Data were collected with multiple Microsoft Kinect cameras, which capture skeletal point data in addition to conventional images. Collecting both kinds of data makes it possible to extract motion features from the images as well. The behavioral characteristics synthesized from these sources are then used to identify and categorize behaviors. The algorithm was validated on the MSR3D dataset. Experimental results show that recognition accuracy remains consistent across the tested scenarios, which indicates that the recognition process is robust. The findings also show that the algorithm substantially improves the accuracy of pedestrian behavior recognition in video footage.
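The abstract does not state how the per-modality networks are integrated, so the following is only a minimal sketch of one common option, late fusion by weighted averaging of class probabilities. The modality names (`rgb`, `skeleton`, `motion`), the behavior labels, the score values, and the functions `softmax`, `fuse_late`, and `predict` are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_late(per_modality_scores, weights=None):
    """Weighted average of per-modality class probabilities (late fusion).

    per_modality_scores: dict mapping a modality name to its raw class scores.
    weights: optional dict of non-negative modality weights; uniform if omitted.
    Assumption: the paper's actual fusion rule is not given in the abstract.
    """
    names = list(per_modality_scores)
    if weights is None:
        weights = {name: 1.0 for name in names}
    weight_sum = sum(weights[name] for name in names)
    num_classes = len(per_modality_scores[names[0]])
    fused = [0.0] * num_classes
    for name in names:
        probs = softmax(per_modality_scores[name])
        for i, p in enumerate(probs):
            fused[i] += weights[name] * p / weight_sum
    return fused


def predict(per_modality_scores, labels, weights=None):
    """Return the label with the highest fused probability, plus the fused vector."""
    fused = fuse_late(per_modality_scores, weights)
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best], fused


# Hypothetical outputs of three per-modality networks for one video clip.
labels = ["wave", "sit", "walk"]
scores = {
    "rgb": [2.0, 0.5, 0.1],       # network fed conventional images
    "skeleton": [1.0, 0.2, 2.5],  # network fed Kinect skeletal points
    "motion": [1.8, 0.3, 1.0],    # network fed motion features
}
label, fused = predict(scores, labels)
print(label, [round(p, 3) for p in fused])
```

Averaging probabilities rather than raw scores keeps each modality on the same scale, and the optional weights let a more reliable modality dominate the decision. Other fusion schemes, such as concatenating features before a shared classifier, would fit the abstract's description equally well.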
