A Deep Learning Approach for Human Activities Recognition From Multimodal Sensing Devices

TLDR

Deep learning has advanced human activity recognition by extracting features automatically from multimodal sensors, overcoming the limitations of handcrafted, single-modality approaches. The study proposes a multi-channel architecture that combines a CNN with a BLSTM: the CNN layers extract features from raw sensor data at multiple resolutions, and the BLSTM then exploits temporal context in both the forward and backward directions. On two public datasets, the CNN-BLSTM model outperforms the authors' baselines and comparable approaches, supporting its effectiveness for multimodal human activity recognition.

Abstract

Research on recognising human activities of daily living has improved significantly with deep learning techniques. Traditional human activity recognition techniques often rely on handcrafted features, derived through heuristic processes from a single sensing modality. Deep learning techniques address most of these limitations by automatically extracting features from multimodal sensing devices to recognise activities accurately. In this paper, we propose a multi-channel deep learning architecture that combines a convolutional neural network (CNN) with a bidirectional long short-term memory (BLSTM) network. The advantage of this model is that the CNN layers map raw sensor inputs directly to abstract representations, extracting features at different resolutions. The BLSTM layer then exploits both the forward and the backward sequence to significantly improve the extracted features for activity recognition. We evaluate the proposed model on two publicly available datasets. The experimental results show that the proposed model performs considerably better than our baseline models and other models evaluated on the same datasets. The results also demonstrate the suitability of the proposed model for enhanced human activity recognition with multimodal sensing devices.
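
The data flow described in the abstract can be illustrated with a minimal NumPy sketch of a forward pass. It is not the authors' implementation: the abstract gives no layer counts, kernel sizes, or window lengths, so the kernel sizes (3, 7, 11), filter count, hidden size, six activity classes, and the 128-step, 9-channel input window are all assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d_relu(x, w, b):
    """'Same'-padded 1D convolution over time, then ReLU.
    x: (T, C_in), w: (k, C_in, C_out), b: (C_out,) -> (T, C_out)."""
    k = w.shape[0]
    left = (k - 1) // 2
    xp = np.pad(x, ((left, k - 1 - left), (0, 0)))
    out = np.stack([np.tensordot(xp[t:t + k], w, axes=([0, 1], [0, 1]))
                    for t in range(x.shape[0])])
    return np.maximum(out + b, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_pass(x, params, reverse=False):
    """Run one LSTM direction over x (T, D); returns hidden states (T, H)
    stored at their original time indices."""
    W, U, b = params  # W: (D, 4H), U: (H, 4H), b: (4H,)
    H = U.shape[0]
    h, c = np.zeros(H), np.zeros(H)
    T = x.shape[0]
    out = np.zeros((T, H))
    for t in (range(T - 1, -1, -1) if reverse else range(T)):
        i, f, g, o = np.split(x[t] @ W + h @ U + b, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        out[t] = h
    return out

def init_lstm(d, h):
    return (rng.normal(0, 0.1, (d, 4 * h)),
            rng.normal(0, 0.1, (h, 4 * h)),
            np.zeros(4 * h))

def build_model(n_inputs, kernel_sizes=(3, 7, 11), n_filters=8,
                hidden=16, n_classes=6):
    """All sizes here are hypothetical; the abstract does not specify them."""
    d = n_filters * len(kernel_sizes)
    return {
        # One convolutional branch per kernel size (one "resolution" each).
        "branches": [(rng.normal(0, 0.1, (k, n_inputs, n_filters)),
                      np.zeros(n_filters)) for k in kernel_sizes],
        "lstm_fwd": init_lstm(d, hidden),
        "lstm_bwd": init_lstm(d, hidden),
        "W_out": rng.normal(0, 0.1, (2 * hidden, n_classes)),
        "b_out": np.zeros(n_classes),
    }

def forward(model, x):
    # Multi-resolution CNN features, concatenated along the feature axis.
    feats = np.concatenate(
        [conv1d_relu(x, w, b) for w, b in model["branches"]], axis=1)
    # Bidirectional LSTM: one pass forward in time, one pass backward.
    h_fwd = lstm_pass(feats, model["lstm_fwd"])
    h_bwd = lstm_pass(feats, model["lstm_bwd"], reverse=True)
    # Final state of each direction summarises the whole window.
    summary = np.concatenate([h_fwd[-1], h_bwd[0]])
    logits = summary @ model["W_out"] + model["b_out"]
    e = np.exp(logits - logits.max())
    return e / e.sum()  # softmax over activity classes

model = build_model(n_inputs=9)
window = rng.normal(size=(128, 9))  # stand-in for one window of sensor data
probs = forward(model, window)
```

Here `probs` holds one probability per assumed activity class. In practice such a model would be built and trained in a deep learning framework; the sketch only makes the ordering of the stages concrete: parallel convolutions at several kernel sizes, feature concatenation, then a bidirectional recurrent pass before classification.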
