Please use this identifier to cite or link to this item: http://drsr.daiict.ac.in//handle/123456789/1197
Title: Human Activity Recognition using Two-stream Attention Based Bi-LSTM Networks
Authors: Khare, Manish
Bhandarkar, Vaidehi
Keywords: TAB-BiLSTM
Convolutional neural network
Human Activity Recognition
Issue Date: 2023
Publisher: Dhirubhai Ambani Institute of Information and Communication Technology
Citation: Bhandarkar, Vaidehi (2023). Human Activity Recognition using Two-stream Attention Based Bi-LSTM Networks. Dhirubhai Ambani Institute of Information and Communication Technology. vii, 37 p. (Acc. # T01138).
Abstract: Human Activity Recognition (HAR) is a challenging task that aims to identify the actions of humans from various data sources. Recently, deep learning methods have been applied to HAR using RGB (Red, Green and Blue) videos, which capture the spatial and temporal information of human actions. However, most of these methods rely on hand-crafted features or pre-trained models that may not be optimal for HAR. In this thesis, we propose a novel method for HAR using Two-stream Attention Based Bi-LSTM Networks (TAB-BiLSTM) in RGB videos. Our method consists of two components: a spatial stream and a temporal stream. The spatial stream uses a convolutional neural network (CNN) to extract features from RGB frames, while the temporal stream uses an optical flow network to capture the motion information. Both streams are fed into an attention-based bidirectional long short-term memory (Bi-LSTM) network, which learns the long-term dependencies and focuses on the most relevant features for HAR. The attention mechanism is implemented by multiplying the outputs of the spatial and temporal streams, applying a softmax activation, and then multiplying the result with the temporal stream output again. In this way, the attention mechanism can weigh the importance of each feature based on both streams. We evaluate our method on four benchmark datasets: UCF11, UCF50, UCF101, and NTU RGB. The method achieves state-of-the-art results on all four datasets, with accuracies of 98.3%, 97.1%, 92.1%, and 89.5%, respectively, demonstrating its effectiveness and robustness for HAR in RGB videos.
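The attention fusion described in the abstract (multiply the two stream outputs element-wise, apply a softmax, then re-weight the temporal stream) is compact enough to sketch. The following is a minimal, illustrative PyTorch sketch, not the thesis's actual implementation: the framework choice, the stub linear projections standing in for the CNN and optical-flow feature extractors, and all dimensions (512-d per-frame features, 256 hidden units, 11 classes as in UCF11) are assumptions made for demonstration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionFusion(nn.Module):
    """Fusion as described in the abstract: multiply the two stream
    outputs element-wise, softmax the product, then multiply the
    result with the temporal stream output again."""
    def forward(self, spatial, temporal):
        # spatial, temporal: (batch, time, features)
        scores = spatial * temporal          # element-wise product of the two streams
        weights = F.softmax(scores, dim=-1)  # normalize into attention weights
        return weights * temporal            # re-weight the temporal features

class TABBiLSTM(nn.Module):
    """Illustrative two-stream attention-based Bi-LSTM classifier.
    The CNN and optical-flow networks are stubbed with linear layers;
    all sizes are assumptions, not the thesis's configuration."""
    def __init__(self, feat_dim=512, hidden=256, num_classes=11):
        super().__init__()
        self.spatial_proj = nn.Linear(feat_dim, hidden)   # stand-in for CNN features of RGB frames
        self.temporal_proj = nn.Linear(feat_dim, hidden)  # stand-in for optical-flow features
        self.attend = AttentionFusion()
        self.bilstm = nn.LSTM(hidden, hidden, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, rgb_feats, flow_feats):
        # rgb_feats, flow_feats: (batch, time, feat_dim) pre-extracted per-frame features
        s = self.spatial_proj(rgb_feats)
        t = self.temporal_proj(flow_feats)
        fused = self.attend(s, t)                 # attention-weighted temporal features
        out, _ = self.bilstm(fused)               # long-term dependencies, both directions
        return self.classifier(out.mean(dim=1))   # average over time, then classify

# Example: a batch of 4 clips, 16 frames each, 512-d per-frame features
model = TABBiLSTM()
logits = model(torch.randn(4, 16, 512), torch.randn(4, 16, 512))
print(logits.shape)  # torch.Size([4, 11])

Read this way, the softmax over the element-wise product gives higher weight to features on which the two streams agree, before the Bi-LSTM models long-term dependencies in both temporal directions and the time-averaged output is classified.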
URI: http://drsr.daiict.ac.in//handle/123456789/1197
Appears in Collections:M Tech Dissertations

Files in This Item:
File: 202111064.pdf (2.79 MB, Adobe PDF)
