dc.description.abstract | "In this thesis, we present a hierarchical approach to human action classification using 3-D convolutional neural networks (3-D CNNs). Human actions involve the positioning and movement of the hands and legs, and can therefore be classified according to whether they are performed by the hands, by the legs, or, in some cases, both. This observation is the intuition behind our hierarchical classification scheme.
In this work, we treat actions as tasks performed through hand or leg movements. Therefore, instead of using a single 3-D CNN to classify the given actions, we use multiple networks that perform the classification hierarchically: we first classify an action as a hand action or a leg action, and then use two separate networks, one for hand actions and one for leg actions, to classify it among the target action categories. In particular, we train three networks to classify six actions, comprising three hand actions and three leg actions. The use of 3-D CNNs enables automatic feature extraction in both the spatial and temporal domains, avoiding the need for hand-crafted features and making this approach well suited to video classification. We evaluate our approach on the KTH dataset; a comparison with state-of-the-art methods shows that it outperforms most of them." | |
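The two-stage routing described in the abstract can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the classifier functions are hypothetical stand-ins for the three trained 3-D CNNs, and the six class names are the standard KTH action categories.

```python
# Sketch of hierarchical action classification (hypothetical stubs,
# not the thesis's trained 3-D CNNs).

HAND_ACTIONS = ["boxing", "handclapping", "handwaving"]  # KTH hand actions
LEG_ACTIONS = ["walking", "jogging", "running"]          # KTH leg actions

def root_classifier(clip):
    """Stage 1: binary hand-vs-leg decision (stand-in for the first network)."""
    return clip["limb"]  # hypothetical: a real model would infer this from video

def hand_classifier(clip):
    """Stage 2a: three-way classifier over hand actions (stand-in)."""
    return clip["action"]

def leg_classifier(clip):
    """Stage 2b: three-way classifier over leg actions (stand-in)."""
    return clip["action"]

def classify(clip):
    # Route the clip to the second-stage network selected by the root decision.
    if root_classifier(clip) == "hand":
        return hand_classifier(clip)
    return leg_classifier(clip)
```

For example, `classify({"limb": "leg", "action": "jogging"})` routes the clip through the leg-action branch. The design point is that each second-stage network only has to separate three classes, rather than one flat network separating all six.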