A Robust Approach for Action Recognition Based on Spatio-Temporal Features in RGB-D Sequences

Abstract

Human action recognition is an attractive research topic in computer vision because it plays an important role in applications such as human-computer interaction, intelligent surveillance, human action retrieval systems, health care, smart homes, and robotics. The availability of the low-cost Microsoft Kinect sensor, which captures real-time high-resolution RGB and depth information, has opened an opportunity to significantly increase the capabilities of many automated vision-based recognition tasks. In this paper, we propose a new framework for action recognition in RGB-D video. We extract spatio-temporal features from RGB-D data that capture visual, shape, and motion information, and we apply a segmentation technique to represent the temporal structure of an action. First, we use STIP to detect interest points in both the RGB and depth channels. Second, we apply the HOG3D descriptor to the RGB channel and the 3DS-HONV descriptor to the depth channel; in addition, we extract HOF2.5D by fusing RGB and depth to capture human motion. Third, we divide the video into segments and apply GMM to create feature vectors for each segment, so each segment is represented by three feature vectors (HOG3D, 3DS-HONV, and HOF2.5D). Next, max pooling is applied to create a final vector for each descriptor, and these vectors are concatenated into the final vector for action representation. Lastly, we use an SVM for classification. We evaluated the proposed method on three benchmark datasets to demonstrate its generalizability. The experimental results show that it is more accurate than previous works: it obtains overall accuracies of 93.5%, 99.16%, and 89.38% on the UTKinect-Action, 3D Action Pairs, and MSR-Daily Activity 3D datasets, respectively. These results show that our method is feasible and outperforms state-of-the-art methods on these datasets.
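
The pooling and concatenation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each segment has already been encoded into one fixed-length vector per descriptor, and that max pooling is taken element-wise across segments. The function names, toy values, and vector lengths are hypothetical.

```python
import numpy as np

# Descriptor order used for concatenation (names taken from the abstract).
DESCRIPTORS = ("HOG3D", "3DS-HONV", "HOF2.5D")


def max_pool_segments(segment_vectors):
    """Element-wise max over per-segment vectors: (n_segments, dim) -> (dim,).

    Assumption: pooling runs across the segments of one video.
    """
    return np.max(np.asarray(segment_vectors, dtype=float), axis=0)


def action_representation(per_descriptor_segments):
    """Max-pool each descriptor's segment vectors, then concatenate them."""
    pooled = [max_pool_segments(per_descriptor_segments[name])
              for name in DESCRIPTORS]
    return np.concatenate(pooled)


# Toy input: two segments, with made-up per-segment vectors per descriptor.
segments = {
    "HOG3D": [[0.1, 0.7], [0.4, 0.2]],
    "3DS-HONV": [[0.3, 0.3, 0.9], [0.5, 0.1, 0.2]],
    "HOF2.5D": [[0.0, 0.6], [0.8, 0.6]],
}
video_vector = action_representation(segments)
print(video_vector.shape)
```

In a full pipeline, a vector like `video_vector` would be computed per video and passed to a classifier such as an SVM; the kernel and its parameters are not stated in the abstract.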

Authors and Affiliations

Ly Ngoc, Vo Viet, Tran Son, Pham Hoang

  • EP ID EP118000
  • DOI 10.14569/IJACSA.2016.070526

How To Cite

Ly Ngoc, Vo Viet, Tran Son, Pham Hoang (2016). A Robust Approach for Action Recognition Based on Spatio-Temporal Features in RGB-D Sequences. International Journal of Advanced Computer Science & Applications, 7(5), 166-177. https://europub.co.uk/articles/-A-118000