The clash between two worlds in human action recognition: supervised feature training vs Recurrent ConvNet

dc.contributor.advisor: Tsechpenakis, Gavriil
dc.contributor.author: Raptis, Konstantinos
dc.date.accessioned: 2017-01-18T21:09:19Z
dc.date.available: 2017-01-18T21:09:19Z
dc.date.issued: 2016-11-28
dc.degree.date: 2016 [en_US]
dc.degree.grantor: Purdue University [en_US]
dc.degree.level: M.S. [en_US]
dc.description: Indiana University-Purdue University Indianapolis (IUPUI) [en_US]
dc.description.abstract: Action recognition has been an active research topic for over three decades. Its applications include surveillance, human-computer interaction, and content-based retrieval. Recently, research has focused on datasets drawn from movies, web videos, and TV shows. The nature of these datasets makes action recognition very challenging due to scene variability and complexity, namely background clutter, occlusions, viewpoint changes, fast irregular motion, and a large spatio-temporal search space (articulation configurations and motions). The use of local space-time image features shows promising results, avoiding the cumbersome and often inaccurate frame-by-frame segmentation (boundary estimation). We focus on two state-of-the-art methods for the action classification problem: dense trajectories and recurrent neural networks (RNN). Dense trajectories use typical supervised training (e.g., with Support Vector Machines) of features such as 3D-SIFT, extended SURF, HOG3D, and local trinary patterns; the main idea is to densely sample these features in each frame and track them through the sequence based on optical flow. On the other hand, the deep neural network uses the input frames to detect actions and produce part proposals, i.e., to estimate information on body parts (shapes and locations). We compare these two approaches, which are indicative of what is used today, both qualitatively and numerically, and describe our conclusions with respect to accuracy and efficiency. [en_US]
dc.identifier.doi: 10.7912/C2CW7G
dc.identifier.uri: https://hdl.handle.net/1805/11827
dc.identifier.uri: http://dx.doi.org/10.7912/C2/2335
dc.language.iso: en_US [en_US]
dc.rights: Attribution 3.0 United States
dc.rights.uri: http://creativecommons.org/licenses/by/3.0/us/
dc.subject: Action Recognition [en_US]
dc.subject: Dense Trajectories [en_US]
dc.subject: R-CNN [en_US]
dc.subject: LSTM RNN [en_US]
dc.subject: Convolution Neural Networks [en_US]
dc.subject: Recurrent Neural Networks [en_US]
dc.title: The clash between two worlds in human action recognition: supervised feature training vs Recurrent ConvNet [en_US]
dc.type: Thesis [en]
thesis.degree.discipline: Computer & Information Science [en]
thesis.degree.grantor: Purdue University [en]
Files
Original bundle: KRaptis_thesis_v4.pdf (3.6 MB, Adobe Portable Document Format)
License bundle: license.txt (1.88 KB, Item-specific license agreed upon to submission)