TY - GEN
T1 - Discriminative subvolume search for efficient action detection
AU - Yuan, Junsong
AU - Liu, Zicheng
AU - Wu, Ying
PY - 2009
Y1 - 2009
N2 - Actions are spatio-temporal patterns which can be characterized by collections of spatio-temporal invariant features. Detection of actions is to find the re-occurrences (e.g. through pattern matching) of such spatio-temporal patterns. This paper addresses two critical issues in pattern matching-based action detection: (1) efficiency of pattern search in 3D videos and (2) tolerance of intra-pattern variations of actions. Our contributions are two-fold. First, we propose a discriminative pattern matching called naive- Bayes based mutual information maximization (NBMIM) for multi-class action categorization. It improves the stateof- the-art results on standard KTH dataset. Second, a novel search algorithm is proposed to locate the optimal subvolume in the 3D video space for efficient action detection. Our method is purely data-driven and does not rely on object detection, tracking or background subtraction. It can well handle the intra-pattern variations of actions such as scale and speed variations, and is insensitive to dynamic and clutter backgrounds and even partial occlusions. The experiments on versatile datasets including KTH and CMU action datasets demonstrate the effectiveness and efficiency of our method.
AB - Actions are spatio-temporal patterns which can be characterized by collections of spatio-temporal invariant features. Detection of actions is to find the re-occurrences (e.g. through pattern matching) of such spatio-temporal patterns. This paper addresses two critical issues in pattern matching-based action detection: (1) efficiency of pattern search in 3D videos and (2) tolerance of intra-pattern variations of actions. Our contributions are two-fold. First, we propose a discriminative pattern matching called naive- Bayes based mutual information maximization (NBMIM) for multi-class action categorization. It improves the stateof- the-art results on standard KTH dataset. Second, a novel search algorithm is proposed to locate the optimal subvolume in the 3D video space for efficient action detection. Our method is purely data-driven and does not rely on object detection, tracking or background subtraction. It can well handle the intra-pattern variations of actions such as scale and speed variations, and is insensitive to dynamic and clutter backgrounds and even partial occlusions. The experiments on versatile datasets including KTH and CMU action datasets demonstrate the effectiveness and efficiency of our method.
UR - https://www.scopus.com/pages/publications/70450164163
U2 - 10.1109/CVPRW.2009.5206671
DO - 10.1109/CVPRW.2009.5206671
M3 - Conference contribution
AN - SCOPUS:70450164163
SN - 9781424439935
T3 - 2009 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009
SP - 2442
EP - 2449
BT - 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, CVPR Workshops 2009
PB - IEEE Computer Society
T2 - 2009 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009
Y2 - 20 June 2009 through 25 June 2009
ER -