Language-motivated approaches to action recognition

  • SUNY Buffalo

Research output: Contribution to journal › Article › peer-review

35 Scopus citations

Abstract

We present language-motivated approaches to detecting, localizing and classifying activities and gestures in videos. In order to obtain statistical insight into the underlying patterns of motions in activities, we develop a dynamic, hierarchical Bayesian model which connects low-level visual features in videos with poses, motion patterns and classes of activities. This process is somewhat analogous to the method of detecting topics or categories from documents based on the word content of the documents, except that our documents are dynamic. The proposed generative model harnesses both the temporal ordering power of dynamic Bayesian networks such as hidden Markov models (HMMs) and the automatic clustering power of hierarchical Bayesian models such as the latent Dirichlet allocation (LDA) model. We also introduce a probabilistic framework for detecting and localizing pre-specified activities (or gestures) in a video sequence, analogous to the use of filler models for keyword detection in speech processing. We demonstrate the robustness of our classification model and our spotting framework by recognizing activities in unconstrained real-life video sequences and by spotting gestures via a one-shot-learning approach.
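
The filler-model analogy mentioned in the abstract can be illustrated with a small sketch. This is not the authors' model: it assumes a plain discrete HMM for the gesture, a one-state uniform HMM as the filler, and made-up parameters and symbols. All names (`forward_loglik`, `spot`, `GESTURE`, `FILLER`) are hypothetical. Each sliding window is scored by the log-likelihood ratio between the gesture model and the filler model, and the highest-scoring window is taken as the spotted gesture.

```python
import math

def forward_loglik(obs, start, trans, emit):
    """Log-likelihood of a discrete sequence under an HMM (scaled forward algorithm)."""
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)  # assumed > 0 (all emission probabilities here are positive)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik

# Hypothetical gesture model: two left-to-right states over a 3-symbol vocabulary.
GESTURE = {
    "start": [1.0, 0.0],
    "trans": [[0.5, 0.5], [0.0, 1.0]],
    "emit": [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]],
}
# Hypothetical filler model: one state that emits every symbol uniformly.
FILLER = {
    "start": [1.0],
    "trans": [[1.0]],
    "emit": [[1 / 3, 1 / 3, 1 / 3]],
}

def spot(obs, window):
    """Return (best_start, scores); scores[s] is the log-likelihood ratio of window s."""
    scores = []
    for s in range(len(obs) - window + 1):
        seg = obs[s:s + window]
        llr = forward_loglik(seg, **GESTURE) - forward_loglik(seg, **FILLER)
        scores.append(llr)
    best_start = max(range(len(scores)), key=scores.__getitem__)
    return best_start, scores

# Toy sequence: background symbol 2, with the gesture pattern 0,0,1,1 in the middle.
sequence = [2, 2, 2, 2, 0, 0, 1, 1, 2, 2, 2, 2]
best_start, scores = spot(sequence, window=4)
```

In this toy sequence the window starting at index 4 covers the embedded pattern and receives a positive score, while pure-background windows score below zero. A practical system would also need a detection threshold and variable-length windows, which this sketch omits.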

Original language: English
Pages (from-to): 2189-2212
Number of pages: 24
Journal: Journal of Machine Learning Research
Volume: 14
State: Published - Jun 2013

Keywords

  • Activity recognition
  • Dynamic hierarchical Bayesian networks
  • Generative models
  • Gesture spotting
  • Topic models
