AirObject: A Temporally Evolving Graph Embedding for Object Identification

  • Nikhil Varma Keetha
  • Chen Wang
  • Yuheng Qiu
  • Kuan Xu
  • Sebastian Scherer
  • Carnegie Mellon University
  • Indian Institute of Technology, Dhanbad
  • Geek+Corp

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

9 Scopus citations

Abstract

Object encoding and identification are vital for robotic tasks such as autonomous exploration, semantic scene understanding, and relocalization. Previous approaches have attempted to either track objects or generate descriptors for object identification. However, such systems are limited to a 'fixed' partial object representation from a single viewpoint. A robot exploration setup instead requires a temporally 'evolving' global object representation, built as the robot observes the object from multiple viewpoints. Furthermore, given the vast distribution of unknown novel objects in the real world, the object identification process must be class-agnostic. In this context, we propose a novel temporal 3D object encoding approach, dubbed AirObject, to obtain global keypoint graph-based embeddings of objects. Specifically, the global 3D object embeddings are generated using a temporal convolutional network across structural information of multiple frames obtained from a graph attention-based encoding method. We demonstrate that AirObject achieves state-of-the-art performance for video object identification and is robust to severe occlusion, perceptual aliasing, viewpoint shift, deformation, and scale transform, outperforming state-of-the-art single-frame and sequential descriptors. To the best of our knowledge, AirObject is one of the first temporal object encoding methods. Source code is available at https://github.com/Nik-v9/AirObject.
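
The idea of aggregating per-frame descriptors over time into a single object embedding can be illustrated with a toy sketch. This is not the AirObject architecture or its API: the per-frame descriptors below are made-up 3-dimensional vectors standing in for the output of a graph attention-based encoder, the filter taps are fixed rather than learned, and a single causal 1D convolution followed by average pooling stands in for a full temporal convolutional network. All function names are hypothetical.

```python
import math


def causal_temporal_conv(frames, taps):
    """Depthwise causal 1D convolution along the time axis.

    frames: list of T per-frame descriptors, each a list of D floats.
    taps:   filter weights; taps[0] weights the current frame, taps[1]
            the previous frame, and so on. Frames before t=0 count as zero.
    """
    dim = len(frames[0])
    out = []
    for t in range(len(frames)):
        acc = [0.0] * dim
        for k, w in enumerate(taps):
            if t - k >= 0:
                for d in range(dim):
                    acc[d] += w * frames[t - k][d]
        out.append(acc)
    return out


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def sequence_embedding(frames, taps=(0.5, 0.3, 0.2)):
    """Temporal convolution, then average pooling over time, then L2 norm.

    The taps are fixed illustrative values (an assumption); a trained model
    would learn its filters.
    """
    conv = causal_temporal_conv(frames, taps)
    dim = len(conv[0])
    pooled = [sum(f[d] for f in conv) / len(conv) for d in range(dim)]
    return l2_normalize(pooled)


def cosine_similarity(a, b):
    """Dot product; equals cosine similarity for unit-length inputs."""
    return sum(x * y for x, y in zip(a, b))


# Made-up per-frame descriptors: two sequences of object A seen from
# different viewpoints, and one sequence of a different object B.
obj_a_seq1 = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [1.0, 0.0, 0.1]]
obj_a_seq2 = [[0.8, 0.2, 0.1], [1.0, 0.1, 0.0]]
obj_b_seq = [[0.0, 0.1, 1.0], [0.1, 0.0, 0.9], [0.0, 0.2, 1.0]]

emb_a1 = sequence_embedding(obj_a_seq1)
emb_a2 = sequence_embedding(obj_a_seq2)
emb_b = sequence_embedding(obj_b_seq)

# Identification by comparing sequence embeddings.
same_object_score = cosine_similarity(emb_a1, emb_a2)
different_object_score = cosine_similarity(emb_a1, emb_b)
```

Because the sequences may have different lengths, the embedding can be recomputed whenever a new frame arrives, which is the sense in which such a representation "evolves" as more viewpoints are observed.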

Original language: English
Title of host publication: Proceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
Publisher: IEEE Computer Society
Pages: 8397-8406
Number of pages: 10
ISBN (Electronic): 9781665469463
State: Published - 2022
Event: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022 - New Orleans, United States
Duration: Jun 19 2022 to Jun 24 2022

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume: 2022-June
ISSN (Print): 1063-6919

Conference

Conference: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
Country/Territory: United States
City: New Orleans
Period: 06/19/22 to 06/24/22

Keywords

  • 3D from multi-view and sensors
  • Deep learning architectures and techniques
  • Machine learning
  • Recognition: detection, categorization, retrieval
  • Representation learning
  • Robot vision
  • Video analysis and understanding
  • Vision applications and systems
