Object Detection in Video

We introduce a framework for refining object detection in video. Our approach extracts contextual information from neighboring frames, producing predictions that achieve state-of-the-art accuracy and are also temporally consistent. Importantly, our model benefits from context frames even when they lack ground-truth annotations.

2021

Spatiotemporal Contrastive Video Representation Learning

Qian*, Rui; Meng*, Tianjian; Gong, Boqing; Yang, Ming-Hsuan; Wang, Huisheng; Belongie, Serge; Cui, Yin

Computer Vision and Pattern Recognition (CVPR), Virtual, 2021, (*Equal Contribution).

2016

Context Matters: Refining Object Detection in Video with Recurrent Neural Networks

Tripathi, Subarna; Lipton, Zachary; Belongie, Serge; Nguyen, Truong

British Machine Vision Conference (BMVC), York, UK, 2016.

Detecting Temporally Consistent Objects in Videos through Object Class Label Propagation

Tripathi, Subarna; Belongie, Serge; Hwang, Youngbae; Nguyen, Truong

Winter Conference on Applications of Computer Vision (WACV), 2016.
