Human Motion Capture Using a Drone

Posted in My Projects

6DoF Pose Estimation and Object Grasping

6-DoF Object Pose from Semantic Keypoints
Georgios Pavlakos, Xiaowei Zhou, Aaron Chan, Konstantinos G. Derpanis, Kostas Daniilidis
International Conference on Robotics and Automation (ICRA), 2017
project page / code / video / bibtex


Multi-Image Matching

Abstract

In this paper we propose a global optimization-based approach to jointly matching a set of images. The estimated correspondences simultaneously maximize pairwise feature affinities and cycle consistency across multiple images. Unlike previous convex methods relying on semidefinite programming, we formulate the problem as a low-rank matrix recovery problem and show that the desired semidefiniteness of a solution can be spontaneously fulfilled. The low-rank formulation enables us to derive a fast alternating minimization algorithm in order to handle practical problems with thousands of features. Both simulation and real experiments demonstrate that the proposed algorithm can achieve a competitive performance with an order of magnitude speedup compared to the state-of-the-art algorithm. In the end, we demonstrate the applicability of the proposed method to match the images of different object instances and as a result the potential to reconstruct category-specific object models from those images.
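Cycle consistency, as used above, means that matching image i to image k and then k to j gives the same result as matching i to j directly. The following is a minimal sketch of that check only, not the algorithm from the paper. It assumes full one-to-one matches stored as index lists, and the names `compose`, `is_cycle_consistent`, and `labels` are hypothetical.

```python
from itertools import product

def compose(p, q):
    """Follow map p (image i -> k), then map q (image k -> j)."""
    return [q[a] for a in p]

def is_cycle_consistent(maps, n_images):
    """maps[(i, j)][a] is the index in image j matched to feature a of image i."""
    return all(
        compose(maps[(i, k)], maps[(k, j)]) == maps[(i, j)]
        for i, j, k in product(range(n_images), repeat=3)
    )

# Assumed toy data: labels[i][a] is the identity of feature a in image i.
labels = [[0, 1, 2], [2, 0, 1], [1, 2, 0]]
n = len(labels)

# Matches derived from shared identities are cycle consistent by construction.
maps = {
    (i, j): [labels[j].index(u) for u in labels[i]]
    for i, j in product(range(n), repeat=2)
}

# Corrupt one pairwise match by swapping two of its entries.
bad = dict(maps)
bad[(0, 1)] = [maps[(0, 1)][1], maps[(0, 1)][0], maps[(0, 1)][2]]

print(is_cycle_consistent(maps, n))  # consistent matches
print(is_cycle_consistent(bad, n))   # corrupted matches
```

A single corrupted pairwise match is enough to break consistency, which is why the constraint is informative when matching many images jointly.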


Overview

 


Publication

Multi-Image Matching via Fast Alternating Minimization.
X. Zhou, M. Zhu, K. Daniilidis.
International Conference on Computer Vision (ICCV), 2015.
Supplementary material: PDF


Code

The MATLAB code for Algorithm 1 in the paper.


Single-Image Popup

Abstract

We introduce a new approach for estimating a fine-grained 3D shape and continuous pose of an object from a single image. Given a training set of view exemplars, we learn and select appearance-based discriminative parts which are mapped onto the 3D model through a facility location optimization. The training set of 3D models is summarized into a set of basis shapes from which we can generalize by linear combination. Given a test image, we detect hypotheses for each part. The main challenge is to select from these hypotheses and compute the 3D pose and shape coefficients at the same time. To achieve this, we optimize a function that considers simultaneously the appearance matching of the parts as well as the geometric reprojection error. We apply the alternating direction method of multipliers (ADMM) to minimize the resulting convex function. Our main and novel contribution is the simultaneous solution for part localization and detailed 3D geometry estimation by maximizing both appearance and geometric compatibility with convex relaxation.


Overview

summary


Publication

Single Image Pop-Up from Discriminatively Learned Parts.
M. Zhu*, X. Zhou*, K. Daniilidis.
International Conference on Computer Vision (ICCV), 2015.
*Equal contribution.


Code

Available soon.


Human Pose Estimation from Monocular Video

Abstract

This paper addresses the challenge of 3D full-body human pose estimation from a monocular image sequence. Here, two cases are considered: (i) the image locations of the human joints are provided and (ii) the image locations of joints are unknown. In the former case, a novel approach is introduced that integrates a sparsity-driven 3D geometric prior and temporal smoothness. In the latter case, the former case is extended by treating the image locations of the joints as latent variables. A deep fully convolutional network is trained to predict the uncertainty maps of the 2D joint locations. The 3D pose estimates are realized via an Expectation-Maximization algorithm over the entire sequence, where it is shown that the 2D joint location uncertainties can be conveniently marginalized out during inference. Empirical evaluation on the Human3.6M dataset shows that the proposed approaches achieve greater 3D pose estimation accuracy over state-of-the-art baselines. Further, the proposed approach outperforms a publicly available 2D pose estimation baseline on the challenging PennAction dataset.
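One way to read an uncertainty map is as a probability distribution over a joint's 2D location. The sketch below illustrates that reading only and is not the EM procedure from the paper: it normalizes a synthetic heatmap and computes the expected joint location. The Gaussian heatmap, the grid size, and the function name `expected_location` are all assumptions made for the example.

```python
import numpy as np

def expected_location(heatmap):
    """Return the (x, y) mean of a nonnegative heatmap treated as a distribution."""
    prob = heatmap / heatmap.sum()
    ys, xs = np.mgrid[0:heatmap.shape[0], 0:heatmap.shape[1]]
    return float((prob * xs).sum()), float((prob * ys).sum())

# Synthetic heatmap: a Gaussian bump centered at x=12, y=7 on a 24x32 grid.
height, width, sigma = 24, 32, 2.0
ys, xs = np.mgrid[0:height, 0:width]
heatmap = np.exp(-((xs - 12) ** 2 + (ys - 7) ** 2) / (2 * sigma**2))

x_mean, y_mean = expected_location(heatmap)
print(round(x_mean, 2), round(y_mean, 2))
```

Keeping the whole map, rather than only its peak, is what allows the uncertainty to be carried into later inference instead of being discarded.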


Overview

overview


Example results

 


Publication

Sparseness Meets Deepness: 3D Human Pose Estimation from Monocular Video.
X. Zhou, M. Zhu, S. Leonardos, K. Derpanis, K. Daniilidis.
CVPR 2016.

Supplementary material | Poster


Code

Updated package that includes the whole pipeline for reconstructing 3D human poses from an image sequence, including the proposed reconstruction algorithm and the “Stacked Hourglass Network” for 2D pose detection.

 

 


3D Shape Estimation via Convex Optimization

Abstract

We investigate the problem of estimating the 3D structure of an object defined by a set of 3D landmarks, given their 2D correspondences in a single image. To alleviate the reconstruction ambiguity, a widely used approach is to model the unknown structure as a linear combination of predefined basis shapes, and the sparse representation is usually adopted to capture complex shape variability. While this approach has proven to be successful in many applications, a challenging issue remains, i.e., the joint estimation of structure and viewpoint requires solving a nonconvex optimization problem. Previous methods often adopt an alternating minimization scheme to alternately update the structure and viewpoint, and the solution depends on initialization and may get stuck at a local optimum. In this paper, we propose a convex approach to addressing this issue and develop an efficient algorithm to solve the proposed convex program. Moreover, we propose a robust model to handle gross errors in the 2D correspondences. We demonstrate the exact recovery property of the proposed model, its merits compared to alternative methods, and its applicability to recovering 3D human poses and car shapes from real images.
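The shape model in the abstract can be written as W = R[:2] (c_1 B_1 + ... + c_k B_k), where the B_i are basis shapes and R is the viewpoint. The sketch below is a toy illustration, not the convex relaxation proposed in the paper. It shows only the structure update of the alternating scheme the abstract contrasts against: with the viewpoint fixed, the coefficients follow from linear least squares. The orthographic camera, the random basis shapes, the omission of any sparsity penalty, and all variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_basis, n_points = 4, 10

# Assumed toy data: random basis shapes B_i (3 x p) and sparse coefficients c.
basis = rng.standard_normal((n_basis, 3, n_points))
c_true = np.array([1.0, 0.0, -0.5, 0.0])

# Assumed known viewpoint: rotation about the y axis, orthographic camera.
theta = 0.3
R = np.array([
    [np.cos(theta), 0.0, np.sin(theta)],
    [0.0, 1.0, 0.0],
    [-np.sin(theta), 0.0, np.cos(theta)],
])

def project(c, R, basis):
    """2D landmarks W = R[:2] @ sum_i c_i B_i."""
    shape = np.tensordot(c, basis, axes=1)  # (3, p)
    return R[:2] @ shape                    # (2, p)

W = project(c_true, R, basis)

# With R fixed, W is linear in c, so the structure update is least squares.
A = np.stack([(R[:2] @ b).ravel() for b in basis], axis=1)  # (2p, k)
c_hat, *_ = np.linalg.lstsq(A, W.ravel(), rcond=None)
print(np.round(c_hat, 3))
```

The difficulty described in the abstract arises because R is also unknown in practice; the product of R and the coefficients makes the joint problem nonconvex.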


Demo Video

 


Publication

Sparse Representation for 3D Shape Estimation: A Convex Relaxation Approach.
X. Zhou, M. Zhu, S. Leonardos, K. Daniilidis.
Supplementary material

3D Shape Estimation from 2D Landmarks: A Convex Relaxation Approach.
X. Zhou, S. Leonardos, X. Hu, K. Daniilidis.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.


Code

MATLAB code: the MATLAB implementation of the algorithms introduced in the journal version of our work, along with several demonstration examples.


Low-Rank Modeling: A Review

Abstract

Low-rank modeling generally refers to a class of methods that solves problems by representing variables of interest as low-rank matrices. It has achieved great success in various fields including computer vision, data mining, signal processing, and bioinformatics. Recently, much progress has been made in theories, algorithms, and applications of low-rank modeling, such as exact low-rank matrix recovery via convex programming and matrix completion applied to collaborative filtering. These advances have brought more and more attention to this topic. In this article, we review the recent advances of low-rank modeling, the state-of-the-art algorithms, and the related applications in image analysis. We first give an overview of the concept of low-rank modeling and the challenging problems in this area. Then, we summarize the models and algorithms for low-rank matrix recovery and illustrate their advantages and limitations with numerical experiments. Next, we introduce a few applications of low-rank modeling in the context of image analysis. Finally, we conclude this article with some discussions.
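As a small illustration of representing data by a low-rank matrix, the sketch below computes the best rank-r approximation of a noisy matrix by truncated SVD. It is a basic textbook example rather than one of the recovery algorithms compared in the article. The matrix sizes, the noise level, and the function name `truncated_svd` are assumptions.

```python
import numpy as np

def truncated_svd(M, rank):
    """Best rank-`rank` approximation of M in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

rng = np.random.default_rng(0)

# Assumed toy data: a rank-2 matrix observed with small dense noise.
L = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15))
noisy = L + 0.01 * rng.standard_normal(L.shape)

L_hat = truncated_svd(noisy, rank=2)
err_noisy = np.linalg.norm(noisy - L)
err_hat = np.linalg.norm(L_hat - L)
print(err_hat < err_noisy)
```

Truncated SVD requires the rank to be known and every entry to be observed; settings with missing entries or gross outliers call for the matrix completion and robust recovery models the article reviews.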


Example


A comparison of some matrix completion solvers (distance to ground truth vs. time).


Publication

Low-Rank Modeling and its Applications in Image Analysis.
X. Zhou, C. Yang, H. Zhao, W. Yu.
ACM Computing Surveys, 47(2): 36, 2014.


Code

The MATLAB code that generates the figures in the paper is available through this link.


Active Contours with Group Similarity

Abstract

Active contours are widely used in image segmentation. To cope with missing or misleading features in images, researchers have introduced various ways to model the prior of shapes and use the prior to constrain active contours. However, the shape prior is usually learnt from a large set of annotated data, which is not always accessible in practice. Moreover, it is often doubted that the existing shapes in the training set will be sufficient to model the new instance in the testing image. In this paper, we propose to use the group similarity of object shapes in multiple images as a prior to aid segmentation, which can be interpreted as an unsupervised approach of shape prior modeling. We show that the rank of the matrix consisting of multiple shapes is a good measure of the group similarity of the shapes, and the nuclear norm minimization is a simple and effective way to impose the proposed constraint on existing active contour models. Moreover, we develop a fast algorithm to solve the proposed model by using the accelerated proximal method. Experiments using echocardiographic image sequences acquired from acute canine experiments demonstrate that the proposed method can consistently improve the performance of active contour models and increase the robustness against image defects such as missing boundaries.
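Proximal methods for nuclear norm minimization rely on one building block: soft-thresholding the singular values of a matrix. The sketch below shows that single step on a toy matrix whose columns are noisy copies of one contour. It is not the full accelerated algorithm or the active contour model from the paper; the circle data, the noise level, the threshold `tau`, and the function name are assumptions.

```python
import numpy as np

def singular_value_threshold(X, tau):
    """Proximal operator of tau * nuclear norm: soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(0)
n_points, n_shapes = 30, 8
angles = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
circle = np.concatenate([np.cos(angles), np.sin(angles)])  # (2 * n_points,)

# Each column is one contour: the same circle plus small independent noise.
X = circle[:, None] + 0.05 * rng.standard_normal((2 * n_points, n_shapes))

X_low = singular_value_threshold(X, tau=1.0)
print(np.linalg.matrix_rank(X), np.linalg.matrix_rank(X_low, tol=1e-8))
```

Thresholding removes the small singular values contributed by the noise, so the columns of the result are pulled toward a common shape. This is the sense in which low rank measures group similarity.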


Demo

Top: without the shape constraint. Bottom: with the shape constraint.


Publication

Active Contours with Group Similarity.
X. Zhou, X. Huang, J.S. Duncan, W. Yu.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013.


Code

The MATLAB code can be found here.

 


DECOLOR for Moving Object Detection

Abstract

Object detection is a fundamental step for automated video analysis in many vision applications. Object detection in a video is usually performed by object detectors or background subtraction techniques. Often, an object detector requires manually labeled examples to train a binary classifier, while background subtraction needs a training sequence that contains no objects to build a background model. To automate the analysis, object detection without a separate training phase becomes a critical task. People have tried to tackle this task by using motion information. But existing motion-based methods are usually limited when coping with complex scenarios such as nonrigid motion and dynamic background. In this paper, we show that the above challenges can be addressed in a unified framework named DEtecting Contiguous Outliers in the LOw-rank Representation (DECOLOR). This formulation integrates object detection and background learning into a single process of optimization, which can be solved by an alternating algorithm efficiently. We explain the relations between DECOLOR and other sparsity-based methods. Experiments on both simulated data and real sequences demonstrate that DECOLOR outperforms the state-of-the-art approaches and it can work effectively on a wide range of complex scenarios.
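The idea of alternating between background learning and outlier detection can be shown on a toy 1D "video". The sketch below is a heavily simplified stand-in, not DECOLOR itself: it assumes a static background (a rank-1 model with a fixed temporal factor) and replaces the contiguity prior with plain thresholding. The data, the threshold of 2.5, and all variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_frames, width = 50, 20, 5

# Assumed toy video: static background per pixel plus small noise.
background = rng.uniform(0.0, 1.0, n_pixels)
D = background[:, None] + 0.05 * rng.standard_normal((n_pixels, n_frames))

# A bright object of 5 contiguous pixels moves one pixel per frame.
truth = np.zeros((n_pixels, n_frames), dtype=bool)
for t in range(n_frames):
    truth[t:t + width, t] = True
D[truth] += 5.0

support = np.zeros_like(truth)  # current outlier (foreground) estimate
for _ in range(5):
    # Background step: per-pixel mean over entries not flagged as outliers.
    inlier = ~support
    b_hat = (D * inlier).sum(axis=1) / inlier.sum(axis=1)
    # Support step: flag entries that deviate strongly from the background.
    support = np.abs(D - b_hat[:, None]) > 2.5

print((support == truth).all())
```

The first background estimate is biased by the moving object. Once the object's entries are flagged and excluded, the next estimate is clean, which is the benefit of solving the two tasks together rather than in sequence.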


Example


From top to bottom: original image, background, segmentation

More examples


Publication

Moving Object Detection by Detecting Contiguous Outliers in the Low-Rank Representation.
X. Zhou, C. Yang, W. Yu.
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2013.


Code

MATLAB code

Updated on May 25, 2016.
The newest version of the GCO toolbox is included to solve the compatibility issue with the new versions of MATLAB.
Only the mex file for Win64 is included. Please compile the GCO toolbox if you are using another system.
