Project Background

Face recognition is one of the foremost topics in computer vision and has been studied for four decades. Significant progress has been made in frontal face recognition, where the face is in a forward-facing pose. However, most faces captured by surveillance cameras are not in a frontal view, so pose-invariant face recognition (PIFR) is crucial for real-world applications. PIFR refers to the problem of identifying an individual from facial images taken under varying poses. Although the problem is attracting increasing attention, it remains a challenging one in computer vision.

Project Objective

In this project, we will design and train a deep neural network that synthesizes a frontal view of a face from a non-frontal view. Many deep neural networks address this problem, and many of them use generative adversarial networks (GANs). In 2017, Huang et al. proposed a two-pathway GAN (TP-GAN) for frontal view synthesis, which produces encouraging results on large-pose face recognition.

We will mainly adopt TP-GAN to synthesize identity-preserving frontal-view faces. However, the input face image of TP-GAN still needs to be labelled with four landmarks, i.e. left eye center, right eye center, nose tip and mouth center. We want to improve the performance of TP-GAN by integrating a feature extraction network. In this way, we will build an automatic deep neural network that can synthesize a frontal-view image from an unlabelled face image. We will train and demonstrate our model on the CMU Multi-PIE Face Database. The output of our model will be evaluated in terms of identity preservation, efficiency and effectiveness on large-pose images. In all three assessments, we aim to outperform the current state-of-the-art techniques.
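
To make the landmark requirement concrete, the following is a minimal sketch of how four landmark-centred patches might be cropped from a face image for a local pathway. The function name `crop_landmark_patches`, the patch sizes and the example coordinates are illustrative assumptions, not values taken from the TP-GAN implementation.

```python
import numpy as np

# Hypothetical patch sizes (height, width) per landmark; illustrative only.
PATCH_SIZES = {
    "left_eye": (40, 40),
    "right_eye": (40, 40),
    "nose_tip": (32, 40),
    "mouth_center": (32, 48),
}


def crop_landmark_patches(image, landmarks, patch_sizes=PATCH_SIZES):
    """Crop one patch centred on each landmark.

    image:     H x W x C array.
    landmarks: dict mapping landmark name -> (x, y) pixel coordinates.
    Patches that would cross the image border are shifted inwards so that
    every patch keeps its full size.
    """
    height, width = image.shape[:2]
    patches = {}
    for name, (x, y) in landmarks.items():
        ph, pw = patch_sizes[name]
        # Clamp the top-left corner so the patch stays inside the image.
        top = int(np.clip(round(y - ph / 2), 0, height - ph))
        left = int(np.clip(round(x - pw / 2), 0, width - pw))
        patches[name] = image[top:top + ph, left:left + pw]
    return patches


# Example with made-up landmark positions on a blank 128 x 128 image.
example_image = np.zeros((128, 128, 3), dtype=np.uint8)
example_landmarks = {
    "left_eye": (44, 52),
    "right_eye": (84, 52),
    "nose_tip": (64, 76),
    "mouth_center": (64, 100),
}
example_patches = crop_landmark_patches(example_image, example_landmarks)
```

In the proposed system these coordinates would come from the integrated landmark detection network rather than from manual labels.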

Project Methodology

The structure of our neural network will largely be based on TP-GAN, incorporating a facial landmark detection network while keeping most of the original structure. We will also refer to the implementation of the domain-invariant dual-path generator proposed by Zhao in 2018, which incorporates an off-the-shelf landmark detection model to deal with unlabelled images in the wild. The proposed architecture will be implemented in Python and TensorFlow, following the methodology of the aforementioned references. Once the code is complete, we will train and evaluate our proposed deep neural network on Multi-PIE, which is the largest multi-view face recognition benchmark, covering 337 subjects under different poses, illuminations, expressions, etc.
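
For the evaluation step, one common way to quantify identity preservation is a rank-1 recognition rate: each synthesized frontal face (probe) is matched to the most similar gallery face, and the match counts as correct if the identities agree. The sketch below assumes that feature vectors have already been extracted by some face recognition network, which is not specified here; the function name `rank1_accuracy` and the toy features are hypothetical.

```python
import numpy as np


def rank1_accuracy(probe_feats, probe_ids, gallery_feats, gallery_ids):
    """Fraction of probes whose nearest gallery feature (by cosine
    similarity) has the same identity label."""
    probe = np.asarray(probe_feats, dtype=float)
    gallery = np.asarray(gallery_feats, dtype=float)
    # L2-normalise rows so that a dot product equals cosine similarity.
    probe = probe / np.linalg.norm(probe, axis=1, keepdims=True)
    gallery = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    similarity = probe @ gallery.T
    best_match = np.argmax(similarity, axis=1)
    predicted_ids = np.asarray(gallery_ids)[best_match]
    return float(np.mean(predicted_ids == np.asarray(probe_ids)))


# Toy example: two gallery identities, three probes (made-up 2-D features).
toy_gallery = [[1.0, 0.0], [0.0, 1.0]]
toy_gallery_ids = [0, 1]
toy_probes = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
toy_probe_ids = [0, 1, 1]
toy_score = rank1_accuracy(toy_probes, toy_probe_ids, toy_gallery, toy_gallery_ids)
```

In the toy example the third probe is closer to identity 0 than to its true identity 1, so two of the three probes are matched correctly.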

Related Work

Generative Adversarial Network (GAN)

A Generative Adversarial Network is a framework of adversarial nets that aims to find a Nash equilibrium between two networks, the generator and the discriminator. The networks are trained simultaneously, with the generator learning to capture the data distribution and the discriminator learning to distinguish real samples from generated ones.
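
The two competing objectives can be sketched as follows. This is a minimal NumPy illustration of the standard GAN losses computed from discriminator outputs in (0, 1), not the loss used in TP-GAN, which adds further terms. The function names and the `eps` stabiliser are assumptions for illustration, and the generator loss shown is the commonly used non-saturating variant.

```python
import numpy as np


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """-(E[log D(x)] + E[log(1 - D(G(z)))]); the discriminator minimises this.

    d_real: discriminator outputs on real samples, values in (0, 1).
    d_fake: discriminator outputs on generated samples, values in (0, 1).
    """
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    real_term = np.mean(np.log(d_real + eps))
    fake_term = np.mean(np.log(1.0 - d_fake + eps))
    return float(-(real_term + fake_term))


def generator_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss: -E[log D(G(z))]."""
    d_fake = np.asarray(d_fake, dtype=float)
    return float(-np.mean(np.log(d_fake + eps)))
```

When the discriminator cannot tell real from generated samples, it outputs 0.5 everywhere and `discriminator_loss` equals 2 log 2, which corresponds to the equilibrium mentioned above.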

Face Frontalization

Face frontalization is an ill-posed problem and is challenging because certain facial features are occluded in non-frontal views. Huang et al. proposed a dual-path methodology that separately learns the global face structure and the local details. Compared with previous methods, this largely improves the definition of the result. The method is widely adopted in recent work on related topics and will also be used in our network.

Pose-Invariant Face Alignment (PIFA)

Face alignment is the process of applying a model learned under supervision to a face image to estimate the locations of a set of facial landmarks, such as eye corners and mouth corners. It is a key module in the pipeline of most facial analysis algorithms, normally placed after face detection and before subsequent feature extraction and classification.
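
Once landmarks have been estimated, a typical next step in such a pipeline is to normalise the face geometrically before feature extraction. The sketch below shows one simple option: a similarity transform (rotation, uniform scale and translation) that maps the two detected eye centers onto chosen target positions. The function names and the coordinates are hypothetical, and this is only one of several possible normalisation schemes.

```python
import numpy as np


def similarity_from_eyes(left_eye, right_eye, target_left, target_right):
    """Return a 2 x 3 matrix that maps the detected eye centers onto the
    target eye positions using rotation, uniform scale and translation."""
    left_eye = np.asarray(left_eye, dtype=float)
    target_left = np.asarray(target_left, dtype=float)
    src = np.asarray(right_eye, dtype=float) - left_eye
    dst = np.asarray(target_right, dtype=float) - target_left
    scale = np.linalg.norm(dst) / np.linalg.norm(src)
    # Angle that rotates the source eye vector onto the target eye vector.
    angle = np.arctan2(dst[1], dst[0]) - np.arctan2(src[1], src[0])
    cos_a, sin_a = scale * np.cos(angle), scale * np.sin(angle)
    rotation = np.array([[cos_a, -sin_a], [sin_a, cos_a]])
    translation = target_left - rotation @ left_eye
    return np.hstack([rotation, translation[:, None]])


def apply_transform(matrix, points):
    """Apply a 2 x 3 transform to an N x 2 array of (x, y) points."""
    points = np.asarray(points, dtype=float)
    return points @ matrix[:, :2].T + matrix[:, 2]


# Example with made-up coordinates: a tilted eye line is mapped to a level one.
eye_transform = similarity_from_eyes((30, 60), (90, 40), (40, 50), (88, 50))
aligned_eyes = apply_transform(eye_transform, [(30, 60), (90, 40)])
```

The same matrix can then be applied to the remaining landmarks, or passed to an image warping routine, so that all faces share a common eye position.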

Project Schedule and Milestones

Sept 2019: Confirmation of project topic and literature review, Prepare and purchase dataset, Project planning and report, Building project website
Oct: Set up GPU server, Load dataset, Start implementation of feature extraction network and TP-GAN, Train and test the model
Nov: Continue implementation process, Literature review and integration of the two networks
Dec: Expect to have first deliverable of a working model, Prepare demo for interim presentation, Continue implementation of single neural network
Jan 2020: Interim presentation and report, Submit preliminary implementation of model
Feb: Improve existing neural network, Explore methods for improving identity preservation and other aspects of algorithm performance
Mar: Start drafting final report and preparing for poster presentation

Our team

Supervisor: Dr. Kenneth K.Y. Wong

Associate Professor
Dept. of Computer Science
The University of Hong Kong

Wu Haoyu

Li Xueer

Contact us

Do not hesitate to contact the team if you have any comments or suggestions.