MANUSCRIPT, APRIL 2021

Appearance-based Gaze Estimation With Deep Learning: A Review and Benchmark

Yihua Cheng¹, Haofei Wang², Yiwei Bao¹, Feng Lu¹,²,*

¹State Key Laboratory of Virtual Reality Technology and Systems, SCSE, Beihang University, China.
²Peng Cheng Laboratory, Shenzhen, China.

yihua_c, baoyiwei, [email protected],
[email protected]

Fig. 1. Deep learning based gaze estimation relies on simple devices and complex deep learning algorithms to estimate human gaze. It usually uses off-the-shelf cameras to capture facial appearance and employs deep learning algorithms to regress gaze from the appearance. (Figure labels: Deep learning algorithms, Chip, Platform, Calibration, Feature, Model.)

Abstract—Gaze estimation reveals where a person is looking. It is an important clue for understanding human intention. The recent development of deep learning has revolutionized many computer vision tasks, and appearance-based gaze estimation is no exception. However, the field lacks a guideline for designing deep learning algorithms for gaze estimation tasks. In this paper, we present a comprehensive review of appearance-based gaze estimation methods with deep learning. We summarize the processing pipeline and discuss these methods from four perspectives: deep feature extraction, deep neural network architecture design, personal calibration, and device and platform. Since data pre-processing and post-processing methods are crucial for gaze estimation, we also survey face/eye detection methods, data rectification methods, 2D/3D gaze conversion methods, and gaze origin conversion methods. To fairly compare the performance of various gaze estimation approaches, we characterize all the publicly available gaze estimation datasets and collect the code of typical gaze estimation algorithms. We implement these codes and set up a benchmark of converting the results of different methods