Gaze Estimation and Prediction in the Wild

ICCV 2019 Workshop, Seoul, Korea

Sunday October 27 2019, 8:30am - 12:50pm
Location: Room 318 A

Please subscribe to our new mailing list to gain access to our speakers' slides and future updates!


The 1st Workshop on Gaze Estimation and Prediction in the Wild (GAZE 2019) at ICCV 2019 is the first-of-its-kind workshop focused on designing and evaluating deep learning methods for the task of gaze estimation and prediction. We aim to encourage and highlight novel strategies with a focus on robustness and accuracy in real-world settings. This is expected to be achieved via novel neural network architectures, incorporating anatomical insights and constraints, introducing new and challenging datasets, and exploiting multi-modal training among other directions. This half-day workshop consists of three invited talks as well as talks from industry contributors.

The topics of this workshop include but are not limited to:

  • Proposal of novel eye detection, gaze estimation, and gaze prediction pipelines using deep convolutional neural networks.
  • Incorporating geometric and anatomical constraints into neural networks in a differentiable manner.
  • Demonstration of robustness to conditions where current methods fail (illumination, appearance, low resolution, etc.).
  • Robust estimation from different data modalities such as RGB, depth, and near infra-red.
  • Leveraging additional cues such as task context, temporal information, eye movement classification.
  • Designing new accurate metrics to account for rapid eye movements in the real world.
  • Semi-supervised / unsupervised / self-supervised learning, meta-learning, domain adaptation, attention mechanisms and other related machine learning methods for gaze estimation.
  • Methods for temporal gaze estimation and prediction, including Bayesian methods.

Call for Contributions

Full Workshop Papers

Submission: We invite authors to submit unpublished papers (8-page ICCV format) to our workshop, to be presented at a poster session upon acceptance. All submissions will go through a double-blind review process. All contributions must be submitted through CMT at the following link:

Extended Abstracts

In addition to regular papers, we also invite extended abstracts of ongoing or published work (e.g. related papers in the ICCV main track). The extended abstracts will not be published or made available to the public (we will only list titles on our website); they will instead be presented during our poster session. We see this as an opportunity for authors to promote their work to an interested audience and gather valuable feedback.

Extended abstracts are limited to three pages and must be created using this LaTeX template. The submission must be sent to by 16th September.

We will evaluate and notify authors of acceptance as soon as possible after receiving their extended abstract submission.

Accepted ICCV/CVPR Papers

Relevant papers that were accepted at the main conference (ICCV 2019) or at CVPR 2019 are welcome to be presented during our poster session to increase the exposure of your work and foster discussion in the community. Please send a PDF document of your camera-ready paper to at any time to register your presence.

Important Dates

Paper Submission Deadline: July 31, 2019, 23:59 Pacific time (extended from July 29)
Notification to Authors: August 18, 2019 (moved from August 16)
Camera-Ready Deadline: August 30, 2019
Extended Abstracts Deadline: September 16, 2019 (extended from September 6)
Workshop Date: October 27, 2019 (Morning)

Workshop Schedule

# Time Item
1 8:30am - 8:35am Welcome and Opening Remarks
2 8:35am - 10:05am Keynote Talks
  • Yusuke Sugano (University of Tokyo)
  • Jean-Marc Odobez (Idiap and EPFL)
3 10:05am - 10:20am Accepted Full Paper Lightning Talks
4 10:20am - 11:00am Coffee Break and Poster Session
5 11:00am - 12:15pm Industry Keynote Talks
  • Jae-Joon Han (Samsung Advanced Institute of Technology)
  • Shalini De Mello (Nvidia)
  • Maria Gordon (Tobii)
6 12:15pm - 12:40pm Panel Discussion
7 12:40pm - 12:50pm Presentation of Awards and Closing Remarks

Invited Keynote Speakers

Yusuke Sugano
University of Tokyo

Appearance-based Gaze Estimation: What We Have Done and What We Should Do


Since its first appearance in the 90s, appearance-based gaze estimation has been gradually but steadily gaining attention until now. This talk aims at providing a brief overview of past research achievements in the area of appearance-based gaze estimation, mainly from the perspective of both personalization and generalization techniques. I will also discuss some remaining challenges towards the ultimate goal of camera-based versatile gaze estimation techniques.


Yusuke Sugano is an associate professor at the Institute of Industrial Science, The University of Tokyo. His research interests focus on computer vision and human-computer interaction. He received his Ph.D. in information science and technology from the University of Tokyo in 2010. He was previously an associate professor at the Graduate School of Information Science and Technology, Osaka University, a postdoctoral researcher at the Max Planck Institute for Informatics, and a project research associate at the Institute of Industrial Science, The University of Tokyo.

Jean-Marc Odobez
Idiap Research Institute and EPFL

Measuring attention in interactions: from context based multimodal head pose analysis to 3D gaze estimation


Beyond words, non-verbal behaviors (NVB) are known to play important roles in face-to-face interactions. However, decoding non-verbal behaviors is a challenging problem that involves both extracting subtle physical NVB cues and mapping them to higher-level communication behaviors or social constructs. This is particularly the case of gaze, one of the most important non-verbal behaviors with functions related to communication and social signaling.

In this talk, I will present our past and current work towards the automatic analysis of attention (whether 3D gaze or its discrete version the Visual Focus of Attention, VFOA) in situations where large user mobility is expected and minimal intrusion is required. I will first introduce how we addressed VFOA recognition in meetings using Dynamical Bayesian Networks to jointly model speech conversation, gaze (represented by head pose), and task context. I will then present recent methods investigated to perform 3D gaze tracking, including robust and accurate 3D head pose tracking under 360 degrees as well as the use of several deep neural network architectures for appearance-based gaze estimation. The latter will include methods to build personalized models through few-shot learning and gaze redirection eye synthesis, differential gaze estimation, and online learning or adaptation, potentially taking advantage of priors on social interactions to obtain weak labels for model adaptation.


Dr. Jean-Marc Odobez received his PhD from Rennes University/INRIA in 1994 and was, from 1996 to 2001, Assistant Professor at the University of Maine, France. He is now a Senior Researcher at Idiap and adjunct faculty at the École Polytechnique Fédérale de Lausanne (EPFL) where he is a member of the School of Engineering (STI).

He is the author or coauthor of more than 150 papers, and has been the principal investigator of more than 14 European and Swiss projects. He holds several patents in computer vision, and is the cofounder of the companies Klewel SA and Eyeware SA. He is a member of the IEEE, and Associate Editor of the journals IEEE Transactions on Circuits and Systems for Video Technology and Machine Vision and Applications.

Invited Industry Speakers

Jae-Joon Han
Samsung Advanced Institute of Technology

Jae-Joon Han is a Master of the AI & SW Research Center at Samsung Advanced Institute of Technology (SAIT), the corporate research arm of Samsung Electronics. He received his Ph.D. degree in Electrical and Computer Engineering from Purdue University in 2006 and was a postdoctoral fellow at Purdue before he joined SAIT in 2007. Since then, he has mainly focused on developing computer vision and machine learning algorithms that enable users to interact with devices in novel ways. He is currently leading a project related to on-device facial recognition. His research interests include facial recognition, facial anti-spoofing, speaker verification, neural network model compression for on-device processing, and gaze estimation.


Shalini De Mello
NVIDIA

Shalini De Mello is a Principal Research Scientist at NVIDIA. Her research interests are in computer vision and machine learning for human-computer interaction and smart interfaces. At NVIDIA, she has invented technologies for gaze estimation, 2D and 3D head pose estimation, hand gesture recognition, face detection, and video stabilization, as well as GPU-optimized libraries for mobile computer vision. Her research over the past several years has pushed the envelope of HCI in cars and has led to the development of NVIDIA’s innovative DriveIX product for smart AI-based automotive interfaces for future generations of cars. She received doctoral and master’s degrees in Electrical and Computer Engineering from the University of Texas at Austin in 2008 and 2004, respectively.

Maria Gordon
Tobii

Gaze into the future


Just a few years ago, eye tracking was still considered a research tool and a niche product used, for example, to give people with assistive needs a voice. Today, however, eye tracking is broadcast at major sports gaming tournaments and has become integrated into commercial off-the-shelf PCs and VR headsets.
A major pull for eye tracking technology today comes from the XR space. While eye tracking already adds value to current XR products, future XR hardware, such as dynamic focus displays, simply won’t work without refined eye tracking solutions. In this talk we’ll describe what role eye tracking will play in future remote and near-eye setups and where eye tracking simply needs to succeed.


Maria has been working with software and algorithm development for more than 15 years. During her career she has held a variety of roles from developer to research lead at companies like Philips, Infineon and Saab. Maria joined Tobii’s eye tracking algorithm team in 2015 as a Key Algorithm Engineer.

Building an eye tracker that works for every single individual is still a challenge. Maria has seen up close how PC eye tracking has moved from being a specialty market and research tool to mainstream gaming equipment. She has also been closely involved in the development of eye tracking for VR and AR, and has been part of the work behind releasing products such as the HTC Vive Pro Eye.


Best Paper Award sponsored by

On-device Few-shot Personalization for Real-time Gaze Estimation
Junfeng He, Khoi Pham, Nachiappan Valliappan, Pingmei Xu, Chase Roberts, Vidhya Navalpakkam, Dmitry Lagun

Best Poster Award sponsored by

RT-BENE: A dataset and baselines for Real-Time Blink Estimation in Natural Environments
Kévin Cortacero, Tobias Fischer, Yiannis Demiris

We would like to thank for sponsoring our costs.

Accepted Full Papers

A Generalized and Robust Method Towards Practical Gaze Estimation on Smart Phone
Tianchu Guo, Yongchao Liu, Hui Zhang, Xiabing Liu, Youngjun Kwak, ByungIn Yoo, Jae-Joon Han, Changkyu Choi
PDF (CVF) arXiv

Learning to Personalize in Appearance-Based Gaze Tracking
Erik Lindén, Jonas Sjöstrand, Alexandre Proutiere
PDF (CVF) arXiv

On-device Few-shot Personalization for Real-time Gaze Estimation
Junfeng He, Khoi Pham, Nachiappan Valliappan, Pingmei Xu, Chase Roberts, Vidhya Navalpakkam, Dmitry Lagun

RT-BENE: A dataset and baselines for Real-Time Blink Estimation in Natural Environments
Kévin Cortacero, Tobias Fischer, Yiannis Demiris

SalGaze: Personalizing Gaze Estimation using Visual Saliency
Zhuoqing Chang, J. Matias Di Martino, Qiang Qiu, Guillermo Sapiro
PDF (CVF) arXiv

Accepted Posters

Extended Abstracts
High-Speed Pupil Tracking based on Deep Random Forests
MinJi Park, Mira Jeong, ByoungChul Ko

From the Main Conference (ICCV/CVPR)
Few-Shot Adaptive Gaze Estimation
Seonwook Park, Shalini De Mello, Pavlo Molchanov, Umar Iqbal, Otmar Hilliges, Jan Kautz
PDF (CVF) arXiv

Photo-Realistic Monocular Gaze Redirection Using Generative Adversarial Networks
Zhe He, Adrian Spurr, Xucong Zhang, Otmar Hilliges

Gaze360: Physically Unconstrained Gaze Estimation in the Wild
Petr Kellnhofer, Adrià Recasens, Simon Stent, Wojciech Matusik, Antonio Torralba

Mixed Effects Neural Networks (MeNets) with Applications to Gaze Estimation
Yunyang Xiong, Hyunwoo J. Kim, Vikas Singh
PDF (CVF) Code

Understanding Human Gaze Communication by Spatio-Temporal Graph Reasoning
Lifeng Fan, Wenguan Wang, Siyuan Huang, Xinyu Tang, Song-Chun Zhu
PDF (CVF) arXiv

Improving Few-Shot User-Specific Gaze Adaptation via Gaze Redirection Synthesis
Yu Yu, Gang Liu, Jean-Marc Odobez
PDF (CVF) arXiv


Organizers

Hyung Jin Chang
University of Birmingham
Seonwook Park
ETH Zürich
Xucong Zhang
ETH Zürich
Otmar Hilliges
ETH Zürich
Aleš Leonardis
University of Birmingham

Program Committee

Minjie Cai
Hunan University
Hyung Jin Chang
University of Birmingham
Eunji Chong
Georgia Tech
Tobias Fischer
Imperial College London
Wolfgang Fuhl
University of Tübingen
Otmar Hilliges
ETH Zürich
Nora Horanyi
University of Birmingham
Yifei Huang
University of Tokyo
Enkelejda Kasneci
University of Tübingen
Hyunwoo J. Kim
Korea University
Aleš Leonardis
University of Birmingham
Miao Liu
Georgia Tech
Seonwook Park
ETH Zürich
Nataniel Ruiz
Boston University
Arantzazu Villanueva
Public University of Navarre
Yu Yu
Xucong Zhang
ETH Zürich
Rui Zhao
Amazon Research

Workshop sponsored by:

A special thank you to our industry liaisons:

Changkyu Choi
Samsung Advanced Institute of Technology