
Date: Tuesday, 18th June 2024
Time: 8:30 AM – 12:30 PM (half-day)
Location: Arch 309


The 6th International Workshop on Gaze Estimation and Prediction in the Wild (GAZE 2024) at CVPR 2024 aims to encourage and highlight novel strategies for eye gaze estimation and prediction. The workshop topics include (but are not limited to):

  • Enhancing eye image segmentation, landmark localization, gaze estimation, and other tasks in extended and augmented reality (XR/AR) settings.
  • Novel multi-modal systems for incorporating gaze information to improve visual recognition tasks.
  • Improving eye detection, gaze estimation, and gaze prediction pipelines in various ways, such as by applying geometric and anatomical constraints, leveraging additional cues such as head pose, scene content, or considering multi-modal inputs.
  • Developing adversarial or domain generalization methods to improve cross-dataset performance or to deal with conditions where current methods fail (illumination, appearance, etc.).
  • Exploring attention mechanisms and temporal information to predict the point of regard.
  • Novel methods for temporal gaze estimation and prediction including Bayesian methods.
  • Personalization of gaze estimators with methods such as few-shot learning.
  • Semi-/weak-/un-/self- supervised learning methods, domain adaptation methods, and other novel methods towards improved representation learning from eye/face region images or gaze target region images.
We will host two invited speakers and, as in previous editions of the workshop, accept submissions of full unpublished papers. These papers will be peer-reviewed via a double-blind process, published in the official workshop proceedings, and presented at the workshop itself.

Call for Contributions

Full Workshop Papers

Submission: We invite authors to submit unpublished papers (8-page CVPR format) to our workshop, to be presented at a poster session upon acceptance. All submissions will go through a double-blind review process. All contributions must be submitted (along with supplementary materials, if any) at this CMT link.

Accepted papers will be published in the official CVPR Workshops proceedings and the Computer Vision Foundation (CVF) Open Access archive.

Note: Authors of previously rejected main conference submissions are also welcome to submit their work to our workshop. When doing so, you must include the previous reviewers' comments (named previous_reviews.pdf) and a letter of changes (named letter_of_changes.pdf) as part of your supplementary materials, to clearly demonstrate the changes made in response to those reviews.

Important Dates

Paper Submission Deadline: March 15, 2024 (23:59 Pacific Time)
Notification to Authors: April 5, 2024
Camera-Ready Deadline: April 14, 2024
Workshop Day: June 18, 2024

Invited Keynote Speakers

Feng Xu
Tsinghua University
Biography

Feng Xu is currently an associate professor at the School of Software, Tsinghua University, Beijing, China. He earned a Ph.D. in automation and a B.S. in physics from Tsinghua University in 2012 and 2007, respectively. Until 2015, he was a researcher in the Internet Graphics group at Microsoft Research Asia. His research interests include human body reconstruction, face animation, and medical image analysis. He has authored more than 40 conference and journal papers in these areas, in venues including Nature Medicine, SIGGRAPH, CVPR, ICCV, ECCV, and PAMI.

Alexander Fix
Meta Reality Labs Research

Challenges in Near Eye Tracking for AR/VR

Biography

Alexander Fix is a Research Scientist at Meta Reality Labs Research, where he has worked on eye tracking and related topics for the past nine years. His research interests include 3D reconstruction, NeRF and other implicit reconstructions, and geometric eye tracking. At Meta he has collaborated extensively on eye tracking hardware research, particularly on novel sensing methods such as event cameras. He graduated from Cornell University in 2016 with a PhD in Computer Science, and from the University of Chicago in 2009 with a BS in Computer Science and Mathematics.

Accepted Full Papers

Spatio-Temporal Attention and Gaussian Processes for Personalized Video Gaze Estimation
Swati Jindal, Mohit Yadav, Roberto Manduchi
arXiv Code

Exploring the Zero-Shot Capabilities of Vision-Language Models for Improving Gaze Following
Anshul Gupta, Pierre Vuillecard, Arya Farkhondeh, Jean-Marc Odobez

Gaze Scanpath Transformer: Predicting Visual Search Target by Spatiotemporal Semantic Modeling of Gaze Scanpath
Takumi Nishiyasu, Yoichi Sato

GESCAM: A Dataset and Method on Gaze Estimation for Classroom Attention Measurement
Athul Mathew, Arshad Khan, Thariq Khalid, Riad Souissi

Invited Posters

What Do You See in Vehicle? Comprehensive Vision Solution for In-Vehicle Gaze Estimation
Yihua Cheng, Yaning Zhu, Zongji Wang, Hongquan Hao, Liu Wei, Shiqing Cheng, Xi Wang, Hyung Jin Chang

Learning from Observer Gaze: Zero-shot Attention Prediction Oriented by Human-Object Interaction Recognition
Yuchen Zhou, Linkai Liu, Chao Gou

Sharingan: A Transformer Architecture for Multi-Person Gaze Following
Samy Tafasca, Anshul Gupta, Jean-Marc Odobez


Organizers

Hyung Jin Chang
University of Birmingham
Xucong Zhang
Delft University of Technology
Shalini De Mello
NVIDIA Research
Seonwook Park
Lunit Inc.

Jean-Marc Odobez
EPFL & Idiap Research Institute
Yihua Cheng
University of Birmingham
Xi Wang
ETH Zürich
Otmar Hilliges
ETH Zürich
Aleš Leonardis
University of Birmingham

Website Chair

Hengfei Wang
University of Birmingham

Please contact me if you have any questions about this website.

Workshop sponsored by: