GAZE 2024: Gaze Estimation and Prediction in the Wild

Image credit to DALL·E

Date:	Tuesday, 18th June 2024
Time:	8:30 AM – 12:30 PM (half-day)
Location:	Arch 309

Introduction

The 6th International Workshop on Gaze Estimation and Prediction in the Wild (GAZE 2024) at CVPR 2024 aims to encourage and highlight novel strategies for eye gaze estimation and prediction. The workshop topics include (but are not limited to):

Enhancing eye image segmentation, landmark localization, gaze estimation and other tasks in mixed and augmented reality (XR / AR) settings.
Novel multi-modal systems for incorporating gaze information to improve visual recognition tasks.
Improving eye detection, gaze estimation, and gaze prediction pipelines, for example by applying geometric and anatomical constraints, leveraging additional cues such as head pose and scene content, or considering multi-modal inputs.
Developing adversarial or domain generalization methods to improve cross-dataset performance or to deal with conditions where current methods fail (illumination, appearance, etc.).
Exploring attention mechanisms and temporal information to predict the point of regard.
Novel methods for temporal gaze estimation and prediction, including Bayesian methods.
Personalization of gaze estimators with methods such as few-shot learning.
Semi-/weakly-/un-/self-supervised learning methods, domain adaptation methods, and other novel methods towards improved representation learning from eye/face region images or gaze target region images.

We will host two invited speakers and, as in previous editions of the workshop, accept submissions of full unpublished papers. These papers will be peer-reviewed via a double-blind process, published in the official workshop proceedings, and presented at the workshop itself.

Call for Contributions

Full Workshop Papers

Submission: We invite authors to submit unpublished papers (8-page CVPR format) to our workshop, to be presented at a poster session upon acceptance. All submissions will go through a double-blind review process. All contributions must be submitted (along with supplementary materials, if any) at this CMT link.

Accepted papers will be published in the official CVPR Workshops proceedings and the Computer Vision Foundation (CVF) Open Access archive.

Note: Authors of previously rejected main conference submissions are also welcome to submit their work to our workshop. When doing so, you must include the previous reviewers' comments (named previous_reviews.pdf) and a letter of changes (named letter_of_changes.pdf) in your supplementary materials to clearly demonstrate how those comments have been addressed.

Important Dates

Paper Submission Deadline	March 15, 2024 (23:59 Pacific time)
Notification to Authors	April 5, 2024
Camera-Ready Deadline	April 14, 2024
Workshop Day	June 18, 2024

Workshop Schedule

Time in UTC	Start Time in UTC*	Item
3:30pm - 3:35pm	18 Jun 2024 15:30:00 UTC	Opening remarks
3:35pm - 4:15pm	18 Jun 2024 15:35:00 UTC	Invited talk by Feng Xu
4:15pm - 4:55pm	18 Jun 2024 16:15:00 UTC	Invited talk by Alexander Fix
4:55pm - 5:10pm	18 Jun 2024 16:55:00 UTC	Invited poster spotlight talks
5:10pm - 6:10pm	18 Jun 2024 17:10:00 UTC	Coffee break & poster presentations
6:10pm - 6:50pm	18 Jun 2024 18:10:00 UTC	Workshop paper presentations
6:50pm - 7:00pm	18 Jun 2024 18:50:00 UTC	Awards & closing remarks

* On the live page, this column is converted to your computer's reported time zone: for example, those in Los Angeles may see UTC-7, while those in Berlin may see UTC+2. The times listed here are in UTC.

Please note that the reported time zone may differ from your actual time zone.

Invited Keynote Speakers

Feng Xu

Tsinghua University

Eye Region Reconstruction with a Monocular Camera

Abstract

Eye region reconstruction is an important yet challenging task in computer vision and graphics. It suffers from complicated geometry and motions, severe occlusions, and eyeglass interference, so existing methods have to trade off capture cost against reconstruction quality. We focused on low-cost capture setups and proposed novel algorithms to achieve high-quality eye region reconstruction under limited inputs. In addition, we tried to address eyeglass interference, which lays the foundation for high-quality eye region reconstruction. We have also tried to apply eye region reconstruction in medicine for disease diagnosis.

Biography

Feng Xu is currently an associate professor at the School of Software, Tsinghua University, Beijing, China. He earned a Ph.D. in automation and a B.S. in physics from Tsinghua University in 2012 and 2007, respectively. Until 2015, he was a researcher in the Internet Graphics group at Microsoft Research Asia. His research interests include human body reconstruction, face animation, and medical image analysis. He has authored more than 40 conference and journal papers in these areas, with venues including Nature Medicine, SIGGRAPH, CVPR, ICCV, ECCV, and PAMI.

Alexander Fix

Meta Reality Labs Research

Challenges in Near Eye Tracking for AR/VR

Abstract

Augmented and Virtual Reality (AR/VR) have incredible potential for using eye tracking to power the future of computing, but also pose incredible challenges in making eye tracking that works for everyone, all the time. In this talk, I will discuss some of the work we’re doing here at Meta Reality Labs to build eye tracking into AR/VR, as well as the key areas where the CVPR and GAZE 2024 community can help solve the hardest problems in this space. We will highlight Aria – the ET-enabled research glasses from Meta – and how they are a great tool for investigating applications of eye tracking. We will also show some new approaches to eye tracking, based on event cameras, polarization, and more.

Biography

Alexander Fix is a Research Scientist at Meta Reality Labs Research, where he has worked on eye tracking and related topics for the last 9 years. His research interests include 3D reconstruction, NeRF and other implicit reconstructions, and geometric eye tracking. His collaborations at Meta include extensive eye tracking hardware research, particularly on novel eye tracking methods such as event cameras. He graduated from Cornell in 2016 with a PhD in Computer Science, and from the University of Chicago in 2009 with a BS in Computer Science and Mathematics.

Awards

Best Paper Award

Exploring the Zero-Shot Capabilities of Vision-Language Models for Improving Gaze Following
Anshul Gupta, Pierre Vuillecard, Arya Farkhondeh, Jean-Marc Odobez

PDF (CVF)

Best Poster Award

Gaze Scanpath Transformer: Predicting Visual Search Target by Spatiotemporal Semantic Modeling of Gaze Scanpath
Takumi Nishiyasu, Yoichi Sato

PDF (CVF)

Accepted Full Papers

Spatio-Temporal Attention and Gaussian Processes for Personalized Video Gaze Estimation
Swati Jindal, Mohit Yadav, Roberto Manduchi

PDF (CVF) Suppl. (CVF) arXiv Code

Exploring the Zero-Shot Capabilities of Vision-Language Models for Improving Gaze Following
Anshul Gupta, Pierre Vuillecard, Arya Farkhondeh, Jean-Marc Odobez

PDF (CVF) Suppl. (CVF)

Gaze Scanpath Transformer: Predicting Visual Search Target by Spatiotemporal Semantic Modeling of Gaze Scanpath
Takumi Nishiyasu, Yoichi Sato

PDF (CVF)

GESCAM: A Dataset and Method on Gaze Estimation for Classroom Attention Measurement
Athul Mathew, Arshad Khan, Thariq Khalid, Riad Souissi

PDF (CVF) Project Page

Invited Posters

What Do You See in Vehicle? Comprehensive Vision Solution for In-Vehicle Gaze Estimation
Yihua Cheng, Yaning Zhu, Zongji Wang, Hongquan Hao, Liu Wei, Shiqing Cheng, Xi Wang, Hyung Jin Chang

PDF (CVF) Suppl. (CVF) Project Page arXiv Code

Learning from Observer Gaze: Zero-shot Attention Prediction Oriented by Human-Object Interaction Recognition
Yuchen Zhou, Linkai Liu, Chao Gou

PDF (CVF) Suppl. (CVF)

Sharingan: A Transformer Architecture for Multi-Person Gaze Following
Samy Tafasca, Anshul Gupta, Jean-Marc Odobez

PDF (CVF) Suppl. (CVF) arXiv

From Feature to Gaze: A Generalizable Replacement of Linear Layer for Gaze Estimation
Yiwei Bao, Feng Lu

PDF (CVF)