Image credit to Nano Banana Pro

Date: Thursday, 4th June 2026
Time: 1:00 PM – 5:00 PM (half-day)
Location: Room 711


Introduction

The 7th International Workshop on Eye and Gaze in Computer Vision (GAZE 2026) at CVPR 2026 aims to encourage and highlight novel strategies for eye gaze estimation and prediction. The workshop topics include (but are not limited to):

  • Foundation models and large-scale training for the eye and gaze.
  • Gaze in egocentric vision, physical AI learning, and human–robot interaction.
  • Understanding gaze in social interactions, human activities, and telepresence scenarios involving real or virtual agents and entities.
  • Gaze estimation algorithms, including 3D gaze estimation, point-of-regard estimation, gaze following, gaze zone classification, etc.
  • Detection and segmentation of the eye region, such as eye detection, pupil detection, eye-region landmark localization, etc.
  • Human eye modeling and generation, including synthesis and animation from images or videos, etc.
  • Eye gaze data collection, generation, and analysis, such as scanpath generation, etc.
  • Applications of gaze tracking and analysis in real-world scenarios, including VR/AR, mobile devices, PCs, etc.
We will host two invited speakers and, as in previous editions of the workshop, accept submissions of full unpublished papers. These papers will be peer-reviewed in a double-blind process, published in the official workshop proceedings, and presented at the workshop.


Call for Contributions


Full Workshop Papers

Submission: We invite authors to submit unpublished papers (8-page CVPR format) to our workshop, to be presented at a poster session upon acceptance. All submissions will go through a double-blind review process. All contributions must be submitted (along with supplementary materials, if any) on OpenReview.

Accepted papers will be published in the official CVPR Workshops proceedings and the Computer Vision Foundation (CVF) Open Access archive.

Note: Authors of previously rejected main conference submissions are also welcome to submit their work to our workshop. When doing so, you must include the previous reviewers' comments (named previous_reviews.pdf) and a letter of changes (named letter_of_changes.pdf) in your supplementary materials to clearly demonstrate how you addressed those comments.



Important Dates


Paper Submission Deadline: March 7, 2026 (23:59 Pacific time)
Notification to Authors: March 25, 2026
Camera-Ready Deadline: April 10, 2026


Workshop Schedule


Time in Denver | Start Time in UTC* (probably your time zone) | Item
1:00pm - 1:05pm | 4 Jun 2026 19:00:00 UTC | Opening Remarks
1:05pm - 1:45pm | 4 Jun 2026 19:05:00 UTC | Invited Talk by Jim Rehg
1:45pm - 3:25pm | 4 Jun 2026 19:45:00 UTC | Paper Presentations
3:25pm - 4:15pm | 4 Jun 2026 21:25:00 UTC | Poster Session & Coffee Break
4:15pm - 4:55pm | 4 Jun 2026 22:15:00 UTC | Invited Talk by Ken Pfeuffer
4:55pm - 5:00pm | 4 Jun 2026 22:55:00 UTC | Awards & Closing Remarks
* This time is calculated to be in your computer's reported time zone. For example, those in Los Angeles may see UTC-7, while those in Berlin may see UTC+2.

Please note that the reported time zone may differ from your actual time zone.
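
If you would rather compute the start times yourself than rely on your browser's reported time zone, the conversion can be sketched in a few lines of Python. This is only an illustrative sketch and not part of the workshop site: the helper name session_start is hypothetical, the IANA zone name "America/Denver" is our assumption for "Time in Denver", and the standard-library zoneinfo module needs time zone data to be available on your system (on Windows this may require installing the tzdata package).

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Assumption: the "Time in Denver" column corresponds to the IANA zone America/Denver.
DENVER = ZoneInfo("America/Denver")


def session_start(hour, minute, tz_name):
    """Convert a 4 June 2026 start time given in Denver time to the zone tz_name.

    session_start is a hypothetical helper written for this example.
    """
    denver_time = datetime(2026, 6, 4, hour, minute, tzinfo=DENVER)
    return denver_time.astimezone(ZoneInfo(tz_name))


if __name__ == "__main__":
    # Opening remarks start at 1:00pm in Denver.
    for zone in ("UTC", "America/Los_Angeles", "Europe/Berlin"):
        print(zone, session_start(13, 0, zone).isoformat())
```

Passing your own IANA zone name (for example "Asia/Tokyo") as tz_name gives the start time in that zone, with daylight saving time handled by the time zone database.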


Invited Keynote Speakers


James M. Rehg
University of Illinois Urbana-Champaign

Inference and Forecasting of 2D and 3D Gaze


Abstract

Gaze behavior is a key indicator of human attention and intention, and gaze measurement has been an active area of investigation for more than a century. In the past few decades there has been substantial progress in using computer vision models to estimate gaze from images and video, without explicitly measuring eye movements. At the same time, there has also been significant progress in forecasting future gaze behavior from video, which is an important task if we are to build AI agents that can work seamlessly with humans and proactively support their goals. Applications in developmental psychology and social robotics, in particular, can benefit from an understanding of gaze, as it plays a key role in social communication. In this talk, I will review the current state of the art in gaze analysis from video, with a focus on how to leverage modern foundation models effectively and perform analysis in a 3D context.

Biography

James M. Rehg (pronounced “ray”) is a Founder Professor of Computer Science and Industrial and Enterprise Systems Engineering at the University of Illinois Urbana-Champaign. Previously, he was a Professor in the School of Interactive Computing at the Georgia Institute of Technology, where he co-directed the Center for Health Analytics and Informatics (CHAI). He received his Ph.D. from CMU in 1995 and worked at the Cambridge Research Lab of DEC (and then Compaq) from 1995 to 2001, where he managed the computer vision research group. He received an NSF CAREER award in 2001 and a Raytheon Faculty Fellowship from Georgia Tech in 2005. He and his students have received a number of best paper awards, including best student paper awards at ICML 2005, BMVC 2010, Mobihealth 2014, and Face and Gesture 2015, as well as a Distinguished Paper Award from ACM IMWUT and a Method of the Year award from the journal Nature Methods. Dr. Rehg served as the General co-Chair for CVPR 2009 and the Program co-Chair for CVPR 2017. He has authored more than 200 peer-reviewed scientific papers and holds 26 issued US patents.

Ken Pfeuffer
Aarhus University

Eye-Hand Symbiosis


Abstract

Our eyes continuously reveal what we attend to and intend to do, yet most user interfaces today are still fundamentally controlled through the hands — from the mouse and touchscreen to spatial interaction. Eye-hand symbiosis offers a new paradigm in which the eyes indicate attention and intent, while the hands provide confirmation and manipulation. This has the potential to advance not only XR interaction, but digital interfaces more broadly by augmenting many existing hand-driven interactions. The shift is already becoming visible in XR, where eye-hand spatial interfaces are increasingly emerging as the new standard in devices and platforms from Apple, Google, and Samsung. In this talk, I will reflect on recent scientific and industrial progress in eye-hand interaction and discuss future directions for how eye-hand symbiosis may shape the next generation of human-computer interaction.

Biography

Ken Pfeuffer is a researcher, designer, and professor for future user interfaces at Aarhus University, where he leads the Extended Interaction group, which specializes in Human-Computer Interaction (HCI) and Spatial Computing for Extended Reality (XR). He has published over 75 scientific papers and received awards at ACM CHI, UIST, and SUI, including the ACM SIGCHI Special Recognition Award (2025). He is affiliated with the AI Danish Pioneer Center and COGAIN and regularly serves on program committees for HCI and XR research conferences and journals. He earned his PhD from Lancaster University (UK), completed a postdoc at Bundeswehr University (Germany), and interned at Microsoft and Google Research US. He has pioneered interaction paradigms such as Gaze+Pinch and Direct-Indirect gestures, shaping 3D interfaces in emerging spatial computing technology.

Awards

Best Paper Award sponsored by


How Much Future Helps? A Controlled Study of Future-Privileged Supervision for Causal Egocentric Gaze Estimation
Jia Li, Wenjie Zhao, Fnu Atisri, Sanskriti Aripineni, Shijian Deng, Jon E. Froehlich, Yuhang Zhao, Yapeng Tian

PDF (CVF)

Best Poster Award sponsored by


Learning Ego-Exo Visual Representations for Conversational Gaze Estimation
Anshul Gupta, Yijun Qian, Ruohan Gao, Ishwarya Ananthabhotla, Jean-Marc Odobez, Vamsi Krishna Ithapu, Calvin Murdock

PDF (CVF)


Accepted Full Papers

Learning Ego-Exo Visual Representations for Conversational Gaze Estimation
Anshul Gupta, Yijun Qian, Ruohan Gao, Ishwarya Ananthabhotla, Jean-Marc Odobez, Vamsi Krishna Ithapu, Calvin Murdock
PDF (CVF) Suppl. (CVF)

FlowScan: Self-Supervised Features and Flow Matching Regularization for Gaze Scanpath Prediction
Brahan Aklilu, Ofir Itzhak Shahar, Ohad Ben-Shahar
PDF (CVF)

Learning to Look: CLIP-Guided Dual-Crop Fusion for Head Position-Invariant Gaze Estimation
Sourav Lakhotia, Chaviti Vasantha Lakshmi, Aratrik Chattopadhyay
PDF (CVF)

SIGN: A Statistically-Informed Gaze Network for Gaze Time Prediction and Inference
Jianping Ye, Michel Wedel
PDF (CVF) arXiv

End-to-End Shared Attention Estimation via Group Detection with Feedback Refinement
Chihiro Nakatani, Norimichi Ukita, Jean-Marc Odobez
PDF (CVF) Project Page arXiv Code

How Much Future Helps? A Controlled Study of Future-Privileged Supervision for Causal Egocentric Gaze Estimation
Jia Li, Wenjie Zhao, Fnu Atisri, Sanskriti Aripineni, Shijian Deng, Jon E. Froehlich, Yuhang Zhao, Yapeng Tian
PDF (CVF)

RayGazeFM: Geometry-Grounded Foundation Adapters for Unified 3D Gaze, Point-of-Regard, and Gaze Target Estimation
Murari Ambati
PDF (CVF)


Invited Posters

Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing
Baifeng Shi, Stephanie Fu, Long Lian, Hanrong Ye, David Eigen, Aaron Reite, Jan Kautz, Boyi Li, David Chan, Trevor Darrell, Pavlo Molchanov, Danny Yin
PDF (CVF) Project Page arXiv Code

Beyond Scanpaths: Graph-Based Gaze Simulation in Dynamic Scenes
Luke Palmer, Petar Palasek, Hazem Abdelkawy
PDF (CVF) arXiv

DriverGaze360: OmniDirectional Driver Attention with Object-Level Guidance
Shreedhar Govil, Didier Stricker, Jason Rambach
PDF (CVF) Project Page arXiv Code

Forecasting 3D Scanpaths in Egocentric Video
Fiona Ryan, Immanuel Ananthabhotla, Pulkit Qian, Judy Hoffman, James M. Rehg, Vamsi Krishna Ithapu, Calvin Murdock
PDF (CVF)

Gaze Target Estimation with Concepts
Xu Cao, Houze Yang, Vipin Gunda, Zhongyi Zhou, Tianyu Xu, Adarsh Kowdle, Inki Kim, James Rehg
PDF (CVF) Code

GazeShift: Unsupervised Gaze Estimation and Dataset for VR
Gil Shapira, Ishay Goldin, Evgeny Artyomov, Niv Zehngut
PDF (CVF) arXiv Code

Omni-MMSI: Toward Identity-attributed Social Interaction Understanding
Xinpeng Li, Bolin Lai, Hardy Chen, Shijian Deng, Cihang Xie, Yuyin Zhou, James Matthew Rehg, Yapeng Tian
PDF (CVF) Project Page arXiv Code

See Through the Noise: Improving Domain Generalization in Gaze Estimation
Yanming Peng, Shijing Wang, Yaping Huang, Yi Tian
PDF (CVF) arXiv


Organizers



Yihua Cheng
University of Birmingham
Seonwook Park
NVIDIA Research
Xucong Zhang
Delft University of Technology
Xi Wang
ETH Zürich
Hengfei Wang
EPFL & Idiap Research Institute
Michael Stengel
NVIDIA Research


David Wong
Microsoft
Jean-Marc Odobez
EPFL & Idiap Research Institute
Aleš Leonardis
University of Birmingham
Shalini De Mello
NVIDIA Research
Hyung Jin Chang
University of Birmingham


Workshop sponsored by: