GAZE 2026: Eye and Gaze in Computer Vision


Date:	Thursday, 4th June 2026
Time:	1:00 PM – 5:00 PM (half-day)
Location:	Room 711

Introduction

The 7th International Workshop on Eye and Gaze in Computer Vision (GAZE 2026) at CVPR 2026 aims to encourage and highlight novel strategies for eye gaze estimation and prediction. The workshop topics include (but are not limited to):

Foundation models and large-scale training for the eye and gaze.
Gaze in egocentric vision, physical AI learning, and human–robot interaction.
Understanding gaze in social interactions, human activities, and telepresence scenarios involving real or virtual agents and entities.
Gaze estimation algorithms, including 3D gaze estimation, point-of-regard estimation, gaze following, and gaze zone classification.
Detection and segmentation of the eye region, such as eye detection, pupil detection, and eye-region landmark localization.
Human eye modeling and generation, including synthesis and animation from images or videos.
Eye gaze data collection, generation, and analysis, such as scanpath generation.
Applications of gaze tracking and analysis in real-world scenarios, including VR/AR, mobile devices, and PCs.

We will host two invited speakers and accept submissions of full, unpublished papers, as in previous editions of the workshop. These papers will be peer-reviewed in a double-blind process, published in the official workshop proceedings, and presented at the workshop.

Call for Contributions

Full Workshop Papers

Submission: We invite authors to submit unpublished papers (8-page CVPR format) to our workshop, to be presented at a poster session upon acceptance. All submissions will go through a double-blind review process. All contributions must be submitted (along with supplementary materials, if any) on OpenReview.

Accepted papers will be published in the official CVPR Workshops proceedings and the Computer Vision Foundation (CVF) Open Access archive.

Note: Authors of previously rejected main conference submissions are also welcome to submit their work to our workshop. When doing so, you must submit the previous reviewers' comments (named as previous_reviews.pdf) and a letter of changes (named as letter_of_changes.pdf) as part of your supplementary materials to clearly demonstrate the changes made to address the comments made by previous reviewers.

Important Dates

Paper Submission Deadline	March 7, 2026 (23:59 Pacific time)
Notification to Authors	March 25, 2026
Camera-Ready Deadline	April 10, 2026

Workshop Schedule

Time in Denver	Start Time in UTC*	Item
1:00pm - 1:05pm	4 Jun 2026 19:00:00 UTC	Opening Remarks
1:05pm - 1:45pm	4 Jun 2026 19:05:00 UTC	Invited Talk by Jim Rehg
1:45pm - 3:25pm	4 Jun 2026 19:45:00 UTC	Paper Presentations
3:25pm - 4:15pm	4 Jun 2026 21:25:00 UTC	Poster Session & Coffee Break
4:15pm - 4:55pm	4 Jun 2026 22:15:00 UTC	Invited Talk by Ken Pfeuffer
4:55pm - 5:00pm	4 Jun 2026 22:55:00 UTC	Awards & Closing Remarks

* Start times in this column are given in UTC. Denver observes Mountain Daylight Time (UTC-6) on this date; for comparison, Los Angeles is at UTC-7 and Berlin at UTC+2.

Please convert the start times to your own time zone as needed.
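For attendees who want to check the schedule against their own clock, the conversion can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `start_in_utc` and `start_at_offset` and the shortened item labels are made up for this example, and it assumes Denver is on Mountain Daylight Time (UTC-6) on the workshop date, using a fixed offset rather than a time zone database.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Denver is on Mountain Daylight Time (UTC-6) on 4 June 2026.
DENVER = timezone(timedelta(hours=-6), "MDT")

# Start times in Denver, copied from the schedule table (labels shortened).
SCHEDULE = [
    ("13:00", "Opening"),
    ("13:05", "Invited talk (Rehg)"),
    ("13:45", "Paper presentations"),
    ("15:25", "Poster session and coffee break"),
    ("16:15", "Invited talk (Pfeuffer)"),
    ("16:55", "Awards and closing"),
]


def start_in_utc(hhmm: str) -> datetime:
    """Convert a Denver start time such as '13:00' on the workshop day to UTC."""
    hour, minute = (int(part) for part in hhmm.split(":"))
    local = datetime(2026, 6, 4, hour, minute, tzinfo=DENVER)
    return local.astimezone(timezone.utc)


def start_at_offset(hhmm: str, utc_offset_hours: float) -> datetime:
    """Convert a Denver start time to a zone with the given fixed UTC offset."""
    target = timezone(timedelta(hours=utc_offset_hours))
    return start_in_utc(hhmm).astimezone(target)


if __name__ == "__main__":
    for hhmm, item in SCHEDULE:
        utc = start_in_utc(hhmm)
        berlin = start_at_offset(hhmm, 2)  # Berlin is at UTC+2 in June
        print(f"{hhmm} Denver | {utc:%H:%M} UTC | {berlin:%H:%M} UTC+2 | {item}")
```

A fixed offset keeps the sketch free of external time zone data; for dates on which daylight saving rules matter, a time zone database lookup would be the more robust choice.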

Invited Keynote Speakers

James M. Rehg

University of Illinois Urbana-Champaign

Inference and Forecasting of 2D and 3D Gaze

Abstract

Gaze behavior is a key indicator of human attention and intention, and gaze measurement has been an active area of investigation for more than a century. In the past few decades there has been substantial progress in using computer vision models to estimate gaze from images and video, without explicitly measuring eye movements. At the same time, there has also been significant progress in forecasting future gaze behavior from video, which is an important task if we are to build AI agents that can work seamlessly with humans and proactively support their goals. Applications in developmental psychology and social robotics, in particular, can benefit from an understanding of gaze, as it plays a key role in social communication. In this talk, I will review the current state of the art in gaze analysis from video, with a focus on how to leverage modern foundation models effectively and perform analysis in a 3D context.

Biography

James M. Rehg (pronounced “ray”) is a Founder Professor of Computer Science and Industrial and Enterprise Systems Engineering at the University of Illinois Urbana-Champaign. Previously, he was a Professor in the School of Interactive Computing at the Georgia Institute of Technology, where he co-directed the Center for Health Analytics and Informatics (CHAI). He received his Ph.D. from CMU in 1995 and worked at the Cambridge Research Lab of DEC (and then Compaq) from 1995 to 2001, where he managed the computer vision research group. He received an NSF CAREER award in 2001 and a Raytheon Faculty Fellowship from Georgia Tech in 2005. He and his students have received a number of best paper awards, including best student paper awards at ICML 2005, BMVC 2010, Mobihealth 2014, and Face and Gesture 2015, as well as a Distinguished Paper Award from ACM IMWUT and a Method of the Year award from the journal Nature Methods. Dr. Rehg served as the General co-Chair for CVPR 2009 and the Program co-Chair for CVPR 2017. He has authored more than 200 peer-reviewed scientific papers and holds 26 issued US patents.

Ken Pfeuffer

Aarhus University

Eye-Hand Symbiosis

Abstract

Our eyes continuously reveal what we attend to and intend to do, yet most user interfaces today are still fundamentally controlled through the hands — from the mouse and touchscreen to spatial interaction. Eye-hand symbiosis offers a new paradigm in which the eyes indicate attention and intent, while the hands provide confirmation and manipulation. This has the potential to advance not only XR interaction, but digital interfaces more broadly by augmenting many existing hand-driven interactions. The shift is already becoming visible in XR, where eye-hand spatial interfaces are increasingly emerging as the new standard in devices and platforms from Apple, Google, and Samsung. In this talk, I will reflect on recent scientific and industrial progress in eye-hand interaction and discuss future directions for how eye-hand symbiosis may shape the next generation of human-computer interaction.

Biography

Ken Pfeuffer is a researcher, designer, and professor for future user interfaces at Aarhus University, where he leads the Extended Interaction group, which specializes in Human-Computer Interaction (HCI) and Spatial Computing for Extended Reality (XR). He has published over 75 scientific papers and received awards at ACM CHI, UIST, and SUI, including the ACM SIGCHI Special Recognition Award (2025). He is affiliated with the AI Danish Pioneer Center and COGAIN and regularly serves on program committees for HCI and XR research conferences and journals. He earned his PhD from Lancaster University (UK), completed a postdoc at Bundeswehr University (Germany), and interned at Microsoft and Google Research US. He has pioneered interaction paradigms such as Gaze+Pinch and Direct-Indirect gestures, shaping 3D interfaces in emerging spatial computing technology.

Awards

Best Paper Award

How Much Future Helps? A Controlled Study of Future-Privileged Supervision for Causal Egocentric Gaze Estimation
Jia Li, Wenjie Zhao, Fnu Atisri, Sanskriti Aripineni, Shijian Deng, Jon E. Froehlich, Yuhang Zhao, Yapeng Tian

PDF (CVF)

Best Poster Award

Learning Ego-Exo Visual Representations for Conversational Gaze Estimation
Anshul Gupta, Yijun Qian, Ruohan Gao, Ishwarya Ananthabhotla, Jean-Marc Odobez, Vamsi Krishna Ithapu, Calvin Murdock

PDF (CVF)

Accepted Full Papers

Learning Ego-Exo Visual Representations for Conversational Gaze Estimation
Anshul Gupta, Yijun Qian, Ruohan Gao, Ishwarya Ananthabhotla, Jean-Marc Odobez, Vamsi Krishna Ithapu, Calvin Murdock

PDF (CVF) Suppl. (CVF)

FlowScan: Self-Supervised Features and Flow Matching Regularization for Gaze Scanpath Prediction
Brahan Aklilu, Ofir Itzhak Shahar, Ohad Ben-Shahar

PDF (CVF)

Learning to Look: CLIP-Guided Dual-Crop Fusion for Head Position-Invariant Gaze Estimation
Sourav Lakhotia, Chaviti Vasantha Lakshmi, Aratrik Chattopadhyay

PDF (CVF)

SIGN: A Statistically-Informed Gaze Network for Gaze Time Prediction and Inference
Jianping Ye, Michel Wedel

PDF (CVF) arXiv

End-to-End Shared Attention Estimation via Group Detection with Feedback Refinement
Chihiro Nakatani, Norimichi Ukita, Jean-Marc Odobez

PDF (CVF) Project Page arXiv Code

How Much Future Helps? A Controlled Study of Future-Privileged Supervision for Causal Egocentric Gaze Estimation
Jia Li, Wenjie Zhao, Fnu Atisri, Sanskriti Aripineni, Shijian Deng, Jon E. Froehlich, Yuhang Zhao, Yapeng Tian

PDF (CVF)

RayGazeFM: Geometry-Grounded Foundation Adapters for Unified 3D Gaze, Point-of-Regard, and Gaze Target Estimation
Murari Ambati

PDF (CVF)

Invited Posters

Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing
Baifeng Shi, Stephanie Fu, Long Lian, Hanrong Ye, David Eigen, Aaron Reite, Jan Kautz, Boyi Li, David Chan, Trevor Darrell, Pavlo Molchanov, Danny Yin

PDF (CVF) Project Page arXiv Code

Beyond Scanpaths: Graph-Based Gaze Simulation in Dynamic Scenes
Luke Palmer, Petar Palasek, Hazem Abdelkawy

PDF (CVF) arXiv

DriverGaze360: OmniDirectional Driver Attention with Object-Level Guidance
Shreedhar Govil, Didier Stricker, Jason Rambach

PDF (CVF) Project Page arXiv Code

Forecasting 3D Scanpaths in Egocentric Video
Fiona Ryan, Immanuel Ananthabhotla, Pulkit Qian, Judy Hoffman, James M. Rehg, Vamsi Krishna Ithapu, Calvin Murdock

PDF (CVF)

Gaze Target Estimation with Concepts
Xu Cao, Houze Yang, Vipin Gunda, Zhongyi Zhou, Tianyu Xu, Adarsh Kowdle, Inki Kim, James Rehg

PDF (CVF) Code

GazeShift: Unsupervised Gaze Estimation and Dataset for VR
Gil Shapira, Ishay Goldin, Evgeny Artyomov, Niv Zehngut

PDF (CVF) arXiv Code

Omni-MMSI: Toward Identity-attributed Social Interaction Understanding
Xinpeng Li, Bolin Lai, Hardy Chen, Shijian Deng, Cihang Xie, Yuyin Zhou, James Matthew Rehg, Yapeng Tian

PDF (CVF) Project Page arXiv Code

See Through the Noise: Improving Domain Generalization in Gaze Estimation
Yanming Peng, Shijing Wang, Yaping Huang, Yi Tian

PDF (CVF) arXiv

Organizers

Yihua Cheng

University of Birmingham

Seonwook Park

NVIDIA Research

Xucong Zhang

Delft University of Technology

Xi Wang

ETH Zürich

Hengfei Wang

EPFL & Idiap Research Institute

Michael Stengel

NVIDIA Research

David Wong

Microsoft

EPFL & Idiap Research Institute

University of Birmingham

NVIDIA Research

University of Birmingham
