3:00pm - 8:20pm UTC
Local Time | Time Zone | UTC Offset |
---|---|---|
8:00am - 1:20pm | PDT | UTC-7 |
11:00am - 4:20pm | EDT | UTC-4 |
4:00pm - 9:20pm | BST | UTC+1 |
5:00pm - 10:20pm | CEST | UTC+2 |
8:30pm - 1:50am (+1) | IST | UTC+5.5 |
11:00pm - 4:20am (+1) | CST | UTC+8 |
12:00am - 5:20am (+1) | KST | UTC+9 |
YouTube recording: https://youtu.be/WQ8azMW_dn8
The 3rd International Workshop on Gaze Estimation and Prediction in the Wild (GAZE 2021) at CVPR 2021 aims to encourage and highlight novel strategies for eye gaze estimation and prediction with a focus on robustness and accuracy in extended parameter spaces, both spatially and temporally. This is expected to be achieved by applying novel neural network architectures, incorporating anatomical insights and constraints, introducing new and challenging datasets, and exploiting multi-modal training. Specifically, the workshop topics include (but are not limited to):
- Reformulating eye detection, gaze estimation, and gaze prediction pipelines with deep networks.
- Incorporating geometric and anatomical constraints into the training of (sparse or dense) deep networks.
- Leveraging additional cues such as context from the face region and head pose information.
- Developing adversarial methods to deal with conditions where current methods fail (illumination, appearance, etc.).
- Exploring attention mechanisms to predict the point of regard.
- Designing new accurate measures to account for rapid eye gaze movement.
- Novel methods for temporal gaze estimation and prediction including Bayesian methods.
- Integrating differentiable components into 3D gaze estimation frameworks.
- Robust estimation from different data modalities such as RGB, depth, head pose, and eye region landmarks.
- Generic gaze estimation methods for handling extreme head poses and gaze directions.
- Using temporal information in eye tracking to provide consistent on-screen gaze estimates.
- Personalization of gaze estimators with few-shot learning (see the illustrative sketch after this list).
- Semi-/weakly-/un-/self-supervised learning methods, domain adaptation methods, and other novel approaches towards improved representation learning from eye/face region images or gaze target region images.
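To make the few-shot personalization topic above more concrete, one common recipe is to freeze a person-independent backbone and fit only a small person-specific output layer on a handful of calibration samples. The sketch below illustrates this idea; the module names, feature dimension, and calibration set size are assumptions for illustration only, not a reference implementation from any workshop paper.

```python
# Minimal sketch of few-shot gaze personalization (illustrative assumptions,
# not a specific published method).
import torch
import torch.nn as nn


class PersonalizedGazeNet(nn.Module):
    def __init__(self, backbone: nn.Module, feat_dim: int = 512):
        super().__init__()
        self.backbone = backbone              # pretrained, person-independent
        self.head = nn.Linear(feat_dim, 2)    # person-specific (pitch, yaw) head

    def forward(self, eye_patch: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(eye_patch))


def personalize(model: PersonalizedGazeNet,
                calib_images: torch.Tensor,   # (k, 3, H, W), k ~ 3-9 calibration samples
                calib_gaze: torch.Tensor,     # (k, 2) ground-truth gaze angles
                steps: int = 100, lr: float = 1e-3) -> PersonalizedGazeNet:
    """Few-shot adaptation: freeze the backbone, fit only the small head."""
    for p in model.backbone.parameters():
        p.requires_grad_(False)
    optimizer = torch.optim.Adam(model.head.parameters(), lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = torch.nn.functional.l1_loss(model(calib_images), calib_gaze)
        loss.backward()
        optimizer.step()
    return model
```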
Call for Contributions
Full Workshop Papers
Submission: We invite authors to submit unpublished papers (8-page CVPR format) to our workshop, to be presented at a poster session upon acceptance. All submissions will go through a double-blind review process. All contributions must be submitted (along with supplementary materials, if any) at this CMT link.
Accepted papers will be published in the official CVPR Workshops proceedings and the Computer Vision Foundation (CVF) Open Access archive.
Note: Authors of previously rejected main conference submissions are also welcome to submit their work to our workshop. When doing so, you must include the previous reviewers' comments (named previous_reviews.pdf) and a letter of changes (named letter_of_changes.pdf) in your supplementary materials to clearly demonstrate how the previous reviewers' comments have been addressed.
GAZE 2021 Challenges
The GAZE 2021 Challenges are hosted on Codalab, and can be found at:
- ETH-XGaze Challenge: https://competitions.codalab.org/competitions/28930
- EVE Challenge: https://competitions.codalab.org/competitions/28954
More information on the respective challenges can be found on their pages.
We are thankful to our sponsors for providing the following prizes:
Award | Prize |
---|---|
ETH-XGaze Challenge Winner | USD 500, courtesy of our sponsor |
EVE Challenge Winner | Tobii Eye Tracker 5, courtesy of our sponsor |
Workshop Schedule
Time in UTC | Start Time in UTC* (probably your time zone) | Item |
---|---|---|
3:00pm - 3:05pm | 20 Jun 2021 15:00:00 UTC | Opening Remarks and Awards |
3:05pm - 3:40pm | 20 Jun 2021 15:05:00 UTC | Challenge Winner Talks |
3:40pm - 4:40pm | 20 Jun 2021 15:40:00 UTC | Full Workshop Paper Presentations |
4:40pm - 6:00pm | 20 Jun 2021 16:40:00 UTC | Break + Poster Session |
6:00pm - 6:35pm | 20 Jun 2021 18:00:00 UTC | Keynote Talk: Jim Rehg |
6:35pm - 7:10pm | 20 Jun 2021 18:35:00 UTC | Keynote Talk: Moshe Eizenman |
7:10pm - 7:45pm | 20 Jun 2021 19:10:00 UTC | Keynote Talk: Adrià Recasens |
7:45pm - 8:15pm | 20 Jun 2021 19:45:00 UTC | Panel Discussion |
8:15pm - 8:20pm | 20 Jun 2021 20:15:00 UTC | Closing Remarks |
*For example, those in Los Angeles may see UTC-7, while those in Berlin may see UTC+2. Please note that there may be differences to your actual time zone.
Invited Keynote Speakers
Jim Rehg
Georgia Institute of Technology
An Egocentric View of Social Behavior
James M. Rehg (pronounced “ray”) is a Professor in the School of Interactive Computing at the Georgia Institute of Technology, where he is Director of the Center for Behavioral Imaging, co-Director of the Center for Computational Health, and co-Director of the Computational Perception Lab. He received his Ph.D. from CMU in 1995 and worked at the Cambridge Research Lab of DEC (and then Compaq) from 1995 to 2001, where he managed the computer vision research group. He received an NSF CAREER award in 2001 and a Raytheon Faculty Fellowship from Georgia Tech in 2005. He and his students have received a number of best paper awards, including best student paper awards at ICML 2005, BMVC 2010, Mobihealth 2014, and Face and Gesture 2015, and a Method of the Year award from the journal Nature Methods. Dr. Rehg serves on the Editorial Board of the International Journal of Computer Vision, served as General co-Chair for CVPR 2009, and served as Program co-Chair for CVPR 2017 (Puerto Rico). He has authored more than 100 peer-reviewed scientific papers and holds 23 issued US patents. Dr. Rehg’s research interests include computer vision, machine learning, behavioral imaging, and mobile health (mHealth). He is the Deputy Director of the NIH Center of Excellence on Mobile Sensor Data-to-Knowledge (MD2K), which is developing novel on-body sensing and predictive analytics for improving health outcomes. Dr. Rehg is also leading a multi-institution effort, funded by an NSF Expedition award, to develop the science and technology of Behavioral Imaging: the capture and analysis of social and communicative behavior using multi-modal sensing, to support the study and treatment of developmental disorders such as autism.
Moshe Eizenman
University of Toronto
Development of hybrid eye tracking systems for studies of neuropsychiatric disorders.
Abstract
Visual scanning behaviour is controlled by both low-level perception processes (e.g., colour and spatial characteristics of the visual stimuli) and high-level cognitive processes, which are driven by memories, emotions, expectations, and goals. During natural viewing, subjects are unaware of their visual scanning behaviour; as such, it can provide physiological markers for the objective evaluation of cognitive processes in patients with neuropsychiatric disorders.
In this talk I will present our past and current work towards the development of objective markers for neuropsychiatric disorders. This work includes both the development of new methods to analyse visual scanning patterns and the development of eye-tracking systems to monitor such patterns.
I will start by describing a general method for the analysis of visual scanning behaviour in neuropsychiatric disorders. I will then demonstrate the utility of this novel method with examples from our studies in patients with eating and mood disorders. I will then describe two low-cost eye-tracking systems that we developed for such studies. One system uses a smartphone to display visual stimuli and analyse visual scanning patterns, while the other uses a virtual reality headset to display visual stimuli. In both systems, the point-of-gaze is computed by an eye model whose parameters are estimated from eye images with machine learning techniques (i.e., a hybrid approach to point-of-gaze estimation).
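To make the "hybrid" idea concrete, the sketch below shows only the geometric half of such a pipeline: once a learned model has estimated eye-model parameters (here an eyeball centre and a unit gaze direction), the point-of-gaze follows from a ray-plane intersection with the display. The function name, coordinate conventions, and example values are illustrative assumptions, not the speaker's implementation.

```python
# Minimal sketch: geometric point-of-gaze from ML-estimated eye-model parameters.
# All names, conventions, and numbers are assumptions for illustration.
import numpy as np


def point_of_gaze(eye_centre: np.ndarray,    # (3,) eyeball/cornea centre [mm], camera frame
                  gaze_dir: np.ndarray,      # (3,) unit gaze direction, camera frame
                  plane_point: np.ndarray,   # (3,) any point on the screen plane
                  plane_normal: np.ndarray   # (3,) unit normal of the screen plane
                  ) -> np.ndarray:
    """Intersect the estimated gaze ray with the screen plane."""
    denom = float(np.dot(plane_normal, gaze_dir))
    if abs(denom) < 1e-9:
        raise ValueError("Gaze ray is (nearly) parallel to the screen plane.")
    t = float(np.dot(plane_normal, plane_point - eye_centre)) / denom
    return eye_centre + t * gaze_dir


# Example with made-up values: screen plane z = 0, eye about 600 mm in front of it.
direction = np.array([0.05, -0.02, -1.0])
pog = point_of_gaze(np.array([0.0, 0.0, 600.0]),
                    direction / np.linalg.norm(direction),
                    np.array([0.0, 0.0, 0.0]),
                    np.array([0.0, 0.0, 1.0]))
```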
Moshe Eizenman is a professor in the departments of Ophthalmology and Visual Science and Electrical and Computer Engineering at the University of Toronto. He is also a senior researcher at the Krembil Brain Institute. He received his Ph.D. from the University of Toronto in 1984, where he worked at the Institute of Biomedical Engineering as the head of the vision and eye-movements group. He has authored more than 120 peer-reviewed scientific papers, and his research interests include the development of eye-tracking systems, the analysis of eye movements and visual scanning patterns, and the development of objective physiological markers for psychiatric and neurological disorders. Prof. Eizenman is the founder of EL-MAR Inc., a company that develops advanced eye-tracking technologies for pilot training, driving, and medical research.
Adrià Recasens
DeepMind
Where are they looking?
Abstract
In order to understand actions or anticipate intentions, humans need efficient ways of gathering information about each other. In particular, gaze is a rich source of information about other people's activities and intentions. In this talk, we describe our work on predicting human gaze. We introduce a series of methods to follow gaze across different modalities. First, we present GazeFollow, a dataset and model to predict the location of people's gaze in an image. Furthermore, we introduce Gaze360, a large-scale gaze-tracking dataset and method for robust 3D gaze direction estimation in unconstrained scenes. Finally, we propose a saliency-based sampling layer designed to improve performance in arbitrary tasks by efficiently zooming into the relevant parts of the input image.
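As a rough illustration of the two-pathway idea behind gaze-following models such as GazeFollow, the sketch below fuses a scene encoding, a head-crop encoding, and the head position into a heatmap over candidate gaze targets. The layer sizes and fusion scheme are assumptions for illustration only, not the architectures described in the talk.

```python
# Minimal sketch of a two-pathway gaze-following network (assumed layer sizes,
# not the published GazeFollow/Gaze360 models).
import torch
import torch.nn as nn


class GazeFollowSketch(nn.Module):
    def __init__(self, feat_dim: int = 256, heatmap_size: int = 56):
        super().__init__()
        self.scene_pathway = nn.Sequential(      # encodes the full scene image
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, feat_dim))
        self.head_pathway = nn.Sequential(       # encodes the head crop
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, feat_dim))
        self.head_loc = nn.Linear(2, feat_dim)   # encodes the head position (x, y)
        self.decoder = nn.Linear(3 * feat_dim, heatmap_size * heatmap_size)
        self.heatmap_size = heatmap_size

    def forward(self, image, head_crop, head_xy):
        fused = torch.cat([self.scene_pathway(image),
                           self.head_pathway(head_crop),
                           self.head_loc(head_xy)], dim=1)
        # Heatmap over possible gaze targets in the scene.
        return self.decoder(fused).view(-1, self.heatmap_size, self.heatmap_size)
```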
Adrià Recasens is a Research Scientist at DeepMind. He completed his PhD in computer vision at the Computer Science and Artificial Intelligence Laboratory at the Massachusetts Institute of Technology in 2019. During his PhD, he worked on various topics related to image and video understanding; in particular, he has published several works on gaze estimation in images and video. His current research focuses on self-supervised learning applied to multiple modalities such as video, audio, and text.
Awards
Best Paper Award (sponsored by our workshop sponsor)
PupilTAN: A Few-Shot Adversarial Pupil Localizer
Nikolaos Poulopoulos, Emmanouil Z. Psarakis, and Dimitrios Kosmopoulos