Look at Me! Correcting eye gaze in live video communication

Chih Fan Hsu, Yu Shuen Wang, Chin Laung Lei, Kuan Ta Chen

Research output: Contribution to journal › Article › peer-review

Abstract

Although live video communication is widely used, it is generally less engaging than face-to-face communication because of limitations on social, emotional, and haptic feedback. Missing eye contact is one such problem caused by the physical deviation between the screen and camera on a device. Manipulating video frames to correct eye gaze is a solution to this problem. In this article, we introduce a system to rotate the eyeball of a local participant before the video frame is sent to the remote side. It adopts a warping-based convolutional neural network to relocate pixels in eye regions. To improve visual quality, we minimize the L2 distance between the ground truths and warped eyes. We also present several newly designed loss functions to help network training. These new loss functions are designed to preserve the shape of eye structures and minimize color changes around the periphery of eye regions. To evaluate the presented network and loss functions, we objectively and subjectively compared results generated by our system and the state-of-the-art, DeepWarp, in relation to two datasets. The experimental results demonstrated the effectiveness of our system. In addition, we showed that our system can perform eye-gaze correction in real time on a consumer-level laptop. Because of the quality and efficiency of the system, gaze correction by postprocessing through this system is a feasible solution to the problem of missing eye contact in video communication.
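
The abstract's core idea, relocating pixels in the eye region with a predicted warp and scoring the result by its L2 distance to the ground truth, can be illustrated with a minimal sketch. This is not the authors' network or code: the names `warp_bilinear` and `l2_loss`, the grayscale list-of-lists image format, the per-pixel `(dx, dy)` flow convention, border clamping, and the per-pixel averaging of the squared error are all assumptions made for illustration. In the described system the flow would be produced by the convolutional network; here it is supplied by hand.

```python
import math


def warp_bilinear(image, flow):
    """Resample a grayscale image through a per-pixel displacement field.

    image: list of rows of floats (H x W).
    flow:  list of rows of (dx, dy) tuples (assumed convention): output
           pixel (x, y) is read from source position (x + dx, y + dy),
           clamped to the image borders.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            # Clamp the sampling position so it stays inside the image.
            sx = min(max(x + dx, 0.0), w - 1.0)
            sy = min(max(y + dy, 0.0), h - 1.0)
            x0, y0 = int(math.floor(sx)), int(math.floor(sy))
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            fx, fy = sx - x0, sy - y0
            # Bilinear interpolation between the four neighbouring pixels.
            top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
            bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
            out[y][x] = top * (1 - fy) + bottom * fy
    return out


def l2_loss(pred, target):
    """Squared L2 distance between two same-sized images, averaged per pixel."""
    n = len(pred) * len(pred[0])
    total = sum(
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    )
    return total / n


if __name__ == "__main__":
    eye = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]
    # Hypothetical flow: every output pixel samples one pixel to its right.
    flow = [[(1.0, 0.0)] * 3 for _ in range(3)]
    warped = warp_bilinear(eye, flow)
    print(warped)
    print(l2_loss(warped, eye))
```

Bilinear sampling is used in the sketch because it is differentiable with respect to the flow, which is what lets a warping network of this general kind be trained by minimizing a pixel loss. The additional loss terms mentioned in the abstract (shape preservation and peripheral color consistency) are not modeled here.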

Original language: English
Article number: a38
Journal: ACM Transactions on Multimedia Computing, Communications and Applications
Volume: 15
Issue number: 2
State: Published - Jun 2019

Keywords

  • Convolutional neural network
  • Eye contact
  • Gaze correction
  • Image processing
  • Live video communication
