Learning-based saliency model with depth information

Chih Yao Ma, Hsueh-Ming Hang*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

19 Scopus citations


Most previous studies on visual saliency have focused on two-dimensional (2D) scenes. Given the rapid growth of three-dimensional (3D) video applications, it is desirable to understand how depth information affects human visual attention. In this study, we first conducted eye-fixation experiments on 3D images. Our fixation data set comprises 475 3D images and 16 subjects. We used a Tobii TX300 eye tracker (Tobii, Stockholm, Sweden) to track the eye movements of each subject. The database also contains 475 computed depth maps. Because public-domain 3D fixation data are scarce, this data set should be useful to the 3D visual attention research community. We then designed a learning-based visual attention model to predict human attention. In addition to the popular 2D features, we included the depth map and its derived features. The results indicate that the extra depth information can improve saliency estimation accuracy, specifically for close-up objects hidden in a complex-texture background. We also examined the effectiveness of various low-, mid-, and high-level features for saliency prediction. Compared with both 2D and 3D state-of-the-art saliency estimation models, our methods show better performance on the 3D test images. The eye-tracking database and the MATLAB source code for the proposed saliency model and evaluation methods are available on our website.

Original language: English
Pages (from-to): 1-22
Number of pages: 22
Journal: Journal of Vision
Issue number: 6
State: Published - 1 Jan 2015


  • Depth saliency
  • Eye-fixation database
  • Saliency map
  • Visual attention
