Learning goal-oriented visual dialog agents: Imitating and surpassing analytic experts

Yen Wei Chang, Wen-Hsiao Peng

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

This paper tackles the problem of learning a questioner in the goal-oriented visual dialog task. Several previous works adopt model-free reinforcement learning, and most pretrain the model on a finite set of human-generated data. We argue that using limited demonstrations to kick-start the questioner is insufficient because of the large policy search space. Inspired by a recently proposed information-theoretic approach, we develop two analytic experts to serve as a source of high-quality demonstrations for imitation learning. We then use reinforcement learning to refine the model towards the goal-oriented objective. Experimental results on the GuessWhat?! dataset show that our method combines the merits of imitation and reinforcement learning, achieving state-of-the-art performance.

Original language: English
Title of host publication: Proceedings - 2019 IEEE International Conference on Multimedia and Expo, ICME 2019
Publisher: IEEE Computer Society
Pages: 520-525
Number of pages: 6
ISBN (Electronic): 9781538695524
DOIs: https://doi.org/10.1109/ICME.2019.00096
State: Published - 1 Jul 2019
Event: 2019 IEEE International Conference on Multimedia and Expo, ICME 2019 - Shanghai, China
Duration: 8 Jul 2019 to 12 Jul 2019

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
Volume: 2019-July
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Conference

Conference: 2019 IEEE International Conference on Multimedia and Expo, ICME 2019
Country: China
City: Shanghai
Period: 8/07/19 to 12/07/19

Keywords

  • Goal oriented visual dialog
  • Imitation learning
  • Reinforcement learning

Cite this

Chang, Y. W., & Peng, W-H. (2019). Learning goal-oriented visual dialog agents: Imitating and surpassing analytic experts. In Proceedings - 2019 IEEE International Conference on Multimedia and Expo, ICME 2019 (pp. 520-525). [8784740] (Proceedings - IEEE International Conference on Multimedia and Expo; Vol. 2019-July). IEEE Computer Society. https://doi.org/10.1109/ICME.2019.00096