Character-Preserving Coherent Story Visualization

Yun Zhu Song*, Zhi Rui Tam, Hung Jen Chen, Huiao Han Lu, Hong Han Shuai

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review


Story visualization aims at generating a sequence of images to narrate each sentence in a multi-sentence story. Different from video generation that focuses on maintaining the continuity of generated images (frames), story visualization emphasizes preserving the global consistency of characters and scenes across different story pictures, which is very challenging since story sentences only provide sparse signals for generating images. Therefore, we propose a new framework named Character-Preserving Coherent Story Visualization (CP-CSV) to tackle the challenges. CP-CSV effectively learns to visualize the story by three critical modules: story and context encoder (story and sentence representation learning), figure-ground segmentation (auxiliary task to provide information for preserving character and story consistency), and figure-ground aware generation (image sequence generation by incorporating figure-ground information). Moreover, we propose a metric named Fréchet Story Distance (FSD) to evaluate the performance of story visualization. Extensive experiments demonstrate that CP-CSV maintains the details of character information and achieves high consistency among different frames, while FSD better measures the performance of story visualization.

Original languageEnglish
Title of host publicationComputer Vision – ECCV 2020 - 16th European Conference, 2020, Proceedings
EditorsAndrea Vedaldi, Horst Bischof, Thomas Brox, Jan-Michael Frahm
PublisherSpringer Science and Business Media Deutschland GmbH
Number of pages16
ISBN (Print)9783030585198
StatePublished - 2020
Event16th European Conference on Computer Vision, ECCV 2020 - Glasgow, United Kingdom
Duration: 23 Aug 202028 Aug 2020

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume12362 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349


Conference16th European Conference on Computer Vision, ECCV 2020
CountryUnited Kingdom


  • Evaluation metric
  • Foreground segmentation
  • Story visualization

Fingerprint Dive into the research topics of 'Character-Preserving Coherent Story Visualization'. Together they form a unique fingerprint.

Cite this