Hierarchical theme and topic model for summarization

Jen-Tzung Chien, Ying Lan Chang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

3 Scopus citations

Abstract

This paper presents a hierarchical summarization model that extracts representative sentences from a set of documents. We select thematic sentences and identify topical words using a hierarchical theme and topic model (H2TM). The latent themes and topics are inferred from the document collection. A tree stick-breaking process is proposed to draw the theme proportions used to represent sentences. Structural learning is performed without fixing the number of themes and topics. The H2TM is delicate and flexible enough to represent words and sentences from heterogeneous documents, and thematic sentences are effectively extracted for document summarization. In the experiments, the proposed H2TM outperforms the other methods in terms of precision, recall and F-measure.
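
The abstract mentions a tree stick-breaking process for drawing theme proportions but does not give its details. As a rough illustration of the underlying idea only, the sketch below draws proportions from the standard flat, truncated stick-breaking construction; it is not the paper's tree-structured variant. The function name `stick_breaking`, the concentration parameter `alpha`, and the truncation level are assumptions made for this example.

```python
import random


def stick_breaking(alpha, truncation, rng):
    """Draw proportions from a truncated (flat) stick-breaking construction.

    Assumption: this is the standard construction with v_k ~ Beta(1, alpha),
    shown for illustration; it is not the tree-structured process of the paper.
    """
    remaining = 1.0  # length of the stick not yet assigned
    weights = []
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # fraction broken off the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # leftover mass goes to the last component
    return weights


rng = random.Random(0)  # fixed seed so the draw is repeatable
proportions = stick_breaking(alpha=2.0, truncation=5, rng=rng)
print(proportions)
```

A larger `alpha` tends to spread the mass over more components, which is how such constructions avoid fixing the number of components in advance.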

Original language: English
Title of host publication: 2013 IEEE International Workshop on Machine Learning for Signal Processing - Proceedings of MLSP 2013
State: Published - 1 Dec 2013
Event: 2013 16th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2013 - Southampton, United Kingdom
Duration: 22 Sep 2013 to 25 Sep 2013

Publication series

Name: IEEE International Workshop on Machine Learning for Signal Processing, MLSP
ISSN (Print): 2161-0363
ISSN (Electronic): 2161-0371

Conference

Conference: 2013 16th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2013
Country: United Kingdom
City: Southampton
Period: 22/09/13 to 25/09/13

Keywords

  • Bayesian nonparametrics
  • document summarization
  • structural learning
  • topic model
