As the number of speech and video documents on the Internet increases and portable devices proliferate, speech summarization becomes increasingly essential. Research in this domain has typically focused on broadcasts and news; however, the automatic summarization methods used in the past may not apply to other speech domains (e.g., lecture speech). Therefore, this study explores the lecture speech domain. The features used in previous research were analyzed, and suitable features were selected through experimentation; subsequently, a three-phase real-time speech summarizer (RTSS) was proposed. Phase One involved selecting independent features (e.g., centrality, resemblance to the title, sentence length, term frequency, and thematic words) and calculating the independent feature scores; Phase Two involved calculating the dependent features, such as position, compared with the independent feature scores; and Phase Three involved comparing these feature scores to obtain weighted averages of the function scores, determining the highest-scoring sentence, and providing a summary.
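The three phases described above can be illustrated with a minimal sketch. It is not the RTSS implementation, whose details the abstract does not give. It assumes only three of the named independent features (sentence length, term frequency, and resemblance to the title), leaving out centrality and thematic words. It also assumes one possible reading of the dependent position feature, namely a positional prior scaled by the sentence's mean independent score. All function names and the equal weights are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokenizer (an assumption; any tokenizer would do)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def normalize(values):
    """Scale scores into [0, 1] by dividing by the maximum."""
    top = max(values) if values else 0.0
    return [v / top if top > 0 else 0.0 for v in values]


def independent_scores(sentences, title):
    """Phase One: features computed from each sentence on its own."""
    tokens = [tokenize(s) for s in sentences]
    tf = Counter(t for toks in tokens for t in toks)
    title_terms = set(tokenize(title))
    length = normalize([float(len(toks)) for toks in tokens])
    term_freq = normalize(
        [sum(tf[t] for t in toks) / len(toks) if toks else 0.0 for toks in tokens]
    )
    title_sim = [
        len(set(toks) & title_terms) / len(title_terms) if title_terms else 0.0
        for toks in tokens
    ]
    return {"length": length, "term_frequency": term_freq, "title": title_sim}


def dependent_scores(independent, n):
    """Phase Two: position score that depends on the Phase One scores.

    Assumption: earlier sentences get a higher prior, scaled by the
    sentence's mean independent score.
    """
    position = []
    for i in range(n):
        mean_ind = sum(f[i] for f in independent.values()) / len(independent)
        prior = 1.0 - i / n
        position.append(prior * mean_ind)
    return {"position": position}


def summarize(sentences, title, weights, k=1):
    """Phase Three: weighted average of all feature scores, then top-k."""
    features = independent_scores(sentences, title)
    features.update(dependent_scores(features, len(sentences)))
    total_weight = sum(weights[name] for name in features)
    scores = [
        sum(weights[name] * features[name][i] for name in features) / total_weight
        for i in range(len(sentences))
    ]
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]  # keep original order


if __name__ == "__main__":
    lecture = [
        "Today we discuss speech summarization for lecture speech.",
        "Um, okay.",
        "Please remember the homework.",
    ]
    equal = {"length": 1.0, "term_frequency": 1.0, "title": 1.0, "position": 1.0}
    print(summarize(lecture, "Lecture speech summarization", equal, k=1))
```

In practice the weights would be tuned experimentally rather than set equal, in line with the feature selection the abstract reports.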
|Name||2014 INTERNATIONAL SYMPOSIUM ON COMPUTER, CONSUMER AND CONTROL (IS3C 2014)|
|Conference||International Symposium on Computer, Consumer and Control (IS3C)|
|Period||10/06/14 → 12/06/14|
- feature selection; information retrieval; speech summarization; text mining