Mining high utility sequential patterns from evolving data streams

Morteza Zihayat, Cheng Wei Wu, Aijun An, S. Tseng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Scopus citations

Abstract

In this paper, we define the problem of mining high utility sequential patterns (HUSPs) over high-velocity streaming data and propose an efficient algorithm for mining HUSPs over a data stream. The main challenges we tackle include how to maintain a compact summary of the data stream to reflect the evolution of sequence utilities over time and how to overcome the problem of combinatorial explosion of a search space. We propose a compact data structure named HUSP-Tree to maintain the essential information for mining HUSPs in an online fashion. An efficient and single-pass algorithm named HUSP-Stream is proposed to generate HUSPs from HUSP-Tree. HUSP-Stream uses a new utility estimation model to more effectively prune the search space. Experimental results on real and synthetic datasets show that our algorithm serves as an efficient solution to the new problem of mining high utility sequential patterns over data streams.

Original language: English
Title of host publication: Proceedings of the ASE BigData and SocialInformatics 2015, ASE BD and SI 2015
Publisher: Association for Computing Machinery
ISBN (Electronic): 9781450337359
State: Published - 7 Oct 2015
Event: ASE BigData and SocialInformatics, ASE BD and SI 2015 - Kaohsiung, Taiwan
Duration: 7 Oct 2015 → 9 Oct 2015

Publication series

Name: ACM International Conference Proceeding Series
Volume: 07-09-October-2015

Conference

Conference: ASE BigData and SocialInformatics, ASE BD and SI 2015
Country: Taiwan
City: Kaohsiung
Period: 7/10/15 → 9/10/15

Keywords

  • Big data
  • Data stream
  • High utility sequential pattern mining
