Mining Top-K Path Traversal Patterns over Streaming Web Click-Sequences

Hua-Fu Li, Suh-Yin Lee

Research output: Contribution to journal › Article

5 Scopus citations

Abstract

Online, one-pass mining of Web click streams raises several computational challenges: the streaming data is unbounded in length, the arrival rate may be very fast, and previously arrived Web click-sequences can be scanned only once. In this paper, we propose a new single-pass algorithm, called DSM-TKP (Data Stream Mining for Top-K Path traversal patterns), for mining a set of top-k path traversal patterns, where k is the desired number of path traversal patterns to be mined. An effective summary data structure, called TKP-forest (a forest of Top-K Path traversal patterns), is used to maintain the essential information about the top-k path traversal patterns generated so far. Experimental studies show that the proposed DSM-TKP algorithm maintains stable memory usage and makes only one pass over the streaming Web click-sequences.
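To make the problem statement concrete, the following is a minimal sketch of single-pass top-k counting over click-sequences. It is not DSM-TKP and does not implement the TKP-forest, whose details the abstract does not give. It assumes, for illustration only, that a path traversal pattern is a contiguous run of page visits within one click-sequence; the class name `NaiveTopKPaths`, the `max_len` cap, and the tie-breaking order are all hypothetical. Unlike the summary structure described above, this naive counter keeps every observed sub-path, so its memory grows with the stream.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Path = Tuple[str, ...]


class NaiveTopKPaths:
    """Exact one-pass counter of contiguous sub-paths (illustration only).

    Assumption: a pattern is a contiguous run of pages in a click-sequence.
    This is NOT the DSM-TKP algorithm; memory use here is unbounded.
    """

    def __init__(self, k: int, max_len: int = 3) -> None:
        self.k = k              # number of patterns to report
        self.max_len = max_len  # hypothetical cap on pattern length
        self.counts: Counter = Counter()

    def update(self, clicks: Sequence[str]) -> None:
        """Consume one click-sequence; each sequence is read exactly once."""
        seen = set()  # count each sub-path at most once per sequence
        n = len(clicks)
        for i in range(n):
            for j in range(i + 1, min(i + self.max_len, n) + 1):
                seen.add(tuple(clicks[i:j]))
        self.counts.update(seen)

    def top_k(self) -> List[Tuple[Path, int]]:
        """Return the k most frequent sub-paths seen so far.

        Ties are broken by preferring longer paths, then lexicographic order.
        """
        ranked = sorted(
            self.counts.items(),
            key=lambda kv: (-kv[1], -len(kv[0]), kv[0]),
        )
        return ranked[: self.k]


if __name__ == "__main__":
    miner = NaiveTopKPaths(k=3)
    for seq in (["a", "b", "c"], ["a", "b", "d"], ["b", "c"]):
        miner.update(seq)
    print(miner.top_k())
```

The top-k result can be queried at any point between updates, which mirrors the online setting: patterns are reported from what has been generated so far, without revisiting earlier click-sequences.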
Original language: English
Pages (from-to): 1121-1133
Number of pages: 13
Journal: Journal of Information Science and Engineering
Volume: 25
Issue number: 4
State: Published - Jul 2009
Event: IEEE/WIC/ACM International Conference on Web Intelligence - Compiegne Univ Technol, Compiegne, France
Duration: 19 Sep 2005 to 22 Sep 2005

Keywords

  • web usage mining; data streams; path traversal patterns; top-k pattern mining; single-pass mining
