Efficient Algorithms for Mining Top-K High Utility Itemsets

S. Tseng, Cheng Wei Wu, Philippe Fournier-Viger, Philip S. Yu

Research output: Contribution to journal › Article › peer-review

132 Scopus citations

Abstract

High utility itemsets (HUIs) mining is an emerging topic in data mining, which refers to discovering all itemsets having a utility meeting a user-specified minimum utility threshold min-util. However, setting min-util appropriately is a difficult problem for users. Generally speaking, finding an appropriate minimum utility threshold by trial and error is a tedious process for users. If min-util is set too low, too many HUIs will be generated, which may cause the mining process to be very inefficient. On the other hand, if min-util is set too high, it is likely that no HUIs will be found. In this paper, we address the above issues by proposing a new framework for top-k high utility itemset mining, where k is the desired number of HUIs to be mined. Two types of efficient algorithms named TKU (mining Top-K Utility itemsets) and TKO (mining Top-K utility itemsets in One phase) are proposed for mining such itemsets without the need to set min-util. We provide a structural comparison of the two algorithms with discussions on their advantages and limitations. Empirical evaluations on both real and synthetic datasets show that the performance of the proposed algorithms is close to that of the optimal case of state-of-the-art utility mining algorithms.
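To make the problem statement concrete, the following is a minimal brute-force sketch of top-k high utility itemset mining. It is not the TKU or TKO algorithm from the paper; it simply enumerates every itemset and keeps the k best in a min-heap. The transaction database, the unit profits, and the function names `utility` and `top_k_hui` are all hypothetical, chosen only for illustration, and the utility definition used (purchase quantity times unit profit, summed over the transactions containing the itemset) is the usual one in utility mining rather than something stated in the abstract.

```python
import heapq
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchase quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 2},
    {"a": 1, "d": 1},
]

# Hypothetical unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 4}


def utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over the transactions containing every item."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def top_k_hui(transactions, unit_profit, k):
    """Brute-force top-k: enumerate all itemsets, keep the k best in a min-heap."""
    items = sorted({item for t in transactions for item in t})
    heap = []  # min-heap of (utility, itemset); heap[0] is the current k-th best
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, transactions, unit_profit)
            if u == 0:
                continue  # itemset never occurs in the database
            if len(heap) < k:
                heapq.heappush(heap, (u, itemset))
            elif u > heap[0][0]:
                # Beats the current k-th best, so evict the weakest entry.
                heapq.heapreplace(heap, (u, itemset))
    # Report the result from highest to lowest utility.
    return sorted(heap, key=lambda pair: (-pair[0], pair[1]))


print(top_k_hui(transactions, unit_profit, 3))
# → [(20, ('a',)), (19, ('a', 'c')), (12, ('d',))]
```

In this sketch the smallest utility in the heap acts as a threshold that rises on its own as better itemsets are found, so the caller supplies only k and never a min-util value. The exhaustive enumeration is exponential in the number of items and is suitable only for tiny examples.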

Original language: English
Article number: 7164333
Pages (from-to): 54-67
Number of pages: 14
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 28
Issue number: 1
State: Published - 1 Jan 2016

Keywords

  • high utility itemset mining
  • top-k high utility itemset mining
  • top-k pattern mining
  • utility mining
