Efficiently mining uncertain high-utility itemsets

Jerry Chun Wei Lin*, Wensheng Gan, Philippe Fournier-Viger, Tzung Pei Hong, Vincent Shin-Mu Tseng

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

23 Scopus citations


Data mining consists of deriving implicit, potentially meaningful and useful knowledge from databases, such as information about the most profitable items. High-utility itemset mining (HUIM) has thus emerged as an important research topic in data mining. However, most HUIM algorithms can only handle precise data, although big data collected in real-life applications using experimental measurements or noisy sensors is often uncertain. In this paper, an algorithm named Mining Uncertain High-Utility Itemsets (MUHUI) is proposed to efficiently discover potential high-utility itemsets (PHUIs) in uncertain data. Based on the probability-utility-list (PU-list) structure, the MUHUI algorithm directly mines PHUIs without generating candidates, and avoids constructing PU-lists for numerous unpromising itemsets by applying several efficient pruning strategies, which greatly improve its performance. Extensive experiments conducted on both real-life and synthetic datasets show that the proposed algorithm significantly outperforms the state-of-the-art PHUI-List algorithm in terms of efficiency and scalability, and that it scales well when mining PHUIs in large-scale uncertain datasets.
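
To make the notion of a potential high-utility itemset more concrete, the following is a minimal brute-force sketch, not the MUHUI algorithm itself: it uses no PU-lists and no pruning strategies. The abstract does not define PHUIs formally, so the sketch assumes a tuple-level uncertainty model in which each transaction carries an existence probability, and it treats an itemset as a PHUI when its total utility reaches a fraction of the database utility and its summed probability reaches a fraction of the number of transactions. The toy data, the function names, and both threshold parameters are hypothetical.

```python
from itertools import combinations

# Hypothetical toy database (assumption): each transaction is a pair of
# (existence probability, {item: purchased quantity}).
TRANSACTIONS = [
    (0.9, {"a": 2, "b": 1}),
    (0.6, {"a": 1, "c": 3}),
    (0.4, {"b": 2, "c": 1}),
]
# Hypothetical unit profit of each item (assumption).
UNIT_PROFIT = {"a": 5, "b": 3, "c": 1}


def utility_and_probability(itemset, transactions, unit_profit):
    """Sum the utility and the existence probability of `itemset`
    over every transaction that contains all of its items."""
    utility = 0
    probability = 0.0
    for prob, items in transactions:
        if all(item in items for item in itemset):
            utility += sum(items[item] * unit_profit[item] for item in itemset)
            probability += prob
    return utility, probability


def brute_force_phuis(transactions, unit_profit,
                      min_utility_ratio, min_probability_ratio):
    """Enumerate every itemset and keep those meeting both thresholds.

    Assumed definition: utility >= min_utility_ratio * total database
    utility, and summed probability >= min_probability_ratio * number
    of transactions. Exponential in the number of items, so this is
    only usable on tiny examples.
    """
    total_utility = sum(
        qty * unit_profit[item]
        for _, items in transactions
        for item, qty in items.items()
    )
    min_utility = min_utility_ratio * total_utility
    min_probability = min_probability_ratio * len(transactions)

    all_items = sorted({item for _, items in transactions for item in items})
    result = {}
    for size in range(1, len(all_items) + 1):
        for itemset in combinations(all_items, size):
            utility, probability = utility_and_probability(
                itemset, transactions, unit_profit)
            if utility >= min_utility and probability >= min_probability:
                result[itemset] = (utility, probability)
    return result


if __name__ == "__main__":
    found = brute_force_phuis(TRANSACTIONS, UNIT_PROFIT, 0.25, 0.75 / 3)
    for itemset, (utility, probability) in found.items():
        print(itemset, utility, round(probability, 2))
```

Because every subset of items is enumerated, the cost of this sketch grows exponentially with the number of distinct items, which is the kind of search space that list-based structures and pruning strategies are meant to cut down.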

Original language: English
Pages (from-to): 2801-2820
Number of pages: 20
Journal: Soft Computing
Issue number: 11
State: Published - 1 Jun 2017


  • Data mining
  • High-utility itemset
  • Large-scale dataset
  • Pruning strategies
  • Uncertainty
