Mining potential high-utility itemsets over uncertain databases

Jerry Chun Wei Lin, Wensheng Gan, Philippe Fournier-Viger, Tzung Pei Hong, Vincent Shin-Mu Tseng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Traditional high-utility itemset mining (HUIM) incorporates the concept of utility (e.g., profit) but assumes precise (certain) databases. In many settings, however, an item or itemset is not simply present or absent in a transaction; it is also associated with an existence probability, especially when the data are collected from sensor environments. HUIM over uncertain databases has not yet been addressed, although such data are common in real-world applications. In this paper, we propose a novel framework for mining potential high-utility itemsets (PHUIs) over uncertain databases. We first present the upper-bound-based PHUI-UP algorithm, which mines PHUIs level by level. Based on the probability-utility (PU)-list structure, an improved algorithm, PHUI-List, is then developed to mine PHUIs directly without candidate generation. Extensive experiments on both real-life and synthetic datasets evaluate the two proposed algorithms in terms of runtime, number of patterns, and scalability.
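
The abstract does not give formal definitions, so the following is only a minimal sketch of what a PHUI could look like under stated assumptions. It assumes a tuple-level uncertainty model (each transaction carries one existence probability), defines an itemset's utility as quantity times unit profit summed over the transactions that contain it, defines its probability as the sum of those transactions' probabilities, and uses absolute thresholds. The toy data, the function names, and the brute-force enumeration are all hypothetical; this is neither PHUI-UP nor PHUI-List, only an illustration of the two-threshold idea.

```python
from itertools import combinations

# Hypothetical toy database: (item -> purchased quantity, transaction probability).
transactions = [
    ({"a": 2, "b": 1}, 0.9),
    ({"a": 1, "c": 3}, 0.6),
    ({"b": 2, "c": 1}, 0.4),
]
# Hypothetical unit profit per item.
unit_profit = {"a": 5, "b": 3, "c": 1}


def utility_and_probability(itemset, transactions, unit_profit):
    """Sum utility and transaction probability over transactions containing itemset.

    Assumption: tuple-level uncertainty, one probability per transaction.
    """
    total_utility = 0
    total_probability = 0.0
    for items, probability in transactions:
        if all(item in items for item in itemset):
            total_utility += sum(items[i] * unit_profit[i] for i in itemset)
            total_probability += probability
    return total_utility, total_probability


def mine_phuis_bruteforce(transactions, unit_profit, min_utility, min_probability):
    """Enumerate every itemset and keep those meeting both absolute thresholds."""
    all_items = sorted({item for items, _ in transactions for item in items})
    result = {}
    for size in range(1, len(all_items) + 1):
        for itemset in combinations(all_items, size):
            utility, probability = utility_and_probability(
                itemset, transactions, unit_profit
            )
            if utility >= min_utility and probability >= min_probability:
                result[itemset] = (utility, probability)
    return result


phuis = mine_phuis_bruteforce(
    transactions, unit_profit, min_utility=9, min_probability=0.8
)
for itemset, (utility, probability) in phuis.items():
    print(itemset, utility, round(probability, 2))
```

On this toy data the itemsets {a}, {b}, and {a, b} pass both thresholds. Exhaustive enumeration grows exponentially with the number of items, which is the cost that upper-bound pruning and list-based structures of the kind named in the abstract aim to avoid.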

Original language: English
Title of host publication: Proceedings of the ASE BigData and SocialInformatics 2015, ASE BD and SI 2015
Publisher: Association for Computing Machinery
ISBN (Electronic): 9781450337359
State: Published - 7 Oct 2015
Event: ASE BigData and SocialInformatics, ASE BD and SI 2015 - Kaohsiung, Taiwan
Duration: 7 Oct 2015 to 9 Oct 2015

Publication series

Name: ACM International Conference Proceeding Series
Volume: 07-09-Ocobert-2015

Conference

Conference: ASE BigData and SocialInformatics, ASE BD and SI 2015
Country: Taiwan
City: Kaohsiung
Period: 7/10/15 to 9/10/15

Keywords

  • High-utility itemset mining
  • PU-list
  • Probabilistic-based
  • Uncertain database
  • Upper-bound
