Mining high-utility itemsets in dynamic profit databases

Loan T.T. Nguyen, Phuc Nguyen, Trinh D.D. Nguyen, Bay Vo*, Philippe Fournier-Viger, Vincent S. Tseng

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

24 Scopus citations

Abstract

High-Utility Itemset (HUI) mining is an important data-mining task which has gained popularity in recent years due to its applications in numerous fields. HUI mining aims at discovering itemsets that have high utility (e.g., yield a high profit) in transactional databases. Although several algorithms have been designed to enumerate all HUIs, an important issue is that they assume that the utilities (e.g., unit profits) of items are static. But this simplifying assumption does not hold in real-life situations. For example, the unit profits of items often vary over time in a retail store due to fluctuating supply costs and promotions. Ignoring this important characteristic of real-life transactional databases makes current HUI-mining algorithms inapplicable in many real-world applications. To address this critical limitation of current HUI-mining techniques, this paper studies the novel problem of mining HUIs in databases having dynamic unit profits. To accurately assess the utility of any itemset in this context, a redefined utility measure is introduced. Furthermore, a novel algorithm named MEFIM (Modified EFficient high-utility Itemset Mining), which relies on a novel compact database format to discover the desired itemsets efficiently, is designed. An improved version of the MEFIM algorithm, named iMEFIM, is also introduced. This algorithm employs a novel structure called P-set to reduce the number of transaction scans and to speed up the mining process. Experimental results show that the proposed algorithms considerably outperform the state-of-the-art HUI-mining algorithms on dynamic profit databases in terms of runtime, memory usage, and scalability.
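
To make the problem setting concrete, the following is a minimal sketch of utility computation when unit profits vary per transaction. It is not MEFIM or iMEFIM, and it does not use the compact database format or the P-set structure; it is a brute-force enumeration for illustration only. The abstract does not give the redefined utility measure, so the definition used here (quantity times the unit profit recorded in that transaction, summed over transactions containing the whole itemset) is an assumption, as are the toy data and the names `transactions`, `utility`, and `high_utility_itemsets`.

```python
from itertools import combinations

# Hypothetical toy database. Each transaction maps
# item -> (purchased quantity, unit profit at the time of that transaction).
# Note that the unit profit of "a" differs between transactions.
transactions = [
    {"a": (2, 5), "b": (1, 3)},
    {"a": (1, 4), "c": (3, 2)},
    {"a": (1, 6), "b": (2, 3), "c": (1, 2)},
]


def utility(itemset, transactions):
    """Assumed utility: sum of quantity * per-transaction unit profit
    over every transaction that contains all items of the itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item][0] * t[item][1] for item in itemset)
    return total


def high_utility_itemsets(transactions, min_util):
    """Brute-force enumeration of all itemsets with utility >= min_util.
    Exponential in the number of items; for illustration only."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            u = utility(candidate, transactions)
            if u >= min_util:
                result[candidate] = u
    return result


if __name__ == "__main__":
    print(utility(("a", "b"), transactions))
    print(high_utility_itemsets(transactions, min_util=18))
```

With a static profit table, the utility of `("a",)` would be a single unit profit times the total quantity; here each transaction contributes with its own recorded profit, which is the situation the abstract describes.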

Original language: English
Pages (from-to): 130-144
Number of pages: 15
Journal: Knowledge-Based Systems
Volume: 175
State: Published - 1 Jul 2019

Keywords

  • Candidate pruning
  • Data mining
  • Dynamic profit
  • High-utility itemset mining
