EFIM-closed: Fast and memory efficient discovery of closed high-utility itemsets

Philippe Fournier-Viger*, Souleymane Zida, Jerry Chun Wei Lin, Cheng Wei Wu, S. Tseng

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

26 Scopus citations

Abstract

Discovering high-utility temsets in transaction databases is a popular data mining task. A limitation of traditional algorithms is that a huge amount of high-utility itemsets may be presented to the user. To provide a concise and lossless representation of results to the user, the concept of closed high-utility itemsets was proposed. However, mining closed high-utility itemsets is computationally expensive. To address this issue, we present a novel algorithm for discovering closed high-utility itemsets, named EFIM-Closed. This algorithm includes novel pruning strategies named closure jumping, forward closure checking and backward closure checking to prune non-closed high-utility itemsets. Furthermore, it also introduces novel utility upper-bounds and a transaction merging mechanism. Experimental results shows that EFIM-Closed can be more than an order of magnitude faster and consumes more than an order of magnitude less memory than the previous state-of-art CHUD algorithm.

Original languageEnglish
Title of host publicationMachine Learning and Data Mining in Pattern Recognition - 12th International Conference, MLDM 2016, Proceedings
EditorsPetra Perner
PublisherSpringer Verlag
Pages199-213
Number of pages15
ISBN (Print)9783319419190
DOIs
StatePublished - 1 Jan 2016
Event12th International Conference on Machine Learning and Data Mining in Pattern Recognition, MLDM 2016 - New York, United States
Duration: 16 Jul 201621 Jul 2016

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume9729
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference12th International Conference on Machine Learning and Data Mining in Pattern Recognition, MLDM 2016
CountryUnited States
CityNew York
Period16/07/1621/07/16

Keywords

  • Closed itemset
  • High-utility itemset
  • Pattern mining

Fingerprint Dive into the research topics of 'EFIM-closed: Fast and memory efficient discovery of closed high-utility itemsets'. Together they form a unique fingerprint.

Cite this