TY - JOUR

T1 - Efficiently mining high average-utility itemsets with an improved upper-bound strategy

AU - Lan, Guo Cheng

AU - Hong, Tzung Pei

AU - Tseng, S.

PY - 2012/9/1

Y1 - 2012/9/1

AB - Utility mining has recently been discussed in the field of data mining. A utility itemset considers both profits and quantities of items in transactions, and thus its utility value increases with increasing itemset length. To reveal a better utility effect, an average-utility measure, which is the total utility of an itemset divided by its itemset length, is proposed. However, existing approaches use the traditional average-utility upper-bound model to find high average-utility itemsets, and thus generate a large number of unpromising candidates in the mining process. The present study proposes an improved upper-bound approach that uses the prefix concept to create tighter upper bounds of average-utility values for itemsets, thus reducing the number of unpromising itemsets for mining. Results from experiments on two real databases show that the proposed algorithm outperforms other mining algorithms under various parameter settings.

KW - average-utility mining

KW - data mining

KW - high average-utility itemsets

KW - prefix concept

KW - upper-bound strategy

UR - http://www.scopus.com/inward/record.url?scp=84869476515&partnerID=8YFLogxK

U2 - 10.1142/S0219622012500307

DO - 10.1142/S0219622012500307

M3 - Article

AN - SCOPUS:84869476515

VL - 11

SP - 1009

EP - 1030

JO - International Journal of Information Technology and Decision Making

JF - International Journal of Information Technology and Decision Making

SN - 0219-6220

IS - 5

ER -