Welcome to D
SIGMOD 2005
PODS 2005
SIGMOD-RECOR
CIDR 2005
CIKM 2005
COMAD 2005
CVDB 2005
DaMoN 2005
Data Enginee
DEBS05
DMSN 2005
DOLAP 2005
GIR 2005
GIS 2005
Hypertext 20
ICDE 2005
ICDM 2005
IHIS 2005
IQIS 2005
JCDL 2005
KRAS 2005
MDM 2005
MIR 2005
MobiDE 2005
P2PIR 2005
RIDE 2005
SBBD 2005
SIGIR 2005
SIGIR-FORUM
SIGKDD 2005
SIGKDD-EXP
SSDBM 2005
TIME 2005
TKDE 2005
<<< = TKDE'05 Pape>>>
TODS 2005
VLDB 2005
VLDBJ 2005
WebDB 2005
WIDM 2005

Fast algorithms for frequent itemset mining using FP-trees


Gösta Grahne and Jianfei Zhu

  View Paper (PDF)  

Return to October 2005, Volume 17, Issue 10


Abstract

Efficient algorithms for mining frequent itemsets are crucial for mining association rules as well as for many other data mining tasks. Methods for mining frequent itemsets have been implemented using a prefix-tree structure, known as an FP-tree, for storing compressed information about frequent itemsets. Numerous experimental results have demonstrated that these algorithms perform extremely well. In this paper, we present a novel FP-array technique that greatly reduces the need to traverse FP-trees, thus obtaining significantly improved performance for FP-tree-based algorithms. Our technique works especially well for sparse data sets. Furthermore, we present new algorithms for mining all, maximal, and closed frequent itemsets. Our algorithms use the FP-tree data structure in combination with the FP-array technique efficiently and incorporate various optimization techniques. We also present experimental results comparing our methods with existing algorithms. The results show that our methods are the fastest for many cases. Even though the algorithms consume much memory when the data sets are sparse, they are still the fastest ones when the minimum support is low. Moreover, they are always among the fastest algorithms and consume less memory than other methods when the data sets are dense.


©2006 Association for Computing Machinery