Efficiently Mining Long Patterns from Databases
Roberto J. Bayardo

Abstract
We present a pattern-mining algorithm that scales roughly linearly in the number of maximal patterns embedded in a database irrespective of the length of the longest pattern. In comparison, previous algorithms based on Apriori scale exponentially with longest pattern length. Experiments on real data show that when the patterns are long, our algorithm is more efficient by an order of magnitude or more.
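To make the contrast in the abstract concrete: a single frequent pattern of length k has 2^k - 1 non-empty frequent sub-patterns, so any method that enumerates every frequent itemset does work that grows exponentially with the longest pattern, even when the number of maximal patterns stays small. The sketch below is an illustrative brute-force enumeration on a toy database, not the algorithm presented in the paper; the function names, the toy transactions, and the support threshold are all assumptions made for the example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute force: return every itemset contained in >= min_support transactions."""
    items = sorted(set().union(*transactions))
    frequent = []
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            support = sum(1 for t in transactions if cset <= t)
            if support >= min_support:
                frequent.append(cset)
                found_any = True
        if not found_any:
            # No frequent itemset of this size implies none of any larger size.
            break
    return frequent


def maximal_itemsets(frequent):
    """Keep only the frequent itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


# Hypothetical toy database: one long pattern (8 items) and one short pattern.
transactions = [
    frozenset("abcdefgh"),
    frozenset("abcdefgh"),
    frozenset("xy"),
    frozenset("xy"),
]

frequent = frequent_itemsets(transactions, min_support=2)
maximal = maximal_itemsets(frequent)
print(len(frequent), len(maximal))  # prints "258 2"
```

Here the 8-item pattern alone contributes 255 frequent itemsets and the 2-item pattern contributes 3, yet only two patterns are maximal. Lengthening the long pattern by one item roughly doubles the first count and leaves the second unchanged.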

References

References, where available, link to the DBLP on the World Wide Web.

[1]
Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami: Mining Association Rules between Sets of Items in Large Databases. SIGMOD Conference 1993: 207-216
[2]
...
[3]
...
[4]
Rakesh Agrawal, Ramakrishnan Srikant: Mining Sequential Patterns. ICDE 1995: 3-14
[5]
...
[6]
Sergey Brin, Rajeev Motwani, Jeffrey D. Ullman, Shalom Tsur: Dynamic Itemset Counting and Implication Rules for Market Basket Data. SIGMOD Conference 1997: 255-264
[7]
Dimitrios Gunopulos, Heikki Mannila, Sanjeev Saluja: Discovering All Most Specific Sentences by Randomized Algorithms. ICDT 1997: 215-229
[8]
Dao-I Lin, Zvi M. Kedem: Pincer Search: A New Algorithm for Discovering the Maximum Frequent Set. EDBT 1998: 105-119
[9]
Jong Soo Park, Ming-Syan Chen, Philip S. Yu: An Effective Hash Based Algorithm for Mining Association Rules. SIGMOD Conference 1995: 175-186
[10]
Ron Rymon: Search through Systematic Set Enumeration. KR 1992: 539-550
[11]
Ashoka Savasere, Edward Omiecinski, Shamkant B. Navathe: An Efficient Algorithm for Mining Association Rules in Large Databases. VLDB 1995: 432-444
[12]
...
[13]
Padhraic Smyth, Rodney M. Goodman: An Information Theoretic Approach to Rule Induction from Databases. TKDE 4(4): 301-316(1992)
[14]
Ramakrishnan Srikant, Rakesh Agrawal: Mining Sequential Patterns: Generalizations and Performance Improvements. EDBT 1996: 3-17
[15]
...
[16]
...

Referenced By:
  1. Charu C. Aggarwal, Philip S. Yu: Mining Large Itemsets for Association Rules. Data Engineering Bulletin 21(1): 23-31(1998)
  2. Rakesh Agrawal, Johannes Gehrke, Dimitrios Gunopulos, Prabhakar Raghavan: Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications. SIGMOD Conference 1998: 94-105
BIBTEX

@inproceedings{DBLP:conf/sigmod/Bayardo98,
  author    = {Roberto J. Bayardo Jr.},
  editor    = {Laura M. Haas and
               Ashutosh Tiwary},
  title     = {Efficiently Mining Long Patterns from Databases},
  booktitle = {SIGMOD 1998, Proceedings ACM SIGMOD International Conference
               on Management of Data, June 2-4, 1998, Seattle, Washington, USA},
  publisher = {ACM Press},
  year      = {1998},
  isbn      = {0-89791-955-5},
  pages     = {85-93},
  crossref  = {DBLP:conf/sigmod/98},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

