Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications

Data mining applications place special requirements on clustering algorithms including: the ability to find clusters embedded in subspaces of high dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical data distribution, and insensitivity to the order of input records. We present CLIQUE, a clustering algorithm that satisfies each of these requirements. CLIQUE identifies dense clusters in subspaces of maximum dimensionality. It generates cluster descriptions in the form of DNF expressions that are minimized for ease of comprehension. It produces identical results irrespective of the order in which input records are presented and does not presume any specific mathematical form for data distribution. Through experiments, we show that CLIQUE efficiently finds accurate clusters in large high dimensional datasets.
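
The idea of finding dense regions in subspaces can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an equal-width grid with `xi` intervals per dimension and a density threshold `tau` (a fraction of all points), neither of which the abstract spells out, and the function name `dense_units` is hypothetical. The sketch only finds dense grid cells level by level; it does not join adjacent cells into clusters or produce the minimized DNF descriptions mentioned above.

```python
from collections import Counter
from itertools import combinations


def dense_units(points, xi, tau):
    """Return {subspace: set of dense grid cells}, built bottom-up.

    points: list of equal-length tuples with coordinates in [0, 1).
    xi:     intervals per dimension (assumed parameter).
    tau:    a cell is dense if it holds more than this fraction of points
            (assumed parameter).
    """
    n = len(points)
    ndim = len(points[0])
    # Map every point to its interval index along each dimension.
    cells = [tuple(min(int(x * xi), xi - 1) for x in p) for p in points]

    def dense_in(subspace):
        # Count points per cell after projecting onto the given dimensions.
        counts = Counter(tuple(c[d] for d in subspace) for c in cells)
        return {u for u, k in counts.items() if k / n > tau}

    result = {}
    level = [(d,) for d in range(ndim)]  # start with 1-dimensional subspaces
    while level:
        found = {s: dense_in(s) for s in level}
        found = {s: units for s, units in found.items() if units}
        result.update(found)
        k = len(level[0]) + 1
        # Apriori-style pruning: a k-dimensional subspace is examined only
        # if every (k-1)-dimensional projection of it contains a dense cell.
        level = [s for s in combinations(range(ndim), k)
                 if all(sub in found for sub in combinations(s, k - 1))]
    return result


# Ten points packed together in dimensions 0 and 1, spread out in dimension 2.
pts = [(0.05, 0.05, i / 10 + 0.05) for i in range(10)]
res = dense_units(pts, xi=10, tau=0.5)
print(res)
```

In this toy data the only dense cells lie in subspaces built from dimensions 0 and 1, so the subspace `(0, 1)` is the highest-dimensional one reported and dimension 2 is never combined with the others. The result does not depend on the order of `pts`, since only per-cell counts are used.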
References, where available, link to the DBLP on the World Wide Web.
[1] ...
[2] Alfred V. Aho, John E. Hopcroft, Jeffrey D. Ullman:
The Design and Analysis of Computer Algorithms.
Addison-Wesley 1974, ISBN 0-201-00029-6
[3] ...
[4] ...
[5] Roberto J. Bayardo Jr.:
Efficiently Mining Long Patterns from Databases.
SIGMOD Conference 1998: 85-93
[6] Stefan Berchtold, Christian Böhm, Daniel A. Keim, Hans-Peter Kriegel:
A Cost Model For Nearest Neighbor Search in High-Dimensional Data Space.
PODS 1997: 78-86
[7] ...
[8] Sergey Brin, Rajeev Motwani, Jeffrey D. Ullman, Shalom Tsur:
Dynamic Itemset Counting and Implication Rules for Market Basket Data.
SIGMOD Conference 1997: 255-264
[9] ...
[10] ...
[11] ...
[12] ...
[13] Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu:
A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise.
KDD 1996: 226-231
[14] Martin Ester, Hans-Peter Kriegel, Xiaowei Xu:
A Database Interface for Clustering in Large Spatial Databases.
KDD 1995: 94-99
[15] Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, Ramasamy Uthurusamy (Eds.):
Advances in Knowledge Discovery and Data Mining.
AAAI/MIT Press 1996, ISBN 0-262-56097-6
[16] ...
[17] ...
[18] ...
[19] ...
[20] Dimitrios Gunopulos, Roni Khardon, Heikki Mannila, Hannu Toivonen:
Data mining, Hypergraph Transversals, and Machine Learning.
PODS 1997: 209-216
[21] Ching-Tien Ho, Rakesh Agrawal, Nimrod Megiddo, Ramakrishnan Srikant:
Range Queries in OLAP Data Cubes.
SIGMOD Conference 1997: 73-88
[22] ...
[23] ...
[24] ...
[25] ...
[26] Dao-I Lin, Zvi M. Kedem:
Pincer Search: A New Algorithm for Discovering the Maximum Frequent Set.
EDBT 1998: 105-119
[27] ...
[28] Carsten Lund, Mihalis Yannakakis:
On the Hardness of Approximating Minimization Problems (Extended Abstract).
STOC 1993: 286-293
[29] ...
[30] Manish Mehta, Rakesh Agrawal, Jorma Rissanen:
SLIQ: A Fast Scalable Classifier for Data Mining.
EDBT 1996: 18-32
[31] ...
[32] R. J. Miller, Yuping Yang:
Association Rules over Interval Data.
SIGMOD Conference 1997: 452-461
[33] Raymond T. Ng, Jiawei Han:
Efficient and Effective Clustering Methods for Spatial Data Mining.
VLDB 1994: 144-155
[34] ...
[35] ...
[36] ...
[37] John C. Shafer, Rakesh Agrawal, Manish Mehta:
SPRINT: A Scalable Parallel Classifier for Data Mining.
VLDB 1996: 544-555
[38] ...
[39] ...
[40] Ramakrishnan Srikant, Rakesh Agrawal:
Mining Quantitative Association Rules in Large Relational Tables.
SIGMOD Conference 1996: 1-12
[41] Hannu Toivonen:
Sampling Large Databases for Association Rules.
VLDB 1996: 134-145
[42] ...
[43] ...
[44] ...
[45] Tian Zhang, Raghu Ramakrishnan, Miron Livny:
BIRCH: An Efficient Data Clustering Method for Very Large Databases.
SIGMOD Conference 1996: 103-114

@inproceedings{DBLP:conf/sigmod/AgrawalGGR98,
  author    = {Rakesh Agrawal and Johannes Gehrke and Dimitrios Gunopulos and Prabhakar Raghavan},
  editor    = {Laura M. Haas and Ashutosh Tiwary},
  title     = {Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications},
  booktitle = {SIGMOD 1998, Proceedings ACM SIGMOD International Conference on Management of Data, June 2-4, 1998, Seattle, Washington, USA},
  publisher = {ACM Press},
  year      = {1998},
  isbn      = {0-89791-955-5},
  pages     = {94-105},
  crossref  = {DBLP:conf/sigmod/98},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

DBLP: Copyright ©1999 by Michael Ley (ley@uni-trier.de).