LOF: Identifying Density-Based Local Outliers


Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng, and Jörg Sander


Abstract

For many KDD applications, such as detecting criminal activities in e-commerce, finding the rare instances, or outliers, can be more interesting than finding the common patterns. Existing work in outlier detection regards being an outlier as a binary property. In this paper, we contend that for many scenarios, it is more meaningful to assign to each object a degree of being an outlier. This degree is called the local outlier factor (LOF) of an object. It is local in that the degree depends on how isolated the object is with respect to the surrounding neighborhood. We give a detailed formal analysis showing that LOF enjoys many desirable properties. Using real-world datasets, we demonstrate that LOF can be used to find outliers that appear to be meaningful but cannot otherwise be identified with existing approaches. Finally, a careful performance evaluation of our algorithm confirms that our approach of finding local outliers can be practical.
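The idea of a per-object outlier degree can be illustrated with a minimal brute-force sketch in Python. The abstract does not spell out the formulas, so the code follows the commonly cited definitions from the full paper (k-distance, reachability distance, local reachability density). The function name `local_outlier_factors`, the parameter name `k` (the paper's MinPts), the Euclidean metric, and the assumption that all points are distinct are choices made here for illustration only, not the authors' implementation.

```python
import math


def local_outlier_factors(points, k):
    """Return one LOF score per point using brute-force O(n^2) distances.

    Assumes Euclidean distance and pairwise distinct points (illustrative
    choices, not taken from the abstract).
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(points)")

    # Full pairwise distance matrix.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # k-distance of each point and its k-distance neighborhood.
    # Ties at the k-distance are all kept, so a neighborhood may hold
    # more than k points.
    k_distance = []
    neighbors = []
    for i in range(n):
        others = sorted((dist[i][j], j) for j in range(n) if j != i)
        kd = others[k - 1][0]
        k_distance.append(kd)
        neighbors.append([j for d, j in others if d <= kd])

    # Local reachability density: inverse of the mean reachability
    # distance from the point to its neighbors, where
    # reach-dist(p, o) = max(k-distance(o), d(p, o)).
    lrd = []
    for i in range(n):
        total = sum(max(k_distance[j], dist[i][j]) for j in neighbors[i])
        if total == 0:
            raise ValueError("duplicate points are not handled in this sketch")
        lrd.append(len(neighbors[i]) / total)

    # LOF: mean ratio of the neighbors' densities to the point's own density.
    return [
        sum(lrd[j] / lrd[i] for j in neighbors[i]) / len(neighbors[i])
        for i in range(n)
    ]


if __name__ == "__main__":
    # Four points forming a tight cluster plus one isolated point.
    sample = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    for point, score in zip(sample, local_outlier_factors(sample, k=2)):
        print(point, round(score, 3))
```

In this toy input the four clustered points should score close to 1, while the isolated point should score well above 1, which is the sense in which LOF expresses a degree rather than a yes/no label.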


Referenced by

  1. Sridhar Ramaswamy, Rajeev Rastogi, Kyuseok Shim: Efficient Algorithms for Mining Outliers from Large Data Sets. SIGMOD Conference 2000: 427-438

BIBTEX


@inproceedings{DBLP:conf/sigmod/BreunigKNS00,
  author    = {Markus M. Breunig and
               Hans-Peter Kriegel and
               Raymond T. Ng and
               J{\"o}rg Sander},
  editor    = {Weidong Chen and
               Jeffrey F. Naughton and
               Philip A. Bernstein},
  title     = {LOF: Identifying Density-Based Local Outliers},
  booktitle = {Proceedings of the 2000 ACM SIGMOD International Conference on
               Management of Data, May 16-18, 2000, Dallas, Texas, USA},
  journal   = {SIGMOD Record},
  publisher = {ACM},
  volume    = {29},
  number    = {2},
  year      = {2000},
  isbn      = {1-58113-218-2},
  pages     = {93-104},
  crossref  = {DBLP:conf/sigmod/2000},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}




DiSC'01 Copyright ©2002 ACM Inc.