Welcome to DiSC
This DVD contains the proceedings of the
Twelfth ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining (SIGKDD 2006),
held August 20-23, 2006, in Philadelphia, Pennsylvania.
Use the "PDF" link to retrieve a paper,
and the other links to find more information about it.
|
|
John A. Stankovic
Self-organizing wireless sensor networks in action 1
|
|
|
Andrew Moore
New cached-sufficient statistics algorithms for quickly answering statistical questions 2
|
|
|
Rakesh Agrawal
Next frontier 3
|
|
|
Elke Achtert, Christian Böhm, Hans-Peter Kriegel, Peer Kröger, Arthur Zimek
Deriving quantitative models for correlation clusters 4-13
|
|
|
Alekh Agarwal, Soumen Chakrabarti, Sunny Aggarwal
Learning to rank networked entities 14-23
|
|
|
Deepak Agarwal, Andrew McGregor, Jeff M. Phillips, Suresh Venkatasubramanian, Zhengyuan Zhu
Spatial scan statistics: approximations and performance study 24-33
|
|
|
Aris Anagnostopoulos, Michail Vlachos, Marios Hadjieleftheriou, Eamonn J. Keogh, Philip S. Yu
Global distance-based segmentation of trajectories 34-43
|
|
|
Lars Backstrom, Daniel P. Huttenlocher, Jon M. Kleinberg, Xiangyang Lan
Group formation in large social networks: membership, growth, and evolution 44-54
|
|
|
Daniel Barbará, Carlotta Domeniconi, James P. Rogers
Detecting outliers using transduction and statistical testing 55-64
|
|
|
Christian Böhm, Christos Faloutsos, Jia-Yu Pan, Claudia Plant
Robust information-theoretic clustering 65-75
|
|
|
Justin Brickell, Vitaly Shmatikov
Efficient anonymity-preserving data collection 76-85
|
|
|
Gregory Buehrer, Srinivasan Parthasarathy, Amol Ghoting
Out-of-core frequent pattern mining on a commodity PC 86-95
|
|
|
Toon Calders, Bart Goethals, Szymon Jaroszewicz
Mining rank-correlated sets of numerical attributes 96-105
|
|
|
Jin Chen, Wynne Hsu, Mong-Li Lee, See-Kiong Ng
NeMoFinder: dissecting genome-wide protein-protein interactions with meso-scale network motifs 106-115
|
|
|
Jason V. Davis, Inderjit S. Dhillon
Estimating the global pagerank of web communities 116-125
|
|
|
Chris H. Q. Ding, Tao Li, Wei Peng, Haesun Park
Orthogonal nonnegative matrix t-factorizations for clustering 126-135
|
|
|
Wei Fan, Joe McCloskey, Philip S. Yu
A general framework for accurate and fast regression by data summarization in random decision trees 136-146
|
|
|
Wei Fan, Ian Davidson
Reverse testing: an efficient framework to select amongst classifiers under sample selection bias 147-156
|
|
|
George Forman
Quantifying trends accurately despite classifier error and class imbalance 157-166
|
|
|
Aristides Gionis, Heikki Mannila, Taneli Mielikäinen, Panayiotis Tsaparas
Assessing data mining results via swap randomization 167-176
|
|
|
Kosuke Hashimoto, Kiyoko F. Aoki-Kinoshita, Nobuhisa Ueda, Minoru Kanehisa, Hiroshi Mamitsuka
A new efficient probabilistic model for mining labeled ordered trees 177-186
|
|
|
Steven C. H. Hoi, Michael R. Lyu, Edward Y. Chang
Learning the unified kernel machines for classification 187-196
|
|
|
Tamás Horváth, Jan Ramon, Stefan Wrobel
Frequent subgraph mining in outerplanar graphs 197-206
|
|
|
Alexander T. Ihler, Jon Hutchins, Padhraic Smyth
Adaptive event detection with time-varying Poisson processes 207-216
|
|
|
Thorsten Joachims
Training linear SVMs in linear time 217-226
|
|
|
Yiping Ke, James Cheng, Wilfred Ng
Mining quantitative correlated patterns using an information-theoretic approach 227-236
|
|
|
Arno J. Knobbe, Eric K. Y. Ho
Maximally informative k-itemsets and their efficient discovery 237-244
|
|
|
Yehuda Koren, Stephen C. North, Chris Volinsky
Measuring and extracting proximity in networks 245-255
|
|
|
Ravi Kumar, Kunal Punera, Andrew Tomkins
Hierarchical topic segmentation of websites 257-266
|
|
|
Longin Jan Latecki, Marc Sobel, Rolf Lakämper
New EM derived from Kullback-Leibler divergence 267-276
|
|
|
Kristen LeFevre, David J. DeWitt, Raghu Ramakrishnan
Workload-aware anonymization 277-286
|
|
|
Ping Li, Trevor Hastie, Kenneth Ward Church
Very sparse random projections 287-296
|
|
|
Bing Liu, Kaidi Zhao, Jeffrey Benkler, Weimin Xiao
Rule interestingness analysis using OLAP operations 297-306
|
|
|
Elsa Loekito, James Bailey
Fast mining of high dimensional expressive contrast patterns using zero-suppressed binary decision diagrams 307-316
|
|
|
Bo Long, Xiaoyun Wu, Zhongfei (Mark) Zhang, Philip S. Yu
Unsupervised learning on k-partite graphs 317-326
|
|
|
Michael W. Mahoney, Mauro Maggioni, Petros Drineas
Tensor-CUR decompositions for tensor-based data 327-336
|
|
|
Qiaozhu Mei, Dong Xin, Hong Cheng, Jiawei Han, ChengXiang Zhai
Generating semantic annotations for frequent patterns with context analysis 337-346
|
|
|
Taneli Mielikäinen, Evimaria Terzi, Panayiotis Tsaparas
Aggregating time partitions 347-356
|
|
|
Matthew J. Rattigan, Marc Maier, David Jensen
Using structure indices for efficient approximation of network properties 357-366
|
|
|
Rómer Rosales, Glenn Fung
Learning sparse metrics via linear programming 367-373
|
|
|
Jimeng Sun, Dacheng Tao, Christos Faloutsos
Beyond streams and graphs: dynamic tensor analysis 374-383
|
|
|
Lei Tang, Jianping Zhang, Huan Liu
Acclimatizing taxonomic semantics for hierarchical content classification from semantics to data-driven taxonomy 384-393
|
|
|
Yufei Tao, Xiaokui Xiao, Shuigeng Zhou
Mining distance-based outliers from large databases in any metric space 394-403
|
|
|
Hanghang Tong, Christos Faloutsos
Center-piece subgraphs: problem definition and fast solutions 404-413
|
|
|
Ke Wang, Benjamin C. M. Fung
Anonymizing sequential releases 414-423
|
|
|
Xuerui Wang, Andrew McCallum
Topics over time: a non-Markov continuous-time model of topical trends 424-433
|
|
|
Geoffrey I. Webb
Discovering significant rules 434-443
|
|
|
Dong Xin, Hong Cheng, Xifeng Yan, Jiawei Han
Extracting redundancy-aware top-k patterns 444-453
|
|
|
Jieping Ye, Tie Wang
Regularized discriminant analysis for high dimensional, low sample size data 454-463
|
|
|
Shipeng Yu, Kai Yu, Volker Tresp, Hans-Peter Kriegel, Mingrui Wu
Supervised probabilistic principal component analysis 464-473
|
|
|
Dell Zhang, Wee Sun Lee
Extracting key-substring-group features for text classification 474-483
|
|
|
Qiankun Zhao, Tie-Yan Liu, Sourav S. Bhowmick, Wei-Ying Ma
Event detection from evolution of click-through data 484-493
|
|
|
Jun Zhu, Zaiqing Nie, Ji-Rong Wen, Bo Zhang, Wei-Ying Ma
Simultaneous record detection and attribute labeling in web data extraction 494-503
|
|
|
Naoki Abe, Bianca Zadrozny, John Langford
Outlier detection by active learning 504-509
|
|
|
Charu C. Aggarwal, Jian Pei, Bo Zhang
On privacy preservation against adversarial data mining 510-516
|
|
|
Bavani Arunasalam, Sanjay Chawla
CCCS: a top-down associative classifier for imbalanced class distribution 517-522
|
|
|
Tanya Y. Berger-Wolf, Jared Saia
A framework for analysis of dynamic social networks 523-528
|
|
|
Indrajit Bhattacharya, Lise Getoor, Louis Licamele
Query-time entity resolution 529-534
|
|
|
Cristian Bucila, Rich Caruana, Alexandru Niculescu-Mizil
Model compression 535-541
|
|
|
Robin D. Burke, Bamshad Mobasher, Chad Williams, Runa Bhaumik
Classification features for attack detection in collaborative recommender systems 542-547
|
|
|
Vitor R. Carvalho, William W. Cohen
Single-pass online learning: performance, voting schemes and online feature selection 548-553
|
|
|
Deepayan Chakrabarti, Ravi Kumar, Andrew Tomkins
Evolutionary clustering 554-560
|
|
|
Aristides Gionis, Heikki Mannila, Kai Puolamäki, Antti Ukkonen
Algorithms for discovering bucket orders from data 561-566
|
|
|
Hongyu Guo, Herna L. Viktor
Mining relational data through correlation-based multiple view validation 567-573
|
|
|
Tomoharu Iwata, Kazumi Saito, Takeshi Yamada
Recommendation method for extending subscription periods 574-579
|
|
|
Wolfgang Jank, Galit Shmueli, Shanshan Wang
Dynamic, real-time forecasting of online auctions via functional models 580-585
|
|
|
Szymon Jaroszewicz
Polynomial association rules with applications to logistic regression 586-591
|
|
|
Nan Jiang, Le Gruenwald
CFI-Stream: mining closed frequent itemsets in data streams 592-597
|
|
|
Arnd Christian König, Eric Brill
Reducing the human overhead in text categorization 598-603
|
|
|
Deept Kumar, Naren Ramakrishnan, Richard F. Helm, Malcolm Potts
Algorithms for storytelling 604-610
|
|
|
Ravi Kumar, Jasmine Novak, Andrew Tomkins
Structure and evolution of online social networks 611-617
|
|
|
Sven Laur, Helger Lipmaa, Taneli Mielikäinen
Cryptographically private support vector machines 618-624
|
|
|
Hady Wirawan Lauw, Ee-Peng Lim, Ke Wang
Bias and controversy: beyond the statistical deviation 625-630
|
|
|
Jure Leskovec, Christos Faloutsos
Sampling from large graphs 631-636
|
|
|
Jinze Liu, Qi Zhang, Wei Wang, Leonard McMillan, Jan Prins
Clustering pair-wise dissimilarity data into partially ordered sets 637-642
|
|
|
Dharmesh M. Maniyar, Ian T. Nabney
Visual data mining using principled projection algorithms and information visualization techniques 643-648
|
|
|
Qiaozhu Mei, ChengXiang Zhai
A mixture model for contextual text mining 649-655
|
|
|
Srujana Merugu, Saharon Rosset, Claudia Perlich
A new multi-view regression approach with an application to customer wallet estimation 656-661
|
|
|
Riadh Ben Messaoud, Omar Boussaid, Sabine Loudcher Rabaséda
Efficient multidimensional data representations based on multiple correspondence analysis 662-667
|
|
|
Fabian Mörchen
Algorithms for time series knowledge mining 668-673
|
|
|
J. Saketha Nath, Chiranjib Bhattacharyya, M. Narasimha Murty
Clustering based large margin classification: a scalable approach using SOCP formulation 674-679
|
|
|
David Newman, Chaitanya Chemudugunta, Padhraic Smyth
Statistical entity-topic models 680-686
|
|
|
Noam Palatin, Arie Leizarowitz, Assaf Schuster, Ran Wolff
Mining for misconfigured machines in grid systems 687-692
|
|
|
Jia-Yu Pan, André G. R. Balan, Eric P. Xing, Agma J. M. Traina, Christos Faloutsos
Automatic mining of fruit fly embryo images 693-698
|
|
|
Seung-Taek Park, David Pennock, Omid Madani, Nathan Good, Dennis DeCoste
Naïve filterbots for robust cold-start recommendations 699-705
|
|
|
Myra Spiliopoulou, Irene Ntoutsi, Yannis Theodoridis, Rene Schult
MONIC: modeling and monitoring cluster transitions 706-711
|
|
|
Fabian M. Suchanek, Georgiana Ifrim, Gerhard Weikum
Combining linguistic and statistical analysis to extract relations from web documents 712-717
|
|
|
Bin Tan, Xuehua Shen, ChengXiang Zhai
Mining long-term search history to improve search accuracy 718-723
|
|
|
Ivor W. Tsang, András Kocsor, James T. Kwok
Efficient kernel feature extraction for massive data sets 724-729
|
|
|
Chao Wang, Srinivasan Parthasarathy
Summarizing itemset patterns using probabilistic models 730-735
|
|
|
Haixun Wang, Jian Yin, Jian Pei, Philip S. Yu, Jeffrey Xu Yu
Suppressing model overfitting in mining concept-drifting data streams 736-741
|
|
|
Steve Wedig, Omid Madani
A large-scale analysis of query logs for assessing personalization opportunities 742-747
|
|
|
Li Wei, Eamonn J. Keogh
Semi-supervised time series classification 748-753
|
|
|
Raymond Chi-Wing Wong, Jiuyong Li, Ada Wai-Chee Fu, Ke Wang
(alpha, k)-anonymity: an enhanced k-anonymity model for privacy preserving data publishing 754-759
|
|
|
Gang Wu, Edward Y. Chang, Yen-Kuang Chen, Christopher Hughes
Incremental approximate matrix factorization for speeding up support vector machines 760-766
|
|
|
Mingxi Wu, Chris Jermaine
Outlier detection by sampling with accuracy guarantees 767-772
|
|
|
Dong Xin, Xuehua Shen, Qiaozhu Mei, Jiawei Han
Discovering interesting patterns through user's interactive feedback 773-778
|
|
|
Hui Xiong, Junjie Wu, Jian Chen
K-means clustering versus validation measures: a data distribution perspective 779-784
|
|
|
Jian Xu, Wei Wang, Jian Pei, Xiaoyuan Wang, Baile Shi, Ada Wai-Chee Fu
Utility-based anonymization using local recoding 785-790
|
|
|
Illhoi Yoo, Xiaohua Hu, Il-Yeol Song
Integration of semantic-based bipartite graph representation and mutual refinement strategy for biomedical literature clustering 791-796
|
|
|
Zhiping Zeng, Jianyong Wang, Lizhu Zhou, George Karypis
Coherent closed quasi-clique discovery from large dense graph databases 797-802
|
|
|
Minghua Zhang, Wynne Hsu, Mong-Li Lee
Mining progressive confident rules 803-808
|
|
|
Sheng Zhang, Amit Chakrabarti, James Ford, Fillia Makedon
Attack detection in time series for recommender systems 809-814
|
|
|
Shichao Zhang, Feng Chen, Xindong Wu, Chengqi Zhang
Identifying bridging rules between conceptual clusters 815-820
|
|
|
Tong Zhang, Alexandrin Popescul, Byron Dom
Linear prediction models with graph regularization for web-page categorization 821-826
|
|
|
Lizhuang Zhao, Mohammed J. Zaki, Naren Ramakrishnan
BLOSOM: a framework for mining arbitrary boolean expressions 827-832
|
|
|
Jeff Jonas
Introducing perpetual analytics 833
|
|
|
William Kahn
Capital One's statistical problems: our top ten list 834
|
|
|
Andrew McCallum
Information extraction, data mining and joint inference 835
|
|
|
Michael Cavaretta
Data mining challenges in the automotive domain 836
|
|
|
Jinbo Bi, Senthil Periaswamy, Kazunori Okada, Toshiro Kubota, Glenn Fung, Marcos Salganicoff, R. Bharat Rao
Computer aided detection via asymmetric cascade of sparse hyperplane classifiers 837-844
|
|
|
Rebecca Castaño, Dominic Mazzoni, Nghia Tang, Ronald Greeley, Thomas Doggett, Benjamin Cichy, Steve A. Chien, Ashley Davies
Onboard classifiers for science event detection on a remote sensing spacecraft 845-851
|
|
|
George Forman, Evan Kirshenbaum, Jaap Suermondt
Pragmatic text mining: minimizing human effort to quantify many issues in call logs 852-861
|
|
|
Seth Hettich, Michael J. Pazzani
Mining for proposal reviewers: lessons learned at the National Science Foundation 862-871
|
|
|
Chao Liu, Chen Chen, Jiawei Han, Philip S. Yu
GPLAG: detection of software plagiarism by program dependence graph analysis 872-881
|
|
|
Fabian Mörchen, Ingo Mierswa, Alfred Ultsch
Understandable models of music collections based on exhaustive feature generation with temporal statistics 882-891
|
|
|
Kaidi Zhao, Bing Liu, Jeffrey Benkler, Weimin Xiao
Opportunity map: identifying causes of failure - a deployed data mining system 892-901
|
|
|
Eugene Agichtein, Zijian Zheng
Identifying "best bet" web search results by mining past user behavior 902-908
|
|
|
Rich Caruana, Mohamed Farid Elhawary, Art Munson, Mirek Riedewald, Daria Sorokina, Daniel Fink, Wesley M. Hochachka, Steve Kelling
Mining citizen science data to predict prevalence of wild bird species 909-915
|
|
|
Julien Etienne, Bernd Wachmann, Lei Zhang
A component-based framework for knowledge discovery in bioinformatics 916-921
|
|
|
Byron J. Gao, Obi L. Griffith, Martin Ester, Steven J. M. Jones
Discovering significant OPSM subspace clusters in massive gene expression data 922-928
|
|
|
Charles X. Ling, Victor S. Sheng, Tilmann F. W. Bruckhaus, Nazim H. Madhavji
Maximum profit mining and its application in software development 929-934
|
|
|
Ingo Mierswa, Michael Wurst, Ralf Klinkenberg, Martin Scholz, Timm Euler
YALE: rapid prototyping for complex data mining tasks 935-940
|
|
|
Sankar Virdhagriswaran, Gordon Dakin
Camouflaged fraud detection in domains with complex relationships 941-947
|
|
|
Lian Yan, Patrick Baldasare
Beyond classification and ranking: constrained optimization of the ROI 948-953
|
|
|
Gregory Piatetsky-Shapiro, Robert Grossman, Chabane Djeraba, Ronen Feldman, Lise Getoor, Mohammed Javeed Zaki
Is there a grand challenge or X-prize for data mining? 954-956
|
Copyright ©2010 Association for Computing Machinery
|