This DVD contains the proceedings of the
15th ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining (KDD 2009),
which was held June 28 to July 1, 2009, in Paris, France.
Use the "PDF" link to retrieve a paper,
and the other links to find more information about it.
|
|
David J. Hand
Mismatched models, wrong results, and dreadful decisions: on choosing appropriate data mining tools 1-2
|
|
|
Ravi Kumar
Mining web logs: applications and challenges 3-4
|
|
|
Heikki Mannila
Randomization methods in data mining 5-6
|
|
|
Ashok N. Srivastava
Data mining at NASA: from theory to applications 7-8
|
|
|
Stanley Wasserman
Network science: an introduction to recent statistical approaches 9-10
|
|
|
Michael Zeller, Robert Grossman, Christoph Lingenfelder, Michael R. Berthold, Erik Marcade, Rick Pechter, Mike Hoskins, Wayne Thompson, Rich Holada
Open standards and cloud computing: KDD-2009 panel report 11-18
|
|
|
Deepak Agarwal, Bee-Chung Chen
Regression-based latent factor models 19-28
|
|
|
Charu C. Aggarwal, Yan Li, Jianyong Wang, Jing Wang
Frequent pattern mining with uncertain data 29-38
|
|
|
Amr Ahmed, Eric P. Xing, William W. Cohen, Robert F. Murphy
Structured correspondence topic models for mining captioned figures in biological literature 39-48
|
|
|
Anurag Ambekar, Charles B. Ward, Jahangir Mohammed, Swapna Male, Steven Skiena
Name-ethnicity classification from open sources 49-58
|
|
|
Shin Ando, Einoshin Suzuki
Detection of unique temporal segments by information theoretic meta-clustering 59-68
|
|
|
Mafruz Zaman Ashrafi, See-Kiong Ng
Collusion-resistant anonymous data collection method 69-78
|
|
|
Sitaram Asur, Srinivasan Parthasarathy
A viewpoint-based approach for interaction graph analysis 79-88
|
|
|
Lars Backstrom, Jon M. Kleinberg, Ravi Kumar
Optimizing web traffic via the media scheduling problem 89-98
|
|
|
Ron Bekkerman, Martin Scholz, Krishnamurthy Viswanathan
Improving clustering stability with combinatorial MRFs 99-108
|
|
|
Michele Berlingerio, Fabio Pinelli, Mirco Nanni, Fosca Giannotti
Temporal mining for interactive workflow data analysis 109-118
|
|
|
Thomas Bernecker, Hans-Peter Kriegel, Matthias Renz, Florian Verhein, Andreas Züfle
Probabilistic frequent itemset mining in uncertain databases 119-128
|
|
|
Alina Beygelzimer, John Langford
The offset tree for learning with partial labels 129-138
|
|
|
Albert Bifet, Geoffrey Holmes, Bernhard Pfahringer, Richard Kirkby, Ricard Gavaldà
New ensemble methods for evolving data streams 139-148
|
|
|
Christian Böhm, Katrin Haegler, Nikola S. Müller, Claudia Plant
CoCo: coding cost for parameter-free outlier detection 149-158
|
|
|
Yingyi Bu, Lei Chen, Ada Wai-Chee Fu, Dawei Liu
Efficient anomaly monitoring over moving object trajectory streams 159-168
|
|
|
Jonathan Chang, Jordan L. Boyd-Graber, David M. Blei
Connections between the lines: augmenting social networks with text 169-178
|
|
|
Bo Chen, Wai Lam, Ivor W. Tsang, Tak-Lam Wong
Extracting discriminative concepts for domain adaptation in text mining 179-188
|
|
|
Minmin Chen, Yixin Chen, Michael R. Brent, Aaron E. Tenney
Constrained optimization for validation-guided conditional random field learning 189-198
|
|
|
Wei Chen, Yajun Wang, Siyu Yang
Efficient influence maximization in social networks 199-208
|
|
|
Ye Chen, Dmitry Pavlov, John F. Canny
Large-scale behavioral targeting 209-218
|
|
|
Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, Michael Mitzenmacher, Alessandro Panconesi, Prabhakar Raghavan
On compressing social networks 219-228
|
|
|
Erick Delage
Regret-based online ranking for a growing digital library 229-238
|
|
|
Hongbo Deng, Michael R. Lyu, Irwin King
A generalized Co-HITS algorithm and its application to bipartite graphs 239-248
|
|
|
Meghana Deodhar, Joydeep Ghosh
Mining for the most certain predictions from dyadic data 249-258
|
|
|
Pinar Donmez, Jaime G. Carbonell, Jeff Schneider
Efficiently learning the accuracy of labeling sources for selective sampling 259-268
|
|
|
Nan Du, Christos Faloutsos, Bai Wang, Leman Akoglu
Large human communication networks: patterns and a utility-driven generator 269-278
|
|
|
Murat Dundar, E. Daniel Hirleman, Arun K. Bhunia, J. Paul Robinson, Bartek Rajwa
Learning with a non-exhaustive training dataset: a case study: detection of bacteria cultures using optical-scattering technology 279-288
|
|
|
Khalid El-Arini, Gaurav Veda, Dafna Shahaf, Carlos Guestrin
Turning down the noise in the blogosphere 289-298
|
|
|
George Forman, Martin Scholz, Shyamsundar Rajaram
Feature shaping for linear SVM classifiers 299-308
|
|
|
Richard Frank, Martin Ester, Arno J. Knobbe
A multi-relational approach to spatial classification 309-318
|
|
|
Antonino Freno, Edmondo Trentin, Marco Gori
Scalable pseudo-likelihood estimation in hybrid random fields 319-328
|
|
|
João Gama, Raquel Sebastião, Pedro Pereira Rodrigues
Issues in evaluation of stream learning algorithms 329-338
|
|
|
Jing Gao, Wei Fan, Yizhou Sun, Jiawei Han
Heterogeneous source consensus learning via decision propagation and negotiation 339-348
|
|
|
Yong Ge, Hui Xiong, Wenjun Zhou, Ramendra K. Sahoo, Xiaofeng Gao, Weili Wu
Multi-focal learning and its application to customer service support 349-358
|
|
|
Quanquan Gu, Jie Zhou
Co-clustering on manifolds 359-368
|
|
|
Lei Guo, Enhua Tan, Songqing Chen, Xiaodong Zhang, Yihong Eric Zhao
Analyzing patterns of user content generation in online social networks 369-378
|
|
|
Sami Hanhijärvi, Markus Ojala, Niko Vuokko, Kai Puolamäki, Nikolaj Tatti, Heikki Mannila
Tell me something I don't know: randomization strategies for iterative data mining 379-388
|
|
|
Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, Xiaohua Zhou
Exploiting Wikipedia as external knowledge for document clustering 389-396
|
|
|
Mohsen Jamali, Martin Ester
TrustWalker: a random walk model for combining trust-based and item-based recommendation 397-406
|
|
|
Shuiwang Ji, Lei Yuan, Ying-Xin Li, Zhi-Hua Zhou, Sudhir Kumar, Jieping Ye
Drosophila gene expression pattern annotation using sparse features and term-term interactions 407-416
|
|
|
Ruoming Jin, Yang Xiang, Lin Liu
Cartesian contour: a concise representation for a collection of frequent sets 417-426
|
|
|
Aleksander Kolcz, Gordon V. Cormack
Genre-based decomposition of email class noise 427-436
|
|
|
Arne Koopman, Arno Siebes
Characteristic relational patterns 437-446
|
|
|
Yehuda Koren
Collaborative filtering with temporal dynamics 447-456
|
|
|
Sayali Kulkarni, Amit Singh, Ganesh Ramakrishnan, Soumen Chakrabarti
Collective annotation of Wikipedia entities in web text 457-466
|
|
|
Theodoros Lappas, Kun Liu, Evimaria Terzi
Finding a team of experts in social networks 467-476
|
|
|
Theodoros Lappas, Benjamin Arai, Manolis Platakis, Dimitrios Kotsakos, Dimitrios Gunopulos
On burstiness-aware search for document sequences 477-486
|
|
|
Mark Last
Improving data mining utility with projective sampling 487-496
|
|
|
Jure Leskovec, Lars Backstrom, Jon M. Kleinberg
Meme-tracking and the dynamics of the news cycle 497-506
|
|
|
Lei Li, James McCann, Nancy S. Pollard, Christos Faloutsos
DynaMMo: mining and summarization of coevolving sequences with missing values 507-516
|
|
|
Tiancheng Li, Ninghui Li
On the tradeoff between privacy and utility in data publishing 517-526
|
|
|
Yu-Ru Lin, Jimeng Sun, Paul Castro, Ravi B. Konuru, Hari Sundaram, Aisling Kelliher
MetaFac: community discovery via relational hypergraph factorization 527-536
|
|
|
Chao Liu, Fan Guo, Christos Faloutsos
BBM: Bayesian browsing model from petabyte-scale data 537-546
|
|
|
Jun Liu, Jianhui Chen, Jieping Ye
Large-scale sparse logistic regression 547-556
|
|
|
David Lo, Hong Cheng, Jiawei Han, Siau-Cheng Khoo, Chengnian Sun
Classification of software behaviors for failure detection: a discriminative pattern mining approach 557-566
|
|
|
Steven Loscalzo, Lei Yu, Chris H. Q. Ding
Consensus group stable feature selection 567-576
|
|
|
Aurelie C. Lozano, Naoki Abe, Yan Liu, Saharon Rosset
Grouped graphical Granger modeling methods for temporal causal modeling 577-586
|
|
|
Aurelie C. Lozano, Hongfei Li, Alexandru Niculescu-Mizil, Yan Liu, Claudia Perlich, Jonathan R. M. Hosking, Naoki Abe
Spatial-temporal causal modeling for climate change attribution 587-596
|
|
|
Sofus A. Macskassy
Using graph-based metrics with empirical risk minimization to speed up active learning on networked data 597-606
|
|
|
R. Dean Malmgren, Jake M. Hofman, Luis A. Nunes Amaral, Duncan J. Watts
Characterizing individual communication patterns 607-616
|
|
|
Andreas Maunz, Christoph Helma, Stefan Kramer
Large-scale graph mining using backbone refinement classes 617-626
|
|
|
Frank McSherry, Ilya Mironov
Differentially private recommender systems: building privacy into the Netflix Prize contenders 627-636
|
|
|
Anna Monreale, Fabio Pinelli, Roberto Trasarti, Fosca Giannotti
WhereNext: a location predictor on trajectory pattern mining 637-646
|
|
|
Siegfried Nijssen, Tias Guns, Luc De Raedt
Correlated itemset mining in ROC space: a constraint programming approach 647-656
|
|
|
Kensuke Onuma, Hanghang Tong, Christos Faloutsos
TANGENT: a novel, 'Surprise me', recommendation algorithm 657-666
|
|
|
Rong Pan, Martin Scholz
Mind the gaps: weighting the unknown in large-scale one-class collaborative filtering 667-676
|
|
|
Gaurav Pandey, Gowtham Atluri, Michael Steinbach, Chad L. Myers, Vipin Kumar
An association analysis approach to biclustering 677-686
|
|
|
Ardian Kristanto Poernomo, Vivekanand Gopalkrishnan
CP-summary: a concise representation for browsing frequent itemsets 687-696
|
|
|
Ardian Kristanto Poernomo, Vivekanand Gopalkrishnan
Towards efficient mining of proportional fault-tolerant frequent itemsets 697-706
|
|
|
Foster J. Provost, Brian Dalessandro, Rod Hook, Xiaohan Zhang, Alan Murray
Audience selection for on-line brand advertising: privacy-friendly social network targeting 707-716
|
|
|
Zijie Qi, Ian Davidson
A principled and flexible framework for finding alternative clusterings 717-726
|
|
|
Steffen Rendle, Leandro Balby Marinho, Alexandros Nanopoulos, Lars Schmidt-Thieme
Learning optimal ranking with tensor factorization for tag recommendation 727-736
|
|
|
Venu Satuluri, Srinivasan Parthasarathy
Scalable graph clustering using stochastic flows: applications to community discovery 737-746
|
|
|
Jerry Scripps, Pang-Ning Tan, Abdol-Hossein Esfahanian
Measuring the effects of preprocessing decisions and network forces in dynamic network analysis 747-756
|
|
|
Bao-Hong Shen, Shuiwang Ji, Jieping Ye
Mining discrete patterns via binary matrix factorization 757-766
|
|
|
Lei Shi, Vandana Pursnani Janeja
Anomalous window discovery through scan statistics for linear intersecting paths (SSLIP) 767-776
|
|
|
Xiaolin Shi, Jun Zhu, Rui Cai, Lei Zhang
User grouping behavior in online forums 777-786
|
|
|
Takashi Shibuya, Tatsuya Harada, Yasuo Kuniyoshi
Causality quantification and its applications: structuring and modeling of multivariate time series 787-796
|
|
|
Yizhou Sun, Yintao Yu, Jiawei Han
Ranking-based clustering of heterogeneous information networks with star network schema 797-806
|
|
|
Jie Tang, Jimeng Sun, Chi Wang, Zi Yang
Social influence analysis in large-scale networks 807-816
|
|
|
Lei Tang, Huan Liu
Relational learning via latent social dimensions 817-826
|
|
|
Chayant Tantipathananandh, Tanya Y. Berger-Wolf
Constant-factor approximation algorithms for identifying dynamic communities 827-836
|
|
|
Charalampos E. Tsourakakis, U. Kang, Gary L. Miller, Christos Faloutsos
DOULION: counting triangles in massive graphs with a coin 837-846
|
|
|
Pavan Vatturi, Weng-Keen Wong
Category detection using hierarchical mean shift 847-856
|
|
|
Ting Wang, Mudhakar Srivatsa, Dakshi Agrawal, Ling Liu
Learning, indexing, and diagnosing network faults 857-866
|
|
|
Xuanhui Wang, Deepayan Chakrabarti, Kunal Punera
Mining broad latent query aspects from search sessions 867-876
|
|
|
Junjie Wu, Hui Xiong, Jian Chen
Adapting the right measures for K-means clustering 877-886
|
|
|
Mingxi Wu, Xiuyao Song, Chris Jermaine, Sanjay Ranka, John Gums
A LRT framework for fast spatial anomaly detection 887-896
|
|
|
Jack Chongjie Xue, Gary M. Weiss
Quantification and semi-supervised classification methods for handling changes in class distribution 897-906
|
|
|
Donghui Yan, Ling Huang, Michael I. Jordan
Fast approximate spectral clustering 907-916
|
|
|
Bishan Yang, Jian-Tao Sun, Tengjiao Wang, Zheng Chen
Effective multi-label active learning for text classification 917-926
|
|
|
Tianbao Yang, Rong Jin, Yun Chi, Shenghuo Zhu
Combining link and content for community detection: a discriminative approach 927-936
|
|
|
Limin Yao, David M. Mimno, Andrew McCallum
Efficient methods for topic model inference on streaming document collections 937-946
|
|
|
Lexiang Ye, Eamonn J. Keogh
Time series shapelets: a new primitive for data mining 947-956
|
|
|
Zhijun Yin, Rui Li, Qiaozhu Mei, Jiawei Han
Exploring social tagging graph for web object classification 957-966
|
|
|
Shinjae Yoo, Yiming Yang, Frank Lin, Il-Chul Moon
Mining social networks for personalized email prioritization 967-976
|
|
|
Chang Hun You, Lawrence B. Holder, Diane J. Cook
Learning patterns in the dynamics of biological networks 977-986
|
|
|
Xiangliang Zhang, Cyril Furtlehner, Julien Perez, Cécile Germain-Renaud, Michèle Sebag
Toward autonomic grids: analyzing the job flow with affinity streaming 987-996
|
|
|
Yuzhou Zhang, Jianyong Wang, Yi Wang, Lizhu Zhou
Parallel community detection on large networks with propinquity dynamics 997-1006
|
|
|
Elena Zheleva, Hossam Sharara, Lise Getoor
Co-evolution of social and affiliation networks 1007-1016
|
|
|
Lei Zheng, Shaojun Wang, Yan Liu, Chi-Hoon Lee
Information theoretic regularization for semi-supervised boosting 1017-1026
|
|
|
ErHeng Zhong, Wei Fan, Jing Peng, Kun Zhang, Jiangtao Ren, Deepak S. Turaga, Olivier Verscheure
Cross domain distribution adaptation via kernel mapping 1027-1036
|
|
|
Guangyu Zhu, Gilad Mishne
Mining rich session context to improve web search 1037-1046
|
|
|
Jun Zhu, Eric P. Xing, Bo Zhang
Primal sparse Max-margin Markov networks 1047-1056
|
|
|
Qiang Zhu, Xiaoyue Wang, Eamonn J. Keogh, Sang-Hee Lee
Augmenting the generalized Hough transform to enable the mining of petroglyphs 1057-1066
|
|
|
Josh Attenberg, Sandeep Pandey, Torsten Suel
Modeling and predicting user behavior in sponsored search 1067-1076
|
|
|
Indrajit Bhattacharya, Shantanu Godbole, Ajay Gupta, Ashish Verma, Jeff Achtermann, Kevin English
Enabling analysts in managed services for CRM analytics 1077-1086
|
|
|
Ludmila Cherkasova, Kave Eshghi, Charles B. Morrey III, Joseph Tucek, Alistair C. Veitch
Applying syntactic similarity algorithms for enterprise information management 1087-1096
|
|
|
Wei Chu, Seung-Taek Park, Todd Beaupre, Nitin Motgi, Amit Phadke, Seinjuti Chakraborty, Joe Zachariah
A case study of behavior-driven conjoint analysis on Yahoo! Front Page Today Module 1097-1104
|
|
|
Thomas Crook, Brian Frasca, Ron Kohavi, Roger Longbotham
Seven pitfalls to avoid when running controlled experiments on the web 1105-1114
|
|
|
Srivatsava Daruru, Nena M. Marin, Matt Walker, Joydeep Ghosh
Pervasive parallelism in data mining: dataflow solution to co-clustering large and sparse Netflix data 1115-1124
|
|
|
Xiaowen Ding, Bing Liu, Lei Zhang
Entity discovery and assignment for opinion mining applications 1125-1134
|
|
|
Xiaoxi Du, Ruoming Jin, Liang Ding, Victor E. Lee, John H. Thornton Jr.
Migration motif: a spatial-temporal pattern mining approach for financial markets 1135-1144
|
|
|
Ariel Fuxman, Anitha Kannan, Andrew B. Goldberg, Rakesh Agrawal, Panayiotis Tsaparas, John C. Shafer
Improving classification accuracy using automatically extracted training data 1145-1154
|
|
|
Honglei Guo, Huijia Zhu, Zhili Guo, Xiaoxun Zhang, Zhong Su
Address standardization with latent semantic association 1155-1164
|
|
|
Sonal Gupta, Mikhail Bilenko, Matthew Richardson
Catching the drift: learning broad matches from clickthrough data 1165-1174
|
|
|
Mohammad Al Hasan, W. Scott Spangler, Thomas D. Griffin, Alfredo Alba
COA: finding novel patents through text analysis 1175-1184
|
|
|
Shunsuke Hirose, Kenji Yamanishi, Takayuki Nakata, Ryohei Fujimaki
Network anomaly detection based on Eigen equation compression 1185-1194
|
|
|
Wei Jin, Hung Hay Ho, Rohini K. Srihari
OpinionMiner: a novel machine learning system for web opinion mining and extraction 1195-1204
|
|
|
Jongwuk Lee, Seung-won Hwang, Zaiqing Nie, Ji-Rong Wen
Query result clustering for object-level search 1205-1214
|
|
|
Ming Li, M. Benjamin Dias, Ian H. Jarman, Wael El-Deredy, Paulo J. G. Lisboa
Grocery shopping recommendations based on basket-sensitive random walk 1215-1224
|
|
|
Yan Liu, Jayant R. Kalagnanam, Oivind Johnsen
Learning dynamic temporal graphs for oil-production equipment monitoring system 1225-1234
|
|
|
Ping Luo, Fen Lin, Yuhong Xiong, Yong Zhao, Zhongzhi Shi
Towards combining web classification and web information extraction: a case study 1235-1244
|
|
|
Justin Ma, Lawrence K. Saul, Stefan Savage, Geoffrey M. Voelker
Beyond blacklists: learning to detect malicious web sites from suspicious URLs 1245-1254
|
|
|
Adetokunbo Makanju, A. Nur Zincir-Heywood, Evangelos E. Milios
Clustering event logs using iterative partitioning 1255-1264
|
|
|
Mary McGlohon, Stephen Bay, Markus G. Anderle, David M. Steier, Christos Faloutsos
SNARE: a link analytic system for graph labeling and risk detection 1265-1274
|
|
|
Prem Melville, Wojciech Gryc, Richard D. Lawrence
Sentiment analysis of blogs by combining lexical knowledge with text classification 1275-1284
|
|
|
Noman Mohammed, Benjamin C. M. Fung, Patrick C. K. Hung, Cheuk-kwong Lee
Anonymizing healthcare data: a case study on the blood transfusion service 1285-1294
|
|
|
Kivanc M. Ozonat, Donald Young
Towards a universal marketplace over the web: statistical multi-label classification of service provider forms with simulated annealing 1295-1304
|
|
|
Debprakash Patnaik, Manish Marwah, Ratnesh K. Sharma, Naren Ramakrishnan
Sustainable operation and management of data center chillers using temporal data mining 1305-1314
|
|
|
B. Aditya Prakash, Nicholas Valler, David Andersen, Michalis Faloutsos, Christos Faloutsos
BGP-lens: patterns and anomalies in internet routing updates 1315-1324
|
|
|
D. Sculley, Robert G. Malkin, Sugato Basu, Roberto J. Bayardo
Predicting bounce rates in sponsored search advertisements 1325-1334
|
|
|
Liang Sun, Rinkal Patel, Jun Liu, Kewei Chen, Teresa Wu, Jing Li, Eric Reiman, Jieping Ye
Mining brain region connectivity for Alzheimer's disease study via sparse inverse covariance estimation 1335-1344
|
|
|
Junfeng Wang, Chun Chen, Can Wang, Jian Pei, Jiajun Bu, Ziyu Guan, Wei Vivian Zhang
Can we learn a template-independent wrapper for news article extraction from a single training site? 1345-1354
|
|
|
Kuansan Wang, Toby Walker, Zijian Zheng
PSkip: estimating relevance ranking quality from web search clickthrough data 1355-1364
|
|
|
Gu Xu, Shuang-Hong Yang, Hang Li
Named entity mining from click-through data using weakly supervised latent Dirichlet allocation 1365-1374
|
|
|
Jiang-Ming Yang, Rui Cai, Chunsong Wang, Hua Huang, Lei Zhang, Wei-Ying Ma
Incorporating site-level knowledge for incremental crawling of web forums: a list-wise strategy 1375-1384
|
|
|
Yanfang Ye, Tao Li, Qingshan Jiang, Zhixue Han, Li Wan
Intelligent file scoring system for malware detection from the gray list 1385-1394
|
|
|
Bin Zhou, Daxin Jiang, Jian Pei, Hang Li
OLAP on search logs: an infrastructure supporting data-driven applications in search engines 1395-1404
|
Copyright ©2010 Association for Computing Machinery
|