Welcome to DiSC
This DVD contains the proceedings of the
14th ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining (KDD 2008),
which was held August 24-27, 2008, in Las Vegas, Nevada.
Use the "PDF" link to retrieve a paper,
and the other links to find more information about it.
|
|
Benjamin Edelman, Michael Schwarz
Internet advertising and optimal auction design 1
|
|
|
Thore Graepel, Ralf Herbrich
Large scale data analysis and modelling in online services and advertising 2
|
|
|
Trevor Hastie, Jerome Friedman, Robert Tibshirani
Regularization paths and coordinate descent 3
|
|
|
Jitendra Malik
The future of image search 4
|
|
|
Udo Miletzki
Genesis of postal address reading, current state and future prospects: thirty years of pattern recognition on duty of postal services 5-6
|
|
|
Aris Anagnostopoulos, Ravi Kumar, Mohammad Mahdian
Influence and correlation in social networks 7-15
|
|
|
Luca Becchetti, Paolo Boldi, Carlos Castillo, Aristides Gionis
Efficient semi-streaming algorithms for local triangle counting in massive graphs 16-24
|
|
|
Indrajit Bhattacharya, Shantanu Godbole, Sachindra Joshi
Structured entity identification and document categorization: two tasks with one joint model 25-33
|
|
|
Albert Bifet, Ricard Gavaldà
Mining adaptively frequent closed unlabeled rooted trees in data streams 34-42
|
|
|
Mustafa Bilgic, Lise Getoor
Effective label acquisition for collective classification 43-51
|
|
|
Francesco Bonchi, Carlos Castillo, Debora Donato, Aristides Gionis
Topical query decomposition 52-60
|
|
|
Christos Boutsidis, Michael W. Mahoney, Petros Drineas
Unsupervised feature selection for principal components analysis 61-69
|
|
|
Justin Brickell, Vitaly Shmatikov
The cost of privacy: destruction of data-mining utility in anonymized data publishing 70-78
|
|
|
Deepayan Chakrabarti, Ravi Kumar, Kunal Punera
Generating succinct titles for web URLs 79-87
|
|
|
Soumen Chakrabarti, Rajiv Khanna, Uma Sawant, Chiru Bhattacharyya
Structured learning for non-smooth ranking losses 88-96
|
|
|
Ming-wei Chang, Wen-tau Yih, Christopher Meek
Partitioned logistic regression for spam filtering 97-105
|
|
|
Jianhui Chen, Shuiwang Ji, Betul Ceran, Qi Li, Mingrui Wu, Jieping Ye
Learning subspace kernels for classification 106-114
|
|
|
WenYen Chen, Dong Zhang, Edward Y. Chang
Combinational collaborative filtering for personalized community recommendation 115-123
|
|
|
Xue-wen Chen, Michael Wasikowski
FAST: a ROC-based feature selection metric for small samples and imbalanced data classification problems 124-132
|
|
|
Haibin Cheng, Pang-Ning Tan
Semi-supervised learning with data calibration for long-term time series forecasting 133-141
|
|
|
Yong Ju Cho, Naren Ramakrishnan, Yang Cao
Reconstructing chemical reaction networks: data mining meets system identification 142-150
|
|
|
Peter Christen
Automatic record linkage using seeded nearest neighbour and support vector machine classification 151-159
|
|
|
David J. Crandall, Dan Cosley, Daniel P. Huttenlocher, Jon M. Kleinberg, Siddharth Suri
Feedback effects between similarity and social influence in online communities 160-168
|
|
|
Kaustav Das, Jeff G. Schneider, Daniel B. Neill
Anomaly pattern detection in categorical datasets 169-176
|
|
|
Atish Das Sarma, Sreenivas Gollapudi, Samuel Ieong
Bypass rates: reducing query abandonment using negative inferences 177-185
|
|
|
Anirban Dasgupta, Ravi Kumar, Amit Sasturkar
De-duping URLs via rewrite rules 186-194
|
|
|
Jason V. Davis, Inderjit S. Dhillon
Structured metric learning for high dimensional problems 195-203
|
|
|
Luc De Raedt, Tias Guns, Siegfried Nijssen
Constraint programming for itemset mining 204-212
|
|
|
Charles Elkan, Keith Noto
Learning classifiers from only positive and unlabeled data 213-220
|
|
|
Kave Eshghi, Shyamsundar Rajaram
Locality sensitive hash functions based on concomitant rank order statistics 221-229
|
|
|
Wei Fan, Kun Zhang, Hong Cheng, Jing Gao, Xifeng Yan, Jiawei Han, Philip S. Yu, Olivier Verscheure
Direct mining of discriminative and essential frequent patterns via model-based search tree 230-238
|
|
|
George Forman, Shyamsundar Rajaram
Scaling up text classification for large file systems 239-246
|
|
|
Yasuhiro Fujiwara, Yasushi Sakurai, Masashi Yamamuro
SPIRAL: efficient and exact model identification for hidden Markov models 247-255
|
|
|
Brian Gallagher, Hanghang Tong, Tina Eliassi-Rad, Christos Faloutsos
Using ghost edges for classification in sparsely labeled networks 256-264
|
|
|
Srivatsava Ranjit Ganta, Shiva Prasad Kasiviswanathan, Adam Smith
Composition attacks and auxiliary information in data privacy 265-273
|
|
|
Venkatesh Ganti, Arnd Christian König, Rares Vernica
Entity categorization over large document collections 274-282
|
|
|
Jing Gao, Wei Fan, Jing Jiang, Jiawei Han
Knowledge transfer via multiple model local structure mapping 283-291
|
|
|
Gemma C. Garriga, Esa Junttila, Heikki Mannila
Banded structure in binary matrices 292-300
|
|
|
Rohit Gupta, Gang Fang, Blayne Field, Michael Steinbach, Vipin Kumar
Quantitative evaluation of approximate frequent pattern mining algorithms 301-309
|
|
|
Robert Hall, Charles A. Sutton, Andrew McCallum
Unsupervised deduplication using cross-field dependencies 310-317
|
|
|
Meng Hu, Jiong Yang, Wei Su
Permu-pattern: discovery of mutable permutation patterns with proximity constraint 318-326
|
|
|
Heng Huang, Chris H. Q. Ding, Dijun Luo, Tao Li
Simultaneous tensor subspace selection and clustering: the equivalence of high order SVD and k-means clustering 327-335
|
|
|
Woochang Hwang, Taehyong Kim, Murali Ramanathan, Aidong Zhang
Bridging centrality: graph mining from element level to group level 336-344
|
|
|
Saara Hyvönen, Pauli Miettinen, Evimaria Terzi
Interpretable nonnegative matrix decompositions 345-353
|
|
|
Georgiana Ifrim, Gökhan H. Bakir, Gerhard Weikum
Fast logistic regression for text categorization with variable-length n-grams 354-362
|
|
|
Tomoharu Iwata, Takeshi Yamada, Naonori Ueda
Probabilistic latent semantic visualization: topic model for visualizing documents 363-371
|
|
|
David D. Jensen, Andrew S. Fast, Brian J. Taylor, Marc E. Maier
Automatic identification of quasi-experimental designs for discovering causal knowledge 372-380
|
|
|
Shuiwang Ji, Lei Tang, Shipeng Yu, Jieping Ye
Extracting shared subspace for multi-label classification 381-389
|
|
|
Bin Jiang, Jian Pei, Xuemin Lin, David W. Cheung, Jiawei Han
Mining preferences from superior and inferior examples 390-398
|
|
|
Ruoming Jin, Muad Abu-Ata, Yang Xiang, Ning Ruan
Effective and efficient itemset pattern summarization: regression-based approaches 399-407
|
|
|
S. Sathiya Keerthi, S. Sundararajan, Kai-Wei Chang, Cho-Jui Hsieh, Chih-Jen Lin
A sequential dual method for large scale multi-class linear SVMs 408-416
|
|
|
Jerry Kiernan, Evimaria Terzi
Constructing comprehensive summaries of large event sequences 417-425
|
|
|
Yehuda Koren
Factorization meets the neighborhood: a multifaceted collaborative filtering model 426-434
|
|
|
Gueorgi Kossinets, Jon M. Kleinberg, Duncan J. Watts
The structure of information pathways in a social communication network 435-443
|
|
|
Hans-Peter Kriegel, Matthias Schubert, Arthur Zimek
Angle-based outlier detection in high-dimensional data 444-452
|
|
|
Srivatsan Laxman, Vikram Tankasali, Ryen W. White
Stream prediction using a generative model based on frequent episodes in event sequences 453-461
|
|
|
Jure Leskovec, Lars Backstrom, Ravi Kumar, Andrew Tomkins
Microscopic evolution of social networks 462-470
|
|
|
Lei Li, Wenjie Fu, Fan Guo, Todd C. Mowry, Christos Faloutsos
Cut-and-stitch: efficient parallel learning of linear dynamical systems on SMPs 471-479
|
|
|
Charles X. Ling, Jun Du
Active learning with direct query construction 480-487
|
|
|
Xiao Ling, Wenyuan Dai, Gui-Rong Xue, Qiang Yang, Yong Yu
Spectral domain-transfer learning 488-496
|
|
|
Xu Ling, Qiaozhu Mei, ChengXiang Zhai, Bruce R. Schatz
Mining multi-faceted overviews of arbitrary topics in a text collection 497-505
|
|
|
Aurelie C. Lozano, Naoki Abe
Multi-class cost-sensitive boosting with p-norm loss functions 506-514
|
|
|
Omid Madani, Jian Huang
On updates that constrain the features' connections during learning 515-523
|
|
|
Mary McGlohon, Leman Akoglu, Christos Faloutsos
Weighted graphs and disconnected components: patterns and a generator 524-532
|
|
|
Gabriela Moise, Jörg Sander
Finding non-redundant, statistically significant regions in high dimensional data: a novel approach to projected and subspace clustering 533-541
|
|
|
Ramesh Nallapati, Amr Ahmed, Eric P. Xing, William W. Cohen
Joint latent topic models for text and citations 542-550
|
|
|
Nam Nguyen, Rich Caruana
Classification with partial labels 551-559
|
|
|
Dino Pedreschi, Salvatore Ruggieri, Franco Turini
Discrimination-aware data mining 560-568
|
|
|
Ian Porteous, David Newman, Alexander T. Ihler, Arthur Asuncion, Padhraic Smyth, Max Welling
Fast collapsed Gibbs sampling for latent Dirichlet allocation 569-577
|
|
|
Hiroto Saigo, Nicole Krämer, Koji Tsuda
Partial least squares regression for graph mining 578-586
|
|
|
Issei Sato, Minoru Yoshida, Hiroshi Nakagawa
Knowledge discovery of semantic relationships between words using nonparametric Bayesian graph model 587-595
|
|
|
Mukund Seshadri, Sridhar Machiraju, Ashwin Sridharan, Jean Bolot, Christos Faloutsos, Jure Leskovec
Mobile call graphs: beyond power-law and lognormal distributions 596-604
|
|
|
Qihong Shao, Yi Chen, Shu Tao, Xifeng Yan, Nikos Anerousis
Efficient ticket routing by resolution sequence mining 605-613
|
|
|
Victor S. Sheng, Foster J. Provost, Panagiotis G. Ipeirotis
Get another label? improving data quality and data mining using multiple, noisy labelers 614-622
|
|
|
Jin Shieh, Eamonn J. Keogh
iSAX: indexing and mining terabyte sized time series 623-631
|
|
|
Ka Cheung Sia, Junghoo Cho, Yun Chi, Belle L. Tseng
Efficient computation of personal aggregate queries on blogs 632-640
|
|
|
György J. Simon, Vipin Kumar, Zhi-Li Zhang
Semi-supervised approach to rapid and reliable labeling of large data sets 641-649
|
|
|
Ajit Paul Singh, Geoffrey J. Gordon
Relational learning via collective matrix factorization 650-658
|
|
|
Xiuyao Song, Chris Jermaine, Sanjay Ranka, John Gums
A Bayesian mixture model with linear regression mixing proportions 659-667
|
|
|
Liang Sun, Shuiwang Ji, Jieping Ye
Hypergraph spectral learning for multi-label classification 668-676
|
|
|
Lei Tang, Huan Liu, Jianping Zhang, Zohreh Nazeri
Community evolution in dynamic multi-mode networks 677-685
|
|
|
Hanghang Tong, Spiros Papadimitriou, Jimeng Sun, Philip S. Yu, Christos Faloutsos
Colibri: fast mining of large static and dynamic graphs 686-694
|
|
|
Pedro O. S. Vaz de Melo, Virgílio A. F. Almeida, Antonio Alfredo Ferreira Loureiro
Can complex network metrics predict the behavior of NBA teams? 695-703
|
|
|
Daniel David Walker, Eric K. Ringger
Model-based document clustering with a collapsed Gibbs sampler 704-712
|
|
|
Pu Wang, Carlotta Domeniconi
Building semantic kernels for text classification using Wikipedia 713-721
|
|
|
Michael L. Wick, Khashayar Rohanimanesh, Karl Schultz, Andrew McCallum
A unified approach for schema matching, coreference and canonicalization 722-730
|
|
|
Fei Wu, Raphael Hoffmann, Daniel S. Weld
Information extraction from Wikipedia: moving down the long tail 731-739
|
|
|
Junjie Wu, Hui Xiong, Jian Chen
SAIL: summation-based incremental learning for information-theoretic clustering 740-748
|
|
|
Shan-Hung Wu, Keng-Pei Lin, Chung-Min Chen, Ming-Syan Chen
Asymmetric support vector machines: low false-positive learning under the user tolerance 749-757
|
|
|
Yang Xiang, Ruoming Jin, David Fuhry, Feodor F. Dragan
Succinct summarization of transactional databases: an overlapped hyperrectangle scheme 758-766
|
|
|
Yabo Xu, Ke Wang, Ada Wai-Chee Fu, Philip S. Yu
Anonymizing transaction databases for publication 767-775
|
|
|
Jian Yang, Ning Zhong, Yiyu Yao, Jue Wang
Local peculiarity factor and its application in outlier detection 776-784
|
|
|
Luh Yen, Marco Saerens, Amin Mantrach, Masashi Shimbo
A family of dissimilarity measures between nodes generalizing both the shortest-path and the commute-time distances 785-793
|
|
|
Chun-Nam John Yu, Thorsten Joachims
Training structural SVMs with kernels using sampled cuts 794-802
|
|
|
Lei Yu, Chris H. Q. Ding, Steven Loscalzo
Stable feature selection via dense feature groups 803-811
|
|
|
Peng Zhang, Xingquan Zhu, Yong Shi
Categorizing and mining concept drifting data streams 812-820
|
|
|
Xiang Zhang, Fei Zou, Wei Wang
FastANOVA: an efficient algorithm for genome-wide association study 821-829
|
|
|
Bin Zhao, Fei Wang, Changshui Zhang
Cuts3VM: a fast semi-supervised SVM algorithm 830-838
|
|
|
Zheng Zhao, Jiangxin Wang, Huan Liu, Jieping Ye, Yung Chang
Identifying biologically relevant genes via multiple heterogeneous data sources 839-847
|
|
|
Wenjun Zhou, Hui Xiong
Volatile correlation computation: a checkpoint view 848-856
|
|
|
Shyam Boriah, Vipin Kumar, Michael Steinbach, Christopher Potter, Steven A. Klooster
Land cover change detection: a case study 857-865
|
|
|
Mohamed Bouguessa, Benoît Dumoulin, Shengrui Wang
Identifying authoritative actors in question-answering forums: the case of Yahoo! Answers 866-874
|
|
|
Huanhuan Cao, Daxin Jiang, Jian Pei, Qi He, Zhen Liao, Enhong Chen, Hang Li
Context-aware query suggestion by mining click-through and session data 875-883
|
|
|
Christine H. Chih, Douglass S. Parker
The persuasive phase of visualization 884-892
|
|
|
Richard Chow, Philippe Golle, Jessica Staddon
Detecting privacy leaks using corpus-based association rules 893-901
|
|
|
Ying Cui, Jennifer G. Dy, Gregory C. Sharp, Brian M. Alexander, Steve B. Jiang
Learning methods for lung tumor markerless gating in image-guided radiotherapy 902-910
|
|
|
Shantanu Godbole, Shourya Roy
Text classification, business intelligence, and interactivity: automating C-Sat analysis for services industry 911-919
|
|
|
Robert L. Grossman, Yunhong Gu
Data mining using high performance data clouds: experimental studies using Sector and Sphere 920-927
|
|
|
Shen-Shyang Ho, Ashit Talukder
Automated cyclone discovery and tracking using knowledge sharing in multiple heterogeneous satellite data 928-936
|
|
|
Noam Koenigstein, Yuval Shavitt, Tomer Tankel
Spotting out emerging artists using geo-aware analysis of P2P query strings 937-945
|
|
|
Prem Melville, Saharon Rosset, Richard D. Lawrence
Customer targeting models using actively-selected web content 946-953
|
|
|
Fabian Mörchen, Mathäus Dejori, Dmitriy Fradkin, Julien Etienne, Bernd Wachmann, Markus Bundschus
Anticipating annotations and emerging trends in biomedical literature 954-962
|
|
|
G. Niklas Norén, Andrew Bate, Johan Hopstadius, Kristina Star, I. Ralph Edwards
Temporal pattern discovery for trends and transient effects: its application to patient records 963-971
|
|
|
Nish Parikh, Neel Sundaresan
Scalable and near real-time burst detection from eCommerce queries 972-980
|
|
|
Renuka Sindhgatta
Identifying domain expertise of developers from source code 981-989
|
|
|
Jie Tang, Jing Zhang, Limin Yao, Juanzi Li, Li Zhang, Zhong Su
ArnetMiner: extraction and mining of academic social networks 990-998
|
|
|
Leonardo Weiss Ferreira Chaves, Erik Buchmann, Klemens Böhm
TagMark: reliable estimations of RFID tags for business processes 999-1007
|
|
|
Gang Wu, Brendan Kitts
Experimental comparison of scalable online ad serving 1008-1015
|
|
|
Xintian Yang, Sitaram Asur, Srinivasan Parthasarathy, Sameep Mehta
A visual-analytic toolkit for dynamic interaction graphs 1016-1024
|
|
|
Jieping Ye, Kewei Chen, Teresa Wu, Jing Li, Zheng Zhao, Rinkal Patel, Min Bae, Ravi Janardan, Huan Liu, Gene Alexander, Eric Reiman
Heterogeneous data fusion for Alzheimer's disease study 1025-1033
|
|
|
Shipeng Yu, Glenn Fung, Rómer Rosales, Sriram Krishnan, R. Bharat Rao, Cary Dehing-Oberije, Philippe Lambin
Privacy-preserving Cox regression for survival analysis 1034-1042
|
|
|
Sai Zeng, Prem Melville, Christian A. Lang, Ioana M. Boier-Martin, Conrad Murphy
Using predictive analysis to improve invoice-to-cash collection 1043-1050
|
|
|
Yi Zhang, Arun C. Surendran, John C. Platt, Mukund Narasimhan
Learning from multi-topic web documents for contextual advertisement 1051-1059
|
|
|
Ravi Kumar, Alexander Tuzhilin, Christos Faloutsos, David Jensen, Gueorgi Kossinets, Jure Leskovec, Andrew Tomkins
Social networks: looking ahead 1060
|
|
|
Hendrik Blockeel, Toon Calders, Élisa Fromont, Bart Goethals, Adriana Prado, Céline Robardet
An inductive database prototype based on virtual mining views 1061-1064
|
|
|
Peter Christen
Febrl: an open source data cleaning, deduplication and record linkage system with a graphical user interface 1065-1068
|
|
|
Luigi Di Caro, K. Selçuk Candan, Maria Luisa Sapino
Using TagFlake for condensing navigable tag hierarchies from tag clouds 1069-1072
|
|
|
Shantanu Godbole, Shourya Roy
An integrated system for automatic customer satisfaction analysis in the services industry 1073-1076
|
|
|
Ming Hua, Jian Pei
DiMaC: a disguised missing data cleaning tool 1077-1080
|
|
|
Evangelos E. Kotsifakos, Irene Ntoutsi, Yannis Vrahoritis, Yannis Theodoridis
Pattern-Miner: integrated management and mining over data mining models 1081-1084
|
|
|
Hongyan Liu, Hui Yang, Wenbo Li, Wei Wei, Jun He, Xiaoyong Du
CRO: a system for online review structurization 1085-1088
|
|
|
Emmanuel Müller, Ira Assent, Ralph Krieger, Timm Jansen, Thomas Seidl
Morpheus: interactive exploration of subspace clustering 1089-1092
|
|
|
Hill Nguyen, Nish Parikh, Neel Sundaresan
A software system for buzz-based recommendations 1093-1096
|
|
|
Shuyi Zheng, Matthew R. Scott, Ruihua Song, Ji-Rong Wen
Pictor: an interactive system for importing data from a website 1097-1100
|
Copyright ©2010 Association for Computing Machinery
|