SBBD 2005

Removing Ambiguities in the Authorship Identification of Bibliographic Objects (Remoção de Ambiguidades na Identificação de Autoria de Objetos Bibliográficos)


Jean W. A. Oliveira, Alberto H. F. Laender, and Marcos André Gonçalves


Session: Bibliotecas Digitais e Similaridade / Digital Libraries and Similarity


Abstract

Digital libraries bring together digital content and metadata, often obtained from several disparate sources. Because these sources are not standardized, problems such as ambiguous metadata fields arise. In this paper, we present a strategy for name authority disambiguation in digital libraries. Our strategy combines pattern-matching functions and information retrieval techniques with a clustering algorithm, which allows the creation of unified indexes that record the variants of an author name appearing in the collection. We demonstrate the effectiveness of our strategy through exhaustive experimentation on two test collections with distinct characteristics, derived from two digital libraries: BDBComp (Biblioteca Digital Brasileira de Computação) and DBLP (Digital Bibliography & Library Project). For the collection derived from BDBComp, the average of the cluster quality measure and the cluster fragmentation measure was higher than 95%, while for the collection derived from DBLP that average was higher than 66%.
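
To make the idea of grouping name variants concrete, here is a minimal sketch in Python. It is not the method of the paper, whose pattern-matching functions, retrieval techniques, and clustering algorithm are not specified in the abstract. It assumes a hypothetical compatibility rule (same last name, same number of name tokens, given names matching in full or as initials) and a simple greedy single-pass clustering. All function names are illustrative.

```python
import unicodedata


def normalize(name):
    """Lowercase, strip accents and periods, and split into tokens."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = "".join(c for c in decomposed if not unicodedata.combining(c))
    return ascii_only.lower().replace(".", " ").split()


def tokens_match(a, b):
    """Two tokens match if equal, or if one is a one-letter initial of the other."""
    if a == b:
        return True
    short, long_ = sorted((a, b), key=len)
    return len(short) == 1 and long_.startswith(short)


def compatible(name_a, name_b):
    """Hypothetical rule (an assumption, not taken from the paper): same last
    name, same token count, and each given-name token matches in full or as
    an initial."""
    a, b = normalize(name_a), normalize(name_b)
    if not a or not b or a[-1] != b[-1] or len(a) != len(b):
        return False
    return all(tokens_match(x, y) for x, y in zip(a[:-1], b[:-1]))


def cluster_names(names):
    """Greedy single pass: a name joins the first cluster whose members are
    all compatible with it; otherwise it starts a new cluster."""
    clusters = []
    for name in names:
        for cluster in clusters:
            if all(compatible(name, member) for member in cluster):
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


names = [
    "Alberto H. F. Laender",
    "A. H. F. Laender",
    "Marcos André Gonçalves",
    "Marcos Andre Goncalves",
    "M. A. Gonçalves",
    "Jean W. A. Oliveira",
]
for cluster in cluster_names(names):
    print(cluster)
```

Each resulting cluster plays the role of one entry in a unified index listing the variants of a single author name. The rule is deliberately strict and uses the name strings alone: it would wrongly merge distinct authors who share initials and a last name, which is where additional evidence, such as the information retrieval techniques the abstract mentions, would come in.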


©2006 Association for Computing Machinery