
Relevance information: a loss of entropy but a gain for IDF?


Arjen P. de Vries and Thomas Roelleke




Abstract

While investigating alternative estimates of term discriminativeness, we discovered that relevance information and idf are much more closely related than the classical literature suggests. We therefore revisited the justification of idf as it follows from the binary independence retrieval (BIR) model. The main result is a formal framework that uncovers the close relationship between a generalised idf and the BIR model. The framework makes explicit how to incorporate relevance information into any retrieval function that involves an idf component. In addition to the idf-based formulation of the BIR model, we propose Poisson-based estimates as an alternative to the classical estimates, motivated by the superiority of Poisson-based estimates for within-document term frequencies. The main experimental finding is that a Poisson-based idf is superior to the classical idf, and that the superiority is particularly evident for long queries.
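The link between idf and the BIR model mentioned in the abstract can be illustrated with the textbook form of the BIR term weight, log[p(1-q) / (q(1-p))], where p = P(t | relevant) and q = P(t | non-relevant). The sketch below is not the paper's framework: the function names, the counts, and the choice p = 0.5 (the usual stand-in when no relevance information is available) are assumptions made for illustration, and the Poisson-based estimates are not covered because the abstract does not define them.

```python
import math


def idf(n_t: int, n_docs: int) -> float:
    """Classical idf: -log of the fraction of documents containing the term."""
    return -math.log(n_t / n_docs)


def bir_weight(p: float, q: float) -> float:
    """Textbook BIR term weight, with p = P(t | relevant), q = P(t | non-relevant)."""
    return math.log(p * (1 - q) / (q * (1 - p)))


def idf_difference(p: float, q: float) -> float:
    """Presence part of the BIR weight as a difference of idf-like values:
    (-log q) - (-log p), i.e. idf over non-relevant minus idf over relevant."""
    return -math.log(q) - (-math.log(p))


# Hypothetical collection statistics (assumed values, not taken from the paper).
n_docs = 100_000   # documents in the collection
n_t = 50           # documents containing the term

# Without relevance information: approximate q by the collection fraction
# and set p = 0.5, so the weight becomes log((N - n) / n).
q = n_t / n_docs
w_no_relevance = bir_weight(0.5, q)

# With (assumed) relevance information: the term occurs in 80% of relevant documents.
p = 0.8
w_with_relevance = bir_weight(p, q)
absence_part = math.log((1 - q) / (1 - p))

print(f"classical idf:             {idf(n_t, n_docs):.4f}")
print(f"BIR weight, p = 0.5:       {w_no_relevance:.4f}")
print(f"BIR weight, p = 0.8:       {w_with_relevance:.4f}")
print(f"idf difference + absence:  {idf_difference(p, q) + absence_part:.4f}")
```

For a rare term, the weight without relevance information is nearly identical to the classical idf, and with relevance information the weight decomposes into a difference of two idf-like quantities plus a term for absence.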

BIBTEX


@inproceedings{1076084,
  author = {Arjen P. de Vries and Thomas Roelleke},
  title = {Relevance information: a loss of entropy but a gain for IDF?},
  booktitle = {SIGIR '05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval},
  year = {2005},
  isbn = {1-59593-034-5},
  pages = {282--289},
  location = {Salvador, Brazil},
  doi = {http://doi.acm.org/10.1145/1076034.1076084},
  publisher = {ACM Press},
  address = {New York, NY, USA},
  
}



©2006 Association for Computing Machinery