Test Collection Based Evaluation Of Information Retrieval Systems

BOOK EXCERPT:

Use of test collections and evaluation measures to assess the effectiveness of information retrieval systems has its origins in work dating back to the early 1950s. In the nearly 60 years since that work began, the use of test collections has become the de facto standard of evaluation. This monograph surveys the research conducted and explains the methods and measures devised for evaluating retrieval systems, including a detailed look at the use of statistical significance testing in retrieval experimentation. It also reviews more recent examinations of the validity of the test collection approach and of evaluation measures, and outlines trends in current research exploiting query logs and live labs. At its core, the modern-day test collection differs little from the structures that the pioneering researchers of the 1950s and 1960s conceived. This tutorial and review shows that, despite its age, this long-standing evaluation method remains a highly valued tool for retrieval research.

Product Details :

Genre : Computers
Author : Mark Sanderson
Publisher : Now Publishers Inc
Release : 2010-06-03
File : 143 Pages
ISBN-13 : 9781601983602


Information Retrieval Evaluation

BOOK EXCERPT:

Evaluation has always played a major role in information retrieval, with early pioneers such as Cyril Cleverdon and Gerard Salton laying the foundations for most of the evaluation methodologies in use today. The retrieval community has been extremely fortunate to have such a well-grounded evaluation paradigm during a period when most of the human language technologies were just developing. This lecture explains where these evaluation methodologies came from and how they have continued to adapt to the vastly changed environment of the search engine world today. The lecture begins with a discussion of the early evaluation of information retrieval systems, starting with the Cranfield tests in the early 1960s, continuing with the Lancaster "user" study for MEDLARS, and presenting the various test collection investigations by the SMART project and by groups in Britain. The emphasis in this chapter is on the how and the why of the various methodologies developed. The second chapter covers the more recent "batch" evaluations, examining the methodologies used in the various open evaluation campaigns such as TREC, NTCIR (emphasis on Asian languages), CLEF (emphasis on European languages), and INEX (emphasis on semi-structured data). Here again the focus is on the how and why, and in particular on the evolution of the older evaluation methodologies to handle new information access techniques, including how the test collection techniques were modified and how the metrics were changed to better reflect operational environments. The final chapters look at evaluation issues in user studies, the interactive part of information retrieval, including a look at the search log studies done mainly by the commercial search engines. Here the goal is to show, via case studies, how high-level issues of experimental design affect the final evaluations.
Table of Contents: Introduction and Early History / "Batch" Evaluation Since 1992 / Interactive Evaluation / Conclusion

Product Details :

Genre : Computers
Author : Donna Harman
Publisher : Springer Nature
Release : 2022-05-31
File : 107 Pages
ISBN-13 : 9783031022760


Information Retrieval Systems

BOOK EXCERPT:

Information science textbook on information retrieval methodology, focusing on intellectual rather than equipment-oriented aspects of information systems. It proposes criteria for the evaluation of information service efficiency (including cost-benefit analysis), contrasts thesaurus-based terminology control with natural language ("free text") retrieval, considers trends in database computerization and in users' information needs, and includes the results of a questionnaire appraisal of AGRIS. Bibliography pp. 359 to 373; diagrams, flow charts and graphs.

Product Details :

Genre : Computers
Author : Frederick Wilfrid Lancaster
Publisher : New York ; Toronto : Wiley
Release : 1979
File : 408 Pages
ISBN-13 : UOM:39015004555143


Information Retrieval Evaluation

BOOK EXCERPT:

Evaluation has always played a major role in information retrieval, with early pioneers such as Cyril Cleverdon and Gerard Salton laying the foundations for most of the evaluation methodologies in use today. The retrieval community has been extremely fortunate to have such a well-grounded evaluation paradigm during a period when most of the human language technologies were just developing. This lecture explains where these evaluation methodologies came from and how they have continued to adapt to the vastly changed environment of the search engine world today. The lecture begins with a discussion of the early evaluation of information retrieval systems, starting with the Cranfield tests in the early 1960s, continuing with the Lancaster "user" study for MEDLARS, and presenting the various test collection investigations by the SMART project and by groups in Britain. The emphasis in this chapter is on the how and the why of the various methodologies developed. The second chapter covers the more recent "batch" evaluations, examining the methodologies used in the various open evaluation campaigns such as TREC, NTCIR (emphasis on Asian languages), CLEF (emphasis on European languages), and INEX (emphasis on semi-structured data). Here again the focus is on the how and why, and in particular on the evolution of the older evaluation methodologies to handle new information access techniques, including how the test collection techniques were modified and how the metrics were changed to better reflect operational environments. The final chapters look at evaluation issues in user studies, the interactive part of information retrieval, including a look at the search log studies done mainly by the commercial search engines. Here the goal is to show, via case studies, how high-level issues of experimental design affect the final evaluations.
Table of Contents: Introduction and Early History / "Batch" Evaluation Since 1992 / Interactive Evaluation / Conclusion

Product Details :

Genre : Computers
Author : Donna K. Harman
Publisher : Morgan & Claypool Publishers
Release : 2011
File : 122 Pages
ISBN-13 : 9781598299717


Simulating Information Retrieval Test Collections

BOOK EXCERPT:

Simulated test collections may find application in situations where real datasets cannot easily be accessed due to confidentiality concerns or practical inconvenience. They can potentially support Information Retrieval (IR) experimentation, tuning, validation, performance prediction, and hardware sizing. Naturally, the accuracy and usefulness of results obtained from a simulation depend upon the fidelity and generality of the models which underpin it. The fidelity of emulation of a real corpus is likely to be limited by the requirement that confidential information in the real corpus not be extractable from the emulated version. We present a range of methods exploring trade-offs between emulation fidelity and degree of preservation of privacy. We present three simple types of text generator that work at the micro level: Markov models, neural net models, and substitution ciphers. We also describe macro-level methods by which we can engineer macro properties of a corpus, giving a range of models for each of the salient properties: document length distribution, word frequency distribution (for independent and non-independent cases), word length and textual representation, and corpus growth. We present results of emulating existing corpora and of scaling up corpora by two orders of magnitude. We show that simulated collections generated with relatively simple methods are suitable for some purposes and can be generated very quickly. Indeed, it may sometimes be feasible to embed a simple lightweight corpus generator into an indexer for the purpose of efficiency studies. Naturally, a corpus of artificial text cannot support IR experimentation in the absence of a set of compatible queries. We discuss and experiment with published methods for query generation and query log emulation. We present a proof-of-the-pudding study in which we observe the predictive accuracy of efficiency and effectiveness results obtained on emulated versions of TREC corpora.
The study includes three open-source retrieval systems and several TREC datasets. There is a trade-off between confidentiality and prediction accuracy, and there are interesting interactions between retrieval systems and datasets. Our tentative conclusion is that there are emulation methods which achieve useful prediction accuracy while providing a level of confidentiality adequate for many applications. Many of the methods described here have been implemented in the open source project SynthaCorpus, accessible at: https://bitbucket.org/davidhawking/synthacorpus/

Product Details :

Genre : Computers
Author : David Hawking
Publisher : Springer Nature
Release : 2022-06-01
File : 162 Pages
ISBN-13 : 9783031023231


Methodology For Test And Evaluation Of Information Retrieval Systems

BOOK EXCERPT:

Information retrieval systems are discussed in terms of their purpose and function. The essential components of an information retrieval system are defined. A methodology for evaluating the comparative performance of systems is developed. Specific measures and methods of analysis of results are presented.

Product Details :

Genre : Information storage and retrieval systems
Author : William Goffman
Publisher :
Release : 1964
File : 19 Pages
ISBN-13 : OCLC:227368435


Information Retrieval Evaluation In A Changing World

BOOK EXCERPT:

This volume celebrates the twentieth anniversary of CLEF, known as the Cross-Language Evaluation Forum for its first ten years and as the Conference and Labs of the Evaluation Forum since, and traces its evolution over these first two decades. CLEF's main mission is to promote research, innovation and development of information retrieval (IR) systems by anticipating trends in information management, in order to stimulate advances in the field of IR system experimentation and evaluation. The book is divided into six parts. Parts I and II provide background and context, with the first part explaining what is meant by experimental evaluation and the underlying theory, and describing how this has been interpreted in CLEF and in other internationally recognized evaluation initiatives. Part II presents research architectures and infrastructures that have been developed to manage experimental data and to provide evaluation services in CLEF and elsewhere. Parts III, IV and V represent the core of the book, presenting some of the most significant evaluation activities in CLEF, ranging from the early multilingual text processing exercises to the later, more sophisticated experiments on multimodal collections in diverse genres and media. In all cases, the focus is not only on describing "what has been achieved", but above all on "what has been learnt". The final part examines the impact CLEF has had on the research world and discusses current and future challenges, both academic and industrial, including the relevance of IR benchmarking in industrial settings. Mainly intended for researchers in academia and industry, it also offers useful insights and tips for practitioners in industry working on the evaluation and performance issues of IR tools, and for graduate students specializing in information retrieval.

Product Details :

Genre : Computers
Author : Nicola Ferro
Publisher : Springer
Release : 2019-08-13
File : 597 Pages
ISBN-13 : 9783030229481


Evaluation Of Cross Language Information Retrieval Systems

BOOK EXCERPT:

The second evaluation campaign of the Cross Language Evaluation Forum (CLEF) for European languages was held from January to September 2001. This campaign proved a great success, and showed an increase in participation of around 70% compared with CLEF 2000. It culminated in a two-day workshop in Darmstadt, Germany, 3–4 September, in conjunction with the 5th European Conference on Digital Libraries (ECDL 2001). On the first day of the workshop, the results of the CLEF 2001 evaluation campaign were reported and discussed in paper and poster sessions. The second day focused on the current needs of cross-language systems and how evaluation campaigns in the future can best be designed to stimulate progress. The workshop was attended by nearly 50 researchers and system developers from both academia and industry. It provided an important opportunity for researchers working in the same area to get together and exchange ideas and experiences. Copies of all the presentations are available on the CLEF web site at http://www.clef-campaign.org. This volume contains thoroughly revised and expanded versions of the papers presented at the workshop and provides an exhaustive record of the CLEF 2001 campaign. CLEF 2001 was conducted as an activity of the DELOS Network of Excellence for Digital Libraries, funded by the EC Information Society Technologies program to further research in digital library technologies. The activity was organized in collaboration with the US National Institute of Standards and Technology (NIST).

Product Details :

Genre : Computers
Author : Martin Braschler
Publisher : Springer
Release : 2003-08-02
File : 606 Pages
ISBN-13 : 9783540456919


Interactive Information Seeking Behaviour And Retrieval

BOOK EXCERPT:

Information retrieval (IR) is a complex human activity supported by sophisticated systems. Information science has contributed much to the design and evaluation of previous generations of IR systems and to our general understanding of how such systems should be designed, and yet, due to the increasing success and diversity of IR systems, many recent textbooks concentrate on the IR systems themselves and ignore the human side of searching for information. This book is the first text to provide an information science perspective on IR. Unique in its scope, the book covers the whole spectrum of information retrieval, including: history and background; information behaviour and seeking; task-based information searching and retrieval; approaches to investigating information interaction and behaviour; information representation; access models; evaluation; interfaces for IR; interactive techniques; web retrieval, ranking and personalization; recommendation, collaboration and social search; and multimedia interfaces and access. Readership: senior undergraduates and master's-level students of all information and library studies courses, and practising LIS professionals who need to better appreciate how IR systems are designed, implemented and evaluated.

Product Details :

Genre : Computers
Author : Ian Ruthven
Publisher : Facet Publishing
Release : 2011
File : 337 Pages
ISBN-13 : 9781856047074


Bridging Between Information Retrieval And Databases

BOOK EXCERPT:

The research domains of information retrieval and databases have traditionally adopted different approaches to information management. In recent years, however, there has been increasing cross-fertilization between the two fields, and many research challenges now cut across both. With this in mind, a winter school was organized in Bressanone, Italy, in February 2013, within the context of the EU-funded research project PROMISE (Participative Research Laboratory for Multimedia and Multilingual Information Systems Evaluation). PROMISE aimed at advancing the experimental evaluation of complex multimedia and multilingual information systems in order to support the individuals, commercial entities and communities who design, develop, employ and improve such complex systems. The overall goal of PROMISE was to deliver a unified environment collecting data, knowledge, tools and methodologies, and to support the user community involved in experimental evaluation. This book constitutes the outcome of the PROMISE Winter School 2013 and contains 9 invited lectures from the research domains of information retrieval and databases, plus short papers from the best student poster awards. A large variety of topics are covered, including databases, information retrieval, experimental evaluation, metrics and statistics, semantic search, keyword search in databases, semi-structured search, evaluation in both information retrieval and databases, crowdsourcing and social media.

Product Details :

Genre : Computers
Author : Nicola Ferro
Publisher : Springer
Release : 2014-05-06
File : 247 Pages
ISBN-13 : 9783642547980