No Cover Image

E-Thesis 43561 views 75 downloads

Supporting user selection of digital libraries. / Helen Margaret Dodd

Swansea University Author: Helen Margaret Dodd

Abstract

Subject specialists and researchers often face the problem of identifying authoritative collections: those directly about their topic of interest, to which they regularly return to satisfy related information needs or monitor for new material. Discovery of such collections is often incidental or rel...

Full description

Published: 2013
Institution: Swansea University
Degree level: Doctoral
Degree name: Ph.D
URI: https://cronfa.swan.ac.uk/Record/cronfa42656
Tags: Add Tag
No Tags, Be the first to tag this record!
Abstract: Subject specialists and researchers often face the problem of identifying authoritative collections: those directly about their topic of interest, to which they regularly return to satisfy related information needs or monitor for new material. Discovery of such collections is often incidental or relies on suggestions from domain experts. Services such as general purpose search engines and repository directories offer limited support for this search task. As such, there is a clear need for a search service specifically to assist users in finding collections that can serve both their current and future information needs; we refer to this task herein as collection suggestion. However, developing an effective search service of this kind requires fundamental research. There are several preconditions that should be addressed; it is these that form the focus of this thesis. We summarise these areas as follows. An effective search service calls for an appropriate algorithm; in this instance, an algorithm for ranking collections with respect to the user's query. To this end, we investigate the applicability of existing algorithms, from relevant domains (collection selection and query performance prediction), to collection suggestion. In addition, towards identifying an optimal algorithm for a collection suggestion search service, we specify and test a new algorithm (and several alternative variants), designed specifically for this task. The requirement of an appropriate algorithm presents the question of how we evaluate the effectiveness of an algorithm. We have formulated a methodology (comprising evaluation strategies and performance measures) and developed apparatus for evaluating algorithms, with respect to collection suggestion. As far as possible, we have drawn on and extended established algorithm evaluation techniques, to ensure our work follows the expectations of information retrieval research. Our empirical work is conducted over several synthetic and realistic test data sets: we use established data sets built from the TREC document corpus, in addition to data sets of our own compilation, comprising data from real repositories. This combination of test data types ensures a rigorous test environment for algorithms. Over our test environment, we have found three algorithms to be potentially suitable for application in a collection suggestion search service. One collection selection algorithm (CORI), and two variants of our own algorithm were shown to have strong and consistent performance, across the range of test data sets and performance measures used.
Keywords: Computer science.
College: Faculty of Science and Engineering