Biomedical Search Result Diversification
"Traditionally, it is assumed by the retrieval models that the document relevance is dependent only on the query and not on the relevance of the other documents in the rank list. However, this assumption may prove to be wrong. The utility of retrieving a document generally depends on the documents ranked previously, since instead of reading redundant information delivered by the relevant documents, users may want to see documents containing various distinct aspects of their information need in the top ranked list. The focus of this thesis is on applying Information Retrieval techniques on bio-medical literature to return a rank-list of passages which answers the query in such a way that the retrieved passages are relevant to the query and cover a wide range of aspect in the top ranked passages. This is to provide maximum information without facing redundancy in the top ranked passages. Here databases of Wikipedia and UMLS Meta-thesaurus are used for extracting aspects from the passages. the passages are then re-ranked on the basis of these aspects to promote diversity. The data for this research is provided by the Text REtrieval Conference as part of the Genomic track in 2007. Here, the fusion of different models seems to outperform individual models. The re-ranking on the basis of UMLS (Unified Medical Language System) Meta-thesaurus aspects prove to promote the diversity by improving the aspect level score. However, re-ranking on the basis of Wikipedia aspect does not seem to improve the rank-list Diversity."
- M Tech Dissertations