Study on cross lingual information retrieval in Indian languages
In this article we evaluate and report the effectiveness of various available resources, including machine translation systems (MTS) and machine-readable dictionaries, in the retrieval of Indian language documents. Machine translation systems lend themselves to the task of crossing the language barrier in Cross Lingual Information Retrieval (CLIR) because of their easy availability. In our work we selected three online translation systems, Google, Bing, and the Technology Development for Indian Languages (TDIL) translation service, and assessed their performance in the task of CLIR. We also performed a query-wise analysis to determine the issues faced when relying on machine translation systems in multilingual retrieval. Our experiments show that translation quality not only depends on the source language but also varies considerably from query to query within a single source-target language pair. We also explored the different translation difficulties faced in using MTS. We then evaluated machine-readable dictionaries using a naive dictionary-based approach, which is seen as the simplest implementation of CLIR. Finally, we explored the possibility of enhancing the naive dictionary results using the word embedding method word2vec, followed by an error analysis.
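To make the naive dictionary-based approach concrete, the following is a minimal sketch of term-by-term query translation. It is not the dissertation's implementation: the function name `translate_query`, the toy dictionary, and the choice to keep out-of-vocabulary terms unchanged are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical toy bilingual dictionary (transliterated Hindi -> English).
# A real machine-readable dictionary would be far larger; these entries
# are illustrative assumptions only.
TOY_DICTIONARY: Dict[str, List[str]] = {
    "bharat": ["india"],
    "chunav": ["election", "poll"],
    "nadi": ["river"],
}


def translate_query(query: str, dictionary: Dict[str, List[str]]) -> List[str]:
    """Translate a query term by term.

    Each source term is replaced by all of its dictionary translations.
    Terms missing from the dictionary are kept unchanged (an assumed
    fallback, useful for numbers and named entities).
    """
    translated: List[str] = []
    for term in query.lower().split():
        translated.extend(dictionary.get(term, [term]))
    return translated


if __name__ == "__main__":
    print(translate_query("Bharat chunav 2014", TOY_DICTIONARY))
```

Including every translation of an ambiguous term keeps the sketch simple but can add noise to the target-language query, which is one reason such a baseline might benefit from a later refinement step.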
- M Tech Dissertations