Shallow parsing of Gujarati text

dc.accession.number: T00318
dc.classification.ddc: 006.35 DAV
dc.contributor.advisor: Pandya, Abhinay
dc.contributor.author: Dave, Vidhi
dc.date.accessioned: 2017-06-10T14:39:14Z
dc.date.accessioned: 2025-06-28T10:20:29Z
dc.date.available: 2017-06-10T14:39:14Z
dc.date.issued: 2011
dc.degree: M. Tech
dc.description.abstract: Shallow parsing is the process of assigning a tag to each minimal, non-recursive phrase of a sentence. It is useful for applications such as question answering and information retrieval, where full parsing is not needed. Gujarati is one of the main languages of India and the 26th most spoken native language in the world, with more than 50 million speakers worldwide. Natural language processing of Gujarati is in its infancy. A large amount of Gujarati data is now available on websites, but because of a lack of resources it is hard for users to retrieve it efficiently. Shallow parsing of Gujarati can therefore ease other tasks such as machine translation, information extraction, and retrieval. In this thesis, we work on the automatic annotation of Gujarati text with shallow-parse tags. 400 sentences were manually tagged. Two machine learning techniques, the Hidden Markov Model and the Conditional Random Field (CRF), were used. We achieved good accuracy, similar to that of a Hindi chunker, even though far fewer resources are available for Gujarati. The best performance was achieved using a CRF with contextual information and part-of-speech tags.
dc.identifier.citation: Dave, Vidhi (2011). Shallow parsing of Gujarati text. Dhirubhai Ambani Institute of Information and Communication Technology, viii, 34 p. (Acc.No: T00318)
dc.identifier.uri: http://drsr.daiict.ac.in/handle/123456789/355
dc.publisher: Dhirubhai Ambani Institute of Information and Communication Technology
dc.student.id: 200911008
dc.subject: Natural language processing
dc.subject: Linguistic analysis
dc.subject: Linguistics
dc.subject: Hidden Markov Model
dc.subject: Markov processes
dc.subject: Computational linguistics
dc.subject: Conditional Random Field
dc.subject: Morphology
dc.subject: Data processing
dc.subject: Grammar comparative and general
dc.subject: Syntax
dc.subject: Data processing
dc.title: Shallow parsing of Gujarati text
dc.type: Dissertation

Files

Original bundle

Name: 200911008.pdf
Size: 382.21 KB
Format: Adobe Portable Document Format