
dc.contributor.advisor: Majumder, Prasenjit
dc.contributor.author: Pappan, Shiju
dc.date.accessioned: 2017-06-10T14:45:09Z
dc.date.available: 2017-06-10T14:45:09Z
dc.date.issued: 2016
dc.identifier.citation: Pappan, Shiju (2016). Abstractive text summarization using universal dependency labels, LSA and graph based approach. Dhirubhai Ambani Institute of Information and Communication Technology, vi, 19p. (Acc.No: T00599)
dc.identifier.uri: http://drsr.daiict.ac.in/handle/123456789/636
dc.description.abstract: With advances in technology and the growth of information available on the Internet, it becomes a tiring task to go through each and every document on the net to get the gist of each. This would be much easier if a summary of each document, highlighting its key concepts and very close to a human-generated summary, were readily available. In this paper we propose a methodology for summarizing documents using Universal Dependency Labels, Latent Semantic Analysis and a word-graph-based approach. As a first step, we start with syntactic analysis and pre-processing of the document, which involves tokenization, NER (Named Entity Recognition), pronoun resolution, etc. After this, we modify each sentence by extracting the logical form triplets, viz. subject, predicate and object entities, using the Universal Dependency Labels. The modified sentences are then checked for similarity or relatedness using a similarity measure. Sentences with a similarity score above a pre-defined value are used to create the word graph. It is ensured that each sentence is used only once in constructing the graph. The graph is then used for sentence compression by finding the K shortest paths from the start node to the end node. A new edge weight formula is used to find the lightest of the K paths, which is taken to be the most informative. The new sentences are then combined with the sentences whose similarity score is below the pre-defined value to obtain the new, modified document. The key sentences related to the key topics in the document are identified using Latent Semantic Analysis to give the required abstractive summary of the document.
dc.publisher: Dhirubhai Ambani Institute of Information and Communication Technology
dc.subject: Abstractive Text Summarization
dc.subject: Universal Dependency Labels
dc.subject: Latent semantic analysis
dc.classification.ddc: 006.35 PAP
dc.title: Abstractive text summarization using universal dependency labels, LSA and graph based approach
dc.type: Dissertation
dc.degree: M. Tech
dc.student.id: 201411053
dc.accession.number: T00599
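
The word-graph compression step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the function names `build_word_graph` and `lightest_path` are hypothetical, the inverse-frequency edge weight is an assumed stand-in for the new edge weight formula (which the abstract does not give), and only the single cheapest path is returned rather than K candidate paths.

```python
import heapq
from collections import defaultdict

START, END = "<s>", "</s>"

def build_word_graph(sentences):
    """Merge similar sentences into one directed graph.

    Identical lowercase words share a node. The edge weight is an
    ASSUMPTION (1 / transition count), not the thesis's formula.
    """
    counts = defaultdict(int)
    for sentence in sentences:
        path = [START] + sentence.lower().split() + [END]
        for a, b in zip(path, path[1:]):
            counts[(a, b)] += 1
    graph = defaultdict(dict)
    for (a, b), n in counts.items():
        graph[a][b] = 1.0 / n  # frequent transitions are cheaper
    return graph

def lightest_path(graph):
    """Dijkstra search from START to END; returns the words on the cheapest path."""
    heap = [(0.0, [START])]
    done = set()
    while heap:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == END:
            return path[1:-1]  # drop the start and end markers
        if node in done:
            continue
        done.add(node)
        for nxt, weight in graph[node].items():
            if nxt not in done:
                heapq.heappush(heap, (cost + weight, path + [nxt]))
    return []  # END is unreachable

similar = [
    "obama visited paris on monday",
    "obama visited paris yesterday",
    "president obama visited paris",
]
print(" ".join(lightest_path(build_word_graph(similar))))
```

Word-graph compression methods usually also enforce a minimum path length and then rerank several candidate paths; this sketch leaves both steps out for brevity.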

