    Document representation using extended locality preserving indexing

    View/Open
    201611020 (426.8Kb)
    Date
    2018
    Author
    Khalpada, Vaidehi S.
    Abstract
    The main purpose of web search is to retrieve, from the documents available on the Internet, the information relevant to a user's need. Each term (word) in a document contributes a dimension, so document collections are high dimensional and challenging to process. Not all terms carry important meaning: some are related to each other and some are synonyms. This redundancy inflates the dimensionality of the document space, and processing such a collection to obtain useful information requires considerable storage and computation time. Dimensionality reduction addresses this by reducing the data dimension so that computation is faster and less storage is required. Documents are represented as vectors in a high-dimensional space. Our main aim is to represent the documents in a reduced subspace such that the relations among documents in the subspace remain the same as in the original vector space. We therefore evaluate the accuracy of the document similarity measure obtained in the subspace. Representing documents as a term-document matrix is an important step in document indexing, the process of building an index that helps retrieve relevant documents effectively, analogous to the index of a book. Latent Semantic Indexing (LSI) is a global structure preserving approach, while Locality Preserving Indexing (LPI) is a local structure preserving approach. LPI assigns weights to neighbours to obtain a reduced representation that preserves local structure; however, it does not retain any information about non-neighbours. A new approach, Extended Locality Preserving Indexing (ELPI), is proposed, which preserves the topology of the document space by modifying the weighting scheme. Experiments on document similarity evaluation and classification show a small but encouraging improvement for ELPI over LPI.
    URI
    http://drsr.daiict.ac.in//handle/123456789/745
    Collections
    • M Tech Dissertations [923]
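
    The pipeline described in the abstract (term-document matrix, reduced subspace, similarity comparison, neighbour weights) can be illustrated with a minimal sketch. This is not the dissertation's implementation. The toy corpus, the function names, the use of a truncated SVD as the LSI-style reduction, and the choice of cosine similarity as the neighbour weight are all assumptions made for illustration. The ELPI weighting scheme is not specified in the abstract, so it is not shown.

```python
import numpy as np

# Hypothetical toy corpus; not data from the dissertation.
docs = [
    "web search retrieves relevant documents",
    "search engines index web documents",
    "dimensionality reduction of web documents",
    "reduction of dimension saves storage and computation",
]

def term_document_matrix(docs):
    """Return (vocabulary, X) with one row per term and one column per document."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    X = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            X[index[w], j] += 1.0  # raw term counts; each term is one dimension
    return vocab, X

def cosine_similarities(M):
    """Pairwise cosine similarity between the columns of M."""
    norms = np.linalg.norm(M, axis=0, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty columns
    U = M / norms
    return U.T @ U

def lsi_embed(X, k):
    """LSI-style reduction: document coordinates on the top-k singular directions."""
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    return np.diag(s[:k]) @ Vt[:k, :]  # shape (k, n_docs)

def knn_weights(S, p):
    """Neighbour weights (assumed scheme): keep S[i, j] only if j is one of the
    p most similar documents to i, or i is one of the p most similar to j.
    Every other pair gets weight 0, so non-neighbour information is discarded."""
    W = np.zeros_like(S)
    for i in range(S.shape[0]):
        neighbours = [j for j in np.argsort(-S[i]) if j != i][:p]
        W[i, neighbours] = S[i, neighbours]
    return np.maximum(W, W.T)  # symmetrise; valid because S is non-negative here

vocab, X = term_document_matrix(docs)
sim_full = cosine_similarities(X)

# Compare document similarities before and after reduction to 2 dimensions.
Z = lsi_embed(X, k=2)
sim_reduced = cosine_similarities(Z)
max_change = np.abs(sim_full - sim_reduced).max()

# Neighbour graph with one nearest neighbour per document.
W = knn_weights(sim_full, p=1)

print("vocabulary size:", len(vocab))
print("largest similarity change after reduction:", round(float(max_change), 3))
print("neighbour weight matrix:\n", np.round(W, 3))
```

    In this sketch, `max_change` gives a rough measure of how far the reduced subspace distorts document similarities, which is the kind of accuracy the abstract says is evaluated. The zero entries in `W` for pairs with nonzero similarity show what "does not retain any information about non-neighbours" means under the assumed weighting.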

    Resource Centre copyright © 2006-2017 
    Contact Us | Send Feedback
    Theme by 
    Atmire NV
     

     

