Distant supervision for relation extraction

Doshi, Prarthana

Date

2018

Abstract

Relation extraction (RE) is an important task in information extraction, which is used to obtain data from natural language text. Relation extraction can be done using different methods, and most techniques in the area use labelled data. The downside is that labelled data is very costly to generate, as it requires human labour to understand each sentence and its entities and to label them accordingly. A large amount of natural language data is available, and it is increasing day by day, so supervised techniques may not scale and adapt well to real-time, dynamic data.

The issue of human annotation is addressed by the recent approach of distant supervision, which attempts to label data automatically. This is realized by extracting facts from publicly available knowledge bases such as Wikidata and DBpedia; most of these knowledge bases are freely available. The assumption of distant supervision is that if a relation holds between two entities in the knowledge base, then a sentence in which those entities appear together expresses that relation. However, distant supervision has some associated problems, such as incomplete knowledge bases and the wrong-label problem.

Most techniques in the area of relation extraction have used available NLP tools for feature extraction, and these tools themselves make errors. In this work, we explore a convolutional neural network (CNN) for the task, which does not require NLP-based preprocessing.

To mitigate the wrong-label problem, we use selective attention over instances, which treats the task as a multi-instance problem, and we conclude that it gives better results. We also use a CNN with a context model, in which the input is divided into three parts based on the entity positions. This helps the model learn the sentence representation, and it performs well compared to the basic CNN model.
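
The three ideas named in the abstract (the distant supervision assumption, the three-part split around the entities, and selective attention over instances) can be illustrated with a minimal Python sketch. It is not the dissertation's implementation. The toy knowledge base `KB`, the function names, the exact split boundaries, and the use of precomputed per-sentence scores are all assumptions made for illustration; in a real model the sentence vectors and scores would come from a trained CNN.

```python
import math

# Hypothetical toy knowledge base: (head entity, tail entity) -> relation.
KB = {
    ("Barack Obama", "Hawaii"): "place_of_birth",
    ("Paris", "France"): "capital_of",
}


def distant_label(sentences, kb):
    """Apply the distant supervision assumption: a sentence that mentions
    both entities of a KB fact is labelled with that fact's relation."""
    labelled = []
    for sentence in sentences:
        for (head, tail), relation in kb.items():
            if head in sentence and tail in sentence:
                labelled.append((sentence, head, tail, relation))
    return labelled


def split_by_entities(tokens, head_idx, tail_idx):
    """Divide a token list into three parts around the two entity positions.
    Assumed boundaries: the middle part includes both entity tokens."""
    first, second = sorted((head_idx, tail_idx))
    return tokens[:first], tokens[first:second + 1], tokens[second + 1:]


def selective_attention(sentence_vectors, scores):
    """Combine the sentence vectors of one entity-pair bag into a single
    vector, weighting each sentence by a softmax over its score.
    The scores are taken as given here (an assumption of this sketch)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shift for numerical stability
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(sentence_vectors[0])
    bag = [sum(w * v[i] for w, v in zip(weights, sentence_vectors))
           for i in range(dim)]
    return weights, bag


sentences = [
    "Barack Obama was born in Hawaii.",
    "Barack Obama visited Hawaii last year.",  # wrongly labelled instance
    "Paris is lovely.",
]
labelled = distant_label(sentences, KB)  # two instances, both place_of_birth
left, middle, right = split_by_entities(
    ["Obama", "was", "born", "in", "Hawaii", "."], 0, 4)
weights, bag = selective_attention([[1.0, 0.0], [0.0, 1.0]], [2.0, 0.0])
```

The second sentence shows the wrong-label problem: it mentions both entities but does not express the birthplace relation. With the scores assumed above, the attention step gives it a smaller weight than the first sentence, so it contributes less to the bag representation.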

URI

http://drsr.daiict.ac.in//handle/123456789/744

Collections

M Tech Dissertations [923]