Distant supervision for relation extraction
Abstract
Relation Extraction (RE) is one of the important tasks of Information Extraction, which is used to obtain data from natural language text. Relation extraction can be done using different methods, and most techniques in the area use labelled data. The downside of labelled data is that it is very costly to generate, since it requires human labour to understand each sentence and its entities and to label them accordingly. A large amount of natural language data is available, and it is increasing day by day, so supervised techniques may not scale or adapt well to real-time, dynamic data. The issue of human annotation is addressed by the recent approach of distant supervision, which attempts to label data automatically. This is realized by extracting facts from publicly available knowledge bases such as Wikidata and DBPedia, most of which are freely available. The assumption of distant supervision is that if a relation holds between two entities in the knowledge base, then a sentence in which those entities appear together expresses that relation. Distant supervision has problems of its own, however, such as an incomplete knowledge base and the wrong label problem. Most techniques in the area of relation extraction use available NLP tools for feature extraction, and these tools themselves make errors. In this work, we explore a convolutional neural network (CNN) for the task, which does not require NLP-based preprocessing. To mitigate the wrong label problem, we use selective attention over instances, which treats the task as a multi-instance problem; we conclude that it gives better results. We also use a CNN with a context model, in which the input is divided into three parts based on the entity positions. This helps the model learn the sentence representation, and the model performs well compared to the basic CNN model.
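The distant supervision assumption and the three-part input split described in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the toy knowledge base, the example sentences, the function names `distant_label` and `split_by_entities`, and the exact segment boundaries (text before the first entity, between the entities, and after the second entity) are all assumptions made for illustration.

```python
# Hypothetical toy knowledge base: (head entity, tail entity) -> relation.
KB = {
    ("Barack Obama", "Hawaii"): "place_of_birth",
    ("Steve Jobs", "Apple"): "founder_of",
}

# Hypothetical pre-tokenized sentences (tokens separated by spaces).
SENTENCES = [
    "Barack Obama was born in Hawaii in 1961 .",
    "Barack Obama visited Hawaii last week .",
    "Steve Jobs introduced the first iPhone .",
]


def distant_label(sentences, kb):
    """Apply the distant supervision assumption.

    Every sentence that mentions both entities of a knowledge base fact is
    labelled with that fact's relation. Sentences for the same fact are
    grouped into one bag, as in multi-instance learning.
    """
    bags = {}
    for (head, tail), relation in kb.items():
        matches = [s for s in sentences if head in s and tail in s]
        if matches:
            bags[(head, tail, relation)] = matches
    return bags


def split_by_entities(sentence, head, tail):
    """Split a sentence into three token lists around the two entities.

    Assumed boundaries: tokens before the first entity, tokens between the
    two entities, and tokens after the second entity.
    """
    first, second = sorted((head, tail), key=sentence.index)
    first_end = sentence.index(first) + len(first)
    second_start = sentence.index(second)
    second_end = second_start + len(second)
    left = sentence[: sentence.index(first)].split()
    middle = sentence[first_end:second_start].split()
    right = sentence[second_end:].split()
    return left, middle, right


if __name__ == "__main__":
    for (head, tail, relation), bag in distant_label(SENTENCES, KB).items():
        for sentence in bag:
            print(relation, split_by_entities(sentence, head, tail))
```

In this toy data, both sentences that mention Barack Obama and Hawaii receive the label `place_of_birth`, although the second one does not express that relation. This is the wrong label problem that selective attention over the instances in a bag is meant to mitigate.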
Collections
- M Tech Dissertations