Distant supervision is widely used to extract relational facts from automatically labeled datasets, reducing the high cost of human annotation. However, current distantly supervised methods suffer from word-level and sentence-level noise: a large proportion of the words in a sentence are irrelevant to the target relation, and many sentences carry inaccurate relation labels. These problems degrade precision in relation extraction and are critical to the success of distant supervision. In this paper, we propose a novel and robust neural approach that reduces the influence of noise at multiple granularities, carefully considering three levels: word, sentence, and knowledge type. We first introduce a question-answering-based relation extractor (QARE) to remove noisy words from a sentence. We then use multi-focus multi-instance learning (MMIL) to alleviate sentence-level noise by exploiting wrongly labeled sentences appropriately. Finally, to strengthen our method against all three kinds of noise, we initialize its parameters with prior knowledge transferred from the related task of entity type classification. Extensive experiments on an existing benchmark and on an improved, larger dataset demonstrate that our approach achieves new state-of-the-art performance.
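The multi-instance learning mentioned above can be sketched generically: sentences that mention the same entity pair form a "bag", and the relation is predicted from the bag as a whole, so a single mislabeled sentence need not dominate training. The sketch below illustrates this standard bag-level attention aggregation only; it is not the paper's MMIL method, and the function names (`softmax`, `aggregate_bag`) are illustrative.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of raw scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def aggregate_bag(sentence_scores):
    """Attention-style aggregation for one bag: weight each sentence's
    relation score by a softmax over the scores themselves, so likely
    noisy (low-scoring) sentences contribute less to the bag prediction."""
    weights = softmax(sentence_scores)
    return sum(w * s for w, s in zip(weights, sentence_scores))

# Example: three sentences mentioning the same entity pair; the last one
# is probably noise (low score for the candidate relation).
scores = [0.9, 0.8, 0.1]
bag_score = aggregate_bag(scores)
```

Because the attention weights upweight high-scoring sentences, the bag score sits above the plain average of the sentence scores, which is the basic mechanism by which multi-instance learning softens sentence-level label noise.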