Neural machine translation (NMT) models are data-hungry and domain-sensitive, yet it is rarely possible to obtain large amounts of labeled in-domain training data, which calls for a domain transfer strategy. To address the problem of domain data mismatch, this paper proposes an NMT transfer model based on mutual guidance between domains, in which the mutual-guidance framework provides continuous guidance throughout training. Within each individual domain, self-ensembling and self-knowledge-distillation keep the model from drifting too far from that domain. In addition, the way domain data are batched further improves training. The framework guides the in-domain model chiefly through out-of-domain pretraining, distillation from existing in-domain models, and data selection during training. These components are unified in a single training framework, so that model training is guided continuously and effectively both within and across domains. In this study, the model was comprehensively tested in three typical experimental scenarios and compared against many conventional classic methods. The results showed that the proposed “inter-domain transfer training” and “curriculum scheduling agent” were effective and robust. The most important finding is that this comprehensive guided training framework (intra-domain and inter-domain) suits domain transfer in different scenarios, and it does not increase decoding cost.
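To make the abstract's core ideas concrete, the following is a minimal sketch (not the authors' released code) of the kind of guided training step it describes: a frozen out-of-domain teacher guides an in-domain student via knowledge distillation alongside the usual cross-entropy loss, while a simple schedule controls how in-domain and out-of-domain batches are mixed. The model interface, the linear ramp, and all hyperparameters here are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def guided_step(student, teacher, batch, alpha=0.5, T=2.0):
    """One guided training step: hard cross-entropy on the reference
    translations plus temperature-scaled KL distillation against a
    frozen teacher. `student(src, tgt)` returning (batch, len, vocab)
    logits is an assumed interface, not the paper's actual API."""
    src, tgt = batch                       # token-id tensors
    logits = student(src, tgt)             # student predictions
    with torch.no_grad():
        t_logits = teacher(src, tgt)       # frozen out-of-domain teacher

    # Hard-label loss: cross_entropy expects (batch, vocab, len).
    ce = F.cross_entropy(logits.transpose(1, 2), tgt)

    # Soft-label loss: standard knowledge-distillation KL term,
    # rescaled by T^2 as is conventional.
    kd = F.kl_div(
        F.log_softmax(logits / T, dim=-1),
        F.softmax(t_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * kd + (1 - alpha) * ce

def in_domain_probability(step, total_steps):
    """Toy curriculum schedule (assumed, for illustration): start
    mostly out-of-domain and linearly ramp toward in-domain batches
    over the first half of training."""
    return min(1.0, step / (0.5 * total_steps))
```

In this sketch, the distillation term plays the role of the out-of-domain guidance the abstract mentions, and the batch-mixing schedule stands in for the “curriculum scheduling agent”; the paper's actual components are more elaborate than this two-function outline.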
               