Domain adaptation leverages the knowledge in one domain—the source domain—to improve learning efficiency in another domain—the target domain. Existing heterogeneous domain adaptation research is relatively well-progressed but only in situations… Click to show full abstract
Domain adaptation leverages the knowledge in one domain—the source domain—to improve learning efficiency in another domain—the target domain. Existing heterogeneous domain adaptation research is relatively well-progressed but only in situations where the target domain contains at least a few labeled instances. In contrast, heterogeneous domain adaptation with an unlabeled target domain has not been well-studied. To contribute to the research in this emerging field, this article presents: 1) an unsupervised knowledge transfer theorem that guarantees the correctness of transferring knowledge and 2) a principal angle-based metric to measure the distance between two pairs of domains: one pair comprises the original source and target domains and the other pair comprises two homogeneous representations of two domains. The theorem and the metric have been implemented in an innovative transfer model, called a Grassmann–linear monotonic maps–geodesic flow kernel (GLG), which is specifically designed for heterogeneous unsupervised domain adaptation (HeUDA). The linear monotonic maps (LMMs) meet the conditions of the theorem and are used to construct homogeneous representations of the heterogeneous domains. The metric shows the extent to which the homogeneous representations have preserved the information in the original source and target domains. By minimizing the proposed metric, the GLG model learns the homogeneous representations of heterogeneous domains and transfers knowledge through these learned representations via a geodesic flow kernel (GFK). To evaluate the model, five public data sets were reorganized into ten HeUDA tasks across three applications: cancer detection, the credit assessment, and text classification. The experiments demonstrate that the proposed model delivers superior performance over the existing baselines.
               
Click one of the above tabs to view related content.