In this paper, a Bayesian nonparametric approach to the two-sample problem is proposed. Given two samples $$X = X_1, \ldots, X_{m_1} \overset{\text{i.i.d.}}{\sim} F$$ and $$Y = Y_1, \ldots, Y_{m_2} \overset{\text{i.i.d.}}{\sim} G$$, with $$F$$ and $$G$$ being unknown continuous cumulative distribution functions, we wish to test the null hypothesis $$H_0: F = G$$. The method is based on computing the Kolmogorov distance between two posterior Dirichlet processes and comparing the result with a reference distance. The parameters of the Dirichlet processes are selected so that any discrepancy between the posterior distance and the reference distance is related to the difference between the two samples. Relevant theoretical properties of the procedure are also developed. Through simulated examples, the approach is compared with the frequentist Kolmogorov–Smirnov test and with a Bayesian nonparametric test, and it demonstrates excellent performance.
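The central computation described above, a Kolmogorov distance between draws from two posterior Dirichlet processes, can be illustrated with a minimal sketch. It is not the paper's procedure: it assumes a finite Dirichlet approximation to each posterior process, a standard normal base measure, and a concentration parameter `a = 1`, none of which are stated in the abstract. The function names `sample_posterior_dp` and `kolmogorov_distance` are hypothetical, and the reference distance is omitted because the abstract does not specify how it is constructed.

```python
import numpy as np


def sample_posterior_dp(data, a, base_sampler, n_atoms, rng):
    """Approximate draw from the posterior DP given `data`.

    Assumption: the posterior DP(a + m, (a*H + m*F_m) / (a + m)) is
    approximated by n_atoms i.i.d. atoms from its base measure with
    symmetric Dirichlet((a + m) / n_atoms) weights.
    """
    data = np.asarray(data, dtype=float)
    m = data.size
    # Each atom is resampled from the data with probability m / (a + m) ...
    atoms = rng.choice(data, size=n_atoms)
    # ... and drawn from the prior base measure H with probability a / (a + m).
    from_base = rng.random(n_atoms) < a / (a + m)
    atoms[from_base] = base_sampler(int(from_base.sum()), rng)
    weights = rng.dirichlet(np.full(n_atoms, (a + m) / n_atoms))
    return atoms, weights


def kolmogorov_distance(atoms1, w1, atoms2, w2):
    """sup_x |F1(x) - F2(x)| for two discrete distributions."""
    # Both CDFs are right-continuous step functions, so the supremum
    # is attained at one of the pooled atoms.
    grid = np.concatenate([atoms1, atoms2])

    def cdf(atoms, w, x):
        order = np.argsort(atoms)
        cum = np.concatenate([[0.0], np.cumsum(w[order])])
        # Number of atoms <= x indexes the cumulative weight.
        return cum[np.searchsorted(atoms[order], x, side="right")]

    return float(np.max(np.abs(cdf(atoms1, w1, grid) - cdf(atoms2, w2, grid))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(0.0, 1.0, size=50)  # sample from F (illustrative)
    y = rng.normal(1.0, 1.0, size=50)  # sample from G (illustrative)

    def base(k, r):
        # Assumed base measure H: standard normal.
        return r.standard_normal(k)

    # Monte Carlo draws of the posterior Kolmogorov distance.
    dists = [
        kolmogorov_distance(
            *sample_posterior_dp(x, 1.0, base, 500, rng),
            *sample_posterior_dp(y, 1.0, base, 500, rng),
        )
        for _ in range(200)
    ]
    print("mean posterior distance:", np.mean(dists))
```

In a full test, the distribution of these posterior distances would be compared with a reference distance computed under the null hypothesis; how that comparison is made is left to the paper itself.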
               