Abstract The need to model count data correctly calls for a flexible yet strong model that can adequately handle various types of count data. Models such as Ordinary Least Squares (OLS), used in the past, were considered unsuitable, and the introduction of the Generalized Linear Model (GLM) and its various extensions was the first breakthrough recorded in modelling count data. In this article, a Bayesian Dirichlet process mixture prior for generalized linear mixed models (DPMglmm) was proposed. Metropolis-Hastings Markov chain Monte Carlo (M-H MCMC) was used to draw parameters from the target posterior distribution. The Iteratively Weighted Least Squares (IWLS) proposal was used to determine the acceptance probability in the M-H MCMC phase. Under-dispersed and over-dispersed count data were simulated. A burn-in of 500 iterations was used to allow the chain to stabilize, and a thinning interval of 100 was applied to counter the possible effect of autocorrelation in the draws arising from the Monte Carlo procedure. The DPMglmm and other competing models were fitted to the simulated data and to real-life data sets of health insurance claims. The results showed that DPMglmm outperformed MCMCglmm, the Bayesian Discrete Weibull model and four other frequentist models. This shows that DPMglmm is flexible and can fit count data better, whether under-dispersed or over-dispersed.
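To illustrate how a burn-in of 500 and a thinning interval of 100 enter a Metropolis-Hastings run, the following is a minimal sketch and not the DPMglmm sampler itself. It assumes a plain Poisson model with a single log-rate parameter, a wide normal prior, and a symmetric random-walk proposal in place of the IWLS proposal. The function names, the prior, the step size and the example counts are all hypothetical choices made for illustration.

```python
import math
import random


def log_posterior(theta, counts):
    """Unnormalised log posterior of theta = log(rate) for Poisson counts.

    Assumption for illustration: a N(0, 10^2) prior on theta.
    The constant term -sum(log(y!)) is dropped because it cancels in M-H.
    """
    rate = math.exp(theta)
    log_lik = sum(y * theta - rate for y in counts)
    log_prior = -0.5 * (theta / 10.0) ** 2
    return log_lik + log_prior


def metropolis_hastings(counts, n_keep=200, burn_in=500, thin=100,
                        step=0.3, seed=1):
    """Random-walk Metropolis-Hastings with burn-in and thinning."""
    rng = random.Random(seed)
    theta = 0.0
    current = log_posterior(theta, counts)
    kept = []
    total = burn_in + n_keep * thin
    for i in range(1, total + 1):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, counts)
        # Symmetric proposal, so the acceptance ratio is the posterior ratio.
        # 1 - random() lies in (0, 1], which keeps log() well defined.
        if math.log(1.0 - rng.random()) < candidate - current:
            theta, current = proposal, candidate
        # Discard the first `burn_in` draws, then keep every `thin`-th draw.
        if i > burn_in and (i - burn_in) % thin == 0:
            kept.append(theta)
    return kept


# Hypothetical example counts, not data from the article.
example_counts = [3, 5, 4, 6, 2, 4, 5, 3, 4, 4]
draws = metropolis_hastings(example_counts)
posterior_mean_rate = sum(math.exp(t) for t in draws) / len(draws)
print(len(draws), round(posterior_mean_rate, 2))
```

With these settings the chain runs 500 + 200 × 100 = 20,500 iterations and retains 200 draws. Thinning lowers the autocorrelation among retained draws at the cost of discarding most of the chain.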