Despite recent methodological advances in hidden Markov regression models and a rapid increase in their application across a wide range of empirical settings, complex clustering-based research questions that involve the contribution of the covariate set to the classification and the presence of atypical observations are often addressed while ignoring the possible effects of wrong model assumptions. Hidden Markov regression models with random covariates (HMRMRCs) have recently been proposed as an improvement over the classical fixed-covariates approach, allowing the covariates to contribute to the underlying clustering structure. To make the approach more flexible when all the considered random variables are continuous, HMRMRCs are here defined focusing on three multivariate elliptical distributions: the normal (reference distribution), the t, and the contaminated normal. The latter two, heavy-tailed generalizations of the normal distribution, are introduced to protect the reference model against the occurrence of mildly atypical points and also allow their automatic detection. Identifiability conditions are provided, EM-based algorithms are outlined for parameter estimation, and various implementation and operational issues are discussed. Properties of the estimators of the regression coefficients, as well as of the hidden path parameters, are evaluated through Monte Carlo experiments with the aim of showing the consequences of wrong model assumptions on parameter estimates and inferred clustering. Artificial and real data analyses are provided to investigate the models' behavior in the presence of heterogeneity and atypical observations.
               