Analyses of large volumes of metagenomic data extracted from aggregate populations of microscopic organisms residing on and in the human body are advancing contemporary understanding of how microbes participate in human health and disease. Next-generation sequencing technology facilitates these analyses in terms of diversity, community composition, and differential abundance by filtering and binning microbial 16S rRNA genes extracted from human tissues into operational taxonomic units. However, current statistical tools restrict study designs to investigations of limited numbers of host characteristics with limited numbers of samples, potentially losing relevant information. This paper presents a Bayesian hierarchical negative binomial model as an efficient technique capable of accommodating multivariable sets that include tens or hundreds of host characteristics as covariates, further expanding analyses of human microbiome count data. Simulation studies reveal that the Bayesian hierarchical negative binomial model provides a desirable strategy, often outperforming three competing negative binomial models in terms of type I error while maintaining consistent power. An application of the model to subsets of the open data published by the American Gut Project demonstrates its ability to identify operational taxonomic units that differ significantly between persons diagnosed by a medical professional with either inflammatory bowel disease or irritable bowel syndrome; the units identified are consistent with contemporary gastrointestinal literature.
               