Many problems in marketing and economics require firms to make targeted consumer-specific decisions, but current estimation methods are not designed to scale to the size of modern data sets. In this article, the authors propose a new algorithm to close that gap. They develop a distributed Markov chain Monte Carlo (MCMC) algorithm for estimating Bayesian hierarchical models when the number of consumers is very large and the objects of interest are the consumer-level parameters. The two-stage, embarrassingly parallel algorithm is asymptotically unbiased in the number of consumers, retains the flexibility of a standard MCMC algorithm, and is easy to implement. The authors show that the distributed MCMC algorithm is faster and more efficient than a single-machine algorithm by at least an order of magnitude. They illustrate the approach with simulations of up to 100 million consumers and with data on 1,088,310 donors to a charitable organization. The algorithm yields between $1.6 million and $4.6 million in additional donations when applied to a large, modern-size data set rather than a typical-size data set.
               