We present an efficient density‐based adaptive‐resolution clustering method APLoD for analyzing large‐scale molecular dynamics (MD) trajectories. APLoD performs the k‐nearest‐neighbors search to estimate the density of MD conformations in a… Click to show full abstract
We present an efficient density‐based adaptive‐resolution clustering method APLoD for analyzing large‐scale molecular dynamics (MD) trajectories. APLoD performs the k‐nearest‐neighbors search to estimate the density of MD conformations in a local fashion, which can group MD conformations in the same high‐density region into a cluster. APLoD greatly improves the popular density peaks algorithm by reducing the running time and the memory usage by 2–3 orders of magnitude for systems ranging from alanine dipeptide to a 370‐residue Maltose‐binding protein. In addition, we demonstrate that APLoD can produce clusters with various sizes that are adaptive to the underlying density (i.e., larger clusters at low‐density regions, while smaller clusters at high‐density regions), which is a clear advantage over other popular clustering algorithms including k‐centers and k‐medoids. We anticipate that APLoD can be widely applied to split ultra‐large MD datasets containing millions of conformations for subsequent construction of Markov State Models. © 2016 Wiley Periodicals, Inc.
               
Click one of the above tabs to view related content.