High-throughput reporter assays such as self-transcribing active regulatory region sequencing (STARR-seq) have made it possible to measure regulatory element activity across the entire human genome at once. The resulting data,… Click to show full abstract
High-throughput reporter assays such as self-transcribing active regulatory region sequencing (STARR-seq) have made it possible to measure regulatory element activity across the entire human genome at once. The resulting data, however, present substantial analytical challenges. Here, we identify technical biases that explain most of the variance in STARR-seq data. We then develop a statistical model to correct those biases and to improve detection of regulatory elements. This approach substantially improves precision and recall over current methods, improves detection of both activating and repressive regulatory elements, and controls for false discoveries despite strong local correlations in signal.
               
Click one of the above tabs to view related content.