LAUSR.org creates dashboard-style pages of related content for over 1.5 million academic articles. Sign Up to like articles & get recommendations!

Gene expression distribution deconvolution in single-cell RNA sequencing

Photo from wikipedia

Significance We developed deconvolution of single-cell expression distribution (DESCEND), a method to recover cross-cell distribution of the true gene expression level from observed counts in single-cell RNA sequencing, allowing adjustment… Click to show full abstract

Significance We developed deconvolution of single-cell expression distribution (DESCEND), a method to recover cross-cell distribution of the true gene expression level from observed counts in single-cell RNA sequencing, allowing adjustment of known confounding cell-level factors. With the recovered distribution, DESCEND provides reliable estimates of distribution-based measurements, such as the dispersion of true gene expression and the probability that true gene expression is positive. This is important, as with better estimates of these measurements, DESCEND clarifies and improves many downstream analyses including finding differentially expressed genes, identifying cell types, and selecting differentiation markers. Another contribution is that we verified using nine public datasets a simple “Poisson-alpha” noise model for the technical noise of unique molecular identifier-based single-cell RNA-sequencing data, clarifying the current intense debate on this issue. Single-cell RNA sequencing (scRNA-seq) enables the quantification of each gene’s expression distribution across cells, thus allowing the assessment of the dispersion, nonzero fraction, and other aspects of its distribution beyond the mean. These statistical characterizations of the gene expression distribution are critical for understanding expression variation and for selecting marker genes for population heterogeneity. However, scRNA-seq data are noisy, with each cell typically sequenced at low coverage, thus making it difficult to infer properties of the gene expression distribution from raw counts. Based on a reexamination of nine public datasets, we propose a simple technical noise model for scRNA-seq data with unique molecular identifiers (UMI). We develop deconvolution of single-cell expression distribution (DESCEND), a method that deconvolves the true cross-cell gene expression distribution from observed scRNA-seq counts, leading to improved estimates of properties of the distribution such as dispersion and nonzero fraction. DESCEND can adjust for cell-level covariates such as cell size, cell cycle, and batch effects. DESCEND’s noise model and estimation accuracy are further evaluated through comparisons to RNA FISH data, through data splitting and simulations and through its effectiveness in removing known batch effects. We demonstrate how DESCEND can clarify and improve downstream analyses such as finding differentially expressed genes, identifying cell types, and selecting differentiation markers.

Keywords: expression distribution; gene expression; cell; distribution

Journal Title: Proceedings of the National Academy of Sciences of the United States of America
Year Published: 2018

Link to full text (if available)


Share on Social Media:                               Sign Up to like & get
recommendations!

Related content

More Information              News              Social Media              Video              Recommended



                Click one of the above tabs to view related content.