RNA‐Seq is a powerful transcriptomics tool for mammalian cell culture process development. Successful RNA‐Seq data analysis requires a high‐quality reference for read mapping and gene expression quantification. Currently, there are two public genome references for Chinese hamster ovary (CHO) cells, the predominant mammalian cell line in the biopharmaceutical industry. In this study, we compared these two references by analyzing 60 RNA‐Seq samples from a variety of CHO cell culture conditions. Among the 20,891 genes common to both references, we observed that 31.5% differ in quantification by more than 7.1%, implying that the two references define these genes differently. We propose a framework to quantify this difference using two metrics, Consistency and Stringency, which account for, respectively, the average quantification difference between the two references over all samples and the sample‐specific effect on the quantification result. These two metrics can be used to identify candidate genes for future gene model improvement and to assess the reliability of differentially expressed genes identified by RNA‐Seq data analysis. Until a more comprehensive genome reference for CHO cells emerges, the strategy proposed in this study can enable more robust transcriptome analysis of CHO cell RNA‐Seq data. Biotechnol. Bioeng. 2017;114: 1603–1613. © 2017 Wiley Periodicals, Inc.
               