Abstract Long non-coding RNAs (lncRNAs) make up a significant portion of non-coding RNAs and are involved in a variety of biological processes. Accurate identification and annotation of lncRNAs is the first step toward gaining deeper insight into their functions. In this study, we report a novel tool, PLncPRO, for predicting lncRNAs in plants from transcriptome data. PLncPRO is based on machine learning and uses a random forest algorithm to classify transcripts as coding or long non-coding. PLncPRO achieves better prediction accuracy than existing tools and is particularly well suited to plants. We developed consensus models for dicots and monocots to facilitate prediction of lncRNAs in non-model/orphan plants. PLncPRO also performed well with vertebrate transcriptome data. Using PLncPRO, we discovered 3714 and 3457 high-confidence lncRNAs in rice and chickpea, respectively, under drought or salinity stress conditions. We investigated the characteristics of these lncRNAs and their differential expression under drought/salinity stress conditions, and validated lncRNAs via RT-qPCR. Overall, we developed a new tool for the prediction of lncRNAs in plants and demonstrated its utility by identifying lncRNAs in rice and chickpea.
               