Plant genomes withstand small- and large-scale duplications with far greater success than genomes in any other kingdom in the tree of life, resulting in the existence and evolution of gene families, often with over a hundred members. These gene families, in turn, undergo subfunctionalization or neofunctionalization to form protein domains that perform unique or grouped functions in the context of the original activity. Because such cases are so numerous in the plant kingdom, investigating a specific gene family of interest has become a routine task for plant biologists. In this chapter, we provide a simple and standard pipeline for this effort, using the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains in rice as a reference example. We describe the extraction, processing, and downstream analysis of the Oryza sativa var. japonica proteome for the identification and comparative exploration of START domains. This was done by training profile Hidden Markov Models (HMMs) on 35 reported START domains in Arabidopsis, which were then used to search for potential homologs in rice. Downstream investigations included domain structure analysis, visualization of exon-intron patterns, chromosomal localization of START genes, and phylogenetic studies, followed by identification of cis-regulatory elements and construction of a gene regulatory network. Additionally, we highlight various alternative tools and techniques that can be used to perform similar analyses, along with their salient features.
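The profile-HMM search step described above could be sketched roughly as follows. This is a minimal illustration, not the chapter's actual protocol: it assumes the HMMER suite (`hmmbuild`, `hmmsearch`) is the tool used, and the file names, the `start` prefix, and the 1e-5 E-value cutoff are all hypothetical placeholders chosen for the example.

```python
from typing import List, Tuple


def hmmer_commands(alignment: str, proteome: str,
                   prefix: str = "start",
                   evalue: float = 1e-5) -> List[List[str]]:
    """Build (but do not run) HMMER command lines for the search step.

    `alignment` is a multiple sequence alignment of the known domains
    (e.g. the Arabidopsis START domains); `proteome` is the FASTA file
    to search. All paths and the cutoff are assumptions for illustration.
    """
    hmm = f"{prefix}.hmm"
    return [
        # Train a profile HMM from the alignment.
        ["hmmbuild", hmm, alignment],
        # Search the proteome and write a per-sequence hit table.
        ["hmmsearch", "--tblout", f"{prefix}_hits.tbl",
         "-E", str(evalue), hmm, proteome],
    ]


def parse_tblout(text: str,
                 max_evalue: float = 1e-5) -> List[Tuple[str, float, float]]:
    """Parse hmmsearch --tblout text into (target, E-value, score) tuples.

    In the tblout layout, column 1 is the target name, column 5 the
    full-sequence E-value, and column 6 the full-sequence bit score.
    Hits above `max_evalue` are dropped; the rest are sorted by E-value.
    """
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        target, ev, score = fields[0], float(fields[4]), float(fields[5])
        if ev <= max_evalue:
            hits.append((target, ev, score))
    return sorted(hits, key=lambda h: h[1])
```

The commands returned by `hmmer_commands` can be passed one at a time to `subprocess.run(cmd, check=True)` on a machine with HMMER installed, and the resulting table read back with `parse_tblout` to obtain candidate rice homologs for the downstream analyses.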
               