Mining functional microsatellites in legume unigenes
Journal Title: Bioinformation - Year 2011, Vol 7, Issue 5
Abstract
Highly polymorphic and transferable microsatellites (SSRs) are important for comparative genomics, genome analysis and phylogenetic studies. Developing novel species-specific microsatellite markers remains costly and labor-intensive. Interest has therefore shifted from genomic to genic markers, which show high inter-species transferability because they are developed from conserved coding regions of the genome. This study presents a comparative analysis of genic microsatellites in nine important legume species (Arachis hypogaea, Cajanus cajan, Cicer arietinum, Glycine max, Lotus japonicus, Medicago truncatula, Phaseolus vulgaris, Pisum sativum and Vigna unguiculata) and two model plant species (Oryza sativa and Arabidopsis thaliana). A total of 228,090 putative unique sequences spanning 219,610,522 bp were screened with the microsatellite search tool MISA; 12.18% of the unigenes contained microsatellites, for a total of 36,248 motifs excluding mononucleotide repeats. The frequency of legume unigene-derived SSRs was one SSR per 6.0 kb of analyzed sequence. Trinucleotide repeats were predominant in the unigenes of all species except C. cajan, in which dinucleotide repeats were more prevalent than trinucleotide repeats. Dinucleotide and trinucleotide repeats together accounted for more than 90% of the total microsatellites. Among dinucleotide and trinucleotide repeats, AG and AAG motifs, respectively, were the most frequent. Microsatellite-positive chickpea unigenes were assigned Gene Ontology (GO) terms to identify the possible roles of these unigenes in various molecular and biological functions. These unigene-based microsatellite markers will prove valuable for recording allelic variance across germplasm collections, gene tagging and searching for putative candidate genes.
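The kind of screen described in the abstract, finding perfect di- to hexanucleotide repeats while excluding mononucleotide runs, can be illustrated with a minimal Python sketch. This is not MISA and not the authors' pipeline: the function name find_ssrs and the minimum repeat counts in MIN_REPEATS are assumptions chosen for illustration, since the abstract does not state the thresholds used.

```python
import re

# Assumed minimum repeat counts per motif length (hypothetical values;
# the abstract does not give the thresholds used in the study).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (1-based start, motif, repeat count) for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "AA" (mononucleotide run) or "AGAG" (really "AG").
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start() + 1, motif, len(m.group(1)) // k))
    return sorted(hits)


# Toy sequence: an (AG)7 repeat and an (AAG)5 repeat with short flanks.
example = "TTGC" + "AG" * 7 + "CCTC" + "AAG" * 5 + "GTC"
for start, motif, count in find_ssrs(example):
    print(start, motif, count)
```

On the toy sequence, the sketch reports the AG and AAG repeats, the two motif types the abstract identifies as most frequent. Compound or imperfect microsatellites are not handled here.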
Authors and Affiliations
Manish Roorkiwal, Prakash Sharma