A semiautomated approach to gene discovery through expressed sequence tag data mining: Discovery of new human transporter genes
Journal Title: The AAPS Journal - Year 2003, Vol 5, Issue 1
Abstract
Identification and functional characterization of the genes in the human genome remain a major challenge. A principal source of publicly available information used for this purpose is the National Center for Biotechnology Information database of expressed sequence tags (dbEST), which contains over 4 million human ESTs. To extract the information buried in these data more effectively, we have developed a semiautomated method to mine dbEST for uncharacterized human genes. Starting with a single protein input sequence, a family of related proteins from all species is compiled. This entire family is then used to mine the human EST database for new gene candidates. Evaluation of putative new gene candidates in the context of a family of characterized proteins provides a framework for inference of the structure and function of the new genes. When applied to a test data set of 28 families within the major facilitator superfamily (MFS) of membrane transporters, our protocol found 73 previously characterized human MFS genes and 43 new MFS gene candidates. Development of this approach provided insights into the problems and pitfalls of automated data mining using public databases.
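The two-stage workflow described in the abstract (compile a protein family from one seed sequence, then search human ESTs with every family member) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The k-mer Jaccard score is only a stand-in for a real sequence similarity search, the ESTs are treated as already-translated peptide strings, and all function names, thresholds, and sequences are hypothetical toy values chosen for illustration.

```python
from typing import Dict, List, Set, Tuple


def kmers(seq: str, k: int = 3) -> Set[str]:
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a: str, b: str, k: int = 3) -> float:
    """Toy stand-in for a similarity search: Jaccard index of k-mer sets."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def compile_family(seed: str, proteins: Dict[str, str],
                   threshold: float) -> Dict[str, str]:
    """Stage 1: keep every protein (any species) that resembles the seed."""
    return {name: seq for name, seq in proteins.items()
            if similarity(seed, seq) >= threshold}


def mine_ests(family: Dict[str, str], human_ids: Set[str],
              ests: Dict[str, str], hit_threshold: float,
              identity_threshold: float) -> Tuple[List[str], List[str]]:
    """Stage 2: search ESTs with the whole family, then split the hits.

    An EST that is nearly identical to a characterized human family member
    is reported as known; any other hit is a new gene candidate.
    """
    known, candidates = [], []
    for est_id, est_seq in ests.items():
        scores = {name: similarity(seq, est_seq)
                  for name, seq in family.items()}
        if max(scores.values(), default=0.0) < hit_threshold:
            continue  # no family member resembles this EST
        best_human = max((s for n, s in scores.items() if n in human_ids),
                         default=0.0)
        if best_human >= identity_threshold:
            known.append(est_id)
        else:
            candidates.append(est_id)
    return known, candidates


# Hypothetical toy data (not real sequences).
seed = "MKTLLVAGGA"
proteins = {
    "human_A": "MKTLLVAGGA",  # characterized human protein (the seed)
    "mouse_B": "MKTLLVAGGS",  # related protein from another species
    "yeast_C": "QWERTYIPAS",  # unrelated protein
}
ests = {
    "est1": "MKTLLVAGGA",  # matches the characterized human protein
    "est2": "TLLVAGGSWW",  # found only through the mouse family member
    "est3": "QQQQQQQQQQ",  # unrelated
}

family = compile_family(seed, proteins, threshold=0.5)
known, candidates = mine_ests(family, {"human_A"}, ests,
                              hit_threshold=0.5, identity_threshold=0.9)
print(sorted(family), known, candidates)
```

In this toy run, est2 scores below the hit threshold against the human seed alone but above it against the mouse relative, which illustrates why searching with the entire family rather than the single input sequence can surface additional candidates.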
Authors and Affiliations
Shoshana Brown, Jean L. Chang, Wolfgang Sadee, Patricia C. Babbitt