A semiautomated approach to gene discovery through expressed sequence tag data mining: Discovery of new human transporter genes

Journal Title: The AAPS Journal - Year 2003, Vol 5, Issue 1

Abstract

Identification and functional characterization of the genes in the human genome remain a major challenge. A principal source of publicly available information used for this purpose is the National Center for Biotechnology Information database of expressed sequence tags (dbEST), which contains over 4 million human ESTs. To extract the information buried in this data more effectively, we have developed a semiautomated method to mine dbEST for uncharacterized human genes. Starting with a single protein input sequence, a family of related proteins from all species is compiled. This entire family is then used to mine the human EST database for new gene candidates. Evaluation of putative new gene candidates in the context of a family of characterized proteins provides a framework for inference of the structure and function of the new genes. When applied to a test data set of 28 families within the major facilitator superfamily (MFS) of membrane transporters, our protocol found 73 previously characterized human MFS genes and 43 new MFS gene candidates. Development of this approach provided insights into the problems and pitfalls of automated data mining using public databases.
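The workflow described in the abstract (seed protein, then family compilation, then EST mining, then separation of known genes from new candidates) can be illustrated with a small toy sketch. This is not the authors' implementation: the abstract does not name the search tools or thresholds used, so the k-mer overlap score below is only a stand-in for a real sequence-similarity search. The function names, the thresholds, and the example sequences are all hypothetical, and ESTs are assumed to be already translated into peptide fragments.

```python
from typing import Dict, List, Set, Tuple


def kmers(seq: str, k: int = 3) -> Set[str]:
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a: str, b: str, k: int = 3) -> float:
    """Toy score: shared k-mers divided by the smaller k-mer set size.

    Assumption: this replaces a real alignment-based similarity search.
    """
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / min(len(ka), len(kb))


def compile_family(seed: str, proteins: Dict[str, str],
                   threshold: float) -> Dict[str, str]:
    """Grow a family from one seed: add any protein similar to a current member,
    and repeat until a full pass adds nothing new."""
    family = {"seed": seed}
    changed = True
    while changed:
        changed = False
        for name, seq in proteins.items():
            if name in family:
                continue
            if any(similarity(seq, m) >= threshold for m in family.values()):
                family[name] = seq
                changed = True
    return family


def mine_ests(family: Dict[str, str], ests: Dict[str, str],
              known_human: Dict[str, str], hit_threshold: float,
              known_threshold: float) -> Tuple[List[str], List[str]]:
    """Search ESTs with every family member, then split hits into those that
    match an already characterized human gene and new candidates."""
    known, candidates = [], []
    for est_id, peptide in ests.items():
        if not any(similarity(peptide, m) >= hit_threshold
                   for m in family.values()):
            continue  # not related to the family
        if any(similarity(peptide, h) >= known_threshold
               for h in known_human.values()):
            known.append(est_id)
        else:
            candidates.append(est_id)
    return sorted(known), sorted(candidates)


# Hypothetical toy data, invented for illustration only.
seed = "MKTLLVAGGSW"
proteins = {
    "rat_T1": "MKTLLVAGGSWQ",
    "yeast_T2": "VAGGSWQPLNRE",   # joins via rat_T1, not via the seed directly
    "unrelated": "DDDEEEHHHKKK",
}
ests = {
    "est1": "KTLLVAGG",   # matches a characterized human gene
    "est2": "SWQPLNKE",   # found only through the distant family member
    "est3": "HHHKKKDD",   # unrelated to the family
}
known_human = {"human_T1": "MKTLLVAGGSWA"}

family = compile_family(seed, proteins, threshold=0.45)
known, candidates = mine_ests(family, ests, known_human,
                              hit_threshold=0.45, known_threshold=0.9)
print(sorted(family), known, candidates)
```

In this toy run, `est2` is recovered only because the whole family, rather than the seed alone, is used as the query set, which is the motivation the abstract gives for mining with the entire family.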

Authors and Affiliations

Shoshana Brown, Jean L. Chang, Wolfgang Sadee, Patricia C. Babbitt

  • EP ID EP681959
  • DOI  10.1208/ps050101

How To Cite

Shoshana Brown, Jean L. Chang, Wolfgang Sadee, Patricia C. Babbitt (2003). A semiautomated approach to gene discovery through expressed sequence tag data mining: Discovery of new human transporter genes. The AAPS Journal, 5(1), -. https://europub.co.uk/articles/-A-681959