A semiautomated approach to gene discovery through expressed sequence tag data mining: Discovery of new human transporter genes

Journal Title: The AAPS Journal - Year 2003, Vol 5, Issue 1

Abstract

Identification and functional characterization of the genes in the human genome remain a major challenge. A principal source of publicly available information used for this purpose is the National Center for Biotechnology Information database of expressed sequence tags (dbEST), which contains over 4 million human ESTs. To extract the information buried in this data more effectively, we have developed a semiautomated method to mine dbEST for uncharacterized human genes. Starting with a single protein input sequence, a family of related proteins from all species is compiled. This entire family is then used to mine the human EST database for new gene candidates. Evaluation of putative new gene candidates in the context of a family of characterized proteins provides a framework for inference of the structure and function of the new genes. When applied to a test data set of 28 families within the major facilitator superfamily (MFS) of membrane transporters, our protocol found 73 previously characterized human MFS genes and 43 new MFS gene candidates. Development of this approach provided insights into the problems and pitfalls of automated data mining using public databases.
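The two-stage workflow described in the abstract (compile a family from one seed protein, then search the EST set with every family member) can be illustrated with a toy sketch. This is not the authors' implementation: the k-mer overlap score is an assumed stand-in for a real sequence similarity search, the ESTs are assumed to be already translated into peptide fragments, and every name, sequence, and threshold below is hypothetical.

```python
from typing import Dict, List, Set, Tuple


def kmers(seq: str, k: int = 3) -> Set[str]:
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def overlap(a: str, b: str, k: int = 3) -> float:
    """Shared k-mers divided by the smaller k-mer set (toy similarity score)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / min(len(ka), len(kb))


def compile_family(seed: str, proteins: Dict[str, str],
                   threshold: float = 0.5) -> Dict[str, str]:
    """Stage 1: collect proteins from any species that resemble the seed."""
    family = {"seed": seed}
    for name, seq in proteins.items():
        if overlap(seed, seq) >= threshold:
            family[name] = seq
    return family


def mine_ests(family: Dict[str, str], ests: Dict[str, str],
              known_human: Dict[str, str], hit_threshold: float = 0.5,
              known_threshold: float = 0.9) -> Tuple[List[str], List[str]]:
    """Stage 2: search ESTs with every family member, then split the hits
    into ESTs explained by a characterized human gene and new candidates."""
    known, candidates = [], []
    for est_id, est_seq in ests.items():
        best = max(overlap(est_seq, member) for member in family.values())
        if best < hit_threshold:
            continue  # no family member resembles this EST
        if any(overlap(est_seq, gene) >= known_threshold
               for gene in known_human.values()):
            known.append(est_id)
        else:
            candidates.append(est_id)
    return known, candidates


# Hypothetical toy data (made-up sequences, not real transporters).
seed = "MKTLLVAGGSAL"
proteins = {
    "mouse_T1": "MKTLLVAGGSAV",
    "yeast_T2": "MKTLLVAGWQPRFD",
    "unrelated": "QQQPPPHHHEEE",
}
ests = {"est1": "LLVAGGSA", "est2": "GWQPRFD", "est3": "HHHEEEQQ"}
known_human = {"human_known": seed}

family = compile_family(seed, proteins)
known, candidates = mine_ests(family, ests, known_human)
print("family:", sorted(family))
print("known:", known, "candidates:", candidates)
```

In this toy data, est2 resembles only the hypothetical yeast family member and not the seed, which is the motivation for searching with the whole family rather than with the single input sequence.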

Authors and Affiliations

Shoshana Brown, Jean L. Chang, Wolfgang Sadee, Patricia C. Babbitt

  • EP ID EP681959
  • DOI  10.1208/ps050101

How To Cite

Shoshana Brown, Jean L. Chang, Wolfgang Sadee, Patricia C. Babbitt (2003). A semiautomated approach to gene discovery through expressed sequence tag data mining: Discovery of new human transporter genes. The AAPS Journal, 5(1), -. https://europub.co.uk/articles/-A-681959