CPLSTool: A Framework to Generate Automatic Bioinformatics Pipelines

Journal Title: Biomedical Journal of Scientific & Technical Research (BJSTR) - Year 2018, Vol 11, Issue 5

Abstract

Many bioinformatics tools have been developed for data analysis, each focusing on a specific problem. However, a single program is rarely enough to complete a data-mining task. We developed CPLSTool (https://github.com/maoshanchen/CPLSTool), which combines multiple bioinformatics tools into a pipeline that can be reused for repeated data analysis. The most significant advantage of CPLSTool over step-by-step analysis is that it removes the waiting time between steps. In addition, some analysis steps can be run in parallel to reduce the overall running time. We used CPLSTool to build an automatic pipeline based on QIIME and analyzed skin 16S rRNA data. The results showed that CPLSTool saved a total of 102 minutes, and the visualization it produced made the results easier to interpret. CPLSTool can be applied to any kind of data analysis, including genomic, transcriptomic, proteomic and metagenomic data analysis. Using CPLSTool will improve our understanding of data analysis and save time and computing resources.

The last decade has witnessed the rapid development of Next-Generation Sequencing (NGS) techniques, including Transcriptome Sequencing (RNA-Seq), Whole-Genome and Whole-Exome Sequencing (WGS/WXS), Metagenomics, Chromatin Immunoprecipitation or Methylated DNA Immunoprecipitation followed by Sequencing (ChIP-Seq or MeDIP-Seq), and a multitude of more specialized protocols, such as Cross-Linking Immunoprecipitation (CLIP-Seq), Assay for Transposase-Accessible Chromatin Using Sequencing (ATAC-Seq), and Formaldehyde-Assisted Isolation of Regulatory Elements (FAIRE-Seq) [1]. Each NGS technique arrived with one or more analysis applications, and many bioinformatics tools now exist for general and specialized research purposes, such as BWA [2], ExScalibur [3], Chipster [4], Churchill [5], NEAT [6], MG-RAST [7], TopHat [8] and QIIME [9]. However, these tools have some drawbacks. For example: i) some tools, such as BWA and TopHat, concentrate on a single analysis step instead of covering the whole analysis; ii) it is difficult to add new analysis content to existing integrated pipelines, such as NEAT; iii) some tools, such as MG-RAST, are web-server based, so the analysis is sometimes limited by internet speed; and iv) some tools, such as QIIME, require step-by-step operation, whereas a whole analysis calls for an automatic pipeline.

Moreover, the tremendous amount of NGS output calls for ways to speed up the analysis. It is therefore important to organize the related tools and software into automatic pipelines within a reasonable time, and to speed up the overall procedure using parallelization and acceleration technologies [10]. To address this need, a program should be designed with the following features: i) management of related tools and programs regardless of their programming language and input file formats; ii) flexibility to add new content; iii) generation of an automatic pipeline instead of step-by-step operations; and iv) use of parallelization and acceleration technologies. We developed CPLSTool, which provides all of the above features. CPLSTool is freely available at https://github.com/maoshanchen/CPLSTool.
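The idea of an automatic pipeline whose independent steps run in parallel can be illustrated with a minimal Python sketch. This is not CPLSTool's implementation or interface; the function `run_pipeline`, its parameters, and the step names in the demo are hypothetical and chosen only for illustration. Each step is a plain callable here, where a real pipeline would launch an external tool. Steps are grouped into "waves": every step whose prerequisites have finished is submitted at once, so steps that do not depend on each other run concurrently.

```python
from concurrent.futures import ThreadPoolExecutor


def run_pipeline(steps, deps, max_workers=4):
    """Run every callable in `steps`, honoring the prerequisites in `deps`.

    steps: dict mapping a step name to a zero-argument callable (hypothetical).
    deps:  dict mapping a step name to the set of step names it must wait for.
    Returns the list of waves, each wave being the sorted names run together.
    """
    done = set()
    waves = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while len(done) < len(steps):
            # A step is ready once all of its prerequisites have finished.
            ready = sorted(
                name for name in steps
                if name not in done and deps.get(name, set()) <= done
            )
            if not ready:
                raise ValueError("circular or unsatisfiable dependency")
            # list() waits for the whole wave and re-raises any step's error.
            list(pool.map(lambda name: steps[name](), ready))
            done.update(ready)
            waves.append(ready)
    return waves


if __name__ == "__main__":
    # Hypothetical step names; each step only prints its own name.
    names = ["quality_filter", "pick_otus", "alpha_diversity", "beta_diversity"]
    demo_steps = {n: (lambda n=n: print("running", n)) for n in names}
    demo_deps = {
        "pick_otus": {"quality_filter"},
        "alpha_diversity": {"pick_otus"},
        "beta_diversity": {"pick_otus"},
    }
    print(run_pipeline(demo_steps, demo_deps))
```

In this demo the two diversity steps depend only on `pick_otus`, so they fall into the same wave and run side by side, while the earlier steps run one after another without any manual intervention between them.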

Authors and Affiliations

Sifen Lu, Jing Song, Maoshan Chen

  • EP ID EP592718
  • DOI 10.26717/BJSTR.2018.11.002172

How To Cite

Sifen Lu, Jing Song, Maoshan Chen (2018). CPLSTool: A Framework to Generate Automatic Bioinformatics Pipelines. Biomedical Journal of Scientific & Technical Research (BJSTR), 11(5), 8863-8867. https://europub.co.uk/articles/-A-592718