Research Article published in BMC Bioinformatics
Background: One of the major challenges facing investigators in the microbiome field is turning large numbers of
reads generated by next-generation sequencing (NGS) platforms into biological knowledge. Effective analytical workflows
that guarantee reproducibility, repeatability, and result provenance are essential requirements of modern microbiome
research. Over nearly a decade, several state-of-the-art bioinformatics tools have been developed for understanding
microbial communities living in a given sample. However, most of these tools comprise many functions that require
an in-depth understanding of their implementation, and users must choose additional tools to visualize the final output.
Furthermore, microbiome analysis can be time-consuming and may require advanced programming skills
that some investigators lack.
Results: We have developed a wrapper named iMAP (Integrated Microbiome Analysis Pipeline) to provide the
microbiome research community with a user-friendly and portable tool that integrates bioinformatics analysis and data
visualization. The iMAP tool wraps functionalities for metadata profiling, quality control of reads, sequence processing and
classification, and diversity analysis of operational taxonomic units. The pipeline can also generate web-based
progress reports that support an approach referred to as review-as-you-go (RAYG). For the most part, microbial
communities are profiled using functionalities implemented in the Mothur or QIIME2 platform. The pipeline also uses various R
packages for graphics and R Markdown for generating progress reports. We have used a case study to demonstrate the
application of the iMAP pipeline.
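
As an illustration of what "diversity analysis of operational taxonomic units" can involve, the following minimal sketch computes two common alpha-diversity measures, observed richness and the Shannon index, from per-sample OTU counts. It is not iMAP code: the pipeline delegates such calculations to Mothur or QIIME2, and the function names (shannon_index, observed_richness) and the toy count table below are assumptions made purely for illustration.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over OTUs with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0  # an empty sample carries no diversity information
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def observed_richness(counts):
    """Number of OTUs observed at least once in the sample."""
    return sum(1 for c in counts if c > 0)


# Hypothetical OTU count table (made-up numbers, one list of OTU counts per sample).
otu_table = {
    "sample_A": [25, 25, 25, 25],  # evenly distributed community
    "sample_B": [97, 1, 1, 1],     # community dominated by a single OTU
}

if __name__ == "__main__":
    for sample, counts in otu_table.items():
        print(sample, observed_richness(counts), round(shannon_index(counts), 3))
```

Both toy samples contain the same number of OTUs, but the evenly distributed sample yields the higher Shannon index, which is why richness and evenness-aware measures are usually reported together.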
Conclusions: The iMAP pipeline integrates several functionalities for better identification of microbial communities
present in a given sample. The pipeline performs in-depth quality control that guarantees high-quality results and
accurate conclusions. The vibrant visuals produced by the pipeline facilitate a better understanding of the complex and
multidimensional microbiome data. The integrated RAYG approach enables the generation of web-based reports, which
provide investigators with intermediate output that can be reviewed progressively. The intensively analyzed case
study provides a model for microbiome data analysis.