Cloud-BS: A MapReduce-based bisulfite sequencing aligner on cloud
- Authors
- Choi, Joungmin; Park, Yoonjae; Kim, Sun; Chae, Heejoon
- Issue Date
- Dec-2018
- Publisher
- IMPERIAL COLLEGE PRESS
- Keywords
- Bisulfite sequencing; aligner; distributed; MapReduce; cloud
- Citation
- JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, v.16, no.6
- Journal Title
- JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY
- Volume
- 16
- Number
- 6
- URI
- https://scholarworks.sookmyung.ac.kr/handle/2020.sw.sookmyung/4146
- DOI
- 10.1142/S0219720018400280
- ISSN
- 0219-7200; 1757-6334
- Abstract
- In recent years, many studies have used DNA methylome data to answer fundamental biological questions. Bisulfite sequencing (BS-seq) enables genome-wide measurement of absolute DNA methylation levels at single-nucleotide resolution. However, because of the ambiguity introduced by bisulfite treatment, alignment is still considered a heavy burden, especially in large-scale epigenetic research. We present Cloud-BS, an efficient BS-seq aligner designed for parallel execution in a distributed environment. Using the Apache Hadoop framework, Cloud-BS splits sequencing reads into multiple blocks and transfers them to distributed nodes. By separating each alignment procedure into map and reduce tasks, with an internal key-value structure optimized for the MapReduce programming model, the algorithm significantly improves alignment performance without sacrificing mapping accuracy. In addition, Cloud-BS minimizes the inherent burden of configuring a distributed environment by providing a pre-configured cloud image. Cloud-BS shows significantly improved bisulfite alignment performance compared with existing BS-seq aligners. We believe our algorithm facilitates large-scale methylome data analysis. The algorithm is freely available at https://paryoja.github.io/Cloud-BS/.
- Appears in Collections
- College of Engineering > Division of Software > 1. Journal Articles
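
The abstract describes splitting reads into blocks and expressing alignment as map and reduce tasks over key-value pairs. The following is a minimal single-process sketch of that general pattern, not the Cloud-BS implementation: it assumes a three-letter (C-to-T) in-silico conversion, exact matching on the forward strand only, and a reducer that keeps uniquely mapped reads. All function names (`c_to_t`, `split_reads`, `map_block`, `shuffle`, `reduce_hits`, `run`) are hypothetical and chosen for illustration; a real deployment would run the map and reduce steps as Hadoop tasks and use an indexed aligner that tolerates mismatches.

```python
from collections import defaultdict


def c_to_t(seq):
    """In-silico bisulfite conversion: treat every C as T."""
    return seq.upper().replace("C", "T")


def split_reads(reads, block_size):
    """Split the read list into fixed-size blocks (one per simulated node)."""
    return [reads[i:i + block_size] for i in range(0, len(reads), block_size)]


def map_block(block, reference):
    """Map task: emit (read_id, position) for every exact hit.

    Both read and reference are C-to-T converted before matching, so a
    read matches regardless of which cytosines were methylated.
    """
    converted_ref = c_to_t(reference)
    for read_id, seq in block:
        query = c_to_t(seq)
        start = converted_ref.find(query)
        while start != -1:
            yield read_id, start
            start = converted_ref.find(query, start + 1)


def shuffle(pairs):
    """Group emitted values by key, as the MapReduce shuffle phase would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_hits(positions):
    """Reduce task: keep a read only if it maps to exactly one position."""
    return positions[0] if len(positions) == 1 else None


def run(reads, reference, block_size=2):
    """Run map, shuffle, and reduce sequentially over all blocks."""
    pairs = []
    for block in split_reads(reads, block_size):
        pairs.extend(map_block(block, reference))
    results = {}
    for read_id, positions in shuffle(pairs).items():
        position = reduce_hits(positions)
        if position is not None:
            results[read_id] = position
    return results


if __name__ == "__main__":
    reference = "ACGTTCGAACCGTA"
    reads = [
        ("r1", "GTTTGA"),  # unmethylated C read as T
        ("r2", "AACCGT"),  # methylated Cs retained
        ("r3", "GT"),      # too short, maps to more than one position
        ("r4", "GGGG"),    # no match
    ]
    print(run(reads, reference))
```

In this toy setup, the read ID serves as the key so that all candidate positions for one read reach the same reducer, which is where ambiguous reads can be discarded. The key-value layout used by Cloud-BS itself is not specified in this record.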