Error Correction of Second-Generation Sequencing Reads
The introduction of second-generation DNA sequencers has enabled researchers to explore biological information in ways never before possible. These sequencers provide higher throughput than first-generation sequencers at decreasing cost. However, the data produced by these sequencing technologies contain errors that may complicate downstream analyses. The error correction problem involves locating sequencing errors and making edits that correct or remove them. We introduce Pollux, a platform-independent error corrector that identifies and fixes errors produced by second-generation sequencing technologies. We evaluate Pollux on several bacterial data sets. Using standardized test data, Pollux corrects 85% of Roche 454 GS Junior, 86% of Ion Torrent PGM, and 94% of Illumina MiSeq errors. We compare Pollux to several current error correctors. Pollux performs comparably with the most effective correctors on Illumina data and offers significant improvements on Roche 454 and Ion Torrent PGM data. Furthermore, we provide evidence that Pollux can correct errors in the presence of varying coverage and that it improves the quality of sequence assemblies.