Summary: We report a filtering framework that efficiently identifies both polyclonal and independent errors in SOLiD sequence data. The filter uses the quality values reported by SOLiD's primary analysis to identify the two error types. The framework passes high-quality data to a variety of functional genomics applications, including de novo assemblers and sequence-matching programs for SNP calling, improving output quality and reducing the resources needed for analysis.
Availability: This error analysis framework is written in Perl and runs on Mac OS and Linux/Unix systems. The filter, documentation and sample Excel files for quality analysis are available at http://hts.rutgers.edu/filter and are distributed as Open Source software under the GPLv3.0.
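Illustration: The quality-value-based filtering described in the Summary can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the distributed Perl filter. It is written in Python for brevity, and the function names, the 10-call window and all thresholds are hypothetical choices made for illustration. It assumes that a polyclonal read shows too few high-quality calls at the start of the read, and that independent errors appear as isolated low-quality calls anywhere in the read.

```python
# Illustrative sketch only. All rules and thresholds here are assumptions
# chosen for the example; they are not the published filter's parameters.

def parse_qual_line(line):
    """Parse one line of space-separated integer quality values."""
    return [int(tok) for tok in line.split()]

def passes_polyclonal_check(quals, window=10, min_high=3, high_qv=22):
    """Assumed rule: keep the read if at least `min_high` of the first
    `window` calls have a quality value >= `high_qv`."""
    return sum(q >= high_qv for q in quals[:window]) >= min_high

def passes_independent_check(quals, max_low=3, low_qv=10):
    """Assumed rule: keep the read if at most `max_low` calls anywhere in
    the read have a quality value <= `low_qv`."""
    return sum(q <= low_qv for q in quals) <= max_low

def keep_read(quals):
    """A read is kept only if it passes both checks."""
    return passes_polyclonal_check(quals) and passes_independent_check(quals)
```

In this sketch, a read whose quality line parses to uniformly high values is kept, while a read with uniformly low values, or with more than three scattered low-quality calls, is discarded. In practice the thresholds would need tuning to the data set and the downstream application.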