Dataset for Benchmarking
In this study, we evaluate the performance of existing assemblers on one real genome (Pseudomonas syringae) and four hypothetical genomes. These genomes and their short reads are available for download, so users can evaluate the performance of their own assemblers on this dataset. The genomes are described below.
Real Genome (Pseudomonas syringae pv. syringae B728a)
|Pseudomonas syringae pv. syringae B728a
|Illumina's Solexa, Paired end sequencing
|Total number of reads generated
|Total number of Paired end reads
|Length of each read
|400 base pair
|Complete Genome and Short Reads
In this study, we created four hypothetical genomes of 6 MB each, together with their paired-end reads (i.e. Solexa-type). Short reads were generated at coverages of 10X, 20X, 30X and 40X. Each short read is 36 base pairs long, and the insert length is 400 base pairs. We evaluated the performance of different assemblers on these hypothetical genomes. The genomes and their short reads are publicly available, so users can evaluate their own assemblers on this dataset.
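To make the relationship between coverage, read length and read count concrete, the following is a minimal sketch of how paired-end reads of this kind could be simulated. It is not the procedure used to build the dataset, which the text does not describe. The function names (`pairs_needed`, `simulate_pairs`, `reverse_complement`) are hypothetical, and the sketch assumes error-free reads, uniformly random fragment positions, and that "insert length" means the full fragment length from the start of the first read to the end of the second.

```python
import random

def pairs_needed(genome_size, coverage, read_length):
    """Read pairs required so that total sequenced bases are about coverage x genome size.

    Each pair contributes 2 * read_length bases.
    """
    return round(genome_size * coverage / (2 * read_length))

def reverse_complement(seq):
    """Reverse complement of a DNA string over the alphabet A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def simulate_pairs(genome, coverage, read_length=36, insert_length=400, seed=0):
    """Sample error-free paired-end reads from `genome` (assumption: uniform positions).

    The first read is the start of the fragment on the forward strand; the
    second read is the reverse complement of the fragment's last bases.
    """
    rng = random.Random(seed)
    n_pairs = pairs_needed(len(genome), coverage, read_length)
    pairs = []
    for _ in range(n_pairs):
        start = rng.randrange(len(genome) - insert_length + 1)
        fragment = genome[start:start + insert_length]
        forward = fragment[:read_length]
        reverse = reverse_complement(fragment[-read_length:])
        pairs.append((forward, reverse))
    return pairs

if __name__ == "__main__":
    # Read-pair counts for a 6,000,000-base genome with 36-base reads.
    for cov in (10, 20, 30, 40):
        print(f"{cov}X: {pairs_needed(6_000_000, cov, 36)} pairs")

    # Small toy genome so the example runs quickly.
    toy_rng = random.Random(1)
    toy_genome = "".join(toy_rng.choice("ACGT") for _ in range(2000))
    toy_pairs = simulate_pairs(toy_genome, coverage=10)
    print(len(toy_pairs), "pairs simulated from the toy genome")
```

Under these assumptions, 10X coverage of a 6,000,000-base genome with 36-base reads corresponds to roughly 833,000 read pairs, and the count scales linearly with coverage. A real simulator would also model sequencing errors and variation in insert length, both of which are omitted here.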
|Download Short Reads