Genome Sequencing

The whole genome of Rhodococcus imtechensis RKJ300 was sequenced using Illumina HiSeq 1000 paired-end technology. Sequencing generated a total of 30,464,548 paired-end reads of 101 bp each.
We used NGS QC Toolkit v2.1 to retain high-quality (HQ) Illumina reads (cutoff read length for HQ = 70%, cutoff quality score = 20) and to remove vector/adaptor-contaminated reads.
After filtering the raw data, we obtained 22,040,838 high-quality, vector-filtered paired-end reads and 2,753,155 single-end reads.
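The two cutoffs above can be read as a per-read rule: a read counts as HQ when at least 70% of its bases have a quality score of 20 or higher. The following is a minimal sketch of that rule, not the toolkit's own code. The function name `is_high_quality` and the Phred+33 quality encoding are assumptions made for illustration.

```python
def is_high_quality(quality_string, min_quality=20, min_percent=70, phred_offset=33):
    """Return True if at least min_percent of the bases reach min_quality.

    quality_string is the quality line of a FASTQ record.
    phred_offset=33 is an assumption (Phred+33 encoding).
    """
    if not quality_string:
        return False
    # Decode each ASCII character into its Phred quality score.
    scores = [ord(ch) - phred_offset for ch in quality_string]
    hq_bases = sum(1 for q in scores if q >= min_quality)
    # Integer comparison avoids floating-point rounding at the 70% boundary.
    return hq_bases * 100 >= min_percent * len(scores)


# 'I' decodes to Q40 and '#' to Q2 under Phred+33.
passes = is_high_quality("IIIIIII###")   # 7 of 10 bases at Q >= 20
fails = is_high_quality("IIIIII####")    # 6 of 10 bases at Q >= 20
```

With these defaults, the first example read sits exactly on the 70% threshold and is kept, while the second falls below it and is discarded.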
Only the 22,040,838 paired-end reads, corresponding to ~267X coverage, were used for the genome assembly discussed in the genome announcement manuscript.
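The coverage figure follows from the usual depth relation, coverage = (number of reads × read length) / genome size. The sketch below works through that arithmetic. The genome size is not stated in this section, so the code only back-calculates the size implied by the reported ~267X. Because that figure is rounded, the implied size is approximate. The function name `sequencing_depth` is illustrative.

```python
def sequencing_depth(num_reads, read_length_bp, genome_size_bp):
    """Average depth of coverage: total sequenced bases divided by genome size."""
    return num_reads * read_length_bp / genome_size_bp


paired_reads = 22_040_838   # HQ, vector-filtered paired-end reads
read_length = 101           # bp per read (reads are not trimmed)

# Total bases contributed by the reads used for assembly.
total_bases = paired_reads * read_length

# Genome size implied by the reported ~267X coverage (an approximation,
# since 267 is itself a rounded value).
implied_genome_size = total_bases / 267

depth = sequencing_depth(paired_reads, read_length, implied_genome_size)
```

Here `total_bases` is 2,226,124,638 bp, which implies a genome of roughly 8.34 Mb at 267X.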

Quality filtering statistics

| Illumina reads | Forward reads | Reverse reads |
|---|---|---|
| Total number of raw reads | 15,232,274 | 15,232,274 |
| Total number of HQ reads | 11,020,437 | 11,020,437 |
| Percentage of HQ reads | 72.35% | 72.35% |
| Total number of bases | 1,538,459,674 | 1,538,459,674 |
| Total number of bases in HQ reads | 1,113,064,137 | 1,113,064,137 |
| Total number of HQ bases in HQ reads | 1,028,610,488 | 1,018,508,024 |
| Percentage of HQ bases in HQ reads | 92.41% | 91.50% |
| Number of primer/adaptor-contaminated HQ reads | 12 | 6 |
| Total number of HQ filtered reads | 11,020,419 | 11,020,419 |
| Percentage of HQ filtered reads | 72.35% | 72.35% |

Total number of high-quality, vector-filtered single-end reads: 2,753,155
Total number of high-quality, vector-filtered paired-end reads: 22,040,838
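The derived rows of the table can be recomputed from the read counts and the 101 bp read length. The sketch below is a simple consistency check on the reported values. The variable names and the `percent` helper are illustrative, and every input number is taken from the table.

```python
read_length = 101                # bp per read
raw_reads = 15_232_274           # per direction
hq_reads = 11_020_437            # per direction
hq_filtered_reads = 11_020_419   # per direction
hq_bases_in_hq_reads = {"forward": 1_028_610_488, "reverse": 1_018_508_024}


def percent(part, whole):
    """Percentage rounded to two decimals, as reported in the table."""
    return round(100 * part / whole, 2)


total_bases = raw_reads * read_length          # "Total number of bases"
bases_in_hq_reads = hq_reads * read_length     # "Total number of bases in HQ reads"
pct_hq_reads = percent(hq_reads, raw_reads)
pct_hq_filtered = percent(hq_filtered_reads, raw_reads)
pct_hq_bases = {
    direction: percent(count, bases_in_hq_reads)
    for direction, count in hq_bases_in_hq_reads.items()
}

# Forward plus reverse reads give the paired-end total.
paired_end_total = 2 * hq_filtered_reads
# Reads removed per direction by the primer/adaptor filter.
removed_per_direction = hq_reads - hq_filtered_reads
```

The recomputed values match the table: 1,538,459,674 and 1,113,064,137 bases, 72.35% HQ reads, 92.41% and 91.50% HQ bases, and 22,040,838 paired-end reads. The primer/adaptor filter removes 18 reads per direction.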