Overview

SeqSphere+ calculates the procedure statistics automatically. They are shown in the Procedure tab of the Sample Overview and can be exported like epidemiological field data. The read and assembly procedure statistics can also be exported to and imported from SPEC Files.

Contamination Check (Mash Screen)

Field	Description	Submission
Top Species¹⁾	Best matching species in the Mash Screen result
Top Species Identity¹⁾	Identity score for the best matching species
Top Species Shared-Hashes¹⁾	Number of shared hashes for the best matching species
Contamination Check Result¹⁾	Message with the result of the Mash Screen contamination check; if a potential contamination was detected, the field is highlighted yellow as a warning
Potential Contaminating Species²⁾	If found, the second best matching species above the thresholds in the Mash Screen result
Potential Contaminating Species Identity²⁾	Identity score for the second best matching species above the thresholds
Potential Contaminating Species Shared-Hashes²⁾	Number of shared hashes for the second best matching species above the thresholds

¹⁾ only available if processed in a SeqSphere+ pipeline with Contamination Check (Mash Screen) enabled.
²⁾ only available if processed in a SeqSphere+ pipeline with Contamination Check (Mash Screen) enabled and a potential contamination was found.

Read Statistics

Field	Description	Submission
FastQC Per Base Sequence Quality (Forward Reads)³⁾	Base quality check result from FASTQ Quality Control (FastQC) processing for forward reads; if the check produced warnings or failed, the field is highlighted yellow or red, respectively
FastQC Per Base Sequence Quality (Reverse Reads)³⁾	Base quality check result from FASTQ Quality Control (FastQC) processing for reverse reads; if the check produced warnings or failed, the field is highlighted yellow or red, respectively
FastQC Adapter Content³⁾	Adapter content check result from FASTQ Quality Control (FastQC) processing; if the check produced warnings or failed, the field is highlighted yellow or red, respectively
Avg. Coverage (Unassembled)³⁾	Estimated from the genome size of the seed genome and the unprocessed reads
Avg. Coverage (Processed, Unassembled)³⁾	Estimated from the genome size of the seed genome and the processed (trimmed and/or downsampled) reads
Avg. Read Length (Unassembled)³⁾	Average read length of the unassembled reads
Avg. Read Length (Processed, Unassembled)³⁾	Average read length of the processed (trimmed and/or downsampled), unassembled reads
Read Count (Unassembled)³⁾	Number of unassembled reads
Read Count (Processed, Unassembled)³⁾	Number of processed (trimmed and/or downsampled), unassembled reads
Read Base Count (Unassembled)³⁾	Sum of all read bases of the unassembled reads
Read Base Count (Processed, Unassembled)³⁾	Sum of all read bases of the processed (trimmed and/or downsampled), unassembled reads

³⁾ only available if assembled with a SeqSphere+ pipeline
Potentially this field is submitted and published on cgMLST.org when the Sample is submitted.
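
The read statistics above can be illustrated with a minimal Python sketch. It is not the SeqSphere+ implementation: the function name, the dictionary keys, and the assumption that average coverage is simply the read base count divided by the seed genome size are all illustrative.

```python
def read_statistics(read_lengths, seed_genome_size):
    """Summarize a set of unassembled reads.

    read_lengths     -- iterable with the length of each read (hypothetical input)
    seed_genome_size -- genome size of the seed genome in bases
    """
    if seed_genome_size <= 0:
        raise ValueError("seed_genome_size must be positive")

    lengths = list(read_lengths)
    read_count = len(lengths)          # Read Count
    read_base_count = sum(lengths)     # Read Base Count

    # Avg. Read Length: total bases divided by number of reads
    avg_read_length = read_base_count / read_count if read_count else 0.0

    # Avg. Coverage: assumed here to be total read bases / seed genome size
    avg_coverage = read_base_count / seed_genome_size

    return {
        "read_count": read_count,
        "read_base_count": read_base_count,
        "avg_read_length": avg_read_length,
        "avg_coverage": avg_coverage,
    }
```

Calling the function once on the raw reads and once on the trimmed or downsampled reads would give the "Unassembled" and "Processed, Unassembled" variants of each value.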

Assembly Statistics

Field	Description	Submission
Contig Count (Assembled)	Number of contigs in the assembly⁴⁾
N50 (Assembled)	N50 calculated for the assembly⁴⁾
Read Count (Assembled)⁵⁾	Number of reads used in the assembly
Read Fwd Count (Assembled)⁵⁾	Number of forward reads used in the assembly
Read Rev Count (Assembled)⁵⁾	Number of reverse reads used in the assembly
Assembly Base Count	Number of bases in all contigs of the assembly
Approximated Genome Size (Mbases)	Number of bases in all contigs of the assembly in Mbases; if it deviates from the Expected Genome Size by more than 25%, the field is highlighted yellow as a warning
Max Contig Length (Assembled)	Maximum length of a contig in the assembly⁴⁾
Min Contig Length (Assembled)	Minimum length of a contig in the assembly⁴⁾
Mean Contig Length (Assembled)	Mean length of a contig in the assembly⁴⁾
Avg. Coverage (Assembled)⁶⁾	Estimated from the consensus base count and the assembled read base count, or imported from the GenBank entry
Genome Status	Filled if the sequence data was imported from NCBI

⁴⁾ includes all contigs, i.e., also contigs that are shorter than 200 bases
⁵⁾ only available if imported from ACE/BAM or assembled with a SeqSphere+ pipeline
⁶⁾ only available if imported from ACE/BAM, assembled with a SeqSphere+ pipeline, or available in the GenBank entry
Potentially this field is submitted and published on cgMLST.org when the Sample is submitted.
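
As a rough illustration of the assembly statistics, the following Python sketch derives the contig-based values from a list of contig lengths. It is a sketch under stated assumptions, not SeqSphere+ code: the function names and dictionary keys are hypothetical, N50 uses the common textbook definition, the 25% deviation is assumed to be measured relative to the Expected Genome Size, and assembled coverage is assumed to be the assembled read base count divided by the consensus base count.

```python
def n50(contig_lengths):
    """Length of the contig at which the running total, taken from the
    longest contig downwards, first reaches half of all assembled bases."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def assembly_statistics(contig_lengths, expected_genome_size=None):
    """Contig-based statistics; all contigs are counted, including short ones."""
    if not contig_lengths:
        raise ValueError("assembly has no contigs")

    base_count = sum(contig_lengths)
    stats = {
        "contig_count": len(contig_lengths),
        "n50": n50(contig_lengths),
        "assembly_base_count": base_count,
        "approx_genome_size_mbases": base_count / 1_000_000,
        "max_contig_length": max(contig_lengths),
        "min_contig_length": min(contig_lengths),
        "mean_contig_length": base_count / len(contig_lengths),
    }

    if expected_genome_size:
        # Assumption: deviation is relative to the expected genome size
        deviation = abs(base_count - expected_genome_size) / expected_genome_size
        stats["genome_size_warning"] = deviation > 0.25

    return stats


def assembled_coverage(assembled_read_base_count, consensus_base_count):
    """Assumed estimate: assembled read bases per consensus base."""
    return assembled_read_base_count / consensus_base_count
```

For contigs of 100, 200, 300 and 400 bases, the two longest contigs together cover at least half of the 1000 assembled bases, so the N50 in this sketch is 300.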

cgMLST Statistics

Field	Description	Submission
Procedure cgMLST Perc. Good Targets	Percentage of cgMLST targets that passed the initial target QC procedure; if the value is below 90%, the field is highlighted orange as a warning.

This field is submitted and published on cgMLST.org when the Sample is submitted.
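
The 90% rule can be expressed in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the percentage is assumed to be the number of targets that passed QC divided by the total number of cgMLST targets in the scheme.

```python
def percent_good_targets(good_targets, total_targets):
    """Share of cgMLST targets that passed QC, in percent."""
    if total_targets <= 0:
        raise ValueError("total_targets must be positive")
    return 100.0 * good_targets / total_targets


def needs_warning(percent, threshold=90.0):
    """True if the value is below the threshold and would be highlighted."""
    return percent < threshold
```

With these definitions, a Sample with 1800 good targets out of 2000 sits exactly at 90% and is not flagged, because only values below the threshold trigger the warning.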
