Control and Edit Laboratory and Assembly Procedure Details

Overview

The Sequence Specification is used to document the laboratory procedure, the assembly procedure, and the scanning procedure. Specifications can be predefined and managed using the menu item Options | Sequence Specifications.

If a new Sample is created from WGS data, the specifications can be assigned to the data and stored together with the Sample. The stored specifications can be viewed and modified in the Files tab of the Sample Overview. They can also be exported or added to a Comparison Table.

Laboratory procedure details

The laboratory procedure details must be specified manually for an input file, or must be manually defined in the pipeline script.

Field | Description | ENA submission
Nucleic acid extraction | Link to a literature reference, electronic resource or a standard operating procedure (SOP) |
Library source | Library source used (genomic includes PCR products from genomic DNA) | Required
Library strategy | Library strategy used | Required
Library selection | Library selection used | Required
Library construction method | Library construction method used |
Library amplification method | Library amplification method used |
Sequencing protocol | Sequencing protocol used | Required
Library insert size | Library insert size in base pairs (excluding adaptors and/or primers) | Required
Sequencing length | Number of bases of insert sequenced | Required
Sequencing vendor | Sequencing platform producer | Required
Sequencing platform | Sequencing platform used | Required
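
The "Required" column can be read as a checklist before an ENA submission. The following is a minimal sketch of such a check, not part of SeqSphere+: the field names are copied from the table above, while the dictionary `LAB_FIELDS`, the function `missing_required` and the example value are hypothetical names chosen for illustration.

```python
# Field names from the laboratory procedure table; True marks the fields
# flagged "Required" for ENA submission. This mapping is an illustration,
# not an export format of SeqSphere+.
LAB_FIELDS = {
    "Nucleic acid extraction": False,
    "Library source": True,
    "Library strategy": True,
    "Library selection": True,
    "Library construction method": False,
    "Library amplification method": False,
    "Sequencing protocol": True,
    "Library insert size": True,
    "Sequencing length": True,
    "Sequencing vendor": True,
    "Sequencing platform": True,
}


def missing_required(spec):
    """Return the required field names that are absent or blank in spec."""
    missing = []
    for name, required in LAB_FIELDS.items():
        value = spec.get(name)
        if required and (value is None or not str(value).strip()):
            missing.append(name)
    return missing


# Example: a specification where only one required field is filled in.
spec = {"Library source": "GENOMIC", "Library strategy": "  "}
print(missing_required(spec))
```

A whitespace-only entry is treated as missing here, which is an assumption of this sketch rather than documented behaviour.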

Assembly procedure details

The assembly procedure details can be specified manually for an input file, or can be manually defined in the pipeline script. If the pipeline performs an assembly (de novo or mapping), any existing values will be overwritten by the assembly details that were actually used.

Field | Description
Assembly pre-processing | Coverage downsampled and/or trimmed by quality (window and QV)
Assembly type | General assembly approach (de novo or mapping)
Mapping reference genome | NCBI accession number of reference genome against which raw read data were mapped
Assembler | Software used for assembly
Assembler version | Version of software used for assembly
Assembler parameters | Assembler parameters used for assembly
Sequencing comment | Additional information regarding laboratory and assembly metadata
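
The overwrite rule described above can be pictured as a dictionary merge. This is a minimal sketch under stated assumptions: the function name `merge_assembly_details` and the example values are hypothetical, and the sketch assumes that only the fields reported by the pipeline replace manual entries, which the text above does not specify in detail.

```python
def merge_assembly_details(manual, pipeline, assembly_performed):
    """Combine manually entered assembly details with pipeline-reported ones.

    If the pipeline performed an assembly, its values take precedence over
    manual entries for the same field (assumption: fields the pipeline does
    not report are left unchanged).
    """
    merged = dict(manual)
    if assembly_performed:
        merged.update(pipeline)
    return merged


# Placeholder values for illustration only.
manual = {"Assembly type": "mapping", "Sequencing comment": "pilot run"}
pipeline = {"Assembly type": "de novo", "Assembler": "ExampleAssembler"}

print(merge_assembly_details(manual, pipeline, assembly_performed=True))
print(merge_assembly_details(manual, pipeline, assembly_performed=False))
```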

Assembly statistics

Some additional statistics for the assembled sequence are calculated by SeqSphere+. Those fields are also shown in the Sequence Specification table of the Files tab, and they can be exported like normal fields. The values are calculated automatically, and therefore those fields are not shown in the Sequence Specification managing dialog.

Field | Description
Contig Count (Assembled) | Number of contigs in the assembly 3)
Read Count (Assembled) 1) | Number of reads used in the assembly
Read Fwd Count (Assembled) 1) | Number of forward reads used in the assembly
Read Rev Count (Assembled) 1) | Number of reverse reads used in the assembly
Base Count (Assembled) | Number of bases in the assembly
Max Contig Length (Assembled) | Maximum length of a contig in the assembly 3)
Min Contig Length (Assembled) | Minimum length of a contig in the assembly 3)
Mean Contig Length (Assembled) | Mean length of a contig in the assembly 3)
N50 | N50 calculated for the assembly 3)
Avg. Coverage (Unassembled) 2) | Estimated based on the genome size of the reference genome
Avg. Coverage (Processed, Unassembled) 2) | Estimated based on the genome size of the reference genome and processed (trimming and/or downsampling) reads
Avg. Read Length (Unassembled) 2) | Average read length for unassembled reads
Avg. Read Length (Processed, Unassembled) 2) | Average read length for processed (trimming and/or downsampling) and unassembled reads
Read Count (Unassembled) 2) | Number of reads for unassembled reads
Read Count (Processed, Unassembled) 2) | Number of reads for processed (trimming and/or downsampling) and unassembled reads
Read Base Count (Unassembled) 2) | Sum of all read bases for unassembled reads
Read Base Count (Processed, Unassembled) 2) | Sum of all read bases for processed (trimming and/or downsampling) and unassembled reads
Downsampled to Coverage 2) | Expected genome coverage that was used to calculate the ratio for random downsampling
Expected Genome Size for Downsampling 2) | Expected genome size that was used to calculate the ratio for random downsampling
Quality Trimming 2) | Parameters that were used for quality trimming of reads

1) only available if imported from ACE/BAM or assembled with a SeqSphere+ pipeline
2) only available if assembled with a SeqSphere+ pipeline
3) includes all contigs, i.e., also contigs that are smaller than 200 bases
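
To make the contig and coverage fields more concrete, the sketch below computes them from plain numbers using the commonly used definitions. It is an illustration, not the SeqSphere+ implementation: the exact formulas used by SeqSphere+ are not stated here, and the function names `contig_stats`, `estimated_coverage` and `downsampling_ratio` are hypothetical. In line with footnote 3, no minimum contig length filter is applied.

```python
from statistics import mean


def contig_stats(contig_lengths):
    """Summarise an assembly from its contig lengths (all contigs included)."""
    if not contig_lengths:
        raise ValueError("at least one contig is required")
    ordered = sorted(contig_lengths, reverse=True)
    total = sum(ordered)
    # N50 (common definition): length of the contig at which the running sum
    # of contig lengths, taken from longest to shortest, first reaches at
    # least half of the total assembly length.
    running = 0
    n50 = ordered[-1]
    for length in ordered:
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {
        "contig_count": len(ordered),
        "base_count": total,
        "max_contig_length": ordered[0],
        "min_contig_length": ordered[-1],
        "mean_contig_length": mean(ordered),
        "n50": n50,
    }


def estimated_coverage(read_base_count, genome_size):
    """Average coverage, assumed here to be total read bases / genome size."""
    return read_base_count / genome_size


def downsampling_ratio(target_coverage, read_base_count, genome_size):
    """Fraction of reads to keep to reach the target coverage (assumed formula)."""
    current = estimated_coverage(read_base_count, genome_size)
    if current <= target_coverage:
        return 1.0  # already at or below the target, keep all reads
    return target_coverage / current


stats = contig_stats([100, 200, 300, 400])
print(stats["n50"], stats["mean_contig_length"])
print(downsampling_ratio(50, 3_000_000, 30_000))
```

For the four example contigs, the two longest (400 and 300 bases) together cover 700 of 1000 bases, so the N50 under this definition is 300.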

Scanning procedure details

Example of Scanning procedure details

The scanning procedure details are stored as one text field. In contrast to the other two sections, these details are not editable and are not associated with an assembly file. They are always filled in automatically by SeqSphere+ and associated with a Sample.