Overview

Laboratory and assembly procedure details and procedure statistics can be exported to and imported from proprietary SPEC files.

There are three application scenarios where SPEC files are useful:

Exporting/importing contig files from Samples together with the procedure details and statistics

The menu function File | Export Assembly Contigs can be used to export the FASTA assembly contigs for multiple Samples. By default this also stores a separate SPEC file for each FASTA file, with the same name but extension ".spec". If exported FASTA files are later imported by SeqSphere+, the SPEC files are automatically detected and imported as procedure details and statistics. Optionally, metadata and tags can also be exported/imported using SPEC files.

Exporting/importing predefined procedure details

Laboratory and assembly procedure details can be predefined and managed using the menu function Options | Procedure Details. The predefined sets can be exported to SPEC files. They can also be imported from SPEC files that were exported from existing Samples in the Procedure tab.

Forwarding procedure details and statistics from an external pipeline

If an external processing pipeline outside of SeqSphere+ (e.g., to run assemblers other than Velvet/SPAdes) is used to create contig files for later import into SeqSphere+, SPEC files can be used to hand over some or all of the laboratory and assembly procedure details and the read and assembly statistics. Each SPEC file must be located in the same directory as its contig file and must have the same name as the contig file, but with the extension ".spec". The SPEC files are then automatically processed by SeqSphere+ together with the contig files (FASTA or ACE/BAM files). To be automatically processed together with FASTQ raw read files, the SPEC file name should contain only the Sample ID before the ".spec" extension.

If the SPEC file is named "procedure_details.spec" it is automatically used for all files in the directory.
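The naming rules above can be sketched for an external pipeline as follows. This is a minimal illustration, not SeqSphere+ code: the helper names spec_path_for and find_spec are hypothetical, and the lookup order (a per-file SPEC file is preferred over the shared "procedure_details.spec") is an assumption, since the text does not state which one takes precedence.

```python
from pathlib import Path

# Shared SPEC file name that applies to all files in a directory (from the text above).
SHARED_SPEC_NAME = "procedure_details.spec"


def spec_path_for(contig_path):
    """Return the per-file SPEC path: same directory and base name, extension .spec."""
    return Path(contig_path).with_suffix(".spec")


def find_spec(contig_path):
    """Return the SPEC file that would accompany contig_path, or None if there is none.

    Assumption: a per-file SPEC file is checked before the shared one.
    """
    own = spec_path_for(contig_path)
    if own.is_file():
        return own
    shared = Path(contig_path).parent / SHARED_SPEC_NAME
    return shared if shared.is_file() else None
```

A pipeline could call spec_path_for("run1/sample1.fasta") to decide where to write the SPEC file for that contig file. Note that with_suffix replaces only the last extension, so compressed or double-extension file names would need separate handling.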

Target scan and target QC procedure details can neither be exported to nor imported from a SPEC file.

SPEC File Format

SPEC files must have the file name extension ".spec". The content of a SPEC file is plain text (UTF-8) where each line stores a single field and value, in the format: field=value. The available fields are shown below. The fields may be in any order.

Example:

assembler_parameters=optimized k and cut-off for highest average length of contigs with length 1000 or above, paired end, 4 parallel procs; best k\=113; best cutoff\=6; velveth\: 113 -fastq -shortPaired; velvetg\: -exp_cov auto -cov_cutoff 6 -clean yes -read_trkg yes -amos_file yes -max_branch_length 200 -scaffolding no
assembler=Velvet
assembler_version=1.1.04
assembly_pre-processing=trim until avg. quality is 30 in window of 20; downsample to coverage 120 (unassembled, exp. genome size 2.94 Mbases)
assembly_type=de novo
avg._contig_length_(assembled)=111790
avg._coverage_(assembled)=102
avg._coverage_(processed,_unassembled)=106
avg._coverage_(unassembled)=110
avg._read_length_(unassembled)=221
consensus_base_count_(assembled)=2906556
contig_count_(assembled)=26
downsampled_to_coverage=120
expected_genome_size_for_downsampling=2.94
library_amplification_method=Illumina Bridge Amplification
library_construction_method=Nextera XT
library_insert_size=500bp
library_selection=random
library_source=genomic
library_strategy=WGS
max_contig_length_(assembled)=473098
min_contig_length_(assembled)=225
n50_(assembled)=293353
nucleic_acid_extraction=MagAttract HMW DNA Kit (Qiagen, Hilden, Germany)
quality_trimming=qual 30 in window of 20
read_base_count_(assembled)=296113983
read_base_count_(unassembled)=323514957
read_count_(assembled)=1458624
read_count_(unassembled)=1458624
read_fwd_count_(assembled)=659534
read_rev_count_(assembled)=657667
sequencing_length=250bp
sequencing_platform=MiSeq
sequencing_protocol=paired-end reads
sequencing_vendor=Illumina
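
The field=value format can be read and written by an external pipeline with a few lines of code. The following is a minimal sketch under stated assumptions, not the SeqSphere+ implementation: the function names format_spec and parse_spec are hypothetical, and the escaping rule (a backslash before "=", ":" and "\" inside values) is inferred from the example above, where it resembles the escaping used in Java properties files.

```python
import re


def format_spec(fields):
    """Serialize a dict of field names and values into SPEC text, one field=value per line.

    Assumption: '=', ':' and '\\' inside values are escaped with a backslash,
    as seen in the assembler_parameters line of the example.
    """
    lines = []
    for name, value in fields.items():
        escaped = re.sub(r"([\\=:])", r"\\\1", str(value))
        lines.append(f"{name}={escaped}")
    return "\n".join(lines) + "\n"


def parse_spec(text):
    """Parse SPEC text into a dict; field order does not matter."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip empty lines
        # Split at the first '=' that is not preceded by a backslash.
        parts = re.split(r"(?<!\\)=", line, maxsplit=1)
        if len(parts) != 2:
            continue  # not a field=value line
        name, value = parts
        # Remove the escaping backslashes from the value.
        fields[name.strip()] = re.sub(r"\\(.)", r"\1", value)
    return fields
```

For example, a pipeline could collect its statistics in a dict such as {"assembler": "Velvet", "contig_count_(assembled)": 26}, pass it to format_spec, and save the result as UTF-8 text next to the contig file with the extension ".spec".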