Preparation
Before whole genome sequence data can be imported, a Project with at least one Task Template for whole genome sequencing data must exist in the database.
Choose Input WGS Data
Use the menu function to choose the input WGS data. A Project and at least one Task Template must be selected. If a single Task Template is selected, the process can be limited to specific targets using the Define Targets checkbox. In the Input Sequence Data section, the files with whole genome sequence data can be selected. It is possible to either
Allowed input file formats are FASTA, GenBank, SAM/BAM, and ACE files. If SAM/BAM files do not contain a reference sequence, a dialog window opens in which a FASTA file with the reference sequence can be specified. The sequence names in the FASTA file must match the names in the SAM/BAM file. If one or more files are added to the list, a dialog with details is shown. This dialog allows you to view and edit:
If the Sample ID already exists in the database, the contigs are added to this Sample if they do not exist yet or if overwriting is enabled. Press the Target Scan Procedure Details... button to show the parameters:
Click OK to start the process.

Scanning Targets in WGS Data
Now the reference sequences from the Task Template are searched for in the WGS data using the integrated BLAST. If a unique hit exists that meets the thresholds that were defined in the Task Template or overwritten in the Scanning Procedure Details, the target is found.
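The "unique hit that meets the thresholds" rule can be illustrated with a minimal Python sketch. This is not the software's implementation: the `Hit` class, the `find_target` function, and the two threshold metrics (percent identity and percent of the reference aligned) are assumptions made for illustration, since the actual threshold parameters are defined in the Task Template.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hit:
    contig: str
    identity: float  # percent identity to the reference sequence (assumed metric)
    aligned: float   # percent of the reference covered by the hit (assumed metric)


def find_target(hits: List[Hit], min_identity: float, min_aligned: float) -> Optional[Hit]:
    """Return the hit if exactly one hit meets both thresholds, else None."""
    passing = [h for h in hits
               if h.identity >= min_identity and h.aligned >= min_aligned]
    # Zero passing hits: target not found. Several passing hits: ambiguous.
    return passing[0] if len(passing) == 1 else None


# Hypothetical hits for one target in one input file.
hits = [Hit("contig_1", 99.2, 100.0), Hit("contig_7", 84.0, 60.0)]
result = find_target(hits, min_identity=90.0, min_aligned=99.0)
```

In this sketch a target with two passing hits is treated the same as a target with none, which mirrors the requirement that the hit be unique.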
Preview the Found Targets
If batch mode is disabled, a table with all found hits is shown for each input data file. Each row in this table represents one target that was searched for. Rows that are highlighted in red do not fulfill the defined thresholds. Rows for targets that already exist in a Sample with the same name are disabled. To enable overwriting of existing target sequences, mark the checkbox Allow to replace existing targets.

The first column of the table contains a checkbox that defines whether the found region should be extracted as the sequence for the searched target. By default, only targets that fulfill the thresholds unambiguously and that are not already present in an existing Sample are selected. The thresholds can be changed in this preview, and the selection marks in the first column are then updated automatically. The selection marks can also be changed manually, row by row. Press the confirm button at the bottom of the window to create the new Samples or to extend existing ones.

Importing the Found Targets
Now the regions that match the found targets are extracted from the input data and added to new or existing Samples. If the input data contains read information (ACE/BAM file), the aligned reads for these regions are also extracted and imported according to the advanced settings. However, with default settings the read data is discarded if the target passes all analysis checks, in order to reduce disk storage usage.
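The default state of the checkboxes in the preview table can be summarized in a short Python sketch. The `Row` class and both functions are hypothetical names used only to restate the rules from the preview section; how the software behaves internally is not documented here.

```python
from dataclasses import dataclass


@dataclass
class Row:
    target: str
    meets_thresholds: bool  # unique hit that fulfills the thresholds
    exists_in_sample: bool  # target already stored in a Sample with the same name


def is_enabled(row: Row, allow_replace: bool) -> bool:
    """Rows for existing targets are disabled unless replacing is allowed."""
    return allow_replace or not row.exists_in_sample


def selected_by_default(row: Row) -> bool:
    """Only unambiguous, not yet existing targets are pre-selected."""
    return row.meets_thresholds and not row.exists_in_sample


# Hypothetical preview rows for one input file.
rows = [
    Row("target_A", meets_thresholds=True, exists_in_sample=False),
    Row("target_B", meets_thresholds=False, exists_in_sample=False),
    Row("target_C", meets_thresholds=True, exists_in_sample=True),
]
preselected = [r.target for r in rows if selected_by_default(r)]
```

Changing a threshold in the preview corresponds to recomputing `meets_thresholds` for every row and then reapplying `selected_by_default`.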