1 Overview

This tutorial describes how to use the Ridom SeqSphere+ software to analyze Sanger sequence data (e.g., chromatogram files) with Multi Locus Sequence Typing (MLST).

It also explains how to create a Task Template for automated sequence analysis. The MLST scheme for N. meningitidis serves as the example, but after working through this tutorial you should be able to define your own MLST templates for other species.

2 Preliminaries

  • Step 1: This tutorial requires a running SeqSphere+ client and server. Start the SeqSphere+ server, then start the SeqSphere+ client and initialize the database. For evaluation purposes, a free evaluation license can be requested.

3 Creating Project with Epi Database Scheme and Task Template

  • Step 1: Create a new Project for use with your sample data with the menu: File | New | Create Project
  • Step 2: Enter a name in the field Project Name (e.g., Neisseria MLST Sanger). The fields Category and Acronym can be left empty.
  • Step 3: Each Project within SeqSphere+ needs to have at least one Task Template associated with it. Press the Add button in the Task Templates section.
  • Step 4: The dialog window Add Task Template to Project opens. Press the Create New button.

 

  • Step 5: Choose Create Task Template for Sanger Sequencing Data.

 

  • Step 6: Now choose Create Task Template by Predefined MLST Scheme.

 

  • Step 7: Choose the organism entry Neisseria spp.; the scheme data is then downloaded from the public MLST server.

 

  • Step 8: Once the download has finished, click Next to continue.

 

  • Step 9: Now choose Define File naming Automatically from Example Files.

 

  • Step 10: Defining the file naming is important because it enables batch processing of sequence files. Press the Add Example Files button, select all scf files from the tutorial example data directory, and confirm with Open. Then press Next to continue.

 

  • Step 11: SeqSphere+ tries to guess the file naming from the example files. The green OK icon at the bottom indicates that a file naming was found that matches all example files. If the file naming is not detected automatically, the Sample ID and target parts of the file name must be configured manually. For the example data it is detected automatically. Click Next to continue.
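
The idea behind the file naming in Steps 10 and 11 can be illustrated with a minimal Python sketch. It assumes a hypothetical convention `<SampleID>_<target>_<direction>.scf`; the tutorial files may be named differently, and this is not how SeqSphere+ itself detects the naming. The function names and the pattern are made up for illustration.

```python
import re

# Hypothetical naming convention: <SampleID>_<target>_<direction>.scf
# (an assumption for this sketch; the real tutorial files may differ).
FILE_NAME = re.compile(
    r"^(?P<sample>[^_]+)_(?P<target>[^_]+)_(?P<direction>[FR])\.scf$"
)


def parse_file_name(name):
    """Return (sample, target, direction), or None if the name does not match."""
    match = FILE_NAME.match(name)
    if match is None:
        return None
    return match.group("sample"), match.group("target"), match.group("direction")


def naming_matches_all(names):
    """True when every example file fits the pattern (the 'green icon' case)."""
    return all(parse_file_name(n) is not None for n in names)


examples = ["DE9622_abcZ_F.scf", "DE9622_abcZ_R.scf", "DE9622_fumC_F.scf"]
print(parse_file_name(examples[0]))
print(naming_matches_all(examples))
print(naming_matches_all(examples + ["notes.txt"]))
```

A single file that does not fit the pattern makes the overall check fail, which corresponds to the case where the Sample ID and target parts have to be configured manually.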

 

  • Step 12: This step shows the Target Parameters for the Task Template (e.g., the quality check parameters). They can be left unchanged. Click Next.

 

  • Step 13: Check the name of your new Task Template, and confirm with Finish. Press OK to save the new Task Template and add it to your Project.

 

  • Step 14: In the top row of the Project window the Epi Database Scheme can be selected. This defines the database fields that are available for this Project. For a new Project the Epi Database Scheme Default Bacteria is preselected. Press the preview button on the right to see the details.

 

  • Step 15: This scheme already contains all fields that are normally needed and is compliant with the NCBI BioSample fields. New fields can be added by creating a new Database Scheme that extends the default one. For this tutorial the Database Scheme is left at its default, so close the window. Then save your Project by confirming with OK.

Ridom SeqSphere+ is resequencing software. Once a Project has been set up like this, hundreds or even thousands of sequence files can be analyzed automatically.

4 Importing the Sequence Data

  • Step 1: Choose from the menu File | Process Sanger Sequencing Data
  • Step 2: Press the Open button above the file browser panel on the left, and choose the directory where you extracted the tutorial example data.

 

  • Step 3: Select the tutorial example data directory, or all of the scf files in it, and press the right-arrow (batch) button. Hint: Use CTRL+A to select all files in the directory.

 

  • Step 4: In the preview dialog that opens, select the Project that was just created. The files are now sorted according to the file naming defined above. Each Sample has 7 targets, and each target has 2 chromatograms. Press OK to confirm the preview dialog.

 

  • Step 5: The 42 reads are now sorted into 3 Samples listed in the tree on the right. Each Sample has an MLST Task Entry with 7 targets, one for each locus. Press OK to confirm the dialog and start the assembly.
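
The sorting in Steps 4 and 5 amounts to grouping reads by Sample and then by target: 3 Samples × 7 targets × 2 chromatograms = 42 reads. The following Python sketch shows that grouping under stated assumptions: the sample names are placeholders, and treating the two chromatograms per target as a forward and a reverse read is an assumption, not something the tutorial states.

```python
from collections import defaultdict


def sort_reads(reads):
    """Group (sample, target, direction) tuples as sample -> target -> [directions]."""
    tree = defaultdict(lambda: defaultdict(list))
    for sample, target, direction in reads:
        tree[sample][target].append(direction)
    return tree


# Placeholder sample names; the seven loci are those of the Neisseria MLST scheme.
samples = ["sample_A", "sample_B", "sample_C"]
loci = ["abcZ", "adk", "aroE", "fumC", "gdh", "pdhC", "pgm"]

# Assumption: the two chromatograms per target are a forward (F) and reverse (R) read.
reads = [(s, t, d) for s in samples for t in loci for d in ("F", "R")]

tree = sort_reads(reads)
print(len(reads), "reads in", len(tree), "samples")
for sample, targets in tree.items():
    complete = len(targets) == 7 and all(len(r) == 2 for r in targets.values())
    print(sample, "complete" if complete else "incomplete")
```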

 

  • Step 6: The 3 Samples are now assembled one after the other. They are listed in the navigation tree on the left of the main window. Double-click on the Task Entry item Neisseria MLST Sanger (DE9622) in the navigation tree of the first Sample DE9622.

 

  • Step 7: The MLST results are shown in the right panel of the main window. The combination of the 7 MLST loci of this Sample corresponds to sequence type (ST) 42.
  • Step 8: Two of the 3 Samples have green icons, which means that the target QC procedure succeeded for all 7 MLST loci. Sample D9938, however, has a red icon because target fumC of this Sample failed the target QC procedure. Double-click on the red target icon of target fumC in Sample D9938 to see the details.
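
Step 7 relies on the basic MLST principle that the combination of allele numbers at the 7 loci determines the sequence type. A minimal Python sketch of that lookup follows. The allele numbers and ST values in the table are invented for illustration and are not the real profiles from the public MLST server; the function name is also hypothetical.

```python
LOCI = ("abcZ", "adk", "aroE", "fumC", "gdh", "pdhC", "pgm")

# Toy profile table: these allele numbers and STs are invented for
# illustration only and are NOT real MLST profiles.
PROFILES = {
    (1, 1, 1, 1, 1, 1, 1): 101,
    (1, 2, 1, 3, 1, 1, 2): 102,
}


def sequence_type(alleles):
    """Return the ST for a dict locus -> allele number, or None if it cannot be assigned."""
    try:
        key = tuple(alleles[locus] for locus in LOCI)
    except KeyError:
        return None  # a locus is missing, e.g. a target that failed QC
    return PROFILES.get(key)


complete = dict(zip(LOCI, (1, 2, 1, 3, 1, 1, 2)))
print(sequence_type(complete))

# Without a fumC allele no ST can be assigned.
incomplete = {locus: n for locus, n in complete.items() if locus != "fumC"}
print(sequence_type(incomplete))
```

In this sketch a Sample with a missing locus gets no ST at all, which is why a single failed target matters for the typing result.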

 

  • Step 9: As shown in the warning message on the right, target fumC of this Sample has failed because of too many ambiguities. Click on the Contig link above the warning message to navigate to the contig level.

 

  • Step 10: In the Target QC Procedure pane, click on the row with the error. The cursor jumps to the problematic area, where an ambiguity symbol N can be seen in the read data. This is a wrong base call in the chromatogram; the correct base at this position is clearly a G. Press G on your keyboard to replace the ambiguity N with a G base.
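
The kind of check that fails in Step 9 and is resolved in Step 10 can be sketched in Python as counting ambiguity symbols in a consensus sequence. The threshold `max_ambiguities` and the example sequence are assumptions made for this sketch; the actual SeqSphere+ QC parameters are those shown in the Target Parameters of the Task Template.

```python
# IUPAC nucleotide ambiguity symbols (everything except A, C, G, T).
AMBIGUITY_CODES = set("RYSWKMBDHVN")


def ambiguity_positions(consensus):
    """Return the 0-based positions of all ambiguity symbols."""
    return [i for i, base in enumerate(consensus.upper()) if base in AMBIGUITY_CODES]


def passes_ambiguity_check(consensus, max_ambiguities=0):
    """max_ambiguities is a hypothetical threshold, not a SeqSphere+ default."""
    return len(ambiguity_positions(consensus)) <= max_ambiguities


def replace_base(consensus, position, base):
    """Return a copy of the sequence with one position edited."""
    return consensus[:position] + base + consensus[position + 1:]


consensus = "ATGCCNTTAG"  # made-up sequence, not tutorial data
print(ambiguity_positions(consensus), passes_ambiguity_check(consensus))

# Manual correction after inspecting the chromatogram: N -> G.
corrected = replace_base(consensus, ambiguity_positions(consensus)[0], "G")
print(corrected, passes_ambiguity_check(corrected))
```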

 

  • Step 11: The target QC procedure is updated automatically. Sample D9938 now has a green icon because all targets have succeeded. All edits are logged in an audit trail. Right-click on the Sample node in the navigation tree, and select Show Sample Audit Trail. A new panel appears at the bottom of the main window, listing the history of the Sample entry with detailed information about all edits (who, when, and what).
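
Conceptually, an audit trail like the one in Step 11 is an append-only list of records that capture who made an edit, when, and what was changed. The Python sketch below illustrates that data structure only. The class names, field names, and user name are hypothetical and do not reflect how SeqSphere+ stores its audit trail.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    user: str            # who
    timestamp: datetime  # when
    description: str     # what


@dataclass
class Sample:
    sample_id: str
    audit_trail: list = field(default_factory=list)

    def log_edit(self, user, description):
        """Append a new entry; existing entries are never modified."""
        entry = AuditEntry(user, datetime.now(timezone.utc), description)
        self.audit_trail.append(entry)
        return entry


sample = Sample("D9938")
sample.log_edit("editor1", "fumC: replaced ambiguity N with G")  # hypothetical user
for entry in sample.audit_trail:
    print(entry.timestamp.isoformat(), entry.user, entry.description)
```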

 

5 Store and Retrieve Samples

  • Step 1: Choose File | Save All from the menu to store the 3 Samples in the database on your SeqSphere+ server.
  • Step 2: Choose File | Close All to remove them from the workspace.
  • Step 3: Choose File | Search Samples. Select the Neisseria MLST Sanger project in the Project box, and set Recently modified to 1 day. Then press the Search button.

 

  • Step 4: The 3 Samples that were just saved are listed. Now select the Advanced radio button in the upper right corner of the window.

 

  • Step 5: The window now shows the advanced search mask that can be used to search in specific fields (e.g., 'Neisseria MLST Sanger' ST = 42). Close the window by pressing the Cancel button.

6 Analyzing the MLST Results

  • Step 1: Choose Tools | Comparison Table from the menu to perform phylogenetic analysis.

 

  • Step 2: In the Comparison Table dialog, go to the first tab, Create New. In the Choose Samples section select the project Neisseria MLST Sanger (it should be preselected). Then select MLST in the Choose Genotypings Schemes section at the bottom. Press the Create Comparison Table button to confirm.

 

  • Step 3: The comparison table window opens, showing the ST, some epi metadata fields, the clonal complex (CC), and the 7 allele types of the 3 Samples. The table rows are by default colored by the ST. The comparison table can be used to create phylogenetic trees (neighbor-joining or UPGMA), to export the distance matrix for further usage (e.g., for SplitsTree), or to create minimum spanning trees. Press the Minimum Spanning Tree button in the toolbar to calculate and draw a minimum spanning tree for the 3 Samples.

 

  • Step 4: The minimum spanning tree window opens. All 3 Samples collapse into a single node because they all share ST 42.
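
Why the 3 Samples end up in one node can be shown with a short Python sketch of allelic distance, here taken as the number of loci at which two profiles differ. Samples with identical profiles have distance 0 and are merged. This is only an illustration of the principle: the allele numbers are placeholders, the function names are hypothetical, and the sketch makes no claim about how SeqSphere+ computes its distance matrix or minimum spanning tree.

```python
from itertools import combinations


def allelic_distance(profile_a, profile_b):
    """Number of loci at which two allele profiles differ."""
    return sum(1 for a, b in zip(profile_a, profile_b) if a != b)


def collapse_identical(profiles):
    """Group sample names by identical profile; each group becomes one node."""
    nodes = {}
    for name, profile in profiles.items():
        nodes.setdefault(tuple(profile), []).append(name)
    return list(nodes.values())


# Placeholder profiles: all three samples share the same (invented) alleles,
# mirroring the situation where all samples have the same ST.
profiles = {
    "sample_A": (1, 1, 1, 1, 1, 1, 1),
    "sample_B": (1, 1, 1, 1, 1, 1, 1),
    "sample_C": (1, 1, 1, 1, 1, 1, 1),
}

for (a, pa), (b, pb) in combinations(profiles.items(), 2):
    print(a, b, allelic_distance(pa, pb))
print(collapse_identical(profiles))
```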