Tutorial for MLST+ with M. tuberculosis

Overview

This tutorial describes how to use a predefined MLST+ schema for automated sequence analysis. The Mycobacterium tuberculosis complex serves as the example. After working through the tutorial, you should be able to define your own projects for any other species for which an official MLST+ schema is available on the nomenclature server.

Preliminaries

Step 1: This tutorial requires a running SeqSphere+ client and server. If not done yet:

Download and install the SeqSphere+ 64-bit client and server software on your computer.

Start the SeqSphere+ server, then start the SeqSphere+ client and initialize the database.

For evaluation purposes, a free evaluation license can be requested.

Step 2: Download the example data archive SeqSphere_Examples_WGS_M_tuberculosis.zip for this tutorial and extract the zip file on your computer.
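If you prefer to script the extraction, the following is a minimal Python sketch using only the standard library. The function name extract_archive and the target folder are illustrative assumptions and not part of SeqSphere+; any unzip tool works just as well.

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a zip archive into dest_dir and return the extracted file paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest)
        names = archive.namelist()
    # Directory entries end with "/"; keep only regular files.
    return sorted(dest / name for name in names if not name.endswith("/"))


# Example call (the archive name is from this tutorial; the paths are assumptions):
# files = extract_archive("SeqSphere_Examples_WGS_M_tuberculosis.zip", "seqsphere_examples")
```

The returned list makes it easy to check which files are available before selecting them in the import dialog.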

Defining Project and Task Templates

Step 1: First launch the SeqSphere+ client and connect to your SeqSphere+ Server.

Step 2: Then create a new Project with the menu: File | New | Project

Step 3: Enter a name for the new Project (e.g., "MTBC Outbreak").

Step 4: Press the button Add from Store to add predefined Task Templates to your Project. Task Templates control which kinds of sequences you are working with and what SeqSphere+ should do with them.

Step 5: A list of the organisms for which Task Templates are available on the server is shown. Choose "Mycobacterium tuberculosis" from the list.

Step 6: By default, the MLST+ and the Accessory Task Templates are selected. Press OK to download the Task Templates and add them to your new Project.

Step 7: Finally press OK to save your Project.

Importing the Genome Sequencing Data

Step 1: Choose from the menu File | Create Samples from Assembled Genomes

Step 2: Choose the Project you just created.

Step 3: Press the Add Task button on the right and select both Task Templates of the Project.

Step 4: Now use the button Add from File and choose the BAM files from the downloaded tutorial data archive.

Step 7: Confirm with OK. Ridom SeqSphere+ now loads all input sequences and finds (by using built-in BLAST) each of the target reference sequences that are defined in the Task Template.

Step 8: The preview dialog opens. Normally you can keep the defaults and press the button Create/Extend Samples to continue.

Details: The scanning result for each input sequence is shown in table format, listing all the targets with their percent identity, alignment, start and stop positions, and other relevant data. The first column of the table marks the targets that should be imported into SeqSphere+. By default, only the targets that fulfill the specified identity and alignment thresholds are added to the new Sample entry. Targets that do not have a unique match fulfilling the thresholds are colored red. The thresholds are normally taken from the Task Template, but they can also be changed in this step.
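The selection rule described above can be illustrated with a short Python sketch. It is not SeqSphere+ code: the Hit record, the function select_targets, and the threshold values in the usage example are assumptions chosen for illustration, and the real thresholds come from the Task Template.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    target: str      # name of the reference target (hypothetical field)
    identity: float  # percent identity of the match, 0-100
    aligned: float   # percent of the reference covered by the alignment, 0-100


def select_targets(hits, min_identity, min_aligned):
    """Map each target to its unique passing hit, or to None if there is none.

    A target mapped to None corresponds to a row that would be colored red:
    either no hit reaches both thresholds, or more than one does.
    """
    passing = {}
    for hit in hits:
        passing.setdefault(hit.target, [])
        if hit.identity >= min_identity and hit.aligned >= min_aligned:
            passing[hit.target].append(hit)
    return {
        target: (matches[0] if len(matches) == 1 else None)
        for target, matches in passing.items()
    }


# Usage with made-up hits and illustrative thresholds:
hits = [
    Hit("geneA", 99.5, 100.0),
    Hit("geneB", 98.0, 100.0),
    Hit("geneB", 97.0, 100.0),   # second passing match, so geneB is not unique
    Hit("geneC", 80.0, 100.0),   # below the identity threshold
]
selected = select_targets(hits, min_identity=90.0, min_aligned=100.0)
```

Raising or lowering the two thresholds in this sketch has the same qualitative effect as changing them in the preview dialog: more or fewer targets end up with a unique passing match.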

Editing Samples of Imported Genome Sequencing Data

Step 1: After the import is completed, the navigation tree shows all new Samples. Each Sample node in the navigation tree has two subnodes: the classic MLST task and the MLST+ task. Below the task nodes are the target nodes. Each target node represents one sequence (often a gene) extracted from the input data (genomes or WGS contigs). The targets can have different states:
- Gray Targets were not extracted (because the match did not reach the thresholds in the previous step).
- Green Targets were extracted and fulfill all requirements that are defined in the Task Template Analysis Parameters.
- Yellow Targets were extracted, but fail at least one of the requirements that are defined in those parameters. For example, they may have frame shifts or incorrect lengths compared to the published strain sequence. Those targets must be inspected further (e.g., using the Position Navigator function).
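The three states can be summarized as a small decision rule. The Python sketch below is only an illustration under simplifying assumptions: the function names are hypothetical, and the two checks (length equal to the reference, length divisible by three as a crude frame-shift indicator) stand in for the Analysis Parameters of the Task Template, which may define other or additional requirements.

```python
def failed_checks(sequence, reference_length):
    """Return a list of failed requirements for an extracted target sequence."""
    failures = []
    if len(sequence) != reference_length:
        failures.append("length differs from the reference")
    if len(sequence) % 3 != 0:
        # Crude stand-in for a frame-shift check on a coding sequence.
        failures.append("length is not a multiple of three")
    return failures


def target_state(sequence, reference_length):
    """Classify a target as 'gray', 'green', or 'yellow'.

    sequence is None when the target was not extracted during import.
    """
    if sequence is None:
        return "gray"
    return "green" if not failed_checks(sequence, reference_length) else "yellow"
```

For example, target_state(None, 9) gives "gray", a 9-base sequence against a 9-base reference gives "green", and an 8-base sequence against the same reference gives "yellow".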

Step 2: Click on File | Save All to store the new Samples in the database of your SeqSphere+ server.

Submitting new Allele Types

Step 1: If your sequence data contains new MLST+ alleles, they can be submitted to the MLST+ Nomenclature Server. Right-click on one or more Sample nodes in the navigation tree and choose Assign or Submit new Alleles.

Step 2: If this is the first submission, a submitter account must be created on the MLST+ Nomenclature Server. A dialog is shown to give some contact details. Press OK to perform the registration.

Step 3: The submission form opens up. The submission of new alleles requires some basic information about the Sample. A dialog is shown with an input form that shows all the information that will be submitted. Press OK to perform the submission.

Step 4: After the submission has been performed, the new alleles are assigned to your Sample entries. Click on File | Save All to store the changes in the database of your SeqSphere+ server.

Analyzing the Results

Step 1: When multiple Samples have been imported, a Comparison Table can be created. It shows the allele types of the Samples in table format, so that differences between Samples can be identified. The Comparison Table offers tools for distance calculation and phylogenetic trees, and it can also be exported to Excel spreadsheet format.

Step 2: Choose Tools | Comparison Table from the menu and press New Definition.

Step 3: Enter a name for your Comparison Table definition.

Step 4: Choose your new Project.

Step 5: In the box Typing Results select at least the checkbox for the MLST+ task.

Step 6: Confirm with OK two times.

Step 7: A table with the allele types for the Samples of the Project is shown.

Step 8: Press the Minimum Spanning Tree button in the toolbar to calculate the distances between the Samples and draw a minimum spanning tree for them. If the table contains missing data (targets that have no allele types assigned yet), those columns can be automatically removed from the distance calculation by selecting Remove Columns from Distance Calculation.
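To make the idea behind this step concrete, the following Python sketch computes pairwise allele distances and a minimum spanning tree for a few made-up allele profiles. It is not the SeqSphere+ implementation: the function names are hypothetical, the distance is assumed to be the number of targets with differing allele types, columns are assumed to be dropped when any Sample has a missing value (None), and the tree is built with Prim's algorithm.

```python
def usable_columns(profiles):
    """Indices of targets that have an allele type assigned in every Sample."""
    rows = list(profiles.values())
    if not rows:
        return []
    return [i for i in range(len(rows[0])) if all(row[i] is not None for row in rows)]


def allele_distance(a, b, columns):
    """Number of compared targets at which two profiles differ."""
    return sum(1 for i in columns if a[i] != b[i])


def minimum_spanning_tree(profiles):
    """Return MST edges as (sample_u, sample_v, distance) using Prim's algorithm."""
    names = list(profiles)
    if not names:
        return []
    columns = usable_columns(profiles)
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Pick the cheapest edge from the tree to a Sample outside it.
        distance, u, v = min(
            (allele_distance(profiles[u], profiles[v], columns), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append((u, v, distance))
        in_tree.add(v)
    return edges


# Made-up allele profiles; None marks a target without an assigned allele type.
profiles = {
    "S1": [1, 1, 1, None],
    "S2": [1, 1, 2, 5],
    "S3": [1, 2, 2, 5],
}
tree = minimum_spanning_tree(profiles)
```

In this example the fourth target is excluded because S1 has no allele type for it, so the distances are computed over the first three targets only.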
