Introduction

VFDB (citation) is a virulence factor database provided by the Institute of Pathogen Biology, Beijing, China. The database contains virulence-factor-related alleles for important bacterial pathogens, which are searched by BLAST.

Within SeqSphere+, predefined task templates for the 54 species available from VFDB (as of February 2020) can be downloaded from the Task Template Sphere (requires SeqSphere+ version 7.0 or later):

Acinetobacter baumannii
Aeromonas hydrophila
Aeromonas salmonicida
Aeromonas veronii
Anaplasma phagocytophilum
Bacillus anthracis
Bacillus cereus
Bacillus subtilis
Bartonella henselae
Bartonella quintana
Bordetella pertussis
Brucella melitensis
Brucella suis
Burkholderia pseudomallei
Campylobacter jejuni (+ C. coli)
Chlamydia trachomatis
Clostridium botulinum
Clostridium difficile
Clostridium novyi
Clostridium perfringens
Clostridium septicum
Clostridium tetani
Corynebacterium diphtheriae
Coxiella burnetii
Enterococcus faecalis
Enterococcus faecium
Escherichia coli
Haemophilus influenzae
Helicobacter pylori
Klebsiella pneumoniae (+ K. variicola/quasipneumoniae)
Legionella pneumophila
Listeria innocua
Listeria ivanovii
Listeria monocytogenes
Mycobacterium tuberculosis (+ M. bovis/africanum/canettii)
Mycoplasma hyopneumoniae
Mycoplasma pneumoniae
Neisseria meningitidis
Pseudomonas aeruginosa
Rickettsia conorii
Rickettsia rickettsii
Salmonella enterica
Shigella dysenteriae
Shigella flexneri
Staphylococcus aureus
Streptococcus agalactiae
Streptococcus pneumoniae
Streptococcus pyogenes
Vibrio cholerae
Vibrio parahaemolyticus
Vibrio vulnificus
Yersinia enterocolitica
Yersinia pestis

Task Entry Overview

Genotyping result allele table of Task Entry Overview for VFDB task

When a VFDB task entry is processed, SeqSphere+ performs a target scan for the defined virulence factor alleles. Alleles found with at least 85% identity and 60% aligned overlap to the allele in the library are shown in the Task Entry Overview table. The table rows are colored by percent identity and alignment overlap using the following thresholds:

  • Dark green row: Identity = 100% and Aligned = 100%
  • Light green row: Identity ≥ 85% and Aligned = 100%
  • Gray row: Identity ≥ 85% and Aligned ≥ 60%
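The thresholds above can be read as an ordered set of rules, checked from strictest to loosest. The following is a minimal sketch of that reading in Python. The function name `row_color` and its parameters are hypothetical and are not part of SeqSphere+; the sketch only restates the documented thresholds.

```python
from typing import Optional


def row_color(identity: float, aligned: float) -> Optional[str]:
    """Classify a match by percent identity and percent aligned overlap.

    Hypothetical helper: it mirrors the documented thresholds only and
    does not reflect how SeqSphere+ implements the coloring internally.
    Returns None for matches below the reporting thresholds.
    """
    # Rules are checked from strictest to loosest, so the first hit wins.
    if identity == 100.0 and aligned == 100.0:
        return "dark green"
    if identity >= 85.0 and aligned == 100.0:
        return "light green"
    if identity >= 85.0 and aligned >= 60.0:
        return "gray"
    return None  # not shown in the Task Entry Overview table
```

For example, a match with 97.5% identity over the full allele length would fall into the light green class, while the same identity over 80% of the allele would be gray.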

If multiple matches for a target (same or different allele) are found at different locations, each match is listed as a separate row in the table.

To learn more about a virulence factor, select its row, right-click, and choose the menu entry Browse VFDB. On the VFDB web page, follow the link at the top of the page for further information about this virulence factor.

Below the table, a colored threshold legend, version information, and citation(s) are shown.

Result Fields

Sample result table containing the aggregated result field of VFDB

For each confidently found (colored green) virulence factor allele, the Target name is stored as a result field of the task entry (e.g., for 'aslA' = 'aslA'). If multiple matches for a target (same or different allele) are found at different locations, the gene name appears multiple times, concatenated with "," (e.g., for 'aslA' = 'aslA, aslA').

Additionally, the list of confidently found targets is stored in the result field 'Confident Targets', concatenated with "/". The list is trimmed if it is longer than 255 characters and should therefore not be used for database queries. Only this summary result field is shown in the result tab of the Sample Overview. However, clicking the VFDB category link shows all details for the sample.
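How the per-target fields and the summary field relate can be illustrated with a short Python sketch. All names here (`build_result_fields`, `MAX_SUMMARY_LENGTH`, the color strings) are hypothetical. The sketch also assumes that each target appears once in the summary and that trimming is a plain cut at 255 characters; the text above does not specify either detail.

```python
MAX_SUMMARY_LENGTH = 255  # documented limit of the summary field


def build_result_fields(matches):
    """Aggregate matches into per-target fields and a summary string.

    `matches` is a list of (target_name, row_color) tuples.
    Hypothetical sketch: only green rows count as confident, each target
    is assumed to appear once in the summary, and trimming is assumed
    to be a simple cut at 255 characters.
    """
    confident = {}
    for target, color in matches:
        if color in ("dark green", "light green"):
            # One entry per match, so repeated hits repeat the name.
            confident.setdefault(target, []).append(target)

    target_fields = {t: ", ".join(names) for t, names in confident.items()}
    summary = "/".join(confident)[:MAX_SUMMARY_LENGTH]
    return target_fields, summary
```

Because the summary may be cut off, a query for a specific target is more reliable against the individual Target field than against the summary string.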

Sample search with Field Criteria for a result field of VFDB

Specific Target fields of the VFDB entry can be selected for searching under 'Field Criteria' in the advanced mode of the sample search dialog.

Comparison Table function to remove resistance/virulence columns containing only empty values

This result field can also be retrieved for a Comparison Table and for exporting metadata. If the VFDB Task Template is chosen in the Create Comparison Table dialog, the Target data is shown right after the epidemiological metadata (with a gray column header). For a better overview, it is recommended to use the command Columns | Remove Resistance/Virulence Genotyping Columns where All Values Are Missing to remove the columns that are empty for all samples.
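For exported data, the effect of removing all-empty columns can be reproduced outside SeqSphere+. The following Python sketch assumes the table is held as a list of dictionaries, one per sample, with empty strings for missing values. The function name `drop_empty_columns` and the data layout are assumptions for illustration, not a SeqSphere+ export format.

```python
def drop_empty_columns(rows, columns):
    """Keep only the columns that hold a value for at least one sample.

    `rows` is a list of dicts (one per sample), `columns` the ordered
    list of column names. Hypothetical helper for exported data; missing
    keys and empty strings are both treated as missing values.
    """
    kept = [c for c in columns if any(row.get(c) for row in rows)]
    trimmed = [{c: row.get(c, "") for c in kept} for row in rows]
    return kept, trimmed
```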

To compare the virulence profiles (presence/absence) of several samples, it is recommended to use the command Columns | Transform Resistance/Virulence Genotyping Columns to Absence/Presence (+/-). Alternatively, treat missing values as their own category when building trees. Next, invoke the command Columns | Select Genotyping Schemes for Distance Calculation ... and, in the dialog that opens, select VFDB and preferably deselect all other schemes from the distance calculation. If the data were not transformed, select the option Missing Values are Own Category in the missing values dialog that appears after the command for calculating a tree is invoked.
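The idea behind the presence/absence transformation can be sketched in Python. The functions `to_presence_absence` and `profile_distance` are hypothetical, and the distance shown (number of differing columns) is an assumption chosen for illustration; the text does not state which distance SeqSphere+ computes.

```python
def to_presence_absence(row, columns):
    """Map each target column to "+" (value present) or "-" (missing).

    Hypothetical helper: `row` is a dict of target name to field value;
    empty strings and missing keys both count as absent.
    """
    return {c: "+" if row.get(c) else "-" for c in columns}


def profile_distance(profile_a, profile_b, columns):
    """Count the columns in which two presence/absence profiles differ.

    Assumed distance measure, used here only to illustrate the
    comparison of two transformed profiles.
    """
    return sum(1 for c in columns if profile_a[c] != profile_b[c])
```

After the transformation, a sample with 'aslA, aslA' and a sample with 'aslA' are treated as identical for that target, since only presence is compared.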

Chromosome and Plasmids Overview

If the Chromosome and Plasmids Overview Task Template is used for the same Sample, some VFDB results are integrated there.