There are two types of cgMLST schemas: stable and ad hoc. Stable schemas provide a public, expandable nomenclature, whereas ad hoc schemas provide a local nomenclature. Defining, evaluating, and calibrating a good stable cgMLST schema is quite laborious. However, all approved stable schemas are publicly available and can be downloaded for immediate use. In contrast, users have to establish their own ad hoc scheme, which can be done quickly.

Stable and ad hoc cgMLST schemes deliver equally good genotyping results when used for analyzing outbreaks. Of course, with an ad hoc scheme it is by definition not possible to share an allele nomenclature between laboratories. Furthermore, stable cgMLST schemes come with a predefined allele distance threshold for detecting clusters. For ad hoc schemes, however, users can define their own thresholds, which are then also used to trigger cluster alerts. Finally, the percentage of good cgMLST targets might not be a good quality control parameter for an ad hoc scheme if the scheme was not defined or applied carefully enough.
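To make the role of the allele distance threshold and of the percentage of good targets more concrete, the following minimal Python sketch may help. It is an illustration under stated assumptions, not the software's actual implementation: allelic profiles are modeled as dictionaries from hypothetical target names to allele numbers, missing targets are marked with `None` and ignored pairwise, and clusters are formed by single linkage. The sample names, target names, and threshold value are made up.

```python
from itertools import combinations

MISSING = None  # assumption: marker for a target that could not be called


def allele_distance(a, b):
    """Count targets with different alleles, ignoring targets missing in either profile."""
    return sum(
        1
        for target in a.keys() & b.keys()
        if a[target] is not MISSING
        and b[target] is not MISSING
        and a[target] != b[target]
    )


def single_linkage_clusters(profiles, threshold):
    """Group samples connected by chains of pairwise distances <= threshold."""
    parent = {sample: sample for sample in profiles}

    def find(sample):
        # Follow parent links to the cluster representative (with path halving).
        while parent[sample] != sample:
            parent[sample] = parent[parent[sample]]
            sample = parent[sample]
        return sample

    for x, y in combinations(profiles, 2):
        if allele_distance(profiles[x], profiles[y]) <= threshold:
            parent[find(x)] = find(y)

    clusters = {}
    for sample in profiles:
        clusters.setdefault(find(sample), set()).add(sample)
    # Only groups with at least two samples count as clusters.
    return [members for members in clusters.values() if len(members) > 1]


def percent_good_targets(profile):
    """Percentage of scheme targets for which an allele was called."""
    called = sum(1 for allele in profile.values() if allele is not MISSING)
    return 100.0 * called / len(profile)


# Made-up profiles over four hypothetical targets.
profiles = {
    "S1": {"t1": 1, "t2": 1, "t3": 4, "t4": 2},
    "S2": {"t1": 1, "t2": 1, "t3": 4, "t4": 3},
    "S3": {"t1": 1, "t2": None, "t3": 4, "t4": 3},
    "S4": {"t1": 7, "t2": 9, "t3": 8, "t4": 6},
}

# Hypothetical user-defined threshold for an ad hoc scheme.
clusters = single_linkage_clusters(profiles, threshold=1)
print(clusters)
print(percent_good_targets(profiles["S3"]))
```

With a threshold of 1, samples S1, S2, and S3 end up in one cluster while S4 stays unclustered. The quality value for S3 also shows why the percentage of good targets depends on the scheme: with a poorly chosen target set, many targets may be missing for reasons unrelated to sequencing quality.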

Two different approaches to defining an ad hoc cgMLST scheme are possible, depending on whether the scheme is going to be used for analyzing a single outbreak or multiple outbreaks:

  • The single outbreak analysis approach is very similar to how analyses in SNP calling publications are usually done. Here, the researcher must first determine the genetically closest available finished (complete or chromosome) genome, e.g. by an in silico MLST or k-mer search, and then use this genome as the seed genome, without any query genomes, to establish an ad hoc cgMLST schema. This approach delivers the highest possible discriminatory power but is not well suited to being expanded to the analysis of multiple outbreaks with different genetic backgrounds or to continuous monitoring of a species.
  • The multiple outbreak analysis approach essentially follows at least chapter 3 of the stable cgMLST schema tutorial, i.e. a well-characterized strain is taken as the seed genome and usually multiple query genomes are used to establish a potentially stable cgMLST schema for ad hoc use with a local nomenclature.
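The idea of picking the genetically closest finished genome by a k-mer search can be sketched as follows. This is a toy illustration only: it compares exact k-mer sets with the Jaccard index on short made-up sequences, ignores reverse complements, and uses a very small k. The function names and sequences are hypothetical and do not reflect how any particular tool performs the search.

```python
def kmers(seq, k):
    """Return the set of all overlapping k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard index of two k-mer sets (shared k-mers / all k-mers)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def closest_reference(sample_seq, references, k=4):
    """Name of the reference whose k-mer set is most similar to the sample's."""
    sample = kmers(sample_seq, k)
    return max(
        references,
        key=lambda name: jaccard(sample, kmers(references[name], k)),
    )


# Made-up candidate seed genomes (real genomes are millions of bases long).
references = {
    "ref_A": "ATGCGTACGTTAGCCGATAGCTAGGCTA",
    "ref_B": "TTTTGGGGCCCCAAAATTTTGGGGCCCC",
}
sample = "ATGCGTACGTTAGCCGATAGCTAGGCTT"

best = closest_reference(sample, references)
print(best)
```

Here the sample differs from `ref_A` only in its last base, so `ref_A` is selected as the candidate seed genome. For real genomes, a larger k and a memory-efficient comparison would be needed.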

A cgMLST scheme is usually slightly less discriminatory than a scheme defined with an ‘SNP-like’ approach but is better suited to prospective analysis. However, when both the cgMLST and the accessory genome genes of such a scheme are used for comparative analysis, the discriminatory power is nearly as high as with the ‘SNP-like’ approach.

The ad hoc cgMLST schema tutorial only describes the single outbreak analysis approach.