Overview

Submission Anonymization Filter dialog for cgMLST.org
Submission Anonymization Filter dialog for EBI ENA

The Submission Anonymization Filter dialog appears each time before samples are manually submitted to a public database, i.e., the cgMLST.org nomenclature server or EBI ENA. The dialog is also shown when a pipeline is defined that is intended to submit to cgMLST.org. The Submission Anonymization Filter defines the level of metadata that is submitted to the public database. Except for the submitter information at cgMLST.org, all submitted data is made publicly available.

Major Fields

The fields at the top of the dialog allow the settings for the most critical data to be defined quickly:

  • Sample Submission
This field is only shown for cgMLST.org submissions. It defines one of the following three submission modes:
  • Store only new CT founders as anonymized samples on cgMLST.org (default)
Only the allelic profiles of the submitted Samples that do not belong to an existing CT are stored on cgMLST.org. No submitter data will be stored for these Samples.
  • Store only new CT founders on cgMLST.org
Only the allelic profiles of the submitted Samples that do not belong to an existing CT are stored on cgMLST.org.
  • Store all submitted samples on cgMLST.org
All submitted Samples will be stored on cgMLST.org.
  • Submitter Info
This field is only shown for cgMLST.org submissions. It defines whether the submitter contact information is shown on the public cgMLST.org web site. By default, it is not shown publicly on cgMLST.org. However, it will be used to forward incoming requests about submitted Samples.
  • Submit Sample ID
Submission to the cgMLST.org nomenclature server requires a sample ID. Instead of the Sample ID, an alias ID can be submitted. Alternatively, the cgMLST.org ID can be used. In the latter case, a unique identifier, i.e., a serial number that is always assigned to the sample at cgMLST.org (cgMLST.org ID, e.g., RID002433), is used as the sample ID. The cgMLST.org ID is always stored back in the user database.
Submission to EBI ENA requires a unique ID. If neither the sample ID nor the alias ID should be submitted, the cgMLST.org ID received from the cgMLST.org nomenclature server can be used as the submitted sample ID. However, this requires that the sample has already been submitted to cgMLST.org.
  • Submit from Source Location
Defines which level of place information is submitted: Country, State, or City, or whether no place data is submitted at all (stating 'not provided' or 'restricted access').
  • Submit from Collection Date
Defines which level of time information is submitted: Year, Month, or Day, or whether no time data is submitted at all (stating 'not provided' or 'restricted access').
  • Submit Assembly Contigs (FASTA)
This field is only shown for cgMLST.org submissions and defines whether the assembly contigs (i.e., the genome sequence) are submitted to cgMLST.org.
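
The granularity levels for Source Location and Collection Date can be pictured as truncating a full value to the chosen level. The following is a minimal Python sketch of that idea and not the application's implementation. The function names, the dictionary keys, the ISO-style date strings, and the assumption that a finer place level also includes the coarser ones (e.g., State implies Country) are all hypothetical and chosen for illustration only.

```python
from datetime import date

# Hypothetical level names mirroring the dialog options (assumption).
PLACE_LEVELS = ("country", "state", "city")


def truncate_collection_date(collected: date, level: str) -> str:
    """Reduce a full date to the chosen level: 'year', 'month', or 'day'."""
    if level == "year":
        return f"{collected.year:04d}"
    if level == "month":
        return f"{collected.year:04d}-{collected.month:02d}"
    if level == "day":
        return collected.isoformat()
    raise ValueError(f"unknown date level: {level!r}")


def truncate_place(place: dict, level: str) -> dict:
    """Keep only the place parts up to and including the chosen level.

    Assumes that a finer level also carries the coarser ones.
    """
    if level not in PLACE_LEVELS:
        raise ValueError(f"unknown place level: {level!r}")
    keep = PLACE_LEVELS[: PLACE_LEVELS.index(level) + 1]
    return {key: place[key] for key in keep if place.get(key)}


if __name__ == "__main__":
    sample_place = {"country": "Germany", "state": "NRW", "city": "Münster"}
    print(truncate_collection_date(date(2021, 3, 9), "month"))
    print(truncate_place(sample_place, "state"))
```

Choosing a coarser level in this sketch simply drops the finer parts of the value, which matches the purpose of the filter: less identifying metadata reaches the public database.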

Field Selection for Submission

Below the major fields, a table shows the settings for all fields that could potentially be submitted to the public database. This also includes the fields covered by the settings at the top (e.g., sample ID, place, time). Mandatory fields are shown at the top of the table. For all other fields, the Submission State can be switched between Submit and Do Not Submit.

The Quick Choose button can be used to choose a predefined setting that changes the Submission State for several or all fields. If the dialog is shown during a manual submission, the settings are stored as defaults and will be used the next time this dialog appears.

Missing Data Terms

For place and time information, three different terms for missing data can be used, in accordance with the INSDC missing value reporting terms. The term "not collected" can be set as the value for place and time fields of a Sample. The terms "not provided" and "restricted access" can be chosen in the Submission Anonymization Filter to prevent submission of data.

The following rules are used for place and time information:

  • If the Sample value should be submitted and is empty, then "not provided" is submitted.
  • If "not provided" is selected and the Sample value is empty, then "not provided" is submitted.
  • If "not provided" is selected and the Sample value is "not collected", then "not collected" is submitted.
  • If "restricted access" is selected and the Sample value is "not collected" or empty, then "restricted access" is submitted.
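
The four rules above can be sketched as a small decision function. This is a minimal Python illustration and not the application's code. The function name, the "submit" choice label, and the parameter names are hypothetical. Cases the rules do not state explicitly are marked as assumptions in the comments: a non-empty Sample value that should be submitted is passed through unchanged, and a real Sample value is replaced by the selected term when "not provided" or "restricted access" is chosen.

```python
NOT_COLLECTED = "not collected"
NOT_PROVIDED = "not provided"
RESTRICTED_ACCESS = "restricted access"
SUBMIT = "submit"  # hypothetical label for "submit the Sample value"


def resolve_submitted_value(sample_value, filter_choice):
    """Return the text submitted for a place or time field."""
    value = (sample_value or "").strip()

    if filter_choice == SUBMIT:
        # Rule 1: an empty value is reported as "not provided".
        # Assumption: any non-empty value is submitted unchanged.
        return value if value else NOT_PROVIDED

    if filter_choice == NOT_PROVIDED:
        # Rule 3: "not collected" on the Sample is kept.
        if value == NOT_COLLECTED:
            return NOT_COLLECTED
        # Rule 2 for empty values.
        # Assumption: real values are also replaced by "not provided".
        return NOT_PROVIDED

    if filter_choice == RESTRICTED_ACCESS:
        # Rule 4 for empty or "not collected" values.
        # Assumption: real values are also replaced by "restricted access".
        return RESTRICTED_ACCESS

    raise ValueError(f"unknown filter choice: {filter_choice!r}")


if __name__ == "__main__":
    for sample in ("", NOT_COLLECTED):
        for choice in (SUBMIT, NOT_PROVIDED, RESTRICTED_ACCESS):
            print(repr(sample), choice, "->", resolve_submitted_value(sample, choice))
```

The sketch makes the asymmetry between the two terms visible: "not provided" preserves a "not collected" value set on the Sample, while "restricted access" overrides it.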