Submission Anonymization Filter dialog for cgMLST.org
Submission Anonymization Filter dialog for EBI ENA

Overview

The Submission Anonymization Filter dialog appears each time before data is manually submitted to a public database, i.e., to the cgMLST.org nomenclature server or to EBI ENA. The dialog is also shown when a pipeline is defined that should submit to cgMLST.org. The Submission Anonymization Filter is used to define the level of metadata that should be submitted to the public database. Except for the submitter information at cgMLST.org, any submitted data is made publicly available.

Top Fields

The fields at the top of the dialog allow you to quickly define the levels for the most critical data:

  • Submit Sample ID:
For submission to the cgMLST.org nomenclature server, the sample ID can be submitted, the alias ID can be submitted in its place as sample ID, or the field can be left empty. In the latter case, the Ridom ID (e.g., RID002433), a unique serial number that cgMLST.org always assigns to the sample, is used as sample ID. The Ridom ID is always stored back in the user database.
For EBI ENA submission, a unique ID is required. If neither the sample ID nor the alias ID should be submitted, the Ridom ID that was received from the cgMLST.org nomenclature server can be used as the submitted sample ID. However, this requires that the sample has already been submitted to cgMLST.org.
  • Submit from source location:
Defines which level of place information should be submitted: Country, State, or City, or whether no place data should be submitted at all (stating 'not provided' or 'restricted access').
  • Submit from collection date:
Defines which level of time information should be submitted: Year, Month, or Day, or whether no time data should be submitted at all (stating 'not provided' or 'restricted access').
  • Submit Assembly Contigs (FASTA):
This field is only shown for cgMLST.org submissions and defines if the assembly contigs (i.e., genome sequence) should be submitted to cgMLST.org or not.
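
The effect of the collection date level can be illustrated with a short Python sketch. It is not the software's implementation: the function name reduce_collection_date and the LEVEL_FORMATS table are hypothetical, and the output formats (YYYY, YYYY-MM, YYYY-MM-DD) are an assumption based on common INSDC date notation. The sketch only handles a Sample with a known collection date.

```python
from datetime import date

# Hypothetical mapping from the dialog's level names to output formats.
# The formats are an assumption, not taken from the software.
LEVEL_FORMATS = {"Year": "%Y", "Month": "%Y-%m", "Day": "%Y-%m-%d"}

# The two terms the dialog offers for withholding the value entirely.
WITHHOLD_TERMS = ("not provided", "restricted access")


def reduce_collection_date(collected: date, level: str) -> str:
    """Return the collection date reduced to the chosen level of detail."""
    if level in WITHHOLD_TERMS:
        # No time data is submitted; the chosen term is sent in its place.
        return level
    if level not in LEVEL_FORMATS:
        raise ValueError(f"unknown level: {level!r}")
    return collected.strftime(LEVEL_FORMATS[level])
```

For example, a Sample collected on 7 March 2019 would be submitted as 2019 at level Year and as 2019-03 at level Month. The place level could be handled the same way by dropping the finer-grained fields.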

Field Selection for Submission

Below the top fields, a table shows the settings for all fields that could potentially be submitted to the public database. This also includes the ones from the settings at the top (e.g., sample ID, place, time). Mandatory fields are shown at the top of the table. For all other fields, the Submission State can be switched between Submit and Do Not Submit.

The Quick Choose button can be used to choose a predefined setting that changes the Submission State for several or all fields. If the dialog is shown during a manual submission, then the settings are stored as defaults and will be used the next time this dialog appears.

Missing Data Terms

For place and time information three different terms for missing data can be used, according to INSDC missing value reporting terms: The term "not collected" can be set as value for place and time fields of a Sample. The terms "not provided" and "restricted access" can be chosen in the Submission Anonymization Filter to prevent submission of data.

The following rules are used for place and time information:

  • If the Sample value should be submitted and is empty, then "not provided" is submitted.
  • If "not provided" is selected and the Sample value is empty, then "not provided" is submitted.
  • If "not provided" is selected and the Sample value is "not collected", then "not collected" is submitted.
  • If "restricted access" is selected and the Sample value is "not collected" or empty, then "restricted access" is submitted.
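
The rules above can be summarized in a short Python sketch. It is an illustration only, not the software's code: the function name resolve_submitted_value and the filter choice "submit" are hypothetical. The four listed rules do not say what happens when a term is selected and the Sample holds a real value; the sketch assumes the selected term replaces it, since the terms are chosen to prevent submission of data.

```python
from typing import Optional

NOT_COLLECTED = "not collected"
NOT_PROVIDED = "not provided"
RESTRICTED = "restricted access"


def resolve_submitted_value(sample_value: Optional[str], filter_choice: str) -> str:
    """Return the place or time value that would be submitted.

    filter_choice is "submit" (hypothetical name for submitting the
    Sample value), "not provided", or "restricted access".
    """
    value = (sample_value or "").strip()
    if filter_choice == "submit":
        # Rule 1: an empty value is reported as "not provided".
        return value if value else NOT_PROVIDED
    if filter_choice == NOT_PROVIDED:
        # Rules 2 and 3: "not collected" is kept; an empty value becomes
        # "not provided". Assumption: a real value is also replaced.
        return NOT_COLLECTED if value == NOT_COLLECTED else NOT_PROVIDED
    if filter_choice == RESTRICTED:
        # Rule 4: "restricted access" overrides "not collected" and empty.
        # Assumption: a real value is also replaced.
        return RESTRICTED
    raise ValueError(f"unknown filter choice: {filter_choice!r}")
```

Note the asymmetry between the two terms: "not provided" lets an existing "not collected" value through, whereas "restricted access" always takes precedence.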