The Minimum Spanning Tree (MST) is calculated from the Sample data in the columns selected for distance calculation.
Empty values or values that start with a ? are treated as missing values.
These missing values can be treated in two different ways:
- missing values are an own category: Two missing values are considered equal. A missing value and a non-missing value are considered different.
- pairwise ignore missing values: Two missing values are considered equal. A missing value and a non-missing value are also considered equal.
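The two options can be illustrated with a minimal Python sketch. It assumes that the distance between two Samples is the number of columns in which their values differ; the text above only states when two values count as equal or different, so this counting rule, the function names, and the mode names `own_category` and `pairwise_ignore` are hypothetical and chosen for illustration only.

```python
def is_missing(value):
    """Empty values or values that start with '?' are treated as missing."""
    text = "" if value is None else str(value)
    return text == "" or text.startswith("?")


def distance(profile_a, profile_b, mode="own_category"):
    """Count the columns in which two Samples differ (assumed distance rule).

    mode="own_category":    missing vs. non-missing counts as different.
    mode="pairwise_ignore": missing vs. non-missing counts as equal.
    In both modes, two missing values count as equal.
    """
    if mode not in ("own_category", "pairwise_ignore"):
        raise ValueError(f"unknown mode: {mode}")
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same number of columns")

    differences = 0
    for a, b in zip(profile_a, profile_b):
        a_missing, b_missing = is_missing(a), is_missing(b)
        if a_missing and b_missing:
            continue  # two missing values are equal in both modes
        if a_missing or b_missing:
            if mode == "own_category":
                differences += 1  # missing is its own category
            continue  # pairwise ignore: treated as equal
        if a != b:
            differences += 1
    return differences


sample_1 = ["1", "2", "", "4"]
sample_2 = ["1", "3", "?", "?"]
print(distance(sample_1, sample_2, mode="own_category"))     # 2
print(distance(sample_1, sample_2, mode="pairwise_ignore"))  # 1
```

In this example the third column is missing in both Samples and never counts as a difference, while the fourth column (a value compared with a missing value) counts only under the own-category rule.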
For a small number of distance calculation columns (e.g. MLVA data or MLST data), the "missing values are an own category" option is recommended. For a larger number of columns (e.g. MLST with hundreds of targets), the "pairwise ignore missing values" option is recommended.
Note that the "pairwise ignore missing values" option may cause problems in the MST when a Sample contains many missing values. Because each missing value is treated as equal to whatever it is compared with, such a Sample can appear identical to many other Samples, and many or all of the MST nodes might be merged. It is therefore recommended to remove Samples that have missing values in more than 10% of the distance calculation columns before calculating an MST.
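The recommended pre-filtering step could be sketched as follows. This is an illustrative Python snippet, not part of the application; the function names and the dictionary layout (Sample name mapped to a list of column values) are assumptions.

```python
def is_missing(value):
    """Empty values or values that start with '?' are treated as missing."""
    text = "" if value is None else str(value)
    return text == "" or text.startswith("?")


def missing_fraction(profile):
    """Fraction of distance calculation columns that are missing in one Sample."""
    if not profile:
        return 0.0
    return sum(1 for value in profile if is_missing(value)) / len(profile)


def filter_samples(samples, max_missing=0.10):
    """Keep only Samples whose missing fraction does not exceed max_missing.

    samples: dict mapping a Sample name to its list of column values
             (hypothetical layout for this sketch).
    """
    return {
        name: profile
        for name, profile in samples.items()
        if missing_fraction(profile) <= max_missing
    }


samples = {
    "A": ["1"] * 10,               # no missing values
    "B": ["1"] * 9 + ["?"],        # 10% missing: kept
    "C": ["1"] * 8 + ["", "?5"],   # 20% missing: removed
}
kept = filter_samples(samples)
print(sorted(kept))  # ['A', 'B']
```

A Sample with exactly 10% missing values is kept here, because the recommendation only concerns Samples with missing values in more than 10% of the columns.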