For very large data sets with thousands of samples, it is recommended to use GrapeTree to draw the Minimum Spanning Tree (MST).

Installation of GrapeTree on WSL

GrapeTree can be installed into a conda environment on the Windows Subsystem for Linux (WSL) with the following steps.

  • Step 1: Open the start menu, type wsl -d ridom_ubuntu and choose to execute it.
  • Step 2: If the computer requires a proxy to access the Internet, the proxy must first be configured in the WSL. To do so, place a file named .condarc containing the proxy configuration in the Linux user's home directory. The file can be created with the following command, replacing PROXYSERVER:PROXYPORT with your proxy server and port:
echo -e "\nproxy_servers:\n http: http://PROXYSERVER:PROXYPORT\n https: http://PROXYSERVER:PROXYPORT\n" >> ~/.condarc
  • Step 3: When the black WSL console window has started up, enter the command:
conda create --name grapetree -c bioconda grapetree
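
The proxy and installation steps above can be sketched as two small shell helpers. This is only an illustration: the function names append_conda_proxy and check_grapetree_env are hypothetical and not part of SeqSphere+ or GrapeTree, and the proxy address in the usage note is a placeholder. The first helper appends the same proxy_servers block as the echo command in Step 2; the second only reports whether conda lists an environment named grapetree.

```shell
# Append a proxy_servers block to a conda configuration file.
# $1 = target file (for example ~/.condarc)
# $2 = proxy address in the form PROXYSERVER:PROXYPORT
append_conda_proxy() {
  printf '\nproxy_servers:\n  http: http://%s\n  https: http://%s\n' "$2" "$2" >> "$1"
}

# Report whether conda lists an environment named "grapetree".
# Does nothing harmful if conda is not installed.
check_grapetree_env() {
  if ! command -v conda >/dev/null 2>&1; then
    echo "conda not found on PATH"
    return 1
  fi
  if conda env list | grep -q '^grapetree '; then
    echo "grapetree environment found"
  else
    echo "grapetree environment missing"
    return 1
  fi
}
```

For example, append_conda_proxy ~/.condarc proxy.example.org:3128 writes the block for a placeholder proxy, and check_grapetree_env can be run after Step 3 has finished.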

Creating a GrapeTree MST from a SeqSphere+ Comparison Table

[Figure: GrapeTree export function]
[Figure: GrapeTree browser window]
  • Step 1: In the comparison table menu, choose the function File | Export profile and metadata files for GrapeTree (tsv). This function creates two TSV files: a profile file containing the allelic profiles, and a metadata file containing the epidemiological metadata from the comparison table. To be accessible from the WSL, these files must be saved locally on your computer, i.e., on your C or D drive, not on a network drive.
  • Step 2: If GrapeTree is correctly installed, it is started automatically and creates a NWK tree file. The GrapeTree local server is then started, and a web browser opens automatically at the server's URL.
  • Step 3: The GrapeTree page is shown in the web browser. Press the Load Files button and import first the NWK file and then the metadata file created by SeqSphere+. Further information about loading the files and modifying the tree layout can be found in the GrapeTree tutorial.
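
The requirement in Step 1 that the files be on a local drive follows from how the WSL exposes Windows drives: by default they are mounted under /mnt/ with a lower-case drive letter. The sketch below shows that mapping with a hypothetical helper, win_to_wsl_path, which is not part of SeqSphere+ or GrapeTree. It assumes the default mount root /mnt/ and a path beginning with a drive letter; inside the WSL, the built-in wslpath -u command performs the same conversion. The commented-out grapetree call is an assumption about how a tree could be built by hand from the exported profile file; the options SeqSphere+ actually passes are not stated here.

```shell
# Convert a Windows path on a local drive to its default WSL mount path.
# Assumes the default automount root /mnt/ and a path starting with "X:".
win_to_wsl_path() {
  # Drive letter, lower-cased (C -> c).
  drive=$(printf '%s' "$1" | cut -c1 | tr '[:upper:]' '[:lower:]')
  # Remainder after "C:", with backslashes turned into forward slashes.
  rest=$(printf '%s' "$1" | cut -c3- | tr '\\' '/')
  printf '/mnt/%s%s\n' "$drive" "$rest"
}

# Hypothetical manual run inside the grapetree environment (not executed here):
#   grapetree --profile "$(win_to_wsl_path 'C:\data\profile.tsv')" --method MSTreeV2 > tree.nwk
```

A network share has no drive mounted under /mnt/ by default, so files saved there cannot be reached this way.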

Tree Topology

We noted slight differences in tree topology between our MST and the GrapeTree trees. These differences are most likely due to different treatment of missing data and/or different tie-breaking rules.
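
To illustrate why the treatment of missing data matters, the following sketch computes the allelic distance between two profiles under two simplified conventions. It is not the algorithm used by either SeqSphere+ or GrapeTree: the helper profile_distance, the use of "-" as the missing-value marker, and the two mode names are assumptions made for this example. In "ignore" mode, loci where either profile is missing are skipped; in "allele" mode, a missing value counts as an allele of its own.

```shell
# Count allelic differences between two profiles.
# $1 = mode: "ignore" (skip loci with a missing value) or
#            "allele" (treat "-" as its own allele)
# $2, $3 = space-separated allele numbers, "-" marks a missing locus
profile_distance() {
  awk -v mode="$1" -v a="$2" -v b="$3" 'BEGIN {
    n = split(a, x, " ")
    split(b, y, " ")
    d = 0
    for (i = 1; i <= n; i++) {
      missing = (x[i] == "-" || y[i] == "-")
      if (mode == "ignore" && missing) continue
      if (x[i] != y[i]) d++
    }
    print d
  }'
}

profile_distance ignore "1 2 - 4" "1 3 5 4"   # prints 1
profile_distance allele "1 2 - 4" "1 3 5 4"   # prints 2
```

The same pair of samples is one or two alleles apart depending on the convention, which can change which edges a minimum spanning tree algorithm selects.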

Runtime and Memory Usage

The following table lists the runtime in minutes, with memory usage in parentheses, for the calculation and visualization of Mycobacterium tuberculosis samples.

No. of Samples   Intel i7, 4 cores, WSL   Intel Xeon, 5 cores, Linux   Intel Xeon, 10 cores, Linux
5k               2m (5GB)                 2m (5GB)                     1m (5GB)
10k              8m (18GB)                6m (18GB)                    2m (18GB)
15k              14m (19GB)               10m (20GB)                   7m (20GB)