Velvet is a de novo assembler for short-read data. It works well for assembling reads from Illumina systems with read lengths from 70 bp to 250 bp. However, Velvet does not work well with reads that contain many indel errors and is therefore not suited for assembling data from Ion Torrent or 454 machines.

The assembler's algorithm is based on a so-called 'k-mer' data structure, and the assembly results depend strongly on the chosen value for k. SeqSphere+ includes an automatic mode that runs Velvet for multiple k-values and uses the results from the best run. Multiple instances of Velvet can be run in parallel with different values of k to speed up assembly. Velvet can require large amounts of memory, so at least 32 GB of memory is recommended.

The Velvet assembler can be accessed using the menu Tools | Short Read Assembler (Velvet). The assembler reads from FASTQ files that contain either single or paired reads. The input files can be quality trimmed and downsampled before assembly. The assembled reads are written to an ACE file, which can be imported using Create Samples from Assembled Genomes.
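The idea of running Velvet for several k-values and keeping the best run can be sketched as follows. This is only an illustration, not the SeqSphere+ implementation: it assumes each run is scored by the average length of its contigs longer than 1000 bp (the measure this page names for comparing assemblies), and the function names and the contig lengths in the example are hypothetical.

```python
MIN_CONTIG_LENGTH = 1000  # only contigs longer than this count toward the score


def assembly_score(contig_lengths, min_length=MIN_CONTIG_LENGTH):
    """Average length of contigs longer than min_length, or 0.0 if there are none."""
    long_contigs = [n for n in contig_lengths if n > min_length]
    if not long_contigs:
        return 0.0
    return sum(long_contigs) / len(long_contigs)


def candidate_k_values(k_min, k_max, step):
    """All k-values from k_min to k_max (inclusive) in increments of step."""
    return list(range(k_min, k_max + 1, step))


def best_k(contigs_by_k):
    """Return the k whose assembly has the highest score (first one wins on ties)."""
    return max(contigs_by_k, key=lambda k: assembly_score(contigs_by_k[k]))


# Hypothetical contig lengths (in bp) from three Velvet runs.
runs = {
    31: [400, 1200, 1800],
    41: [900, 2500, 3500],
    51: [300, 700, 1100],
}
print(candidate_k_values(31, 51, 10))
print(best_k(runs))
```

In this made-up data the run with k = 41 wins, because its contigs above 1000 bp are longest on average; short contigs do not affect the score at all.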
Quality Trimming

The reads are processed before they are assembled. They can be automatically trimmed based on read quality and downsampled. By default, reads are trimmed on both ends until the average base quality is > 30 in a window of 20 bases. These settings usually work well for Illumina HiSeq/MiSeq data. For other sequencing technologies, different values may be needed. Note that quality trimming will not remove a read even if the quality is below the threshold in all bases; it will always leave (2 * window size) bases in the middle of the read. Enabling quality trimming is recommended, as it usually results in better assemblies.

Downsampling

To reduce the size of the output files as well as time and memory usage, the input files can be downsampled. Downsampling randomly removes reads until approximately the given size is reached. If quality trimming is selected, downsampling is done on the trimmed reads.

Choice of k-values

Velvet uses a 'k-mer' data structure for assembling, and assembly results differ depending on the value for k. A good value for k can be roughly estimated from the coverage and the read length. To get good assembly results, it is suggested to run Velvet with different values for k and to keep the assembly with the best results (measured by the average length of contigs with more than 1000 bp).

An automatic mode exists that uses a heuristic based on read length to determine the k-values that should be checked. Velvet is then run for these k-values. The automatic mode usually finds good results, and using it is recommended for most users.

Alternatively, an upper and a lower limit for k-values and a step size can be specified. Velvet is then started for all k-values in the given range, and the best assembly is returned.

Automatic mode heuristic
Settings

The dialog window allows changing settings for
Speed and Memory

Memory usage of Velvet depends on
If possible, quality trimming and downsampling are recommended to reduce memory consumption. Running multiple instances of Velvet in parallel increases memory consumption.
Note that actual time and memory requirements may differ depending on genome size, read length, coverage, read distribution and quality, and k-mer settings.
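Downsampling, recommended above as a way to reduce memory consumption, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method SeqSphere+ uses: it assumes the target size is given as a number of bases, it holds all reads in memory as plain strings, and it handles single reads only (for paired reads, both mates of a pair would have to be kept or dropped together). The function name and parameters are hypothetical.

```python
import random


def downsample_reads(reads, target_bases, seed=0):
    """Randomly drop reads so that roughly target_bases bases remain.

    Each read is kept with probability target_bases / total_bases, so the
    resulting size is approximate rather than exact.
    """
    total_bases = sum(len(seq) for seq in reads)
    if total_bases <= target_bases:
        return list(reads)  # already small enough, keep everything
    keep_fraction = target_bases / total_bases
    rng = random.Random(seed)  # fixed seed makes the selection reproducible
    return [seq for seq in reads if rng.random() < keep_fraction]


# Hypothetical input: 10,000 reads of 100 bases each (1,000,000 bases in total).
reads = ["A" * 100] * 10_000
sampled = downsample_reads(reads, target_bases=250_000)
print(sum(len(seq) for seq in sampled))
```

Keeping each read independently with a fixed probability is simple and needs only one pass over the data, at the cost of hitting the target size only approximately.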
Open Source note

The Velvet assembler function is a wrapper around external Velvet executables. Velvet is open source software licensed under the GNU General Public License version 2.0 (GPLv2). The program homepage is http://www.ebi.ac.uk/~zerbino/velvet/. The program was ported to Windows by Applied Maths: http://www.applied-maths.com/download/open-source. The source code is installed next to the executable. Please note that Ridom can only give limited support for the Velvet assembler.

For more information see: Zerbino DR and Birney E (2008) Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18(5):821-9. [PMID: 18349386]