Dialog window

This function can be invoked via Tools > Fix rep/dnaA Start and Orientation for Plasmids/Chromosomes in the menu. It searches the sequences in FASTA files for the genes that mark the origin of replication in chromosomes (dnaA) and plasmids (rep), and re-orients each sequence so that it starts at the position of the respective gene. This can be useful for the visualization of sequence comparisons.

The assignment of a contig to either plasmid or chromosome is based on the size of the contig: contigs from 3,000 to 500,000 bases are considered plasmids, larger contigs are considered chromosomes, and contigs shorter than 3,000 bases are ignored.
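The size rule can be written as a small function. This is only a sketch of the thresholds stated above: the function name and the returned labels are hypothetical, and it assumes that both 3,000 and 500,000 bases still count as plasmid.

```python
# Thresholds taken from the text; inclusive boundaries are an assumption.
PLASMID_MIN_LENGTH = 3_000
PLASMID_MAX_LENGTH = 500_000


def classify_contig(length: int) -> str:
    """Return 'ignored', 'plasmid' or 'chromosome' for a contig length in bases."""
    if length < PLASMID_MIN_LENGTH:
        return "ignored"
    if length <= PLASMID_MAX_LENGTH:
        return "plasmid"
    return "chromosome"
```

For example, a contig of 120,000 bases would be searched against the rep library, while a contig of 4,800,000 bases would be searched against the dnaA library.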

By default, the function assumes that each contig represents a circular sequence. If the box Look for [topology=circular] in FASTA contig header instead of assuming that contig(s) are circular is checked, only contigs with the term [topology=circular] in their header are considered circular. If the box Skip non-circular contigs is also checked, contigs without [topology=circular] in their headers are left unchanged.
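The circularity decision might look like the following sketch. The function and parameter names are hypothetical, and a plain case-sensitive substring match on the header is an assumption; the text does not say how the header is parsed.

```python
CIRCULAR_TAG = "[topology=circular]"


def is_circular(header: str, look_for_tag: bool = False) -> bool:
    """Decide whether a contig is treated as circular.

    look_for_tag mirrors the 'Look for [topology=circular] ...' checkbox:
    when it is off, every contig is assumed to be circular.
    """
    if not look_for_tag:
        return True
    # Assumption: a simple, case-sensitive substring test on the header line.
    return CIRCULAR_TAG in header
```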

BLAST is used to search for hits within a dnaA library for chromosomes and for hits within a rep library for plasmids. If multiple hits are found, the hits that contain rep_cluster in their name are sorted to the end of the hit list, so that they are used only if no other hit is available. Afterwards, the best hit is used. If the best hit begins within the first 50 bases of the contig, no changes are made.
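The hit selection can be sketched as follows. The Hit class and both function names are hypothetical. The text does not say how the "best" hit is ranked, so ordering by bit score is an assumption, as is reading "within the first 50 bases" as a 1-based start position of 50 or less.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hit:
    name: str        # name of the library entry that was hit
    start: int       # 1-based start position of the hit on the contig
    bitscore: float  # assumed ranking criterion


def pick_best_hit(hits: List[Hit]) -> Optional[Hit]:
    """Sort rep_cluster hits to the end, then take the first remaining hit."""
    if not hits:
        return None
    # False sorts before True, so non-rep_cluster hits come first;
    # within each group, higher bit scores come first.
    ordered = sorted(hits, key=lambda h: ("rep_cluster" in h.name, -h.bitscore))
    return ordered[0]


def needs_change(hit: Hit, min_start: int = 50) -> bool:
    """Return False if the hit already begins within the first 50 bases."""
    return hit.start > min_start
```

Under these assumptions, a named replicon hit is preferred over a rep_cluster hit even if the latter has the higher score.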

If a best hit is found, further processing depends on whether the contig is circular:

  • if the contig is circular, the contig is rotated so that it starts at that hit. If the hit is in reverse orientation, the reverse complement of the sequence is used. The term [fixed] is added to the contig name.
  • if the contig is not circular, the contig is split into two contigs at that hit. If the hit is in reverse orientation, the reverse complement of the sequence is used. The suffix _L and the term [split] are added to the name of the first contig; the suffix _R and the term [fixed] are added to the name of the second contig.
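The two cases can be illustrated with plain string operations. This is a sketch, not the tool's implementation: all function names are hypothetical, hit_pos is assumed to be the 0-based forward-strand index of the first base of the gene, the exact placement of the name additions is assumed, and for reverse hits the whole contig is reverse-complemented before rotating or splitting.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def _orient(seq: str, hit_pos: int, reverse: bool) -> Tuple[str, int]:
    """Return the sequence and hit index on the strand that carries the hit."""
    if reverse:
        # Forward index i corresponds to index len(seq) - 1 - i on the reverse strand.
        return reverse_complement(seq), len(seq) - 1 - hit_pos
    return seq, hit_pos


def fix_circular(name: str, seq: str, hit_pos: int, reverse: bool = False) -> Tuple[str, str]:
    """Rotate a circular contig so that it starts at the hit."""
    s, p = _orient(seq, hit_pos, reverse)
    return f"{name} [fixed]", s[p:] + s[:p]


def fix_linear(name: str, seq: str, hit_pos: int, reverse: bool = False) -> List[Tuple[str, str]]:
    """Split a non-circular contig into two contigs at the hit."""
    s, p = _orient(seq, hit_pos, reverse)
    return [(f"{name}_L [split]", s[:p]), (f"{name}_R [fixed]", s[p:])]
```

With the toy sequence AAAACCCG and a forward hit at index 4, the circular case yields CCCGAAAA, while the non-circular case yields the two contigs AAAA and CCCG.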


A single file or a directory can be used for input. If a directory is used, all files with a FASTA file format extension are processed. Files in the output directory may be overwritten.
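Input collection for a file or a directory could be sketched as below. The function name is hypothetical, and the set of extensions is an assumption, since the text does not list which extensions count as FASTA.

```python
from pathlib import Path
from typing import List, Union

# Assumed list of FASTA extensions; the tool's actual list is not documented here.
FASTA_EXTENSIONS = {".fasta", ".fa", ".fna", ".fas"}


def collect_inputs(path: Union[str, Path]) -> List[Path]:
    """Return the single file, or all FASTA files directly inside a directory."""
    p = Path(path)
    if p.is_file():
        return [p]
    return sorted(
        f for f in p.iterdir()
        if f.is_file() and f.suffix.lower() in FASTA_EXTENSIONS
    )
```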


Note that this function is useful only for data with nearly complete chromosomes and plasmids (e.g. long-read data).

Libraries used

for dnaA (chromosomes): dnaA-library (amino acid) from Circlator

for rep (plasmids): rep-library (nucleotide) from Mob-Suite

BLAST parameters

The following BLAST parameters are used:

for dnaA (amino acid library): matrix = BLOSUM62, word size = 3, mismatch penalty = -3, match reward = 1, gap open costs = 11, gap extension costs = 1

for rep (nucleotide library): word size = 11, mismatch penalty = -3, match reward = 1, gap open costs = 5, gap extension costs = 2
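For orientation, the rep settings map onto NCBI BLAST+ command-line options roughly as in the sketch below. Which BLAST program and version the function calls internally is not stated, so the use of blastn, the file names, and the tabular output format are assumptions. The sketch only builds the argument list and does not run BLAST; the amino acid case is not covered.

```python
from typing import List


def build_rep_blastn_command(query_fasta: str, rep_library_fasta: str) -> List[str]:
    """Build a blastn argument list using the rep parameters listed above."""
    return [
        "blastn",                       # assumed program for a nucleotide library
        "-query", query_fasta,          # contig(s) to be searched
        "-subject", rep_library_fasta,  # rep library as a FASTA file
        "-word_size", "11",
        "-penalty", "-3",               # mismatch penalty
        "-reward", "1",                 # match reward
        "-gapopen", "5",
        "-gapextend", "2",
        "-outfmt", "6",                 # tabular output, an assumption
    ]
```

The resulting list could be passed to subprocess.run on a system where BLAST+ is installed.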