The following consensus calling procedure is used to calculate a consensus for BAM files, and to correct the consensus when converting an AMOS file that was created by Velvet into an ACE file.

Consensus Base

For each position, determine a base separately for the forward reads and for the reverse reads:

  1. If the majority of reads at this position contain a gap, return a gap.
  2. Otherwise, calculate the sum of read qualities for each base (A, C, G, T) at the position (or the base count, if no qualities are used).
  3. Return the base with the maximum quality sum.

If the forward and reverse reads yield different bases, merge the forward and reverse reads into a single list and repeat the algorithm above on that list.
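
The steps above can be sketched in Python as follows. This is only an illustration, not the actual implementation: the names call_base and consensus_base, the gap symbol '*', the strand symbols '+' and '-', and the (base, quality, strand) tuple layout are all assumptions made for this example. Tie-breaking between bases with equal sums is not specified above; the sketch simply keeps the first base it encountered.

```python
from collections import defaultdict

GAP = "*"            # assumed gap symbol
BASES = {"A", "C", "G", "T"}

def call_base(observations, use_qualities=True):
    """Call a base from a list of (base, quality) pairs at one position.

    Returns None if the list is empty or contains no A/C/G/T and no gap majority.
    """
    if not observations:
        return None
    # Step 1: a majority of gaps yields a gap.
    gap_count = sum(1 for base, _ in observations if base == GAP)
    if 2 * gap_count > len(observations):
        return GAP
    # Step 2: sum the qualities (or count the reads) per base.
    sums = defaultdict(int)
    for base, quality in observations:
        if base in BASES:
            sums[base] += quality if use_qualities else 1
    if not sums:
        return None
    # Step 3: return the base with the maximum sum (ties: first one seen).
    return max(sums, key=sums.get)

def consensus_base(observations, use_qualities=True):
    """observations: list of (base, quality, strand), strand is '+' or '-'."""
    forward = [(b, q) for b, q, s in observations if s == "+"]
    reverse = [(b, q) for b, q, s in observations if s == "-"]
    fwd_base = call_base(forward, use_qualities)
    rev_base = call_base(reverse, use_qualities)
    if fwd_base is None:
        return rev_base
    if rev_base is None or fwd_base == rev_base:
        return fwd_base
    # Directions disagree: repeat the procedure on the merged list.
    return call_base(forward + reverse, use_qualities)
```

For example, with forward reads A (30) and A (20) and reverse reads C (40) and C (25), the two directions disagree, so the merged list is evaluated: A sums to 50 and C to 65, and C is called.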

Finally, do an ambiguity check:

  • If the sum of qualities of reads that contain the resulting base at the position is less than 60% of the total sum of read qualities at the position, an ambiguity 'N' is returned. If no quality values are given, the base count sums are used.
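
The ambiguity check can be sketched as below. The function name ambiguity_check and the (base, quality) tuple layout are assumptions for this example, and the sketch assumes it is only called for a non-gap result. Whether gap entries contribute to the total is not stated above; here every entry passed in is counted. Integer arithmetic is used so that the 60% threshold is not affected by floating-point rounding.

```python
def ambiguity_check(observations, base, use_qualities=True, min_percent=60):
    """Return 'N' if `base` has too little support, otherwise return `base`.

    observations: list of (base, quality) pairs from all reads at the position.
    """
    def weight(quality):
        # Without quality values, every read counts as 1.
        return quality if use_qualities else 1

    total = sum(weight(q) for _, q in observations)
    support = sum(weight(q) for b, q in observations if b == base)
    # "Less than 60% of the total" written without floating point.
    if support * 100 < min_percent * total:
        return "N"
    return base
```

For example, A (30) against C (25) gives A a share of 30 out of 55, which is below 60%, so 'N' is returned. Two reads of A (30) against one C (40) give A exactly 60%, so A is kept.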

Consensus Base Quality

Consensus gaps always have quality 0. If the consensus is not a gap, a MIRA-like consensus quality calculation is used:

In each direction, the best quality for the base counts in full (100%); to this, add 10% of the next-best quality for the base in that direction.

Apply the same procedure to the other direction (if available), then add both values and cap the sum at 90. In general, the results are almost the same (mostly slightly higher) as with the more complicated (and time-consuming) old variant.

Example: 

+ A 30   -> 30     \
+ A 20   ->  2      \
+ A 20              / = 32   \
+ A 20             /          \
                               > = 60
- A 26   -> 26     \          /
- A 20   ->  2      >  = 28  /
- A 15             /

(+ = forward, - = reverse)
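
The quality calculation can be sketched as below, using the numbers from the example. The function names, the gap symbol '*', and the use of integer division for the 10% term are assumptions for this sketch; the rounding rule is not specified here. The input lists are assumed to hold the qualities of the reads that carry the consensus base in each direction.

```python
def direction_quality(qualities):
    """Best quality in full plus 10% of the next-best quality, for one direction."""
    if not qualities:
        return 0
    ranked = sorted(qualities, reverse=True)
    quality = ranked[0]
    if len(ranked) > 1:
        quality += ranked[1] // 10  # assumed rounding: integer division
    return quality

def consensus_quality(base, forward_qualities, reverse_qualities, cap=90):
    """Quality of a consensus base; gaps always get quality 0."""
    if base == "*":  # assumed gap symbol
        return 0
    total = direction_quality(forward_qualities) + direction_quality(reverse_qualities)
    return min(total, cap)

# Values from the example: forward 30 + 2 = 32, reverse 26 + 2 = 28.
print(consensus_quality("A", [30, 20, 20, 20], [26, 20, 15]))  # 60
```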

Uncovered Positions in BAM Files

Exactly one consensus is always called for each reference sequence in the BAM file. If the consensus has uncovered regions, the covered regions on either side are concatenated, which means the uncovered regions are treated as deletions.
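
A minimal sketch of this behaviour, assuming the per-position calls of one reference sequence are held in a list in which None marks a position without any read coverage (the function name and this representation are hypothetical):

```python
def join_covered(per_position_calls):
    """Build one consensus string, skipping positions without coverage."""
    return "".join(call for call in per_position_calls if call is not None)

# Two uncovered positions in the middle are dropped, as a deletion would be.
print(join_covered(["A", "C", None, None, "G", "T"]))  # ACGT
```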