Degenerate primers

Degenerate primers are useful for pulling out one part of a gene sequence when you only know the gene sequence in related organisms. The more distant those related organisms, the more difficult it can be to design primers.

One solution is to gather sequences from a large range of organisms (say all vertebrates if you are working on fishes), translate them to amino acid sequences and align them. Based on these alignments, you can identify regions which are highly conserved at the amino acid level. These conserved regions become possible locations for degenerate primers.

Degenerate primers are designed to match an amino acid sequence. An amino acid sequence and the degenerate DNA sequence that encodes it would be

P   F   T   K
CCn TTy ACn AAr

Here n = A, C, G or T
y = C or T
r = A or G
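As a rough sketch of this back-translation step, the short Python snippet below maps each amino acid to one degenerate codon written in IUPAC codes. The names DEGENERATE_CODON and back_translate are made up for illustration. It assumes the standard genetic code, and for the 6 fold amino acids (L, S, R) it lists only the four-codon family, which is a simplification rather than a full description of those amino acids.

```python
# One degenerate codon per amino acid, in IUPAC nucleotide codes
# (standard genetic code assumed).
# N = A/C/G/T, Y = C/T, R = A/G, H = A/C/T.
# Assumption: for the 6 fold amino acids L, S and R only the four-codon
# family is listed (CTN, TCN, CGN), so two codons of each are not covered.
DEGENERATE_CODON = {
    "A": "GCN", "C": "TGY", "D": "GAY", "E": "GAR", "F": "TTY",
    "G": "GGN", "H": "CAY", "I": "ATH", "K": "AAR", "L": "CTN",
    "M": "ATG", "N": "AAY", "P": "CCN", "Q": "CAR", "R": "CGN",
    "S": "TCN", "T": "ACN", "V": "GTN", "W": "TGG", "Y": "TAY",
}


def back_translate(peptide):
    """Return the list of degenerate codons for a one-letter peptide string."""
    return [DEGENERATE_CODON[aa] for aa in peptide.upper() if not aa.isspace()]


print(" ".join(back_translate("P F T K")))  # CCN TTY ACN AAR
```

Spaces in the input are ignored, so the peptide can be typed the same way it is written in these notes.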

For PCR, you need two conserved regions for locating the forward and reverse primers. These should each be at least 5 AA long (preferably 6 or 7 AA long) and fairly close together. If these regions are too far apart, PCR efficiency will drop and you won't get much amplification. A reasonable target is 400 bp, though 200-600 bp should also work. The less degenerate your primers, the further apart they can be.

The next task is to determine which of the possible primers are the least degenerate. Looking at the genetic code, some amino acids are encoded by more codons than others. The AAcode table shows that the degeneracies are:

1 fold sites M W
2 fold sites F Y H Q N K D E C
3 fold sites I
4 fold sites V P T A G
6 fold sites L S R

In designing degenerate primers, you want to avoid 6 fold sites (L, S and R) and maximize the number of 1 or 2 fold sites in the region. One thing you can do is compare the degeneracies of the possible primers. To compute the degeneracy of a primer, multiply together the degeneracies of the contributing AA. For example, if you have a primer which matches the AA sequence D E W V P, this would correspond to a degeneracy of 2 * 2 * 1 * 4 * 4 = 64.
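The multiplication above is easy to script. The sketch below uses the fold values from the list above; CODON_COUNT, peptide_degeneracy and rank_windows are hypothetical names chosen for this example. It counts every codon of every amino acid, so a primer that covers only some of the codons for L, S or R, or that stops short of the last wobble base, will come out less degenerate than this estimate.

```python
from math import prod

# Number of codons per amino acid, taken from the fold list above.
CODON_COUNT = {
    "M": 1, "W": 1,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2, "K": 2, "D": 2, "E": 2, "C": 2,
    "I": 3,
    "V": 4, "P": 4, "T": 4, "A": 4, "G": 4,
    "L": 6, "S": 6, "R": 6,
}


def peptide_degeneracy(peptide):
    """Multiply the codon counts of every amino acid in the peptide."""
    return prod(CODON_COUNT[aa] for aa in peptide.upper() if not aa.isspace())


def rank_windows(region, length=6):
    """List every window of the given length, least degenerate first."""
    region = "".join(region.split()).upper()
    windows = [region[i:i + length] for i in range(len(region) - length + 1)]
    return sorted(windows, key=peptide_degeneracy)


print(peptide_degeneracy("D E W V P"))  # 64
print(rank_windows("D E W V P", 4))     # ['DEWV', 'EWVP']
```

Running rank_windows over a conserved region gives a quick shortlist of candidate primer sites to compare by eye.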

Then you just need to weigh the factors of degeneracy and distance separating the forward and reverse primers to select a pair to try.

One last trick is to add tails to the degenerate primers on the 5' ends. This helps to increase the PCR efficiencies of these primers by increasing primer length and hence annealing temperature. Although the tails do not help in the first few rounds of PCR when only the genomic template is being amplified, the tails do match in subsequent PCR cycles when you are amplifying the short PCR products containing the primers at each end. The tails which I have used successfully are:

5' end of forward primer GCGCGGAATTC (EcoRI)
5' end of reverse primer GCGCGCAAGCTT (HindIII)

These have the added advantage of including restriction sites which can be used for directional cloning. Alternatively, the tails have terminal G's at their 5' ends, which encourage Taq to add overhanging A's for use in TA cloning.

One pair of degenerate primers which I have used amplifies a 200 bp fragment of all the cone opsin genes. The two conserved AA regions span the 6th to 7th transmembrane (TM) regions. These AA sequences are

Forward: A S T Q K A E
Reverse: Y N P I/V I Y V

The forward primer, including the EcoRI tag, becomes: 5' GCGCGGAATTCGCNTCNACNCARAARGCNGA 3'

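This primer is 1024 fold degenerate: four N positions and two R positions give 4^4 * 2^2 = 1024. Note that serine is covered only by TCN, and the primer stops after the second base of the final glutamate codon. The most 3' base could be degenerate (for example, by extending the primer through the third position of the last codon), but it is quite expensive to have degenerate primers made where the first base synthesized (the most 3') is degenerate, because they must be made at a higher synthesis scale. For the cheapest degenerate primers, make the 3' base a single base.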
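If you would rather check the degeneracy of the finished oligo than of the AA sequence, you can count the bases allowed at each position of the IUPAC string. This is a minimal sketch; IUPAC_BASES and primer_degeneracy are names invented here, not part of any library.

```python
# Bases represented by each IUPAC nucleotide code.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_degeneracy(primer):
    """Multiply the number of possible bases at each position of the primer."""
    total = 1
    for base in primer.upper():
        if base.isspace():
            continue  # allow primers written with spaces between codons
        total *= len(IUPAC_BASES[base])
    return total


forward = "GCGCGGAATTCGCNTCNACNCARAARGCNGA"
print(primer_degeneracy(forward))  # 1024
```

The tail contributes nothing to the count because it contains only single bases.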

The reverse primer region, including the HindIII site and written as it reads on the coding strand, becomes: 5' TAY AAY CCN RTN ATN TAY GT AAGCTT 3'

This primer is also 1024 fold degenerate (three Y, one R and three N positions: 2^4 * 4^3 = 1024). The sequence above matches the coding strand. To work as a reverse primer, it must be on the other strand, so it has to be reverse complemented. The primer to order then becomes

Reverse primer: 5' GCGCGCAAGCTTAC RTA NAT NAY NGG RTT RTA 3'
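Reverse complementing a degenerate sequence by hand is error prone, because the degenerate codes have to be complemented too (R pairs with Y, N with N, and so on). Here is a minimal sketch that does it with a translation table; COMPLEMENT and reverse_complement are names chosen for this example.

```python
# IUPAC complements: A<->T, C<->G, R<->Y, K<->M, B<->V, D<->H;
# S, W and N are their own complements.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement a sequence written in IUPAC codes (spaces ignored)."""
    cleaned = "".join(seq.upper().split())
    return cleaned.translate(COMPLEMENT)[::-1]


coding = "TAY AAY CCN RTN ATN TAY GT"  # coding strand for Y N P I/V I Y V
tail = "GCGCGCAAGCTT"                  # HindIII tail for the 5' end
print(tail + reverse_complement(coding))
# GCGCGCAAGCTTACRTANATNAYNGGRTTRTA
```

The tail is added after reverse complementing, so that it ends up at the 5' end of the primer you order.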

These primers are at the upper end of workable degeneracy. However, they work at an annealing temperature of 40 °C.

KC 5/03