Transmutator is freeware and can be downloaded from http://www.nouspikel.com/transmute.zip
Source code is available upon request. Please report bugs to: nouspikel(at)yahoo(dot)com
cDNA sequence: Here you copy-paste the sequence of the gene you are interested in. Only A, T, G, C and U are considered, in lower or upper case. This means that numbering is ignored, and so are ambiguity codes (e.g. 'N').
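To illustrate the filtering rule described above, here is a minimal Python sketch. It is not Transmutator's own code: the function name `clean_sequence` is hypothetical, and leaving the case and the U characters untouched is an assumption.

```python
import re

def clean_sequence(raw: str) -> str:
    """Keep only A, C, G, T and U (either case); drop everything else."""
    # Digits, spaces, line breaks and ambiguity codes such as 'N' are removed.
    return re.sub(r"[^ACGTUacgtu]", "", raw)

print(clean_sequence("1 atggcN TTU 61"))  # prints: atggcTTU
```

Because numbering and ambiguity codes are dropped, positions in the cleaned sequence may not match the positions printed in the original database entry.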
ATG: If the sequence you pasted does not start at the ATG, you can enter the position of the ATG in this field.
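The arithmetic behind this field can be sketched as follows. This is only an illustration: the function name `cdna_to_pasted_position` is hypothetical, and it assumes that the ATG field holds the 1-based position of the A of the ATG in the pasted sequence. It follows the usual convention that c.1 is the A of the ATG, c.-1 is the nucleotide just before it, and there is no c.0.

```python
def cdna_to_pasted_position(c_pos: int, atg_pos: int) -> int:
    """Convert a c. position into a 1-based position in the pasted sequence."""
    if c_pos == 0:
        raise ValueError("there is no position c.0")
    if c_pos > 0:
        # c.1 is the A of the ATG itself.
        return atg_pos + c_pos - 1
    # Negative positions count backwards from the nucleotide before the ATG.
    return atg_pos + c_pos

print(cdna_to_pasted_position(1, 57))   # prints: 57
print(cdna_to_pasted_position(-1, 57))  # prints: 56
```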
Name: This optional field allows you to assign a name to your sequence. If you terminate the program by clicking on [Save & Exit], the current sequence is saved and will be loaded next time you start the program. The name field may thus come in handy to remember what the sequence was.
DNA mutation: Here you enter the DNA mutation, according to the syntax rules suggested by den Dunnen and Antonarakis. Note that the initial "c." is optional. Upper- and lower-case characters are accepted.
It is possible to enter several mutations on the same allele, by separating them with semicolons (with or without enclosing brackets):
Numbered from: Indicates the numbering scheme used in the mutation description. By convention, numbering should start from the ATG, but some papers (particularly those published before the guidelines were established) number nucleotides with respect to a sequence deposited in a database. This sequence does not necessarily begin with the ATG, implying that mutation numbering must be adjusted. You can specify the adjustment in this list, or pick an option from the drop list:
Output: With the drop list, you can select the type of output that you wish: wild-type DNA, mutant DNA, wild-type protein, or mutant protein.
[Copy] Allows you to copy the contents of the "Output" field to the clipboard in a single click. This could also be done by selecting the whole field and pressing Ctrl-C.
[Options] Calls up a dialog box that lets you specify various formatting options.
Protein mutation: This field will contain the consequences of the DNA mutation at the protein level, described according to the syntax of den Dunnen and Antonarakis. The "Options" dialog box lets you select various alternative formats. One important notion is that protein mutations must be described without assuming knowledge of the DNA mutation. So for instance, if an insertion in the DNA causes a frameshift that happens to match the end of the protein, it will be listed as a deletion. We know it's a frameshift from the DNA sequence, but at the protein level it looks like a deletion and must be listed as such.
[Copy] Copies the contents of the "Protein mutation" field to the clipboard.
[Reset] This button clears the input field and resets all sequences.
[Help] This button opens the help file (which you're reading now) into your default browser.
[Save & Exit] Quits the program after saving your options
and the current wild-type DNA sequence. These are saved in a file called
"Transmutator.opt", from which they will be reloaded next time you start
the program.
Important: if this file gets corrupted, it can prevent the program
from starting normally. In such a case, delete the .OPT file and start
over.
Note that nothing gets saved if you leave the program by closing the
main window, which allows you to leave without changing your options.
Group: Nucleotides can be displayed in groups separated with spaces, to facilitate reading. Select the group size from the drop box: 3 nucleotides (i.e. one codon), 10 nucleotides, or the whole line.
Nt per line: Number of nucleotides to display on each line.
Number lines: Check this box to have each line start with the
number of its first nucleotide.
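The three DNA formatting options above combine as in the following Python sketch. It only illustrates the idea: the function `format_dna`, its parameter names and the width of the number column are assumptions, not the program's actual layout.

```python
def format_dna(seq: str, group: int = 10, per_line: int = 60,
               number_lines: bool = True) -> str:
    """Split seq into lines of per_line nucleotides, each cut into groups."""
    lines = []
    for start in range(0, len(seq), per_line):
        chunk = seq[start:start + per_line]
        # A group size >= per_line behaves like the "whole line" choice.
        groups = [chunk[i:i + group] for i in range(0, len(chunk), group)]
        text = " ".join(groups)
        if number_lines:
            # Prefix the line with the number of its first nucleotide.
            text = f"{start + 1:>6} {text}"
        lines.append(text)
    return "\n".join(lines)

print(format_dna("ATGGCCAAGTTT", group=3, per_line=6))
```

With a group size of 3 and 6 nucleotides per line, the 12-nucleotide example is printed on two lines that start with the numbers 1 and 7.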
Group: Amino acids can be displayed in groups separated with spaces, to facilitate reading. Enter the desired group size, or 999 for the whole line.
Nt per line: Number of amino acids to display on each line.
Number lines: Check this box to have each line start with the number of its first amino acid.
End at first stop codon: Check this box to have the protein end at the first stop codon, even if the DNA sequence is longer.
Display stops as: These two fields let you decide how to display
stop codons in 3-letter and 1-letter code respectively. For instance, you
could use "Xxx" or "***" or "Stop" in 3-letter mode, and "X" or "*" in
1-letter mode.
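The effect of "End at first stop codon" and of the stop symbol can be sketched in Python as follows, in 1-letter mode. This is an illustration under stated assumptions, not the program's code: the function `translate` is hypothetical, it uses the standard genetic code, it includes the stop symbol in the output, and it ignores a trailing incomplete codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(dna: str, end_at_first_stop: bool = True, stop: str = "*") -> str:
    """Translate dna from its first nucleotide, in 1-letter code."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            protein.append(stop)  # user-chosen stop symbol
            if end_at_first_stop:
                break
        else:
            protein.append(aa)
    return "".join(protein)

print(translate("ATGGCCTAAGGG"))                           # prints: MA*
print(translate("ATGGCCTAAGGG", end_at_first_stop=False))  # prints: MA*G
print(translate("ATGGCCTAAGGG", stop="X"))                 # prints: MAX
```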
Stop: Here you can enter the string to be used to represent a stop codon. For instance, "*" or "Stop".
ATG change: Altering the initial ATG can have various consequences: no protein produced (coded as: p.0), translation starting at a downstream ATG or even an upstream ATG. Without experimental evidence, the recommendation is to use the "unknown consequences" syntax: p.Met1? However, you have the option to use p.Met1Stop whenever the change would lead to a stop, or to use p.0.
Stop lost: Allows different syntaxes to represent mutations of the final stop codon to a valid amino acid, resulting in an extension of the protein. *110Alaext*17 means that the original protein now contains an extra 17 amino acids, whereas Nostop110stop127 indicates that the mutated protein now contains a stop codon at position 127. Both notations are equivalent, but den Dunnen & Antonarakis recommend the first one.
Substitutions: Gives you the choice between the official syntax listing the old and new amino acid (p.Arg20Ser) or a short form that only lists the new amino acid (Ser20).
Frameshift: Allows different syntaxes to represent frameshifts. Arg20fs*25 is the original recommendation, listing the first modified amino-acid and the new position of the stop codon (counting from the ATG). This recommendation was later modified to include the new identity of the first frameshifted amino acid and to number the stop codon from the start of the frameshift (not from the ATG): Arg20Profs*6. Finally, you can use a short form that does not reflect the size of the frameshift: Arg20fs.
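The relation between the three frameshift syntaxes can be sketched in Python. This is a hypothetical helper, not Transmutator's code: `frameshift_names` takes the wild-type and mutant proteins in 1-letter code, and assumes that they really differ, that the first changed residue is not itself a stop, and that the mutant ends with "*".

```python
THREE = {"A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
         "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
         "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
         "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val"}

def frameshift_names(wt: str, mut: str) -> dict:
    """Return the three frameshift syntaxes for a wild-type/mutant pair."""
    # 0-based index of the first amino acid that differs.
    first = next(i for i, (a, b) in enumerate(zip(wt, mut)) if a != b)
    pos = first + 1
    stop_from_atg = mut.index("*") + 1
    # Counted from the first changed residue, which is number 1.
    stop_from_shift = stop_from_atg - first
    old, new = THREE[wt[first]], THREE[mut[first]]
    return {
        "original": f"{old}{pos}fs*{stop_from_atg}",
        "revised": f"{old}{pos}{new}fs*{stop_from_shift}",
        "short": f"{old}{pos}fs",
    }

wt = "M" + "A" * 18 + "R" + "GGGGGGGGGG*"
mut = "M" + "A" * 18 + "P" + "LLLL*"
print(frameshift_names(wt, mut))
```

For this made-up pair the helper returns Arg20fs*25, Arg20Profs*6 and Arg20fs: the stop codon is at position 25 counting from the ATG, which is position 6 counting from the first changed amino acid at position 20.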
Minimum end-of-mutant match for indels: Each and every frameshift can be considered as an indel: insertion of the frameshifted bit, deletion of the rest of the protein. It is even more tempting to do so when the last frameshifted amino acid(s) match the sequence of the wild-type protein (at the limit, if the entire frameshift matches the WT protein, it looks like a simple deletion). Conversely, a deletion might be considered as a frameshift that matches the end of the WT protein out of sheer luck. Statistically, a 1-amino-acid match will happen about 1 time in 20, a 2-amino-acid match 1 time in 400, etc. The only way we can be sure that it's a frameshift is if we know what happened at the DNA level, but we are not supposed to use DNA information when describing a protein mutation... So Transmutator offers you a compromise solution: you can enter a minimum number of amino acids that must be matched between the end of the mutant protein and the WT sequence. If this cutoff is met, the mutation will be described as an indel (or a deletion, as the case might be). Otherwise, it will be considered a frameshift. Thus, if you enter 0, frameshifts will never be reported. If you enter 1, frameshifts will only be reported if no match is found between the frameshifted bit and the WT protein. If you enter 2, you demand a 2-amino-acid match to report an indel (this is the default). Entering a very high number here will cause all deletions to be diagnosed as frameshifts.
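The cutoff rule and the chance-match figures can be sketched in Python. This is a simplified illustration and not the program's actual algorithm: it assumes that the "match" is the number of identical amino acids at the very end of both proteins (stop symbols removed), and that all 20 amino acids are equally likely. The function names are hypothetical.

```python
from fractions import Fraction

def common_suffix_len(wt: str, mut: str) -> int:
    """Number of identical amino acids at the end of both proteins."""
    n = 0
    while n < min(len(wt), len(mut)) and wt[-1 - n] == mut[-1 - n]:
        n += 1
    return n

def classify(wt: str, mut: str, min_match: int = 2) -> str:
    """Apply the cutoff: enough matching end residues means an indel."""
    if common_suffix_len(wt, mut) >= min_match:
        return "indel or deletion"
    return "frameshift"

def chance_match_probability(n: int) -> Fraction:
    """Probability that n end residues match by chance alone."""
    return Fraction(1, 20) ** n

print(chance_match_probability(2))                 # prints: 1/400
print(classify("MKTAYIAKQR", "MKTGHWQR"))          # prints: indel or deletion
print(classify("MKTAYIAKQR", "MKTGHWQR", 3))       # prints: frameshift
```

In the example, the two proteins share their last 2 amino acids (QR), so the default cutoff of 2 reports an indel while a cutoff of 3 reports a frameshift.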
Insertions: Allows different options to represent insertions. With Lys2_Met3insGlnSerLys the inserted amino acids are spelled out, whereas Lys2_Met3ins3 only indicates the number of inserted amino acids. You can select "Use number if more than" to switch from one syntax to the other, depending on the size of the insertion. Enter the threshold value in the nearby box: if the insertion is larger than the specified value, the numeric format will be used, otherwise the inserted amino acids will be listed.
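The threshold rule can be sketched in a few lines of Python. The function `format_insertion` and its parameter names are hypothetical and only illustrate the switch between the two syntaxes.

```python
def format_insertion(left: str, right: str, inserted: list,
                     use_number_if_more_than: int) -> str:
    """Describe an insertion between the flanking residues left and right."""
    if len(inserted) > use_number_if_more_than:
        # Insertion larger than the threshold: give only its size.
        return f"{left}_{right}ins{len(inserted)}"
    # Otherwise spell out the inserted amino acids.
    return f"{left}_{right}ins{''.join(inserted)}"

print(format_insertion("Lys2", "Met3", ["Gln", "Ser", "Lys"], 5))
# prints: Lys2_Met3insGlnSerLys
print(format_insertion("Lys2", "Met3", ["Gln", "Ser", "Lys"], 2))
# prints: Lys2_Met3ins3
```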
Detect duplications: If this box is not checked, duplications
will be listed as insertions (a duplication is a special type of insertion).
If the box is checked, perfect repeats will be listed as duplications:
p.His7_Gln8dup (or p.Gly4dup if only one amino acid is involved). Be aware
that duplications in DNA sometimes result in a codon change at the junction
between the two repeats. In such cases, since there isn't a perfect repeat
at the protein level, the mutation will always be listed as an insertion.
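The "perfect repeat" test can be sketched in Python as follows. This is an illustration, not the program's code: the function `describe_insertion` is hypothetical, it works on 1-letter sequences, and it assumes that the insertion is not at the very end of the protein.

```python
THREE = {"A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
         "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
         "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
         "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val"}

def describe_insertion(wt: str, after: int, inserted: str,
                       detect_dup: bool = True) -> str:
    """Name an insertion placed after 1-based position `after` of wt."""
    n = len(inserted)
    # Perfect repeat: the inserted residues copy the n residues before them.
    if detect_dup and after >= n and wt[after - n:after] == inserted:
        first = after - n + 1
        if n == 1:
            return f"p.{THREE[inserted]}{first}dup"
        return (f"p.{THREE[wt[first - 1]]}{first}_"
                f"{THREE[wt[after - 1]]}{after}dup")
    left = f"{THREE[wt[after - 1]]}{after}"
    right = f"{THREE[wt[after]]}{after + 1}"
    return f"p.{left}_{right}ins" + "".join(THREE[a] for a in inserted)

wt = "MKAGLTHQSV"
print(describe_insertion(wt, 8, "HQ"))  # prints: p.His7_Gln8dup
print(describe_insertion(wt, 4, "G"))   # prints: p.Gly4dup
print(describe_insertion(wt, 8, "HR"))  # prints: p.Gln8_Ser9insHisArg
```

The last call mimics a DNA duplication whose junction codon changed: the repeat is no longer perfect at the protein level, so it is named as an insertion.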