Getting started with trimAl v1.1
Thank you for choosing trimAl v1.1 to trim your alignments. You will find it very easy to get familiar with the program. The first thing to decide is whether you will use trimAl in its command line version or through the web interface. The command line version is faster and offers more options, so it is recommended if you are going to use trimAl extensively.
The trimAl webserver included in Phylemon provides a user-friendly interface and the opportunity to pass your trimmed alignment on to many different phylogenetic analyses.
Input and Output formats
trimAl v1.1 reads and produces multiple sequence alignments in Phylip interleaved format.
Command line version
You first need to install trimAl v1.1 on your computer.
At the following link (trimAl v1.1 downloads) you will find the necessary files and instructions to get trimAl properly installed on your computer, whether you are a Linux, MacOS or Windows user. Just follow the instructions provided. Note that you will need a g++ compiler and appropriate permissions on your computer.
Once you have trimAl v1.1 installed, just type “trimal” at your prompt to see the basic commands you can use.
Now you can start using the different trimming algorithms and see which best suits your alignments.
A very common way of using trimAl v1.1 to trim an alignment is to use just a gap threshold (the minimum fraction of sequences without a gap that you require to consider a column of “enough quality”):
- trimal -in inputfile -gt 1
will remove all columns with any gap (equivalent to the -nogaps option).
- trimal -in inputfile -gt 0.9
will remove all columns with gaps in more than 10% of the sequences.
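The gap threshold described above can be illustrated with a minimal Python sketch. This is not trimAl's own code: the function names and the toy alignment are hypothetical, and the sketch only assumes that a column is kept when its fraction of gap-free sequences is at least the threshold.

```python
def gap_free_fraction(column):
    """Fraction of sequences that have a residue (not '-') in this column."""
    return sum(1 for ch in column if ch != "-") / len(column)


def trim_by_gap_threshold(sequences, gt):
    """Keep only the columns whose gap-free fraction is at least gt.

    Illustrative helper (hypothetical name), not part of trimAl.
    """
    columns = list(zip(*sequences))  # transpose rows into columns
    kept = [i for i, col in enumerate(columns) if gap_free_fraction(col) >= gt]
    return ["".join(seq[i] for i in kept) for seq in sequences]


# Toy alignment of four sequences, invented for this example
alignment = [
    "MK-LV",
    "MKALV",
    "M-ALV",
    "MKAL-",
]

print(trim_by_gap_threshold(alignment, 1.0))   # only fully gap-free columns survive
print(trim_by_gap_threshold(alignment, 0.75))  # one gap in four sequences is tolerated
```

With a threshold of 1 only the two columns without any gap remain, while a threshold of 0.75 keeps every column of this toy alignment.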
If you feel that this will be too strict for some alignments and prefer to set a minimum coverage for the trimmed alignment (that is, the trimmed alignment will retain at least a given percentage of the columns in the original alignment), you can do so as follows:
- trimal -in inputfile -gt 0.9 -cons 60
will remove all columns with gaps in more than 10% of the sequences, unless this removes more than 40% of the columns in the original alignment. Since -cons 60 asks to conserve at least 60% of the columns, in such cases trimAl v1.1 will add back the necessary number of columns (in decreasing order of scores) so that the minimum coverage is respected.
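The minimum coverage rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not trimAl's implementation: the function name and the scores are hypothetical, and rounding the required number of columns up is an assumption of this sketch.

```python
import math


def select_columns(scores, threshold, min_coverage_pct):
    """Return the sorted indices of the columns that are kept.

    scores: one score per column in [0, 1] (for example the gap-free fraction).
    Illustrative helper (hypothetical name), not part of trimAl.
    """
    n = len(scores)
    kept = {i for i, s in enumerate(scores) if s >= threshold}
    # Number of columns demanded by the coverage percentage
    # (rounding up is an assumption of this sketch)
    required = math.ceil(n * min_coverage_pct / 100)
    if len(kept) < required:
        # Re-add rejected columns, best score first, until coverage is met
        rejected = sorted(
            (i for i in range(n) if i not in kept),
            key=lambda i: scores[i],
            reverse=True,
        )
        kept.update(rejected[: required - len(kept)])
    return sorted(kept)


# Invented per-column scores for a 10-column alignment
scores = [1.0, 0.5, 0.95, 0.3, 0.8, 1.0, 0.6, 0.2, 0.9, 0.7]
print(select_columns(scores, 0.9, 0))   # threshold only
print(select_columns(scores, 0.9, 60))  # threshold plus 60% minimum coverage
```

Here only four of the ten columns pass the 0.9 threshold, so with a 60% minimum coverage the two best-scoring rejected columns are added back.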
What we have said for the gap threshold also holds for the similarity threshold, although this one is more complicated to explain. It is a similarity score that ranges from 0 to 1, is based on a Blosum62 matrix (also taking into account the gaps in a column), and is more thoroughly described in the trimAl publication.
- trimal -in inputfile -st 0.9 -cons 80
will remove all columns with a similarity score lower than 0.9, unless this removes more than 20% of the columns in the original alignment. Since -cons 80 asks to conserve at least 80% of the alignment, in such cases trimAl v1.1 will add back the necessary number of columns (in decreasing order of scores) so that the minimum coverage is respected.
Yet another threshold that you can use is based on the comparison of different alignments. Sometimes one does not know which alignment algorithm will perform best (or which parameters, e.g. gap penalties). A way out is to produce different alignments with the different algorithms and then choose the alignment that contains the most “common” residue pairings, that is, the residue pairs that are recovered by most algorithms.
trimAl v1.1 can do this for you: just provide an input file containing a list of the paths to the different alignments and type
- trimal -compareset listfile
You can then trim the output alignment (the “best” alignment) with other algorithms or trim it based on the consistency of its residue pairings, for instance:
- trimal -compareset listfile -ct 0.5
This will trim the “best” alignment, removing all columns with a consistency score lower than 0.5.
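The idea of picking the alignment with the most “common” residue pairings can be sketched as follows. This is a simplified illustration and not the scoring used by trimAl: the function names, the toy alignments and the exact averaging are assumptions made for this example.

```python
from itertools import combinations


def residue_pairs(alignment):
    """Set of aligned residue pairs, each as ((seq_a, pos_a), (seq_b, pos_b)).

    Positions count residues within each ungapped sequence, so the same
    pair can be recognised in differently gapped alignments.
    """
    counters = [0] * len(alignment)
    pairs = set()
    for column in zip(*alignment):
        present = []
        for seq_idx, ch in enumerate(column):
            if ch != "-":
                present.append((seq_idx, counters[seq_idx]))
                counters[seq_idx] += 1
        pairs.update(combinations(present, 2))
    return pairs


def most_consistent(alignments):
    """Index of the alignment whose pairs are, on average, most widely shared.

    Illustrative helper (hypothetical name), not part of trimAl.
    """
    pair_sets = [residue_pairs(a) for a in alignments]

    def score(k):
        pairs = pair_sets[k]
        if not pairs:
            return 0.0
        # For every pair, the fraction of all alignments that contain it
        shared = (sum(p in other for other in pair_sets) / len(pair_sets) for p in pairs)
        return sum(shared) / len(pairs)

    return max(range(len(alignments)), key=score)


# Three invented alignments of the same three sequences (MKL, MKL, ML)
aln_a = ["MKL", "MKL", "ML-"]
aln_b = ["MKL", "MKL", "M-L"]
aln_c = ["MKL", "MKL", "-ML"]
print(most_consistent([aln_a, aln_b, aln_c]))
```

In this toy case the second alignment shares one set of pairings with the first and another with the third, so it is the one selected.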
For all options you can score blocks of columns instead of single columns. To do so, specify the number of columns on each side of a given column to include in the computation. For example, setting -w to “2” will compute the score for column i by averaging the scores of the 5 columns ranging from i-2 to i+2.
- trimal -in inputfile -st 0.9 -cons 80 -w 2
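The window averaging behind -w can be illustrated with a short Python sketch. The function name and the scores are hypothetical, and shortening the window at the two ends of the alignment is an assumption of this sketch rather than documented trimAl behaviour.

```python
def windowed_scores(scores, w):
    """Average each column's score with up to w neighbours on each side.

    Illustrative helper (hypothetical name), not part of trimAl. Near the
    ends of the alignment the window is simply shortened (an assumption).
    """
    n = len(scores)
    averaged = []
    for i in range(n):
        lo, hi = max(0, i - w), min(n, i + w + 1)
        window = scores[lo:hi]  # columns i-w .. i+w, clipped to the alignment
        averaged.append(sum(window) / len(window))
    return averaged


# Invented per-column scores
scores = [1.0, 0.5, 0.0, 0.5, 1.0]
print(windowed_scores(scores, 2))  # the middle column averages all 5 columns
print(windowed_scores(scores, 0))  # w = 0 leaves the scores unchanged
```

A column with a poor score surrounded by good columns is rescued by the window, while an isolated good column inside a poor region is pulled down.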
Finally, you can use one of the two implemented heuristic methods to define the thresholds automatically. In trimAl v1.1, the -strict and -relaxed options define the thresholds according to the cumulative distribution of gaps in the alignment. These automatic heuristic methods basically find abrupt changes in this distribution that correspond to thresholds that have been shown to be nearly optimal in a benchmark based on simulated sequences (see our publication for more details).
- trimal -in inputfile -strict
- trimal -in inputfile -relaxed