Marine Biological Laboratory Workshop on Molecular Evolution  



Modeltest

Modeltest (Posada and Crandall 1998) is a program that, in conjunction with PAUP*, selects the best-fit nucleotide substitution model for a set of aligned sequences. This model can then be implemented in maximum-likelihood and Bayesian phylogenetic analyses. The aim of this software is to facilitate comparisons between 56 alternative models using different criteria.

Model selection can be based on hierarchical likelihood ratio tests (hLRT), the Akaike Information Criterion (AIC = -2lnL + 2K; Akaike 1974), the corrected AIC (AICc = AIC + 2K(K+1)/(N-K-1); Hurvich and Tsai 1989, Sugiura 1978), or the Bayesian Information Criterion (BIC = -2lnL + KlogN; Schwarz 1978), where L is the model likelihood, K is the number of estimable parameters, and N is the sample size. AIC can be interpreted as the amount of information lost when we use a particular model to approximate the real process of nucleotide substitution; thus, the model with the smallest AIC is favored. Given equal priors for each of the competing models, the model with the smallest BIC corresponds to the model with the maximum posterior probability.
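The three formulas above can be checked by hand with a few lines of Python. This is only a sketch of the arithmetic, not part of Modeltest: the function name, the log-likelihood scores, the parameter counts, and the sample size N = 1000 are all made up for illustration, and K here counts only substitution-model parameters.

```python
import math

def information_criteria(lnL, K, N):
    """Return (AIC, AICc, BIC) for one model, using the formulas in the text."""
    if N - K - 1 <= 0:
        raise ValueError("AICc requires N > K + 1")
    aic = -2.0 * lnL + 2.0 * K
    aicc = aic + (2.0 * K * (K + 1)) / (N - K - 1)
    bic = -2.0 * lnL + K * math.log(N)
    return aic, aicc, bic

# Hypothetical (lnL, K) pairs; these are not real Modeltest results.
models = {
    "JC": (-5200.0, 0),
    "HKY+G": (-5100.0, 5),
    "GTR+I+G": (-5093.0, 10),
}
N = 1000  # assumed sample size

scores = {name: information_criteria(lnL, K, N) for name, (lnL, K) in models.items()}

# The favored model under each criterion is the one with the smallest score.
best_aic = min(scores, key=lambda m: scores[m][0])
best_bic = min(scores, key=lambda m: scores[m][2])

for name, (aic, aicc, bic) in scores.items():
    print(f"{name:8s} AIC={aic:.2f} AICc={aicc:.2f} BIC={bic:.2f}")
print("best by AIC:", best_aic)
print("best by BIC:", best_bic)
```

With these made-up numbers the AIC and the BIC favor different models, because the BIC penalizes each extra parameter by log N rather than by 2.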

Download the latest version from the Modeltest home page.

Input data file format

NEXUS

Running Modeltest in Windows

There is a tutorial available that has detailed instructions for running the Windows version of Modeltest.

Running Modeltest through a terminal window

  1. Format your data into a NEXUS file. You can use this example dataset (download).
  2. Execute the NEXUS file in PAUP*.
  3. Execute the modelblock (view,download) file within PAUP* by typing:

    execute modelblock;

    This file tells PAUP* to compute likelihood scores for each of 56 models on the same neighbor-joining tree. When the computations are over you will see an output file named model.scores in your home directory.

  4. Save this file under a different name which is specific to your project; otherwise, Modeltest will not work the next time you run it.

  5. To run the computed tree scores in Modeltest, type:

    modeltest3.7 < infile > outfile1

    "infile" is the name of your input file (remember to rename it from model.scores to something specific) and "outfile1" is the name of your output file.

  6. By default, Modeltest selects the best-fit nucleotide substitution model using hierarchical likelihood ratio tests and the AIC. Modeltest 3.7 also allows model selection based on the AICc and BIC. To use either one, you must request it on the command line and also specify the sample size. Sample size for an alignment of DNA sequences is a difficult concept, as it depends on the number of characters, the number of taxa, and their correlation. You could specify the number of characters, or the number of characters times the number of taxa, but neither choice is likely to be correct most of the time.

  7. To run the computed tree scores in Modeltest implementing AICc model selection, type:

    modeltest3.7 -n100 < infile > outfile2

    Replace 100 with your sample size.

  8. To run the computed tree scores in Modeltest implementing BIC model selection, type:

    modeltest3.7 -b -n100 < infile > outfile3

    Replace 100 with your sample size.
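If you run all three analyses for several datasets, the commands in steps 5 to 8 can be generated by a small script. The sketch below is one possible way to do this in Python; the helper names `modeltest_command` and `run_modeltest` are hypothetical, and only the `-n` and `-b` options shown in the steps above are used. It assumes the executable is called modeltest3.7 and is on your PATH.

```python
import subprocess

def modeltest_command(criterion="AIC", sample_size=None, program="modeltest3.7"):
    """Build the argument list for one Modeltest run (hypothetical helper)."""
    if criterion not in ("AIC", "AICc", "BIC"):
        raise ValueError("criterion must be 'AIC', 'AICc' or 'BIC'")
    args = [program]
    if criterion == "AIC":
        return args  # default run: hLRT and AIC
    if sample_size is None:
        raise ValueError("AICc and BIC need a sample size")
    if criterion == "BIC":
        args.append("-b")
    args.append(f"-n{sample_size}")
    return args

def run_modeltest(args, infile, outfile):
    """Run Modeltest with infile on standard input and outfile on standard output."""
    with open(infile) as fin, open(outfile, "w") as fout:
        subprocess.run(args, stdin=fin, stdout=fout, check=True)

# Print the three command lines for an assumed sample size of 100.
for criterion in ("AIC", "AICc", "BIC"):
    print(" ".join(modeltest_command(criterion, sample_size=100)))

# Example call (not executed here; the file names are placeholders):
# run_modeltest(modeltest_command("BIC", 100), "myproject.scores", "outfile3")
```

Keeping the command construction separate from the function that launches the program makes it easy to check the command lines before anything is run.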

Although Modeltest automatically creates command blocks that can be pasted directly into PAUP* to set the parameters for maximum-likelihood analyses, it is best to interpret the results generated by the program carefully first. Note that hLRT, AIC, AICc and BIC may select different models; choosing among them is up to the user. An important additional issue is the uncertainty in model selection. The output of Modeltest allows this uncertainty to be examined on the basis of the AIC differences (deltas, or rescaled AICs) and the normalized relative AIC for each model (AIC weights, also called Akaike weights). When support for a particular model is not overwhelming, users may want to consider model averaging, a procedure that allows inferences to be drawn from several models simultaneously. By default, Modeltest 3.7 calculates model-averaged estimates of parameters. This is accomplished by estimating parameters for each model and then averaging the estimates according to how likely each model is (i.e., based on Akaike weights).
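To make the deltas, weights, and model-averaged estimates concrete, here is a minimal Python sketch of the usual calculations. It is not Modeltest's own code: the function names and all numbers are invented for illustration, and the averaging rule used here (average over the models that contain the parameter, renormalizing their weights) is one common convention and is an assumption about the details.

```python
import math

def akaike_weights(aic_scores):
    """Return (deltas, weights) from a dict of model name -> AIC."""
    best = min(aic_scores.values())
    # Delta: AIC difference from the best model (0 for the best one).
    deltas = {m: a - best for m, a in aic_scores.items()}
    # Relative likelihood of each model, exp(-delta/2), normalized to sum to 1.
    rel = {m: math.exp(-0.5 * d) for m, d in deltas.items()}
    total = sum(rel.values())
    weights = {m: r / total for m, r in rel.items()}
    return deltas, weights

def model_average(estimates, weights):
    """Weighted average of a parameter over the models that include it."""
    used = {m: weights[m] for m in estimates}
    total = sum(used.values())
    return sum(used[m] * estimates[m] for m in used) / total

# Hypothetical AIC scores and gamma shape (alpha) estimates.
aic = {"HKY+G": 10210.0, "GTR+G": 10208.0, "GTR+I+G": 10206.0}
alpha = {"HKY+G": 0.5, "GTR+G": 0.6, "GTR+I+G": 0.9}

deltas, weights = akaike_weights(aic)
averaged_alpha = model_average(alpha, weights)

for m in aic:
    print(f"{m:8s} delta={deltas[m]:.1f} weight={weights[m]:.3f}")
print(f"model-averaged alpha: {averaged_alpha:.3f}")
```

A model with a weight close to 1 dominates the average; when the weights are spread over several models, the averaged estimate reflects that uncertainty.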

For further information about Modeltest 3.7 look at the manual or go to the Modeltest Web Page. For a discussion on the advantages and disadvantages of different model selection approaches in phylogenetics, see Posada and Buckley (2004).

Another useful website for conducting web-based comparisons among nucleotide substitution models is from Los Alamos National Laboratory: FindModel.

If you are interested in selecting best-fit models of evolution for protein sequence alignments, see Abascal et al. (2005).


Maintained by Adam Bazinet
Direct questions and comments to Michael Cummings