An UltraFast Tool for Minimum Reticulate Networks
- Jun. 6, 2015: corrected a coding error and also made the program significantly faster.
- Jan. 28, 2014: corrected a coding error (which has little effect on the speed).
After downloading a program, you can run it as follows:
program_name -OPTION TreeFile
[Time_Bound_in_Seconds]
where OPTION is a string in the set
{HN, MAAF, MAAFs, rSPRDist, MAF, MAFs, LB} ,
Time_Bound_in_Seconds is an optional upper bound
on the running time, and TreeFile contains two or more
phylogenetic trees in
Newick format such as (((1,4),3),2) and
(((species1, species_3), cat),dog) ended with a semicolon.
The name of each species must consist of letters in
{a, ..., z, A, ..., Z, 0, ..., 9, _, .}.
There is no limit on the length of the name of each species.
You can view a tree in Newick format using
Dendroscope
by Daniel H. Huson or a simple applet
by my student Yohei Terazaki . Finally,
Time_Bound_in_Seconds is meaningful
only when OPTION = LB .
The options control the output as follows:
- HN: This option asks for computing the size of
a maximum acyclic agreement forest of the trees in TreeFile.
Note that if TreeFile contains two trees only, then
the output is the so-called hybridization number of the two trees.
- MAAF: This option asks for computing one
maximum acyclic agreement forest together with one
reticulate network for the trees in TreeFile.
The network is output in the extended Newick format
and its number of reticulate vertices is minimized.
You can view a network in the extended Newick format using
Dendroscope
by Daniel H. Huson.
- MAAFs: This option asks for enumerating
all maximum acyclic agreement forests for the trees in TreeFile.
For each enumerated maximum acyclic agreement forest T,
one reticulate network (with the minimum number of reticulate vertices)
for the trees in TreeFile is also computed from T and is
outputted in the extended Newick format.
- rSPRDist: This option asks for computing the size of
a maximum agreement forest of the trees in TreeFile.
Note that if TreeFile contains two trees only, then
the output is the so-called rSPR distance of the two trees.
- MAF: This option asks for computing one
maximum agreement forest
for the trees in TreeFile.
- MAFs: This option asks for enumerating
all maximum agreement forests
for the trees in TreeFile.
- LB: This option asks for computing a
lower bound on the minimum size of a reticulate network
for the trees in TreeFile.
Note that the size of a reticulate network N is
E minus V, where E is the number of edges
entering the reticulate vertices in N and V is the number of reticulate vertices in N.
If TreeFile contains only two trees, then the outputted lower bound is
the hybridization number of the two trees and is hence actually optimal. On the other hand,
if TreeFile contains three or more trees, then the outputted lower bound
is not necessarily optimal.