Information
Roots, measures, and collapses phylogenetic trees to polytomies based on bootstrap value!
This command-line Java program takes in phylogenetic trees written in Nexus/Newick-style code
(including files exported from FigTree) that include bootstrap values and branch lengths, and
analyses them using any combination of three main functions:
- Rooting the tree
- Collapsing the tree
- Measuring the length and other attributes of the tree
More details of these three functions are provided below ('Program Details'). The program
output varies by the functions selected, but trees are output in Nexus/Newick-style code,
and length attributes are output in a CSV (comma delimited) file.
Bug? Question? Comments? Feedback form
Stay informed about updates to TreeCollapseCL 4 - join the mailing list!
Updates Since v3.2:
- Corrected a problem with the collapsing algorithm that often over-collapsed trees. It's highly
recommended that you re-run your data with TreeCollapseCL 4, especially if your collapsed tree
was slightly different than expected.
Updates Since v3.1.2: (List of previous updates at the bottom of the page)
- '-t' is now optional - if not included the value defaults to 'O'
- '-rax' is now deprecated. The program now automatically detects and handles variations
in format
- Corrected error that assumed all files had a path before the file name (caused a
StringIndexOutOfBoundsException)
- Corrected a minor collapsing problem that caused slight under-collapsing of some nodes
Program Links
PLEASE NOTE: I cannot be held responsible for any incorrect behaviour of this program.
Though I try to thoroughly test all my programs, I cannot test them in every conceivable
situation, and so cannot guarantee that this program will behave correctly or as predicted
at every execution.
Download Jar File
User Manual
Program Details
TreeCollapseCL 4 is relatively speedy! In my own tests on my Windows 7 dual core 2.83GHz
machine with 8GB RAM, rooting, collapsing, and getting the length and other attributes from
20 trees (with ~8,500 sequences each) takes about a minute.
- Rooting the tree
An outgroup is specified by the user, and the tree is rooted or re-rooted to that outgroup.
The rooting function behaves in the same way as the FigTree program: after rooting, the
length of the branch parental to the outgroup is split evenly between the outgroup and the
rest of the tree, and the new root node has no bootstrap value.
- Collapsing the tree
A threshold is provided by the user, and all nodes with bootstrap values below this
threshold are collapsed to polytomies. The length of the tree is preserved.
- Measuring length and other attributes
For each leaf, the length from the leaf to the node above the root node is calculated.
(The root itself is not used as the end point because, if an outgroup is present, the length
from the root to the first node can be dictated more by the outgroup than by anything else.)
The average bootstrap value (the average of every bootstrap value between the leaf node and
the root) is also calculated for each leaf.
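
As a rough illustration of the rooting behaviour described above, the sketch below places a new root on the branch leading to an outgroup leaf and gives half of that branch's length to each side. It is not the program's own code. The class `RootingSketch`, its adjacency-map representation, the label `ROOT`, and the restriction to a single-leaf outgroup in an unrooted input tree are all assumptions made for the example, and bootstrap values are left out entirely.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Illustrative only: an unrooted tree stored as node -> (neighbour -> branch length).
class RootingSketch {
    final Map<String, Map<String, Double>> adj = new LinkedHashMap<>();

    void addEdge(String a, String b, double length) {
        adj.computeIfAbsent(a, k -> new LinkedHashMap<>()).put(b, length);
        adj.computeIfAbsent(b, k -> new LinkedHashMap<>()).put(a, length);
    }

    // Places a new root on the branch leading to the outgroup leaf and gives
    // half of that branch's length to each side. Returns the new root's name.
    String rootOn(String outgroup) {
        Map<String, Double> edges = adj.get(outgroup);
        if (edges == null || edges.size() != 1) {
            throw new IllegalArgumentException("Outgroup must be a leaf in the tree: " + outgroup);
        }
        String neighbour = edges.keySet().iterator().next();
        double half = edges.get(neighbour) / 2.0;
        edges.remove(neighbour);
        adj.get(neighbour).remove(outgroup);
        String root = "ROOT"; // hypothetical label; the new root carries no bootstrap value
        addEdge(root, outgroup, half);
        addEdge(root, neighbour, half);
        return root;
    }

    // Newick text for 'node' as seen when arriving from 'parent' (null at the root).
    String newick(String node, String parent) {
        StringJoiner children = new StringJoiner(",", "(", ")");
        boolean leaf = true;
        for (Map.Entry<String, Double> e : adj.get(node).entrySet()) {
            if (e.getKey().equals(parent)) {
                continue; // do not walk back up the tree
            }
            children.add(newick(e.getKey(), node) + ":" + e.getValue());
            leaf = false;
        }
        return leaf ? node : children.toString();
    }

    public static void main(String[] args) {
        RootingSketch tree = new RootingSketch();
        tree.addEdge("n1", "A", 1.0);
        tree.addEdge("n1", "B", 1.0);
        tree.addEdge("n1", "Out", 2.0);
        String root = tree.rootOn("Out");
        System.out.println(tree.newick(root, null) + ";");
    }
}
```

In the three-leaf example in `main`, the outgroup branch of length 2.0 becomes two branches of length 1.0, one on each side of the new root.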
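
The collapsing step described above can be sketched in a similar way. This is a minimal example, not the program's actual algorithm. The `Node` class is hypothetical, nodes without a bootstrap value are simply left alone, and "length of the tree is preserved" is interpreted here as adding the collapsed branch's length to each of its children so that root-to-leaf distances do not change. That interpretation is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

class CollapseSketch {
    // Hypothetical node type, not the program's own data structure.
    static class Node {
        final String name;       // leaf label; null for internal nodes
        double length;           // length of the branch leading to this node
        final Double bootstrap;  // support value; null where none is recorded
        List<Node> children = new ArrayList<>();

        Node(String name, double length, Double bootstrap) {
            this.name = name;
            this.length = length;
            this.bootstrap = bootstrap;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Removes every internal node below 'node' whose bootstrap is under the threshold,
    // attaching its children to its parent so that a polytomy is formed.
    static void collapse(Node node, double threshold) {
        List<Node> kept = new ArrayList<>();
        for (Node child : node.children) {
            collapse(child, threshold); // deepest nodes first
            boolean weak = !child.children.isEmpty()
                    && child.bootstrap != null
                    && child.bootstrap < threshold;
            if (weak) {
                for (Node grandchild : child.children) {
                    // Assumption: push the collapsed length down to keep root-to-leaf distances.
                    grandchild.length += child.length;
                    kept.add(grandchild);
                }
            } else {
                kept.add(child);
            }
        }
        node.children = kept;
    }

    public static void main(String[] args) {
        // ((A:1,B:1)40:0.5,(C:1,D:1)90:0.5); collapsed at a threshold of 70
        Node root = new Node(null, 0.0, null)
                .add(new Node(null, 0.5, 40.0)
                        .add(new Node("A", 1.0, null))
                        .add(new Node("B", 1.0, null)))
                .add(new Node(null, 0.5, 90.0)
                        .add(new Node("C", 1.0, null))
                        .add(new Node("D", 1.0, null)));
        collapse(root, 70.0);
        for (Node child : root.children) {
            System.out.println((child.name == null ? "(internal)" : child.name) + " " + child.length);
        }
    }
}
```

With a threshold of 70, the node with bootstrap 40 is removed and A and B attach directly to the root with branch lengths of 1.5, while the node with bootstrap 90 is kept.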
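
The per-leaf measurements described above might look something like the following. This is a sketch under stated assumptions rather than the program's implementation. The `Node` class and the CSV column names are hypothetical. "The node above the root node" is read as the root's direct child on the leaf's path, so the branch attached to the root is not counted. A leaf with no bootstrap values on its path gets `NaN` as its average.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MeasureSketch {
    // Hypothetical node type, not the program's own data structure.
    static class Node {
        final String name;       // leaf label; null for internal nodes
        final double length;     // length of the branch leading to this node
        final Double bootstrap;  // support value; null where none is recorded
        final List<Node> children = new ArrayList<>();

        Node(String name, double length, Double bootstrap) {
            this.name = name;
            this.length = length;
            this.bootstrap = bootstrap;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Returns leaf name -> {length to the root's child, mean bootstrap along the path}.
    static Map<String, double[]> measure(Node root) {
        Map<String, double[]> rows = new LinkedHashMap<>();
        for (Node top : root.children) {
            walk(top, 0.0, 0.0, 0, rows); // the branch from the root to 'top' is left out
        }
        return rows;
    }

    private static void walk(Node node, double length, double bootSum, int bootCount,
                             Map<String, double[]> rows) {
        if (node.children.isEmpty()) {
            double mean = bootCount == 0 ? Double.NaN : bootSum / bootCount;
            rows.put(node.name, new double[] {length, mean});
            return;
        }
        if (node.bootstrap != null) {
            bootSum += node.bootstrap;
            bootCount++;
        }
        for (Node child : node.children) {
            walk(child, length + child.length, bootSum, bootCount, rows);
        }
    }

    public static void main(String[] args) {
        // (Out:1.0,(A:0.2,(B:0.1,C:0.4)70:0.3)90:1.0);
        Node root = new Node(null, 0.0, null)
                .add(new Node("Out", 1.0, null))
                .add(new Node(null, 1.0, 90.0)
                        .add(new Node("A", 0.2, null))
                        .add(new Node(null, 0.3, 70.0)
                                .add(new Node("B", 0.1, null))
                                .add(new Node("C", 0.4, null))));
        System.out.println("leaf,length,mean_bootstrap"); // hypothetical column names
        for (Map.Entry<String, double[]> row : measure(root).entrySet()) {
            System.out.println(row.getKey() + "," + row.getValue()[0] + "," + row.getValue()[1]);
        }
    }
}
```

In this example, leaf B sits below nodes with bootstrap values of 90 and 70, so its average bootstrap is 80, and its length is the 0.3 and 0.1 branches added together.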
Citation
If you publish or present work that has been processed using this program, please
cite Emma Hodcroft and the website where this program can be downloaded
(http://emmahodcroft.com/TreeCollapseCL.html).
(I'd also be very interested to hear about what you've used the program for, so feel free to send me a link to your paper/research!)
As seen in! TreeCollapseCL has been used in these publications:
- Green, J. & Akam, M. (2013). Evolution of the pair rule gene network: Insights from a centipede. Developmental Biology. 382(1), 235-245.
- Goldsmith, D. (2014). Marine Viral Diversity and Spatiotemporal Variability. Graduate Theses and Dissertations.
- Weisberg, A., Elmarakeby, H., Heath, L., & Vinatzer, B. (2015). Similarity-Based Codes Sequentially Assigned to Ebolavirus Genomes Are Informative of Species Membership, Associated Outbreaks, and Transmission Chains. Open Forum Infectious Diseases. 2(1).
- Goldsmith, D. et al. (2015). Water column stratification structures viral community composition in the Sargasso Sea. Aquatic Microbial Ecology. 76(2), 85-94.
- Rosario, K. et al. (2016). Begomovirus-Associated Satellite DNA Diversity Captured Through Vector-Enabled Metagenomic (VEM) Surveys Using Whiteflies (Aleyrodidae). Viruses. 8(2), 36.
Previous Updates
Updates Since v3.1.2:
- Now compatible with Windows Shell auto-glob when passing * to specify file endings
- The 'level' of a node is no longer output when the '-l' parameter is specified
- '-t' is now optional - if not included the value defaults to 'O'
- '-rax' is now deprecated (but can be included without affecting the run). The program
now automatically detects and handles a larger number of minor variations in format that
can occur at the end of Newick files.
- Corrected error that assumed all files had a path before the file name (caused a
StringIndexOutOfBoundsException)
- Corrected a minor collapsing problem that caused slight under-collapsing of some nodes
- Corrected rooting to be more robust with particular polytomies
Updates Since v3.1.1:
- Corrected file/directory reading for Unix/Linux/Mac users
Updates Since v3.1:
- Please read updated information on when and how to use ‘-rax’ and ‘-t’!
- Now takes trees with FigTree annotation or in Newick format without bootstrap values,
or with some nodes missing bootstrap values (collapsing cannot be done if bootstrap values
are not present)
- Corrected errors in reading Nexus files
- Corrected errors in handling file names that included decimal points
Updates Since v3.0:
- Now takes trees without bootstrap values, or with some nodes missing bootstrap values
(collapsing cannot be done if bootstrap values are not present)
- Can now correctly handle trees output from RAxML (or other programs) where an extra root
branch length of '0.0' has been attached to the end of the code (see parameter '-rax')
- Stops the run and provides an error message with potential solutions if too much of the tree
has been collapsed, causing a recursive stack overflow.
“To iterate is human, to recurse divine.”