A first draft of the “tree of life” for the roughly 2.3 million named species of animals, plants, fungi and microbes — from platypuses to puffballs — has been released.
A collaborative effort among eleven institutions, the tree depicts the relationships among living things as they diverged from one another over time, tracing back to the beginning of life on Earth more than 3.5 billion years ago.
Tens of thousands of smaller trees have been published over the years for select branches of the tree of life, some containing upwards of 100,000 species, but this is the first time those results have been combined into a single tree that encompasses all of life. The end result is a digital resource that is available free online for anyone to use or edit, much like a “Wikipedia” for evolutionary trees.
“This is the first real attempt to connect the dots and put it all together,” said principal investigator Karen Cranston of Duke University. “Think of it as Version 1.0.”
The current version of the tree — along with the underlying data and source code — is available to browse and download at https://tree.opentreeoflife.org.
It is also described in an article appearing Sept. 18 in the Proceedings of the National Academy of Sciences.
Evolutionary trees, branching diagrams that often look like a cross between a candelabra and a subway map, aren’t just for figuring out whether aardvarks are more closely related to moles or manatees, or pinpointing a slime mold’s closest cousins. Understanding how the millions of species on Earth are related to one another helps scientists discover new drugs, increase crop and livestock yields, and trace the origins and spread of infectious diseases such as HIV, Ebola and influenza.
Rather than build the tree of life from scratch, the researchers pieced it together by compiling thousands of smaller chunks that had already been published online and merging them into a gigantic “supertree” that encompasses all named species.
The initial draft is based on nearly 500 smaller trees from previously published studies.
In mapping trees from different sources onto the branches and twigs of a single supertree, one of the biggest challenges was simply accounting for the name changes, alternate names, common misspellings and abbreviations for each species. The eastern red bat, for example, is often listed under two scientific names, Lasiurus borealis and Nycteris borealis. Spiny anteaters once shared their scientific name with a group of moray eels.
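The name-matching problem can be pictured with a small sketch. This is not the project's actual software, only an illustration under stated assumptions: the lookup table, the function names `canonical_name` and `relabel_tips`, and the choice of Lasiurus borealis as the label to keep are all hypothetical.

```python
# Hypothetical synonym table: every known variant of a name (lowercased)
# points to the single label that will be used in the merged tree.
# Picking "Lasiurus borealis" as that label is an assumption for illustration.
SYNONYMS = {
    "lasiurus borealis": "Lasiurus borealis",
    "nycteris borealis": "Lasiurus borealis",
}


def canonical_name(label):
    """Return the canonical label for a tip name, or None if it is unknown."""
    # Treat underscores as spaces, collapse repeated whitespace, ignore case.
    key = " ".join(label.replace("_", " ").split()).lower()
    return SYNONYMS.get(key)


def relabel_tips(tips):
    """Replace each recognized tip name; leave unrecognized names unchanged."""
    return [canonical_name(tip) or tip for tip in tips]


print(relabel_tips(["Nycteris_borealis", "Homo sapiens"]))
```

With a table like this, two source trees that list the same bat under different names end up pointing at the same tip before they are merged. Misspellings and names shared by unrelated groups would need more than a simple lookup.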
“Although a massive undertaking in its own right, this draft tree of life represents only a first step,” the researchers wrote.
For one, only a tiny fraction of published trees are digitally available.
A survey of more than 7,500 phylogenetic studies published between 2000 and 2012 in more than 100 journals found that only one out of six studies had deposited their data in a digital, downloadable format that the researchers could use.
The vast majority of evolutionary trees are published as PDFs and other image files that are impossible to enter into a database or merge with other trees.
“There’s a pretty big gap between the sum of what scientists know about how living things are related, and what’s actually available digitally,” Cranston said.
As a result, the relationships depicted in some parts of the tree, such as the branches representing the pea and sunflower families, don’t always agree with expert opinion.
Other parts of the tree, particularly insects and microbes, remain elusive.
That’s because even the most popular online archive of raw genetic sequences, from which many evolutionary trees are built, contains DNA data for less than five percent of the tens of millions of species estimated to exist on Earth.
“As important as showing what we do know about relationships, this first tree of life is also important in revealing what we don’t know,” said co-author Douglas Soltis of the University of Florida.
To help fill in the gaps, the team is also developing software that will enable researchers to log on to update and revise the tree as new data come in for the millions of species still being named or discovered.
“It’s by no means finished,” Cranston said. “It’s critically important to share data for already-published and newly published work if we want to improve the tree.”
“Twenty-five years ago people said this goal of huge trees was impossible,” Soltis said. “The Open Tree of Life is an important starting point that other investigators can now refine and improve for decades to come.”
This research was supported by a three-year, $5.76 million grant from the U.S. National Science Foundation (1208809).
C. Hinchliff et al. Synthesis of Phylogeny and Taxonomy Into a Comprehensive Tree of Life. Proceedings of the National Academy of Sciences, 2015. DOI: 10.1073/pnas.1423041112