Combining past and present life improves our understanding of the evolutionary tree

Fossils improve phylogenetic analyses of morphological characters

Nicolás Mongiardino Koch, Russell J. Garwood, Luke A. Parry

Summarized by Sadira Jenarine, a senior at The University of South Florida. She is a geology major and plans on attending graduate school following graduation in the summer. Once she earns her degree, she hopes to work along the lines of environmental conservation and preservation or become a professor. When she’s not looking at rocks, you can usually find her at the local Starbucks making a latte or in the town’s own “Lettuce Lake Conservation Park”.     

What data were used: The authors simulated 250 evolutionary trees, also called phylogenies, and used them to determine which method reconstructs phylogenetic trees most accurately. The simulation programs were designed to account for species’ traits, including strengths and weaknesses, as well as their ability and likelihood to survive natural disasters, such as mass extinction events, and/or predation.

Methods: This study tested three phylogenetic inference methods: maximum parsimony (MP), Bayesian inference (BI), and tip-dating. MP is essentially the path of least resistance in evolution; the fewer branches you must jump across on “the tree of life”, the more closely related two species likely are. Bayesian inference combined with tip-dating is a method of dating fossils by analysis that gives a numerical age of the specimen and then tests whether it is statistically accurate using an equation called Bayes’ theorem. These methods differ from a more common technique, ‘node dating’, which determines ages through constraints set by the first and last appearances of specimens in the fossil record. These forms of analysis were tested on the 250 trees as well as more than 11,000 different traits that these organisms share.
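To make the parsimony idea concrete, here is a minimal sketch of the classic Fitch counting procedure for a single trait. It is an illustration only, not the software the authors used; the function name `fitch_score`, the nested-tuple tree format, and the four made-up species A to D are all assumptions for this example.

```python
def fitch_score(tree, tip_states):
    """Return (possible ancestral states, minimum number of changes)
    for one trait on a rooted, two-way branching tree.

    `tree` is either a species name (a tip) or a pair (left, right).
    `tip_states` maps each species name to its observed trait state.
    Both formats are hypothetical, chosen only for this sketch.
    """
    if isinstance(tree, str):
        # A tip: its state is known and no change is needed.
        return {tip_states[tree]}, 0

    left, right = tree
    left_set, left_changes = fitch_score(left, tip_states)
    right_set, right_changes = fitch_score(right, tip_states)

    shared = left_set & right_set
    if shared:
        # The two branches can agree on a state: no extra change.
        return shared, left_changes + right_changes
    # The two branches disagree: one evolutionary change is required.
    return left_set | right_set, left_changes + right_changes + 1


# One trait scored as 0 or 1 in four invented species.
states = {"A": 0, "B": 0, "C": 1, "D": 1}
tree_1 = (("A", "B"), ("C", "D"))  # groups species with matching states
tree_2 = (("A", "C"), ("B", "D"))  # splits them apart

changes_1 = fitch_score(tree_1, states)[1]
changes_2 = fitch_score(tree_2, states)[1]
print(changes_1, changes_2)
```

Parsimony prefers the tree that needs fewer changes, so in this toy case it would favor `tree_1`. A real analysis sums this count over many traits and searches across many candidate trees.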

Results: The authors evaluated how accurately each method recovered the simulated evolutionary trees. The graph (Figure 1) shows how accurately each inference method placed species on the tree, using different accuracy measures that are based on the nodes (i.e., the branching points) of an evolutionary tree. Even when accounting for missing data (i.e., when species don’t have the entire suite of characters used in a phylogenetic analysis), one type of accuracy, quartet-based accuracy, increases in proportion to the number of fossils sampled. Bipartition-based accuracy, in contrast, changes when data are missing. This effect is seen mostly in tip-dated inference, which uses morphological (body shape) and molecular (DNA) data from the fossils themselves. Tip-dating is a newer method of inference and should therefore be used with caution, as it is sensitive to missing data, something very common when using fossils.
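As a rough illustration of what a bipartition-based accuracy measure looks at, the sketch below lists the ways each internal branch splits the species into two groups, then reports the fraction of the true tree's splits that the inferred tree recovers. This is a simplified stand-in, not necessarily the exact statistic used in the paper; the function names, the nested-tuple tree format, and the five invented species are assumptions for this example. Quartet-based measures work in a similar spirit but compare the relationships among sets of four species instead of whole splits.

```python
def tips(tree):
    """Return the set of species names found in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(tips(child) for child in tree))


def bipartitions(tree):
    """Return every split of the species made by an internal branch."""
    all_tips = tips(tree)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return
        side = tips(node)
        other = all_tips - side
        # Skip trivial splits that isolate one species or none at all.
        if len(side) > 1 and len(other) > 1:
            splits.add(frozenset([side, other]))
        for child in node:
            visit(child)

    visit(tree)
    return splits


def bipartition_accuracy(true_tree, inferred_tree):
    """Fraction of the true tree's splits also found in the inferred tree."""
    true_splits = bipartitions(true_tree)
    return len(true_splits & bipartitions(inferred_tree)) / len(true_splits)


# Hypothetical five-species example: the inferred tree misplaces B and C.
true_tree = ((("A", "B"), "C"), ("D", "E"))
inferred_tree = ((("A", "C"), "B"), ("D", "E"))

accuracy = bipartition_accuracy(true_tree, inferred_tree)
print(accuracy)
```

Here the inferred tree correctly separates D and E from the rest but wrongly pairs A with C, so it recovers one of the two true splits.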

The graph in the top left shows the topological accuracy of bipartitions based on the proportion of missing fossils for maximum parsimony (MP), Bayesian inference (BI), and tip-dating (clock). With no missing data, accuracy is higher in all three methods, most notably in clock dating. Even at high levels of missing data, the topological accuracy of clock dating stands out compared with the other methods. The bottom left graph shows the same variables using quartet-based analysis; it stays the same across different levels of missing data. The graphs on the right show topological precision. In MP, precision decreases as more missing data are introduced; the same happens with BI and clock, though less markedly. In quartet-based analysis, all three inference methods maintain similar precision even with missing data.
Figure 1. The graph shows both topological accuracy (left) and topological precision (right) using both forms of measurement: bipartitions (top) and quartets (bottom). Colors in the graph indicate the levels of missing data. The other methods show higher accuracy than parsimony. Issues arise with the tip-dating (clock) method when levels of missing data are high.

Why is this study important? Using complete morphological and/or molecular data from fossils, along with data from living organisms, provides the most accurate evolutionary tree reconstruction. This shows us that tip-dating, which incorporates fossils into the construction of the evolutionary tree, creates a more accurate and precise tree. This study compares its results to those from previous analyses and examines a new angle: accounting for missing data. This is beneficial because it helps us understand the limitations of a number of methods, which can help us create more realistic phylogenies.

The big picture: Using fossils along with modern species, where many studies use just modern species or just fossil species, gives us a more accurate representation of how life on Earth has evolved through time. Because some of these methods of inference, like tip-dating, are newer, there is much room for progress and development. This does not mean that new methods should be immediately and widely accepted, but rather that it is our duty to continue studying this new form of inference. By understanding what we have, what we had, and what we lost, we can get a better grasp of the evolutionary tree that we are working to perfect.

Citation: Mongiardino Koch, Nicolás, et al. “Fossils Improve Phylogenetic Analyses of Morphological Characters.” Proceedings of the Royal Society B: Biological Sciences, vol. 288, no. 1950, 2021,