Reading the Tree of Life

Main Points

  • Phylogenies are depictions of evolutionary relationships in a tree-like form.
  • Paleontologists must use morphology, the physical features of animals, to understand evolutionary relationships, because our fossil animals don’t have DNA for us to use!
  • Natural taxa = clades = monophyletic groups are bound by origination and extinction.
  • Trees are very useful for testing a variety of scientific questions, and being able to read them helps you interpret and understand data.

When scientists map out the relationships of life on Earth, the patterns look very much like trees, which is why we refer to them collectively as the “tree of life”. This page is designed to get you through the vocabulary concerning evolutionary relationships and to give you a basis for reading any trees you might find in the literature. Trees that are generated from morphological or molecular data to show evolutionary relationships are often called phylogenies. The idea behind estimating phylogenetic hypotheses is to group organisms based on shared evolutionary history rather than simply by appearances. As you will see below, appearances can be misleading!

A page from Darwin’s notebook that illustrates his idea of descent with modification.

The history of life has been represented by a branching or tree diagram since Charles Darwin sketched a tree in 1837! Darwin was simply drawing his interpretation of how life changed through time, and the study of how living things are related to one another was in its infancy. Many scientific studies have since added to Darwin’s ideas, and these trees are now rigorously tested through detailed analyses. This allows us to provide confidence and support for our hypotheses of how life changed through time (Gregory, 2008). Tree thinking is critical for modern evolutionary biology and is equally important in the study of the fossil record. However, the types of data we use to construct evolutionary trees differ quite a bit between modern and fossil organisms. Paleontologists lack the molecular data we can get from the DNA of living creatures, so we must carefully assess an animal’s morphological features to determine evolutionary relationships!

Below is a simplified phylogeny with the primary components of the tree labeled. We will walk through what each of these terms means and learn how to read evolutionary trees!

Simplified tree diagram, see associated text that describes how to read the tree. Modified from Gregory (2008). Hopefully, you’re picking up on the tree theme with these terms!

Consider this a guidebook or a framework to begin to evaluate larger, more complex evolutionary trees. Let’s start at the top of the tree. Each of the letters can be called a terminal node or a tip of the tree. These tips commonly represent species (recall: species are groups of individual organisms and the unit upon which evolution acts). The species can be alive today or extinct, depending on the research question. Terminal nodes connect to each other by branching points that we refer to as internal nodes. These divergence points commonly represent speciation events and are the location of the most recent shared ancestor between two terminal taxa. Branches in this tree indicate the lineage or transitions from the shared ancestor to the terminal taxon (Gregory, 2008). The terminal taxon is the organism or subject found at the end of a branch. In some cases these branches are scaled differently; the length often represents the amount of evolutionary change leading to a specific terminal taxon. The outgroup is used to root the tree relative to the ingroup (the group you are studying). The outgroup is usually an organism that is closely related to, or the sister taxon of, the entire ingroup (B, C, D, and E). Outgroups are particularly important because the tree must be rooted in order to understand how a specific character changes throughout the group. This means that choosing the correct outgroup is essential for assessing single feature changes within the ingroup.

Sister taxa are taxa that are more closely related to each other than to any other taxa on the tree. D and E are more closely related to each other than either is to B or C. It’s the same with B and C! B and C are more closely related to each other than either is to D or E.
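
For readers who like to see this spelled out, here is a minimal Python sketch of the simplified tree and the sister-taxon idea. The nested-tuple layout, the placement of A as the outgroup, and the function names are assumptions made for illustration; they are not taken from Gregory (2008).

```python
# Simplified tree written as nested tuples: each tuple is an internal node,
# each string is a tip (terminal node).
# Assumption: A is the outgroup; (B, C) and (D, E) are the ingroup pairs.
TREE = ("A", (("B", "C"), ("D", "E")))


def tips(node):
    """Return the set of tip labels found at or below a node."""
    if isinstance(node, str):
        return {node}
    return set().union(*(tips(child) for child in node))


def sister(node, target):
    """Return the tips of the sister group of `target`, or None if not found.

    `target` is either a tip label or a nested tuple describing a clade.
    """
    if isinstance(node, str):
        return None
    for i, child in enumerate(node):
        if child == target:
            # Everything else hanging off the same internal node is the sister group.
            others = [c for j, c in enumerate(node) if j != i]
            return set().union(*(tips(c) for c in others))
    for child in node:
        found = sister(child, target)
        if found is not None:
            return found
    return None


print(sister(TREE, "D"))          # the sister of D is E
print(sister(TREE, ("B", "C")))   # the sister of clade B+C is clade D+E
print(sister(TREE, "A"))          # the outgroup is sister to the whole ingroup
```

Notice that a sister group can be a single tip or a whole clade, depending on which node you start from.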

Now that you have the basics, let’s keep going!

If you have read through our Taxonomy page, you understand that classifying life is done through understanding natural taxa. Natural taxa can be referred to as clades or monophyletic groups. These groups or taxa are bound by an origination and extinction event.

Distinction between monophyletic, paraphyletic, and polyphyletic groups.

Paraphyletic groups are classified by incorporating an ancestor and only some of its descendants. Paraphyletic groups are bound by extinction and by the exclusion of specific groups, based on whether the interpreter believes they belong in that group or not. Polyphyletic groups are created from unrelated organisms with multiple ancestors. Classifying groups of organisms that are either paraphyletic or polyphyletic can be problematic. This comes back to the understanding of homology and homoplasy. Consider pachyderms: this group includes elephants, hippos, and rhinos, the animals with tough skin! Although this observable feature is very clear on all of the animals, it is not a feature inherited from a shared ancestor that joins these animals. They are, in fact, not a clade but rather a polyphyletic group. This indicates that the grouping of pachyderms does not follow the goal of establishing evolutionary relationships even though the animals are superficially similar. Let’s consider another group: Pisces (fish) is a paraphyletic group because it includes an ancestor and some, but not all, of its descendants. If we were to include all the descendants, we would be considered fish! It makes more sense to call us all Tetrapoda, a monophyletic group that is supported by the advent of four limbs.
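
The monophyly test can be sketched in a few lines of Python: a group is a clade only if its members are exactly the tips that descend from one node. The five-taxon vertebrate tree below is a heavily simplified, illustrative assumption (it is not one of the figures on this page), and the function names are hypothetical.

```python
# Heavily simplified vertebrate tree, for illustration only.
TREE = ("shark", ("trout", ("lungfish", ("frog", "human"))))


def tips(node):
    """Return the set of tip labels found at or below a node."""
    if isinstance(node, str):
        return {node}
    return set().union(*(tips(child) for child in node))


def clades(node):
    """Return the tip set of every node in the tree, including single tips."""
    if isinstance(node, str):
        return {frozenset([node])}
    result = {frozenset(tips(node))}
    for child in node:
        result |= clades(child)
    return result


def is_monophyletic(tree, group):
    """True if `group` is exactly the set of tips descending from one node."""
    return frozenset(group) in clades(tree)


print(is_monophyletic(TREE, {"frog", "human"}))               # True
print(is_monophyletic(TREE, {"shark", "trout", "lungfish"}))  # False
```

The second group fails because the node that unites shark, trout, and lungfish also gives rise to frog and human. This sketch only separates clades from non-clades; telling a paraphyletic group from a polyphyletic one requires extra information about the ancestors that it does not model.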

Here is a simplified example of how nodes on trees can rotate. The relationships remain the same!

Something very important to remember is that nodes (branching points) on a tree can rotate. The relationships remain the same but the order in which they appear on the tree can change.
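
One way to convince yourself of this is to compare the clades two drawings imply, rather than the left-to-right order of their tips. The sketch below is illustrative: the nested-tuple trees and the function names are assumptions, not part of the figure.

```python
def tips(node):
    """Return the set of tip labels found at or below a node."""
    if isinstance(node, str):
        return {node}
    return set().union(*(tips(child) for child in node))


def clade_set(node):
    """Return the tip set of every internal node in the tree."""
    if isinstance(node, str):
        return set()
    result = {frozenset(tips(node))}
    for child in node:
        result |= clade_set(child)
    return result


def rotate_all(node):
    """Flip the order of the branches at every internal node."""
    if isinstance(node, str):
        return node
    return tuple(rotate_all(child) for child in reversed(node))


def same_relationships(tree_a, tree_b):
    """Two rooted trees show the same relationships if they imply the same clades."""
    return clade_set(tree_a) == clade_set(tree_b)


ORIGINAL = ("A", (("B", "C"), ("D", "E")))
ROTATED = rotate_all(ORIGINAL)               # tips now read E, D, C, B, A
DIFFERENT = ("A", (("B", "D"), ("C", "E")))  # branches actually regrouped

print(same_relationships(ORIGINAL, ROTATED))    # True
print(same_relationships(ORIGINAL, DIFFERENT))  # False
```

Rotating changes only the order in which tips are drawn. Regrouping tips, as in the last tree, changes the clades and therefore the hypothesis.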

There are several more terms that we need to be familiar with before beginning to read and interpret these trees. A synapomorphy is a shared derived character, meaning a common feature that joins taxa together. Synapomorphies diagnose (i.e., support, characterize) nodes on a tree (a synapomorphy for humans is a large brain size). Plesiomorphic characters tell us something about the ancestor; a plesiomorphic character is one that was gained ancestrally (an example of a plesiomorphic feature in humans is having four limbs, a trait inherited from much further back in the evolutionary tree!). An apomorphy is a unique characteristic of a clade: any characteristic or feature that is specific to a species and its descendants. This is not to be confused with an autapomorphy, which is a derived character that is present in only one taxon. A convergent character (a feature that has evolved more than once in the tree; wings are an example) is termed homoplastic. Read more about this on our Homology page! Let’s examine the tree below and think about the terms.

Example of characters mapped onto a tree.

This example tree can tell us a lot about these imaginary relationships. First, let’s examine the characters.

    Pig snout, denoted by the green line, is found at the base of the tree. This character supports clade M, meaning M and all of its descendants have a pig snout. Wings, denoted by the purple line, support clade K. This indicates that K and all of its descendants (E, J (=F+G)) have wings. Now consider webbed toes, denoted by the orange line. This character is seen twice on the tree, at D and G. This character is homoplastic: it appears twice on the tree independently. It is true that D and G share a common ancestor at L, but L does not have webbed toes.
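
The same reasoning can be sketched in Python: list which tips carry each character, then ask whether those tips make up exactly one clade. The tree below is reconstructed from the text, and the placement of C (as sister to D under a node labelled I) is an assumption, since the text does not state it. The check is also deliberately simple and ignores the possibility that a character was gained once and later lost.

```python
# Internal nodes and their children, reconstructed from the text.
CHILDREN = {
    "M": ["H", "L"],
    "H": ["A", "B"],
    "L": ["I", "K"],
    "I": ["C", "D"],  # assumption: the text does not say where C and I sit
    "K": ["E", "J"],
    "J": ["F", "G"],
}

# Which tips show each character, following the description of the figure.
TRAITS = {
    "pig snout": {"A", "B", "C", "D", "E", "F", "G"},
    "wings": {"E", "F", "G"},
    "webbed toes": {"D", "G"},
}


def tips(node):
    """Return the set of tip labels found at or below a node."""
    if node not in CHILDREN:
        return {node}
    return set().union(*(tips(child) for child in CHILDREN[node]))


def supported_clade(carriers):
    """Return the node whose tips are exactly `carriers`, or None if there is none."""
    for node in CHILDREN:
        if tips(node) == carriers:
            return node
    return None


RESULTS = {trait: supported_clade(carriers) for trait, carriers in TRAITS.items()}

for trait, node in RESULTS.items():
    if node is not None:
        print(f"{trait}: supports clade {node}")
    else:
        print(f"{trait}: carriers do not form a clade (homoplastic on this tree)")
```

Pig snout and wings each line up with a single clade (M and K), while the two webbed-toed tips do not, which matches the reading of the figure.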

Now let’s think about relatedness on the tree.

    • Something that is very important to think about before beginning to decipher the tree is: what are the data? Remember that species are the unit upon which evolution operates and species are populations of organisms.

Let’s start simple! We can call this tree clade M. M is a monophyletic group that includes A-L. Species A and species B form clade H. What is the most closely related group to clade H? Start at clade H on the tree and move down to M; M is the common ancestor! We then move up the tree toward clade L. L is the most closely related group to clade H. This is called a sister taxon or a sister group. Now, what if I asked you to figure out the most closely related taxon to species E? Start at E and work your way down to find the most recent common ancestor. Did you stop at clade K? That’s it! Now move up the tree to find the sister taxon. It’s clade J!

Many people find it easier to think of trees and clades as nested relationships. Here is an example: ((F+G)E). This notation removes the internal nodes of the tree and focuses on the terminal taxa. It means that F and G are more closely related to each other than either is to E.
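
Nested parentheses are also how trees are commonly stored in text files. The sketch below reads a comma-separated version of the example, ((F,G),E), and lists the groups from the innermost outward. Writing the example with commas instead of "+" is an assumption made here so that it resembles the widely used Newick style, and the parser is a minimal illustration that handles only tip names, commas, and parentheses.

```python
def parse(text):
    """Parse a simple parenthetical tree such as '((F,G),E)' into nested tuples."""
    text = text.replace(" ", "").rstrip(";")
    pos = 0

    def node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1                      # step past "("
            children = [node()]
            while text[pos] == ",":
                pos += 1                  # step past ","
                children.append(node())
            pos += 1                      # step past ")"
            return tuple(children)
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        return text[start:pos]            # a tip name

    return node()


def tips(node):
    """Return the tip labels at or below a node, in drawing order."""
    if isinstance(node, str):
        return [node]
    return [t for child in node for t in tips(child)]


def nested_groups(node):
    """List the tips of every internal node, innermost groups first."""
    if isinstance(node, str):
        return []
    groups = []
    for child in node:
        groups.extend(nested_groups(child))
    groups.append(tips(node))
    return groups


tree = parse("((F,G),E)")
print(tree)                 # (('F', 'G'), 'E')
print(nested_groups(tree))  # [['F', 'G'], ['F', 'G', 'E']]
```

Each pair of parentheses is one group, so F and G sit together inside the larger group that also contains E.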

Why should you care about reading trees?

There is great strength and understanding that can come from being able to read and interpret these trees. Trees can be used to depict the relationships of countless things, not just species! Consider the evolution of language, religion, writing systems, music, and anything else you can think of! Trees are a way to analytically test and assess patterns in the world around us.

Find further reading from the Digital Encyclopedia of Ancient Life on reading trees here.
