Thursday, October 18, 2012

Making the Ancestor Problem Go Away

My recent post, “The Great Botanical Butter Battle Book” (which you would be well-advised to review before plunging into the even murkier aspects of the debate that follow!), ended with the implication that, in the traditional taxonomic view, a classification system consisting entirely of monophyletic taxa is not only undesirable, but also impossible. Every genus, it is argued, logically must have originated from a species in an earlier genus, rendering the earlier genus paraphyletic. All efforts to correct this by eliminating paraphyletic genera cause the system to collapse into a single genus. This is sometimes referred to as the “ancestor problem.” It is a problem that grows as the fossil record steadily improves and we gain actual knowledge of ancestor-descendant relationships.

Recall that monophyletic taxa are complete branches, or clades, of a phylogenetic tree, beginning with the founding ancestral species and including all the descendant species of the common ancestor (all twigs of the branch).   According to the prevailing practice of phylogenetic taxonomy, a clade-taxon at one taxonomic rank (e.g. family) can be subdivided into smaller clade-taxa (e.g. genera) using the same criteria, but none of the subclades can be given the same rank as the main clade (e.g. there cannot be a family within a family).   That is why birds and reptiles cannot  both be ranked as formal classes, as was done traditionally.

A similar example involves dogs and cats, which represent two of the several modern families in the mammal order Carnivora.    The common ancestor, and most likely a group of related species that preceded the split into dog and cat clades (along with the others that don't need to be mentioned), are in a taxonomic no-man’s land.  If we place that ancestral cluster of species, or "stem group," into a family, that family will by definition be paraphyletic, because both the dog and cat clades that we now consider formal families developed as subclades from within that ancient family.

Phylogenetic taxonomy has the logical goal of accurately incorporating the actual branching pattern of evolutionary history into a formalized classification system.  Attempting to put traditional taxonomic boxes (i.e. genera, families, etc.) around clades, however, does create some awkward situations, particularly as we know more and more about ancestral groups of organisms.

As we envision the process of evolution, clades branch to beget new clades. Successful clades further branch into bushy clusters of species that we recognize as genera.

The hypothetical diagram at the right represents a common pattern of evolution, with new clades arising from older clades, each in turn developing distinctive adaptations, blossoming out into a diverse group of species, and then declining into extinction as still newer groups come to dominate. Assuming a fairly complete fossil record of this group, traditional (also called "evolutionary") and phylogenetic taxonomists would have rather different ideas about how to classify them.

Traditional taxonomists would view the four groups as a succession of distinct genera, one on top of the other.  They would see nothing wrong with new genera arising from older genera in sequence over time – it is necessary in fact, if we are to simultaneously believe in evolution and have a system of classification.

Phylogenetic taxonomists view the diagram differently. A clade is a clade, from the founding species through all of its descendants. Subclades are viewed as nested within the main clade, rather than emerging on top of it. If all four were considered genera, both A and C would be paraphyletic, because some of their descendants have been removed. "A" can be considered a monophyletic genus only if B, C, and D are included within it. If so, the latter three genera must be given lower rank than A (e.g. subgenera). Traditional taxonomists charge that this "lumping" solution would result in a collapse of the entire taxonomic system, for surely genus A itself evolved from some earlier genus, and so on back to the first genus of organisms.
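The monophyly test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the species names (a1, b1, etc.) and the exact branching are invented, since the original diagram shows groups rather than named species. The only features taken from the discussion are that B and C arise from within A, and D arises from within C.

```python
# Hypothetical species tree, stored as child -> parent.
# All species names are invented for illustration.
PARENT = {
    "a1": None,   # founding species of group A
    "a2": "a1",
    "a3": "a1",
    "b1": "a2",   # group B arises from a species in A
    "b2": "b1",
    "c1": "a3",   # group C arises from another species in A
    "c2": "c1",
    "d1": "c2",   # group D arises from a species in C
    "d2": "d1",
}

# The four groups treated as separate genera (the traditional view).
GENERA = {
    "A": {"a1", "a2", "a3"},
    "B": {"b1", "b2"},
    "C": {"c1", "c2"},
    "D": {"d1", "d2"},
}


def descendants(species):
    """Return the species together with all of its descendants (its clade)."""
    clade = {species}
    changed = True
    while changed:
        changed = False
        for child, parent in PARENT.items():
            if parent in clade and child not in clade:
                clade.add(child)
                changed = True
    return clade


def is_monophyletic(group):
    """A group is monophyletic if it equals the full clade of one of its members."""
    return any(descendants(s) == group for s in group)


# Check each genus as drawn, then the "lumped" genus A that absorbs B, C, and D.
results = {name: is_monophyletic(members) for name, members in GENERA.items()}
lumped_a = GENERA["A"] | GENERA["B"] | GENERA["C"] | GENERA["D"]
lumped_is_monophyletic = is_monophyletic(lumped_a)
```

In this toy tree, B and D pass the test, A and C fail it because their descendants have been carved out, and the lumped version of A passes.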

Alternatively, B and D could be recognized as genera, but C and A would have to have higher rank (e.g. C as a subfamily that included genus D, and A as a family that included subfamily C and genus B).  In that scenario, the remaining contents of A and C would have to be split into comparable subgenera and genera.  However, "splitting" like that would just result in even more small, unclassifiable stem groups, and not really solve the problem.  

Many taxonomists, particularly botanists, contend that the ancestor problem is not worth worrying about, because we are highly unlikely ever to have a specimen of an actual ancestor to classify and name, and if we do, we won’t know it and it won’t matter. It’s true that we really don’t have a lot of fossil plants with which to “connect the dots,” and may never directly confront the ancestor problem. Zoologists, however, with a better fossil record, don’t get off so easy.

For example, we now know quite a lot about our own family of primates, the hominids. Before the genus Homo existed, there was another genus of hominids, traditionally known as Australopithecus. Unless you believe in special creation of humans, there has to be a connection between our genus and earlier genera. The very first species of Homo most certainly descended from a species of Australopithecus, which therefore would be paraphyletic and illegal in phylogenetic classification. Some anthropologists have dutifully tried to weed out the paraphyly, but no matter how you lump or split the various parts of the hominid tree, there is some part of it that is still directly ancestral to the first Homo. Is that ancestor unnamable? Can it be placed in a genus at all?

In the end, the best that hominid taxonomists can do is to “minimize paraphyly” by making the ancestral stem genus as small as possible.  This was the approach of Cela-Conde and Ayala (2003), who recognized the genus Praeanthropus for the stubbornly paraphyletic residue (5 species) of Australopithecines that directly preceded the first humans and the more narrowly defined Australopithecus.  So it seems that in this perspective of evolutionary history, paraphyletic taxa are logically unavoidable.

Yet monophyletic taxa are the foundation of phylogenetic taxonomy.  Is there no way out of this dilemma?

Proponents of the Phylocode movement advocate simply naming all the branches of a phylogenetic tree, without ranking the parts as families, genera, etc. This might effectively avoid the conflict of taxonomic philosophies. Clades can and do arise from one another (e.g. birds from dinosaurs). Each clade and subclade at every level can have a name, as long as we don't try to rank them as genera, families, classes, etc. Paraphyly, it can be argued, is an artifact of trying to fit clades into the boxes of traditional classification.

The Phylocode makes a lot of sense, and in practice, we don’t worry so much about ranking the bigger clades of life any more. We talk of Magnoliids, Eudicots, Monocots, Amborellids, etc. in discussions of the evolution of flowering plants, but it seems that no one is seriously trying to squeeze them into classes, subclasses, etc. There seems to be no point to it.

There is a snag, however, in going totally rank-less. Our conventional binomial (“two-name”) system for naming species requires the genus name plus a specific epithet (e.g. Quercus rubra, the red oak, combines the genus name Quercus with the specific epithet rubra). So all species must be in a genus. Attempts to find an alternate species naming system to accompany the Phylocode have failed to achieve any consensus. So my further discussion and proposed resolution of the ancestor problem assumes that we need formally defined genera in order to create names for all species, past and present.

It would appear that the prospect of a system of monophyletic-only genera destabilizes and collapses the more information we have about ancestral organisms. But does it really? Perhaps all this discussion of ancestors having names, and genera evolving from genera, is putting the cart before the horse.

Going back to the “father” of modern phylogenetic taxonomy, Willi Hennig, we see that his system centered around the process of cladistics and the resulting cladograms.  Cladistics is an objective mathematical process for comparing taxa by coding each for a very specific set of characters (e.g. leaves simple vs compound, stamens 6 vs 5, etc.).  The cladogram is generated by comparing taxa pair by pair.  It superficially resembles a phylogenetic tree as it consists of lines connecting taxa in a branching pattern. Taxa most similar in terms of shared character states come out close together on the cladogram, while those sharing relatively few character states come out distant from each other on the cladogram. 
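The coding and pairwise comparison described above can be illustrated with a toy character matrix. This is a minimal sketch, not a cladistic analysis: the taxon names and the 0/1 codings are invented, the third character (fruit type) is an assumption added for illustration, and the code only tallies shared character states for each pair. It does not build a cladogram, which in practice is done with dedicated software and methods not shown here.

```python
from itertools import combinations

# Hypothetical characters, each coded 0 or 1 (codings are invented):
#   leaves:  0 = simple,  1 = compound
#   stamens: 0 = six,     1 = five
#   fruit:   0 = dry,     1 = fleshy   (assumed extra character)
CHARACTERS = ["leaves", "stamens", "fruit"]

# Hypothetical taxa and their coded character states.
MATRIX = {
    "taxon_W": (0, 0, 0),
    "taxon_X": (1, 0, 0),
    "taxon_Y": (1, 1, 0),
    "taxon_Z": (1, 1, 1),
}


def shared_states(a, b):
    """Count the characters for which two taxa have the same coded state."""
    return sum(1 for x, y in zip(MATRIX[a], MATRIX[b]) if x == y)


# Compare the taxa pair by pair.
pair_scores = {
    (a, b): shared_states(a, b) for a, b in combinations(sorted(MATRIX), 2)
}

for (a, b), score in pair_scores.items():
    print(f"{a} vs {b}: {score} of {len(CHARACTERS)} character states shared")
```

Pairs with high scores would sit close together on the resulting diagram, and pairs with low scores far apart. Nothing in the matrix records which taxon, if any, is an ancestor of another.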

The difference between a cladogram and a traditional phylogenetic tree (though the two are often considered the same these days) is that the cladogram is purely a diagram of similarity among the taxa being compared.  The cladistics process itself is not prejudiced by whether or not any of the taxa in the comparison are ancestors of any other taxa.  Though we call the stem line that precedes each branch point the “hypothetical common ancestor” (probably a poor choice of words), it is only a mathematically generated line, and does not need to be classified or named.  In a traditional phylogenetic tree (as in the first diagram above, or the diagram published in the study of hominid evolution discussed above), there are built-in hypotheses about ancestors and descendants. 

Now suppose, by chance, that we actually had a common ancestor in a cladistic analysis.  It would be coded and compared like any of the other taxa in the analysis.  It could be classified by putting a box around its line on the cladogram, and named according to conventional rules.  Like the exasperated phylogenetic plant taxonomists mentioned above, we would not know for sure if any taxon were ancestral to any other, and it wouldn't matter.  The cladogram is "ancestor-neutral." 

After the cladogram is made and carved into genera, however, we can then make hypotheses about common ancestors. This is a separate, secondary process. At the right is a small portion of a cladogram, with two genera that are sister taxa. The short horizontal lines on the branch leading to genus B indicate new adaptations (technically character state changes, or apomorphies) that evolved in the common ancestor of B after its split from A. There are no comparable character changes indicated on the line leading to genus A, indicating that A has not changed since the split, and is therefore not measurably different from the actual common ancestor. A and B can be classified as genera based on the cladogram, but then as a separate process we can hypothesize that genus B evolved from a species in genus A. Genus A is monophyletic by the rules of cladistic classification, but still can be considered ancestral to B.
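The two-step reasoning in this paragraph can be sketched as a tiny program. This is an illustration under stated assumptions: the change labels are placeholders, and the rule "a sister taxon with no recorded changes is a candidate ancestor" is simply a restatement of the argument made here, not an established algorithm.

```python
# Hypothetical sister genera, each with the character-state changes
# (apomorphies) recorded on its branch after the split. Labels are invented.
BRANCH_CHANGES = {
    "genus_A": [],                                   # no changes since the split
    "genus_B": ["change_1", "change_2", "change_3"],  # three new adaptations
}


def ancestor_candidates(branch_changes):
    """Return taxa with no recorded changes on their branch.

    Such a taxon is not measurably different from the common ancestor.
    """
    return [taxon for taxon, changes in branch_changes.items() if not changes]


def ancestry_hypotheses(branch_changes):
    """Pair each unchanged taxon with each changed sister as (ancestor, descendant).

    This step happens after classification and is only a hypothesis.
    """
    candidates = ancestor_candidates(branch_changes)
    return [
        (ancestor, descendant)
        for ancestor in candidates
        for descendant, changes in branch_changes.items()
        if descendant != ancestor and changes
    ]


hypotheses = ancestry_hypotheses(BRANCH_CHANGES)
```

The classification step never consults `hypotheses`. Both genera are boxed on the cladogram first, and the suggestion that genus B evolved from a species in genus A is layered on afterward.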

There is one important assumption in this “solution” of the ancestor problem. In the simplified diagram above, it is genera that are the units of cladistic comparison. The short horizontal lines on the line going to genus B in the diagram are generic-level characters. For purposes of recognizing genera, we must assume that characters that distinguish genera from one another are of a greater magnitude than, or perhaps qualitatively different from, characters that distinguish species. Otherwise, we might logically be led to make genera out of smaller and smaller groups of species, even of individual species, following the branching pattern alone.

The determination of generic-level characters is a subjective judgement that is, in this context, inevitable. Otherwise, how do we decide that our own genus, Homo, is in fact a genus, not just a section of Australopithecus? Anthropologists (some at least) have emphasized the "minimization of paraphyly" by lowering the threshold for generic characters. Other systematists might choose to minimize the proliferation of small, barely distinguishable genera by raising that threshold. I have argued that minimal ancestral genera, such as Praeanthropus, are not taxonomically paraphyletic, as they are not identified as ancestral in the cladistic process. How broadly or narrowly such ancestral genera are defined becomes an issue of how much difference should be required to distinguish genera from one another. This issue is far less cataclysmic than the black-and-white battle between pro- and anti-paraphyletic forces that has unnecessarily preoccupied systematists for so many decades.

I would not necessarily extend these arguments to higher levels of classification, to talk about “class-worthy” characters, or “phylum-worthy” characters, for example.  It is not necessary to go there.  It may be that un-ranked clade names are a better option at those levels.  Genera are uniquely important in the taxonomic hierarchy, however, as they are necessary for naming species by the binomial convention, so we must do what we can to  maintain their universal application for all appropriately characterized clusters of species, be they current or ancestral. 


Cela-Conde, C. J. and F. J. Ayala. 2003. Genera of the human lineage. PNAS 100: 7684-7689.