
The YCC nomenclature system
- a 5 year review

"by lineage" haplogroup labels are arbitrary and transient
"by mutation" SNP nomenclature is permanent and unchanging*

In 2002, the Y-Chromosome Consortium published a research paper proposing a unified nomenclature system for Y-DNA.

The salient features of this nomenclature system are the following:

1) It is based on a cladistic phylogenetic tree built from binary "UEPs" (unique event polymorphisms) in the non-recombining region of the Y chromosome. This tree was rooted by comparing homologous DNA sequences between humans and closely related species (e.g. chimpanzees, gorillas and orangutans).

2) The tree was deliberately arranged so that branches with more "terminal branches" come after branches with fewer "terminal branches".
For example, if a branch splits into two or more "twigs", the twigs with fewer "twiglets" are placed before the twigs with more twiglets. (The YCC calls these twiglets "terminal branches".)
An example of this is the branch labeled "F", which divides into 5 twigs. If you refer to the original YCC tree, you will see that:

G had 5 terminal branches;
H had 5 terminal branches;
I had 10 terminal branches;
J had 13 terminal branches; and
K had 60 terminal branches.

As you can see, the number of terminal branches increases with the ordering of the lettering.
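
The ordering rule can be illustrated with a short Python sketch. It only uses the terminal-branch counts quoted above; the function names are made up for illustration, and the way ties are handled (G and H both have 5) is an assumption, since the rule as described does not say how ties are resolved.

```python
# Terminal-branch counts quoted above for the five sub-branches of F.
TERMINAL_COUNTS = {"G": 5, "H": 5, "I": 10, "J": 13, "K": 60}


def follows_ycc_ordering(labels, counts):
    """Return True if no branch precedes a branch with fewer terminal branches."""
    sizes = [counts[label] for label in labels]
    return all(a <= b for a, b in zip(sizes, sizes[1:]))


def arrange(counts):
    """Arrange branches by ascending terminal-branch count.

    Python's sort is stable, so tied branches keep their input order.
    That tie-break is an assumption of this sketch, not a YCC rule.
    """
    return sorted(counts, key=lambda label: counts[label])


# The published lettering G, H, I, J, K satisfies the rule.
lettering_ok = follows_ycc_ordering(["G", "H", "I", "J", "K"], TERMINAL_COUNTS)

# Starting from a shuffled input, sorting by count recovers the same order.
shuffled = {"K": 60, "I": 10, "G": 5, "J": 13, "H": 5}
arranged = arrange(shuffled)
```

A consequence worth noting is that the arrangement depends on the counts at the time the tree is drawn: if later discoveries change the number of terminal branches, strict application of the rule would reorder (and so relabel) the branches.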

3) Two separate (but complementary) haplogroup nomenclature systems were proposed: By lineage, and by mutation.

Below is an example used by the YCC.

  By lineage By mutation
G-P17 or G-P18
H-M52* or H-M69
H-M39 or H-M138


The YCC authors state that each of these alternative nomenclature systems has its advantages and disadvantages.
The strength of the mutation based system is that the name for each terminal branch will always remain the same, regardless of new discoveries that may change the arrangement of twigs.
The strength of the lineage based nomenclature is that the label easily conveys the relationship of the lineage relative to other lineages (in a manner somewhat analogous to a mnemonic - it's much easier for the average person to wrap their head around the relationship between R1b and R1a compared to R2, than it is to work out the relationship between R-P25 and R-SRY10831.2 compared to R-M124).
On the other hand, new SNP discoveries change the topology of the tree and therefore the lineage based labels, which means that the lineage originally labeled R1b is today labeled R1b1. In contrast, the "by mutation" nomenclature stays the same, regardless of new SNP discoveries.
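
The difference between the two systems can be sketched in Python. This is an illustrative model only, not the YCC's own procedure: it assumes that "by lineage" labels are built by alternating numbers and lowercase letters at each level of nesting, and that children are listed in their tree order. The marker names P25, SRY10831.2 and M124 come from the example above; `SNP_R`, `SNP_R1` and `SNP_NEW` are hypothetical placeholders for markers the text does not name.

```python
import string


def lineage_labels(tree, root_snp, root_label):
    """Map each defining SNP to a 'by lineage' label.

    tree maps a SNP to the ordered list of its child SNPs. Suffixes
    alternate between numbers and lowercase letters at each level.
    """
    labels = {root_snp: root_label}
    stack = [root_snp]
    while stack:
        parent = stack.pop()
        parent_label = labels[parent]
        # After a letter comes a number; after a number comes a letter.
        use_number = not parent_label[-1].isdigit()
        for index, child in enumerate(tree.get(parent, [])):
            suffix = str(index + 1) if use_number else string.ascii_lowercase[index]
            labels[child] = parent_label + suffix
            stack.append(child)
    return labels


def mutation_label(snp, major_clade="R"):
    """'By mutation' name: the major clade letter plus the defining SNP."""
    return f"{major_clade}-{snp}"


# Original arrangement: P25 sits directly below the R1 node.
tree_before = {
    "SNP_R": ["SNP_R1", "M124"],
    "SNP_R1": ["SRY10831.2", "P25"],
}

# After a (hypothetical) new SNP is found upstream of P25.
tree_after = {
    "SNP_R": ["SNP_R1", "M124"],
    "SNP_R1": ["SRY10831.2", "SNP_NEW"],
    "SNP_NEW": ["P25"],
}

before = lineage_labels(tree_before, "SNP_R", "R")
after = lineage_labels(tree_after, "SNP_R", "R")
```

In this sketch `before["P25"]` is `R1b` and `after["P25"]` is `R1b1`, while `mutation_label("P25")` gives `R-P25` in both cases.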

4) The YCC would publish annual revisions of the Y-chromosome phylogenetic tree. Any published scientific study would cite the version of the tree it used. Each tree revision would be static, easily referenceable and easily consulted.

(This refers back to the statements made on the taxonomy basics page.) The full scientific documentation of taxonomic changes is of paramount importance in helping future readers understand which taxa were being referred to in the past. In the present we may understand precisely what we are referring to, whether or not it is properly documented, but future readers will not have that understanding unless it is documented properly.

Post 2002

It is unfortunate that the YCC did not provide annual updates as originally promised. Since the original paper was published, there has been only one official full update (i.e. Jobling 2003). This is despite the fact that there have been several new SNP discoveries that alter the structure of the tree (and thus alter the "by lineage" nomenclature). To fill the void left by the lack of official YCC updates, several updates have been made that fall outside the principles originally proposed.

Several scientific papers have offered updated nomenclature for various portions of the tree. Two unofficial updates of the full NRY tree have also been produced: one by FTDNA and one by ISOGG. The former is most in keeping with the principles outlined by the YCC (the only exception being that you needed to purchase it from FTDNA in order to view a clear copy of the full tree). The latter is no longer a phylogenetic tree in the YCC's sense (as it contains multi-state characters in addition to the binary characters specified by the YCC), and although it is easily consulted on the internet, it is not static and is therefore not easily referenceable.

Several modifications that have been made to the tree have not been in strict accordance with the principles originally outlined by the YCC.

These include:

Structural changes

Several of the new discoveries were of a nature that did not fit any of the revision examples outlined in the 2002 paper published by the YCC. Below are a few considerations:

If the "by lineage" system is to continue without creating too much confusion, how much of the original labeling should be set in stone, and how much should be allowed to change under the original proposal of alternating alphanumeric coding for each level of nesting? For instance, should the whole system of labeling remain dynamic, changing with each new SNP discovery? In that case P25 might now validly be known as Q1b1 or S1b1 (based on the labeling changes that would result from the discovery of the S2 and S22 SNPs), assuming that it is even valid to keep P25 in the NRY tree.
What is currently referred to as IJ would become I; what is now known as I would become either I1 or J; and what is now known as J would become I2 or K. You might ask why M170 would become I1 or J, rather than I2 or K. The answer is simple: M170 has about 17 terminal branches and M304 has about 30, and branches with fewer "tip haplogroups" are arranged before those with more.
Or is it better that S2/S22 be referred to as IJ, so that the labeling of the rest of the tree is not disrupted? But what if a SNP had been discovered that united M201 and M304? Would we call this new united branch GJ? Under the nomenclature system originally proposed we could not: GJ would have to include H and I as well as G and J.

Alternatively, should some levels of labeling be set in concrete, to minimise the confusion that could arise when new SNP discoveries alter the nesting structure of the phylogenetic tree? Likewise, should we retain the system of arranging the most diverse branches after the less diverse ones?

Indeed, the latter has already been dispensed with in several tree updates. For instance, when the marker M223 was discovered, strictly speaking it should have become I1a, with P30 changed to I1b and P37.2 changed to I1c. In reality, though, giving it the label I1c was far less disruptive (aside from the fact that it was originally incorporated into the tree before its proper place in the haplogroup I tree was determined).

Non-binary Characters

 See the DYS 413 review page for more details.

Dynamic/Non static trees

If the tree is dynamic: suppose a person talks about R1b1c9 in January 2007 (referencing the 2007 tree), but the nomenclature structure changes before the end of the year. At the end of 2007 the same label might refer to an entirely different taxon. While the change in nomenclature might be clear to someone in the present, it may not be so clear in the future. What a person in the future will see is the December version of the 2007 tree, which is not the same tree as the January version.
If the tree is static: suppose a person talks about R1b1c9 in January 2007 (referencing the 2007 tree). At the end of 2007 the same label will still refer to the same taxon. In 2008 a new version of the tree would be issued, incorporating all the discoveries made in 2007.
Only static trees can properly document taxonomic changes for those in the future trying to make sense of what we are saying now.
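
The idea of a static, citable tree can be sketched as a small versioned archive in Python. Everything here is hypothetical: the class name, the version strings and the placeholder markers `SNP_X` and `SNP_Y` are invented for illustration and do not describe any real tree release. The point is only that a citation of the form (version, label) always resolves to the same taxon once a version is frozen.

```python
from types import MappingProxyType


class StaticTreeArchive:
    """Stores frozen snapshots of a label-to-SNP mapping, one per version."""

    def __init__(self):
        self._versions = {}

    def publish(self, version, label_to_snp):
        """Freeze a new tree version; an existing version cannot be changed."""
        if version in self._versions:
            raise ValueError(f"version {version!r} is already published")
        # Copy the mapping and expose it read-only.
        self._versions[version] = MappingProxyType(dict(label_to_snp))

    def resolve(self, version, label):
        """Return the defining SNP that a label meant in a given version."""
        return self._versions[version][label]


archive = StaticTreeArchive()
archive.publish("2007", {"R1b1c9": "SNP_X"})  # hypothetical marker
archive.publish("2008", {"R1b1c9": "SNP_Y"})  # same label, different taxon

meaning_in_2007 = archive.resolve("2007", "R1b1c9")
meaning_in_2008 = archive.resolve("2008", "R1b1c9")

# A dynamic tree corresponds to overwriting "2007" in place, which the
# archive refuses to do.
try:
    archive.publish("2007", {"R1b1c9": "SNP_Y"})
    overwrite_rejected = False
except ValueError:
    overwrite_rejected = True
```

With a dynamic tree there is only one mutable mapping, so a reader in the future has no way to recover what R1b1c9 meant in January 2007.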

Some of these alterations might be considered valid, but to be accepted as official they require a statement of revised NRY tree guidelines from the YCC. If we are to accept STR markers as valid, then the YCC needs to state officially that the NRY tree is no longer a binary phylogenetic tree.

If the YCC cannot make annual updates itself, then it would be helpful if it published much clearer guidelines for individuals who do want to update the nomenclature, so that those alterations can be made in a scientifically valid manner.

Creative Commons License
This work is licensed under a
Creative Commons Attribution-No Derivative Works 3.0 License.
This work can be freely cited, if it is attributed to:
The J2 Y-DNA project or
Angela Cone (2007)
Msc Evolutionary Ecology