Caveat: This review assumes that there has been no widespread lab testing error in the analysis of DYS 413 by the University of Arizona lab that does the standard testing for FTDNA.


DYS 413
- a case study

DYS 413 - "virtual black & white SNP" or "just another grey STR"?

DYS 413 is a multi copy "STR", which has two copies on a Palindromic region of the Y-Chromosome (Krahn 2006). In 2006, Sengupta et al. described the discovery of the SNP marker M410 which was found in all the J2 M12- lineages they tested. This necessitated a change in nomenclature of J2 M172 lineages, and at the same time, Sengupta et al. acted on the suggestion made by Di Giacomo in 2004 and incorporated DYS 413 into the NRY phylogenetic tree.

Recent evidence suggests that we cannot consider the DYS 413 deletion seen in J2a to be equivalent to a SNP. When all the facts about DYS 413 are examined, it becomes apparent that DYS 413 should never have been incorporated into the NRY Phylogenetic tree (as defined by the YCC in 2002).

In a nutshell: The only valid way in which DYS 413 can be incorporated into the Y-DNA tree is if the Y-DNA tree is officially redefined by the YCC.
Specifically, the YCC would have to state that it is allowable for the NRY tree to contain multi-state characters. This could also effectively mean that we can no longer consider the NRY tree to be a phylogenetic tree (since the tree would no longer be purely cladistically based).
It would also no longer be the NRY tree, as by definition the NRY tree is based on markers found on non-recombining portions of the Y-Chromosome. Strictly speaking, the Palindromic regions of the Y-Chromosome are not non-recombining. Therefore, the NRY tree cannot contain characters that are located on any of the Palindromic regions of the Y-Chromosome. That includes DYS 413 (if it is indeed correct that it is located on a Palindromic region of the Y-Chromosome). This also applies to existing characters in the tree that are located within Palindromic regions, i.e. P25.

In this review, we will first consider the evidence that originally led Sengupta et al. 2006 to incorporate DYS 413 into the Phylogenetic tree. We will then present the current evidence which indicates that DYS 413 cannot be considered a valid character within the context of the whole NRY Phylogenetic tree. Whether or not it can be considered a valid character within the context of only the J2a portion of the tree is a different matter.

Even if we can consider it a valid taxonomic character just within the context of J2 M410+, we cannot notate the presence/absence of the DYS 413 deletion with the NRY nomenclature. Perhaps special notation could instead be officially devised by the Y Chromosome Consortium for clusters within haplogroups that are defined by STR markers (like, for instance, the Greek character clustering notation used for E3b).

The original case for DYS 413 being considered a "virtual SNP"
The text below was originally written in 2006, soon after the Sengupta et al. 2006 paper was published.

In most haplogroups, DYS 413 usually exhibits values of 20 - 24 repeats. Malaspina et al. (2001) showed that DYS 413 is strongly bimodal within haplogroup J2. In some J2, the values were the usual 20 or more repeats, yet in others the values were 18 or lower (usually 17). It was proposed that the reduction of repeats in some J2 was the result of a single deletion event in the J2 family tree, and thus represented a unique event.

 

- i.e. all men with DYS 413 repeat values of 18 or below are descended from a single individual who had a deletion of several repeats on his copies of DYS 413 (and all descendants of that man have DYS 413 repeat values of 18 or less).

We now know that this UEP deletion must have occurred after the M410 mutation: it only occurs in J2s that are positive for M410, and most, but not all, of those that are M410 positive also have the DYS413 UEP deletion. None of those who are positive for the M340 mutation have the DYS413 UEP deletion - thus the M340 lineage is not descended from the DYS413 ≤ 18 lineage (but both are descended from the M410 lineage).
All of the other previously recognised J2 clades possess the DYS413 UEP deletion - i.e. all Y-DNA samples that are in J2 M47+, J2 M68+, J2 M137+, J2 M158+, J2 M67+, & J2 M339+ have DYS413 ≤ 18.

The evidence therefore indicated that the DYS413 deletion was definitely a unique event, and that it was quite clear where in the J2 phylogenetic tree that the DYS413 deletion occurred.

In retrospect: What the Malaspina et al. articles lacked was published data on the repeat values actually found in other haplogroups.

The current case against DYS 413 being considered a "virtual SNP"

Valid characters in a cladistic phylogenetic tree should:
• Have a clear "ancestral" vs "derived" state
• Be a "unique event polymorphism"
    (see the Cladistics page)

A review of DYS 413 values in 14 different haplogroups has demonstrated that DYS 413 does not have a clear-cut "ancestral" vs "derived" state, nor is it a "unique event polymorphism". Therefore DYS 413 cannot be used as a valid character in the NRY Phylogenetic tree.

The spread of repeat values in DYS 413 is strongly bimodal - however, the frequency distributions surrounding the two modes do overlap. Therefore, a character state of "ancestral" cannot be clearly differentiated from a character state of "derived".

The results also demonstrate that a multi-repeat deletion in DYS 413 has occurred on at least 6 different occasions. On 4 of these occasions the deletion is present on only one of the two copies of DYS 413 (within R1b; G; J1; J2 M12+ predicted, Zita cluster), and on two occasions we can surmise that a recLOH event has subsequently resulted in both copies of the DYS 413 marker having a multi repeat deletion (within J2 M410+ and within haplogroup O).

This conclusion is demonstrated in the data tables and graphs below.
(The graphs are paired: the second of each pair has a truncated axis, to show the distributions at the low end of the frequency scale.)

 

Low copy
Repeats  R1b  R1a    Q    O    N    L    K    J  I1c  I1b  I1a    H    G    E    C
     16    2    0    0    0    0    0    0    4    0    0    0    0    1    0    0
     17    0    0    0    3    0    0    0   89    0    0    0    0    0    0    0
     18    0    0    0    0    0    0    0    5    0    0    0    0    0    0    0
     19    0    1    0    0    0    1    0    0   21    0    0    0    0    0    0
     20    2    1    1    0    1    6   15    4   15    1    4    0   14   10    0
     21   65   12    2    0   29    0    0   37   32    6   39    4   65   24    1
     22   81   69   24    0    2    0    0   10    8    0  162    0  104   26    2
     23  360    0    7    0    0    0    0    2    0    0  249    0    2   36    0
     24    0    0    4    0    1    0    0    0    0    0   14    0    1    9    0
     25    4    0    0    0    0    0    0    0    0    0   21    0    0    0    0
     26    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
     27    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
     28    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0

 

High copy
Repeats  R1b  R1a    Q    O    N    L    K    J  I1c  I1b  I1a    H    G    E    C
     16    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
     17    0    0    0    3    0    0    0   82    0    0    0    0    0    0    0
     18    0    0    0    0    0    0    0   13    0    0    0    0    0    0    0
     19    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0
     20    0    0    0    0    0    6   12    0    0    0    0    0    2    0    0
     21    3    1    0    0    4    1    2    3   28    6    3    4    5    2    0
     22    7   77   19    0   24    0    1   37   46    1    7    0  160   33    1
     23  460    2   12    0    0    0    0   13    2    0   57    0   11   19    0
     24   29    1    5    0    5    0    0    2    0    0  100    0    9   45    2
     25   14    2    1    0    0    0    0    0    0    0  312    0    0    6    0
     26    1    0    1    0    0    0    0    0    0    0    6    0    0    0    0
     27    0    0    0    0    0    0    0    0    0    0    3    0    0    0    0
     28    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0

 

Combined
Repeats  R1b  R1a    Q    O    N    L    K    J  I1c  I1b  I1a    H    G    E    C
     16    2    0    0    0    0    0    0    4    0    0    0    0    1    0    0
     17    0    0    0    6    0    0    0  171    0    0    0    0    0    0    0
     18    0    0    0    0    0    0    0   18    0    0    0    0    0    0    0
     19    0    1    0    0    0    1    0    1   21    0    0    0    0    0    0
     20    2    1    1    0    1   12   27    4   15    1    4    0   16   10    0
     21   68   13    2    0   33    1    2   40   60   12   42    8   70   26    1
     22   88  146   43    0   26    0    1   47   54    1  169    0  264   59    3
     23  820    2   19    0    0    0    0   15    2    0  306    0   13   55    0
     24   29    1    9    0    6    0    0    2    0    0  114    0   10   54    2
     25   18    2    1    0    0    0    0    0    0    0  333    0    0    6    0
     26    1    0    1    0    0    0    0    0    0    0    6    0    0    0    0
     27    0    0    0    0    0    0    0    0    0    0    3    0    0    0    0
     28    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0

 

In terms of the black/white vs grey analogy discussed on the cladistics page, we would say that DYS 413 is neither. It is close to being clear cut, but just falls short of it. For most haplotypes we would be able to assign ancestral vs derived, but for some haplotypes we would be unable to do this with 100% certainty.
It is currently true that each haplotype can be more-or-less assigned a DYS 413 status of either ancestral or derived (but not with 100% confidence). We can more-or-less say that values of 18 and below are derived, and that values of 20 and above are ancestral. What complicates matters is that values of 19 are found in haplotypes that are ancestral and in haplotypes that are derived. In addition, there are some haplotypes with one DYS 413 value that is clearly derived and one DYS 413 value that is clearly ancestral.
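The rule of thumb above can be written out as a minimal Python sketch. The thresholds (18 and below, 20 and above) come from the text; the function names and the labels "mixed", "probably derived", "probably ancestral" and "unassignable" are hypothetical and chosen only for illustration. Treating a value of 19 as uninformative, so that the other copy decides, is an assumption of this sketch.

```python
def classify_copy(repeats: int) -> str:
    """Classify one DYS 413 copy by its repeat count (thresholds from the text)."""
    if repeats <= 18:
        return "derived"
    if repeats >= 20:
        return "ancestral"
    return "ambiguous"  # 19 is found in both ancestral and derived haplotypes


def classify_haplotype(low: int, high: int) -> str:
    """Classify a haplotype from its two DYS 413 copies (illustrative labels)."""
    a, b = classify_copy(low), classify_copy(high)
    if a == b == "ambiguous":
        return "unassignable"        # e.g. a hypothetical 19,19 haplotype
    if a == b:
        return a                     # both copies agree
    if "ambiguous" in (a, b):
        # Assumption: a 19 is uninformative, so the other copy decides.
        other = b if a == "ambiguous" else a
        return "probably " + other
    return "mixed"                   # one derived copy, one ancestral copy


for pair in [(17, 17), (23, 23), (16, 23), (18, 19), (19, 21), (19, 19)]:
    print(pair, classify_haplotype(*pair))
```

The "mixed" and "unassignable" cases are exactly the ones that prevent a clean two-state coding of this marker.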

There is one J2 haplotype with a value of 18,19; there are 12 haplotypes in I1c with values of 19,21 and 9 with values of 19,22; there is one haplotype in L with a value of 19,20; and there is one haplotype in R1a with a value of 19,22. While it is reasonably clear that the character state of the J2 haplotype is "derived" and the character state of the I1c, L and R1a haplotypes is "ancestral", if a recLOH event were to occur in any of them there would be a 50:50 chance of the resulting DYS 413 values being 19,19. If such a haplotype were found, we would not be able to assign it an ancestral vs derived status.
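To make the recLOH argument concrete, the short Python sketch below lists the two possible results of a single recLOH event for each of the observed haplotypes that carry a 19. It assumes, as the 50:50 figure implies, that either copy is equally likely to overwrite the other; the function name `recloh_outcomes` is hypothetical.

```python
def recloh_outcomes(low: int, high: int) -> list[tuple[int, int]]:
    """Return the two possible haplotypes after one recLOH event.

    Assumption: one copy overwrites the other, and each direction
    is equally likely.
    """
    return [(low, low), (high, high)]


# Observed DYS 413 values containing a 19, as listed in the text.
observed = [
    ("J2", 18, 19),
    ("I1c", 19, 21),
    ("I1c", 19, 22),
    ("L", 19, 20),
    ("R1a", 19, 22),
]

for haplogroup, low, high in observed:
    print(haplogroup, (low, high), "->", recloh_outcomes(low, high))
```

In every case one of the two outcomes is 19,19, whether the starting haplotype leans derived (J2) or ancestral (I1c, L, R1a), which is why a 19,19 result could not be assigned to either state.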

There are also two R1b haplotypes with DYS 413 values of 16,23; one haplogroup G haplotype that has DYS 413 values of 16,22; one haplogroup J1 haplotype that has DYS 413 values of 17,21; and one J2 haplotype that has DYS 413 values of 18,23. This latter haplotype falls within the project's J2b-Zita cluster, and clusters with the J2 M12+ haplotypes in the 67 marker diagram.

Finally, all three haplogroup O haplotypes (two of which are O3a5) have DYS 413 values of 17,17. Clearly the multi-repeat deletion followed by a recLOH does not constitute a "unique event". While it is true that there are several characters in the YCC tree that are not unique events, these other multi-event mutations have only occurred a couple of times, are not further complicated by being in an evolutionarily volatile region (i.e. within palindromes), and are not complicated by lacking 100% clear-cut ancestral vs derived states.
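These observations can be cross-checked against the low-copy and high-copy tables in this review. The Python sketch below is only an illustration: it transcribes the nonzero counts for repeat values of 18 or below and reports which haplogroups show a reduced value on the low copy only, and which show it on both copies. The tables pool all of haplogroup J, so this check cannot separate J1, J2 M12+ and J2 M410+; the variable and function names are hypothetical.

```python
# Nonzero counts at repeat values of 18 or below, transcribed from the tables.
LOW_COPY = {16: {"R1b": 2, "J": 4, "G": 1}, 17: {"O": 3, "J": 89}, 18: {"J": 5}}
HIGH_COPY = {17: {"O": 3, "J": 82}, 18: {"J": 13}}


def totals(table: dict[int, dict[str, int]]) -> dict[str, int]:
    """Sum the counts per haplogroup across all repeat values in a table."""
    out: dict[str, int] = {}
    for counts in table.values():
        for haplogroup, n in counts.items():
            out[haplogroup] = out.get(haplogroup, 0) + n
    return out


low = totals(LOW_COPY)
high = totals(HIGH_COPY)

# Haplogroups where both copies are seen at 18 or below.
both_copies = sorted(set(low) & set(high))
# Haplogroups where only the low copy is ever seen at 18 or below.
low_copy_only = sorted(set(low) - set(high))

print("both copies reduced:", both_copies)
print("low copy only:", low_copy_only)
# J haplotypes with a reduced low copy but a high copy of 19 or more.
print("J with one reduced copy:", low["J"] - high["J"])
```

The three J haplotypes with only one reduced copy correspond to the 18,19, 17,21 and 18,23 haplotypes described in the text.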

It is not just the fact that it is an STR that invalidates DYS 413 as a taxonomic character within the context of the whole NRY phylogenetic tree - it is that fact combined with the fact that the deletion event in DYS 413 was clearly not a unique event, and the fact that it is a multi-copy marker found on a Palindromic arm.

This discovery that multi-repeat deletions can be found in haplogroups other than J2 raises the question: did Sengupta et al. 2006 test DYS 413 in all their study samples, or did they only test their haplogroup J2 study samples?
A close examination of the methods and results sections of Sengupta et al. 2006 shows that DYS 413 was only tested in J2-M410+ lineages. If they had instead tested this marker on all their samples (and most importantly the haplogroup O samples), they would have noted that the DYS 413 deletion was not an evolutionary event unique to a single lineage within J2, and most likely they would never have incorporated it into the Phylogenetic tree. To give credit to Sengupta et al. 2006, their treatment of DYS 413 was in accordance with suggestions that were originally made by Di Giacomo et al. 2004. In the previous year, Di Giacomo et al. 2003 did test DYS 413 in all their samples (if they did have any unusual results, they didn't mention them). Their samples presumably did not contain any in haplogroup O.

This highlights the importance of not making assumptions when conducting biological research. If a marker is thought to define a specific group - it is not wise to only test that marker in haplotypes that fall within that specific group. Important biological information could be lost by being too selective.

The final reason why DYS 413 cannot be considered a valid character within the context of the NRY Phylogenetic tree is that it is found on a Palindromic region of the Y-Chromosome. Most of the Y-Chromosome is non-recombining. The tips of the arms of the Y-Chromosome can recombine, so markers on this portion of the Y-Chromosome are excluded from consideration.

The Y-Chromosome has 8 hair-pin regions that are called Palindromic regions. Within these Palindromic regions recombination can occur. This recombination results in what is known as "recombinational loss of heterozygosity". More about this can be read here.

Since a form of recombination occurs within palindromes, strictly speaking any marker found within a Palindrome cannot be used as a character in the NRY Phylogenetic tree. NRY means Non Recombining regions on the Y-Chromosome.

What if we consider DYS 413 within the context of only haplogroup J2?

If we consider DYS 413 just within the context of haplogroup J2, we can still consider it a very useful character. However, we cannot be 100% certain that there was only one DYS 413 deletion event within the J2a portion of the J2 tree. We are not saying that we believe that a deletion event has occurred more than once within J2a - we are just saying we cannot be 100% certain that it has only occurred once (given that we can say for sure that a deletion event has occurred at least three times within J).

J M12+ = Red
J M67+ = Blue
J M92+ = Purple
J M172* = Grey

Data from J2 Y-DNA project.

Above is the October 2007 67 marker Network diagram for the J2 Y-DNA project. It indicates which haplotypes have DYS413 in the ancestral condition (light grey background and red background) and which have DYS413 in the derived condition (blue background), and DYS413 in a mixed ancestral/derived condition (white background within the red background area). As can be seen, the M12- haplotypes that have the "ancestral" condition for DYS413 cluster between the M12- haplotypes that have the derived condition, and the M12+ haplotypes. Most of the DYS 413 derived haplotypes we can almost certainly say share the derived state by common descent, but there are a few we can't be 100% sure of (perhaps 99.9% sure).

Overall - DYS 413 is still a very important diagnostic marker within the context of haplogroup J2. The fact that it is an STR on a palindrome does not completely negate its usefulness. Indeed, there are other STR markers that are equally useful. However, none of these markers can be used within the context of the whole NRY phylogenetic tree. The NRY Phylogenetic tree that was devised by the YCC is cladistically based, and however useful they may be within the context of each haplogroup, STRs cannot be assigned the 100% clear-cut ancestral/derived states that are an essential requirement in a cladistic phylogenetic tree.

ADDED 6 November 2007
When we first began preparing the information about the J2b-Zita member who has DYS 413 values of 18,23 we weren't aware of how many other haplotypes outside of haplogroup J2 also had DYS 413 deletions.
We were originally going to present several different hypotheses to account for the results, and one of those hypotheses was that the DYS 413 deletion had occurred only once, and back mutations had occurred in most of J2b, and some of J2a.

We also now know there has been a deletion event in a J lineage as well, so if there was only one deletion event in J M304, at least 4 subsequent recLOH mutations would be needed to account for the results seen.

Initial mutation:       23,23 --> 17,23 (J)
Subsequent mutations:   --> 23,23 (J1)
                        --> 23,23 (J2a)
                        --> 23,23 (J2b)
                        --> 17,17 (J2a)
                        --> 17,17 (J2b)   [not observed -- extinct?]
                        --> 17,17 (J1)    [not observed -- extinct?]

We would also expect to see the full mutation of 17,17 in a proportion of J1 and J2b (which has not been observed). Theoretically, from an initial mutation of 17,23, there should be equal numbers of recLOH mutations from 17,23 to 23,23 and to 17,17. This would imply that an additional two mutation events would likely have occurred, and that these lineages have subsequently become extinct. This would give a total of 7 mutation events (plus the extinction of two lineages).

However, we decided that the above hypothesis was far too convoluted (with the various back mutations required), and that it was far more parsimonious to hypothesize that a DYS 413 deletion had occurred more than once (this alternative hypothesis requires just 4 mutation events, in contrast to the 7 mutation events required above).
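The event count behind this comparison can be tallied in a few lines of Python. This is only a bookkeeping sketch: the list transcribes the single-origin scenario given above, the figure of 4 events for the parallel-deletion hypothesis is taken from the text as stated (its individual events are not itemised there), and the variable names are hypothetical.

```python
# Events required by the single-origin hypothesis, transcribed from the text:
# (kind, lineage, resulting DYS 413 values, observed in the data?)
SINGLE_ORIGIN_EVENTS = [
    ("deletion", "J", "17,23", True),
    ("recLOH", "J1", "23,23", True),
    ("recLOH", "J2a", "23,23", True),
    ("recLOH", "J2b", "23,23", True),
    ("recLOH", "J2a", "17,17", True),
    ("recLOH", "J2b", "17,17", False),  # not observed, lineage presumed extinct
    ("recLOH", "J1", "17,17", False),   # not observed, lineage presumed extinct
]

# Total given in the text for the parallel-deletion hypothesis.
PARALLEL_EVENTS = 4

single_total = len(SINGLE_ORIGIN_EVENTS)
unobserved = sum(1 for event in SINGLE_ORIGIN_EVENTS if not event[3])

print("single origin:", single_total, "events,", unobserved, "unobserved")
print("parallel deletions:", PARALLEL_EVENTS, "events")
print("extra events under single origin:", single_total - PARALLEL_EVENTS)
```

Under a simple parsimony criterion, the hypothesis needing fewer events, and no unobserved extinct lineages, is preferred.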

In the end, after we had reviewed the distribution of DYS 413 values in the other haplogroups, we decided that the evidence overwhelmingly pointed towards numerous parallel mutations.

This does, however, highlight the fact that some J2a without the deletion may be from a lineage that originally had a single-copy deletion and then reverted to two copies without the deletion - just as the J2a with two copies of the deletion gained the second copy through the loss, by recLOH, of the copy without the deletion.


The data were obtained from the 67 marker haplotype results in FTDNA Y-DNA haplogroup projects. We greatly appreciate these haplotypes being made available for study (which highlighted to us the importance of reciprocating). We assume that all individuals whose haplotypes were represented consented to their haplotype being made publicly available. We also thank all individuals who made their haplotypes available.

Creative Commons License
This work is licensed under a
Creative Commons Attribution-No Derivative Works 3.0 License.
This work can be freely cited, if it is attributed to:
"Cone, A.J. (October 2007) A review of the status of DYS 413 as a taxonomic character in the NRY Phylogenetic tree.
The J2 Y-DNA project. http://www.j2-ydnaproject.net/dys413.html"