Null Alleles in 0

Thomas Krahn, FTDNA Conference 2009

Definition of Null Allele

● Original meaning: A mutant copy of a gene that completely lacks that gene's normal function. (Wikipedia)

[Diagram: gene expression. DNA (gene with promoter) -> transcription by RNA polymerase -> mRNA -> splicing -> translation at the ribosomes -> protein]

Not a sharp definition. Many things can go wrong in the complex gene expression process.

● Concerning DNA markers: A DNA segment of good quality, limited to the two primer pairs of a PCR reaction that doesn't yield a PCR product in some biological samples while all other samples of that kind show a clearly detectable signal with the same PCR reaction.

Note: This is my own definition. Other definitions I found in the literature and on the internet usually focus on a very narrow subtype of DNA segment, e.g. STR markers.


For a PCR reaction we need a solution of intact DNA. Degraded (sheared) DNA cannot be amplified because the Taq polymerase needs to extend one DNA strand all the way down to the reverse primer. If the Taq polymerase drops off the DNA segment before it reaches the reverse primer, we will not get an exponential amplification. Since degraded DNA is a property of the sample and not of a lineage that can have descendants, we exclude degraded DNA from being a Null Allele for genealogical purposes.
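The requirement that a single template strand must cover the whole segment between both primers can be sketched in a few lines. This is only an illustration: the function names, the fragment coordinates and the primer positions (1000 and 1200) are made up for the example and do not come from a real assay.

```python
def spans_amplicon(frag_start, frag_end, fwd_start, rev_end):
    """True if one DNA fragment covers the full segment between both primers."""
    return frag_start <= fwd_start and rev_end <= frag_end


def amplifiable_fraction(fragments, fwd_start, rev_end):
    """Fraction of template fragments that could support exponential PCR."""
    if not fragments:
        return 0.0
    hits = sum(spans_amplicon(s, e, fwd_start, rev_end) for s, e in fragments)
    return hits / len(fragments)


# Hypothetical coordinates: the assay amplifies positions 1000..1200.
intact = [(0, 5000)]
sheared = [(0, 1100), (1100, 2400), (2400, 5000)]  # break inside the amplicon

print(amplifiable_fraction(intact, 1000, 1200))
print(amplifiable_fraction(sheared, 1000, 1200))
```

In the sheared example no fragment contains both primer sites, so the sample gives no product even though the marker itself is not mutated.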


All our known STR markers (e.g. DYS391, DYF385S1, vWA etc.) are DNA segments that are defined by flanking PCR primer sequences. DYS stands for “DNA Y-chromosome Segment”. The famous database GDB that recorded all primer pairs has unfortunately been offline since summer 2007, so it is sometimes difficult to look up the exact primers for D markers from older publications. GenBank still keeps a record of a partial subset of the GDB markers. There they are also called STS markers (= Sequence Tagged Sites). An STS may also contain one or more SNP markers.


The actual characteristic of a Null Allele is that we can't detect a signal from a PCR product. We'll go into detail later about what “detection” means, but this already makes clear that we need to precisely define a detection limit above the background noise of the detection instrument. Some mutations in the primer binding region don't completely inhibit the formation of a PCR product, so that a small signal persists despite the mutation. With alternative assays such a small signal may still be identified as a Null Allele.
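One way to make a detection limit explicit is sketched below. It is not the lab's actual procedure: the rule (mean background plus k standard deviations), the default k = 3, the function names and the example readings are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def detection_limit(noise_readings, k=3.0):
    """Assumed rule: mean background plus k sample standard deviations."""
    return mean(noise_readings) + k * stdev(noise_readings)


def call_peak(peak_height, noise_readings, k=3.0):
    """Classify a peak as 'signal' only if it exceeds the detection limit."""
    if peak_height > detection_limit(noise_readings, k):
        return "signal"
    return "no signal"


# Hypothetical baseline readings from an empty stretch of the trace.
noise = [10, 12, 11, 9, 13]

print(call_peak(500, noise))  # strong product peak
print(call_peak(20, noise))   # weak residual peak, still above the limit
print(call_peak(14, noise))   # within the background noise
```

The middle case corresponds to a mutation that only partly inhibits the PCR: the weak peak is technically a signal, so the threshold alone would not flag the sample and an alternative assay is needed.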


Here comes the population genetic aspect of Null Alleles as usable phylogenetic markers. It is, however, important to understand the molecular genetic background of the mutation mechanism. Some of these genetic changes may occur independently on completely different branches of the phylogenetic tree, and some of them may even be reversible. Depending on the stability of the marker we may need to select independent assays to restrict or confirm the phylogenetic position of a Null Allele marker.

The phrase “all other samples of that kind show a clearly detectable signal” makes clear that every Null Allele requires a positive control. This is usually easy with routine STR markers. However, if “other samples” is restricted to a narrow population, the samples with Null Alleles may become the majority. Alternatively, a competitive primer can sometimes be designed that inverts the definition of a Null Allele marker: this primer matches only the samples that carry the mutation and doesn't yield a PCR product for the “normal” samples. In our lab we have designed assays that combine both primers so that we are able to properly distinguish the alleles and always get at least one positive result.
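The logic of an assay that combines the regular primer with a mutation-specific competitive primer can be written as a small decision table. This is a sketch only; the function name and the wording of the results are assumptions, not the lab's reporting scheme.

```python
def interpret_combined_assay(normal_product, mutant_product):
    """Interpret a two-primer assay; each argument is True if that
    primer yielded a detectable PCR product."""
    if normal_product and not mutant_product:
        return "normal allele"
    if mutant_product and not normal_product:
        return "mutation present (Null Allele in the regular assay)"
    if normal_product and mutant_product:
        return "both products detected, needs review"
    return "no product at all, assay or sample failed"


for normal, mutant in [(True, False), (False, True), (True, True), (False, False)]:
    print(normal, mutant, "->", interpret_combined_assay(normal, mutant))
```

Because a valid sample always gives at least one product, a completely empty result points to a technical failure instead of being mistaken for a Null Allele.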

Basics


● Capillary electrophoresis to detect the PCR products

[Figure: capillary electrophoresis schematic with a DNA fragment between the negative (-) and positive (+) electrodes]

What can go wrong in PCR?

● Bad DNA template
● Assay doesn't work
● Detection method fails

If we can exclude the above, but still get no signal from a PCR product

=> then a NULL allele is very likely

(...but not proven).
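The exclusion reasoning above can be summarized as a short check. This is a hedged sketch: the function and argument names are hypothetical, and each flag stands for whatever controls the lab uses to rule out that failure.

```python
def interpret_missing_signal(template_ok, assay_ok, detection_ok):
    """Interpret a missing PCR signal after checking the three failure sources."""
    open_causes = []
    if not template_ok:
        open_causes.append("bad DNA template")
    if not assay_ok:
        open_causes.append("assay doesn't work")
    if not detection_ok:
        open_causes.append("detection method fails")
    if open_causes:
        return "not excluded: " + ", ".join(open_causes)
    return "NULL allele very likely (but not proven)"


print(interpret_missing_signal(True, True, True))
print(interpret_missing_signal(False, True, True))
```

Even when all three checks pass the result stays a likelihood statement, which is why independent assays with alternative primers are used for confirmation.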

DYS439 Null mutation (L1)

Not observed when STR testing was performed in the GRC lab because we use a different forward primer.

DYS437 Null

DYS391 Null

DYS463 Null

DYS565 Null

DYS448 Null

PCR with more distant primers did NOT yield any PCR products.

[Figure: DYS448 locus with the regular primer pair and the outer primers]

PCR product on agarose gel

GRC005356 -> DYS448 Null
GRC003436 -> DYS448 19
GRC000001 -> DYS448 18
GRC000027 -> female
Size standard (bp): 4000, 2000, 1250, 800, 500, 300, 200, 100

The DYS448 Y-STR marker has been amplified with alternative primers
DYS448_f: GAGGAGGATATGTCAAAGGATTC
DYS448_r: CAGTTTCACTTCATGTTTGGG
and PCR products have been sized on an agarose gel (FlashGel 1.2% agarose, Lonza, 200V/5min). The positive controls (19 and 18 repeats) show a band at the expected size of ca. 800 bp. The female negative control and the DYS448 Null allele sample don't have a PCR product and their lanes on the gel are empty. Amplification assays with alternative primer sets practically eliminate the hypothesis of an inhibited PCR due to a mutation on the primer binding site.

DYS448 Null

Palindromic Pack results are generally inconspicuous...

Except DYF397 has possibly only 2 alleles

DYS448 Null

DYS448 = Null
DYS464 = 14-15
DYS459 = 9-10
DYS401 = 14-17
DYF408 = 188-188-8-13
DYF399 = 21t-25c (no .1 allele!)
DYF397 = 14-15
DYS725 = 31-31
DYS392 = 11

DYS392 is NOT missing.

DYS448 is located on the unique loop of the P3 palindrome.

[Figure: map of the Y-chromosome palindromes P1, P2 and P3 with the positions of DYS461, DYS452, DYS485, DYS392, DYF397, DYS448, DYF399 (ins G, T-type, C-type), DYS464 (G-type, C-type), DYS725, DYF371, DYF401, DYF387, DYS459, DYF385, DYS724 and DYF408]

DYF399: only 2 alleles and no .1 allele
DYF397: only 2 alleles
DYS464 and DYS725: only 2 alleles

DYS448 Null

Loop Constellation!

[Two figures: the palindrome map drawn in a loop constellation, annotated with the marker results of this sample; label: Recombination]

DYS389 Null

DYS389 Null

Yfiler

Singleplex

DYS389 Null

Deletion of the middle fragment in between DYS389I and DYS389B

[Figure: DYS389 repeat structure with four numbered repeat blocks of 5, 11, 3 and 10 units; brackets mark DYS389I and DYS389II]

The nomenclature of DYS389 is defined as
DYS389I: [TCTG]q [TCTA]r = GenBank top strand
DYS389II: [TCTG]n [TCTA]p [TCTG]q [TCTA]r = GenBank top strand
See: http://www.cstl.nist.gov/biotech/strbase/str_y389.htm

The deleted sample matches the first 5 repeats [TCTG] from the related samples in R1b1c. It shows 10 repeats of TCTA which we can align to the left or to the right side. 5 x [TCTG] + 10 x [TCTA] = 15 repeat units
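The repeat arithmetic can be checked with a short sketch. The sequence below is built from the repeat counts stated above, not from real sequencing data, and the helper name is hypothetical.

```python
def count_repeat_blocks(seq, motifs):
    """Count consecutive copies of each motif, in the given order,
    starting at the beginning of seq."""
    counts = []
    pos = 0
    for motif in motifs:
        n = 0
        while seq.startswith(motif, pos):
            n += 1
            pos += len(motif)
        counts.append(n)
    return counts


# Repeat region of the deleted sample as described: 5 x TCTG, then 10 x TCTA.
deleted_allele = "TCTG" * 5 + "TCTA" * 10

counts = count_repeat_blocks(deleted_allele, ["TCTG", "TCTA"])
print(counts, "total repeat units:", sum(counts))
```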

DYS389 Null

Peak shows up at “16” - But really has 15 repeats!

DYS389 Null

25-marker haplotypes (column headings are DYS numbers):

393   390   19    391   385a  385b  426   388   439   389I  392   389II 458   459a  459b  455   454   447   437   448   449   464a  464b  464c  464d
13    24    14    10    11    15    12    12    12    15    13    0     18    9     10    11    11    25    15    19    30    15    15    17    18    (DYS389 Null)
13    24    14    10    11    15    12    12    12    13    13    30    18    9     10    11    11    25    15    19    30    15    15    17    17
13    24    14    10    11    15    12    12    12    13    13    30    18    9     10    11    11    25    15    19    30    15    15    17    17
13    24    14    10    11    15    12    12    12    13    13    29    18    9     10    11    11    25    15    19    30    15    16    16    17
13    24    14    10    11    15    12    12    12    13    13    30    18    9     10    11    11    25    15    19    30    15    15    16    17
13    24    14    10    11    15    12    12    12    13    13    29    18    9     10    11    11    25    15    19    30    13    15    17    17
13    24    14    10    11    15    12    12    12    13    13    29    18    9     10    11    11    25    15    19    30    13    15    17    17
13    24    14    10    11    15    12    12    12    13    13    30    18    9     10    11    11    25    15    19    30    15    15    17    17
13    24    14    10    11    15    12    12    12    13    13    31    18    9     10    11    11    25    15    19    30    15    15    17    17
13    24    14    10    11    15    12    12    12    13    13    30    18    9     10    11    11    25    15    19    30    15    15    16    17
13    24    14    10    11    15    12    12    12    14    13    30    18    9     10    11    11    25    15    19    30    13    15    16    17
13    24    14    10    11    15    12    12    12    13    13    30    18    9     10    11    11    25    15    19    30    15    15    17    17
13    24    14    10    11    15    12    12    12    13    13    29    18    9     10    11    11    25    15    19    30    13    15    17    17
13    24    14    10    11    15    12    12    12    13    13    30    18    9     10    11    11    25    15    19    30    15    15    17    17
13    24    14    10    11    15    12    12    12    13    13    30    18    9     10    11    11    25    15    19    30    15    15    16    17
13    24    14    10    11    15    12    12    12    13    13    30    18    9     10    11    11    25    15    19    30    15    15    17    17
13    24    14    10    11    15    12    12    12    13    13    30    18    9     10    11    11    25    15    19    30    15    15    16    17
13    24    14    10    11    15    12    12    12    13    13    30    18    9     10    11    11    25    15    19    30    15    15    16    17
13    24    14    10    11    15    12    12    12    13    13    29    18    9     10    11    11    25    15    19    30    13    15    16    17
13    24    14    10    11    15    12    12    12    13    13    29    18    9     10    11    11    25    15    19    30    13    13    15    15
13    24    14    10    11    15    12    12    12    13    13    29    18    9     10    11    11    25    15    19    30    13    16    16    17
13    24    14    10    11    15    12    12    12    13    13    29    18    9     10    11    11    25    15    19    30    15    15    15    17
13    24    14    10    11    15    12    12    12    13    13    30    18    9     10    11    11    25    15    19    30    13    15    16    17
13    24    14    10    11    15    12    12    12    13    13    29    18    9     10    11    11    25    15    19    30    12    15    16    17
13    24    14    10    11    15    12    12    12    13    13    30    18    9     10    11    11    25    15    19    30    15    15    17    17
13    24    14    10    11    15    12    12    12    13    13    29    18    9     10    11    11    25    15    19    30    13    15    17    17
13    24    14    10    11    15    12    12    12    13    13    30    18    9     10    11    11    25    15    19    30    15    15    17    17
13    24    14    10    11    15    12    12    12    13    13    30    18    9     10    11    11    25    15    19    30    15    15    17    17
13    24    14    10    11    15    12    12    12    13    13    30    18    9     10    11    11    25    15    19    30    15    15    17    17
13    24    14    10    11    15    12    12    12    13    13    30    18    9     10    11    11    25    15    19    30    15    15    17    17
13    24    14    10    11    15    12    12    12    13    13    29    18    9     10    11    11    25    15    19    30    13    15    17    17
13    24    14    10    11    15    12    12    12    13    13    30    18    9     10    11    11    25    15    19    30    15    15    17    17
13    24    14    10    11    15    12    12    12    13    13    30    18    9     10    11    11    25    15    19    30    15    15    17    17
13    24    14    10    11    15    12    12    12    13    13    29    18    9     10    11    11    25    15    19    30    13    15    17    17
13    24    14    10    11    15    12    12    12    13    13    29    18    9     10    11    11    25    15    19    30    15    15    17    17
13    24    14    10    11    15    12    12    12    13    13    29    18    9     10    11    11    25    15    19    30    15    15    17    17
13    24    14    10    11    15    12    12    12    13    13    29    18    9     10    11    11    25    15    19    30    13    15    16    17
13    24    14    10    11    15    12    12    12    13    13    29    18    9     10    11    11    25    15    19    30    13    15    17    17
13    24    14    10    11    15    12    12    12    13    13    30    18    9     10    11    11    25    15    19    30    15    15    17    17
13    24    14    10    11    15    12    12    12    13    13    29    18    9     10    11    11    25    15    19    30    13    15    17    17

DYS389 Null

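[Figure: DYS389 repeat blocks numbered 1-5, 1-11, 1-3 and 1-10, annotated “13? DYS389I” and “28? DYS389II”]

[Figure: looping constellation of the repeat blocks, annotated “5? 10? 3? 10?”, followed by a recombination that leaves blocks of 5 and 10 repeats]

DYS425 Null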

DYS425 = DYF371 T-type allele
The T-type SNP can get lost by a recLOH.
This is seen as a “NULL-Allele” if only DYS425 is tested.

[Figure: map of the Y-chromosome palindromes P1 to P5 and P8 with the positions of DYS413, DYS390, DYF371 (T-type, C-type), YCAII, DYF395, DYF408, DYF411, DYS385a*, DYS385b*, DYS461, DYS452, DYS485, DYS392, DYF397, DYS448, DYF399, DYS464, DYS725, DYF401, DYF387, DYS459, DYF385 and DYS724]

DYS425 Null / DYF371X

DYS425 12 DYS425 Null DYS425 Null

The HUGO sequence also has a Null allele at DYS425:

10c-10c-13c-14c

Normally in R1b (and most other haplogroups):

10c-12t-13c-14c
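The relation between the two notations can be sketched as follows, assuming (as stated on the earlier slide) that the DYS425 assay only reports the T-type copy of DYF371. The function name is hypothetical and the parser only handles the “count plus c/t suffix” notation used here.

```python
def dys425_from_dyf371(dyf371):
    """Return the repeat count of the T-type copy of DYF371,
    or None if no T-type copy is present (DYS425 appears as Null)."""
    t_type = [allele for allele in dyf371.split("-") if allele.endswith("t")]
    if not t_type:
        return None
    return int(t_type[0][:-1])


print(dys425_from_dyf371("10c-12t-13c-14c"))  # usual pattern in R1b
print(dys425_from_dyf371("10c-10c-13c-14c"))  # pattern of the HUGO sequence
```

With the usual pattern the single T-type copy gives DYS425 = 12; once that copy has been converted to C-type, no T-type copy remains and a DYS425-only test reports a Null.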

Multi Marker Deletion


Marker    Allele  Region  Start     Stop
DYS393    13      ChrY    3191128   3191246
DYS19     16      ChrY    10131934  10132128
DYS391    10      ChrY    12612758  12613044
DYS437    14      ChrY    12976972  12977163
DYS439    11      ChrY    13025167  13025418
DYS389I   14      ChrY    13122100  13122515
DYS389II  32      ChrY    13122100  13122515
DYS388    13      ChrY    13256856  13257013
DYS438    10      ChrY    13447189  13447409
DYS390    0       ChrY    15784268  15784613
DYS426    0       ChrY    17644207  17644303
DYS385b   0       ChrY    19260844  19261212
DYS385a   0       ChrY    19301724  19302104
DYS392    0       ChrY    21043146  21043399

Possible P1/P5 deletion in the palindromic region
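A quick way to see that the missing markers form one block is to sort them by position and look at the run of zero alleles. The sketch below uses the values from the table; the function name and the returned field names are hypothetical, and the computed length is only the minimum extent implied by the tested markers, assuming all coordinates refer to the same reference sequence.

```python
# (marker, allele, start, stop) as listed in the table above
markers = [
    ("DYS393", 13, 3191128, 3191246), ("DYS19", 16, 10131934, 10132128),
    ("DYS391", 10, 12612758, 12613044), ("DYS437", 14, 12976972, 12977163),
    ("DYS439", 11, 13025167, 13025418), ("DYS389I", 14, 13122100, 13122515),
    ("DYS389II", 32, 13122100, 13122515), ("DYS388", 13, 13256856, 13257013),
    ("DYS438", 10, 13447189, 13447409), ("DYS390", 0, 15784268, 15784613),
    ("DYS426", 0, 17644207, 17644303), ("DYS385b", 0, 19260844, 19261212),
    ("DYS385a", 0, 19301724, 19302104), ("DYS392", 0, 21043146, 21043399),
]


def summarize_nulls(rows):
    """Describe the run of Null markers (allele 0) along the chromosome."""
    ordered = sorted(rows, key=lambda r: r[2])
    idx = [i for i, r in enumerate(ordered) if r[1] == 0]
    if not idx:
        return None
    first, last = idx[0], idx[-1]
    return {
        "null_markers": [ordered[i][0] for i in idx],
        "contiguous": idx == list(range(first, last + 1)),
        "min_start": ordered[first][2],
        "min_stop": ordered[last][3],
        "min_length_bp": ordered[last][3] - ordered[first][2] + 1,
    }


summary = summarize_nulls(markers)
print(summary)
```

All five Null markers are adjacent in chromosomal order, which fits a single large deletion better than five independent primer-site mutations.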

GRC Lab Pics

Astrid Krahn (mt hg J)

Dr. Connie Bormans (mt hg I)

Jory Clark (Y hg T)

Brent Maning (Y hg R-U106*)

Dr. Arjan Bormans (Y hg R-L2*)

...and our other lab coworkers

Thanks for listening!