Forensic DNA Phenotyping for Degraded Samples

Ronak Hassani Nejad, Steen Harsten, and Jason Moore
British Columbia Institute of Technology

Abstract

Forensic DNA phenotyping (FDP) is a novel technique that can assist an investigation by predicting a person of interest's externally visible characteristics (EVCs), such as eye, hair, and skin colour (Walsh, 2013). Several DNA markers with predictive ability for these EVCs are recognized, and various commercial and non-commercial tools have been developed for practical use in forensic laboratories.

This study investigates the compatibility of Verogen's ForenSeq Universal Analysis Software (UAS) with the HIrisPlex Webtool. It also tests the HIrisPlex Webtool's reliability in generating accurate phenotyping predictions for degraded samples, through a series of tests in which we intentionally created degraded samples by deleting SNP data from the DNA profile.

Introduction

Single Nucleotide Polymorphisms (SNPs) are DNA sequence variations at a single nucleotide that differ between individuals in the population. These differences can be analyzed in FDP to help predict EVCs such as hair and eye colour.

Figure 1: Single Nucleotide Polymorphisms (http://standardofcare.com/)

Recently, Verogen developed the next-generation sequencing ForenSeq DNA Signature Prep kit, which:

• Incorporates 24 HIrisPlex SNP markers
• Uses Verogen's own algorithm and statistical model to predict hair and eye colour with the ForenSeq Universal Analysis Software (UAS)
• Will not generate a phenotyping estimation for degraded samples

HIrisPlex is a free online webtool which:

• Is capable of simultaneously generating eye and hair colour predictions from input SNP data
• Is capable of working with degraded samples
• Is built on a larger dataset
• Has been validated numerous times

In the validation studies, eye colour predictions reached an accuracy rate of almost 94% (Walsh et al. 2013).

Figure 2: HIrisPlex Webtool (https://hirisplex.erasmusmc.nl/)

Research Questions

1. How accurate or different are the results generated by the two software tools when 100% of the data is available?
2. If the UAS is unable to generate a prediction for a degraded sample, will HIrisPlex be a suitable replacement for accurate phenotype predictions?
3. At what percentage of SNP absence in the DNA profile can HIrisPlex still generate accurate results?

Methodology

HIrisPlex SNP data for 30 samples provided by Verogen were used in this study. The steps taken were:

1. Phenotyping predictions were generated by the UAS for all 30 samples.
2. SNP data was converted into the HIrisPlex webtool input file format and run with 100% of the data present.
3. The average difference, minimum, and maximum were calculated to compare the UAS and HIrisPlex webtool predictions for full SNP profiles.
4. Hypothetical degraded samples were generated by deleting SNP data from the input file.
5. SNPs were chosen for deletion based on their number of reads present and their importance for making the predictions.
6. Several combinations of available and unavailable SNPs were tested.
7. Nine tests were run across five degradation levels, with 90%, 70%, 50%, 30%, and 16% of the SNP data present.
8. The average difference of the results of each degradation test was compared to the results obtained when 100% of the data was available.
9. A threshold of 0.7 or higher was applied when determining the final eye colour predictions.
10. For hair colour predictions, the highest p-value among the 4 colour categories, together with the shade, was used in the final determination of hair colour.

Results

For eye colour, the majority of samples had a p-value above 0.7 in all of the tests, even with only 16% of the data present (22 samples out of 30) (Figure 3).

Figure 3: Number of Samples with P-Values > 0.7 for each test.

The highest eye colour change percentage observed between the tests was 3.33% (Figure 5). It was largely driven by sample 17, which did not meet the threshold of 0.7 in test 1 (Figure 4).

Figure 4: Change in eye colour prediction for Sample 17.

P-value shifts and changes from one predicted colour to another were observed more often in the hair colour results than in the eye colour results (Figure 5). Once the proportion of SNPs present in the samples dropped from 90% to 70%, more than 10% of the samples changed from one top hair colour prediction to a different one (Figure 5).

Figure 5: Percentage of Colour Changes

This change in prediction can be seen for sample 13, which had black hair as the top prediction (p-value above 0.7) with the ForenSeq UAS software and brown hair as the secondary prediction. This order held for the 90% and 70% tests and then switched, with brown hair becoming the top prediction for the 50% and 30% tests, although not above 0.7 (Figure 6).

Figure 6: Change in hair colour prediction for Sample 13.

As illustrated in Figures 7 and 8, the average difference between the p-values of each degradation test and those obtained with 100% of the data available increased steadily. As degradation increases, so does the average difference, and the software therefore becomes less reliable.

For the brown and blond hair colour categories, the p-values fluctuate noticeably between the tests (Figure 8). This fluctuation indicates that some individual SNPs have more predictive power than others. Eye colour predictions fluctuated less (Figure 7) than hair colour predictions. These results are consistent with previous literature reporting that eye colour can be estimated with a prediction accuracy of 0.899 when only the HERC2 (rs12913832) SNP is present in the input data (Kayser 2015).

Figure 7: Average difference of p-values of eye colour results

Figure 8: Average difference of p-values of hair colour results

Conclusions

• For eye colour, 22 out of 30 samples had p-values above the 0.7 threshold with only 16% of the data available, and the colour change was low. Therefore, HIrisPlex is a reliable tool for predicting eye colour in severely degraded samples.
• Hair colour estimation depended on the combination of SNPs available in the data.
• Hair colour prediction was concluded to be less reliable than eye colour prediction, as it depends on which SNP markers are available and not on the percentage of degradation.
• After test 3, the average colour change is above 10% (Figure 5). This could result in a colour prediction that differs from the one obtained when 100% of the data is available. Therefore, HIrisPlex becomes less reliable in predicting hair colour past the 70% degradation level.
• The HIrisPlex Webtool can be used to generate phenotyping estimations for degraded samples when the UAS software cannot be used, provided that proper precautions and considerations are taken.

References

Kayser, M. (2015). Forensic DNA Phenotyping: Predicting human appearance from crime scene material for investigative purposes. Forensic Science International: Genetics, 18, 33-48. doi:10.1016/j.fsigen.2015.02.003. PMID: 25716572.
Walsh, S. (2013). DNA Phenotyping: The prediction of human pigmentation traits from genetic data.
Walsh et al. (2013). The HIrisPlex system for simultaneous prediction of hair and eye colour from DNA. Forensic Science International: Genetics, 7(1), 98-115. doi:10.1016/j.fsigen.2012.07.005
Walsh et al. (2014). Developmental validation of the HIrisPlex system: DNA-based eye and hair colour prediction for forensic and anthropological usage. Forensic Science International: Genetics, 9, 150-161. doi:10.1016/j.fsigen.2013.12.006
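The degradation tests and comparison metrics described in the Methodology can be illustrated with a minimal Python sketch. It is not the code used in the study and it does not call the HIrisPlex Webtool or the UAS. All function names, the `None` marker for a deleted SNP, the read counts, and the example probabilities are hypothetical. Deleting the lowest-read SNPs first is a simplifying assumption: the study also weighed each SNP's importance for prediction, which is not modelled here. Only the 0.7 threshold, the use of the highest p-value, and the sample count of 30 come from the poster.

```python
from statistics import mean

EYE_THRESHOLD = 0.7  # threshold applied to eye colour p-values (Methodology, step 9)


def degrade_profile(profile, read_counts, keep_fraction):
    """Return a copy of `profile` in which only the best-covered SNPs survive.

    Assumption: SNPs with the fewest reads are deleted first, and a deleted
    SNP is marked with None.
    """
    n_keep = round(len(profile) * keep_fraction)
    ranked = sorted(profile, key=lambda snp: read_counts[snp], reverse=True)
    kept = set(ranked[:n_keep])
    return {snp: (value if snp in kept else None) for snp, value in profile.items()}


def call_eye_colour(probs, threshold=EYE_THRESHOLD):
    """Return the top eye colour if its p-value meets the threshold, else None."""
    colour, p = max(probs.items(), key=lambda item: item[1])
    return colour if p >= threshold else None


def top_colour(probs):
    """Return the category with the highest p-value (used for hair colour)."""
    return max(probs, key=probs.get)


def average_difference(full, degraded):
    """Mean absolute difference in p-values across all colour categories."""
    return mean(abs(full[c] - degraded[c]) for c in full)


def percent_colour_change(full_calls, degraded_calls):
    """Percentage of samples whose call differs from the 100%-data call."""
    changed = sum(1 for f, d in zip(full_calls, degraded_calls) if f != d)
    return 100.0 * changed / len(full_calls)


# Invented illustrative values, not study data.
profile = {"rs12913832": 2, "snp_b": 1, "snp_c": 0, "snp_d": 1}
reads = {"rs12913832": 900, "snp_b": 40, "snp_c": 350, "snp_d": 15}
print(degrade_profile(profile, reads, keep_fraction=0.5))

full = {"blue": 0.90, "intermediate": 0.04, "brown": 0.06}
degraded = {"blue": 0.65, "intermediate": 0.10, "brown": 0.25}
print(call_eye_colour(full), call_eye_colour(degraded))

# One changed call among 30 samples.
full_calls = ["blue"] * 30
degraded_calls = ["blue"] * 29 + [None]
print(round(percent_colour_change(full_calls, degraded_calls), 2))  # 3.33
```

With 30 samples, a single changed call corresponds to 3.33%, which is consistent with the highest eye colour change reported in the Results.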