Personal and Ethical Issues

02-223 How to Analyze Your Own Declining Cost of Genome

• Genome sequencing is expected to happen routinely in the near future

Schadt, MSB, 2012

The era of big data: genome data are already being collected on a large scale and mined for scientific discovery, to drive more accurate descriptive and predictive models that inform decision making for the best diagnosis and treatment choice for a given patient.

Would you post your genome on the web?

Genomes and Privacy

• DNA sequence data contain information that can be used to uniquely identify an individual (i.e., genome sequences are like fingerprints)

• Balancing the need for scientific study and privacy

Genomes and Privacy

• Privacy concerns
  – Genome sequence data and other related types of data (gene expression, clinical records, epigenetic data, etc.) are collected for a large number of patients for medical research
  – Most types of data are freely available through the internet, except for genotype data
    • NCBI GEO database for gene expression data
    • The Cancer Genome Atlas data portals
  – Genotype data are available to scientists through restricted access
  – Participants' privacy is protected through informed consent

https://tcga-data.nci.nih.gov/tcga/tcgaHome2.jsp

The Cancer Genome Atlas (TCGA) Data

https://tcga-data.nci.nih.gov/tcga/tcgaCancerDetails.jsp?diseaseType=LAML&diseaseName=Acute%20Myeloid%20Leukemia

Access Control for TCGA Data

• Open-access data tier
  – De-identified clinical and demographic data
  – Gene expression data
  – Copy-number alterations in regions of the genome
  – Epigenetic data
  – Summaries of data, such as genotype frequencies, compiled across individuals

• Controlled-access data tier
  – Individual germline variant data
  – DNA sequence data
  – Access to the data must be requested through NIH (database of Genotypes and Phenotypes, dbGaP)

Informed Consent for Scientific Research

• Standard practice for enrolling human subjects in a research study
  – Fully informing potential participants about all aspects of a study, including the aims of the study, risks, benefits, costs, and protection of personal privacy
  – The origins of modern-day informed consent for medical research can be traced to the Nuremberg Code of 1947, an effort to protect participants in research studies (Homan, 1991).

Nuremberg Code

• Research ethics principles for human experimentation

• Established after the Nuremberg Trials at the end of the Second World War

http://www.hhs.gov/ohrp/archive/nurcode.html

Nuremberg Code

• On August 19, 1947, the judges of the American military tribunal in the case of the USA vs. Karl Brandt et al. delivered their verdict. Before announcing the guilt or innocence of each defendant, they confronted the difficult question of medical experimentation on human beings. Several German doctors had argued in their own defense that their experiments differed little from previous American or German ones. Furthermore they showed that no international law or informal statement differentiated between legal and illegal human experimentation. This argument worried Drs. Andrew Ivy and Leo Alexander, American doctors who had worked with the prosecution during the trial. On April 17, 1947, Dr. Alexander submitted a memorandum to the United States Counsel for War Crimes which outlined six points defining legitimate research. The verdict of August 19 reiterated almost all of these points in a section entitled "Permissible Medical Experiments" and revised the original six points into ten. Subsequently, the ten points became known as the "Nuremberg Code." Although the code addressed the defense arguments in general, remarkably none of the specific findings against Brandt and his codefendants mentioned the code. Thus the legal force of the document was not well established. The uncertain use of the code continued in the half century following the trial when it informed numerous international ethics statements but failed to find a place in either the American or German national law codes. Nevertheless, it remains a landmark document on medical ethics and one of the most lasting products of the "Doctors Trial."

http://www.ushmm.org/information/exhibitions/online-features/special-focus/doctors-trial/nuremberg-code

Institutional Review Board (IRB)

• A committee that has been formally designated to approve, monitor, and review biomedical and behavioral research involving humans

• Title 45 Code of Federal Regulations Part 46
  – http://www.hhs.gov/ohrp/humansubjects/guidance/45cfr46.html

Current Generation Informed Consents

• Focused on a single study

• Top-down, unidirectional researcher-participant (research subject) relationship

• Protecting the participant is considered among the chief aims

• Data generation on study participants is usually an integral part of the consent

• Data ownership and terms of use are driven by the investigator and/or hosting institution

• Study participants are counseled to ensure they understand all aspects of the study, although no evidence of understanding is sought or required

• In most cases, anonymity, privacy, and confidentiality are guaranteed as a key condition for a participant's consent

• Big data and a more open data-sharing mentality demand a new generation of informed consents

Genomes and Privacy

• How much should we be concerned about the privacy issues regarding personal genome data?

• Non-genetic data can be used to predict the genotypes of individuals (Bayesian method to predict individual SNP genotypes from gene expression data; Schadt et al., Nature Genetics, 2012)
  – Treats gene expression as the non-genetic data and predicts genotypes from the expression levels

Predicting Genotypes with Non-Genetic Data (Schadt et al., 2012)

• Study design
  – Learn a predictive model for predicting genotypes given gene expression data from a training set
  – Use the learned predictive model to test whether genotypes can be predicted correctly given gene expression from a test set

• Two datasets from non-overlapping groups of individuals
  – The human liver cohort (HLC): liver gene expression and genotype data for 378 European-American individuals
  – Roux-en-Y gastric bypass cohort (RYGB): genotype data and expression data for liver and adipose tissue from 580 European-American subjects undergoing Roux-en-Y gastric bypass

• Learn the model from HLC data (training set) and predict RYGB genotypes given RYGB expression data (test set)

Predicting Genotypes from Gene Expressions
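The study design above (train on one cohort, then predict genotypes in a second, non-overlapping cohort) can be sketched in Python. This is only an illustration under loud assumptions: the data are synthetic, a simple nearest-centroid rule stands in for the Bayesian method of Schadt et al., and all function names (simulate_cohort, fit_centroids, predict_genotypes, match_individuals) are hypothetical. The final step compares each predicted profile with the observed genotype profiles to check whether an individual can be picked out.

```python
import numpy as np

def simulate_cohort(n_individuals, n_snps, effect, rng):
    """Synthetic cohort (assumption): genotypes coded 0/1/2, one expression trait per SNP."""
    genotypes = rng.integers(0, 3, size=(n_individuals, n_snps))
    noise = rng.normal(0.0, 1.0, size=(n_individuals, n_snps))
    expression = effect * genotypes + noise
    return genotypes, expression

def fit_centroids(genotypes, expression):
    """Per SNP, the mean training expression observed for each genotype class."""
    n_snps = genotypes.shape[1]
    centroids = np.zeros((3, n_snps))
    for g in range(3):
        mask = genotypes == g
        counts = mask.sum(axis=0)
        sums = np.where(mask, expression, 0.0).sum(axis=0)
        centroids[g] = sums / np.maximum(counts, 1)  # guard against empty classes
    return centroids

def predict_genotypes(centroids, expression):
    """Assign each (individual, SNP) the genotype whose centroid is nearest."""
    # distances has shape (3, n_individuals, n_snps)
    distances = np.abs(expression[None, :, :] - centroids[:, None, :])
    return distances.argmin(axis=0)

def match_individuals(predicted, observed):
    """For each predicted profile, the index of the most similar observed profile."""
    # agreement[i, j] = fraction of SNPs where prediction i equals observed profile j
    agreement = (predicted[:, None, :] == observed[None, :, :]).mean(axis=2)
    return agreement.argmax(axis=1)

rng = np.random.default_rng(0)
# Stand-ins for the training cohort and the non-overlapping test cohort.
train_geno, train_expr = simulate_cohort(300, 200, effect=1.5, rng=rng)
test_geno, test_expr = simulate_cohort(100, 200, effect=1.5, rng=rng)

centroids = fit_centroids(train_geno, train_expr)
predicted = predict_genotypes(centroids, test_expr)

genotype_accuracy = (predicted == test_geno).mean()
matches = match_individuals(predicted, test_geno)
match_rate = (matches == np.arange(len(test_geno))).mean()
print(f"per-SNP genotype accuracy: {genotype_accuracy:.2f}")
print(f"individuals matched to their own genotype profile: {match_rate:.2f}")
```

In this toy setting the per-SNP predictions are noisy, but agreement aggregated over many SNPs is typically enough to single out individuals, which is the intuition behind the privacy concern. The printed numbers describe the synthetic data only and say nothing about the accuracy reported in the paper.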

• Left semicircle: observed genotypes

• Right semicircle: predicted genotypes

• Blue line: correctly matched individuals

• White line: incorrectly matched individuals

• Overall, we can resolve 99% of the identities of individuals

Personal Genome Project (www.personalgenomes.org)

• Volunteers from the general public working together with researchers to advance personal genomics

• Aims to sequence genomes of 100,000 individuals from the general public

• Volunteers should be willing to make their genetic and trait information publicly available

The Evolving Informed Consent for Scientific Research I

• Open consents for public resources: the Personal Genome Project (PGP) Consent (Church, 2005; Lunshof et al., 2008)

• Differs from classic informed consent in the following ways
  – Data ownership and terms of use are no longer driven by the study investigator
  – Data are published to the web and made available without restriction
  – Single-study focused, but with a broad and open-ended scope (data sharing as an aim)
  – Participants agree to reciprocal interaction with researchers
  – Participants must pass an exam to ensure they possess basic genetic literacy, are informed about the public nature of the study, understand the possibility of re-identification, and understand that some risks are unknown and unpredictable

The Evolving Informed Consent for Scientific Research II

• Interoperable and open consents: the Portable Legal Consent (PLC) (http://weconsent.us/)

• Based upon the PGP consent, but altered in the following important ways
  – The PLC can be used across any number of studies
  – If a variation of the PLC form guarantees the same freedoms and creates no more than the same obligations, it can be certified as interoperable across the PLC network
  – Fully digital; requires no input from a physician or other health/research professional
  – Requires users to sign the terms of a contract to ensure compliance with data use terms
  – Intended for data already generated, to enable open access to data across many studies

Other Issues in Scientific Research

• Open personal data environment

• Greater participation of informed patients

• Protecting individuals from discrimination
  – Genetic Information Nondiscrimination Act (2008)
    • Law protecting individuals from discrimination based on their genetic information in health insurance and employment

Other Social/Ethical Issues in Personal Genomes

• Consumer genomics services
  – , deCODE genetics, Navigenics
  – Personal genomic services are offered more widely in the private sector than by clinicians
  – Commercial genomic services may displace clinicians as the primary provider of health-related genetic information
  – Individuals may assume more responsibility for health-promoting behavior

Other Social/Ethical Issues in Personal Genomes

• P4 medicine (http://p4mi.org)
  – Predictive, preventive, personalized, and participatory medicine
  – Applies to personalized disease prevention and maintenance of health

Summary

• Ethical/social/legal issues in personal genomes
  – Protecting the privacy of genetic information while enabling scientific research
  – Protecting individuals from discrimination based on genetic information
  – Empowering individuals by keeping them informed of the various issues involved in personal genomes