Lumping and Splitting: Defining the disease entity for curations

Courtney Thaxton, Ph.D., on behalf of the ClinGen Lumping and Splitting Working Group
ClinGen Virtual Retreat 2021
June 24, 2021

Participate in survey with URL: http://etc.ch/pExA

Predicament(s) encountered during curation

When to lump, or when to split?

• Given that a gene can be associated with several phenotypes and/or disease entities, how can a curator assess if the condition they are curating is the most appropriate?

• If all curations are split for each phenotypic feature reported in the literature, we risk an endless series of curations on ClinGen that might not reflect the true nature of each condition or association, nor the mutational spectrum of the gene.

• If we lump all of the existing phenotypic features asserted for a gene into a greater known syndrome, we risk losing the intricacies of each condition and/or phenotype (phenotypic feature).

Lumping and Splitting Working Group members

Chair:
• Courtney Thaxton, Ph.D.

Clinicians/Clinical molecular geneticists:
• Jonathan Berg, M.D., Ph.D.
  • ClinGen PI
• Heidi Rehm, Ph.D.
  • ClinGen PI

Nosology and Ontology expert(s):
• Ada Hamosh, M.D., M.P.H.
  • Scientific Director of OMIM
• Melissa Haendel, Ph.D.
  • PI Monarch Initiative (MonDO)

Biocurators:
• Jenny Goldstein, Ph.D., CGC
• Marina DiStefano, Ph.D.
• Dane Witmer, Ph.D.

Overarching principle:

Genes associated with a single published disease entity should only be curated for that condition (i.e. lumped) unless there are indications to split specific phenotypic features of a syndrome or variable phenotype into separate curations based on the guidance provided by our criteria.

Lumping and Splitting Criteria

• Assertion/Defining the disease entity:
  • What have the literature and other nosological authorities reported about a gene's association with disease?
• Molecular Mechanism:
  • Are there differences in the molecular mechanism(s) underlying each asserted disease (and/or condition)?
• Phenotypic Variability:
  • Does a phenotype(s) segregate consistently within a pedigree, or present similarly in two or more unrelated probands with the same gene variant?
  • Or, is there variability of phenotype between family members and/or unrelated probands harboring the same gene variant?
• Inheritance Pattern:
  • Are there differences in inheritance patterns between the disease entities associated with the gene of interest?
  • Do they represent distinct disease entities, or a continuum of disease?


How does one name a newly “lumped” disease entity?

• Different types of naming schema:
  • Eponymous:
    • Charcot-Marie-Tooth disease
  • Phenotypic:
    • Includes Latin-based names (diabetes mellitus, retinitis pigmentosa) and clinically based names (limb girdle muscular dystrophy, amyotrophic lateral sclerosis).
  • Biochemical:
    • Phenylketonuria, Vitamin D Deficiency, RASopathy
  • Acronym:
    • MELAS, CHARGE
  • Gene-based:
    • PTEN hamartoma syndrome

Defining a disease entity for CAV3 (Caveolin-3)

Assertion(s) and Inheritance Pattern

Molecular Mechanism

Per OMIM.org 2019

Defining a disease entity for CAV3 (Caveolin-3)

Phenotypic Variability

Fee et al., 2004

Defining a disease entity for CAV3 (Caveolin-3)

• The following disease entities associated with CAV3 were lumped:
  • HCM: hypertrophic cardiomyopathy
  • hCPK: hyper creatine kinase
  • RMD: rippling muscle disease
  • LGMD: limb girdle muscular dystrophy
  • DM, TT: distal myopathy, Tateyama type

• At this time, LQTS (long QT syndrome) is not included as part of this lumping and will remain split pending further investigation.

How to name the lumped CAV3-associated diseases?

• Options per the current outlined schema:
  • Eponymous:
    • Minetti-McNally-Hermann-Carbone-Hayashi dystrophy?
  • Phenotypic:
    • Due to the numerous phenotypes, it may be difficult to produce an accurate nomenclature.
    • Muscular dystrophy with hypercreatine phosphokinase, with or without limb girdle muscular dystrophy, with or without Tateyama-type distal myopathy, with or without rippling muscle (?)
  • Biochemical:
    • Caveolinopathy
    • This is the current proposed naming scheme, but does this nomenclature suggest that variation in all caveolins would produce the same set of phenotypes?
  • Acronym:
    • LiRCHeD
    • (Limb girdle muscular dystrophy, Rippling muscle, Cardiomyopathy, Hyper CKemia, Distal myopathy)?
    • Is this an acceptable acronym?
  • Genotype-phenotype (dyadic-based):
    • CAV3-related muscular dystrophy
    • Incorporates the specific gene and the muscle and cardiac phenotypes that emerge from variation in the gene.


Future Aims:

• ClinGen has formed a Disease Naming Committee (a subset of the LSWG). It is tasked with developing guidance for EPs when questions of disease nomenclature arise due to lumping (or splitting).
  • This guidance will be reviewed by key representatives of the CDWGs/GCEPs for feedback.
• Continue collaboration with OMIM and the Monarch Initiative, nosological and ontological authorities, respectively, on how we can harmonize efforts on disease nomenclature.
  • This helps to establish consistency amongst groups and to provide up-to-date information that may directly affect the nosology and/or ontology of disease(s).


Thank you!