ORIGINAL PAPERS

Whole exome sequencing of patients with diffuse idiopathic skeletal hyperostosis and calcium pyrophosphate crystal chondrocalcinosis

Parreira B 1,2 , Couto AR 1,2 , Rocha F 1,2 , Sousa M 1,2 , Faustino V 1, Power DM 3, Bruges-Armas J 1,2

ACTA REUMATOL PORT. 2020;45:116-126

ABSTRACT

Objectives: DISH/CC is a poorly understood phenotype characterised by peripheral and axial enthesopathic calcifications, frequently fulfilling the radiological criteria for Diffuse Idiopathic Skeletal Hyperostosis (DISH, MIM 106400), and in some cases associated with Calcium Pyrophosphate Dihydrate (CPPD) Chondrocalcinosis (CC). The concurrence of DISH and CC suggests a shared pathogenic mechanism. To identify genetic variants conferring susceptibility, we performed whole exome sequencing in four patients showing this phenotype.

Materials and methods: Exome data were filtered in order to find a variant or a group of variants that could be associated with the DISH/CC phenotype. Variants of interest were subsequently confirmed by Sanger sequencing. Selected variants were screened in a cohort of 65 DISH/CC patients vs 118 controls from the Azores. The statistical analysis was performed using PLINK V1.07.

Results: We identified 21 genetic variants in 17 genes that were directly or indirectly related to mineralization; several are predicted to have a strong effect at the protein level. Phylogenetic analysis of the altered amino acids indicates that these are either highly conserved in vertebrates or conserved in mammals. In case-control analyses, variant rs34473884 in PPP2R2D was significantly associated with the DISH/CC phenotype (p=0.028; OR=1.789, 95% CI=1.060-3.021).

Conclusion: The results of the present and preceding studies with the DISH/CC families suggest that the phenotype has a polygenic basis. The PPP2R2D gene could be involved in this phenotype in an as yet unknown way.

Keywords: Rheumatic and musculoskeletal diseases; Genetic association; Rheumatology.

1. Serviço Especializado de Epidemiologia e Biologia Molecular (SEEBMO) / Hospital de Santo Espírito da Ilha Terceira (HSEIT)
2. Comprehensive Health Research Center, CHRC, Lisbon, Portugal
3. Centre of Marine Science (CCMAR) / University of Algarve

ÓRGÃO OFICIAL DA SOCIEDADE PORTUGUESA DE REUMATOLOGIA

INTRODUCTION

Previous studies undertaken by our group identified and characterized twelve families affected with Diffuse Idiopathic Skeletal Hyperostosis (DISH, MIM 106400) and/or Calcium Pyrophosphate Dihydrate (CPPD) Chondrocalcinosis (CC), hereafter designated DISH/CC. DISH/CC is a poorly understood phenotype characterised by peripheral and axial enthesopathic calcifications, frequently fulfilling the radiological criteria for DISH, and in some cases associated with CPPD Chondrocalcinosis. A common pathogenic mechanism, shared by the two conditions, has been suggested [1].

DISH is a common skeletal disorder characterized by progressive calcification and ossification of ligaments and entheses [2-3]. The exact prevalence and incidence of DISH are unknown; however, it is well known that it is more frequent in males and that its prevalence increases rapidly with age, mainly affecting subjects over the age of 40 [4]. The prevalence of DISH in patients over 50 years of age is 25% in males and 15% in females [5], and this disease is now becoming a serious problem in aging societies. Several lines of evidence suggest that genetic factors might play a part in the etiology of DISH, such as the existence of familial cases with early onset (in the third decade of life) [6] and the higher frequency of DISH in boxer dogs relative to other dog breeds [7-8]. So far, however, no single gene has been conclusively associated with the disease.

Chondrocalcinosis is characterized by deposition of crystals of calcium pyrophosphate dihydrate (CPPD) in articular hyaline and fibro-cartilage [9]. At present two genes, ANKH (CCAL2) and TNFRSF11B, are known to be involved in the development of Chondrocalcinosis [10-11]. Previously, we performed genetic studies of DISH/CC using a “Whole genome wide linkage study” and an “Identity by State/Identity by Descent” association study, and two genes were identified as good candidate genes for DISH/CC: RSPO4 on chromosome 20 and LEMD3 on chromosome 12. Several RSPO4 gene variants were identified, and nucleotide modifications located in the regulatory region were more frequent in control individuals than in the DISH/CC group [12].

Even though the genetic basis of DISH/CC is unknown, it is considered to be a bone forming disease, and genes related to calcification and ossification are considered good candidates. In the present study, we took advantage of whole exome sequencing (WES) so that, even in the absence of sufficient pedigrees and samples for traditional linkage approaches, we could identify rare protein-coding variants of the kind that cause the majority of monogenic diseases [13-14]. We performed targeted exome sequencing on four DISH/CC patients with an apparently autosomal dominant DISH/CC phenotype, in order to capture rare and pathogenic variants expected to have potentially damaging effects on protein function that lead to modified calcification and/or ossification. To our knowledge this is the first report of WES analysis in patients affected with DISH.

MATERIALS AND METHODS

This study was approved by the “Comissão de Ética do Hospital de Santo Espírito da Ilha Terceira”. All methods were performed in accordance with the approved guidelines, and informed consent was obtained from each subject before conducting the experiments.

EXOME CAPTURE
The selection of patients was made after ruling out mutations in ANKH and secondary causes of Chondrocalcinosis. DNA was extracted from peripheral blood samples using a standard salting out procedure. Samples were resequenced using an ABI-SOLiD platform and Agilent’s SureSelect Target Enrichment System for 38 Mb, by “Sistemas Genómicos, S.L.” in Valencia, Spain. After standard DNA quality control, SOLiD Fragment libraries were prepared and enriched with SureSelect All Human Enrichment Target Exon. The quality and quantity of the libraries were assessed by analysis with an Agilent 2100 Bioanalyzer and Qubit. Each library went through a process of emulsion PCR for clonal amplification of the fragments, followed by an enrichment process and chemical modification to allow loading into the reaction chamber. The quality and quantity of the beads obtained for each library were estimated taking into account the parameters given by Work Flow Analysis. Then, ligation sequencing was done to obtain 50nt + 35nt (paired-end) reads using a SOLiD4 sequencer. The data quality was estimated using the parameters provided by the software SETS (SOLiD Experimental Tracking System). Single Nucleotide Variants (SNVs) were classified using Ensembl’s nomenclature and grouped using the following scheme: Known and Novel (Coding, Splicing, Others). The coding variants were divided into nonsynonymous and synonymous. The “Others” included intronic, UTR, regulatory region, intergenic, downstream and upstream variants.

WES FILTERING
The segregation analysis of these families indicated that the most likely model for this disease is an autosomal dominant model [1]. However, given that a previous linkage study did not identify a major dominant locus, the data were analyzed assuming both a recessive and a dominant model.

Genetic variants were identified in 815, 917, 872 and 593 genes under a dominant model, for individuals AZ1 to AZ4. Assuming that the genetic cause is the same for the individuals AZ1 to AZ4, identified genes were then filtered to select common, shared genes (supplementary Table I, available in supplementary file). A group of 52 genes was common to the four patients; candidate genes for testing were selected taking into account their involvement in calcification and/or ossification or related conditions that could be associated with the DISH/CC disease. Variants in the genes common to the four patients were identified, and nonsynonymous, splice site, stop lost/gain and frameshift variants were targeted, in anticipation that synonymous and intronic variants would be far less likely to be relevant in the pathogenicity.

CANDIDATE GENES – SELECTION BY FUNCTION
A group of 20 candidate genes was selected and included genes that are reported to be involved in bone metabolism and/or related conditions. These genes were Alkaline Phosphatase (ALPL/TNAP), the Calcium


Sensing Receptor (CASR), Bone Morphogenetic Protein Receptor Type 1B (BMPR1B), Osteopontin (OPN/SPP1), Integrin Binding Sialoprotein (IBSP), Fibroblast Growth Factor 2 (FGF2), Inorganic Pyrophosphate Transport Regulator (ANKH), Collagen Type XI Alpha 2 (COL11A2), Ectonucleotide Pyrophosphatase 1 (ENPP1), Runt-related Transcription Factor (RUNX2), Dickkopf WNT Signaling Pathway (DKK-1), Insulin Like Growth Factor 1 (IGF1), Matrix Gla Protein (MGP), Vitamin D Receptor (VDR), Bone Morphogenetic Protein 4 (BMP-4), Collagen Type 1 Alpha 1 (COL1A1), Transforming Growth Factor Beta 1 (TGFβ1), Solute Carrier Family 29 Member 1 (SLC29A1), Bone Morphogenetic Protein 2 (BMP-2) and Collagen Type VI Alpha 1 (COL6A1).

TABLE I. Radiological characterization of the patients selected for WES (AZ1-AZ4).

GENETIC VARIANTS – SELECTION BY PATHOGENICITY
In the selection by variant pathogenicity filtering strategy we used two levels: a “variant level” and a “knowledge level” (Figure 1). In the “variant level” we first filtered according to variant type and focused on functional significance (splice sites, frameshift coding, nonsynonymous coding and loss or gain of a stop). Only variants with a SIFT prediction of Deleterious or Unknown and a GERP value equal to or higher than 3 were included. We excluded variants with MAF values higher than 0.05 by filtering the SNP_IDs (rs number or chromosome position) against the Ensembl Variant Effect Predictor tool (http://www.ensembl.org/info/docs/tools/vep/index.html). Then, in the selection by “knowledge”, we focused only on genes associated with calcification and/or ossification or related conditions. Lastly, we evaluated the functional significance and pathogenic potential of each variant found in the selected genes.

EVALUATION OF GENES AND VALIDATION OF VARIANTS
Information about candidate genes was obtained from several databases, including Ensembl (http://www.ensembl.org/index), National Center for Biotechnology Information (NCBI) (http://www.ncbi.nih.gov), Pubmed (https://www.ncbi.nlm.nih.gov/pubmedwe), GeneCards (http://www..org/), Online Mendelian Inheritance in Man (OMIM) (http://www.omim.org/)


and MalaCards (http://www.malacards.org/).

PCR primers were designed using the software Primer3 (http://www.bioinformatics.nl/cgi-bin/primer3plus/primer3plus.cgi) to amplify and validate variants detected by exome sequencing. Sanger sequencing using standard protocols was performed on an automated DNA sequencer ABI 3130xl (Applied Biosystems®). Genetic variants were screened using Sequencing Analysis and SeqScape (Applied Biosystems®) against a reference NCBI sequence.

The functional significance and the potential deleterious effect of each variant were explored using the following databases: Ensembl (http://www.ensembl.org/index), Human Gene Mutation Database (HGMD) (http://www.hgmd.cf.ac.uk/ac/index.php) and dbSNP (https://www.ncbi.nlm.nih.gov/projects/SNP/), making use of PolyPhen-2 (Polymorphism Phenotyping v2) (http://genetics.bwh.harvard.edu/pph2/), SIFT and GERP. PolyPhen scores are classified as benign [0-0.2], possibly damaging [0.2-0.85] and probably damaging [0.85-1]; SIFT scores below 0.05 are classified as deleterious; GERP scores range from -12.3 to 6.17, with 6.17 being the most conserved [15]. The MAF values of each variant were also analyzed. Protein conservation analysis was performed using ClustalW (http://www.genome.jp/tools/clustalw/) to compare homologous amino acid sequences among multiple vertebrates at the sites where the variations occur (accession numbers of transcripts available in supplementary Table II, available in supplementary file).

FIGURE 1. The two-level filter approach used to analyze the WES results from 4 patients with DISH/CC disease. SIFT: Sorting Intolerant From Tolerant, GERP: Genomic Evolutionary Rate Profiling score and MAF: Minor Allele Frequency.

ASSOCIATION STUDIES AND STATISTICAL ANALYSIS
In order to verify a possible association between the genes identified by WES and the DISH/CC phenotype, selected variants were screened in family members (affected with DISH/CC and unaffected), and the most conserved variants and those that were present in three or four WES patients were selected for screening in a group of 65 DISH/CC patients and a group of 118 controls. Association testing was performed by chi-squared analysis using PLINK software (V1.07) [16]; p-values <0.05 were considered significant. ORs with a 95% CI were calculated for the minor alleles of each variant. OR>1 indicates a susceptibility allele and OR<1 indicates a protective allele for the disease.

RESULTS

WES was performed using DNA from four patients from unrelated DISH/CC families (AZ1-AZ4) from the Azores. A detailed description of these families is provided in our previous study [1]. From the patients who gave informed consent, a blood sample was obtained and standard X-rays were taken of the knees, axial skeleton, wrists, hands, elbows and pelvis. The 4 patients were selected because they shared a very evident DISH phenotype. The radiological characterization of the patients selected for WES is shown in Table I.
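The variant-level and knowledge-level filters described in the methods, together with the PolyPhen and SIFT cut-offs quoted above, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Variant` record, its field names, the consequence labels and the choice to keep variants that have no reported MAF are all hypothetical and are not taken from the authors' pipeline. The score ranges in the text share their end points, so the sketch assumes half-open intervals.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set

# Thresholds quoted in the text; every name below is an illustrative assumption.
FUNCTIONAL_TYPES = {"splice_site", "frameshift", "nonsynonymous",
                    "stop_gained", "stop_lost"}
SIFT_KEPT = {"deleterious", "unknown"}
GERP_MIN = 3.0
MAF_MAX = 0.05


@dataclass(frozen=True)
class Variant:
    gene: str
    consequence: str       # one of the labels in FUNCTIONAL_TYPES, or e.g. "synonymous"
    sift: str              # "deleterious", "tolerated" or "unknown"
    gerp: float
    maf: Optional[float]   # None when no population frequency is reported


def passes_variant_level(v: Variant) -> bool:
    """Variant level: functional type, SIFT prediction, GERP >= 3, MAF <= 0.05."""
    if v.consequence not in FUNCTIONAL_TYPES:
        return False
    if v.sift not in SIFT_KEPT or v.gerp < GERP_MIN:
        return False
    # Assumption: variants without a reported MAF are kept.
    return v.maf is None or v.maf <= MAF_MAX


def two_level_filter(variants: Iterable[Variant],
                     candidate_genes: Set[str]) -> List[Variant]:
    """Apply the variant level first, then keep only the candidate genes."""
    return [v for v in variants
            if passes_variant_level(v) and v.gene in candidate_genes]


def polyphen_class(score: float) -> str:
    """PolyPhen categories, treating the quoted ranges as half-open intervals."""
    if score < 0.2:
        return "benign"
    if score < 0.85:
        return "possibly damaging"
    return "probably damaging"


def sift_is_deleterious(score: float) -> bool:
    """SIFT scores below 0.05 are called deleterious."""
    return score < 0.05
```

With the scores later quoted for the PPP2R2D variant (SIFT 0.03, PolyPhen 0.33), `sift_is_deleterious(0.03)` is true while `polyphen_class(0.33)` falls in the "possibly damaging" band.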

TABLE II. Identified variants and available functional information for the four WES patients (AZ1-AZ4).


FIGURE 2. Sequence conservation in six vertebrates of the variants identified in candidate genes possibly associated with the DISH/CC phenotype in humans (some genes are not available in all species). The variant for the FGF2 gene is not presented since it is located in the 3’ untranslated region. Abbreviations: PLCG2: Phospholipase C Gamma 2; ALPL: Alkaline Phosphatase, Liver/Bone/Kidney; CASR: Calcium-Sensing Receptor; COL11A2: Collagen Type XI Alpha 2 Chain; ENPP1: Ectonucleotide Pyrophosphatase/Phosphodiesterase 1; MGP: Matrix Gla Protein; VDR: Vitamin D Receptor; BMP4: Bone Morphogenetic Protein 4; COL1A1: Collagen Type I Alpha 1 Chain; BMP2: Bone Morphogenetic Protein 2; COL6A1: Collagen Type VI Alpha 1 Chain; FLNC: Filamin C; AMER3: APC Membrane Recruitment Protein 3; PPP2R2D: Protein Phosphatase 2 Regulatory Subunit B delta; ABCC6: ATP-binding cassette subfamily C, member 6.
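The conservation calls read from an alignment such as the one in Figure 2 (conserved in all vertebrates versus conserved in mammals only) can be expressed as a small rule. The sketch below is a hedged illustration: the species names, the set of species treated as mammals and the example residues are hypothetical placeholders, since the six vertebrates used are not named in the text. Species for which the gene is not available are represented by `None` and skipped.

```python
from typing import Mapping, Optional

# Hypothetical species panel; the vertebrates actually aligned are not named in the text.
MAMMALS = {"human", "mouse", "cow"}


def conservation_class(residues: Mapping[str, Optional[str]],
                       reference: str = "human") -> str:
    """Classify one aligned position given the residue seen in each species.

    `residues` maps a species name to its amino acid at the variant site,
    or to None when the gene is not available in that species.
    """
    ref = residues[reference]
    present = {sp: aa for sp, aa in residues.items() if aa is not None}
    if all(aa == ref for aa in present.values()):
        return "conserved in vertebrates"
    if all(aa == ref for sp, aa in present.items() if sp in MAMMALS):
        return "conserved in mammals"
    return "not conserved"
```

A position where every available species carries the human residue is reported as conserved in vertebrates; a position where only the mammals agree with the human residue is reported as conserved in mammals.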

FILTERING RESULTS
Approximately 38 Mb of sequence per patient was generated, and the capture specificity and sensitivity in all samples were about 55% and 94%, respectively. The results for each sample after the SNV calling and indel identification steps are summarized in supplementary Table III. After filtering, 21 missense, deletion and splice site variants in 17 genes were obtained: PLCG2, ALPL, CASR, FGF2, COL11A2, ENPP1, MGP, VDR, BMP-4, COL1A1, TGFβ1, BMP-2, COL6A1, FLNC, AMER3, PPP2R2D and ABCC6. Ten of the identified variants were present in the HGMD mutation database, linked to phenotypes other than DISH/CC. The identified variants and available functional information are presented in Table II. Sequence conservation of the missense variants was


TABLE III. ASSOCIATION STUDY BETWEEN SEVEN VARIANTS FROM SEVEN GENES AND THE DISH/CC PHENOTYPE

Chr  Gene     SNP         Allele M/m  MAF DISH/CC (N=65)  MAF Controls (N=118)  OR (95% CI)            p-value
10   PPP2R2D  rs34473884  G/A         0.262               0.165                 1.789 (1.060 - 3.021)  0.028
14   BMP4     rs17563     T/C         0.454               0.390                 1.301 (0.843 - 2.006)  0.234
4    FGF2     rs1048201   C/T         0.139               0.131                 1.063 (0.569 - 1.985)  0.849
6    ENPP1    rs1044498   A/C         0.200               0.203                 0.979 (0.574 - 1.670)  0.938
6    COL11A2  rs9277934   G/A         0.385               0.360                 1.110 (0.713 - 1.728)  0.643
20   BMP2     rs235768    A/T         0.292               0.284                 1.042 (0.650 - 1.671)  0.865
12   VDR      rs2228570   T/C         0.400               0.373                 1.121 (0.723 - 1.739)  0.609

The minor allele is the second allele listed in the M/m column. Abbreviations: Chr: Chromosome, SNP: Single nucleotide polymorphism, M/m: major allele/minor allele, MAF: Minor allele frequency and OR (95% CI): Odds Ratio (95% Confidence Interval).
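As a rough check on how the figures in Table III relate to one another, the allelic odds ratio and chi-squared test for rs34473884 can be recomputed from the reported frequencies. The sketch below is not the PLINK implementation: it assumes complete genotyping (two alleles per subject), back-calculates the allele counts from the rounded MAFs, and uses a Woolf (log odds ratio) confidence interval and an uncorrected 1-df chi-squared test, all of which are assumptions made here for illustration.

```python
import math


def allelic_association(case_minor: int, case_total: int,
                        ctrl_minor: int, ctrl_total: int):
    """Allelic odds ratio, Woolf 95% CI, chi-squared (1 df) and p-value."""
    a, b = case_minor, case_total - case_minor   # cases: minor, major alleles
    c, d = ctrl_minor, ctrl_total - ctrl_minor   # controls: minor, major alleles
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, ci_low, ci_high, chi2, p_value


# Allele counts back-calculated from Table III (assumption: no missing genotypes).
case_alleles = 2 * 65                          # 130 alleles in DISH/CC patients
ctrl_alleles = 2 * 118                         # 236 alleles in controls
case_minor = round(0.262 * case_alleles)       # 34 minor alleles
ctrl_minor = round(0.165 * ctrl_alleles)       # 39 minor alleles

odds_ratio, ci_low, ci_high, chi2, p_value = allelic_association(
    case_minor, case_alleles, ctrl_minor, ctrl_alleles)
```

Under these assumptions the recomputed odds ratio agrees with the tabulated 1.789, and the p-value falls just under 0.03; the confidence limits differ slightly from the published ones, which is expected given the rounded frequencies and the possibility of a different interval method or missing genotypes.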

then evaluated in 6 different vertebrates. It is evident in Figure 2 that the variants in the genes PPP2R2D, BMP2, FLNC and CASR are highly conserved in all the vertebrates analyzed. Variants of the genes VDR, BMP4, COL1A1 and ABCC6 are only conserved in mammals. The degree of conservation is not directly linked to lower allele frequency, since several of these variants have high Minor Allele Frequencies (MAF).

ASSOCIATION BETWEEN VARIANTS AND THE DISH/CC PHENOTYPE
Two cohorts were selected after radiological characterization and used for association studies: one of 65 unrelated DISH/CC patients (45 males, 20 females; age of onset around 40 years) and another of 118 unrelated controls without any signs of DISH/CC and with a similar ethnic background (47 males, 71 females; mean current age, 68 years; range, 60-104). For the cases we selected younger patients, around the age of 40, in whom genetic factors may be involved, whereas for the controls we selected older individuals to ensure that they had not developed the disease.

All patients had at least 2 highly conserved genetic variants in combination with other variants that are normally conserved in mammals (Table II). Variants with a high degree of conservation, present in 4 or 3 WES patients, irrespective of the MAF, were selected for the association study. The variants rs34473884, rs9277934 and rs2228570 in the PPP2R2D, COL11A2 and VDR genes, respectively, were present in all four WES patients. The variants rs1048201, rs235768, rs1044498 and rs17563 in the FGF2, BMP2, ENPP1 and BMP4 genes, respectively, were present in three of the four patients selected for WES. Seven variants were screened in a group of 65 DISH/CC patients and 118 controls, and the results are shown in Table III. The variant rs34473884 in the PPP2R2D gene was the only one that gave significant results. It was found to be more frequent in the DISH/CC group than in the controls (p=0.028; OR=1.789, 95% CI=1.060-3.021).

DISCUSSION

In this study, we used WES as a method to identify candidate genes for DISH/CC aetiology and association studies to investigate some of the identified variants. As expected, thousands of protein coding variants per patient were identified across each exome. Consequently, a number of filtering strategies were used to select potential high risk variants of genes potentially associated with DISH. The filtering strategies employed were based on previous knowledge about gene function. The major limitation of the strategy used to identify candidate genes is that unannotated genes and those with unknown functions were not investigated in the study.

Very few genetic studies have been published about DISH so far. DISH susceptibility genes have previously been investigated, including Human Leukocyte Antigens (HLA) [17-24], Collagen 6A1 gene (COL6A1) [25], Fibroblast Growth Factor 2 (FGF2) [26], Vitamin D (1,25-Dihydroxyvitamin D3) Receptor (VDR) and Collagen Type I α1 (COL1A1) [27], but only two genes are known to


have a positive association with DISH susceptibility: COL6A1 [28-29] and FGF2 [26]. However, the COL6A1 and FGF2 variants associated with the disease are located in non-coding regions and are common variants within the general population, suggesting they have a small genetic effect.

In our study we found 3 genetic variants that were present in all 4 DISH/CC patients and 6 that were present in 3 DISH/CC patients. Minor allele frequencies of these variants were high (mean 0.25), meaning that these were common variants in the general population. Rare nonsynonymous SNPs are two times more likely to be predicted as protein-affecting when compared to common SNPs. However, in our study three common SNPs shared by the 4 patients (G358S, E276K and M1T in the PPP2R2D, COL11A2 and VDR genes, respectively) were predicted by the SIFT and/or PolyPhen algorithms to have a strong effect on the protein. Eight of these genetic variants, in heterozygous and/or homozygous states, were in conserved positions of proteins associated with mineralization, four of which were in regions that were highly conserved across the vertebrates. The four patients selected for WES had at least 2 highly conserved genetic variants in combination with other variants that were conserved in mammals. The association study indicated that the SNP rs34473884 in the PPP2R2D gene was significantly associated with the DISH/CC phenotype.

In humans, the full PPP2R2D (Protein Phosphatase 2 Regulatory Subunit delta) gene structure is composed of 9 exons and spans 79.19 kb on chromosome 10. The gene encodes a crucial serine/threonine protein phosphatase that regulates basal cellular activities by dephosphorylating substrates. Protein Ser/Thr phosphatases are a group of enzymes that catalyze the removal of phosphate groups from serine and/or threonine residues by hydrolysis of phosphoric acid monoesters. They oppose the action of kinases and phosphorylases and are involved in signal transduction. Protein phosphatases have long been postulated to influence TGF-β superfamily signaling, which regulates numerous cellular responses [30]. Despite the long-standing suspected influence of protein phosphatases in TGF-β signaling, concrete data confirming the interaction only started to emerge recently. In human cell lines, PPP2R2D negatively modulates TGF-β/Activin/Nodal signaling by inhibiting the type I receptors ALK4 and ALK5 [31].

The variant c.1072G>A (G327S) found in the PPP2R2D gene is a missense variant that replaces glycine with serine at amino acid 327 in the protein. These two amino acids are hydrophilic, but glycine is non-polar and serine is polar. The glycine is totally conserved in all vertebrates studied, and the variant has a low, deleterious, SIFT score (0.03), but the PolyPhen score (0.33) does not corroborate its harmful effect. The frequency of this variant is high in European populations, with a MAF of 0.18, and this variant was identified in all four DISH/CC patients used for WES; AZ1 was homozygous and AZ2, AZ3 and AZ4 were heterozygous. As far as we know, no phenotype has been associated with this gene.

Rare variants with a deleterious effect are the basis for the development of Mendelian diseases. However, common polymorphisms that individually exert small effects might, as a group, also play a substantial genetic role. The lack of association of most of the gene variants studied here may be in part linked to the heterogeneous features of the phenotype, and so improved characterization of the disorder under study is essential for the planning of future studies. The results of our study lead us to suggest that DISH/CC is polygenic, and is influenced by the interaction of multiple small effect gene variants and possibly by unknown environmental factors.

CONCLUSION

Our results underline the polygenic nature of DISH, and a number of conserved and sometimes deleterious variants were identified in genes with a role in mineralization. The variant rs34473884 in the PPP2R2D gene was significantly associated with the DISH/CC phenotype and we propose it may contribute to the development of this disorder. Further studies will be needed to confirm the association of PPP2R2D with the phenotype under study.

ACKNOWLEDGEMENTS
The authors thank all the patients who participated in this research and made it possible. We also thank Isa Dutra (MSc), João Paulo Pinheiro (BSc) and Raquel Meneses (BSc) from Hospital Santo Espírito da Ilha Terceira, EPE, Epidemiology and Molecular Biology Service (SEEBMO), Angra do Heroísmo, Portugal, who performed the DNA extractions, and Vânia Machado (MSc), who participated in the elaboration of the pedigrees. BP was supported by “Fundo Regional para a Ciência e Tecnologia (FRCT)” (M3.1.2/F/023/2011).

COMPETING FINANCIAL INTERESTS
The authors declare no competing interests.


CORRESPONDENCE TO
Jácome Bruges Armas
Serviço Especializado de Epidemiologia e Biologia Molecular (SEEBMO)
Hospital de Santo Espírito da Ilha Terceira (HSEIT)
Canada do Briado
9700-049, Angra do Heroísmo
E-mail: [email protected]

REFERENCES
1. Bruges-Armas J, Couto AR, Timms A, Santos MR, Bettencourt BF, Peixoto MJ, et al. Ectopic calcification among families in the Azores: clinical and radiologic manifestations in families with diffuse idiopathic skeletal hyperostosis and chondrocalcinosis. Arthritis Rheum 2006; 54: 1340-1349.
2. Mader R. Clinical manifestations of diffuse idiopathic skeletal hyperostosis of the cervical spine. Sem Arthritis Rheum 2002; 32: 130-135.
3. Mader R, Verlaan JJ, Buskila D. Diffuse idiopathic skeletal hyperostosis: clinical features and pathogenic mechanisms. Nat Rev Rheumatol 2013; 9: 741-750.
4. Utsinger PD. Diffuse Idiopathic Skeletal Hyperostosis. Clin Rheum Dis 1985; 11: 325-351.
5. Weinfeld RM, Olson PN, Maki DD, Griffiths HJ. The prevalence of diffuse idiopathic skeletal hyperostosis (DISH) in two large American Midwest metropolitan hospital populations. Skeletal Radiol 1997; 26: 222-225.
6. Beardwell A. Familial Ankylosing Vertebral Hyperostosis With Tylosis. Ann Rheum Dis 1969; 28: 518-523.
7. Woodard JC, Poulos PW, Jr., Parker RB, Jackson RI, Jr., Eurell JC. Canine diffuse idiopathic skeletal hyperostosis. Vet Pathol 1985; 22: 317-326.
8. Morgan J, Stavenborn M. Disseminated idiopathic skeletal hyperostosis (DISH) in a dog. Vet Radiol Ultrasound 1991; 32: 65-70.
9. Gutierrez M, Silveri F, Bertolazzi C, Giacchetti G, Tardella M, Di Geso L, et al. [Gitelman syndrome associated with chondrocalcinosis: description of two cases]. Reumatismo 2010; 62: 60-64.
10. Netter P, Bardin T, Bianchi A, Richette P, Loeuille D. The ANKH gene and familial calcium pyrophosphate dihydrate deposition disease. Joint Bone Spine 2004; 71: 365-368.
11. Ramos YF, Bos SD, van der Breggen R, Kloppenburg M, Ye K, Lameijer EW, et al. A gain of function mutation in TNFRSF11B encoding osteoprotegerin causes osteoarthritis with chondrocalcinosis. Ann Rheum Dis 2015; 74: 1756-62.
12. Couto AR, Parreira B, Thomson R, Soares M, Power DM, Stankovich J, et al. Combined approach for finding susceptibility genes in DISH/chondrocalcinosis families: whole-genome-wide linkage and IBS/IBD studies. Hum Genome Var 2017; 4: 17041.
13. Ng SB, Buckingham KJ, Lee C, Bigham AW, Tabor HK, Dent KM, et al. Exome sequencing identifies the cause of a mendelian disorder. Nat Genet 2010; 42: 30-35.
14. Rabbani B, Tekin M, Mahdieh N. The promise of whole-exome sequencing in medical genetics. J Hum Genet 2014; 59: 5-15.
15. Cooper GM, Stone EA, Asimenos G, Green ED, Batzoglou S, Sidow A. Distribution and intensity of constraint in mammalian genomic sequence. Genome Res 2005; 15: 901-913.
16. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007; 81: 559-575.
17. Rosenthal M, Bahous I, Müller W. Letter: HL-A B27 and modified bone formation. Lancet 1976; 1: 543.
18. Rotés Querol J, Ercilla González G. Letter: HLA-B27 and modified bone formation. Lancet 1976; 1: 482-3.
19. Brigode M, Francois RJ. Histocompatibility antigens in vertebral ankylosing hyperostosis. J Rheumatol 1977; 4: 429-434.
20. Spagnola AM, Bennett PH, Terasaki PI. Vertebral ankylosing hyperostosis (Forestier's disease) and HLA antigens in Pima Indians. Arthritis Rheum 1978; 21: 467-72.
21. Perry JD, Wolf H, Festenstein H, Storey GO. Ankylosing hyperostosis: a study of HLA A, B, and C antigens. Ann Rheum Dis 1979; 38: 72-73.
22. Dostál C, Ivasková E, Hána I, Sváb V. HLA antigens in vertebral ankylosing hyperostosis (Forestier's disease). J Hyg Epidemiol Microbiol Immunol 1983; 27: 98-102.
23. Rosenthal M, Bahous I, Muller W. Increased frequency of HLA B8 in hyperostotic spondylosis. J Rheumatol Suppl 1977; 3: 94-96.
24. Ercilla González G, Brancós MA, Breisse Y. Histocompatability antigens in Forestier's disease, polyarthrosis and ankylosing spondylitis. HLA and Disease. Paris: INSERM; 1976. p. 28.
25. Tsukahara S, Miyazawa N, Akagawa H, Forejtova S, Pavelka K, Tanaka T, et al. COL6A1, the candidate gene for ossification of the posterior longitudinal ligament, is associated with diffuse idiopathic skeletal hyperostosis in Japanese. Spine 2005; 30: 2321-2324.
26. Jun JK, Kim SM. Association study of fibroblast growth factor 2 and fibroblast growth factor receptors gene polymorphism in Korean ossification of the posterior longitudinal ligament patients. J Korean Neurosurg Soc 2012; 52: 7-13.
27. Havelka S, Uitterlinden AG, Fang Y, Arp PP, Pavelková A, Veselá M, et al. Collagen type I(alpha 1) and vitamin D receptor polymorphisms in diffuse idiopathic skeletal hyperostosis. Clin Rheumatol 2002; 21: 347-348.
28. Tanaka T, Ikari K, Furushima K, Okada A, Tanaka H, Furukawa K, et al. Genomewide linkage and linkage disequilibrium analyses identify COL6A1, on chromosome 21, as the locus for ossification of the posterior longitudinal ligament of the spine. Am J Hum Genet 2003; 73: 812-822.
29. Kong Q, Ma X, Li F, Guo Z, Qi Q, Li W, et al. COL6A1 polymorphisms associated with ossification of the ligamentum flavum and ossification of the posterior longitudinal ligament. Spine (Phila Pa 1976) 2007; 32: 2834-2838.
30. Liu T, Feng XH. Regulation of TGF-beta signalling by protein phosphatases. Biochem J 2010; 430: 191-198.
31. Batut J, Schmierer B, Cao J, Raftery LA, Hill CS, Howell M. Two highly related regulatory subunits of PP2A exert opposite effects on TGF-beta/Activin/Nodal signalling. Development 2008; 135: 2927-2937.

SUPPLEMENTARY FILES

SUPPLEMENTARY TABLE I. NUMBER OF CANDIDATE GENES PER SAMPLE AND NUMBER OF CANDIDATE GENES SHARED BY DISH/CC PATIENTS

                     Number of candidate genes
Samples              Dominant model   Recessive model
AZ1                  815              48
AZ2                  917              58
AZ3                  872              47
AZ4                  593              25
AZ1+AZ2              220              6
AZ1+AZ3              212              4
AZ1+AZ4              139              4
AZ2+AZ3              249              6
AZ2+AZ4              149              4
AZ3+AZ4              162              3
AZ1+AZ2+AZ3          118              2
AZ1+AZ2+AZ4          65               1
AZ2+AZ3+AZ4          78               2
AZ1+AZ3+AZ4          77               1
AZ1+AZ2+AZ3+AZ4      52               1

SUPPLEMENTARY TABLE II. [Table body not recoverable from the source layout; the surviving fragments are Ensembl transcript identifiers.]
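The shared-gene counts in Supplementary Table I can be read as sizes of set intersections: a gene is "shared" by AZ1+AZ2 when it is in the candidate list of both patients. The following is a minimal sketch of that counting step, not the authors' pipeline. The gene names (`GENE_A` and so on), the `candidates` dictionary and the `shared_counts` helper are hypothetical placeholders, and the toy sets are not study data.

```python
from itertools import combinations

# Hypothetical candidate-gene sets per sample (placeholders, NOT study data).
candidates = {
    "AZ1": {"GENE_A", "GENE_B", "GENE_C"},
    "AZ2": {"GENE_A", "GENE_B", "GENE_D"},
    "AZ3": {"GENE_A", "GENE_C", "GENE_D"},
    "AZ4": {"GENE_A", "GENE_E"},
}


def shared_counts(gene_sets):
    """Count candidate genes common to every combination of samples."""
    counts = {}
    samples = sorted(gene_sets)
    for k in range(1, len(samples) + 1):
        for combo in combinations(samples, k):
            # Genes present in the candidate list of every sample in combo.
            shared = set.intersection(*(gene_sets[s] for s in combo))
            counts["+".join(combo)] = len(shared)
    return counts


counts = shared_counts(candidates)
for label, n in counts.items():
    print(label, n)
```

One property of this construction is that a combination can never share more genes than any of its subsets, which matches the pattern in the table (for example, the four-patient counts are the smallest in each column).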


SUPPLEMENTARY TABLE III. KNOWN, NOVEL AND TOTAL NUMBER OF SNVS AND INDELS FOR THE FOUR SAMPLES TESTED (AZ1-4)

         SNVs                      Indels
Sample   Known   Novel   Total     Known   Novel   Complex   Total    TOTAL/sample
AZ1      17281   1619    18900     457     405     51        913      19813
AZ2      18086   1759    19845     440     496     66        1002     20847
AZ3      18327   1650    19977     524     505     73        1102     21079
AZ4      17575   1223    18798     475     424     72        971      19769

SNVs: Single nucleotide variants.
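The column totals in Supplementary Table III follow simple sums: known plus novel SNVs give the SNV total; known, novel and complex indels give the indel total; and the two totals give the per-sample total. A short sketch of that arithmetic check is given below, with the counts transcribed from the table. The dictionary keys and the `row_is_consistent` helper are illustrative names, not part of the original analysis.

```python
# Counts transcribed from Supplementary Table III.
# Order: SNV known, SNV novel, SNV total,
#        indel known, indel novel, indel complex, indel total, total/sample.
FIELDS = ("snv_known", "snv_novel", "snv_total",
          "indel_known", "indel_novel", "indel_complex", "indel_total",
          "total")

raw = {
    "AZ1": (17281, 1619, 18900, 457, 405, 51, 913, 19813),
    "AZ2": (18086, 1759, 19845, 440, 496, 66, 1002, 20847),
    "AZ3": (18327, 1650, 19977, 524, 505, 73, 1102, 21079),
    "AZ4": (17575, 1223, 18798, 475, 424, 72, 971, 19769),
}
rows = {sample: dict(zip(FIELDS, values)) for sample, values in raw.items()}


def row_is_consistent(r):
    """Check that the reported totals equal the sums of their parts."""
    return (
        r["snv_known"] + r["snv_novel"] == r["snv_total"]
        and r["indel_known"] + r["indel_novel"] + r["indel_complex"]
        == r["indel_total"]
        and r["snv_total"] + r["indel_total"] == r["total"]
    )


consistency = {sample: row_is_consistent(r) for sample, r in rows.items()}

# Fraction of SNVs flagged as novel in each sample (derived, not reported).
novel_snv_fraction = {s: r["snv_novel"] / r["snv_total"] for s, r in rows.items()}

for sample in rows:
    print(sample, consistency[sample], round(novel_snv_fraction[sample], 4))
```

The novel-SNV fraction is a derived quantity added here for illustration only; the source reports the counts, not this ratio.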
