CLINICAL RESEARCH

e-ISSN 1643-3750 © Med Sci Monit, 2019; 25: 5657-5665 DOI: 10.12659/MSM.915732

Received: 2019.02.17 Accepted: 2019.03.21 Trends in the Epidemiology of Sexually Published: 2019.07.30 Transmitted , Acquired Immune Deficiency Syndrome (AIDS), Gonorrhea, and Syphilis, in the 31 Provinces of Mainland

Authors’ Contribution: CDEF 1 Xuechen Ye 1 Department of Social , School of , China Medical University, Study Design A BCDF 2 Jie Liu Shenyang, Liaoning, P.R. China Data Collection B 2 Department of Health Statistics, School of Public Health, China Medical Statistical Analysis C AE 3 Zhe Yi University, Shenyang, Liaoning, P.R. China Data Interpretation D 3 Department of Prothodontics, School of Stomatology, China Medical University, Manuscript Preparation E Shenyang, Liaoning, P.R. China Literature Search F Funds Collection G

Corresponding Author: Zhe Yi, e-mail: [email protected] Source of support: This study was funded by the National Natural Science Foundation of China (Grant numbers 71673301, 71473269, and 81273186)

Background: This study aimed to investigate trends in the epidemiology of the leading sexually transmitted (STDs), acquired immune deficiency syndrome (AIDS), gonorrhea, and syphilis, in the 31 provinces of mainland China. Material/Methods: This retrospective study analyzed the incidence data of STDs from official reports in China between 2004 and 2016. The grey model first order one variable, or GM (1,1), time series forecasting model for epidemiological studies predicted the incidence of STDs based on the annual incidence reports from 31 Chinese mainland prov- inces. Hierarchical cluster analysis was used to group the prevalence of STDs within each province. Results: The prediction accuracy of the GM (1,1) model was high, based on data during the 13 years between 2004 and 2016. The model predicted that the incidence rates of AIDS and syphilis would continue to increase over the next two years. Cluster analysis showed that 31 provinces could be classified into four clusters according to similarities in the incidence of STDs. Group A (Sinkiang Province) had the highest reported prevalence of syph- ilis. Group B included provinces with a higher incidence of gonorrhea, mainly in the southeast coast of China. Group C consisted of southwest provinces with a higher incidence of AIDS. Conclusions: The GM (1,1) model was predictive for the incidence of STDs in 31 provinces in China. The predicted incidence rates of AIDS and syphilis showed an upward trend. Regional distribution of the major STDs highlights the need for targeted prevention and control programs.

MeSH Keywords: Cluster Analysis • Forecasting • Incidence • Sexually Transmitted Diseases

Full-text PDF: https://www.medscimonit.com/abstract/index/idArt/915732

2695 5 4 30

Indexed in: [Current Contents/Clinical Medicine] [SCI Expanded] [ISI Alerting System] This work is licensed under Creative Common Attribution- [ISI Journals Master List] [Index Medicus/MEDLINE] [EMBASE/Excerpta Medica] NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) 5657 [Chemical Abstracts/CAS] Ye X. et al.: CLINICAL RESEARCH Epidemiology of STDs in China © Med Sci Monit, 2019; 25: 5657-5665

Background strategies and was chosen for this study to predict the prev- alence of STDs in China. Acquired immune deficiency syndrome (AIDS) due to human immune deficiency (HIV) , syphilis, and gonor- To identify and monitor the incidence of STDs an understanding rhea are the three major sexually transmitted diseases (STDs) of the regional distribution of disease incidence is required [13]. in China, and are legally notifiable infectious diseases accord- In China, the occurrence of STDs has significant disparity across ing to the Law of the Peoples’ Republic of China on Prevention regions, with some reaching epidemic levels in cer- and Control of Infectious Diseases [1]. Clinically diagnosed tain parts of the country [14,15]. The regional epidemiology of STDs, or those diagnosed by laboratory investigations, have disease must be considered when healthcare services, includ- a mandatory requirement to be reported to national surveil- ing administration and allocation of resources to implement lance systems in China [1]. Surveillance data from 2004 to prevention and control of STDs. 2013 in China showed that HIV infection and syphilis had the most rapid increase in incidence, with an annual increase in Comprehensive STD control programs are urgently required to prevalence of 16.3% (95% CI, 11.5–21.2), and 16.3% (95% CI, reduce the spread of STDs in China. Therefore, this study aimed 13.8–18.8) respectively, and gonorrhea had the most rapid to investigate trends in the epidemiology of the leading sexu- rates of decline with an annual change of –8.5% (95% CI, ally transmitted diseases (STDs), acquired immune deficiency –11.7 to –5.1) [2]. The increased incidence of HIV and syphilis syndrome (AIDS), gonorrhea, and syphilis, in the 31 provinces was attributed to a variety of reasons, including internal la- of mainland China between 2004 and 2016. bor migration, unsafe sexual practices, and lack of consulting healthcare services [3–5]. Material and Methods Monitoring and evaluating the increasing incidence of STDs, particularly the ability to predict the prevalence of STDs, is es- Study design and data access sential to control infectious disease [1]. Several methods have been used to detect and predict the prevalence of infectious This retrospective study required no ethical approval as ano- disease over time, including the autoregressive integrated mov- nymized retrospective data were collected and analyzed from ing average (ARIMA) model, the back propagation neural net- existing resources. The incidence data of legally notifiable in- work (BPNN) model, and the grey model (GM) [6,7]. The ARIMA fectious diseases in mainland China were publicly accessible model assumes a linear relationship between dependent and from the official website of the National Health and Family independent variables and a constant standard deviation in Planning Commission of the Peoples’ Republic of China [16]. a time series [8]. However, the actual surveillance data often presents complex non-linear relationships and non-stationary Data collection distributions over time [8,9]. The BPNN model requires suffi- cient training data and longer training times to obtain an ac- The incidence of acquired immune deficiency syndrome (AIDS) curate and generalized model and may be associated with due to human immune deficiency virus (HIV) infection, syphilis, a predisposition to falling into local minima in the training and gonorrhea in mainland China from 2004 to 2016 were ob- process [6]. Difficulties in the collection of detailed informa- tained from official reports released by the National Health tion and complex mathematical operations also limit the ap- and Family Planning Commission of the People’s Republic of plication of the BPNN model. China [16]. The reported incidence data of 31 provinces, or au- tonomous regions and municipalities, were also collected. Compared with the ARIMA and BPNN methods, the grey model requires a smaller sample size, less stringent data distribution, Data analysis using the grey prediction model, GM (1,1) and it is easier to analyze [10]. Deng first proposed the grey system theory in 1982 through the analysis of systems with The grey model was designed to make predictions from par- partially known, incomplete, or uncertain information [11]. tially known or limited data [12]. The grey model first order The grey model can be built to aid predictions and decision- one variable, or GM (1,1), time series forecasting model for ep- making in the face of limited information and knowledge [11]. idemiological studies is a more specific and widely used grey Recently, the grey model has been applied to the analysis of prediction model [7]. In this study, the reported incidence rates data involving the environment, energy, and finance, with sat- of AIDS, syphilis, and gonorrhea from 2004 to 2016 were col- isfactory reported results [12]. The grey model can also pre- lected to establish the GM (1,1) model for incidence prediction. dict the epidemiology of infectious diseases, including typhoid The following steps explain how the model was developed. fever and echinococcosis [7,10]. The grey model can predict the incidence of infectious disease and facilitate intervention The original data sequence was presented as Eq (1):

Indexed in: [Current Contents/Clinical Medicine] [SCI Expanded] [ISI Alerting System] This work is licensed under Creative Common Attribution- [ISI Journals Master List] [Index Medicus/MEDLINE] [EMBASE/Excerpta Medica] NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) 5658 [Chemical Abstracts/CAS]

infectious diseases in mainland China were publicly accessible from the official website of the National Health and Family Planning Commission of the Peoples’ Republic of China [16]. infectiousinfectious diseases diseases in inmainland mainland China China were were publicly publicly accessible accessible from from the the official official website website of ofthe the

NationalNational Health Health and and Family Family Planning Planning Commission Commission of ofthe the Peoples’ Peoples’ Republic Republic of ofChina China [16]. [16]. Data collection

The incidence of acquired immune deficiency syndrome (AIDS) due to human immune DataData collection collection deficiency virus (HIV) infection, syphilis, and gonorrhea in mainland China from 2004 to 2016 TheThe incidence incidence of ofacquired acquired immune immune deficiency deficiency syndrome syndrome (AIDS) (AIDS) due due to tohuman human immune immune were obtained from official reports released by the National Health and Family Planning deficiencydeficiency virus virus (HIV) (HIV) infection, infection, syphilis, syphilis, and and gonorrhea gonorrhea in inma mainlandinland China China from from 2004 2004 to to2016 2016 Commission of the People’s Republic of China [16]. The reported incidence data of 31 provinces, werewere obtained obtained from from official official reports reports released released by by the the National National He Healthalth and and Family Family Planning Planning or autonomous regions and municipalities, were also collected. CommissionCommission of ofthe the People’s People’s Republic Republic of ofChina China [16]. [16]. The The reported reported incidence incidence data data of of31 31 provinces, provinces, or orautonomous autonomous regions regions and and munic municipalities,ipalities, were were also also collected. collected. Data analysis using the grey prediction model, GM (1,1)

The grey model was designed to make predictions from partially known or limited data [12]. The DataData analysis analysis using using the the grey grey prediction prediction mo model,del, GM GM (1,1) (1,1) grey model first order one variable, or GM (1,1), time series forecasting model for TheThe grey grey model model was was designed designed to tomake make predictions predictions from from partially partially known known or orlimited limited data data [12]. [12]. The The epidemiological studies is a more specific and widely used grey prediction model [7]. In this greygrey model model first first order order one one variable, variable, or orGM GM (1,1), (1,1), time time series series forecasting forecasting model model for for study, the reported incidence rates of AIDS, syphilis, and gonorrhea from 2004 to 2016 were epidemiologicalepidemiological studies studies is ais more a more specific specific and and widely widely used used grey grey prediction prediction model model [7]. [7]. In Inthis this study,collected the reported to establish incidence the GM rates (1,1) of AIDS,model syphilis,for incidence and gonopredictrrheaion. from The 2004following to 2016 steps were explain study, the reported incidence rates of AIDS, syphilis, and gonorrhea from 2004 to 2016 were how the model wasYe X. et developed. al.: collectedcollected to toestablish establish Epidemiologythe the GM GM of STDs(1,1) (1,1) in China model model for for incidence incidence predict prediction.ion. The The following following steps steps explain explain CLINICAL RESEARCH © Med Sci Monit, 2019; 25: 5657-5665 where,where, how the model was developed. where, how the model was developed. (0) where, xˆxˆ (0)( k+(k+1)1) represents represents the the predicted predicted value. value. The original data sequence was presented as Eq (1): xˆ (0) (k+1) represents the predicted value. (0) (0) (0) (0) (0) (0) xˆ (k+1) represents the predicted value. x ={x (0), x (1), x (2), , x (n)} (1) TheThe original original data data sequence sequence was was presented presented as asEq Eq (1): (1): where, (0) (0) (0) (0) After(0)After establishing establishing the the GM GM (1,1), (1,1), its its accuracy accuracy was was examined examined to to e extrapolatextrapolate the the predicted predicted values. values. where, ⋯ (0) x(0)x ={={x(0)x(0),(0), x(0) x (1),(1), x(0) x (2),(2), , x(0), x (Aftern)}(n )} establishing (1) cˆ ( k+ the (1)1) (1) representsGM (1,1), the its predicted accuracy value. was examined to extrapolate the predicted values. (0 ) After establishing the GM (1,1), its accuracy was examined to extrapolate the predicted values. x (i) ≥ 0, I = 0, 1, 2, , n. where, where, ⋯ After establishing the GM (1,1), its accuracy was examined to where, ⋯ DuringDuring the the error errorextrapolate analysis analysis the of of predictedthe the grey grey values. system, system, the the late late error error de detectiontection method method was was used used to to quantify quantify (0 ) (0 x) (i) ≥ 0, I = 0, 1, 2, ,⋯ n. During the error analysis of the grey system, the late error detection method was used to quantify x (i) ≥ 0, I = 0, 1, 2, , n. During the error analysis of the grey system, the late error detection method was used to quantify thethe prediction prediction error errorDuring and andthewhere, toerror to assess assess analysis its its of accuracy. theaccuracy. grey system, This This the method method late error us usdeeded- two two important important indices, indices, the the The accumulatedk generating operation (AGO) was then used to generate the first order 1 (0) The accumulatedk generating operation⋯ (AGO) wasthe then prediction used errortection and method to assesswas used itsxˆ to(0) accuracy.(quantifyk+1) represents the This prediction the predictedmethod error value. andused two important indices, the x (k)=∑ x (i),k 1k = 0, 1, 2,(0) , n. ⋯ the prediction error and to assess its accuracy. This method used two important indices, the i=01 x ((0)k)=∑i=0x (i), (0)k = 0, 1, 2, , n. posteriorposterior error (0)error ratio, ratio, termed termed C, C, and and the the small small error error probabilit probabilityy termed termed P. P. Between Between the the actual actual data data accumulativex (k)= data∑ito=0 xgeneratesequence(i), k the= 0, offirst 1, x 2, order : , naccumulative. data sequence ofc : to assess its accuracy. This method used two important indices, The� � accumulatedk generating� � operation (AGO) was then posteriorused to gen errorerate ratio, the termedfirst order C, and the small error probability termed P. Between the actual data Thex accumulated1 (k)=∑ � �x(0) generating(i), k = 0, 1, operation 2, , n. (AGO) was then used to generate thethe posterior first order error ratio, termed C, and the small error prob- i=0 (1) (1)⋯ (1) ⋯(1) posteriorand(1) the error predicted ratio, termedvalues,After C,the establishing and residual the the small errorGM (1,1), error was its accuracydenotedprobabilit was as: examinedy termed to eP.xtrapolate Between the predicted the actual values. data (0) ⋯ and the predicted (2) ability values, termed the P. Between residual the error actual was data denoted and the predicted as: accumulative� � data sequencex = { xof(0) x(0),: x (1), x (2), , xand(n the)} predicted values, (2) the residual error was denoted as: accumulative data sequence of x : (1) (1) k The adjacent neighbor mean sequence(1) and of x the was predicted expressed values,values, as: the the (0)(0) residual residual error(0)(0) error was denoted(0) was(0) denoted as: as: 1The adjacentk The(0) adjacentneighbor neighbor mean sequence mean⋯ sequence of x of was x expressedwas expressed as: as: εε (k(k) )= = x x (k(k)−)−xˆxˆ (k(k),), k k = = 1, 1, 2, 2, 3, 3, , ,n n (10)(10) x 1 (where,k)=∑i=0 x (0)(i), k where,= (1)0, 1, 2,(1) , n. (1) (1) (1) (0) (0) (0) x (k)=∑ x (i), k(1) x= 0,= 1, (1){ x2, (0),, (1)n .x (1),(1) x (1)(2), (1)(1) ⋯, x ((1)n )} (1) (2)Duringε (k )the = error x (analysisk)− xˆ of (thek), grey k = system, 1, 2, the 3, late ,error n de tection method was(10) used to quantify � � i=0 x = {x (0),z(1) x= {(1),z(1)(1) (1), x (1) z(2),z(1) (1) =(2), {(1)z, x (1),, z((1)n z)} (n (1)(2),)} , z ( n )} (2) ε (0) (3) (k ) = x (0) ((3)k)− xˆ (0)(k), k = 1, 2, 3, , n (10) (10) �The�k adjacent1 neighbork (0) mean sequencek z = of{z x (1), was z expressed(2), , z ( as:n)} (3) x 1 (k )=∑ x(0)(i),x k =( k0,)= 1,∑ 2, x , (ni1.), k = 0, ⋯1, 2, (0), n. the prediction error and to assess its accuracy. This⋯⋯ method used two important indices, the where,i=0 i=0 x (k)=∑⋯i=0x (i), k = 0, 1, 2,⋯ , n. ⋯ � �where, � � where, (1) (1) (1) (1) ⋯ where, where, � � z = {z (1), z (2),⋯ , z⋯ (n)} where, C(3) posteriorwas the errorratio ratio,between termed the C, standard and the small deviation⋯ error probabilit of the y termed P. Between the actual data ⋯ ⋯ (1) ⋯ where,where, C C was was the the ratio ratio between between the the standard standard5 deviation deviation of of the the re residualsidual error error ( S(Se)e )and and the the standard standard The adjacent neighborThe meanadjacent sequence neighbor mean of x sequence(1) was(1) expressedof c(1) was expressed as:(1) as: (1) residual error (S ) and the standard deviation of original data The adjacent neighbor mean sequence(1) of(1) x wasz (1)( kexpressed ) ⋯=(1) 0.5 where,(1)× [ xas:(1) ( kC) +was x (thek − ratio1)], andk between= the 1, epredicted2, ,the n .values, standard the residual deviation error was ofdenoted the as:re sidual error (Se) and the standard (1) z (k) =(1) 0.5 × [xwhere,(k) + xC was(k − 1)],the kratio = 1, between2, , n. the standard deviation of the residual error (Se) and the standard The adjacentwhere, neighbor The adjacentmean sequence neighbor of mean(1)x wasz sequence ((1)expressedk) = 0.5 of(1) x as:× was[x expressed(k) (1)+ deviationx as:(k − 1)], of koriginal =( S1,c). 2,The data specific, n (. S ). equation The specific was(0) given equation(0) as: (0) was given as: ⋯ deviation of original data (Sx x). The specificε (k) = x equation(k)− xˆ (k), wask = 1, given2, 3, , nas: (10) z (1)= {z (1)(1), z (1)(2), , z (1)( n)} (1) (3) (3) 5 (1) The(1) adjacentz (1)= (1){ z neighbor(1),(1)(1) z mean(1)(2), sequence, z(1) (deviationn)} of x was of original expressed (3) data as: (S x).⋯ The specific5 equation was given as: z = {z (1), z (2),z(1) = {, zz (1),(n)} z (2), (1) , zdeviation (n(3))}(1) of original (3) data ⋯(S ). The specific equation was given as: z (k) = 0.5 × [x (k) + x (k − 1)], k = 1, 2, ⋯ , nx . CC = = S Se/S/Sx (11)(11)(11)⋯ where, (1) (1) (1) (1) e x where, Next,where, the basic grey differential equation⋯z = { zof GM(1), (1,1) z (2), was esta, zblished(n)} as: C = S(3)e/S x (11) where,where, Next, the basic grey differentia⋯ l equation of⋯ GM (1,1) was established as: where, C was the ratio between the standard deviation of the residual error (Se) and the standard Next, the basic grey differentia(1)l equation of GM(1)(0) (1,1)(1) where,was (1) esta blishedwhere, as: C = Se/Sx (11) (1) (1) (1) (0)(1) (1) (1) (1) where, ⋯ z (k) = 0.5 ×z [(1)x(kz()k )=( k+ x)0.5 x= 0.5( (k×k) −×[+x 1)],[xax(1) × ( k (kz k=))) +1,+( k x2,xa) (1)=×(k b( z, k− n .1)],− ( k 1)],) k= = b k1, =2, 1, (4), 2, n . , deviationn .(4) of original data (Sx). The specific equation was given as: where, z (0)(k ) = 0.5 × [x(1) (k) + x where,(k − 1)], k = 1,⋯ 2, , n. x (k) + a × z (k) =where, b (4) ∑(ε(0)(0)(k)-ε)2 2 ∑(x(0)(0)(k)-x)2 2 Next, the basic grey differential equation of GM (1,1) was established as: ∑(ε (k)-ε) ∑(Cx = S(ke/)-Sx ) (12) (11) (1) ⋯ (1) ⋯ (1) Se= (0) ,2 Sx= (0) 2 (12) z (k) = 0.5 × [x (k) +⋯ x (k S−e =1)],∑( kε n=( k1,)-ε )2,, Sx=, n.∑( xn (k)-x) (12) Next, the basic grey(0) differential equation(1) of GM (1,1) was es- ⋯ where,Se = (0) n2 ̅ , Sx= (0) n2 � (12) Next, the basic grey differentiax l (equationk) + a ×of zGM( (1,1)k) = wasb esta blished as: (4) ∑(ε (k)-ε)n ̅ ∑(x (k)-x)n � Next, the basic grey differentiaThetablishedl equation whitening as: of GM differential (1,1) was estaequatioblishedn of as: GM (1,1) represented the first-orderSe= linear, differentialSx̅ = � (12) Next, the basicThe whiteninggrey differentia differentiall equation equatio ofn of GM GM (1,1) (1,1) was represented established the first-order as: linear differential� n � (0)n 2 (0) 2 (0) (1) PP was was calculated calculated as as follows: follows: � ∑(�ε (k)-ε) ∑(x (k)-x) Next, the basic grey differentia(0) l(1) equation of GM (1,1) was established as: ̅ Se⋯= �, Sx= (12) x (k) + a × z (kx) =(1) (bk ) + a × z (k ) =(4) b P was calculated (4) P was as calculatedfollows: as� follows: � n n The whitening differentialequation based equatio(1) on xn of(k GM), given (1,1) as: represented the first-order linear differential ̅ � equation based on x ((0)k), given as: (1) P was calculated(4) as follows: � (0)(0) � x (0)( k) + a × z (1)( k) = b (4) PP =P = wasP{ P{ │calculated│εε (k( k)−as)− follows:ε│ε│<0.6745<0.6745 � SSx}=x}=� m m/n/n (13)(13) (1)Next, the xbasic(k) grey+ a ×differentia z (k(1)) = bl equation (1) of GM (4) (1,1) was esta│blished(0) as: │ (13) (1) dx (1)/d t + ax = b (5)P = P{ ε (k)− ε <0.6745(0) Sx}= m/n (13) equationThe whitening based differentialon x (k), given equatio as:n ofdx GM/dt (1,1)+ ax represented= b the first-order (5) linear (0)differentialP = P{│ε (k)− ε│<0.6745S }= m/n (13) The whitening differential equation of GM (1,1)(0) represented (1)the P = P{│ε (k)− ε│<0.6745│Sx}=(0)(0) m/n │ x (13) The whitening differentialThe whitening equatio differentialn of GM (1,1) equatio representedn of GM the(1,1) first-order represented where,linearwhere, the differential first-order m m was was thelinear the total total differential number number of of cases cases ̅ that that │εε (k(k)−)−εε│<0.6745<0.6745SSx.x . (1) (1) x (k(1)) + a × z (k) = b (4) ̅ (0) (0) first-order(1) linear differential equation based on (k), given as: where, mwhere, was mthe was total the totalnumber number of ofcases│ cases thatthat │ε │(k)− ε│ <0.6745Sx. (1) dx /d t + ax = b c where, m(5) was the total number of cases ̅ that ε̅ (k)− ε <0.6745Sx. equationequation based(1) onbased x on ( kx), (givenk), given as: as: │ (0) │ equationThe based whitening on x (k differential), given as: equation of GM (1,1) representedwhere, mthe was first-order the total<0.6745 linear numberS . differential of cases ̅ that ε (k)− ε <0.6745Sx. The whitening differential equation of GM (1,1) represented the first-order linear c differential ̅ ̅ ̅ where, a where,and (1)b were a and defined(1) b were(1) (1)as defined the model (1)(1)as the parameters model parameters and estimated(5) and estimated using the using ordinary the leastordinary least (1)d x /dt + ax =d bx d x /d /d t t+ + axax =(5)= b b (5) (5) ̅ equation based on x (1)( k), given as: A lower C-value and a higher P-value indicated ̅ a better forecasting result. The combination of P equation based on squarex The(k), (OLS) whiteninggiven method. as: differential The estimation equatio of naA Aandof lower lower GM b using C-value(1,1) C-value the represented AOLS lowerand and method a C-valuea higher higher the wasand first-orderP-value P-value aas higher follows: indicatedP-value indicated linear indicated differential a a better better a better forecas forecas fore-tingting result. result. The The combination combination of of P P square (OLS) method. The estimation of a and b usingA lower the OLS C-value method and was aand ashigher C follows: defined P-value four levels indicated of forecasting a accuracybetter forecas(Table 1). tingStatistical result. analysis The was combination performed of P where, a and b werewhere, defined a and asb were(1)the modeldefined(1) parameters as the model andparameters estimated and usingcasting the result. ordinary The combination least of P and C defined four levels dx (1)/dt + ax(1)(1) = b a A lower C-value (5) and a higher P-value indicated a better forecasting result. The combination of P where, a andestimated b were defined using theas the ordinary modela least (parametersk), Tsquaregiven−1 T (OLS)andas:Tand estimatedand method.−1 C TC defined defined usingThe the four offour o forecastingrdinary levels levels least of accuracyof forecasting forecasting (Table 1). accuracy accuracyStatistical (Tableanalysis (Table was1). 1). perSt Statistical-atistical analysis analysis was was performed performed where, a and b were defined as theequation model parameters baseddx /d ontu +and= x ax estimated= (=Bu =bB ) using= B ( BY theB ) o rdinary B Y (5) least (6) (6)using Predictive Analytics SoftWare (PASW) Statistics version 18.0 for Windows. squarewhere, (OLS)a and bmethod. wereestimation defined The ofestimation asa and the b modelusing ofb the a parameters andOLS methodbb using wasand theand as estimatedOLS Cfollows: defined method using formedfour was thelevels using as o rdinaryfollows: Predictiveof forecasting least Analytics accuracy SoftWare (PASW) (Table Statistics 1). Statistical analysis was performed square (OLS) method. The estimation of a and b usingand the(1)using COLS defined methodPredictive(1) fourwas as levelsAnalytics follows: of forecasting SoftWare (PASW)accuracy Statistics(Table 1). version Statistical 18.0 analysis for Windows. was performed square (OLS) method. The estimation of a and b using the OLS� method� � d wasx /d usingast follows:+ ax Predictive = b version Analytics 18.0 for SoftWaWindows.(5) re (PASW) Statistics version 18.0 for Windows. where, where, a � � T� −1 T using Predictive Analytics SoftWare (PASW) Statistics version 18.0 for Windows. square (OLS) method.a The estimationu= a= (ofB TaB )and−1 BT bY using the OLS (6)(6)method was Hierarchicalas follows: cluster analysis T −1 T u= = (B B) 0B Y using Predictive1 (6) Analytics SoftWare (PASW) Statistics version 18.0 for Windows. where, a and b were udefined= = (B asB) theB Y model b b 0 parameters x (6)(1) 1 and zestimated(1) 1 using the ordinary least where, a and b were defined b as the modelx (1) parametersz (1) and 1estimated usingHierarchical the oHierarchicalrdinary cluster leastanalysiscluster analysis was used to classify the variables according to differences or a T −10 T 1 where, where, �u=� ��=��0 �(B B)x� B�(2)Y� 1� z � �(2) 1(6) where,square where, (OLS) method.� �The� estimationY=bx of(2) aY and=, B= b usingz , B(2) = theHierarchical−Hierarchical 1 OLS method cluster cluster was (7) as analysis similarities analysis follows: (7) [17]. In this study, the reported incidence rates of AIDS, syphilis, and gonorrhea square (OLS) method.where, The estimation a and b were of a defined and⎡ � �b− using as⎤ the ⎡the model �OLS� parameters method⎤ Hierarchical was and as estimatedfollows: cluster analysis using was the usedordinary to classify least the vari- x 0 (1) z 10(1)x 0 ⎡(1) 1� � ⎤ z 11⎡(1)� 1� Hierarchical⎤ cluster analysis x a(1) 0 T −1z 0T (1)1 1 − 1 ables according to differences or similarities [17]. In this study, where, � ��0�⎢x� (n)⎥ ⎢x�1�⎢−(nz)⎥ (Hierarchicaln)⎢ z⎥ (n) 1 ⎥cluster analysis x�0�square(2) (OLS)uz�=1�(2)xa method.=(2) 1(B TB) −1ThezB T(2)Y estimation 1 Hierarchical Hierarchical1 of a (6)and(7) bcluster cluster using analysis theanalysis OLS wa methodwas sused used was to to classify classifyas follows: the the variable variables saccording according to to differences differences or or 7 Y= , B= Yu=�0�b = (B ,⋮ B =B ) −⎢ B� 1 �⋮�Y ⎥ (7)⋮ ⎢ �⋮ � (6) (7)⋮⎥ −x 0 (2)� �⎢ � � ⎥ z� 1�⎢(2)� � 1 Hierarchical⋮⎥ cluster analysis was used to classify the variables according to differences or ⎡ � � ⎤ Y⎡= x� �⎡ b(1) ,⎤ B= ⎡−⎣z (1)⎦ ⎤Hierarchical 1⎣ − cluster⎦ (7)Table analysis 1. Four levelswas ofused forecasting to classify accuracy the of thevariable grey model.s according to differences or ⎢x 0 (n)⎥ ⎢−⎡z �1⎢(�xn)0 ⎣(n⎤)⎥ ⎡⎢⎦−z �1 ⎣�(−n) 1⎥ ⎤asimilaritiessimilarities⎦ T −1 [17]. T[17]. In In this this study, study, the the reported reported incidence incidence rates rates of of AIDS, AIDS, syphilis, syphilis, and and gonorrhea gonorrhea where, � �0�� � 1 �1� u= similarities= (B B) B [17].Y In this study, (6) the reported incidence rates of AIDS, syphilis, and gonorrhea where, With the⎢ Withobtained� �⋮ the⎥ obtainedparameters⎢ �x�⋮0�⎢� �(2) �parameters�⋮ ⋮a⎥ and⎢ −b substitutedz �a⋮1� and(2) b⋮ ⎥substituted 1 intob Eq (5), into the Eq time (5), responsethe time responsefunction offunction GM of GM With the obtainedY=⎢x (parametersn)⎥, B=⎢− a zand (bn substituted) similarities1⎥ into Eq [17]. (5), In(7) this study,Level the reportedC-value incidence ratesP-value of AIDS, syphilis, and gonorrhea ⎣ ⎦ ⎣x−0 �(1)⎣� ⎦ ⎣z−1 �(1)� 1⎦ the time response⎢⎡0� �⋮ function⎥⎤ of⎢ ⎡GM1 (1,1)�⋮� was denoted⋮⎥⎤ as follows: (1,1) was(1,1) denotedwhere, was asdenoted x follows:0(1) as follows:z 1(1)� 1 � � Good (G) C £0.35 P ≥0.95 7 With the obtained Withparameters the obtained a and bparameters substituted�⎢0x� a( n intoand)⎥ Eqb substituted (5),⎢−� 1thez� time( ninto) response Eq⎥ (5), thefunction time responseof GM function of GM 7 x⎣�0�(2) ⎦ ⎣z−�1�(2) 1 1⎦ 7 (1)Y= x (1)(0)(2)⋮ , B=b (0) -az k b(2)⋮b -a 01k b 1 (7) (1,1) was denoted (1,1)as follows: was denoted xˆ asY( k=follows:) =�⎢ [xˆ�x� � ((0)−k) ,= ⎥B [=x]− e⎢(0)−� +�� � ], ekx = +0,(1)⋮ ⎥ 1, , 2,k = 0, , n1, z (8) 2, (1) (7) , n Acceptable 1 (8) (A) (8) 0.35 0.65 P <0.70 ⎢ � �⋮ ⎥ ⎢ �⋮� ⋮⎥ (0) (1) (0) (1) (1) (1) 0 ⋯ 1 xˆ (k+1) = xˆ(1)(k+xˆ1)−(k+xˆ (0)1)(⎣k =) xˆ ( k+ ⎦ 1)− -a xˆk ⋯⎣ − (k ) ⎢ x ⎦ ((9)n ) ⎥ ⎢− z ((9)n) ⎥ (1,1) was denotedxˆ ( k as) = follows: [x (0)−⎣ b ]⎦ e +⎣−b , k = 0, 1, ⎦2, , n 1 (8) With the obtained parameters a and b substituted into⎢ � �⋮Eq ⎥(5), the⎢ time�⋮� response⋮⎥ function of GM With the obtained parameters a anda b substituteda into Eq (5), the time response function of GM (1) (0) b -a k b ⎣ ⎦ ⎣− ⎦ 6 xˆˆ(0)(k) = [x ˆ(0)−(1) ] eˆ (1)+ , k = 0, 1, 2, ⋯, n (8) 6 (1,1) was denoted x as(k+ follows:1) = x (k+1)− x (k) (9) Indexed in: [Current6 Contents/Clinical Medicine] [SCI Expanded] [ISI Alerting System] With the Thisobtained awork is licensed parameters undera Creative Common a and Attribution- b substituted into6 Eq (5), the time response function of GM (1,1) was denoted as follows: [ISI Journals Master List] [Index Medicus/MEDLINE] [EMBASE/Excerpta Medica] NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) 5659 (1)(0) (0) (1) b -a k (1)b ⋯ [Chemical Abstracts/CAS] xˆ (1)xˆ(k)( k+= [1)x (0)=(0)−xˆ (k+b ] 1)−e -a kxˆ+ b(k, )k = 0, 1, 2, , n (9)(8) xˆ (k) =(1,1) [x (0)−was adenoted] e + asa follows:, k = 0, 1, 2, , n (8) a a (0) (1) (1)(1) (0) b ⋯ -a k b 6 xˆ (0)(k+1) = xˆ (1)(k+1)−xˆxˆ (1)((kk)) = [ x (0)− ]⋯ e + , k = 0, 1, (9) 2, , n (8) xˆ (k+1) = xˆ (k+1)− xˆ (k) a a (9) (0) (1) (1) 6 xˆ (k+1) = xˆ (k+1)− xˆ (k) ⋯ (9)

6 6

6 Ye X. et al.: CLINICAL RESEARCH Epidemiology of STDs in China © Med Sci Monit, 2019; 25: 5657-5665

Table 2. Actual and predicted values of the incidence of reported cases of acquired immune deficiency syndrome (AIDS) in China using the GM (1,1) model from 2004 to 2018 (1/100,000).

_ Year k c(0)(k) cˆ(0)(k) e(0)(k) e(0)(k)–e

2004 0 0.23 – – –

2005 1 0.43 0.66 –0.23 0.01

2006 2 0.51 0.79 –0.28 –0.05

2007 3 0.74 0.95 –0.21 0.02

2008 4 0.76 1.14 –0.38 –0.15

2009 5 1 1.37 –0.37 –0.14

2010 6 1.2 1.64 where, –0.44 –0.21 xˆ (0) (k+1) represents the predicted value. 2011 7 1.53 1.97 –0.44 –0.21

2012 8 3.11 2.37 0.74 0.97 After establishing the GM (1,1), its accuracy was examined to extrapolate the predicted values. 2013 9 3.12 2.85 0.27 0.51

2014 10 3.33 3.42 –0.09 0.15 During the error analysis of the grey system, the late error detection method was used to quantify 2015 11 3.69 4.11 –0.42 –0.18 the prediction error and to assess its accuracy. This method used two important indices, the 2016 12 3.97 4.93 –0.96 –0.73 posterior error ratio, termed C, and the small error probability termed P. Between the actual data 2017 13 – 5.92 and the predicted values,– the residual error was– denoted as: 2018 14 – 7.11 – ε(0)(k) = x(0)(k)− xˆ (0)(k),– k = 1, 2, 3, , n (10)

(0) (0) c (k) – actual data of reported incidence of acquired immune deficiency syndrome (AIDS)_ at yeark ; cˆ (k) – predicted value of ⋯ reported AIDS incidence at year k; e(0)(k) – residual error between c(0)(k) and cˆ(0)(k); e(0)(k)–e – deviation from the average of residual where, C was the ratio between the standard deviation of the residual error (S ) and the standard error. e

deviation of original data (Sx). The specific equation was given as:

the reported incidence rates of AIDS, syphilis, and gonorrhea studies was used to predict the reportedC incidence = Se/Sx of AIDS by (11) across the 31 Chinese provinces were introduced into a hier- the followingwhere, equation: archical cluster analysis. Given the significant differences -be (0) 2 (0) 2 (0) ∑(0.183kε (k)-ε) ∑(x (k)-x) Se= , Sx= (12) tween the incidence rates of the three STDs, all data were first cˆ (k+1)=0.66e n n Z-score standardized. The between-group linkage method was ̅ � P was calculated as follows: � � used for the clustering algorithm, and the squared Euclidean The C-value of the GM (1,1) calculated over the 13-year period (0) P = P{│ε (k)− ε│<0.6745Sx}= m/n (13) distance was selected to measure the similarity coefficient. was 0.30, and the grey forecasting accuracy was good. The Sc (0) In the analysis dendrogram, closely connected provinces were value waswhere, 1.34, m and was the total number number of ofcases cases ̅ thatthat │ε (k)− ε│ <0.6745Sx. identified to be more similar than those further apart. Provinces <0.6745S was 11. The P-value of GM (1,1) was 92%, and the c ̅ in the same cluster had similar STD epidemiological charac- grey forecasting accuracy was acceptable. The predicted inci- teristics and provinces between clusters were more variable. dence obtainedA lower using C-value the and GM a higher(1,1) model P-value closely indicated agreed a better with forecas ting result. The combination of P the actualand data, C definedsuggesting four thelevels model of forecasting fitted well. accuracy The results (Table pre 1).- Statistical analysis was performed dicted thatusing the Predictive incidence Analytics of AIDS maintainedSoftWare (PASW) an upward Statistics trend version 18.0 for Windows. Results in China, and the annual reported incidence rates from 2017 and 2018 were 5.92/100,000 and 7.11/100,000, respectively. The predicted incidence of acquired immune deficiency Hierarchical cluster analysis syndrome (AIDS) in China PredictedHierarchical incidence ofcluster gonorrhea analysis inwa Chinas used to classify the variables according to differences or

The reported incidence of acquired immune deficiency syn- The reportedsimilarities incidence [17]. ofIn thisgonorrhea study, the in reported China fromincidence 2004 rates to of AIDS, syphilis, and gonorrhea drome (AIDS) in China from 2004 to 2016 is shown in Table 2 2016 is shown in Table 3 and Figure 2. The forecast equation and Figure 1. The incidence of AIDS continually increased be- of GM (1,1) was applied for data analysis. 7 tween 2004 to 2016. The grey model first order one variable, or GM (1,1), time series forecasting model for epidemiological cˆ(0) (k+1)=12.52e–0.065k

Indexed in: [Current Contents/Clinical Medicine] [SCI Expanded] [ISI Alerting System] This work is licensed under Creative Common Attribution- [ISI Journals Master List] [Index Medicus/MEDLINE] [EMBASE/Excerpta Medica] NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) 5660 [Chemical Abstracts/CAS] Ye X. et al.: Epidemiology of STDs in China CLINICAL RESEARCH © Med Sci Monit, 2019; 25: 5657-5665

8.0 20.0 AIDS reported incidence rate 7.0 Predicted value 18.0 16.0 6.0 14.0 5.0 12.0 4.0 10.0 3.0 8.0 6.0 2.0

ted incidence rate (per 100,000) ted incidence rate (per 100,000) 4.0 1.0 Gonorrhea reported incidence rate 2.0 Predicted value Repor 0.0 Repor 0.0 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 Year Year

Figure 1. Actual and predicted reported incidence of acquired Figure 2. Actual and predicted reported incidence of gonorrhea immune deficiency syndrome (AIDS) in China using in China using the GM (1,1) model, from 2004 to 2018. the GM (1,1) model, from 2004 to 2018.

Table 3. Actual and predicted values of the incidence of reported cases of gonorrhea in China using the GM (1,1) model from 2004 to 2018 (1/100,000).

_ Year k c(0)(k) cˆ(0)(k) e(0)(k) e(0)(k)–e

2004 0 17.34 – – – 2005 1 13.79 12.52 1.27 1.24 where, 2006 2 12.14 11.73 0.41 0.38 xˆ (0) (k+1) represents the predicted value. 2007 3 11.08 10.99 0.09 0.05

2008 4 9.90 10.30 –0.40 –0.43 After establishing the GM (1,1), its accuracy was examined to extrapolate the predicted values. 2009 5 9.02 9.65 –0.63 –0.67

During the error analysis of the2010 grey system, the late6 error detection method7.91 was used to quantify 9.04 –1.13 –1.17 the prediction error and to assess2011 its accuracy. This method7 used two important7.31 indices, the 8.47 –1.16 –1.20 posterior error ratio, termed C, 2012and the small error probabilit8 y termed P.6.82 Between the actual data 7.94 –1.12 –1.16 and the predicted values, the residual2013 error was denoted9 as: 7.36 7.44 –0.08 –0.12

(0) (0) (0) ε (k) = x2014(k)− xˆ (k), k = 1, 102, 3, , n 7.05 (10) 6.97 0.08 0.04

2015 11 ⋯ 7.36 6.53 0.83 0.79 where, C was the ratio between2016 the standard deviation12 of the residual error8.39 (Se) and the standard 6.12 2.27 2.23 deviation of original data (S ). The specific equation was given as: x 2017 13 – 5.74 – – C = Se/Sx (11) 2018 14 – 5.38 – – where, c(0)(k) – actual data of reported incidence of gonorrhea at year k; cˆ(0)(k) – predicted value of reported incidence of gonorrhea at year k; (0) 2 (0) 2 _ (0) ∑(ε (k)-ε) ∑(x (k)-x) (0) (0) (0) e (Ske)= – residual ,error Sx= between c (k ) and cˆ ((12)k); e (k)–e – deviation from the average of residual error. n n ̅ � P was calculated as follows: � � The C-value of the GM (1,1) model was 0.33 and the grey fore- Predicted incidence of syphilis in China (0) P = P{│ε (k)− ε│<0.6745Sx}= m/n (13) casting accuracy was good. The Sc value was 3.04, and the │ (0) │ where, m was the totalnumber number of of casescases ̅ that ε (k)− ε <0.6745SSxc. was 11. Therefore, The reported incidence of syphilis in China from 2004 to 2016 the P-value of GM (1,1) was 92%, and the grey forecasting is shown in Table 4 and Figure 3. The forecast equation of GM ̅ accuracy was acceptable. The results predicted that the inci- (1,1) was applied to analyze the data. A lower C-value and adence higher of P-value gonorrhea indicated followed a better a decreasingforecasting result. trend The in China, combination and of P and C defined four levelsthe ofannual forecasting reported accuracy incidence (Table rates 1). St atisticalfrom 2017 analysis to 2018 was performedwere cˆ(0) (k+1)=15.56e0.078k using Predictive Analytics5.74/100,000 SoftWare (PASW)and 5.38/100,000, Statistics version respectively. 18.0 for Windows.

Hierarchical cluster analysis Indexed in: [Current Contents/Clinical Medicine] [SCI Expanded] [ISI Alerting System] This work is licensed under Creative Common Attribution- Hierarchical cluster analysis was used to classify the variables according to differences or [ISI Journals Master List] [Index Medicus/MEDLINE] [EMBASE/Excerpta Medica] NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) 5661 [Chemical Abstracts/CAS] similarities [17]. In this study, the reported incidence rates of AIDS, syphilis, and gonorrhea

7 Ye X. et al.: CLINICAL RESEARCH Epidemiology of STDs in China © Med Sci Monit, 2019; 25: 5657-5665

Table 4. Actual and predicted values of the incidence of reported cases of syphilis in China using the GM (1,1) model from 2004 to 2018 (1/100,000).

_ Year k c(0)(k) cˆ(0)(k) e(0)(k) e(0)(k)–e

2004 0 7.12 – – –

2005 1 9.67 15.56 –5.89 –5.49

2006 2 12.80 16.83 –4.03 –3.62

2007 3 15.88 18.19 –2.31 –1.91

2008 4 19.49 19.67 –0.18 0.23

2009 5 23.07 21.26 1.81 2.21

2010 6 26.86 22.99 3.87 4.28

2011 7 29.47 24.85 4.62 5.02

2012 8 30.44 26.87 3.57 3.98

2013 9 30.04 29.05 0.99 1.40

2014 10 30.93 31.41 –0.48 –0.07

2015 11 31.85 33.95 –2.10 –1.70 where, 2016 12 31.97 36.71 –4.74 –4.33

xˆ (0) (k+1) represents2017 the predicted13 value. – 39.69 – –

2018 14 – 42.91 – –

After establishing the GM(0) (1,1), its accuracy was examined to extrapolate the predicted values.(0) c (k) – actual data of reported incidence of syphilis at _year k; cˆ (k) – predicted value of reported incidence of syphilis at year k; e(0)(k) – residual error between c(0)(k) and cˆ(0)(k); e(0)(k)–e – deviation from the average of residual error. During the error analysis of the grey system, the late error detection method was used to quantify The reported incidence of the three STDs in China in 2016 the prediction error and to assess45.0 its Syphilisaccuracy. reported This incidence method rate used two important indices, the 40.0 Predicted value posterior error ratio, termed C, and the small error probability termed P. Between the actual dataThe reported incidence of AIDS, syphilis, and gonorrhea across 35.0 and the predicted values, the30.0 residual error was denoted as: the 31 provinces in 2016 are shown in Table 5. Guangxi, Yunnan, 25.0 Sichuan, and Chongqing reported a higher incidence of AIDS. ε(0)(k) = x(0)(k)− xˆ (0)(k), k = 1, 2, 3, , n (10) 20.0 Areas with a high incidence of gonorrhea included Zhejiang, 15.0 ⋯ Shanghai, Hainan, Guangdong, and Fujian. Sinkiang, Zhejiang,

ted incidence rate (per 100,000) 10.0 Fujian, and Shanghai had a higher reported incidence of syph- where, C was the ratio between the standard deviation of the residual error (Se) and the standard 5.0 ilis than other provinces. Repor deviation of original data (Sx).0.0 The specific equation was given as: 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 C = Se/Sx Y(11)ear Cluster analysis of the three STDs in China in 2016 where, Figure 3. Actual and predicted reported incidence of syphilis in The dendrogram (Figure 4) shows the cluster analysis results ∑(ε(0)(k)-ε)2 ∑(x(0)(k)-x)2 Se= China using, Sx= the GM (1,1) model, from (12) 2004 to 2018. n n of the incidence of STDs across the provinces of China in 2016. ̅ � The 31 provinces were classified into four groups according to P was calculated as follows: � � The C-value of the GM (1,1) model was 0.39, and the grey fore- the degree of similarity in reported incidence. In combination (0) P = P{│ε (k)− ε│<0.6745Sx}= m/n (13) casting accuracy was acceptable. The Sc value was 8.68, and the with the results of Table 5, the classification results showed │ (0) │ where, m was the totalnumber number of of casescases ̅ that ε (k)− ε <0.6745SSxc. was 12. Therefore, that group A (Sinkiang) had the highest reported incidence of the P-value of the GM (1,1) model was 100%, indicating good syphilis. Group B (Fujian, Hainan, Guangdong, Zhejiang, and ̅ forecasting accuracy. The results predicted that the incidence Shanghai) had the highest incidence of gonorrhea. Group C A lower C-value and aof higher syphilis P-value would indicated increase a better in China, forecas andting theresult. annual The combination reported of P(Chongqing, Guizhou, Guangxi, Yunnan, and Sichuan) had the and C defined four levelsincidence of forecasting rates from accuracy 2017 (Table to 2018 1). St wereatistical 39.69/100,000 analysis was performed and highest incidence of AIDS. using Predictive Analytics42.91/100,000, SoftWare (PASW) respectively. Statistics version 18.0 for Windows.

Hierarchical cluster analysis Hierarchical cluster analysis was used to classify the variables according to differences or Indexed in: [Current Contents/Clinical Medicine] [SCI Expanded] [ISI Alerting System] This work is licensed under Creative Common Attribution- [ISI Journals Master List] [Index Medicus/MEDLINE] [EMBASE/Excerpta Medica] similarities [17]. In this study, NonCommercial-NoDerivativesthe reported incidence rates4.0 International of AIDS, (CC syphilis, BY-NC-ND and4.0) gonorrhea5662 [Chemical Abstracts/CAS]

7 Ye X. et al.: Epidemiology of STDs in China CLINICAL RESEARCH © Med Sci Monit, 2019; 25: 5657-5665

Table 5. The reported incidence of acquired immune deficiency syndrome (AIDS), gonorrhea, and syphilis in China in 2016 (1/100,000).

Region AIDS Gonorrhea Syphilis Beijing 3.62 6.58 22.92 Tianjin 1.79 2.06 16.6 Hebei 0.79 1.91 12.88 Shanxi 1.45 3.57 25.3 Inner Mongolia 1.14 8.91 38.12 Liaoning 2.31 5.85 37.41 Jilin 1.99 4.97 18.36 Heilongjiang 1.48 4.01 24.3 Shanghai 2.27 28.03 57.07 Jiangsu 2.02 9.03 29.7 Zhejiang 3.37 32.75 62.12 Anhui 2.17 6.05 37.14 Fujian 2.43 15.42 58.51 Jiangxi 3.01 8 28.7 Shandong 0.74 4.15 15.54 Henan 3.17 3.13 15.44 Hubei 2.18 4.74 21.67 Hunan 4.15 4.82 33.43 Guangdong 3.8 19.8 48.73 Guangxi 12.48 10.23 15.41 Hainan 1.92 20.36 50.69 Chongqing 10.2 6.79 54.86 Sichuan 11.16 3.56 28.35 Guizhou 7.42 6.41 30.37 Yunnan 12.04 8.64 35.96 Tibet 1.57 1.39 35.25 Shaanxi 2.13 4.53 26.09 Gansu 1.55 3.24 18.61 Qinghai 2.97 2.52 48.47 Ningxia 1.26 4.72 56.7 Sinkiang 8.14 7.29 89.05

Discussion who have sex with men (MSM), and female sex workers (FSWs) and their clients [18,19]. Of the newly reported annual cases of This study aimed to predict the incidence of acquired immune HIV/AIDS, cases transmitted by homosexual sex increased from deficiency syndrome (AIDS) due to human immune deficiency 2.5% in 2006 to 27.6% in 2016 [20,21]. Previous studies have virus (HIV) infection, syphilis, and gonorrhea using data reported also shown that that rural-to-urban migration, social stigma, from 2004 to 2016, and to classify the regional distribution of and lack of healthcare-seeking behavior expand the spread of these sexually transmitted diseases (STDs) in China. The in- HIV/AIDS and syphilis [3–5,18]. Also, HIV and syphilis can be cidence rates of AIDS and syphilis may increase for a further co-transmitted, which potentially further drives the growth of two years, indicating that AIDS had reached epidemic levels in the epidemic [22]. There is an urgent need for the promotion China. The factors associated with the increasing trend include of interventions such as condom use, voluntary counseling and engagement in unsafe sexual practices, especially among men testing, and health education regarding safe sex and STDs to

Indexed in: [Current Contents/Clinical Medicine] [SCI Expanded] [ISI Alerting System] This work is licensed under Creative Common Attribution- [ISI Journals Master List] [Index Medicus/MEDLINE] [EMBASE/Excerpta Medica] NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) 5663 [Chemical Abstracts/CAS] Ye X. et al.: CLINICAL RESEARCH Epidemiology of STDs in China © Med Sci Monit, 2019; 25: 5657-5665

Based on this information, the government should strengthen 0 5 10 15 20 25 anti-drug law enforcement to control drug use as an effective Liaoning 6 response to the HIV/AIDS epidemic in Southwest areas [27]. Anhui 12 Inner Mongolia 5 Jiangsu 10 Group B included provinces with a high reported incidence of Jiangxi 14 gonorrhea, which were mainly located in the Southeast coast- Beijing 1 Hunan 18 line of China that has a developed economy and mobile popula- Tibet 26 tion. Previous studies identified high-incidence clusters of gon- Jilin 7 Hubei 17 orrhea that were primarily distributed in the eastern Yangtze Shanxi 4 River Delta (Zhejiang and Shanghai) and southern Pearl River Heilongjiang 8 Delta (Guangdong), which are the most developed coastal areas Shaanxi 27 D Tianjin 2 of China [28]. This regional pattern was associated with mass Gansu 28 rural-to-urban migration [14]. A previously published system- Hebei 3 Shandong 15 atic review showed that the prevalence of gonorrhea amongst Henan 16 migrants was 13.6 greater (range, 5.8–32.1) than the odds of Qinghai 29 infection amongst the general Chinese population [4]. Several Ningxia 30 Sichuan 23 migrants who were unmarried or away from their spouses have Yunnan 25 flocked to wealthy cities for better , with increased Guangxi 20 C Guizhou 24 rates of unsafe sexual behaviors including unprotected sex and Chongqing 22 multiple sexual partners [18,28]. Furthermore, the high rates Shanghai 9 B of migration have facilitated the spread of gonorrhea across Zhejiang 11 Guangdong 19 prosperous regions in China. Therefore, gonorrhea control mea- Hainan 21 sures including health education, the promotion of condom use, Fujian 13 A Sinkiang 31 and early screening and treatment should be increased in the Southeast provinces and targeted to key groups including mi- grant workers and female sex workers [14,28]. Figure 4. Hierarchical cluster analysis dendrogram of the reported incidence of the three sexually transmitted Group A included the Sinkiang province, which had the high- diseases (STDs), acquired immune deficiency syndrome est reported incidence of syphilis. This province is located in (AIDS), gonorrhea, and syphilis in the 31 provinces of northwest China and borders many Asian countries. Regional mainland China in 2016. or spatial analysis studies have also indicated that some re- gions at high risk of syphilis were present in less developed control HIV/AIDS and syphilis [18,23]. Since syphilis and HIV areas and inland areas, including Sinkiang [15]. Approximately share similar modes of transmission and similar risk behaviors, half of the syphilis cases in Sinkiang international ports origi- their control interventions should be combined [22]. nated from ethnic minorities, particularly the Uyghurs, account- ing for 37.88% of the population [29]. In remote areas near Cluster analysis classified the provinces into 4 groups based the Chinese border, the spread of syphilis may be associated on the degree of similarities of the characteristics of the inci- with different cultures and customs, inadequate sexual knowl- dence of STDs. The provinces in each group were geographi- edge, and high-risk behaviors among a multi-ethnic popula- cally adjacent. Group C included Chongqing, Guizhou, Yunnan, tion [3,29]. Therefore, targeted interventions should consider and Sichuan in southwest China that reported the highest in- ethnic minorities, and should include bilingual health educa- cidence of AIDS. Provinces in the Southwest border of China tion promotion pamphlets both in the Chinese Han language are adjacent to the Golden Triangle Drug Trade in Southeast and in ethnic minority languages [29,30]. Also, the quality and Asia. Yunnan is renowned for drug trafficking and drug con- allocation of medical resources should be optimized in distant sumption, which may explain the high rates of HIV/AIDS [24]. inland areas to reduce the incidence of syphilis effectively [15]. HIV infection in China among injecting drug users was identi- fied in Yunnan in 1989. The transmission rates of HIV among This study had several limitations. In contrast to nationwide injecting drug users residing near major routes of drug-traf- time trend analysis for 13 years, and for provincial analysis, ficking in Guangxi, Sichuan and other provinces increased rap- this study analyzed only incidence information over one year idly [25]. It has previously been suggested that injecting syn- at the provincial level. In future studies, time trends and the thetic drugs, such as ketamine and ecstasy, increases high-risk geographic distribution of incidence at the sub-provincial level sexual practices, and facilitated the spread of HIV to spread should be assessed to identify the diversity of the epidemiol- from intravenous drug users to their sexual partners [24,26]. ogy of STDs in China in more detail.

Indexed in: [Current Contents/Clinical Medicine] [SCI Expanded] [ISI Alerting System] This work is licensed under Creative Common Attribution- [ISI Journals Master List] [Index Medicus/MEDLINE] [EMBASE/Excerpta Medica] NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) 5664 [Chemical Abstracts/CAS] Ye X. et al.: Epidemiology of STDs in China CLINICAL RESEARCH © Med Sci Monit, 2019; 25: 5657-5665

Conclusions provinces, gonorrhea in the Southeast coastline provinces, and syphilis in Sinkiang province of northwest China. This study showed that the GM (1,1) model was an effective method to predict the incidence of sexually transmitted dis- Acknowledgments eases (STDs). The incidence rates of acquired immune defi- ciency syndrome (AIDS) and syphilis in China are predicted to We gratefully acknowledge Haiqiang Guo for the practical ad- continue to rise, and the rates of STD were unequally distrib- vice on analysis of the data. uted across mainland China. Therefore, control and prevention programs should consider the geographic distribution of STDs. Conflict of interest Specifically, intervention programs must acknowledge the ep- idemiology and increasing incidence of AIDS in the Southwest None.

References:

1. Vlieg WL, Fanoy EB, van Asten L et al: Comparing national infectious dis- 16. National Health and Family Planning Commission of the Peoples Republic ease surveillance systems: China and the Netherlands. BMC Public Health, of China. China health statistical yearbook Available from: URL: http://www. 2017; 17: 415 nhfpc.gov.cn/zwgkzt/tjnj/list.shtml 2. Yang SG, Wu J, Ding C et al: Epidemiological features of and changes in in- 17. Qian SS, Guo W, Xing JN et al: Diversity of HIV/AIDS epidemic in China: A cidence of infectious diseases in China in the first decade after the SARS result from hierarchical clustering analysis and spatial autocorrelation anal- outbreak: an observational trend study. Lancet Infect Dis, 2017; 17: 716–25 ysis. AIDS, 2014; 28: 1805–13 3. Lin CC, Gao X, Chen XS, Chen Q, Cohen MS: China’s syphilis epidemic: A 18. Zhao YP, Luo TY, Tucker JD, Wong WC: Risk factors of HIV and other sexu- systematic review of seroprevalence studies. Sex Transm Dis, 2006; 33: ally transmitted infections in China: A systematic review of reviews. PLoS 726–36 One, 2015; 10: e0140426 4. Zou X, Chow EP, Zhao PC et al: Rural-to-urban migrants are at high risk of 19. Tucker JD, Chen XS, Peeling RW: Syphilis and social upheaval in China. N sexually transmitted and viral infections in China: A systematic Engl J Med, 2010; 362: 1658–61 review and meta-analysis. BMC Infect Dis, 2014; 14: 490 20. National Health and Family Planning Commission of the Peoples Republic 5. Xu JJ, Yu YQ, Hu QH et al: Treatment-seeking behaviour and barriers to ser- of China. 2015 China AIDS Response Progress Report. Available from: URL: vice access for sexually transmitted diseases among men who have sex http://www.aidsdatahub.org/sites/default/files/publication/China_narra- with men in China: A multicentre cross-sectional survey. Infect Dis , tive_report_2015.pdf 2017; 6: 15 21. NCAIDS, NCSTD, China CDC. Update on the AIDS/STD epidemic in China in 6. Zhang XY, Liu YY, Yang M et al: Comparative study of four time series meth- December, 2016. Chin J AIDS STD, 2017; 23: 93 ods in forecasting typhoid fever incidence in China. PLoS One, 2013; 8: 22. Li YB: Epidemiological characteristics of HIV/AIDS combined with syphilis e63116 infection. Occup and Health, 2017; 33: 1665–67 7. Yang XB, Zou JJ, Kong DG, Jiang GF: The analysis of GM (1,1) grey model to 23. Du MR, Zhao J, Zhang JX et al: Depression and social support mediate the predict the incidence trend of typhoid and paratyphoid fevers in Wuhan effect of HIV self-stigma on condom use intentions among Chinese HIV- City, China. Medicine, 2018; 97: e11787 infected men who have sex with men. AIDS Care, 2018; 30: 1197–206 8. Wang YW, Shen ZZ, Jiang Y: Comparison of ARIMA and GM (1,1) models for 24. Huang MB, Ye L, Liang BY et al: Characterizing the HIV/AIDS Epidemic in prediction of in China. PLoS One, 2018; 13: e0201987 the United States and China. Int J Environ Res Public Health, 2016; 13: 30 9. Kane MJ, Price N, Scotch M, Rabinowitz P: Comparison of ARIMA and Random 25. Wang L, Guo W, Li DM et al: HIV epidemic among drug users in China: 1995– Forest time series models for prediction of H5N1 outbreaks. 2011. Addiction, 2015; 110(Suppl. 1): 20–28 BMC Bioinformatics, 2014; 15: 276 26. Ritchwood TD, Ford H, DeCoster J et al: Risky sexual behavior and substance 10. Zhang LP, Wang L, Zheng YL et al: Time prediction models for echinococ- use among adolescents: A meta-analysis. Child Youth Serv Rev, 2015; 52: cosis based on gray system theory and epidemic dynamics. Int J Environ 74–88 Res Public Health, 2017; 14: 262 27. Ma Y, Du CH, Cai T et al: Barriers to community-based drug dependence 11. Deng JL: Introduction to grey system theory. J Grey Syst, 1989; 1: 1–24 treatment: implications for police roles, collaborations and performance in- 12. Yin MS: Fifteen years of grey system theory research: A historical review dicators. J Int AIDS Soc 2016;19(4 Suppl3): 20879. and bibliometric analysis. Expert Syst Appl, 2013; 40: 2767–75 28. Cao WT, Li R, Ying JY et al: Spatiotemporal distribution and determinants 13. Law DC, Serre ML, Christakos G et al: Spatial analysis and mapping of sex- of gonorrhea infections in mainland China: A panel data analysis. Public ually transmitted diseases to optimise intervention and prevention strat- Health, 2018; 162: 82–90 egies. Sex Transm Infect, 2004; 80: 294–99 29. Gulinuer M, Zhu L, Wang LB, Zhou XB: Epidemiological study of syphilis in- 14. Yin F, Feng ZJ, Li XS: Spatial analysis of county-based gonorrhoea incidence fection in Xinjiang international port from 2001 to 2010. Chin J AIDS STD, in mainland China, from 2004 to 2009. Sex Health, 2012; 9: 227–32 2013; 19: 138–39 15. Yin F, Feng Z, Li X: Spatial analysis of primary and secondary syphilis inci- 30. Liao KJ, Zhang SK, Liu M et al: Seroepidemiology of syphilis infection among dence in China, 2004–2010. Int J STD AIDS, 2012; 23: 870–75 2 million reproductive-age women in rural China: A population-based, cross- sectional study. Chin Med J (Engl), 2017; 130: 2198–204

Indexed in: [Current Contents/Clinical Medicine] [SCI Expanded] [ISI Alerting System] This work is licensed under Creative Common Attribution- [ISI Journals Master List] [Index Medicus/MEDLINE] [EMBASE/Excerpta Medica] NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) 5665 [Chemical Abstracts/CAS]