
International Journal of Medical Informatics 53 (1999) 239–252

Measurement and classification of retinal vascular tortuosity

William E. Hart a,*, Michael Goldbaum b, Brad Côté c, Paul Kube d, Mark R. Nelson e

a Department of Applied and Numerical Mathematics, Sandia National Laboratories, Albuquerque, NM 87185, USA
b Department of Ophthalmology, University of California, San Diego, CA, USA
c The Registry, Inc., San Diego, CA, USA
d Department of Computer Science and Engineering, University of California, San Diego, CA, USA
e Data Vector, San Diego, CA, USA

Abstract

Automatic measurement of blood vessel tortuosity is a useful capability for automatic ophthalmological diagnostic tools. We describe a suite of automated tortuosity measures for blood vessel segments extracted from RGB retinal images. The tortuosity measures were evaluated in two classification tasks: (1) classifying the tortuosity of blood vessel segments and (2) classifying the tortuosity of blood vessel networks. These tortuosity measures achieved a classification rate of 91% on the first problem and 95% on the second, which confirms that they capture much of the ophthalmologists’ notion of tortuosity. Finally, we discuss how the accuracy of these measures can be influenced by the method used to extract the blood vessel segments. © 1999 Elsevier Science Ireland Ltd. All rights reserved.

Keywords: Tortuosity; Retina; Blood vessel; Automated measurement

* Corresponding author. E-mail: [email protected].

1386-5056/99/$ - see front matter © 1999 Elsevier Science Ireland Ltd. All rights reserved. PII: S1386-5056(98)00163-4

1. Introduction

Normal retinal blood vessels are straight or gently curved. In some diseases, the blood vessels become tortuous, i.e. they become dilated and take on a serpentine path. The dilation is caused by radial stretching of the blood vessel and the serpentine path occurs because of longitudinal stretching. The tortuosity may be focal, occurring only in a small region of retinal blood vessels, or it may involve the entire retinal vascular tree. Fig. 1 shows images with tortuous and non-tortuous blood vessels.

Fig. 1. Images with (a) tortuous and (b) non-tortuous blood vessel segments.

Many disease classes produce tortuosity, including high blood flow, angiogenesis and blood vessel congestion. High blood flow may occur locally in arteriovenous anastomosis or systematically in high cardiac output such as accompanies anemia. In response to ischemia or inflammation, new blood vessels grow, often in a destructive manner. The angiogenic process commonly causes dilation and tortuosity of retinal blood vessels prior to the onset of the neovascularization. Venous congestion can result from the occlusion of the central retinal vein or a branch retinal vein. The resulting elevated intravascular pressure results in tortuosity and dilation in the blocked vein.

Information about disease severity or change of disease with time may be inferred by measuring the tortuosity of the blood vessel network. Consequently, there is a benefit in measuring tortuosity in a consistent, repeatable fashion. Vascular tortuosity measurements are commonly performed by human observers using a qualitative scale (e.g. mild, moderate, severe and extreme). Variability in grading can result because the boundaries between the grades may differ between observers and with the same observer at different times. Furthermore, measurements on such a gross scale make changes in vascular tortuosity difficult to discern.

A quantitative tortuosity measurement was first described by Lotmar, Freiburghaus and Bracher [1] and slightly extended by Bracher [2]. This measurement involves the manual selection of points on fundus photographs that subdivide the tortuous vessel into a series of single arcs. Fig. 2 illustrates the measurements they use to calculate the vessel’s tortuosity. Their method decomposes the vessel into a series of circular arcs, for which the chord lengths $l_i$ and arrow heights $h_i$ are measured. The tortuosity is measured as the relative length variation

$$ \frac{L - l}{l} \approx \frac{8}{3} \sum_i \left( \frac{h_i}{l_i} \right)^2, $$

where $L$ is the length of the blood vessel and $l$ is the chord length. The approximation is derived using a sinusoidal model of a blood vessel. This method was used by Kylstra et al. [3] to measure the effects of transmural pressure on the tortuosity of latex tubing.

Fig. 2. Illustration of the subdivision used to estimate the blood vessel tortuosity with the relative length variation.

Given the increased availability of digitized fundus photographs, automated tortuosity measurements are now feasible. Kaupp et al. [4] have reported unpublished results of an automated tortuosity measurement that uses a Fourier analysis of the perpendicular along the blood vessel. Smedby et al. [5] describe five tortuosity measures used to measure tortuosity in femoral arteries. Included are several measures of the curvature along the blood vessel, the number of inflection points of the vessel and the fraction of the vessel that has high curvature. Their experiments examine properties of these measures like reproducibility and scalability. Capowski, Kylstra and Freedmen [6] describe a measure of blood vessel tortuosity based on spatial frequencies. Zhou et al. [7] have also described a method for distinguishing tortuous and nontortuous blood vessels in angiograms. In a preliminary abstract, we have proposed a tortuosity measure based on the integral curvature along a blood vessel [8].

In this paper we describe tortuosity measures that are used to measure the tortuosity of retinal blood vessels as well as the retinal blood vessel network. We motivate our tortuosity measures by analyzing abstract properties of measures based on medical intuitions of tortuosity. To assess the relative utility of these measures, they were used to classify blood vessel segments and blood vessel networks. The segments used in these classification experiments were extracted manually and automatically and we discuss how the tortuosity measures can be influenced by properties of the method used to extract the vessel segments. The classification rate was as high as 91.5% for blood vessel segments and 95% for blood vessel networks. While no single tortuosity measure was clearly superior, the experiments recommend a total squared measure.

2. Tortuosity measures

2.1. Abstract properties

We begin by discussing several properties of tortuosity measures that are motivated by the ophthalmologist’s notion of tortuosity. A tortuosity measure that has many of the properties described below should provide good predictive performance in our classification tasks, thereby matching an ophthalmologist’s measure of tortuosity. We will discuss the following properties of tortuosity measures: invariance to translation and rotation, response to scaling and compositionality of vessels and vessel networks. These properties are defined for parametrized differentiable curves $C = (x(t), y(t))$, with $t$ in an interval $[t_0, t_1]$. We model retinal blood vessels with such curves. A tortuosity measure $\tau$ takes a curve $C$ as its argument and returns a real number.

2.1.1. Invariance to translation and rotation

Suppose a vessel with tortuosity $\tau$ were moved on the retina without changing its shape or size. What should be the tortuosity of the resulting vessel?

While ophthalmologists’ judgments of vessel tortuosity do seem to incorporate the relative curvature of other vessels on the fundus, we believe their judgment is largely independent of the location or orientation of the vessels. Our measures assume that vessel tortuosity is not dependent on the location or orientation of the vessel.

2.1.2. Response to scaling

Suppose a vessel with tortuosity $\tau$ was enlarged uniformly by some scale factor $k > 1$. What should be the tortuosity of the scaled vessel?

Ophthalmologists do not seem to have unequivocal intuitions about this question, though they believe that a vessel’s tortuosity should be roughly invariant to scaling. However, it is clear that if scale does affect the tortuosity then it does so in a multiplicative manner. That is, if curve $C = (x(t), y(t))$ has tortuosity $\tau(C)$, curve $C' = (kx(t), ky(t))$ has tortuosity $\beta(k) \cdot \tau(C)$ for some function $\beta$. If tortuosity is invariant to scaling, $\beta(k) \equiv 1$; and if tortuosity is inversely related to scale, $\beta(k) < 1$ when $k > 1$. We examine tortuosity measures that exhibit various tortuosity scale factors $\beta(k)$ and investigate their effect on classification performance.

2.1.3. Vessel compositionality

Suppose a vessel were composed of two vessel segments $C_1$ and $C_2$, one with tortuosity $\tau(C_1)$ and the other with tortuosity $\tau(C_2)$. What should be the tortuosity of the whole vessel?

It appears to be the intuition of ophthalmologists that a vessel that is composed of two segments of different tortuosities would have a degree of tortuosity between the tortuosities of its constituent segments. This assumes that the two segments are smoothly connected, but the ‘order’ in which segments occur in the vessel is not important. It also appears that if a vessel segment with a given tortuosity were extended with a segment of the same tortuosity, the resulting vessel would have the same tortuosity as either of its segments. That is, a vessel doesn’t increase (or decrease) its tortuosity by having the same serpentine pattern extended at greater length.

We formalize these intuitions as follows. We write $C = C_1 \oplus C_2$ and say curve $C$ has constituent segments $C_1$ and $C_2$ whenever $C = (x(t), y(t))$ with $t \in [t_0, t_1]$, $C_1 = (x(t), y(t))$ with $t \in [t_0, t_2]$ and $C_2 = (x(t), y(t))$ with $t \in [t_2, t_1]$, for some $t_2 \in [t_0, t_1]$. Now suppose that $\tau(C_1) \le \tau(C_2)$. We say that a tortuosity measure has the property of compositionality when, for any such $C_1$, $C_2$ and $C$,

$$ \tau(C_1) \le \tau(C) \le \tau(C_2) \qquad (1) $$

with the equality holding if and only if $\tau(C_1) = \tau(C_2)$.

The property of compositionality provides constraints on how to compute the tortuosity of a vessel, given the tortuosities of its constituent segments. This computation will be necessary in an automated system that extracts vessel segments and not whole vessels from fundus images. A method of computing the tortuosity of a whole vessel that is consistent with Eq. (1) is to weight the tortuosity of each constituent segment by the fraction of the arc length that the segment contributes to the vessel. That is,

$$ \tau(C_1 \oplus C_2) = \frac{s(C_1)\,\tau(C_1) + s(C_2)\,\tau(C_2)}{s(C_1 \oplus C_2)}, $$

where $s(C)$ is the arc length of the curve $C$ (see below). We call this method weighted additivity.

On the other hand, computing the tortuosity of a vessel by averaging the tortuosities of its constituent segments is inconsistent with Eq. (1). It follows from Eq. (1), together with the assumption that the tortuosity of a vessel is independent of how it is segmented, that the tortuosity of a vessel cannot be a function only of the tortuosities of its constituent segments. Let $\tau(C_1) = \tau(C_2) < \tau(C_3) = \tau(C_4)$ and consider the curves $C_1 \oplus C_2 \oplus C_3$ and $C_2 \oplus C_3 \oplus C_4$. If we segment these curves as $C_1 \oplus (C_2 \oplus C_3)$ and $(C_2 \oplus C_3) \oplus C_4$, then the tortuosity measures for these two curves are different. However, if these curves are segmented as $(C_1 \oplus C_2) \oplus C_3$ and $C_2 \oplus (C_3 \oplus C_4)$, the tortuosities of the constituent segments of the two vessels are the same, viz. $\tau(C_1)$ and $\tau(C_3)$.

Finally, we note that the property of compositionality implies that extending a vessel segment ‘in the same way’ does not affect its tortuosity. The vessel segment which constitutes the extension would have the same tortuosity as the vessel segment, so the tortuosity of the extended vessel remains the same. Some of the proposed tortuosity measures will be invariant to some types of extensions without strictly satisfying the compositionality property. We say that a measure has the property of chord-colinear compositionality if, whenever a vessel $C$ is segmented such that each segment has the same tortuosity and the chords of the segments are colinear, the tortuosity of the vessel is the same as that of the constituent segments. A tortuosity measure with this property is invariant to chord-colinear extensions. Fig. 3 shows two curves that will have the same tortuosity for a measure that has chord-colinear compositionality. This is an interesting property since retinal blood vessels are often roughly periodic along a straight line. Thus they can be partitioned into segments whose chords are colinear and whose tortuosity is roughly equivalent.

Fig. 3. Two curves with the same tortuosity for a measure with chord-colinear compositionality.

2.1.4. Network compositionality

Another type of compositionality concerns the way in which the tortuosity measures for vessel segments are combined to determine a tortuosity measure for an entire vessel network. This type of compositionality might differ from vessel compositionality because the method for combining the tortuosity values from vessel segments may depend on whether they are part of the same vessel. We have yet to develop a method for extracting a complete vessel network, so we calculate the tortuosity of a blood vessel network using the weighted additivity of all of the blood vessel segments in the image.

2.2. Definitions

Using our previous definition of a curve segment $C$, we define a suite of tortuosity measures. We begin by defining the components used to construct these measures. These measures are defined for a curve $C = (x(t), y(t))$ on the interval $[t_0, t_1]$.

2.2.1. Arc length

The arc length of $C$ is

$$ s(C) = \int_{t_0}^{t_1} \sqrt{x'(t)^2 + y'(t)^2} \, dt. $$

2.2.2. Chord length

The chord length of $C$ is

$$ \mathrm{chord}(C) = \sqrt{(x(t_1) - x(t_0))^2 + (y(t_1) - y(t_0))^2}. $$

2.2.3. Total curvature

The curvature of $C$ at $t$ is

$$ \kappa(t) = \frac{x'(t)\,y''(t) - x''(t)\,y'(t)}{[y'(t)^2 + x'(t)^2]^{3/2}}, \qquad (2) $$

and the total curvature of a curve segment $C$ is

$$ \mathrm{tc}(C) = \int_{t_0}^{t_1} |\kappa(t)| \, dt. $$

2.2.4. Total squared curvature

The total squared curvature of $C$ is

$$ \mathrm{tsc}(C) = \int_{t_0}^{t_1} \kappa(t)^2 \, dt. $$

Table 1 defines the measures that we examined in our experiments and describes their properties. These measures have zero measure for straight vessel segments and increasing positive measure for segments as they become tortuous. The variety of tortuosity measures allows us to compare measures to determine the relative importance of the tortuosity properties described previously.

Table 1. Summary of the tortuosity measures and their properties

Measure    Definition                           Response to scale   Chord-colinear compositionality   Compositionality
$\tau_1$   $s(C)/\mathrm{chord}(C) - 1$         $1$                 yes                               no
$\tau_2$   $\mathrm{tc}(C)$                     $1/k$               no                                no
$\tau_3$   $\mathrm{tsc}(C)$                    $1/k^2$             no                                no
$\tau_4$   $\mathrm{tc}(C)/s(C)$                $1/k^2$             yes                               yes
$\tau_5$   $\mathrm{tsc}(C)/s(C)$               $1/k^3$             yes                               yes
$\tau_6$   $\mathrm{tc}(C)/\mathrm{chord}(C)$   $1/k^2$             yes                               no
$\tau_7$   $\mathrm{tsc}(C)/\mathrm{chord}(C)$  $1/k^3$             yes                               no

The measure $\tau_1$ simply computes the tortuosity of the segment by examining how long the curve is relative to its chord length. This measure is the same as the distance factor tortuosity measure described by Smedby et al. [5] and it is very similar to the relative length variation proposed by Lotmar et al. [1]. If the curve $C$ is a single arc, or if $C$ is chord-colinear, then $l = \mathrm{chord}(C)$ and $\tau_1$ is equivalent to relative length variation.

Measures $\tau_2$ and $\tau_3$ directly calculate the curvature of the curve. Unfortunately, neither of these measures has either of the compositionality properties. The $\tau_3$ measure differs from $\tau_2$ in that it places a greater emphasis on the parts of the curve that have high curvature and de-emphasizes the parts of the curve that have low curvature. Since the curvature is greater for small vessels, $\tau_3$ will emphasize the tortuosity of smaller vessels more than $\tau_2$. The $\tau_2$ measure is the same as the total curvature measure described by Smedby et al. [5].¹

The remaining measures are ‘length-normalized’. Measures $\tau_4$ and $\tau_5$ average the total curvature measures by the arc length, while $\tau_6$ and $\tau_7$ average by the chord length. The measures $\tau_4$ and $\tau_5$ have the advantage that they have the property of compositionality. We believe that these two measures a priori come closest to the ophthalmologist’s notion of tortuosity.

¹ We disagree with the claim by Smedby et al. that the total curvature measure is invariant to linear scaling. If $x(t)$ and $y(t)$ are multiplied by a constant factor, $A$, then their derivatives are multiplied by $A$. Thus the curvature measure in Eq. (2) is scaled by $1/A$.

2.3. Tortuosity calculation

The definitions for our tortuosity measures apply to abstract curves. The skeletonized blood vessels in our data sets consist of sequences of pixel locations that approximate the center line of the blood vessels in the retinal images. The arc length of a skeletonized blood vessel with $n$ pixel locations

$(x_i, y_i)$ can be calculated by walking along the curve and summing the distance between neighboring points:

$$ \sum_{i=1}^{n-1} \sqrt{(x_i - x_{i+1})^2 + (y_i - y_{i+1})^2}. $$

To calculate the total curvature and total squared curvature, recall [9] that the curvature of a parameterized curve can also be defined as

$$ \kappa(t) = \frac{d\theta}{ds}(t), $$

where $\theta(t)$ is the angle of the tangent line at $t$. To compute $(d\theta/ds)(t)$, we measured $\theta(t)$ and $s(t)$ and computed the derivative of $\theta(t)$ with respect to $s(t)$. Here, $s(t)$ is the arc length between $(x_1, y_1)$ and $(x_t, y_t)$. We determined $\theta(t)$ by calculating the angle of the line connecting $(x_t, y_t)$ and $(x_{t+1}, y_{t+1})$. To compute the derivative $(d\theta/ds)(t)$, we used a window of 10 values of $s(t)$ and $\theta(t)$: $(s(t-4), \theta(t-4)), \ldots, (s(t+5), \theta(t+5))$. A line, $f(x) = mx + b$, was fitted to these points with a linear regression method to minimize

$$ \sum_i (\theta(i) - f(s(i)))^2. $$

The derivative $(d\theta/ds)$ was equated with $m$, the slope of this line. The $n-9$ values for $(d\theta/ds)$ were used to numerically estimate the integrals needed to calculate the total curvature and total squared curvature.

Following a suggestion of Flynn and Jain [10], we smoothed the pixel representation of a blood vessel segment before making our tortuosity calculations. We independently smoothed the x and y coordinate sequences, $\{x_1, \ldots, x_n\}$ and $\{y_1, \ldots, y_n\}$, using the smooft method described by Press et al. [11], which applies a low-pass filter to the data. This method eliminated undesirable noise that is due to the discrete nature of the pixel representation. For example, a blood vessel at a 45° angle in an image can have a pixel representation that is a zigzag line of pixels along the blood vessel. The smoothed coordinates are not restricted to the discrete pixel grid, so their tortuosity measures more closely reflect the tortuosity of the actual blood vessel. Preliminary experiments indicated that two applications of this smoothing method were sufficient to reduce this noise.

The smoothing operation also eliminated noise that was introduced when independent segments were linked together to form a longer segment. Adjacent segments extracted from a blood vessel can form a non-smooth sequence of pixels. Smoothing combined segments produced a sequence of points that more closely resembled the contour of the original blood vessel.

3. Methods

We examined the utility of our tortuosity measures by using them as features in two different classification problems. In the first problem, we classified blood vessel segments as tortuous or non-tortuous. In the second problem, we classified the tortuosity of the entire blood vessel network.

3.1. Data

Blood vessel segments were extracted from a set of 20 retinal images in two different ways: automatically and manually. To extract blood vessel segments, we applied the blood vessel filter described in Chaudhuri et al. [12] to the green plane of an RGB image; the green plane was selected because it typically exhibits the greatest contrast. The filter was applied at 12 orientations over 180°. The final response map was computed by taking the maximum response of the 12 filters at each location. We thresholded and thinned the response map of the blood vessel filter to produce an image containing binary edge segments. The edge segments were primarily blood vessels, but also included some edges of large objects like the optic nerve.

To create the set of automatically extracted blood vessel segments, the edge segments were classified as blood vessels or non-blood vessels using the linear classifier described in Côté et al. [13]. There were 981 automatically extracted blood vessel segments, of which 252 were tortuous and 729 were non-tortuous.

The unclassified edge segments were also used to extract blood vessel segments manually. The edge segments were manually identified as blood vessels and linked together to form the final blood vessel segments. Care was taken to link vessel segments only if the smoothness of the link reflected the curvature of the underlying blood vessel. This is reasonable since blood vessel segments are smoothed before the tortuosity measures are calculated. There were 284 blood vessels extracted manually, of which 133 were tortuous and 151 were non-tortuous.

We are primarily interested in the automatically extracted blood vessels, since they can be used as part of automated diagnostic tools. However, the blood vessel filter tends to break up longer blood vessels. This usually occurs when two vessels cross in an image, where a vessel branches or when a vessel bends very rapidly. Since tortuous blood vessels can bend very rapidly, they are broken up more often than non-tortuous vessels. The manually extracted blood vessel segments provide a benchmark against which the performance of the automatically extracted segments can be compared.

After extraction, the tortuosity measures were calculated for each segment and the segments were labeled by one of the authors (MG), a retinal specialist who has experience with diseases causing tortuosity in retinal blood vessels. These two sets of extracted blood vessels were used for the first classification problem.

For the second classification problem, the extracted blood vessels were used to calculate the network tortuosity measures for each image. The vessel network tortuosity of each of the 20 retinal images was labeled; there were ten tortuous and ten non-tortuous images.

3.2. Classification

3.2.1. Logit models

We used a logit model [14] to classify the data for both classification problems. For problems with two classes, a logit model computes a weighted sum of the input features passed into a logistic function. Let $\{f_1, \ldots, f_n\}$ be $n$ input features and $\{w_0, \ldots, w_n\}$ be the $n+1$ weights. The two-class logit model is

$$ f(x, w) = g\left( w_0 + \sum_{i=1}^{n} w_i f_i \right), $$

where $g(x)$ is the logistic function $g(x) = 1/(1 + e^{-x})$. The output of the logit model is between zero and one. To perform classification, the output is thresholded to one or zero, depending on whether the output is greater or less than 0.5.

Given a set of training examples, $\{(x_1, y_1), \ldots, (x_n, y_n)\}$, the optimal weight vector is found by minimizing

$$ J(w) = \sum_i E(y_i, f(x_i, w)) $$

where

$$ E(a, b) = \begin{cases} -\log(1 - b), & a = 0 \\ -\log(b), & a = 1 \end{cases} $$

This is the ‘cross-entropy’ error, which is motivated by an analysis based on the relative entropy of the distribution of outputs with the distribution of binary data [15]. We found the optimal weight vector by minimizing $J(w)$ with the conjugate gradient method [11].

Logit models are well suited for modeling the probability distribution of binary data. If the data are linearly separable, logit models will simply perform a linear classification of the data. However, for data that are not linearly separable, logit models perform differently from linear models. Aldrich and Nelson [16] describe how nonlinear transformations like the logistic function modify the behavior of the linear model. Further, they note that there are a variety of reasons why the assumption that a probability model is linear is unrealistic in most cases. In preliminary experiments, we found that logit models had better classification performance than linear models.

3.2.2. Performance measures

Two measures were used to compare the performance of classifiers using the different tortuosity measures: classification rate and the integrated relative operating characteristic (ROC). The classification rate is simply the proportion of test samples that are correctly classified. The ROC measures the proportion of positive test samples that are correctly classified as positive instances (true-positives) as a function of the proportion of negative test samples that are classified as positive instances (false-positives) [17]. The integrated ROC measures how a classifier’s performance varies for different classification thresholds. The integrated ROC, $A$, typically assumes values between 0.5 and 1.0. When $A$ is near 0.5, the classifier performs no better than chance alone. When $A$ is near 1.0, the classifier has near perfect discrimination for most classification thresholds.

The ROC was measured by calculating the output of the logit model for all of the test samples. The outputs were split into 20 groups based on the rank of their values. The true-positive and false-positive proportions were measured for each group. The trapezoidal rule was used to integrate the ROC values. This method of calculating the ROC is similar to the method described by Swets [17].

The performance on the samples used to train a classifier is typically an optimistic estimate of the classifier’s performance on a new set of data [18]. To estimate the true error rate, we used cross-validation to partition the data into two subsets that are used to train and test the classifier. In the first classification problem, we used 50 randomly selected partitions for which 70% of the data were used to train the classifiers. A different classifier was trained and tested on each partition. The mean classification rate and integrated ROC on the testing subsets were used to evaluate the expected performance of a classifier.

The second classification problem has a much smaller set of data, so we performed 10-fold cross-validation. The data set was randomly partitioned into ten sets, each of which contained one positive and one negative example that were used for testing the classifier. Again, classifiers were trained and tested on each partition. The mean classification rate was used to evaluate the expected performance of a classifier. Because of the small size of the data set, the integrated ROC was calculated once using the values of the logit model for every test sample.

3.3. Experiments

In both classification problems, we performed experiments that examine the classification rate using each tortuosity measure by itself. We also performed experiments using all of the

tortuosity measures together. We applied Tukey’s method for multiple comparisons [19] to see if there were significant differences in the classification rate of the classifiers in the two classification problems ($p < 0.05$).

4. Results

Table 2 summarizes the experimental results for the two classification problems. The experiment name indicates whether the data were manually extracted or automatically extracted and indicates which tortuosity measures were used in the experiment.

Table 2. Results for the problems of classifying blood vessel segments and blood vessel networks: mean cross-validation classification rate (%) and integrated ROC

            Segments                                  Networks
            Classification rate   Integrated ROC      Classification rate   Integrated ROC
Measure     Manual    Auto        Manual    Auto      Manual    Auto        Manual    Auto
$\tau_1$    79.5      91.3        0.897     0.955     65        90          0.61      0.91
$\tau_2$    79.3      82.7        0.875     0.915     70        85          0.79      0.85
$\tau_3$    82.9      89.5        0.927     0.960     90        90          0.86      0.86
$\tau_4$    80.4      88.5        0.905     0.951     80        95          0.83      0.99
$\tau_5$    81.2      87.7        0.906     0.941     85        90          0.84      0.83
$\tau_6$    82.1      89.1        0.921     0.956     80        90          0.81      0.91
$\tau_7$    82.0      88.1        0.914     0.944     85        90          0.88      0.91
All         85.6      91.5        0.935     0.970     95        90          0.95      0.86

For the segment classification problem, the manually extracted data set exhibited significant differences between the following pairs of tortuosity measures: $(\tau_1, \tau_3)$, $(\tau_1, \tau_6)$, $(\tau_1, \tau_7)$, $(\tau_2, \tau_3)$, $(\tau_2, \tau_6)$ and $(\tau_2, \tau_7)$. All of the interactions in the automatically extracted data set were significantly different, except the following: $(\tau_3, \tau_6)$, $(\tau_4, \tau_5)$, $(\tau_4, \tau_6)$, $(\tau_4, \tau_7)$ and $(\tau_5, \tau_7)$. For the network classification problem, no significant differences were found for either the manually or automatically extracted data sets.

For most of the classifiers, the relative integrated ROC values mirrored the relative classification rates. An interesting exception is the integrated ROC for $\tau_1$. For the segment classification problem, the classification rate of the classifier using $\tau_1$ to classify automatically extracted blood vessels was significantly higher than the rest of the classifiers. However, the integrated ROC for $\tau_1$ was lower than the integrated ROC for $\tau_3$ and $\tau_6$. This indicates that the classification threshold used in these experiments favors $\tau_1$, but the $\tau_3$ and $\tau_6$ measures are robust for a wider range of classification thresholds.

In preliminary experiments, we examined tortuosity measures that multiplied the length-normalized measures by the average vessel diameter. We thought these would improve the performance of our tortuosity measures since this increases the measure’s scale relation without changing its tortuosity properties. However, these measures performed slightly worse than their counterparts.
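To make the contrast between the classification rate at the fixed 0.5 threshold and the integrated ROC concrete, the following is a minimal sketch of the procedure outlined in Section 3.2.2. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, NumPy is assumed, and the 20 rank-based groups are turned into ROC points by using the cumulative group boundaries as thresholds, which the text does not specify.

```python
import numpy as np

def classification_rate(outputs, labels, threshold=0.5):
    """Fraction of samples whose thresholded output equals the 0/1 label."""
    outputs = np.asarray(outputs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    predicted = (outputs > threshold).astype(int)
    return float(np.mean(predicted == labels))

def integrated_roc(outputs, labels, n_groups=20):
    """Area under an ROC curve built from rank-based groups of the outputs.

    Assumes at least one positive and one negative label. Ties in the
    outputs are ordered arbitrarily by the sort.
    """
    outputs = np.asarray(outputs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-outputs)            # highest output first
    sorted_labels = labels[order]
    n_pos = int(sorted_labels.sum())
    n_neg = len(sorted_labels) - n_pos
    # Assumed grouping: near-equal-sized groups in rank order.
    cuts = np.linspace(0, len(outputs), n_groups + 1).round().astype(int)
    tpr, fpr = [0.0], [0.0]
    for c in cuts[1:]:
        tp = int(sorted_labels[:c].sum())   # positives ranked in the top c
        fp = int(c) - tp                    # negatives ranked in the top c
        tpr.append(tp / n_pos)
        fpr.append(fp / n_neg)
    # Trapezoidal rule over the (false-positive, true-positive) points.
    area = 0.0
    for i in range(len(fpr) - 1):
        area += (fpr[i + 1] - fpr[i]) * (tpr[i + 1] + tpr[i]) / 2.0
    return float(area)
```

In this sketch, a classifier whose outputs rank every tortuous sample above every non-tortuous one but all fall below 0.5 has a perfect integrated ROC and a poor classification rate, which is the kind of threshold dependence discussed for $\tau_1$ versus $\tau_3$ and $\tau_6$.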

5. Discussion

Our experimental results demonstrate that the proposed tortuosity measures can be used to effectively classify the tortuosity of blood vessel segments and blood vessel networks. In particular, these results show that the tortuosity measures can be used with automatically extracted blood vessel segments, which is important for the development of retinal image analysis tools.

5.1. Blood vessel segment tortuosity

The better classification performance on the automatically extracted data set can be attributed to the short, simple blood vessel segments in this data set. While an entire vessel segment may be labeled tortuous, subsegments of the vessel may not be particularly tortuous. These non-tortuous subsegments will influence the overall tortuosity of the vessel segment, making it appear less tortuous and thereby more difficult to classify. The automatically extracted blood vessels are usually short enough to be entirely tortuous or non-tortuous.

This also explains why the τ1 tortuosity measure performed much better on the automatically extracted data set. This measure does not accurately discriminate between long segments because it does not analyze the tortuosity along the segment. Since the tortuosity is more constant along shorter segments, this global measure performs well.

We can use the statistical analysis of the classification rates to analyze the relative importance of the abstract tortuosity measures. The only clear pattern for both of the data sets is that the τ2 measure had consistently poor performance. In fact, τ2 was significantly worse than τ3 for both data sets. The only property not shared by τ2 and τ3 is response to scale, which suggests that the response to scale is an important property. The fact that the difference between τ2 and τ3 was greater on the manually extracted data set supports this conclusion.

5.2. Blood vessel network tortuosity

The results for the vessel network classification problem follow the same pattern as the results for the blood vessel classification problem. The classification performance was better on the automatically extracted data set and the τ1 tortuosity measure performed much better on the automatically extracted data set.

It is surprising that the performance also increased for τ4 and τ5. These measures have the compositionality property, so the blood vessel networks should have roughly the same tortuosity measure for both data sets. An analysis of the data set indicated that the performance difference was due to a difference in the extraction process on one of the images. This particular image was tortuous, but only slightly so. The automatically extracted segments included some exudate edges that were misclassified as blood vessel edges during extraction. These edges had a high curvature and thereby made the image appear more tortuous. The manually extracted segments did not include these segments and thereby reduced its tortuosity measure to a point where the image appeared non-tortuous.

One other image was responsible for most of the remaining error for both data sets. This image is clearly tortuous, but it has very small blood vessels. The segmentation algorithm tends to smooth out very small blood vessels, so the tortuosity measures for this image were low enough to make it appear non-tortuous.
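The scale argument in Section 5.1 can be illustrated numerically. The sketch below is not the curvature estimator used in this study; it is a minimal turning-angle approximation on a polyline, written only to show how three kinds of measure respond when a vessel is uniformly enlarged. It assumes that the distance factor is arc length over chord length minus one, that total curvature is the arc-length integral of |κ|, and that the third measure is the arc-length integral of κ² (total squared curvature, which the scale discussion suggests corresponds to τ3). The names `tortuosity_measures` and `sine_vessel` are hypothetical.

```python
import math


def tortuosity_measures(points):
    """Return (distance_factor, total_curvature, total_squared_curvature)
    for a polyline given as a list of (x, y) tuples.

    Assumes consecutive points are distinct and the end points differ.
    Curvature is approximated by the turning angle at each interior
    vertex divided by the arc length assigned to that vertex.
    """
    edges = [(x1 - x0, y1 - y0)
             for (x0, y0), (x1, y1) in zip(points, points[1:])]
    lengths = [math.hypot(dx, dy) for dx, dy in edges]
    arc_length = sum(lengths)
    chord = math.hypot(points[-1][0] - points[0][0],
                       points[-1][1] - points[0][1])

    headings = [math.atan2(dy, dx) for dx, dy in edges]
    total_curv = 0.0
    total_sq_curv = 0.0
    for i in range(len(edges) - 1):
        # Turning angle at the interior vertex, wrapped to [-pi, pi).
        turn = headings[i + 1] - headings[i]
        turn = (turn + math.pi) % (2 * math.pi) - math.pi
        # Arc length associated with this vertex (half of each neighbor edge).
        ds = 0.5 * (lengths[i] + lengths[i + 1])
        kappa = turn / ds
        total_curv += abs(kappa) * ds
        total_sq_curv += kappa * kappa * ds

    return arc_length / chord - 1.0, total_curv, total_sq_curv


def sine_vessel(scale, n=400):
    """Synthetic 'vessel': one period of a sine wave, uniformly scaled."""
    ts = [2 * math.pi * i / (n - 1) for i in range(n)]
    return [(scale * t, scale * math.sin(t)) for t in ts]


base = tortuosity_measures(sine_vessel(1.0))
doubled = tortuosity_measures(sine_vessel(2.0))
print("scale 1:", base)
print("scale 2:", doubled)
```

Under these assumptions, doubling the size of the curve leaves the distance factor and the total curvature unchanged, while the total squared curvature is halved. That is the inverse response to scale that distinguishes the squared-curvature measure from the total-curvature measure in the discussion above.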

5.3. Blood vessel extraction

The experimental results indicate that the utility of the tortuosity measures can be affected by the manner in which the blood vessel segments are extracted. We have observed three different ways in which the extraction process influenced the performance in the classification tasks. While these influences are specific to our data, they highlight factors that will influence the performance of tortuosity measures on any data set.

First, the segmentation method can affect the tortuosity of the extracted vessels. Our segmentation algorithm breaks up vessels at points of very high curvature and tends to smooth out very thin blood vessels. These issues are less problematic when calculating the tortuosity of blood vessel networks. The broken blood vessels can be reconnected, and factors that cause tortuosity will usually affect both large and small blood vessels.

The length of the extracted segments can also affect the utility of the tortuosity measure for classification. The shorter segments in the automatically extracted data sets are the principal reason for the better classification results for these data sets.

Finally, extracting blood vessel segments automatically can affect the tortuosity measures of blood vessel networks, since the networks can include misclassified segments. Non-blood vessel segments are often more tortuous than blood vessel segments, so the tortuosity of a blood vessel network can be biased when misclassified segments are included. However, misclassified segments occur almost exclusively in images that contain a large number of abnormal objects such as lesions. This problem did not seriously impact our results because the blood vessel classifier used to automatically extract the blood vessel segments is quite accurate. It has an 89.6% classification rate for non-blood vessels on a data set containing images with lesions [13].

6. Conclusions

Our experimental results demonstrate that the proposed tortuosity measures can be used to classify the tortuosity of blood vessel segments and blood vessel networks. We believe that automated tortuosity measures like these will play an important part in retinal image analysis tools. Automated measures are important because they provide consistent measurements that are directly comparable and can be used to detect changes in vascular tortuosity accurately.

Both classification problems used a coarse, two-class scale to evaluate the tortuosity measures. A coarse scale was chosen for the experiments to minimize the possible error of the hand labeling. One advantage of the proposed tortuosity measures is their ability to make tortuosity measurements in a continuous fashion. However, additional experiments are needed to evaluate whether the proposed tortuosity measures correctly predict tortuosity on a finer scale.

We have motivated our choice of tortuosity measures with abstract properties that formalize ophthalmological intuitions concerning tortuosity measures. Abstract properties provide a formal basis for comparing tortuosity measures, which enables us to analyze the prospective performance of new tortuosity measures. Our experimental results suggest that response to scaling is an important property for tortuosity measures, while the compositionality properties did not exhibit a discernible influence on the classification performance. The importance of the response to scaling is particularly interesting because the results indicate that inverse scaling is preferred, which is contrary to our ophthalmological intuitions.

While our classification results do not strongly support the use of one measure over the others, the τ3 measure appears the best choice for a tortuosity measure. Both the τ3 and τ4 measures were closest to the ophthalmologist's notion of tortuosity, though the τ3 measure performed slightly better on the classification tasks. Classification with all of the tortuosity measures was better than with τ3 alone, but not in a statistically significant manner.

Because tortuosity measures are used to infer information about disease severity or change of disease over time, the reproducibility of tortuosity measures is an important property. We have not examined this issue in this study, but the analysis in Smedby et al. [5] suggests that our measures will be very reproducible. Smedby et al. [5] observe that the total curvature (τ2) and distance factor (τ1) measures are very reproducible across a series of images of the same blood vessel. All of the tortuosity measures that we have examined are very similar (if not identical) to τ1 and τ2.

Acknowledgements

This work was supported by NIH Grant No. RO1LM05759. This work was performed in part at Sandia National Laboratories. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under Contract DE-AC04-94AL85000.

References

[1] W. Lotmar, A. Freiburghaus, D. Bracher, Measurement of vessel tortuosity on fundus photographs, Graefe's Arch. Clin. Exp. Ophthalmol. 211 (1979) 49–57.
[2] D. Bracher, Changes in peripapillary tortuosity of the central retinal arteries in newborns, Graefe's Arch. Clin. Exp. Ophthalmol. 218 (1986) 211–217.
[3] J.A. Kylstra, T. Wierzbicki, M.L. Wolbarsht, M.B. Landers III, E. Stefansson, The relationship between retinal vessel tortuosity, diameter and transmural pressure, Graefe's Arch. Clin. Exp. Ophthalmol. 224 (1986) 477–480.
[4] A. Kaupp, H. Toonen, S. Wolf, K. Schulte, R. Effert, D. Meyer-Ebrecht, M. Reim, Automatic evaluation of retinal vessel width and tortuosity in digital fluorescein angiograms, Invest. Ophthalmol. Vis. Sci. 32 (1991) 952.
[5] Ö. Smedby, N. Högman, S. Nilsson, U. Erikson, A.G. Olsson, G. Walldius, Two-dimensional tortuosity of the superficial femoral artery in early atherosclerosis, J. Vasc. Res. 30 (1993) 181–191.
[6] J.J. Capowski, J.A. Kylstra, S.F. Freedman, A numeric index based on spatial-frequency for the tortuosity of retinal-vessels and its application to plus disease in retinopathy of prematurity, Retina 15 (6) (1995) 490–500.
[7] L.A. Zhou, M.S. Rzeszotarski, L.J. Singerman, J.M. Chokreff, The detection and quantification of retinopathy using digital angiograms, IEEE Trans. Med. Imag. 13 (4) (1994) 619–626.
[8] N.P. Katz, M.H. Goldbaum, S. Chaudhuri, M.R. Nelson, Automated measurements of blood vessels in digitized images of the ocular fundus, Invest. Ophthalmol. Vis. Sci. 31 (1990) 1185.
[9] R. Courant, J. Fritz, Introduction to Calculus and Analysis, Wiley-Interscience, New York, 1965.
[10] P.J. Flynn, A.K. Jain, On reliable curvature estimation, in: IEEE Proceedings on Computer Vision and Pattern Recognition, 1989, pp. 110–116.
[11] W.H. Press, B.P. Flannery, S.A. Teukolsky, W.T. Vetterling, Numerical Recipes in C: The Art of Scientific Computing, Cambridge University Press, Cambridge, MA, 1990.
[12] S. Chaudhuri, S. Chatterjee, N. Katz, M. Nelson, M. Goldbaum, Detection of blood vessels in retinal images using two dimensional blood vessel filters, IEEE Trans. Med. Imag. 8 (3), September, 1989.
[13] B. Côté, W.E. Hart, M. Goldbaum, P. Kube, M.R. Nelson, Classification of blood vessels in images of the ocular fundus, Technical Report CS94-350, University of California, San Diego, CA, 1994.
[14] J.S. Cramer, The Logit Model: An Introduction for Economists, E. Arnold, London, 1991.
[15] J.S. Bridle, Training stochastic model recognition algorithms as networks can lead to maximum mutual information estimation of parameters, in: D. Touretzky (Ed.), Advances in Neural Information Processing Systems, vol. 2, Morgan Kaufmann, San Mateo, CA, 1990, pp. 211–217.
[16] J.H. Aldrich, F.D. Nelson, Linear Probability, Logit and Probit Models, Number 07-045 in Quantitative Applications in the Social Sciences, Sage, 1984.
[17] J.A. Swets, Measuring the accuracy of diagnostic systems, Science 240 (1988) 1285–1293.
[18] R.O. Duda, P.E. Hart, Pattern Classification and Scene Analysis, Wiley, New York, 1973.
[19] J. Rice, Mathematical Statistics and Data Analysis, Statistics and Probability Series, Wadsworth and Brooks/Cole, 1988.
