
Vision Res. Vol. 31, No. 12, pp. 2195-2207, 1991. 0042-6989/91 $3.00 + 0.00. Printed in Great Britain. All rights reserved. Copyright © 1991 Pergamon Press plc

INTEROCULAR CORRELATION, LUMINANCE CONTRAST AND CYCLOPEAN PROCESSING

LAWRENCE K. CORMACK, SCOTT B. STEVENSON and CLIFTON M. SCHOR
School of Optometry, University of California, Berkeley, CA 94720, U.S.A.

(Received 27 August 1990; in revised form 17 January 1991)

Abstract: We have investigated the nature and viability of interocular correlation as a measure of signal strength in the cyclopean domain. Thresholds for the detection of interocular correlation in dynamic random element stereograms were measured as a function of luminance contrast, a more traditional measure of stimulus strength. At high contrasts, correlation thresholds were independent of contrast. At low contrasts, correlation thresholds were inversely proportional to the square of contrast. Stereothresholds were also measured as a function of both contrast and interocular correlation. At low contrasts, stereoacuity was inversely proportional to both interocular correlation and the square of contrast. These results are consistent with an inherently multiplicative mechanism of binocular combination, such as a cross-correlation of the two eyes' inputs.

Interocular correlation Contrast Stereopsis

INTRODUCTION

In general, interocular correlation can be thought of as the degree to which the images in the two eyes match one another. Intuitively, interocular correlation, if reasonably defined, should provide a measure of signal strength in the cyclopean domain. That is, if the two eyes' views are almost identical, as is the case with binocular fixation of a flat surface, interocular correlation is very close to the maximum possible. On the contrary, if the two eyes' views of the world are predominantly non-overlapping, then nothing can be predicted about the right eye's image given only the left eye's image and vice versa. In this case there would be zero interocular correlation and, hence, no cyclopean information available.

Formally, interocular correlation can be defined as the cross-correlation of the image pair comprising the right and left eyes' views of the world. For the simple one-dimensional case the interocular correlation at some disparity d is given by:

$$\mathrm{IOC}(d) = \int f(x)\,h(x + d)\,dx \tag{1}$$

where f(x) and h(x) represent the intensity profiles (or some derivative of them) along the horizontal meridian of the right and left eyes' retinae.

This definition is simple, quantitative, and works for any given image pair. For "natural" images, with their generous variety of colors, luminances, etc., interocular correlation is somewhat difficult to intuit, regardless of the definition employed. It would be unclear, for example, how to change one member of an image pair in order to reduce the interocular correlation by some desired amount. In the laboratory, however, where one can restrict the visual environment to one-bit random dot stereograms of 50% density, the notion of interocular correlation is quite intuitive. Under these conditions, the interocular correlation for a given disparity is simply a linear function of the proportion of dots which match at that disparity, i.e.

$$\mathrm{IOC}(d) = 2P_d - 1 \tag{2}$$

where P_d is the proportion of matching dots at disparity d.

Examples of different interocular correlations are shown in Fig. 1. The top stereo pair illustrates an interocular correlation of +1 (all dots match at 0 disparity) and, when fused, the percept is that of a flat plane. In the middle and bottom panels, the interocular correlation has been reduced to +0.5 and 0 respectively, accompanied by a degradation of the perceived quality of the plane. It should be noted that the phenomenology of these static examples differs somewhat from that of dynamic displays such as were employed in our experiments. Specifically, the dynamic displays of interocular correlations less than unity give rise to a percept much like

a dirty window embedded in fog; as correlation increases, the window grows dirtier and the fog grows thinner. Absent in the dynamic displays was the appearance of a patchwork of co-planar dots imbedded in a rivalrous/lustrous background, such as is seen in these static examples.

[Figure 1: three random-dot stereo pairs, panels (a), (b) and (c).]

Fig. 1. Examples of random noise with various amounts of interocular correlation. These examples can either be free-fused or viewed through a stereoscope. In (a), the interocular correlation is 100% at 0 disparity, and the percept is one of a flat plane. In (b) the interocular correlation has been reduced to 50%; while a flat plane is still perceived, it appears less robust and accompanied by dots both in front of and behind the plane of fixation. In (c), the interocular correlation is 0, and no percept of a coherent plane is extant. In dynamic versions of these stimuli, such as were employed in the experiments, the phenomenology is somewhat different. Lower interocular correlations in particular appear as semi-transparent volumes when dynamic, whereas, when static, they appear as opaque surfaces with chaotic topography.

Yet Fig. 1 does illustrate a basic point, which is that as one decreases the interocular correlation of the stimulus, the salience of the flat plane also decreases. In this sense, interocular correlation could represent a metric of signal amplitude in the cyclopean domain analogous to the manner in which luminance contrast provides a metric of signal amplitude in the spatial domain.* The generation of a cyclopean signal does not occur in parallel with the generation of a spatial (contrast) signal, however; the presence of a contrast signal is a necessary precursor to the generation of a cyclopean signal. Given this ordinal relationship, and the fact that contrast is already known to influence such hypercyclopean functions as stereoacuity (Halpern & Blake, 1988; Legge & Gu, 1989; Heckman & Schor, 1989), it might be possible to control the amplitude of a cyclopean signal by manipulating luminance contrast. For example, if the luminance contrast of a cyclopean stimulus is reduced, it might be possible to compensate for the resulting decrease in signal strength by increasing the interocular correlation, thereby maintaining a constant (e.g. threshold) level of performance on some task. Thus, given a threshold level of performance on said task, a trading relation would be expected between contrast and interocular correlation. Moreover, the form of this trading relation would reflect the manner in which cyclopean signals are derived from monocular contrast signals. Accordingly, we measured the effect of contrast on the detection of interocular correlation in the first experiment. Based on the results of this experiment, a model was developed to generate predictions concerning hypercyclopean functions such as stereoacuity. Predictions of this model were then tested in Experiment 2 by measuring stereoacuity as a function of both contrast and interocular correlation.

*In this paper, we will use the term "cyclopean" to refer to the site and/or processes of binocular combination itself and the term "hypercyclopean" to refer to processes occurring after or beyond the cyclopean stage.

GENERAL METHODS

The experiments were performed using dynamic random-element stereograms of 50% element density. A diagram of the basic apparatus is shown in Fig. 2. A random noise signal was hardware generated via shift registers running at 7 MHz and was displayed on a pair of matched TSD monitors (P4 phosphor, 60 Hz non-interlaced) viewed through a mirror haploscope. The viewing distance was 53.7 cm and the haploscope mirrors were adjusted for each subject to the corresponding convergence angle, thus obviating any mismatch between convergence and accommodation or "higher level" distance cues. Mean luminance was 80 cd/m². The displays were viewed through 7 deg circular apertures in an otherwise black surround.

Fig. 2. Schematic illustration of the experimental apparatus. Random bit streams were hardware generated and sent to a pair of video monitors, which were viewed through a mirror haploscope. Disparities were created by delaying the sync to one monitor. Interocular correlation was manipulated by driving the monitors with a single dot generator, independent dot generators, or a combination thereof (see text for details). The psychophysics (stimulus presentation, data acquisition, etc.) were all under computer control.

Horizontal disparities were produced by delaying the horizontal video sync to one monitor. This was accomplished via a programmable delay chip (Digital Delay Devices model PDU-13256-0.5), which allowed us to delay the noise stimulus to one eye in 0.5 nsec (corresponding to 2 arc sec) increments.

Interocular correlation was simply the proportion of the dots which were "forced" to match in the two images; the remainder of the dots then had a 50% chance of matching. Thus, in a display which had an interocular correlation of 0, half of the dots in the right image were matched by dots in the left image. In a display which had an interocular correlation of -1 ("anticorrelation"), the right image was simply the opposite contrast version of the left image and no matches existed. For an interocular correlation of +1, of course, the two images were identical.
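The dot-matching definition of interocular correlation can be illustrated with a short simulation. The following is only a sketch under stated assumptions: the function names (`make_pair`, `proportion_matching`, `measured_ioc`), the ±1 coding of the one-bit dots, and the dot counts are all hypothetical choices for illustration and are not part of the original apparatus, which generated its noise in hardware. A proportion |IOC| of the dots is forced to match (or, for negative values, forced to be opposite), and the remainder are drawn independently.

```python
import random

def make_pair(n_dots, ioc, rng):
    """Return (left, right) rows of +1/-1 dots with expected correlation `ioc`.

    A proportion |ioc| of dots is "forced" (matched for ioc > 0, opposite for
    ioc < 0); every other dot is independent and so matches half of the time.
    """
    sign = 1 if ioc >= 0 else -1
    left = [rng.choice((-1, 1)) for _ in range(n_dots)]
    right = []
    for dot in left:
        if rng.random() < abs(ioc):
            right.append(sign * dot)           # forced dot
        else:
            right.append(rng.choice((-1, 1)))  # independent dot
    return left, right

def proportion_matching(left, right):
    """Proportion of dots that are identical in the two rows (P in equation 2)."""
    return sum(l == r for l, r in zip(left, right)) / len(left)

def measured_ioc(left, right):
    """Equation (2): IOC = 2P - 1."""
    return 2.0 * proportion_matching(left, right) - 1.0

rng = random.Random(0)
for target in (-1.0, 0.0, 0.5, 1.0):
    left, right = make_pair(20000, target, rng)
    print(target, round(proportion_matching(left, right), 3),
          round(measured_ioc(left, right), 3))
```

With a large number of dots, the measured proportion of matches should lie close to (IOC + 1)/2, for instance near 0.75 for a target correlation of 0.5, and exactly 0 or 1 for targets of -1 and +1.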

Thus, rearranging equation (2), the proportion of matching dots is, on average, simply (IOC + 1)/2; the two are linearly related, as is somewhat demanded by intuition.

Interocular correlation was varied through the use of two independent noise generators. To produce an interocular correlation of +1, the output of a single noise generator was sent to both monitors. To produce an interocular correlation of 0, each monitor was driven by a separate noise generator. Intermediate interocular correlations were produced by switching between the above two conditions at a sufficiently high rate (1 kHz) to allow the spatio-temporal integration of the visual system to render the stimulus identical with one in which interocular correlation is statistically determined on a dot-by-dot basis. It should be noted that this switching pulse was not synched to the video signal, nor was the switching rate sufficiently close to an even multiple of the frame rate, lest stationary or drifting bands of correlation be visible. Using this method, the interocular correlation is simply the duty cycle of the (rectangular wave) switching pulse, and could be placed under software (Experiment 1) or hardware (Experiment 2) control.*

*Two control experiments were done to ensure that this method was adequate. First, we had subjects attempt to distinguish between an intermediate correlation stimulus produced by a 1 kHz switching pulse (as used in our experiments) and a 100 kHz switching pulse. All subjects failed to make the distinction. On the assumption that 100 kHz (100 switching cycles every msec) is fast enough, this experiment shows that 1 kHz is also fast enough. Second, we had subjects attempt to discriminate between "true" zero correlation (right and left eyes' images produced by independent random bit streams) and an intermediate correlation as produced by a 1 kHz mixture of +1 and -1 interocular correlation. The independent variable was the duty cycle of the switching pulse. As the duty cycle of the switching pulse went to 50%, the ability of the subjects to make the discrimination fell to chance levels.

Thus, to create an interocular correlation of 0.75, the duty cycle of the rectangular wave switching pulse was such that the subject was viewing a fully correlated display 75% of the space/time, and viewing an uncorrelated display for the remainder. Given the 1 kHz switching pulse frequency and the 60 Hz non-interlaced frame rate, however, the perception was one of a continuously present intermediate correlation; no perception of the correlation switching was present during the experiments.

Luminance contrast was controlled by adjusting the peak-to-trough value of the video signals using customized hardware and software. The response functions of the two monitors were matched and calibrated prior to both experiments using a Photo Research Spectra Spotmeter photometer.

Experiment 1: The Detection of Interocular Correlation as a Function of Luminance Contrast

Method

The 3 authors served as subjects. All had normal or corrected to normal acuity, normal contrast sensitivity and good stereopsis.

A temporal 2AFC method of constant stimuli paradigm was employed. The subjects fixated a 12 arc min wide "+" which, to ensure accurate convergence, was flanked above and below by 48 x 5 arc min nonius lines. A block of trials consisted of 1 trial at each of 4-6 correlation levels, chosen to bracket the expected threshold based on pilot data. The order of trials was randomized within blocks. A run was composed of 30 blocks of trials at a single stimulus contrast. A threshold correlation for a particular contrast level was defined as 75% correct on a psychometric function fit to the data from 3 (subject CMS) or 5 (subjects SBS and LKC) such runs. Each subject ran at between 9 and 11 contrast levels, which were even multiples of that subject's contrast threshold for our dynamic random-element stimuli.†

†Individual monocular contrast detection thresholds were measured for our dynamic random element stimuli on the same apparatus using the same 2AFC/constant stimuli paradigm. Contrast detection thresholds were between 2 and 4% (Michelson contrast for a single stimulus frame). These relatively high thresholds are to be expected because of the effective contrast reduction which the temporal integration of the visual system imparts on dynamic stimuli. The lowest contrast at which a particular subject could perform either the correlation detection task (Experiment 1) or the stereoacuity task (Experiment 2) corresponded to the calculated contrast at which both eyes were reliably detecting the stimulus [i.e. P(RE detection AND LE detection) = 0.75], as obtained by multiplying the psychometric functions for the right and left eyes.

A trial was composed of two stimulus intervals, 1.2 sec in duration, which were delimited by audible tones. The dynamic noise was continuously present but, during one of the two intervals, switched from 0 interocular correlation to some positive interocular correlation for 200 msec. The plane of positive correlation was always presented at 0 disparity, i.e. the plane of stimulus presentation was coincident with the plane of fixation as defined by the nonius horopter. The subject's task was to signal, by means of a key press, which interval contained some non-zero interocular correlation. The subject's response was followed by auditory feedback and initiated the next trial.

Results

The results for the 3 subjects are shown superimposed in Fig. 3. The axis of ordinates represents correlation threshold while the axis of abscissas represents contrast expressed in threshold multiples. Both axes are logarithmic. As can be seen from the figure, correlation threshold is independent of contrast at relatively high contrasts. At relatively low contrasts, however, correlation thresholds decrease rapidly as stimulus contrast increases. Thus, in the low contrast region, a decrease in stimulus effectiveness due to a contrast reduction can be compensated for by an increase in interocular correlation in order to restore a criterion (e.g. threshold) level of performance.

Generally, the data can be described as consisting of two regions, a high contrast region in which the data asymptote to a line of slope 0, and a low contrast region in which they asymptote to a line of slope -2. The solid line in Fig. 3 is of slope -2, plotted for reference. The short vertical lines at the top of the figure show representative error bars for the low (left-hand line) and high (right-hand line) contrast regions.

[Figure 3: correlation threshold versus contrast (threshold multiples), logarithmic axes with abscissa ticks at 1, 10 and 100; legend: sbs, cms, lkc.]

Fig. 3. Threshold for the detection of interocular correlation as a function of luminance contrast, expressed in threshold multiples, for the 3 subjects. Both axes are logarithmic. The data asymptote to a log-log slope of 0 at high contrasts and -2 at low contrasts. A line of slope -2, indicating a trading relation between interocular correlation and the square of contrast, is plotted for reference. The left- and right-hand vertical lines at the top of the figure represent typical SD for low and high contrast judgments respectively.

Discussion

It can be reasonably assumed that the detection of a stimulus in a psychophysical task occurs when a signal-to-noise ratio reaches some critical value at the relevant site (or in the relevant functional unit) of the visual system (cf. Green & Swets, 1966). The data from this experiment indicate that, at higher contrasts, the relevant signal-to-noise ratio remains constant as contrast changes; this is reflected by the fact that correlation thresholds are independent of contrast at high stimulus contrasts.

At lower contrasts, however, this is not the case. As stimulus contrast is reduced, eventually contrasts are reached which render a previously threshold level correlation indistinguishable from uncorrelated noise. Threshold level performance can be restored, however, by increasing the interocular correlation of the stimulus by some amount. This indicates that, at low contrasts, the relevant signal-to-noise level is changing with stimulus contrast and that this change can be compensated for by appropriate adjustment of the interocular correlation. Further, the relative effectiveness of contrast and interocular correlation on the signal-to-noise ratio (i.e. the trading relation between the two variables) is given by the log-log slope to which the data asymptote at low contrast levels. From Fig. 3, it can be seen that this slope is roughly -2, which indicates that the relevant signal-to-noise ratio is proportional to the square of stimulus contrast. A possible means of binocular combination by which this behavior could be realized is a simple cross-correlation of edge information in the stimulus.

A cross-correlation of two signals produces a cross-correlation function, the height of which at some relative displacement reflects the degree to which the two input signals match at that displacement. When a horizontal cross-correlation is done on two retinal images (assuming correct vertical alignment of the images), the resulting function can be thought of as representing the number of matching stimulus elements as a function of retinal disparity. In order to determine the behavior of such cross-correlation functions in response to various stimulus parameters, particularly luminance contrast and interocular correlation, a model of binocular combination was developed which incorporated the operation of cross-correlation as the "engine" of the model.
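The idea that a cross-correlation function peaks at the displacement where the two inputs match can be sketched for a single row of one-bit dots. This is an illustrative sketch only, not the model used in the paper: the function name `cross_correlation`, the ±1 coding of the dots, the normalisation by the number of overlapping samples, and the row length and disparity are all assumptions made for the example.

```python
import random

def cross_correlation(f, h, max_disp):
    """Discrete analogue of equation (1): mean of f[x] * h[x + d] over the
    samples where both rows are defined, for d in [-max_disp, max_disp]."""
    n = len(f)
    result = {}
    for d in range(-max_disp, max_disp + 1):
        total, count = 0.0, 0
        for x in range(n):
            if 0 <= x + d < n:
                total += f[x] * h[x + d]
                count += 1
        result[d] = total / count
    return result

rng = random.Random(1)
n, true_disparity = 2000, 3

# Right-eye row: random one-bit dots coded as -1/+1.
f = [rng.choice((-1, 1)) for _ in range(n)]
# Left-eye row: the same dots displaced by `true_disparity`, so h[x + 3] == f[x];
# the first few samples are filled with fresh random dots.
h = [rng.choice((-1, 1)) for _ in range(true_disparity)] + f[:n - true_disparity]

xc = cross_correlation(f, h, 8)
peak = max(xc, key=xc.get)
print("peak at disparity", peak, "height", xc[peak])
```

In this fully correlated example the function reaches its maximum at the imposed displacement, while the values at other displacements hover near zero and reflect only chance ("spurious") matches between unrelated dots.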


A truly global cross-correlation would be undesirable because such a model would have difficulty distinguishing between transparent stimuli and stimuli with local depth variations. For example, a transparent stimulus consisting of one surface in front of a second surface and a "checkerboard" stimulus in which alternate squares lie at different depths would both lead to a double peaked cross-correlation function. This problem can be avoided by doing "local" cross-correlations on smaller patches of the visual scene. The result is a set of cross-correlation functions whose members correspond to different visual directions. Thus, transparent stimuli would give rise to double peaked cross-correlation functions at each visual direction along which transparency exists, whereas depth variations across a visual scene would give rise to single peaked cross-correlation functions, with the location of the peak depending on the visual direction to which the function corresponds.

A cartoon representation of the model is shown in Fig. 4, which illustrates the various stages of the model along with the physiological properties which they were intended to mimic. The input to the model was a series of image arrays, representing video frames of the stimulus, the contrast and interocular correlation of which could be varied. Other image parameters, such as the size and density of the elements comprising the stereograms, were set to match the conditions of our experiments. The input image arrays were "blurred" by the optics of the eye via convolution with a gaussian representation of the point spread function of the emmetropic eye (0.84 arc min space constant). The blurred image frames were then sampled by "retinae" with a 30 arc sec inter-receptor distance, and averaged over space-time to simulate reasonable values for spatial and temporal integration of the visual system. The amount of integration was varied within sensible limits to ensure that the general conclusions from the modeling did not depend on the particular values chosen; the primary effect of increased spatio-temporal integration is simply to reduce the relative amplitude of noise from spurious matches [i.e. the "ghost" matches of Cogan (1978) or the "phantom" matches of Julesz (1971)] in the cross-correlation function. The images were then differentiated with respect to space (edge extraction) to yield an image representation assumed to exist at an immediately pre-binocular stage in the visual system.

These filtered left and right eye image pairs were then cross-correlated to produce a cross-correlation function such as is illustrated in

[Figure 4: flow diagram with column headings "Visual stages" and "Simulation". Labels: Image array; Smoothing; Optics; Monocular cells; Differencing, averaging; Detection/localization decision; Downstream processing.]

Fig. 4. An illustration representing the flow of processing in the model. The input to the model is a series of one bit arrays, analogous to our stimuli. The arrays are then blurred by an amount typical of an emmetropic eye. Spatio-temporal averaging is then done to simulate the spatial and temporal integration of the visual system. Edge information is extracted by means of differentiation. Finally, the left and right eye "images" are cross-correlated to yield a function on which the binocular operations of correlation detection and localization can be performed. Not illustrated is the addition of noise, assumed to be present and intrinsic to the visual system, to the cross-correlation function.

Fig. 5, on which "psychophysical" judgments such as detection and localization could be performed. The particular function in Fig. 5 was generated from images with relatively high luminance contrast and interocular correlation (i.e. a high-contrast stereogram of a flat plane), and a relatively long period of temporal integration (which reduces noise from spurious matches across disparity). The central peak of the cross-correlation function naturally occurs at the disparity of the stimulus plane. The flanking troughs generally occur when edge representations (first derivatives) of images are used* (Nishihara, 1987; Stevenson, Cormack & Schor, 1991) and have been used to explain the attraction and repulsion effects (e.g. Westheimer & Levi, 1987) which occur in stereoscopic vision (Stevenson et al., 1991).

As stated above, the purpose for which this model was developed was to analyze the behavior of the cross-correlation function as stimulus contrast and interocular correlation were varied. Figures 6 and 7 show graphical examples of the output of the model in response to changing contrast and interocular correlation respectively (for purposes of analysis, of course, quantitative output was employed). In both figures, successive curves are displaced vertically for clarity. Each curve is based on a single run of the model, in which it received 4 stimulus frames over which to integrate. Conclusions from the modeling, however, were based on average values from a minimum of 1000 such runs per stimulus condition. Two important points emerged from the modeling, one pertaining to peak signal height and the other to the relative amount of noise due to spurious matches.

As luminance contrast is changed (Fig. 6), both the peak signal height and the noise level change dramatically. Specifically, both grow as the square of input image contrast, such that the signal-to-noise ratio of the cross-correlation function remains constant in the face of changing stimulus contrast.

[Figure 5: a cross-correlation function plotted against disparity.]

Fig. 5. Typical shape of a cross-correlation function for one bit, random element stimuli. Unrealistically long values of spatial and temporal integration were used to generate this example. This makes the underlying shape of the function more obvious by averaging out noise which normally results from "ghost" matches. The central peak occurs at the disparity (relative displacement) of highest correlation. The flanking side lobes occur at the disparity corresponding to a displacement of one stimulus element. The psychophysical tasks in our experiments are discussed in terms of functions such as this.

[Figure 6: panel title "Effect of Contrast"; family of cross-correlation functions plotted against disparity.]

Fig. 6. As the contrast of an image pair with some fixed interocular correlation is reduced, both the signal amplitude and the noise level are decreased by the square of contrast. Shown in this figure is a family of cross-correlation functions, the members of which differ in the contrast of the input image pair. The particular contrast of each input image pair is displayed to the right of the corresponding function. The interocular correlation of the input image pair was 80% in all cases, and the functions are displaced vertically for clarity.

*These side lobes occur because the differencing operation of edge extraction introduces a kind of spatial correlation into the stimulus. Consider, for example, an array A of length N containing random one bit values, and an associated array A' of length N - 1 containing the first derivative of A. Elements of A can either be 0 or 1, so the elements of A' can be -1, 0 or +1. If one is told the value of the nth element of A, one has been given no information concerning the value of either the (n - 1)th or the (n + 1)th elements. Similarly, if one is told that the value of the nth element of A' is "0", one has been given no information concerning the value of either the (n - 1)th or the (n + 1)th elements. If, however, one is told that the value of the nth element of A' is +1, then one knows that neither the (n - 1)th nor the (n + 1)th elements can have the value of +1; each must be either -1 or 0. The same argument (with signs flipped of course) holds if the nth element of A' is -1. Thus, in a difference array, adjacent elements are, on average, negatively correlated.
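The argument about difference arrays can be checked numerically. The sketch below is an illustration under assumptions: the helper name `lag1_correlation`, the array length and the random seed are hypothetical, and the bits are taken to be independent and equiprobable. Under those assumptions the adjacent elements of the original array A should be uncorrelated, whereas adjacent elements of the difference array A' should have a correlation near -0.5, and a +1 in A' should never be followed by another +1.

```python
import random

def lag1_correlation(values):
    """Sample correlation between each element and its right-hand neighbour."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    cov = sum((values[i] - mean) * (values[i + 1] - mean)
              for i in range(n - 1)) / (n - 1)
    return cov / var

rng = random.Random(2)

# A: random one-bit values (0 or 1); A': first difference, values in {-1, 0, +1}.
a = [rng.randint(0, 1) for _ in range(50000)]
a_prime = [a[i + 1] - a[i] for i in range(len(a) - 1)]

# Count adjacent pairs in A' that share the same non-zero sign (should be none).
same_sign_neighbours = sum(
    1 for i in range(len(a_prime) - 1)
    if a_prime[i] != 0 and a_prime[i] == a_prime[i + 1]
)

print("lag-1 correlation of A :", round(lag1_correlation(a), 3))
print("lag-1 correlation of A':", round(lag1_correlation(a_prime), 3))
print("adjacent same-sign edges in A':", same_sign_neighbours)
```

The negative neighbour correlation in A' is what would be expected to produce troughs flanking the central peak when two such edge representations are cross-correlated.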

As interocular correlation is changed (Fig. 7), the noise level remains constant. But the peak height changes in linear proportion to the interocular correlation. The signal-to-noise ratio of the cross-correlation function, therefore, changes as a linear function of interocular correlation. This is really a tautology since we defined interocular correlation as a cross-correlation, but it serves as a check on the model.

Thus, the amplitude of the cross-correlation function depends on both the stimulus contrast and its interocular correlation, whereas the noise level due to spurious matches is solely a function of stimulus contrast (given constant values of spatial and temporal integration). Based on this information, we would predict that correlation thresholds should be independent of stimulus contrast. The reasoning is as follows.

If our cross-correlation model provides a reasonable description of the stimulus strength on the "cyclopean retina", then an observer's threshold for the detection of a cyclopean stimulus will correspond to some critical value of signal-to-noise in the cross-correlation function. But since the signal-to-noise ratio of the cross-correlation function is independent of contrast, correlation threshold should also be independent of contrast. This is precisely the behavior that is observed in the data at relatively high contrasts (see Fig. 3).

In order to understand the behavior of the data at low contrasts, the effect of intrinsic noise must be considered (cf. Barlow, 1964). With any system, it is reasonable to assume the presence of internal noise, the human visual system being no exception. As is well known, intrinsic noise is present at even the most peripheral site of the visual system (thermal breakdown of the rhodopsin molecule, or "dark light") and represents a fundamental limit of visual efficiency (Barlow, 1964). Consider, then, the effect of some small but constant amount of noise, which is independent of stimulus contrast, added to the cross-correlation function. Such noise could either be intrinsic to the cyclopean retina, or could have a peripheral origin provided that the noise is uncorrelated between the two eyes (i.e. rhodopsin breakdown), or both. Here, we consider the simple case where noise is added at or beyond the site of binocular cross-correlation.* The cross-correlation function now becomes:

$$\mathrm{IOC}(d) = \int f(x)\,h(x + d)\,dx + n(x) \tag{3}$$

where n(x) is simply the noise.

Two distinct sources of noise are now present, noise due to spurious matches [present in the first term of equation (3)] and the intrinsic noise [the second term of equation (3)]. At high contrasts, the noise produced by spurious matches (which does vary with contrast, see Fig. 6) is presumably of much greater amplitude than the intrinsic noise and therefore is the major determinant of noise level. Thus the original argument, that the signal-to-noise ratio of the cross-correlation function is constant across contrast, remains unchanged for high contrasts.

*The case where peripheral noise is present is slightly more complex. If N_r(x) and N_l(x) are independent sources of noise present in the right and left eyes respectively, the cross-correlation function becomes:

$$\mathrm{IOC}(d) = \int f(x)\,h(x + d)\,dx + \int f(x)\,N_l(x)\,dx + \int h(x + d)\,N_r(x)\,dx + \int N_r(x)\,N_l(x)\,dx \tag{4}$$

The first term in equation (4) is identical to the first term in equation (3), which is simply equation (1). The last term in equation (4) is analogous to the last term in equation (3); it is simply noise with an amplitude which is independent of contrast. The two middle terms of equation (4), however, represent the product of a noise signal which is independent of contrast [N_r(x) or N_l(x)] and a noise signal (i.e. the stimulus) which is directly proportional to contrast. These terms, then, are noise signals with amplitudes directly proportional to contrast added to the cross-correlation function. Despite these additional terms, the argument given in text for high and low contrasts remains essentially unchanged. At high contrasts, the first term of equation (4) will be the dominant term. In other words,

$$\int f(x)\,h(x + d)\,dx \gg \int f(x)\,N_l(x)\,dx \approx \int h(x + d)\,N_r(x)\,dx > \int N_r(x)\,N_l(x)\,dx$$

The argument at low contrasts also remains unchanged, with the last term of equation (4) being the dominant term. In other words, as f(x) and h(x + d) become very small,

$$\int N_r(x)\,N_l(x)\,dx \gg \int f(x)\,N_l(x)\,dx \approx \int h(x + d)\,N_r(x)\,dx \approx \int f(x)\,h(x + d)\,dx \approx 0$$

The difference between equation (3) and equation (4) is manifest at intermediate contrasts where, in equation (3), one expects a sharper transition between the two limiting cases. Thus, one expects a two lobed function in either case, but the locus of added noise will affect the rate of transition between slope -2 and slope 0.
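The consequence of adding a contrast-independent noise term can be sketched numerically. The following is a toy calculation, not the authors' model: it assumes that the peak signal is proportional to IOC times the square of contrast, that spurious-match noise is proportional to the square of contrast, that intrinsic noise is constant, and that the two noise sources combine as the root of the sum of squares. The function names and every parameter value (`criterion`, `gain`, `match_noise`, `intrinsic_noise`) are hypothetical, and contrast is in arbitrary units.

```python
import math

def threshold_ioc(contrast, criterion=1.0, gain=1.0,
                  match_noise=0.05, intrinsic_noise=1.0):
    """Interocular correlation needed to reach a criterion signal-to-noise ratio.

    Assumed (illustrative) forms:
      signal          = gain * IOC * contrast**2
      spurious noise  = match_noise * contrast**2
      intrinsic noise = constant
      total noise     = sqrt(spurious**2 + intrinsic**2)
    """
    signal_per_unit_ioc = gain * contrast ** 2
    noise = math.hypot(match_noise * contrast ** 2, intrinsic_noise)
    return criterion * noise / signal_per_unit_ioc

def loglog_slope(c1, c2):
    """Slope of log(threshold) against log(contrast) between two contrasts."""
    return ((math.log(threshold_ioc(c2)) - math.log(threshold_ioc(c1)))
            / (math.log(c2) - math.log(c1)))

low_slope = loglog_slope(1.0, 2.0)      # intrinsic noise dominates here
high_slope = loglog_slope(50.0, 100.0)  # spurious-match noise dominates here
print("low-contrast slope :", round(low_slope, 3))
print("high-contrast slope:", round(high_slope, 3))
```

Under these assumptions the predicted threshold is flat on log-log axes where spurious-match noise dominates and falls with a slope approaching -2 where the constant intrinsic noise dominates; the exact shape of the transition between the two regimes depends on how the noise sources are assumed to combine.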

[Figure 7. Panel title: "Effect of Correlation". Abscissa: disparity. Curves are labeled 80%, 60%, 40%, 20% and 0%.]

Fig. 7. As the interocular correlation of an image pair with some fixed contrast is reduced, the signal amplitude is decreased proportionally while the noise level remains constant. Shown in this figure is a family of cross-correlation functions, the members of which differ in the interocular correlation of the input image pair. The particular interocular correlation of each input image pair is displayed to the right of the corresponding function. The contrast of the input image pair was 20% in all cases, and the functions are displaced vertically for clarity.

[Figure 8. Abscissa: contrast.]

Fig. 8. Schematic diagram showing the effect which intrinsic noise of constant amplitude would have on the detection of interocular correlation as a function of stimulus contrast. When contrast is sufficiently high, the amplitude of the false target noise is much greater than that of the internal noise. Thus the effective noise amplitude is simply the amplitude of the false target noise. Since the signal height and the false target noise vary conjointly as contrast is changed, the signal-to-noise ratio of the cross-correlation function remains constant and no change in correlation detection threshold is anticipated. As contrast is reduced, however, eventually a point is reached where the amplitude of the intrinsic noise is much greater than that of the false target noise. Thus the effective noise amplitude is simply the amplitude of the intrinsic noise, which is constant. Under these conditions, signal-to-noise ratio will vary directly with signal amplitude which, in turn, varies with the square of contrast (Fig. 6). But since signal-to-noise ratio is linearly related to interocular correlation (Fig. 7), one anticipates a square relation to determine the correlation/contrast combinations which will yield a given (e.g. threshold level) signal-to-noise ratio. This is reflected by the log-log slope of -2 in the low contrast region of the figure.

If one reduces contrast, however, one eventually reaches a point where the intrinsic noise (which is assumed not to vary with contrast) is of much greater amplitude than the noise produced by spurious matches. At this point, overall noise level is effectively constant since its amplitude is determined almost exclusively by the amplitude of the intrinsic noise. Signal-to-noise ratio, then, will be determined solely by the peak height, which varies linearly with correlation but as the square of contrast. It follows that the interocular correlation required to produce a given signal-to-noise ratio (e.g. that corresponding to threshold) would be inversely proportional to the square of the stimulus contrast. For example, if one starts with a cross-correlation function with a signal-to-noise ratio corresponding to threshold, and then reduces the contrast by a factor of 2, the signal-to-noise ratio will decrease by a factor of 4. To restore the threshold level signal-to-noise ratio, the interocular correlation would then have to be increased by a factor of 4. Thus, in the low contrast region, we would expect a log-log plot of correlation threshold as a function of contrast to show a slope of -2.

Combining the above reasoning from the low and high contrast regions, we expect the data to take the form shown in Fig. 8. At high contrasts, where the signal-to-noise ratio is independent of contrast, we expect the data to asymptote to a log-log slope of 0. At lower contrasts, where the signal-to-noise ratio varies as the square of contrast, we expect the data to asymptote to a log-log slope of -2.

Since the data (Fig. 3) conform to this expectation, it would seem that for dynamic random element stereograms, a cross-correlation of the two eyes' inputs, after accounting for such factors as optical low pass filtering ("blurring") and spatial and temporal summation, provides an adequate description of the stimulus at cyclopean levels of the visual system, i.e. the earliest level of the visual system where information from the two eyes is integrated. But how far past the stage of binocular combination can this analysis be continued? That is, can downstream binocular processes be treated as an analysis of, or operations performed on, a cross-correlation function? With this question in mind, we attempted to extend our experimentation into the hypercyclopean domain.
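The two-lobed prediction can be written as a small closed-form sketch. It assumes, beyond what the text states, that false-match noise and intrinsic noise are independent and therefore combine in quadrature; the function names, the arbitrary units, and the parameter values (which place the transition at a contrast of 1) are hypothetical choices for illustration only.

```python
import math


def correlation_threshold(contrast, criterion_snr=1.0, false_match_gain=0.1,
                          intrinsic_noise=0.1):
    """Interocular correlation needed to reach a criterion signal-to-noise ratio.

    Units are arbitrary; a returned value above 1 means that no physically
    realizable correlation reaches the criterion at that contrast.
    """
    # Peak height per unit correlation grows as the square of contrast.
    peak_per_unit_correlation = contrast ** 2
    # False-match noise also grows as the square of contrast.
    false_match_noise = false_match_gain * contrast ** 2
    # Assumption: the two noise sources are independent and add in quadrature.
    noise = math.hypot(false_match_noise, intrinsic_noise)
    return criterion_snr * noise / peak_per_unit_correlation


def loglog_slope(func, x, step=1.01):
    """Local slope of func on log-log axes, estimated by finite difference."""
    return (math.log(func(x * step)) - math.log(func(x))) / math.log(step)


low_slope = loglog_slope(correlation_threshold, 0.01)     # close to -2
high_slope = loglog_slope(correlation_threshold, 100.0)   # close to 0
# Halving a low contrast raises the required correlation about fourfold.
halving_cost = correlation_threshold(0.005) / correlation_threshold(0.01)
```

With these values the slope passes through -1 at the transition contrast; how sharp that transition is depends on the assumed way of combining the two noise sources.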

Experiment 2: Stereoacuity as a Function of Contrast and Interocular Correlation

In Experiment 2, we measured stereothresholds (stereoacuity) as a function of both luminance contrast and interocular correlation. The predictions of the cross-correlation model are as follows.

Consider a cross-correlation function such as illustrated in Fig. 5. The certainty with which the "true" position of the function can be determined depends on the signal-to-noise ratio of the function. For a given signal-to-noise ratio, however, a sharp peak can, in principle, be more precisely localized than a broad peak, so it is important to know how the second derivative of the cross-correlation function behaves. Further, the zero-crossings or loci of steepest slope could be used, along with assumptions of symmetry, to determine the position of the function (cf. Legge & Gu, 1989), so it is also of import to determine the behavior of the first derivative.

As it turns out, the height of both the first and second derivatives of the cross-correlation function is simply a linear function of the height of the cross-correlation function (as is the case with sinusoids). This is convenient, because it means that in order to express the precision with which the position of the cross-correlation function can be localized (without invoking disparity domain interactions), it is necessary only to specify the signal-to-noise ratio of the cross-correlation function.

As we already know, both the height of the cross-correlation function and the signal-to-noise ratio grow linearly with increasing interocular correlation (Fig. 7). Thus, we predict that stereoacuity should simply be a linear function of interocular correlation.

The predicted slope for stereoacuity as a function of contrast, however, depends on the absolute contrast level. At higher contrasts, where false target noise is the dominant source of noise, both the noise amplitude and the peak signal height are proportional to the square of contrast such that the signal-to-noise ratio remains constant (Fig. 6). At lower contrasts, where intrinsic noise is the dominant source of noise, both the peak signal height and the signal-to-noise ratio vary as the square of contrast, since noise amplitude is effectively constant. Thus, at low contrasts, the cross-correlation model predicts that stereoacuity will vary in proportion to the square of contrast, whereas at higher contrasts the model predicts that stereoacuity will be independent of contrast.

Methods

The methods of this experiment are essentially the same as those of the first experiment. The 3 authors served as subjects and a temporal 2AFC method of constant stimuli paradigm was employed. The observer's task, however, was to indicate which of the two temporal intervals contained a horizontal step change in disparity approximately halfway down the stimulus. For determining stereoacuity as a function of contrast, interocular correlation was set at 80%, and contrast was varied over a broad range. For determining stereoacuity as a function of interocular correlation, contrast was set at 12% (Michelson definition, as measured on a static pattern) and correlation was varied from either 30% (LKC and SBS) or 60% (CMS) up to 90%. Thresholds for each run (where a "run" is the same as defined in Experiment 1) were defined as 75% correct on the psychometric function, and the thresholds from 5 runs per subject at each contrast/correlation combination were obtained.

[Figure 9. Abscissa: contrast (threshold multiples).]

Fig. 9. Stereoacuity as a function of contrast for the 3 subjects. Both axes are logarithmic and contrast is expressed in threshold multiples. Over almost a full log unit of contrast, the data can be described by a cube root law contrast dependence, roughly in accordance with the data of previous investigators. At lower contrasts, however, the data more closely follow a square law dependence.

Results

Stereothresholds as a function of contrast are shown for all 3 subjects in Fig. 9. The high contrast portion of the data (above 5 contrast threshold multiples, say) is essentially a replication of both Halpern and Blake (1988) and
Legge and Gu (1989). In this portion, stereoacuity improves roughly in proportion to the cube root of contrast (i.e. a log-log slope of -1/3). In the low contrast region, however, the data asymptote to a log-log slope of -2, indicating that stereoacuity is proportional to the square of contrast.

Figure 10 shows how stereoacuity varies as a function of both interocular correlation [panel (b)] and contrast [panel (a)] for comparison. Note that Fig. 10(a) is simply a plot of the low contrast (< 5 threshold multiples) branch of the data from Fig. 9. As can be seen from this figure, the stereoacuity data are linearly dependent on interocular correlation and, for low contrasts, are proportional to the square of contrast. The solid lines in the figure show the slopes predicted by the model.

[Figure 10. Panel (a) abscissa: contrast (threshold multiples). Panel (b) abscissa: % interocular correlation.]

Fig. 10. Stereoacuity plotted as a function of contrast (a) and interocular correlation (b) for the 3 subjects [note that panel (a) is simply a replot of the low contrast data from Fig. 9]. All axes are logarithmic. The aspect ratios of the two graphs are the same, allowing comparison of the slopes. The slopes predicted by the model are shown by the solid lines (the vertical position of the solid lines is arbitrary).

Discussion

The form of the cross-correlation model with which we are currently working is intended to provide a description of the stimulus at and immediately beyond the site of binocular combination, i.e. at the cyclopean and early hypercyclopean stages of visual processing. The main assumption of the model is that the mechanism of binocular combination is inherently multiplicative. Models of binocular combination which are additive would predict a linear dependence on contrast under conditions in which we find a square dependence, viz. correlation detection at low contrast (Experiment 1) and stereoacuity at low contrast (Experiment 2).

A similar cross-correlation model correctly predicts attraction and repulsion in the stereo domain (Stevenson et al., 1991) based solely on a cyclopean level stimulus description, i.e. without invoking hypercyclopean processing such as disparity domain interactions. In addition, any cross-correlation based model qualitatively predicts the "reversed" depth seen from "negative stereograms" as reported by Rogers and Anstis (1975).

There remains, however, the fact that at high contrasts, stereothresholds change in proportion to the cube root of contrast, whereas our model predicts no contrast dependence. Our model, however, does not incorporate parallel spatial channels into its front end. Yet the evidence for such spatial channels in the visual system is, to say the least, robust (see DeValois & DeValois, 1988, for a review). Since human contrast detection threshold varies over roughly a 2 log unit range as spatial frequency is varied, it could be argued that the improvement in stereoacuity as a function of increasing contrast is simply a reflection of the input of higher spatial frequency channels.

Standing against this hypothesis, however, are at least two lines of evidence. First, Schor, Wood and Ogawa (1984) found that maximum stereoacuity was obtained using DOGs with a center frequency of 2-3 c/deg; further increase in spatial frequency did not improve stereoacuity. Since the peak of the CSF generally falls at around 3 c/deg, maximum stereoacuity is probably derived from the output of the same spatial channel which is responsible for the absolute contrast threshold for a broad band target, i.e. the channel centered at the peak of the CSF. Increasing the suprathreshold contrast of a broad band target, and thus activating channels centered at higher spatial frequencies, stands only to increase stereoacuity via probability summation. This possibility need not be seriously entertained, however, given the second line of evidence against a significant contribution of high spatial frequency information. This is simply that both Halpern and Blake (1988) and Legge and Gu (1989) found a square root dependence of stereoacuity on contrast using narrow band targets (the specific targets employed were tenth derivatives of gaussians, mercifully known as D10s, and truncated sinusoids, respectively).

A second mechanism which could subserve the improvement of stereoacuity with increasing contrast is a compressive contrast response function inherent in the stereopsis "mechanism". In its current form, our model predicts the same shape for the data in both the correlation detection and stereoacuity tasks, because performance on both tasks is assumed to reflect the signal-to-noise ratio of the same cross-correlation function. If, however, an appropriate compressive contrast response function was inserted at some cyclopean stage prior to stereo localization, the high contrast branch of the model would asymptote more gradually, yielding a curve resembling the data of Fig. 9. This would indicate either that the mechanisms of correlation detection and stereoacuity each have a different contrast response and operate in parallel, or that the cross-correlation function is passed through a compressive transducer function after the stage at which correlation detection occurs but before the stage at which stereo localization occurs.

The latter alternative is perhaps the more plausible one on the grounds that stereolocalization is probably the primary reason that a cross-correlation operation would be performed. A second cross-correlation operation occurring in parallel would, therefore, be superfluous in all but artificial tasks of correlation detection such as in the present experiments (it is conceivable that a parallel cross-correlation operation could be occurring in the eye movement pathway, but it is unlikely that this would be a factor in a psychophysical task).

A third mechanism which could bring about an improvement of stereoacuity with increasing contrast is inhibition across disparity tuned "units", be they the individual cells of Poggio and Poggio (1984), the channels of Stevenson et al. (1990), or the detectors of Tyler (1983). Consider the possibility that cross-correlation functions are encoded as relative activity in an array of such units located along the disparity axis. Without disparity domain inhibition, a graph of unit activity vs peak disparity tuning of the unit for various contrasts would simply look like Fig. 6. With the addition of inhibition between the units, however, the signal-to-noise ratio would continue to improve as contrast increased, rather than remaining constant. The nature of this inhibition could be "tweaked" in one's model to yield whatever contrast dependence was desired. The argument at low stimulus contrasts, however, need not be altered by the addition of inhibition. At low stimulus contrasts, where intrinsic noise seems to effectively determine the overall noise amplitude, the activity of the disparity-tuned units outside of the stimulus plane could be so low that their influence through inhibitory pathways would be negligible.

The presence of some sort of disparity domain inhibition is also supported by the phenomenology of random element stereograms. As illustrated in Fig. 7, noise in the cross-correlation function due to false dot matches is unaffected by stimulus correlation. Thus, when disparity domain inhibition is not considered, the activity of a unit tuned to a disparity other than that of, or immediately adjacent to, the stimulus plane, would be unaffected by changing the interocular correlation. Perceptually, however, this is not the case. When viewing a random element stimulus of 0 interocular correlation, a voluminous "swarm" of dots is perceived. But when interocular correlation is raised to 100% at some disparity, the only thing that is seen is a flat plane at that disparity; no false matches are seen at all. Therefore, if it is assumed that some process like a cross-correlation occurs at the site of binocular combination, then there must also be some process to "clean up" the spurious disparities before a site is reached where the neural activity is reflected perceptually.

Since cross-correlation is an inherently multiplicative mathematical operation, our model is somewhat at odds with additive models of binocular contrast combination (e.g. Legge & Gu, 1989). Dynamic random element stimuli, however, by their very nature isolate a highly specialized subset of the visual system. It would be premature, therefore, to place too much weight on apparent contradictions with results obtained with other sorts of stimuli.
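The difference between multiplicative and additive combination at low contrast can be illustrated with a minimal sketch. It assumes constant noise, a signal that is linear in interocular correlation, and equal contrast in the two eyes; the function name and the two toy combination rules are hypothetical stand-ins, not implementations of any published model.

```python
import math


def threshold_slope(combine, contrast=0.01, noise=1.0, step=1.01):
    """Log-log slope of the criterion correlation as a function of contrast.

    With constant noise and a signal linear in correlation, the correlation
    needed for a fixed signal-to-noise ratio is noise / combine(c, c), where
    combine() gives the binocular signal for the two monocular contrasts.
    """
    def threshold(c):
        return noise / combine(c, c)

    rise = math.log(threshold(contrast * step)) - math.log(threshold(contrast))
    return rise / math.log(step)


# Multiplicative combination: signal grows as contrast squared, slope near -2.
multiplicative_slope = threshold_slope(lambda left, right: left * right)
# Additive combination: signal grows linearly with contrast, slope near -1.
additive_slope = threshold_slope(lambda left, right: left + right)
```

Under these assumptions only the multiplicative rule reproduces the square dependence on contrast described for the low contrast data.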

In conclusion, our psychophysical results indicate that interocular correlation is an important controlling variable at both the cyclopean and hypercyclopean stages of visual processing. Further, a model which utilizes the operation of cross-correlation as the means of binocular combination provides a good description of the stimulus in both the cyclopean and hypercyclopean domains insofar as it accounts for and/or predicts psychophysical data from both of these domains.

Acknowledgements-This work was supported in part by grant # EYO 06045 to SBS and CMS and grant # RO1-EY03532 to CMS. The authors wish to thank Gordon Legge for his invaluable comments and J. Malik and D. Jones for helpful conversation.

REFERENCES

Barlow, H. B. (1964). The physical limits of visual discrimination. In Giese, A. C. (Ed.), Photophysiology (Vol. 2, pp. 163-201). New York: Academic Press.
Cogan, A. I. (1978). Fusion at the site of the "ghosts". Vision Research, 18, 657-664.
DeValois, R. L. & DeValois, K. K. (1988). Spatial vision. New York: Oxford University Press.
Green, D. M. & Swets, J. A. (1966). Signal detection theory and psychophysics. New York: Wiley.
Halpern, D. L. & Blake, R. (1988). How contrast affects stereoacuity. Perception, 17, 483-495.
Heckman, T. & Schor, C. M. (1989). Is edge information for stereoacuity spatially channeled? Vision Research, 29, 593-607.
Julesz, B. (1971). Foundations of cyclopean perception. Chicago: University of Chicago Press.
Legge, G. E. & Gu, Y. (1989). Stereopsis and contrast. Vision Research, 29, 989-1004.
Nishihara, H. K. (1987). Hidden information in transparent stereograms. Proceedings of the Twenty-first Asilomar Conference on Signals, Systems & Computers, 21, 695-700.
Poggio, G. F. & Poggio, T. (1984). The analysis of stereopsis. Annual Review of Neuroscience, 7, 379-412.
Rogers, B. J. & Anstis, S. M. (1975). Reversed depth from positive and negative stereograms. Perception, 4, 193-201.
Schor, C. M., Wood, I. C. & Ogawa, J. (1984). Spatial tuning of static and dynamic local stereopsis. Vision Research, 24, 573-578.
Stevenson, S. B., Cormack, L. K. & Schor, C. M. (1991). Depth attraction and repulsion in random dot stereograms. Vision Research, 31, 805-813.
Stevenson, S. B., Cormack, L. K., Schor, C. M. & Tyler, C. W. (1990). Disparity tuned channels in human stereopsis. Investigative Ophthalmology and Visual Science (Suppl.), 31, 95.
Tyler, C. W. (1983). Sensory processing of binocular disparity. In Schor, C. M. & Ciuffreda, K. J. (Eds), Vergence eye movements: Basic and clinical aspects (pp. 199-295). London: Butterworth.
Westheimer, G. & Levi, D. M. (1987). Depth attraction and repulsion of disparate foveal stimuli. Vision Research, 27, 1361-1368.