
(11) EP 2 290 949 B1

(12) EUROPEAN PATENT SPECIFICATION

(45) Date of publication and mention of the grant of the patent: 04.12.2019 Bulletin 2019/49

(51) Int Cl.: H04N 5/232 (2006.01), H04N 5/235 (2006.01)

(21) Application number: 10012182.1

(22) Date of filing: 17.12.2004

(54) Method, medium and system for optimizing capture device settings through depth information

(84) Designated Contracting States: AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU MC NL PL PT RO SE SI SK TR

(72) Inventor: Marks, Richard, Foster City, CA 94404-2175 (US)

(30) Priority: 16.01.2004 US 758817

(43) Date of publication of application: 02.03.2011 Bulletin 2011/09

(62) Document number(s) of the earlier application(s) in accordance with Art. 76 EPC: 04814607.0 / 1 704 711

(74) Representative: D Young & Co LLP, 120 Holborn, London EC1N 2DY (GB)

(56) References cited: EP-A- 0 750 202, WO-A-01/18563, FR-A- 2 832 892, US-A- 6 057 909, US-A1- 2003 100 363

(73) Proprietor: Sony Interactive Entertainment Inc. Tokyo 108-8270 (JP)

Note: Within nine months of the publication of the mention of the grant of the European patent in the European Patent Bulletin, any person may give notice to the European Patent Office of opposition to that patent, in accordance with the Implementing Regulations. Notice of opposition shall not be deemed to have been filed until the opposition fee has been paid. (Art. 99(1) European Patent Convention).

Printed by Jouve, 75001 PARIS (FR)

Description

BACKGROUND OF THE INVENTION

1. Field of the Invention

[0001] This invention relates generally to image capture techniques, and more particularly to enhancing a captured image of a scene by adjustments enabled through depth information.

2. Description of the Related Art

[0002] Image capture devices, whether cameras or video based devices, typically have a limited contrast ratio, which is a measure of the difference between the darkest and lightest parts of a scene. One exemplary scene may include a person in the shade and a background having bright sunlight. When the background of this scene is exposed correctly for the image capture device, there is little or no detail in the shaded person's face.

[0003] Auto-exposure and auto-gain features are commonly used to set brightness levels for the capture device. These features tend to take the entire scene and average it to apply a certain exposure or gain setting. While the averaging may work well for a scene having a great deal of images and colors, this scheme quickly breaks down as the scene has less variety.

[0004] One attempt to address the limited contrast ratio of current capture devices is through the use of a backlight compensation feature. For instance, where there is a bright light source in the background, e.g., sunlight, backlight compensation will take the center of the scene and use that region as the average. By doing this, the center of the scene may be brightened while the bright sunlight on the edges becomes washed out or darkened. The shortcoming with backlight compensation is that the object to be brightened must be in the center of the scene. In addition, a region of the scene is used for computing the average, rather than the actual object itself, which may cause some display artifacts. Furthermore, backlight compensation does not provide a solution where there are multiple foreground images in different regions of the scene. Additionally, with backlight compensation the foreground object is brightened, however this is done at the expense of the detail in the background. Thus, a user is required to choose between foreground detail and background detail. Some of these shortcomings may be extended to video capture devices which may be used for interactive entertainment applications. For example, where an image of a user is incorporated into a video game, a bright light source may adversely affect the displayed image as described above. This adverse impact may prevent the tracking of an object of the image, in addition to displaying a poor quality image.
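A minimal sketch of these two related-art averaging strategies, assuming a normalized single-channel luminance array and illustrative function names, target value and centre window (none of which are part of the specific teaching above), is given below:

    import numpy as np

    def global_average_gain(luma: np.ndarray, target: float = 0.5) -> float:
        # Simple auto-exposure: average the entire scene and derive one gain
        # that drives the mean luminance toward the target value.
        return target / max(float(luma.mean()), 1e-6)

    def backlight_compensation_gain(luma: np.ndarray, target: float = 0.5) -> float:
        # Backlight compensation: only a fixed centre window contributes to
        # the average, so objects outside the centre region are ignored.
        h, w = luma.shape
        centre = luma[h // 4 : 3 * h // 4, w // 4 : 3 * w // 4]
        return target / max(float(centre.mean()), 1e-6)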


[0005] Accordingly, there is a need to solve the problems of the prior art to provide a system and method for producing an image that has the exposure/gain and other related parameters adjusted for both the foreground and background.

[0006] US 6057909 discloses an apparatus for creating an image indicating distances to objects in a scene, comprising: a modulated source of radiation, having a first modulation function, which directs radiation toward a scene; a detector, which detects radiation reflected from the scene, modulated by a second modulation function, and generates, responsive to said detected modulated radiation, signals responsive to the distance to regions of the scene; a processor, which receives signals from the detector and forms an image, based on the signals, having an intensity value distribution indicative of the distance of objects from the apparatus; and a controller, which varies at least one of the first and second modulation functions, responsive to the intensity value distribution of the image formed by the processor. One implementation of US 6057909 includes a camera, an image analyzer, a system control unit and a video display.

SUMMARY OF THE INVENTION

[0007] Broadly speaking, the present invention fills these needs by providing a method and system that enables adjustment of segments of a scene, e.g., foreground and background images, where the foreground and background images are identified through a depth mask. According to the present invention there is provided a method for adjusting image capture settings for a video capture device during execution of an interactive gaming application according to claim 1. According to further aspects there are provided a corresponding medium (claim 10) and a corresponding system (claim 11).

[0008] Other aspects and advantages of the invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, illustrating by way of example the principles of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

[0009] The invention, together with further advantages thereof, may best be understood by reference to the following description taken in conjunction with the accompanying drawings.

Figure 1 is a simplified schematic diagram illustrating a scene having foreground and background objects, which is captured through an image capture device and subsequently displayed in accordance with one embodiment of the invention.

Figure 2 is a simplified schematic diagram illustrating the generation of a depth mask for use in discerning between foreground and background objects in accordance with one embodiment of the invention.

Figures 3A and 3B are simplified schematic diagrams illustrating the amount of detail enabled in defining foreground objects in accordance with one embodiment of the invention.

Figure 4 is a simplified schematic diagram illustrating a captured image which is enhanced through a generated mask to define background and foreground images in accordance with one embodiment of the invention.

Figure 5 is a simplified schematic diagram of an interactive entertainment system which utilizes the mask generation in order to more effectively track a user in accordance with one embodiment of the invention.

Figure 6 is a simplified schematic diagram of an image capture device in accordance with one embodiment of the invention.

Figure 7 is an alternative schematic diagram of an image capture device having logic configured to differentiate between foreground and background images in the invention.

Figure 8 is a flow chart diagram illustrating the method operations for adjusting image capture settings for an image capture device in accordance with one embodiment of the invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0010] An invention is disclosed for a system and method for differentiating between foreground and background objects of a scene and subsequently adjusting image or video characteristics based upon whether the objects are located in the foreground or background. Alternatively, the image or video characteristics may be adjusted based upon the relative distance between the objects and the image capture device. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without some or all of these specific details. In other instances, well known process steps have not been described in detail in order not to unnecessarily obscure the present invention.

[0011] The embodiments of the present invention provide a method and system that eliminate the need for the user to choose between foreground and background objects of a scene. Through the use of depth information, the scene may be segmented into regions of different depths. In addition, the depth information allows for the definition of an exact outline of the image, thereby providing a specific and accurate mechanism for controlling image capture device parameters, e.g., exposure, gain, brightness and focus. The segmentation based upon depth information makes it possible to assign different parameter values to different pixel regions for a digital capture device. Thus, an image having the exposure/gain adjusted properly for both the foreground and background is enabled through the embodiments described below. The segmentation based upon depth is captured through a foreground/background depth mask which may be created through an image capture device having depth capability or through a light pulse/flash with a time of flight cut-off technique, both discussed in more detail below.

[0012] Figure 1 is a simplified schematic diagram illustrating a scene having foreground and background objects which is captured through an image capture device and subsequently displayed in accordance with one embodiment of the invention. Image capture device 100 is configured to capture an image of a scene in which a person 102 is in the foreground and background scenery 104. The captured image of the scene is then displayed on display panel 106. Display panel 106 may be a display panel affixed to image capture device 100, e.g., a liquid crystal display (LCD) panel where the image capture device is a digital camera or camcorder. Alternatively, display panel 106 may be remote from image capture device 100, e.g., a television screen where the image capture device is used in conjunction with a computing device, such as a game console. As will be described in more detail below, foreground image 102 and background scenery 104 are capable of having their corresponding image or video characteristics independently compensated irrespective of their position in either the foreground or the background. While a single foreground image 102 is shown in Figure 1, it should be appreciated that multiple foreground images may be captured. The image or video characteristics for each of the multiple foreground images may be independently adjusted based upon depth information. As used herein, image or video characteristics may refer to brightness, exposure, gain, focus and other suitable characteristics capable of being adjusted for image display. It should be appreciated that image or video characteristics may be referred to simply as characteristics, and correspond to the inherent image data which improves the display quality of the image data through the embodiments described herein. Additionally, image capture device 100 may be a digital still camera, a single lens reflex camera, a video capture device, such as a web cam or camcorder, or any other suitable image capture device.

[0013] Image capture device 100 of Figure 1 is capable of generating and utilizing a mask in order to identify objects as being within a foreground or background region, as will be described in more detail below. This mask can then be used in order to compensate for the foreground and background regions, in order to provide a subsequent display which shows details for objects within both regions. For example, the use of backlight compensation to reduce the impact of a bright light source on the scene, such as sun 108, causes details to be defined for foreground objects, i.e., object 102, while the background images are washed out. While sun 108 is illustrated on display panel 106, it is shown for exemplary purposes and a scene need not include an actual image of the sun to be adversely impacted by the light originating from the sun. Without backlight compensation, foreground objects would be darkened and lose their corresponding detail in the resulting display. With the depth mask capability described in more detail below, the exact location of foreground and background objects in the scene may be determined. This location may be translated to a resulting image of the scene in order to manipulate corresponding pixel values to enhance the resulting image. In addition, image capture device settings, which include mechanical and electrical settings that affect the image or video characteristics of the resulting image, may be adjusted in order to provide optimized settings for the scene.

[0014] Where image capture device 100 is a video capture device, e.g., a web cam, the enhanced functionality enabled through the depth information provided by the mask may be applied to frames of the captured video in order to improve or enhance the image display. For example, where the video capture device is used to track an object or person subsequently incorporated into an interactive entertainment application, the mask may be applied to prevent difficulty encountered when tracking the object or person in the presence of a bright light source. With respect to a video game application in the home environment, such as the EYETOY™ application owned by the assignee, a user being tracked and incorporated into the video game may be positioned in front of a window. As explained below in more detail with reference to Figure 4, if the window is allowing light from a bright light source through the window, then the user may become washed out and the window will become the focus of the capture device. It should be appreciated that backlight compensation techniques will not be effective here if the user is not in the center of the capture region.

[0015] Figure 2 is a simplified schematic diagram illustrating the generation of a depth mask for use in discerning between foreground and background objects in accordance with one embodiment of the invention. It should be noted that the terms "depth mask" and "mask" are interchangeable as used herein and may include multiple depth layers. For example, the foreground and the background represent two depth layers; however, the scene may be segmented into more than two depth layers. Image capture device 100 includes light source 110. In one embodiment, light source 110 sends out a burst or pulse of light which is reflected by foreground objects 114 and 116. This reflected light is eventually captured by a sensor located behind lens 112 of image capture device 100. Of course, light source 110 may be a flash commonly used for cameras. One skilled in the art will appreciate that the sensor may be located anywhere on image capture device 100 that is capable of receiving the reflected light from the foreground objects within the scene for a defined time period.

[0016] As the speed of light is known, image capture device 100 of Figure 2 may be configured to pulse the burst of light from light source 110 and open an aperture of image capture device 100 so that the reflected light from the foreground objects is received. The aperture will stay open for a predefined amount of time. The predefined amount of time is set so that light traveling from light source 110 and reflected back to image capture device 100 travels a defined maximum amount of distance. The maximum distance from image capture device 100 is illustrated as line 117. Therefore, any light which is reflected from a source past line 117 will not be received by the image capture device, as the aperture is closed prior to this reflected light reaching the sensor of the image capture device. Of course, the ambient light, i.e., the light not generated by the burst of light from the light source, is subtracted from the received light.

[0017] Various techniques may be used for determining the foreground objects through the time of flight. One technique is through the use of a frequency of light not present in the ambient light. Alternatively, an image of the scene may be taken without the light on, then an image taken with the light from the light source. The light generated by the light source may then be determined by subtracting the light not generated by the light source, i.e., the image taken without the light on, from the image taken with the light source. In yet another alternative, the amount of light reflected from the light source may be distinguished from ambient light by establishing a threshold of how much light must strike each pixel. Thus, a value which is less than the threshold would not be considered as light originating from the device, and values greater than or equal to the threshold would be considered as originating from the light source of the device. Still another alternative employs the use of a modulated light source. Here, the light from the light source is generated in a modulated format, e.g., a sine wave. The frequency is chosen so that no more than one period of the modulation covers the entire range from the light source and back to the device.

[0018] In one embodiment, the maximum amount of distance is defined as about four meters from the image capture device. From this data, a depth mask is created and stored in memory of the image capture device. This depth mask may then be used in conjunction with a simultaneous or subsequent captured image of the scene in order to compensate for the image or video characteristics for the foreground and background objects accordingly. It will be apparent to one skilled in the art that light source 110 may emit any suitable wavelength of light. In one embodiment, infrared light is emitted from light source 110.

[0019] In another embodiment, the depth mask defined through the reflected light is a binary bit mask. Here, a first logical value is assigned to locations in the mask associated with foreground images, while a second logical value is assigned to locations associated with background images. Thus, where image capture device 100 is a digital device, pixel data for an image associated with the depth mask may be manipulated to adjust the brightness of the foreground and background images. Where the image capture device is a traditional camera, foreground and background images may be detected through the burst of light scheme described above. Based on the detection of the foreground and background images, the exposure, gain, brightness, focus, etc., settings of the camera may be adjusted prior to taking a picture of the scene. As mentioned above, the aperture size may be changed to manipulate the amount of light received by the image capture device. Of course, other mechanical and electrical settings may be adjusted where the mechanical or electrical settings impact the resulting photograph quality. Thus, both the foreground and background properties may be adjusted rather than having to choose between the foreground and the background.

[0020] Figures 3A and 3B are simplified schematic diagrams illustrating the amount of detail enabled in defining foreground objects in accordance with one embodiment of the invention. Figure 3A illustrates display screen 120 having a foreground object defined through rectangular region 122. Figure 3B shows display screen 120 illustrating a foreground object 124 for which a mask has been defined, as described herein, in order to capture the exact outline of the foreground image. That is, with current auto focus, auto gain, or backlight compensation techniques, the center of a scene at which an image capture device is targeted is generally represented as an area and is incapable of outlining the exact image. Thus, as illustrated in Figure 3A, rectangular region 122 includes the foreground object as well as other image data. Furthermore, the foreground object must be within a center region of the image or the auto focus, auto gain, or backlight compensation features will not work. In contrast, the depth mask captures any foreground object irrespective of its location within the scene. Moreover, the foreground object is captured without any additional image data being included. As mentioned above, for a digital device, the image or video characteristics for any foreground object may be manipulated by adjusting pixel values. With respect to a traditional film camera, the gain, exposure, focus, and brightness may be manipulated through mechanical or electrical adjustments responsive to the depth mask.

[0021] Figure 4 is a simplified schematic diagram illustrating a captured image which is enhanced through a mask generated to define background and foreground images in accordance with one embodiment of the invention. Here, image scene 128 may be a scene captured through an image capture device such as a video cam or a web cam for an interactive gaming application where participant 130 is incorporated into the interactive gaming application. An exemplary interactive gaming application is the EYETOY™ interactive game application. Here, participant 130 is standing in front of a web cam or some other suitable video capture device. Behind participant 130 is window 132. It should be appreciated that where bright light is shining through window 132, the resulting image of participant 130 captured by the image capture device will become darkened. In an interactive video game application where tracking the user is important, the tracking will become difficult where the bright light darkens the image of the user. Thus, where the video cam incorporates the embodiments described herein, the user will be able to be tracked more easily. That is, a mask generated as described above may be used to manipulate the pixel values to reduce the brightness.

[0022] Figure 5 is a simplified schematic diagram of an interactive entertainment system which utilizes the generated mask in order to more effectively track a user in accordance with one embodiment of the invention. Here, image capture device 100 is configured to capture an image of user 134 in order for the user's image to be displayed on display screen 136. Image capture device 100 is in communication with computing device 138, which in turn is in communication with display screen 136. As can be seen, image 135 of user 134 is displayed on display screen 136. Thus, as user 134 moves, this movement is captured through image capture device 100 and displayed on display screen 136 in order to interact with the entertainment application. As mentioned above, the image capture device is configured to compensate for bright light entering through window 132.

[0023] Still referring to Figure 5, image capture device 100 is a video capture device. Here, the pixel data associated with each video frame may be adjusted according to a corresponding depth mask. In one embodiment, a depth mask is generated for each video frame. In another embodiment, the depth mask is generated every x number of frames, where x may be any integer. For the frames not associated with a mask in this embodiment, the image or video characteristics from the last previous frame associated with a mask are applied to the frames not associated with a mask. Thus, the image or video characteristics may be frozen for a certain number of frames until a new mask is generated. It will be apparent to one skilled in the art that the processing for the functionality described herein may be performed by a processor of computing device 138. However, the depth mask may be generated by image capture device 100 and stored in memory of the image capture device. Of course, the image capture device would contain a microprocessor for executing the functionality for generating the depth mask and adjusting the image or video characteristics or adjusting the device parameters.
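A minimal sketch of the frame-interval scheme of paragraph [0023], assuming hypothetical generate_depth_mask and adjust_pixels helpers standing in for the depth logic and pixel adjustment described herein, might look as follows; a fresh mask is generated every x frames and reused for the frames in between:

    def process_video(frames, x, generate_depth_mask, adjust_pixels):
        # generate_depth_mask and adjust_pixels are placeholders for the
        # depth-capturing logic and the pixel-value adjustment described above.
        mask = None
        adjusted = []
        for index, frame in enumerate(frames):
            if mask is None or index % x == 0:
                mask = generate_depth_mask(frame)        # new mask every x frames
            adjusted.append(adjust_pixels(frame, mask))  # reuse the last mask otherwise
        return adjusted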

[0024] Image capture device 100 of Figure 5 may generate the mask through the techniques described with reference to Figure 2; however, image capture device 100 may alternatively include depth capturing logic, such as 3DV SYSTEM's ZCAM™ or similar commercially available products. The depth capturing logic includes a sensor that captures the depth value of each pixel in a scene in order to create a depth mask to be used as discussed herein. It should be noted that while a single user 134 is depicted in Figure 5, multiple users may be incorporated in the embodiments described here. Since the depth mask enables adjustment of both foreground and background object image or video characteristics, it is not required that user 134 be located in the middle or any other particular area of the capture region for image capture device 100. It should be further appreciated that one exemplary system represented by Figure 5 is the EYETOY™ system mentioned above.

[0025] Figure 6 is a simplified schematic diagram of an image capture device in accordance with one embodiment of the invention. Image capture device 100 includes depth logic 140, image capture device logic 142, and memory 144, all in communication with each other. As described herein, depth logic 140 includes circuitry configured to generate a mask in order for image capture device 100 to enhance a captured image with the assistance of the depth information. For example, depth logic 140 may generate the mask in order to differentiate between foreground and background objects within an image scene, and this mask will be stored in memory 144. Then, a corresponding scene of the image that is captured and processed by image capture device logic 142 will be enhanced. That is, certain image or video characteristics are manipulated as described herein depending on whether an object within the scene is located in the foreground or background, as determined by the depth mask. In one embodiment, depth logic 140 is activated by button 141 or some other suitable activation mechanism. Thus, a user has the option of activating the depth logic for enhanced image presentation, or bypassing the enhanced image presentation.

[0026] Figure 7 is an alternative schematic diagram of an image capture device having logic configured to differentiate between foreground and background images in the invention. Image capture device 100 includes lens 150, behind which is charge coupled device (CCD) 152. Depth logic 140, microprocessor unit (MPU) 148, and memory 144 are also included. Image capture device 100 includes display panel 154. It will be apparent to one skilled in the art that while image capture device 100 is depicted as a digital camera in Figure 7, the invention is not limited to a digital camera. Depth logic module 140 may be included in a video capture device in order to adjust image or video characteristics of each frame or every xth frame.

[0027] Figure 8 is a flow chart diagram illustrating the method operations for adjusting image capture settings for an image capture device in accordance with one embodiment of the invention. The method initiates with operation 160 where a scene is identified. Here, an image capture device may be used to identify a scene defined by a capture region. Of course, the image capture device may be a video capture device. The method then advances to operation 162 where a depth mask of the scene is generated for segmentation of foreground and background regions. In one embodiment, the depth mask is generated by pulsing light and capturing reflections from an object within a certain distance, as described with reference to Figure 2. Here the light may be infrared light. In another embodiment, the image capture device includes depth logic capable of capturing a depth value for each pixel. One exemplary image capture device with depth logic is the ZCAM™ mentioned above. The method then proceeds to operation 164 where an image of the scene is captured and this captured image corresponds to the depth mask. It should be appreciated that for the ZCAM™ embodiment, operations 162 and 164 are performed simultaneously. The method then moves to operation 166 where pixel values of objects within either, or both, of the foreground and background regions of the captured image are adjusted. This adjustment is based upon the depth mask defined above.

[0028] For example, the depth mask may be defined through bit values where a first bit value is assigned to foreground objects and a second bit value is assigned to background objects. In one embodiment, the adjustment then enhances the brightness of foreground objects while decreasing the brightness of background objects where a bright light source exists. Where the image capture device is not a digital device, e.g., an SLR camera, mechanical or electrical adjustments of the image capture device parameters may be made as a result of the foreground and background objects identified by the bit mask. These mechanical or electrical adjustments may include defining an aperture size corresponding to a certain exposure level, lens settings for a particular focus level, etc. In another embodiment, the pixel values are adjusted according to depth information included with the image data, i.e., distance information tagged to each pixel of the image data. One skilled in the art will appreciate that the aperture size may be controlled mechanically or electronically. The electronic control may be performed through a sensor on a chip. Thus, each pixel may be adjusted separately with the electronic control.
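A minimal sketch of the bit-mask adjustment of operation 166 and paragraph [0028], assuming an 8-bit image, a binary mask of matching height and width, and illustrative gain values, might look as follows:

    import numpy as np

    def adjust_by_depth_mask(image: np.ndarray, mask: np.ndarray,
                             foreground_gain: float = 1.3,
                             background_gain: float = 0.7) -> np.ndarray:
        # Foreground locations (mask == 1) and background locations (mask == 0)
        # are scaled independently, e.g., brightening the foreground while
        # attenuating a washed-out background.
        out = image.astype(np.float32)
        out[mask == 1] *= foreground_gain
        out[mask == 0] *= background_gain
        return np.clip(out, 0, 255).astype(np.uint8)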

[0029] In summary, an image capture device capable of generating a depth mask for corresponding segments of a scene is provided. It should be appreciated that while the invention has been described in terms of the background and foreground segments (two layers) of a scene, the embodiments described herein may be extended to any number of layers of the scene. Through the depth mask, the image or video characteristics for an image may be selectively adjusted irrespective of where an object is located in the scene. Furthermore, the capture device described herein enables enhanced functionality for interactive entertainment applications. For example, with respect to a video game application where a user is tracked to incorporate his image into the video game, the capture device described above enables enhanced tracking of the user. The user is free to move anywhere in the capture region and is not limited to one area, such as a center region. Additionally, as the user moves in front of a bright light source, e.g., sunlight coming through a window, the detail of the user's image is not lost. With respect to a video capture device, the adjustments may be applied every interval of frames in order to prevent constant adjustments from occurring. For example, if a user briefly holds up a black piece of paper in front of him, the frame interval delay will prevent the user from suddenly turning darker. Also, if the user temporarily leaves the field of view of the image capture device and comes back, the adjustment and re-adjustment of the scene is avoided.

[0030] It should be appreciated that the embodiments described above may be extended to other systems in addition to an interactive entertainment input device, i.e., the EYETOY™ system capture device. For example, the video capture device may be used in a videoconferencing system to provide enhanced video images for the conference. Here, the capture device may not be used for tracking purposes, but for the enhancement of the image or video characteristics enabled through the depth information.

[0031] The invention may employ various computer-implemented operations involving data stored in computer systems. These operations are those requiring physical manipulation of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. Further, the manipulations performed are often referred to in terms such as producing, identifying, determining, or comparing.

[0032] Any of the operations described herein that form part of the invention are useful machine operations. The invention also relates to a device or an apparatus for performing these operations. The apparatus may be specially constructed for the required purposes, or it may be a general purpose computer selectively activated or configured by a computer program stored in the computer. In particular, various general purpose machines may be used with computer programs written in accordance with the teachings herein, or it may be more convenient to construct a more specialized apparatus to perform the required operations.

[0033] Although the foregoing invention has been described in some detail for purposes of clarity of understanding, it will be apparent that certain changes and modifications may be practiced. Accordingly, the present embodiments are to be considered as illustrative and not restrictive, and the invention is to be limited only by the scope of the claims.

Claims

1. A method for adjusting image capture settings for a video capture device (100) during execution of an interactive gaming application, the video capture device being in communication with a computing device (138), which is in turn in communication with a display screen (136), wherein said video capture device (100) performs the steps of:

identifying (160) a scene in front of the video capture device;
capturing (164) video frames of the scene to track an object or user (134);
generating (162) depth masks of the scene from data of every x number of frames, where x is any integer greater than one, and for frames not associated with one of the depth masks, using one depth mask from a previous frame, each depth mask identifying objects in the scene that are either in a foreground region or a background region; and
adjusting (166) pixel values of each frame or every xth frame using the generated depth masks, such that the adjusted pixel values adjust one or both of the foreground region and the background region of the captured image during execution of the interactive gaming application.

2. The method of claim 1, wherein the method operation of generating depth masks of the scene from data defining the image of the scene includes,
segmenting the foreground and background regions of the scene.

3. The method of claim 1, wherein the data defining the image of the scene includes pixel data where each pixel is tagged with distance information.

4. The method of claim 1, wherein the method operation of adjusting pixel values includes,
independently adjusting pixel values associated with the foreground region from pixel values associated with the background region.

5. The method of claim 1, wherein the video capture device is one of a digital camera, a web cam, or a camcorder.

6. The method of claim 1, further comprising:
displaying a portion of the image of the scene having adjusted pixel values.

7. The method of claim 1, wherein the adjusting of the pixel values is according to bit values of the depth masks.

8. The method of claim 1, further comprising:
receiving a light signal reflected from one of the user or object in the foreground, the receipt of the light signal indicating a depth location.

9. The method of claim 1, wherein the adjusting of the pixel values modifies a characteristic selected from the group consisting of exposure, gain, focus and brightness.

10. Computer readable medium including program instructions configured to be executed by a system according to any one of claims 11 to 14 to render the method of any preceding claim.


11. A system for adjusting image capture settings for a video capture device (100) during execution of an interactive gaming application, the video capture device (100) being in communication with a computing device, which is in turn in communication with a display screen, the video capture device (100) further including:

means (150, 152) for identifying a scene in front of the video capture device (100);
means (150, 152) for capturing video frames of the scene to track an object or user;
means (140) for generating depth masks of the scene from data of every x number of frames, where x is any integer greater than one, and for frames not associated with one of the depth masks, using one depth mask from a previous frame, each depth mask identifying objects in the scene that are either in a foreground region or a background region; and
means (148) for adjusting pixel values of each frame or every xth frame using the generated depth masks, such that the adjusted pixel values adjust one or both of the foreground region and the background region of the captured image during execution of the interactive gaming application.

12. The system of claim 11, wherein the means for generating depth masks of the scene includes means for segmenting the foreground and background regions of the scene; and wherein the data defining the image of the scene includes pixel data where each pixel is tagged with distance information.

13. The system of claim 12, further comprising,
means for receiving a light signal reflected from one of the user or object in the foreground, the receipt of the light signal indicating a depth location; and
means for independently adjusting pixel values associated with the foreground region from pixel values associated with the background region.

14. The system of claim 11, wherein the video capture device is one of a digital camera, a web cam, or a camcorder, and further comprising,
means for adjusting the pixel value characteristics by modifying exposure, gain, focus and/or brightness, and for adjusting pixel values according to bit values of the depth masks.



[Drawings: Figures 1 to 8]

REFERENCES CITED IN THE DESCRIPTION

This list of references cited by the applicant is for the reader’s convenience only. It does not form part of the European patent document. Even though great care has been taken in compiling the references, errors or omissions cannot be excluded and the EPO disclaims all liability in this regard.

Patent documents cited in the description

• US 6057909 A [0006]
