Active Range Imaging Dataset for Indoor Surveillance


DISTANTE et al: ACTIVE RANGE IMAGING DATASET... 1 Annals of the BMVA Vol. 2010, No. 3, pp 1−14 (2010)


Cosimo Distante¹, Giovanni Diraco², Alessandro Leone²
¹ National Institute for Applied Optics INOA, Viale della Libertà 3, Arnesano, 73010 Lecce (Italy)
² Institute for Microelectronics and Microsystems, National Research Council, Provinciale per Arnesano, Palazzina A3, 73100 Lecce (Italy)

Abstract

Range Imaging (RIM) is a new and suitable choice for measurement and modelling in many different applications. The ability to describe scenes in three dimensions opens new scenarios and provides new opportunities in different fields, including visual monitoring and security. Active range sensors provide depth information, which allows the development of much less complex algorithms and lets problems be approached in a new, robust and efficient way. The paper presents a wide dataset for indoor surveillance applications acquired by a state-of-the-art range camera. The main aim is the definition of a common basis for the comparative evaluation of the performance of vision algorithms.

1 Introduction

Acquiring 3D information in real environments is an important task for many computer vision applications (e.g. surveillance, security, autonomous navigation, computer graphics, human-computer interaction, automotive, biomedical, augmented environments, etc.) and the benefits of range image acquisition are apparent. In recent years several active range sensors with small dimensions, low power consumption and real-time performance have been released. Since Time-Of-Flight (TOF) range cameras are widely used by researchers in monitoring and surveillance applications (as described in the next section), ad-hoc workshops on this topic have been organized [1, 2] and projects have been started [3, 4, 40] in order to develop TOF sensors and related software technologies for ambient intelligence (people tracking, head tracking, eye movement tracking, behaviour analysis, etc.).

Comparing algorithms is especially difficult since they are tested on different datasets under widely varying conditions. A large number of real and synthetic image datasets have been defined for the evaluation of the performance of passive vision-based algorithms in surveillance applications [6, 7, 8, 9], whereas datasets of range images for surveillance and monitoring applications are limited to head/face detection and gesture analysis [15] (other databases presented in the literature aim to evaluate object segmentation on synthetic still range images, or address biometrics and robot navigation [16, 17, 18, 41]). In contrast, 3D range imaging datasets for people posture analysis and behaviour recognition are not available yet. For this reason, a database of real 3D range images for human behaviour analysis and posture recognition appears useful and important for the scientific community, in order to define a common basis for the evaluation of algorithms in the indoor surveillance field. This paper presents a large database called ARIDIS (Active Range Imaging Dataset for Indoor Surveillance) of benchmark sequences captured by a state-of-the-art TOF range camera.

The paper is organized as follows. Section 2 discusses the state of the art of TOF sensors and presents the widely used commercial TOF cameras and related works. Section 3 details the advantages and drawbacks of using 3D active range vision instead of passive stereo vision in surveillance. Section 4 describes the ARIDIS dataset, and concluding remarks are presented in Section 5.

© 2010. The copyright of this document resides with its authors. It may be distributed unchanged freely in print or electronic forms.

2 TOF cameras: State of The Art and Related Works

2.1 TOF State-Of-The-Art

Optical distance measurement techniques are usually classified into three fundamental categories: Interferometry, Stereo/Triangulation and Time-of-Flight [14]. The Interferometry method makes use of the periodic nature of light by implementing an interferometric setup in which the light from a reference beam path interferes with the light reflected from an object. The high accuracy mainly depends on the coherence length of the light source, and Interferometry is not suitable for ranges greater than a few centimetres since the method is based on the evaluation of very short optical wavelengths. The active triangulation method exploits geometrical relations between an illumination source and a camera, whereas passive triangulation concerns relations between two or more cameras. The geometrical relations between the object, the sensor and a known baseline are used for the calculation of the distance. In the passive case the acquisition of depth is done by stereoscopy or multi-view geometry, meaning the combination of at least two 2D images from different points of view.

The Time-of-Flight technique exploits the finite speed of light for the determination of distances through a modulated light signal. The distribution of distances can be determined by measuring the round-trip time of the modulated light from the illuminator to the object and back. The round-trip time of the emitted signal is related to the object distance and it may be determined in various ways. Currently, devices employing nanosecond laser pulses are widely investigated; architectural constraints and the integration of highly precise miniaturized timing circuitry increase the cost of these systems. Sensors based on the Photonic Mixer Device (PMD) technology offer advantages in terms of size, weight and scan time. A PMD sensor employs incoherent, near-infrared, amplitude-modulated, continuous-wave light and determines the signal phase shift (and hence the object distance) by mixing the emitted signal with the returned signal at every pixel (Fig. 1.a) [13]. The illumination unit of PMD cameras is usually realized with low-cost LED arrays. Various TOF cameras support the Suppression of Background Intensity (SBI) modality or an equivalent IR-suppression scheme, allowing the usage of the devices in outdoor applications.

Figure 1. a) TOF PMD Principle; b) PMD 19k; c) SwissRanger 3000; d) Canesta PD200.

As mentioned above, the TOF technique is based on the indirect estimation of the arrival time by measuring the phase shift between the transmitted and the received signal. The phase shift φ is directly proportional to the distance d between camera and object, as shown in the following relation:

$$ d = \frac{\varphi}{2\pi} \cdot \frac{\lambda_{mod}}{2}, \quad \text{with } d \le \frac{\lambda_{mod}}{2} \qquad (1) $$

where λmod is the modulation wavelength, which is about 15m in most TOF cameras. By substituting λmod=15m in Eq. 1, the resulting non-ambiguous distance is 7.5m. Recent cameras allow the use of more than one modulation frequency in order to increase the non-ambiguity range, at the price of higher device complexity and lower measurement accuracy. However, TOF sensors normally suffer from the following drawbacks [29]:

- Low Resolution: Current sensors have a resolution between 64x48 and 176x144, which is low compared to standard RGB sensors but sufficient for most current applications. Additionally, the large solid angle covered by each pixel degrades the measurement in areas of inhomogeneous depth (e.g. at silhouette boundaries).
- Depth Distortion: Since a perfectly sinusoidal signal is not achievable in practice, the measurement is affected by systematic errors.
- Motion Artefacts: Since four phase images are acquired in succession in order to reconstruct the received signal, movements of the device or in the scene lead to erroneous distance values at object boundaries.
- Multiple Cameras: The illuminator of a camera can affect another camera placed in the same environment. Recent developments allow the use of multiple TOF cameras.

More technical details on range imaging can be found in [35].
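As a worked illustration of Eq. 1, the following minimal sketch converts a measured phase shift into a distance and derives the non-ambiguity range from the modulation frequency. The function names are hypothetical, and the relation λmod = c / f_mod is the standard wavelength-frequency identity rather than something stated in the text.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def modulation_wavelength(f_mod_hz):
    """Modulation wavelength lambda_mod = c / f_mod, in metres."""
    return SPEED_OF_LIGHT / f_mod_hz


def non_ambiguity_range(f_mod_hz):
    """Largest distance measurable without wrap-around: lambda_mod / 2."""
    return modulation_wavelength(f_mod_hz) / 2.0


def distance_from_phase(phi_rad, f_mod_hz):
    """Eq. 1: d = (phi / 2*pi) * (lambda_mod / 2)."""
    phi = phi_rad % (2.0 * math.pi)  # the phase is only known modulo 2*pi
    return (phi / (2.0 * math.pi)) * non_ambiguity_range(f_mod_hz)


if __name__ == "__main__":
    f_mod = 20e6  # 20 MHz, the default listed for the SR3000 in Table 1
    print(f"lambda_mod          = {modulation_wavelength(f_mod):.2f} m")
    print(f"non-ambiguity range = {non_ambiguity_range(f_mod):.2f} m")
    print(f"d at phi = pi       = {distance_from_phase(math.pi, f_mod):.2f} m")
```

With a 20 MHz modulation frequency the wavelength is just under 15m, which gives the non-ambiguity range of about 7.5m quoted in the text.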

2.2 Commercial TOF camera

The TOF phase-measurement principle is used by several manufacturers of TOF cameras, such as Canesta Inc. [30], PMDTechnologies GmbH [31] and MESA Imaging AG [32] (Fig. 1.b, Fig. 1.c, Fig. 1.d). Canesta provides several models of depth vision sensors differing in pixel resolution, covered distance, frame rate and field of view. Canesta distributes sensors with a field of view ranging between 30° and 114°, depending on the nature of the application. Currently, the maximum resolution of Canesta sensors is 160x120. Although some applications would benefit from more pixels, the sensor cost grows with the number of pixels. Canesta sensors are mainly employed in automotive applications [10, 11] and customized for specific application requirements.

The TOF cameras by PMDTechnologies GmbH include all key technologies for a TOF system. The illumination LEDs are generally mounted in two arrays, one on each side of the camera. This illumination configuration is not optimal [34], since it introduces errors due to TOF differences between the left and the right array, which are corrected in hardware. The correlation inside the camera is performed by a CMOS-based optical semiconductor, increasing speed and decreasing the cost and noise of the system. PMDTechnologies GmbH provides several models of TOF camera with different features, all suitable for measurements in bright daylight since all cameras are equipped with the SBI technology.

| Specifications | MESA SR3000 | MESA SR4000 | PMD Technologies PMD19k | PMD Technologies CamCube | Canesta DPxxx |
|---|---|---|---|---|---|
| Pixel Array Size | 176x144 (QCIF) | 176x144 (QCIF) | 160x120 (QQVGA) | 204x204 | 64x64 / 120x160 |
| Field of View (H x V) | 47.5° × 39.6° | 43.6° × 34.6° | 40° × 40° | 32° × 32° | (30° ÷ 88°) × (30° ÷ 114°) |
| Interface | USB 2.0 | USB 2.0, Ethernet | Firewire, Ethernet | USB 2.0 | USB 1.1 / 2.0 |
| Illumination Power | 1 W @ 850 nm | 1 W @ 850 nm | 3 W @ 870 nm | 3 W @ 870 nm | 1 W @ 785 nm, 5 W @ 870 nm |
| Power Supply (V) | 12 | 12 | 9 … 18 | 12 | 5 |
| Modulation Frequency (MHz) | 20 (default) | 29 ÷ 31 (30, default) | 20 (default) | 20 ÷ 90 (20, default) | 13, 26, 52, 104 |
| Non-ambiguous range (meters) | 7.5 | 0.3 ÷ 5 | 7.5 | 0.3 ÷ 7 | 0.3 ÷ 20 |
| Distance Resolution | 1% of range | 1% of range | 3% of range | 2% of range | 2% of range |
| Distance Accuracy (single pixel) | +/- 1 cm (absolute accuracy) | +/- 1 cm (absolute accuracy) | 6 cm (std. dev.) at 2 m and 90% reflectivity | 3 cm (std. dev.) at 2 m and 90% reflectivity | 0.6 cm to 20 cm (std. dev.) at 0.3 m to 20 m |
| Frame Rate | 25 fps, typical | 54 fps max | 15 fps | 25 fps | 30 fps |
| Outdoor | Yes | No | No | Yes | Yes |

Table 1. Main features of the TOF cameras employed by the scientific community.

The PMDTechnologies cameras provide depth information as well as the grey-scale values of the scene. Currently, the PMD devices provide resolutions of 48x64 (model 13k) and 64x16 (model A2), with a non-ambiguity distance up to 40m. Greater pixel resolution can be achieved by the PMD 19k (160x120) and by the CamCube model (204×204), with a non-ambiguity range up to 7.5m. However, at present the only camera models referenced in scientific works are the 13k, the A2 and the 19k, employed in automotive and robotics applications [12, 19, 20, 21].

MESA Imaging AG has developed the SwissRanger TOF camera series: SR-2, SR3000 and SR4000. MESA SwissRanger cameras are employed in automotive, medical, biometrics, robotics and surveillance applications [22-27]. Since the SwissRanger SR3000 offers an excellent trade-off between pixel resolution and covered distance, it has been employed for the ARIDIS dataset acquisition. Table 1 summarizes the main characteristics of the discussed optical devices.

Since the MESA SR3000 has been employed in the dataset acquisition, it is described in more detail in the following [5]. The SR3000 camera provides two kinds of images at QCIF spatial resolution (a depth map and a grey-level image) at no less than 15 fps (up to 25 fps, depending on the parameter settings). The camera works with an integrated, modulated near-infrared light source (the illuminator is composed of an array of 55 near-infrared LEDs). The emitted light is reflected by the objects in the scene and sensed by a pixel array realized in a specialized mixed CCD and CMOS process: the target depth is estimated by measuring the phase shift of the signal over the round trip from the device to the target and back.
Moreover, attention must be paid to the Field-Of-View (FOV) of the device: active range sensors normally exhibit a narrow FOV, so a pan-tilt solution could be used for monitoring large areas. A narrow-band infrared filter is used so that the depth map and the intensity image are not affected by environmental illumination conditions (the camera is suitable for night vision applications). By using the default parameters, the SR camera is able to define a depth map with a good approximation (greater than 99%) when the target object falls within the non-ambiguity range (up to 7.5 meters). When the camera-target distance exceeds the non-ambiguity range, aliasing effects appear (e.g. an object at 8 meters is "seen" as at 0.5 meters) as shown in Fig. 2.d and Fig. 2.e, degrading only the depth estimation. Another important parameter is the Integration Time (IT), which can be tuned to limit saturation and noise effects. In particular, if the IT is short (Fig. 2.a) the results are very noisy; if it is long (Fig. 2.b) the results are smoothed and overflow starts to contaminate the measurements of objects close to the camera. The above discussion also applies to the other TOF sensors using similar technologies, as they exhibit the same problems (non-ambiguity range, etc.). A 3D plot of the scene shown in Fig. 2.e is provided in Fig. 3.

Figure 2. Unwanted effects are shown for different acquisition configurations. Noise corrupts the measurement (a) when the Integration Time (IT) is too low (IT=5ms). Saturation effects (b) are due to a high IT setting (IT=30ms) and they are limited (c) when the IT is correctly adjusted (IT=20ms). The effects of the IT tuning are visible in the acquired dataset (the intensity image is shown in f). A 10ms IT causes noise in the measurement (d), whereas a better tuning (IT=20ms) is shown in (e).
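The wrap-around behind the aliasing effect (an object at 8 meters seen at 0.5 meters) can be sketched with a modulo operation. This is an idealised model that ignores noise, and the function names are hypothetical.

```python
def aliased_distance(true_distance_m, non_ambiguity_range_m=7.5):
    """Distance reported by an ideal phase-based sensor (wraps at the range)."""
    return true_distance_m % non_ambiguity_range_m


def candidate_true_distances(measured_m, max_distance_m, non_ambiguity_range_m=7.5):
    """All true distances up to max_distance_m that map to the same reading."""
    candidates = []
    k = 0
    while measured_m + k * non_ambiguity_range_m <= max_distance_m:
        candidates.append(measured_m + k * non_ambiguity_range_m)
        k += 1
    return candidates


if __name__ == "__main__":
    print(aliased_distance(8.0))                 # 0.5
    print(candidate_true_distances(0.5, 20.0))   # [0.5, 8.0, 15.5]
```

A single reading is therefore ambiguous on its own; restricting the scene to rooms smaller than the non-ambiguity range, as done for ARIDIS, is one way to keep the first candidate the correct one.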

3 Active Range Vision vs. Passive Vision

In this section we focus on the advantages and drawbacks of using an active vision system instead of a classical 2D/3D vision system (passive vision). Table 2 summarizes the most important characteristics of TOF cameras in comparison with those of triangulation systems (passive stereo vision). The main advantage of RIM is a more detailed description of the scene, since RIM provides both a depth map and an intensity image that can be used at the same time in image processing applications. In particular, typical problems of passive vision (foreground camouflage, shadows, partial occlusions, etc.) can be overcome by using depth information, which is not affected by illumination conditions or object appearance. On the other hand, in several situations active RIM exhibits unwanted behaviours due to limitations of the specific 3D sensing technology (limited depth range due to aliasing, multi-path, object reflection properties). Note that depth information can also be obtained by using inexpensive passive stereo vision. However, this approach has high computational costs and fails when the scene is poorly textured or the illumination is insufficient; conversely, active vision provides depth maps even if the scene is untextured and in all illumination conditions [28]. It is important to note that both the distance and the amplitude images delivered by the TOF camera have a number of systematic drawbacks that must be compensated. The main amplitude-related problem comes from the fact that the power of a wave decreases with the square of the distance it covers. Consequently, the light reflected by imaged objects rapidly decreases with the distance between object and camera. In other words, objects with the same reflectance located at different distances from the camera will appear with different amplitudes in the image.
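One simple way to compensate the amplitude fall-off just described is to rescale each amplitude value by the squared distance measured at the same pixel. The sketch below assumes a pure inverse-square model and a hypothetical reference distance of 1 meter; it is an illustration, not the compensation used by any specific camera.

```python
def normalise_amplitude(amplitude, distance_m, reference_m=1.0):
    """Rescale an amplitude to the value it would have at reference_m.

    Assumes amplitude is proportional to 1 / distance**2 (idealised model).
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return amplitude * (distance_m / reference_m) ** 2


if __name__ == "__main__":
    # Two objects with the same reflectance, observed at 2 m and 4 m.
    near = 100.0 / 2.0 ** 2   # observed amplitude 25.0
    far = 100.0 / 4.0 ** 2    # observed amplitude 6.25
    print(normalise_amplitude(near, 2.0), normalise_amplitude(far, 4.0))
```

After normalisation the two objects report the same amplitude, so the remaining differences reflect surface properties rather than distance.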
The benefits of TOF RIM in surveillance contexts are summarized in Table 3, whereas the drawbacks are listed in Table 4. The drawbacks are shown in Fig. 4 by using typical frames of the ARIDIS dataset sequences.

| | TOF sensor | Stereo (passive) vision |
|---|---|---|
| Depth resolution | Sub-centimetre (if chromaticity conditions are satisfied). | Sub-millimetre (if images are highly textured). |
| Spatial resolution | Medium (about QCIF, 144x176 pixels). | High (over 4CIF, 704x576 pixels). |
| Portability | Dimensions are the same as those of a normal camera. | Two video cameras are needed, and also an external light source. |
| Computational efforts | On-board FPGA for phase and intensity measurement. | High workload (the calibration step and the correspondence search process are hard). |
| Cost | High for a customizable prototype (1500€ - 5000€). | It depends on the quality of the stereo vision system. |

Table 2. Comparison of important characteristics of TOF cameras and stereo vision systems.

| | TOF sensor | Passive vision |
|---|---|---|
| Illumination conditions | Accurate depth measurement in all illumination conditions. | Sensitive to illumination variations and artificial lights. Unable to operate in dark environments. |
| Shadows presence | It does not affect the principal steps of monitoring applications. | Reduced performance in segmentation, recognition, etc. |
| Partial occlusions | Partially occluded objects are detected as separate (if they are at different depths). | Due to projective ambiguity, occlusions are difficult to detect (merging blobs) and handle (predictive tracking strategies). |
| Objects appearance | Camouflage is avoided, but appearance could affect depth precision (chromaticity dependence). | Camouflage effects appear when foreground and background have the same appearance properties. |

Table 3. Advantages in the use of TOF sensors in surveillance contexts.

| Drawback | Description |
|---|---|
| Aliasing | It affects the non-ambiguity range, i.e. the maximum achieved depth is reduced (up to 7.5 meters). |
| Multi-path effects | Depth measurement is strongly corrupted when the target surface presents corners. |
| Objects reflection properties | Materials having different colors exhibit dissimilar reflection properties that affect reflected light intensity and, therefore, depth resolution. |
| Field of view | Usually it is limited, so that an accurate positioning of the sensor is needed. A pan-tilt architecture could be useful. |

Table 4. Drawbacks in the use of TOF sensors in surveillance contexts.

Figure 4. The aliasing effect (a) appears in regions whose distance from the camera exceeds 7.5m (the open door). The multi-path effects are apparent (a) in the regions in which adjacent walls intersect. Moreover, the depth measurement is corrupted by object reflection properties (the LCD TV in (a)). Note that the corresponding intensity image (b) does not present the problems discussed above.

In order to understand the advantages of range imaging in surveillance, a qualitative comparison between intensity-based and depth-based segmentation is presented in Fig. 5, where the same well-known segmentation approach (Mixture of Gaussians-based background modelling and Bayesian framework for segmentation) [39] is used. The better segmentation is achieved by using the depth image, whereas the same segmentation approach applied to the intensity image suffers from camouflage effects.

Figure 5. Segmentation results are shown (b, d) when the segmentation approach in [39] is applied to depth (a) and intensity (c) information, respectively. The better segmentation is achieved starting from the depth image, whereas the same segmentation approach applied to the intensity image suffers from camouflage effects (sweater and wall present the same brightness).
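To make the idea of depth-based segmentation concrete, the following sketch keeps one running Gaussian per pixel over the depth values and flags pixels that deviate from it. This is a deliberately simplified stand-in, not the Mixture of Gaussians and Bayesian scheme of [39]; the class name, the learning rate `alpha`, the threshold `k` and the noise floor `min_sigma_m` are all hypothetical choices.

```python
class DepthBackgroundModel:
    """Per-pixel running Gaussian over depth values (frames are flat lists)."""

    def __init__(self, first_frame, alpha=0.05, k=2.5, min_sigma_m=0.02):
        self.alpha = alpha                    # learning rate for the update
        self.k = k                            # threshold in standard deviations
        self.min_var = min_sigma_m ** 2       # floor on the variance
        self.mean = [float(d) for d in first_frame]
        self.var = [self.min_var] * len(self.mean)

    def apply(self, frame):
        """Return a foreground mask and update the model on background pixels."""
        mask = []
        for i, depth in enumerate(frame):
            diff = depth - self.mean[i]
            foreground = diff * diff > (self.k ** 2) * self.var[i]
            mask.append(foreground)
            if not foreground:
                self.mean[i] += self.alpha * diff
                updated = (1.0 - self.alpha) * self.var[i] + self.alpha * diff * diff
                self.var[i] = max(self.min_var, updated)
        return mask


if __name__ == "__main__":
    model = DepthBackgroundModel([3.0, 3.0, 3.0, 3.0])   # empty room, wall at 3 m
    print(model.apply([3.0, 3.01, 1.5, 3.0]))            # a person at 1.5 m
```

Because the test is made on depth rather than brightness, a person wearing a sweater of the same brightness as the wall is still separated from it, which is the behaviour illustrated in Fig. 5.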

4 Datasets Description

In the visual-surveillance context, several benchmark sequences and ground-truth data have been defined by using traditional monocular (passive) vision systems (ETISEO [6], CAVIAR [7], PETS [8]). Datasets for the evaluation of range imaging algorithms on dynamic scenes are not available yet. The goal of ARIDIS is to define a common dataset that can be used to evaluate the performance of computer vision algorithms based on active RIM technologies in indoor scenes. The main idea is to investigate RIM approaches when typical troublesome issues of traditional passive vision occur (changing illumination conditions, shadows, camouflaged and untextured regions, occlusions, as described in Table 3). Moreover, the dataset is useful for the evaluation of active vision-based algorithms when typical problems of active range imaging are present (aliasing, reflective surfaces, multi-path effects, as described in Table 4). Due to the limited FOV and the non-ambiguity range of the device, the benchmark sequences have been captured in small indoor environments (up to 5×5 meters). The camera configuration employed in data collection is shown in Fig. 6, according to the following notation:

- L : covered room length (5.0 meters)
- H : distance camera/ground floor
- θ : camera tilt angle
- β : camera roll angle
- (x, y, z) : camera coordinates

The camera is statically positioned in a wall-mounting configuration at height H from the ground floor with a tilt angle θ and a roll angle β. For each collected sequence, a text file provides the parameters (H, β, θ). The dataset is composed of 61 sequences, each about 1800 frames long, captured at different frame rates and with different values of the most critical parameter (Integration Time). In order to cover a large number of events, several persons of different heights and sizes, wearing clothing of different colours, have been captured with different orientations in the scene.
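The setup parameters (H, β, θ) can be used to express a point measured in camera coordinates relative to the room. The sketch below is only one possible convention and is not taken from Fig. 6: it assumes that the camera z axis is the optical axis, x points right, y points down in the image, the tilt θ rotates the optical axis downward from the horizontal, and the roll β is a rotation about the optical axis. The function name is hypothetical.

```python
import math


def camera_to_room(point_cam, height_m, tilt_deg, roll_deg=0.0):
    """Map a camera-frame point (x, y, z) to (lateral, forward, height) in metres.

    Assumed convention (not from the paper): z = optical axis, x = right,
    y = down; tilt rotates the optical axis downward; roll is about z.
    """
    x, y, z = point_cam
    beta = math.radians(roll_deg)
    theta = math.radians(tilt_deg)

    # Undo the roll about the optical axis.
    x_r = x * math.cos(beta) - y * math.sin(beta)
    y_r = x * math.sin(beta) + y * math.cos(beta)

    # Undo the downward tilt of the optical axis.
    forward = z * math.cos(theta) - y_r * math.sin(theta)
    down = z * math.sin(theta) + y_r * math.cos(theta)

    return (x_r, forward, height_m - down)


if __name__ == "__main__":
    # Point 5 m along the optical axis, camera at 2.5 m tilted 30 degrees down.
    lateral, forward, height = camera_to_room((0.0, 0.0, 5.0), 2.5, 30.0)
    print(f"lateral={lateral:.2f} forward={forward:.2f} height={height:.2f}")
```

In this example the optical axis meets the floor about 4.33m in front of the wall, which is consistent with a room length of 5 meters.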
Sequences present typical indoor situations.

Figure 6. Wall mounting camera setup used in the data collection. The metric information is referred to the TOF camera coordinate axes (x, y, z) represented. The setup parameters (H, β, θ) are provided for each sequence collected in the ARIDIS dataset.

| Behaviour / Posture | Sequences |
|---|---|
| Walking / Stand | 1, 2, 5, 6, 9, 10, 11, 12, 13, 14, 17, 18, 19, 20, 21, 22, 25, 26, 29, 30, 33, 34, 35, 36, 37, 38, 39, 40, 43, 44, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61 |
| Falling / Lying down | 11, 12, 13, 14, 43, 44, 51, 52, 53, 54, 55, 56, 58, 60, 61 |
| Limping | 19, 20, 59 |
| Sitting / Sit | 21, 22, 33, 34, 37, 38, 39, 40, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57 |
| Crouching / Squat | 19, 20, 35, 36, 37, 38, 53, 54, 55 |
| Object moving | 17, 18, 21, 22, 25, 26, 29, 30, 33, 34, 39, 40, 47, 48, 49, 50, 55, 56, 57, 58, 59 |
| Bending / Bent | 19, 20, 35, 36, 37, 38, 43, 44, 51, 52, 53, 54, 55, 56, 57 |

Table 5. Behaviours and postures collected in ARIDIS sequences.

Furthermore, the dataset acquisition has been carried out in different occlusion conditions: without occlusions, with static occlusions (an object occluding a person, named object-person occlusion) and with dynamic occlusions (two or more people in the scene occluding each other, denoted as person-person occlusion) [36, 37, 38]. Tab.5 and Tab.6 report the sequences in the ARIDIS dataset grouped by behaviours/postures and occlusion types, respectively. Other useful parameters can be found in the documentation downloadable together with the dataset [33], such as the Integration Time of the device, the camera setup parameters, the number of persons in the scene and the kind of objects present in the scene for each sequence. A background modelling sequence is also provided when the background of the scene changes. In Fig.7 keyframes are shown corresponding to the collected posture/behaviour typologies. The proposed dataset does not include ground-truth data for segmentation and tracking. Further work will aim to define ground truth, allowing a quantitative performance comparison of algorithms.
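The groupings of Tab.5 and Tab.6 lend themselves to a simple lookup when choosing test sequences. The sketch below encodes a subset of the behaviour rows and all occlusion rows as sets of sequence numbers copied from the tables; the dictionary and function names are hypothetical, and the dataset itself does not ship such an index.

```python
# Subset of Table 5 (behaviour / posture -> sequence numbers).
BEHAVIOURS = {
    "Falling / Lying down": {11, 12, 13, 14, 43, 44, 51, 52, 53, 54, 55, 56, 58, 60, 61},
    "Limping": {19, 20, 59},
    "Crouching / Squat": {19, 20, 35, 36, 37, 38, 53, 54, 55},
}

# Table 6 (occlusion type -> sequence numbers).
OCCLUSIONS = {
    "None": {1, 2, 11, 12, 19, 20, 60, 61},
    "Object-Person": {5, 6, 17, 18, 21, 22, 25, 26, 29, 30, 33, 34, 35, 36, 37, 38,
                      39, 40, 43, 44, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59},
    "Person-Person": {9, 10, 13, 14, 43, 44},
}


def select_sequences(behaviour, occlusion):
    """Sequences that contain the behaviour under the given occlusion type."""
    return sorted(BEHAVIOURS[behaviour] & OCCLUSIONS[occlusion])


if __name__ == "__main__":
    print(select_sequences("Falling / Lying down", "None"))           # [11, 12, 60, 61]
    print(select_sequences("Falling / Lying down", "Person-Person"))  # [13, 14, 43, 44]
    print(select_sequences("Limping", "Object-Person"))               # [59]
```

For instance, a fall detector could first be evaluated on the unoccluded falls (sequences 11, 12, 60, 61) before moving to the occluded ones.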

| Occlusions | Sequences |
|---|---|
| None | 1, 2, 11, 12, 19, 20, 60, 61 |
| Object-Person | 5, 6, 17, 18, 21, 22, 25, 26, 29, 30, 33, 34, 35, 36, 37, 38, 39, 40, 43, 44, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59 |
| Person-Person | 9, 10, 13, 14, 43, 44 |

Table 6. ARIDIS sequences grouped by occlusion (if any) type.

Figure 7. Keyframes of the collected postures and behaviours (depth and intensity images). a) Object-person occlusion; b) Person-person occlusion; c) Walking; d) Falling down; e) Sitting; f) Crouching; g) Object moving; h) Bending. Postures and behaviours are collected in scenes with chairs (i), a desk table, a chair and a stool (j), a large table, chairs and a stool (k) and a vertical desk (l).

5 Conclusions

In indoor surveillance applications, range images provide a better perception of scenes in all illumination conditions, whereas cheap stereo systems fail in dark or low-textured environments. If the critical parameters of the TOF sensor are properly adjusted, reliable, computationally low-cost and real-time segmentation/tracking can be realized by using the depth measurement only, since intensity images present unwanted fluctuations. Depth information overcomes projective ambiguity, whereas the intensity image provides appearance information, so that their joint use could improve critical steps (object recognition, behaviour analysis, etc.), allowing a better description of moving objects. The suggested dataset provides a common basis to investigate vision algorithms. Further work will define ground-truth data, allowing a quantitative analysis of performance.

References

[1] Workshop on Time of Flight Camera based Computer Vision (TOF-CV), IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2008), Anchorage, Alaska (USA), June 2008.
[2] Workshop on Dynamic 3D Imaging, 31st Annual Symposium of the German Association for Pattern Recognition (DAGM 2009), Jena (Germany), September 2009.
[3] ARTTS EU Project, http://www.artts.eu/ Last accessed 2 November, 2009.
[4] NETCARITY EU Project, http://www.netcarity.org/ Last accessed 2 November, 2009.
[5] T. Oggier, M. Lehmann, R. Kaufmann, M. Schweizer, M. Richter, P. Metzler, G. Lang, F. Lustenberger, N. Blanc, An all-solid-state optical range camera for 3D-real-time imaging with sub-centimeter depth-resolution (SwissRanger), Proc. SPIE Vol. 5249, pp. 534-545, 2003.
[6] A.T. Nghiem, F. Bremond, M. Thonnat, V. Valentin, ETISEO, performance evaluation for video surveillance systems, Proc. IEEE AVSS 2007, pp. 476-481, 2007.
[7] The CAVIAR website, http://homepages.inf.ed.ac.uk/rbf/CAVIAR/ Last accessed 2 November, 2009.
[8] http://pets2006.net Last accessed 2 November, 2009.
[9] http://marathon.csee.usf.edu/range/raw-image-data Last accessed 2 November, 2009.
[10] O. Gallo, R. Manduchi, A. Rafii, Robust Curb and Ramp Detection for Safe Parking Using the Canesta TOF Camera, IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW '08), Anchorage (Alaska), June 2008.
[11] S. Acharya, C. Tracey, A. Rafii, System design of time-of-flight range camera for car park assist and backup application, IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW '08), Anchorage (Alaska), June 2008.
[12] T. Ringbeck, B. Hagebeuker, A 3D Time of flight camera for object detection, Optical 3-D Measurement Techniques, ETH Zürich, 2007.
[13] W. Karel, Integrated Range Camera Calibration Using Image Sequences from Hand-Held Operation, XXIst Congress of the International Society for Photogrammetry and Remote Sensing (ISPRS 2008), Beijing (China), July 2008.
[14] B. Buttgen, P. Seitz, Robust Optical Time-of-Flight Range Imaging Based on Smart Pixel Structures, IEEE Transactions on Circuits and Systems, Vol. 55, Issue 6, pp. 1512-1525, July 2008.
[15] ARTTS 3D-TOF Database, http://www.artts.eu/publications/3d_tof_db Last accessed 2 November, 2009.
[16] A. Birk, K. Pathak, S. Schwertfeger, W. Chonnaparamutt, The IUB Rugbot: an intelligent, rugged mobile robot for search and rescue operations, International Workshop on Safety Security and Rescue Robotics (SSRR), IEEE Press, 2006.
[17] A. Birk, S. Markov, I. Delchev, K. Pathak, Autonomous Rescue Operations on the IUB Rugbot, International Workshop on Safety Security and Rescue Robotics (SSRR), IEEE Press, 2006.
[18] Jacobs University, Robotics group, http://robotics.iu-bremen.de Last accessed 2 November, 2009.
[19] S. Hussmann, T. Liepert, Three-Dimensional TOF Robot Vision System, IEEE Transactions on Instrumentation and Measurement, accepted for future publication.
[20] C. Joochim, H. Roth, Development of a 3D mapping using 2D 3D sensors for mobile robot locomotion, IEEE International Conference on Technologies for Practical Robot Applications (TePRA 2008), Nov. 2008.
[21] T. Ringbeck, B. Hagebeuker, A 3D Time Of Flight Camera For Object Detection, Optical 3-D Measurement Techniques, ETH Zürich.
[22] J.T. Thielemann, G.M. Breivik, A. Berge, Pipeline Landmark Detection for Autonomous Robot Navigation using Time-of-Flight Imagery, IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2008), June 2008.
[23] S.A. Guomundsson, R. Larsen, H. Aanaes, M. Pardas, J.R. Casas, TOF Imaging in Smart Room Environments towards Improved People Tracking, IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2008), June 2008.
[24] M.B. Holte, T.B. Moeslund, P. Fihl, Fusion of Range and Intensity Information for View Invariant Gesture Recognition, IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2008), June 2008.
[25] Youding Zhu, B. Dariush, K. Fujimura, Controlled human pose estimation from depth image streams, IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2008), June 2008.
[26] J. Penne, C. Schaller, J. Hornegger, T. Kuwert, Robust real-time 3D respiratory motion detection using time-of-flight cameras, International Journal of Computer Assisted Radiology and Surgery, pp. 427-431, July 2008.
[27] T. Oggier, B. Büttgen, F. Lustenberger, G. Becker, B. Rüegg, A. Hodac, SwissRanger SR3000 and first experiences based on miniaturized 3D-TOF cameras, Proceedings of the 1st Range Imaging Research Day, Zurich, Switzerland, 2005.
[28] Z. Hu, T. Kawamura, K. Uchimura, A Performance Review of 3D TOF Vision Systems in Comparison to Stereo Vision Systems, Stereo Vision, InTech Education and Publishing, November 2008.
[29] A. Kolb, E. Barth, R. Koch, ToF-Sensors: New Dimensions for Realism and Interactivity, IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2008), June 2008.
[30] Canesta Inc., www.canesta.com Last accessed 2 November, 2009.
[31] PMDTechnologies GmbH, www.pmdtec.com Last accessed 2 November, 2009.
[32] Mesa Imaging AG, www.mesa-imaging.ch Last accessed 2 November, 2009.
[33] SIPlab Imm Cnr, http://siplab.le.imm.cnr.it/3D/index.htm Last accessed 2 November, 2009.
[34] H. Rapp, Experimental and Theoretical Investigation of Correlating TOF-Camera Systems, Master Thesis, Interdisciplinary Center for Scientific Computing (IWR), University of Heidelberg, September 2007.
[35] B. Büttgen, T. Oggier, M. Lehmann, R. Kaufmann, F. Lustenberger, CCD/CMOS lock-in pixel for range imaging: Challenges, limitations and state-of-the-Art, 1st Range Imaging Research Days, ETH, Zurich, Switzerland, pp. 21-32, Sep. 2005.
[36] C.J. Lin, J.G. Wang, Human Body Posture Classification Using a Neural Fuzzy Network based on Improved Particle Swarm Optimization, 8th International Symposium on Advanced Intelligent Systems, Sokcho-City, Korea, pp. 414-419, Sep. 5-8, 2007.
[37] Nooritawati Md Tahir, Aini Hussain, Salina Abdul Samad, Hafizah Husain, Posture recognition using correlation filter classifier, Journal of Theoretical and Applied Information Technology, 2008.
[38] B.B. Francois, F. Bremond, M. Thonnat, I. S. Antipolis, Human Posture Recognition in Video Sequence, Joint IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance (VS-PETS 2003), pp. 23-29, 2003.
[39] D. Lee, Effective Gaussian Mixture Learning for Video Background Subtraction, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 27, no. 5, pp. 827-832, 2005.
[40] COGAIN Network of excellence, http://www.cogain.org Last accessed 2 November, 2009.
[41] G. Passalis, I. A. Kakadiaris, T. Theoharis, G. Toderici, T. Papaioannou, Towards fast 3D ear recognition for real-life biometric applications, IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS 2007), pp. 39-44, 2007. (http://vislab.ucr.edu) Last accessed 2 November, 2009.
