Environ Monit Assess (2019) 191:406 https://doi.org/10.1007/s10661-019-7518-9

EventFinder: a program for screening remotely captured camera trap images

Michael Janzen · Ashley Ritter · Philip D. Walker · Darcy R. Visscher

Received: 22 August 2018 / Accepted: 3 May 2019 © Springer Nature Switzerland AG 2019

Abstract Camera traps are becoming ubiquitous tools for ecologists. While easily deployed, they require human time to organize, review, and classify images, including sequences of images of the same individual and non-target images triggered by environmental conditions. For such cases, we developed an automated computer program, named EventFinder, to reduce operator time by pre-processing and classifying images using background subtraction techniques and histogram comparisons. We tested the accuracy of the program against images previously classified by a human operator. The automated classification, on average, reduced the data requiring human input by 90.8% with an accuracy of 96.1%, and produced a false positive rate of only 3.4%. Thus, EventFinder provides an efficient method for reducing the time for human operators to review and classify images, making camera trap projects, which compile a large number of images, less costly to process. Our testing process used medium to large animals, but EventFinder will also work with smaller animals, provided their images occupy a sufficient area of the frame. While our discussion focuses on camera trap image reduction, we also discuss how EventFinder might be used in conjunction with other software developments for managing data.

Keywords Background subtraction · Camera traps · Computer classification · Data reduction · Image processing

M. Janzen (✉) · A. Ritter · P. D. Walker · D. R. Visscher
The King's University, Edmonton, Alberta, Canada
e-mail: [email protected]

D. R. Visscher
e-mail: [email protected]

Introduction

Remote cameras are increasingly important tools for ecologists, from site-specific studies to global monitoring initiatives (Meek et al. 2014; Burton et al. 2015; Steenweg et al. 2017). While they collect field data with minimal human attention, the resulting images require organization, post-processing, and characterization before analysis, requiring input of time and money. Harris et al. (2010) highlighted two general issues facing camera-aided studies, namely the lack of systematic organization and problems involving the volume of imagery produced.

While a number of applications exist to organize, manage, and facilitate characterization of camera trap images (Fegraus et al. 2011; Krishnappa and Turner 2014; Bubnicki et al. 2016; Niedballa et al. 2016), fewer options exist for pre-processing images to reduce the need for human post-processing. The need for pre-processing is due, in part, to a high incidence of non-target images produced by the camera or long sequences of the same animal. These images create an "analytical bottleneck" during post-processing (Harris et al. 2010; Spampinato et al. 2015), and less human involvement in pre-processing images may help alleviate it. In particular, the ability to remove "noisy" images resulting from background movement and the ability to identify a single image from a sequence for classification will greatly reduce processing time.
A primary technique to discriminate a "noisy" image from an event (animal) is background subtraction (Piccardi 2004). This technique helps differentiate images containing background noise from images containing events of interest, and is implemented in video imaging studies (Desell et al. 2013; Goehner et al. 2015; Swinnen et al. 2014; Weinstein 2015). Camera trap images, as opposed to video, are separated by longer periods of time, making background subtraction more challenging. We developed a stand-alone computer program to aid in image processing, categorization, and data reduction, using background subtraction to identify candidate foreground areas and color histogram comparisons to distinguish foreground areas from background. The program pre-processes remotely captured images into relevant and irrelevant sets for human consideration, identifying unique events. In this paper, we describe the development and procedure of EventFinder, and provide an assessment of EventFinder based on a known dataset characterized completely by a human.

Methods

Study area

From June 2015 through June 2016, we maintained nine cameras (Reconyx PC500) throughout the Cooking Lake—Blackfoot Provincial Recreation Area, located 45 km east of Edmonton, Alberta. Cameras were located along trails frequented by animals, at approximately 1.25 m from the ground, and set to collect a burst of three images when triggered. During the time they were active, the individual cameras collected, on average, 3514 (SD = 3214) images each.

Program description

EventFinder pre-processed camera trap images and, for each sequence of images, determined if the sequence contained a target animal or noise. For sequences with a target animal, EventFinder marked the image it determined most relevant for human processing. EventFinder functions as a collection of smaller programs and modules, each with a focused task. The program overview is shown in Fig. 1, with programs and the information passed to the next program depicted. Segmenting tasks helps to focus each program's inputs and outputs, but the process is automatic, so the user needs only to start the program without concern for which module to start next. In this respect, the user selects the input images to process, optionally specifies parameters, and EventFinder copies the relevant processed images to a directory for human classification and identification. We describe our usage of EventFinder with default settings, but users may set their own parameters and modify XML instruction files to customize EventFinder if they consider that a different sequence of operations provides better image classification.

As labeled in Fig. 1, in our study, the Picture Processor program produced sets of images based on temporal proximity, separating sequences when more than 60 s had elapsed between sequential images. This part of the program required, as input, the image file names (to link to the images) and the time difference used to separate image sets. EventFinder extracts the time the picture was taken from the file metadata. Users select files using a GUI (shown in Appendix 1). The output is a batch file that repeatedly launches Automate Finder, once for each image set. Automate Finder loads a default XML file specifying a series of image operations, which are passed to Find Animal. If Find Animal is run in user mode, rather than the regular automated mode, the user can press buttons to initiate image-processing operations (shown in Appendix 1). User mode is intended for testing different image-processing sequences to include in the automated mode rather than for actual batch image processing, since the user must be present to initiate each operation in this mode. User mode lets a user graphically test sequences of operations by clicking buttons rather than requiring users to enter instructions in an XML file.
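To make the grouping step concrete, the following is a minimal sketch of splitting images into event sets by the gap between capture times. It is an illustration only, not the released EventFinder code; the use of Pillow for EXIF reading, the file pattern, and the function names are our assumptions.

from datetime import datetime, timedelta
from pathlib import Path
from PIL import Image  # assumed dependency for reading EXIF metadata

def capture_time(path: Path) -> datetime:
    """Read the DateTimeOriginal EXIF tag (0x9003) written by the camera."""
    exif = Image.open(path)._getexif() or {}
    stamp = exif.get(0x9003) or exif.get(0x0132)  # fall back to the DateTime tag
    return datetime.strptime(stamp, "%Y:%m:%d %H:%M:%S")

def group_into_sequences(image_dir, max_gap_s=60):
    """Split images into event sets wherever the gap between consecutive
    capture times exceeds max_gap_s (60 s is the default described in the text)."""
    images = sorted(Path(image_dir).glob("*.JPG"), key=capture_time)
    sequences, current, previous = [], [], None
    for img in images:
        t = capture_time(img)
        if previous is not None and (t - previous) > timedelta(seconds=max_gap_s):
            sequences.append(current)
            current = []
        current.append(img)
        previous = t
    if current:
        sequences.append(current)
    return sequences

In EventFinder itself this step emits a batch file that launches Automate Finder once per image set; the sketch only shows the grouping logic.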
Our default image-processing operations for Find Animal were as follows. To reduce noise and improve processing speed, Find Animal down-sampled each image by averaging blocks of 16 pixels. The result was a mapping from 4 × 4 blocks of pixels to one pixel, with less noise. From the images in a sequence, Find Animal generated a background image and identified foreground regions.
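A minimal sketch of this kind of 4 × 4 block-average down-sampling (illustrative only; NumPy, the RGB array layout, and image dimensions divisible by four are assumptions):

import numpy as np

def downsample_4x4(image: np.ndarray) -> np.ndarray:
    """Average each 4 x 4 block of pixels down to one pixel.

    image: H x W x 3 RGB array; H and W are assumed divisible by 4.
    Returns an (H/4) x (W/4) x 3 array with reduced noise.
    """
    h, w, c = image.shape
    blocks = image.reshape(h // 4, 4, w // 4, 4, c).astype(np.float64)
    return blocks.mean(axis=(1, 3))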

Fig. 1 The flow, inputs, and relationships between components of EventFinder. Input parameters and operations are shown in ellipses, while program modules are shown in rectangles. The user specifies inputs to Picture Processor, and the program calls Automate Finder and Find Animal to determine which image humans should examine. Providing the generated CSV file to Picture Mover moves retained images to a separate directory for human processing

With video frames, a background image can be generated with temporal averaging or other techniques for moving backgrounds (Power and Schoonees 2002; Van Droogenbroeck and Barnich 2014); however, camera images of an event consisted of as few as three images, causing ghosting, where foreground objects are included in the background as faint regions. Thus, when using camera traps instead of video, only pixels that are actually background should be included. EventFinder determines which pixels to include by comparing the pixel in question with the average value from all pixels at that location, averaging a location across images in the sequence.

To create a suitable background image, the Find Animal program computed the standard deviation in color for each input pixel location across the input images (an overview of background creation is shown in pseudocode in Fig. 2). The average standard deviation was initially used as the threshold to classify pixels, giving an initial approximation of foreground and background. At each pixel location, two sets of pixels were created: the set of foreground pixels and the set of background pixels. Considering each input image, the pixel at the location was either included or excluded, forming a binary label. Connected foreground pixels with the same binary labels were grouped into regions, similar to the approach mentioned in McIvor (2000), where each region corresponds to a moving object or noise. The largest region should most frequently correspond to the largest moving animal in the image, while other regions correspond to additional animals or noise. EventFinder increased the threshold setting until the largest region decreased in pixel count by 10%.
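A simplified sketch of the initial labeling and adaptive threshold described above (an illustration under assumptions, not the published Find Animal code; NumPy and SciPy are assumed, and per-pixel deviation is measured against the per-location mean color):

import numpy as np
from scipy import ndimage  # assumed dependency for connected-component labeling

def label_foreground(stack: np.ndarray, threshold: float) -> np.ndarray:
    """stack: N x H x W x 3 down-sampled images from one sequence.
    A pixel is labeled foreground in an image when its color deviates from
    the per-location mean by more than the threshold."""
    mean = stack.mean(axis=0)
    deviation = np.linalg.norm(stack - mean, axis=-1)  # N x H x W
    return deviation > threshold

def largest_region_size(foreground: np.ndarray) -> int:
    """Size (in pixels) of the largest connected foreground region in any image."""
    sizes = []
    for mask in foreground:  # one mask per input image
        labels, n = ndimage.label(mask)
        sizes.append(ndimage.sum(mask, labels, range(1, n + 1)).max() if n else 0)
    return int(max(sizes))

def adaptive_threshold(stack: np.ndarray, step: float = 1.05, shrink: float = 0.10):
    """Start from the average per-location standard deviation and raise the
    threshold until the largest connected region shrinks by ~10% in pixel count."""
    threshold = np.linalg.norm(stack.std(axis=0), axis=-1).mean()
    initial = largest_region_size(label_foreground(stack, threshold))
    while largest_region_size(label_foreground(stack, threshold)) > (1 - shrink) * initial:
        threshold *= step
    return threshold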

Fig. 2 Pseudocode highlighting the steps used to create a background image

This value enabled removal of smaller regions corresponding to noise, but left the majority of pixels for regions corresponding to animals, since they exceeded the threshold by a larger amount than the noise.

In some situations, as happens when animals overlap in multiple images, the majority of pixels at a location correspond to animal rather than background. This happens when an animal lingers at a location and results in a background image with a partial animal, as shown in Fig. 3. The input images are shown in Appendix 2. Find Animal corrects these region misidentifications by checking each potential foreground region against known background locations. The logic is as follows: for some pixel locations, all pixels are within the threshold regardless of which image in the set they originate from. These pixels are considered definite background and correspond to areas with no animal at that location in any input image. Other pixel locations have some pixels within the threshold and other pixels exceeding the threshold, representing background and foreground respectively. When an animal lingers in a location, the corresponding pixels containing the animal are incorrectly considered background. Thus, for each potential foreground region, Find Animal considered the two possible background regions: one constructed from pixels with deviations within the threshold, and the fragment constructed from the remaining pixels with deviations greater than the threshold. Find Animal compared both regions' color histograms (histograms of RGB values) to nearby areas of definitive background and retained the region with less difference. Consequently, Find Animal kept the region with fewer pixels as background when its color closely matched nearby known background.

For example, in a set of five images containing a slow-moving ungulate, the ungulate may overlap in three images. This area of overlap constitutes the majority of pixels, coming from three of the five images. Nearby regions contain true background consisting of pixels coming from all five images, where the animal did not pass. The color histogram of the pixels coming from the remaining two images more closely matches the true background regions, and consequently the background region where the animal slowly crossed is made from pixels coming from the two images containing the background, rather than the three images containing the ungulate.

After constructing a proper background, Find Animal performed background subtraction to identify foreground fragments and thresholded the result to a binary image; an example is shown in Appendix 2. In the binary image, each white area of pixels constitutes a mask corresponding to pixels containing animal in the original image. Small foreground fragments, 30 pixels or less, were considered noise and ignored. In daytime images, fragments consisting mostly of green were removed, since they likely corresponded to moving vegetation.

Fig. 3 Two methods of background creation made from an input sequence of three images with temporally overlapping regions of animal. The first method (a) uses standard deviation to identify the background but retains parts of the animal due to overlap in the majority of the input images. In the second method (b), EventFinder considers nearby known background areas to select the minority of pixels at locations representing the actual background, resulting in a truer background image. The remaining animal fragment seen in the right of the image was present in all input images. Input images and masks are shown in Appendix 2
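The comparison behind method (b) of Fig. 3 can be sketched as follows (a hedged illustration only; the bin count, the L1 histogram distance, and the NumPy usage are our assumptions rather than details taken from the paper):

import numpy as np

def rgb_histogram(pixels: np.ndarray, bins: int = 8) -> np.ndarray:
    """Normalized joint RGB histogram of an M x 3 array of pixel values."""
    hist, _ = np.histogramdd(pixels, bins=(bins, bins, bins), range=[(0, 256)] * 3)
    return hist.ravel() / max(pixels.shape[0], 1)

def choose_background_pixels(within, exceeding, nearby_background, bins=8):
    """For one contested region, keep as background whichever pixel set
    (within-threshold or exceeding-threshold) better matches nearby known
    background, compared by L1 distance between normalized RGB histograms."""
    target = rgb_histogram(nearby_background, bins)
    d_within = np.abs(rgb_histogram(within, bins) - target).sum()
    d_exceed = np.abs(rgb_histogram(exceeding, bins) - target).sum()
    return within if d_within <= d_exceed else exceeding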
Images where the majority of the pixels were deemed to be the result of background movement were also removed, since these images typically resulted from wind. Image-processing operations of dilations and corresponding erosions filled in small holes in fragments and merged nearby fragments, similar to McIvor (2000) (see Gonzalez and Woods 2007 for image-processing operations). Some fragments corresponding to a single animal remained separated, such as when a foreground element in the input visually divided the animal, for example when an animal passed behind a tree. In such cases, EventFinder ranked fragments and logically merged nearby fragments if their color histograms matched. Image sets with no large remaining fragments were removed as noise. In other sets, the image with the largest foreground fragment was marked for human analysis, provided the fragment did not touch an edge. Thus, for each input image set, at most one image was retained for human classification. Additionally, EventFinder computed the average direction of the center of mass of the largest fragment, indicating whether the animal is moving left or right, and the number of large fragments per image, giving an estimate of the number of animals.
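A compressed sketch of this fragment clean-up and image selection (illustrative only; the OpenCV calls, the 3 x 3 kernel, and exposing the 30-pixel cutoff as a parameter are assumptions, not the authors' implementation):

import cv2
import numpy as np

def clean_mask(mask: np.ndarray, min_pixels: int = 30) -> np.ndarray:
    """Dilate then erode (morphological closing) to fill small holes and join
    nearby fragments, then drop fragments of min_pixels or fewer."""
    kernel = np.ones((3, 3), np.uint8)
    closed = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(closed)
    keep = np.zeros_like(closed)
    for i in range(1, n):  # label 0 is the background
        if stats[i, cv2.CC_STAT_AREA] > min_pixels:
            keep[labels == i] = 255
    return keep

def pick_image_for_review(masks):
    """Return the index of the image whose largest fragment is biggest,
    skipping fragments that touch the image edge; None means 'all noise'."""
    best, best_size = None, 0
    for idx, mask in enumerate(masks):
        n, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
        for i in range(1, n):
            x, y, w, h = stats[i, :4]
            touches_edge = x == 0 or y == 0 or x + w == mask.shape[1] or y + h == mask.shape[0]
            if not touches_edge and stats[i, cv2.CC_STAT_AREA] > best_size:
                best, best_size = idx, stats[i, cv2.CC_STAT_AREA]
    return best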
Program testing

We tested program efficacy against human classification by comparing program-retained images to a dataset in which humans inspected each image, where a positive event is an animal in the image. While the characterization of images by humans can result in errors (Harris et al. 2010), we consider the human characterization as "truth." We considered the program to have identified the same event if it retained an image from the same set as the human. Following pre-processing, the events identified by the program were compared to truth using a confusion matrix of binary outcomes (Landis and Koch 1977). We consider three chief metrics for our program. Firstly, the removal rate is the proportion of images the program pre-processed but that did not require human analysis, representing reduced time and cost to researchers. Secondly, we defined the accuracy of the program to be the sum of true positive events and true negative events divided by the total number of images. Thirdly, the false positive rate indicates the images that the program assesses as needing human attention but that do not contain a true event. We investigated the sensitivity of our efficacy measures to the elapsed time between images used to determine unique event sets based on a sequence of images containing the same individual.

Results

We found that EventFinder removed on average 90.8% (SD = 4.5%) of images with an accuracy of 96.1% (SD = 2.8%) on true events (Table 1). The rate of images retained by the program as events of interest, but not classified as such by the human operator (false positive rate), was 3.4% (SD = 3.0%), showing that EventFinder reduces the human time spent on false events. Indeed, approximately two-thirds of the retained images were unique and independent events.

The sensitivity analysis on the trade-off between accuracy and the proportion of images removed suggests that EventFinder is robust with respect to the user-specified length of time that passes between subsequent images before an image is considered part of a new set (Fig. 4). The simultaneous high removal rate and retention of true event images ensures EventFinder correctly identifies the majority of events while greatly reducing the post-processing time required by humans.

Table 1 Classification performance of EventFinder with respect to the number of events characterized by a human for nine camera locations

Site  Total images  Human events  Computer events  True positive  False positive  False negative  True negative  Removal rate  Accuracy  False positive rate

1                   1213   79   96    70   17   9    1117   0.921  0.979  0.015
2                   10118  658  982   560  324  98   9136   0.903  0.958  0.034
3                   3595   17   389   17   372  0    3206   0.892  0.897  0.104
4                   2867   325  419   295  94   30   2448   0.854  0.957  0.037
5                   7170   979  1201  879  222  100  5969   0.832  0.955  0.036
6                   1491   65   93    48   28   17   1398   0.938  0.970  0.020
7                   1174   25   28    18   3    7    1146   0.976  0.991  0.003
8                   298    13   16    13   3    0    282    0.946  0.990  0.011
9                   3699   167  320   150  153  17   3379   0.913  0.954  0.043
Average                                                      0.908  0.961  0.034
Standard deviation                                           0.045  0.028  0.030

Results are for when the time delta between temporally adjacent sequences of images is set to 60 s. Accuracy is calculated as (true positive + true negative)/total images
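To make the column relationships explicit: the accuracy formula is stated in the table note, while reading the removal rate as the fraction of images not retained for review and the false positive rate as the conventional FP/(FP + TN) are our assumptions, although both reproduce the site 1 row.

# Site 1 values from Table 1
total, retained = 1213, 96        # total images; computer events (images kept for review)
tp, fp, fn, tn = 70, 17, 9, 1117

removal_rate = (total - retained) / total   # fraction of images needing no human analysis
accuracy = (tp + tn) / total                # formula from the Table 1 note
false_positive_rate = fp / (fp + tn)        # assumed conventional definition

print(round(removal_rate, 3), round(accuracy, 3), round(false_positive_rate, 3))
# -> 0.921 0.979 0.015, matching the first row of Table 1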


Fig. 4 The percentage of images removed (gray) and accuracy (white) of EventFinder across a range of values used as the time delta beyond which temporally adjacent images are considered a new sequence containing a new event

Discussion

Numerous recent ecological studies employing camera traps as non-invasive methods for collecting data emphasize their utility in answering ecological questions (reviewed in Burton et al. 2015), yet camera trap usage has issues. Harris et al. (2010) highlighted four problems with the proliferation of images in camera trap studies, namely (1) slow image classification, (2) tedious data entry leading to error, (3) struggling to keep pace with new data acquisition, and (4) inconsistent filing and naming conventions. With EventFinder and image pre-processing, we reduce the concerns surrounding the first three problems. The approximately 90% reduction in human workload enables researchers to classify only the images needed, reduces the monotony of data entry by providing humans with manageable collections of "target" (event) rich images, and keeps pace with incoming data. This is particularly true in situations where cameras often capture non-target "noisy" images due to background movement, or when images form large sequences of the same individual "loitering." We suspect these situations are common in ecological research.

Our work complements existing research in camera trap processing. Other computer programs to pre-process camera trap images identify individuals or species, but may require human supervision for training or segmentation (Yu et al. 2013; Agnieszka et al. 2017; Yousif et al. 2017). Some camera trap programs were developed for video, increasing the potential for a good background image for the identification of foreground objects of interest; however, the benefits of using video come at a cost in the field, including monitoring and frequent battery or memory card replacement (Swinnen et al. 2014; Weinstein 2015). Multiple papers consider background subtraction using video as input, but our system differs in that it is designed to consider input sequences with as few as three images (Van Droogenbroeck and Barnich 2014; Hofmann et al. 2012; McIvor 2000; Power and Schoonees 2002). Consequently, the approaches designed for video input are not directly suitable for camera trap images, for example because of the need to determine a changing background over time (Van Droogenbroeck and Barnich 2014; Hofmann et al. 2012). Power and Schoonees (2002) mention that with fewer than two background modes it is easier to use simple subtraction of an averaged background image. Since this is a near ubiquitous occurrence in our input, our approach builds on simple subtraction but addresses the issues of few images and of the majority of pixels in a region potentially coming from a foreground animal.

Other existing methods for classification use a trained neural network to identify individuals (Yu et al. 2013; Yousif et al. 2017). The system from Yousif et al. (2017) constructs a background and trained a deep convolutional neural network on 30,000 image patches of human, animal, and background to achieve a classification accuracy as high as 95.6%. Agnieszka et al. (2016), similar to our work, process camera trap images into sets based on time, create a background image, and segment areas containing motion into regions; however, their paper does not compare their system to results from human processing, and as such it is difficult to assess the accuracy of their system. An alternative system from Agnieszka et al. (2017) uses robust principal component analysis, a cascade object detector, and a support vector machine to detect snow leopards based on the spots on their coats with an accuracy of 93.74%, but requires training. Compared with previous work, EventFinder's advantages include being more general, since it is not species specific; having its performance tested against a human's classification, with an accuracy of ∼96%; and not requiring training for pre-processing camera trap images. Consequently, it is well suited as a general tool for use in camera trap studies where custom tools have not been created or trained. EventFinder is designed to work with camera trap images as opposed to the video for which other systems are designed (Desell et al. 2013; Goehner et al. 2015; Weinstein 2015). For more discussion comparing our approach to some of these video processing systems, see our previous work (Janzen et al. 2017).

The trade-off between false positives and true positive events is a concern whenever dichotomous classification occurs (Landis and Koch 1977). We have shown EventFinder to be robust to this trade-off. Prior work shows roughly equal performance in day or night conditions (Janzen et al. 2017). Depending on the study in question, researchers are encouraged to determine if this trade-off is suitable. We envision situations, involving rare species, for instance, where researchers may be more willing to accommodate false positives to avoid missing rarely occurring true events.
While EventFinder does not completely remove the human operator, it reduces the time required to classify image data for further analysis. If we assume a trained operator classifies 500 images per hour (approximately the rate of our human operator), then the ∼32,000 images used in this study required 63 h of work, while after using EventFinder only ∼7 h are required for ∼96% accuracy, and the "wasted" time spent on false positives is ∼2.5 h.
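These time figures follow directly from the Table 1 columns; the 500 images per hour rate is the paper's stated assumption, while summing the table columns to reproduce the totals is ours.

images_per_hour = 500  # assumed operator rate from the text
total_images = 1213 + 10118 + 3595 + 2867 + 7170 + 1491 + 1174 + 298 + 3699  # 31,625
retained_images = 96 + 982 + 389 + 419 + 1201 + 93 + 28 + 16 + 320           # 3,544
false_positives = 17 + 324 + 372 + 94 + 222 + 28 + 3 + 3 + 153               # 1,216

print(total_images / images_per_hour)     # ~63 h to review every image by hand
print(retained_images / images_per_hour)  # ~7 h after EventFinder pre-processing
print(false_positives / images_per_hour)  # ~2.4 h spent on false positives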
While a camera setup differs between studies, best practices can help researchers maximize data collection and minimize post-collection processing (Meek et al. 2014). Researchers should be aware of the background, which may form part of the camera's detection zone, causing "noisy" images containing moving vegetation rather than animals. While EventFinder attempts to identify and remove noise from such images, it cannot completely compensate for poor camera placement. The poor performance (high false positive rate) for one of our sites (site 3; Table 1) was primarily due to sub-optimal camera placement, where the camera was located in a shallow ditch with abundant long grass.

Camera settings allowing for bursts of images taken with each trigger provide a useful opportunity to capture multiple images over which a background image is created. They can, however, present the researcher with long sequences of temporally adjacent images that do not represent unique events. This imposes a time cost on humans who inspect each image to determine if an event occurs, when only a single image from within the sequence need be classified. EventFinder uses the burst of images to help delineate background noise from foreground animals, and by retaining a single image the program reduces human classification of redundant images.

EventFinder was developed as a stand-alone program to pre-process images, but could be incorporated into a workflow involving existing database, naming, and file management programs (Fegraus et al. 2011; Krishnappa and Turner 2014; Bubnicki et al. 2016; Niedballa et al. 2016). Currently, the classification of output depends on additional software to tag the image with pertinent metadata and summarize data into forms for statistical analysis; this software is camera-specific (e.g., Reconyx MapView Professional, Reconyx, Holmen, WI, USA) or based on existing open source options (e.g., TRAPPER, Bubnicki et al. 2016, or camtrapR, Niedballa et al. 2016). An additional future development for EventFinder is the automated identification of animal species found in the image and the ability to differentiate the number of individuals contained within an image. Currently, EventFinder correctly identifies whether an event contains a single individual versus multiple individuals 74.4% (SD = 11.4%) of the time (M. Janzen, unpublished data).

In conclusion, we have shown that EventFinder, an automated program for pre-processing camera trap images, aids in reducing the analytical bottleneck identified by Harris et al. (2010) and Spampinato et al. (2015) that results from the volumes of images recorded by camera trap studies. The ∼90% reduction in classification time by humans following pre-processing, with high accuracy (∼96%) and low loss of true events (∼4%), allows researchers to unlock the potential of camera traps as an ecological tool and to more efficiently answer crucial ecological questions.

Acknowledgements We thank the Alberta Conservation Association, Alberta Parks, and The King's University for funding, and The King's Centre for Visualization in Science for computational support. Special thanks to K. Visser, who worked on early versions of this program for an undergraduate project, and to D. Vujnovic for useful discussion on remote camera data. Alberta Parks staff provided logistic support in the Cooking Lake—Blackfoot Provincial Recreation Area, and we thank anonymous reviewers for feedback used to improve this paper.

Appendix 1

Software availability:
http://cs.kingsu.ca/~mjanzen/CameraTrapSoftware.html

Example instructions for running the program on a Windows computer are also available from this website. Screen captures for starting the program and user mode control are shown in Figs. 5 and 6, respectively.

Fig. 5 Input for an operator starting the program. The operator selects the images to process and can start the program with default settings or modify the parameters

Fig. 6 Control panel for experimenting with user mode if the operator wants to deviate from the default options

Appendix 2

Input image examples and resulting masks are shown in Fig. 7.

Fig. 7 Three input images with an overlapping animal (a, b, c). The resulting masks from image (b) without color histogram comparison (d) and using color histogram comparison (e). The generated backgrounds are shown in Fig. 3

References

Agnieszka, M., Beery, S., Flores, E., Klemesrud, L., Bayrakcismith, R. (2016). Finding areas of motion in camera trap images. In 2016 IEEE international conference on image processing (ICIP) (pp. 1334–1338).
Agnieszka, M., Beard, J.S., Bales-Heisterkamp, C., Bayrakcismith, R. (2017). Sorting camera trap images. In 2017 IEEE global conference on signal and information processing (GlobalSIP) (pp. 249–253): IEEE.
Bubnicki, J.W., Churski, M., Kuijper, D.P. (2016). TRAPPER: an open source web-based application to manage camera trapping projects. Methods in Ecology and Evolution, 7(10), 1209–1216.
Burton, A.C., Neilson, E., Moreira, D., Ladle, A., Steenweg, R., Fisher, J.T., Bayne, E., Boutin, S. (2015). REVIEW: Wildlife camera trapping: a review and recommendations for linking surveys to ecological processes. Journal of Applied Ecology, 52, 675–685.
Desell, T., Bergman, R., Goehner, K., Marsh, R., VanderClute, R., Ellis-Felege, S. (2013). Wildlife@home: combining crowd sourcing and volunteer computing to analyze avian nesting video. In Proceedings – IEEE 9th international conference on e-Science, e-Science 2013 (pp. 107–115).
Fegraus, E.H., Lin, K., Ahumada, J.A., Baru, C., Chandra, S., Youn, C. (2011). Data acquisition and management software for camera trap data: a case study from the TEAM network. Ecological Informatics, 6, 345–353.
Goehner, K., Desell, T., Eckroad, R., Mohsenian, L., Burr, P., Caswell, N., Andes, A., Ellis-Felege, S. (2015). A comparison of background subtraction algorithms for detecting avian nesting events in uncontrolled outdoor video. In 2015 IEEE 11th international conference on e-Science (e-Science) (pp. 187–195): IEEE.
Gonzalez, R.C., & Woods, R.E. (2007). Digital image processing (3rd edition). New Jersey: Pearson Prentice Hall, Pearson Education, Inc.
Harris, G., Thompson, R., Childs, J.L., Sanderson, J.G. (2010). Automatic storage and analysis of camera trap data. The Bulletin of the Ecological Society of America, 91, 352–360.
Hofmann, M., Tiefenbacher, P., Rigoll, G. (2012). Background segmentation with feedback: the pixel-based adaptive segmenter. In 2012 IEEE computer society conference on computer vision and pattern recognition workshops (pp. 38–43).
Janzen, M., Visser, K., Visscher, D.R., MacLeod, I., Vujnovic, D., Vujnovic, K. (2017). Semi-automated camera trap image processing for the detection of ungulate fence crossing events. Environmental Monitoring and Assessment, 189(10), 527.
Krishnappa, Y.S., & Turner, W.C. (2014). Software for minimalistic data management in large camera trap studies. Ecological Informatics, 24, 11–16.
Landis, J.R., & Koch, G.G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159–174.
McIvor, A.M. (2000). Background subtraction techniques. In Proceedings of image and vision computing conference New Zealand.
Meek, P., Ballard, G., Claridge, A., Kays, R., Moseby, K., O'Brien, T., O'Connell, A., Sanderson, J., Swann, D., Tobler, M., et al. (2014). Recommended guiding principles for reporting on camera trapping research. Biodiversity and Conservation, 23, 2321–2343.
Niedballa, J., Sollmann, R., Courtiol, A., Wilting, A. (2016). camtrapR: an R package for efficient camera trap data management. Methods in Ecology and Evolution, 7, 1457–1462.
Piccardi, M. (2004). Background subtraction techniques: a review. In 2004 IEEE international conference on systems, man and cybernetics (Vol. 4, pp. 3099–3104): IEEE.
Power, P.W., & Schoonees, J.A. (2002). Understanding background mixture models for foreground segmentation. In Proceedings of image and vision computing conference New Zealand (pp. 267–271).
Spampinato, C., Farinella, G.M., Boom, B., Mezaris, V., Betke, M., Fisher, R.B. (2015). Special issue on animal and insect behaviour understanding in image sequences. EURASIP Journal on Image and Video Processing, 2015, 1.
Steenweg, R., Hebblewhite, M., Kays, R., Ahumada, J., Fisher, J.T., Burton, C., Townsend, S.E., Carbone, C., Rowcliffe, M.J., Whittington, J., Brodie, J., Royle, J.A., Switalski, A., Clevenger, A.P., Heim, N., Rich, L.N. (2017). Scaling-up camera traps: monitoring the planet's biodiversity with networks of remote sensors. Frontiers in Ecology and the Environment, 15(1), 26–34.
Swinnen, K.R., Reijniers, J., Breno, M., Leirs, H. (2014). A novel method to reduce time investment when processing videos from camera trap studies. PLoS One, 9, e98881.
Van Droogenbroeck, M., & Barnich, O. (2014). ViBe: a disruptive method for background subtraction. In T. Bouwmans, F. Porikli, B. Höferlin, A. Vacavant (Eds.), Background modeling and foreground detection for video surveillance (pp. 7.1–7.23): Chapman and Hall/CRC.
Weinstein, B.G. (2015). MotionMeerkat: integrating motion video detection and ecological monitoring. Methods in Ecology and Evolution, 6, 357–362.
Yousif, H., Yuan, J., Kays, R., He, Z. (2017). Fast human-animal detection from highly cluttered camera-trap images using joint background modeling and deep learning classification. In 2017 IEEE international symposium on circuits and systems (ISCAS) (pp. 1–4).
Yu, X., Wang, J., Kays, R., Jansen, P.A., Wang, T., Huang, T. (2013). Automated identification of animal species in camera trap images. EURASIP Journal on Image and Video Processing, 2013, 52–61.

Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.