Opencv.Js: Computer Vision Processing for the Web
Total Page:16
File Type:pdf, Size:1020Kb
24 OpenCV.js: Computer Vision Processing for the Web Sajjad Taheri, Alexander Veidenbaum, Mohammad R. Haghighat Alexandru Nicolau Intel Corp Computer Science Department, Univerity of [email protected] California, Irvine sajjadt,alexv,[email protected] ABSTRACT the web based application development, the general paradigm The Web is the most ubiquitous computing platform. There is used to be deploying computationally complex logic on the are already billions of devices connected to the web that server. However, with recent client side technologies, such have access to a plethora of visual information. Understand- as Just in time compilation, web clients are able to handle ing images is a complex and demanding task which requires more demanding tasks. sophisticated algorithms and implementations. OpenCV is There are recent efforts to provide computer vision for the defacto library for general computer vision application web based platforms. For instance [4] and [5] have pro- development, with hundreds of algorithms and efficient im- vided lightweight JavaScript libraries that offer selected vi- plementation in C++. However, there is no comparable sion functions. There is also a JavaScript binding for Node.js, computer vision library for the Web offering an equal level that provides JavaScript programs with access to OpenCV of functionality and performance. This is in large part due libraries. There are requirements for a computer vision li- to the fact that most web applications used to adopt a client- brary on the web that are not entirely addressed by the server approach in which the computational part is handled above mentioned libraries that this work seeks to meet: by the server. However, with HTML5 and new client-side 1. High performance: Computer vision processing requires technologies browsers are capable of handling more complex a large amount of complex computation. This substan- tasks. This work brings OpenCV to the Web by making tial computational cost demands efficient implementa- it available natively in JavaScript, taking advantage of its tion and careful optimization. efficiency, completeness, API maturity, and its community's collective knowledge. We developed an automatic approach 2. Comprehensiveness: Complex computer vision appli- to compile OpenCV source code into JavaScript in a way cations often incorporate several algorithms from dif- that is easier for JavaScript engines to optimize significantly ferent domains to such as image processing, machine and provide an API that makes it easier for users to adopt learning, and data analysis. Hence, it is amenable to the library and develop applications. We were able to trans- provide programmers wit a comprehensive list of func- late more than 800 OpenCV functions from different vision tionality. categories while achieving near-native performance for most of them. 3. Portability: Library must be portable across all diverse web based platforms including browsers and IOT de- Keywords vices. Computer vision; Web; JavaScript; Performance 4. Adoptability: It should be easy for the community to adopt the library. This requires documentation, tuto- 1. INTRODUCTION AND MOTIVATION rials and online forums. JavaScript has rapidly evolved from a programming lan- guage designed to add scripting capabilities to make static Towards achieving these goals we decided to port OpenCV web pages more appealing[1] into the most popular and ubiq- library to JavaScript, so that can be run in different web uitous programming language deployed on billions of devices clients. We call it OpenCV.js. OpenCV[6] is an open source [2, 3]. With emergence of new technologies such as WebVR computer vision library that offers a large number of low and WebRTC, the popularity of Internet Of Things(IOT) level kernels and high level applications ranging from im- platforms, and with the abundance of visual data, computer age processing, object detection, tracking, and matching. vision processing on the web will have numerous applica- OpenCV provides efficient implementations of algorithms tions. optimized for multiple target architectures such as Desk- However, computer vision usually has high computational top and mobile processors and GPUs[7]. OpenCV has a cost, due to 1) sheer amount of computation especially on mature API which is well documented with a lot of sam- images with higher resolution and high frame rates, 2) com- ples and online tutorials. The source code is also rigorously plex algorithms to process and understand the visual data, tested. Although it is developed in C++, there are bindings 3) real time requirements for interactive applications. JavaScript for other languages such as Python and Java that expand is a scripting dynamically-typed language, which makes it its availability to a broader scope and audience. OpenCV.js performance-wise inferior to languages such as C++. With is provided as a JavaScript library that works with different Listing 2: Implementation of Strlen function with typed ar- User Application rays used as program memory (JavaScript) OpenCV.js // Memory var Memory= new Uint8Array(256 ∗ 1 0 2 4 ) ; Host ... Layout Engine JS Engine Environment function s t r l e n(ptr) f ptr= ptr j 0 ; var curr= ptr; OS and Hardware while (Memory[curr] j 0 != 0) f Figure 1: OpenCV.js interaction with user programs and curr=(curr+1) j 0 ; host environments g return (curr −ptr) j 0 ; g web environment. As shown in Figure 1, it utilizes the un- derlying environment to perform computation and media ac- cess. Environments rely on JavaScript engines for executing 2.2 WebAssembly JavaScript logic and utilize rendering and browser engines While ASM.js provides opportunity to reach near native to display graphics. Two such environments that are consid- performance, it has limitations that make it a challenge to ered in this work are: web browsers such as Mozilla Firefox use for some targets with low resources such as embedded and Node.js[8]. Table 1 provides a comparison between vi- devices. One major limitation is that the size of the gen- sion libraries that are available to JavaScript programs. erated JavaScript code tend to be large. This makes pars- ing JavaScript code to hot spot especially on mobile devices. 2. METHODOLOGY This motivates WebAssembly[11] to be pushed forward. We- bAssembly is a portable size and load-time efficient binary This section describes the approach taken to compile OpenCV format designed as an alternative target for web compila- source code directory into JavaScript equivalent. Modifica- tion. tions to the source code and techniques to adopt it to the web model will be discussed. We have used Emscripten[9] 2.3 Binding Generation and Compilation to generate JavaScript code. It is a source to source com- Regardless of the target, there is an issue with this ap- piler developed by Mozilla to translates LLVM bitcode to proach: as part of the compilation, compiler removes high JavaScript. Emscripten targets a subset of JavaScript called level language information such as class and function iden- ASM.js that allows engines to perform extra level of opti- tifiers and assign unique mangled names to them. While mizations. this is fine for compiling a C++ programs to executable 2.1 ASM.js JavaScript programs, when porting libraries, it will be al- most impossible for developers to develop programs through ASM.js[10] is a strict subset of JavaScript that is designed mangled names. To address this issue, we have provided an to allow JavaScript engines to perform additional optimiza- automatic approach to extract binding information of differ- tions that is not possible with normal JavaScript. Some ent OpenCV entities such as functions and classes and ex- JavaScript engines such as SpiderMonkey are even able to pose them to JavaScript properly. This enables the library perform Ahead-Of-Time (AOT) compilation. to have a similar interface with normal OpenCV that many ASM.js makes several limiations to the normal JavaScript: programmers are already familiar with. Table 2 shows equiv- 1) data is annotated with explicit type information. This in- alent JavaScript data types for common C++ data types . formation can be sued to eliminates dynamic type guards. Listing 3 shows how C++ and JavaScript API of OpenCV Code listing 1 demonstrate this technique. without this can be used to implement a filter. assumption, since addition between different types such as Although it is possible to convert the majority of OpenCV Numbers and Strings leads to different operations, JavaScript library to JavaScript, in order to make it portable, some of engines need to generate guards that check the data type its functionality can be skipped: dynamically. 2) ASM.js utilizes JavaScript typed arrays to provide an abstraction for program memory similar to 1. There are alternative implementations for some OpenCV the C/C++ virtual machine. Typed arrays are raw buffers functions that are better suited for the web. For in- available in JavaScript that engines are highly optimized for stance accessing file system is not trivial in web and working with it. Listing 2 demonstrate how typed arrays functions to access media devices such as cameras, and can be used to model program memory. 3) Memory is ex- graphical user interfaces have alternatives. We pro- pected to be manually managed by the programmer and no vide a JavaScript helper module that uses HTML5 garbage collection is provided. primitives to provides users with replacement func- tions to access to files hosted on the web, media de- Listing 1: Type inferrance based optimization vices through getUserMedia and display graphics us- function add(x,y) f ing HTML Canvas. x=x j 0 ; //x isa 32-bit value y=y j 0 ; //y is alsoa 32-bit value 2. OpenCV is very comprehensive. Some of the provided functions are not used in certain applications domains. // 32-bit addition Including them in the library, will make the library un- return x+y; necessary bigger.