
Broad Colorization

Yuxi Jin, Bin Sheng, Member, IEEE, Ping Li, Member, IEEE, and C. L. Philip Chen, Fellow, IEEE

Abstract— The scribble- and example-based colorization methods have fastidious requirements for users, and the training process of deep neural networks for colorization is quite time-consuming. We instead propose an automatic colorization approach with no dependence on user input and no need to endure a long training time, which combines local features and global features of the input gray-scale images. Low-, mid-, and high-level features are united as local features representing cues existing in the gray-scale image. The global feature is regarded as a data prior that guides the colorization process. The local broad learning system is trained to obtain the chrominance value of each pixel from the local features, which could be expressed as a chrominance map according to the position of pixels. Then, the global broad learning system is trained to refine the chrominance map. There are no requirements for users in our approach, and the training time of our framework is an order of magnitude faster than that of the traditional methods based on deep neural networks. To increase the user's subjective initiative, our system allows users to increase training data without retraining the system. Substantial experimental results have shown that our approach outperforms state-of-the-art methods.

Index Terms— Colorization, global broad learning system (GBLS), global features, local broad learning system (LBLS), local features.

Manuscript received July 7, 2018; revised May 26, 2019, November 26, 2019, and April 3, 2020; accepted June 21, 2020. Date of publication July 2, 2020; date of current version June 2, 2021. This work was supported in part by the National Natural Science Foundation of China under Grant 61872241, Grant 61572316, Grant 61751202, Grant 61751205, and Grant 61572540 and in part by The Hong Kong Polytechnic University under Grant P0030419 and Grant P0030929. (Yuxi Jin, Bin Sheng, and Ping Li contributed equally to this work.) (Corresponding author: Bin Sheng.) Yuxi Jin is with the Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China. Bin Sheng is with the Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China (e-mail: [email protected]). Ping Li is with the Department of Computing, The Hong Kong Polytechnic University, Hong Kong (e-mail: [email protected]). C. L. Philip Chen is with the School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China, also with the Navigation College, Dalian Maritime University, Dalian 116026, China, and also with the Faculty of Science and Technology, University of Macau, Macau (e-mail: [email protected]). Color versions of one or more of the figures in this article are available online at https://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TNNLS.2020.3004634

I. INTRODUCTION

Colorization was first introduced at the end of the 19th century [1], [2]. The main task of colorization is assigning reasonable colors to the pixels of a given gray-scale image. For colorizing a particular gray-scale image, traditional methods can be roughly summarized as scribble-based colorization [3]–[5] and example-based colorization [6]–[8]. Both of them require considerable user interaction, which is challenging for users, and the interaction process would be time-consuming for a preferable colorization result. The scribble-based methods provide users with great autonomy by requiring them to provide substantial scribbles on the input gray-scale images and extending the special scribbles to the whole image, which means the result of scribble-based colorization depends largely on the scribbles. Thus, getting a good colorization result is a challenging process for users, and the practicability of scribble-based colorization is limited. The example-based methods have to look for related images with good color design, which has a great impact on the colorization result. However, looking for an appropriate reference image requires users to have a high esthetic standard on image color; thus, the example-based methods often fail to achieve outstanding colorization results.

Recently, deep learning has achieved outstanding results in the field of image and data processing [10]–[16]. Much work has been done on colorization by deep neural networks [17]–[27]. It is proven that the deep learning-based methods can achieve an end-to-end framework and get satisfactory results. Neither preprocessing nor postprocessing from the user to design color is required. However, the architecture of a deep neural network is complex and has to train a large number of weights, leading to a long training procedure. Although the deep neural network could make a large reduction in the burden of users, the training process is intolerable for users. Especially, when the architecture of the framework is redesigned, the whole network also has to be retrained, which is unbearable.

The broad learning system is an efficient and effective learning system [28]–[31]. Given input data, mapping features and enhanced features of the broad learning system are extracted and then placed in the input layer. The enhanced nodes are obtained by enhancing the feature nodes to improve the learning ability, which allows the input nodes and the output nodes to connect directly, leading to the result that the input data itself has a certain influence on the output. In this article, we propose a new broad learning system-based automatic colorization algorithm, which can know exactly the category of pixel blocks and the color of pixels through training. Compared with existing traditional methods, the proposed method does not require users to scribble on the gray-scale image and does not have to find related images for the colorization. It is well known that hyperparameters set up by users greatly affect the performance of deep neural networks. Besides, the result of deep neural networks converges slowly and easily gets to a local minimum. However, the weights of a broad learning system can be calculated based on its flat framework, and new weights can be calculated without the retraining process. Thus, the proposed broad learning system-based method can be trained effectively and efficiently even when the nodes are increased.

We combine the global features and the local features extracted from the gray-scale image to colorize each pixel. All the extracted local features of the image are used as the input data of the local broad learning system (LBLS, see Section IV-B). Since global information is useful for colorizing a gray-scale image, we use the global feature to guide the colorization via a global broad learning system (GBLS, see Section IV-C). The colorization result of our approach is outstanding for generally combining the results of LBLS and GBLS. For special cases where users already know the color of the gray-scale image, they can increase the training data with the special colors that they want without waiting for hundreds of minutes (see Section V). In other words, our colorization procedure can be user-guided through the increment of input data, taking advantage of the special framework of the broad learning system. Our work makes the following three main contributions.

1) Efficient and Effective Learning System: The special structure of the broad learning system makes the training time significantly less than that of models based on deep neural networks, which helps our approach avoid an unbearably long training time. Furthermore, the broad learning system can be restructured without retraining while the number of input nodes is changed.

2) Global-and-Local Semantic-Guided Colorization: The local features cannot completely determine the corresponding color values of the pixels, since images with the same structure may be attached to different colors for a different scene, season, or time. Thus, the global description of a gray-scale image is obtained to guide the colorization process.

3) Optional User-Guided Colorization: Our framework can be guided by users through the increment of input data to avoid inaccurate colors caused by unusual original objects, which yields colorization results partly according to the expectation of users.

II. RELATED WORK

A. Scribble-Based Colorization

The gray-scale image can be colorized by propagating the scribbles from users. Levin et al. [3] proposed a colorization method that is based on user-specific scribbles. The main idea is to solve a cost function of the luminance difference between a special pixel and its contiguous pixels using least-square optimization. Since this method only considers the luminance between pixels, color bleeding may appear at the boundaries in the original images. Huang et al. [4] developed the colorization method by extracting reliable edge information from the original image, which is realized by an adaptive edge detection algorithm. The scribble-based colorization system, the modified color transferring system, and the colorization scheme are integrated after the edge detection. This method could prevent color bleeding around the boundaries. The colorization method proposed by Yatziv and Sapiro [32] provided a reduced set of chrominance scribbles, which could help users to get the desired results quickly. Based on the weighted geodesic distance, this method is fast, and it can change the colors of an existing color image or change the relevant luminance. Luan et al. [5] reduced the requirement of user-specific scribbles by utilizing texture similarity, and Qu et al. [33] employed feature classification for a cartoon colorization technique, which propagates color over regions emphasizing the continuity in the target image.

B. Example-Based Colorization via User-Supplied Example(s)

Unlike scribble-based methods, example-based methods could colorize gray-scale images without user-specific scribbles. The example-based methods colorize the gray-scale image by exploiting the colors of reference images. For the accuracy of the colorization result, the reference images should be similar to the target gray-scale images. Welsh et al. [34] colorized a gray-scale image by matching the pixel intensity and statistical information of its neighboring pixels between the target image and the reference image, which is inspired by the color transfer technique [35]. After transferring the gray-scale image and its reference image to the CIELAB color space, the color transfer technique computes the luminance of a particular pixel and the standard deviation with its neighboring pixels and then maps the color of the best match point to the particular pixel to colorize the gray-scale image. Irony et al. [36] improved the colorization by calculating the texture of the input image. Compared with the methods that colorize a gray-scale image depending only on pixel-level information, the high-level context could help to get a result with much better performance in spatial coherency. Charpiat et al. [6] proposed a colorization with a global optimization algorithm. Gupta et al. [8] employed superpixels to get a higher degree of spatial consistency. All of these colorization methods are limited by the user-supplied example images, and it is a difficult task for users to find a proper reference image.

C. Example-Based Colorization via Web-Supplied Example(s)

While the user-supplied examples still require users to find a suitable image, the colorization methods with web-supplied example(s) release the users' burden. Liu et al. [37] labeled a set of similar reference images from the Internet with an intrinsic image, which is used for matching. This method is robust, but it also has its limitations. A particular object or scene is needed for preciseness, and this method cannot colorize moving factors because these factors are not included in the computation of the intrinsic image. Chia et al. [7] required users to provide semantic text and segmentation cues for the major foreground objects in the scene of the target image.

D. Deep Learning-Based Colorization

Deshpande et al. [17] proposed a learning-based system to colorize gray-scale images, and the colorized results are refined by histogram correction. Thus, a proper histogram is required in their method.

Cheng et al. [27] proposed a deep learning framework for colorization. They classified multiple images into different classes and then trained a distinctive deep neural network for a particular image class. They extracted three kinds of features to describe an image, and the feature descriptor is set as the input of the corresponding network. For training the deep network, the input is pairs of reference images: gray-scale images and the corresponding colorful images. The feature descriptors are computed from the sampled gray-scale pixels, and the corresponding chrominance values are computed from the color pixels. Then, the weights of the deep learning system are obtained by the training process with feature descriptors and chrominance values. The method is later improved on the feature descriptors in [19]. In succession, Iizuka et al. [18] proposed a deep learning-based colorization, which combines the local features and global features of gray-scale images. Their network can be summarized as four parts: a low-level feature network, a mid-level feature network, a global feature network, and a colorization network. While the low-level feature network could extract the low-level features, which can be treated as the input of the mid-level feature network, the global feature is fused with the mid-level features by the fusing layer. With the guidance of the global features, the colorization result is much improved. Dahl [22] proposed a convolutional neural network (CNN)-based colorization, which maps the input gray-scale images to the chrominance values based on the extracted features. Zhang et al. [20] proposed a method based on CNN, and rare colors are emphasized in their work so that this method could produce lifelike colors. Larsson et al. [21] took advantage of the advances in deep neural networks. Their method predicted color histograms of every pixel by the trained model. Qayynm et al. [26] proposed a method to colorize thermal imagery with deep neural networks.

III. APPROACH OVERVIEW

In our approach, we extract local features from the input gray-scale image first. The local features consist of low-, mid-, and high-level features. The low-level features are arrays containing a 7×7 patch centered at a pixel, but they are not enough to estimate the chrominance of a pixel. Thus, we synchronously extract mid-level features from the gray-scale image. The mid-level features in our approach are represented as DAISY features, which are fast local descriptors for dense matching. The patch features and DAISY features illustrate the structural features of the gray-scale image. Furthermore, we extract semantic features as high-level features for further describing the local features of the input image. After the local feature extraction from the input image, we treat the local features as the input of the LBLS for the local colorization learning process.

The input nodes are calculated based on the input data, and the output of LBLS is the possible chrominance map of the gray-scale image. We also obtain global features of the input gray-scale image by the GBLS for classifying input images into classes. The possible colors specified by GBLS could refine the chrominance map obtained by LBLS, and the results of LBLS and GBLS are combined to obtain the final chrominance of the gray-scale image. Finally, the colorized result is obtained from the chrominance map and the gray-scale image. The technological process of our broad colorization is illustrated in Algorithm 1.

Algorithm 1 Broad Colorization
Input: Image IG
Output: Colorized Image IC
1: Extract the local feature X = [Xl, Xm, Xh] from the gray-scale image;
2: Calculate Z and H of LBLS from X via Eqs. (1) and (2);
3: Obtain the chrominance map by LBLS;
4: Classify the input image into one of the classes;
5: Refine the chrominance map with the guidance of the classification result;
6: Combine the luminance map and chrominance map to get the colorized image IC;

As shown in Fig. 1, our approach could colorize images of distinguished types, including historical photographs. Historical photographs are hard to colorize because there is no original color of historical photographs for training, and the limitations of the shooting techniques at that time make historical photographs low resolution. Besides, the storage could also damage the quality of historical photographs. However, our approach can get good colorization results for historical photographs. The colorization process is illustrated in Fig. 2.

Fig. 1. Colorization results of our approach. The input gray-scale image is shown in the first row, and our approach assigns a color to each pixel for colorizing it. (a) Part of a mountain. (b) Underwater scene. (c) and (d) Historical photographs that are challenging to the colorization method.

IV. BROAD COLORIZATION MODEL

Although the learning system has been proven to be an efficient method in colorization, deep learning seems to be time-consuming in the training process. Thus, we instead propose a colorization approach via the broad learning system. Our approach takes advantage of the learning method while avoiding the drawback of deep neural networks in the training process, which means that our model could be trained quickly. Since the features of a pixel and its neighboring pixels could indicate its chrominance, we extract features from the target image for colorizing a gray-scale image.
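As a concrete illustration of the local features described above, the following is a minimal sketch of the low- and mid-level extraction for a single pixel. It is not the authors' implementation: the use of scikit-image, the DAISY parameters, and the helper names are assumptions of this sketch, and the high-level semantic feature from an FCN [38], [39] is omitted.

```python
import numpy as np
from skimage.color import rgb2lab
from skimage.feature import daisy

def training_pair(rgb):
    """Split a color image into LBLS input and target (cf. Section IV-D):
    the L channel is the gray-scale input; a* and b* are the targets."""
    lab = rgb2lab(rgb)
    return lab[..., 0], lab[..., 1:]

def local_feature(gray, y, x):
    """Low-level (7x7 patch) plus mid-level (DAISY) feature for pixel (y, x).
    Recomputing the dense DAISY grid per call is wasteful; it is inlined
    here only for clarity."""
    pad = np.pad(gray, 3, mode='reflect')
    patch = pad[y:y + 7, x:x + 7].ravel()          # low-level X_l, 49 values
    descs = daisy(gray, step=1, radius=7, rings=2,
                  histograms=6, orientations=8)    # mid-level X_m (dense grid)
    dy = np.clip(y - 7, 0, descs.shape[0] - 1)     # map pixel to DAISY grid
    dx = np.clip(x - 7, 0, descs.shape[1] - 1)
    return np.concatenate([patch, descs[dy, dx]])  # X = [X_l, X_m]
```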


Fig. 2. Overview of the proposed colorization approach. Local features are extracted from the input image and are viewed as the input data of LBLS. We obtain the possible chrominance map through LBLS. The chrominance map consists of the a* and b* channels of the CIELAB color space. The GBLS classifies the input image into one of the 205 types of image sets in the places scene data set [9]. The possible colors of the matched class are used to refine the chrominance map obtained by LBLS. The chrominance map and the gray-scale image are combined for the colorized image.

A. Feature Extraction

The color of a pixel is not only related to the gray value of the pixel itself but also related to the structure around the pixel. Thus, the gray value of a single pixel cannot be used to speculate on the color of that point. We extract the 7 × 7 block of gray values around a pixel as one of the bases for predicting the color of the point. However, such small pieces of data alone are not sufficient to determine the color of the considered pixels, and the data blocks cannot determine the characteristics of the segmentation lines in the image. We use the DAISY features to help determine the mid-level feature. The DAISY features are similar to the SIFT features but faster to compute. We also use semantic features as input features to get colors accurately by applying semantic extraction algorithms [38], [39]. The input library of the local BLS contains three parts: a pixel block around a pixel, the DAISY feature, and the semantic feature. Pixel blocks are low-level features, DAISY features are mid-level features, and semantic features are high-level features.

B. Local Broad Learning System

We propose an LBLS based on the broad learning system to extract local features of an image. The structure of the broad learning system is shown in Fig. 3. There are only two layers in total in the system: the input layer and the output layer. The input nodes consist of feature nodes and enhanced nodes.

Fig. 3. Structure of BLS. The input layer is expressed as $F_n^m$, containing n groups of feature nodes $Z_i$ and m groups of enhanced nodes $H_j$. The feature nodes $Z_i$ are mapped features generated from the input data X, while the enhanced nodes H are improved feature nodes. For LBLS, the input is the extracted features, and the output is a chrominance map consisting of the color channels a* and b*. For GBLS, the input data are the input gray-scale image, which will be classified into a certain kind of image category.

While the input data X is obtained, the input layer $F_n^m$ is calculated in succession. $F_n^m$ consists of feature nodes and enhanced nodes. The feature nodes are termed $Z_i$ and are the mapping features of the input data X with the transformation function $\phi$. The ith mapping feature could be calculated by

$$Z_i = \phi_i(XW_i + \beta_i), \quad i = 1, \ldots, n \tag{1}$$

where $\phi_i$ is the transformation function of the ith group of mapping features, and the related parameters $W_i$ and $\beta_i$ are generated randomly with proper dimensions. There are n groups of feature nodes in total and k nodes per group. For determining the chrominance map from the LBLS, we set the number of feature nodes as 10 × 20. There are ten groups of feature nodes and 20 nodes per group. For enhancing the mapping features, the enhanced features are calculated based on the mapping features

$$H_j = \xi_j([Z_1, \ldots, Z_n]W_j + \beta_j), \quad j = 1, 2, \ldots, m \tag{2}$$

where $\xi_j$ is the transformation between the jth enhanced node and all feature nodes $Z_i$. Similarly, $W_j$ and $\beta_j$ are generated randomly. What has to be emphasized is that $\phi_i$ and $\phi_k$ may be different if $i \neq k$. Like $\phi_i$ and $\phi_k$, $\xi_j$ and $\xi_k$ may also be different when $j \neq k$. There are m groups of enhanced nodes with q nodes per group generated, and we set m = 1 and q = 14 000 for the chrominance map.

We represent all the input nodes with $F_n^m = [Z_1, Z_2, \ldots, Z_n \mid H_1, H_2, \ldots, H_m]$, containing n groups of feature nodes and m groups of enhanced nodes. The final BLS is formulated as $Y = F_n^m W_n^m$. The weight coefficient $W_n^m$ could be calculated by the pseudoinverse in BLS, $W_n^m = (F_n^m)^+ Y$, and the pseudoinverse of the input nodes could be calculated by $(F_n^m)^+ = \lim_{\lambda \to 0} (\lambda I + (F_n^m)^T F_n^m)^{-1} (F_n^m)^T$, where I is a unit matrix with proper dimensions and $\lambda$ is a regularization parameter for ridge regression (see the details in [30]); $\lambda$ is set as $10^{-8}$. Since the weight coefficient is calculated directly instead of by a time-consuming iterative process, the classification result can be obtained quickly.
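To make the flat training rule above concrete, here is a minimal NumPy sketch of Eqs. (1) and (2) followed by the ridge pseudoinverse solve. It is a sketch under stated assumptions rather than the authors' code: the transfer functions $\phi$ and $\xi$ are taken to be tanh (the paper does not name them), and the node counts default to small values instead of the paper's 10 × 20 feature nodes and 14 000 enhanced nodes.

```python
import numpy as np

def train_bls(X, Y, n=10, k=20, m=1, q=100, lam=1e-8, seed=0):
    """Flat BLS training: feature nodes (Eq. (1)), enhanced nodes (Eq. (2)),
    then W = (lam*I + F^T F)^{-1} F^T Y, the ridge form of the pseudoinverse.

    X: (N, d) local features; Y: (N, c) targets (e.g., a*b* chrominance).
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    feat_params, Zs = [], []
    for _ in range(n):                                  # n groups of k nodes
        Wi, bi = rng.standard_normal((d, k)), rng.standard_normal(k)
        Zs.append(np.tanh(X @ Wi + bi))                 # Z_i, Eq. (1)
        feat_params.append((Wi, bi))
    Z = np.hstack(Zs)                                   # [Z_1, ..., Z_n]
    enh_params, Hs = [], []
    for _ in range(m):                                  # m groups of q nodes
        Wj, bj = rng.standard_normal((Z.shape[1], q)), rng.standard_normal(q)
        Hs.append(np.tanh(Z @ Wj + bj))                 # H_j, Eq. (2)
        enh_params.append((Wj, bj))
    F = np.hstack([Z] + Hs)                             # F_n^m = [Z | H]
    W = np.linalg.solve(lam * np.eye(F.shape[1]) + F.T @ F, F.T @ Y)
    return W, feat_params, enh_params
```

Because the solve is a single regularized linear system rather than gradient descent, the training cost is dominated by one matrix factorization, which is the source of the speed advantage discussed above.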
C. Global Broad Learning System

The colorization results relying only on local features are usually not satisfactory, so we try to combine global features to refine the colorization result. The global features are treated as an image prior to guide the colorization toward a more accurate colorization result.


We use GBLS to classify input images, and every category has a possible color. The image categories will be introduced in detail later. After classifying an image by GBLS, the impossible colors in the chrominance map obtained by LBLS will be eliminated to refine the chrominance map. While mid-level features could help to colorize objects with complex texture, regions with low texture may be excessively segmented in our approach. However, the GBLS could help to refine the colorized result with global guidance.

The structure of GBLS is similar to the local BLS but with fewer input nodes. For determining the category of the input image from the GBLS, we set the number of feature nodes as 8 × 10, which means there are eight groups of feature nodes and ten nodes per group. We set m groups of enhanced nodes with q nodes per group to enhance the mapping features. In this article, we set m = 1 and q = 7000 for GBLS. While the input nodes and the corresponding weights are calculated by the same equations as LBLS, the input data of GBLS are the original gray-scale image. As shown in Fig. 4, the colorization results of both the indoor scene and the outdoor scene without GBLS have obvious artificiality. The chrominance maps obtained by LBLS have some unharmonious factors because similar structures in different parts may have different colors in the training images. The GBLS classifies the input image in the first row of Fig. 4 into the outdoor scene so that the color of the cloud could be refined. Similarly, the indoor scene has no color of the sky, and the bright part of the roof in the second row of Fig. 4 could be refined.

Fig. 4. Comparison between the results with GBLS and without GBLS. In the first row, the colorized result of an outdoor scene without GBLS has apparent artificiality (see the color of the building in the sky), and the result calculated with GBLS eliminated the artificiality. Similarly, the colorized result of the indoor scene without GBLS in the second row has obvious artificiality (see the color on the roof), and the colorized result with GBLS could abandon the impossible colors in the process of refining the chrominance map. (a) Input. (b) Without GBLS. (c) With GBLS.

D. Model Training

Considering that the intensity of pixels in the gray-scale image could be the luminance of the color image, we train our model using the CIELAB color space. The L channel is obtained from the input image, and our approach just needs to calculate the a* and b* channels. We trained our model on the places scene data set [9], which has 2 448 873 images taken in 205 different scenes. The image categories contain general scenes and animals, such as architecture, forest, river, garden, arch, abbey, and alley. We take 269 853 images from the data set in total, of which 254 312 are training samples and 15 541 are test samples.

During training, we transfer the training samples from the RGB color space to the CIELAB color space. The gray-scale image is the L channel of the CIELAB color space and is used as the input image. The input data of LBLS consist of features extracted from the gray-scale image, and the chrominance is used to find the appropriate weights in the system. Furthermore, the gray-scale image and its original category are used to train GBLS, which could be used to guide the colorization process.

V. MODEL OPTIMIZATION

A. Number of Nodes

The feature nodes and enhanced nodes decide the learning ability and training time of the broad learning system, and the number of network nodes indicates the complexity of a network. More nodes enable the system to colorize images taken in more difficult scenes. Conversely, fewer nodes mean that the system needs less training time. We attempt to obtain the most appropriate number of nodes by taking advantage of the special structure and quick training process of the broad learning system. While the input nodes increase, the corresponding weights are calculated based on the original system. In other words, the new weights after increasing the input nodes could be calculated without the retraining process, which shows that the structure of the broad learning system overmatches the structure of deep neural networks.

The enhanced nodes H and feature nodes Z can be increased individually or together in BLS. As shown in Fig. 5(a), a group of enhanced nodes $H_{m+1}$ with q nodes per group is increased, and the additional nodes can be calculated by $H_{m+1} = \xi_{m+1}([Z_1, \ldots, Z_n]W_{m+1} + \beta_{m+1})$. The updated input nodes $F_n^{m+1}$ are denoted as

$$F_n^{m+1} = [Z_1, Z_2, \ldots, Z_n \mid H_1, H_2, \ldots, H_{m+1}] \equiv [F_n^m \mid H_{m+1}]. \tag{3}$$

Then, the pseudoinverse of the new input nodes $(F_n^{m+1})^+$ can be calculated by

$$(F_n^{m+1})^+ = \begin{bmatrix} (F_n^m)^+ - (F_n^m)^+ H_{m+1} B^T \\ B^T \end{bmatrix} \tag{4}$$

where B can be calculated from the additional enhanced nodes $H_{m+1}$ and the original input nodes $F_n^m$ (see the details in [30]). Obviously, the weight $W_n^{m+1}$ could be calculated by

$$W_n^{m+1} = \begin{bmatrix} W_n^m - (F_n^m)^+ H_{m+1} B^T Y \\ B^T Y \end{bmatrix}. \tag{5}$$

Obviously, the new weights $W_n^{m+1}$ could be computed based on the pseudoinverse of all input nodes $F_n^{m+1}$, and only the pseudoinverse of the additional nodes needs to be calculated, without recomputing the entire input nodes, so that the system could be built quickly. The increment of feature nodes is sometimes necessary, as an insufficient mapped feature may not extract enough underlying factors from the input data.
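The following sketch shows how Eqs. (3)–(5) avoid retraining when a group of enhanced nodes is appended. The computation of the B block follows the Greville-style pseudoinverse recursion detailed in [30]; the function name and the tolerance are assumptions of this sketch, not the paper's.

```python
import numpy as np

def add_enhanced_group(F, F_pinv, W, Y, H_new, tol=1e-10):
    """Append enhanced nodes H_{m+1} without retraining (Eqs. (3)-(5)).

    F: (N, M) current input nodes; F_pinv: (M, N) their pseudoinverse;
    W: (M, c) current weights; Y: (N, c) targets; H_new: (N, p) new nodes.
    """
    D = F_pinv @ H_new                     # (M, p)
    C = H_new - F @ D                      # part of H_new outside span(F)
    if np.linalg.norm(C) > tol:            # C != 0 branch of [30]
        B_T = np.linalg.pinv(C)            # (p, N)
    else:                                  # degenerate branch of [30]
        B_T = np.linalg.solve(np.eye(D.shape[1]) + D.T @ D, D.T @ F_pinv)
    F_new = np.hstack([F, H_new])                      # Eq. (3)
    F_pinv_new = np.vstack([F_pinv - D @ B_T, B_T])    # Eq. (4)
    BY = B_T @ Y
    W_new = np.vstack([W - D @ BY, BY])                # Eq. (5)
    return F_new, F_pinv_new, W_new
```

Only a p-column block is pseudo-inverted, so the cost of growing the network scales with the added nodes rather than with the whole system, which is why the additional training time in Table I stays small.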


Fig. 5. Structure of incremental BLS. (a) Group of additional enhanced nodes $H_{m+1}$ inserted in the original BLS and connecting to the output nodes directly. (b) Increment of a group of feature nodes $Z_{n+1}$ and the corresponding enhanced nodes $H_z$. (c) Structure of the increment of feature nodes and enhanced nodes synchronously, where the feature nodes are increased before the enhanced nodes.

As shown in Fig. 5(b), the feature nodes $Z_{n+1}$ can be calculated by

$$Z_{n+1} = \phi_{n+1}(XW_{n+1} + \beta_{n+1}). \tag{6}$$

The corresponding enhanced features $H_z$ increased after the additional (n + 1)th feature node group is inserted as

$$H_z = [\xi_{z_1}(Z_{n+1}W_{z_1} + \beta_{z_1}), \ldots, \xi_{z_m}(Z_{n+1}W_{z_m} + \beta_{z_m})]. \tag{7}$$

Similarly, the weights of the updated structure $W_{n+1}^m$ can be calculated from $W_n^m$, $Z_{n+1}$, $H_z$, Y, and M (see the details in [30]). The increment of both feature nodes and enhanced nodes is shown in Fig. 5(c). The additional enhanced nodes after the incremental feature nodes, $H_{(m+1)}$, are different from $H_{m+1}$, where $H_{(m+1)} = \xi_{(m+1)}([Z_1, \ldots, Z_{n+1}]W_{(m+1)} + \beta_{(m+1)})$. Then, the new weights $W_{n+1}^{m+1}$ can be calculated by applying the matrix conversion twice. The update process is shown in Algorithm 2. All of the experimental data are obtained on the MATLAB software platform with a laptop equipped with an Intel i5 2.4-GHz CPU and 16-GB memory.

As shown in Table I, we look for a suitable number of input nodes of LBLS. While only the input nodes of LBLS are determined here to explain the node number selection process, we select 1100 images in the same category to train our LBLS, and 200 images in the same category are used as test images to calculate the colorization accuracy. We set the feature nodes as 100 and the enhanced nodes as 4000 preliminarily, and the nodes are gradually increased later. We set the number of feature nodes from 100 to 800 and the enhanced nodes from 4000 to 18 000 to choose the most suitable numbers. As we can see, the additional training time (AdTT) is much less than the basic training time. While there are 100 feature nodes and the enhanced nodes are increased from 4000 to 6000, only 7.841 s are needed. The basic training time of 43.5274 s is not spent again in the incremental process because the system does not need to be retrained.

B. User-Guided Colorization

All of the learning-based colorization methods rely on the training data and the generalization ability of the system, and it is impossible for the training data to embody everything in the world. If we want to colorize something whose structures have never appeared in the training data, the colorization will fail, or the colorized result may not be similar to the ground truth. In order to increase the user's controllability, we allow users to increase the input data freely.

As shown in Fig. 6, the structure with an increment of input data X is similar to the original BLS structure, yet the dimensions of all matrices change. As earlier, the weights ${}^{x}W_n^m$ can be obtained from the original weights $W_n^m$. The mapping features of the additional data $X_a$ can be calculated by ${}^{x}Z_i = \phi_i(X_a W_i + \beta_i)$, $i = 1, \ldots, n$, and the corresponding enhanced features can be calculated by ${}^{x}H_j = \xi_j([{}^{x}Z_1, \ldots, {}^{x}Z_n]W_j + \beta_j)$, $j = 1, 2, \ldots, m$. The updated input nodes can be noted as

$${}^{x}F_n^m = \begin{bmatrix} F_n^m \\ F_a \end{bmatrix}, \quad F_a = [{}^{x}Z_1, \ldots, {}^{x}Z_n \mid {}^{x}H_1, \ldots, {}^{x}H_m]. \tag{8}$$

The pseudoinverse of the updated input could be calculated by

$$({}^{x}F_n^m)^+ = \big[(F_n^m)^+ - B\,F_a(F_n^m)^+ \;\big|\; B\big] \tag{9}$$

where B can be determined from the increased features and the original features (see the details in [30]). Finally, the updated weights could be calculated by

$${}^{x}W_n^m = W_n^m + B\big(Y_a - F_a W_n^m\big) \tag{10}$$

where $Y_a$ is the chrominance of the additional colorful input images, so that the updated weights are obtained based on the weights learned without the additional input data. Thus, the system can be restructured quickly. The details of the structure updating are shown in Algorithm 3.
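Below is a sketch of Eqs. (8)–(10), absorbing new user-supplied training pairs into an already trained system. Again the B block follows the incremental pseudoinverse recursion of [30], and the helper signature is an assumption of this sketch rather than the paper's interface.

```python
import numpy as np

def add_training_data(F, F_pinv, W, Fa, Ya, tol=1e-10):
    """Absorb additional samples without retraining (Eqs. (8)-(10)).

    F: (N, M) input nodes; F_pinv: (M, N) pseudoinverse; W: (M, c) weights;
    Fa: (p, M) nodes of the new inputs X_a, i.e., [xZ_1..xZ_n | xH_1..xH_m];
    Ya: (p, c) chrominance targets of the new inputs.
    """
    D = F_pinv.T @ Fa.T                   # (N, p)
    C = Fa.T - F.T @ D                    # (M, p): novelty of the new rows
    if np.linalg.norm(C) > tol:           # C != 0 branch of [30]
        B = np.linalg.pinv(C).T           # (M, p)
    else:                                 # new rows lie in the old row space
        B = np.linalg.solve(np.eye(D.shape[1]) + D.T @ D, D.T @ F_pinv.T).T
    F_new = np.vstack([F, Fa])                        # Eq. (8)
    F_pinv_new = np.hstack([F_pinv - B @ D.T, B])     # Eq. (9)
    W_new = W + B @ (Ya - Fa @ W)                     # Eq. (10)
    return F_new, F_pinv_new, W_new
```

In the degenerate branch the new samples add no new information direction, and the update reduces to a recursive least-squares correction of W; in either case no full retraining pass over the old data is needed.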


Algorithm 2 Node Number Selection
Input: Original feature nodes $n_o$, original enhanced nodes $m_o$, training samples X
Output: Final feature nodes $n_f$, final enhanced nodes $m_f$
1: Calculate the accuracy and count the training time;
2: $n = n_o$, $m = m_o$;
3: for $i = 1; i \le n$ do
4:   Randomly generate $W_i$, $\beta_i$;
5:   Calculate $Z_i = \phi_i(XW_i + \beta_i)$;
6: end for
7: for $j = 1; j \le m$ do
8:   Randomly generate $W_j$, $\beta_j$;
9:   Calculate $H_j = \xi_j([Z_1, \ldots, Z_n]W_j + \beta_j)$;
10: end for
11: Calculate $F_n^m = [Z_1, Z_2, \ldots, Z_n \mid H_1, H_2, \ldots, H_m]$;
12: Calculate $(F_n^m)^+ = \lim_{\lambda \to 0} (\lambda I + (F_n^m)^T F_n^m)^{-1} (F_n^m)^T$;
13: Calculate $W_n^m = (F_n^m)^+ Y$;
14: repeat
15:   if a group of enhanced nodes is added then
16:     Randomly generate $W_{m+1}$, $\beta_{m+1}$;
17:     Calculate $H_{m+1} = \xi_{m+1}([Z_1, \ldots, Z_n]W_{m+1} + \beta_{m+1})$;
18:     Update $F_n^{m+1}$;
19:     Calculate $(F_n^{m+1})^+$ by Eq. (4);
20:     Calculate $W_n^{m+1}$ by Eq. (5);
21:     $m = m + 1$;
22:   end if
23:   if a group of feature mappings is added then
24:     Randomly generate $W_{n+1}$, $\beta_{n+1}$;
25:     Calculate $Z_{n+1} = \phi_{n+1}(XW_{n+1} + \beta_{n+1})$;
26:     Update $F_{n+1}^m$;
27:     Calculate $(F_{n+1}^m)^+$;
28:     Calculate $W_{n+1}^m$;
29:     $n = n + 1$;
30:   end if
31: until the accuracy is satisfactory;
32: $n_f = n$;
33: $m_f = m$;

TABLE I
FOR SELECTING THE MOST SUITABLE INPUT NODES OF LBLS, WE SET DIFFERENT NUMBERS OF FEATURE NODES AND ENHANCED NODES TO FIND THE PROPER ACCURACY AND AN ACCEPTABLE TRAINING TIME

Fig. 6. Structure of BLS when data increase. The structure is similar to the basic BLS structure, but the dimensions of the related parameters are changed.

As shown in Fig. 7, the colorized images in the second column are not natural. We cannot colorize the Ailurus fulgens (red pandas) well because they have never appeared in the training images, and there is no similar structure in the training process. The colorized results in the third column are improved with the incremental training data, as shown in Fig. 8. As images with red pandas in different scenes increase, the results may be much improved.

As images containing people are common and there always exist special requirements for the colorization of people's skin, we trained our system with images of people from different regions. According to their skin color, people are generally classified into three different categories: white, yellow, and dark. For each category, 1000 images are taken to train the system. As shown in Fig. 9, the colorized results for images of people from different regions are distinguished accurately.

VI. EXPERIMENTAL RESULTS

A. Comparison With State-of-the-Art Methods

Since our method is based on the broad learning system and directly colorizes black/white or gray images without reference images or user interaction, we compare our method with the deep learning-based state-of-the-art methods by Iizuka et al. [18], Zhang et al. [20], Larsson et al. [21], Dahl [22], Richart et al. [23], Xiao et al. [24], and Cheng et al. [27]. It has to be mentioned that these methods also directly colorize the target black/white or gray images. Fig. 10 shows the qualitative comparison with two selected representative cases for each method, from which we can see that the proposed method can generate more saturated and colorful results than the existing deep learning-based state-of-the-art methods.

The first row of Fig. 10 shows the comparison with Iizuka et al. [18], who trained their system using the place scene data set. The roof in our result is much more realistic than in the result of Iizuka et al. [18]. Besides, compared with Iizuka et al. [18], the clothes of the girl in our result are significantly separated from her face. The second row of Fig. 10 shows the comparison with Cheng et al. [27]. Our result can generate a more realistic sky color, and the shape of the cloud in the first input image is well maintained, while the result of Cheng et al. [27] colorized the sky between two clouds into white. Considering that the sky is colorized as blue, the buildings of the second input image should be gray, which is quite common in a modern city; however, they are not in the result of Cheng et al. [27].


Algorithm 3 System Updated With New Inputs
Input: Training samples X
Output: W
1: Calculate the accuracy and count the training time;
2: for $i = 1; i \le n$ do
3:   Randomly generate $W_i$, $\beta_i$;
4:   Calculate $Z_i = \phi_i(XW_i + \beta_i)$;
5: end for
6: for $j = 1; j \le m$ do
7:   Randomly generate $W_j$, $\beta_j$;
8:   Calculate $H_j = \xi_j([Z_1, \ldots, Z_n]W_j + \beta_j)$;
9: end for
10: Calculate $F_n^m = [Z_1, Z_2, \ldots, Z_n \mid H_1, H_2, \ldots, H_m]$;
11: Calculate $(F_n^m)^+ = \lim_{\lambda \to 0} (\lambda I + (F_n^m)^T F_n^m)^{-1} (F_n^m)^T$;
12: Calculate $W_n^m = (F_n^m)^+ Y$;
13: repeat
14:   if new inputs $X_a$ are added then
15:     Update ${}^{x}F_n^m$ by Eq. (8);
16:     Calculate $({}^{x}F_n^m)^+$ by Eq. (9);
17:     Calculate ${}^{x}W_n^m$ by Eq. (10);
18:   end if
19: until the accuracy is satisfactory;

Fig. 7. Since the training data do not contain images of the Ailurus fulgens, namely, red pandas, the colorization result shown in the middle column is not satisfactory. Thus, we increase images of red pandas in the training data. The final result shown in the right column is obviously improved.

Fig. 8. Increased image set of Ailurus fulgens (red pandas).

Fig. 9. Examples of colorizing images containing people. As the processing of people in images is unique, we train our network with images containing people of different complexions to improve the practicability of the system. (a) First character image example. (b) Second character image example. (c) Third character image example.

We compared our method with Zhang et al. [20] in the third row of Fig. 10, from which we can see that our method can better handle the grassland and the color separation between objects and grassland. The comparison results with Larsson et al. [21] are shown in the fourth row of Fig. 10. As we all know, a cock is mostly red, which is well achieved by our method, while the result of Larsson et al. [21] is not that realistic. What is more, Larsson et al. [21] colorized part of the wall in the second input image into blue, which should be consistent with the other regions in color. Obviously, our method colorized the wall well.

Dahl [22] built a quite famous and mature web site generating colorized images for black/white or gray images. We also compared our method with his method. From the fifth row of Fig. 10, we can see our method can get more appealing results. The method of Richart et al. [23] is based on training a simple classifier using backpropagation over a training set of color and corresponding gray-scale pictures. The classifier predicts the color of a pixel based on the gray level of the pixels surrounding it. As shown in the sixth row of Fig. 10, the results of Richart et al. [23] look quite realistic, but the results of our method are more appealing and saturated. As we mentioned in Section V, our method utilizes the local and global features of the gray image, and we also compared our method with Xiao et al. [24], which combined local features and global features, in the last row of Fig. 10. By contrast, our method colorized the flower into a more realistic color in the first image and got a more realistic sky colorization result in the second image.

B. User Study

We performed a user study of real versus fake two-alternative forced choice on Amazon Mechanical Turk (AMT) to evaluate how compelling our colorization results are to human observers. The perceptual realism test is similar to the approach taken by Zhang et al. [20] and He et al. [41]. According to the characteristics of our colorization method mentioned earlier, we took the learning-based methods by Iizuka et al. [18], Zhang et al. [20], and Larsson et al. [21] for comparison.


Fig. 10. Comparison with the state of the art. We compared our proposed method with the methods by Iizuka et al. [18], Zhang et al. [20], Larsson et al. [21], Dahl [22], Richart et al. [23], Xiao et al. [24], and Cheng et al. [27]. Each row shows the comparison with one method. The first and fourth columns are input gray images, the second and fifth columns are results of state-of-the-art methods, and the third and sixth columns are the results of our proposed method.

There was a total of 100 participants invited to take part in the study. They were randomly divided into four groups, each of which was given a series of image pairs. Every image pair consisted of a ground truth color photo and a recolorized version produced by our method or the learning-based comparison methods and was shown side by side to the participants. Which side shows the ground truth was randomly set. We gave the participants 3 s to observe the image pairs shown on the screen and asked them to choose the one they considered as real within 5 s. For fair comparisons, we posted the results of the four comparison methods (including ours) for 55 images simultaneously and randomly showed each group only the results of one method with the ground truth so as to ensure that all algorithms were tested in equivalent conditions.

For the first five image pairs, participants were given feedback about their choices to make sure that all participants were competent at the test. We recorded the choices of all participants for the remaining 50 image pairs and analyzed these records. The total fooling rate for all the compared methods (including ours and the ground truth) is shown in Table II, which validates the effectiveness of our method. Besides, we also recorded the average runtime of our method and the compared learning-based methods colorizing these tested images, and we can see that our method took the shortest time. The participants' competency at detecting subtle errors in our colorization results can be seen in Fig. 11.


Fig. 11. Examples from the user study. GT denotes the ground truth. The percentages show how often the participants chose our results as real rather than the ground truth. Input images: ImageNet data set [40].

TABLE II
FOOLING RATE OF THE USER STUDY FOR REAL VERSUS FAKE, AND THE AVERAGE COLORIZATION RUNTIME FOR TESTED IMAGES. WE COMPARED OUR COLORIZATION METHOD WITH SOME LEARNING-BASED METHODS. INPUT IMAGES: IMAGENET DATA SET [40]

Fig. 13. Comparison with the deep exemplar-based method [41] over legacy photos. The reference images are the images used by He et al. [41] to guide the generation of their final results.

Fig. 12. Comparison with methods by Larsson et al. [21], Zhang et al. [20], and Iizuka et al. [18] over black-and-white images.

The percentages denote the statistics that the participants chose our results as real rather than the ground truth. In all pairs above the dotted line, the participants preferred our colorization results to be real rather than the ground truth, which is partly due to the more saturated and colorful appearance of our results. Below the dotted line are examples that could not easily fool the participants. From Fig. 11, we can see our method can get more appealing images than the ground truth in some cases.

Fig. 14. Comparison with the method under user guidance [42]. The user-specific images contain color blocks that guide the colorization procedure of their results.

C. Colorization Over Past Images

We also colorized historic black-and-white images and compared them with the results of some existing state-of-the-art colorization methods (see [18], [20], [21], [41], and [42]) to further explain the efficiency of our proposed model, which can be seen in Figs. 12–14. All the tested historic black-and-white images are taken from the project web sites of He et al. [41] and Zhang et al. [42].


Fig. 15. Colorization results of our approach on images of different scenes. These pictures belong to different types of images and contain different objects. They were photographed at different times and in different seasons. Most of them were downloaded from the Internet randomly, part of them are from the place test set, and part of them are from films.


TABLE III
TRAINING TIME CHANGE WITH ADDED INPUT NODES. ADTT: ADDITIONAL TIME SPENT WHEN NODES ARE ADDED. ACTT: TOTAL TIME, CONSISTING OF THE SUM OF THE ADTT AND THE TRAINING TIME BEFORE ADDING INPUT NODES

TABLE IV
IMAGES WITH DIFFERENT RESOLUTIONS ARE RUN ON BOTH CPU AND GPU. THE GPU COULD SPEED UP OUR COMPUTATION ON IMAGES OF DIFFERENT SIZES

Fig. 16. Limitation of our approach. For artificial images, the color is painted by the will of the painters, and our approach colorizes the input gray-scale image based on similar structures in the training data. The first column shows the input images, and the second column shows the colorized results of our approach. As we can see, the color is lost in the gray-scale image. Compared with the ground truth (the third column), our result is totally different.

Fig. 12 shows the results of our proposed method and some state-of-the-art learning-based colorization methods, where we can see our results are more saturated and colorful. The second column of Fig. 13 shows the results of a deep exemplar-based method [41], which colorizes images according to the color distribution characteristics of the reference images (as shown in the third column of Fig. 13). The second column of Fig. 14 shows the results of a user-guided image colorization method [42], which colorizes images under the guidance of user-specified colors (shown as color blocks in the images of the third column of Fig. 14). From the second column of Figs. 13 and 14, we can see the two methods can achieve quite good results with the guidance of either reference images or users. Although our model does not need reference images or user interaction to colorize images, our results still look very appealing and realistic, which can be seen from the third column of Figs. 13 and 14.

D. Performance Evaluation

Our core algorithm is developed on an NVIDIA TITAN Xp GPU. All of our experiments are conducted on the aforementioned GPU and an Intel E5 4-GHz CPU. We trained our system with images in 15 different categories. As shown in Table III, we first trained the system with 5000 images, and then, the input images are increased gradually. Because of the special structure of BLS, our system could be trained based on the original system while the input data are increased. Only the AdTT is required when the input data are increased. Although just the AdTT is required, we also compute the accumulative training time (AcTT). While the training images are increased, the AcTT grows slowly.

Besides, we test the system with 500 images in the chosen categories. We run images with different resolutions and calculate the average running time for comparison. As shown in Table IV, the running time for different resolutions is sped up by 4.3–5.5 times on the GPU platform.

We show some additional colorization results on different kinds of images. As shown in Fig. 15, unconstrained images from different data sets are colorized by our approach. Most of them are downloaded from the Internet randomly, part of them are chosen from the test set of the place data set, and part of them are selected from films. As we can see, all of the colorized results are close to the truth, even the images with little color (the scene of ice and snow). Since our system colorizes images combining the local features and global features, the generalization ability of our approach is proven.

VII. CONCLUSION

We proposed a broad learning system-based colorization for colorizing gray-scale images. The LBLS is used to obtain the chrominance map, and the map is refined by the result of the GBLS. Local features and global features are combined to get natural results. Substantial experimental colorized results of our approach are shown to be close to nature, and the PSNR comparison between our results and the ground truth shows that our results are similar to the ground truth. The comparison with existing colorization methods is conducted on the place scene data set and images downloaded from the Internet and shows that our method outperforms other methods not only in training time but also in accuracy. However, there is a limitation in our method: we cannot colorize images with unfixed colors well. As shown in the first column of Fig. 16, the original colors are lost in the gray-scale images. This is mainly because the colors in the ground truth do not appear with a particular structure, so the proposed system could not colorize them well. In future research, we will try to tackle this problem.


REFERENCES

[1] W. Markle and B. Hunt, "Coloring a black and white signal using motion detection," U.S. Patent 4 755 870, Jul. 5, 1988.
[2] R. Cooper, "Colorization and moral rights: Should the united states adopt unified protection for artists?" Journalism Quart., vol. 68, no. 3, pp. 465–473, Sep. 1991.
[3] A. Levin, D. Lischinski, and Y. Weiss, "Colorization using optimization," ACM Trans. Graph., vol. 23, no. 3, pp. 689–694, 2004.
[4] Y.-C. Huang, Y.-S. Tung, J.-C. Chen, S.-W. Wang, and J.-L. Wu, "An adaptive edge detection based colorization algorithm and its applications," in Proc. ACM Multimedia, 2005, pp. 351–354.
[5] Q. Luan, F. Wen, D. Cohen-Or, L. Liang, Y.-Q. Xu, and H.-Y. Shum, "Natural image colorization," in Proc. Eurographics Symp. Rendering, 2007, pp. 309–320.
[6] G. Charpiat, M. Hofmann, and B. Schölkopf, "Automatic image colorization via multimodal predictions," in Proc. ECCV, 2008, pp. 126–139.
[7] A. Y.-S. Chia et al., "Semantic colorization with Internet images," ACM Trans. Graph., vol. 30, no. 6, p. 156, 2011.
[8] R. K. Gupta, A. Y.-S. Chia, D. Rajan, E. S. Ng, and Z. Huang, "Image colorization using similar images," in Proc. ACM Multimedia, 2012, pp. 369–378.
[9] R. Raturi, "Adapting deep features for scene recognition utilizing places database," in Proc. 2nd Int. Conf. Inventive Commun. Comput. Technol. (ICICCT), Apr. 2018, pp. 487–495.
[10] W. Guo and P. Aarabi, "Hair segmentation using heuristically-trained neural networks," IEEE Trans. Neural Netw. Learn. Syst., vol. 29, no. 1, pp. 25–36, Jan. 2018.
[11] S. Mohamad, A. Bouchachia, and M. Sayed-Mouchaweh, "A bi-criteria active learning algorithm for dynamic data streams," IEEE Trans. Neural Netw. Learn. Syst., vol. 29, no. 1, pp. 74–86, Jan. 2018.
[12] K. Ding, C. Huo, B. Fan, S. Xiang, and C. Pan, "In defense of locality-sensitive hashing," IEEE Trans. Neural Netw. Learn. Syst., vol. 29, no. 1, pp. 87–103, Jan. 2018.
[13] H.-G. Han, W. Lu, Y. Hou, and J.-F. Qiao, "An Adaptive-PSO-based self-organizing RBF neural network," IEEE Trans. Neural Netw. Learn. Syst., vol. 29, no. 1, pp. 104–117, Jan. 2018.
[14] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436–444, May 2015.
[15] C. Yan, L. Li, C. Zhang, B. Liu, Y. Zhang, and Q. Dai, "Cross-modality bridging and knowledge transferring for image understanding," IEEE Trans. Multimedia, vol. 21, no. 10, pp. 2675–2685, Oct. 2019.
[16] C. Yan et al., "A fast uyghur text detector for complex background images," IEEE Trans. Multimedia, vol. 20, no. 12, pp. 3389–3398, Dec. 2018.
[17] A. Deshpande, J. Rock, and D. Forsyth, "Learning large-scale automatic image colorization," in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Dec. 2015, pp. 567–575.
[18] S. Iizuka, E. Simo-Serra, and H. Ishikawa, "Let there be color!: Joint end-to-end learning of global and local image priors for automatic image colorization with simultaneous classification," ACM Trans. Graph., vol. 35, no. 4, p. 110, 2016.
[19] Z. Cheng, Q. Yang, and B. Sheng, "Colorization using neural network ensemble," IEEE Trans. Image Process., vol. 26, no. 11, pp. 5491–5505, Nov. 2017.
[20] R. Zhang, P. Isola, and A. A. Efros, "Colorful image colorization," in Proc. ECCV, 2016, pp. 649–666.
[21] G. Larsson, M. Maire, and G. Shakhnarovich, "Learning representations for automatic colorization," in Proc. ECCV, 2016, pp. 577–593.
[22] R. Dahl. (2016). Automatic Colorization. [Online]. Available: http://tinyclouds.org/colorize
[23] M. Richart, J. Visca, and J. Baliosian, "Image colorization with neural networks," in Proc. Workshop Comput. Vis. (WVC), Oct. 2017, pp. 55–60.
[24] Y. Xiao, P. Zhou, Y. Zheng, and C.-S. Leung, "Interactive deep colorization using simultaneous global and local inputs," in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), May 2019, pp. 1887–1891.
[25] A. M. El Dakrory and M. Tawfik, "Identifying the attitude of dynamic systems using neural network," in Proc. Int. Workshop Recent Adv. Robot. Sensor Technol. Humanitarian Demining Counter-IEDs (RST), Oct. 2016, pp. 1–4.
[26] U. Qayynm, Q. Ahsan, Z. Mahmood, and M. A. Chcmdary, "Thermal colorization using deep neural network," in Proc. 15th Int. Bhurban Conf. Appl. Sci. Technol. (IBCAST), Jan. 2018, pp. 325–329.
[27] Z. Cheng, Q. Yang, and B. Sheng, "Deep colorization," in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Dec. 2015, pp. 415–423.
[28] C. L. P. Chen and Z. Liu, "Broad learning system: A new learning paradigm and system without going deep," in Proc. 32nd Youth Academic Annu. Conf. Chin. Assoc. Autom. (YAC), May 2017, pp. 1271–1276.
[29] Z. Liu, J. Zhou, and C. L. P. Chen, "Broad learning system: Feature extraction based on K-means clustering algorithm," in Proc. 4th Int. Conf. Inf., Cybern. Comput. Social Syst. (ICCSS), Jul. 2017, pp. 683–687.
[30] C. L. P. Chen and Z. Liu, "Broad learning system: An effective and efficient incremental learning system without the need for deep architecture," IEEE Trans. Neural Netw. Learn. Syst., vol. 29, no. 1, pp. 10–24, Jan. 2018.
[31] Z. Liu and C. L. P. Chen, "Broad learning system: Structural extensions on single-layer and multi-layer neural networks," in Proc. Int. Conf. Secur., Pattern Anal., Cybern. (SPAC), Dec. 2017, pp. 136–141.
[32] L. Yatziv and G. Sapiro, "Fast image and video colorization using chrominance blending," IEEE Trans. Image Process., vol. 15, no. 5, pp. 1120–1129, May 2006.
[33] Y. Qu, T.-T. Wong, and P.-A. Heng, "Manga colorization," ACM Trans. Graph., vol. 25, no. 3, pp. 1214–1220, Jul. 2006.
[34] T. Welsh, M. Ashikhmin, and K. Mueller, "Transferring color to greyscale images," ACM Trans. Graph., vol. 21, no. 3, pp. 277–280, Jul. 2002.
[35] F. Wu, W. Dong, Y. Kong, X. Mei, J.-C. Paul, and X. Zhang, "Content-based colour transfer," Comput. Graph. Forum, vol. 32, no. 1, pp. 190–203, Feb. 2013.
[36] R. Irony, D. Cohen-Or, and D. Lischinski, "Colorization by example," in Proc. Eurographics Symp. Rendering, 2005, pp. 201–210.
[37] X. Liu et al., "Intrinsic colorization," ACM Trans. Graph., vol. 27, no. 5, pp. 152-1–152-9, 2008.
[38] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2015, pp. 3431–3440.
[39] E. Shelhamer, J. Long, and T. Darrell, "Fully convolutional networks for semantic segmentation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 4, pp. 640–651, Apr. 2017.
[40] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," Commun. ACM, vol. 60, no. 6, pp. 84–90, May 2017.
[41] M. He, D. Chen, J. Liao, P. V. Sander, and L. Yuan, "Deep exemplar-based colorization," ACM Trans. Graph., vol. 37, no. 4, pp. 1–16, 2018.
[42] R. Zhang et al., "Real-time user-guided image colorization with learned deep priors," ACM Trans. Graph., vol. 36, no. 4, p. 119, Aug. 2017.

Yuxi Jin received the B.Eng. degree in software engineering from Henan University, Kaifeng, China, in 2015, and the M.Eng. degree in computer science from the East China University of Science and Technology, Shanghai, China, in 2018. She is currently pursuing the Ph.D. degree in computer science with the Faculty of Information Technology, Macau University of Science and Technology, Macau. She is currently with the Visual Media and Data Management Laboratory as a Visiting Scholar, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai. Her current research interests include image colorization, broad learning system, stylization, and computer vision.

Bin Sheng (Member, IEEE) received the B.A. degree in English and the B.Eng. degree in computer science from the Huazhong University of Science and Technology, Wuhan, China, both in 2004, the M.Sc. degree in software engineering from the University of Macau, Macau, in 2007, and the Ph.D. degree in computer science and engineering from The Chinese University of Hong Kong, Hong Kong, in 2011. He is currently an Associate Professor with the Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China. His current research interests include virtual reality and computer graphics.


Ping Li (Member, IEEE) received the Ph.D. degree in computer science and engineering from The Chinese University of Hong Kong, Hong Kong, in 2013. He is currently a Research Assistant Professor with The Hong Kong Polytechnic University, Hong Kong. He has one image/video processing national invention patent and has an excellent research project reported worldwide by ACM TechNews. His current research interests include image/video stylization, artistic rendering and synthesis, and creative media.

C. L. Philip Chen (Fellow, IEEE) received the Ph.D. degree in electrical engineering from Purdue University, West Lafayette, IN, USA, in 1988. He is currently a Chair Professor and the Dean of the School of Computer Science and Engineering, South China University of Technology, Guangzhou, China. Being a Program Evaluator of the Accreditation Board of Engineering and Technology Education (ABET) in the USA for computer engineering, electrical engineering, and software engineering programs, he successfully architected the Engineering and Computer Science Programs of the University of Macau, receiving accreditations from the Washington/Seoul Accord through the Hong Kong Institute of Engineers (HKIE), which is considered his utmost contribution to engineering/computer science education for Macau as the former Dean of the Faculty of Science and Technology. He is also a Highly Cited Researcher by Clarivate Analytics in 2018 and 2019. His current research interests include systems, cybernetics, and computational intelligence.

Dr. Chen is also a fellow of the American Association for the Advancement of Science (AAAS), the International Association for Pattern Recognition (IAPR), the Chinese Association of Automation (CAA), and HKIE, and a member of Academia Europaea (AE), the European Academy of Sciences and Arts (EASA), and the International Academy of Systems and Cybernetics Science (IASCYS). He received the IEEE Norbert Wiener Award in 2018 for his contributions in systems and cybernetics, and machine learning. He was a recipient of the 2016 Outstanding Electrical and Computer Engineers Award from his alma mater, Purdue University, where he graduated in 1988, after he graduated from the University of Michigan at Ann Arbor, Ann Arbor, MI, USA, in 1985. He was the Chair of the TC 9.1 Economic and Business Systems of the International Federation of Automatic Control from 2015 to 2017. He was the IEEE Systems, Man, and Cybernetics Society President from 2012 to 2013 and the Editor-in-Chief of the IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS from 2014 to 2019. He is also the Editor-in-Chief of the IEEE TRANSACTIONS ON CYBERNETICS and an Associate Editor of the IEEE TRANSACTIONS ON FUZZY SYSTEMS. He is also the Vice-President of the Chinese Association of Automation (CAA).
