
Article

Anti-Shake HDR Imaging Using RAW Image Data

Yan Liu * , Bingxue Lv, Wei Huang, Baohua Jin and Canlin Li

School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450002, China; [email protected] (B.L.); [email protected] (W.H.); [email protected] (B.J.); [email protected] (C.L.) * Correspondence: [email protected]; Tel.: +86-136-6386-9878

 Received: 10 March 2020; Accepted: 13 April 2020; Published: 16 April 2020 

Abstract: Camera shaking and object movement can cause the output images to suffer from blurring, noise, and other artifacts, leading to poor image quality and low dynamic range. Raw images contain minimally processed data from the image sensor compared with JPEG images. In this paper, an anti-shake high-dynamic-range imaging method is presented. This method is more robust to camera motion than previous techniques. An algorithm based on information entropy is employed to choose a reference image from the raw image sequence. To further improve the robustness of the proposed method, the Oriented FAST and Rotated BRIEF (ORB) algorithm is adopted to register the inputs, and a simple Laplacian pyramid fusion method is implemented to generate the high-dynamic-range image. Additionally, a large dataset of 435 image sequences covering various scenes is collected, including the corresponding JPEG image sequences, to test the effectiveness of the proposed method. The experimental results illustrate that the proposed method achieves better anti-shake performance and preserves more details for real scene images than traditional algorithms. Furthermore, the proposed method is suitable for extreme-exposure image pairs, can be applied to binocular vision systems to acquire high-quality real scene images, and has lower algorithmic complexity than deep learning-based fusion methods.

Keywords: raw data; high dynamic range; camera shake; image fusion

1. Introduction

Raw images capture all image data recorded by the sensor, providing many advantages compared with common compressed image formats, such as lossless compression, linear response to scene radiance, non-destructive white balance, wider gamut, higher dynamic range (generally 12–14 bits), and so on. Most vision tasks are performed on 8-bit standard RGB (sRGB) compressed images because raw images require large amounts of memory and are not well supported by many imaging applications. However, even when compression algorithms are highly optimized, sRGB images are heavily processed by the color Image Processing Pipeline (IPP) in terms of color and scene radiance. The IPP may introduce noise and loss of detail, distorting the sRGB image information used in many advanced vision tasks. Hence, researchers need to increase the complexity of algorithms to obtain better results when using compressed JPEG formats in vision tasks. High-dynamic-range (HDR) imaging provides the capacity to capture, manipulate, and display real-world lighting in different imaging conditions, significantly improving the visual experience. Some HDR imaging methods enhance the dynamic range by using particular sensors [1–3], and others use image processing techniques to generate a high-quality HDR image by merging multiple low-dynamic-range (LDR) images captured sequentially with different exposure times. HDR imaging has many applications in augmented reality [4,5], lighting [6–8], saliency detection [9,10], data hiding [11], quality assessment [12,13], image stitching [14], etc.

Many HDR imaging methods [15–24] have been proposed, which can be divided into two categories: tone-mapping methods [15,18,20,25–28] and multi-exposure image fusion (MEF) methods [16,23,29–31]. In addition, some methods of HDR acquisition for dynamic scenes [32,33] have also been proposed. Tone-mapping methods produce HDR images by adjusting the gray levels of the image, which may create halos in the mapping process. MEF methods merge well-exposed regions from numerous LDR input images to produce a single visually appealing LDR result that appears to possess a higher dynamic range, but the fusion result is usually unsatisfactory due to camera shaking and object movement. Most researchers have designed complicated HDR image fusion algorithms to improve the quality of the produced HDR images using compressed JPEG images as inputs. Nevertheless, camera shaking and object movement in the fusion process still represent a tricky problem. In this paper, an anti-shake HDR image acquisition method is proposed for real scenes based on raw camera data. This method is ghost-free and can preserve more details than traditional HDR fusion methods. In summary, the main contributions of this paper are as follows:

• A new dataset of multi-exposure raw image sequences featuring camera shaking is collected, and each raw image has a corresponding JPEG version.
• A reference image selection method based on information entropy is proposed for multi-exposure raw image sequences with camera shaking.
• A pipeline that produces ghost-free HDR images from raw image sequences with camera shaking is designed, with more robust performance for extreme-exposure raw image pairs.

2. Related Work

HDR imaging technology has been the focus of a significant body of imaging science and computer vision research in recent years. There are three related categories of work: tone mapping, multi-exposure image fusion, and raw image processing. Reinhard et al. [25] proposed a tone-mapping method based on photographic models that automatically adjusts the brightness. Durand et al. [27] used an edge-preserving filter, the bilateral filter, to reduce contrast and preserve details. The method of Drago et al. [28] is based on the logarithmic compression of luminance values, introducing a bias power function to adaptively vary the logarithmic bases. Mantiuk et al. [26] utilized higher-order image statistics and quadratic programming through a model of the human visual system to reduce image distortion. Although the above methods [25–28] can improve the quality of images, they fail to restore the details of the image and are sensitive to lighting conditions. To enrich the details of images, Kinoshita et al. [23] proposed a novel pseudo multi-exposure image fusion method based on a single image, utilizing a local contrast enhancement method to obtain high-quality pseudo multi-exposure images. However, with this method the fusion results depend on the quality of the input images and are unsatisfactory for extreme exposures. Zhang et al. [29] suggested a wavelet-based fusion algorithm for multi-exposure images. However, the method cannot enhance the details adequately for low-resolution images. Kinoshita et al. [31] proposed a multi-exposure image fusion method based on exposure compensation, but the method requires that the inputs have no moving artifacts. Zhen et al. [34] proposed an algorithm to improve accurate contrast enhancement, but it performs poorly in over-exposed regions. Some HDR algorithms [35–37] based on deep learning have been proposed to generate HDR images, but these algorithms do not solve the problems of camera shaking and object movement. Given the advantages of raw images, they are desirable for many algorithms because they allow flexibility for post-processing. Hasinoff et al. [38] described a computational photography pipeline that captures, aligns, and merges a burst of frames to reduce noise and increase dynamic range, but the pipeline fails to suppress the noise of high-contrast features. Chen et al. [35] developed a pipeline for processing low-light raw images, but with this method some obvious details are lost, and as such it is not suitable for brightly lit scenes.

Due to the poor performance of existing HDR algorithms with respect to camera shaking and extreme exposure, an anti-shake HDR image acquisition method based on raw image sequences with position offsets is proposed. The method is not only robust to camera shaking and extreme-exposure image pairs, but also has lower algorithmic complexity and better performance.

3. The Proposed Method

Modern digital cameras attempt to render a pleasant and accurate image of the world, similar to that perceived by the human eye [39]. As shown in Figure 1a, a typical camera IPP includes linearization, white balance, demosaicking, color correction, brightness, and contrast control, with the aim of converting raw sensor images to JPEG images. As shown in Figure 1b, the conventional direct fusion algorithm for multi-exposure sequences usually uses compressed JPEG formats, which are only suitable for inputs without camera shaking and object movement. Another traditional improved multi-exposure image fusion pipeline is shown in Figure 1c, which can be used to process inputs with position offsets. However, the loss of key feature information in compressed images cannot be recovered by the fusion algorithms, and therefore the fusion results are usually undesirable.

In this paper, an anti-shake HDR image acquisition method based on raw image sequences with camera shaking and object movement is proposed, which includes raw image processing, reference image selection, image registration, and image fusion, as shown in Figure 1d. Raw images are chosen instead of JPEG images as references, as more feature points can be detected for matching. In the process of image matching, incorrect matches are rejected in several stages to achieve better matching accuracy and ensure the fused HDR images are of higher quality.


Figure 1. Different image processing pipelines. (a) Traditional color image processing pipeline. (b) Traditional direct fusion pipeline for multi-exposure images. (c) Improved traditional multi-exposure image fusion pipeline. (d) The pipeline of the proposed algorithm.


3.1. Reference Image Selection

Instead of operating on sRGB images produced by the traditional camera processing pipeline, we operated on raw sensor data. Hence, we needed to convert the raw images from a single channel to three channels based on the Bayer interpolation algorithm to obtain full-color images, as shown in Figure 2.

Figure 2. Color filter array (CFA).
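As an illustration, the single-channel Bayer mosaic can be expanded to three channels with OpenCV's demosaicing routines. This is only a sketch: the correct COLOR_Bayer* constant depends on the CFA layout of the actual sensor (an RGGB pattern is assumed here), and the function and variable names are ours, not the authors'.

```python
# Sketch: demosaic a single-channel Bayer mosaic into a 3-channel RGB image.
# The CFA pattern depends on the sensor; RGGB is assumed for this example.
import cv2
import numpy as np

def demosaic_bayer(mosaic: np.ndarray) -> np.ndarray:
    """mosaic: 2-D uint16 array holding the raw CFA data (RGGB assumed)."""
    return cv2.cvtColor(mosaic, cv2.COLOR_BayerRG2RGB)  # bilinear Bayer interpolation

# Example with synthetic 12-bit data standing in for real sensor output.
fake_mosaic = np.random.randint(0, 2**12, size=(480, 640), dtype=np.uint16)
rgb = demosaic_bayer(fake_mosaic)
print(rgb.shape)  # (480, 640, 3)
```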


To better automatically correct the position offsets caused by camera shaking between multi-exposure image sequences, we introduced a reference image selection method based on information entropy for raw image sequences. In information theory, entropy is a measure of the uncertainty in a random variable [40]. The concept of information entropy describes how much information is provided by the signal or image. Images with different exposure values contain different image information, and the higher the entropy of an image, the more information it contains. For a grayscale image, we calculated the image information entropy by Equation (1):

E = −Σ_{i=0}^{255} p(i) log2 p(i),  (1)

where p(i) is the distribution probability of gray level i.

Table 1 shows the average information entropy values of the raw images and the corresponding JPEG images. As shown in Table 1, the information entropy of raw images is higher than that of the JPEG images, and the middle image has the highest entropy. Hence, we chose the middle image as the reference image for multi-exposure raw image sequences.

Table 1. The information entropy of the image sequence.

Image   Exposure Sequence (images 1–5)
RAW     4.80025180   6.02060459   7.64435717   7.36287806   6.95316271
JPEG    4.66945082   5.99022103   7.61556499   7.32166800   6.08631492
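A minimal sketch of this selection rule is shown below, assuming 8-bit grayscale versions of the demosaicked frames; the frame with the highest entropy (the middle exposure in Table 1) is taken as the reference. The function names are ours.

```python
# Sketch: pick the reference frame as the image with the highest information
# entropy (Equation (1)). Assumes 8-bit grayscale versions of the frames.
import numpy as np

def entropy(gray: np.ndarray) -> float:
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()              # p(i): probability of gray level i
    p = p[p > 0]                       # drop empty bins (0 * log 0 := 0)
    return float(-(p * np.log2(p)).sum())

def select_reference(gray_frames) -> int:
    scores = [entropy(g) for g in gray_frames]
    return int(np.argmax(scores))      # index of the highest-entropy frame
```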

3.2. Feature Point Detection and Matching

The proposed method utilizes the Oriented FAST and Rotated BRIEF (ORB) algorithm [41] to extract feature points from the three-channel raw images. The image marked with a red box in Figure 3 is the selected reference image, and the rest of the images are the raw image sequences to be matched. The red points in Figure 3 stand for the detected key points.

Figure 3. Oriented FAST and Rotated BRIEF (ORB) detection results for raw images with different levels of exposure.

There were a large number of incorrect matches in the matching process, but we required good matches to estimate the transform matrix. The method to reject incorrect matches can be divided into three steps.

The first step is to utilize the Brute-Force (BF) algorithm to match the feature points of the reference image and the exposure images according to the Hamming distance. The smaller the Hamming distance,

the more similar the two feature points. If the Hamming distance is less than a certain threshold T1, the two feature points will be considered to match. T1 can be calculated by Equation (2):

T1 = s(dmax + dmin),  (2)

where s is a variable, and dmax and dmin represent the maximum Hamming distance and the minimum Hamming distance, respectively. We can find the best T1 by changing the value of s. If the Hamming distance is greater than the best T1 value, the matched pair will be eliminated; otherwise, it will be considered a correctly matched pair.

The second step is to further eliminate incorrect matches by the KNN algorithm. For each feature point in the reference image, we must find the two best matches in the raw image to be registered. Then we should also find the two best matches for each feature point of the raw image in the reference image. The best match for each feature point is chosen from the two candidate matches based on the Hamming distance between their descriptors. The threshold T2 is used to choose a good match, and is defined as:

T2 = dbest / dsecond,  (3)

where dbest and dsecond denote the measured distance for the best and the second-best match, respectively. If dbest is much smaller than dsecond, the best match will be regarded as the better of the two candidate matches.

Although a large number of incorrect matches are eliminated in steps one and two, we cannot guarantee that the matches obtained will be perfectly exact. Hence, the last step is to reject the matches that do not obey the epipolar constraint by utilizing the RANSAC algorithm [42]. RANSAC is a non-deterministic algorithm that can estimate mathematical model parameters from a match set containing "outliers", which are the matches that do not fit the model, and "inliers", which are the matches that can be explained by the model parameters. In this step, we assume all the matches retained in steps one and two are "inliers", and then randomly select eight sets of matching pairs from the "inliers" to estimate the model by iteration, which can quickly compute the transform matrix. However, if one of the selected matches is an inaccurate
match, the computed transform matrix will be incorrect. Hence, the Hamming and KNN algorithms are applied to reject most of the incorrect matches before RANSAC. Figure 4a shows the matching result of the raw image sequences without rejecting incorrect matches. Figure 4b shows the final matching result after the RANSAC algorithm. All the incorrect matches were rejected, indicating that the proposed matching algorithm can greatly improve the matching rate.


Figure 4. The matching results of the raw exposure sequence. (a) The results without rejecting incorrect matches. (b) The final matching results by the RANSAC algorithm.
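The three rejection steps map closely onto standard OpenCV primitives. The sketch below approximates the described procedure (a one-way ratio test stands in for the paper's symmetric k-NN check, and RANSAC is run through findHomography); the thresholds s and ratio, as well as the function name, are assumptions rather than the authors' exact implementation.

```python
# Sketch: ORB detection and the three-step match rejection described above
# (Hamming threshold T1, k-NN ratio test T2, RANSAC). Thresholds are assumptions.
import cv2
import numpy as np

def match_to_reference(ref_gray, img_gray, s=0.5, ratio=0.7):
    orb = cv2.ORB_create(nfeatures=4000)
    kp1, des1 = orb.detectAndCompute(ref_gray, None)
    kp2, des2 = orb.detectAndCompute(img_gray, None)

    bf = cv2.BFMatcher(cv2.NORM_HAMMING)

    # Step 1: brute-force matching, keep pairs with distance below T1 = s(dmax + dmin).
    raw_matches = bf.match(des1, des2)
    dists = [m.distance for m in raw_matches]
    t1 = s * (max(dists) + min(dists))
    step1 = {m.queryIdx for m in raw_matches if m.distance < t1}

    # Step 2: k-NN ratio test (threshold T2 = dbest / dsecond).
    good = []
    for pair in bf.knnMatch(des1, des2, k=2):
        if len(pair) == 2:
            m, n = pair
            if m.queryIdx in step1 and m.distance < ratio * n.distance:
                good.append(m)

    # Step 3: RANSAC rejects matches violating the geometric model while
    # estimating the homography H (frame coordinates -> reference coordinates).
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(dst, src, cv2.RANSAC, 3.0)
    inliers = [m for m, keep in zip(good, mask.ravel()) if keep]
    return H, inliers
```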


3.3. Image Alignment and Registration

The alignment of raw images with camera shaking is a particular challenge, so a homography matrix was adopted to warp the matched raw image sequences. This matrix was computed by the direct linear transform (DLT) method, as shown in Equation (4), which is used to estimate the perspective transformation matrix H from the remaining matched feature pairs. For all feature points, the mapping error from the raw image to the reference image was calculated. The maximum set of inlier feature points under this mapping error could then be found, and the matrix H could be recalculated from the coordinates of the inlier feature points.

(x1, y1, 1)^T ~ H (x2, y2, 1)^T,  (4)

where (x1, y1) stands for the coordinates of keypoints in the reference image, (x2, y2) denotes the coordinates of keypoints in the raw image to be aligned, and H represents the 3 × 3 transformation matrix, as shown in Equation (5):

H = [ h11 h12 h13
      h21 h22 h23
      h31 h32 h33 ],  (5)

where hij (i, j = 1, 2, 3) can be any value.

The registration results are shown in Figure 5, in which the reference image is marked by the red box and the right four images are the aligned images. It can be observed that the position offsets caused by camera shaking have been corrected and the coordinates of a given pixel are consistent across the different images.

Figure 5. The results of image registration.
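Given H estimated from the inlier correspondences, each frame can then be warped into the reference coordinate system; a brief sketch using OpenCV's warpPerspective follows (the function and variable names are ours).

```python
# Sketch: warp a frame into the reference coordinate system using the 3x3
# homography H from Equation (4) (H maps frame coordinates to reference ones).
import cv2

def align_to_reference(image, H, ref_shape):
    h, w = ref_shape[:2]
    # Pixels that fall outside the frame stay black and are clipped later (Figure 7b).
    return cv2.warpPerspective(image, H, (w, h))
```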

3.4. Image Fusion

To improve the efficiency of the proposed algorithm, the simple Laplacian pyramid decomposition algorithm was used to fuse the registered raw image sequences, extracting the high-frequency information of the image and enriching the detailed information to achieve a better HDR fusion result. The calculation formula Gl(i, j) of Gaussian pyramid layer l is shown in Equation (6):

Gl(i, j) = Σ_{m=−2}^{2} Σ_{n=−2}^{2} w(m, n) G_{l−1}(2i + m, 2j + n)  (1 ≤ l ≤ N, 0 ≤ i ≤ Cl, 0 ≤ j ≤ Rl),  (6)

where w(m, n) represents the 2-D low-pass filter, and N denotes the total number of layers of the pyramid. Cl and Rl represent the height and width of the pyramid l-level image, respectively. Equation (6) was used to generate each layer of the Gaussian pyramid G1, G2, G3, ..., GN.


The function Expand () was used to expand each layer size of the Gaussian pyramid by interpolation, and the calculation formula is shown in Equation (7):

Gl,k(i, j) = 4 Σ_{m=−2}^{2} Σ_{n=−2}^{2} w(m, n) Gl,k−1((i + m)/2, (j + n)/2),  (7)

where Gl,k denotes the interpolation result of Gl, with Gl,0 = Gl and Gl,k = Expand(Gl,k−1) (0 ≤ k ≤ N). The Laplacian pyramid can be obtained by calculating the difference between two adjacent layers of the Gaussian pyramid:


 =−() ≤< Lll G Expand G l+1 , 0 l N  ==, (8)  LGNN, lN

The Gaussian pyramid is a set of low-pass filtered images, while the Laplacian pyramid is a set of band-pass filtered images. The image can be reconstructed by the Laplacian pyramid:

Gl = Ll + Expand(Gl+1),  (9)

Equation (9) was used to reconstruct the image from the top to the bottom, and the bottom layer G0 is the reconstructed image. The Gaussian and Laplacian pyramids of raw images are shown in Figure 6.
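Equations (6)–(9) can be realized compactly with OpenCV's pyrDown/pyrUp, which apply a 5 × 5 Gaussian kernel w(m, n) internally. The sketch below builds the Gaussian and Laplacian pyramids and reconstructs the image; it is an illustrative approximation under assumed level counts, not the authors' code.

```python
# Sketch: Gaussian and Laplacian pyramids via pyrDown/pyrUp (Equations (6)-(9)).
import cv2
import numpy as np

def build_pyramids(img, levels=5):
    g = img.astype(np.float32)                        # float keeps negative Laplacian values
    gauss = [g]
    for _ in range(levels - 1):
        gauss.append(cv2.pyrDown(gauss[-1]))          # Equation (6)
    lap = []
    for l in range(levels - 1):
        h, w = gauss[l].shape[:2]
        up = cv2.pyrUp(gauss[l + 1], dstsize=(w, h))  # Expand()
        lap.append(gauss[l] - up)                     # Equation (8): L_l = G_l - Expand(G_{l+1})
    lap.append(gauss[-1])                             # top level: L_N = G_N
    return gauss, lap

def reconstruct(lap):
    img = lap[-1]
    for l in range(len(lap) - 2, -1, -1):             # Equation (9), from the top down
        h, w = lap[l].shape[:2]
        img = cv2.pyrUp(img, dstsize=(w, h)) + lap[l]
    return img                                        # bottom layer G_0
```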


Figure 6. The Gaussian and Laplacian pyramid results of the raw image. (a) The result of the Gaussian pyramid; (b) The result of the Laplacian pyramid.

The weighted average method was used for image fusion, and each exposure image was fused with the reference image by Equation (10).

F = A × 50% + B × 50%,  (10)

where F is the fusion image of A and B, and A and B are the Laplacian components of the exposure image and the reference image, respectively. Finally, for each exposure image sequence, the reference image was fused with each image of the sequence by the Laplacian pyramid method to generate the final HDR image, which could be displayed on the monitor directly without tone mapping.

The fused HDR image of the proposed method is shown in Figure 7a. Due to the raw image sequences with camera shaking and object movement, the black area needed to be clipped. The result after clipping is shown in Figure 7b.
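Building on the pyramid routines sketched above, the 50/50 weighting of Equation (10) can be applied level by level before collapsing the fused pyramid. The snippet below reuses build_pyramids and reconstruct from the previous sketch; the level count and function name are assumptions.

```python
# Sketch: fuse one aligned exposure with the reference by averaging their
# Laplacian pyramids level by level (Equation (10)) and collapsing the result.

def fuse_with_reference(ref_img, exp_img, levels=5):
    _, lap_ref = build_pyramids(ref_img, levels)   # from the previous sketch
    _, lap_exp = build_pyramids(exp_img, levels)
    fused = [0.5 * a + 0.5 * b for a, b in zip(lap_exp, lap_ref)]  # F = A*50% + B*50%
    return reconstruct(fused)
```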


Figure 7. High-dynamic-range (HDR) fusion results. (a) Fusion result; (b) clipped result.

4. Experiments and Results

4.1. Dataset

Camera shaking is inevitable in real shooting. Most of the existing multi-exposure image sequence datasets are in JPEG format with no camera shaking, and do not meet the requirements of real scenes. Considering the scarcity of large MEF datasets in the literature, a new dataset was collected for our experiments using a handheld iPhone 6s Plus smartphone, comprising 435 multi-exposure raw image sequences. Each image sequence consists of 4–6 images with exposure values between −3EV and +3EV (inclusive), and the captured images have a resolution of 3024 × 4032 pixels. The scenes of the multi-exposure raw image dataset include buildings, cars, street lamps, scenery, benches, bridges, stones, etc., and each raw image has a corresponding JPEG version. The multi-exposure raw image sequences are shown in Figure 8.


Figure 8. (a–h) The multi-exposure raw image sequences. The top row is the selected reference image and the rest are the multi-exposure raw image sequences.

4.2. Experiments

4.2.1. Experimental Results

Figure 9a shows the fusion result of a raw image sequence with the proposed method, and the fusion result of the corresponding JPEG image sequence can be observed in Figure 9b. The experiments were performed on a computer with an Intel Core i7-4790 CPU, 3.6 GHz processor and 8.0 GB RAM. JPEG images lose much image information in the IPP, which results in the poor quality of the final HDR images. There is distinct ghosting in Figure 9b (red box), but the ghosting does not exist in Figure 9a, which indicates that the proposed method is superior to the traditional method based on JPEG images.



Figure 9. The comparison of image fusion results. (a) Fusion result of raw images; (b) Fusion result of JPEG images.

4.2.2. Comparison with Other Methods

Multi-method fusion for multi-exposure images. In this section, the proposed algorithm is compared with other traditional methods on multi-exposure raw image sequences. In Figure 10a–d, a multi-exposure raw image sequence is shown, and the fusion results are shown in Figure 10e–j. In Figure 10e–g, the fusion results of three tone mapping methods [25,26,28] are provided, and they appear to have distinct ghosting. Figure 10h represents the traditional direct fusion result using JPEG image sequences as inputs; the performance is extremely poor. Figure 10i depicts the fusion result of JPEG image sequences with our method, and a significant improvement in image quality can be observed, but ghosting artifacts can be seen in the red box. Figure 10j shows the raw image sequence fusion result, which achieves satisfactory performance. In conclusion, the anti-shake performance of the proposed algorithm is superior to other traditional methods based on multi-exposure JPEG image sequences.


Figure 10. Comparison with other methods. (a)–(d) The multi-exposure raw image sequence. (e) The result using the tone mapping method of Drago [28]. (f) The result using the tone mapping method of Mantiuk [26]. (g) The result using the tone mapping method of Reinhard [25]. (h) The traditional direct fusion result with JPEG. (i) The JPEG image fusion result using our method; (j) Our method with raw images.

Image quality evaluation. To further compare the performance of the proposed method with conventional methods, the quality of fusion results was evaluated using five evaluation criteria, including entropy, PSNR, average gradient, SSIM, and HDR-VDP-2.


1. Entropy. The information entropy describes how much information is provided by the signal or image on average, as defined in Equation (1).
2. PSNR. PSNR is the ratio between the maximum possible power of a signal and the power of the corrupting noise that affects its fidelity. The larger the PSNR, the better the fusion result. The calculation formulae are shown in Equations (11) and (12):

MSE = (1 / (H × W)) Σ_{i=1}^{H} Σ_{j=1}^{W} (X(i, j) − Y(i, j))²,  (11)

PSNR = 10 log10((2^n − 1)² / MSE),  (12)

where MSE stands for the mean square error between the fusion image X and the reference image Y; H and W are the height and width of the image, respectively; (i, j) represents the image coordinates; and n denotes the number of bits per pixel.

3. Average gradient. The average gradient reflects the sharpness and texture of the image, as shown in Equation (13). When the average gradient is higher, the method is regarded as more effective.

G = (1 / (M × N)) Σ_{i=1}^{M} Σ_{j=1}^{N} sqrt(((∂f/∂x)² + (∂f/∂y)²) / 2),  (13)

where M × N stands for the size of the image, ∂f/∂x represents the gradient in the horizontal direction, and ∂f/∂y is the gradient in the vertical direction.

4. SSIM. SSIM describes the similarity between two images. Equation (14) is used to calculate the similarity between the reference images and the fused HDR images. A larger value of SSIM indicates a smaller difference between the two images, and that the fusion result is of higher quality.

SSIM(x, y) = ((2 µx µy + c1)(σxy + c2)) / ((µx² + µy² + c1)(σx² + σy² + c2)),  (14)

where x denotes the reference image, y is the fused HDR image, µx and µy represent the average pixel values of x and y, respectively, σx and σy are the standard deviations of x and y, respectively, σxy is the covariance of x and y, and c1, c2, and c3 are constants with c3 = c2/2.

5. HDR-VDP-2. The HDR-VDP-2 [43] metric is based on the human visual perception model that can predict visibility and quality difference between two images. For quality differences, the metric produces a mean-opinion score (Q-score), which computes the quality degradation of the fused HDR images with respect to the reference images.
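For reference, the first criteria are straightforward to compute; the sketch below follows Equations (11)–(13) for 8-bit images (n = 8), while entropy reuses Equation (1), and SSIM and HDR-VDP-2 are typically taken from existing implementations (e.g., skimage and the metric authors' released code, respectively). The function names are ours.

```python
# Sketch: MSE/PSNR (Equations (11)-(12)) and average gradient (Equation (13))
# for 8-bit grayscale images.
import numpy as np

def psnr(x: np.ndarray, y: np.ndarray, n_bits: int = 8) -> float:
    mse = np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2)
    return 10.0 * np.log10(((2 ** n_bits - 1) ** 2) / mse)

def average_gradient(img: np.ndarray) -> float:
    f = img.astype(np.float64)
    gx = np.diff(f, axis=1)[:-1, :]      # horizontal gradient df/dx
    gy = np.diff(f, axis=0)[:, :-1]      # vertical gradient df/dy
    return float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0)))
```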

We randomly selected 150 exposure image sequences from the dataset to evaluate the fusion results of the different methods. The sequences of exposure images and the fusion results of the test are shown in Figure 11. In Figure 11a–d, the exposure image sequence is shown, and the fusion results of the different methods are shown in Figure 11e–j.


Figure 11. The sequence of exposure images and the fusion results of the test. (a–d) The image sequence. (e) Fused image by JPEG directly. (f) Fused image using the method of Drago [28]. (g) Fused image using the method of Durand [27]. (h) Fused image using the method of Reinhard [25]. (i) Fused image using the method of Mantiuk [26]. (j) Fused image using our proposed method.

Table 2 reports the average evaluation scores of the fusion results using the different evaluation criteria and fusion methods. The fusion results of JPEG images have a higher information entropy value, but with a lower average gradient and serious blurring effects. The tone mapping methods of Drago, Durand, Reinhard, and Mantiuk have lower information entropy and fail to recover the lost details of over- or under-exposed regions. As shown in Table 2, the fusion results of the proposed method have higher values than the other five methods in entropy, PSNR, and average gradient, indicating that the HDR images generated by the proposed method are clearer.

Table 2. The comparative analysis of the experimental results.

Method/Metrics   Entropy      PSNR    Average Gradient   SSIM   Q-Score
JPEG             7.24006583   20 dB   0.0089             0.35   37.35
Drago [28]       4.73977225   28 dB   0.0082             0.49   49.33
Durand [27]      4.86853024   36 dB   0.0316             0.68   62.18
Reinhard [25]    6.48690328   33 dB   0.0191             0.54   53.26
Mantiuk [26]     4.80421760   35 dB   0.0239             0.78   64.07
Our method       7.71679023   41 dB   0.0451             0.91   73.50

Moreover, the overall runtimes of the image-matching process of eight raw image sequences in Figure 8 were evaluated, as shown in Table 3. The image sequences of Figure 8a–h are defined from the first column on the left to the eighth column on the right, and the data in Table 3 are provided as the average value of each image sequence. We can observe from Table 3 that most of the matching rates of raw images are higher than for JPEG images, and the runtimes for raw images are shorter than for JPEG images. There is little difference in runtime between raw and JPEG images, but the matching rates of raw images are higher than JPEG images, demonstrating that the proposed method can improve the matching rate with the same time consumption.


Table 3. The comparative analysis of feature-matching time and matching rate.

Figure 8   Feature Points Detected (Nd)   Feature Points Rejected (Nr)   Matching Rate (Nd − Nr)/Nd   Time (ms)
           JPEG      RAW                  JPEG      RAW                  JPEG      RAW                JPEG     RAW
(a)        2893      3415                 1367      1134                 52.75%    66.80%             14.23    13.95
(b)        3135      3389                 1945      1870                 37.96%    44.82%             14.78    14.59
(c)        2520      2765                 1937      2087                 23.13%    24.52%             5.69     4.15
(d)        2693      2905                 1263      1256                 48.26%    56.76%             9.28     8.49
(e)        2960      3218                 1458      1282                 50.74%    60.16%             12.36    11.48
(f)        3045      3319                 1363      1427                 55.24%    57.01%             8.04     6.01
(g)        3157      3578                 1950      2218                 38.02%    38.01%             15.63    13.04
(h)        2875      3176                 1469      1589                 48.90%    49.97%             18.55    14.37

4.2.3. Image Fusion with the Extreme-Exposure Sequence

To test the robustness of the proposed method, we tested 100 sets of raw image pairs with extreme exposure, which contain very little image information owing to under- or over-exposure. The fusion results are demonstrated in Figures 12 and 13. Table 4 compares the fusion results and reports the average scores of the 100 extreme-exposure image pairs in terms of entropy, PSNR, average gradient, SSIM, and HDR-VDP-2, confirming that the proposed method has strong robustness for extreme-exposure image pairs.
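The sketch below outlines a generic Laplacian-pyramid exposure fusion of an under-/over-exposed pair of the kind tested here. The well-exposedness weighting, pyramid depth, and function name are illustrative choices rather than the exact formulation used by the proposed method.

```python
import cv2
import numpy as np

def pyramid_fuse(img_under, img_over, levels=5, sigma=0.2):
    imgs = [img_under.astype(np.float32) / 255.0, img_over.astype(np.float32) / 255.0]

    # Per-pixel weights: favour pixels whose intensity is close to mid-grey.
    weights = []
    for im in imgs:
        gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
        weights.append(np.exp(-((gray - 0.5) ** 2) / (2 * sigma ** 2)) + 1e-12)
    wsum = weights[0] + weights[1]
    weights = [w / wsum for w in weights]

    fused_pyr = None
    for im, w in zip(imgs, weights):
        # Gaussian pyramid of the weight map.
        gw = [w]
        for _ in range(levels):
            gw.append(cv2.pyrDown(gw[-1]))
        # Laplacian pyramid of the image (plus the low-pass residual).
        gi = [im]
        for _ in range(levels):
            gi.append(cv2.pyrDown(gi[-1]))
        lap = [gi[i] - cv2.pyrUp(gi[i + 1], dstsize=gi[i].shape[1::-1])
               for i in range(levels)] + [gi[levels]]
        # Weighted contribution of this exposure at every pyramid level.
        contrib = [lap[i] * gw[i][..., None] for i in range(levels + 1)]
        if fused_pyr is None:
            fused_pyr = contrib
        else:
            fused_pyr = [f + c for f, c in zip(fused_pyr, contrib)]

    # Collapse the fused pyramid back into a single image.
    out = fused_pyr[-1]
    for i in range(levels - 1, -1, -1):
        out = cv2.pyrUp(out, dstsize=fused_pyr[i].shape[1::-1]) + fused_pyr[i]
    return np.clip(out * 255, 0, 255).astype(np.uint8)
```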




Figure 12. The fusion results of extreme-exposure pairs. (a) Over- and under-exposed raw image pairs. (b) JPEG fusion result using our method. (c) The result of the proposed method.


Figure 13. The fusion results of an extreme-exposure image sequence. (a) The over- and under-exposed raw image sequence. (b) JPEG fusion result using our method. (c) Raw fusion result.

Table 4. The analysis of the extreme-exposure image results.

Images/Metrics   Entropy       PSNR     Average Gradient   SSIM   Q-Score
JPEG             7.50736193    34 dB    0.3967             0.74   63.75
RAW              7.70380035    40 dB    0.0463             0.87   69.29
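A minimal sketch of how the PSNR and SSIM columns of Table 4 can be averaged over the 100 test pairs is given below, assuming scikit-image is available and that each fused result is compared against a reference image; the file-name pattern and the choice of reference are placeholders, since the paper does not state them explicitly.

```python
import cv2
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(fused_path, reference_path):
    fused = cv2.imread(fused_path)
    ref = cv2.imread(reference_path)
    psnr = peak_signal_noise_ratio(ref, fused, data_range=255)
    ssim = structural_similarity(
        cv2.cvtColor(ref, cv2.COLOR_BGR2GRAY),
        cv2.cvtColor(fused, cv2.COLOR_BGR2GRAY),
        data_range=255)
    return psnr, ssim

# Placeholder file-name pattern for the 100 extreme-exposure test pairs.
scores = [evaluate_pair(f"fused_{i:03d}.png", f"reference_{i:03d}.png") for i in range(100)]
print("mean PSNR %.2f dB, mean SSIM %.3f" % tuple(np.mean(scores, axis=0)))
```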

The red boxes at the top of Figure 12a show the over-exposed regions, which cause the loss of many image details, such as the outline of the fence and the shape of the bricks. Moreover, at the bottom of Figure 12a the lettering on the bench can barely be seen because of the lack of exposure in the yellow box. The blue boxes in Figure 12c show that the lost details have been recovered well by the proposed method: the fence, the blue writing on the bench, the rectangular bricks on the ground, and even the yellow leaves on the ground are shown clearly. In contrast, the fusion results of compressed JPEG images are unsatisfactory; the words on the bench cannot be seen, and the shapes of the tiles are fuzzy (Figure 12b, blue boxes). As shown in Figure 13a, the details of the circular columns are lost due to over-exposure (region marked by the red box), and there is no reflection in the water in the yellow box. The fusion results of the JPEG images are shown in Figure 13b, where blurring of the stone in the circular columns can be seen. The blue boxes in Figure 13c show clear details and color fidelity, and the crack in the stone wall is also observed to be closer to the real-world scene. In summary, the experimental results show that our fusion method achieves an outstanding effect, even in extreme exposure conditions.

5. Conclusions

In this paper, an anti-shake HDR acquisition method based on raw camera data was presented, and a new raw image dataset of real scenes was created. Compared with traditional algorithms, the proposed method retained more HDR image details by using raw images as inputs, avoiding the artifacts caused by camera shaking and object movement. The performance of the proposed method was also evaluated on extreme-exposure image pairs, where it produced better HDR results than existing algorithms; hence, the proposed method can be used in binocular or multi-sensor vision systems. The proposed method also has a lower algorithmic complexity than deep learning-based fusion methods and adapts more easily to practical processing.

Author Contributions: Conceptualization, Y.L. and B.L.; methodology, Y.L. and B.L.; software, W.H.; formal analysis, Y.L.; investigation, B.J. and C.L.; resources, Y.L.; data curation, W.H.; writing—original draft preparation, Y.L. and C.L.; writing—review and editing, Y.L., B.L., W.H., B.J., and C.L.; visualization, B.J. All authors have read and agreed to the published version of the manuscript.

Funding: This research was funded by the National Natural Science Foundation of China, grant numbers 61605175 and 61602423, and the Department of Science and Technology of Henan Province, China, grant numbers 192102210292 and 182102110399.

Conflicts of Interest: The authors declare no conflict of interest.

References

1. Shafie, S.; Kawahito, S.; Itoh, S. A dynamic range expansion technique for CMOS image sensors with dual charge storage in a pixel and multiple sampling. Sensors 2008, 8, 1915–1926. [CrossRef] [PubMed]
2. Shafie, S.; Kawahito, S.; Halin, I.A.; Hasan, W.Z.W. Non-linearity in wide dynamic range CMOS image sensors utilizing a partial charge transfer technique. Sensors 2009, 9, 9452–9467. [CrossRef] [PubMed]
3. Martínez-Sánchez, A.; Fernández, C.; Navarro, P.J.; Iborra, A. A novel method to increase LinLog CMOS sensors' performance in high dynamic range scenarios. Sensors 2011, 11, 8412–8429. [CrossRef] [PubMed]
4. Agusanto, K.; Li, L.; Chuangui, Z.; Sing, N.W. Photorealistic rendering for augmented reality using environment illumination. In Proceedings of the Second IEEE and ACM International Symposium on Mixed and Augmented Reality, Tokyo, Japan, 10 October 2003; pp. 208–216.
5. Quevedo, E.; Delory, E.; Callicó, G.; Tobajas, F.; Sarmiento, R. Underwater enhancement using multi-camera super-resolution. Opt. Commun. 2017, 404, 94–102. [CrossRef]
6. Ward, G.; Reinhard, E.; Debevec, P. High dynamic range imaging & image-based lighting. In ACM SIGGRAPH 2008 Classes; Association for Computing Machinery: New York, NY, USA, 2008; pp. 1–137.
7. Debevec, P. Image-based lighting. In ACM SIGGRAPH 2006 Courses; Association for Computing Machinery: New York, NY, USA, 2006; p. 4.
8. Debevec, P.; McMillan, L. Image-based modeling, rendering, and lighting. IEEE Comput. Graph. Appl. 2002, 22, 24–25. [CrossRef]
9. Pece, F.; Kautz, J. Bitmap movement detection: HDR for dynamic scenes. In Proceedings of the 2010 Conference on Visual Media Production, London, UK, 17–18 November 2010; pp. 1–8.
10. Dong, Y.; Pourazad, M.T.; Nasiopoulos, P. Human visual system-based saliency detection for high dynamic range content. IEEE Trans. Multimed. 2016, 18, 549–562. [CrossRef]
11. Lin, Y.-T.; Wang, C.-M.; Chen, W.-S.; Lin, F.-P.; Lin, W. A novel data hiding algorithm for high dynamic range images. IEEE Trans. Multimed. 2016, 19, 196–211. [CrossRef]
12. Ravuri, C.S.; Sureddi, R.; Dendi, S.V.R.; Raman, S.; Channappayya, S.S. Deep no-reference tone mapped image quality assessment. In Proceedings of the 2019 53rd Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, USA, 3–6 November 2019; pp. 1906–1910.
13. Hadizadeh, H.; Bajić, I.V. Full-reference objective quality assessment of tone-mapped images. IEEE Trans. Multimed. 2017, 20, 392–404. [CrossRef]

14. Eden, A.; Uyttendaele, M.; Szeliski, R. Seamless image stitching of scenes with large motions and exposure differences. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '06), New York, NY, USA, 17–22 June 2006; pp. 2498–2505.
15. Endo, Y.; Kanamori, Y.; Mitani, J. Deep reverse tone mapping. ACM Trans. Graph. 2017, 36, 177. [CrossRef]
16. Liu, Y.; Wang, Z. Dense SIFT for ghost-free multi-exposure fusion. J. Vis. Commun. Image Represent. 2015, 31, 208–224. [CrossRef]
17. Li, Z.; Wei, Z.; Wen, C.; Zheng, J. Detail-enhanced multi-scale exposure fusion. IEEE Trans. Image Process. 2017, 26, 1243–1252. [CrossRef] [PubMed]
18. Eilertsen, G.; Unger, J.; Mantiuk, R.K. Evaluation of tone mapping operators for HDR video. In High Dynamic Range Video; Academic Press: Cambridge, MA, USA, 2016; pp. 185–207.
19. Shen, J.; Zhao, Y.; Yan, S.; Li, X. Exposure fusion using boosting Laplacian pyramid. IEEE Trans. Cybern. 2014, 44, 1579–1590. [CrossRef] [PubMed]
20. Lee, D.-H.; Fan, M.; Kim, S.-W.; Kang, M.-C.; Ko, S.-J. High dynamic range image tone mapping based on asymmetric model of retinal adaptation. Signal Process. Image Commun. 2018, 68, 120–128. [CrossRef]
21. Ma, K.; Yeganeh, H.; Zeng, K.; Wang, Z. High dynamic range image tone mapping by optimizing tone mapped image quality index. In Proceedings of the 2014 IEEE International Conference on Multimedia and Expo (ICME), Chengdu, China, 14–18 July 2014; pp. 1–6.
22. Liu, Z.; Yin, H.; Fang, B.; Chai, Y. A novel fusion scheme for visible and infrared images based on compressive sensing. Opt. Commun. 2015, 335, 168–177. [CrossRef]
23. Kinoshita, Y.; Yoshida, T.; Shiota, S.; Kiya, H. Pseudo multi-exposure fusion using a single image. In Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Kuala Lumpur, Malaysia, 12–15 December 2017; pp. 263–269.
24. Huo, Y.; Zhang, X. Single image-based HDR imaging with CRF estimation. In Proceedings of the 2016 International Conference on Communication Problem-Solving (ICCP), Taipei, Taiwan, 7–9 September 2016; pp. 1–3.
25. Reinhard, E.; Stark, M.; Shirley, P.; Ferwerda, J. Photographic tone reproduction for digital images. In Proceedings of the 29th Annual Conference on Computer Graphics and Interactive Techniques, San Antonio, TX, USA, 23–26 July 2002; pp. 267–276.
26. Mantiuk, R.; Daly, S.; Kerofsky, L. Display adaptive tone mapping. In ACM SIGGRAPH 2008 Papers; Association for Computing Machinery: New York, NY, USA, 2008; pp. 1–10.
27. Durand, F.; Dorsey, J. Fast bilateral filtering for the display of high-dynamic-range images. In Proceedings of the 29th Annual Conference on Computer Graphics and Interactive Techniques, San Antonio, TX, USA, 23–26 July 2002; pp. 257–266.
28. Drago, F.; Myszkowski, K.; Annen, T.; Chiba, N. Adaptive logarithmic mapping for displaying high contrast scenes. In Computer Graphics Forum; Blackwell Publishing: Oxford, UK, 2003; pp. 419–426.
29. Zhang, W.; Liu, X.; Wang, W.; Zeng, Y. Multi-exposure image fusion based on wavelet transform. Int. J. Adv. Robot. Syst. 2018, 15, 1729881418768939. [CrossRef]
30. Vanmali, A.V.; Kelkar, S.G.; Gadre, V.M. Multi-exposure image fusion for dynamic scenes without ghost effect. In Proceedings of the 2015 Twenty First National Conference on Communications (NCC), Mumbai, India, 27 February–1 March 2015; pp. 1–6.
31. Kinoshita, Y.; Shiota, S.; Kiya, H.; Yoshida, T. Multi-exposure image fusion based on exposure compensation. In Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada, 15–20 April 2018; pp. 1388–1392.
32. Sen, P.; Kalantari, N.K.; Yaesoubi, M.; Darabi, S.; Goldman, D.B.; Shechtman, E. Robust patch-based HDR reconstruction of dynamic scenes. ACM Trans. Graph. 2012, 31, 203:201–203:211. [CrossRef]
33. Li, Z.; Zheng, J.; Zhu, Z.; Wu, S. Selectively detail-enhanced fusion of differently exposed images with moving objects. IEEE Trans. Image Process. 2014, 23, 4372–4382. [CrossRef]
34. Ying, Z.; Li, G.; Ren, Y.; Wang, R.; Wang, W. A new image contrast enhancement algorithm using exposure fusion framework. In Proceedings of the International Conference on Computer Analysis of Images and Patterns, Ystad, Sweden, 22–24 August 2017; pp. 36–46.
35. Chen, C.; Chen, Q.; Xu, J.; Koltun, V. Learning to see in the dark. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 3291–3300.

36. Shen, L.; Yue, Z.; Feng, F.; Chen, Q.; Liu, S.; Ma, J. MSR-net: Low-light image enhancement using deep convolutional network. arXiv 2017, arXiv:1711.02488.
37. Yang, B.; Zhong, J.; Li, Y.; Chen, Z. Multi-focus image fusion and super-resolution with convolutional neural network. Int. J. Wavelets Multiresolution Inf. Process. 2017, 15, 1750037. [CrossRef]
38. Hasinoff, S.W.; Sharlet, D.; Geiss, R.; Adams, A.; Barron, J.T.; Kainz, F.; Chen, J.; Levoy, M. Burst photography for high dynamic range and low-light imaging on mobile cameras. ACM Trans. Graph. 2016, 35, 1–12. [CrossRef]
39. Brooks, T.; Mildenhall, B.; Xue, T.; Chen, J.; Sharlet, D.; Barron, J.T. Unprocessing images for learned raw denoising. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2019; pp. 11036–11045.
40. Ihara, S. Information Theory for Continuous Systems; World Scientific: Hackensack, NJ, USA, 1993.
41. Rublee, E.; Rabaud, V.; Konolige, K.; Bradski, G. ORB: An efficient alternative to SIFT or SURF. In Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2564–2571.
42. Fischler, M.A.; Bolles, R.C. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 1981, 24, 381–395. [CrossRef]
43. Narwaria, M.; Mantiuk, R.; Da Silva, M.P.; Le Callet, P. HDR-VDP-2.2: A calibrated method for objective quality prediction of high-dynamic range and standard images. J. Electron. Imaging 2015, 24, 010501. [CrossRef]

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).