Chinese Journal of Electronics Vol. 21 No. 4 (quarterly) Oct. 2012

CONTENTS

COMPUTER AND MICROELECTRONICS
583..... Route Optimization Algorithm for Vehicle to Vehicle Communication Using Location Information
    XU Shenglei and LEE Sangsun
589..... An Evidence-Driven Framework for Trustworthiness Evaluation of Software Based on Rules
    WANG Xiaoyan, LIU Shufen and BAO Tie
594..... TISA: Reconfigurable System for Template-Based Stream Computing
    YANG Qianming, WU Nan, WEN Mei, QUAN Wei and ZHANG Chunyuan
599..... An Algorithm for Bus Trajectory Extraction Based on Incomplete Data Source
    DAI Dameng and MU Dejun
604..... Controllability of Multi-agent Systems with Multiple Leaders and Switching Topologies
    LUO Xiaoyuan, LIU Dan, ZHANG Fan and GUAN Xinping
609..... A Novel Collaborative Filtering Using Kernel Methods for Recommender Systems
    CAO Jie, WU Zhiang, ZHUANG Yi, MAO Bo and YU Zeng
615..... Parallel Test Task Scheduling with Constraints Based on Hybrid Particle Swarm Optimization and Taboo Search
    LU Hui, CHEN Xiao and LIU Jing
619..... SHIS Model of E-mail Virus Propagation
    ZHONG , LI Ang and WEN Luosheng
623..... A Convex Approach for Local Statistics Based Region Segmentation
    MA Liyan and YU Jian
627..... A Novel Boosted Charge Transfer Circuit for High Speed Charge Domain Pipelined ADC
    CHEN Zhenhai, YU Zongguang, HUANG Songren, JI Huicai and ZHANG Hong
633..... Frequent 2-Episode Mining with Minimal Occurrences Based on Episode Matrix and Lock State
    LIN Shukuan, WANG Ya, WANG Jue, GUO Tianzhu and QIAO Jianzhong
636..... Modeling and Path Generation Approaches for Crowd Simulation Based on Computational Intelligence
    LIU Hong, SUN Yuling and LI Yuanyuan
642..... Low Power EEPROM Designed for Sensor Interface Circuit
    MENG Xiangyun, YANG Sen, CHEN Zhongjian, LU Wengao, ZHANG Yacong, HUANG Jingqing, LI Haojiong, SU Weiguo and LI Song

SIGNAL PROCESSING
645..... Orthogonality is Better: Auxiliary Problems in ASO Algorithm
    ZHANG Taozheng, WANG Xiaojie and TONG Hui
651..... A Non Local Feature-Preserving Strategy for Image Denoising
    HE Ning and LU Ke
657..... A Graph-based Method to Mine Coexpression Clusters Across Multiple Datasets
    ZAN Xiangzhen, XIAO Biyu, MA Runnian, ZHANG Fengyue and LIU Wenbin
663..... A Fusion Scheme of Region of Interest Extraction in Incomplete Fingerprint
    JING Xiaojun, ZHANG Bo, ZHANG Jie and ZHONG Mingliang
667..... Uniform Solution to QSAT by P Systems with Proteins
    LU Chun and SHI Xiaolong
673..... Missing Value Estimation for Gene Expression Profile Data
    WANG Xuesong, LIU Qingfeng and CHENG Yuhu
678..... A Rate-Distortion Model Based Frame Layer Rate Control Algorithm for Stereoscopic Video Coding
    WANG Qun, ZHUO Li, ZHANG Jing and LI Xiaoguang
683..... A KPLS-Eigentransformation Model Based Face Hallucination Algorithm
    LI Xiaoguang, XIA Qing and ZHUO Li
687..... A Two-Party Combined Cryptographic Scheme and Its Application
    WANG Shengbao, XIE Qi, TANG Qiang, ZENG Peng and CHEN Wei
692..... Discriminative Decision Function Based Scoring Method Used in Speaker Verification
    LIANG Chunyan, ZHANG Xiang and YAN Yonghong

TELECOMMUNICATION
697..... Visual Attention Model Based Regions of Interest Detection in Compressed Domain
    SUI Lei, ZHANG Jing, ZHUO Li and YANG Yuncong
701..... A Homomorphic Aggregate Signature Scheme Based on Lattice
    ZHANG Peng, YU Jianping and WANG Ting
705..... An Approximate Approach to End-to-End Traffic in Communication Networks
    JIANG Dingde, XU Zhengzheng, NIU Laisen and LIU Jindi
711..... A Novel Covert Timing Channel Based on RTP/RTCP
    YING Lizhi, HUANG Yongfeng, YUAN Jian and LINDA Yunlu Bai
715..... Linear Approximations of Pseudo-Hadamard Transform
    WANG Bin, WU Chunming and CHANG Yaqing
719..... A Personal DRM Scheme Based on Social Trust
    QIU Qin, TANG Zhi, LI Fenghua and YU Yinyan
725..... Quality of Experience Assessment for Cross-layer Optimization of Video Streaming over Wireless Networks
    LIU Fangqin, LIN Chuang and MENG Kun
730..... Adaptive Layer-3 Buffer Management Scheme for Domestic WLAN
    Wai Leong Pang, David Chieng and Nural Nadia Ahmad

MICROWAVE AND ELECTRONIC SYSTEM ENGINEERING
736..... Improved Eavesdropping Detection Strategy Based on Extended Three-particle Greenberger-Horne-Zeilinger State in Two-step Quantum Direct Communication Protocol
    LI Jian, YE Xinxin, LI Ruifan, ZOU Yongzhong and LU Xiaofeng
740..... Interferometric Phase Statistics and Estimation Accuracy of Strong Scatterer for InSAR
    XU Huaping, LI Shuang and FENG Liang
745..... Radar Clutter Suppression Based on SαS Fractional Autoregressive Model
    FENG Xun, WANG Shouyong, YANG Jun and ZHU Xiaobo
751..... A Novel Soft Switching Converter with Active Auxiliary Resonant Commutation
    CHU Enhui, HOU Xutong and ZHANG Huaguang
756..... Monte-Carlo Simulations on the Noise Characteristics of the Ion Barrier Film of Microchannel Plate
    SHI Feng, FU Shencheng, LI Ye, DUANMU Qingduo and TIAN Jingquan
759..... ANN Synthesis Models for Asymmetric Coplanar Waveguides with Finite Dielectric Thickness
    WANG Zhongbao and FANG Shaojun
764..... A New Approach to Airborne High Resolution SAR Motion Compensation for Large Trajectory Deviations
    MENG Dadi, HU Donghui and DING Chibiao
770..... Novel Implementation of Track-Oriented Multiple Hypothesis Tracking Algorithm
    GUO Jianhui and ZHANG Rongtao

CJE Office The Chinese Institute of Electronics P. O. Box 165, Beijing 100036, CHINA Tel: (8610) 6828 5082 Fax: (8610) 6817 3796 E-mail: [email protected] [email protected] Home page: http://www.ejournal.org.cn

© Copyright 2012 by Chinese Institute of Electronics. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

Distributed by Technology Exchange Ltd Suite 1102, Fo Tan Industrial Centre, 26–28 Au Pui Wan Street, Fo Tan, Shatin, HONG KONG Tel: (852)2602 6300 Fax: (852)2609 1687 Website: www.tech-ex.com

Printed by New Sky Media Limited Suite 1102, Fo Tan Industrial Centre, 26–28 Au Pui Wan Street, Fo Tan, Shatin, HONG KONG Tel: (852)2602 6300 Fax: (852)2609 1687

ISSN 1022-4653 Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

Route Optimization Algorithm for Vehicle to Vehicle Communication Using Location Information∗

XU Shenglei and LEE Sangsun

(Department of Electronics Computer Engineering, Hanyang University, Seoul 133-791, Korea)

Abstract: This paper proposes a route optimization algorithm that reduces the routing burden and increases route lifetime in a VANET (Vehicular ad-hoc network) environment. The algorithm is combined with the LAR (Location-aided routing) and AODV (Ad hoc on-demand distance vector routing) protocols to evaluate its performance. Simulations carried out with the Qualnet simulator show a clear improvement in data transmission.

Key words: Ad-hoc networking, Vehicular ad-hoc network (VANET), Routing protocol, Location-aided routing (LAR), Route update, Location based routing.

∗Manuscript Received Nov. 2011; Accepted Dec. 2011. This work is supported by the Brain Korea 21 Project in 2011.

I. Introduction

With the development of digital technology and electronic science, and with the spread of wireless network technology, network transmission speeds have increased rapidly. In the past few years, most research on wireless networks focused on the Wireless LAN (WLAN), but recently the Vehicular ad-hoc network (VANET), which derives from the Mobile ad-hoc network (MANET), has become a very active topic[1]. As a part of the Intelligent transport system (ITS), VANET can provide users with many kinds of services, such as notification of the nearest parking area or the nearest gas station. By integrating the vehicle with the wireless network and the related VANET data library information, we can develop a comprehensive assistant system to serve public users.

VANET allows vehicles to form a self-organized network without the requirement of permanent infrastructure. It has a highly dynamic topology due to the high mobility of vehicles, and the vehicles' movements are restricted to a geographical pattern, such as a network of streets or highways[2]. These characteristics cause routes to degrade very quickly, so the source node has to rebuild the route to the destination node very frequently, and the redundant route request messages waste a lot of network resources[3]. The routing problem is therefore vital for some ITS applications. A lot of research has addressed this problem in recent years. Some researchers aimed to select effective routing paths for data transmission[4]; some aimed to select the shortest path using various resources, or to select a qualified path based on power consumption in the node[5,6]; some restricted the work to a certain scenario, such as the collision model[7]; and some restricted it to a certain environment, such as city roads[8]. All of these works, however, aim to select a new route to the destination in the initial procedure. In contrast, this paper focuses on the route maintenance and repair procedures, and the algorithm is most useful in scenarios where a longer connection time is necessary.

The remaining sections of this paper are arranged as follows. Section II introduces some background on VANET and proposes the route optimization algorithm for vehicle to vehicle communication. Section III combines the proposed algorithm with LAR and AODV. Section IV introduces the simulation test bed and results, and Section V concludes the paper.

II. The Optimized Route Optimization Algorithm for Vehicle to Vehicle Communication

VANET tends to exhibit drastically different behavior from the usual MANET. High vehicle speeds, mobility constraints on a straight road and drivers' behavior are some of the factors that give VANET very different characteristics from the typical MANET modes. Due to the fundamental behavioral difference between MANET and VANET, most MANET routing protocols cannot be used for VANET, but position-based routing protocols such as LAR, DREAM or GPSR, which require prior knowledge of vehicles' geographic locations (from a GPS service), could be used for VANET for faster route discovery and improved performance[9].

1. Proposal of the problem

Almost all protocols, regardless of type, contain a route discovery procedure, a route maintenance procedure, and a route repair procedure; this view can be found in Refs.[10] to [14]. There exist two common problems in the latter two procedures. The first is that, as time goes by, the route that was set earlier can still be used but is no longer the optimal route, which means that it may contain redundant nodes that can be removed. The second is that, when the route repair procedure is carried out, the intermediate node may find a route that is not the shortest one.

Fig. 1. Problem in route maintenance procedure

Fig. 2. Problem in route repair procedure

As shown in Fig.1, the route may be the shortest one at the route discovery procedure, but as time goes by, some vehicles travel at a relatively higher speed and some at a lower speed, so vehicles that were earlier located outside other vehicles' radio ranges may now come into range. The route should be updated in this situation during the route maintenance procedure.

In this procedure, all nodes except the final two nodes check whether their next-next hop nodes have come into their radio ranges. If a node finds this, it updates the route by eliminating its next hop node and communicating with its next-next hop node directly. As shown in Fig.1, if node 3 finds that node 5 has come into its radio range, it eliminates node 4 from the route and communicates with node 5 directly. Finally, the route is changed from S→1→2→3→4→5→D to S→1→2→3→5→D.

Also, when the destination node of the route slows down but the upstream node keeps going at the same speed, the connection between them may break. As shown in Fig.2, node 5 will try to repair the route to the destination by broadcasting the RREQ message. In this message, it sets the receiver node to node D, so if node 6 receives this RREQ message, since it can connect with both node 5 and node D, the route is changed from S→1→2→3→4→5→D to S→1→2→3→4→5→6→D. But as we can see, it would be much better if the route could be changed to S→1→2→3→D.

In this route repair procedure, if the source or any intermediate node finds that the connection with its next hop toward the destination is broken, it broadcasts the RREQ message to find a route to the destination. But if it can calculate the destination's current location, it should compare this with all the other nodes' current locations, find which node is now the nearest to the destination, and send a notification to that node. If that node performs the route repair procedure instead of this node, the optimized route may be found. As shown in Fig.2, if node 5 finds that the link between itself and the destination node is broken, it calculates all the other nodes' current locations and finds that node 3 is the nearest to the destination node. It notifies node 3 that the destination is now within node 3's range. Node 3 then performs the route repair procedure instead of node 5, and the new route is set to the shortest one, S→1→2→3→D.

2. Description of the proposed route optimization algorithm

The two scenarios described above are only two examples of many different situations. All these situations have one common feature: the nodes in the route may have different velocities, so redundant nodes may emerge in the route. Removing those redundant nodes in time enhances the efficiency of the routing protocol. For this reason, our route optimization algorithm has been proposed.

In its main route update procedure, each node initially stays in the start state and sets a route update timer, which is then decreased until its value reaches 0. Assume that this node's current hop number is n and that the path length of this route is m. This node (node n) calculates the distances between itself and all the nodes from number 0 to number n − 2, and from number n + 2 to number m. If any distance is smaller than its radio range (200m), it deletes the redundant nodes between itself and the corresponding node, and generates a DLI (Delete indication) message to tell that node to do the same.

Any node that receives the DLI message obtains a node number from the message; if the number is n, that node deletes all the nodes between itself and node n.

Each node also uses the other nodes' location, time stamp, direction and velocity, together with the current time, to calculate their current locations; we added this information to the RREQ and RREP messages. Assume that node n is performing the route update procedure, that its current time is t, and that its current location is (n.x, n.y). It calculates the distance between itself and node k. Node k put its information into the RREQ or RREP messages when it relayed them, so node n can retrieve this information from those messages. The information contains node k's velocity v, the time t_1 at which node k relayed the message, node k's location (k.x_1, k.y_1) at time t_1, and node k's direction D. Node k's current location is then calculated by:

$$k.x = k.x_1 + (t - t_1)\,v, \quad \text{if } D \text{ is True};$$
$$k.x = k.x_1 - (t - t_1)\,v, \quad \text{if } D \text{ is False};$$
$$k.y = k.y_1$$

The distance between node n and node k is:

$$L = \left((k.x - n.x)^2 + (k.y - n.y)^2\right)^{1/2}$$

One problem remains in this algorithm. Nodes that did not change their velocities will be found at their estimated locations, but a node that changed its velocity will not be at the calculated location at that time. Worse, a node such as node n may find that the distance L between itself and node k is smaller than 200m. In this case it will delete all the nodes between them, even if node k is not at the location (k.x, k.y) now, and a valid route will be broken.

To avoid this problem, it is better to use the hello message to indicate each node's location instead of letting each node calculate the other nodes' locations. If a node (such as node N) receives a hello message from another node (such as node i), node i is now located in node N's radio range, so node N checks whether node i is a node of the route. If so, node N sends a route update message instructing node i to delete the redundant nodes between them. When node i receives the route update message from node N, it deletes the redundant nodes between them and then transmits an Acknowledge (ACK) message to node N. When node N receives the ACK message, it deletes the redundant nodes as well.

III. Combination of the Proposed Algorithm with LAR and AODV Routing Protocols

LAR stands for Location-Aided Routing. Its route discovery mechanism is based on flooding, which means that each node relays incoming route requests to its direct neighbor nodes. LAR aims to decrease the number of useless messages by restricting the route request flood to a geographical region. Its special characteristic is the definition of the request zone and the expected zone, which can be considered a reduction of the search space for desired routes[15,16]. To do this, each LAR node should get its real-time location information from the Global positioning system (GPS), which is also important for our proposed algorithm when checking whether other nodes are located in a node's radio range.

With the availability of GPS, the mobile nodes can know their physical locations. To reduce the complexity of the protocol, it is assumed that every node knows its own position exactly; the difference between the exact position and the position calculated by GPS is not considered. As mentioned at the beginning of Section II, each routing protocol usually has 3 procedures: the route discovery procedure, the optimal route selection procedure and the route maintenance procedure. LAR also has these three procedures. If we check LAR's source code in Qualnet, we find that some special structures are defined to realize these 3 procedures, such as the lar1 routerequest structure, the lar1 routereply structure and the lar1 routeerror structure. Some special functions are also defined, for example the Lar1InitiateRouteRequest function, Lar1HandleRouteRequest function, Lar1InitiateRouteReply function, Lar1HandleRouteReply function, Lar1TransmitErrorPacket function, Lar1HandleBrokenLink function, and so on.

To realize LAR's special characteristic, it also defines the Lar1NodeInZone function, the Lar1CalculateReqZone function, and some other functions. To realize the proposed algorithm, we added some structures: the lar1 hello message structure, the lar1 routeupdateindicate structure, and the lar1 ack structure. Some functions were also modified or added: the Lar1BroadcastHelloMessage function, Lar1HandleHelloMessage function, Lar1DeleteOldNeighbor function, Lar1HandleRouteUpdateIndicationMessage function, Lar1HandleAck function, and some other auxiliary functions.

AODV stands for Ad hoc on-demand distance vector routing. Its route discovery mechanism is also based on flooding, the same as LAR. In AODV, the most famous on-demand routing protocol, the network is silent until a connection is needed. Like other routing protocols, AODV also has the 3 procedures mentioned above, which can be confirmed from the AodvInitiateRREQ function, AodvHandleRequest function, AodvInitiateRREP function, AodvHandleReply function, AodvSendRouteErrorForLinkFailure function, AodvHandleRouteError function and so on. To realize our proposed algorithm, we modified and added some functions similar to what we did in LAR, such as the AodvBroadcastHelloMessage function, AodvHandleHelloMessage function, AodvHanleRouteUpdateIndicationMessage function, AodvHandleAck function, and some other functions.

IV. Performance Analyses

As a powerful simulator, Qualnet provides a comprehensive set of tools with all the components for custom network modeling and simulation projects. Qualnet's speed, scalability, and fidelity make it easy for modelers to optimize existing networks through quick model setup and in-depth analysis tools, so we chose Qualnet as our simulator in this paper.

1. Simulation environment

Similar to the OSI 7 layer model, Qualnet has a layered protocol stack. Most protocols reside at a single layer in this stack. Each protocol communicates with the upper layer and lower layer protocols through well-defined Qualnet APIs.

Before a new protocol is implemented in a given layer, the other layers should also be assigned fixed protocols by the user, or left at the default protocols that the simulator sets automatically.

In this paper, to make the simulation environment similar to the WAVE environment, the physical layer protocol is set to IEEE 802.11a, the MAC layer protocol is set to the IEEE 802.11e MAC, the smallest data rate is set to 6Mbps, the largest data rate is set to 54Mbps, and the radio range is set to 200m. The application layer is set to CBR (Constant bit rate), and the packet size is set to the default value of 512bytes. The simulation space is set to 1500m×26m, similar to a real road environment, and the pathloss model is set to the two-ray model. All this information is shown in Table 1.

Table 1. Main parameters in simulation
Parameters                 Value
Simulation space           1500m×26m
Physical layer protocol    IEEE 802.11a
MAC layer protocol         IEEE 802.11e
Data rate                  6 ∼ 54 Mbps
Radio range                200m
Frequency                  5.9GHz
Receive sensitivity        −73dBm
Data packet size           512Byte
Pathloss model             Two-ray model

The highway environment and the urban environment differ in 3 salient features: the number of vehicles on the road, the distance between the front and back vehicles, and the velocities of the vehicles.

In the highway environment, the distance between two successive vehicles is 150m on average, so in 750 meters there will be 5 vehicles on the road. With 3 lanes in one direction, there will be (750/150) × 3 × 2 = 30 vehicles in both directions. The average velocity of each vehicle is usually about 80km/h, which is about 22m/s.

In the urban environment, the distance between two successive vehicles is 45m on average, so in 750 meters there will be nearly 17 vehicles on the road. With 3 lanes in one direction, there will be (750/45) × 3 × 2 = 100 vehicles in both directions. The average velocity of each vehicle is usually about 20km/h, which is about 5.56m/s.

These are only two typical situations. To cover more general situations, we added two more scenarios with 50 nodes and 80 nodes, so we simulate 4 situations in total: scenarios with 30 nodes, 50 nodes, 80 nodes, and 100 nodes. The scenario with 30 nodes is depicted in Fig.3.

Fig. 3. Scenario with 30 Nodes

2. Performance analysis

Throughput is defined as the average rate of successful message delivery over a communication channel. The data may be delivered over a physical or logical link, or pass through a certain network node. Throughput is usually measured in bits per second (bits/s or bps), and sometimes in data packets per second or data packets per time slot. In Qualnet, throughput is measured in bits/s. A high value means that the network situation is good and that few useless data frames occupy network resources, so for both client throughput and server throughput, the bigger the value the better. The simulation result for throughput is shown in Fig.4.

Fig. 4. CBR server throughput

In this figure and the following figures, new AODV and new LAR denote the original AODV and LAR protocols combined with our proposed algorithm. As shown in this figure, the throughput of AODV and LAR increases as the number of nodes increases. When the number of nodes is small, broken links may be the most important problem for data routing; as the number of nodes increases, this problem is reduced and the throughput increases. But when the number of nodes is large enough, hello messages may occupy a lot of network resources, so the throughput may not increase noticeably.

After these two famous routing protocols are combined with our proposed algorithm, the throughput of both increases. This is because our proposed algorithm reduces the redundant intermediate nodes and reduces or avoids broken links during data transmission, and thus reduces the route maintenance cost.

Packet delivery success rate is the ratio of total packets received to total packets sent. Its value is similar to the byte delivery success rate, which is the ratio of the total bytes received to the total bytes sent. A high packet delivery success rate means that the network is in a better state, because more data packets are transmitted successfully. The simulation results for this value are shown in Fig.5.

Fig. 5. Packet delivery success rate

As we can see in Fig.5, our algorithm affects the packet delivery success rate only a little. This is because, even when intermediate links are broken, the route is usually repaired after a short delay and can continue to transmit data if a new route exists. But if the destination node moves too fast, the route repair time is very precious. With our algorithm, the route repair time can be reduced, so the saved time can be used to transmit data, and the packet delivery success rate can be increased. But this case is occasional, so generally speaking, this parameter is affected by our algorithm, but not by much.

End-to-end delay refers to the time taken for a packet to be transmitted across a network from source to destination. It includes the transmission delay, the propagation delay and the processing delay. Fig.6 shows that all protocols' average end-to-end delay values increase as the number of nodes increases. This is because, as the number of nodes increases, more data frames are sent into the network, such as the hello, RREQ, RREP and RERR messages, and more nodes retransmit these data frames, so the processing delay increases. Regardless of the number of nodes, AODV's average end-to-end delay is larger than LAR's, because their RREQ transmission mechanisms differ and AODV's transmission area is much larger than LAR's. With our proposed algorithm, routes can be modified before they break, so the route length can be shortened and the route repair procedure can be avoided; the amount of data to process and the processing delay are then reduced, and finally the average end-to-end delay is reduced.

Fig. 6. Average end-to-end delay

In the frame structure of a computer network, in addition to user data, there is much control information that is used to ensure the completion of the communication. This control information is called system overhead. The system overhead in the communication experiment is thus the total overhead of building and maintaining routes for data transmission. In Fig.7, for both AODV and LAR, the data transmitted to each node to discover, build and maintain the route grows larger and larger as time goes by. Since our algorithm reduces the length of the communication route and avoids the unnecessary overhead of route repair, it can clearly improve the performance of AODV or LAR.

Fig. 7. Total overhead

V. Conclusions

In the VANET environment, the nodes' fast movement increases the number of redundant nodes in the route and also increases the route maintenance cost. To solve this problem, we proposed an algorithm that removes redundant nodes from the route and modifies the communication route in real time.

To verify the new algorithm, we chose two famous reference routing protocols, AODV and LAR. After adapting those two protocols, we compared them with the original protocols on several parameters, such as the CBR server throughput and the packet delivery success rate. Most of them demonstrate that our algorithm greatly improves the performance of the protocols.

References

[1] R.Z. Liang, “Using selective-broadcast method to disseminate emergency event in vehicular ad-hoc network”, National Central University, July, 2008.
[2] Z. Wu, Y. Yang, X. Guo and J.W. An, “Analysis of collision probability in IEEE 802.11 based VANETs”, Chinese Journal of Electronics, Vol.19, No.1, pp.187–190, Jan. 2010.
[3] R.J. Guo, “Connection-less source routing in vehicle ad hoc networks”, National Tai Wan Technology University, January 2007.
[4] A. Gowri, R. Valli, K. Muthuramalingam, “A review: Optimal path selection in ad hoc networks using fuzzy logic”, International Journal on Applications of Graph Theory in Wireless ad hoc Networks and Sensor Networks (GRAPH-HOC), Vol.2, No.4, Dec. 2010.
[5] Wu Xiaoyan, Liu Yang, “Routing optimizing algorithm of mobile ad-hoc network based on genetic algorithm”, American Journal of Engineering and Technology Research, Vol.11, No.9, 2011.
[6] P. Seethalakshmi, M. Joseph Auxilius Jude and G. Rajendran, “An optimal path management strategy in mobile ad hoc network using fuzzy and route set theory”, American Journal of Applied Sciences, Vol.8, No.12, pp.1314–1321, 2011.
[7] Chulhee Jang and Jae Hong Lee, “Path selection algorithms for multi-hop VANETs”, Vehicular Technology Conference Fall (VTC 2010-Fall), IEEE 72nd, 2010.
[8] Josiane Nzouonta, Neeraj Rajgure, Guiling (Grace) Wang and Cristian Borcea, “VANET routing on city roads using real-time vehicular traffic information”, IEEE Transactions on Vehicular Technology, Vol.58, No.7, Sept. 2009.
[9] S. Jaap, M. Bechler, L. Wolf, “Evaluation of routing protocols for vehicular ad hoc networks in city traffic scenarios”, Proceedings of the 5th International Conference on Intelligent Transportation Systems (ITS) Telecommunications, June 2005.
[10] S.R. Zheng, H.T. Wang, Z.F. Zhao, Z.C. Mi, N. Li, Ad Hoc Network Technology, Posts and Telecom Press, Oct. 2005.
[11] H.Y. Zhang, Y.Y. Li, Y.H. Liu, “Research of position-based routing for wireless sensor networks”, Computer Application Research, Vol.25, 2008.
[12] B.Y. Yong, “Development of local repair algorithm for seamless routing path in vehicle to vehicle communication”, Hanyang University, 2007.
[13] T.S. Su, W.L. Jeng, W.S. Hsieh, “A survey on topology-based routing for ad hoc wireless networks”, Eighty-Second Years’ School Anniversary and Thirteenth Symposium of Three Officers School, Taiwan, May 2006.

[14] T.Y. Xiao, “AODV-ABR adaptive backup route in ad-hoc networks”, National Zhongshan University, Taiwan, July 2004.
[15] Y.B. Ko, N.H. Vaidya, “Location-aided routing (LAR) in mobile ad hoc networks”, Wireless Networks, Vol.6, No.4, pp.307–321, 2000.
[16] Y.L. Cao, “Mobile ad hoc network routing algorithm research”, Huanan Science and Engineering University, 2006.

XU Shenglei received the M.S. degree from Hanyang University, Korea, in 2009. He is currently a Ph.D. student at Hanyang University. His research interests include ad-hoc routing protocol development and simulation in vehicle to vehicle communication, communication standardization and protocol development in ITS/telematics, and positioning technology. (Email: [email protected])

LEE Sangsun (Corresponding Author) received the B.S. and M.S. degrees from Hanyang University, Korea, in 1978 and 1983, and the Ph.D. degree from the University of Florida, USA, in 1990. Since 1993, he has been a professor of the College of Information and Communications, Department of Electronics Computer Engineering, Hanyang University. Since 2002, he has been an Internal Commissary of Korean Standards Association ISO TC204 WG16, and since 2005 a member of the Technique Section of the Ministry of Knowledge Economy Department, Car Telematics Forum. Also since 2005, he has been the Chairman of the ITS Korean Public Transit Information Management and Communications Research Committee. Since 2006, he has been the Chairman of the TTAPG310 ITS/Telematics Section and the Chairman of the Korean Information and Communications Society ITS/Telematics Research Committee.

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

An Evidence-Driven Framework for Trustworthiness Evaluation of Software Based on Rules∗

WANG Xiaoyan, LIU Shufen and BAO Tie

(College of Computer Science and Technology, Jilin University, Changchun 130012, China)

Abstract — Based on an analysis of existing trusted software evaluation technology, an evidence-driven framework for trustworthiness evaluation of software based on rules is put forward. Rules are used to express the trustworthiness evaluation logic, and evidence is used to drive the trustworthiness evaluation process. The collection and processing logic of trustworthiness evidence and the mapping method of trustworthiness levels are encapsulated in rules. The selection, instantiation, collection, format definition and measurement of trustworthiness evidence are carried out under the guidance of the rules, and the mapping of trustworthiness levels and the analysis of trustworthiness bottlenecks are done based on the measured evidence instances. The framework provides an application implementation scheme for software trustworthiness evaluation.

Key words — Trusted software, Trustworthiness evaluation, Evidence-driven, Rules.

∗Manuscript Received Aug. 2011; Accepted May 2012. This work is supported by the National High Technology Research and Development Program of China (863 Program) (No.2009AA010314) and National Natural Science Foundation of China (No.60973041).

I. Introduction

Software is playing an increasingly important role in analysis, decision making and production control in many fields, which places higher requirements on the reliability and stability of software. However, with the increase of software scale and the complexity of the application environment, it is very hard to ensure that software operates as expected, while the breakdown and failure of software often cause enormous loss. Due to the emergence of trustworthiness issues, software trustworthiness research is becoming more and more important in computer applications.

In the research of trustworthiness evaluation for certain software forms or phases, WANG Huaimin et al.[1] put forward a conceptual model for the trustworthiness of Internet-based software and propose a trustworthy assurance framework for the Internet-based virtual computing environment. JIN Zhi et al.[2] present a trustworthiness requirements pattern and a method for generating patterns from the knowledge base to help elicit trustworthiness requirements. LV Jian et al.[3] put forward an improved reliability evaluation method which describes how to use Petri nets as the software architecture description and then introduces a bottom-up way to calculate the system's reliability. Palviainen et al.[4] provide a coherent approach that combines both predicted and measured reliability values with heuristic estimates in order to facilitate a smooth reliability evaluation process. BAO Tie et al.[5] propose a trustworthiness evaluation method for domain software including trust level definition and bottleneck analysis.

In the research of trusted software analysis and construction, Holzworth et al.[6] describe a simple model development process (using version control, automated testing and differencing tools) that enhances the reliability and usefulness of a model. Wang Yang et al.[7] propose a novel dynamical trust construction schema based on fuzzy decision and extended automated trust negotiation. GU Liang et al.[8] propose a runtime software trustworthiness evidence collection mechanism based on trusted computing technology. Ryu S.H. et al.[9] present a framework that supports service managers in managing business protocol evolution by providing several features.

The above work has focused on the concept of software trustworthiness and the definition of the attributes it contains, or on trustworthiness evaluation methods for certain software forms or phases. However, no explicit solutions were given for how to practically express trustworthiness attributes and evidence, how to specifically describe the trustworthiness requirements and levels of software, and how to analyze the trustworthiness bottleneck of software. For this reason, this paper proposes an evidence-driven framework for trustworthiness evaluation of software based on rules. The framework expresses the collection and processing logic of trustworthiness evidence and the mapping method of trustworthiness levels through rules, drives the trustworthiness evaluation process through a trustworthiness evidence tree, assesses the trustworthiness level of software, analyzes bottleneck problems, and gives specific improvement suggestions based on trustworthiness evaluation data.

II. Framework of Trustworthiness Evaluation

The rule-based software trustworthiness evaluation framework is shown in Fig.1. It contains a model library of trustworthiness levels and a model library of trustworthiness evidence. The trustworthiness evidence model library contains the set of trustworthiness evidence entities available under the existing conditions. Each entity is composed of collection rules, format definition rules and measurement rules; it can collect original evidence information, conduct standard format processing and generate trustworthiness evidence with a measurement value. The trustworthiness level model library contains all trustworthiness level modes based on the selected evidence. Each level mode is composed of level definition rules and bottleneck analysis rules, which make it possible to analyze the trustworthiness level and bottleneck of software based on specific evidence. These two resource libraries may be managed via the definition tool for evidence rules and the definition tool for level rules.

The evaluation process may be managed through the trusted requirement management tool and the software evaluation management tool. Evaluation parameters need to be customized before the trustworthiness evaluation process of software is started, which may be done by confirming the software trusted requirement, that is, by determining which aspects of software trustworthiness need to be investigated. Trustworthiness evidence may be selected from the evidence model library according to the trusted requirement, and the trustworthiness level mode of the evaluation may be determined based on the selected evidence. In the selection process, evaluation costs and application environment factors shall also be taken into consideration.

Fig. 1. Trustworthiness evaluation framework

After the trustworthiness evidence set and the level mode for evaluation are determined, the trustworthiness evaluation process may be started, and standard evidence may be generated under the guidance of the collection rules, format definition rules and measurement rules in the trustworthiness evidence entity. The collection of original evidence information is completed by means of collection rules; collection methods may use standard protocols such as FTP and Telnet, and may also involve the collection of specific files and forms. The original evidence information is processed based on format definition rules to extract the value of the key field, which may involve text analytical processing, digit arithmetic or logical calculation. The measurement of trustworthiness evidence is completed based on measurement rules and standard evidence information, and a unified measurement value is generated to support the comparison and operation of different trustworthiness evidence.

Based on the trustworthiness evidence instances collected and processed, trustworthiness analysis can be made in accordance with the trustworthiness level mode. The trustworthiness level of the software is analyzed based on level definition rules, which not only gives an intuitive understanding of software trustworthiness, but also allows the trustworthiness of different software to be compared. Finally, the weakness of the software is analyzed based on the bottleneck analysis rules, and an analysis report is obtained, which includes the specific bottleneck problems of software trustworthiness as well as improvement directions and suggestions given in accordance with the actual analyzed data.

III. Trustworthiness Evidences Management

Trustworthiness evidence is the basis of trustworthiness evaluation of software. Our trustworthiness evidence comes from actual evidence in the lifecycle of software, such as the software testing report, user feedback report, software development document, software acceptance report and so on. Such evidence is easier to collect and measure, and trustworthiness evaluation based on such actual evidence is more operable, which may also reduce the actual workload. For example, the trustworthiness evidence “system evaluation of users” comes from the user feedback report; its measurement value can be obtained from the mapping between user satisfaction and score. Another example is the trustworthiness evidence “correct rate of system functions”, whose measurement value can be obtained by dividing the number of system functions that meet the user needs by the total number of system functions.

The framework in this paper maintains the evidence available under the existing conditions in the trustworthiness evidence model library. Evidence may be stored according to a classification by field, application environment and attribute; the organizational structure is shown in Fig.2.

Fig. 2. Evidence model

Each piece of practical evidence can be analyzed and transformed into a leaf node in the tree of the evidence model. One leaf node evidence entity can be described as a tetrad ⟨Ed_ID, Col_Rule, FD_Rule, Mea_Rule⟩, of which Ed_ID refers to the unique identifier of the evidence; Col_Rule refers to the collection rule of the evidence, which encapsulates the specific collection method for evidence information; FD_Rule refers to the format definition rule of the evidence, which encapsulates the format definition method for original evidence information; Mea_Rule refers to the measurement rule of the evidence, which encapsulates the specific method for generating the quantitative measurement value. These rules conduct the data processing in trustworthiness evidence instances. One trustworthiness evidence instance can be described as a quintuple ⟨Ed_ID, Ed_Para, OraData, ForData, MeaValue⟩, of which Ed_ID refers to the identifier of the related evidence entity; Ed_Para saves the related parameters of the evidence instance, which include parameters used in evidence collection, format definition and measurement; OraData saves the original evidence information of the instance; ForData saves the standard evidence information obtained by format parsing; MeaValue saves the evidence measurement value generated by running the measurement rule, which may be used in subsequent operations.

The trustworthiness evidence available under the existing conditions is stored in the trustworthiness evidence model library. For a specific software evaluation, we need to select trustworthiness evidence based on the evaluation cost and the specific application environment, thereby determining which trustworthiness evidence is taken into consideration in the evaluation. We first determine the trustworthiness evidence entity set $EvidenceEntitySet = \{ee_1, ee_2, \cdots, ee_m\}$, and then establish the trustworthiness evidence instance set $EvidenceInstanceSet = \{ei_1, ei_2, \cdots, ei_m\}$, which satisfies the following condition:

$$
\bigl[(\forall ee \in EvidenceEntitySet)(\exists ei \in EvidenceInstanceSet)\; ee.Ed\_ID = ei.Ed\_ID\bigr] \wedge \bigl[(\forall ei \in EvidenceInstanceSet)(\exists ee \in EvidenceEntitySet)\; ei.Ed\_ID = ee.Ed\_ID\bigr]
$$

Thus, under the guidance of the trustworthiness evidence rules, we can process the evidence instances, with the computational process as follows:

    For each ei in EvidenceInstanceSet
    Begin
        Extract ee which belongs to EvidenceEntitySet and ee.Ed_ID equals to ei.Ed_ID;
        ei.OraData = ee.Col_Rule.run(ei.Ed_Para);
        ei.ForData = ee.FD_Rule.run(ei.OraData);
        ei.MeaValue = ee.Mea_Rule.run(ei.ForData);
    End

IV. Trustworthiness Level Model

In order to analyze and compare the trustworthiness of software, this paper adopts a level mode to measure the trustworthiness of the software, analyze bottleneck problems of software trustworthiness, and give specific improvement directions and suggestions in combination with the evaluation data. The composition of a level mode in the level model library is shown in Fig.3.

Fig. 3. Level mode in trust level model

The evidence set of a trustworthiness evaluation may be determined based on the cost and application environment of the evaluation, and then the instance set of trustworthiness evidence is established. The level mode of this paper is based on the evidence instance set, and the level evaluation tree is generated by combining it with the level evaluation conditions. On the tree, evidence instances are leaf nodes, while the measurement value of a non-leaf node is obtained from the measurement values of all its child nodes. One non-leaf node can be described as a tetrad ⟨Node_ID, opr, ChildLink, MeaValue⟩, of which Node_ID refers to the unique identifier of the node; opr refers to the operator of the operation node; ChildLink refers to the link to the child nodes of the node, and can be described as a triple ⟨ChildNode, Weight, Link⟩, where ChildNode refers to the current child node, Weight refers to the weight value of that child node in the operation, and Link refers to the link to the next child node; MeaValue refers to the measurement value of the node, which may be used in the operation of upper nodes. As described in Section III, the measurement values of the leaf nodes in the level evaluation tree can be calculated, so we can traverse each node of the tree to calculate its evaluation value. The calculation process for the measurement value of each node in the non-leaf node set $EvidenceOperationSet = \{eo_1, eo_2, \cdots, eo_n\}$ is as follows:

    For each eo in EvidenceOperationSet
    Begin
        Object cl = eo.ChildLink();
        While (cl != null)
        {
            eo.MeaValue  eo.opr=  cl.ChildNode.MeaValue * cl.Weight;
            cl = cl.Link;
        }
    End

The measurement values of the leaf nodes and non-leaf nodes in the level evaluation tree can thus be calculated; these nodes reflect the trustworthiness attributes of software in one or several aspects. This paper adopts a level mode to describe software trustworthiness, so trustworthiness levels may be set up based on the measurement values of these nodes. Every trustworthiness level is likely to impose limiting conditions on certain nodes, and a specific level can be reached only when these conditions are met. The node set of the level evaluation tree is expressed as $LevelNodeSet = \{ei_1, ei_2, \cdots, ei_m, eo_1, eo_2, \cdots, eo_n\}$, where $ei$ is a leaf node of trustworthiness evidence instance type and $eo$ is a non-leaf node of arithmetic type. The evaluation conditions for the trustworthiness levels may be transformed into a $(p+1) \times (m+n)$ condition matrix $Cond$, any element $Cond_{i,j}$ of which gives the limiting condition of level $Level_i$ for evaluation tree node $e_j$. $Level_0$ is the minimum level, expressing that the software is unavailable or that its trustworthiness cannot be evaluated, so $Cond_{0,j} = null$ for $1 \le j \le m+n$. The trustworthiness level set is a finite ordered collection of levels, which can be described as $\{Level_0, Level_1, \cdots, Level_p\}$, fulfilling $Level_0 < Level_1 < \cdots < Level_p$; here the condition $Level_i < Level_j$ can be described as:

$$
\bigl[(\forall e_k \in LevelNodeSet)\; satisfied(e_k, Cond_{j,k}) \wedge satisfied(e_k, Cond_{i,k})\bigr] \wedge \bigl[(\exists e_s \in LevelNodeSet)\; satisfied(e_s, Cond_{i,s}) \wedge \neg satisfied(e_s, Cond_{j,s})\bigr]
$$

This paper analyzes the trustworthiness of software based on the aforementioned trustworthiness level evaluation tree and level limiting conditions, which mainly includes the judgment of the trustworthiness level and the analysis of bottleneck problems. Besides determining the corresponding level of the software, TrustDataSet is used to save trustworthiness analytical data, and one element in the set can be described as a triple ⟨Node, LevelNum, cond⟩. BottleneckSet is used to save trustworthiness bottleneck data, and one element can be described as a two-tuple ⟨Node, cond⟩. Node refers to a node in the trustworthiness evaluation tree; LevelNum refers to the corresponding level of the node, that is, the level whose condition the measurement value of the node can meet; cond refers to the improvement condition of the node, which is the condition of the next higher level above level LevelNum. BottleneckSet indicates the leading factors that restrict the current software trustworthiness level, which can be consulted to improve the trustworthiness of the software. The specific algorithm is described as follows:

    Input: LevelNodeSet, Cond
    Output: trustlevel, BottleneckSet
    Initialize trustlevel = p, BottleneckSet = {}, TrustDataSet = {}
    For each e_k in LevelNodeSet
    Begin
        for (int i = p; i >= 0; i--)
            if (satisfied(e_k, Cond[i,k]))
            {
                if (i < trustlevel)
                    trustlevel = i;
                if (i != p)
                    TrustDataSet = TrustDataSet ∪ {⟨e_k, i, Cond[i+1,k]⟩}
            }
    End
    For each td in TrustDataSet
    Begin
        if (td.LevelNum == trustlevel)
            BottleneckSet = BottleneckSet ∪ {⟨td.Node, td.cond⟩}
    End

V. Conclusion and Prospect

This paper provides an evidence-driven framework for trustworthiness evaluation of software based on rules. The key to trustworthiness evaluation is the construction of the trustworthiness evaluation tree. Based on the trustworthiness evaluation tree, we can determine the trustworthiness level of software, analyze the bottleneck problems of software according to the trustworthiness analytical data, and therefore give concrete improvement suggestions. A bottom-up way is adopted to construct the evaluation tree. The trustworthiness evidence leaf nodes of the tree come from the practical evidence in the software lifecycle, and rules are used to encapsulate the collection, format definition and measurement logic for practical evidence processing. The work of this paper provides effective support for trustworthiness evaluation of software, but the analysis of the trusted requirement and the determination of the trustworthiness evidence set still need some human involvement. In future work, we will seek more effective methods for different application fields and actual application environments, so as to reduce human involvement and improve the degree of automation of the framework. Providing more flexible and diversified arithmetic methods for computing nodes on the trustworthiness evaluation tree is also part of our future work.

References

[1] Wang Huaimin, Tang Yangbin, Yin Gang, Li Lei, “Trustworthiness of Internet-based software”, Science in China Series F: Information Sciences, Vol.49, No.6, pp.759–773, 2006.
[2] Wang Yue, Liu Chun, Zhang Wei, Jin Zhi, “Knowledge guided software trustworthiness requirements elicitation”, Chinese Journal of Computers, Vol.34, No.11, pp.2165–2175, 2011.
[3] Lu Wen, Xu Feng, Lv Jian, “An approach of software reliability evaluation in the open environment”, Chinese Journal of Computers, Vol.33, No.3, pp.452–462, 2010.
[4] M. Palviainen, A. Evesti, E. Ovaska, “The reliability estimation, prediction and measuring of component-based software”, Journal of Systems and Software, Vol.84, No.6, pp.1054–1070, 2011.
[5] Bao Tie, Liu Shufen, Wang Xiaoyan, “Research on trustworthiness evaluation method for domain software based on actual evidence”, Chinese Journal of Electronics, Vol.20, No.2, pp.195–199, 2011.
[6] D.P. Holzworth, N.I. Huth, P.G. DeVoil, “Simple software processes and tests improve the reliability and usefulness of a model”, Environmental Modelling & Software, Vol.26, No.4, pp.510–516, 2011.

[7] Wang Yang, Wang Ruchuan, Han Zhijie, “Dynamical trust construction schema with fuzzy decision in P2P systems”, Chinese Journal of Electronics, Vol.18, No.3, pp.417–421, 2009.
[8] L. Gu, Y. Guo, H. Wang, Y.Z. Zou, B. Xie, W.Z. Shao, “Runtime software trustworthiness evidence collection mechanism based on TPM”, Journal of Software, Vol.21, No.2, pp.373–387, 2010.
[9] S.H. Ryu, F. Casati, H. Skogsrud, B. Benatallah, R. Saint-Paul, “Supporting the dynamic evolution of Web service protocols in service-oriented architectures”, ACM Transactions on the Web, Vol.2, No.2, artn.13, 2008.

WANG Xiaoyan was born in Jilin Province, China, in 1977. She received the B.S., M.S. and Ph.D. degrees from Jilin University, China, in 2000, 2003 and 2008, respectively, all in Computer Science and Technology. Her current research interests include model driven architecture. (Email: [email protected])

LIU Shufen was born in Jilin Province, China, in 1950. Currently she is a professor and doctoral supervisor in College of Computer Science and Technology of Jilin University. Her current research interests include computer supported cooperative work, software architecture and software programming method based on MDA. (Email: [email protected])

BAO Tie (corresponding author) was born in Jilin Province, China, in 1978. He received the B.S., M.S. and Ph.D. degrees from Jilin University, China, in 2001, 2004 and 2007, respectively, all in Computer Science and Technology. His research area covers trusted software technology and network management. (Email: [email protected])

Chinese Journal of Electronics Vol.21, No.4, Oct 2012

TISA: Reconfigurable System for Template-Based Stream Computing∗

YANG Qianming, WU Nan, WEN Mei, QUAN Wei and ZHANG Chunyuan (College of Computer Science and Engineering, National University of Defense and Technology (NUDT), Changsha 410073, China)

Abstract — For High performance Digital signal process (HP-DSP) systems with a low-volume market, performance per watt, flexibility and cost of design are becoming the goals pursued by architects. This paper presents a template-based reconfigurable platform, TISA-II, on which application-specific stream computing systems can be constructed fast, efficiently and conveniently. The paper describes the platform in terms of architecture, programming model, a hardware/software co-design flow, and our implementation. Finally, the paper evaluates TISA-II with a real application, HD H.264 encoding. The results are encouraging: TISA-II with 4 computing nodes achieves a 50∼100x speedup over embedded programmable processors (C64 DSP, MIPS) and 3x over a dedicated stream processor (Storm), and its performance per watt is consequently also greater.

Key words — Stream architecture, Field programmable gate arrays (FPGA), Reconfigurable computing, Accelerator.

∗Manuscript Received Nov. 2011; Accepted Jan. 2012. This work is supported by the National Natural Foundation of China (No.61033008, No.60703073, No.61103080) and SRFDP (No.20104307110002).

I. Introduction

There are various HP-DSP applications with a low-volume and volatile market. A dedicated ASIC is uneconomical and easily becomes outdated for these applications, while programmable processors can hardly meet the harsh requirements. Reconfigurable computing based on Field programmable gate arrays (FPGAs) is one very promising technology for both improving performance and reducing cost. In addition, with their in-system programmability and short design cycle, FPGAs provide an excellent ASIC replacement for a volatile market. However, upcoming HP-DSP products present further requirements beyond the traditional development approach, as follows:

First, due to the ever-growing computation demands and rising algorithm complexities of HP-DSP applications, the scale of reconfigurable computing systems is becoming larger and larger[3], so that multi-core, multi-chip and multi-board systems require a sophisticated communication infrastructure.

Second, the increasing system scale requires an advanced approach to simplify and shorten product development cycles.

Third, rather than being specially designed for single algorithms with fixed dedicated FPGA configurations, the new reconfigurable computing system needs to contain programmable and un-programmable logic (customized logic) simultaneously to serve various demands.

Confronting these challenges, a template-based approach for HP-DSP systems was followed in the Tiled stream architecture (TISA) project. Our approach is to construct a platform based on a uniform template, on which customers can build their own HP-DSP systems fast, efficiently and conveniently. The main points of the approach are the following. First, a well-engineered architecture template is used, instead of generating the given hardware from a generic representation in a high level language. Second, a uniform programming model is needed to orchestrate numerous and various resources. Third, a software/hardware co-design flow makes the architecture template cooperate efficiently with the programming model. The architecture template, programming model and co-design flow are all highly optimized for the stream model[1]. This is because the stream model shows surprising efficiency on most HP-DSP algorithms with a data flow processing nature[2,4,7].

The rest of the paper details each part of TISA-II, its implementation and the evaluation results.

II. Related Work

In the stream model, the data primitive is a stream, an ordered set of data of an arbitrary data type. Operations in the stream model are expressed as computational kernels and load/store operations on entire streams. Since there is a highly predefined directed acyclic graph in the stream model, computation and stream transfer are decoupled[1]. The newest GPUs, the CELL BE processor and the Storm DSP[7] are designed to support the stream model.

Some template-based reconfigurable FPGA accelerators have been proposed for the purpose of generality and rapid prototyping. The BEE reconfigurable computing system[8] is capable of real-time execution of a class of algorithms in the signal processing domain due to its novel routing architecture and application specific nature. Moreover, the dataflow structure facilitates the development of a highly abstracted design flow for the emulator. The BEE reconfigurable computing system can implement any application in theory, but the design of the hardware logic of the FPGA on BEE is complex. In contrast, in Belles' work[4] and in TISA-II presented in this paper, a stream architecture template is used to simplify the logic design of the FPGA.

Sven Heithecker presented a reconfigurable platform for high-end real-time digital film processing[9]. With some similarities to our design methodology, it combines macro component configuration and macro-level floorplanning with weak programmability using distributed microcoding. However, since their platform is designed for a special application, its reconfigurability and programmability are relatively weak.

III. TISA-II Architecture Template

1. System wide architecture

The system wide overview of TISA-II is shown in Fig.1.

Each TISA-II node board features 5 daughter cards. Four of them are CP cards (Computing cards). Each computing card features a Xilinx Virtex-5 LX FPGA, providing the massive processing elements required to implement the stream processing data path. The fifth, the NR card (Net route card), features a Xilinx Virtex-5 FX FPGA, which acts as the network router with four 120-pin parallel I/O and four GTXx4 serial I/O; meanwhile, the FPGA-embedded CPU on the NR card is used for control and non-computation-intensive tasks. In addition, an NI-PCIE Bridge card is designed for plugging TISA-II into a PC or workstation through the PCI Express bus (x16).

Fig. 1. System-wide view of TISA-II

2. Computing and communication template

The computing hierarchy of TISA is fully optimized for the stream model. There are three levels: Node Board, Processing Core and Configurable Tile. A corresponding communication unit is designed for each computing level.

(1) Node board

A backboard with 4 CP cards and 1 NR card is a node of TISA-II. Each CP card consists of an FPGA configured using macro components that contain a MicroBlaze processor core and a MASA Stream processing core (SPC)[6], 2 DDR2 memories, configuration circuits and a memory module connector. The NR card consists of memories and configuration elements similar to those of the CP card, but with an 8-radix router with additional global interconnect interfaces and control signals to the secondary system components. In addition, since the Virtex-5 FX FPGA already integrates two PowerPC cores on die and can run Linux OS, the system functionalities carried by the host CPU are also taken over by the NR FPGA directly, which means that control and interconnection logic are integrated in one chip. The architecture of the NR FPGA is shown in Fig.2.

Fig. 2. NetRoute FPGA. (a) Architectural template of NR (block diagram); (b) Implementation view in a Xilinx XCV5FX device (an example for H.264 encoder in section 6)

(2) Processing core

Typically, there are two heterogeneous cores on the CP FPGA: a MicroBlaze Processor Core and a MASA SPC. The SPC, as shown in Fig.3, consists of a Coarse-grain reconfigurable array (CRA), a Fine-grain reconfigurable array (FRA), a Microcode controller (MC), a Stream register file (SRF), 2 Address generators (AG) and a Network interface (NI).

Fig. 3. Computing FPGA. (a) Architectural template of CP (block diagram); (b) Implementation view in a Xilinx XCV5LX device (an example for H.264 encoder in section 6)

596 Chinese Journal of Electronics 2012

CRA is an array of Configurable tiles that executes kernels programmed in a high-level language. Both the data and instruction networks on the array are dynamically reconfigurable to support multiple parallel execution models.

FRA is a resource pool of fine-granularity FPGA LUTs, which allows the customer to statically reconfigure a Special purpose unit (SPU) that performs user-defined instructions, or a special cluster that executes kernels programmed in a hardware description language.

More details of MASA can be found in Ref.[6]. Note that the original MASA and the SPC in TISA-II differ in the interconnection between modules: in TISA-II, stream buffers decouple all of them, while a bus is used in Ref.[6].

(3) Configurable tile (CT)

The CT consists of functional units such as an ALU, a multiplier and a shifter, and a few storage units such as register files. The SRF provides operand data to the CT array through 8 Stream buffers (SB). Each SB provides 19.2Gbps peak bandwidth. The configuration file for each kernel also resides in the SRF as a stream. Before a kernel starts, the MC loads configuration words from the stream to configure the function of each CT. After configuration, the MC provides VLIW instructions to the instruction register of each CT on a cycle-by-cycle basis.

The middle bottom of Fig.4 shows an example of the CT structure and instruction architecture. A 35-bit VLIW specifies the operation for the FU, the inputs to be selected for LRF0, LRF1 and CCRF, the amount and direction of the shift of the ALU output, and the register for storing the result, as shown in the figure. The VLIW architectures of other CGRAs such as Refs.[3, 5] are similar to that of TISA-II, although there is wide variance in instruction width and in the kinds of fields used for different functionality.

The tiles interconnect using eight 32-bit full-duplex data

buses and six half-duplex instruction buses, as the bottom right of Fig.4 shows. The local switch corresponding to each CT functions as router and interface for these buses. Each CT in the array is connected to its nearest neighbor CTs (top, bottom, left and right). In addition, there are limited interconnections between non-neighboring CTs in order to perform column-wise or row-wise data transfer efficiently.

Fig. 4. Example of a SPC configuration (MIMD VLIW Clusters+SPU+Hardkernel Clusters)

3. Multi-morph configurable computing

The reconfigurable array engine can be configured for various kernel execution modes that we call morphs. The morph can be configured kernel by kernel, which suits the different types of parallelism found in different kernels. So far, the SPC in TISA-II supports the following commonly used morphs:

SIMD softkernel morph. The local switch of each CT is configured to connect to the other CTs in the same row. Thus, there are several clusters, each of which consists

of one row of CTs, as shown in the top-left of Fig.5. Kernels are executed time-multiplexed in the SPC. All instruction stores are aggregated together to issue one VLIW instruction. During kernel execution, the same VLIW instruction is broadcast to all 4 clusters in SIMD fashion. Stream read/write operations are performed by passing stream elements from and back to the SRF.

MIMD softkernel morph. The local switch of each CT is configured so that each CT is connected to its nearest neighbor CTs (top, bottom, left and right). Thus, there are several clusters, each of which consists of one rectangular region of the array, as shown in the bottom-left of Fig.5. In this mode, the instruction store is split into several separate stores, each of which issues a VLIW instruction to its corresponding cluster in every cycle. This means that several kernels may run simultaneously, space-multiplexed. The output records produced by one kernel are sent to the consuming kernel directly through the COM unit.

Hardkernel morph. There are no programmable clusters for a hardkernel. Some CRA and FRA resources are used for logic dedicated to the application. This mode is mainly used to process kernels with simple, irregular and extremely high computation demand, such as the very large FFTs used in signal processing, entropy coding in video compression, etc. The hardkernel acts as a separate cluster to process streams. In this scenario, its execution mode may differ from that of the other clusters.

Fig. 5. Morphs for SPC configurable array

IV. Results

To test this system architecture, two typical high performance DSP applications were implemented on the TISA-II platform. One is a complex video compression application, an HD (1080P) H.264 encoder. The other is high-resolution radar data compression.

Ref.[2] has implemented a streaming full HD H.264 encoder. The stream code contains a stream thread, which consists of 32 kernels and 112 streams. Based on the design flow mentioned in section 5, we construct the HD H.264 encoder on the TISA-II platform. Considering the scale of the application, we use just one Node Board. The raw video data is input from a workstation through a PCIE-NI card. The mapping solution and hardware resource utilization are listed in Table 1, and experiment results are listed in Table 2.

Table 1. Implementation of HD H.264 encoder on TISA-II

Kernels Execution CPFPGA0 and CPFPGA1 CPFPGA3 (do Deblock morph CPFPGA2 (do encode) (do analyse) Filter and CALVC) Scan zigzag 16x16, Data tranlate, Analyse intra Chroma, SIMD Scan zigzag 4x4, Macroblock cache load, Refine Qpel, Encode 8x8 chroma, Coeff calculate Chroma, softkernel Non zero Luma, Coeff calculate Luma Analyse intra P Non zero Chroma Macroblock head write MIMD Predict4x4, Predict16x16, Sub16x16 DCT, Sub4x4 DCT, N/A softkernel Encode Pskip Quant 16x16, Quant 4x4 Analyse inter P16x16, SIMD Dequant 16x16, Dequant 4x4 Analyse inter P8x8, softkernel Add 16x16 IDCT, N/A Analyse inter P16x8, +SPU Add 4x4 IDCT Analyse inter P16x8 BS write chroma, BS write luma, Hardkernel N/A N/A BS write ve, Frame filter, Fdec deblock 80% LUT Utilization, 75% Reg FPGA 77% LUT Utilization, 73% Reg 67% LUT Utilization, 65% Reg Utilization 2.4M BRAM Utilization, Hardware Utilization, 2.4M BRAM Utilization, Utilization, 2.4M BRAM Utilization, 100MHz system frequency, Implementation 100MHz system frequency, 100MHz system frequency, 150MHz CRA frequency, Report 150MHz CRA frequency 150MHz CRA frequency 200MHz FRA frequency

Table 2. Result of 3000 Frames 1080P H.264 encode (Results of other processors except TISA-II are referenced from Ref.[2])

Processor          Peak power   Time    Frame/s   Speedup   Power efficiency∗
MIPS 4KEc          0.7W         4291s   0.88fps   1         1
TMS320 C64         1.6W         1908s   1.57fps   1.8       0.78
X86 core 2 E8200   65W          259s    10.2fps   11.6      0.12
STORM-SP16         12W          98s     30.6fps   34.8      2.02
TiSA-II (1 Node)   25W          32s     94.6fps   107       3.01
∗ fps/W, normalized to MIPS

The major conclusions from our comparison are as follows. On the one hand, TISA-II with a 4-CP node achieves 53∼107x speedup over embedded programmable processors (C64 DSP, MIPS) and 3x over a dedicated stream processor (Storm). On the other hand, TISA-II runs at 100∼200MHz and consumes 10W-25W per node, and a stack of 4 node boards measures 23cm × 23cm × 18cm, which makes it well suited to harsh physical environments.

V. Conclusion

A reconfigurable HW/SW platform for high performance computing is presented. The combination of a programmable and configurable stream architecture template and a hardkernel library is the key for system developers to reap high designer productivity. Furthermore, a coarse- and fine-granularity reconfigurable architecture provides a customizable infrastructure to achieve further speedup. Results show that TISA-II is well suited for constructing application-specific systems within a relatively short development cycle. The experimental results of a real application show that the TISA-II platform facilitates FPGA programming for HP-DSP developers by providing them with performance that is greater and power consumption that is less than their current CPU platforms, without sacrificing their familiar, C-based programming environment.

References

[1] S. Amarasinghe and B. Thies, “Architectures, languages and compilers for the streaming domain”, Tutorial at 15th International Conference on Parallel Architectures and Compilation Techniques (PACT), New Orleans, LA, pp.4–6, 2003.
[2] Nan Wu et al., “Streaming HD H.264 encoder on programmable processors”, ACM Multimedia 2009, Beijing, pp.125–128, Oct. 2009.
[3] Yale Patt, Jim Smith and Mateo Valero, “Fine- and coarse-grain reconfigurable computing”, Springer, pp.211–214, 2007.
[4] Nikolaos Bellas et al., “Mapping streaming architectures on reconfigurable platforms”, ACM SIGARCH Computer Architecture News 2, Vol.35, No.3, pp.24–27, June 2007.
[5] F. Bouwens et al., “Architectural exploration of the ADRES coarse-grained reconfigurable array”, in Proc. ARC 2007, LNCS, Vol.4419, pp.1–13, Springer, Heidelberg, 2007.
[6] Nan Wu et al., “A stream architecture supporting multiple stream execution models”, 10th ACSAC, LNCS 3740, pp.143–156, Singapore, Oct. 2005.
[7] B. Khailany et al., “A programmable 512GOPS stream processor for signal, image, and video processing”, IEEE ISSCC, San Francisco, pp.125–128, Feb. 2007.
[8] K. Kuusilinna et al., “Designing BEE: a hardware emulation engine for signal processing in low-power wireless applications”, EURASIP Journal on Applied Signal Processing, pp.34–45, 2003.
[9] Sven Heithecker et al., “A high-end real-time digital film processing reconfigurable platform”, EURASIP Journal on Embedded Systems, pp.67–70, Article ID 85318, 2007.

YANG Qianming is a Ph.D. candidate in computer science at National University of Defense Technology. His research interests include computer architecture and VLSI design.

WU Nan is an assistant professor in computer science at National University of Defense Technology. His research interests include computer architecture, stream computing, and compiler design. (Email: [email protected])

WEN Mei is an associate professor in computer science at National University of Defense Technology. Her research interests include computer architecture, high performance computing and media processing.

QUAN Wei is a Ph.D. candidate in computer science at National University of Defense Technology. His research interests include computer architecture.

ZHANG Chunyuan is a professor of computer science at National University of Defense Technology. His research interests include computer architecture, embedded system and parallel computing.

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

An Algorithm for Bus Trajectory Extraction Based on Incomplete Data Source∗

DAI Dameng and MU Dejun

(School of Automation, Northwestern Polytechnical University, Xi’an 710072, China)

Abstract: As the usage of transportation systems grows, some challenges remain. On the one hand, intelligent transportation systems are becoming increasingly attractive; on the other hand, researchers can hardly access real-time, dynamic traffic information. In this paper, a new approach is designed. First, real-time but incomplete public traffic data are collected via the Internet. Then, to extract bus trajectories reliably, an algorithm is proposed that consists of two parts: trajectory fragments are generated based on the strong correlation of the original data, and trajectory fragments are connected with minimum connection distance. To the best of our knowledge, these are the first results for solving these problems, and they lay a data basis for subsequent traffic forecasting. At the same time, the way the proposed algorithm eliminates abnormal data and handles massive data effectively will provide new knowledge and experience for similar research areas.

Key words: Intelligent transportation, Bus trajectory extraction, Trajectory fragments generation, Trajectory fragments connection.

I. Introduction

How to relieve congestion effectively is an important problem all over the world, and significant research has been conducted in recent years. These studies cover traffic networks, urban traffic state monitoring and traffic prediction. In terms of traffic network study, Celikoglu[1], Wang[2], Liu[3] and Sumalee[4] have studied the correlation among traffic flow, vehicle density and travel time, in which the Lighthill-Whitham-Richards (LWR) model has been the most popular and commonly cited one. Fontaine[5] worked out a method to monitor the traffic state based on wireless location technology, which derives vehicle speeds on links from series of positions acquired from cellular phones carried in the vehicles. Liu[6] developed a time-dependent arterial travel time estimation by tracing a virtual probe vehicle. Liu collected high-resolution event-based traffic data, including every vehicle actuation over loop detectors and every signal phase change from multiple intersections, to achieve high performance real-time urban traffic estimation. However, the data collecting infrastructure may not be available in other cities, which is an obstacle for other researchers. Kwong[7] gave an algorithm to match vehicle signatures from wireless magnetic sensors and adopted the technology to estimate the arterial travel time. Ou[8] studied traffic speed estimation through low-resolution positioning data and developed a data fusing method to achieve more accurate estimation. Li[9] put forward a measurement to monitor the travel time within a city road network on the basis of taxi GPS data and Bluetooth scans. Li's research shows that the coordinates obtained by GPS devices sampled once every 30s are well consistent with the data sampled at 3Hz. Padmanaban[10] used the GPS data from probe cars (15 minutes apart for every two cars) to forecast the travel time of succeeding cars in …. Within 10 days of test, the 7-day forecast error was less than 10%, which is a quite satisfactory result. Yu[11,12] predicted the bus arrival time from the driving data of different buses passing through the same road section. Meanwhile, that research also compared different methods for traffic forecasting.

Recent traffic research has made great achievements, and some corresponding transportation systems have been implemented in many cities in our country. However, most bus companies or transportation departments do not open their high-resolution real-time traffic data, for security reasons or industry secrecy, which inhibits traffic research. Most researchers are unable to obtain first-hand traffic data to confirm the experiments conducted by other researchers. Therefore, intelligent transportation research is in a “minority science” state (a small number of researchers carrying out scientific studies), which goes against the rapid development of information-driven intelligent transportation.

Fortunately, the advance of information-driven transportation networks makes transportation research feasible with Internet data. Yet many complicated factors must be taken into account. The main issues addressed in this paper are: (1) how to extract information from unreliable data, which is the primary work for traffic research via Internet data; (2) how to overcome certain data interference; (3) how to restore connections to get close to the real traffic data. This paper proposes an algorithm for extracting bus trajectories from unreliable Internet data. The extracted trajectory is used to get the travel time between adjacent stations for every bus in the studied bus lines. Our study gives a new approach to data mining of urban traffic

with the advantage of data sharing. In our study, we take the city of Suzhou in China as an example, mainly considering the maturity of its local real-time bus network construction.

∗Manuscript Received Dec. 2011; Accepted Feb. 2012.

II. Problem Statements

Suzhou is one of the most successful cities with a real-time online bus information query system. We have developed software to acquire bus arrival data from the Suzhou bus website (http://m.sz-map.com). Due to the quality of the GPS facilities, the limitations of the online real-time information system and the accessibility of the HTTP connection, there will be information loss and data errors. Fig.1 is the time-space diagram for Bus line 10 retrieved on September 2nd, 2011. The vertical axis is the index of the bus station, the horizontal axis is the arrival time, and the black spots display the collected real arrival information. The figure presents four significant features of information loss and interference.

Fig. 1. Time-space diagram of Bus line 10

(1) There are exceptional messages at the starting station and the terminal station. In Area A of Fig.1, it is possible that points A1, A2 and A3 were generated by the same bus, as the GPS started sending the bus positions to the data server before departure. In this case, redundant information was reported, which obstructs bus trajectory generation. A similar phenomenon occurs at the terminal station.

(2) Information loss lasts for a long time. As shown in Area B of Fig.1, points B1 and B2 were generated by the same bus, yet there was no arrival time information for the three stations in the B1-B2 section, nor for the four stations in the B3-B4 section. As verified, the local client successfully acquired data of other lines from the website every 30 seconds during the loss period, which excludes network access congestion as the cause; the loss could be due to limitations of the GPS equipment or of the Suzhou bus information publishing system. Missing information undermines the completeness of the data and creates difficulties for bus trajectory generation.

(3) There is information interference from lines in the opposite direction. Area C of Fig.1 shows a reverse trajectory. The reason may be that the bus did not switch the up/downstream setting for GPS data sending when it arrived at the terminal station, which caused the server to misread and falsely resolve the information received from the bus on its return trip. Especially in areas with dense data, the reverse trajectory strongly interferes with the generation of a correct trajectory.

(4) In addition to the exceptional messages at the starting and terminal stations, there are abnormal data at other stations. As shown in Area D of Fig.1, one bus station collected four exceptional data points (D1, D2, D3 and D4), whereas only one data point was correct for the bus trajectory. How to effectively eliminate such interfering messages is one of the challenges in extracting reliable trajectories.

III. Algorithms for Bus Trajectory Extraction

1. Algorithm for generating trajectory fragments with strong correlation

Denote by S (of size N×M) the arrival time record set of a specified bus line, N the total record number, M the number of stations of the line, S[i] (of size 1×M) the i'th record and S[i][j] the recorded arrival time (in minutes) of station j in the i'th record. In this paper, all data indices start at 1. We introduce link l (of size 1×M) as the trajectory fragment of a bus, with l[i] the arrival time at station i generated by the same bus, for i = 1, 2, ···, M. Denote by Rp the present record in the dealing procedure, Rl the last record, Ω = {l} the set of links for a specified bus line, and Ψ (of length M) the index vector, where Ψ[i] is the index of the link in Ω that station i is connected to in the present procedure. Denote by k the auxiliary index of links, Z̄(V) the nonzero index set of vector V, Z̄p the nonzero index set of Rp and Z̄l the nonzero index set of Rl. The initialization of link generating is depicted in Fig.2, which creates Ω and Ψ from the first record.

    Rp = S[1], k = 1, Ψ = 0;
    Z̄p = Z̄(Rp);
    for each element i in Z̄p
        Ω[k][i] = Rp[i];
        Ψ[i] = k; k = k + 1;
    end
    Rl = Rp, Z̄l = Z̄p;

Fig. 2. Initialization of link generating

The algorithm for generating trajectory fragments is depicted in Fig.3, where Xp,l is the index set which consists of the indices of the elements in Rp that differ from the value in Rl at the same position, and X̂p is the nonzero index set of Rp restricted to the elements that differ from Rl, i.e., X̂p = Z̄p ∩ Xp,l. In the same way, denote by X̂l the nonzero index set of Rl whose values differ from Rp. Denote by X̂p,l the trim set of X̂p with respect to X̂l, such that no element of X̂p,l is less than min(X̂l). We introduce “checkLink(Rp[x], Rl[y], TR(y,x))” as the discrimination function which determines whether Rp[x] and Rl[y] can be connected according to the reference travel time from station y to x, i.e., “checkLink()” judges whether Rp[x] and Rl[y] are produced by the same bus with respect to TR(y,x), where TR is the referenced travel time vector and TR(y,x) is the directed travel time from station y to station x with respect to TR. From the directed property, one gets that TR(y,x) = −TR(x,y).

… 3.2 gives further procedure to connect fragments into a complete trajectory and corrects some wrong connections caused by abnormal data.

Enlightened by the validating concept proposed in Refs.[13, 14], we introduce γ(Rp[x],Rl[y])|TR, the similarity between the travel time derived from Rl[y] and Rp[x] and the referenced travel time TR(y,x), to check the connectivity of Rp[x] and Rl[y]. γ(Rp[x],Rl[y])|TR is given as follows:

$$\gamma(R_p[x], R_l[y])|_{T_R} = \frac{R_p[x] - R_l[y]}{T_R(x, y)}.$$

When γ(Rp[x],Rl[y])|TR ∈ [γmin,γmax], “checkLink()” returns TRUE; for other cases “checkLink()” returns FALSE.
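The test performed by “checkLink()” can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the reference is taken as the directed travel time from station y to station x (as in the “checkLink()” arguments), the default bounds 0.5 and 2.0 for γmin and γmax are hypothetical placeholders because the text gives no values, and a zero reference time is treated as not connectable.

```python
def similarity(rp_x, rl_y, tr_yx):
    """gamma: observed travel time (Rp[x] - Rl[y]) divided by the reference.

    Assumption: tr_yx is the directed reference travel time from station y
    to station x, in the same unit (minutes) as the arrival times.
    """
    return (rp_x - rl_y) / tr_yx


def check_link(rp_x, rl_y, tr_yx, gamma_min=0.5, gamma_max=2.0):
    """Return True when gamma lies in [gamma_min, gamma_max].

    gamma_min and gamma_max are hypothetical defaults, not values from the
    paper. A zero reference time is rejected (an assumption of this sketch).
    """
    if tr_yx == 0:
        return False
    return gamma_min <= similarity(rp_x, rl_y, tr_yx) <= gamma_max


# Arrival times in minutes since midnight; reference travel time 10 minutes.
plausible = check_link(612, 600, 10)   # observed 12 min, gamma = 1.2
reverse = check_link(590, 600, 10)     # arrival goes back in time, gamma = -1.0
too_slow = check_link(660, 600, 10)    # observed 60 min, gamma = 6.0
```

With these placeholder bounds, a record that moves backwards in time (as in the reverse trajectory of Area C) yields a negative ratio and is rejected, which is one reason a ratio against a directed reference is convenient here.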

    for r from 2 to length(S)
        Rp = S[r], get Xp,l, Z̄p;
        X̂p ← Z̄p ∩ Xp,l, X̂l ← Z̄l ∩ Xp,l;
        X̂p,l = trim(X̂p, X̂l) = {x | x ∈ X̂p, x ≥ min(X̂l)};
        for each x in X̂p,l, each y in X̂l, and y …
            LinkIndex = Ψ[y];
            CloneLinkMap[y] = 0;
        end
        …

… D(Lx, Ly|TR) is given by: D(Lx, Ly|TR) = D1 if C(Ly) > C(Lx); D2 if C(Ly) …
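The initialization of Fig.2 and the index sets used at the top of the generation loop can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, a value of 0 is assumed to mean "no arrival time recorded", stations are visited in ascending order, and the trim set is taken to be empty when X̂l is empty (the text does not define that case).

```python
def nonzero_index_set(v):
    """Z-bar(V): 1-based indices of the nonzero entries of record v."""
    return {i for i, t in enumerate(v, start=1) if t != 0}


def init_links(first_record):
    """Fig.2: build the link set Omega and index vector Psi from S[1].

    Each station with a recorded arrival time starts its own link.
    """
    m = len(first_record)
    omega = {}        # link index k -> link (M arrival times, 0 = unknown)
    psi = [0] * m     # psi[i - 1] = index of the link station i belongs to
    k = 1
    for i in sorted(nonzero_index_set(first_record)):
        link = [0] * m
        link[i - 1] = first_record[i - 1]
        omega[k] = link
        psi[i - 1] = k
        k += 1
    return omega, psi


def changed_index_sets(rp, rl):
    """Return (X-hat_p, X-hat_l, X-hat_{p,l}) for present record rp and last record rl."""
    x_pl = {i for i, (a, b) in enumerate(zip(rp, rl), start=1) if a != b}
    x_p = nonzero_index_set(rp) & x_pl
    x_l = nonzero_index_set(rl) & x_pl
    # Assumption: with no changed nonzero entry in rl there is nothing to trim against.
    x_p_trim = {x for x in x_p if x >= min(x_l)} if x_l else set()
    return x_p, x_l, x_p_trim


omega, psi = init_links([600, 0, 605, 0, 0])
x_p, x_l, x_p_trim = changed_index_sets([600, 603, 0, 609, 0], [600, 0, 605, 0, 0])
```

In the usage lines, the first record creates two links (stations 1 and 3). Comparing the next record with it, station 2 is dropped by the trim because it lies upstream of the only changed station of the last record, so only station 4 remains a candidate for “checkLink()”.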

Definition 4 Denote by Su ≜ Su(Ω,Iu,Nu) the upper link set, which consists of the links with at least Nu stations in the upper station index set Iu. Su(Ω,Iu,Nu) is given by:

$$S_u = \{\, l \mid \mathrm{length}(\bar{Z}(l) \cap I_u) \ge N_u,\ l \in \Omega \,\}.$$

In Definition 4, the upper bus stations refer to the stops which are near the starting station of the bus. Similarly, the downstream bus stations refer to the stops near the terminal station of the bus. In the same way, denote by Sd ≜ Sd(Ω,Id,Nd) the downstream link set generated from Ω with at least Nd stations in the downstream set Id. Sd(Ω,Id,Nd) is given by:

$$S_d = \{\, l \mid \mathrm{length}(\bar{Z}(l) \cap I_d) \ge N_d,\ l \in \Omega \,\}.$$

Definition 5 Denote by D(Lx,Ly|TR,λ) the connecting distance for upper link Lx and downstream link Ly with respect to TR and λ. D(Lx,Ly|TR,λ) is given by:

$$D(L_x, L_y | T_R, \lambda) = \begin{cases} D(L_x, L_y | \hat{T}_R) + \dfrac{\lambda}{L_{\min}}, & \text{case 1} \\ +\infty, & \text{other cases} \end{cases}$$

where:

$$\hat{T}_R = \mathrm{update}(L_x, L_y, T_R), \qquad L_{\min} = \min\big(\mathrm{length}(\bar{Z}(L_x)),\ \mathrm{length}(\bar{Z}(L_y))\big)$$

and case 1 satisfies:
(1) Z̄(Lx) ≠ Ø, Z̄(Ly) ≠ Ø;
(2) max(Z̄(Lx)) − min(Z̄(Ly)) ≤ Ne.

In Definition 5, Ne is the maximum span of the overlapped stations in the upper link and downstream link allowed for fault tolerance. We recommend Ne = 2 in most cases. λ is introduced to control the influence of link completeness on the connecting distance. The weight of completeness increases with the growth of λ. The empirical value λ = 15 is adopted in real systems. T̂R is updated by the following procedure:
(1) Initialize T̂R by T̂R = TR.
(2) Take L̂x = Lx/{Lx ∩ Ly}, then interpolate the missing arrival times in L̂x by TR. Update the first part of T̂R, with station index prior to min(Z̄(Ly)), by calculating the travel time of adjacent stations in L̂x.
(3) Take L̂y = Ly/{Lx ∩ Ly}, then interpolate the missing arrival times in L̂y by TR. Update the final part of T̂R, with station index after max(Z̄(Lx)), by calculating the travel time of adjacent stations in L̂y.

Definition 6 Denote by M(Su,Sd|TR,λ) the connecting distance matrix of upper link set Su and downstream link set Sd with respect to TR and λ. M(Su,Sd|TR,λ) is given by:

$$m_{ij} = D(L_x, L_y | T_R, \lambda), \quad L_x = S_u[i],\ L_y = S_d[j],$$

where mij = M[i][j].

Definition 7 Lu,Ld | min M(Su,Sd|TR,λ) is the linked pair of Su and Sd with the minimum connecting distance. Lu,Ld is searched by:

find link pair Lu,Ld
s.t. D(Lu,Ld|TR,λ) = min{mij}, mij ≜ M[i][j], Lu = Su[i], Ld = Sd[j].

Link connecting rule Given upper link Lx ∈ Su and downstream link Ly ∈ Sd, if Lx and Ly can be connected, they should satisfy the following conditions:
(1) The maximum span of overlapped stations should not be greater than Ne;
(2) The similarity of Lx[a] and Ly[b] with respect to TR should be in the specified span [γmin,γmax], i.e., γ(Lx[a],Ly[b])|TR ∈ [γmin,γmax], where a is the maximum nonzero index of Lx which does not overlap with Ly, and b is the minimum nonzero index of Ly which does not overlap with Lx.

The procedure for dealing with connected links which have overlapped stations is described in the following four steps.
(1) Generate the connected link L̃ by connecting the non-overlapped parts of the link pair and update TR with L̃.
(2) Generate new link fragments L̃x and L̃y with the arrival records extracted from the overlapped parts of Lx and Ly respectively.
(3) Calculate the undirected distances D(L̃x, L̃|TR) and D(L̃y, L̃|TR), and choose the link from {L̃x, L̃y} with the shorter distance for connecting.
(4) Output L̃ as the connected link.

Algorithm for link connection The algorithm is split into the following 11 steps.
(1) Pick the upper link set parameters Iu, Nu, then initialize Id, Nd, λ, Ne, Ω̄ = Ø, and finally set the auxiliary count n = 0 and choose TR from historical data.
(2) Get Su = Su(Ω,Iu,Nu) from the link fragment set Ω.
(3) Get Sd = Sd(Ω,Id,Nd) and Ŝd. Ŝd is the link set from Sd which excludes the elements in Su, i.e., Ŝd = Sd/{Sd ∩ Su}.
(4) Continue to (5) if Ŝd = Ø; otherwise, jump to (6).
(5) Recover Ω = Ω ∪ Ω̄ and set Ω̄ = Ø, increase every element in Id by 1, jump to (10) if min(Id) is greater than the total number of stations M, else go back to (3).
(6) Get Lu,Ld | min M(Su, Ŝd|TR,λ) and set n = n + 1.
(7) Check whether Lu,Ld can be connected by the Link connecting rule.
(8) If Lu,Ld can be connected, clear n = 0, replace Lu by the connected link in Su and Ω, set Ŝd = Ø, delete Ld from Ω and go back to (2); otherwise, go back to (5).
(9) If max(Id) is greater than the maximum station index M and n ≥ 10, continue to (10); else if n < 10, add Lu to set Ω̄, delete Lu in Ω and go back to (2); for the other cases go back to (5).
(10) Recover Ω = Ω ∪ Ω̄ and post-process Ω by the Link correction rule.
(11) Output Ω as the connected trajectory set.

Link correction rule The correction rule is set to deal with the following three cases.
(1) Split the link into 2 links if the station index distance of two adjacent nonzero data is greater than 10 in the link;
(2) Anti-cross adjacent links if they exhibit the crossover phenomenon.
(3) Connect the unconnected stations (most are starting stations) to Ω by the Link connecting rule.

Fig.6 shows the connected bus trajectories from the link fragments in Fig.5. As demonstrated in the figure, most of the links are reasonably connected. We can get abundant information, such as the travel time of adjacent stations and departure intervals, from the bus trajectories. Hence, reliable extraction of bus trajectories is important work for traffic estimation and for further study of bus arrival time prediction via Internet data.

Fig. 6. The connected bus trajectories from link fragments

IV. Conclusions

In this paper, we have taken into account some interference in the recorded data. To overcome certain data interference, a trajectory fragment generating algorithm with high anti-interference capability, based on the original records, has been provided. At the same time, based on link fragments, a trajectory connection algorithm with minimum connecting distance has been presented for trajectory extraction. Our proposed schemes can provide basic data for subsequent studies, such as urban traffic estimation and bus arrival forecasting. What's more, our proposed methods can be extended smoothly to subsequent research, whereas most researchers are currently hindered by not being able to acquire actual bus data in real time. The above work will provide new thinking for similar research areas, so as to enhance related academic communication.

References

[1] Hilmi Berk Celikoglu, “Travel time measure specification by functional approximation: application of radial basis function neural networks”, Procedia - Social and Behavioral Sciences, Vol.20, pp.613–620, 2011.
[2] Yibing Wang, Markos Papageorgiou, Albert Messmer et al., “An adaptive freeway traffic state estimator”, Automatica, Vol.45, No.1, pp.10–24, 2009.
[3] Ronghui Liu, Tony May, Simon Shepherd, “On the fundamental diagram and supply curves for congested urban networks”, Procedia - Social and Behavioral Sciences, Vol.17, pp.229–246, 2011.
[4] A. Sumalee, R.X. Zhong, T.L. Pan et al., “Stochastic cell transmission model (SCTM): a stochastic dynamic traffic model for traffic state surveillance”, Transportation Research Part B: Methodological, Vol.45, No.3, pp.507–533, 2011.
[5] Michael D. Fontaine, Brian L. Smith, “Investigation of the performance of wireless location technology-based traffic monitor system”, Journal of Transportation Engineering, Vol.133, No.3, pp.157–165, 2007.
[6] Henry X. Liu, Wenteng Ma, “A virtual vehicle probe model for time-dependent travel time estimation on signalized arterials”, Transportation Research Part C: Emerging Technologies, Vol.17, No.1, pp.11–26, 2009.
[7] Karric Kwong, Robert Kavaler, Ram Rajagopal et al., “Arterial travel time estimation based on vehicle re-identification using wireless magnetic sensors”, Transportation Research Part C: Emerging Technologies, Vol.17, No.6, pp.586–606, 2009.
[8] Qing Ou, R.L. Bertini, J.W.C. van Lint et al., “A theoretical framework for traffic speed estimation by fusing low-resolution probe vehicle data”, IEEE Transactions on Intelligent Transportation Systems, Vol.12, No.3, pp.747–756, 2011.
[9] Jie Li, Henk van Zuylen, Chunhua Liu et al., “Monitoring travel times in an urban network using video, GPS and Bluetooth”, Procedia - Social and Behavioral Sciences, Vol.20, pp.630–637, 2011.
[10] R.P.S. Padmanaban, K. Divakar, L. Vanajakshi et al., “Development of a real-time bus arrival prediction system for Indian traffic conditions”, IET Intelligent Transport Systems, Vol.4, No.3, pp.189–200, 2010.
[11] Bin Yu, William H.K. Lam, Mei Lam Tam, “Bus arrival time prediction at bus stop with multiple routes”, Transportation Research Part C: Emerging Technologies, Vol.19, No.6, pp.1157–1170, 2011.
[12] Bin Yu, Zhong Zhen Yang, Jing Wang, “Bus travel-time prediction based on bus speed”, Proceedings of the Institution of Civil Engineers: Transport, Vol.163, No.1, pp.3–7, 2010.
[13] Bing Zhang, Huanzhang Lu, “The detection algorithm for moving point target trajectory in image sequences”, Acta Electronica Sinica, Vol.32, No.9, pp.1524–1256, 2004. (in Chinese)
[14] Zhipei Huang, Shuyan Sun, Jiankang Wu et al., “Dense multi-target adaptive video tracking based on signature”, Acta Electronica Sinica, Vol.39, No.3A, pp.37–42, 2011. (in Chinese)

DAI Dameng was born in 1975. She is currently a Ph.D. student in School of Automation of Northwestern Polytechnical University. She is also an associate professor in College of Physics and Electronic Information Engineering of Wenzhou University. Her research interests include intelligent information processing and data mining. (Email: jsj [email protected])

MU Dejun was born in 1963. He is currently a professor and Ph.D. director in School of Automation of Northwestern Polytechnical University. His main research interests include control theory and applications, intelligent information processing and network security. (Email: [email protected])

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

Controllability of Multi-agent Systems with Multiple Leaders and Switching Topologies∗

LUO Xiaoyuan1, LIU Dan1, ZHANG Fan2 and GUAN Xinping3

(1. Institute of Electrical Engineering, Yanshan University, Qinhuangdao 066004, China)
(2. 723 Station, State Administration of Radio, Film and Television, Shijiazhuang 050000, China)
(3. School of Electronic Information and Electrical Engineering, Shanghai Jiaotong University, Shanghai 200240, China)

Abstract: This paper studies the controllability of multi-agent systems with multiple leaders and switching topologies. We introduce the concept of union graph and show that the multi-agent system with switching topologies is controllable if the corresponding system with union graph is controllable. Moreover, we obtain a necessary and sufficient condition for structural controllability of multi-agent systems with multiple leaders and switching topologies. From these conclusions with respect to the union graph, some sufficient conditions for controllability of the switching multi-agent systems are derived from the controllability properties of multi-agent systems with fixed topology. Three computational tests and an application example are given to demonstrate the effectiveness and the practicability of the results.

Key words: Multi-agent system, Switching topology, Structural controllability, Union graph.

I. Introduction

Over the past few years, scholars have paid great attention to the distributed cooperative control of multiple autonomous agents[1]. This is partly due to its broad applications in cooperative control of Unmanned air vehicles (UAVs), Autonomous underwater vehicles (AUVs), space exploration, congestion control in communication networks, air traffic control and so on.

Because of the high complexity of such large-scale systems, a distinct research area at the intersection of systems theory and graph theory has emerged. One basic issue is the controllability of multi-agent systems. In Ref.[2], Tanner proposed the controllability issue in the leader-follower multi-agent framework and provided some necessary and sufficient conditions in terms of eigenvectors of the graph Laplacian. Under the same setup, a sufficient condition was derived in Ref.[3] and some graph-theoretic characterizations of controllability were examined by Rahmani and Mesbahi[4]. In Ref.[5], Ji and Egerstedt showed another necessary and sufficient condition based on eigenvalues of the Laplacian matrix of the whole topology and the follower set. Subsequently, the concept of network equitable partition was introduced to reveal the controllability characterizations of multi-agent systems in the multi-leader setting[6]. Moreover, in Ref.[7], the authors investigated the controllability problem based on a new notion called leader-follower connectedness. With respect to multi-agent systems with switching topologies, some sufficient conditions for the interconnected systems in switching networks are presented by using neighbor rules[8,9]. Recently, a new definition of structural controllability was introduced to investigate the structural controllability of multi-agent systems[10,11].

In contrast with the existing literature, we investigate the controllability of multi-agent systems with multiple leaders and switching topologies. Assuming that the weights can be freely assigned, we propose a sufficient and necessary condition for the structural controllability of the multi-agent systems. Some sufficient conditions for controllability of the switching multi-agent system are also derived based on the properties of the corresponding system with union graph. Some numerical examples are given to underscore the conclusions.

II. Preliminaries

1. Graph preliminaries

The weighted graph $G$ with $N$ vertices consists of a vertex set $v = \{v_1, v_2, \cdots, v_N\}$ and an edge set $\varepsilon = \{e_1, e_2, \cdots, e_N\}$ that represents the interconnection links among the vertices. $v_i$ and $v_j$, $i, j \in \{1, \cdots, N\}$, are neighbors if $(v_i, v_j) \in \varepsilon$. If there is a path between any two vertices of the graph, then the graph is connected. For two graphs $G = (v, \varepsilon)$ and $G' = (v', \varepsilon')$, we call $G'$ a subgraph of $G$, denoted by $G' \subseteq G$, if $v' \subseteq v$ and $\varepsilon' \subseteq \varepsilon$. A subgraph $G'$ is said to be induced from $G$ if it is obtained by deleting a subset of vertices and all the edges connecting to those vertices. An induced subgraph of a graph, which is maximal and connected, is said to be a connected component of the graph. The adjacency matrix of graph $G$, $A(G)$, is defined as $A = [w_{ij}]$, $w_{ij} \neq 0 \Leftrightarrow (i, j) \in \varepsilon$, where $w_{ij} > 0$ stands

for the weight of edge $(v_i, v_j)$. The Laplacian $L(G)$ of a graph $G$ plays an important role in many graph-theoretic treatments of multi-agent systems.

∗Manuscript Received May 2011; Accepted Feb. 2012. This work is supported by the National Basic Research Program of China (973 Program) (No.2010CB731800), the National Natural Science Foundation of China (No.60934003, No.60974018 and No.61074065), Natural Science Foundation of Hebei Province (No.F2012203119), Key Project for Natural Science Research of Hebei Education Department (No.ZD200908).

2. Problem formulation

The multi-agent system is given by

$$\dot{x}_i = u_i, \quad i = 1, \cdots, N$$
$$\dot{x}_{N+j} = u_{N+j}, \quad j = 1, \cdots, n_l \qquad (1)$$

where $N$ and $n_l$ represent respectively the numbers of followers and leaders, and $x_i$ indicates the state of the $i$th agent, $i = 1, \cdots, N + n_l$. The interconnection graph, $G = (v, \varepsilon)$, is defined as an undirected graph consisting of a set of vertices $v = \{v_1, \cdots, v_N, v_{N+1}, \cdots, v_{N+n_l}\}$ indexed by the agents and a set of edges $\varepsilon = \{(v_i, v_j) \in v \times v \mid v_i \sim v_j\}$ that join interconnected agents. Let $N_i$ be the neighboring set of $v_i$, $i = 1, \cdots, N$, i.e., $N_i = \{j \mid (v_i, v_j) \in \varepsilon, v_j \in v\}$. We give the control protocol as

$$u_i = -\sum_{j \in N_i} w_{ij}(x_i - x_j) \qquad (2)$$

where $w_{ij} \neq 0$ stands for the weight of edge $(v_i, v_j)$.

With $x = (x_1, \cdots, x_{N+n_l})$ being the stack vector of all agent states, we have $\dot{x} = -Lx$, where $L$ is the Laplacian matrix of the interconnection graph. Renaming the agents as $y_i \triangleq x_i$, $z_j \triangleq x_{N+j}$, the system can be rewritten in the form

$$\begin{bmatrix} \dot{y} \\ \dot{z} \end{bmatrix} = -\begin{bmatrix} F & R \\ 0 & 0 \end{bmatrix} \begin{bmatrix} y \\ z \end{bmatrix} + \begin{bmatrix} 0 \\ u \end{bmatrix} \qquad (3)$$

where $y$, $z$ and $u$ are the stack vectors of all $y_i$, $z_j$ and $u_{N+j}$ respectively, $F$ is the matrix obtained from the Laplacian matrix $L$ of $G$ after deleting the last $n_l$ rows and $n_l$ columns, and $R$ is the $N \times n_l$ submatrix consisting of the first $N$ rows of the deleted columns. Since the interconnection topology among agents is time-varying, dynamics (1) can reasonably be viewed as a system with switching topologies, which can be written in the form

$$\dot{y} = -F_{\sigma(t)} y - R_{\sigma(t)} z \qquad (4)$$

where $\sigma(t) \in \{1, 2, \cdots, m\}$ is the switching sequence and $\{G_1, \cdots, G_m\}$ is the switching topology set of system (4). The matrix $F_{\sigma(t)}$ reflects the interconnections among followers, and the column vectors of $R_{\sigma(t)}$ represent the relations between followers and leaders of the corresponding subsystems. We can then investigate the controllability of multi-agent system (1) with switching topology through the controllability of the switching system (4).

3. Union graph and controllability matrix

Let $C$ and $V$ denote the controllable state set and reachable state set of system (4). The relationship between controllability and reachability of this system is given by Lemma 1∗.

Lemma 1[12] For switched system (4), $C = V$.

To study the controllability of system (4), we introduce a new graph called the union graph.

Definition 1 (Union graph) Consider a switching linear system $(-F_{\sigma(t)}, -R_{\sigma(t)})$, where $\{G_1, \cdots, G_m\}$ is the switching topology set and $\sigma(t) \in \{1, 2, \cdots, m\}$ is the switching sequence. The union graph of the switching topologies is defined as $G_u = G_1 \cup G_2 \cup \cdots \cup G_m = \{v; \varepsilon_1 \cup \varepsilon_2 \cup \cdots \cup \varepsilon_m\}$∗∗, where $\varepsilon_i$, $i \in \{1, 2, \cdots, m\}$, are the edge sets of the switching topologies.

Actually, the union graph $G_u$ stands for the system $(-(F_1 + F_2 + \cdots + F_m), -(R_1 + R_2 + \cdots + R_m))$ with fixed topology $G_u$.

Some algebraic criteria for controllability based on the controllability matrix are given below.

Lemma 2[13] If the matrix

$$[-(R_1 + R_2 + \cdots + R_m),\ (F_1 + F_2 + \cdots + F_m)(R_1 + R_2 + \cdots + R_m),\ \cdots,\ (-1)^N (F_1 + F_2 + \cdots + F_m)^{N-1}(R_1 + R_2 + \cdots + R_m)] \qquad (5)$$

is full row rank, the system $(-(F_1 + F_2 + \cdots + F_m), -(R_1 + R_2 + \cdots + R_m))$ with fixed topology is controllable, and vice versa. This matrix is called the controllability matrix of the system $(-(F_1 + F_2 + \cdots + F_m), -(R_1 + R_2 + \cdots + R_m))$.

Lemma 3[14] If the matrix

$$[-R_1, -R_2, \cdots, -R_m,\ F_1R_1, F_2R_1, \cdots, F_mR_m,\ \cdots,\ (-1)^N F_1^{N-1}R_1,\ (-1)^N F_2F_1^{N-2}R_1,\ \cdots,\ (-1)^N F_1F_m^{N-2}R_1,\ \cdots,\ (-1)^N F_m^{N-1}R_m] \qquad (6)$$

is full row rank, the switching system (4) is controllable, and vice versa. This matrix is called the controllability matrix of the corresponding switching linear system (4).

In the following discussion, the definitions of union graph and controllability matrix are employed to derive some controllability properties of multi-agent systems.

III. Main Results

According to Definition 1, we can formulate the multi-agent system under fixed topology $G_u$ as follows.

$$\dot{y} = -(F_1 + F_2 + \cdots + F_m)y - (R_1 + R_2 + \cdots + R_m)z \qquad (7)$$

We next show the relationship between the controllability of systems (4) and (7).

Theorem 1 The multi-agent system (4) with switching topologies is controllable if the corresponding linear system (7) with union graph is controllable.

Proof Assume that the linear system (7) is controllable. According to Lemma 2, the controllability matrix (5) is full row rank. Expanding the matrix, it follows that the matrix

$$[-(R_1 + R_2 + \cdots + R_m),\ F_1R_1 + F_2R_1 + \cdots + F_mR_1 + F_1R_2 + F_2R_2 + \cdots + F_mR_2 + \cdots + F_1R_m + F_2R_m + \cdots + F_mR_m,\ \cdots,\ (-1)^N(F_1^{N-1}R_1 + F_2F_1^{N-2}R_1 + \cdots + F_m^{N-1}R_m)]$$

is full row rank. We add some column vectors and get

$$[-(R_1 + R_2 + \cdots + R_m),\ -R_2, -R_3, \cdots, -R_m,\ (-1)^N(F_1^{N-1}R_1 + F_2F_1^{N-2}R_1 + \cdots + F_m^{N-1}R_m),\ (-1)^N F_2F_1^{N-2}R_1,\ \cdots,\ (-1)^N F_m^{N-1}R_m]$$

This matrix still has $N$ linearly independent columns, so it is full row rank. Next, we subtract $-R_2, -R_3, \cdots, -R_m$ from $-(R_1 + R_2 + R_3 + \cdots + R_m)$, subtract $F_2R_1, F_3R_1, \cdots, F_mR_m$ from $F_1R_1 + F_2R_1 +$

∗ The detailed definitions of controllability and reachability can be found in Ref.[12].
∗∗ The symbol ∪ stands for the mathematical operation of set union.

$\cdots + F_mR_1 + \cdots + F_1R_m + \cdots + F_mR_m$, and subtract $(-1)^N F_2F_1^{N-2}R_1, \cdots, (-1)^N F_1F_m^{N-2}R_1, \cdots, (-1)^N F_m^{N-1}R_m$ from $(-1)^N(F_1^{N-1}R_1 + F_2F_1^{N-2}R_1 + \cdots + F_m^{N-1}R_m)$. Because these elementary column transformations do not change the matrix rank, the matrix is still full row rank. The matrix is now the same as Eq.(6), which is the controllability matrix for switching linear systems as shown in Lemma 3, and it is full row rank. Therefore, the switching multi-agent system (4) is controllable.

Based on the above conclusion, we further study the relationship between interconnection topologies and controllability of the switching multi-agent system. The controllability problem of multi-agent systems can now be formulated as the structural controllability problem of the switching linear system (4).

Definition 2[11] The linear system (4), whose matrix elements are zeros or undetermined parameters, is said to be structurally controllable if and only if there exists a set of weights $w_{ij}$ that makes the system (4) controllable in the classical sense.

For the structural controllability of multi-agent systems, we need the following definition.

Definition 3[11] The matrix pair $(F, R)$ is said to be reducible if it can be written in the following form

$$F = \begin{bmatrix} F_{11} & 0 \\ F_{21} & F_{22} \end{bmatrix}, \quad R = \begin{bmatrix} 0 \\ R_{22} \end{bmatrix}$$

where $F_{11} \in \mathbb{R}^{p \times p}$, $F_{21} \in \mathbb{R}^{(N-p) \times p}$, $F_{22} \in \mathbb{R}^{(N-p) \times (N-p)}$ and $R_{22} \in \mathbb{R}^{(N-p) \times m}$.

Whenever the matrix pair $(F, R)$ is reducible, the system is structurally uncontrollable, and the controllability matrix $Q = [R, FR, \cdots, F^{N-1}R]$ has at least one row whose elements are all identically zero.

Lemma 4[11] The multi-agent system with a single leader and switching topologies $G_{\sigma(t)}$, $\sigma(t) \in \{1, 2, \cdots, m\}$, is structurally controllable if and only if the union graph $G_u$ is connected.

We denote by $G_f$ and $G_l$ the follower and leader subgraphs of $G_u$, induced respectively by the follower and leader node sets. Let $G_{c_1}, \cdots, G_{c_r}$ stand for the $r$ connected components in the follower subgraph $G_f$.

Definition 4[7] (Leader-follower connected topology) An interconnection graph $G$ is said to be leader-follower connected if for every connected component $G_{c_i}$ of $G_f$, there exists a leader in the leader subgraph $G_l$ such that there is an edge between this leader and some agent in $G_{c_i}$, $i = 1, \cdots, r$.

Theorem 2 gives a graphic interpretation of the relationship between structural controllability and switching topologies.

Theorem 2 The multi-agent system (4) with multiple leaders and switching topologies $G_{\sigma(t)}$, $\sigma(t) \in \{1, 2, \cdots, m\}$, is structurally controllable if and only if the union graph $G_u$ is leader-follower connected.

Proof (Necessity) We need to prove that the multi-agent system is structurally uncontrollable if the union graph $G_u$ is not leader-follower connected. For simplicity, we assume that there exists only one disconnected portion. The proof can be straightforwardly extended to more general cases with more than one isolated follower or isolated connected component. By Definition 3, the controllability matrix Eq.(5) has at least one row whose elements are all identically zero. Expanding the matrix yields

$$[-(R_1 + R_2 + \cdots + R_m),\ F_1R_1 + F_2R_1 + \cdots + F_mR_1 + F_1R_2 + F_2R_2 + \cdots + F_mR_2 + \cdots + F_1R_m + F_2R_m + \cdots + F_mR_m,\ \cdots,\ (-1)^N(F_1^{N-1}R_1 + F_2F_1^{N-2}R_1 + \cdots + F_m^{N-1}R_m)]$$

The zero row is identically zero for every summand. Consequently, every summand of the matrix, such as $R_i$, $F_iR_j$ and $F_i^pF_j^qR_r$, $i, j, r = 1, \cdots, m$, $p, q = 1, \cdots, N-1$, has the same zero row. As a result, the controllability matrix Eq.(6) always has one zero row. Therefore, the multi-agent system (4) is not controllable and is structurally uncontrollable.

(Sufficiency) Assume that the union graph $G_u$ is leader-follower connected, which consists of two cases. If the union graph $G_u$ is connected, the corresponding system $(-(F_1 + F_2 + \cdots + F_m), -(R_1 + R_2 + \cdots + R_m))$ is structurally controllable from Lemma 3. According to Theorem 1, the multi-agent system (4) is structurally controllable.

On the other hand, suppose the union graph $G_u$ has $\delta$ connected components. It can be verified from Lemma 4 that each switching subsystem corresponding to the related connected subgraph is structurally controllable. Therefore, the multi-agent system (4) is structurally controllable.

According to Theorem 2, keeping the union graph leader-follower connected ensures the structural controllability of a group of agents with multiple leaders under switching topologies. Under the assumption of leader-follower connectedness, we next extend some controllability properties of multi-agent systems with fixed topology to the switching multi-agent system to obtain some sufficient conditions for controllability.

The following results concern the controllability of the system (7) with multiple leaders.

Lemma 5 Consider the system described by (7) that corresponds to a leader-follower connected interconnection graph with Laplacian $L$. Then the system $(-(F_1 + F_2 + \cdots + F_m), -(R_1 + R_2 + \cdots + R_m))$ is controllable if and only if one of the following conditions is satisfied.

(1) The eigenvalues of $-(F_1 + F_2 + \cdots + F_m)$ are all distinct and the eigenvectors of $-(F_1 + F_2 + \cdots + F_m)$ are not orthogonal to $-(R_1 + R_2 + \cdots + R_m)$.

(2) $-L$ and $-(F_1 + F_2 + \cdots + F_m)$ do not share any common eigenvalue.

Proof The conclusion is a direct consequence of Theorem IV.1 in Ref.[2] and Lemma 2.2 in Ref.[5].

From Theorem 1, these controllability properties of multi-agent systems with fixed topology can be recast as sufficient conditions for the controllability of multi-agent systems with multiple leaders and switching topologies.

Corollary 1 Consider the system described by (4) with multiple leaders and switching topologies, of which the union graph is a leader-follower connected interconnection graph with Laplacian $L$. The system corresponding to the union graph is described by system (7). Then the system

$(-F_{\sigma(t)}, -R_{\sigma(t)})$ is controllable if the following conditions are satisfied.

(1) The eigenvalues of $-(F_1 + F_2 + \cdots + F_m)$ are all distinct and the eigenvectors of $-(F_1 + F_2 + \cdots + F_m)$ are not orthogonal to $-(R_1 + R_2 + \cdots + R_m)$.

(2) $-L$ and $-(F_1 + F_2 + \cdots + F_m)$ do not share any common eigenvalue.

Fig. 3. Switching topologies and the leader-follower connected union graph
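The rank condition of Lemma 2 and the leader-follower connectedness of Definition 4 can be checked numerically. The following is a minimal sketch under stated assumptions, not the authors' implementation: the helper names (`controllability_matrix`, `follower_components`, `leader_follower_connected`) are hypothetical, followers are taken to be adjacent when the corresponding off-diagonal entry of $F$ is nonzero, and the matrices are those of Example 3 in Section IV as read from the text.

```python
import numpy as np


def controllability_matrix(A, B):
    """Return the Kalman matrix [B, AB, ..., A^(N-1) B] for x' = A x + B u."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)


def follower_components(F):
    """Connected components of the follower subgraph.

    Followers i and j are treated as adjacent when F[i, j] != 0 (i != j).
    """
    n = F.shape[0]
    seen, components = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:
            i = stack.pop()
            comp.append(i)
            for j in range(n):
                if j != i and F[i, j] != 0 and j not in seen:
                    seen.add(j)
                    stack.append(j)
        components.append(sorted(comp))
    return components


def leader_follower_connected(F, R):
    """Definition 4: every follower component has an edge to some leader."""
    return all(bool(np.any(R[comp, :] != 0)) for comp in follower_components(F))


# Matrices of Example 3 as read from the text (3 followers, 3 leaders).
F1 = np.array([[1.0, 0, 0], [0, 2, 0], [0, 0, 0]])
R1 = np.array([[-1.0, 0, 0], [0, 0, -2], [0, 0, 0]])
F2 = np.array([[0.0, 0, 0], [0, 1, -1], [0, -1, 2]])
R2 = np.array([[0.0, 0, 0], [0, 0, 0], [0, -1, 0]])

# Union-graph system (7): sum the follower and leader blocks.
F_union, R_union = F1 + F2, R1 + R2

Q = controllability_matrix(-F_union, -R_union)
rank_Q = np.linalg.matrix_rank(Q)          # full row rank means rank == 3
eigs = np.linalg.eigvalsh(-F_union)        # ascending; F_union is symmetric
connected = leader_follower_connected(F_union, R_union)
print(rank_Q, np.round(eigs, 4), connected)
```

With full row rank for the union-graph system, Theorem 1 then gives controllability of the switching system; the sketch only tests the particular weights supplied and does not by itself establish structural controllability.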

IV. Numerical Examples

1. Computational tests

To illustrate the main results, we consider some switching topology sequences and check the conditions for controllability of the corresponding multi-agent systems. We first give a switching topology sequence whose weights can be set freely, to show the structural controllability of multi-agent systems with multiple leaders and switching topologies.

Example 1 A six-agent network with agents 1, 2, 3, 4 as the followers and 5, 6 as the leaders is considered, and the switching topologies are described by the graphs in Fig.1(a)–(b).

Fig. 1. Switching topologies and the leader-follower connected union graph

From Fig.1(a)–(b), we can get the switching system matrices of the subgraphs and the controllability matrix. If we assign the scalar 1 to all the weights, we can easily get four linearly independent column vectors

$$\begin{bmatrix} -1 \\ 0 \\ 0 \\ -1 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 0 \\ -1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 0 \\ 0 \\ -1 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ -1 \\ 0 \\ 0 \end{bmatrix}$$

As a result, the matrix has full row rank and it follows that the multi-agent system is structurally controllable.

Next, we overlay the switching topologies to get the union graph $G_u$ shown in Fig.1(c). It turns out that the union graph of the switching system is leader-follower connected. By Theorem 2, the system is structurally controllable. This example is consistent with Theorem 2.

Example 3 A six-agent network with agents 1, 2, 3 as the followers and 4, 5, 6 as the leaders is considered, and the different switching topologies are described by the graphs in Fig.3(a)–(b).

The system switching matrices of the subgraphs are

$$F_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad R_1 = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 0 & -2 \\ 0 & 0 & 0 \end{bmatrix},$$

$$F_2 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & -1 \\ 0 & -1 & 2 \end{bmatrix}, \quad R_2 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & -1 & 0 \end{bmatrix}$$

Overlaying the switching topologies, we find that the union graph shown in Fig.3(c) is leader-follower connected. Then we check the sufficient conditions for the system $(-(F_1 + F_2), -(R_1 + R_2))$ corresponding to the union graph.

(1) The eigenvalues of $-(F_1 + F_2)$ are $-3.618$, $-1.3820$ and $-1.0000$, which are distinct from each other. Moreover, the eigenvectors of $-(F_1 + F_2)$ are not orthogonal to $-(R_1 + R_2)$.

(2) The eigenvalues of $-L$ are $-4.8136$, $-2.5293$, $-2.0000$, $0.0000$, $0.0000$. Clearly, these eigenvalues are all distinct from those of $-(F_1 + F_2)$.

From Corollary 1, the multi-agent system with multiple leaders and switching topologies is controllable.

Next, we check the rank condition for the controllability matrix of the switching multi-agent system. The controllability matrix is full row rank, so the multi-agent system is controllable. This also illustrates Corollary 1.

2. Application example

Consider a multi-agent system with Markovian communication failure and communication recovery, which may be caused by limited communication capacity. The occurrence of the failures is modeled by a discrete-time Markov chain, and the communication status process takes values in a finite set $W = \{0, 1\}$ with transition probability matrix

$$\Pi_{ij} = \begin{bmatrix} 1 - \beta_{ij} & \beta_{ij} \\ \alpha_{ij} & 1 - \alpha_{ij} \end{bmatrix}$$

where $0 \le \Pr\{\gamma_{ij}(k+1) = 0 \mid \gamma_{ij}(k) = 1\} = \alpha_{ij} \le 1$ and $0 \le \Pr\{\gamma_{ij}(k+1) = 1 \mid \gamma_{ij}(k) = 0\} = \beta_{ij} \le 1$ are called the failure probability and the recovery probability, respectively. $\gamma_{ij}(k)$ denotes the communication status between agent $i$ and agent $j$ at time $k$. To simplify the expression, $\alpha \triangleq (\alpha_{ij})$ and $\beta \triangleq (\beta_{ij})$ are used to denote the failure probability matrix and the recovery probability matrix, respectively.

In this application example, a multi-agent system consisting of five agents is considered, and the failure probability and recovery probability are given as follows

$$\alpha = \begin{pmatrix} 0 & 0.4 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0.9 & 1 \\ 1 & 1 & 0 & 1 & 0 \\ 0.7 & 0 & 1 & 0 & 1 \\ 1 & 1 & 0.1 & 1 & 0 \end{pmatrix}, \quad \beta = \begin{pmatrix} 0 & 0.2 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0.8 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0.1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0.6 & 0 & 0 \end{pmatrix}$$

From the above matrices, we can see that there are eight possible topologies, shown in Fig.4(a)–(h). Clearly, the union graph is not connected. We could choose two leaders, one from agents 1, 2, 4 and one from agents 3, 5, to make the union graph leader-follower connected, which fulfills the necessary and sufficient condition for structural controllability in Theorem 2. If the weights can be set freely, the weights of the union graph can be designed according to the sufficient conditions in Corollary 1. Therefore, this application example demonstrates the practicability of the results.

Fig. 4. Possible topologies of the networks with Markovian communication failure and recovery

V. Conclusions

In this paper, the controllability of multi-agent systems with multiple leaders is studied, in which the agents are interconnected via switching weighted topologies. We introduce the concepts of union graph and controllability matrix to show that the multi-agent system with multiple leaders and switching topologies is controllable if the corresponding system with union graph is controllable. From this conclusion, we obtain a graphic characterization of the structural controllability of switching multi-agent systems. Based on some well-known properties of controllability, we further obtain some sufficient conditions for such multi-agent systems. The conclusions reveal, in some sense, the connection between the controllability of a multi-agent system and the topological structure of the interconnection graph, and lay a foundation for designing optimal control solutions for the switching multi-agent system.

References

[1] C.W. Reynolds, “Flocks, herds, and schools: A distributed behavioral model”, in Proc. Computer Graphics, ACM SIGGRAPH’87 Conf., Vol.21, No.4, pp.25–34, 1987.
[2] H. Tanner, “On the controllability of nearest neighbor interconnections”, in Proc. of 43rd IEEE Conference on Decision and Control, pp.2467–2472, 2004.
[3] M. Ji, A. Muhammad and M. Egerstedt, “Leader-based multi-agent coordination: controllability and optimal control”, Proc. of the American Control Conference, pp.1358–1363, June 14-16, 2006.
[4] A. Rahmani and M. Mesbahi, “On the controlled agreement problem”, Proc. of the American Control Conference, pp.1376–1381, 2006.
[5] M. Ji and M. Egerstedt, “A Graph-Theoretic characterization of controllability for multi-agent systems”, Proc. of American Control Conference, pp.4588–4593, July 11-13, 2007.
[6] A. Rahmani, M. Ji, M. Mesbahi and M. Egerstedt, “Controllability of multi-agent systems from a graph-theoretic perspective”, SIAM J. Control Optim., Vol.48, No.1, pp.162–186, 2009.
[7] Z.J. Ji, Z.D. Wang, H. Lin and Z. Wang, “Interconnection topology for multi-agent coordination under leader-follower framework”, Automatica, Vol.38, No.5, pp.2857–2863, 2009.
[8] B. Liu, G.M. Xie et al., “Controllability of a leader-follower dynamic network with switching topology”, IEEE Transaction on Automatic Control, Vol.53, No.4, pp.1009–1013, 2009.
[9] Z.J. Ji, H. Lin and T.H. Lee, “Controllability of multi-agent systems with switching topology”, IEEE Conference on Robotics, Automation and Mechatronics, pp.421–426, 21-24 Sept. 2008.
[10] M. Zamani, H. Lin, “Structural controllability of multi-agent systems”, Proceedings of the American Control Conference, pp.5743–5748, June 10-12, 2009.
[11] X.M. Liu, H. Lin and B.M. Chen, “A graph-theoretic characterization of structural controllability for multi-agent system with switching topology”, Proceedings of the 48th IEEE Conference on Decision and Control, pp.7012–7017, Dec. 16-18, 2009.
[12] Z.J. Ji, L. Wang, X.X. Guo, “On controllability of switched linear systems”, IEEE Transaction on Automatic Control, Vol.53, No.3, pp.796–800, 2008.
[13] D.Z. Zheng, Linear System Theory (second edition), Tsinghua University Press, 2002.
[14] Z. Sun, S.S. Ge and T.H. Lee, “Controllability and reachability criteria for switching linear systems”, Automatica, Vol.38, No.5, pp.775–786, 2002.
[15] J.G. Hao, Y.Q. Dai, “Achieving controllable privacy protection in position service for VANETs”, Chinese Journal of Electronics, Vol.20, No.3, pp.395–400, 2011.
[16] C.T. Lin, “Structural controllability”, IEEE Transactions on Automatic Control, Vol.19, No.3, pp.201–208, 1974.
[17] C. Godsil, G. Royle, Algebraic Graph Theory, Springer, New York, 2001.

LUO Xiaoyuan received the M.S. and Ph.D. degrees from the Department of Electrical Engineering, Yanshan University, China in 2001 and 2005, respectively. He is currently a professor in Yanshan University. His research interests include multi-agent and networked control systems. (Email: [email protected])

LIU Dan is currently a postgraduate student in Institute of Electrical Engineering, Yanshan University. Her research interest is cooperative control for multi-agent systems. (Email: lydia [email protected])

ZHANG Fan received B.E. degree in communication engineering from Yanshan University, China in 2008. He is currently an assistant engineer in 723 Station, State Administration of Radio, Film and Television. His research interest is networked control systems based on digital broadcast signals. (Email: [email protected])

GUAN Xinping received M.S. degree in applied mathematics from Harbin Institute of Technology, China in 1991, and Ph.D. degree in electrical engineering from Harbin Institute of Technology, China in 1999. He is currently a professor of control theory and control engineering in Shanghai Jiaotong University. His research interests include robust congestion control in communication networks, cooperative control of multi-agent systems, and networked control systems. (Email: [email protected])

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

A Novel Collaborative Filtering Using Kernel Methods for Recommender Systems∗

CAO Jie1, WU Zhiang1, ZHUANG Yi2, MAO Bo1 and YU Zeng1

(1. Provincial Key Laboratory of E-Business, Nanjing University of Finance and Economics, Nanjing 210003, China)
(2. College of Computer and Information Engineering, Zhejiang Gongshang University, 310018, China)

Abstract: Recommender systems form an essential part of e-business systems. Collaborative filtering (CF), a technique widely used by recommender systems, performs poorly for cold start users and is vulnerable to shilling attacks. Therefore, a novel CF using kernel methods for prediction is proposed. The method is called Iterative kernel-based CF (IKCF), for it is an iterative process. First, mode or mean is used to smooth the unknown ratings; second, discrete or continuous kernel estimators are used to generate predicted ratings iteratively and to export the predicted ratings in the end. The experimental results on three real-world datasets show that, with IKCF as a booster, the prediction accuracy of recommenders can be significantly improved, especially for sparse datasets. IKCF can also achieve high prediction accuracy with a small number of iterations.

Key words: Recommender systems, Collaborative filtering, Kernel methods, Prediction.

I. Introduction

Recommender systems have made tremendous progress in the past several years, as data in various formats has increased explosively on the Web. Recommender systems are changing the way people interact with the Web. From e-commerce sites like Amazon.com to news and information sites like Digg and Slashdot, recommender systems help people choose from diverse products and massive data by providing a more personalized information access experience[1]. Currently, Collaborative filtering (CF) is the most widely used technique in recommender systems. Tapestry and GroupLens are the two earliest CF-based recommender systems, for mail and news respectively[2]. Amazon employs a CF algorithm for book recommendation[3]. Facebook also utilizes a CF algorithm for advertisement recommendation[4].

In data mining, recommender systems can be regarded as a prediction task which is of high importance to business applications[5]. Predicting the preference of users is the key step of CF, and CF-based recommenders often utilize the opinions of a user's k nearest neighbors (kNN) to identify the content of interest to the user from an overwhelming set of potential choices. kNN-based CF performs poorly for so-called cold start users who have expressed only a few ratings. Meanwhile, recommender systems using kNN-based CF are vulnerable to shilling attacks. In such attacks, malicious users create biased rating profiles to manipulate the recommendation output of the system[6].

To solve the inherent problems of kNN-based CF, we propose a novel CF algorithm, called Iterative kernel-based collaborative filtering (IKCF), which predicts the preference of users based on the kernel method. The kernel method was first used in Support vector machines (SVM), and it has been applied to many applications such as classification, principal component analysis, regression, machine learning and missing value imputation[7,8]. In the IKCF method, the prediction phase is an iterative process using the kernel method. First, the missing ratings are smoothed by mode or mean values. Second, the predicted values are generated iteratively based on the known ratings with discrete or continuous kernel functions, and the predicted ratings are given in the end. The iteration stops when a certain criterion is satisfied. Since the missing ratings are smoothed first, the prediction accuracy for cold start users is improved greatly. Moreover, the predicted values are generated iteratively, which weakens the influence of shilling attackers. Experimental results also show that the proposed IKCF leads to higher prediction accuracy than two kinds of kNN-based CF.

The rest of the paper is structured as follows. Related work is given in Section II. In Section III, we use an illustrative example to explain the difference between IKCF and traditional kNN-based CF. Section IV presents the novel IKCF algorithm. The efficiency of the proposed algorithm is illustrated with various kinds of experiments in Section V. Section VI summarizes the whole paper.

II. Related Work

Existing CF algorithms are able to be categorized into the

∗Manuscript Received Sept. 2011; Accepted Nov. 2011. This work is supported by the National Natural Science Foundation of China (No.71072172, No.61103229, No.60003047), Industry Projects in the Jiangsu Science & Technology Pillar Program (No.BE2011198), Jiangsu Provincial Key Laboratory of Network and Information Security (Southeast University) (No.BM2003201) and the Program of Natural Science Foundation of Zhejiang Province (No.Y1090165, No.Y1110644, No.Y1110969). 610 Chinese Journal of Electronics 2012 neighborhood-based approach and the latent factor models[9]. isfied. Details on IKCF are discussed in Section IV. Neighborhood-based CF algorithms contain User-based CF (UCF) and Item-based CF (ICF), according to whether neigh- IV. Algorithm Design bors are being computed for the users or items. The kNN algo- R rithm is adopted by both UCF and ICF. For example, if UCF Without loss of generality, a user-item-rating cube =  (Ru,i,r)n×m×w can be generated, where n, m and w are the wantstopredicttheratingofuseru for item j (ratingu,j), it first computes a set of similar neighbors on a user-item ma- number of dimensions of users, items and ratings, respec- trix N , in which element N(v, i) is the rating of user v for tively. Assume there are three types of ratings in the cube R item i. Many other methods for similarity measures are also (non-ordering and ordering discrete ratings, and continu- r R proposed, such as cosine, correctional cosine, and Pearson cor- ous ratings). When the rating is set, the cube becomes a n × m matrix Ru,i(r). And there is only one type of val- relation coefficient (PCC). The predicted rating ratingu,j can R r r , , ···,w be generated by combining similarity measures of kNN and uesineachmatrix u,i( ), =12 . Fig.1 depicts the R their ratings on item j. user-item-rating cube and its three kinds of matrixes. Different from the above-mentioned kNN-based CF, this paper utilizes kernel methods for the prediction phase of CF. 
Kernel function is widely used to build prediction models, such as in Refs.[10, 11] and [12]. Kernel function selection is the key step of applying the kernel method to prediction. Usually, pre- dicted ratings in recommender systems are ordering discrete values, such as MovieLens, Epinions, Jester Joke, Netflix, and so on. Currently there is not too many open recommenda- tion dataset, in which the values are continuous. However, for the sake of integrality, a cube model containing order- ing, non-ordering discrete and continuous ratings is presented. Three kinds of kernel functions for dealing with ordering, non- ordering discrete and continuous ratings are also proposed in this paper. Fig. 1. The user-item-rating cube R

III. An Example 1. Three types of kernel functions For a certain r,letX(r)beallknownvaluesinthema- In this section, we utilize an example to demonstrate the trix Ru,i(r), X u(r)betheu-th (u =1, 2, ···,n)rowofX(r), k difference of our approach from traditional NN-based CF. and Y (r)bean × 1 vector with unknown values remained. Example 1 We construct a 5∗5 User-Movie matrix from According to the type of values in Ru,i(r), X(r)andY (r)are u, j MovieLens datasets. An entry in ( )ofthismatrixrep- separated into three types: Xα(r)andY α(r) for non-ordering u j resents the rating user on the -th movie. The symbol ? discrete values; Xβ(r)andY β (r) for ordering discrete values; represents the unknown rating. Xγ (r)andY γ (r) for continuous values. We first present three UID/MID 1 2 3 4 5 types of kernel functions, upon which estimators to predict 1 5 3 4 3 3 the unknown ratings are constructed. Non-ordering discrete 2 4 ? ? ? ? values, ordering discrete values and continuous values kernel 13 3 ? ? 5 1 functions are defined as follows. 16 5 ? ? 5 ? Definition 1 (Non-ordering discrete kernel func- 21 5 ? ? ? 2 tion) Xα r i i , , ···,m Let u,i( )bethe -th ( =12 ) component of If we want to make recommendations for UID-2, we should α m α α Xu (r)anddu,x = i=1 I(Xu,i(r),Xx,i(r)) denote the num- predict the ratings UID-2 on MID-2 to MID-5. The kNN-based α ber of disagreeing components between any two rows of X (r), CF works as follows. Assume UID-1 and UID-13 are two near- α α where I(Xu,i(r),Xx,i(r)) is an indicator function taking value est neighbors of UID-2. The predicted values of (2, 2) and (2, α α 1ifXu,i(r) = Xx,i(r) and 0 otherwise. The non-ordering dis- 3) is determined by (1, 2) and (1, 3), because the values of (13, crete kernel function is defined as: 2) and (13, 3) are unknown. But the predicted values of (2, α α 4) and (2, 5) are determined by (1, 4), (13, 4) and (1, 5), (13, Kα,u,x,λ =K(Xu (r),Xx (r),λ) 5). 
Since UID-2 only has one rating, any similarity measure m α α is hard to find its real similar neighbors, which decrease the = l(Xu,i(r),Xx,i(r),λ) prediction accuracy. i=1 m−d d Different from the kNN-based CF, IKCF fills all unknown =1 u,x λ u,x d ratings in User-Movie matrix with the mode of that column. =λ u,x (1) For example, we fill (2, 4) and (21, 4) with 5. Then, ordering discrete kernel function is employed to update new predicted where λ is a smoothing parameter of which the range is in α α α α ratings based on the predicted ratings in the last iteration. (0,1), and l(Xu,i(r),Xx,i(r),λ)=λ if Xu,i(r) = Xx,i(r)and1 This process is repeated until certain stopping criterion is sat- otherwise. A Novel Collaborative Filtering Using Kernel Methods for Recommender Systems 611
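The mode-based initialization of Example 1 and the kernel of Definition 1 can be illustrated with a minimal Python sketch. The function names, the use of `None` for the symbol ?, and the tie-breaking rule (smallest rating wins) are assumptions made for illustration only; the paper does not specify how ties in the mode are resolved.

```python
from collections import Counter

def fill_with_column_mode(matrix):
    """Return a copy of `matrix` where each None is replaced by its column mode.

    Assumption: ties between equally frequent ratings are broken toward the
    smallest rating; every column is assumed to hold at least one known value.
    """
    filled = [row[:] for row in matrix]
    for j in range(len(matrix[0])):
        known = [row[j] for row in matrix if row[j] is not None]
        counts = Counter(known)
        top = max(counts.values())
        mode = min(v for v, c in counts.items() if c == top)
        for row in filled:
            if row[j] is None:
                row[j] = mode
    return filled

def nonordering_kernel(row_u, row_x, lam):
    """Eq.(1): lambda raised to the number of disagreeing components."""
    d = sum(1 for a, b in zip(row_u, row_x) if a != b)
    return lam ** d

# The User-Movie matrix of Example 1 (rows: UID 1, 2, 13, 16, 21).
example = [
    [5, 3, 4, 3, 3],
    [4, None, None, None, None],
    [3, None, None, 5, 1],
    [5, None, None, 5, None],
    [5, None, None, None, 2],
]
filled = fill_with_column_mode(example)
k = nonordering_kernel(filled[0], filled[1], lam=0.6)
```

With this input, entries (2, 4) and (21, 4) are filled with 5, matching the example in Section III; the value of `k` depends on the assumed tie-breaking in column 5.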

Definition 2 (Ordering discrete kernel function) Let $\delta_{u,x} = \sum_{i=1}^{m} |X^{\beta}_{u,i}(r) - X^{\beta}_{x,i}(r)|$ denote the L1-distance between any two rows of $X^{\beta}(r)$. The ordering discrete kernel function is then defined as:

$$K_{\beta,u,x,\lambda} = K(X^{\beta}_u(r), X^{\beta}_x(r), \lambda) = \prod_{i=1}^{m} \lambda^{|X^{\beta}_{u,i}(r) - X^{\beta}_{x,i}(r)|} = \lambda^{\delta_{u,x}} \tag{2}$$

Definition 3 (Continuous kernel function) For any two rows $X^{\gamma}_u(r)$ and $X^{\gamma}_x(r)$ in $X^{\gamma}(r)$, let $K_{rbf} = \exp(-\|X^{\gamma}_u(r) - X^{\gamma}_x(r)\|^2 / h^2)$ denote the Radial basis function (RBF) and $K_{poly} = (\langle X^{\gamma}_u(r), X^{\gamma}_x(r) \rangle + 1)^b$ denote the polynomial kernel. The continuous kernel function is the combination of the RBF and the polynomial kernel:

$$K_{\gamma,u,x,h} = \rho K_{rbf} + (1 - \rho) K_{poly} \tag{3}$$

where $b$ is the degree of the polynomial, $h$ is the width of the RBF and $\rho$ is the optimal mixing coefficient ($0 \le \rho \le 1$).

For the continuous vector, it is difficult to determine the parameters $b$, $h$ and $\rho$ simultaneously due to the exponential time complexity. However, Jordaan[13] showed experimentally that only a small proportion of RBF is needed to bring the ability of interpolation into $K_{\gamma,u,x,h}$. Moreover, Zhu et al.[10] also demonstrated that using higher degrees of polynomials or larger widths of RBF is not necessary. Therefore, we suggest setting $b = 2$, $h = 0.2$, $\rho = 0.05$ in Eq.(3).

2. Iterative kernel-based collaborative filtering (IKCF)

IKCF is an iterative process. In the first round it assigns to all unknown ratings in $R_{u,i}(r)$ the mode or the mean of the corresponding column. Then, three kinds of kernel estimators are used to obtain the predicted values in the $t$-th round based on the values of the previous round. Therefore, the kernel estimators are the key ingredients of IKCF.

Let the $n \times 1$ vector $Y(r)$ be the target vector and $Y_u(r)$ be the $u$-th component of $Y(r)$. Since $Y(r)$ contains both known and unknown components, a Boolean vector $\delta$ is used to represent this difference. Then, the relationship of $X(r)$ and $Y(r)$ can be formulated as Eq.(4):

$$(X(r), Y_u(r), \delta_u), \quad u = 1, 2, \cdots, n \tag{4}$$

where $\delta_u = 0$ when $Y_u(r)$ is known, and $\delta_u = 1$ when $Y_u(r)$ is unknown.

In the $t$-th round, let $\hat{Y}^{t}_x(r)$ denote the predicted value of the $x$-th unknown value in the target vector $Y(r)$. Then, the unknown ratings in the $t$-th round are predicted with three kinds of kernel estimators based on the values in the $(t-1)$-th round. The definitions of the three kernel estimators are as follows.

Definition 4 (Non-ordering discrete kernel estimator) Let $D^{\alpha}_y = \{0, 1, \cdots, c_u - 1\}$ denote the range of $Y^{\alpha}_u(r)$, and $n^{-1} \sum_{u=1}^{n} l(Y^{\alpha,t-1}_u(r), y, \lambda) K_{\alpha,u,x,\lambda}$ be the joint density of $X^{\alpha}(r)$ and $Y^{\alpha}(r)$, where $l(Y^{\alpha,t-1}_u(r), y, \lambda) = \lambda$ if $Y^{\alpha,t-1}_u(r) \neq y$ and 1 otherwise. The kernel estimator, $\hat{Y}^{\alpha,t}_x(r)$, of the $x$-th unknown value in the non-ordering discrete target vector $Y^{\alpha}(r)$ is defined as:

$$\hat{Y}^{\alpha,t}_x(r) = \frac{\sum_{u=1}^{n} \sum_{y \in D^{\alpha}_y,\, y = Y^{\alpha,t-1}_u(r)} y\, l(Y^{\alpha,t-1}_u(r), y, \lambda)\, K_{\alpha,u,x,\lambda}}{\sum_{u=1}^{n} K_{\alpha,u,x,\lambda}} \tag{5}$$

Definition 5 (Ordering discrete kernel estimator) Similar to the non-ordering discrete kernel estimator, let $D^{\beta}_y = \{0, 1, \cdots, c_u - 1\}$ denote the range of $Y^{\beta}_u(r)$. The kernel estimator, $\hat{Y}^{\beta,t}_x(r)$, for the ordering discrete target vector $Y^{\beta}(r)$ is defined as:

$$\hat{Y}^{\beta,t}_x(r) = \frac{\sum_{u=1}^{n} \sum_{y \in D^{\beta}_y,\, y = Y^{\beta,t-1}_u(r)} y\, l(Y^{\beta,t-1}_u(r), y, \lambda)\, K_{\beta,u,x,\lambda}}{\sum_{u=1}^{n} K_{\beta,u,x,\lambda}} \tag{6}$$

where $l(Y^{\beta,t-1}_u(r), y, \lambda) = \lambda$ if $Y^{\beta,t-1}_u(r) \neq y$ and 1 otherwise.

Definition 6 (Continuous kernel estimator) The kernel estimator, $\hat{Y}^{\gamma,t}_x(r)$, for the continuous target vector $Y^{\gamma}(r)$ is defined as:

$$\hat{Y}^{\gamma,t}_x(r) = \frac{n^{-1} \sum_{u=1}^{n} Y^{\gamma,t-1}_u(r)\, K_{\gamma,u,x,h}}{n^{-1} \sum_{u=1}^{n} K_{\gamma,u,x,h} + n^{-2}} \tag{7}$$

In Eqs.(5), (6) and (7), the value of $Y^{\theta,t-1}_u(r)$ ($\theta = \alpha$, $\beta$ or $\gamma$) is equal to $Y^{\theta}_u(r)$ if $\delta_u = 0$ and to $\hat{Y}^{\theta,t-1}_u(r)$ if $\delta_u = 1$. That is,

$$Y^{\theta,t-1}_u(r) = \begin{cases} Y^{\theta}_u(r), & \text{if } \delta_u = 0 \\ \hat{Y}^{\theta,t-1}_u(r), & \text{if } \delta_u = 1 \end{cases} \quad \text{where } \theta = \alpha, \beta \text{ or } \gamma \tag{8}$$

Based on Eqs.(5), (6) and (7), the pseudo-code of the IKCF is shown in Table 1.

Table 1. Pseudo-code of the IKCF
Input: $R$: the user-item-rating cube
       maxIter: the max number of iterations
       $\varepsilon$: the threshold of the stopping criterion
Output: Recommendation list for any user
1: if (the type of values in $R_{u,i}(r)$ is non-ordering discrete or ordering discrete)
2:     all unknown values are filled with the mode of that column
3: else if (the type of values in $R_{u,i}(r)$ is continuous)
4:     all unknown values are filled with the mean of that column
5: end of if //initialization
6: for t ← 1 to maxIter //iteratively predicting the unknown values
7:     if (the type of values in $R_{u,i}(r)$ is non-ordering discrete)
8:         compute $\hat{PR}^{t}$ by Eq.(5)
9:     else if (the type of values in $R_{u,i}(r)$ is ordering discrete)
10:        compute $\hat{PR}^{t}$ by Eq.(6)
11:    else if (the type of values in $R_{u,i}(r)$ is continuous)
12:        compute $\hat{PR}^{t}$ by Eq.(7)
13:    end of if
14:    if ($\|\hat{PR}^{t} - \hat{PR}^{t-1}\|_{\infty} < \varepsilon$)
15:        break
16:    end of if
17: end of for
18: Generate a recommendation list based on the predicted ratings.

In lines 1 to 5, the unknown ratings in $R_{u,i}(r)$ are initialized. Lines 6 to 17 are the iterative process of IKCF. After the unknown ratings in $R_{u,i}(r)$ are predicted, we can make decisions on recommendation for any user. Generally speaking, there are two methods for decision making in recommendation. (1) Select the top $N$ target items with the highest ratings for users. This method has to generate recommendation results even though some items may have low ratings and be useless to the user. (2) Define a threshold value, and select an item if its rating is greater than the threshold value. This method can also produce useful recommendations for the user.

A norm, a concept in mathematics, is employed as the convergence condition. A norm is a function that assigns a strictly positive length to every non-zero vector in a vector space, and all norms on a finite-dimensional space are proven to be equivalent[14]. To simplify the calculation, we utilize the infinite norm as the convergence condition. Let $PR$ denote the set of all unknown values in the matrix $R_{u,i}(r)$, and $|PR|$ be the size of $PR$. Let $\hat{PR}^{t} = (\hat{PR}^{t}_1, \hat{PR}^{t}_2, \cdots, \hat{PR}^{t}_{|PR|})$ denote the predicted values of all unknown ratings in the $t$-th iteration. The infinite norm between $\hat{PR}^{t}$ and $\hat{PR}^{t-1}$ is then given by Eq.(9).

$$\|\hat{PR}^{t} - \hat{PR}^{t-1}\|_{\infty} = \mathrm{Max}(|\hat{PR}^{t}_1 - \hat{PR}^{t-1}_1|, |\hat{PR}^{t}_2 - \hat{PR}^{t-1}_2|, \cdots, |\hat{PR}^{t}_{|PR|} - \hat{PR}^{t-1}_{|PR|}|) \tag{9}$$

V. Experiments

This section reports the experimental results of various versions of CF applied to real datasets. These results indicate that the proposed IKCF can achieve superior performance compared with UCF and ICF on sparse datasets.

1. Experimental design

The proposed IKCF is implemented in the C++ programming language in the Visual Studio environment. Standard UCF and ICF are also implemented as two fundamental versions of CF. The experiment system runs on an Intel Core2 2.33 GHz CPU with 3GB RAM under Windows XP SP3.

There are a number of datasets for recommendation, among which three typical data sets are chosen. The Jester Joke data set released by UC Berkeley contains 4.1 million continuous ratings (−10.00 to +10.00) of 100 jokes from 73,496 users[15]. The MovieLens data set consists of 100,000 ratings on 1682 movies by 943 users[16]. All the ratings are integer values between one and five, where one is the lowest (disliked) and five is the highest (most liked). The article ratings dataset of the Epinions datasets is utilized, where the ratings represent how much a certain user rates a certain textual article written by another user[17]. Jester Joke is a dense dataset, and the Epinions dataset is very sparse. The density of the MovieLens dataset is moderate.

The three above-mentioned datasets contain ordering discrete values. To the authors' knowledge, there is no open recommendation dataset of which the values are continuous. Although we only use the ordering discrete estimator for prediction, the experimental results illustrate the effectiveness of IKCF. Meanwhile, we utilize the Normalized mean absolute error (NMAE) to measure the error in recommendation:

$$NMAE = \frac{MAE}{\left(\sum_{i=1}^{|PR|} P_i\right) / |PR|}, \quad \text{where } MAE = \frac{\sum_{i=1}^{|PR|} |P_i - \hat{P}_i|}{|PR|} \tag{10}$$

NMAE is in [0, 1]. A small value of NMAE indicates a precise recommendation.

2. Experimental results

The first experiment compares the NMAE performance of IKCF, UCF and ICF for all users in the three datasets. In IKCF, we set $\lambda$ to 0.6 and the number of iterations to 8. As Fig.2 shows, IKCF is more precise than UCF and ICF in the three datasets, and especially for the sparse dataset (e.g. Epinions), the NMAE of IKCF is far lower than that of UCF and ICF. This appears to be because the similarity measures of UCF and ICF cannot reflect the truth for the sparse dataset.

Fig. 2. NMAE Performance comparison for all users

The second experiment studies the impact of the number of iterations of IKCF. We set $\lambda$ to 0.6 and record the NMAE of IKCF from the first iteration to the tenth iteration. The variation of NMAE in each round is shown in Fig.3. Combining the results shown in Fig.2 and Fig.3, the NMAE of IKCF in the first round is worse than that of UCF and ICF because the unknown values are predicted by the mode in IKCF. From the second round, the performance of IKCF is better than that of UCF and ICF. When the number of iterations reaches 4 to 6, IKCF performs drastically better than UCF and ICF. However, the performance of IKCF tends towards stability as the number of iterations continues to increase.

Fig. 3. The number of iteration of IKCF on NMAE

The third experiment investigates the impact of the smoothing parameter $\lambda$. We set the number of iterations to 8, and vary the value of $\lambda$ from 0.1 to 0.9 with an interval of 0.1. Fig.4 shows how the NMAE of IKCF for the three datasets varies with the increase of $\lambda$. The three curves exhibit a similar form and hit a trough when the value of $\lambda$ is intermediate. From the results shown in Fig.4, it is appropriate to set $\lambda$ between 0.5 and 0.7 in IKCF. If a more precise value of $\lambda$ is wanted, the cross-validation approach may be utilized. In our experiments, we set $\lambda$ to 0.6, which leads to satisfactory performance of IKCF.

Fig. 4. The impact of λ of IKCF on NMAE

The last experiment ascertains the performance of IKCF for cold start users. In the Epinions dataset, we consider users with fewer than 5 ratings as cold start users, the number of which is 24,000. In the MovieLens dataset, however, there are 2 users with fewer than 10 ratings and more than 100 users with fewer than 20 ratings among 943 users. So we consider users with fewer than 20 ratings as cold start users. Since the Jester Joke dataset is too dense, we do not select cold start users from it. As shown in Fig.5, the performance of IKCF is significantly better than that of UCF and ICF. In addition, ICF performs better than UCF because it is difficult to find neighborhoods for the cold start user.

Fig. 5. NMAE Performance Comparison for cold start users

In summary, the proposed IKCF substantially improves the performance of prediction, especially on sparse datasets. This improvement is achieved by filling in the mode for unknown ratings in the first round and utilizing kernel estimators to predict unknown ratings iteratively. IKCF clearly outperforms traditional CF (both user-based and item-based) in terms of precision. Since cold start users have only a few ratings, the improvement of NMAE using IKCF compared to UCF and ICF is much larger for cold start users than for all users.

VI. Conclusion

It is observed that several inherent problems, including cold start users and vulnerability to shilling attacks, cannot be solved by traditional kNN-based CF. We present IKCF, a novel iterative CF using kernel methods for prediction. The proposed infinite norm is shown to be suitable as the stopping criterion. For the sake of completeness, a cube model including various formats of ratings is presented. The cube model is an extension of the user-item matrix and is applicable to any recommender system. Our experiments compare IKCF with UCF and ICF on three real-world datasets, namely Jester Joke, MovieLens and Epinions. Meanwhile, the impact of the number of iterations and of the smoothing parameter λ is investigated. IKCF can achieve high prediction accuracy with a small number of iterations, and tends towards stability as the number of iterations continues to increase. We also suggest using cross-validation to select the value of λ, because carefully determining the intermediate value of λ can improve the prediction accuracy further. Experimental results show that IKCF performs better than UCF and ICF on sparse datasets. Also, IKCF can significantly improve the prediction accuracy for cold start users.

References

[1] J. Riedl, B. Smyth, "Introduction to special issue on recommender systems", ACM Transactions on the Web, Vol.5, No.1, Article 1, pp.1–2, 2011.
[2] F. Cacheda, V. Carneiro, D. Fernandez and V. Formoso, "Comparison of collaborative filtering algorithms: limitations of current techniques and proposals for scalable, high-performance recommender systems", ACM Transactions on the Web, Vol.5, No.1, Article 2, pp.1–33, 2011.
[3] G. Linden, B. Smith and J. York, "Amazon.com recommendations: item-to-item collaborative filtering", IEEE Internet Computing, Vol.7, No.1, pp.76–80, 2003.
[4] L.J. Fang, H. Kim, K. LeFevre and A. Tami, "A privacy recommendation wizard for users of social networking sites", Proceedings of the 17th ACM Conference on Computer and Communications Security, Chicago, IL, USA, pp.630–632, 2010.
[5] J.J. Wu, H. Xiong and J. Chen, "COG: Local decomposition for rare class analysis", Data Mining and Knowledge Discovery, Vol.20, No.2, pp.191–220, 2010.
[6] Z.A. Wu, J. Cao, B. Mao and Y.Q. Wang, "Semi-SAD: applying semi-supervised learning to shilling attack detection", The 5th ACM Conference on Recommender Systems, Chicago, IL, USA, pp.289–292, 2011.
[7] C.A. Micchelli, M. Pontil, "Learning the kernel function via regularization", Journal of Machine Learning Research, Vol.6, pp.1099–1125, 2005.
[8] J.B. Li, L.J. Yu and S.H. Sun, "Refined kernel principal component analysis based feature extraction", Chinese Journal of Electronics, Vol.20, No.3, pp.467–470, 2011.
[9] Y. Koren, "Factor in the neighbors: scalable and accurate collaborative filtering", ACM Transactions on Knowledge Discovery from Data, Vol.4, No.1, pp.1–24, 2009.
[10] X.F. Zhu et al., "Missing value estimation for mixed-attribute data sets", IEEE Transactions on Knowledge and Data Engineering, Vol.23, No.1, pp.110–121, 2011.
[11] S.C. Zhang, Z.J. and X.F. Zhu, "Missing data imputation by utilizing information within incomplete instances", The Journal of Systems and Software, Vol.84, No.3, pp.452–459, 2011.
[12] Y.S. Qin et al., "POP algorithm: kernel-based imputation to treat missing values in knowledge discovery from databases", Expert Systems with Applications, Vol.36, No.2, pp.2794–2804, 2009.
[13] E.M. Jordaan, "Development of robust inferential sensors: industrial application of support vector machines for regression", Ph.D. Thesis, Technical University Eindhoven, 2002.
[14] R.A. Horn, C.R. Johnson, Matrix Analysis, Cambridge University Press, 1985.
[15] Jester Joke. http://www.ieor.berkeley.edu/~goldberg/jester-data, 2011.
[16] GroupLens Research. http://www.grouplens.org/node/73, 2011.

[17] P. Massa and P. Avesani, Trust Metrics in Recommender Systems, Computing with Social Trust: Human-Computer Interaction Series, Springer, pp.259–285, 2009.

CAO Jie received the Ph.D. degree from Southeast University, China, in 2002. He is currently a professor and the director of Jiangsu Provincial Key Laboratory of E-Business at Nanjing University of Finance and Economics. He has been selected in the Program for New Century Excellent Talents in University (NCET). His main research interests include cloud computing, business intelligence and data mining. He has published one book and more than 40 refereed papers in various conferences and journals. He has taken charge of over 20 national projects and obtained the Jiangsu Provincial Scientific and Technological Progress Second Prize (2008).

WU Zhiang (corresponding author) received the Ph.D. degree from Southeast University, China, in 2009. He is currently an associate professor of Jiangsu Provincial Key Laboratory of E-Business at Nanjing University of Finance and Economics. He is a member of the CCF and ACM. Dr. Wu is a recipient of the Excellent Student Paper Award of IFIP Intl. Conf. on Network Parallel Computing (2007). His recent research interests include recommender systems, cloud computing and data mining. Dr. Wu has authored two books and more than 20 refereed papers in international journals and conferences. (Email: [email protected])

ZHUANG Yi is currently an associate professor at the College of Computer and Information Engineering in Zhejiang Gongshang University, where he has been a faculty member since May 2008. Dr. Zhuang is a recipient of the CCF Doctoral Dissertation Award conferred by the Chinese Computer Federation in 2008 and the IBM Ph.D. Fellowship 2007-2008. He obtained his Ph.D. degree in computer science in Mar. 2008. Dr. Zhuang is a member of ACM, IEEE and CCF. Dr. Zhuang has published more than 30 high-quality papers and co-authored two books or chapters, and 3 patents have been issued. Dr. Zhuang is engaged in two aspects of research issues: index techniques (multimedia or uncertain) and cloud computing, etc.

MAO Bo received the Ph.D. degree from Royal Institute of Technology (KTH), Sweden in 2011 and the B.S. and M.S. degrees from Xi'an Jiaotong University and Southeast University in China in 2005 and 2008, respectively. He is currently an associate professor of Jiangsu Provincial Key Laboratory of E-Business at Nanjing University of Finance and Economics. His recent research interests include 3D visualization and data mining. Dr. Mao has published over 20 refereed papers in international journals and conferences.

YU Zeng received B.S. and M.S. degrees from the Department of Mathematics, School of Sciences, China University of Mining and Technology in 2008 and 2011, respectively. He is currently working as a researcher in Jiangsu Provincial Key Laboratory of E-Business at Nanjing University of Finance and Economics. His current research interests include recommender systems and data mining.

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

Parallel Test Task Scheduling with Constraints Based on Hybrid Particle Swarm Optimization and Taboo Search∗

LU Hui, CHEN Xiao and LIU Jing

(School of Electronic Engineering, Beihang University, Beijing 100191, China)

Abstract – Parallel test task scheduling is one of the key technologies used in parallel test. A hybrid Particle swarm optimization and Taboo search algorithm (PSO-TS) is proposed to solve parallel test task scheduling with constraints. The scheduling process is divided into two sub-problems: the task scheduling sequence with constraints and resource optimization. Under this view, the test resource scheduling problem can be solved after the task scheduling with constraints, which improves the optimization rate of PSO-TS. Furthermore, a new inertia weight is proposed to enhance the exploitation and exploration ability, and a new constraint-handling mechanism is used for encoding during the particle updating for the test task scheduling problem. Simulation results show the suitability of the proposed algorithm in terms of feasibility and effectiveness.

Key words – Scheduling, Parallel test, Taboo search (TS), Automatic test system, Particle swarm optimization (PSO).

I. Introduction

Parallel testing involves testing multiple products or subcomponents simultaneously in automatic test. It typically shares a set of test equipment across multiple test sockets, saving the idle time of expensive test equipment, which amounts to more than 50 percent of the test time when compared with serial test technology[1]. Thus, parallel test has an increasing application in the semiconductor, communication, aeronautic and aerospace industries. With the development of parallel testing, the Parallel test task scheduling problem (PTTSP), which involves issues such as resource conflicts and the optimal task sequence, arises. Solving it can also help to lower cost and improve test efficiency and importability. It is now a hot research issue, although it is a complex and difficult Non-deterministic polynomial (NP) problem for optimization[2].

There are three categories of methods for solving the PTTSP. The first category consists of enumeration methods and rules-based methods[3−6], which is a brute-force approach. The second category[7,8] simply uses an intelligent algorithm. The third category[3] initially uses Petri nets to establish a task scheduling model, and then uses an intelligent algorithm to obtain scheduling optimization solutions based on the scheduling model. In this paper, PSO is proposed to solve the PTTSP because of its advantages, such as the simplicity of the algorithm, ease of implementation, no need for gradient information, and fewer parameters.

When dealing with task priority and resource conflicts, two approaches have been proposed. One is the adjustment of the solutions that do not satisfy the constraints. It results in a low efficiency in searching for optima because the adjustment of one task affects the arrangement of other tasks[7]. The other strategy is to add a penalty function to the fitness function. It has difficulty in dealing with a large number of constraints. Hence, we provide a new method to handle constraints in this paper, in order to improve efficiency and avoid local optima.

The organization of this paper is as follows. A brief introduction of the PTTSP is given in Section II. The PSO-TS algorithm with the new constraint-handling method for the PTTSP is proposed in Section III. The feasibility and effectiveness of the PSO-TS algorithm are demonstrated by a comparison with other algorithms in Section IV. Conclusions are given in Section V.

II. Mathematical Description of Parallel Test Task Scheduling

An automatic test system has a finite task set $T = \{t_1, t_2, \cdots, t_m\}$, and a finite resource set $R = \{r_1, r_2, \cdots, r_n\}$[7]. All tasks have to be scheduled in a predetermined order and must satisfy task precedence restrictions. Each task $t_j$ ($j = 1, 2, \cdots, m$) has a finite set $W^j$ of $k_j$ test schemes, $W^j = \{w^j_1, w^j_2, \cdots, w^j_{k_j}\}$, and the resource set $R^{ji}$ of scheme $w^j_i$ ($i = 1, 2, \cdots, k_j$) is $R^{ji} = \{r^{ji}_1, r^{ji}_2, \cdots, r^{ji}_{l_{ji}}\}$. The cost set $c^{ji}$ of test scheme $w^j_i$ ($i = 1, 2, \cdots, k_j$) is $c^{ji} = \{c^{ji}_1, c^{ji}_2, \cdots, c^{ji}_s\}$, where $s$ is the number of objectives. The constraint matrix between all test tasks is denoted by $T^s_{m \times m}$. If $t_a$ is of a higher priority than $t_b$ with constraints, then $T^s_{m \times m}(a, b) = 1$; otherwise $T^s_{m \times m}(a, b) = 0$. The test sequence $T^p$ and the resource optimal matrix $R_t$ aim to fulfill certain parallel test objectives. The following are some conditions set forth in the current study[7]:

∗Manuscript Received Sept. 2011; Accepted Mar. 2012. This work is supported by the National Natural Science Foundation of China (No.61101153), National High Technology Research and Development Program of China (863 Program) (No.2011AA110101).

• Each resource can be occupied by only one test task at a time.
• The starting time of a task is ignored.

III. Parallel Test Task Scheduling Method Based on PSO-TS

Fig. 1. Particle coding

The PSO-TS algorithm, which is a combination of PSO and TS, is applied to find an optimal scheduling sequence and resource optimization. First, a new encoding method helps to create a search space without constraints for PSO-TS. Then the PSO searches for the optimal scheduling sequence among all the permutations of the encoded test tasks. After a feasible task sequence is obtained, the TS algorithm is used for resource optimization to get a resource allocation matrix.

1. Search for optimal scheduling sequence with constraints

The PSO algorithm[9,10] is used to determine the optimal scheduling sequence $T^p$ in the current study. Each potential solution is treated as a particle. All particles fly through the $D$ dimensional parameter space of the problem while learning from the historical information collected during the search process. Each particle has a tendency to fly toward better search regions over the course of the search process. The velocity $v_i^d$ and position $x_i^d$ of the $d$th dimension of the $i$th particle are updated as follows:

$$v_i^d = \omega \times v_i^d + c_1 \times rand1_i^d \times (pbest_i^d - x_i^d) + c_2 \times rand2_i^d \times (gbest^d - x_i^d) \tag{1}$$

$$x_i^d = x_i^d + v_i^d \tag{2}$$

where $c_1$ and $c_2$ are the acceleration constants, and $rand1_i^d$ and $rand2_i^d$ are two uniformly distributed random numbers in [0, 1]. $X_i = (x_i^1, x_i^2, \cdots, x_i^D)$ is the position of the $i$th particle; $pbest_i = (pbest_i^1, pbest_i^2, \cdots, pbest_i^D)$ is the best previous position yielding the best fitness value for the $i$th particle; $gbest = (gbest^1, gbest^2, \cdots, gbest^D)$ is the best position discovered by the whole population; $V_i = (v_i^1, v_i^2, \cdots, v_i^D)$ represents the rate of position change for particle $i$, and $\omega$ is the inertia weight used to achieve a balance between the global and local search capabilities.

This is the main idea of PSO. In order to apply PSO to test scheduling, a new encoding method of particles is introduced to deal with constraints, and a new inertia weight method aims to improve the optimization speed and efficiency of PSO.

(1) Encoding of particles with constraints

Since the PTTSP is an integer programming problem, the encoding method of PSO should be modified. We use an $m$-dimension vector to represent the particle position, which is denoted as $T^p$. The position of the particle would likely be similar to $X$ in Fig.1 after being updated by Eqs.(1) and (2). All the elements of $X$ are set in ascending order. This guarantees that all the elements in a scheduling sequence are integers, and that all tasks are included in the vector.

To handle constraints, the main idea is to treat tasks with constraints as an entity. The encoding of constrained tasks allows the test sequence to meet all the task constraints while the particles are updating. For example, task 6 should begin the test after task 4. We can encode task 4 and task 6 as the same task, as shown in Fig.2. Decoding is the reverse process of encoding and is used to calculate the objective. After the encoding and decoding process, the particle update process no longer needs to adjust the elements in the vector to meet all the constraints.

Fig. 2. Processing methods of constraints

(2) Inertia weight

The inertia weight[10−12] is a very important parameter in the PSO algorithm. The ideal inertia weight must ensure a good local search around the global optimum for nearby particles, and a good global search for the others. The value of the inertia weight determines the ability of a particle to acquire velocity, which is closely related to updating particles. A new dynamic way of determining the inertia weight is proposed as follows.

$$\omega(i, k) = \omega_{min} + \frac{d_{ig}}{d_{max}} \times \frac{\omega_{max} - \omega_{min}}{MaxNumber} \times (MaxNumber - k) \tag{3}$$

where $d_{max}$ is the maximum Euclidean distance between any particle and $gbest$, $d_{ig}$ is the Euclidean distance between particle $X_i$ and $gbest$, $MaxNumber$ is the maximal number of iterations over which the inertia weight decreases from $\omega_{max}$ to $\omega_{min}$, and $k$ is the current iteration number.

2. Resource optimization

After we get a feasible task scheduling sequence $T^p$ using the PSO algorithm, the next step is to get a resource optimal matrix $R_t$ that obtains the best performance by making a choice of schemes. To find the best resource optimal matrix $R_t$, the TS[13,14] is used. Since the resource matrix $R_t$ is determined by the sequence $T^p$ and the schemes of the tasks in $T^p$, the three main factors of TS, namely the initial solution, the neighborhood solutions and the taboo list, can be defined directly.

First, the initial solution matrix $R_t$ consists of a randomly selected test scheme $w^j_{rand}$ ($1 \le rand \le k_j$) from the test scheme set $W^j$ of each test task $t_j$, which is an element of the test sequence $T^p$. Second, the neighborhood solutions of $R_t$ include all the solutions in which only one scheme of a task is changed compared with $R_t$. The neighborhood solution size of $R_t$ is $Size = \sum_{i=1}^{L} K(T^p[i]) - L$, where $K(T^p[i])$ denotes the number of test schemes of the $i$th element in $T^p$, and $L$ denotes the length of vector $T^p$. Third, a taboo list entry is defined as $[w^j_a, w^j_b]$, where $w^j_a$ and $w^j_b$ are scheme numbers of task $t_j$. It denotes an alteration between $w^j_a$ and $w^j_b$. The taboo list length is set to $(1.0 \sim 1.5)L$.

After the introduction of each part of PSO-TS, its pseudo-code is shown in Fig.3.

initialize particles, set parameters;
do
    encode particles and deal with constraints;
    update particle positions according to Eqs.(1) and (2);
    run the taboo algorithm to find an optimum Rt;
    update pbest and gbest;
while (stop criterion is not attained);
output Tp and Rt.

Fig. 3. The pseudo-code of PSO-TS algorithm
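
The particle update, the dynamic inertia weight and the neighborhood size described above can be sketched in Python. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, the inertia weight follows Eq.(3) as read from the text, clamping the velocity to Vmax is an assumption about how the "maximum speed" parameter is used, and decoding by ranking the position values in ascending order is one possible reading of the encoding step.

```python
import math
import random


def inertia_weight(x, gbest, d_max, k, max_number, w_min=0.7, w_max=1.2):
    """Dynamic inertia weight, Eq.(3) as read from the text (assumed form)."""
    d_ig = math.dist(x, gbest)  # Euclidean distance between the particle and gbest
    ratio = d_ig / d_max if d_max > 0 else 0.0
    return w_min + ratio * (w_max - w_min) * (max_number - k) / max_number


def pso_step(x, v, pbest, gbest, w, c1=2.0, c2=2.0, v_max=14.0, rng=random):
    """One velocity and position update per dimension, Eqs.(1) and (2)."""
    new_x, new_v = [], []
    for d in range(len(x)):
        vd = (w * v[d]
              + c1 * rng.random() * (pbest[d] - x[d])
              + c2 * rng.random() * (gbest[d] - x[d]))
        vd = max(-v_max, min(v_max, vd))  # assumption: velocity clamped to [-Vmax, Vmax]
        new_v.append(vd)
        new_x.append(x[d] + vd)
    return new_x, new_v


def decode(x):
    """Assumed decoding: rank positions in ascending order to get a 1-based task order."""
    return [i + 1 for i in sorted(range(len(x)), key=lambda i: x[i])]


def neighborhood_size(scheme_counts):
    """Size = sum_i K(T^p[i]) - L for the taboo search neighborhood."""
    return sum(scheme_counts) - len(scheme_counts)
```

For example, `decode([0.3, -1.2, 2.5])` ranks the second coordinate first, and a sequence whose tasks have 4, 6 and 2 schemes has `neighborhood_size([4, 6, 2])` single-scheme changes available.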

IV. Simulation Experimental Results and Discussion

Fig. 4. Convergence curve

To validate the performance of the PSO-TS algorithm, simulation results of two practical problems are illustrated in this section. The algorithm is compared with existing task scheduling algorithms: TaskScheduler T, Standard genetic algorithm (SGA), and hybrid Genetic algorithm and simulated annealing (GASA).

A PTTSP from a previous study[4] is considered first in Experiment 1. The test task set is T = {t1, t2, ···, t15}, and the resource set is R = {r1, r2, ···, r8}. t1{r1,r3} means that r1 and r3 are used to complete t1. The remaining tasks are t2{r1,r5}, t3{r6}, t4{r1,r6}, t5{r2,r7}, t6{r2,r3}, t7{r2,r4}, t8{r2,r6}, t9{r3}, t10{r5,r6}, t11{r5}, t12{r6,r7}, t13{r6}, t14{r7}, t15{r8}. t13 ≺ t6 and t4 ≺ t10 are task constraints. The test time set of all tasks is τ = {2, 11, 1, 7, 14, 5, 6, 6, 13, 2, 14, 4, 14, 10, 4}. The PSO-TS uses the following parameters: swarm size = 15, ωmax = 1.2, ωmin = 0.7, maximum speed Vmax = 14, personal learning factor c1 and social learning factor c2 both set to 2, PSO maximum iteration = 40, taboo list length = 0, and taboo search maximum iteration = 0. For comparison, SGA and GASA use the same genetic parameters: the population size is 30, the maximum iteration equals 40, the crossover probability is 0.85, the mutation probability is 0.7 and the penalty factor is 1.4. The initial annealing temperature is 1 and the cooling temperature factor is 0.9 in GASA. The comparison with the other algorithms applied in Experiment 1 in terms of time consumption is shown in Table 1, with the convergence curve in Fig.4. Each algorithm runs 100 times on the same hardware platform.

Table 1. Simulation results of different algorithms

Algorithm          Best (result/s)   Average search (time/s)   Optimization rate
TaskScheduler T    34                15.6                      1
SGA                34                12.1                      0.91
GASA               34                13                        0.95
PSO-TS             34                8.75                      0.98

In Experiment 2, a more complicated parallel test task set is shown in Table 2, since the tasks have various schemes. In most cases, tasks have favorable alternatives in the real world. t5 ≺ t6 is the task constraint. The TaskScheduler T, SGA, and GASA are no longer suitable for the multi-scheme problem in PTTSP. Hence, no comparisons are made with PSO-TS. The swarm size is set to 24, ωmax = 1.2, ωmin = 0.7, Vmax = 14, and the personal learning factor c1 and social learning factor c2 are both set to 2. The maximum iteration for PSO is 40, taboo list length = 40, and taboo search maximum iteration = 60. The Gantt chart of a scheduling result using PSO-TS is displayed in Fig.5. As shown in Fig.5, tasks t5 and t6 meet the constraint t5 ≺ t6. In addition, Fig.5 clearly displays that several resources run at the same time to complete different tasks instead of waiting until one task finishes. The test tasks can be completed in 1200s, which is much shorter than serial scheduling, which takes at least 7150s.

Table 2. Test schemes, resource requirements and test time of tasks

Task   Scheme   Resource   Time
t1     1        1, 2       350
       2        1, 4       350
       3        2, 6       270
       4        4, 6       320
t2     1        2, 5, 6    700
       2        3, 5, 6    900
       3        2, 6, 7    700
       4        3, 6, 7    900
       5        2, 6, 9    800
       6        3, 6, 9    900
t3     1        1          300
       2        10         200
t4     1        2, 3       1000
       2        2, 4       1000
       3        2, 7       1100
       4        3, 9       900
       5        4, 9       1000
       6        7, 9       1100
t5     1        3, 4       200
       2        4, 5       600
       3        4, 10      500
t6     1        2          350
       2        7          600
t7     1        1, 4       550
       2        1, 5       750
       3        1, 9       550
       4        4, 8       800
       5        5, 8       800
       6        8, 9       800
t8     1        4          400
       2        10         600
t9     1        3          300
       2        8          600
       3        9          900
t10    1        1          700
       2        3          500
       3        5          300
       4        7          100
t11    1        5, 6       600
       2        6, 7       700
       3        5, 10      1000
       4        7, 10      1000

Fig. 5. Gantt chart of a result using PSO-TS

V. Conclusion

In the current study, the PSO-TS algorithm is proposed to solve PTTSP with constraints in automated test field. The PSO-TS algorithm has the advantage of PSO and the excellence of TS with respect to addressing discrete optimization and combinatorial optimization problems. The PSO-TS algorithm is better than the TaskScheduler T, SGA, and GASA algorithms, both in terms of computing time and optimization rate. At the same time, the PSO-TS algorithm can be used even in multi-scheme cases, whereas the other algorithms cannot.

Nevertheless, the PSO-TS algorithm cannot be utilized in cases where the constraint number is bigger than the task number. Therefore, our future work will aim to find a better encoding and decoding method that can be applied to PSO-TS.

References

[1] http://softwareqatestings.com/software-test-types/parallel-testing.html.
[2] A. Radulescu, C. Nicolescu, A.J.C. van Gemund, P.P. Jonker, “CPR: Mixed task and data parallel scheduling for distributed systems”, Parallel and Distributed Processing Symposium, Proceedings 15th International, San Francisco, CA, USA, pp.1–9, 2001.
[3] Nathan Waivio, “Parallel test description and analysis of parallel test system speedup through Amdahl’s law”, IEEE Autotestcon, Baltimore, MD, pp.735–740, 2007.
[4] Xia Rui, Xiao Mingqing, Cheng Jinjun, Fu Xinhua, “Optimizing the multi-UUT parallel test task scheduling based on multi-objective GASA”, IEEE 8th International Conference on Electronic Measurement and Instruments, Xi’an, China, pp.4-839–4-844, 2007.
[5] Zohar Laslo, Baruch Keren, Hagai Ilani, “Minimizing task completion time with the execution set method”, European Journal of Operational Research, Vol.187, No.3, pp.1513–1519, 2008.
[6] Lukasz Kuszner, Michal Malafiejski, “A polynomial algorithm for some preemptive multiprocessor task scheduling problems”, European Journal of Operational Research, Vol.176, pp.145–150, 2007.
[7] Jiayong Fang, Huihui Xue, Mingqing Xiao, “Parallel test tasks scheduling and resources configuration based on GA-ACA”, Journal of Measurement Science and Instrumentation, Vol.2, No.4, pp.321–326, 2011.
[8] Yang Meng, A.E.A. Almaini, Wang Pengjun, “FPGA placement optimization by two-step unified genetic algorithm and simulated annealing algorithm”, Journal of Electric (China), Vol.23, No.4, pp.632–636, 2006.
[9] J. Kennedy, R.C. Eberhart, “Particle swarm optimization”, Proc. IEEE International Conference on Neural Networks, Perth, WA, Vol.4, pp.1942–1948, 1995.
[10] J. Kennedy, R.C. Eberhart, “A discrete binary version of the particle swarm algorithm”, Proc. IEEE International Conference on Systems, Man, and Cybernetics, pp.4104–4109, 1997.
[11] Y. Shi and R. Eberhart, “A modified particle swarm optimizer”, IEEE International Conference on Evolutionary Computation Proceedings, Anchorage, AK, USA, pp.69–73, 1998.
[12] J. Kennedy, “Small world and mega-minds: effects of neighborhood topologies on particle swarm performance”, Congress on Evolutionary Computation, Washington, DC, USA, pp.1931–1938, 1999.
[13] Fred Glover, “Taboo search-Part I”, ORSA Journal on Computing, Vol.1, No.3, pp.190–206, 1989.
[14] Fred Glover, “Taboo search-Part II”, ORSA Journal on Computing, Vol.2, No.1, pp.4–32, 1990.

LU Hui received the Ph.D. degree in navigation, guidance and control from Harbin Engineering University, in 2004. She is an associate professor in Beihang University, Beijing. Her research interests include automatic test for electronic equipment, fault diagnosis, intelligent computation and optimization, data analysis and their applications. (Email: [email protected])

CHEN Xiao was born in Province, China, in 1986. He received the M.S. degree in electronic engineering from Beihang University. His research interests include ATS, task scheduling, and system optimization.

LIU Jing received the B.S. degree in electrical engineering from Shandong Normal University. She is currently a M.S. candidate in Beihang University, Beijing. Her research interests include heuristic optimization techniques for test scheduling problems, hybrid evolutionary algorithms.

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

SHIS Model of E-mail Virus Propagation∗

ZHONG Jiang1,2, LI Ang1 and WEN Luosheng3

(1.College of Computer Science, Chongqing University, Chongqing 400044, China) (2.Key Laboratory of Dependable Service Computing in Cyber Physical Society, Ministry of Education, Chongqing 400044, China) (3.College of Mathematics and Statistics, Chongqing University, Chongqing 400044, China)

Abstract — By analyzing the characteristics of many E-mail viruses in reality, we address an SHIS (Susceptible-hidden-infected-susceptible) model in this paper. In our model, on the one hand, the state H is introduced, which denotes that a user has received some E-mails with virus but has not activated them, so they are not infectious. On the other hand, the topology of the E-mail network is considered. The model not only describes the practical condition of E-mail virus propagation better than existing models, but also makes it possible to analyze the users’ behavior. By analyzing the rate equation of the model, we study the epidemic threshold and the equilibrium point. We also present the relationship between the infected density and two important parameters: the percentage of activating E-mail with virus and the frequency with which users check the E-mail box. Finally, some numerical simulations are also presented to show the correctness of the theoretical analysis. Our model would help to understand and control E-mail virus spreading.

Key words — E-mail virus, Complex networks, Susceptible-hidden-infected-susceptible (SHIS) model.

I. Introduction

As a convenient and efficient way of communication, E-mail has been widely used by computer users for the last 15 years. Meanwhile, it is becoming the primary target of virus attack. During the latest decade, E-mail viruses have caused a series of security issues on the Internet. According to the annual reports on computer viruses, E-mail has been viewed as the major spreading medium for computer viruses, and many E-mail viruses, such as “Melissa”, “I love you”, “Sobig. F”, “Bagle” and “MyDoom”, were once the strongest and most widespread viruses. The first mathematical model studying computer virus propagation was addressed by Kephart and White[1] in 1991. In their model, the population (or computers) is divided into susceptible (S) and infected (I) by their states. Noticing the spreading characteristics of E-mail viruses, Ref.[2] analyzed an SHIR (Susceptible-hidden-infected-removed) model and an SHIS (Susceptible-hidden-infected-susceptible) model by adding the hidden state to the SIR and SIS models, respectively. The models mentioned above analyzed epidemic spreading only on a homogeneous network.

Recently, a large number of studies indicate that the network topology has a great influence on epidemic spreading. The Internet, the WWW (World Wide Web) network, the MSN (the Microsoft Network) network, and the E-mail network have been proven to follow the scale-free and small-world properties, that is, the distribution of connectivity of nodes follows the power law[3] and the network has a small mean distance. By analyzing practical data from the server log files of E-mail, the distributions of both sending and receiving E-mails are found to follow the power law[4−6]. Pastor-Satorras, Vespignani and Boguñá[7−8] studied an SIS model on the scale-free network and found that there exists no positive epidemic threshold on the uncorrelated and correlated scale-free network. At present, some researchers[9] focus on the epidemic spreading models and the corresponding control methods on various types of networks. Masuda and Konno[10] addressed a series of models with multiple states and complex transition rules.

In this paper, the process of E-mail virus spreading on the scale-free network is analyzed, and an SHIS model based on the E-mail network is presented. It should be noticed that the state H differs from the state E (exposed) in the SEIS and SEIRS models[11], in which an exposed node becomes infected after a latent period. Applying the rate equation approach, we study the epidemic threshold and the equilibrium point. We also present the relationship between the infected density and two important parameters that are closely related to users’ behaviors: the percentage of activating E-mail with virus and the frequency with which users check the E-mail box. Finally, some numerical simulations are also presented to show the correctness of the theoretical analysis.

II. The SHIS Model

Generally, there are two ways through which an E-mail virus can spread. One is through the hypertexts in the E-mail, which include the malicious script codes or the references to malicious programs. The other is through the attachments in

∗Manuscript Received Aug. 2011; Accepted Sept. 2011. This work is supported by the Natural Science Foundation Project of CQCSTC (No.2010BB2046, No.2009BB2184), the National Basic Research Program of China (973 Program) (No.2011CB302602), the Fundamental Research Funds for the Central Universities (No.CDJZR10180025) and the Third Stage Building of “211 Project” (No.S-10218).

the E-mail, which include some malicious programs. No matter in which way the E-mail virus spreads, there exist three steps in the process of E-mail virus spreading. Firstly, the virus gets the destination addresses. Commonly, the virus searches the html, wab, dbx and other types of documents in the local drives and extracts the destination addresses from these documents. Secondly, the virus sends itself as an attachment through a variety of ways. Usually, it sends itself to the destination addresses through SMTP (Simple mail transfer protocol). The final step is to get control of the systems in the destination computers. When a computer opens an E-mail including a virus, the virus could get control of the system under some conditions, and then it will repeat the above process.

Comparing with the process of biological epidemic spreading, one can find that the computers also have three states during the process of E-mail virus spreading: susceptible, hidden and infected. When an E-mail including a virus has been sent from a source computer to a destination address, it will be forwarded to the destination POP3 (Post office protocol 3) server. At this point, the destination computer’s state is susceptible (S). After the user logs in and accesses the POP3 server to receive the E-mail, the state of the computer will be transferred to hidden (H). If the user opens the malicious E-mail carelessly and the virus is activated, the state of the computer will be changed to infected (I). According to the observation of the E-mail forwarding procedure and the behaviors of users’ sending and receiving E-mails, we could get an SHIS model for the E-mail virus spreading, as shown in Fig.1. The transition from S to H means that the user has received a malicious E-mail. The transition from H to S indicates that some users could remove the malicious E-mail before the virus is activated. The transition from H to I implies that the virus has been activated and the computer is infected by the virus. The transition from I to S represents that the user has found the solution to clean the virus or the system has been installed again.

Fig. 1. Flow of malicious objects on the E-mail network

It is found that the spreading of some E-mail viruses depends on the E-mail address lists. The infected computer sends E-mails to the users in its address list stochastically. As described above, the E-mail virus spreads on the network based on the address lists, which is called the E-mail network. Obviously, the spreading model will be influenced by the properties of the E-mail network. Many study results have shown that the E-mail network is a scale-free network, which means that the distribution of user number in the address list follows the power law, $P(k) \sim k^{-\gamma}$, where $k$ indicates the connectivity degree of a given user and $\gamma$ is the scaling exponent.

So, we can formulate the rate equation of the E-mail virus spreading as follows:

$$\begin{cases} \dot{H}_k = \lambda k (1 - H_k - I_k)\Theta_I - C H_k \\ \dot{I}_k = pC H_k - r I_k \end{cases} \tag{1}$$

where $H_k$ and $I_k$ represent the density of the hidden and infected nodes with the connectivity degree $k$, respectively. The parameters $\lambda$, $C$, $p$ and $r$ are the infection rate, the speed of processing E-mail, the percentage of activating E-mail virus and the recovery rate, respectively. Without loss of generality, we set $r = 1$. The first equation of Eq.(1) represents the variation of the proportion of the hidden nodes with degree $k$. The term $1 - H_k - I_k$ represents the probability that a randomly chosen node is susceptible. $\Theta_I = \frac{1}{\langle k \rangle} \sum_k k I_k P(k)$ is the probability that any given link points to an infected node, where $\langle k \rangle$ represents the mean degree of the E-mail network.

In order to find the equilibrium of Eq.(1), let the right-hand side equal zero and one gets:

$$\begin{cases} H_k = \dfrac{1}{pC} I_k \\ I_k = \dfrac{\lambda p C k \Theta_I}{C + (pC + 1)\lambda k \Theta_I} \end{cases} \tag{2}$$

where

$$\Theta_I = \frac{1}{\langle k \rangle} \sum_k k I_k P(k) \tag{3}$$

Substituting the second equation of Eq.(2) into Eq.(3), one could get the self-consistent equation about $\Theta_I$,

$$\Theta_I = \frac{1}{\langle k \rangle} \sum_k \frac{\lambda p C k^2 \Theta_I P(k)}{C + (pC + 1)\lambda k \Theta_I} \tag{4}$$

It is obvious that $\Theta_I^0 = 0$ is a solution of Eq.(4), namely, this system has a disease-free equilibrium. In order to get the positive solution of the equation, define the function

$$f(x) = x - \frac{1}{\langle k \rangle} \sum_k \frac{\lambda p C k^2 x P(k)}{C + (pC + 1)\lambda k x} \tag{5}$$

Obviously, $f(0) = 0$ and $f(1) > 0$ hold true. The derivative of the function is

$$f'(x) = 1 - \frac{1}{\langle k \rangle} \sum_k \frac{\lambda p C^2 k^2 P(k)}{[C + (pC + 1)\lambda k x]^2} \tag{6}$$

By straightforward computations, $f''(x) > 0$. Therefore, function $f(x)$ has one positive solution if and only if $f'(0) = 1 - \lambda p \frac{\langle k^2 \rangle}{\langle k \rangle} < 0$ holds true. So, we have:

Property 1 The epidemic threshold of system (1) is $\lambda_c = \frac{1}{p} \frac{\langle k \rangle}{\langle k^2 \rangle}$. When $\lambda < \lambda_c$, the E-mail virus will die out; otherwise the virus will be prevalent.

Property 1 also implies that there exists no positive epidemic threshold in an infinite size scale-free network ($\gamma \le 3$). This result is similar to Refs.[7, 8].

For a realistic E-mail network, $\gamma$ is between 2 and 2.5[5]. In order to perform explicit calculations, we use a continuous $k$ approximation that allows practical substitution of series with integrals. The full connectivity distribution could be written as

$$P(k) = (\gamma - 1) m^{\gamma - 1} k^{-\gamma} \tag{7}$$

where $m$ is the minimum degree at each node. The mean degree is

$$\langle k \rangle = \int_m^{\infty} k P(k)\, dk = (\gamma - 1) m / (\gamma - 2)$$

Substituting Eq.(7) into Eq.(4) and replacing the series with integrals, the positive solution $\Theta_I^*$ of Eq.(4) satisfies

$$\Theta_I^* = \frac{1}{\langle k \rangle} \int_m^{\infty} \frac{(\gamma - 1) m^{\gamma - 1} \lambda p C \Theta_I^*}{[C + (pC + 1)\lambda k \Theta_I^*]\, k^{\gamma - 2}}\, dk \tag{8}$$

We can present the infected density at steady state

$$\rho_I^* = \int_m^{\infty} P(k) I_k\, dk = \int_m^{\infty} \frac{(\gamma - 1) m^{\gamma - 1} \lambda p C \Theta_I^*}{[C + (pC + 1)\lambda k \Theta_I^*]\, k^{\gamma - 1}}\, dk \tag{9}$$

Noticing

$$\frac{\lambda p C \Theta_I^*}{[C + (pC + 1)\lambda k \Theta_I^*]\, k^{\gamma - 2}} = \frac{pC}{pC + 1} \cdot \frac{1}{k^{\gamma - 1}} - \frac{C}{(pC + 1)\lambda \Theta_I^*} \cdot \frac{\lambda p C \Theta_I^*}{[C + (pC + 1)\lambda k \Theta_I^*]\, k^{\gamma - 1}} \tag{10}$$

we have

$$\Theta_I^* = \frac{pC}{pC + 1} - \frac{\gamma - 2}{(\gamma - 1) m} \cdot \frac{C}{(pC + 1)\lambda \Theta_I^*}\, \rho_I^* \tag{11}$$

As a result, we could calculate the density of infected nodes through Eq.(11), and this yields

$$\rho_I^* = \langle k \rangle \lambda \Theta_I^* \left[ p - (p + 1/C)\Theta_I^* \right] \tag{12}$$

According to Eq.(2), we find $H_k^* = \frac{1}{pC} I_k^*$, which implies that the density of hidden nodes is $1/pC$ times that of infected nodes. So we can also get the densities of hidden nodes and susceptible nodes at steady state

$$\rho_H^* = \frac{1}{pC} \rho_I^* \tag{13}$$

and

$$\rho_S^* = 1 - \rho_H^* - \rho_I^* \tag{14}$$

Consequently, we have:

Property 2 When the E-mail virus is prevalent, the infected density is expressed as Eq.(12) and the hidden density is $1/pC$ times as large as the infected density.

Finally, we research the relationship between the percentage $p$ and the infection density $\rho_I^*$ when the infection rate is sufficiently small. Applying the Gauss hypergeometric function, Eq.(4) can be written as

$$\Theta_I = \frac{1}{\langle k \rangle} \int_m^{\infty} \frac{(\gamma - 1) m^{\gamma - 1} \lambda p C \Theta_I}{[C + (pC + 1)\lambda k \Theta_I]\, k^{\gamma - 2}}\, dk = \frac{pC}{pC + 1}\, F\!\left(1, \gamma - 2, \gamma - 1, -\frac{C}{m (pC + 1) \lambda \Theta_I}\right) \tag{15}$$

We consider the approximation of the Gauss hypergeometric function when $z \to 0$[12]:

$$F\!\left(1, \gamma - 2, \gamma - 1, -\frac{1}{z}\right) \approx \frac{(\gamma - 2)\pi}{\sin((\gamma - 2)\pi)}\, z^{\gamma - 2} \tag{16}$$

Thus, we have the positive solution of Eq.(15)

$$\Theta_I^* \approx \frac{pC}{pC + 1} \left[ \frac{(\gamma - 2)\pi}{\sin((\gamma - 2)\pi)} \right]^{1/(3 - \gamma)} (m p \lambda)^{(\gamma - 2)/(3 - \gamma)} \tag{17}$$

As a result, we have

$$\rho_I^* \approx \frac{pC (\gamma - 1)}{(pC + 1)(\gamma - 2)} \left[ \frac{(\gamma - 2)\pi}{\sin((\gamma - 2)\pi)}\, m p \lambda \right]^{1/(3 - \gamma)} \tag{18}$$

and

$$\rho_H^* \approx \frac{\gamma - 1}{(pC + 1)(\gamma - 2)} \left[ \frac{(\gamma - 2)\pi}{\sin((\gamma - 2)\pi)}\, m p \lambda \right]^{1/(3 - \gamma)} \tag{19}$$

Note In the SHIS model, there are two parameters describing user behaviors: the percentage $p$ and the frequency $C$. According to Property 1, Property 2, Eqs.(18) and (19), the percentage $p$ affects the epidemic threshold and the hidden and infected densities. As the percentage $p$ increases, the epidemic threshold decreases, and the infected density and hidden density increase. On the other hand, increasing the frequency $C$ increases the infected density and decreases the hidden density.

III. Simulation Results

In order to show the correctness of the previous conclusions, some numerical simulations are conducted in this section. Firstly, an E-mail network is constructed. We use the algorithm of Albert and Barabási in Ref.[12]. The network consists of 100,000 nodes, and the mean degree is 6 (according to Ref.[5], the amount is 6.6). So we set the scaling exponent $\gamma = 2.5$ and the minimum degree $m = 2$.

Secondly, the simulations of the E-mail virus spreading are performed on the network. Four nodes are randomly selected as initial infected nodes, and other nodes are susceptible. It is assumed that E-mail boxes are checked by users every 200 minutes. 90 percent of users do not activate the E-mails with virus and 10 percent of users activate the malicious E-mails, which leads to the virus spreading. We further assume that users will find and clear the virus after their computers have been infected for 1000 minutes. We take 1000 minutes as unit time, so the other parameters are set as $\lambda = 0.1$, $C = 5$, $p = 0.1$ in the SHIS model.

Fig.2 shows the effect of people’s behavior in dealing with E-mail, that is, the percentage $p$. We change percentage $p$ from 10 percent to 90 percent and fix other parameters. The full and dash curves are theoretical results of Eqs.(18) and (19), which are close to the simulation results (the average of 100 simulations). Obviously, the infected density increases as users open unfamiliar E-mails more imprudently. Meanwhile, the hidden density also increases with percentage $p$, but by less than the infected density does.

Fig.3 shows the effect of frequency $C$. We change frequency $C$ from 1 to 10 and fix other parameters. The full and dash curves are theoretical results of Eqs.(18) and (19), which are close to the simulation results. The more frequently users check the E-mail box, the more probably they are infected.
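
The threshold of Property 1 and the steady state defined by Eqs.(2), (4) and (5) can also be evaluated numerically on a discrete degree distribution. The following Python sketch is illustrative only and is not the simulation code used in the paper: the function names are hypothetical, the degree distribution is a truncated discrete power law with a cutoff `k_max` chosen here for illustration, and the positive root of f(x) is found by bisection, which relies on f being negative just above zero and positive at x = 1 when λ exceeds the threshold.

```python
def power_law(gamma, m, k_max):
    """Truncated discrete power law P(k) ~ k^(-gamma) on m..k_max (assumed cutoff)."""
    weights = {k: k ** (-gamma) for k in range(m, k_max + 1)}
    z = sum(weights.values())
    return {k: w / z for k, w in weights.items()}


def threshold(P, p):
    """Epidemic threshold of Property 1: lambda_c = <k> / (p <k^2>)."""
    k1 = sum(k * pk for k, pk in P.items())
    k2 = sum(k * k * pk for k, pk in P.items())
    return k1 / (p * k2)


def f(x, P, lam, C, p):
    """Eq.(5): f(x) = x - (1/<k>) sum_k lam*p*C*k^2*x*P(k) / (C + (pC+1)*lam*k*x)."""
    k1 = sum(k * pk for k, pk in P.items())
    s = sum(lam * p * C * k * k * x * pk / (C + (p * C + 1) * lam * k * x)
            for k, pk in P.items())
    return x - s / k1


def theta_star(P, lam, C, p, tol=1e-12):
    """Positive solution of Eq.(4) by bisection, or 0.0 below the threshold."""
    lo, hi = tol, 1.0
    if lam <= threshold(P, p) or f(lo, P, lam, C, p) >= 0:
        return 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid, P, lam, C, p) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def densities(P, lam, C, p):
    """Steady-state infected and hidden densities from Eq.(2) and Eq.(13)."""
    th = theta_star(P, lam, C, p)
    rho_i = sum(pk * lam * p * C * k * th / (C + (p * C + 1) * lam * k * th)
                for k, pk in P.items())
    return rho_i, rho_i / (p * C)
```

For instance, `P = power_law(2.5, 2, 1000)` followed by `densities(P, 2 * threshold(P, 0.1), 5, 0.1)` gives the two densities for an infection rate set to twice the threshold of that finite distribution; because the cutoff makes ⟨k²⟩ finite, the threshold here is positive, unlike in the infinite-size limit.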

Fig. 2. The effect of percentage p. The full and dash curve are theoretical results of Eqs.(18) and (19), respectively. “∗” and “+” are simulation results on the infected and hidden density, respectively

Fig. 3. The effect of frequency C. The full and dash curve are theoretical results of Eqs.(18) and (19), respectively. “∗” and “+” are simulation results on the infected and hidden density

IV. Conclusion

In this paper, an SHIS model about the E-mail virus spreading is addressed. By analyzing the rate equation of the model, we study the epidemic threshold and the equilibrium point. Aiming at two parameters, the percentage p and frequency C, we present the approximate expression on the hidden and infected density. These theoretical results are verified by some numerical simulations which reveal the relationships between the infected density and the two parameters. These results may help us find some measures to prevent the E-mail virus from spreading. Of course, there are other factors that affect the virus spreading, such as the type of E-mail viruses, spreading ways, the user’s strategy and the out-degree and in-degree of nodes in the E-mail network and we shall study them further.

References

[1] J.O. Kephart, S.R. White, “Directed-graph epidemiological models of computer viruses”, in Proc. of IEEE Symposium on Security and Privacy 91’, Oakland: IEEE Computer Society, pp.343–359, 1991.
[2] Y. Hayashi, M. Minoura, J. Matsukubo, “Oscillatory epidemic prevalence in growing scale-free networks”, Physical Review E, Vol.69, No.1, pp.016112-8, 2004.
[3] R. Albert, A.L. Barabási, “Statistical mechanics of complex networks”, Reviews of Modern Physics, Vol.74, No.1, pp.47–94, 2002.
[4] L.S. Liebovitch, I.B. Schwartz, “Information flow dynamics and timing patterns in the arrival of email viruses”, Physical Review E, Vol.68, No.1, pp.017101-4, 2003.
[5] N. Schwartz, R. Cohen, D. Ben-Avraham et al., “Percolation in directed scale-free networks”, Physical Review E, Vol.66, No.1, pp.015104-4, 2002.
[6] M.E.J. Newman, S. Forrest, J. Balthrop, “Email networks and the spread of computer viruses”, Physical Review E, Vol.65, No.1, pp.035101-4, 2002.
[7] R. Pastor-Satorras, A. Vespignani, “Epidemic spreading in scale-free networks”, Physical Review Letters, Vol.86, No.14, pp.3200–3223, 2001.
[8] M. Boguñá, R. Pastor-Satorras, A. Vespignani, “Absence of epidemic threshold in scale-free networks with degree correlations”, Physical Review Letters, Vol.90, No.2, pp.028701-4, 2003.
[9] H.J. Shi, Z.S. Duan, G.L. Chen, R. Li, “Epidemic spreading on networks with vaccination”, Chinese Physics B, Vol.18, No.8, pp.3309–3317, 2009.
[10] N. Masuda, N. Konno, “Multi-state epidemic processes on complex networks”, Journal of Theoretical Biology, Vol.243, No.1, pp.64–75, 2006.
[11] L.Q. Gao, J. Mena-Lorca, H.W. Hethcote, “Four SEI endemic models with periodicity and separatrices”, Mathematical Biosciences, Vol.128, No.1, pp.157–184, 1995.
[12] R. Albert, A.L. Barabási, “Topology of evolving networks: local events and universality”, Physical Review Letters, Vol.85, No.24, pp.5234–5237, 2000.

ZHONG Jiang is an associate professor in the College of Computer Science at University of Chongqing, Chongqing, China. He graduated in computer application from University of Chongqing, China in 1995. He received the M.S. degree and Ph.D. degree in computer science from University of Chongqing in 2001 and 2005 respectively. He worked as a visiting scholar in University of Queensland from June 2008 to June 2009. His major areas of research interests include data mining, network security and knowledge management. (Email: [email protected])

LI Ang received the B.S. degree in computer science from Chongqing University in 2009. He is currently a graduate student in computer science at Chongqing University. His research interests include complex networks, knowledge engineering and information security.

WEN Luosheng is an associate professor in the College of Mathematics and Statistics at University of Chongqing, Chongqing, China. He graduated in mathematics from University of Chongqing, China in 1998. He received the M.S. degree and Ph.D. degree in computer science from University of Chongqing in 2005 and 2008 respectively. He worked as a visiting scholar in University of Clarkson from August 2010 to August 2011. His major areas of research interests include data mining and complex networks.

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

A Convex Approach for Local Statistics Based Region Segmentation∗

MA Liyan and YU Jian

(School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China)

Abstract — A convex active contour model based on local image statistics is proposed in this paper. By assuming that the intensity distribution of the image pixels in a window is described by a Gaussian distribution, our model is able to segment images with intensity inhomogeneity. Due to the convexity of the proposed model, we introduce a dual formulation to solve the minimization problem and obtain a much more efficient method. Experiments show that the segmentation results of the proposed method are similar to those of the non-convex method based on local statistics, but our method is much more efficient.

Key words — Image segmentation, Active contour, Intensity inhomogeneity, Level set method, Local statistics.

I. Introduction

Image segmentation is a fundamental problem in image processing and computer vision. It is a crucial preliminary step for subsequent object recognition and interpretation[1−3] and its goal is to partition a given image into meaningful areas. Extensive studies have been carried out and many techniques have been proposed, among which active contour models for image segmentation have achieved great success. Generally speaking, the existing active contour models can be classified into two categories: edge-based models[4−6] and region-based models[7−11].

The first active contour model is edge-based and was proposed by Kass et al.[5]. Later, Caselles et al.[4] proposed the famous GAC model. Edge-based models utilize the image gradient to stop the curve evolution. Therefore, those methods are very sensitive to noise and highly dependent on the initial curve placement. The initial curve should be completely exterior or interior to the desired object boundaries. Those drawbacks can be overcome by the region-based methods. The advantage of the edge-based method is its robustness to region inhomogeneity.

Region-based models utilize the region statistical information inside and outside the contour rather than just the information along the curve. Therefore, they are less sensitive to noise and to the initial placement of the curve. One of the most successful and pioneering models is the Mumford-Shah method[12]. Based on this model, Chan and Vese[8] proposed the piecewise constant model (named C-V model), which assumes that each region is homogeneous.

The region-based models tend to utilize global region statistical information, and thus usually cannot segment images with intensity inhomogeneity, which have spatially varying statistics. Intensity inhomogeneity often exists in medical images, such as Magnetic resonance imaging (MRI) images, due to the imperfection of imaging devices. Recently, many active contour models have been proposed to handle this problem[13,14]. These methods segment the images based on local region statistics. Among them, the Region-scalable fitting (RSF) model[13] uses local intensity means. With a level set approach, the RSF model can efficiently extract the object boundaries. However, in some cases, the local intensity means cannot provide enough information for accurate segmentation, as shown in Fig.1. In Ref.[15], both local intensity means and variances are used to characterize the local intensity statistics. Obviously, this method introduces much computational cost.

Most of the existing active contour models for segmenting images with intensity inhomogeneity are non-convex. This usually causes the algorithms to end in local minima and converge slowly. Recently, Yang et al.[14] proposed a global RSF model by incorporating the global convex C-V method[7] into the RSF model and used the split Bregman technique to solve the model. This method is more efficient than the RSF model. However, the global RSF model also fails to segment images with intensity inhomogeneity in some cases, because it just utilizes the local intensity means as the original RSF model does.

In this paper, we propose a convex model in which the intensity distribution of the image pixels in a window is described by a Gaussian distribution. Using a dual formulation to solve the minimization problem, we obtain a more efficient method than the non-convex method based on local statistics.

II. Related Works

1. The global convex C-V model

For a given image $I : \Omega \to R$ in domain $\Omega$ and a closed curve $C(s) : [0, 1] \to R^2$, the energy functional of the C-V model[8] is defined by:

$$\lambda_1 \int_{\Omega_c} (I - c_1)^2\, dx + \lambda_2 \int_{\Omega \setminus \Omega_c} (I - c_2)^2\, dx + \nu |C| \tag{1}$$

∗Manuscript Received Mar. 2011; Accepted Apr. 2012. This work is supported by the National Natural Science Foundation of China (No.60905028, No.61033013) and Beijing Natural Science Foundation (No.4112046).

where |C| is the perimeter of the set Ωc, which is the region inside the contour C, λ1, λ2, ν > 0 are fixed parameters, and c1, c2 are two constants which are the average intensities inside and outside the contour. Using a level set function φ : Ω → R, the energy functional (1) is rewritten as:

$$\nu \int_{\Omega} |\nabla H(\phi(x))|\,dx + \lambda_1 \int_{\Omega} (I - c_1)^2 H(\phi(x))\,dx + \lambda_2 \int_{\Omega} (I - c_2)^2 \bigl(1 - H(\phi(x))\bigr)\,dx \tag{2}$$

where H is the Heaviside function.

One drawback of the model (2) is its non-convexity. Therefore, a solution to this problem can only be guaranteed to be a local minimum. To overcome this problem, Chan et al.[7] proposed the following global convex model for any given c1, c2:

$$\min_{0 \le u \le 1} \int_{\Omega} |\nabla u(x)|\,dx + \int_{\Omega} r\,u(x)\,dx \tag{3}$$

where r = λ1(I − c1)² − λ2(I − c2)². The first term is the Total variation (TV) regularization. Then setting Ωc = {x ∈ Ω : u(x) > μ} for a.e. μ ∈ [0, 1], we can get the solution. In this paper, we simply choose μ = 0.5.

2. Global minimization of Region-scalable fitting (RSF) energy

Using local intensity means, Li et al.[13] recently proposed a region-based model to segment images with intensity inhomogeneity. With the level set representation, their energy functional (RSF) is written as:

$$\sum_{i=1}^{2} \lambda_i \int_{\Omega} \left( \int_{\Omega} K_\sigma(x - y)\,|I(y) - f_i(x)|^2\, M_{i,\varepsilon}(\phi(y))\,dy \right) dx + \nu \int_{\Omega} |\nabla H_\varepsilon(\phi(x))|\,dx + \mu \int_{\Omega} \frac{1}{2}\bigl(|\nabla \phi(x)| - 1\bigr)^2 dx \tag{4}$$

where M1,ε(φ(x)) = Hε(φ(x)), M2,ε(φ(x)) = 1 − Hε(φ(x)), and Hε is the regularized version of H. The fitting functions f1(x) and f2(x) respectively approximate the local image intensities inside and outside the contour. Kσ is a kernel function that controls the local region centered at the point x through the standard deviation σ.

Obviously, the energy functional (4) is non-convex. By applying the global convex C-V model (3) to the RSF model and incorporating the information from an edge detector g, Yang et al.[14] proposed the following convex optimization model:

$$\min_{a_0 \le u \le b_0} \int_{\Omega} g(\nabla I(x))\,|\nabla u(x)|\,dx + \int_{\Omega} (\lambda_1 e_1 - \lambda_2 e_2)\,u(x)\,dx \tag{5}$$

where ∇I(x) is the gradient of image I(x), and

$$g(\xi) = \frac{1}{1 + \beta |\xi|^2}, \qquad e_i(x) = \int_{\Omega} K_\sigma(x - y)\,|I(y) - f_i(x)|^2\,dy, \quad i = 1, 2.$$

Although this method is faster than the RSF method, it does not address another disadvantage of the RSF method. The local intensity means do not provide enough information for accurate segmentation, as shown in Fig.1. The original image is a synthetic image containing a rectangular object. The object and the background have the same intensity means but different variances. It can be observed that the global convex C-V method and the global RSF method fail to achieve the correct segmentation.

Fig. 1. Experiments on a synthetic image. From left to right: original image; the image after adding intensity inhomogeneity, with the initial contour; global C-V method; global RSF method

In order to get the correct segmentation, more statistical information has to be taken into account. In the next section, we will use the local statistical information to obtain a more powerful algorithm.

III. The Proposed Method

We assume that the intensity of the pixels in a window follows a Gaussian distribution. Let u_i(x) and σ_i(x)² be the local intensity mean and local intensity variance of the region i at point x respectively. Then, we obtain the following energy functional:

$$E = \sum_{i=1}^{2} \lambda_i \int_{\Omega} \left( \int_{\Omega} K_\sigma(x - y)\, Lp_i\, M_{i,\varepsilon}(\phi(y))\,dy \right) dx + \int_{\Omega} |\nabla H_\varepsilon(\phi(x))|\,dx \tag{6}$$

where

$$Lp_i = \log\bigl(\sqrt{2\pi}\,\sigma_i(x)\bigr) + \frac{(I(x) - u_i(x))^2}{2\sigma_i(x)^2}.$$

By keeping the level set function φ fixed, the minimizers of the variables u_i(x) and σ_i(x)² are:

$$u_i(x) = \frac{\int_{\Omega} K_\sigma(x - y)\, M_{i,\varepsilon}(\phi(y))\, I(y)\,dy}{\int_{\Omega} K_\sigma(x - y)\, M_{i,\varepsilon}(\phi(y))\,dy}$$

$$\sigma_i(x)^2 = \frac{\int_{\Omega} K_\sigma(x - y)\,(I(y) - u_i(x))^2\, M_{i,\varepsilon}(\phi(y))\,dy}{\int_{\Omega} K_\sigma(x - y)\, M_{i,\varepsilon}(\phi(y))\,dy}$$

For fixed local statistic descriptors u_i(x) and σ_i(x)², minimizing the energy functional (6) with respect to φ, we get the gradient descent flow:

$$\frac{\partial \phi}{\partial t} = -H'_\varepsilon(\phi)\,(\lambda_1 d_1 - \lambda_2 d_2) + H'_\varepsilon(\phi)\,\mathrm{div}\!\left(\frac{\nabla \phi}{|\nabla \phi|}\right)$$

where

$$d_i = \int_{\Omega} K_\sigma(x - y)\, Lp_i\,dy, \quad i = 1, 2.$$

Then incorporating the global convex C-V model (3) and the information from the edge detector, we define the global convex model based on local statistic descriptors as:

$$\min_{0 \le u \le 1} \int_{\Omega} g\,|\nabla u(x)|\,dx + \int_{\Omega} (\lambda_1 d_1 - \lambda_2 d_2)\,u(x)\,dx \tag{7}$$

The standard approach to minimize the model is to solve the Euler-Lagrange equation associated with Eq.(7):

$$-\mathrm{div}\!\left(g\,\frac{\nabla u}{|\nabla u|}\right) + (\lambda_1 d_1 - \lambda_2 d_2) = 0 \tag{8}$$

Obviously, due to the presence of the term 1/|∇u|, Eq.(8) is not well defined at points where ∇u = 0. To overcome this limitation, we introduce an auxiliary variable v to obtain the regularization of Eq.(7):

$$\min_{u,\; 0 \le v \le 1} \int_{\Omega} g\,|\nabla u(x)|\,dx + \frac{1}{2\theta} \int_{\Omega} \bigl(u(x) - v(x)\bigr)^2 dx + \int_{\Omega} (\lambda_1 d_1 - \lambda_2 d_2)\,v(x)\,dx \tag{9}$$
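As a rough illustration of the local descriptors defined in this section, the following Python sketch computes u_i, σ_i² and the data terms d_i on a 1-D signal whose two halves share the same mean but differ in variance. It is a sketch under stated assumptions, not the authors' implementation: the 1-D setting, the arctan form of Hε, the replicated borders, the variance floor and all function names are illustrative choices that do not come from the paper. The Gaussian log-likelihood is evaluated for the pixel I(y) under the models centred at neighbouring points x, as in the local Gaussian fitting energy of Ref.[15]; this is one reading of Lp_i and should be treated as an assumption.

```python
import math

def gaussian_kernel(sigma):
    """Truncated 1-D Gaussian with window 4*sigma + 1, normalised to sum to 1."""
    radius = int(round(2 * sigma))
    weights = [math.exp(-(k * k) / (2.0 * sigma ** 2)) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(values, kernel):
    """Weighted local average with a symmetric kernel; border samples are replicated (assumption)."""
    radius = len(kernel) // 2
    n = len(values)
    return [sum(w * values[min(max(x + k - radius, 0), n - 1)] for k, w in enumerate(kernel))
            for x in range(n)]

def heaviside(phi, eps=1.0):
    """A commonly used regularised Heaviside function; the paper does not give its form."""
    return 0.5 * (1.0 + (2.0 / math.pi) * math.atan(phi / eps))

def local_stats(image, member, kernel, var_floor=1e-8):
    """Local mean u_i(x) and variance sigma_i(x)^2 of the region with membership M_i."""
    km = smooth(member, kernel)
    kmi = smooth([m * v for m, v in zip(member, image)], kernel)
    kmi2 = smooth([m * v * v for m, v in zip(member, image)], kernel)
    mean = [a / b for a, b in zip(kmi, km)]
    # E[I^2] - u^2 equals the weighted average of (I(y) - u_i(x))^2; the floor is a safeguard.
    var = [max(c / b - u * u, var_floor) for c, b, u in zip(kmi2, km, mean)]
    return mean, var

def data_term(image, mean, var, kernel):
    """d_i(y) = sum_x K(x - y) * [log(sqrt(2*pi)*sigma_i(x)) + (I(y) - u_i(x))^2 / (2*sigma_i(x)^2)]."""
    a = smooth([0.5 * math.log(2.0 * math.pi * v) for v in var], kernel)
    b = smooth([1.0 / (2.0 * v) for v in var], kernel)
    c = smooth([u / (2.0 * v) for u, v in zip(mean, var)], kernel)
    d = smooth([u * u / (2.0 * v) for u, v in zip(mean, var)], kernel)
    return [ai + p * p * bi - 2.0 * p * ci + di
            for p, ai, bi, ci, di in zip(image, a, b, c, d)]

# Toy signal: both halves have mean 0.5, but the second half has a much larger variance.
signal = [0.45 + 0.10 * (i % 2) for i in range(20)] + [0.1 + 0.8 * (i % 2) for i in range(20)]
phi = [19.5 - i for i in range(40)]        # positive on the first half (region 1)
m1 = [heaviside(p) for p in phi]           # M_1 = H_eps(phi)
m2 = [1.0 - m for m in m1]                 # M_2 = 1 - H_eps(phi)
kernel = gaussian_kernel(3.0)

u1, var1 = local_stats(signal, m1, kernel)
u2, var2 = local_stats(signal, m2, kernel)
d1 = data_term(signal, u1, var1, kernel)
d2 = data_term(signal, u2, var2, kernel)
```

In this toy signal the local means are nearly identical on both sides of the contour, so near the boundary the comparison between d1 and d2 is driven by the variance term. This is the information that a model based on local means alone does not have.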

Since functional (9) is still convex, its minimizer can be computed by minimizing (9) with respect to u and v separately.

(1) Keeping v fixed, the solution u of (9) can be obtained by

$$\min_{u} \int_{\Omega} g\,|\nabla u(x)| + \frac{1}{2\theta} \bigl(u(x) - v(x)\bigr)^2\,dx \tag{10}$$

It can be observed that the minimization problem (10) is the same one considered by Chambolle[16] except for the g-weighting of the TV term. Therefore, we can use Chambolle's method to solve the minimization problem (10):

$$p^{n+1} = \frac{p^{n} + \tau \nabla\bigl(\mathrm{div}(p^{n}) - v/\theta\bigr)}{1 + \dfrac{\tau}{g}\,\bigl|\nabla\bigl(\mathrm{div}(p^{n}) - v/\theta\bigr)\bigr|}$$

$$u^{n+1} = v^{n} - \theta\,\mathrm{div}\,p^{n+1}$$

where p is the dual variable and τ is the time step, which should be small enough to guarantee a stable evolution and convergence. As in Ref.[16], we choose τ = 1/8.

(2) Keeping u fixed, the solution v of (9) can be obtained by

$$\min_{0 \le v \le 1} \int_{\Omega} \frac{1}{2\theta} \bigl(u(x) - v(x)\bigr)^2 + (\lambda_1 d_1 - \lambda_2 d_2)\,v(x)\,dx \tag{11}$$

The corresponding Euler-Lagrange equation is

$$v - u + \theta(\lambda_1 d_1 - \lambda_2 d_2) = 0$$

Then, the solution of (11) can be obtained by

$$v^{n+1} = \min\bigl(\max\bigl(u^{n+1} - \theta(\lambda_1 d_1 - \lambda_2 d_2),\, 0\bigr),\, 1\bigr)$$

IV. Experimental Results

In this section we demonstrate the effectiveness of our algorithm by applying it to different image segmentation tasks. Our algorithm is implemented in Matlab2009b on an Intel Core i3-330M, 2.00 GB RAM PC. The values of all pixels of the images are in the range [0, 1]. The kernel Kσ can be truncated as a ω × ω mask; here we choose ω = 4σ + 1 as in Ref.[13]. We set ε = 1, θ = 0.25, β = 100, σ = 6 for all images used in experiments, except that we choose θ = 0.15, β = 10, σ = 3.5 for the image in the last row of Fig.2, and θ = 0.15, σ = 3 for the synthetic image in Fig.4. The parameters λ1 and λ2 are chosen according to the segmented images.

1. Comparisons with the global convex C-V method and the global RSF method

Fig.2 gives the results of our method on the images used in Ref.[14]. The first row shows the original images with the initial contours. The final segmentation results are shown in the second row. We use the following setting of the parameters: λ1 = λ2 = 0.35 for the first image; λ1 = λ2 = 0.25 for the second image; λ1 = λ2 = 0.05 for the third image; λ1 = 0.71 and λ2 = 0.7 for the fourth image. It is obvious that the results of the proposed method are similar to those of the global RSF method.

Fig. 2. Experiments of the proposed method on the images used in Ref.[14]. First row: initial contours; second row: segmentation results

Fig.3 presents the segmentation results of the global convex C-V method[7], the global RSF method[14] and the proposed method on the images used in Ref.[15]. Since the global convex C-V method uses global information to segment the images, it fails to achieve satisfactory segmentations on the images with intensity inhomogeneity. The global RSF method also fails to obtain satisfactory segmentations, because the local intensity means cannot provide enough information for accurate segmentation on those images. The proposed method can successfully extract object boundaries for those images using the same initial contours, and the results are shown in the last column in Fig.3.

We use the following setting of the parameters in this experiment: λ1 = λ2 = 0.1 for the first and third images; λ1 = λ2 = 0.3 for the second image; λ1 = λ2 = 0.2 for the fourth image.

Fig. 3. Comparisons of the proposed method with the other methods. From left column to right column: initial contours; global C-V method; global RSF method; proposed method

2. Comparisons with the non-convex method based on local statistical descriptors

We compare the proposed method with the non-convex method[15].

Fig.4 shows the segmentation results of the non-convex method and the proposed method on the synthetic image used in Fig.1 at 150 × 150 resolution. We set the parameters σ = 3, μ = 1, Δt = 0.1, ν = 0.00005 × 255² for the non-convex method, and λ1 = λ2 = 0.2 for the proposed method. The edge detector is not used in this experiment. We use the same initial contour as that in Fig.1. The iteration number and CPU time for the non-convex method are about 15000 and 16 minutes respectively. However, our method only needs 160 iterations and 9 seconds to achieve this segmentation. It is clear that the proposed method is far more efficient than the non-convex method. Furthermore, comparing the results in Fig.4, it is obvious that the proposed method can achieve better segmentation.

Fig. 4. Experiments on the synthetic image used in Fig.1. From left to right: the non-convex method; proposed method

Fig.5 shows the results of the non-convex method on the images used in Fig.3. The sizes of the images are 109 × 119, 128 × 125, 132 × 107, 100 × 87 respectively. Using the same initial contours as those in Fig.3, this method can achieve segmentation results similar to those of the proposed method. However, from Table 1, it is clear that our method is much faster than the non-convex method.

Fig. 5. The segmentation results of the non-convex method on the images used in Fig.3 respectively

Table 1. Iteration number and CPU time (in seconds)

| | Image 1 | Image 2 | Image 3 | Image 4 |
|---|---|---|---|---|
| Our method | 65 (6.99s) | 55 (7.13s) | 160 (19.51s) | 60 (5.18s) |
| Non-convex method | 2000 (163.37s) | 7000 (676.71s) | 10000 (1246.66s) | 3000 (258.25s) |

We present the segmentation results of the proposed method in Fig.6 on the images used in Fig.3 with different initial contours. Obviously, our method can also successfully extract object boundaries. The iteration number and CPU time (in seconds) are 60 (6.48s), 250 (31.61s), 400 (48.27s), 100 (8.51s) respectively. With those initial contours, the iteration number and CPU time of the non-convex method could be hardly bearable.

Fig. 6. The segmentation results of the proposed method on the images used in Fig.3 with different initial contours. First row: initial contours; second row: the results

V. Conclusions

In this paper, we presented a region-based convex active contour model using the local statistical descriptors. This model is spatially coherent and able to segment images with intensity inhomogeneity. It is more powerful than the global RSF method, which only uses the information of local intensity means. Being convex, the proposed method is guaranteed to yield a globally optimal solution and takes much less computational cost than the non-convex method. The main disadvantage of the proposed method is that the parameters are manually selected. In the future, we will study how to automatically select parameters based on the segmented image.

References

[1] U. Maulik, D. Chakraborty, “A novel semi-supervised SVM for pixel classification of remote sensing imagery”, Int. J. Mach. Learn. & Cyber., DOI: 10.1007/s13042-011-0059-3, 2011.
[2] Y. Tang, P. Yan, Y. Yuan, X. Li, “Single-image super-resolution via local learning”, Int. J. Mach. Learn. & Cyber., Vol.2, No.1, pp.15–23, 2011.
[3] J. Wu, S. Wang, F. Chung, “Positive and negative fuzzy rule system, extreme learning machine and image classification”, Int. J. Mach. Learn. & Cyber., Vol.2, No.4, pp.261–271, 2011.
[4] V. Caselles, R. Kimmel, G. Sapiro, “Geodesic active contours”, Int’l J. Comp. Vis., Vol.22, No.1, pp.1–79, 1997.
[5] M. Kass, A. Witkin, D. Terzopoulos, “Snakes: Active contour models”, Int’l J. Comp. Vis., Vol.1, pp.321–331, 1988.
[6] K. Zhang, S. Xu, W. Zhou, B. Liu, “Active contours based on image laplacian fitting energy”, Chinese Journal of Electronics, Vol.18, No.2, pp.281–284, 2009.
[7] T.F. Chan, S. Esedoglu, M. Nikolova, “Algorithms for finding global minimizers of image segmentation and denoising models”, J. Appl. Math., Vol.66, No.5, pp.1632–1648, 2006.
[8] T.F. Chan, L. Vese, “Active contour without edges”, IEEE Trans. Image Process., Vol.10, No.2, pp.266–277, 2001.
[9] N. He, P. Zhang, “Variational level set image segmentation method based on boundary and region information”, Acta Electronica Sinica, Vol.37, No.10, pp.2215–2219, 2009. (in Chinese)
[10] Q. Wang, Z.K. Pan, W.B. Wei, Y. Wang, “Variational image segmentation on implicit surface using dual method”, Acta Electronica Sinica, Vol.39, No.1, pp.207–212, 2011. (in Chinese)
[11] L. Zhang, L. Zhu, X. Mi, “Localized multi-channel level set segmentation combined with gabor texture feature”, Acta Electronica Sinica, Vol.39, No.7, pp.1569–1574, 2011. (in Chinese)
[12] D. Mumford, J. Shah, “Optimal approximation by piecewise smooth function and associated variational problems”, Commun. Pur. Appl. Math., Vol.42, No.5, pp.577–685, 1989.
[13] C. Li, C.Y. Kao, J.C. Gore, Z. Ding, “Implicit active contours driven by local binary fitting energy”, CVPR, Minnesota, USA, pp.430–436, 2007.
[14] Y. Yang, C. Li, C. Kao, S. Osher, “Split bregman method for minimization of region-scalable fitting energy for image segmentation”, In International Symposium on Visual Computing, Las Vegas, Nevada, USA, Vol.6454, pp.117–128, 2010.
[15] L. Wang, L. He, C. Li, “Active contours driven by local Gaussian distribution fitting energy”, Signal Processing, Vol.89, No.12, pp.2435–2447, 2009.
[16] A. Chambolle, “An algorithm for total variation minimization and applications”, J. Math. Imaging and Vis., Vol.20, No.1-2, pp.89–97, 2004.

MA Liyan received B.S. and M.S. degrees from the School of Mathematics and Physics of China University of Geosciences in 2005 and 2008. Now she is a Ph.D. candidate in the School of Computer Science and Information Technology of Beijing Jiaotong University. Her research interests include image processing based on variational method and machine learning. (Email: [email protected])

YU Jian received the B.S. degree in applied mathematics, M.S. degree in mathematics, and Ph.D. degree in applied mathematics from Peking University, Beijing, China, in 1991, 1994, and 2000, respectively. During 1994–1998, he joined the faculty of Beijing Graduate School, China University of Mining and Technology. Currently, he is a Professor and the Head of Institute of Computer Science, Beijing Jiaotong University. His current research interests include fuzzy clustering, pattern recognition, and data mining.

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

A Novel Boosted Charge Transfer Circuit for High Speed Charge Domain Pipelined ADC∗

CHEN Zhenhai1,2, YU Zongguang1,2, HUANG Songren1,2, JI Huicai2 and ZHANG Hong3 (1.Wide Bandgap Semiconductor Technology Disciplines State Key Laboratory, School of Microelectronics, Xidian University, Xi’an 710071, China) (2.China Electronic Technology Group Corporation, No.58 Research Institute, 214035, China) (3.School of Electronics and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China)

Abstract: A novel Boosted charge transfer (BCT) circuit is proposed for Bucket-brigade devices (BBDs) based charge-domain (CD) pipelined Analog-to-digital converter (ADC). It can significantly lower the sensitivity to Process, voltage and temperature (PVT) variations of the traditional BCT circuit, which makes it possible to eliminate the Common mode (CM) charge control circuit in the existing CD pipelined ADC. With the proposed BCT circuit, a prototype ADC is realized in a 0.18µm CMOS process without using any common mode charge control techniques, with only 27mW power consumption at 1.8 V supply. It achieves Spurious free dynamic range (SFDR) of 67.7 dB, Signal-to-noise-and-distortion ratio (SNDR) of 55.8 dB and Effective number of bits (ENOB) of 9.0 for a 3.79 MHz input at full sampling rate. The Differential nonlinearity (DNL) is +0.5/−0.3 LSB, and the Integral nonlinearity (INL) is +0.7/−0.55 LSB.

Key words: Pipelined analog-to-digital converter, Charge domain, Low power, Charge transfer circuit.

I. Introduction

With the rapid development of CMOS technology and design methodology in the past decades, the performances of reported pipelined ADCs have improved continuously. With various creative design solutions, pipeline ADCs based on conventional Switched-capacitor (SC) circuitry have achieved a resolution of up to 16 bit with sampling rate of over 160-MSPS[1,2]. But one restriction associated with the two reported designs is that both of them rely on high-gain operational amplifiers (op-amps) with large bandwidth to ensure precision and speed, which brings two bottlenecks for further improvement of ADC speed and resolution. The first is the high power consumption: the power consumption of both ADCs is above 1W, and the power consumption for further improvement of speed and resolution will increase dramatically. The second is that both of the designs are implemented in BiCMOS processes, which makes it impossible for them to fully exploit the benefits of advanced deep submicron CMOS technology for embedded use.

In order to alleviate the power consumption and process limitations on the design of high-speed and high-resolution pipelined ADCs based on the traditional op-amp based SC scheme, many targeted technologies have been proposed and verified to be effective in past years, such as the digital calibration assisted SC pipelined ADCs[3−7] and comparator or zero-crossing based SC pipelined ADCs[8,9]. But the calibration algorithms of the first technique are complex and consume considerable die area and power, and the second technique suffers from limited dynamic range. The BBD-based CD pipelined ADCs offer another alternative that eliminates the use of op-amps[10], which could overcome the limitations described above. But conventional BBD-based CD pipelined ADCs are severely limited in speed and accuracy due to the charge operation inaccuracy of the charge transfer circuit[11]. By introducing a BCT circuit, the charge transfer speed and accuracy of BBD are improved greatly[12,13]. However, the output charge amount of the BCT is influenced remarkably by PVT variations, which would cause large fluctuations in the CM charge in each conversion stage when the BCTs are used in CD pipelined ADCs. The CM charge errors reduce the input signal range of the ADC extensively. Furthermore, as the CM charge error accumulates stage by stage, several stages at the backend of the pipeline may fail to function when the error increases to a given extent. In order to relieve the problem, complicated CM charge control techniques are introduced in Ref.[12] to enhance the performance, which also increases the design complexity and power consumption.

In order to overcome the PVT sensitivity of BCT and eliminate the CM charge control circuit, a Novel BCT (NBCT) that adopts a feedback network composed of a Differential difference amplifier (DDA) and a voltage reference to reject the output charge errors caused by PVT variations is proposed. A low power 10-bit 125-MSPS CD pipelined ADC based on the proposed NBCT is achieved in 0.18µm CMOS process without using any CM charge control techniques.

∗Manuscript Received Oct. 2011; Accepted May 2012. This work is supported by the National Natural Science Foundation of China (No.61076031, No.61106027, No.60832001) and the 333 Talent Project of Jiangsu Province, China (No.BRA2011115).

II. Common Mode Analysis of stage as BBD-based CD Pipelined Stage Qcmout =(Qi + Cs · ΔVdac +2Qc +2Qicm − Qi

The structure of typical BBD-based CD pipelined stage + Cs · (VFdac − ΔVdac))/2 and its operation waveform are shown in Fig.1[14] .Asingle- =Qicm + Cs · VFdac/2+Qc (2) ended configuration is used to simplify the description of the a CD pipelined stage. Fig.1( ) is the circuit diagram of CD where VFdac is the full output range of Sub-DAC, Qicm is pipelined stage. It is composed of a charge storage node Xn, the input CM charge. Eq.(2) illustrates that the output CM two charge storage capacitors Cc and Cs, a sub analog-to- charge is composed of three parts, the input CM charge, the digital converter (sub-ADC), a sub digital-to-analog converter Sub-DAC incremental charge Cs · VFdac/2 and the constant (sub-DAC), a charge transfer circuit and a reset switch Sr. charge Qc introduced during the charge transfer process. Ide- b t Fig.1( ) illustrates the operation waveform. At 0, when the ally, the output CM charge Qcmout of all sub-stage in the CD charge transfer circuit of the pre-stage is active, Qi begins to pipelined ADC should be kept constant, due to the PVT varia- transfer into Xn, since charge is transferred by electron, the tion the CM charge Qcmout will fluctuate and reduce the input voltage of Xn (Vn(t0)) will drop gradually, when the charge signal range of the ADC extensively. As the charge capacity injecting into node Xn. At t1, charge transferring of Qi is of a particular CD pipelined sub-stage is constant, identify the finished, assuming no charge leakage, voltage of Xn will keep charge capacity as Q = Qid + Qcmout,whereQid is the differ- Q constant, sub-ADC would begin to quantize i and generate ential input charge. As Q is a constant, if Qcmout is too large, the quantization results D(n). 
At t2, when quantization has then the maxim range of the input charge signal is restricted; finished, D(n) would be passed to the digital error correction if Qcmout is too small, then the bottom range of the input circuit of the ADC and the sub-DAC of this stage to generate charge is restricted. Furthermore, as the CM charge fluctua- the voltage Vdac for charge subtraction, Vdac is then connected tion accumulates stage by stage, several stages at the backend to Cs to subtract the charge amount of Cs · ΔVdac from Qi of the pipeline may fail to function when the error increases and get the residue charge Qout.Att3, Vc is connected to to a given extent. a lower voltage Vc(t4), causing a negative step on Xn, St is Assume the sub-stage shown in Fig.1(a)astheNth stage open and Qout will be transferred to the next pipelined stage, of a CD pipelined ADC, the CM charge can be rewritten as as the charge transferring out of node Xn, the voltage of Xn rises gradually. At t4, St is closed, charge transfer process is Qcmout(N)=Qicm(N − 1) + Cs · ΔVdac/2(N)+Qc(N) Vn t4 t5 finished, the voltage of Xn ( ( )) is kept constant. At , =Qicm(N − 1) + Qdac(N)+Qc(N) reset switch Sr resets Xn to Vn(t0), and Vc would be changed =Qicm(N − 2) + Qdac(N − 1) + Qc(N − 1) to a high voltage of Vc(t0). At t6, after resetting of Xn has Q N Q N completed, the whole clock period is finished. + dac( )+ c( ) =Qcmout(SH)+Qdac(1) + Qc(1) + ···

+ Qdac(N − 1) + Qc(N − 1) + Qdac(N)+Qc(N)

=Qicm + Qc(SH)+Qdac(1) + Qc(1) + ···

+ Qdac(N − 1) + Qc(N − 1) + Qdac(N)+Qc(N) (3)

where Qcmout(SH) is the CM output of sample and hold (SH) circuit of the CD pipelined ADC, Qicm istheCMchargecor- responding to the analog input CM voltage before the SH, Qc (SH) is the constant charge introduced by the charge transfer process from SH circuit to first stage of a CD pipelined ADC. As the analog input CM voltage is determined by the on-chip voltage reference, the variation of Qicm can be ignored. Also, as Cs is constant and VFdac is provided by the voltage refer- Fig. 1. Diagram of CD pipelined ADC sub-stage and its op- Q eration waveform ence of Sub-DAC, the variation of dac of all sub-stage can be ignored. The largest CM charge variation of Qcmout(N)is The charge relationship of the BBD-based CD pipelined caused by the variation of Qc in the SH circuit and the N − 1 stage is CD pipelined sub-stages. In order to control the variation of Qcmout(N), the variation of Qc during the charge transfer Qout =Qi + Cs · ΔVdac +(Cc + Cs) · (Vn(t0) − Vn(t4)) process in SH circuit and the N − 1 CD pipelined sub-stages + Cc · (Vc(t4) − Vc(t0)) should be minimized. =Qi + Cs · ΔVdac + Qc (1) III. Novel Boosted Charge Transfer where Qc =(Cc + Cs) · (Vn(t0) − Vn(t4)) + Cc · (Vc(t4) − Vc(t0)) is a constant which has no relationship with Qi. When the CD Circuit pipelined stage in Fig.1(a) is implemented in fully differential form, we can get the output CM charge of the CD pipelined Charge transfer circuit in a conventional BBD is depicted A Novel Boosted Charge Transfer Circuit for High Speed Charge Domain Pipelined ADC 629  a in the circuit and waveforms of Fig.2( ). The gate of charge- =C1 · (VCk1(t0) − VCk1(t1)) transfer FET S VG is held by a constant voltage; Ni and No  are two charge storage nodes; C1 and C2 are two charge stor- C2 − (Vdiff (t0) − Vdiff (t1)) · (5) age capacitors, Ck1 and Ck1n are clock control signals with C1 + C2 180◦ phase difference. The charge to be transferred is stored on capacitor C1. S is initially off. 
Charge transfer is initi- where Vdiff (t0)=VNo(t0) − VNi(t0), Vdiff (t1)=Vr2 − Vr1, ated at t0 by a Ck1 negative step which is coupled to the Ni VNo(t0) is the reset reference voltage of No before charge trans- via C1, S turns on, conveying (electrons) charge from Ni to fer process. Suppose a ΔV1 and ΔV2 is introduced on Vr1 No, causing the voltage of VNi to rise and VNo to fall. At and Vr2 respectively due to PVT variations, a corresponding time t1,whenVNi rises towards a cut-off value approximately variation ΔV1 − ΔV2 will be introduced on Vrdiff . However, equal to VG − Vth, charge transfer process is terminated. The in the real circuit the physical location of C1, C2 and the speed and accuracy of the process VNi sets towards the cut-off Vrdiff generation circuit is very close, so we have ΔV1 ≈ ΔV2, value VG − Vth determines the speed and accuracy of conven- which means the NBCT’s sensitivity to PVT is largely re- tional BBD charge transfer circuit. However the charge trans- jected. Comparing Eq.(4) with Eq.(5), QT of the proposed fer errors in conventional BBD charge transfer circuit caused NBCT is determined by differential reference voltages, in stead by charge transfer inefficiency, incomplete settling and charge of a voltage related to the quiescent operating voltage of the leakage are too large to be used in high-performance pipeline amplifier, hence is much less sensitive to PVT variations than ADCs. By introducing a boosted amplifier A, as is shown in that of the BCT in Fig.2(b). Fig.2(b), the accuracy and speed of conventional charge trans- fer, which is called BCT[12,13] , can be greatly improved. The resulted waveform shows that S’s gate voltage VG is no longer constant: amplifier A drives the gate of S such that VNi settles towards reference voltage Vr. The charge been transferred can be expressed in terms of the voltage change across capacitor C1,then

QT =C1 · (ΔVCk1 − ΔVNi)

=C1 · ((VCk1(t0) − VCk1(t1)) − (VNi(t0) − VNi(t1)) (4)

V t where VCk1(t0) Ck1( 1) are constant voltage controlled by reference voltage, VNi(t0) is the input voltage before the transferring process, which determines the input charge, and VNi(t1)=Vr, is the voltage of node Ni when the charge trans- ferring process is just terminated. The speed and accuracy of VNi(t1) setting towards Vr determines the speed and ac- curacy of the charge transfer process. If Vr is constant, then the transferred charge is a linear function of the charge to be transferred. However, Vr of the BCT in Fig.2(b) is determined by the quiescent operating point of cascade amplifier A, which is very sensitive to PVT variation. Suppose PVT variations Fig. 2. Simplified diagram and voltage waveforms of conven- cause a voltage error of ΔV in Vr, an equal error of ΔV will be V t tional charge transfer, BCT and the NBCT circuit. introduced in Ni( 1)too.AccordingtoEq.(4), a charge error (a) Conventional charge transfer; (b)BCTcircuit;(c) of ΔQ =ΔV · C1 will be introduced in QT . This charge error The NBCT circuit would lead to large CM charge error in CD pipelined ADCs. To handle the PVT issue of BCT, a NBCT circuit is pro- Fig.3(a) depicts the circuit implementation of the NBCT. posed in Fig.2(c). Where, A DDA is used to detect both the The DDA is controlled by the charge transfer control clock sig- voltage variations of VNi and VNo. The voltage Vr is replaced nal Ckt, which closes the DDA in the reset mode to save power. b by a differential voltage reference Vr2 − Vr1. The charge trans- Fig.3( ) shows the simulated transient waveform of the NBCT. 
fer process is similar with the BCT in Fig.2(b), except that At t1, the DDA senses the differential voltage Vdiff and Vrdiff the DDA detects the voltage difference of Vdiff = VNo − VNi and drives the gate VG ofNMOSFETStoahighlevel,Sturns and sets it towards the differential voltage reference Vrdiff = on, electrons transfers from Ni to No, causing the voltage of V V t V Vr2 − Vr1 when charge transfer process is terminated. Assume Ni to rise and No to fall. At 2,when diff sets towards V the voltage change of Vdiff between t0 and t1 is ΔVdiff ,the rdiff , DDA will drives VG to cut off S, terminating the charge charge been transferred is calculated as transfer process. From the figure it can be seen the time Vdiff sets towards Vrdiff is about 2.6ns, which means the designed QT =C1 · (ΔVCk1 − ΔVNi) NBCT can works at the speed of over 300MHz. The simulated   ΔVdiff · C2 output charge variation between NBCT and BCT for transfer =C1 · ΔVCk1 − C1 + C2 the same input charge under different PVT condition is shown 630 Chinese Journal of Electronics 2012

in Fig.4. Simulation conditions are: the gain of the amplifier is 20 dB, $C_1 = 2 C_2 = 3$ pF, and the charge calculation equation is $Q_T = C_2 \cdot \Delta V_{diff}/(1 + C_2/C_1) = 1\,\mathrm{pF} \cdot \Delta V_{diff}$; the charge to be transferred is 0.5 pC. The left figure shows the results for the BCT and the right figure those for the NBCT. Fig.4(a) corresponds to the process corner: the difference in the charge received on $C_2$ between the FF and SS conditions is 0.6 pC for the BCT, which is 20 times the 0.03 pC for the NBCT. Fig.4(b) corresponds to different voltage supplies: the difference between the 2 V and 1.6 V supplies is 0.6 pC for the BCT, which is 15 times the 0.04 pC for the NBCT. Fig.4(c) corresponds to the temperature change: the charge difference is 0.5 pC for the BCT, which is 10 times the 0.05 pC for the NBCT. The simulated results verify that the proposed NBCT rejects the charge error due to PVT variations remarkably well. Compared with the BCT, the power consumption of the NBCT is doubled, but this increase is much smaller than the power saved by eliminating the common mode control circuit used in conventional CD pipelined ADCs.

Fig. 3. Circuit implementation and the simulated waveform of the NBCT. (a) Circuit implementation; (b) Simulated waveform

IV. Experimental Results of the 10-bit CD ADC Based on NBCT

A 10-bit 125-MSPS CD pipelined ADC based on the proposed NBCT, without using any common mode charge control techniques, has been fabricated in a 0.18 µm CMOS process. The die photograph of the prototype ADC is shown in Fig.5(a). The bottom side is the SHA and the CD pipeline ADC core, and the upper side shows the bandgap reference voltage generator, the reference buffer op-amps, the clock buffer and the digital error correction logic block. The total active area excluding the PAD and ESD cells is about 0.8 × 1.3 mm², where the active area of the SHA and CD pipeline sub-stages is about 0.5 × 1.3 mm².

The measured output FFT spectrum with a 3.79 MHz input frequency at 125 MSPS is shown in Fig.5(b). The measured SNR is about 57.3 dB, the SFDR is about 67.7 dB and the SNDR is about 55.8 dB, so the ENOB is about 9.0. The measured nonlinearity of the ADC is shown in Fig.5(c). The INL is +0.7/−0.55 LSB, and the DNL is +0.5/−0.3 LSB. The total ADC power consumption on a 1.8 V supply is about 27 mW excluding the output drivers. The measured performances of the prototype ADC are compared with those of the recently reported 10-bit ADCs[15−19], as summarized in Table 1. The prototype 10-bit 125 MSPS CD pipelined ADC based on the proposed NBCT has the best Figure of merit (FoM) of 0.42 pJ/step, including on-chip references, and an area efficiency of

Fig. 4. Simulated PVT comparison of BCT and NBCT. (a) Process corner; (b) Voltage supply; (c) Temperature

A Novel Boosted Charge Transfer Circuit for High Speed Charge Domain Pipelined ADC 631
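The charge equation and the BCT/NBCT ratios quoted for Fig.4 can be checked with a few lines of arithmetic. The following is only a sketch: the function and variable names are illustrative, and the differential voltage of 0.5 V is an assumption inferred from the stated 0.5 pC target and the 1 pF effective capacitance, not a value given in the text.

```python
def transferred_charge_pc(c1_pf, c2_pf, dv_diff_v):
    """Q_T = C2 * dV_diff / (1 + C2/C1); pF times V gives pC."""
    return c2_pf * dv_diff_v / (1.0 + c2_pf / c1_pf)

# Simulation conditions quoted in the text: C1 = 2*C2 = 3 pF.
C1_PF = 3.0
C2_PF = C1_PF / 2.0

# Effective capacitance seen by dV_diff (charge per volt), expected 1 pF.
effective_cap_pf = transferred_charge_pc(C1_PF, C2_PF, 1.0)

# Assumed dV_diff = 0.5 V, which reproduces the 0.5 pC target charge.
q_target_pc = transferred_charge_pc(C1_PF, C2_PF, 0.5)

# Simulated charge spread in pC quoted for Fig.4, as (BCT, NBCT) pairs.
pvt_spread_pc = {
    "process": (0.6, 0.03),
    "voltage": (0.6, 0.04),
    "temperature": (0.5, 0.05),
}

# Ratio of BCT spread to NBCT spread for each PVT axis.
improvement = {axis: bct / nbct for axis, (bct, nbct) in pvt_spread_pc.items()}

for axis, ratio in improvement.items():
    print(f"{axis}: BCT spread is about {ratio:.0f} times the NBCT spread")
```

The ratios come out at roughly 20, 15 and 10, matching the factors stated for process, supply voltage and temperature.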

Table 1. Performance comparison

Parameters                     This work      Ref.[15]       Ref.[16]       Ref.[17]        Ref.[18]        Ref.[19]
Year                           2011           2004           2007           2009            2009            2011
Sampling rate                  125 MSPS       125 MSPS       125 MSPS       120 MSPS        100 MSPS        100 MSPS
Technology                     0.18µm CMOS    0.18µm CMOS    65nm CMOS      0.18µm CMOS     0.18µm CMOS     0.13µm CMOS
DNL/INL                        0.5/0.7 LSB    0.5/0.7 LSB    0.3/0.6 LSB    0.18/0.53 LSB   0.58/0.84 LSB   0.6/1.0 LSB
SFDR/SNDR                      67.7/55.8 dB   −/53.7 dB      68/57 dB       68/53 dB        65/51 dB        −/56 dB
  (input frequency Fin)        3.79 MHz       2 MHz          5 MHz          1 MHz           4 MHz           1 MHz
ENOB                           9.0 bits       8.6 bits       9.26 bits      8.5 bits        8.3 bits        9.0 bits
Power                          27 mW          40 mW          20 mW          108 mW          24.2 mW         32 mW
FOM = Power/(2^ENOB · fclk)    0.42 pJ/step   0.78 pJ/step   0.27 pJ/step   2.4 pJ/step     0.54 pJ/step    0.62 pJ/step
Active area                    1.04 mm²       0.66 mm²       0.13 mm²       1.8 mm²         0.8 mm²         1.96 mm²
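The "This work" column of Table 1 can be cross-checked from the measured values. The sketch below uses the FoM definition from the table, Power/(2^ENOB · fclk), and assumes the usual relation ENOB = (SNDR − 1.76)/6.02, which the text does not state explicitly; the function names are illustrative.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR in dB (standard relation, assumed here)."""
    return (sndr_db - 1.76) / 6.02

def fom_pj_per_step(power_mw, enob_bits, fclk_msps):
    """FoM = Power / (2^ENOB * fclk), returned in pJ per conversion step."""
    power_w = power_mw * 1e-3
    fclk_hz = fclk_msps * 1e6
    return power_w / (2 ** enob_bits * fclk_hz) * 1e12

# Measured values reported for the prototype ADC.
enob = enob_from_sndr(55.8)            # SNDR = 55.8 dB
fom = fom_pj_per_step(27, 9.0, 125)    # 27 mW, 9.0 bits, 125 MSPS

print(f"ENOB = {enob:.2f} bits, FoM = {fom:.2f} pJ/step")
```

With the reported SNDR, power and sampling rate, this gives about 9.0 bits and 0.42 pJ/step, in agreement with the table.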

only 0.009 mm²/MHz, which shows very good power efficiency with a reasonable trade-off between power consumption and die area when compared with the recently reported 10-bit ADCs.

Fig. 5. Die photograph and measured results of the prototype ADC. (a) Die photograph; (b) FFT spectrum; (c) DNL & INL

V. Conclusion

An NBCT circuit modified from the conventional BCT circuit is presented, which alleviates the PVT variation sensitivity of the BCT circuit and the CM charge variation due to PVT variations in the BBD-based CD pipelined ADC. It eliminates the common mode charge control circuit and simplifies the system complexity of the CD pipelined ADC. The PVT variation rejection of the NBCT circuit is improved by about 10 times over the conventional BCT circuit. A 27 mW 10-bit BBD-based CD pipelined ADC based on the proposed NBCT, operating at a sampling rate of 125 MSPS, is constructed in 0.18 µm CMOS. Test results show that the prototype ADC achieves an ENOB of 9.0 bits and a DNL/INL of 0.5/0.7 LSB for a 3.79 MHz input at the full sampling rate, which confirms the validity of the proposed NBCT.

References

[1] A.M.A. Ali, M. Andy and D. Chris, et al., “A 16-bit 250-MS/s IF sampling pipelined ADC with background calibration”, IEEE J. Solid-State Circuits, Vol.45, No.12, pp.2602–2612, 2010.
[2] R. Payne, M. Corsi, D. Smith, et al., “A 16-bit 100 to 160 MS/s SiGe BiCMOS pipelined ADC with 100 dBFS SFDR”, IEEE J. Solid-State Circuits, Vol.45, No.12, pp.2613–2622, 2010.
[3] S. Lee and B. Song, “Digital-domain calibration of multi-step analog-to-digital converter”, IEEE J. Solid-State Circuits, Vol.27, No.12, pp.1679–1688, 1992.
[4] E. Siragursa, I. Galton, “A digitally enhanced 1.8-V 15-bit 40-MSample/s CMOS pipelined ADC”, IEEE J. Solid-State Circuits, Vol.39, No.12, pp.2126–2138, 2004.
[5] B. Peter, K. Franz, K. Claus, et al., “A 14b 100MS/s digitally self-calibrated pipelined ADC in 0.13µm CMOS”, Proc. of ISSCC, San Francisco, California, USA, pp.224–225, 2006.
[6] B. Murmann and B. Boser, “A 12-b 75 MS/s pipelined ADC using open-loop residue amplifier”, Proc. of ISSCC, San Francisco, California, USA, pp.330–331, 2003.
[7] J. Li and U. Moon, “Background calibration techniques for multi-stage pipelined ADC’s with digital redundancy”, IEEE Trans. Circuits Syst. II, Vol.50, No.9, pp.531–538, 2003.
[8] T. Sepke, J.K. Fiorenza, C.G. Sodini, et al., “Comparator-based switched-capacitor circuits for scaled CMOS technologies”, Proc. of ISSCC, San Francisco, California, USA, pp.220–221, 2006.
[9] L. Brooks and H.S. Lee, “A zero-crossing based 8-bit 200MS/s pipelined ADC”, IEEE J. Solid-State Circuits, Vol.42, No.12, pp.2677–2687, 2007.
[10] E.B. James, “Bucket brigade analog-to-digital converter”, US Patent, No.4072938, 1978.
[11] C.N. Berglund, “Analog performance limitations of charge transfer dynamic shift registers”, IEEE J. Solid-State Circuits, Vol.6, No.6, pp.391–394, 1971.
[12] A. Michael, K. Edward, K. Jeffrey, et al., “A process-scalable low-power charge-domain 13-bit pipeline ADC”, Proc. of Symposium on VLSI Circuits, Honolulu, Hawaii, USA, pp.222–223, 2008.
[13] W. Frey, “Bucket-brigade device with improved charge transfer”, IET Electronics Letters, Vol.9, No.25, pp.588–589, 1973.
[14] J.H. Cai, H.S. Ren, C.Z. Hai, et al., “A charge coupled pipelined analog-to-digital converter”, Chinese patent, No.200910264739.2, 2009.
[15] M. Yoshioka, M. Kudo, K. Gotoh, et al., “A 10b 125MS/s 40mW pipelined ADC in 0.18µm CMOS”, Proc. of ISSCC, San Francisco, California, USA, pp.282–283, 2005.
[16] N.S. Pratap, K. Ashish, D. Chandrajit, et al., “20mW, 125 Msps, 10bit pipelined ADC in 65nm standard digital CMOS process”, Proc. of IEEE CICC, San Jose, California, USA, pp.189–192, 2007.
[17] C.H. Cheol, K.Y. Ju, K.W. Joo, et al., “A 10b 120 MS/s 108 mW 0.18µm CMOS ADC with a PVT-insensitive current reference”, Analog Integrated Circuits and Signal Processing, Vol.61, No.58, pp.115–121, 2009.
[18] K.H. Lee, S.W. Lee, Y.J. Kim, K.S. Kim and S.H. Lee, “Ten-bit 100 MS/s 24.2 mW 0.8 mm² 0.18 µm CMOS pipeline ADC based on maximal circuit sharing schemes”, Electronics Letters, Vol.45, No.25, pp.1296–1297, 2009.
[19] C.S. Shin and G.C. Ahn, “A 10-bit 100-MS/s dual-channel pipelined ADC using dynamic memory effect cancellation technique”, IEEE Trans. Circuits Syst. II: Express briefs, Vol.58, No.5, pp.274–278, 2011.

HUANG Songren received the B.S. degree in electrical engineering from Hunan University, Changsha, China, in 1992. He is currently working toward the Ph.D. degree in microelectronics at Xidian University, Xi’an, China. Since 1992, he has been with China Electronic Technology Group Corporation, No.58 Research Institute, Wuxi, China, where he is involved in designing high speed Digital signal processors (DSPs) and high performance CMOS data converters. He is the author or co-author of over 20 technical papers in conferences and journals. Also, he has filed 10 China patents. His research interests include high speed digital signal processing and CMOS mixed-mode integrated circuit design, especially high-performance low-power DSPs and data converters.

CHEN Zhenhai received the B.S. and M.S. degrees in electronic science engineering from Jiangnan University, Wuxi, China, in 2004 and 2008 respectively. He is currently working toward the Ph.D. degree in microelectronics at Xidian University. Since 2008, he has been with China Electronic Technology Group Corporation, No.58 Research Institute, Wuxi, China, where he is involved in designing high performance CMOS data converters. He is the author or co-author of over 10 technical papers in conferences and journals. Also, he has filed 4 China patents. His research interests include CMOS analog and mixed-mode integrated circuit design, especially high-performance low-power data converters. (Email: [email protected])

JI Huicai received the B.S. degree in electrical engineering from Xidian University, Xi’an, China, in 1992. Since 1992, he has been with China Electronic Technology Group Corporation, No.58 Research Institute, Wuxi, China, where he is involved in designing high speed Direct digital synthesizers (DDS) and high performance CMOS data converters. He is the author or co-author of over 20 technical papers in conferences and journals. Also, he has filed 4 China patents. His research interests include high speed DDS and CMOS mixed-mode integrated circuit design, especially low-power DDSs and data converters.

YU Zongguang received the B.S. and M.S. degrees in electronic science engineering from Xidian University, Xi’an, China, in 1985 and 1988 respectively, and the Ph.D. degree from Southeast University, Nanjing, China in 1997. Since 1988, he has been with China Electronic Technology Group Corporation, No.58 Research Institute, Wuxi, China. He is now the chief expert of China Electronic Technology Group Corporation. Since 2002 he has been a professor of Xidian University and Jiangnan University. His current research interests include high-speed memory, CMOS analog and mixed-mode integrated circuit design.

ZHANG Hong received the B.S. degree in electronic engineering from Xi’an University of Technology in 2000 and the M.S. and Ph.D. degrees from Xi’an Jiaotong University, Xi’an, China, in 2004 and 2008 respectively. Since 2008, he has been with the Department of Microelectronics, Xi’an Jiaotong University, where he was an associate professor. His current research interests include high-speed, high-precision ADC, and RFIC for broadband wireless communications.

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

Frequent 2-Episode Mining with Minimal Occurrences Based on Episode Matrix and Lock State∗

LIN Shukuan, WANG Ya, WANG Jue, GUO Tianzhu and QIAO Jianzhong

(College of Information Science and Engineering, Northeastern University, Shenyang 110004, China)

Abstract — Frequent episode mining helps to set up episode rules and predict future events. In frequent episode mining, 2-episode mining plays an important role. The mining methods for 2-episodes determine the global strategies of frequent episode mining. The paper focuses on minimal occurrence based frequent 2-episode mining. For the problems existing in the current methods, a novel frequent 2-episode mining method is proposed with high efficiency based on episode matrix and the lock strategy. It does not need to generate candidate episodes and only scans data once. A series of experiments on real data sets show the advantages of the proposed method in time and space cost.

Key words — Frequent episode mining, Episode matrix, Lock state, Time queue.

I. Introduction

With the wide use of EDGEs (Electronic data gathering equipments) such as sensors and RFID (Radio frequency identification devices), an unprecedented volume of event data has been generated. CEP (Complex event processing) has become a new research hotspot[1,2]. Frequent episode mining is an important aspect of CEP. By discovering frequent episodes in event sequences, one can set up corresponding episode rules and predict future events, taking actions before events occur. An episode is a partially ordered collection of events occurring in an event sequence. Frequent episode mining in an event sequence is different from frequent pattern mining in a transaction database[3,4] and sequence pattern mining in a sequence database[5]. So the existing methods of frequent pattern mining and sequence pattern mining cannot be used in frequent episode mining.

In frequent episode mining, the mining of 2-episodes plays an important role: the mining of n-episodes (n > 2) is based on it and begins with it. So the global strategy of frequent episode mining depends on the mining of 2-episodes. The paper studies minimal occurrence based frequent 2-episode mining.

In traditional frequent 2-episode mining, apriori-like algorithms are usually adopted, which restricts the global mining strategy to be also based on apriori ideas. Typical representatives of apriori-like algorithms include Refs.[7–11]. Ref.[7] focuses on non-overlap frequent episodes. Ref.[8] mines the two kinds of frequent episodes (window based and minimal occurrence based ones). Ref.[9] only discusses the window based frequent episodes. Ref.[10] presents the mining method for the episodes composed of interval events. Ref.[11] mines frequent closed episodes. The algorithm apriori is used to mine frequent patterns in transaction databases[6]. Like algorithm apriori, these apriori-like algorithms for mining frequent episodes also need to scan data many times and generate a number of candidate sets. This is not only time consuming, but also cannot process event streams. For minimal occurrence based frequent 2-episode mining, we propose a novel method which only scans event sequences once without candidate generation. The experiments on real data sets validate the effectiveness and efficiency of the mining method proposed in this paper.

II. Frequent 2-Episode Mining Based on the Episode Matrix

To mine frequent 2-episodes and extend to frequent n-episodes (n > 2) further, we need to count the minimal occurrences of 2-episodes. Therefore, we design an episode matrix to record their minimal occurrence information. By the episode matrix, all frequent 2-episodes can be generated, scanning the event sequence only once without candidate generation. To explain the episode matrix, we give the relevant definitions here.

Definition 1 Given an episode $EP = e_1 e_2 \cdots e_n$ in an event sequence, its occurrence $ep = ((e_1, t_1), (e_2, t_2), \cdots, (e_n, t_n))$ $(t_1 < t_2 < \cdots < t_n)$ is a minimal occurrence if there is not any occurrence $ep' = ((e_1, t'_1), (e_2, t'_2), \cdots, (e_n, t'_n))$ of $EP$ such that $t'_1 \ge t_1$, $t_n \ge t'_n$ and $(t'_n - t'_1) < (t_n - t_1)$.

Definition 2 Given an episode $EP = e_1 e_2 \cdots e_n$, its $m$ minimal occurrences $ep_i = ((e_1, t_1^i), (e_2, t_2^i), \cdots, (e_n, t_n^i))$, $i = 1, 2, \cdots, m$. Then the queue $(t_1^1, t_1^2, \cdots, t_1^m)$ composed of

∗Manuscript Received Sept. 2010; Accepted May 2012. This work is supported by the National Natural Science Foundation of China (No. 61272177), and the Fundamental Research Funds for the Central Universities (No.N110604001).

the timestamps of the first instances of all its occurrences is called the time queue of episode $EP$, denoted as $TQ$.

Given an event sequence containing $n$ event types $e_1, e_2, \cdots, e_n$, its episode matrix is an $n \times n$ matrix whose rows and columns are the event types respectively. Each element $[e_i, e_j]$ $(i, j = 1, 2, \cdots, n)$ of it corresponds to a 2-episode $EP = e_i e_j$ and is denoted as $(count, TQ)$. Here, $count$ is the count of minimal occurrences of $EP$, and $TQ$ is the time queue of $EP$. We call $e_i$ the left event of the element, and $e_j$ the right event of the element. Each matrix element is initialized by $[e_i, e_j] = (0, null)$.

The episode matrix is constructed while events in an event sequence are being scanned. An event occurring later and an event occurring earlier will form a 2-episode occurrence. If it is a minimal occurrence, one count will be yielded for the 2-episode and recorded in the count field of the corresponding matrix element, and the timestamp of the event occurring earlier (left event) will be inserted at the tail of the time queue $TQ$. Now, the key is to identify whether an occurrence is a minimal one. To solve the problem, we set a lock for each matrix element. All locks are initialized to the “locked” state. To explain the generating process of the episode matrix based on locks, the relevant definition and interpretation are first presented below.

Definition 3 An event being scanned in an event sequence is a current event. Let $(e, t)$ be the current event. For an event type $e'$, if there are its $m$ $(m \ge 1)$ occurrences $(e', t_1), (e', t_2), \cdots, (e', t_m)$ such that $t_1$ [...]

[...]ment cannot form a minimal occurrence of the corresponding 2-episode with any current event.

The main generation steps of the episode matrix are shown in Algorithm 1.

Algorithm 1 GenMat: Generating episode matrix
Input: Event sequence, the number n of event types in the event sequence
Output: Episode matrix
1. while (event sequence has not been scanned over) {
2. Get an event (e, t) from event sequence as current event;
3. for (i = 0; i [...]

It can be proved that Algorithm 1 only counts minimal occurrences of 2-episodes, i.e., Algorithm 1 is correct, and that Algorithm 1 counts all minimal occurrences of 2-episodes, i.e., Algorithm 1 is self-contained. Owing to space limitation, we omit the proof.

III. Performance Evaluation

To evaluate the performance of our method, we compare the episode matrix based mining method (method 1) proposed in this paper with the traditional apriori-like method proposed by Mannila[8] (method 2) in time and space cost.

All the experiments were performed on a PC with a 2.66 GHz CPU and 3.5 GB memory. Two real data sets were employed. The first one is Human Genome data[12] with 5 event types, which is a denser data set. The second one is Intel Lab data[13], collected from 54 wireless sensors deployed in the Intel Berkeley Research Lab. Each event includes attributes such as sensor id and temperature. It is a sparser data set.

Fig.1 presents the runtime of the two methods with different frequency thresholds on the two data sets. It can be seen that the experiment results on the two data sets are basically consistent. The runtime of method 1 is much less than that of [...]

Fig. 1. Comparison of the runtime of the two methods with different frequency thresholds. (The length of the two event sequences is 10k events). (a) Human Genome data; (b) Intel Lab data
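The one-pass construction of the episode matrix described in Section II can be sketched as follows. This is not the authors' Algorithm 1, whose listing is incomplete here. It is a minimal reconstruction under stated assumptions: timestamps are strictly increasing, the matrix is stored as a dictionary keyed by (left event, right event), and an element counts as "unlocked" when its left event has occurred again since the element was last counted. The names `build_episode_matrix`, `latest` and `used` are hypothetical.

```python
def build_episode_matrix(events):
    """Count minimal occurrences of all 2-episodes in a single scan.

    events: iterable of (event_type, timestamp), timestamps strictly increasing.
    Returns {(left, right): [count, time_queue]}.
    """
    latest = {}   # event type -> timestamp of its most recent occurrence
    used = {}     # (left, right) -> timestamp of the left event already counted
    matrix = {}   # (left, right) -> [count, time queue TQ]

    for etype, t in events:
        # The current event acts as the right event of every element in its column.
        for left, t_left in latest.items():
            key = (left, etype)
            if used.get(key) != t_left:          # element is "unlocked"
                cell = matrix.setdefault(key, [0, []])
                cell[0] += 1                      # one more minimal occurrence
                cell[1].append(t_left)            # left timestamp joins TQ
                used[key] = t_left                # lock until `left` occurs again
        # The current event then acts as a left event: this unlocks its row.
        latest[etype] = t
    return matrix


sequence = [("A", 1), ("A", 2), ("B", 3), ("B", 4), ("A", 5), ("B", 6)]
for episode, (count, tq) in sorted(build_episode_matrix(sequence).items()):
    print(episode, count, tq)
```

For the sample sequence, the 2-episode AB has the minimal occurrences (2, 3) and (5, 6), so its element holds a count of 2 and the time queue (2, 5). The occurrences (1, 3) and (2, 4) are not counted because each contains a shorter occurrence. No candidate episodes are generated and the sequence is scanned once.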

From Fig.1, it can also be seen that the runtime of method 1 does not change with min_fre, and stays small all the time, whereas the runtime of method 2 is large when min_fre is small and diminishes as min_fre increases. This is because in method 1, the process of constructing the matrix is that of mining frequent 2-episodes. By Algorithm 1, it is independent of min_fre. To the best of our knowledge, this result has never been shown before. In method 2, the number of candidate episodes generated and the time spent in evaluating them are less when min_fre is larger. When min_fre becomes large enough (0.5 and 0.042 for Human Genome and Intel Lab data respectively), the runtime of method 2 is even less than that of method 1. This is because when min_fre is large enough, the runtime of method 1 does not change, but in method 2 few candidate episodes and frequent ones are generated and little time is needed. In practical applications, a min_fre that is too large is meaningless.

Fig.3 presents the space cost of the two methods on the two data sets. For the same reason, the space comparison of the two methods shows the same characteristics as the time comparison. Obviously, when min_fre is not too big, the space cost of method 1 is much less than that of method 2. This is because in method 2, more space is needed to store large numbers of candidate episodes, whereas method 1 only needs to store the matrix whatever the value of min_fre is. The space needed is independent of min_fre.

Fig. 3. Comparison of the space cost of the two methods. (a) Human Genome data; (b) Intel Lab data

From the above experiments, we can find that method 1 outperforms method 2 in time and space cost, and its advantages are more obvious on dense data sets.

IV. Conclusions

The paper proposes a frequent 2-episode mining method with minimal occurrences based on episode matrix and lock state, which only scans event sequences once without candidate generation. Compared with the traditional apriori-like algorithm, the efficiency in time and space is enhanced markedly. The experiments on real data sets show the effectiveness and efficiency of the method proposed in the paper. It is very important for mining episodes of arbitrary length. It also means that the global mining strategy differs from that of the apriori-like algorithms, and it lays the foundation for n-episode mining (n > 2) with high efficiency. How to extend it to n-episode mining (n > 2) and keep the feature of scanning event sequences once without candidate generation is our ongoing work. Besides, for a given data set, how to choose a suitable min_fre is an issue to be considered in the future.

References

[1] E. Wu, Y. Diao and S. Rizvi, “High-performance complex event processing over streams”, Proc. SIGMOD’06, pp.407–418, 2006.
[2] A. Demers, et al., “Cayuga: A general purpose event monitoring system”, Proc. CIDR’07, pp.412–422, 2007.
[3] R. Agrawal, T. Imielinski and A. Swami, “Mining association rules between sets of items in large databases”, Proc. SIGMOD’93, pp.207–216, 1993.
[4] J.B. Zhao, A.G. Dong and G. Lin, “An algorithm for frequent pattern mining in biological networks”, Acta Electronica Sinica, Vol.38, No.8, pp.1803–1807, 2010. (in Chinese)
[5] R. Agrawal and R. Srikant, “Mining sequential pattern”, Proc. ICDE’95, pp.3–14, 1995.
[6] R. Agrawal and R. Srikant, “Fast algorithms for mining association rules”, Proc. VLDB’94, pp.487–499, 1994.
[7] S. Laxman, P.S. Sastry and K.P. Unnikrishnan, “A fast algorithm for finding frequent episodes in event streams”, Proc. KDD’07, pp.410–419, 2007.
[8] H. Mannila, H. Toivonen and I. Verkamo, “Discovery of frequent episodes in event sequences”, Data Mining and Knowledge Discovery, Vol.1, No.3, pp.259–289, 1997.
[9] H. Mannila, H. Toivonen and A.I. Verkamo, “Discovering frequent episodes in sequences”, Proc. KDD’95, pp.210–215, 1995.
[10] D. Patel, W. Hsu and M.L. Lee, “Mining relationship among interval-based events for classification”, Proc. SIGMOD’08, pp.393–404, 2008.
[11] W.Z. Zhou, H.Y. Liu and H. Cheng, “Mining closed episodes from event sequences efficiently”, Proc. PAKDD’10, pp.310–318, 2010.
[12] The UCSC Genome Bioinformatics Site [Online]. Available: http://hgdownload.cse.ucsc.edu/downloads.html, 2010.
[13] Intel Lab Data Site [Online]. Available: http://db.csail.mit.edu/labdata/labdata.html, 2010.

LIN Shukuan was born in Jilin Province, China, in May 1966, and graduated from the Department of Computer Science of Jilin University in 1988. Currently, she is a professor of Northeastern University. Her main research directions are temporal data mining and machine learning. (Email: lin-[email protected])

WANG Ya was born in Henan Province, China, in Dec. 1986. She is studying temporal data mining as a graduate student in the College of Information Science and Engineering of Northeastern University.

WANG Jue was born in Liaoning Province, China, in Dec. 1980. She is studying temporal data mining and machine learning as a doctoral student in the College of Information Science and Engineering of Northeastern University.

GUO Tianzhu was born in Inner Mongolia, China, in Oct. 1987. He is studying temporal data mining as a graduate student in the College of Information Science and Engineering of Northeastern University.

QIAO Jianzhong was born in Liaoning Province, China, in Sept. 1964. He is a professor and doctoral supervisor. His main research directions are parallel computation and artificial intelligence.

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

Modeling and Path Generation Approaches for Crowd Simulation Based on Computational Intelligence∗

LIU Hong, SUN Yuling and LI Yuanyuan

(School of Information Science and Engineering, Shandong Normal University, Jinan 250014, China) (Shandong Provincial Key Laboratory for Novel Distributed Computer Software Technology, Jinan 250014, China)

Abstract — Modeling and simulating group behaviours have been an active research topic in the field of computer animation and games. This paper presents some novel approaches for supporting entity modeling and path generation in crowd simulation. It analyses related work about crowd simulation first. Then, an entity modeling approach based on CGA (Cellular genetic algorithm) and NURBS (Non-uniform rational B-splines) technologies is presented. Next, following the analysis of PSO (Particle swarm optimization) and ABC (Artificial bee colony) algorithms, a crowd path generation approach based on ABC-PSO is put forward. After that, a simulation example of crowd cohesion and a performance comparison are exhibited to show the efficiency of the algorithms. Finally, the current work is summarized and an outlook for future work is given.

Key words — Crowd simulation, Entity modeling, Cellular genetic algorithm, PSO (Particle swarm optimization) algorithm, ABC algorithm, Path generation.

I. Introduction

Modeling and simulating group behaviors have been an active research topic in the field of computer animation and games. The traditional key frame technique took animators a lot of time and effort to simulate vivid behaviors of crowds or flocks, as each individual’s behavior needs to be scripted carefully to meet the following requirement: the motion of the whole group should be harmonious while the individuals in the group appear to move independently and stochastically.

Inspired by natural and social behaviours, researchers have developed many successful optimisation algorithms. Two important classes of computational intelligence algorithms are evolutionary algorithms and swarm intelligence algorithms. Compared with conventional optimisation techniques, these nature inspired algorithms can deal with not only the well-defined but also the more complex, uncertain and ill-defined problems.

This paper presents some novel approaches based on computational intelligence for supporting entity modeling and path generation for crowd simulation. First of all, it presents an entity modeling approach based on CGA and NURBS technologies. This approach takes entity models composed of NURBS as the population and the structure curves of the model as genes. CGA is adopted for mapping these individuals to a two dimensional grid, and the genetic operations are restricted to the related neighborhood. This approach ensures that the excellent individuals are preserved in the generated entities, with a high rate of convergence. Then, it combines the advantages of the ABC and PSO algorithms and presents an ABC-PSO algorithm for path generation in crowd simulation. It adopts the strategy of employed bees and onlooker bees in ABC to accelerate the positioning process of the group, and the approach in PSO to dynamically adjust an individual’s position, speed, local extreme value and global extreme value. This assures the consistent movement of individuals in the group. Compared with both the ABC and PSO algorithms, it increases the optimizing ability and rate of convergence and can implement the path generation of crowd movement effectively.

II. Related Work

Herding, flocking, and schooling behaviors of animals have been studied extensively over the past century. Turing’s[1] work on how nature produces complex patterns by low level rules inspired much of the research into computing intelligence and emergent behavior, which Johnson[2] describes as “leaderless individuals using low level rules to achieve higher levels of sophistication”. Computationally simple processing architectures, inspired by natural artifacts or processes, have been found to provide complex and useful forms of computation.

Early progress in the simulation of group behaviors was made by Reynolds[3]. Actors in his system are bird like objects similar to the point masses used in particle systems except

∗Manuscript Received Apr. 2011; Accepted Nov. 2011. This work is supported by the National Natural Science Foundation of China (No.60743010) and the Ph.D. Programs Foundation of Ministry of Education of China (No.20093704110002), Natural Science Foundation of Shandong Province (No.ZR2010QL01) and Shandong Provincial Key Laboratory Project.

that each bird also has an orientation. The birds maintain proper position and orientation within the flock by balancing their desire to avoid collisions with neighbors, to match the velocity of nearby neighbors, and to move towards the center of the flock. Each bird uses only information about nearby neighbors. This localization of information simulates one aspect of perception and reaction in biological systems and helps to balance the three flocking tendencies. Reynolds’s work demonstrates that realistic-looking animations of group formations can be created by applying simple rules to determine the behaviors of the individuals in the flock.

A number of related methods for controlling group behaviors have been proposed, but only a few researchers try to use swarm intelligence algorithms for this purpose. Tu and Terzopoulos[4] created groups of artificial fishes endowed with synthetic vision and perception of the environment, which control their behavior. Ward et al.[5] studied an evolving sensory controller for producing schooling behavior based on “boids”. Blue and Adler[6] used Cellular automata (CA) in order to simulate collective behaviors, in particular pedestrian movement. Ban et al.[7] proposed a distributed behavior model from a cognitive perspective to simulate fish behavioral action. Qiu and Hu[8] investigated a framework for modeling the structure aspect of different groups in pedestrian crowds. Chen and Lin[9] used a conceptual model to co-operate with a particle swarm optimization mechanism for controlling the movement of crowds in computer simulation. Fridman and Kaminka[10] investigated a general cognitive model of group behaviors, based on Festinger’s Social comparison theory (SCT), and described the implementation and adaptation of the SCT model in the Soar cognitive architecture and its application in modeling imitational behavior.

III. The Entity Modeling Based on Cellular Genetic Algorithm and NURBS Technology

Because crowd simulation involves a large number of individuals, modeling a group of entities by evolutionary design has a distinct advantage. In group simulation, each individual in the group has to be different, and yet all of them have to be slightly similar. Fortunately, evolutionary design is suitable for solving this kind of problem. Evolutionary design is based on simulating the process of natural selection and reproduction on a computer. This technique depends on the specification of a parameterized model that is general enough to allow a wide variety of possible outcomes of interest to the designer. These outcomes are similar to the seed while each of them is different. This is exactly in accord with the requirements of entity modeling. The entity modeling based on cellular genetic algorithm and NURBS technology is introduced in the following sections.

1. Cellular automata

The theory of CA (Cellular automata) as models of self-reproducing systems was conceived and first developed by John Von Neumann during the 1950s[11]. Informally, a CA is a set of identical elements, called cells, each one of which occupies a node of a regular, discrete, infinite spatial network. Each cell can assume a state from a finite set, and the automaton evolves in discrete time steps, changing the states of all its cells according to a local rule, homogeneously applied at every step. The new state of a cell depends on the previous states of a set of cells, which can include the cell itself, and constitutes its neighborhood. More formally, we can give the following definition, assuming $Z^d$ as the underlying spatial network of the automaton.

Definition 1 A d-dimensional cellular automaton (or d-CA) is a structure $A = (Z^d, S, N, \delta)$, where $Z^d$ is the lattice of d-tuples of integer numbers. $S$ is a finite set of states. $N = \{n_j = (x_{1j}, \cdots, x_{dj}) \,/\, j \in \{1, 2, \cdots, n\}\}$ is a finite ordered subset of $Z^d$ called the neighborhood of $A$. $\delta : S^{n+1} \to S$ is the local transition function or local rule of $A$.

It is useful to have a special state, called quiescent and denoted by 0, such that $\delta(0, \cdots, 0) = 0$.

Let $x \in Z^d$ be a cell, and $s \in S$ be its state at time $t$. Then at time $t+1$, $x$ will assume the state $s' = \delta(x + n_1, \cdots, x + n_n)$. Hence, the elements of $N$ give the vector of displacements along each one of $d$ possible directions that allow reaching the cells that influence the state change of the cell. Two important neighborhoods are as follows:

(1) Von Neumann neighborhood

$n_{VN} = \{n_j = (x_{1j}, \cdots, x_{dj}) \,/\, x_{kj} \in \{-1, 0, 1\} \text{ for } k = 1, \cdots, d \text{ and } \sum_{k=1}^{d} |x_{kj}| \le 1\}$

In this case, a cell $x$ is connected to all the cells at a distance 1 along exactly one of the $d$ coordinates, and with itself (distance 0) (see Fig.1).

(2) Moore neighborhood

$n_M = \{n_j = (x_{1j}, \cdots, x_{dj}) \,/\, x_{kj} \in \{-1, 0, 1\} \text{ for } k = 1, \cdots, d\}$

In this case, a cell $x$ is connected to all the cells at distance at most 1 in each direction (i.e., diagonal connections are allowed: see Fig.2).

Fig. 1. Von Neumann type

Fig. 2. Moore type

It appears clear from the above definition that the main characteristics of CAs are discreteness and locality. From the repeated synchronous application of the simple local evolution rules, a global behavior emerges which can be very complex. Let us now formalize the notion of evolution or behavior of CAs, starting with the definition of configuration.

Definition 2 A configuration or instantaneous or global state of a cellular automaton $A = (Z^d, S, N, \delta)$ is a map $\varphi : Z^d \to S$ that associates a state with every cell. We will denote by $\varphi(A)$, or simply $\varphi$, the set of configurations of $A$.

The synchronous application of the local rule to every cell allows transforming a configuration into a new one.
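The two neighborhoods and the synchronous application of a local rule can be illustrated with a small sketch. It makes several assumptions that are not in the text: the lattice is a finite two-dimensional grid with wrap-around borders instead of the infinite $Z^d$, the state set is {0, 1}, and the local rule is a simple parity rule chosen only because it keeps 0 quiescent. All function names are hypothetical.

```python
from itertools import product

def von_neumann(d=2):
    """Displacement vectors with entries in {-1, 0, 1} and sum of |x_k| <= 1."""
    return [v for v in product((-1, 0, 1), repeat=d)
            if sum(abs(c) for c in v) <= 1]

def moore(d=2):
    """All displacement vectors with entries in {-1, 0, 1}."""
    return list(product((-1, 0, 1), repeat=d))

def step(config, neighborhood, delta):
    """Apply the local rule delta to every cell at once (2-D torus)."""
    rows, cols = len(config), len(config[0])
    return [
        [delta(tuple(config[(r + dr) % rows][(c + dc) % cols]
                     for dr, dc in neighborhood))
         for c in range(cols)]
        for r in range(rows)
    ]

def parity_rule(states):
    """Example local rule: new state is the parity of the neighborhood states."""
    return sum(states) % 2

# A 5 x 5 configuration with a single non-quiescent cell in the centre.
grid = [[0] * 5 for _ in range(5)]
grid[2][2] = 1

for name, nbh in (("Von Neumann", von_neumann()), ("Moore", moore())):
    new_grid = step(grid, nbh, parity_rule)
    print(name, "neighborhood size:", len(nbh),
          "live cells after one step:", sum(map(sum, new_grid)))
```

In two dimensions the Von Neumann neighborhood has 5 displacement vectors and the Moore neighborhood has 9, so a single live cell influences 5 or 9 cells after one synchronous step.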

Definition 3 The global function of a cellular automaton $A = (Z^d, S, N, \delta)$ is a map $F_A : \phi \to \phi$ defined by $F_A(c)(x) = \delta(c(x + n_1), \cdots, c(x + n_n))$ for every $x \in Z^d$.

Definition 4 The behavior or evolution of a cellular automaton A from a given initial configuration $c_0 \in \phi$ is a sequence of configurations $\{c_t\}_{t \ge 0}$, such that $c_{t+1} = F_A(c_t)$ for every $t \ge 0$.

The sequence $\{c_t\}_{t \ge 0}$ is often designated as the orbit of $c_0$ when CAs are considered as dynamical systems, or as the computation on $c_0$ when they are seen as computation models.

2. Cellular genetic algorithm

Traditional GAs evolve a population of individuals over time by selecting mates from the entire population. Loss of population diversity (convergence) reduces the quality of many solutions.

The algorithm we explore here embeds the evolving population of the GA in a CA. In the CGA, the population is structured in a regular two-dimensional grid and a neighborhood is defined on it. The algorithm iteratively treats each individual in the grid as the current one. An individual may only interact with individuals belonging to its neighborhood, so the parents are chosen among its neighbors with a given criterion. Crossover and mutation operators are applied to the individuals with probabilities Pc and Pm, respectively. After applying these operators, the algorithm computes the fitness value of the new offspring individual (or individuals) and inserts it at the position of the current individual in the new population according to fitness. The neighborhood of Moore type is used in this algorithm.

Cellular genetic algorithm
Step 1 Initialization
    Set the current generation number i = 1;
    Set crossover probability Pc, mutation probability Pm and Termination Condition;
    Generate a random population set P with n individuals;
Step 2 Create the neighborhood set according to CA;
Step 3 Perform crossover, mutation and importation on the neighborhood set;
Step 4 Calculate the fitness for each individual in the neighborhood set;
Step 5 Reproduce individuals to form a new population according to each individual's fitness;
Step 6 Go to Step 2 until Termination Condition is satisfied.

3. NURBS technology

Non-Uniform Rational B-Splines encompass almost every other possible 3D shape definition. A NURBS curve is defined as:

$$C(t) = \frac{\sum_{i=0}^{n} \omega_i P_i N_{i,k}(t)}{\sum_{i=0}^{n} \omega_i N_{i,k}(t)} \tag{1}$$

where $P_i$ are the control points, $\omega_i$ are the weights of $P_i$, and $N_{i,k}(t)$ are B-Spline basis functions, $i = 0, 1, 2, \cdots, n$.

The recursive definition of the B-Spline basis functions $N_{i,k}(t)$ is:

$$N_{i,0}(t) = \begin{cases} 1, & t \in [t_i, t_{i+1}] \\ 0, & t \notin [t_i, t_{i+1}] \end{cases} \tag{2}$$

$$N_{i,k}(t) = \frac{t - t_i}{t_{i+k} - t_i} \times N_{i,k-1}(t) + \frac{t_{i+k+1} - t}{t_{i+k+1} - t_{i+1}} \times N_{i+1,k-1}(t), \quad t \in [t_k, t_{n+1}] \tag{3}$$

in which $t_i$ are the values of the knots, and the knot vector is $T = (t_0, t_1, \cdots, t_{n+k+1})$.

From the definition of the NURBS curve, it can be seen that the shape of a NURBS curve can be changed by moving control points, altering the weights of control points, and modifying the knot vector. The knot vector is a sequence of parameter values that determines where and how the control points affect the NURBS curve. In this paper, the shape is altered by moving control points while the structure lines are being adjusted.

There are many approaches to creating NURBS models, such as transforming basic NURBS elements, or revolving or lofting NURBS curves. NURBS models can therefore be created by outlining structure lines and then lofting them. Following this approach, the structure lines of a successful NURBS model are extracted first. Then the extracted structure lines are adjusted. Finally, these structure lines are lofted to generate another model.

There are both U curves and V curves on NURBS surfaces. U curves run in the landscape orientation while V curves run in the longitudinal orientation on the model surface. A NURBS model can be obtained by lofting its U curves or V curves. Extracting structure curves therefore means drawing out and copying these U and V curves and then adjusting them to form a new model.

4. The coding of cellular genetic algorithm

Solving a given problem with a genetic algorithm starts with specifying a representation of the candidate solutions. Such candidate solutions are seen as phenotypes that can have very complex structures. There are many coding methods, such as binary coding, Gray coding, real coding, symbol coding, tree-structure coding, hybrid coding, and so on. In this paper, the scale factor of each structure curve is taken as a gene and real coding is used to express the scale value. The number of genes is decided by the number of structure curves of the model.

5. Fitness

The designer selects a current best model, and the fitness of every individual is decided by its degree of similarity with the best individual. It is calculated from the ratios between the structure curves of an individual and the structure curves of the best individual.

Definition 5 The structure curve radius $r_i$: the average of the distances from the control points to the center point of the ith structure curve.

Definition 6 The ratio of the current structure curve $Current_i$: the ratio between the radius $r_i$ of the ith structure curve of the current individual and the radius of its first structure curve.

Definition 7 The best ratio of the structure curve $Best_i$: the ratio between the radius $r_i$ of the ith structure curve of the best individual and the radius of its first structure curve.

$$fitness = \frac{1}{\sum_{i=1}^{n} \dfrac{|Best_i - Current_i|}{Best_i}} \tag{4}$$
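Definitions 5 to 7 and Eq.(4) can be sketched in Python as follows. This is only an illustration under stated assumptions: the deviation in Eq.(4) is taken in absolute value, an individual identical to the best one is given infinite fitness, and the function names and sample radii are hypothetical.

```python
import math


def structure_radius(control_points, center):
    """Definition 5: mean distance from the control points of one curve to its center."""
    return sum(math.dist(p, center) for p in control_points) / len(control_points)


def radius_ratios(radii):
    """Definitions 6 and 7: ratio of every curve radius to the radius of the first curve."""
    return [r / radii[0] for r in radii]


def fitness(best_ratios, current_ratios):
    """Eq.(4): reciprocal of the summed relative deviation from the best individual."""
    deviation = sum(abs(b - c) / b for b, c in zip(best_ratios, current_ratios))
    # Assumption: a perfect match (zero deviation) gets infinite fitness.
    return math.inf if deviation == 0 else 1.0 / deviation


best = radius_ratios([2.0, 4.0, 3.0])      # hypothetical radii of the best model
current = radius_ratios([2.0, 3.0, 3.0])   # hypothetical radii of a candidate
score = fitness(best, current)
```

A candidate whose radius ratios lie closer to those of the best model produces a smaller deviation and therefore a larger score.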

According to Eq.(4), the more similar an individual is to the best individual, the higher its fitness value.

IV. Path Generation Based on ABC-PSO Algorithms

1. Particle swarm optimization (PSO) algorithm

Particle swarm optimization (PSO) was originally introduced by J. Kennedy and R. Eberhart[12] in 1995 as an optimization technique. The underlying motivation for the development of the PSO algorithm was the social behavior of animals, such as bird flocking, fish schooling, and swarm theory. Each individual in PSO is assigned a randomized velocity according to its own and its companions' flying experiences, and the individuals are then flown through hyperspace. In the standard PSO model, each individual is treated as a volumeless particle in the D-dimensional space, with the position and velocity of the ith particle represented as $X_i = (X_{i1}, X_{i2}, \cdots, X_{iD})$ and $V_i = (V_{i1}, V_{i2}, \cdots, V_{iD})$. The particles move according to the following equations:

$$V_{id} = w * V_{id} + c_1 * rand() * (P_{id} - X_{id}) + c_2 * Rand() * (P_{gd} - X_{id}) \tag{5a}$$

$$X_{id} = X_{id} + V_{id} \tag{5b}$$

where $c_1$ and $c_2$ are positive constants, and rand() and Rand() are two random functions in the range [0,1]. Parameter w is the inertia weight introduced to accelerate the convergence speed of the PSO. Vector $P_i = (P_{i1}, P_{i2}, \cdots, P_{iD})$ is the best previous position (the position giving the best fitness value) of particle i, called pbest, and vector $P_g = (P_{g1}, P_{g2}, \cdots, P_{gD})$ is the position of the best particle among all the particles in the population, called gbest.

In the PSO algorithm, a particle decides where to move next by considering its own experience, which is the memory of its best past position, and the experience of its most successful neighbor. At each iteration, the particle pbest with the best fitness in the local neighborhood and the current particle are combined to adjust the velocity along each dimension, and that velocity is then used to compute a new position for the particle. The portion of the adjustment to the velocity influenced by the individual's previous best position is considered the cognition component, and the portion influenced by the best in the neighborhood is the social component.

2. The Artificial bee colony (ABC) algorithm

The ABC (Artificial bee colony) algorithm, proposed by Karaboga in 2005 for real-parameter optimization, is an optimization algorithm which simulates the foraging behaviour of a bee colony[13]. The minimal model of swarm-intelligent forage selection in a honey bee colony which the ABC algorithm simulates consists of three kinds of bees: employed bees, onlooker bees and scout bees. Half of the colony consists of employed bees, and the other half includes onlooker bees. Employed bees are responsible for exploiting the nectar sources explored before and giving information to the waiting bees (onlooker bees) in the hive about the quality of the food source sites which they are exploiting. Onlooker bees wait in the hive and decide on a food source to exploit based on the information shared by the employed bees. Scouts either randomly search the environment in order to find a new food source depending on an internal motivation or based on possible external clues. Next, we will explain the ABC-PSO algorithm in detail.

3. ABC-PSO algorithm

(1) Initialization

The algorithm initializes the position and speed of every individual, and m possible solutions.

Suppose the size of the swarm in d-dimensional space is N, $X_i = (x_{i1}, x_{i2}, \cdots, x_{id})$ is the position vector and $V_i = (v_{i1}, v_{i2}, \cdots, v_{id})$ $(i = 1, 2, \cdots, N)$ is the speed vector. The matrices X and V are generated randomly in a restricted range.

(2) Select the target point of swarm dynamically

When selecting the target point of the swarm, the employed bees select new sources according to their source information by Eq.(6).

$$New\ y_{ij} = y_{ij} + \phi_{ij}(y_{ij} - y_{kj}) \tag{6}$$

where $k \in \{1, 2, \cdots, m\}$ and $k \ne i$, $j \in \{1, 2, \cdots, d\}$, and $\phi_{ij}$ is a uniformly distributed real random number in the range [−1, 1].

The fitness value of source $y_i$ is obtained from the distance between the bees and the source and from the average honey content.

$$f(y_i) = \alpha \sqrt{y_{i1}^2 + y_{i2}^2 + \cdots + y_{id}^2} + \beta S/n \tag{7}$$

where S is the honey content in the source and n is the number of bees going to the source.

When the employed bees have finished scouting, they inform the onlooker bees of the position and fitness value of the source. The onlooker bees select the source via the Boltzmann selection policy according to Eq.(8).

$$p_i = \frac{\exp(f_i / T)}{\sum_{i=1}^{SN} \exp(f_i / T)}, \quad T = T_0 (0.99^{c-1}) \tag{8}$$

where $p_i$ is the probability of selecting the source $y_i$, $f_i$ is the fitness of the ith solution, T is the temperature, $T_0$ is the initial temperature, and c is the number of cycles.

(3) Swarm behavior controlling

After the target point of the swarm is selected, the bees are divided into several small groups according to their selected sources. They communicate via an information sharing mechanism and plan paths that avoid collisions.

After initializing the position $x_i$ and speed $v_i$ $(i = 1, 2, \cdots, N)$ of every individual, the individual peak value $P_{best}$ and the global extreme value $g_{best}$ are evaluated with the following fitness:

$$f(x_i) = \alpha d + \beta f(y_{ij}) \tag{9}$$

$$d = \sqrt{(x_{i1} - y_{i1})^2 + (x_{i2} - y_{i2})^2 + \cdots + (x_{id} - y_{id})^2} \tag{10}$$

From Eq.(9), we can see that an individual's fitness $f(x_i)$ is decided jointly by the distance between the individual and the source $y_{ij}$ and by the fitness $f(y_{ij})$ of the source; α and β are impact factors.

$$v_{id} = \omega v_{id} + c_1 rand()(p_{id} - x_{id}) + c_2 rand()(p_{gd} - x_{id}) \tag{11a}$$

$$x_{id} = x_{id} + v_{id} \tag{11b}$$

(According to PSO algorithm)

where ω is the inertia weight, $c_1$ is the weight coefficient of the individual's acceleration, and $c_2$ is the weight coefficient of the global acceleration. In general, $c_1 = c_2 = 2$, and rand() is a random function with range [0, 1]. During execution, if $v_{ij} > v_{max}$ then $v_{ij} = v_{max}$.

The position $x_i$ and speed $v_i$ $(i = 1, 2, \cdots, N)$ of every individual are updated according to Eq.(11). The optimized paths are generated after repeated iteration.

(4) Parameter control

There are three important parameters in the algorithm: the number of possible solutions (food sources) M, the threshold value for abandoning a source LIMIT, and the maximum number of cycles MCN. LIMIT controls the convergence of the algorithm and determines its ability to leave a local peak. If the fitness of the source $y_i$ is still at a low level after LIMIT iterations, the source is abandoned. The source $y_i$ is then replaced by the new source $y_i^j$ according to Eq.(12).

$$y_i^j = y_{min}^j + rand()(y_{max}^j - y_{min}^j) \tag{12}$$

where rand() is a random number in [0, 1].

In this way, onlooker bees and employed bees carry out local search while onlooker bees and scout bees perform global search, so that the algorithm balances global dispersive search against local chemotactic search.

In PSO, a new position vector is calculated using the particle's current and best solutions and the swarm's best solution, while in ABC, a new solution vector is calculated using the employed bee's current solution and a randomly chosen solution. In PSO, the new solution replaces the old one without considering which one is better. In ABC, however, a greedy selection scheme is applied between the new solution and the old one, and the better one is preferred for inclusion in the population. In this way, the information of a good member of the population is distributed among the other members due to the greedy selection mechanism employed. ABC also uses a probabilistic selection scheme in the onlooker bees phase in addition to this greedy selection scheme. The ABC algorithm also has a scout phase, which provides diversity in the population by allowing new random solutions to be inserted in place of solutions that do not provide improvements, while the PSO algorithm does not have such a process.

However, the ABC algorithm does not give the path by which the swarm gathers at a single target. The global scouting of the ABC algorithm can be used to overcome the local optima of PSO, while the path planning of PSO can be used to make the swarm gather at a single target. The combination of the two algorithms can accelerate path generation in crowd simulation.

4. Experimental results and discussions

The basic crowd simulation consists of four basic steering behaviors: cohesion, separation, dynamic object tracking, and collision avoidance.

Our previous work on path generation for dynamic object tracking and collision avoidance based on PSO can be seen in Ref.[14]. In the following, only a path generation example of cohesion is considered to demonstrate the performance and applicability of the proposed algorithms.

(1) Performance comparison

To illustrate the performance of the proposed algorithm, the standard PSO and ABC are compared with ABC-PSO. In the experiment, the three algorithms look for an optimal path in a 700 × 700 mesh containing many obstacles, one initial point and one target point. The performance criteria considered are the quality of the path and the iteration speed. For each algorithm, the population size is 50 and the results are averaged over 20 trials. Fig.3 shows the performance comparison of the three algorithms.

Over the 20 trials, the proposed ABC-PSO algorithm uses less time to find optimal paths than ABC and PSO, which means that it converges better. The average optimal path of ABC-PSO is also the best one. In addition, ABC-PSO shows better stability than the other two. The experimental results illustrate that the proposed ABC-PSO algorithm performs better than the other two.

Fig. 3. The performance comparison of three algorithms

(2) Simulated results

After the path has been generated, it is imported into MAYA software through MEL scripting to generate simulated crowd behaviors. To indicate the effectiveness of the proposed ABC-PSO algorithm, it is used to simulate the cohesive behavior of 100 individuals, as shown in the following figures. These individuals must get to the intermediate building in the shortest time.

Fig.4 is the initial state, in which all individuals are located in different places, and Fig.5 is the resulting state after performing 80 iterations of the ABC-PSO algorithm. It illustrates that the proposed algorithm works well in path generation for simulating crowd cohesion.

Fig. 4. The initial state of the crowd
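The update rules of the ABC-PSO algorithm described in this section (Eqs.(6), (8), (11) and (12)) can be sketched in Python as below. This is a minimal illustration, not the authors' implementation. Several choices are assumptions: a single randomly chosen coordinate is perturbed in Eq.(6), the velocity is clamped symmetrically to ±v_max, and the default values of `w` and `v_max` as well as all function names are hypothetical.

```python
import math
import random


def neighbour_source(sources, i, rng):
    """Eq.(6): perturb one coordinate j of source i relative to another source k != i."""
    k = rng.choice([s for s in range(len(sources)) if s != i])
    j = rng.randrange(len(sources[i]))
    candidate = list(sources[i])
    candidate[j] += rng.uniform(-1.0, 1.0) * (sources[i][j] - sources[k][j])
    return candidate


def boltzmann_probabilities(fitnesses, t0, cycle):
    """Eq.(8): selection probabilities at temperature T = T0 * 0.99**(cycle - 1)."""
    temperature = t0 * 0.99 ** (cycle - 1)
    peak = max(fitnesses)  # shifting by the peak avoids overflow, ratios are unchanged
    weights = [math.exp((f - peak) / temperature) for f in fitnesses]
    total = sum(weights)
    return [w / total for w in weights]


def pso_step(x, v, pbest, gbest, rng, w=0.7, c1=2.0, c2=2.0, v_max=1.0):
    """Eqs.(11a) and (11b), with each velocity component clamped to [-v_max, v_max]."""
    new_x, new_v = [], []
    for d in range(len(x)):
        vd = (w * v[d]
              + c1 * rng.random() * (pbest[d] - x[d])
              + c2 * rng.random() * (gbest[d] - x[d]))
        vd = max(-v_max, min(v_max, vd))
        new_v.append(vd)
        new_x.append(x[d] + vd)
    return new_x, new_v


def scout_source(lower, upper, rng):
    """Eq.(12): replace an abandoned source by a random point inside the bounds."""
    return [lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
```

Passing an explicit `random.Random` instance keeps runs reproducible, which helps when averaging results over repeated trials as in the comparison above.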

Fig. 5. The cohesive state after performing the ABC-PSO algorithm

V. Conclusions

Modeling crowd behavior is an important challenge. Using traditional techniques, both experience and vigor are needed to produce even a short group animation. In this paper, we integrate techniques from computational intelligence, put forward a group modeling approach based on the cellular genetic algorithm, and introduce a group path generating approach based on ABC-PSO algorithms for crowd simulation.

Although they look simple, the modeling and path generation algorithms in this paper present a feasible and useful approach in an animation generative environment. This environment can be used to enhance the fidelity and vitality of computer animation, and to reduce the work intensity of animators remarkably.

Future research will include: (1) extending the work in this paper to model more complex entities; (2) adding population characteristics to the system for the customer classification and evaluation mechanisms. These improvements will be studied in the very near future.

References
[1] A.M. Turing, “The chemical basis of morphogenesis”, Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, Vol.237, No.641, pp.37–72, 1952.
[2] S. Johnson, Emergence: The Connected Lives of Ants, Brains, Cities, and Software, Scribner, New York, 2002.
[3] C. Reynolds, “Flocks, herds, and schools: A distributed behavioral model”, Computer Graphics, Vol.21, No.4, pp.25–34, 1987.
[4] X. Tu, D. Terzopoulos, “Artificial fishes: physics, locomotion, perception, behavior”, Proceedings of SIGGRAPH 1994, NY, USA, pp.43–50, 1994.
[5] C.R. Ward, F. Gobet, G. Kendall, “Evolving collective behavior in an artificial ecology”, Artificial Life, Vol.7, No.2, pp.191–209, 2001.
[6] V.J. Blue, J.L. Adler, “Cellular automata microsimulation for modelling bi-directional pedestrian walkways”, Transportation Research, Part B: Methodological, Vol.35, No.3, pp.293–312, 2001.
[7] X.J. Ban, D.P. Jiang, S.R. Ning and Y.X. Yi, “Research on behavior route selection from a cognitive prospective in computer animation”, Acta Electronica Sinica, Vol.37, No.4, pp.758–763, 2009.
[8] F. Qiu, X. Hu, “Modelling group structures in pedestrian crowd simulation”, Simulation Modelling Practice and Theory, Vol.18, No.1, pp.190–205, 2010.
[9] Y.P. Chen, Y.Y. Lin, “Controlling the movement of crowds in computer graphics by using the mechanism of particle swarm optimization”, Applied Soft Computing, Vol.9, pp.1170–1176, 2009.
[10] N. Fridman, G.A. Kaminka, “Towards a computational model of social comparison: Some implications for the cognitive architecture”, Cognitive Systems Research, Vol.12, pp.186–197, 2011.
[11] T. Ceccherini-Silberstein, M. Coornaert, Cellular Automata and Groups, Springer, 2010.
[12] J. Kennedy, R.C. Eberhart, “Particle swarm optimization”, Proceedings of the IEEE International Conference on Neural Networks IV, Vol.4, IEEE Press, Piscataway, NJ, pp.1942–1948, 1995.
[13] D. Karaboga, “An idea based on honey bee swarm for numerical optimization”, Technical Report TR06, Erciyes University, Engineering Faculty, Computer Engineering Department, 2005.
[14] H. Liu, S. Xu, “Group animation path generation based on particle swarm optimisation”, Proceedings of the 14th International Conference on Computer Supported Cooperative Work in Design, Fudan University, Shanghai, China, pp.37–42, 2010.

LIU Hong was born in 1955. She is now a Professor of computer science in the School of Information Science and Engineering, Shandong Normal University. She received the Ph.D. degree from the Chinese Academy of Sciences in 1998. Her main research interests include computational intelligence and cooperative design. (Email: [email protected])

SUN Yuling was born in 1986. She is now an M.S. degree candidate in the School of Information Science and Engineering, Shandong Normal University. Her main research interests include evolutionary algorithms and crowd animation.

LI Yuanyuan was born in 1986. She received the M.S. degree from Shandong Normal University, and is now a Ph.D. candidate. Her main research interests include evolutionary algorithms and crowd animation.

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

Low Power EEPROM Designed for Sensor Interface Circuit∗

MENG Xiangyun, YANG Sen, CHEN Zhongjian, LU Wengao, ZHANG Yacong, HUANG Jingqing, LI Haojiong, SU Weiguo and LI Song (1.Key Laboratory of Microelectronic Devices and Circuits, Department of Microelectronics, Peking University, Beijing 100871, China)

Abstract — EEPROM is an important part of a sensor interface circuit. It saves the calibration data and parameter setting data in non-volatile storage. A new low-power electrically erasable programmable read-only memory (EEPROM) circuit for sensor interface circuits is introduced in this paper. The data stored in the EEPROM can be reloaded automatically by the power-on-reset technique when the power is turned on. Through status optimization and low-power design, the module consumes power only during power-on-reset and data changing. This EEPROM circuit has been used in a MEMS accelerometer readout circuit and verified in a 0.35µm CMOS EEPROM process. The results are satisfactory in that the module can write and store 25-bit data, automatically reloads the data within 0.1µs after power-on, and then turns to low-power mode.

Key words — Accelerometer, Electrically erasable programmable read-only memory (EEPROM), Low power, Readout circuit.

I. Introduction

A sensor interface circuit needs an EEPROM[1] to save the calibration data and parameter setting data in non-volatile storage[2,3]. For example, a differential MEMS (Micro-electro-mechanical systems) capacitive accelerometer[4] detects external acceleration by detecting the values of a pair of differential capacitors[5]. However, the values may not be equal to each other because of process variations, which can affect performance figures such as zero-point and linearity. Moreover, a sensor interface circuit needs to meet the requirements of accelerometers with different measuring ranges, so its programmability is indispensable[6,7].

This paper presents a new low-power EEPROM design for sensor interface circuits, which has been applied to a MEMS accelerometer readout circuit for the storage of serially input calibration data and parameter setting data. The EEPROM circuit consumes power only during Power-on-reset (POR) and data changing. It automatically reloads the data stored in the EEPROM cells to the corresponding D-latches and accordingly enables the interface circuit. The EEPROM circuit automatically turns into low-power mode after data readout. To further reduce the power consumption, the clock works only when data is allowed to be read into the D-flip-flops, and the charge pump works only when data needs to be erased or programmed into the memory units.

II. EEPROM Circuit Design

The architecture of the EEPROM circuit is shown in Fig.1. CtrlA and CtrlB are control signals; data and clk are the input data and synchronous clock signals. CtrlA, CtrlB and data are converted by the General Control Logic into the inner control signals CtrlA_in and CtrlB_in and the parallel data signal data_in for the EEPROM cells. A Dickson charge pump[8] and a level shift circuit are used to generate the 14V voltage for programming and erasing the floating-gate Bitcells in the EEPROM.

Fig. 1. Architecture of EEPROM circuit

The structure of the EEPROM cell is shown in Fig.2. The Bitcell is the storage unit with a double-gate structure. A Bitcell has 5 ports: the source (S), the drain (D), the select gate (SG), the control gate (CG) and the substrate (Sub). The control gate is of a dual-gate structure and has an ultra-thin oxide layer between the floating gate and the drain. By applying different voltages to the ports of the Bitcell through the EEPROM cell control logic, the charge on the floating gate of the sense transistor is changed,

and consequently its threshold voltage, to record 0 or 1. The recorded value can be read out by checking whether the channel controlled by CG is on when a voltage Vsensing is applied to CG.

Fig. 2. Architecture of EEPROM cell

An EEPROM cell provides four operations, Erase, Program, Read and Halt, under the control of the EEPROM cell control logic. The function, control signals, threshold voltage of the Bitcell and port voltages are shown in Table 1.

Table 1. Status, function, control logic and voltage of EEPROM cell
Status      Erase     Program   Read           Halt
Function    Write 1   Write 0   Readout data   Halt (low power)
CtrlA_in    0         1         1              0
CtrlB_in    1         1         0              0
VTH (V)     > 3V      < 0V      −              −
VD (V)      0         14        3.3            0
VSG (V)     14        14        3.3            3.3
VCG (V)     14        0         Vsensing       0
VS (V)      float     float     0              0

The operation steps when recording or changing data are as follows.
1. Erase: Every EEPROM cell records ‘1’ by operating Erase.
2. Program: The EEPROM cells whose bits should contain ‘0’ record ‘0’ by operating Program.
3. Read: Data in all EEPROM cells is read out and verified by operating Read.
4. Halt: All EEPROM cells are deactivated to save power. The Bitcells' sources and drains are connected to ground and all the D-latches continue to output data.

During EEPROM operations, the new input data is read into the D-flip-flops in steps Read and Halt, and then into the Bitcells in steps Erase and Program if data modification is needed. These data are verified during the next Read and Halt steps. When data only needs to be read out, steps Read and Halt are executed directly.

The power-on-reset circuit in the General control logic is shown on the left of Fig.3. The power-on-reset functions when CtrlA and CtrlB are floating, as shown on the right of Fig.3. Because of the capacitor C1, the voltage of CtrlA jumps to 3.3V together with the power supply voltage V3p3, and then falls back to ground through the resistance R1. The waveform of CtrlA is thus a pulse, which is reshaped into CtrlA_in by the buffer. CtrlB is kept at low level by its connection to ground through a resistance R2 and is reshaped into CtrlB_in by the buffer. The EEPROM cells controlled by CtrlA_in and CtrlB_in operate Read and then Halt automatically. The D-latches in the EEPROM cells latch the data from the Bitcells, delayed by the resistors Rdelay and capacitors Cdelay, at the falling edge of Read, and output them during the Halt operation. When power-on-reset is not needed, the external input signals CtrlA and CtrlB are directly transferred to CtrlA_in and CtrlB_in by the buffers.

Fig. 3. Power-on-reset structure and time-sequence

The charge pump works only when data needs to be erased or programmed into the memory units. It is halted during the other steps. At the same time, during these two steps (Erase and Program), outside data is not allowed to be read into the D-flip-flops, so the clk for the data input of the EEPROM Cell Control Logic is gated by CtrlB. Halting the charge pump and gating the clock further reduce the power consumption.

$$clk_{inner} = clk \cdot \overline{CtrlB} \tag{1}$$

III. Simulation Results

The EEPROM circuit has been designed and simulated in a 0.35µm CMOS EEPROM process on the Cadence Spectre platform.

Fig. 4. Simulation results of EEPROM cell

The simulation results of the EEPROM cell are shown in Fig.4. The Bitcell could not be erased and programmed during the simulation stage, so the simulation results of the EEPROM cell are represented by the port voltages of the Bitcell.

The control signals a and b (CtrlA and CtrlB) change in the order Halt, Erase, Program, Read. The simulated port voltages S, CG, SG and D all agree well with Table 1. The port voltages of SG and CG rise slowly because the charge pump begins to operate when the step switches from Halt to Erase. The two port voltages D, from top to bottom in Fig.4, correspond to the two situations in which the data stored in the EEPROM cell is 0 and 1. If the stored data is 1, there is no need to program the Bitcell during the Program step, so the port voltage of D is 3.3V.

IV. Test Results

The EEPROM circuit was embedded in an accelerometer readout circuit and fabricated in a 0.35µm CMOS EEPROM process. The photograph of the readout chip is shown in Fig.5.

Fig. 5. Chip of the readout circuit

Fig. 6. Test result of data writing and reading

The EEPROM circuit has been tested with a USB 6501 Data-Acquisition device programmed in LabView 8.6. The test result shown in Fig.6 was obtained by comparing the input data with the output data and indicates that the EEPROM circuit functions correctly. The data in the EEPROM can adjust the gain of the readout circuit, the mismatch of the differential capacitors, the offset of the output voltage and the cut-off frequency of the filter.

V. Conclusion

Interface circuits of sensors usually require calibration data and parameter setting data to be saved in non-volatile storage using EEPROM. A new low-power EEPROM design for a MEMS accelerometer readout circuit, exploiting the power-on-reset technique, is presented for the storage of these data. Test results show that the circuit functions correctly and meets the design requirements. This EEPROM design is also suitable for other sensor interface applications.

References
[1] A.F. Murray, L.W. Buchan, “A user's guide to non-volatile, on-chip analogue memory”, Electronics & Communication Engineering Journal, Vol.10, No.2, pp.53–63, 1998.
[2] Rodgers, Bertram, Goenawan, Sofjan, Yunus, Mohammad, Kaneko, Yoshikazu, Yoshiike, Junichi, “16-µA interface circuit for a capacitive flow sensor”, IEEE Journal of Solid-State Circuits, Vol.33, No.12, pp.2121–2133, 1998.
[3] H.K. Trieu, M. Knier, O. Koester, H. Kappert, M. Schmidt, W. Mokwa, “Monolithic integrated surface micromachined pressure sensors with analog on-chip linearization and temperature compensation”, Proceedings of the IEEE Micro Electro Mechanical Systems (MEMS), Miyazaki, Japan, pp.547–550, 2000.
[4] Wang Yaolin, Xu Yigang, Chen Feng, Hu Xinsong, “A new silicon unitized micro-acceleration sensors”, Chinese Journal of Electronics, Vol.3, pp.36–40, 1994.
[5] Wook Bahn, Hyoungho Ko, Taedong Ahn, Kwangho Yoo, Sangyoon Lee, Cheolkyu Han, Deog-kyoon Jeong, Dong-il Cho, “A 16-bit ultra-thin tri-axes capacitive micro accelerometer for mobile application”, Control, Automation and Systems, ICCAS '07, Seoul, Korea, pp.1970–1973, 2007.
[6] Zhang Guohua, “Temperature compensation of a capacitive pressure transducer”, Chinese Journal of Electronics, Vol.5, pp.124–125, 1996.
[7] Ko, Hyoungho, Cho, Dong-il Dan, “Highly programmable temperature compensated readout circuit for capacitive micro accelerometer”, Sensors and Actuators, A: Physical, Vol.158, No.1, pp.72–83, 2010.
[8] T. Tanzawa and T. Tanaka, “A dynamic analysis of the Dickson charge pump circuit”, IEEE J. Solid-State Circuits, Vol.32, No.8, pp.1231–1240, 1997.

Meng Xiangyun was born in Handan, Hebei Province, China. She received the B.S. degree from the Department of Microelectronics, Peking University, China, in 2010. She is currently pursuing the M.S. degree in the Key Laboratory of Microelectronic Devices and Circuits, Department of Microelectronics, Peking University. Her research focuses on CMOS readout circuits for accelerometers and focal plane arrays. (Email: [email protected])

∗Manuscript Received Oct. 2011; Accepted Mar. 2012. This work is supported by the National High Technology Research and Development Program of China (863 Program) (No.2008AA042201).

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

Orthogonality is Better: Auxiliary Problems in ASO Algorithm∗

ZHANG Taozheng1, WANG Xiaojie2 and TONG Hui3

(1.Information Engineering School, Communication University of China, Beijing 100024, China) (2.Center for Intelligence Science and Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China) (3.School of Science, Beijing University of Posts and Telecommunications, Beijing 100876, China)

Abstract — We propose a principle called orthogonality for the selection of Auxiliary problems (APs) in the Alternating structure optimization (ASO) algorithm. Both theoretical analyses and experimental results support the following conclusions. If the weight matrices of different types of APs are orthogonal or approximately orthogonal, their multi-combinations perform better than or equal to any of their components. Moreover, as long as the ratios of their components are appropriate, the multi-combinations still perform better than or equal to any of their components even if the total amount of APs is fixed. In short, the principle of orthogonality holds.

Key words — Alternating structure optimization (ASO), Auxiliary problems (APs), Target problems (TPs), Orthogonality.

I. Introduction

Recently, semi-supervised learning algorithms, which involve labeled and unlabeled data at the same time, have become more and more popular[1,2]. However, they often work well only when there is a very small amount of labeled data, and they have not been shown to improve state-of-the-art performance when a large amount of labeled data is available[3].

The Alternating structure optimization (ASO) algorithm, which is based on linear structural learning, is also a semi-supervised learning algorithm[3]. Its significant advantage over the traditional ones is that it can improve performance regardless of the amount of labeled data[3−5]. At present, the ASO algorithm has been widely applied to many different tasks in NLP, e.g., Part-of-Speech tagging[3], named entity recognition[3,4], syntactic chunking[4,6], semantic role labeling[7,8], text categorization[3], word sense disambiguation[9], etc.

The kernel of the ASO algorithm is the creation of good Auxiliary problems (APs). So far, there has been little research on general methods for creating them. Ando and Zhang[3] give two principles on how to create APs, namely automatic labeling and relevancy. However, these are not enough for APs selection. Consider a case in which there are two types of APs, AP1 and AP2, whose amounts are n and m respectively. All of the APs can be created from unlabeled data. How should we select APs among them? Neither of the preceding two principles can answer this question. In order to resolve it, we propose another principle: orthogonality. We first give theoretical analyses of it, and then carry out experiments on the task of Chinese syntactic chunking. Both indicate the following conclusions. If the weight matrices of different types of APs are orthogonal or approximately orthogonal, their multi-combinations perform better than or equal to any of their components. Moreover, as long as the ratios of their components are appropriate, the multi-combinations still perform better than or equal to any of their components even if the total amount of APs is fixed. In short, the principle of orthogonality holds.

The rest of this paper is organized as follows. We first give a brief overview of the ASO algorithm and the principles on how to create APs in Section II. The theoretical analyses of the principle of orthogonality are described in detail in Section III. We provide the related experiments and discussions in Section IV and conclude in Section V.

II. A Brief Overview of ASO

The form of the standard linear predictive model is $y = f(x) = u^{T}x$, where $u$ is called the weight vector. A usual method to search for the optimum predictive classifier is ERM (Empirical risk minimization) over n labeled examples $\{(X_i, Y_i),\ i = 1, \cdots, n\}$ with a quadratic regularization term, which controls the model complexity:

$$\hat{u} = \arg\min_{u} \frac{1}{n}\sum_{i=1}^{n} L(u^{T}X_i, Y_i) + \lambda\|u\|^{2} \tag{1}$$

The function $L(p, y)$ in Eq.(1) is called the loss function. In this work, we use a modification of Huber’s robust loss function[3].
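The objective of Eq.(1), combined with the modified Huber loss given in Eq.(2), can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the function names, the toy samples and the value of `lam` are hypothetical, labels are taken to be in {-1, +1}, and no optimizer is included.

```python
def modified_huber_loss(p, y):
    """Modified Huber loss L(p, y) of Eq.(2); y is a label in {-1, +1}."""
    margin = p * y
    if margin < -1:
        return -4.0 * margin
    if margin < 1:
        return (1.0 - margin) ** 2
    return 0.0


def regularized_risk(u, samples, lam):
    """Objective of Eq.(1): mean loss over the samples plus lam * ||u||^2."""
    mean_loss = sum(
        modified_huber_loss(sum(ui * xi for ui, xi in zip(u, x)), y)
        for x, y in samples
    ) / len(samples)
    return mean_loss + lam * sum(ui * ui for ui in u)


# Toy data (made up for illustration): two samples with two features each.
samples = [([1.0, 0.0], 1), ([0.0, 1.0], -1)]
risk = regularized_risk([2.0, -2.0], samples, lam=1e-4)
```

With the weight vector above both samples have margin 2, so the loss term vanishes and only the regularization term remains. The two non-zero branches of the loss meet at margin −1 (both give 4), which keeps the loss continuous.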

∗Manuscript Received July 2011; Accepted Feb. 2012. This work is supported by the National Natural Science Foundation of China (No.90920006).

$$L(p, y) = \begin{cases} -4py, & \text{if } py < -1 \\ (1 - py)^{2}, & \text{if } -1 \le py < 1 \\ 0, & \text{if } py \ge 1 \end{cases} \tag{2}$$

Suppose we solve m binary classification problems at the same time. They are indexed by $l \in \{1, \cdots, m\}$. Each of them includes $n_l$ training samples $\{(X_i^{l}, Y_i^{l}),\ i = 1, \cdots, n_l\}$. In our joint linear model, a predictor for problem $l$ takes the following form:

$$f_l(\theta, x) = w_l^{T}x + v_l^{T}\theta x, \quad \theta\theta^{T} = I \tag{3}$$

Parameters $w_l$ and $v_l$ are weight vectors specific to each problem $l$. Matrix $\theta$ is the common structure parameter shared by all the problems, and its rows are orthonormal. The kernel of the model can be regarded as learning a good common feature map $\theta x$, which is a low-dimensional feature vector parameterized by matrix $\theta$. The parameters $\{w_l, v_l\}$ may then be computed by JERM (Joint empirical risk minimization) over all the problems, that is, their values should minimize the combined empirical risk:

$$\{\hat{w}_l, \hat{v}_l\} = \arg\min_{w_l, v_l} \sum_{l=1}^{m}\left(\frac{1}{n_l}\sum_{i=1}^{n_l} L\big((w_l + \theta^{T}v_l)^{T}X_i^{l}, Y_i^{l}\big) + \lambda\|w_l\|^{2}\right) \tag{4}$$

Here an important observation is that the binary classification problems used to derive $\theta$ are not necessarily the problems that we are aiming to solve[3]. In fact, new problems can be invented for the purpose of obtaining a better $\theta$. Some simple problems can be created to approximate $\theta$ with less computational complexity. It is this observation that leads to the ASO algorithm. The created problems are called Auxiliary problems (APs), while the original problems are called Target problems (TPs). It is clear that the creation and selection of APs are the kernel of the ASO algorithm. Ando and Zhang[3] give two ad-hoc principles on how to create APs.

Automatic labeling  We need to automatically generate various “labeled” data for the APs from unlabeled data. If the APs that we create are simple enough, we can conveniently construct the corresponding classifiers from a large amount of unlabeled data and solve the TPs based on Eq.(4).

Relevancy  APs should be related to the TPs to some extent, e.g., they should share certain predictive structures, and the features they use should be similar. Obviously, the more relevant the APs are to the TPs, the more they support the TPs. Furthermore, we directly define relevancy as in Ref.[10], i.e., as the precision of prediction for the target problems after a certain type of AP is added. In other words, the higher the precision achieved after a certain type of AP is added, the more relevant that type of AP is to the TPs.

The above two principles cannot answer all questions on how to create and select APs. If several types of APs are available, how should we select among them to further improve the performance? If the total amount of APs is fixed, how should we select them to compose the multi-combinations? We focus on those problems in the following sections.

III. Theoretical Analyses of Orthogonality

A new principle for APs selection, called the principle of orthogonality, is introduced in this section.

Orthogonality  If the weight matrices of different types of APs are orthogonal or approximately orthogonal, their multi-combinations perform better than or equal to any of their components. Moreover, as long as the ratios of their components are appropriate, the multi-combinations still perform better than or equal to any of their components even if the total amount of APs is fixed.

For a given TP, assume that we have two types of APs: AP1(n) and AP2(m). The numbers in parentheses indicate that the amounts of APs are n and m respectively. The weight matrices of AP1 and AP2 are $A = [u_1, u_2, \cdots, u_n]$ and $B = [\tilde{u}_1, \tilde{u}_2, \cdots, \tilde{u}_m]$ respectively. Now we consider the bi-combination AP1(n)&AP2(m). There are n + m APs in all, and its weight matrix is $U = [A\ B]$. As is well known, performing Singular value decomposition (SVD) on a matrix $U$ and taking its singular values is equivalent to performing eigenvalue decomposition on the matrix $U^{T}U$ and taking the square roots of its eigenvalues[11]. So we need to analyze matrix $U^{T}U$:

$$U^{T}U = [A\ B]^{T}[A\ B] = \begin{bmatrix} A^{T} \\ B^{T} \end{bmatrix}[A\ B] = \begin{bmatrix} A^{T}A & A^{T}B \\ B^{T}A & B^{T}B \end{bmatrix} \tag{5}$$

1. Strictly orthogonal

If AP1 and AP2 are strictly orthogonal, i.e., the inner product matrix of their weight matrices is the zero matrix ($A^{T}B = 0$), we obtain Eq.(6).

$$U^{T}U = \begin{bmatrix} A^{T}A & A^{T}B \\ B^{T}A & B^{T}B \end{bmatrix} = \begin{bmatrix} A^{T}A & 0 \\ 0 & B^{T}B \end{bmatrix} \tag{6}$$

Matrix $U^{T}U$ is block diagonal. Performing SVD on matrix $U$ and taking the $h$ largest singular values is therefore equivalent to pooling all the singular values of matrices $A$ and $B$ and taking the $h$ largest of them[11]. As a result, the $h$ largest singular values of matrix $U$ are larger than or equal to the $h$ largest singular values of matrix $A$ or matrix $B$ respectively. We explain this as follows for clarity.

Assume the singular values of matrix $A$ and matrix $B$ are defined as in Eq.(7) and Eq.(8) respectively, and those of matrix $U$ as in Eq.(9).

$$\lambda_1, \lambda_2, \cdots, \lambda_r, \lambda_{r+1}, \cdots, \lambda_n \quad (\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_r \ge \lambda_{r+1} = \cdots = \lambda_n = 0,\ 1 \le r \le n) \tag{7}$$

$$\mu_1, \mu_2, \cdots, \mu_s, \mu_{s+1}, \cdots, \mu_m \quad (\mu_1 \ge \mu_2 \ge \cdots \ge \mu_s \ge \mu_{s+1} = \cdots = \mu_m = 0,\ 1 \le s \le m) \tag{8}$$

$$\sigma_1, \sigma_2, \cdots, \sigma_t, \sigma_{t+1}, \cdots, \sigma_{n+m} \quad (\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_t \ge \sigma_{t+1} = \cdots = \sigma_{n+m} = 0,\ 1 \le t \le n + m) \tag{9}$$

So we can obtain the following conclusion from those analyses:

$$\sigma_i \ge \max\{\lambda_i, \mu_i\}, \quad \text{s.t. } 1 \le i \le h \le \min(n, m) \tag{10}$$

Compared with matrix $A$ or $B$, matrix $U$ can better represent the “principal components” of the space of predictors (classifiers) made up of APs. So the performance of the bi-combination AP1(n)&AP2(m) is superior to that of each single type.
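The strictly orthogonal case of Eq.(6) and the inequality of Eq.(10) can be checked numerically. The sketch below uses NumPy with made-up weight matrices (they are assumptions for illustration, not the matrices of the experiments): AP1 is given non-zero weights only on the first six features and AP2 only on the last six, so that $A^{T}B = 0$ holds exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weight matrices with disjoint supports, so every column of A
# is orthogonal to every column of B.
A = np.vstack([rng.random((6, 4)), np.zeros((6, 4))])   # n = 4 APs of type AP1
B = np.vstack([np.zeros((6, 3)), rng.random((6, 3))])   # m = 3 APs of type AP2
U = np.hstack([A, B])                                    # bi-combination weight matrix

assert np.allclose(A.T @ B, 0.0)            # strict orthogonality, as in Eq.(6)

sv_A = np.linalg.svd(A, compute_uv=False)   # lambda_1 >= ... >= lambda_n
sv_B = np.linalg.svd(B, compute_uv=False)   # mu_1 >= ... >= mu_m
sv_U = np.linalg.svd(U, compute_uv=False)   # sigma_1 >= ... >= sigma_{n+m}

# The singular values of U are the pooled singular values of A and B.
pooled = np.sort(np.concatenate([sv_A, sv_B]))[::-1]
assert np.allclose(sv_U, pooled)

# Eq.(10): sigma_i >= max(lambda_i, mu_i) for 1 <= i <= h <= min(n, m).
h = min(A.shape[1], B.shape[1])
assert np.all(sv_U[:h] >= np.maximum(sv_A[:h], sv_B[:h]) - 1e-12)
```

Disjoint supports are only one convenient way to build exactly orthogonal matrices; any pair with $A^{T}B = 0$ would behave the same way.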

However, $A^{T}B = 0$ is too strong a condition to be satisfied in practice. So we extend the preceding conclusion to the case in which matrix $A$ and matrix $B$ are approximately orthogonal, i.e., $A^{T}B \approx 0$. The following theoretical analyses indicate that the principle of orthogonality still applies under this condition. This extension is very useful in practice, and all the experiments in Section IV rely on it.

2. Approximately orthogonal

Assume that matrix $A$ and matrix $B$ are still made up of $u_i$ $(1 \le i \le n)$ and $\tilde{u}_j$ $(1 \le j \le m)$ respectively. Both $u_i$ and $\tilde{u}_j$ are weight vectors of a single type of APs. Moreover, we zero out their negative values as is done in Ref.[3]. The reason is that the positive weights of a linear classifier are usually directly related to the target concept, while the negative components often yield much less specific information. In fact, in the SVD computation, we only use the positive components of the weight vectors[3]. After this treatment, the weight vectors $u_i$ and $\tilde{u}_j$ are sparse. We then analyze matrix $A^{T}B$, which can be rewritten as Eq.(11),

$$A^{T}B = [u_1, u_2, \cdots, u_n]^{T}[\tilde{u}_1, \tilde{u}_2, \cdots, \tilde{u}_m] = \begin{bmatrix} u_1^{T} \\ u_2^{T} \\ \vdots \\ u_n^{T} \end{bmatrix}[\tilde{u}_1, \tilde{u}_2, \cdots, \tilde{u}_m] = \begin{bmatrix} u_1^{T}\tilde{u}_1 & u_1^{T}\tilde{u}_2 & \cdots & u_1^{T}\tilde{u}_m \\ u_2^{T}\tilde{u}_1 & u_2^{T}\tilde{u}_2 & \cdots & u_2^{T}\tilde{u}_m \\ \vdots & \vdots & \cdots & \vdots \\ u_n^{T}\tilde{u}_1 & u_n^{T}\tilde{u}_2 & \cdots & u_n^{T}\tilde{u}_m \end{bmatrix} \tag{11}$$

For weight vectors of different types of APs, the positions of the nonzero elements are likely to differ. That is, the nonzero elements of $u_i$ and $\tilde{u}_j$ are generally located at different positions of the vector. So the probability that Eq.(12) holds is high.

$$u_i^{T}\tilde{u}_j = 0 \quad (1 \le i \le n,\ 1 \le j \le m) \tag{12}$$

This makes matrices $A^{T}B$ and $B^{T}A$ $(= (A^{T}B)^{T})$ even sparser, so that they can be approximately viewed as zero matrices. As a result, the requirement that the weight matrices of two different types of APs be approximately orthogonal is easily satisfied.

The nonzero elements of matrices $A^{T}B$ and $B^{T}A$ perturb the eigenvalues of matrix $U^{T}U$ numerically. In the following, we use the Gerschgorin Circles of matrix theory[11] to analyze this perturbation quantitatively.

The definition of Gerschgorin Circles  The eigenvalues of a matrix $A = (a_{ij})_{n \times n} \in C^{n \times n}$ are contained in the union $G_r$ of the n Gerschgorin Circles defined by

$$|z - a_{ii}| \le r_i, \quad \text{where } r_i = \sum_{j=1,\ j \ne i}^{n} |a_{ij}|, \quad \text{for } i = 1, 2, \cdots, n \tag{13}$$

In other words, the eigenvalues are trapped in the collection of circles centered at $a_{ii}$ with radii given by the sum of the absolute values in row $A_{i*}$ with $a_{ii}$ deleted.

In short, if the nonzero values of matrix $A^{T}B$ or $B^{T}A$ are small enough, i.e., these matrices can be approximately viewed as zero matrices, the perturbations of the eigenvalues of matrix $U^{T}U$ are small. Thus the order of the singular values of $U$ is not affected and the theoretical premises of orthogonality are still satisfied. As a result, we can draw the following conclusion. If the weight matrices of different types of APs are orthogonal or approximately orthogonal, their multi-combinations perform better than or equal to any of their components.

3. Total amount of APs is fixed

It is worth noting that the amount of APs in the bi-combination AP1(n)&AP2(m) is n + m, while the amounts of AP1 and AP2 are n and m respectively. One may wonder whether the superiority of the original bi-combination would still be obvious if the amount of AP1 or AP2 alone were increased to m + n. In other words, it is not yet clear why the multi-combinations perform better. Is it because they contain multiple types of APs, or because the amount of APs becomes larger?

Let us consider the following case. For a given TP, assume that we create two types of APs, AP1 and AP2, whose weight matrices are approximately orthogonal. Moreover, the appropriate total amount of APs is fixed, e.g., to $t$. There are three schemes for APs selection:

(1) The $t$ APs are all of type AP1;
(2) The $t$ APs are all of type AP2;
(3) The $t$ APs are made up of both AP1 and AP2, with amounts n and m respectively, s.t. n + m = t.

Let $\lambda_1^{2}, \lambda_2^{2}, \cdots, \lambda_t^{2}$ $(\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_t)$ be the eigenvalues of matrix $A^{T}A$. The singular value matrix of matrix $A$ can then be expressed as

$$D^{a} = \mathrm{diag}[\lambda_1, \lambda_2, \cdots, \lambda_t] \tag{14}$$

Similar settings apply to matrix $B$. Suppose its singular value matrix is

$$D^{b} = \mathrm{diag}[\mu_1, \mu_2, \cdots, \mu_t] \quad (\mu_1 \ge \mu_2 \ge \cdots \ge \mu_t) \tag{15}$$

We further assume that their performances are different, e.g., the performance of AP1 is inferior to that of AP2. This means that matrix $B$ can better represent the “principal components” of the space of predictors (classifiers) made up of APs. In general, its singular values $\mu_j$ $(1 \le j \le t)$ are, on the whole, bigger than those of matrix $A$, $\lambda_i$ $(1 \le i \le t)$.

This naturally suggests a new idea. Firstly, sort all the singular values of matrices $D^{a}$ and $D^{b}$ in descending order and select the top $t$ singular values. Assume that n and m of them come from matrices $D^{a}$ and $D^{b}$ respectively, s.t. n + m = t. Secondly, create another singular value matrix $D^{c}$ from them, so that matrix $D^{c}$ holds the $t$ biggest singular values of all. Accordingly, we can take the n AP1s and m AP2s whose singular values are in matrix $D^{c}$ to realize the third scheme. In conclusion, it should be the most effective of the three schemes.

The above analyses also apply to tri-combinations, qua-combinations, and so forth. Generally speaking, the principle of orthogonality is theoretically credible.

IV. Experiments and Discussions

In this section, we further investigate the principle of orthogonality in a Natural language processing (NLP) task, Chinese syntactic chunking. Suppose the TP in this task is to label the subject components of Chinese sentences, i.e., to predict whether the current word is a part of the subject or not. Input sentences have been segmented and labeled with POS. Four different types of APs are created for the ASO algorithm.

1. APs creation

Table 1 shows the four different types of APs created in the experiments. The number immediately after “AP” is its serial number, e.g., “AP1” means the first type of APs, while the number in parentheses, e.g., “100”, is its amount[12]. APs are created from the LDC Chinese Tree Bank (CTB) 2.0 and CTB 5.1, which have been segmented and labeled with POS. Those corpora can be regarded as unlabeled data for the TP here.

Table 1. Four types of APs

| Types and amounts of APs | Meanings |
|---|---|
| AP1(100) | to predict whether the current word (labeled w0) is the given noun |
| AP2(30) | to predict whether the current tag (labeled p0) is the given tag |
| AP3(100) | to predict whether the previous word (labeled w−1) is the given noun |
| AP4(100) | to predict whether the next word (labeled w+1) is the given noun |

2. Experimental settings

The corpora used in the experiments are CTB 2.0 and CTB 5.1. There are 601,659 words in total. The corpora are divided evenly into 5 shares, and we perform 5-fold cross validation on them. In order to simulate a situation with little training data, we select only one share for training and the remaining four shares for testing. The purpose of this paper is to compare and analyze the contributions of multi-combinations to the TP, so we do not describe the feature selection process in detail. Only simple features, such as segmentations and POS, are selected. Both APs and TP share the same features, as shown in Table 2[13].

Table 2. Features selected

| Categories of features | Features |
|---|---|
| Word | w−2, w−1, w0, w+1, w+2 |
| POS | p−2, p−1, p0, p+1, p+2 |
| Bi-word | (w−2, w−1), (w−1, w0), (w0, w+1), (w+1, w+2) |
| Bi-POS | (p−2, p−1), (p−1, p0), (p0, p+1), (p+1, p+2) |

Precision, Recall and F measure are adopted for evaluating the performance. The loss function we select is Huber’s robust loss function, as shown in Eq.(2). The regularization parameter $\lambda$ is fixed to $10^{-4}$ and $\|u\|^{2}$ is defined by $\sum_{i=1}^{n} u_i^{2}$. The parameter $h$ takes the values 20, 10, 5 and 1, respectively. Besides, we select the L-BFGS method to estimate the parameters and the linear classifier as the baseline classifier[14].

The experimental results of multi-combinations are introduced first, followed by the performance when the total amount of APs is fixed. Finally, experiments on how to select the appropriate total amount of APs are discussed.

3. Results and discussions

(1) Performances of multi-combinations

We list the Maximum values, Average values and 2-Norm values of the weight matrices for each bi-combination in Table 3. From these values we can validate that the four different types of APs are pairwise approximately orthogonal. Thus the principle of orthogonality can be applied to them. The F measures of multi-combinations including AP1 are shown in Fig.1. The first column in Fig.1 lists the F measure of AP1 alone, while the following columns show the F measures of bi-combinations, tri-combinations and qua-combinations respectively. All those multi-combinations include AP1. Each time an extra type of APs is added, the performance improves to a greater or lesser degree. Similar results can be obtained for the other three types of APs.

Table 3. Parameters of weight matrices for bi-combinations

| Bi-combinations | Maximum values | Average values | 2-Norm values |
|---|---|---|---|
| AP1&AP2 | 0.6845 | 0.0093 | 1.3923 |
| AP1&AP3 | 0.7007 | 0.0107 | 1.5031 |
| AP1&AP4 | 0.4805 | 0.0013 | 0.8917 |
| AP2&AP3 | 0.6556 | 0.0045 | 1.3734 |
| AP2&AP4 | 0.5467 | 0.0027 | 0.9639 |
| AP3&AP4 | 0.6098 | 0.0034 | 1.1256 |

Fig. 1. F measures of multi-combinations including AP1

In a word, the performance of multi-combinations is superior or equivalent to that of their components. In general, they are not inferior to their components.

(2) Performances when the amount of APs is fixed

In the following experiments, we take the total amount t = 50 as an example and select AP1, AP3 and AP4 to compose the multi-combinations.

Fig.2 shows the singular values of AP1, AP3 and AP4 respectively. We can observe that the singular values of AP4 are generally bigger than those of AP3, while the singular values of AP1 are the smallest of all. Consequently, we can predict that the performance of AP4 is the best of all, that of AP1 is the worst, and that of AP3 is in the middle. The following experimental results further confirm these predictions. The single-type rows of Table 4 display the performances of AP1(50), AP3(50) and AP4(50) in detail.

Fig. 2. Singular values of AP1, AP3 and AP4
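The selection idea of Section III.3 (pool the singular values of the candidate types, keep the $t$ largest, and read off how many come from each type) can be sketched as follows. The function name `select_ratio` and the singular values in the example are hypothetical and chosen only to make the behaviour easy to follow; they are not the values plotted in Fig.2.

```python
def select_ratio(singular_values_by_type, t):
    """Pool the singular values of all AP types, keep the t largest,
    and count how many of them come from each type."""
    pooled = [
        (value, ap_type)
        for ap_type, values in singular_values_by_type.items()
        for value in values
    ]
    pooled.sort(key=lambda item: item[0], reverse=True)
    counts = {ap_type: 0 for ap_type in singular_values_by_type}
    for _, ap_type in pooled[:t]:
        counts[ap_type] += 1
    return counts


# Made-up singular values for two types of APs.
example = {"AP1": [3.0, 1.2, 0.4], "AP2": [5.0, 2.5, 1.0]}
print(select_ratio(example, 4))  # {'AP1': 2, 'AP2': 2}
```

The returned counts play the role of n and m in the third scheme, so they always sum to $t$ as long as at least $t$ singular values are available.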

Firstly, we introduce the experimental results of bi-combinations. Table 5 shows, in columns, the top 50 singular values of all the bi-combinations and of the tri-combination. For each combination, each entry gives a singular value followed, in parentheses, by the type of APs it belongs to. Besides, the last row lists the ratios of the selected types of APs. According to the first three columns of Table 5, we can select appropriate ratios for each bi-combination. The corresponding experimental results are shown in the bi-combination rows of Table 4. The numbers in parentheses represent the amounts of the selected types of APs. The F measures of bi-combinations are superior to those of single types by 6% on average.

Table 4. Performances of multi-combinations of APs

| Multi-combinations | Precision (%) | Recall (%) | F (%) |
|---|---|---|---|
| AP1(50) | 88.85 | 52.13 | 65.71 |
| AP3(50) | 95.88 | 65.88 | 78.10 |
| AP4(50) | 95.73 | 66.16 | 78.24 |
| AP1(11)&AP3(39) | 96.77 | 72.27 | 82.74 |
| AP1(12)&AP4(38) | 97.32 | 72.95 | 83.39 |
| AP3(21)&AP4(29) | 97.08 | 73.17 | 83.45 |
| AP1(7)&AP3(18)&AP4(25) | 97.30 | 73.43 | 83.70 |

Secondly, the performance of the tri-combination is shown in the last row of Table 4. The ratios of the different types of APs are determined from the fourth column of Table 5. The F measure of the tri-combination is the best of all, although it is higher than those of the bi-combinations by only 1% or so. As far as we know, 84% is the upper bound of the F measure in the present experimental settings.

Table 5. The top 50 singular values of multi-combinations

| AP1&AP3 | AP1&AP4 | AP3&AP4 | AP1&AP3&AP4 |
|---|---|---|---|
| 6.9255 (3) | 7.3732 (4) | 7.3732 (4) | 7.3732 (4) |
| 6.1455 (3) | 6.6436 (4) | 6.9255 (3) | 6.9255 (3) |
| ... | ... | ... | ... |
| 2.6943 (3) | 2.8414 (1) | 3.1478 (3) | 3.3405 (1) |
| 2.6863 (3) | 2.8054 (4) | 3.1371 (3) | 3.3388 (4) |
| AP1:11&AP3:39 | AP1:12&AP4:38 | AP3:21&AP4:29 | AP1:7&AP3:18&AP4:25 |

(3) Appropriate total amount of APs

Firstly, we consider the bi-combination AP1&AP3. The total amount of each single type is also fixed to 50. Table 6 shows the performances of different total amounts for bi-combination AP1&AP3. The ratios in the parentheses represent the amounts of selected AP1 and AP3. We further illustrate those F measures with the dashed line in Fig.3 for clarity. When parameter $t$ equals 5, the combination is made up only of AP3 according to the first column of Table 5, and the corresponding F measure is not satisfactory. By contrast, when the total amount $t$ increases to 30, 50, 70 and 90 respectively, i.e., the combinations are all made up of two types of APs, the corresponding F measures rise to a higher level. However, they are almost the same as each other. The performances of the other bi-combinations are similar. Therefore, we can roughly conclude that an appropriate total amount should contain all types of APs. Secondly, we consider the tri-combination AP1&AP3&AP4. Table 6 and the solid line in Fig.3 both display the performances for different total amounts. It is easy to see that the conclusions obtained above also hold for the tri-combination.

Fig. 3. F measures of different total amounts for AP1&AP3 and AP1&AP3&AP4

Table 6. Performances of different total amounts of AP1&AP3&AP4

| Amounts & ratios | Precision (%) | Recall (%) | F (%) |
|---|---|---|---|
| 1(0:0:1) | 50.11 | 95.97 | 65.84 |
| 15(0:6:9) | 68.83 | 73.27 | 70.98 |
| 30(6:10:14) | 97.09 | 73.74 | 83.82 |
| 50(8:17:25) | 97.30 | 73.43 | 83.70 |
| 70(10:25:35) | 97.65 | 73.50 | 83.87 |
| 90(10:35:45) | 97.40 | 73.11 | 83.52 |

V. Conclusion

We focus on the principle of orthogonality for APs selection in the ASO algorithm. We first give theoretical analyses, and then take an NLP task, Chinese syntactic chunking, as an example to investigate it. The theoretical analyses and the experimental results both validate the following facts. If the weight matrices of different types of APs are orthogonal or approximately orthogonal, their multi-combinations perform better than or equal to any of their components. Moreover, as long as the ratios of their components are appropriate, the multi-combinations still perform better than or equal to any of their components even if the total amount of APs is fixed. In conclusion, the principle of orthogonality holds.

In the future, we plan to apply the principle of orthogonality to more tasks, e.g., Chinese semantic role labeling.

References

[1] Y. Zhong, “A unified theory of information, knowledge and intelligence”, Chinese Journal of Electronics, Vol.12, No.3, pp.391–396, 2003.
[2] Y. Zhong, “Consciousness machines: theory and applications”, Chinese Journal of Electronics, Vol.6, No.1, pp.42–45, 1997.
[3] R.K. Ando, T. Zhang, “A framework for learning predictive structures from multiple tasks and unlabeled data”, Journal of Machine Learning Research, Vol.6, No.11, pp.1817–1853, 2005.
[4] R.K. Ando, T. Zhang, “A high-performance semi-supervised learning method for text chunking”, Proc. of the 43rd Annual Meeting of the Association for Computational Linguistics, Ann Arbor, Michigan, USA, pp.1–9, 2005.

[5] J. Chen, L. Tang, J. Liu and J. Ye, “A convex formulation for learning shared structures from multiple tasks”, Proc. of the 26th International Conference on Machine Learning, Montreal, Canada, pp.137–144, 2009.
[6] X. Bai, T. Zhang, S. He and X. Wang, “Chinese syntactic chunking based on ASO algorithm”, Proc. of the 13th China National Conference on Artificial Intelligence, Beijing, China, 2009.
[7] C. Liu, H.T. Ng, “Learning predictive structures for semantic role labeling of NomBank”, Proc. of the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, pp.208–215, 2007.
[8] S. He, T. Zhang, X. Wang, X. Bai and Y. Dong, “Incorporating multi-task learning in conditional random fields for chunking in semantic role labeling”, Proc. of International Conference on Natural Language Processing and Knowledge Engineering, Dalian, China, pp.47–51, 2009.
[9] R.K. Ando, “Applying alternating structure optimization to word sense disambiguation”, Proc. of the 10th Conference on Computational Natural Language Learning, New York City, USA, pp.77–84, 2006.
[10] T. Zhang, X. Wang, X. Bai and S. He, “Relevancy of auxiliary problems in alternating structure optimization algorithm”, Journal of Beijing University of Posts and Telecommunications, Vol.34, No.4, pp.14–18, 2011.
[11] G.H. Golub, C.F.V. Loan, Matrix Computations (3rd edn), Johns Hopkins University Press, Baltimore, Maryland, USA, 1996.
[12] T. Zhang, X. Wang and H. Tong, “Researches on combinations of auxiliary problems in ASO (Alternating Structure Optimization) algorithm”, Proc. of the 2011 International Conference on Future Computer Science and Education, Xi’an, China, pp.608–614, 2011.
[13] T. Zhang, X. Wang and H. Tong, “Domain adaptation in Alternating structure optimization (ASO) algorithm”, Proc. of the 11th IASTED International Conference on Artificial Intelligence and Applications, Innsbruck, Austria, pp.50–55, 2011.
[14] T. Zhang, X. Wang, H. Tong and Y. Zhong, “Auxiliary problems selection in semantic role labeling”, Journal of Computational Information Systems, Vol.8, No.2, pp.549–561, 2012.

ZHANG Taozheng was born in Taiyuan, Shanxi Province in 1982. She received B.S. degree in Electronic Information Engineering from Taiyuan University of Technology in 2005 and Ph.D. degree in Signal and Information Processing from Beijing University of Posts and Telecommunications in 2012 respectively. Her research interests include machine learning and natural language processing. (Email: [email protected])

WANG Xiaojie was born in Nanchang, Jiangxi Province in 1969. He received Ph.D. degree from Beijing University of Aeronautics and Astronautics in 1996. He is currently a professor in Center for Intelligence Science and Technology at Beijing University of Posts and Telecommunications. His research interests include natural language processing and multimodal cognitive computing. (Email: [email protected])

TONG Hui was born in Shandong Province in 1978. He graduated and received the doctorate of science from Tsinghua University in 2006. He is currently a teacher of Beijing University of Posts and Telecommunications. His research interests include matrix theory and parallel computation. (Email: [email protected])

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

A Non Local Feature-Preserving Strategy for Image Denoising∗

HE Ning1 and LU Ke2

(1.School of Information, Beijing Union University, Beijing 100101, China) (2.College of Computing and Communication Engineering, Graduate University of Chinese Academy of Sciences, Beijing 100049, China)

Abstract — In this paper, we propose a variational image denoising model that exploits an adaptive feature-preserving strategy derived from the Non-local means (NL-means) denoising approach. The commonly used NL-means filter is not optimal for noisy images containing small features of interest, since image noise makes it difficult to estimate the correct coefficients for averaging, leading to over-smoothing and other artifacts. We address this problem with a non-local detail preserving constraint, which is implemented by adding two terms to the Total variation (TV) model. One is a non local patch based regularization term that controls the amount of denoising in order to preserve textures, small details, or global information; the other is a new data fidelity term, which forces the gradients of the desired image to be close to the smoothed normal. The Euler-Lagrange equation is used to solve the problem. Experimental results show that the proposed method can alleviate the over-smoothing effect and other artifacts, while preserving the fine details.

Key words — Feature-preserving, Image denoising, Non-local means, Regularization.

I. Introduction

Image denoising is a classical and important problem in many computer vision tasks. Usually a better denoising model is able to preserve important image features, such as edges and texture. However, the traditional image denoising methods such as median filtering, Gaussian filtering etc., mainly remove the high frequency component of the image, so the details and the texture regions of the image are often blurred. Previous methods for addressing this problem mainly come from variation models and Partial differential equations (PDEs)[1], such as the Perona Malik (PM) diffusion model and the bounded variation model[2] etc. At the same time, image denoising methods based on wavelet transformation and curvelets[3] are very popular because they are useful for recovering regression curves and surfaces from noisy data with jumps and some other local features preserved. However, these models are mostly based on the information around the pixels, and the energy minimization problem is usually solved through the diffusion equation. Thus to some extent, the details and texture of the image will be discarded, and the image edges will be ambiguous.

The recently proposed Non-local means (NL-means)[4] algorithm has offered considerably promising results. Unlike previous denoising methods that rely on the local regularity assumption, NL-means exploits spatial correlation in the entire image for noise removal. It replaces each pixel value with a weighted average of other pixels that have a similar neighborhood configuration. Since image pixels are highly correlated with each other while noise is typically independently and identically distributed, averaging these pixels removes noise and produces a pixel similar to its original value. In this paper, we propose a method that uses NL-means regularization in image denoising. The NL-means regularization constraint has the potential to suppress noise while preserving fine details. In addition, we introduce a new data fidelity term to preserve the textures and alleviate other artifacts.

II. Related Works

A good denoising method must be able to reduce as much noise as possible while preserving the original image information. The NL-means model[4] restores every pixel in the image by a weighted average of surrounding pixels using a robust similarity measure. It not only considers the neighborhood of the current pixel, but also uses all the pixels with a similar Gaussian neighborhood to improve the denoising capability. This allows it to maintain the image details well. Its basic formula is as follows:

$$NL(u)(x) = \frac{1}{C(x)}\int_{\Omega} e^{-\frac{(G_{\sigma} * |u(x+.) - u(y+.)|^{2})(0)}{h^{2}}}\, u(y)\, dy \tag{1}$$

where $\Omega$ denotes the image grid, $x \in \Omega$,

$$C(x) = \int_\Omega e^{-\frac{(G_\sigma * |u(x+\cdot) - u(z+\cdot)|^2)(0)}{h^2}}\, dz$$

is a normalizing constant, $G_\sigma$ is a Gaussian kernel, and $h$ acts as a filtering parameter, which controls the decay of the exponential function and therefore the decay of the weights as a function of the Euclidean distances.

∗Manuscript Received Sept. 2011; Accepted Oct. 2011. This work is supported by the National Natural Science Foundation of China (No.61070120, No.61103130, No.60982145), Beijing Natural Science Foundation (No.4112021), Beijing Educational Commission Science Foundation (No.KM201111417015), the Opening Project of Shanghai Key Laboratory of Integrate Administration Technologies for Information Security (No.AGK2010005), the National Basic Research Program of China (973 Programs) (No.2010CB731804-1, No.2011CB706901-4).

Although the NL-means model can be considered as a vector extension of the bilateral filter weighting function that omits the geometric distance factor, it has some properties which may be undesirable[5], such as a heavy computational burden and blurred edges. A common effect of algorithms using correlation and window comparison is the adhesion artifact: if a pixel inside a flat zone is near an edge, then the window distances are dominated by the edge. The weight configuration provided by NL-means at this pixel is concentrated along the direction of the edge. This effect is visible in Fig.1 as a shadow surrounding the boundary of the hat. Fig.1 also displays an example where NL-means has excessively filtered a textured zone.

Fig. 1. Denoising result by NL-means. (a) noisy image; (b) the result by NL-means; (c)-(d) a local zone, noisy image and denoised image respectively

Currently, there are several approaches which can partially overcome these drawbacks. In Ref.[6], an additional gradient fidelity term is used to cope with the problem from higher order PDEs, and the blurring effect produced by some second order PDEs is alleviated. However, the influence of the noise on the gradient estimation is not taken into account. In Ref.[7], the authors add a new gradient fidelity term to the TV model and to some second-order nonlinear diffusion PDEs for alleviating the staircasing effect, and the classical Gaussian filtering technique is used to smooth the image prior to the gradient estimation. However, Gaussian filtering smooths uniformly in all directions of the image, and fine details are easily destroyed. Hence, the gradient of the smoothed image is unreliable near the edges, and the gradient fidelity cannot preserve the gradient and edges well. In Refs.[8–10], the NL-means filter and its recently improved versions, the bilateral filters, are also based on feature preservation. All of these methods attempt to identify features to be preserved while eliminating the noise by averaging. In this paper, we address this problem by a non-local detail-preserving constraint, which is implemented by adding a non-local patch-based regularization term and a data fidelity term.

III. Non Local Feature-Preserving Algorithm

1. The proposed model

Let $u_0, u : \Omega \to R$ be image intensity functions. The standard linear degradation model is:

$$u_0(x) = (u * k)(x) + n(x), \quad x \in \Omega \tag{2}$$

where $k$ is a (known) space-invariant blurring operator, $n$ is additive random noise independent of $u$, $\Omega \subset R^2$ is the image domain, $u_0$ denotes the observed noisy image data, and $u$ is the ideal image we want to recover. The goal of image denoising is to recover $u$ from the observed image data $u_0$. We address the denoising problem within the TV framework, where it can be written as:

$$\min_{u \in BV(\Omega)} \{\Phi(u_0 - k * u) + \Psi(|\nabla u|)\} \tag{3}$$

where $\Phi$ defines a data fidelity term, and $\Psi$ defines the regularization that enforces a smoothness constraint on $u$, depending on its gradient $\nabla u$. $BV(\Omega)$ is the space of functions of $L^1(\Omega)$ such that the TV norm $TV(u) = \int_\Omega |\nabla u|\,dxdy < \infty$. In fact, the ideal image is unknown and only its estimation from the noisy image $u_0$ is obtainable. Usually, the similarity in image intensity is given by a least square term $\int_\Omega (u - u_0)^2\,dxdy = \sigma^2$, and the gradient $\nabla u$ is estimated from $\nabla u_0$. However, gradient estimation from the noisy image is an ill-posed problem. So, we introduce a regularization term based on non-local patches.

Non local regularizers. In the variational framework, Kindermann et al.[11] formulated the neighborhood filters and NL-means filters as non-local regularizing functionals. However, these functionals are generally not convex. For this reason, Gilboa and Osher[12] formalized a convex non-local functional inspired by graph theory and, based on the gradient and divergence definitions on graphs in the context of machine learning, derived the corresponding non-local operators. Inspired by the Gilboa-Osher method, we propose a regularization term that maintains the sharp transitions at object boundaries. Let $u \in BV$. Using the weighted gradient operator, the task consists in seeking a function $\Psi$ which is not only smooth enough on the weighted graph, but also close enough to $u_0$. This optimization problem can be formalized by the minimization of a weighted sum of two energy terms:

$$\Psi(u)(x) = \int_\Omega w(x,y)\,\phi(|\nabla_w u|^2)\,dy + \int_\Omega \phi'(|u(x) - u(y)|)\,\frac{u(x) - u(y)}{|u(x) - u(y)|}\,w(x,y)\,dy \tag{4}$$

where $u : \Omega \to R$ is a function, and $w : \Omega \times \Omega \to R$ is a nonnegative and symmetric weight function. The weighted gradient vector $\nabla_w u : \Omega \times \Omega \to R$ is defined as $(\nabla_w u)(x,y) := (u(y) - u(x))\sqrt{w(x,y)}$. The norm of the non-local gradient of $u$ at $x \in \Omega$ is given by $|\nabla_w u|(x) = \sqrt{\int_\Omega (u(y) - u(x))^2\, w(x,y)\,dy}$. The regularization $\Psi$ brings the priors about the local features of the restored function. Its main role is on the regions which are either outliers or are erroneously set to zero. The images we wish to restore are supposed to involve smooth regions and edges. Our attention is restricted to convex objective functions, which means that $\phi'(0^+) = 0$. We choose $\phi(s) = \sqrt{s^2 + a}$ ($a > 0$ is a parameter). The original weights of the NL-means, $w(x,y) = \frac{1}{C(x)} e^{-\frac{\|U(x) - U(y)\|^2}{2h^2}}$, only depend on the patch value $U(x)$ and not on the pixel position $x$ itself. In order to use weights with compact support, we set the weights as $w(U(x), U(y)) = \frac{1}{C(U(x))} e^{-\frac{\|U(x) - U(y)\|^2}{2h^2}}$.

Eq.(4) states that the solution of NL-means can be computed directly in the patch space provided that each patch is given. Moreover, the NL-means compares not only the grey level at a single point but also the geometrical configuration in a patch.

Data Fidelity Term. Now we present the gradient and intensity fidelity terms according to the idea in Ref.[13]. A noisy image can be decomposed into three parts: structures ($u$), texture ($v$) and noise ($\eta$). From the image decomposition point of view, $u_0 = u + v + \eta$, and the functional can be written as:

$$\inf_{(u,v) \in BV(\Omega) \times H^{-1}(\Omega)} \{|u|_{BV(\Omega)} + \mu \|u_0 - u - v\|^2_{L^2(\Omega)} + \lambda \|v\|_{H^{-1}(\Omega)}\} \tag{5}$$

In order to adapt the Rudin-Osher-Fatemi (ROF) model to capture the textures in the $v$ component, Meyer[14] proposes to replace the $L^2$ space by another space, called $G$, which is a space of oscillating functions. The space $G$ is endowed with the following norm:

$$\|v\|_G = \inf_g \left\| \left(|g_1|^2 + |g_2|^2\right)^{1/2} \right\|_{L^\infty} \tag{6}$$

where $g = (g_1, g_2) \in R^2 \times R^2$ and $v = \operatorname{div} g$, $\|v\|_{H^{-1}} = \int |\nabla(\Delta^{-1})v|^2\,dxdy$. This method decomposes the image into structure, texture, and noise. When denoising a noisy image, the result $u + v$ is needed, so that the texture is preserved.

To increase the performance, we generalize the data fidelity term $\Phi$ by imposing a local adaptability behavior on the algorithm, following an idea proposed by Gilboa et al.[12]. In a cartoon-type region, the algorithm enhances the denoising process by increasing the value of $\lambda$. So we define the following data fidelity term:

$$\Phi = \int_\Omega \left(\varphi(|\nabla u - n|) + \lambda |u_0 - u - v|^2\right) dxdy \tag{7}$$

where $\varphi : R \to R$ is a continuous function which decreases on the interval $(-\infty, 0]$ and increases on the interval $[0, +\infty)$. One usual choice is $\varphi(t) = |t|^\rho$, for $\rho > 0$. The term $\int_\Omega \varphi(|\nabla u - n|)\,dxdy$ is designed to force the gradient of $u$ to be close to the smoothed gradient estimation $n$, and thus the over-smoothing of texture can be avoided. The second term is the fitting term; $\int_\Omega |u_0 - u - v|^2\,dxdy$ is not strictly convex in the direction $(\eta, -\eta)$. If $\varphi$ is strictly convex, then the functional is strictly convex and we obtain uniqueness of minimizers. Our calculations involve the associated Euler-Lagrange equations. These are not well defined for BV functions. However, we assume that we work with functions $u \in W^{1,1}(\Omega)$, a dense subset of $BV(\Omega)$ for which these Euler-Lagrange equations are well defined. We also assume that the function $\varphi$ is differentiable everywhere (for instance $\varphi(x) = \sqrt{\varepsilon^2 + |x|^2}$, for a small parameter $\varepsilon > 0$). Here $v = \Delta g$, where $g \in B^{\alpha}_{p,q}(\Omega)$, $0 < \alpha < 2$, $1 \le p, q \le \infty$, while keeping $u \in BV(\Omega)$. We have $u_0 \in L^2(\Omega)$, $u \in BV(\Omega) \subset L^2(\Omega)$, and $u_0 - u - v = u_0 - u - \Delta g \in L^2(\Omega)$. Thus $v = \Delta g \in L^2(\Omega) \cap B^{\alpha-2}_{p,q}(\Omega)$, so the texture component $v$ still belongs to $L^2(\Omega)$. We therefore compute the Euler-Lagrange equation of Eq.(7) as follows:

$$-2\lambda(u_0 - u - \Delta g) - \operatorname{div}\left(\varphi'(\Delta u - \nabla \cdot n)\,\frac{\nabla u}{|\nabla u|}\right) = 0 \tag{8}$$

In $\Omega$, Eq.(8) is solved using standard finite differences with a semi-implicit scheme.

Thus, by incorporating the proper regularizer term depending on the noise model, we design the following total energy:

$$E(u,v) = \Psi(u) + \Phi(u,v) = \frac{1}{C(x)} \int_\Omega w(x,y)\,\phi(|\nabla_w u|^2)\,dy + \int_\Omega \phi'(|u(x) - u(y)|)\,\frac{u(x) - u(y)}{|u(x) - u(y)|}\,dy + \int_\Omega \left(\varphi(|\nabla u - n|) + \lambda |u_0 - u - v|^2\right) dxdy \tag{9}$$

Minimizing these functionals in $u$ and $v$, we obtain the Euler-Lagrange equation

$$2\int_\Omega (u(y) - u(x))\,w(x,y) \cdot \left[v(y)\,\phi'(|\nabla_w u|^2)(y) + v(x)\,\phi'(|\nabla_w u|^2)(x)\right] dy + \operatorname{div}\left(\varphi'(\cdot)\,\frac{\nabla u}{|\nabla u|}\right) + 2\lambda(u_0 - u - \Delta g) = 0, \qquad \left.\frac{\partial u}{\partial n}\right|_{\partial\Omega} = 0 \tag{10}$$

where $n$ is the outward unit normal vector on the boundary $\partial\Omega$, and $n = \nabla u / |\nabla u|$.

2. Basic properties of our model

By the calculus of variations, the properties of the proposed model can be stated in the following Propositions.

Proposition 1 The operator $\Psi(u)$ as defined in Eq.(4) with strictly convex $\phi$ admits the following properties:
(a) $u \equiv \text{const}$ iff $\Psi(u) \equiv 0$.
(b) $\Psi(u)$ is a positive semidefinite operator, that is $\langle \Psi(u), u \rangle \ge 0$.

Proof Sufficiency of Property (a) holds since for any constant $u$ we have $s = |u(x) - u(y)| \equiv 0$. We get $\Psi(u \equiv \text{const}) = \Psi(0) = 0$.
For the necessity, it is easy to see that for a given point $x$, we have $\Psi(u)(x) = 0$ if either $u(x) = u(y)$ or $w(x,y) = 0$ holds for any $y \in \Omega$. Using the connectivity condition, we reach the conclusion that $u(x) = u(y)$ for any given $x, y$.
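As a rough numerical illustration of the non-local quantities used above, the following Python sketch builds patch-based weights of the form $e^{-\|U(x)-U(y)\|^2/2h^2}$ for a one-dimensional signal and evaluates the non-local gradient norm $|\nabla_w u|(x)$. It is only a sketch under stated assumptions: the function names, the clamped boundary handling, the patch half-width and the value of $h$ are illustrative choices that do not come from the paper, and the normalising constant $C(x)$ is omitted so that the weights stay symmetric. In line with property (a), a constant signal gives a zero non-local gradient everywhere.

```python
import math


def patch_weights(u, half_patch=1, h=0.5):
    """Symmetric patch-similarity weights w[i][j] for a 1-D signal u.

    w[i][j] = exp(-||U(i) - U(j)||^2 / (2 h^2)), where U(i) is the patch
    of samples around i. Indices are clamped at the boundaries (an
    assumption of this sketch); the normalising constant is omitted.
    """
    n = len(u)
    patches = [
        [u[min(max(i + d, 0), n - 1)] for d in range(-half_patch, half_patch + 1)]
        for i in range(n)
    ]
    weights = []
    for i in range(n):
        row = []
        for j in range(n):
            dist2 = sum((a - b) ** 2 for a, b in zip(patches[i], patches[j]))
            row.append(math.exp(-dist2 / (2.0 * h * h)))
        weights.append(row)
    return weights


def nonlocal_gradient_norm(u, w):
    """|grad_w u|(i) = sqrt(sum_j (u[j] - u[i])^2 * w[i][j])."""
    n = len(u)
    return [
        math.sqrt(sum((u[j] - u[i]) ** 2 * w[i][j] for j in range(n)))
        for i in range(n)
    ]


flat = [2.0] * 6                        # constant signal
step = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]   # single jump in the middle
print(nonlocal_gradient_norm(flat, patch_weights(flat)))  # all zeros
print(nonlocal_gradient_norm(step, patch_weights(step)))  # largest next to the jump
```

Dropping the normalisation is a deliberate simplification here: the text requires $w$ to be symmetric, and a row-wise normalised weight matrix generally is not.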

Property (b) can be validated (relying on the symmetry $w(x,y) = w(y,x)$ and $\frac{\phi'(s)}{s} \ge 0$) by:

$$\begin{aligned}
\langle \Psi(u), u \rangle &= \int_{\Omega \times \Omega} \phi'(|u(x) - u(y)|)\,\frac{u(x) - u(y)}{|u(x) - u(y)|}\,w(x,y)\,u(x)\,dy\,dx \\
&= \frac{1}{2} \int_{\Omega \times \Omega} \left[ (u(x) - u(y))\,u(x)\,\frac{\phi'(|u(x) - u(y)|)}{|u(x) - u(y)|}\,w(x,y) + (u(y) - u(x))\,u(y)\,\frac{\phi'(|u(y) - u(x)|)}{|u(y) - u(x)|}\,w(y,x) \right] dy\,dx \\
&= \frac{1}{2} \int_{\Omega \times \Omega} (u(x) - u(y))^2\,\frac{\phi'(|u(x) - u(y)|)}{|u(x) - u(y)|}\,w(x,y)\,dy\,dx \\
&\ge 0
\end{aligned} \tag{11}$$

Proposition 1 holds.

From Proposition 1, the convexity of the energy functional can guarantee a global optimization and a unique minimum.

Proposition 2 The cost function Eq.(7) has the following properties.
(a) The model of scale 0 is the input image.
(b) The model of scale $\infty$ is the mean of the input image, $C_\infty = \int u_0(x,y)\,dxdy$.

Proof (a) The property is proved by contradiction. Denote $M = \int_\Omega \varphi(|\nabla u_0 - n|)\,dxdy$. Let us assume $u \ne u_0$ in the $L^2$ sense for any large $\lambda$. Specifically, there exists $n \in L^2$ such that $\|n\|_2 = \varepsilon > 0$ and $\Phi(u) < \Phi(u_0)$ for $\lambda > \frac{M}{\varepsilon^2}$, where $u + n = u_0$. Then

$$\Phi(u_0) - \Phi(u) = \int_\Omega \left[\varphi(|\nabla u_0 - n|) - \varphi(|\nabla u - n|)\right] dxdy - \lambda \|n\|_2^2$$

IV. Experimental Results

In this section, experimental results are presented to validate the proposed model. Comparisons with the NL-means model proposed by Buades et al.[4] and the TV model proposed by X. Bresson[15] are also performed.

1. One-dimensional signal

Firstly, we apply Eq.(9) to a one-dimensional signal composed of constant, ramp and smooth intensities. Figs.2(a) and 2(b) show the original signal and the noisy signal respectively, and Figs.2(c) and 2(d) show the results of the TV model and the proposed model respectively. We can see from the figure that staircasing arises in the smooth regions in Fig.2(c), while the result of the proposed method has fewer locally constant components. Furthermore, the proposed method makes a good compromise between edge preservation and smoothing.

2. Two-dimensional images

In this section, we use 'Barb' and 'Boat' as the test images; both images have abundant textures and edges.

Gaussian noise is added to the images with standard deviation varying from 20 to 40. Fig.3 shows a set of results. Figs.3(a) and 3(b) show the original and noisy images respectively, and the results obtained by the TV model, the NL-means method and the proposed model are shown in Figs.3(c), 3(d) and 3(e) respectively. For the 'Barb' images, we can see that the image restored by the proposed model looks more natural and has no piecewise constant patches, which appear at the cheeks and nose in the result of the TV model shown in Fig.3(c). The image restored by the NL-means method is not sharper than ours on the edges of the Barb image, even though it avoids over-smoothing the features. The same effects are also visible in the results for the 'Boat' image. 'Boat' is a kind of image which contains many linear edges and curve singularities. As expected, our method benefits from the detail-preserving scheme for efficiently capturing the geometrical content of the image. All the results indicate that the regularization term and the data fidelity term are effective, and play important roles in preserving the textures and edges.

Fig. 3. Results for 'Barb' (up) and 'Boat' (low). (a) the original images; (b) the noisy images; (c) results by 'TV' model; (d) results by 'NL-means' model; (e) results by the proposed model

For a quantitative comparison, we use SNR (signal-to-noise ratio) and MSSIM[16] (Mean structure similarity index) to evaluate the quality of the images restored by the different algorithms under different noise levels. SNR is given by the formula:

$$SNR = \frac{\int_\Omega (u - \bar{u})^2\,dx}{\int_\Omega (\eta - \bar{\eta})^2\,dx} \tag{13}$$

where $u + \eta = u_0$, $\bar{u} = \frac{1}{|\Omega|} \int_\Omega u\,dx$ is the mean of the original image $u$, and $\eta$ and $\bar{\eta}$ are the noise and its mean. MSSIM is a new method for measuring the similarity between two images. The higher the MSSIM score is, the more similar the two images are.

For comparison, all results are the images whose MSSIM values reach their maximum during the iterative process. The quantitative comparison is shown in Table 1 and Table 2 for 'Boat' and 'Barb' respectively. From the tables, we can see that both the SNR improvement and the MSSIM improvement of our model are larger than those of the other two models. Unlike Gaussian kernel convolution, our proposed algorithm takes advantage of the data fidelity term and the non-local regularizer, and therefore produces no edge blurring. The MSSIM improvement shows that our algorithm has a good perceptual quality. In particular, the improvement becomes more salient as the noise level increases. Fig.4 and Fig.5 show the MSSIM improvement as a function of sigma for the TV model, the NL-means model, and our proposed model. These figures show that our algorithm reaches the best restored image with maximum MSSIM more quickly, and obtains the best performance compared with the other algorithms.

Fig. 4. Comparison of the MSSIM results of the different images corrupted by Gaussian noise of standard deviation 20: (a) "Boat"; (b) "Barb"

Fig. 5. Comparison of the MSSIM results of the different images corrupted by Gaussian noise of standard deviation 40: (a) "Boat"; (b) "Barb"

3. Texture images denoising

We test the proposed method on a textural image corrupted by additive Gaussian noise ($\sigma = 30$). Figure 6 illustrates that the proposed method reconstructs the original texture, and shows a decomposition of the noisy image using the model (5) with $B$ equal to the square of 3 pixel-length centered at 0, $p = 1$, and $\lambda = 2$.

Table 1. The SNR/MSSIM of the restored "Boat" images by using three models
Noise standard deviation | Noisy image | "TV" model | "NL-means" model | Proposed model
20 | 7.57/0.33 | 16.66/0.84 | 15.67/0.81 | 17.25/0.87
25 | 5.61/0.24 | 15.66/0.79 | 14.73/0.76 | 16.35/0.84
30 | 4.02/0.20 | 14.79/0.78 | 13.77/0.69 | 15.55/0.83
35 | 2.70/0.17 | 13.90/0.74 | 12.86/0.65 | 14.90/0.82
40 | 1.53/0.14 | 12.93/0.70 | 12.00/0.60 | 14.34/0.80
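The SNR of Eq.(13) can be sketched in a few lines of Python. This is a minimal illustration, assuming images are given as flat lists of intensities; the function name `snr_ratio` is hypothetical, and the plain ratio of Eq.(13) is returned as written, without any conversion to decibels.

```python
def snr_ratio(u, u0):
    """Ratio of Eq.(13): sum (u - mean(u))^2 / sum (eta - mean(eta))^2.

    u  : original image, as a flat list of intensities
    u0 : observed image of the same length, so that eta = u0 - u
    """
    if len(u) != len(u0) or not u:
        raise ValueError("u and u0 must be non-empty and of equal length")
    eta = [b - a for a, b in zip(u, u0)]
    u_mean = sum(u) / len(u)
    eta_mean = sum(eta) / len(eta)
    signal = sum((a - u_mean) ** 2 for a in u)
    noise = sum((e - eta_mean) ** 2 for e in eta)
    if noise == 0:
        raise ValueError("noise has zero variance; the ratio is undefined")
    return signal / noise


# Tiny worked example: signal energy 4, noise energy 2.
print(snr_ratio([0.0, 2.0, 0.0, 2.0], [1.0, 2.0, -1.0, 2.0]))  # 2.0
```

To score a restored image rather than the noisy input, the same function can be called with the restored image in place of `u0`, so that `eta` becomes the residual error.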

Table 2. The SNR/MSSIM of the restored "Barb" images by using three models
Noise standard deviation | Noisy image | "TV" model | "NL-means" model | Proposed model
20 | 8.73/0.48 | 12.76/0.77 | 11.39/0.72 | 13.15/0.81
25 | 6.79/0.40 | 11.83/0.72 | 11.05/0.68 | 12.36/0.78
30 | 5.21/0.34 | 11.17/0.69 | 10.69/0.64 | 11.64/0.75
35 | 3.91/0.30 | 10.63/0.65 | 10.32/0.60 | 11.19/0.73
40 | 2.71/0.26 | 10.12/0.61 | 9.93/0.56 | 10.81/0.70

Fig. 6. The proposed method with a textural image. (a) original image; (b) noisy image; (c) denoised by proposed method; (d) texture of the image; (e) structure of the image; (f) difference image

V. Conclusion

In this paper, an NL-means total variation model is proposed for discontinuity-preserving denoising. We present a non-local patch-based regularization term that controls the amount of denoising to preserve textures, small details, or global information, and a new data fidelity term, which is designed to force the gradients of the desired image to be close to the smooth approximation gradients and thereby improve the preservation of edge and texture details. We carry out several numerical experiments to compare the performance of several algorithms. The SNR and MSSIM improvements demonstrate that our proposed method performs better than both the TV algorithm and the original NL-means algorithm.

References

[1] G. Gilboa, N. Sochen, Y.Y. Zeevi, "PDE-based denoising of complex scenes using a spatially-varying fidelity term", Proc. ICIP, Barcelona, Spain, Vol.1, pp.865–868, 2003.
[2] S. Osher, M. Burger, D. Goldfarb et al., "An iterative regularization method for total variation based image restoration", Multiscale Modelling and Simulation, Vol.4, No.2, pp.460–489, 2005.
[3] E. Candès, L. Demanet, D. Donoho, L. Ying, "Fast discrete curvelet transforms", Multiscale Modeling and Simulation, Vol.5, No.3, pp.861–899, 2006.
[4] A. Buades, B. Coll, J.M. Morel, "A review of image denoising algorithms, with a new one", Multiscale Modeling and Simulation, Vol.4, No.2, pp.490–530, 2005.
[5] T. Brox, O. Kleinschmidt, D. Cremers, "Efficient nonlocal means for denoising of textural patterns", IEEE Transactions on Image Processing, Vol.17, No.7, pp.1083–1092, 2008.
[6] V. Dore, M. Cheriet, "Robust NL-means filter with optimal pixel-wise smoothing parameter for statistical image denoising", IEEE Trans. on Signal Processing, Vol.57, No.5, pp.1703–1716, 2009.
[7] J. Boulanger, C. Kervrann, P. Bouthemy, P. Elbau, J.B. Sibarita, J. Salamero, "Patch-based nonlocal functional for denoising fluorescence microscopy image sequences", IEEE Transactions on Medical Imaging, Vol.29, No.2, pp.442–454, 2010.
[8] L. Xiao, Lili Huang, Badrinath Roysam, "Image variational denoising using gradient fidelity on curvelet shrinkage", EURASIP Journal on Advances in Signal Processing, Vol.2010, pp.1–17, 2010.
[9] Lei Yang, Richard Parton, Graeme Ball, Zhen Qiu, Alan H. Greenaway, Ilan Davis, Weiping Lu, "An adaptive non-local means filter for denoising live-cell images and improving particle detection", Journal of Structural Biology, Vol.172, No.3, pp.233–243, 2010.
[10] Wen Qiang Feng, Shu Min Li, Ke Long Zheng, "A non-local bilateral filter for image denoising", 2010 International Conference on Apperceiving Computing and Intelligence Analysis (ICACIA), 17-19 Dec., pp.253–257, 2010.
[11] S. Kindermann, S. Osher, P.W. Jones, "Deblurring and denoising of images by nonlocal functionals", SIAM MMS, Vol.4, No.4, pp.1091–1115, 2005.
[12] G. Gilboa, S. Osher, "Nonlocal operators with applications to image processing", SIAM MMS, Vol.7, No.3, pp.1005–1028, 2008.
[13] J. Gilles, Y. Meyer, "Properties of BV-G structures + textures decomposition models. Application to road detection in satellite images", IEEE Trans. Image Processing, Vol.19, No.11, pp.2793–2800, 2010.
[14] Y. Meyer, "Oscillating patterns in image processing and in some nonlinear evolution equations", The Fifteenth Dean Jacqueline B. Lewis Memorial Lectures, American Mathematical Society, 2001.
[15] X. Zhang, M. Burger, X. Bresson, S. Osher, "Bregmanized nonlocal regularization for deconvolution and sparse reconstruction", CAM Report 09-03, 2009.
[16] M. Jung, L.A. Vese, "Nonlocal variational image deblurring models in the presence of Gaussian or impulse noise", SSVM 2009, LNCS (5567), pp.402–413, 2009.

HE Ning was born in Panjin, Liaoning Province on July 29, 1970. She graduated from the Department of Mathematics at Ningxia University in July 1993. She received the M.S. degree and Ph.D. degree in applied mathematics from Northwest University and Capital Normal University in July 2003 and July 2009, respectively. Currently she is an associate professor of Beijing Union University. Her research interests include image processing and computer graphics. (Email: xxthen-[email protected])

LU Ke was born in Ningxia Province on Mar. 13, 1971. He graduated from the Department of Mathematics at Ningxia University in July 1993. He received the M.S. degree and Ph.D. degree from the Department of Mathematics and the Department of Computer Science at Northwest University in July 1998 and July 2003, respectively. He was a Postdoctoral Fellow in the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences from July 2003 to April 2005. Currently he is a professor of the Graduate University of the Chinese Academy of Sciences. His research focuses on curve matching, 3D image reconstruction and computer graphics. (Email: [email protected])

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

A Graph-based Method to Mine Coexpression Clusters Across Multiple Datasets∗

ZAN Xiangzhen1, XIAO Biyu1, MA Runnian2, ZHANG Fengyue3 and LIU Wenbin1

(1.Department of Physics and Electronic Information Engineering, Wenzhou University, Wenzhou 325035, China) (2.Telecommunication Institute, Air Force Engineering University, Xi’an 710077, China) (3.Department of Biomedical Engineering, School of Life Science and Technology, Beijing Institute of Technology, Beijing 100081, China)

Abstract: Mining coexpression clusters across multiple datasets is a major approach for identifying transcription modules in systems biology. The main difficulty of this problem lies in the fact that these subgraphs are buried among a huge number of irrelevant connections. In this paper, we address this problem using a noise reduction strategy. It consists of three processes: (1) Coarse filtering; (2) Clustering potential subsets of graphs; (3) Refined filtering on those subsets. Using yeast as a model system, we demonstrate that most of the gene clusters derived from our method are enrichment clusters. That is, they are likely to be functionally homogeneous entities or potential transcription modules.

Key words: Coexpression networks, Frequent dense vertex sets, Summary graph, Clustering.

I. Introduction

A key goal in systems biology is to characterize the underlying molecular mechanisms governing specific cellular behaviors and processes. Microarray technology has revolutionized the study of gene expression because of its capability of measuring the activities of thousands of genes simultaneously. Transcription modules, which form the building blocks of genetic regulatory networks, are groups of genes regulated by the same transcription factor(s). The main approach to reconstructing transcription modules is to identify coexpression clusters, which are assumed likely to be controlled by the same transcription factors. However, this may result in a number of subtle and false modules because (1) cellular processes are complex dynamical systems; (2) the transcription of genes in different modules may be perturbed simultaneously under a given condition; (3) the noisy nature of high throughput technologies leads to a significant number of false positives or false negatives.

In order to overcome the three aforementioned problems, Zhou first systematically studied the extraction of co-expression clusters across multiple datasets. In her seminal paper, she developed the concept of 2nd-order expression analysis, where 1st-order expression analysis refers to the extraction of co-expression patterns from each microarray data set, and 2nd-order expression analysis refers to the analysis of their correlated occurrence across multiple data sets[1]. Since then, Yan et al. proposed a pattern-growth approach, CloseCut, and a pattern-reduction approach, SPLAT. Both aimed at identifying the exact recurrence of dense patterns in different graphs[2]. Hu et al. developed a platform, named CODENSE, to mine coherent dense subgraphs across multiple networks[3]. CODENSE utilizes a summary graph to integrate multiple data sets. However, such aggregation of multiple networks may produce a tremendous number of false dense patterns. As the number of graphs increases, their summary graph may eventually saturate as a clique. Later, Yan et al. proposed to partition the input graphs into some potential subgroups according to their topological similarity[4]. Chen et al. recommended first clustering edges with highly correlated recurrence, and then mining frequent dense subgraphs in each edge cluster. As coexpression graphs often contain hundreds of thousands of edges, they adopted min-hashing and locality-sensitive hashing techniques to obtain possible candidate clusters[5].

As the connection of frequent dense patterns may differ from network to network, Yan et al. relaxed the requirement of coherence and named these patterns across multiple graphs Frequent dense vertex sets (FDVSs). The main difficulty of this problem lies in the fact that the FDVSs are generally surrounded by a huge number of irrelevant noise edges. In this paper, we solve the FDVSs problem by iteratively removing noise edges so that the subgraphs of the FDVSs gradually emerge.

The outline of this paper is as follows. In Section II, we introduce the problem formulation and the definitions needed. In Section III, we propose two key algorithms to eliminate irrelevant noise edges and then present the main framework for mining FDVSs. In Section IV, we carry out an enrichment analysis of the FDVSs obtained from microarray datasets of Saccharomyces cerevisiae and give an example of functional annotation. Finally, we conclude in Section V.

∗Manuscript Received Sept. 2011; Accepted Feb. 2012. This work is supported by the National Natural Science Foundation of China (No.60970065, No.30970666, No.61174162), Zhejiang Natural Science Foundation (No.R1110261, No.Y1080227), and Graduate Innovation Fund of Wenzhou University (No.3160601010951).

II. Problem Formulation

A microarray dataset is usually modeled as a simple unweighted and undirected graph, where each gene is represented by one node and two genes are connected with an edge if their expression profiles show high and significant correlation. A frequent densely connected subgraph across many graphs may correspond to a possible tight coexpression cluster. The FDVSs problem can be formulated as: given $m$ graphs with $n$ common nodes, search for all FDVSs which are densely connected in at least $k$ graphs ($0 < k \le m$). [...]

$$EC_{uv} = \frac{z_{u,v}}{\max(d_u, d_v)} \tag{3}$$

where $d_u$ and $d_v$ represent the degrees of vertices $u$ and $v$ respectively, and $z_{u,v}$ is the number of triangles passing through $uv$.

Definition 4 (Edge induced subgraph). Given a graph $G(V,E)$ and an edge $uv \in E$ ($u, v \in V$), the subgraph induced by $uv$ is defined as $g_{uv}(V', E')$, where $V' \subset V$ is the vertex set connected with $u$ or $v$, and $E' \subset E$ is the edge set between vertices in $V'$.

Definition 5 (Edge density coefficient). Given a graph $G = (V,E)$, the edge density coefficient of $uv \in E$ ($u, v \in V$) is defined as

$$ED_{uv} = \sigma(g_{uv})\,\frac{\sqrt{(d_u - 1)(d_v - 1)}}{(d_u + d_v - 2)/2}$$

where $d_u$ and $d_v$ represent the degrees of vertices $u$ and $v$ respectively, and $g_{uv}$ is the subgraph induced by edge $uv$.

Graph partitioning is a common tool in complex network analysis. Girvan and Newman proposed a global measure, edge betweenness, which counts the number of shortest paths running through an edge[7]. Obviously, edges bridging dense communities tend to have a higher edge betweenness score than those inside communities. However, its computational complexity is $O(|E||V|)$, which hinders its application to large scale networks. Radicchi introduced the edge clustering coefficient, which counts the number of triangles containing a given edge[7]. Recently, we proposed the edge density coefficient to measure the local density around an edge; it is anti-correlated with the edge betweenness. This means an edge must be sparsely connected with other edges if this measure is very low[8].

Definition 6 (Frequent dense vertex set). Given a graph set $D = \{G_i = (V, E_i)\}$, a set of vertices $V' \subset V$ is a frequent dense vertex set if the density of its induced subgraphs $\delta(g_i(V'))$ is larger than or equal to $\delta$ in at least $k$ graphs.

Definition 7 (Summary graph). Given a graph set $D = \{G_i = (V, E_i)\}$, its summary graph is a graph $S(V, \hat{E})$ where the average edge density coefficient of each edge $e \in \hat{E}$ is larger than a user-defined threshold.

The summary graph was originally used to conserve all the relevant edges and eliminate some irrelevant ones at the same time. In this paper, we use the edge density coefficient to build the summary graph. It is more suitable than the frequency of edges used in previous papers[3−5], and can be applied to efficiently filter out irrelevant edges. As the deletion of [...] the support vectors of edges within the same FDVS may be very different. For example, the Hamming distance between support vectors 1111100000 and 1111101011 is 3. However, the most important information is that they both appear in $G_1, \cdots, G_5$. In order to capture this feature, we define the similarity of two support vectors as

$$s(v_e, v_{e'}) = \frac{\langle v_e, v_{e'} \rangle}{\min(h(v_e), h(v_{e'}))}$$

where $\langle \cdot, \cdot \rangle$ and $h(\cdot)$ denote the inner product of two vectors and the Hamming weight of a vector respectively.

III. Mining Frequent Dense Vertex Sets

1. Motivation

The difficulty of mining FDVSs mainly comes from the fact that they are buried among a huge number of irrelevant edges. Here we define the irrelevant edges as those making no contribution to the formation of FDVSs. If we can effectively filter out most of those irrelevant or noise edges, the FDVSs will become easy to detect. The key issue is then how to identify those noise edges. Intuitively, we have the following observations:
(1) If an edge $e \in E_i$ is sparsely connected with its neighbor edges, then it has no contribution to the FDVSs.
(2) If an edge $e \in E_i$ is densely connected only in a few graphs, then it has no contribution to the FDVSs because of the frequency requirement.
(3) If an edge $e \in \hat{E}$ bridges two densely connected subgraphs in summary graph $S$, then it has no contribution to the FDVSs.
(4) If $e \notin \hat{E} \wedge e \in E_i$ ($1 \le i \le m$), then it has no contribution to the FDVSs.

(5) If $V' \subset V$ is a FDVS in a subset of graphs $SUB(D)$, then their connections in each graph of $D - SUB(D)$ have no contribution to this FDVS.

2. Algorithms

In order to address the five kinds of noisy edges listed above, we designed two algorithms: FILTER and GCLUSTER. FILTER contains 4 steps to single out the edges listed above, and its workflow is illustrated in Fig.1.

Algorithm 1 FILTER
Input: Graph set $D = \{G_i = (V, E_i)\}$ ($1 \le i \le m$),
Minimal density threshold $\delta$,
Minimal frequent support $k$,
User defined parameters $f$, $p$, $q$;
Output: Summary graph $S(V, \hat{E})$ and the resulted graph dataset $D' = \{G_i' = (V, E_i')\}$;
1. filter out edges with $ED_e < \delta / f$ in each graph $G_i$;
2. build summary graph $S$ with edges satisfying $\sum_i ED_e \ge pk\delta$ ($1 \le i \le m$);
3. filter out edges with $EC_e$ [...]

[...] subsets containing at least one FDVS. GCLUSTER consists of two processes: (1) project all vectors to seed vectors; (2) select a seed vector to cluster.

Algorithm 2 GCLUSTER
Input: Summary graph $S(V, \hat{E})$, Graph number $m$, Minimal frequent support $k$,
Minimal Hamming distance threshold $\tau$;
Output: Cluster center set $C$;
1. assign the support vector of each edge $e \in \hat{E}$ a weight $w(v_e) = 1$, and put those with Hamming weight $k$ or $k + 1$ into set $A$ and the others into set $B$;
2. merge identical vectors in sets $A$ and $B$ by adding their weights and leaving only one vector;
3. for each edge $v_e \in B$ do
find a subset $SUB(A) \subset A$ where each vector has the maximal similarity score $s(v_e, v_{e'})$ with $v_e$;
update the weight of $v_{e'} \in SUB(A)$ by $w(v_{e'}) = w(v_{e'}) + w(v_e) * w(v_{e'}) / \sum w(v_{e'})$;
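The support-vector similarity used in step 3 of GCLUSTER can be sketched as follows. This is a minimal illustration, assuming support vectors are written as strings of '0' and '1' (one character per graph); the function names are hypothetical and do not come from the paper. It reproduces the example given earlier: the vectors 1111100000 and 1111101011 have Hamming distance 3, but similarity 1 because the shorter support is fully contained in the longer one.

```python
def hamming_weight(v):
    """Number of graphs in which the edge occurs (count of '1' entries)."""
    return v.count("1")


def hamming_distance(a, b):
    """Number of positions where two equal-length support vectors differ."""
    if len(a) != len(b):
        raise ValueError("support vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def similarity(a, b):
    """s(a, b) = <a, b> / min(h(a), h(b)), with <a, b> the inner product."""
    if len(a) != len(b):
        raise ValueError("support vectors must have the same length")
    smaller = min(hamming_weight(a), hamming_weight(b))
    if smaller == 0:
        raise ValueError("similarity is undefined for an all-zero vector")
    inner = sum(x == "1" and y == "1" for x, y in zip(a, b))
    return inner / smaller


a, b = "1111100000", "1111101011"
print(hamming_distance(a, b))  # 3
print(similarity(a, b))        # 1.0
```

Normalising by the smaller Hamming weight is what makes the measure insensitive to extra occurrences of one edge outside the shared graphs, which a plain Hamming distance would penalise.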

3. The main framework

Based on the above two algorithms, the mining process for FDVSs can be formulated as follows:
(1) Invoke FILTER to perform a coarse filtering of irrelevant edges on graph set D;
(2) Invoke GCLUSTER to find potential cluster centers C based on the resulting graph set D′ and the corresponding summary graph S;
(3) Build a graph set SUB(D) based on each cluster center c ∈ C, and invoke FILTER to perform a refined filtering of irrelevant edges;
(4) Detect dense subgraphs in the resulting summary graphs and output their vertex sets;
(5) Merge identical vertex sets and separate some large vertex sets.

Concerning the user-defined parameters in FILTER and GCLUSTER, our simulation results show that parameter f can be set from 4 to 10. The parameter p can be set from 0.1 to 0.2 for coarse filtering and from 0.8 to 0.9 for refined filtering, because the FDVSs appear in most graphs of SUB(D) in the latter case. The parameter q can be set as a constant 0.334. The Hamming distance threshold is set as τ = 2, which means the Hamming distance between any two vectors in T is no more than 2.

If the frequency of a FDVS is larger than k, then it can be extracted from more than one SUB(D). For example, {a, b, d} can be obtained from both cluster centers 1110 and 1101 in Fig.2. Therefore, there exist some identical FDVSs from different centers c ∈ C. In this paper, two FDVSs are considered identical if they differ by fewer than 2 vertices, and we merge them as one. On the other hand, a large FDVS may contain a small one. This situation can be addressed by checking their occurrence in D. If they are highly correlated then they should be merged as one; otherwise the large FDVS should be broken into two small FDVSs.

Finally, the overlapping problem between FDVSs can be easily addressed by our method if their occurrences are not highly related. For example, {a, b, d} and {b, c, d} in Fig.1 share a common pair of vertices {b, d} and form a dense subgraph in summary graph S; they can be easily distinguished from the two cluster centers 1110 and 0011 respectively.

IV. Experimental Study

We use 10 datasets of Saccharomyces cerevisiae from the Stanford Microarray Database (http://smd.stanford.edu) and the NCBI Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/). According to the experimental conditions, they are partitioned into 20 datasets with at least 7 experiment data points (see Table 1 for detailed information). The similarity between two genes in one dataset is measured by Pearson's correlation between their expression profiles. We transform the Pearson's correlation (denoted as r) into another quantity, $(n-2)r^2/(1-r^2)$, and model it as a t-distribution with n − 2 degrees of freedom, where n is the number of data points used. Two genes are connected if the Pearson's correlation of their expression patterns is significant at the α = 0.001 level. Finally, we build 20 co-expression networks comprising 5672 common genes.

Table 1. The sources of microarray files

| No. | Experimental conditions | Data points | References |
|---|---|---|---|
| 1 | Alpha factor release | 18 | Spellman et al. (1998) |
| 2 | Elutriation | 14 | Spellman et al. (1998) |
| 3 | DNA damage (MMS) response | 17 | Gasch et al. (2001) |
| 4 | Gamma radiation | 17 | Gasch et al. (2001) |
| 5 | Mock irradiation | 8 | Gasch et al. (2001) |
| 6 | Diamide | 8 | Gasch et al. (2000) |
| 7 | Heat shock | 22 | Gasch et al. (2000) |
| 8 | Nitrogen depletion | 9 | Gasch et al. (2000) |
| 9 | Nutrition limitation | 10 | Gasch et al. (2000) |
| 10 | Sorbitol effects | 6 | Gasch et al. (2000) |
| 11 | Steady state | 8 | Gasch et al. (2000) |
| 12 | Cell cycle alpha factor | 13 | Zhu et al. (2000) |
| 13 | Fkh1 2 alpha factor | 13 | Zhu et al. (2000) |
| 14 | Nutrition | 8 | Sudarsanam et al. (2000) |
| 15 | Sporulation | 7 | Chu et al. (1998) |
| 16 | Diauxic shift timecourse | 7 | DeRisi et al. (1997) |
| 17 | Signaling and circuitry of multiple MAPK pathways | 56 | Roberts et al. (2000) |
| 18 | Glucose pulse on galactose chemostat | 26 | Ronen et al. (2006) |
| 19 | Calcineurin/Crz1p signaling pathway for Ca | 24 | Yoshimoto et al. (2002) |
| 20 | Calcineurin/Crz1p signaling pathway for Na | 8 | Yoshimoto et al. (2002) |

1. The distribution of edges

We first study the performance of FILTER on the 20 coexpression graphs. At the beginning, there are about 820967 edges whose frequencies are larger than 3 across these graphs. After processing by FILTER with δ = 0.6 and k = 6, only 348222 edges remain. The coarse filtering process thus eliminates more than half of the original edges as irrelevant. Fig.3 shows the distribution of edge numbers before and after filtering versus the Hamming weight of support vectors. The number of edges roughly obeys a power law distribution before filtering, which means the frequency of most edges is relatively small; about 90% of the edges have a frequency of less than 8. The distribution changes to a biased bell shape after the coarse filtering. Concerning the FDVSs appearing in at least k = 6 graphs, GCLUSTER projects all other edge vectors to those with Hamming weight 5, 6 or 7. As they are about 22% of the total edges, using them as seed vectors to project is reasonable. Fig.3 also presents the distribution of combinatorial numbers for choosing k from n = 20. Looking further at Hamming weights 3, 4 and 5, we can see that the number of edges is far larger than the corresponding combinatorial number. This demonstrates that a huge number of edges share the same support vectors at each Hamming weight, so the number of vectors to be projected and then clustered by GCLUSTER will be relatively small.

Fig. 3. The distribution of edges in graph set D before and after invoking FILTER for coarse filtering with k = 6, δ = 0.6

2. Enrichment analysis of FDVSs

Applying the framework in Section III.3, we obtained 43, 35 and 35 FDVSs when δ is 0.6, 0.7, and 0.8 respectively and k is 6. To quantify the functional homogeneity, we submitted those clusters to GOEAST (http://omicslab.genetics.ac.cn/GOEAST/)[9]. For each cluster, GOEAST returns a list of GO terms in three categories: Biological process (BP), Molecular function (MF) and Cellular component (CC). In this paper, an enrichment cluster is defined as a cluster in which more than 80% of the genes significantly share at least one GO term at the α = 0.001 level. Biologically, enrichment clusters tend to be regulated by the same transcription factor(s), and thus form a transcription module. Fig.4 shows the number of GO terms significantly shared by each cluster. Most of the clusters are enrichment clusters. The average numbers of GO terms shared by each cluster are 25.14, 30.77 and 30.83 when δ is 0.6, 0.7, and 0.8 respectively. This suggests that increasing the threshold δ can improve the functional homogeneity of clusters.

Fig. 4. The number of GO terms significantly shared by obtained clusters appearing in at least 6 graphs. Plots (a), (b) and (c) correspond to δ = 0.6, 0.7, 0.8 respectively

3. Functional annotation

Fig.5 presents one cluster containing seven genes YGR118W, YPR043W, YPL079W, YOR167C, YJL136C, YLR388W, YNL303W across the 20 graphs. They are densely connected in 11 graphs. When submitted to GOEAST, it returns three directed graphs corresponding to GO terms significantly shared by six genes, the exception being YNL303W (in blue color), which has no functional annotation yet. Fig.6 shows the relation of the significantly shared GO terms in three categories: Biological process (BP), Cellular component (CC) and Molecular function (MF). Apart from MF, the terms in BP and CC are densely related, and the two leaf nodes GO:0006412 (translation) and GO:0022626 (cytosolic ribosome) have a depth of 6 and 9. This indicates they share high functional homogeneity. The degree of gene YNL303W in the 11 dense subgraphs is 6, 5, 6, 6, 3, 5, 6, 6, 4, and 6 respectively. This indicates it is highly related with the other six genes. Thus, we can speculate that gene YNL303W has the same functions as the other six genes.
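The edge construction rule of the experimental study (Pearson correlation transformed to a t statistic with n − 2 degrees of freedom) can be sketched in Python as follows. This is a sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the test is taken to be two-sided, and the critical value for the chosen significance level is supplied by the caller (for example from a t table) rather than computed here. The statistic used is the signed square root of the quantity $(n-2)r^2/(1-r^2)$ given in the text.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """Signed square root of (n - 2) r^2 / (1 - r^2); requires |r| < 1."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

def is_connected(x, y, t_critical):
    """Assumed two-sided test: connect the genes if |t| reaches t_critical."""
    r = pearson_r(x, y)
    if abs(r) >= 1.0:  # perfect correlation, the statistic is unbounded
        return True
    return abs(t_statistic(r, len(x))) >= t_critical
```

The critical value depends on n, so a dataset with few data points needs a much stronger correlation to produce an edge than a dataset with many.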

V. Conclusions

We developed a new framework for mining frequent dense vertex sets in multiple coexpression networks. The main idea is to filter out irrelevant edges and identify FDVSs in some potential subsets of networks. To achieve this aim, we designed two algorithms: FILTER and GCLUSTER. Combined with the summary approach, FILTER uses only the edge density coefficient and the edge clustering coefficient to detect and delete irrelevant edges iteratively. Through a projecting technique, GCLUSTER can achieve a simple fuzzy clustering. Our method is scalable in the number and size of graphs to be mined, and it is also extendable to weighted and directed graphs. We demonstrated its application in identifying coexpression clusters across 20 microarray datasets of yeast. Most of the identified gene clusters are enrichment clusters which significantly share a number of GO terms. Therefore, the discovered clusters can be used to predict functions of unknown genes, construct transcription modules and infer potential biological mechanisms.

Fig. 5. A potential transcription module which is tightly coexpressed in 11 out of 20 datasets

Fig. 6. The relation between GO terms significantly shared by six genes in Fig.5 in three catalogs: BP, CC and MF

References

[1] X.J. Zhou et al., "Functional annotation and network reconstruction through cross-platform integration of microarray data", Nature Biotechnology, Vol.23, No.2, pp.238–243, 2003.
[2] X. Yan, X. Zhou and J. Han, "Mining closed relational graphs with connectivity constraints", Proc. 2005 ACM SIGKDD Int. Conf. Knowledge Discovery in Databases, pp.324–333, 2005.
[3] H. Hu, X. Yan et al., "Mining coherent dense subgraphs across massive biological networks for functional discovery", BMC Bioinformatics, Vol.21, pp.213–221, 2005.
[4] X. Yan, M. Mehan, Y. Huang, M.S. Waterman, P.S. Yu, X.J. Zhou, "A graph based approach to systematically reconstruct human transcriptional regulatory modules", Bioinformatics, Vol.23, No.13, pp.577–586, 2007.
[5] L. Chen, S.M. Wang, R.S. Chen, "A method to detect gene coexpression clusters from multiple microarrays", Progress in Biochemistry and Biophysics, Vol.35, No.8, pp.914–920, 2008.
[6] F. Radicchi, C. Castellano, F. Cecconi et al., "Defining and identifying communities in networks", PNAS, Vol.101, No.9, pp.2658–2663, 2004.
[7] M. Girvan, M.E.J. Newman, "Community structure in social and biological networks", PNAS, Vol.99, No.12, pp.7821–7826, 2002.
[8] H. Zhang, X. Zan et al., "Detecting dense subgraphs in complex networks based on edge density coefficient", IEEE Fifth International Conference Bio-Inspired Computing: Theories and Applications (BIC-TA), pp.51–53, 2010.
[9] Q. Zheng, X.J. Wang, "GOEAST: a web-based software toolkit for gene ontology enrichment analysis", Nucleic Acids Res., Vol.36, pp.358–363, 2008.

ZAN Xiangzhen was born in 1986. He received the B.S. degree and is pursuing the M.S. degree in the Department of Physics and Electronic Information Engineering, Wenzhou University, China. His research interests are bioinformatics and pattern recognition.

LIU Wenbin was born in 1969. He is a professor of the Department of Physics and Electronic Information Engineering, Wenzhou University, China. He received the Ph.D. degree from the Department of Control Science and Engineering, Huazhong University of Science and Technology in 2004. He then worked as a postdoctoral researcher in the same group for two years. In 2007, he visited the Institute of Systems Biology for one year. His major interests include computational biology, data mining, pattern recognition, DNA computing and evolutionary algorithms. (Email: [email protected])

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

A Fusion Scheme of Region of Interest Extraction in Incomplete Fingerprint∗

JING Xiaojun1,2, ZHANG Bo1,2, ZHANG Jie1,2 and ZHONG Mingliang1,2 (1.School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China) (2.Key Laboratory of Trustworthy Distributed Computing and Service (BUPT), Ministry of Education, Beijing University of Posts and Telecommunications, Beijing 100876, China)

Abstract: A fusion scheme is proposed for incomplete fingerprints to extract the Region of interest (ROI) centered at the reference point. Firstly, the orientation entropy is computed. Secondly, since ROI extraction requires an efficient segmentation of foreground and background, a feature vector based on orientation and gray value is defined for segmentation with SVM. Then, the fingerprint segmentation of the incomplete area is re-computed and measured with correlation and competition of texture, which is based on Local binary pattern (LBP). Finally, the proposed method uses the mutual information of fingerprint orientation and the Poincare Index to detect the reference point of fingerprints, and then extracts the ROI. The performance of the new method is evaluated on the FVC2004 database. The experiments show that it can locate the reference point of all types of fingerprints more effectively and precisely.

Key words: Incomplete fingerprint recognition, Region of interest (ROI), Fusion criterion.

I. Introduction

Recently, biometrics such as fingerprints, iris, face and voice have become one of the most effective means of personal identity authentication due to their uniqueness and immutability. In particular, fingerprint-based identification is the most widely used biometric authentication because of its high portability. However, acquired fingerprints often have dirty parts, scars, creases and so on. For an incomplete fingerprint, the large loss of information and serious nonlinear deformation cause great difficulties in recognition, so it is important to study this problem[1].

The uniqueness of a fingerprint is determined by the global characteristics as well as the local ridge anomalies (a ridge bifurcation or a ridge ending, called minutia)[2]. Methods of fingerprint matching can be coarsely categorized into minutia-based[3], image-based[4] and hybrid-based[5]. To cancel the position and rotation variations, some novel schemes combining image-based methods with a reference point have been proposed. The matching is only executed on the ROI centered at the reference point. So, it is necessary to preserve the useful information in the ROI or foreground, while removing the noise in the background.

ROI determination relies on the accurate detection of the reference point, which is defined as the point that has the maximum curvature in the most internal ridge. Usually, a core point, the topmost or bottommost point on the innermost recurving ridgeline, is used as such a reference point. The Poincare Index method[6] is the most commonly used method to detect the core point. However, the method depends on the reliability and accuracy of the orientation field and is sensitive to noise. Therefore, it may detect spurious core points or miss correct ones. Besides, this method cannot detect the reference point of Arch type fingerprints because they do not present a core point. Other proposed methods are based on high irregularity or curvature[7,8], but they are also subject to noise and cannot differentiate between core and delta. Complex filtering-based methods[9] detect the parabolic and triangular symmetry associated with core and delta points by filtering on complex images corresponding to the orientation tensor. They can obtain not only the most likely position of the singular points but also their associated direction.

In this paper, a segmentation algorithm based on SVM and LBP is presented for the ROI, which combines orientation coherence, orientation entropy and gray value. Then, a reference point set is detected using the Poincare Index method. Meanwhile, another reference point set is extracted by computing the mutual information of fingerprint orientation. Finally, the centroid of the intersection of the two sets is considered as the reference point. Experiment results show that the proposed method can detect the reference point of all classes and is insensitive to resolution and rotation.

The paper is organized as follows. In Section II, some related works are presented. In Section III, the proposed scheme is described in detail. Section IV provides the experimental results to analyze the performance. The conclusion is in Section V.

∗Manuscript Received July 2011; Accepted Nov. 2011. This work is supported by the National Natural Science Foundation of China (No.60872148, No.60702049).

II. Related Works

1. Orientation field estimation

There are two major methods to compute the orientation field of fingerprints: the formula method and the mask method[2]. Although simple, the mask method is not very accurate, so the formula method is mainly used in practice. Block division is important for orientation estimation: a block that is too large or too small increases the estimation error. In this paper, a sliding window is used, empirically.

2. Traditional segmentation algorithm based on mean and variance

Generally speaking, the foreground and background are divided by some thresholds with some features. These features include statistical characteristics of gray value, gradient, direction of the ridge, ridge frequency and projection, Gabor filtering and so on[2]. Fig.1 depicts a traditional example. Because it depends on the threshold, this method is not applicable to fingerprints that are too dry or too wet. Moreover, unsupervised algorithms with a sole characteristic and/or a simple empirical threshold cannot achieve a satisfactory result, while supervised algorithms have high algorithm complexity. Consequently, we design a segmentation algorithm based on multiple features combined with the point level segmentation by SVM.

3. Segmentation based on SVM

SVM was first presented to find the optimal classifier under the linearly separable condition, and it has been used in fingerprint segmentation in Ref.[10]. Let $(x_i, y_i)$, $i = 1, \cdots, n$, be the set of linearly separable samples, where $x \in R^d$ and $y \in \{+1, -1\}$ is the category label. The optimal classification is:

$$f(x) = \mathrm{sgn}\{w^* \cdot x + b^*\} = \mathrm{sgn}\left\{\sum_{i=1}^{n} \alpha_i^* y_i (x_i \cdot x) + b^*\right\} \qquad (1)$$

where $w^* = \sum_{i=1}^{n} \alpha_i^* y_i x_i$ and $b^*$ is the classifying threshold, which can be obtained from an arbitrary support vector or from the median of one pair of support vectors in the two categories.

III. The Proposed Scheme

1. The segmentation steps

Firstly, a medium quality fingerprint and a low quality fingerprint are selected as training samples. The fingerprint is divided into non-overlapping blocks of size 7 × 7. The mean Mean(ω), the variance Var(ω), the orientation coherence Coh(ω) and the entropy H(ω) construct the feature vector. After training, we can obtain the rough segmentation, with y = 1 for the foreground and y = −1 for the background.

Secondly, the preliminary segmentation must be modified, because of the wrong segmentation near the optimal hyperplane by SVM. The probability of SVM can be computed by the sigmoid function[11]:

$$p(f_{svm}(x)) = (1 + \exp(A f_{svm}(x) + B))^{-1} \qquad (2)$$

where $p(f_{svm}(x))$ is the correct classifying probability when the output of the sample x is $f_{svm}(x) = w^* \cdot x + b^*$. A and B are parameters of the function.

Thirdly, the fingerprint segmentation of the incomplete area is re-computed and measured with correlation and competition of texture, which is based on LBP. LBP has been widely used in pattern recognition due to its high robustness and uniqueness[12]. Label the fingerprint as region A, region X and the overlapping region A ∩ X, as depicted in Fig.2. Compute the feature $H_{LBP}(\omega)$ by LBP and the LBP similarity $D_{LBP}(\omega_{nb})$ among regions. Generally, the gray value of the central pixel $I_c$ and the LBP in the eight-pixel neighborhood $I_p$ $(p = 0, 1, \cdots, 7)$ are the features, where $I_*$ stands for gray value[13]:

$$LBP(c) = \sum_{p=0}^{7} b(I_p - I_c)\, 2^p \qquad (3)$$

where

$$b(I_p - I_c) = \begin{cases} 1, & I_p \ge I_c \\ 0, & I_p < I_c \end{cases} \qquad (4)$$

Fig. 2. Region of fingerprint blocks and its surroundings

(1) The feature $H_{LBP}(\omega)$ of the local blocks Ω is computed. The dispersion of feature $H_{LBP}(\omega)$ is:

$$d_{LBP}(\Omega, \Omega_{nb}) = \sum_{0}^{L-1} \mathrm{xor}(H_{LBP}(\Omega), H_{LBP}(\Omega_{nb})) \qquad (5)$$

where $\Omega_{nb}$ are the neighborhoods of block Ω.

(2) Compute the similarity of LBP:

$$D_{LBP}(M) = \frac{d_{LBP}(\Omega, M)}{\sum_{N \in \Omega_{nb}} d_{LBP}(\Omega, N)} \qquad (6)$$

where $D_{LBP}(M)$, $M \in \Omega_{nb}$, describes the differences between blocks, especially between incomplete and complete areas.
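Eqs. (2) to (4) can be illustrated with a short Python sketch. It is a minimal illustration under assumptions: the neighbour ordering p = 0..7 (clockwise from the top-left pixel) is chosen here because the text does not fix it, and the names `lbp_code` and `svm_probability` are hypothetical.

```python
import math

# Assumed neighbour order p = 0..7: clockwise, starting at the top-left pixel.
NEIGHBOURS = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]

def lbp_code(patch):
    """Eqs. (3)-(4): patch is a 3x3 list of gray values, centre at patch[1][1]."""
    centre = patch[1][1]
    # b(I_p - I_c) is 1 when the neighbour is at least as bright as the centre.
    return sum((1 if patch[r][c] >= centre else 0) << p
               for p, (r, c) in enumerate(NEIGHBOURS))

def svm_probability(f, A, B):
    """Eq. (2): sigmoid mapping of the SVM output f to a probability."""
    return 1.0 / (1.0 + math.exp(A * f + B))
```

With a negative A, samples far on the positive side of the hyperplane get a probability close to 1, while samples near the hyperplane stay close to 0.5; these are the blocks that the LBP-based correction is meant to revisit.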

Finally, compute the coherence and the competition by the fuzzy criterion:

$$F(a, b) = \frac{ab}{1 - a - b + 2ab} \qquad (7)$$

When $y(\omega) = 1$, then $a = p(f_{svm}(\omega))$ and

$$b = \arg\min_{p(f_{svm}(\omega_{nb}))} (D_{LBP}(\omega_{nb})).$$

If $F > F_1$, ω is foreground; otherwise, it is background.

When $y(\omega) = -1$, then $a = 1 - p(f_{svm}(\omega))$ and

$$b = \arg\max_{1 - p(f_{svm}(\omega_{nb}))} (D_{LBP}(\omega_{nb})).$$

If $F > F_2$, ω is foreground; otherwise, it is background. Through the above modifying process by LBP, we can obtain the optimal segmentation.

2. The Poincare index method

The Poincare index takes the values 1/2, −1/2, and 0 for a core point, a delta point, and an ordinary point, respectively. The Poincare Index method may displace the core point because the estimated orientation field is smoothed. To resolve the problem, we adaptively extend the area centered at the average core point based on the curvature information. The center of each block contained in the extended area is considered as a core point. Therefore, a reference point set $S_1$ can be obtained. The index is computed as:

$$PI(i, j) = \frac{1}{2\pi} \sum_{k=1}^{N_p} \Delta(k) \qquad (8)$$

where

$$\Delta(k) = \begin{cases} \delta(k), & \text{if } |\delta(k)| < \pi/2 \\ \pi + \delta(k), & \text{if } \delta(k) \le -\pi/2 \\ \pi - \delta(k), & \text{otherwise} \end{cases}$$

$$\delta(k) = O(x_{(k+1) \bmod N_p}, y_{(k+1) \bmod N_p}) - O(x_k, y_k) \qquad (9)$$

3. Mutual information measurement

Information entropy is used to measure the uncertainty of the orientation field. A reference point is defined as the point that has maximum curvature in the most internal ridge, while orientation entropy measures the directional difference in a local area. Orientation entropy in the local area ω is defined as:

$$H(\omega) = -\sum_{(u,v) \in \omega} \rho_{O(u,v)} \log(\rho_{O(u,v)}) \qquad (10)$$

where

$$\rho_{O(i,j)} = \frac{\left(\sum_{(u,v) \in \omega} \Delta|O(i,j) - O(u,v)|\right)^{-1}}{\sum_{(i,j) \in \omega} \left(\sum_{(u,v) \in \omega} \Delta|O(i,j) - O(u,v)|\right)^{-1}} \qquad (11)$$

A fingerprint block and its surroundings can be partitioned into three parts: the region A near to other blocks, the block X, and the overlapping A ∩ X. Their orientation entropies are respectively H(A), H(X) and H(AX). The definition of mutual information in theory is:

$$I(A, X) = H(A) + H(X) - H(AX) \qquad (12)$$

Mutual information measures the correlation between region A and region X well, and is robust. As a result, the value of the orientation mutual information represents the scattering of the orientation around a local block. Predefine a threshold value T. For each block, if the mutual information is smaller than T, the center of this block is decided as a possible reference point, and another reference point set $S_2$ can be obtained.

4. Detection of reference point and ROI

We combine the mutual information and the Poincare Index to detect the reference point more precisely. Meanwhile, a sub-image of size 96 × 96 centered at the reference point is extracted as the ROI (as shown in Fig.9). The criterion is as follows:
(1) If $S_1$ is not empty, then it is supposed that there is at least one core point in the input fingerprint. If the expanded area centered at the average core point exceeds a predefined area, or the number of points in the intersection of $S_1$ and $S_2$ is larger than the predefined number, stop the extension of $S_1$. The intersection is determined as the reference point set S. Its centroid is considered as the detected reference point.
(2) If there is no reference point in $S_1$, the threshold value $T_1$ is chosen. If the mutual information centered at the pixel (i, j) is smaller than $T_1$, then the pixel (i, j) is decided as a reference point.

IV. Experimental Results

To evaluate the reliability of detection, this method is applied to several actual fingerprint images. FVC2004 Set B is used in a set of experiments.

Fig. 3. The comparison of two methods. (a) the original fingerprint; (b) traditional method based on gray; (c) the proposed method

Fig. 4. The reference point set S1 based on the Poincare Index method

Fig. 5. The reference point set S2 obtained from the orientation with gradient dispersion
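The Poincare index of Eqs. (8) and (9) in Section III.2 can be sketched in Python as below. This is a sketch under assumptions rather than the authors' code: orientations are taken as angles in [0, π) sampled along a closed curve around the candidate point, the function names are hypothetical, and in the third branch of Δ(k) the sketch uses δ(k) − π so that every wrapped difference has magnitude at most π/2. That sign convention is an assumption of this sketch; it is what makes a delta-like pattern evaluate to −1/2, the value stated in the text.

```python
import math

def wrapped_difference(delta):
    """Wrap an orientation difference so that its magnitude is at most pi/2."""
    if abs(delta) < math.pi / 2:
        return delta
    if delta <= -math.pi / 2:
        return math.pi + delta
    return delta - math.pi  # assumed sign convention, see the note above

def poincare_index(orientations):
    """Eqs. (8)-(9): sum the wrapped differences around the closed curve."""
    n = len(orientations)
    total = sum(
        wrapped_difference(orientations[(k + 1) % n] - orientations[k])
        for k in range(n)
    )
    return total / (2 * math.pi)

# Synthetic 8-point curves (illustrative data only).
core = [k * math.pi / 8 for k in range(8)]                 # orientation increases
delta = [(-k * math.pi / 8) % math.pi for k in range(8)]   # orientation decreases
ordinary = [0.3] * 8                                       # constant orientation
```

On these synthetic curves the index evaluates to approximately 1/2, −1/2 and 0, matching the three cases listed for core, delta and ordinary points.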

As shown in Fig.3, two methods, the traditional method based on gray and the novel method presented in this paper, are compared. In Fig.3, (a) is the original fingerprint image and (b) is the segmentation result of the traditional method, while (c) is the segmentation result of the proposed method. The algorithm in this paper improves the segmentation considerably, especially for incomplete fingerprints. Fig.6 shows that the proposed method is applicable not only to all fingerprint classes but also to noisy fingerprints. Fig.7 shows that the ROI retains the global characteristics of the input fingerprint. The four databases (DB1, DB2, DB3 and DB4) of FVC2004 are collected using different sensors/technologies. The results in Fig.8 show that the proposed method is insensitive to resolution.

Fig. 6. The detection results of the reference point based on our proposed method for the common fingerprint classes

Fig. 7. The results of ROI

Fig. 8. The reference point localization for fingerprints from FVC2004 DB4 Set B

V. Conclusions

This paper has described a novel method of extracting the Region of Interest and an improved method of the smoothing operation on the orientation image. Firstly, correct segmentation directly affects fingerprint recognition, especially for incomplete fingerprints. The preliminary segmentation is obtained by SVM according to the global features (orientation coherence and orientation entropy) and local features (mean and variance in block). Moreover, to compensate for the defects of SVM, the texture description by LBP and the output of SVM based on the orientation entropy are combined to get the optimal segmentation. Then, to detect the reference point of fingerprint images more precisely, we combined the Poincare Index method and the mutual information. Experiment results show that the improved smoothing method makes the estimated orientation field smoother and avoids extracting spurious core points. The results also indicate the high accuracy of the proposed method, its advantages for Arch type fingerprints and noisy fingerprints, and its robustness to resolution and rotation.

References

[1] A.K. Jain, Jianjiang Feng, K. Nandakumar, "Fingerprint matching", Computer, IEEE Computer Society, Vol.43, No.2, pp.36–44, 2010.
[2] D. Maio, D. Maltoni, A.K. Jain, S. Prabhakar, Handbook of Fingerprint Recognition, Springer Verlag, 2009.
[3] F. Benhammadi, M.N. Amirouche, H. Hentous, K.B. Beghdad, M. Aissani, "Fingerprint matching from minutiae texture maps", Pattern Recognition, Vol.40, pp.189–197, 2007.
[4] A.K. Jain, S. Prabhakar, L. Hong, S. Pankanti, "Filterbank-based fingerprint matching", IEEE Transactions on Image Processing, Vol.9, No.5, pp.846–859, 2000.
[5] A. Ross, A. Jain, J. Reisman, "A hybrid fingerprint matcher", Pattern Recognition, Vol.36, pp.1661–1673, 2003.
[6] M. Kawagoe, A. Tojo, "Fingerprint pattern classification", Pattern Recognition, Vol.17, pp.295–303, 1984.
[7] A.M. Usman, A. Tariq, S. Nasir, A. Khanam, "Core point detection using improved segmentation and orientation", Proceeding of IEEE/ACS International Conference on Computer Systems and Applications (AICCSA), Doha, Qatar, pp.637–644, 2008.
[8] S. Mohammadi, A. Farajzadeh, "Reference point and orientation detection of fingerprints", Proceeding of The Second International Conference on Computer and Electrical Engineering (ICCEE), Dubai, United Arab Emirates, pp.469–473, 2009.
[9] K. Nilsson, "Symmetry Filters applied to fingerprints", Ph.D. Thesis, Chalmers University of Technology, Sweden, 2005.
[10] Zhao Yanyun, Cai Anni, "Fingerprint image segmentation based on support vector machines", Journal of Beijing University of Posts and Telecommunications, Vol.29, No.2, pp.38–41, 2006. (in Chinese)
[11] C.P. John, Probabilities for SV Machines, MIT Press, Boston, 2000.
[12] Ding Ying, Li Wen hui, Fan Jingtao, Yang Huamin, "A moving object detection algorithm base on choquet integrate", Acta Electronica Sinica, Vol.38, No.2, pp.263–268, 2010. (in Chinese)
[13] L. Nanni, A. Lumini, "Local binary patterns for a hybrid fingerprint matcher", Pattern Recognition, Vol.41, pp.3461–3466, 2008.

JING Xiaojun was born in Beijing. He received the M.S. and Ph.D. degrees in 1995 and 1999 respectively. From 2000 to 2002, he was a postdoctoral researcher at Beijing University of Posts and Telecommunications, and he is now a professor at BUPT. (Email: jxiao- [email protected])

ZHANG Bo was born in Shaanxi Province. He is a Ph.D. candidate in the School of Information and Communication Engineering, Beijing University of Posts and Telecommunications. His research interests are mainly information fusion, signal processing and image processing. (Email: [email protected])

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

Uniform Solution to QSAT by P Systems with Proteins∗

LU Chun and SHI Xiaolong (Key Laboratory of Image Processing and Intelligent Control, Department of Control Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China)

Abstract — P systems are distributed parallel computing models in the area of membrane computing, inspired by the structure and the functioning of living cells, as well as by the organization of cells in tissues, organs, and other higher order structures. P systems with proteins on membranes are a class of P systems that have proved to be very efficient computing devices. Specifically, it was known that the Quantified satisfiability problem (QSAT) of a Boolean formula can be solved by a semi-uniform family of P systems with proteins on membranes and with membrane division. However, it remained open whether uniform families of P systems with proteins on membranes can solve in polynomial time exactly the class of problems PSPACE. In this work, we present a uniform solution to the QSAT problem by P systems with proteins on membranes, in a time that is linear with respect to both the number n of Boolean variables and the number m of clauses of the instance, which answers the above open problem.

Key words — Membrane computing, Membrane protein, PSPACE-complete problem, Quantified satisfiability problem (QSAT problem), Uniform solution.

I. Introduction

Membrane systems (P systems) were introduced by Gh. Păun as distributed parallel computing devices[1], based on inspiration from biochemistry, especially with respect to the structure and the functioning of a living cell. The cell is considered as a set of compartments enclosed by membranes which are nested one in another. The basic model consists of a hierarchical structure composed of several membranes, embedded into a main membrane called the skin. Membranes divide the Euclidean space into regions that contain multisets of objects (represented by symbols of an alphabet) and evolution rules. Using these rules, the objects may evolve and/or move from a region to a neighboring one. Most variants of P systems have been proved to be Turing equivalent[2−5] and computationally efficient[6−9]. For general information in this area see the membrane computing Web site (http://ppage.psystems.eu). In this work, we deal with a class of P systems called P systems with proteins on membranes (MP systems, for short) introduced in Ref.[10].

MP systems, first considered by Andrei Păun and Bianca Popa, are a continuation of the investigations aiming to bridge membrane computing and brane calculi[11]. Both of them start from the same reality - the living cell, but they develop in different directions and have different objectives. Membrane computing tries to abstract the computing power of biologically inspired models in the Turing sense, whereas brane calculi work in the framework of process algebra. Various operations on membranes appear in both areas; a few related attempts in this direction include Refs.[12] and [13].

From a realistic point of view, processing the different species of molecules in the membrane structure in a maximally parallel way is not realistic, so it is necessary to consider a model that limits the parallelism. This purpose is achieved by modeling the trans-membrane proteins (protein channels) observed in living cells. Several computational universality results for MP systems are known. In this work, we focus on the computational efficiency of MP systems.

In order to efficiently solve computationally hard problems, the membrane division rule was introduced into P systems[14]. With division rules, a membrane (other than the skin membrane) and its contents can be duplicated by an interaction of the membrane with itself. Using such rules, a system can duplicate a given membrane that contains a specified symbol, possibly rewriting this symbol in a different way in each of the cells produced by the process. All the other symbols, as well as the rules contained in the original cell, are copied unaltered into the two resulting cells. The computational efficiency of P systems with division rules was studied in Ref.[15].

The computational power of MP systems with cell division was first studied in Ref.[16], where their ability to solve NP-complete problems in polynomial time was demonstrated. In Ref.[17], an efficient semi-uniform solution to QSAT (Quantified satisfiability problem), a well known PSPACE-complete problem, is provided in the framework of MP systems with cell division. However, it remains open whether uniform families of P systems with proteins on membranes can solve in polynomial time exactly the class of problems PSPACE.
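The way division rules produce an exponential workspace in a linear number of steps can be illustrated with a toy simulation. The sketch below is only an illustration under simplifying assumptions: a membrane is modeled as a plain dictionary of truth values, one division step per variable duplicates every membrane and rewrites the new symbol differently in each copy, and the names `divide_all` and `generate_assignments` are hypothetical helpers, not part of any P system formalism or library.

```python
def divide_all(membranes, variable):
    """One division step: every membrane is replaced by two copies.

    The existing contents are copied unaltered into both copies; the new
    symbol is rewritten differently in each (True in one, False in the other).
    """
    result = []
    for contents in membranes:
        result.append({**contents, variable: True})
        result.append({**contents, variable: False})
    return result


def generate_assignments(variables):
    """Apply one division step per variable, starting from one empty membrane."""
    membranes = [{}]
    for variable in variables:
        membranes = divide_all(membranes, variable)
    return membranes


if __name__ == "__main__":
    cells = generate_assignments(["x1", "x2", "x3"])
    # Three division steps yield 2**3 membranes, one per truth assignment.
    print(len(cells))
```

After n steps the list holds 2^n membranes, which is the linear-time, exponential-space trade-off that division-based solutions rely on.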

∗Manuscript Received Aug. 2011; Accepted Sept. 2011. This work is supported by the National Natural Science Foundation of China (No.61033003, No.60971085), the Opening Fund of Key Laboratory of Image Processing and Intelligent Control, Ministry of Education (No.200905). 668 Chinese Journal of Electronics 2012

In this work, we present a uniform solution to the QSAT problem by using a recognizer MP system with cell division in polynomial time. The system in the family solves all the possible instances of QSAT with n Boolean variables and m clauses, in a time which is linear with respect to n and m. The results given in this work answer the above open problem.

The paper is organized as follows. In Section II, we provide some definitions for the P systems studied in this work and introduce some preliminary ideas about recognizer membrane systems and polynomial complexity classes. In Section III, we present a uniform and polynomial solution of the QSAT problem by a family of recognizer P systems with proteins on membranes and with membrane division rules. Conclusions and some final remarks are given in Section IV.

II. Definitions

The reader is assumed to be familiar with basic elements of membrane computing and computability, from Ref.[18]. So only a few notions and notations used throughout the paper are mentioned.

As usual, for a given alphabet $V$, $V^*$ denotes the set of all finite strings over $V$, with the empty string denoted by $\lambda$, and the membrane structures are expressed by correctly matching labeled parentheses. The family of recursively enumerable sets of natural numbers is denoted by NRE.

In the framework of MP systems, there are two types of objects, proteins and usual objects; the former are placed on the membranes, the latter are placed in the regions delimited by membranes. A protein $p$ on a membrane (with label) $i$ is written in the form $[_i p \mid ]_i$. Both the regions of a membrane structure and the membranes can contain multisets of objects and proteins, respectively.

The following types of rules are considered for handling the objects and the proteins; in all of them, $a, b, c, d$ are objects, $p$ is a protein, and $i$ is a label. In Table 1, res means that the protein is restricted during the evolution. In these cases, the protein is not evolved and plays the role of a catalyst, just assisting the evolution of objects.

Table 1. Rules of type res
Type    Rule    Effect
1res    $[_i p \mid a]_i \to [_i p \mid b]_i$,  $a[_i p \mid ]_i \to b[_i p \mid ]_i$    modify an object, but not move
2res    $[_i p \mid a]_i \to a[_i p \mid ]_i$,  $a[_i p \mid ]_i \to [_i p \mid a]_i$    move an object, but not modify
3res    $[_i p \mid a]_i \to b[_i p \mid ]_i$,  $a[_i p \mid ]_i \to [_i p \mid b]_i$    modify and move one object
4res    $a[_i p \mid b]_i \to b[_i p \mid a]_i$    interchange two objects
5res    $a[_i p \mid b]_i \to c[_i p \mid d]_i$    interchange and modify two objects

A generalization is to allow rules of the forms in Table 2 (now, cp means that the rule will change the protein during the evolution), where $p$ evolves to $p'$ as the catalyst of the rule ($p$ and $p'$ are possibly equal; if $p = p'$, then the rules of type cp become rules of type res).

Table 2. Rules of type cp
Type    Rule    Effect
1cp    $[_i p \mid a]_i \to [_i p' \mid b]_i$,  $a[_i p \mid ]_i \to b[_i p' \mid ]_i$    modify an object, but not move
2cp    $[_i p \mid a]_i \to a[_i p' \mid ]_i$,  $a[_i p \mid ]_i \to [_i p' \mid a]_i$    move an object, but not modify
3cp    $[_i p \mid a]_i \to b[_i p' \mid ]_i$,  $a[_i p \mid ]_i \to [_i p' \mid b]_i$    modify and move one object
4cp    $a[_i p \mid b]_i \to b[_i p' \mid a]_i$    interchange two objects
5cp    $a[_i p \mid b]_i \to c[_i p' \mid d]_i$    interchange and modify two objects

An intermediate case can be that of changing proteins, but in a restricted manner, by allowing at most two states for each protein, $p$ and $\bar{p}$, and the rules either as in the first table (without changing the protein), or changing from $p$ to $\bar{p}$ and back (like in the case of bistable catalysts). Rules with such flip-flop proteins are denoted by $n$ff, $n = 1, 2, 3, 4, 5$ (note that in this case we allow both rules which do not change the protein and rules which switch from $p$ to $\bar{p}$ and back).

Both in the case of rules of type ff and of type cp we can ask that the proteins are always moved to their complementary state (from $p$ into $\bar{p}$ and vice versa). Such rules are said to be of pure ff or cp type, and we indicate the use of pure ff or cp rules by writing ffp and cpp, respectively.

To introduce membrane division, the following type of rule is used to divide membranes with label $i$:

$$\text{div}: [_i p \mid ]_i \to [_i p' \mid ]_i [_i p'' \mid ]_i,$$

where $p, p', p''$ are proteins (possibly equal).

A membrane can be divided by means of an interaction with itself (the skin is never divided). The division rule does not change the label of the membrane, and in the next step it yields two membranes with the same label and the same contents, objects and/or other membranes (although the rule specifies only the proteins involved). As in standard P systems, the execution of every rule in the system takes exactly one time unit. In this way, cell division allows us to obtain an exponential number of cells in linear time and to design cellular solutions to QSAT in polynomial time.

A P system with proteins on membranes and membrane division of initial degree $m \ge 1$ is a device of the form

$$\Pi = (O, P, \Sigma, \mu, \omega_1/z_1, \cdots, \omega_m/z_m, E, R_1, \cdots, R_m, i_o, i_{in}),$$

where
• $m$ is the degree of the system (the number of membranes);
• $O$ is the set of objects;
• $P$ is the set of proteins (with $P \cap O = \emptyset$);
• $\mu$ is the membrane structure consisting of $m$ membranes enumerated by $1, \cdots, m$;
• $\omega_1, \cdots, \omega_m$ are the (strings representing the) multisets of objects present in the $m$ regions of the membrane structure $\mu$;
• $z_1, \cdots, z_m$ are the multisets of proteins present on the $m$ membranes of $\mu$;
• $E \subseteq O$ is the set of objects present in the environment (in an arbitrarily large number of copies each);
• $R_1, \cdots, R_m$ are finite sets of rules associated with the $m$ membranes of $\mu$;
• $i_o \in \{1, 2, \cdots, m\}$ indicates the output region;
• $i_{in} \in \{1, 2, \cdots, m\}$ indicates the input membrane.

The rules of the forms specified above are used in a non-deterministic maximally parallel way: in each step, a maximal multiset of rules is used, that is, no rule can be applied to the objects and the proteins which remain unused by the chosen multiset. As usual, each object and each protein can be involved in the application of only one rule, but the membranes are not considered as involved in the rule applications, hence the same membrane can appear in any number of rules at the same time.

If, at one step, two or more rules can be applied to the same objects and proteins, then only one rule will be non-deterministically chosen. If at the same time a membrane labeled with $i$ is divided by a rule of type div and there are objects in this membrane which evolve by means of rules of types cp and res, then we suppose that first the evolution rules are used, and then the division is produced. The process takes one step.

At each step, an MP system is characterized by a configuration consisting of all multisets of objects and proteins present on the corresponding membranes (we ignore the structure $\mu$, which will not be changed, and the objects from the environment). For example, $C = \omega_1/z_1, \cdots, \omega_m/z_m$ is the initial configuration, given by the definition of the P system. By applying the rules in a non-deterministic maximally parallel manner, we obtain transitions between the configurations of the system. A finite sequence of configurations is called a computation. A computation halts if it reaches a configuration where no rule can be applied to the existing objects and proteins.

Only halting computations are considered successful, thus a non-halting computation will yield no result. With a halting computation, we associate a result in the form of the multiplicity of objects in region $i_o$ in the halting configuration. We denote by $N(\Pi)$ the set of numbers computed in this way by a given system $\Pi$. (A generalization would be to distinguish the objects and to consider vectors of natural numbers as the result of a computation, but we do not examine this case here.)

Families of MP systems with cell division can be used as language recognizers, since deciding whether an instance of a problem has an affirmative or negative answer is equivalent to deciding if a string belongs or not to the language associated with the problem.

Because of non-determinism, all the computations of a recognizer MP system on a given input must produce the same answer. This can be summarized in the following definitions. A recognizer MP system with cell division $\Pi$ has an alphabet containing two distinguished objects yes and no, used to signal acceptance and rejection respectively; every computation of $\Pi$ is halting and exactly one object among yes, no is sent out from the skin membrane during each computation.

We denote the class of decision problems solvable by uniform (semi-uniform, respectively) families of MP systems in polynomial time by PMC (PMC$^S$, respectively). If the family $\Pi$ can provide a uniform solution to the problem $X$, it can be proved that PMC$_R$ is closed under polynomial-time reduction and complement[1,19]. If $\Pi$ is a family of recognizer P systems solving a decision problem $X$ in polynomial time and in a uniform way, then it provides a polynomial time solution of $X$ in a semi-uniform way. Thus, we have PMC $\subseteq$ PMC$^S$.

Let us denote by PMC$_{MP}$ the set of all decision problems which can be solved by means of recognizer MP systems with cell division in polynomial time. The class PMC$_{MP}$ is closed under polynomial-time reduction and under complement.

III. A Uniform Solution to QSAT

In this section, a uniform and linear time solution to the QSAT (satisfiability of quantified propositional formulas) problem is provided through a family of recognizer P systems with proteins on membranes and with membrane division.

The QSAT problem is a standard PSPACE-complete problem. An instance of it is a formula

$$\gamma = Q_1 x_1 Q_2 x_2 \cdots Q_n x_n (C_1 \wedge C_2 \wedge \cdots \wedge C_m)$$

where each $Q_i$, $1 \le i \le n$, is either $\forall$ or $\exists$, and each $C_j$, $1 \le j \le m$, is a clause in the form of a disjunction $C_j = y_1 \vee y_2 \vee \cdots \vee y_r$ (with each $y_k$ being either a propositional variable, $x_s$, or its negation, $\neg x_s$). For example, it is easy to see that the formula $\beta = Q_1 x_1 Q_2 x_2 [(x_1 \vee x_2) \wedge (\neg x_1 \vee \neg x_2)]$ is true when $Q_1 = \forall$ and $Q_2 = \exists$, but false when $Q_1 = \exists$ and $Q_2 = \forall$.

Theorem 1 QSAT $\in$ PMC$_{MP}$.

Proof Let us consider a propositional formula in the conjunctive normal form:

$$\varphi = C_1 \wedge \cdots \wedge C_m$$
$$\mathrm{Var}(\varphi) = \{x_1, \cdots, x_n\}$$
$$C_i = y_{i,1} \vee \cdots \vee y_{i,l_i}, \quad 1 \le i \le m$$
$$y_{i,k} \in \{x_j, \neg x_j \mid 1 \le j \le n\}, \quad 1 \le i \le m, \ 1 \le k \le l_i$$

A normal form for QSAT: the quantified formula with $n$ variables and $m$ clauses is

$$\varphi^* = Q_1 x_1 Q_2 x_2 \cdots Q_n x_n (C_1 \wedge C_2 \wedge \cdots \wedge C_m)$$

where $Q_i$ is either $\forall$ or $\exists$.

The pair function from $N^2$ onto $N$ is defined by $\langle n, m \rangle = ((n + m)(n + m + 1)/2) + n$. This function is polynomial time computable and bijective from $N^2$ onto $N$. Depending on the numbers $m$ (of clauses) and $n$ (of variables), we will consider a system $(\Pi(\langle n, m \rangle), \Sigma(\langle n, m \rangle), i_{in})$, where $i_{in} = 0$ is the input region and $\Sigma(\langle n, m \rangle) = \{v_{i,j}, v'_{i,j}, v_i, v'_i \mid 1 \le i \le m, 1 \le j \le n\}$ is the input alphabet.

The problem instance $\gamma$ (with size parameters $n$ and $m$) will be encoded by $cod(\gamma)$, a multiset of symbols from $\Sigma$ corresponding to the clause-variable pairs such that the clause is satisfied by the true or the false assignment of the variable, together with symbols recording the quantifiers:

$$cod(\gamma) = \{v_{i,j} \mid x_j \in \{y_{i,k} \mid 1 \le k \le l_i\}, 1 \le i \le m, 1 \le j \le n\}$$
$$\cup \ \{v'_{i,j} \mid \neg x_j \in \{y_{i,k} \mid 1 \le k \le l_i\}, 1 \le i \le m, 1 \le j \le n\}$$
$$\cup \ \{v_i \mid Q_i = \forall, 1 \le i \le n\}$$
$$\cup \ \{v'_i \mid Q_i = \exists, 1 \le i \le n\}.$$

We now construct the MP systems

$$\Pi(\langle n, m \rangle) = (O, P, \Sigma, \mu, \omega_0/z_0, \cdots, \omega_{n+1}/z_{n+1}, \emptyset, R, i_{in}, i_o)$$

of degree $n + 2$, with

$O = \{a_i, t_i, f_i \mid 1 \le i \le n\} \cup \{t, s\} \cup \{d_i \mid 1 \le i \le 6n + m + 1\} \cup \{yes, no\}$;
$P = \{P_0, P_+, P_-, P_x\} \cup \{P_i \mid 1 \le i \le n\}$;
$\Sigma = \{v_{i,j}, v'_{i,j}, v_j, v'_j \mid 1 \le j \le n, 1 \le i \le m\}$;
$\mu = [_0 [_n \cdots [_1 [_{n+1} ]_{n+1} ]_1 \cdots ]_n ]_0$;

$\omega_{n+1} = d_0$;
$\omega_0 = \lambda$;
$\omega_i = a_i$, for each $i = 1, 2, \cdots, n$;
$z_i = P_0$, for all $i = 0, 1, \cdots, n + 1$;
$i_{in} = n + 1$ is the input cell;
$i_o = 0$ is the output region;
$R$:

$$[_i P_x \mid ]_i \to [_i P_+ \mid ]_i [_i P_- \mid ]_i, \quad 1 \le i \le n \qquad (1)$$
$$[_i P_+ \mid a_i]_i \to [_i P_+ \mid t_i]_i, \quad 1 \le i \le n \qquad (2)$$
$$[_i P_- \mid a_i]_i \to [_i P_- \mid f_i]_i, \quad 1 \le i \le n \qquad (3)$$
$$t_{i+1} [_i P_0 \mid ]_i \to [_i P_x \mid t_{i+1}]_i, \quad 1 \le i \le n - 1 \qquad (4)$$
$$f_{i+1} [_i P_0 \mid ]_i \to [_i P_x \mid f_{i+1}]_i, \quad 1 \le i \le n - 1 \qquad (5)$$

t P | → P |t , ≤ i

The solution consists of the following stages:
• Generation stage: with membrane division rules, all truth assignments for the variables associated with the Boolean formula are produced.
• Pre-checking stage: determine which truth assignments make all the clauses true.
• Checking stage: the universal and existential gates of the fully quantified formula are simulated and the satisfiability of the whole formula is encoded by a special object in a suitable membrane.
• Output stage: the system sends out to the environment the right answer according to the result of the previous stage.

Generation stage

In the first $n + 1$ steps, we encode membrane $i$ ($1 \le i \le n$) with quantifier $\forall$ ($\exists$) by sending the object $y_i$ ($v_i$, $1 \le i \le n$, respectively) into the membrane $i - 1$ according to rules 15-17.

This phase finishes after $m + 1$ steps.

Checking stage

This phase of the computation checks whether the formula obtained from the pre-checking stage is satisfied under the quantifiers. Following rules 19-22, objects $t$ move upwards in the membrane structure tree, with the quantifier $\forall$ or $\exists$ checked at each level.

The universal gate of the formula is simulated by rules 19-20, which are used to examine the presence of the object $t$ in the membrane labeled by $i + 1$ which contains cell-$i$. This happens if and only if the formulas in these two cells are both satisfiable. In membrane $[_i P_+ \mid ]_i$ ($[_i P_- \mid ]_i$, respectively), the value of variable $i$ is $x_i$ ($\neg x_i$, respectively). Otherwise, the computation in this gate stops without sending object $t$ out.

The existential gate of the formula is simulated by rules 21-22. When the quantifier is $\exists$, the formula is satisfiable if and only if at least one copy of $t$ is present in cell-$(i + 1)$. For either of the membranes $[_i P_+ \mid ]_i$ and $[_i P_- \mid ]_i$ contained in the same upper membrane, the object $t$ is sent outside when it is present in cell-$i$, which is encoded by $v_i$. If no copy of $t$ is present in membrane $i$, the gate sends nothing outside and no rule is applied. If the quantified formula is satisfiable, object $t$ is released to the membrane $i + 1$.

Eventually, object $t$ appears in membrane $n + 1$, signaling that the whole formula is satisfied, and the system halts. Otherwise, the formula is not satisfiable, and $t$ will not be sent to membrane $n + 1$.

Output stage

If one of the truth assignments from a membrane with label 1 has satisfied the formula with quantifiers, there is an object $t$ in membrane 0. In the next step, yes is sent to the environment by rule 25 and the protein on membrane 0 is changed to $P_+$, which will never let the object no appear.

If the formula is not satisfiable, nothing has been sent out by step $6n + m + 3$. With the counting object $d_{6n+m+3}$ and $P_0$ in membrane 0, the object no is produced by rule 26 and sent into the environment.

In order to show that the family $\Pi = \{\Pi(\langle n, m \rangle) \mid n, m \in N\}$ is polynomially uniform by deterministic Turing machines, we note that the sets of rules associated with the system $\Pi(\langle n, m \rangle)$ are recursive. Hence, it suffices to show that the amount of resources necessary for defining each system is quadratic in $\max\{n, m\}$, and this is indeed the case, since those resources are the following:
1. Size of the alphabet: $2nm + 12n + m + 9 \in \Theta(nm)$.
2. Initial number of cells: $n + 2 \in \Theta(n)$.
3. Initial number of objects: $2n + 3 \in \Theta(n)$.
4. Number of rules: $n^2 + 3nm + 20n + m + 5 \in \Theta(nm + n^2)$.
5. Upper bound for the length of the rules: $6 \in \Theta(1)$.

From the previous explanation, the system starts the computation with the multiset $cod(\gamma)$ added to the input cell, and correctly answers the question whether or not $\varphi^*$ is satisfiable in the last step. The duration of the computation is polynomial in terms of $n$ and $m$: the answer yes or no is sent out at the latest in step $6n + m + 3$.

It is easy to see that the system $\Pi$ can be constructed in polynomial time starting from the numbers $m$ and $n$, and this concludes the proof.

From Theorem 1, and having in mind that the complexity class PMC$_{MP}$ is closed under polynomial time reductions, we have the following result.

Corollary 1 PSPACE $\subseteq$ PMC$_{MP}$.

Proof It suffices to make the following remarks: the QSAT problem is PSPACE-complete, QSAT $\in$ PMC$_{MP}$, and the complexity class PMC$_{MP}$ is closed under polynomial time reduction.

Theorem 2[17] PSPACE $\subseteq$ PMC$^S_{MP}$.

Corollary 2[17] PMC$^S_{MP}$ = PSPACE.

From Theorem 1 and Corollary 2, the following theorem holds: PSPACE = PMC$_{MP}$ = PMC$^S_{MP}$.

IV. Conclusions

We have shown that the QSAT problem, a well known PSPACE-complete problem, can be solved in linear time by a uniform family of P systems with proteins on membranes and with membrane division, which answers the open problem of whether uniform families of P systems with proteins on membranes can solve in polynomial time exactly the class of PSPACE problems.

The solution presented here differs from the solution to PSPACE problems given by Sosík et al.[17] in the following sense: a family of P systems with proteins on membranes is constructed, associated with the problem that is being solved, in such a way that all the instances of the problem that have the same size are processed by the same P system (to which an appropriate input, depending on the concrete instance, is supplied). In Ref.[17] one works with semi-uniformly constructed P systems, each associated with one of the instances of the problem.

In membrane computing, there are two main classes of P systems: those with the membranes arranged hierarchically, inspired by the structure of the cell, and those with the membranes placed in the nodes of a graph, all of them at the same level. In the study of the computational power of MP systems, the membranes have been arranged in a hierarchical structure. How would the computational power of (semi-)uniform families of such systems change if a graph structure were introduced?

References

[1] G. Păun, “Computing with membranes”, Journal of Computer and System Sciences, Vol.61, No.1, pp.108–143, 2000.
[2] L. Pan, G. Păun, “Spiking neural P systems with anti-spikes”, International Journal of Computers, Communications & Control, Vol.4, No.3, pp.273–282, 2009.
[3] L. Pan, G. Păun, “Spiking neural P systems: an improved normal form”, Theoretical Computer Science, Vol.411, pp.906–918, 2010.
[4] J. Wang, H.J. Hoogeboom, L. Pan, Gh. Păun, M.J. Pérez-Jiménez, “Spiking neural P systems with weights”, Neural Computation, Vol.22, No.10, pp.2615–2646, 2010.
[5] L. Pan, X. Zeng, X. Zhang, “Time-free spiking neural P systems”, Neural Computation, Vol.23, No.5, pp.1320–1342, 2011.
[6] T.O. Ishdorj, A. Leporati, L. Pan, X. Zeng, X. Zhang, “Deterministic solutions to QSAT and Q3SAT by spiking neural P systems with pre-computed resources”, Theoretical Computer Science, Vol.411, pp.2345–2358, 2010.
[7] X. Zhang, S. Wang, Y. Niu, L. Pan, “Tissue P systems with cell separation: attacking the partition problem”, Science China Information Sciences, Vol.54, No.2, pp.293–304, 2011.

[8] Y. Niu, L. Pan, M.J. Pérez-Jiménez, M.R. Font, “A tissue P systems based uniform solution to tripartite matching problem”, Fundamenta Informaticae, Vol.109, pp.179–188, 2011.
[9] L. Pan, M.J. Pérez-Jiménez, “Computational complexity of tissue-like P systems”, Journal of Complexity, Vol.26, No.3, pp.296–315, 2010.
[10] A. Păun, B. Popa, “P systems with proteins on membranes and membrane division”, Lecture Notes in Computer Science, Vol.4036, pp.292–303, 2006.
[11] L. Cardelli, “Brane calculi”, Lecture Notes in Computer Science, Vol.3082, pp.257–280, 2005.
[12] L. Cardelli, G. Păun, “An universality result for a (mem) brane calculus based on mate/drip operations”, International Journal of Foundations of Computer Science, Vol.17, No.1, pp.49–68, 2006.
[13] S.N. Krishna, “Universality results for P systems based on Brane Calculi operations”, Theoretical Computer Science, Vol.371, No.1, pp.83–105, 2007.
[14] G. Păun, “P Systems with active membranes: attacking NP-complete problems”, Journal of Automata, Languages and Combinatorics, Vol.6, No.1, pp.75–90, 2001.
[15] C. Zandron, A. Leporati, C. Ferretti, G. Mauri, “On the computational efficiency of polarizationless recognizer P systems with strong division and dissolution”, Fundamenta Informaticae, Vol.87, No.1, pp.79–91, 2008.
[16] A. Păun, B. Popa, “P systems with proteins on membranes”, Fundamenta Informaticae, Vol.72, No.4, pp.467–483, 2006.
[17] P. Sosík, A. Păun, A. Rodríguez-Patón, D. Pérez, “On the power of computing with proteins on membranes”, Lecture Notes in Computer Science, Vol.5957, pp.448–460, 2010.
[18] G. Păun, Membrane Computing-An Introduction, Springer-Verlag, Berlin, pp.51–125, 2002.
[19] M.J. Pérez-Jiménez, A. Romero-Jiménez, F. Sancho-Caparrini, “A polynomial complexity class in P systems using membrane division”, Journal of Automata, Languages and Combinatorics, Vol.11, No.4, pp.423–434, 2006.

LU Chun is a Ph.D. candidate in Huazhong University of Science and Technology, Wuhan, China. He received the M.S. degree in Systems Engineering from Huazhong University of Science and Technology in 2008. Currently, his main research interests cover membrane computing, neural computing, automata theory and its application.

SHI Xiaolong Ph.D., is an associate professor of System Engineering in the Department of Control Science and Engineering of Huazhong University of Science and Technology, Wuhan, China. His major research interests cover image processing, neural networks, pattern recognition and bioinformatics.

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

Missing Value Estimation for Gene Expression Profile Data∗

WANG Xuesong, LIU Qingfeng and CHENG Yuhu

(School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou 221116, China)

Abstract — A new Missing value (MV) estimation method for gene expression profile data is proposed by considering both the internal and external conditions of gene expression profiles. The internal condition emphasizes the time-series characteristic of gene expression profile data. Therefore, we can use the cubic spline fitting method to construct a gene expression curve so as to estimate MVs. The main idea of MV estimation based on the external condition is to reconstruct MVs according to the expression values of candidate genes. Firstly, an initial subset of candidate genes is determined by defining a trace matrix. Then a final subset of candidate genes is constructed by selecting genes from the initial subset according to an improved Pearson correlation coefficient. At last, we select the K genes that are most correlated with the target gene from the final subset to compute the weighted sum of the K expression values. Thus, the weighted sum is the estimated value of the target gene based on the external condition. Experimental results indicate that, compared with the commonly used MV estimation methods KNNimpute, SKNNimpute and IKNNimpute, the proposed method has higher estimation accuracy and is robust to the magnitude of K.

Key words — Gene expression profile, Missing value, Correlation coefficient, Curve fitting, Trace matrix.

I. Introduction

With the coming of the post-genomic era, more and more experts have paid attention to constructing Gene regulatory networks (GRNs) in recent years. As we know, gene expression profile data obtained from microarray technology is an important material for constructing GRNs. Generally speaking, GRNs can be constructed using a data analytical method[1] or a biological method[2]. However, due to the imperfections of microarray experiments, most gene expression profile data contain an average of 5% of Missing values (MVs)[3]. Because of MVs, many data analytical methods cannot perform well[4]. Therefore, it is necessary to design suitable MV estimation methods.

So far, the simple ways usually applied to dealing with MVs include removing the genes with MVs directly (case deletion), or replacing the MVs of a gene with zero or the average of the observed values over that gene[5]. Case deletion procedures may bias the results if the remaining cases are unrepresentative of the entire sample. Because the same value is used to replace MVs in a given gene, both zero and mean substitutions will reduce the variance of the variable in question[6]. Scholz et al. pointed out that the key of MV estimation is to find a relationship between genes[7], based on which a lot of methods have been developed. These can be classified into two categories, i.e., global strategy and local strategy[8]. An assumption of the global strategy is that all genes in a dataset share a covariance structure. Therefore, the global strategy is only suitable for datasets with strong global correlation. Because the local strategy can exploit the local similarity structure of genes, it has the ability to deal with noise and time-series gene expression data.

Typical methods using the local strategy are the weighted K-nearest neighbor imputation (KNNimpute)[9] and its improved methods, including the sequential KNN imputation method (SKNNimpute)[10] and the iterative KNN imputation method (IKNNimpute)[11]. MV estimation using these methods is carried out by constructing weights between the target gene and each candidate gene. These MV estimation methods merely take the external condition, i.e., the expression profiles of candidate genes, into consideration. Therefore, good estimation results can be obtained only when there exists a strong correlation between the target gene and each candidate gene. If the number of genes is large, the probability of obtaining candidate genes that are strongly correlated with the target gene is high, which is helpful for MV estimation. But in practice, the number of genes is small. Therefore, insufficient genes may result in weak correlation and, further, in large MV estimation error when only the external condition is considered.

It is well known that a gene expression profile is usually denoted by a large matrix. A row of the matrix represents the expression of a gene under different experimental conditions (time points), which is time-series data. Therefore, there exists an internal condition between the time-series data for a gene, which is only related to the gene itself. Similar to the external condition, the internal condition is also applicable for MV estimation. In our study, a novel MV estimation method is proposed by considering both the internal and external conditions. The estimated values of MVs are composed of two components. The first one is obtained from a curve fitting result based on the internal condition, and the second component is the weighted sum of observed values over candidate genes based on the external condition.

∗Manuscript Received Mar. 2011; Accepted May 2011. This work is supported by the National Natural Science Foundation of China (No.60804022, No.60974050, No.61072094), Program for New Century Excellent Talents in University (No.NCET-08-0836), Fok Ying-Tung Education Foundation for Young Teachers (No.121066), Natural Science Foundation of Jiangsu Province (No.BK2008126).

II. Methods and Materials

1. Notation

The dataset of a gene expression profile can be denoted as a matrix $v = (x_{ij})_{N \times M}$, where $x_{ij}$ represents the value of gene $i$ at the $j$th time point, and $N$ and $M$ are the numbers of genes and time points respectively. In our study, a gene with MVs is called a target gene, and the genes with available information for estimating its missing entries constitute the set of candidate genes. If the value of target gene $y$ at the time point $z$ is missing, the estimated value of the MV is denoted as $\hat{x}_{yz}$.

2. KNNimpute, SKNNimpute and IKNNimpute

KNNimpute is a classical method. It takes advantage of the principle of minimum Euclidean distance to select the K nearest neighbors of the target gene, and then reconstructs the MVs of the target gene by a weighted average of the K neighbors[9]. To compute the Euclidean distance $d_{yi}$ between the target gene $y$ and each candidate gene $i$, a matrix $r'$ is introduced. If the value of gene $i$ at the $j$th time point is missing, the $ij$th element of $r'$, $r'_{ij}$, is equal to 0; otherwise, it is 1. The Euclidean distance is defined as:

$$d_{yi} = \sqrt{\frac{\sum_{j=1}^{M} r'_{yj} r'_{ij} (x_{yj} - x_{ij})^2}{\sum_{j=1}^{M} r'_{yj} r'_{ij}}} \qquad (1)$$

The weight between the selected gene $k$ and the target gene $y$ is defined as:

$$w_{yk} = \frac{1/d_{yk}}{\sum_{k=1}^{K} 1/d_{yk}} \qquad (2)$$

The estimated value $\hat{x}_{yz}$ takes the following form:

$$\hat{x}_{yz} = \sum_{k=1}^{K} w_{yk} x_{kz} \qquad (3)$$

As an improved method, SKNNimpute differs from KNNimpute in two main points: (1) MVs are reconstructed sequentially, starting from the genes with the smallest missing rate. (2) For a target gene, once all the MVs in it have been reconstructed, it can be reused as a candidate gene in the following work. In SKNNimpute, a dataset is divided into two parts, $x^{complete}$ and $x^{incomplete}$. $x^{incomplete}$ is formed by all target genes while $x^{complete}$ is formed by all candidate genes. Generally, the K nearest neighbors are selected from $x^{complete}$.

Compared with KNNimpute, IKNNimpute is based on an iterative procedure. It follows three steps: (1) Replace all MVs with the average values of their corresponding genes. (2) Estimate all MVs through the SKNNimpute procedure. (3) Compute the sum of squared differences between the last two estimations. If the sum is larger than a predefined threshold, return to (2) and continue until it is smaller than the threshold.

3. Our method

Generally speaking, the interpolation method is good at estimating MVs for time-series data, and cubic spline fitting is more suitable for processing gene expression profile data. A gene expression curve obtained by cubic spline fitting not only reflects the internal growth and change regulation of a gene but also can reconstruct MVs. Therefore, the cubic spline fitting method is used here to create the gene expression curve, and the estimated value of $x_{yz}$ obtained in this way is denoted as $x^f_{yz}$.

For each gene, a threshold value is defined according to Eq.(4).

$$\text{Threshold}_i = |\mu_i| + \lambda |\sigma_i| \qquad (4)$$

where $\mu_i$ and $\sigma_i$ are the mean and variance of the expression value of gene $i$ respectively, and $\lambda$ is a predefined constant with $\lambda > 0$.

Let $T = (t_{ij})_{N \times M}$ be a trace matrix where $t_{ij}$ is defined as:

$$t_{ij} = \begin{cases} 0, & x_{ij} = \text{NaN} \\ 1, & |x_{ij}| < \text{Threshold}_i \\ 2, & |x_{ij}| \ge \text{Threshold}_i \end{cases} \qquad (5)$$

where NaN represents an MV.

Then we can get an initial subset of candidate genes:

$$v^{icandidate} = (x_{oj})_{O \times M}, \quad \text{if } t_{yz} = 0 \text{ and } t_{oz} \ne 0 \qquad (6)$$

Gene expression values reflect the level of activity of a gene under different experimental conditions, and large values always represent a strong level of activity. The larger the gene expression values, the smaller the measurement error. In other words, large gene expression values are helpful for improving the estimation accuracy of MVs. The traditional Pearson correlation coefficient measures the similarity between any two genes from the overall situation, while it neglects the influence of large gene expression values. In order to highlight the effect of large gene expression values, an improved Pearson correlation coefficient $r_{yi}$ between the target gene $y$ and each candidate gene $i$ is proposed.

$$r_{yi} = \frac{\sum_{j=1}^{M} \left((t_{yj} \cap t_{ij}) x_{yj} - \bar{x}_y\right)\left((t_{yj} \cap t_{ij}) x_{ij} - \bar{x}_i\right)}{\sqrt{\sum_{j=1}^{M} \left((t_{yj} \cup t_{ij}) x_{yj} - \bar{x}_y\right)^2 \sum_{j=1}^{M} \left((t_{yj} \cup t_{ij}) x_{ij} - \bar{x}_i\right)^2}} \qquad (7)$$

where '$\cap$' and '$\cup$' denote minimizing and maximizing operations respectively. It should be especially specified that if each of the two variables $t_{yj}$ and $t_{ij}$ is zero, the result is zero regardless of whether the '$\cap$' or the '$\cup$' operation is applied.

Suppose $\psi$ is the threshold of the improved correlation coefficient defined in Eq.(7); we can then obtain the final subset of candidate genes, denoted by

$$v^{fcandidate} = (x_{lj})_{L \times M}, \quad \text{if } r_{yl} > \psi \qquad (8)$$

After getting the final subset $v^{fcandidate}$, we select the K genes with the largest magnitude of improved correlation coefficient from the final subset of candidate genes, and construct a weight:

$$w_{yk} = \frac{r_{yk}}{\sum_{k=1}^{K} r_{yk}} \times \frac{L}{O} \qquad (9)$$

where $O$ and $L$ represent the number of genes in the initial and final subsets respectively.

According to Eq.(10), the estimated value $\hat{x}_{yz}$ can be obtained.

$$\hat{x}_{yz} = \left(1 - \frac{L}{O}\right) x^f_{yz} + \sum_{k=1}^{K} w_{yk} x_{kz} \qquad (10)$$

The steps of reconstructing MVs through our method can be summarized as follows.
Step 1 Sort target genes in ascending order according to their missing rates;
Step 2 Execute the cubic spline curve fitting operation for target gene $y$;
Step 3 For the MV $x_{yz}$ in gene $y$:
(a) Obtain the estimated value $x^f_{yz}$ from the cubic spline curve fitting;
(b) Calculate the trace matrix $T$ by Eq.(5) and obtain the initial subset of candidate genes according to Eq.(6);
(c) Compute the improved Pearson correlation coefficient between target gene $y$ and each initial candidate gene according to Eq.(7), and obtain the final subset of candidate genes through Eq.(8);
(d) Select K genes according to the magnitude of the correlation coefficients between target gene $y$ and the final candidate genes, then construct the weight matrix according to Eq.(9);
(e) Obtain the estimated value $\hat{x}_{yz}$ according to Eq.(10) and replace the MV with $\hat{x}_{yz}$;
(f) Compute the difference $\delta$ between the former and the current estimated values. If $|\delta| \le \tau$, go to Step 4; otherwise, return to Step 3(b) and iterate until the convergence criterion $\tau$ is reached;
Step 4 Reconstruct the next MV in target gene $y$ until all MVs in target gene $y$ have been replaced;
Step 5 Go to the next target gene until all target genes have been estimated.

4. Dataset

In our study, we use the Saccharomyces microarray dataset published by Spellman to validate our method. The dataset can be described as a matrix with rows corresponding to genes and columns to experimental conditions[12]. Three datasets TD1, TD2 and TD3, obtained under three different experimental conditions, i.e., cdc28, cdc15 and alpha, respectively, are used (http://cellcycle-www.stanford.edu/). Here, TD1 is used for the feasibility analysis, while TD2 and TD3 are used for the comparative analysis. Original datasets,

III. Results and Analysis

1. Parameter sets

In our method, there are three parameters that need to be set: $\lambda$, $\tau$ and $\psi$. Here, we set $\lambda$ and $\tau$ to be 5 and $10^{-3}$ respectively. Theoretically, two genes have strong similarity if the absolute value of their Pearson correlation coefficient is larger than 0.75[9]. As for the improved correlation coefficient defined in Eq.(7), when the minimizing operation yields 1 or 0 and the maximizing operation yields 2, we get the smallest value, which is a quarter of the Pearson correlation coefficient (0.75/4 ≈ 0.19). Therefore, $\psi$ is set to 0.2.

2. Feasibility analysis

For the complete dataset TD1, some true values were deleted at random to create a test dataset. Here, the Proportion of genes containing MVs (PGMV) in the test dataset TD1 was 1%, 5%, 10% and 15% respectively. We then used our method to recover the introduced missing values and used the Normalized root mean squared error (NRMSE) as an evaluating index.

$$\text{NRMSE} = \frac{1}{\sigma_{true}} \sqrt{\frac{\sum (V_{true} - V_{est})^2}{n}} \qquad (11)$$

where $n$ is the number of MVs and $\sigma_{true}$ is the standard deviation of the true values. $V_{true}$ and $V_{est}$ represent the true and estimated values of MVs respectively. The NRMSE of the estimation is shown in Fig.1.

Fig. 1. NRMSE obtained by our method under different PGMVs

From Fig.1, it can be easily seen that the NRMSE first decreases and then increases with the number of nearest neighbors. For all conditions of PGMV, the NRMSEs remain basically unchanged when K varies between 10 and 40. In addition, for a fixed K value, the NRMSE increases as the PGMV increases.

Brás and Menezes did the same experiment on TD1[11], and their results showed that the NRMSEs obtained by KNNimpute, SKNNimpute and IKNNimpute were larger than 0.6.
TD1 and TD2 are pre-processed for the evaluation by remov- Fig.1 shows that all NRMSEs obtained by our method are ing rows and columns containing missing expression values, smaller than 0.5. Accordingly, our method has higher estima- yielding ‘complete’ datasets. Table 1 shows the attributes of tion accuracy which is much feasible for MV estimation. these datasets. 3. Comparative analysis Table 1. Attributes of datasets In our study, we select 170 genes at random from the com- TD1 TD2 TD3 plete dataset TD2 to do comparative analysis on four MV es- Experimental condition cdc28 cdc15 alpha timation methods, i.e., KNNimpute, SKNNimpute, IKNNim- Dimension of original dataset 6179×17 6179×24 6179×18 pute and our method. Similar to the research of Troyanskaya [9] [11] Dimension of complete dataset 1383×17 4380×24 et al. and Br´as and Menezes , we select 4 different values 676 Chinese Journal of Electronics 2012 of K, 5, 10, 15 and 20, to show the results of comparative analysis. The evaluating indexes used here are Average effective error (AEE) and Error rate (ER). AEE is the average of estimation errors that are smaller than 1 and ER is the rate of estimation errors that are larger than 1.  AEE =(1/n ) (Vtrue − Vest) (12)    ER =(n − n )/n (13) where n is the number of errors that are smaller than 1. From Fig.2, we can see that the AEE of our method is lower than that of other three methods. Moreover, the tendencies of AEE using our method are the same over different values of K which means that, our method has the highest stabil- Fig. 2. AEE obtained by different methods over different values of K ity. Table 2 shows that the ER of our me- thod is the lowest among all methods with the same PGMV. IV. Conclusions Therefore, it can be concluded that our method has better esti- DNA microarray is a high-throughput technology that al- mation accuracy and stability than KNNimpute, SKNNimpute lows the recording of expression levels of thousands of genes and IKNNimpute methods. 
simultaneously, giving a global view of gene expression. The data generated in a set of microarray experiments are usually Table 2. ER obtained by different methods over gathered in a matrix with genes in rows and experimental con- different values of PGMV ditions in columns. Frequently, these matrices contain MVs PGMV KNNimpute SKNNimpute IKNNimpute Our method due to the occurrence of imperfections during the microarray 5% 31.3% 31.3% 25.0% 12.5% experiment. MVs may make the precision or the stability of 10% 42.2% 37.5% 30.4% 25.0% data analytical methods for gene expression data poor. In this 15% 47.0% 36.2% 32.4% 16.7% work, we propose a new method for estimating MVs in gene 20% 52.4% 42.0% 39.2% 15.6% expression data from the view of point of both the internal and external conditions, i.e., the estimation value of a target In the third experiment, we select 104 genes from TD3 gene is composed of two components. The first estimation is and use them to do cluster analysis. From the research of the result of the cubic spline fitting, and the second one is Spellman et al., we know that the 104 genes can be classi- the weighted sum of expression values over K candidate genes fied into 5 classes with containing 21, 8, 9, 15 and 51 genes that are most similar to the target gene. In order to highlight respectively[12]. In the 5 classes, there are 7, 0, 1, 5 and 9 genes theeffectoflargegeneexpressionvalues,animprovedPearson contain MVs respectively. Firstly we reconstruct all MVs in correlation coefficient is proposed to measure the similarity be- the input data by the four MV estimation methods respectively tween the target gene and each candidate gene. Experimental and then use the Fuzzy C-means clustering (FCM) algorithm results concerning on the Saccharomyces microarray dataset to classify these genes. The number of genes correctly clas- verify the feasibility and validity of the proposed MV estima- sified is shown in Table 3. 
It can be seen from Table 3 that tion method. more genes can be classified correctly using our method. The reason is that our method has higher MV estimation accuracy; therefore, the gene expression data after the reconstruction of References MVs is much suitable for clustering analysis. [1] X.S. Wang, Y.Y. Gu, Y.H. Cheng et al., Construction of delay gene regulatory network based on complex network”, Acta Elec- Table 3. Number of genes correctly classified tronica Sinica, Vol.38, No.11, pp.2518–2522, 2010. (in Chinese) MV [2] M. Choi, O.H. Lee, S. Jeon et al., “The oocyte-specific tran- estimation Class 1 Class 2 Class 3 Class 4 Class 5 Total scription factor, Nobox, regulates the expression of Pad6, a pep- tidylarginine deiminase in the oocyte”, , Vol.584, method FEBS Letters No.16, pp.3629–3634, 2010. KNNimpute 13 28 68 8 6 13 [3] A.G. De Brevern, S. Hazout, A. Malpertuy, “Influence of mi- SKNNimpute 14 8 6 13 28 69 croarrays experiments missing values on the stability of gene IKNNimpute 13 8 6 13 29 69 groups by hierarchical clustering”, BMC Bioinformatics, Vol.5, Our method 15 8 6 13 30 72 pp.114–125, 2004. [4] X.S. Wang, Y.Y. Gu, Y.H. Cheng et al., “An ensemble classi- Missing Value Estimation for Gene Expression Profile Data 677

fier based on selective independent component analysis of DNA WANG Xu es o n g received the Ph.D. degree from China University of microarray data”, Chinese Journal of Electronics, Vol.18, No.4, pp.645–649, 2009. Mining and Technology in 2002. She [5] J.L. Schafer, J.W. Graham, “Missing data: our view of the is currently a professor in the School of Information and Electrical Engineer- state of the art”, Psychological Methods, Vol.7, No.2, pp.147– 177, 2002. ing, China University of Mining and Tech- [6] M.T. Swain, J.J. Mandel, W. Dubitzky, “Comparative study nology. Her main research interests in- of three commonly used continuous deterministic methods for clude machine learning, bioinformatics and artificial intelligence. (Email: wangx- modeling gene regulation networks”, BMC Bioinformatics, Vol.11, pp.459–484, 2010. [email protected]) [7] M. Scholz, F. Kaplan, C.L. Guy et al., “Non-linear PCA: a miss- LIU Qingfeng received the B.S. de- ing data approach”, , Vol.21, No.20, pp.3887– Bioinformatics gree from China University of Mining and 3895, 2005. Technology in 2009. He is currently a M.S. [8] R. J¨ornsten, H.Y. Wang, J.W. William et al., DNA microarray candidate in the School of Information and data imputation and significance analysis of differential expres- Electrical Engineering, China University sion”, Bioinformatics, Vol.21, No.22, pp.4155–4161, 2005. of Mining and Technology. His main re- [9] O. Troyanskaya, M. Cantor, G. Sherlock et al., “Missing value search interest is bioinformatics. (Email: estimation methods for DNA microarrays”, Bioinformatics, [email protected]) Vol.17, No.6, pp.520–525, 2001. [10] K.Y. Kim, B.J. Kim, G.S. Yi, “Reuse of imputed data in mi- croarray analysis increases imputation efficiency”, BMC Bioin- CHENG Yuhu received the Ph.D. formatics, Vol.5, pp.160–169, 2004. degree from the Institute of Automation, [11] L.P. Br´as, J.C. Menezes, “Improving cluster-based missing value Chinese Academy of Sciences in 2005. 
He is estimation of DNA microaray data”, Biomolecular Engineering, currently a professor in the School of Infor- Vol.24, No.2, pp.273–282, 2007. mation and Electrical Engineering, China [12] P.T. Spellman, G. Sherlock, M.Q. Zhang et al., “Comprehen- University of Mining and Technology. His sive identification of cell cycle-regulated genes of the yeast Sac- main research interests include machine charomyces cerevisiae by microarray hybridization”, Molecular learning and intelligent system. (Email: Biology of the Cell, Vol.9, No.12, pp.3273–3297, 1998. [email protected]) Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

A Rate-Distortion Model Based Frame Layer Rate Control Algorithm for Stereoscopic Video Coding∗

WANG Qun, ZHUO Li, ZHANG Jing and LI Xiaoguang

(Signal and Information Processing Laboratory, Beijing University of Technology, Beijing 100124, China)

Abstract: Rate control plays an important role in video coding and transmission. In this paper, a novel rate-distortion model is first proposed to characterize the coding characteristics of stereoscopic video coding, where the weighted average of the left and right viewpoints measured with the Video quality metric (VQM) is adopted as the stereoscopic video coding distortion metric, instead of Mean square error (MSE). Then a frame layer rate control method for stereoscopic video coding is presented based on the proposed R-D model. Experimental results demonstrate that the proposed R-D model can accurately characterize the relationship among coding distortion, coding rate and quantization parameter, and that the proposed rate control method can efficiently keep the output bit rate consistent with the target bit rate while the R-D coding performance is superior to that of JMVC 4.0.

Key words: Stereoscopic video, Video quality metric (VQM), Rate distortion model, Rate control.

∗Manuscript Received Dec. 2011; Accepted Jan. 2012. This work is supported by the National Natural Science Foundation of China (No.61003289, No.61100212), the Natural Science Foundation of Beijing (No.4102008), the Program for New Century Excellent Talents in University, the Excellent Science Program for the Returned Overseas Chinese Scholars of Ministry of Human Resources and Social Security of China, Scientific Research Foundation for the Returned Overseas Chinese Scholars of MOE.

I. Introduction

Rate control plays an important role in video coding and transmission. On the one hand, it is an essential component for robust video transmission, especially over time-varying and narrowband channels, where the bandwidth usually fluctuates. With the rate control technique, we can control the output bit rate according to the channel conditions, buffer size, etc. On the other hand, rate control also helps to improve the video quality. The compression efficiency of the video encoder and the reconstructed video quality can be greatly improved through optimal bit allocation.

The source model was first presented by Hang H.M. et al. in 1997[1]; it describes the relationships among the output rate (R), coding distortion (D) and quantization parameter (QP). Source models are also called Rate-distortion (R-D) models, as they characterize the coding performance of the video encoder. Currently, many R-D model based rate control algorithms have been applied in coding standards. For example, the first-order linear model is used in the MPEG-2 TM5 rate control algorithm[2], and the quadratic model is used in the MPEG-4 VM8 algorithm[3]. TMN8, proposed by Jordi Ribas Cobera et al., is used in the H.263 rate control algorithm, and its R-D model is the combination of the logarithmic model and the quadratic model[4]. With recent advances in stereoscopic video coding, research on rate-distortion model based rate control for stereoscopic videos has attracted high interest over the past years. Generally speaking, the rate-distortion models currently used for stereoscopic video coding are mostly improvements of the traditional quadratic R-D model[5,6].

The rest of this paper is organized as follows. The proposed rate-distortion model for stereoscopic video coding is studied in Section II. The frame layer rate control method based on the proposed R-D model is presented in Section III. The experimental results are analyzed in Section IV. Section V concludes the paper.

II. Proposed Rate-Distortion Model for Stereoscopic Video Coding

1. Coding framework of stereoscopic video

In this paper, MVC is used as the coding framework of stereoscopic video. MVC was proposed by the ITU-T and MPEG Joint Video Team (JVT) to achieve a high multi-view video coding efficiency. The coding architecture of stereoscopic video coding is shown in Fig.1[7]. It can be seen that the stereoscopic video encoder usually adopts the combination of Disparity compensated prediction (DCP) and Motion compensated prediction (MCP) to remove the redundant information.

Fig. 1. The coding framework for stereoscopic video coding

2. Coding distortion measurements of stereoscopic video

VQM, which is developed by the Institute of Telecommunication Sciences (ITS) and American National Standard Institute (ANSI), is a standardized objective video quality metric[8]. VQM can represent both the perceived overall image quality and depth of 3-D video well and has shown a better performance in terms of stereoscopic video coding distortion compared with the traditional PSNR metric[9].

It is known that, according to the characteristics of human visual perception, if the right and left views are displayed with different quality and resolutions, the overall 3-D video quality is determined by the view with the better quality and resolution[10]. Based on this theory, in this paper, we use the weighted average VQM of the left and right views instead of the average VQM as the stereoscopic video coding distortion metric, where the weights are set to 0.7 (left) and 0.3 (right) respectively.

3. The actual R-D model for stereoscopic video coding

The MVC coding framework supports two kinds of coding modes, i.e. Intra and Inter mode. The coding structures of I frames and P frames are quite different, thus the coding distortion property also varies. Therefore, we need to set up their R-D models separately.

In order to analyze the rate distortion characteristics of the stereoscopic video encoder, three kinds of 3-D video sequences with different motion characteristics are tested first. For each sequence, 250 frames are chosen for the test, and the frame rate is fixed to 25 frames/s. We take 2 views from the 8-view video and use the weighted average VQM of the left and right views as the coding distortion metric. To measure the R-D characteristics of I frames, the coding structure is set as all I frames, while for P frames the IPPP···IPPP··· coding structure is utilized. The GOP size is set as 8 and the QP value as 5, 10, 15···50 respectively for each sequence.

Fig.2 shows the relationship curves of rate vs. distortion, quantization parameter vs. distortion, and rate vs. quantization parameter of I frames for the Ballroom sequence, which are simply called the R-D, Q-D and QP-R curves respectively. Fig.3 shows the P frame results of the Ballroom sequence. From Fig.2 and Fig.3, it can be seen that a quadratic polynomial model exhibits the best correlation with the Q-D curve for both I frames and P frames. However, it is difficult to approximate the R-D and QP-R relationships.

Fig. 2. R-D, Q-D and QP-R curves of Ballroom sequence I frame. (a) R-D curve; (b) Q-D curve; (c) QP-R curve

Fig. 3. R-D, Q-D and QP-R curves of Ballroom sequence P frame. (a) R-D curve; (b) Q-D curve; (c) QP-R curve

4. The proposed R-D model for stereoscopic video coding

We have found through a large number of experiments that if we directly adopt an exponential function to fit the R-D and R-Q curves in Fig.2 and Fig.3, its accuracy is very low and it can lead to a rapid increase of computational complexity. However, if we take the logarithm of R and plot the results again in Fig.4 and Fig.5, we can see clearly that a cubic polynomial model describes the log R − D and log R − Q curves well for both I frames and P frames, which reduces the computational complexity of the model to a certain extent.

Fig. 4. log(R)-D and log(R)-Q curves of Ballroom I frame. (a) log(R)-D curve; (b) log(R)-Q curve

Fig. 5. log(R)-D and log(R)-Q curves of Ballroom P frame. (a) log(R)-D curve; (b) log(R)-Q curve

Therefore, considering the trade-off between model complexity and accuracy, we characterize the rate distortion characteristics of both I frames and P frames using the same simple cubic model. In this paper, the rate-distortion model for stereoscopic video coding is proposed as follows:

$$\begin{cases} R\text{-}D: & D_p = p_1 \log^3(R) + p_2 \log^2(R) + p_3 \log(R) + p_4 \\ Q\text{-}D: & D_p = p_5 QP^3 + p_6 QP^2 + p_7 QP + p_8 \\ R\text{-}Q: & QP = p_9 \log^3(R) + p_{10} \log^2(R) + p_{11} \log(R) + p_{12} \end{cases} \quad (1)$$

where $QP$ is the quantization step, $R$ is the coding rate, $D_p$ is the coding distortion measured using VQM, and $p_i$, $i = 1, 2, \cdots, 12$ are the parameters of the proposed model.

5. Experimental results and analysis

The accuracy of the R-D model is essential for the subsequent rate control results. Several 3D video sequences are tested and compared to validate the accuracy of the R-D model presented in this paper. Fig.6 shows the comparison of the model data and the actual test data of the Vassar sequence. From Fig.6, it can be seen that the R-D model presented in this paper coincides well with the actual R-D curves, and the model can accurately characterize the relationship among coding distortion, coding rate and quantization parameter. Similar results have been achieved when other sequences were tested.

Fig. 6. Comparison between proposed model and actual data of Vassar sequence. (a) R-D curve; (b) Q-D curve; (c) R-Q curve

III. Proposed Frame Layer Rate Control Method

The goal of the rate control method is to keep the output bit rate of the encoder consistent with the target bit rate. Under the given target bit rate constraint, the first step is to estimate the Quantization parameter (QP) of the source encoder according to the proposed model. The current frame is then encoded with the estimated QP. In this paper, the basic rate control unit is the GOP.

For the R-D model based rate control method, the key is to calculate the parameters of the model. Because the characteristics of neighboring frames are very close to each other, the coding statistics of previously coded frames can be utilized to estimate the model parameters of the current frame. Therefore, the method can be implemented with the following steps:

Step 1 Initialization
• Encode the I frame with the preset QP parameters and obtain the output bit number $B_I$. Set Encoded Frame Num = 1. Subtract $B_I$ from the target bits $B_T$ and get the remaining bits $B_P$ used for the P frames of the GOP.

$$B_P = B_T - B_I \quad (2)$$

where $B_T = \dfrac{R_{\mathrm{target}} \cdot \text{GOP Size}}{\text{Frame rate}}$, GOP Size is the GOP size and $R_{\mathrm{target}}$ the target bit rate.
• Encode the 1st, 2nd, 3rd and 4th P frames with the preset QP parameters and obtain four sets of data, i.e. $(R_1, QP_1)$, $(R_2, QP_2)$, $(R_3, QP_3)$ and $(R_4, QP_4)$. Set Encoded Frame Num = 5.
Step 2 Determine the QP
• Calculate the target bits $B_i$ of the remaining P frames in the same GOP using Eq.(3):

$$B_i = \frac{B_P - \sum_{i=1}^{\text{Encoded Frame Num}} R_i}{\text{GOP} - \text{Encoded Frame Num}} \quad (3)$$

• Estimate the model parameters of Eq.(1) using $(R_1, QP_1)$, $(R_2, QP_2)$, $(R_3, QP_3)$ and $(R_4, QP_4)$, then use these model parameters to determine the QP value of the current frame corresponding to the target bits.
• Encode the frame with the obtained QP and record the output bit number of the current frame.
Step 3 Update
• Update the coding data in Eq.(1) using the new output bit number and QP. Use these coding data to estimate the model parameters of the next frame.
• Update Encoded Frame Num = Encoded Frame Num + 1.
Step 4 Loop over frames
• Repeat Steps 2 and 3 until all the P frames in the current GOP are encoded.

IV. Experimental Results and Analysis

In order to validate the effectiveness of the proposed rate control method, we performed experiments over several stereoscopic video test sequences with different motion characteristics, and compared the proposed method with the fixed QP method of the JMVC reference software. In the experiments, we test 250 frames for each sequence, and the frame rate is fixed to 25 frames per second. The IPPP···IPPP··· coding structure is utilized to encode each GOP. The GOP size is set as 15. The Rate control error (RCE) is used to measure the accuracy of the rate control method:

$$\mathrm{RCE} = \frac{|R_{\mathrm{target}} - R_{\mathrm{actual}}|}{R_{\mathrm{target}}} \times 100\% \quad (4)$$

Table 1 lists the target bit rate and the actual output bit rate for various video sequences. From Table 1, it can be seen that the proposed rate control method can efficiently keep the output bit rate consistent with the target bit rate. The average rate control error is about 3%. Fig.7 and Fig.8 show the R-D performance curves of the Ballroom and Exit sequences using the proposed rate-controlled method and the JMVC 4.0 method respectively. It can be seen from Fig.7 and Fig.8 that the R-D performance of the proposed RC algorithm is superior to that of the JMVC 4.0 method, and a certain amount of coding gain has been achieved for some sequences.

Fig. 8. Comparison of R-D performance with different methods for Exit sequence

Table 1. The difference between the output bit rate and the target bit rate for various video sequences
Test sequence   Target bit rate (kbit/s)   Actual bit rate (kbit/s)   RCE (%)
Ballroom        6543.068                   6324.1224                  3.346
Ballroom        2123.4784                  2112.5580                  0.514
Ballroom        946.044                    979.6448                   3.552
Exit            4769.2176                  4592.6684                  3.702
Exit            1145.7196                  1175.4584                  2.596
Exit            372.3664                   365.9608                   1.720
Vassar          7177.8368                  6873.018                   4.247
Vassar          1944.4324                  1805.6980                  7.135
Vassar          373.874                    381.7824                   2.115
Race1           4271.6256                  4111.7936                  3.742
Race1           2121.6644                  2202.5396                  3.812
Race1           927.166                    949.7148                   2.432
Flamenco2       3359.4508                  3517.4008                  4.702
Flamenco2       1798.9368                  1823.2428                  1.351
Flamenco2       921.9116                   955.1424                   3.604

Generally speaking, the proposed rate control method can efficiently keep the output bit rate consistent with the target bit rate while the R-D coding performance is superior to that of the JMVC 4.0 method. The proposed rate control method can be applied in stereoscopic video coding and transmission applications.

V. Conclusion

In this paper, a novel R-D model for stereoscopic video coding is first proposed using the weighted average VQM of the left and right views as the stereoscopic video distortion metric, where the weights are set to 0.7 and 0.3 respectively. A cubic polynomial model is proposed to describe the correlation curves, i.e. log R versus distortion D, as well as log R versus QP, for both I frames and P frames. The experimental results indicate that the R-D model presented in this paper coincides well with the actual stereoscopic video coding curves. Then a rate control method for stereoscopic video coding at frame layer is presented based on the proposed R-D model. Experimental results demonstrate that the proposed rate control method can efficiently keep the output bit rate consistent with the target bit rate while the R-D coding performance is superior to that of the JMVC 4.0 method.

Fig. 7. Comparison of R-D performance with different methods for Ballroom sequence

References

[1] H. Hang, J. Chen, "Source model for transform video coder and its application in fundamental theory", IEEE Trans. Circuits Syst. Video Technol., Vol.7, No.2, pp.287–298, 1997.
[2] ISO/IEC JTC1/SC29/WG11, MPEG-2 Test Model 5, 1993.
[3] ISO/IEC JTC1/SC29/WG11, MPEG-4 Video Verification Model Version 18.0, 2001.
[4] ITU-T/SG15, Video codec test model, TMN8, 1997.
[5] J.L. Chen, "Research on multi-view video coding", Ph.D. Thesis, Zhejiang University, Hangzhou, China, 2006.
[6] Z.J. Zhu, F. Liang et al., "Bit-allocation and rate-control algorithm for stereo video coding", Journal on Communications, Vol.28, No.7, pp.15–21, 2007.
[7] Z.J. Zhu, G.Y. Jiang, M. Yu, "Fast disparity estimation algorithm for stereo video coding", Proceedings of 2002 IEEE Region 10 Conference on Computers, Communications, Control and Power Engineering, Beijing, China, pp.285–288, 2002.
[8] M. Pinson, S. Wolf, "A new standardized method for objectively measuring video quality", IEEE Transactions on Broadcasting, Vol.50, No.3, pp.312–322, Sept. 2004.
[9] Chaminda T.E.R. Hewage et al., "Quality evaluation of color plus depth map-based stereoscopic video", IEEE Journal of Selected Topics in Signal Processing, Vol.3, No.2, pp.304–318, 2009.
[10] S.L.B. Stelmach, W.J. Tam, D. Meegan and A. Vincent, "Stereo image quality: Effects of mixed spatio-temporal resolution", IEEE Trans. Circuits Syst. Video Technol., Vol.10, No.2, pp.188–193, Mar. 2000.

WANG Qun was born in 1986. She received the B.S. degree in Communication Engineering from Shandong University of Science and Technology, Qingdao, China, in 2009. Since then, she has been a graduate student in Information and Communication Engineering at Beijing University of Technology. Her research interests mainly include stereoscopic video coding and transmission. (Email: [email protected])

ZHUO Li was born in 1971. She received the B.S. degree in Radio Technology from the University of Electronic Science and Technology, China, in 1992, the M.S. degree in Signal and Information Processing from the Southeast University, Nanjing, in 1998, and the Ph.D. degree in Pattern Recognition and Intellectual System from Beijing University of Technology, in 2004. She has been a professor of Beijing University of Technology since 2007. Her research interests include secure multimedia signal processing, stereoscopic video coding and transmission. (Email: zhuoli@ bjut.edu.cn)

ZHANG Jing was born in 1975. She holds a Ph.D. degree and is an Associate Professor and M.S. student supervisor of Beijing University of Technology. Her research interest is image/video signal and processing. (Email: [email protected])

LI Xiaoguang was born in Beijing, China. He received the B.S. and Ph.D. degrees in Electronic Engineering from the Beijing University of Technology, in 2003 and 2008, respectively. He is currently a Lecturer and M.S. student supervisor of the Beijing University of Technology. His research interests include image super resolution and high dynamic range image processing. (Email: [email protected])

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

A KPLS-Eigentransformation Model Based Face Hallucination Algorithm∗

LI Xiaoguang, XIA Qing and ZHUO Li

(Signal and Information Processing Laboratory, Beijing University of Technology, Beijing 100124, China)

Abstract: The traditional eigentransformation method for face hallucination is a linear subspace approach, which represents an image as a linear combination of training samples. Consequently, novel facial appearances not included in the training samples cannot be super-resolved properly. In this paper, a KPLS (Kernel partial least squares) regression is introduced into the eigentransformation method to reconstruct the High resolution (HR) image from a Low resolution (LR) facial image. We have evaluated our proposed method using different zooming factors and compared its performance with current Super resolution (SR) algorithms. Experimental results show that our algorithm can produce better HR face images than the compared eigentransformation based method and the KPLS method in terms of both visual quality and numerical errors.

Key words: Face hallucination, Eigentransformation, Image super resolution, Kernel partial least squares.

∗Manuscript Received Nov. 2011; Accepted Apr. 2012. This work is supported by the National Natural Science Foundation of China (No.61003289, No.61100212), the Natural Science Foundation of Beijing (No.4102008), the Program for New Century Excellent Talents in University, the Excellent Science Program for the Returned Overseas Chinese Scholars of Ministry of Human Resources and Social Security of China, Scientific Research Foundation for the Returned Overseas Chinese Scholars of MOE.

I. Introduction

Due to the limits of image acquisition devices, the captured images may be of Low-resolution (LR), and may be subject to warping, blurring, down-sampling and noise. Consequently, the low quality of face images will further affect the performance of face recognition or automatic video analysis. To solve the problem, it is useful to render a High resolution (HR) face image from the low-resolution one. These techniques are known as face Super-resolution (SR) or face hallucination[1].

Baker and Kanade[2] proposed a "hallucination" algorithm to break the limit of the reconstruction constraint. Wang et al.[3] presented an eigentransformation method for hallucinating human faces, in which global features are employed. This method is concise in its theoretical expression, easy to implement, and very efficient when compared to other SR algorithms. However, due to the linear property of PCA (Principal component analysis), the eigentransformation model cannot model the relationship between the LR and HR faces efficiently. Therefore, the results become more and more similar to the mean face of the training set as the zooming factor increases. Wu et al.[4] presented a patch-based two-step image super-resolution approach. The example facial images are normalized and divided into overlapped patches; then KPLS[5,6] is employed to implement the regression between the LR and HR patches located at the same position in the face. For each patch location, a regression model is trained. The regression model is then used to infer super-resolved image patches from the LR inputs. The residual error image is also compensated via a KPLS model. Experimental results show that the reconstructed results are impressive for small zooming factors (such as 2 or 4). However, obvious blocky artifacts appear on the resolved HR images if the zooming factor is 8 or even larger.

The above mentioned works have made significant contributions to learning-based super-resolution for human faces. Keep in mind that PCA can handle the face representation problem for larger zooming factors, but the personal characteristics will be lost. KPLS can be used to improve the efficiency of the eigentransformation model. Inspired by Refs.[3] and [4], we present a KPLS-eigentransformation model based algorithm for face hallucination in this paper.

The remainder of this paper is organized as follows. Section II presents the theory of the proposed flowchart. The KPLS-eigentransformation model is also described in detail. Section III presents the experimental results, and Section IV concludes the paper.

II. The Proposed Algorithm

In this paper, we propose a novel two-step face hallucination algorithm based on both global and local models. The flowchart of the proposed algorithm is shown in Fig.1. Two KPLS models need to be trained offline. Model 1 is used to generate a primitive HR image based on the global features. An input LR face image is represented by a set of well trained LR eigenfaces. Then, the combination coefficients $C = [c_1, c_2, c_3, \cdots, c_M]^T$ are fed into the KPLS regression predictor and the estimated HR coefficients $C' = [c'_1, c'_2, c'_3, \cdots, c'_M]^T$ can be derived. The primitive HR image is generated by the linear combination of $C'$ and the HR eigenfaces. Model 2 is used to infer HR residual errors based on the local patches. Finally, the residual error values are added to the primitive HR image to reconstruct the HR face image.

Fig. 1. The flowchart of the proposed approach

1. Analysis on eigentransformation model

The key idea of the eigentransformation SR algorithm proposed by Wang[3] is to fit an input LR face image to a linear combination of the LR eigenfaces by PCA, and to replace the LR eigenfaces with the corresponding HR ones while keeping the combination coefficients unchanged to reconstruct the estimated HR face. We represent a set of LR face images and the corresponding HR face images by the matrices $I_L = [I_{L1}, I_{L2}, \cdots, I_{LM}]$ and $I_H = [I_{H1}, I_{H2}, \cdots, I_{HM}]$, where $I_{Li}$ and $I_{Hi}$ are the image vectors, and $M$ is the number of training samples. Denote by $m_l$ and $m_h$ the mean faces of the LR and HR training face images respectively. PCA is applied to $I_L$ and $I_H$ respectively to represent the face images using a weighted combination of the eigenfaces. Let $E_L = [l_1, l_2, \cdots, l_M]$ and $E_H = [h_1, h_2, \cdots, h_M]$ denote the eigenfaces of the LR and HR faces.

versions with zooming factor 2, and 8 respectively. We can see that they are quite different. The amplitude of the waveform decreases from high to low resolution. This fact indicates the gap between the LR and HR faces and shows the inefficiency of the eigentransformation model. That is why the reconstructed HR image becomes more and more similar to the mean face as the zooming factor increases. Although we apply some constraints on the principal components, the lost details cannot be synthesized any more.

Fig. 2. PCA coefficients of the same face image in different
The mathematical description of the eigentransfor- resolutions mation algorithm can be expressed simply as follows: To improve the efficiency of the eigentransformation T model, a predictor is introduced to predict the HR PCA co- C = EL(xl − ml)(1) efficients from the corresponding LR PCA coefficients in this xh = EH C + mh (2) paper. Then, a KPLS-eigentransformation model is proposed. where xl is an input LR face image, xh is the reconstructed 2. KPLS-eigentransformation model HR face, and C is the combination coefficient vector of the xl To deal with nonlinear applications, KPLS was proposed when represented by LR eigenfaces. in Ref.[6] as an extension of PLS. As the key idea of the other However, is it reasonable the C, the coefficient vector of kernel-based methods, both of the input and output train- LR face, is directly employed in Eq.(2)? To further investi- ing data are mapped into a high-dimension feature space us- gate this problem, The PCA coefficients of the face images ing a kernel function, such as polynomial function, sigmoid in different resolutions are analyzed. The training images are function or radial function. In this paper, radial basis kernel 2 down-sampled with factor 2, 4, and 8 to form four sub-training k(xi,xj )=exp(−xi − xj /z) is selected, where z is a pa- 2 sets. Then, PCA is applied to each sub-training set to get rameter to control the radial variance. We set z to 35 ,which eigenfaces and the weighted coefficients for each face sample. ensures convergence. Fig.2 shows PCA coefficient vectors of a random selected face Nonetheless, it is difficult to build a reasonable regression in different resolutions, where WHR, WLR2, and WLR8 are model with a single KPLS regression, because we cannot define the PCA coefficient vectors of HR face and its down-sampled a fixed error threshold for different ci in a regression procedure. A KPLS-Eigentransformation Model Based Face Hallucination Algorithm 685
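Equations (1) and (2) and the radial basis kernel can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names, the use of an SVD to obtain the eigenfaces, and the column-vector layout of the images are assumptions made here. Note that it transcribes Eqs.(1) and (2) literally, with the LR and HR bases computed by separate PCAs as described in the text, so nothing forces the LR coefficients to be meaningful in the HR basis.

```python
import numpy as np

def pca_basis(samples, n_components):
    """Return (mean face, eigenfaces) for image vectors stored as columns."""
    mean = samples.mean(axis=1, keepdims=True)
    # The left singular vectors of the centred data serve as eigenfaces.
    u, _, _ = np.linalg.svd(samples - mean, full_matrices=False)
    return mean, u[:, :n_components]

def eigentransform(x_l, m_l, e_l, m_h, e_h):
    """Eq. (1): C = E_L^T (x_l - m_l); Eq. (2): x_h = E_H C + m_h."""
    c = e_l.T @ (x_l - m_l)
    return e_h @ c + m_h, c

def rbf_kernel(x_i, x_j, z=35.0 ** 2):
    """Radial basis kernel k(x_i, x_j) = exp(-||x_i - x_j||^2 / z)."""
    diff = np.asarray(x_i, dtype=float) - np.asarray(x_j, dtype=float)
    return float(np.exp(-np.dot(diff.ravel(), diff.ravel()) / z))

# Hypothetical toy data: 10 training faces, 16-pixel LR and 64-pixel HR vectors.
rng = np.random.default_rng(0)
lr_train = rng.random((16, 10))
hr_train = rng.random((64, 10))
m_l, e_l = pca_basis(lr_train, 5)
m_h, e_h = pca_basis(hr_train, 5)
x_h, c = eigentransform(lr_train[:, :1], m_l, e_l, m_h, e_h)
```

Feeding the LR mean face into `eigentransform` gives a zero coefficient vector and returns the HR mean face, which mirrors the observation above that small LR coefficients pull the result towards the mean face.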

In our paper, the PCA coefficient vector $C$ is divided into several sub-vectors, each consisting of $k$ adjacent components. For each sub-vector, a sub-KPLS predictor is trained. The KPLS model is composed of all the sub-KPLS predictors. The value of $k$ is determined experimentally. We select $k = 6$ in our implementation.

For example, consider a training set including 300 HR frontal facial images and their corresponding LR versions. To train the proposed KPLS-eigentransformation model, PCA is applied to both the HR and LR samples to obtain the eigenfaces $E_H$ and $E_L$ and the combination coefficients $C_H$ and $C_L$ for each sample. Then, $C_H$ and $C_L$ are used to train the KPLS model. There are 50 sub-KPLS predictors to be trained. For each zooming factor (viz. 2, 4 and 8), a set of sub-KPLS predictors has been trained in our experiments.

Therefore, the proposed KPLS-eigentransformation model can be represented using the following equations:

$$C = E_L^{T}(x_l - m_l) \tag{3}$$

$$\hat{C} = \mathrm{KPLS}(C) \tag{4}$$

$$x_h = E_H \hat{C} + m_h \tag{5}$$

where $\mathrm{KPLS}(\cdot)$ denotes the KPLS model.

Fig. 3. Predicted HR PCA coefficients using LR coefficients with KPLS model

Fig.3 shows the HR PCA coefficients predicted with the proposed KPLS models. The considered zooming factors are 2 and 8 respectively. Compared with Fig.2, we can see that the predicted PCA coefficients are more similar to the target HR PCA coefficients.

3. Compensation for the primitive HR image

A primitive HR face image is generated using the proposed KPLS-eigentransformation model. However, some detailed information may be lost due to the limitation of the global predictive model. In order to compensate for the detailed information, we apply the residual image compensation algorithm used as step 2 in Ref.[4].

III. Experiment and Discussion

The CAS-PEAL face database[7] and the FERET face database[8] are employed to evaluate the performance of the proposed algorithm in this paper. We randomly select 350 frontal faces for the experiments, of which 300 faces are used as the training set and the other 50 faces as the test set. All the face images are normalized based on the eye coordinates and cropped to 141 × 161 pixels to form the HR face set.

To further measure the performance of the proposed face hallucination algorithm, several image magnification and SR methods have been simulated and implemented. The subjective performances of the different SR algorithms are shown in Fig.4, which presents the HR face images (141 × 161) and the results of cubic interpolation, KPLS-SR[4] and our proposed algorithm respectively. The results of cubic interpolation are obviously blurred and can hardly describe the personal details in the image because the zooming factor is large. Compared with our proposed algorithm, the results of KPLS-SR[4] present more noise and artifacts, which may be caused by the local KPLS model being employed to infer the primitive HR image. Note that the performance of Ref.[4] is impressive when the zooming factor is not greater than 4. However, obvious blocky artifacts appear as the zooming factor increases. To illustrate that our algorithm is applicable to other kinds of faces, some non-Chinese faces[8] are selected for testing. In Fig.5, the three faces in the lower row are zoomed by a factor of 8 from LR inputs. The results indicate that our proposed algorithm is robust.

Fig. 4. Subjective performances of the different SR algorithms (zooming factor: 8). (a) Original HR faces; (b) Cubic interpolation; (c) The KPLS-SR[4]; (d) The proposed algorithm

Fig. 5. Subjective performances of the proposed algorithm in FERET Database[8] (zooming factor: 8). (a) Original HR faces; (b) The proposed algorithm
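The PSNR figures used for the objective comparison in this section follow the usual definition $10\log_{10}(\mathrm{peak}^2/\mathrm{MSE})$. A minimal sketch is given below; it assumes 8-bit grey-level images with a peak value of 255, which the paper does not state, and the function name is illustrative only.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((ref - est) ** 2)
    if mse == 0:
        # Identical images: the ratio is unbounded.
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))
```

In an evaluation like the one reported here, this value would be averaged over all super-resolved test images for each algorithm.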

To measure the objective performance of the different algorithms, we have also evaluated their results in terms of PSNR, the structural similarity measurement (MSSIM)[9] and the run time. The average PSNR and MSSIM of all the super-resolved test images for the different algorithms are listed in Table 1. The proposed algorithm achieves the highest PSNR and MSSIM, and its run time lies between those of cubic interpolation and KPLS-SR. All the experiments are conducted on a P4 3.0 GHz PC with 4G RAM.

Table 1. The objective performances of different methods (zooming factor: 8)

Algorithms             PSNR (dB)   MSSIM   Run time (s)
Cubic interpolation    21.112      0.679   0.157
KPLS SR[4]             21.842      0.797   0.824
Our algorithm          23.475      0.851   0.223

IV. Conclusion

In this paper, we present a novel KPLS-eigentransformation based face hallucination algorithm. Based on the analysis of the drawbacks of the eigentransformation model, KPLS has been introduced to improve the efficiency of the eigentransformation model, which is used to estimate the relationship between the LR and HR facial images. The proposed KPLS-eigentransformation model can bridge the gap between the LR and HR facial images efficiently.

References

[1] S.Y. Wang, Z. Li, X.G. Li, “A panchromatic image-based spectral imagery super resolution algorithm”, Chinese Journal of Electronics, Vol.20, No.4, pp.617–620, 2011.
[2] S. Baker, T. Kanade, “Limits on super-resolution and how to break them”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.2, No.2, pp.1167–1183, 2002.
[3] X. Wang, X. Tang, “Hallucinating face by eigentransformation”, IEEE Transactions on Systems, Man and Cybernetics, Part C: Applications and Reviews, Vol.35, No.3, pp.425–434, 2005.
[4] W. Wu, Z. Liu, X. He, “Learning-based super resolution using kernel partial least squares”, Proceedings of Image Vision Compute, Vol.29, No.6, pp.394–406, 2011.
[5] R. Roman, K. Nicole, “Overview and recent advances in partial least squares”, Lecture Notes in Computer Science, Vol.3940, pp.34–51, 2006.
[6] R. Rosipal, “Kernel partial least squares for nonlinear regression and discrimination”, Neural Network World, Vol.13, No.3, pp.291–300, 2003.
[7] G. Wen, B. Cao, S. Shan, X. Chen, D. Zhou, X. Zhang, D. Zhao, “The CAS-PEAL large-scale Chinese face database and baseline evaluations”, IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, Vol.38, No.1, pp.149–161, 2008.
[8] P.J. Phillips, H. Moon, P.J. Rauss and S. Rizvi, “The FERET evaluation methodology for face recognition algorithms”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.22, No.10, pp.1090–1103, 2000.
[9] Z. Wang, A.C. Bovik and L.G. Lu, “Why is image quality assessment so difficult?”, Speech and Signal Processing, USA, Vol.8, pp.3313–3316, 2002.

LI Xiaoguang was born in Beijing, China. He received the B.E. and Ph.D. degrees in electronic engineering from the Beijing University of Technology, in 2003 and 2008, respectively. He is currently a lecturer and master student supervisor at the Beijing University of Technology. His research interests include image super resolution and high dynamic range image processing.

XIA Qing was born in 1987. He is currently a master student in information and communication engineering at Beijing University of Technology. His research interests include image super resolution and image compression.

ZHUO Li received the B.E. degree in radio technology from the University of Electronic Science and Technology, Chengdu, China, in 1992, the M.E. degree in signal and information processing from the Southeast University, Nanjing, in 1998, and the Ph.D. degree in pattern recognition and intellectual system from Beijing University of Technology, in 2004. She is currently a professor, with Ph.D. supervision in circuits and systems, in the College of Electronic Information and Control Engineering, Beijing University of Technology. During 2006–2007, she spent one year on research overseas, sponsored by the prestigious CSC Scholarship, as a postdoctoral research fellow at the University of Sydney, Australia. Prior to joining the university, she worked for three years in the communication industry. Her current research interests include wireless multimedia sensor networks, mobile intelligence, multimedia content analysis and communication.

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

A Two-Party Combined Cryptographic Scheme and Its Application∗

WANG Shengbao1,2,3, XIE Qi1, TANG Qiang4, ZENG Peng5 and CHEN Wei3

(1.School of Information Science and Engineering, Hangzhou Normal University, Hangzhou 310012, China) (2.State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China) (3.New Star Institute of Applied Technology, Hefei 230031, China) (4.Faculty of EWI, University of Twente, the Netherlands) (5.Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai 200062, China)

Abstract — In this paper, we extend Haber and Pinkas’ notion of combined (cryptographic) scheme to the two-party setting, which is shown to be a useful tool in a real-world application that we name the “2-boss problem”. In a two-party combined scheme, a single public key is associated with two independent private keys and one escrow decryption key. Any ciphertext encrypted under the public key can be decrypted by each of the three keys. Meanwhile, the two private keys can also be used as signing keys to achieve non-repudiation service. We provide formal security definitions for two-party combined schemes, and present a simple and efficient scheme. Our construction is derived from bilinear pairings, and its security is based on the Bilinear Diffie-Hellman (BDH) assumption.

Key words — Combined scheme, Public key encryption, Digital signature, Bilinear pairings, Provable security.

∗Manuscript Received Oct. 2010; Accepted Mar. 2012. This work is supported by the National Natural Science Foundation of China (No.61103209, No.11061130539, No.61070153), the Open Foundation of State Key Laboratory of Networking and Switching Technology of China (No.SKLNST-2009-1-13) and the Anhui Provincial Natural Science Foundation (No.10040606Q63).

I. Introduction

1. Motivation

Since a lot of communication takes place between an individual and an organization (e.g., a company), many special requirements arise for the information sender/receiver depending on the nature of the information[1]. Let us consider the following scenario. Suppose two organizations are engaged in a cooperation, e.g., two companies are setting up a joint laboratory for some new project. The two bosses of the companies want to set up the following security policies for the (joint) laboratory.

(1) The two bosses share exactly the same decryption power on the incoming ciphertext sent to the laboratory.

(2) They need to have their own independent signing keys that can provide non-repudiation service.

(3) They require that the encrypted information must be recoverable by a third party (called escrow agency), for purposes such as recovering encrypted data from disasters, audit trail or legal interception.

We define this to be a 2-boss problem. Using traditional public key schemes, this problem cannot be solved easily: the key management overhead of a boss increases considerably if (s)he sets up multiple such cooperative organizations with different companies. To see this, consider the following naive solution. To satisfy requirement (1), one public/private key pair is generated and the private key (acting as the common decryption key) is distributed to the bosses. For requirement (2), the two bosses use their own private keys as signing keys, and the corresponding public keys as verification keys. For requirement (3), the two bosses just escrow their common decryption key to their escrow agency. As a result, each boss has to keep one more secret key for every such cooperation.

To solve the problem, the ideal situation would be that no matter how many such cooperating organizations a boss is involved in, (s)he only needs to keep one private key, i.e., (s)he uses this single private key to fulfill all three requirements mentioned above. Our primitive, the Two-Party Combined (2P-Comb) Scheme, is proposed to achieve this goal.

2. Related work

Combined scheme. Encryption schemes and signature schemes are commonly used in combination, most often with each of them using independent keys. Most proofs of security of each scheme assume that it uses random keys that are not used by any other scheme. It is often conjectured that the combined use of two schemes which are secure when used in isolation will degrade their security, since an adversary of the encryption scheme may make use of some information obtained from the signature scheme, and vice versa. In Ref.[2], however, Haber and Pinkas proved that in many cases, the simultaneous use of related keys (or even the same keys) for two schemes, in particular for a Public-key encryption (PKE) scheme and a signature scheme, does not compromise their security. The combination of an encryption scheme and a signature scheme is called a combined scheme (Comb scheme). The first practical combined scheme which supports escrow decryption was proposed by Diament et al.[3] in 2004. However, we note that their scheme is only a one-party combined scheme, i.e., besides the escrow agency there is only one party that can decrypt the ciphertext. From this point of view, this scheme cannot be used [...]

Typically, the map $e$ will be derived from either the Weil or Tate pairing on an elliptic curve over a finite field. Pairings are very useful tools for building novel cryptographic schemes, such as the encryption schemes from Ref.[8] and the signature schemes from Refs.[7, 9, 10]. We refer to Refs.[11–13] for a more comprehensive description of how these groups, pairings and other parameters should be selected in practice for efficiency and security.

A real-valued function $f(l)$ is negligible if for any integer $n > 0$, $|f(l)|$ [...]

Let $k$ be a security parameter. The algorithms of the basic Chosen-plaintext attack (CPA) secure version of the scheme are defined as follows.

• Key generation $K$: This algorithm runs as follows:
  - Run $\mathcal{IG}$ on the input $k$ to generate and output $\langle G_1, G_2, e \rangle$, where $G_1$ and $G_2$ are groups of some prime order $q$ and $e: G_1 \times G_1 \to G_2$ is an admissible pairing.
  - Choose an arbitrary generator $P \in G_1$.
  - Choose a cryptographic hash function $H: G_2 \to \{0,1\}^w$. Here $w$ will be the bit-length of plaintexts.
  - Set the two seed private keys to be two independent random integers $x_1, x_2 \in Z_q^*$.
  - Return a public key $pk = \langle P_1, P_2 \rangle$, where $P_1 = x_1 P$ and $P_2 = x_2 P$ are two partial public keys.

• Escrow decryption key generation $K_E$: To generate the escrow decryption key, use one of the two seed private keys $x_1$ and $x_2$ and compute $d_E = x_1 P_2 = x_1 x_2 P$, or $d_E = x_2 P_1 = x_1 x_2 P$.

• Encryption $\mathcal{E}$: To encrypt $M \in \Omega$ for an entity with public key $pk = \langle P_1, P_2 \rangle$, perform the following steps:
  - Choose a random value $r \in Z_q^*$.
  - Set the ciphertext to be $C = \langle rP, M \oplus H(\eta^r) \rangle$, where $\eta = e(P_1, P_2) \in G_2^*$.

• Decryption $D$: Suppose $C = \langle U, V \rangle \in \Phi$. To decrypt the ciphertext $C$ using one of the two private keys $x_i \in Z_q^*$ (where $i = 1$ or 2), compute $V \oplus H(e(U, P_j)^{x_i}) = M$, where $i, j = 1$ or 2, and $j \neq i$.

• Escrow decryption $D_E$: Suppose $C = \langle U, V \rangle \in \Phi$. To decrypt the ciphertext $C$ using the escrow key $d_E$, compute $V \oplus H(e(U, d_E)) = M$.

Note that the value of $\eta$ in algorithm $\mathcal{E}$ is independent of the message to be encrypted. Hence there is no need to re-compute $\eta$ on subsequent encryptions to the same public key $pk$. The consistency of the scheme can be easily verified because we have:

$$e(U, P_2)^{x_1} = e(U, P_1)^{x_2} = e(U, d_E)$$

Theorem 1 (IND-CPA security) Let $H$ be a random oracle from $G_2$ to $\{0,1\}^w$. Let $A$ be an IND-CPA adversary that has advantage $\varepsilon$ against the triple-key encryption scheme. Suppose $A$ makes a total of $q_H > 0$ queries to $H$. Then there is an algorithm $B$ that solves the BDH problem for $\mathcal{IG}$ with advantage $\varepsilon_1$ and running time $t_1$, where $\varepsilon_1 \geq 2\varepsilon/q_H$ and $t_1 \leq O(\mathrm{time}(A))$.

Our proof of this theorem is inspired by that of Lemma 4.3 in Ref.[8]. Let $A$ be an IND-CPA adversary against the triple-key encryption scheme who makes at most $q_H$ queries to the random oracle $H$ and who has advantage $\varepsilon$. We can construct an algorithm $B$ which interacts with $A$ to solve the BDH problem. Since $B$ simulates the environment of $A$ perfectly, it follows that $B$ outputs the correct answer to the BDH problem instance with probability at least $2\varepsilon/q_H$ as required.

Using the well-known Fujisaki-Okamoto (FO) transformation[17], we can directly obtain an IND-CCA secure triple-key scheme which serves as the component encryption scheme $\Sigma_{Enc}$ in the following combined scheme.

2. Proposed two-party combined scheme

Here we propose a concrete 2P-Comb scheme, using the IND-CCA secure triple-key scheme as the component encryption scheme $\Sigma_{Enc}$ and the BLS signature[7] as the component signature scheme $\Sigma_{Sig}$. Note that, as has been proven in Ref.[2], most of the existing discrete logarithm based signature schemes can be used to construct a secure combined scheme. In this paper, we use the BLS signature as an example. For completeness, we reprint the two key generation algorithms $K$ and $K_E$ of the above triple-key encryption scheme. The algorithms for our combined scheme are as follows.

• Key generation $K$: This algorithm runs as follows:
  - Choose an arbitrary generator $P \in G_1$.
  - Choose four cryptographic hash functions $H: G_2 \to \{0,1\}^w$, $H_1: \{0,1\}^w \times \{0,1\}^w \to Z_q^*$, $H_2: \{0,1\}^w \to \{0,1\}^w$ and $H_3: \{0,1\}^* \to G_1$. Here $w$ will be the bit-length of plaintexts.
  - Set the two seed private keys to be two independent random integers $x_1, x_2 \in Z_q^*$.
  - Return a public key $pk = \langle P_1, P_2 \rangle$, where $P_1 = x_1 P$ and $P_2 = x_2 P$, respectively.

• Escrow decryption key generation $K_E$: To generate the escrow decryption key, use one of the private keys $x_1$ and $x_2$ and compute $d_E = x_1 P_2 = x_1 x_2 P$, or $d_E = x_2 P_1 = x_1 x_2 P$.

• Encryption $\mathcal{E}$: To encrypt $M \in \Omega$ for an entity with public key $pk = \langle P_1, P_2 \rangle$, perform the following steps:
  - Choose a random value $\theta \in \{0,1\}^w$.
  - Set $r = H_1(\theta, M)$.
  - Set the ciphertext to be $C = \langle rP, \theta \oplus H(\eta^r), M \oplus H_2(\theta) \rangle$ with $\eta = e(P_1, P_2) \in G_2^*$.

• Decryption $D$: Suppose $C = \langle U, V, W \rangle \in \Phi$. To decrypt the ciphertext $C$ using one of the two seed private keys $x_i \in Z_q^*$ (where $i = 1$ or 2) do:
  - Compute $V \oplus H(e(U, P_j)^{x_i}) = \theta$, where $i, j = 1$ or 2 and $j \neq i$.
  - Compute $W \oplus H_2(\theta) = M$.
  - Set $r = H_1(\theta, M)$. Test that $U = rP$. If not, reject the ciphertext.
  - Output $M$ as the decryption of $C$.

• Signature generation $S$: To sign a message $m \in \{0,1\}^*$, compute the signature as $\sigma = x_i H_3(m)$ ($i = 1$ or 2).

• Verification $V$: To verify a message/signature pair $(m, \sigma)$, check whether $e(H_3(m), P_i) = e(\sigma, P)$ holds ($i = 1$ or 2).

Remark 1 When the above combined scheme is used to solve the 2-boss problem, each boss generates one of the seed private keys $x_1$ and $x_2$ on his or her own, and the corresponding public keys $P_1$ and $P_2$ are then combined to form the public key $pk = \langle P_1, P_2 \rangle$ of the joint laboratory. Therefore, each seed private key is known only to its owner and can thus be used to provide non-repudiation service.

Remark 2 With the escrow decryption key $x_1 x_2 P$, the escrow agency is not able to compute either of the private keys $x_1$ or $x_2$, due to the hardness of the Discrete logarithm (DL) problem in $G_1$.

Remark 3 The escrow agency can verify the correctness of the received escrow decryption key $x_1 x_2 P$, since we have $e(P_1, P_2) = e(P, x_1 x_2 P)$.

3. Security proof

We have the following two lemmas stating the security of each component scheme $\Sigma_{Enc}$ and $\Sigma_{Sig}$ in the above 2P-Comb scheme.

Lemma 1 ($\Sigma_{Enc}$ is secure in 2P-Comb) Let $H$, $H_1$, $H_2$ be random oracles. If $A$ is an adversary that has non-negligible advantage $\varepsilon$ against $\Sigma_{Enc}$ in the combined scheme, then there is an algorithm $A'$ that has non-negligible advantage $\varepsilon'$ against $\Sigma_{Enc}$ alone.

This lemma shows that if an adversary $A$ chooses an IND-CCA challenge in the attacking game defined in Section III, its probability of success is no greater than in the IND-CCA game for $\Sigma_{Enc}$. The proof of this lemma is similar to that of Lemma 4.2 in Ref.[3], and is therefore omitted here.

Lemma 2 ($\Sigma_{Sig}$ is secure in 2P-Comb) Let $H_3$ be a random oracle. If $A$ is an adversary that has non-negligible advantage $\varepsilon$ against $\Sigma_{Sig}$ (i.e. the BLS signature scheme) in the combined scheme, then there is an algorithm $A'$ that has non-negligible advantage $\varepsilon'$ against $\Sigma_{Sig}$ alone.

Lemma 2 is proved in the full paper. This lemma shows that if an adversary $A$ chooses an EUF-CMA challenge in the attacking game, its probability of success is no greater than in the EUF-CMA game (cf. Ref.[7]) for the BLS signature scheme.

Theorem 2 (IND-EUF security of 2P-Comb) Let $H$, $H_1$, $H_2$, and $H_3$ be random oracles. Then our two-party combined scheme is IND-EUF secure, assuming that the BDH problem is hard in groups generated by $\mathcal{IG}$.

Proof In Ref.[7], Boneh et al. reduced the security of the BLS short signature to the computational Diffie-Hellman (CDH) problem for $G_1$. In Theorem 1, we proved that the basic triple-key encryption scheme is chosen-plaintext secure (IND-CPA secure) assuming the BDH problem is hard. With the FO transformation[17], it follows that the component encryption scheme $\Sigma_{Enc}$ is chosen-ciphertext secure assuming the BDH problem is hard. Combining these results with Lemma 1 and Lemma 2, if there exists an adversary $A$ that has advantage $\varepsilon$ against our combined scheme, then there is an adversary $B$ that can solve one of the above two hard problems with non-negligible advantage. (Note that the BDH problem for $(G_1, G_2, e)$ is no harder than the CDH problem for $G_1$.)

V. Conclusion

We introduced and formally defined the notion of two-party combined schemes, with a concrete construction, which provides a neat solution for some real-world security applications. Our proposed scheme was proven secure in the random oracle model, assuming the hardness of the Bilinear Diffie-Hellman (BDH) problem.

References

[1] Y. Desmedt, “Society and group oriented cryptography: A new concept”, Proc. of CRYPTO 1987, Springer-Verlag, pp.120–127, 1987.
[2] S. Haber and B. Pinkas, “Securely combining public-key cryptosystems”, Proc. of ACM-CCS 2001, ACM Press, pp.215–224, 2001.
[3] T. Diament, H.K. Lee, A.D. Keromytis and M. Yung, “The dual receiver cryptosystem and its applications”, Proc. of ACM-CCS 2004, ACM Press, pp.330–343, 2004.
[4] E. Verheul, “Evidence that XTR is more secure than supersingular elliptic curve cryptosystem”, Proc. of EUROCRYPT 2001, LNCS Vol.2045, Springer-Verlag, pp.195–210, 2001.
[5] T. ElGamal, “A public key cryptosystem and a signature scheme based on discrete logarithms”, IEEE Transactions on Information Theory, Vol.IT-31, No.4, pp.469–472, 1985.
[6] E. Verheul, “Evidence that XTR is more secure than supersingular elliptic curve cryptosystems”, J. Cryptology, Vol.17, No.4, pp.277–296, 2004.
[7] D. Boneh, B. Lynn and H. Shacham, “Short signatures from the Weil pairing”, ASIACRYPT 2001, LNCS Vol.2248, Springer-Verlag, pp.514–532, 2001.
[8] D. Boneh, M. Franklin, “Identity-based encryption from the Weil pairing”, SIAM J. Computing, Vol.32, No.3, pp.586–615, 2003.
[9] Y. Wen, J. Ma, H. Huang, “An aggregate signature scheme with specified verifier”, Chinese Journal of Electronics, Vol.20, No.2, pp.333–336, 2011.
[10] Z. Wang, Y. Dai, D. Ye, “Universally composable identity-based signature”, Acta Electronica Sinica, Vol.39, No.7, pp.1613–1617, 2011. (in Chinese)
[11] A. Joux, “A one round protocol for tripartite Diffie-Hellman”, Proc. of ANTS’00, LNCS Vol.1838, Springer-Verlag, pp.385–394, 2000.
[12] P.S.L.M. Barreto, H.Y. Kim, B. Lynn and M. Scott, “Efficient algorithms for pairing-based cryptosystems”, Proc. CRYPTO 2002, LNCS Vol.2442, Springer-Verlag, pp.354–368, 2002.
[13] H. Chen, C. Ma, “Fast Tate pairing algorithm using double-base chains”, Acta Electronica Sinica, Vol.39, No.2, pp.408–413, 2011. (in Chinese)
[14] R. Rivest, A. Shamir and L. Adleman, “A method for obtaining digital signatures and public key cryptosystems”, Communications of the ACM, Vol.21, No.2, pp.120–126, 1978.
[15] M. Bellare, A. Desai, D. Pointcheval and P. Rogaway, “Relations among notions of security for public-key encryption schemes”, Proc. of CRYPTO 1998, LNCS Vol.1462, Springer-Verlag, pp.26–45, 1998.
[16] D. Pointcheval and J. Stern, “Security proofs for signature schemes”, Proc. of EUROCRYPT 1996, LNCS Vol.1070, Springer-Verlag, pp.387–398, 1996.
[17] E. Fujisaki and T. Okamoto, “Secure integration of asymmetric and symmetric encryption schemes”, Proc. of CRYPTO 1999, LNCS Vol.1666, Springer-Verlag, pp.537–554, 1999.

WANG Shengbao received the Ph.D. degree from Shanghai Jiaotong University. He is now a lecturer in the School of Information Science and Engineering, Hangzhou Normal University. His research interests focus on cryptography and network information security. (Email: sheng-[email protected])

XIE Qi received the Ph.D. degree from Zhejiang University. He is now a professor at Hangzhou Normal University. His research interests include network security, cryptography and authentication technology. (Email: [email protected])
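The consistency relation of the basic triple-key encryption scheme, $e(U, P_2)^{x_1} = e(U, P_1)^{x_2} = e(U, d_E)$, can be checked numerically with a toy model. The sketch below is an illustration only and offers no security: instead of a real pairing it represents each element $aP$ of $G_1$ by the scalar $a$ modulo $q$ and defines $e(aP, bP) = g^{ab} \bmod p$, which is bilinear but exposes all discrete logarithms. The parameters (`Q`, `P_MOD`, `G`, `W`), the SHA-256 based hash and all function names are assumptions made for this example.

```python
import hashlib
import secrets

Q = 1019        # toy prime group order q
P_MOD = 2039    # safe prime p = 2q + 1
G = 4           # generator of the order-q subgroup of Z_p^*
W = 16          # plaintext length in bytes

def pair(a, b):
    """Toy bilinear map e(aP, bP) = g^(ab); NOT a secure pairing."""
    return pow(G, (a * b) % Q, P_MOD)

def H(elem):
    """Hash a G_2 element to W bytes."""
    return hashlib.sha256(str(elem).encode()).digest()[:W]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def keygen():
    """Return the seed private keys x1, x2 and pk = (P1, P2), with P = 1."""
    x1 = secrets.randbelow(Q - 1) + 1
    x2 = secrets.randbelow(Q - 1) + 1
    return x1, x2, (x1, x2)

def escrow_key(x_i, p_j):
    """d_E = x_i * P_j = x1 * x2 * P."""
    return (x_i * p_j) % Q

def encrypt(pk, m):
    """C = (rP, M xor H(eta^r)) with eta = e(P1, P2)."""
    r = secrets.randbelow(Q - 1) + 1
    eta = pair(pk[0], pk[1])
    return r, xor(m, H(pow(eta, r, P_MOD)))

def decrypt(x_i, p_j, c):
    """M = V xor H(e(U, P_j)^{x_i}) with j != i."""
    u, v = c
    return xor(v, H(pow(pair(u, p_j), x_i, P_MOD)))

def escrow_decrypt(d_e, c):
    """M = V xor H(e(U, d_E))."""
    u, v = c
    return xor(v, H(pair(u, d_e)))
```

With a 16-byte message, decrypting one ciphertext with `x1` (against `P2`), with `x2` (against `P1`) and with the escrow key recovers the same plaintext in this model, which is the three-way decryption property the scheme is built on.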

Discriminative Decision Function Based Scoring Method Used in Speaker Verification∗

LIANG Chunyan, ZHANG Xiang and YAN Yonghong (The Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China)

Abstract — The log likelihood ratio decision function, derived from classical hypothesis testing theory, is widely used in Gaussian mixture model based speaker recognition systems. This paper introduces a discriminative decision function based scoring method for speaker recognition with the state-of-the-art Joint factor analysis (JFA) system. In the scoring module of the JFA system, an approximate form of the decision function is proposed. Based on this approximation, we present a discriminative decision function that re-estimates the contribution of each speech sound unit to the decision function, to further improve the performance of speaker verification. The discriminative decision function exploits the individual Gaussian components for better classification. The experiments are carried out on the core conditions of the National institute of standards and technology (NIST) 2010 speaker recognition evaluation data. The experimental results show that the proposed scoring method outperforms the conventional frame-by-frame strategy on the whole.

Key words — Speaker verification, Joint factor analysis (JFA), Discriminative decision function.

∗Manuscript Received June 2011; Accepted Apr. 2012. This work is supported by the National Natural Science Foundation of China (No.10925419, No.90920302, No.61072124, No.11074275, No.11161140319) and the Strategic Priority Research Program of the Chinese Academy of Sciences (No.XDA06030100).

I. Introduction

The task of speaker verification is to determine whether a given segment of speech is spoken by the hypothesized speaker[1,2]. The task can be treated as a hypothesis-testing problem. Given a trial including both the test utterance and the target speaker, a decision of "True" or "False" is made by comparing the log likelihood score of the trial with a threshold. Gaussian mixture models (GMMs) have long been the dominant approach in speaker verification[1]. In this approach, GMMs are applied to model the data distribution and the Log likelihood ratio (LLR) derived from hypothesis testing theory is used as the decision function.

In recent years, Joint factor analysis (JFA)[3,4] has become the state-of-the-art technique in speaker verification. It was proposed to solve the problem of speaker and session variability in the GMM framework. Many sites used JFA in the latest NIST evaluations, and there are many ways to carry out the scoring step[5−7]. The frame-by-frame scoring method is the most conventional one, where the whole feature file of each utterance is processed based on a full GMM log-likelihood evaluation. It treats the GMM simply as a probability density function of the feature vectors from a target speaker. In this study, we propose a scoring method based on a discriminative decision function, which expands a single GMM into a set of individual Gaussian components. In the proposed method, we re-estimate the contribution of each speech sound unit to the decision function to further improve the performance of speaker verification.

The rest of this paper is organized as follows. We briefly introduce the theory of JFA in Section II. The traditional frame-by-frame scoring method is presented in Section III. We propose the discriminative decision function based scoring strategy in Section IV. Experimental results are shown in Section V. Finally, we give the conclusion in Section VI.

II. Joint Factor Analysis

JFA has attracted wide attention during the last few years and has become the state-of-the-art system in the field of speaker recognition. The JFA model is used to solve the problem of speaker and session variability in the GMM framework. In this model, the speaker and channel dependent mean supervector $M$ can be represented as a sum of two supervectors:

$$M = s + c \tag{1}$$

where $s$ is the speaker supervector and $c$ is the channel supervector, both of which are normally distributed. They can be respectively represented by

$$s = m + Vy + Dz \tag{2}$$

$$c = Ux \tag{3}$$

where $m$ is the speaker-independent mean supervector, that is, the mean supervector of the Universal background model (UBM), $V$ is the speaker loading matrix with high speaker variability (eigenvoices), $D$ is the diagonal loading matrix describing the remaining speaker variability not covered by $V$, and $U$ is the channel loading matrix with high intersession variability (eigenchannels). $y$, $z$ and $x$ are the speaker factor, diagonal factor and channel factor respectively, which are all assumed to be standard normally distributed random variables.

The underlying task in JFA is to train the hyperparameters $U$, $V$ and $D$ on a large training set. In the Bayesian framework, the posterior distribution of the factors (knowing their priors) can be computed using the enrollment data. The likelihood of the test utterance $\chi$ is then computed by integrating over the posterior distribution of $y$ and $z$, and the prior distribution of $x$[8].

III. Traditional Frame-by-Frame Scoring Method

The frame-by-frame scoring method is based on a full GMM log-likelihood evaluation[7]. The log-likelihood of test utterance $\chi$ and model $s$ is computed as an average frame log-likelihood:

$$\log P(\chi|s) = \frac{1}{T}\sum_{t=1}^{T}\log\sum_{c=1}^{C}\omega_c\,\mathcal{N}(o_t; s'_c, \Sigma_c) = \frac{1}{T}\sum_{t=1}^{T}\log p(o_t|s') \tag{4}$$

where $o_t$ is the feature vector at frame $t$, $T$ is the length (in frames) of test utterance $\chi$, $C$ is the number of Gaussians in the GMM and $s' = s + Ux$ is the supervector of the target model after channel adaptation, while $Ux$ is the channel supervector for the test utterance.

Similarly, when calculating the log-likelihood of utterance $\chi$ and the UBM, the mean supervector of the UBM is also compensated as $m' = m + Ux$. This is equivalent to moving the mean supervectors of both the target model and the UBM into the channel space where the test utterance lies, which can effectively reduce the acoustic mismatch between the training and test environments.

Thus, the average verification score is obtained by computing the log-likelihood ratio between the compensated target speaker model $s'$ and UBM $m'$ for the test utterance $\chi$,

$$\Lambda(\chi) = \frac{1}{T}\sum_{t=1}^{T}\left(\log p(o_t|s') - \log p(o_t|m')\right) \tag{5}$$

IV. Discriminative Decision Function Based Scoring Method

1. The approximation of decision function

Let $p(o_t)$ denote the total probability of both the speaker model $s'$ and UBM $m'$, given a feature frame $o_t$, that is

$$p(o_t) = p(o_t|s') + p(o_t|m') \tag{6}$$

Then Eq.(5) can be written as

$$\Lambda(\chi) = \frac{1}{T}\sum_{t=1}^{T}\left(\log\frac{p(o_t|s')}{p(o_t)} - \log\frac{p(o_t|m')}{p(o_t)}\right) = \frac{1}{T}\sum_{t=1}^{T}\left(\log\frac{p(o_t|s')}{p(o_t|s')+p(o_t|m')} - \log\frac{p(o_t|m')}{p(o_t|s')+p(o_t|m')}\right) \tag{7}$$

In a GMM $\lambda$, the probability $p(o_t|\lambda)$ for an observed feature frame $o_t$ is

$$p(o_t|\lambda) = \sum_{c=1}^{C}\omega_c\,p(o_t|\lambda_c) = \sum_{c=1}^{C} g(o_t|\lambda_c) \tag{8}$$

Two terms of the Taylor series, $\log(x) \approx x - 1$, are used to obtain the approximation of Eq.(7), and we discard the $-1$ since the change will not affect the classification accuracy.

$$\Lambda(\chi) \approx \frac{1}{T}\sum_{t=1}^{T}\left(\frac{p(o_t|s')}{p(o_t|s')+p(o_t|m')} - \frac{p(o_t|m')}{p(o_t|s')+p(o_t|m')}\right) = \frac{1}{T}\sum_{t=1}^{T}\sum_{c=1}^{C}\frac{g(o_t|s'_c) - g(o_t|m'_c)}{\sum_{j=1}^{C}\left(g(o_t|s'_j)+g(o_t|m'_j)\right)} = \sum_{c=1}^{C}\frac{1}{T}\sum_{t=1}^{T}\frac{g(o_t|s'_c) - g(o_t|m'_c)}{\sum_{j=1}^{C}\left(g(o_t|s'_j)+g(o_t|m'_j)\right)} \tag{9}$$

If we define

$$\Phi_c = \frac{1}{T}\sum_{t=1}^{T}\frac{g(o_t|s'_c) - g(o_t|m'_c)}{\sum_{j=1}^{C}\left(g(o_t|s'_j)+g(o_t|m'_j)\right)},\quad c = 1, 2, \cdots, C \tag{10}$$

as the difference of average occupation probability over the whole observation series for Gaussian component $c$ between the adapted speaker model and the UBM, Eq.(9) can be rewritten as an inner product

$$\Lambda(\chi) = w^t b(\eta) \tag{11}$$

where $w = [1, \cdots, 1]^t$ is a unit weight vector and $b(\eta) = [\Phi_1, \cdots, \Phi_C]^t$ denotes the difference vector of occupation probability for a trial $\eta$.

From Eq.(11), we can see that, given a trial $\eta$, the value of the decision function, and hence the decision of "True" or "False" for the trial, is completely determined by a weight vector $w$ and a difference vector $b(\eta)$. The average occupation probability $\Phi_c$ can be thought of as the occurrence frequency of Gaussian mixture component $c$ over the whole observation sequence. We call the difference vector $b(\eta)$ the trial's information vector, which maps the trial into a vector. The values in the weight vector $w$ can be viewed as the contributions to the decision function of the corresponding elements in the trial's information vector. Hence, we name $w$ the contribution factor, which can also be considered as a classifier between the true information vectors and the false ones. In Eq.(11), the values in $w$ are all the same, which indicates that the differences of average occupation probability for all the Gaussian components contribute equally.

In GMMs for speaker verification, the Gaussian components can be considered to model the underlying broad phonetic sounds that characterize a person's voice[1]. Hence, $\Phi_c$, $c = 1, \cdots, C$, can be thought of as the difference between the average occupation probability for the event that the feature vector of the test utterance is accounted for by each corresponding speech sound unit characterized by the target speaker model and that for the UBM. The contributions of the sound units to the decision function are determined by $w$. In practice, some of the sound units carry more discriminative information for different speakers and should be given heavier weights. In contrast, the sound units which are less discriminative should be weighted less. In the following, we show how to obtain a discriminative contribution factor $w$ to further improve the speaker verification performance.

2. MSE criterion

Suppose we have a training set consisting of $(N_+ + N_-)$ trials, in which true trials are denoted as $\{x_i\}$, $i = 1, \cdots, N_+$, and false trials as $\{y_j\}$, $j = 1, \cdots, N_-$. Each of the trials is mapped into a difference vector of occupation probability $b(x_i)$, $i = 1, \cdots, N_+$, and $b(y_j)$, $j = 1, \cdots, N_-$. Thus, the score of the decision function for a trial $x$ can be written as $score = w^t b(x)$. We can first obtain the discriminative contribution factor $w$ by minimizing the sum-of-squares error (MSE) criterion[9].

$$w^* = \arg\min_{w} E\{(w^t b(x) - y(x))^2\} \tag{12}$$

where $E$ denotes expectation and $y(x)$ is the ideal output for trial $x$. Let the ideal output be 1 for true trial vectors and 0 for false trial vectors, i.e. $y(true) = 1$ and $y(false) = 0$. The criterion above can then be approximated using the training set as

$$w^* = \arg\min_{w}\left(\sum_{i=1}^{N_+}|w^t b(x_i) - 1|^2 + \sum_{j=1}^{N_-}|w^t b(y_j)|^2\right) \tag{13}$$

We construct the matrices $M_+$ and $M_-$ respectively using all the information vectors of true and false trials as follows

$$M_+ = \begin{bmatrix} b(x_1)^t \\ b(x_2)^t \\ \vdots \\ b(x_{N_+})^t \end{bmatrix},\quad M_- = \begin{bmatrix} b(y_1)^t \\ b(y_2)^t \\ \vdots \\ b(y_{N_-})^t \end{bmatrix} \tag{14}$$

And we define

$$M = \begin{bmatrix} M_+ \\ M_- \end{bmatrix} \tag{15}$$

Then, the problem of Eq.(13) becomes

$$w^* = \arg\min_{w}\|Mw - o\|_2 \tag{16}$$

where $o$ is a vector consisting of $N_+$ ones followed by $N_-$ zeros (i.e., the ideal outputs for the training trials).

The problem of Eq.(16) can be solved using the method of normal equations

$$M^t M w = M^t o \tag{17}$$

And Eq.(17) can be rearranged as

$$(M^t M) w = M_+^t \mathbf{1} + M_-^t \mathbf{0} = M_+^t \mathbf{1} \tag{18}$$

where $\mathbf{1}$ is the vector of all ones and $\mathbf{0}$ is an all-zeros vector. If we define $R = M^t M$, $w$ can be obtained by

$$w = R^{-1} M_+^t \mathbf{1} \tag{19}$$

In the MSE criterion, the classifier focuses on all the training samples rather than on those which are easily misclassified, so the discriminability of $w$ trained by Eq.(19) is limited. Based on Eq.(19), we then use the Generalized linear discriminant sequence (GLDS) kernel based Support vector machine (SVM) to obtain the optimal $w$.

3. GLDS kernel method for the discriminative training of contribution factor w

Combining the solution of Eq.(19) with the scoring form of Eq.(11), we have

$$score = b^t w = b^t R^{-1} M_+^t \mathbf{1} \tag{20}$$

The above equation can be written as

$$score = b^t \bar{R}^{-1} \bar{b}_+ \tag{21}$$

where $\bar{b}_+ = (1/N_+) M_+^t \mathbf{1}$ and $\bar{R} = (1/N_+) R$.

We compare two trials $x$ and $y$ by first mapping them into trial information vectors $b_x$ and $b_y$ and then computing the GLDS kernel as[10]

$$K_{GLDS} = b_x^t \bar{R}^{-1} b_y \tag{22}$$

To reduce training time, we factor $\bar{R}^{-1} = U^t U$ using the Cholesky decomposition. Then

$$K_{GLDS} = (U b_x)^t (U b_y) \tag{23}$$

If we transform all the trial information vectors by $U b_x$, the kernel is a simple inner product. This dramatically reduces the time used in SVM training. Finally, the SVM training procedure finds the corresponding $\alpha_i$ for each support vector $b_i$ and a universal $d$. Thus the optimal contribution factor $w$ can be solved as follows

$$w^* = \sum_{i=1}^{l}\alpha_i y_i \bar{R}^{-1} b_i + \mathbf{d} \tag{24}$$

where $\mathbf{d} = [d\ 0\ \cdots\ 0]^t$.

Given a new trial $z$, we first convert it to the corresponding information vector $b_z$. Then the discriminative decision function based on the optimal contribution factor $w$ can be expressed as

$$score = \left(\sum_{i=1}^{l}\alpha_i y_i \bar{R}^{-1} b_i + \mathbf{d}\right)^t b_z \tag{25}$$

V. Experiments

1. Experiments setup

The experiments for the JFA systems based on the two kinds of scoring methods (the traditional frame-by-frame and the proposed discriminative decision function based scoring methods) are carried out on the NIST 2010 speaker recognition evaluation corpus. NIST SRE 2010 is similar to SRE 2008 but differs from prior evaluations in that the training and test conditions for the core test include not only conversational telephone speech recorded over ordinary telephone channels, but also such speech recorded over a room microphone channel, and conversational speech from an interview scenario recorded over a room microphone channel. We name these three conditions telephone, microphone and interview for short. In this study, we focus on three types of trials: telephone-telephone, interview-interview and interview-telephone. Equal error rate (EER) and the minimum Decision cost function (minDCF) are used as metrics for evaluation[11,12].

In our experiments, we use Mel-frequency cepstral coefficients (MFCCs) as the acoustic cepstral features. 18 cepstral coefficients are computed and first order derivatives over 5 frames are appended to each feature vector, which results in a dimensionality of 36. These feature vectors are modeled using GMMs and JFA is used to treat the problem of speaker and session variability.

The gender dependent UBMs with 1024 mixture components are trained using the NIST SRE 2004 1side training corpus. The Switchboard II and Switchboard Cellular corpora as well as the telephone data from the NIST SRE 2005 and 2006 corpora are used to train the speaker loading matrix with 300 speaker factors. The NIST SRE 2004 corpus is used to train the diagonal matrix. For the channel loading matrix, a telephone loading matrix with 100 channel factors is trained on the phone data from the NIST SRE 2004, 2005 and 2006 corpora for the telephone-telephone condition. A common channel loading matrix, also with 100 channel factors, for both the interview-interview and interview-telephone conditions is trained on the telephone and microphone data from the NIST SRE 2004, 2005 and 2006 corpora as well as the MIXER5 interview development corpus.

The true and false trials for the telephone-telephone, interview-interview and interview-telephone conditions provided in NIST SRE 2008 are used for training the contribution factor $w$ for the corresponding test conditions in NIST SRE 2010.

2. Experiments of Taylor series approximation

Since the discriminative decision function based scoring method is derived from an approximate decision function, the effect of using the Taylor series should be examined.

Fig.1 shows the relationship between the LLR score obtained from the traditional decision function and the approximate form with two terms of the Taylor series. We tested on 10000 utterances each for male and female speakers, and each utterance is scored with both Eq.(5) and Eq.(11). It can be seen that the relationship between the scores from the two scoring forms is nearly linear, which means that for the purpose of classification, the effect of using the Taylor series can be ignored.

Fig. 1. Relationship of scores obtained from traditional decision function and approximate form. (a) Male; (b) Female

3. Experiments on NIST SRE 2010

In this subsection, we list the results of JFA systems using the frame-by-frame and Discriminative decision function (DDF) based scoring methods on the three test conditions in NIST SRE 2010.

Table 1 lists the performance of the JFA systems based on the two scoring methods for the telephone-telephone condition. From Table 1, we can see that the proposed scoring method outperforms the conventional frame-by-frame strategy for both male and female speakers. Our system achieves 14.85% relative improvement in EER and 5.53% relative improvement in minDCF for male speakers, and relative gains of 16.12% in EER and 16.12% in minDCF for female speakers.

Table 1. Comparison of different scoring methods for the telephone-telephone task

System            Male                  Female
                  EER(%)    minDCF      EER(%)    minDCF
Frame-by-frame    5.05      0.452       6.45      0.614
DDF               4.30      0.425       5.41      0.515

The performance of the JFA systems based on our method and the traditional frame-by-frame one for the interview-interview task is shown in Table 2. As can be seen from Table 2, our method achieves relative improvements of 11.27% in EER and 7.28% in minDCF for male speakers, as well as 6.21% in EER and 4.22% in minDCF for female speakers.

Table 2. Comparison of different scoring methods for the interview-interview task

System            Male                  Female
                  EER(%)    minDCF      EER(%)    minDCF
Frame-by-frame    2.84      0.563       5.31      0.687
DDF               2.52      0.522       4.98      0.658

Table 3 compares the proposed system with the frame-by-frame one for the interview-telephone condition. Except for the EER for male speakers, the performance of the proposed system is comparable to or better than that of the frame-by-frame one. Relative gains of 5.54% in minDCF for male speakers and 8.39% in EER for female speakers are obtained. For male speakers, however, the EER of the proposed system is worse than that of the frame-by-frame one (2.74% versus 2.40%). This may be due to the fact that the number of interview-telephone trials (both true and false) from NIST SRE 2008 is too small to train the contribution factor $w$ well.

Table 3. Comparison of different scoring methods for the interview-telephone task

System            Male                  Female
                  EER(%)    minDCF      EER(%)    minDCF
Frame-by-frame    2.40      0.505       4.65      0.659
DDF               2.74      0.477       4.26      0.659

4. Speed

The aim of this experiment is to show the approximate scoring time of the two systems in order to compare their complexity. The time measured includes reading the necessary data connected with the trial and computing the likelihood ratio. Each measurement was repeated 5 times and averaged. Table 4 shows the average scoring time per trial. From Table 4, we can see that the proposed scoring method is faster than the traditional frame-by-frame one.

Table 4. Comparison of average scoring time per trial using frame-by-frame and DDF based scoring methods

System            Scoring time cost (s)
Frame-by-frame    3.75
DDF               2.01

VI. Conclusion

In this paper, we have introduced a discriminative decision function based scoring method for speaker verification with the JFA system. Experiments show that the proposed method is effective and outperforms the traditional frame-by-frame scoring method on the whole. In addition, the average scoring time per trial of the proposed method (2.01 s) is lower than that of the frame-by-frame scoring method (3.75 s).

References

[1] D.A. Reynolds, T.F. Quatieri and R.B. Dunn, “Speaker verification using adapted Gaussian mixture models”, Digital Signal Processing, Vol.10, No.1-3, pp.19–41, 2000.
[2] X. Zhang, X. Xiao, H. Wang, J. Zhang and Y. Yan, “Multiclass maximum a posteriori linear regression for speaker verification”, Chinese Journal of Electronics, Vol.19, No.4, pp.641–645, 2010.
[3] M.H. Sanchez, L. Ferrer, E. Shriberg and A. Stolcke, “Constrained cepstral speaker recognition using matched UBM and JFA training”, Proc. of Interspeech, Florence, Italy, pp.141–144, 2011.
[4] P. Kenny, P. Ouellet, N. Dehak, V. Gupta and P. Dumouchel, “A study of inter-speaker variability in speaker verification”, IEEE Trans. on Audio, Speech and Language Processing, Vol.16, No.5, pp.980–988, 2008.
[5] N. Dehak, P. Kenny, R. Dehak, P. Ouellet and P. Dumouchel, “Front-end factor analysis for speaker verification”, IEEE Trans. on Audio, Speech and Language Processing, Vol.19, No.4, pp.788–798, 2011.
[6] N. Brümmer, L. Burget, J. Cernocky, O. Glembek et al., “Fusion of heterogeneous speaker recognition systems in the STBU submission for the NIST speaker recognition evaluation 2006”, IEEE Trans. on Audio, Speech and Language Processing, Vol.15, No.7, pp.2072–2084, 2007.
[7] O. Glembek, L. Burget, N. Dehak, N. Brümmer and P. Kenny, “Comparison of scoring methods used in speaker recognition with joint factor analysis”, Proceedings of the International Conference on Acoustics, Speech and Signal Processing, Taipei, Taiwan, pp.4057–4060, 2009.
[8] P. Kenny and P. Dumouchel, “Experiments in speaker verification using factor analysis likelihood ratios”, Proceedings of Odyssey 2004, Toledo, Spain, pp.219–226, 2004.
[9] R. Duda and P. Hart, Pattern Classification and Scene Analysis, Wiley, New York, 1973.
[10] W. Campbell, “Generalized linear discriminant sequence kernels for speaker recognition”, Proceedings of the International Conference on Acoustics, Speech and Signal Processing, Orlando, Florida, USA, Vol.1, pp.161–164, 2002.
[11] “The NIST year 2008 speaker recognition evaluation plan”, http://www.nist.gov/speech/tests/spk/2008/index.html, 2008.
[12] “The NIST year 2010 speaker recognition evaluation plan”, http://www.nist.gov/speech/tests/spk/2010/index.html, 2010.

LIANG Chunyan received the B.E. degree in Communication Engineering from Shandong Normal University in 2008. She is now an M.S. & Ph.D. candidate in the Key Laboratory of Speech Acoustics and Content Understanding at the Institute of Acoustics, Chinese Academy of Sciences. Her research interests include speaker recognition and language recognition. (Email: liangchun-[email protected])

ZHANG Xiang received the B.E. degree in Electronic Information Engineering from Shandong University in 2006 and the Ph.D. degree from the Key Laboratory of Speech Acoustics and Content Understanding at the Institute of Acoustics, Chinese Academy of Sciences. His research interests include speaker recognition, language identification, speaker diarization, and audio watermarking.

YAN Yonghong received the B.E. degree from Tsinghua University in 1990, and the Ph.D. degree from Oregon Graduate Institute (OGI). He worked at OGI as an Assistant Professor (1995), Associate Professor (1998) and Associate Director (1997) of the Center for Spoken Language Understanding. He worked at Intel from 1998 to 2001, chaired the Human Computer Interface Research Council, and worked as Principal Engineer of the Microprocessor Research Laboratory and Director of the Intel China Research Center. Currently he is a professor and director of Think IT Laboratory. His research interests include speech processing and recognition, language/speaker recognition, and human computer interface. He has published more than 100 papers and holds 40 patents.

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012
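As a rough illustration of the speaker verification paper's Eqs.(5) and (8)-(11), the following sketch computes both the frame-by-frame LLR score and the approximate inner-product score from the same per-component values $g(o_t|\lambda_c)$. It assumes diagonal-covariance GMMs and toy random data; the function names (`weighted_gaussians`, `information_vector`, `ddf_score`) are hypothetical and not taken from the paper, and no JFA channel compensation is performed here.

```python
import numpy as np

def weighted_gaussians(frames, weights, means, variances):
    """Return g(o_t | lambda_c) = w_c * N(o_t; mu_c, diag(var_c)), shape (T, C)."""
    diff = frames[:, None, :] - means[None, :, :]                      # (T, C, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (C,)
    log_pdf = log_norm - 0.5 * np.sum(diff ** 2 / variances, axis=2)   # (T, C)
    return weights * np.exp(log_pdf)

def llr_score(g_spk, g_ubm):
    """Eq.(5): average per-frame log-likelihood ratio."""
    return float(np.mean(np.log(g_spk.sum(axis=1)) - np.log(g_ubm.sum(axis=1))))

def information_vector(g_spk, g_ubm):
    """Eq.(10): Phi_c, the difference of average occupation probability."""
    total = (g_spk + g_ubm).sum(axis=1, keepdims=True)  # p(o_t|s') + p(o_t|m')
    return np.mean((g_spk - g_ubm) / total, axis=0)     # shape (C,)

def ddf_score(info_vec, w=None):
    """Eq.(11): inner product of contribution factor w and information vector."""
    w = np.ones_like(info_vec) if w is None else np.asarray(w, dtype=float)
    return float(w @ info_vec)

# Toy data (an assumption for illustration only).
rng = np.random.default_rng(0)
C, D, T = 4, 3, 200
weights = np.full(C, 1.0 / C)
variances = np.ones((C, D))
ubm_means = rng.normal(size=(C, D))
spk_means = ubm_means + 0.3 * rng.normal(size=(C, D))  # stand-in for an adapted model
frames = spk_means[rng.integers(0, C, size=T)] + rng.normal(size=(T, D))

g_spk = weighted_gaussians(frames, weights, spk_means, variances)
g_ubm = weighted_gaussians(frames, weights, ubm_means, variances)
b = information_vector(g_spk, g_ubm)
print("LLR score:", llr_score(g_spk, g_ubm), "approximate score:", ddf_score(b))
```

With the unit weight vector, the approximate score is the per-frame average of $(p(o_t|s') - p(o_t|m'))/(p(o_t|s') + p(o_t|m'))$, so it always lies between -1 and 1. A trained contribution factor can be passed through the `w` argument.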
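The MSE solution of Eq.(19) and the kernel factorization of Eqs.(22)-(23) in the speaker verification paper can be sketched as follows. This is a minimal sketch on synthetic information vectors, not the authors' implementation: the small `ridge` term is an added assumption to keep $R$ invertible, the function names are hypothetical, and the SVM training step that produces $\alpha_i$ and $d$ is omitted.

```python
import numpy as np

def mse_contribution_factor(b_true, b_false, ridge=1e-8):
    """Eq.(19): w = R^{-1} M_+^t 1 with R = M^t M (ridge term is an assumption)."""
    m = np.vstack([b_true, b_false])              # Eq.(15): true trials stacked over false
    r = m.T @ m + ridge * np.eye(m.shape[1])
    rhs = b_true.T @ np.ones(len(b_true))         # M_+^t 1
    return np.linalg.solve(r, rhs)

def glds_map(b_true, b_false, ridge=1e-8):
    """Return U with R_bar^{-1} = U^t U, plus R_bar^{-1} itself, for Eqs.(22)-(23)."""
    m = np.vstack([b_true, b_false])
    r_bar = (m.T @ m) / len(b_true) + ridge * np.eye(m.shape[1])
    r_inv = np.linalg.inv(r_bar)
    r_inv = 0.5 * (r_inv + r_inv.T)               # remove round-off asymmetry
    lower = np.linalg.cholesky(r_inv)             # r_inv = lower @ lower.T
    return lower.T, r_inv                         # U = lower.T

# Synthetic information vectors (an assumption for illustration only).
rng = np.random.default_rng(1)
b_true = 0.3 + 0.1 * rng.normal(size=(40, 5))
b_false = -0.3 + 0.1 * rng.normal(size=(60, 5))

w = mse_contribution_factor(b_true, b_false)
u, r_inv = glds_map(b_true, b_false)
bx, by = b_true[0], b_false[0]
print("Eq.(22):", bx @ r_inv @ by, "Eq.(23):", (u @ bx) @ (u @ by))
```

Because `numpy.linalg.cholesky` returns a lower-triangular factor, its transpose plays the role of $U$ here. Once every information vector has been mapped to $Ub$, a linear-kernel SVM can be trained on the mapped vectors.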
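The relative improvements quoted for the interview-interview task can be recomputed from the Table 2 entries of the speaker verification paper. The helper name `relative_improvement` is hypothetical; the numbers are copied from Table 2.

```python
def relative_improvement(baseline, proposed):
    """Relative reduction of an error metric, in percent."""
    return 100.0 * (baseline - proposed) / baseline

# (frame-by-frame, DDF) pairs from Table 2.
table2 = {
    "male EER": (2.84, 2.52),
    "male minDCF": (0.563, 0.522),
    "female EER": (5.31, 4.98),
    "female minDCF": (0.687, 0.658),
}

for name, (baseline, proposed) in table2.items():
    print(f"{name}: {relative_improvement(baseline, proposed):.2f}%")
```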

Visual Attention Model Based Regions of Interest Detection in Compressed Domain∗

SUI Lei, ZHANG Jing, ZHUO Li and YANG Yuncong

(Signal and Information Processing Laboratory, Beijing University of Technology, Beijing 100124, China)

Abstract — Because human beings usually pay more attention to areas of interest, a visual attention model is a feasible way to find Regions of interest (ROIs) and to measure the interest of a region. However, existing models require the image data to be decompressed completely. A visual attention model based ROIs detection method in the compressed domain is proposed in this paper, which can compute the visual attention model with only partial decompression. This method includes: (1) Visual saliency map computation; (2) Focus of attention (FOA) selection and shift; (3) ROIs detection. The experimental results show that the proposed method performs well on the speed/accuracy of ROIs detection and interest measurement.

Key words — Visual attention model, Regions of interest, Compressed domain, Saliency map.

∗Manuscript Received Aug. 2011; Accepted Feb. 2012.

I. Introduction

A visual attention model is a promising method that computes the most salient parts in an image and represents the image as a gray scale "saliency map", which results in the creation of attention regions. Several visual attention models have emerged in recent years, which can be divided into three categories[5−7]: space-based attention models, visual feature-based attention models and space/feature-based attention models. Itti et al.[5] computed the most salient parts as ROIs in an image based on a human visual model. In our work[6], a ROIs detection method based on object characteristics and a visual attention model was proposed.

The above approaches require the image data to be decompressed completely, which results in extra computational complexity and slower processing speed[7]. An effective alternative is to process image data in the compressed domain. In Ref.[8], DC and AC coefficients are extracted as features for image retrieval. Tan et al.[9] utilized the spatial relationship of DCT coefficients to resize images. In our work[10], both skin and face detection are based on the compressed stream.

Building on ROIs detection methods in the pixel domain and image processing methods in the compressed domain, a visual attention model based ROIs detection method in the compressed domain is proposed in this paper.

The remainder of this paper is organized as follows: Section II introduces and describes the components of the proposed method. Section III presents the experimental results, and Section IV concludes the paper.

II. The Proposed ROIs Detection Method

In the proposed method, we compute the visual saliency map with DC and AC coefficients before the Inverse discrete cosine transform (IDCT), which reduces the computational complexity and saves decoding time. Fig.1 illustrates the visual attention model computation for the ROIs detection process, which consists of three steps: (1) Visual saliency map computation; (2) Focus of attention selection and shift; (3) ROIs detection.

Fig. 1. Proposed method processing steps

1. Visual saliency map computation

The saliency map represents the saliency at every location in the visual field and is based on the spatial distribution of saliency. In this approach, we obtain AC and DC coefficients from the compressed stream to compute the local and global saliency maps separately. Then we normalize and combine the two saliency maps.

In Ref.[6], the local saliency map is computed from color, intensity and orientation feature maps. In the proposed method, by contrast, the saliency map is determined in the compressed domain.

In this step, the intensity and color feature maps are obtained by building Gaussian pyramids with the DC coefficients of Y, Cb and Cr. Then a set of center-surround differences and normalization operators is applied to each feature map. Finally a local saliency map is obtained by combining and normalizing all feature maps. Center-surround is implemented as the difference between fine and coarse scales:

$$C_Y = \sum_{m=0}^{2}\sum_{n=m+1}^{m+2} N\left(G_{YDC}(m) - G_{YDC}(n)\right) \tag{1}$$

$$C_{Cb} = \sum_{m=0}^{2}\sum_{n=m+1}^{m+2} N\left(G_{CbDC}(m) - G_{CbDC}(n)\right) \tag{2}$$

$$C_{Cr} = \sum_{m=0}^{2}\sum_{n=m+1}^{m+2} N\left(G_{CrDC}(m) - G_{CrDC}(n)\right) \tag{3}$$

where $N(\cdot)$ is a normalization operator, $G_{YDC}(\cdot)$ is the Gaussian pyramid level specified for the Y component of the DC coefficients, $m$ is a center (fine) scale and $n$ is a coarser scale; together they yield the feature maps.

For the Discrete cosine transform (DCT)[7], DCT(1, 0) is obtained by subtracting two rows of pixels, multiplying each difference by a coefficient and summing the results. The larger the value of DCT(1, 0), the more the rows differ. The same analysis applies to the DCT(0, 1) coefficient. So we set a threshold $t$ and then follow these criteria to detect the orientation of the block:

  if |DCT(0, 1)| > t && |DCT(1, 0)| > t && DCT(0, 1)·DCT(1, 0) < 0, the block is one diagonal;
  if |DCT(0, 1)| > t && |DCT(1, 0)| > t && DCT(0, 1)·DCT(1, 0) > 0, the block is another diagonal;
  if |DCT(0, 1)| … t … |DCT(1, 0)| … t, the block is horizontal.   (4)

According to the orientation of each block, the orientation feature for all points in the DC map is calculated as follows:

$$O_i = 1 - P_i/Q \tag{5}$$

where $Q$ is the total number of 8 × 8 blocks in a 5 × 5 region (each point in the DC map corresponds to an 8 × 8 block in the original image), $P_i$ denotes the number of blocks in the 5 × 5 region which have the same orientation as the center block, and $O_i$ is the orientation feature of the block.

The local saliency map Lsm is computed by combining and normalizing the intensity feature map Y, the color feature maps Cb and Cr and the orientation feature map O[5]. The gray value of every point in the local saliency map corresponds to its saliency. The local saliency map is shown in Fig.2.

Fig. 2. Local saliency map in compressed domain. (a) Original image; (b) Intensity feature map; (c) Orientation feature map; (d) Color feature map; (e) Local saliency map

The global saliency map is computed based on an improved Stentiford's model[6]. Firstly, the DC coefficients of the Y, Cb and Cr components are extracted to form a DC map. Then the improved model is used for computing the global saliency map. Because the size of the DC map is 1/64 of the original image, the down sampling process is omitted. The processing steps are as follows:

(1) Saliency computation of each point in a DC map. Firstly, we treat each point as an individual and a 5 × 5 patch as a population. An individual and three points selected from the population constitute a layout (also known as a distributed model). Then the number of populations that differ from the previous layout is calculated by comparison with other populations along the horizontal and vertical axes in the DC map. The greater the value, the higher the point saliency.

(2) Normalization operator. Each point saliency is normalized to [0, 255], and then a global saliency map (gray scale) can be obtained.

The total saliency map is a combination and normalization of the local saliency map and the global saliency map. The saliency map is shown in Fig.4.

Fig. 3. Global saliency map in …

Fig. 4. Total saliency map in …

2. Focus of attention selection and shift

… "winner-take-all" neural network, in which synaptic interactions among units ensure that only the most active point remains, while all other points are suppressed. The most active point is termed the focus of attention. Then the focus of attention is shifted under the mechanism of inhibition of return and the adjacency-priority principle, with the area and location saliency enhancement factors[6].

3. ROIs detection

In this step, ROIs detection consists of three steps: (1) candidate regions are segmented from the total saliency map with a threshold algorithm; (2) small holes are removed and big holes are filled with a morphological processing method; (3) a candidate region is regarded as a ROI if the focus of attention lies in this region. In addition, the interest of each region is measured with the regional saliency and the area and location saliency enhancement factors[6].

III. Simulation Results

In our experiment, we downloaded 600 images from Flickr and Itti's home page[5]. The ROIs detection results are shown in Fig.5. In Fig.5, the contour line is the edge of a ROI. The line between two regions describes the shift of the focus of attention.

Fig. 5. ROIs detection results

In order to evaluate the effectiveness of the algorithm, we tested the objective and subjective interest and the concordance between the objective and subjective interest of ROIs[6]. Table 1 gives the objective interest of the ROIs of some images in Fig.5. Table 2 shows the subjective evaluation of ROIs. We tested five users separately. We provided four evaluation levels (1, 2, 3, 4) and regarded the average of the users' evaluations as the subjective interest of each region. The smaller the average, the higher the interest. The concordance between objective and subjective interest, compared with the method in Ref.[6], is shown in Table 3.

Table 1. Objective interest of each region

Original  ROI  Local      Global     Area enhancement  Location enhancement  Interest of
image          saliency   saliency   factor            factor                a region
4         1    137.0000   255.0000   1.2557            0.9980                255.0000
4         2    133.0000   247.5547   1.2443            0.9975                244.4075
5         1    194.0000   255.0000   1.2500            0.9864                255.0000
5         2    203.0000   243.6946   1.2500            0.7281                184.0115
6         1    136.0000   255.0000   1.4343            0.9969                255.0000
6         2    131.0000   245.6250   1.0335            0.9503                167.0653
6         3    130.0000   243.7500   1.0321            0.8997                151.5632

Table 2. Subjective interest of each region

Image  ROI  A  B  C  D  E  Average
4      1    1  1  2  1  1  1.2
4      2    2  2  1  2  2  1.8
5      1    1  2  1  2  1  1.4
5      2    2  1  2  1  2  1.6
6      1    2  2  1  1  2  1.6
6      2    1  3  2  2  1  1.8
6      3    3  1  3  3  3  2.6

Table 3. Concordance between objective and subjective interest

Method            Single ROI     Multi-ROIs (400 images)   Total (Single + Multi-ROIs)
                  (200 images)   First ROI    FOA shift    First ROI
Method[6]         100%           73%          62%          82%
Proposed method   100%           76%          64%          84%

From the experimental results in Table 3, the concordance of FOA shift for multi-ROI images is lower because it is difficult to accurately extract some small objects, which are generally not the first object in a multi-ROIs image. The comparison results show that our proposed method performs better than the method in Ref.[6]. In the experiment, the processing time of the proposed method is 2248 ms and that of the method in Ref.[6] is 9102 ms, which is explained by the following reasons: (1) The visual saliency map is computed by extracting DC and AC coefficients from the compressed stream, which avoids the IDCT and saves much decoding time; (2) The size of the visual saliency map is smaller than that of the saliency map in Ref.[6]. Hence, the time cost of focus of attention selection and shift and of measuring the interest of a region is less than that of the latter.

IV. Conclusion

Different from the conventional visual attention models in the pixel domain, this paper introduces an effective visual attention model for ROIs detection in the compressed domain. Experimental results show that the proposed method can improve the speed of ROIs detection. It also agrees well with the human visual system. Our future work is to improve the accuracy of ROIs detection for multi-object images and to apply the method to image recognition.

References

[1] J. Zhang, L.S. Shen and David Dagan Feng, “A survey of image retrieval based on visual perception”, Acta Electronica Sinica, Vol.36, No.3, pp.494–499, 2008. (in Chinese)
[2] Y.H. Tang, “Image retrieval based on user defined region-of-interest”, Computer Applications, Vol.22, No.11, pp.20–22, 2002.
[3] J.C. Garcia-Alvarez and G. Castellanos, “Region of interest extraction method using wavelets”, Communication Theory, Reliability, and Quality of Service, Colmar, France, pp.119–124, 2009.
[4] T. Shen, H.S. Li and X.L. Huang, “Active volume models for medical image segmentation”, IEEE Transaction on Medical Imaging, Vol.30, No.3, pp.774–791, 2011.
[5] L. Itti and C. Koch, “A saliency-based search mechanism for overt and covert shifts of visual attention”, Vision Research, Vol.40, No.6, pp.1489–1506, 2000.
[6] J. Zhang, L.S. Shen and J.J. Gao, “Region of interest detection based on visual attention model and evolutionary programming”, Journal of Electronics and Information Technology, Vol.31, No.7, pp.1646–1652, 2009.
[7] J. Zhang, L.S. Shen and X.G. Li, Image Retrieval and Compressed Domain Processing, Posts and Telecom. Press, Beijing, China, pp.102–113, pp.287–300, 2008.
[8] P. Suresh, R.M.D. Sundaram and A. Arumugam, “Feature extraction in compressed domain for content based image retrieval”, Advanced Computer Theory and Engineering, Phuket, Thailand, pp.190–194, 2008.
[9] E.L. Tan, W.S. Gan and S.K. Mitra, “Fast arbitrary resizing of images in the discrete cosine transform domain”, IET Image Processing, Vol.5, No.1, pp.73–86, 2011.
[10] S.W. Zhao, L. Zhuo, S.Y. Wang, Z. Xiao, X.G. Li and L.S. Shen, “Pornographic image recognition in compressed domain based on multi-cost sensitive decision tree”, Computer Science and Information Technology, Chengdu, China, pp.225–229, 2009.

SUI Lei was born in 1985. He is currently a M.S. student in Information and Communication Engineering at Beijing University of Technology. His research interests include pornographic image processing and recognition in compressed domain. (Email: [email protected])

ZHUO Li is currently a Professor and Ph.D. student supervisor in Circuits and Systems, at the College of Electronic Information and Control Engineering, Beijing University of Technology. Her current research interests include wireless multimedia sensor networks, mobile intelligence, multimedia content analysis and Communication.

ZHANG Jing was born in 1975, Ph.D., Associate Professor and M.S. student supervisor in Beijing University of Technology. Her research interest is image/video signal and processing.

YANG Yuncong was born in 1987. She is currently a M.S. student in Information and Communication Engineering at Beijing University of Technology. Her research interests include medical image processing and recognition.

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

A Homomorphic Aggregate Signature Scheme Based on Lattice∗

ZHANG Peng, YU Jianping and WANG Ting

(ATR Key Laboratory of National Technology, Shenzhen University, Shenzhen 518060, China)

Abstract — Homomorphic signatures can authenticate vector subspaces of a given ambient space. Aggregate signatures can compress multiple signatures into a compact signature. In order to study the security issues in multi-source network coding and sensor data aggregation, the homomorphic aggregate signature scheme is introduced, which can aggregate signatures with message operated from different users. Compared to the classical cryptography, the lattice cryptography is more secure, simple and flexible, so it is applied to the signature scheme design. Bonsai tree characteristics of lattice cryptography can generate multiple bases of a lattice, which means multiple users have the same public key and different private keys. Further, the homomorphic aggregate signature scheme is proposed. Our scheme is secure under the lattice-based inhomogeneous smallest integer solution assumption. Compared to the ordinary lattice-based signature schemes, the communication and verification efficiency are improved.

Key words — Lattice cryptography, Homomorphism, Aggregation, Homomorphic aggregate signature.

I. Introduction

Privacy homomorphism[1] is a very useful cryptography tool that enables secure computation. Homomorphic signature schemes support operations on signatures which are consistent with the operations on messages. Owing to these special characteristics, homomorphic signature schemes are well-suited to guarantee information security in message-operated scenarios, such as network coding[2], sensor networks[3], etc.

Most homomorphic signature schemes based on classical cryptography were proved to be either insecure[4] or function-limited[2]. Since new trapdoors for hard lattices were developed successfully in Ref.[5], many lattice-based homomorphic signature schemes[6,7] have been proposed owing to the mathematical elegance, implementation simplicity, provable security reduction and dramatic gain in efficiency of the lattice cryptography. The homomorphic signature scheme proposed in Ref.[6] was the first linear scheme that authenticated vectors defined over binary fields, and was based on the problem of finding short vectors in integer lattices. Using ideal lattices, Boneh et al.[7] presented the first homomorphic signature scheme for polynomial functions, which can compute polynomial functions on signed data. However, whether based on the classical cryptography or the lattice cryptography, previous schemes only discussed the situation in which the signatures are all from the same user.

The aggregate signature scheme defined in Ref.[8] combined multiple signatures on different messages and from different users into a single signature. The output was an aggregate signature whose length was the same as that of any individual signature. Aggregate signatures are useful for reducing the size of certificate chains and for reducing message size in secure routing protocols. Wen et al.[9] designed a new efficient aggregate signature scheme with specified verifier from bilinear pairings. The characteristic that only the designated verifier can verify the correctness of the signature prevented the disclosure of any relevant information about the signer. However, the aggregation of signatures with message operated is not considered in these prior schemes.

As mentioned above, homomorphic signatures only combine signatures with message operated from the same user, and aggregate signatures only combine signatures without message operated from different users. In fact, the signature scheme with homomorphism and aggregation is widely needed for guaranteeing the security in multi-source network coding and secure data aggregation for sensor networks. We call it the Homomorphic aggregate signature (HAS) scheme. Using lattice cryptography and its flexibility, we propose a HAS scheme, in which multiple signatures or messages can be compressed into one signature or message, even if these signatures are on different messages and are produced by different users.

II. Preliminaries

1. Lattices

For any integer $q \ge 2$, we let $\mathbb{Z}_q$ denote the ring of integers modulo $q$. When $q$ is prime, $\mathbb{Z}_q$ is a field and is sometimes denoted $\mathbb{F}_q$. We let $\mathbb{Z}_q^{n\times m}$ denote the set of $n \times m$ matrices with entries in $\mathbb{Z}_q$.

We will be interested in integer lattices whose points have coordinates in $\mathbb{Z}^m$. For any integer $q \ge 2$, $u \in \mathbb{Z}_q^n$ and matrix $A \in \mathbb{Z}_q^{n\times m}$, define integer lattices:

$$\Lambda_q(A) = \{e \in \mathbb{Z}^m \ \text{s.t.}\ \exists s \in \mathbb{Z}_q^n,\ A^T \cdot s = e \pmod q\}$$

$$\Lambda_q^{\perp}(A) = \{e \in \mathbb{Z}^m \ \text{s.t.}\ A \cdot e = 0 \pmod q\}$$
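As a small illustration of the second definition, the following sketch tests whether an integer vector lies in $\Lambda_q^{\perp}(A)$ by checking $A \cdot e = 0 \pmod q$. The function name `in_orthogonal_lattice` and the toy values of `A`, `q` and the test vectors are assumptions chosen for illustration; they do not come from the paper.

```python
def in_orthogonal_lattice(A, e, q):
    """Return True if A·e ≡ 0 (mod q), i.e. e lies in the lattice of vectors orthogonal to A mod q."""
    return all(sum(a * x for a, x in zip(row, e)) % q == 0 for row in A)

# Toy parameters (illustrative only, far smaller than any cryptographic setting).
q = 7
A = [[1, 2, 3],
     [4, 5, 6]]          # a matrix in Z_q^{2x3}

e1 = [1, -2, 1]          # A·e1 is the zero vector over the integers
e2 = [q, 0, 0]           # q times a unit vector always satisfies the condition
e3 = [1, 0, 0]           # A·e3 = (1, 4), which is nonzero mod 7

# The set is closed under addition, as a lattice must be.
e_sum = [a + b for a, b in zip(e1, e2)]

print(in_orthogonal_lattice(A, e1, q),
      in_orthogonal_lattice(A, e2, q),
      in_orthogonal_lattice(A, e3, q),
      in_orthogonal_lattice(A, e_sum, q))
```

Membership is cheap to test; the hard part exploited by the cryptography is finding short nonzero members without a trapdoor.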

∗Manuscript Received Oct. 2011; Accepted Nov. 2011. This work is supported by the National Natural Science Foundation of China (No.61171072, No.61001058), Projects in the National Science and Technology Pillar Program (No.2011BAH20B02, No.2011BAH20B03).

$$\Lambda_q^{u}(A) = \{e \in \mathbb{Z}^m \ \text{s.t.}\ A \cdot e = u \pmod q\}$$

For any $t \in \Lambda_q^{u}(A)$, $\Lambda_q^{u}(A) = \Lambda_q^{\perp}(A) + t$.

2. Sampling from discrete Gaussians

Lattices have useful cryptographic applications because of their natural trapdoor characteristic. Virtually all kinds of lattice-based cryptography schemes show how to use a trapdoor in a theoretically sound and secure way. A short basis of the lattice is such a trapdoor.

Theorem 1[10] Generating a short basis
Let $q$ be an integer and $m \ge 5n \lg q$. There is a probabilistic polynomial-time algorithm TrapGen$(q, n)$ that outputs $(A \in \mathbb{Z}_q^{n\times m}, T \in \mathbb{Z}^{m\times m})$ such that:
(a) $A$ is statistically close to a uniform matrix in $\mathbb{Z}_q^{n\times m}$.
(b) $T$ is a basis of $\Lambda_q^{\perp}(A)$.
(c) The Euclidean norm of all the rows in $T$ is bounded by $O(n \log n)$.

Theorem 2[11] Randomizing a basis
For a rank $n$ matrix $A \in \mathbb{Z}_q^{n\times m}$, let $T \in \mathbb{Z}^{m\times m}$ be an arbitrary basis of $\Lambda_q^{\perp}(A)$. For a parameter $s \ge \|\tilde{T}\| \cdot \omega(\sqrt{\log n})$, there is a probabilistic polynomial-time algorithm RandBasis$(T, s)$ that outputs another basis $T'$ of $\Lambda_q^{\perp}(A)$ such that $\|T'\| \le s \cdot \sqrt{m}$.

Theorem 3[11] Delegating a basis
For a rank $n$ matrix $A \in \mathbb{Z}_q^{n\times m}$, let $T \in \mathbb{Z}^{m\times m}$ be an arbitrary basis of $\Lambda_q^{\perp}(A)$. Let $A' \in \mathbb{Z}_q^{n\times m'}$ be arbitrary. There is a deterministic polynomial-time algorithm ExtBasis$(T, B = A\|A')$ that outputs a basis $S \in \mathbb{Z}^{(m+m')\times(m+m')}$ of $\Lambda_q^{\perp}(B)$ such that $\|\tilde{T}\| = \|\tilde{S}\|$.

Theorem 4[5] Sampling from discrete Gaussians
There is a probabilistic polynomial-time algorithm SamplePre$(A, T, \sigma, u)$ that, given a matrix $A \in \mathbb{Z}_q^{n\times m}$, a basis $T$ of $\Lambda_q^{\perp}(A)$, a parameter $\sigma \ge \|\tilde{T}\| \cdot \omega(\sqrt{\log n})$, and a vector $u \in \mathbb{Z}_q^n$, outputs a sample $e$ from a distribution that is statistically close to $D_{\Lambda_q^{u}(A),\sigma}$. Here $D_{\Lambda_q^{u}(A),\sigma}$ is the discrete Gaussian distribution over $\Lambda_q^{u}(A)$ with parameter $\sigma$.

3. Hardness assumption

The classical lattice-based hardness problems mainly include the shortest vector problem, the closest vector problem and the smallest basis problem. Further, in order to construct cryptography schemes, the inhomogeneous smallest integer solution problem ISIS$_{q,m,\beta}$ is derived from them, and is stated as follows.

Given an integer $q$, a matrix $A \in \mathbb{Z}_q^{n\times m}$, a syndrome $u \in \mathbb{Z}_q^n$ and a real $\beta$, find an integer vector $e \in \mathbb{Z}^m$ such that $A \cdot e = u \bmod q$ and $\|e\| \le \beta$.

Ref.[5] has proved that the average-case problem ISIS$_{q,m,\beta}$ is as hard as approximating the shortest vector problem in the worst case for any poly-bounded $m$, $\beta$ and any prime $q \ge \beta \cdot \omega(\sqrt{n \log n})$.

III. Homomorphic Aggregate Signatures

1. The homomorphic aggregate signature definition

Formally, a homomorphic aggregate signature scheme HAS is a tuple of probabilistic, polynomial-time algorithms (Setup, Sign, Agg$_m$, Agg$_\sigma$, Verify) with the following functionality:

• Setup$(1^n, 1^l)$
Takes a security parameter $n$ and the maximum user size $l$. Outputs the public key $pk_i$ and the private key $sk_i$ $(i = 1, \cdots, l)$.

• Sign$(id, sk_i, m_i)$
Takes a tag $id$, a private key $sk_i$ and a message $m_i$, and outputs a signature $\sigma_i$.

• Agg$_m(id, \{\alpha_i, m_i, pk_i\}_{i=1}^{l})$
For the messages with the same $id$, takes the messages $m_1, \cdots, m_l$ whose weights are $\alpha_1, \cdots, \alpha_l$ from users with public keys $pk_1, \cdots, pk_l$ respectively, and outputs a message $m_{Agg}$.

• Agg$_\sigma(id, \{\alpha_i, \sigma_i, pk_i\}_{i=1}^{l})$
For the signatures with the same $id$, takes the signatures $\sigma_1, \cdots, \sigma_l$ whose weights are $\alpha_1, \cdots, \alpha_l$ from users with public keys $pk_1, \cdots, pk_l$ respectively, and outputs a signature $\sigma_{Agg}$.

• Verify$(id, (pk_1, \cdots, pk_l), m_{Agg}, \sigma_{Agg})$
Takes the tag $id$, the public keys $pk_1, \cdots, pk_l$, the aggregated message $m_{Agg}$ and the aggregated signature $\sigma_{Agg}$, and outputs either 0 (reject) or 1 (accept).

For correctness, we require that for each $(pk_i, sk_i)$ output by Setup$(1^n, 1^l)$, the following hold:

(a) For any $id$ and any $m_i$, if $\sigma_i \leftarrow$ Sign$(id, sk_i, m_i)$ then

Verify$(id, pk_i, m_i, \sigma_i) = 1$.

(b) For any $id$ and any set $\{\alpha_i, m_i, \sigma_i, pk_i\}_{i=1}^{l}$, if $\sigma_i \leftarrow$ Sign$(id, sk_i, m_i)$, then

Verify$[id, (pk_1, \cdots, pk_l),$ Agg$_m(id, \{\alpha_i, m_i, pk_i\}_{i=1}^{l}),$ Agg$_\sigma(id, \{\alpha_i, \sigma_i, pk_i\}_{i=1}^{l})] = 1$

2. The homomorphic aggregate signature scheme

Based on the definition and the GPV signature scheme[5], the homomorphic aggregate signature scheme HAS is as follows.

• Setup$(1^n, 1^l)$
Given a security parameter $n$ and the maximum user size $l$, do the following:
(a) For any integer $q$, run TrapGen$(q, n)$ to generate a matrix $A \in \mathbb{Z}_q^{n\times m}$ and a short basis $T_1$ of lattice $\Lambda_q^{\perp}(A)$.
(b) Run RandBasis$(T_1, s_i)$ repeatedly and output other short bases $T_2, \cdots, T_l$ of lattice $\Lambda_q^{\perp}(A)$.
(c) Let $H: \{0, 1\}^{*} \to \mathbb{Z}_q^{n\times m}$ be a hash function.
(d) Let $T_1, \cdots, T_l$ be the secret keys of the $l$ users individually. All users have the same public key $A$.

• Sign$(id, T_i, u_i)$
Given a tag $id \in \{0, 1\}^n$ for an order, a secret key $T_i$, and a message $u_i \in \mathbb{Z}_q^n$, do the following:
(a) Set $B = A\|H(id) \in \mathbb{Z}_q^{n\times 2m}$.
(b) Run ExtBasis$(T_i, B)$ to get a short basis $S_i$ of lattice $\Lambda_q^{\perp}(B)$ such that $\|\tilde{T}_i\| = \|\tilde{S}_i\|$.
(c) Output the signature $e_i \leftarrow$ SamplePre$(B, S_i, \sigma, u_i)$.

• Agg$_m(id, \{\alpha_i, m_i, pk_i\}_{i=1}^{l})$
After receiving all messages whose tags are $id$, output the aggregated message $u_{Agg} = \sum_{i=1}^{l} \alpha_i u_i$, where $\alpha_i$ is the weight of the message $u_i$.

• Agg$_\sigma(id, \{\alpha_i, \sigma_i, pk_i\}_{i=1}^{l})$
After receiving all signatures whose tags are $id$, output the aggregated signature $e_{Agg} = \sum_{i=1}^{l} \alpha_i e_i$, where $\alpha_i$ is the weight of the signature $e_i$.

• Verify$(id, A, u_{Agg}, e_{Agg})$

Given the pair of message and signature $(u_{Agg}, e_{Agg})$, define $B = A\|H(id) \in \mathbb{Z}_q^{n\times 2m}$ and verify

$$\|e_{Agg}\| \le l\sigma\sqrt{2m}; \qquad B \cdot e_{Agg} = u_{Agg}.$$

If both formulas are true, the signature passes the verification. In fact:

$$B \cdot e_{Agg} = B \cdot \sum_{i=1}^{l} \alpha_i e_i = \sum_{i=1}^{l} \alpha_i (B \cdot e_i) = \sum_{i=1}^{l} \alpha_i u_i = u_{Agg}$$

IV. Security

1. The security model

As algorithms Agg$_m$ and Agg$_\sigma$ are open, they are not mentioned in this model.

A homomorphic aggregate signature scheme (Setup, Sign, Agg$_m$, Agg$_\sigma$, Verify) is unforgeable if for all $l$ and any probabilistic polynomial-time adversary $\mathcal{A}$, the advantage of $\mathcal{A}$ in the following game is negligible in the security parameter $n$.

• Setup The challenger runs Setup$(1^n, 1^l)$ to obtain $(pk_i, sk_i)$ $(i = 1, \cdots, l)$. It sends $pk_i$ $(i = 1, \cdots, l)$ to $\mathcal{A}$, and keeps $sk_i$ $(i = 1, \cdots, l)$ to itself.

• Queries The same tag $id \in \{0, 1\}^n$ is used in a query stage. $\mathcal{A}$ specifies messages $m_1, \cdots, m_{q_s}$. The challenger chooses users' private keys $\{sk_1, \cdots, sk_{q_s}\} \subset \{sk_1, \cdots, sk_l\}$ randomly and computes $\sigma_i \leftarrow$ Sign$(id, sk_i, m_i)$ $(i = 1, \cdots, q_s)$. Then, it gives to $\mathcal{A}$ the tag $id$, the corresponding public key $pk_i$ and the signature $\sigma_i$ $(i = 1, \cdots, q_s)$. Let $q_s$ be the maximum number of queries for each stage.

• Output The adversary $\mathcal{A}$ outputs a tuple of the public key, message and signature $(pk^{*}, m^{*}, \sigma^{*})$.

The adversary $\mathcal{A}$ wins if Verify$(id, pk^{*}, m^{*}, \sigma^{*}) = 1$ and either
(a) $pk^{*} \notin \{pk_1, \cdots, pk_{q_s}\}$, and $m^{*} \ne 0$
or
(b) $pk^{*} \in \{pk_1, \cdots, pk_{q_s}\}$, for example $pk^{*} = pk_j$, and $m^{*} \ne m_j$.

2. The security analyses

Given an adversary that breaks this HAS scheme over $\mathbb{Z}_q$, we can construct an adversary that solves the ISIS$_{q,m,\beta}$ problem over $\mathbb{Z}_q$. So, the proposed HAS scheme is secure based on the ISIS$_{q,m,\beta}$ assumption.

Proof Let $\mathcal{A}$ be an adversary that makes at most $q_s$ signature queries. We construct an algorithm that takes as input a random matrix $B \in \mathbb{Z}_q^{n\times 2m}$ and a vector $u^{*} \in \mathbb{Z}_q^n$.

• Setup Runs Setup$(1^n, 1^l)$ to get $B$ and $T_1, \cdots, T_l$, and sends public key $B$ to $\mathcal{A}$.

• Queries $\mathcal{A}$ chooses messages $u_1, \cdots, u_{q_s}$ randomly, and the challenger does the following:
(a) Choose the signers with private keys $T_1, \cdots, T_{q_s}$ randomly.
(b) Run the algorithm Sign$(id, T_i, u_i)$.
(c) Return signatures $e_1, \cdots, e_{q_s}$ of the messages $u_1, \cdots, u_{q_s}$.

• Output $\mathcal{A}$ outputs a tuple of the public key, message and signature $(pk^{*}, u^{*}, e^{*})$.

Because the same public key is used for all users, the first forgery $pk^{*} \notin \{pk_1, \cdots, pk_{q_s}\}$ is impossible. If $\mathcal{A}$ wins, $(pk^{*}, u^{*}, e^{*})$ with $u^{*} \notin \{u_1, \cdots, u_{q_s}\}$ is the second successful forgery in the security model. Verify$(id, pk^{*}, u^{*}, e^{*}) = 1$ means that $B \cdot e^{*} = u^{*}$ and $\|e^{*}\| \le \sigma\sqrt{2m}$. The ISIS$_{q,m,\beta}$ problem is solved successfully at the same time, where $\beta = \sigma\sqrt{2m}$.

V. Efficiency

In the proposed HAS scheme, messages from different users can be operated on and their signatures aggregated, so this scheme is homomorphic and aggregate. Multiple messages or signatures are compressed into one, so the communication efficiency is improved; only one verification process is needed to verify multiple message and signature pairs, so the computation is efficient. The efficiency comparison between the GPV scheme[5] and our scheme for $l$ messages is shown in Table 1.

Table 1. Efficiency comparison
Scheme | Message length | Signature length | Verification computation
GPV Scheme[5] | $ln \log q$ | $lm \log q$ | $lnm^2(\log q)^2$
HAS Scheme | $n \log q$ | $2m \log q$ | $4nm^2(\log q)^2$

VI. Conclusion

Owing to the flexible structure and implementation simplicity of lattice cryptography, a homomorphic aggregate signature scheme HAS is proposed. The homomorphic feature of the scheme can deal with signatures with messages operated; and the aggregate feature of the scheme can deal with signatures from different users. Based on the inhomogeneous smallest integer solution assumption, HAS is secure even under quantum attacks. HAS has homomorphic and aggregate features, and mainly uses modular addition and modular multiplication operations, so it is efficient.

References

[1] R. Rivest, L. Adleman, M.L. Dertouzos, “On data banks and privacy homomorphisms”, Foundations of Secure Computation, Academic Press, pp.169–179, 1978.
[2] D. Boneh, D. Freeman, J. Katz et al., “Signing a linear subspace: signature schemes for network coding”, Proceedings of PKC 2009, LNCS 5443, pp.68–87, 2009.
[3] Z.J. Li, G. Gong, “Data aggregation integrity based on homomorphic primitives in sensor networks”, Proceedings of the 9th International Conference on Ad-hoc, Mobile and Wireless Networks, LNCS 6288, pp.149–162, 2010.
[4] Y. Wang, “Insecure “Provably secure network coding” and homomorphic authentication schemes for network coding”, http://epint.iacr.org/2010/060.pdf, 2010.
[5] C. Gentry, C. Peikert, V. Vaikuntanathan, “Trapdoors for hard lattices and new cryptographic constructions”, Proceedings of the 40th annual ACM Symposium on Theory of Computing (STOC 2008), pp.197–206, 2008.
[6] D. Boneh, D.M. Freeman, “Linearly homomorphic signatures over binary fields and new tools for lattice-based signatures”, Proceedings of PKC 2011, ed. R. Gennaro, LNCS 6571, pp.1–16, 2011.
[7] D. Boneh, D.M. Freeman, “Homomorphic signatures for polynomial functions”, Proceedings of Eurocrypt 2011, LNCS 6632, pp.149–168, 2011.
[8] D. Boneh, C. Gentry, B. Lynn, H. Shacham, “Aggregate and verifiably encrypted signatures from bilinear maps”, Proceedings of Eurocrypt 2003, pp.416–432, 2003.
[9] Yiling Wen, Jianfeng Ma, Huawei Huang, “An aggregate signature scheme with specified verifier”, Chinese Journal of Electronics, Vol.20, No.2, pp.333–336, 2011.
[10] J. Alwen, C. Peikert, “Generating shorter bases for hard random lattices”, Proceedings of STACS 2009, pp.75–86, 2009.
[11] D. Cash, D. Hofheinz, E. Kiltz et al., “Bonsai trees, or, how to delegate a lattice basis”, Proceedings of Eurocrypt 2010, LNCS 6110, pp.523–552, 2010.

YU Jianping was born in 1968. He is a professor of ATR Key Laboratory of National Technology, Shenzhen University, China. His main research interests include cryptography, network security and information security. (Email: [email protected])

WANG Ting was born in 1977. He is currently a Ph.D. candidate in the ATR Key Laboratory of National Technology, Shenzhen University, China. His research interests include cryptography and information security. (Email: [email protected])

ZHANG Peng was born in 1984. She is currently a Ph.D. candidate in the ATR Key Laboratory of National Technology, Shenzhen University, China. Her research interests include cryptography, network security and information security. (Email: zhangpeng [email protected])
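As a supplementary illustration of the correctness argument for the HAS scheme above ($B \cdot e_{Agg} = u_{Agg}$ together with the norm bound $l\sigma\sqrt{2m}$), the sketch below checks the linearity on toy data. It is only a sketch under strong assumptions: the helper names (`mat_vec_mod`, `aggregate`, `verify`) and all parameter values are hypothetical, the weights are restricted to $|\alpha_i| \le 1$, and the signatures are produced by picking a short vector first and defining the message from it. The actual Sign algorithm works the other way round, using ExtBasis and SamplePre with a trapdoor basis, which is not reproduced here.

```python
import random

def mat_vec_mod(B, e, q):
    """Compute B·e mod q for a matrix B (list of rows) and an integer vector e."""
    return [sum(b * x for b, x in zip(row, e)) % q for row in B]

def aggregate(weights, vectors):
    """Weighted sum over the integers: sum_i alpha_i * v_i."""
    return [sum(w * v[k] for w, v in zip(weights, vectors))
            for k in range(len(vectors[0]))]

def verify(B, u_agg, e_agg, q, bound_sq):
    """Accept iff ||e_agg||^2 <= bound_sq and B·e_agg = u_agg (mod q)."""
    short = sum(x * x for x in e_agg) <= bound_sq
    return short and mat_vec_mod(B, e_agg, q) == [u % q for u in u_agg]

# Toy parameters (assumptions; far too small to offer any security).
rng = random.Random(0)
q, n, two_m, l, sigma = 97, 3, 8, 4, 2
B = [[rng.randrange(q) for _ in range(two_m)] for _ in range(n)]

# Stand-in for Sign: choose a short e_i first, then DEFINE u_i = B·e_i mod q.
sigs = [[rng.randint(-sigma, sigma) for _ in range(two_m)] for _ in range(l)]
msgs = [mat_vec_mod(B, e, q) for e in sigs]

alphas = [1, -1, 1, 1]                       # small weights, |alpha_i| <= 1
u_agg = [u % q for u in aggregate(alphas, msgs)]
e_agg = aggregate(alphas, sigs)

bound_sq = (l * sigma) ** 2 * two_m          # (l * sigma * sqrt(2m))^2
ok = verify(B, u_agg, e_agg, q, bound_sq)    # honest aggregate

tampered = [(u_agg[0] + 1) % q] + u_agg[1:]  # modify the aggregated message
bad = verify(B, tampered, e_agg, q, bound_sq)
print(ok, bad)
```

The honest aggregate is accepted because matrix-vector multiplication modulo $q$ is linear, while changing the aggregated message breaks the equality check.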

An Approximate Approach to End-to-End Traffic in Communication Networks∗

JIANG Dingde1,2, XU Zhengzheng2, NIU Laisen2 and LIU Jindi2

(1. State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China)
(2. College of Information Science and Engineering, Northeastern University, Shenyang 110819, China)

Abstract — All end-to-end traffic in a network constitutes the Traffic matrix (TM), which reveals all traffic traversing the whole network. In this paper, we investigate the TM estimation problem in large-scale backbone networks. We propose an accurate approach to estimate it, based on the Recurrent multilayer perceptron (RMLP), which has a powerful modeling ability. According to the constraint relations between link loads and TM, we introduce their temporal and spatial correlation to modify the traditional RMLP and establish our models. The outputs of our models take into account the constraints that TM itself satisfies. Trained with input-output data pairs, our models can learn the characteristics of TM, and all weight parameters are determined. Finally, we use real data to validate our method. Simulation results show that our method can perform accurate and fast estimation of TM very well.

Key words — End-to-end traffic, Traffic matrix, Time-varying network, Correlation analysis, Approximate computation.

∗Manuscript Received Dec. 2011; Accepted Jan. 2012. This work is supported in part by the National Natural Science Foundation of China (No.61071124, No.61172051), the Specialized Research Fund for the Doctoral Program of Higher Education (No.20100042120035), the Program for New Century Excellent Talents in University (No.NCET-11-0075), the Fundamental Research Funds for the Central Universities (No.N090404014), the Open Project of State key Laboratory of Networking and Switching Technology (No.SKLNST-2009-1-04), the China Postdoctoral Science Foundation (No.20110491516), and the Northeastern University Postdoctoral Science Foundation (No.20110622).

I. Introduction

For current complex communication networks, performing network management activities effectively is significantly difficult[1,2]. Hence, to know the traffic information across a network, the Traffic matrix (TM) was proposed and researched in detail[3−5]. TM is composed of all Origin-destination (OD) pairs (or flows), or end-to-end traffic, in a network[2]. It can give network operators a global perspective of how all the traffic in the network flows. In today's network activities, TM plays an extremely important role in network management and receives extensive attention[2,6,7].

Unfortunately, direct measurement of TM is not generally practical[5]. Thereby, TM estimation is the main way so far. To attain an accurate TM, a variety of statistical methods are used to model OD flows or TM[2]; information theory is also applied to solve this problem[8]; some typical models, for instance the gravity model[9,10], are taken to capture network traffic characteristics; additionally, some signal processing technologies are introduced into this research area[11,12]. Nevertheless, as mentioned in Refs.[2, 13], statistic-based approaches are extremely sensitive to prior information of OD flows or TM. Stoev et al. used a probability model combining the traditional single-link (-flow) traffic model to capture the global behavior of network traffic[14]. Roughan et al. and Vishwanath et al. presented some synthetic methods to generate TM so as to conduct the normal network activity[15]. Additionally, Nucci et al. converted the under-constrained problem into a full rank one by changing the routing[16]. Similarly, Soul et al. presented a heuristic algorithm to compute the routing needed in order to obtain a full rank problem[17]. Gunnar et al. investigated TM estimation in large-scale backbone networks[18]. Different from previous models, Erramilli et al. proposed an independent-connection model to describe TM[19]. Although these methods can overcome the ill-posed nature of the TM estimation problem to a certain extent, they need complex computation. Our method only uses input-output data pairs to train our models. As a result, it avoids the complex computation and can estimate TM at a faster speed. Besides, we have also made many studies of this problem[20−22].

We propose a novel method to deal with this problem in large-scale backbone networks. Previous methods not only have larger estimation errors, but also incur higher computational overhead[2,13]. Here we exploit the Recurrent multilayer perceptron (RMLP)[23], which is a recurrent neural network, to estimate TM. RMLP has a powerful modeling ability and is used for modeling unknown and complex systems. Additionally, as mentioned in Ref.[23], RMLP holds the characteristics of both recurrent networks and feedforward networks. This guarantees that RMLP has universal approximation properties, because multilayer feedforward networks are approximators[24], has the backpropagation nature, and can capture the temporal and spatial correlations because of its recurrent and parallel structures. Hence, RMLP is suited for estimating large-scale TM. Furthermore, when modeling, we consider the spatio-temporal correlations and linear constraints between TM and link loads in order to make an accurate estimation. Once our models are trained successfully, they can make fast estimation and prediction for OD flows or TM. Our previous work can be found in Ref.[25]. Consequently, they can be used for on-line and real-time estimation of the large-scale TM. We use real data from the Abilene network to validate our method. Simulation results show that our method is practical and effective.

II. Estimation Model

For a backbone network, TM describes all traffic flowing between all source and destination nodes in the network. It contains important information used for network activities. Due to the time-varying nature of TM, TM and link loads can be regarded as time sequences. At a particular time $t$, TM and link loads are denoted as the variables $x(t)$ and $y(t)$, respectively. Hence, TM estimation can be described by the following linear time-varying system model:

$$y(t) = A x(t) \qquad (1)$$

where $A$ is the $L \times N$ routing matrix whose element $A_{ij}$ is equal to 1 if OD flow $j$ traverses link $i$ and zero otherwise[6,20]. Generally, $L \ll N$ holds in a large-scale backbone network, so Eq.(1) describes a highly ill-posed problem[3,7].

Fig. 1. Block-diagram representation of modified RMLP model used for single OD flow, with 2 hidden layers fully recurrent, and with an output layer one-step prediction

1. Single OD flow model

Here, we model the single OD flow estimation problem by modifying the traditional RMLP. Fig.1 plots the block diagram of the modified RMLP used for the single OD flow, where for OD flow $i$ ($i = 1, 2, \cdots, N$; $N$ is the number of OD flows in a network), $W_{zi}^{k}$ denotes the network feedback connection weight matrix of the $k$th layer ($k = 1, 2$), $W_{fi}^{m}$ the network forward connection weight matrix of the $m$th layer ($m = 1, 2, 3$), $y(t)$ the $L \times 1$ input vector of "data pretreating", namely link loads, $U_{0i}$ the input vector of RMLP, $U_{ji}$ the output of the $j$th hidden layer ($j$ equal to 1 or 2), $\hat{x}_i$ the output of RMLP, $z^{-1}I$ a time delay unit, $Z_r$ a unit with $r$ time delays, and $Z_k$ a unit with $k$ time delays. The "data pretreating" in Fig.1 transforms the input data into a form suited for RMLP handling, which speeds up the training and improves the performance. The output layer is a one-step predictor that only forecasts one OD flow. Hence, this is a kind of multi-input and single-output RMLP. Assume that, at the time $t$, the input vector and output vector are $y(t)$ and $\hat{x}_i(t)$ (where $i = 1, 2, \cdots, N$), respectively. According to Fig.1, the following equation is attained.

$$\begin{cases} \hat{x}_i(t) = \varphi_{3i}\left(W_{fi}^{3} \times U_{2i}(t)\right) \\ U_{2i}(t) = \varphi_{2i}\left(W_{fi}^{2} \times U_{1i}(t) + W_{zi}^{2} \times U_{2i}(t-1)\right) \\ U_{1i}(t) = \varphi_{1i}\left(W_{fi}^{1} \times U_{0i}(t) + W_{zi}^{1} \times U_{1i}(t-1)\right) \\ U_{0i}(t) = F_i(U_i(t)) \end{cases} \qquad (2)$$

where $F_i$ denotes the data pretreating process and is a one-to-one mapping, $\varphi_{ji}$ is the activation function of the $j$th layer ($j = 1, 2, 3$; $i = 1, 2, \cdots, N$) as mentioned in Ref.[23], and $U_i(t)$ is the function of $y(t)$ and $\hat{x}_i(t)$ which is denoted as follows.

$$U_i(t) = \left(y^T(t), y^T(t-1), \cdots, y^T(t-r), x_i(t-1), \cdots, x_i(t-k)\right)^T \qquad (3)$$

Fig.1 and Eq.(2) denote the training model and the estimation model used for the single OD flow.

2. Multiple OD flow model

Using the above single OD flow model, we design a multiple OD flow model in Fig.2. Fig.2 indicates that the multiple OD flow model is composed of single OD flow models. According to Fig.2, we can derive the following equation:

$$\begin{cases} \hat{x}_1(t) = \varphi_{31}\left(W_{f1}^{3} \times U_{21}(t)\right) \\ U_{21}(t) = \varphi_{21}\left(W_{f1}^{2} \times U_{11}(t) + W_{z1}^{2} \times U_{21}(t-1)\right) \\ U_{11}(t) = \varphi_{11}\left(W_{f1}^{1} \times U_{01}(t) + W_{z1}^{1} \times U_{11}(t-1)\right) \\ U_{01}(t) = F_1(U_1(t)) \\ \cdots \\ \hat{x}_N(t) = \varphi_{3N}\left(W_{fN}^{3} \times U_{2N}(t)\right) \\ U_{2N}(t) = \varphi_{2N}\left(W_{fN}^{2} \times U_{1N}(t) + W_{zN}^{2} \times U_{2N}(t-1)\right) \\ U_{1N}(t) = \varphi_{1N}\left(W_{fN}^{1} \times U_{0N}(t) + W_{zN}^{1} \times U_{1N}(t-1)\right) \\ U_{0N}(t) = F_N(U_N(t)) \end{cases} \qquad (4)$$

where $U_1(t), U_2(t), \cdots, U_N(t)$ satisfy Eq.(3). "Data posttreating" in Fig.2 means making the estimation results satisfy Eq.(1). Additionally, "Training model" and "Estimation model" denote the processes used for training and estimation, respectively.

Assume the value $x(t) = (x_1(t), x_2(t), \cdots, x_N(t))^T$ is obtained from Eq.(4). Then the "Estimation model" can be formulated as:

$$\begin{cases} \min \|x(t)\|_2 \\ \text{s.t.}\ \ y(t) = A x(t) \\ \qquad x_i(t) \ge 0,\ i = 1, 2, \cdots, N \end{cases} \qquad (5)$$

III. Estimation Algorithm

Now, we formulate the proposed estimation method, including model training and TM estimation. Model training is to determine all weights and parameters in the above presented estimation model by supervised learning, in which we will discuss the training method for RMLP and analyze the single OD flow model training and the multiple OD flow model training.
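To make the recurrences of Eq.(2) concrete, the sketch below runs two time steps of a single OD flow model with two fully recurrent hidden layers. It is an illustration under assumptions that the paper does not specify: `tanh` hidden activations, a linear output layer, an identity "data pretreating" map $F_i$, toy layer sizes, hand-picked weights, and the names `mat_vec` and `rmlp_step`. The weights are untrained, so the outputs carry no meaning as traffic estimates.

```python
import math

def mat_vec(W, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rmlp_step(u0, state, weights):
    """One time step of Eq.(2): returns the output and the new hidden state.

    u0      -- pretreated input vector U_0i(t)
    state   -- (U_1i(t-1), U_2i(t-1)), the previous hidden-layer outputs
    weights -- (Wf1, Wz1, Wf2, Wz2, Wf3): forward and feedback weight matrices
    """
    Wf1, Wz1, Wf2, Wz2, Wf3 = weights
    u1_prev, u2_prev = state
    # First hidden layer: forward input plus its own delayed output.
    u1 = [math.tanh(a + b) for a, b in zip(mat_vec(Wf1, u0), mat_vec(Wz1, u1_prev))]
    # Second hidden layer: fed by the first layer plus its own delayed output.
    u2 = [math.tanh(a + b) for a, b in zip(mat_vec(Wf2, u1), mat_vec(Wz2, u2_prev))]
    # Output layer (linear activation is an assumption): one OD flow estimate.
    x_hat = mat_vec(Wf3, u2)[0]
    return x_hat, (u1, u2)

# Toy setting: L = 2 link loads, r = 0, k = 1, so U_i(t) = (y(t), x_i(t-1)).
weights = (
    [[0.5, -0.2, 0.1], [0.3, 0.4, -0.1]],   # Wf1: input -> hidden layer 1
    [[0.1, 0.0], [0.0, 0.1]],               # Wz1: feedback of hidden layer 1
    [[0.6, -0.3], [0.2, 0.5]],              # Wf2: hidden layer 1 -> hidden layer 2
    [[0.2, 0.0], [0.0, 0.2]],               # Wz2: feedback of hidden layer 2
    [[1.0, 1.0]],                           # Wf3: hidden layer 2 -> output
)
u0 = [1.0, 0.5, 0.2]                        # identity pretreating assumed
state = ([0.0, 0.0], [0.0, 0.0])            # zero initial hidden state

x1, state = rmlp_step(u0, state, weights)
x2, state = rmlp_step(u0, state, weights)   # same input, different hidden state
print(x1, x2)
```

Feeding the same input twice yields two different outputs because the feedback terms $W_{zi}^{k} \times U_{ki}(t-1)$ carry the hidden state forward, which is how the model captures temporal correlation.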

We then address TM estimation, including the single OD flow estimation and the multiple OD flow estimation.

1. Model training

Before RMLP is used to estimate OD flows or TM, it should first be trained successfully. In contrast to a feedforward network, RMLP is more difficult to train, because it is a recurrent network[23]. Furthermore, RMLP exhibits the recency effect, namely the tendency for recent weight updates to cause a network to forget what it has learned in the past[23]. In particular, there exist many more links and OD flows in a large-scale IP network. Thus, network training needs to handle more data and leads to more complex problems. Due to the limitation of space, the training algorithms related to RMLP are not discussed here; they can be found in Refs.[23, 25].

Fig. 2. Block-diagram representation of modified RMLP model used for TM, with 2 hidden layers fully recurrent, and with an output layer one-step prediction, and with N RMLP independent and N estimated OD flows attained simultaneously

1. Single OD flow model training

For the single OD flow model denoted in Fig.1 and Eq.(2), we exploit the RMLP training method to train it with input-output data pairs $\langle y(t), x_i(t) \rangle$ in the supervised learning way. Based on the modeling abilities of RMLP, after training our single OD flow model with enough samples, the weights $W_{zi}^{k}$ and $W_{fi}^{m}$ in Fig.1 and Eq.(2) can be determined accurately. Our single OD flow model is established uniquely once these weights are successfully obtained.

According to Eqs.(2) and (3), when the single OD flow model training is over and the corresponding weights are decided, the following equation is obtained:

$$x_i(t) = g_i(y(t)), \quad i = 1, 2, \cdots, N \qquad (6)$$

where $y(t)$ is link loads and $g_i$ denotes a mapping from $\mathbb{R}^L$ to $\mathbb{R}^1$ corresponding to every OD flow $i$. Accordingly, Eq.(6) represents the transform from the input space of $L$ dimensions to the output space of one dimension. Up to now, the model of a single OD flow in a large-scale backbone network is successfully established.

2. Multiple OD flow model training

For the multiple OD flow model denoted in Fig.2 and Eq.(4), likewise, we exploit the RMLP training method to train it with input-output data pairs $\langle y(t), (x_1(t), x_2(t), \cdots, x_N(t))^T \rangle$ in the supervised learning way. After training our multiple OD flow model with enough samples, the weights $W_{zi}^{k}$ and $W_{fi}^{m}$ in Fig.2 and Eq.(4) can be determined accurately. Due to the parallelism of our multiple OD flow model, each sub-model for a single OD flow can be trained independently. Therefore, we may let each sub-model learn traffic information from a large amount of sample data simultaneously in order to save training time. Once these weights are successfully obtained, our multiple OD flow model is established uniquely.

By Fig.2 and Eq.(4), we can also decide the corresponding model by training. Furthermore, when model training is accomplished, we can obtain the following equation:

$$x(t) = g(y(t)) \qquad (7)$$

where $y(t)$ is link loads, $x(t)$ is TM, and $g$ denotes a mapping from $\mathbb{R}^L$ to $\mathbb{R}^N$. Similarly, Eq.(7) denotes the transform from the input space of $L$ dimensions to the output space of $N$ dimensions of TM $x(t)$. Thereby, the model of TM in a large-scale backbone network is successfully built by Eq.(7).

2. Traffic matrix estimation

Now we analyze TM estimation with the above built model, including single OD flow estimation and TM estimation. As for single OD flow estimation, according to Eq.(6), input link loads $y(t)$ into the model denoted in Fig.1, and then one can quickly obtain the single OD flow estimation:

$$\hat{x}_i(t) = g_i(y(t)) \qquad (8)$$

where $i = 1, 2, \cdots, N$. By Eq.(7), one can successfully perform TM estimation and obtain the estimation result

$$\hat{x}(t) = g(y(t)) \qquad (9)$$

However, as discussed above, TM satisfies Eq.(1), and each of its elements should also be nonnegative. Hence, the resulting estimation is the optimal solution satisfying Eq.(5). In terms of Eqs.(5) and (7), the resulting estimation of TM can be denoted as:

$$\begin{cases} \min \|\hat{x}(t)\|_2 \\ \text{s.t.}\ \ y(t) = A \hat{x}(t) \\ \qquad \hat{x}_i(t) \ge 0,\ i = 1, 2, \cdots, N \end{cases} \qquad (10)$$

Up to now, we have discussed our TM estimation method. Our method is based on the modified RMLP, and sufficiently takes into account the temporal and spatial correlations of TM. Therefore, we refer to it as the RMLP-based spatio-temporal estimation (RSTE) method. RSTE has many advantages. Firstly, as discussed above, we use the multi-stream NDEKF training method based on BPTT(h) to train the modified RMLP network. The network weights are grouped by nodes, with the result that the number of groups is the same as the number of nodes. This method has obvious advantages when training on personal computers with limited speed and memory. Secondly, RSTE can perform parallel data processing at a fast speed. As shown in Fig.2, our model has a parallel structure. Hence, it can perform the training and prediction of OD flows in a parallel way and at a faster speed. Thirdly, RSTE is scalable. Due to the parallel structure of our model, it can be scaled to larger networks. Furthermore, every OD flow may be trained individually, and thus although the size of the network is increased, this does not impact the speed of training. It can also make quick estimations in the parallel form. Finally, RSTE can sufficiently capture TM's characteristics and perform accurate estimation. On the one hand, it exploits RMLP to model the large-scale IP TM. On the other hand, for the purpose of accurate estimation, it sufficiently considers the spatio-temporal correlations of link loads and OD flows in the large-scale IP network. Generally speaking, RMLP can describe the characteristics of OD flows (or TM) after it is trained with input-output data pairs. RMLP can then be used to estimate their values at other times. Because RMLP has the capability of learning and generalizing, its estimates are often accurate. In particular, OD flows (or TM) in a large-scale IP network generally hold a daily pattern, a weekly pattern and even a monthly pattern. Consequently, as long as the data set used for training is sufficient, RMLP can well capture the properties of OD flows (or TM) by learning, and simultaneously can generalize. As a result, it can also describe characteristics that the training data set does not reflect, and correctly predict the corresponding OD flow (or TM).

However, the following things should be noted. Firstly, the network input principle used for predicting should be consistent with that used for training, i.e. the number and type of input data set must be consistent.

Step 7 If $\varepsilon < \delta$ or $k > T$, save the network weights to the file and exit the training, or set $k = k + 1$ and go back to Step 2.

Algorithm 2
Step 1 By Algorithm 1, get the network weights and initialize the network.
Step 2 Present input data to the networks and make pretreatment.
Step 3 By the network models, estimate TM.
Step 4 If the process is over, then save the estimations to the file and exit, or go back to Step 2.

Below we summarize the complete RSTE method:
Step 1 By problem requirements, determine and construct the network model as denoted in Fig.1 (or Fig.3 or Fig.4).
Step 2 By Algorithm 1, train this network model.
Step 3 In terms of Algorithm 2, estimate TM.
Step 4 Save the estimation results and exit.

IV. Simulation Results and Analysis

In this section, we use real data from the Abilene backbone network to validate our method RSTE, evaluating our model, analyzing estimation errors and discussing RSTE's stability and performance. Traffic data from the Abilene network are collected in the 5-minute sampling interval by flow tool. Since TomoGravity (TomoG)[6], PCA[3] (Principal component analy-
In particular, the time slot re- sis) estimation , and SRSVD (Sparsity regularized singular [10] lation of input data is also to remain consistent in the net- value decomposition) are reported as the accurate and ac- work architecture in Figs.1, 2. Secondly, because OD flow is a ceptable methods for TM estimation, RSTE will be compared time sequence, it often shows the temporal correlations. And with them. We use the 4000-point real data from the Abi- thereby the order of input data used for training and predict- lene to simulate performance of four algorithms, respectively. ing is very important. To make the correct estimations, input To analyze their performance, the first 2000-point real data data should keep original order. Finally, sometimes need to from the Abilene, respectively, are used to construct the mod- consider the priming process. Because RMLP network has the els corresponding to them, while the rest of data are exploited internal state variables, it should be initialized correctly. If the to test four methods. According to the above presented net- time slot to predict closely follows that of training, one can di- work models, by analyzing comparatively, our models choose rectly make the prediction. Otherwise, the priming process the 15 lags and then use the 660-6R-8R-1L RMLP structure should be performed to initialize the internal state variables. for each part of our models, i.e. our models holds 660 inputs, 6 3. Algorithm and 8 neural cells in the first and second full recurrent hidden In this section, we proposed the two algorithms about level, respectively, and 1 neural cell in the linear output level. RSTE method, Algorithm 1 and Algorithm 2. Algorithm 1 Besides, we use the same 144 RMLP structures to form the is used for training process, and Algorithm 2 is used for pre- parallel RMLP model for the whole estimation of TM. dicting process. 1. 
Comparative analysis In this subsection, we will analyze comparatively TM es- Algorithm 1 timation of four different methods in the Abilene. Here we Step 1 Initialize the whole network model represented will regard other three methods as the baseline to be com- by Fig.1 (or Fig.2 or Fig.4). Present the training data pairs pared with RSTE. Fig.3 denotes estimation results of ODs 80, to the network and make the priming training. Set the error 95, and 119 in the Abilene with these four methods. The first bound δ, total iterative steps T and k =0. 2000-point sample data are used for the responding approaches Step 2 Present the training data pairs to the network. to establish their own models, while the other 2000-point data Pretreat the data and train the network by using data pre- are employed to test their performance. Fig.3 represents sev- treated. eral kinds of typical properties of OD flows in the Abilene, Step 3 By the network model, pass the input data to e.g. extremely time-varying or dynamic changes, burst or dra- x the network and get the estimation ˆi. matic changes in time, stationary nature such as period vari- Step 4 Use BPTT(h) to make the gradient calculation. ety, and so forth. From Fig.3, we can see that, in the Abilene, Step 5 If multi-stream training is over, then go to Step though the rate of traffic is up to 10e7 orders of magnitude, 6, or go back to Step 2. all four methods can capture traffic change tendency in time. Step 6 Use NDEKF method to update the network Moreover, in contrast to other three methods, RSTE can pre- weights. And then calculate the total error ε. cisely track real traffic. PCA can also track these dynamic An Approximate Approach to End-to-End Traffic in Communication Networks 709 changes, but fluctuate largely near real traffic. TomG and SRSVD can not keep up with change intensity of traffic and consequently yield the under-estimation or over-estimations. 
From Fig.3(b), it is easily found that though OD flows ex- hibit the obvious burst or dramatic changes, four methods can exactly track their tendency of changes and RSTE can more accurately capture this type of change than others. Thereby, in contrast to other three methods, RSTE can perform the more accurate estimation for TM. This shows that RSTE is reasonable and practical. Additionally, Fig.3 also show that both RSTE and TomG can more stably estimate OD flows compared with other two algorithms. SRSVD is more stable than PCA. This implies that RSTE and TomG can have the Fig. 5. Relative error CDF of first 2000 data points estimated. stronger ability to capture inherent nature of TM, whereas the (a) x =L2 norm, Spatial relative errors; (b) x =L2 larger fluctuation happens to PCA and SRSFM. This further norm, Temporal relative errors suggests that RSTE can obtain the more accurate estimation 2000 data points estimated. Fig.4(a) shows that the mean of TM. SREs of RSTE, TomoG, PCA, and SRSVD are 0.52, 0.73, 0.86, and 0.80, respectively. Hence, RSTE’s SREs are also the lowest in four methods, the second is TomoG, and then next is PCA and SRSVD in turn. Fig.4(b) indicates that the mean TREs of RSTE, TomoG, PCA, and SRSVD are 0.19, 0.24, 0.29, and 0.33, respectively. Thus, RSTE’s TREs are still the lowest in four methods, the second is TomoG, and then next is PCA and SRSVD in turn. In short, RSTE holds the low- est spatial and temporal relative estimation errors, i.e., it can make the accurate estimation. To evaluate the estimation performance of different meth- ods, we also examine the Cumulative distribution functions (CDFs) of their SREs and TREs. For first 2000 estimations, from Fig.5(a), we find that, for RSTE, TomoG, PCA, and SRSVD, still about 81%, 72%, 58%, and 63% of OD flows Fig. 3. Estimation results of ODs 80, 95, and 119, true in blue, hold SREs less than 0.8, respectively. 
Fig.5(b) shows that for RSTE in red, TomoG in green, SRSVD in pink, PCA RSTE, TomoG, PCA, and SRSVD, about 86%, 59%, 12%, in sky-blue and 15% of measurement time slots hold TREs less than 0.25, respectively. Thereby, this further confirms RSTE owns the 2. Evaluation of estimation errors lowest relative estimation errors in all four methods and have the most accurate estimations. Additionally, from Fig.4(a), we can see that RSTE has Now, we further discuss estimation errors of four meth- more lower estimation errors for small and large OD flows in ods. Here, we exploit the Spatial relative errors (SREs) and contrast with other three methods. This suggests that RSTE the Temporal relative errors (TREs) to make the comparative can not only estimate the larger OD flows, but can also esti- analysis. Fig.4 shows SREs and TREs corresponding to first mate the smaller OD flows. Besides, from Fig.4(b), we can find that TREs of four methods are relative stable in time. How- ever, in constrast, TomoG is the most stable, next is RSTE, the third is SRSVD, and the last is PCA. Thus RSTE also owns the better stability.
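The error measures used in this subsection can be sketched in a few lines. The definitions below are an assumption, since this section does not state them: the SRE of an OD flow is taken as the L2 norm of its estimation error over all time slots divided by the L2 norm of the true flow, and the TRE of a time slot is the same ratio taken over all OD flows. The function names and the `fraction_below` helper (which mimics reading one point off a CDF such as Fig.5) are hypothetical.

```python
import math


def l2_norm(values):
    """Euclidean (L2) norm of a sequence of numbers."""
    return math.sqrt(sum(v * v for v in values))


def spatial_relative_errors(true_tm, est_tm):
    """Assumed SRE: one value per OD flow, norms taken over time.

    true_tm[t][n] and est_tm[t][n] hold flow n at time slot t.
    """
    num_flows = len(true_tm[0])
    errors = []
    for n in range(num_flows):
        diff = [est[n] - tru[n] for tru, est in zip(true_tm, est_tm)]
        truth = [tru[n] for tru in true_tm]
        errors.append(l2_norm(diff) / l2_norm(truth))
    return errors


def temporal_relative_errors(true_tm, est_tm):
    """Assumed TRE: one value per time slot, norms taken over flows."""
    errors = []
    for tru, est in zip(true_tm, est_tm):
        diff = [e - x for x, e in zip(tru, est)]
        errors.append(l2_norm(diff) / l2_norm(tru))
    return errors


def fraction_below(errors, threshold):
    """Empirical CDF value: share of errors strictly below threshold."""
    return sum(1 for e in errors if e < threshold) / len(errors)


# Toy data: 2 time slots, 2 OD flows (not Abilene data).
true_tm = [[3.0, 4.0], [3.0, 4.0]]
est_tm = [[3.0, 0.0], [3.0, 8.0]]
sre = spatial_relative_errors(true_tm, est_tm)
tre = temporal_relative_errors(true_tm, est_tm)
```

With real data, `fraction_below(sre, 0.8)` and `fraction_below(tre, 0.25)` would correspond to the kind of percentages quoted for Fig.5(a) and Fig.5(b), provided the paper's definitions match the ones assumed here.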

Fig. 4. Relative errors of first 2000 data points estimated. (a) Flow ID, from smallest to largest in mean; (b) Time slot order

V. Conclusions

In this paper, we investigated TM estimation in large-scale backbone networks. To overcome the highly ill-posed nature of this problem and accurately capture the inherent characteristics of the TM, we modified the traditional RMLP to make it better suited to large-scale TM estimation. At the same time, the link loads and the TM before the current time slot were introduced into the RMLP's inputs to further capture its spatio-temporal correlations. By exploiting RMLP's modeling ability, we established our models, including the single and multiple OD flow models and the TM model. Under the condition that the TM and link loads satisfy a linear time-varying system model, training our models with input-output data pairs overcame the highly ill-posed nature of the TM estimation problem in the process of model training. Moreover, based on RMLP, our models could not only learn the characteristics of the TM used for training, but could also infer and generalize properties of the TM not used for training. Once model training was terminated, all parameters of our models were determined and the models were constructed. Besides, we formulated TM estimation as a constrained optimization process to further overcome the ill-posed nature of this problem.

Finally, to validate our method, we conducted a series of tests with data from a real backbone network to evaluate our models, analyze estimation errors, and discuss the stability and performance of our method. Simulation results show that, in contrast to previous methods, our method is more accurate, stable, and effective.

References

[1] A. Soule, F. Silveira, H. Ringberg, C. Diot, “Challenging the supremacy of traffic matrices in anomaly detection”, in Proc. of ACM IMC, 2007.
[2] A. Soule, A. Lakhina, N. Taft et al., “Traffic matrices: Balancing measurements, inference and modeling”, in Proc. of ACM SIGMETRICS, 2005.
[3] A. Lakhina, K. Papagiannaki, M. Crovella, C. Diot, E. Kolaczyk, N. Taft, “Structural analysis of network traffic flows”, in Proc. of ACM SIGMETRICS, 2004.
[4] L. Guo, “LSSP: A novel local segment shared protection for multi-domain optical mesh networks”, Computer Communications, Vol.30, pp.1794–1801, June 2007.
[5] K. Papagiannaki, N. Taft, A. Lakhina, “A distributed approach to measure traffic matrix”, in Proc. of ACM IMC, 2004.
[6] D. Jiang, Z. Xu, Z. Chen et al., “Joint time-frequency sparse estimation of large-scale network traffic”, Computer Networks, Vol.55, No.10, pp.3533–3547, 2011.
[7] L. Guo, J. Cao, H. Yu et al., “Path-based routing provisioning with mixed shared protection in WDM mesh networks”, Journal of Lightwave Technology, Vol.24, pp.1129–1141, Mar. 2006.
[8] Y. Zhang, M. Roughan, C. Lund, D. Donoho, “An information theoretic approach to traffic matrix estimation”, in Proc. of ACM SIGCOMM, 2003.
[9] Y. Zhang, M. Roughan, N. Duffield, A. Greenberg, “Fast accurate computation of large-scale IP traffic matrices from link loads”, ACM SIGMETRICS Performance Evaluation Review, Vol.31, No.3, pp.206–217, 2003.
[10] J. Ni, S. Tatikonda, E.M. Yeh, “A large-scale distributed traffic matrix estimation algorithm”, in Proc. of IEEE Globecom, 2006.
[11] A. Soule, K. Salamatian, A. Nucci, N. Taft, “Traffic matrix tracking using Kalman filtering”, LIP6 Research Report RP-LIP6-2004-07-10, LIP6, 2004.
[12] Y. Zhang, M. Roughan, W. Willinger, L. Qiu, “Spatio-temporal compressive sensing and Internet traffic matrices”, in Proc. of ACM SIGCOMM, 2009.
[13] I. Juva, “Sensitivity of traffic matrix estimation techniques to their underlying assumption”, in Proc. of IEEE ICC, 2007.
[14] S. Stoev, G. Michailidis, J. Vaughan, “Global modeling of backbone network traffic”, in Proc. of IEEE Globecom, No.13, 2010.
[15] K.V. Vishwanath, A. Vahdat, “Swing: Realistic and responsive network traffic generation”, IEEE/ACM Transactions on Networking, Vol.17, No.3, pp.712–725, 2009.
[16] A. Nucci, R. Cruz, N. Taft, C. Diot, “Design of IGP link weight changes for estimation of traffic matrix”, in Proc. of IEEE Infocom, 2004.
[17] A. Soule, A. Nucci, E. Leonardi, R. Cruz, N. Taft, “How to identify and estimate the largest traffic matrix elements in a dynamic environment”, in Proc. of ACM Sigmetrics, 2004.
[18] A. Gunnar, M. Johansson, T. Telkamp, “Traffic matrix estimation on a large IP backbone: A comparison on real data”, in Proc. of ACM IMC, 2004.
[19] V. Erramilli, M. Crovella, N. Taft, “An independent-connection model for traffic matrices”, in Proc. of ACM IMC, 2006.
[20] Dingde Jiang, Zhengzheng Xu, Hongwei Xu, Yang Han, Zhenhua Chen, “An approximation method of origin-destination flow traffic from link load counts”, Computers and Electrical Engineering, Vol.37, No.6, pp.1106–1121, Nov. 2011.
[21] D. Jiang, J. Chen, L. He, “An accurate approach of large-scale IP traffic matrix estimation”, IEICE Transactions on Communications, Vol.E90-B, No.12, pp.3673–3676, 2007.
[22] D. Jiang, X. Wang, L. Guo, “Mahalanobis distance-based traffic matrix estimation”, European Transactions on Telecommunications, Vol.21, No.3, pp.195–201, 2010.
[23] S. Li, “Wind power prediction using recurrent multilayer perceptron neural networks”, IEEE Power Engineering Society General Meeting, Vol.4, pp.13–17, 2003.
[24] K. Hornik, M. Stinchcombe, H. White, “Multilayer feedforward networks are universal approximators”, Neural Networks, Vol.2, pp.359–366, 1989.
[25] D. Jiang, G. Hu, “Large-scale IP traffic matrix estimation based on the recurrent multilayer perceptron network”, in Proceedings of the IEEE International Conference on Communications, Beijing, China, pp.366–370, May 2008.

JIANG Dingde received the Ph.D. degree in communication and information systems from the School of Communication and Information Engineering, University of Electronic Science and Technology of China, Chengdu, China, in 2009. He is currently an Associate Professor in the College of Information Science and Engineering, Northeastern University, Shenyang, China. His research interests include network measurement, network security, Internet traffic engineering, and communication networks. He is a member of IEEE and IEICE. (Email: [email protected])

XU Zhengzheng is currently working toward the Ph.D. degree in management science and engineering in the College of Information Science and Engineering, Northeastern University, Shenyang, China. She is currently a Research Member at the Key Laboratory of Comprehensive Automation of Process Industry of Ministry of Education, College of Information Science and Engineering, Northeastern University, Shenyang, China. She is also presently a Research Member of the Systems Engineering Research Institute at the same university. Her research interests include supply chain and logistics management, decision analysis, modeling, and optimization.

NIU Laisen is currently working toward the Ph.D. degree in communication and information systems in the College of Information Science and Engineering, Northeastern University, Shenyang, China. He is currently a Research Member at the Communication and Information Systems Institution, College of Information Science and Engineering, Northeastern University, Shenyang, China. His research interests include network measurement and cognitive network.

LIU Jindi received the B.S. degree from the College of Information Science and Engineering, Northeastern University, Shenyang, China, in 2011. She is currently working toward the M.S. degree in communication and information systems at the same university. Her research interests include network measurement and cognitive network.

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

A Novel Covert Timing Channel Based on RTP/RTCP ∗

YING Lizhi, HUANG Yongfeng, YUAN Jian and LINDA Yunlu Bai (Institute of Information Cognition and Intelligence System, Department of Electronic Engineering, Tsinghua University, Beijing 100084, China)

Abstract — A covert timing channel utilizes temporal features to transfer secret messages. This paper proposes a framework of network covert timing channels based on information theory to measure their capacity and efficiency. Different from many former channels constructed upon a single transport layer protocol, the paper designs and implements a novel covert timing channel based on RTP and RTCP, which both belong to the application layer. The channel utilizes Run Length Code and Multi-Zero Code to improve imperceptibility and robustness. The evaluation and test results show that the channel is well suited to covert communication, for it is quite robust against jitter and packet loss and has high imperceptibility.

Key words — Steganography, Covert timing channel, Real-time Transport Protocol/RTP Control Protocol (RTP/RTCP), Information theory.

I. Introduction

Steganography, known as the technology of hiding information in a carrier and making it latent, is becoming increasingly important in the area of information security. In this paper, a research effort within the framework of a covert timing channel based on the Real-time Transport Protocol (RTP) and the RTP Control Protocol (RTCP) is described.

A covert channel is “A communication channel that can be exploited by a process to transfer information in a manner that violates the system security policy”[1]. There are two kinds of covert channels, the covert timing channel and the covert storage channel. Most former works on designing covert timing channels are based on the observation of different transmission intervals of a single network stream. Our work focuses on a covert timing channel based on RTP and RTCP, which both belong to the application layer.

The rest of the paper is organized as follows: In Section II, related works on steganography based on VoIP (Voice over IP) are introduced; in Section III, the design and implementation of the covert timing channel are presented; in Section IV, the proposed method is evaluated with test results; finally, the conclusion and directions of future work are given in Section V.

II. Related Works

A number of studies have been done on covert timing channels, especially on TCP/IP. For example, S. Zander gave an overview of covert timing channels and divided previous works into two categories: “packet rate timing” and “message sequence timing”[3]. Serdar described an implementation of a covert network timing channel on TCP/IP[4]. Steven Murdoch proposed a way to embed a covert channel into TCP/IP together with a detection method[5]. Sarah et al. proposed a non-detectable timing channel scheme using packet inter-transmission times, and their work achieved significant performance improvement[6]. Liu Xiong gave a method to analyze a typical covert timing channel with uniformly distributed noise, and used some formulas to calculate the channel's capacity[7].

Covert channels built on the application layer have more advantages than those on the transport layer, for example that the data will not be checked or modified by intermediate devices. Ref.[8] suggested an adaptive steganography scheme in G.711 speech of VoIP. This method adaptively used different algorithms to embed secret messages in different blocks of the VoIP speech stream according to the block characteristics. Huang Yongfeng presented a model of covert communication based on VoIP, and pointed out the significance of key exchange to the covert communication[9]. All these works give us many inspirations for building a covert communication system based on RTP/RTCP.

In this paper, characteristics of RTP/RTCP are utilized to construct a covert timing channel. Theoretical analysis shows that this timing channel looks like a natural VoIP flow, while in fact it is easy to implement and hard to detect.

III. Framework and Implementation Techniques

1. Framework of the covert timing channel

Using information theory to measure the bandwidth, a framework of the RTP/RTCP covert timing channel is proposed, which also applies to all the network protocols.

∗Manuscript Received June 2011; Accepted Mar. 2012. This work is supported in part by the National Natural Science Foundation of China (No.60703053, No.60773140), and Independent Scientific Research Programs of National Education Ministry (No.20111081023).

A single network stream is mainly composed of two factors: packets and time intervals. In a covert timing channel, only temporal features are used, while packet content is not modified. The combination of packets and time intervals is used to transfer covert information. Define a set of different types of packets as $B = \{b_0, b_1, \cdots, b_m \mid b_0 = \varnothing\}$, and a set of different time intervals as $T = \{t_0, t_1, \cdots, t_m \mid t_0 = \varnothing\}$. The symbol set $S$ of the information source is formed from $B$ and $T$ as the product $S = B \times T$, which contains all the possible ordered pairs, as Eq.(1) shows:

$$S = B \times T = \{(b_i, t_j) \mid b_i \in B;\ t_j \in T\} \tag{1}$$

Denote by $s_k = (b_i, t_j)$ a tuple, i.e. an ordered pair. Elements of $S$ will be chosen as source codes. According to the practical features of a given network flow, the appearance of each tuple $s_k$ is assigned a probability value $P_k$. Then the entropy $H(S)$ is calculated by Eq.(2):

$$H(S) = -\sum_{k=1}^{|S|} P_k \log P_k \tag{2}$$

$H(S)$ is the covert timing channel's capacity. Since RTP/RTCP are application layer protocols carried by UDP, time intervals are hard to control. Generally, the temporal feature set of RTP/RTCP packets is treated as $\{\varnothing\}$, so we get expression (3):

$$S_{rtp,rtcp} = \{s_1, s_2\} = \{RTP, RTCP\} \tag{3}$$

To make this covert channel more inconspicuous, the combination of $s_1$ and $s_2$ is fully utilized to implement a covert timing channel. Furthermore, tuples of $S_{rtp,rtcp}$ can be combined as in Eq.(4):

$$Q = \{q_{00}, q_{01}, q_{10}, \cdots, q_{ij}, \cdots, q_{MN} \mid q_{ij} = (i(RTP), j(RTCP)),\ 0 \le i \le M,\ 0 \le j \le N\} \tag{4}$$

2. Methods of implementation

A method called Directly represent code (DRC) is proposed in this paper, where RTP and RTCP packets indicate bits 0 and 1 respectively. Eq.(5) gives the symbol set $Q_{DRC}$:

$$Q_{DRC} = \{q_{10}, q_{01} \mid 1(RTP), 1(RTCP)\} \tag{5}$$

As RFC3550 defines the RTCP transmission interval to be random[2], we assume that $p(s_1) = 4/5$ and $p(s_2) = 1/5$, and then $H(Q_{DRC})$ equals 0.7219. But in DRC, the sender may transmit several consecutive RTCP packets, which affects conversation quality. To make the model more intuitive and practical, Run length code (RLC) is introduced. During transmission, if the current bit is the same as the previous one, an RTP packet is sent; otherwise an RTCP packet and an RTP packet are both sent. This turns out to be a Discrete stability constant Markov (DSCM) source (see Fig.1). According to the previous assumption, $p(q_{10}) = 3/4$ while $p(q_{11}) = 1/4$.

Fig. 1. State transition diagram of DSCM source

Similar to Eq.(2), the Maximum source entropy rate (MSER) of RLC can be analyzed by Eq.(6):

$$H_\infty(Q_{RLC}) = \sum_{j=1}^{2} P(j)\, H(Q_{RLC} \mid q = j) = 0.81 \tag{6}$$

Besides entropy, another definition is given to evaluate the efficiency of different algorithms.

Definition 1 The Average symbol length (ASL) is defined as the average number of packets needed to code one bit in a covert timing channel, denoted as $\bar{l}$.

If the appearance probability of each tuple $q_{ij}$ in $Q$ is $p(q_{ij}) = p_{ij}$, $\bar{l}_Q$ is calculated as in Eq.(7):

$$\bar{l}_Q = \sum_{i=0}^{M} \sum_{j=0}^{N} p_{ij} \times (i + j) \tag{7}$$

Although $H_\infty(Q_{RLC}) > H(Q_{DRC})$, the ASL of RLC is bigger. Considering time intervals, Definition 2 is presented to measure transmission efficiency.

Definition 2 The transmission efficiency $V_{max}$ is defined as the rate in bits per second:

$$V_{max} = \frac{H(S)}{T_{rate} \cdot \bar{l}_{symbol}} \ \text{(bit/s)} \tag{8}$$

Simply assuming that the RTP/RTCP sending interval is 20 ms per packet, and calculating the transmission efficiency of RLC and DRC, we get $V_{RLC\,max} = 30$ bits/s and $V_{DRC\,max} = 36$ bits/s. The theoretical capacity is sufficient for covert communication, but some pretreatment of the secret messages is needed to ensure the covert channel is efficient and practical enough.

3. Synchronization

In order to guarantee synchronization during communication, the binary stream of secret messages is divided into frames. A frame is formed by a $k$-bit frame head and $n$ bits of data.

All the frames are of equal length, which is mainly determined by the packet loss ratio of the current network. If the packet loss ratio is lower, the frame length can be longer to reduce redundancy. To avoid frequently sending RTCP packets, the binary stream can be encoded by Multi-Zero code (MZC) in advance. When every 0 of the stream is encoded into several 0s, the sending rate of RTCP packets decreases, as does the capacity of the timing channel. Besides the advantage of making the covert timing channel harder to detect, MZC can discover coding errors.

IV. Evaluation and Test Results

This section shows the theoretical evaluation and test results of the covert timing channel based on RTP/RTCP. The following criteria are used: bandwidth and robustness. All the evaluation is based on the assumption that an RTP packet is sent every $t$ seconds, and the frame length is $n + 2$ bits. A single 0 is encoded into $m$ 0s by MZC.
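The DRC, RLC and MZC encodings described in Section III can be sketched as follows. This is a minimal illustration, not the authors' implementation. Several details are assumptions: the function names are hypothetical, RLC is assumed to start from a previous bit of 0 and to send the RTCP packet before the RTP packet on a bit change, and the MZC decoder detects errors by requiring every run of 0s to have a length that is a multiple of m. The `entropy` helper reproduces the binary entropy values quoted in the text for the assumed packet probabilities.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution, as in Eq.(2)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mzc_encode(bits, m):
    """Multi-Zero code: every 0 becomes m zeros, every 1 is kept."""
    coded = []
    for b in bits:
        coded.extend([0] * m if b == 0 else [1])
    return coded


def mzc_decode(coded, m):
    """Invert mzc_encode; a zero run not divisible by m signals an error."""
    bits, run = [], 0
    for b in list(coded) + [1]:  # trailing sentinel flushes the last run
        if b == 0:
            run += 1
            continue
        if run % m:
            raise ValueError("zero run length is not a multiple of m")
        bits.extend([0] * (run // m))
        bits.append(1)
        run = 0
    return bits[:-1]  # drop the sentinel


def drc_encode(bits):
    """Directly represent code: RTP packet for bit 0, RTCP for bit 1."""
    return ["RTP" if b == 0 else "RTCP" for b in bits]


def rlc_encode(bits, previous=0):
    """Run length code: RTP if the bit repeats, RTCP then RTP if it flips.

    The initial value of `previous` is an assumption of this sketch.
    """
    packets = []
    for b in bits:
        if b != previous:
            packets.append("RTCP")
        packets.append("RTP")
        previous = b
    return packets


h_drc = entropy([4 / 5, 1 / 5])  # about 0.7219, the H(Q_DRC) quoted above
h_rlc = entropy([3 / 4, 1 / 4])  # about 0.81, the value quoted for Eq.(6)
```

For the bit string 0, 1, 1, 0, `drc_encode` yields two RTCP packets in a row, while `rlc_encode` never emits consecutive RTCP packets, which is the property the text gives as the motivation for RLC.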

The bandwidth of this timing channel is directly proportional to the RTP sending rate in the VoIP session. The bandwidth of the timing channel is calculated as in Eq.(9):

$$BW = \frac{n(m + 1)}{2mt(2 + n)} \ \text{(bit/s)} \tag{9}$$

When $n = 8$, Fig.2 shows the relationship among bandwidth, transmission interval and number of 0s.

Test results of bandwidth and proportion of RTCP packets in this channel are shown in Table 1.

Table 1. Bandwidth and proportion of RTCP packets

No.   Frame length (bits)   Number of 0s   BW (bits/s)   Percentage of RTCP (%)
1     10                    1              25            30.2
2     18                    2              20            19.8
3     10                    2              18            19.8
4     18                    1              28            30.2

The test results show that this timing channel can be utilized to transmit keys or short texts. If higher imperceptibility with lower efficiency is needed, m should be set bigger in MZC, and vice versa.
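The trade-off between m and bandwidth can be explored numerically. The sketch below assumes Eq.(9) reads BW = n(m + 1) / (2mt(2 + n)), with n data bits per frame of n + 2 bits, m zeros per encoded 0, and one RTP packet every t seconds. The function name `mzc_bandwidth` is hypothetical, and the 20 ms interval is borrowed from the sending-rate assumption in Section III. The values are theoretical and are not meant to reproduce the measurements in Table 1.

```python
def mzc_bandwidth(n, m, t):
    """Theoretical bandwidth in bit/s under the assumed reading of Eq.(9).

    n: data bits per frame (frame length is n + 2 bits)
    m: number of 0s that MZC emits for each source 0
    t: RTP sending interval in seconds
    """
    if n <= 0 or m <= 0 or t <= 0:
        raise ValueError("n, m and t must be positive")
    return n * (m + 1) / (2 * m * t * (2 + n))


# Sweep m for n = 8 and a 20 ms RTP interval.
sweep = {m: mzc_bandwidth(8, m, 0.02) for m in (1, 2, 4, 8)}
```

Under these assumptions the bandwidth falls as m grows and as t grows, which matches the qualitative statements made about Fig.2 and about choosing a larger m for higher imperceptibility.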

Fig. 2. Relationships among bandwidth, multi-zero and time interval

When m is constant, the bandwidth increases as the transmission interval of RTP packets decreases.

As to robustness, the two most common causes of errors in this timing channel are packet loss and network jitter. Mechanisms for retransmission and error correction are needed.

Because every RTP packet contains a sequence number, RTP packet loss and jitter can be corrected. The sender can discover RTCP packet loss by reading the Last sender report timestamp (LSR) field of RTCP packets from the receiver, and resend the frame. RTCP packets can also be placed correctly in the RTP packet sequence according to the RTP timestamp. Through these error control mechanisms, the covert channel becomes more robust against jitter and packet loss.

MZC is applied to both DRC and RLC to improve imperceptibility and robustness.

Fig. 3. Percentage of RTCP packets in RLC and DRC

Fig.3 shows the comparison between DRC and RLC. When we transfer messages of equal length (1M, all bits randomized), the percentage of RTCP packets is lower with the RLC method under most conditions, which means that speech quality is more likely to be guaranteed because less bandwidth is used to transfer RTCP packets. If m > 10, the percentage of RTCP packets can be lower than 10%.

V. Conclusion

This paper presents the design, implementation, and evaluation of a novel covert timing channel based on RTP/RTCP. A framework grounded on information theory is proposed to guide the implementation. RLC and MZC are used to increase the stealthiness of this channel. Test results show that this channel is capable of covert communication. For future work, this timing channel could be combined with a storage channel based on RTP or RTCP[8−10]. With different information hiding algorithms of VoIP and RTP/RTCP, a framework of a covert multidimensional space-time channel needs further investigation.

References

[1] G.J. Simmons, “The prisoners' problem and the subliminal channel”, Advances in Cryptology Proceedings of Crypto, Vol.83, pp.51–67, 1983.
[2] RFC 3550:2003, RTP: A Transport Protocol for Real-Time Applications.
[3] S. Zander, G. Armitage and P. Branch, “A Survey of covert channels and countermeasures in computer network protocols”, IEEE Communications Surveys & Tutorials, Vol.9, No.3, pp.44–57, 2007.
[4] S. Cabuk, C.E. Brodley, C. Shields, “IP covert timing channels: Design and detection”, Proc. of ACM CCS'04, Washington, DC, U.S.A, pp.179–187, 2004.
[5] Steven J. Murdoch, Stephen Lewis, “Embedding covert channels into TCP/IP”, Proc. of the 7th International Conference on Information Hiding, Barcelona, Spain, pp.247–261, 2005.
[6] Sarah H. Sellke, Chih-Chun Wang, Saurabh Bagchi, “TCP/IP timing channels: Theory to implementation”, Proc. of IEEE INFOCOM 2009, Rio de Janeiro, Brazil, pp.178–187, 2009.
[7] Liu Xiong, Dai Yiqi, “A typical network covert timing channel with uniformly distributed noise”, Chinese Journal of Electronics, Vol.20, No.4, pp.730–734, 2011.
[8] Miao Rui, Huang Yongfeng, “An approach of covert communication based on the adaptive steganography scheme on voice over IP”, Proc. of IEEE ICC 2011, Tokyo, Japan, pp.1–5, 2011.
[9] Huang Yongfeng, Yuan Jian, Chen Minchao, Xiao Bo, “Key distribution over the covert communication based on VoIP”, Chinese Journal of Electronics, Vol.20, No.2, pp.357–360, 2011.
[10] Linda Yunlu Bai, Huang Yongfeng, Hou Guannan, Xiao Bo, “Covert channels based on jitter field of the RTCP header”, Proc. of IEEE IIHMSP 2008, Harbin, China, pp.1388–1391, 2008.

YING Lizhi is a M.S. candidate in the Department of Electronic Engineering at Tsinghua University, Beijing. His research interests are information hiding and web intelligence. (Email: lzy-[email protected])

YUAN Jian is a Ph.D. candidate in the Department of Electronic Engineering at Tsinghua University, Beijing. His current research interests focus on covert communication.

HUANG Yongfeng (corresponding author) is an associate professor in the Department of Electronic Engineering at Tsinghua University, Beijing. His research interests include P2P, multimedia network and next generation Internet. He has published five books and over 50 research papers on computer network and multimedia communication. (Email: [email protected])

LINDA Yunlu Bai is a Ph.D. candidate in the Department of Electronic Engineering at Washington University, Seattle. Her current research interests focus on analysis of cognitive radio networks.

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

Linear Approximations of Pseudo-Hadamard Transform∗

WANG Bin, WU Chunming and CHANG Yaqing

(Computer Science College, Zhejiang University, Hangzhou 310027, China)

Abstract — In FSE 2003, Johan Wallén proposed efficient log-time algorithms for computing linear approximations of addition modulo $2^n$. He suggested that his algorithms can be generalized to more complex functions such as the Pseudo-Hadamard Transform, but did not present the generalization to the readers. In this paper, we present a formula for computing the linear correlation of the Pseudo-Hadamard Transform.

Key words — Linear approximations, Correlation, Pseudo-Hadamard Transform, Linear cryptanalysis.

I. Introduction

Linear cryptanalysis[1] views the cipher as a relation between the plaintext, the ciphertext and the key, and tries to approximate this relation using linear relations. In this paper, we compute the linear correlation of a special class of practically important mappings, the Pseudo-Hadamard Transform (PHT), which is employed in SAFER++[3] and Twofish[4]. It can be described as

$$f: F_2^n \times F_2^n \to F_2^n \times F_2^n$$

$$f(x, y) = ((x + y) \bmod 2^n,\ (x + 2y) \bmod 2^n)$$

The current paper is a further extension of the methods from Ref.[2] and is based on a fairly simple classification of the linear approximations of the carry function.

Similar results with respect to differential cryptanalysis[5] are discussed in Ref.[6]. Addition modulo $2^n$ with respect to differential cryptanalysis is discussed in Ref.[7]. The simpler case with one addend fixed is considered in Refs.[8, 9] with respect to both linear and differential cryptanalysis.

In the next section, we discuss some preliminaries. In Section III, we extend the classification and derive our results on the Pseudo-Hadamard Transform.

II. Preliminaries

The following standard notation is convenient for discussing these linear approximations.

Let $f: Z_2^n \times Z_2^n \to Z_2^n \times Z_2^n$ and $\alpha, \beta, \gamma', \gamma'' \in Z_2^n$. The correlation is defined by

$$c(\alpha, \beta \xrightarrow{f} \gamma', \gamma'') = \frac{1}{2^{2n}} \sum_{x, y \in F_2^n} (-1)^{\alpha \cdot x \oplus \beta \cdot y \oplus \gamma' \cdot f_1(x, y) \oplus \gamma'' \cdot f_2(x, y)}$$

where $f_1$ and $f_2$ denote the two components of $f$.

Let $u = (u_{n-1}, \cdots, u_0)^T \in F_2^n$ and $x = (x_{n-1}, \cdots, x_0)^T \in F_2^n$ be binary column vectors, and let $u \cdot x = u_{n-1}x_{n-1} \oplus \cdots \oplus u_1 x_1 \oplus u_0 x_0$ denote the standard dot product.

To avoid confusion, we use $\oplus$ and $+$ to denote addition in $F_2^n$ and addition modulo $2^n$, respectively. We furthermore let carry: $F_2^n \times F_2^n \to F_2^n \times F_2^n$ be the carry function for the Pseudo-Hadamard Transform defined by $carry(x, y) = (S(x, y), T(x, y))$, where

$$S(x, y) = x \oplus y \oplus (x + y)$$

$$T(x, y) = x \oplus 2y \oplus (x + 2y)$$

For notational convenience, we define the functions lca, lcc: $(F_2^n)^4 \to [-1, 1]$ (for the linear correlation of PHT and of the carry function, respectively) by

$$lca(\alpha, \beta, \gamma', \gamma'') = c(\alpha, \beta \xrightarrow{PHT} \gamma', \gamma'')$$

$$lcc(\alpha, \beta, \gamma', \gamma'') = c(\alpha, \beta \xrightarrow{carry} \gamma', \gamma'')$$

Since the only nonlinear part of PHT is the carry function, it should be no surprise that the linear properties of PHT completely reduce to those of the carry function. It is trivial to prove that

$$lca(\alpha, \beta, \gamma', \gamma'') = lcc(\alpha \oplus \gamma' \oplus \gamma'',\ \beta \oplus \gamma' \oplus R^1\gamma'',\ \gamma', \gamma'')$$

where $R^1\gamma$ denotes the right shift of $\gamma$ by one bit.

III. Linear Approximations of Carry Function

We will consider lcc as a function of hexadecimal words by writing the linear approximation $(\alpha, \beta \xrightarrow{PHT} \gamma', \gamma'')$ as the hexadecimal word $w = w_{n-1} \cdots w_0$, where $w_i = 8\alpha_i + 4\beta_i + 2\gamma'_i + \gamma''_i$. This defines lcc as a function from the hexadecimal words of length $n$ to the interval $[-1, 1]$. As $n$ varies in the set of nonnegative integers, we obtain a function from the set of all hexadecimal words to $[-1, 1]$. In the same way, we consider lca as a function from the set of all hexadecimal words to $[-1, 1]$.

In the following, we derive a linear representation of lcc. We will use the bracket notation: if $\bullet$ is any statement,

∗Manuscript Received Apr. 2011; Accepted May 2011. This work is supported by the National Natural Science Foundation of China (No.61103200, No.61070157), the National Basic Research Program of China (973 Program) (No.2012CB315903).

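$\{\bullet\} = 1$ when $\bullet$ is true and $\{\bullet\} = 0$ when $\bullet$ is false. Let $e_i \in F_2^n$ be a vector whose $i$th component is 1 and the others are 0. For $x, y \in F_2^n$, $\bar{x}$ denotes the componentwise negation of $x$, $\bar{x}_i = x_i \oplus 1$, and $xy$ denotes the componentwise product of $x$ and $y$, $(xy)_i = x_i y_i$. Let

$$H_1 = \{\alpha_{l-1} = \beta_{l-1} = 0\}, \quad H_2 = \{\alpha_{l-1} = 1, \beta_{l-1} = 0\}$$

$$H_3 = \{\alpha_{l-1} = \beta_{l-1}\}, \quad H_4 = \{\alpha_{l-1} \ne \beta_{l-1}\}$$

Lemma 1 Let $k = \max\{i \mid 1 \le i \le n;\ \gamma'_i = 1 \text{ or } \gamma''_i = 1\}$. When $l \ge k$ and $\alpha_l \ne 0$ or $\beta_l \ne 0$, $lcc(\alpha, \beta, \gamma', \gamma'') = 0$.

Proof Since $\gamma' \cdot S(x, y)$ or $\gamma'' \cdot T(x, y)$ is independent of $x_l$ and $y_l$ when $l \ge k$, we see that $lcc(\alpha, \beta, \gamma', \gamma'') = 0$ whenever $\alpha_l \ne 0$ or $\beta_l \ne 0$ for some $l \ge k$.

Lemma 2 The function lcc is given recursively as follows.

(1)
$$lcc(\alpha, \beta, 0, 0) = lcc(\alpha, \beta, e_0, 0) = lcc(\alpha, \beta, e_0, 0\cdots011) = lcc(\alpha, \beta, 0, 0\cdots011) = \begin{cases} 1, & \text{if } \alpha = \beta = 0 \\ 0, & \text{others} \end{cases}$$

If $\gamma' \notin \{0, e_0\}$, $\gamma'' \notin \{0, e_0, 0\cdots011\}$, let $k = \max\{i \mid 0 \le i < n;\ \gamma'_i = 1 \text{ or } \gamma''_i = 1\}$ and $l \ge k$. Then we have the following results.

(2)
$$lcc(\alpha, \beta, \gamma' \oplus e_{l+1}, \gamma'') = \begin{cases} \frac{1}{2}\, lcc(\alpha\bar{e}_l, \beta\bar{e}_l, \gamma', \gamma''), & \alpha_l \ne \beta_l \\ \frac{1}{2}(-1)^{\alpha_l}\, lcc(\alpha\bar{e}_l, \beta\bar{e}_l, \gamma' \oplus e_l, \gamma''), & \alpha_l = \beta_l \end{cases}$$

(3)
$$lcc(\alpha, \beta, \gamma', \gamma'' \oplus e_{l+1}) = \begin{cases} \frac{1}{2}[lcc(\alpha\bar{e}_l, \beta\bar{e}_l \oplus e_{l-1}, \gamma', \gamma'') + lcc(\alpha\bar{e}_l, \beta\bar{e}_l, \gamma', \gamma'' \oplus e_l)], & \alpha_l = \beta_l = 0 \\ \frac{1}{2}[lcc(\alpha\bar{e}_l, \beta\bar{e}_l, \gamma', \gamma'') + (-1)^{\alpha_l}\, lcc(\alpha\bar{e}_l, \beta\bar{e}_l \oplus e_{l-1}, \gamma', \gamma'' \oplus e_l)], & \alpha_l = 1, \beta_l = 0 \\ 0, & \text{others} \end{cases}$$

(4) When $\alpha_l = \beta_l$,

$$lcc(\alpha, \beta, \gamma' \oplus e_{l+1}, \gamma'' \oplus e_{l+1}) = \frac{1}{4}[lcc(\alpha\bar{e}_l, \beta\bar{e}_l, \gamma', \gamma'') - lcc(\alpha\bar{e}_l, \beta\bar{e}_l \oplus e_{l-1}, \gamma', \gamma'' \oplus e_l) + (-1)^{\alpha_l}\, lcc(\alpha\bar{e}_l, \beta\bar{e}_l \oplus e_{l-1}, \gamma' \oplus e_l, \gamma'') + (-1)^{\alpha_l}\, lcc(\alpha\bar{e}_l, \beta\bar{e}_l, \gamma' \oplus e_l, \gamma'' \oplus e_l)]$$

When $\alpha_l \ne \beta_l$, we get

$$lcc(\alpha, \beta, \gamma' \oplus e_{l+1}, \gamma'' \oplus e_{l+1}) = \frac{1}{4}[lcc(\alpha\bar{e}_l, \beta\bar{e}_l \oplus e_{l-1}, \gamma', \gamma'') + lcc(\alpha\bar{e}_l, \beta\bar{e}_l, \gamma', \gamma'' \oplus e_l) + (-1)^{\beta_l}\, lcc(\alpha\bar{e}_l, \beta\bar{e}_l, \gamma' \oplus e_l, \gamma'') + (-1)^{\alpha_l}\, lcc(\alpha\bar{e}_l, \beta\bar{e}_l \oplus e_{l-1}, \gamma' \oplus e_l, \gamma'' \oplus e_l)]$$

Proof of Eq.(2). Let $S_l(x, y)$ denote the $l$th component of the left part $S$ of the carry function. This function can be recursively computed as $S_0(x, y) = 0$,

$$(-1)^{S_{l+1}(x, y)} = \frac{1}{2}\left((-1)^{x_l} + (-1)^{y_l} + (-1)^{S_l} - (-1)^{x_l + y_l + S_l}\right)$$

Then

$$\begin{aligned}
lcc(\alpha, \beta, \gamma' \oplus e_{l+1}, \gamma'') &= \frac{1}{2^{2n}} \sum_{x, y \in Z_2^n} (-1)^{\alpha \cdot x \oplus \beta \cdot y \oplus (\gamma' \oplus e_{l+1}) \cdot S \oplus \gamma'' \cdot T} \\
&= \frac{1}{2^{2n}} \sum_{x, y \in Z_2^n} (-1)^{\alpha \cdot x \oplus \beta \cdot y \oplus \gamma' \cdot S \oplus \gamma'' \cdot T \oplus S_{l+1}} \\
&= \frac{1}{2^{2n}} \sum_{x, y \in Z_2^n} (-1)^{\alpha \cdot x \oplus \beta \cdot y \oplus \gamma' \cdot S \oplus \gamma'' \cdot T} (-1)^{S_{l+1}} \\
&= \frac{1}{2^{2n}} \sum_{x, y \in Z_2^n} (-1)^{\alpha \cdot x \oplus \beta \cdot y \oplus \gamma' \cdot S \oplus \gamma'' \cdot T} \cdot \frac{1}{2}\left((-1)^{x_l} + (-1)^{y_l} + (-1)^{S_l} - (-1)^{x_l + y_l + S_l}\right) \\
&= \frac{1}{2}\Big[\frac{1}{2^{2n}} \sum_{x, y \in Z_2^n} (-1)^{(\alpha \oplus e_l) \cdot x \oplus \beta \cdot y \oplus \gamma' \cdot S \oplus \gamma'' \cdot T} + \frac{1}{2^{2n}} \sum_{x, y \in Z_2^n} (-1)^{\alpha \cdot x \oplus (\beta \oplus e_l) \cdot y \oplus \gamma' \cdot S \oplus \gamma'' \cdot T} \\
&\qquad + \frac{1}{2^{2n}} \sum_{x, y \in Z_2^n} (-1)^{\alpha \cdot x \oplus \beta \cdot y \oplus (\gamma' \oplus e_l) \cdot S \oplus \gamma'' \cdot T} - \frac{1}{2^{2n}} \sum_{x, y \in Z_2^n} (-1)^{(\alpha \oplus e_l) \cdot x \oplus (\beta \oplus e_l) \cdot y \oplus (\gamma' \oplus e_l) \cdot S \oplus \gamma'' \cdot T}\Big] \\
&= \frac{1}{2}[lcc(\alpha \oplus e_l, \beta, \gamma', \gamma'') + lcc(\alpha, \beta \oplus e_l, \gamma', \gamma'') + lcc(\alpha, \beta, \gamma' \oplus e_l, \gamma'') - lcc(\alpha \oplus e_l, \beta \oplus e_l, \gamma' \oplus e_l, \gamma'')]
\end{aligned}$$

Consider the cases $(\alpha_l, \beta_l) = (1, 0), (0, 1), (0, 0), (1, 1)$, respectively. It follows that in

$$\frac{1}{2}[lcc(\alpha \oplus e_l, \beta, \gamma', \gamma'') + lcc(\alpha, \beta \oplus e_l, \gamma', \gamma'') + lcc(\alpha, \beta, \gamma' \oplus e_l, \gamma'') - lcc(\alpha \oplus e_l, \beta \oplus e_l, \gamma' \oplus e_l, \gamma'')]$$

there is only one term that is not 0. This completes the proof. The proofs of Eqs.(3) and (4) are similar to that of Eq.(2).

Using this lemma, it is easy to derive a linear representation of lcc.

Theorem 1 Let $\alpha, \beta, \gamma', \gamma'' \in Z_2^n$. We can derive a linear representation of lcc

$$lcc(\alpha, \beta, \gamma', \gamma'') = L B_{w_{n-1}} A_{w_{n-2}} \cdots A_{w_0} C$$

where $L = (1, 0, 0, 0)$, $C = (1, 1, 1, 1, 1, 1, 1, 1)^T$, and when $\gamma'_{l-1} = \gamma''_{l-1} = 0$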

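$$B_{w_{l-1}} = \frac{1}{4}\begin{pmatrix}
4H_1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
2H_2 & 2H_1 & 2H_1 & (-1)^{\alpha_{l-1}}2H_2 & 0 & 0 & 0 & 0 \\
2H_4 & 0 & 0 & 0 & (-1)^{\alpha_{l-1}}2H_3 & 0 & 0 & 0 \\
H_3 & H_4 & H_4 & -H_3 & (-1)^{\beta_{l-1}}H_4 & (-1)^{\alpha_{l-1}}H_3 & (-1)^{\alpha_{l-1}}H_3 & (-1)^{\alpha_{l-1}}H_4
\end{pmatrix}$$

when $\gamma'_{l-1} = 0$, $\gamma''_{l-1} = 1$

$$B_{w_{l-1}} = \frac{1}{4}\begin{pmatrix}
0 & 4H_1 & 0 & 0 & 0 & 0 & 0 & 0 \\
2H_1 & (-1)^{\alpha_{l-1}}2H_2 & 2H_2 & 2H_1 & 0 & 0 & 0 & 0 \\
0 & 0 & 2H_4 & 0 & 0 & 0 & (-1)^{\alpha_{l-1}}2H_3 & 0 \\
H_4 & -H_3 & H_3 & H_4 & (-1)^{\alpha_{l-1}}H_3 & (-1)^{\alpha_{l-1}}H_4 & (-1)^{\beta_{l-1}}H_4 & (-1)^{\alpha_{l-1}}H_3
\end{pmatrix}$$

$$A_{w_{l-1}} = \frac{1}{4}\begin{pmatrix}
0 & 4H_1 & 0 & 0 & 0 & 0 & 0 & 0 \\
4H_1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
2H_1 & (-1)^{\alpha_{l-1}}2H_2 & 2H_2 & 2H_1 & 0 & 0 & 0 & 0 \\
2H_2 & 2H_1 & 2H_1 & (-1)^{\alpha_{l-1}}2H_2 & 0 & 0 & 0 & 0 \\
0 & 0 & 2H_4 & 0 & 0 & 0 & (-1)^{\alpha_{l-1}}2H_3 & 0 \\
2H_4 & 0 & 0 & 0 & (-1)^{\alpha_{l-1}}2H_3 & 0 & 0 & 0 \\
H_4 & -H_3 & H_3 & H_4 & (-1)^{\alpha_{l-1}}H_3 & (-1)^{\alpha_{l-1}}H_4 & (-1)^{\beta_{l-1}}H_4 & (-1)^{\alpha_{l-1}}H_3 \\
H_3 & H_4 & H_4 & -H_3 & (-1)^{\beta_{l-1}}H_4 & (-1)^{\alpha_{l-1}}H_3 & (-1)^{\alpha_{l-1}}H_3 & (-1)^{\alpha_{l-1}}H_4
\end{pmatrix}$$

when $\gamma'_{l-1} = 1$, $\gamma''_{l-1} = 0$

$$B_{w_{l-1}} = \frac{1}{4}\begin{pmatrix}
0 & 0 & 0 & 0 & 4H_1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 2H_2 & 2H_1 & 2H_1 & (-1)^{\alpha_{l-1}}2H_2 \\
(-1)^{\alpha_{l-1}}2H_3 & 0 & 0 & 0 & 2H_4 & 0 & 0 & 0 \\
(-1)^{\beta_{l-1}}H_4 & (-1)^{\alpha_{l-1}}H_3 & (-1)^{\alpha_{l-1}}H_3 & (-1)^{\alpha_{l-1}}H_4 & H_3 & H_4 & H_4 & -H_3
\end{pmatrix}$$

$$A_{w_{l-1}} = \frac{1}{4}\begin{pmatrix}
0 & 0 & 0 & 0 & 4H_1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 4H_1 & 0 \\
0 & 0 & 0 & 0 & 2H_2 & 2H_1 & (-1)^{\alpha_{l-1}}2H_2 & 0 \\
0 & 0 & 0 & 0 & 2H_1 & (-1)^{\alpha_{l-1}}2H_2 & 2H_2 & 2H_1 \\
(-1)^{\alpha_{l-1}}H_3 & 0 & 0 & 0 & 2H_4 & 0 & 0 & 0 \\
0 & 0 & (-1)^{\alpha_{l-1}}2H_3 & 0 & 0 & 0 & 2H_4 & 0 \\
(-1)^{\alpha_{l-1}}H_4 & (-1)^{\alpha_{l-1}}H_3 & (-1)^{\alpha_{l-1}}H_3 & (-1)^{\alpha_{l-1}}H_4 & H_3 & H_4 & H_4 & -H_3 \\
(-1)^{\alpha_{l-1}}H_3 & (-1)^{\alpha_{l-1}}H_4 & (-1)^{\alpha_{l-1}}H_4 & (-1)^{\alpha_{l-1}}H_3 & H_4 & -H_3 & H_3 & H_4
\end{pmatrix}$$

when $\gamma'_{l-1} = 1$, $\gamma''_{l-1} = 1$

$$B_{w_{l-1}} = \frac{1}{4}\begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 4H_1 & 0 \\
0 & 0 & 0 & 0 & 2H_1 & (-1)^{\alpha_{l-1}}2H_2 & 2H_2 & 2H_2 \\
0 & 0 & (-1)^{\alpha_{l-1}}2H_3 & 0 & 0 & 0 & 2H_4 & 0 \\
(-1)^{\alpha_{l-1}}H_3 & (-1)^{\alpha_{l-1}}H_4 & (-1)^{\beta_{l-1}}H_4 & (-1)^{\alpha_{l-1}}H_3 & H_4 & -H_3 & H_3 & H_4
\end{pmatrix}$$

$$A_{w_{l-1}} = \frac{1}{4}\begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 4H_1 & 0 \\
0 & 0 & 0 & 0 & 4H_1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 2H_1 & (-1)^{\alpha_{l-1}}2H_2 & 2H_2 & 2H_1 \\
0 & 0 & 0 & 0 & 2H_2 & 2H_1 & 2H_1 & (-1)^{\alpha_{l-1}}2H_2 \\
0 & 0 & (-1)^{\alpha_{l-1}}2H_3 & 0 & 0 & 0 & 2H_4 & 0 \\
(-1)^{\alpha_{l-1}}2H_3 & 0 & 0 & 0 & 2H_4 & 0 & 0 & 0 \\
(-1)^{\alpha_{l-1}}H_3 & (-1)^{\alpha_{l-1}}H_4 & (-1)^{\alpha_{l-1}}H_4 & (-1)^{\alpha_{l-1}}H_3 & H_4 & -H_3 & H_3 & H_4 \\
(-1)^{\alpha_{l-1}}H_4 & (-1)^{\alpha_{l-1}}H_3 & (-1)^{\alpha_{l-1}}H_3 & (-1)^{\alpha_{l-1}}H_4 & H_3 & H_4 & H_4 & -H_3
\end{pmatrix}$$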

Proof Fix a word $w$, and let $(\alpha, \beta \to \gamma', \gamma'')$ be the corresponding linear approximation. For $z \in F_2^n$ and $b, c \in \{0, 1\}$, [...] $l \le l_0$. We consider the following 16 cases.

Case 1 If $b = c = 0$, $\gamma'_{l-1} = \gamma''_{l-1} = 0$, $P_{00}^{l} = P_{00}^{l-1}$ when $\alpha_{l-1} = \beta_{l-1} = 0$ and $P_{00}^{l} = 0$ otherwise.

Case 2 If $b = 0$, $c = 1$, $\gamma'_{l-1} = \gamma''_{l-1} = 0$, $P_{01}^{l} = \frac{1}{2}(Q_{00}^{l-1} + P_{01}^{l-1})$ when $\alpha_{l-1} = \beta_{l-1} = 0$, $P_{01}^{l} = \frac{1}{2}(P_{00}^{l-1} + (-1)^{\alpha_{l-1}} Q_{01}^{l-1})$ when $\alpha_{l-1} = 1$, $\beta_{l-1} = 0$ [...]

Case 5 If $b = c = 0$, $\gamma'_{l-1} = 0$, $\gamma''_{l-1} = 1$, $P_{00}^{l} = P_{01}^{l-1}$ when $\alpha_{l-1} = \beta_{l-1} = 0$ and $P_{00}^{l} = 0$ otherwise.

Case 6 If $b = 0$, $c = 1$, $\gamma'_{l-1} = 0$, $\gamma''_{l-1} = 1$, $P_{01}^{l} = (Q_{01}^{l-1} + P_{00}^{l-1})/2$ when $\alpha_{l-1} = \beta_{l-1} = 0$, $P_{01}^{l} = (P_{01}^{l-1} + (-1)^{\alpha_{l-1}} Q_{00}^{l-1})/2$ when $\alpha_{l-1} = 1$, $\beta_{l-1} = 0$ and $P_{01}^{l} = 0$ otherwise.

Case 7 If $b = 1$, $c = 0$, $\gamma'_{l-1} = 0$, $\gamma''_{l-1} = 1$, $P_{10}^{l} = (-1)^{\alpha_{l-1}} P_{11}^{l-1}/2$ when $\alpha_{l-1} = \beta_{l-1}$ and $P_{10}^{l} = P_{01}^{l-1}/2$ when $\alpha_{l-1} \ne \beta_{l-1}$.

Case 8 If $b = 1$, $c = 1$, $\gamma'_{l-1} = 0$, $\gamma''_{l-1} = 1$,

$$P_{11}^{l} = (P_{01}^{l-1} - Q_{00}^{l-1} + (-1)^{\alpha_{l-1}} Q_{11}^{l-1} + (-1)^{\alpha_{l-1}} P_{10}^{l-1})/4$$

when $\alpha_{l-1} = \beta_{l-1}$ and

$$P_{11}^{l} = (Q_{01}^{l-1} + P_{00}^{l-1} + (-1)^{\beta_{l-1}} P_{11}^{l-1} + (-1)^{\alpha_{l-1}} Q_{10}^{l-1})/4$$

when $\alpha_{l-1} \ne \beta_{l-1}$.

Case 9 If $b = c = 0$, $\gamma'_{l-1} = 1$, $\gamma''_{l-1} = 0$, $P_{00}^{l} = P_{10}^{l-1}$ when $\alpha_{l-1} = \beta_{l-1} = 0$ and $P_{00}^{l} = 0$ otherwise.

Case 10 If $b = 0$, $c = 1$, $\gamma'_{l-1} = 1$, $\gamma''_{l-1} = 0$, $P_{01}^{l} = (Q_{10}^{l-1} + P_{11}^{l-1})/2$ when $\alpha_{l-1} = \beta_{l-1} = 0$, $P_{01}^{l} = (P_{10}^{l-1} + (-1)^{\alpha_{l-1}} Q_{11}^{l-1})/2$ when $\alpha_{l-1} = 1$, $\beta_{l-1} = 0$ and $P_{01}^{l} = 0$ otherwise.

Case 11 If $b = 1$, $c = 0$, $\gamma'_{l-1} = 1$, $\gamma''_{l-1} = 0$, $P_{10}^{l} = (-1)^{\alpha_{l-1}} P_{00}^{l-1}/2$ when $\alpha_{l-1} = \beta_{l-1}$ and $P_{10}^{l} = P_{10}^{l-1}/2$ when $\alpha_{l-1} \ne \beta_{l-1}$.

Case 12 If $b = 1$, $c = 1$, $\gamma'_{l-1} = 1$, $\gamma''_{l-1} = 0$,

$$P_{11}^{l} = (P_{10}^{l-1} - Q_{11}^{l-1} + (-1)^{\alpha_{l-1}} Q_{00}^{l-1} + (-1)^{\alpha_{l-1}} P_{01}^{l-1})/4$$

when $\alpha_{l-1} = \beta_{l-1}$ and

$$P_{11}^{l} = (Q_{10}^{l-1} + P_{11}^{l-1} + (-1)^{\alpha_{l-1}} P_{00}^{l-1} + (-1)^{\alpha_{l-1}} Q_{01}^{l-1})/4$$

when $\alpha_{l-1} \ne \beta_{l-1}$.

Case 13 If $b = c = 0$, $\gamma'_{l-1} = \gamma''_{l-1} = 1$, $P_{00}^{l} = P_{11}^{l-1}$ when $\alpha_{l-1} = \beta_{l-1} = 0$ and $P_{00}^{l} = 0$ otherwise.

Case 14 If $b = 0$, $c = 1$, $\gamma'_{l-1} = \gamma''_{l-1} = 1$, $P_{01}^{l} = (Q_{11}^{l-1} + P_{10}^{l-1})/2$ when $\alpha_{l-1} = \beta_{l-1} = 0$, $P_{01}^{l} = (P_{11}^{l-1} + (-1)^{\alpha_{l-1}} Q_{10}^{l-1})/2$ when $\alpha_{l-1} = 1$, $\beta_{l-1} = 0$ and $P_{01}^{l} = 0$ otherwise.

Case 15 If $b = 1$, $c = 0$, $\gamma'_{l-1} = \gamma''_{l-1} = 1$, $P_{10}^{l} = (-1)^{\alpha_{l-1}} P_{01}^{l-1}/2$ when $\alpha_{l-1} = \beta_{l-1}$ and $P_{10}^{l} = P_{11}^{l-1}/2$ when $\alpha_{l-1} \ne \beta_{l-1}$.

Case 16 If $b = 1$, $c = 1$, $\gamma'_{l-1} = \gamma''_{l-1} = 1$,

$$P_{11}^{l} = [P_{11}^{l-1} - Q_{10}^{l-1} + (-1)^{\alpha_{l-1}} Q_{01}^{l-1} + (-1)^{\alpha_{l-1}} P_{00}^{l-1}]/4$$

when $\alpha_{l-1} = \beta_{l-1}$ and

$$P_{11}^{l} = [Q_{11}^{l-1} + P_{10}^{l-1} + (-1)^{\alpha_{l-1}} P_{01}^{l-1} + (-1)^{\alpha_{l-1}} Q_{00}^{l-1}]/4$$

when $\alpha_{l-1} \ne \beta_{l-1}$.

In all cases, $P^{l} = BQ^{l-1}$, where $B$ is a $4 \times 8$ matrix, and when $\bar{\beta}_{l-1} = \beta_{l-1} \oplus 1$, $Q_{00}^{l}, Q_{01}^{l}, Q_{10}^{l}, Q_{11}^{l}$ can be easily derived, so $Q^{l} = AQ^{l-1}$, where $A$ is an $8 \times 8$ matrix. In all cases, $A = A_{w_{l-1}}$. By induction, we have $Q^{l} = A_{w_{l-1}} \cdots A_{w_0} C$ for all $l$. Since $lcc(\gamma', \gamma'', \alpha, \beta) = lcc_n(\gamma', \gamma'', \alpha, \beta) = LP^{n}$, it follows that $lcc(\gamma', \gamma'', \alpha, \beta) = L B_{w_{n-1}} A_{w_{n-2}} \cdots A_{w_0} C$. This completes the proof.

We give an example to illustrate how this formula is used to compute the linear correlation of PHT. If $(\alpha, \beta \to \gamma', \gamma'') = (01000, 01110 \to 10100, 01100)$, then $w = (2)(13)(7)(4)(0)$ and $lcc(\alpha, \beta, \gamma', \gamma'') = lcc(w) = L B_2 A_{13} A_7 A_4 A_0 C = -1/4$.

IV. Conclusions

In this paper, we extend the classification of the carry function and present a formula for computing the linear correlation of the Pseudo-Hadamard Transform. However, we have not been able to derive an algorithm for generating all linear approximations with a given nonzero correlation coefficient.

References

[1] Mitsuru Matsui, “Linear cryptanalysis method for DES cipher”, in Advances in Cryptology-Eurocrypt 1993, Vol.765 of Lecture Notes in Computer Science, pp.386–397, Springer-Verlag, 1993.
[2] Johan Wallen, “Linear approximations of addition modulo 2^n”, in Fast Software Encryption 2003, Vol.2887 of Lecture Notes in Computer Science, pp.261–273, Springer-Verlag, 2003.
[3] James L. Massey, “SAFER K-64: A byte-oriented block-ciphering algorithm”, in Ross Anderson, Fast Software Encryption’93, Vol.809 of Lecture Notes in Computer Science, pp.1–17, Springer-Verlag, 1993.
[4] Bruce Schneier, John Kelsey, Doug Whiting, David Wagner et al., The Twofish Encryption Algorithm: A 128-Bit Block Cipher, John Wiley & Sons, New York, USA, 1999.
[5] Eli Biham and Adi Shamir, “Differential cryptanalysis of DES-like cryptosystems”, Journal of Cryptology, Vol.4, No.1, pp.3–72, 1991.
[6] Helger Lipmaa, “On differential properties of Pseudo-Hadamard transform and related mappings”, in Progress in Cryptology-Indocrypt 2002, Vol.2551 of Lecture Notes in Computer Science, pp.48–61, Springer-Verlag, 2002.
[7] Helger Lipmaa and Shiho Moriai, “Efficient algorithms for computing differential properties of addition”, Fast Software Encryption’2001, Vol.2355 of Lecture Notes in Computer Science, pp.336–350, Springer-Verlag, 2002.
[8] Hiroshi Miyano, “Addend dependency of differential/linear probability of addition”, IEICE Trans. Fundamentals, Vol.81, No.1, pp.106–109, 1998.
[9] Zhang Wentao, Qing Sihan, Wu Wenling, “Improved differential-linear cryptanalysis of reduced-round SAFER++”, Chinese Journal of Electronics, Vol.13, No.1, pp.111–115, 2004.

WANG Bin is a postdoctoral fellow in the College of Computer Science of Zhejiang University. His current research is in information security and network routing. (Email: [email protected])

WU Chunming is a professor of College of Computer Science at Zhejiang University. His research fields include Internet QoS provisioning, reconfigurable network technology, virtualization network and artificial intelligence.
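As a cross-check on the Pseudo-Hadamard Transform results above, the correlations can also be computed by exhaustive enumeration for small n. The following is a minimal Python sketch, not the authors' matrix method: it evaluates the definition of the correlation directly, so it costs 4^n steps and is only practical for short words. The function names (`parity`, `carry`, `pht`, `correlation`, `hex_word`) are illustrative assumptions; masks are passed as integers whose bit i is the component with index i.

```python
def parity(v: int) -> int:
    """Dot product over F_2 of a mask with a word, given their bitwise AND."""
    return bin(v).count("1") & 1


def carry(x: int, y: int, n: int):
    """Carry function (S, T) of PHT: S = x^y^(x+y), T = x^2y^(x+2y), all mod 2^n."""
    mask = (1 << n) - 1
    y2 = (2 * y) & mask
    return x ^ y ^ ((x + y) & mask), x ^ y2 ^ ((x + y2) & mask)


def pht(x: int, y: int, n: int):
    """Pseudo-Hadamard Transform f(x, y) = (x + y, x + 2y) mod 2^n."""
    mask = (1 << n) - 1
    return (x + y) & mask, (x + 2 * y) & mask


def correlation(func, alpha, beta, g1, g2, n):
    """Brute-force c(alpha, beta -> g1, g2) for a two-output function."""
    total = 0
    for x in range(1 << n):
        for y in range(1 << n):
            u, v = func(x, y, n)
            bit = parity(alpha & x) ^ parity(beta & y) ^ parity(g1 & u) ^ parity(g2 & v)
            total += 1 - 2 * bit  # contributes (-1)^bit
    return total / (1 << (2 * n))


def lcc(alpha, beta, g1, g2, n):
    return correlation(carry, alpha, beta, g1, g2, n)


def lca(alpha, beta, g1, g2, n):
    return correlation(pht, alpha, beta, g1, g2, n)


def hex_word(alpha, beta, g1, g2, n):
    """Digits w_{n-1} ... w_0 with w_i = 8*alpha_i + 4*beta_i + 2*g1_i + g2_i."""
    return [8 * ((alpha >> i) & 1) + 4 * ((beta >> i) & 1)
            + 2 * ((g1 >> i) & 1) + ((g2 >> i) & 1)
            for i in reversed(range(n))]


if __name__ == "__main__":
    a, b, g1, g2 = 0b01000, 0b01110, 0b10100, 0b01100
    print(hex_word(a, b, g1, g2, 5))  # [2, 13, 7, 4, 0]
    print(lcc(a, b, g1, g2, 5))       # -0.25
```

For the worked example, the enumeration gives the word (2)(13)(7)(4)(0) and the value -1/4 stated in the text. The same helper can be used to spot-check the reduction of lca to lcc, with the right shift written as `g2 >> 1`.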

A Personal DRM Scheme Based on Social Trust∗

QIU Qin1,2, TANG Zhi1,2, LI Fenghua3,4 and YU Yinyan1,2

(1.Institute of Computer Science and Technology, Peking University, Beijing 100871, China) (2.Beijing Key Laboratory of Internet Security Technology, Peking University, Beijing 100871, China) (3.Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100029, China) (4.Department of Electronic Engineering, Beijing Electronic Science and Technology Institute, Beijing 100070, China)

Abstract — Existing commercial Digital rights management (DRM) schemes are not suitable for personal content protection because of centralized architecture and rigid constraints. To enable secure and flexible sharing of sensitive personal content, a new DRM scheme is proposed in this paper. Social trust between content sharers is modeled as computable concepts with DRM related contexts; based on the trust model, decentralized DRM architecture and scalable content sharing protocols are presented. Using the proposed DRM scheme, personal content owners can perform authentication and authorization without the intervention of Trusted authority (TA); by performing content sharing recommendations, authorized content users can conditionally have content shared with friends. Prototype implementation and simulation experiments indicate that the proposed scheme achieves satisfactory security and usability.

Key words — Digital rights management (DRM), Personal content, Social trust.

∗Manuscript Received Nov. 2011; Accepted Dec. 2011. This work is supported by the National Natural Science Foundation of China (No.61170251), Key Program of Scientific and Technology Research of Ministry of Education (No.209156), and Beijing Natural Science Foundation (No.4102056).

I. Introduction

Nowadays, more and more personal contents are created and shared by individuals for either casual or collaboration purposes. It is quite common that personal content owners create digital content with ubiquitous devices like smart phones and laptops, and have the content shared with friends or cooperators through Social Network Services, instant messengers or emails.

Some personal contents, such as contract drafts, unpublished original works, private photos and videos are involved with business benefits, copyrights, or privacy. The illegal disclosure of them often results in infringements on rights and interests of content owners. Digital rights management (DRM) provides protection in the whole life-cycle of digital content, and gives content owners varying degrees of control over how digital contents may be used[1,2]. It is a desirable solution to protect sensitive personal contents.

However, most existing DRM systems, such as Microsoft Windows Media Rights Manager and InterTrust Rights|System, are set up to preserve commercial profits of large and medium content providers[1−4]. They rely on Trusted authority (TA) played by a license server and sometimes an external Certificate authority (CA) to record transaction information, authenticate content users and perform authorizations.

In recent years, a few DRM schemes for personal content protection turn up, but TA is still indispensable. Bhatt et al.[2] proposed a personal DRM manager for smart phones, which works on the assumption that there is CA to issue certificates for each device. Microsoft Information rights management (IRM)[5] and Voltage SecureFile[6] need a trusted server to authenticate content users and issue them licenses.

1. Limitations of existing DRM schemes

Some features of existing DRM schemes hinder their applications in personal content protection.

Limitation 1 The centralized system architecture causes expense and privacy problems for personal content owners.

Existing DRM systems set up centralized TA to authenticate and authorize content users. However, for content owners who are individuals and small organizations, building or employing centralized TA service is too costly. Even if there is free service to use, content owners may be reluctant to have their secrets (such as the content encryption keys) and privacy (such as authorization records) in the charge of a third party. Therefore, it is fundamentally important for personal DRM systems to deploy decentralized architecture and enable autonomous authentication and authorization by content owners.

Limitation 2 Authorized content users can seldom share the content with others[1].

The constraint is reasonable in commercial DRM systems to reserve the benefits of content providers; however, in personal content sharing, the constraint is unreasonable and bothering. Personal content sharing usually happens among people with similar interests, goals or values, for either casual or collaboration purposes[7]. In many cases, content owners expect content users to re-share the content with friends to build stronger social ties or complete tasks better. For example, Alice may share her original work with Bob, hoping that Bob or Bob’s friends can help her improve the work. A desirable DRM scheme should help Alice ensure that the new content sharers introduced by Bob would use the content properly.

2. Our contributions

Social trust is a belief in the honesty, integrity and reliability of others[8]; it is the basic environmental factor of personal content sharing. Because the danger of being misquoted or discovering that the shared content has been used for underhanded or unsavory purposes is always there, before one shares important content, there is an assumed understanding of trust that the content will be used only for the good[9].

In this paper, we model social trust between content sharers, and propose a DRM scheme for personal content owners to share their contents with trusted ones autonomously and flexibly. By integrating social trust, the proposed scheme has the following advantages: (1) it is based on decentralized architecture; without the intervention of TA, a content owner can perform authentication and authorization autonomously. (2) With recommendations from content users, suitable new content users (those with similar interests or help in completing the task) can be introduced in a controllable manner.

II. Overview of our Scheme

1. System model

Fig.1 illustrates the model of our DRM scheme. Secure content sharing progresses between Content owner (CO) and Content user (CU), and no TA is involved in the system. CO issues herself Owner license (O-Lic), with which CO can perform any operation including package and authorization on all her contents. With User license (U-Lic) issued by CO, protected content can be securely decrypted and rendered by DRM Client software (DRM-CS) of CU.

Fig. 1. The model of our DRM scheme

In our scheme, the premise for authorization is that CO deems CU to be a trustworthy user, who will use the content properly. For example, CU is trusted not to plagiarize innovations in the protected original work and publish a similar work in advance. The general process of content sharing is as follows:

(1) CO establishes sharing trust with CU. If CO has knowledge about CU, the establishment can be directly completed by CO; if CU is unknown, CO can establish indirect trust with CU based on recommendations from others.

(2) After establishing sharing trust, CO issues U-Lic to CU, with which CU can not only use the content but also make sharing recommendations for her trusted entities to share the content.

2. DRM Client software (DRM-CS)

Any system user can act as CO or CU, and perform trust information management, authorization and content package/usage through DRM-CS. Fig.2 illustrates the main functional components of DRM-CS:

(1) Trust Manager manages trust-related information. Its main responsibilities are: i) collecting and saving trust information from user inputs and recommendations of other system users; ii) evaluating the trustworthiness of target entities; iii) providing recommendations to others with local trust information.

(2) DRM Controller generates the client user’s public-private key pairs and content keys, implements cryptography-related operations, and sets right information based on trust information from Trust Manager; it also generates and interprets licenses.

(3) Content Package/Usage Tool is in charge of content package and usage.

Fig. 2. Components of DRM-CS

Client reliability is foundation of all DRM schemes[3,4]. It enforces only legal operations on terminals without giving out any secret information. The reliability of our DRM-CS can be realized through pure software techniques. To resist tampers and reverse engineering attacks, software protection schemes like introspection, state inspection, and code obfuscation can be adopted[10]. For particular high security, hardware-aided techniques like Trusted Computing Platform[1] are available solutions.

III. The Underlying Trust Model

Trust is often deemed as a relationship that is reflexive (any entity trusts herself) and conditionally transitive (the fact that A trusts B and B trusts C does not indicate that A trusts C, unless certain conditions are satisfied). A trust model not only reflects existing trust relationships, but also builds new trust relationships with recommendation mechanism. There are three basic types of trust[8,11−13]:

(1) Direct trust reflects the trustor’s judgment on the trustworthiness of an acquainted entity, without intervention of third parties.

(2) Confidence of recommendation represents the trustor’s confidence in an entity to provide accurate recommendations.

(3) Indirect trust in an unknown entity is built through recommendations from those that have trust in the recommended one.

In this section, our trust model with DRM related contexts is presented. The notations in Table 1 are used throughout this paper.

Table 1. Notations

Notation                    Description
UID_i                       i’s user identifier
TD(i, j, context)           i’s trust degree in j under certain context
CoR(i, j, context)          i’s confidence degree in j’s recommendations under certain context
VT_i(context)               Validity Threshold set by i under context
PU_i; PR_i                  i’s public key; i’s private key
kd(•, •)                    key derivation function
Sig_i                       i’s signature on message digest
PEnc(pu, •); PDec(pr, •)    asymmetric encryption with pu and asymmetric decryption with pr
Enc(k, •); Dec(k, •)        symmetric encryption and decryption with secret key k

1. Context-aware trust representation

Our trust model is context sensitive. For a trustor, her trust relationship towards a trustee is defined as below:

Trust(UID_trustee, context) = {TD, CoR}    (expression (1))

where context = ⟨trustCategory, trustConstraint⟩.

In expression (1), context is a feature vector providing background information including trust category and trust constraint. TD is trustor’s trust (either direct trust or indirect trust) degree in the trustee under the specified context. CoR is the trustor’s confidence degree in the trustee’s recommendations under the specified context. Being fuzzy logics, both TD and CoR are continuous variables in the interval of [0, 1]. 0 indicates lowest degree of trust or confidence, while 1 indicates highest.

According to the contexts involved in the DRM system, we have two categories of trust: Key Trust and Sharing Trust.

Key trust (KT) is the trust in authenticity of the binding between the trustee and the claimed public key. It provides foundation for user authentication and secure communication. A trustor’s Key Trust towards a trustee can be described as Trust(UID_trustee, ⟨KT⟩). Here trust constraint is set void.

Sharing trust (ST) is the trust in eligibility of the trustee to share the content. It provides foundation for user authorization. A trustor’s Sharing Trust towards a trustee can be described as Trust(UID_trustee, ⟨ST, CID⟩). Here, trust constraint is a content identifier CID; it confines the range of contents that the trustee is trusted to share.

We use Validity threshold (VT) to map TD and CoR to valid or invalid states. VT is a continuous variable in the open interval of (0, 1). It is adjustable by system users according to specific contexts and security policies. For example, if Content Owner S expects only very trustworthy entities to share a sensitive content CID, she can set a high value for VT_S(⟨ST, CID⟩).

2. Trust propagation

A trustor’s trust relationship with other entities can be regarded as a directed graph[8,11−13], as seen in Fig.3. With recommendations from different recommenders, multiple recommendation paths connecting the trustor to the target entity are built. The trustor propagates trust along all paths to evaluate the trust degree in the target entity.

Fig. 3. Example of a trust graph

Procedure: TrustPro(source, dest, type)
Comment: set type = 1 to get TD(source, dest) as output; set type = 2 to get CoR(source, dest) as output.
    rslt ← 0, j ← 0
    if there is a direct trust path from source to dest then
        if type = 1 then
            rslt ← TD(source, dest)
        else if type = 2 then
            rslt ← CoR(source, dest)
        end if
    else
        n ← the number of recommendation paths from source to dest
        if n >= 1 then
            for every recommendation path i <= n do
                source finds recommender R_i that has direct trust path to dest
                CoR(source, R_i) ← TrustPro(source, R_i, 2)
                if CoR(source, R_i) > VT then
                    j ← j + 1
                    if type = 1 then
                        rslt_j ← CoR(source, R_i) * TD(R_i, dest)    (expression (2))
                    else if type = 2 then
                        rslt_j ← CoR(source, R_i) * CoR(R_i, dest)    (expression (3))
                    end if
                end if
            end for
            N ← j
            rslt ← average of rslt_k, where k = 1, 2, ···, N    (expression (4))
        end if
    end if
    return rslt

Fig. 4. Procedure for trust propagation

The procedure of our trust propagation is described in Fig.4. It satisfies the following desired properties:

(1) For any entity that has direct trust in the trustee, the entity only considers the direct trust path and ignores all recommendation paths. This property avoids the problem of opinion dependence[13].

(2) Recommendations from untrusted recommenders are deemed unreliable and ignored.

In Fig.4, we use expressions (2), (3) and (4) for trust propagation because they conform to both Weighted average operator in D-S theory and Consensus operator in Subjective logic[13,14]. With the maximal length of recommendation paths limited with a reasonable constant, the complexity of the procedure is O(m), where m is the scale of valid recommenders.

IV. Operation Process and Related Protocols

In this section, we describe how our trust-based scheme works to secure personal content sharing. The whole process consists of four steps, as illustrated in Fig.5.
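Returning to the trust propagation procedure of Fig.4, a minimal Python sketch under stated assumptions may help make expressions (2) to (4) concrete. The data layout is hypothetical and not from the paper: direct trust degrees and direct confidence degrees are held in two dictionaries `td` and `cor` keyed by (trustor, trustee), a single threshold `vt` stands in for VT, and `max_depth` plays the role of the constant bound on recommendation path length. When no recommender passes the threshold, the sketch returns 0, matching the initial value of rslt in Fig.4.

```python
def trust_pro(source, dest, kind, td, cor, vt, max_depth=3):
    """Sketch of TrustPro: kind=1 returns TD(source, dest), kind=2 returns CoR."""
    direct = td if kind == 1 else cor
    # A direct trust path overrides every recommendation path.
    if (source, dest) in direct:
        return direct[(source, dest)]
    if max_depth == 0:
        return 0.0
    valid = []
    for (rec, target), degree in direct.items():
        # Only recommenders with a direct path to dest are considered.
        if target != dest or rec == source:
            continue
        confidence = trust_pro(source, rec, 2, td, cor, vt, max_depth - 1)
        if confidence > vt:
            # Expression (2) for kind=1, expression (3) for kind=2.
            valid.append(confidence * degree)
    # Expression (4): average over the valid recommendation paths.
    return sum(valid) / len(valid) if valid else 0.0


if __name__ == "__main__":
    td = {("R1", "D"): 0.8, ("R2", "D"): 0.6, ("R3", "D"): 1.0}
    cor = {("S", "R1"): 0.9, ("S", "R2"): 0.8, ("S", "R3"): 0.3}
    # R3 is ignored because CoR(S, R3) = 0.3 does not exceed the threshold.
    print(trust_pro("S", "D", 1, td, cor, vt=0.5))
```

In this toy graph the result is the average of 0.9 × 0.8 and 0.8 × 0.6, about 0.6; the recommendation of R3 is discarded even though it is the most favourable one, which is the behaviour property (2) asks for.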

R. To preserve R’s privacy, authorization information is en- crypted with system default key SysKey as .

S : = Enc(SysKey,{CID,RightsInfo})

S → R : LU (R)={UIDS ,UIDR,,PEnc(PUR,CEK),SigS}

Case 3b CU is indirectly trusted Authorized CU may want to re-share content CID with an unauthorized entity D, and sends re-sharing request to S. If D is unknown to S, S sends Sharing recommendation request (SRR) to Rk (k =1, 2, ···) who are authorized CU of CID. SRR contains the recommendation deadline τ, and a random number γ to prevent message replay.

S : α = Enc(SysKey,{CID,UIDD})

Fig. 5. Operation process and related protocols

Step 1 Initialization

Step 1.1 To begin with, system user S generates a random public-private key pair $\{PU_S, PR_S\}$ with her DRM-CS. $PR_S$ is stored in the form of ciphertext.

Step 1.2 S establishes Key trust (KT) and exchanges public keys with others. Firstly, with secure communication or auxiliary verification methods, S gets the legal public keys of some friends $R_i$, $i = 1, 2, \cdots$, and establishes direct KT with them. When S needs the public key of some unknown system user D, S requests recommendations from her friends. If a friend $R_i$ has direct KT in D, $R_i$ returns to S a recommendation containing $UID_D$, $PU_D$ and $TD(R_i, D, KT)$; otherwise, $R_i$ forwards the request to the next hop. Finally, S's DRM-CS performs trust propagation on all the received recommendations. If the result is a valid trust value, S successfully builds KT with D and saves $PU_D$.

Step 1.3 S's DRM-CS generates O-Lic $L_O(S)$, which contains the cipher of a random master key $MK_S$. With $L_O(S)$, S fully controls all her contents.

$S: L_O(S) = \{UID_S, PEnc(PU_S, MK_S), RightsInfo, Sig_S\}$

Step 2 Content package

To protect some content M, S's DRM-CS generates a unique content identifier CID, collects $MK_S$ from $L_O(S)$, and then derives the content encryption key CEK from $MK_S$ and CID with a function satisfying the one-way and randomness properties[15]. Next, S's DRM-CS encrypts M and generates the content package CP. CP can be distributed to CU at any time in any way.

$S: CEK = kd(CID, MK_S)$
$C = Enc(CEK, M)$
$CP = \{CID, UID_S, C, Sig_S\}$

Step 3 CU authorization

When content sharing happens, S first establishes Sharing trust (ST) with CU, and then performs authorization for them.

Case 3a CU is directly trusted

For a direct trustee R, S sets the trust constraint CID, as well as $TD(S, R, ST, CID)$ and $CoR(S, R, ST, CID)$. If $TD(S, R, ST, CID) \geq VT_S(ST, CID)$, S deems R an eligible content sharer and generates U-Lic $L_U(R)$ for R.

$S \rightarrow R_k: SRR = \{UID_S, \alpha, \tau, \gamma, Sig_S\}$

If $R_k$ has direct ST in D, $R_k$ returns to S a Recommendation Certificate $RecCert(R_k)$ with the trust information encrypted to protect privacy.

$R_k: \beta_k = Enc(SysKey, \{CID, UID_D, TD(R_k, D, ST, CID)\})$
$R_k \rightarrow S: RecCert(R_k) = \{UID_{R_k}, UID_S, \beta_k, \gamma, Sig_{R_k}\}$

After the deadline $\tau$, S's DRM-CS verifies $\gamma$ and the recommenders' signatures in the received recommendations, and then propagates $TD(S, D, ST, CID)$. If the result is larger than $VT_S(ST, CID)$, S deems D to be an eligible CU of CID, and generates U-Lic $L_U(D) = \{UID_S, UID_D, , PEnc(PU_D, CEK), Sig_S\}$ for D.

Step 4 Content usage

To use content, D's DRM-CS first associates CP with $L_U(D)$ by checking whether the CID in $L_U(D)$ and that in CP are identical, and then ensures that the issuer identifier in $L_U(D)$ and the owner identifier in CP are the same. After successful verification, D's DRM-CS collects CEK from $L_U(D)$ to decrypt the content cipher in CP.

V. Security Analysis

1. Robustness of the trust model

Illegal CU may be introduced in two ways: (1) because of subjective faults, a trustor overvalues the trust degrees in trustees, so that the trust degrees in some untrustworthy entities mistakenly become larger than VT; (2) some rogue recommenders may provide unfair positive recommendations for untrustworthy entities, individually or collusively. We carried out experiments simulating the above ways to test the robustness of our trust model.

Simulation 1: Trust overvaluation. We simulated that trustors overvalue the trust degrees in all their trustees with random scales within an overvaluing range. Experiments were carried out in random trust networks with 100 trustees.

Simulation 2: Unfair positive recommendations. We carried out experiments in random trust networks with 100 recommenders and 100 recommendees. When unfair recommendation attacks happened, random rogue recommenders assigned the highest trust degree (i.e. 1) to all the entities they recommend.

Experiment results. The experiment results of Simulation 1 and Simulation 2 are shown in Fig.6 and Fig.7, respectively.
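The setup of Simulation 1 can be made concrete with a small sketch. It is only an illustration under stated assumptions, not the authors' experiment code: true trust degrees are drawn uniformly from [0, 1], overvaluation is modeled as multiplying a trust degree by (1 + s) with s uniform in [0, overvaluing range] and capping the result at 1, only direct trust is considered (no trust propagation), and a trustee counts as an illegal CU when its true trust degree is below VT while its overvalued degree reaches VT. The function name and its parameters are hypothetical.

```python
import random


def illegal_cu_proportion(vt, overvalue_range, n_trustees=100, seed=0):
    """Fraction of trustees wrongly accepted as CU under trust overvaluation.

    Assumptions (not taken from the paper): uniform true trust degrees,
    multiplicative overvaluation capped at 1, direct trust only.
    """
    rng = random.Random(seed)
    illegal = 0
    for _ in range(n_trustees):
        true_td = rng.random()                  # trust degree the trustee deserves
        scale = overvalue_range * rng.random()  # random overvaluing scale
        seen_td = min(1.0, true_td * (1.0 + scale))  # overvalued trust degree
        # Illegal CU: would fail the threshold VT, but passes once overvalued.
        if true_td < vt <= seen_td:
            illegal += 1
    return illegal / n_trustees


if __name__ == "__main__":
    # Explore how the overvaluing range changes the proportion for one VT.
    for r in (0.0, 0.2, 0.5):
        print(r, illegal_cu_proportion(vt=0.6, overvalue_range=r))
```

Because the same seed yields the same trustees and unit scales, widening the overvaluing range can only add illegal CU in this sketch, never remove them. It is meant for exploring how VT and the overvaluing range interact and does not reproduce the experiments reported in Fig.6 and Fig.7.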
The results indicate that: (1) our trust model achieves satisfactory robustness; the proportions of illegal CU are at very low levels; (2) setting a proper value for VT helps impede the appearance of illegal CU.

A Personal DRM Scheme Based on Social Trust

Fig. 6. The proportion of illegal CU caused by trust overvaluation

Fig. 7. The proportion of illegal CU caused by unfair positive recommendations

2. Security of content sharing protocols

With the protocols described in Section IV, our scheme achieves the following properties:

Property 1 Only system users that are authorized by CO can decrypt and use the content.

The reason is twofold: (1) Because CEK (or its origin MK) in a license is encrypted with the public key of the target system user (either CO or CU), only those system users who have been issued a license can use the content. (2) Before using the content, compliant system users verify whether the owner of the content package and the issuer of the license are identical. Thus, only the corresponding CO can issue valid licenses.

Property 2 The privacy of system users can hardly be disclosed on a large scale.

Two kinds of privacy information are involved in our scheme: authorization information in U-Lic and trust information in recommendation certificates. They are encrypted with SysKey and can only be decrypted by the DRM-CS of the target receiver after successful verification. Nobody except the message sender knows the plaintext of the privacy information.

By requesting recommendations from only one recommender, a malicious trustor may infer the recommender's trust information from the result of trust propagation. However, such a method is inefficient and troublesome. From a macroscopic view, it can hardly cause privacy concerns for a large number of system users.

VI. Conclusions

In this paper, we model social trust among content sharers, and propose a DRM scheme to protect personal content sharing. To the best of our knowledge, we are the first to integrate social trust[8,11−13] into a DRM application. A comparison of our work with related solutions for personal content protection is shown in Table 2. The merits of our scheme include: (1) It is TA independent, which eliminates the cost and privacy problems in existing centralized DRM systems; (2) It supports recommended content sharing, which enables more flexible sharing experiences.

Our scheme can be used to secure private information sharing, business collaborations among small-scale organizations, and original work appreciation before the work is issued for sale. A prototype of DRM-CS has been developed, which is composed of a desktop manager and a plug-in in file readers. Content-related operations, including package, authorization, and usage, are performed by the plug-in, while trust information is managed by the desktop manager. For a plaintext file with the size of 100K bytes, the decryption time of its cipher with U-Lic is 0.61 milliseconds (on a PC with a Pentium D CPU at 3.00GHz and 1.00GB RAM).

Table 2. Comparisons of personal content protection solutions

                              Refs.[2,5,6]   Ref.[16]   Our work
Persistent protection         Yes            No         Yes
TA independence               No             Yes        Yes
Support recommended sharing   No             No         Yes

References

[1] J.S. Erickson, “Fair use, DRM, and trusted computing”, Communications of the ACM, Vol.46, No.4, pp.34–39, 2003.
[2] S. Bhatt, R. Sion, B. Carbunar, “A personal mobile DRM manager for smartphones”, Computers & Security, Vol.28, No.6, pp.327–340, 2009.
[3] Z. Zhang, Q. Pei, J. Ma et al., “Security and trust in digital rights management: A survey”, International Journal of Network Security, Vol.9, No.3, pp.247–263, 2009.
[4] Z. Zhang, Q. Pei, J. Ma et al., “Establishing multi-party trust architecture for DRM by using game-theoretic analysis of security policies”, Chinese Journal of Electronics, Vol.18, No.3, pp.519–524, 2009.
[5] Microsoft Information Rights Management, http://office.microsoft.com/en-in/excel-help/information-rights-management-in-the-2007-microsoft-office-system-HA010102918.aspx
[6] Voltage SecureFile, http://www.voltage.com/products/sfclient.htm
[7] C.C. Marshall, S. Bly, “Sharing encountered information: Digital libraries get a social life”, Proc. of Joint ACM/IEEE Conference on Digital Libraries (JCDL), Tucson, Arizona, USA, pp.218–227, 2004.
[8] S.P. Marsh, “Formalizing trust as a computational concept”, Ph.D. Thesis, University of Stirling, UK, 1994.
[9] C.R. McInerney, S. Mohr, “Trust and knowledge sharing in organizations: Theory and practice”, Information Science and Knowledge Management, Vol.12, No.5, pp.65–86, 2007.
[10] C. Collberg, J. Nagra, Surreptitious Software: Obfuscation, Watermarking, and Tamperproofing for Software Protection, Pearson Education, Massachusetts, USA, pp.401–464, 2010.
[11] A. Abdul-Rahman, S. Halles, “A distributed trust model”, Proc. of the 1997 Workshop on New Security Paradigms, Langdale, Cumbria, UK, pp.48–60, 1998.

[12] Y. Sun, W. Yu, Z. Han, K.J.R. Liu, “Information theoretic framework of trust modeling and evaluation for ad hoc networks”, IEEE Journal on Selected Areas in Communications (J-SAC), Special Issue on Security in Wireless Ad Hoc Networks, Vol.24, No.2, pp.305–317, 2006.
[13] A. Josang, “An algebra for assessing trust in certification chains”, Proc. of Network and Distributed Systems Security Symposium (NDSS), San Diego, California, USA, pp.1–10, 1999.
[14] A. Josang, M. Daniel, “Strategies for combining conflicting dogmatic beliefs”, Proc. of 6th International Conference on Information Fusion, Cairns, Australia, pp.1133–1140, 2003.
[15] M.J. Atallah, M. Blanton, N. Fazio, K.B. Frikken, “Dynamic and efficient key management for access hierarchies”, ACM Transactions on Information and System Security (TISSEC), Vol.12, No.3, Article 18, pp.1–43, 2009.
[16] Y. Zhu, Z. Hu, H. Wang, H. Hu, G.J. Ahn, “A collaborative framework for privacy protection in Online Social Networks”, Proc. of 6th International Conference on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom), Chicago, Illinois, USA, pp.1–15, 2010.

QIU Qin was born in Shangrao, Jiangxi Province, China in 1986. She received B.S. degree in computer science and technology from Beijing Normal University in 2007. She is currently pursuing Ph.D. degree in computer application in Peking University. Her research interests include digital rights management and information security. (Email: [email protected])

TANG Zhi was born in Zhejiang Province, China in 1965. He received B.S. degree in radio physics from Peking University in 1987, and received M.S. degree and Ph.D. degree in computer application from Peking University in 1990 and 1995 respectively. He is a professor and doctoral supervisor in Institute of Computer Science and Technology of Peking University. His research interests include document processing and digital rights management. (Email: [email protected])

LI Fenghua was born in Xishui, Hubei Province, China in 1966. He received B.S. degree, M.S. degree, and Ph.D. degree in computer software and computer systems architecture from Xidian University, China, in 1987, 1990, and 2009 respectively. He had been a lecturer in Xidian University from 1992 to 1994. Since 1994 he has been with Beijing Electronic Science and Technology Institute as a lecturer, associate professor, professor, and doctoral supervisor. His research interests include network security, system security & evaluation and trusted computation. (Email: [email protected])

YU Yinyan was born in Zhejiang Province, China in 1976. She received B.S. degree in computational mathematics from Nanjing University of Science and Technology in 1998, M.S. degree in applied mathematics from Chinese Academy of Sciences in 2001, and Ph.D. degree in computer application from Peking University in 2005. She is a senior engineer in Institute of Computer Science and Technology of Peking University. Her research interests include digital rights management and information security. (Email: [email protected])

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

Quality of Experience Assessment for Cross-layer Optimization of Video Streaming over Wireless Networks∗

LIU Fangqin1, LIN Chuang1 and MENG Kun2

(1. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China)
(2. School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China)

Abstract: Evaluating the Quality of experience (QoE) of video streaming services in real time continues to be a desirable yet challenging task. In this paper, we present a novel no-reference model for the QoE assessment of the video streaming service in wireless networks. This model evaluates the QoE of video streaming as a comprehensive function of all the parameters about the encoding/decoding, transmission, and video content. Through the comprehensive function, our model not only renders a general and accurate tool for the QoE assessment, but also provides several useful guidelines for cross-layer optimizations, such as how to balance the image quality and the playback quality to maximize the video quality and how to protect different types of packets to minimize the transmission distortions.

Key words: Video streaming service, QoE assessment model, Cross-layer optimization.

∗Manuscript Received Sept. 2011; Accepted Feb. 2012. This work is supported in part by the National Basic Research Program of China (973 Program) (No.2010CB328105 and No.2009CB320505) and the National Natural Science Foundation of China (No.60932003, No.61070182, No.60973144, No.61173008, and No.61070021).

I. Introduction

With the explosive growth of communication technologies, more and more bandwidth-demanding and delay-sensitive services such as video streaming become popular in wireless networks. However, wireless networks can provide only limited Quality of service (QoS) support for those services due to the multi-path fading, shadowing, and mutual interference of wireless channels. To improve the QoS support, a variety of solutions have been proposed. Among them, the Cross-layer design (CLD) has shown that it can obtain a significant performance gain[1]. In order to promote the world-wide use of video streaming services in wireless networks, it is essential to guarantee a high quality of these services as experienced by the users, i.e., the Quality of experience (QoE). Nevertheless, most previous works on CLD are not conducted from the users' point of view[2,3]. Therefore, we need to investigate the relationship between different network resources and the QoE, and then design protocols or algorithms based on the relationship to improve the quality of video streaming services.

One approach to build the QoE assessment model is to use the subjective test, such as the Mean opinion score (MOS) method[4]. It is the most direct way to express the users' opinions about the video quality; however, it involves a large group of viewers and viewing equipment. Thus, it is not a convenient and cost-effective approach to provide in-service monitoring for streaming system evaluations and optimizations.

Another approach is to propose objective models or metrics to evaluate the QoE. According to the amount of information needed to perform the QoE assessment, the models can be classified into three types: Full reference (FR), Reduced reference (RR), and No reference (NR). FR models require the entire original video to be compared with the distorted video to infer the video quality[5]; RR models require partial information from the original video (e.g., extracting some key parameters from the original video)[6]; NR models do not need the information from the original video, and usually only involve measurable parameters from networks and data streams to evaluate the quality of the reconstructed video. The original video or its partial information is usually unavailable; therefore, only NR is the feasible and scalable type that can be used in the real-time streaming system. However, the parameters from networks and data streams that affect the video quality are numerous, and their effects are not easy to analyze and describe. Hence, building an NR QoE assessment model is quite challenging. Several works built the models with some parameters fixed: the relationship between the packet losses and the quality of the same video sequence is revealed in Refs.[7, 8] with the same codec parameters; the effect of the bit rate on the video quality was analyzed in Refs.[9, 10] without the transmission distortions. These works are not general for different video sequences, different codec parameters, or different networks. Some other works investigated the relationship between all the parameters and the video quality using data mining techniques[11] or neural networks[12]. The fundamental characteristics of the video quality cannot be obtained by such methods. Therefore, we cannot make full use of the QoE assessment models for cross-layer optimizations.

Motivated by these questions, in this paper, we investigate all the parameters that affect the video quality in detail and build a general QoE assessment model to help design protocols and algorithms. Our contributions are as follows:

First, we build an NR model for the QoE assessment of the video streaming service in wireless networks. The model is built by using the data traces collected from real wireless cellular networks, and it takes the encoding/decoding, transmission, and video content parameters into consideration together, and reveals their fundamental effects on the video quality. We propose a new concept, the available information ratio, to integrate the encoding/decoding and video content parameters into one, and show that the video quality has high correlations to this concept. We derive the model between the available information ratio and the video quality first, and then add other parameters into this model step by step to obtain the final comprehensive result.

Second, we propose several useful guidelines for protocol or algorithm designs based on the model. We investigate the difference of the importance between the image quality (in terms of the frame size) and the playback quality (in terms of the frame rate). The results can be used to design encoding algorithms for the balance between the image quality and the playback quality to maximize the video quality. We also analyze the different distortions caused by different types of packet losses, i.e., the packet losses in I frames/P frames and the packet losses causing the loss of frames. The results show that sometimes we might preferentially drop the whole frame with distortions to get even higher video quality.

The remainder of this paper is organized as follows: Section II presents the overview of our modeling method. Section III builds the QoE assessment model. Section IV proposes guidelines for cross-layer designs. Finally, Section V concludes this paper and discusses the implementation and extension of our model.

II. Methodology Overview

MPEG is a typical and widely-used codec algorithm, and in this paper, our model is built for the streaming services with MPEG-encoded video sources. Based on the MPEG codec, we first analyze the parameters that affect the video quality.

1. Parameters affecting the QoE

MOS is the most popular subjective method for measuring QoE. The score scale of MOS is from 1 to 5, and the higher the score is, the better the video quality is. We also adopt the MOS scale and assume that the quality of the uncompressed and un-encoded video sequence with the frame rate of 30 is 5.

The raw video sequence is compressed and encoded with different bit rates and GOP patterns by the MPEG encoder in the media server. Both the bit rate and the GOP pattern affect the video quality. The encoded video frames are then segmented into packets and transmitted over networks which may introduce quality distortions by packet losses, delays, and jitters. After arriving at the client, the received video packets are decoded, and the error control or adaption mechanism deployed in the decoder also affects the video quality. In addition to that, the content of a video sequence also affects its quality. For example, the quality of the video sequence with more quickly changed scenes is more sensitive to the smoothness of playback. In summary, the parameters which affect the video quality can be classified into three categories: encoding/decoding parameters, transmission parameters, and video content parameters. We list these parameters in Table 1.

The parameters in the encoding/decoding category are chosen to represent the features of the codec algorithm. We differentiate I frames and P frames with respect to the average size and the frame rate (B frames need to be encoded by referencing their subsequent frames. In real-time streaming systems, B frames are seldom used since using B frames will increase the delay of playback. In this paper, we don't consider B frames).

In the transmission category, we do not list delay and jitter, because both of them can be reflected by the frame rate and the frame loss rate. In addition, the losses of different kinds of packets (packets in I frames or P frames) can cause different distortions in the video sequence because of the hierarchical structure of MPEG encoding. We classify the packet loss rate into three parameters, i.e., the packet loss rate of I frames, the packet loss rate of P frames, and the frame loss rate.

Table 1. The parameters that affect QoE

Parameter categories   Parameter name                 Notation
Encoding/decoding      Bit rate                       BR
                       Average I-frame size           S_i
                       Average P-frame size           S_p
                       I-frame rate                   FR_i
                       P-frame rate                   FR_p
                       Resolution of the picture      R
Transmission           Packet loss rate in I frames   PLR_i
                       Packet loss rate in P frames   PLR_p
                       Frame loss rate                FLR
Video content          Spatial activity               ρ
                       Temporal activity              τ

As shown in previous work[13], the spatial and temporal dynamics are two major features of the video content. Therefore, we use the spatial and temporal dynamics to capture the video content, and two discrete parameters, the spatial activity and the temporal activity, are proposed to measure the spatial and temporal dynamics, respectively. Considering that frame n consists of N pixels and the luminance value at pixel i is $x_i$, the spatial activity of frame n, denoted as $\rho_n$, is then defined as

$$\rho_n = \frac{1}{255^2 \cdot N}\sum_{i=1}^{N}(x_i - \bar{x})^2,$$

where $\bar{x}$ is the average luminance value of frame n and $255^2$ is the normalization constant. The average spatial activity for the whole video sequence ($\rho$) is

$$\rho = \frac{1}{K}\sum_{k=1}^{K}\rho_k,$$

where K is the total number of frames in the video sequence.

The calculation of the temporal activity is a bit more complicated. The motion vector, which is a two-dimensional vector, is defined as the offset from the center of a block in a frame to the center of the best matching block in its previous frame. Denoting $(a, b)$ as a motion vector, the motion distance equals $\sqrt{a^2 + b^2}$. For block i (usually with size equaling 16 × 16) in frame n, we calculate the motion distance

$D_i$, and then the average temporal activity of the frame is

$$\tau_n = \frac{1}{\sqrt{x^2 + y^2}\cdot M}\sum_{i=1}^{M} D_i,$$

where M is the number of blocks in a frame, and $x \times y$ is the search area. If the coordinate of the center of a block is $(x_0, y_0)$, the search area $x \times y$ means that we find the best matching block in the range of $[x_0 \pm x, y_0 \pm y]$. The average temporal activity for the whole video sequence is

$$\tau = \frac{1}{K-1}\sum_{k=2}^{K}\tau_k,$$

where K is the total number of frames in the video sequence.

2. Data collecting

To build the QoE assessment model, we collect large amounts of data in real wireless networks. We set up a video streaming service: three laptops with wireless network cards are utilized as the clients, and they are located in campus libraries; the media server is located far from the campus, providing Video on demand (VoD) services to the clients; the clients are connected to the media server through real wireless cellular networks provided by China Mobile.

On the media server, ten different video sequences, which cover a wide range of video contents (i.e. Akiyo, Carphone, Football, Foreman, Highway, Mobile, Coastguard, Crew, News, and Walk), are encoded from the original uncompressed format to the constant bit rate (CBR) MPEG-4 simple profile with the resolution of 176 × 144 pixels, and at different bit rates (from 40 to 380 kbps) and frame rates (from 5 to 25 fps).

On the clients, all the parameters mentioned in the previous subsection are monitored every 10 seconds. We have collected 112947 records within a period of about four weeks. The records of the first six video sequences are used to build the QoE assessment model (i.e., the training data set), and the others are used for validation (i.e., the validation data set).

III. QoE Assessment Model

In general, the QoE could be specified from the following two aspects:

(1) The image quality, i.e., the quality of each picture in the video sequence. Assuming that the quality of the raw picture without compression and encoding is 5, the image quality can be captured by the distortions in the picture compared to the raw picture. We use the VQM tool[14], which is a popular FR metric providing a reliable assessment of the image quality, to calculate the image quality in our experiments.

(2) The playback quality, i.e., the quality indicated by the smoothness of video playback. Assuming that the playback quality with the frame rate equaling 30 is 5, the playback quality of the video sequence can be represented by the distortions caused by the decrease of the frame rate. We obtain the playback quality or the whole QoE by using the MOS method.

As indicated in Ref.[13], the video quality distortions are usually perceived by end users in a cumulative way. Then, by denoting the image quality of a video sequence as $V_s$, and the distortion of the playback quality as $D_t$, we can derive the video quality V as $V = V_s - D_t$.

1. Modeling the image quality

The raw video sequence is encoded, and some quality is compromised to achieve a smaller data size, i.e., a smaller bit rate. The relationship between the bit rate and the distortion has been investigated in Ref.[9], and a well-known rate-distortion model has been proposed. However, in addition to the bit rate, the video content and the encoding/decoding parameters also affect the distortions. For example, with the same bit rate, the more complicated the video content is, the worse the video quality is; and the video sequences with different numbers of I frames have different video qualities. In what follows, we introduce a new concept, the available information ratio, to integrate the video content and encoding/decoding parameters.

Denoting the size of a video sequence in transmission, i.e., after being encoded and before being decoded, as $S_{trans}$, the available information ratio ($\beta$) is defined as the average size of the data from $S_{trans}$ which is used directly or indirectly during decoding by each pixel in the decoded video.

Denoting the average sizes of an I frame and a P frame as $S_i$ and $S_p$, respectively, the available information ratio is then calculated by

$$\beta = \frac{S_i \cdot FR_i + (S_i + S_p)\cdot FR_p}{(FR_i + FR_p)\cdot R},$$

where $FR_i$ and $FR_p$ are the I-frame rate and P-frame rate, respectively, and R is the picture resolution. P frames are encoded by referencing their previous frame, so $S_i$ bits are reused by a P frame.

Fig.1 and Fig.2 show the effects of the bit rate and the available information ratio on the image quality, respectively. The raw video sequences are encoded in different GOP patterns and different bit rates. As can be seen in Fig.1, for the same video content and with the same bit rate, the image qualities deviate largely (more than 0.5) due to the difference of the GOP patterns. In Fig.2, the deviation of the image qualities between the video sequences with the same content and the same available information ratio is quite small (no more than 0.5), indicating that the available information ratio is more suitable to represent the image quality of videos.

Fig. 1. The effect of the bit rate on the image quality

Fig. 2. The effect of the available information ratio on the image quality

From Fig.2, we can see that the relationship between the image quality without packet losses and the available information ratio can be described by a logarithm function of the general form $V_{s,noloss} = F_1 \cdot \log(\beta) + F_2 \cdot F_1$. $F_1$ and $F_2$ are strongly related to the video content. Taking the video content parameters and the encoding/decoding parameters into consideration together, we use the general form $V_{s,noloss} = (A_0\cdot\tau + A_1\cdot\rho + A_2)\cdot(\log(\beta + A_3) + A_4)$ to describe the relationship between the image quality without packet losses and these parameters.

The effect of the packet losses on the image quality is to cause artifacts in the content (such as ripples, blurriness, or error blocks). In another way, we can suppose that these distortions are caused because the available information ratio loses some effectiveness. Then, we extend the model of $V_{s,noloss}$ to the model considering the transmission distortions as

$$V_s = (C_0\cdot\tau + C_1\cdot\rho + C_2)\cdot\big(\log((\beta + C_3)\cdot(C_4\cdot PLR_i + C_5\cdot PLR_p + C_6\cdot FLR + C_7)) + C_8\big) \quad (1)$$

From the training data set, we get that $C_0 = -0.0481$, $C_1 = 0.5415$, $C_2 = -0.9751$, $C_3 = 0.0086$, $C_4 = 299.5781$, $C_5 = 55.8459$, $C_6 = 8.3660$, $C_7 = 0.2034$, and $C_8 = 1.2331$. The mean square deviation of the fitting error is 0.2929.

2. Modeling the playback distortion

Fig.3 gives the effect of the frame rate (FR) on the distortions of the video playback quality ($D_t$). As can be seen, $D_t$ decreases exponentially with the increase of the frame rate. Taking the video content parameters and the frame rate into consideration together, we use Eq.(2) as the model to capture the playback distortion.

Fig. 3. The effect of the frame rate on the playback distortion

$$D_t = (E_0\cdot\tau + E_1\cdot\rho + E_2)\cdot e^{(E_3\cdot\frac{FR}{30})} + E_4 \quad (2)$$

From the training data, we get that $E_0 = 4.4946$, $E_1 = -1.4226$, $E_2 = 0.7056$, $E_3 = -1.8047$, and $E_4 = -0.2248$. The mean square deviation of the fitting error is 0.007.

Combining the models of the image quality and the playback distortion, we can obtain the model for the whole video quality V from Eq.(1) and Eq.(2).

3. Model validation

Fig.4 shows the accuracy of our model compared to the simulation results. Our results overestimate the video quality a little in some traces with very low quality (each overestimation is no more than 0.5). This indicates that our model is a little less accurate for the video sequences of low quality, but the accuracy is still acceptable. In addition, for the video sequence Walk, the accuracy of our results is not as good as for the other three kinds of video sequences. This is because the scenes in Walk are more complicated than those in the other video sequences. Nevertheless, the accuracy of our model for all the video sequences is sufficient. The mean square deviation between the model results and the simulation results is 0.1779.

IV. Guidelines for Cross-layer Design

Based on our model, we can investigate the different effects of the parameters on the video quality, and then design cross-layer protocols or algorithms, such as the scheduling mechanism and encoding algorithm, to allocate the limited network resources for the maximization of the video quality.

Fig.5 shows the effects of the P-frame rate ($FR_p$) on V, $V_s$, and $D_t$ with fixed I-frame rate $FR_i = 2$, bit rate $BR = 200$ kbps, and average P-frame size $S_p = 10$ kb. We can see that V first increases with small $FR_p$, and then decreases with large $FR_p$. This is because when BR, $S_p$, and $FR_i$ are fixed, the reduction of $S_i$ caused by the increase of $FR_p$ is not so much since $S_p$ is usually much smaller than $S_i$. Therefore, the reduction of $D_t$ is usually larger than the reduction of $V_s$ with small $FR_p$, and then the reduction of $D_t$ becomes smaller with large $FR_p$. This is very useful for encoding a video sequence, because in general, the I-frame rate $FR_i$ is determined by the change frequency of the scenes in the video, while $S_i$ and $FR_p$ are more changeable according to the remaining resource. For example, we can change the number of the Discrete cosine transformation (DCT) coefficients to be encoded in an I frame to change $S_i$ in an MPEG encoder; or we can increase the number of P frames easily if there is enough bit resource to be allocated. Therefore, our model can then be used to balance the image quality (in terms of $S_i$) and the playback quality (in terms of $FR_p$) to achieve the best video quality.

Fig.6 shows the effects of different kinds of packet losses, i.e., the packet loss rate in I frames ($PLR_i$), the packet loss rate in P frames ($PLR_p$), and the frame loss rate (FLR) on the video quality (V). We can see that the effect of $PLR_i$ on V is much worse than that of $PLR_p$ and FLR, and the effect of $PLR_p$ is worse than that of FLR. Especially, when $PLR_i > 0.03$,

Fig. 4. Comparisons of the video quality between the model results and the simulation results
Fig. 5. The effect of FRp on V, Vs, and Dt with fixed FRi
Fig. 6. The effects of PLRi, PLRp, and FLR on V

the video quality is under 1, which is unacceptable. This is because the loss of a frame just makes the next frame reference a wrong frame, and the distortions caused by wrong referencing (such as ripples in the picture) are usually more acceptable than the distortions caused by the loss of a packet in a frame (such as blurriness and error blocks in the picture). In addition, the distortions caused by packet losses in P frames are less severe than those in I frames, since the data in I frames is more important and the distortions in I frames usually spread further. Therefore, in cross-layer protocol designs, when there is a lack of network resources we could preferentially drop a whole frame with distortions, then the packets of P frames, and only at last the packets of I frames, to decrease the transmission distortions.

V. Conclusion and Discussion

In this paper, we have built a QoE assessment model with sufficient accuracy for the video streaming service as a comprehensive function of all the encoding/decoding, transmission, and video content parameters. Based on the model, we investigate the different effects of the image quality and the playback quality, and the different importance of different kinds of packet losses to the video quality. Both results show that we can strike a balance between these parameters to obtain a higher video quality.

Our model is developed for the MPEG codec, but it can be extended to other codecs. The method used to build the model is general for different codecs, and the new concept of the available information ratio is the key point for the extensions. For a different codec, we just need to adjust the calculation of the available information ratio.

To implement our model in real-time video streaming systems, we need to solve these two issues:

1. The computational complexity of calculating the spatial and temporal activities is too high, and this would affect the performance of the streaming services. To this end, we can simplify the calculations of the spatial and temporal activities. For example, the computations can be carried out only on some block samples from some frames, which largely decreases the computational complexity.

2. We need to intrude into the video data to compute the spatial and temporal activities, and to distinguish the types of frames. There is no good solution for this problem, and we should embed our model in the decoder.

References

[1] S. Thangam and E. Kirubakaran, “A survey on cross-layer based approach for improving TCP performance in multi hop mobile ad-hoc networks”, Proc. of IEEE International Conference on Education Technology and Computer, Singapore, pp.294–298, 2009.
[2] Y. Chan, P. Cosman and L. Milstein, “A cross-layer diversity technique for multicarrier OFDM multimedia networks”, IEEE Transactions on Image Processing, Vol.15, No.4, pp.833–847, 2006.
[3] M. Van Der Schaar, “Cross-layer wireless multimedia transmission: challenges, principles, and new paradigms”, IEEE Wireless Communications, Vol.12, No.4, pp.50–58, 2005.
[4] ITU-T.P.910:1999, “Subjective video quality assessment methods for multimedia applications”.
[5] Z. Wang, A. Bovik, H. Sheikh and E. Simoncelli, “Image quality assessment: from error visibility to structural similarity”, IEEE Transactions on Image Processing, Vol.13, No.4, pp.600–612, 2004.
[6] Z. Wang and E. Simoncelli, “Reduced-reference image quality assessment using a wavelet-domain natural image statistic model”, Proc. of SPIE Human Vision and Electronic Imaging, Xi’an, China, pp.149–159, 2005.
[7] S. Kanumuri, P. Cosman, A. Reibman and V. Vaishampayan, “Modeling packet-loss visibility in mpeg-2 video”, IEEE Transactions on Multimedia, Vol.8, No.2, pp.341–355, 2006.
[8] Z. He and H. Xiong, “Transmission distortion analysis for real-time video encoding and streaming over wireless networks”, IEEE Transactions on Circuits and Systems for Video Technology, Vol.16, No.9, pp.1051–1062, 2006.
[9] G. Sullivan and T. Wiegand, “Rate-distortion optimization for video compression”, IEEE Signal Processing Magazine, Vol.15, No.6, pp.74–90, 1998.
[10] H. Koumaras, T. Pliakas and A. Kourtis, “A novel method for pre-encoding video quality prediction”, Proc. of Mobile and Wireless Communications Summit, Budapest, Hungary, pp.1–4, 2007.
[11] A. Csizmar Dalai, D. Musicant, J. Olson et al., “Predicting user-perceived quality ratings from streaming media data”, Proc. of IEEE International Conference on Communications, Glasgow, Scotland, pp.65–72, 2007.
[12] G. Rubino, P. Tirilly and M. Varela, “Evaluating users satisfaction in packet networks using random neural networks”, Proc. of International Conference on Artificial Neural Networks, Athens, Greece, pp.303–312, 2006.
[13] H. Koumaras, A. Kourtis, D. Martakos and J. Lauterjung, “Quantified pqos assessment based on fast estimation of the spatial and temporal activity level”, Multimedia Tools and Applications, Vol.34, No.3, pp.355–374, 2007.
[14] N.J. Victory, “Assistant secretary for communications and information”, Technical Report, Video Quality Measurement Techniques, 2002.

LIU Fangqin received Ph.D. degree in 2012 from the Department of Computer Science and Technology, Tsinghua University. Her research area is performance evaluation and transmission design of video streaming services in wireless networks.

LIN Chuang is a professor of the Department of Computer Science and Technology, Tsinghua University. His current research interests include computer networks, performance evaluation, network security analysis, and Petri net theory and its applications.

MENG Kun is a Ph.D. candidate in School of Computer and Communication Engineering, University of Science and Technology Beijing. His research interests include QoS of computer networks, service computing, and stochastic models.

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

Adaptive Layer-3 Buffer Management Scheme for Domestic WLAN∗

Wai Leong Pang1, David Chieng2 and Nural Nadia Ahmad1

(1.Multimedia University, Cyberjaya, Selangor, Malaysia) (2.Wireless Communications Cluster, MIMOS Berhad, Malaysia)

Abstract – The existing IEEE802.11 a/b/g/n provides robust and relatively low cost wireless services, but it only provides best effort services. IEEE802.11e provides Quality of service (QoS) to real-time applications, but the bandwidth distribution is controlled by IEEE802.11e and the network administrator has no control over the QoS mechanisms. We propose a non-disruptive and yet low-cost QoS and Fairness provisioning solution over existing best-effort WLAN networks with an Adaptive layer-3 buffer management (AL3BM) scheme for small domestic wireless networks. The existing WLAN networks or hotspots can continue their operation without any major disruption or costly upgrade. AL3BM is implemented in a network router that sits between the wired and wireless network to control the heavy downlink traffic. The network administrator can partially control the bandwidth allocation through AL3BM. Simulations have been carried out to examine the proposed scheme in providing service differentiation and fairness. The simulation results show that AL3BM improves the performance of the wireless end users. A hardware testbed is configured and the experimental results show that AL3BM provides QoS to the WLAN.

Key words – Wireless LAN (WLAN), Quality of service, Service differentiation.

I. Introduction

The WLAN technologies, particularly those based on the IEEE802.11a/b/g/n standards[1−3], have been enjoying enormous market adoption due to their ability to provide robust and yet relatively low-cost wireless access service. BT Openzone’s wireless broadband service, which mainly relies on IEEE802.11g[4], is now available at 500,000 homes, businesses and city centers in the UK and Ireland[5]. Despite this success, there remains a barrier to wide adoption of real-time services such as VoIP, audio or video streaming, since there is no QoS and Fairness control in these technologies. For this reason, the IEEE802.11 workgroup proposed the IEEE802.11e standard[6], which defines a set of Quality of service (QoS) enhancements for WLAN applications by enhancing the Media access control (MAC) mechanism. This protocol consists of two access mechanisms, the Enhanced distributed channel access (EDCA) and the Hybrid coordination function controlled channel access (HCCA). The EDCA defines four Access categories (ACs) that provide differential treatments to different types of traffic. Nevertheless, several shortcomings deter existing network providers such as BT from adopting IEEE802.11e at a large scale. First and foremost, it requires a massive infrastructure upgrade which is not only costly but also disruptive to existing customers. Furthermore, bandwidth, jitter and latency are still not guaranteed[1−3,6]. In addition, bandwidth distribution is fully controlled by the IEEE802.11e Medium access control (MAC) and the network administrator has no control over the QoS mechanisms.

In view of these shortcomings, we propose a simple and yet cost-effective alternative Layer-3 solution that is applicable to existing IEEE802.11a/b/g/n or even 11e devices.

II. Related Works

Many works have been carried out to look at QoS provisioning in IEEE802.11 WLAN. Various schemes are implemented by differentiating the initial window size, window-increasing factor, maximum backoff stage or the Inter-frame space (IFS), and new or enhanced versions of the MAC have been proposed to support QoS. The 802.11e Enhanced distributed coordination function (EDCF) is based on IFS and Contention window (CW) differential adjustment to provide eight different types of service classes. Each service class has a different IFS and CW to distinguish it from the others. The HCF is an extension of the Point coordination function (PCF) mode. The Hybrid coordinator (HC) controls the channel access and ensures a QoS guarantee for prioritized flows by granting explicit access during the Contention period (CP). IEEE802.11e provides the QoS based on the types of services. The MAC enhancement is an approach that requires a hardware upgrade and thus involves significant cost[6].

A 2-level protection and guarantee mechanism is proposed for voice and video traffic in EDCA[7]. The 1st level protects the existing voice and video flows from the new flows and the 2nd level protects the voice and video flows from the best-effort flows. The system rejects the extra voice and video flows. The AP is heavily loaded because it must update the network status for admission justification.

R. Ohayon proposed a solution that works with either the Distributed coordination function (DCF) or EDCF[8]. It includes distributed reservations of time slots between the stations sharing the WLAN without using signaling or controlling messages, but the stations need to hold a virtual of the time slots and their status.

I.A. Qaimkhani and E. Hossain[9] proposed a hybrid contention free access MAC protocol and an admission control algorithm to increase the capacity of the WLAN and ensure efficient admission control for consistent delay-bound guarantees. The proposed scheme is complex to implement. The collision avoidance with fading detection algorithm is proposed in Ref.[10] to evaluate the performance against channel failures. It adjusts the maximum number

∗Manuscript Received Sept. 2010; Accepted Mar. 2012.

of retransmissions to improve the delay and jitter of the real-time traffic.

Two models with different physical layer speeds and different quality stations are implemented as an enhanced MAC for IEEE802.11e[11]. It supports low delay services, and the high priority traffic has better chances to access the channel. The Adaptively Tuneable HCF (AT-HCF) MAC in Ref.[12] dynamically adjusts the duration ratio between HCCA and EDCA based on observed system dynamics and improves the delay and throughput of the systems. The scheme adjusts the CAPLimit value according to the measured ratio of Constant bit rate (CBR) prone traffic and Variable bit rate (VBR) prone traffic, and fine-tunes the CAPLimit according to the observed system throughput. It reduces the average delay and maintains the total throughput, but high execution resources are required to fine-tune the CAPLimit.

A mathematical model is developed to analyze the throughput and delay by using a discrete time M/G/1 queue[13]. The admission algorithm provides a tight lower bound of the admissible region for real-time traffic. The model considers only the collisions to provide weighted throughput. Each node changes its backoff counter based on both its own packet’s priority level and the priority level of the transmitted packet. But the proposed scheme is complex and hard to implement.

Differentiated Arbitration inter frame space (AIFS) and discrete time slot models are used to analyze the external collision time[14]. The analysis estimates the effects of varied back-off window size and AIFS under heterogeneous traffic, and considers AIFS events between each backoff procedure. The simulation is carried out under ideal channel conditions, saturation traffic and a single hop heterogeneous traffic network environment. It is a complex scheme that is hard to implement.

A network aware concept is used to reduce the blocking probability and dropping probability of mobile requests[15]. A fuzzy logic inference system is employed to select appropriate cache relay nodes to cache published video streams and distribute them to different peers through a service-oriented architecture.

The new IEEE802.11n has been proposed and vendors have released products based on it[16]. It builds on previous 802.11 standards by adding multiple-input multiple-output and 40 MHz channels to the physical layer, and frame aggregation to the MAC layer. It uses multiple antennas to coherently resolve more information than is possible using a single antenna. It supports antenna diversity and spatial multiplexing. The new IEEE802.11n with its high wireless bandwidth successfully solves the bottleneck problem arising in the AP. The QoS issues will only arise again once IEEE802.11n fails to support the high bandwidth demand from the wireless users.

The research on the MAC layer mainly targets service differentiation between different types of services, where the real-time traffic is given higher priority than the best effort traffic. One major drawback of this approach is that such solutions require significant hardware and/or firmware modifications.

From the reviews above, it can be deduced that most of the research works aim to provide service differentiation in order to protect the real-time traffic. But most of the proposed solutions are MAC layer approaches that require hardware upgrading. This involves a significant amount of investment in terms of cost and time. We propose a QoS and Fairness provisioning scheme that is implemented in a router that sits between the wired and wireless networks. The proposed software-based solution needs only minimal cost and modifications on the existing WLAN systems. AL3BM is added in the network router (a Linux-based computer) to manage the packet scheduling. Simulations have been carried out to compare the performance of the proposed scheme against the conventional IEEE802.11b/g/e systems. IEEE802.11n is not evaluated in our work since it has a clear capacity advantage over IEEE802.11b/g/e systems. IEEE802.11a with 54 Mbps is obsolete and has been replaced by IEEE802.11g. Nevertheless, whenever QoS and Fairness are required, our solution can still be adopted by either 11a or 11n. Overall, our proposed solution offers a number of distinct benefits:
• Ability to provide service differentiation.
• Compatible with existing IEEE802.11a/b/g/n/e systems.
• The network administrator can control and manage the packet scheduling.
• Improves the fairness of the WLAN.
• Easy and cost effective solution, which requires minimal changes or investment on the existing systems.

III. AL3BM

The architecture of AL3BM is shown in Fig.1. AL3BM consists of 5 modules: the Admission control agent (ACA) module, Policy server (PS) module, Bandwidth broker (BB) module, Monitoring (MM) module and Resource manager (RM) module.

Fig. 1. AL3BM architecture

ACA controls the number of users admitted to the wireless network through the Dynamic host configuration protocol (DHCP). DHCP is disabled when the bandwidth consumed, BW_consumed, is more than the threshold value, BW_threshold. BW_threshold is set at 90% of the maximum throughput achieved by the WLAN systems. A new wireless station fails to get an IP address when DHCP is disabled. As shown in Algorithm 1, the ACA module monitors BW_consumed in every t_step. DHCP is disabled when BW_consumed is more than BW_threshold, and the flag signal, DHCP_flag, is set to 0. Otherwise DHCP is activated and DHCP_flag is set to 1.

MM monitors the total bandwidth consumption of the wireless network through the Simple network management protocol (SNMP). As shown in Algorithm 2, MM uses the “snmpget” command to access the Management information base (MIB) database of the AP to get the total number of octets received, “ifInOctets”, and the total number of octets transmitted, “ifOutOctets”. MM gets the MIB parameters in every t_step and BW_consumed is calculated using Eq.(1), where ifInOctets' and ifOutOctets' represent the MIB parameters collected in the previous cycle.

$$BW_{consumed} = \{(ifInOctets - ifInOctets') + (ifOutOctets - ifOutOctets')\} \times 8 / t_{step} \qquad (1)$$

PS provides the configuration parameters and coordinates the operations between different modules. As shown in Algorithm 3, the PS module contains two operation modes, the fairness and the service differentiation modes. PS is set to fairness mode when the mode signal is set to “1” and to service differentiation mode when the mode signal is “2”. For fairness mode, the flow class bandwidth rate, FCBR_i, is determined using Eq.(2), with i = total number of flow classes. The bandwidth is shared equally among the flow classes.

$$FCBR_i = 1/i \qquad (2)$$
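The monitoring and admission steps described above (Eq.(1) and the DHCP threshold rule of the ACA module) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `OctetSample`, `bw_consumed` and `dhcp_flag` are hypothetical, the counter values are made-up numbers, and the sketch assumes the two octet counters have already been read from the AP (for example with “snmpget”) rather than performing any SNMP query itself.

```python
from dataclasses import dataclass


@dataclass
class OctetSample:
    """One reading of the AP's MIB counters (hypothetical container)."""
    if_in_octets: int   # ifInOctets: total octets received
    if_out_octets: int  # ifOutOctets: total octets transmitted


def bw_consumed(prev: OctetSample, curr: OctetSample, t_step: float) -> float:
    """Eq.(1): consumed bandwidth in bit/s over one monitoring interval."""
    delta_octets = (curr.if_in_octets - prev.if_in_octets) + (
        curr.if_out_octets - prev.if_out_octets
    )
    return delta_octets * 8 / t_step


def dhcp_flag(bw: float, max_throughput: float, fraction: float = 0.9) -> int:
    """ACA rule: 0 (DHCP disabled) above the threshold, otherwise 1."""
    bw_threshold = fraction * max_throughput  # 90% of the maximum throughput
    return 0 if bw > bw_threshold else 1


# Illustrative numbers only: two samples taken t_step = 5 s apart.
prev = OctetSample(if_in_octets=1_000_000, if_out_octets=4_000_000)
curr = OctetSample(if_in_octets=1_250_000, if_out_octets=6_000_000)
bw = bw_consumed(prev, curr, t_step=5.0)
flag = dhcp_flag(bw, max_throughput=6_000_000)
print(bw, flag)
```

In a running monitor, `curr` would replace `prev` at the end of every cycle so that the next interval is measured against the latest counters.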

For service differentiation mode, only the real-time traffic is given higher priority to access the network. Eq.(3) is used to calculate FCBR_i for i = 1, ..., α, where α is the total number of real-time flow classes under service differentiation mode. FCBR_i is the ratio between the offered load of flow class i, FC offered load, and the total bandwidth consumed, BW_consumed. The best effort flow class bandwidth rate, FCBR_{α+1}, is calculated using Eq.(4): the bandwidth rate remaining after deducting the total bandwidth share of the real-time traffic is allocated to the best effort flow class.

$$FCBR_i = FC\ offered\ load / BW_{consumed} \qquad (3)$$

$$FCBR_{\alpha+1} = 1 - \sum_{i=1}^{\alpha} FCBR_i \qquad (4)$$

The BB module manages the bandwidth distribution of the packet scheduling protocol that is implemented in the RM module. The BB module gets the bandwidth distribution parameters from the PS module and maps the parameters to suit the needs of the operation mode defined in the PS module. The bandwidth ratio of each flow class, Ratio_FC_i, is equal to FCBR_i, and the FCBR_i values are supplied by the PS module. The priority of a flow class is equal to i, with i equal to the flow class number. The flow class with i = 1 gets the highest priority, i = 2 the second priority, and so on.

Algorithm 1 Admission control agent (ACA) module
for every t_step do
    if BW_consumed > BW_threshold then
        disable DHCP
        DHCP_flag = 0
    else
        enable DHCP
        DHCP_flag = 1
    end if
end for

Algorithm 2 Monitoring module (MM)
for every t_step do
    BW_consumed = {(ifInOctets − ifInOctets') + (ifOutOctets − ifOutOctets')} × 8 / t_step
    ifInOctets' = ifInOctets
    ifOutOctets' = ifOutOctets
end for

Algorithm 3 Policy server (PS) module
θ = Total number of flow classes
α = Total number of real-time flow classes
if mode = 1 then
    for i = 1, 2, ···, θ do
        FCBR_i = 1/θ
    end for
else if mode = 2 then
    for i = 1, 2, ···, α do
        FCBR_i = FC offered load / BW_consumed
    end for
    FCBR_{α+1} = 1 − ΣFCBR_i
end if

Algorithm 4 Bandwidth broker (BB) module
for every t_step do
    for i = 1, 2, ···, θ do
        Ratio_FC_i = FCBR_i
        Priority_FC_i = i
    end for
end for

The RM module consists of multiple operation modes as shown in Fig.1. Only two operation modes (fairness and service differentiation modes) are configured in this paper; AL3BM can be configured with other operation modes depending on the network. Each operation mode is controlled by a packet scheduling protocol, and the enhanced Class based queuing (CBQ) protocol is used as the packet scheduling protocol in the RM module. As shown in Ref.[17], CBQ is compatible with and performs well in IEEE802.11 WLAN, and the details of CBQ are available in Ref.[18]. It consists of a classifier, multiple flow class queues (FC1, FC2, ···, FCi) and a Weighted round robin (WRR) scheduler. The classifier identifies the packets received and assigns the arriving packets to the correct flow class queue. Multiple flow class queues are used to accommodate the packets from different types of traffic according to the needs of the WLAN. All the flow class queues are scheduled by the Weighted round robin scheduler before the packets are diverted to the WLAN. The real-time traffic is given a higher weight than the best effort traffic. The network administrator can control the packet scheduling through the AL3BM protocol.

IV. Simulation Analyses

Extensive simulations are carried out using ns2 in order to evaluate the performance of the proposed scheme. The performance of the proposed scheme is compared with the existing wireless protocols (such as IEEE802.11b, IEEE802.11g and IEEE802.11e) under various traffic scenarios. The simulation analyses are carried out to evaluate the following aspects: (1) Per-class and per-station QoS provisioning, (2) Fairness, (3) Weighted bandwidth allocation, and (4) Channel sharing.

1. Simulation setup

The simulation testbed configuration follows Ref.[19]. The wireless stations are attached to an AP under infrastructure mode. The network topology (Fig.2) consists of N wired stations, one router, one AP and N wireless stations. Each wired station is connected to the router with a link of 100 Mbps capacity and a propagation delay of 30 ms. The router is connected to the AP via a link with a propagation delay of 20 ms and a capacity of 100 Mbps. The link propagation delay between the wired stations and the router is 10 ms higher than the link propagation delay between the router and the AP. All the stations are randomly placed within the transmission range and there is no hidden station. The Request to send/Clear to send (RTS/CTS) mechanism is deactivated in the simulation in order to study the more widely used CSMA/CA mode. The allocated buffer size for all the queues in the AP and stations is set to 50 packets. For the scalability test, the ACA module is disabled, DHCP_flag is set to 1, and AL3BM accepts all the users.

Table 1. Traffic configurations
Traffic | Voice | Video | Data
Transmission rate (kbps) | 2 × 64 | 250 | Always download
Payload (bytes) | 100 | 1000 | 1000
Configuration | On/off model: Burst & idle = 3 s | CBR/UDP | FTP/TCP

Fig. 2. Simulation network topology

The general trend in current networks is that the downlink traffic is heavier than the uplink traffic. Three types of traffic, voice, video and data traffic as shown in Table 1, are used in the simulations. Voice traffic is generated by the on/off traffic model, and the packets are generated at a constant rate of 64 kbps

during the on periods (which is the typical bit rate of the G.711 codec for Voice over IP, VoIP) and no packets are generated during the off periods. The burst time and idle time of the voice traffic are taken from the Pareto distribution, and the average burst and idle times are set to 3 s. The video traffic is a CBR application over a User datagram protocol (UDP) connection. The data traffic is generated by a greedy File transfer protocol (FTP) application over a Transmission control protocol (TCP) connection. The packet sizes for the voice, video and data traffic are set to 100, 1000 and 1000 bytes, respectively.

The performance of the proposed scheme is compared with the IEEE802.11b (11b), IEEE802.11g (11g) and IEEE802.11e (11e) MAC protocols. Our scheme is implemented in IEEE802.11b (Q11b), IEEE802.11g (Q11g) and IEEE802.11e (Q11e). Three Access categories (ACs) of EDCA are considered in the simulation analyses, i.e. voice, video and best effort. Table 2 lists the MAC parameters used in the simulations. The transmission rate of 11b and 11e is 11 Mbps and that of 11g is 54 Mbps (as per the pre-setting in the network simulator, ns2). The values of these parameters remain unchanged during the simulations unless otherwise stated. The main performance metrics measured in the simulations are the average throughput, average packet loss rate, average packet end-to-end delay (the packet delay measured between the wired and wireless station) and average packet jitter.

Table 2. MAC parameters of 11b, 11g and 11e
Scheme | 11b | 11g | 11e | 11e | 11e
AC | − | − | Voice | Video | Data
Slot | 20 μs | 9 μs | 20 μs | 20 μs | 20 μs
SIFS | 10 μs | 16 μs | 10 μs | 10 μs | 10 μs
DIFS | 50 μs | 50 μs | − | − | −
AIFS | − | − | 30 μs | 30 μs | 50 μs
CWmin | 31 | 31 | 7 | 15 | 31
CWmax | 1023 | 1023 | 15 | 31 | 1023

2. Per-class service differentiation

The simulation evaluates and compares the per-class fairness of 11b, 11g and 11e with their respective AL3BM added systems (Q11b, Q11g and Q11e). The operation mode of AL3BM is set to service differentiation mode, with mode = 2. In this experiment, there are 4 groups of stations with 2 stations per group. The total numbers of wired and wireless stations are 8 each (N = 8). All the wireless stations are set to receive data traffic, but only the wireless stations in the second group are configured to send data traffic. The wireless stations in the third group are set to receive video streaming service during the time interval T_video = [40s ∼ 80s], and the wireless stations in the fourth group are set to send/receive voice during T_voice = [60s ∼ 100s]. These settings are chosen mainly to evaluate the performance of the service differentiation between the voice and video services when both services coexist, i.e. when T_voice and T_video overlap. All the voice flows are diverted to FC1, the video flows are diverted to FC2, and the data flows are diverted to FC3. The total number of real-time traffic flows is α = 2, and the flow class bandwidth ratio, FCBR_i, is determined using Eqs.(3) and (4).

The average video flow throughput of the schemes is shown in Fig.3. As shown in Fig.3(a), the average video flow throughput of 11b is the lowest. The average video flow throughput of Q11b is better than that of 11b. The average video flow throughput of the other schemes is almost the same (500 kbps). Besides that, the average audio packet delay and video packet jitter of Q11b are lower than those of 11b, as shown in Fig.3(b) and Fig.3(c).

The following simulation evaluates the effect of traffic load on the proposed schemes. Since the IEEE802.11b protocol is already obsolete, the following simulations are carried out with the IEEE802.11g and IEEE802.11e protocols. The total number of stations, N, is varied between 4 and 20 stations in steps of 4, and the same traffic scenario is used as described in the previous simulation, except that each group is composed of N/4 stations. Fig.4, Fig.5 and Fig.6 show the results of aggregate throughput, data traffic loss rate, voice traffic end-to-end delay and video traffic jitter delay, which are the key performance indices for QoS perceived by end users.

Fig. 3. Video throughput, audio delay and video jitter. (a) Video throughput; (b) Audio delay; (c) Video jitter
Fig. 4. Aggregate throughput. (a) 11g and Q11g; (b) 11e and Q11e
Fig. 5. Data traffic loss rate
Fig. 6. Video jitter and voice end-to-end delay. (a) Video traffic jitter delay; (b) Voice traffic end-to-end delay
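The flow class bandwidth rates used in the two operation modes (Eqs.(2) to (4), with α = 2 real-time flow classes in the experiment above) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the offered loads and the consumed bandwidth are made-up numbers rather than values from the simulations, and raising an error when the real-time shares exceed the whole bandwidth is our own choice, since that case is not discussed here.

```python
from typing import List


def fcbr_fairness(num_classes: int) -> List[float]:
    """Eq.(2): every flow class gets an equal share of the bandwidth."""
    return [1 / num_classes] * num_classes


def fcbr_service_diff(rt_offered_loads: List[float], bw_consumed: float) -> List[float]:
    """Eqs.(3)-(4): shares of the real-time classes, then the best effort share.

    rt_offered_loads[i] is the offered load of real-time flow class i+1,
    in the same unit as bw_consumed.
    """
    shares = [load / bw_consumed for load in rt_offered_loads]  # Eq.(3)
    best_effort = 1 - sum(shares)                               # Eq.(4)
    if best_effort < 0:
        # Assumption: over-subscription is treated as an error in this sketch.
        raise ValueError("real-time offered load exceeds consumed bandwidth")
    return shares + [best_effort]


# Fairness mode with four flow classes.
equal_shares = fcbr_fairness(4)

# Service differentiation mode: voice (FC1) and video (FC2) are real-time,
# data (FC3) is best effort. Loads in kbps, illustrative values only.
diff_shares = fcbr_service_diff([256.0, 500.0], bw_consumed=4000.0)
print(equal_shares, diff_shares)
```

The last element of the returned list is the best effort share, so the list can be passed in order of priority (FC1 first) to whatever code configures the flow class queues.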

Fig.4(a) and (b) show that Q11g and Q11e maintained slightly higher throughput than 11g and 11e. The throughput of Q11g decreases due to the high probability of transmission collisions when the number of wireless stations, N, exceeds 16. As N increases further, the performance of Q11g holds a slight edge over conventional 11g. The reason is largely the ability of Q11g to control the amount of data traffic by introducing a higher dropping probability to it. Q11g also significantly reduces the video traffic jitter delay by assigning a higher priority to the video traffic. Q11g has a lower voice end-to-end delay when N < 16 due to the bandwidth allocation controlled by AL3BM, and not much performance improvement is observed in Q11e. A high dropping probability/loss rate makes TCP adjust its transmission rate so that packets are not transmitted excessively. Thus, Q11g reduces the probability of packet collisions even though N increases.

It has been observed that the data traffic loss rates of Q11g and Q11e are higher than those of 11g and 11e respectively, since the proposed scheme controls the bandwidth allocated to the data traffic and hence causes a slightly higher dropping rate in data traffic. This helps to allocate the bandwidth to the services with higher priorities. Q11g significantly reduces the video traffic jitter compared to 11g, especially when N ≥ 8. The video traffic jitter of Q11g is 32% lower than that of 11g when N = 20. There is not much improvement in the video traffic jitter of Q11e compared to 11e. The voice traffic end-to-end delay of Q11g is lower than that of 11g when N < 16. The voice packets get the highest priority to access the network compared to the other traffic in Q11g, which helps to reduce the voice packet end-to-end delay. There is not much difference in the video jitter and voice delay between 11e and Q11e.

3. Per-station fairness

In this experiment, the performance of the proposed scheme is evaluated in terms of fairness among stations. The same settings as in the previous simulation are adopted (with N = 8), except that the voice and video traffic are continuously generated. The wireless stations are labeled STA1, STA2, ···, STA8 respectively. All stations receive data traffic. In addition, STA3 and STA4 send data traffic, STA5 and STA6 receive video traffic at a rate of 250 kbps, and STA7 and STA8 send/receive voice traffic. AL3BM is set to fairness mode, with mode = 1. The wireless stations STA1 and STA2 in the first group are diverted to FC4, STA3 and STA4 are diverted to FC3, STA5 and STA6 to FC2, and STA7 and STA8 to FC1. The total number of flow classes is i = 4 and the flow class bandwidth ratio, FCBR_i, is determined using Eq.(2). The bandwidth is shared equally among the stations and FCBR_i = 0.25 for each individual flow class.

Fig.7 shows the per-station throughput for 11g, Q11g, 11e and Q11e. For 11g (Fig.7(a)) and 11e (Fig.7(c)), STA3 and STA4, which send data traffic persistently, achieve remarkably higher throughput than the other stations. This is caused by the contention-based channel access mechanism, CSMA/CA. In CSMA/CA, only senders participate in the competition for channel access and the senders share the channel evenly. Therefore, STA3 and STA4 have significantly higher throughput than the other receiving stations, which causes severe unfairness. As shown in Fig.7(b) and (d), Q11g and Q11e solve the unfairness problem. Q11g and Q11e take into account the bandwidth consumption of both uplink and downlink traffic. Since STA3 and STA4 tend to use more bandwidth than other stations, the proposed scheme limits their bandwidth consumption.

Fig. 7. Per-station throughput. (a) 11g; (b) Q11g; (c) 11e; (d) Q11e

In order to quantify the fairness, Jain’s fairness index, F, is applied[20]:

$$F = \frac{\left(\sum_{i=1}^{N} TH_i\right)^2}{N \sum_{i=1}^{N} TH_i^2} \qquad (5)$$

where TH_i denotes the average throughput achieved by the ith station:

$$TH_i = TH_{voice,i} + TH_{video,i} + TH_{data,i} \qquad (6)$$

The fairness index is increased from F = 0.857 (11g) to 0.998 (Q11g) and from F = 0.589 (11e) to 0.982 (Q11e) accordingly. Fig.8 shows the fairness index for the 11g, Q11g, 11e and Q11e algorithms with various values of N ranging from 4 to 20. The stations are grouped in four as described in Section IV.2. It can be deduced that the fairness index of 11g decreases as N increases. However, Q11g maintains a significantly higher value of F (> 0.97) than 11g for the entire range of N. For example, when N = 8, F = 0.998 (Q11g) and 0.857 (11g). The fairness index of Q11g does not fluctuate much as N increases. The fairness index of Q11e is significantly higher than that of 11e, e.g., F = 0.982 (Q11e) and 0.589 (11e) when N = 8. The fairness index of Q11e decreases as N increases. The overall channel utilization is slightly improved as N increases. The channel utilization is Σ_i TH_i = 10.92 Mbps (11g), 11.03 Mbps (Q11g), 3.22 Mbps (11e) and 3.23 Mbps (Q11e) when N = 16.

Fig. 8. Fairness index with respect to the number of stations, N

V. Experimental Results

The real-time testbed adopts a topology similar to that shown in Fig.2, but with only 3 wired and 3 wireless stations. AL3BM is implemented in a Linux based router equipped with an Intel Celeron 333 MHz, 256 MB SDRAM, and a 20 GB hard disk. For the wireless stations, the D-Link IEEE802.11b AP and wireless adapters are used. AL3BM is programmed to provide service differentiation. AL3BM consists of two flow classes, FC1 and FC2. In this experiment, 1.65 Mbps is allocated to FC1 for video traffic and the remaining bandwidth is allocated to FC2 for the other traffic. Video traffic is given higher priority than the other traffic in AL3BM.
All the wireless stations downloaded files from the wired stations through the File transfer protocol (ftp1, ftp2 and ftp3 respectively). Wired stations 1 and 2 streamed video with a total transmission rate of 832 kbps to wireless stations 1 and 2 (Video 1 and Video 2).

Fig. 9. Flows' throughput

As shown in Fig.9, the maximum throughput of the wireless network reached 6 Mbps. The Video 1 and Video 2 flows maintained a stable throughput regardless of the amount of traffic load admitted to the network. The throughput of the FTP flows decreased whenever a new flow was admitted to the network. The experimental result shows that the proposed scheme is able to provide priority services, and hence to guarantee bandwidth to the real-time traffic, using easily available Commercial off the shelf (COTS) hardware and the open source Linux operating system. The proposed scheme is suitable for a small domestic WLAN.

VI. Conclusion

The existing wireless networks are dominated by IEEE802.11b/a/g/n systems, which provide only best effort services to the wireless users. The new IEEE802.11e was introduced to provide QoS to the wireless network through service differentiation among the wireless users. The migration from the existing wireless systems to the new IEEE802.11e or other systems involves significant cost and time, and is not environmentally friendly. The proposed scheme is a Layer-3 approach that maintains the existing wireless systems and provides either service or user differentiation to the wireless users. An AL3BM router is added to the wireless system to provide QoS, and it is suitable for implementation in a small domestic wireless network. The network administrator can control the bandwidth distribution through the proposed scheme. The proposed scheme provides service and user differentiation to the WLAN. It significantly improves the users' fairness for the real-time traffic when the number of data flows increases gradually. Finally, the experimental result shows that the proposed scheme was successfully implemented in the testbed and provided QoS to the WLAN.

Wai Leong Pang is a lecturer in the Faculty of Engineering at Multimedia University, Malaysia. He received the M.S. degree in electronics from Putra University, Malaysia. His research interests include wireless networks and VLSI analog and digital designs. (Email: [email protected])

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

Improved Eavesdropping Detection Strategy Based on Extended Three-particle Greenberger-Horne-Zeilinger State in Two-step Quantum Direct Communication Protocol∗

LI Jian, YE Xinxin, LI Ruifan, ZOU Yongzhong and LU Xiaofeng

(School of Computer, Beijing University of Posts and Telecommunications, Beijing 100876, China)

Abstract — In order to improve the eavesdropping detection efficiency in the two-step quantum direct communication protocol, an improved eavesdropping detection strategy using the extended three-particle GHZ state is proposed, in which the extended three-particle GHZ state is used to detect eavesdroppers. During the security analysis, the method of entropy theory is introduced, and the two detection strategies are compared quantitatively by using the constraint between the information that the eavesdropper can obtain and the interference introduced. If the eavesdroppers intend to obtain all the information, the eavesdropping detection rate of the original two-step quantum direct communication protocol using the EPR pair block as detection particles is 50%, while the detection rate of the proposed strategy is 59%. In the end, the security of the proposed protocol is discussed. The analysis results show that the eavesdropping detection strategy presented is more secure.

Key words — Quantum key distribution (QKD), Dense coding, Extended three-particle GHZ (Greenberger-Horne-Zeilinger) state, Eavesdropping detection, Entropy.

I. Introduction

Since Bennett and Brassard presented the pioneering QKD protocol[1] (BB84 protocol) in 1984, many quantum information security processing methods have been advanced, such as quantum teleportation[2−13], quantum secret sharing[14,15] and so on. In recent years, a novel concept, Quantum secure direct communication (QSDC)[16−18], was put forward and studied by some groups. The QSDC protocol can be used in some special environments, as first proposed by Boström et al.[19]. In 2003, Deng et al. proposed a two-step quantum direct communication protocol using the Einstein-Podolsky-Rosen (EPR) pair block, in which each EPR pair carries two bits of classical information[17].

To improve the efficiency of eavesdropping detection in the two-step quantum direct communication protocol, an improved eavesdropping detection strategy based on the extended three-particle Greenberger-Horne-Zeilinger (GHZ) state is proposed, in which the extended three-particle GHZ state is used to detect eavesdroppers. During the security analysis, the method of entropy theory is introduced. If the eavesdropper gets the full information, the detection rate of the two-step quantum direct communication protocol using the EPR pair block is 50%, whereas the detection rate of the presented protocol using the extended three-particle GHZ state is 59%. In the end, the security of the proposed protocol is discussed. The analysis results show that the improved two-step quantum direct communication protocol using the extended three-particle GHZ state is more secure.

For simplicity, the original two-step QSDC using the EPR pair in Ref.[17] is referred to as TSE, and the improved eavesdropping detection strategy proposed here is referred to as TSET.

It should be emphasized that, taking into account the security vulnerability when the quantum “Ping-pong” protocol is used as QSDC, only the situation in which the proposed protocol is used as a QKD strategy is analyzed. That is to say, the transmitted information is a random key (raw key), not the secret message. After error correction and privacy amplification, the random key becomes the final key.

II. The TSET Protocol

The basic idea of dense coding is that Alice applies one of four unitary operations to each of her particles, so that two bits of classical information can be encoded in each EPR pair. In Ref.[17], the authors generalize the dense coding idea to secure direct communication, but the eavesdropping detection efficiency is not high. In order to improve the eavesdropping detection efficiency, an improved eavesdropping detection strategy based on the extended three-particle GHZ state in the two-step quantum direct communication protocol is proposed. The specific steps are as follows.

∗Manuscript Received Feb. 2012; Accepted Mar. 2012. This work is supported by the Natural Science Foundation of Jiangsu Province (No.BK2011169), the National Natural Science Foundation of China (No.61100205, No.61100208).

Suppose that the message to be transmitted is a sequence $x^N = (x_1, \cdots, x_N)$, where $x_i \in \{0, 1\}$, $i = 1, 2, \cdots, N$. Define

$$|\phi^+\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle) \tag{1}$$

$$|\phi^-\rangle = \frac{1}{\sqrt{2}}(|00\rangle - |11\rangle) \tag{2}$$

$$|\psi^+\rangle = \frac{1}{\sqrt{2}}(|01\rangle + |10\rangle) \tag{3}$$

$$|\psi^-\rangle = \frac{1}{\sqrt{2}}(|01\rangle - |10\rangle) \tag{4}$$

$$|\psi\rangle = \frac{1}{\sqrt{3}}(|000\rangle + |110\rangle + |111\rangle) \tag{5}$$

Now let us give an explicit process for the TSET.

(S1) Alice and Bob agree that each of the four Bell bases can carry two bits of classical information, and encode $|\phi^+\rangle$, $|\phi^-\rangle$, $|\psi^+\rangle$ and $|\psi^-\rangle$ as 00, 01, 10 and 11, respectively.

(S2) Bob prepares a large enough number of Bell states, randomly inserts enough extended three-particle GHZ states and dispenses the Bell states.

Bob prepares a large enough number (N) of Bell states $|\phi^+\rangle$ in sequence. He extracts all the first particles of the Bell states and forms a sequence of particles A (the travel qubits), in order. The remaining particles of the Bell states form a sequence of particles B (the home qubits), in order. These particles are used to transmit the secure message.

Bob prepares a large number $N_0$ of the extended three-particle GHZ states $|\psi\rangle$ and forms a sequence of particles C, in order. These particles are used to detect eavesdropping. Note that the C sequence includes $3N_0$ qubits.

For C, Bob keeps particle 3 of each extended three-particle GHZ state and measures it in the Z-basis. After that, Bob inserts particles 1 and 2 of each extended three-particle GHZ state into the particles A at random positions. A new sequence D containing decoy particles is thus produced, and only Bob knows their positions.

Bob stores the particles B and sends the particles D to Alice.

(S3) The detection of eavesdropping.

After Alice has received the particles D, Bob tells her the positions of the decoy photons and the results of his measurements. Alice extracts the decoy photons from the particles D and performs the measurement. If Bob's measurement result is $|0\rangle$ and Alice's measurement result is $|00\rangle + |11\rangle$, or Bob's measurement result is $|1\rangle$ and Alice's measurement result is $|11\rangle$, they can trust that there is no eavesdropper. This is the first eavesdropping detection. If the error rate is small, Alice and Bob can conclude that there are no eavesdroppers in the line, and they continue with step (S4); otherwise, they have to discard their transmission and abort the communication.

(S4) Alice encodes her messages on the particles C and transmits them to Bob.

The dense coding scheme of Bennett and Wiesner[8] is used to encode the message, where the information is encoded on an EPR pair with a local operation on a single qubit. Explicitly, Alice applies one of the four unitary operations $(u_0, u_1, u_2, u_3)$ to each of her particles. These operations transform the state $|\phi^+\rangle$ into $|\phi^+\rangle$, $|\phi^-\rangle$, $|\psi^+\rangle$ and $|\psi^-\rangle$ of Eqs.(1), (2), (3), (4), respectively, and correspond to 00, 01, 10 and 11, respectively. In order to ensure the security of this transmission, Alice has to insert some decoy particles into the sequence C. She inserts enough extended three-particle GHZ states into the sequence C, as Bob does in (S2), and measures them in the Z-basis. Only Alice knows their positions.

(S5) Bob decrypts Alice's secure message with the Bell-basis measurement on the particles B and C simultaneously.

After the transmission of the encoded particles C, Alice tells Bob the positions of the decoy photons. Bob makes the second eavesdropping detection. If Alice's measurement result is $|0\rangle$ and Bob's measurement result is $|00\rangle + |11\rangle$, or Alice's measurement result is $|1\rangle$ and Bob's measurement result is $|11\rangle$, they can trust that there is no eavesdropper, allowing them to continue. Then Bob performs the Bell-basis measurement on the particles B and C simultaneously and gets the secure message. In fact, in the second transmission, Eve can only disturb the transmission and cannot steal the information, because she can only get one particle from an EPR pair.

(S6) The TSET protocol ends.

III. The Security Analysis

From the description above, it can be seen that the main difference between TSE and TSET is the method of eavesdropping detection. In Ref.[17], the authors compute the maximal amount of information (I) that Eve can eavesdrop and the probability (d) that Eve is detected, and the function $I_0$ is provided. When $p_0 = p_1 = p_2 = p_3 = 1/4$,

$$I_0 = -\sum_{i=0}^{3} \lambda_i \log_2 \lambda_i \tag{6}$$

such that for i = 0, 1, 2, 3,

$$\lambda_{0,2} = \frac{1}{4} + \frac{1}{2}\sqrt{\frac{1}{4} - d(1-d)} \tag{7}$$

$$\lambda_{1,3} = \frac{1}{4} - \frac{1}{2}\sqrt{\frac{1}{4} - d(1-d)} \tag{8}$$

So the above method can be used to compare the efficiency of eavesdropping detection between the two protocols.

Now, let us analyze the efficiency of the eavesdropping detection in the TSET protocol. In order to gain the information that Alice encodes on the travel qubits, Eve first performs the unitary attack operation $\hat{E}$ on the composed system. Then Alice performs the coding operation on the travel qubits, and finally Eve performs a measurement on the composed system. Note that all transmitted particles are sent together before eavesdropping is detected. This method differs from the original dense coding. Because Eve does not know which particles are used to detect eavesdropping, she can only perform the same attack operation on all the particles. As for Eve, the state of the travel qubits is indistinguishable from the complete mixture, so all the travel qubits are considered to be in either of the states $|0\rangle$ or $|1\rangle$ with equal probability p = 1/2.
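The TSE trade-off between information and detection probability in Eqs.(6)–(8) is easy to check numerically. The sketch below assumes the reading $\lambda_{0,2} = 1/4 + \frac{1}{2}\sqrt{1/4 - d(1-d)}$ and $\lambda_{1,3} = 1/4 - \frac{1}{2}\sqrt{1/4 - d(1-d)}$ for Eqs.(7) and (8); the function names are illustrative and not from the paper.

```python
import math


def tse_eigenvalues(d):
    """Eigenvalues of Eqs.(7)-(8) for detection probability d (assumed form)."""
    # Guard against tiny negative values caused by floating-point rounding.
    root = math.sqrt(max(0.0, 0.25 - d * (1.0 - d)))
    high = 0.25 + 0.5 * root   # lambda_0 and lambda_2
    low = 0.25 - 0.5 * root    # lambda_1 and lambda_3
    return [high, low, high, low]


def tse_information(d):
    """I_0 of Eq.(6): entropy of the eigenvalues, with 0*log2(0) taken as 0."""
    return -sum(lam * math.log2(lam) for lam in tse_eigenvalues(d) if lam > 0.0)


if __name__ == "__main__":
    for d in (0.1, 0.3, 0.5):
        print(d, tse_information(d))
    # At d = 0.5 all four eigenvalues equal 1/4, so I_0 = 2 bits.
```

Under this reading, Eve reaches the full two bits only at d = 0.5, which agrees with the 50% detection rate quoted for the TSE.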

Generally speaking, suppose that there is a group of decoy photons in the extended three-particle GHZ state $|\psi\rangle$. Similar to Ref.[20], suppose that after Eve performs the attack operation $\hat{E}$, the states $|0\rangle$ and $|1\rangle$ become

$$|\varphi_0'\rangle = \hat{E} \otimes |0x\rangle = \alpha|0x_0\rangle + \beta|1x_1\rangle \tag{9}$$

$$|\varphi_1'\rangle = \hat{E} \otimes |1x\rangle = m|0y_0\rangle + n|1y_1\rangle \tag{10}$$

where $|x_i\rangle$ and $|y_i\rangle$ are the pure ancillary states determined uniquely by $\hat{E}$, and

$$|\alpha|^2 + |\beta|^2 = 1, \quad |m|^2 + |n|^2 = 1 \tag{11}$$

Then let us compute the detection probability. After being attacked by Eve, the state of the composed system becomes

$$
\begin{aligned}
|\psi\rangle_{Eve} &= \hat{E} \otimes \hat{E} \otimes I \otimes \frac{1}{\sqrt{3}}\big(|0x0x\rangle \otimes |0\rangle + |1x1x\rangle \otimes |0\rangle + |1x1x\rangle \otimes |1\rangle\big) \\
&= \frac{1}{\sqrt{3}}\big[(\alpha|0x_0\rangle + \beta|1x_1\rangle) \otimes (\alpha|0x_0\rangle + \beta|1x_1\rangle) \otimes |0\rangle \\
&\qquad + (m|0y_0\rangle + n|1y_1\rangle) \otimes (m|0y_0\rangle + n|1y_1\rangle) \otimes |0\rangle \\
&\qquad + (m|0y_0\rangle + n|1y_1\rangle) \otimes (m|0y_0\rangle + n|1y_1\rangle) \otimes |1\rangle\big] \\
&= \frac{1}{\sqrt{3}}\big[(\alpha^2|0x_00x_0\rangle + \alpha\beta|0x_01x_1\rangle + \alpha\beta|1x_10x_0\rangle + \beta^2|1x_11x_1\rangle) \otimes |0\rangle \\
&\qquad + (m^2|0y_00y_0\rangle + mn|0y_01y_1\rangle + mn|1y_10y_0\rangle + n^2|1y_11y_1\rangle) \otimes |0\rangle \\
&\qquad + (m^2|0y_00y_0\rangle + mn|0y_01y_1\rangle + mn|1y_10y_0\rangle + n^2|1y_11y_1\rangle) \otimes |1\rangle\big] \\
&= \frac{1}{\sqrt{3}}\big(\alpha^2|0x_00x_00\rangle + \alpha\beta|0x_01x_10\rangle + \alpha\beta|1x_10x_00\rangle + \beta^2|1x_11x_10\rangle \\
&\qquad + m^2|0y_00y_00\rangle + mn|0y_01y_10\rangle + mn|1y_10y_00\rangle + n^2|1y_11y_10\rangle \\
&\qquad + m^2|0y_00y_01\rangle + mn|0y_01y_11\rangle + mn|1y_10y_01\rangle + n^2|1y_11y_11\rangle\big)
\end{aligned}
\tag{12}
$$

Obviously, when Alice performs the extended three-particle GHZ measurement on the decoy photons, the probability that no eavesdropper is detected is

$$p(|\psi\rangle_{Eve}) = \frac{1}{3}\big(|\alpha^2|^2 + |\beta^2|^2 + |m^2|^2 + |n^2|^2 + |n^2|^2\big) \tag{13}$$

So the lower bound of the detection probability is

$$d = 1 - p(|\psi\rangle_{Eve}) \tag{14}$$

Now, let us analyze how much information Eve can gain. Suppose $|\alpha|^2 = a$, $|\beta|^2 = b$, $|m|^2 = s$, $|n|^2 = t$, where a, b, s and t are positive real numbers, and a + b = s + t = 1. Then

$$d = 1 - (2a^2 + 3t^2 - 2a - 2t + 2)/3 \tag{15}$$

After some simple mathematical calculations, when a = t, we get

$$d = 1 - (5a^2 - 4a + 2)/3 \tag{16}$$

Suppose that Bob sends $|0\rangle$ to Alice. The maximal amount of information is equal to the Shannon entropy of a binary channel. When $p_0 = p_1 = p_2 = p_3 = 1/4$, the information $I_0$ that Eve can get is

$$I_0 = -\sum_{i=0}^{3} \lambda_i \log_2 \lambda_i \tag{17}$$

such that for i = 0, 1, 2, 3,

$$\lambda_{0,2} = \frac{1}{4} + \frac{1}{2}\sqrt{\frac{1}{4} - \frac{2+\sqrt{9-15d}}{5}\left(1 - \frac{2+\sqrt{9-15d}}{5}\right)} \tag{18}$$

$$\lambda_{1,3} = \frac{1}{4} - \frac{1}{2}\sqrt{\frac{1}{4} - \frac{2+\sqrt{9-15d}}{5}\left(1 - \frac{2+\sqrt{9-15d}}{5}\right)} \tag{19}$$

Here $(2+\sqrt{9-15d})/5$ is the value of a obtained by solving Eq.(16) for a.

The above analysis shows that the functions $I(d_E)$ and $I(d_{ET})$ have similar algebraic properties. In order to compare the two functions, Fig.1 is given.

Fig. 1. The comparison of the two detection results

As Fig.1 shows, if Eve wants to gain the full information (I = 2), the probability of eavesdropping detection is $d_E(I = 2) = 0.5$ in the TSE, while it is $d_{ET}(I = 2) = 0.59$ in the TSET. Obviously, if Eve wants to get the same amount of information, she must face a higher detection probability in the TSET. Also, for the same detection probability, Eve eavesdrops less information.

Then assume that Bob sends $|1\rangle$ rather than $|0\rangle$. The above security analysis can be done in full analogy, resulting in the same crucial relations.

If Eve wants to gain the same amount of information, she must face a larger detection probability in the TSET than in the TSE. This also shows that the TSET is more secure than the TSE.

IV. Conclusions

In the TSET protocol, the secret message can be securely transmitted to the receiver, and no useful message leaks to the potential eavesdropper. Compared with the TSE protocol, the TSET protocol has the following differences:

(1) In the two-step quantum direct communication protocol using the EPR pair block[17], Alice sends the checking sequence to Bob to make the eavesdropping detection. If the error rate is low, Alice then sends the message-coding sequence to Bob. During this process, Eve can capture the state of particles in both the checking sequence and the message-coding sequence, so a little of the secret message may be leaked. In the TSET protocol, Bob prepares the particles rather than Alice, and Eve cannot capture the states of the home qubits, so the TSET protocol can improve the security of the transmission.

(2) In the TSET protocol, the eavesdropping detection is made twice. The first eavesdropping detection is made to ensure that there is no eavesdropper, so that Alice can encode the secret message and the transmission can continue. In fact, in the second transmission, Eve can only disturb the transmission and cannot steal the information. But if Alice and Bob can trust that there is no eavesdropper after the second eavesdropping detection, the particles which are used to transmit the secure message can be reused.

In this paper, only the situation in which the protocol is used as a QKD strategy is considered. So the weaknesses that the TSET protocol would necessarily face as QSDC, such as the noisy channel[21,22], are not considered. In future work, the security of other two-step quantum direct communication protocols and its improvement will be researched.

References

[1] C.H. Bennett and G. Brassard, “Quantum cryptography: Public-key distribution and coin tossing”, Proceeding of the IEEE International Conference on Computer, Systems and Signal Processing, Bangalore, India, pp.175–179, 1984.
[2] C.H. Bennett, G. Brassard, C. Crepeau, R. Jozsa, A. Peres and W.K. Wooters, “Teleporting an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels”, Phys. Rev. Lett., Vol.70, 1993.
[3] Kim. Yoon-Ho, S.P. Kulik and Shih. Yanhua, “Quantum teleportation with a complete Bell state measurement”, J. Mod. Opt., Vol.49, pp.221–236, 2002.
[4] J. Li, H.F. Jin and B. Jing, “Improved quantum “Ping-pong” protocol based on GHZ state and classical XOR Operation”, Sci. China Ser. G, Vol.54, No.9, pp.1612–1618, 2011.
[5] Z.J. Zhang, Y.M. Liu and D. Wang, “Perfect teleportation of arbitrary n-qubit states using different quantum channels”, Phys. Lett. A, Vol.372, pp.28–32, 2007.
[6] J. Li, D.J. Song, X.J. Guo and B. Jing, “Improved quantum “Ping-pong” protocol based on five-qubit GHZ state and classical CNOT operation”, Int. J. Theor. Phys., Vol.51, No.1, pp.292–302, 2012.
[7] H. Yuan, Q. He, X.Y. Hu et al., “Deterministic secure quantum communication with cluster state and bell-basis measurements”, Commun. Theor. Phys., Vol.50, 2008.
[8] J. Li, D.J. Song, X.J. Guo and B. Jing, “An improved “Ping-pong” protocol based on four-qubit genuine entangled state”, Chinese Journal of Electronics, Vol.20, No.3, pp.457–460, 2011.
[9] Prakash. Hari, “Quantum teleportation”, International Conference on Emerging Trends in Electronic and Photonic Devices and Systems, 2009.
[10] J. Li, D.J. Song, X.J. Guo and B. Jing, “Quantum secure direct communication protocol based on five-particles cluster state and classical XOR operation”, Acta Electronica Sinica, Vol.36, No.1, pp.31–36, 2012. (in Chinese)
[11] F. Akira, “Quantum teleportation and quantum information processing”, Quantum Electronics and Laser Science Conference, 2010.
[12] J. Li, H.F. Jin and B. Jing, “Improved security detection strategy for quantum “Ping-Pong” protocol and its security analysis”, Chin. Commun., Vol.8, No.3, pp.170–179, 2011.
[13] J. Li, D.J. Song, X.J. Guo and B. Jing, “An improved security detection strategy based on W state in “Ping-pong” Protocol”, Chinese Journal of Electronics, Vol.21, No.1, pp.117–120, 2012.
[14] M. Hillery, V. Buzek and A. Berthiaume, “Quantum secret sharing”, Phys. Rev. A, Vol.59, pp.1829–1834, 1999.
[15] S.K. Singh and R. Srikanth, “Generalized quantum secret sharing”, Phys. Rev. A, Vol.71, pp.012328, 2005.
[16] G.L. Long and X.S. Liu, “Theoretically efficient high-capacity quantum-key-distribution scheme”, Phys. Rev. A, Vol.65, 032302, 2002.
[17] F.G. Deng, G.L. Long and X.S. Liu, “Two-step quantum direct communication protocol using the Einstein-Podolsky-Rosen pair block”, Phys. Rev. A, Vol.68, 042317, 2003.
[18] G.L. Long, F.G. Deng, C. Wang, X.H. Li, K. Wen and W.Y. Wang, “Quantum secure direct communication and deterministic secure quantum communication”, Front. Phys. China, Vol.2, No.3, pp.251–272, 2007.
[19] K. Boström and T. Felbinger, “Deterministic secure direct communication using entanglement”, Phys. Rev. Lett., Vol.89, pp.187902, 2002.
[20] F. Gao, F.Z. Guo, Q.Y. Wen et al., “Comparing the efficiency of different detection strategies of the ‘Ping-pong’ protocol”, Sci. China, Ser. G-Phys Mech. Astron, Vol.39, No.2, pp.161–166, 2009. (in Chinese)
[21] A. Wójcik, “Eavesdropping on the “Ping-pong” quantum communication protocol”, Phys. Rev. Lett., Vol.90, No.5, pp.157901, 2003.
[22] F.G. Deng, X.H. Li, C.Y. Li et al., “Eavesdropping on the “Ping-pong” quantum communication protocol freely in a noise channel”, Chin. Phys. Lett., Vol.16, pp.277–281, 2007.

LI Jian Ph.D., Associate Professor of Beijing University of Posts and Telecommunications, interested in research on quantum information, quantum computation, quantum communication security, electronic commerce and artificial intelligence. (Email: [email protected])

YE Xinxin is a M.S. student in the School of Computer at the Beijing University of Posts and Telecommunications, China. She received the B.S. degree in computer science from Harbin Normal University. Her research interests include quantum information, information security and the security of the Internet of things. (Email: buptyezi@bupt.edu.cn)

LI Ruifan is a lecturer of Beijing University of Posts and Telecommunications, interested in research on quantum information, quantum computation, quantum communication security, information security and artificial intelligence. (Email: rfli@bupt.edu.cn)

ZOU Yongzhong is a lecturer of Beijing University of Posts and Telecommunications, interested in research on quantum information, quantum computation, quantum communication security, information security and artificial intelligence.

LU Xiaofeng is a lecturer of Beijing University of Posts and Telecommunications, interested in research on quantum information, quantum computation, quantum communication security, information security and artificial intelligence.

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

Interferometric Phase Statistics and Estimation Accuracy of Strong Scatterer for InSAR∗

XU Huaping, LI Shuang and FENG Liang

(School of Electronic and Information Engineering, Beihang University, Beijing 100191, China)

Abstract — Strong scatterers, such as sign points, exist in Interferometric synthetic aperture radar (InSAR) images. This paper focuses on the interferometric phase statistics of the strong scatterer in InSAR. Based on the InSAR echo model, the phase Probability density function (PDF) of the strong scatterer is deduced. It is then shown that the interferometric phase PDF of the strong scatterer is determined not only by the correlation coefficient but also by the characteristic parameter of the strong scatterer, and that the PDF becomes more concentrated as the intensity of the dominative scatterer increases. According to the interferometric phase PDF of the strong scatterer deduced in this paper, the phase Cramer-Rao lower bound (CRLB) of the strong scatterer is presented. The higher the intensity of the dominative scatterer is, the better the estimation accuracy that can be obtained. This reveals that the accuracy of Interferometric phase estimation (IPE) assessed with a strong scatterer is better than that in the actual case. Therefore, the Concomitant sign point (CSP) is presented to assess the accuracy of IPE. Simulation results illustrate the validity of the theory proposed in this paper.

Key words — Interferometric phase statistics, Interferometric synthetic aperture radar (InSAR), Cramer-Rao lower bound (CRLB), Sign point, Estimation accuracy.

∗Manuscript Received Aug. 2011; Accepted Oct. 2011.

I. Introduction

SAR is an active remote sensor and can observe the ground in all kinds of weather, day or night. It has unique advantages in terrain measurement, ocean monitoring, military affairs and so on. With the development of SAR, InSAR was presented to measure the terrain height efficiently. It obtains a geocoded Digital elevation model (DEM) from the phase difference of two SAR images of the same area acquired with a slight look angle difference. As an optimal technology for terrain measurement, InSAR has been widely discussed and rapidly improved, and the acquisition of a global high-precision DEM has always been the motive of InSAR technology development since its first introduction[1]. In 2000, the dual-antenna Shuttle radar topography mission (SRTM) was launched successfully by the National Aeronautics and Space Administration and the National Imagery and Mapping Agency, and the Jet Propulsion Laboratory (JPL) then obtained near-global topography DEM data for latitudes smaller than 60° using SRTM data[2]. In 2010, the spaceborne distributed SAR TanDEM-X was launched successfully by the German Aerospace Center. TanDEM-X aims to generate a world-wide, consistent, timely, high-precision DEM (with height accuracy better than 2 m)[3].

The interferometric phase, defined as the phase difference of two SAR images of the same area, is one of the key parameters in InSAR height estimation. Its estimation accuracy is the primary factor that determines the height estimation accuracy[4]. During InSAR data processing, precise and robust phase unwrapping algorithms depend on the Interferometric phase estimation (IPE) accuracy[5]. In order to discuss the accuracy of IPE effectively, the interferometric phase statistics are necessary. Moreover, to analyze the height estimation error in InSAR system design, the Cramer-Rao lower bound (CRLB) of the interferometric phase estimation error is calculated based on the interferometric phase statistics[6]. Therefore, it is important to investigate the interferometric phase statistics.

The interferometric phase statistics of distributed scatterers, based on the developed system theoretical approach, were first presented by Just and Bamler[7]. The interferometric phase statistics of multi-look data were investigated by Lee et al.[8]. Recently, with the development of multichannel InSAR, the multichannel interferometric phase statistics were studied, and the CRLB of IPE was deduced and widely used for multi-baseline, multi-frequency InSAR performance assessment[9−11]. These investigations are all based on the hypothesis that the SAR data are jointly Gaussian distributed, and this hypothesis is only acceptable in the ideal uniform scene. However, in the real world, strong scatterers are embedded in the uniform scene in many cases and appear as bright points in the SAR image. For example, the Ground control point (GCP), which is necessary for error compensation in baseline estimation, unwrapped phase correction and so on, appears as a strong scatterer[12]. In addition, the sign point, which is required more extensively for phase unwrapping and height estimation accuracy assessment, also appears as a strong scatterer[12]. This paper can be considered as an extension of the interferometric phase statistics of the uniform scene.

It is well known that the strong scatterer is widely used for InSAR data processing and performance assessment, whereas its statistics are unknown at the moment. In order to discuss the rationality of error compensation using GCP and of performance assessment using sign points, the interferometric phase statistics of the strong scatterer must be investigated for InSAR. To solve these problems, this paper focuses on deducing the PDF of the interferometric phase of the strong scatterer, analyzing its statistics, and calculating the CRLB of IPE. Moreover, we propose a more objective method of accuracy assessment of IPE, with CSP, than that with the strong scatterer.

This paper is organized as follows: Section II describes the model and statistics of the interferometric phase of the uniform scene. In Section III, the strong scatterer is defined and its interferometric phase statistics are deduced and analyzed in detail. Then, the CRLB of IPE of the strong scatterer is discussed in Section IV. In the end, CSP is presented to assess the interferometric phase estimation performance, and computer simulations demonstrate the rationality and validity of the theoretical analysis.

II. InSAR Phase Statistics of Uniform Scene

The uniform scene is also known as a distributed scatterer. Each resolution cell of its SAR echo data is the aggregate of a number of discrete scatterers. In the case of the uniform scene, the PDF $p(\Delta\phi_u)$ of the interferometric phase $\Delta\phi_u$ is[8]:

$$
p(\Delta\phi_u) = \frac{1-|\gamma|^2}{2\pi} \cdot \frac{1}{1-|\gamma|^2\cos^2(\Delta\phi_u - \phi_{m,n})} \cdot \left\{1 + \frac{|\gamma|\cos(\Delta\phi_u - \phi_{m,n})\,\arccos[-|\gamma|\cos(\Delta\phi_u - \phi_{m,n})]}{\sqrt{1-|\gamma|^2\cos^2(\Delta\phi_u - \phi_{m,n})}}\right\}
\tag{1}
$$

where $\gamma$ is the complex correlation coefficient of the two SAR images and $\phi_{m,n} = \arg(\gamma)$.

The interferometric phase PDF $p(\Delta\phi_u)$ is a parametric distribution depending upon the complex correlation coefficient $\gamma$. Fig.1 shows the PDF $p(\Delta\phi_u)$ for different amplitudes $|\gamma|$ of the complex correlation coefficient in the case of $\phi_{m,n} = 0$. Fig.2 shows the PDF $p(\Delta\phi_u)$ for different phases $\phi_{m,n}$ of the complex correlation coefficient in the case of $|\gamma| = 0.9$.

III. InSAR Phase Statistics of Strong Scatterer

There are many cases in which a strong scatterer appears in the SAR image, such as a GCP, a sign point and so on. Therefore, it makes sense to discuss the statistics of the strong scatterer. One dominative scatterer in one resolution cell is assumed in order to simplify the whole deduction. The SAR model of one resolution cell with one strong scatterer can be established as follows:

$$S_s = \sum_{k=1}^{N} A_{sk}\exp(j\phi_{sk}) + B\exp(j\theta) \tag{2}$$

where B is the known amplitude of the dominative scatterer, and $\theta$ is the phase of the dominative scatterer, which is random and uniformly distributed in $[-\pi, \pi]$. $A_{sk}$ and $\phi_{sk}$ are the amplitude and phase of the background.

1. SAR phase statistics of strong scatterer

The strong scatterer can be considered as the sum of a narrow band stationary Gaussian random process and a sinusoid with random phase, and its SAR model $S_s(t)$ can be rewritten as:

$$S_s(t) = A_s(t)\cos[\omega_0 t + \Phi_s(t)] \tag{4}$$

where $\omega_0$ is the carrier frequency, and $A_s(t)$ and $\Phi_s(t)$ are the amplitude and phase of $S_s(t)$. The phase PDF $p(\phi_s)$ is:

$$
p(\phi_s) = \frac{1}{2\pi}\exp\left(-\frac{B^2}{2\sigma^2}\right) + \frac{B\cos(\phi_s - \theta)}{\sqrt{2\pi}\,\sigma} \cdot \exp\left\{-\frac{B^2\sin^2(\phi_s - \theta)}{2\sigma^2}\right\} \cdot \left\{\frac{1}{2} + \Psi\left[\frac{B\cos(\phi_s - \theta)}{\sigma}\right]\right\}
\tag{5}
$$

where $\Psi(x)$ is the Laplace function. From Eq.(5), it is shown that:

(1) If the SBR is low, that is $B \to 0$, then $p(\phi_s) = 1/2\pi$. The phase PDF $p(\phi_s)$ of the strong scatterer approaches the uniform distribution. In this case, the phase statistics are the same as in the uniform scene.

(2) If the SBR is high, then Eq.(5) becomes:

$$p(\phi_s) = \frac{B\cos(\phi_s - \theta)}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{B^2\sin^2(\phi_s - \theta)}{2\sigma^2}\right\} \tag{6}$$

Therefore, the amplitude B of the dominative scatterer determines the shape of the PDF $p(\phi_s)$, while the phase $\theta$ of the dominative scatterer translates the curve. The results are illustrated in Fig.3 and Fig.4. Fig.3 shows the phase PDF $p(\phi_s)$ for different SBR. The variation of B represents the level of SBR. Fig.4 shows a sequence of plots of the phase PDF $p(\phi_s)$ for different $\theta$.

2. InSAR phase statistics of strong scatterer

The models of the strong scatterer in the co-registered SAR image pair, $S_{sm}$ and $S_{sn}$, are:

$$S_{sm} = A_{sm}\cos(\omega_0 t + \Phi_{sm}), \quad S_{sn} = A_{sn}\cos(\omega_0 t + \Phi_{sn}) \tag{7}$$

The interferometric phase $\Delta\Phi_s$ of the strong scatterer is calculated by:

$$\Delta\Phi_s = \Phi_{sm} - \Phi_{sn} \tag{8}$$

where $A_{sm}$, $A_{sn}$, $\Phi_{sm}$ and $\Phi_{sn}$ are the amplitudes and phases of $S_{sm}$ and $S_{sn}$. Therefore, the interferometric phase statistics of the strong scatterer can be deduced from $p(a_{smc}, a_{snc}, a_{sms}, a_{sns})$:

$$
p(a_{smc}, a_{snc}, a_{sms}, a_{sns}) = \frac{1}{(2\pi)^2\det^{1/2}(C)} \cdot \exp\left\{-\frac{1}{2}(S - M)^T C^{-1}(S - M)\right\}
\tag{9}
$$

where $a_{smc}$, $a_{snc}$, $a_{sms}$ and $a_{sns}$ are the in-phase and quadrature components of $S_{sm}$ and $S_{sn}$ respectively, and they are functions of $\theta$.

$$S = [a_{smc}, a_{snc}, a_{sms}, a_{sns}]^T \tag{10}$$
Fig.4 describes a sequence of plots of images and φm,n =arg(γ). the phase PDF p(φs) with different θ. p φ The interferometric phase PDF (Δ u)isaparametric 2. InSAR phase statistics of strong scatterer γ distribution depending upon complex correlation coefficient . The models of strong scatterer of co-registered SAR image Fig.1 shows PDF p(Δφu) with different amplitude |γ| of com- pair Ssm and Ssn are shown as: plex correlation coefficient in the case of φmn = 0. Fig.2 shows the PDF p(Δφu) with different phase φmn of complex corre- Ssm = Asm cos(ω0t + Φsm),Ssn = Asn cos(ω0t + Φsn)(7) lation coefficient in the case of |γ| =0.9. The interferometric phase ΔΦs of strong scatterer is calculated III. InSAR Phase Statistics of Strong by: ΔΦs = Φsm − Φsn (8) Scatterer where Asm,Asn, Φsm and Φsn are the amplitude and phase of Ssm and Ssn. Therefore, the interferometric phase statistics of There are many cases that strong scatterer appears in SAR strong scatterer can be deduced from p(asmc,asnc,asms,asns): image such as GCP, sign point and so on. Therefore, it makes sense to discuss the statistics of strong scatterer. One domina- 1 p(asmc,asnc,asms,asns)= tive scatterer in one resolution is assumed for simplifying the (2π)2det1/2(C) whole deduction. The SAR model of one resolution cell with 1 T −1 · exp − (S − M) C (S − M) one strong scatterer can be established as follows: 2 N (9) Ss = Ask exp(jφsk)+B exp(jθ)(2) a ,a ,a a k=1 where, smc snc sms and sns are in-phase and quadrature components of Ssm and Ssn respectively, and they are the B θ Where is the known amplitude of dominative scatterer. is functions about θ. the phase of dominative scatterer, which is random and uni- T form distribution in [−π,π]. Ask and φsk are the amplitude S =[asmc,asnc,asms,asns] (10) T and phase of background. 
Strong Scatterer to background ra- M =[E[asmc],E[asnc],E[asms],E[asns]] (11) tio (SBR) is defined as: Making the first change of variables: B2 2 2 2 2 SBR = 10 lg (3) asm = asmc + asms,asn = asnc + asns, 2σ2 742 Chinese Journal of Electronics 2012 a a 2 2 sns sms K2 / σ q −|pmn| φsn =arg ,φsm =arg (12) =1 2 ( ) (15) asnc asmc

Making second change of variables: pmn,qis the parameters about the imaging system (see Ref.[7]) √ √ The interferometric phase PDF p(Δφs)is: xm = qnasm,xn = qmasn,q= qmqn (13)  π  ∞  ∞ qm and qn are the parameters about the imaging system (see p(Δφs)= p(xm,xn, Δφs,φsm)dxmdxndφsm Ref.[7]). −π 0 0 Making the last change of variable: Δφs = φsm − φsn. (16) Eq.(9) reduces to the expression of PDF p(xm,xn, Δφs,φsm), It is hard to obtain the analytical expression of p(Δφs), so and is shown as follows: numerical calculation is used to discuss the interferometric phase statistics and its results are shown in Fig.5, Fig.6 and p(xm,xn, Δφs,φsm)=K1xmxn p φ 2 2 2 √ Fig.7. Fig.5 describes the interferometric phase PDF (Δ s) · exp{−K2[xn + xm +2B (q −|pmn| q)} √ of strong scatterer with different correlation coefficient. Fig.6 · exp{2K2|pmn|xmxn cos[Δφs − (θm − θn)]/ q} and Fig.7 illustrate the effect of strong scatterer on PDF √ xn cos(φsm − Δφs − θn) p(Δφs), where a sequence of SBR and θm − θn are applied. · exp − 2BK2(|pmn|− q) +xm cos(φsm − θm) Comparing Fig.1 and Fig.2 with Fig.5, Fig.6 and Fig.7, it (14) is obvious to obtain the conclusion: (1) The correlation coefficient has the same effect on the where, InSAR image phase statistics of strong scatterer as that of 2 4 2 K1 =1/(2π) σ q(q −|pmn| ) uniform scene.

Fig. 1. The PDF p(Δφu) with different |γ| (φmn = 0)
Fig. 2. The PDF p(Δφu) with different φmn (|γ| = 0.9)
Fig. 3. The PDF p(φs) with different SBR (θ = 0)
Fig. 4. The PDF p(φs) with different θ (SBR = 11 dB)
Fig. 5. Interferometric phase PDF with different |γ|
Fig. 6. Interferometric phase PDF with different SBR
Fig. 7. Interferometric phase PDF with different θm − θn
Fig. 8. The CRLB of uniform scene and strong scatterer

(2) Unlike the uniform scene, the InSAR phase statistics of a strong scatterer are determined not only by the correlation coefficient but also by the characteristic parameters of the strong scatterer. As the SBR increases, the interferometric phase PDF p(Δφs) becomes more concentrated, while θm − θn only translates the interferometric phase PDF curve. If B → 0, Eq.(16) reduces to Eq.(1). Therefore, the phase statistics of the uniform scene can be considered a particular case of those of the strong scatterer.

Since all our information is contained in the PDF p(Δφs) calculated from the observed data Ssm and Ssn, it is not surprising that the accuracy of IPE depends directly on the PDF p(Δφs). Moreover, we should not expect to be able to estimate a parameter with any degree of accuracy if the PDF depends only weakly upon that parameter or, in the extreme case, does not depend on it at all. In general, the more the PDF is influenced by an unknown parameter, the better we are able to estimate it. Fig.6 illustrates that the interferometric phase PDF p(Δφs) concentrates gradually as the SBR increases. Therefore, we can conclude that the higher the SBR is, the better the attainable accuracy of IPE.

IV. CRLB of IPE for Strong Scatterer

As a lower bound on the interferometric phase estimation error, the CRLB is needed for InSAR system design and performance analysis. The CRLB of IPE is deduced here from the interferometric phase PDF, and the difference between the CRLBs of the uniform scene and the strong scatterer is then analyzed. The CRLB of the IPE for the uniform scene is given by[11]:

$$\mathrm{CRLB} = -\left\{E\left[\frac{\partial^2 f_{ML}(\Delta\varphi)}{\partial \Delta\varphi^2}\right]\right\}^{-1} \tag{17}$$

In the case of the strong scatterer, substituting Eq.(16) into Eq.(17), the CRLB of the interferometric phase of the strong scatterer, CRLBs, can be obtained:

$$\mathrm{CRLB}_s = P\,I^{-1}(\varphi_{sn},\varphi_{sm})\,P^{T} = -\frac{2}{\alpha + 2\beta} \tag{18}$$

where P = [1, −1], and the Fisher information matrix I is

$$I = -E\begin{bmatrix} \dfrac{\partial^2 \ln p(x_{sm},x_{sn},\varphi_{sm},\varphi_{sn})}{\partial\varphi_{sm}^2} & \dfrac{\partial^2 \ln p(x_{sm},x_{sn},\varphi_{sm},\varphi_{sn})}{\partial\varphi_{sm}\partial\varphi_{sn}} \\ \dfrac{\partial^2 \ln p(x_{sm},x_{sn},\varphi_{sm},\varphi_{sn})}{\partial\varphi_{sn}\partial\varphi_{sm}} & \dfrac{\partial^2 \ln p(x_{sm},x_{sn},\varphi_{sm},\varphi_{sn})}{\partial\varphi_{sn}^2} \end{bmatrix} = -\begin{bmatrix} \alpha+\beta & -\beta \\ -\beta & \alpha+\beta \end{bmatrix} \tag{19}$$

$$\alpha = \int_{-\pi}^{\pi}\int_{-\pi}^{\pi} 2K_2 B(\sqrt{q} - |p_{mn}|)\,x_n\cos(\varphi_{sn}-\theta_n)\cdot p(\varphi_{sm},\varphi_{sn})\,d\varphi_{sm}\,d\varphi_{sn} \tag{20}$$

$$\beta = \int_{-\pi}^{\pi}\int_{-\pi}^{\pi} 2K_2|\gamma|\,x_m x_n\cos(\varphi_{sm}-\theta_m-\varphi_{sn}+\theta_n)\cdot p(\varphi_{sm},\varphi_{sn})\,d\varphi_{sm}\,d\varphi_{sn} \tag{21}$$

$$p(\varphi_{sm},\varphi_{sn}) = \int_{0}^{\infty}\int_{0}^{\infty} p(x_m,x_n,\varphi_{sm},\varphi_{sn})\,dx_m\,dx_n \tag{22}$$

According to Eq.(18), a group of curves of CRLB versus correlation coefficient for the uniform scene (B = 0) and the strong scatterer is plotted in Fig.8, which illustrates the following points.

(1) As the correlation coefficient increases, the CRLB decreases, which is valid for both the uniform scene and the strong scatterer. But the CRLB of the strong scatterer is lower than that of the uniform scene when the correlation coefficients are equal.

(2) For the strong scatterer, the higher the SBR is, the lower the CRLB is. Therefore, we can conclude that the accuracy assessment of IPE made with a higher SBR appears better than the accuracy in the practical case.

V. Computer Simulation and Result Analysis

The simulation starts from the echo signal, and the Chirp scaling (CS) algorithm is employed for SAR imaging. In the simulation, the in-phase and quadrature components of the SAR image are independent and identically distributed Gaussian variables with mean 0 and variance 1; the correlation coefficient is 0.91; the orbit parameters are listed in Table 1. The SAR single-look complex images with strong scatterer (SBR = 13 dB) and CSP are shown in Fig.9(a) and 9(b). The maximum correlation coefficient is used in image co-registration, and the wrapped interferometric phase is generated by multiplying the master images with the complex conjugate of the co-registered images respectively; the resulting interferograms are shown in Fig.9(c). The interferometric phase is unwrapped by a noise-immune algorithm, and the unwrapped phase image is given in Fig.9(d).

Fig. 9. Simulation results (SBR = 13 dB)

Table 1. Orbit parameters

Parameters                   Master satellite   Slave satellite
Semi-major axis (m)          6 886.000          6 886.000
Eccentricity                 0.000              0.0007
Inclination (deg)            97.000             97.000
Argument of perigee (deg)    89.000             89.000
RAAN (deg)                   218.000            218.000
Mean anomaly (deg)           0.000              −120.00
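The interferogram step of the simulation (multiplying the master image by the complex conjugate of the co-registered slave image and taking the phase) can be illustrated with a minimal Python sketch. The function name and the sample pixel values are hypothetical and are not taken from the paper; the sketch only shows that the result is the phase difference wrapped into (−π, π].

```python
import cmath
import math

def wrapped_interferometric_phase(master, slave):
    """Return arg(master * conj(slave)) for each pair of co-registered pixels."""
    return [cmath.phase(m * s.conjugate()) for m, s in zip(master, slave)]

# Two hypothetical unit-amplitude pixel pairs, specified by their phases (rad).
master = [cmath.exp(1j * 1.0), cmath.exp(1j * 3.0)]
slave = [cmath.exp(1j * 0.4), cmath.exp(-1j * 3.0)]

phases = wrapped_interferometric_phase(master, slave)

# The second pair has a true phase difference of 6.0 rad, which lies outside
# (-pi, pi], so the interferogram holds the wrapped value 6.0 - 2*pi instead.
expected_wrapped = 6.0 - 2 * math.pi
```

The second pixel shows why a phase unwrapping step is needed afterwards: the interferogram alone cannot distinguish 6.0 rad from 6.0 − 2π rad.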

In practice, comparing original value with estimate value of References the sign point expressed as strong scatterer in SAR image, the [1] L. Pang, J.X. Zhang and H.D. Fan, “Progress and tendency relative error relating to sign point is calculated as following of multibaseline synthetic aperture radar interferometry tech- formula: nique”, , Vol.38, No.9, pp.2152–2157, Acta Electronica Sinica 2 2010. (in Chinese) Δφr = (φˆi − φˆj )/C (23) M [2] A. Koch, C. Heipke, “Quality assessment of digital surface mod- 0

Radar Clutter Suppression Based on SαS Fractional Autoregressive Model∗

FENG Xun, WANG Shouyong, YANG Jun and ZHU Xiaobo

(Wuhan Radar Institute, Wuhan 430019, China)

Abstract - In correlated non-Gaussian clutter backgrounds, traditional clutter suppression methods degrade seriously. A new clutter suppression method is proposed based on the Symmetric alpha stable (SαS) fractional autoregressive model. The SαS fractional autoregressive model is treated as a stochastic process in which a fractional autoregressive system is driven by white SαS noise, and the clutter suppression filter is built from the model parameters, which are estimated using the Fractional lower order covariance (FLOC). The SαS fractional autoregressive model can effectively describe the non-Gaussian characteristics as well as the long and short correlation characteristics of the clutter, and FLOC yields more accurate parameter estimates. Simulations and real data results show that the proposed method clearly outperforms the traditional clutter suppression method under correlated non-Gaussian clutter backgrounds.

Key words - Clutter suppression, Correlated non-Gaussian clutter, Fractional autoregressive model, Fractional lower order covariance.

∗Manuscript Received Dec. 2011; Accepted May 2012.

I. Introduction

At present, Moving target detection (MTD) is the main technology for clutter suppression or coherent integration in radar signal processing, and essentially it is the periodogram of the observations[1]. In correlated clutter backgrounds, if the target Doppler falls in the clutter spectrum domain, target detection becomes difficult, especially for a wide clutter spectrum.

To obtain an effective clutter suppression method, an accurate clutter model must first be established, i.e. the clutter is taken as a stochastic process generated by white noise driving a parametric model. The statistical distribution of the white noise should be consistent with the real clutter, and the correlation characteristics of the parametric model should describe the actual clutter effectively. Accordingly, by estimating the model's parameters, an inverse system that whitens the correlated clutter can be obtained, and the correlated clutter can then be suppressed. The authors in Ref.[2] propose a clutter suppression method based on the Autoregressive (AR) model. They take the clutter as a stochastic process generated by Gaussian white noise driving an AR model. However, radar usually faces a complex clutter environment. In 1999, the Australian Defence Science and Technology Organisation (DSTO) analyzed a large amount of sea clutter data collected from maritime surveillance radars in different sea states and pointed out that the statistics of radar sea clutter are mostly non-Gaussian and often have long and short correlation characteristics[3]. As the statistics of real radar clutter are often non-Gaussian and the clutter's correlation characteristics cannot be completely described by the AR model, the method proposed in Ref.[2] does not work well under correlated non-Gaussian clutter backgrounds.

There is another important statistical distribution, the Alpha-stable distribution[4], which is suitable for modeling non-Gaussian clutter[5] and has been well demonstrated by experimental radar clutter measurements. The Alpha-stable distribution is the only distribution that satisfies the generalized central limit theorem; it is a direct generalization of the Gaussian distribution and in fact includes the Gaussian as a limiting case. The authors in Ref.[6] report an experiment showing that real radar sea clutter follows the Symmetric Alpha-stable distribution (SαS) with the parameter α = 1.85. In Ref.[7], it is also shown that the Alpha-stable distribution describes radar clutter better than the Gaussian, Weibull and K-distributions. Moreover, since the AR model cannot completely describe the complex correlation characteristics of the clutter, the fractional pole model[8] is adopted to establish the fractional autoregressive model, which is a cascade of a Fractional unit pole (FP) model followed by an AR model. Because the FP model is suitable for describing the clutter's long correlation and the AR model is suitable for describing the clutter's short correlation, the fractional autoregressive model is well suited to describing the clutter's complex correlation characteristics.

Based on the fractional autoregressive model, this paper proposes a clutter suppression method in which clutter is modeled as a stochastic process generated by white SαS noise driving a fractional autoregressive model. As the SαS distribution has no second-order statistics[9], and the parameter estimation method based on Fractional lower order moments (FLOM)[10] shows poor accuracy in actual use, we present a parameter estimation method based on Fractional lower order covariance (FLOC)[11], which can improve the estimation accuracy. Finally, computer simulations and real data tests are given. The results show that the SαS fractional autoregressive model can effectively describe the statistics of non-Gaussian clutter as well as the long and short correlation characteristics of the clutter, and that the proposed method can effectively suppress the correlated non-Gaussian clutter.

The rest of the paper is organized as follows. In Section II, the SαS fractional autoregressive model and the clutter suppression filter are introduced. In Section III, the parameter estimation of the SαS fractional autoregressive model is presented based on FLOC. In Section IV, the performance of the proposed estimation method and the clutter suppression filter is analyzed by simulations and real data tests.

II. SαS Fractional Autoregressive Model and Clutter Suppression Method

In this section we take the observed clutter samples X(n) as the output of a linear system driven by white SαS noise. As shown in Fig.1, Hd(z) is the system function of the FP model, HAR(z) is the system function of the AR model, and U(n) is an i.i.d. SαS process[12]. The parameter α (0 < α ≤ 2) is called the characteristic exponent and determines the density tails of the distribution: the smaller α is, the stronger the non-Gaussian characteristics of X(n) are. The parameter γ (γ > 0) is the dispersion of the distribution; it plays a role analogous to that of the variance for a Gaussian process. d is called the fractional pole parameter, ak are the AR model parameters, and P is the order. By estimating the above parameters, we can get the FP model's inverse system $\hat{H}_d^{-1}(z)$ as well as the AR model's inverse system $\hat{H}_{AR}^{-1}(z)$. In Fig.1, X(n) is the SαS fractional autoregressive process, while the output of the filters is an i.i.d. SαS process.

Fig. 1. Block diagram of clutter modeling and clutter suppression based on SαS fractional autoregressive model

In order to illustrate the characteristics of the FP model and the AR model, Fig.2 and Fig.3 give the curves of the two models' autocorrelation functions under different model parameters. The FP model's autocorrelation function $r(l) = C_d|l|^{2d-1}$ $(C_d > 0)$[12] is a power function of the lag, while the first-order AR model's autocorrelation function $r(l) = C_{AR}\,a_1^{|l|}/(1 - a_1^2)$ $(C_{AR} > 0)$ is exponential. Comparing Fig.2 with Fig.3, we can see that the FP model has strong long correlation properties, while the AR model has short-term correlation properties. Therefore the cascade model is well suited to describing the clutter's complex correlation[12].

Fig. 2. Autocorrelation function of FP model

Fig. 3. Autocorrelation function of first-order AR model

III. Parameter Estimation of SαS Fractional Autoregressive Model

1. Parameter estimation of FP model

The key step in whitening correlated non-Gaussian clutter is to accurately estimate the model parameters. In Ref.[13], a method is proposed for estimating the fractional pole parameter d by calculating the slope of the power spectrum. However, the SαS fractional autoregressive process does not have a second-order moment, so the traditional estimation based on the power spectrum is no longer applicable. The authors in Ref.[10] propose a covariation spectrum to replace the power spectrum; the covariation spectrum is the Fourier transform of the covariation coefficient, which is defined as

$$\lambda(k) = \frac{[X(n), X(n-k)]_\alpha}{[X(n-k), X(n-k)]_\alpha} \tag{1}$$

where [x, y]α represents the covariation between the variables x and y. In Ref.[10], the authors use Fractional lower order moments (FLOM) to estimate the covariation coefficients, that is

$$\lambda(k) = E[X(n)X(n-k)^{2s-1}], \quad 1 \le 2s < \alpha \tag{2}$$

where the operation $(\cdot)^{s} = |\cdot|^{s-1}(\cdot)$ is called the fractional lower power transformation and s is known as the fractional lower index. It should be pointed out that this method still has some problems and limitations. If X(n) is independent of X(n − k),

according to Eq.(2), we have

$$E[|X(n)X(n-k)^{2s-1}|^2] = E[|X(n)|^2]\,E[|X(n-k)|^{2(2s-1)}] = \infty \tag{3}$$

When estimating λ(k), the sample average is usually used instead of the statistical average. But Eq.(3) shows that the second-order moment does not exist, so the ergodic theorem is not satisfied and the estimate will not converge to the true value. In this paper we use the Fractional lower order covariance (FLOC) to estimate the spectrum function. The FLOC of X(n) can be expressed as[11]

$$\mathrm{FLOC}(k) = r_{floc}(k) = E\{[X(n)]^{s}([X(n-k)]^{s})^{*}\} \tag{4}$$

Its estimate is

$$\hat{r}_{floc}(k) = \frac{1}{N}\sum_{n=1}^{N} X(n)^{s}[X^{s}(n-k)]^{*} \tag{5}$$

According to Eq.(5), we have

$$E[|X(n)^{s}X(n-k)^{s}|^2] = E[|X(n)|^{2s}]\,E[|X(n-k)|^{2s}] < \infty \tag{6}$$

which is finite because 2s < α. The ergodic theorem is therefore satisfied and the FLOC of X(n) can be calculated by the sample average, so the FLOC estimation error is smaller than that of FLOM. Based on the concept of FLOC, we present a spectrum estimation method based on the fractional lower order covariance, which can be expressed as[12]

$$S_{floc}(e^{j\omega}) = \sum_{k=-(N-1)}^{N-1} r_{floc}(k)\,e^{-j\omega k} \tag{7}$$

For the fractional autoregressive model, the relationship between the fractional pole parameter d and its spectrum can be expressed as[12]

$$S(e^{j\omega}) \sim |\omega|^{-2d}, \quad \omega \to 0^{+} \tag{8}$$

Taking the fractional lower order covariance spectrum $S_{floc}(e^{j\omega})$ instead of $S(e^{j\omega})$, and applying a logarithmic transformation to Eq.(8), we get

$$\log S_{floc}(e^{j\omega}) \sim -2d\log(\omega), \quad \omega \to 0^{+} \tag{9}$$

There is a linear relationship between the fractional lower order covariance spectrum $S_{floc}(e^{j\omega})$ and the angular frequency ω in logarithmic coordinates, so the fractional pole parameter d is estimated by calculating the slope of the fractional lower order covariance spectrum:

$$\hat{d} = -\frac{L_{FLOC}}{2} \tag{10}$$

where $L_{FLOC}$ is the slope of the fractional lower order covariance spectrum.

2. Parameter estimation of AR model

Since the SαS fractional autoregressive process has long correlation characteristics, the AR parameters cannot be estimated directly. Firstly, X(n) should be filtered by the FP model's inverse system $\hat{H}_d^{-1}(z)$ in order to eliminate the impact of $H_d(z)$ (i.e. the long correlation characteristics). The impulse response of the inverse system can be expressed as

$$\hat{h}_d^{-1}(n) = \frac{\Gamma(n-\hat{d})}{\Gamma(-\hat{d})\,\Gamma(n+1)} \tag{11}$$

When the SαS fractional autoregressive process passes through the inverse system $\hat{H}_d^{-1}(z)$, it becomes an SαS AR process denoted $X_{ar}(n)$, and the AR parameters can then be estimated. Usually, AR parameters can be obtained by solving the generalized Yule-Walker equations[14], that is

$$\begin{bmatrix} \lambda(0) & \lambda(-1) & \cdots & \lambda(1-P) \\ \lambda(1) & \lambda(0) & \cdots & \lambda(2-P) \\ \vdots & \vdots & \ddots & \vdots \\ \lambda(P-1) & \lambda(P-2) & \cdots & \lambda(0) \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_P \end{bmatrix} = \begin{bmatrix} \lambda(1) \\ \lambda(2) \\ \vdots \\ \lambda(P) \end{bmatrix} \tag{12}$$

where $a_k$ (k = 1, ···, P) are the AR parameters and λ(k) are the covariation coefficients of $X_{ar}(n)$. Using FLOM to estimate λ(k) and substituting it into Eq.(12), the AR parameters $a_k$ can be estimated. But as described in Section III.1, owing to the limitation of FLOM, the estimate will not converge to the true value, which leads to a significant deterioration of the AR model parameter estimation. So we use FLOC to estimate the AR model parameters, and take the estimate of the fractional lower order covariance (i.e. $\hat{r}_{floc}(k)$) instead of the covariation coefficients λ(k) in Eq.(12), which gives

$$\begin{bmatrix} \hat{r}_{floc}(0) & \hat{r}_{floc}(-1) & \cdots & \hat{r}_{floc}(1-P) \\ \hat{r}_{floc}(1) & \hat{r}_{floc}(0) & \cdots & \hat{r}_{floc}(2-P) \\ \vdots & \vdots & \ddots & \vdots \\ \hat{r}_{floc}(P-1) & \hat{r}_{floc}(P-2) & \cdots & \hat{r}_{floc}(0) \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_P \end{bmatrix} = \begin{bmatrix} \hat{r}_{floc}(1) \\ \hat{r}_{floc}(2) \\ \vdots \\ \hat{r}_{floc}(P) \end{bmatrix} \tag{13}$$

The AR model parameters $\hat{a}_1, \hat{a}_2, \cdots, \hat{a}_P$ of the SαS fractional autoregressive process can be obtained by solving Eq.(13).

The steps of the clutter suppression method can now be summarized as follows.

Step 1 Estimate the FP parameter d of the clutter data X(n), pass X(n) through the inverse system $\hat{H}_d^{-1}(z)$, and obtain $X_{ar}(n)$.

Step 2 Estimate the AR parameters of $X_{ar}(n)$, which gives the full set of parameters of the SαS fractional autoregressive model.

Step 3 Design the clutter suppression filter according to the parameters, so that the clutter in the observation is suppressed.

Because the clutter suppression filter is established from pure clutter samples, when a target signal exists the filter whitens only the clutter, and the target signal can still be integrated at the output. The computational complexity is N³ + 3NP + 4P − P², where N is the length of the data.
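The estimation steps above can be illustrated with a minimal pure-Python sketch. It computes the inverse FP filter coefficients of Eq.(11) by the recursion h(0) = 1, h(n) = h(n−1)(n−1−d)/n (which follows from Γ(x+1) = xΓ(x)), the sample FLOC of Eq.(5), and the solution of the linear system of Eq.(13). All function names are hypothetical, negative lags are filled in under the assumption r(−k) = r*(k), and the small Gaussian-elimination solver is only a stand-in; this is a sketch under those assumptions, not the authors' implementation.

```python
import math

def fp_inverse_coeffs(d_hat, length):
    """Inverse FP filter impulse response, Eq.(11), via a gamma-free recursion."""
    h = [1.0]
    for n in range(1, length):
        h.append(h[-1] * (n - 1 - d_hat) / n)
    return h

def fp_inverse_coeff_gamma(d_hat, n):
    """Single coefficient evaluated directly from Eq.(11), for cross-checking."""
    return math.gamma(n - d_hat) / (math.gamma(-d_hat) * math.gamma(n + 1))

def frac_power(x, s):
    """Fractional lower power transformation |x|^(s-1) * x; zero maps to zero."""
    return 0.0 if x == 0 else abs(x) ** (s - 1) * x

def floc(x, k, s):
    """Sample FLOC of Eq.(5) at lag k, normalised by N (assumes r(-k) = conj(r(k)))."""
    if k < 0:
        return floc(x, -k, s).conjugate()
    total = sum(
        frac_power(x[n], s) * complex(frac_power(x[n - k], s)).conjugate()
        for n in range(k, len(x))
    )
    return complex(total) / len(x)

def solve_linear(a, b):
    """Gaussian elimination with partial pivoting (no singularity handling)."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ar_from_floc(x, order, s):
    """Solve Eq.(13): matrix entry (i, j) is r(i - j), right-hand side is r(i + 1)."""
    matrix = [[floc(x, i - j, s) for j in range(order)] for i in range(order)]
    rhs = [floc(x, i + 1, s) for i in range(order)]
    return solve_linear(matrix, rhs)

# Hypothetical check data: a noise-free sequence x(n) = 0.5 * x(n-1).
x = [0.5 ** n for n in range(40)]
a_hat = ar_from_floc(x, 1, 1.0)   # s = 1 reduces FLOC to the ordinary covariance
h = fp_inverse_coeffs(0.2, 3)     # d = 0.2 as in the simulation conditions
```

With s = 1 the sketch reduces to the ordinary Yule-Walker estimate, which is why the noise-free test sequence recovers a coefficient close to 0.5; for impulsive SαS data a value of s with 2s < α would be chosen instead. The sign convention of the returned coefficients follows Eq.(13) as written.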

IV. Experiments and Analysis

1. Simulations and performance analysis

(1) Effectiveness analysis of parameter estimation method based on FLOC

To verify the effectiveness of the parameter estimation method based on FLOC in correlated non-Gaussian clutter, we design simulation experiments to compare the estimation accuracy of the two methods based on FLOC and FLOM in an SαS fractional autoregressive process background. Simulation conditions: the fractional pole parameter is d = 0.2; the AR model parameters are a1 = −0.30 + 0.20i, a2 = −0.12 + 0.17i, a3 = −0.12 + 0.21i, a4 = 0.02 + 0.03i, a5 = −0.05 + 0.02i; the characteristic exponent is α = 1.8, 1.5; the dispersion is γ = 1; the data length is N = 64. In the FLOC method (i.e. the parameter estimation method based on FLOC), the fractional pole parameter d is calculated according to Eq.(10) and the AR model parameters are calculated according to Eq.(13). In the FLOM method (i.e. the parameter estimation method based on FLOM), the fractional pole parameter d is calculated according to the expression

$$\hat{d} = -\frac{L_{FLOM}}{2} \tag{14}$$

where $L_{FLOM}$ represents the slope of the covariation spectrum[10], and the AR model parameters are calculated according to Eq.(12). The Root mean square errors (RMSE) of the two estimation methods, obtained from 20 independent runs, are shown in Table 1. The FLOC method generally performs better than the FLOM method for both values of α; the smaller the value of α, the lower the accuracy of the FLOM method, whereas the FLOC method maintains the same good performance.

Table 1. Root mean square error (RMSE) of the FLOC method and FLOM method

Model parameters        α = 1.8 (RMSE)      α = 1.5 (RMSE)
                        FLOC     FLOM       FLOC     FLOM
d = 0.2                 0.083    0.101      0.098    0.135
a1 = −0.30 + 0.20i      0.076    0.098      0.122    0.182
a2 = −0.12 + 0.17i      0.081    0.093      0.078    0.134
a3 = −0.12 + 0.21i      0.079    0.125      0.092    0.161
a4 = 0.02 + 0.03i       0.041    0.103      0.048    0.117
a5 = −0.05 + 0.02i      0.074    0.059      0.069    0.151

We then compare the performance of the two methods via the parametric α spectrum[10], which is expressed as

$$S_\alpha(e^{j\omega}) = \gamma|H(e^{j\omega})|^{\alpha} = \gamma[2\sin(\omega/2)]^{-2d}\cdot\left|1 + \sum_{k=1}^{P} a_k e^{-j\omega k}\right|^{-\alpha} \tag{15}$$

where the parameters ak and d can be estimated based on FLOM or FLOC. The parametric α spectrum curves of the FLOM and FLOC methods are given in Fig.4 for the characteristic exponent α = 1.5. From Fig.4 we see that the parametric α spectrum based on FLOC is closer to the true parametric α spectrum.

Fig. 4. The α spectrum estimated by FLOC, FLOM and the true α spectrum (α = 1.5)

(2) Clutter suppression performance analysis

In order to verify the performance of the proposed clutter suppression method in correlated non-Gaussian clutter backgrounds, we compare the following three methods: the proposed clutter suppression method (called SαS FAR FLOC filter), the method based on the fractional autoregressive model and FLOM (called SαS FAR FLOM filter), and the traditional method based on the Gaussian AR model and second-order moments (called Gaussian AR filter). Let the radar observation be

$$Y(n) = S(n) + X(n), \quad n = 0, 1, \cdots, N-1 \tag{16}$$

where S(n) = A exp(j2πf_d nT) is the target Doppler signal, A is the signal amplitude, f_d is the Doppler frequency, T is the pulse repetition interval, and X(n) is the clutter, which is modeled as an SαS fractional autoregressive process. Since the SαS fractional autoregressive process has no second-order moment, the Generalized signal-to-clutter ratio (GSCR)[9] is used, which is expressed as

$$\mathrm{GSCR} = 10\log\left(\frac{1}{\gamma N}\sum_{n=0}^{N-1}|S(n)|^2\right)$$

We assume the CPI length N = 64, the pulse repetition interval T = 10⁻³ s, the Doppler frequency f_d = 50 Hz and GSCR = −7 dB. Fig.5 shows the spectra of the three filters' outputs.

Fig. 5. Spectrums of filters' output (α = 1.5)

As can be seen from Fig.5, the performance of the SαS FAR FLOC filter is clearly better than that of the other two filters, and the Gaussian AR filter has the worst performance. The

main reason is that the three filters are built on the estimated clutter parameters, so the estimation accuracy determines the clutter suppression performance. Since the parameter estimation method based on FLOC is the most accurate, the SαS FAR FLOC filter outperforms the other two filters.

2. Real clutter data analysis

In this section, we use the IPIX (Intelligent PIXel processing) radar sea clutter data[15] provided by McMaster University in Canada to analyze the applicability of the SαS fractional autoregressive model. The IPIX radar's operating frequency is 9.39 GHz, the range resolution is 30 m and the pulse repetition frequency is 1000 Hz. The data file #320 is analyzed, which contains 14 range cells, each with 131072 pure clutter samples.

(1) Applicability analysis of SαS fractional autoregressive model

We first compare the amplitude probability density function of the clutter samples with the SαS and Gaussian distributions. The parameter estimation methods in Ref.[9] are adopted to estimate the characteristic exponent α and the dispersion γ of the SαS distribution; for the clutter samples used in the experiment, the parameters are estimated to be α = 1.55 and γ = 0.24. For the Gaussian distribution, the estimated variance of the clutter samples is δ = 1.85. The pdfs of the SαS and Gaussian distributions are shown in Fig.6, which shows that the real clutter samples deviate seriously from the Gaussian distribution and can be well described by the SαS distribution.

Fig. 6. IPIX clutter amplitude histogram and probability density fitting curves

We then generate SαS fractional autoregressive clutter samples and Gaussian AR clutter samples from the estimated parameters, compare their correlation with the real clutter's correlation, and analyze their degree of similarity. The three curves in Fig.7 are the FLOC curve of the IPIX radar clutter data, the FLOC curve of the SαS fractional autoregressive clutter samples and the covariance curve of the Gaussian AR clutter samples. Fig.7 shows that the correlation characteristics of the clutter samples generated from the SαS fractional autoregressive model are closer to the real clutter's correlation characteristics. So the proposed SαS fractional autoregressive model is an effective clutter model, and it is more suitable than the Gaussian AR model for describing the statistical characteristics and correlation of real clutter.

Fig. 7. FLOC curves, covariance curve and real clutter's correlation curve

(2) Performance analysis of clutter suppression for real clutter

We take the 1280 samples of the first range cell from the data file #320 to test the performance of the SαS FAR FLOC filter, the SαS FAR FLOM filter and the Gaussian AR filter. We assume the radar CPI length N = 64, the pulse repetition interval T = 10⁻³ s, the target Doppler frequency f_d = 30 Hz (i.e. the target signal lies in the clutter region), and GSCR = −12 dB. The spectra of the three filters' outputs are shown in Fig.8, and we see that the performance of the SαS FAR FLOC filter is clearly better than that of the other two filters.

Fig. 8. Spectrums of filters' output for real clutter

V. Conclusion

Traditional clutter suppression methods such as MTD and the AR model perform poorly in correlated non-Gaussian clutter backgrounds. This paper presents a clutter suppression method based on the SαS fractional autoregressive model. It first estimates the model parameters of the clutter based on FLOC, and then uses the inverse filter to eliminate the long correlation and short correlation of the clutter, so as to effectively whiten the correlated non-Gaussian clutter. The parameter estimation performance for the SαS fractional autoregressive model is demonstrated by simulations. Furthermore, the applicability of the SαS fractional autoregressive model is tested with IPIX radar sea clutter data, and the performance of the proposed clutter suppression method is demonstrated by simulation and real clutter data, respectively. The results show that the SαS fractional autoregressive model is an effective clutter model and that it is more suitable for describing the statistics and correlation characteristics of real clutter than the traditional AR model. Compared to the traditional clutter suppression method, the proposed method shows significant improvements in clutter suppression performance under a correlated non-Gaussian clutter environment. In addition, the proposed method requires only calculating the statistical properties and estimating the model parameters of the clutter, so it is convenient for engineering applications.

References

[1] M. Richards, Fundamentals of Radar Signal Processing, Publishing House of Electronic Industry, Beijing, China, 2008.
[2] J. Petitjean, R. Diversi, R. Guidorzi et al., “Recursive errors-in-variables approach for AR parameter estimation from noisy observations”, Proc. of IEEE International Conference on Acoustics Speech and Signal Processing, Taipei, China, pp.3401–3404, 2009.

[3] I. Antipov, “Statistical analysis of Northern Australian coastline sea clutter data”, Technical Report DSTO/TR/1236, DSTO Electronics and Surveillance Research Laboratory, Edinburgh, Australia, 2001.
[4] Xia Guangrong, Liu Xingzhao, “The improved moment-type methods for the parameters estimation of the Gaussian and symmetric alpha stable distributions based on empirical characteristic function”, Chinese Journal of Electronics, Vol.13, No.1, pp.45–48, 2004.
[5] M. Liao, C. Wang, Y. Wang et al., “Using SAR images to detect ships from sea clutter”, IEEE Geoscience and Remote Sensing Letters, Vol.5, No.2, pp.195–198, 2008.
[6] G. Tsihrintzis, C. Nikias, “Evaluation of fractional lower-order statistics based detection algorithms on real radar sea-clutter data”, IET Radar, Sonar and Navigation, Vol.144, No.1, pp.29–38, 1997.
[7] R. Kapoor, A. Banerjee, G. Tsihrintzis et al., “UWB radar detection of targets in foliage using alpha-stable clutter models”, IEEE Transactions on Aerospace and Electronic Systems, Vol.35, No.3, pp.819–834, 1999.
[8] S. Johansen, M. Nielsen, “Likelihood inference for a nonstationary fractional autoregressive model”, Journal of Econometrics, Vol.158, No.1, pp.51–66, 2010.
[9] C. Nikias, S. Min, Signal Processing with Alpha-stable Distributions and Applications, Wiley, New York, 1995.
[10] M. Xinyu, C. Nikias, “Parameter estimation and blind channel identification in impulsive signal environments”, IEEE Transactions on Signal Processing, Vol.43, No.12, pp.2884–2897, 1995.
[11] T. Qiu, Y. Zhu, S. Zhao et al., “The properties of FLOC and its application in evoked latency change detection”, Proc. of International Conference on Innovative Computing Information and Control, Dalian, China, pp.355–358, 2008.
[12] G. Dimitris, Statistical and Adaptive Signal Processing, Publishing House of Electronic Industry, Beijing, China, 2003.
[13] G. Yue, “Network traffic prediction with Farima processes”, Ph.D. Thesis, Dalhousie University, Canada, 2002.
[14] M. Shao, C. Nikias, “Signal processing with fractional lower order moments: stable processes and their applications”, Proceedings of the IEEE, Vol.81, No.7, pp.986–1010, 1993.
[15] S. Haykin, IPIX radar databases, 2008-11-02, http://soma.ece.mcmaster.ca/ipix/dartomouth/datasets.

FENG Xun was born in 1982. He received B.S. and M.S. degrees in electromagnetism from Wuhan Radar Institute in 2005 and 2008, respectively. He is currently pursuing the Ph.D. degree in signal processing at Wuhan Radar Institute. (Email: cliff[email protected])

WANG Shouyong was born in 1956. He received the Ph.D. degree from Huazhong University of Science and Technology in 2003. Dr. Wang holds the Wuhan Radar Institute Professorship in signal processing. His recent research interests include modern signal processing and radar signal processing.

YANG Jun was born in 1973. He received the B.S. degree and the M.S. degree from AFRA, Wuhan, China, in 1996 and 1999, respectively, and the Ph.D. degree from AEU, Xi’an, China, in 2003. He is an associate professor of AFRA. His current interests include radar signal processing and radar systems.

ZHU Xiaobo was born in Sichuan Province, China, on May 21, 1980. He received the Ph.D. degree in signal processing from Wuhan Radar Institute in 2011. He currently works at the 95022 PLA Troops and his main emphasis is the research of modern radar signal processing.

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

A Novel Soft Switching Converter with Active Auxiliary Resonant Commutation∗

CHU Enhui, HOU Xutong and ZHANG Huaguang

(College of Information Science and Engineering, Northeastern University, Shenyang 110819, China)

Abstract — In order to realize a soft switching converter with simple topology, high efficiency, high frequency, low voltage stress and easy control, a novel soft switching converter with active auxiliary resonant commutation is presented in this paper. Soft switching of the main power switch and the auxiliary power switch can be achieved by using an active auxiliary resonant network. It is very attractive for high power applications where the IGBT (Insulated gate bipolar transistor) is predominantly used as the power switch. Its operation principle is analyzed through its application to the boost converter. The novel soft switching cell can also be used in other basic DC-DC converters. A 3kW, 16kHz prototype which uses IGBTs was made. The effectiveness of the proposed converter is confirmed by the experimental results.

Key words — Active auxiliary resonant commutation, Active DC-DC converter, Soft-switching.

I. Introduction

The hard-switching PWM (Pulse-width modulation) converter is widely used in many fields such as communication and network servers due to its simple topology, easy control, constant switching frequency and good output regulating characteristics. In high voltage or high power situations, power devices suffer from a large voltage or current stress with large switching losses, and the EMI (Electro-magnetic interference) caused by voltage peaks and surge current may influence the normal operation of the converter. In order to solve these problems effectively, many soft-switching technologies[1−8] have been presented in recent years, such as resonant switching technology, zero switching technology, zero transition technology, auxiliary resonant network technology and so on. Among these technologies, zero transition technology adopts an active auxiliary resonant network[9−14] and can control the resonant process of the resonant components through an auxiliary power switch, so that it keeps the advantages of PWM chopper-fed circuits while realizing soft switching and reducing switching loss, and it has become the focus of power electronics. At present, many novel chopper-fed circuit topologies have been proposed, but there still exist some shortcomings, such as complex topology, high voltage and current stress of the active power switch, unrealizable soft-switching of the auxiliary power switch, large circulating current and so on. The above problems can be solved in Ref.[15]; however, it needs three extra auxiliary power switches and a transformer, so the topology is quite complex and hard to control, which is a disadvantage for improving conversion efficiency and realizing miniaturization.

This paper puts forward a novel active soft switching converter which uses a simple active auxiliary resonant network to achieve soft switching of the main power switch and the auxiliary power switch with low voltage and current stress. This novel converter has a simple topology and control strategy, because the novel resonant network only adds one auxiliary active switch. This paper makes a detailed analysis of the novel soft switching topology, and its feasibility is verified by experiment.

II. Circuit Description and Operation Principle

1. Circuit description

Fig.1 shows the presented novel active auxiliary resonant converter topology, in which S1, D1, Lm and Co represent the main power switch, the output rectifier diode, the input filter inductor and the output filter capacitor, respectively, and DS1 is the anti-parallel diode of the main power switch S1. The active auxiliary resonant network is composed of a resonant inductor Lr, a resonant capacitor Cr, a lossless snubber capacitor CS, an auxiliary power switch S2, and an auxiliary diode D2.

Fig. 1. Novel active auxiliary resonant converter

2. Circuit operation

The mode transitions and the working waveforms of the novel active auxiliary soft switching converter are depicted in Fig.2 and Fig.3, respectively. The gate voltage pulse sequences of the main power switch S1 and the auxiliary power switch S2 are also shown in Fig.3.

∗Manuscript Received Apr. 2010; Accepted June 2012. This work is supported by the Fundamental Research Funds for the Central Universities (No.N100404015).

In order to simplify the analysis, we assume that: (1) All devices of the circuit are in ideal condition, and the input filter inductance Lm is large enough to be replaced by a constant current source ILm. (2) The output filter capacitor Co is large enough to be replaced by a constant voltage source Vo.

Fig. 2. Mode transitions and equivalent circuits. (a) Mode 0; (b) Mode 1; (c) Mode 2; (d) Mode 3; (e) Mode 4; (f) Mode 5; (g) Mode 6; (h) Mode 7

Fig. 3. Key waveforms of converter

The operating principle in the mode transitions of this converter is explained as follows:

• Mode 0 [0, t0]: Before time t0, the stored energy of the inductor Lm is transferred to the load side. When the auxiliary power switch S2 is turned on, Mode 0 changes to Mode 1.

• Mode 1 [t0, t1]: At the instant t0, the switch S2 is turned on under ZCS with the aid of the resonant inductor Lr, and the current through the diode D1 begins to flow to the active auxiliary resonant network. The current flowing through the resonant inductor Lr, the resonant capacitor Cr and the switch S2 increases sinusoidally. The current iLr through Lr, the current iD1 through D1 and the voltage vCr across Cr are

$$i_{Lr} = \frac{V_o}{Z_1}\sin\omega_1(t - t_0) \tag{1}$$

$$i_{D1} = I_{Lm} - \frac{V_o}{Z_1}\sin\omega_1(t - t_0) \tag{2}$$

$$v_{Cr} = \left[1 - \cos\omega_1(t - t_0)\right]V_o \tag{3}$$

where $Z_1 = \sqrt{L_r/C_r}$ and $\omega_1 = 1/\sqrt{L_r C_r}$.

The current iLr and the voltage vCr at instant t1 can be given by

$$i_{Lr}(t_1) = i_{Cr}(t_1) = I_{Lm} \tag{4}$$

$$v_{Cr}(t_1) = \left[1 - \sqrt{1 - \left(\frac{Z_1}{V_o}I_{Lm}\right)^2}\,\right]V_o \tag{5}$$

The on-period t01 in Mode 1 is

$$t_{01} = \frac{1}{\omega_1}\sin^{-1}\left(\frac{Z_1}{V_o}I_{Lm}\right) \tag{6}$$

• Mode 2 [t1, t2]: At the instant t1, when the diode D1 is turned off with ZCS, the current flowing through D1 commutates through the active auxiliary resonant network. The lossless snubber capacitor Cs, connected in parallel with the main power switch S1, enters the edge-resonant mode with the resonant inductor Lr and the resonant capacitor Cr. Therefore, the lossless snubber capacitor Cs enters the discharging mode, and the voltage across Cs drops gradually. The current iLr through Lr, the voltage vCr across Cr and the voltage vCs across Cs are

$$i_{Lr} = \frac{V_1}{Z_2}\sin\omega_2(t - t_1) + I_1\cos\omega_2(t - t_1) - I_1 + i_{Lr}(t_1) \tag{7}$$

$$v_{Cr} = \frac{C}{C_r}\left[V_1 - V_1\cos\omega_2(t - t_1) + I_1 Z_2\sin\omega_2(t - t_1)\right] + \frac{I_{Lm}}{C_r + C_s}(t - t_1) + v_{Cr}(t_1) \tag{8}$$

$$v_{Cs} = \frac{C}{C_s}\left[V_1\cos\omega_2(t - t_1) - I_1 Z_2\sin\omega_2(t - t_1) - V_1\right] + \frac{I_{Lm}}{C_r + C_s}(t - t_1) + V_o \tag{9}$$

where $I_1 = i_{Lr}(t_1) - I_{Lm}$, $V_1 = V_o - v_{Cr}(t_1)$, $C = \dfrac{C_r C_s}{C_r + C_s}$, $Z_2 = \sqrt{\dfrac{L_r(C_r + C_s)}{C_r C_s}}$ and $\omega_2 = \sqrt{\dfrac{C_r + C_s}{L_r C_r C_s}}$.
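As a rough numerical check of Modes 1 and 2, the sketch below evaluates equations (5), (6) and (9) with the prototype component values quoted in the experimental section (Lr = 7.6 μH, Cr = 121 nF, Cs = 33 nF, Vo = 380 V). The inductor current ILm = 15 A is an assumption (full load 3 kW divided by the 200 V input, losses ignored), the function names are illustrative only, and the Mode 2 duration is found by plain bisection on equation (9) as printed, with I1 = 0 as implied by equation (4).

```python
import math

# Component values quoted in the paper's experimental section.
L_R = 7.6e-6   # resonant inductor Lr [H]
C_R = 121e-9   # resonant capacitor Cr [F]
C_S = 33e-9    # lossless snubber capacitor Cs [F]
V_O = 380.0    # output voltage Vo [V]
# Assumption (not from the paper): ILm ~ Po / Vs = 3 kW / 200 V, lossless.
I_LM = 3000.0 / 200.0


def mode1(l_r=L_R, c_r=C_R, v_o=V_O, i_lm=I_LM):
    """Return (Z1, w1, t01, vCr(t1)) following equations (1)-(6)."""
    z1 = math.sqrt(l_r / c_r)
    w1 = 1.0 / math.sqrt(l_r * c_r)
    ratio = z1 * i_lm / v_o
    if ratio > 1.0:
        raise ValueError("i_Lr cannot reach I_Lm: Z1 * I_Lm exceeds Vo")
    t01 = math.asin(ratio) / w1                          # equation (6)
    v_cr_t1 = (1.0 - math.sqrt(1.0 - ratio ** 2)) * v_o  # equation (5)
    return z1, w1, t01, v_cr_t1


def v_cs_mode2(tau, l_r=L_R, c_r=C_R, c_s=C_S, v_o=V_O, i_lm=I_LM):
    """Equation (9) with tau = t - t1 and I1 = 0 (from equation (4))."""
    _, _, _, v_cr_t1 = mode1(l_r, c_r, v_o, i_lm)
    v1 = v_o - v_cr_t1
    c = c_r * c_s / (c_r + c_s)
    w2 = math.sqrt((c_r + c_s) / (l_r * c_r * c_s))
    return (c / c_s) * (v1 * math.cos(w2 * tau) - v1) \
        + i_lm / (c_r + c_s) * tau + v_o


def mode2_duration(l_r=L_R, c_r=C_R, c_s=C_S, v_o=V_O, i_lm=I_LM):
    """Bisection for t2 - t1, the instant where v_Cs first reaches zero."""
    w2 = math.sqrt((c_r + c_s) / (l_r * c_r * c_s))
    lo, hi = 0.0, math.pi / w2          # search within half a resonant period
    if v_cs_mode2(hi, l_r, c_r, c_s, v_o, i_lm) > 0.0:
        raise ValueError("v_Cs does not reach zero: ZVS condition not met")
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if v_cs_mode2(mid, l_r, c_r, c_s, v_o, i_lm) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    z1, w1, t01, v_cr_t1 = mode1()
    print(f"Z1 = {z1:.3f} ohm, t01 = {t01 * 1e6:.3f} us, vCr(t1) = {v_cr_t1:.2f} V")
    print(f"t2 - t1 = {mode2_duration() * 1e6:.3f} us")
```

Under these assumptions both intervals come out well below the 62.5 μs switching period at 16 kHz, which is what one would expect of a commutation interval; the numbers are only as good as the assumed ILm.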

The current iLr and the voltage vCr at instant t2 can be given by

$$i_{Lr}(t_2) = \sqrt{\frac{L_r i_{Lr\mathrm{peak}}^2 - C_r V_2^2}{L_r}} \tag{10}$$

$$i_{Lr\mathrm{peak}} = \frac{\sqrt{V_1^2 + (I_1 Z_2)^2}}{Z_2} + I_{Lm} \tag{11}$$

where $V_2 = v_{Cr}(t_2) = \dfrac{I_{Lm}}{C_r}(t_2 - t_1) + \dfrac{C_s}{C_r}V_o + v_{Cr}(t_1)$.

• Mode 3 [t2, t3]: When the voltage across the snubber capacitor Cs becomes zero, the anti-parallel diode DS1 of the main power switch S1 is naturally turned on. As a result, the main power switch S1 can achieve ZVS (Zero-voltage switching) and ZCS (Zero-current switching) hybrid soft commutation in the turn-on transition: the gate voltage signal of S1 is applied while DS1 is conducting, and the current through the anti-parallel diode DS1 decreases and naturally shifts to the main power switch S1. The current iLr through Lr and the voltage vCr across Cr are

$$i_{Lr} = I_2\cos\omega_1(t - t_2) - \frac{V_2}{Z_1}\sin\omega_1(t - t_2) \tag{12}$$

$$v_{Cr} = I_2 Z_1\sin\omega_1(t - t_2) + V_2\cos\omega_1(t - t_2) \tag{13}$$

where $I_2 = i_{Lr}(t_2)$ and $V_2 = v_{Cr}(t_2)$.

The voltage vCr at instant t3 can be given by $v_{Cr}(t_3) = \sqrt{Z_1^2 I_2^2 + V_2^2}$.

The on-period t23 in Mode 3 is

$$t_{23} = \frac{1}{\omega_1}\tan^{-1}\left(\frac{Z_1 I_2}{V_2}\right) \tag{14}$$

• Mode 4 [t3, t4]: When the current of the main power switch S1 becomes larger than the current flowing through the boost inductor Lm, the diode DS2 in anti-parallel with the auxiliary power switch S2 is naturally turned on, and the current flowing through S2 begins to commutate to the anti-parallel diode DS2. By removing the gate voltage pulse signal delivered to the auxiliary power switch S2 during this period, S2 can achieve complete ZVS and ZCS hybrid soft commutation in the turn-off transition, once the current flowing through S2 has shifted completely to DS2.

The voltage vCr at instant t4 can be given by

$$v_{Cr}(t_4) = -v_{Cr}(t_3) = -\sqrt{Z_1^2 I_2^2 + V_2^2} \tag{15}$$

The on-period t34 in Mode 4 is

$$t_{34} = \frac{\pi}{\omega_1} \tag{16}$$

• Mode 5 [t4, t5]: When the auxiliary power switch S2 is turned off, the resonant current flowing through the inductor Lr and the capacitor Cr becomes zero, and all the circuit operations are identical to the conduction state of the conventional Boost converter.

The voltage vCr at instant t5 can be given by

$$v_{Cr}(t_5) = v_{Cr}(t_4) = -\sqrt{Z_1^2 I_2^2 + V_2^2} \tag{17}$$

• Mode 6 [t5, t6]: When the main power switch S1 is turned off with ZVS, the current flowing through the boost inductor Lm flows to the snubber capacitor Cs. Therefore, the lossless snubber capacitor Cs enters the charging mode and the voltage across Cs increases gradually. The voltage vCs across Cs is

$$v_{Cs} = \frac{I_{Lm}}{C_s}(t - t_5) \tag{18}$$

The voltage vCr at instant t6 can be given by

$$v_{Cr}(t_6) = v_{Cr}(t_5) = -\sqrt{Z_1^2 I_2^2 + V_2^2} \tag{19}$$

The on-period t56 in Mode 6 is

$$t_{56} = \frac{C_s}{I_{Lm}}\left(V_o + v_{Cr}(t_5)\right) \tag{20}$$

• Mode 7 [t6, t7]: When the voltage across the lossless snubber capacitor Cs becomes larger than the sum of the voltage across the resonant capacitor Cr and the output voltage Vo, the auxiliary diode D2 is naturally turned on. When the voltage across the lossless snubber capacitor Cs is equal to the average output voltage Vo and the voltage across the auxiliary resonant capacitor Cr becomes zero, the diode D2 is naturally turned off. At the same time, the diode D1 is turned on and Mode 7 shifts to Mode 0. The voltage vCr across Cr and the voltage vCs across Cs are

$$v_{Cr} = \frac{I_{Lm}}{C_r + C_s}(t - t_6) + v_{Cr}(t_6) \tag{21}$$

$$v_{Cs} = \frac{I_{Lm}}{C_r + C_s}(t - t_6) + V_o + v_{Cr}(t_6) \tag{22}$$

The on-period t67 in Mode 7 is

$$t_{67} = -\frac{v_{Cr}(t_6)}{I_{Lm}}(C_r + C_s) \tag{23}$$

This active auxiliary resonant converter repeats cyclically the steady-state operation described above.

III. Experimental Results and Performance Evaluations

1. Design specifications and operating waveforms

Based on the circuit topology and analyses above, a 3kW, 16kHz prototype based on IGBTs has been built. The input voltage is VS = 200V, the output voltage is Vo = 380V and the output power range is Po = 1kW–3kW. The main power switch S1 and the auxiliary power switch S2 use the Mitsubishi CM75DU-24H; the output rectifier diode D1 uses the high-efficiency, high-speed Toshiba 30JL2C41; D2 uses the high dielectric strength, high-speed soft-recovery Hitachi DFM30F12. The input filter inductor is Lm = 1.024mH, the resonant inductor Lr = 7.6μH, the resonant capacitor Cr = 121 nF, the resonant snubber capacitor CS = 33 nF and the smoothing output capacitor Co = 8200μF.

Fig.4 illustrates the voltage and current switching waveforms and the v − i trajectory of the main power switch S1. It can be seen that there is no voltage or current peak in the main switch S1, and the low dv/dt and di/dt reduce the voltage and current stress of the switch. In addition, the v − i traces of S1 show that ZVS and ZCS turn-on and ZVS turn-off of S1 are achieved. Fig.5 illustrates the voltage and current switching waveforms and the v − i trajectory of the auxiliary power switch S2. It can be seen that ZVS and ZCS turn-off and ZCS turn-on of S2 are achieved. These experimental results verify the previous theoretical analysis.
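To get a feel for how long the remaining commutation intervals last with these component values, the sketch below evaluates the closed-form durations in equations (14), (16), (20) and (23). The helper name `mode_durations` and the default inputs I2 = 17 A, V2 = 250 V and ILm = 15 A are illustrative assumptions chosen for a full-load operating point; they are not values reported in the paper.

```python
import math


def mode_durations(i2=17.0, v2=250.0, i_lm=15.0, v_o=380.0,
                   l_r=7.6e-6, c_r=121e-9, c_s=33e-9):
    """Durations of Modes 3, 4, 6 and 7 from equations (14), (16), (20), (23).

    i2 and v2 are i_Lr(t2) and v_Cr(t2); the defaults are placeholders.
    """
    z1 = math.sqrt(l_r / c_r)
    w1 = 1.0 / math.sqrt(l_r * c_r)
    v_cr_t3 = math.hypot(z1 * i2, v2)        # peak of v_Cr at the end of Mode 3
    v_cr_t5 = -v_cr_t3                       # equations (15) and (17)
    if v_o + v_cr_t5 <= 0.0:
        raise ValueError("|v_Cr| exceeds Vo: equation (20) gives no valid t56")
    return {
        "v_cr_t3": v_cr_t3,
        "t23": math.atan2(z1 * i2, v2) / w1,     # equation (14)
        "t34": math.pi / w1,                     # equation (16)
        "t56": c_s / i_lm * (v_o + v_cr_t5),     # equation (20)
        "t67": -v_cr_t5 * (c_r + c_s) / i_lm,    # equation (23)
    }


if __name__ == "__main__":
    for name, value in mode_durations().items():
        unit, scale = ("V", 1.0) if name == "v_cr_t3" else ("us", 1e6)
        print(f"{name} = {value * scale:.3f} {unit}")
```

Equation (20) only yields a positive interval while the magnitude of vCr stays below Vo, so the sketch raises an error instead of returning a negative time.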

Fig. 4. Voltage and current waveforms and v − i trajectory of main switch S1. (a) Turn-on waveforms; (b) Turn-on v − i trajectory; (c) Turn-off waveforms; (d) Turn-off v − i trajectory

Fig. 5. Voltage and current waveforms and v − i trajectory of auxiliary switch S2. (a) Waveforms; (b) v − i trajectory

From the voltage and current waveforms in Fig.5, it can also be seen that the voltage across the auxiliary power switch S2 shows a parasitic oscillation at the ZCS turn-off of S2. In order to suppress this parasitic oscillation, an extra clamping diode Dc is added to the original circuit in Fig.1, and the new active auxiliary resonant converter topology with clamping diode Dc is shown in Fig.6. The clamping diode Dc is naturally turned on as soon as the voltage of the auxiliary power switch S2 exceeds the output voltage of 380V, so the parasitic peak voltage can be suppressed effectively. The voltage and current waveforms and the v − i trajectory of the auxiliary power switch S2 with the clamping diode added are shown in Fig.7. As shown in Fig.7, the large oscillation seen in Fig.5 disappears, and the peak voltage is effectively suppressed. Therefore, the overvoltage across the auxiliary power switch S2 can be reduced.

Fig. 6. Novel active auxiliary resonant converter with clamping diode DC

Fig. 7. Voltage and current waveforms and v − i trajectory of auxiliary switch S2 with clamping diode. (a) Waveforms; (b) v − i trajectory

2. Efficiency evaluation

The actual efficiency versus output power Po of the prototype with clamping diode, without clamping diode and with hard switching (with RC snubber circuit) is shown in Fig.8. It can be seen that the actual efficiency of the proposed novel soft-switching converter, especially the converter with clamping diode Dc, is higher than that of hard switching over the required output power range. In particular, for the 3kW breadboard setup, the actual power conversion efficiency of the soft-switching PWM scheme reaches 97.8%. Moreover, for high frequency switching, this power circuit can achieve higher efficiency characteristics.

Fig. 8. Curves of efficiency

3. EMI characteristic

Fig.9(a) and Fig.9(b) illustrate the measured EMI characteristics of the soft-switching converter and the hard-switching converter (RC snubber in the drain of S1) under the horizontal antenna condition and the vertical antenna condition, respectively. It can be seen that the EMI of the soft-switching converter is much smaller than that of the hard-switching converter over the whole frequency range of 30MHz–1GHz. Additionally, the EMI is reduced by as much as 42.7 dBμV/m (at 34.05MHz) and 37.1 dBμV/m (at 230MHz) under the horizontal antenna condition and the vertical antenna condition, respectively. It is more effective to use an active soft-switching converter to suppress the radiated emission.

Fig. 9. Noise measurement of radiated EMI. (a) Under the horizontal antenna condition; (b) Under the vertical antenna condition
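The reported figures can be put into more familiar terms with two lines of arithmetic. The sketch below, with illustrative helper names, converts the 97.8% efficiency at 3 kW into dissipated power, and converts the quoted EMI reductions into linear field-strength ratios. It assumes the efficiency is defined as output power over input power and that the quoted reductions are differences in field strength level, so the 20·log10 convention applies.

```python
def loss_watts(p_out_w, efficiency):
    """Power dissipated in the converter, assuming efficiency = Pout / Pin."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return p_out_w / efficiency - p_out_w


def field_ratio(delta_db):
    """Linear field-strength ratio for a level difference in dB (20*log10 rule)."""
    return 10.0 ** (delta_db / 20.0)


if __name__ == "__main__":
    print(f"loss at 3 kW, 97.8%: {loss_watts(3000.0, 0.978):.1f} W")
    print(f"42.7 dB reduction: factor {field_ratio(42.7):.1f}")
    print(f"37.1 dB reduction: factor {field_ratio(37.1):.1f}")
```

Read this way, a 42.7 dB reduction corresponds to a field strength more than a hundred times lower at that frequency.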

IV. Conclusions

In this paper, a novel soft-switching converter with an active auxiliary resonant network in the load side is presented. The operation principle of the converter has been analyzed in detail, and the parameters of the resonant network have been presented. From the theoretical analysis and experiments using the 3kW, 16kHz prototype, the following conclusions have been reached:

(1) The soft-switching of the power switches can be realized by using the simple active auxiliary resonant network, which can eliminate the overlap of voltage and current and reduce the switching loss.
(2) Low di/dt and dv/dt can lower the voltage and current stress of the switches, reduce the EMI problems caused by the hard-switching PWM converter and solve the reverse-recovery problem of the output rectifier diode.
(3) ZCS and ZVS can be ensured over a wide load range.
(4) An actual efficiency as high as 97.8% can be achieved with the 3kW prototype.

The circuit proposed in this paper is suitable for large and medium power soft-switching converters.

References

[1] E.H. Chu, L. Gamage, M. Ishitobi, E. Hiraki, M. Nakaoka, “Improved transient and steady-state performance of series resonant ZCS high-frequency inverter-coupled voltage multiplier converter with dual mode PFM control scheme”, Electrical Engineering in Japan, Vol.149, No.4, pp.60–72, 2004.
[2] H.G. Zhang, Q. Wang, E.H. Chu, X.C. Liu, L.M. Hou, “Analysis and implementation of a passive lossless soft-switching snubber for PWM inverters”, IEEE Transactions on Power Electronics, Vol.26, No.2, pp.411–426, 2011.
[3] P. Sang-Hoon, P. Sori, Y. Jaesung, J. Yongchae et al., “Analysis and design of a soft-switching boost converter with an HI-bridge auxiliary resonant circuit”, IEEE Transactions on Power Electronics, Vol.25, No.8, pp.2142–2149, 2010.
[4] E.H. Chu, H.G. Zhang, X.C. Liu, M.Y. Zhai, “Novel interleaving double switch forward soft switching converter”, Proceedings of the CSEE, Vol.29, No.33, pp.22–27, 2009. (in Chinese)
[5] B. Chang, C.L. Wang, Y. He, “Research on soft switching boost converter”, Proc. of 2011 Second International Conference on Digital Manufacturing and Automation, Zhangjiajie, Hunan, China, pp.1015–1018, 2011.
[6] E.H. Chu, S. Jin, H.G. Zhang, “A novel passive soft switching converter”, Acta Electronica Sinica, Vol.38, No.8, pp.1963–1968, 2010. (in Chinese)
[7] T. Mishima, Y. Takeuchi, M. Nakaoka, “A new high step-up voltage ratio soft switching PWM boost dc-dc power converter with edge resonant switched capacitor modular”, Proc. of the 2011 14th European Conference on Power Electronics and Applications, Birmingham, Britain, pp.1–10, 2011.
[8] E.H. Chu, W.Y. Hu, J.X. Gong, R. Hou, M. Nakaoka, “A novel high frequency ZVS-PWM boost dc-dc converter with auxiliary resonant snubber”, Proc. of 2010 International Conference on Communications, Circuits and Systems, Chengdu, China, pp.576–580, 2010.
[9] T. Mishima, M. Nakaoka, “A new family of soft switching PWM non-isolated dc-dc converter with active auxiliary edge-resonant cell”, Proc. of 2010 International Power Electronics Conference, Taormina, Italy, pp.2804–2809, 2010.
[10] N. Lakshminarasamma et al., “Steady-state stability of current-mode active-clamp ZVS dcdc converters”, IEEE Transactions on Power Electronics, Vol.26, No.5, pp.1295–1304, 2011.
[11] B.R. Lin, J.J. Chen, Y.S. Huang, “Soft switching active-clamped dual series-resonant converter”, Power Electronics, IET, Vol.3, No.5, pp.764–773, 2010.
[12] Y.C. Chen, Y. Gao, “Research on active clamped ZVS-SEPIC converter”, 2010 2nd International Conference on Industrial and Information Systems, Dalian, China, pp.310–314, 2010.
[13] S. Urgun, “Zero-voltage transition-zero-current transition PWM dc-dc buck converter with zero-voltage switching zero-current switching auxiliary circuit”, Power Electronics, IET, Vol.5, No.5, pp.627–634, 2012.
[14] E.M. Miranda-Teran, R.P. Torrico-Bascope, “An active clamping modified push-pull dc-dc converter”, Power Electronics Conference, Praiamar, Brazil, pp.384–389, 2011.
[15] G.Q. Lin, “A Novel Zero-voltage and Zero-current Transition DC-DC converters”, Proceedings of the CSEE, Vol.27, No.22, pp.106–109, 2007.

CHU Enhui was born in 1965. He received the M.S. degree in automation from Northeastern University, Shenyang, China, in 1993, and the Ph.D. degree from Yamaguchi University, Yamaguchi, Japan, in 2003. He is currently an assistant professor and supervisor for M.S. students in the College of Information Science and Engineering, Northeastern University. His main research interests include power electronics and its application, high-frequency soft switching power conversion systems and their control. (Email: [email protected])

HOU Xutong was born in 1988. He received the B.S. degree in electronic information science and technology from Shenyang University of Chemical Technology, Shenyang, China, in 2011. He is currently an M.S. graduate student in the College of Information Science and Engineering, Northeastern University. His research interests focus on three-level soft switching converters.

ZHANG Huaguang was born in 1959. He is currently a professor in the College of Information Science and Engineering, Northeastern University. His current research interests include fuzzy control, chaos control, neural networks-based control, nonlinear control, signal processing, and their industrial applications.

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

Monte-Carlo Simulations on the Noise Characteristics of the Ion Barrier Film of Microchannel Plate∗

SHI Feng1, FU Shencheng2, LI Ye2, DUANMU Qingduo2 and TIAN Jingquan2

(1.Science and Technology on Low-Light-Level Night Vision Laboratory, Xi’an 710065, China) (2.School of Science, Changchun University of Science and Technology, Changchun 130022, China)

Abstract — In the third-generation Low-light-level (LLL) image tube, an ultra-thin Ion barrier film (IBF) is often coated on the input surface of the Microchannel plate (MCP) to protect the photocathode and prolong the operating life of the image tube. In order to further investigate the properties of IBFs, a physical model for the interaction of low-energy electrons with solids is described. A Monte-Carlo simulation of the noise characteristics of the Al2O3 IBF was conducted. The trajectories and spatial distribution of the transmitted electrons were simulated with Matlab software. Besides, the factors influencing the transmittance characteristic of electrons in the Al2O3 IBF were studied. Finally, the noise factor of IBFs was calculated and discussed, based on a method of Time domain division. This work provides theoretical support for fabricating high performance LLL devices.

Key words — Microchannel plate, Ion barrier film, Monte-Carlo simulation, Electron transmittance, Noise factor.

I. Introduction

The Microchannel plate (MCP) with Ion barrier films (IBFs) is one of the key components in Generation III Low-light-level (LLL) image intensifier tubes. During operation, photoelectrons generated by the photoelectric cathode are incident into the channels. After many multiplications, an electron cloud with high density occurs at the output face of the MCP. Remnant gas molecules inevitably exist in the image tube because of the limitation of vacuum technology and the adsorption of gas molecules. The molecules are ionized by collisions with the dense secondary electrons at the output face of the MCP. The generated positive ions travel back along the channels, impinging against the channel walls, and bombard the photoelectric cathode of the image tube, resulting in ion spots on the fluorescent screen during electron emission. The linear operational characteristic is destroyed and the operational life is reduced, which is the harm of the ion feedback[1]. The effective solution to eliminate or reduce the ion feedback is commonly to coat the input face of the MCP with a thin dielectric film, as shown in Fig.1.

Fig. 1. Structures of microchannel plate coated with ion barrier films and single-channel electron multipliers

It was reported that films of Al2O3 or SiO2 with a typical thickness of 3-4 nm are the ideal materials for preventing ion feedback[2−5]. The dead-voltage, defined as the threshold voltage corresponding to the energy lost by electrons in transmitting the film, is 150∼200V in operation. The electrons, with their small mass and certain energy, will transmit through the IBF freely, while the positive ions and gas molecules will be stopped effectively by the IBF. The IBF also prevents ions generated in the MCP during operation from migrating back to the cathode and damaging the delicate Cs/O activation layer, and dramatically improves the operational life. It was reported that the operational life of the MCP image tube with the IBF has exceeded 10000h. However, the IBF scatters the incident electrons, resulting in randomness in the spatial distribution of the transmitted electrons. Thus the stability of the spatial distribution of the transmitted electrons in the time domain is directly influenced, resulting in the noise of the IBF. In this paper, a physical model for the interaction of low-energy electrons with IBFs is described. The trajectories and spatial distribution of the transmitted electrons were simulated with Matlab software. Finally, the noise factor of IBFs was calculated and discussed, based on a method of Time domain division. This work provides theoretical support for fabricating high performance LLL devices.

∗Manuscript Received Oct. 2011; Accepted Apr. 2012. This work is supported by the National Natural Science Foundation of China (No.61077024, No.61007006).

II. Monte-Carlo Simulation on Low-energy Electron Transmission in IBFs

The interaction of low-energy electrons with the IBF can be divided into two types: elastic scattering and non-elastic scattering. Elastic scattering changes the direction of incident electrons and non-elastic scattering reduces the energy of incident electrons. This means IBFs will change the spatial distribution and the energy of incident electrons. The classic Rutherford elastic scattering model is only suitable for high energy electrons interacting with materials of low atomic number. So the Browning total elastic scattering cross section σ, associated with low-energy electrons, is suitable for this study. Commonly, each electron scattering process is determined by four variables: initial scattering energy (En), scattering angle (θn), scattering azimuth (Φn) and scattering step-length (Λn)[6−8]. θn and Λn are both determined by σ. The energy loss induced by non-elastic scattering can be expressed as

$$\frac{dE}{ds} = 7.85\times10^{4}\,\rho\,\frac{CZ}{EA}\ln\left(1.166\,\frac{E + kJ}{J}\right)\ (\mathrm{keV/cm}) \tag{1}$$

where ρ is the density of the elements, C is the atomic concentration, Z is atomic number, A is atomic weight, J is average ionization energy. The (n+1)th scattering energy is described as

$$E_{n+1} = E_n - \frac{dE}{ds}\cdot\Lambda_n \tag{2}$$

Fig. 2. Trajectory of incident electrons in the Al2O3 IBF. The initial energy of incident electrons is 300eV. Coordinates unit is nm

The transmission of incident electrons in IBFs follows a certain rule while the specific electron trajectory cannot be predicted. However, statistical methods, such as the Monte Carlo simulation method, may be adopted to analyze the movement rules of a large number of electrons. If a random number R is introduced into En, θn and Φn, the trajectory and energy distribution in the time domain will be obtained.

The simulation program was written in the Matlab environment. The basic conditions and parameters for this simulation are as follows: (1) The distribution of the atoms in the IBF is random; (2) The density of the Al2O3 thin film is 1.90 g/cm3; (3) The number of incident electrons is 100; (4) The thickness of the film is 5nm. Fig.2 shows the trajectories of incident electrons in the Al2O3 IBF. After transmitting through the Al2O3 IBF, the electron dispersion radius increases from ∼4 nm to ∼6 nm, which directly reduces the clarity of the input optoelectronic signal. The energy of the incident electrons versus projection distance in the Z direction was also simulated, as shown in Fig.3. The energy of the incident electron is reduced from 800 eV to 660 eV after transmitting through the Al2O3 IBF with a thickness of 5 nm.

Fig. 3. Energy of the incident electrons versus projection distance in Z direction

III. Influence Factors for Electron Transmittance in the Al2O3 IBF

According to the analysis above, the initial energy of the incident electrons and the thickness of the IBF are the key factors for the transmittance characteristic of electrons. Fig.4 shows the relationship between the Al2O3 IBF electron transmittance and the incident energy. The dead-voltage is 235V for the Al2O3 IBF of 5nm. The number of electrons that can penetrate through the film increases rapidly when the incident energy is higher than the dead-voltage. For example, if the incident energy is 800 eV, the electron transmittance reaches 87.16%. Fig.5 shows the relationship between the electron transmittance and the thickness of the Al2O3 IBF when the incident electron energy is 500eV. At a thickness of 15nm, the electron transmittance tends to zero.

Fig. 4. The relationship between the Al2O3 IBF electron transmittance and the incident energy

Fig. 5. The relationship between the electron transmittance and the thickness of the Al2O3 IBF

IV. Monte-Carlo Simulation on Noise Characteristics of IBFs

Time domain division It is assumed that the spatial distribution of the incident electrons in the x − y plane obeys the Gaussian distribution G(0). After IBF scattering a new Gaussian distribution G(δ) occurs, where δ is the statistical variance of the electron distribution. Here, the time domain is divided into infinite time intervals Δt. The value of Δt is set as the resolution limit interval of human eyes. In the time interval Δt, the number of incident electrons in a given range, Nin(Δt), can be calculated. Accordingly, the number of output electrons in the given range, Nout(Δt), may also be calculated. Because of the random interaction of incident electrons with IBFs, the values of Nin(Δt) and Nout(Δt) both vary in each calculation. Thus the transmittance fluctuations of the input and output electrons in the time domain will be obtained. Usually, the Noise factor (NF) is chosen to describe the noise characteristics of IBFs. It can be expressed as

$$NF = P_{in}/P_{out} \tag{3}$$

where Pin and Pout are the Signal to noise ratios (SNRs) for the input and the output electrons, respectively. SNR can be calculated as the ratio of the statistical average of the electron number to the statistical variance of the electron number.

Fig. 6. Transmittance of the input and output electrons versus time

Fig. 7. Noise factor of the Al2O3 IBF versus incident electron energy and IBF thickness

Based on the method of Time domain division, the noise characteristic of the Al2O3 IBF was simulated. Under the condition of an incident energy of 800 eV and an IBF thickness of 5nm, the transmittance of the input and output electrons versus time is obtained, as shown in Fig.6. Besides, the influence factors of NF were also studied. Fig.7 shows the NF of the Al2O3 IBF as a function of incident electron energy and IBF thickness. The NF decreases with the increase of incident

[2] Y. Li, D.L. Jiang, R. Xiang et al., “Technology Investigation on SiO2 film for Ion-preventive feedback of microchannel plate”, Acta Electronica Sinica, Vol.36, No.12, pp.2400–2404, 2008. (in Chinese)
[3] W.S. Timothy and P.E. Joseph, “An analysis of electron scattering in thin dielectric films used as ion barriers in generation III image tubes”, SPIE, Vol.4796, pp.23–32, 2003.
[4] D.L. Jiang, R. Xiang, K. Wu et al., “Preparation and characteristics of SiO2 ion-preventive feedback thin film”, Chinese Journal of Luminescence, Vol.29, No.6, pp.1096–1100, 2008. (in Chinese)
[5] J.S. Pan, J.W. Lv et al., “Ion feedback suppression for microchannel plate applied to third generation image intensifiers”, Chinese Journal of Electronics, Vol.19, No.4, pp.757–762, 2010.
[6] X.R. Jiang, Micro-manufacture Technology, Publishing House of Electronics Industry, Beijing, pp.161–166, 1999. (in Chinese)
[7] D.L. Jiang, Q.F. Liu, Y. Li et al., “MCP ion barrier film and its stopping function on incident ions”, Chinese Journal of Luminescence, Vol.27, No.6, pp.1015–1020, 2006. (in Chinese)
[8] U. Littmark, J.F. Ziegler, “Ranges of energetic ions in matter”, Physical Review A, Vol.23, No.1, pp.64–72, 1981.

SHI Feng was born in Shanxi Province in 1968. He is currently working toward the Ph.D. degree in Nanjing University of Science and Technology. He is the director and a member of the academic committee of the Science and Technology on Low-Light-Level Night Vision Laboratory. His current research interests include photoelectric imaging materials, devices and systems.

FU Shencheng (corresponding author) was born in Jilin Province in 1979. He received the Ph.D. degree from Changchun University of Science and Technology in 2010. He is a supervisor of the M.D. candidates in Changchun University of Science and Technology. His current research interests include holographic storage and photoelectric imaging materials. (Email: [email protected])

LI Ye was born in Jilin Province in 1969. He received the Ph.D.
degree from Changchun University of Science and Tech- electron energy and the decrease of IBF thickness. nology in 2011. He is a professor and super- visor of the Ph.D. candidates in Changchun V. Conclusion University of Science and Technology. His current research interests include IC design A physical model for the interaction of low-energy electrons and photoelectric imaging technology. with IBFs was described. Trajectory and spatial distribution of the transmission electrons were simulated with Matlab soft- ware. After transmitting IBF, the electron dispersion radius DUANMU Qingduo was born in increases accompanied with the loss of electron energy. Elec- Jilin Province in 1956. He received the tron transmittance increases with the increase of the incident Ph.D. degree from Changchun University of Science and Technology in 2003. He is a electron energy and the decrease of IBF thickness. Noise fac- professor and supervisor of the Ph.D. can- tor of IBF based on Time domain division was also calculated. didates in Changchun University of Science The results indicated that the lower noise factor of IBFs can and Technology. His current research in- be obtained under the condition of higher incident electron terests include Si-MCP and photoelectric energy and thinner IBFs. imaging technology.

TIAN Jingquan was born in Jilin Province in 1938. He is References a professor and supervisor of the Ph.D. candidates in Changchun [1] M.J. Iosue, “Night vision device and method”, U.S. Patents, University of Science and Technology. His current research interests US6198090B1, 2001. include photoelectric imaging materials, devices and systems. Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

ANN Synthesis Models for Asymmetric Coplanar Waveguides with Finite Dielectric Thickness∗

WANG Zhongbao and FANG Shaojun

(School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China)

Abstract: Novel and accurate Computer-aided design (CAD) models based on Artificial neural networks (ANNs) are proposed for the synthesis of Asymmetric coplanar waveguides (ACPWs) with finite dielectric thickness. First, the ACPWs are analyzed by using the Conformal mapping technique (CMT) to obtain the training data sets. Then, six training algorithms are used to train the ANNs in order to find a proper training algorithm. High-precision models are obtained by using the Levenberg-Marquardt (LM) training algorithm. The models can also be used for symmetric coplanar waveguides. Finally, the models are validated by comparison with the CMT analysis, HFSS electromagnetic simulation, and experimental results available in the literature. The proposed CAD models are extremely useful to microwave engineers for accurately calculating the physical dimensions of ACPWs with finite dielectric thickness.

Key words: Synthesis models, Artificial neural networks, Conformal mapping, Asymmetric coplanar waveguides (ACPWs) with a finite dielectric thickness, Symmetric and Asymmetric coplanar waveguides (Asymmetric CPWs).

I. Introduction

Coplanar waveguides (CPWs) are widely adopted in many practical RF circuits and antennas[1,2]. CPWs offer several advantages over microstrip lines in designing and manufacturing Monolithic microwave integrated circuits (MMICs). These advantages include high flexibility in the control of characteristic impedance, easy connection to shunt lumped elements, and low dispersion. Asymmetric coplanar waveguides (ACPWs)[3] provide additional degrees of freedom to control the line characteristic and optimize the circuit layout.

Recently, various ACPWs have been analyzed by using full-wave methods[4,5] or quasi-static methods[6−11]. The characteristic parameters of ACPWs with finite dielectric thickness have been obtained by using full-wave analysis methods such as Finite-difference time-domain (FDTD)[4] and Multi-resolution time-domain (MRTD)[5]. Full-wave analysis methods are the most accurate tools for obtaining transmission-line characteristics, but they are mathematically complex and require tremendous, time-consuming computational effort. For the quasi-static analysis, the Conformal mapping technique (CMT) is a powerful tool[6−9]. In 1981, the ACPW with infinite substrate thickness was first analyzed by V.F. Hanna and D. Thebault[6]. In their study, two successive conformal transformations were adopted. As a result, the analysis formulas are very complicated. In 1995, using the Schwarz-Christoffel transformation, simple analytical formulas for quasi-static parameters of asymmetric coplanar lines were obtained[7]. In 1999, the effective permittivity, characteristic impedance, and electric-field strength of Conductor-backed ACPWs (CBACPWs) were reported[8]. In 2002, shielded multilayered ACPWs were analyzed by using the CMT[9]. Based on the data sets obtained from the quasi-static analysis results, the analysis models of ACPWs with finite dielectric thickness and CBACPWs were obtained by using Artificial neural networks (ANNs)[10] and an Adaptive-network-based fuzzy inference system (ANFIS)[11].

It is noted that, so far, most of the Computer-aided design (CAD) models are analysis models that are used to obtain the characteristic parameters of various ACPWs, whereas synthesis CAD models, which directly give the physical dimensions of ACPWs for the required design specifications, are very scarce. In 2011, synthesis CAD models for CBACPWs were reported[12], but those models cannot be used for the synthesis of conductor-backed symmetric CPWs. Furthermore, to the best of our knowledge, there is no synthesis CAD model for ACPWs with finite dielectric thickness.

In this paper, novel and accurate CAD models based on ANNs are proposed for the synthesis of ACPWs with finite dielectric thickness. To obtain the training data sets, the CMT is first used to analyze the ACPWs with finite dielectric thickness. Based on the CMT analysis results, six training algorithms are used to train the ANNs for obtaining accurate CAD models. The validity and accuracy of the proposed synthesis models have been verified by comparison with the CMT analysis, HFSS[13] electromagnetic simulation and experimental results previously published in Ref.[3].
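As a rough illustration of the CMT analysis step mentioned above, the following sketch evaluates the quasi-static formulas of Section II (Eqs.(1)-(12)) for one geometry. It is a minimal sketch and not the authors' program: the function names (`ellipk`, `k_ratio`, `modulus`, `acpw_cmt`) are hypothetical, and the complete elliptic integral is computed with the standard arithmetic-geometric-mean identity, which is an implementation choice assumed here rather than something stated in the paper.

```python
import math

def ellipk(k):
    """Complete elliptic integral of the first kind K(k), modulus 0 <= k < 1.
    Uses the identity K(k) = pi / (2 * AGM(1, sqrt(1 - k^2)))."""
    a, b = 1.0, math.sqrt(1.0 - k * k)
    for _ in range(40):  # the AGM converges quadratically; 40 steps is ample
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def k_ratio(k):
    """K(k) / K(k') with the complementary modulus k' = sqrt(1 - k^2), Eq.(4)."""
    return ellipk(k) / ellipk(math.sqrt(1.0 - k * k))

def modulus(p1, p2, p3, p4):
    """Modulus of Eqs.(5) and (8) for four ordered points p1 < p2 < p3 < p4."""
    return math.sqrt((p3 - p2) * (p4 - p1) / ((p3 - p1) * (p4 - p2)))

def acpw_cmt(w, g1, g2, h, eps_r):
    """Return (eps_eff, Z0 in ohms) of an ACPW with strip width w, slot widths
    g1 and g2, substrate thickness h (all in the same length unit) and
    relative dielectric constant eps_r."""
    # Slot and strip edges in the original plane, as listed below Eq.(5).
    pts = [-(2 * g1 + w) / 2, -w / 2, w / 2, (2 * g2 + w) / 2]
    # Hyperbolic sine mapping of the dielectric region, Eq.(6).
    mapped = [math.sinh(math.pi * p / (2 * h)) for p in pts]
    ki = modulus(*pts)      # air-region modulus, Eq.(5)
    kf = modulus(*mapped)   # dielectric-region modulus, Eq.(8)
    q = 0.5 * k_ratio(kf) / k_ratio(ki)          # filling factor, Eq.(11)
    eps_eff = 1.0 + (eps_r - 1.0) * q            # Eq.(10)
    z0 = 60.0 * math.pi / (math.sqrt(eps_eff) * k_ratio(ki))  # Eq.(12)
    return eps_eff, z0

# Example call (dimensions in metres; the values are illustrative only).
eps_eff, z0 = acpw_cmt(100e-6, 60e-6, 120e-6, 200e-6, 12.9)
```

Sweeping such a function over the design parameter ranges would give (dimensions, Z0) pairs of the kind used as training data. For a very thick substrate the filling factor tends to 1/2, so the effective dielectric constant approaches (εr + 1)/2, which is a convenient sanity check.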

∗Manuscript Received Aug. 2011; Accepted Jan. 2012. This work is supported by the National Natural Science Foundation of China (No.61071044), the Traffic Applied Basic Research Project of the Ministry of Transport of China (No.2010-329-225-030) and the Fundamental Research Funds for the Central Universities.

II. CMT Analysis

The cross-section of an ACPW with finite dielectric thickness is shown in Fig.1(a). In this figure, w represents the central strip width, $g_1$ and $g_2$ represent the slot widths, and the dielectric substrate has thickness h and relative dielectric constant $\varepsilon_r$. If the dielectric interfaces in the slots are modeled as magnetic walls and all the conductors are assumed to be infinitely thin and perfectly conducting, the overall capacitance per unit length of the ACPW with finite dielectric thickness can be written as

$$C = C_a + C_s \tag{1}$$

where $C_a$ is the air-space capacitance after removing the dielectric substrate and $C_s$ is the capacitance introduced by the dielectric substrate of thickness h having the equivalent dielectric constant $(\varepsilon_r - 1)$.

Fig. 1. Conformal mappings for calculating the capacitance of ACPW with finite dielectric thickness. (a) Original ACPW structure; (b) Intermediate mapping for the dielectric region; (c) Mapping into parallel-plate capacitor

First, in order to obtain the air capacitance $C_a$, assuming the ACPW without the substrate, the upper half plane in Fig.1(a) is transformed into the rectangular region in the r-plane by means of the Schwarz-Christoffel transformation

$$r = \int_{i_1}^{i} \frac{di}{\sqrt{(i - i_1)(i - i_2)(i - i_3)(i - i_4)}} \tag{2}$$

As a result, the air capacitance per unit length of the line is

$$C_a = 2\,\varepsilon_0\,\frac{K(k_i)}{K(k_i')} \tag{3}$$

where $K(k_i)$ is the complete elliptic integral of the first kind with modulus $k_i$, and

$$k_i' = \sqrt{1 - k_i^2} \tag{4}$$

$$k_i = \sqrt{\frac{(i_3 - i_2)(i_4 - i_1)}{(i_3 - i_1)(i_4 - i_2)}} \tag{5}$$

with

$$i_1 = -\frac{2g_1 + w}{2},\quad i_2 = -\frac{w}{2},\quad i_3 = \frac{w}{2},\quad i_4 = \frac{2g_2 + w}{2}.$$

Second, in order to compute the dielectric capacitance, the dielectric region in Fig.1(a) is transformed into the lower half region shown in Fig.1(b) by the hyperbolic sine transformation

$$i' = \sinh\!\left(\frac{\pi f}{2h}\right) \tag{6}$$

Then, using the Schwarz-Christoffel transformation, the dielectric capacitance per unit length of the line can be obtained as follows:

$$C_s = (\varepsilon_r - 1)\,\varepsilon_0\,\frac{K(k_f)}{K(k_f')} \tag{7}$$

where $k_f'$ is the complementary modulus of $k_f$, as in Eq.(4), and

$$k_f = \sqrt{\frac{(i_3' - i_2')(i_4' - i_1')}{(i_3' - i_1')(i_4' - i_2')}} \tag{8}$$

with

$$i_1' = \sinh\!\left(\frac{\pi f_1}{2h}\right) = -\sinh\!\left(\frac{\pi (2g_1 + w)}{4h}\right),\qquad i_2' = \sinh\!\left(\frac{\pi f_2}{2h}\right) = -\sinh\!\left(\frac{\pi w}{4h}\right),$$

$$i_3' = \sinh\!\left(\frac{\pi f_3}{2h}\right) = \sinh\!\left(\frac{\pi w}{4h}\right),\qquad i_4' = \sinh\!\left(\frac{\pi f_4}{2h}\right) = \sinh\!\left(\frac{\pi (2g_2 + w)}{4h}\right).$$

Finally, the overall capacitance per unit length of the ACPW with finite dielectric thickness is

$$C = 2\,\varepsilon_0\,\frac{K(k_i)}{K(k_i')} + (\varepsilon_r - 1)\,\varepsilon_0\,\frac{K(k_f)}{K(k_f')} \tag{9}$$

Therefore, the effective dielectric constant and characteristic impedance are, respectively,

$$\varepsilon_{eff} = C/C_0 = 1 + (\varepsilon_r - 1)\cdot q \tag{10}$$

where the filling factor q is expressed as

$$q = \frac{1}{2}\cdot\frac{K(k_i')}{K(k_i)}\cdot\frac{K(k_f)}{K(k_f')} \tag{11}$$

and

$$Z_0 = \frac{60\pi}{\sqrt{\varepsilon_{eff}}}\cdot\frac{K(k_i')}{K(k_i)} \tag{12}$$

III. ANN Synthesis Models

ANNs have been developed for many years. Recently, ANNs have gained attention as fast and flexible vehicles for microwave modeling, simulation, and optimization. Feed-forward neural networks are a basic type of neural network suitable for modeling high-dimensional and highly nonlinear problems. An important class of feed-forward neural networks is the Multilayer perceptron (MLP). An MLP consists of three types of layers: an input layer, an output layer and one or more hidden layers. The success of MLPs for a particular problem depends on how well the training algorithm suits the requirements of the problem. To obtain high-precision synthesis models, Back-propagation with momentum (BPM), Resilient back propagation (RBP), Scaled conjugate gradient (SCG), Conjugate gradient with Fletcher-Reeves (CGF), Broyden-Fletcher-Goldfarb-Shanno (BFGS), and Levenberg-Marquardt

Table 1. Errors obtained from ANN synthesis models trained with different learning algorithms

First ANN synthesis model

| Learning algorithm | MRE (%) Training | MRE (%) Test | ARE (%) Training | ARE (%) Test | MSE Training | MSE Test |
| BPM  | 242  | 203  | 18.0 | 18.3 | 6.80 × 10^−3 | 7.40 × 10^−3 |
| RBP  | 102  | 101  | 4.24 | 4.56 | 4.88 × 10^−4 | 5.49 × 10^−4 |
| SCG  | 37.5 | 25.9 | 1.27 | 1.39 | 4.66 × 10^−5 | 5.68 × 10^−5 |
| CGF  | 28.7 | 29.1 | 1.33 | 1.41 | 5.34 × 10^−5 | 6.46 × 10^−5 |
| BFGS | 7.67 | 8.54 | 0.44 | 0.46 | 6.35 × 10^−6 | 7.77 × 10^−6 |
| LM   | 0.69 | 0.77 | 0.02 | 0.03 | 1.73 × 10^−8 | 5.03 × 10^−8 |

Second ANN synthesis model

| Learning algorithm | MRE (%) Training | MRE (%) Test | ARE (%) Training | ARE (%) Test | MSE Training | MSE Test |
| BPM  | 492  | 324  | 20.5 | 20.8 | 2.33 × 10^−2 | 2.63 × 10^−2 |
| RBP  | 194  | 120  | 4.10 | 4.39 | 1.40 × 10^−3 | 1.70 × 10^−3 |
| SCG  | 43.1 | 27.7 | 1.25 | 1.34 | 1.17 × 10^−4 | 1.48 × 10^−4 |
| CGF  | 60.7 | 86.3 | 1.41 | 1.67 | 1.59 × 10^−4 | 2.21 × 10^−4 |
| BFGS | 23.1 | 45.0 | 0.71 | 0.78 | 3.92 × 10^−5 | 4.81 × 10^−5 |
| LM   | 0.78 | 2.51 | 0.03 | 0.04 | 6.14 × 10^−8 | 2.47 × 10^−7 |

(LM) algorithms[10,14,15] have been used to train the MLPs in order to find a proper training algorithm.

The aim of this paper is to develop two accurate ANN synthesis models for ACPWs with finite dielectric thickness. Fig.2 gives the ANN synthesis models. The first ANN synthesis model can be used to calculate the central strip width w for a given substrate (h, $\varepsilon_r$) and required characteristic impedance $Z_0$ by choosing appropriate $g_1$ and $g_2$. The second ANN synthesis model can be used to compute the slot width $g_2$ for a given substrate (h, $\varepsilon_r$) and required characteristic impedance $Z_0$ by choosing appropriate w and $g_1/g_2$.

Fig. 2. ANN synthesis models for ACPWs with finite dielectric thickness. (a) The first ANN synthesis model; (b) The second ANN synthesis model

The ANN model is a kind of black-box model, whose accuracy depends on the data sets used during training. In this study, the training data sets were obtained from the CMT analysis presented in Section II. For each ANN model, 5000 different data sets were used. 3500 data sets were used in training and the rest were used to test the ANN models. The design parameter ranges of ACPWs with finite dielectric thickness are 2 ≤ $\varepsilon_r$ ≤ 22, 0.1 ≤ w/h ≤ 0.9, 0.1 ≤ $g_2/h$ ≤ 2, 0.1 ≤ $g_1/g_2$ ≤ 1, and 25 Ω ≤ $Z_0$ ≤ 220 Ω.

To find proper ANN synthesis models for ACPWs with finite dielectric thickness, many experiments were carried out in this study. After many trials, it was found that the target of high accuracy was achieved by using a network with two hidden layers. The numbers of neurons in the first and second hidden layers were 12 and 24 for both ANN synthesis models. For each ANN model, the tangent sigmoid activation function was used in the hidden layers, and the linear activation function was used in the output layers. The training and testing data sets were scaled between −1.0 and +1.0 for inputs and outputs before training in order to facilitate an easier learning process.

As mentioned above, the six learning algorithms are used to train the neural models. In order to compute the ratio of geometrical dimensions $(w/h)_{ANN}$ or $(g_2/h)_{ANN}$, training the ANN models using these learning algorithms involves presenting them sequentially and/or randomly with different data sets ($\varepsilon_r$, $g_2/h$ or w/h, $g_1/g_2$, and $Z_0$), corresponding to the ratio of geometrical dimensions w/h or $g_2/h$. The Mean square error (MSE) between the target and the actual outputs of the networks is used to adapt the weights of the ANNs. The adaptation is carried out after the presentation of each data set ($\varepsilon_r$, $g_2/h$, w/h, $g_1/g_2$, and $Z_0$), until the calculation accuracy of the models is deemed satisfactory according to one criterion: either the MSE for all the training data sets falls below a given threshold or the maximum allowable number of epochs is reached.

IV. Numerical Results and Discussion

ANNs have been successfully introduced for the synthesis of ACPWs with finite dielectric thickness. To obtain high-precision synthesis models, ANNs were trained by using the BPM, RBP, SCG, CGF, BFGS, and LM learning algorithms[10,14,15]. It is noted that, for each learning algorithm, the maximum allowable number of epochs was 1000, and the Maximal relative error (MRE), Average relative error (ARE) and MSE of the ANN synthesis models were calculated.

The training and test errors obtained from the ANN models trained with different learning algorithms are summarized in Table 1. When the training and test performances for the six learning algorithms are compared with each other, the best results were obtained from the ANNs trained with the LM algorithm for both the first and second synthesis models. The worst results were obtained from the ANNs trained with the BPM algorithm. As can be seen from Table 1, for each ANN synthesis model trained with the LM algorithm, the MRE is less than 2.6% and the ARE is less than 0.05%. These error values show that the ANN synthesis models trained with the LM algorithm can be used for accurately computing the physical dimensions of ACPWs with finite dielectric thickness for the required design specifications.

In order to validate the ANN synthesis models trained with the LM algorithm, their results are compared with the results of the CMT analysis. Figs.3 and 4, respectively, show the CMT analysis contours of the ratios of geometrical dimensions w/h and $g_2/h$ versus the ratio of slot widths $g_1/g_2$ for various characteristic impedance values with a given substrate material ($\varepsilon_r$ = 12.9, and h = 200 μm). It is clearly observed that there is a very good agreement between the results of the CMT analysis and the ANN synthesis models trained with the LM algorithm. This good agreement supports the validity of the synthesis models proposed here. Similar results are obtained for the different dielectric substrate materials (2 ≤ $\varepsilon_r$ ≤ 22), but they are not given here to avoid repetition.

Fig. 3. Comparison of the results obtained from the first ANN synthesis model and the CMT analysis contours for ACPWs with finite dielectric thickness ($\varepsilon_r$ = 12.9, $g_2/h$ = 0.7, h = 200 μm)

Fig. 4. Comparison of the results obtained from the second ANN synthesis model and the CMT analysis contours for ACPWs with finite dielectric thickness ($\varepsilon_r$ = 12.9, w/h = 0.5, h = 200 μm)

The comparisons among the results of the ANN synthesis models trained with the LM algorithm, the symmetric CPW synthesis formula[16], and the CMT analysis for symmetric and asymmetric CPWs with finite dielectric thickness ($\varepsilon_r$ = 10.2, h = 200 μm, and $g_2/h$ = 0.5) are given in Fig.5. The characteristic impedance results are plotted with respect to the ratio of geometrical dimensions w/h for three different $g_1/g_2$ values. It is observed that the results of the ANN synthesis models trained with the LM algorithm are in good agreement with the results of the synthesis formula[16] and the CMT analysis. It is also seen that there is a good self-consistent agreement between the first and second synthesis models.

Fig. 5. Comparisons among the results of the ANN synthesis models trained with the LM algorithm, the symmetric CPW synthesis formula[16], and the CMT analysis for symmetric and asymmetric CPWs with finite dielectric thickness ($\varepsilon_r$ = 10.2, h = 200 μm, and $g_2/h$ = 0.5)

In Table 2, the results obtained from the ANN synthesis models trained with the LM algorithm are compared with the results of the CMT analysis, HFSS[13] electromagnetic simulation, and experimental work[3]. $Z_{0m}$ is the measured characteristic impedance value, and w, $g_1$, and $g_2$ represent the measured geometrical dimensions of ACPWs with finite dielectric thickness. Also, $Z_{0h}$ and $Z_{0c}$ represent the characteristic impedance values obtained from the HFSS and CMT analysis, respectively, by using w, $g_1$, and $g_2$. $w'$ represents the strip width obtained from the first ANN synthesis model by using $g_1$ and $g_2$. $g_2'$ represents the slot width obtained from the second ANN synthesis model by using w and $g_1/g_2$. Finally, the CMT analysis results ($Z_{0w'}$ and $Z_{0g'}$) are calculated by using $w'$ and $g_2'$ for checking. As can be seen from Table 2, a good agreement is obtained between the theoretical and experimental results.

Table 2. Comparisons of the results of the proposed ANN synthesis models, CMT, HFSS, and experimental results

| Measured w (μm) | Measured g1 (μm) | Measured g2 (μm) | Measured Z0m (Ω) | HFSS Z0h (Ω) | CMT Z0c (Ω) | First ANN model w′ (μm) | First ANN model Z0w′ (Ω) | Second ANN model g2′ (μm) | Second ANN model Z0g′ (Ω) |
| 747  | 123 | 1060 | 51.5 | 51.33 | 51.77 | 747.03  | 51.76 | 1056.58 | 51.75 |
| 737  | 257 | 991  | 57.5 | 58.77 | 59.88 | 736.00  | 59.89 | 989.43  | 59.86 |
| 1248 | 406 | 1548 | 62.4 | 62.87 | 62.37 | 1240.10 | 62.44 | 1492.36 | 62.16 |
| 1244 | 575 | 1386 | 66.3 | 66.81 | 67.16 | 1242.38 | 67.18 | 1325.59 | 66.81 |

V. Conclusion

In this paper, novel and accurate ANN models are presented for the synthesis of ACPWs with finite dielectric thickness. For each ANN model trained with the LM algorithm, the MRE is less than 2.6% and the ARE is less than 0.05%. These error values show that the proposed ANN models can be used for accurately computing the physical dimensions of ACPWs with finite dielectric thickness in a very simple way, rather than by iteratively applying the analysis method. The ANN models have been validated by comparing their results with the results of the CMT analysis, HFSS electromagnetic simulation, and experimental works. Also, the proposed ANN models can be used for symmetric CPWs with finite dielectric thickness.

References

[1] R.N. Simons, Coplanar Waveguide Circuits, Components, and Systems, John Wiley & Sons, New York, USA, 2001.
[2] S.Q. Fu, S.J. Fang, Z.B. Wang, X.M. Li, “Broadband circularly polarized slot antenna array fed by asymmetric CPW for L-band applications”, IEEE Antennas and Wireless Propagation Letters, Vol.8, pp.1014–1016, 2009.
[3] V.F. Hanna, D. Thebault, “Theoretical and experimental investigation of asymmetric coplanar waveguides”, IEEE Transactions on Microwave Theory and Techniques, Vol.32, No.12, pp.1649–1651, Dec. 1984.
[4] P. Chen, S.J. Fang, “Calculation and analysis of dispersion characteristic of ACPW using FDTD method”, Acta Electronica Sinica, Vol.34, No.9, pp.1610–1612, Sept. 2006. (in Chinese)
[5] M.J. Fan, S.J. Fang, P. Chen, E.C. Wang, K. Chen, “Multi-resolution time-domain analysis on dispersion characteristic of ACPW”, Journal of Dalian Maritime University, Vol.33, No.2, pp.49–52, May 2007. (in Chinese)
[6] V.F. Hanna, D. Thebault, “Analysis of asymmetrical coplanar waveguides”, International Journal of Electronics, Vol.53, No.3, pp.221–224, Mar. 1981.
[7] C. Karpuz et al., “Fast and simple analytical expressions for quasistatic parameters of asymmetric coplanar lines”, Microwave and Optical Technology Letters, Vol.9, No.6, pp.334–336, Aug. 1995.
[8] S.J. Fang, B.S. Wang, “Analysis of asymmetric coplanar waveguide with conductor backing”, IEEE Transactions on Microwave Theory and Techniques, Vol.47, No.2, pp.238–240, Feb. 1999.
[9] S.J. Fang, B.S. Wang, “CAD-oriented model for asymmetrically shielded multilayered CPW”, Acta Electronica Sinica, Vol.30, No.6, pp.804–807, June 2002. (in Chinese)
[10] C. Yildiz, S. Sagiroglu, O. Saracoğlu, “Neural models for coplanar waveguides with a finite dielectric thickness”, International Journal of RF and Microwave Computer-Aided Engineering, Vol.9, No.5, pp.394–402, Sept. 1999.
[11] M. Turkmen, C. Yildiz, K. Guney, S. Kaya, “Comparison of adaptive-network-based fuzzy inference system models for analysis of conductor-backed asymmetric coplanar waveguides”, Progress In Electromagnetics Research M, Vol.8, pp.1–13, 2009.
[12] S. Kaya, K. Guney, C. Yildiz, M. Turkmen, “New and accurate synthesis formulas for asymmetric conductor-backed coplanar waveguides”, Microwave and Optical Technology Letters, Vol.53, No.1, pp.211–216, Jan. 2011.
[13] User’s Guide-High Frequency Structure Simulator, Ansoft Corporation, Pittsburgh, USA, 2005.
[14] C. Yildiz, K. Guney, M. Turkmen, S. Kaya, “Neural models for coplanar strip line synthesis”, Progress In Electromagnetics Research, Vol.69, pp.127–144, 2007.
[15] K. Guney, C. Yildiz, S. Kaya, M. Turkmen, “Neural models for the V-shaped conductor-backed coplanar waveguides”, Microwave and Optical Technology Letters, Vol.49, No.6, pp.1294–1299, June 2011.
[16] C. Yildiz, M. Turkmen, “New and very simple CAD models for coplanar waveguide synthesis”, Microwave and Optical Technology Letters, Vol.41, No.1, pp.49–53, Apr. 2004.

WANG Zhongbao was born in Sichuan Province, China, in 1983. He received the B.E. and M.E. degrees in communication engineering from Dalian Maritime University (DLMU), Dalian, China, in 2007 and 2009, respectively, and is currently working toward the Ph.D. degree at DLMU. His current research interests include patch antennas, passive RF components and microwave technology using artificial intelligence. (Email: [email protected])

FANG Shaojun (corresponding author) was born in Shandong Province, China, in 1957. He received the Ph.D. degree in communication and information systems from Dalian Maritime University, Dalian, China, in 2001. He is currently a professor and doctoral supervisor in the School of Information Science and Technology, DLMU. His recent research interests include patch antennas, ACPW components and computational electromagnetics. (Email: fangshj@dlmu.edu.cn)

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

A New Approach to Airborne High Resolution SAR Motion Compensation for Large Trajectory Deviations∗

MENG Dadi1, HU Donghui1 and DING Chibiao2 (1.Key Laboratory of Technology in Geospatial Information Processing and Application System, Chinese Academy of Sciences, Beijing 100190, China) (2.Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China)

Abstract: The Two-step algorithm (TSA) is widely used for trajectory deviation compensation of airborne Synthetic aperture radar (SAR), by which most of the motion errors are compensated before Range curve migration compensation (RCMC) and the residual after the RCMC. We found that the RCMC in the presence of the residual motion errors results in additional range shift errors that are hard to compensate for. Based on theoretical investigations, this shortcoming of TSA is reported and a new compensation scheme which greatly alleviates the RCMC induced errors is proposed. Besides, range resampling that accounts for the residual motion errors is very convenient in the new scheme, whereas it is cumbersome in TSA. The new method is effective in compensating high resolution SAR systems for large trajectory deviations, which is hard to achieve with TSA because of the uncompensated errors. Results on simulated data are provided to demonstrate the effectiveness of the new method.

Key words: Synthetic aperture radar (SAR), Motion compensation, Space variant.

I. Introduction

Airborne Synthetic aperture radar (SAR) platforms always experience trajectory deviations from the nominal straight line, mainly due to atmospheric turbulence, and forward velocity variations during the period of data acquisition. These, if not reasonably corrected, will introduce motion errors on the raw SAR data and deteriorate the focusing quality of the resulting SAR images, such as loss of geometric and radiometric resolution, reduction of image contrast, increase of sidelobes, etc[1].

To account for these motion errors, a Global positioning system (GPS) receiver and an Inertial navigation unit (INU) are mounted on the SAR platform to record its track data during the period of data acquisition[2]. The recorded track data are used during the SAR focusing procedure to calculate and compensate for the motion errors. Forward velocity variations can be easily compensated for by resampling the raw SAR data in the azimuth direction[3−5]; this compensation is not addressed in this paper. Compensation of trajectory deviations, namely Motion compensation (MOCO), is commonly implemented by the Two-step algorithm (TSA)[3,6−8], by which only the errors in the mean squint angle direction are corrected (the so-called Central beam approximation (CBA)[10] is adopted), and the illuminated surface is supposed flat[11].

In TSA, motion errors are split into space invariant and range variant components. The first-order MOCO is applied to the range compressed data (or the raw SAR data, in terms of phase correction and range resampling) to compensate for the space invariant component. After the Range Curve Migration Compensation (RCMC), with the CBA adopted, the second-order MOCO is carried out to compensate for the range variant component.

TSA is widely used in the SAR community. It can be easily integrated into popular SAR focusing algorithms such as the Range doppler algorithm (RDA)[4] and the Chirp scaling algorithm (CSA)[6,7]. Efforts have been made to integrate it into the Range migration algorithm (RMA)[8,9].

As a matter of fact, to obtain well focused SAR images when the trajectory deviations are sufficiently large, especially for high resolution SAR systems, the range variant range resampling must be considered, which is hard to carry out in TSA because of the imaging algorithm architecture. On the other hand, we found that the RCMC in the presence of residual motion errors results in additional range shift errors (besides the inherent range variant range shift errors) which cannot be properly compensated for. For high resolution SAR systems with large trajectory deviations, the RCMC induced range shift errors can be larger than several range cells, which leads to a resolution loss in the SAR images.

To deal with the RCMC induced errors and the range variant range resampling, in this paper we propose a new MOCO approach named Direct MOCO algorithm (DMA), which accomplishes space invariant and range variant range resampling and phase correction on the range compressed SAR data before RCMC. Thus the RCMC induced errors can be greatly alleviated and the precision of MOCO for large trajectory deviations can be remarkably improved.

In this paper, after the introduction of the basic MOCO concept in Section II, the SAR focusing procedure with TSA integrated is derived in detail in Section III. The DMA MOCO approach is analysed in Section IV. Error evaluation is carried out in Section V. The two approaches are compared on simulated SAR data in Section VI. Section VII provides a computational burden analysis, and Section VIII concludes this paper.

For the sake of clarity, in the following discussion, strip-map SAR with zero squint angle is supposed, the illuminated surface is assumed flat and on the right side of the trajectory, and the forward velocity variations are ignored.

∗Manuscript Received Dec. 2010; Accepted Dec. 2011.
A New Approach to Airborne High Resolution SAR Motion Compensation for Large Trajectory Deviations

II. Conceptional Introduction of MOCO

1. Geometry definition

The classic strip-map SAR system geometry in the presence of trajectory deviations is shown in Fig.1. The $x$-$y$ plane represents the flat illuminated surface, and the nominal straight flight track is in the $y$ direction with relative height $H$ and $x = 0$. The actual trajectory is also depicted, and $\delta_x(\eta)$, $\delta_z(\eta)$ are its displacements from the nominal one in the $x$ and $z$ directions respectively at the azimuth position

$$y = v\cdot\eta \tag{1}$$

where $v$ is the forward velocity of the platform and $\eta$ the along track time.

Fig. 1. Geometry of a typical stripmap SAR system in the presence of trajectory deviations

Let

$$s(t)=\mathrm{rect}(t/T)\cdot\exp\{j2\pi f_0 t+j\pi K t^2\} \tag{2}$$

be the transmitted chirp signal, where $j=\sqrt{-1}$, $f_0$ is the carrier frequency, $T$, $K$ the chirp duration and rate respectively, and $t$ the range delay time. A rectangular azimuth antenna pattern is adopted in the following. When the platform is at position $(\delta_x(\eta), v\eta, H+\delta_z(\eta))$ on the actual trajectory, the corresponding position on the nominal trajectory is $(0, v\eta, H)$. With the amplitude of the received signal neglected, after demodulation and range compression, the received signal from a point target A located at $(x_A, y_A, 0)$ in the illuminated area is given by

$$s_A(\eta,t)=\mathrm{rect}\!\left(\frac{v\eta-y_A}{S}\right)\cdot\mathrm{sinc}\{KT\cdot[t-2r_A(\eta)/c]\}\cdot\exp\{-j4\pi r_A(\eta)/\lambda\} \tag{3}$$

where $c=3\times10^8$ m/s is the light speed, $\lambda=c/f_0$ the wavelength of the transmitted signal, $S$ the synthetic aperture length with respect to target A, and $r_A(\eta)$ the distance between the instantaneous platform position and the location of target A

$$r_A(\eta)=[(x_A-\delta_x(\eta))^2+(y_A-v\eta)^2+(H+\delta_z(\eta))^2]^{1/2} \tag{4}$$

If the platform is at the corresponding position on the nominal straight track, Eq.(4) becomes

$$\hat r_A(\eta)=r_A(\eta)\big|_{\delta_x(\eta)=0,\ \delta_z(\eta)=0}=[x_A^2+(y_A-v\eta)^2+H^2]^{1/2} \tag{5}$$

and the distance difference is

$$\Delta r_A(\eta)=r_A(\eta)-\hat r_A(\eta) \tag{6}$$

Substituting $r_A(\eta)$ in Eq.(3) by $\hat r_A(\eta)$, we obtain $\hat s_A(\eta,t)$, the received signal of target A acquired without trajectory deviations. The purpose of MOCO is to transform $s_A(\eta,t)$ to $\hat s_A(\eta,t)$ for all the targets in the SAR data. The transformation consists of range resampling (involving the sinc function in Eq.(3)) and phase correction (involving the exponential factor in Eq.(3))[1]. Unfortunately, as a result of both the range curve migration and the azimuth beamwidth, the compensation can not be precisely implemented at this stage, since the received signals from many targets at different locations, which require different amounts of compensation, are blended together.

2. The space variance characters of motion errors

As stated above, SAR platform deviations cause different motion errors ($\Delta r_A(\eta)$) for targets at different locations. From the viewpoint of MOCO, let $R_{near}$ and $R_{far}$ be the nearest and furthest slant range of the SAR data respectively, and let the middle range (namely the reference range) be

$$R_{ref}=(R_{near}+R_{far})/2 \tag{7}$$

Then the series of $x$ coordinates on the illuminated surface corresponding to the middle range bin before MOCO (a curve along $\eta$ depicted in Fig.1, CBA applied) is

$$x_{ref}(\eta)=\delta_x(\eta)+[R_{ref}^2-(H+\delta_z(\eta))^2]^{1/2} \tag{8}$$

The space invariant motion errors according to the reference range are defined as

$$\Delta r_{ref}(\eta)=R_{ref}-[x_{ref}^2(\eta)+H^2]^{1/2} \tag{9}$$

For target A, the residual motion errors

$$\Delta r_{A,res}(\eta)=\Delta r_A(\eta)-\Delta r_{ref}(\eta) \tag{10}$$

are defined as the space variant motion errors. Conventionally, with respect to the signal after RCMC, $\Delta r_{A,res}(\eta)$ can be split into a range variant part $\Delta r_{A,rv}(\eta)$ and an azimuth variant part $\Delta r_{A,av}(\eta)$:

$$\Delta r_{A,res}(\eta)=\Delta r_{A,rv}(\eta)+\Delta r_{A,av}(\eta) \tag{11}$$

where

$$\Delta r_{A,rv}(\eta)=\Delta r_A(\eta)\big|_{y_A=v\eta}-\Delta r_{ref}(\eta) \tag{12}$$

and

$$\Delta r_{A,av}(\eta)=\Delta r_A(\eta)-\Delta r_A(\eta)\big|_{y_A=v\eta} \tag{13}$$

In this definition, at the instantaneous azimuth time $\eta_0$, $\Delta r_{A,rv}(\eta_0)$ accounts for the motion errors of C in Fig.1, which is in the same range bin as A but in the beam centre direction, and $\Delta r_{A,av}(\eta_0)$ for the residual motion errors (which are zero when $\eta_0=y_A/v$).

In our opinion, however, the space variance of motion errors can also be defined with respect to the signal before RCMC. At azimuth time $\eta$, a target B located at $(x, v\eta, 0)$ satisfies $r_B(\eta)=r_A(\eta)$, which implies:

$$x-\delta_x(\eta)=[(x_A-\delta_x(\eta))^2+(y_A-v\eta)^2]^{1/2} \tag{14}$$

Then the range and azimuth variant motion errors can be defined as

$$\Delta r_{A,rV}(\eta)=\Delta r_A(\eta)\big|_{x_A=x,\ y_A=v\eta}-\Delta r_{ref}(\eta) \tag{15}$$

and

$$\Delta r_{A,aV}(\eta)=\Delta r_A(\eta)-\Delta r_A(\eta)\big|_{x_A=x,\ y_A=v\eta} \tag{16}$$

respectively, and they satisfy

$$\Delta r_{A,res}(\eta)=\Delta r_{A,aV}(\eta)+\Delta r_{A,rV}(\eta) \tag{17}$$

In this definition, however, at the instantaneous azimuth time $\eta_0$, $\Delta r_{A,rV}(\eta_0)$ accounts for the motion errors of B in Fig.1, which is in the range bin $r_A(\eta_0)$ but in the beam centre direction, and $\Delta r_{A,aV}(\eta_0)$ for the residual motion errors (which are zero when $\eta_0=y_A/v$).

In fact, the former definitions of space variance are adopted in the conventional TSA approach, and the latter will be used in the new MOCO approach proposed in Section IV. Some simulation results are shown in the following sections to demonstrate the differences between the two definitions.

III. Investigation of TSA

For simplicity of illustration, the following analysis is carried out for the RDA case (other algorithms share the same results since they rest on the same imaging theory), where RCMC is implemented in the range-doppler domain.

As stated above, TSA is implemented by space invariant compensation of the range compressed data and range variant phase error correction after RCMC. After the first step, Eq.(3) becomes

$$s'_A(\eta,t)=\mathrm{rect}\!\left(\frac{v\eta-y_A}{S}\right)\cdot\mathrm{sinc}\{KT\cdot[t-2(\hat r_A(\eta)+\Delta r_{A,res}(\eta))/c]\}\cdot\exp\{-j4\pi[\hat r_A(\eta)+\Delta r_{A,res}(\eta)]/\lambda\} \tag{18}$$

1. RCMC without residual motion errors

Supposing $\Delta r_{A,res}(\eta)$ to be zero, then $s'_A(\eta,t)=\hat s_A(\eta,t)$. By the Principle of stationary phase (POSP)[12], after the azimuth FFT of $\hat s_A(\eta,t)$, we obtain the corresponding range-doppler domain signal $\hat s_A(f_\eta,t)$. The range delay in the sinc function of $\hat s_A(f_\eta,t)$ is[12]

$$\hat r_{A,rd}(f_\eta)=\hat r_A(y_A/v)\cdot\left(1-\frac{\lambda^2 f_\eta^2}{4v^2}\right)^{-1/2} \tag{19}$$

which is obtained by replacing $\eta$ in Eq.(18) with

$$\eta=\eta_0(f_\eta)=\frac{1}{v}\cdot[y_A-\lambda f_\eta\cdot\hat r_A(y_A/v)\cdot(4v^2-\lambda^2 f_\eta^2)^{-1/2}] \tag{20}$$

with Secondary range compression (SRC)[12] ignored. Eq.(20) can also be written as

$$f_\eta=\eta_0^{-1}(\eta)=\frac{2v}{\lambda}(y_A-v\eta)\cdot[\hat r_A^2(y_A/v)+(y_A-v\eta)^2]^{-1/2} \tag{21}$$

where $(\cdot)^{-1}$ represents the inverse function.

The RCMC is performed by shifting $\hat s_A(f_\eta,t)$ from $\hat r_{A,rd}(f_\eta)$ to $\hat r_A(y_A/v)$. After RCMC and Inverse FFT (IFFT), we have

$$\hat s_A(\eta,t)=\mathrm{rect}\!\left(\frac{v\eta-y_A}{S}\right)\cdot\mathrm{sinc}\{KT\cdot[t-2R_A/c]\}\cdot\exp\{-j4\pi\hat r_A(\eta)/\lambda\} \tag{22}$$

The compensation for the exponential factor in Eq.(22) results in a fully focused SAR image.

2. RCMC in the presence of residual motion errors

By POSP, the range delay $r'_{A,rd}(f_\eta)$ in the sinc function of $s'_A(f_\eta,t)$ is obtained by replacing $\eta$ with $\eta=\eta_1(f_\eta)$, the inverse function of which is given by

$$f_\eta=\eta_1^{-1}(\eta)=\frac{2v}{\lambda}\cdot(y_A-v\eta)\cdot[\hat r_A^2(y_A/v)+(y_A-v\eta)^2]^{-1/2}-\frac{2}{\lambda}\cdot\Delta r'_{A,res}(\eta)=\eta_0^{-1}(\eta)-\frac{2}{\lambda}\cdot\Delta r'_{A,res}(\eta) \tag{23}$$

where $(\cdot)'$ represents the derivative operator.

In the RCMC of SAR imaging algorithms, the term $\frac{2}{\lambda}\cdot\Delta r'_{A,res}(\eta)$ in Eq.(23) is not considered. In other words, RCMC of $s'_A(f_\eta,t)$ is implemented according to $\hat r_{A,rd}(f_\eta)$ in Eq.(19), which leads to a range shift error

$$\Delta r_{A,RCM}(\eta)=\hat r_{A,rd}(f_\eta)\big|_{f_\eta=\eta_1^{-1}(\eta)}-\hat r_{A,rd}(f_\eta)\big|_{f_\eta=\eta_0^{-1}(\eta)}=\hat r_A(y_A/v)\cdot D_A(\eta)\cdot\left\{\left[1-\frac{1}{v^2}D_A^2(\eta)\cdot\big(\Delta r'_{A,res}(\eta)\big)^2+\frac{2}{v}\cdot\frac{y_A-v\eta}{\hat r_A(y_A/v)}\cdot D_A(\eta)\cdot\Delta r'_{A,res}(\eta)\right]^{-1/2}-1\right\} \tag{24}$$

where

$$D_A(\eta)=\left[1+\left(\frac{y_A-v\eta}{\hat r_A(y_A/v)}\right)^2\right]^{1/2} \tag{25}$$

After RCMC and azimuth IFFT we have

$$\hat s'_A(\eta,t)=\mathrm{rect}\!\left(\frac{v\eta-y_A}{S}\right)\cdot\mathrm{sinc}\{KT\cdot[t-2(\hat r_A(y_A/v)+\Delta r_{A,res}(\eta)+\Delta r_{A,RCM}(\eta))/c]\}\cdot\exp\{-j4\pi[\hat r_A(\eta)+\Delta r_{A,res}(\eta)]/\lambda\} \tag{26}$$

Unfortunately, because $\Delta r_{A,RCM}(\eta)$ depends on the azimuth position ($y_A$), it is impossible to compensate for $\Delta r_{A,RCM}(\eta)$ after RCMC.

Fig. 2. Block diagram of a general SAR focusing procedure with DMA integrated

3. Discussion

From Eq.(26), we can see that after the first order MOCO and RCMC, the energy of target A is in the correct range bin $\hat r_A(y_A/v)$ except for the range shift errors $\Delta r_{A,res}(\eta)+\Delta r_{A,RCM}(\eta)$, and the phase error is $\Delta r_{A,res}(\eta)$. The latter can be easily compensated for after RCMC and before azimuth compression (usually needing one additional FFT pair). The range shift error $\Delta r_{A,res}(\eta)$ is usually hard to compensate for in TSA (its compensation needs one additional FFT pair and transposition pair), because the SAR data are commonly in the azimuth frequency domain after RCMC and before azimuth compression. Compensation of $\Delta r_{A,RCM}(\eta)$ is impossible because of its dependence on the target azimuth position ($y_A$), as stated above. The uncompensated range shift errors ($\Delta r_{A,res}(\eta)+\Delta r_{A,RCM}(\eta)$) will cause severe quality degradation for large trajectory deviations and/or high resolution SAR systems.

IV. DMA Description

Since TSA fails to accomplish the range variant range resampling and the compensation for the RCMC induced range shift errors, an alternative way of MOCO, named DMA, compensates for the space invariant and range variant motion errors on the range compressed SAR data before RCMC, where the second definition (the definition before RCMC) of the space variance of motion errors, Eq.(15) and Eq.(16), is adopted.

Referring to Eq.(3), the space invariant motion errors of Eq.(9) and the range variant motion errors of Eq.(15) can be compensated for together on $s_A(\eta,t)$ in terms of phase correction and range resampling, by phase multiplication and range interpolation respectively. After the compensation we have

$$s_{A,DMA}(\eta,t)=\mathrm{rect}\!\left(\frac{v\eta-y_A}{S}\right)\cdot\mathrm{sinc}\{KT\cdot[t-2(\hat r_A(\eta)+\Delta r_{A,aV}(\eta))/c]\}\cdot\exp\{-j4\pi[\hat r_A(\eta)+\Delta r_{A,aV}(\eta)]/\lambda\} \tag{27}$$

Compared with Eq.(18), the residual motion errors (before RCMC) are reduced from $\Delta r_{A,res}(\eta)$ to $\Delta r_{A,aV}(\eta)$, which are much smaller, and the RCMC induced range shift errors can be greatly alleviated.

Integrated into the general SAR focusing procedure, the procedure of DMA is shown in Fig.2.
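The RCMC induced range shift of Eq.(24) can be checked numerically. The following is a minimal Python sketch, not the authors' implementation: it evaluates the shift directly from Eqs.(19), (21) and (23) and compares it with a closed form in terms of D_A(η) of Eq.(25), obtained by substituting Eq.(23) into Eq.(19). The function names, the normalised Doppler variable g = λ·f_η/(2v) and the sample values are assumptions made for illustration only.

```python
import math


def d_factor(u, r0):
    """D_A(eta) of Eq.(25); u = y_A - v*eta, r0 = range of closest approach."""
    return math.sqrt(1.0 + (u / r0) ** 2)


def range_doppler_delay(g, r0):
    """Eq.(19) written with the normalised Doppler g = lambda * f_eta / (2 v)."""
    return r0 / math.sqrt(1.0 - g * g)


def rcm_shift_numeric(u, r0, v, dr_rate):
    """Range shift obtained by evaluating Eq.(19) at the two Doppler mappings."""
    g0 = u / math.hypot(r0, u)  # Eq.(21): mapping without residual errors
    g1 = g0 - dr_rate / v       # Eq.(23): dr_rate is the residual error rate (m/s)
    return range_doppler_delay(g1, r0) - range_doppler_delay(g0, r0)


def rcm_shift_closed(u, r0, v, dr_rate):
    """Closed form of the same difference, expressed with D_A(eta)."""
    d = d_factor(u, r0)
    inner = 1.0 - (d * dr_rate / v) ** 2 + (2.0 / v) * (u / r0) * d * dr_rate
    return r0 * d * (inner ** -0.5 - 1.0)


if __name__ == "__main__":
    # Illustrative values only: r0 = 10 km, v = 130 m/s, error rate 0.1 m/s.
    for u in (-300.0, 0.0, 300.0):
        print(u, rcm_shift_numeric(u, 10000.0, 130.0, 0.1))
```

With these sample values the shift changes sign with the azimuth offset y_A − vη, which illustrates the azimuth dependence that prevents a single correction after RCMC.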

Fig. 3. Azimuth variant motion errors (after RCMC)
Fig. 4. Azimuth variant motion errors (before RCMC) for targets with different azimuth locations
Fig. 5. RCMC induced range shift errors for targets at different azimuth locations
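As a numerical illustration of the definitions in Section II, the sketch below evaluates the space invariant error of Eq.(9) and the range variant error of Eq.(12) at the beam centre (y_A = vη) for a constant horizontal offset δx. It is an illustrative Python sketch under stated assumptions, not the authors' code: the function names are hypothetical, δz is set to zero, and the values H = 7000 m, x_ref = 7000 m and x_A = 8000 m used in the example are those listed in Table 1.

```python
import math


def slant_range(x_t, dy, h, dx=0.0, dz=0.0):
    """Eq.(4): platform-to-target distance, with dy = y_A - v*eta."""
    return math.sqrt((x_t - dx) ** 2 + dy ** 2 + (h + dz) ** 2)


def motion_error(x_t, dy, h, dx, dz=0.0):
    """Eq.(6): distance on the actual track minus distance on the nominal track."""
    return slant_range(x_t, dy, h, dx, dz) - slant_range(x_t, dy, h)


def reference_error(x_ref, h, dx, dz=0.0):
    """Eqs.(8)-(9): space invariant error for a given x_ref."""
    r_ref = math.sqrt((x_ref - dx) ** 2 + (h + dz) ** 2)  # Eq.(8) solved for R_ref
    return r_ref - math.sqrt(x_ref ** 2 + h ** 2)


def range_variant_error(x_a, x_ref, h, dx, dz=0.0):
    """Eq.(12): error at the beam centre minus the reference error."""
    return motion_error(x_a, 0.0, h, dx, dz) - reference_error(x_ref, h, dx, dz)


if __name__ == "__main__":
    for dx in (5.0, 10.0, 15.0):
        print(dx,
              round(reference_error(7000.0, 7000.0, dx), 4),
              round(range_variant_error(8000.0, 7000.0, 7000.0, dx), 4))
```

For δx = 5, 10 and 15 m these functions give about −3.53, −7.07 and −10.60 m for the space invariant error, in line with the values quoted in Section V.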

V. Error evaluation

1. Simulation of the space variance of motion errors

To highlight the magnitudes of the range variant and azimuth variant motion errors defined in Eqs.(12) and (13), simulations on the space variance of motion errors are carried out.

The relevant parameters are collected in Table 1. Small, medium and large horizontal deviations are tested respectively, and the vertical deviations are set to zero since the horizontal deviations are the relatively dominant ones[15]. Target A, located outside the middle range, is illuminated with a non-zero azimuth angle (referring to Fig.1).

Table 1. Parameters of the simulation on the space variance of motion errors
$\delta_x(\eta)$ (small) | $\delta_x(\eta)$ (medium) | $\delta_x(\eta)$ (large) | $H$ | $x_{ref}(\eta)$ | $x_A$
5 m | 10 m | 15 m | 7000 m | 7000 m | 8000 m

After calculation by Eqs.(9) and (12) we obtain the space invariant motion errors −3.5349/−7.0685/−10.6009 m for $\delta_x(\eta)$ = 5/10/15 m respectively, and the range variant motion errors −0.2275/−0.4552/−0.6831 m respectively. The azimuth variant motion errors are evaluated by Eq.(13) and shown in Fig.3. It is obvious from the results that the range variant motion errors are much larger than the azimuth variant ones (more than ten times) whereas much smaller than the space invariant ones (about seventeen times). With respect to phase correction, the range variant errors always need to be compensated for (which is implemented in TSA) whereas the azimuth variant errors need to be compensated for only when the focus quality is not acceptable (which can be done by the methods proposed in Refs.[13] and [14]). With respect to range resampling, the azimuth variant motion errors are commonly smaller than a range cell and need not be compensated for; the range variant ones, however, usually need to be compensated for, especially for high resolution SAR systems (which is impossible in TSA because of the RCMC induced error stated above).

Using the parameters in Table 1, the azimuth variant motion errors $\Delta r_{A,aV}(\eta)$ can be evaluated by Eqs.(15) and (16) and are plotted in Fig.4. Compared with Fig.3, $\Delta r_{A,aV}(\eta)$ is a little larger than $\Delta r_{A,av}(\eta)$ (less than twice) but much smaller than $\Delta r_{A,res}(\eta)$ (about one tenth). Therefore the RCMC induced range shift errors can be rationally alleviated.

2. Simulation of RCMC induced range shift errors

The RCMC induced range shift errors are observed in this section by simulation. The expression of $\Delta r_{A,RCM}(\eta)$ implies that $\Delta r_{A,RCM}(\eta)$ is determined by $\Delta r'_{A,res}(\eta)$, $v$, $R_A$ and $y_A-v\eta$. To test the relationship between $\Delta r_{A,RCM}(\eta)$ and these factors, simulations considering the five groups of parameters listed in Table 2 are carried out. The experimental results are plotted in Fig.5.

Table 2. Parameters of the simulation on RCMC induced range shift errors
Parameter | test 1 | test 2 | test 3 | test 4 | test 5
$\Delta r'_{A,res}(\eta)$ (m s−1) | 0.01 | 0.1 | 0.2 | 0.1 | 0.1
$V$ (m s−1) | 130 | 130 | 130 | 130 | 100
$R_A$ (m) | 10000 | 10000 | 10000 | 15000 | 10000

The results of test 2 and test 4 nearly overlap, implying that $\Delta r_{A,RCM}(\eta)$ is insensitive to the slant range, whereas the other results show that $\Delta r_{A,RCM}(\eta)$ is sensitive to $v$ and $\Delta r'_{A,res}(\eta)$.

It is recognized from Fig.5 that even if $\Delta r_{A,res}(\eta)$ is fully compensated for after RCMC, the RCMC induced range shift errors are usually larger than a range cell, even tens of range cells for large deviations of high resolution SAR systems.

Moreover, Fig.5 implies that $\Delta r_{A,RCM}(\eta)$ depends heavily on the azimuth position of the target, therefore it can not be rationally compensated for in the following steps.

VI. Point Target Simulation

In order to qualify DMA for the MOCO of high resolution SAR systems with large trajectory deviations, a simulation on three targets located at different slant ranges of the illuminated surface is carried out. The parameters relevant to the simulation, which depict a typical X-band SAR system, are shown in Table 3. To assess the space variance of the MOCO algorithms, the three targets are placed in near range, middle range and far range respectively. To obtain the focused SAR images, RDA is used as the focusing procedure with DMA or TSA integrated.

Table 3. Parameters of a typical X-band SAR system
Carrier frequency (GHz) | 10 | Azimuth beam width (deg) | 3.44
Range sampling rate (MHz) | 500 | Platform velocity (m s−1) | 130
Range bandwidth (MHz) | 300 | Location of target 1 (m, near range) | (8780, 0, 0)
PRF (Hz) | 700 | Location of target 2 (m, middle range) | (10305, 0, 0)
Nominal track height (m) | 7000 | Location of target 3 (m, far range) | (11761, 0, 0)

A typical large trajectory deviation in the x direction is generated by a third-order polynomial plus a sine function. The vertical deviations, which are relatively small, are set to zero[15].

After the focusing procedure with TSA or DMA integrated respectively, we obtain the point impulse responses of the three targets, the contours of which are plotted in Fig.6. The Single look complex (SLC) image of every contour is measured in terms of resolution, Peak sidelobe ratio (PSLR) and Integrated sidelobe ratio (ISLR), and the results are shown in Table 4. From the experimental results of Fig.6 and Table 4 we can obtain:

(1) For target 1 and 3, which are not in the middle range bin, after TSA, the large RCMC induced range shift errors (approximately

Table 4. Focus qualities of the impulse responses in Fig.6
Case | Slant range resolution (m) | Slant range PSLR (dB) | Slant range ISLR (dB) | Azimuth resolution (m) | Azimuth PSLR (dB) | Azimuth ISLR (dB)
Theoretical | 0.44300 | −13.260 | −9.8000 | 0.22150 | −13.260 | −9.8000
(a) | 0.49688 | −8.6936 | −7.1149 | 0.46283 | −11.6523 | −9.3978
(b) | 0.43945 | −13.0859 | −10.0383 | 0.23577 | −6.2518 | −5.448
(c) | 0.53555 | −6.4769 | −5.4174 | 0.66161 | −0.22248 | 0.42681
(d) | 0.44414 | −13.0437 | −9.8831 | 0.25826 | −5.4783 | −4.2278
(e) | 0.44297 | −9.7968 | −12.9847 | 0.26551 | −3.6059 | −4.4159
(f) | 0.44414 | −13.0768 | −9.8444 | 0.26044 | −3.7132 | −3.3193
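The theoretical row of Table 4 is consistent with the standard relations for an unweighted sinc response. The sketch below is an illustrative Python check under assumptions that the text does not state explicitly: a −3 dB width factor of 0.886, an azimuth resolution approximated by 0.886·λ/(2θ) with θ the azimuth beam width, and the system parameters of Table 3. The function names are hypothetical.

```python
import math

C = 3.0e8  # light speed as used in the paper (m/s)


def range_resolution(bandwidth_hz, k=0.886):
    """-3 dB slant range resolution of a sinc response (assumed factor k)."""
    return k * C / (2.0 * bandwidth_hz)


def azimuth_resolution(carrier_hz, beamwidth_deg, k=0.886):
    """Approximate strip-map azimuth resolution from the azimuth beam width."""
    wavelength = C / carrier_hz
    return k * wavelength / (2.0 * math.radians(beamwidth_deg))


def sinc_pslr_db(step=1e-4):
    """Peak sidelobe ratio of |sin(pi x)/(pi x)|, searched over 1 < x < 2."""
    peak = 0.0
    x = 1.0 + step
    while x < 2.0:
        peak = max(peak, abs(math.sin(math.pi * x) / (math.pi * x)))
        x += step
    return 20.0 * math.log10(peak)


if __name__ == "__main__":
    print(range_resolution(300e6))          # compare with 0.443 m in Table 4
    print(azimuth_resolution(10e9, 3.44))   # compare with 0.2215 m in Table 4
    print(sinc_pslr_db())                   # compare with -13.26 dB in Table 4
```

Under these assumptions the three quantities agree with the theoretical resolution and PSLR entries of Table 4 to within rounding.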

Fig. 6. Contour plots of impulse responses for the RDA+DMA and RDA+TSA (range variant range resampling enabled). The coordinates are defined as the relative distance to the theoretical locations. (a) RDA+TSA (near range); (b) RDA+TSA (middle range); (c) RDA+TSA (far range); (d) RDA+DMA (near range); (e) RDA+DMA (middle range); (f) RDA+DMA (far range)

2 m) cause dispersion of the responses in the range direction and resolution loss in the azimuth direction, whereas no range dispersion of the responses occurs after DMA.

(2) For target 2, which is in the middle range bin, the result of DMA is not as good as that of TSA because the larger azimuth variant motion errors ($\Delta r_{A,aV}(\eta)$) remain after DMA.

(3) The range and azimuth registration errors are too small to be measured.

VII. Computational Analysis

Supposing the numbers of range and azimuth pixels of the SAR data are M and N, the TSA requires N range phase multiplications (to implement the space invariant range resampling) and M azimuth phase multiplications (to implement the range variant phase correction). The additional range variant range resampling requires N range interpolations.

DMA, however, requires N range phase multiplications (to implement the phase correction) and N range interpolations (to implement the range resampling). Compared with TSA, DMA saves M azimuth phase multiplications.

VIII. Conclusions

This paper first presented a theoretical investigation of MOCO for airborne SAR systems. The motion errors were split into space invariant and space variant components, and the latter includes a range variant part and an azimuth variant part. Two definitions of the range and azimuth variance of the space variant component were introduced, which are adopted by TSA and DMA respectively. Then TSA was derived step by step and we found that the RCMC induces additional range shift errors which are hard to compensate for. Based on the analysis above, we presented a new MOCO scheme with the RCMC induced errors greatly alleviated. Finally, we demonstrated the advantage of DMA by a point target simulation. From the work in this paper, we found that DMA can provide high precision MOCO, which is critical for image processing of high resolution SAR systems with large trajectory deviations.

References

[1] C.K. John Jr., “Motion compensation for synthetic aperture radar”, IEEE Transactions on Aerospace and Electronic Systems, Vol.11, No.3, pp.338–348, 1975.
[2] S. Buckreuss, “Motion errors in airborne synthetic aperture radar system”, European Transactions on Telecommunications, Vol.2, No.6, pp.655–664, 1991.
[3] G. Fornaro, “Trajectory deviations in airborne SAR: analysis and compensation”, IEEE Transactions on Aerospace and Electronic Systems, Vol.35, No.3, pp.997–1009, 1999.
[4] G. Franceschetti, R. Lanari, Synthetic Aperture Radar Processing, CRC Press, New York, U.S.A., 1999.
[5] Y. Huang, Z. Bao, F. Zhou, “A novel method for along-track motion compensation of the airborne strip-map SAR”, Acta Electronica Sinica, Vol.33, No.3, pp.459–462, 2005. (in Chinese)
[6] A. Moreira, Y. Huang, “Airborne SAR processing of highly squinted data using a chirp scaling approach with integrated motion compensation”, IEEE Transactions on Geoscience and Remote Sensing, Vol.32, No.5, pp.1029–1040, 1994.
[7] A. Moreira, J. Mittermayer, R. Scheiber, “Extended chirp scaling algorithm for air- and spaceborne SAR data processing in stripmap and ScanSAR imaging modes”, IEEE Transactions on Geoscience and Remote Sensing, Vol.34, No.5, pp.1123–1136, 1996.
[8] A. Reigber, E. Alivizatos, A. Moreira, “Extended wavenumber-domain synthetic aperture radar focusing with integrated motion compensation”, IEE Proceedings-Radar, Sonar and Navigation, Vol.153, No.3, pp.301–310, 2006.
[9] D. An, X. Huang, Z. Zhou, “Motion compensation for low frequency UWB SAR based on the modified wave number domain algorithm”, Acta Electronica Sinica, Vol.38, No.12, pp.2839–2845, 2010. (in Chinese)
[10] G. Fornaro, G. Franceschetti, S. Perna, “On center-beam approximation in SAR motion compensation”, IEEE Transactions on Geoscience and Remote Sensing, Vol.3, No.2, pp.276–280, 2006.
[11] P. Prats, K.A. Camara de Macedo, A. Reigber, R. Scheiber, J.J. Mallorqui, “Comparison of topography- and aperture-dependent motion compensation algorithms for airborne SAR”, IEEE Geoscience and Remote Sensing Letters, Vol.4, No.3, pp.349–353, 2007.
[12] I.G. Cumming, F.H. Wong, Digital Processing of Synthetic Aperture Radar Data: Algorithms and Implementation, Artech house Inc., Norwood, U.S.A., 2005.
[13] X. Zheng, W. Yu, Z. Li, “A novel algorithm for wide beam SAR motion compensation based on frequency division”, IEEE International Conference on Geoscience and Remote Sensing Symposium, Denver CO, New York, U.S.A., pp.3160–3163, July 31-Aug. 4, 2006.
[14] D. Meng, C. Ding, “A new approach to compensating spatially variant phase error in airborne wideband SAR”, Journal of Electronics & Information Technology, Vol.29, No.10, pp.2375–2378, 2007.
[15] D.R. Kirk, R.P. Maloney, “Impact of platform motion on wide-angle synthetic aperture radar image quality”, IEEE National Radar Conference, Boston MA, New York, U.S.A., pp.41–46, Apr. 20-22, 1999.

MENG Dadi received the B.S. degree from Xi’an Jiaotong University, Xi’an, China in 2001, and the Ph.D. degree from Graduate University of Chinese Academy of Sciences, Beijing, China in 2006. He is currently an Associate Research Fellow at Key Laboratory of Technology in Geospatial Information Processing and Application System, Chinese Academy of Sciences (GIPAS). His research interests include SAR data processing and motion compensation. (Email: [email protected])

HU Donghui received the B.S. degree from Peking University, Beijing, China in 1992, and the M.S. degree from the Beijing Institute of Technology, Beijing in 2001. He is currently an Associate Research Fellow at GIPAS. His main research interests include SAR signal processing and SAR calibration. (Email: [email protected])

DING Chibiao received the B.S. and Ph.D. degrees from Beihang University, Beijing, China in 1991 and 1997 respectively. Since then, he has been working with the Institute of Electronics, Chinese Academy of Sciences, Beijing, where he is currently a Research Fellow and the vice Director. His main research interests include advanced SAR systems, signal processing technology and information systems. (Email: [email protected])

Chinese Journal of Electronics Vol.21, No.4, Oct. 2012

Novel Implementation of Track-Oriented Multiple Hypothesis Tracking Algorithm∗

GUO Jianhui and ZHANG Rongtao

(Nanjing Institute of Electronics Technology, Nanjing 210013, China)

Abstract — It is widely accepted that modern compu- observation to track assignments for the tracks within that tational capabilities have made the application of Multi- hypothesis. As new hypotheses are formed, the compatibility ple hypothesis tracking (MHT) feasible for a wide variety constraint for tracks within a hypothesis is maintained. An- of applications. However, even in typical expected sce- other variation of this algorithm is proposed by Cox[4],which narios, periods of unusually high target or clutter density may occur that stress the ability of MHT to operate in reduces the number of low probability hypothesis that are nor- real-time and under the constraints of limited computer mally formed and then pruned by Murty’s Algorithm. [5−8] memory. The most computing burden in MHT is the best The second, track oriented approach does not main- global hypothesis formation. This paper establishes a solu- tain hypothesis from scan to scan. The tracks formed from tion tree and introduces the branch and bound strategy for each scan and the tracks, which can survive pruning, are pre- the best global hypothesis formation. Then, a novel MHT dicted to the next scan. A strong argument for the track- algorithm which can be applied in practical radar imple- mentations is proposed. The algorithm is illustrated with oriented approach to MHT can be made by nothing that the examples of simulated missile defense scenarios and a tar- combinations of hypothesis formation are such that there are [9,10] get tracking scenario with real radar data. The experiment typically many more hypotheses formed than tracks . results indicate that the algorithm is valid. The main concern in an MHT algorithm is the chance of Key words — Multiple hypothesis tracking (MHT), a combinatorial explosion due to creation of new hypothe- ses. Although pruning and merging are used to contain hy- Global hypothesis, Branch and bound. pothesis generation, these reduce the robustness of the MHT method[11]. 
However, even though some performance degrada- I. Introduction tion they must be accepted, during these difficult conditions. In Ref.[12] the delay time (latency) between the time of the A technique that is known to be the best approach for current input data and the time that the data currently being providing the ability to handle the problem of assignment am- processed was produced is continuously computed, in order biguities in Multiple target tracking (MTT) is Multiple hy- to perform adaptive processing include the choice of track and pothesis tracking (MHT). MHT algorithm has not only been hypothesis pruning parameters. Then, the adaptive processing proved to lead to a mathematically optimum solution, but is is a function of the latency. also the only MTT algorithm that integrates track initiation, The paper is based upon experience with the track ori- continuation and termination with explicit modeling of false [1] ented MHT and propose an efficient real-implemented MHT alarms and other constrains . algorithm which introduce the branch and bound searching The original MHT method forms denoted Reid’s algo- method to find the best hypothesis, and then use the N-Scan rithm, which is first presented in Refs.[2, 3]. The MHT pruning to delete the unlikely track. The structure of the pa- method resolves the association problem by generating and per is as follow: the outline of track-oriented MHT is explained maintaining alternative hypothesis tracks to explain possible in Section II, and the implementation of our new algorithm is observation-track association. There are two basic approaches presented in Section III. In Section IV, simulation and real to MHT. The first, hypothesis oriented approach which fol- radar data experiments are carried out to evaluate the algo- lows the original work of Reid, is based on the creation of a rithm’s performance. Finally, a conclusion is drawn in Section hypothesis structure from scan to scan which is continually ex- V. 
panded and pruned with the arrival of new data. The method works as follows: At each scan a set of hypothesis are car- II. Track-Oriented MHT Basics ried over from a previous scan consisting of one or more tracks which are compatible with all other tracks in the hypothesis. Key features of the MHT algorithm are family, clustering, Compatible tracks are those, which do not share any common track management, pruning and Global Hypothesis. observations. Then on receipt of new data, each hypothesis Family A family is defined as a set of tracks that rep- is expanded into a new set of hypothesis after considering all resent at most a single target. The use of a family struc-

ture is useful in hypothesis formation for the track-oriented MHT method. This hypothesis formation method utilizes the fact that all tracks within a family are incompatible with each other, thus at most one track per family can be in any hypothesis.

∗Manuscript Received Dec. 2010; Accepted Apr. 2012.

Clustering. All efficient implementations of MHT use clustering to break the larger tracking problem into smaller sub-problems in which hypothesis formation can be done separately. The clustering process partitions the full set of track hypotheses into a number of non-interacting groups or clusters. Clusters are collections of incompatible tracks (families) that are linked by common observations. If a report in a new scan of data is associated with track hypotheses in two different clusters, then the two clusters are merged to form a new combined cluster. It is important to limit the number of tracks in a cluster to keep the method practical.

Track management. Track prediction and gating are typically time-consuming functions, and large numbers of tracks also put a major burden on hypothesis formation. Thus it is very important that the number of tracks propagated from one scan of data to the next be limited. There are several ways to accomplish this. Using a standard MHT structure, the number of tracks allowed per family can be adjusted. This is conveniently implemented through the N-Scan pruning approach, and the choice of N affects the number of tracks that are maintained.

Track score. When a new scan of data is received, measurement-to-track association is performed using efficient gating procedures. First a coarse gating is performed to determine likely measurement-to-track associations, and then a fine ellipsoidal gating is performed.

Track Likelihood ratio (LR) and Log-likelihood ratio (LLR): Let $LR_{k-1}$ and $LR_k$ be the likelihood ratios of a track hypothesis at scan times $k-1$ and $k$, respectively. Then the track likelihood ratio at time $k$ is updated by

$$LR_k = \Delta LR_k \cdot LR_{k-1} \tag{1}$$

where $\Delta LR_k$ is the incremental likelihood ratio at scan time $k$. At scan time $k$, no report, or one or more reports, may be associated with a track hypothesis, corresponding to missed detection and detection events. The incremental likelihood ratio is given by

$$\Delta LR_k = \begin{cases} \dfrac{P_D\, N(v_k; 0, S_k)}{P_{FA}\, p_{FA}(z_k)}, & \text{if a report is associated with a track} \\[2ex] \dfrac{1 - P_D}{1 - P_{FA}}, & \text{if no report is associated with a track} \end{cases} \tag{2}$$

where $P_D$ is the detection probability density function (pdf) assuming that the report is the target, $p_{FA}$ is the false alarm pdf assuming that the report is a clutter measurement, and $v_k$ and $S_k$ are the innovation and innovation covariance, respectively. When a new track hypothesis is created, the likelihood ratio for the new track is given by

$$LR_1 = \lambda_{NT} / \lambda_{FA} \tag{3}$$

where $\lambda_{NT}$ and $\lambda_{FA}$ are the average spatial density of new targets and the clutter density, respectively. In general, $\lambda_{NT}$ and $\lambda_{FA}$ can vary in the measurement space. In practice, we calculate the Log-likelihood ratio (LLR), or track score. The LLR at scan time $k$, $LLR_k = \ln(LR_k)$, is updated by

$$LLR_k = LLR_{k-1} + \ln(\Delta LR_k) \tag{4}$$

Pruning. In order to keep the number of track hypotheses manageable, track LLR and N-Scan based pruning are performed. N-Scan pruning is performed after the best global hypothesis is formed. For each cluster, after the hypothesis generation step, the next step is to form and evaluate new hypotheses and to delete unlikely hypotheses and tracks. Then, the surviving tracks are filtered. Note that track deletion can be performed before filtering.

Best global hypothesis. The best global hypothesis is formed for each cluster. The best global hypothesis for a cluster consists of a set of compatible track hypotheses such that no measurement in the cluster is present in more than one track hypothesis. Since the track hypotheses in the best global hypothesis are compatible or independent, the LLR of the global hypothesis is the sum of the LLRs of all its track hypotheses[13].

III. Novel Implementation Based on Branch and Bound

In our implementation, the most likely global hypothesis is formed for each cluster and this hypothesis is used to reduce the number of tracks by N-Scan pruning.

A Branch and bound (BB) strategy is adapted to find the best global hypothesis. The branch and bound method can be used to solve optimization problems without an exhaustive search (in the average case). It has two main mechanisms: one is to generate branches and the other is to generate a bound so that many branches can be terminated. In the MHT application, a cluster is a collection of families, and a family is a set of tracks that represent at most a single target. The structure is illustrated in Fig.1. Thus, a solution tree can be established as in Fig.2.

Fig. 1. Example of cluster, families, and tracks relation

The solution tree is established based on the fact that all tracks within a family are incompatible with each other, thus at most one track per family can be in any hypothesis. As indicated in Fig.2, we need to select a set of compatible tracks $T_{ij}$ ($i = 1, \dots, m$; $j = 1, \dots, N_i$), one from each family $F_i$, such that the sum of the track scores is maximal, where $N_i$ is the number of tracks in family $F_i$ and $m$ is the number of families in the individual cluster. There are $\prod_{i=1}^{m} N_i$ possible hypotheses.
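The track score recursion of Eqs. (1)-(4) can be illustrated with a minimal sketch. It assumes a scalar innovation, so that N(v_k; 0, S_k) reduces to a one-dimensional Gaussian, and a constant clutter pdf value; the function names and all numeric values are hypothetical and are not taken from the paper.

```python
import math


def gaussian_pdf(v, s):
    """Zero-mean Gaussian density N(v; 0, s) for a scalar innovation v with variance s."""
    return math.exp(-0.5 * v * v / s) / math.sqrt(2.0 * math.pi * s)


def initial_llr(lambda_nt, lambda_fa):
    """Score of a newly created track hypothesis: ln(LR_1) with LR_1 from Eq. (3)."""
    return math.log(lambda_nt / lambda_fa)


def delta_llr(p_d, p_fa, innovation=None, innovation_var=None, clutter_pdf=None):
    """ln(Delta LR_k) following Eq. (2).

    innovation=None models a missed detection; otherwise innovation_var and
    clutter_pdf (the value of p_FA at the measurement) must be supplied.
    """
    if innovation is None:
        return math.log((1.0 - p_d) / (1.0 - p_fa))
    numerator = p_d * gaussian_pdf(innovation, innovation_var)
    denominator = p_fa * clutter_pdf
    return math.log(numerator / denominator)


def update_llr(llr_prev, increment):
    """Additive score update of Eq. (4)."""
    return llr_prev + increment


if __name__ == "__main__":
    score = initial_llr(lambda_nt=0.1, lambda_fa=0.5)
    # One scan with an associated report, then one scan with a missed detection.
    score = update_llr(score, delta_llr(0.9, 0.01, innovation=0.0,
                                        innovation_var=1.0, clutter_pdf=0.1))
    score = update_llr(score, delta_llr(0.9, 0.01))
    print(score)
```

Working in the log domain turns the product of Eq. (1) into a sum, which is also why the score of a global hypothesis can be obtained by adding the scores of its tracks.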

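The search for the best global hypothesis over the families of one cluster can be sketched as a recursive branch and bound. This is only an illustration under stated assumptions: two tracks are taken to be compatible when they share no measurement, a family may contribute no track at all, and the bound is the sum of the best non-negative scores of the families not yet visited. The bound, the data layout and the name `best_global_hypothesis` are hypothetical choices made for this sketch, not the authors' implementation.

```python
from typing import FrozenSet, List, Optional, Tuple

# A track hypothesis: (LLR score, ids of the measurements it uses).
Track = Tuple[float, FrozenSet[int]]


def best_global_hypothesis(families: List[List[Track]]) -> Tuple[float, List[Optional[int]]]:
    """Return (best total score, choice), where choice[i] is the index of the
    track selected in family i, or None if no track of that family is used."""
    m = len(families)
    # Visit tracks of each family in non-increasing score order.
    order = [sorted(range(len(f)), key=lambda j: -f[j][0]) for f in families]
    # suffix[i]: optimistic score still obtainable from families i..m-1.
    suffix = [0.0] * (m + 1)
    for i in range(m - 1, -1, -1):
        top = max((t[0] for t in families[i]), default=0.0)
        suffix[i] = suffix[i + 1] + max(top, 0.0)

    best = {"score": float("-inf"), "choice": []}

    def search(i, score, used, choice):
        # Bound: terminate the branch if it cannot beat the incumbent.
        if score + suffix[i] <= best["score"]:
            return
        if i == m:  # leaf node: a complete hypothesis
            best["score"], best["choice"] = score, list(choice)
            return
        for j in order[i]:
            s, meas = families[i][j]
            if used.isdisjoint(meas):  # compatible with every selected track
                choice.append(j)
                search(i + 1, score + s, used | meas, choice)
                choice.pop()
        # Branch in which family i contributes no track.
        choice.append(None)
        search(i + 1, score, used, choice)
        choice.pop()

    search(0, 0.0, frozenset(), [])
    return best["score"], best["choice"]


if __name__ == "__main__":
    families = [
        [(5.0, frozenset({1, 2})), (3.0, frozenset({1}))],
        [(4.5, frozenset({2, 3})), (2.0, frozenset({3}))],
        [(1.0, frozenset({4}))],
    ]
    print(best_global_hypothesis(families))  # (8.5, [1, 0, 0])
```

In the example, taking the highest-scoring track of the first family blocks the best track of the second family, so the search settles on the second track of the first family instead. Visiting tracks in non-increasing score order tends to find a good incumbent early, which lets the bound terminate more branches.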
Fig. 2. Solution tree

Although the branch and bound strategy is usually very efficient and optimal, a very large tree may be generated in the worst case. The worst case time complexity is still exponential. When the track number $N_i$ or the family number $m$ is large, the time complexity is intractable in a real implementation. In high target density situations, when there are many incompatible tracks, the MHT can spend as much as 95% of its processing time attempting to find the best hypothesis[12].

Thus, a series of techniques are adopted to improve the computing feasibility. First, we sort the tracks into non-increasing order of score and select no more than $N_{max}$ incompatible tracks with the highest scores in each family to form the global hypothesis. Furthermore, $N_{max}$ can be chosen adaptively according to the total number of tracks and families in the cluster. Second, in high target density situations, there are many families in one cluster. If the number of families exceeds $M_{max}$, the problem is divided, the best global hypothesis is formed in each sub-problem, and finally the solutions are merged.

Unlike other methods, in the first technique the unselected tracks are not deleted directly in each family, in order to retain more potential tracks. Tracks are deleted in the N-Scan pruning based on the best global hypothesis. In the second technique, the big cluster is not broken. Subsequent operations such as measurement-to-track gating and track hypothesis branching are still done in the big cluster. Because we apply the decoupled filter, the computing burden of these operations is low. These techniques improve real-time performance with minimal impact on tracking performance.

A pseudo-code of our algorithm based on branch and bound search is shown in Fig.3. It is implemented by recursion.

1. Check the number of families in the cluster; if the number exceeds Mmax, divide the families into partitions so that the number of families in each partition is no more than Mmax;
2. In each family, sort the tracks T_ij into non-increasing order, so that after sorting LLR(T_i1) ≥ LLR(T_i2) ≥ ··· ≥ LLR(T_iNi); then select at most Nmax tracks with the highest scores;
3. In each cluster or partition, start the branch and bound search. Suppose that there are m families in the cluster/partition, H is the current partial solution and Best is the best global solution found so far:
   Procedure BB(H, layer);
     if layer > m                          // leaf node?
       if sum_score(H) > sum_score(Best)
         Best = H;
       end
     else
       for j in {1 ··· min(N_layer, Nmax)}
         if compatibility(H, j)            // T_layer,j is compatible with every track in solution H
           BB([H, j], layer + 1);          // recursion
         end
       end
     end
4. N-Scan pruning is accomplished by tracing back N scans from the most likely track to establish a new root node. Track branches that do not share the new root node with the most likely track are deleted.

Fig. 3. Pseudo-code of the algorithm

IV. Experiments and Result Discussion

Table 1. Computation time per sample (millisecond)

Target number      MHT based on BB      Conventional track-oriented MHT
                   Ave      Max         Ave        Max
3 (Nmax = 10)      14       16          67         141
4 (Nmax = 8)       16       18          1146       1637
5 (Nmax = 5)       30       47          33816      75656

Table 2. Number of track hypotheses of MHT based on BB

Target number      Before N-Scan pruning      After N-Scan pruning
                   Ave      Max               Ave      Max
3 (Nmax = 10)      65       81                23       27
4 (Nmax = 8)       91       108               31       36
5 (Nmax = 5)       113      130               39       45

1. Simulation

Missile defense (MD) scenarios are simulated to evaluate the performance of the algorithm. The most difficult MD conditions occur when a large cluster of closely spaced objects is deployed. Then, the best global hypothesis formation function is typically the most demanding.

More difficult scenarios involving 10 and 20 targets, respectively, are simulated. In most radar target tracking application scenarios, the number of targets is no more than 20 in one cluster. Fig.4 shows the computing time results of MHT based on Branch and bound (BB). Fig.5 shows the number of track hypotheses maintained by the algorithm. The results indicate that the computation time is linear in the target number. Even with 20 closely spaced targets, the computation time is less than 1700ms (Intel Pentium(R) Dual, 2GHz, RAM 1G). The algorithm is also robust, because every target has more than 20 track hypotheses on average. Fig.6 shows the tracking result of 20 closely spaced objects. In Fig.6, the tracks for each target are joined by lines and marked by distinct symbols for easy identification. We can see that the tracks do not cross.

Fig. 4. Computing time results of MHT based BB

Fig. 5. Number of track hypotheses maintained

Fig. 6. Tracking result of 20 closely spaced objects (CSOs)

2. Real application

The algorithm is applied to an air balloon based surveillance radar. The radar is down-sight and many received measurements are from ground moving objects such as cars. MHT is applied to improve the tracking stability and robustness.

The density of clutter measurements is shown in Fig.7. It is decidedly high for a general radar application.

The most significant feature of MHT is the ability to carry information about alternate hypotheses for later confirmation or deletion. Fig.8 shows this case in the real application. In Fig.8, the target P3 is misled by the cluster because the target is not detected in some time intervals, but its extrapolated track hypothesis is maintained. When the target is detected again, the extrapolated track associates with the target measurement and its track score becomes higher than that of the misled track, so the extrapolated track hypothesis is confirmed.

Fig. 7. Density of clutter measurements

Fig. 8. Example of a tracking situation

V. Conclusions and Future Work

The improved performance of Multiple Hypothesis Tracking comes at the cost of significantly higher computational complexity. This paper proposes a simplified branch and bound approach to best global hypothesis formation. The algorithm's performance is evaluated on simulated missile defense scenarios with high target density and on a target tracking scenario with real radar data. The experiment results demonstrate the superiority of the algorithm and indicate that it can be applied in practical radar implementations.

However, there are still some deficiencies, which suggest directions for further in-depth research. For example, when a big cluster is divided into two or more partitions, an optimal split method should be investigated to improve the performance.

References

[1] Tarun Bhattacharya, Al Premji, Tim J. Nohara, et al., “Evaluation of fast MHT algorithms”, IEEE National Radar Conference, Dallas, TX, pp.213–218, 1998.
[2] D.B. Reid, “An algorithm for tracking multiple targets”, IEEE Trans. Autom. Control, Vol.AC-24, pp.843–854, Dec. 1979.
[3] D.B. Reid, “A multiple hypothesis filter for tracking multiple targets in a cluttered environment”, Lockheed Missiles and Space Company Report, No.LMSC, D-560254, Sept. 1977.
[4] I.J. Cox and S.L. Hingorani, “An efficient implementation of Reid’s multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking”, IEEE Transaction on Pattern Analysis and Machine Intelligence, Vol.18, No.2, pp.138–150, 1996.
[5] T. Kurien, “Issues in the design of practical multitarget tracking algorithm”, Multitarget-Multisensor Tracking: Advanced Applications, Y. Bar-Shalom (Ed.), Norwood, MA: Artech House, 1990.
[6] G.C. Demos, R.A. Ribas, et al., “Applications of MHT to dim moving targets”, Signal and Data Processing of Small Targets, Proc. SPIE, Vol.1305, pp.297–309, Apr. 1990.
[7] D.S.K. Chan, et al., “Performance results of the bilevel MHT tracking algorithm for two crossing targets in a high clutter environment”, Signal and Data Processing of Small Targets, Proc. SPIE, Vol.1954, pp.406–416, 1994.
[8] D.S.K. Chan, D.A. Langan, “Tracking in a high-clutter environment: simulation results characterizing a BiLevel MHT algorithm”, Signal and Data Processing of Small Targets, Proc. SPIE, Vol.2235, pp.540–551, Apr. 1993.
[9] S. Blackman, “Multiple hypothesis tracking for multiple target tracking”, IEEE A&E Systems Magazine, Vol.19, No.1, pp.5–17, Jan. 2004.
[10] S. Blackman and R. Popoli, Design and Analysis of Modern Tracking Systems, Artech House, Norwood, MA, 1999.
[11] Somajyoti Majumder, “Sensor fusion and feature based navigation for subsea robots”, Ph.D. Thesis, Australian Centre for Field Robotics, School of Aerospace, Mechanical and Mechatronic Engineering, University of Sydney, Aug. 2001.
[12] Bradley K. Norman, Brian A. Cronin, et al., “Adaptive processing to ensure practical application of a multiple hypothesis tracking system”, Sensors, and Command, Control, Communications, and Intelligence (C3I) Technologies for Homeland Security and Homeland Defense V, edited by Edward M. Carapezza, Proc. of SPIE, Vol.6201, 62010N, 2006.
[13] A.B. Poore, “Complexity reduction in MHT/MFA tracking”, Signal and Data Processing of Small Targets 2005, Proc. of SPIE, Vol.5913, pp.59131F-1–59131F-11, 2005.

GUO Jianhui was born in Jiangxi Province, China, in Dec. 1983. He received the B.S. and M.S. degrees from the Nanjing University of Science and Technology, Nanjing, China, in 2003 and 2005, respectively, and the Ph.D. degree from the same university in 2008. He is currently a senior engineer in the Nanjing Institute of Electronics Technology. His current research interests include radar data processing, radar control and information fusion. (Email: guojianhui [email protected])

ZHANG Rongtao was born in 1974 and received the Ph.D. degree from Nanjing University of Science and Technology, Nanjing, China, in 2002. His current research interests include radar data processing, radar control and information fusion.
Chinese Journal of Electronics 2012 (Vol.21) Contents: Novel Implementation of Track-Oriented Multiple Hypothesis Tracking Algorithm, GUO Jianhui and ZHANG Rongtao, No.4, p.770
