Performance and Security Testing for Improving Quality of Distributed Applications Working in Public/Private Network Environments
GDANSK UNIVERSITY OF TECHNOLOGY
Faculty of Electronics, Telecommunications and Informatics
Marcin Adam Barylski
Performance and Security Testing for Improving Quality of Distributed Applications Working in Public/Private Network Environments
PhD Dissertation
Supervisor: prof. dr hab. inż. Henryk Krawczyk
Faculty of Electronics, Telecommunications and Informatics
Gdansk University of Technology
Gdansk, 2010
To my wife, Ewa
Acknowledgments
First of all, I would like to thank Professor Henryk Krawczyk, professor at the Faculty of Electronics, Telecommunications and Informatics and President of Gdańsk University of Technology, for his priceless motivation, valuable advice, and the unlimited time he devoted to me while this dissertation was being written.

This dissertation would never have been possible without the help of my family. First, I would like to thank my wife, Ewa, for her patience, her readiness to lend a helping hand, and her invaluable day-by-day support. My parents taught me how to discover the world around me; without their guidance I would never have started my studies.

I would like to thank Mrs. Izabela Dziedzic for taking care of all formal and organizational matters.

The crucial field test of one of the most important deliverables of this dissertation, MA2QA, would not have been possible without the full support of Jerzy Proficz from TASK, who introduced me to the KASKADA test environment.

Finally, I would like to thank my employer, Intel Technology Poland, for enabling access to best known methods.
Table of Contents
INDEX OF FIGURES ...... 7
INDEX OF TABLES ...... 10
GLOSSARY ...... 12
LIST OF ACRONYMS ...... 16
CHAPTER 1: INTRODUCTION ...... 21
1.1. Background of security and performance testing ...... 21
1.2. Introduction to distributed public-private network environments ...... 22
1.3. Goal and scope of the dissertation ...... 26
1.4. Claims of the dissertation ...... 28
1.5. Document structure ...... 28
CHAPTER 2: CHARACTERIZATION OF PRIVATE-PUBLIC IPSEC AND HTTPS APPLICATIONS ...... 30
2.1. IPSec-based distributed applications design ...... 30
2.1.1. Introduction to IPSec ...... 30
2.1.2. ESP security ...... 32
2.1.3. ESP performance ...... 42
2.1.4. IKEv2 security and performance ...... 44
2.2. HTTPS-based distributed applications design ...... 46
2.2.1. Introduction to HTTPS ...... 46
2.2.2. HTTPS security ...... 49
2.2.3. HTTPS performance ...... 52
2.3. Distributed applications working in IPSec/HTTPS environments ...... 54
2.3.1. Request/Response (R/R) solution ...... 54
2.3.2. Publish/Subscribe (P/S) solution ...... 55
2.3.3. Concept of a secure service processing continuous multimedia data ...... 56
2.3.4. Security and performance of continuous multimedia streams distribution ...... 62
2.4. Summary ...... 65
CHAPTER 3: SELECTION OF SECURITY AND PERFORMANCE TESTING PROCEDURES ...... 66
3.1. The gist of quality control ...... 66
3.2. Fundamentals of SW performance testing ...... 70
3.3. Network layer performance tests ...... 72
3.3.1. Network throughput testing ...... 73
3.3.2. Network latency testing ...... 84
3.4. Middleware layer performance tests ...... 87
3.4.1. DB performance tests ...... 90
3.4.2. WS performance tests ...... 92
3.4.3. Web performance tests ...... 93
3.5. Fundamentals of SW security ...... 97
3.6. SW security testing ...... 98
3.6.1. Scope of security testing ...... 98
3.6.2. Security attacks ...... 100
3.7. IPSec performance and security testing ...... 104
3.8. HTTPS performance and security testing ...... 107
3.9. Summary ...... 110
CHAPTER 4: PROPOSAL OF MA2QA APPROACH ...... 111
4.1. Application model ...... 111
4.1.1. Subject of analysis ...... 111
4.1.2. Design and implementation for performance and security ...... 112
4.2. Quality model ...... 114
4.2.1. Quality tree ...... 114
4.2.2. Scope of quality analysis ...... 115
4.2.3. Method for finding the correlations between the metrics ...... 116
4.3. Multidimensional Approach to Quality Analysis (MA2QA) ...... 116
4.3.1. MA2QA fundamentals ...... 116
4.3.2. MA2QA usage in iterative application development ...... 117
4.3.3. Compromising security and performance ...... 118
4.3.4. MA2QA quality vector ...... 121
4.3.5. Sample MA2QA score card ...... 126
4.3.6. Sample MA2QA evaluation ...... 128
4.4. Summary ...... 128
CHAPTER 5: EXPERIMENTS AND RESULTS ...... 130
5.1. Goal and plan of experiments ...... 130
5.2. Experiment 1 (EXP1): Endpoint authentication vs. user input confirmation latency ...... 132
5.2.1. Plan and goal of EXP1 ...... 132
5.2.2. Design of EXP1 ...... 133
5.2.3. Results of EXP1 ...... 134
5.2.4. Security and performance considerations ...... 136
5.2.5. Conclusions after EXP1 ...... 136
5.3. Experiment 2 (EXP2): Communication security vs. data throughput ...... 137
5.3.1. Plan and goal of EXP2 ...... 137
5.3.2. Design of EXP2 ...... 138
5.3.3. Results of EXP2 ...... 139
5.3.4. Security and performance considerations ...... 141
5.3.5. Conclusions after EXP2 ...... 141
5.4. Experiment 3 (EXP3): Communications security vs. user comfort and image processing performance ...... 142
5.4.1. Plan and goal of EXP3 ...... 142
5.4.2. Design of EXP3 ...... 143
5.4.3. Results of EXP3 ...... 144
5.4.4. Security and performance considerations ...... 146
5.4.5. Conclusions after EXP3 ...... 147
5.5. Experiment 4 (EXP4): Communications security vs. size of server processing queue ...... 147
5.5.1. Plan and goal of EXP4 ...... 147
5.5.2. Design of EXP4 ...... 148
5.5.3. Results of EXP4 ...... 149
5.5.4. Security and performance considerations ...... 150
5.5.5. Conclusions after EXP4 ...... 150
5.6. Experiment 5 (EXP5): Performance and security aspects of a multimedia-based distributed application equipped with car gate state recognition algorithms in continuous multimedia streams ...... 151
5.6.1. Plan and goal of EXP5 ...... 151
5.6.2. Design of EXP5 ...... 152
5.6.3. Results of EXP5 ...... 155
5.6.4. Security and performance considerations ...... 156
5.6.5. Conclusions after EXP5 ...... 157
5.7. Summary of experiments ...... 158
CHAPTER 6: CONCLUSIONS AND FUTURE WORK ...... 160
6.1. Claims confirmation ...... 160
6.2. The contributions ...... 163
6.3. Challenges and future work ...... 164
BIBLIOGRAPHY ...... 165
WEBLIOGRAPHY ...... 180
APPENDIX A: PERFORMANCE MONITORING AND TESTING TOOLS ...... 186
APPENDIX B: SECURITY MONITORING AND TESTING TOOLS ...... 203
STRESZCZENIE W JĘZYKU POLSKIM NAJWAŻNIEJSZYCH FRAGMENTÓW ROZPRAWY DOKTORSKIEJ ...... 212
Index of figures
Figure 1. Idea of distributed applications working in public-private network infrastructure [source: author] ...... 23
Figure 2. Security of public network [source: author] ...... 24
Figure 3. Comparison of two implementations working in public-private network environments with different security areas: system A has characteristics above ciphering security 0% longer than system B, but implementation B provides higher ciphering security if effective message lifetime is short [source: author] ...... 24
Figure 4. Network security function of RSA-100, RSA-640, and RSA-768 [source: author] ...... 26
Figure 5. Idea of Gateway-to-Gateway IPSec architecture (based on [Kent, Seo 2005]) ...... 31
Figure 6. Idea of Host-to-Gateway IPSec architecture (based on [Kent, Seo 2005]) ...... 31
Figure 7. Idea of Host-to-Host IPSec architecture (based on [Kent, Seo 2005]) ...... 32
Figure 8. Attributes of policies and SAs, with packet selectors and IPSec-specific data [source: author] ...... 33
Figure 9. Encapsulating Security Payload packet format [Kent 2005a] ...... 33
Figure 10. IPv4/IPv6 packet after applying ESP tunnel mode with DiffServ and IPSec processing (based on [Kent, Seo 2005]) ...... 34
Figure 11. IPv4/IPv6 packet after applying ESP transport mode with DiffServ and IPSec processing (based on [Kent, Seo 2005]) ...... 34
Figure 12. Overview of IPSec G2G communication with IKEv2 and ESP tunnel or transport mode for IPv4/IPv6 traffic between hosts A and B from different private network environments [source: author] ...... 42
Figure 13. IKEv2 basic activity diagram – Request/Response architecture (based on [Kaufman et al 2005]) ...... 46
Figure 14. Basic SSL activity diagram – Request/Response architecture (based on [Rescorla 2000]) ...... 49
Figure 15. Pseudocode of OpenSSL multiprocess server on a Windows machine (based on [Rescorla 2001]) ...... 52
Figure 16. Activity diagram of R/R model with emphasized parts responsible for secure network communication [source: author] ...... 54
Figure 17. Activity diagram of P/S model with broker between client and server; the broker role is to store and forward the subscribed data [source: author] ...... 56
Figure 18. Secure service processing continuous traffic [source: author] ...... 57
Figure 19. Multimedia flow processing system overview [Krawczyk, Barylski 2009b] ...... 58
Figure 20. RTP multimedia streams transmission and reception [Krawczyk, Barylski 2009b] ...... 59
Figure 21. System R/R components: a) with operating layers; b) deployment diagram with 3 measurement points [Krawczyk, Barylski 2009b] ...... 60
Figure 22. System P/S components: deployment diagram with 4 measurement points [Krawczyk, Barylski 2009b] ...... 61
Figure 23. Multimedia stream transcoder in R/R architecture [source: author] ...... 62
Figure 24. Multimedia stream simulcasting in P/S architecture [source: author] ...... 63
Figure 25. Multimedia stream layered coding in P/S architecture [source: author] ...... 63
Figure 26. Informatics enterprise from requirements definition to final solution – how can we indicate its success or failure? [Barylski, Barylski 2007] ...... 66
Figure 27. Overview of typical project lifecycle with fundamental testing activities included, with 3 important project milestones checked in [source: author] ...... 68
Figure 28. Validation process lifecycle from performance point of view: a) V-model schema of validation process lifecycle where IPSec performance characteristics are known before writing the test plan and further implementation; b) state diagram of validation process lifecycle with agile development methodology where IPSec performance requirements are constantly evaluated [Barylski 2008] ...... 71
Figure 29. Network throughput test environment [Barylski 2007a] ...... 73
Figure 30. Comparison of effectiveness of different network throughput measurement methods (throughput accuracy = 1% of transmit port rate, max number of iterations = 100) [source: author] ...... 74
Figure 31. Intelligent network throughput measurement method (ITMM) [source: author] ...... 76
Figure 32. The longer the experiment lasts, the more points are measured and the better throughput approximation is obtained: a) 2 points measured; b) 3 points measured; c) 4 points measured; d) 193 points measured [source: author] ...... 79
Figure 33. Part of a perfect throughput characteristics – it is non-diminishing for successive frame lengths – the longer the frame, the closer channel utilization is to 100% [Barylski 2007a] ...... 80
Figure 34. Network module with throughput lower than expected – critical path for network packets is affected and must be reviewed [Barylski 2007a] ...... 81
Figure 35. Network module with disturbed throughput line – throughput line is not non-diminishing for the successive frame lengths [Barylski 2007a] ...... 82
Figure 36. Network module with random throughput values for some frame lengths – it may bring the network down without warning [Barylski 2007a] ...... 82
Figure 37. Network module with serious defect – no throughput for network packets of length from 384 B to 448 B [Barylski 2007a] ...... 83
Figure 38. Network module with fatal defect – after some time or for some lengths packets are not forwarded any more; DUT loses its stability until reboot/recovery [Barylski 2007a] ...... 83
Figure 39. Network layer latency example graphs [source: author] ...... 84
Figure 40. Network latency (maximum acceptable latency = 100 ms) of DUT with: a) network latency above the threshold in some circumstances; b) network latency above the test threshold after crash [source: author] ...... 85
Figure 41. Test setup for network latency measurement with test packet payload modification by DUT [source: author] ...... 86
Figure 42. Test setup for network latency measurement with test traffic generator capable of monitoring the exact time of sending and receiving each test packet [source: author] ...... 86
Figure 43. Test setup for test environment network latency measurement – test traffic generator source and destination ports are connected directly [source: author] ...... 87
Figure 44. ML performance test setup architecture [source: author] ...... 88
Figure 45. Classes of ML performance characteristics: a) the worst case – system runs out of critical resources very quickly; b) linearly growing graph – system degrades gracefully, implementation 1 faster than implementation 2; c) wrong test load conditions; d) ideal implementation – flat graph means that the system has enough resources for any number of VUs [source: author] ...... 90
Figure 46. Anatomy of DB performance test [source: author] ...... 91
Figure 47. Anatomy of WS performance test [source: author] ...... 92
Figure 48. Anatomy of Web performance test from MVC perspective [source: author] ...... 93
Figure 49. Typical Web layer performance defects: a) drastic request response time increase after exceeding the border value of VUs; b) response time increasing with each VU; c) random request response time; d) too high request response time [source: author] ...... 95
Figure 50. Illustration of SW security issues' origin: A – correct and secure implementation (no SW defects); B – lack of implementation/insufficient implementation; C – not intended implementation [source: author, based on Whittaker, Thompson 2004] ...... 98
Figure 51. Test setup to validate IPSec performance with Cleartext traffic generator when no suitable IPSec traffic generator is available: DUT1 encrypts the packets; DUT2 decrypts the packets [Barylski 2008] ...... 106
Figure 52. Examples of IPSec performance related defects – DoS attacks: a) CPU resources exhaustion so that other processes cannot get a free CPU time slot; b) square complexity of SA adding time – the more SAs are present, the longer the processing time [source: author] ...... 106
Figure 53. HTTPS client/server processing (MVC paradigm) – illustration of HTTPS processing timeout for a single client when other SSL clients' requests are received and processed [source: author] ...... 110
Figure 54. Application model [source: author] ...... 111
Figure 55. Quality tree [source: author] ...... 115
Figure 56. Scope of quality analysis [source: author] ...... 115
Figure 57. Method for finding the correlations between the metrics [source: author] ...... 116
Figure 58. MA2QA procedure [source: author] ...... 117
Figure 59. MA2QA in iterative application development [source: author] ...... 118
Figure 60. Possible system quality score and test result permutations of tests from Table 23 [source: author] ...... 119
Figure 61. System security and performance evaluation based on a cumulative mark from passed test weights [source: author] ...... 120
Figure 62. SW model of distributed applications with horizontal layers [Barylski, Krawczyk 2009] ...... 121
Figure 63. Security and performance vertical layers of distributed applications [Barylski, Krawczyk 2009] ...... 122
Figure 64. Test bed for S1 / P1 configuration with user input processing flows [Barylski, Krawczyk 2009] ...... 133
Figure 65. State diagram for user input confirmation flow of S1 / P1 configurations [Barylski, Krawczyk 2009] ...... 133
Figure 66. Comparison of P1 and S1 test results: a) P1 and S1 opinion confirmation latency; b) density of confirmation time for S1: 68% of opinions were confirmed within tc ≤ 4 h [Barylski, Krawczyk 2009] ...... 135
Figure 67. Test bed for S2 / P2 configuration [source: author] ...... 138
Figure 68. L3 and L4 configuration of S2 / P2 setups: a) Cleartext with wildcard bypass policies; b) IPSec with wildcard IPSec policies + exact SAs [source: author] ...... 139
Figure 69. Comparison of IPSec (S2) and Cleartext (P2) data throughput [source: author] ...... 140
Figure 70. WebCamTest application classes [source: author] ...... 143
Figure 71. Test bed for S3 and P3 experiments: WebCamService and WebCamAnalyser with a) IPv4+TCP Cleartext communication; b) IPv4+ESP communication [source: author] ...... 144
Figure 72. Comparison of P3 and S3 test results [Krawczyk, Barylski 2009a] ...... 145
Figure 73. Test bed for EXP4 with 3 cases: HTTP only (P4), HTTPS with SW processing only (S4.1), HTTPS with HW acceleration (S4.2) [source: author] ...... 149
Figure 74. Comparison of P4, S4.1, and S4.2 test results [Rescorla 2001] ...... 149
Figure 75. Overview of KASKADA test environment [Krawczyk, Proficz 2009] ...... 152
Figure 76. Example car gate state recognition events from KASKADA [source: author] ...... 153
Figure 77. HCC in action: a) both car gates are closed, working area presented; b) car.gate.in opened – event recognized by HCC; c) car.gate.out opened – event recognized by HCC [source: author] ...... 155
Figure 78. Illustration of A3 car gate event recognition efficiency: 15 positive recognitions, only 1 false alarm (on h) [source: author] ...... 156
Figure 79. Event #19 of EXP5, a truck almost covers up car.gate.out: a) car.gate.out closed; b) car.gate.out open [source: author] ...... 157
Index of tables
Table 1. Illustration of some RSA numbers' resistance to cryptanalysis, based on the RSA Security factoring challenge, normalized to the computational power of a 2.2 GHz AMD Opteron with 2 GB RAM [RSASECURITY] ...... 25
Table 2. Explanation of ESP packet fields [Kent, Seo 2005] ...... 34
Table 3. Authentication and cipher algorithms applicable for ESP [Manral 2007] ...... 35
Table 4. MUST cipher algorithms applicable for ESP [source: author] ...... 36
Table 5. Authentication algorithms applicable for ESP [source: author] ...... 38
Table 6. Popular cipher modes [Knudsen 1998] ...... 41
Table 7. IPSec auditable events [source: author] ...... 44
Table 8. Phases of IKEv2 [Kaufman et al 2005] ...... 45
Table 9. Security and performance trade-offs of multimedia streams distribution techniques [Parks et al 1999] ...... 64
Table 10. Sample list of SW development enterprise success indicators [Barylski, Barylski 2007] ...... 67
Table 11. Common STP test categories [source: author] ...... 69
Table 12. Performance test classes [Barylski 2008] ...... 70
Table 13. Description of parameters used in Figure 31 [source: author] ...... 75
Table 14. Classes of the most common throughput defects [Barylski 2007a] ...... 78
Table 15. Classes of the most common latency defects [source: author] ...... 85
Table 16. Products of SQL profiling [source: author] ...... 91
Table 17. Load levels [source: author] ...... 92
Table 18. Security test classes [Arkin et al 2005] [Wang et al 2007] ...... 99
Table 19. Classes of SW users considered in security testing [source: author] ...... 100
Table 20. SW attack classes [Whittaker, Thompson 2004] ...... 100
Table 21. The most popular IPSec performance test pass criteria for IPSec GW [Barylski 2008] ...... 105
Table 22. Best performance vs. best security for distributed SW applications [source: author] ...... 112
Table 23. Examples of performance and security test cases with associated weights [source: author] ...... 119
Table 24. Multidimensional Approach to Quality Analysis (MA2QA) of IPSec and HTTPS distributed applications [Barylski, Krawczyk 2009] ...... 122
Table 25. Examples of SW metrics for HTTPS/IPSec distributed applications [source: author] ...... 123
Table 26. Sample MA2QA score card [source: author] ...... 127
Table 27. Sample MA2QA evaluation [source: author] ...... 128
Table 28. Plan of experiments [source: author] ...... 131
Table 29. Analysis of P1 and S1 test results [Barylski, Krawczyk 2009] ...... 134
Table 30. S1 / P1 total performance and security assessment [source: author] ...... 136
Table 31. Analysis of P2 and S2 test results [Barylski, Krawczyk 2009] ...... 140
Table 32. S2 / P2 total performance and security assessment [source: author] ...... 141
Table 33. Analysis of P3 and S3 test results [source: author] ...... 146
Table 34. S3 / P3 total performance and security assessment [source: author] ...... 146
Table 35. Analysis of P4 and S4.1, S4.2 test results [source: author] ...... 150
Table 36. S4.1, S4.2 / P4 total performance and security assessment [source: author] ...... 150
Table 37. Chronology of EXP5 video sample events [source: author] ...... 153
Table 38. Summary of experiments for A1, A2, and A3 [source: author] ...... 156
Table 39. MA2QA of EXP5 for A1, A2, and A3 [source: author] ...... 157
Table 40. Summary of experiments [source: author] ...... 159
Table 41. Comparison of R/R and P/S with R/R' and P/S' from MA2QA total score perspective [source: author] ...... 162
Table 42. Performance tool classification parameters [source: author] ...... 186
Table 43. Operating system layer performance tools [source: author] ...... 186
Table 44. Network layer performance tools [source: author] ...... 190
Table 45. ML performance tools [source: author] ...... 191
Table 46. Application layer performance tools [source: author] ...... 196
Table 47. Security tool classification parameters [source: author] ...... 203
Table 48. Vulnerability scanners and penetration testing tools [source: author] ...... 204
Table 49. EFI tools [source: author] ...... 206
Table 50. Network packet sniffers and crafting tools, traffic monitoring tools [source: author] ...... 206
Table 51. Intrusion detection systems [source: author] ...... 208
Table 52. Source code static analysis tools targeted to find security issues [source: author] ...... 210
Glossary
A
Asymmetric Cryptography – In contrast to Symmetric Cryptography, a method of securing communication with a unique pair of keys, called the Public Key and the Private Key. The keys are mathematically related; the asymmetric algorithm requires that the Public Key can be computed easily from the Private Key, while deriving the Private Key from the Public Key is computationally infeasible. This property allows the Public Key to be made publicly known, while its owner must keep the Private Key secret.
Authentication algorithm – The particular algorithm used to calculate, with the use of an authentication key, a hash that provides communication integrity and confirmation of origin.
C
Certificate Authority – An entity (a trusted third party), part of the Public Key Infrastructure, that issues Digital Certificates.
Cipher – A series of transformations that converts Plaintext to Ciphertext using the Cipher Key.
Cipher Key – A secret cryptographic key used by the encryption routine to transform Plaintext into Ciphertext (encryption) and Ciphertext into Plaintext (decryption).
Ciphertext – Data input to the Inverse Cipher or output from the Cipher .
D
Digital Certificate – Simply Certificate. An electronic document which uses a Digital Signature to bind together a Public Key with an identity.
Digital Signature – Electronic data used to authenticate the identity of the sender of a message or the signer of a document, and possibly to ensure that the original content of the message or document has not been changed. Creation of a Digital Signature consists of two steps: a Message Digest is created from the original message with the use of a Hash Function H(), and then the Message Digest is encrypted with the sender's Private Key. The Digital Signature is appended to the signed data. To verify a Digital Signature, the receiver creates two objects: a Message Digest computed on his own from the message with the use of H(), and the Message Digest decrypted from the signature with the sender's Public Key. If the two objects are equal, the Digital Signature is verified.
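The two-step sign/verify scheme described above can be sketched with textbook RSA on deliberately tiny numbers. The key pair (p = 61, q = 53, e = 17, d = 2753) is a toy assumption chosen only for illustration; real signatures use large keys and padding schemes.

```python
import hashlib

# Toy RSA key pair (illustration only -- far too small for real use).
n, e, d = 61 * 53, 17, 2753  # public modulus, public exponent, private exponent

def message_digest(message: bytes) -> int:
    # Step 1: hash the message; reduced mod n only because n is tiny here.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message: bytes) -> int:
    # Step 2: "encrypt" the Message Digest with the Private Key exponent d.
    return pow(message_digest(message), d, n)

def verify(message: bytes, signature: int) -> bool:
    # The receiver recomputes the digest and "decrypts" the signature
    # with the Public Key exponent e; equality proves the signature.
    return pow(signature, e, n) == message_digest(message)
```

With these toy parameters, `verify(msg, sign(msg))` holds for any message, because e·d ≡ 1 (mod φ(n)).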
H
Hash Function – It is a transformation H() returning a fixed-size value from an input message of any length. H() is called a one-way function because it must be relatively easy to
compute H(x) for a given message x, but computationally infeasible to find x' for a given hash h such that h = H(x'). The function H() must also be collision-free: it must be computationally infeasible to find, for a given message x, a different message y such that H(x) = H(y).
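The fixed-size-output property can be observed directly with a standard-library hash; SHA-256 is used here only as an example:

```python
import hashlib

# The output length is fixed regardless of input length.
short = hashlib.sha256(b"x").hexdigest()
big = hashlib.sha256(b"x" * 1_000_000).hexdigest()

# A small change in the input yields a completely different digest,
# which is what makes finding collisions hard in practice.
a = hashlib.sha256(b"message A").hexdigest()
b = hashlib.sha256(b"message B").hexdigest()
```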
I
Inverse Cipher – Series of transformations that converts Ciphertext to Plaintext using the Cipher Key .
L
Latency – A measure of the delay experienced in a system, e.g. the time between sending a request and observing its effect.
M
Message Digest – A condensed, fixed-size representation of a message, created from the original message using a one-way Hash Function. A Message Digest encrypted with the sender's Private Key forms the sender's Digital Signature.
Model – A system of assumptions, definitions, and relations that allows some aspect of reality to be presented. A model simplifies reality, dropping properties that are insignificant from the perspective of its intended use.
N
Nonce – A cryptographic nonce is a random or pseudorandom number, often time-variant or containing a timestamp, that can be used only once. Authentication protocols require a nonce to ensure that previous communication cannot be replayed: each run of the protocol requires a fresh nonce.
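A minimal sketch of the replay check described above; the names `seen`, `fresh_nonce`, and `accept` are illustrative assumptions, not part of any particular protocol:

```python
import secrets

seen: set = set()  # nonces already consumed by the responder

def fresh_nonce() -> int:
    # A 128-bit random value; the probability of accidental reuse is negligible.
    return secrets.randbits(128)

def accept(nonce: int) -> bool:
    # Reject any nonce that was used before: a replayed message carries
    # an old nonce and is therefore dropped.
    if nonce in seen:
        return False
    seen.add(nonce)
    return True
```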
P
Performance Testing – A software Testing process aimed at validating system availability, scalability, and resource usage, determining how fast and how effectively a system behaves under a particular workload.
Plaintext – Data input to the Cipher or output from the Inverse Cipher .
Private Key – A part of the Asymmetric Cryptography paradigm. It must be kept secret by its owner. It is used to revert the computation done with the corresponding Public Key, i.e. to decipher data that was encrypted with the Public Key and sent to the Private Key owner.
Private Network – In contrast to a Public Network, a network built from fully trusted machines, not publicly available; no additional security checks are required between the nodes to handle the communication correctly.
Public Key – A part of the Asymmetric Cryptography paradigm. It should be publicly known. It is used to revert the computation done with the corresponding Private Key. The Public Key of a given recipient allows sending him a message that only he will be able to decrypt with his Private Key. A Public Key may also be used to confirm that a message was sent by the owner of the corresponding Private Key.
Public Key Infrastructure – An arrangement that binds Public Keys to user identities; it is used to create, manage, store, distribute, and revoke Digital Certificates.
Public Network – A network that consists of publicly available, potentially insecure machines; communication between the nodes should therefore be encrypted, its integrity checked, and the communicating parties should always confirm its origin.
Public-Private Key Cryptography – A mathematical technique by which two parties each generate a pair of related numbers: a Private Key and a Public Key. Each party transmits its Public Key to the other party, keeping its Private Key secret. Public-Private Key Cryptography allows each party to independently compute a Shared Secret that is used to protect communications.
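The shared-secret derivation described above can be sketched as a Diffie-Hellman exchange. The group parameters p = 23, g = 5 are toy values for illustration only; real deployments use groups of 2048 bits or more.

```python
import secrets

p, g = 23, 5  # toy prime modulus and generator (illustration only)

# Each party picks a Private Key and derives the matching Public Key.
a_priv = secrets.randbelow(p - 2) + 1
b_priv = secrets.randbelow(p - 2) + 1
a_pub = pow(g, a_priv, p)  # transmitted to B
b_pub = pow(g, b_priv, p)  # transmitted to A

# Each side combines its own Private Key with the peer's Public Key;
# both arrive at the same Shared Secret without ever transmitting it:
# (g^b)^a = (g^a)^b (mod p).
secret_a = pow(b_pub, a_priv, p)
secret_b = pow(a_pub, b_priv, p)
```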
Publish/Subscribe – A messaging paradigm in which subscribers express a profile of interest and receive only the messages they are interested in, without knowing the details of the publisher.
R
Request/Response – A messaging paradigm in which an initiator sends a message to a responder, which processes the request and optionally responds to the initiator, establishing a two-way communication channel.
S
S-box – A fundamental part of modern ciphering algorithms: a non-linear substitution table used in byte substitution transformations and in encryption/decryption routines to perform a one-for-one substitution of a byte value.
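A one-for-one byte substitution and its inverse can be sketched as follows. The table here is a seeded random permutation, an illustrative stand-in for a real, carefully designed non-linear S-box such as the AES one:

```python
import random

rng = random.Random(42)  # fixed seed so the table is reproducible
sbox = list(range(256))
rng.shuffle(sbox)  # a random byte permutation (NOT cryptographically designed)

# The inverse table undoes the substitution during decryption.
inv_sbox = [0] * 256
for i, v in enumerate(sbox):
    inv_sbox[v] = i

def substitute(data: bytes) -> bytes:
    # One-for-one substitution of each byte value via table lookup.
    return bytes(sbox[b] for b in data)

def invert(data: bytes) -> bytes:
    return bytes(inv_sbox[b] for b in data)
```

Because the table is a permutation, `invert(substitute(data))` always restores the original bytes.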
Security Association – A generic term for a set of values that identify IPSec features and protection applied to a connection.
Security Testing – A software Testing process aimed at assessing a system's endurance against malicious input and its effectiveness at protecting data, and at checking that it maintains its intended functionality even under adverse conditions.
Service-Oriented Architecture – A business-centric software architectural approach that supports integrating the business as linked, repeatable business tasks, or services.
Shared Secret – A value produced by the use of Public-Private Key Cryptography .
Software Defect – Software bug. An error, flaw, or failure in software that leads to unexpected results or unintended software behavior.
Software Metric – It is a measure of some software property.
Software Quality – It is a property that indicates how well software is designed and how well software conforms to the design.
Symmetric Cryptography – A method of securing communication between two parties that involves the use of a secret key known only to the participants. In contrast to Asymmetric Cryptography, the main disadvantage of this approach is the secure distribution of the secret key between the communicating parties.
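As a minimal illustration of the single-shared-key property, the same key encrypts and decrypts. XOR with a repeating key is a toy construction used here for clarity, not a secure cipher:

```python
from itertools import cycle

def xor_cipher(data: bytes, key: bytes) -> bytes:
    # The same operation both encrypts and decrypts: XOR-ing with the
    # shared key twice restores the original bytes.
    return bytes(b ^ k for b, k in zip(data, cycle(key)))
```

Both parties must hold the same `key`, which is exactly the distribution problem noted above.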
T
Testing – Software testing. A process of empirical investigation: a set of experiments, organized into test cases and test scenarios, executed against a given product version to assess its quality. It is intended to find Software Defects and to provide information on whether the product meets its business and technical requirements.
Throughput – The maximum forwarding rate at which none of the transported data units is lost.
V
Validation – A process that examines whether a product was created in the way the customer desired.
Verification – A process (formal, functional, or runtime), utilizing Testing, reviewing, and inspecting, that checks whether the examined product meets the written requirements, standards, and desired specifications.
Virtual User – A virtualized representation of a user, designed to simulate the same behaviors and interactions with the system that a real user would exhibit.
Vulnerability – In Security Testing, the susceptibility of software to a successful attack, allowing an attacker to reach a software flaw and exploit it.
W
Web Service – Software system designed to support interoperable machine-to-machine interaction over a network.
List of acronyms
Abbreviation – Translation or explanation (Category)

ACK – Acknowledge (Network)
AES – Advanced Encryption Standard (Security)
AH – Authentication Header (Network, Security)
B2B – Back-to-Back (Testing, Network)
BKM – Best Known Method (Design)
CBC – Cipher Block Chaining (Security)
CFB – Cipher Block Feedback (Security)
CIFS – Common Internet FS (Data, Network)
CIL – Common Intermediate Language (Programming)
CMMI – Capability Maturity Model Integration (Standard, Process)
CPU – Central Processing Unit (System resources)
CRC – Cyclic Redundancy Check (Data)
CRT – C Runtime Library (Programming)
CSRF – Cross Site Request Forgery (Security)
CTR – Counter (Security)
DB – Database (Database)
DC – Digital Certificate (Security, Design)
DES – Data Encryption Standard (Security)
DFA – Diagonal Fault Attack (Security)
DFS – Distributed FS (Data, Network)
D-H – Diffie-Hellman (Security)
DiffServ – Differentiated Services (Network, Quality)
DLL – Dynamic Link Library (Application)
DoS – Denial of Service (Security, Network)
DS – Digital Signature (Security)
DSA – DS Algorithm (Security)
DUT – Device Under Test (Testing)
ECB – Electronic Code Book (Security)
ECC – Elliptic Curve Cryptography (Security)
EFF – Electronic Frontier Foundation (Organization)
EFI – Environmental Fault Injection (Security)
EOL – End Of Life (Project timeline)
ESN – Extended Sequence Number (Network, Security)
ESP – Encapsulating Security Payload (Network, Security)
EXP – Experiment (Testing)
EXP1 – EXP number 1 (Testing)
EXP2 – EXP number 2 (Testing)
EXP3 – EXP number 3 (Testing)
EXP4 – EXP number 4 (Testing)
EXP5 – EXP number 5 (Testing)
FAT – File Allocation Table (Data)
FAT12 – FAT 12-bit (Data)
FAT16 – FAT 16-bit (Data)
FAT32 – FAT 32-bit (Data)
FDDI – Fiber Distributed Data Interface (Network)
FFS – Fast FS (Data)
FIPS – Federal Information Processing Standard (Documentation)
FS – File System (Data)
FTP – File Transfer Protocol (Network)
G2G – Gateway-to-Gateway (Network)
GNFS – General Number Field Sieve (Mathematics)
GUI – Graphical User Interface (Application)
GW – Gateway (Network)
H2G – Host-to-Gateway (Network)
H2H – Host-to-Host (Network)
HCC – Haar Cascades Classifier (Multimedia processing)
HD – Hard Disk (Hardware)
HFS+ – Hierarchical FS Plus (Data)
HMAC – Hash Message Authentication Code (Security)
HPFS – High Performance FS (Data)
HTML – Hypertext Markup Language (Standard)
HTTP – Hypertext Transfer Protocol (Network)
HTTPS – HTTP over SSL (Network, Security)
HW – Hardware (Hardware)
I/O – Input/Output (System)
IANA – Internet Assigned Numbers Authority (Organization)
ICMP – Internet Control Message Protocol (Network)
ICV – Integrity Check Value (Network, Security)
IDS – Intrusion Detection System (Security)
IETF – Internet Engineering Task Force (Organization)
IGSI – IPSec GW State Indicator (Testing)
IGTP – IPSec GW Test Plan (Testing)
IIS – Internet Information Server (Application)
IKE – Internet Key Exchange (Network)
IKEv2 – IKE version 2 (Network, Security)
IP – Internet Protocol (Network)
IPSec – IP Security (Network, Security)
IPv4 – Internet Protocol version 4 (Network)
IPv6 – Internet Protocol version 6 (Network)
IRQ – Interrupt Request (System)
ISAKMP – Internet SA and Key Management Protocol (Security, Network)
ISDN – Integrated Services Digital Network (Network)
ISO/OSI – International Organization for Standardization/Open System Interconnection (Network, Standard)
IV – Initialization Vector (Network, Security)
JFS – Journaled File System (Data)
JPEG – Joint Photographic Experts Group (Multimedia)
L2 – Layer 2 (Network)
L3 – Layer 3 (Network)
L4 – Layer 4 (Network)
LAN – Local Area Network (Network)
LEAP – Lightweight Extensible Authentication Protocol (Network, Security)
LIDS – Log-based IDS (Security, Application)
MA2QA – Multidimensional Approach to Quality Analysis (Testing)
MAC – Medium Access Control (Network, Ethernet)
MAC – Message Authentication Code (Security)
MD5 – Message Digest 5 (Security)
MDC – Multiple Description Coding (Network, Multimedia)
ML – Middleware Layer (Design)
MSIE – Microsoft® Internet Explorer (Application)
MTU – Maximum Transfer Unit (Network)
MVC – Model-View-Controller (Design)
NAN – Not A Number (Data)
NAT – Network Address Translation (Network)
NFS – Network FS (Data, Network)
NIDS – Network IDS (Security, Network)
NTFS – New Technology FS (Data)
NU – Navigational User (Testing)
OBO – Off-By-One (Security, Programming)
OFB – Output Feedback (Security)
OS – Operating System (Application)
P/S – Publish / Subscribe (Design)
P/S’ – P/S with SSL (Design)
PCBC – Propagating CBC (Security)
PCI – Payment Card Industry (Design, Standard)
PDT – Product Development Team (Organization)
PE – Production Environment (Design)
PEN – PE Monitoring (Testing)
PFD – Page Flow Diagram (Design, Testing)
PID – Process ID (System)
PKI – Public Key Infrastructure (Security)
PPP – Point-to-Point Protocol (Network)
PRF – Pseudo Random Function (Security)
PTT – Page Test Tree (Design, Testing)
QMCT – Quality Management and Control Team (Testing)
QS – Quadratic Sieve (Mathematics)
R/R – Request / Response (Design)
R/R’ – R/R with IPSec (Design)
RAM – Random Access Memory (Application)
RED – Random Early Detection (Network)
REFI – Runtime EFI (Security)
RFC – Request For Comments (Documentation)
RGID – Real Group Identification (System)
RIA – Rich Internet Application (Design)
RUID – Real User Identification (System)
SA – Security Association (Network, Security)
SaaS – SW as a Service (Design)
SDL – Security Development Lifecycle (Security)
SEFI – Source Code-based EFI (Security)
SHA1 – Secure Hash Algorithm 1 (Security)
SLA – Service Level Agreement (Documentation, Law)
SLIP – Serial Line Internet Protocol (Network)
SLOC – Source Line Of Code (Programming)
SMB – Server Message Block (Data, Network)
SMTP – Simple Mail Transfer Protocol (Network)
SN – Sequence Number (Network, Security)
SOA – Service-Oriented Architecture (Design)
SOAP – Simple Object Access Protocol (Network)
SP – Service Pack (Application, Security)
SP – Stored Procedure (Database)
SPI – Security Parameter Index (Network, Security)
SPS – Separate Port Strategy (Network)
SQL – Structured Query Language (Database)
SRD – System Requirement Document (Documentation)
SRW – Sliding Receiver Window (Network)
SSH – Secure Shell (Network, Security)
SSL – Secure Sockets Layer (Network, Security)
STP – SW Test Plan (Testing, Documentation)
SUT – SW Under Test (Testing)
SUT – System Under Tests (Testing)
SW – Software (Software)
TAD – Test After Development (Testing, Methodology)
TCL – Tool Command Language (Testing)
TCP – Transmission Control Protocol (Network)
TCP/IPv4 – TCP over IPv4 (Network)
TDD – Test Driven Development (Testing, Methodology)
TFC – Traffic Flow Confidentiality (Network, Security)
TFTP – Trivial FTP (Network)
TGS – Ticket Granting Server (Security)
TLS – Transport Layer Security (Security, Network)
TLS-HSP – TLS Handshake Protocol (Network, Security)
TLS-RP – TLS Record Protocol (Network, Security)
TOCTOU – Time-Of-Check-Time-Of-Use (Security)
TOS – Type Of Service (Network, Quality)
TPD – Test Plan Document (Testing, Documentation)
TRF – Truly Random Function (Security)
TTL – Time To Live (Network)
TU – Transactional User (Testing)
UDP – User Datagram Protocol (Network)
UML – Unified Modeling Language (Documentation)
UNS – Upward Negotiations Strategy (Network)
URI – Uniform Resource Identifier (Network)
URL – Uniform Resource Locator (Network)
URN – Uniform Resource Name (Network)
VML – Vector Markup Language (Standard)
VoIP – Voice over IP (Network)
VPN – Virtual Private Networking (Network, Security)
VU – Virtual User (Testing)
WEP – Wired Equivalent Privacy (Network, Security)
WLAN – Wireless LAN (Network)
WPA – WiFi Protected Access (Network, Security)
WS – Web Services (Application)
WSDL – Web Services Description Language (Design)
WTM – Web Test Model (Design, Testing)
WWW – World Wide Web (Network)
XML – Extensible Markup Language (Data)
XSS – Cross-Site Scripting (Security, Testing)
# – Quantity (General)
CHAPTER 1: INTRODUCTION
This chapter gives a short introduction to the design, development, testing, and maintenance of distributed applications working in public/private IP network environments, providing the background for the scope and goal of this dissertation. Distributed applications are depicted as dynamic entities communicating between trusted enclaves over an insecure, threat-laden public network. To illustrate the challenge that cryptanalysis of protected data sent over a public network poses to secure communication, a Network Security Function is defined. Finally, the claims of this dissertation are presented.
1.1. Background of security and performance testing
The complexity of currently developed information systems is constantly increasing. The scope of distributed applications is getting wider, their design more sophisticated, and the technical solutions applied inside them more complicated [Mantyla 2004] [Frankel et al 2005] [Haag et al 2006] [Barylski 2007b]. Meanwhile, customer demand for cheap, error-free products delivered on time keeps growing. Meeting these requirements is not possible without an optimal and adequate strategy for designing, implementing, and testing the delivered solution, including hardware (HW), software (SW), and documentation. To achieve the customer's goal, SW/HW testing is employed, defined as the discipline of executing a SW application placed on HW to determine whether the system meets its requirements and specifications. This term includes Performance Testing [Balsamo et al 1998] [Shaw et al 2002] [Barylski 2008] and Security Testing [Ravi et al 2004] [McGraw, Potter 2004] [Wang et al 2007] as crucial components. On the other hand, in a very complicated runtime environment integrating many different technologies, interfaces, and programming languages, the architecture and implementation of distributed applications are particularly exposed to security and performance defects [GAO-07-1036]. The availability of fast and reliable communication links and a global approach to network services (Service-Oriented Architecture – SOA), in which system components may be deployed to geographically spread network nodes, give an additional degree of freedom that may lead to unexpected security or performance issues due to increased system complexity.
A natural consequence of business needs is the fact that almost all currently developed business [Lam 2001] [Meadors 2004] [Filjar, Desic 2004] [Eisenhauer, Donnelly et al 2006], multimedia [Lawton 2007], entertainment, or educational applications [Shaw 2000] work in a distributed environment, with strict decomposition into SW executed on dedicated calculation units (e.g. clusters or computing clouds, equipped with multicore processors and huge amounts of operational memory), supporting data warehouses, DB servers, specialized presentation servers, WWW servers, load balancers, or dedicated telecommunication platforms (e.g. [IXP2855]) that maintain errorless transmission between distributed system nodes. With more and more efficient HW, all the points mentioned above lead to the thesis that the performance of the developed systems is constantly increasing. In the meantime, system security is an area that needs to be improved day by day [Anderson 2001] [Monroy, Carrillo 2003] [Pimentel et al 2007]. Communication ciphering [CERTICOM] [SPYRUS], user authentication before gaining access to system resources, symmetric/asymmetric cryptography, message authentication codes, certificates, and proofs of correctness are widely employed [Ferguson, Schneier 2003].
What is more, Performance Testing and Security Testing related topics are now visible at each stage of the SW development process [Wang 2004], beginning from the early design phase, through implementation, to the validation, deployment, and maintenance phases until HW & SW EOL. It is getting more and more popular to adopt specialized development methodologies oriented on system security (e.g. Secure Software Development Lifecycle, Security Development Lifecycle (SDL) [Howard, Lipner 2006]) to put more attention to this area. The process is founded on proven security standards [Caplan, Sandlers 1999]. The role of advisors is also taken by dedicated teams, often external to the project, aimed at monitoring system security, controlling it, and taking appropriate actions if a situation requires so (e.g. Systems Engineering Center of Excellence, Inc.). Testing methodology is being equipped with new tools and techniques (e.g. Penetration Testing, Exploratory Testing [Arkin et al 2005]).
1.2. Introduction to distributed public-private network environments
The architecture and implementation of distributed applications are still evolving. Over the span of 40 years, starting from [Royce 1970], an evolution of SW methodology can be observed – from the traditional waterfall project lifecycle, through prototyping and the release model approach, to agile methodology (Agile Development, Extreme Programming) or combined methods. The reasons for change vary: beyond dispute, new technologies and human achievements (e.g. multiprocessor multicore platforms), globalization of network services, Internet availability, evolution of programming languages (e.g. Microsoft .NET, Sun Java), standardization of protocols, new business opportunities (e.g. remote services), more demand on application GUI (e.g. Microsoft DirectX, OpenGL, trends created by Google, Apple), and finally the increased expectations of clients – global competition, the Time-To-Market race, and the search for new solutions saving energy and natural resources. More interestingly from the viewpoint of this dissertation, these changes also impact Security Testing and Performance Testing, demanding a wise choice of test strategy and project timeline, placing validation and verification at different project phases. This paper will continue the analysis mentioned above, with attention paid to IPSec (see chapter 2.1) and HTTPS-based (see chapter 2.2) solutions. Figure 1 demonstrates the idea of distributed applications working in a public-private network infrastructure. Three Private Networks (A, B, C) are interconnected with a Public Network of unknown, often complicated, and non-trusted topology, full of potential danger of traffic being dropped, forged, intercepted, or replayed. Each Private Network hosts a part of the distributed application. Every piece of the system must communicate with the others in a reliable way so that the whole system may operate. There are many reasons that stand for distribution among many independent pieces: from economic reasons (e.g. some components are outsourced and situated in geographically different locations), through design (e.g. the system is decomposed into logical layers, each of which requires access to specific resources that should not influence each other), implementation (e.g. the system incorporates solutions in many technologies and programming languages), maintenance (e.g. responsibility for system maintenance is distributed among many teams), system scalability (e.g. the processing effort of the system is distributed among many network nodes to multiply the processing effectiveness), system availability (to prevent the system from being unavailable in case of failure of a single component, each crucial piece should have its backup), to system performance (wise distribution allows the system to be load-balanced in case of peak user traffic).
[Figure 1: three Private Networks (A, B, C), each composed of an Application Layer, a Middleware Layer, and a Network Layer, interconnected through a Public Network]
Figure 1. Idea of distributed applications working in public-private network infrastructure [source: author]
Whether a network is private or public depends on the type of potential danger that can happen to the traffic sent through it. If the network is available only to trusted users who cannot harm the communication, it is a private network; otherwise it is a public network. It should be emphasized that not only intended but also unintended user behavior may harm the network flow, and that the term “user” (discussed later in chapter 3.6, relating to SW security testing) can denote a person (end-user, system administrator), a process, a device, or even an internal SW failure. Network security is achieved by encryption algorithms that consist of a publicly known recipe for protecting the communication but a hidden private key, used to generate the ciphertext from the plaintext, that guarantees the secure communication (known as Kerckhoffs's principle [Mollin 2006]). One important aspect should be raised here – security is not possible forever. Each commonly used encryption method is subject at least to a brute-force attack that enables a successful attack on the ciphertext in a long but finite time. There can also be more effective methods of breaking the cipher, derived from cipher algorithm flaws, inadequate encryption key distribution, or weak keys used. This observation leads to the thesis that a communication flow at time t0 (start of the communication) is 100% secure, but then its security constantly decreases, reaching 0% security after some time. Furthermore, network security is not stronger than the weakest security link in the system chain [Ferguson, Schneier 2003]. Figure 2 depicts the ciphering secure area – the only trusted timeframe for communication security. It is the field delimited by any possibility of breaking the cipher, including a brute-force attack, taking advantage of a ciphering flaw, or stealing the key from the system in any way.
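The brute-force bound mentioned above follows from simple arithmetic: exhausting a k-bit keyspace at a key-test rate of v keys per second takes 2^k / v seconds. The sketch below is purely illustrative; the assumed attacker speed of 10^12 keys per second is an arbitrary example, not a measured figure:

```python
# Rough estimate of brute-force exhaustion time for a k-bit key.
# The key-test rate (keys per second) is an assumed attacker capability.
def brute_force_years(key_bits: int, keys_per_second: float) -> float:
    seconds = (2 ** key_bits) / keys_per_second
    return seconds / (365 * 24 * 3600)

# At an assumed 10^12 keys/s, a 56-bit (DES-sized) keyspace falls in
# under a day, while a 128-bit (AES-sized) keyspace remains out of reach.
print(f"56-bit:  {brute_force_years(56, 1e12):.5f} years")
print(f"128-bit: {brute_force_years(128, 1e12):.2e} years")
```

This is why the secure area in Figure 2 always has a finite extent: the brute-force curve guarantees an upper bound on the security lifetime even when no algorithmic flaw is known.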
[Figure 2: ciphering security [%] plotted against cryptanalysis ability, with three curves – security against a brute-force attack, against an attack using algorithm flaws, and against an attack targeted on stealing the key from the system – and the secure area delimited below all of them]
Figure 2. Security of public network [source: author]
Generally speaking, the cryptanalysis effort required to break communication security (in other words: to learn the valid cipher key as the main securable) can be characterized by the available processing speed. The more time t we have, the more calculations we can do. The more powerful the available processing speed v, the more efficient the calculations are – their speed increases. Network security F_NS(t) is a function of t, incorporating the cryptanalysis speed v in its body, bounded from above by 100% and from below by 0%, decreasing until it reaches its minimum value, then constant (Figure 2). The explanations above lead to the conclusion that a private network remains secure for the time t during which the weakest security communication link in the chain is not broken with the equipment available to the attacker. With this fact in mind, the communication is secured only for a given period of time ts, and each piece of information should be secured with the use of a method that provides an adequate level of security within it. This means that each piece of information has its security lifetime tl (e.g. see chapter 2.1.1: IPSec SA lifetime). To achieve success, tl must be significantly smaller than ts (tl << ts).
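The shape of F_NS(t) and the lifetime condition tl << ts can be sketched numerically. The linear decay model and all parameter values below are illustrative assumptions only; the dissertation does not prescribe a concrete functional form:

```python
# Illustrative model of the network security function F_NS(t):
# security starts at 100% and decreases with cryptanalysis progress v*t,
# where w is the total work needed to break the weakest link.
def f_ns(t: float, v: float, w: float) -> float:
    """Security [%] at time t, given attacker speed v and work-to-break w."""
    return max(0.0, 100.0 * (1.0 - v * t / w))

def lifetime_ok(tl: float, v: float, w: float, margin: float = 100.0) -> bool:
    """Check tl << ts, where ts = w / v is the time to reach 0% security."""
    ts = w / v
    return tl * margin <= ts

# Example: attacker speed 1e9 ops/s, 1e15 ops needed to break the cipher
v, w = 1e9, 1e15
print(f_ns(0, v, w))                    # -> 100.0 (fully secure at t0)
print(f_ns(5e5, v, w))                  # -> 50.0  (halfway to broken)
print(lifetime_ok(3600, v, w))          # -> True: 1 h << ts = 1e6 s
```

The margin factor encodes "significantly smaller": a message lifetime of one hour easily satisfies tl << ts here, while a lifetime close to ts would not.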
[Figure 3: two panels, (a) system A and (b) system B, each plotting ciphering security [%] against cryptanalysis ability]
Figure 3. Comparison of two implementations working in public-private network environments with different security areas: system A keeps its characteristics above 0% ciphering security longer than system B, but implementation B provides higher ciphering security if the effective message lifetime is short [source: author]
Figure 3 depicts two implementations (system A and system B) assessed from the ciphering security perspective with F_NSA(t) and F_NSB(t), respectively. Let tA be the maximum t for which F_NSA(t) > 0, and let tB be the maximum t for which F_NSB(t) > 0. Although tA > tB, implementation B is preferred over implementation A as far as message security lifetime is concerned if ts << tB, because F_NSB(tB) >> F_NSA(tB). If neither ts << tA nor ts << tB holds, then neither implementation A nor implementation B is recommended, because the risk of successful cryptanalysis applied to the communication channel is too high in both cases. As an example, let us consider the security of the RSA [RSA] encryption algorithm with different key lengths (Table 1). Current successful attacks on RSA are based on factoring the large integer n = p × q to recover the prime factors p and q, from which the secret exponent d for the given Public Key is computed; the Private Key is then obtained from d, and finally the Ciphertext is decrypted with the standard RSA procedure.
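The factoring attack outlined above can be demonstrated end to end on a toy key. Everything below (the 26-bit modulus, the trial-division factoring step) is purely illustrative; for realistic moduli such as RSA-1024 the factoring step is exactly what makes the attack infeasible:

```python
# Toy demonstration of the factoring attack on RSA described above.
# The 26-bit modulus is illustrative only: for real moduli (RSA-1024
# and larger) the factoring step below is computationally infeasible.
def factor(n: int):
    """Recover p and q by trial division (feasible only for tiny n)."""
    for p in range(3, int(n ** 0.5) + 1, 2):
        if n % p == 0:
            return p, n // p
    raise ValueError("no odd prime factor found")

# Toy public key (n, e); the attacker sees only these and the ciphertext.
n, e = 62615533, 65537                  # n = 7907 * 7919
ciphertext = pow(42, e, n)              # sender encrypts the plaintext 42

# Attack: factor n, rebuild the secret exponent d, decrypt.
p, q = factor(n)
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)                     # modular inverse (Python 3.8+)
plaintext = pow(ciphertext, d, n)
print(plaintext)                        # -> 42
```

The QS and GNFS algorithms in Table 1 play the role of factor() here: they replace trial division with sieving methods whose cost still grows super-polynomially with the key length, which is why lengthening the modulus restores the security margin.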
Table 1. Illustration of the resistance of some RSA numbers to cryptanalysis, based on the RSA Security factoring challenge, normalized to the computational power of a 2.2 GHz AMD Opteron with 2 GB RAM [RSASECURITY]

RSA-100 [Lenstra 1991]
• Algorithm: QS [Pomerance 1990]
• Duration: 4 hours
• Environment: AMD Athlon 64 2.2 GHz with 2 GB RAM, using msieve [MSIEVE]

RSA-640 [RSA640_EMAIL]
• Algorithm: GNFS [Pomerance 1996]
• Duration: 5 months
• Environment: 80 machines with AMD Opteron 2.2 GHz with 2 GB RAM (33 years of a single 2.2 GHz AMD Opteron with 2 GB RAM)

RSA-768 [Kleinjung et al 2010]
• Algorithm: GNFS [Pomerance 1996]
• Duration: 2.5 years
• Environment: cluster of machines, representing the computational power of 15000 years of a single 2.2 GHz AMD Opteron with 2 GB RAM
Figure 4 depicts the cryptanalysis effort required to break the RSA-100, RSA-640, and RSA-768 numbers. The results are normalized to workdays of a PC with CPU = 2.2 GHz AMD Opteron and RAM = 2 GB, presented on a logarithmic scale. RSA-100 is factorized almost immediately. Although factorization of both RSA-640 and RSA-768 requires a computational cluster (note that this is within the economic reach of many large organizations), the recommendation is to use at least RSA-1024 in secure communication [Kleinjung et al 2010] (see: chapter 2.2.2, discussion on HTTPS security).
[Figure 4: security [%] of RSA-100, RSA-640, and RSA-768 plotted against cryptanalysis effort, expressed in days of a 2.2 GHz AMD Opteron with 2 GB RAM on a logarithmic scale from 1 to 1000000]
Figure 4. Network security function of RSA-100, RSA-640, and RSA-768 [source: author]
Two more things should also be taken into consideration when calculating the cryptanalysis potential: the possibility of successful attack repetition, and the attack distribution capability. If there is no easy way to repeat the attack on another unit of HW/SW, or if the attack is different for each unit, the risk of a global security flaw is lower. If there is even a chance to automate the attack by creating SW capable of injecting the cryptanalysis method directly into the unit, and that hacking SW is then published and widely distributed, the risk of losing network privacy is significantly higher.
1.3. Goal and scope of the dissertation
Many research teams are investigating the performance of distributed systems [Shaw 2000] [Weyuker, Vokolos 2000] [Shaw et al 2002] [Avritzer et al 2002] [Denaro et al 2004] [Sung, Lin 2008] [Mekkat et al 2010] and system security [Li et al 2000] [Anderson 2001] [Ravi et al 2004] [McGraw, Potter 2004] [Arkin et al 2005] [Rajavelsamy et al 2005] [Wang et al 2007] [Chang et al 2010]. Not only architectural and implementation aspects [Weyuker, Vokolos 2000] [Frankel 2000] [Chang et al 2010] are taken into consideration but also validation activities [Krawczyk, Wiszniewski 2000] [Denaro et al 2004] [Barylski 2008]. However, there is little research in which both performance and security are investigated at the same time, studying horizontal (within the same layer) or vertical (between layers) correlations of performance and security metrics. This dissertation covers this topic through a deep analysis of IPSec (see: chapter 2.1) and HTTPS-based (see: chapter 2.2) distributed applications working in public-private network environments, operating in the Request/Response (R/R) and Publish/Subscribe (P/S) architectures, revealing SW performance tradeoffs while raising SW security. The dissertation also covers design, implementation, and testing challenges related to the R/R and P/S architectures (see: chapter 2.3). The key components of each architecture are discussed, and the activity diagrams presented. The analysis includes the performance trade-offs, bandwidth utilization, and potential security issues of both solutions. Examples of implementations, especially related to multimedia stream processing, are depicted and discussed.
The SW/HW Quality Management and Control Team (QMCT) should have a central role in an informatics project, observing and measuring metrics that inform about the success (or failure) of business activities (e.g. the number of unresolved P1 defects, the percentage of test coverage, the percentage of acceptance test coverage [Barylski, Barylski 2007]). One of the most important tasks that the QMCT must face at the beginning of the project is the proper selection of success metrics. This dissertation does not omit this phase – each experiment (see: chapter 5) carried out to prove the claims of the dissertation is preceded by a set of definitions, including security and performance metrics. The metrics are used to evaluate system quality, supporting the implementation decision process and proving the usability of the proposed testing model (see: chapter 4) targeted at improving the quality of distributed applications working in public-private network environments. The crucial output of the proposed testing approach is a summary of total system security and performance on the basis of the selected metrics. Detection of critical defects of a distributed application based on SW metrics is a well-known technique [Wijesihna et al 2005] [Barylski 2007a]. Examples of such characteristics are: from the basic system resources perspective: CPU time usage [%], operational memory usage [B]; from the network layer perspective: throughput [Mb/s], latency [ms], response time [ms]; from the middleware layer viewpoint: availability of services [%], DB query response time [ms]; from the application perspective: correctness of image recognition [%], mathematical calculation latency [ms], number of possible concurrent users. This dissertation classifies the knowledge from this scope, resulting in a testing model for improving the quality of the SW implementation.
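One simple way to fold such per-layer metrics into a single indicator is a normalized weighted average. The metric names, values, and weights in the sketch below are invented for illustration; the actual MA2QA scoring matrix is defined in chapter 4:

```python
# Illustrative aggregation of per-layer metrics into one quality score.
# Each metric carries a normalized value (0..1, higher is better) and a
# weight; both the names and the numbers are example data only.
metrics = {
    "network.throughput":        (0.80, 2.0),
    "network.latency":           (0.70, 1.5),
    "middleware.availability":   (0.95, 2.0),
    "application.response_time": (0.60, 1.0),
    "security.access_control":   (0.90, 2.5),
    "security.communication":    (0.85, 2.5),
}

def quality(m: dict) -> float:
    """Weighted average of normalized metric values, scaled to 0..100."""
    total_weight = sum(w for _, w in m.values())
    return 100.0 * sum(v * w for v, w in m.values()) / total_weight

print(round(quality(metrics), 1))       # -> 82.8
```

The higher weights on the security rows mimic a project where security is the dominant quality concern; re-running quality() after each development iteration yields a single number that can be compared across application versions.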
IPSec (Internet Protocol Security) technology (see chapter 2.1) is the answer of the community interested in IP network security in the era of increasing demand for globally available secure IP solutions, for both IPv4 and IPv6. IPSec is a set of protocols that extends IP capabilities to cover data confidentiality, communication integrity checking, message source authentication, and communication anti-replay protection. This dissertation presents the IPSec state of the art in detail, including the latest security-related improvements, ciphering recommendations, and end-user experience. Distributed applications based on the IPSec protocol are recommended to be developed in iterations or by prototyping [Mantyla 2004] [Frankel et al 2005] [Barylski 2007]. This dissertation covers this topic as well. The applications used to validate the proposed testing model for improving the quality of distributed applications are developed in iterations (see chapter 5 with experiments) – each iteration provides a complete solution, iterated to the next one with new functionality implemented. This area of research is also taken into consideration when creating a test plan for the application (see chapter 3.1). The dissertation also covers the SSL (Secure Sockets Layer) protocol and its implementations in widely used solutions (e.g. HTTPS) (see chapter 2.2). SSL provides slightly different security than IPSec – it provides a secure mechanism for exchanging data between applications, independent of IP. In comparison to an IPSec VPN, with an SSL VPN the remote machine gets access to specific services and applications, not to the entire IP network.
1.4. Claims of the dissertation
Claim 1
A testing procedure presented in this dissertation, the Multidimensional Approach to Quality Analysis (MA2QA), makes it possible to calculate an integrated quality score of an application A from both the security and performance perspectives, defined as Quality(A). This score is a resultant value of the security and performance properties of A.
Claim 1 defines capabilities of MA2QA.
Claim 2
MA2QA used against application A within its development lifecycle leads to the creation of a newer version, application A’, with the following property:
Quality (A) ≤ Quality (A’)
Claim 2 depicts usability of MA2QA.
1.5. Document structure
From a high-level view this dissertation is composed of four parts: a) Analysis of Performance Testing and Security Testing of distributed applications working in public-private IP network environments with the use of such technologies as IPSec and SSL; b) Proposal of a testing method for improving the quality of distributed applications, taking advantage of both Security Testing and Performance Testing; c) A set of experiments executed against distributed applications working in public-private IP network environments that allow the author to prove the efficiency and usability of the proposed testing model for improving the overall application quality; d) Summary of the obtained test results and observations, final conclusions, and future work recommendations.
The detailed document structure is as follows:
1) The analysis part of the dissertation consists of Chapter 2 and Chapter 3. Both of them are the necessary introduction to the research area and give an overview of both performance and security testing, and of distributed applications working in private-public IP environments themselves, with design details, Best Known Methods (BKMs), potential issues, and current research topics. 2) Particularly, Chapter 2 gives the necessary introduction to the IPSec and SSL technologies that are the heart of the System Under Tests (SUT), designed to work in two specific SW
architectures used as a research base: Request/Response (R/R) and Publish/Subscribe (P/S), naturally eligible for IPSec and SSL. From the IPSec standard, the main analysis effort is put on IPv4+ESP (both tunnel and transport modes) and the IKEv2 protocol. SSL is analyzed as a universal solution that can be applied to a range of ISO/OSI transport layer network protocols, with the HTTPS implementation as an example of use. 3) Chapter 3 presents Performance Testing and Security Testing of distributed applications, with all necessary definitions and terms. As far as Performance Testing is concerned, test methodologies, tools, and algorithms are discussed that operate in each layer: from the Operating System Layer, through the Network Layer and Middleware Layer, to the Application Layer. The tests are illustrated with defect classes related to each layer of the application. On the other hand, Security Testing is presented from a different point of view than Performance Testing, aimed at uncovering SW and HW security threats before the system is released and deployed to the target environment, where attackers can access it and potentially break it. The background of the evaluated security testing techniques is filled in with a list of spectacular SW security bugs, each belonging to a specified defect class. This chapter is supplemented with two appendices, A and B, presenting details of performance and security testing tools. 4) An innovative testing method for improving the quality of distributed applications working in public-private network environments is presented in chapter 4. The concept presented by the author joins Performance Testing and Security Testing as complementary techniques, allowing the quality of distributed applications to be treated on a broad basis. The model foundation is a multidimensional matrix of security and performance metrics divided into six areas of interest.
The first three relate to system performance: data processing effectiveness, time-related measurements, and overall user satisfaction with system performance, while the last three deal with system security: SW access control, SW vulnerability to malicious input, and SW communication security. A point value is associated with each metric, making it possible to study mutual correlations between the performance and security metrics of each layer of a distributed application and leading to the final equation that calculates the total system quality value from both the security and performance perspectives. 5) Chapter 5 then presents the results of experiments against distributed applications working in public-private IP network infrastructures built with IPSec and SSL. Each experiment aims to discover how security and performance relate within the distributed applications, trying to point out the best solution from both the performance and security perspectives. The results show a perceptible influence of the IPSec/SSL implementation on system performance, along with the fact that total system security is significantly increased. 6) A summary of the dissertation is included in the last chapter. Chapter 6 addresses the claims of the dissertation and confronts them with the obtained experimental results. It evaluates the usability and efficiency of the proposed testing method, summarizing the achievements and contributions of this dissertation. Open problems, further work recommendations, and challenges are discussed at the end of the paper.
CHAPTER 2: CHARACTERIZATION OF PRIVATE-PUBLIC IPSEC AND HTTPS APPLICATIONS
This chapter presents the “patient” subjected to “treatment” with MA2QA from Chapter 4: an IPSec- and HTTPS-based application. It provides a deep dive into the IPSec and HTTPS protocols (including the authentication and encryption algorithms applicable to them) and presents the security and performance aspects of distributed applications working in public/private IP network environments, from design, through implementation, to validation. The discussion concentrates on the two software architectures most appropriate for solutions utilizing IPSec and HTTPS: Request/Response and Publish/Subscribe.
2.1. IPSec-based distributed applications design
2.1.1. Introduction to IPSec
IP packets do not have any inherent security, making it easy to forge and inspect their content, including the IP addresses contained in them. As a result, there is no guarantee that a received IP packet: 1) is from the claimed sender, 2) contains the original data that the sender put in it, or 3) was not sniffed during transit [Shue et al 2005]. Internet Protocol Security (IPSec) [Kent, Seo 2005] is a set of protocols for securing Internet Protocol version 4 (IPv4) [Postel 1981a] and Internet Protocol version 6 (IPv6) [Deering, Hinden 1998] communications by authenticating and/or encrypting each IPv4/IPv6 packet in a data stream. IPSec also includes a protocol for cryptographic key establishment, Internet Key Exchange version 2 (IKEv2) [Kaufman et al 2005], which establishes the main communication channel between the two parties in a secure manner. The history of IPSec began in 1995 [Atkinson 1995], when the first IPSec draft specification was created. At the end of 1998 the first official version was presented [Kent, Atkinson 1998], superseding the previous one, commonly named IPSec release 1. Seven years later, in 2005, field experience with IPSec resulted in the next document [Kent, Seo 2005], commonly named IPSec release 2. Further modifications to the standard are expected, probably simplifying the design and adjusting it to field practice. The most common use of IPSec is providing Virtual Private Networking (VPN) services. A VPN is a virtual network, built on top of existing physical networks, that protects data and IP information transmitted between networks with a secure communication mechanism [Frankel et al 2005]. However, it must be emphasized that VPNs do not remove all risk from networking. The potential problems are: the strength of the IPSec implementation, encryption key disclosure, and availability in terms of the ability of authorized users to access the system as needed, which IPSec does not support at all [Frankel et al 2005].
Three models of IPSec use are widely met:
a) Gateway-to-Gateway IPSec Architecture
b) Host-to-Gateway IPSec Architecture
c) Host-to-Host IPSec Architecture
Gateway-to-Gateway (G2G) IPSec Architecture provides secure network communication between two networks. An IPSec Gateway is a network device that stands at the boundary between a private IP network (where all network nodes are trusted and neither encryption nor authentication is required) and a public IP network (where encryption and authentication protect the IP traffic from being read or sent by unauthorized parties). It may be a dedicated device that performs IPSec functions only or, more frequently, part of another network device (router, firewall). Figure 5 depicts secure communication between users from different private networks over the public Internet, despite the presence of hackers, intruders, and wire taps. Private network users are not involved in the encryption and authentication processes directly; the IPSec Gateway behavior is transparent to them. However, no protection is provided between each host and its local IPSec Gateway.
Figure 5. Idea of Gateway-To-Gateway IPSec Architecture (based on [Kent, Seo 2005])
Figure 6 depicts the Host-to-Gateway (H2G) IPSec Architecture, an increasingly common IPSec deployment model. It is used to provide secure remote access, especially via WiFi networks for traveling users. Each remote access requires an IPSec connection to be established between the remote host (client) and the IPSec Gateway (server), separately for each individual user. The remote user is typically asked by the IPSec Gateway to authenticate before the connection can be established, and each host must have IPSec client SW installed and configured, so the IPSec implementation is not transparent to the user. The communication between the IPSec Gateway and the destination host is not protected.
Figure 6. Idea of Host-to-Gateway IPSec Architecture (based on [Kent, Seo 2005])
Host-to-Host (H2H) IPSec Architecture (Figure 7) is the least commonly used approach. In most cases it is chosen only for special-purpose needs, such as remote administrative tasks that require the use of insecure protocols, performed by a system administrator on a single server. In this approach a dedicated IPSec connection is created between the remote host and the server. When a user wishes to use resources on the IPSec server, the user’s host initiates the IPSec connection, preceded by a successful user authentication. The H2H IPSec usage model is resource-intensive to implement and maintain (it requires each client to manage its IPSec configuration); however, it is the only model that protects data throughout its entire transit [Frankel et al 2005]. The IPSec specification proposes two protocols for secure data transit: IP Encapsulating Security Payload (ESP) [Kent 2005a] and IP Authentication Header (AH) [Kent 2005b]. AH may be applied alone, in combination with ESP, or in a nested fashion. However, AH is not popular due to its design limitations (lack of encryption support) and because ESP can easily emulate AH behavior. AH provides authentication for as much of the IP header as possible, as well as for next level protocol data. However, some IP header fields may change their values in transit (e.g. TTL, TOS, IP options), and the values of these fields, when the packet arrives at the receiver, may not be predictable by the sender. The values of such fields cannot be protected by AH. As a result, the protection provided to the IP header by AH is piecemeal [Kent 2005b]. Thus, AH will not be covered by this dissertation.
Figure 7. Idea of Host-to-Host IPSec Architecture (based on [Kent, Seo 2005])
2.1.2. ESP security
ESP is a protocol that provides both authentication and encryption for both IPv4 and IPv6 and is universally used by IPSec gateways. It supports cryptographic ciphers that protect the secrecy of transmitted data and cryptographic message authentication codes that provide authentication and protect the integrity of data sent over the Internet. ESP can be used to provide confidentiality, data origin authentication, connectionless integrity, an anti-replay service (a form of partial sequence integrity based on a 32-bit Sequence Number (SN) or a 64-bit Extended Sequence Number (ESN)), and limited traffic flow confidentiality. The set of services provided depends on options selected at the time of Security Association (SA) establishment (Figure 8) and on the location of the implementation in a network topology.
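The anti-replay service mentioned above is typically realized with a sliding receive window over the SN space. The following is a minimal illustrative sketch in the spirit of [Kent 2005a] (RFC 4303, Appendix A); the class and method names, and the default window size, are the author's own choices, not part of any real IPSec implementation:

```python
class AntiReplayWindow:
    """Sliding-window anti-replay check over ESP sequence numbers (sketch)."""

    def __init__(self, window_size=64):
        self.window_size = window_size
        self.highest = 0   # highest SN accepted so far
        self.bitmap = 0    # bit i set => SN (highest - i) already seen

    def check_and_update(self, sn):
        """Return True if the packet is accepted, False if dropped."""
        if sn == 0:
            return False                   # SN 0 must never appear on the wire
        if sn > self.highest:              # new packet advances the window
            shift = sn - self.highest
            self.bitmap = ((self.bitmap << shift) | 1) & ((1 << self.window_size) - 1)
            self.highest = sn
            return True
        offset = self.highest - sn
        if offset >= self.window_size:
            return False                   # too old: fell off the left edge
        if self.bitmap & (1 << offset):
            return False                   # duplicate: replayed packet
        self.bitmap |= 1 << offset         # late but new: mark as seen
        return True
```

A packet received twice, or one older than the window, is dropped; a late packet still inside the window is accepted exactly once.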
Each inbound and outbound policy consists of a selector (IPv4/IPv6 source and destination addresses, L4 source and destination port numbers, L4 protocol), a priority for overlapping policies, and an action: Bypass, Discard, or IPSec. Each inbound and outbound SA carries the same selector plus the SPI, the decryption/encryption and authentication algorithms, and the corresponding keys.
Figure 8. Attributes of policies and SAs, with packet selectors and IPSec specific data [source: author]
ESP works in two exclusive modes: tunnel and transport. The ESP header is inserted after the IP header and before the next layer protocol header (transport mode) or before an encapsulated IP header (tunnel mode).
The ESP packet consists of the following fields: Security Parameters Index (SPI), Sequence Number (SN), optional Initialization Vector (IV), variable-length Payload Data, optional variable-length Traffic Flow Confidentiality (TFC) Padding, Padding (0-255 bytes), Pad Length, Next Header, and variable-length Integrity Check Value (ICV).
Figure 9. Encapsulating Security Payload Packet Format [Kent 2005a]
ESP tunnel mode creates a new IPv4/IPv6 header for each packet (Figure 10). The new IP header contains the new source and destination endpoints: the IP addresses of the ESP tunnel (e.g. in the G2G approach the IP addresses of the two IPSec Gateways). ESP tunnel mode can be used in all three architectures: G2G, H2G, and H2H. It encrypts and protects the integrity of both the L4 data and the original IP header, concealing the actual source and destination of the packet.
Figure 10. IPv4/IPv6 packet after applying ESP tunnel mode with DiffServ and IPSec processing (based on [Kent, Seo 2005])
In transport mode, ESP utilizes the original IPv4/IPv6 header instead of creating a new one (Figure 11), encrypting and protecting the integrity of the packet’s payload while leaving the IP header out of ESP protection. ESP transport mode is designed to work mainly in H2H architectures without Network Address Translation (NAT) processing. NAT introduces modifications to L4 ports that require e.g. the TCP checksum to be recalculated, and the encrypted ESP L4 payload prevents intermediate hosts from applying NAT translation. Some modifications of ESP transport mode are known, including a G2G mutation (Figure 12) with ESP processing between the IPSec gateways only.
Figure 11. IPv4/IPv6 packet after applying ESP transport mode with DiffServ and IPSec processing (based on [Kent, Seo 2005])
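A rough per-packet size comparison of the two modes can be sketched as follows, under assumed parameters (IPv4 with a 20-byte header, AES-CBC with a 16-byte block and IV, HMAC-SHA1-96 with a 12-byte ICV, no TFC padding); the constants and function names are illustrative only:

```python
# Assumed sizes in bytes (not mandated by the ESP specification itself):
IP_HDR, ESP_HDR, IV, ICV, BLOCK = 20, 8, 16, 12, 16

def esp_pad(plaintext_len, block=BLOCK):
    """Encryption padding so plaintext + pad + 2 trailer bytes fill whole blocks."""
    return (block - (plaintext_len + 2) % block) % block

def esp_packet_size(payload_len, tunnel):
    # Transport mode encrypts only the L4 payload; tunnel mode encrypts the
    # whole inner IP packet and prepends a new outer IP header, so the outer
    # IP_HDR term covers the reused original header (transport) or the newly
    # created one (tunnel).
    plaintext = (IP_HDR + payload_len) if tunnel else payload_len
    pad = esp_pad(plaintext)
    # SPI + SN (ESP_HDR), IV, ciphertext, Pad Length + Next Header (2), ICV
    return IP_HDR + ESP_HDR + IV + plaintext + pad + 2 + ICV
```

For a 1400-byte L4 payload this yields 1464 bytes in transport mode and 1480 bytes in tunnel mode: the tunnel pays for the extra encapsulated inner IP header.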
Table 2. Explanation of ESP packet fields [Kent, Seo 2005]
SPI (32-bit, const): SPI > 255 (value 0 is reserved for special purposes; values 1-255 are reserved by IANA for future use). In combination with the IPv4/IPv6 source and destination addresses, L4 source and destination ports, and L4 protocol number it uniquely identifies the SA for the datagram.
SN (32-bit, const): Monotonically increasing counter value for consecutive packets, starting from 1 up to 2^32 - 1 = 4294967295 (value 0 must not be sent over the wire). This counter must be present in every packet sent by the sender and may be analyzed by the receiver if Inbound Sequence Number Verification is performed; after reaching the highest possible value (2^32 - 1) the SA lifetime is over.
IV (optional): The value that is a starting seed for some cipher modes, e.g. CBC (Table 6).
Payload Data (variable): Contains data described by the ESP Next Header field. This field is mandatory and is an integral number of bytes in length.
TFC Padding (optional, variable): Used to hide the packet length; can be added only if the Payload Data field contains a specification of the length of the IP datagram (always true in tunnel mode, and may be true in transport mode depending on whether the next layer protocol, e.g. UDP or ICMP, contains explicit length information). This length information enables the receiver to discard the TFC padding, because the true length of the Payload Data is known.
Padding (variable, 0-255 bytes): Mandatory field; used for encryption (e.g. to pad the plaintext to a multiple of some number if the cipher requires so) or to ensure that the ciphertext terminates on a 4-byte boundary with both Pad Length and Next Header right-aligned within a 4-byte word.
Pad Length (8-bit): Mandatory field that indicates the number of padding bytes immediately preceding it. Value 0 means that no padding bytes are present.
Next Header (8-bit, const): Mandatory field that contains the value of the next level protocol sent in the packet (e.g. TCP, UDP).
ICV (variable, length dependent on the selected authentication algorithm): Computed over the ESP packet minus the Authentication Data to check packet integrity and verify its origin. This field is optional, used only when authentication is enabled for the given SA.
[Manral 2007] presents a list of cipher and authentication algorithms for ESP (Table 3), focusing attention on the MUST (required, shall) [Bradner 1997] and SHOULD+ (expected to be promoted to MUST in the future) items. MUST- items are expected to be no longer MUST in the future; MAY items are optional. Table 4 describes the MUST encryption algorithms and Table 5 the most common authentication algorithms.
Table 3. Authentication and cipher algorithms applicable for ESP [Manral 2007]
Encryption algorithms: MUST: NULL [Glenn, Kent 1998], AES-CBC-128 [Frankel, Herbert 2003]; MUST-: 3DES-CBC; SHOULD: AES-CTR [Housley 2004]; SHOULD NOT: DES-CBC.
Authentication algorithms: MUST: HMAC-SHA1-96 [Madson, Glenn 1998b]; SHOULD+: AES-XCBC-MAC-96; MAY: NULL, HMAC-MD5-96 [Madson, Glenn 1998a].
It must be emphasized that [DES] is currently too weak for professional use. Experiments done with the EFF DES Cracker [DEEPCRACK] showed that the tool, run in the distributed environment [distributed.net], allowed 56-bit DES to be cracked with a brute force attack within 22.5 hours [McNett 1999]. [3DES] is expected to become tractable to this kind of attack in the near future.
[Luby, Rackoff 1988] [Luby, Rackoff 1989] suggest that a secure block cipher can be assumed to behave as a good Pseudo Random Function PRF(), measured by an adversary’s inability to distinguish between two objects: PRF(a) and TRF(), where a is a randomly chosen key and TRF() is a Truly Random Function.
Table 4. MUST cipher algorithms applicable for ESP [source: author]
NULL — Status: offers no confidentiality nor any other security service; however, support for the "NULL" algorithm is required to maintain consistency with the way services are negotiated [Glenn, Kent 1998]
Input: block size of 1 byte
[AES] — Status: secure; however, some cryptanalysis efforts are in progress, including side-channel attacks and effective attacks on versions of the cipher with a reduced number of rounds: 6, 7, 8, and 9 rounds [Ferguson et al 2000]. Other recent research: the Diagonal Fault Attack (DFA) [Saha et al 2009] shows how exploiting multiple byte faults in the state matrix leads to deduction of the AES key after inducing a random fault anywhere in one of the four diagonals of the state matrix at the input of the 8th round. Brute force attack complexity is about 2^32 operations if the fault stays within the diagonal.
Input: data block of 128 bits (in) = 4 * 32-bit words, using cipher keys K with lengths of either 128, 192, or 256 bits (Nk = 4, 6, or 8 32-bit words); Data processing: symmetric block cipher; the number of rounds Nr depends on the key size, see GetNumberOfRounds().
int GetNumberOfRounds (K) { if (K is 128-bit) { return 10; } else if (K is 192-bit) { return 12; } else if (K is 256-bit) { return 14; } }
The algorithm takes K and performs KeyExpansion() to create a key schedule w[]. It generates a total of 4(Nr+1) words: the algorithm requires an initial set of 4 words, and each of the Nr rounds requires 4 words of key data; the schedule consists of a linear array of 4-byte words, denoted [w_i], where 0 ≤ i < 4(Nr+1).
• SubWord() - function used by KeyExpansion() that takes a four-byte input word and applies an S-box to each of the four bytes to produce an output word;
• RotWord() - function used by KeyExpansion() that takes a four-byte word and performs a cyclic permutation;
KeyExpansion(byte key[4*Nk], word w[4*(Nr+1)], Nk) {
word temp; i = 0; while (i < Nk) { w[i] = word(key[4*i], key[4*i+1], key[4*i+2], key[4*i+3]); i++; } i = Nk; while (i < 4 * (Nr+1)) { temp = w[i-1]; if (i mod Nk = 0) { temp = SubWord(RotWord(temp)) xor Rcon[i/Nk]; } else if (Nk > 6 and i mod Nk = 4) { temp = SubWord(temp); } w[i] = w[i-Nk] xor temp; i++; } return w[]; }
Ciphering utilizes the following functions that operate on state: • SubBytes(state) - a non-linear byte substitution that operates independently on each byte of the State using a substitution table (S-box) • ShiftRows(state) - the bytes in the last three rows of the state are cyclically shifted over different numbers of bytes (offsets) • MixColumns(state) - operates on the state column-by-column, treating each column as a four-term polynomial • AddRoundKey(state) - simple bitwise XOR operation between RoundKey and state
Cipher(byte in[4*4], byte out[4*4], word w[4*(Nr+1)]) { Nr = GetNumberOfRounds(K); state= in; AddRoundKey(state, w[0, 3]); for (round = 1; round < Nr; round++) { state = SubBytes(state); state = ShiftRows(state); state = MixColumns(state); state = AddRoundKey(state, w[round*4, (round+1)*4-1]); } state = SubBytes(state); state = ShiftRows(state); state = AddRoundKey(state, w[Nr*4, (Nr+1)*4-1]); return (state); }
InvCipher(byte in[4*4], byte out[4*4], word w[4*(Nr+1)]) { state = in; AddRoundKey(state, w[Nr*4, (Nr+1)*4-1]); for (round = Nr-1; round > 0; round--) { InvShiftRows(state); InvSubBytes(state); AddRoundKey(state, w[round*4, (round+1)*4-1]); InvMixColumns(state); } InvShiftRows(state); InvSubBytes(state); AddRoundKey(state, w[0, 3]); return (state); }
Output: data block of 128 bits
Table 5. Authentication algorithms applicable for ESP [source: author]
MD5 [Rivest 1992] — Status: insecure (however popular); a successful differential attack on MD5 is known [Wang, Yu 2005], with a fast collision implementation [Stevens 2007] [HASHCLASH], and a method has been presented for creating two MD5-based X.509 certificates that have the same signatures but different public keys and Distinguished Name fields [Stevens et al 2006].
Input: message M of any length < 2^64 bits. Padding: blocks of 512 bits M(i) are sequentially processed when computing the message digest. If required, padding is applied before the operation: a "1" followed by "0"s followed by a 64-bit unsigned integer is appended to the end of the message to produce a padded message of length 512 * n bits. The 64-bit unsigned integer is the length in bits of the original message. Data processing: 64 operations, grouped in four rounds of 16 operations (based on non-linear functions F(), G(), H(), I() that operate on 32-bit values and produce 32-bit output, and left rotation) done on a 128-bit state divided into four 32-bit words A, B, C, D:
F(X,Y,Z) = (X AND Y) OR (~X AND Z); G(X,Y,Z) = (X AND Z) OR (Y AND ~Z) = F(Y,Z,X); H(X,Y,Z) = X XOR Y XOR Z; I(X,Y,Z) = Y XOR (X OR ~Z); rotateLeft32 (x, c) = (x << c) OR (x >> (32-c));
Initial values:
A = 0x 01 23 45 67; B = 0x 89 AB CD EF; C = 0x FE DC BA 98; D = 0x 76 54 32 10;
Per round shift amounts:
r[ 0..15] = {7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22}; r[16..31] = {5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20, 5, 9, 14, 20}; r[32..47] = {4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23}; r[48..63] = {6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21};
Main loop (for each 512-bit block broken into sixteen 32-bit words w[i] ):
a = A; b = B; c = C; d = D;
for (i=0; i<64; i++) {
  switch (i) {
    case 0..15: f = F(b, c, d); g = i; break;
    case 16..31: f = G(b, c, d); g = (5*i + 1) mod 16; break;
    case 32..47: f = H(b, c, d); g = (3*i + 5) mod 16; break;
    case 48..63: f = I(b, c, d); g = (7*i) mod 16; break;
  }
  temp = d; d = c; c = b;
  b = b + rotateLeft32((a + f + floor(abs(sin(i+1)) * 2^32) + w[g]), r[i]);
  a = temp;
}
A = (A + a) mod 2^32; B = (B + b) mod 2^32; C = (C + c) mod 2^32; D = (D + d) mod 2^32;
return (A B C D);
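The MD5 round structure can be cross-checked with a compact Python transcription, verified against the standard library's hashlib implementation. This is a sketch for illustration, not production code:

```python
import hashlib
import math
import struct

def rotl32(x, c):
    return ((x << c) | (x >> (32 - c))) & 0xFFFFFFFF

# Per-round shift amounts and the floor(abs(sin(i+1)) * 2^32) constants
S = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
K = [int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF for i in range(64)]

def md5(message: bytes) -> str:
    a0, b0, c0, d0 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476
    ml = len(message) * 8
    # Pad: "1" bit, zeros, then the original bit length as a little-endian 64-bit integer
    message += b"\x80" + b"\x00" * ((56 - (len(message) + 1) % 64) % 64)
    message += struct.pack("<Q", ml)
    for off in range(0, len(message), 64):
        w = struct.unpack("<16I", message[off:off + 64])
        a, b, c, d = a0, b0, c0, d0
        for i in range(64):
            if i < 16:
                f, g = (b & c) | (~b & d), i                 # F
            elif i < 32:
                f, g = (d & b) | (~d & c), (5 * i + 1) % 16  # G
            elif i < 48:
                f, g = b ^ c ^ d, (3 * i + 5) % 16           # H
            else:
                f, g = c ^ (b | ~d), (7 * i) % 16            # I
            f = (f + a + K[i] + w[g]) & 0xFFFFFFFF
            a, d, c, b = d, c, b, (b + rotl32(f, S[i])) & 0xFFFFFFFF
        a0 = (a0 + a) & 0xFFFFFFFF
        b0 = (b0 + b) & 0xFFFFFFFF
        c0 = (c0 + c) & 0xFFFFFFFF
        d0 = (d0 + d) & 0xFFFFFFFF
    # Output begins with the low-order byte of A (little-endian)
    return struct.pack("<4I", a0, b0, c0, d0).hex()
```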
Output: 128-bit message digest ABCD (begin with the low-order byte of A).
SHA1 [Eastlake, Jones 2003] — Status: partially insecure; several cryptanalysis methods have been presented, starting from collisions found in fewer than 2^80 operations for SHA1 with a reduced number of rounds (53 instead of 80) [Rijmen, Oswald 2005], through an algorithm for finding collisions in fewer than 2^69 operations for full SHA1 [Wang et al 2005], verified by [Cochran 2005] down to 2^63 operations.
Input: message of any length < 2^64 bits. Padding: blocks of 512 bits M(i) are sequentially processed when computing the message digest. If required, padding is applied before the operation: a "1" followed by "0"s followed by a 64-bit unsigned integer is appended to the end of the message to produce a padded message of length 512 * n bits. The 64-bit unsigned integer is the length in bits of the original message. Data processing: 80 rounds done for each M(i), with the use of logical functions F(), G(), H() (that operate on three 32-bit words and produce a 32-bit word as output) and 80 constant words k.
F(X,Y,Z) = (X AND Y) OR (~X AND Z); G(X,Y,Z) = X XOR Y XOR Z; H(X,Y,Z) = (X AND Y) OR (X AND Z) OR (Y AND Z);
Initial values:
H0 = 0x 67 45 23 01; H1 = 0x EF CD AB 89; H2 = 0x 98 BA DC FE; H3 = 0x 10 32 54 76; H4 = 0x C3 D2 E1 F0
Extend sixteen 32-bit words w[i] into eighty 32-bit words:
for (i=16; i<80; i++) { w(i) = (w(i-3) XOR w(i-8) XOR w(i-14) XOR w(i-16)) <<< 1; }
Main loop (for each 512-bit block M(i) broken into sixteen 32-bit words w[i]):
A = H0; B = H1; C = H2; D = H3; E = H4;
for (i=0; i<80; i++) {
  switch (i) {
    case 0..19: k = 0x 5A 82 79 99; f = F(B, C, D); break;
    case 20..39: k = 0x 6E D9 EB A1; f = G(B, C, D); break;
    case 40..59: k = 0x 8F 1B BC DC; f = H(B, C, D); break;
    case 60..79: k = 0x CA 62 C1 D6; f = G(B, C, D); break;
  }
  temp = (A <<< 5) + f + E + w(i) + k;
  E = D; D = C; C = B <<< 30; B = A; A = temp;
}
H0 += A; H1 += B; H2 += C; H3 += D; H4 += E;
return (H0 H1 H2 H3 H4);
Output: 160-bit message digest H0 H1 H2 H3 H4 (most significant byte first)
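The SHA1 scheme can likewise be transcribed into Python and verified against hashlib; note the 1-bit left rotation in the message-schedule expansion (RFC 3174), which distinguishes SHA-1 from the earlier SHA-0. Again, a sketch for illustration only:

```python
import hashlib
import struct

def rotl32(x, c):
    return ((x << c) | (x >> (32 - c))) & 0xFFFFFFFF

def sha1(message: bytes) -> str:
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    ml = len(message) * 8
    # Pad: "1" bit, zeros, then the original bit length as a big-endian 64-bit integer
    message += b"\x80" + b"\x00" * ((56 - (len(message) + 1) % 64) % 64)
    message += struct.pack(">Q", ml)
    for off in range(0, len(message), 64):
        w = list(struct.unpack(">16I", message[off:off + 64]))
        for i in range(16, 80):
            # The <<< 1 here is the SHA-1 fix missing from SHA-0
            w.append(rotl32(w[i-3] ^ w[i-8] ^ w[i-14] ^ w[i-16], 1))
        a, b, c, d, e = h
        for i in range(80):
            if i < 20:
                f, k = (b & c) | (~b & d), 0x5A827999        # F: choice
            elif i < 40:
                f, k = b ^ c ^ d, 0x6ED9EBA1                 # G: parity
            elif i < 60:
                f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC  # H: majority
            else:
                f, k = b ^ c ^ d, 0xCA62C1D6                 # G again
            a, b, c, d, e = ((rotl32(a, 5) + f + e + w[i] + k) & 0xFFFFFFFF,
                             a, rotl32(b, 30), c, d)
        h = [(x + y) & 0xFFFFFFFF for x, y in zip(h, [a, b, c, d, e])]
    return "".join(f"{x:08x}" for x in h)
```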
Table 6 presents the most popular cipher modes, with CBC mode as the primary recommended solution. Although ESP can be used to provide encryption or integrity protection (or both), ESP encryption should not be used without integrity protection, and the NULL-NULL combination is not allowed [Kent, Seo 2005]. Block ciphers must work in a mode because the input plaintext stream is divided into a successive sequence of blocks that match the required input block length of the cipher. The strength of the cipher may depend on the mode selected.
Table 6. Popular cipher modes [Knudsen 1998]
ECB - Electronic Code Book mode: The simplest mode to use with a block cipher; it encrypts each block independently [Knudsen 1998]. Advantages: simplicity. Disadvantages: the same plaintext always encrypts to the same ciphertext for the same key.
CFB - Cipher Feedback mode: A self-synchronizing stream cipher implemented from the block cipher; it utilizes an IV. Advantages: can encrypt pieces of data smaller than the block size, often configured to encrypt/decrypt 8 bits at a time (called CFB8). Disadvantages: efficiency - each time a piece of plaintext is encrypted, an entire block is encrypted by the underlying cipher, and each piece of decrypted ciphertext comes at the cost of an entire encrypted block (CFB8 is about 8 times slower than ECB or CBC for 64-bit blocks).
OFB - Output Feedback mode (in 8-bit ciphers): A more sophisticated CFB mode, differing in how the internal buffer is updated: when the internal buffer is shifted left, the space on the right side is filled with the leftmost bits of the encrypted buffer (instead of the ciphertext, as in CFB).
CBC - Cipher Block Chaining mode: [Bellare et al 2000] shows that CBC MAC [ISO/IEC9797] is secure if the underlying block cipher is secure. The article points out two weak sides of CBC from a resistance-to-forgery perspective: upper bounding the MAC insecurity of CBC, and the birthday attack [Preneel, van Oorschot 1995] used to create internal collisions. Advantages: better than ECB, since each plaintext block is XOR'ed with the previous ciphertext block; a random block - the IV - is placed as the first block, so the same block or message always encrypts to something different. Disadvantages: the IV must be transmitted with the ciphertext to anyone interested in decrypting it.
PCBC - Propagating CBC mode: A more sophisticated CBC mode - each plaintext block being encrypted is XOR'ed with both the previous plaintext block and the previous ciphertext block. Likewise, decrypted blocks are XOR'ed with the previous plaintext and ciphertext blocks.
CTR - Counter mode: A block cipher and a public function are used to generate a pseudorandom data sequence of block length [Diffie, Hellman 1979]; the ciphertext is the result of XOR'ing the CTR sequence with the block data. Advantages: any part of the ciphertext can be deciphered without all previous segments being deciphered, as required by CFB, CBC, and OFB (independence and simplicity). Disadvantages: no support for message integrity; error propagation (a bit-flip error in the ciphertext is visible also in the plaintext); stateful encryption (like CBC); sensitivity to usage errors (a counter value cannot be reused); interaction with weak ciphers (successive blocks CTR and CTR+1 usually have a small Hamming difference) [Lipmaa et al 2000].
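The chaining that gives CBC its advantage over ECB can be shown with a short sketch. The 8-byte XOR "cipher" below is a deliberately insecure stand-in for a real block cipher such as AES, used only so the mode structure itself is visible:

```python
BLOCK = 8  # toy block size in bytes

def toy_encrypt_block(block, key):
    # NOT a secure cipher -- XOR with the key, standing in for AES/3DES
    return bytes(a ^ b for a, b in zip(block, key))

toy_decrypt_block = toy_encrypt_block  # XOR is its own inverse

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_encrypt(plaintext, key, iv):
    assert len(plaintext) % BLOCK == 0, "padding must already be applied"
    out, prev = [], iv
    for i in range(0, len(plaintext), BLOCK):
        ct = toy_encrypt_block(xor(plaintext[i:i + BLOCK], prev), key)
        out.append(ct)
        prev = ct          # each ciphertext block chains into the next
    return b"".join(out)

def cbc_decrypt(ciphertext, key, iv):
    out, prev = [], iv
    for i in range(0, len(ciphertext), BLOCK):
        ct = ciphertext[i:i + BLOCK]
        out.append(xor(toy_decrypt_block(ct, key), prev))
        prev = ct
    return b"".join(out)
```

Encrypting two identical plaintext blocks produces two different ciphertext blocks, which is exactly the property ECB lacks.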
The figure shows SA establishment via two-phased IKEv2 (phase 1 prepares the IKEv2 phase 2 communication using asymmetric cryptography; phase 2 exchanges SPD and SAD data for the main ESP traffic), followed by ESP tunnel mode or ESP transport mode (original and modified) traffic between hosts A and B.
Figure 12. Overview of IPSec G2G communication with IKEv2 and ESP tunnel or transport mode for IPv4/IPv6 traffic between hosts A and B from different private network environments [source: author]
2.1.3. ESP performance
Outbound ESP packet processing in transport mode consists of the sender encapsulating the next layer protocol data between the newly created ESP header and the ESP trailer fields (Figure 11). In tunnel mode outbound ESP packet processing there are several ways of constructing the outer IP header. ESP processing is done only if the packet is associated with a flow that has an SA assigned. Then ESP Payload creation begins: in ESP transport mode the payload carries the original next layer protocol data, in ESP tunnel mode the entire original IP datagram. As far as fragmentation is concerned, ESP in transport mode can be applied only to whole IP packets, but ESP in tunnel mode may be applied to a packet that is a fragment of an IP datagram.
It may happen that for a given packet matching a given IPSec policy with an appropriate selector no valid SA is found. As a consequence an auditable IPSec event (OUT_SA_REQUIRE) is thrown (Table 7), which may be caught by the SA management protocol to negotiate valid SA data between the communication endpoints. In such a case the packet may be buffered by the IPSec GW and retransmitted after the SA becomes complete. The queue length for buffered packets per single SA strongly depends on the HW and SW capabilities of the IPSec GW; in general it is a few packets long, so long-term transmission without a negotiated SA will cause all packets to be dropped. What is more, an aging mechanism must be added to the packet buffering so that IPSec GW system resources (mainly operational memory) can be released if the SA is not negotiated within a given period of time.
The next step of packet processing is padding: optional TFC padding and encryption padding are added as a preliminary step to encryption. Then the packet is encrypted with the encryption key, algorithm, and mode read from the corresponding SA. Some algorithm modes require synchronization data (like the IV required by CBC) that is placed in the packet payload.
Then the ICV over the ESP packet is calculated. It covers: SPI, SN, Payload Data (ciphered), Padding (if present, ciphered), Pad Length (ciphered), and Next Header (ciphered) (Figure 10, Figure 11).
42 Performance and Security Testing for Improving Quality of Distributed Applications Working in Public/Private Network Environments
The SN that appears in the datagram must be created according to the SA rules: when the SA is created the SN is initialized to 0 and each packet increments the counter by 1, so the first transmitted packet has SN=1. The SN cannot cycle, so the last possible packet from the SN point of view ends the SA lifetime. An attempt to transmit a packet that would result in SN overflow is an auditable event (OUT_SEQ_OVERFLOW) (Table 7). This event must be monitored and caught by the SA management protocol (e.g. the IKEv2 block) to replace the expired SA with a renegotiated one. If ESN is used (a 64-bit value), only the low-order 32 bits are stored in the packet, but both sender and receiver maintain a 64-bit ESN counter per SA.

After ESP processing, IP fragmentation is performed if required [Postel 1981a]: the IPv4 stack creates a set of IP datagrams with adjusted OFFSET, MORE FRAGMENTS, and LAST FRAGMENT flags, while IPv6 adds an extension header, the Fragment Header [Deering, Hinden 1998], identified by a Next Header value of 44 in the immediately preceding header. However, it is confirmed that fragmentation significantly reduces IPSec traffic throughput [Kent 2005a], thus ESP implementations may disable fragmentation altogether. In such cases the DF=1 bit is set in transmitted packets to begin Path MTU discovery [Mogul, Deering 1990] [McCann et al 1996], and IPSec GW support for ICMP PMTU messages (ICMPv4 Destination Unreachable messages with a code meaning "fragmentation needed and DF set", ICMPv6 Packet Too Big error messages) [Postel 1981b] [Conta et al 2006] is required.

Inbound ESP packet processing begins with the reassembly of IP packets [Postel 1981a] [Deering, Hinden 1998]. IPv4 fragments are recognized by OFFSET=0 with the MORE FRAGMENTS flag set (the first fragment), OFFSET > 0 with the MORE FRAGMENTS flag set (the middle fragments), and OFFSET > 0 with the LAST FRAGMENT flag set (the last fragment).
The ESP processing unit must begin with a packet sanity check that includes reassembly state: IP fragments must be dropped and the appropriate IPSec event (IN_FRAG_DROP) (Table 7) raised. The next step after successful ESP pre-processing is the SA lookup, based on the SPI value read from the incoming packet. This action requires browsing the SAD. If the SA is successfully found, the ESP processing unit can read the required SA data: the flag indicating whether SN or ESN is checked, the decryption/deauthentication algorithms, and the decryption/deauthentication keys. If no SA is found, the packet is discarded and the appropriate IPSec event (IN_SPI_UNKNOWN_DROP) (Table 7) raised.

The next, optional step is SN / ESN verification. It must be stated in the SA definition that SN / ESN is checked and the SN / ESN anti-replay protection mechanism applied; otherwise this step is skipped. An enabled anti-replay service means that each received packet must not duplicate the SN or ESN of any packet already processed during this SA lifetime; any duplicate is dropped immediately (IN_ANTIREPLAY_DROP) (Table 7). The next step is packet verification against the Sliding Receiver Window (SRW). The right edge of the SRW represents the highest validated SN value received on this SA. Packets that contain an SN lower than the left edge of the SRW are dropped (IN_ANTIREPLAY_DROP). Packets falling within the SRW are checked against the list of packets already received within the SRW. Next, the integrity value check is done. The SRW is updated only if the integrity verification succeeds; otherwise the packet is dropped (IN_INTEGRITY_DROP) (Table 7). The fundamentals of anti-replay protection are: the receive packet counter for the SA is set to 0 when the SA is created, and the SA counter lifetime is 2^32-1 packets for SN and 2^64-1 for ESN. The SRW size should be chosen appropriately for the line transmit rate.
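The SRW check described above can be sketched with a bitmap that tracks which SNs inside the window have already been seen (a common implementation technique; class and method names are illustrative). Note that in a real ESP implementation the window is updated only after the integrity check succeeds; this sketch folds the check and the update into one step for brevity.

```python
class SlidingReceiverWindow:
    """Anti-replay check: the right edge is the highest SN validated so far;
    duplicates and packets left of the window are dropped."""

    def __init__(self, size: int = 64):
        self.size = size
        self.right = 0    # highest validated SN (right edge of the SRW)
        self.bitmap = 0   # bit i set => SN (right - i) was already received

    def check_and_update(self, sn: int) -> bool:
        if sn == 0:
            return False                       # SN 0 is never valid
        if sn > self.right:                    # packet advances the window
            shift = sn - self.right
            self.bitmap = ((self.bitmap << shift) | 1) & ((1 << self.size) - 1)
            self.right = sn
            return True
        if sn <= self.right - self.size:       # left of the window
            return False                       # IN_ANTIREPLAY_DROP
        bit = 1 << (self.right - sn)
        if self.bitmap & bit:
            return False                       # duplicate: IN_ANTIREPLAY_DROP
        self.bitmap |= bit                     # mark as seen, accept
        return True
```

For example, with a window of size 8 and right edge at SN=100, a packet with SN=92 is dropped (left of the window) while SN=93 is still acceptable if not yet seen.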
Table 7. IPSec auditable events [source: author]

Inbound events:
- IN_FRAG_DROP: SPI value, date/time received, Source Address, Destination Address, SN / ESN, and (in IPv6) the Flow ID
- IN_SPI_UNKNOWN_DROP: SPI value, date/time received, Source Address, Destination Address, SN / ESN, and (in IPv6) the Cleartext Flow ID
- IN_ANTIREPLAY_DROP: SPI value, date/time received, Source Address, Destination Address, SN / ESN, and (in IPv6) the Flow ID
- IN_INTEGRITY_DROP: SPI value, date/time received, Source Address, Destination Address, SN / ESN, and (in IPv6) the Flow ID
- IN_ICVVERIFY_DROP: SPI value, date/time received, Source Address, Destination Address, SN / ESN, and (for IPv6) the Cleartext Flow ID

Outbound events:
- OUT_SEQ_OVERFLOW: SPI value, current date/time, Source Address, Destination Address, and (in IPv6) the Cleartext Flow ID
- OUT_SA_REQUIRE: current date/time, Source Address, Destination Address, Source L4 port number, Destination L4 port number, and (in IPv6) the Cleartext Flow ID
The next step, ICV verification, is performed. If separate confidentiality and integrity algorithms are used, ICV' based on the integrity algorithm is computed over the packet minus the original ICV, and ICV' is compared to the ICV present in the datagram. If they match the packet is accepted, otherwise it is dropped and the IPSec event (IN_ICVVERIFY_DROP) (Table 7) is raised. Then the ESP processing block decrypts the packet: the ESP Payload Data, Padding, Pad Length, and Next Header fields. Some algorithms require additional synchronization data taken directly from the payload (IV) or created on-the-fly. Padding analysis is also included, as specified by the algorithm specification. Note that the integrity check must be completed before the decryption result is used, even if the two processes execute in parallel; in this case race conditions must be avoided. Then the Next Header field is analyzed. If Next Header=59 ("no next header"), the packet is silently discarded as a dummy packet; otherwise it is processed by the next block. After that, the reconstruction of the original IP' datagram is done by the receiver. For ESP transport mode, IP' is combined from the outer IP header plus the original next layer protocol data in the ESP Payload field. For ESP tunnel mode, the entire IP' datagram is taken from the ESP Payload field.
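The ICV computation and comparison above can be sketched for HMAC-SHA1-96, one commonly used ESP integrity algorithm (the full HMAC-SHA1 output truncated to 96 bits). Function names are illustrative; the comparison is constant-time to avoid leaking the match position.

```python
import hashlib
import hmac

def icv_hmac_sha1_96(auth_key: bytes, authed_part: bytes) -> bytes:
    """ICV as with HMAC-SHA1-96: full HMAC-SHA1 truncated to 96 bits
    (12 bytes). `authed_part` is the ESP packet from SPI through Next
    Header, i.e. everything except the ICV field itself."""
    return hmac.new(auth_key, authed_part, hashlib.sha1).digest()[:12]

def verify_icv(auth_key: bytes, packet_wo_icv: bytes, received_icv: bytes) -> bool:
    # constant-time comparison of computed ICV' against the received ICV
    return hmac.compare_digest(icv_hmac_sha1_96(auth_key, packet_wo_icv),
                               received_icv)
```

A packet is accepted only when verify_icv returns True; otherwise it is dropped and IN_ICVVERIFY_DROP would be raised.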
2.1.4. IKEv2 security and performance
IKEv2 is a protocol that precedes ESP operation. Its purpose is to negotiate, create, and manage SAs; however, SAs can also be created manually on both sides of the transmission channel without IKEv2, using values agreed upon in advance by both parties. The main disadvantage of manual SA creation is the impossibility of SA update, which excludes this approach from large-scale real-life VPNs. IKEv2 is designed to establish SAs dynamically, which is possible thanks to successful authentication between the two parties exchanging a shared secret. The first version, called
ISAKMP, was described in [Piper 1998], [Maughan et al 1998], and [Harkins, Carrel 1998], and was then replaced by IKEv2 [Kaufman et al 2005]. The versions are not backward compatible.

IKEv2 operates in Request/Response mode. Each request sent from the source address expects a corresponding response from the destination address; the pair forms an exchange. The first exchanges are IKE_SA_INIT and IKE_AUTH, followed by the CREATE_CHILD_SA and INFORMATIONAL exchanges (Table 8). A single IKE_SA_INIT exchange and a single IKE_AUTH exchange should be enough (a total of 4 messages) to establish the SAs: the IKE_SA and the first CHILD_SA. There may be more exchanges, but in any case all IKE_SA_INIT exchanges must complete first, then all IKE_AUTH exchanges. CREATE_CHILD_SA and INFORMATIONAL may then appear in any order.
Table 8. Phases of IKEv2 [Kaufman et al 2005]

1a. IKE_SA_INIT exchange: negotiates security parameters for the IKE_SA, sends nonces, and sends Diffie-Hellman (D-H) [Diffie, Hellman 1977] values.
1b. IKE_AUTH exchange: transmits identities and certificates, proves knowledge of the secrets corresponding to the two identities, and establishes an SA for the first (and often only) ESP CHILD_SA. The IKE_AUTH exchange is encrypted and integrity-protected with keys established through IKE_SA_INIT.
2. CREATE_CHILD_SA exchange (optional): creates a CHILD_SA. This exchange is encrypted with the use of keys negotiated in the IKE_SA_INIT and IKE_AUTH exchanges. May request new Key Exchange parameters.
3. INFORMATIONAL exchange (optional): deletes an SA, reports error conditions, checks if the responder is alive.
IKE_SA_INIT is begun by an initiator that sends the cryptographic algorithms it supports, its nonce, and the initiator's D-H value (Figure 13). The responder chooses from the initiator's list the cryptographic algorithms it also supports, completes the D-H value, adds its nonce and replies to the initiator, with an optional Digital Certificate (DC) request. This is the moment when the two communicating sides are able to create the same seed (known as SKeySeed) without contacting each other any more; from it all SA encryption (SKe) and integrity protection (SKa) keys are derived. For each direction a separate pair of SKa and SKe is computed. From this point on, further exchanges are encrypted.

In IKE_AUTH the initiator asserts its identity with the identification payload, proves knowledge of the secret corresponding to that identification, and integrity-protects the contents of the first message using the authentication payload. If requested in the previous message, it may also send its certificate (with public key) in the certificate payload and a list of certificate trust anchors in the certificate request payload. Then the responder asserts its identity with the identification payload, authenticates its identity, and protects the integrity of the message with the authentication payload; it may also include its certificate with public key. The IKE_AUTH exchange completes successfully if both sides verify that the received messages contain correct signatures and MACs, and that the identification payloads correspond to the keys used to generate the authentication payloads.

CREATE_CHILD_SA is an optional exchange, encrypted with the use of data negotiated in the IKE_SA_INIT and IKE_AUTH phases. The initiator sends a request that contains an SA offer, a nonce, and proposed traffic selectors. It may also contain an additional Key Exchange payload with a D-H value for negotiating stronger encryption algorithms. If the proposed data are not accepted, the exchange fails and must be retransmitted with different Key Exchange
data. A CREATE_CHILD_SA that rekeys an existing SA must contain a REKEY_SA payload that matches the SA being rekeyed. The responder of the CREATE_CHILD_SA replies with a message matching the request (the same message ID). If it accepts the initiator's SA offer, the responder must include the SA in the response. The response must also contain a Key Exchange payload with a D-H value if the initiator asked for it. If the initiator's proposal does not match the responder's cryptographic expectations, the request must be rejected. Traffic selectors of both initiator and responder may also be included in the responder's reply.
[Figure 13 shows the IKE_SA_INIT and IKE_AUTH message flow between Initiator and Responder: (1) IKE header, Nonce(i), KeyExchange(i), SA(i); (2) IKE header, Nonce(r), KeyExchange(r), SA(r), CertificateRequest; both sides then perform seed generation, after which traffic is encrypted; (3) IKE Header, Identity(i), Certificate; (4) IKE Header, Identity(r), Certificate.]

Figure 13. IKEv2 basic activity diagram - Request/Response architecture (based on [Kaufman et al 2005])
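The key property of IKE_SA_INIT, that both sides derive the same seed from the exchanged D-H values and nonces without any further messages, can be sketched with toy parameters. This is an illustration only: real IKEv2 uses standardized MODP/ECP groups and derives SKEYSEED with the negotiated prf (an HMAC), not the bare hash used here; all names are illustrative.

```python
import hashlib
import secrets

# Toy D-H group over a small Mersenne prime (NOT a real IKEv2 group).
P, G = 2**127 - 1, 3

def dh_keypair():
    """Return (private, public) = (x, g^x mod p)."""
    x = secrets.randbelow(P - 3) + 2
    return x, pow(G, x, P)

def skeyseed(shared: int, nonce_i: bytes, nonce_r: bytes) -> bytes:
    """Both peers derive the same seed from the D-H secret and both nonces,
    without contacting each other any more (simplified SKEYSEED stand-in)."""
    return hashlib.sha1(nonce_i + nonce_r + shared.to_bytes(16, "big")).digest()
```

After the two IKE_SA_INIT messages, the initiator computes pow(pub_r, x_i, P) and the responder pow(pub_i, x_r, P); both values are equal, so skeyseed() yields the same bytes on both sides, from which the per-direction SKa/SKe keys would then be derived.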
The INFORMATIONAL exchange is an optional communication between initiator and responder. It is encrypted with the IKEv2-negotiated algorithms and keys. Its goal is to inform the opposite side of the transmission about SA deletion, or simply to check whether the responder is still alive (an INFORMATIONAL exchange without payloads). Each INFORMATIONAL request must have its response; if none arrives, it is assumed that the message was lost and the request is retransmitted.
2.2. HTTPS-based distributed applications design
2.2.1. Introduction to HTTPS
Hypertext Transfer Protocol over Secure Socket Layer (HTTPS) [Rescorla 2000] is a URI scheme used to indicate a secure HTTP connection. It is syntactically identical to the http:// scheme normally used for accessing resources over HTTP. Using an https:// URL indicates that HTTP is to be used, but with a different default TCP port (443 instead of 80) and an additional encryption/authentication layer between HTTP and TCP. It takes advantage of a Separate Port Strategy (SPS), where the protocol designer assigns a different publicly well-known L4 port of the ISO OSI network model to the protocol, and the server implementer has the server listen both on the original port (80) and on the new secure port
(443). Any connections that arrive on the secure port are automatically SSL-negotiated before the main data transfer begins. A different approach, the Upward Negotiation Strategy (UNS), is also utilized by some implementations of SSL (e.g. SMTP over SSL): in this case the protocol designer modifies the application protocol to support a message indicating that one side would like to upgrade the communication channel to SSL. If the other side agrees, an SSL handshake starts and, if it completes successfully, the application messages resume over the new SSL channel.

HTTPS was initially designed by Netscape Communications Corporation [NETSCAPE] to provide authentication and encrypted communication, and is widely used on the World Wide Web (WWW) for security-sensitive communication such as payment transactions and corporate logons. In 1995 Netscape Communications Corporation introduced the SSL 2.0 protocol [Hickman, Kipp 1995] (SSL 1.0 was never widely deployed, as discussed in [Rescorla 2001]), which matured into the SSL 3.0 version [Freier 1996] that fixed a number of security problems in SSLv2 ([Goldberg, Wagner 1996] shows how to break an SSL 2.0 connection from Netscape Navigator 1.1 in under an hour) and supported a far greater number of algorithms than SSL 2.0. The work on HTTPS was continued by the Internet community and in 1999 resulted in the TLS 1.0 specification [Dierks, Allen 1999], revised in 2006 as TLS 1.1 [Dierks, Rescorla 2006] and finally replaced by TLS 1.2 [Dierks, Rescorla 2008] as the latest document. The differences between TLS 1.2 and SSL 3.0 are not dramatic, but they are significant enough that implementations of TLS 1.2 and SSL 3.0 do not interoperate. All SSL and TLS versions provide a secure channel between two communicating machines. SSL connections act similarly to secured TCP connections [Rescorla 2001].
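On the client side, the Separate Port Strategy reduces to a scheme-to-port mapping: the scheme alone decides both the default port and whether the connection is wrapped in TLS. A minimal sketch (the function name is illustrative):

```python
from urllib.parse import urlsplit

def effective_endpoint(url: str):
    """Return (host, port, use_tls) for an http/https URL, applying the
    Separate Port Strategy defaults: 80 for plain HTTP, 443 for HTTPS."""
    parts = urlsplit(url)
    use_tls = parts.scheme == "https"
    port = parts.port or (443 if use_tls else 80)  # explicit port wins
    return parts.hostname, port, use_tls

# effective_endpoint("https://example.com/login") -> ("example.com", 443, True)
```

An explicit port in the URL overrides the default, but the TLS decision still follows the scheme, which is exactly what distinguishes SPS from UNS, where the upgrade is negotiated in-band.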
In order to accommodate connections from clients that do not use SSL, a server must typically be prepared to accept both secured and non-secured versions of the application protocol. A Web server accepting HTTPS requires a Public Key certificate. This certificate must be signed by a Certificate Authority (CA) using a Digital Signature (DS) mechanism to prove that the certificate holder is who it claims to be. HTTPS may also be used to authenticate client browsers in order to restrict access to a Web server to authorized users only. In that case certificates are created for each user and loaded into their browsers. The most common data in such a certificate are the name and email address of the authorized user. The certificate is checked each time the client reconnects to the Web server.

In the majority of cases two factors influence the level of HTTPS security: the client web browser and web server implementations, and the cryptographic algorithm used. It must be emphasized that HTTPS is a protocol that protects data from eavesdropping and man-in-the-middle attacks in transit only. Data processed at the source or destination points are only as secure as the machines themselves. A crucial part of the HTTPS implementation, SSL, has no knowledge of higher-level protocols. As a consequence, SSL servers can present only one certificate per IP address and L4 port pair, and name-based virtual hosting is not feasible with HTTPS. Some efforts to change this limitation have been made [Dierks, Rescorla 2008], resulting in the latest TLS 1.2 specification (see also: chapter 2.4).

The goals of the TLS protocol, in order of priority, are as follows: a) Cryptographic security: TLS should be used to establish a secure connection between two parties; b) Interoperability: a well defined API allows developing applications utilizing TLS that can successfully exchange cryptographic parameters without knowledge of one another's source code;
c) Extensibility: TLS seeks to provide a framework into which new public key and bulk encryption methods can be incorporated as necessary. This also accomplishes two sub-goals: preventing the need to create a new protocol (and risking the introduction of possible new weaknesses) and avoiding the need to implement an entire new security library; d) Relative efficiency: in general cryptographic operations are highly CPU-intensive, particularly asymmetric cryptography based on public key operations. For this reason, the TLS protocol has incorporated an optional session caching scheme to reduce the number of connections that need to be established from scratch. Additionally, care has been taken to reduce network activity.

TLS is composed of two layers: the TLS Record Protocol (TLS-RP) and the TLS Handshake Protocol (TLS-HSP). At the lowest level, layered on top of some reliable transport protocol (e.g. TCP [Postel 1981b]), is the TLS-RP. The TLS-RP provides connection security that has two basic properties: a) Privacy: symmetric cryptography is used for data encryption (e.g. [AES]). The keys for this symmetric encryption are generated uniquely for each connection and are based on a secret negotiated by another protocol (such as the TLS-HSP). The Record Protocol can also be used without encryption; b) Reliability: message transport includes a message integrity check using a keyed MAC. Secure hash functions (e.g. SHA1 [Eastlake, Jones 2003]) are used for MAC computations. The TLS-RP can operate without a MAC, but this mode is generally used only while another protocol is using the Record Protocol as a transport for negotiating security parameters. The TLS-RP is used for encapsulation of various higher-level protocols.
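The keyed MAC of the TLS-RP can be sketched as follows. The construction mirrors the TLS 1.0/1.1 record MAC, an HMAC computed over the 64-bit record sequence number, the record type, the protocol version, the fragment length, and the fragment itself; the sequence number in the MAC is what gives the record layer its replay protection. Names are illustrative, not a real TLS library API.

```python
import hashlib
import hmac
import struct

def tls_record_mac(mac_key: bytes, seq_num: int, ctype: int,
                   version: tuple, fragment: bytes) -> bytes:
    """Keyed MAC over a TLS record, TLS 1.0/1.1 style: HMAC-SHA1 over the
    implicit 64-bit sequence number, record type, version, length, and
    the record fragment."""
    header = (struct.pack("!Q", seq_num)      # 64-bit sequence number
              + bytes([ctype])                # record type (23 = app data)
              + bytes(version)                # e.g. (3, 1) for TLS 1.0
              + struct.pack("!H", len(fragment)))
    return hmac.new(mac_key, header + fragment, hashlib.sha1).digest()
```

Because the sequence number is part of the MAC input, a replayed record (same bytes, different position in the stream) fails verification even though its fragment is unchanged.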
One such encapsulated protocol, the TLS-HSP, is designed for the server and client to authenticate each other and to negotiate an encryption algorithm and cryptographic keys before the application protocol transmits or receives its first byte of data. The TLS-HSP provides connection security that has three basic properties: a) the peer's identity can be authenticated using asymmetric cryptography ([RSA], [DSS]); this is optional but generally required for at least one of the peers; b) the negotiation of a shared secret is secure: the negotiated secret is unavailable to eavesdroppers, and for any authenticated connection the secret cannot be obtained even by an attacker who can place himself in the middle of the connection; c) the negotiation is reliable: no attacker can modify the negotiation communication without being detected by the parties to the communication.

One advantage of TLS is that it is application protocol independent. Higher-level protocols can layer on top of the TLS protocol transparently [Ford-Hutchinson 2005]. The TLS standard, however, does not specify how protocols add security with TLS; the decisions on how to initiate TLS handshaking and how to interpret the authentication certificates exchanged are left to the judgment of the designers and implementers of protocols that run on top of TLS.

Figure 14 depicts the basic SSL activity diagram. Within the first phase (TLS-HSP) the Initiator sends a Hello(i) message to the Responder, which begins the communication. It contains the Initiator's proposed cryptographic parameters (cipher, mode, key length) and a random value used in key generation.
The Responder answers with a Hello(r) message (it selects a cipher from the set proposed by the Initiator and contains the Responder's random number), sends its Certificate(r) (containing the Responder's Public Key), optionally asks for the Initiator's certificate (sending CertificateRequest(r)), and finalizes this state with HelloDone(r) to indicate that no more messages are going to be sent at this moment by the Responder, making the Initiator speak again. Next, the Initiator sends a ClientKeyExchange(i) message which contains a
randomly generated key encrypted using the Responder's Public Key. If the Responder sent a CertificateRequest(r) message, the Initiator answers with Certificate(i) (with the Initiator's Public Key) and CertificateVerify(i) (a string signed with the Private Key associated with the Certificate(i)). Then the Initiator sends a ChangeCipherSpec(i) message which indicates that all further messages will be encrypted using the currently negotiated cipher. The Initiator ends this phase with Finish(i). Next, the Responder sends its ChangeCipherSpec(r) message and Finish(r) to indicate that the secure connection has been established.
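The message ordering described above can be captured as a subsequence check: the mandatory messages must appear in a fixed relative order, while the optional client-certificate messages may be interleaved. The message names are the ones used in the text; the function and list are illustrative, not a real TLS library API.

```python
# Mandatory TLS-HSP messages in their required relative order.
EXPECTED = ["Hello(i)", "Hello(r)", "Certificate(r)", "HelloDone(r)",
            "ClientKeyExchange(i)", "ChangeCipherSpec(i)", "Finish(i)",
            "ChangeCipherSpec(r)", "Finish(r)"]

def handshake_in_order(messages) -> bool:
    """True if the mandatory messages appear as a subsequence of the
    transcript; optional messages (CertificateRequest(r), Certificate(i),
    CertificateVerify(i)) are simply skipped over."""
    it = iter(messages)
    return all(any(m == want for m in it) for want in EXPECTED)
```

A transcript that includes the optional CertificateRequest(r)/Certificate(i)/CertificateVerify(i) messages in their proper places still passes, while one that, e.g., swaps Hello(i) and Hello(r) fails.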
[Figure 14 shows the TLS-HSP message flow between Initiator and Responder: Hello(i); Hello(r); Certificate(r); (optional) CertificateRequest(r); HelloDone(r); ClientKeyExchange(i); (optional) Certificate(i); (optional) CertificateVerify(i); ChangeCipherSpec(i); Finish(i); ChangeCipherSpec(r); Finish(r); application data exchange; CloseNotify(i). Both sides generate the pre_master_secret during key negotiation, and messages after ChangeCipherSpec are encrypted.]

Figure 14. Basic SSL activity diagram - Request/Response architecture [based on Rescorla 2000]

Then the Initiator and Responder exchange encrypted application data. The connection is shut down by the Initiator with CloseNotify(i) followed by a TCP FIN; the Responder answers with a TCP FIN. TLS can work with Kerberos [Miller et al 1987], a symmetric-cryptography-based authentication system with a trusted entity, the Ticket Granting Server (TGS), that shares a ticket with any Initiator that requests it. The ticket contains a session key, encrypted with the target's shared key, sent later to the target. If Kerberos is used with TLS [Medvinsky, Hur 1999], the ClientKeyExchange(i) message contains both the encrypted pre_master_secret (encrypted with the TGS ticket's shared key) and the TGS ticket itself. The shared key in the ticket is used to decrypt the pre_master_secret and the TLS-HSP continues as in Figure 14.
2.2.2. HTTPS security
SSL-based solutions are placed on top of the TCP/IP stack: non-reliable IP and reliable TCP with retransmissions if no TCP ACK is received by the Initiator from the Responder, with
congestion avoidance mechanisms and other DiffServ phenomena included. This means that network layer specific disturbances affect higher level protocols, including SSL (e.g. a notoriously lost TCP connection causes the SSL-based session to time out and require re-establishment). Efficient and successful SSL-based communication is possible only when the SSL session key sent from the Initiator to the Responder over a public and unreliable network is received without errors at the destination point. Truncated data mean that the message cannot be deciphered, causing retransmission, higher bandwidth utilization, and latency for distributed application processing. Services based on SSL will likely perform a series of retries with appropriate timeouts in between to rescue the connection.

SSL was designed with several cryptanalysis attack classes in mind, including: Known Plaintext Attack (the attacker knows the Plaintext corresponding to the Ciphertext), Ciphertext Only Attack (the attacker does not know the Plaintext), Truncation Attack (the attacker convinces one side or both that there is less data than there actually was), Substitution Attack (data are manipulated between connection end-points), and Replay Attack (the attacker can take a record off the wire and send it again to the receiver). An SSL-based implementation's resistance to Known Plaintext Attack and Ciphertext Only Attack is covered if strong encryption algorithms are applied to secure the communication; "strong" means using a cipher without efficient cryptanalysis methods along with long and strong encryption keys. Substitution Attack protection is available to SSL with the use of a DC. A man-in-the-middle attack (a class of Substitution Attack) [Saltzman, Sharabani 2009] is very expensive if security policies are not violated (the DC is generated with strong encryption and the Private Key remains secret).
[CAB 2007] presents in detail the requirements for the issuance and management of extended validation certificates for CAs and clients. During the TLS-HSP the server's DC, signed by a CA, is sent to the client. The client verifies the DC by checking the signature to see that the issuer of the DC is someone the client trusts. The name of the DC issuer is contained in the DC, as well as the name of the server and the server's Public Key. The server encrypts something with its Private Key that can be decrypted with the Public Key included in the DC.

Replay Attack protection is achieved by SSL with the use of a Nonce: a one-time unique number representing the connection ID. This sequence number is increased and never duplicated as long as the communication goes on. An SSL implementation must have the ability to store the current session so that the sequence number is always available. As for IPSec, sequence number 0 is forbidden. [Pimentel et al 2007] sheds more light on security protocols from the interleaving Replay Attack perspective in general, proposing a set of methods for patching faulty security protocols in this matter.

HTTPS-based solutions must be aware of possible timing cryptanalysis. [Kocher 1996] shows a cryptanalysis technique built on the observation that cryptographic operations take varying amounts of time to complete depending on the data and keys used. By carefully measuring the amount of time required to perform private key operations, attackers may be able to find fixed D-H exponents, factor RSA keys, and break other cryptosystems. A possible attack against an HTTPS-based implementation would concentrate on examining system response times for handcrafted requests and taking advantage of them. Possible methods of preventing such attacks are: masking timing characteristics (making all operations take exactly the same amount of time), making timing measurements inaccurate, or using techniques with blinding signatures.
Returning results after a pre-specified time is not recommended: it is slow (every operation must take as long as the slowest operation) and hard to achieve (because of compiler optimizations, RAM cache hits, and instruction timing). Making measurements inaccurate can be supported by adding a random value to each processing time; however, this may be compensated by the attacker by collecting more results and applying a statistical filter. Blinding signatures [Chaum 1983] look like the best solution: the timing characteristics collected by the attacker then carry minimal useful knowledge.

HTTPS supports protection against the Downgrade Attack class [Rescorla 2000]. SSL connection initiation to the server is expected by the client: using https:// means using SSL, and the only possibility for the attacker is to generate errors. In some specific cases the client may be asked to retry with http:// instead of https:// if the HTTPS connection fails. The second aspect of the Downgrade Attack, as far as HTTPS is concerned, is related to SSL security parameters. SSLv2 has no protection against an attacker downgrading the connection to a weaker algorithm. SSLv3 and TLS fix this issue (including downgrading to SSLv2): the negotiated security properties of a given connection cannot be downgraded.

HTTPS design supports end-point authentication. For requests coming from the server, the client should be able to extract the hostname (dNSName preferred, then Common Name) of the server and check its origin against the server's DC. The client's certificate may be used as well to check the client's origin if the connection properties say so. However, a mismatch between the certificate and the server's expected identity does not necessarily cause the HTTPS connection to be terminated. It has been observed that an instant connection drop is found very irritating by users: in most cases a mismatch is caused by a configuration error, not an active attack. To meet end-user expectations HTTPS supports several alternatives for handling this case. The client should first be notified about the mismatch. An automated client must then log the error and should terminate the connection.
Other HTTPS clients may proceed with the communication (the mismatch is ignored), retry with HTTP (not recommended; in most cases accepting an out-of-date certificate is better than using Cleartext [Rescorla 2001]), or close the connection. The best solution is to apply the most adequate policy that takes security requirements into consideration. It should also be mentioned that end-point authentication strongly depends on the reference being safely delivered to the client. If the communication channel over which the reference is sent is not trusted, it is possible to mount a successful attack that fakes the page containing the reference.

HTTPS connection closure happens when either side sends close_notify(). It can optionally wait for a close_notify() from the responder. An SSL session may be resumed if the responder's close_notify() has not been received yet, otherwise not. An HTTPS connection may also be closed by a TCP FIN (called a premature close) (for analysis of the Truncation Attack see: chapter 3.8). Despite a strong fence of protection against such attacks, it may be useful to simulate them against an SSL-based implementation acting in distributed SW environments (see: chapter 3.8).

The work of [Jackson, Barth 2008] shows that HTTPS security is in trouble because of misconfigured servers or improper use of this protocol. Recent studies [Kleinjung et al 2010] demonstrate factorization of a test 768-bit RSA modulus (232 decimal digits) with the General Number Field Sieve (GNFS) algorithm (the most efficient known algorithm for factoring large integers (# of decimal digits > 110) implementable on classic computers [Pomerance 1996]), making a 768-bit RSA key unsafe for sensitive information; a 1024-bit or 2048-bit key length is now recommended. A 1024-bit RSA modulus is still about one thousand times harder to factor than a 768-bit one; however, it is expected to be broken within the next decade by means of an academic effort (6 months on 80 processors for polynomial selection (3% of the overall task), then sieving on many hundreds of machines for 2 years, then a couple of weeks for preparing the sieving data for the matrix step, and lastly a few hours for the final step) on the same scale as the effort presented in [Kleinjung et al 2010]. Constantly improving cryptanalysis potential is one of the crucial factors to be taken into consideration when making decisions about distributed application security.
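The RSA blinding countermeasure mentioned earlier, and the modular arithmetic that factoring attacks ultimately target, can be illustrated with deliberately tiny, insecure textbook parameters (n = 61 * 53 = 3233, e = 17, d = 2753). The point of blinding is that the signer only ever exponentiates a blinded value, so its private-key timing cannot be correlated with the message m.

```python
# Toy RSA parameters: n = 61 * 53, e * d = 1 (mod phi(n)). Insecure,
# for illustration only.
N, E, D = 3233, 17, 2753

def blinded_sign(m: int, r: int) -> int:
    """Chaum-style RSA blinding: the requester blinds m with a random r
    coprime to N, the signer signs the blinded value, and the requester
    unblinds the result, recovering m^d mod N."""
    blinded = (m * pow(r, E, N)) % N       # requester blinds the message
    s_blind = pow(blinded, D, N)           # signer signs blindly
    return (s_blind * pow(r, -1, N)) % N   # requester unblinds the signature
```

The unblinded result equals the direct signature pow(m, D, N), because r^(e*d) = r (mod N) cancels against the multiplied-in r^(-1).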
2.2.3. HTTPS performance
From the performance point of view, HTTPS processing on the server side must be constructed wisely: the TLS-HSP must be placed in a child process/thread of the server rather than in the parent process/thread (Figure 15) in order to avoid a system bottleneck (as stated above, SSL asymmetric computations are highly processing expensive). TCP accepting should be left in the parent process to ensure that the connection is established before passing it to the resource-consuming SSL processing part.

do { // parent: server process
    // TCP accepting
    if ((s = accept(socket, 0, 0)) < 0) {
        err_exit("TCP connection cannot be accepted");
    }
    if ((thread = CreateThread()) != 0) {
        // parent continues accepting; the child owns the socket
        close(s);
    } else { // child: per-client processing
        // SSL accepting
        sbio = BIO_new_socket(s, BIO_NOCLOSE);
        ssl = SSL_new(ctx);
        SSL_set_bio(ssl, sbio, sbio);
        if ((r = SSL_accept(ssl)) <= 0) {
            berr_exit("SSL connection cannot be accepted");
        }
        // client processing
        ...
    }
} while (true);

Figure 15. Pseudocode of an OpenSSL multiprocess server on a Windows machine [based on Rescorla 2001]
A very important performance aspect of SSL-based processing is SSL session caching. Session cache data must be updated when a new session is created, and a session must be marked as non-resumable when it is closed. In the case of threads, a shared-memory technique with adequate locking/unlocking algorithms can be applied. For processes there are several approaches, including storing the data in a flat disk file (this approach may suffer reliability issues under heavy load) or a separate Session Server whose role is to store all SSL session data, accessible via an inter-process communication mechanism served by specialized server functionality. SSL session records may also be stored in a DB rather than in a flat file. For example, mod_ssl [MOD_SSL] offers four types of caching in its SSLSessionCache directive: disabled session caching, use of a DBM hashfile on the local disk to synchronize the local OpenSSL [OPENSSL] memory caches of the server processes, use of a high-performance cyclic buffer inside a shared memory segment in RAM (established via
/path/to/datafile) to synchronize the local OpenSSL memory caches of the server processes (recommended), and finally support for distributed session caching libraries. [Gupta et al 2005] presents the architecture and implementation of an HTTPS-capable server designed for highly constrained embedded devices. The implemented HTTPS stack needs less than 4 KB of RAM, operates with an Elliptic Curve Cryptography (ECC) enabled Mozilla Web browser, performs a full TLS-HSP in less than 4 seconds (only 2 seconds when the session D-H value is reused), and transfers application data over SSL at 450 Bps. The presented solution opens new areas of interest for SSL-secured solutions, including personal medical devices and cheap home devices. [Filjar, Desic 2004] presents a location-based service, built on an accurate position reporting system, with an HTTPS-like communication channel between the rover and the location server. The solution allows secure application data transfer with an acceptable communication performance trade-off.
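The session-cache bookkeeping described above (create, look up, mark non-resumable on close, with locking for concurrent access) can be sketched as follows. This is a toy in-memory model of the idea behind mod_ssl's shared session cache; the class name, default lifetime, and eviction behavior are illustrative assumptions, not mod_ssl's implementation:

```python
import threading
import time

class SSLSessionCache:
    """Toy thread-safe SSL session cache sketching the locking and
    lifetime handling discussed in the text (illustrative only)."""

    def __init__(self, lifetime_s=300):
        self._lock = threading.Lock()
        self._store = {}          # session_id -> (session_data, expiry time)
        self._lifetime = lifetime_s

    def add(self, session_id, session_data):
        # Called when a new SSL session is created.
        with self._lock:
            self._store[session_id] = (session_data, time.time() + self._lifetime)

    def lookup(self, session_id):
        """Return cached session data, or None if absent or expired."""
        with self._lock:
            entry = self._store.get(session_id)
            if entry is None:
                return None
            data, expiry = entry
            if time.time() > expiry:
                del self._store[session_id]   # expired: non-resumable
                return None
            return data

    def invalidate(self, session_id):
        # Called on session close: the session must no longer be resumable.
        with self._lock:
            self._store.pop(session_id, None)

cache = SSLSessionCache()
cache.add(b"\x01\x02", {"master_secret": b"..."})
assert cache.lookup(b"\x01\x02") is not None
cache.invalidate(b"\x01\x02")
assert cache.lookup(b"\x01\x02") is None
```

A production cache would additionally live in shared memory (or a session server) so that all worker processes, not only threads, can resume sessions.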
2.3. Distributed applications working in IPSec/HTTPS environments
The very first step of performance and security analysis is to model the SW system with its key parameters identified. It is crucial for the model to be as close to reality as possible so that it realistically reflects the true system behavior. On the other hand, the model must be simple enough (Occam's razor) that the collection of the necessary data is feasible. IPSec- and HTTPS-based distributed applications work mainly in two architectures: Request/Response (R/R) and Publish/Subscribe (P/S).
2.3.1. Request/Response (R/R) solution
In the R/R architecture one program asks the other for any new information that has arrived since the last time it asked, by sending a Request message and expecting a corresponding Response message (Figure 16) [Krawczyk, Barylski 2009a] [Krawczyk, Barylski 2009b]. A handshake between client and server to establish a secure communication channel (e.g. SA establishment) is a prelude to the main application data traffic.
[Figure 16 diagram: Client and Server first perform an SA handshake for further secure communication; the Client then creates a Request (using the SA, encrypting the message), the Server receives and analyzes it, and the Server creates a Response (using the SA, decrypting the message) which the Client receives.]
Figure 16. Activity diagram of R/R model with emphasized parts responsible for secure network communication [source: author]
The main advantage of this solution is its simplicity, which directly lowers the chance of a potential SW defect. On the other hand, it causes high mean communication channel utilization: every message exchange between client and server must be initiated by a request message. As a result, the performance of R/R is not optimal from the application point of view. The R/R approach defines a message flow that consists of a Request followed by a Response. It is the responsibility of the requester to ensure reliability: if the Response is not received by the requester within a timeout interval, the Request must be retransmitted or the connection is abandoned.
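The timeout-and-retransmit responsibility of the requester described above can be sketched as follows; the function names, retry limit, and the simulated lossy transport are illustrative assumptions, not part of any concrete protocol:

```python
def request_response(send, max_retries=3):
    """Toy R/R requester: retransmit the Request if no Response arrives
    (send returns None on timeout), abandon after max_retries attempts."""
    for attempt in range(1, max_retries + 1):
        response = send(b"REQUEST")
        if response is not None:
            return response, attempt
    raise ConnectionError(f"connection abandoned after {max_retries} retries")

# Simulated lossy transport: times out twice, then answers.
calls = {"n": 0}
def lossy_send(request):
    calls["n"] += 1
    return b"RESPONSE" if calls["n"] >= 3 else None

resp, attempts = request_response(lossy_send)
print(resp, attempts)   # b'RESPONSE' 3
```

In a real client, `send` would wrap a socket write plus a blocking read with a timeout; the retry loop and the abandon condition stay the same.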
As far as performance is concerned, the foundation of an R/R solution is the right balance between Request and Response message sizes and frequencies. The cumulative throughput (size multiplied by frequency) of the acknowledgement traffic (not related to the main application data stream) should be as low as possible in comparison to the main application data stream throughput. For instance, when the Client is a continuous multimedia data stream source (a camera), the Request (containing the multimedia data) should utilize as much available throughput as possible, while the Response of the Server (the recipient of the multimedia data for further analysis) should be just a short acknowledgement to the Client that the data has been received successfully, or a short control message indicating an exceptional condition (e.g. truncated data, bad CRC, a flow control message, etc.). In terms of the security incorporated into an R/R solution, the overhead of IPSec and IKEv2 is as follows: the need for SA renegotiation via two-phased IKEv2 if the previous SA has expired (SA lifetime expiry, SA kilobytes expiry, or SA sequence number overflow), the additional bytes of the ESP header, and the recommended HMAC-SHA1 authentication at the end of the datagram. In comparison to IP traffic over Ethernet without IPSec, it is obvious that less application data is sent over the wire for the same Ethernet frame length. However, if both Client and Server incorporate IPSec with AES-CBC-256 encryption and HMAC-SHA1 authentication with long keys known to these parties only, the communication security increases significantly. What is more, [Sung, Lin 2008] reports that an IEEE 802.11b access point can support one IPSec RTP stream fewer than the original RTP streams while maintaining a given packet loss rate, which illustrates that the IPSec overhead is real but not serious. On the other hand, in the same article, in terms of the latency of n RTP streams, the impact of IPSec is more visible: for n < 15, IPSec introduces a 9.26% latency overhead due to the insignificant queuing effect that packet processing causes. For 15 ≤ n ≤ 20 the IPSec latency increases by as much as 570.97% due to heavy packet retransmission. For n > 20 the system saturates for both original and IPSec streams, producing only a 4.38% overhead of IPSec over the cleartext solution.
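The per-packet byte overhead of ESP with AES-CBC and HMAC-SHA1 mentioned above can be approximated with the usual field sizes (ESP header 8 B, AES IV 16 B, HMAC-SHA1-96 ICV 12 B, 2-byte trailer, padding to the 16-byte AES block). This is a back-of-the-envelope sketch only; tunnel mode would additionally add an outer IP header, which is ignored here:

```python
def esp_goodput_ratio(payload_len, block=16, esp_hdr=8, iv=16, icv=12):
    """Approximate ratio of application bytes to ESP bytes on the wire
    (transport-mode view; outer headers ignored for simplicity)."""
    trailer = 2  # pad-length + next-header bytes
    # Pad payload + trailer up to the cipher block size.
    padded = -(-(payload_len + trailer) // block) * block
    wire = esp_hdr + iv + padded + icv
    return payload_len / wire

for size in (64, 512, 1400):
    print(f"{size:5d} B payload -> {esp_goodput_ratio(size):.0%} goodput")
```

The sketch shows the qualitative point made in the text: the relative ESP overhead is heavy for small packets (roughly half the bytes for a 64 B payload) and modest for large, media-sized packets.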
2.3.2. Publish/Subscribe (P/S) solution
The P/S approach [Eisenhauer et al 2006] expands and optimizes communication channel utilization in comparison to R/R. The client program, commonly named the Subscriber, registers an interest in certain data (topic-based or content-based) with a server program [Krawczyk, Barylski 2009a] [Krawczyk, Barylski 2009b]. Successful registration causes the server, named the Publisher, to asynchronously send new information to the Subscriber each time it is ready (Figure 17).
[Figure 17 diagram: the Client/Subscriber subscribes via the Broker to services published by the Server/Publisher; certificates of origin (Client, Broker, Server) and optional ACKs secure the exchange, and the Broker stores and forwards the subscribed service output to the Client.]
Figure 17. Activity diagram of the P/S model with a broker between client and server; the broker's role is to store-and-forward the subscribed data [source: author]
The main advantage of the P/S cooperation model over R/R is the reduction of bandwidth requirements: the client program is no longer constantly asking for new data. The server program sends data changes for a specific point only to those clients that have registered for exceptions on that point. The data is not delayed by polling cycles. IP multicast can decrease network traffic by sending the data from the publisher to the subscribers with a single message on the wire. HTTPS used in the P/S approach incorporates the benefits of SSL, which allows managing accessibility with certificates and PKI (Public Key Infrastructure). It slows down the browsing of WWW resources (additional datagrams must be sent) but makes it harder to steal HTTP session variables or fool user authorization mechanisms over the network [Krawczyk, Barylski 2009b].
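The topic-based registration and asynchronous delivery of Figure 17 can be sketched as a minimal broker; the class and topic names are illustrative, and delivery is synchronous here only for brevity (a real broker would queue and forward asynchronously, possibly over IP multicast):

```python
from collections import defaultdict

class Broker:
    """Minimal topic-based store-and-forward broker sketching the P/S flow
    of Figure 17 (illustrative; no security or persistence)."""

    def __init__(self):
        self._subs = defaultdict(list)   # topic -> subscriber callbacks

    def subscribe(self, topic, callback):
        # Subscriber registers an interest in a topic.
        self._subs[topic].append(callback)

    def publish(self, topic, data):
        # Publisher pushes new data; only registered subscribers receive it.
        for cb in self._subs[topic]:
            cb(data)

broker = Broker()
received = []
broker.subscribe("fire-detected", received.append)
broker.publish("fire-detected", {"camera": 3, "confidence": 0.97})
broker.publish("smoke-detected", {"camera": 1})   # no subscriber registered
print(received)
```

Note how the second publish reaches nobody: unlike R/R polling, traffic flows only where an interest has been registered.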
2.3.3. Concept of a secure service processing continuous multimedia data
On-the-fly analysis of multimedia flows containing high-quality voice and video data is still a challenge [Krawczyk, Barylski 2009b]. Firstly, a system designed for such a target requires a wide infrastructure of reliable multimedia stream sources, then a network (wired or wireless) capable of secure high-bandwidth data transmission to the system headquarters, where the streams are analyzed and classified and a quick but correct decision process is performed.
Secondly, there are interested parties that would like to receive the results of the multimedia stream processing, but with a certain level of reliability, if possible without the need for human assistance and without false alarms. On the other hand, such analysis and classification is strongly demanded by many business needs. It covers observation of manufacturing processes, security monitoring, detection of fire, flood, or other natural disasters, potential crime scene monitoring, traffic measurements, and so on. Such a system is irreplaceable for supporting mass entertainment events, including sport events, being able to detect hooligan behavior, glass breaking, smoke, fire, shouts for help in many languages, and more. A simple secure service processing continuous traffic, composed of R/R and P/S components and acting as an on-the-fly multimedia flow processing system, is presented in Figure 18. Interception of the continuous stream of high-bandwidth data is done in an R/R manner, enforced with symmetric encryption on the Client's side (step 1) and decryption on the Server's side (step 3). Successful reception of a bunch of data is acknowledged by the Server to the Client (step 2).
[Figure 18 diagram: R/R components (steps 1-3) establish an SA and carry the encrypted continuous data stream from Client to Server; analysis of the stream produces a result message handed to the P/S components, where the Publisher digests, signs, encrypts, and wraps the message under a random session key (steps 4-8) and the Subscriber unwraps the session key with his private key, decrypts, and verifies the signature (steps 9-13).]
Figure 18. Secure service processing continuous traffic [source: author]
Continuous data received by the Server is analyzed by a specialized processing unit, resulting in a set of result messages (e.g. containing classification results, detection of dangerous events, etc.) passed to the P/S components. The result message is encrypted by the Publisher (steps 4, 5, 6, and 7), then transmitted over a public network (step 8) to the Subscriber, then decrypted and verified (steps 9, 10, 11, 12, and 13), and finally utilized by the Subscriber. In detail, the system processing consists of the following steps (Figure 18):
1. The Client produces a continuous stream of data that is encrypted with the use of the SA negotiated between Client and Server;
2. The encrypted data stream is transmitted between the Private Networks over the Public Network;
3. The Server decrypts the continuous stream of data with the use of the matching SA; the results are passed further for content analysis by the Server;
4. The Server-produced result is passed to the Service Publisher; the Publisher computes the digest over the result message;
5. The Publisher signs the message digest and attaches the resulting DS plus his DC to the message;
6. The Publisher produces a Random Session Key and uses it to encrypt the signed message, DC, and DS;
7. The Publisher encrypts the Session Key under the Subscriber Public Key and attaches the wrapped Random Session Key to the message;
8. The message leaves the Publisher Private Network, travels via an unknown Public Network, and arrives at the Subscriber Private Network;
9. The Subscriber uses his Private Key to decrypt the Session Key;
10. The Subscriber uses the Session Key to decrypt the message, DC, and DS;
11. The Subscriber computes the message digest of the message himself;
12. The Subscriber verifies the Publisher DC and extracts the Publisher Public Key;
13. The Subscriber uses the Publisher Public Key to verify the Publisher DS. If verification succeeds, the message is utilized by the Subscriber.
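Steps 4-13 form a classic sign-then-hybrid-encrypt flow. The sketch below keeps the structure of the steps but substitutes toy primitives: SHA-256 for the digest, HMAC under a shared key as a stand-in for the RSA digital signature (so it is NOT a real public-key signature), and XOR as a stand-in for both the symmetric cipher and the session-key wrap. It illustrates the message layout only, not real cryptography:

```python
import hashlib
import hmac
import os

def xor(data, key):
    """Toy cipher standing in for AES (steps 6, 10) and the RSA wrap (steps 7, 9)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def publisher_send(message, sign_key, subscriber_wrap_key):
    digest = hashlib.sha256(message).digest()                       # step 4
    ds = hmac.new(sign_key, digest, hashlib.sha256).digest()        # step 5 (stand-in DS)
    session_key = os.urandom(16)                                    # step 6
    ciphertext = xor(message + ds, session_key)                     # step 6 (encrypt message+DS)
    wrapped = xor(session_key, subscriber_wrap_key)                 # step 7 (wrap session key)
    return wrapped, ciphertext                                      # step 8 (send)

def subscriber_receive(wrapped, ciphertext, sign_key, subscriber_wrap_key):
    session_key = xor(wrapped, subscriber_wrap_key)                 # step 9
    plain = xor(ciphertext, session_key)                            # step 10
    message, ds = plain[:-32], plain[-32:]                          # split off 32-byte DS
    digest = hashlib.sha256(message).digest()                       # step 11
    expected = hmac.new(sign_key, digest, hashlib.sha256).digest()  # steps 12-13
    if not hmac.compare_digest(ds, expected):
        raise ValueError("signature verification failed")
    return message

sign_key, wrap_key = b"publisher-key", b"subscriber-key"
w, c = publisher_send(b"fire detected, camera 3", sign_key, wrap_key)
print(subscriber_receive(w, c, sign_key, wrap_key))  # b'fire detected, camera 3'
```

In the real system the DS/DC pair would use RSA with the Publisher certificate, and the wrap would use the Subscriber's actual RSA public key; the layering of digest, signature, session key, and wrapped key is what the sketch preserves.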
From a high-level perspective the proposed system is built from three main components: Web Cam Client Network (WCCN), Multimedia Flow Processing Engine (MFPE), and End-Client Network (ECN) (Figure 19).
[Figure 19 diagram: 1..n End-Client Network (ECN) instances communicate in P/S manner with one Multimedia Flow Processing Engine (MFPE), which in turn communicates in R/R manner with 1..n Web Cam Client Network (WCCN) instances.]
Figure 19. Multimedia flow processing system overview [Krawczyk, Barylski 2009b]
The WCCN consists of a set of Web Cam Clients (Figure 21 b)), organized as a group of small Private Networks capable of capturing voice and video data at runtime and reporting the results over the Public Network to the non-stop listening MFPE services. The Clients cooperate with the MFPE in an R/R manner: the multimedia flows are captured by the Web Cam Clients, initially processed (flow standardization to a multimedia format recognized by the MFPE), and transmitted as a sequence of subsequent requests to the MFPE. The MFPE acknowledges the successful reception of each flow piece with short response messages. Lack of an acknowledgement means the corresponding data must be retransmitted. The ECN is a network of thin end-clients, strongly interested in the MFPE processing results, communicated to them over the Public Network by Apache Tomcat + JSP. These Clients act in the P/S architecture: they register for certain results of multimedia flow processing (e.g. detection of fire) that are published by the MFPE as soon as the multimedia flow classification produces output.
The MFPE is a supercomputer capable of fast-enough and accurate classification of the multimedia flows received from the WCCN, constantly updating the final results matrix available to the ECN, stored in a data repository (JDBC + MySQL) with a Web Services infrastructure. The MFPE must be equipped with a multimedia processing queue able to store the received data without loss. Unquestionably, the MFPE needs to be a powerful, multithreaded and multicore processing unit. The MFPE is based on the Java Media Framework (JMF), which enables the playback and transmission of Real-time Transport Protocol (RTP) streams through the APIs defined in the javax.media.rtp, javax.media.rtp.event, and javax.media.rtp.rtcp packages (Figure 20).
[Figure 20 diagram: on the transmitting Private Network a multimedia capture device feeds a data source through a processor to a session manager and network interface; on the receiving Private Network, across the Public Network, a session manager delivers data sources to players, processors, a console, and data sinks (file, DBMS).]
Figure 20. RTP multimedia streams transmission and reception [Krawczyk, Barylski 2009b]
The JMF RTP APIs are designed to work seamlessly with the capture, presentation, and processing capabilities of JMF. Players and processors are used to present and manipulate RTP media streams just like any other media content. Media streams that have been captured from a local capture device using a capture DataSource or that have been stored to a file using a DataSink can be transmitted. Similarly, JMF can be extended to support additional RTP formats and payloads through the standard plug-in mechanism. SessionManager is used to coordinate an RTP session. The session manager keeps track of the session participants and the streams that are being transmitted. It maintains the state of the session as viewed from the local participant. In effect, a session manager is a local representation of a distributed entity, the RTP session. It also handles the RTCP control channel, and supports RTCP for both senders and receivers. The SessionManager interface defines methods that enable an application to initialize and start participating in a session, remove individual streams created by the application, and close the entire session. Several RTP-specific events are defined in javax.media.rtp.event . These events are used to report on the state of the RTP session and streams. The streams within an RTP session are represented by RTPStream objects. There are two types of RTPStreams: ReceiveStream and SendStream . Each RTP stream has a buffer data source associated with it. For ReceiveStreams , this DataSource is always a PushBufferDataSource . The session manager automatically constructs new receive streams
as it detects additional streams arriving from remote participants. New send streams are constructed by calling createSendStream on the session manager. To implement a custom packetizer or depacketizer, the JMF Codec interface is implemented.
[Figure 21 diagram: a) operating layers of a Web Cam Client, from the Web Cam HW and firmware through multimedia stream encoding (encoding library), IPv4 with IPSec ESP (SPD, SAD), down to the network PHY layer; b) deployment: three Web Cam Client Networks with cameras (measurement point A) behind their IPSec GWs (point B) exchange IKEv2-negotiated ESP traffic, data-stream Requests, and ACK Responses over the Public Network with the MFPE-side IPSec GW and TCP listener (point C), protected by network traffic filtering and a firewall.]
Figure 21. System R/R components: a) with operating layers; b) deployment diagram with 3 measurement points [Krawczyk, Barylski 2009b]
The WCCN is a set of Web Cam Clients. Each Web Cam Client is built from camera HW and firmware, able to continuously capture, pack into the appropriate encoding format, and transmit the results to its destination end-point (Web Cam Server = MFPE) as soon as possible. A piece of the WCCN may consist of one or more Web Cams. The transmission from the WCCN to the MFPE happens over the Public Network (Ethernet + IPv4 + TCP), which is insecure, available to eavesdroppers, susceptible to forgery and data manipulation, and open to hackers and DoS attacks. To secure the communication, the IPSec ESP + IKEv2 mechanism is incorporated. The edge of the WCCN is equipped with an IPSec GW with an appropriate SPD, maintaining the SAs within the SAD and handling SA lifetime, SA expiration events, SA SN overflow, and SA renegotiation via two-phased IKEv2 (Figure 21 a)). On the opposite communication side (MFPE) stands the IPSec GW able to capture and decrypt the flows received from the WCCN. The SPD and SAD on both end-points must be synchronized by IKEv2. For security reasons, to protect the data, the pair of HMAC-SHA1 + AES-CBC-256 algorithms is used, with the setkey infrastructure incorporated.
[Figure 22 diagram: data from the Public Network enters through an entrance module to a normalizer (data format normalization) and a decision system performing class selection and classification (point D) against a class repository; Web Services for image processing results (point E) expose a relational DBMS; the MFPE publishes the results (point F, step 1), clients subscribe for classification results (step 2), and a thin Web end-client (point G) receives the desired data (step 3) via HTTPS requests/responses through the client and server SSL and presentation layers.]
Figure 22. System P/S components: deployment diagram with 4 measurement points [Krawczyk, Barylski 2009b]
Implementation of the system is an essential part of the research related to verifying the testing model, based on security and performance testing, for improving the quality of distributed applications working in public-private network infrastructures. The model's fundamentals are the security and performance metrics gathered at the system's critical points. There are seven critical points, A, B, C, D, E, F, and G, identified in the discussed system architecture that should be monitored, situated along the process flow from the multimedia stream source to the system end-client presentation. Point A is the Web Cam Client, which is exposed to intensive HW resource consumption. Then there is the IPSec GW of the WCCN (point B), responsible for securing the data, with the IPSec ESP and IKEv2 processing. On the opposite side of the ESP tunnel there is point C: the destination IPSec GW that handles a bunch of IPSec connections; efficient SPD and SAD processing is required there. Point D is the multimedia stream processing layer, the backbone of the system. The results of the analysis are placed in the distributed data repository with an appropriate Web Services layer exposing the available methods (point E). Then the results are published, becoming available to the subscribers (point F, measured as the latency from the time when a disaster happens to the time when the classification results are published). Point G, the last one, is the thin end-client, able to assess the system performance from the end-user point of view.
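The seven measurement points can be captured in a simple per-point metrics structure; the descriptions below paraphrase the text, while the recorder class and its API are illustrative assumptions about how such monitoring could be organized:

```python
import statistics

MEASUREMENT_POINTS = {
    "A": "Web Cam Client HW resource consumption",
    "B": "WCCN IPSec GW (ESP + IKEv2 processing)",
    "C": "Destination IPSec GW (SPD/SAD processing)",
    "D": "Multimedia stream processing layer",
    "E": "Data repository / Web Services layer",
    "F": "Publication latency (event to published result)",
    "G": "Thin end-client (end-user perspective)",
}

class Monitor:
    """Toy per-point sample recorder for the A-G measurement points."""
    def __init__(self):
        self.samples = {p: [] for p in MEASUREMENT_POINTS}

    def record(self, point, value):
        self.samples[point].append(value)

    def mean(self, point):
        return statistics.mean(self.samples[point])

mon = Monitor()
for latency_ms in (12.0, 15.0, 9.0):
    mon.record("F", latency_ms)
print(f"F mean latency: {mon.mean('F'):.1f} ms")
```

Aggregating such samples per point is the raw input for the security and performance metrics the testing model builds on.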
2.3.4. Security and performance of continuous multimedia streams distribution
[Parks et al 1999] briefly discusses the security implications of different techniques used in adaptive multimedia stream (audio and video) distribution. Three main approaches are used: transcoding, simulcasting, and layered coding. Their role is to provide fair distribution of multimedia streams among all receivers, however with different costs in both security and performance (Table 9). Transcoders [Amir et al 1995] translate high-rate media streams into low-rate media streams in order to accommodate low-capacity receivers with the use of more compact encoding (Figure 23). High-capacity networks are not affected; they receive the same stream quality as the source generates. The transcoder, placed at the boundary of the low- and high-capacity networks, is a small private network responsible for decrypting, re-encoding, and re-encrypting the multimedia stream, thus two independently established secure connections (represented by two IPSec SAs) are required: SA 1 on the high-capacity network and SA 2 on the low-capacity network. The main advantage of transcoding is its natural flexibility in terms of coding standards and support for different client configurations (e.g. a dedicated transcoder to serve clients behind firewalls). The R/R approach is naturally present in the transcoding flow.
[Figure 23 diagram: the multimedia stream source (a high-capacity private network) exchanges Requests/Responses with the transcoder private network over the high-capacity public network under SA 1 (before the transcoder); the transcoder serves the low-capacity private network over the low-capacity public network under SA 2 (after the transcoder).]
Figure 23. Multimedia stream transcoder in R/R architecture [source: author]
Simulcasting [Li, Ammar 1996] is based on the idea of transmitting a high-rate and an additional low-rate multimedia stream. Each client (Subscriber) must subscribe to the server (Publisher) (P/S approach) to receive the desired level of stream quality (Figure 24). A single security channel (IPSec SA) is required to support all clients. Simulcasting is a simplified transcoding solution where the transcoder is located in the multimedia stream source, producing different streams on the Subscriber's demand. The overhead of simulcasting is observed in high-capacity networks joined with low-capacity clients: such networks must transport both low-rate and high-rate traffic even if the low-rate traffic is redundant.
[Figure 24 diagram: the multimedia stream source publishes over a single SA from source to destination; the high-capacity private network subscribes to high quality over the high-capacity public network, while the low-capacity private network subscribes to low quality over the low-capacity public network.]
Figure 24. Multimedia stream simulcasting in P/S architecture [source: author]
Layered coding (Figure 25) [Shacham 1992] [Haskell, Messerschmitt 1994] [McCanne et al 1996] [Wu et al 1997] seems to be the best option from both the security and the performance perspective (Table 9). It eliminates one simulcasting disadvantage (network bandwidth is not lost on redundant stream transmission), because every stream (called a layer in the layered coding approach) provides cumulative information, and receiving more streams provides progressively better media quality. Each client (Subscriber) may intelligently adjust its received media quality by joining/leaving the multicast group that transmits the corresponding multimedia layer, ordered from the most basic one (the most crucial multimedia data) to the most sophisticated one (details). The layers can be totally independent; if so, network bandwidth is optimally used, because no redundant multimedia data is present in any of the layers. However, this means that any data lost or malformed in a lower layer potentially prevents the use of the higher-layer data: the final multimedia stream is a composition of all successfully received layers, starting from the bottom.
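The cumulative-layer property and its failure mode (a lost lower layer invalidates everything above it) can be sketched as follows; the layer count and the linear quality scale are illustrative simplifications:

```python
def usable_quality(received_layers, total_layers=4):
    """Layered coding: quality is cumulative, but a layer is usable only if
    every layer below it arrived intact; a gap invalidates all higher layers."""
    usable = 0
    for layer in range(total_layers):       # layer 0 = base, 3 = finest detail
        if layer in received_layers:
            usable += 1
        else:
            break   # missing lower layer: higher layers are useless
    return usable / total_layers

print(usable_quality({0, 1, 2, 3}))  # 1.0  all layers -> full quality
print(usable_quality({0, 1, 3}))     # 0.5  layer 2 lost -> layer 3 wasted
print(usable_quality({1, 2, 3}))     # 0.0  base layer lost -> nothing playable
```

The second and third cases illustrate the "ineffective in lossy public networks" entry of Table 9: the bandwidth spent on layers above a gap is consumed but yields no quality.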
[Figure 25 diagram: as in simulcasting, the multimedia stream source publishes over a single SA from source to destination; the high-capacity private network subscribes to high quality and the low-capacity private network to low quality, but here the quality levels are cumulative layers rather than separate streams.]
Figure 25. Multimedia stream layered coding in P/S architecture [source: author]
[Yu et al 2005] shows that network heterogeneity and robust, reliable multimedia stream transmission are the most challenging aspects of multimedia streaming over a public network. Layered coding is designed to solve the client heterogeneity problem, and Multiple Description Coding (MDC) [Goyal 2001], designed to fragment a single multimedia stream into many independent substreams, is an effective method for robust transmission: packet loss or network congestion in "best-effort" networks will only cause a temporary loss of quality (as long as it does not affect the base stream). [Chakareski et al 2005] discusses
the performance aspects of layered coding with MDC, showing that there is a large variation in relative performance between multiple description coding and layered coding depending on the employed transmission scheme. The best performance is achieved by layered coding with a packet transmission schedule optimized in a rate-distortion sense. [Liu et al 2007] proves that layered video provides incentives in P2P live streaming, in which a peer contributing more uplink bandwidth receives more layers and consequently better video quality.
Table 9. Security and performance trade-offs of multimedia stream distribution techniques [Parks et al 1999]

Transcoding
  Performance considerations:
  − Computationally expensive, because the transcoder must decrypt & decode multimedia streams to re-encode & re-encrypt them in a different format;
  − Multiple transcoders on the stream path may cause forwarding loops, causing latency issues;
  + Efficient if deployed at the boundary of high-capacity and low-capacity regions of the network.
  Security considerations:
  − Does not preserve multimedia traffic E2E confidentiality, authentication, or integrity (transcoders must decrypt, translate, and re-encrypt any media stream);
  + Provides better anonymity (translated media streams originate at the transcoder rather than the original sending host).

Simulcasting
  Performance considerations:
  − Wastes both network bandwidth and computational resources in high-capacity networks: both low-rate and high-rate streams are transmitted but only the high-rate ones are utilized;
  + Less computational sacrifice in comparison to transcoding, because decrypting/encrypting is not required.
  Security considerations:
  − Provides worse anonymity than transcoding (media streams originate at the sender);
  + Preserves multimedia traffic E2E confidentiality, authentication, and integrity (data is not modified in transit).

Layered coding
  Performance considerations:
  − Ineffective in lossy public networks, especially if all layers are independent: if any packet from the lowest layer is lost or corrupted, the packets from higher layers are useless and still consume network bandwidth until the lowest layer is completed;
  + Multimedia data streams provide cumulative information: the more streams are received, the better the media quality played; bandwidth utilization and computational expense are fair.
  Security considerations:
  − Provides worse anonymity than transcoding (media streams originate at the sender);
  + Preserves multimedia traffic E2E confidentiality, authentication, and integrity (data is not modified in transit).
2.4. Summary
Security and performance aspects of SW applications are certainly not limited to applications working solely in distributed environments; they relate to any kind of operational SW component. However, it is the distributed environment, composed of several relatively small private networks (hosting application components) interconnected with each other by a huge public network, where design and implementation for security and performance especially count. Generally speaking, the SW security bar is expected to be as high as possible, while SW performance is expected to be no worse than the minimum acceptance level. Both the IPSec and SSL technologies provide such a reasonable solution (see the experiments in chapter 5).
CHAPTER 3. SELECTION OF SECURITY AND PERFORMANCE TESTING PROCEDURES
This chapter begins with an introduction to general testing, and then characterizes performance and security testing of distributed applications. It presents the testing methodology and a comparison of test algorithms, followed by a description and classification of testing tools and testing frameworks. The data given in this chapter are the building blocks of MA2QA, a method described in the next chapter. Other unique products of this chapter are: the Intelligent Throughput Measurement Method, a classification of performance and security defects, and detailed diagrams of test environments.
3.1. The gist of quality control
Every informatics enterprise is born when the system stakeholders constitute its root requirements. The root requirements, being the most important ones, set up the goal of the system. A well-defined goal is measurable, eligible, and possible to achieve. Together with the rest of the requirements and the project limitations, grouped into sets with given priorities, ordered, classified, and analyzed, they build a System Requirement Document (SRD) [Górski et al 2000] [Barylski, Barylski 2007] (Figure 26).
[Figure 26 diagram: system stakeholders define the goal of the system; requirements, limitations, opportunities, and standards feed the System Requirement Document (SRD); the system produced on the base of SRD ends in success or failure.]
Figure 26. Informatics enterprise from requirements definition to final solution – how can we indicate its success or failure? [Barylski, Barylski 2007]

In mature organizations, the SRD, produced according to SW engineering Best Known Methods (BKMs), is the starting point for assessing whether the final product meets the end user's requirements (the subject that will use the SW) and fulfils the client's expectations (the subject that paid for the SW). The SRD may be a formal specification (a hard-copy document within CMMI-certified organizations [CMMI 2006]), a set of scenarios (for Agile methodology [AGILE 2001]), or in any other format suitable for the chosen SW development process. The assessment is not an easy step; thus, in order to accomplish this task, organizations establish a group of experts (e.g. called the Quality Management and Control Team (QMCT)), responsible for SW quality definition, measurement, and continuous control. The main roles of such a team are as follows:
• Continuously monitor the fulfillment of SRD statements in order to increase the chance of full client and end-user satisfaction when the SW is released to the customer;
• Continuously provide feedback to the Product Development Team (PDT) about product "health", its metrics, test results, and test execution progress;
• Enforce the backbone of the SW development process by introducing desired standards, goal-oriented processes, and continuous SW quality control into the product lifecycle;
• Continuously develop internal mechanisms of quality management to improve organization agility and maturity, strongly demanded if technical, economic, legal, human, or any other factors influencing the SW development process come into play in later project phases.
QMCT takes advantage of many quality instruments (e.g. formal: audits, formal inspections of documentation or source code, mathematical proofs of correctness, certification; experimental: internal testing, field experiments), SW metrics (e.g. number of requirements in SRD, number of tests derived from SRD, number of defects found during test execution) and tools (e.g. SRD repository, test case repository, bug tracking tools, quality management tools) in order to meet the expectations. It is essential to recognize early whether the project is heading for success or not. A BKM is to define an appropriate assessment procedure, based on project success indicators Ii, continuously harvested at each project phase. Table 10 presents a list of the most common success indicators of a SW development enterprise monitored by QMCT.
Table 10. Sample list of SW development enterprise success indicators [Barylski, Barylski 2007]

I1 – Percentage of implemented requirements from SRD: value from 0% to 100%, indicating how many requirements from SRD are met for a given HW & SW set. To check experimentally whether a requirement is fulfilled, an appropriate test must be designed, implemented, and executed. Formally, a proof of correctness must be presented (rarely).
I2 – Number of SW defects discovered in the given SW build: a positive integer indicating how many bugs were found on a given SW build. Bugs are found during test execution loops or are reported directly by the customer. Each bug has many attributes like: severity, title, steps to reproduce, etc.
I3 – Test coverage for a given SW build: value from 0% to 100%, indicating the percentage of tests that were created on the base of SRD and are already executed on the given SW build.
I4 – Percentage of passing acceptance tests: value from 0% to 100%, indicating the percentage of passing tests, which impacts the client's decision whether a SW build is accepted or not. The acceptance test list is derived from the most important requirements from SRD. It is worth mentioning that some minor defects may be present in the SW without influencing acceptance test results.
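The indicator definitions above can be turned into a small computation. The sketch below is illustrative only; the type and function names (ProjectSnapshot, success_indicators) are assumptions, not tooling described in the dissertation.

```python
# Hedged sketch: computing the success indicators I1..I4 of Table 10
# from raw project counts. All names are illustrative.
from dataclasses import dataclass


@dataclass
class ProjectSnapshot:
    requirements_total: int   # requirements in SRD
    requirements_met: int     # requirements confirmed by a passing test
    defects_found: int        # bugs found on the given SW build
    tests_total: int          # tests derived from SRD
    tests_executed: int       # tests already run on the given SW build
    acceptance_total: int     # acceptance tests derived from SRD
    acceptance_passed: int


def success_indicators(s: ProjectSnapshot) -> dict:
    """Return I1..I4; percentages are in the range 0.0-100.0."""
    pct = lambda part, whole: 100.0 * part / whole if whole else 0.0
    return {
        "I1_requirements_implemented_pct": pct(s.requirements_met, s.requirements_total),
        "I2_defects": s.defects_found,
        "I3_test_coverage_pct": pct(s.tests_executed, s.tests_total),
        "I4_acceptance_passing_pct": pct(s.acceptance_passed, s.acceptance_total),
    }
```

Such a helper would typically be run by QMCT on every SW build, feeding the assessment procedure mentioned above.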
A typical project lifecycle consists of 5 phases (Figure 27): exploration, architecting, designing, implementation, and maintenance. It starts when the system goal is born, and it reaches its End Of Life (EOL) when the system goal vanishes. Within each phase both SW development and testing activities are performed, resulting in products (high level and low level design specifications, list of test cases, list of SW metrics) exchanged between PDT and QMCT. In the implementation and maintenance phases, the direct inputs of QMCT to PDT are SW bugs
submitted on a SW build delivered by PDT, connected with test cases created on the base of SRD, for which PDT creates SW fixes or patches. QMCT efficiency is powered by test tool capabilities and the quality of test cases. Two approaches to test case implementation are common: Test Driven Development (TDD) [Ambler 2007] and Test After Development (TAD). TDD requires QMCT to create the test before PDT code is available, so that PDT can confront its work against it, perform SW refactorings improving SW design without changing the code's semantics, and stop development when the test passes. TAD requires less timing discipline from QMCT – tests are implemented in parallel with PDT work or just after it. There are several important milestones in the typical SW development lifecycle when QMCT and PDT synchronization is strongly demanded. The first one is the end (Figure 27, A) of the exploration phase when SRD is ready (or at least a stable version is available) – after that both teams start their architecting and designing activities. The next milestone (Figure 27, B) is the time when QMCT starts official testing against PDT output. From the customer perspective the most important is the moment when SW is released (Figure 27, C) – just before it a set of acceptance tests is run against the SW release candidate by joint effort of PDT and QMCT.
[Figure 27 diagram: the five phases (exploration, architecting, designing, implementation, maintenance) with customer input and customer changes, SW previews and SW releases; the products of SW development (SW architecture, high and low level design, SW builds, SW patches) and of test activities (test environment architecture, test cases, test tools, test metrics, lists of SW bugs); milestones A, B, C marked on the timeline.]

Figure 27. Overview of typical project lifecycle with fundamental testing activities included, with 3 important project milestones checked in [source: author]
For instance, in the implementation phase, the following success indicators may be monitored: the percentage of requirements implemented from SRD (I1), the number of defects found by tests derived from SRD in a given SW build (I2), and the percentage of test coverage for a given SW build (I3). The implementation phase is considered successfully completed if I1=100%, I2=0, and I3=100% [Barylski, Barylski 2007]. At the product delivery checkpoint, after the implementation phase but before the maintenance phase, it is strongly suggested to execute the full set of acceptance tests on the release SW build version, measuring the percentage of passing acceptance tests (I4) and the percentage of acceptance test
coverage (I5). If I4=100% and I5=100%, the probability of full client acceptance when the SW is delivered increases. Security Testing and Performance Testing are the pieces that, along with other testing techniques (Table 11), build a testing strategy, commonly called a SW Test Plan (STP), described formally in a Test Plan Document (TPD).

Table 11. Common STP test categories [source: author]

Unit testing [UT STD1008-1987] – Commonly not a part of the QMCT supervised area – rather left solely to PDT, however it may be included in the STP; it is a verification and validation method that examines individual SLOCs (method, class) in isolation from other components, often using mocks or stubs to simulate interactions external to the examined SLOCs (if necessary). Unit tests represent very basic requirements that SLOCs must satisfy.
Conformance testing [Lam 2001] – Also called functional testing. It is a set of verification and validation methods that deal with basic requirements taken from SRD, representing common SW use cases. Passing conformance tests (after Unit Testing) are the prognostic for more sophisticated testing techniques (e.g. Performance Testing).
Positive testing – In opposition to Negative Testing, it is a validation and verification method examining only the "sunny path" of the SW. It concentrates on test scenarios that obtain from the SW the results it was intended to produce, without unexpected interactions, program flows, or corner cases.
Negative testing – In opposition to Positive Testing, it is a validation and verification method that deals with any possible SW interaction, focusing mainly on SW corner cases, inadequate input type, length, or timing, possible arithmetic overflows, exceptions, errors, and any harmful activity.
Smoke testing [McConnell 1996] [Zhao, Shum 2006] – It is a method that uses a very limited but representative set of test cases executed against SW builds of minor importance in order to verify whether the main functionality of the SW remains unchanged.
Performance testing – See chapter 3.2.
Security testing – See chapter 3.6.
Exploratory testing [Kaner 2006a] [Kaner 2006b] – It is a method of manual SW testing taking advantage of tester intelligence and intuition, so that the tests are continuously optimized and may follow an unpredictable path in order to find a defect.
Manual testing [Ciupa et al 2008] [Geras 2008] – In opposition to Automated Testing, it is a method of SW validation and verification fully performed by a human who follows the consecutive test steps.
Automated testing [Pettichord 1999] – In opposition to Manual Testing, it is a method of SW validation and verification with the use of computer support (test scripting languages (e.g. Expect, TCL, Perl, PHP, Ruby), test tools (for examples see chapters 3.5, 3.8)), where human interaction with the test is limited to a minimum. In most cases automated tests are controlled within a test framework that allows test definition, scheduling, execution, test results reporting, and test results storage for archival purposes.
The next sub-chapters of the dissertation deal in detail with Performance Testing (see: 3.2, 3.3, 3.4, Appendix A) and Security Testing (see: 3.5, 3.6, Appendix B).
3.2. Fundamentals of SW performance testing
The performance tests that are run against a distributed SW application may be characterized as long-term volume functional tests with a limited set of passing conditions [Krawczyk, Barylski 2009a]. The goal of these tests is to observe how the system behaves in stressful conditions and what its volume limitations are, rather than to characterize test case corner cases. The most popular performance test classes [Barylski 2008] are presented in Table 12.

Table 12. Performance test classes [Barylski 2008]

Test classes:
- Positive performance tests: volume functional tests to hit system border values.
- Stress / soak / load tests: extended conformance tests to exercise the most stressful conditions and volume environmental influence.
- Stability tests: long-term tests to observe memory leaks, resource exhaustion, system degradation, reboots.
- Negative long-term tests: Denial of Service (DoS) tests, malformed transmission tests.
- Regression conformance tests: a limited set of tests derived from all conformance tests, representatives to verify if basic functionality works as expected.

Examples of performance metrics:
- Application layer: 1. Maximum number of concurrent virtual users per second [# users/s]; 2. Transaction throughput [# transactions/s]; 3. Asynchronous message delivery throughput [# messages/s].
- Middleware layer: 4. Thread pool size; 5. DB connection pool size; 6. Message queue pool size; 7. Application components cache size; 8. Maximum speed of DB accesses [# accesses/s]; 9. Remote method call latency [s].
- Network layer: 10. Frame throughput [b/s]; 11. Frame latency [s]; 12. Maximum number of IPSec flows per second [# flows/s].
- Physical resources: 13. Maximum CPU usage [%]; 14. Memory usage [%]; 15. Hard disc accesses per second [# accesses/s].
Strongly related to performance tests, stress tests aim at testing the functionality of a system when it is heavily loaded for a given period of time or at peak time. The main goal of these tests is to determine whether the system is able to meet target availability (response time for the end user) and reliability. There are several factors that are essential to running performance tests, but two of them are crucial: the workload characterization [Avritzer et al 2002] (also named the operational profile) and the test pass criteria. The workload characterization requires data collection over significant periods in the product environment in order to create a representative workload to be applied in the performance test, so that it is as close to reality as possible. The
performance test pass criteria (values of performance parameters above the acceptance level) are derived from both system theoretical limitations and SW framework or HW bottlenecks. They should be as precise as possible, but still with a small margin for possible exceptions. The test time of a performance test should be as long as possible. When creating a test setup it is proposed to take into consideration three possible configurations: the fastest one, the most resource consuming one, and the most popular one in the sense of ordinary user usage [Barylski 2007a]. It is more than certain that project timelines and budget will not allow validating all possible SW and HW configurations before release. The very first step that every validation team must perform is to define the test process, in particular the performance validation approach. This approach is driven mainly by the project management technique [Royce 1970] [AGILE 2001] [Frankel et al 2005] and customer expectations. The key products of the validation lifecycle that enable mature performance validation are: test pass criteria, test scenarios, test tools, and test results. For CMMI-like enterprises [CMMI 2006] with a waterfall project lifecycle [Frankel et al 2005] [Royce 1970], where all requirements are defined and should be frozen at the very beginning of the project and customers are interested in the final product version only, it is possible to gather all system performance characteristics at the earliest project phase, making it possible to create a test plan and implement test cases to validate system performance by an almost independent team.
The V-model schema of such a validation lifecycle for an IPSec-based system (Figure 28 a)) marks out the test development stage (where test cases are created and implemented) and the test execution stage (where the IPSec gateway is tuned until the tests pass and the performance requirements are successfully met).
[Figure 28 diagram: a) V-model flow from gathering IPSec performance characteristics through test plan writing, test implementation, test execution, and product assessment, repeated until product quality is acceptable; b) agile loop in which each iteration updates IPSec performance characteristics, test plan, test pass criteria, test scenarios, and test tools before test execution and end-of-iteration assessment, until no iterations are left.]
Figure 28. Validation process lifecycle from performance point of view: a) V-model schema of validation process lifecycle where IPSec performance characteristics are known before writing the test plan and further implementation, b) State diagram of validation process lifecycle with agile development methodology where IPSec performance requirements are constantly evaluated [Barylski 2008]

With agile development methodology [AGILE 2001], where performance requirements change rapidly or cannot be confirmed at an early project stage, incorporation of test prototyping into the test process lifecycle is strongly recommended [Frankel et al 2005] [Barylski 2007b] [Barylski 2008], as Figure 28 b) depicts. As a result, a successfully completed iteration must include actual performance characteristic retrieval, existing test plan rearrangement, and test case and test tool redefinition, concluded with test execution and final iteration output assessment. When performance testing is already introduced into the overall project development methodology, it is the performance optimization process itself that must be applied. It consists
mainly of three consecutive phases: a) identification of the performance bottleneck, b) optimization of the SW, and c) repetition of the final test session to check whether the desired performance level is reached and the optimization succeeded. If not, go back to phase a). Two generic methods of bottleneck identification for a system component are commonly used:
• Modification of the target component to decrease its workload, without influencing other components: if system performance increases, this component is the bottleneck; otherwise another component must be taken into consideration;
• Modification of other components to decrease their workload, keeping the target component at the same level: if performance does not change significantly, the target component is the bottleneck.
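In practice, bottleneck identification within a single process is usually bootstrapped with a profiler before any component-level experiments. The sketch below uses Python's standard cProfile and pstats modules to rank functions by own execution time; the workload functions (hot_loop, cold_path) are artificial examples, not code from the dissertation.

```python
# Hedged sketch: locating the hot spot of a workload with Python's
# built-in deterministic profiler.
import cProfile
import io
import pstats


def hot_loop(n):
    # Dominates execution time: the candidate bottleneck.
    s = 0
    for i in range(n):
        s += i * i
    return s


def cold_path():
    # Negligible cost compared to hot_loop.
    return sum(range(10))


def workload():
    hot_loop(200_000)
    cold_path()


profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Sort by own (total) time so the top entry names the bottleneck.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("tottime").print_stats(3)
```

The top of the report points at the function worth optimizing first, after which the profile should be re-collected, as the text below explains.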
Amdahl’s Law (1) [Hennessey et al 1996] is the most basic rule of performance tuning. It says that the overall speedup resulting from an optimization depends on the fraction of execution time spent in the code being sped up and on the speedup of that code alone [Rescorla 2001].
Speedup = t_old / t_new = 1 / ((1 – f_e) + f_e / s_e)     (1)

where:
t_old – old execution time
t_new – new execution time
f_e – fraction of execution time enhanced
s_e – speedup of the enhanced fraction
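Formula (1) translates directly into code; the function name below is illustrative. For example, speeding up code that accounts for 80% of the runtime by a factor of 4 yields an overall speedup of 1 / (0.2 + 0.8/4) = 2.5.

```python
# Amdahl's Law (1): overall speedup given the enhanced fraction f_e
# of execution time and the speedup s_e of that fraction.
def amdahl_speedup(f_e: float, s_e: float) -> float:
    return 1.0 / ((1.0 - f_e) + f_e / s_e)
```

Note how the law bounds any optimization: even with s_e approaching infinity, the overall speedup cannot exceed 1 / (1 – f_e).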
On the other hand there is the Pareto principle, which says that 80% of effects come from 20% of causes – in terms of programming: 80% of code execution time is spent in roughly 20% of the code. These considerations lead to a thesis that performance tuning should locate the crucial 20% of code and make it run faster or execute less frequently. The central points that are present in the crucial SLOCs and are responsible for the most expensive operations (such points are called bottlenecks) should be identified and eliminated, with the help of performance testing. There is no point in investigating the performance of the remaining 80% of code at first, but after performance fixes, when the performance evaluation is re-run, the next bottleneck may be located in the 80% of SLOCs skipped before. Chapters 3.3 and 3.4 cover performance testing of distributed applications. They are supplemented with Appendix A, describing details of performance testing and monitoring tools.
3.3. Network layer performance tests
Network layer performance tests are responsible for validation of the application network layer [Bradner et al 1991] [Bradner, McQuaid 1999] [Mandeville, Perser 2000]. They provide the performance numbers of Throughput (see 3.3.1) and Latency (see 3.3.2). The tests hit border values, examining network layer stability and network data exchange behavior under stressful conditions. They allow characterizing the Frame loss rate of the DUT (how many frames are dropped if the incoming throughput exceeds DUT forwarding capabilities), Back-to-Back (B2B) behavior (the number of frames in the longest burst that the DUT will
handle without the loss of any frames), System recovery (the speed at which the DUT recovers from an overload condition), Reset (the speed at which the DUT recovers from a HW or SW reset) and Jitter [Sung, Lin 2008] (the variation of packet inter-arrival time that creates unexpected pauses between utterances, affecting data stream intelligibility – especially the performance of VoIP and video streams). From this set the dissertation covers in detail network throughput and latency testing as the most valuable benchmarking tests. Generally speaking, latency is the amount of time it takes to process a given transaction from beginning to end. Throughput is the number of total transactions that can be maintained over a period of time.
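The two definitions above can be captured in a minimal measurement sketch. This is an assumption-laden illustration: a locally callable `transaction` stands in for the real remote operation, and wall-clock timing replaces hardware-assisted measurement.

```python
# Hedged sketch: measuring average latency [s] and throughput
# [# transactions/s] of a repeatable transaction.
import time


def measure(transaction, n_calls: int):
    latencies = []
    start = time.perf_counter()
    for _ in range(n_calls):
        t0 = time.perf_counter()
        transaction()                      # the operation under test
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    throughput = n_calls / elapsed         # transactions per second
    avg_latency = sum(latencies) / n_calls # seconds per transaction
    return throughput, avg_latency
```

For serial execution the two metrics are roughly reciprocal; under concurrency they diverge, which is exactly what the volume tests of this chapter probe.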
3.3.1. Network throughput testing
Network throughput is one of the most important features of distributed applications [Wijesihna et al 2005] [Dumitrescu et al 2005]. It is the maximum forwarding rate at which none of the offered network frames are dropped by the DUT [Bradner et al 1991] [Bradner, McQuaid 1999]. It is suggested to make network throughput measurements in a test environment as depicted in Figure 29: a network traffic generator, capable of transmitting network traffic of different frame lengths at a defined rate, is connected to the DUT network interfaces.
[Figure 29 diagram: a network traffic generator (with command line or GUI management interface, automation API (i.e. TCL), static configuration, test traffic definition, sent and received packet analysis, and test data storage) connected through test network interfaces to the Device Under Test; a network packet sniffer, crash & error monitoring, and automated tests complete the setup.]
Figure 29. Network throughput test environment [Barylski 2007a]

To automate the experiments, the network traffic generator should be equipped with a mechanism of test traffic management without human interaction (e.g. creation of TCL scripts to automate the test). Additionally, the network traffic generator should be able to detect, by sent and received packet analysis, any corrupted or mismatched frames that can appear during transmission due to a potential SW or HW defect of the DUT. All ports on the network traffic generator must be able to transmit test frames either in a Frame Based or a Time Based mode [Mandeville, Perser 2000]. Lack of throughput, or a disturbance in the throughput curve, means that the network module responsible for forwarding the traffic becomes a bottleneck that limits or stops other network nodes. When packets are unexpectedly lost, connectionless transmissions are broken (e.g. UDP [Postel 1980]), and all connection-based flows (e.g. TCP [Postel 1981b], FTP [Hethmon 2007]) or connectionless traffic with retransmissions (e.g. TFTP [Sollins 1992]) must be re-established. The worst scenario is when no frames are forwarded at all, which in terms of throughput means that throughput falls to zero. A permanent or temporary situation in which no packets are forwarded is considered a network device crash, reboot, or undesired deconfiguration and must be avoided.
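The sent/received packet analysis described above can be sketched in a few lines, assuming the generator embeds consecutive sequence numbers 0..sent-1 in the test frames. The helper name analyze_sequence is hypothetical, not part of any traffic generator API.

```python
# Hedged sketch: post-run analysis of received sequence numbers,
# detecting lost and re-ordered frames.
def analyze_sequence(received, sent):
    """received: sequence numbers in arrival order; sent: frames transmitted."""
    # A sequence number that never arrived indicates a lost frame.
    lost = sent - len(set(received))
    # A frame arriving with a lower number than its predecessor was re-ordered.
    reordered = sum(
        1 for prev, cur in zip(received, received[1:]) if cur < prev
    )
    return {"lost": lost, "reordered": reordered}
```

A real generator additionally verifies frame length and payload against the expected values, as the next paragraph requires.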
The throughput test determines the maximum frame rate, measured in frames per second, that the DUT can manage without lost frames. The test begins at 100% frame rate by sending a pre-determined number of frames. If any frames are lost, the test is repeated at a lower frame rate. This process continues until the maximum throughput is determined. Some test devices capable of the throughput test are able to speed up this process by choosing the next frame rate in an intelligent way (e.g. with the use of a binary search algorithm). The test equipment should discard any frames received during a test run that are not actually forwarded test frames. In any case, the test equipment should verify the length, payload and any other significant fields of the received frames and check that they match the expected values. It is suggested that the network traffic generator should include sequence numbers in the transmitted frames and check for these numbers in the received packets to detect re-ordering issues. It is suggested that over an Ethernet network the following Ethernet packet sizes should be evaluated: 64B, 128B, 256B, 512B, 1024B, 1280B and 1518B [Bradner, McQuaid 1999]. However, experience says that these points should be rounded out by all intermediate packet lengths from 64B to 1518B to create the complete throughput characteristics [Barylski 2007a]. The complete throughput curve describes DUT behavior for every packet size that can appear during network traffic exchange. Brute-force algorithms take a lot of time to gather all necessary results to draw the complete throughput line, especially if the DUT's throughput line is jagged or saw-toothed. The algorithm of throughput examination may be optimized to find the throughput curve and its disturbances. It is especially useful during performance tuning of the DUT, where a new version with performance improvements is created and there is a need to compare it with the previous version of the code.
The method gives a quick answer as to whether the performance of the new version is better (Figure 30).
[Figure 30 bar chart data – average number of test iterations for a given frame length: brute-force throughput measurement method (examine all transmit rates from 1% to 100% of transmit port capabilities) – 50; ITMM without historical results (only TBA used) – 8; ITMM with historical results (with TIA in the first step when the first result differs significantly from the expected one) – 6; ITMM with historical results (with throughput line found very close to the historical one) – 2.]
Figure 30. Comparison of effectiveness of different network throughput measurement methods (throughput accuracy=1% of transmit port rate, max number of iterations=100) [source: author]

Often a single change in the code improves one part of the characteristics and deteriorates another, so the whole graph needs to be verified. The Intelligent Throughput Measurement Method (ITMM) (Figure 31) gives rough results very quickly by measuring selected test points from the whole range. In the first iteration, test points are taken with a step of 128 bytes starting from offset 0. The next iterations start from other offsets in the range <0..127>, with a step of 128 bytes, and every iteration makes the characteristics more precise. This method, later on referred to as the Throughput Interlaced Method (TIM) (Figure 32), is continued until one of the following events:
a) The whole throughput characteristics are collected
b) A DUT crash is detected
c) Incorrect DUT behavior is observed
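The TIM ordering of packet lengths described above can be sketched as a generator. The default bounds follow the 64B–1518B Ethernet range used in this chapter; the function name is illustrative.

```python
# Hedged sketch of the Throughput Interlaced Method (TIM) sampling order:
# iteration 0 samples every 128 B from the minimum length, later iterations
# shift the offset by 1..127 B until every length is covered.
def tim_lengths(min_len=64, max_len=1518, step=128):
    for offset in range(step):
        for length in range(min_len + offset, max_len + 1, step):
            yield length
```

The first pass already spans the whole range coarsely (64, 192, 320, ...), so a rough curve is available after a small fraction of the total measurements, and each further pass refines it.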
The ITMM incorporates the following rules:
1. If possible, use the results from the previous, historical performance test as a starting point for the new performance test – this minimizes the time spent searching for the throughput value;
2. Stop measuring the performance after achieving the requested threshold and accept this value. The threshold may be set to a value different from 100% of bandwidth when the product requirements say so, and can be different for each frame length;
3. If there is no previously measured data for a specific frame length, use the closest known value from the current or a previous test;
4. On the first try for a given packet length, when packet drops are observed, the Throughput Interpolation Algorithm (TIA) is used, which calculates the estimated performance by analyzing the number of received and dropped packets, in order to minimize the number of tests. TIA distinguishes the overload case (typical for performance measuring) from random packet loss (typical for a defect in the code). Detected defects are marked on the graph;
5. On the next tries for a given packet length, when packet drops are observed, use the Throughput Binary search Algorithm (TBA), which calculates the estimated performance by dividing the search area in two and making the next experiment in the more promising part of the bandwidth;
6. If the measured performance is lower than the requested threshold, always make sure that the achieved performance has no packet loss and that one performance point up loses packets;
7. Use TIM for selecting measuring points. Thanks to this method the rough graph is interpolated quickly, and during the test it is continuously updated and corrected. The throughput test may generate an alarm (print a console message or send an email) when a performance lack is detected;
8. The ITMM also detects incorrect DUT behavior, e.g. DUT inoperability during the test – it may happen that the DUT fails and stops forwarding the traffic during the performance test; in such a case ITMM detects that packets are not forwarded at all and verifies whether the DUT has crashed by retesting the performance for one of the previous packet lengths that was forwarded by the tested DUT;
9. Random packet loss detection. Sometimes packets may be dropped accidentally as a result of a SW or HW defect. The ITMM tries to find such situations and match them with the relevant packet lengths. If the packet loss occurs at a very low rate or is not proportional to the transmit rate, such a case is treated as a potential defect in the code or HW.

Table 13. Description of parameters used in Figure 31 [source: author]

LAST_ITERATION – constant limiting the number of iterations for each frame length
i – iteration: 0, 1, 2, ..., LAST_ITERATION
pkt_len – length of examined packets [B]
UseInterpolation – decides whether TIA can be used or not: TRUE, FALSE
testRate – rate of the traffic [fps] (frames per second)
rateFound – maximum forwarding rate found during one loop: NAN or testRate
Rx – frames forwarded through the DUT: 0, 1, 2, …
Tx – frames sent to the DUT: 1, 2, 3, …
delta – difference between frames sent to the DUT and forwarded by the DUT: delta = Tx – Rx
array_of_pkt_len – contains lengths of packets to be examined according to the interlacing mode
[Figure 31 flowchart: for each iteration i (up to LAST_ITERATION), a packet length pkt_len = getPktLen(i) is selected and a starting testRate is taken from a historical result, a close neighbor result, or the acceptance level (UseInterpolation set accordingly); doPerformanceTest(pkt_len, testRate) returns delta, which is examined: delta below 0 raises ERROR 3 (more frames received than transmitted); delta equal 0 saves the last loss-free forwarding rate (rateFound = testRate); delta above 0 triggers TIA interpolation on the first try and TBA binary search on subsequent tries, until the search finishes or testRate falls below the acceptance level; an optional final DUTisStable() validation distinguishes ERROR 1 (DUT crash), ERROR 2 (packet filtering), and ERROR 4 (packet random drop), raising an alarm in each case, before testRate is saved in history.]
Figure 31. Intelligent network throughput measurement method (ITMM) [source: author]
Functions utilized in ITMM:

getPktLen(i) - returns the packet length to be checked in iteration i of ITMM; uses array_of_pkt_len to deliver packets in the order the interlacing mode requires.

getHistPerf(pkt_len) - returns the historical maximum forwarding rate for the given packet length.

getCloseNeighborPerf(pkt_len) - returns the maximum forwarding rate for a frame length adjacent to the given packet length.

getAcceptanceLevelPerf(pkt_len) - returns the forwarding rate for the given packet length that is enough to fulfill the throughput objective.

getMinimumLevelPerf(pkt_len) - returns the minimum forwarding rate that is possible for the given packet length; in most cases it should return 0.

doPerformanceTest(pkt_len, testRate) - executes the forwarding test for the given packet length and traffic rate; returns the difference between frames sent to the DUT and frames forwarded by the DUT (delta).

interpolation(testRate, delta) - returns the new forwarding rate based on the current test rate and delta value (2); uses the TIA approach, where testTime is the time of forwarding the packets in a single iteration:
new rate = testRate - delta/testTime (2)
BinarySearch(testRate, delta, pkt_len) - returns the new forwarding rate based on the current test rate and delta value; uses the TBA approach:
if (delta == 0) {
    new rate = (testRate + getAcceptanceLevelPerf(pkt_len)) / 2
} else if (delta > 0) {
    new rate = (testRate + getMinimumLevelPerf(pkt_len)) / 2
} else {
    throw new exception("ERROR - delta must be >= 0")
}
DUTisStable(void) - returns TRUE if the DUT is still alive (e.g. forwards the specified packets, answers ICMPv4 echo requests, etc.); otherwise returns FALSE.
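The TBA binary-search step can be sketched in Python as a minimal illustration; the function name is an assumption, and the midpoint update (halfway toward the acceptance level when delta == 0, halfway toward the minimum rate when delta > 0) is assumed as the intended reading of the pseudocode above:

```python
def binary_search_rate(test_rate, delta, accept_rate, min_rate=0.0):
    """One TBA step: move the trial rate halfway toward the acceptance
    level when no frames were lost (delta == 0), or halfway toward the
    minimum rate when frames were lost (delta > 0).
    delta < 0 (more frames received than sent) is a test-setup error."""
    if delta < 0:
        raise ValueError("delta must be >= 0: more frames received than sent")
    if delta == 0:
        # No loss observed: try a higher rate, midway to the acceptance level.
        return (test_rate + accept_rate) / 2.0
    # Loss observed: back off, midway to the minimum rate.
    return (test_rate + min_rate) / 2.0
```

With the default min_rate of 0, a lossy trial simply halves the current rate, which matches the behavior of the original formula for that common case.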
ITMM allows finding several throughput defect classes (Table 14): DUT crash detection – ERROR 1, DUT packet filtering detection – ERROR 2, DUT misconfiguration detection – ERROR 3, DUT random packet drop detection – ERROR 4. The last defect may
be the most time-consuming to find and is often caused by a SW defect. The observation for a specific frame length is that the first test attempt at a given transmit rate passes while the next trial with the same parameters does not. This disturbs ITMM, and the results of the performance test differ from one measurement to another.
Table 14. Classes of the most common throughput defects [Barylski 2007a]

No.  Class of throughput defect                   Severity
1    Throughput curve is not non-diminishing      Low
2    Throughput is lower than expected but > 0    Medium
3    Throughput is random                         Medium
4    DUT has throughput = 0                       High
5    DUT crashes during throughput test           High
There are a few methods that help to detect random packet drop:
a) The final test is much longer than the ordinary one, so the random drop is more probable;
b) The final test is repeated;
c) The difference in tests for the same packet length can be uncovered by analysis of several historical performance graphs;
d) The random packet drop can be suspected when, in the results, very close measuring points have very different performance.
Only method d) does not extend the performance measuring time. The developer may determine the suspected ranges on the performance characteristic before the test. ITMM is designed so that the performance characteristic becomes more and more precise as the test proceeds (Figure 32). The test engineer may diagnose performance on-line and detect performance drops very early. DUT failures, some potential code defects, and performance problems are reported automatically. ITMM has the following strong points:
a) If on the first try for a given packet length all packets are lost (D = Tx) and no historical or adjacent results are available, the PIA approach will jump to rate = 0 in the next step (3).
new rate = testRate - D/time = Tx/time - D/time = (Tx - D)/time = 0 (3)
b) It allows finding an application crash or a big performance lag in one step only;
c) It uses historical and adjacent results to speed up the whole process;
d) It reacts if the DUT throughput falls below a specified rate or the DUT starts misbehaving;
e) It does not carry out experiments above the maximum forwarding rate - it excludes some experiments and speeds up the whole throughput investigation.
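Method d) above (flagging very close measuring points with very different performance) can be sketched as follows; the function name and the 20% relative margin are illustrative assumptions, not values from the dissertation:

```python
def suspect_random_drop(points, rel_margin=0.2):
    """Flag pairs of adjacent frame lengths whose measured throughput
    differs by more than rel_margin, which may indicate random packet
    drops rather than a genuine throughput step.
    points: list of (frame_len, throughput) sorted by frame_len."""
    suspects = []
    for (l1, t1), (l2, t2) in zip(points, points[1:]):
        base = max(t1, t2)
        if base > 0 and abs(t1 - t2) / base > rel_margin:
            suspects.append((l1, l2))
    return suspects
```

Because it only post-processes results that were measured anyway, this check adds no extra test time, which is exactly the advantage the text ascribes to method d).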
Figure 32. The longer the experiment lasts, the more points are measured and the better throughput approximation is obtained: a) 2 points measured, b) 3 points measured, c) 4 points measured, d) 193 points measured [source: author]
One of the important steps of ITMM, especially for milestone products, is Final DUT Throughput Validation. It is required to ensure that the maximum forwarding rate found for a given packet length excludes any DUT random packet drops at lower forwarding rates or any performance lags. It is proposed to validate all forwarding rates below the measured maximum forwarding rate in one step for a given packet length. A dedicated Forwarding Rate Manipulator (FRM), easy to implement in TCL on the test traffic generator, is required to perform this task. FRM should be able to transmit packets of the specified length with a rate varying from the minimum allowable forwarding rate to the maximum forwarding rate, using a uniform distribution for the validated transmission rates.

A perfect throughput curve, presented in Figure 33, should be a non-diminishing function of network packet size. In addition, it is very important to confirm that, for a given packet length, any packet transmit rate not greater than the throughput does not cause packet loss. The longer the Ethernet frame, the closer the channel utilization is to 100% and the higher the potential throughput values. The Ethernet inter-frame gap is 12 bytes, and each standard Ethernet frame contains 8 bytes of preamble, 14 bytes of Layer 2 header (6 bytes of source MAC address + 6 bytes of destination MAC address + 2 bytes of Ethernet type) and 4 bytes of Frame Check Sequence. Having in mind the minimum Ethernet frame size of 64 bytes and the maximum Ethernet frame size of 1518 bytes, Ethernet channel utilization grows from 84.2% (for a 64-byte frame) to 99.2% (for a 1518-byte frame). The inter-frame
gap between frames inside a test burst must be at the minimum specified by the standard [Mandeville, Perser 2000].
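The utilization figures quoted above can be reproduced with a few lines. Note that 84.2% and 99.2% correspond to counting only the 12-byte inter-frame gap as per-frame overhead; additionally counting the 8-byte preamble would give 76.2% and 98.7% instead. The function name is an illustrative assumption:

```python
IFG = 12  # inter-frame gap [B] - the only overhead behind the quoted figures

def channel_utilization(frame_len, overhead=IFG):
    """Fraction of wire time spent carrying the frame itself."""
    return frame_len / (frame_len + overhead)

print(round(100 * channel_utilization(64), 1))    # 84.2
print(round(100 * channel_utilization(1518), 1))  # 99.2
```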
Figure 33. Part of a perfect throughput characteristic – it is non-diminishing for successive frame lengths – the longer the frame, the closer channel utilization is to 100% [Barylski 2007a]
One of the crucial throughput test parameters is the test run time. Transmission of packets of a specified size and rate must last long enough to ensure that it is the throughput that we measure, not a peak or short-lived performance. Too short a time for a single trial (e.g. 5 seconds) may cause too optimistic values to be recorded. On the other hand, too long a time impacts the test process, elongating it and depriving it of flexibility. To sum up, the test engineer must choose the test run time wisely - the most common values are 30 seconds [Mandeville, Perser 2000] or 1 minute; however, it is worth repeating the throughput experiments with the highest non-packet-dropping rate once more with an extremely long test time (e.g. 40 minutes, 1 hour or even 48 hours) for the most common frame lengths on the final router version [Barylski 2007a].

Network devices may have HW or SW defects that affect their functionality and performance. Not all of them are crucial to the network transfer rate, but there is a group of issues that affect the flows unquestionably: throughput defects. Table 14 lists the most popular classes of throughput bugs. Firstly, there are minor defects that do not limit network module throughput - their influence on the network throughput is not significant, but the throughput curve is disturbed (Table 14, defects of class 1). Secondly, there are defects of average weight. Issues from this set affect the throughput values in a significant way but do not cause network destabilization or a permanent device crash (Table 14, defects of classes 2 and 3). Finally, network routers are vulnerable to fatal defects that cause a network module crash with throughput fallen to zero. These issues may temporarily, periodically, or permanently bring network traffic down (Table 14, defects of classes 4 and 5). It is suggested that fatal defects are the first to be fixed by the programmers, then average defects, with minor issues at the end of the bug-fixing queue.
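The run-time arithmetic above can be made concrete: at line rate, the frame length determines how many frames a generator must sustain for a whole trial. The sketch below assumes a 1 Gbps link and the standard 20 bytes of per-frame overhead (8 B preamble + 12 B inter-frame gap); the function names are illustrative:

```python
def max_frame_rate(frame_len, link_bps=1_000_000_000, overhead=20):
    """Theoretical maximum frames per second on an Ethernet link: each
    frame occupies frame_len + preamble (8 B) + inter-frame gap (12 B)."""
    return link_bps // ((frame_len + overhead) * 8)

def frames_in_trial(frame_len, duration_s, link_bps=1_000_000_000):
    """Frames the generator must send to sustain line rate for one trial."""
    return max_frame_rate(frame_len, link_bps) * duration_s

# 64 B frames at 1 Gbps: 1,488,095 fps; a 30 s trial needs 44,642,850 frames
```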
From a test engineer's perspective, fatal defects must be found as soon as possible. Throughput test execution for the selected frame lengths is the first step of deeper throughput analysis. When the following pairs of values are available: frame length in bytes
and throughput in bytes per second, it is worth drawing a graph as the most effective method of result presentation. The x coordinate should be the frame size; the y coordinate should be the frame rate [Bradner, McQuaid 1999]. The created graph becomes an input to further deep analysis. The analysis consists of comparing the throughput curve to the expected goal, examining extreme values, and discussing all throughput fluctuations. The goal of throughput analysis is the creation of a quick and effective method of HW and SW defect detection. Figure 34 presents a situation where the obtained throughput values are lower than expected. The expected throughput curve is known from theoretical analysis or from measurements done in the past for a different version of HW or SW. This type of throughput defect indicates an issue located on the critical path of every network packet processed by the module. It means that, in comparison to the expected goal, a non-optimal algorithm or a not-required action is applied to the packets. The good news is that the network module is stable and forwards the datagrams even though it contains a defect.
Figure 34. Network module with throughput lower than expected - critical path for network packets is affected and must be reviewed [Barylski 2007a]

Figure 35 shows a network module with a throughput characteristic that is not non-diminishing. It means that, regularly or randomly, throughput values go down while the network packet size increases. This type of defect may be caused by the router's internal architecture and in some cases may not be treated as a bug but as "functionality as designed". Most network devices use internal buffers of fixed length to process incoming network packets. When two packets of successive frame lengths are processed with a different number of internal buffers (the shorter packet requires one buffer fewer than the longer one), the cost of processing these packets differs and causes visible but controlled throughput degradation. However, this is not a rule: one can imagine a situation in which throughput results go down because some internal overflow occurs, causing one SW thread responsible for forwarding the traffic to stop working properly.
Figure 35. Network module with disturbed throughput line - throughput line is not non-diminishing for the successive frame lengths [Barylski 2007a]

Figure 36 presents a graph with a more serious flaw - random throughput values for some frame lengths. It must be highlighted that small throughput fluctuations may be observed very often when repeating the throughput test for the same frame length several times; however, all of the differences must stay below a defined margin of error. On the other hand, clearly random throughput values may cause unpredictable network degradation at any time of the router's normal activity.
Figure 36. Network module with random throughput values for some frame lengths - it may bring network down without warning [Barylski 2007a]

Figure 37 and Figure 38 show characteristics of the most fatal defects of a network router from the throughput perspective: there is a part of the graph with no frame rate at which none of the offered frames are dropped. A zero throughput value recorded during a throughput test should raise an alarm as a temporary or permanent network device crash. Detecting such harmful behavior should be the highest-priority task for test engineers. Figure 38 differs from Figure 37 in the moment when throughput decreases to zero. The throughput curve from Figure 38 may be the first sign of a permanent network module crash. One of the methods to confirm this state is to repeat the throughput test without device reboot
and reconfiguration once more. If the beginning of the throughput curve found during re-tests is still above zero, the device has not crashed. If successive measurements record no throughput either, it is further proof of a long-term or permanent device crash and marks the defect with the highest severity.
Figure 37. Network module with serious defect - no throughput for network packets of length from 384B to 448B [Barylski 2007a]
Figure 38. Network module with fatal defect - after some time or for some lengths packets are not forwarded any more – DUT lost its stability until reboot/recovery [Barylski 2007a]
The most fatal defect from the throughput perspective can be imagined as a horizontal line at the zero level, which means that the network module is unable to provide a transmit rate that guarantees no packet drops in any circumstances. In this case, even if the router remains stable during the test, no throughput is considered a soft crash by other network devices that monitor the throughput. Examination of a throughput curve may be a very effective way of detecting serious HW and SW defects. Throughput experiments may help in finding HW limitations and faults.
Heavy network device activity while forwarding volume traffic causes an internal temperature increase, which is the next difficulty to be overcome. Throughput tests simulate tough network conditions in which SW algorithm flaws can surface, and they should detect all limitations of fast-path processing. They easily detect configuration mismatches such as Maximum Transfer Unit (MTU) changes. Furthermore, including a graph with comments supports an easy-to-explain-and-understand method of presenting network module capabilities and limitations, and is a recommended illustration to attach to any throughput-related defect.
3.3.2. Network latency testing
Frame latency is the next important performance factor of distributed applications’ network layer [Dumitrescu et al 2005] – it influences useful diameter of network [Barbosa et al 2005], impacts load sharing to improve system performance in which jobs are transferred from overloaded nodes to under-loaded ones [Iyengar, Singhal 2006], enables sound and video interactivity [Hashimoto, Ishibashi 2006]. As defined in [Bradner et al 1991] [Mandeville, Perser 2000] for store and forward devices it is the time interval starting when the last bit of the input frame reaches the input port and ending when the first bit of the output frame is seen on the output port; for bit forwarding devices it is the time interval starting when the end of the first bit of the input frame reaches the input port and ending when the start of the first bit of the output frame is seen on the output port. Network frame latency should have a value greater than 0 for all examined packet lengths. The succeeding values should create a non-diminishing line (Figure 39).
[Graph: network layer latency [ms] versus frame length [B] for two cases - correct behavior: the longer the frame, the higher the latency; incorrect behavior: frame latency is random and changes unexpectedly.]
Figure 39. Network layer latency example graphs [source: author]
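The validity conditions above (latency greater than 0, a non-diminishing curve, and values below the acceptable threshold discussed next) can be checked with a small sketch; the function name and return convention are illustrative assumptions:

```python
def validate_latency_curve(points, max_latency_ms):
    """Check that every latency value is positive, below the acceptable
    threshold, and that the curve is non-diminishing for successive
    frame lengths.
    points: list of (frame_len, latency_ms) sorted by frame_len.
    Returns a list of human-readable violations (empty list = curve OK)."""
    violations = []
    prev = None
    for length, lat in points:
        if lat <= 0:
            violations.append(f"non-positive latency at {length} B")
        if lat > max_latency_ms:
            violations.append(f"threshold exceeded at {length} B")
        if prev is not None and lat < prev:
            violations.append(f"latency decreased at {length} B")
        prev = lat
    return violations
```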
During the network latency testing it is essential to define a time limit for each test trial that represents the maximum acceptable network latency value. If network latency exceeds this threshold during the test, test run is stopped and error value is reported. The most common network latency defects (Table 15) are: unpredictable and random latency (may indicate synchronization issues or system overload) (Figure 39), temporary latency threshold
violation (Figure 40 a)), or permanent DUT crash (Figure 40 b)), when the latency threshold is violated until the DUT's pre-crash stability is restored.
Table 15. Classes of the most common latency defects [source: author]

No.  Class of network latency defect      Severity
1    DUT has random network latency       Medium
2    DUT has too high network latency     High
3    DUT crashes during latency test      High
During network latency testing it is essential to monitor the network throughput of the test setup, especially to discover whether the DUT has crashed or not. Figure 40 b) depicts the situation when the DUT network throughput falls to zero - no packets are forwarded any more, so network frame latency → ∞. The test run should be stopped, and the test setup should be rebooted and reconfigured.
Figure 40. Network latency (maximum acceptable latency = 100 ms) of DUT with: a) network latency above the threshold in some circumstances; b) network latency above the test threshold after crash [source: author]
There are two effective methods of measuring the DUT frame latency. The first one utilizes the test packet payload to store the time of packet arrival at the input DUT port, trp, and the time when the packet is transmitted out of the DUT, tsp. The test traffic analyzer examines the payload of captured packets and calculates the DUT latency (4). This measurement method eliminates the latency of the test traffic generator and test environment, but it requires the DUT to be able to manipulate the packet payload on demand (Figure 41). This test packet payload manipulation must be done on the packet critical path, causing an increase of the measured network latency.
Figure 41. Test setup for network latency measurement with test packet payload modification by DUT [source: author]
Latency1(p) = tsp - trp (4)
The second method requires a test traffic generator capable of including a unique sequence number i in each test packet forwarded by the DUT, recording the time of sending packet pi from the source port, tsi, and the time of receiving it on the destination port, tri (Figure 42).
Figure 42. Test setup for network latency measurement with a test traffic generator capable of monitoring the exact time of sending and receiving each test packet [source: author]
This method is vulnerable to the test environment latency te, which adds a non-zero value to all test results. Measuring the latency of the test environment with a test loop between the test traffic generator source and destination ports (without the DUT) eliminates this factor (Figure 43) - the obtained latency value should be subtracted from all test results (5).
Latency2(pi) = tri - tsi - te (5)
Figure 43. Test setup for test environment network latency measurement – test traffic generator source and destination ports are connected directly [source: author]
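Equation (5) together with the calibration run from Figure 43 can be sketched as follows; the function names and the mean-based estimate of te are assumptions for illustration:

```python
def measure_environment_latency(samples):
    """Calibration run with the generator's source and destination ports
    looped back (no DUT): te is estimated as the mean of (t_re - t_se)
    over the captured samples."""
    return sum(t_re - t_se for t_se, t_re in samples) / len(samples)

def dut_latency(t_si, t_ri, t_e):
    """Equation (5): per-packet DUT latency from the generator's send and
    receive timestamps, corrected by the measured environment latency."""
    return t_ri - t_si - t_e

# e.g. loopback samples giving te = 0.2 ms mean that a packet sent at
# t = 0.0 ms and received at t = 1.5 ms has a DUT latency of 1.3 ms
```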
3.4. Middleware layer performance tests
In a distributed computing system, the Middleware Layer (ML) is defined as the SW layer that lies between the OS Layer and the applications on each site of the system [Krakowiak 2003]. ML performance validation [Majumdar et al 2004] [Liu et al 2004] is a term that covers examination of the interconnection of SW components or applications working in distributed environments. It includes: DB system tests, transaction monitor tests, telecommunication SW tests [Farooq et al 2007] [Narravula et al 2008], and messaging-and-queuing SW tests [Huet et al 2004]. ML is composed of Web servers, Application servers, and other tools that support application development and delivery [Britton 2000]. It is especially integral to modern information technology based on XML [XML], SOAP [SOAP], WS [Barbir et al 2007], and Service-Oriented Architecture (SOA). ML impact on the architecture, resources management, and the performance of the system is discussed in [Bivens et al 2004], [Verdict et al 2005], [Nudd, Jarvis 2005], [Lee et al 2009]. The ML performance tests examine whether the possibility to locate a SW module transparently across the network, interaction with other services or applications, independence from network services, and reliability and availability remain unaffected even when tough test conditions are created. The population of Virtual Users (VUs) that participate in ML performance tests is composed of Transactional Users (TUs) and Navigating Users (NUs) (6). TUs are active users that trigger transactions in the system; NUs exhibit only passive behavior, without transactions.
VU(t) = TU(t) + NU(t) (6)
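Equation (6) can be expressed directly in code; the profile functions below (a segmentally constant TU load and a constant NU load) are illustrative assumptions:

```python
def virtual_users(t, tu_profile, nu_profile):
    """Equation (6): the total virtual-user load at time t is the sum of
    transactional users (who issue transactions) and navigating users
    (who only browse). Profiles are arbitrary functions of time."""
    return tu_profile(t) + nu_profile(t)

# Segmentally constant transactional load: 50 TUs for the first 60 s, then 100
tu = lambda t: 50 if t < 60 else 100
nu = lambda t: 200  # constant background of navigating users
```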
The current test user load, as a function of time, may be constant ( TU(t) = const,
NU(t) = const), segmentally constant ( TU(t) = const, NU(t) = const, t∈
[Diagram: ML performance test setup - a test controller manages Test Agents 1..n that simulate Virtual Users against the distributed application (Web servers, application servers, Web services, distributed objects and components, DB management) running on top of the OS, HW, network, firmware, and application layers; test scenarios, test runs, and test results are kept in a SW repository, with synchronization, user behavior, and transaction monitoring handled by the test environment management.]
Figure 44. ML performance test setup architecture [source: author]
The Test Agents work separately but in a controlled way, simulating a predefined number of VUs; however, overall test scenario execution and result data collection are controlled by the test controller. The controller's job is to manage and handle all test-related issues: synchronization points, data-pools, and transactions. The performance of complex, multi-tier applications is impacted by many interconnected factors, including system resources, DB and application architecture, the efficiency of application code, network infrastructure, and many more. As a result of these interdependencies, symptoms of performance problems usually appear at one or more tiers. Typical approaches to isolating the root cause involve time-consuming, manual analysis of performance metrics reported by disparate tools often owned by different teams. When problems occur, remediation can be a complex, lengthy, and costly process. Recent studies indicate that the following "do's" and "don'ts" should be considered while designing and executing ML performance testing:

"do's list":
• Identify the appropriate system requirements to mirror the Production Environment (PE);
• Consider using think time to pause between requests based on the operations performed in the ML, to capture real-life testing;
• While configuring the workload to ramp up or ramp down users for a load test, consider including a buffer time between the incremental increases of VUs;
• Parameterize the load test script after recording, to send a different set of data for each replay or to avoid errors due to duplication of values;
• Identify and prioritize the key user scenarios according to critical functionality and time-consuming transactions.
"don'ts list":
• Do not place too much load on a single Test Agent. This will create bottlenecks during load testing. To avoid this, use distributed load testing to spread the load across multiple Test Agents;
• Do not run load tests in a live PE. Use an in-house test environment that mirrors the PE;
• Do not allow the CPU and memory usage of the Test Agents that generate load to cross the threshold limit, since this might produce misleading data in the load test reports.
ML load and performance testing answer the following questions:
• Q1: What is the end-user experience when the ML is under load?
• Q2: What is the number of VUs the system can handle while still maintaining an acceptable user experience?
• Q3: What is the breaking point (saturation point) of the ML?
• Q4: What are the combined capabilities of the HW and SW?
• Q5: What is the scalability of the system?
The next important aspect of ML performance testing is a Production Environment Monitoring (PEN) tool. Round-the-clock performance monitoring of critical applications and services is essential for enabling 24-hours/7-days business activity and for meeting the production Service Level Agreement (SLA). PEN applications continuously monitor key features of distributed application components across all servers in the environment, providing real-time dashboard views and proactive notification alerts when performance thresholds are breached. The ability to quickly and accurately analyze collected ML performance results is one of the most important aspects of successful performance evaluation.

Figure 45 presents the most common target system reactions when subjected to ML performance tests. Figure 45 a) is an exponentially growing graph, showing that the target system runs out of some critical resource at the load level it is subjected to. It represents a clear case of a system with an inappropriate amount of resources; the load test workload is sufficient in this case. Figure 45 b) shows two implementations whose response time increases with increased load but whose performance degrades gracefully. Linear degradation is often unavoidable but acceptable for most systems; e.g. such a characteristic is observed if the system is running out of bandwidth (TCP/IP stack used) and a global congestion mechanism is available (DiffServ). Comparing graph 1 with graph 2, we see that linear degradation can also be dangerous if the line is steep - if system response values grow too quickly, system responsiveness under increased load is in danger. Figure 45 c) presents a case when the test load conditions are constructed badly and should be revised immediately. It is expected that with increased system load the response time should be similar or greater, but not significantly lower. A falling system response time is a symptom that the SUT has not reached its saturation point yet.
The real load test in Figure 45 c) begins when the system response time increases - from then on it may follow Figure 45 a), Figure 45 b), or (rarely) Figure 45 d). Figure 45 d) depicts the most desired but also the rarest system implementation from a load testing result perspective. A constant system response time regardless of system load means that the SUT is able to serve any requests without the system resource shortage phenomenon. Only the biggest solutions and the richest companies can afford to provide such wealthy infrastructure: geo-distributed hosting servers (the closest to the recipient is taken), load balancing (fair distribution of clients' requests among all available system resources), geo content delivery service (the most bandwidth-consuming components are taken from the
nearest geo location), and replicated environments to be used in case of system failures (e.g. power supply shortage, local server issues).
Figure 45. Classes of ML performance characteristics: a) the worst case – system is out of critical resources very quickly; b) linear growing graph – system degrades gracefully, implementation 1 faster than implementation 2; c) wrong test load conditions; d) ideal implementation – flat graph means that system has enough resources for any number of VU [source: author]
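The four curve classes in Figure 45 can be told apart with a rough heuristic over measured (load, response time) points; the thresholds (5% flatness band, 2x growth factor) and the function name are illustrative assumptions, not values from the dissertation:

```python
def classify_response_curve(points):
    """Rough heuristic for the four shapes in Figure 45.
    points: list of (load, response_time) sorted by load."""
    times = [rt for _, rt in points]
    diffs = [b - a for a, b in zip(times, times[1:])]
    span = max(times) - min(times)
    if span <= 0.05 * max(times):
        return "flat"              # d) enough resources at any load
    if any(d < 0 for d in diffs):
        return "falling"           # c) wrong test load conditions
    # Growing curve: accelerating growth suggests resource exhaustion.
    if diffs[-1] > 2 * diffs[0] > 0:
        return "exponential-like"  # a) system out of a critical resource
    return "linear-like"           # b) graceful degradation
```

A classifier like this could run alongside the load test to raise an early warning when a graceful linear trend turns exponential.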
The following chapters discuss in detail the state of the art of DB, WS, and Web performance testing of distributed applications.
3.4.1. DB performance tests
The main approach to DB performance testing is to record the client-to-DB-server actions to a trace that accurately captures the DB control and SQL statements. The resulting trace is converted to a replay script in a programming language appropriate for the interface used for load generation, which can then be replayed. SQL statements and DB actions can be parameterized for fine control of variable input data (Figure 46). The DB performance tests may be executed by directly testing two-tier applications or by directly testing the DB tier within a multi-tier application. Each DBMS (e.g. MSSQL, Oracle, DB2, MySQL) and DB access interface (ODBC, JDBC) is usually accompanied by a profiling tool that allows examining query or stored procedure (SP) performance, variation, and frequency of execution [Garbus et al 2001]. Examples of such tools are Microsoft SQL Profiler, IBM SQL PL Profiler, and more. SQL profiling allows measuring the time of the SQL statement or SP call, the time when they are completed, how long they
90 Performance and Security Testing for Improving Quality of Distributed Applications Working in Public/Private Network Environments take to execute, how many logical reads/writes are present (Table 16). It helps to find the worst performing queries, possible deadlocks. While profiling captured information should be stored to the file rather than to DB table because the interception mechanism can itself slow the application, destroying its performance characteristics to be measured. SQL events that are thrown after each SQL action are captured by the SQL Profiler and analyzed in real-time by a specialized module SQL Profiler export log analyzer. Response time DB request CPU usage DB test load (DB statements) DBMS Mem usage under DB response test DB requests/response scenario: - SQL statements: INSERT, UPDATE, DELETE DB table DB API - Transactions – COMMIT, ROLLBACK - Stored procedures calls DB index Connections endpoint DB metrics Profiling tools throwing events DB trigger SQL debug statements: EXPLAIN
DB stored procedure
Figure 46. Anatomy of DB performance test [source: author]
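The record-and-replay approach of Figure 46 can be sketched in a few lines. The example below is a minimal illustration, assuming SQLite purely as a stand-in DBMS; the class and function names are hypothetical and not taken from any profiling tool mentioned above.

```python
import sqlite3

class RecordingCursor:
    """Captures every statement and its parameters into a trace while
    forwarding them to the DB (SQLite used purely for illustration)."""
    def __init__(self, conn, trace):
        self._cur = conn.cursor()
        self._trace = trace

    def execute(self, sql, params=()):
        self._trace.append((sql, tuple(params)))  # record for later replay
        return self._cur.execute(sql, params)

def replay(conn, trace, param_override=None):
    """Replay a recorded trace against another connection; selected
    statements' parameters can be overridden to vary the input data,
    as described in the text above."""
    cur = conn.cursor()
    for i, (sql, params) in enumerate(trace):
        if param_override and i in param_override:
            params = param_override[i]
        cur.execute(sql, params)
    conn.commit()
```

A trace recorded in one environment can thus be replayed under load against the DB tier with fresh parameter values for each virtual user.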
Table 16. Products of SQL profiling [source: author]
Query:
• Total duration of the query
• List of actions (e.g. opening tables, system lock, table lock, init, optimizing, statistics, preparing, executing, copying to temporary tables, sending data, freeing items, closing tables) and the time of each action when the query is executed
• Counts for block input and output operations
• Counts for voluntary and involuntary context switches
• Counts for messages sent and received
• Counts for major and minor page faults
• Swap counts
SP:
• List of SP never called
• List of SP called the most frequently
• List of SP that take longest to run
EXPLAIN statements may help to examine the performance of given SQL queries [Henderson 2006]. They show the join type, the use of SQL indexes, and other optimization areas. The most performance-oriented are the system and const joins (the query result can be returned in a single call), followed by range (the range of the query result is read directly from an SQL index) and index (an existing SQL index can be used, but the index tree must also be searched), with ALL as the least performance-oriented, where all table rows must be iterated to find the query results. Using temporary tables, file sorting, or extended WHERE clauses means that additional processing, impacting overall SQL query performance, must be done; these should be avoided as well. A DB performance testing tool should be able to trace and locate faulty SQL transactions, including those located in SPs and DB triggers. All DB objects accessed/hit during the tests are identified in order to generate test coverage measurements – it is necessary to find the list of SQL objects never called and the list of SQL objects called most often.
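As a hedged illustration of query-plan inspection, the snippet below uses SQLite's EXPLAIN QUERY PLAN as a stand-in for the MySQL EXPLAIN statement discussed above; the table and index names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY,"
             " customer TEXT, total REAL)")
conn.execute("CREATE INDEX idx_customer ON orders(customer)")

def plan(sql):
    # The last column of EXPLAIN QUERY PLAN output is the readable detail.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)

# An equality predicate can use the index...
indexed = plan("SELECT * FROM orders WHERE customer = 'acme'")
# ...while a leading-wildcard LIKE forces a full table scan - the 'ALL'
# case in MySQL terms, the least performance-oriented access path.
scanned = plan("SELECT * FROM orders WHERE customer LIKE '%acme%'")
```

Comparing the two plan strings immediately exposes the query that iterates over all table rows.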
3.4.2. WS performance tests
The latest research indicates that the best performance is achieved when a network service allows users to focus on their application and obtain services when needed by invoking the service across the network, provided that adaptive dynamic service resource management is done by a run-time system able to dynamically select the most appropriate performance predictor or resource management strategy over time [Lee et al 2009]. WS layer performance testing tools can be written from scratch, or they may extract the test information from the WS itself using WSDL and generate an appropriate test load (Figure 47). They simulate production scenarios to ensure that the WS that make up the SOA are both high-performing and robust. To simulate a very high load hitting a WS during load testing, distributing the simulated VUs across multiple Test Agents is recommended.
[Figure 47 diagram: Test Agents 1..n drive WS clients that send SOAP requests to the WS under test (WS data processing, data source, WSDL); the SOAP request/response scenario covers expected responses, SOAP error codes, faulty messages, and endpoint validation; WS metrics are collected]
Figure 47. Anatomy of WS performance test [source: author]
Table 17. Load levels [source: author]
• Normal Load Test – used to measure the capability of the WS under the anticipated production workload;
• Peak Load Test – ramp-up and ramp-down; used to compare and determine how well the WS responds at peak hours of the system and when it goes back to an idle state;
• Mixed Load Test – used to add a combination of workloads to emulate real-world load behavior.
The fundamentals of a WS performance test are assertions that are continuously applied to the test in progress to validate that the WS performs as fast as its requirements say. The most common requirements are: average response time of a request (it can be asserted not to exceed a specified value for a test period of time), maximum response time, and transactions per second [SOAPUI]. Real-world load simulation is a very important aspect of any performance test, including a WS test; however, several sources recommend three types of tests (Table 17) to examine the implementation under different load levels.
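The assertion-driven approach described above can be sketched as follows; `service` stands in for the real SOAP call, and the thresholds mirror the average and maximum response-time requirements (all names and values here are illustrative assumptions, not a real tool's API).

```python
import time

def run_load_test(service, n_requests, max_avg_s, max_single_s):
    """Fire n_requests at 'service' and check the response-time
    assertions against the gathered timings; returns the list of
    violated assertions (an empty list means the test passed)."""
    timings = []
    for _ in range(n_requests):
        start = time.perf_counter()
        service()                      # stand-in for the real SOAP call
        timings.append(time.perf_counter() - start)
    failures = []
    avg = sum(timings) / len(timings)
    if avg > max_avg_s:
        failures.append("average response time %.4fs exceeds %.4fs"
                        % (avg, max_avg_s))
    if max(timings) > max_single_s:
        failures.append("maximum response time %.4fs exceeds %.4fs"
                        % (max(timings), max_single_s))
    return failures
```

In a real harness the same assertions would be evaluated continuously while the load is applied, rather than once at the end.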
3.4.3. Web performance tests
Web performance tests are targeted at examining the HTTP, HTTPS, or any presentation-related traffic between the Web server and the Web client from the timing, data processing, availability, and stability perspectives. [Qian et al 2007] showed a formal method of constructing a Web test based on the Web application design only. First, a Page Flow Diagram (PFD) is created; then a Page Test Tree (PTT) is derived from the PFD; finally, a test translator creates from the PTT an XML document that describes all possible Web application paths and serves as input for the Web test engine. On the other hand, [Kung et al 2000] proposed a method of Web test construction, called the Web Test Model (WTM), based on test artifacts captured during use of the Web application. WTM is utilized by most modern Web testing frameworks that offer recording of Web requests/responses and replaying them on demand (Table 45). The typical Web performance test (Figure 48) should simulate a desired number of VUs (constant or changing according to a distribution function) hitting the tested Web site. Two types of tests are encountered: read-only (with the use of NU only) or fully transactional (with both NU and TU) (6). The test should be aware of the content fetched on each Web page – data can be read directly from the DB, but with dedicated transaction-safe methods (other TUs can perform transactions in parallel, so Web content can change dynamically). However, the best approach to performance testing the Web layer should isolate Web content validation from DB access (to eliminate DB I/O requests generated by the test code itself at the same time as DB I/O requests are generated by the production code), to make sure that DB processing (often a bottleneck of Web sites) is not influenced by the test. This leads to the conclusion that a Web performance test simulating real user experience should be able to predict the expected Web content without interaction with the Web application DB.
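A minimal sketch of the VU simulation postulated above, with a random thinking time between requests, might look as follows; threads stand in for Test Agents, `fetch` is a placeholder for the real HTTP client, and all names are hypothetical.

```python
import random
import threading
import time

def virtual_user(fetch, pages, results, think_time=(0.0, 0.01)):
    """One VU walking through a list of pages, waiting a random
    'thinking time' between requests, as in Figure 48."""
    for page in pages:
        results.append(fetch(page))
        time.sleep(random.uniform(*think_time))

def simulate(fetch, n_vu, pages):
    """Run n_vu virtual users concurrently and collect all responses.
    list.append is atomic in CPython, so this sketch needs no extra
    locking; a real tool would aggregate per-request timings too."""
    results, threads = [], []
    for _ in range(n_vu):
        t = threading.Thread(target=virtual_user,
                             args=(fetch, pages, results))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    return results
```

Replacing the stub `fetch` with a predicted-content validator (rather than a DB-backed one) follows the isolation principle stated above.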
[Figure 48 diagram: Test Agents 1..n simulate WWW clients 1..n (NU or TU) sending requests to the Web presentation layer (Controller, Model, View) backed by the Web transaction processing layer, DB processing, and the Web back-end; the request arrival distribution controls timing, # of VU, and user intelligence (e.g. random thinking time, different browser simulation, different network link speeds); the test includes an anomaly detector accessing server resources, dynamic Web content prediction, and a test scenario recording and replaying mechanism]
Figure 48. Anatomy of Web performance test from MVC perspective [source: author]
What is more, a Web performance test should be equipped with an anomaly detector tool able to track whether any alarming event is present during test execution. This relates to Web server resource consumption (mainly CPU time and operational memory) but also to the detection of HTTP/HTTPS errors, application exceptions, and more. Most Web performance tools available on the market are equipped with such functionality (Table 45). Currently, internal Web application architecture is recommended to follow Model-View-Controller (MVC) design rules that intentionally separate the Web interface from the business logic and data model (from the end-user perspective it still follows the R/R and P/S communication paradigms). The Controller receives requests and processes them according to the business logic, fetching data from the Data Model if required. Results are passed to the View, and an appropriate response is generated (HTML, JavaScript, AJAX). MVC allows decomposition of Web performance tests into performance tests against the Data Model (in most cases DB layer performance testing) and performance tests against the Web presentation layer (which produces the response available to the client on the basis of the received request).
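A primitive anomaly detector of the kind described above could be sketched as follows; the thresholds are illustrative assumptions, not recommended values, and the class name is invented for the example.

```python
class AnomalyDetector:
    """Collects alarming events during a Web performance test:
    HTTP server errors and server resource samples above a limit."""
    def __init__(self, cpu_limit_pct=90.0, mem_limit_pct=90.0):
        self.cpu_limit_pct = cpu_limit_pct
        self.mem_limit_pct = mem_limit_pct
        self.events = []

    def observe_response(self, url, status):
        # HTTP 5xx codes indicate server-side errors worth flagging.
        if status >= 500:
            self.events.append("server error %d at %s" % (status, url))

    def observe_resources(self, cpu_pct, mem_pct):
        if cpu_pct > self.cpu_limit_pct:
            self.events.append("CPU usage %.1f%% above limit" % cpu_pct)
        if mem_pct > self.mem_limit_pct:
            self.events.append("memory usage %.1f%% above limit" % mem_pct)
```

The test driver would feed every response and periodic resource sample into such a detector and fail the run if `events` is non-empty.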
The most intelligent tools for Web layer performance testing have the following Web-specific capabilities:
• Performance Test Recording: easy-to-use browser-based recording to capture HTTP/HTTPS requests, including AJAX requests;
• Flexible User Scenarios: emulating the load of real-user activities;
• Real-world Performance Testing: dynamically varying the number of VUs hitting the Web applications/Web sites to study load testing/stress testing conditions under varying load;
• Dynamic Data Replacement: modifying the performance test scripts on the fly to reflect dynamic data;
• Session Handling: support for standard session handling techniques, such as cookies, URL rewriting, and dynamic session IDs, by storing the session from the previous response;
• Support for cookies, basic authentication, URL redirecting/SSL: automatic handling of basic HTTP/HTTPS features such as cookies, BASIC authentication for password-protected sites, and URL redirects;
• Parameterized Performance Tests: parameterizing HTTP GET/POST data or session data at runtime from cookies, the previous response, the previous URL, by executing JavaScript, or from hidden elements;
• Data Pools for Parameterization: using unique data for each VU from data pools (values from an external data source such as a DB);
• Configurable Think Time: configurable think times for real-world performance testing where each VU spends thinking time (wait time) before performing the next action on the web page;
• Browser Simulation: support for simulating exact browser behaviors (MSIE, Firefox, Mozilla, Opera, and more);
• Random Delay: a random delay for VU start to simulate visits to the web site/web application at irregular intervals;
• Bandwidth Simulation: emulating different network speeds during playback;
• IP Spoofing: simulating a unique IPv4 address for each VU;
• Server Performance Monitors: server machine resource monitoring, mainly CPU and memory usage;
• Database Monitors: load testing web sites/web applications with integrated monitoring of databases to collect the database parameters;
• Configurable Playback Options: a wide range of options that affect the playback of load test scripts;
• Compressed Server Response (gzip, deflate): the ability to test web servers that use compressed server responses;
• Quick Host Change: running against different hosts without re-recording the web load test scripts;
• Proxy Support: performance testing through proxy servers, supporting authentication and proxy exclusions for local networks;
• Batch Testing: executing performance test suites from batch files or using scheduling utilities;
• Response Validation: extensive built-in functions for response validation; the Response Validation report shows the successes and failures.
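Two of the listed capabilities, parameterized performance tests and data pools, can be illustrated jointly with a short sketch, assuming `${...}` placeholders in a recorded request template; the template syntax and all names are illustrative, not any specific tool's format.

```python
import itertools
import string

def make_parameterized_request(template, data_pool):
    """Return a callable producing concrete requests: each call
    substitutes the ${...} placeholders in the recorded template
    with the next record drawn (cyclically) from the data pool,
    so each VU sends unique data."""
    pool = itertools.cycle(data_pool)
    def next_request():
        return string.Template(template).substitute(next(pool))
    return next_request
```

A data pool loaded from an external source (e.g. a DB export) would be passed in as the list of records.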
Typical errors found when executing performance tests against Web applications are: unexpected page response time (too high, random, or increasing), too high resource consumption on both the Web server and the Web client, unexpected behavior such as errors or exceptions, and degraded availability after exceeding the border value of VUs.
[Figure 49 plots: four response time [ms] vs. # of VU characteristics – a Web layer with a drastic request response time increase after exceeding the border value of VUs, a Web layer with increasing request response time, a Web layer with random request response time, and a Web layer with too high request response time after exceeding the border value of VUs]
Figure 49. Typical Web layer performance defects: a) drastic request response time increase after exceeding the border value of VUs, b) response time increasing with each VU, c) random request response time, d) too high request response time [source: author]
Most Web layer performance and load tests produce a graph that helps to identify performance bottlenecks from the perspective of response time for individual URLs, error rate and distribution, and the number of simultaneous VUs that the Web server can serve (Figure 49). Figure 49 b) and Figure 49 d) depict the common case when system resources are exhausted as the number of VUs increases (see: Figure 45 a) and b)) – each Web client request consumes a similar portion of Web server resources. Figure 49 a) illustrates a situation when one or more system components have already reached their saturation point (it should be the easiest case to investigate; probably only the performance statistics of each system component need to be reviewed to find a bottleneck to fix). Figure 49 c) is almost the ideal performance characteristic (see: Figure 45 d)), probably with unexpected timeouts of some requests only.
3.5. Fundamentals of SW security
SW application security is a term that covers SW piracy, SW access control, application vulnerability to malicious input, and communications security while the application processes data [Whittaker, Thompson 2004]. SW security is about protecting SW assets. SW vulnerabilities expose these assets (e.g. confidential data, system-level access) to attackers. These "small" vulnerabilities lead to exploits that can be executed against the system on a larger scale, allowing attackers to gain access they are not privileged to have. One example of such an action is a C-style buffer overflow causing replacement of the function return pointer on the application stack with a malicious value pointing to an address with code prepared by the attacker [Piromsopa, Enbody 2006]. [Wang 2004] recommends incorporating systematic SW security testing into the SW development process, which most SW teams are currently doing [Howard, Lipner 2006]. SW piracy means illegal copying and running of SW by users who did not pay for it. This aspect will not be covered by this dissertation, as it is more related to legal and economic matters than to SW engineering itself. Techniques of securing code from illegal copying or execution are an integral part of most SW developed these days; thus they should be included in the product's SRD and treated like other security-related requirements, with corresponding development and testing activities. SW access control is understood as the ability to restrict access to certain parts of the SW application to registered users who possess the appropriate access rights. Application vulnerability to malicious input is a property that characterizes application security in terms of application resource stability and prevention of malicious code execution.
The most common techniques are: causing a buffer overrun that destroys the application stack and sets the return address to the malicious code, and a Denial of Service (DoS) attack targeted at aggressive and highly unexpected consumption of system resources: operational memory, CPU time, network bandwidth, or HD space. Communications security consists of a number of distinct but somewhat related properties. The three major categories, depending on the attacked capabilities and the risk to data security, that are considered when talking about data exchange security are: confidentiality, message integrity, and endpoint authentication [Rescorla 2001]. Confidentiality means that communication data is kept secret from unintended listeners [Rescorla 2001]. Message integrity means that the message recipient is able to determine whether the message was changed after it left the message source. It also covers the case of messages malformed or lost during transmission. Endpoint authentication means that it is possible to verify whether the source, intermediate, or destination endpoint in the communication channel is the one it is intended to be. Publications related to security testing [Whittaker, Thompson 2004] [McGraw, Potter 2004] [Thompson 2005] strongly emphasize that security testing is slightly different from conformance testing. The A + B area (Figure 50) represents the intended SW functionality, fully described in the SRD. A + C is the actual SW functionality that was implemented. A represents the only correct and secure SW implementation. B is the part of the SW never implemented or with a deficient implementation. C shows the part of the actual SW behavior that is not part of its intended behavior (side effects). It is in C that many security vulnerabilities exist, while B is the area where most common SW bugs (including SW performance defects) are found.
In comparison to traditional SW bug discovery (looking for behaviors that do not work as specified in the SRD), the idea behind SW security testing is to ignore the SRD and look instead at additional behaviors, their side effects, and the interactions between the SW and its environment [Whittaker, Thompson 2004].
[Figure 50 diagram: two overlapping areas – the intended functionality (A + B, the SW perfect behavior) and the actual SW functionality (A + C); A: correct and secure implementation; B: deficient implementation, the source of traditional SW bugs; C: unintended or unknown functionality, the source of most security issues]
Figure 50. Illustration of the origin of SW security issues: A – correct and secure implementation (no SW defects), B – lack of implementation/insufficient implementation, C – unintended implementation [source: author, based on Whittaker, Thompson 2004]
What is more, recent experience shows that one of the most significant security vulnerabilities SW products face today (and should expect to face increasingly in the future) is the misperception of risk [Harkins 2009] – risk is exaggerated or underestimated, resulting in inadequate actions being taken. Two groups of external factors, economic and psychological, usually drive the misperception. Although these aspects are not a part of this dissertation, they must be taken into consideration when performing security tests. The crucial factors are: the principle of moral hazard (actions depend on the scope of involvement and are often greater if a party is fully involved in the security issue), asymmetry of information (the level of information may be different for each party involved), and the principal–agent problem (one party acts on behalf of another party, often not fully interested in all security aspects of the fully involved parties). Mitigation of this vulnerability is based on very cold-blooded, unemotional decision making, supported by a very agile approach to responding to every security issue.
3.6. SW security testing
3.6.1. Scope of security testing
SW security testing is aimed at uncovering SW security threats before the system is released and deployed in the target environment, where attackers can access it and potentially break it [Lam 2001] [Howard, LeBlanc 2002]. There are a number of highly specialized organizations targeted at finding SW security holes [ITL], discussing them publicly, and presenting appropriate solutions or workarounds, e.g. [MS_SECBULL] [US_CERT] [SECINNOV] [INSECURE] [SECFOCUS], or exploits [SPIDYN] [WHITEHATSEC], even at special auctions [Miller 2008]. Table 18 presents the most popular classes of security tests [Arkin et al 2005] [Wang et al 2007].
Table 18. Security test classes [Arkin et al 2005] [Wang et al 2007]
Test classes:
• System level black-box tests – simulate an adversary's penetration attempts to achieve malicious goals in the system and concentrate on exploring system security flaws; the attacker is blind.
• System level white-box tests – simulate an adversary's penetration attempts to achieve malicious goals in the system and concentrate on exploring system security flaws; the attacker is fully informed about the system design, its source code, and its deployment.
• Threat model driven security tests – verify the consistency between the threat specification and the code implementation.
• Misuse tests – concentrate on use scenarios where a malicious user can potentially use the system but an authorized user cannot.
• Intrusion detection tests – processes or techniques used to detect attacks on a specific network, system, or application; most IDS tools detect not only attacks, but also SW misuse, policy violations, and other forms of inappropriate activity.
Examples of security parameters:
Application layer:
1. Number of broken security policies of the system
2. Percent of threats found relative to the number of all potential threat conditions
3. Number of successful accesses to the system by malicious users relative to all system accesses
4. Application availability (percent of working time when the application is available to the authorized user) during security tests
Network layer:
5. Number of intercepted / malformed / prepared packets unexpectedly accepted by the application destination point (these packets are sent by an intruder)
There is no security testing without a definition of system threats [Wang et al 2007]. For instance, [Lai et al 2008] shows that in recent years most Internet attacks started with improper authorization on Web servers and Web applications. [Cardenas et al 2009] discusses security properties, the threat model, and the security design space for sensor network security. Successful security testing requires all malicious scenarios to be found, described, and verified. There are several techniques of preparing such data: from UML-based formal methods derived from the system requirements specification, through code inspection and peer-to-peer review [Pimentel et al 2007] and static source code analysis (e.g. searching for possible stack overflow cases with instruction return address modification), to black-box penetration testing [Arkin et al 2005] performed by independent tools that can reveal system vulnerabilities based on the past hacking experience of the tool authors. Security testing may also require intruder characterization. There are several types of malicious users: from dummy users that act without intelligence (e.g. continuously repeating some actions with brute force), through intelligent attackers who smoothly adjust their behavior or learn from the attack results, to external components or non-human interfaces (Table 19).
Table 19. Classes of SW users considered in security testing [source: author]
• User Interface related – represents inputs that originate from a user interface (GUI, command line, menu-driven) or application API; may be called by humans or programs.
• Data Management related – represents operations related to data managing, storing, retrieving, encrypting/decrypting: standard file systems (FS) – Windows: e.g. FAT12/16/32 [FAT], NTFS [Kozierok 2001], HPFS [Duncan 1989]; Linux/Unix: e.g. ext2/3 [Tweedie 2000], XFS [XFS], JFS [JFS], ReiserFS [Herborth 2006]; MacOS: HFS+ [HFS+], FFS [McKusick et al 1984]; OS/2: e.g. HPFS, JFS; distributed FS – e.g. NFS [Shepler et al 2003], DFS, SMB [Sharpe 2002], CIFS [Leach, Naik 1997]; the DB layer – relational DBs, object-oriented DBs, data warehouses; the WS layer.
• OS related – represents operations performed with the use of the OS where the application resides; it refers to operational memory management and organization, thread and task queue management, parallel processing, system resource coordination, and any other API requested by a program from the OS; it is the layer between the HW and the application.
• Network data exchange related – represents SW stimulation by network datagrams according to the network protocol specifications and their sequence of arrival, including correct or malformed/truncated/crafted/lost/retransmitted packets.
• SW related – refers to any application, library, or external SW component that the application relies on.
• Inside SW security related – refers to the software itself, which must be protected: proprietary algorithms, optimizations, trade secrets that must survive code reverse engineering performed by disassemblers.
3.6.2. Security attacks
There are several classes of attacks against SW (Table 20). The scale of damage implies the severity of the security defect, starting from Critical (arbitrary code execution, remote or anonymous user privilege escalation), through High (DoS, local elevation of privilege, violating privacy guidelines, tampering with user data, user or machine spoofing) and Medium (temporary DoS, information disclosure of less importance), to Low (non-persistent or hard-to-replicate defects).
Table 20. SW attack classes [Whittaker, Thompson 2004]
• Environmental Fault Injection (EFI) – the process of simulating failures in the application's environment. Its power is based on the observation that error-handling routines are notoriously subjected to far less testing than the functional code they are designed to protect. There are two approaches: SEFI and REFI.
• Source Code-Based EFI (SEFI) – EFI where source code statements are modified so that specific faulty behavior is attained.
• Runtime EFI (REFI) – EFI where application accesses to external resources are intercepted, modified, resent, blocked, selectively blocked, or rearranged.
• Breaking Through Interface (BTI) – the process of manipulating the application inputs, from the GUI down to low-level interfaces, so that additional, unintended, or undocumented application behavior is observed.
• Injection Flaws – a BTI technique for injecting code into some form of user input, with the goal of an interpreter somewhere processing it. Examples: SQL Injection (targets the backend DB data store), Command Injection (targets the OS), Code Injection (targets the SW), Cross-Site Request Forgery (CSRF, targets the trust the SW has in the user), Cross-Site Scripting (XSS, targets the SW clients), ML Response Splitting (e.g. HTTP response splitting to execute other attacks).
• Application Design Attack – an attack technique targeted at breaking the SW using application design flaws or limitations. A common example is test instrumentation – introduced into the SW for test automation, but providing an additional interface and new functionality (the attack surface increases).
• Application Implementation Attack – an attack technique that tries to use errors in the SW implementation, even if the SW design is secure.
The EFI attack class, especially REFI, known as an attack against application dependencies, is a very effective method of attacking SW [Whittaker, Thompson 2004]. Examples of this kind of attack are as follows:
• Blocking access to libraries required by the application:
o A successful attack against MSIE Content Advisor is based on blocking access to the DLL library msrating.dll, which completely negates the MSIE security feature;
• (For Windows applications) manipulating registry values in their background:
o A successful attack against the UpdateExpert tool [UPDATEEXPERT], which manages Windows security patches, may force it to pass over a security patch via a manually created "Installed" key in HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Hotfix;
• Corrupting the contents of disk files or filenames used by the application, manipulating and replacing files that the application creates, reads from, writes to, or executes:
o A successful attack against the Eudora email client [EUDORA]: an email message with an attached file whose name is longer than MAX_PATH characters causes the application to crash;
• Forcing the application to operate in low-memory, low-disk-space, or network-unavailability conditions:
o This may cause unpredictable errors and may cause the application to lower its security fence because some resources are not available;
• Attacking the module processed just before the main component to be attacked is loaded/initialized/run:
o A misconfiguration in SINIT code could potentially allow a malicious attacker to circumvent Intel® Trusted Execution Technology (TXT) and elevate their privileges [INTEL-SA-00021] [Rutkowska et al 2009];
o The evil-maid attack against TrueCrypt [TRUECRYPT]: the attacker gains access to a shut-down machine, boots it from a separate volume, writes a hacked bootloader onto the system, then shuts it down; when the machine is booted with the hacked bootloader, the user is asked to enter the password, which the hacked bootloader intercepts and uses for its purposes [TRUECRYPT_EM].
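The REFI idea sketched in the examples above – intercepting an application's access to an external resource and forcing it to fail – can be illustrated in Python by temporarily replacing the built-in open(); this is a toy demonstration of the technique, not a production fault injector, and all names are invented.

```python
import builtins
import contextlib

@contextlib.contextmanager
def inject_open_failure(path_fragment):
    """Intercept the application's calls to the built-in open() and
    force them to fail for matching paths, so that the rarely tested
    error-handling routines are exercised; open() is restored on exit."""
    real_open = builtins.open
    def failing_open(path, *args, **kwargs):
        if path_fragment in str(path):
            raise OSError("injected fault: cannot open %s" % path)
        return real_open(path, *args, **kwargs)
    builtins.open = failing_open
    try:
        yield
    finally:
        builtins.open = real_open
```

The test then checks whether the application under test degrades gracefully (e.g. falls back to defaults) instead of crashing or lowering its security fence.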
One of the easiest attack classes to execute, especially when GUI-related, is the BTI attack. It includes:
• Causing input buffer overflow by submitting too long, too short or specially crafted input – it may lead to application DoS or execution of arbitrary code. [Howard et al 2005] derives from experience with real-world security bugs that it is buffer overruns that causes the biggest damage. It includes: commands with special character sets, escape characters and SQL Injection to cause SQL query to be modified, especially to avoid authentication or to extend its scope: o Code Red II worm that exploited buffer-overflow vulnerability in Microsoft’s Internet Information Server (IIS). Indexing Service ISAPI filter did not perform proper "bounds checking" on user inputted buffers and therefore was susceptible to a buffer overflow attack [CODERED2 2001a] [CODERED2 2001b]); o Microsoft U.K. Web site was successfully hacked via SQL Injection – a HTML code: " " to external web site was inserted to MSSQL DB in a field belonging to the table which gets read every time a new page is generated. When users accessed the web page, the database downloaded two photos and a graphic from the external site. To discover the name of the table the attacker might have queried the DB trying to read the system table SysObjects or even the INFORMATION_SCHEMA.TABLES view [MSUK 2007]); • Causing double-free issue when SW inadvertently deallocates the same heap chunk two or more times that results in corruption of control structures internal to heap allocator: o CVS [CVS] up to version 1.11.4 is exploitable because of double-buffer free error and finally an unprivileged access to the system [CSV_DF]; after the second free() onside dirswitch() data is overwritten with handcrafted attacker’s data; • Examining all possible combinations of switches and options, including GUI options, command line switches, file name parameters or HW types o MSSQL server equipped with stored procedure xp_cmdshell allows attacker to run OS commands from MSSQL server using either the SQL Server context (e.g. 
the Windows account used to start the MSSQL service) or a proxy account that can be configured to execute xp_cmdshell using different credentials [MSDN_XP_SHELL] – it allows attacker to break through the SQL interface to OS interface; by default xp_cmdshell is disabled, however [CM_XP_SHELL] shows how to hack MS Windows 2003 Server remotely using MSSQL xp_cmdshell + command: net use \\
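The SQL Injection pattern described above can be sketched in a few lines. The snippet below is purely illustrative – the `users` table, the `login_*` helpers and the payload are invented for this example, and Python's built-in sqlite3 stands in for MSSQL:

```python
import sqlite3

# Hypothetical users table, used only to illustrate the injection class above.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, secret TEXT)")
db.execute("INSERT INTO users VALUES ('alice', 's3cret')")

def login_vulnerable(name):
    # BTI-style flaw: user input is concatenated straight into the query.
    query = "SELECT secret FROM users WHERE name = '%s'" % name
    return db.execute(query).fetchall()

def login_safe(name):
    # Parameterized query: input can no longer change the query structure.
    return db.execute("SELECT secret FROM users WHERE name = ?", (name,)).fetchall()

# Crafted input widens the query scope past the authentication check.
payload = "nobody' OR '1'='1"
assert login_vulnerable(payload) == [("s3cret",)]   # injection succeeds
assert login_safe(payload) == []                    # injection fails
```

The parameterized variant is exactly the kind of fix the validation team should verify is applied consistently across all input paths.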
The next class of attack relates to application design:
• Trying common default and test account names and passwords:
o Most wireless routers have a publicly known default username and password when they are brought up after purchase – before the password is changed (or if it is never changed because the user is not aware of it), an attacker can gain access to the open network;
• Exposing unprotected API:
o Some API was considered insecure and banned: C functions like strcpy(), strcat(), sprintf(), vsprintf(), sscanf() – API without explicit length parameters when performing buffer copies – if they are wrongly used in the source code, they may lead to unexpected behaviour; [Howard, Lipner 2006] presents in detail the SDL banned function calls, mostly present in the C Runtime Library (CRT), easily leading to buffer overruns and poorly supporting different encodings: ASCII, Unicode, or multibyte chars;
• Connecting to all L4 ports and trying to use them:
o WinGate 2.1 Proxy Server [WINGATE] had an open-port vulnerability on port 8010 that allowed remote users to browse the application's LogFile and gain access to the drive hosting the service [WINGATE2.1 1998];
o Windows 95 suffered from the WinNuke vulnerability – a "junk" packet destined to port 139 (used by NetBIOS) resulted in network instability, a blue-screen crash and system DoS until reboot [WINNUKE 1995]; a reincarnated version of WinNuke appeared in Windows NT, 2000, XP: a malformed Server Message Block (SMB) packet sent to either port 139 or 445 (used by Active Directory) caused the system to crash [MS02-045 2002];
• Faking the source of data, checking whether the application accepts data or commands from a non-trusted or unauthenticated source:
o The Quake 1, Quake 2 and QuakeWorld servers [QUAKE], the popular multiplayer action games, have a feature that allows administrators to remotely send commands to the Quake console with a password. However, the authentication can be bypassed by handcrafted UDP packets with a header containing the rcon command and the password "tms", with a source IPv4 address coming from the 192.246.40.0 IP subnet – the subnet of ID Software, the author of the game [QUAKE_BUG];
• Creating loop conditions in any application that interprets scripts, code, or other user-supplied logic, or taking advantage of unexpected iteration values:
o Off-By-One (OBO) boundary check errors result in data being written one byte past the end of the buffer (like the C function strncat() used with sizeof(buffer) instead of sizeof(buffer)-1 for a \0-terminated string) [Larochelle, Evans 2001]; another example is using <= instead of < in iterative loops, which may lead to an OBO error if the index starts from 0;
o Apache HTTP Server 1.3.x, 2.0.x, 2.2.x mod_rewrite within ldap scheme handling is vulnerable to an OBO error that may cause DoS or remote system access [SECFOCUS 19204];
• Using alternate routes to accomplish the same task, trying to reveal the places where security can be bypassed:
o Microsoft Exchange Server's Outlook Web Access could be successfully accessed via Windows Explorer instead of MSIE without prompting for user and password, because user credentials supplied via MSIE were cached even if all instances of MSIE were terminated;
o Session fixation (a very common attack based on the fact that an insecure application assigns the session ID before authentication and does not change it later) allows the attacker to receive a session ID from the application and then send it to the victim; a simple security fix is to always change the session ID after successful authentication;
• Forcing the system to reset values.
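The session-fixation fix mentioned above – always changing the session ID after successful authentication – can be sketched as follows. All names (`sessions`, `new_session`, `authenticate`) are hypothetical, introduced only for illustration:

```python
import secrets

# Minimal in-memory session store illustrating the session-fixation fix.
sessions = {}  # session_id -> {"user": str | None}

def new_session():
    sid = secrets.token_hex(16)
    sessions[sid] = {"user": None}
    return sid

def authenticate(sid, user):
    # The fix from the text: issue a fresh session ID on login, so an ID
    # planted by an attacker before authentication becomes useless.
    state = sessions.pop(sid)
    state["user"] = user
    new_sid = secrets.token_hex(16)
    sessions[new_sid] = state
    return new_sid

attacker_sid = new_session()          # attacker obtains a pre-auth session ID
victim_sid = authenticate(attacker_sid, "victim")
assert attacker_sid not in sessions   # the fixated ID is now invalid
assert sessions[victim_sid]["user"] == "victim"
```

A security test for a real application would do the equivalent end-to-end: log in with a pre-obtained session cookie and verify the server issued a different one.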
Implementation can be successfully attacked too, bearing in mind that a perfect design can still be made vulnerable by imperfect implementation [Whittaker, Thompson 2004]. Examples of this attack class are:
• Examining the newest technology, probably the least tested:
o MSIE fails to properly handle malformed Vector Markup Language (VML) tags. This vulnerability may allow a remote, unauthenticated attacker to execute arbitrary code on a vulnerable system. The vulnerability is present in the VGX.DLL component and is dangerous even if it is compiled with the Buffer Security Checking (/GS) flag in Microsoft XP SP2 [MS06-055 2006];
• Getting between time-of-check and time-of-use (TOCTOU):
o The UNIX xterm terminal emulation application had a bug that allowed unrestricted users to create a root account for themselves by appending data to the /etc/passwd file. This vulnerability was possible because xterm, allowing any input or output to be written to the log file /tmp/X, checked writing permissions only when the log file was created. After that the user could unlink the /tmp/X file and create a symbolic link from /etc/passwd to /tmp/X [XTERM bug]. The security hole was caused by improper use of access() before open() – the user might exploit the short time interval between checking and opening the file to manipulate it [ACCESS man page];
• Creating files with the same name as files protected with higher classification:
o A certain debugging component in IBM AIX 5.3 and 6.1 [IBM AIX] does not properly handle the _LIB_INIT_DBG and _LIB_INIT_DBG_FILE environment variables, which allows local users to gain privileges by leveraging a setuid-root program to create an arbitrary root-owned file with world-writable permissions, related to libC.a (aka the XL C++ runtime library) in AIX 5.3 and libc.a in AIX 6.1 [MILW0RM-9645];
• Forcing all error messages;
• Looking for temporary files and screening their contents for sensitive information.
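The access()/open() race from the xterm example can be sketched in Python. This is a minimal illustration, not the original C code; `append_unsafe` mirrors the flawed check-then-use pattern, while `append_safer` operates through a file descriptor opened with O_NOFOLLOW (POSIX-only), so the object checked is the object used:

```python
import os, tempfile

# Sketch of the access()/open() TOCTOU window from the xterm example above.
path = os.path.join(tempfile.mkdtemp(), "log")

def append_unsafe(data):
    # Time-of-check ...
    if os.access(path, os.W_OK):
        # ... time-of-use: between these two calls an attacker may replace
        # `path` with a symbolic link to /etc/passwd.
        with open(path, "a") as f:
            f.write(data)

def append_safer(data):
    # Open first, refusing symbolic links, then write through the descriptor.
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT | os.O_NOFOLLOW, 0o600)
    try:
        os.write(fd, data.encode())
    finally:
        os.close(fd)

append_safer("entry\n")
with open(path) as f:
    assert f.read() == "entry\n"
```

A TOCTOU-targeted test would try to win exactly this race, e.g. by swapping the file for a symlink in a tight loop while the application writes.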
3.7. IPSec performance and security testing
Obtaining IPSec performance test pass criteria should precede each validation loop of an IPSec GW [Barylski 2008]. A clear vision of performance requirements and customer expectations, presented to both development and test teams, enables errorless implementation and validation (Table 21). If requirements change rapidly, it is recommended to divide the project timeframe into iterations such that no modifications to IPSec performance requirements are allowed within a single iteration [Barylski 2008]. IPSec performance test pass criteria should be completed with system boundary values directly influencing the performance test cases: the maximum number of concurrently processed IPSec flows, separately for inbound and outbound traffic [Kent, Seo 2005] [Kent 2005a]; cryptographic algorithm limitations (e.g. the maximum number of concurrently decrypted/encrypted datagrams for a given cipher); the maximum lifetime of an SA [Kent, Seo 2005] [Kent 2005a], given in seconds, bytes or transmitted packets (with SN or ESN implementation [Kent 2005a]). With these IPSec-related design limitations in mind, the validation team leader is able to create an expanded test plan with test scenarios that utilize the boundary values in order to simulate the most suitable test conditions (e.g. a throughput test [Bradner, McQuaid 1999] [Barylski 2007a] executed for a few IPSec flows and repeated with the maximum number of available flows – the results can be numerically compared, and any meaningful difference may be a good entry point to locate a serious design flaw [Barylski 2007a]).
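The boundary-value comparison suggested above – the same throughput test run at a few flows and again at the maximum number of flows – could be automated along these lines. The function name, the 10% tolerance and all throughput figures are illustrative assumptions, not measurements from this work:

```python
# Compare throughput at both ends of the supported flow range and flag any
# meaningful gap as a potential design flaw worth investigating.
def compare_throughput(results_few, results_max, tolerance=0.10):
    """Return packet sizes whose throughput drops by more than `tolerance`
    when the flow count is raised to the design maximum."""
    suspects = []
    for size, bps_few in results_few.items():
        bps_max = results_max[size]
        if bps_max < bps_few * (1.0 - tolerance):
            suspects.append(size)
    return suspects

few  = {64: 9.4e8, 512: 9.6e8, 1400: 9.7e8}   # bytes/s, a handful of SAs
many = {64: 6.1e8, 512: 9.5e8, 1400: 9.6e8}   # bytes/s, maximum SAs
assert compare_throughput(few, many) == [64]   # small packets degrade: investigate
```

In this invented data set only the smallest packet size degrades under the maximum flow count – the kind of asymmetry the text names as a good entry point for locating a design flaw.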
Table 21. The most popular IPSec performance test pass criteria for IPSec GW [Barylski 2008] (test pass criteria name – SW metric):
• IPSec throughput [Bradner, McQuaid 1999] – Bytes/s for each packet size [Barylski 2007a] for IPv4/IPv6 ESP traffic (for both ESP modes: tunnel and transport);
• Cleartext throughput [Bradner, McQuaid 1999] – Bytes/s for each packet size for IPv4/IPv6 TCP/UDP traffic;
• Mixed traffic (IPSec + Cleartext) throughput [Bradner, McQuaid 1999] – Bytes/s of IPSec and Cleartext traffic for IPv4/IPv6 traffic for each packet size (for both ESP modes: tunnel and transport);
• Policy and Security Association (SA) adding/removing time [Kent, Seo 2005] [Kent 2005a] – # of policies / SAs added/removed per second;
• IPSec events performance [Kent, Seo 2005] – # of handled events per second (e.g. SA Acquire event when an IPSec policy is present and no valid SA is available);
• SA's rekeying performance [Kaufman et al 2005] – # of SAs rekeyed per second.
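One way to consume the Table 21 criteria in an automated harness is a simple score-card check of measured metrics against thresholds. A minimal sketch, assuming placeholder metric names and thresholds (none of the numbers come from the dissertation):

```python
# Hedged sketch: turning pass criteria of the Table 21 kind into automated
# pass/fail checks. Threshold values are placeholders only.
criteria = {
    "ipsec_throughput_bps": (1.0e9, "min"),
    "sa_add_rate_per_s":    (5000,  "min"),
    "sa_rekey_rate_per_s":  (200,   "min"),
}

def evaluate(measured):
    report = {}
    for name, (threshold, kind) in criteria.items():
        value = measured[name]
        report[name] = value >= threshold if kind == "min" else value <= threshold
    return report

measured = {"ipsec_throughput_bps": 1.2e9,
            "sa_add_rate_per_s": 4200,
            "sa_rekey_rate_per_s": 250}
result = evaluate(measured)
assert result["ipsec_throughput_bps"] and result["sa_rekey_rate_per_s"]
assert not result["sa_add_rate_per_s"]   # below threshold: criterion fails
```

Such a report maps directly onto the pass criteria listed in the IGTP and lends itself to fully automated regression runs.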
IPSec GW Test Plan (IGTP) is a document that contains validation deliverables and a test environment description, along with test scenarios and test pass criteria to meet IPSec GW validation expectations. This document should include a dedicated chapter covering performance validation aspects [Barylski 2008]. Generally, IPSec GW performance tests are time consuming (e.g. throughput tests that last several days to collect a full performance characteristic [Barylski 2007a]) and require a powerful HW/SW configuration. The IGTP should raise this topic and propose the most suitable and close-to-reality test configuration. An IPSec GW will probably implement many encryption (e.g. DES-CBC [Madson, Doraswamy 1998], 3DES-CBC, AES128/192/256-CBC [Frankel et al 2003]) and authentication (e.g. HMAC-SHA1-96 [Madson, Glenn 1998b], HMAC-MD5-96 [Madson, Glenn 1998a]) algorithms, but the project schedule enables validation of only a few combinations. It is recommended to pick at least three of the most representative test schemas: preferably the most resource-consuming configuration (like the strongest encryption algorithm with the strongest authentication), the most popular configuration, and the most unique setup from the IPSec design standpoint (like encryption supported by a new generation of HW [Hodjat et al 2004]). A validation team that faces heavy IPSec traffic may apply a back-to-back (B2B) setup (Figure 51) with DUT inbound communication ports linked directly. It enables DUT1 to be the IPSec traffic source and DUT2 to be the IPSec traffic destination. The B2B setup requires additional sanity checks – it must be proved that DUT1 encrypts outbound Cleartext IP traffic correctly. DUT1 correctness may be confirmed by interoperability tests. Encryption and decryption may not have the same processing time, which means the weakest element limits the overall test result. Finally, the ESP protocol can work in two modes, tunnel and transport, separately for IPv4 and IPv6.
IPSec configuration, built from SPD and SAD entries [Kent, Seo 2005], may be composed of many types of policy selectors [Jason et al 2003] (exact matching, range matching or wildcard matching). These ESP configuration aspects must be fully addressed by IPSec GW conformance tests, but they influence performance tests too, demanding a wise choice of test configuration. Finally, the IGTP must include detailed test scenarios, covering all test cases, including performance and stability aspects (Table 12). Example IPSec performance defects are depicted in Figure 52 a) and b).
Figure 51. Test setup to validate IPSec performance with a Cleartext traffic generator when no suitable IPSec traffic generator is available: a Cleartext IPv4/IPv6 traffic generator and analyzer are connected to DUT1 and DUT2 (each with console access), with IPSec traffic flowing between DUT1 outbound and DUT2 inbound ports; DUT1 encrypts the packets, DUT2 decrypts the packets [Barylski 2008]
A successful test approach is not possible without design and implementation support. The most popular solution is to add debug information to the SW with a conditional compilation mechanism. However, it may not be suitable for performance tests, where any additional overhead in the application (like processing of debug data) reduces its capabilities and may hide potential performance defects.
Figure 52. Examples of IPSec performance related defects – DoS attacks: a) CPU resources exhaustion so that other processes cannot get a free CPU time slot; b) quadratic complexity of SA adding time – the more SAs are present, the longer the processing time [source: author]
The recommended solution is to implement IPSec GW State Indicators (IGSI) on a non-critical (not influencing IPSec performance) packet flow path in the production code and HW. It may include: IPSec statistics, SW and/or HW asserts, extended firmware support to locate HW defects, error logging, or dedicated debug commands that can be run on test demand to examine the current system state (e.g. available memory, buffer usage, SPD and SAD state). The unquestionable advantage of this approach is solid support for semi-automated and fully automated validation. Implemented IGSI should be reflected in the test pass criteria described in the IGTP. It means that a test engineer who is writing and executing the test must be able to assess whether the IGSI
value meets the performance test pass thresholds or not. The IPSec performance test setup may differ from other IPSec GW test configurations. It must be focused on observing the performance test pass criteria. Project risk may be reduced if two twin performance setups are created, allowing code development and validation of finished code pieces to proceed concurrently [Barylski 2008]. IPSec is a solution that integrates many protocols, and each IPSec component should be considered separately. While ESP secures the data traffic, relying on cryptographic algorithms and keys already available in the SAD (natural areas of interest are data throughput and data latency), IKEv2 is responsible for the key negotiations and is a very sensitive element of the transmission. If keys are exchanged in an insecure way, there is no ESP security. IKEv2 creates a testing challenge, including: repeated rekeying using the same key (CREATE_CHILD_SA without a new D-H value) [Kaufman et al 2005], making the cipher key vulnerable to cryptanalysis; examining the randomness of random or pseudo-random parameters (nonces for IKEv2, IV for ESP, keys for ciphers); ways of removing all D-H values from memory after use so that they cannot be obtained any more; longevity of a secret (the longer it lives, the more data are derived from it, and the higher the probability of successful cryptanalysis).
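The randomness examination of nonces and IVs mentioned above can start with a crude monobit frequency check. Passing it is necessary but far from sufficient for cryptographic quality (full batteries such as NIST SP 800-22 go much further); the helper below is an illustrative sketch:

```python
import secrets

# Crude sanity check for nonce/IV randomness: a monobit frequency test.
# It can only reject obviously biased generators.
def monobit_bias(nonces):
    """Return how far the fraction of 1-bits deviates from the ideal 0.5."""
    bits = ones = 0
    for n in nonces:
        for byte in n:
            bits += 8
            ones += bin(byte).count("1")
    return abs(ones / bits - 0.5)

good = [secrets.token_bytes(32) for _ in range(200)]
bad  = [b"\x00" * 32 for _ in range(200)]     # degenerate "nonce" source
assert monobit_bias(good) < 0.05
assert monobit_bias(bad) == 0.5
```

A validation suite would capture real IKEv2 nonces and ESP IVs from the DUT and feed them to such checks, escalating to proper statistical test suites before drawing any conclusion.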
3.8. HTTPS performance and security testing
When testing HTTPS-based solutions, several factors must be taken into consideration. First of all, on the client side, different browsers can be used – code compatibility must be preserved. The client is often out of the server's control – its configuration may, and will, be inadequate. A good design and implementation of an HTTPS-based application is aware of such unexpected situations, so that when the client does not meet the requirements (e.g. partial or missing SSL support, missing or out-of-date certificate), limited functionality is presented to the user or the client is even refused. During testing of an HTTPS-based application it is essential to create corner conditions, including load, timeframe, sequence order, or attack simulation. Typically HTTPS secures financial transactions (e.g. e-shopping) that consist of ordered steps, required to be performed chronologically (e.g. adding a product to the basket, specifying order data, accepting the financial agreement, final confirmation, billing, receipt downloading), without a chance of reloading an already visited page to fake order data. To protect the content of an abandoned session from being stolen, an adequate timeout due to user inactivity must be incorporated into the HTTPS design and implementation. HTTPS-based solution security testing should cover Truncation Attack, Substitution Attack, Downgrade Attack, Replay Attack, and Timing Attack. A Truncation Attack must simulate the case when communication from initiator to responder is shortened and closes suddenly. It may happen when no Content-Length header was sent and one side of the communication receives a premature close (e.g. TCP FIN instead of SSL close_notify()). Communication is also truncated if an HTTP Content-Length header of value X > 0 is sent but a premature close is found after receiving Y < X bytes. The natural reaction of the side that discovers a Truncation Attack is raising an error to the application stack.
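The truncation rules above (premature close with no Content-Length, or Y < X bytes received) reduce to a small predicate that a test harness could apply to each response. A hedged sketch with invented parameter names:

```python
# Sketch of the truncation check described above: a body shorter than the
# declared Content-Length, or a close with no length at all, must be
# surfaced as an error rather than treated as a complete response.
def check_truncation(content_length, received_bytes, clean_ssl_close):
    """Return True only when the response can be considered complete."""
    if content_length is None:
        # No Content-Length: only a proper SSL close_notify proves completeness.
        return clean_ssl_close
    return clean_ssl_close and received_bytes >= content_length

assert check_truncation(1000, 1000, True)
assert not check_truncation(1000, 400, True)    # Y < X bytes: truncated
assert not check_truncation(None, 400, False)   # TCP FIN instead of close_notify
```

The harness would then verify that whenever this predicate fails, the implementation closes the SSL connection, refuses session resumption, and does not leak resources.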
From the testing perspective it is essential to exercise any DoS or resource exhaustion possibilities in this case. It should also be validated that a premature close always causes SSL connection closure and that such an SSL session cannot be resumed. SSL-based solutions, including HTTPS, are susceptible to the Substitution Attack. The idea behind it breaks the main security rule: references to the communication end-points are not transmitted in a reliable way. [Rescorla 2001] presents a man-in-the-middle attack against HTTPS (an example of such a Substitution Attack) mounted on the HTTP request that contains the first reference to an HTTPS URL. The attacker replaces the valid HTTPS URL with its own harmful one instead. If the attacker provides a valid server certificate, all the client's security and integrity checks will be bypassed. The difference between the valid and the faked URL may be very subtle (e.g. https://mybankaccount.com vs. https://mybankacc0unt.com) and may not be noticed by a human. The same type of attack may be mounted against a DC that is issued for a faked address (e.g. invalid: www.mysercetaccount.com instead of valid: www.mysecretaccount.com) – human users may override the security warning and continue the connection. It must be verified that an appropriate message is presented to the user or an appropriate exception is thrown to the application stack. [Schechter et al 2007] evaluated Website authentication with removed HTTPS indicators (man-in-the-middle and "phishing" attacks were mounted) – it was measured that no experiment participants withheld their passwords when they logged in to the insecure HTTP-only bank website. [Callegati et al 2009] discusses a man-in-the-middle attack on a modern transaction system based on HTTPS. It points out that the use of self-signed or invalid certificates, as well as ignoring security warnings, is common among organizations and easily exposes the system to an HTTPS man-in-the-middle attack. [Saltzman, Sharabani 2009] presents a man-in-the-middle attack with the use of an IFrame inserted by the attacker into an intercepted server response, which causes a faked request to the attacker's URL along with the user's cookies (Active SideJacking, SurfJacking).
It may affect even HTTPS sites if there are any resources (like JavaScript files) loaded outside the SSL-secured area. Refer to the proxy discussion below for the man-in-the-middle proxy attack. A Downgrade Attack against an HTTPS-based solution should cover three cases. In the first one, when the client initiates the connection with https://, an attempt from the server side to transmit data over http:// should not be accepted by the client. The second case should cover the situation when only https:// is used but steps are taken to change the cipher suite to a weaker one. The implementation should survive such an attack. Last but not least, a range of test cases should validate system behavior when errors or exceptions are generated. They should not cause unexpected degradation of the link security, and the impact of a possible DoS attack should be limited to a minimum. If there is no other choice, closure of system functions must be done gracefully and all resources must be freed. The Replay Attack is one of the easiest attacks to execute against a secured application. Its goal is to eavesdrop, record and then replay the same data over the wire once more, willing to either fool the connection end-point or cause its DoS. For an SSL-secured Web session such an attack can be simulated by reordering its timeline (e.g. by pushing the "Back" button in the browser) or trying to submit the same data again (e.g. by pushing the "Refresh" button). A Replay Attack safe implementation should have a state machine on each communication end-point that prevents "dirty" data from influencing the current state of the SW. HTTPS proxies are another area of testing activities [Chen et al 2009]. The proxy rationale is caching for multiple clients, to improve performance when the same content is fetched, and elegant pass-through of firewalls, but it may not work directly with SSL. HTTPS proxies should be tested in a setup that emulates a very large number of HTTPS transactions occurring from large intranet environments.
It is a very sensitive configuration of an HTTPS-based solution, starting from the point that the SSL session between the server and the client passing the proxy is encrypted, thus proxy caching is disabled. [Khare, Lawrence 2000] presents the HTTP CONNECT method for establishing end-to-end tunnels across proxies – the proxy is told to initiate the connection to the server on the client's request, and then to pass the data between client and server without examining or changing it. The client transmits SSL data to the proxy as if it were the server. However, this method may suffer from the man-in-the-middle proxy attack – a faked proxy accepts the HTTP CONNECT [Lai et al 2008] request from the client but instead of proxying
the data, it negotiates a pair of SSL connections (client–proxy and proxy–server), making the data available to the proxy in Cleartext form, which breaks the security rule. The main concept of such an attack is to create a very special DC that has * as its Common Name. [Rescorla 2000] forbids such certificates from being issued (even the case *.*…) but it should be verified in the field by tests that the implementation is aware of that. HTTPS-related tests should examine whether the SW interacts correctly with virtual hosts. The test should simulate two cases: an SSL connection established after any HTTP data is transmitted, and before. The server uses the HTTP Host header in the request to determine which virtual host is being accessed. When the request is transmitted after TLS-HSP, the server must negotiate the SSL connection without the guidance of the HTTP Host header. There are several methods of implementing virtual hosting; some are not fully consistent with the SSL design. One approach is to have a different DC for each virtual server (issue: the client must know which virtual server it is attempting to connect to). The second one consists of assigning multiple IP addresses to a single network interface, mapping each virtual server to its own alias. When the server accepts the SSL connection, it looks at the IP address of the accepted connection, then looks up the virtual host used. [Rescorla 2000] proposes the creation of a DC that serves multiple hosts, using regular expressions, an improved version of the idea presented in [Hickman, Kipp 1995]. Tests examining HTTPS performance should take into consideration whether basic SSL performance rules are fulfilled.
As described in [Rescorla 2001], these include: RSA as the asymmetric algorithm (DSA performance tends to be between two and ten times slower than RSA of comparable strength; RSA is faster than DSA for verification and comparable for digital signing); wise choice of private key size (the shortest private key that meets cryptanalysis resistance requirements); appropriate choice of symmetric cipher and digest algorithm; session resumption (to improve the handshake operation by eliminating master_secret manipulation – nearly always worth doing on the client side, and in some cases on the server side when the transmission is resumed quite often, to limit CPU time usage, memory for stored session reuse, and context switches into the kernel); and record size (the longest possible data chunks should be used; the border value is set by ciphering capabilities). It is the ability to simultaneously and securely handle multiple clients that is one of the most demanding SW performance features. The transactions between both parties must follow the ACID principle (Atomicity, Consistency, Isolation, and Durability) [Reuter, Haerder 1983]. On the other hand, serving one SSL client must not cause starvation of another [Rescorla 2001]. Programmatically, multiple clients are served in such a way that a separate thread (Windows-based platforms) or a process (UNIX-based platforms) is assigned to each of them. Then the OS scheduler is responsible for distributing CPU time between the server threads/processes to handle the clients' requests fairly. Figure 53 depicts HTTPS server processing from a single HTTPS client perspective, involving an extensive sequence of requests deep in the distributed SW – the processing timeout is calculated from the moment when the HTTPS request is received by the server to the moment when the full HTTPS response is prepared (including accessing all required resources and the ACID approach). Meanwhile the other n-1 clients send HTTPS requests that are processed in the same manner.
Figure 53. HTTPS client/server processing (MVC paradigm) – illustration of the HTTPS processing timeout for a single client (measured across Controller, Data model and View processing on the HTTPS Server) when the requests of the other SSL clients (Client 2 … Client n) are received and processed [source: author]
The conclusion is that the moment of an HTTPS timeout for one client is the best performance testing point for the system (it may cause a cascade of further timeouts and extensive usage of system resources for each transaction).
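The single-client timeout view above can be exercised with a toy load model: n clients contending for a fixed worker pool, where the total (worst-case) completion time grows as queueing sets in. The pool size and the simulated 50 ms processing cost are arbitrary assumptions, not figures from the dissertation:

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Toy model of the server side of Figure 53: a fixed pool of workers serves
# n clients; the worst-case completion time grows once clients start queueing.
def handle_request(_):
    time.sleep(0.05)          # stand-in for controller/model/view processing
    return True

def worst_latency(n_clients, workers=4):
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(handle_request, range(n_clients)))
    return time.monotonic() - start

assert worst_latency(4) < worst_latency(16)   # queueing pushes the worst case up
```

Raising n until the worst-case latency crosses the HTTPS timeout reproduces exactly the condition the text identifies as the most interesting performance test point.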
3.9. Summary
The performance and security test bed should be constructed in strict correlation with the SRD. In other words, the test bed should simulate the deployment environment as closely as possible. It should reflect HW characteristics, OS flavors, system components, network capabilities, external dependencies, usage scenarios, and even potential disaster threats. On the other hand, there is a set of generic tests that deal with stability, interoperability, and usability (e.g. better application usability may speed up the GUI operations performed by application users, which influences the programmatic operations throughput and impacts system performance) that must also be taken into consideration. The test base should include fully automated tests (heavily dependent on the tool capabilities mentioned in appendixes A and B) run against the code, as well as semi-manual, semi-automated tests, supported by the tools but under human supervision to locate any accidental failures or unexpected program execution paths. During selection of testing procedures, four invention disclosures were submitted to the Intel Corporation scientific committee: "ESP tunnel mode security enhancement", "Intelligent performance measurement method" (got Intel Secret classification), "Test case description standard for network traffic generators", and "Method and apparatus of intelligent Random Early Detection (RED) behavior on IPSec gateway with drop policy anticipation".
CHAPTER 4: PROPOSAL OF MA2QA APPROACH
This chapter presents fundamentals of performance and security test model for improving quality of distributed applications working in public/private IP network environments: Multidimensional Approach to Quality Analysis (MA2QA). It analyses correlations between performance and security aspects from design and implementation perspectives. Then details of MA2QA are given: starting from its concept, basic formula, multidimensional quality matrix, set of example SW metrics used in MA2QA, example score card, and sample evaluation.
4.1. Application model
4.1.1. Subject of analysis
Subject of analysis from the application standpoint is depicted in Figure 54, derived directly from Figure 1. Distributed applications, consisting of three layers: GUI, Middleware, and Network layer, communicate between distributed components (acting within their private networks) over a public network. Two classes of applications are analyzed: class C – using insecure IP / HTTP communication; class C' – equipped with IPSec / HTTPS for securing the public communication channel.
Figure 54. Application model [source: author]
Instances of C and C' (applications R/R, P/S and R/R', P/S' respectively) are subjected to security and performance considerations, in particular from the view of possible relations between performance and security metrics.
4.1.2. Design and implementation for performance and security
Unquestionably, performance and security are attributes situated at opposite corners of system design [Foster et al 1998] [Ghafoor 2007]. In the most drastic examples, the most secure component is the one totally isolated from the outside world, thus with communication performance reduced to zero (e.g. disconnected physically from the network, placed inside a Faraday cage to eliminate possible wireless accessibility), while the most performance-optimized system is the one with all security rules turned off. Obviously such implementations cannot be accepted for SW acting in distributed public/private network environments, thus a reasonable trade-off between security and performance is desired, starting from the appropriate design, resulting in an adequate implementation, confirmed by security- and performance-targeted testing in the production environment. Table 22 lists the most representative areas of interest in terms of security and performance, pointing out the appropriate design and implementation principles.
Table 22. Best performance vs. best security for distributed SW applications [source: author]
Area: Input validation
Best performance: • No input validation
Best security: • All inputs validated against XSS, expected values, SQL injection, buffer overflows/underflows
Area: Ciphering / deciphering
Best performance: • No ciphering/deciphering operations, Cleartext traffic only
Best security: • Ciphering/deciphering with strong-key asymmetric cryptography or symmetric cryptography
Area: Authentication
Best performance: • No authentication, all communication endpoints are trusted
Best security: • Endpoint authentication of origin incorporated, digital signatures in use; • External resources cannot be trusted
Area: I/O operations
Best performance: • Number of I/O operations limited to minimum; where possible no I/O operations at all (no HD reads/writes, all application code and data loaded to fast-accessible memory); • The most power-consuming operations (e.g. related to asymmetric cryptography) done in parallel or supported with heavy-duty HW & SW
Best security: • Fully controlled I/O operations, with secure (but rather slow) memory allocation techniques used; • Zeroing secured memory after use (SecureZeroMemory on Windows; a plain memset may be optimized away by the compiler); • Temporary files content and location must be chosen wisely; • Do not trust symbolic links
String operations
• Best performance: using libraries that accomplish string manipulations as fast as possible.
• Best security: use of safe string libraries; taking advantage of compilers that deprecate all potentially dangerous string functions [Howard et al 2005]; avoiding format string vulnerabilities (e.g. C printf %n and %x, which may cause arbitrary stack memory disclosure or arbitrary writes to memory); paying attention to string sizes (e.g. C WCHAR is 2 bytes, CHAR is 1 byte; Unicode vs. ANSI strings).

System dependencies
• Best performance: no system dependencies; use of statically-linked libraries during compilation, which increases executable size but speeds up access to library routines and resolves dependency problems by adding the code directly to the executable or loading it at run time into the application address space (address calculated during compilation as a static memory offset).
• Best security: fully controlled system dependencies; a balance between dynamic and static libraries (making it harder to replace a system library with one supplied by an attacker).

Interaction model
• Best performance: asynchronous P/S model with no limitation on incoming and outgoing traffic.
• Best security: synchronous R/R model with message ordering, an anti-replay mechanism, and session timeouts.

Memory usage
• Best performance: aggressive deallocation, which may result in double-free vulnerabilities; conservative deallocation, which results in memory leaks.
• Best security: memory secured against swapping to HD (NtLockVirtualMemory on Windows, mlock() on Unix); double-free vulnerabilities avoided; memory allocated by one library never freed by another; memory and heap allocators/deallocators never mixed; encrypted pointers.

Thread control
• Best performance: avoid deadlocks, where threads become blocked on acquiring a synchronization primitive; synchronize between threads only if needed; if possible, synchronize between a group of threads to avoid oversynchronization.
• Best security: access to data structures from multiple threads must be serialized (mutexes, spinlocks, reader/writer locks) in order to prevent corruption; eliminate deadlocks when acquiring resources, as a deadlock is the perfect DoS attack against SW.
Data initialization
• Best performance: statically pre-assigned values are used intensively; variables are set only if required.
• Best security: every variable is initialized at the time it is declared, preventing attacks on uninitialized stack variables [Flake 2006].

Execution rights
• Best performance: system administrator impersonation for all tasks; no SW privilege policy at all.
• Best security: least privilege assigned to perform the task; privileges separated; privileges dropped as soon as possible; privileged methods separated from non-privileged methods; each system resource with its own restrictive permissions.

Error handling
• Best performance: a failure is processed as fast as possible; some resources may not be released, allowing quick application resumption after error recovery.
• Best security: failures are processed gracefully; all allocated resources are closed; a single point of failure.
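Several of the string and memory practices from Table 22 can be illustrated in C. The sketch below is illustrative only: bounded_copy, print_untrusted and secure_zero are hypothetical helper names standing in for a safe string library routine, %s-guarded printing, and SecureZeroMemory/explicit_bzero-style zeroing, respectively; they are not part of any standard API.

```c
#include <stdio.h>
#include <string.h>

/* Copy at most dst_size-1 bytes and always NUL-terminate: a bounded
 * alternative to strcpy, in the spirit of safe string libraries.
 * Returns the number of bytes copied (excluding the NUL). */
size_t bounded_copy(char *dst, size_t dst_size, const char *src) {
    size_t n = strlen(src);
    if (dst_size == 0) return 0;
    if (n >= dst_size) n = dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}

/* Print untrusted text safely: pass it as an argument, never as the
 * format string, so embedded %n or %x sequences are treated as data. */
void print_untrusted(FILE *out, const char *untrusted) {
    fprintf(out, "%s", untrusted);   /* safe */
    /* fprintf(out, untrusted);        unsafe: format string vulnerability */
}

/* Zero a buffer holding secrets through a volatile pointer so the
 * optimizer must not elide the writes; a portable stand-in for
 * SecureZeroMemory (Windows) or explicit_bzero (BSD/glibc). */
void secure_zero(void *p, size_t len) {
    volatile unsigned char *vp = (volatile unsigned char *)p;
    while (len--) *vp++ = 0;
}
```

The key design point matches the table: the bounded copy trades a length check (a small performance cost) for immunity to buffer overflows, and the volatile zeroing trades an un-optimizable loop for the guarantee that secrets do not linger in memory.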
The analysis presented in Table 22 is the starting point for further discussion of an integrated testing method joining security and performance testing, and for discovering correlations between performance and security aspects of distributed applications working in public-private network infrastructures (within the experiments in chapter 5).
4.2. Quality model
4.2.1. Quality tree
Figure 55 depicts the quality tree for the distributed applications analyzed within this dissertation. Application quality has two quality attributes: security and performance. Each quality attribute has adequate quality characteristics assigned to it (compare with: Figure 63): security-specific (access control, communication security, vulnerability to malicious input) and performance-specific (time processing, user comfort, data processing).
[Figure: quality tree structure: application quality → quality attributes (security, performance) → quality characteristics for a given application layer (access control, communication security, vulnerability to malicious input; time processing, user comfort, data processing) → quality metrics M[i,j,k,l] → testing and monitoring tools]

Figure 55. Quality tree [source: author]

Finally, each quality characteristic has its specific quality metrics correlated with it (see: Table 25), supported from the very bottom by a set of testing and monitoring tools. Figure 55 is the starting point for the testing model presented later in chapter 4.3.
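The four-index quality metric M[i,j,k,l] can be pictured as a small data structure. The C sketch below is a simplified assumption: the dimension sizes and the index meaning (i = quality attribute, j = quality characteristic, k = metric index, l = application layer) are illustrative, and attribute_score is a hypothetical aggregation, not the dissertation's own formula.

```c
/* Hypothetical dimensions for the metric tensor M[i,j,k,l]:
 * i = quality attribute (0 = security, 1 = performance),
 * j = quality characteristic, k = metric, l = application layer. */
#define N_ATTR 2
#define N_CHAR 3
#define N_METR 4
#define N_LAYR 3

/* Average all metrics recorded for one quality attribute. Metrics are
 * assumed pre-normalized to [0,1]; unset entries are marked negative
 * and skipped, so sparse measurement campaigns still yield a score. */
double attribute_score(const double m[N_ATTR][N_CHAR][N_METR][N_LAYR], int i) {
    double sum = 0.0;
    int count = 0;
    for (int j = 0; j < N_CHAR; j++)
        for (int k = 0; k < N_METR; k++)
            for (int l = 0; l < N_LAYR; l++)
                if (m[i][j][k][l] >= 0.0) {
                    sum += m[i][j][k][l];
                    count++;
                }
    return count ? sum / count : 0.0;
}
```

Keeping the metrics in one indexed structure makes it straightforward for the tools at the bottom of the tree to feed measurements upward into per-attribute scores.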
4.2.2. Scope of quality analysis
The scope of the quality analysis done within this dissertation is depicted in Figure 56. It illustrates the area of the executed experiments. First of all, different variants of the applications are analyzed: cleartext-like (pure IP / HTTP) and secure-like (with IPSec and HTTPS). The applications are divided into clear SW layers: GUI (application) layer, middleware layer, and network layer operating on top of the communication HW. Two messaging paradigms are taken into consideration: Request/Response (R/R) and Publish/Subscribe (P/S). One of the most crucial factors is the test environment description, including the testing and monitoring tools used and the testing procedures applied. Finally, the experiments are targeted at finding correlations between performance and security aspects of distributed applications.
[Figure: scope of quality analysis: the quality model at the center, surrounded by the application variant (with and without security protocols), the application architecture (GUI layer, middleware layer, network layer), the application messaging paradigm (R/R, P/S), the test environment (testing and monitoring tools, testing procedure), and the correlations between performance and security aspects]

Figure 56. Scope of quality analysis [source: author]
4.2.3. Method for finding the correlations between the metrics
Within this dissertation an iterative approach to SW development and testing is applied. The output of each iteration is subjected to quality evaluation, allowing total SW quality to be improved and possible relations between security and performance aspects of the application variants under test to be found. The crucial steps of such a quality analysis procedure (Figure 57) are as follows:
1. For each variant (A, A', A''...) of the application:
   a. Expand the quality model and choose a set of supporting tools and metrics;
   b. Execute the quality assessment procedures (experiments);
   c. Gather the test results;
   Repeat the loop until a given application variant meets the quality criteria.
2. Analyze all experiment results, including security and performance properties of the application;
3. Find relations between security and performance metrics of subsequent variants.
[Figure: quality analysis flow: starting from the initial application A, experiments (tools and procedures, quality assessment) are executed within the project lifecycle; changes ∆', ∆''... to the application produce new variants A', A''...; once the criteria are met, the final application A is reached, and the results are analyzed to find relations between security and performance of A, A', A''... according to the changes ∆', ∆''...]

Figure 57. Method for finding the correlations between the metrics [source: author]
4.3. Multidimensional Approach to Quality Analysis (MA2QA)
4.3.1. MA2QA fundamentals
Multidimensional Approach to Quality Analysis (MA2QA) is an integrated approach to security and performance testing for improving the quality of distributed applications working in public-private network infrastructures. MA2QA allows different variants of the application to be compared and the most appropriate one to be chosen from the perspective of the given criteria (Figure 58). MA2QA naturally supports application development in iterations, allowing security and performance characteristics to be improved with each new application variant.
[Figure: MA2QA procedure flow: an application variant, together with the metrics, quality model, and assessment procedure, enters experiment execution in the test environment (test tools, test procedures); the results are analyzed against the criteria; if the assessment procedure is incomplete, experiments continue; otherwise a comparative analysis of results decides whether the variant is accepted or a new application variant is created]

Figure 58. MA2QA procedure [source: author]
Each application variant A is subjected to experiment execution and analysis until the assessment procedure is complete. Then the results are analyzed on a broader basis, in comparison to all previous results, in order to find a reasonable compromise between the security and performance metrics (see: chapters 4.3.3 and 4.3.4). If the quality of the application is acceptable, the MA2QA procedure stops (meaning that application variant A meets the criteria). Otherwise a new application variant is created and the MA2QA procedure is repeated.
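The comparative-analysis and acceptance steps of the MA2QA procedure can be sketched in C. The threshold values, the weighted compromise score, and the names variant_result, meets_criteria and best_variant below are illustrative assumptions, not the actual MA2QA criteria.

```c
/* One assessed variant: aggregate security and performance scores,
 * assumed normalized to [0, 1]. */
struct variant_result {
    double security;
    double performance;
};

/* Acceptance criterion: both attributes above their minimum thresholds. */
int meets_criteria(struct variant_result r, double min_sec, double min_perf) {
    return r.security >= min_sec && r.performance >= min_perf;
}

/* Comparative analysis: among the variants meeting the criteria, pick
 * the one with the best weighted compromise w*security + (1-w)*performance.
 * Returns the index of the best variant, or -1 if none is acceptable
 * (i.e. a new application variant must be created). */
int best_variant(const struct variant_result *r, int n,
                 double min_sec, double min_perf, double w) {
    int best = -1;
    double best_score = -1.0;
    for (int i = 0; i < n; i++) {
        if (!meets_criteria(r[i], min_sec, min_perf)) continue;
        double score = w * r[i].security + (1.0 - w) * r[i].performance;
        if (score > best_score) {
            best_score = score;
            best = i;
        }
    }
    return best;
}
```

The weight w makes the security/performance trade-off explicit: a security-critical deployment would choose w close to 1, while a latency-sensitive one would lower it.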
4.3.2. MA2QA usage in iterative application development
The iterative approach to distributed application development with MA2QA incorporated into the process means that a new application variant A, A', A''... is created for each improvement (e.g. pure IP replaced with IPSec). The improvement is the increase ∆ in the iterative approach. The process is continued until a subsequent variant of A meets the test criteria derived from the requirements defined in the SRD (Figure 59). The uniqueness of MA2QA lies in the definition of the increase ∆. In traditional iterative development, ∆ relates mainly to new functionality being added to the SW. With MA2QA, ∆ may be an improvement to security or performance without adding any new functionality.
[Figure: MA2QA in iterative development: deploy application A to the test environment; use MA2QA against A; if the criteria are not met, define the increase ∆, modify the application (A += ∆), and repeat; otherwise stop]

Figure 59. MA2QA in iterative application development [source: author]