
APPENDIX
Benchmark Test Harness for .NET

When migrating to a new platform like .NET, many questions regarding performance arise. Presented with a choice between multiple ways of accomplishing the same task, performance can often be the deciding factor when determining which technique to use. Determining which technique is the quickest may seem simple, but can rapidly deteriorate into complexity. This appendix presents a performance test harness that allows for consistent and robust execution of performance test cases; the benchmark test harness was used to conduct the tests that are presented in this book. Chapter 2 presents a discussion of techniques and practices that assist in developing an accurate benchmark, whereas the focus of this appendix is a description of the implementation of the harness.

Comparing Performance

Analyzing the comparative performance of two or more software technologies involves two critical steps. The first step is describing a representative test case for each technology, and the second step is accurately timing the test cases. Describing a representative test case is the domain of software engineering judgment and experience, but there are a number of guidelines that can help:

• Each test case should accomplish the same or a similar end result. If the performance of the stream-based XML parsing in the Framework Library is being compared to traditional DOM-based XML parsing, the XML test document and result processing for each test should be as similar as possible.

• The supporting infrastructure for the test case should be the same as it is in production. For example, consider a piece of code that needs to pull data from SQL Server and cache the data in custom business objects. The System.Data.DataReader type should be the quickest option, but System.Data.DataSet has some added functionality that may be useful if the performance delta is small. When testing the relative performance, it is important that the configuration of SQL Server and the load that the server is experiencing are the same as in the production case. It may turn out that running the stored procedure to collect the data and transporting it over the network takes 95 percent of the time, and the choice between DataReader and DataSet is insignificant in performance terms.

• The test case should be profiled to ensure that supporting code is not taking up a significant amount of the time. If a test is aimed at determining the performance cost of a virtual function compared to a nonvirtual function, ensure that the code inside the functions is not overly expensive.

• The test case should be conducted enough times to make the cost of setting up the test harness and calling the test method insignificant. A profiler can assist in making this determination.

• The test case should not be so insignificant that the JIT compiler can discard it. Inspection of the x86 assembly code generated by the JIT compiler for a release build of an assembly will allow a determination of the inlining and discarding that has occurred. (A sketch illustrating these last two guidelines follows this list.)
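As an illustration of the last two guidelines, the following is a minimal sketch of a test method shaped for use in a harness. The LoopCount constant and the StringConcatTest method are hypothetical names introduced here for illustration; the sketch shows a shared loop-count constant so the measured work dominates call overhead, and a result assignment so the work is not trivially discarded.

using System;

public class TestCases
{
    // Hypothetical shared constant; using a single literal for all test
    // methods keeps every test running for the same number of iterations.
    public const int LoopCount = 100000;

    // A representative test case: the work inside the loop dominates
    // the cost of invoking the method itself.
    public static void StringConcatTest()
    {
        string s = string.Empty;
        for (int i = 0; i < LoopCount; ++i)
        {
            // Assign the result so the concatenation is not
            // trivially eliminated as dead code.
            s = string.Concat("abc", "def");
        }
    }
}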
Once representative test cases have been chosen, conducting the tests seems like a simple step. The code that follows shows the simplest implementation possible:

DateTime startTechniqueA = DateTime.Now;
TestNamespace.TechniqueATest(); // run test
DateTime endTechniqueA = DateTime.Now;

DateTime startTechniqueB = DateTime.Now;
TestNamespace.TechniqueBTest(); // run test
DateTime endTechniqueB = DateTime.Now;

TimeSpan timeTakenA = endTechniqueA - startTechniqueA;
TimeSpan timeTakenB = endTechniqueB - startTechniqueB;

Console.WriteLine("Test A: " + timeTakenA.ToString());
Console.WriteLine("Test B: " + timeTakenB.ToString());

There are a few problems with this code:

• If the test case executes a method that has not been called previously, JIT compilation of the method will occur. This will distort the results in favor of the method that does not take the JIT compilation hit.

• The ordering of the tests may distort the results in some cases. For example, if data is being retrieved from SQL Server, the second test can run quicker due to caching by SQL Server.

• Some code is required to process the results of the tests into an easily comprehensible form. The code for result processing is not test specific, and should be factored out.

• Some tests will require setup and teardown functions to execute on either side of the test, and the time taken for these functions should not be included in the test results. Mixing setup code with the timing code obscures the intent of a function, and increases the chance of error.

• The results of some tests should be thrown out. Criteria for throwing a test result out are test specific, but will typically involve the occurrence of a significant event that takes processor time away from the executing test.

• Some tests should be executed on a priority thread to minimize the interference from other threads executing on the same processor. The code to set up a secondary test thread should be factored out into a reusable test harness.

• Some tests take a long time to execute, and visual feedback of the test progress is desirable.

• The DateTime class is not an accurate timer for operations that complete in under a second, and a timer with higher accuracy is preferable (a sketch of one option appears in the next section).

• It is possible for one test to run for a different number of loops than other tests if the loop-termination literal is embedded in the test method. In the process of testing, it is common to change the loop-termination literal a number of times until the tests run for a reasonable time period. Failing to have a formal method for using the same loop-termination literal in all tests can lead to incorrect results.

• A future version of the CLR may include a progressively optimizing JIT compiler, which implies the tests should be run a number of times to simulate the execution of critical-path code in a real program.

This list highlights the need for a test harness that can alleviate these issues, which is presented in the following sections.

Implementing the Benchmark Test Harness

A number of issues need to be considered in the design of a harness for generating timing data on benchmark runs. Setting up and running test cases should not be overly onerous or difficult, and different categories of test runs need to be supported. Producing accurate results is the most important design goal, however, and failure to achieve this goal will render the test harness useless. Choosing the correct technologies is as important as bug-free harness code, and this section covers the motivation for the current implementation of the harness.
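One accuracy issue flagged above is DateTime's coarse resolution. The following is a minimal sketch of a higher-resolution timer built on the Win32 performance counter via P/Invoke; the HighResolutionTimer class name is hypothetical, and this illustrates the general approach rather than the harness's actual timing code. (On .NET 2.0 and later, System.Diagnostics.Stopwatch wraps the same counter.)

using System;
using System.Runtime.InteropServices;

// Minimal high-resolution timer built on the Win32 performance counter.
public class HighResolutionTimer
{
    [DllImport("kernel32.dll")]
    private static extern bool QueryPerformanceCounter(out long count);

    [DllImport("kernel32.dll")]
    private static extern bool QueryPerformanceFrequency(out long frequency);

    private long _start;
    private long _end;

    public void Start()
    {
        QueryPerformanceCounter(out _start);
    }

    public void Stop()
    {
        QueryPerformanceCounter(out _end);
    }

    // Elapsed time in seconds, derived from the counter frequency.
    public double ElapsedSeconds
    {
        get
        {
            long frequency;
            QueryPerformanceFrequency(out frequency);
            return (_end - _start) / (double)frequency;
        }
    }
}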
Function Invocation

The first step in the harness design is deciding how the test case will be created and executed, where a test case is defined as a method that contains the code necessary to exercise a technology or technique in a representative manner. Creating test cases should be simple, and the overhead of calling these functions should be minimal so as not to distort the test results. The common language runtime (CLR) exposes a number of techniques for implementing the function invocation section of the harness. These include:

• Interfaces: Each test could be contained in a class that implements a test interface. The interface would expose a RunTest method, and the harness could iterate over a number of objects, calling the RunTest method on each.

• Reflection: A series of object references could be passed to the harness, and a standard method, say Test, could be bound to and invoked on each object.

• Delegates: A test method delegate could be defined, and methods that contain test cases could be added to the delegate invocation list.

When using reflection, it is not possible to ensure at compile time that an object registering for performance testing exposes a Test method. Following the widely accepted programming principle that compile-time enforcement is preferable to runtime error detection, and given the performance impact of late-bound method invocation, the use of reflection was rejected. The use of interfaces would require a separate type for each test method, which could become cumbersome. In contrast, delegates support the chaining together of a number of methods in a single invocation list, and allow for the use of numerous methods from the same type, as shown in the following snippet. Given these qualities, delegates were chosen as the function invocation mechanism.

public delegate void DoSomething();

// returns two delegate methods chained into a single invocation list
public DoSomething ReturnDelegateList()
{
    DoSomething ds = null;
    ds += new DoSomething(FirstDelegateMethod);
    ds += new DoSomething(SecondDelegateMethod);
    return ds;
}

// another method in the same type that returns a single delegate
public DoSomething ReturnSingleDelegate()
{
    return new DoSomething(FirstDelegateMethod);
}

public void FirstDelegateMethod()
{
    return;
}

public void SecondDelegateMethod()
{
    return;
}

The cost of making the delegate call will be included in the overall timing for a test method call, and must be very small. The cost of making a function call is generally proportional to the amount of indirection that the runtime must go through to locate the pointer to the underlying function. The level of indirection for "direct" calling technologies like static, instance, and virtual functions can be determined by the number of x86 MOV instructions needed to execute the call. An inspection of the x86 instructions that the JIT compiler produces to call a delegate shows that only three MOV instructions are needed to locate the function pointer for the delegate method, which is less than the four MOV instructions required to call a virtual method through an interface. Having only three MOV instructions indicates that very little indirection is required to call a delegate method.

Function Ordering

The order in which the test cases are called and the practice of executing test functions for only a single run were identified as problem areas earlier.
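To make the delegate-based design concrete, here is a minimal sketch of how a harness might collect test methods in a delegate invocation list, absorb the JIT-compilation cost with an untimed warm-up call, and time each registered test. The TestCase and TestHarness names are hypothetical, and this simplified illustration omits the ordering, thread-priority, and result-filtering features discussed above.

using System;

public delegate void TestCase();

public class TestHarness
{
    private TestCase _tests;

    // Register a test method by adding it to the delegate invocation list.
    public void AddTest(TestCase test)
    {
        _tests += test;
    }

    // Time each registered test over the given number of runs.
    public void RunTests(int runs)
    {
        if (_tests == null)
        {
            return;
        }
        foreach (TestCase test in _tests.GetInvocationList())
        {
            // Untimed warm-up call so that JIT compilation of the test
            // method does not distort the first timed run.
            test();

            // DateTime is used here for brevity; a higher-resolution
            // timer such as the one sketched earlier is preferable.
            DateTime start = DateTime.Now;
            for (int i = 0; i < runs; ++i)
            {
                test();
            }
            TimeSpan elapsed = DateTime.Now - start;
            Console.WriteLine(test.Method.Name + ": " + elapsed.ToString());
        }
    }
}

Usage would be a matter of constructing a TestHarness, calling AddTest with a new TestCase delegate for each test method, and calling RunTests; keeping all tests in a single invocation list reduces registration to one line per test.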