Software Testing





If you have any problems with this document or course, please contact the STEP Help Desk - NASA- [email protected] or (440) 962-3033

Module 1

My name's Ann Marie Newfelder [assumed spelling] and I've taught at NASA before and I know there's at least three of you that I recognize from previous classes. My background's actually in software reliability and I've been a software manager and a software tester and a software testing manager for quite a few years, so that's why I'm presenting the software testing class. I've also worked on the software failure modes effects and [inaudible] analysis class which is now a webinar, so my background is in software testing. I've been able to collect quite a bit of data on what's effective and what's not with respect to reducing defects. So a lot of what you're gonna see in this class are things that I've actually seen work either at my location or some other organization. So you're gonna see that as we go through I'm gonna present some facts that say okay, here's a test method, this is how effective this thing was at other companies and other places. So with that being said, I'm gonna go ahead and start. Okay, these are our objectives for today. Basically we're gonna talk about testing and how reliability and testing are kind of related. So our goal here is to find as many bugs as possible, so that's why I talk about reliability there. We're not gonna talk about reliability metrics or anything like that. We're gonna talk about testing, but the idea is we wanna find bugs and get rid of them. I'm gonna talk about the testing process and strategy. I know one of you talked about planning quite a bit so we're definitely gonna talk about planning. I'm gonna talk about three different types of testing, unit level test, integration, and system level test, and I'm gonna talk about different viewpoints on testing. There's different phases of testing and then there's different viewpoints on testing and I'm gonna cover them all. I'm gonna tell you how to find the right tools for testing. I don't think any of you mentioned anything about tools, but I'm gonna kinda give you an executive summary of some tools that could help you, and some of these tools are not terribly expensive. As a rule I don't cover expensive tools in any class. I feel like whatever I teach you, you should be able to execute without any tool at all, and if there's an inexpensive tool that helps you I'm gonna point it out. I'm also gonna show you at the very end of each section some metrics that can help make your testing more efficient and more effective, and these are real simple things that don't take a lot of work to implement but they can help you determine how effective your testing is. And finally the very last thing we'll cover in each module is your exit criteria for testing. How do you know when, okay, we're good to go? You know, you're reviewing somebody's test finds, I think you'd like to have some exit criteria. When can I say yes, this is good, we're going forward? So those are our class objectives. These are the topics I wanna cover in this section, the introduction module. I'm gonna go over a few facts about testing and reliability. I wanna show you the overall testing process and strategy, and then I'm gonna give you a very high-level viewpoint of what we're gonna cover for the rest of the class. Okay, I've been collecting some data for the last almost 20 years, it's actually been about 18 years, and I've been collecting data from industry, and so far I have 115 software projects from not quite 100 organizations.
And I've been collecting what people do versus how many bugs end up in their software once they're done testing. And so these are some of the metrics I have so far. By the way, the white paper that's on your CD has all the details about what's on this page, but basically what I've found is that in the left-hand column are different types of things that are related to the testing, different types of tests, different test activities. And the right-hand column is the average percentage increase in defect densities when you compare organizations that don't do this thing versus organizations that do. Now, keep in mind the one at the top there, the 3319%, these metrics weren't computed in a vacuum, so if you do nothing else but do the first one you're not gonna get a 3300% reduction in defects. The people who used unit testing metrics, they were also unit testing. They were also employing a lot of the other things you see on the chart; they were almost certainly doing all of these things down here. So it's kind of a cumulative effect. I don't want anyone to think you could pick one thing on the list and do it. Valid boundary test. Boundary testing is something we're gonna talk about probably in the next hour or so. People that did valid boundary testing had quite a few fewer defects than organizations that didn't. Using unit testing tools, that had quite an impact. Testing algorithms, path logic and initialization, those are all things we're actually gonna talk about in the next hour. Having formal reviews on unit testing, which means the developer writes the code, they test the code, and they have to tell someone else that's not a peer what they tested. In my database there was only maybe, I could give you the exact number I guess at some point, something like 10% of the organizations that actually did that. At the other 90% the software engineer was allowed to test whatever they wanted. You know, they said okay I'm done testing and everyone said okay, we take your word for it. Formal means they're not allowed to test whatever they want; they have to convince someone else that how they tested their code was the right tests. Then those who used procedures and checklists, and what's on your CD is a procedure and checklist, the people who used those had fewer defects, or something like that. People who did module level exception handling testing, meaning they tested their code to see what it didn't do, they had a reduction. Functional unit tests. What that means is, I know several of you are testing requirements and I'm sure you're testing them from a validation standpoint. What this means is testing them from a verification standpoint, from the developer's perspective. And so you could see the percentage impact there. And then finally having specific criteria. There were other things other than these. These were just the ones that had some pretty big differences in defect density, even the one at the bottom, a 45% decrease in defects. That's pretty monumental, okay? So everything you're gonna see in this class is kinda based on some of these facts. These are the same kind of facts but related to integration testing. I know I heard at least two or three of you say you were doing integration test. I'm not sure what integration testing means at NASA, but what it means in my materials is that you're integrating multiple CSCs and CSCIs, and so you're not yet at the level of doing a validation test.
Okay so these are some of the things people test during integration testing, input/output and input/output related faults. Like for example, just a real simple input/output, let's say you have a software system that has to work with a printer. Okay, a real simple I/O test would be, turn the printer off and see if the software figures out that the printer's turned off. Okay, that would be a real simple example of testing the input/output faults. Timing is another thing that gets tested lots of times during integration. Testing for hardware interfaces or firmware interfaces wherever applicable is one thing people do during integration. You might also test state diagrams or sequences, which would be transaction flow. State diagrams are not always applicable if you have a stateless system. For what you all are working on, I'm guessing somewhere in there there's a stateful system. I'm gonna show you an example later today after lunch of a system where there's not a single state diagram in the entire SRS but there's a state diagram lying within that design, and it takes a little bit of creativity to find it and say yes, we need to test it. Okay, so these are some of the things that we're gonna talk about later in the integration module. Now these were the things that are related to systems testing, and a system test is done at a black box level. I'm gonna actually define that a little bit later, but things like recovery of data after anomalous conditions. Again, lots of times people will test exceptions, but what they don't really look for is how did it recover from the exception. So they may notice okay, it recognized the exception, but did it do the right thing? Okay, so that's at the top of the list. Using testing metrics during system testing was actually right below it. Testing the user document. This one, I re-do this study every three, four, five years, and year after year this one just continues to pop to the top. That means sitting down with the user manual or any user instructions, whatever it may be, and testing with the user manual. A lot of people will write user's manuals and they'll review the user's manuals at like a static review, but they don't actually test the manual. And that's what that one means, just really testing with the manual. Using system testing tools. The tools for system testing are a little different than unit testing. I don't know if any of you have any of these tools or not, but at a system level most of the tools are what we call capture replay. They capture the screen and then they replay it later. So those are the kind of tools that are used during system testing and we'll talk about those later. Simulation of course is towards the top of the list. Requirements coverage, which I know you guys do at NASA; I know that's one of your requirements, to cover the requirements. A lot of the organizations in my database, they weren't really required to test anything, so that's why this metric is there because, believe it or not, some organizations don't actually test the SRS. Stress testing was up there. That should be no surprise. Using test beds. A test bed means an answer, okay, that's all a test bed really means. So if your software is computing some formula, the test bed would be the set of answers under different inputs. Okay, so having those in place is one. Starting the test plan during the requirements phase. One of you mentioned planning.
One of the most important characteristics that I found in this data was the organizations who planned the testing while the SRS was still under development, they did much better than everyone else. This is deceiving here, this 26%, because a lot of these other things, they kind of depend on that. This is kind of a prerequisite for the rest of them. So a really important one is to actually start the test planning while you still are in the SRS process. And then the last one was criteria for exiting the testing phase. Okay, here's the overview of the testing process that we're gonna use throughout class today. We start at the bottom. Define, let me see if my pointer works here, ha, I did the wrong thing. Okay, I'm not gonna touch that button again. All right. Ah, here we go. At the very bottom, we define a plan. We're gonna do that in class. Then we define a strategy. A strategy is sort of different than a plan. A plan says we're gonna do XYZ, a strategy says how we're gonna do X, Y and Z. We'll see later the difference between the two. Then we actually define the tests, where we write the tests. Then we test. Then we use metrics to make this whole thing more efficient, and along the way we record failures. Can anybody think of anything else we might possibly do during testing that should be on this chart? Okay, good. Okay, here's an overview of the software life cycle, and I've consolidated the software life cycle here. We could have more phases in here; particularly at NASA you guys have more than one system level test. You have an FQT, you have a variety of system tests, but I've kind of consolidated them all down into small blocks. The requirements and the system test go hand in hand. When you do a system test you're normally verifying the requirements. The top level design goes hand in hand with the integration test. It can also be useful during a system test. Sometimes the requirements in an SRS aren't specific enough to test, so sometimes we may need this top level design during system testing. Okay, but for sure we need the top level design for the integration test. We also need the detail design and the code for integration test. For unit testing we need to verify the code and the detail design. So basically these are the feedback loops on what we're trying to test here. Okay, here's an example of a test strategy. I call this test strategy a pyramid. A test strategy would be, what are we gonna test in what order? What are our high priority items? Which ones are we gonna test first? Which ones are we gonna test second, and so forth. So the very first thing we'd want to look for is the parts of the design or the code that are the biggest risk, and I've had a lot of people tell me in testing, well, we're gonna test everything so what difference does it make whether we identify the high priority stuff first? Can anybody tell me, let's say your test period's gonna last three or four months and you've got a lot of stuff to test over that three or four months, what benefit would there be in testing the riskiest stuff first? Can anybody just tell me? Even if you know you're going to test everything. First of all it's a bad assumption that you're actually going to be testing everything. Exactly. Okay, good. That's, yeah, you may run out of time. And second, your big ticket items, if something's wrong with one of those, you've got to change it to fix it and get it back into test. Exactly.
You gotta keep in mind that somebody's fixing the defects that you're finding in testing, so you don't want them to pile up at the end because then the project's gonna be late. And then second of all, in a perfect world we may be able to test everything, but sometimes things happen, schedules get truncated; it happens even in the most perfect of worlds, so you hit the nail on the head. So even though our goal is to test all of the tests, we still want to execute the riskiest ones first. We want to also look, this is risk meaning the ones we think probably have the bugs, okay? If you have any reused code at all, the software engineer knows where the risk is, and in fact a software tester will be able to say right off the bat, that code is riskier than this code. So it's only when you don't have any reused code, when everything's brand new, that you don't know that, and in that case everything's risky. Okay, what part of the code's gonna execute the most? This is different than that in that it's more from a frequency standpoint; anything that's gonna execute more than something else, you probably want to get that out of the way also. Although getting the risky stuff out of the way first is more important than that. Another thing to think about is what features are used most. So this is from a software perspective, okay? This means from a design perspective what gets executed first, and this is from an end user perspective, what do people do the most with the software, assuming that there are people. And if there aren't any people that interface with the software then this is not applicable. And finally last, what customers are using the software the most? A customer doesn't have to be an external paying customer. A customer can be a stakeholder, so these are things you'd need to look at. One of the most common things that happens with software testing is that software testers will spend months and months and months testing something and they ship it to a real live site and within a day there's a stack of earth-shattering defects. And usually when that happens it's because of these top two items right here, that they didn't really test what somebody was going to do the most with the software or what the system was gonna do the most with the software. So anyway this is a strategy. You can see the reason why I have the pyramid like this; at the top of the pyramid would be marketing or usage related. So even though you guys aren't selling software here, you can get rid of that marketing word and just say usage. And down here is software related, so [inaudible] the software, the bottom two parts of the pyramid, what could the software do wrong, this is what are people gonna do with the software. Later in the class we'll talk about this more. Okay, these are the phases of testing. We can have a unit level or a module level testing. What do you guys call this type of testing at NASA? Do they call it module level? Unit testing. Unit testing? Okay, good. All right. The focus of testing is on a part of the code and it's from the developer's point of view, okay? And you can do it as soon as the code compiles but before it's turned over to system testing; that's when you'd want to do it. Integration testing has a similar viewpoint as unit level except it focuses on a larger scale of integrating more code together. It can be executed as soon as any code's integrated. You don't have to do integration as a waterfall model.
Integration could be done iteratively as the software's being developed. And then finally is the system level testing, and these are executed without knowledge of the design or code, and they can start as soon as a particular part of the software is integrated. So you can do system testing one CSCI at a time if that whole CSCI is finished, so that was my point there. Okay. The viewpoints, these are the viewpoints, how we look at the tests. A white box test is done with full visibility of the code. That means you can see the code when you're testing, which a developer would be able to do. It's appropriate during unit and integration testing. Some integration tests are run without looking at the code, some are done while looking at the code. A gray box test is when you have full visibility of the code and the architecture and the functional requirements, and the reason why it's gray box is some of these tests can be executed without actually looking at the code, and some of them can't be. It's like one of those things where you can get by without looking at the code, but lots of times people do. Black box testing is when there's no visibility of the code or design, you have only requirements, and that's the appropriate viewpoint for the system testing phases. Okay, these are the types of unit level tests that I'm gonna cover. The first one would be path and logic. This is a white box test. It, as the name implies, verifies paths and complex logic. For some of you, are any of you working on a project, I don't really know for the specific projects that you guys listed, are any of them going to require DO-178B conformance? It's for flying over general aviation. Okay, probably not. I do know that there are some NASA projects that might require DO-178B certification. It's certification required for any aircraft to fly in commercial airspace. These requirements, the reason why I point this out is that for any project that is DO-178B compliant, you're gonna have to do some amount of that. So that's why it's first on the list. Yeah, it's called MCDC [inaudible]. Okay, what does MCDC stand for? Modified Condition/Decision Coverage. Okay, yup. Yeah, we don't want to test every possible branch and logic, we just want to test the minimum set. Is that what you're referring to? Right. Okay, good. All right, the paths are not visible to system testers. That's one reason why, yes? Oh, I'm sorry, I thought somebody said something. These paths in the logic are not visible to a system tester. Okay, so that's why they have to be tested at a unit or a white box level. You can't see them if you're testing from a black box standpoint. Now on average, if you did no unit testing at all or no path testing, the statistics show that black box testing might be able to cover 40 to 60% of all the lines of code. And so the reason why we do this is to get the remainder. So, module level exception handling, it's actually part of this as well. If you do path and logic testing accurately, you can also catch the exceptions, but I point them out separately just so you'll know that they're there. Testing the module level exceptions is also very important and that's next on the list. Domain and boundary testing is also a white box test. Domain and boundary testing, there's a little bit of overlap between this and path and logic, but not completely. You can have a domain test that wouldn't necessarily be something you'd test to test a path.
A boundary test is when you verify that the greater-than-or-equal-to signs in the code are working well. And this could mean coming up with different tests, maybe, than you did for the path and logic tests. This is often called off-by-one testing. Lots of times software engineers, when they write code, they put the wrong one of these in the code, and so whatever data end point is right around it won't work, and a path test won't necessarily pick up on all those. So we have a separate test for that (there's a short sketch of the off-by-one idea right after this paragraph). There can also be mathematical testing, and that's when you make sure that any formula you have doesn't have an underflow or an overflow. And then there's functional testing, which is the last one here; it's when we make sure that our unit actually does what we wanted it to do. It could be that the unit works perfectly well but it doesn't solve the problem at hand. So that's the very last test. Okay? These are the types of integration tests. We can test the input, output and the interfaces. This is usually the very first thing that people test when they integrate. They want to make sure everything is communicating with each other. There's also exception handling at the integration level. We need to make sure that the components, see, at the unit level test we tested one component to make sure it was trapping exceptions, now at the integration level we want to make sure that the components are talking to each other. So for example, if there's a failure of module A, does module B know about it? So that's this type of test. We can also do timing tests. These usually verify that timeouts are not too long or too short. Usually you will find timing in the architectural design. Timing is nearly impossible to test at a unit level. It just is; you gotta have something integrated to test the timing. And sometimes people will even do timing tests when they get to the system test, because they may not be able to test it at an integration level. Sequences. It verifies that the order of execution is correct. So for example, when we do the unit test, we're testing one unit. When we do an integration test and test the sequence, we want to make sure that the sequence of the units, as they're called, is correct. They could be out of order; I'm gonna show you an example of that later. And finally we can test state transitions. State transitions are normally in an architectural or detailed design. You can't always pick out states and state transitions in an SRS, so that's one reason why we have this on the list for integration testing, but I do want to point out you can test state transitions as a black box test as well. As long as there's a document that describes them, you can test them from a black box point of view. For those of you who were doing verification, is your only input the SRS, the software requirements document, or do you guys ever look at the architectural design as well? Are you allowed to look at the architectural and detailed design when you're doing the verification or not? Were those some things you guys would look . . . okay, all right. So you guys could do the state transitions either at the integration or the system level. Okay, here are the system level tests, and by the way there are more than what are shown up here. That's true for actually every module. There are more integration tests, more system tests. These are some of the more popular ones.
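To make the off-by-one idea above concrete, here is a minimal sketch of what a boundary test around a single limit might look like. It assumes JUnit 5 and uses a made-up SensorRange class with a hypothetical isWithinLimit method; it is only an illustration of the technique, not code from the course.

```java
import static org.junit.jupiter.api.Assertions.*;
import org.junit.jupiter.api.Test;

// Hypothetical unit under test: readings up to and including MAX_READING are acceptable.
class SensorRange {
    static final int MAX_READING = 100;

    static boolean isWithinLimit(int value) {
        // The classic off-by-one bug here would be writing "<" instead of "<=".
        return value <= MAX_READING;
    }
}

class SensorRangeBoundaryTest {
    @Test
    void valueJustBelowBoundaryIsAccepted() {
        assertTrue(SensorRange.isWithinLimit(99));   // MAX - 1
    }

    @Test
    void valueExactlyAtBoundaryIsAccepted() {
        assertTrue(SensorRange.isWithinLimit(100));  // MAX itself -- the case a "<" would wrongly reject
    }

    @Test
    void valueJustAboveBoundaryIsRejected() {
        assertFalse(SensorRange.isWithinLimit(101)); // MAX + 1
    }
}
```

The point is that a path test can pass through this code with any in-range value; only the three values hugging the boundary distinguish ">" from ">=".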
Requirements validation is of course the one people do; that's making sure each requirement is explicitly tested. A lot of people confuse requirements coverage with code coverage. This is covering a document, it's not covering the code, so I just want to point that out. We also have system level exceptions. This is where we verify that the system as a whole can handle exceptions. User interface is another test. Performance. This test is applicable only if there are written performance requirements. If there's not, you wouldn't do this. I would suspect for what you guys are working on there's probably written performance requirements. Stress testing. The ability for the software to run for a long time. This is a really important one that people don't do as much as they should. Okay, a stress test quite honestly, a real simple one, is make sure the software can keep running for four or five days without being rebooted. When you're testing software have you ever noticed you stop it, you start it, you stop it, you start it? Do you ever just let it go? Well, it's hard to get through your test scripts without stopping the software at some point, right? You have to reinitialize the data, so a stress test is basically testing to make sure that you can keep going. One of the more key examples of this was the London Ambulance disaster about 15 years ago, where they wrote some software for the 9-1-1 system in London. It was back before all of that was automated and they really did a poor job of testing it, period, but one of the things they did that was really bad is they never made sure that the software could run for a long time without rebooting. And it went, I believe, 36 hours before it just crashed, and you know, if you think about how long it takes your computer to crash and reboot, the recovery time, it's going down, it's coming back up, and then you're exactly where you were before it went down, that may seem like a short period of time, but when somebody's calling 9-1-1 it was an absolute eternity. So they found out the hard way that they had forgotten to do a stress test, and it was a major disaster. I mean it was something like seven or eight minutes the 9-1-1 system was down, so that was a big deal. That's when everybody realized, well, I guess we need to start doing stress testing. Configuration testing, this is when you might have multiple platforms. I don't know exactly how this might apply here, but let's say for example the software needs to run on multiple versions of an operating system or multiple PCs, then you would test that. I'm not sure that this is gonna be terribly applicable for you guys, but we'll go over it anyway. Compatibility, it's the ability for the software to work with other software. I think this probably is applicable for what you all are working on. Let's say your software has to work with, well, an operating system would be one. How about e-mail? Do you have any software that has to be integrated with, like, COTS software or anything else? If you have any COTS software in your system this is probably gonna be a test that'll be at least applicable. May not be the most important test, but it'll be applicable. Security. Security is something we just touch on in this class, because security's something that actually could be another five day class and most of the time people get professionals to test the security, so we're just gonna touch on that very briefly. Regression testing. This actually is in somewhat of an order here.
Regression testing we're gonna do second to last, to verify any changes we made during testing haven't broken the software. There's an art to doing that. And then finally is the acceptance test, which you guys, I assume, call FQT, formal qualification test? We also call it acceptance test. Okay. All right, good. All right, so that's what we're gonna cover today. So far you've learned that there's different types of tests, the viewpoints and the overview; now we're gonna go on to the unit testing module. Whoop. [ Pause ] Okay, in this module I'm gonna present a few little facts, I'm actually gonna reiterate some of the facts we talked about earlier. We're gonna talk about the focus of this unit testing and the types of tests you can do, and then I'm gonna show you how to define a plan, a strategy, the tests themselves, executing them and then finally recording the failures that you find during testing. Okay, these were the facts we talked about earlier in class. I'm not gonna go over each one of these again, but I am gonna show you that we are actually gonna go over every one of these in this module to some extent. Okay, here's the summary; you guys saw this earlier in the introduction on the white box testing. That's what we're gonna do here, and a little bit of gray box testing. Okay, white box and gray box, I want to go over that a little bit more here. This is a Venn diagram, so you can think of that area in there as all the stuff that could be tested, okay? Now, the white box testing tends to find, somewhere about not quite half of all the things that we could test could be executed from a white box perspective, and about half of the coverage could come from a black box test. The gray box testing actually does tend to fall in the middle. A lot of gray box tests do tend to find some of the same bugs that could be found in either unit level testing or black box testing; there is some overlap here. Then the area on the outside here, that's what could be tested but you don't know about. So ideally what we'd like is for our tests to take up the most amount of that area, and so that's one reason why the white box testing is so important. If you didn't have it, there's not much of a chance that you could actually get all of that area. So our goal for this unit is we're gonna focus on this white box area. Okay, and to recap, the unit tests, we're gonna need the code or the detail design or both. At NASA are you required to have PDL or [inaudible] code as detail design, or do they go straight from architectural design to coding? Depends on the project. Depends on the project? Okay, in some cases then you may not even have a detail design, you may have the architectural design and the code. But in that case the unit test would still be based on the code. Okay, here are the tests we're gonna go over. I summarized these earlier. Here's a revisit of the testing steps. What I'm gonna go over now is the planning of unit testing. Unit testing needs to be planned just like any other test. You can't just go at it ad hoc, and that's what we're gonna go over here. Okay, when we plan the unit tests, we want to decide the scope of the unit tests. What will and will not be tested? For some projects, we may have to decide what we're not gonna unit test, and I'm gonna talk about that shortly.
We need to decide who's gonna execute the tests, there's been a lot of debate over this so that's why we wanna go over it, when we're gonna execute them, choosing the right tools, and finally, setting up the documentation. This is all part of the planning phase. Okay, the scope of the unit tests. These are what different organizations do; even different organizations that are conforming to DO-178B, for example, might have different approaches. Okay, some organizations unit test only the most critical part of the code. I can tell you in my database, organizations that picked, like for example, we're gonna unit test 10% of the code but not the other 90%, I found absolutely no correlation with fewer defects. So in my database of 115 projects, those companies didn't do better than everyone else. So that's just some food for thought. However, what did actually surface in my database of projects is applying it to only new or modified code. Okay, I will tell you that unit testing code that's being reused but not modified, in my database it didn't surface to the top as being super critical. Okay, so what I'm saying here is, based on the facts that I have, the organizations that tested the new or the modified code, those are the ones that had the reduction in defects. They didn't necessarily test the reused code because, ideally, that's already been tested. Okay, so what are your processes at NASA? Are there any guidelines for this? I've read the software assurance guidelines and I don't think they actually talked about this. So, and maybe this is a good point not to bring up in there, to have everyone assume it's everything, but I have seen this work; you would be shocked at the organizations where I've seen this practice in place, like medical devices. So I don't really suggest that one. I don't think that it works. I don't have any facts that it works, but this one I have facts that this part works. Okay? So that's part of the scope. Who will execute the tests? Unit tests are always executed by some software engineer using a development environment. In order to be able to test the code you have to be able to see the code, and to see the code you have to have a development environment. That doesn't mean they can't get reviewed by someone who's not a software engineer. That can happen. Unit tests can also be executed by buddies. This has been a really popular thing lately. Have you heard. . . Pair programming? Yes. Have any of you used it? [Inaudible] How did it work out? I think I've seen it work in small companies. . . Um hum. With very specific short life projects. That's a good point. Short life, um hum. It's good for companies where you have senior people who [inaudible] know systems or [inaudible] of corporations and you have a new person work with them. Okay. Otherwise, people go nuts to just sit and watch somebody else program [inaudible]. Okay. [Inaudible]. All right. [Inaudible] tank and flight simulators and we had actually our quality assurance [inaudible] software engineers that were working with the developers writing unit tests, so they were writing the tests [inaudible] along with the software engineers. Oh, well that's interesting. Yeah. So that. . . That might be quite effective. That was used later for the regression testing, automated testing and what have you. That's something I haven't captured on my slide. I think that would be incredibly, did it work? It worked out very well. Yeah, I would think it would work out real well.
Oh, that's really interesting. That needs to get up here; that's a good, that's a third alternative. Okay, I think the experi- what was your name? I didn't get. . . Oh I'm sorry, I'm Bob. Bob, I don't think I actually had you introduce yourself earlier, but Bob's point, he hit the nail on the head. The extreme programming thing, it was actually invented for the smaller projects, not necessarily small companies but smaller projects. One of the problems is with the, well, let me talk about the benefits first of the buddy system. Theoretically the software engineer might write better code if they know someone else is looking at it. That was the whole idea behind it, that it's the pride thing. You want your code to look good because somebody else besides you is looking at it, and theoretically the buddy might find more bugs. The disadvantages, though, I think you named several of which I don't have up here. The time required for the exchange, and the buddy may not be able to test the code as well as the person who wrote it. It could be the buddy is a really good software engineer but they're working on something else. So there's some benefits and some disadvantages. Normally, the default scenario is whoever wrote the code does the unit test, and I have no data to show that that doesn't work. So that's the simplest approach to go with. And I think your approach is a really good one where. . . [Inaudible] more dedicated resources, the less risk on a project that way. Okay. I'm gonna make a note of that to put that in my slides. That's a good, good approach. Can anybody think of any other approach? We have three approaches. Okay. Now, the one approach you don't see up here is to have someone who's not a software engineer write all the unit tests and run them. The reason why you can't have that is they need to have the development environment. I have seen some companies try to do that and it doesn't really work that well. Okay, when will they be executed? There are a lot of myths on running unit tests. Contrary to popular belief and practice, unit tests can be executed as soon as the code can compile. There's no reason why they can't start their testing as soon as the code compiles. That's the earliest it can be tested. Debuggers can be used to test the code even if there aren't test stubs. The biggest complaint I get from software engineers about unit testing: I don't have the time for test harnesses, I don't have the time to write the stubs, blah blah blah. I've yet to see a module that couldn't somehow be tested using a debugger. If you step through you can do all kinds of things. Now, there are certain types of software for which there aren't good debuggers. Like when you're writing firmware, debuggers are not very good for firmware. They're getting better and better. The only possible reason why you might want to wait a little bit until you have some test harnesses is if you don't have a good debugger available for this code. But in this day and age, even the firmware code I've seen is normally written in C. Are any of you working on systems with a lot of firmware? Okay. So I don't think that should be a problem then. Unit tests, I would highly suggest you not run them in a big blob. That means you write a whole bunch of code and you unit test all of it at once. This, you know, based on my experience as a software manager and a test manager, this just doesn't work, and based on the facts in my database it doesn't work.
It actually can take longer. The reason why it can take longer is the developer is most familiar with the code immediately after he or she writes it. That's when he or she is most able to get the bugs out. If you wait until you've written, let's say, 20 modules to do the unit testing, you don't have the advantage of familiarity. Blocking bugs in some code can stall the testing progress as well. I've been a software engineer. I've spent many, many years trying not to unit test, and I finally realized after all that time that it was easier and better just to unit test as soon as I wrote the code. Now, do I do it one at a time? Not all the time. Sometimes I write three or four modules and then test them. Some simple rules of thumb. Set aside the end of the day or the week. The software engineers in my database who did the best job of unit testing, I spent a lot of time interviewing them because, they didn't know that I knew this, but I already knew that they had fewer bugs in their code than other people who didn't, so I interviewed the people in my database who'd done a really good job unit testing and these are the things they told me they did. They just set aside a day of the week. They decided every Friday I'm gonna unit test my code, and it worked out for them because they were in a regular pattern, it was predictable. Some of the people in my database who were good at unit testing said they set aside the last couple of hours of the day. Either one; these are just some general rules of thumb. Okay, this is what it looks like from a schedule standpoint. The incremental approach: code, unit test, code, unit test, code, unit test. Usually this doesn't result in any last minute emergencies. On the other hand, write a bunch of code, then unit test; I intentionally made this longer because, based on the facts that I have, when people did this it did take them longer, because we have to go back and rework the code. Have any of you ever done unit testing? Okay, what are your thoughts? What can you share? My thoughts are you gotta define the interfaces between the software components, have a good definition of that. Okay. And if that's defined well, then the person writing the unit test can pretty much sync up with the person writing the code. Oh. Yeah. Okay. That's a good thought for back here. Very good thought, make sure the interfaces are defined. [Inaudible] defining the performance at the interface [inaudible] what kind of exception handling [inaudible]. I'm gonna write that down, make sure interfaces are defined. You know what, another good idea too was to buddy with systems testers, or systems people, I think. Okay, good. This is our list of good ideas that you guys have that I'm gonna put in here later. Okay, any other alternatives you all can think of? Okay. When the tests will be executed: it's a common practice for software engineers, when they're coming up with a schedule, I've been a software manager for a lot of years and I see this time after time after time, I ask a software engineer for a schedule for a particular part of the code and they almost always forget to include the unit testing aspect. That's just how software engineers are built. And how many of you are software engineers? Well, I know you are. Okay. Am I right? Is that kind of how they're built? Do they always think about the testing? Normally when you schedule something, your thought process, and even I do this, you're thinking about the development part.
So it's pretty normal for even the best software engineers to just totally forget about the effort that they need to unit test. And so one of the reasons why people do this, there's a, actually there's a couple of reasons. Sometimes it's because people do actually think there's not gonna be any bugs in it. [laughter] They do think that. I, I've been to 115 projects and on about half of them they told me, I don't unit test my code because there is gonna be nothin' wrong with it. I've heard it from the horse's mouth, people do think that. They assume the unit testing won't find any bugs so this is actually different than that, okay? This is not as bad as that. These people, they think well there's bugs in my code but unit testing isn't gonna find them. So that's a different thought process. As we saw earlier, my comeback to that is black box testing typically only covers 40% of the code. The rest of it has got to come from unit and integration testing. Doesn't come from testing forever. If the viewpoint, the one thing I will point out, the one reason why people believe this and they do have a point, if your viewpoint is to verify what the code does, then this is probably true. You probably won't find a lot of bugs, however, if your viewpoint is to verify what it should do, that's when you find the bugs. One of the things that I've seen, in my database I had a few organizations who did unit testing and had not very good results. [Inaudible] for the unit testing along with the regression testing let's say you made a software change, from the CM you could map what unit test applied to that software change [inaudible] testing out of that. Okay. [Inaudible] you know if there's any differences on the outcome. Can map to regression testing. So if you didn't do this you wouldn't be able to do that. Right but. . . That's a good point. Automated testing of the unit test [inaudible] whatever software was changed. Yeah that's a really good point. Yeah. If you have the unit test defined, then when you go to fix bugs you know what to test. Compare the outputs against benchmark [inaudible]. Okay. Okay. I think getting back to the assumption that unit testing won't find some bugs. There were some projects in my database where they did unit testing and they didn't find a lot of bugs, woops, but when I dug into it to find out what I found is that they were executing the lines of code. So they were, they had projects that were required to have 100% line coverage. They were executing every line of code but they didn't define ahead of time what that code should do. They didn't have any expected results so what they were doing was, just imagine they're executing 100% of the lines of code but all they're really finding is crashes, hangs, any kind of obvious thing that would jump out and say I'm a bug. That was the only thing they found. Well obviously if you're gonna do that, you're not gonna find much. All you're gonna find is a crash or a hang or a memory leak or something like that. If you don't look at the expected results the kinds of bugs you won't find are, the algorithm was wrong, the wrong thing happened, you know it executed a recovery but it was the wrong recovery. So basically, one reason why some organizations haven't had good luck with unit testing is that they didn't verify what the code should do. And if you get overly automated and if you have too much automation, this can happen. So they buy these tools that'll execute every line of code but they don't know if what it did was right. 
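To make that last point concrete, here is a minimal sketch, using JUnit 5 and a made-up AverageCalc class, of the difference between merely executing a line of code and checking it against an expected result. It is only an illustration of the idea above, not code from the class.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import org.junit.jupiter.api.Test;

// Hypothetical unit under test; the algorithm is just a placeholder for illustration.
class AverageCalc {
    static double average(double a, double b) {
        return (a + b) / 2.0;
    }
}

class AverageCalcTest {
    // Weak test: it executes the code, so it would catch a crash or an exception, but nothing else.
    @Test
    void runsWithoutAnExpectedResult() {
        AverageCalc.average(4.0, 8.0); // no assertion -- a wrong formula still "passes"
    }

    // Stronger test: the expected result was worked out ahead of time,
    // so a wrong algorithm (say, a + b / 2) is actually caught, not just a crash.
    @Test
    void matchesTheExpectedResult() {
        assertEquals(6.0, AverageCalc.average(4.0, 8.0), 1e-9);
    }
}
```

Both tests give 100% line coverage of the unit; only the second one says whether what the code did was right.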
The third reason why it's a common practice for software engineers to not do unit testing is they assume it'll take too long to test. The statistics in my database were very, very clear. The organizations that unit tested actually didn't take longer to develop the software. Now, did it take longer for them to unit test? Of course. If one group's unit testing and another group's not, sure, this group's gonna take longer to unit test. But why would the schedule not be delayed? I know that you know. I know some of the software people know. Basically, yes, it took longer for them, the software engineers, to unit test, but what took less time? Any of you who are systems testers, have you ever worked on a project where you could swear the code hadn't been tested at all before it was given to you? What happened to your schedule? That's where using [inaudible] integration testing. Yeah, it falls apart at integration testing, and then the whole schedule just ripples. So you can't integrate the code if the functions aren't working at all, and this isn't just my opinion. I mean I have a whole database of projects where, whenever they didn't unit test, things fell apart later. So the schedule may be longer for them but it's shorter for somebody else. Later in the class I'm gonna show you examples, I'm gonna show you one real example from a project that I had where you're gonna see that the bugs that you find in unit testing are often different than the bugs you find in system tests.

Module 2

Okay. We'll pick up where we left off here. Choosing the right tools. This is with regard to unit testing. I have a link here. I know there's kind of a rule of thumb not to point to Wikipedia, but to be perfectly honest with you, the tools, they're constantly changing, even the names of the vendors; they buy each other out. So I would suggest you go out to this link because it keeps everything up to date, and you don't have 50 million links to look at. It has a very good summary of the tools. One of the things I will tell you, this unit testing can be less tedious with the right tool, and I think the most important tool you can get is a good debugger. For most of the tests I'm going to show you in this section, there's at least one, probably several, tools available to automate it. And, in fact, just about everything I have in this module, there's a tool. And the tools for unit testing, they're not necessarily super cheap, but they're not super expensive, either. You know, something in the four-digit range. Have any of you used any unit testing tools?

I think mostly capture/replay.

Those I have under system testing. The tools I'm referring to here, they would actually be part of your development environment where -- like some of them, for example, keep track of your line coverage while you're testing, so you'll know at the end that you've covered all your lines. Some of them go ahead and they intelligently find your paths in your code and they tell you where they are, and they tell you the inputs to test to execute the paths. So they actually, they work off the source code, and so I would suggest, there's a ton of them out there.

[Inaudible] what you consider the top two or three?

I have a file on your CD that has the most popular tools, but one of them is called -- I'm trying to get the acronym right. I think it's caps, C-A-P-S, but I could be wrong about that. There's a company out in California that makes most of the popular tools, and off the top of my head, I can't actually remember the name of the tool, but I can tell you that during the break. And I also have those tools on your CD. So as soon as we're done with this unit, I don't want to actually get out of this unit here, I'm going to bring those tools up and I'll show them to you.

[ Inaudible audience question ] No, the tools that I'm -- well, that is, that's a nice coding tool, but the tools I'm referring to, they don't try to find bugs, they try to find test paths. So they will zip through your code and they'll find where your paths are, and they'll tell you the input you need to execute them. And some of them, some of the more expensive ones, for example, Dr. McCabe [phonetic] has one of these tools. Yes. And that one would be, that's at the high end. That's an expensive one. It'll actually go through and execute -

JUnit for Java, [inaudible] for C Sharp, Windows 7.

Oh, sure, JUnit, I totally forgot about JUnit for Java. Yeah, there's -- these tools will actually help you test, they don't try to review the code against coding standards. So that's a good question. That's what they do. I'm going to bring up a few of these later. I'll show you some of the links, and I'll show you the price ranges, as well. Just about everything you're going to see in this module can be automated with these tools, so. Okay.

These are kind of a summary of the tools here. The path and logic, and the exception handling, and the domain testing. These three things, the tools that I talked about in this link that I'm going to show you when we get to the end, they are tools that are available that will test those things. Some of the cheap ones, they'll simply tell you what the paths are, and then you have to go and execute them. And the expensive ones will actually execute the tests for you. So that's what the difference is. One thing I strongly recommend is that, and since you're in this class, you're fulfilling this recommendation -- you should always know what you're supposed to do before you go and run the tool. So by the time you get done with this class, you're going to know what these tests are, so you'll be in a position to go run the tool. A lot of people will buy these tools without actually knowing what they're doing, like what the tool is doing. So that I don't recommend; I recommend against that. Math testing -- normally people, when they do math testing, they use something like a spreadsheet or Mathcad, or one of those tools, as a test harness. In fact, more often than not -- I don't like to use vendor names here -- but more often than not, people use Mathcad. They'll write the code in Mathcad, they get the answer, and then they compare it against what the code does (there's a short sketch of that pattern right after this paragraph).
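Here is a minimal sketch of that math-testing pattern, again assuming JUnit 5 and a made-up Thermo class; the hard-coded table of answers stands in for whatever the real oracle would be, such as values worked out independently in a spreadsheet or Mathcad.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import java.util.Map;
import org.junit.jupiter.api.Test;

// Hypothetical formula under test.
class Thermo {
    static double celsiusToFahrenheit(double c) {
        return c * 9.0 / 5.0 + 32.0;
    }
}

class ThermoMathTest {
    // Answers computed independently of the code (the "test bed" of expected results).
    private static final Map<Double, Double> EXPECTED = Map.of(
            -40.0, -40.0,
              0.0,  32.0,
             37.0,  98.6,
            100.0, 212.0);

    @Test
    void matchesIndependentlyComputedAnswers() {
        EXPECTED.forEach((input, expected) ->
                assertEquals(expected, Thermo.celsiusToFahrenheit(input), 1e-9));
    }
}
```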

Functional testing. You will need to have an SRS available to be able to do some functional testing, to be able to make sure that the module meets the requirements. This can be done -- lots of times you can do the functional testing with any of these tools. You can just merge the functional tests in with these and use the same tool. So, pretty much everything we're going to look at, you can automate one way or another. Okay, setting up the documentation. I would suggest for unit testing that you create a template for these artifacts before you start unit testing. And I'm going to show you one of these templates shortly. You have one on your CD. I have found that for unit testing, a spreadsheet is really simple and really convenient, unless your automated tool also does the documentation, and then you could use that. Okay, if you don't have anything else, a good spreadsheet will work fine. When I interviewed the software engineers in my database who actually had the fewest bugs in their code, they were all using some kind of spreadsheet. They didn't have anything terribly expensive. They wrote what they wanted down in the spreadsheet, they ran it, they checked it off. That was the end of it. It could be very simple. You can create a sheet for each unit. You should be able to fit all the criteria on one page, and the criteria will list the items shown in this presentation. So basically, everything I'm going to show you in this module would be in the checklist, okay? You can use the checklist as a memory jogger, all right. I don't know if I want to show you the checklist now, because I want to show you the material first and then show you the checklist. But I'm going to show you the checklist. As we get through a little bit of material, I'm going to show you the checklist that I have.

Okay, so we've talked about the planning for unit testing. Let's talk about the strategy. Okay. Do you remember from the first part of the class we talked about strategizing? The two blocks at the bottom are related to the software itself, and the two blocks at the top are related to marketing or systems. So let's look at defining a strategy for the unit testing. What parts of the design or code are the biggest risk? Now, earlier I told you that I didn't think it was a good idea to unit test only the critical code. I just don't have any data at all that shows that that works out well. So at this point, we're assuming that you're unit testing anything new or modified. So then the idea is, well, during the strategy, what level of unit testing do we want to apply to different units? So we're assuming we're testing all the new ones. But maybe some of them might need more formal unit testing. Like, for example, formal review from a non-peer subject matter expert. Usually what I find at a lot of companies who do this is either they formally review everything or they formally review nothing. But I just want to point out, there is an option here to select what you want to do your formal reviews on. So, for example, if somebody's working on some code that is just astronomically critical, maybe their code needs to be formally reviewed by someone else. Okay. When you were doing the unit testing in conjunction with the systems engineers, did you guys apply the 100 percent thing, or did you do it for certain parts?

It was 100 percent.

Okay. One of the things I have found with human nature -- I've been in software engineering for this long -- I will tell you, doing something 100 percent from an organization standpoint is easier than picking and choosing. Okay, I'm presenting an option here. If you have some particular parts of the code that are more risky than others, you may make those parts more formalized. You may require more reviews of them. Okay. It shouldn't be construed as testing one thing or another. We're assuming we're testing everything. Okay, another thing I want to point out is that if you don't know what parts of code are the biggest risk, sometimes FMEA can help point that out. Okay. Just a thought. Okay, what parts of the code are executed the most? Well, again we're getting back to being more formalized. We're assuming we're testing everything. But if you have some parts of the code that you know are executing 24/7, you may decide to make those more formalized. Okay. Which customers or end users are using it the most? If your software application has more than one customer type, you may decide there are some features that are super important; we want to formalize those. So have you guys ever done this kind of analysis on your software, where you looked at it and kind of tried to dissect the priority and say, well, this stuff needs more formalization than this stuff over here? So this is just another way to look at it. And then finally, if you have multiple customers, and I think for most of the NASA projects, I don't know that this would apply. Where this kind of would apply is if your software is going to be installed at multiple sites, okay, which I think for most of you is not the case. I have companies that I work with where they're building software that might be installed at 100 different customers, so for them, they'd actually have to keep track of which customers are going to be using it, and what are they going to do. I don't think you guys have to worry about that here. So, I suspect this one is not applicable.

Okay, defining the tests is next. All right. I'm going to go through each one of these, but right off the bat I want to tell you what not to do at the beginning, and then I'm going to tell you how to do the tests. For each of these tests, don't test what the code does, test what it should do. This is the biggest money waster of all time, is to just execute the code and not have an expected result. Don't make guesses or assumptions about what the code should do; ask an expert. I've got a bunch of examples on my CD that I'm going to show you where we get into unit testing, we find that the SRS just wasn't quite clear enough. Can anybody, any of you who have done development -- how many times do you run into that, when you go to unit test, you find out something is missing? Have you ever found like a perfect spec where you could code it and not have any questions whatsoever? Like it never happens. Even when I write my own spec, I go to unit test and I find there's something wrong with the spec. So it's guaranteed that you're going to find this, and you need to set up your documentation and make sure that you have a means to report these things. We're going to see in the checklist, the checklist I give you actually asks you at the end, did you find anything that needs to be changed in the spec, and did you tell somebody about it? Okay. All right. Here is an example of a unit test checklist, and again, this could be in an Excel spreadsheet. For unit testing, this is just my suggestion -- you guys are free to change this if you don't like my suggestion, but I would suggest you go for the minimum documentation possible. The idea is you want people to think of the tests, execute them and write the results. If you put in too much stuff, what's going to happen is unit testing just doesn't get done. So we have the minimum here. The module you're testing, some ID, which would be one, two, three, four, five, whatever; a description of what they're testing, the inputs, the expected outputs, and whether or not it's automated. Can anybody think of anything else that you'd want to put up here? Since we're module testing, there's probably not any setup or configuration, so there's probably not any prerequisite test, because they're testing one module at a time. So this is basically the basic stuff. You know, you may have a few things you want to add to it, but I would be very careful not to add too much. Okay. When I was interviewing all the software engineers in my database who had had the lowest defect density, this is pretty much what they were recording. Really simple, just fits on one piece of paper. Okay.
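As a rough sketch of what that minimal checklist might look like if you kept it in a simple script instead of a spreadsheet -- the field names and the sample row here are illustrative assumptions, not something from the class materials:

```python
from dataclasses import dataclass

@dataclass
class UnitTestRecord:
    module: str            # module under test
    test_id: int           # simple sequence number: 1, 2, 3, ...
    description: str       # what this test is checking
    inputs: str            # inputs supplied to the module
    expected_output: str   # decided up front, before running the test
    automated: bool        # scripted or run by hand

# One hypothetical row for the date example used later in the class.
row = UnitTestRecord(
    module="validate_date",
    test_id=1,
    description="Month below the valid range",
    inputs="month=0, day=15, year=2024",
    expected_output="invalid",
    automated=True,
)
```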

For path testing, this is where we're going to determine where the branches are, how many of them there are, and the minimum number of path tests would be equal to the number of branches plus one. Okay. So here's an example of path testing. These are logic diagrams. Okay. These little blobs here are all logic, so this right here is a branch in logic. Each one of these nodes, it could be one line of code, it could be a hundred lines of code, it could be a thousand lines of code, but here's where the branch in logic is. Okay. This is an example of an if/then/else -- or, I'm sorry, an if/else, and it comes back in the middle. And a plain if would be just a straight line down. If we had just a simple if statement, it would be a straight line down. Okay, here's a case statement, and there could be many of these. There could be four, five, six, seven, so forth. Here's a repeat until loop, here's a while loop. There's more constructs than these. Can anybody think of any others? We could have a Go To statement, which I'm actually going to show you shortly. This kind of captures most of the branches in logic. So when we're talking about path testing, what we want to do is cover these -- these are called edges here. We want to cover as many of the edges as we can. And if we use the algorithm right, we can actually cover all the edges. And theoretically, if we cover all the edges, you would cover all the lines of code, you just might not cover all the conditions. Okay, so these are the branches in logic. So the steps are, select paths to cover each branch and the most logic as follows. The first step I always tell people the first time you do this is generate a flow diagram. You guys are asking me what the tools do; a lot of them will do this for you. They'll actually create the little diagrams and tell you what to test. You could also do it by hand. Make nodes, or establish nodes, to indicate branches in logic and edges to indicate sequential statements, so the edges are what connect the nodes. Find the longest path. If there's a tie, just pick one. Make this test path number one. Starting at the top, flip the nodes one at a time, but keep to the first path. Continue until all nodes are flipped. Once you're familiar with this, you won't need to refer back to this material. I've been doing this kind of testing for about 25 years; I just have the algorithm memorized. I can see the paths as soon as I look at the code. Until you get to where that's familiar to you, you should probably keep this algorithm handy or use a tool. Okay, here's an example, and I'm going to let you guys -- actually, I know the answers are in your handbook, but we're going to -- well, you know what, we'll do this together as a class example. Here is the most common algorithm of all time. It's a formula -- well, it's a function. The purpose of the function is, the input is a date. The output is, is the date valid or invalid. So those are the two outputs; it could be valid, it could be invalid, that's it. So the reason why I use the date function is this function has had more bugs in it over the last 30 years than probably any other function, and it still continues to have bugs. It's amazing. And even though I've run this through the wringer, I still, in the back of my mind, am nervous that I didn't cover everything. But anyway, here's the algorithm. And you could have developed this a different way, but this is how I developed it. The very first thing, it checks for the month.
Is it less than 1 or greater than 12? If it is, the result is an error. Okay. If it's between 1 and 12 it continues on down. It gets here. Is the day less than 1 or greater than 31? Well, we know that would be an error; it comes in here. So if that's okay, it continues on. Is the year greater than zero -- okay, sorry, less than zero. I was going to say, that would be wrong. Okay, if it's less than zero, we go down. So finally, we get here. Now we at least have some data that's reasonable, it just may not be valid together. We know that each of the three parts is valid, but we don't know that the combination of the parts is valid. We get down here. Is the month equal to 4, 6, 9, or 11, which would be April, June, September, or November? Okay, we get here. Is the day greater than 30? We've got an error. Okay. So otherwise, we go down this path. If the month is February -- now you notice what's not here. If the month is anything else, it's valid. It'll just fall through to the bottom. And actually, did I even show that? There is -- I just realized this. I think it'll work anyway. I think it'll work anyway, because it's initializing it to valid. But what I should have done here, I knew that there was something missing from my graph. There should be a line straight down. For everybody else we know it's valid. So it'll still work, because we initialized it, but it's sloppy coding. I should have a line down. If any other month, it's valid. Okay. I knew there was something on here, now it's eating at me. I got it. Okay, if it's February, we go down this path. By the way, I have the formulas for all this in the bottom part of your foil, so I'm just going from memory here. If it's not evenly divisible by 4 and the day is greater than 28, we've got an error; otherwise, we don't. Okay. If it is evenly divisible by 4, then we have to check if it's evenly divisible by 100. If it's not evenly divisible by 100 and the day is greater than 29, we've got an error; otherwise, we don't. If it is evenly divisible by 100 but not by 400, and the day is greater than 28, we've got an error. If it is evenly divisible by 400 and the day is greater than 29, we have an error. So, this is the logic, this is the logic diagram. We've created it.
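As a hedged sketch, here is the date-check logic just described written out in Python; the function name, signature, and return values are my own choices for illustration, and the leap-year branches follow the divisible-by-4/100/400 rules from the walkthrough:

```python
def is_valid_date(month: int, day: int, year: int) -> str:
    """Return 'valid' or 'invalid' -- a sketch of the class example."""
    if month < 1 or month > 12:
        return "invalid"
    if day < 1 or day > 31:
        return "invalid"
    if year < 0:
        return "invalid"
    # 30-day months: April, June, September, November
    if month in (4, 6, 9, 11) and day > 30:
        return "invalid"
    if month == 2:
        if year % 4 != 0:                                  # not a leap year
            return "invalid" if day > 28 else "valid"
        if year % 100 != 0:                                # divisible by 4, not by 100: leap year
            return "invalid" if day > 29 else "valid"
        if year % 400 != 0:                                # divisible by 100 but not 400: not a leap year
            return "invalid" if day > 28 else "valid"
        return "invalid" if day > 29 else "valid"          # divisible by 400: leap year
    return "valid"                                         # any other month falls through as valid
```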

Now how would we go about testing this? Well, the longest path is probably one of these over here. We could pick one of those and that would be the longest path. I don't think it actually really matters. We could just pick this one, for example. I don't know which one I picked in my example. But anyway, you actually go through and pick one of these paths. Let me actually show you which one I picked, this way I can stay consistent. Ah, here we go. The paths are ARU -- let me see. Oh, you know what I did? I took them in order from top to bottom. So, let me just explain them from top to bottom. Okay, obviously this is one path, ARU, right. So what I do when I'm unit testing and I'm not using a tool, I get a magic marker and I color that one out to show that I covered it. Okay. If you have a tool, the tool will do it for you. The tool will actually tell you which ones you tested. So basically what we're going to test is a month that's less than 1 or greater than 12. Now to get path coverage, we only need to test one of those. If all you want to do is execute that line of code, you can test any number less than 1 or greater than 12. You don't have to test all of them. Later when we get to boundary testing, I'm going to tell you how to pick a good number there. But in this case, it could be anything out of that range, it doesn't really matter. Okay. Now the next path is clearly ABSU, so all we need is a month that's in range, but a day that's not. Then we have ABCDU. Then ABCDEU, ABCDGU, and so forth. We go down here. Can everybody see where the paths are? Okay. So basically, you identify the tests until you've got everything shaded. There's no reason to test something twice unless you have to, to test another path. Okay. So that's what path testing is. I keep forgetting I have this thing. Okay, so you would determine the inputs to make each path execute, and execute them with the most appropriate tools.
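To make that concrete, the path tests could be captured as something like the sketch below, which reuses the is_valid_date sketch from above; the framework (pytest) and the specific dates are my assumptions, one representative pick per path rather than the instructor's exact table:

```python
import pytest  # any test runner would do; pytest is just an assumption here

PATH_TESTS = [
    ((13, 10, 2024), "invalid"),  # month out of range
    ((6, 40, 2024),  "invalid"),  # month ok, day out of range
    ((6, 10, -1),    "invalid"),  # year negative
    ((4, 31, 2024),  "invalid"),  # 30-day month with day 31
    ((2, 29, 2023),  "invalid"),  # February, not a leap year
    ((2, 29, 2024),  "valid"),    # February, leap year
    ((1, 31, 2024),  "valid"),    # any other month falls through as valid
]

@pytest.mark.parametrize("date, expected", PATH_TESTS)
def test_paths(date, expected):
    month, day, year = date
    assert is_valid_date(month, day, year) == expected
```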

If a unit of code has more than one entrance or exit, it should be redesigned prior to unit testing. If a unit of code has a Go To, it should also be redesigned. This path testing is extremely difficult to do when you have a Go To statement. Are Go To statements something that you guys run into commonly? I mean, I would suspect on old code there's probably tons of them.

Old code, yeah.

Okay, but on new code?

Thirty years is good.

Well, hopefully, I mean my guidelines for unit testing were to test new and modified code. Do you guys ever have to modify the old code much?

Oh, yes. Not much, but --

Okay. So you run into this, then. Okay, testing this with a Go To statement -- I'm going to show you how to do that shortly. It makes things more complicated. Here were the paths for our example. Okay. Now, here are the inputs. So you can see for the first test, here's my inputs, there's my expected result. For each test, you see how simple this is? So the tools, a lot of them will generate something like this. It probably won't look exactly like this, but that's what the tools are used for. But you can see we could do this also manually. We don't have to have a tool to do this. So here's my description of the test. Here's the inputs. EDB is evenly divisible by, and not evenly divisible by, and then there's our expected results. Okay. You could see if we didn't have the expected results, what would happen is, this software would probably never crash, and so we'd [inaudible] paths. So you see why now you've got to write the expected results down. Okay. Here's an example of what happens when you have multiple entrances and exits. This would be like, for example, a Go To statement into a case statement. I don't know why somebody would do that, but it makes for a good diagram. Whenever you have Go To statements or multiple return statements, or things like that, it makes the branch testing more difficult. Now you have to go through your normal paths, but you've also got to test that one, too. So it just makes it harder. Okay, any questions on this? Was my example fairly clear? Okay, good. So for the purpose of path testing for right now, we just want one path that covers the code. We don't really care what that one path is. Later on I'm going to tell you how to be more picky about the paths that you pick.

Okay, logic testing. There's always the possibility that you could have complex logic. Now the previous type of test that I showed you, it helps you find each branch of logic so that you can test every line of code. But the one thing that it doesn't do is test this. Okay. Let's say you have some complex logic, like if something -- this is an or statement, by the way. If A or B, then X happens. X is some blob of code. Okay. Or we could have if A and B, Y happens, and this is another example: if A or B and C, then Z happens. This is complex logic. We could have any number of things and'd and or'd together. Well, the path testing that I just showed you, it would only execute one test for this. But really, there are more tests to execute for those things, aren't there? Let's just take this one. What are the possible ways we could get to X? If A's true, it'll go to X. If B's true, it'll go to X. If A and B are true, it'll go to X. The only time it doesn't go to X is when? When we have A and B both false. So we actually have four different ways that X could get executed. If we're doing just a path test, our minimum requirements are, we only need to test one of those. Now the problem is that a lot of bugs can take place in these and and or statements. This is where we need the software engineers, and I'm sure you understand what I mean here. Sometimes these things can introduce bugs just in that one line, and until we actually expand them, a path test is actually not going to help us find this at all. Okay. So let me show you how to test different complex logic. Okay. Basically, to test the logic, determine the total number of logical conditions for each case; it would be 2 to the Nth power, where N is the number of conditions we're combining in there. So going back here, 2 to the N is 2 squared. We have four possible conditions: only A true, only B true, A and B both true, A and B both false. Here we have four. How about here? We have three conditions, so 2 to the third power. Eight. Okay, so we have four possible combinations here, and eight there. Okay, so that's the first step. The second step is to create a truth table that has each of the possible conditions in it. How many of you are electrical engineers? Okay, a truth table should come easy for electrical engineers. For everybody else, this is a truth table, okay? So basically let's take the case where I have, let me see, I think this is an example of the previous one here, A, B, or C. Okay, we have only A true, only B true, only C true. By the way, I copy and paste these. I never reinvent. I keep a little truth table handy with a 4 and an 8, and even a 16, and I just go and I plug in whatever the thing is that goes with it. Okay, so now we could see A and B, B and C, A and C, all true, all false. We have eight conditions. Here's the actual results. Getting back to this example back here -- actually, my example covers all of it. Okay. Here's the actual results. This is what we know is going to happen from looking at the code, right. X is executed. Nothing happens when we have only C true. Okay. Here's the actual results, we can tell by looking at the code. The expected results we'd have to get from the design. And I point this out because lots of times design documents don't actually have every possible condition in them. What's pretty typical is that lots of times the design document might have about half of these thought out or spelled out, and the rest are just implied.
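A small sketch of that truth-table step: I'm reading the third example as "A or (B and C)", so the operator precedence is my assumption; the helper just enumerates all 2 to the N combinations so you can fill in the expected column from the design.

```python
from itertools import product

def truth_table(n_conditions, predicate):
    """List every combination of truth values and the outcome of the complex condition."""
    return [(values, predicate(*values))
            for values in product([True, False], repeat=n_conditions)]

# 'if A or (B and C) then Z' -- 2**3 = 8 rows to compare against the design.
for values, outcome in truth_table(3, lambda a, b, c: a or (b and c)):
    print(values, "->", "Z executes" if outcome else "nothing happens")
```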
So one of the first things you want to do is make sure you know what is expected for each one of these. And during the break I was talking with James and, I forget your name again.

Brett.

Brett. James and Brett and I were talking, and we discussed that one reason why you do unit testing is because you're guaranteed to find some hole in a design document. This is the place where people try to find holes. So you want to get this truth table knocked out early because you may need to go back and ask for a clarification on these. So that's why I tell you to go ahead and get that from the design. Okay, then if we want to have 100 percent logic coverage, we would actually need to execute all of those. Now, this is I think something that needs to be part of scope. You guys, I think when you're doing unit testing, you need to decide whether you're going to cover the logic or not. Lots of times on different projects, people will decide whether to cover it or not. So, for example, with DO-178B, correct me if I'm wrong, James, but I think the different levels talk about this. Is this required at level A?

Yeah, it's required at level A.

Is it required -- I don't think it is required at C or D, is it, or not?

It is required at [inaudible].

Okay. So this is, this would be required for DO-178, A and B?

Yes.

Okay. So basically, I think, you guys, whether or not to do this testing kind of falls under scope, because I will tell you, this is a lot of work. It's a chunk of work. This is probably more work than doing the path testing. But, again, there are tools that do this, so I think it depends on the requirements for the system, whether you're going to execute all these. Okay.

Now the important thing is to determine the outcome for each of the logical values and compare it against the expected. Don't take guesses for the expected outcome. Based on my experience, what I have found with logic testing is software engineers are really good at thinking out their ifs when everything is true. Where it tends to fall apart is where some things are false. That's where the logic testing -- so if you want to pick and choose, I would pick and choose testing on the false conditions, because the true conditions probably work. Okay. Some do's and don'ts for path and logic level testing. Do understand the mechanics of this before you get an automated tool. You guys now understand the mechanics of it, I'm pretty sure, so you can go ahead and get a tool. So, I think, you know, there's tools out there that'll work for you, and I'll show you what some of them are. Do test what the code should do, not just what it does. Choose tools that merge the line and branch coverage. One of the important characteristics of these tools that I want to show you is, some of them merge, and some of them don't. What that means is, let's say you're unit testing 45 days in a row. You stop, you start, you stop, you start. It would be really great to be able to merge all that so that when you're done, you can keep track of the coverage for the whole code. So, for example, if you're working on a project that's DO-178 certified, you have to show evidence that you did the coverage. So you'd probably need to have one of these tools that merge the results, otherwise, it's just your word, unless you print out the results or do something. It's a really big bookkeeping effort to combine all of the line coverage. So you might as well get a tool that does it for you. If you want to know my opinion, one of the single most useful things about having a tool is to have a merge. And when I go through and show you some of the tools that are available, I'm going to show you which ones merge and which ones don't.
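Just to illustrate the merge idea, here is a minimal sketch assuming Python's coverage.py; this is not one of the tools covered in the class, and run_some_unit_tests() is only a placeholder for whatever tests you run on a given day.

```python
import coverage

def run_some_unit_tests():
    pass   # placeholder for that day's unit tests

# Each run writes its own data file (data_suffix=True), with branch coverage on.
cov = coverage.Coverage(branch=True, data_suffix=True)
cov.start()
run_some_unit_tests()
cov.stop()
cov.save()

# Later -- after many stop-and-start runs -- fold all the data files into one report.
total = coverage.Coverage()
total.combine()
total.report(show_missing=True)
```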

Okay. Choose tools that are easier to use than doing the test manually. I would definitely try them all out. I'm pretty sure all the tools come with evaluation versions. If you think it's harder to use the tool than it is to test it manually, then test it manually. It's an option. Okay. Anybody else have any suggestions for the rest of the class on do's and don'ts? I know several of you have been to the school of hard knocks. Okay.

Just an aside, if you have a large program [inaudible], you can do a semiformal trade [inaudible].

I missed one word of what you said; semiformal -- ?

Trade, trade study.

Oh, trade study, okay.

[ Inaudible audience response]

That is an excellent point, and I can't believe I didn't put it on my slide. Make sure everyone likes the tool. That is enormously important. I'm actually shocked that I left that one off. Yeah, if one person likes the tool and 19 people hate it, things aren't going to go well. So, getting back to that, a couple of tips that I have. You know getting people to use these tools is really, it can be really difficult because they're not really wanting to unit test to start with, and then you give them the tool and then they've got to learn the tool and all that. I would suggest that you try to get tools implemented a couple of people at a time. Get all of them to evaluate the tool, as you said -- I forget your name.

Bob.

Bob.

The late guy.

As Bob the late guy said, I think everyone needs to buy into the tool and say, yes, I like this tool, and get buy in, and personally, one way to get people to use the tool is, if they feel like they were part of the buying process, they are more likely to use it. It's just a -- it's a nice thing to do that doesn't take very long. And they may, you know, they may find some things for the tool that maybe one person who's evaluating it didn't find. But anyway, getting back to that, I would suggest that once you've selected the tool that you try to get it implemented in phases. Get a couple of people to use it, then a couple of other people. If you try to get 20 people to use it at once, you may have the help phone line to the vendor ringing off the hook. So, I would try to get it done in batches, so that's another thought. Any other suggestions from anybody who's done this? Okay, good.

All right. Let me see. I think I want to go on to the next section before I show you guys a class example. The next thing I want to talk about is module level exceptions -- module level exception handling, I'm sorry. The purpose of module level exception handling is to verify, first of all, that the module has exception handling, that it is able to handle invalid inputs at a module level. This is not at a system level or an interface level; it is a module, it gets bad inputs coming in, can it handle them and do something. Secondly, it verifies that the exception handling is correct. So the first step is, does it have it; is it correct? Lots of times it's wrong. Thirdly, verify the exception handling is not empty. Lots of times, this is pretty common where software engineers will write the infrastructure for their exception handling, which is actually a very good coding practice. They will write everything out to where the exceptions are trapped, and then they forget to fill in the code. If I had a dime for every time I've seen this, I would not be teaching this class. It's just a human mistake. It's not an intentional mistake. They just forget to fill it in. So you've got to look for that. Fourthly, verify the exception handling has the correct recovery. Most of the time when the first three things are in place, this is where things fall apart. So, the code traps the exception, the code now knows we've got bad data, but it does the wrong thing with it. And I think you guys were talking about it during the break, how you percolate the failures up. There's different design approaches for how to handle exceptions. Some design approaches have the failures percolate up, some have it mitigate the failure right there. This is where things are most likely to fall apart right here, is what do we do now. We've got bad data, what happens? Okay. Fifth is verify that an exception is cleared after it's happened. If the code gets through these four steps and it's okay, this is where it falls apart next: lots of times the code will trap an exception, but it forgets to clear itself, so the next time valid data is passing through, it gets trapped, too. So you need to make sure it clears the alarm after it happens. And finally, the very last thing -- and these two things are actually tied together; I intentionally put them together -- is to make sure it only executes when there is an exception. The worst thing you can have is an exception handler that's overly picky. The software would never be able to complete a job. So these are the six things to look for. Can anyone think of anything else you'd look for at a module level? No? [inaudible].
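As a hedged sketch of what those checks can look like as unit tests, here is a toy parse_date() stand-in (my own invention, not the class's date module) plus three tests that hit the "not empty", "clears afterward", and "not overly picky" points:

```python
import pytest

def parse_date(text: str):
    """Toy stand-in for a module under test: returns (month, day, year) or raises ValueError."""
    month, day, year = (int(part) for part in text.split("/"))   # ValueError on junk
    if not (1 <= month <= 12 and 1 <= day <= 31 and year >= 0):
        raise ValueError("out of range")
    return month, day, year

def test_invalid_input_is_trapped():
    # Checks 1-4: the handler exists, is not empty, and rejects the bad input.
    with pytest.raises(ValueError):
        parse_date("99/99/9999")

def test_exception_clears_afterward():
    # Check 5: a failure on bad data must not poison the next valid call.
    with pytest.raises(ValueError):
        parse_date("not/a/date")
    assert parse_date("02/29/2024") == (2, 29, 2024)

def test_valid_input_is_not_trapped():
    # Check 6: the handler must not fire on good data (no overly picky handler).
    assert parse_date("01/31/2024") == (1, 31, 2024)
```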

Anyway, getting back to my date example. There's all kinds of things that I could test here for exception handling. My tests here -- I've named them EH for exception handling. Okay, we already know, based on our path test, that these three things right here, we already know we're going to test those for the path testing, right? We're going to test a year that's too big or too small, a month that's too big, too small, a day that's too big, too small. Now what's interesting is, in the path testing we only needed to test one of these two. We didn't need to test both of them. So for exception handling, you might want to actually test the other end of it. So if during path testing you tested a year that's too small, you might want to test one that's too big here. And the same thing for the month and the same thing for the day. So you probably want to catch the other side of the exception, because there can be a bug that's related to that exception. So these are already covered in the path testing. Some other things we could enter: illogical dates. What happens if we enter 9999999? Well, it's a valid date, but it's kind of silly, so what does the code do? A corrupted date, nonnumeric characters. How do we know that this thing getting passed in is really a number? Did the path testing ever test that? No. It probably would have been trapped in one of those when we got to the bottom and didn't have a valid number, but it didn't intentionally test for a corrupted date. We could have a missing date. It could be totally blank. We could have blank, blank, blank, blank, blank, blank. We could have a date that's stale. What if it's not updated at all? Now here, this is actually a good integration test, but we can test it at a module level. What if the module does nothing for a date that's input? We could also have valid and invalid dates, and what we want to do here is pick something that wasn't tested elsewhere in the path testing and so forth. Okay, so some of these would be covered, and some would not be. Can anybody think of anything else that could be covered as an exception?

[ Pause ]

Okay? Well, this probably covers it for that example. Okay. I'm going to move on to domain testing. Domain testing is where now we're actually going to look at the inputs themselves, and not the logic. Okay. The very first thing we're going to do is identify operations performed on a range of input values. Here is an example of a domain. Okay. If X is greater than 10 and X is less than 20, then the code executes segment A. This is pseudocode, by the way; this isn't real code. So, if it's greater than 10, less than 20, we execute segment A. Otherwise, if the value X is greater than or equal to 20, we execute segment B. Otherwise, if X is less than or equal to 10, we execute segment C. So the very first step is to look through the code and find domains. One of the things that the tools do that I was telling you about is they do find the domains in your code, so that's one thing that's useful. Okay. We want to determine the entire range of input values next. So for this value X -- let's say I didn't actually define X, which I should have. Let's just define X as a floating value, meaning it could be 10.1, it could be minus 10.1. So let's just assume X is a floating value from minus to plus infinity. Minus to plus infinity would be defined by whatever the computer is that's storing this variable. Okay. So the entire range of inputs would be from minus to plus infinity; we've defined that. Okay.
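In Python-style code, that domain example might look like the sketch below; classify() is just a name I'm using for illustration, not anything from the slides:

```python
def classify(x: float) -> str:
    """Three segments covering the whole input range -- no gaps, no overlaps."""
    if 10 < x < 20:
        return "segment A"
    elif x >= 20:
        return "segment B"
    else:              # everything else, i.e. x <= 10
        return "segment C"
```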

Now we want to analyze the code and determine if there's any gaps in the domain. So you can see in my date example -- remember that date example I showed you? I didn't have the line coming down for the months that had 31 days. That's actually an example of a gap. The code probably would have worked, but maybe not. We don't really know. Anyway, we want to look for gaps, we want to look for overlaps. The overlaps are almost certainly bugs. The gaps may or may not be bugs. You can have a gap in logic and have the code still work. You could also have unreachable code, which is definitely a bug. Okay. Then you want to determine the domain of input values for each segment and plot them to determine what the tests are. So here is an example. Okay, here's a simple example of a gap in a domain. If X is greater than 5, else, if X is less than 5, then do something else. You would be shocked at how often this happens. I mean, I see it almost all the time. I review people's code sometimes for a living, and this is the first thing I go through and look for, and I'll find this almost all the time. I don't find hundreds of them, but I'll find at least one in some code. So the problem is, we're pretty certain the software developer probably did not intend for this. That's a pretty reasonable guess. They probably don't realize that nothing is going to happen when X is equal to 5, because they probably left out a less than or equal or greater than or equal. So this is an example of a gap. These are super easy to find if you're looking for them, super hard to find if you're not looking for them. Okay, so this is probably a bug.
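In code form, that gap looks something like the sketch below; gap_example() and the return strings are placeholders of mine, not from the slide:

```python
def gap_example(x: int) -> str:
    if x > 5:
        return "do A"
    elif x < 5:
        return "do B"
    return "nothing"   # x == 5 silently falls into this branch -- the gap,
                       # almost certainly a missing >= or <= in one condition

print(gap_example(5))  # "nothing" -- probably not what the developer intended
```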

Overlaps in logic. I see this all the time, too. Some people do this intentionally and it's extremely poor coding practice, but if A is greater than five, then do X. Otherwise, if A is greater than 10, then do Y. This is even more common than the previous one. And it's entirely possible this code might actually work the way the software engineer wanted it to, but it's extremely poor coding practice. Okay. Because, first of all, it doesn't show what happens when A is less than 5 or less than or equal to 5. That may or may not be a bug. And obviously, what would happen when A is equal to 6, 7, 8, 9, 10? Well, X is going to happen, but maybe they wanted Y to happen. We don't really know. So this is probably a bug, more than likely. This is something you want to look for when testing. Okay, impossible logic, this is almost certainly a bug. This is an example of an overlap, but it's an overlap that causes something to be impossible. In this example, this logic was not possible. Y can still execute -- wait a minute. No, it is. Oh, I take that back. Y is never going to -- no, this is impossible code, too. Never mind, I was thinking of something else. So this is impossible code, as well. This is also impossible where one value won't execute, Z. So, I'm pointing these out because these are very, very common. Okay. This is just a different flavor of this. This is when one value is out in nowhere land. And in this case a whole bunch of values are out in nowhere land. And most of these are probably a bug. Okay. So how do we test the domains? Let's assume that we've looked at our domains and there's not any gaps and there's not any overlaps. Okay. Well, we still have some testing to do, because there could still be some bugs. What we'd want to do is lay out the domains. Okay, here is -- if we go back here, this is the example I showed a couple of pages ago. Let me see if I can go back here. There. You see this example here? We're going to see that now here. All right. This is what it looks like graphically. All right. The black dot means it includes that number. The clear dot means it excludes it. So domain A is between 10 and 20, but does not include 10 and 20. Domain C includes 10 and everything less than 10. Domain B includes 20 and everything greater than 20. So it's just a graphical representation of that logic. Again we look at that: are there any gaps? No. Any overlaps? No. So, for now it's fine -- at least it doesn't have any gaps or overlaps. There's not any unreachable code. But is it right? We don't really know. We've got to test it and see if it's right. So, this is an example of a two-dimensional case, and this was the two-dimensional case I showed a few slides ago. I don't want to go back because I know we're being videotaped and they may not be able to see it very well, but anyway.

If you have two dimensions, you can draw a plane, okay, and you could see here this was the formula. Are there any gaps along that plane? No. Any overlaps? No, it's fine. So we passed the first check, as we don't have any obvious defects. Now what we need to do is find out how to test this. So, oops. For each domain, select one point just inside it and one point just outside. Don't forget the extreme positive and negative points, which are plus and minus infinity most of the time. These are the test points for that line example. For A, let me go back and I'll show it to you. For A, pick a point that's just outside. Well, 10 and 20 are outside; either one will work, we only need one. A point that's inside could be 11 or 19. For C, a point outside could also be 11, a point inside could be 10. For B, 20 is inside, 19 is outside. Okay. So we go back to the example. Oops. Okay, so basically, when you consolidate all of those and get rid of the redundant test points -- we want to test 10, 11, 19, and 20, and we want to make sure that whatever was supposed to happen happens. Because it could be -- what is the one bug that we're looking for here? What could go wrong with this? What if these are in the wrong place? Usually when these are in the wrong place, there's an off by one bug. We're trying to find the off by one bug. We're also trying to find -- this would be, if we tested plus infinity, we want to make sure it doesn't roll over. If we test minus infinity, we want to make sure it doesn't roll over. So we want this point, that point, and the zero. There's been a lot of research to show that when it comes to domains, the bugs happen right here. So, we don't need to test 4, 5, 6, 7, 8. All we need to test are these values here and that will test the domain. So we can combine these with whatever path tests we had designed earlier and test them. So you could see for domain B we had 19, plus infinity, and 20. Domain C was 11, minus infinity, and 10. And we only really have one, two, three, four, five, six test points, that's it.
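Pulling that together as a sketch, and reusing the classify() stub from earlier, the six consolidated boundary tests might look like this; note that "plus and minus infinity" here just means the largest and smallest values the variable can actually hold, which is why the float limits stand in for them:

```python
import sys

BOUNDARY_TESTS = [
    (10.0, "segment C"),                # on the boundary between C and A
    (11.0, "segment A"),                # just inside A
    (19.0, "segment A"),                # just outside B -- the off-by-one check
    (20.0, "segment B"),                # on the boundary between A and B
    (sys.float_info.max, "segment B"),  # "plus infinity" for this variable type
    (-sys.float_info.max, "segment C"), # "minus infinity"
]

for x, expected in BOUNDARY_TESTS:
    assert classify(x) == expected, f"possible off-by-one or rollover bug near {x}"
```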

What a lot of people will do when they test domains is they'll try to test all of them. You're going to be testing until the end of time if you test all of them. What I would suggest is that these six are more than enough. Okay. All right. So once you've defined them, eliminate the redundant test points, both within the domain tests and across them. If some of these points are being tested in a path test, then you don't need to retest them, okay. Determine if the results match what was expected. For two or three dimensions, follow exactly the same instructions; you're just working on a plane instead of a line. Okay. As a class exercise, determine the test points for the two-dimensional example. Okay. This is a good stopping point. We're going to take a break, and then I'm going to let you guys just graphically point out what you would test on this example.

Module 3

Ok let's do a class example here. We're just going to do this very briefly. Ok, here's a line on a plane. Ok, can you all tell me -- our goal is one point just outside and one point just inside, but we also want to test something on the line, right? So we want to pick one point somewhere on that line for sure. And then we'd want to pick one point just outside of it. We could just pick something right in the middle, whatever that point is, and test it. And then on the other side, whatever one of those points is, test it. Ok, we can pick anywhere along there, it doesn't matter. We also want to test this one out here and something way out here. And that's it, that's really all we need to test for a plane. Now I had a very good question during lunch and I feel really bad not saying this. I'm actually going to tell them your question.

Class: I have a question.

Yes? Class: Would you also test how low [inaudible] for the negative numbers just to make sure that it's been working accordingly or no?

Yeah, you might want to pick one down there too. The, the [inaudible] actually didn't cover negative values. A lot of people like to test negative values because they like to test negative values. In this case we could've picked a value here, we could've picked one over here. We could pick one there, but as a minimum we want to test something right on the line, something right off of it, and something right off of it on the other side. So you could definitely do additional testing. You can pick your point way down here and that'd be fine. Ok, you also had another question during break and I'm actually appalled that I didn't explain this. The question was, for testing plus and minus infinity, what's the actual value? To test plus and minus infinity here, what we have to do is find out what kind of variable this is, and find out what is the biggest number that can be stored in that variable, and that's plus infinity. Minus infinity would be the smallest. So sometimes minus infinity could be zero, depending on whether the variable is signed or unsigned. So plus and minus infinity are whatever the biggest and smallest values are for that variable. And if you're a software engineer looking at the code, you're going to know what that is. Ok, very good question. Any other questions on domain testing? Can you all see that some of the domain tests do coincide with a path test, but a lot of them don't? They really kind of augment the path testing. So this is where, if you want to find where bugs probably are, it's probably somewhere in these six points. Particularly the plus and minus infinity, that's where you see the rollover bugs. Have you guys ever heard of rollover? That's where you see rollover bugs.

Class: Y2K.

Y2K, yes, that's true. Oh, really, yeah, that's exactly true, nobody thought of the year two thousand. You are exactly right, that's a very good point. Ok, so let's see where we got here -- all right, so do's and don'ts for domain testing. Do make sure there's no gaps, no overlaps, no unreachable code. Don't take guesses about the domains. If you find gaps or overlaps, let somebody know and get some clarification on it. Ok, the next type of unit test I want to go over is math testing. It verifies that mathematical operations do not cause erroneous results, underflow, or overflow. Ok, to do math testing you need to know what the most common math faults are, and these are it. The very first one is dividing anything by a variable that can potentially be zero or close to zero. A lot of people misunderstand this bug. It doesn't have to be exactly zero, it can be close to zero and still cause a crash or hang. Multiplying two very large numbers can cause an overflow, a natural log of a negative number, and so forth. Can anybody think of any other mathematical failures that could happen?

Class: Multiplying by zero.

Multiplying by zero when you don't really want to multiply by zero. Yeah, you can get zero as an answer when you weren't expecting one. Ok, there's all kinds of bugs that can happen when you're taking approximations. Your approximation may not be accurate enough and so forth, so ok, good. So basically what you want to do here is look for the mathematical functions and find out how they could explode. Ok, you could have a requirement for a data item to be bigger than the maximum size of the variable. It could be your variable's too small. Normally what happens with these bugs is your variable's too small to hold the worst case scenario. So that's what we're looking for. Ok, if you want to do math testing you are almost certainly going to need some kind of test bed. So a spreadsheet, a math program, etcetera. For example, I'm going to show you one program after lunch that has a lot of math functions in it. And the test bed is an Excel spreadsheet where the functions are all laid out and you have the answer. So you want to lay that out ahead of time and check your results. And when you do this, lots of times you will find some bugs just by creating those spreadsheets. Make sure the algorithms work for all valid inputs. And if necessary, define the valid inputs from the requirements and check the units. Ok, did I skip something? Ok, anyway, I'm not going to show you an example of that right now but I'll get to one shortly. Using this example, did I go back a step? Oh, I'm sorry, this is where I want to present the example. Ok, I don't know how this is going to work. And then,
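Before the checklist example, here is a hedged sketch of those common math faults turned into concrete checks; safe_divide() and the epsilon threshold are illustrative assumptions of mine, not prescribed values, and the tests just demonstrate what "explodes" in each case.

```python
import math
import pytest

def safe_divide(numerator, denominator, epsilon=1e-12):
    # The classic fault: the denominator doesn't have to be exactly zero,
    # just close enough to zero to blow up the result.
    if abs(denominator) < epsilon:
        raise ValueError("denominator too close to zero")
    return numerator / denominator

def test_divide_near_zero_is_trapped():
    with pytest.raises(ValueError):
        safe_divide(1.0, 1e-300)

def test_log_of_negative_is_a_fault():
    with pytest.raises(ValueError):      # math.log rejects negative inputs
        math.log(-1.0)

def test_two_large_numbers_overflow():
    assert math.isinf(1e200 * 1e200)     # the product no longer fits the variable
```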

[ Silence ] first of all let me show you the example checklist. I'm going to need to sit down for this.

[ Silence ]

This is on your CD by the way. Here is an example of a unit testing checklist. Oh shoot, I always do that. Ok, let me make this bigger. Ok, here is an example of a checklist and I have filled it out for some software. So don't pay attention to the stuff that's filled in right now, just look at this column. Here's the unit testing checklist. It should have some basic stuff at the top: the developer, the unit tested, is it new, etcetera. Now you can see I have the SRS requirements in here that I'm testing. How frequently -- I have the frequency in there. So all of our planning stuff I put right at the top of the worksheet. Ok, so now the purpose of my example actually is to show the math part. So I haven't filled this stuff in, but if we were doing real unit testing, these are the path tests that we covered in the path testing. You can see that it's a memory jogger. Ok, does the module have any module level exceptions? Ok, it does. I'm going to show you the module in a second. Here is more of our domain checklist, remember all the stuff we learned. Everything that we're going to capture in class, and here's more stuff that we haven't captured as well, the math testing. One example I want to show you is -- I'm going to show you the software requirements in a second -- I found a problem with the spec, so I highlight it in red; that tells me that I found a problem in the spec. It's just a simple bookkeeping thing that you guys can do. Do any of you have any tools like requirements management systems, like DOORS, and do you also have like ClearQuest? Ok, you can do all of this in ClearQuest too. So what I'm showing you here -- you're laughing, do you have a problem?

[ Laughter ]

Ok.

Class: We rebuke those questions actually.

Ok.

[ Laughter ] Well, I'm just pointing out that what I have in here you can incorporate into some other system. Ok, you can do all of this through ClearQuest too, but anyway. Alright, so here we have the worksheet and it has all of the things that I'm looking for. Later on I'm going to have more stuff that I'm actually going to teach in class. But let me show you the example now. This is probably a decent thing to do right before lunch. Ok, I want to give you an example right before lunch of some real software. And let me go back to the beginning. Ok, this is actually my software. I decided to be kind of gutsy and use my own software as an example because, believe me, I know every bug that's in it by now. This is a software program that I first wrote maybe twelve years ago. And somewhere along the twelve years I had probably uncovered almost every bug in it. And I found that this example actually has everything that I'm going to teach in class; just out of complete coincidence, I can show everything with this. The software's purpose -- let me make this bigger. I'm just going to show you a fragment of the SRS because this is going to be our class example for the rest of the day. The software's purpose is to predict software reliability. It predicts fielded defect density, which is defects over KSLOC. It can predict -- these are the system requirements back here -- it can predict fielded defect density using only one question, which means you answer one question and it tells you defect density. Ok, it also has the capability -- let me see if I can get my pointer here -- it also has the capability to predict defect density with fifteen questions. By the way, these requirements were given to me by my customers. They said, we want to have a short model, one question. Then they said, we want a little medium sized model that's fifteen questions, so that's where that came from. Then the software also can predict defect density using a detailed survey that has a bunch of questions related to development process, product, etcetera. So these are all requirements for the system here. So for what I want to show you now, I just want to go on to the next page. Here are the system, the software requirements. So the software requirements were actually mapped right into the system requirements. We took those, and all the italicized requirements are now software requirements that were derived from the system requirements. So the very first requirement was that it should predict software reliability and metrics without testing data, using defect density models. We get to the very first one, which is that it should use only one question. So the very first model here is, that question is, what's your SEI CMMI level? So the idea behind this model is you plug in your SEI CMMI level and it pops out a defect density. So that's what the software does. This is pretty simple stuff, ok, so let's just focus on that. Here are the levels here. So the possible choices are: we don't know -- we don't really know what our SEI CMMI level is; we are unrated; level one; between one and two; two; between two and three; three; and so forth. You can actually be between a one and a two if you got assessed on half of your stuff at level two. Anyway, so here's the defect density that pops up when the user selects one of those choices. Ok, so then we go to the SDD. And I believe that's all the detail that we have for the SDD. So let me actually go forward here.
There's a couple of math examples in here.

Class: It looks weird.

What?

Class: [inaudible] four and five is higher than [inaudible]

Yeah I was wondering when somebody was going to notice that. Well keep in mind it's not by much. So I was waiting for someone to notice that.

Class: [inaudible]

[ Laughter ]

Ok, well, yeah, that's true. Let's look at this keeping in mind everything we learned in class. You know, what are some of the things we want to test even at a module level? Let's just say this one module does this computation. So right off the bat, what are we going to test? Well, we haven't gone over this yet, but we should test every one of those, right. Ok, and is there anything else you can think of that we need to test? Ok, for just testing a table, that's probably it. Ok, so let's look forward and go to a couple of other requirements here. I know there's some math algorithms here I want to get to. Here we go.

[ Silence ]

I think it's big enough to look at -- no, now it's too big. Ok, so we get down further; now it has some other models and so forth that are listed here. Nope, the thing I want to show is still not big enough.

[ Silence ]

Now, one of the things it does down here -- there's a bunch of other models that are used here, which we talked about. When it gets done with the defect density prediction, it tries to predict the other defects by using a -- oh, I'm sorry, let me take that back. What it does is, somewhere in here there is a requirement that it's going to multiply the defect density by size. Here it is, and it's going to yield predicted defects. So you get the defect density, multiply it by size, and now we have predicted defects. Ok, it's a pretty simple concept. Then it can predict the number of defects in an interval of time. So let's say your software is going to be operational for like ten years. Well, the very first thing it predicts is how many total defects will happen over that ten years. The second thing it does, with requirement 1.1.4, is it predicts how they're going to happen over the ten years. So for example, here's your ten years -- let's just say one, two, three, four, five, six, seven, eight, nine, ten. It predicts something like this; that's what that formula does. Ok, so you guys don't have to understand all of the numbers behind this. Right off the bat, what are we going to test with this? Well, we're certainly going to test whatever is inside those exponents, right. We want to make sure that that's not zero, ok. But how about from a boundary standpoint, what would we want to test? Well, that thing right there, we want to make sure it's not zero. If any one of those things is zero, that's going to cause a problem; so we want to make sure this thing's not zero right there. We want to make sure this thing is not zero. Then we want to make sure everything inside of there is not zero. Right, can you think of anything else? And then we want to make sure that this thing times this thing, as somebody else in the class said, is not too big. There are a variety of ways this thing could blow up, so we want to test all of them. Ok, let me see what else we have here. Now let me go back to the example spreadsheet.
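Purely as an illustration of the checks being described -- the real formulas live in the instructor's SRS, so the function names and the exponential shape below are my own assumptions, not her model -- a sketch of the prediction step with the zero and overflow hazards called out:

```python
import math

def predicted_defects(defect_density, size_ksloc):
    # Requirement-style step: defects = defect density * size.
    # Worst-case inputs: make sure this product can't overflow the variable.
    return defect_density * size_ksloc

def defects_in_year(total_defects, decay_rate, year, life_years=10):
    # The decay rate ends up in exponents and a denominator, so zero must be rejected.
    if decay_rate == 0:
        raise ValueError("decay rate of zero would divide by zero below")
    spread = math.exp(-decay_rate * (year - 1)) - math.exp(-decay_rate * year)
    return total_defects * spread / (1 - math.exp(-decay_rate * life_years))
```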

[ Silence ]

Ok, alright, I don't want to show that yet. Alright, I'm kind of done with showing this, I think, since I haven't taught you guys the rest yet. So anyway, we can see here the checklist just keeps track of everything that we want to test on this thing, ok.

[ Silence ]

Ok, I'm going to give one big example on this when we get back. Right now I just kind of wanted to show you the example. I'm actually going to have you guys, when we come back from lunch, sit down and look at the SRS and crank out all the tests, but I don't want to do that right before lunch. So let's move on. Ok, functional white box testing. This is the very last type of test I want to go over for unit testing. That's when you verify the code performs its intended function. There's two viewpoints here, a data viewpoint and an event viewpoint. So depending on what you're verifying in the SRS or the SDD, you may have two different things to look at with the module. Ok, so here is an example of a data viewpoint: testing zero data elements, testing one element, testing many elements, testing valid combinations of elements, testing valid, invalid, valid. What this means here is, let's take our date and time example. What if we test a valid date, then an invalid date, then a valid date, then an invalid date? What would you think would be the purpose of doing that? You want to make sure it clears itself, right. So here is where we can test, you know, for combinations of data. We might want to test one date and then a whole bunch of dates. We might want to plug in like a hundred dates to make sure it can handle it. So this is an example of functional testing from a data viewpoint. So here's some of my examples. We might decide, for example -- we came all the way down here -- we can try sticking in no dates to see what happens, and I don't know what that would accomplish. But we can stick in one date, two dates, many dates. We can select from a range of centuries, which, as you said earlier with the Y2K bug, is one thing no one ever thought of doing. We could select from a range of months. Select the first and last day of the month if we think the thirty-first is more error prone than the thirtieth. We can do, like I said, valid, invalid, valid. Ok, from an event viewpoint at a module level, some of the events we could have are largely related to loops and how many times the module executes. So let's say this module has a loop in it. We want to try testing that loop zero times, one time, many times. Why do you think we'd test a loop zero times? Let's see, what kind of things loop? I'm trying to think what would be a good looping example at a module level.

Class: [inaudible]

Yeah.

Class: Zero.

Yeah, I'm trying to think of something that has a good do loop. Let's say the code's a parser. The parser is reading through some file and it's looking for some information. So our loop would be until it finds the end character or the end of the file. So let's just say this is a parser and it's parsing through. Well, what if the parser never executes at all? We really should test that. It could be the initial condition is never even met and the parser never even starts. We can do one loop. We can do many loops. What would be the example, let's say with the parser, of doing many loops? Well, it might run out of memory eventually. So this is from an event standpoint; it means what the software is actually doing. We can also test from one execution of the module to many. So let's say the function does the date and time. What would be the advantage of executing that many, many, many times? You're not necessarily looking at the data, just the event. We want to make sure that it can, because it could be that it doesn't reset its initial conditions and so forth. It could be it runs out of memory.

Class: [inaudible]

So, Yeah?

Class: You might want to check whether you iterate on your range [inaudible] go off the end of the range.

Oh, very good.

Class: [inaudible]

Yeah, the parser has all kinds of things that we could find by testing many times. Ok, so these are just kind of it -- zero, one, many is a good rule of thumb. In fact, a lot of software engineers memorize these. Whenever I have a loop or a function or something, test it zero times if it can be zero, one time, two if it's applicable, and many. Can anybody think of anything else you'd do at a module level that's either data or event related? There are other things you can do; I actually summarized them. Ok, anyway, I think we've kind of killed unit level testing. I have some pseudo code there. I would like for you guys to take a look at it and we'll determine the tests. I actually have the example here. Alright, I'm going to look at my example. What page are we on?
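Before the pseudocode exercise, here is a quick hedged sketch of zero/one/many from both viewpoints, using a toy parser of my own invention (one record per line, with a trivial validity rule) just to show where each rule-of-thumb test lands:

```python
import io

class Record:
    def __init__(self, line: str):
        self.ok = "?" not in line        # toy validity rule, only for this sketch

def parse_records(stream):
    """Toy parser: one record per line, looping until end of file."""
    return [Record(line) for line in stream if line.strip()]

def test_zero_iterations():
    # Event viewpoint: the loop body never executes at all.
    assert parse_records(io.StringIO("")) == []

def test_one_iteration():
    assert len(parse_records(io.StringIO("one record\n"))) == 1

def test_many_iterations():
    # Data viewpoint: many elements -- also watch memory and end-of-range behavior.
    assert len(parse_records(io.StringIO("record\n" * 100_000))) == 100_000

def test_valid_invalid_valid():
    # A bad record in the middle must not poison the records that follow it.
    statuses = [r.ok for r in parse_records(io.StringIO("good\n???\ngood\n"))]
    assert statuses == [True, False, True]
```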

[ Sound of turning pages ]

Ok, so this is just basically code, getting back to the array that we were talking about earlier. This section of code will read a number of client records from the database one at a time and accumulate statistics on them. By the way, the type of bug I'm showing here has happened multiple times in history. I just simplified it to make something kind of simple for class. Ok, there's some client array and it has a name, a gender, a homeowner, and an income. So there's a whole bunch of clients in this database; their name, their gender, whether or not they own a home, and their income are in it. It's possible for the gender to be male, female, or blank if unavailable. It's possible for homeowner to be yes, no, or blank if unavailable. So this is the pseudo code for the code. You input the client array, the type, and the size. There are local variables that keep track of certain statistics, like how many men and how many women, their average income, and so forth. So the way the code reads is, for index equals one to size, which is how many clients are in the database: the code looks at the gender and accumulates the male count. It looks at the homeowner and accumulates the home count. But look at that logic: it says if the gender's M, accumulate the male count, otherwise the female count. What's wrong with that?

Class: [inaudible] count.

It,

Class: [inaudible]

Yeah -- do you guys know this is one of the most common bugs with data, forgetting about the blanks? The IRS mailed out something like fifty thousand refund checks to the wrong people because of blanks in a database. The bug was very similar to what I'm showing here. So my point is, we could tell just by looking at it that they completely neglected to capture the blank. And the reason why is they had that else condition that wasn't very specific. So we go back to our tests -- even if we hadn't noticed that bug, the path testing would have found it, because there would've been no path for the blanks and we would've seen that. Ok, the same thing for the homeowner: it didn't account for the blanks. Ok, then it accumulates the income, but what it doesn't account for -- I have many, many bugs here. Let's see, these are the many bugs that we have. Actually, if you tested with a totally empty database it would crash, because the average income would be divided by zero. If you test with one client, that's a good test; many clients. So this is an example where we want to populate the database with nothing -- remember that zero, one, many? That's where the zero, one, many is popping out. We want to add nothing in the database, one person in the database, many people in the database. Testing missing or invalid gender, homeowner, and income; mathematically, there was a division by zero at the end. The total domains for the gender and homeowner characters were not just those four values. So the domain test would have also found that bug. It would have found, hey, there's nothing defined for blank. You could see the path here. So basically, do you guys see how we can apply what we've learned in class on just about any code and find something interesting? Ok, alright, let's go through the last part of the unit testing section before we move forward. We've gone through the different tests; now I want to show you some metrics to use for efficiency, and some suggestions on recording failures. Ok, failure recording and metrics go hand in hand. The two of them actually happen at the same time. You really can't have unit testing metrics without recording failures. Ok, the primary purpose of unit testing -- if we take a step back and think about it, what are we trying to achieve? We're trying to find defects that can't be easily found elsewhere. Ok, if we could find defects that could be found in black box testing, we wouldn't be doing this testing, right. Our goal is to find the stuff that somebody coming in behind us in the process isn't going to be able to find. So our metrics really need to address that. We also want to increase the branch and line coverage, which is actually part of that. The way we find the defects that can't be easily found elsewhere is by increasing the coverage. So we want to execute the lines of code that can't be executed easily elsewhere. So some useful metrics to ensure that this happens are, of course, line and branch coverage, which, if you're doing path testing, you're probably required to compute anyway. So if you're using a tool, it's probably going to tell you the line and branch coverage when it's done. So these two kind of go hand-in-hand with the actual test that's being conducted. You may also want to think about trends in the types of defects found during integration or system testing. So for example, I've got a lot of people who are validation engineers and system engineers who really want the unit testing to take place because, let's face it, they get stuck holding the bag.
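As a sketch of how that exercise code might look with the blank cases and the empty-database division handled -- the field names come from the exercise, but the rest is my own rendering, not the official solution:

```python
def accumulate_statistics(clients):
    male = female = unknown_gender = 0
    owners = non_owners = unknown_home = 0
    total_income = 0.0

    for client in clients:
        if client["gender"] == "M":
            male += 1
        elif client["gender"] == "F":
            female += 1
        else:                    # blank / unavailable -- the case the original missed
            unknown_gender += 1

        if client["homeowner"] == "Y":
            owners += 1
        elif client["homeowner"] == "N":
            non_owners += 1
        else:
            unknown_home += 1

        total_income += client["income"]

    # Guard the zero-clients case instead of dividing by zero.
    average_income = total_income / len(clients) if clients else 0.0
    return male, female, unknown_gender, owners, non_owners, unknown_home, average_income
```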
You know, if they have to stop testing for a week because somebody has to fix a bunch of defects, that's not good for them. They're the ones who are going to be stuck with the schedule delay. Alright, so basically some interesting metrics to show, first of all, that the unit testing needs to be in place, and second of all, if it is in place, whether it's effective. How many defects were found during system testing that blocked testing? That is the single biggest indicator. Your effectiveness during unit and integration testing can be measured by this. When we got to system testing, how many times did the whole group have to stop? Or how many times were we not able to execute a test because of a blocking defect? There's also defects that cause crashes. Some people track that; personally I think this one is the big one: how many times did the validation team have to stop to wait for something to get fixed? Ok, can you guys think of any other good metrics, any that you're using that work, you know? Basically, when you first start unit testing, sometimes there's some learning that goes into it, some growing pains. People aren't immediately good at unit testing even when they're trying to be. So think of a metric that could track the progress: okay, this is the first time we started unit testing and this is how we did; this is the second time and this is how we did; this is the third time. Can you guys think of anything else that would be good other than these really simple ones? I normally go with that one, and if applicable for the program that one, and if I have a tool, that one. Ok, alright, well let's see, let's define the exit criteria for unit testing. As we learned at the beginning, the exit criterion is that we want to find bugs that other people can't find. So some typical exit criteria could include a completed checklist. During my little exercise there I actually showed you a completed checklist for my SRS. It was all filled out. This could be a deliverable. For the projects in my database where the organizations had lower defect density going into the field because of unit testing, that's what they did — they just filled out the checklist. Some of them even did it by hand. They printed out the checklist and did it in pen. So this stuff doesn't have to be fancy. Some of them had scripts for all new and modified code if they used the tools. So their deliverable is one of those two things, and sometimes both of them together. For the highly critical or high frequency code — let's say, you know, I know of organizations that are developing FDA regulated devices, and not all of the code is safety critical, maybe 10% of it is. So for that 10% they might have required a formal review, like a really formal one with people other than developers. So that can be an output: the results of the review. The coverage can also be an output. One thing I do want to point out is the difference between 95% and a hundred is absolutely massive. How many of you have worked on a project where you had to have 100% line coverage? No one? Wow. To get to 100% is super difficult. In fact I've personally never actually seen anyone get to 100% without doing some sort of visual inspection. Ok, some of the code could be really difficult to trigger usefully. I mean you can execute the code, but do you have a meaningful result? We don't know. Getting to 100% — the difference between ninety-five and a hundred is absolutely massive. So I want to point that out here.
Your coverage may not necessarily be 100% for the lines when you're done. List of open issues: of course we want to have, going into the next step, the list of open issues pertaining to detailed design, which would be what are all the things you found that were missing from the SRS and the design document. Can anybody think of anything else that should be an exit criterion? What do you guys have in your NASA standard? What's your exit criterion for exiting this phase, is there anything in particular? I don't think I actually noticed anything in there. I'll take a look at it, but I can't think of anything else that you would have other than this. Ok, some wrap-up: in this part of the course we learned the practices associated with unit testing — how to plan and strategize, how to document and execute the tests, some do's and don'ts, some metrics, and some exit criteria. So we are exactly on time. Right now I want to go on to the next module.
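To make the zero-one-many idea from this module concrete, here is a minimal sketch of such unit tests, assuming Python with pytest; average_income and the client records are illustrative stand-ins, not the code from the class exercise.

import pytest

def average_income(clients):
    """Return the average income across clients -- the kind of routine
    that divides by the number of records."""
    if not clients:                      # guard the zero case explicitly
        raise ValueError("no clients in database")
    return sum(c["income"] for c in clients) / len(clients)

def test_zero_clients_is_rejected_not_crashed():
    # zero: an empty database must not cause a divide-by-zero crash
    with pytest.raises(ValueError):
        average_income([])

def test_one_client():
    # one: a single record exercises the boundary between "empty" and "many"
    assert average_income([{"income": 50000}]) == 50000

def test_many_clients():
    # many: several records in the database
    clients = [{"income": 40000}, {"income": 60000}, {"income": 80000}]
    assert average_income(clients) == 60000

The point of the sketch is simply that each of the three population sizes gets its own test, so a divide-by-zero or a blank-field assumption can't hide behind the "happy path" data.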

[ Sound of turning pages in a book ]

Ok, our next module is integration testing. I want to show you some facts about this and the focus of integration testing, some types of tests, and then I'm going to show you, like I did with the previous unit, how to plan the strategy, define tests, and execute tests, and then record failures and use metrics. Ok, these were the benchmarking results that I showed you guys previously at the very first part of the class. So some things we really want to look at during integration testing are input output faults, timing, interfaces, states, sequences, database interfaces. Of the projects you all are working on, do any of them have a significant database interface? Ok, good. Well, there's a lot of things that could go wrong in a database and we're going to cover those. Ok, so getting back to our viewpoint, our focus of testing: integration testing can be both white box and gray box, and sometimes some of the tests can even be black box, but we'll talk about that later. For the purpose of this class let's just assume we're testing with the code in front of us. If you guys remember, what we're focused on really is still the white and gray box tests. So the integration testing is still going to help us flush out the other 50% of the code that's not visible in black box testing. Ok, the focus of integration testing is against the code, the detailed design, and the top level design. So we're going to be looking at all of those things. People who do integration testing have got a lot of stuff to juggle. You've got a lot of things to look at. You're looking at the code. You're looking at the design. You're also looking at the requirements. It can be difficult. How many of you do integration testing? It's a fun job. Anyway, some of the integration tests that can be done — the first one you always want to execute is the input, output, and interfaces test. If you can't get past this, there's not much of a chance the black box tests will ever get off the ground, so this is always the first thing that gets done. Then there's exception handling around that, which I'm going to go over. There is some exception handling at the units, then there's exception handling between the units, and that's what we're covering here: exception handling between the units. Then there's timing, and in summary, timing tests verify the timeouts are not too long, not too short. Usually timing is part of an architectural diagram, which is why we include it in integration testing. It's exceedingly difficult to test timing on a functional basis. You really can't do that. Sequences: this verifies the order of execution is correct. I suppose you can verify sequences at a module level, but if you're doing that you're probably verifying the requirements at a module level. So these sequences here would be between the modules — that the modules themselves are executing in the right order. That's what that does. State transitions verifies that transitions from one state to another are being done correctly. Out of all the things on the list, where do I see the most bugs? I make a living out of counting bugs and sorting them by root cause. Where I see a lot of bugs — of course you see them here, but luckily these tend to get found and fixed pretty quickly. The ones that linger, the ones that make it to the next phase, are these for sure. This one is probably at the top of the list, this one, and then that one. Timing bugs can be hard to find. So, you're laughing, but yeah, they can be hard to find.
And so they tend to slip through, but luckily you tend not to have a ton of them. But state bugs, I find, slip through quite often. Ok, so let's get back to our diagram here. We're going to go over planning the integration. I'm going to talk about the scope of the integration test — and scope always also means what are we not going to do — who's going to execute it, when to execute, the right tools, and the documentation. Ok, first of all, when we look at the scope we need to look at what is being covered in the integration testing. Interfaces could include files, devices, sensors, consoles, printers, communication lines, the operating system, the database. Could anybody think of anything else? Any type of hardware would be up here, so we need to focus on these interfaces. And we need to define them ahead of time and we need to know what we're interfacing to. One of you said in the morning it's super important to write down the interfaces — I think it might have been Brad. This is the part where you're going to get a chance to do that, to define the interfaces. Ok, state transitions, timing, and sequencing are per the design spec. Sometimes you may have a stateless system, so one of these tests might go away. I've yet to see sequencing and timing go away, though. I mean you almost always have some timing. It's kind of hard to get rid of timing, but it is possible you may have a stateless system. So this is the part where you need to say, ok, do we have any states in our system or not. Ok, you definitely want to look at new and modified code and any existing code that might be super high risk. Keep in mind in the unit testing we were testing one module at a time, so we made the assumption that we don't need to retest any of the reused, unmodified code. Now we're integrating code; we may have to retest some of that old stuff just by the nature of what we are doing. Ok, so basically the first step is to define what we're going to test. The second thing is who's going to do it. The initial integration tests are normally run by a software engineer, because to some extent they're actually doing some coding while they're integrating. Once the code's integrated, the tests can be executed by someone else. For example, state testing can be executed by somebody else. Ok, alright, when the tests will be executed. Integration tests should not be run in a big blob, because it can actually take longer that way. The developer is most familiar with the code right after he or she writes it. So the same thing we saw with the unit testing applies to integration testing. So here would be an example of code, unit test, and then integration test in a nice big loop. We write some code, we unit test it, and then when we have enough code available we integration test it, and we keep going. This is an example of the big blob. Now some people call this waterfall; I'd just rather call it what it really is, which is a big blob. And, you know, there are times when a waterfall model works, when you have something really, really little. So if you have something that only has one increment, then a waterfall model can work. But when a waterfall model is not applicable and we have something that's much bigger, I call it what it really is, which is the big blob. Ok, so we want to have lots of nice little increments. It's not easier to integrate the code by waiting until you have the whole chunk of it done, ok. Alright, choosing the right tools — in my benchmarking study these are the things that I found were related to fewer defects.
It doesn't mean this is the only thing; there could be other things. Test beds — a test bed can be in any media. It could have the expected result or the set of results. Simulation, of course. Programmer's workbench — this is the target hardware or a fragment of it. Now, when you're doing integration testing, what do you need to have? You need to have some hardware. Is that hardware probably finished yet? No. Is it available to you? No, probably not. So you're going to need to have what is called the programmer's workbench, which is the target hardware, and it could be like some fragment of it. Whenever you hear programmer's workbench it means some fragment of your target hardware. Ok, and then some tools for doing the state, logic, and timing testing — I also have a listing of those. There are quite a few tools that do the state testing quite well. And in fact Microsoft has one developed that they've used internally for a very long time that they may eventually release. So there are tools available for doing this, which is good. Ok, so the tools: for the input, output, and interfaces there are many available. At the very end of the class I'm going to go out to this website and talk to you about them; I just don't want to interrupt the flow of the class by doing it now. Like I said before, don't test what the code does, test what it should do. One of the big common mistakes I see, particularly with state testing, is people will test the states but they don't develop the state diagram first. And it's like, well, how do you know what the right answer is if you don't have the state diagram? But anyway, for timing, sequence, and state there are plenty, plenty of tools. Exception handling — lots of times the other tools that we have could handle this, but lots of times this is done manually, because lots of times you need to go get the exception, find out what it is, trigger it, which could be like pulling a power supply or something like that, and then watch what happens. So lots of times these tests are manual. Ok, setting up the documentation: you can create a template for these things before you start testing. I have found spreadsheets to be easier than Word documents. During integration testing these are the things you want to keep track of. We have a few more things to keep track of than we did during unit testing. You want some unique ID. You want to keep track of the requirements being tested, if you're testing a requirement. You may also want to keep track of a use case, because now, since we're integrating, we might be more interested in some of the use cases. There may not be any use cases developed, or it could be that we're not testing one as part of this test, but that could be something to test against. We're going to need our setup and configuration. In unit testing, setup and configuration was not really that important — our setup is we get the debugger, we turn it on, that was our setup. Here, in fact, integration testing is when we do the most setup. We have to get the hardware. We have to get the stuff. We have to set it up. We have to make sure the timing is perfect. So that would go there. We have prerequisite test cases — if there's something you need to run first you put that there — obviously the steps, the priority. I bolded the expected results because it's the single most important thing, and then the cleanup, and then also whether or not it's automated. The one note I want to make about the expected results is this is where people skimp.
Lots of times they'll say "make sure the requirement's verified" or something like that. This is not the place to skimp. If you want to skimp, skimp in some of the other columns. But don't skimp there — that's the most important one. Ok, some interesting facts: some of the organizations in my database that traced use cases to their tests had 12% less defect density. Just tracing the use cases to their tests, that's all they had to do. So just some food for thought — you might want to start integrating the use cases in. If you don't do it during integration testing, then at least do it during black box testing. Ok, so let's look at the strategy for integration testing. This is the same strategy we looked at before.
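As a minimal sketch of the test-case record just described, here is one way to lay it out as a spreadsheet-style row, assuming Python's csv module; the field values are illustrative and not from the course material.

import csv

FIELDS = ["id", "requirement", "use_case", "setup_configuration",
          "prerequisite_tests", "steps", "priority",
          "expected_results",   # the one column never to skimp on
          "cleanup", "automated"]

example_row = {
    "id": "INT-042",
    "requirement": "SRS-7.3",
    "use_case": "Operator downloads experiment log",
    "setup_configuration": "Target hardware fragment powered, database seeded with 3 records",
    "prerequisite_tests": "INT-001 (interface smoke test)",
    "steps": "1) request log 2) pull network cable 3) restore 4) re-request",
    "priority": "High",
    "expected_results": "First request returns 3 records; during the outage a timeout "
                        "alarm within 5 s; after restore the re-request succeeds",
    "cleanup": "Clear seeded records",
    "automated": "No",
}

with open("integration_tests.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerow(example_row)

The expected-results column carries a specific, observable outcome rather than "verify the requirement," which is the skimping the lecture warns about.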

[ Sound of looking through pages of a book ]

The parts of the code that are the biggest risk — during integration testing this requires a little more work. We do need to look at anything being integrated. We may also need to look at something that hasn't been changed, whereas with the module level testing we didn't have to worry about that. With integration testing this is a new item for us: we may have to look at the unchanged code. The code that's integrated first we want to have the most test time. So basically with integration we want to get the stuff integrated first that we think is going to be a blocker for something else. That needs to get in first. Ok, what parts of the code are executed the most often? Again we want to look at code that is doing the most work. So for example, if you have a database system, the code that's interfacing with the database is probably doing a lot of work, so we might want to test that interface first. The software example that I showed you guys earlier does the predictions for software defect density. The first thing that gets integrated, every time, is the interface to the database, because if that doesn't work, can any of the calculations be correct? Not a chance. So even though the database is completely invisible to the user — the user doesn't care what's behind the software, they don't care that it's a database — it actually has to be tested first, because otherwise the rest won't work. Ok, what features are used the most? I think I have an example in my handout. Let me see if I can get to this. I can't see my notes, so I'm going to read through them. Ok, here's my example, which you guys can read in your handouts. This actually came from a real example — I changed a few of the details — and now I can see with my glasses on. Ok, here is an example of what I mean by — I'm on the wrong page.

[ Silence ]

Strategizing around what people are going to use the most, ok. So in our previous slide we talked about the fact that database interfaces really need to be exercised first because everything depends on that. But once you get past the database interfaces, or whatever hardware interfaces, what needs to happen next? Well, we probably want to focus our attention on what stuff people are going to use the most with the software. We want to get the serious bugs out of the way early. So here's an example I've gotten from a real life system. Let's say your company is selling multifunction printers. Have you guys ever seen these? I know you've seen them, they're all over the place. So let's say a hundred thousand of these things have been sold and you're doing some major upgrade. So you have some kind of past history for the installed base you're developing this major upgrade for. Ok, so in the past, 80% of these printers went to small businesses, 15% to home businesses, and 5% to large corporations. So you want to develop an operational profile around this. Well, the 80%, 15%, and 5% — that's called a customer profile. In you guys' case you may have only one customer for your software, so in that case it would be whatever it is, 100%. Ok, then let's say we get down to looking at each customer site, and you can think of a customer as a site. So if it sounds better to you to replace the word customer with site, just say site profile. Like for example, I work with people who are developing missile systems and there's multiple sites. So instead of saying customer we just say site one, site two, site three. Ok, so now we have our customer profile, the eighty, fifteen, five; now we want to look at what those customers are doing with it. This is called a functional profile. So let's say that the small businesses consist of 20% retail and 80% service. And the home offices consist of 40% real estate agents, 50% technical, and 10% other — the other we don't know what they are, they never turned in their little cards. Ok, the large corporations consist of 10% contracts people (they would be like lawyers and things like that), 10% office assistants, and 80% office workers. It's known that the retail end-users on average primarily print. Service businesses on average print 80% of the time, fax 10% of the time, and scan 10% of the time. Real estate agents use the scanner 40% of the time, they fax 30% of the time, and they print 30% of the time. And the home office professional technical people use the printer 90% of the time and the other functions 10% of the time. And now my page runs out. But basically what we would do — and I think I actually have this example in here, let me see if I can find it. Let's see, here we go — that's not it. No, I hope I have it in there. We would look at — I worked it all out and now the question is finding it. Printer example, ok, so let's see. I forgot to put in the headings. Ok, the three customers were small business, and we had home offices, and the last one was large corporations. So basically I plug in all of this stuff and I plug in all of those numbers. So this is, let's see, home office printing is all here, scanning was all there, and then faxing, and then my other. And you can see it all adds up to one. Ok, when I plug in all of those things, what this column is, is total duty cycle. So we can look here and we can say — oh shoot, did I put in all my end-users? I didn't put the end-users in here, but that's okay for right now.
These are all linked to my end-users that I had there, the technical people and so forth; each row was a line that I described. Ok, so printing — when you add up all the printing, it came out to 80%. When you add up all the scanning it came to about 10%, and faxing was the remainder. Ok, if we look here, the big kahuna was the 51%, which would've been the eighty, eighty, eighty. It was the small businesses who are doing service and who print — that was the big one. 50% of the customers are service businesses that are small businesses, and they print. So half of our test plan should be somehow focused on what those people do. Ideally what would really be great is to find out what these people do during the day and do that. But you can see how this constitutes 51% of what people are doing with the printer. By the way, this is what companies like this do; this is how they test. Ok, so can you guys see that? You know, just taking a guess — let's say you're given a printer and it can print and it can fax and it can scan — would you guess right off the bat that 50% of it is that one printing scenario, that that's our focus, and that overall it's 80% printing, 10% scanning, and 8% faxing? What would most testing people do, not given this? They'd probably test a third of each, right. Ok, it could be some other printer somewhere else has a different profile, but this printer, we need to make sure it prints, period, because if it doesn't print we're going to have like eighty thousand unhappy customers. So anyhow, this is an example of kind of what I'm trying to show here.
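Here is a minimal sketch of the duty-cycle arithmetic behind the printer example, in Python. Only the weights the lecture spells out are used as given; the remaining splits (retail usage, the large-corporation users, the "other" category) are assumptions marked in the comments, so the totals are illustrative rather than the instructor's exact spreadsheet.

# customer profile: share of the installed base
customer_profile = {"small_business": 0.80, "home_office": 0.15, "large_corp": 0.05}

# functional profile: who the end-users are at each kind of customer
functional_profile = {
    "small_business": {"retail": 0.20, "service": 0.80},
    "home_office":    {"real_estate": 0.40, "technical": 0.50, "other": 0.10},
    "large_corp":     {"contracts": 0.10, "assistants": 0.10, "office_workers": 0.80},
}

# feature usage per end-user type
feature_usage = {
    "service":        {"print": 0.80, "fax": 0.10, "scan": 0.10},
    "real_estate":    {"print": 0.30, "fax": 0.30, "scan": 0.40},
    "technical":      {"print": 0.90, "fax": 0.05, "scan": 0.05},   # split of the remaining 10% assumed
    "retail":         {"print": 1.00, "fax": 0.00, "scan": 0.00},   # "primarily print" -> assumed 100%
    "other":          {"print": 0.34, "fax": 0.33, "scan": 0.33},   # not given, assumed even
    "contracts":      {"print": 0.34, "fax": 0.33, "scan": 0.33},   # not given, assumed even
    "assistants":     {"print": 0.34, "fax": 0.33, "scan": 0.33},   # not given, assumed even
    "office_workers": {"print": 0.34, "fax": 0.33, "scan": 0.33},   # not given, assumed even
}

# total duty cycle = customer weight x end-user weight x feature weight
duty_cycle = {}
for customer, c_weight in customer_profile.items():
    for user, u_weight in functional_profile[customer].items():
        for feature, f_weight in feature_usage[user].items():
            duty_cycle[(customer, user, feature)] = c_weight * u_weight * f_weight

# the "big kahuna": small business x service x print = 0.80 * 0.80 * 0.80 ~ 0.51,
# so roughly half the test effort belongs on that one usage
for key, share in sorted(duty_cycle.items(), key=lambda kv: kv[1], reverse=True)[:3]:
    print(key, round(share, 3))

Running it puts ("small_business", "service", "print") at about 0.512 — the 51% figure from the example — which is the point: the profile, not an even split across features, tells you where the test time goes.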

[ Silence ]

Ok, so that's what the strategy is, and you can see when we were doing the unit testing we didn't have to worry about the strategy, really, because we're only testing one unit at a time. It's when we get into integration and black box testing that the strategy becomes important. Ok, we also need to worry about, like I said, the modes of operation — that's why I showed the printer, it had three modes. How do they vary by end-user and so forth? So, can you guys take this example and apply it to your world? You may only have one customer, one end-user, but multiple modes. Do you think this would be applicable to what you do at NASA? Do you guys have to write software for multiple missions, or is it one mission? Class: [inaudible]

Ok, because one thing I was thinking of: if you have multiple missions and you know the same software is going to work for them, that could be your profile. And I'm sure you guys have modes of operation as well. So I think probably the modes, and maybe the different missions, could be your profile here — maybe not so much different sites. Ok, now let's look at defining some tests: the IO and the interfaces, the exception handling, timing, sequence, and state. Like I said before, we want to test what the code is supposed to do. Ok, input output testing is a type of interface test that relates to these four things: files, database, device, operating system. We are going to investigate the types of faults related to these and test for them. Then we're also going to test for some data related interface issues. Ok, step one: identify all file, database, device, and OS interfaces. Develop a test plan that covers both normal and abnormal interface behavior. Explicitly test the abnormal behavior. Some database related faults can be verified from the user interface. So what I'm trying to tell you here is how you're going to do these tests. The things that are IO related you're actually going to hook up and test against — that's what that's saying. When I say explicitly test, I mean you're going to hook the device up and test with it. Database related faults might have to be verified from the user interface. You may not be able to see what's going on in the database; you might have to have a UI to be able to test the database. Usually when people do database testing they build some kind of UI to tell them what's going on. Other tests can be verified by checking the state of the database at the end of testing. So for example, if we're trying to test for corrupted database files, we may just have to test for weeks and weeks and at the very end make sure there's no corrupt data in the database. So three things: explicitly test, possibly build a user interface to test some things, and possibly verify the actual state of the database at the end of testing. Can anybody think of any other means? These are the means by which we're going to test. Can anybody think of anything else you can do? Ok, alright, well actually the other thing I don't have up here is we can have a simulator. I just realized that the simulator should be on the list, so we can add simulator to that. Ok, here are some common input output faults related to files. So your software is reading from a file — for right now let's assume it's not a database file, because we're going to get to those later. So let's say it's reading a text file or something. Ok, reading from a file that was not successfully opened — how do we test it? Well, we test with an empty file. Writing data or creating a new file when there's no available disk space — filling up the hard drive. Or opening a file in read-only mode when it needs to be written to — you can review the code and you can examine the files afterwards to make sure the write was successful. Opening a file in read-write mode when the file should never be written to — again you could also review the code and make sure it's opened read-only. The software that I developed was originally developed for like Windows 95, then it got upgraded to Windows 98 and Windows 2000, then Windows XP, then Vista, then Win 7. Well, basically all of my interfaces for the files worked smoothly right up until Vista; then I ran into a bit of a problem.
And believe it or not, I had to redo all of this, because Vista had that lock on those files and only the end-user could read the files or write to them. I had to go back, and what I really did is a manual review: I had to go back and make sure the code would now work with the different constraints under Vista. So these can happen whenever you're in a similar type of situation, and you may find yourself reviewing code just to make sure the test runs. Ok, opening a file that's already open — that can happen. Closing a file that's already closed. Too many read/write commands at a time. Now, one of you had talked about some software products that look through the code and check for certain types of bugs. There are some code analysis tools; I think there are code analysis tools that could find some of these. So the very first step would be to run those first. Can anybody think of anything else related to files that you want to test? This is normally the list, ok. Now devices — well, there's basically two things to test for each device: the device not available, and the device available but not operational. Actually there's three things, but that line kind of got blurred there: and the device is available but partially operational. So let's take a printer, for example. A printer can be available, right. It can be totally unavailable, which means it's unplugged. And it could be that it's available but it's not working, or it could be it's available but it's partially working. What would be partially working? Well, if it's out of ink. So basically I would copy and paste this for every device you have and test each one of these things. Lots of times people just test the first one but they forget it could be partially unavailable, and you can have a whole different set of failure modes if it's partially unavailable, ok. Alright, here's some common faults related to database transactions. For each database transaction it could add wrong data, delete it, or fail to update. I would always, as a very minimum, test adding, deleting, and updating. Ok, what you could do to test this is look at the content in the database before the test and look at the content afterwards to make sure it did the edits. That's the only real good way — or you can develop a user interface to read what's in the database. Another set of failure modes: not committed to the database when it should be. Lots of times what happens is there's a bunch of changes and they get stored up in memory and then, boom, we go out and commit all of them to the database. It could be that that doesn't happen when it should. I'll give you an example from my software — this was actually a bug about ten years ago. I had gone from, I forget, I think it was FoxPro, over to Microsoft Access, and the way it commits transactions was totally different. I didn't realize this, but it was never committing a transaction, and it was only after the software completely shut down that it would send it out to the database. So it was working by accident, which was really bad. But anyway, you want to look for that. It could be it's committed to the database when it shouldn't be. So for example, if a user's editing some field, do you want to commit it to the database while they're still editing the field? No, you want to wait until they're done and then commit it to the database. In fact, I'd wait to commit to the database until they're done with the whole page of stuff. So these two things are next on the list to check.
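A minimal sketch of the "inspect the database before and after the transaction" idea, using Python's built-in sqlite3 so it's self-contained; add_client and the clients table are illustrative stand-ins for whatever transaction is actually under test.

import sqlite3

def add_client(conn, name, income):
    # the transaction under test: it must commit exactly once, when done
    with conn:  # commits on success, rolls back on exception
        conn.execute("INSERT INTO clients (name, income) VALUES (?, ?)", (name, income))

def test_add_commits_exactly_one_row():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clients (name TEXT NOT NULL, income REAL)")

    before = conn.execute("SELECT COUNT(*) FROM clients").fetchone()[0]   # inspect before
    add_client(conn, "Smith", 52000.0)
    after = conn.execute("SELECT COUNT(*) FROM clients").fetchone()[0]    # inspect after

    assert after == before + 1
    # and the row that landed is the row we asked for, not garbage
    assert conn.execute("SELECT name, income FROM clients").fetchone() == ("Smith", 52000.0)

def test_blank_name_is_rejected():
    # the IRS-style fault: a mandatory field should never accept a blank
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE clients (name TEXT NOT NULL CHECK(name <> ''), income REAL)")
    try:
        add_client(conn, "", 52000.0)
        assert False, "blank name was committed"
    except sqlite3.IntegrityError:
        pass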
You always want to look for corrupted or lost data. The only way to really do this is to look at the database and see if there's any corrupt data. Then we have table or field names that don't match. This is a big problem with some databases. Are you guys using commercial databases for this — you said yes to database? I'm assuming you're using some commercial database. The most common problem on the whole list is actually this one, where the code says the table is XYZ and the table in the database is XY, and it doesn't open it or read it. So the only way to really catch those bugs — and you can do this at an integration level, because we're looking at the code — is to look at the code and make sure they match up. The database will usually generate an error when this happens, but you don't want to wait for that to happen. Just look through the code and make sure that they match up; that's the easiest way to get it done. Can anybody think of any other database transaction faults? Let me see if I have any more — here's more. Can anybody think of anything else before I show the other list? Ok, you can have too many database instances allowed. So for example, you have the database and you're allowing like five copies of it when you really shouldn't. You can have multiple users allowed to edit the same data at the same time without any synchronization — this is only true when you have a multiuser system. You can have a transaction execute despite missing mandatory data. Well, this is what happened with that IRS problem years ago: the database allowed blanks for name. I mean, that's just ridiculous — if there's one thing that should be mandatory, it should be a taxpayer's name. So anyway, that would be an example of where the database isn't stringent enough. It executes without proper read/write permission — this may not be applicable for every case. Or it fails to execute despite proper read/write permission. This was a problem actually with Vista: the whole database thing had to be redone, because now the user running the software no longer had the rights to access the database. So that would be an example of that. Can anybody think of anything else that they've come across with databases that you need to test? How about the wrong data in the database? I didn't actually put that up here. It could be you tell it to save the number five and it saves 5.000000. So I actually didn't put that up there: saves the wrong thing. Ok, these are some common faults related to an operating system. Each of these can be verified by code inspection or by running the software with the most current operating system with the same permissions the end-user would have. If you're running the test in debug mode you can see these problems happening; if you're not, then you just have to keep running the software. It's possible you can have the wrong operating system command used in the code. This happens lots of times when people upgrade from one operating system to another. Say for example, when it went from XP to Vista I had to change operating system commands. When it went from Vista to Win 7 I had to change the operating system commands. So that can happen — they can be obsolete. You can have certain operating system commands that aren't supported at all between two versions. You could have some that require permissions which the user does not have. You can have some commands that don't work as required or as documented.
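A minimal sketch of checking operating system interfaces under the end user's permissions, in Python; the command name and directory are assumptions for illustration, not from the course material.

import os
import shutil

REQUIRED_COMMANDS = ["tar"]          # external OS commands the software shells out to (assumed)
REQUIRED_WRITE_DIRS = ["/var/tmp"]   # locations the software needs to write to (assumed)

def test_required_commands_exist_on_this_os():
    # catches obsolete or unsupported commands after an OS upgrade
    for cmd in REQUIRED_COMMANDS:
        assert shutil.which(cmd) is not None, f"{cmd} not found on this OS"

def test_required_directories_writable_as_end_user():
    # run this under the end user's account, not the developer's,
    # to catch the Vista-style permission surprises
    for d in REQUIRED_WRITE_DIRS:
        assert os.access(d, os.W_OK), f"no write permission for {d}"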
What operating systems do you guys normally develop target hardware for? Do you have your own internal operating system, or is it commercial? Do you guys make your own operating systems?

Class: No.

No? Ok, well then you're going to have probably every problem on this list at some point or another. So these are just some things to take a look for. You also want to look at whatever interfaces to the operating system you have. It could be you don't have any operating system commands at all in your software, that it just resides on the operating system — that would be fabulous. But probably you've got some command in there that requires some permissions. So this is the short list of things to look for. Ok, do's and don'ts for input output testing. Do report any insufficient information if you find it. Do inspect the contents of the files or your database before and after — that's the single most important thing you can do. Do inspect the code as need be to verify some of the commands and table names. So ok, here's some tests for data, for testing data on an interface. These are the most common tests. I had actually talked about this a few minutes ago when I said testing to make sure the data is right — well, I'm talking about that now. The one thing I left off with the database testing was: is the stuff in the database right? What I'm going to talk about now is data related testing. It could be with respect to the database, it could be with respect to data in general, either one. Ok, in this section I'm going to show you how to test data between two interfaces — any two of these interfaces, so between the console and the sensor, say, or between the sensor and the database. Ok, you want to use the interface design spec as your basis for testing here. Are interface design specs required at NASA? I'm pretty sure they are. Ok, well I heard somebody say something else other than yes.

Class: There's a little program [inaudible].

Ok.

Class: [inaudible]

Ok, so there's not consistency, but you definitely want to find the interface design spec if it exists. You want to make sure all applicable interfaces are included in it — that's the first thing on the list. Whenever I go to interfaces, I'll inspect the spec before I even start testing; I'm going to make sure it's right and that it's got everything in it. And then test each interface's data variables. So that's what we're going to talk about here. These are all of the common faults related to interface variables. Ok, so you have something that's passing data to something else. Let's just say it's a sensor and the software — the sensor's passing some data to the software. So what could go wrong? The number one thing is the data could be stale. This one is the one people forget. Stale means it's late, it's old. So I'll give you an example from the chamber experiment, the ISO chamber experiment. There is a chamber that does experiments, and it heats up to really high temperatures and so forth. At some point in time the experiment's over and the temperature starts to come down and the pressure starts to come down. And there are three temperature sensors in there that monitor the temperature, and the software takes an average across all three of them. So one of the things that's passed is this temperature sensor passing its reading to the software. When those readings all come up with an average that's safe — let's say whatever, eighty degrees or whatever it is — the door pops open and the experiment's over. So stale data would mean the temperature reading was from five minutes ago. Or stale data might even be from five seconds ago, because it's really important to get it down to the right second here. Ok, so that would be an example of stale data. So the only way to test it is to run multiple tests with different data and make sure that it's updated after each one. You actually want to sit there looking at the data values and making sure. You might have a thermometer there; you might have a sensor there making sure that the data is correct. The data can be in the wrong format. Ok, so this one would be very obvious, but let's say the temperature sensor is in Celsius and your software is in Fahrenheit. Has that bug ever happened at NASA before?

Class: [inaudible]

Good, I'm being facetious; the metric and English units one of course has. So you always look for the wrong format, and it's probably going to be real obvious with Celsius and Fahrenheit. But will it be real obvious with some other unit of measure? Maybe not. I worked on a system, a navigation system, where some of the units that were used for navigation were in — I'm trying to remember what the units were. Excuse me.

Class: [inaudible]

Oh navigation software?

Class: [inaudible]

Yeah, well, I was working on an inertial navigation system. The system was actually a tank, so it gets its current coordinates from the GPS. And some of the units — and this didn't cause a bug, I'm just suggesting a potential for concern — some of the units for computing its navigation, which is "where am I right now," I'm trying to remember what they were. Some were centimeters and some were millimeters. And the reason why they did that is because some units make sense in one place where others don't; some things are within centimeters. So there's nothing wrong with that, but it's a potential problem — we've got to make sure the right units are being passed. Because in that case, if you have a whole bunch of measurements that are split between centimeters and millimeters, it may not be so obvious. The tank might be five feet off, and you might not realize it's five feet away from where it needs to be until all of a sudden they drive around and it becomes twenty-five feet and then fifty feet. So anyway, the wrong format is not always super obvious. So it's always something to inspect the code for, and make sure that it's always passing the right unit. It could be it's in the wrong unit of measure, which is actually the example I was giving — the centimeter to millimeter would be wrong unit, whereas wrong format would be miles instead of kilometers. Ok, data is valid but incorrect. This is a really difficult one to find. Let's take the temperature sensor on the chamber experiment. What if the sensor is reading the temperature three degrees off for whatever reason? Would that be one that's real easy to find? No, that would be super hard to find. It's valid but it's incorrect. What if it's twenty degrees off? Well, we probably are going to see that one when the door pops open way too early or way too late. So the worst ones to find are when they're off by a tiny bit; those are the ones that are most difficult. And again, inspecting the code and verifying the results is the easiest way to do it. How about if it's invalid and incorrect? So let's say the chamber experiment temperature — I think there was actually an invalid number; I can't remember what it was, let's just say two hundred degrees for lack of memory. So we know that the chamber can't get above that. Actually it's not two hundred — whatever it is, let's just say some super high number. We know the experiment chamber can't get above that. So if the temperature reading was ever above that, would the software be able to figure that out? Sure. So this one is super obvious. So there's a difference here: this one is valid but incorrect, and this one is invalid and incorrect. The sensor's saying it's five thousand degrees but we know it's not; and here the sensor's saying it's seventy-eight degrees but it's really seventy-five. Do you see the difference between the two — this one's easy to find, that one's hard. You could also have data that's missing. It could be we don't get any reading at all. That's probably the one I would worry about the most. It could be we think the sensor's turning over data and it's not turning over anything whatsoever. That would be bad. And then you can also get corrupted data, which would be fairly easy to detect if the code's written for it — that would be the temperature sensor sending over some character stream. Can you guys think of anything else? Let's just use our temperature example here. We have the sensor, it's sending temperature over to the software.
Can you think of anything else that could go wrong?

Class: It's probably just the way you design your software, but if you sit there reading a buffer, you actually have your code tied up going to read the data instead of using a separate thread. The code would just stop right there and [inaudible].

Ok, alright — it reads the wrong thing.

Class: Or just waiting for data to come back.

Ok.

Class: What is the time out for that, when you continue [inaudible]?

It could be that time out's too long.

Class: Right. A lot of people who code sometimes do database reads and they don't take into account that the read might not come back, and they don't make like a separate thread for the database read inside the one thread.

Oh, ok, those are really good. I think I have to integrate those into the slides. Well, I've gone over a little bit on lunch. I promised you guys I would split at eleven twenty and I went over a tiny, tiny bit. So when we come back we'll pick up where we left off.
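As a recap of the data-interface faults covered so far, here is a minimal sketch of a checklist-style check applied to the temperature-sensor example, in Python; the Reading type, the limits, and the staleness and rate thresholds are illustrative assumptions, not numbers from the class.

import time
from dataclasses import dataclass

MAX_AGE_S = 1.0                    # older than this is "stale" (assumed threshold)
MIN_PERIOD_S = 0.1                 # faster than this is "updating too often" (assumed)
VALID_RANGE_C = (-50.0, 200.0)     # outside this is "invalid" (assumed limits)

@dataclass
class Reading:
    value_c: float                 # Celsius per the interface spec; Fahrenheit here is the wrong-unit fault
    timestamp: float               # seconds, as from time.time()

def check_reading(reading, previous=None, now=None):
    """Return the list of interface faults detected for one sensor reading."""
    now = time.time() if now is None else now
    if reading is None:
        return ["missing data"]
    faults = []
    if not isinstance(reading.value_c, (int, float)):
        return ["corrupted / wrong format"]            # e.g. a character stream instead of a number
    if now - reading.timestamp > MAX_AGE_S:
        faults.append("stale data")
    if previous is not None and reading.timestamp - previous.timestamp < MIN_PERIOD_S:
        faults.append("updating too often")
    lo, hi = VALID_RANGE_C
    if not (lo <= reading.value_c <= hi):
        faults.append("invalid value")                  # the easy-to-flag case
    # "valid but incorrect" (reads 78 when it's really 75) can't be caught here;
    # that one needs an independent reference measurement during the test.
    return faults

# tiny usage example
prev = Reading(75.0, time.time() - 0.5)
print(check_reading(Reading(5000.0, time.time()), previous=prev))   # ['invalid value']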

Module 4 Okay. What I want to do now, before we get started, is show some of the tools and stuff. This way I'm not fumbling around changing back and forth between my presentations. This is the link on Wikipedia that talks about software testing tools. I wouldn't say it's an article; it lists all of them — it's kind of like a survey. You know, people update Wikipedia, so it's not an official reference. But all of the tools — and they have just gazillions of them. So here's the key here. Let's see: the framework. The left-hand column is the name of the thing, and lots of times the link will take you to their website, but not always. xUnit determines whether a framework should be considered of the xUnit kind. To be perfectly honest, I'm not sure what that means; I'd have to go out and check. Let's see — xUnit, allows testing of different elements. Okay, I think what it is is the xUnit column means that you can use it for some type of low-level test. Okay. Then TAP — whether the framework... I'm not really sure what that means either. This is one of those things where you guys need to kind of fool around with it. But anyway, it describes all the different things about the tools, and what's really neat is that there are just loads of them. So we just go through — whoops — they're listed by language. For the unit level test tools, those are almost always going to be by language, like JUnit, and there's Ada tools. So you want to look for the language. I'm not actually familiar with what these are; I've never used these. But if you go down here to Ada — one of the things I have heard — are any of you guys using Ada at all anymore? Okay, good. Because those tools — okay, those tools are a chunk of change. They're not cheap, the ones for Ada; I do know that. Okay, there's AppleScript, ASP. There's a bunch of tools for C. Now, a lot of the C plus plus tools work for C, so we can actually skip through here. Now, let me see, there was one — I have a selection that I have kind of written up some notes about. Let me see, here's the C plus plus tools. This Cantana [assumed spelling] one I've heard things about. Some of the other ones don't look terribly familiar. Cute [assumed spelling] I've heard about. If anybody sees any of these tools that they've used, by all means yell out. The UnitTest++ one is used. Let me see. There's the sharp languages; I'm pretty sure there's C sharp. Here's .NET and Java; here's JUnit. Anyway, you guys can take a look at this at your leisure. I have not come across one place where so many of them are listed, so this is a good place to start. Now, I also have a review, which is in your handouts. A few years ago I was trying to find the right set of tools for a particular project I was working on, so I just did kind of an analysis at every level. And I took a few of the more popular tools that were actually still in business — the one thing about the Wikipedia list is that some of those tools might not be around anymore, who knows? So these are a list of tools that I know were around. The way that they were classified, when I started looking into them, was that the test coverage was at one of four levels. The source level probe is what you would need for unit testing; any tool that's a source level probe means the probe is in the code itself, the thing that it's testing. So language specific — yes, it's going to be. There's also, for Java, a thing called byte code level.
Those are used for testing Java. Post build means you build the software and you execute it and the tool runs with the executable. Those can be used for integration testing and black box testing. They're not language specific; they are OS specific, however, meaning you've got to get one for Windows XP and one for Windows Vista. And then run-time means the tool just runs in parallel with the software — it would be like turning on a virus checker, something like that; it just runs in the background. Those are OS specific, also. Okay, so those are the four types of tools. Oops. So let me go down; I want to show you a few of the tools. I couldn't remember the names while I had you guys in class. Cop tool [assumed spelling] was one. It's used for source level unit testing. It works on C plus plus, so it's language specific. When I did a review of it, I found it did not do decision coverage, but it does do line coverage. So here I have decision coverage, line coverage — let me change this column here — ability to merge results, which is something that I documented. And the thing I noticed when I looked at it, it was well documented. So you guys can look at these at your leisure, but you can see some of the tools up there. I have my evaluation of them — this is just my personal opinion when I looked into it. Works well on large scale projects. I have some of the prerequisites for it, like Hansel [assumed spelling]. A lot of these tools are freeware, so some of them have prerequisites. Some of the other ones, getting down here — with TestWorks, this one right here was the one I was thinking of. I don't know why I called it Caps in class; I don't know what I was thinking of. This one was the one that I was thinking of that actually does — when I reviewed it at that time, it did almost everything I talk about in class. So like the path testing, the domain testing, all that stuff, it was able to do it. And you can see it merges results. So that one was a good one in terms of — you know, ideally what you'd like to have is one tool that does the most amount of stuff, and that one I thought was really good at that. There's Rational PurifyPlus; it's post build, so it doesn't work with the source code. This other tool, Bullseye, I was kind of impressed with, too. You can see the price here. Now, it's been a while since I updated the price column, so take the price with a grain of salt; it could be that it's changed by now. But anyway, these were basically my evaluations. Parasoft Jtest. McCabe & Associates — McCabe and Associates does all of the path testing and logic testing and all that. The cost was not on their website, but I do know from word of mouth that it's not cheap. You guys don't have to worry about Ada; that was the only one I found that actually worked with Ada. And the last one, the Testwell CTC++ one, seemed like a good one, but I couldn't get a firm price on it. It seemed like the pricing kind of depended on how many licenses you have, that kind of thing. So, anyway, these are some tools, and you can take my evaluations with a grain of salt, but those are some things for you all to look at. Okay, now what I'd like to do is go back to our class. Oops, that is not what I wanted to do. No matter what I do, I'm as blind as a bat. [ Pause ] Anybody remember where we left off? Somewhere around here. Okay. Right, this page was the page where you add other software to that list. So that's where to put that.
And if you want to put other software, too, that would be a good place to put it. I'm actually somewhat shocked that I didn't put that there. Okay, we also had — during the break I had one really good suggestion for here, which I'm actually kind of surprised I didn't put on my list. I'll talk about it later. But there's one other common fault with any data. Let's say you have two interfaces, two things talking to each other. There's actually one other thing that could go wrong. It's the opposite of this: the opposite of stale data is data that's arriving way too often. Okay, so what happens when you get data that updates too often? Iris, do you want to tell the rest of the class? It was that the data was fluctuating too much, and they were using the data [inaudible], in which case they burned too much fuel, and then they ran out of fuel. Okay, so they didn't need to have it update as often. How often were they updating? Way too often. The big risk — let's just take the chamber example with the temperature of the experiment chamber — the big risk of having a temperature check too often is you get overloaded with data. Things start to slow down; it takes up too many resources. And normally what happens when things get polled too quickly is that you may not have the right timeouts when you're polling too quickly, and you may use up computer resources. So another good item to add to this list would be data updating too fast. So that was a good suggestion, great. So basically, for each interface, whether it's between hardware, software, or operating system, database, whatever it is: at least one test case for normal behavior, meaning just make sure the interface works, and then the seven test cases — actually, eight test cases, because Iris thought of the eighth one. For each interface, this is my simple rule of thumb: every time I test an interface, I memorize these. I just simply memorize them. And I go through, and it could be that some of them don't apply to my situation at hand, but it doesn't hurt to try to go through all of them. Okay. The hard part is injecting the faulty data. So to test the behaviors on the previous page, you might have to be fairly creative. It might require some proof. So, for example, you may not be able to prove it through an executable test. Like corrupted data — sometimes the way to verify corrupted data is the absence of it for like an extended period of time. So you may have to get a little creative with how you verify these things. Okay. Now, with that in hand, we're going to flow right through the next topic, which is how to test exception handling. Okay, we just finished talking about how to do some abnormal testing. Now we're going to do the exception handling tests. As we saw earlier, exception handling can be tested at the unit level, the interface level, and the system level. It applies at all three levels, but it is done differently at the three levels. Okay, this type of testing is an extension of the data interface testing that I just showed you a few seconds ago. So we're just going to go right into exception handling here, but instead of focusing on data, we focus on events. Okay. And you can add other software to this list — anywhere you see this table, put other software. The tests can apply to any of these interfaces. Okay, how to test exceptions between two entities. So remember, at the unit level, when we did exception handling, we were only focused on one module and what can go in and out of that module.
Here we're focused on two things, two entities, whether it's hardware, software, or external software, and we want to know what can go wrong between those two things. So you would want to collect some documentation, because normally these things are documented somewhere, particularly with hardware. Do you guys normally get a list of all the things that can go wrong with the hardware before you start writing the software? Maybe, maybe not. You'd want to definitely look for anything that discusses the exceptions. Collect any documentation that provides manual recovery procedures. So if you have anything that would say, okay, when this happens, this is what you do — you'd want to get those manuals out. If your software has a user interface, probably there's something in that manual that would say — oh, what's the typical section they put those in? What is it? It's on the tip of my tongue. There's a word. When you get a user's manual for anything, what's that section that has all the weird stuff? [ Inaudible comment ] Yeah. And then there's also — it's on the tip of my tongue. [ Inaudible comment ] What was that? [ Inaudible comment ] Yeah, there's a section like troubleshooting, that's right. Do you ever get a manual for something that says troubleshooting? Well, somewhere in troubleshooting is a list of usually all the error codes. Okay, so you'd want to go look for that stuff, because that's a good place to find all your error codes. What I have found — the reason why I'm telling you to collect all this stuff — is that people almost never put this stuff in the software spec. Sometimes you have to go dig for this stuff to find out what the hardware could do wrong. You know what the other software could do wrong, because you developed it. Okay, so you might want to go get this documentation to find out what the troubleshooting modes are. Okay. Investigate the hardware or interface failure modes one at a time. Verify the correct alert or alarm or warning code, whatever it is — I have these three levels here. An alert normally is like the little yellow triangle box with the exclamation point, where it comes out and says, are you sure you want to do this, or, hey, by the way, something happened. Those just tell you something, and you should probably pay attention to what they say. An alarm is when it comes up, sometimes with a little red stop sign: you do not want to do this. And then a warning code could be, hey, by the way, this is a warning — are you sure you want to delete all the data in your database, blah, blah, blah. So normally the alert doesn't have much of a recovery except for you to acknowledge it. The alarm and the warnings normally have something where the user has to say yes, no, sometimes, whatever. Okay, so those are the differences there. So basically you'd want to get together all the exceptions and the alarms and verify that each one of them actually generates the appropriate alarm. From here on out, instead of saying this whole big phrase, I'm just going to call them alarms. Okay, verify the correct recovery for each — so, for example, if I say yes or no, do both of them do what the screen says they're going to do? Verify that any supporting documentation is correct. One thing that's super important with recovery is the user's manual, because by the time something's happened, somebody is probably reading the manual. How many of you have developed software that has some kind of user's manual? Okay, all right, so it's applicable.
And then, finally, you want to verify the recovery occurs within a required time frame. Now, there's probably not a spec that says, hey, recovery has to happen within X seconds or whatever. So this would be relative to what you're doing. Okay. Some parts of the code recovery might have to be instantaneous, and other parts, hey, if it takes a day, that's fine. Okay. So this would be relative to whatever is being recovered. Let me think if I can think of a real good example. Let me try to think of something that's really interesting. Okay. I have one. I worked on some -- I used to work for equipment manufacturers back in the late '80s and early '90s. We were just talking about the '80s during lunch. The semiconductor industry was on fire. So all the work was in Silicon Valley. Everybody is, you know, happy, having a good time. The chip producers are happy. So, anyway, I was working in that industry on equipment that was making chips. And one of the things that the equipment has to do, as all equipment has to do this, when it starts up, it has to initialize. Okay. So the equipment starts up. It has to initialize. And then once it's initialized, it gets into its operational mode and it does a bunch of stuff. Well, this equipment could be up for days, weeks. Probably no longer than that. It probably wouldn't be longer than that, because they would have to change some process in that amount of time, but the equipment does not turn off every day. It goes into manufacturing and it works 24/7. So one of the things that was interesting in this equipment was that this equipment had a motor. And the motor was essential. The equipment could work without the motor working, but it couldn't work well without the motor working. So what would happen was on startup, the startup software would check the motor and say, is the motor working? No. Shut down. It would just instantly shut down, because that's the recovery. You've got to go fix the motor. Come back when you get it fixed. Okay. That's totally right. Then if the motor is fine, it would keep going and would go to operational state. So can you guys tell me what the big failure mode was? The software -- this equipment will be turned on for three or four weeks at a time without being rebooted. Exactly. The motor could fail during the three or four weeks. And there was never any check. So, anyway, that kind of gets down to the recovery -- the timeframe. You know, that's not something that's obvious. It took me weeks, actually, to track this bug down. And I finally -- it was only after -- you know how I tracked the bug down? We knew something was going wrong with the equipment, but we had no idea what, because the motor is like real quiet and the motor is deep inside the equipment. And you can't see it turning. So all of a sudden I'm like, the equipment is just not working well, but I don't have any idea why. So, finally, one day I came in to the lab, and it hit me. It's too quiet in here. All of a sudden I realized there's no noise. I went over. Sure enough, the motor's not working. We spent months tracking this problem down. So, anyway, it all gets back to, if you think about when should the recovery occur, what we should have thought about there is, you know, we made the assumption the equipment would be turned on every day. And it's not. So if we had actually kind of brainstormed that very last thing, we would have figured out, oh, we've got to test it more often than on startup. Okay. Anyway, this is kind of typical.
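To make the lesson from that story concrete, here is a minimal sketch of the difference between a startup-only check and a periodic one. Everything in it is invented for illustration -- the motor_ok() stub, the check interval, and the shutdown behavior are placeholders, not the actual equipment code.

# Sketch only: the health check has to run periodically during operation,
# not just at startup. motor_ok() and the interval are made-up placeholders.
import time

def motor_ok():
    # In real code this would query the hardware; here it always succeeds.
    return True

def startup_check():
    if not motor_ok():
        raise SystemExit("Motor failed at startup; fix it and restart.")

def run(check_interval_s=60, cycles=3):
    startup_check()
    for _ in range(cycles):           # stands in for weeks of operation
        # ... do the normal operational work here ...
        if not motor_ok():            # the check the fielded system was missing
            raise SystemExit("Motor failed during operation; shutting down.")
        time.sleep(0)                 # would be time.sleep(check_interval_s)

if __name__ == "__main__":
    run()
    print("ran with periodic motor checks")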
Can you guys think of anything else you would want to verify if you're testing recovery or exceptions? We had the user's manual. We had the timeframe. We had correct recovery. This would apply to interfaces. Okay. Well, here are some common exception handling faults. Well, the first and most obvious is the software can fail to detect the failure. That's the first thing you need to check for. If there's a failure, the software needs to detect it. That's number one. Two, we need to check for the opposite of that, which is a false alarm. False alarms can be almost as deadly as the first one. Okay. If the software detects a failure that's not there, what can happen? Well, particularly on you guys' systems, what would happen if there was a false alarm on the space shuttle, or a false alarm on anything? False alarms are bad. Okay. You definitely don't want a false alarm. So that would be second. Third, it generates the wrong alarm. So X goes wrong. It tells you Y went wrong. I don't know which one would be worse, that one or that one, but neither one of them is good. So you want to check for that. And the fourth one, it generates an alarm when there is no failure -- oh, you know what? This is not a false alarm. Oh, I was totally screwed up. This is not a false alarm. What this means, this one, is the software knows that there's a failure. It has it in memory. Yeah. It's like, oh, I forgot to pass that to the user. Sorry. This is pretty common. Lots of times the fact that the failure occurred is resident somewhere in the software memory, but it just hasn't displayed it. So that one is that. This one is the false alarm. Okay. Software generates a warning, but no recovery. This happens quite a bit. It executes the wrong recovery, or it doesn't happen within the required time. Not everything has -- this one is actually very common. Not everything has a recovery. So some things have recovery. Other things don't. This is what I see pretty often. It could be that it does the recovery, but the manual procedures are wrong. So there's something the end user needs to know to recover, but the manual is not telling them what it is. So this would be a problem with the manual. And then, finally, it could be the manual's wrong. Like what if the screen comes up for the manual and it says, this is what you need to do to recover, but that thing is wrong? It could be the code's working fine, but the instructions you just gave the user are wrong. Can anyone think of anything else? Now, our focus is on interfaces here. We've got something that the software is interfacing to and it's not working. And we want to do something about it. Can anybody think of anything else? [Pause] Okay. All right. So do the [inaudible] for exception handling. Use the interface design spec as your basis. That's where you'll get all the interfaces. Lots of times this is one big diagram. Lots of times I build my tests right off of a big diagram. Get the list of warning codes for the system. This is the hardest thing to get. And then make sure all of the applicable interfaces are in the interface design spec. The last bullet is kind of tricky, because you never know what's not in the design spec. But you can do a couple of simple tests. If your design spec has a certain number of interfaces and you look at the software and you see that couldn't possibly be all of them, then you'll know that you don't have everything. So I guess another trick, too, is to make sure the interface design spec is current. Okay.
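One table-driven way to check for those fault categories is sketched below. The system_under_test() stub and the alarm and recovery names are invented for illustration; the point is that each injected failure should produce exactly the expected alarm and recovery, and a run with no injected failure should produce neither, which doubles as the false-alarm check.

# Sketch only: table-driven checks for the exception-handling fault categories
# above. All names here are hypothetical stand-ins for the real system.
EXPECTED = {
    "printer_offline":   ("ALARM_PRINTER_OFFLINE", "prompt_reconnect"),
    "db_file_missing":   ("ALARM_DB_MISSING",      "prompt_select_file"),
    None:                (None,                    None),   # no failure injected
}

def system_under_test(injected_fault):
    """Stand-in for the real software; returns (alarm, recovery)."""
    table = {
        "printer_offline": ("ALARM_PRINTER_OFFLINE", "prompt_reconnect"),
        "db_file_missing": ("ALARM_DB_MISSING",      "prompt_select_file"),
    }
    return table.get(injected_fault, (None, None))

def run_exception_tests():
    for fault, (want_alarm, want_recovery) in EXPECTED.items():
        alarm, recovery = system_under_test(fault)
        # Catches "fails to detect" and "wrong alarm" in one assertion.
        assert alarm == want_alarm, f"{fault}: wrong or missing alarm ({alarm})"
        # Catches "alarm but no recovery" and "wrong recovery."
        assert recovery == want_recovery, f"{fault}: wrong or missing recovery ({recovery})"

if __name__ == "__main__":
    run_exception_tests()
    print("all exception-handling checks passed")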
Using this example, find some tests for the interfaces in the integration test checklist. Okay. I am going to start showing you the example now. And let's see. I want you guys to go to the back of your handouts. And I want you -- let's see. I want you to go -- I kind of want you to take a few minutes and read all the examples. So let me show you. This is in -- it's in the folder. It's in this thing. What is that thing? Pocket. It's in the back pocket. Okay. And it's actually the very first thing in the file. I want you guys to just kind of skim through this. There's a system requirement spec, the SRS. And I kind of want you to kind of skim through it. You do not have to be a domain expert to get the general gist. I just want you to kind of skim through and kind of become familiar with the software. And then we're going to start applying what we learned in this class to this. Okay. You can read up to -- you do not need to read up to where -- let's see. Let me count the pages. One, two, three, four, five, six, seven, eight, nine, ten. Just the first ten pages. I just kind of want you to skim through them. They really aren't that bad. Okay. So I'll give you about five or ten minutes to kind of skim through it. It's listed in order. The system requirement specs are first. Then the software. Then the design. And I only -- the design document only went over actually a few of the features. Okay. I think most of you have skimmed through it. Even though in real life the software wouldn't be developed yet -- so if we're taking a crash course here, we've got one day to go over an example. And I need for you to understand it very quickly. I'm actually going to show you the software so you get what this thing does. Let me show you something. Hold on one second. Because there are oodles and oodles of things that can be tested. This is a file-based system. I think the requirement said it was going to be used with Access, right? So the user can save their predictions in a file. And, ideally, according to the SRS or the systems requirement spec, they can create as many files as they want. And each one of them has a prediction in it. So I'll open one up. When it loads, you don't have to worry about that message. We'll talk about the software later, but its job is to demonstrate this page. This page has a prediction in it. You can see all their reliability predictions for the software. And all the input pages are actually over here. This is where -- when it was describing all those inputs, this is where it went. If you got as far as the design document, it talked about where things went. So here's where they input all their stuff. Here's where they select their model. So the models. So they can toggle. As you change the model, the results toggle. Okay. The user interface is extremely simple. Okay. I mean, they enter stuff in. The stuff gets number crunched. They select the model that displays the results. So all the interesting stuff is happening behind the scenes. Okay. So behind the scenes this is interfacing with the database, because it's doing all this stuff. And it's bringing in, it's doing a bunch of calculations. And then it's displaying the calculations. So the UI is incredibly simple. And you guys don't even have to understand what these calculations are other than the fact that they're there. Okay. So this is the system. It can also -- we don't ever actually talk about any of these things up here, but you can print. So basically, right off the bat, what are the interfaces? I already named, actually, two of them.
Well, the software within itself is one interface. Clearly for something this big there's going to be multiple components, right? Yeah? The user interface. There's definitely a printer. There's a print button. And there's a database. What we don't know is whether the software interfaces directly with an OS command. I'm telling you right now it doesn't. It just sits on top of the operating system. [Inaudible comment] Oh, yes. Yes, you're right. Did I have that in there? Yeah. Oh, I knew there was something I wanted to take out. [Inaudible comments] All right. When it goes to write the HTML file, it's got to use an operating system command. So I take that back. There is one operating system command. And then everything else sits on top. So when it goes to show that HTML, it's got to use that operating system command. Okay. So thank goodness James caught that one. Okay. So that's the software itself. Now, what I want to do is I want to show you how you would fill out the checklist. So here's the integration checklist. It lists all the stuff we've gone over so far for integration. And we're not totally done. So we're going to revisit this, but you can see all the recordkeeping stuff at the top. I don't have the SRS statements in here, but I would -- by the time I'm done, I would list all the SRS statements that I've covered. Okay. You can see here I have the frequency. This software here is going to be executed all the time. Everything executes from the database. Did I use any tools to help me test? Well, for sure I'm going to use Access, because I need to use Access to make sure that what's in the database actually is the right stuff. I would not be using this tool if I was doing a system level test, because I would be only verifying what's on the screen. Okay. So did you identify the input and output for this component? Files. Only the Microsoft Access files. That was it. There weren't any other files. Database, yes, for sure. Device, printer. Operating system command. And, James, notice my one thing. Interfaces. [Pause] There's actually one system command. Okay. So I tried not to put too much here. Okay. Now, we go through our list of stuff. Reading or writing to a file that was successfully opened, or that was not successfully opened. There were no requirements for that. So I put that in here as we know we need to do that, but there still weren't any requirements for it. And this is actually a really important thing. We've got a database file. What if that database file doesn't open? So right off the bat we have an issue, that it needs -- the spec does need to tell us. Like, for example, what if somebody went out and used Access on their database file and changed something and then they tried to open it up? We need to have some basic requirement that says what the software is going to do. Is it going to tell them this is the wrong file and shut down? What is it going to do? So right there we have a missing requirement. Writing data when there's no space. Okay. We've got a test for that. Opening a file in read-only mode when it needs to be written to. This was not applicable, because the software -- the operating system handles all of that. So it was not applicable. Changing a file, opening a file in read/write when it should not be written to. The software doesn't open a file -- it doesn't do anything other than open a file in read/write. So what it didn't do, though, is say what if the file -- this is an Access data file. What if it's some other Access data file? So that was missing, too.
What if somebody decides to open some database that they have sitting on their desktop? What is the software going to do? Well, if you don't have a requirement, it's probably going to crash. Okay. So that's another missing area. Opening a file that's already opened. This was actually taken care of in the requirements. Closing it was actually taken care of. There were requirements. Too many read/write commands at a time, we're going to test that. For each device -- well, the only device was the printer. I don't know why I put not applicable there. The printer actually could be any of those. I'm not sure why I put not applicable. Maybe I didn't -- wasn't thinking about that one. Anyway, adding data. The way this software works is it never adds data. It just creates data. It has its own file. And the structure of that file is a constant. So anything it ever does to change it, it's not applicable. You also can't delete it. Not applicable, but it will update data. So that needs to be tested. Committing to the database when it should be. The software design document never said when the data would be committed to the database. So that's a flaw. It should actually say after every field, after every page. When does it commit it to the database? Okay. Corrupted data. Well, we'll handle that in the system test. Okay. These are other things that I tabbed through that were not applicable. All right. Let's see. Which interfaces are applicable? Well, we have consoles, yes; printers, yes; database, yes. I would need to go through and test all of these things for each one and so forth. So let's go ahead and look at the procedure. I might be jumping ahead of myself here. Okay. So here would be some examples. What I did is I went through my checklist. And every place where I had a yes, I made sure I had a test procedure. So that's why I put the checklist out here. And then I had the procedure there. I made sure that everywhere where I had something that was applicable, I've got at least a test for it. So remember all these things, these are the ones I said yes for. Okay. So over here, for example, I don't know what the expected result is, because the spec didn't say. So what do you guys think about that? Do you guys do something similar? What do you do when you're testing and you come across something and you say, I don't know what it's supposed to do? I can guess. I think I know what I'd like for it to do. What do you guys do when you're writing a test plan and you don't know what the expected result is? [Inaudible comment] What if it's not there? [Inaudible comment] Okay. [Inaudible comment] Okay. So you would just write a discrepancy report and not necessarily write a test case for it? [Inaudible comment] Okay. Oh, I see. You would document what it really does and then write a discrepancy that what it really does is not in the spec. Okay. That's a good idea. So you would just go ahead and test it and see what it does and then write a discrepancy report. That's a good idea. Okay. One of the things I have found in looking at people's processes, they never actually decide -- they assume that every test case has a documented expected result. They don't have, as part of their process, what do you do if you don't know what the expected result is? So here the process was to put it in red. Okay. I'm going to get back to the rest of this procedure later. Let me see. Okay. Now, these are all my interface tests here. Later on I'm going to show you timing and sequencing. So I'll come back to this later. Okay. Oops. Okay.
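As a side note, once that missing file-open requirement gets written, the test for it can stay pretty small. Here is a minimal sketch, assuming a hypothetical open_prediction_file() function and a made-up file signature; the real error behavior would come from whatever the spec ends up saying.

# Sketch only: exercising the "file won't open" and "wrong kind of file" paths
# instead of assuming they never happen. Names and the signature are invented.
import os
import tempfile

class PredictionFileError(Exception):
    pass

def open_prediction_file(path):
    """Stand-in for the real file-open code with the behavior the spec should require."""
    if not os.path.exists(path):
        raise PredictionFileError("file does not exist")
    with open(path, "rb") as f:
        if f.read(4) != b"PRED":              # made-up file signature check
            raise PredictionFileError("not a valid prediction file")
    return "opened"

def test_missing_file_is_reported():
    try:
        open_prediction_file("no_such_file.mdb")
    except PredictionFileError:
        return
    raise AssertionError("missing file was not detected")

def test_wrong_file_is_reported():
    with tempfile.NamedTemporaryFile(suffix=".mdb", delete=False) as f:
        f.write(b"JUNK")
        path = f.name
    try:
        open_prediction_file(path)
    except PredictionFileError:
        return
    finally:
        os.remove(path)
    raise AssertionError("wrong file type was not detected")

if __name__ == "__main__":
    test_missing_file_is_reported()
    test_wrong_file_is_reported()
    print("file-open exception tests passed")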
So let's get back to the integration test. I want to show you two more tests. And then we'll get back to the example. Okay. Timing is another very important integration test. Some things to look for: well, first of all, the purpose of the timing test is to verify the software timeouts are of the correct duration. It could also be to verify any timing diagrams. Do you guys ever see timing diagrams much in a design document? Is that something that's normally documented? A timing diagram, I wish I had a picture of it. It's usually got like four or five columns, and it will have a little arrow going from left to right. And then it will go from right to left. But, anyway, I should have brought a picture of a timing diagram, but that could be another part of this test as well, to verify a timing diagram. Okay. So some of the common faults could be the timeouts are too long. In that case you could get a delay. So, for example, you know, if the timeout on a printer was an hour, that would probably be a little too long. Okay. If it was one nanosecond, that's probably a little too short. So the goal is to have the timeouts be right in the middle. Okay. A function executes too late. We also want to look at some of the functions. We could have a timing problem on how the functions execute. For example, like in my software that I just showed you, I don't know if you noticed or not, but I selected a model, and all the results got populated. Well, one timing problem could be I select the model and ten minutes later the results are populated. That would be a key problem, because the user might have gone on and done something else by then, like printed it out. So, anyway, we could have functionality that executes too late or too early. In my example with the user interface, an example of it executing too early, what if I selected a model and it started to update the second I touched that pull-down menu? Can anybody tell me what would happen? I can tell you I actually had this problem in real life. What would happen is I would touch the pull-down menu and I would tab through. You know, pull down the menu, kind of tab through. Every time I tabbed through one of those, blah, blah, blah, the results were getting updated. So the person would be selecting a model. Meanwhile, all the results are going all over the place. Well, that's like extremely confusing, and it's also annoying. And it's also taking up unnecessary resources. So you can see in that example we wanted it to execute exactly at the right time. They selected a model. Boom. It updates. Okay. Not before. Not after. Can anyone think of anything else that's timing related? I'm sure on your systems that timing is probably the number one thing you guys test for. [Inaudible comment] Well, you can extract the software timing from the operational timing. [Inaudible comment] I'm sure there's a ton of it. It's probably all spelled out in the SRS, too, I would think. So it could be possible that you guys might be able to test the timing just by testing the requirements if you have this spelled out. But sometimes the problem with timing is sometimes they're not explicitly listed as requirements in the spec. They're derived. Like in my software example I showed you, there was no requirement for the timeouts. That's a design issue. You've got to figure out what it is and do it. Okay. So sometimes they're not very explicit. Okay. Do's and don'ts. Do reference the top level design for timing information. That's normally where it is.
Do be suspicious if you don't find any reference to timing or timing diagrams. If timing is not mentioned at all in any document and you know timing is a big deal, that can be a problem. If you have specific requirements for timing, test them. Otherwise, test implicitly. So, for example, with my software there's no requirement that says when that page gets updated. But by logical common sense we could derive a requirement. We don't want it to happen every time they're selecting something, but we don't want it to happen ten minutes later. So we can derive one. Test the overly frequent timing by testing for a really long time and recording sluggishness. So the example that Iris gave us at the beginning of the class, where you have polling that happens too quickly, sometimes you don't really see that. And the only way to really see it is to test for a super long time and see if the system starts to slow down. Test the infrequent timing by basically testing for stale data. Note how long the system takes to respond to it. So for my example, assuming it's not instantaneous, I could select a model, and then record how long it takes for all the results to populate on the screen. Now, you guys can't see it when I change it, but it actually goes from top to bottom and goes like that. It was fairly instantaneous. So I didn't really have much to measure there. It happened within seconds. But if it had taken longer, I would have waited for the last input to get refreshed on the screen and said, okay, that's the timing. Okay. Sequence testing. Sequence and timing are different. People confuse these all the time. I've heard people call sequence diagrams timing diagrams and vice versa. A sequence is the order of things. You can have the right order and the wrong timing, and the right timing and the wrong order. So they're different from each other. Sequence testing is based on sequence diagrams. Lots of times, one of the things I have found is that, particularly in design documents, if there's a diagram they'll remember to put in there, it's usually a sequence diagram. Sometimes designers forget to do timing diagrams. Sometimes designers forget to do interface diagrams, but they almost never forget to do this. So probably somewhere buried in the software design document is a sequence diagram. It may be in text. It may not be a pictorial diagram like what you're seeing here, but it's probably there. So here's an example. And this actually doesn't come out of the example I just showed you. This is a class example, though. Okay. So here's the sequence. Check to see if File X exists. Check to make sure it isn't already open. Open it in read/write mode. Write to it. Close it. Okay. Do you see how this is a sequence that has to take place in that order? Okay. So this is an example of what you would look for in a document. You'd find something in a document that clearly has to be in a particular order. And my example had several of them. There were several places where sequence was super important. And you would make sure that that order is tested. What else would you make sure of? We'd also want to test what? What happens when it's out of order? Right. So we want to test the order that's required as well as what happens when things get shuffled around. Okay. So here are some typical sequence faults. The first one obviously is something's out of order. Well, the truth is it's easier to review the code than it is to test it. What I suggest for sequence testing is look at the code. Is it in the right order?
Because if it's not, you’re going to save yourself some time by executing the code. If you don’t have access to the code, then you got to just test it and see what happens. Test it out of order and make sure it generates an appropriate alarm or I could also put there that you’re not allowed to do it period. YesOkay. Cool. Yeah.Sometimes if your software issuer complicated, it may not be obvious what the order is, particularly if you have [inaudible] software. That's good idea. Set some breakpoints and make sure they execute what you're expecting. Okay. Good idea. The one problem, which people always forget to look at, is sometimes people write code and it never gets called at all. That would be a sequence mistake, because it’s out of the sequence. So you want to make sure that everything that you thought was executing actually really does. Sometimes what happens is people write this beautiful function, and they forget to call it. And the thing never gets called atoll. A particular function executes too many times. I can tell you on the example software just showed you I had a case of that happening where I had this solute that was going round and round and round and round. And every time they changed one little input, it would go through the whole thing. Well, it doesn't need to go through the whole thing. It should really just go through the parts that changed. So every time they clicked any button, boom, it would update the results. So that’s an example of that. Sometimes you have unnecessary sequences. You can also have particular function executes in valid sequence, but that sequence is illogical. I’m trying to think of an example of this. You can have a sequence, and it would be totally valid, but it's silly. Let me see if Icon thinks of one. I had a good one when I was making the slide. I should be able to think of this. Off the top of my head I can’t think of one. Have you guys ever reviewed an SRS or software design document and thought, well, that makes sense, but it's silly or it's illogical or doesn't make sense? Anyway, that’s what this one is. If you ever come across something that just is illogical, that's what the last item is. And off the top of my head I can't think of what one would be. Okay. All right. In the file example, this is a common mistake. The black boxes are the ones that don’t get executed. And the blue boxes are what they do. This bug is one of the most common bugs of all time. Open agile in rewrite mode. Write tout. Forget to close it. Forget to make sure it's not already open. A lot of software engineer’s right out of college will make this bug. And then they get burned by it. And then they stop making this bug. Softhis is a pretty common bug. And you can see here, its sequence bug. They forgot to include the other parts of the sequence. So Okay. I'm going to show you the sequence related test later. So let's skip this. OH, actually, there is something I want to show you. I want to show you the diagram of sequences. Okay. For my software -- I knew there was something that I wanted to show here. For the software I just showed you, if you were to take the SRS and the software design document and you were to stick it into a blocked diagram, thesis what it would look like. Did everybody visualize this from looking at the spec? No. It’s an art. You know, testing is hard, because what you're given is a bunch of words and no pictures. Do you guys ever get pictures in your specs? 
Are you ever lucky enough to get this wonderful spec with lots of pictures in it like this? No? You want to know one reason why you never get pictures in your specs? Well, I can tell you one thing. DOORS doesn't allow them. So this is like one anomaly that actually, it's really funny -- in my database of projects, I'll be very candid with you, defense and aerospace, and aerospace projects overall, have the best defect densities in my database. As you know, I'm sure you guys think, oh, my God, I can't believe we would be in that group of people who are the best. But generally space, aerospace, defense, medical devices, they're in that elite group that have the really good defect densities. There's only one major flaw I've seen at defense contractors and medical device companies and space people. They all use DOORS. And DOORS doesn't allow pictures. And some defense contractors are actually forbidden to have pictures in their spec. I don't think you guys are. I don't think that's a NASA standard. So the problem there is, if you don't have any pictures in your spec, how easy is it to test timing, sequencing, interfaces? I believe interface design documents have pictures in them. You have to have pictures in your design document. This is a picture. Diagrams. Yeah. Anything that's not text. Now, I guess if you guys are using object-oriented development, you would have UML in your documents. Yeah. Okay. Do you ever see any other pictures in there, like state diagrams? You do. Good. Well, that's fantastic. The reason why I presented this picture in here is that the spec I gave you was entirely text. And pulling out the diagrams is not easy. And I made you go through the exercise of reading ten pages of really dull text to let you know that somewhere in that text was this diagram. You could construct this diagram based only on what was in that spec. So if you're not lucky enough to get a spec that's got some pictures, honestly, I would create one. And I would even do it on a little hand-drawn piece of paper. Because when I look at this, doesn't the sequence now for this software look crystal clear? Okay. This has to be done first. Oops. I should not use my hands. Okay. These three things can be done first in any order. It doesn't matter. One of these or all three of these have to be done first. Okay. One. One. One. Step one. Then this. This and this need to be done before that can be done. This needs to be done and that needs to be done before that can be done. And that needs to be done before this and this. So this is actually the order that those results popped up on the screen. It's just it went really, really fast, and you guys couldn't see it. So basically the orange here is a direct user input and the blue was a computed value. But the sequence diagram is here. So what we'd want to test in this particular case is, first of all, we want to make sure it does follow this sequence. As Brett said, put in some debugging prompts. Make sure that these things happen in this order. And then, secondly, we'd want to see what happens. Is there any way for this order to be compromised? Okay. So I guess the point I want to make here is sometimes the diagrams aren't actually in the specs. I personally find that testing sequences and states, if you take the five minutes it takes to make the diagram, it goes a lot faster. So your sequence test here would be, make sure that this blue box doesn't start computing until these three boxes are done. That could be your test.
Make sure this thing doesn't start computing until those two boxes are done and so forth. We could write a whole set of scripts around that test. Okay. Let's see. All right. Now is a good time to take our next afternoon break.
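To give a flavor of what one of those scripts might look like, here is a minimal sketch that records events and checks that a computed value never fires before all of its inputs are done. The input and result names are invented for illustration; in practice the events would come from instrumentation or debug output on the real software.

# Sketch only: checking a dependency like "the computed result must not start
# until all of its inputs are entered." The event log and names are made up.
events = []

def enter_input(name):
    events.append(("input_done", name))

def compute_result():
    events.append(("compute_started", "result"))

def assert_compute_after_inputs(log, required_inputs):
    """Fail if compute_started appears before every required input is done."""
    done = set()
    for kind, name in log:
        if kind == "input_done":
            done.add(name)
        elif kind == "compute_started":
            missing = set(required_inputs) - done
            assert not missing, f"computation started before inputs: {missing}"

if __name__ == "__main__":
    # Simulate the correct order, then check it.
    for inp in ("size", "language", "defect_profile"):   # hypothetical inputs
        enter_input(inp)
    compute_result()
    assert_compute_after_inputs(events, ("size", "language", "defect_profile"))
    print("sequence dependency check passed")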

Module 5

Okay, we will get started again. Everybody ready, including videotaping? Okay? Yes, we're good to go. [noise] Alright, the next thing I want to go over for integration testing is state transitions. I do want to point out states can be tested as a black box as well, if the states are visible from an external point of view. Okay, so this can be done both in integration and system testing. I put it here because normally state diagrams are in the design document and that's what we test when we do integration. Okay, now one thing I do want to point out is in our last segment I showed you a sequence diagram and I said that sequence diagram existed in the spec but you couldn't see it because it was in text. The same thing could happen with states. I don't know why software people do this or why software engineers do this, I don't really know why; for me it's much harder to describe a state if I don't use either a table or a diagram. So you may see states described with a table, a transition table, which I'm going to show you one shortly, or you may see it described with a figure like what I have here. But I have found that writing states out without one of those two takes a lot more work than using one of those two, particularly the table. It's so short and easy to make a table. But anyway, they may not be very obvious in the documentation, but they could be there. Okay, so here's an example, this is the simplest state transition I can think of. Printer's working and properly available is a state. Printer's working but not available is another state. Printer's not operating at all is another state. So we have at least three states here. The transition from this state to this state, the printer cable was disconnected from the network. Okay, that is one way it could get there. Cable was reconnected to the network and now it's printing again, or now it's at least available again. Okay, the key word there is available. So the only way the printer could be not available is if it's disconnected. Can you think of anything else? If it's a wireless printer, maybe the internet went down, something like that, okay, something is disconnected. Okay, so over here, printer not operating. Okay, well, to get from here to here the printer has to malfunction. To get from here to here the printer has to be restored. Then over here we could have the printer is not available and go straight to not operating, so that would be like you unplug the printer and then it stops working. Probably a remote probability, but it could happen. Couldn't your printer malfunction after you've turned it off from the system? Sure, it could do that. [inaudible audience comment] Huh? Oh yeah, that's true. We could have it not plugged in. Okay, so it is possible it could malfunction while it's disconnected from the network. It could be restored while it's disconnected from the network. If you think about this, if your printer stops working, probably the first thing you're going to do is disconnect it from the network, so this can definitely happen. Okay, so here's an example of state transitions. And here's an example of a state transition table. This says the same thing as this one, or it should anyway, except it's a table. So either way, either approach is something to look for in the documentation. Okay, so here are some common state related faults and how to test them. Well, one of the most common ones is missing states. So the real state diagram doesn't match the implemented state diagram, okay? That's one you need to look for.
I'm gonna show you an example of that shortly. The next one is a dead state and this can happen a lot, where a state is entered but there's no way for it to be exited. So if we go back here, you'd have a dead state if you had this arrow come in here -- well, actually, let's say you have this one here but you didn't have the printer being restored. You'd have a dead state. It would become non-operational and that's the end of it, bam, we're done. Okay, so that would be a dead state. Impossible states, this is when you've got a state out there with no lines connected to it. So there's no way for that state to ever happen, but it should happen and it's not happening. Disallowed states, this is the opposite of that one. When you've got a state and the software shouldn't transition to it and it does. History is full of these. If you go back, there's some website that describes like 10,000 software failures, probably 2,000 of them might have been somehow related to that. This is a pretty popular one. It's when the software should not be in this state and it is. I'm sure you all have heard of the Therac-25 disaster with the radiation equipment. Oh, I'm really surprised. The Therac-25 was a radiation -- well, it still is, it's called something else now -- but anyway, it was radiation equipment back in the mid-'80s and there was a huge case study on it. If you get on the internet and search for Therac-25 you can read everything you'd want to know about it. But to make a long story short, it has two modes and I forget what the two modes are called. There's a specific name for them. I think there's like electron mode and some other mode, but anyway, the short story is the equipment had two modes. Depending on the type of cancer the patient has -- all the patients are cancer patients -- so depending on the type of cancer they had, it either shoots a beam directly onto the cancer, which is a low dose but no filter, it just shoots it on there, or it shoots a high dose with a filter. So can anybody guess what the disallowed state was? [inaudible audience comment] Yeah, well, it happened, they're not really sure, somewhere between two and 20 times. The problem is they don't know exactly how many times it happened because the patients themselves have cancer. But eventually, over the course of time, the doctors figured this whole thing out and realized there were several root causes and then one final root cause. The final root cause was it was in two modes at once, but there were several ways it got that way and one of them was race conditions. So anyway, that's an example of that. Okay, so when you're looking at the software it's always a good check to say, what states do I not want this software to be in? What is that one or two or three modes that we really don't want to have happen? It's worth the exercise to test for that. Okay, allowed states don't work. Of course that would be like you come back over here and you test one of the states and transitions and it doesn't work, okay? And then finally a state is entered and nothing happens or the wrong event happens. This would be a wrong state diagram, where a state is entered but the diagram isn't what happens. So something else happens, not this diagram. So when you test you want to make sure that what's actually in the spec is what happens. Can anybody think of anything else that's state related? Okay, bye Bob. Bye. It was nice having you in class.
[inaudible audience comment] One of my three talkers is leaving, so someone else now needs to talk for Bob since Bob's leaving. [laughter] Okay. [inaudible audience comment] Okay, so here's an example of state testing. This is what the spec says: the software shall detect and report a printer that's offline. Okay, that sounds pretty simple. The software shall detect and report a printer that is malfunctioning. Okay, sounds pretty good to me. What could possibly go wrong with that? Well, the spec has an unwritten assumption that the software will also detect a printer that's online and working, which was one of our items on the previous list, and that the software would detect a printer that's in both states. What if it's offline and malfunctioning? So these two simple little statements are actually an example of a couple of different bugs. There's a bunch of things it didn't say. So if you were to examine this and test it you'd find the dead states: the software fails to update the status of the printer once it comes back online. So an example would be the software detects when the printer is offline but it doesn't detect when it's online. Do you know how many times that kind of thing has happened in history? It happens a lot. I'll give you one example that happened at a jail. They had this software to monitor prisoners and the prisoners had some kind of device on them. And so the software, if the prisoner left a certain area in the jail -- it was actually a jail, it wasn't a prison -- the software would go off and say, you know, prisoner XYZ has gone somewhere they're not supposed to go, like out the front door. Okay, well, what happened with the software was -- I'm not making this up at all, this is unfortunately a true story. So what happened with the software was it checked to see that prisoner XYZ, you know, is in his or her cell, but it didn't check enough. So it goes out, it checks, prisoner's in his cell, okay. It goes out and checks, prisoner's not in his cell, okay. And then it doesn't check again. So meanwhile, whenever the state got changed, it never reset it. So the prisoner is out of his cell, the prisoner goes back in the cell, then they go back out and the software thinks he's still in the cell. Meanwhile the prisoner walked right out the front door. So anyway, that's an example of this, and this happens a lot. Sometimes people get so focused on detecting a failure that they forget they need to detect the success also. Okay, the same thing happens with malfunction. We need to test both it coming back online and it no longer malfunctioning. Okay, so this is what would happen if we wrote that code exactly to the spec: the software would assume that it can't malfunction, so here's an example of a missing state, a dead state. It doesn't know what to do at all when the printer is not working and not available. There's no code for that whatsoever. So that's a state out there in la la land. And these two states have entrances but not exits. The software will not detect when they come online and operate. So this is an example where even the simplest state transition could be screwed up. So as testers, you guys, if you create the diagram then you'll know right away, okay, something is missing. Okay, some do's and don'ts. If the spec does include a state transition table or diagram, use it; if it doesn't include one, then create one. It's a lot easier to test from a table than it is to test from text. Look for dead or impossible states. You may be able to see other state related faults just by looking at the designs.
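Here is a minimal sketch of what testing straight from a transition table can look like, using the printer example from before. The table, the PrinterSim stand-in and the event names are all made up for illustration; the idea is simply that every allowed transition gets exercised and at least one disallowed transition gets rejected.

# Sketch only: driving state tests from a transition table. Hypothetical names.
TRANSITIONS = {
    ("available",     "disconnect"):  "not_available",
    ("not_available", "reconnect"):   "available",
    ("available",     "malfunction"): "not_operating",
    ("not_available", "malfunction"): "not_operating",
    ("not_operating", "restore"):     "available",
}

class PrinterSim:
    """Stand-in implementation that follows the table exactly."""
    def __init__(self):
        self.state = "available"

    def event(self, name):
        key = (self.state, name)
        if key not in TRANSITIONS:
            raise ValueError(f"disallowed transition: {key}")
        self.state = TRANSITIONS[key]

def test_all_allowed_transitions():
    for (start, event), expected in TRANSITIONS.items():
        p = PrinterSim()
        p.state = start                 # force the starting state
        p.event(event)
        assert p.state == expected, f"{start} --{event}--> {p.state}, wanted {expected}"

def test_disallowed_transition_is_rejected():
    p = PrinterSim()
    p.state = "not_operating"
    try:
        p.event("disconnect")           # not in the table, must be refused
    except ValueError:
        return
    raise AssertionError("disallowed transition was accepted")

if __name__ == "__main__":
    test_all_allowed_transitions()
    test_disallowed_transition_is_rejected()
    print("state transition tests passed")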
Sometimes you can see there's a missing transition just by looking at the design. And finally, don't forget to test the normal states too. You need to test those as well. Can anybody think of anything else related to states that would be good? Alright, I am gonna show you an example. I'm sort of actually getting a little tight on time, so I'm going to give you guys the example to this. The software -- well, I'll give you guys like two minutes and then I'll show you the example -- the software spec that you guys read related to my software, did you spot a state diagram anywhere in there? I'll give you a hint, it had to do with the file management, opening and closing a file. There was actually a state diagram in that and I'm gonna show it to you for the purposes of saving time. Okay, this was actually the state diagram that was described in the spec. If you went back and looked in the spec, I think it was, yeah, it says section 2.1.X, all of those requirements, if you were to take those statements and construct them all and use them without any other knowledge and construct a diagram, this is actually what the diagram said. Now the items in red, I'm gonna give you this handout shortly. Okay, I didn't put it in your handout cause I actually was gonna make you guys construct it, but we're running short on time. Okay, the red item here is where people normally forget to define a state. Okay, so those are kind of potential problem areas. But then we saw this, this is the state diagram that's described in that software spec. Okay, no file open is one state. If you remember, it said basically if the effective size was zero it couldn't display any results. So that's a different state. And then it said if the file was open but the effective size was greater than zero it could display all the results. So there's three states the file can be in. Okay, according to the spec there's three states: not open at all, one of these is open, or one of those is open, and all the transitions to get back and forth. So for example, a file that has this can change to that just by entering an effective size greater than zero. Okay, the red area here means people tend to forget that. I can tell you I wrote this software and I forgot about this. What can happen is you could have a file where the size is greater than zero and somebody could change it to zero. Now it goes back to this state. Well, I totally forgot about that state and I can tell you it was a bug. So the red ones, they're all the places where I encountered bugs cause I didn't totally think it out. Okay, so here are all the state transitions and everything. We could have the new file that would cause it to be open and so forth. Now there's still some bugs in here. You guys can review the spec at your own leisure if you're seriously interested in this, but if you review it long enough and think about it long enough, there's some things I didn't talk about, which is the failure states. Okay, now I've taken that same diagram and I've put in two failure states here. Remember that one bullet where I said find the states that you don't want to have happen? Okay, I failed to do that in the spec. These are all the good states. These are all the normal states. The two red ones are the failed states. And I didn't include that in the spec at all. There was not a single statement to say what would happen if the file can't be opened. It assumed that the file could always be opened properly.
It also assumed that the state of the database -- so for example the file itself -- would always be readable. So basically when you put those two on there you can see that we're missing some specifications. So I need to add something to the spec to say what happens when we get a wrong database template, the file's corrupt, wrong version, and so forth. So the failed database state, I know that can happen if I abort the software while a file is being opened. So basically creating this diagram helps me to identify these things here, how it could happen, but now I know I don't have any spec for it, so I need to go write some code. So can everybody see that this is kind of a normal progression? And this is just something so simple and so silly. Can you imagine doing this on the space shuttle? Or some other system that's truly stateful? This is not even a stateful system. And here we've got already two bugs, okay. You guys are being so quiet. Alright, okay, now I want to finish up, this is actually the end of the integration testing. I want to show you some metrics to use and basically some metrics to use while you're recording failures. Okay, record failures and use metrics through integration. This is always something that's done in parallel to the tests I just showed. Like I said before, recording failures and using metrics go hand in hand. So we need to go back. If we're going to have metrics for integration testing, we need to go back and revisit: why are we doing this to begin with? Why are we integration testing? Well, we want to find defects that can't be easily found at the next phase or at the unit level phase. Okay, so let's take a step back, what would be hard to find in unit testing? Do you guys remember? I told you a few of these. [inaudible audience comment] Yeah, anything that's outside the module, timing, interfaces, basically everything we covered in this section. Yes? Okay, so we don't want to -- we want to focus on the stuff we can't find at the level below. We also want to focus on the stuff that we can't find next, which would be anything that's difficult to see without the specs. So interfaces are another example. It's really difficult to see that just from looking at the spec. If you look at the spec I gave you on that software, the reliability software, if we were to test just the spec and not the design, a lot of that stuff would have been really difficult to test. So our goal here is to test what can't be easily found during the other two tests and also to verify the software -- here it is -- software to software and software to hardware. Brett, this was your comment. I actually had it on the very last slide but forgot to put it on the other slides. Okay, so we're trying to verify software to software, software to hardware. Those are our goals. So we want our metrics to match up with these goals. So these are some useful metrics. You know, you guys may have other ideas. You could have a Pareto chart of the types of defects found by root cause. And then figure out the percentage of them that are related to timing, sequencing, interfaces and the like. So what I'm suggesting is go back to some project that's been fielded for a while and go do a root cause on, let's say, 50 bugs. And categorize them by what you think the root cause was. Okay, which lots of times, to be honest with you, with software engineers, they'll put the root cause right in the corrective action. Oh, my timing was off on this, blah, blah, blah. I had the wrong logic. Lots of times the root cause is right in the problem report.
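Here is a minimal sketch of that kind of tally, with made-up numbers standing in for the root causes you might pull out of roughly 50 fielded problem reports. The categories and counts are purely illustrative.

# Sketch only: counting escaped bugs by root-cause category and computing the
# share tied to integration-level causes. All data here is invented.
from collections import Counter

# Pretend these were pulled from ~50 fielded problem reports.
ESCAPED_BUGS = (
    ["timing"] * 9 + ["interfaces"] * 7 + ["sequencing"] * 4 +
    ["logic"] * 20 + ["requirements"] * 10
)

INTEGRATION_CAUSES = {"timing", "sequencing", "interfaces", "exception handling"}

def integration_escape_share(bugs):
    counts = Counter(bugs)
    integration = sum(counts[c] for c in INTEGRATION_CAUSES)
    return counts, integration / len(bugs)

if __name__ == "__main__":
    counts, share = integration_escape_share(ESCAPED_BUGS)
    for cause, n in counts.most_common():
        print(f"{cause:14s} {n:3d}")
    print(f"share related to integration-level causes: {share:.0%}")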
Okay, so one thing that would be an interesting metric is how many of the bugs getting out are related to these things? If a whole bunch of them are related to those things, well, that's a metric about your integration testing. It could be that most of your bugs are related to that naturally, but probably what it means is you need more tools or something. So this is one possible metric. Can anybody think of any other metrics? Do you guys use metrics here? What kind of metrics do you guys normally use? Number of bug counts or anything like that? Okay, well, that could be used in addition to this, so for example, I could also have the total count of bugs. Let me show you one [background noise] popular metric I did not actually put up here. Are you all familiar with -- I hope this comes out on the video back there -- but there's a concept of this shaped profile, which is, let's see, phase, I'm gonna try and write very neatly here so you can read it. [noise] Okay, this profile would be noncumulative. Okay, which means we don't add -- let's say these are all of our phases of development -- noncumulative means we don't add them to what we found on the previous phase. Okay, so there's been a lot of research over the last 35 years that shows that for a typical software project we tend to see this thing looks like a bell curve. Now, it is totally possible to see multiple bell curves. Okay, let's just put that issue aside. We'll just focus on one bell curve. So the idea here is that, you know, let's say somewhere around here is normally -- well, let's see, somewhere around here might be operation. Somewhere -- I probably drew it a little off -- but somewhere around here would be systems testing. Somewhere in here might be integration testing. And usually out here is unit testing. You're usually at the peak of the bugs right when you're unit testing. Okay, so this could also be another metric where you could just keep track of how many you had and plot it onto the [inaudible], so that sounds like something that I heard you guys say you're doing, keeping track of the bugs. Is that kind of what you're doing here? Bug counts. Bug counts? Okay, this is just one way to show bug counts. I'm not sure we track them by phase. Okay, so you just track the volume. Okay, well, tracking the bug count is one very simple metric. So what do you do with the bug count after you've tracked it? It goes on a report. [laughter] That's what I was afraid of. So basically somebody looks at it and says hmm, there were 979 bugs. Okay, alright, well, the purpose of the metrics I'm trying to show here is to help you get better at whatever we're trying to test. So one way to do that is if you know how many escaped bugs -- see, these are escaped bugs, these have gotten through the process -- if you know how many of them are related to this, it gives you an idea of what your integration testing should be. Okay, so it's just a, it's a nice easy metric. It's really not as difficult as it looks. Okay, I'm trying to use my magic marker to change the page. Okay, some exit criteria. These are some typical exit criteria. Completed test results, obviously, checklists, scripts, [inaudible] and modified and high risk code. For all highly critical or high frequency code you might have a formal review. So this is really the same exit criteria as for the unit testing, applied at the integration level. You might also have a traceability matrix to the interface design document. This is pretty typical.
You could also of course have the list of open issues for the architectural design documents and system requirements spec. I would make this part of your deliverables. A lot of people assume that this will be nothing. I would assume it's something until proven otherwise. I would assume you're gonna have discrepancies. And even have a means in whatever your process is to document them. So the worst thing that can happen is if somebody finds a discrepancy in a document and they have no means by which to easily communicate that. Because you know, as software people, software testers, software engineers, by the end of the day if we didn't have a means to quickly and easily record that problem, what probably happens to it? It gets stuffed under the 100 other things we had to do that day. So I would highly recommend that there be something, some medium, some place, some reservoir, even just a piece of paper hanging up on the wall, that says these were things I found today that have to be documented. Do you guys think you have that here at NASA? It probably varies by project, I would think. Some people use risk management systems to do this, so that's just food for thought. Can you guys think of any other exit criteria, any other things you want to have done before you go to system testing? So those of you who are system testers, what would you like to have done before the code's given to you? I think everybody's got the 2:30 full stomach. Okay, good, well, we're done then. Okay, wrap up. One thing I do want to point out is the integration practices I showed you in this class, there are other integration tests we can do. These were the ones that surfaced as being related to fewer defects, so that's why you learned about them. I showed you how to plan and strategize them, how to document them. I'm gonna show you the completed checklist, actually, for this sample project shortly. I showed you how to execute the tests and what kind of bugs you're likely to find. What I have found is when people know what kind of bugs there are, they know how to test for them. So I like to work backwards. I like to say, this is what you're looking for, figure out how to test that. And I showed you some do's and don'ts for executing the tests, some metrics and some exit criteria. So now let's move on to the last module. [noise] Okay, our very last module in this class is the system testing module. Okay, here are our module topics. The module topics are exactly what we've gone over for the other two modules; we're gonna go over the same stuff, except for system level testing. So you'll notice that we always do the plan, we always do the strategy, we always define the tests, then we execute them, then we record failures and use metrics. So we use the same process over and over again but with a different test focus. Okay, to recap, these were some of the statistics that I showed you this morning. We're gonna talk about -- I think we're gonna talk about almost everything on this list. I don't think we actually specifically talked about simulation other than it being a tool. But we're gonna talk about everything on this list. These were the things related to system testing and how effective they were with respect to fewer defects. So you can see exception handling was right at the top of the list and so forth. We're gonna go over each one of these. We're at the part of the course now where we're looking at tests that are run without any visibility of the code.
So we're gonna put aside all of the tools that we had from the previous two types of tests, the debuggers, any tool that could see the code. We're not gonna be using those anymore. Now we're gonna be looking at only what we can do with the spec -- the software requirement spec, more specifically -- and the code itself and user's manuals and whatever else. That's it. So the phases where we could execute these tests could be alpha, beta, system test, qualification test. Alpha and beta tests I don't normally see on government projects. Do you guys actually have phases of testing called alpha and beta? Probably not. It's a commercial term. In the commercial sector an alpha test means someone other than us is testing it. And a beta test is a bunch of people other than us are testing it, and they're testing it for free, and we may or may not get any feedback from them. Have any of you ever been beta testers for any software? Were you required to actually report? No, you could have just used it and not told them anything, right? So beta testing is pretty loosely defined. Sometimes people will give feedback, sometimes they won't. And the reason why people do alpha and beta testing is they do alpha and beta testing when they're shipping the software to a lot of end users. So if you had 5,000 people who were gonna use the software, you would do an alpha and beta test first. An alpha test might be one customer. A beta test might be 50 customers, and then the full release might be all 5,000 of them. So I don't think alpha and beta tests really apply too much to what you guys do here. You're gonna probably go straight to systems test and then FQT. Okay, so now if we look at the black box, white box, grey box diagram, we're firmly looking at this box right over here. That's the one we're gonna be focusing on, the stuff we can find without seeing the code. So as a general rule of thumb, while I'm showing this diagram, if you were to execute all of the black box tests that I have in here, you might hit 40% to 60% of the code coverage. And that's if you did a really terrific job. Okay, one of the things that I've seen in industry, which I saw in my database when I collected all this data, is that a lot of companies did only black box testing and they did a really great job of it, but they only had line coverage of maybe 40% to 60%, somewhere along those lines. If you shipped the software with that kind of coverage, you're gonna have some bugs. So one thing I do want to get across is that's why we went over the other half earlier, okay? Okay, so the focus of the system test is to look definitely at the requirements, possibly at the top level design. Sometimes you need the top level design because it may have information that the SRS doesn't have. Okay, these are some types of tests and this is not a complete list. I know that you all are going to be spending the next two days doing verification and validation. I suspect you'll probably go over every other test there, so there's enough. This is a one day course and this is all I really had time to show you. There are other tests other than these. The very first one is requirements validation. The single most important test is to make sure every statement in the SRS is tested. I believe, if memory serves me right -- I need a drink -- I'm pretty sure as part of the NASA software assurance guidelines, I think you guys have to do this or it's strongly recommended. Is that a true statement, to test all the SRS statements? I don't know if it's a formal requirement. Something that's expected?
It's a software engineering requirement. Okay. An NPR. Okay, it's an NPR requirement? Okay, I didn't realize it was an NPR. Okay, yeah, it's something that I think most projects are expecting you're going to be doing. Okay, one thing I do want to point out though is, you know, in a lot of commercial companies this is not necessarily a given. I've worked with a lot of software projects where they have this spec and it has 500 requirements, but they didn't test all of them and they really didn't even try to test all of them. So that really baffles me, I don't know. But anyway, this is pretty much the first test. Just make sure what's in the spec gets tested. Now right behind that is the opposite of that test, which is to test what the software should not do, which is the system level exceptions. Then stress testing, which I'm going to talk about shortly; that's being able to test it for a very long time. User interface testing. Performance, configuration, compatibility, these three can be optional. You may not have any requirements related to these three things. If you don't, you don't run the test. You may also not have security related requirements either. So I'm gonna actually spend a relatively short amount of time talking about these four. I'm just gonna tell you enough to let you know what kind of test it is. Regression testing is usually mandatory at most companies. I'm pretty sure, it may not be called regression testing, but I'm pretty sure this is in the NASA standards, that you need to retest all your changes, so that should be somewhere in the NASA guidelines. And then finally the acceptance test. I believe the FQT is part of the NASA guidelines too, is it not? Are you guys required to do a formal qualification test or an acceptance test before? Okay, alright, so good. Alright, so at least this one and this one and this one would probably already be present in your process.

Okay, let's go through the planning for the system testing. The scope of the test, as we talked about earlier, who's gonna execute them, when, the right tools, and setting up the documentation. Now all of this applies to the system level testing. Okay, the scope of the system test: the things that are almost always in scope are the normal behavior, which should be in the spec, the SRS, the abnormal behavior, and any special requirements such as security, etcetera, blah, blah, blah. Okay, so normally that's almost always gonna be in the scope. Decide whether or not to focus on existing but unchanged features. So for example, if you have 40 million lines of reused code, you need to decide what you're going to do about testing it. And NASA is full of examples of where they tested none of it and found out that wasn't a good idea. So anyway, somebody has got to decide what part of the reused code, if anything, we're gonna test. Make sure stakeholders have an opportunity to approve the scope before moving forward. In you guys' case the stakeholders might be the systems engineers. The people who have this global understanding of the system would probably be your stakeholders. Okay, system tests are usually executed by people who did not write the software being tested. And this is almost always true. Now I have run across cases at small companies and even at NASA where the software engineers actually did the black box test. What they did is they developed the code, they integrated it, they stopped. They all put on their 'I'm a tester now' hat and then they ran the system test.
Can anybody tell me, actually, have you guys first of all ever seen that case where a software developer actually ran a black box test? And when do you think it might happen? You guys ever seen it? I've actually seen it within NASA. When I've seen it happen, it's on super tiny projects. And I'm not gonna say the NASA site where I saw it, but it wasn't here and it wasn't at Johnson and it wasn't at Ames, so we're covered. But anyway, if you've got a super tiny project, it's possible that your tester is the guy that wrote it. So I just want to point that out, but ideally, if you have a bigger project, you want somebody else doing the test.

Okay, let's see, and of course we want to get back to when the tests will be implemented. There was a myth years ago, and it's finally starting to die, that system testing doesn't start until everything is done, meaning all code's finished, all code's integrated, that's when we start system testing for the first time. I think that myth is starting to die. There's a lot of value in, if you have everything you need to run that system test, run it. If there are some changes made, we can rerun the test later. Okay, so that's the new way of thinking. And I have found in my database, getting back to the big blob again, I have found that the big blob testing, it stuck out in my database. The companies that did pure waterfall testing, they did not end up in those lower levels of defect density. Getting back to what we talked about in the previous two modules, it just takes longer. It takes more calendar time. Whenever you have something take more calendar time, what are you at risk of? Okay, management comes along and says, why is this taking so long? It happens everywhere. You never want to kill calendar time. Okay, so that's basically what can happen. And then we saw these two risks before.

Okay, so now this is how the incremental approach looks. If we did everything incrementally we have all these nice little loops. There are lots of software development models out there that are incremental. There's the spiral model. There's actually the incremental model. There's a bunch of them. But they all have something in common, which is we test in little tiny chunks. Okay, so I don't really care what you call it. If you're doing this, that's incremental. You can see the length of that is shorter than the length of this. In my database I measured the length of the projects, and when they did this, except for little tiny projects, which this works fine for, for big projects this always took longer than that. And there's no real reason to kill calendar time, so why do it? Okay, oops, did I skip a page here?

Alright, choosing the right tools. Some of you were asking me during the break my opinion of certain testing tools, and I gave a few of you my opinion during the break. Now I'm gonna tell you my opinion as part of the class. During system testing, our tools are totally different now. All of those white box tools like JUnit and all those, those are gone. What we have now are test beds, which are always good, simulation, which is almost always good, I mean you guys couldn't exist without simulation, let's face it, and then the programmer's workbench. You also have, well, these are the things that my database found significant. So these three things were highly correlated to fewer defects in my database. So this would make perfect sense, right? If you have any of these three things, things are gonna go well.
Can you guys imagine testing if you didn't have a simulator or a programmer's workbench? I'm not sure how you'd ever get it done. So those are pretty much a given. Now the other thing I want to point out is different tests have different tools. Requirements validation will need some kind of traceability tool. So, like, ClearQuest can be used for that. So that's a good tool to use for requirements validation. You have all your requirements. You want to make sure all of them get covered. System level exceptions, this is oftentimes a manual effort. The user interface, there are many screen capture and replay tools, many, many, many. Have any of you ever used these? Screen capture means the tool captures the screen, it saves it, and then it allows you to replay it later without you being there. There's a ton of these tools out there. I'm not gonna name the vendor names 'cause we're on video, but anyway, you guys can look at my stuff and find out the names of these. These are good for testing the user interface. Now one question I had during the break is, do I think these are very effective? And my response was they're effective at testing the user interface. That's it. Their effectiveness for this other stuff dies. So one point I do want to make here is I was really gung ho about the tools in unit testing. I was all gung ho about the tools in integration testing. My enthusiasm starts to die right around here. It is entirely possible, with the exception of simulators and test beds, that you could get through system testing without these tools. So that's just something to remember.

Okay, performance testing is usually semi-manual or manual. Stress, there's a variety of tools that exist for that. If there's one test, nonetheless, where you would like to have a tool, it's actually right there. Stress test tools will just pound away at your software. They just inject data for days and days, so those are good to have. Configuration testing varies. Compatibility is almost always manual. Security is a combination of manual and automated. Regression testing lots of times is automated. It's part automated, part not, and the same thing for acceptance. Okay, with that slide, let's take our last break, or second to last break, for the day, and then we'll come back and we'll look at these tools.

Module 6

Okay. Everybody ready? [ Background Discussion ] All right. Let's get back to what we were doing. I'm going to -- we'll get back to the tools a little later. Okay. But for right now, I want you guys to realize that the tools are -- they have limited -- I should say they don't solve every problem. There's been a lot of hoopla in industry about this tool that will test all your stuff. You know, basically, they do what they're supposed to do. These tools test the user interface. They don't -- do they test the processing behind it? Probably not. So be careful of the fact that when you get a tool, it's not going to automate everything, and you probably don't want it to automate everything.

Okay. Create a template for the below artifacts before you start testing. I'm sure you guys probably have some template documents for these things. Do you guys have, like, templates, where every time you start a new project, you fill in the blanks? Okay. If you don't have templates, I can have -- I have some for you. Spreadsheets, I have found, are easier than word processing documents, but for whatever reason, the whole world uses word processing documents for test plans. I have personally found -- I used to have a for-profit test lab, which means our job was we tested for money. We used spreadsheets. They work amazingly effectively. And I can tell you I've been working with some -- I can't say who -- but some industry leaders, they use spreadsheets, period. So just some food for thought. If you're having a hard time with the documents, let's say in Word, you might want to try switching to Excel. It can be extremely effective. You might also use some kind of requirements management system or ClearQuest or whatever, but basically, whatever your tool is, you want it set up ahead of time. Okay. These are the minimum requirements of what you'd like to have in it. This is almost identical, in fact, I think it is identical, to what I had in integration testing, so we have the same type of formality as we would for integration testing. And I pointed this fact out earlier about the use cases. During the system test, if you haven't tested the use cases yet, now's a great time to do it. If you didn't do it in the integration testing, this is your last chance to test the use cases. All right. A system test checklist; I do have on your CD -- and at the next break, I guess, well, probably our last break for the day, I'm going to show you the system test checklist. The checklist has everything that you're going to learn in this module in it. So it's all in one place. So it's a memory jogger. It's very useful. You could also use the checklist to document the decisions that you make.

Okay. So here is our next job of the system testing process, to define a strategy. We've actually already gone over this before, but basically, when we get to system level testing, at this point, it's super important to prioritize [Inaudible]. Okay. During the unit level testing, we didn't have so much of this strategy, because we're going to test every unit. We're going to test it as soon as we finish the code; so strategizing was a little less important. Okay. We had a little bit of strategy that came into play. When we did integration testing, we had a little bit of strategy here. We want to make sure we integrate the stuff in the right order. When we get to system testing, this pyramid becomes super important, because you're probably going to have thousands of tests, even on, like, one CSCI.
You could have thousands of tests, hundreds of tests. You definitely want to execute them keeping in mind this order; this is intentionally on the bottom. You definitely want to test the code most likely to have the bugs first. You want to test the code that's used the most, et cetera. We went over these examples in the previous module. I think you guys get the gist on the strategy here. I just want to point out it's very, very important when you get to system testing. So like we said earlier, we want to test the stuff with the biggest risk. Well, if you're a system tester, you may not know that. Because, keep in mind, the integration tests and unit tests were developed by the developer; they know where the risks are; they wrote the code. As a system tester, you may not know, unless you have past history with it. So you might have to do a little bit of work to find out what the risk areas are. And one easy way to find out is to ask those software developers, you know, where's the risky part of the code. They'll probably tell you where it is.

Okay. One important point I want to make in bold here. I make a living out of reviewing people's test plans, and it usually takes me maybe 10 to 15 minutes to find something monumentally out of order. One of the first things that I always look for is -- one of the big common mistakes that people make is they write the test plan in the order of how easy something is to test. And I'll have a lot of software testers tell me, I'm going to get the easy stuff out of the way first and then I'll focus on the hard stuff. Isn't that what we tend to do in life? You know, when we get up in the morning and say I'm going to do the easy stuff first and then I'll do the hard stuff at the end of the day. Not the way to do system testing. Okay. You don't want to order your test plan based on the easiest stuff first and the difficult stuff last. That's not a good way to do it. So for example, the user interface, just because it's easy, shouldn't go first, not necessarily. Okay. You don't necessarily want to test in the same order as the SRS either. It could be the SRS is laid out in a perfectly logical order, but probably not. It's probably laid out in whatever order the system requirements were in, which could be totally random order; so that's not necessarily the right way either. And for sure, and I see this at least 6 times out of 10, don't order the test plan alphabetically. I see this all the time, and that is definitely -- this would be better than that. Okay. So my point is, all right, so if we don't lay out the test plan in those orders, what do you think is the right order to lay out a test plan? The most important function. Okay. All right. So let's say you have everything grouped by importance, okay, that's for sure. Now within some grouping, let's say you have the ultra-important stuff, what would be your order within there? It was on my strategy pyramid. That was the bottom. You got the bottom layer of the pyramid, which is exactly what I was looking for. What was above that on the pyramid? [ Inaudible ] What -- yes. What is the order that's most likely to happen? So we got our important stuff; that was the bottom of the pyramid. Now we want to test in the most likely order. So for example, with my software, I'd probably want to test in whatever order's in the user's manual, because that's probably the order somebody's going to execute it in. So open up a file, edit some stuff, look at the results, print it out, close the file, shut it down. Okay. That would be like a nice order for my test plan. Okay. So good question. So anyway, I would simply avoid just some random order.
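To make that ordering idea concrete, here is a minimal sketch, in Python, of sorting a test plan by priority first and then by the likely order of use; the test case IDs, features, and fields are hypothetical, not from the course materials.

```python
# Hypothetical sketch: order a system test plan by priority first, then by
# the typical order a user would exercise the features (e.g., the order in
# the user's manual), rather than alphabetically or by how easy a test is.
test_cases = [
    {"id": "TC-12", "feature": "print report",   "priority": 1, "manual_order": 4},
    {"id": "TC-03", "feature": "open file",      "priority": 1, "manual_order": 1},
    {"id": "TC-44", "feature": "about dialog",   "priority": 3, "manual_order": 6},
    {"id": "TC-07", "feature": "edit data",      "priority": 1, "manual_order": 2},
    {"id": "TC-21", "feature": "export results", "priority": 2, "manual_order": 5},
]

# Lower priority number = more important, so those run first.
# Within a priority group, follow the order of likely usage.
ordered_plan = sorted(test_cases, key=lambda tc: (tc["priority"], tc["manual_order"]))

for tc in ordered_plan:
    print(tc["id"], tc["feature"])
```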
Okay. We talked about the code that's executed the most. Okay. I'm not going to go into these too much more. As far as looking at the features, when you're doing integration testing and unit testing, the use cases are not necessarily always traceable. You can't always trace the use case to an integration test, and you definitely can't trace them to a unit test. When you get to system level testing, for certain, you can take into account use cases. So one thought is, once you eliminate the code that's not very risky, maybe the next step is to execute the use cases. Okay. And then to look, as we saw before, this is the same example I showed you guys earlier, at the software that's being used the most.

Okay. Defining this strategy. Now during the break, Jane showed me the NPR 7150. Is it Dot 2? [ Inaudible ] Okay. Dot 2A. And in there, it said, somewhere in there it said, document all of this stuff. Right. There was a bullet paragraph, and so actually you have to do this for the NPR anyway: document everything related to your strategy, your procedures, what's high-priority, blah, blah, blah. I don't think the NPR says you have to prioritize anything. It just says you have to document it. Okay. I would not assume all tests are equally weighted in terms of priority. A lot of software testers have it in their mind that every test is equal in terms of priority. And in a perfect world it might be true, but in testing, they're not equal. So you do want to execute the ones with the most impact first to allow time to fix them.

Okay. Getting to the tests -- now I want to go over the system tests. We're going to talk about each one of these, one at a time. These are some Do's and Don'ts. For the requirements validation, don't wait until the start of testing to start planning these. I would start writing the test plan as soon as the SRS is approved, and maybe even sooner than that. In fact, if you start writing the test plan before the SRS is approved, what do you think could happen that would be a good thing? You can find some requirements that need to be [Inaudible]. Exactly. In my database, if you guys go out on your CD later on, there's a white paper out there; that's the only PDF on your CD. The one thing that really showed up as super important in my database was the organizations where the testers started writing the test plan before the SRS was done. Those organizations made that simple little movement of people. It sounds like you're adding people. Somebody has to start a test plan sometime. So the only thing you're changing is when they start doing it. So instead of moving them way down in the lifecycle, you move them way up here. So they're not spending more money, they're just doing it at a different milestone. The organizations that did that, by far, had the lowest defect densities, and it was -- it wasn't a sometimes or most of the time, it was all the time. So that was a very simple technique I want to point out: start doing those early. System level exceptions; this is actually testing the opposite of a requirement. I wouldn't wait to the end to do that. UI testing; what I have found works really well -- a lot of people will test the UI all by itself, in, like, kind of a vacuum -- I found that it works out really well when you test the UI as you need it.
So if you're testing something and you need to use your interface, test it while you're testing it, and kind of map the tests together so you know when you've tested the UI. That's just one suggestion I have. Performance testing; you're going to need criteria before you start the test to do that. Stress testing; there's a fine line to cross in the stress testing. If you do it too early, it might not be valid, because the code might not all be in place. You want to do the stress testing towards the end of testing so that all the code's done. But if you wait until the end, you will run out of calendar time. Stress testing takes calendar time. It can go on for days, maybe even weeks. Just think about the ambulance system I told you guys about. You want to do the stress testing at the end, but not so far at the end that you run out of time. So let's say maybe the last two weeks of system testing there should be stress testing, kind of in parallel with whatever is going on. Okay. So don't wait to the very, very end, but don't do it too early either. For these two it definitely depends on what your application is as to what you do and don't do. A regression test; do use a risk assessment to optimize this test. An acceptance test; do run this test last.

Okay. So let's look at requirements validation. The purpose of the test is to verify each testable requirement. The result is the traceability matrix. I'm sure that's what you guys generate when you do this. So you have the SRS, you have a traceability -- if you put it in ClearQuest, you have it all mapped out. Okay. Major weakness of this test: it only tests what's in the SRS; that's its major [Inaudible]. So if something's not in the SRS, does it get tested? Maybe, maybe not. Okay. So it's only as good as the requirements document. So one of the things that's often left out of an SRS is the abnormal behavior. That's a pretty common thing. It happens to everybody. They tend to forget to put the abnormal behavior in the SRS. Implicit requirements may or may not get tested. When you guys do your requirements validation, do you also test implicit requirements, like derived requirements? Anybody know? Okay. Well, I'll show you an example of that later. Okay. It's a good time to cover use cases if you didn't do it somewhere else. Even though the goal of the requirements validation is to cover the requirements, you still want to prioritize these requirements.

Okay. Executing the test: create a table to hold all the requirements in the SRS, give each one of them a unique ID, identify other requirements to capture negative behavior. This step here I'm going to show you an example of from our software reliability software. What this means is you go through your SRS and you come across a requirement and it's a positive requirement, meaning it says what something shall do. What you should always do is look through the rest of the requirements and look for a negative requirement, which would mean a requirement for what should happen if the software doesn't do what that requirement said. In the SRS I gave you in my example, there were, I think, five or six negative requirements that covered all the other requirements. So in this step, this is when you look through and find other requirements that would test negative behavior. You would identify any derived requirements, assign priority, assign test case numbers, run them, and so forth.
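Here is a minimal sketch of that table-building step, assuming a simple CSV layout; the requirement IDs, wording, and column names are hypothetical stand-ins for whatever your spreadsheet or requirements tool actually uses.

```python
import csv

# Hypothetical sketch of the requirements-validation table described above:
# every SRS statement gets a unique ID, a priority, a test case number, and,
# for positive requirements, a pointer to the requirement covering the
# negative behavior. The IDs and text are made up for illustration.
requirements = [
    {"req_id": "SRS-010", "kind": "positive", "priority": 1,
     "text": "The software shall predict fielded defect density.",
     "negative_req": "SRS-4.2.3", "test_case": "TC-001", "status": "not run"},
    {"req_id": "SRS-4.2.3", "kind": "negative", "priority": 1,
     "text": "The software shall generate an alert when defect density cannot be computed.",
     "negative_req": "", "test_case": "TC-002", "status": "not run"},
]

# Flag any positive requirement that has no negative counterpart,
# so the gap can be reported back to whoever owns the SRS.
for req in requirements:
    if req["kind"] == "positive" and not req["negative_req"]:
        print("No negative behavior found for", req["req_id"], "- report this gap")

# Write the table out; an Excel or ClearQuest setup does the same job.
with open("requirements_trace.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=requirements[0].keys())
    writer.writeheader()
    writer.writerows(requirements)
```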
So here's an example of a test template. I'm going to show you the Excel spreadsheet with all this stuff. This has everything in it that we had in the integration testing. Okay. All right. Another thing you want to capture in a system level test, which I'm going to show you in your template, is, when you do system testing, you're probably going to have hundreds, thousands, maybe even tens of thousands of tests. Okay. So you're going to have at least one test for every requirement plus a whole bunch of other tests. So one thing that's good to have for a system test, which I didn't show you for integration and unit testing, is an executive summary. I want to point this out. These are just little simple things that can help make your process very effective. The organizations in my database that summarized this, which is your status, in one table just like this, their defect densities were 11 percent lower than others that didn't, and all they did was create this table; that was it. So let's imagine you have a test plan with 10,000 tests in it, or maybe even 1000 tests, let's make it simple. You have a test plan with 1000 tests; so this would summarize the status of 1000 tests at any given point in time, let's say you update it once a day, once a week, whatever. Why do you think that would be so effective as to reduce the defects by 11 percent versus somebody else who has 1000 test cases and no summary table? You can see exactly where you are. Exactly. It is your metric. This is the metric of system testing: where are we? And I have gone into organizations where they do a fantastic job of testing, so wonderful, their plans are so fantastic. Somebody from upper management says, where are you? I have no earthly idea. I don't know. I'll get back to you and let you know later after I count up my 1000 test cases and see where we are. Normally, this is automated. And in the template that I've given you guys on your CD, it's actually all linked. When you fill out your procedures, it keeps track of how many test cases you have, and it actually keeps track of the status. I'm pretty sure ClearQuest -- I'm sure ClearQuest would do this. I'm sure you guys probably have it set up to do that as well. So this is a really important thing I would not skip.

Okay. Do's and Don'ts. Do review the SRS well in advance and identify discrepancies before the SRS is finalized. Do use the criteria sheet I'm going to give you shortly. Do create a table to ensure all of them are covered. Do run realistic use cases at the same time. Sometimes you can combine an SRS test with a use case test; so that's a good thing to do. Do search for negative behavior in the SRS and report it if you don't find it. So normally, what I do is I list every SRS requirement. I go look through the rest of the SRS to find a negative behavior. If I can't find one, I put a big red X that says, okay, what do we do when this doesn't happen? Okay. Plan on executing other tests for abnormal behavior. Do summarize the status.
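As a rough illustration of the executive-summary idea, here is a small sketch that rolls a test table up into one status line per outcome; the file name and status values are assumptions carried over from the previous sketch, not part of the class template.

```python
import csv
from collections import Counter

# Hypothetical sketch: roll a large system test plan up into one executive
# summary (total / not run / passed / failed), the way a linked spreadsheet
# or ClearQuest query would. "requirements_trace.csv" and its "status"
# column are assumed names from the earlier sketch.
with open("requirements_trace.csv", newline="") as f:
    tests = list(csv.DictReader(f))

counts = Counter(row["status"] for row in tests)

print("Total test cases:", len(tests))
for status in ("not run", "passed", "failed"):
    print(f"{status:>8}: {counts.get(status, 0)}")
```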
Okay. I want to show you an example. Now here's an example of what I wanted to show you. The system testing checklist that I'm giving you is actually all set up to do everything I told you about. Let me get to the right page here. Okay. So here's the checklist. It has all the memory joggers, and a lot of these memory joggers we haven't talked about yet. But -- so here's my checklist with lots of memory joggers. For example, these are all the things that we talked about here. Okay. Then here are my procedures. And the way I have this all worked out is that, ideally, what I'd like to do is write a procedure for everything I answered yes to in my checklist. And in this particular case, this is a requirements test. Okay. If you guys look at the spec for my example software, this is one of the very first requirements. Okay. The software shall predict fielded defect density using only the [Inaudible] level of the software. So my test is that -- the test is actually listed over here. What I'm going to do is I'm going to test it with everything that was in this spec and make sure the result's positive. So that's pretty straightforward. Right. We can pretty much figure that out. It applies to all the use cases, which I'm going to show you later. I'm going to show you what the use cases were later. There's a normative/informative column. I don't think anybody asked me about that. When you guys develop specs, it's possible that there could be informative statements in the spec. So, like, for example, what I just said is an informative statement. Okay. Lots of times specs might say, for example, this is how this works. That's an informative requirement. And so what I normally do, and what other organizations do, is they'll actually parse all the informative statements into the spreadsheet, but they'll mark them as informative and just leave them. And the reason why is that the informative statements help you test the normative statements. Okay. So for example, if somebody stuck in a picture of a state diagram, that would be an informative example. That would help you test the normative statements, which are how that state diagram works. Okay. So anyway, that's what that is. This is a normative statement. This is my setup required. I need to open a file. Okay. And this is testing that requirement, which is pretty straightforward.

Now what's interesting is I go down and I capture the negative requirement. Okay. The negative requirement, if you look through that enormous SRS, was way down in section 4.2.3. So the negative requirement -- and I intentionally showed this -- lots of times your negative requirements are somewhere else, like at the end. So in this case, this is our positive requirement on the left. The negative requirement in section 4.2.3 was the software shall generate an alert when defect density for each of these predictions cannot be computed. So good, we have a negative requirement. That's fantastic. So at least our spec was somewhat complete. So now I would have a test case to test that. Now this is, of course, purely optional for you guys, but I always put the positive requirement and the negative requirement together in a test plan; this way, I make sure that I test the negative requirement. Does that make sense? Okay. Because, otherwise, if you wait to the end and test this requirement, you could forget what the positive requirement was. So here's an example of the requirements testing that I was talking about. I mapped it to the negative requirement. Now if there had been a case where there was no negative requirement, I would have put a line item that would have basically said there's no negative behavior for this; I need to go contact somebody. But in this case, I had one, so I just write it in. Does this make sense? Okay. All right. Now let me show you the coverage. Now right now -- right now, my example spreadsheet has two tests in it, but you can see if I start filling it out and populating it, I'm going to have hundreds of tests.
So as I fill them out, it actually goes through and it keeps track of how many things I entered over here and how many of these requirements I have. So right now it knows I have two test cases; I've run none of them. None of them have passed and none of them have failed. So this is just, you know, a suggestion. It doesn't have to be real difficult to keep track of these statistics. Okay. Let's go back. Okay. Now hopefully, at this point, you guys have seen a progression. I want to show you guys the mathematical formulas here. I want to show you -- oops -- that is not what I wanted to show. What -- [ Demonstration ] Oh, [Inaudible]. All right. I've got to go out. I had the wrong link here. Let me see if it's up here. No, wrong link. Darn it. Okay. Let me show you something real quick. [ Demonstration ]

Okay. Now that we're at the end of this course, I want to show you the progression of things. We had a unit level test, an integration level test, and a system level test. So if we use this diagram as an example, our unit level test would have been basically testing one of those boxes, like, that box; that was that one requirement that I just showed you. This box would have been a unit test all by itself. This box. This box. Basically, these five blue boxes would have been where we did five unit tests, because there's one function for each one of them. Okay. Then at the integration level, do you guys remember what we tested at the integration level? We tested to make sure the boxes happen in the right order. Right. Now with the system level, what are we testing? Well, I can tell you. We're testing to make sure the number at the end is right, under a whole bunch of circumstances. So one of the things I'm going to show you next in class, after we take maybe one little break, is I want to show you that in a system test we're assuming what's going on inside the box actually reasonably works, although we're not shocked if it doesn't. We're hoping that what's inside the box works and that the sequence of the boxes works. Now we're taking it to the next level, which is, what are the different types of scenarios I have, you know, who would be using this software? What would they stick in the software? And then we test the use cases. So now we're testing the whole thing. So I showed you this example so I could kind of point out why we have the three levels of testing. Okay. Okay. So hopefully, you can see these were the differences in my mathematical example, why I have three different levels of testing. Okay. I think we're not quite due for a break yet; so I'm going to do one more section and then we'll take a break.

Okay. The next thing I want to talk about is the opposite of the requirements validation, and that is to test -- oops -- what the software shouldn't do. Okay. Exception handling we've learned. We could apply it at the module level, the integration level, and the system level. At the system level, what we want to do is take each requirement and figure out how it could not work. So at the very high level, what could go wrong? Okay. We could identify system level exceptions; some of these may be documented. Do you guys ever get any of these use cases on the software you write? Do system engineers ever say, okay, this is what we don't want to have happen? Okay. Good. A systems engineer is a great place to get these. Collect any documentation that tells you what they are and then basically instigate these one at a time.
These are exactly what we talked about at integration testing, except we're testing at a system level, which would mean CSCIs talking to each other; what could go wrong there. Okay. This list of things we actually saw before also. This is the same list we saw for integration testing, except what's different is the alarm we're testing is a system alarm, not one in between two things. Okay. So with my software, for example, an exception in one of the formulas would be a module level alert. An exception in that sequence diagram would be an integration level alert. An exception where the user entered a whole bunch of valid data, but we still can't get an answer, that would be a system level alert.

Okay. All right. Let's use this soda machine as an example. I found this soda machine works really great for this. Okay. Some system level failures could be: you select a soda -- you select Diet Coke, but you get a Coke. You pay for a soda, but get none. I've had that happen. You don't pay for a soda, but you get one. Anybody ever have that happen? That's happened to me. You get more than one soda when you pay for one. So these each have a different risk. Right? The last two are risks to the vendor. They don't want too many of those or they stop making money and they pull the machine out. So these have different risks, of course, but you can see these are all system level risks. Okay. They don't have to do with interfaces. They don't have to do with modules. It has to do with the soda machine. So does that make it really clear? Okay. Good.

All right. Let's take our first system level failure: you select one soda but get another. Okay. If you try to negate that requirement, you get 'a human being does not get the wrong soda.' That's ridiculous. I can't even say that without stumbling. Okay. So taking that requirement and negating it is not the way we want to do this. What we want to do is find out how a human being could get the wrong soda and test it. This is where things like software FMEAs and fault trees are really useful. I mean, I'm sure you guys are aware of what's going on right now with the acceleration problems on certain cars. Since we're on videotape, I'm not going to say whose cars, but you guys are aware of these acceleration problems. They've actually been around for years. This is not the first car that's had this problem. Well, a software fault tree -- or I should actually say a system fault tree -- is one way to take a really difficult problem and figure out what could cause it. So the very top of the fault tree would be we got two sodas instead of one. What are all the ways that we could get that? When you get to the bottom of the tree, you have something you can test. Okay. So those things are useful for that. But honestly, a lot of times if you just think about it for five or six minutes, you'll figure it out. How could we get two sodas instead of one? Or the other option is observe it for a long time and make sure we never get two sodas. One reason why I pointed this example out is -- and this problem has happened at NASA; I did some research into all the failures that have ever happened at NASA -- sometimes testers are so busy looking for one thing that they don't notice the other thing. So you're interested in, okay, did a Diet Coke come out? Yeah, it came out. Oh, two of them came out, okay, whatever. You see what I mean. You sometimes need to look beyond what you're testing and say that wasn't supposed to happen.
History is just full of cases where people actually saw bugs, but they weren't looking for them, so they didn't record them. And they didn't realize, hey, that was actually kind of important. Now the soda machine got shipped and we're not making any money, because it's giving out two sodas. So anyway, one thing I do want to point out is it's a good idea when you're testing to record anything unusual. It's a very good idea to record those things.

Okay. All right. So Do's and Don'ts for system level exceptions. Do review the SRS for missing negative behavior and point it out when you don't find it. Don't negate the 'shall' statement to get a negative behavior. It usually does not work. Do use some analysis, like a fault tree or past failure history, to find the negative behavior. Don't assume that negative behavior can't be in the SRS because it's untestable. I was told by several people in the defense community that they had rules that you couldn't put anything untestable in a software requirements spec, because you have to test it, and so for that reason, negative behavior never got into any SRS. Well, if we get back to the soda machine example, okay, these things aren't really testable as is. All we can do is observe and make sure they don't happen. To make them testable, we have to do some work. We have to make it a testable requirement. Okay. How could we get two sodas? Figure it out. Run a test case for it. So I don't know if that seems obvious to everyone. But anyway, report any gaps immediately, preferably review the SRS before you get to this point, and test each alarm code and its recovery. Okay. Now we actually are ready -- we already did this example. I already did it with you. We reviewed this requirement. We found it had another requirement. So I actually already did this one. So let's go ahead and move forward.

Okay. I want to go over UI testing and then we'll take a break. UI testing, or user interface testing, verifies buttons, menus, fields, et cetera. You also -- one part of UI testing is to make sure the cancel key works. Back in the old days, back in the pre-Windows days, in the pre-Mac days, this was like a long-standing joke: don't ever press the cancel key, because with a lot of software, when you pressed the cancel key, absolutely nothing happened. So anyway, that's why we always include it for completeness. So always test the cancel key. You also test the fields. For fields, you test the type, max, min, and default values. Any online help, you need to make sure that it matches the software. You need to make sure that when you press a button, the right thing happens. One of the things that I've seen happen -- and someone asked me this during the break -- is whether or not the software testing tools were effective. From my experience, the tool is fine. The tool's fine. They work. What happens is it's how people use the tool. They use them to press the button to make sure the button presses, but they don't actually test what happened after the button was pressed. So that's one thing to keep in mind with the UI testing. Personally, what I would do is combine it with the other tests. It's just food for thought. Okay. Don't test it without defining the expected results ahead of time. This is actually why the tools often don't help, because the expected results aren't defined. Don't do exhaustive UI testing before verifying what happens behind the button. Okay.
Personally, I would test the UI as you're testing the software and make sure the UI, as a whole, works kind of towards the end of testing, because UI bugs are actually fairly easy to fix. Don't spend more time on automation than it would take to run the test manually. I have seen this over and over and over again. I worked with one company where the software tester spent three weeks to automate a test that took five minutes to run by hand. Okay. That's not what you want to do with automated tools. It's okay to say we're testing it by hand. Do run the most common UI operations first. Don't test from left to right or top to bottom. I've seen people -- that's one thing that people do with the tools. They'll go through and test from left to right and top to bottom; that's not really the best way to find user interface bugs. You want to test what people do with the software. Try to test it with the manual. One ideal approach is to test the UI in the exact order as the user's manual. And then when you get done, go back and fill in whatever you didn't cover. So that's basically my point there. Do report any gaps in the specification immediately.

Okay. I want to show you guys an example here. Let's -- here we go. Here's the user interface design document. I just kind of want to show you what one would look like. We're not going to spend a huge amount of time on this. This is the user interface design for that software I just showed you. Okay. Here is -- if you remember -- remember when I showed you the results field, there were all these mathematical things that popped up; here's the spec for each one of them. Here's the format, which is how many digits are shown on the page; the min, the max; there was also a prohibited value. If you guys remember from the spec, the one thing we could never, ever tell anyone is that the reliability is one. That would be a huge mistake; so we can't do that. One is prohibited. And a failure rate of zero is prohibited. So basically, we would go through -- and this is where we could use the tools -- the one place where the tools do come in handy is assessing this stuff. Okay. Go through each field, plug in a value that's valid, plug in one that's invalid, tell us what happens. So that's where it would be immensely useful to use a tool, and so forth. Here's the rest -- I also have -- let's see -- here is -- oops -- I had the pull down boxes. I did have a menu too. Okay. Here's the menu. So your user interface design should say what happens with menu items, and so forth. So this would be the basis for the test. Do you guys have a lot of user interface design documents? Are they something that you have separately? Or are they just part of the spec? Have you guys ever seen a document that just says this, like what I just showed you? Okay. Good. All right. So that's basically what you would use as the basis for your test. Okay. What I want to do now is take our last break for the day, and then I think I'm actually on target to finish by five.
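As a rough sketch of that field-level checking, here is one way it might look in Python; the field spec, the limits, and the check_reliability_field function are made-up placeholders, not the actual example software.

```python
# Hypothetical sketch of field-level UI checks: for each field in the user
# interface design document, try valid values, the min and max boundaries,
# and the prohibited values, with expected results defined up front.
field_spec = {
    "name": "reliability",
    "min": 0.0,
    "max": 0.999999,          # 1.0 is a prohibited value
    "prohibited": [1.0],
}

def check_reliability_field(value, spec):
    """Return True if the value should be accepted by the field."""
    if value in spec["prohibited"]:
        return False
    return spec["min"] <= value <= spec["max"]

# Expected results are written down before running the test.
cases = [
    (0.95, True),                   # typical valid value
    (field_spec["min"], True),      # boundary: minimum
    (field_spec["max"], True),      # boundary: maximum
    (1.0, False),                   # prohibited value
    (-0.1, False),                  # below minimum
]

for value, expected in cases:
    actual = check_reliability_field(value, field_spec)
    result = "PASS" if actual == expected else "FAIL"
    print(f"{result}: value={value} expected_accepted={expected} actual={actual}")
```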

[ Silence ]

Module 7

Anne Marie: Okay, we've only got one more hour left, maybe less. So I'm going to try and get done a little ahead of time, since it is so hot. The one last thing I wanted to show before I move on to the rest of the tests is the use cases. In my example, this is an example of the use cases. [ Background Noises ] Actually, I'll wait for everybody to pass those around. [ Background Noises ] Okay, I wanted to show you guys the final part of the system test plan that I had for my sample software. These were the use cases, so I'm sure some of you could probably very clearly see the unit [inaudible] and the integration [inaudible] in the software example I gave you. What I didn't show you was the use cases. This is an example of the use cases for this software. The use cases were directly tied to the projects that the people using the software would have. So here's the real thing; I'm showing you the real use cases for this software. I could have defense applications, so whoever is using this software could have a defense application. They could have high duty cycle, low duty cycle, meaning they could be developing software that has a high duty cycle or a low duty cycle. So, for example, something that has a high duty cycle could be, you know, whatever ground control system is monitoring homeland security, that's always running, whereas something with a low duty cycle could be whatever actually is on a missile. Whatever software is on a missile would have a low duty cycle. So, anyway, the use cases were determined by the duty cycle, the volume, the trend of the duty cycle, their mission, their growth rate, etcetera; all these things came right out of the spec. So we identified use cases, and basically these use cases were, we know who's bought the software, we know what they do; it came right out of sales, okay. We know these companies bought it, we know they're building XYZ, that goes in the use case table. So this was the very tippy top of my pyramid that I showed you, and I waited until the very end to show you that. So we would go through and, as part of our system test, in addition to the requirements verification, we would go through every one of these; there was a file created for every one of these, and we executed them to make sure the answers were right. So that was the last thing I wanted to show you on use cases. [Background Noises]

Anne Marie: Okay, oops. No, somehow I got, I. [Inaudible Speaker Comment] Anne Marie: No, that was me. I went back to the beginning. Okay, I think we're close. All right, performance testing. Performance testing is not always a requirement. You may not have any performance requirements; I'm pretty sure you guys do, but you may not have any. It's basically to make sure the system can handle expected volumes, resources, response time, and lots of times the requirement's implicit as opposed to explicit, which means in your software spec there might be something that says, "The performance has to be XYZ," and then there's a bunch of derived requirements for how we get there. So, anyway, do run a performance test once the software is stable. I know a lot of people that will run the performance test while the software is under development; probably not the best thing. And you don't want to wait until the very, very end either. This is like a stress test. Somewhere in the middle there is where you want to run these tests. The software needs to be operational so it can get through the test, but you don't want to wait until the very end, because performance bugs, if you want to know which bugs take the longest to fix, these are it. Trying to make the software faster takes a lot of time, so you don't want to wait too late, but you don't want to do it too early either.
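Here is a minimal sketch of a semi-automated performance check of the kind just described, assuming a response-time criterion defined up front; the two-second limit and run_prediction are placeholders, not real requirements.

```python
import time

# Hypothetical sketch: verify a response-time requirement once the software
# is stable. The 2.0-second criterion and run_prediction() are placeholders;
# the real criterion comes from the performance requirements in the spec.
RESPONSE_TIME_LIMIT_S = 2.0

def run_prediction():
    # Stand-in for invoking the operation under test.
    time.sleep(0.1)

samples = []
for _ in range(20):                    # repeat to get a stable measurement
    start = time.perf_counter()
    run_prediction()
    samples.append(time.perf_counter() - start)

worst = max(samples)
print(f"worst case: {worst:.3f}s ->", "PASS" if worst <= RESPONSE_TIME_LIMIT_S else "FAIL")
```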
Okay, stress testing. This is one test that I don't see enough of. If you want to know where the big gap was in my database of projects, nobody did stress testing well. Not even the best of the best did this well, so these are some tips. While performance testing verifies response time, stress testing verifies that the software can operate continuously for a long time. Okay, during testing, it's typical to stop, start, stop, start; that's how you run the tests. With a stress test, you want to not shut the computers down. Lots of times, if you're pretty sure you're going to have problems with the software, you might want to even run this in debug mode, so that you know exactly where it fails. See, one of the problems with stress testing is it could take a week to run into a problem. And if you run into a problem and you're not in debug mode, you may have to run it another week in debug mode to figure out where to fix the problem. So, anyway, it's a suggestion that you run it in debug mode. What I suggest is you run the stress test in parallel to the other tests. You cut, like, some baseline somewhere in the middle of testing, run the stress test, and just let it go.

I have one really good example of this. I used to work in the equipment industry, and during testing, we used to test this big massive piece of equipment, and it had two modes of operation, manual and auto, and so whenever we tested, we were always in manual mode, because we were trying to verify some intricate algorithm. Well, we don't want this thing processing five hundred chips while we're testing this algorithm, so we'd put it in manual mode. Well, the one thing that would happen is, nobody ever switched it to automatic and let it run for, like, two weeks. So when I told the software testers to do that, they were not happy; these were my people. Oh, it will waste the equipment, now we can't run another test. I said, "Run it." It got to about thirteen hours and died. So, I mean, we didn't even have to wait a week, and, you know, basically it's just one of those things. This is the kind of bug your customer, or your end user, will find immediately if you don't do this. Okay. They often do require a separate computer; that's the one bad thing about them. Okay, do identify in advance what you're going to run for twenty-four, thirty-six, forty-eight hours. It's got to be something realistic. Do run the software in some operational mode for an extended period of time. Try to make use of stack analyzers and debuggers. I know a lot of people who put their least experienced people on stress testing; that's not really the way to do it. You need your most experienced people on stress testing. It needs to be somebody who can get down at that bit level and figure out what's wrong. So most of the time, it is someone super experienced that does this.
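A bare-bones sketch of the kind of stress driver being described might look like this; process_input, the data, and the 48-hour duration are placeholders, and in practice this would run on its own machine, possibly under a debugger.

```python
import random
import time
import traceback

# Hypothetical stress-test driver: keep injecting data continuously for a
# long, pre-agreed period (e.g., 48 hours) and log anything unusual instead
# of stopping and restarting. process_input() is a made-up stand-in for the
# real operation under test.
DURATION_S = 48 * 3600

def process_input(value):
    return value * 2          # placeholder for the real software

start = time.time()
iterations = 0
with open("stress_log.txt", "w") as log:
    while time.time() - start < DURATION_S:
        try:
            process_input(random.random())
            iterations += 1
        except Exception:
            # Record the failure and keep going; the whole point is to see
            # how the software holds up over days of continuous operation.
            log.write(f"iteration {iterations}: {traceback.format_exc()}\n")
print("stress run complete after", iterations, "iterations")
```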
Okay, configuration testing. The purpose of this is to verify any platform. So, for example, let me think if I can come up with a good example. [ Background Noises ] I want my software to print to a PDF, I also want it to print to a black and white printer and look good, and I want it to print to a color printer and look good. Believe it or not, I do have to run those three tests, because those three things lots of times don't work. So my platform would be a color printer, a black and white printer, could be laser, could be dot matrix; I might print to an Adobe file; but those are all platforms. Okay, another thing that could be a platform test: I work with people who test by chip set, so, for example, they might test all their software on an AMD processor, and then they might also test all their software on an Intel processor, because for whatever reason, they think something's going to be different. Okay. So I'm not sure how this would apply to what you guys do here. I have a feeling your configuration's pretty well set here. Okay. So this may or may not be applicable. Well, when you say platform, you're not talking OS? Anne Marie: I'm going to talk about that later. Let's see, I think I have that, I have it under compatibility. So, yeah, it could be under here as well. Okay, so for this test, I'm really just looking at the hardware, but I could have folded OS into it. Okay, so one of the things you might test, for example, is multi-usage capabilities, making sure the software can run with a bunch of different hardware at a time; that could also be a configuration test. Another. [Inaudible Speaker Question] Anne Marie: Yeah, that would be a configuration test. I can't believe I left that one out. So, getting back to this page, one huge configuration would be, it runs with both a PC and a Mac. That is the perfect example of what I was looking for. Thank you. Very good one.

Okay, so the configuration umbrella includes a lot of tests, and one of them is installation testing, okay. We could have said an installation test was a separate test, but I decided to fold it under configuration. Okay, when you go to deploy your software, do you not have to have an installation script? I mean, your software may be the type where you may actually put it on in manufacturing. I'm not sure how that happens. But, assuming your software gets installed through an install script, these are the tests you'd have to go through. These are the typical things that can happen during installation. You could have software that's working completely beautifully, but it doesn't install right and the software doesn't work. Have you guys ever bought something from a store, and you bring it home and it doesn't install? I admit it, I am astonished at how often that happens to me. I just bought a major, major financial software package, brought it home, and it doesn't install. I have no idea why. All I know is that it doesn't, and I just have a simple, little, innocent PC with Windows running. So, anyway, what you're looking for here is what the typical installation problems are, like not having the right privileges. So, for example, a lot of people will develop an install script that requires admin privileges. Well, if your end user doesn't have admin privileges, it's not going to install. You also need to consider the impact of firewalls and virus checkers. You might have to tell someone to turn these things off before they install the software. It could be that the software, oh, one of the things I want to point out here, the very last bullet is actually the most important.
I just realized what I was trying to say. Don't run installation tests on a computer that has the development environment on it. The idea behind an installation test is you want to test it on a computer that's just somebody's computer. I will tell you, with the software that I showed you in class, that software program, you may find this shocking, but out of all those tests we saw this week, and I've run all of them, you know the one that takes the longest is this one. I have to go get two fresh computers that were never used for development, one with XP and one with Win 7, because we're not supporting [inaudible], and I have to install the software and run it on both of them, and that takes longer than all these other tests put together. And one reason why is, if you test it with the development environment on it -- can any of the software engineers tell me why we can't have the C++ compiler, why we can't have all the libraries, why did I say that? [Silence] Anybody know? Because the purpose of the install test is to make sure that all files needed for the system get on the computer. If the development environment's on the computer, those files are already there. So this is the single most important thing you've got to remember: do not have the development environment on the computer. Okay, you need to verify any written requirements, such as minimum disk space, whatever.

Okay. So, these are some of the common installation faults. Software can't update because of limited bandwidth. I would think that would impact you guys, particularly when the software is in outer space. Don't you guys have to make sure you can update it from down here? Yeah, I would think that would be of huge importance. Software installed with the wrong default configuration. So you installed the software, but the default values are all screwed up; that can happen. The software install terminates, but doesn't tell the user anything about it. This gets back to the financial software I just installed. It said I can't install, but did it tell me why? No. I had no idea. All I know is it didn't install. Software install is incomplete but doesn't tell the user. So it doesn't install, but you think it's installed. That would be bad. It could have compression errors, which are pretty common. It can install or update with insufficient permission. So, for example, let's say the end user's got whatever the lowest permissions are, and this update requires something just above it. Okay, you would need to let them know that. Or, the opposite of that, it doesn't install and they have the correct permissions. Can anybody think of anything else? You go to the store and buy any software package, and I guarantee you, if you wait long enough, you'll run into probably all of these. Okay. So, these are things to think about with testing the install.
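One small sketch of the "all needed files make it onto a clean machine" check might look like this; the manifest file name and install directory are made up, and the real check would run on a computer with no development environment on it.

```python
import os

# Hypothetical sketch: on a clean machine (no development environment),
# verify that every file the application needs actually got installed.
# "install_manifest.txt" and the install directory are made-up names.
INSTALL_DIR = r"C:\Program Files\ExampleTool"

with open("install_manifest.txt") as f:
    expected_files = [line.strip() for line in f if line.strip()]

missing = [name for name in expected_files
           if not os.path.exists(os.path.join(INSTALL_DIR, name))]

if missing:
    print("Install is incomplete; missing files:")
    for name in missing:
        print("  ", name)
else:
    print("All", len(expected_files), "expected files are present.")
```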
A normal shutdown would be, go over there and tell it to turn off. Okay. Maximum number of users, that's an obvious one. Minimum number of users, a lot of people forget to test this. They test client software, and they forget to test it with one client. It could be that it doesn't work. So, that's what that is. Typical number of users would be whatever we think is going to be on the system. Okay, multi-user capabilities. A multi-user software system would be a system that is designed to run on a network and be used by multiple people, so this would be part of the previous test here. You'd want to make sure that, if two people are working on the same thing at the same time and the software allows this, that we don't have loss of function, loss of data or inconvenience. Things you might look for is, you know, basically, if two functions are operating at the same time, what happens? Does somebody have to wait? Is it totally transparent to them? Do you get a loss of data? So forth. So, a simple example of this would be, you've got some files on the network, two people try to access them at the same time, what happens? Do both of them view it, and the software handles it behind the scenes? Or, is one of them told, you've got to wait, somebody's using it. So, that's what we're testing here. Okay. Compatibility, this is where we now talk about the operating system; I decided to put the operating system as part of compatibility testing. It could have also been a configuration test. Okay, this is when we verify the software in this system works with other software. Okay, a typical compatibility test is various operating systems. So, in this case, in a configuration test, we would test a Mac and a PC. In this type of testing, we'd test, okay, Windows 7 SP whatever it is, Windows Vista, whatever the last SP was. Okay, so, we're actually now testing versions of the Mac, versions of the PC. Okay, and this is probably the most common type of test people do. For you guys' software, do you guys freeze the operating system? I've seen this lots of times in defense and space. Well, they'll freeze the operating system and say this is what we're going with. We're going to go with XP, period, and if they come out with something else, we're sticking with this. Do you guys ever see that? Or do you have to support multiple? We usually are forced to use a specific version. Anne Marie: I would think so, because that's the only way you could freeze your software. If you have to run all your tests on two operating systems, that would be just. We do have to deal with PC and Mac issues [inaudible] Anne Marie: Oh. It is? Anne Marie: I did not know that. [Inaudible], yeah, on the PC you're running this version. Anne Marie: Okay. [Inaudible Speaker comment] Anne Marie: I'm surprised at that, I didn't realize you guys, I would have thought it'd have been two different operating systems. Okay. That's interesting to know. Okay, so that's where the various operating systems are, it would be the versions of them. Another typical compatibility test is internet browsers. And, at this point in time, for all practical purposes, there's one. Okay, so what, I mean, Netscape's done, so, for most software systems, you're testing version whatever of Explorer. And, that's actually very important. And Firefox. Anne Marie: Okay, Firefox, okay, I take that back. And Safari. Anne Marie: And what? Safari. Anne Marie: Oh yeah, for the Mac, okay, I was thinking of the PC. Okay. All right, I take that back. Safari will run on a PC as well. Anne Marie: It does? Yes.
Anne Marie: Man, I had no idea. All right, I take that back. There are many internet browsers. Okay, I am not a Mac person. I'm a PC person, so I'm totally ignorant of what they do on Macs. Okay, other third party software. Some examples of other third party software: that could be COTS components, could be e-mail, any COTS software would be right there, that your software needs to interface to. So, like, for example, in my software, this wasn't actually very evident, I didn't put a lot of it into little examples, but my software actually has to export to word processing, spreadsheet, comma delimited, so, for me, I've got to work with all those other things. So, my reports have to be readable in those formats. That's where my compatibility would come in. Okay? So, anyway, so it looks like you guys do have a few of these issues. [Silence] Do's and don'ts of compatibility testing: do find out what the versions are and test them. The other thing you need to do is decide ahead of time what you're going to test. I don't know what you guys do, whether you run the entire suite of tests on each of your operating systems, or just run it on one and then do a check on the other, but this is something you need to decide ahead of time. You may or may not run the entire pass on multiple operating systems. Okay, or multiple browsers, like for example, if your software works both on Explorer and Safari, you may do a full set on one and a partial set on the other. But, in any case, it's something that needs to be decided ahead of time. Okay, another type of test is security. One of the things I was talking about during the break is, security is a thing unto itself. We could go spend a three-day class on security testing, and I'm not a security expert. All I'm going to do is summarize the key points of what's covered with security. First of all, the test is only run if security is part of your system, which nowadays it tends to be, part of every system. You would use the operational profile and user requirements to find out which of these things need to be validated: unauthorized transactions are rejected, while authorized transactions are accepted. A lot of people will test the rejection and forget to test the acceptance. You also need to make sure that the valid users aren't locked out, which can happen. There could be multiple authorization rules, which means there could be multiple levels of security, and that needs to be tested, not just a simple one-time you're in. It could be that you have users with limited processing abilities that cannot exceed those limits. Getting back to my equipment example that I worked on several years ago, this equipment had four different users. It had the super user, which would have been actually me, I could do anything I wanted to the equipment, because I know what I'm doing, if I want to get in and do something, I could do it. The next level would be an engineering level, which would be a customer, who's like me, could do almost everything, except update the software. And then, the next level down was the technician, so if somebody's got to get in there and fix the equipment, they could do a certain amount of stuff. And then, finally, at the very lowest level was the operator level, which was, you can press the green button, and you can press the red button, and that's it. Now, one important thing about testing security is, sometimes it could be very useful to test the lowest possible security level and see what the software does.
When I was managing a software group that was running the equipment software, I went down to the lab, and I knew there were four levels of security, and there's obviously some things that have to be tested at the highest level and so forth, but I noticed, after a while, they were running every test at my level, the engineering level. I said, "Do you guys ever switch it into operator mode?" And, they looked at me like I was from Mars. They said, "No." I said, "Why?" They said, "Because we can't do anything in operator mode." And I said, "And, who are 99.99% of the people going to be using this equipment?" "I don't know." I said, "Operators." I said, "So, if they can't do anything in operator mode, that sounds like a bug to me." So, anyway, my point here is, sometimes it's a good idea to just take whatever it is you have, [inaudible], put it on the lowest possible permission, test it and see what it does. It's a useful exercise, even if you're not a security expert. Okay, obviously, you want to test passwords. Security violations, when they do happen, you want to make sure the right thing happens, which is normally that they're reported electronically, and, finally, on repetitive attempts, shut whatever this is down. So, these are the key milestone points of what goes into security testing. [Silence] Okay, our next test is regression testing. The purpose of regression testing is to make sure that whatever changes we made to the software, since we started testing, are validated. We also want to make sure that anything that used to work, before we made these fixes, is still working. Most people do this one pretty well. It's this one that people struggle with. A couple of months ago, I was working on a project, which was coincidentally for NASA, and one of the things that I wanted to do, as part of this project, was I took a look at all of the serious software related disasters in the last 30 years. And, I looked to see what the root causes were, and I broke the root causes down into two groups. The software related bug, which would be what was wrong with the [inaudible], and then the process related bug, which was, how come our process allowed this to happen, and what I found, which is not too shocking, but the degree to which it happened is shocking. When I looked at the process related failure modes, this was it, making a change to fix a bug, and not testing the stuff that used to work. Would you believe that was number one? I mean, I believed it, but it's staggering to me that for thousands of these software failures, when you track back, you know, the actual bug varied all the time, but when you tracked back what went wrong in the process, this is it. So, have you seen this happen quite a few times? Well, let's put it this way, with some managers, regression is a four-letter word. Anne Marie: Oh. And, if you do get to do it, it's very, very minimal and you don't always test what you should be. Anne Marie: Okay. Testing.

Anne Marie: Why do you think, basically what you're telling me is, it's not on the schedule. Is that correct? Pretty much. Anne Marie: Okay. So, why would, besides time and money, which is one big reason, why else would regression testing not get on the schedule? What is the basic assumption behind not putting it on the schedule? It's going to be right the first time. Anne Marie: Exactly. It's the mindset. It's the whole reason why unit testing never gets on the schedule either. The fundamental belief is that there's not going to be any of these. I point this out to you, unfortunately, I didn't get this little fact into my slides, these slides were frozen before I came up with this, but you go out to these websites, there are two or three of them out there, where they've listed like ten thousand software failures, and they go over the root cause. You go through and you read through each one, I already did the work, so I can send you my little spreadsheet, I went through, and I categorized the bug, which was what was wrong in the code, and the process, why did they not find this, and, I am not joking, the majority was they did this. It wasn't that the bug was in there the first time around, the bug was introduced when they fixed another bug. To me, I just find that such a simple thing to fix, that I'm surprised it continues to happen over and over and over again. But anyway, so this is, probably if there's one slide I want you to remember, it's that one. I think I'm preaching to the choir though. Okay, so what you go over in a regression test: functionality that has been modified, that one's obvious; functionality critical to the end user, even if it hasn't been modified; functionality that could be affected by a modification, even if it's not the modification itself, just in the general area of the modification. Now, someone, in the beginning of the class [inaudible], in the morning, I can't remember who it was here, someone around here said something about the unit tests, that you liked to have the scripts for the unit tests, because then, when you went to this test, you knew what to execute. Was that you, Brett? Okay, that was you. You hit the nail on the head. If you keep your unit tests from the first time around, and you have them organized and they're all traced to the requirements, then when you go to fix the bug, you know, okay, that one, that one, that one, boom, we're going to re-run them, versus I don't have any idea how I tested this thing the first time around. Part of the reason why managers balk at regression testing is because they didn't put the infrastructure in to begin with. Anne Marie: Exactly. Yeah, it's not easy. Everybody's got to remember, oh, I forget what I did six months ago, when I tested this. So, very good suggestion by Brett, which I wish I had on my foil here. You also want to retest anything that's exercised frequently, or is high risk. This is where I want to go over a different problem. I'm trying to hit all the problems I see. When I look at, I'm a disaster person. I've met every disaster, not a physical disaster, like, I haven't been in a NASA disaster, but people call me when they've got a disaster. So, when the phone rings and somebody calls, Anne Marie, can you come out to our place, they had a disaster. And, when I get there, normally these are the things that I find.
In January, I was called out to a software disaster at a company, I can't tell you what they were doing, because I don't want you to figure out who they are, but they were making a safety critical thing, and they had, I wouldn't say it was a disaster, somebody didn't die, somebody didn't get seriously hurt, but they started to realize that the writing was on the wall. And, basically, when I went out there, I did a complete analysis of their bugs and their priority system. And, this company was doing regression testing, I'll give them that, they were doing it. They were following this process, and from the very top level point of view, it looked like they were doing everything right. But, when I got into the little nooks and crannies and looked at what they were actually testing, I realized their entire priority system, for prioritizing bugs and for doing regression testing, had missed one super important thing, and it was this one right there. When they prioritized bugs, they would prioritize the effect of the bugs. So, for example, some bugs might have a moderate effect, because it's more than annoying, but it's not disastrous. But, what they forgot to think about is, this bug is happening every sixty seconds, whereas that bug is happening four to five times a day. And, what happened was, they almost had their software recalled, because they shipped the software, and within a day, the doctor, I just gave away what it was, the doctors had found this bug happening once an hour, and it wasn't a serious bug, it really wasn't. It wasn't a serious bug, but the end user saw it happening five to six times a day, and they said, "Take it back, I don't want it. Makes me nervous. Makes me wonder what else is crawling around in the software." So, I would highly recommend that you always consider that a bug may look kind of harmless, until it happens five to six times a day. And that could impact you guys, because availability is super important for your system. If a moderate bug happens five or six times a day, your availability numbers are going to take a hit, even if it's a silly bug. Okay. Now, magnitude and duration of the regression test depends, obviously, on how many changes you had, how modular the software is, how late the defects were corrected, and whether or not the high risk tests were run first, before others, and I need to add my fifth bullet here, which is, how well you have documented the unit tests, because if the unit tests are readily available and you've got traceability, this can actually happen fairly quickly, so, good idea there. All right, this is a general template that I use. You guys may have some other templates that you might want to use. Yes? Just on your previous slide. Anne Marie: Yes. If you talk about re-running unit tests [inaudible] for regression testing, wouldn't you also re-run integration testing and system tests? Anne Marie: If your bug impacted the integration, yeah. If your bug impacted, that's a very good point, if your bug impacted an interface, yes. Very good point, so I've got to add a sixth bullet, which is your traceability to the integration tests. Okay, good idea. Okay, here is a template that I use for assigning priority, and this was the thing, the example I just gave you, what was missing: they were prioritizing this, and they were prioritizing that last column, but they were not prioritizing the one in the middle.
Okay, so the three things you want to think about are the impact on the end user and the system, that's super important. That is usually fairly obvious. People don't usually argue over those. Okay, it's critical, it's moderate, it's almost invisible. Frequency, once you start thinking about frequency, it becomes very obvious to you what it is. The problem is, nobody ever thinks about frequency, and I'm guilty of it too. I'll have somebody report a bug to me on my software and I'll think, well that's kind of medium priority, and then when I go and look at it, I realize these people are dealing with this error message, like, every day. I'd have to, I don't know, I'd have to drink multiple cups of coffee to be able to put up with this error message. So, then I realized, okay, well, it's more important than I thought. And then, finally, the last thing is, how stable is the code. You have different types of changes, right? A level one here is the functionality is changed regularly or the code is very unstable. Have you guys ever worked on code where, you know when you touch it, it's probably going to break? That would be number one. Okay, we've got a high risk thing here. It's not if it's going to break other code, it's when. Two would be, the code's relatively stable, it gets changed once in a while, but basically, it's fairly stable. And then a level three would be, this code is very stable. We don't anticipate any problem with it. Okay, so you can see, we've got three different perspectives here: how likely it is to cause a problem, how frequently it would happen if it creates a problem, and the impact or criticality. So, you can probably think of other ways to prioritize the tests, but these are the three ways that usually work well. Okay, determining and controlling the risk of changes means measuring two things. Now, let's assume that you come across some bugs and you're trying to decide whether or not it's worth fixing these, okay? There's two things you want to look at: the risk of not making the change, is there a workaround, if we don't make this fix, is there a workaround? Will there be downtime? Will we have a marketing disaster? Will it get in the newspaper? Things like that. Versus the risk of making the change: the number of modules that need to be changed, the stability of the modules, and the testing required. One of the problems I find a lot, and I don't know if you guys find it here at NASA or not, but, you know, there's a lot of software engineers who are actually a little too, what should I call it, too much of a perfectionist. For the [inaudible] software people, if you go in to write something and you find a bug, what's your natural intuition? There's a bug, I know it's there, I can see it, but I'm in doing something else, do I fix it or do I leave it alone? Most software engineers will fix it; it bugs us to know there's a bug in the code. That's how we think. There's a bug there, boom, I'm going to fix it. What's the problem with doing that? [Silence]
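Going back to the priority template for a moment, here is a minimal sketch, in Python, of how those three ratings could be written down and used to order a regression suite. The item names are made up, and combining the three levels by simply adding them is an assumption for illustration; the class only says all three perspectives have to be considered, it doesn't prescribe a formula.

    # A minimal sketch of the three-factor regression priority template.
    # Each factor is rated 1 (worst) to 3 (best), following the levels discussed:
    #   impact    - 1 = critical to the end user/system, 3 = almost invisible
    #   frequency - 1 = seen many times a day, 3 = rarely seen (assumed scale)
    #   stability - 1 = code changed regularly / very unstable, 3 = very stable
    # Adding the three levels together is an assumption; a lower total just means
    # "run this earlier in the regression pass."
    from dataclasses import dataclass

    @dataclass
    class TestItem:
        name: str
        impact: int      # 1..3
        frequency: int   # 1..3
        stability: int   # 1..3

        def priority(self) -> int:
            return self.impact + self.frequency + self.stability

    items = [
        TestItem("report export", impact=2, frequency=1, stability=2),
        TestItem("login screen", impact=1, frequency=2, stability=3),
        TestItem("help menu", impact=3, frequency=3, stability=3),
    ]

    # Lowest total first: unstable, frequently hit, high-impact areas get retested first.
    for item in sorted(items, key=lambda t: t.priority()):
        print(f"{item.priority():>2}  {item.name}")

Sorting by that total is what pushes the moderate-but-every-few-minutes kind of bug toward the front of the regression run, which is exactly the gap the template is meant to close.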

Do the testers know they fixed it? Are they going to go back and report a bug log and an SPR for what they changed? No. They're just going to fix it and keep going. So, the problem with making changes is that they always need to be tested, and it could be that somebody else doesn't even know that you made the change. So, the goal isn't necessarily always to make the change, that's my point. So, one part of regression testing is to take a look at your change and say, "Can we really live without this thing?" And, that will help control regression testing as well. I'm sure you guys have failure review boards for this, do you not? Okay, good. A lot of companies don't have those, so you guys are lucky. Okay, the acceptance test. Now, I was corrected during my break. At NASA, the acceptance test comes after the formal qualification test. At other organizations I've worked with, it was actually a one-step process, because their system wasn't as big as yours. For you guys it's a two-step process. So, during an FQT, or an acceptance test, the goal is not really to find bugs, the goal is to re-run some tests, run some real life scenarios, and sometimes it's run by the customer. Are you, is your guys' acceptance test run by? The end users. Anne Marie: Okay, good. And, it's usually the last checkup prior to delivery. Whereas with you guys, the FQT is the last checkup prior to acceptance testing. So, all these rules actually apply to both. Okay, is there anything else you guys do during an acceptance test, other than this? Okay, so for my software, I can tell you, myself for example, the final thing that gets run is all those use cases on my use case list, they're the last thing that gets run. Actually, that and the installation test. Okay. We actually already did this example, I did this example with you, so we can skip through these. Okay, here is, I thought I had more quizzes in here, but it looks like I only have one quiz. Name five things that should be verified in functional testing. Functional testing is when you verify the SRS, right? So, do you guys remember what those were? [Silence] All right, name anything. We won't stop at five. Well, the requirements, the abnormal behavior, right. The use cases could be one thing, right. And I'm not sure what my other two things were, they're probably on your answer page. Name two things you could verify in configuration testing. Well, a Mac and a PC is one example, things that are attached to your system, like printers and things like that. Regression testing, name two things, well, you want to test the fixes and you want to test stuff that should still work. Okay, true or false. Security testing is always required, no. Regression testing is always performed after a single corrective action. This is something I didn't talk about. So, you make one bug fix, do you do a whole round of regression testing? Is that true or false? It's false. Normally you do them in batches. Okay, functional testing is always part of black box testing, true. In fact, if you only do one of them, it would be functional testing. That was your quiz to pass the class, and I like easy tests. Am I an easy instructor or what? Okay, so I'm going to go over one last thing, and then I'm actually going to finish when I promised you, and that's the metrics and the recording of failures. Okay, like we saw before, the two go hand in hand. So, if we want to have some good metrics for system testing, we need to think about what's our goal for system testing, okay.
Well, the primary purpose of system testing is to provide functional coverage. So, it was the goal of the other tests to provide line coverage; the goal of this is functional coverage. How much of the stuff somebody can do with the software did we cover? So, that's our goal. So, keeping that goal in mind, some useful metrics are requirements coverage. I suspect, I don't know, I have to look this up in the little pamphlet that you gave me, James, but requirements coverage might actually be part of the NPR here at NASA. But, so basically, requirements coverage would be, we have five thousand requirements, how many of them did we test? That's a pretty simple metric, and I suspect it's probably required. Test productivity could be another one. Productivity means, how many test cases were executed by how many people testing over how many weeks. Why would you want to keep track of that? Let's say we have ten people and they test for three months and they execute a thousand test cases, why would that be useful information? [Silence] [Inaudible Speaker Comment] Anne Marie: What was it? [Inaudible Speaker Comment] Anne Marie: Yeah, because you can use it for the next project, right? Yeah, so for the next project you could use it to schedule how many people and how much time you need. So, it's a very good metric. Personally, I can't live without that one, because I don't like to guess at how much testing time I need. And the really good thing about this metric is, particularly when you have managers that like to cut testing time, you could come back and say, "This is our test productivity. We know we have this many tests, because I'm sitting here looking at the SRS, so we have to have this amount of time." And, lots of times people will be like, "Okay, since you presented me with some black and white facts, okay." When you don't have some black and white facts, then lots of times it's hard to get a schedule. So, if you come to an upper manager and you say, "I need three months of testing." And they say, "Why?" And you say, "Because that's what I think it will take." That's not nearly as convincing as, well, on our last two projects our productivity was X number of test cases per person, and when you add it all up, this is what we get. It's kind of hard for someone to argue with you. So, that's why I measure that one. Not only that, you also want to be right. You want to make sure the schedule's right too. There are also reliability estimation models that are used for measuring test effectiveness, those are called reliability growth models, and some organizations use them. I'm going to give you a very, very high-level view of these. Basically, they take defect data and they trend it, and they tell you whether or not your software is stable. That's what they do. I'm going to show you an example of that in a minute. And then, finally, there's acceptance sampling. I'm going to show you those two shortly. Okay, so this is an example of the test productivity, an example of that. One of the things I did want to point out, test productivity does tend to go up towards the end of testing. You guys ever notice that, those of you who are testers? On the very first day or two of testing, do you find, I'll be lucky if I can execute five test cases, and then, at the end, boom, I'm plowing through, like, twenty at a time? Well, that's because the bugs are out. So, don't be surprised if the trend goes up. What you want to have is that average.
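To make the productivity metric concrete, here is a minimal sketch of the calculation and of how it backs up a schedule estimate. All of the numbers are invented for illustration.

    # Minimal sketch of the test productivity metric: test cases executed per
    # person-week on a finished project, reused to estimate the next schedule.
    # Every number below is invented for illustration.

    def productivity(test_cases_executed: int, testers: int, weeks: float) -> float:
        """Test cases executed per person-week."""
        return test_cases_executed / (testers * weeks)

    # Last project: 10 people tested for 12 weeks and executed 1,000 test cases.
    last_project = productivity(test_cases_executed=1000, testers=10, weeks=12)

    # Next project: the SRS implies roughly 1,500 test cases, and 8 testers are available.
    planned_cases = 1500
    testers_available = 8
    weeks_needed = planned_cases / (last_project * testers_available)

    print(f"Past productivity: {last_project:.1f} test cases per person-week")
    print(f"Estimated schedule: {weeks_needed:.1f} weeks with {testers_available} testers")

The second print is the kind of black and white number you can take into a schedule discussion.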
You don't really care about the trend, you want to know what it was throughout. Okay, reliability estimation models, these are also called estimators or reliability growth models. Basically, you just collect the defect data, and you extrapolate it to tell you what the software failure rate is. These models have been used for a really long time, and they were actually first used at NASA. So, way back in the late 70's and early 80's, NASA was one of the first organizations to use these models. Since then, the reliability growth models have kind of gone through varying levels of popularity. They've been really popular, then really unpopular. Right now they're kind of somewhere in the middle. People use these as indicators, and they're not pinpoint accurate, but they're usually ballpark accurate. So, for example, they can help you determine if the defect profile of your software is stable. So, that's what they're good at. They can be automated. There's tools like [inaudible], [inaudible] is free, most people that use these models use [inaudible]. They're only as good as the test coverage. This is the main, I was telling you that they've gone up and down in popularity. Right now, they're kind of on a downturn, because you could have the reliability models tell you your software is virtually defect free, if you only execute 5% of the code. So, one of the problems with the reliability models is, you have to use them with test coverage. So, normally what people do is, they use the reliability growth models, but then they also present the coverage. So, the reliability models are here, your coverage is there, and if your coverage is 30%, your faith in the reliability model is not much. When your coverage starts to hit 80 or 90%, then your confidence in the reliability growth model is up there. Does that make sense? Okay. These models can be automated in a spreadsheet as well. Okay, if anybody wants to know about these models, I'd be happy to tell you all about them, but those are used as a metric. Okay, acceptance sampling. I have actually not seen this done for many, many, many, many years, but it's out there. Every once in a while, I hear of it being done on systems. The idea behind this, and it is an expensive test, is that you wait for the software, let's say your acceptance test, this is where you would do this. You might even do this after the acceptance test. You could combine your acceptance test with this. Let's say, during the acceptance test, you test the software for a really long time. You make absolutely no changes to it at all. You test it in its operational environment, and you set up what are called consumer's and producer's risk, and I'm going to explain what those are in a second. The acceptance test continues until either the software's accepted, rejected, or neither. Have any of you ever seen this before, this diagram? It comes right out of a DOD, I forget the standard, DOD-61 something, I've got the reference in your materials, but anyway, this type of acceptance testing has been around forever, and it's been used on hardware. You guys could actually do this, in combination with an acceptance test, if your tests ran long enough. The idea is, let's say you're running an acceptance test, and the customer runs the test. So, on this axis is either test time or test cases, whatever is more applicable. Okay, you might want to just keep track of the time that they're testing.
On this axis, the Y-axis, is how many faults they find that are due to the software. Okay, these grids, this here and this here, this is all part, there's a DOD standard for this that spells it all out. There's consumer's and producer's risk. The consumer's risk is the risk that you accept the software and it doesn't meet your objective. And, the producer's risk is the risk that you reject the software and it's actually okay. So, these two, this thing comes right out of a MIL handbook, you don't need to reinvent the wheel. We would take one of these charts out of this MIL handbook, like this, and what you do is, you just plot each software failure, you plot it versus where in the test it happened. So, basically, let's say you find a failure really, really fast. Well, you would reject it if you find four failures almost immediately, everybody see that? You would accept it if you find your first failure, like, way out here. And, in between, you just keep going. The reason why this test is expensive is because lots of times you're in that middle bar for a very, very, very long time, so that's why it's expensive. So, anyway, this is just something to think about. I saw one organization use this, they had a reliability objective for their software. Their software was for a commercial appliance that would go in your kitchen, and they had a requirement for the entire appliance that it had to go, basically, ten years. And, by the way, this was fifteen years ago, you look at these appliances now, none of them will go ten years in your kitchen. But anyway, it had to go a certain number of years, and their requirement was, it could only have one service call during that ten years. If it ever got two service calls, they were going to lose money. So, they created this whole grid, based on that set of requirements, and they did an acceptance test and watched where it fell. And, so that's the last time I've actually seen this used, and that was probably fifteen years ago, but it's an idea. If you have ultra high reliability requirements, this may be a way to prove that your software met some objective. Okay, so exit criteria for system testing. My research showed that organizations that had specific exit criteria had fewer fielded defects than those that did not. These are some typical exit criteria: trend analysis, there's a metric called percent removal, which I'm going to go over, requirements coverage, which we talked about, line and branch coverage, which we talked about, and acceptance sampling, which we talked about. So, these are some guidelines for when not to stop. It's much easier to say we're not ready to stop than it is to say we're ready. So, let's go over those first. Okay. This is one metric, and believe it or not, this has been used by quite a few companies. What they do is they plot on an axis, this is just defects over time, so let's say you tested for five thousand hours and you found five defects. Each one of these would be a point in time where you found a defect, okay. So, these little points are just the defects found versus the cumulative time at which you found them. We plot these dots, we draw a best straight line through them, and this is cumulative defects up here. The Y intercept there, if we draw a line through it, is our estimated inherent defects.
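To make the percent removal metric concrete, here is a minimal sketch of one simple way to get the two numbers it needs: an estimated inherent defect count and the count found so far. It does not reproduce the exact plot on the slide; it uses the common linear approximation behind the basic reliability growth models, namely that failure intensity falls off roughly linearly as defects are found, and the defect discovery times are invented.

    # Minimal sketch of the percent removal exit metric. This is only one simple
    # way to estimate the inherent defect count; the reliability growth tools do
    # it more rigorously. Assumption: failure intensity (defects per test hour)
    # falls roughly linearly with the cumulative defects found, so extrapolating
    # the fitted line to zero intensity estimates the inherent total.

    # Cumulative test hours at which each defect was found (invented data).
    defect_times = [50, 103, 158, 217, 280, 346, 418, 495, 578, 669]

    # Observed intensity between consecutive defects: one defect per gap.
    gaps = [t2 - t1 for t1, t2 in zip([0] + defect_times[:-1], defect_times)]
    intensity = [1.0 / g for g in gaps]             # defects per hour
    found = list(range(1, len(defect_times) + 1))   # cumulative defects found

    # Ordinary least-squares straight line: intensity = a + b * found.
    n = len(found)
    mean_x = sum(found) / n
    mean_y = sum(intensity) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(found, intensity)) / \
        sum((x - mean_x) ** 2 for x in found)
    a = mean_y - b * mean_x

    # The fitted line reaches zero intensity at the estimated inherent defect count.
    inherent = -a / b if b < 0 else float("inf")    # b >= 0 means failures are not slowing down
    removed = found[-1]
    print(f"Estimated inherent defects: {inherent:.0f}")
    print(f"Percent removal: {100 * removed / inherent:.0f}%")

With this invented data the fit estimates roughly 21 inherent defects, 10 of which have been found, so percent removal comes out around 48%.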
And, by the way, those reliability growth models I was telling you about, they all use this, so that's what they use. So, anyway, this is how many defects, statistically, are probably in the code. This point here, which is our last data point, is how many we found right now. So, if we take that, whatever that is, and divide it by that, that's what's called percent removal. So, if this number is 90 and this number is 100, over there, the percent removal would be 90%. I have used this metric on quite a few projects. In my database of projects, I actually had access to this data. And, even though the people that were in my database didn't use this metric, they had the data where I could compute it, so one of the things I did, for all of my projects, is I plotted this data to see where people had shipped their software, and there was absolutely no surprise. The projects in my database that were a failure, their curve here was actually going the opposite direction. So, if this trend is going in this direction, what that means is you're finding failures faster every day. The first day we found one, the second day we found two, the third day we found three, and so forth. A lot of the failed projects, they couldn't even draw this line through it, because defects were happening faster every day. Then the projects that were in the middle, as you could guess, they were somewhere around 30 to 40% defect removal, so they were finding 30 defects when the trend was telling them there were 100. And, as you can imagine, the projects that were successful, they had somewhere around 70 to 90% of this gone. So, the point is, when you get to the end of testing, you should see something that looks a lot like this. And, if you don't, that's probably not a good thing. Okay, one thing I do want to point out is, those metrics are only as good as the code coverage. These are some other metrics. This is what I was talking about earlier, with the trend going the wrong way. If you see that, that's not a good thing, that means the software is not stable at all. What's interesting is, there were a few projects in my database that had this defect trend, and they shipped the software there, and, in every one of those cases, the software was recalled. Maybe not, like, federally recalled, not like with a recall, but the customers said, "Take it back, we don't want it." So, this is definitely not good. Okay, if your project looks like this, and this is the last day of testing, that's not good. Okay, all right, some wrap up, these are the things that we've learned. I can't believe I'm actually within twenty minutes of finishing here. We covered the system testing practices, planning, executing, dos and don'ts, automation, metrics and exit criteria. So, with this, I'd like to answer any questions you guys have, from simple to complicated, whatever the case may be.

Summary

Once you’ve completed this course, you’ll be notified in your SATERN Alerts that you have a survey waiting. Please consider completing the survey to let us know your opinions about this course. If you have a content question relating to this course, please email your query to the STEP Help Desk at [email protected].

The Help Desk will pass your inquiry on to the appropriate instructor or subject matter expert for their response. We’ll make every effort to get back to you in a timely manner.

Congratulations! You have completed the course. In order to receive credit within SATERN for completing this course, please click 'Complete', and proceed to take the test.
