CS 510/08 1.1 The Nature of Software...

Software is easy to reproduce Cost is in its development C C

S510 S o f t w a r e E n g i n e g i En ar e tw S of S510 in other engineering products, manufacturing is the costly stage is mainly a human activity Hard to automate, labor intensive Software is intangible Xiangyu Zhang Hard to understand development effort The demand of scaling up is pressing r in g

2

The Nature of Software ... The Nature of Software

Untrained people can hack something together Conclusions Quality problems are hard to notice Much software has poor design and is getting worse C C C C

S510 S o f t w a r e E n g i n e g i En ar e tw S of S510 Software is easy to modify n e g i En ar e tw S of S510 Demand for software is high and rising People make changes without fully understanding it We are in a perpetual ‘software crisis’ Software does not ‘wear out’ We have to learn to ‘engineer’ software It deteriorates by having its design changed: erroneously, or in ways that were not anticipated, thus making it complex r r r r in g in g

3 4

1 Historical Aspects What is SE?

1968 NATO Conference, Garmisch, Germany Software engineering A discipline whose aims are C C C C S510 S o f t w a r e E n g i n e g i En ar e tw S of S510 S510 S o f t w a r e E n g i n e g i En ar e tw S of S510 Aim: To solve the software crisis The production of fault-free software, Delivered on time and within budget, Software is delivered That satisfies the client’s needs Furthermore, the software must be easy to modify when the client’s Late needs change Over budget Extremely broad With residual faults , management, economics, etc. r r r in g in g

5 January 7, 2008 6

Standish Group Data Cutter Consortium Data Data on 9236 projects 2002 survey of information technology organizations completed in 78% have been involved in disputes ending in litigation C 2004 C S510 S o f t w a r e E n g i n e g i En ar e tw S of S510 For the organizations that entered into litigation: In 67% of the disputes, the functionality of the information system as delivered did not meet up to the claims of the developers In 56% of the disputes, the promised delivery date slipped several times In 45% of the disputes, the defects were so severe that the

r r information system was unusable in g

–Figure 1.1 8

2 Implications What Does CS 510 do?

The software crisis has not been solved. Two ways out Even worse ? Religion C C C C S510 S o f t w a r e E n g i n e g i En ar e tw S of S510 S510 S o f t w a r e E n g i n e g i En ar e tw S of S510 Reading assignment –“How Microsoft Builds Software”. Automation Modern hardware vs. human capabilities Overall goals NSF cfp 2008 Acquiring an overview of the field without losing yourself in the – Compared to OS, ARCH, Networking, … bibles; Reaching the frontline of algorithmic and automated software engineering. Attaining the capability of solving practical software related r r r problems using tools. in g in g

9 January 7, 2008 10

Scenario One Scenario Two

“Three Amigos” vs. “The Gang of Four” Application hanging Unified process A mouse click freezes the application, and the OS marks it as “not C C C C S510 S o f t w a r e E n g i n e g i En ar e tw S of S510 Design patterns n e g i En ar e tw S of S510 resppgonding” Analysis + testing More Data race, , dangling pointers, memory leak, missing code,… r r r r in g in g

January 7, 2008 11 January 7, 2008 12

3 Scenario Three Scenario Four

Reverse engineering network protocol SQL injection A program is communicating with a remote host through an unknown Deobfuscation C C C C S510 S o f t w a r e E n g i n e g i En ar e tw S of S510 S510 S o f t w a r e E n g i n e g i En ar e tw S of S510 ppport with unknown protocol Is it a mutant More Verification of firewall specifications … r r r r in g in g

January 7, 2008 13 January 7, 2008 14

Scenario Five Scenario X (Not Happening)

Data lineage In scientific , an output value is produced after months of C C C

S510 S o f t w a r e E n g i n e g i En ar e tw S of S510 compp,utation, indicating gp an important finding, g,y should you trust it? n e g i En ar e tw S of S510 Data uncertainty In battle field Long range rainfall prediction r r r in g –Any problems related to software in g

January 7, 2008 15 January 7, 2008 16

4 Course outline Course outline (2)

Main topic areas identified by IEEE Software Engineering Body of Knowledge What this course will not be about: which will be covered: Process and project management C C C Software Engineering Process: Unified Process, Extreme Programming C S510 S o f t w a r e E n g i n e g i En ar e tw S of S510 S510 S o f t w a r e E n g i n e g i En ar e tw S of S510 Requirements Analysis Software Construction: coding practices, design patterns, refactoring, AO P Software Testing: integrated testing with Junit Software metrics Software Engineering Tools and Methods: program analysis, formal UML (explained on demand) specification techniques, reliability, validation r r r r in g in g

January 7, 2008 17 January 7, 2008 18

Schedule Compared to CS 307

1-3 weeks CS 307 focuses on experiencing Software engineering principles and practices Principles & practices C C C C

S510 S o f t w a r e E n g i n e g i En ar e tw S of S510 4-7 weeks n e g i En ar e tw S of S510 The opportunity to exercise these principles by working on a large Program analysis team project. 8-11 weeks At the end, “this is A way to develop software” Logic and model checking CS 510 focuses on automation 12-14 Advanced Topics Algorithmic and automated techniques. Slicing, testing, debugging, and current software engineering Tool design and implementation. At the end, “I know how to design a tool to answer your questions r r r r in g in g about a software ppgrogram.” Broader view of SE

January 7, 2008 19 January 7, 2008 20

5 Compared to Other Graduate Course organization Courses

Most contents have no overlap Logic and model checking Grading policy: C C C C S510 S o f t w a r e E n g i n e g i En ar e tw S of S510 S510 S o f t w a r e E n g i n e g i En ar e tw S of S510 Data program analysis Projj(ects 60% (2 smalls, 1 largggpe, in groups of 1 or 2) Slicing Final 20%, Midterm 10%, Automatic Testing Homework 10%, XP, AP, … Qualifier: final Different focus – practicality and application Data flow Type based analysis r r in g in g http://www-cgi.cs .purdue .edu/cgi -bin/xyzhang/doku.php?id=home

January 7, 2008 21 January 7, 2008 –22 22

References Academic Integrity

All work that you submit in this course must be your own. Stephen R. Schach, Object-Oriented Software Engineering, McGraw Hill. Unauthorized group efforts are considered academic dishonesty. Gamma, Helm, Johnson, Vlissides, Design Pattern — Elements of Reusable C C C C S510 S o f t w a r e E n g i n e g i En ar e tw S of S510 Object-Oriented Software, AW, 1995. n e g i En ar e tw S of S510 Yo u may discuss ho mewo rk in a g eneral wa y with others , but you may not consult Beck, Extreme Programming explained — Embrace Change , AW 1999. any one else's written work. You are guilty of academic dishonesty if: Fowler, Refactoring: Improving the Design of Existing Code, AW 1999. You examine another's solution to a programming assignment (PA) Edmund Clarke, Orna Grumberg, Model Checking, The MIT Press.Hanne Ris- You allow another student to examine your solution to a PA Nielson, You fail to take reasonable care to prevent another student from examining your solution to a PA and that student does examine your solution. Fleming Nielson, Chris Hankin, Principles of Program Analysis, Springer Verlag. Automatic tools will be used to compare your solution to that of every other Michael Huth, Mark Ryan, Logic in Computer Science – Modelling and current or past student. Don't con yourself into thinking you can hide your r r r r in g Reasoningyg about Systems, Cambridge. in g collaboration. The risk of getting caught is too high, and the standard penalty is way too high.

January 7, 2008 –23 23 January 7, 2008 –24 24

6