
software construction
Editors: Andy Hunt and Dave Thomas • The Pragmatic Programmers • www.pragmaticprogrammer.com

Software Archaeology
Andy Hunt and Dave Thomas

IEEE SOFTWARE, March/April 2002. 0740-7459/02/$17.00 © 2002 IEEE

“This isn’t programming, this is archaeology!” the programmer complained, wading through the ancient rubble of some particularly crufty piece of code. (One of our favorite jargon words: www.tuxedo.org/~esr/jargon/html/entry/crufty.html.) It’s a pretty good analogy, actually. In real archaeology, you’re investigating some situation, trying to understand what you’re looking at and how it all fits together. To do this, you must be careful to preserve the artifacts you find and respect and understand the cultural forces that produced them.

But we don’t have to wait a thousand years to try to comprehend unfathomable artifacts; code becomes legacy code just about as soon as it’s written, and suddenly we have exactly the same issues as the archaeologists: What are we looking at? How does it fit in with the rest of the world? And what were they thinking? It seems we’re always in the position of reading someone else’s code: either as part of a code review, or trying to customize a piece of open source software, or fixing a bug in code that we’ve inherited.

This analogy is such a compelling and potentially useful one that Dave, Andy, Brian Marick, and Ward Cunningham held a workshop on Software Archaeology at OOPSLA 2001 (the annual ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications). The participants discussed common problems of trying to understand someone else’s code and shared helpful techniques and tips (see the “Tools and Techniques” sidebar).

Roll up your sleeves

What can you do when someone dumps 250k lines of source code on your desk and simply says, “Fix this”? Take your first cue from real archaeologists and inventory the site: make sure you actually have all the source code needed to build the system. Next, you must make sure the site is secure. On a real dig, you might need to shore up the site with plywood and braces to ensure it doesn’t cave in on you. We have some equivalent safety measures: make sure the version control system is stable and accurate (CVS is a popular choice; see www.cvshome.org). Verify that the procedures used to build the software are complete, reliable, and repeatable (see the January/February issue’s column for more on this topic).

Be aware of build dependency issues: in many cases, unless you build from scratch, you’re never really sure of the results. If you’re faced with build time measured in hours, with multiple platforms, then the investment in accurate dependency management might be a necessity, not a luxury.

Draw a map as you begin exploring the code. (Remember playing Colossal Cave? You are in a maze of twisty little passages, all alike….) Keep detailed notes as you discover priceless artifacts and suspicious trapdoors. UML diagrams might be handy (on paper; don’t get distracted by a fancy CASE tool unless you’re already proficient), but so too are simple notes. If there are more than one of you on the project, consider using a Wiki or similar tool to share your notes (you can find the original Wiki at www.c2.com/cgi/wiki?WikiWikiWeb and a popular implementation at www.usemod.com/cgi-bin/wiki.pl).

As you look for specific keywords, routine names, and such, use the search capabilities in your integrated development environment (IDE), the Tags feature in some editors, or tools such as Grep from the command line. For larger projects, you’ll need larger tools: you can use indexing engines such as Glimpse or SWISH++ (simple Web indexing system for humans) to index a large source code base for fast searching.

The mummy’s curse

Many ancient tombs were rumored to be cursed. In the software world, the incantation for many of these curses starts with “we’ll fix it later.” Later never comes for the original developers, and we’re left with the curse. (Of course, we never put things off, do we?)

Another form of curse is found in misleading or incorrect names and comments that help us misunderstand the code we’re reading. It’s dangerous to assume that the code or comments are being completely truthful. Just because a routine is named readSystem is no guarantee that it isn’t writing a megabyte of data to the disk.

Programmers rarely use this sort of cognitive dissonance on purpose; it’s usually a result of historical accident. But that can also be a valuable clue: how did the code get this way, and why? Digging beneath these layers of gunk, cruft, and patch upon patch, you might still be able to see the original system’s shape and gain insight into the changes that were required over the years.

Of course, unless you can prove otherwise, there’s no guarantee that the routine you’re examining is even being called. How much of the source contains code put in for a future that never arrived? Static analysis of the code can prove whether a routine is being used in most languages. Some IDEs can help with this task, or you can write ad hoc tools in your favorite scripting language. As always, you should prove assumptions you make about the code. In this case, adding specific unit tests helps prove, and continue to prove, what a routine is doing (see www.junit.org for Java and www.xprogramming.org for other languages).

By now, you’ve probably started to understand some of the terminology that the original developers used. Wouldn’t it be great to stumble across a Rosetta stone for your project that would help you translate its vocabulary? If there isn’t one, you can start a glossary yourself as part of your note-taking. One of the first things you might uncover is that there are discrepancies in the meaning of terms from different sources. Which version does the code use?

Duck blinds and aerial views

In some cases, you want to observe the dynamics of the running system without ravaging the source code. One excellent idea from the workshop was to use aspects to systematically introduce tracing statements into the code base without editing the code directly (AspectJ for Java is available at www.aspectj.org). For instance, suppose you want to generate a trace log of every database call in the system. Using something like Aspect-J, you could specify what constitutes a database call (such as every method named “db*” in a particular directory) and specify the code to insert.

Be careful, though. Introducing any extra code this way might produce a “Heisenbug,” a bug introduced by the act of debugging. One solution to deal with this issue is to build in the instrumentation in the first place, when the original developers are first building and testing the software. Of course, this brings its own set of risks. One of the participants described an ancient but still used mainframe program that only works if the tracing statements are left in.

Whether diagnostic tracing and instrumentation are added originally or introduced later via aspects, you might want to pay attention to what you are adding, and where. For instance, say you want to add code that records the start of a transaction. If you find yourself doing that in 17 places, this might indicate a structural problem with the code, and a potential answer to the problem you’re trying to solve.

Instead of hiding in a “duck blind” and getting the view on the ground, you might want to consider an aerial view of the site. Synoptic, plotting, and visualization tools provide quick, high-level summaries that might visually indicate an anomaly in the code’s static structure, in the dynamic trace of its execution, or in the data it handles. For instance, Ward Cunningham’s Signature Survey method (http://c2.com/doc/SignatureSurvey) reduces each source file to a single line of the punctuation. It’s a surprisingly powerful way of seeing a file’s structure. You can also use visualization tools to plot data from the volumes of tracing information languishing in text files.

As with real archaeology, it pays to be meticulous. Maintain a deliberate […]

[…]ately identified, and the build should be automatic and reliable.
• Leave a Rosetta stone. The project glossary was useful for you as you learned the domain jargon; it will be doubly useful for those who come after you.
• Make a simple, high-level treasure map. Honor the “DRY” principle: Don’t duplicate information that’s in the code in comments or in a design document.

Tools and Techniques

The workshop identified these analysis tools and techniques:

• Scripting languages for
  – ad hoc programs to build static reports (included by and so on)
  – filtering diagnostic output
• Ongoing documentation in basic HTML pages or Wikis
• Synoptic signature analysis, statistical analysis, and visualization tools
• Reverse-engineering tools such as Together’s ControlCenter
• Operating-system-level tracing via truss and strace