COMP 200: Elements of Computer Science Fall 2004 Lecture 24: October 24, 2004 Biological Analogies in

Programming by Biological Analogy

So far in COMP 200, we have talked about algorithms (and programs) in rather dry terms: loops, recursion, assignments, lists, arrays, stacks, queues, and so on. Yet we have all heard of computer viruses and worms. Programmers fight “bugs” and coined the term “debugging” (rather than “exterminating”). You may have heard of a . Why are biological analogies used to describe phenomena in Computer Science? Often, the analogy captures critical properties of a complex process or situation. A programmer who has designed a program (and thinks of it as an elegant creation) is inclined to view an error (a flaw, a mistake …) as an independent entity that “crawled” into their creation and needs to be found and squashed. The term “bug” is popularly traced to a crash on one of the very early computers. As the story is told, the problem was a hardware failure caused by an insect that bridged two power lines and shorted them.

Viruses, Worms, Trojan Horses

Malware is a general term used to describe malicious software. Common types of malware include viruses, worms, and Trojan horses.

• A virus is a short program that piggybacks on another program (another biological analogy) and has two primary goals: to infect additional machines and to deliver some payload.
• A worm is a program that transmits itself over a communications network. As a worm reaches more machines, it can clog both the processor with work and the network with traffic.
• A Trojan horse is a program that lets a user perform otherwise-unauthorized actions. Trojan horses are typically hidden inside some useful piece of software (a game or utility program) that the user is persuaded to install.

Today, network-connected personal computers (particularly those running Windows, Internet Explorer, and Outlook) are under constant assault from viruses, worms, and Trojan horses.
Microsoft's dominant position in the commercial software market makes its products an obvious target for authors of malware. Other operating systems, such as Mac OS X or Linux, may be susceptible to malware but are less inviting targets because of their smaller market share. (Some would argue that security is better on these Unix-based systems. However, we should remember that the first major ill-intentioned worm, the Morris worm, targeted UNIX systems.)

Viruses

In biology, a virus is a snippet of RNA or DNA wrapped in a protective cover. On its own, the virus is harmless. If it gets inside a cell, it hijacks the cell’s chemistry shop to replicate itself. The new viral material escapes the cell through any of several mechanisms: exploding the cell, passing through the cell membrane in a “budding” process, or through more exotic means. The key notions behind a virus are

• The material is infectious: it takes over some or all of the cell’s legitimate functions and redirects them to its own end.
• The material is self-replicating: as one of its primary effects, the virus creates copies of itself and releases them into the environment.

Computer programs labeled as viruses generally follow these two principles. Related categories of malware include worms, Trojan horses, and backdoors.

A virus has two phases: infection and execution. The infection phase of a typical virus

1. inserts itself into the system so that it runs on a regular basis; and
2. makes copies of itself to infect other programs and/or other systems.

The virus may take the form of code that is inserted into the boot block, commands inserted into a startup script, or a prefix for the executable(s): a small snippet of code that executes on “the way into” an application and either continues the infection phase or runs the payload phase.
Early viruses copied themselves into code that executed on boot: either the computer’s boot block (back in the days when this resided on a floppy disk) or the startup command files that run on boot. Later viruses would insert themselves into more complex files: either executable files (.exe files in Windows/DOS) or the macro sections of Microsoft Office applications (starting in 1996, with the introduction of a BASIC-language macro facility for Word).
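The two phases can be illustrated with a harmless in-memory model. The Python below is a teaching sketch, not code from the lecture: `Program`, `MARKER`, `run`, and the three program names are hypothetical, each “program” is only a list of labels, and nothing here touches real files or networks.

```python
from dataclasses import dataclass, field

MARKER = "virus-prefix"  # hypothetical label standing in for the inserted snippet


@dataclass
class Program:
    name: str
    steps: list = field(default_factory=list)  # ordered labels, not real code

    @property
    def infected(self) -> bool:
        # A program counts as infected when the prefix is its first step.
        return bool(self.steps) and self.steps[0] == MARKER


def run(program, machine, log):
    """Simulate one run; an infected program performs both phases first."""
    if program.infected:
        # Infection phase: the prefix is copied onto every clean program.
        for other in machine:
            if not other.infected:
                other.steps.insert(0, MARKER)
        # Execution phase: this toy payload only records a message.
        log.append(f"payload ran via {program.name}")
    log.append(f"{program.name} did its normal work")


machine = [
    Program("editor", ["edit"]),
    Program("game", ["play"]),
    Program("mailer", ["send"]),
]
machine[1].steps.insert(0, MARKER)  # the user installs an already-infected game

log = []
run(machine[0], machine, log)  # a clean program runs: nothing spreads
infected_before = [p.name for p in machine if p.infected]
run(machine[1], machine, log)  # the infected program runs: the prefix goes first
infected_after = [p.name for p in machine if p.infected]

print(infected_before)
print(infected_after)
```

In this model only `game` is infected until it is run once; after that single run, all three programs carry the prefix. That is the sense in which a prefix executes on “the way into” an application.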

The execution phase of the virus runs some “payload” code. The payload may be simple: one of the first viruses, Elk Cloner on the Apple II, simply displayed six lines of doggerel geek verse.

    It will get on all your disks
    It will infiltrate your chips
    Yes it's Cloner!
    It will stick to you like glue
    It will modify ram too
    Send in the Cloner!

Alternatively, a virus may be truly destructive; one late-1980s virus encrypted the infected machine’s disk, so that the user needed a key to decrypt it. The notion was, presumably, that the user would pay a ransom for the key. It is not clear how the author expected to get paid without being caught. A number of viruses have erased critical files from system disks. (More viruses have failed to propagate because they were poorly designed: they destroyed the software required to replicate. Fortunately, these instances don’t thrive in the wild.)

Building a Word Macro Virus

How would we build a virus? The macro facility in Microsoft Office has enough power to let us write arbitrary programs. (An estimated 75% of all viruses are macro viruses that rely on Office’s macro language.) We can write a macro that calls the mail application and sends copies to each user in the mailer’s address book. We can attach the macro to a document, set it up so that the macro runs when the document opens (for example, because the macro is used to format the document), and mail that document to some random set of people. If any of them open the document, then (wham) the infection spreads.

If we are more malicious, we can insert the macro into some standard Word format and (with a little more programming) arrange matters so that every document opened in Word becomes infected. Now, any legitimate document that you send to a colleague is infected and spreads throughout their machine.

Finally, our macro virus needs a payload. Again, within the Office macro framework, we can open files and send mail.
To be malicious, we might send a copy of the system’s password file to someone: to an anonymous mail drop, to someone we don’t like, or to someone whom we wish to annoy (the mayor, the governor, the president of a company that has done us wrong). We might remove some critical files: delete any calendar files on the machine, or any data files, or all files. We might look for account numbers and passwords, or 16-digit numbers associated with names and expiration dates, to mail back home so that the author can pay the bills. We might simply say “boo”. (Many viruses have done the equivalent.)

These mail-borne viruses are not “true” viruses because they typically require user intervention to propagate. The user opens an attachment and the macro executes. More recent viruses embed HTML-formatted text in an e-mail message, including HTML code that downloads malware onto the machine. This strategy eliminates the need for you to open some attachment: as soon as any program displays the message, it performs the actions required for infection.

Worms

Many of the programs commonly described as viruses are actually worms transmitted through electronic mail. A worm copies itself from computer to computer across network connections. The first worm was the print server on the original local-area network at Xerox PARC. Preparing a document for printing on a (new-fangled) laser printer required significant preprocessing. The print server would move from machine to machine across the network, cycle-stealing on idle machines to process documents. Later worms used TCP/IP socket connections to transmit code from an infected system to a pristine system; getting the code to execute took an independent mechanism, such as a buffer overflow or replacing some executable that runs frequently.

Some of the modern e-mail viruses might be classified as worms. They do no harm, other than jamming the network (and people’s in-boxes) with mail. Other e-mail viruses are malicious.
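How quickly a worm reaches machines, and how much traffic it generates on the way, can be explored with a small simulation. The sketch below is only a model under simplifying assumptions: the six-host `network`, the function name `simulate_worm`, and the rule that every reached host contacts each of its neighbors exactly once per outbreak are all hypothetical. No real networking is involved.

```python
from collections import deque


def simulate_worm(network, start):
    """Return {host: round in which it was reached}, spreading breadth-first."""
    reached = {start: 0}
    frontier = deque([start])
    while frontier:
        host = frontier.popleft()
        for neighbor in network.get(host, []):
            if neighbor not in reached:
                reached[neighbor] = reached[host] + 1
                frontier.append(neighbor)
    return reached


# Hypothetical toy network: each host lists the hosts it can contact directly.
network = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
    "f": [],  # not connected, so the worm never reaches it
}

rounds = simulate_worm(network, "a")
# Every reached host tries each neighbor once, including hosts already infected.
messages = sum(len(network[host]) for host in rounds)

print(rounds)
print(messages)
```

Starting from host `a`, the model reaches five hosts in three rounds and sends ten messages to do so. Half of those messages go to machines that were already reached, which is one way to see how a worm clogs a network with traffic.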

Trojan Horses and Spyware

Spyware gathers information on your use of the computer, or on the contents of your computer (Do you use Microsoft Money or Intuit’s Quicken? How about tax software?), and reports back to some external agency. Often, spyware is installed as a consequence of visiting a (somewhat) malicious web site or as the result of downloading some useful piece of software. The most common uses of spyware today are commercial: tracking web sites visited and using that information to target pop-up ads at the user. However, malicious uses of spyware are easy to envision and no harder to program than pop-up ad generators.

Social Engineering

Malware relies, to a large extent, on human behavior. The Morris worm operated on the assumption that system managers were lazy and would leave critical passwords in their default state and critical network ports active and unprotected. It succeeded to a surprising extent. The ILoveYou virus infected millions of machines around the world because the mail that it sent had an endearing subject line (“I Love You”) and appeared to come from a friend. (It used Outlook address books to generate new messages.) Spyware hides inside useful free software. A true worm does not require human intervention. Most e-mail and macro viruses, however, require that the recipient open the document.

Prevention

What can you do to avoid being lost in a sea of evil software?

1. Turn on protection against malicious macros in Office applications.
2. Set your system so that it does not run executable attachments.
3. Do not open mail attachments from people you do not know, and be wary of attachments that you do not expect.
4. Think! Keep the threat in the back of your mind.
5. Back up your system. In the worst situations, you will need to recover the files from a backup. Use your iPod, buy a detachable disk ($200 for a 100 GB external drive), or open an account with one of the online storage repositories (such as .Mac), and back up regularly.
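Items 2 and 3 in the list above amount to a simple screening rule, which might be sketched as follows. This is an illustration, not a real mail filter: the function names, the verdict strings, and the extension list are assumptions, and the list is far from exhaustive. The first sample file name is the attachment name used by the ILoveYou virus, whose doubled extension made a script look like a text file.

```python
from pathlib import PurePath

# Illustrative, non-exhaustive set of extensions that run code when opened.
RISKY_EXTENSIONS = {".exe", ".com", ".bat", ".scr", ".pif", ".vbs", ".js"}


def is_risky_attachment(filename: str) -> bool:
    """True when the final extension is on the risky list (case-insensitive)."""
    return PurePath(filename).suffix.lower() in RISKY_EXTENSIONS


def screen_attachment(filename, sender, known_senders, expected):
    """Return a verdict string following items 2 and 3 of the list above."""
    if is_risky_attachment(filename):
        return "block"        # item 2: never run executable attachments
    if sender not in known_senders:
        return "do not open"  # item 3: sender is unknown
    if not expected:
        return "be wary"      # item 3: known sender, unexpected attachment
    return "ok"


friends = {"friend@example.com"}
print(screen_attachment("LOVE-LETTER-FOR-YOU.TXT.vbs", "friend@example.com", friends, False))
print(screen_attachment("notes.txt", "stranger@example.com", friends, True))
print(screen_attachment("notes.txt", "friend@example.com", friends, False))
```

Only the final extension is checked, so a name ending in `.TXT.vbs` is treated as a script even though it resembles a text file. A check like this does nothing about macro viruses inside ordinary documents, which is why item 1 still matters.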