<<

CS@

NEWSLETTER OF THE DEPARTMENT OF COMPUTER SCIENCE COLUMBIA UNIVERSITY VOL.5 NO.1 WINTER 2008

The A-Z of Programming Languages: AWK

Professor Alfred V. Aho talks about the and continuing popularity of his pattern matching language AWK.

Lawrence Gussman Professor of Computer Science Alfred V. Aho

The following article is reprinted How did the idea/concept of language suitable for simple It was built to do simple data with the permission of the AWK language develop data-processing tasks. processing: the ordinary data Computerworld Australia and come into practice? processing that we routinely (www.computerworld.com.au). We were heavily influenced by As with a number of languages, , a popular string-matching did on a day-to-day basis. We The interview was conducted just wanted to have a very by Naomi Hamilton. it was born from the necessity utility on , had to meet a need. As a researcher been created in our research simple scripting language that Lawrence Gussman Professor would allow us, and people of Computer Science Alfred V. at in the early 1970s, center. GREP would search I found myself keeping track a of text looking for lines weren’t very computer Aho is a man at the forefront savvy, to be able to of computer science research. of budgets, and keeping track matching a pattern consisting Formerly the Vice President of editorial correspondence. of a limited form of regular throw-away programs for of the Computing Sciences I was also teaching at a nearby expressions, and then print all routine data processing. Research Center at Bell Labs, university at the , so I lines in the file that matched Were there any programs or Professor Aho is well known had to keep track of student that . languages that already had for co-authoring the ‘Dragon’ grades as well. book series and for being one We thought that we’d like to these functions at the time of the three developers of I wanted to have a simple little generalize the class of patterns you developed AWK? the AWK pattern matching language in which I could write to deal with numbers as well Our original model was GREP. language in the mid-1970’s, one- or two-line programs to as . We also thought But GREP had a very limited along with Brian Kernighan do these tasks. Brian Kernighan, that we’d like to have form of pattern action process- and Peter Weinberger. In the a researcher next door to me computational capability than ing, so we generalized the following interview, Professor at the Labs, also wanted to just printing the line that Aho explains more about the capabilities of GREP considerably. create a similar language. We matched the pattern. development of AWK. I was also interested at that had daily conversations which So out of this grew AWK, a time in string pattern matching culminated in a desire to language based on the principle (continued on next page) create a pattern-matching of pattern-action processing.

CS@CU WINTER 2008 1 Cover Story (continued) and context-free scanned for each pattern in the was that I got to know how language designers later used grammar algorithms program, and for each pattern Kernighan and Weinberger it as a model for developing for applications. that matches, the associated thought about language design: more powerful languages. This means that you can see action is executed. it was a really enlightening About 10 years after AWK was a certain similarity between A simple example should process! With the flexible created, Larry created a what AWK does and what the this . Suppose we have a compiler construction tools we language called , which compiler construction tools file in which each line is a name had at our disposal, we very was patterned after AWK and and do. followed by a phone number. quickly evolved the language some other UNIX commands. LEX and YACC were tools that Let’s say the file contains the to adopt new useful syntactic PERL is now one of the were built around string pattern line ‘Naomi 1234’. In the AWK and semantic constructs. We popular programming languages matching algorithms that I was program the first field is referred spent a whole year intensely in the world. So not only was working on: LEX was designed to as $1, the second field as debating what constructs AWK popular when it was to do and YACC $2, and so on. Thus, we can should and shouldn’t be in introduced but it also stimulated syntax analysis. These tools were create an AWK program to the language. the creation of other popular compiler construction utilities retrieve Naomi’s phone number Language design is a very languages. which were widely used in Bell by simply writing $1==“Naomi” personal activity and each person labs, and later elsewhere, to {print $2} which means if the brings to a language the classes AWK has inspired many create all sorts of little languages. first field matches Naomi, then of problems that they’d like to other languages as you’ve Brian Kernighan was using print the second field. Now solve, and the manner in which already mentioned: them to make languages for you’re an AWK programmer! they’d like them to be solved. why do you think this is? typesetting mathematics and If you typed that program into I had a lot of fun creating AWK, What made AWK popular initially picture processing. AWK and presented it with a and working with Kernighan and was its simplicity and the kinds LEX is a tool that looks for lex- file that had names and phone Weinberger was one of the most of tasks it was built to do. It emes in input text. Lexemes numbers, then it would print stimulating experiences of my has a very simple programming are sequences of characters 1234 as Naomi’s phone number. career. I also learned I would not model. The idea of pattern-action that make up logical units. For A typical AWK program would want to get into a programming programming is very natural example, a keyword like ‘then’ have several pattern-action contest with either of them for people. We also made the in a is statements. The patterns can however! Their programming language compatible with pipes a lexeme. The character ‘t’ by be Boolean combinations abilities are formidable. in UNIX. The actions in AWK itself isn’t interesting, ‘h’ by of strings and numbers; the Interestingly, we did not intend are really simple forms of itself isn’t interesting, but the actions can be statements in a the language to be used programs. You can write a simple combination ‘then’ is interest- C-like programming language. except by the three of us. action like {print $2} or you can write a much more complex ing. One of the first tasks a AWK became popular since But very quickly we discovered compiler has to do is read the it was one of the standard lots of other people had the C-like program as an action source program and group its programs that came with need for the routine kind of associated with a pattern. Some Wall Street financial houses used characters into lexemes. every UNIX system. data processing that AWK was AWK when it first came out to AWK was influenced by this kind good for. People didn’t want to balance their books because of textual processing, but AWK What are you most proud of write hundred-line C programs was aimed at data-processing in the development of AWK? to do data processing that it was so easy to write data- could be done with a few lines processing programs in AWK. tasks and it assumed very little AWK was developed by three background on the part of the of AWK, so lots of people AWK turned a number of people people: me, Brian Kernighan started using AWK. user in terms of programming and Peter Weinberger. Peter into programmers because the sophistication. Weinberger was interested in For many years AWK was one learning curve for the language what Brian and I were doing right of the most popular commands was very shallow. Even today a Can you provide our readers from the start. We had created on UNIX, and today, even though large number of people continue with a brief summary in a grammatical specification for a number of other similar to use AWK, saying languages your own words of AWK as AWK but hadn’t yet created languages have come on the such as PERL have become too a language? the full run-time environment. scene, AWK still ranks among complicated. Some say PERL AWK is a language for processing Weinberger came along and said the 25 or 30 most popular has become such a complex files of text. A file is treated ‘hey, this looks like a language programming languages in the language that it’s become as a sequence of records, and I could use myself’, and within world. And it all began as a almost impossible to understand by default each line is a record. a week he created a working little exercise to create a utility the programs once they’ve Each line is broken up into a run time for AWK. This initial that the three of us would been written. sequence of fields, so we can form of AWK was very useful useful for our own use. Another advantage of AWK think of the first word in a line for writing the data processing is that the language is stable. as the first field, the second routines that we were all inter- How do you feel about We haven’t changed it since word as the second field, and ested in but more importantly it AWK being so popular? the mid 1980’s. And there so on. An AWK program is of provided an evolvable platform I am very happy that other are also lots of other people a sequence of pattern-action for the language. people have found AWK who’ve implemented versions statements. AWK reads the One of the most interesting useful. And not only did AWK of AWK on different platforms input a line at a time. A line is parts of this project for me attract a lot of users, other such as Windows.

2 CS@CU WINTER 2008 How did you determine the use. I think this is great advice community is much more me was to report a bug in the order of initials in AWK? for any language designer. familiar with my theoretical AWK complier. He was very This was not our choice. When work. So I initially viewed the testy with me saying I had Have you had any surprises our research colleagues saw creation of AWK as a learning wasted three weeks of his life, in the way that AWK has the three of us in one or experience and a diversion as he had been looking for a developed over the years? another’s office, they’d walk rather than part of my regular bug in his own code only to by the open door and say One Monday morning I walked research activities. However, discover that it was a bug in ‘AWK! AWK!’. So, we called into my office to find a person the experience of implementing the AWK compiler! I huddled the language AWK because of from the Bell Labs micro-elec- AWK has greatly influenced with Brian Kernighan after this, the good natured ribbing we tronics product division who how I now teach programming and we agreed we really need received from our colleagues. had used AWK to create a languages and , and to do something differently We also thought it was a great multi-thousand-line computer- software engineering. in terms of quality control. name, and we put the AUK aided design system. I was What I’ve noticed is that some So we instituted a rigorous bird picture on the AWK book just stunned. I thought that no scientists aren’t as well known regression for all of the when we published it. one would ever write an AWK for their primary field of research features of AWK. Any of the program with more than handful by the world at large as they three of us who put a new What did you learn from of statements. But he had are for their useful tools. Don feature into the language from developing AWK that written a powerful CAD Knuth, for example, is one of then on, first had to write a you still apply in your development system in AWK the world’s foremost computer test for the new feature. work today? because he could do it so scientists, a founder of the I have been teaching the My research specialties include quickly and with such facility. field of computer algorithms. programming languages and algorithms and programming My biggest surprise is that However, he developed a lan- compilers course at Columbia languages. Many more people AWK has been used in many guage for typesetting technical University for several years. know me for AWK as they’ve different applications that none papers, called TEX. This wasn’t The course has a semester used it personally. Fewer people of us had initially envisaged. his main avenue of research long project in which students know me for my theoretical But perhaps that’s the sign but TEX became very widely work in teams of four or five papers even though they may of a good tool, as you use a used throughout the world to design their own innovative be using the algorithms in them screwdriver for many more by many scientists outside little language and to make a that have been implemented things than turning screws. of computer science. Knuth compiler for it. in various tools. One of the was passionate about having Students coming into the course Do you still work with things about AWK is that a mathematical typesetting have never looked inside a AWK today? it incorporates efficient string system that could be used compiler before, but in all the pattern matching algorithms Since it’s so useful for routine to produce beautiful looking years I’ve been teaching this that I was working on at the data processing I use it daily. papers and books. course, never has a team failed time we developed AWK. For example, I use it whenever Many other computer science to deliver a working compiler These pattern matching I’m writing papers and books. researchers have developed at the end of the course. All of algorithms are also found in Because it has associative useful programming languages this is due to the experience other UNIX utilities such as arrays, I have a simple two-line as a by-product of their main line I had in developing AWK with EGREP and FGREP, two string- AWK program that translates of research as well. As another Kernighan and Weinberger. In matching tools I had written symbolically named figures example, Bjarne Stroustrup addition to learning the principles when I was experimenting and examples into numerically developed the widely used of language and compiler with string pattern matching encoded figures and examples; C++ programming language design, the students learn algorithms. for instance, it translates ‘Figure because he wanted to write good software engineering AWK-program’ into ‘Figure 1.1’. What AWK represents is a network simulators. practices. Rigorous testing is AWK program allows me beautiful marriage of theory and This something students do from to rearrange and renumber practice. The best engineering Would you do anything the start. The students also figures and examples at will in is often built on top of a sound differently in the development learn the elements of project my papers and books. I once scientific foundation. In AWK of AWK looking back? management, teamwork, we have taken expressive saw a paper that had a 1000- One of the things that I would and communication skills, notations and efficient line C program that had have done differently is instituting both oral and written. So from algorithms founded in computer functionality than these two rigorous testing as we started that perspective AWK has AWK. The economy of science and engineered them lines of to develop the language. We significantly influenced how I expression you can get from to run well in practice. initially created AWK as a teach programming languages AWK can be very impressive. and compilers and software I feel you gain wisdom by work- ‘throw-away’ language, so we didn’t do rigorous quality development. ing with great people. Brian How has being one of the control as part of our initial Kernighan is a master of useful three creators of AWK implementation. programming language design. impacted your career? His precept of language I mentioned to you earlier that As I said, many programmers design is to keep a language there was a person who wrote know me for AWK, but the simple, so that a language is a CAD system in AWK. The computer science research easy to understand and easy to reason he initially came to see

CS@CU WINTER 2008 3 Message from the Chair

Henning For several years now, low Unfortunately, it is not clear engineering and computer Schulzrinne, enrollments in computer science how the muddled picture of science probably weaker than have provided plenty of fodder the computer profession can most. Just as law students run Professor for panels, editorials and faculty be clarified, given the lack clinics, we may consider ways & Chair lunches. Right now there are of professional certification to increase the participation of some mildly encouraging equivalent to, say, the CPA or CS students (and faculty...) in signs; at Columbia University, bar exam. However, I believe projects that benefit the public, “competing” majors such that we can do more to install not just through research. At all as financial engineering may a professional ethic in our levels, from local government to be drawing fewer students, students. Deborah Johnson the Red Cross, contributions of and there seems to be some describes, in the October trained or in-training computer progress in convincing the public CACM, the fiduciary model as scientists could dramatically that most CS-related jobs a promising model to highlight contribute to the public welfare. have not shipped to Bangalore. the difference between guns- I suspect it would be easy to However, it seems unlikely for-hire and professionals. find interested students and that we will be returning to the Making students think of faculty, given the real-world heady enrollment numbers of themselves as professionals-in- impact such projects can have, a decade ago any time soon. I training may also . Below but matching up governments, also believe that simply relying I highlight three aspects of not-for-profits and universities on better public relations or such an attitude: emphasis seems challenging and time- the decreasing attractiveness on quality, public service, and consuming. At Columbia of Wall Street jobs alone is not professional engagement. University, our Design quite enough. In far too many cases, student Fundamentals course required Among many other aspects, projects seem to be motivated for all freshmen includes such as a more relevant by the “can I get it by the TA” community projects, but they curriculum, I believe that and “will it survive a demo” are limited in scope by the professionalizing computer tests, rather than pride in relative inexperience of the science may contribute to professional-quality work. students. Professional societies a better reputation of the There seems to often be little such as ACM and CRA might field. Right now, the public stomach to take off points be able to increase awareness understandably considers for sloppy code or lackluster on both sides. Despite the vastly different computing- documentation, as long as the difficulties, I am convinced that related fields and occupations results are correct. Judging showing students, through as “working with computers”, from some of the projects their own experience, that ranging from first-line technical produced by our MS students computer science can make support reading from canned (most of whom come from a real difference in the world scripts to software engineers other institutions), this does at large can only help to make designing new algorithms for not seem to be a Columbia- the field more attractive. I computational biology. Even specific issue. In one telling look forward to hearing your the various classifications of remark, a student of mine who perspectives. As always, I the Bureau of Labor Statistics worked on the FAA project look forward to hearing from (BLS) conflate very different described in this issue noted you as our readers. backgrounds. As one example, how different it was to program the BLS notes that 20% of for class and semester projects computer programmers have compared to a project used 12 no degree, roughly half have a hours a day by non-computer bachelor’s degree and 20% have scientists. In contrast to other completed a graduate degree, professions, we seem to although it is not clear how tolerate lower quality for our many of these are indeed in trainees than what’s expected computer science or computer in the field. Emphasis, from the engineering. Among the five very beginning, on verification, occupations most closely related testing and sound engineering to computer science, all but one principles may help to set apart (computer software engineer) the professionals we hope to list associate to graduate degree train from the amateurs who as prerequisites, a far wider picked up a bit of programming spectrum than one is likely from “Java for Dummies”. to find for other engineering Just knowing O( ) notation professions, let alone the is not sufficient. classical professions such Most professions have a as lawyers or architects. tradition of public service, with

4 CS@CU WINTER 2008 New Faculty Computer Science Department Welcomes Professor Martha Mercaldi Kim

Martha Kim’s research focus algorithms for mapping for configurable on-chip is computer architecture. Her application software to modern communication, not only in particular interest is in problems chips that contain many parallel brick and mortar chips but that lie at the boundaries of computational cores. In the in traditional chips as well. architecture. The rapid scaling of purely hardware domain, she Recently she has proposed and modern silicon processes has has designed a low-cost style evaluated a design of such a stretched the traditional circuit- of chip, called brick and mortar, polymorphic on-chip network. architecture and architecture- with the goal of reaping the Martha received her Ph.D. software abstractions to their benefits of modern circuit in Computer Science and breaking points. Architectures integration without the financial Engineering from the Professor Martha Mercaldi Kim cannot function while ignoring side effects. Brick and mortar University of Washington in circuit-level effects. Nor can chips are assembled from December 2008. Prior to that The Computer Science architectures any longer single- multiple small, pre-fabricated she earned her bachelors department is delighted handedly harness growing hardware components (the in Computer Science from computational resources bricks) that are bonded in a in 2002 to welcome Martha and leave appearances to designer-specified arrange- Mercaldi Kim as an and a masters in embedded software unchanged. ment to a communication systems design from the Assistant Professor of Martha’s research has explored backbone chip (the mortar). University of Lugano in Computer Science. problems such as these. She Stemming from this research Switzerland in 2003. has investigated compilation she has identified the need

Faculty Award Ken Ross Receives Distinguished Faculty Teaching Award

Computer Science on Monday, May 19. “The Engineering School Alumni Professor Ken Ross Columbia Engineering School Association voted unanimously Alumni Association created this to approve the selection. was selected as one award more than a decade ago of the two recipients of Students enthusiastically wrote to recognize the exceptional that courses taught by these this year’s Columbia commitment of members of the professors were the best they Engineering School SEAS faculty to undergraduate have taken at Columbia. The Alumni Association education,” said Mr. Lee. “This qualities that both professors year, I am pleased to present (CESAA) Distinguished share and the ones most these awards to two senior frequently mentioned by Professor Ken Ross Faculty Teaching Awards. faculty members, a testament students are their enthusiasm to their continuing faithfulness The other recipient was for the subject matter, caring to the central mission of attitude, approachability, Professor David Keyes of the teaching undergraduates.” Department of Applied Physics responsiveness to student and Applied Mathematics. The awardees were selected concerns, and the ability to by a Committee of the Alumni make complex subject matter Professor Ross was recognized Association chaired by Eric understandable. for his courses in databases Schon ‘68, with representation and problem solving. Mr. Lee from the student body, and presented the award to based on nominations from the Professor Ross and Professor students themselves. The Board Keyes at Class Day ceremonies of Managers of the Columbia

CS@CU WINTER 2008 5 Featured Awards CUCS Faculty Receive Google Research Awards

Professor Professor Professor Professor Professor Peter Allen Steven Bellovin Angelos Keromytis Rocco Servedio Sal Stolfo

Professors Peter Allen, Steven Bellovin, Professor Peter Allen will be Angelos Keromytis will investi- Angelos Keromytis, Rocco Servedio investigating semantically gate techniques for addressing searchable dynamic 3D the privacy and confidentiality and Sal Stolfo have all recently received databases, developing new concerns of businesses and Google research awards for projects on 3D methods to take an unstructured individuals while allowing for databases, safe browsing, anonymous net- set of 3D models and organize the use of hosted, web-based working, privacy-preserving web searching them into a database that can applications such as Google and martingale ranking algorithms. be intelligently and efficiently Docs and Gmail. Specifically, queried. The database will be the project will combine data searchable, tagged and dynamic, confidentiality mechanisms with and will be able to support Private Information Matching queries based on whole object and Retrieval protocols, to and partial object geometries. develop schemes that offer In the project titled “Safe different tradeoffs between Browsing Through Web-based stored-data confidentiality/ Application Communities”, privacy and legitimate business Professors Angelos Keromytis and user needs. and Sal Stolfo will investigate Professor Rocco Servedio was the use of collaborative soft- awarded a Google Research ware monitoring, anomaly Award to develop improved detection, and software self- martingale ranking algorithms. healing to enable groups of Martingale ranking is an exten- users to browse safely. The sion of martingale boosting, a project seeks to counter the provably noise-tolerant boosting increasingly virulent class from learning theory of web-bourne by which was jointly developed by exchanging information among Prof. Servedio and Phil Long, a users about detected attacks researcher at Google. A goal of and countermeasures when the project is to design adaptive browsing unknown websites and noise-tolerant martingale or even specific pages. rankers that perform well ‘at the In the project “Privacy and top of the list’ of items being Search: Having it Both Ways ranked, which is where accurate in Web Services”, Professor rankings are most important.

6 CS@CU WINTER 2008 Featured Articles

The following piece by Professor Steven Bellovin appeared as an The Physical World and the Real World “Inside Risks” article in the May 2008 issue of Communications Professor of the ACM. Steven Bellovin

Most of us rely on the Internet, for news, For this problem, there are no The DNS-related incidents are entertainment, research, communication good solutions. Anyone whose scarier because they do reflect business depends on connec- deliberate actions, with the with our families, friends, and colleagues, tivity through this region must force of the U.S. legal system and more. What if it went away? take this into account. behind them. In one case, the The dangers aren’t only physical. Wikileaks.org web site was Precisely that happened to many length of fiber optic cable with The last few months have also briefly deleted from the DNS people in early February, in the you when going hiking in the shown that a 1999 National by court order, because a bank wake of the failure of several wilderness. If you get lost, throw Research Council report was claimed the site contained undersea cables. According to it on the ground. A backhoe will quite correct when it warned of stolen documents. (The site some reports, more than 80 soon show up to sever it; ask the fragility of the routing system owners had apparently fore- million users were affected by the driver how to get home.) and the domain name system. seen something like that, and the outages, Both the failure The obvious answer is to run had registered other names and the recovery have lessons In one highly-publicized incident, for the site in other countries: some backup cables on land, a routing mistake by a Pakistani to teach us. bypassing the chokepoint of the the .org registry is located in ISP knocked Youtube off the air. the U.S.) In a second incident, The first lesson, of course, is Red Sea. Again, a glance at the There was a lot of speculation that failures happen. In fact, map shows how few choices a U.S. government agency that this was deliberate—the ordered the names of some multiple failures can happen. there are. Bypassing the Red government of Pakistan had Simply having some redundancy Sea on the west would require non-U.S. sites removed from ordered Youtube banned within .com (again, located in the may not be sufficient; one needs routing through very unstable the country; might someone to have enough redundancy, countries. An eastern bypass U.S.) because they violated the have tried to “ban” it globally?— embargo against Cuba. of the right types. In this case, would require cooperation though later analysis strongly geography and politics made from mutually hostile countries. suggests that it was an innocent What can we learn from these life tougher. Neither choice is attractive. mistake. An outage affecting incidents? The moral is simple: The geographical issue is clear From this perspective, it doesn’t such a popular site is very the Internet is a lot more fragile from looking at a map: there matter much just why the noticeable; there was a great than it appears. Most of the aren’t many good choices for an cables failed. Cables can be deal of press coverage. By time, it works and works very all-water route between Europe by ship anchors, fishing contrast, when a Kenyan well, without government and the Persian Gulf or India. trawlers, earthquakes, hostile network was inadvertently interference, routing mistakes, And despite this series of events, action, even shark bites. hijacked by an American ISP, or outages due to occasional cables are generally thought Regardless of the cause, when there was virtually no notice. fiber cuts. Sometimes, though, to be safer on the seabed than so many cables are in such a Quieter, deliberate misrouting— things go badly wrong. on land. (A standing joke in the small area, the failure modes say, to eavesdrop on traffic to Prudence dictates that we network operator community are no longer independent. or from a small site—might go plan for such instances. holds that you should bring a completely unnoticed.

CS@CU WINTER 2008 7 Research Focus Training air traffic controllers using VoIP

Professor site visits to the FAA facility in Henning Oklahoma City and have taught Schulzrinne FAA technical staff in two training sessions at Columbia University. Designing such a system within the resource constraints of a university lab posed special Voice-over-IP has become challenges, especially over the last two years, after the a well-established technol- FAA went ahead with the ogy and is well on its way production system deployment. to replace the traditional The system being deployed in circuit-switched phone an environment where none of the users are CS majors, has services, both within the meant that even trivial problems enterprise and for the need administrator intervention public phone system. —with a few of those being escalated all the way to the As part of a research project Washington headquarters! with the Federal Aviation An air traffic control display With remote debugging proving Administration (FAA), the Internet to be daunting and stoppage Real-Time (IRT) Laboratory read about VoIP and contacted Designing a system in an of ATC training having legal explored how to use these tech- Prof. Henning Schulzrinne to unknown domain, aviation implications, the IRT team had nologies for a somewhat unusual see if technology developed at training, mandated the to resort to on-site visits, even purpose: training future air traffic Columbia could be repurposed use of alternate software for seemingly trivial issues. controllers in radio communica- to build a modern training facility. development methodologies— Problems with the associated tions skills. In scripted training hardware (e.g., issues The training rooms at the FAA rapid prototyping and iterative exercises, students learn how to in a push-to- headset) and Academy had three parallel development—and as the communicate with (actor) pilots occasional bugs in the VoIP networks: a data network, a prototypes went back and forth and other air traffic controllers system, did not prove helpful voice network and a video between the FAA and Columbia, (ATCs), interacting with user either. Needless to say, the network. VoIP could unify all the IRT researchers built up interfaces that approximate the project team has learned how communications on a single significant domain expertise. technology they will find in a important extensive logging Ethernet and allow to re-con- A uniform wrapper-based inter- control tower. Depending on and fail-safe mechanisms are, figure the classroom setting face efficiently handles the the scenario, small groups of air as well as having well-trained through a control panel. Radio large number of I/O devices and traffic controllers and pilots need eases the integration of newer local system administrators. to be linked together. Unlike links were modeled using IP multicast, while tower-to-tower I/O devices. Similar to industrial Encouraged by the potential of most research prototypes, the control systems, self-correcting software must survive 12 hours communications used point- the initial VoIP system, the FAA to-point VoIP sessions. The mechanisms were added that funded a subsequent effort to a day of classroom exercises, employed heart-beat checks with changing sets of instructors. instructor has to be able to design an advanced VoIP system listen in on on-going exercises; to restart any non-responding to be used in the high-fidelity lab Thousands of ATCs need to be components and restore its trained in the next few years, recordings are used to evaluate environments. While the system the performance of ATC trainees. status prior to the fault. The is for training only, the on- as the generation of controllers project initially started with one hired after the 1981 strike retires. The system is build on most going FAA Next Generation Air of the protocols developed in facility, but the system has since Transportation System (NextGen) The project has its origin in 2004, the IRT lab, including SIP, RTSP been deployed and operational in is also likely to use VoIP, albeit when one of the training man- and RTP.Several exercises are five FAA Academy training rooms developed by much larger teams agers of the FAA Academy in typically run in parallel and must since late 2006, with two more and probably not in a university. Oklahoma City, Susan Buriak, felt not interfere with each other. expected by the end of 2008. that they could no longer contin- Except for supporting software Next time you listen in on air ue using the hard-wired analog To increase verisimilitude with components such as operating traffic radio on channel 9 on communication system which the systems found in a control systems, web servers and SQL United Airlines, you may well had been in use for many years tower, the simulator includes database, all core components be listening to a controller in their training facility. Its hard- many of the same user interface are based on software developed trained on CUCS software... ware was becoming obsolete components, such as a push-to- by IRT students and staff, often and any custom manufacturing talk microphone, a foot pedal extending or modifying work would have been prohibitively and screens that can be done earlier for other projects. expensive, so the FAA needed an reconfigured to approximate various console configurations. In the course of the project, alternative. The FAA manager had students have made five on-

8 CS@CU WINTER 2008 New Initiative The Emerging Scholar’s Program

Ph.D. Student Ph.D. Student Chris Murphy Kristen Parton

Professor Adam Cannon ESP peer leader and junior CS major Sahar Hasan helps students complete a decision for the 9 coins problem

Did you know that analyze the problems and workshops, with help from Although the workshops are from 1984 to 2007 brainstorm solutions. Since workshop assistant and CS major led by an undergraduate peer the percentage of there is no programming and Sahar Hasan. After a successful leader, the classroom materials no homework, students can pilot program, ESP was awarded are prepared by Ph.D. student CS degrees awarded focus on algorithmic thinking a $15,000 two-year grant from coordinators in conjunction with to women nationwide and analyzing problems to see the National Center for Women Professor Cannon. One of the has actually decreased, that there is more to CS than in Information Technology favorite sessions last semester from 37% to 19%1? just programming. and Microsoft Research. With was a social networking work- Professor Kathleen McKeown this grant, as well as strong shop, which used friendship In the Columbia Computer has noticed the scarcity of support from the Columbia CS networks from Facebook.com Science Department last year, undergraduate women in the department, ESP has expanded to explore algorithms, 18% of undergraduates were courses she teaches, such as to two sections this Fall, with such as finding friends of women, as compared with 37% Artificial Intelligence.“Discussing a total of 15 enrolled students. friends and detecting cliques. of all SEAS undergraduates.To try how to approach and solve these ESP follows the Peer-Led Team ESP coordinator Murphy said, to counter this trend at Columbia, problems without programming Learning teaching paradigm, “It’s challenging to come Ph.D. students Chris Murphy and will let these women see early which has been successfully up with interesting and fun Kristen Parton and Computer on why Computer Science is fun. used in Computer Science material, especially so that the Science Professor Adam I think ESP can give them an departments at Duke University, workshops are more interactive Cannon started the Emerging edge and sense of confidence to Georgia Tech, Rutgers University and don’t turn into lectures. Scholars Program (ESP), a combat the feeling that the men and others. Since the workshops The feedback we get from the peer-led workshop designed know more.” Another benefit are led by another student rather students has really helped to encourage talented female of ESP is improved networking than a professor or TA, the us understand what sorts of students to stay in the CS major opportunities for beginning CS sessions can be more interactive. topics they’re interested in.” after introductory CS classes. students. Since many later CS Participant Jana Johnson said, The program received very pos- The goal of ESP is to show courses involve group projects, “I loved being in a small group itive feedback at the end of the students that CS is necessarily it is important to get to know because it forced us all to semester, with all students say- a collaborative activity and that peers in the major early on. participate and I also loved having ing they would strongly recom- it involves much more than just As one ESP participant said, a workshop leader that was mend it to another student. ESP programming. It is targeted “Women make up a small part close in age and encouraged student Diana Cimino explained: towards female students enrolled of the CS major so it’s nice to creative thinking.”Johnson “Through the varied workshops, in Introduction to Computer meet other girls who might has continued with ESP this I was exposed to interesting Science classes who have not share the interest.” semester as a workshop people and ideas, realizing the yet declared a major, but are Last semester, ESP started as assistant, along with sophomore breadth of an entirely fascinating contemplating CS. Each week, a pilot program, supported by Elba Garza, who also participated subject in which I had no small groups meet with a peer Women in Computer Science last semester. Hasan, who previous experience.” leader who presents one or (WICS), with six students from started in the spring as a work- For more information about ESP, see more problems from a variety the Introduction to Computer shop assistant, is continuing this http://www.cs.columbia.edu/esp/ of CS fields, from robotics to Science course. CS major Kim fall as a peer leader, along with natural language processing. Manis led weekly hour-long junior CS major Shylah Weber. 1 National Center for Women & Information Technology Students work as a group to Fact Sheet. http://www.ncwit.org/about.factsheet.html

CS@CU WINTER 2008 9 Research Snapshot

Apple2fpga: Reconstructing The original system: Apple II+; c.1982; Synertek 6502; TTL, 16K DRAMs, an Apple II+ 1 MHz NMOS CPU; CPU: 4K transistors?; 22 Watts, 31 when disk spinning The Apple II had three video a bad idea in modern designs, on an FPGA modes: a 40x24 uppercase-only where the use of on-chip black-and-white text display, a PLLs with skew control is the 40x48 16-color low-resolution preferred technique), I ran display, and a 140x280 6-color most of the core at the 14 high-resolution display. MHz clock and distributed latch The Apple II can almost be enable signals to anything that Professor thought of as a video display was originally clocked at one Stephen device that happens to have a of the derived frequencies. Edwards microprocessor connected to it. Woz started with a 14.31818 My Reconstruction MHz master pixel clock—exactly My Apple II core consists of four times the 3.579545 MHz a timing generator, a video colorburst frequency used in generator, the 6502 processor As a Christmas present memory. In the Apple II, it runs NTSC video. Because of the way core, which I took from Peter color was added to the black-and- to myself in 2007, at slightly above 1 MHz. Aside Wendrich’s Commodore 64 from the ROMs and DRAMs, white NTSC standard, seemingly emulator, the ROMS, and some I implemented an 1980s- the rest of the circuitry consisted black-and-white patterns sent random logic for address era Apple II+ in VHDL of discrete LS TTL chips. at exactly this frequency are decoding and other onboard interpreted as color. to run on an Altera DE2 While the first Apple IIs shipped I/O. Broadly, the core expects FPGA board. with 4K of DRAM, this quickly Woz derived the CPU clock a 14.31818 MHz clock signal, grew to a standard of 48K and from the 14.31818 MHz clock input from the keyboard, access The point, aside from entertain- later to 64K through the help of by dividing by roughly fourteen. to a 64K RAM, and access to ment, was to illustrate the bank switching and a memory I say roughly because in fact the peripheral bus (currently, power (or rather, low power) of expansion card. DRAMs, at every sixty-fifth CPU cycle (once just a disk drive emulator) modern FPGAs. Put another this time, were cutting-edge per horizontal scan line) is and generates a one-bit video way, what made Steve Jobs his technology. While they were stretched by two 14 MHz clock stream along with one bit of first million can now be a class difficult to use, requiring +5, periods to preserve the phase audio data for the speaker. project (well, almost) for my +12, and -5V power supplies of the 3.58 MHz colorburst To make this core usable, I 4840 embedded systems class. and needing to be periodically frequency. Thus, there are attached to it a /2 keyboard refreshed, their roughly six-times 65*14+2=912 pixel periods per controller (from Alex Freed), a What is an Apple II? improvement in density over line, or exactly 228 cycles of the disk emulator, and a VGA line- 3.58 MHz colorburst per line. The Apple II was one of the more standard static memory doubler that converts the 15 first really successful personal chips made it worthwhile. Both the CPU and video access kHz horizontal refresh rate of computers. Designed by Steve Breathless copy in an early ad a byte of memory at 1 MHz. the Apple’s one-bit video output Wozniak (“Woz”) and first for the Apple I explicitly touted Their accesses are interleaved, to a color VGA output with 30 introduced in 1977, it really took the advantages of DRAM chips. so the DRAM effectively kHz horizontal refresh rate. off in 1978 when the 140K While the Apple II came with operates at 2 MHz. Another The line doubler contains mem- Disk II 5.25-inch floppy drive was an integrated keyboard, a Woz trick: the video addresses ory for two lines of video from introduced, followed by VisiCalc, rudimentary (one-bit) sound are such that refreshing the the Apple; I display one twice the first spreadsheet program. port, and a game port typically video also suffices to refresh while the other is being read connected to a two-axis analog the DRAMs, so no additional from the Apple’s video out. In Fairly simple even by the refresh cycles are needed. standards of the day, the Apple II joystick, its main feature was addition to interpreting and was built around the inexpensive its integrated video display. In my reconstruction, I tried to generating the horizontal and 8-bit 6502 processor from MOS It generated composite (base- reproduce the behavior of the vertical synchronization signals, Technology (it sold for $25 when band) NTSC video that was timing circuitry (including the it converts the one-bit video an Intel 8080 sold for $179). usually sent through an RF stretched cycle) as closely as stream to color using an overly- The 6502 had an eight-bit data modulator to appear on e.g., possible. However, rather than simple algorithm: every four bus and could access 64K of a TV’s channel 3. actually generate lower-frequency 14 MHz pixels are interpreted clocks in this way (generally as a block and displayed as a

10 CS@CU WINTER 2008 The system emulated in software: Optiplex Gxa; c.1998; Intel Pentium II; The system emulated on an FPGA: Altera/Terasic DE2; c.2006; Altera 233 MHz, 250 CMOS CPU; CPU: 7.5M transistors; 62 Watts EP2C35F672C6 Cyclone II; 90 nm CMOS; 33K LEs, 150M transistors?; 5 Watts single color. This is reasonably Each time the track number constant for the Dell (while ects) as well as for virtually every effective, and produces some of changes, my controller reads the emulator was running) and video arcade game (see the the odd color fringing effects of from the SD card a new group FPGA board; the Apple II’s MAME project) and videogame the original Apple II, but is not of 6655 bytes (256*25) into power consumption varied console. Apple itself has used quite right since a true television track memory. Meanwhile, considerably when the disk such emulation technology to set actually performs more I emulate the spinning disk was active (spinning). allow, e.g., Mac OS 9 programs gentle low-pass filtering on by periodically changing the Only the number for the Apple II to run under Mac OS X. the luminance and chrominance address of the data read by is really fair. The Dell is freakishly There are fewer historical hard- signals. A better algorithm would the CPU when it accesses overpowered, had far more ware emulation projects, owing consider bits adjacent to each the disk I/O locations. memory (192 MB) than largely to the need for FPGA group of four. Nevertheless, For debugging, I brought out necessary to run and the hardware, but a variety of video the display is certainly usable. the CPU’s PC to four of the Apple II emulator, and never arcade games as well as older My emulation of the Disk II seven-segment LED displays used its floppy drive, ROM, home computers have also is primitive (read-only) but on the board and the current etc. The FPGA board has been emulated (such as the functional. I store a “nibblized” track for the drive on another similar problems: it also has VIC-20; see fpgaarcade.com image of a 5.25-inch floppy on two. While the PC is usually (unused) Ethernet, USB, and and pacedev.net). One of my an external SD card and read it changing so fast it becomes a NTSC video interfaces as well hopes is that my project will into on-chip memory a track at a blur, patterns do often emerge. as SDRAM and Flash chips. encourage others to reproduce time. I say “nibblized” because For example, the PC remains While none were running, they their favorite systems in a like most disk drives, the Disk II in a small range when the still consumed some power. more future-friendly format. doesn’t directly store binary computer is waiting at the data but instead encodes it to prompt. Similarly, I have found Conclusions limit the maximum length with a lot of software, including the no magnetic polarity changes, when it is My FPGA version of an Apple which makes it possible to moving the drive , calls II+ succeeded in entertaining recover the bit clock. Since the the monitor’s “delay” routine me and demonstrating some- Apple II decodes this bitstream to slow things down. thing to my 4840 class, but in software, my disk controller has had additional impact. Alex delivers the raw (encoded) Comparing Freed, whose previous Apple II- on-an-FPGA inspired my own, bitstream. Although this wastes Implementations flash storage space, an encoded went back and further improved 140K disk image only takes A central motivation for this his design, and it got some play 228K. With 2 Gb SD cards project was to illustrate how on the “reddit” website, in which selling for less than $10 as of integrated modern FPGAs people vote for interesting late 2008, it hardly matters. have become and how little things they see on the Internet. power they can consume. So I It even prompted a job offer I coded an SPI interface (one of (relax; I’m still at Columbia). the three spoken by typical SD compared the power consumed by an actual Apple II+, an Apple Finally, I’m using it to power the cards) in VHDL that is able to sign for our Computer Systems wake up the card and download II+ emulated in software, and my FPGA reconstruction. Laboratory, which we wrote random 256-byte blocks. about in an earlier newsletter. Controlling this is a simple To measure power, I used the emulator for the Disk II controller, inexpensive and very easy to My Apple II+ reconstruction is which provides a low-level use “ A Watt” power meter. one in a growing collection of The Apple II-on-an-FPGA controlling the sign interface to the drive. For This claims 0.2% accuracy, historical computer emulators for the Computer Systems Laboratory. example, the read/write head is which is plenty to get a rough available on the Internet. moved under software control by idea of what (watt?) is going on. Motivated by nostalgia as much selectively pulsing four stepper as historical presentation, soft- In each case, I measured ware emulators for virtually every motor phases. My hardware the power consumed by the emulator watches these pulses successful computer platform system excluding the monitor. have already been written (e.g., and follows which track the The power was more-or-less head currently resides. see the MESS and SIMH proj-

CS@CU WINTER 2008 11 New Faces

Mohamed Ragesh Jaiswal environments. During 2007-2008 Troy Lee Altantawy grew up in he was with Mysapient Ltd (UK) graduated graduated from India, where as a software engineer and at from Caltech The American he received his the same time served as a secu- with a B.S. in University in undergraduate rity consultant at the Hellenic mathematics Cairo with a B.S. degree in National Defense General Staff, in 2000. He in Computer Computer department of cyber defense. then did his Science in 2005. Science and While an undergraduate Masters and He worked at IBM Egypt from Engineering at the Indian student he was also a member Ph.D. in Computer Science at 2005-2006. Mohamed did an Institute of Technology Kanpur of Microsoft’s developers the University of Amsterdam, M.S. in computer science at The in 2003. He moved to San platform evangelists group. finishing in 2006. Before American University in Cairo Diego, CA to pursue a Ph.D. In 2007 he was awarded the becoming a postdoc at from 2006-2008. He started as in Computer Science and Ericsson’s award of Excellence Columbia, where he is working a Ph.D. student in fall 2008. His Engineering at the University of in Telecommunications for his with Prof. Rocco Servedio, he main research interests are California San Diego under the B.Sc. thesis. Vasileios joined was a postdoc at University Computational Linguistics and supervision of Prof. Russell the Ph.D. program at Columbia Paris-Sud and Rutgers University. Social Networks. He is currently Impagliazzo. He graduated from in Fall 2008 and currently he His main research interests are working with Dr. Owen Rambow UC San Diego in 2008. His is working with Prof. Angelos in computational complexity, in in the Center for Computational doctoral research work focuses Keromytis in network security. particular communication com- Learning Systems. on Direct Product Theorems plexity and quantum computing. and their applications in Ahmed M. Ang Cui Cryptography, Derandomization, El Kholy Wei-Yun Ma graduated Computational Complexity, and graduated from graduated from Columbia Error correcting codes. He is now The American from Columbia University with a a postdoc at Columbia working University in University B.S in Computer with Prof. Rocco Servedio. Cairo (AUC) with a M.S. Science in 2005. with a B.Sc. in Computer After a three Kangkook Jee in Computer Science in year network graduated Science in June 2005. He 2008. Before administration adventure at from Korea worked at IBM Corporation from coming to Columbia in 2006, D.E. Shaw & Co, he returns to University 2005-2006 and 2008, did a M.Sc. he had worked at Institute of Columbia as a Ph.D. student. with a B.S. in in computer science at the same Information Science, Academia Ang’s research interests include Mathematics University (AUC) from 2006- Sinica, Taiwan for five years. network security and intrusion in 2001. 2008, and started as a Ph.D. He started his Ph.D. studies detection systems. He is Afterwards, he student at Columbia University at Columbia in Fall 2008. He currently working in Prof. Stolfo’s worked at IBM Korea for five this Fall 2008. Ahmed’s main is working with Prof. Kathy IDS lab. In his free time, Ang years. Kangkook did an M.S in research interests are Natural McKeown in the Natural plays the erhu and aspires to computer science at Columbia Language Processing, Social Language Processing Lab. become a competetive cyclist. from 2006-2007 and started as Networking and Machine His research interests include a Ph.D. student in Spring 2008. Learning. He is currently working Natural Language Processing, Joshua B. His main research interests are with Dr. Nizar Habash at The Information Retrieval and Gordon operating systems and system Center for Computational Machine Learning. He is graduated from security. He is currently Learning Systems (CCLS). currently working on improving Columbia with a working with Prof. Angelos machine translation for a B.S. in Computer Keromytis’s NSL (Network Sungjun Kim graduated from multilingual QA system. Science in Security Laboratory) group. Seoul National University with December 2006. a B.S. in Electrical Engineering Snehit Prabhu He returned to Vasileios (Major) and in Business hails from a pursue a Ph.D. in Fall 2008 after Kemerlis Administration (Double Major) small seaside working at Lawrence Livermore graduated in February 2002. He worked town in Goa, National Laboratory. His research in 2006 at BIOMEDLAB Corporation India. He interests are in artificial intelli- from Athens in 2002, HUNEED Corporation finished his gence and natural language. University of in 2003-4, and TOPFIELD undergrad He is currently working with Economics and Corporation in 2004-6, and education in Dr. Passonneau at the Center Business with started as a Ph.D. student in Bangalore in 2003, where he for Computational Learning a B.Sc. in Computer Science. Spring 2008. Sungjun’s main majored in Computer Science. Systems. During his undergraduate studies research interest is Embedded He subsequently spent a few he held appointments with the Systems. He is currently years working at IBM Software same department as a network working with Prof. Stephen Labs and Research, where administrator, lab assistant A. Edwards’s group, and is he dabbled in Bioinformatics, and, after his graduation, as a involved in the Precision Timed Autonomic Computing and Web research fellow working in the (PRET) project. Services among other things. areas of wireless networks He joined the Ph.D. program at and data integration in mobile Columbia in Spring 2008. His

12 CS@CU WINTER 2008 primary research is in Theoretical Vishal K. Singh 2006. Sampada did an M.S. in Jingyue Wu and Computational Biology: did his B. Computer Science at Columbia graduated particularly, understanding the Technology in University from 2006-2007, and from Tsinghua role and contribution of genetic Electronics and started as a Ph.D. student in University variation in evolution. He Communications Spring 2008. Her advisor is Prof. with a B.E. works at the Computational Engineering Luca Carloni. Sampada’s main in Computer Genetics Lab headed by Prof. from National research interests are design Science in July Itsik Pe’er, and is also advised Institute of automation and formal methods. 2008. He started by Prof. Yechiam Yemini. Technology, Warangal in 2001. She is currently working as a Ph.D. student at Columbia He worked at , on system-level design of in Fall 2008. Jingyue’s main Baolin Shao Bangalore from 2001 to 2004 distributed embedded systems. research interests are software graduated before joining Columbia systems and operating systems. from School University as an M.S. student, Li-Yang Tan He is currently working with Prof. of Software and completed his M.S. in graduated from Junfeng Yang and studying fixing Engineering, 2006. After his M.S. he worked Washington errors of software systems. Huazhong at Motorola and NEC-Labs. University University of Vishal started as a Ph.D. student in St. Louis Young Jin Yoon Science and in Fall 2008. His main research with a B.A. in graduated Technology with a B.E. in interests are system manage- Mathematics from Korea Software Engineering in June ment, automated diagnosis in and an M.Sc. in University 2007. Before that, he held an networks, IP QoS management, Computer Science in May 2006. with a B.S. internship at Infosys Technologies presence, and SIP. He is currently He served in the Singapore in Computer Ltd. from September 2006 to working with Prof. Schulzrinne. Armed Forces from 2006 to Science in June April 2007 in Bangalore, India. 2008, and started as a Ph.D. 2006. He did He started as a Ph.D. Student Breannan Smith student at Columbia in Fall an M.S. in computer science at this fall. Baolin’s main research graduated from 2008. Li-Yang’s main research Columbia from 2006 to 2007, interests are computer language the University interest is theoretical computer and started as a Ph.D. student and compilers, and formal of North Carolina science, and he is currently in Spring 2008. Young Jin’s verification. He is currently at Chapel Hill in working with Professors Tal main research interests are working with Prof. Stephen A. 2007 with a B.S. Malkin and Rocco Servedio. Network-on-Chip (NoC) and Edwards on the SHIM project. in Computer Computer Architecture. He is Science and Vishwanath currently working with Prof. Swapneel minors in Mathematics and Venkatesan Carloni’s group. K. Sheth Physics. As an undergraduate, graduated from graduated from Breannan’s research focused Anna University, John R. Zhang Sardar Patel on the simulation of nonlinear, Chennai, India graduated from College of viscoelastic fluids. Breannan is with a B.E. the University Engineering, currently working with Prof. Eitan in Computer of Calgary with Mumbai, India Grinspun within the Columbia Science and a double B.S. with a B.E. Computer Graphics Group where Engineering in June 2006. He did in Computer in Computer Engineering in he is investigating robust collision his M.S. in Computer Science at Science and May 2006. He did an M.S. in and contact schemes for the University of Houston from 2006- Mathematics in Computer Science at Columbia simulation of highly deformable 2008, and started as a Ph.D. June 2008. He started as a Ph.D. University from 2006 to 2007 systems with large degrees of student at Columbia in Fall 2008. student in Fall 2008, pursuing and started as a Ph.D. student multiple collisions and complex Vishwanath’s main research research in computer vision. in Spring 2008. Swapneel’s contact geometries. interests are high performance John is currently working with main research interest is architectures and parallel Prof. John Kender studying Software Systems. He is Sampada computing. He is currently gesture recognition. currently working with Prof. Sonalkar working with Prof. Simha Gail Kaiser on using social graduated Sethumadhavan’s group on auto- Finally, one networking metaphors for from Mumbai matic legacy code parallelization. departing knowledge sharing and University, face deserves scientific collaborative work. India with a B.E. special mention. In his spare time, Swapneel in Computer Awilda Fosse, likes to read, movies, Engineering Assistant to and travel to new places. in June 2001. She worked as Prof. Joseph a Research Assistant at Indian F. Traub and Institute of Technology Bombay Editorial Coordinator, Journal and subsequently continued of Complexity, retired in July there for a M.S. in Computer after 8+ years in the CS Science from 2002-2004. She Department. We will miss her worked at General Motors professionalism, her positive Research in India from 2004- attitude, and her great sense of humor.

CS@CU WINTER 2008 13 Student News “What I did with my summer vacation”

After a busy year in raw word streams that, even classrooms and labs with no recognition errors, are often difficult to read—not only at Columbia, CUCS for humans, but also for natural students branched language processing tools, which out to a wide range usually expect formatted text of interesting jobs as input. During his internship and internships over Agustin helped develop machine learning tools for automatically the summer. Here are inserting punctuation symbols, brief snapshots of capitalizing words, and formatting what a few computer numbers. It was also a great science students did opportunity to meet colleagues from the industry, and to with their summers experience the famous working away from campus. conditions at Google, with endless perks and an amazing M.S.student Rohit Gupta work environment. writes,“This summer I worked Ph.D. student as a Software Engineer Intern at Homin K. Lee Electronic Arts (EA) in Redwood SEAS undergraduate Tristan Naumann (second from right) had a productive summer at Google NYC. spent his City, CA. The most exciting summer of work, the 26 members of Ph.D. student Snehit Prabhu part of my internship was the vacation as an the Robotics Academy heard spent an eventful summer in opportunity to work on ‘The intern at the lectures from robotics and sunny Boston, at the Broad Sims 3,’ the sequel to the IBM Almaden space industry professionals Institute of MIT and Harvard. highest-selling PC game in the Research Center twice a week. We traveled to When he wasn’t biking through history of video games (to date in San Jose, CA. He worked MIT, Carnegie Mellon, and the quaint, sunburnt-brick lined ‘The Sims’ has sold more than in the Theory Group under the Johnson Spaceflight Center streets around Quincy Market, 100 million copies all over the guidance of Vitaly Feldman and to learn about their robotics he was going on long hikes or world). Working on something so T. S. Jayram. Homin is happy to programs. At Carnegie Mellon, learning Lindy-Hop. In his free big, with such a big fan following, report that the weather never we rode in the DARPA Urban time, he studied the ways in was like a dream come true, changed for all three months Challenge winning vehicle, which populations of Baker’s and being in a company where he was there. Boss, in autonomous mode Yeast grown under harsh everybody is either young or on a closed course.” environments (like high salt young at heart was a totally and food scarcity) responded different experience. Being a Undergraduate student Tristan to the evolutionary pressures technical person, I appreciated Naumann (SEAS ‘09) writes: imposed on them. With each the opportunity to interact “This summer I worked at successive generation, the with people having different Google NYC as an Associate population accumulated backgrounds such as artists, Product Management (APM) mutations in its DNA, in the producers, and directors—one of Intern on Google Print Ads. My hope of having fitter, more my team members had worked internship required a unique robust progeny. Comparing the on movies like ‘The Matrix’ combination of broad technical DNA sequences of populations and ‘Sin City.’ ‘The Sims 3’ will knowledge, communication and grown separately under be released in February 2009; Undergraduate student organization in order to create these trying circumstances, you’d better grab a copy!” Benjamin Mann (SEAS ‘11) new features and manage their Snehit’s team found statistically writes:“Last summer, I was development. The people I Ph.D. student significant differences in accepted for NASA’s 10 week met—other interns, full-timers, Agustin genetic constitution between Robotics Academy. With two and even founders—had Gravano spent them. Identifying biological other undergraduates and a diverse backgrounds but were his summer functions that were altered as masters student, I worked all extraordinarily gifted at as an intern at a result of this DNA variation for PI Mehran Armand on the whatever their jobs entailed. Google Research provided insights into natural stereo vision system for a On top of it all, everyone was in New York. selection and the mechanisms bipedal robot. The vision system full of energy and had a great Agustin worked by which these organisms will be used for many robotic sense of humor, making this with Martin Jansche, a member evolve over time. systems as a testing platform one of the best summers I’ve of the Speech Processing Group, for obstacle avoidance, path ever had! I even made it onto on post-processing the output planning, and terrain handling the official Google blog for the of automatic speech recognition algorithms, and more. Outside success of our cube decoration systems. Such systems produce project (see image above).”

14 CS@CU WINTER 2008 Department News The Department at Play: Spring Barbecue Ph.D. student Andrew Wan Barbecues in the courtyard are a cherished writes,“This tradition in the Columbia Computer summer was Science Department. Here are some photos filled with from the most recent event. adventure. First, I made a visit to Tsinghua University in Beijing, where I gave a talk on recent work with Dachman- Soled, Lee, Malkin, Servedio and Wee. Afterwards, I traveled around China with my father. There is a lot of construction in China! I spent the rest of the summer at SRI in Palo Alto working with Chris Peikert on trying to break the NTRU signature scheme. I was glad to pay my family in South San Francisco a visit. Finally, I also made a trip to Reykjavik to attend the ICALP conference.” Undergraduate Anthony Yim (SEAS ‘10) spent his summer working at a technology start- up company called Peek. Peek is currently creating a mass-market, super consumer friendly and contract free mobile email device to be launched in mid Sept 2008. Anthony worked with Amol Sarva (CC ‘98), a Columbia alumni who co- founded Virgin Mobile and was an original member of the Wireless Founders Coalition for Innovation. While at Peek, Anthony prepared the device for the nation-wide launch by performing extensive testing and readying Peek’s CRM system. Anthony also performed and presented independent research on mobile operating systems and processors for the company. In the process, he has learnt a lot about the workings of the telecommunications industry. While not at Peek, Anthony spent his time hiking on the Appalachian Trail and exploring New York City.

CS@CU WINTER 2008 15 Department News & Awards

Ph.D. student Salman Baset include a cash award, a laptop, an Computer Science Ph.D. students studied in the context of received a Marconi Young Scholar Intel mentor, and an internship. Paul Blaer and Matei Ciocarle stream processing models award for 2008. The award Melinda Agyekum, advised were featured in an article in of computation. recognizes the accomplishments La Nouvelle Republique on by Prof. Steven Nowick, Professor Luis Gravano and and promise of Ph.D. students was selected for her work in their work on scanning the in their early years who cathedral of Bourges. Ph.D. student Hila Becker asynchronous digital systems. received a Google Research are working in the areas of Asynchronous digital circuits Professors Peter Belhumeur and communications and networking. Award to work on local event perform synchronization and Shree Nayar are participating in search. Their project focuses on Baset’s Ph.D. work is on design- communication without using an NSF MURI project on human ing algorithms and protocols for two problems associated with a global clock, and thereby can recognition. A research team local event search, namely, how peer-to-peer voice and video provide greater flexibility and led by Prof. Rama Chellappa systems. The work leverages the to identify events of all sizes— timing-robustness in handling on- (ECE/UMIACS/CS) has won including small, not-so-prominent distributed nature of peer-to-peer chip and off-chip communication. a 2008 MURI award for their systems to solve connectivity events not necessarily covered The goal of Melinda’s work is proposal to develop face, gait, in mainstream sources— issues, making routing and to provide low-power encoding long-distance speech and other remote access service easier. and how to determine the techniques that will allow motion-based human recognition “geographical scope” of Computer Science Ph.D. students asynchronous communication algorithms tailored to the events—beyond their explicit Oliver Cossairt and Alexander to become more tolerant of maritime domain. The MURI location. The project will use the Gusev were awarded prestigious dynamic variability (e.g., soft- project will be coordinated by wealth and variety of sources NSF graduate fellowships. errors, cross-talk, noise, etc.) a team of researchers from that are available over the Web which has become an increasing University of Maryland with Olivier Cossairt, advised by Prof. to identify and characterize problem due to device scaling. partnering institutions in events, in turn to produce Shree Nayar, has been selected Columbia University, University for the NSF Graduate Fellowship Ryan Overbeck, advised by expressive, high-quality local Prof. Ravi Ramamoorthi, was of California at Colorado Springs event search results. to further his work on intelligent and University of California at displays. Intelligent displays selected for his research on real-time ray tracing on the CPU. San Diego, and international Professor Luis Gravano received are new types of visualization researchers from University of a three-year National Science systems that sense and react to Ray tracing is the core of many physically based algorithms Southampton, UK and University Foundation grant for the project their physical environment. Using of Queensland, Australia. The titled “Beyond Keyword Search: these displays, real and digital for rendering 3D scenes with global illumination (shadows, team will design and develop Enabling Diverse Structured objects are indistinguishable novel sensors, algorithms and Query Paradigms over Text in appearance, enabling new reflections, refractions, indirect illumination, and other effects), systems for maritime biometrics. Databases”, part of the NSF possibilities for rich user Information and Intelligent interaction in fields as diverse but hasn’t achieved interactive Professor Luca Carloni, together rendering times on commodity Systems: Advancing Human- as medical imaging, military with Professor Keren Bergman Centered Computing, visualization, and entertainment. computers until recently. Ryan (Electrical Engineering), received develops algorithms to ray trace Information Integration and Alexander Gusev, advised by a grant from the National Science Informatics, and Robust 3D scenes with high quality Foundation entitled “Photonic Prof. Itsik Pe’er, has been award- shadows, reflections, and refrac- Intelligence program. Keyword an NSF Graduate Fellowship Interconnection Networks for search is the prevalent query tions providing a higher degree Chip-Multiprocessor Computing for his research work in computa- of realism to interactive content. paradigm for text, but is often tional genetics. Genetic evidence Systems” in the CPA-CSA insufficiently expressive for shows that many pairs of Computer Science Ph.D. student program. This is a competitive complex information needs individuals purported as unrelated Carlos-Rene Perez was awarded program in which, this year, only that require structured data to one another actually share a six-year fellowship from the about 10-15% of the proposals embedded in text. To move an ancestor within the last few National Physical Science received funding. This research beyond keyword search, this generations. Unfortunately, Consortium (NPSC). Carlos-Rene project aims to harness the project exploits information the computational barrier of com- is advised by Professors Angelos recent extraordinary advances extraction technology, which paring all pairs to one another Keromytis and Jason Nieh, and in nanoscale silicon photonic identifies structured data in text, prevented such analysis on a the fellowship is sponsored by technologies for developing to enable structured querying. large scale. Sasha developed a the NSA. The primary objective optical interconnection networks The project explores a wealth linear time method for such all- of the NPSC is to increase the that address the critical band- of structured query paradigms against-all comparison, becoming number of qualified U.S.–citizen width and power challenges of and develops specialized cost- the first to analyze ancestry Ph.D.s in the physical sciences future chip multiprocessor-based based query optimizers for each of thousands of individuals, and related engineering fields, systems. The PIs will investigate query paradigm, accounting and finding surprising results emphasizing recruitment of a the complete cohesive design of for the efficiency and, critically, with implications to population diverse applicant pool of women an on-chip optical interconnection the result quality of the query genetics and disease research. and historically underrepresented network that employs nanoscale execution plans. The project minorities. NPSC accomplishes CMOS photonic devices and will also provide data sets and Computer Science Ph.D. students this objective by assisting enables seamless off-chip , for experimentation Melinda Agyekum and Ryan corporations and government communications to other and evaluation, to the community Overbeck were awarded Intel agencies and laboratories in computing nodes and to external at large over the Web, at Fellowships. These highly awarding doctoral fellowships memory. System-wide optical http://extraction.cs.columbia.edu/. competitive two-year fellowships to outstanding U.S. students. interconnection network architectures will be specifically

16 CS@CU WINTER 2008 Professor Julia Hirschberg was Professor Angelos Keromytis security requirements at massively-parallel on-chip named as one of the 12 initial was appointed to the scientific the implementation level. computer could push the level Fellows of the International advisory board of CERTH (Center This is a fundamental problem of scalability beyond what is Speech Communication for Research and Technology, as cryptographic security currently possible and have Association (ISCA). The purpose Hellas) by the Greek ministry formalisms are often criticized a broad impact in supporting of ISCA is to promote, in an of development. The Centre for for lack of relevance given the parallel applications in much international world-wide context, Research and Technology Hellas wide range of attacks available of computer science and activities and exchanges in (CE.R.T.H.), the largest research at the implementation level. engineering. all fields related to speech centre in Northern Greece, was The research extends models communication science and founded in March 2000. CERTH of cryptographic attacks to The National Science Foundation technology. The association is a non-profit organization that include various forms of private is funding a joint GATech, is aimed at all persons and directly reports to the General data tampering and access Bell Labs and Columbia next- institutions interested in Secretariat for Research and and brings the theory of crypto- generation Internet project. fundamental research and Technology (GSRT), of the Greek graphic constructions closer to The project, titled “NetSerV– technological development Ministry of Development. The security concerns in practice. In Architecture of a Service- that aims at describing, mission of CERTH is to carry particular, the tamper proofing Virtualized Internet,” investigates explaining and reproducing out fundamental and applied of a wide set of cryptographic how to allow network nodes to the various aspects of human research with emphasis on primitives is considered in an offer services beyond just routing communication by speech. development of novel products extended adversarial setting, packets. The Columbia team ISCA has about 1500 members. and services of industrial, such as digital signatures, public is led by Professor Henning Prof. Hirschberg has also served economic and social importance key encryption, secure function Schulzrinne. To address the as past president of ISCA. in a range of different fields. evaluation, as well as arbitrary limitations of services in the cryptographic functions. This current Internet, this work Professor Julia Hirschberg, Professor Angelos Keromytis research thus explores the presents a clean slate Internet along with Professor Ani received a grant from Intel: boundaries of what is achievable architecture, called NetServ, Nenkova at the University of “Tracking Sensitive Information algorithmically and practically based on the concepts of service Pennsylvania, have received a Flows in Modern Enterprises”. through cryptographic means. virtualization. NetServ strives to grant from the National Science The project will investigate break up the functions provided Foundation titled “Corpus-Based mechanisms for implementing Professor Steve Nowick by Internet services and to make Studies of Lexical, Acoustic- information accountability received a grant from the these functionalities available Prosodic, and Discourse and visualization in large- National Science Foundation, as modular building blocks for Entrainment in Spoken Dialogue”. scale distributed enterprise along with Professor Uzi Vishkin network services. This project Participants in human-human environments. Specifically, of the University of Maryland, addresses five major research conversation often entrain to one Angelos will investigate the to design new tools that allow challenges in the new service another, adopting the vocabulary use of Virtual Machine Monitors to design multi-core systems architecture: i) the definition and other behaviors of their (VMMs) and distributed that combine synchronous and of requirements for a service- partners. The project will coordination protocols to asynchronous elements. The virtualized Internet architecture, investigate many types of efficiently track the flow of grant is part of the Computing ii) the design of an architectural entrainment in two large corpora sensitive information (or, more Processes and Artifacts (CPA) framework for modular, virtual- of human-human conversations generally, information of interest program; only about 10-15% of ized services, iii) the identification to improve system behavior in to the administrator) throughout the proposals in the “Design of an initial set of key building Spoken Dialogue Systems (SDS). a distributed system. Although Automation for Micro and Nano blocks, which together can Prof. Hirschberg and Nenkova much work has been done on Systems” topical area were provide a foundation for common want to discover which types VMMs in recent years, the focus funded. The proposal focuses network services, iv) the of entrainment occur generally has been on more efficient on a particular existing easy- development of mechanisms and across speakers and which resource utilization and (from a to-program and easy-to-teach protocols for service discovery seem to be speaker-specific, security standpoint) component multi-core architecture. It then and service distribution, and which types of entrainment can isolation; little to no work has identifies the interconnection v) the design and implementa- be reliably linked to task success been done on fine-grained network, connecting multiple tion of a content distribution and perceived naturalness, and information flow tracking within cores and memories, as the service based on the NetServ which types of entrainment can a single system and across critical bottleneck to achieving architecture. The feasibility be automatically modeled in system boundaries. lower overall power consump- of the NetServ approach will SDS. The project focuses on tion. The target is to substantially be demonstrated by building determining which types of The National Science Foundation improve the power, robustness the prototype of a content system entrainment to users will is funding a new CyberTrust and scalability of the system distribution service. Overall, be most important to users and project by Professor Tal Malkin by designing and fabricating the objective of this work is most feasible for SDS. The team and her colleague at the a high-speed asynchronous to develop an architecture will also provide publicly available University of Connecticut titled communication mesh. The that provides an efficient and annotated corpora for future “Tamper Proofing Cryptographic outcome of the work could extensible architecture for research by others. Prof. Nenkova Operations”. This research project make a step in the paradigm core network services, to received her doctorate from focuses on the development shift from serial to parallel that implement a prototype of the Columbia University in 2006. of cryptographic mathematical the field is now undergoing; new architecture, design and models and constructions the resulting first-of-its-kind implement a prototype of a that address realistic partly-asynchronous high-end content distribution service to

CS@CU WINTER 2008 17 Department News & Awards (continued) demonstrate the feasibility of responsible for advising the Joseph F.Traub, Edwin Howard from the Friedrich-Schiller- NetServ, and to evaluate the Board and management on Armstrong Professor of University in Jena, Germany. new architecture in a simulation matters relating to the support Computer Science, has been The Polish Academy of Sciences environment as well as on a and adoption of applications, elected to the Board of Directors (http://www.pan.pl/english/) GENI-like test bed. middleware, security and other of the Marconi Society. The is a state scientific institution capabilities across the Internet2 Society is best known for the founded in 1952. It functions The National Science Foundation membership and its collaborators Marconi Prize, considered the as a learned society acting is supporting a joint project on around the globe. The AMSAC most prestigious award in the through an elected corporation next-generation emergency is responsible for interacting field of communications and of top scholars and research calling services, with researchers directly with other key advisory the Internet. Among its other organizations, via its numerous from the University of North committees on technical and activities, the Society holds scientific establishments. It has Texas, Columbia University, service issues. It will also provide regular forums on topics of also become a major scientific and Texas A&M University. The advice on Internet2’s efforts societal importance. advisory body through its Columbia University portion of to support applications and scientific committees. The the project is led by Professor middleware for teaching and Professors Henryk honorary doctorate (Dr. rer. nat. Henning Schulzrinne. Traditional learning as well as for research, Wozniakowski and Joseph hc) cited Prof. Wozniakowski’s 9-1-1 systems, which date back and for advice on the investment aub have received an NSF foundational contribution to to the 1970s, support only of resources for current and grant for “Quantum and numerical methods, particularly voice, while non-emergency future initiatives. Classical Complexity of the deep insights due to the communications now feature Continuous Problems”. The new discipline of information- other media. Adding additional Professors Simha investigators are studying the based complexity and the work media for 9-1-1 presents Sethumadhavan and Sal following general question: on the “curse of dimensionality” opportunities and challenges. Stolfo, together with Dr. If physicists and chemists that helps determine which The project will develop a Michael Locasto (Dartmouth, succeed in building quantum high-dimension problems are testbed that will enable former CUCS Ph.D.) and Professor computers, which continuous solvable. The Friedrich-Schiller research on understanding and David August (Princeton), problems arising in science and University in Jena was founded analysis of next generation 9-1-1 have received funding from the engineering can be solved much in 1588. services. This is particularly Defense Advanced Research faster on a quantum computer important as both state and Projects Agency (DARPA) to than on a classical computer? federal governments are study techniques for parallelizing Examples of continuous in the process of planning legacy software codes for multi- problems are path integration, next-generation emergency core processors. This research the Schrodinger equation, high- communication platforms, effort will leverage recent dimensional approximation, unfortunately often without discoveries of latent parallelism continuous optimization, and adequate vendor-neutral testing in sequential codes and integral equations. To obtain the and evaluation. The project plans improvements in machine power of quantum computation to investigate issues related to learning to create a new auto- for continuous problems one locating 9-1-1 callers, securing matic parallelization system. must know the computational Public Safety Answering Points, The parallelization system may complexity of these problems ensuring continuous availability offer performance improvements on a classical computer. This is of 9-1-1 services during at historical rates for legacy exactly what the investigators large-scale emergencies, software (on multi-cores). have studied for decades in predicting emergencies, the field of information-based providing citizen alerts (“reverse Ph.D. student Charles Shen complexity. Some specific 9-1-1”), improving inter-agency received a Best Student Paper goals of the project are to coordination and enhancing 9-1-1 award at IPTComm 2008, held study the power of randomized services for the deaf and hearing- in Heidelberg, Germany. The quantum queries; to investigate impaired using video phones paper “SIP Server Overload whether the randomized The first volume of Professor and instant messaging. The Control: Design and Evaluation” classical setting suffers the Henryk Wozniakowski’s book, research results will translate into was co-authored by Charles “curse of dimensionality” for Tractability of Multivariate engineering guidelines and be Shen and Prof. Henning the problem of computing Problems, written jointly with disseminated across government Schulzrinne. It describes and the ground state energy of a Erich Novak (University of organizations, standards bodies evaluates mechanisms so that system; and to study algorithms Jena, Germany) has been such as IETF and National VoIP servers can continue to and initiate the study of the published by the European Emergency Number Association operate at full capacity even computational complexity of Mathematical Society. (NENA) and 9-1-1 centers. under severe overload. Such the Schrodinger equation in the overload may occur during worst case and randomized Professor Henning Schulzrinne natural disasters or mass call- settings on a classical computer was appointed to the Internet2 in events, such as voting for and in the quantum setting. Applications, Middleware & TV game show contestants. Services Advisory Council Without these measures, Professor Henryk Wozniakowski (AMSAC). The Applications, servers are likely to suffer was inducted into the Polish Middleware, and Services from congestion collapse. Academy of Sciences and also Advisory Council (AMSAC) is received an honorary doctorate

18 CS@CU WINTER 2008 Alumni News

Recent alumni of the DNA serves two audiences. For Michael events held at 77 sites around (Distributed Network Analysis) undergraduate students, Locasto (Ph.D. the U.S. and Canada this past lab have pursued a range of this reference is a concise ‘07) writes,“After winter. The team captured 11 interesting career paths after CliffsNotes-likeTM introduction my postdoc at out of 33 awards, including graduation. to the most commonly used the Institute gold medals in individual and – Hanhua Feng (Ph.D. ‘07) is algorithms in the Computer for Security team events. This year’s working for IBM Research Science Literature. For the Technology Olympiad featured 16 teams Watson as a member of the working professional, the book Studies (ISTS) from around the world, includ- technical staff. contains fully-coded solutions to at Dartmouth College working ing Bulgaria, Estonia, Germany, show how the algorithms can with Sean Smith and David Kotz, Latvia, the Netherlands, – Abhinav Kamra (Ph.D. ‘07) rapidly be coded within your own I’ve joined the faculty at George Poland, Russia, Sweden, South is working for Citigroup as a software systems to achieve Mason University as a Visiting Korea and Slovenia. Each prob- financial analyst. improved code performance.” Research Assistant Professor. lem presented clues about the – Raj Kumar (Ph.D. ‘08) is My position is funded by an I3P sounds, words or grammar of Dr. Arnold Kim working for Telcordia as a Fellowship that I applied for while a language the students had (B.S. ‘96) was member of the technical staff. at Dartmouth. At Dartmouth, never studied, such as featured in a Patrick Lee (Ph.D. ‘08) is I was involved in organizing Micmac, a Native American – New York Times doing a 1-year postdoc at and running the SISMAT language spoken in Canada, Technology University of Massachusetts, program, which is an educational the New Caledonia languages section article and will start a tenure-track outreach program to help foster of Drehu and Cemuhi, as well “My Son, the faculty job in the Computer information security expertise as several historical Chinese Blogger: An Science Department of the at 4-year undergraduate dialects. They were then M.D. Trades Medicine for Apple institutions that do not have judged by how accurately and Chinese University of Hong Rumors.”Dr. Kim has been local systems or security quickly they could untangle the Kong (CUHK) in 2009. blogging about Apple since 2000; curriculum or expertise. A news clues to figure out the rules his blog MacRumors.com is one Issy Ben-Shaul (Ph.D. ‘95) got release from Dartmouth is here: and structures of the lan- of the most popular technology tenure from the Technion, then www.dartmouth.edu/~news/ guages to solve the problem. left to found a startup, Actona, Web sites, attracting more releases/2008/06/23.html.” which he sold to Cisco in 2004. than 4.4 million people and 40 Somech (M.S. ‘80) was After working for Cisco for a few million page views per month. Ajit Singh named Chairman of the Board (Ph.D.‘90) was of Directors of GigaSpaces years, he has recently founded a Jim Kurose new startup, Metis Technologies, appointed Technologies, a leading provider (Ph.D. ‘84) has CTO. as the Chief of a scale-out application where he is currently the been appointed Executive server for Java, .Net and C++ Interim Dean Aaron Davis (SEAS ‘02, Officer of environments. of the College Computer Engineering) recently BioImagene. of Natural Krish Sridhar (M.S. ‘03) writes, joined Tudor, a hedge fund BioImagene is Science and “A f ter working at Morgan Stanley headquartered in Greenwich. a leading provider of innovative Mathematics for the last 5+ years, I decided He is now working in their and affordable digital pathology (the college home of Computer to go back to school and get an office in Singapore as a solutions for cancer diagnosis Science Department) at the MBA with a future interest in programmer and trader. and pre-clinical research. University of Massachusetts. Venture Capital or Strategy. I just Kyle Dent (M.S.‘07) writes,“I He was also recently elected moved to London a few weeks recently took a new position to the Board of Directors of the back and am enrolled in the with the Palo Alto Research Computing Research Association program at London Business Center (formerly Xerox PARC) (CRA). Jim joined the UMass School (www.london.edu). The as a Senior Member of the in 1984 after finishing his Ph.D. classes are just getting started Research Staff. Previously I in Computer Science from and we have students from had worked at HP for about 12 Columbia in networking under over 60 countries here making years. My Columbia masters the supervision of Mischa it an incredibly diverse and degree was certainly a Schwartz and Ye c hiam Yemini. exciting programme.” in obtaining the new position.” Two of his former Ph.D. students Andy Wolfe (Columbia College George Heineman (Ph.D.‘96) (Henning Schulzrinne and Dragomir Radev (Ph.D. ‘99), an ‘07) writes, “I am leaving my writes,“I am an associate Dan Rubenstein) and a former associate professor of comput- position as a software developer professor in Computer Science at postdoctoral researcher (Vishal er science, information, and with IBM. After one short year, a Worcester Polytechnic Institute Misra) are on the Columbia linguistics at the University of great opportunity has presented (WPI) in Worcester, MA, where Computer Science faculty. Michigan, was the head coach itself here in Austin. I am joining I’ve been since I received my for the U.S. team which com- the Product Management team Ph.D. I’m happy to announce peted at the Sixth International at Bazaarvoice, http://www. the publication of Algorithms in Linguistics Olympiad in bazaarvoice.com, next week. a Nutshell by O’Reilly Media, Inc. Slanchev Bryag, Bulgaria. The Bazaarvoice is a mature start-up The web page for the book is: U.S. team was selected from in the social commerce space.” http://oreilly.com/catalog/ over 750 high school students 9780596516246. This book who participated in qualifying

CS@CU WINTER 2008 19 CUCS Resources Stay in Touch!

Visit the CUCS Alumni Portal at You can subscribe to the CUCS news mailing list at https://mice.cs.columbia.edu/alum to: http://lists.cs.columbia.edu/mailman/listinfo/cucs-news • Update your contact information CUCS colloquium information is available at • Look at recent job postings http://www.cs.columbia.edu/lectures • Get departmental news Read the CUCS Newsletter online at If you know another alumni who may not be http://www.cs.columbia.edu/resources/newsletters receiving the newsletter, please forward the Alumni Portal to them. In the spirit of environmental responsibility, we are moving towards more electronic (and less hard-copy) distribution of the newsletter. If you’d like to continue receiving paper copies of the newsletter, please visit the Alumni Portal to sign up.

Columbia University in the City of New York Department of Computer Science NON-PROFIT ORG. 1214 Amsterdam Avenue U.S. POSTAGE Mailcode: 0401 PAID New York, NY 10027-7003 NEW YORK, NY PERMIT 3593 ADDRESS SERVICE REQUESTED

CS@CU NEWSLETTER OF THE DEPARTMENT OF COMPUTER SCIENCE AT COLUMBIA UNIVERSITY WWW.CS.COLUMBIA.EDU