NEWSLETTER OF THE DEPARTMENT OF COMPUTER SCIENCE AT COLUMBIA UNIVERSITY VOL.12 NO.1 SPRING 2016

New Faces at CUCS

Alexandr Andoni
Associate Professor of computer science
Advancing the algorithmic foundations of massive data

The impact of huge data sets is hard to overstate: they are responsible for advancing almost every scientific and technological field, from machine learning and personalized medicine to speech recognition and translation.

The flip side of the data revolution is that massive data has rendered many standard algorithms computationally expensive, necessitating a wholesale rethinking of even the most basic computational methods. To take one example, nearest neighbors search, the classic method since the 1970s for finding similarity among objects, is straightforward for small data sets: compare every object to every other one and compute their similarity for each pair. But as the number of objects increases into the billions, computing time grows quadratically, making the task prohibitively expensive, at least in terms of traditional expectations.

Alexandr Andoni, a theoretical computer scientist focused on developing algorithmic foundations for massive data, sees the need to reframe the issue: “The question today is not ‘what can we solve in polynomial time?’ but ‘what is possible in time proportional to data size, or even less?’ With all this data, what is the best we can do given the resources? Fortunately, vast improvements are possible in both theory and practice once we settle for approximate answers.”

Rather than searching an entire data set for the single nearest neighbor, a search would go much faster if objects were pre-grouped according to some shared attribute, making it easy to zero in on just the small subset of objects most likely to contain the most similar neighbor. The new challenge then becomes: what attribute shall we use to make such a pre-grouping maximally efficient? The speed-up thus gained reverberates across a wide range of computational methods since nearest neighbors search is ubiquitous and serves as a primitive in higher-level algorithms, particularly in machine learning.

More generally, in the same spirit of relying on (approximate) attributes to speed operations, Andoni has developed a theory of sketching that represents complex objects by smaller, simpler “sketches” that capture the main structure and essential properties of the original objects yet use less (sublinear) space and time to compute. For many tasks, such as estimating similarity of a pair of objects, a sketch may work just as well as a fully realized object. While relaxing strict formulations is happening generally throughout the community, for the most part by necessity, Andoni is carrying the idea further and is in the forefront of those inventing new primitives and new data structures that explicitly incorporate the concept of sketches.

In early work applying a sketch primitive (Locality Sensitive Hashing) to nearest neighbor search, Andoni in 2006 with Piotr Indyk was able, for the most basic Euclidean distances, to improve over a seminal 1998 algorithm widely used for classification. The Communications of the ACM later (2008, vol. 51) hailed the new primitive as a breakthrough technology that allowed researchers to revisit decades-old problems and solve them faster. Few expected more progress to be possible. Yet when Andoni again revisited the problem (in papers published in 2014 and in 2015), he unexpectedly made more headway, this time by using data-dependent, rather than random, hash functions, a novel idea not previously conceptualized as something useful for improving algorithms.

More seems to be possible by continually re-examining and applying fresh perspectives to classic, well-studied problems. It’s one reason Andoni is returning to academia after four years at Microsoft Research Silicon Valley, which closed in 2014. One point in Columbia’s favor was the Data Science Institute (DSI), where exceptional researchers from across the university and diverse disciplines study different aspects of data.

Says Kathy McKeown, director of the DSI: “We’re thrilled to have Alex join the Data Science Institute. His research is directly relevant to big data applications where the ability to provide computationally efficient solutions depends on the development of new algorithms.”

Andoni also looks for inspiration from students. “You feel the new energy. Students are excited, and that excitement and enthusiasm is invigorating. It leads you to think about even things you’ve already checked off, to believe there might be new ways of doing things. You want to try again.”

Suman Jana
Assistant Professor of computer science
Protecting security and privacy in an age of perceptual computing

“I like to break things. To open a thing and see its inner workings and really understand how it works... and then break it in some clever way that maybe no one else thought of and in the worst possible way that doesn’t seem possible, but is.”

It’s a hacker mindset, and Suman Jana is a hacker, of a sort. Though he admits enjoying the thrill of destruction and subversion, Jana wants to make systems more secure, and companies have hired him to find security flaws in their systems so those flaws can be fixed, not exploited. This semester he joins the Computer Science department to more broadly research issues related to security and privacy. It’s a wide-open field.

Two years ago, for his thesis he looked hard at the security risks inherent in perceptual computing, where devices equipped with cameras, microphones, and sensors are able to perceive the world around them so they can operate and interact more intelligently: lights that dim when a person leaves the room, games that react to a player’s throwing motion, doors that unlock when recognizing the owner.

It all comes at a cost, of course, especially in terms of privacy and security.

“Features don’t come for free; they require incredible amounts of data. And that brings risks. The same data that tells the thermostat no one is home might also be telling a would-be burglar,” says Jana.

What data is being collected isn’t always known, even by the device manufacturers who, pursuing features, default to collecting as much data as they can. This data is handed off to gaming, health, home monitoring, and other apps: not all are trusted; all are possible hacking targets.

There is no opting out. The inexorable trend is toward more perception in devices and more data collection, with privacy and security secondary considerations. For this reason, Jana sees the need for built-in privacy protections. A paper he co-authored, A Scanner Darkly, shows how privacy protection might work in an age of perceptual computing.

“Can we disguise some data? Should a camera for detecting hand gestures also read labels on prescription medicines inadvertently left in the camera’s view? Would an app work just as well if it detected approximate contours of the hand? If so, we can pass on lower-resolution data so the prescription label isn’t readable.”

Jana’s opinion is that users should decide what data apps are able to see. His DARKLY platform, named after a dystopian novel by Philip K. Dick, inserts a privacy protection layer that intercepts the data apps receive and displays it in a console so users can see and, if they want, limit how much data is passed on to the app. The platform, which integrates with the popular computer vision library OpenCV, is designed to make it easy for companies to implement and requires no changes to apps. The DARKLY paper, called revolutionary, won the 2014 PET Award for Outstanding Research in Privacy Enhancing Technologies.

Making it easy to build in safety and privacy mechanisms is critical. Manufacturers have little incentive to construct privacy protections; in any case, determining what data is sensitive is not easy. A single data point by itself (a random security photo of a passerby, for instance) might seem harmless, but combined with another data point or aggregated over time (similar photos over several weeks) it reveals patterns and personal behaviors. The challenge in preventing security leaks is first finding them amidst a deluge of data; a single image of a prescription label or credit card might be hidden within entire sequences of images.

Perceptual computing is rife with other such vulnerabilities made possible by devices that see, hear, and sense what goes on in their immediate environment; but the landscape of vulnerabilities is even larger than it would appear since perceptual computing, while creating new vulnerabilities, inherits all the old ones associated with any software, namely buggy code.

While Jana gets a bigger space to explore, for the rest of us, it spells potential privacy disaster.

Preventing such an outcome will come from enlisting help from other technology experts. “For finding images with hidden personal information, we need classifiers. Machine learners over the years have learned how to train classifiers to recognize spam, recommend movies, and target ads. Why not train classifiers to find security risks? I’m fortunate to be working within Columbia’s Data Science Institute alongside machine learners who can build such classifiers.”

For guarding against buggy code, Jana imagines adapting program analysis, an existing technology for automatically finding software bugs, so it specifically searches out those bugs that concern security and privacy.

Technology alone, however, isn’t the answer. Companies are unlikely to fix privacy problems unless pressured by the public, and Jana sees his role encompassing the policy arena, where he will work to propose and enact workable regulations and legislation to protect data and security.

At least for perceptual computing, Jana says there is an opening to do something about privacy risks. “The field is still relatively new, and we have the chance to build in security from the beginning and make life better so people can trust these devices and use them.”

Eugene Wu
Assistant Professor of computer science
Fast, accurate enough for the human in the loop: Visualizing and interacting with big data sets

For exploring complex data sets, nothing matches the power of interactive visualizations that let people directly manipulate data and arrange it in new ways. Unfortunately, that level of interactivity is not yet possible for massive data sets.

“Computing power has grown, data sets have grown, what hasn’t kept pace is the ability to visualize and interact with all this data in a way that’s easy and intuitive for people to understand,” says Eugene Wu, who recently received his PhD from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), where he was a member of the database group.

Speed is one important component for visualizing data, but there are others, such as the ease with which interactive visualizations can be created and the ability to help understand what the results actually say. For his PhD thesis, Wu tackled the latter problem by developing a visualization tool that automatically generates explanations for anomalies in a user’s visualization. This is important because while visualizations are very good at showing what’s happening in the data, they are not good at explaining why. A visualization might show that company expenses shot up 400% in a single month, and an analyst would naturally want to understand what types of expenditures are responsible. However, the monthly statistic is often computed from thousands or millions of input data points, and identifying a simple description of the exact subset causing the spike (e.g., California shops overspent their budgets) requires laborious, error-prone effort.

Now starting at Columbia, Wu is broadening the scope of his research and is among the first looking at the challenging problems in the overlap between databases and how people want to interact with and visualize the data in those databases. Visualization systems currently being built must take an all-or-nothing approach. “You either get performance for small data sets using a small set of fixed interactions, or you get full expressiveness with SQL and queries but you have to wait and give up interactivity.”

Part of the problem is that the database and the visualization communities have traditionally been separate, with the database side focusing on efficient query processing and accuracy, and the visualization community focusing on usability and interactions. Says Wu, “If you look at visualizations from a database perspective, a lot of it looks like database operations. In both cases, you’re computing sums, you’re computing common aggregates. We can remove many of the perceived differences between databases and visualization systems.” Wu wants to bridge the two sides to operate more closely together so both consider first the expectations and requirements of the human in the loop.

For instance, what does database accuracy mean when a human analyst can’t differentiate 3.4 from 3.45 in a scatterplot? A slight relaxation of accuracy requirements, unnoticeable to users, would conserve resources while speeding up query operations. In understanding the boundary between what a human can perceive and what amounts to wasted computations, Wu hopes to develop models of human perception that are both faithful to studies in the Human Computer Interaction and Psychology literatures, and applicable to database and visualization system performance.

On the visualization side, less attention has been paid to the programming languages (like JavaScript) used to construct the visualizations; consequently, visualizations are hard to write, to debug, and even harder to scale. A similar situation once prevailed in the database world, where application developers wrote complex and brittle code to fetch data from their databases; but the invention of SQL, a high-level, declarative language, made it easier for developers to express relationships within the data without having to worry about the underlying data representations, paving the way towards today’s ubiquitous use of data.

For Wu, the natural progression is to extend the declarative approach to interactive visualizations. With colleagues at Berkeley and University of Washington, Wu is designing a declarative visualization language to provide a set of logical operations and mappings that would free programmers from implementation details so they can logically state what they want while letting the database figure out the best way to do it.

A declarative language for visualization would have additional benefits. “Once you have a high-level language capable of expressing analyses, all of these analysis tools such as the explanatory analysis from my thesis is in a sense baked into whatever you build; it comes for free. There will be less need for individuals to write their own ad hoc analysis programs.”

As interactions become portable and sharable, they can be copied and pasted from one interactive visualization to another for someone else to modify. And it becomes easier to build tools, which fits with Wu’s focus on making data accessible and understandable to all users.

“When a diverse group of people look at the same data, the questions you get are more interesting than if just other computer scientists or business people are asking questions.”

One of the attractions for Wu in coming to Columbia is the chance to work within the Data Science Institute and collaborate with researchers from across the university, all sharing ideas on new ways to investigate data. “Columbia has a huge range of leaders in nearly every discipline from Journalism, to Bioinformatics to Government studies. Our use of data is ultimately driven by the applications built on top, and I’m excited about working on research that can help improve and benefit from the depth and breadth of research at the university.”

Linda Crane
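For readers who want to see the pre-grouping idea from the Andoni profile in concrete form, the following is a minimal sketch in Python. It uses a textbook random-projection hashing scheme for Euclidean distance, not the data-dependent construction from Andoni's own papers, and every name in it (`BucketIndex`, `approx_nearest`, the `width` and `num_hashes` parameters) is a hypothetical choice made for illustration. The exact search compares the query with every point; the approximate search looks only inside the one bucket the query hashes to, so it may miss the true nearest neighbor or find nothing at all.

```python
import math
import random
from collections import defaultdict


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_nearest(points, query):
    """Exact search: compare the query against every stored point."""
    return min(points, key=lambda p: distance(p, query))


class BucketIndex:
    """Pre-groups points into buckets using random projections (illustrative only)."""

    def __init__(self, dim, num_hashes=4, width=10.0, seed=0):
        rng = random.Random(seed)
        self.width = width
        # Each hash is a random direction plus a random offset in [0, width).
        self.directions = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                           for _ in range(num_hashes)]
        self.offsets = [rng.uniform(0.0, width) for _ in range(num_hashes)]
        self.buckets = defaultdict(list)

    def _key(self, point):
        # Project onto each direction and cut the line into cells of size `width`;
        # nearby points tend to fall into the same cells.
        return tuple(
            math.floor((sum(d * x for d, x in zip(direction, point)) + offset) / self.width)
            for direction, offset in zip(self.directions, self.offsets)
        )

    def add(self, point):
        self.buckets[self._key(point)].append(point)

    def approx_nearest(self, query):
        """Search only the query's own bucket; returns None if that bucket is empty."""
        candidates = self.buckets.get(self._key(query), [])
        if not candidates:
            return None
        return min(candidates, key=lambda p: distance(p, query))


if __name__ == "__main__":
    rng = random.Random(1)
    points = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(10_000)]
    index = BucketIndex(dim=2)
    for p in points:
        index.add(p)
    query = (42.0, 17.0)
    print("exact: ", brute_force_nearest(points, query))
    print("approx:", index.approx_nearest(query))
```

Practical systems typically build several such indexes with independent random hashes and merge the candidates, trading a little extra memory for a much lower chance of missing the true neighbor; the single-table version above is kept deliberately small.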

CS@CU SPRING 2016

Cover Story (continued)

Paul Blaer
Lecturer in Discipline

Columbia this fall promoted Paul Blaer from adjunct professor to Lecturer in Discipline, a full-time faculty position that makes teaching Blaer’s primary focus, something he’s wanted for a long time.

Hiring Blaer full time is not exactly a stab in the dark for Columbia, where Blaer is a well-known quantity. Since he was 3, he has been floating around campus. His father is physics professor Allan Blaer, who did both undergraduate and graduate work at Columbia and who, after teaching stints at Princeton and Swarthmore, returned to Columbia where his son would likewise attend as both undergraduate and graduate student.

Paul Blaer did his PhD research in the area of mobile robotics and 3D vision, working in Peter Allen’s lab. It was there, while a grad student leading recitations, that he got his first taste of teaching. He knew immediately that teaching was what he wanted to do. For him, it was the fun stuff, a chance to engage with students, to think on his feet to get them to work through problems themselves. For three years before graduating, he was a preceptor, running classes and seeing results from the front of a class.

With teaching in mind, Blaer originally planned to seek a position at a small four-year college, but the combined draw of Columbia and New York City proved strong, and Blaer, knowing Columbia as well as he did, figured someday a teaching opportunity would open up. Until it did, there were other ways on campus for him to contribute.

In Allen’s lab Blaer had been doing systems work (his skills for controlling his own computing environments scaled up for a lab of 50 or more), which led to a full-time position at Computing Research Facilities (CRF); precepting led to part-time adjuncting. For seven years now, Blaer has been teaching introductory computer science classes part-time while working at CRF full time to help faculty design and build backend systems for all types of research projects.

With his new position, the [...] gets recalibrated: teaching becomes full-time and CRF part-time.

“I’m thrilled to be working full time with students here at Columbia. It’s the best of both worlds: a large university environment with highly motivated students, yet like a college professor I have this direct interaction with the students, which is the favorite part of my job.”

Blaer knows the classes, the students and faculty, the projects, and how the computer systems are set up; in a department dependent on systems, that’s better than knowing where the bodies are buried. He’s involved also in the administrative aspects that touch on teaching; he is Director of Undergraduate Studies for BS Programs and is active in the Science Honors Program for area high-school science and math students.

Deep institutional and systems knowledge is all well and good, but a lecturer first and foremost has to be able to teach. Blaer has that angle covered especially well. As someone who genuinely cares about teaching, he pays attention to what resonates with students and what doesn’t, and strives to keep his lectures engaging, using humor and real-life stories from his own research to keep students interested. That he succeeds is clear from student comments on the Columbia Underground Listing on Teacher Abilities (CULPA) site, where Blaer has earned a silver nugget for his teaching and approachability.

“We’re thrilled to have Paul join the faculty full-time as a lecturer. The department has rock-solid confidence in his classroom skills because we have the strongest possible kind of evidence: actual results over several years,” says Rocco Servedio, chair of the Computer Science department.

Ansaf Salleb-Aouissi
Lecturer in Discipline

The increasingly data-centric approach in all aspects of science and technology means students need to learn what algorithms and methods can stand up to the immense scale of today’s data sets. Teaching computer science from the perspective of large data sets is the job of Ansaf Salleb-Aouissi. A data scientist from before the term was commonly understood, Salleb-Aouissi has worked with all types of data on projects ranging from geology and geographic information systems early in her career, to social sciences and urban design, and more recently to medical informatics and to education.

“The common denominator is data. The context may be different and the goals may be different, but at the end of the day, data is data and you try to leverage that data somehow to learn something new,” says Salleb-Aouissi.

An associate research scientist at Columbia’s Center for Computational Learning Systems (CCLS) since 2006, she has worked on both fundamental research into new machine learning and data mining algorithms and methods as well as real-world applications of those methods.

Many of her projects are predictive in nature, forecasting when power-grid failures are likely to occur in one case, and in another predicting which expectant mothers are most, or least, likely to deliver preterm. In this last example, Salleb-Aouissi, with support from the National Science Foundation Smart and Connected Health program, used advanced machine-learning methods to vastly expand the number of risk factors to be considered, including socio-economic, psychological and behavioral factors.

Prediction is also at the heart of her most recent (and current favorite) project: a browser optimized for self-learning. “We want to create a personalized self-learning experience by sifting through a huge number of search results to identify and return those customized for a student’s learning preferences (whether they be videos, books, blogs) and that also fit within the student’s short or long time constraints. The challenge here, as it was with the preterm study, is making all these different and heterogeneous resources work together in a system. It’s an ambitious project and I am very excited to work on it. More so because it is a link between my research and my teaching.”

Though research forms the bulk of her recent work, teaching has also been a component. Post-PhD, she worked as an adjunct professor at the University of Orléans and discovered how much she enjoyed interacting with students. She would have gladly accepted the position of assistant professor except for her plans to eventually move to the US. Instead she took the offer of a Postdoctoral Fellowship at the prestigious research lab INRIA (French National Institute of Computer Science and Control) at Rennes, France. There she did more fundamental investigation of new algorithms, particularly new methods for quantitative association rules, but also for frequent patterns matching, ranking, characterization, and action recommendation.

While still at INRIA, she applied to the CCLS for an open position. Though that position filled quickly, David Waltz, then director of the CCLS, took note of her INRIA fellowship and her growing publications list and contacted her when a different position came up. She and Waltz later collaborated on a number of papers and projects. “Dave smoothed my transition to the CCLS and helped make it an enriching experience where I could grow and learn. I will always be grateful to him.”

Once settled in at the CCLS, she was able to get back into teaching, adjuncting in the Computer Science department, teaching courses in data science, discrete math, and artificial intelligence. As a lecturer, teaching will now be her primary focus, but she will continue doing research, which will now serve a double purpose. “I like to deliver my lecture in an engaging and interactive way, my own way and to keep the material fresh and alive so students actively absorb it rather than just be passive recipients. My own research may serve to give students a peek into what you can do with computer science, and I hope that can motivate them and spark their interest so they learn now so they can do later.”

Linda Crane

Faculty News & Awards

Allison Bishop Wins NSF CAREER Award

Allison Bishop, an assistant professor within Columbia’s Computer Science Department and a member of the Data Science Institute, has been awarded a five-year $500,000 National Science Foundation (NSF) CAREER award to develop tools for designing and proving the security of new cryptographic systems. The CAREER award is the NSF’s most prestigious honor designed to support junior faculty who exemplify the role of teacher-scholars through their outstanding research and excellent teaching.

With the award, Bishop will build on her current research into provably secure cryptographic systems that can accommodate various levels of access to data, thus allowing different people to access different data within the same data source. The need for fine-grained control over data access has never been greater as vast amounts of sensitive data have to be simultaneously shared and protected, such as when a hospital needs to see almost all of a patient’s data but an insurance company needs to see only what procedures have been done.

Achieving more nuanced cryptographic capabilities means also enhancing the mathematical foundations that support such capabilities so that security at each access level can be enforced in a provable way. For this, Bishop is looking to integrate recent advances in lattice cryptography with her progress in designing security reductions.

Bishop will also use the award to provide an entry point and training ground for emerging young scientists of all ages, giving advanced graduate classes a more integrated view of cryptographic system design principles, while opening valuable research opportunities to students at both the undergraduate and graduate levels. The educational outreach aspect extends to students of elementary-school age, for whom Bishop is producing a book that uses a fairy-tale setting to introduce mathematical reasoning. Motivating others will help ensure faster progress towards a flexible and more unified theory of cryptography to meet the mounting challenges of huge data sets, cloud computing, and other emerging data systems.

Faculty News & Awards (continued)

Three Columbia Engineering Professors Win Sloan Fellowships

Matei Ciocarlie, Roxana Geambasu, Daniel Hsu

Three Columbia Engineering professors have won 2016 Sloan Research Fellowships: Matei Ciocarlie (Mechanical Engineering), Roxana Geambasu (Computer Science), and Daniel Hsu (Computer Science). They are among 126 outstanding young scientists and scholars announced by the Alfred P. Sloan Foundation.

Awarded annually since 1955, the Sloan Fellowships honor early-career scientists and scholars whose achievements and potential identify them as rising stars, the next generation of scientific leaders. The 2016 fellows, who receive $50,000 to further their research, have been drawn from 52 colleges and universities in the United States and Canada, and represent a wide range of research interests.

Matei Ciocarlie’s research is focused on developing versatile manipulation and mobility in robotics, in particular on building dexterity into robotic hands, and he sees robotic manipulation in unstructured environments as a critical research area. [...] the world as skillfully as biological organisms,” he notes. So far, robotic applications that have had significant impact (especially in industrial domains) have done it by being fast, precise, and tireless. In order to advance to less constrained domains, robots need to become more versatile and learn to handle variability, or be more intelligent in their environment interaction. “True dexterity in interacting with the world will play a role in the more general problem of developing cognitively advanced computers and machines,” Ciocarlie adds. His Robotic Manipulation and Mobility Lab is working on a range of applications, from versatile automation in manufacturing and logistics to mobile manipulation in unstructured environments to assistive and rehabilitation robotics in healthcare. He is a member of the Data Science Institute and has won numerous prestigious honors, including the 2013 IEEE Robotics and Automation Society Early Career Award, a 2015 Young Investigator Program grant from the Office of Naval [...] National Science Foundation.

Computer Science Professor Roxana Geambasu is working to ensure data security and privacy in an era of cloud computing and ubiquitous mobile devices, technologies upon which billions of users rely to access and host sensitive data and which have become easy targets for theft, espionage, hacking, and legal attacks. Our mobile devices are packed with confidential information under operating systems that never securely erase data. And at the other end, cloud services not only accumulate endless logs of user activity, such as searches, site visits, and locations, but also keep them for extended periods of time, mine them for business value, and at times share them with others, all without the user’s knowledge or control. Geambasu, a member of the Data Science Institute, is working to identify the security and privacy risks inherent in current mobile and web technology and designs, and constructing systems to address those problems. Her research spans broad areas of systems research, including cloud and mobile computing, operating systems, and databases, all with a focus on security and privacy. She integrates cryptography, distributed systems, database principles, and operating systems techniques and works collaboratively in developing cross-field ideas in order to solve today’s data privacy issues.

[...] Institute, Daniel Hsu develops machine learning algorithms that have been used in automated language translation, personalized medicine, and privacy transparency systems. His work making computers smarter was recently recognized in IEEE’s Intelligent Systems magazine. Hsu specializes in a branch of machine learning called interactive learning, which turns an algorithm loose on a small set of hand-labeled data. When the algorithm encounters a term it doesn’t recognize, it requests a label, massively speeding up the training process. As a graduate student in the late 2000s, Hsu helped develop an active learning method that was later applied to electrocardiograms, reducing the amount of training data needed by 90 percent. His work on Hidden Markov Models has been applied in genomics to understand the role of gene regulation in disease, and how the chromatin packaging a cell’s DNA may be implicated. More recently, he helped develop a tool to bring greater transparency to how personal data is used on the Web.

The Sloan Fellowships are awarded in eight scientific and technical fields: chemistry, computer science, economics, mathematics, computational and evolutionary molecular biology, neuroscience, ocean sciences, and physics. Candidates are nominated by their fellow scientists and winning fellows are selected by an independent panel of senior scholars.

Steve Bellovin Named First Technology Scholar by the Privacy and Civil Liberties Oversight Board

Computer Science Professor Steven Bellovin has been appointed the first Technology Scholar by the Privacy and Civil Liberties Oversight Board (PCLOB). A nationally recognized expert in technology and network security, Bellovin has examined technology and its privacy implications throughout his career.

“I’m delighted to be joining PCLOB,” says Bellovin. “Modern intelligence agencies rely heavily on technology; many of their collection and analysis systems are based on software. My role will be to help the Board members understand these mechanisms and their implications.”

Bellovin has taught computer science at Columbia since 2005. During more than 20 years at Bell Labs and AT&T Labs Research, he focused on network security firewalls, protocol failures, routing security, and cryptographic protocols. He is a member of the National Academy of Engineering and the Computer Science and Telecommunications Board of the National Academies. He has served on the Science and Technology Advisory Committee of the U.S. Department of Homeland Security, the Technical Guidelines Development Committee of the U.S. Election Assistance Commission, and as Chief Technologist of the Federal Trade Commission. He also has authored numerous publications and has received awards and national recognition for his work. He holds a BA from Columbia University and an MS and PhD in Computer Science from the University of North Carolina at Chapel Hill.

In announcing the appointment, PCLOB Chairman David Medine said, “I am pleased that Professor Bellovin will be joining our team as our first Technology Scholar. His vast knowledge and significant expertise in both the private and public sectors will be of great benefit to our agency’s mission to ensure that the federal government’s efforts to prevent terrorism are balanced with the need to protect privacy and civil liberties.”

The PCLOB is an independent agency within the executive branch established by the Implementing Recommendations of the 9/11 Commission Act of 2007. The bipartisan, five-member Board is appointed by the President and confirmed by the Senate. The PCLOB’s mission is to ensure that the federal government’s efforts to prevent terrorism are balanced with the need to protect privacy and civil liberties.

Vishal Misra Named IEEE Fellow

For contributions to “network traffic modeling, congestion control and Internet economics,” Vishal Misra has been named a Fellow of the Institute of Electrical and Electronics Engineers (IEEE), the highest grade of IEEE membership and limited every year to one-tenth of one percent of the total voting membership.

“Throughout my career I have attempted to solve real world problems via mathematical [...] the IEEE, the credit for it goes to all my great collaborators who have ensured that our work has had an impact.”

On the faculty of the Computer Science department at Columbia University, Misra in his research emphasizes the use of mathematical modeling to examine complex network systems, particularly the Internet. It’s an approach that has been highly productive from the start. His PhD thesis work on modeling Internet congestion, done in collaboration with colleagues, opened up entirely new directions in TCP analysis and led to better control mechanisms, helping achieve high throughput, low latency, and low packet loss [...] cable modems worldwide.

In a 2008 paper, he and colleagues used mathematical modeling to examine the pricing policies and profit motives of Internet service providers, in the process identifying at an early stage the economic incentives that would later give rise to paid peering; Misra was thus one of the first in academic circles to warn that network neutrality issues are not resolvable without first understanding Internet economics.

Recently as network neutrality has become a political issue, particularly in the US and India, Misra has actively participated in the public debate, contributing articles and interviews to [...] his views on zero rating, a policy contrary to network neutrality. Misra’s opinions and expertise are sought not only for his deep technical research, but also for his real-world experience building Internet-based businesses.

While still a graduate student, he co-founded the sport website Cricinfo (acquired by ESPN in 2007); more recently he founded the data center storage startup Infinio.

Misra’s elevation to IEEE fellow is an important achievement in a career that has previously earned him a National Science Foundation CAREER Award, a Department of Energy CAREER Award, and Google and IBM Faculty Awards.
“We Research, a 2015 NASA Early A computer science professor modeling and analysis,” says on Internet links. Software that leading media outlets. Earlier aim to discover how artificial Stage Innovations grant, and a at Columbia Engineering and a Misra. “While I am deeply grew out of Misra’s PhD thesis this year, he appeared before mechanisms can interact with 2016 CAREER Award from the member of the Data Science Holly Evarts and Kim Martineau honored by this recognition by is now being deployed in all the Indian Parliament to present Linda Crane

CS@CU SPRING 2016

Faculty News & Awards (continued)

Julia Hirschberg and David Blei Elected 2015 ACM Fellows

[Photos: Julia Hirschberg, David Blei]

Two professors in the Computer Science department at Columbia University have been elected 2015 Association for Computing Machinery (ACM) Fellows: Julia Hirschberg, for “contributions to spoken language processing,” and David Blei, for “contributions to the theory and practice of probabilistic topic modeling and Bayesian machine learning.” The ACM fellowship grade recognizes the top 1% of ACM members for their outstanding accomplishments in computing and information technology or outstanding service to ACM and the larger computing community. This year, 42 have been named ACM Fellows.

Julia Hirschberg is the Percy K. and Vida L.W. Hudson Professor of Computer Science and Chair of the Computer Science Department. She is also a member of the Data Science Institute. Her main area of research is computational linguistics, with a focus on the relationship between intonation and discourse. Her current projects include deceptive speech; spoken dialogue systems; entrainment in dialogue; speech synthesis; speech search in low-resource languages; and hedging behaviors.

“I’m deeply honored to be joining this wonderful group of computer scientists,” says Hirschberg. “The ACM has done a wonderful job of supporting and promoting computer science for many years.”

Upon receiving her PhD in Computer and Information Science from the University of Pennsylvania, Hirschberg went to work at AT&T Bell Laboratories, where in the 1980s and 1990s she pioneered techniques in text analysis for prosody assignment in text-to-speech synthesis, developing corpus-based statistical models that incorporate syntactic and discourse information, models that are in general use today. She joined the Columbia University faculty in 2002 as a Professor in the Department of Computer Science and has served as department chair since 2012.

As of November 2015, her publications have been cited 14,161 times, and she has an h-index of 60.

Hirschberg serves on numerous technical boards and editorial committees, including the IEEE Speech and Language Processing Technical Committee and the board of CRA-W. Previously she served as editor-in-chief of Computational Linguistics and co-editor-in-chief of Speech Communication and was on the Executive Board of the Association for Computational Linguistics (ACL); on the Executive Board of the North American ACL; on the CRA Board of Directors; on the AAAI Council; on the Permanent Council of the International Conference on Spoken Language Processing (ICSLP); and on the board of the International Speech Communication Association (ISCA). She is also noted for her leadership in promoting diversity, both at AT&T and Columbia, and broadening participation in computing.

Among many honors, she is a fellow of the Association for Computational Linguistics (2011), of the International Speech Communication Association (2008), and of the Association for the Advancement of Artificial Intelligence (1994); and she is a recipient of the IEEE James L. Flanagan Speech and Audio Processing Award (2011) and the ISCA Medal for Scientific Achievement (2011). In 2007, she received an Honorary Doctorate from the Royal Institute of Technology, Stockholm, and in 2014 was elected to the American Philosophical Society.

David Blei is a Professor of Computer Science and Statistics and a member of the Data Science Institute. He is a leading researcher in the field of probabilistic statistical machine learning and topic models, having co-authored (with Michael I. Jordan and Andrew Y. Ng) the seminal paper on latent Dirichlet allocation (LDA), the standard algorithm for discovering the abstract “topics” that occur in a collection of documents. LDA has become an important statistical tool and is used to capture interpretable patterns in a range of applications, including document summarization, indexing, genomics, and image database analysis.

In addition to continuing work on topic models, Blei develops models of social networks, music and audio, images and computer vision, and neuroscience and brain activity. Recent work with students has resulted in efficient algorithms to fit a wide class of statistical models to massive data sets, enlarging the scale of data that can be analyzed using sophisticated methods.

“I am deeply honored to have been elected an ACM fellow,” says Blei. “The ACM is a wonderful organization; for many years it has nurtured the fantastic intellectual and community spirit of computer science.”

Blei’s research has earned him a Sloan Fellowship (2010), an Office of Naval Research Young Investigator Award (2011), the NSF Presidential Early Career Award for Scientists and Engineers (2011), the Blavatnik Faculty Award (2013), and the ACM-Infosys Foundation Award (2013). He is the author and co-author of over 80 research papers.

Before coming to Columbia in 2014, Blei was an Associate Professor of Computer Science at Princeton University. He received his PhD in Computer Science from UC Berkeley and his BSc in Computer Science and Mathematics from Brown University.

Linda Crane

Henning Schulzrinne Named Recipient of 2016 IEEE Internet Award

Henning Schulzrinne, the Julian Clarence Levi Professor of Mathematical Methods and Computer Science at The Fu Foundation School of Engineering at Columbia University, has been named the recipient of the 2016 IEEE Internet Award for exceptional contributions to the advancement of Internet technology.

Schulzrinne was recognized “for formative contributions to the design and standardization of Internet multimedia protocols and applications.” Schulzrinne is particularly known for his contributions in developing the Session Initiation Protocol (SIP) and Real-Time Transport Protocol (RTP), the key protocols that enable Voice-over-IP (VoIP) and other multimedia applications. Each is now an Internet standard and together they have had an immense impact on telecommunications, both by greatly reducing consumer costs and by providing a flexible alternative to the traditional and expensive public-switched telephone network.

“This award also recognizes the work by my students and visitors in the Columbia IRT lab as well as all the other colleagues who contributed to making Internet-based multimedia possible,” says Schulzrinne, in referring to the Internet Real-Time (IRT) Lab, which he directs and which conducts research in the areas of Internet and multimedia services.

The Internet Award follows on the heels of two other honors recently accorded Schulzrinne. In January, he was named an ACM Fellow, and in December 2014 he received an Outstanding Service Award from the Internet Technical Committee (ITC), of which he was the founding chair. In 2013, Schulzrinne was inducted into the Internet Hall of Fame. Other notable awards include the New York City Mayor’s Award for Excellence in Science and Technology and the VON Pioneer Award.

Schulzrinne, whose research interests include applied network engineering, wireless networks, security, quality of service, and performance evaluation, continues to work on VoIP and other multimedia applications and is currently investigating an overall architecture for the Internet of Things and making it easier to diagnose network problems. He is also active in designing technology solutions to limit phone spam (“robocalls”) and recently testified on this topic before the Senate Special Committee on Aging.

In addition to his research, Schulzrinne is active in public policy and in serving the broader technology community. From 2012 until 2014, he was the Chief Technology Officer for the Federal Communications Commission (FCC), where he guided the FCC’s work on technology and engineering issues and played a major role in the FCC’s decision to require mobile carriers to support customers’ abilities to contact 911 using text messages. He continues to serve as a technical advisor to the FCC.

Schulzrinne is a past member of the Board of Governors of the IEEE Communications Society and a current vice chair of ACM SIGCOMM. He has served on the editorial board of several key publications, chaired important conferences, and published more than 250 journal and conference papers and more than 86 Internet Requests for Comments.

Linda Crane

Jonathan Gross Retires After 47 Years of Teaching and Research at Columbia

Jonathan Gross retired last semester, following a highly active career that allowed him to indulge his lifelong love of mathematics while doing pioneering work in graph theory, three-dimensional topology, shape modeling, and sociological modeling.

Professor Gross’s main specialty is topological graph theory, a math subdiscipline straddling combinatorics and geometry and marked by a strong visual component. In several of his 17 books and in over 100 papers and journal articles, Gross expanded topological graph theory by initiating new programs of investigation and by developing new methods for them, often collaborating with Thomas W. Tucker. Together Gross and Tucker authored the influential and comprehensive Topological Graph Theory, which at its release in 1987 represented the state of the art in graph theory. Their objective in writing that book was to create a single source that would provide someone new to topological graph theory with sufficient background to move as quickly as possible into frontier research. It remains a standard reference today.

Gross invented the voltage graph construction in 1973, which is the basis for a concise algebraic specification of infinite families of large graphs and also of placements of such graphs on increasingly complicated surfaces. Gross’s joint work with Tucker on its generalization, published in 1977, includes some of the most frequently cited publications in topological graph theory. The name voltage graph plays on the fact that one of the key properties that sometimes occurs in the specification of placements in surfaces is an algebraic generalization of the Kirchhoff voltage law, which is a property of electrical circuits well known to electrical engineers and physicists. Another paper by Gross and Tucker explains how the voltage graph construction unifies dozens of special cases that occur in the solution of the Heawood map-coloring problem.

Topological graph theory has connections to many other areas of mathematics, including combinatorial and probabilistic models, as well as to knot theory. Since 2009, Gross has been working with Jianer Chen, one of his former Columbia PhD students, to apply topological graph theory to the computer graphics area called shape modeling. Another area that Gross tackled and examined for several years is behavioral and cultural rule systems, for which he developed information-theoretic models and measurement techniques. Working with the eminent British anthropologist Dame Mary Douglas, Gross demonstrated how such high-powered tools can be harnessed to better understand human social behavior. In his book, Measuring Culture, Gross and his co-author Steve Rayner describe how to measure information content in societal patterns, making it possible to obtain objective comparisons of different target populations.

For his research, Gross has earned multiple honors and awards: an Alfred P. Sloan Fellowship, an IBM Postdoctoral Fellowship, and numerous research grants from the Office of Naval Research, the National Science Foundation, the Russell Sage Foundation, and, most recently, from the Simons Foundation.

Gross began his formal mathematics education as an undergraduate at MIT, graduating in 1964. From MIT, he went to Dartmouth College, where his PhD thesis on three-dimensional topology (1968) solved a published problem of Fields Medalist John Milnor. After graduate school, he joined the Mathematics Department at Princeton University, working with Ralph Fox, renowned for his work on knot theory and three-dimensional topology.

Though primarily a mathematician, Gross had an early interest in computers, and it was in computer science that he felt that his teaching would have greater impact. He has believed since his high school days that computing was for everybody, and his earliest books are concerned with computer programming. It was to set up a computer science curriculum for arts and science students that he was invited in 1969 to join the Statistics Department at Columbia. His first class in introductory computer programming at Columbia had eight students. Within a few years, 300 students in that same course filled the seats in the large lecture room in Havemeyer. The university expanded the computer science contingent that he headed within Statistics one by one, to five faculty members.

In the late 1960s and the 1970s, computer science was also taught by a small nucleus of professors of Electrical Engineering. In 1978-79, while Gross was Acting Chair of Statistics, Dean Peter Likins of SEAS committed funds from a substantial gift to SEAS to found a separate Computer Science Department, which both contingents agreed to join. Merging the computer science course offerings from Statistics and from Electrical Engineering was among the first initiatives that Gross orchestrated for the new department. He strongly encouraged faculty to balance their teaching assignments between undergraduate and graduate levels. His role in starting Columbia’s computer science department was fundamental; as the department grew over the years (it now numbers 44 professors and 5 lecturers), Gross was the organizer of department-wide efforts to keep the academic curriculum at the educational forefront. Over the years, he became the keeper of institutional memory.

Mathematician, researcher, author, and computer scientist, Gross was also an instructor to thousands of Columbia students. He taught discrete mathematics, graph theory, and combinatorial theory, lecturing with humor and with what he called “enhancement,” short historic anecdotes from science and mathematics as well as from his own mathematical career and personal history. “Enhancements” were as integral to his courses as his meticulously put-together notes, often giving students insight into a different time and place.

He proved popular with students, who variously described him as devoted to his work, brilliant, idiosyncratic, and highly quotable.

For his excellent teaching, Gross received two SEAS awards; in 1994 he also received the career Great Teacher Award from the Society of Columbia Graduates.

In late career and retirement, Gross continues his research work with his co-authors around the world. Each year he produces numerous journal papers in topological graph theory, and he continues to travel to national and international mathematics meetings to give talks about his research and to chair sessions in his specialty. One math friend has joked, “Jonathan, you are in danger of flunking retirement.” To this, Gross responds that math is too much fun to stop and that he intends to flunk retirement for years to come.

His conclusion of active service at Columbia was marked in December with a dinner amidst remembrances by colleagues and family. Among those who shared their personal stories of Professor Gross, it was perhaps his daughter Rena who most closely articulated how much mathematics infused her father’s life when she recounted how, when she misbehaved as a child, her father would threaten, “Stop, or I’ll map you into the complex plane.”

Professor Gross plans to be back for the parties.

“Not only did we have no cell-phones or personal computers when I was young, most families did not have a television before 1950. We would start being nice to the rich kid around Thursday, in the hope that he would invite us to watch television at his house over the weekend.”

“When I say a baby-level proof, that’s just how mathematicians talk. I don’t actually know any babies who can do algebraic topology.”

“Negativebplusorminusthesquarerootofbsquaredminusfouracovertwoa. You have to say it very quickly, or you’ll get it wrong.”

“I have no idea what liquid soap will make your dishes sparkle, but I recommend liquid Joy for making high-quality knotted soap bubbles with interesting mathematical properties.”

– From a collection of quotes compiled by students

Student Awards

Inaugural Morton B. Friedman Prize Honors Robotics Innovator

[Photo: Jonathan Weisz]

Jonathan Weisz, a computer science PhD candidate at Columbia Engineering, has been named the recipient of the inaugural Morton B. Friedman Memorial Prize for Excellence. Named after the beloved professor and senior vice dean who was an integral part of Columbia Engineering for nearly 60 years, the prize honors undergraduate and graduate students who exemplify “Mort’s” legacy of academic excellence, visionary leadership, and outstanding promise for the future.

Weisz’s work at the Columbia University Robotics Group, led by Computer Science Professor Peter K. Allen, advances real-time grasp planning through brain-computer interfaces. He has developed code for a range of robotics platforms spanning research and industry and published several papers at peer-reviewed conferences. His research with Allen has included measures of grasp stability under uncertainty, “human-in-the-loop” grasping, and data-driven hand design optimization. He also helped manage integrating the various components of the Robotics lab’s grasping platform, arm trajectory planning, vision, grasp planning, and tactile sensing.

“The work our lab is doing with brain-computer interfaces and assistive robotics is exploring how far we can push practical, affordable technologies to help people with motor impairments regain some autonomy,” said Weisz.

Weisz participated in Phase 1 of DARPA’s Autonomous Robotic Manipulation (ARM) Challenge to create a manipulator capable of high-level tasks and adapting to real-world environments with little supervision, as well as the DARPA Robotics Challenge to develop innovative ground robots for use in disaster response operations. Previously, as a student and researcher at Johns Hopkins and the University of Southern California, he contributed to augmented reality projects to combat phantom limb pain and to small devices that measure motor impairment in cerebral palsy and osteoarthritis patients.

Professor Friedman, who founded the Division of Mathematical Methods, the precursor to the applied mathematics component of the Department of Applied Physics and Applied Mathematics, chaired the Department of Civil Engineering and Engineering Mechanics for 14 years. In his role as associate dean, vice dean, and senior vice dean, he was in the vanguard of engineering education and helped shape the School for many decades. He died last year at age 86.

“I’m very humbled by the link to someone who contributed so much,” Weisz said. “I’m hopeful that the community of researchers that Dr. Friedman helped build will continue to have an impact.”

Jesse Adams

The Morton B. Friedman Memorial Prize for Excellence is awarded periodically to an undergraduate or graduate student who best exhibits Professor Friedman’s characteristics of academic excellence, leadership, and outstanding promise.

PhD Student Wins Google Fellowship

[Photo: Riley Spahn]

Riley Spahn, a computer science (CS) PhD student working with CS professors Roxana Geambasu and Gail Kaiser, was recently awarded a North American Google PhD Fellowship for his work on privacy issues. He is one of 15 students chosen from a highly competitive group who represent the next generation of researchers working to solve some of the most interesting challenges in computer science.

“I’m very happy that Google will be supporting my research,” says Spahn, who will pursue research on operating and distributed systems with a focus on security, privacy, and data management. “How we manage and control data is a very important aspect of modern life and I’m excited to build tools that allow programmers to manage data in more secure ways and add transparency to how web services put our data to use.”

Google created the PhD Fellowship program in 2009 to recognize and support outstanding graduate students doing exceptional work in computer science and related disciplines.

Feature Articles

and data science, particularly the Columbia Engineering’s Computer Vision Laboratory The Future of DNA Sequencing special challenges of acquiring, storing, and analyzing huge Develops Cambits, a Modular Imaging System that Can amounts of genomic data. (The is Already in the Classroom first reading assignment was Big Transform Into Many Different Cameras Data: Astronomical or Genomi- cal? by ZD Stephens and others.) The class, however, has a major DIY twist. Rather than send- ing out DNA samples to a lab Computer Science Professor hardware and software system interface. Through the circuit, equipped with $1M sequencing Shree Nayar and Makoto that is modular, reconfigurable, each block can provide power machines, Erlich would have Odamaki, a visiting scientist and able to capture all kinds of downstream and receive data students learn DNA sequencing from Ricoh Corporation, have images. We see Cambits as a upstream. Control signals are by actually doing it themselves. developed Cambits, a modular wonderful way to unleash the conveyed both up and down- imaging system that enables creativity in all of us.” stream. What makes this scenario even the user to create a wide range imaginable let alone possible is a “Using our novel architecture, of computational cameras. Cambit blocks, whose exteriors new, portable DNA sequencing we were able to configure a Cambits comprises a set of were 3D-printed, are easy and device called a MinION. Inexpen- wide range of cameras,” adds colorful plastic blocks of five quick to configure. They are sive (approximately $1000), por- Odamaki, who spent two years different types—sensors, light attached through magnets: no table, and capable of sequencing screws, no cables. When two working with Nayar on the sources, actuators, lenses, and DNA in almost real time, the blocks are attached, they are proof-of-concept project. The optical attachments. 
The blocks MinION will vastly broaden the electrically connected by spring- suite of computational pho- can easily be assembled to applications of DNA sequencing loaded pins. The pins carry the tography algorithms used by make a variety of cameras with and who can accomplish it. Cambits was implemented by a different functionalities such as power (from a host computer, group of MS project students at high dynamic range imaging, tablet, or smartphone), data, The MinION uses a sequencing Columbia Engineering. Odamaki panoramic imaging, refocusing, and control signals. method different from traditional and Nayar are hoping to partner light field imaging, depth imag- (or sequential) DNA sequencing, Makoto Odamaki, visiting scientist from Each block has an ID and when with a manufacturer to bring ing using stereo, kaleidoscopic Halfway through their Ubiquitous students a brand new tool not The class teaches the basics of which works by first breaking up Ricoh Corporation, and Computer Science a set of blocks are put together, their concept to the public. Professor Shree Nayar examine Cambits. imaging and even microscopy. the host computer recognizes Genomics class, 20 students were yet in general use, it’s never clear DNA sequencing with an eye the DNA into tiny snippets be- “There are so many exciting handed a MinION device, a mobile how they are going to use it or on future sequencing technolo- fore painstakingly reassembling “We wanted to redefine what the current configuration and advances in computational DNA sequencer the size of two what they will find. That’s part of gies that promise to make DNA them, mapping them against a we mean by a camera,” says provides a menu of options for what the user might want to do. photography these days,” Nayar matchboxes laid end to end. This the fun, and the learning, too, and identification possible in real template DNA—a process that Nayar, who is the T.C. 
Chang $1000 device, now fully available Cambits is scalable: new blocks adds. “We hope this reconfigu- it shows the promise of onsite, time at almost any location. can take days and requires a Professor of Computer Science after being introduced in an early can be added to the existing set. rable system will open the door immediate DNA sequencing. high level of expertise. at Columbia Engineering and a access program, is expected to Taught in conjunction with So- to new avenues of creativity, pioneer in the field of compu- play an important role in advancing phie Zaaijer, a postdoc in Erlich’s Instead, MinION relies on nano- A key aspect of the Cambits bringing new dimensions to an But it was not all smooth sail- tational imaging. “Traditional design is a circuit board designed the goal of real-time, on-site DNA ing. While students found the New York Genome Center lab, pore sequencing, where a single- art form we all enjoy.” sequencing, vastly increasing the cameras are really like black by Odamaki that sits inside accidental parasites, some also the class combines aspects of stranded DNA molecule passes applications for DNA sequencing boxes that take one type of im- each block. The board includes misidentified the beef—pur- computer science, biology, elec- through a small biological pore, or age. We wanted to rethink the and, just as far-reaching, expanding trical engineering, algorithms, nanopore, embedded in an elec- a microcontroller, an upstream Holly Evarts the number of people who can do chased from a local New York instrument, to come up with a interface, and a downstream DNA sequencing. For their professor, City grocery store—as bighorn Yaniv Erlich, the device has a more sheep. Not a huge leap (both immediate purpose: a teaching tool animals are in the same family), that gives students direct experience but it does give pause to the idea with handling and sequencing DNA that real-time DNA sequencing samples for themselves. 
Plus he was will soon be in use at airports curious. What happens when you to screen passengers. give smart, ambitious students a new device not yet fully explored? Classroom encounters with DNA sequencing The parasites were a surprise. In sequencing a food sample pre- Sequencing DNA from food measured to contain 80% beef samples was the first of and 20% tomato, the students two hackathons in the class identified the DNA of three Ubiquitous Genomics, offered parasites (babesia bigemina, for the first time at Columbia wuchereria bancrofti, onchocerca and developed by Dr. Yaniv ochengi) and duly noted it as part Erlich, an assistant professor of Cambits, A Reconfigurable Camera: Cambits comprises a set of colorful plastic blocks of five different types—sensors, light sources, of their assignment. Identifying computer science at Columbia actuators, lenses, and optical attachments. The blocks can easily be assembled to make a variety of cameras with different functionalities. parasites in food hadn’t been the who is also faculty member of original intent, but when you give the New York Genome Center. Slightly more than half the students were computer science majors.

12 CS@CU SPRING 2016 CS@CU SPRING 2016 13 Feature Articles (continued)

trical field. As the DNA molecule transits through the nanopore, the individual nucleotides (A, T, G, C) that construct a string of the DNA disrupt the ion current in characteristic ways, creating a profile (called a squiggle) that can be analyzed by software to “decode” the nucleotide sequence, almost in real time.

The MinION is four inches long, weighs 4 ounces, and gets power from a computer’s USB port. Image credit: Oxford Nanopore Technologies

Erlich was able to procure for his class five MinIONs because the device’s manufacturer, Oxford Nanopore Technologies, is interested in exploring the potential applications of the MinION in education. (The class has generated interest among the community growing up around the MinION and was covered by a GenomeWeb article.)

Two hackathons count for half the grade

Half the grade would be determined by two hackathons, where the 20 graduate and undergraduate students, working in small groups, would be given the five MinIONs along with five PCs running MinION software. The first hackathon, “Snack to Sequence,” required student teams to identify ingredients of a food sample prepared by Zaaijer. In the second, “CSI Columbia,” students were given human DNA and asked to identify the specific individual who donated it. The first went much more smoothly than the second.

Before each hackathon, DNA samples were first prepared to create a DNA library for feeding into the MinION, a step that was done by Zaaijer. Though generating DNA libraries for MinIONs is much simpler than for other sequencers, it is time-consuming, requires a lab setting (and is therefore not mobile yet), and takes some finesse and experience.

With the libraries prepared, the students take over. Using a pipette, they dispense a solution containing the prepared DNA into the MinION’s flow cell (which contains 512 channels containing nanopores). Care must be taken to not introduce air bubbles that render the pores inaccessible. Pipetting is tricky, and generally one person on each team learned how to do it and performed the task each time.

As the solution seeps through the flow cell, individual molecules transit the nanopores, and software on the PC powering the MinION starts detecting the ion current disruptions. This raw data (in HDF5 format) gets uploaded to the cloud where software analyzes the recorded events to identify the individual bases. Minutes later, students begin seeing preliminary sequencing data on their screens. (All reads, along with new code written, were posted to the class GitHub site.)

In a classroom at the New York Genome Center, students observe MinION data during second hackathon. Screenshot shows stats on number and length of reads.

Not all 512 channels contain a nanopore that produces reads, but those that do produce individual files for each sequenced read. It’s a lot of data in a very short time, both the promise of the MinION and the beginning of the difficulty for the students.

Right away, students were faced with the question of how to transfer thousands of individual files from the lab-supplied PCs to their own (mostly Mac) computers where they could carry out their analysis. The sizes of the files precluded using cloud-based products such as Dropbox whose free accounts don’t support synchronizing data at such large scale. The file-transfer issue, after some grappling, was finally solved by placing the data in a BitTorrent Sync folder that was then synched to students’ computers (maxing out the hard drive in at least one case).

With the sequenced data downloaded, the students head out. Their task is now to compare their reads with existing DNA sequences found online to identify the sample DNA. This they do using existing alignment tools, many free, that compare two or more reads and produce a similarity score.

For the snack hackathon, students all used NCBI BLAST, a tool that makes it easy to run stand-alone searches for similar sequences and to discover, for instance, whether a given read aligns more closely with a template read from a tomato or from a zucchini. The concept is simple, but the difficulty level can ratchet up quickly depending on what two sequences are being compared. Discriminating between two species is one thing; differentiating between two humans who share many of the same traits is something else entirely.

Difficulty level increases in second hackathon

Of the two hackathons, CSI Columbia proved to be much more open-ended. Here the aim was to test whether MinION sequencing could be used to identify a single person. Normally short tandem repeats (STRs) are used to identify individuals (the FBI typically uses 13 different STRs for identification purposes), not the long reads returned by nanopore sequencing. As yet, no scientific framework exists on how to identify an individual using the reads generated from the MinION nanopore sequencer.

While there are existing alignment tools for comparing two or more human DNA sequences, almost all were developed for traditional sequencing methods. Choosing an alignment tool took time. With many different ones, it was hard to know where to begin. Even downloading the tools took time, a step that often had to be repeated when students discovered their first tool choice didn’t work well.

File formats were another issue and consumed a significant amount of time for the teams. Different tools accept and output different file formats. Many were incompatible; only some were standard.

For CSI Columbia, the difficulty level ratcheted up much more than even Erlich and Zaaijer had imagined. (In fact, CSI Columbia had initially been slated to occur first, ahead of the snack hackathon. However, preparing the DNA libraries for CSI Columbia took longer than planned, necessitating a switch in the order of hackathons.)

Students were not originally given any clues as to the identities of the individuals whose DNA was being sequenced; they were told only to search several online genetic databases for a close match. With students having to spend considerable time finding the right tool and overcoming file incompatibilities, halfway through the assignment Erlich narrowed the scope, naming himself, Craig Venter, James Watson, or someone in the 1000 Genomes Project as the possible suspects.

This extra information changed the scope considerably: rather than finding a single individual in a sea of others, the task became to look closely at a few individuals, and rule out others. Even then, only one of the five groups made the correct identification.

The main issue had to do with the number of reads students actually had to work with. Nanopore sequencing is less accurate and has more errors (deletions, insertions, and substitutions) and more noise than traditional sequencing. After filtering out those reads not meeting quality requirements for nanopore sequencing, students were left with a subset of reads covering the genome to around 1%. Such low coverage poses a challenge since much information about ancestry or traits is derived from tiny changes in the DNA (SNPs). Even so, students were able to learn some aspects of an individual’s ancestry and traits (including susceptibility to diseases).

(Erlich wants to offer the class again and is considering adding an intermediate, “where-you-are” report so students can help one another over encountered roadblocks.)

Fortunately for the students, the grade depended more on methodology and designing a workable sequencing pipeline than coming up with a correct identification. In this regard, the students excelled, even with the severe computational challenges of constructing an integrated pipeline out of several distinct steps (acquisition, storage, distribution, and analysis), each with its own particular file incompatibilities and data storage problems. Without a clear route already mapped out by others, students responded by writing their own code to plug up the holes and seamlessly transition data from one step to another.

The fundamental structure was sound; it was the data that was lacking. But even then, students demonstrated they were able to properly interpret the data they had. If they couldn’t identify the exact donor, they still were able to provide a list of traits that in the real world would help narrow the number of suspects.

Zaaijer points out also that students were dealing with a technology that is not yet mature. “Mobile sequencing is just now getting off the ground, and the error rate in the reads is still relatively high compared to traditional DNA sequencing, though many scientific groups are working on improving this. It was good for the students to experience that not everything is an iPhone where you open the box and it works. Technology evolves by hard work of many people who see a future (and applications) for new types of devices and machines. The hackathons were a good learning experience. Even though there are obstacles to overcome, the students also saw the opportunities the technology has.”

Students not only demonstrated they absorbed the basics of DNA sequencing but added ideas and strategies of their own. One team had taken a throw-processing-power-at-the-problem approach, setting up a dedicated server for the sole purpose of downloading the entire genomes of Watson and Venter, enormous files weighing in at 100 gigs for Watson, 80 gigs for Venter. It ran for over 24 hours before the team called a halt.

Interestingly, the one group that did correctly identify its suspect actually had the fewest reads but compensated by using a statistical approach that assigned probabilities to different templates, thus narrowing choices to the most likely candidate. It was an impressive and highly workable solution that Erlich sees as the subject of a possible scientific paper.

Final project

The final project, good for 25% of the grade, had students work in pairs to describe a new use for the MinION. Each group had different applications, from wastewater management, to safe person identification at borders, to sequencing in zero gravity. Especially innovative was the idea for at-home sequencing to trace potential transplant rejection; another proposed using the sequencer when traveling to find edible food and clean water resources.

How soon before these applications or any others start appearing in the real world? Once before and once after the hackathons, students were asked to estimate when mobile DNA sequencing might replace passport checks at national borders. Their answers were more conservative at the second asking, but not by much. Only one or two students revised their answer. Students clearly see the potential for mobile DNA sequencing, even with first-hand knowledge of the work and dedication still needed to optimize the technology.

Though there were hiccups, the problems had more to do with finding the proper tools and overcoming incompatible file formats. Erlich and Zaaijer had been pushing from the beginning to see how far the students could go; that some original assumptions didn’t work out was only to be expected. However, the main goal was clearly achieved: students new to DNA sequencing were able, with a little training, to successfully set up a sequencing pipeline and imagine new uses for the MinION. That a sophisticated process once relegated to specialized labs played out relatively smoothly in the classroom points to the huge possibilities of mobile, onsite DNA sequencing.

Says Erlich, “The future is here: we can place DNA sequencers in the hands of our students. No more theoretical explanation of how sequencers work, no more just data wrangling. We can let them feel the internal, promote critical thinking, and a sense of ownership. DNA is everywhere. In your food, on your clothes, everything you touch. By having these sequencers, we can let students get a glimpse for this rich data layer around them.”

Linda Crane

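The article says only that the one successful CSI Columbia team "assigned probabilities to different templates"; it does not describe the method. Purely as an illustration of that general idea, and not as the team's actual code, the Python sketch below scores each candidate by how well noisy base calls at a few known variant sites match the candidate's reference alleles, then normalizes the scores into probabilities. Everything here is an assumption made for the example: the function name, the simplified one-allele-per-site model, the 15% per-base error rate, and the toy data.

```python
import math


def candidate_posteriors(observations, candidates, error_rate=0.15):
    """Return {candidate: probability} under a uniform prior.

    observations: list of (site, observed_base) pairs taken from noisy reads.
    candidates:   {name: {site: reference_base}} (hypothetical reference data).
    error_rate:   assumed chance that a base call is wrong; a wrong call is
                  taken to be equally likely to be any of the 3 other bases.
    """
    log_likelihoods = {}
    for name, alleles in candidates.items():
        total = 0.0
        for site, base in observations:
            expected = alleles.get(site)
            if expected is None:
                continue  # no reference allele known for this site
            if base == expected:
                total += math.log(1.0 - error_rate)
            else:
                total += math.log(error_rate / 3.0)
        log_likelihoods[name] = total

    # Subtract the largest log-likelihood before exponentiating so that
    # long products of small probabilities do not underflow to zero.
    peak = max(log_likelihoods.values())
    weights = {n: math.exp(v - peak) for n, v in log_likelihoods.items()}
    norm = sum(weights.values())
    return {n: w / norm for n, w in weights.items()}


# Toy data: three made-up suspects and five noisy base calls.
candidates = {
    "suspect_a": {"site1": "A", "site2": "G", "site3": "T"},
    "suspect_b": {"site1": "A", "site2": "C", "site3": "T"},
    "suspect_c": {"site1": "G", "site2": "C", "site3": "C"},
}
observations = [
    ("site1", "A"),
    ("site2", "G"),
    ("site2", "G"),
    ("site3", "T"),
    ("site3", "C"),  # one call disagrees with the others at site3
]

posteriors = candidate_posteriors(observations, candidates)
best = max(posteriors, key=posteriors.get)
print(best, posteriors)
```

Because every observation only nudges the score, a scheme like this can still rank candidates when coverage is low and individual reads are error-prone, which is the situation the students faced.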
Researchers Develop Algorithm to 3D Print Vibrational Sounds

3D metallophone cups automatically created by computers.

In creating what looks to be a simple children’s musical instrument, a xylophone with keys in the shape of zoo animals, computer scientists at Columbia Engineering, Harvard, and MIT have demonstrated that sound can be controlled by 3D-printing shapes. They designed an optimization algorithm and used computational methods and digital fabrication to control acoustic properties (both sound and vibration) by altering the shape of 2D and 3D objects. Their work, “Computational Design of Metallophone Contact Sounds,” will be presented at SIGGRAPH Asia on November 4 in Kobe, Japan.

“Our discovery could lead to a wealth of possibilities that go well beyond musical instruments,” says Changxi Zheng, assistant professor of computer science at Columbia Engineering, who led the research team. “Our algorithm could lead to ways to build less noisy computer fans, bridges that don’t amplify vibrations under stress, and advance the construction of micro-electro-mechanical resonators whose vibration modes are of great importance.”

Zheng, who works in the area of dynamic, physics-based computational sound for immersive environments, wanted to see if he could use computation and digital fabrication to actively control the acoustical property, or vibration, of an object. Simulation of contact sounds has long interested the computer graphics community, as has computational fabrication, and, he explains, “We hoped to bridge these two disciplines and explore how much control one can garner over the vibrational frequency spectra of complex geometries.”

Zheng’s team decided to focus on simplifying the slow, complicated, manual process of designing idiophones, musical instruments that produce sounds through vibrations in the instrument itself, not through strings or reeds. Because the surface vibration and resulting sounds depend on the idiophone’s shape in a complex way, designing the shapes to obtain desired sound characteristics is not straightforward, and their forms have been limited to well-understood designs such as bars that are tuned by careful drilling of dimples on the underside of the instrument.

To demonstrate their new technique, the team settled on building a “zoolophone,” a metallophone with playful animal shapes (a metallophone is an idiophone made of tuned metal bars that can be struck to make sound, such as a glockenspiel). Their algorithm optimized and 3D-printed the instrument’s keys in the shape of colorful lions, turtles, elephants, giraffes, and more, modeling the geometry to achieve the desired pitch and amplitude of each part.

To demonstrate their optimization algorithm, the researchers built a “zoolophone,” a metallophone with playful animal shapes.

“Our zoolophone’s keys are automatically tuned to play notes on a scale with overtones and frequency of a professionally produced xylophone,” says Zheng, whose team spent nearly two years on developing new computational methods while borrowing concepts from computer graphics, acoustic modeling, mechanical engineering, and 3D printing. “By automatically optimizing the shape of 2D and 3D objects through deformation and perforation, we were able to produce such professional sounds that our technique will enable even novices to design metallophones with unique sound and appearance.”

Though a fun toy, the zoolophone represents fundamental research into understanding the complex relationships between an object’s geometry and its material properties, and the vibrations and sounds it produces when struck. While previous algorithms attempted to optimize either amplitude (loudness) or frequency, the zoolophone required optimizing both simultaneously to fully control its acoustic properties. Creating realistic musical sounds required more work to add in overtones, secondary frequencies higher than the main one that contribute to the timbre associated with notes played on a professionally produced instrument.

Looking for the optimal shape that produces the desired sound when struck proved to be the core computational difficulty: the search space for optimizing both amplitude and frequency is immense. To increase the chances of finding the optimal shape, Zheng and his colleagues developed a new, fast stochastic optimization method, which they called Latin Complement Sampling (LCS). They input shape and user-specified frequency and amplitude spectra (for instance, users can specify which shapes produce which note) and, from that information, optimized the shape of the objects through deformation and perforation to produce the wanted sounds. LCS outperformed all other alternative optimizations and can be used in a variety of other problems.

“Acoustic design of objects today remains slow and expensive,” Zheng notes. “We would like to explore computational design algorithms to improve the process for better controlling an object’s acoustic properties, whether to achieve desired sound spectra or to reduce undesired noise. This project underscores our first step toward this exciting direction in helping us design objects in a new way.”

Zheng, whose previous work in computer graphics includes synthesizing realistic sounds that are automatically synchronized to simulated motions, has already been contacted by researchers interested in applying his approach to micro-electro-mechanical systems (MEMS), in which vibrations filter RF signals.

The work at Columbia Engineering was supported in part by the National Science Foundation (NSF) and Intel, at Harvard and MIT by NSF, Air Force Research Laboratory, and DARPA.

Holly Evarts

In U.S. Senate Testimony, Henning Schulzrinne Offers Technology Solutions to Unwanted Calls

Robocalls are proliferating and becoming increasingly sophisticated and deceptive, purporting to be from banks or government agencies to trick and scare people into revealing personal information or transferring money. Recent advances in technology have reduced the cost of calling to close to nothing and made it easier to “spoof,” or misrepresent, the originating number or caller ID. The famous Do Not Call list, while effective against unwanted calls from legitimate businesses, is no deterrent to criminals intent on fraud. Seniors are especially vulnerable, and for this reason, the Senate Special Committee on Aging held hearings in June 2015 on possible new legislation to prevent unwanted calls. Among those testifying was Henning Schulzrinne, who provided the biggest takeaway of the day: technology offers solutions.

More than 10 years after the Do Not Call list was instituted, more robocall complaints than ever are being received by the Federal Trade Commission (FTC) and the Federal Communications Commission (FCC).

Technological advances are partly to blame. As the telephone infrastructure is changing from traditional copper wires to Voice-over-IP (VoIP) technology, what was once expensive and difficult (international calling, auto-dialing, falsifying caller ID information) has become cheap and easy, making it possible for almost anyone with a laptop and an Internet connection to flood phones with millions of robocalls and to do so from any location in the world.

The nature of the calls themselves has changed also. Before the list, most robocalls were legitimate telemarketers looking to make a sale. Against those calls, the Do Not Call list has been largely effective, leaving the field wide open to illegitimate operators who, like bank robbers walking past the meter on the way into the bank, ignore the Do Not Call list to commit the bigger crime of fraud, either conning victims into divulging personal information or selling services or products that never materialize.

The ability of robocallers to associate their numbers with any other number or caller ID name gives rise to a whole slew of semi-plausible scams: the IRS demanding payment for overdue taxes, the Social Security Administration requesting an account number to make a deposit, an extradition threat from local police if a debt is not immediately repaid. There are many others, like the one that promises a “free” medical alert system. Most people today know enough to be wary of such calls, but the robocallers’ simple business model (flood phones with millions of cheap calls to flush out the few naïve victims that make the business model work) is robust against a low success rate. Even a 95% or 99% suppression rate would not sufficiently discourage robocallers if it leaves the most likely victims unprotected.

To increase their odds of success and because VoIP makes it easy, robocallers often impersonate a legitimate bank or government agency. It’s called spoofing, and it is quasi-legal. The Caller ID Act of 2009 does make spoofing a crime but only when it is used to harm or defraud someone, something possible to prove only after the fact. No one seems too concerned, and companies openly sell spoofing software. There is even a free iPhone app for spoofing. An app is strictly small scale and for targeting specific individuals; for spoofing at industrial scale, robocallers are likely to turn to open-source phone switch software when inserting fake phone numbers into millions of calls.

And they usually get away with it. Experiments done by system staff at Columbia University showed that even large carriers do not reject implausible phone numbers such as 311-555-2368.

Testing showed that clearly fictitious numbers were transmitted even though it would be easy for phone carriers to identify and block them.

What you can do against robocalling

1. Hang up immediately. Do not press buttons or engage the caller.
2. Sign up for Nomorobo or other services that blacklist numbers of known robocallers. (Nomorobo is available only in the US and only from certain carriers.) Or sign up for services such as Google Voice’s free feature that prompts callers to say their names before you pick up.
3. File a complaint with the FTC. Complaints help define patterns of fraud and abuse, sometimes leading to investigations that result in fines.

Because senior citizens are especially vulnerable to such scams, the Senate Special Committee on Aging in June held hearings on possible legislative solutions. Chaired by Susan Collins (R-Maine), the committee called four witnesses: a small business owner who logged 62 robocalls within a month, an FTC representative who testified about her agency’s difficulty in dealing with the problem, and a Missouri Deputy Attorney General whose office last year fielded 57,000 complaints, 52,000 of which concerned unwanted calls.

Testifying about the technology aspects was Henning Schulzrinne, who developed the key protocols that enable VoIP and who continues to work on VoIP protocols as a professor of computer science at Columbia University. He is also knowledgeable about the policy issues, having served as the Chief Technologist at the FCC from 2012 to 2014. Although he currently consults for the agency, it was in his private role as a technology expert that he addressed the committee.

After summarizing eight categories of scams, Schulzrinne described the technology solutions, which fall into roughly three categories: filtering, caller ID and name authentication, and gateway blocking. Each, summarized below, has its strong points and limitations.

Filtering

Filtering, either through a third-party service or a downloaded app, works by checking each incoming call against a white list of trustworthy phone numbers or a black list of nonacceptable ones compiled in one of several ways: from FTC and FCC customer complaints, crowd-sourced by consumers, or collected through honeypots. (Honeypots are stealth servers programmed to act like normal phones, with numbers not assigned to any individual or company, for the express purpose of capturing the phone numbers of robocallers.) Built-in safeguards can ensure emergency alert calls get through as do calls placed from medical facilities; unknown phone numbers can be verified by making callers prove that they are human rather than robotic.

Filtering today has several drawbacks. It puts the onus on individuals, and it protects only those who know about filtering and are willing to do the setup, generally the most sophisticated people who are unlikely to fall for a scam in any case. By protecting the people who least need it, filtering today leaves the most vulnerable even more exposed.

Extending filtering to others is not currently easy. Filtering works on many landlines, and it is usually available only through large cable companies like Time Warner or Comcast that support external filtering services such as Nomorobo.

And filters are easily avoided by robocallers’ use of spoofing.

Caller ID and name authentication

Spoofing is perhaps the most nefarious aspect of the scamming schemes; almost anyone is likely to pick up when seeing the phone number of the local police department or the IRS. Spoofing has other bad uses as well since a caller ID is often used to verify one’s identity when gaining access to voicemail or when calling a bank, utility, or airline.

Preventing spoofing is necessary both to make filtering effective and to stop robocallers from impersonating others, and Schulzrinne offered possible ways to do it. One is to authenticate the originating number to ensure the caller is authorized to use the caller ID contained in the call setup message. Authentication would require phone carriers to insert links to new cryptographic certificates so any carrier along the way could validate the signature and detect spoofed caller IDs. These calls could then be labeled in some way or, if the customer prefers, rejected.

However, it’s not clear how much the phone carriers will do voluntarily. For years, carriers have resisted appeals to block robocalls, claiming that federal law prohibits them as common carriers from doing so. The US FCC pulled the rug out from under this excuse in a June 18 vote that explicitly states that phone companies are legally allowed to provide filtering to those customers who request it. (The FCC does not currently, however, obligate phone companies to provide filtering.)

Using his deep knowledge of the protocols, Schulzrinne offered an alternative approach to preventing spoofing, one that does not rely on carriers. The VoIP protocols (specifically the Session Initiation Protocol, or SIP) allow for changing the mechanics of how caller ID information is generated, and thus make it difficult to do spoofing in the first place.

Currently ID information is collected from many different databases and is often not validated, making it easy for fraudulent callers to insert any information they like, especially for numbers that have not been assigned to a carrier. Because SIP allows the calling carrier to insert name information directly into the call signaling request, it’s possible to avoid looking up the information in databases, making it easier to track who generated the information. Longer term, carriers may also indicate that they have validated the information by cross-checking it against service address records or credit card billing information, for example.

Blocking at the VoIP gateway (“do not originate”)

Perhaps Schulzrinne’s most innovative proposal is a do-not-originate list that would cut off robocalls closer to the source: at the VoIP gateways that connect VoIP calls to the traditional phone system. While VoIP robocalls can be placed from anywhere in the world, all such calls pass through such gateways to enter the traditional circuit-switched phone lines used by most large companies and large carriers. (Companies generally contract with a carrier that operates a VoIP gateway on their behalf to handle the transition for all incoming and outgoing calls.)

VoIP gateways currently do not check whether the originating number is valid. However, it would be easy to program them to reject originating phone numbers of companies that did not contract for their services or numbers known to be out of service. Any calls from numbers on a do-not-originate list (a reverse do-not-call list) would be rejected by the gateway and thus blocked from entering the phone system. Alternatively, the gateway could replace the fake caller ID information with a fraud indicator, such as the (made-up) area code 666. Consumer-chosen call filtering technologies can then reject those calls if the carrier prefers not to. While companies would have to list themselves on do-not-originate lists, those companies most likely to be impersonated would have incentive to do so.

The do-not-originate approach has the advantage that it can be implemented quickly and easily, without any changes in telephony protocols. Nor does it require cooperation of other phone carriers. It is no substitute for authentication, but it should prevent many of the most harmful calls from reaching consumers.

Breaking the business model

Each of the three methods (filtering, authentication, VoIP gateway blocking) does its part to add to the difficulty and expense of robocalling, but each addresses only a subpart of the problem. The do-not-originate list addresses spoofing of high-profile numbers of government agencies and banks but not other legitimate-sounding numbers robocallers invent (“Card Svcs,” “Medcare”). Authentication stops robocallers from impersonating legitimate businesses and government agencies (and makes fraudulent calls less likely to pay off) but does nothing to prevent robocalls themselves. Filtering can stop robocalls but currently protects the relatively few individuals who use it and is easily circumvented by spoofing.

But used in combination with one another, the three methods complement one another to undermine the economics of robocalling. Once authentication is in place to prevent spoofing and people can trust that phone numbers are legitimate, white lists of acceptable numbers (government agencies, banks, doctors) can be compiled and safely and widely distributed to protect even the most vulnerable. And without spoofing to disguise their calls, robocallers quickly get identified and blacklisted (and in the best case, shut down by law enforcement).

It’s the combination of methods, working in conjunction with the VoIP technology and the supporting protocols, that stands the best chance of approaching the 100% suppression rate needed to put an end to robocalling.

Since it was technology that allowed robocalling in the first place, it’s only fitting that technology be part of the solution.

Linda Crane

The full transcript of Schulzrinne’s testimony is at aging.senate.gov/imo/media/doc/Schulzrinne_6_10_15.pdf

Author Interview: Steven Bellovin on Thinking Security

In the face of relentless security attacks, is it possible to keep systems, data, and networks protected? Yes, says respected security expert Steven Bellovin, but it requires more than a static checklist of standard security measures. It requires looking ahead of current technology to anticipate vulnerabilities and understand how and why they exist; only then is it possible to identify the most effective defense mechanism and guard against new attacks. To help security specialists and other IT professionals foster a security mindset, Bellovin in his latest book, Thinking Security, describes fundamental security principles that are true no matter the computing environment or how much technology changes. It’s a pragmatic approach that presents security as a systems issue while considering cost, the value of the assets being protected, the actual threat, and employees’ need to be productive.

Thinking Security is written for network and security administrators, but some security advice applies to everybody.

• Use a password manager to securely store a different credential for every site and avoid reuse of keys.
  - Look for password managers that encrypt URLs and that add “salt” (a random string of data) to each password to add an extra layer of protection.
  - Though web access to a password collection is convenient, it is also more dangerous, especially when using potentially insecure machines.
  - One nice feature is the ability to copy a password to the clipboard for easy pasting into web forms; however, check that the clipboard gets automatically cleared.
  - The more integrated the manager is with a browser, the more risk there is that malware can abuse it to steal your credentials.
• If your bank offers online access to your account, use it. By regularly logging in, you’ll detect fraudulent activity more quickly.
• Use a credit card rather than a debit card when making purchases, especially when you don’t completely trust a site. US law limits cardholders to $50 liability in the case of unauthorized card use. (For debit cards, which are covered under a different law, you’re liable for up to $50 if you report within two days; after two days, you’re liable for up to $500. After 60 days, you’re liable for the entire stolen amount.) Debit cards have the added risk of being a direct line into your bank account.

Why did you write Thinking Security?

Dissatisfaction with how security is practiced in the real world. Security today tends to rely on checklists based on yesterday’s technology and yesterday’s threats. Checklists can’t cover every situation, and they can’t anticipate new types of attacks.

After years of seeing misleading and simplistic security recommendations in the mainstream press, I started thinking about underlying principles and what security advice is always going to be the same no matter what happens in technology; it’s the way I try to get my students to think about security. All those things together went into the book.

What is an example of misleading security advice?

That a strong password will protect you. The rules on picking strong passwords go back to a paper in 1979, so this is not new technology and there are ways to bypass strong passwords. Keystroke loggers and phishing attacks, for example, don’t care how strong your password is.

The underlying vulnerability here is the reuse that occurs when you’re sending something to one site that can be stolen from that site and reused against you. In RSA SecurID, generally considered very secure, a cryptographic secret is embedded in the token, but a server somewhere has a copy of that secret. Anyone hacking into that server can impersonate the file of tokens kept there. Some will say, “Lock down the server.” If you can lock down the server, why can’t you lock down your password file? Why is that server more secure than a password file? It’s not.

A one-time password if done right is secure, not because it is hard to guess, but because it can’t be reused.

Your most valuable password is your email account password, because that’s used for all the password resets. Anyone having your email password can potentially learn any password emailed to you, no matter how strong the password.

Passwords are ubiquitous. Can they be used safely?

I use and recommend password managers, though there are some bad designs out there. The book discusses the characteristics that make for a good password manager.

If checklists are too static to be useful, how should people go about ensuring their systems are secure?

It starts with two fundamental questions: what are you protecting, and against whom are you protecting it.

If you don’t answer those questions, you’re doing security just to do security, forgetting that the purpose of security is not to increase security, but to prevent loss.

Defenses have to be matched to the likely attacks. If you’re protecting a single database accessed only by employees, a firewall will probably suffice. However, if it’s 17 databases all tied together and made to function as a single resource while also needing to be accessible by those inside and outside the company, a firewall is not going to work.

Nor will a firewall protect you from an attack launched from inside. Even employees might work against a firewall if it prevents them from getting their work done.

People might be surprised to see you say firewalls may not provide needed security. You and William Cheswick wrote the first book on firewalls (Firewalls and Internet Security: Repelling the Wily Hacker).

That book was written in 1994. Networks and systems are more complicated and interconnected these days. From a security perspective, complexity is fatal.

Firewalls work well when there is a clear distinction between the inside and outside of what you’re protecting. Today it’s not always clear-cut. Companies often make their databases and parts of their network accessible to outside contractors, vendors, or auditors. In such cases, a firewall is not appropriate, but not

having a firewall creates vulnerabilities that can be exploited.

Which is what happened in the breach at Target. Attackers obtained the network credentials used by Target’s HVAC vendor, which had external access to Target’s network. Once inside, hackers were able to move freely over Target’s network, which from all accounts was rather loosely structured with little segmentation. Internal firewalls should have been used to cordon off sensitive parts of the network, like the payment system, which is how the attackers were ultimately able to steal credit card information.

Legacy systems are a problem; an internal network might have started off simple but just then grew, with security put in this one spot here and another spot over there, but never with any overarching vision of how it should be done; before too long it’s too late to do a coordinated plan.

The Target breach was one of many big ones in the last couple of years. Federal agencies were attacked multiple times, Home Depot, Anthem Health, even Chase.

Attackers seem to be always one step ahead...

...mostly, but not completely. We don’t hear about the attacks that get repulsed.

With all the risks, would you recommend people not use online banking?

No. And I’ll tell you why. As a matter of practice, banks don’t hold customers liable for money hacked from their bank accounts because the next bank down the street won’t. It’s the competitive landscape.

Merchants and businesses, however, are generally liable. A customer is not. There are a lot of things I will do if I’m not liable.

Many things come down to economics, and security is one of them.

That is one of the main themes in the book.

Yes, security costs money. Companies have to spend resources, understand the need, and they have to be willing to accept inconvenience in order to protect themselves. If it takes two signatures to fully protect something, do the extra bit of work to get two signatures.

But companies are under other pressures. Online vendors need to make their sites easy to use for customers.

Amazon for instance generally does not make you go through the extra step of inputting the 3- or 4-digit card security code that other sites require because Amazon has made one-click ordering a business priority. Less secure verification will incur some loss, but Amazon is willing to eat those losses, figuring net profit is greater than the loss if it’s easy for people to buy.

Insecurity is not a state of sin; it’s part of running a business and business can be risky.

Are you optimistic people can secure sites and data?

Yes and no. The biggest cause of security problems is buggy code. This is not a new thought of mine. It was true 20 years ago, and though code is better written today, programs are bigger and more complex. It’s hard to imagine what a defense against buggy code would look like.

Any system must also be periodically re-evaluated for vulnerabilities, something that rarely happens.

Research into new security measures is ongoing. When I came to Columbia ten years ago, sandboxing was known to have good properties, but it was not then in general use. Today it’s a mainstream part of all operating systems.

Digital rights management has also been more successful at protecting proprietary content than I thought it would be. It works because most of the content that people were pirating can now be bought at reasonable prices online. From a technological perspective, digital rights management doesn’t seem to be something that should work, but it works from an economic perspective. Not perfectly, of course, but good enough.

I’m morally certain that right now someone in Silicon Valley or Tel Aviv or Hyderabad or Beijing or Accra or somewhere is devising something that 10 years from now, we’ll find indispensable, but will have as profound an effect on security as today’s smartphones have had on communications and society. We just don’t know what it is yet.

Right Time, Right Place: A Collaborative Approach for More Accurate Context-awareness in Mobile Apps and Ads

The right information delivered at the right time can make apps and ads more appealing and relevant to customers: a traffic app that auto-updates for the work or home commute as appropriate; a restaurant that offers lunch coupons for people who work in the area but dinner coupons for people who live nearby. This level of customization requires taking into account a user’s immediate context, something that is not easy to do. It requires both location data and a temporal framework that gives meaning to each location, identifying it as home, work, commute, or another place frequented by a user. But location data is often surprisingly sparse for any one user. To overcome sparsity and construct reliable weekly routines for individual users, researchers integrated global temporal patterns inferred from the entire data set with user-specific spatiotemporal data. The resulting method is entirely data-driven (requiring no labeling) and flexible to accommodate variations in a user’s weekly schedule.

The location data needed for context-aware ads and apps is surprisingly sparse for any single user. For privacy and to conserve energy, most smart phone apps log users’ locations only when the app is active. The result is that location data sets collected from apps comprise many users but few observations per user. This sparsity makes it difficult to know how a particular GPS position is relevant to a user, whether it represents a work place, home, a point along the morning or evening commute, or some other frequently visited destination. It’s this contextual information that allows companies to customize their apps and ads for their customers’ immediate or near-future locations. Local businesses especially benefit when they can accurately predict who is or will soon be in their area. (It’s not just companies that want context-aware apps and ads; there is evidence users do, too. Cisco found that half of customers surveyed would use coupons sent from a nearby store.)

Location data for any single user may be too sparse to understand when a user transitions between places, but the collection of data across all users represents much more information that can help illuminate broader patterns. To exploit this collective information, four researchers (Berk Kapicioglu, David S. Rosenberg, Robert Schapire, and Tony Jebara) developed a data-driven method that learns people’s important places based on global temporal patterns inferred from the entire data set. They described this method in the paper Collaborative Place Models presented in July 2015 at the International Joint Conference on Artificial Intelligence.

Collaborative place models differ from previous methods that first label locations according to time of day and day of week. By assuming, for example, a 9-to-5 workday Monday through Friday, methods that rely on labeling might average positions between 8pm and 6am and call that home while averaging positions between 9 and 5 and calling that work. It’s an intuitive approach but it lacks flexibility (not everyone has the same schedule) and it ignores the commute, which can be a significant amount of time for some people and a missed opportunity for those businesses located along the commute.

Rather than imposing a static temporal framework, collaborative place models learn the quantitative relationship between week-hours by inferring similarities across all users, relying on Bayesian estimation techniques to do so. With a global temporal framework thus set, the relevance of the sparser latitude-longitude GPS coordinates from individual users can then be determined from how they fit into the global temporal pattern. In this way, the model re-constructs a particular user’s home-work-commuting schedule even though a user might have been observed only at Thursday 3pm and Monday 1pm.

To prove the concept, the researchers tested the model using two real-world data sets, a sparse one collected from a mobile ad exchange, and a dense data set from a cellular carrier. In both cases, the only inputs were user IDs, latitudes, longitudes, and time stamps. (Data was anonymized by removing all personal information.)

With data aggregated across all users, a strong, global temporal pattern emerged fairly quickly, one that contained within it several temporal clusters correlated with work, morning and evening commutes, leisure times after work, and sleeping at night. With the global pattern thus established, the individual spatiotemporal patterns of individual users became apparent even with few data points associated with each user.

Figure: The strong, well-defined pattern on the left results from combining global weekly patterns with spatiotemporal data of an individual user arbitrarily chosen from the dense dataset. The right distribution (for the same user) represents a previous baseline model that did not infer global patterns and so was not able to correctly identify important places.

The spatial extent of place types associated with temporal clusters was determined by replacing multiple observations logged during the same hour with their geometric median (computed using Weiszfeld’s algorithm) and by clustering nearby points using a Gaussian mixture model that is a subcomponent of the collaborative place model. This contrasts with the use of averaging in other place models to handle redundant observations and the noise that occurs from GPS errors and from having multiple cell towers covering the same location; by not averaging, the collaborative place model avoids the strange results sometimes caused by deviations in the regular routine, such as a late work evening or a night or weekend away from home.

Flexibility was built into the model by allowing users to have varying numbers of places or week hours. This flexibility turned out to be key; an early, simpler prototype that constrained users to have the same week-hour distribution performed worse than a baseline model.

In the end, data by itself was enough to reliably assess a user’s spatiotemporal schedule. Without the need to label or average location places, the collaborative approach of combining global patterns with sparse user location data reduced the median distance error by 8% from a simpler non-collaborative baseline model.

Linda Crane
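The geometric-median step mentioned in the place-model article can be illustrated with a short sketch. The Python below is a minimal, assumed version of Weiszfeld’s iteration for 2-D points; it is not the researchers’ code. The function name `geometric_median`, the tolerance, and the sample coordinates are hypothetical, and coordinates are treated as planar, which is a simplification for latitude and longitude.

```python
import math


def geometric_median(points, tol=1e-9, max_iter=1000):
    """Approximate the geometric median of 2-D points (Weiszfeld's iteration).

    The geometric median minimizes the sum of distances to all points,
    which makes it less sensitive to outliers than the plain average.
    """
    # Start from the centroid (the ordinary average).
    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)

    for _ in range(max_iter):
        num_x = num_y = denom = 0.0
        for px, py in points:
            d = math.hypot(px - x, py - y)
            if d < 1e-12:
                # Simplification: skip a point that coincides with the
                # current estimate to avoid dividing by zero.
                continue
            w = 1.0 / d  # closer points get larger weights
            num_x += w * px
            num_y += w * py
            denom += w
        if denom == 0.0:
            break  # every point coincides with the estimate
        new_x, new_y = num_x / denom, num_y / denom
        moved = math.hypot(new_x - x, new_y - y)
        x, y = new_x, new_y
        if moved < tol:
            break
    return x, y


if __name__ == "__main__":
    # Four observations near one place plus one far-away outlier
    # (hypothetical planar coordinates).
    observations = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (100.0, 100.0)]
    mean = (sum(p[0] for p in observations) / 5, sum(p[1] for p in observations) / 5)
    print("average:", mean)
    print("geometric median:", geometric_median(observations))
```

In this example the average is dragged far toward the outlier, while the geometric median stays inside the cluster of four nearby observations, which is the behavior the article attributes to not averaging.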

Side-channel Attacks in Web Browsers: Practical, Low-cost, and Highly Scalable

[Figure: memorygram of L3 cache activity; axes: Cache Set (non-canonical) and Time (ms)]

Figure caption: A memorygram of L3 cache activity: Vertical line segments indicate multiple adjacent cache sets are active during the same time period. Since consecutive cache sets (within the same page frame) correspond to consecutive addresses in physical memory, it may indicate the execution of a function call spanning more than 64 bytes of assembler instructions. The white horizontal line indicates a variable constantly being accessed during measurements, and probably belongs to the measurement code or to the underlying JavaScript runtime.

In a paper presented at the ACM Conference on Computer and Communications Security, four computer scientists from Columbia University (Yossi Oren, Vasileios P. Kemerlis, Simha Sethumadhavan, and Angelos D. Keromytis) demonstrate that it’s possible to spy on activities of a computer user from a web browser, even in some cases determining what website(s) a user is visiting. This type of attack, dubbed spy-in-the-sandbox, works by observing activity in the CPU cache on Intel microprocessors. It affects close to 80% of PCs, and it represents an escalation and scaling up of what’s possible with side-channel attacks, requiring no special software or close proximity to the victim. Fortunately the fix is easy and Web browser vendors, alerted to the problem, are updating their code bases to prevent such attacks. One other upside: the spy-in-the-sandbox attack may serve as a primitive for secure communications.

In a side-channel attack, an attacker is able to glean crucial information by analyzing physical emissions (power, radiation, heat, vibrations) produced during an otherwise secure computation. Side-channel attacks are not new; Cold-War examples abound, from aiming a laser beam at a window to pick up vibrations from conversations inside, or installing microphones in typewriters to identify letters being typed. On computers, side-channel attacks often work by inferring information from how much time or battery power is required to process an input or execute an operation. Given precise side-channel measurements, an attacker can work backward to reconstruct the input.

Side channel attacks can be particularly insidious because they circumvent security mechanisms. Traditionally they are directed against targeted individuals and assume proximity and special software installed on the victim’s computer. However, those assumptions may have to be rethought after four computer scientists from Columbia University (Yossef Oren, Vasileios P. Kemerlis, Simha Sethumadhavan, and Angelos D. Keromytis) demonstrated for the first time that it is possible to launch a side channel attack from within a web browser. The method is detailed in their paper The Spy in the Sandbox: Practical Cache Attacks in JavaScript and Their Implications, which was presented October 12, 2015 at the ACM Conference on Computer and Communications Security.

The attack, dubbed spy-in-the-sandbox by the researchers, does not steal passwords or extract encryption keys. Instead, it shows that the privacy of computer users can be compromised from code running inside the highly restricted (sandboxed) environment of a web browser. The researchers were able to tell for instance whether a user was sitting at the computer and hitting keys or moving the mouse; more worrisome from a privacy perspective, the researchers could determine with 80% accuracy whether the victim was visiting certain websites.

More may be possible. As Yossef Oren, a postdoctoral researcher who worked on the project (now an Assistant Professor at the Department of Information Systems Engineering in Ben-Gurion University) puts it, “Attacks always become worse.”

In one sense at least, spy-in-the-sandbox attacks are more dangerous than other side-channel attacks because they can scale up to attack 1,000, 10,000, or even a million users at once. Nor are only a few users vulnerable; the attack works against users running an HTML5-capable browser on a PC with an Intel CPU based on the Sandy Bridge, Ivy Bridge, Haswell, or Broadwell micro-architectures, which account for approximately 80% of PCs sold after 2011.

How it was done

Neither proximity nor special software is required; the one assumption is that the victim can be lured to a website controlled by the attacker and leaves open the browser window.

What’s running in that open browser window is JavaScript code capable of viewing and recording the flow of data in and out of the computer’s cache, specifically the L3, or last-level, cache. (A cache is extra-fast memory close to the CPU to hold data currently in use; caching data saves the time it would take to fetch data from regular memory.)

That an attacker can launch a side-channel attack from a web browser is somewhat surprising. Websites running on a computer operate within a tightly contained environment (the sandbox) that restricts what the website’s JavaScript can do.

However, the sandbox does not prevent JavaScript running in an open browser window from observing activity in the L3 cache, where websites interact with other processes running on the computer, even those processes protected by higher-level security mechanisms like virtual memory, privilege rings, hypervisors, and sandboxing.

The attack is possible because memory location information leaks out by timing cache events. If a needed element is not in the cache (a cache miss event), for instance, it takes longer to retrieve the data element. This allows the researchers to know what data is currently being used by the computer. To add a new data element to the cache, the CPU will need to evict data elements to make room. The data element is evicted not only from the L3 cache but from lower-level caches as well. To check whether data residing at a certain physical address are present in the L3 cache as well, the CPU calculates which part of the cache (cache set) is responsible for the address, then only checks the certain lines within the cache that correspond to this set, allowing the researchers to associate cache lines with physical memory.

In timing events, researchers were able to infer which instruction sets are active and which are not, and what areas in memory are active when data is being fetched. “It’s remarkable that such a wealth of information about the system is available to an unprivileged webpage,” says Oren.

“While previous studies have been able to see some of the same behavior, they relied on specially written software that had to be installed on the victim’s machine. What’s remarkable here is that we see some of the same information using only a browser,” says Vasileios Kemerlis, a PhD student who worked on the project (now an Assistant Professor in the Computer Science Department at Brown University).

By selecting a group of cache sets and repeatedly measuring their access latencies over time, the researchers were able to construct a very detailed picture, or memorygram, of the real-time activity of the cache.

Such a detailed picture is possible only because many web browsers recently upgraded the precision of their timers, making it possible to time events with microsecond precision. If memorygrams were fuzzier and less detailed, it would not be possible to capture such small events as a cache miss. (Different browsers implement this new feature with different precisions.) High-resolution timers have recently been added to browsers as a way to give developers, especially game developers, sufficient fine-grained detail to know what processes might be slowing performance. Of course, the more information developers have, the more information an attacker can access also.

Different processes have different memorygrams, and the same is true for different websites; their memorygrams will differ depending on the data the site is using, how the site is structured, how many images it contains and the size of those images. These various parts of the website end up in different locations in memory, and need to be called and cached, giving each website its own distinctive signature.

The researchers visited 10 sites and recorded multiple memorygrams in each case to build a classifier that could, with 80% accuracy, determine if a website open on a victim’s machine matched one of the 10 pre-selected sites. (The same website viewed on different browsers will exhibit slight differences; it’s this noise that prevents 100% accuracy when matching memorygrams.)

Future work

As pernicious as is the side-channel attack, especially considering how practical, scalable, and low-cost it is, avoiding it is surprisingly easy: run only a single web browser window at a time. An across-the-board fix to prevent the attacks is easy also; have browsers return to using less precise timers (or alert users of the high-precision timers that there exist possible security vulnerabilities).

And in this story of data privacy at least there is a happy ending. In March 2015, the researchers shared their research with all major browser vendors; by September 2015, Apple, Google, and Mozilla had released updated versions of their browsers to close the identified security hole.

The researchers are not yet done examining the potential of web-based side-channel attacks. They will continue looking at the problem (on old versions of browsers) to test the attack at larger scale. They are also considering a more interesting question; can memorygrams be used for good purposes?

A pre-set memorygram might be placed in memory to be viewed by a trusted party to convey information. One memorygram might represent a 1 bit, another a 0 bit. The process of communicating in this fashion would be slow, but it would be extremely difficult for an attacker to even figure out that explicit communication is occurring between two parties. Memorygrams might thus serve as a primitive in securely conveying information, and what was once a threat to security may serve to enhance it.

Linda Crane
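The idea of building a memorygram by repeatedly timing cache sets can be made concrete with a toy simulation. The Python sketch below is not the researchers’ JavaScript attack and measures no real hardware: it models a small set-associative cache with least-recently-used eviction, and every name and number in it (`ToyCache`, `NUM_SETS`, `WAYS`, the latency units, `VICTIM_BASE`) is a hypothetical choice made for illustration. The attacker fills each cache set with its own lines, lets the victim run, then re-reads its lines; a slow re-read means the victim touched that set.

```python
from collections import OrderedDict

NUM_SETS = 8         # hypothetical: number of cache sets in the toy cache
WAYS = 4             # hypothetical: lines per set (associativity)
HIT, MISS = 1, 10    # hypothetical latency units for a hit and a miss
VICTIM_BASE = 8000   # victim addresses start here (a multiple of NUM_SETS)


class ToyCache:
    """A tiny set-associative cache with LRU eviction (simulation only)."""

    def __init__(self):
        self.sets = [OrderedDict() for _ in range(NUM_SETS)]

    def access(self, address):
        """Touch an address and return the simulated access latency."""
        lines = self.sets[address % NUM_SETS]
        if address in lines:
            lines.move_to_end(address)   # mark as most recently used
            return HIT
        if len(lines) >= WAYS:
            lines.popitem(last=False)    # evict the least recently used line
        lines[address] = True
        return MISS


def attacker_lines(set_index):
    """Attacker-owned addresses that all map to the given cache set."""
    return [set_index + NUM_SETS * k for k in range(WAYS)]


def probe(cache, set_index):
    """Re-read the attacker's lines for one set; return the total latency."""
    return sum(cache.access(a) for a in attacker_lines(set_index))


def memorygram(cache, victim_trace):
    """Return one row of per-set probe latencies for each time step."""
    for s in range(NUM_SETS):            # prime: fill every set
        probe(cache, s)
    rows = []
    for victim_addresses in victim_trace:
        for a in victim_addresses:       # the victim runs for one time step
            cache.access(a)
        rows.append([probe(cache, s) for s in range(NUM_SETS)])
    return rows


if __name__ == "__main__":
    trace = [[VICTIM_BASE + 2], [], [VICTIM_BASE + 2, VICTIM_BASE + 5]]
    for row in memorygram(ToyCache(), trace):
        # '#' marks a set that probed slowly, '.' a set that stayed cached.
        print("".join("#" if t > WAYS * HIT else "." for t in row))
```

Each printed row is one time step and each column one cache set, which mirrors the layout of the memorygram figure in miniature. A real attack has to cope with noise, unknown address mappings, and timer precision, none of which this sketch attempts to model.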

In Memoriam

Joseph F. Traub

“Let me tell you how I got hooked on computing. For my thesis I worked for six months starting from a mathematical model of the helium atom and writing a program to compute the energy and other parameters of the atom. I took the cards from the IBM 650 and loaded them on the printer. The printer started spewing out approximations to the ground state energy of helium. I was using a variational principle which means I was converging down to the ground state energy of the helium matter. Watching, after the six months of work, the numbers rolling off the printer, and seeing that the initial numbers approximated the experimentally measured ground state energy of the helium atom good to four places. That was the moment.”

Joseph F. Traub, a pioneering computer scientist and founder of the Computer Science department at Columbia University, died Monday, August 24, 2015 in Santa Fe, NM. He was 83.

Most recently the Edwin Howard Armstrong Professor of Computer Science, Traub was an early pioneer in computer science years before such a discipline existed, and he would do a lot to shape the field.

Traub was most known for his work on optimal algorithms and computational complexity applied to continuous scientific problems. In collaboration with Henryk Wozniakowski, he created the field of information-based complexity, where the goal is to understand the cost of solving problems when information is partial, contaminated, or priced. Applications for information-based complexity are diverse and include differential and integral equations, continuous optimization, path integrals, high-dimensional integration and approximation, and low-discrepancy sequences.

Understanding the role of information about a problem was a unifying theme of Traub’s contributions to a number of diverse areas of computing. Often collaborating with others, he created significant new algorithms, including the Jenkins-Traub algorithm for polynomial zeros, the Kung-Traub algorithm for comparing the expansion of an algebraic function, and the Shaw-Traub algorithm to increase computational speed. He authored or edited ten monographs and some 120 papers in computer science, mathematics, physics, computational finance, and quantum computing.

Apart from his scientific research, he had a major role in building and leading organizations that promoted computer science. In 1971, at the age of 38, he was appointed chair of the computer science department at Carnegie Mellon University (CMU), overseeing its expansion from fewer than 10 professors to 50, and making it one of the strongest computer science departments in the country.

Based on his achievements at CMU, Columbia University in 1979 extended an offer to Traub to found the University’s Computer Science department. He accepted the offer and chose to locate Computer Science within the Engineering School, which at the time offered a single computer, only three tenured faculty members teaching computer science, and a huge demand for computer classes.

After securing a $600,000 gift from IBM (which later provided another $4 million), he was able to add faculty and attract top students. Within a year the department was awarding bachelor’s and master’s degrees as well as PhDs. He would chair the department until 1989.

In 1982 he oversaw the construction of the Computer Science Building, working closely with architects to come up with a final design that would later win awards.

Traub liked building things from scratch. In 1985 while still chair of the Computer Science department, he became the founding editor-in-chief of the Journal of Complexity (a position he held at the time of his death). In 1986, he founded the Computer Science and Technology Board (CSTB) of the National Research Council, serving as its chair from 1986 until 1992 and again in 2005 and 2009.

His awards and honors are many and include election to the National Academy of Engineering in 1985, the 1991 Emanuel R. Piore Gold Medal from IEEE, and the 1992 Distinguished Service Award from the Computer Research Association (CRA). He is a Fellow of the Association for Computing Machinery (ACM), the American Association for the Advancement of Science (AAAS), the Society for Industrial and Applied Mathematics (SIAM), and the New York Academy of Sciences (NYAS). He was selected by the Accademia Nazionale dei Lincei in Rome to present the 1993 Lezione Lincee, a cycle of six lectures. Traub received the 1999 Mayor’s Award for Excellence in Science and Technology, an award presented by Mayor Rudy Giuliani.

In 2012, his 80th birthday was commemorated by a symposium at Columbia’s Davis Auditorium to celebrate his research and contributions to computer science.

Traub’s “contributions to Columbia’s Computer Science Department have been instrumental in establishing the strong foundation of excellence of our Computer Science department today, enabling our ongoing frontier leadership in this field,” said Dean Mary C. Boyce. “Joe will be sorely missed by all of us at Columbia and by the computer science community across the globe.”

A life of science and discovery

Traub always described himself as lucky: Lucky in his early life that his parents were able to flee Nazi Germany in 1939 and settle in New York City; that he had a knack for math and problem-solving just when those skills were needed; that a fellow student’s prescient suggestion led him to visit IBM’s Watson Laboratories where he first encountered computers. And lucky to be among the first to enter a new, unexplored field when he had the ambition to make new discoveries and a hunger to do something significant. In an interview recalling his life, he once said “I’m almost moved to tears but who could have expected such a wonderful life and such a wonderful career.”

That he returned to New York City to found Columbia’s computer science department is entirely appropriate. He attended both Bronx High School of Science and City College of New York (earning degrees in math and physics) before entering Columbia University in 1954 intent on a PhD in theoretical physics. That plan changed when he discovered computers, not at Columbia (which had no computers) but at the IBM Watson lab then located in Casa Hispanica, just off campus at 612 W. 116th Street. He was hired there as a fellow, gaining the perk of unlimited computer time.

In 1959 he earned his PhD under the Committee of Applied Mathematics at Columbia. After his first choice to work on a chess problem was rejected, he proposed instead a quantum problem that involved six months of programming to calculate the ground energy state of a helium atom, correct to four decimal points.

After graduating from Columbia, Traub went to work at Bell Labs then in its “golden 60s” when researchers were given wide latitude to choose projects and conduct pure research. It was there that a colleague one day walked into his office with a problem. Could Traub find the zero of a function that involved an integral? Mulling over the problem led to two observations: one, it was expensive to compute the function; and two, there were lots of ways of solving it. His thinking about how to select the best, most optimal algorithm culminated in his 1964 monograph Iterative Methods for the Solution of Equations. It was the start of his career with many publications to come.

His luck extended to his personal life. He was married to Pamela McCorduck, a noted author who also taught science writing at Columbia. He enjoyed skiing, tennis, hiking, travel, and good food.

He regularly spent his summers in Santa Fe, where he was an External Professor at the Santa Fe Institute and played a variety of roles over the years, often organizing workshops to bring together those working in science and math. It was in Santa Fe where he died Monday morning, unexpectedly and quickly, after having made plans to travel to Germany, Poland, and CMU. He is survived by his wife Pamela and two daughters, Claudia Traub-Cooper and Hillary Spector.

Joseph Traub was an important and valued member of the Computer Science department he founded. He will be missed by faculty, staff, and students.

In Memoriam

David S. Johnson

David S. Johnson, a leading expert in the area of computational complexity and the design and analysis of algorithms, died Tuesday, March 8, 2016. Since 2014, Johnson was a visiting professor at Columbia University.

The winner of the 2010 Knuth Prize for his contributions to theoretical and experimental analysis of algorithms, Johnson helped lay the foundation for algorithms used to address optimization problems, in which a best solution is sought among a large set of possible solutions to a problem. His papers on the experimental analysis of approximation algorithms were influential in establishing rigorous standards for algorithms that find an approximately optimal rather than exactly optimal solution. Such approximation algorithms play an important role within computer science both in theory and in practice.

Johnson researched and contributed to a range of foundational topics in both mathematics and computer science, including combinatorial optimization, network design, routing and scheduling, facility location, bin packing, graph coloring, and the Traveling Salesman Problem.

It is however his pioneering work on NP-completeness for which he is best known. He was one of the first to investigate NP-completeness, which deals with problems that are believed to be unsolvable within a reasonable amount of time in the worst case. His book, Computers and Intractability: A Guide to the Theory of NP-Completeness, co-authored with Michael Garey and written in 1979, has been called a classic for its rigorous treatment of NP completeness and for its clear, concise exposition. The book is one of the most cited references in all of computer science, with over 55,000 citations. Johnson continued to write on NP-completeness throughout his career, maintaining a column on the subject from 1982 until 1992 in the Journal of Algorithms.

Born December 9, 1945, Johnson attended Amherst as an undergraduate studying mathematics and went on to MIT where he earned a PhD in mathematics in 1973 for his thesis Near-Optimal Bin Packing Algorithms. The same year, he started his long and productive career at Bell Labs (and later AT&T Research) that would last until 2014. During this time, he published continuously, including several books and well over 100 papers and articles, many of which concern the best ways to cope with computational intractability and his developing interest in the interplay between theoretical and experimental analysis in computer science.

Johnson was an active member and leader in the theoretical computer science community, founding the Symposium on Discrete Algorithms (SODA), a conference that has become a top theory venue; for 25 years he served as SODA’s committee chair. He also created the DIMACS (Center for Discrete Mathematics and Theoretical Computer Science) Implementation Challenges. His work within the community was unflagging. He served on the ACM Council as Member-at-Large (1996-2004), chaired ACM SIGACT (1987-1991), edited the Journal of the Association for Computing Machinery (1983-1987), and he served as associate editor of ACM Transactions on Algorithms (TALG) since its founding in 2004.

In 2014, Johnson joined Columbia’s computer science faculty as a visiting professor, teaching CS students and interacting with faculty.

“We will miss David very much. He was a wonderful colleague and mentor for students,” said Julia Hirschberg, chair of Columbia’s Computer Science department.

In addition to the Knuth Prize awarded to him in 2010, Johnson is a 1995 Fellow of the Association for Computing Machinery and was just this year elected to the National Academy of Engineering. Johnson has an Erdös number of 2.

These awards do not do justice to his many contributions to the field of computer science, both written and in private consultation with colleagues and students. David Johnson will be missed for his expertise and for the modest and unassuming way in which he set about to better understand and communicate to others the foundational topics in computer science.

Department News & Awards

Professor Rocco Servedio of Computer Science and his former student Li-Yang Tan (PhD ‘14) are recipients of a four-year, $1.2M National Science Foundation (NSF) Award for their proposal to use random projections to prove lower bounds on Boolean circuits. The award will allow Servedio and Tan, now on the faculty of Toyota Technological Institute at Chicago, to continue work they started last year in their paper “An average-case depth hierarchy theorem for Boolean circuits.” Named Best Paper at the FOCS 2015 conference, it resolved a conjecture that had been open for close to 30 years.

Professor Tony Jebara of Computer Science, with Lamont-Doherty Earth Observatory scientists Joaquim Goes, Ryan Abernethey, and Helga Gomes, won the University’s Research Initiatives in Science & Engineering (RISE) competition for their project titled “Inferring Spatial Heterogeneity in Marine Phytoplankton Using Fluid Dynamics and Bayesian Machine Learning Techniques.” From 53 teams that entered this year’s competition, only six were chosen to receive funding for each project for up to two years.

Professor Junfeng Yang of Columbia University and Yinzhi Cao of Lehigh University are recipients of a four-year, $1.2M National Science Foundation (NSF) grant for their proposal to develop a new approach towards making systems forget data, or the concept they called “machine unlearning.” The success of their approach was demonstrated in their paper “Towards Making Systems Forget with Machine Unlearning” that appeared earlier in the 2015 IEEE Symposium on Security and Privacy.

Professor Luca Carloni of Computer Science is a guest editor of a special issue of Proceedings of the IEEE that focused on the evolution of Electronic Design Automation (EDA) and its future developments.

Kathryn Angeles, Student Affairs Officer in the Computer Science department, was awarded the inaugural American College Personnel Association (ACPA) CASHE Fellowship.

The article titled “Spatial Computing,” coauthored by Professor Steven Feiner of Computer Science, was the cover story of the January 2016 issue of Communications of the ACM (CACM), where past accomplishments, short-term opportunities and long-term research challenges of spatial computing are discussed.

A paper co-authored by Professor Changxi Zheng of Computer Science received the “Hot Paper Award” at ACM’s Hotwireless 2015. The paper, titled “3D Printing Your Wireless Coverage,” proposed WiPrint, a new computational approach to control wireless coverage by mounting signal reflectors in carefully optimized shapes on wireless routers.

Professor Steven M. Bellovin, coauthor of “Keys Under Doormats: Mandating Insecurity by Requiring Government Access to All Data and Communications,” received the M3AAWG 2015 J.D. Falk Award. The paper explains potential issues raised from the government’s request for a system that would allow it to access any secured file.

Professor Steven Nowick of Computer Science received a $420,000 National Science Foundation (NSF) award titled “An Asynchronous Network-on-Chip Methodology for Cost-Effective and Fault-Tolerant Heterogeneous SoC (System-on-Chip) Architectures” to explore and significantly advance plug-and-play systems for industrial applications. This grant will fund several significant new research directions in the area of asynchronous on-chip networks and systems.

Professor Martha Kim was named the recipient of the Edward and Carole Kim Award for Faculty Involvement. This award honors a faculty member demonstrating teaching excellence and a special, personal commitment to students. Nominations are made by undergraduate and graduate students.

Professor Shree Nayar received the Distinguished Faculty Teaching Award along with James Hone (Mechanical Engineering). This award is given on behalf of students and alumni for excellence in teaching, including dedication to undergraduate students. Selection is based on student evaluations and recommendations of a selection committee made up of three students and two alumni.

A paper by CS PhD student Aaron Bernstein and Professor Clifford Stein won the Best Paper Award at the International Colloquium on Automata, Languages and Programming (ICALP 2016), the main European conference in Theoretical Computer Science. The paper is titled “Fully Dynamic Matching in Bipartite Graphs.”

A paper by CS PhD student Jessica Ouyang and Professor Kathleen McKeown received the Notable Data Set Award at the 2015 Conference on Empirical Methods in Natural Language Processing. The paper is titled “Modeling reportable events as turning points in narrative.”

A paper co-authored by CS Professor Shree Nayar and Columbia Engineering researcher Daniel Sims won the Best Paper Award at the International Conference on Computational Photography. The paper is titled “Towards Self-Powered Cameras.”

Professor Shih-Fu Chang of Electrical Engineering and of Computer Science was awarded an honorary doctorate by the University of Amsterdam “in recognition of his pioneering contribution to our understanding of the digital universe, particularly in the areas of imagery, language, and sound.”

Three undergraduate computer science majors have been recognized by the Computing Research Association (CRA) for showing outstanding research potential in an area of computing research, including Yunsung Kim (SEAS’16) for his work on information privacy and anonymity in big data, Alison Y. Chang (CC’16) for her research in code switching, web scraping, and text-to-speech data selection, and Robert Ying (MS’16) for his work on assistive robotics and brain-computer interfaces.

A paper coauthored by CS PhD student Georgios Kontaxis received the Best Student Paper Award at the 2015 workshop on Web 2.0 Security and Privacy. The paper is titled “Tracking Protection in Firefox For Privacy and Performance.”

Lucas Kowalczyk, a first-year PhD student in computer science, has been awarded a National Science Foundation (NSF) Graduate Research Fellowship, which recognizes and supports outstanding graduate students in science, technology, engineering, and mathematics disciplines.

Henrique Teles Maia, currently completing dual degrees in computer science and mechanical engineering, has recently been awarded a National Science Foundation (NSF) Graduate Research Fellowship and a GEM Fellowship awarded by The National Consortium for Graduate Degrees for Minorities in Engineering and Science (GEM).

CUCS student Danfei Xu (currently a PhD student at Stanford University) won the Computing Research Association (CRA) Outstanding Undergraduate Researcher Award for his research in sensory perception of robotic systems, in particular tactile sensing, visual perception, and sensor fusion.

A paper by CS PhD student Adrian Tang, Associate Research Scientist John Demme, Professor Simha Sethumadhavan, and Professor Salvatore Stolfo received a Best Poster Award at the conference. The paper is titled “Anti-Virus in Silicon.”

CS Postdoctoral Researcher Alec Jacobson received the 2015 SGP software award for leading the development of the widely used geometry processing library, libigl.

Columbia University in the City of New York
Department of Computer Science
1214 Amsterdam Avenue
Mailcode: 0401
New York, NY 10027-7003

NEWSLETTER OF THE DEPARTMENT OF COMPUTER SCIENCE AT COLUMBIA UNIVERSITY WWW.CS.COLUMBIA.EDU