What is XSEDE?

The Extreme Science and Engineering Discovery Environment (XSEDE) is the most advanced, powerful, and robust collection of integrated digital resources and services in the world. It is a single virtual system that scientists can use to interactively share computing resources, data, and expertise.

The five-year, $121 million project is supported by the National Science Foundation. It replaces and expands on the National Science Foundation TeraGrid project. More than 10,000 scientists used the TeraGrid to complete thousands of research projects, at no cost to the scientists. XSEDE continues that same sort of work, with an expanded scope, generating more knowledge and improving our world in an even broader range of fields.

XSEDE is led by the University of Illinois’s National Center for Supercomputing Applications. The partnership includes:
• Carnegie Mellon University/Pittsburgh Supercomputing Center
• Center for Advanced Computing
• Indiana University
• Jülich Supercomputing Centre
• National Center for Atmospheric Research
• Ohio Supercomputer Center - The Ohio State University
• Rice University
• Shodor Education Foundation
• Southeastern Universities Research Association
• University of California Berkeley
• San Diego Supercomputer Center - University of California San Diego
• National Center for Supercomputing Applications - University of Illinois at Urbana-Champaign
• National Institute for Computational Sciences - Knoxville/Oak Ridge National Laboratory
• Texas Advanced Computing Center - The University of Texas at Austin
• University of Virginia

xsede.org

Table of Contents

Dawn of the XSEDE Era
John Towns, leader of the National Science Foundation’s new Extreme Science and Engineering Discovery Environment, talks about the vision for XSEDE and how it will build on the TeraGrid.

Science highlights

For the Birds
Supercomputers and citizen scientists converge to pinpoint avian populations

Malaria Mystery Solved
Humans likely source of malarial infections in great apes, not the other way around as previously thought

Ice, Ice, Baby
University of Washington researchers explore mysterious Antarctic sea ice

Improving Nature’s Top Recyclers
The National Renewable Energy Laboratory uses TeraGrid supercomputers to explore new enzymes for renewable fuels

Placing Landmarks on the Genome Map
Researchers show for the first time that differences in DNA between individuals can affect the binding of transcription factors

Turbulent Times
TeraGrid aids scientists in developing novel technique to reduce jet noise

Cold Dark Matter Lives
An international team led by University of Washington astrophysicists appears to have solved the problem of dwarf galaxies

No Charge Double Helix
Researchers derive the first accurate 3D structure of a synthetic double-helical molecule that holds promise for applications in biomedicine and nanotechnology

A Recipe for Science Success
Collaboration between Open Science Grid and TeraGrid aims to give researchers the right tools

Education, Outreach, and Training highlights

Building Skills that Count
A sampling of the education and outreach programs offered by TeraGrid partners

Champions Help Campuses Connect
Swarthmore exemplifies how dedicated champions broaden TeraGrid/XSEDE reach

Compelling, Ferocious Beauty
Cosmic simulations and visualization skill contribute to acclaimed feature film ‘The Tree of Life’

Taking Training on the Road
Collaboration with Southeastern Universities Research Association provides visualization workshops to minority-serving institutions

Being ‘Smart’ at Home
TeraGrid storage and visualization resources aid ‘smart grid’ research

On the cover: The Advanced Visualization Laboratory at the National Center for Supercomputing Applications generated this visualization of supernova data from Volker Bromm at The University of Texas at Austin. The image represents an initial data study, and the collaboration eventually yielded a final rendered scene for the feature film ‘The Tree of Life.’ Courtesy of the Advanced Visualization Laboratory at the National Center for Supercomputing Applications

Dawn of the XSEDE era
John Towns, leader of the National Science Foundation’s new Extreme Science and Engineering Discovery Environment, talks about the vision for XSEDE and how it will build on the TeraGrid.

John Towns, XSEDE Project Director

Research no longer typically happens in the context of a single investigator on a single campus. Instead, today’s investigators are collaborating across institutional and geographic boundaries. To be successful, researchers need access to dispersed resources, including instruments, data stores, and high-performance computers and, critically, to the tools and services that enable coordinated use and sharing of those resources.

The intent with XSEDE is to create the integrated environment in which all of these resources and services are available. We aim to establish an ecosystem that allows us to interoperate with other resources, with other infrastructure providers, and in which researchers and educators can be much more productive and can begin to develop new capabilities.

For example, we will lower the entry barrier for institutions and collaborations to connect to XSEDEnet by using National LambdaRail’s (NLR) FrameNet services. In Year 1, XSEDEnet will provide dedicated 10 Gbps connectivity to the core XD Service Providers (Indiana, NCSA, NICS, NCAR, PSC, Purdue, SDSC and TACC); then in Year 2 XSEDE will add a service to enable collaborators to create on-demand high-performance networks between XSEDE service providers and many other potential sites around the country.

XSEDE will also leverage GlobusOnline services to easily move data from campus servers, laptops, and desktops, allowing high-performance data movement to and from XSEDE resources.

Of course, along with new services and tools, we also want to continue providing the strong support that people relied on throughout the TeraGrid’s decade of operation; we want to make the transition from TeraGrid to XSEDE as non-disruptive as possible. Ralph Roskies and Nancy Wilkins-Diehr jointly lead XSEDE’s Extended Collaborative Support Services (ECSS), which encompasses Advanced Support for Research Teams (a continuation of TeraGrid’s Advanced Support for TeraGrid Applications), Advanced Support of Community Capabilities, and Advanced Support for Training, Education and Outreach.

XSEDE ECSS will also include support for Novel and Innovative Projects, an effort led by Sergiu Sanielevici at PSC. This effort will extend support to domain areas that have not typically tapped into high-performance computing, such as economics and computational linguistics, and to under-represented communities and institutions.

As XSEDE gets under way, what I find most exciting is the potential to include a lot more disciplinary areas and a lot more researchers who might not have had easy access to the resources in the past. New disciplines, new users, and being able to increase productivity in order to enable new science and engineering: that’s what makes the XSEDE project really exciting.

Science highlights

Discussion of cyberinfrastructure projects such as TeraGrid and XSEDE often focuses on bytes and data transfer rates, on hardware and software, on code and computers. But what these projects are really about is gaining new knowledge. Over the past decade, thousands of researchers used the resources, tools, and support provided by TeraGrid to better understand climate change, the flow of blood in our bodies, and the evolution of the universe. Thousands more investigators will use XSEDE to tackle these and many other challenges. The following pages offer just a small sampling of the research enabled by TeraGrid over the past year, and a hint of what may be to come with XSEDE.

For the Birds
Supercomputers and citizen scientists converge to pinpoint avian populations

Sometimes everything just comes together, creating a sum much greater than its parts. Such is the case with eBird, a bird-monitoring project by the Cornell Laboratory of Ornithology that is using a unique National Science Foundation (NSF) collaboration to revolutionize bird conservation and numerous areas of environmental science.

“Three distinct entities came together to make this possible: citizen scientists, a unique statistical algorithm, and the existence of large-scale (high-performance computing) facilities,” says John Cobb, a principal investigator for TeraGrid at Oak Ridge National Laboratory (ORNL) in Tennessee.

Birds are often the first to suffer when damage hits an ecosystem. For that reason they are a widely acknowledged environmental indicator. Thanks to the NSF’s Office of Cyberinfrastructure DataONE and TeraGrid initiatives, along with support from the Leon Levy Foundation, eBird was able to show, for the first time, how bird populations move week-by-week across America and identify the environmental conditions associated with these population movements.

Indigo Bunting (Passerina cyanea) distribution for June 28, 2008. The map shows the predicted occurrence corrected for variation in detectability associated with search effort. This estimate was derived from a model using data from eBird and data describing the local environment. Brighter areas indicate higher probability of occurrence. Courtesy: Daniel Fink, Information Science Department, Cornell Lab of Ornithology

“DataONE has brought together a lot of people and resources,” says Cornell Laboratory of Ornithology statistician Daniel Fink. In fact, eBird was DataONE’s pre-release science demonstration project, and the leading NSF digital data archive wanted an early science impact, which this project certainly delivered.

Essentially, the eBird project enlists the help of thousands of enthusiastic birdwatchers to record bird sightings along with the precise location and time. These observations are reviewed by a network of expert ornithologists and entered into eBird’s database, where environmental data describing the search locations are linked to the eBird records. This data is then fed to Texas Advanced Computing Center’s (TACC) Lonestar supercomputer, where a unique statistical algorithm is deployed to discover the dynamic associations between the environment and observed patterns of bird occurrence. Ecologists use these results to estimate bird occurrence week-by-week across the country, creating a dynamic bird census that provides invaluable information to conservationists and environmental scientists alike.

Among eBird’s more interesting findings: species with broad distributions will actually adapt to local niches distinct in different areas of the continent at different times of the year, and conservationists want to know what and where those niches are.

The maps provide so much valuable information that they were recently featured in the annual State of the Birds 2011 Report on Public Land and Waters, the nation’s first assessment of the distribution of bird species on public lands as a measure of stewardship responsibility. “The State of the Birds report is a measurable indicator of how well we are fulfilling our shared role as stewards of our nation’s public lands and waters,” notes Secretary of the Interior Ken Salazar. Thanks to the TeraGrid, policymakers got their first fine-grained nationwide glimpse at bird species distributions, a feat that just a few years ago dwelled in the realm of the impossible.

“TeraGrid allows us to provide information for multi-species analysis which is very useful in the ecology and conservation world and will allow us to do year-by-year difference comparisons to study responses from environmental change,” adds Fink. “TeraGrid was absolutely necessary for the State of the Birds Report, and there was no way we could have done it without Lonestar.”

Going forward, the eBird project aims to chart more than 200 species using more than six years’ worth of data from a 3 million-hour allocation. The data will be broken apart year by year to reveal changes over space and time, a great point of interest across environmental arenas.

“Without TeraGrid, we would be doing boutique analysis on our local cluster one species at a time,” says Fink.

For more information: www.ebird.org
Grant #: TG-deb110008

Malaria Mystery Solved
Humans likely source of malarial infections in great apes, not the other way around as previously thought

For centuries, malaria has mystified physicians and terrified patients, claiming more childhood lives than any other infectious disease across large sections of the world. Though much has been learned about the genetics of various Plasmodium parasites, which cause malaria across vertebrate species, key aspects of the evolutionary relationships of these parasites have been elusive.

With the aid of a portal linking them to TeraGrid expertise and computational resources, researchers led by the University of Maryland and the University of South Carolina have clarified the evolutionary history of Plasmodium by analyzing an unprecedented 45 distinct genes from genomes of eight recently sequenced Plasmodium species.

The results, published online in the journal Parasitology on Dec. 1, 2010, offer the first comprehensive dating of the divergence of these individual Plasmodium species and provide new insights into the complex relationship between Plasmodium species and their mammalian hosts.

“The results clarify the ancient association between malaria parasites and their primate hosts, including humans,” says James B. Munro, a researcher from the University of Maryland School of Medicine. “Indeed, even though the data is somewhat noisy due to issues related to nucleotide composition, the signal is still strong enough to obtain a clear answer.”

A major finding of the research is that humans likely serve as a reservoir for P. falciparum; that is, humans are likely to transmit this most virulent among all human-infecting Plasmodium species to great apes, not the other way around. This finding contradicts previous studies, which suggested that humans acquired P. falciparum from apes. The results obtained in this study argue that “if P. falciparum infections in great apes are derived from humans, [there may be a] need to establish refuges for the great apes that are safe from human intrusion.”

The research builds on the unveiling of the genome sequences of the two most widespread human malaria parasites, P. falciparum and P. vivax, and the monkey parasite P. knowlesi, together with the draft genomes of the chimpanzee parasite P. reichenowi; three rodent parasites, P. yoelii yoelii, P. berghei, and P. chabaudi chabaudi; and one avian parasite, P. gallinaceum. To examine the association between malaria parasites and their primate hosts, the researchers sought to compare genetic variations found in 45 highly conserved nuclear genes for which sequences are available from all eight Plasmodium species.

The evolutionary relationships were inferred using, among others, the software package MrBayes, which consumed about 200,000 CPU hours on Abe, a supercomputer at the National Center for Supercomputing Applications (NCSA). The researchers accessed this resource via the CIPRES Science Gateway, a browser interface developed at the San Diego Supercomputer Center (SDSC) that permits access to TeraGrid compute resources. “Without CIPRES, this work, and the other projects I am working on, would not go as quickly or as smoothly,” adds Munro. “CIPRES is a fantastic resource.”

Other phylogenetic or divergence time analyses were conducted on the Brazos computer cluster at Texas A&M University and on the cluster at the Institute for Genome Sciences at the University of Maryland School of Medicine. Further studies are expected to be run on Trestles, a new data-intensive high-performance computing resource at SDSC.

For more information: http://www.phylo.org/sub_sections/portal/
Grant #: GM43940, 5R01 GM070793-03

SCIENCE GATEWAYS: It sometimes takes a community

About five years ago, TeraGrid launched Science Gateways, an innovative program designed to provide a full range of specialized services and capabilities (computational analysis, visualization, workflows development, collaborative tools, and more) to a broad range of scientific communities. By the end of its first year, 2006, about 100 users had signed up. Since then, the number of users has soared to nearly 1,200, accounting for roughly 36 percent of all TeraGrid users charging jobs.

Communities came together to address problems in astronomy, chemistry, earthquake mitigation, geophysics, global atmospheric research, biology and neuroscience, cognitive science, molecular biology, physics and seismology, among others. This year, the largest community, CIPRES, represented 890 users, or 25 percent of all TeraGrid users charging jobs.

“Science Gateways is an excellent example of how to democratize science,” said Nancy Wilkins-Diehr, former TeraGrid area director for the program and now the co-leader of XSEDE Extended Collaborative Support Services. “It provided anyone, even those at small institutions, with access to the largest-scale HPC resources and expertise, with projects spanning a wide variety of research domains.”

A few Science Gateways highlights:

• Ultrascan connects hundreds of users globally to an ultracentrifuge at the University of Texas (UT) Health Sciences Center in San Antonio; TeraGrid resources run the high-resolution analysis of the experimental data.

• As an example of multidisciplinary work enabled by gateways, the Community Climate System Model (CCSM) was used by a joint political science and earth and atmospheric sciences class at Purdue University to simulate the influence of public policy decisions on climate change.

• GISolve, a grid-based environment for computationally intense geographic analysis, made the cover of the Proceedings of the National Academy of Sciences (PNAS).

Ice, Ice, Baby
University of Washington researchers explore mysterious Antarctic sea ice

Surface temperature (in degrees Celsius) response to depleting ozone over the second half of the 20th century. The response at low resolution (left) is about twice as strong as at high resolution (right). Courtesy: Cecilia Bitz, University of Washington

Current speeds (in cm/s) and sea ice extent (15% concentration contour) are shown for a randomly chosen October in the high- and low-resolution runs for preindustrial ozone levels. Depleting ozone increases the westerly wind stress, and therefore strengthens the ocean currents. This research addresses whether the parameterized eddy response at low resolution (left) correctly reproduces the resolved eddy response at high resolution (right). Courtesy: Cecilia Bitz, University of Washington

Cecilia Bitz, University of Washington

The Southern Hemisphere has a number of eccentricities familiar to those of us from north of the equator: penguins instead of polar bears, toilets that supposedly flush backwards, and a night sky that features a different cast of stars. The poles are no exception.

Unlike the Arctic, which is notoriously shedding more and more ice every year, the amount of Antarctic sea ice (literally frozen, floating ocean water) is actually increasing. While scientists aren’t exactly sure why, they do have a few main suspects. The current culprit of choice is the hole in the ozone layer that hovers over the Antarctic continent, creating a number of natural phenomena, not the least of which is increased wind circulation, creating a lower surface temperature on the Antarctic continent and altering ocean heat transport.

Not so fast, says Cecilia Bitz of the University of Washington and the principal investigator of the most detailed simulations to date of Antarctic sea ice. She too was a believer, until her team, a branch of the National Science Foundation’s PetaApps program, ran 10 km simulations of the Antarctic ice sheet using the Community Earth System Model (CESM) to determine if, as expected, the depletion of ozone at the bottom of our planet is indeed causing the sea ice to expand.

CESM is a fully-coupled, global climate model that provides state-of-the-art simulations of the Earth’s past, present, and future climate states. It is among the most sophisticated climate models available, a global model that provides consistent simulation of high-resolution effects. However, even CESM has its problems, as previous models of the Antarctic sea ice showed a decrease in area annually, in contrast with observations that show an overall expansion.

Bitz’s hypothesis was that the models were too coarse, and by ramping up the resolution tenfold her team could get to the truth. For example, other factors, such as ocean eddies, could play a major role, but they have been largely parameterized to vary with wind strength. Bitz believed that the ocean’s response to increased wind strength in the model’s coarse resolution was incorrect, downplaying important factors such as ocean currents and eddies.

After months of preparation, Bitz’s team began the two-month-long process of conducting the simulations. Most of the runs used 6,000 of Kraken’s more than 112,000 cores at the National Institute for Computational Sciences (NICS). Kraken was a TeraGrid supercomputer and is now available through XSEDE. Her team consumed more than 11 million CPU hours, with each simulation generating approximately 50 terabytes of data. “Just to analyze and interpret that amount of data is intensive,” says Bitz.

The team’s analyses of the Antarctic sea ice simulations aren’t exactly what was expected, but it does seem that they are getting closer to nature. “A lot of the subtle behavior is different at fine resolution,” notes Bitz. However, it is clear to Bitz that the expansion is likely not solely due to ozone.

In color is the annually averaged surface temperature over the planet, overlain with the change in winds at 850 millibars due to increased carbon dioxide and a stratospheric sulfate layer. The magnitude of these atmospheric circulation changes, especially over the Southern Ocean, is similar to that induced by just an increase in carbon dioxide. Courtesy: Cecilia Bitz, University of Washington

While the team’s simulations didn’t put an end to speculation regarding the expansion of the Antarctic sea ice, they were a milestone in climate modeling. Bitz is grateful to the NSF’s PetaApps program. “I could only do what I did with the help of that group,” she says, adding that NICS and Kraken “have both been very helpful to me.”

Bitz is also using two new XSEDE resources at NICS and the Texas Advanced Computing Center (TACC) to teach a climate modeling course and study geoengineering as a means to offset our increase in carbon dioxide production. Theirs is one of a handful of computational geoengineering studies that help to determine how a drastic human-induced change might interrupt the Earth’s environmental systems. The work follows up on other atmospheric studies by Bitz, including a recent publication in Nature that suggested, based on TeraGrid simulations, that greenhouse gas mitigation can reduce sea-ice loss and increase polar bear survival.

For more information: http://www.atmos.washington.edu/~bitz/
Grant #s: NSF #OPP-0938204 and DOE #0013706

Improving Nature’s Top Recyclers
The National Renewable Energy Laboratory uses TeraGrid supercomputers to explore new enzymes for renewable fuels

A coarse-grained model of the bacterial cellulosome system during the self-assembly process. The long scaffold (blue) contains binding sites for the free enzymes (red, yellow, and green) of different sizes. Courtesy: National Renewable Energy Laboratory

If a tree falls in the forest and there are no enzymes to digest it, does it decompose?

It’s a question that has important ramifications for the renewable energy industry. Scientists and engineers are studying ways to transform non-food-based plant material into transportation fuel: think alfalfa stalks and wood chips, as opposed to the edible corn grains used in the production of ethanol.

“Cellulose in the biosphere can last for years,” says Gregg Beckham, a scientist in the National Bioenergy Center at the U.S. Department of Energy’s National Renewable Energy Laboratory (NREL). “It’s really tough, and we want to know why (this happens) at the molecular scale.”

Despite the toughness of plant cell walls, fungi and bacteria have evolved enzymes to convert abundant cellulosic plant matter into sugars to use as energy to sustain life. Unfortunately, the most powerful enzymes don’t work fast enough to break down cellulose at a pace, and price, that is competitive with fossil fuels…yet.

In an effort to improve nature’s top recyclers, computational scientists at NREL are trying to create “designer” enzymes capable of speeding up bio-fuel production and thereby lowering the cost of biomass-derived fuel to serve the global population.

“It’s a Goldilocks problem,” notes Beckham. “The enzymes have to be ‘just right.’ We’re trying to find out what just right is, why, and how to make mutations to the enzymes to make them most efficient.”

NREL’s computational researchers used TeraGrid supercomputers to simulate processes in the world of enzymes. Using Ranger at the Texas Advanced Computing Center (TACC) and the Red Mesa system at NREL, they simulated enzyme behavior from the cellulose-devouring bacterium Clostridium thermocellum and the prodigiously plant-eating fungus Trichoderma reesei.

After creating a computational model of the molecules and setting them into motion in a virtual environment, the researchers learned how the bacterium forms scaffolds for its enzymes, which work together to break apart the plant.

Contrary to expectation, the larger, slower-moving enzymes lingered near the scaffold longer, allowing them to bind to the frame more frequently, while the smaller ones moved faster and more freely through the solution, but bound less often.

The results of this study were reported in the Journal of Biological Chemistry in February 2011, and insights from the simulations are being used to create designer enzymes to make biomass conversion faster, more efficient, and less expensive.

Using Ranger, the scientists also studied underexplored parts of the enzyme that the Trichoderma reesei fungus uses to break down cellulose. They found that the cellulose surface has energy wells set one nanometer apart, a perfect fit for the binding module. In addition, they found that the linker region, previously believed to contain both stiff and flexible regions, behaves more like a highly flexible tether. These findings were reported in the Biophysical Journal in December 2010.

“We’re using rational design to understand how the enzyme works, and then to predict the best place to change something and test it,” says Michael Crowley, a principal scientist at NREL.

“If we can help industry understand and improve these processes for renewable fuel production, we’ll be able to offset a significant fraction of fossil fuel use in the long term,” according to Beckham.

Back row, left to right: Yannick Bomble, Michael Crowley, and Gregg Beckham. Front row, left to right: Antti-Pekka Hynninen, Mark Nimlos, Christy Payne, and Deanne Sammond. Not shown: Lintao Bu, James Matthews.

For more information: http://www.nrel.gov/biomass/staff_pages/gregg_beckham.html

Grant #: TG-MCB090159

Placing Landmarks on the Genome Map
Researchers show for the first time that differences in DNA between individuals can affect the binding of transcription factors

Human chromosome 21 with a small region outlined in red. The main rectangle below is a close-up of the outlined region, showing the binding locations of three transcription factors along the chromosome. Courtesy: Vishy Iyer, The University of Texas at Austin

Representation of allele-specific and non-allele-specific SNPs across the CTCF binding motif. The y axis indicates the difference between the two as a percentage of normalized total SNPs. Higher bars indicate an increased representation of allele-specific SNPs relative to other positions, which tends to occur at conserved positions. Courtesy: Vishy Iyer, The University of Texas at Austin

We typically think of heredity (eye color, body type, or susceptibility to a disease) as rooted in our genes. And it is. But as biologists sequence more genomes and analyze their results, they’re finding that the non-coding regions of the genome play a role in our traits as well.

One example involves the role that transcription factor proteins play in gene regulation, which scientists are just beginning to explore on a genome-wide scale. These proteins bind to the genome and act as control dials for gene regulation, turning genes on or off, or determining the amount of gene activity in a cell.

“If you’re comparing normal cells to cancer cells, you want to know what happened in the cancer cell that makes it different,” says Vishy Iyer, an associate professor in the Institute for Cellular and Molecular Biology at The University of Texas at Austin. “The gene expression patterns change, and we want to know which genes are up or down regulated, and how that came about.”

About 2,000 of the transcription factor proteins have been identified, and some have been linked to breast and other cancers, Rett syndrome, and autoimmune diseases. However, little is known about how they work.

Vishy Iyer, The University of Texas at Austin

Iyer, along with colleagues at Duke, The University of North Carolina-Chapel Hill, and Hinxton, UK, is trying to change that. Their study, based on simulations performed on the Ranger supercomputer at the Texas Advanced Computing Center (TACC) and published in Science in 2010, was one of the first to use supercomputers and next-generation gene sequencing to explore the expression of genes related to a specific regulatory transcription factor, called CTCF.

“We showed for the first time that some of the differences in DNA between individuals can affect the binding of transcription factors and more importantly, that those differences could be inherited,” according to Iyer.

The group used a relatively new sequencing technology, called ChIP-Seq, to pull out only the regions of DNA to which the proteins of interest were bound. These base pairs were then sequenced to determine the order of nucleotides and to count how many molecules of the promoter were bound to the protein.

Sounds simple enough, until you try to sequence millions of these regions and locate their exact position among the approximately 3 billion base pairs in the human genome.

“The genome is a vast area with many features,” explains Iyer. “You can think of the proteins as landmarks that we’re trying to place on the genome map.”

Using several thousand processors simultaneously, Ranger took the short sequence reads generated by ChIP-Seq and aligned them to the reference genome.

The single base resolution offered by next-gen sequencing enabled the researchers to look at individual known differences in the DNA and to use those dissimilarities to examine how genes on each chromosome bind transcription factors.

“We were able to tell the difference in binding from the gene that you inherited from your father and mother. That was the big advance,” notes Iyer. “We’re now applying this technology to cases where you know that the gene from one of your parents has a mutation that predisposes you to some disease.”

The findings bring science one step closer to personalized medicine based on a detailed reading of an individual’s genome, including the non-coding regions. Despite the tremendous complexity of the genome, Iyer is optimistic that his group’s research will have an impact on human health.

“There are lots of diseases and for a subset, they’ve got to be affecting gene expression by impacting transcription factors,” he adds. “If we pick the diseases and the factors smartly, I think we’ll find them.”

For more information: http://microarray.icmb.utexas.edu/research.html
NIH Project #: 5R01CA095548-07
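For readers curious what aligning short sequence reads to a reference genome involves, here is a toy sketch of the general idea. It is not the software Iyer's group ran on Ranger, which the article does not name. Every function name, the seed length `k`, and the mismatch limit are assumptions made for illustration. The sketch indexes every k-letter substring of a reference string, uses pieces of each read as seeds to find candidate positions, checks each candidate by counting mismatches, and then tallies how many reads cover each base, a stand-in for the counting step that reveals where a protein was bound.

```python
from collections import defaultdict


def build_index(reference, k):
    """Map every k-letter substring of the reference to its start positions."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def align_read(read, reference, index, k, max_mismatches=1):
    """Return sorted reference positions where the read fits with few mismatches.

    Hypothetical seed-and-verify scheme: non-overlapping k-mers of the read
    are looked up in the index, and each candidate placement is verified by
    counting mismatched letters across the whole read.
    """
    hits = set()
    for offset in range(0, len(read) - k + 1, k):
        seed = read[offset:offset + k]
        for pos in index.get(seed, ()):
            start = pos - offset
            if start < 0 or start + len(read) > len(reference):
                continue  # read would hang off the end of the reference
            window = reference[start:start + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= max_mismatches:
                hits.add(start)
    return sorted(hits)


def count_coverage(reads, reference, k=4):
    """Count, for each reference base, how many aligned reads cover it."""
    index = build_index(reference, k)
    coverage = [0] * len(reference)
    for read in reads:
        for start in align_read(read, reference, index, k):
            for i in range(start, start + len(read)):
                coverage[i] += 1
    return coverage
```

On a real genome of about 3 billion base pairs with millions of reads, the same lookup-and-verify work is repeated an enormous number of times and each read can be processed independently, which is one plausible reason the task suits thousands of processors working at once.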

Turbulent Times
TeraGrid aids scientists in developing novel technique to reduce jet noise

[Figure: Small, well-timed disturbances added to an uncontrolled Mach 1.3 turbulent jet (left) result in the quieter, controlled jet (right). Though only subtly different, the controlled jet is producing 30 percent less noise as visualized by the black-and-white contours of dilatation, a measure of air’s compression rate. The sound-generating turbulence, as indicated by the vorticity, is shown as color. Courtesy: Daniel Bodony, University of Illinois at Urbana-Champaign]

Airlines and aircraft manufacturers are under increasing pressure to keep noise levels low for airport personnel and residents in surrounding neighborhoods. About every 10 years, the International Civil Aviation Organization reduces the maximum noise an airplane can produce before it can be certified and sold to commercial airlines.

Aircraft are barely able to meet the current noise restriction levels. When even tighter restrictions are mandated in a few years, no one has a ready-made solution.

Daniel Bodony, an assistant professor of aerospace engineering at the University of Illinois at Urbana-Champaign, is working to address this issue. Along with Jon Freund of Illinois, and Jeonglae Kim, a graduate student, Bodony is part of a NASA-funded effort to decrease jet engine noise by controlling the unsteady movement of air, known as turbulence.

Instead of working in a wind tunnel or laboratory, the team used a TeraGrid supercomputer at the Texas Advanced Computing Center (TACC) to simulate the evolution of turbulence-generated sound waves from jet engine exhaust. The simulations help explain how sound is generated on the most basic level, and how it can be controlled using a new device.

“We’re studying the controlled jet and the uncontrolled jet to understand what changes between them,” says Bodony. “That’s what experiments can’t currently do and what is missing from our understanding of the science.”

Bodony, Freund, and Kim use a numerical technique called “large eddy simulation” to simulate the motion of air around the jet. The simulations show the amount of turbulence flowing around the jet and, importantly, the amount of sound that this turbulence creates.

“Unfortunately, the noise is not generated where you can control it directly, so you have to add a control someplace else, like on the nozzle, and tickle the flow in such a way that the sound is reduced at a later spot in the jet,” Bodony explains.

After conducting four years of research and relying on TeraGrid resources, Bodony and his collaborators developed a novel technique to determine the optimal controller required to reduce jet noise. The controller is a plasma actuator (something like a giant spark plug), based on those developed by colleagues at Ohio State, that alters the sound field by injecting heat.

“We can’t squash the turbulence,” Bodony notes. “Our controllers aren’t that strong and it may not even be possible or desirable. So, we add additional perturbations to reorganize the pre-existing disturbances such that the unsteady forces and stresses within the fluid are less.”

The simulations on TACC’s Ranger and Lonestar 4 determined the ideal timing and strength of the perturbations to reduce the engine’s radiated sound without significantly altering its thrust. The first round of improvements showed the potential to reduce jet noise by three decibels, or the equivalent of 30 percent, which equals the best that has been found experimentally by trial and error. Bodony is confident that with further refinements, his group will be able to reduce the noise level even further.

The design insights that Bodony uncovered are expected to reduce the sound levels on “N+3” generation aircraft, NASA’s shorthand for aircraft fielded three generations in the future. Bodony expects such a device, if successful, to come to market in 10 to 15 years.

If that sounds long, consider that the new Boeing 787, the first commercial airliner to be equipped with noise-controlling devices, called chevrons, contains elements designed 15 years ago.

“This work is computationally and intellectually demanding,” says Sanjiva Lele, a professor of mechanical engineering at Stanford University who is familiar with the research. “But if systematic methods to reduce noise can be found, the benefit to the aviation community would be tremendous.”

Results of the group’s theoretical and simulation work were published in the Journal of Sound and Vibration in February 2011.

[Photo: Daniel Bodony, University of Illinois at Urbana-Champaign]

For more information: http://www.ae.illinois.edu/people/faculty/bodony.html
Grant #: CTS090004

Extensibility

The advanced computing centers of XSEDE offer unparalleled power to researchers, but these systems are by no means the only computers that scientists use. In fact, the computing power distributed among the tens of thousands of campuses in the United States (in departmental clusters, IT centers, and university-wide data repositories) is far beyond the capacity provided by XSEDE.

To leverage these capabilities, XSEDE will allow users to connect other resources (local, regional, or international) to the XSEDE network easily and on an ad hoc basis. For a researcher like Bodony, this might mean linking local clusters in the Aerodynamics Engineering department at the University of Illinois to Ranger and Kraken in order to verify sample problems or compare scaling results.

This extensibility matches XSEDE’s new focus on addressing the computing needs of all researchers, not just those who use high-performance computers. By bridging the campus and national cyberinfrastructure, XSEDE will enable more researchers to take advantage of the national computing ecosystem to accomplish important work.

Cold Dark Matter Lives
An international team led by University of Washington astrophysicists appears to have solved the problem of dwarf galaxies

[Figure: A galaxy from Governato and colleagues’ simulation (left) appears identical in all respects to a real galaxy (right) and background image from the Sloan Digital Sky Survey Collaboration. Courtesy: Chris Brook, The Jeremiah Horrocks Institute at the University of Central Lancashire, and Patrik Jonsson, Center for Astrophysics, Harvard]

[Figure: These two frames from Governato and colleagues’ simulation of a dwarf galaxy at high resolution show light distribution face-on (left) and edge-on. Courtesy: Chris Brook, The Jeremiah Horrocks Institute at the University of Central Lancashire, and Patrik Jonsson, Center for Astrophysics, Harvard]

Although small in size, dwarf galaxies have posed a big problem for cosmological theory. Many of these galaxies, which have only about one percent of the number of stars in the Milky Way, orbit our galaxy, and astronomers who study them know they are “bulgeless,” with a disc-like star distribution that on edge looks like a Frisbee.

Yet astrophysicists who run computational simulations have been seeing bulges.

These simulations test the reigning “cold dark matter” (CDM) model of how galaxies form, and although a central bulge of stars is common in larger galaxies such as the Milky Way, when CDM simulations of dwarf galaxies show them, as they have for about 15 years, researchers have suggested there’s something wrong with the model. This discrepancy has been a serious stumbling block for CDM, since in nearly every other way CDM-based simulations have shown excellent agreement with observations.


Similarly, observations of dwarf galaxies in space show a shallow, unconcentrated distribution of dark matter, the invisible matter that comprises the largest part of the universe’s mass, while CDM simulations have shown dwarf galaxies with dark matter centrally concentrated. “Basically we have a model that’s really good at explaining a lot of what’s going on in the universe,” says University of Washington astrophysicist Fabio Governato, “but there’s been these two sore points: bulges and dense dark-matter halos in the dwarf galaxies.”

“This failure is potentially catastrophic for the CDM model,” Governato and his University of Washington colleague Tom Quinn and an international team of collaborators wrote in a paper published in Nature in January 2010, reporting simulations that convincingly resolve the dwarf galaxy problem. With improved accuracy made possible in part by access to more than a million hours of TeraGrid computing, mainly at the National Institute for Computational Sciences (NICS) and the Texas Advanced Computing Center (TACC), their simulations show dwarf galaxies without bulges, meaning the distribution of stars and dark matter agrees well with observed dwarf galaxies.

Making the difference, along with more powerful computing, were improvements to GASOLINE, an astrophysics simulation software developed over a 15-year period by Quinn, James Wadsley of McMaster University, and Joachim Stadel of the University of Zurich. By making changes in how GASOLINE represented the physics involved and with higher resolution than had before been possible, the researchers more realistically captured the processes of star formation and evolution, including the violent star death and spectacular gas outflow phenomena of supernovae. “It was a massive computational project,” says Governato. “This kind of research wasn’t possible just three years ago. We took advantage of the fact that computers are getting faster and faster.”

Quinn, Governato, and their colleagues this year extended their 2010 findings, including an April 2011 report in Astrophysical Journal of “artificial observations.” For this data-intensive comparison between their simulated dwarf galaxies and observed dwarf galaxies (from THINGS, The HI Nearby Galaxy Survey), notes Quinn, “shared memory is really helpful,” and the Pittsburgh Supercomputing Center’s (PSC) Blacklight, one of the newest TeraGrid resources and the largest shared-memory system in the world, contributed significantly. Analyses show good agreement between the simulated and observed galaxies, and lend validation to the simulations.

Overall, the findings from this ongoing work are a game changer for the CDM model and have gathered commentary in several science journals. “Realistic dwarf galaxies are thus shown to be a natural outcome of galaxy formation in the CDM scenario,” wrote the researchers in Nature. Or as Governato puts it, “CDM lives to fight another day.”

[Photos: (Left) Fabio Governato, University of Washington. (Right) Tom Quinn, University of Washington]

For more information: http://www.astro.washington.edu/users/fabio/
Grant #s: NSF-AST-0607819; TG-MCA94P018

No Charge Double Helix
Researchers derive the first accurate 3D structure of a synthetic double-helical molecule that holds promise for applications in biomedicine and nanotechnology

[Figure: Superposition of 10 time-averaged PNA structures obtained by restrained MD. Courtesy: Carnegie Mellon University]

[Figure: Side (top) and axial (bottom) view of PNA simulated by Achim, Madrid and colleagues. Courtesy: Carnegie Mellon University]

Over the course of about 4 billion years, nature has evolved remarkably efficient ways to transfer electrons from atom to atom within living organisms to produce energy from food. Our energy for getting up in the morning, for work or for pleasure, all comes from these processes of “controlled burning” that depend on electron transfer.

“Just like we transfer power through electricity lines to heat and light our homes, we transfer electrons in our bodies to metabolize food,” says Catalina Achim, an associate professor of chemistry at Carnegie Mellon University. “But nature does it very efficiently, while we don’t yet know how to take oil and make our cars go without wasting a lot of energy. We know some of the basics of how electron transfer works, and many scientists study these processes so we can learn to apply them.”

To that end, Achim worked with a team, including TeraGrid scientist Marcela Madrid from the Pittsburgh Supercomputing Center (PSC), using TeraGrid resources to solve the structure of a fascinating “bio-mimetic” molecule called peptide nucleic acid, or PNA. Although PNA doesn’t exist in nature, it’s a close cousin to DNA but with a special advantage: It doesn’t have a charge.

DNA’s helical strands have negative charges in the backbone, and when the four A-G-C-T bases pair up via hydrogen bonds to form the DNA double-stranded helix, there’s built-in electrostatic repulsion between the strands. The DNA structure is overall always negatively charged, according to Achim.

To circumvent this, Achim and her colleagues substituted peptide-like groups (small proteins) for the phosphate groups of the DNA backbone. The resulting neutral double helix helped further the study of electron transfer and offers useful applications, such as a molecular “scaffold” for metal ions to deliver electrons to cells or biological molecules.

The necessary first step for these applications, though, is having an accurate 3D structure of PNA in solution. A static structure from an x-ray crystallographic study was available, but in biological applications PNA isn’t static; it’s flexible and mobile. To attain its 3D structure in this state required a combination of NMR spectroscopy and molecular dynamics (MD) using supercomputing resources.

A “Collaborative Research in Chemistry” grant from the National Science Foundation (NSF) supported this project, and teamwork among Achim, Madrid, and Achim’s partners at Carnegie Mellon and Duke University produced results. First, Achim and her Carnegie Mellon colleague Danith Ly and their students synthesized PNAs with different chemical structure and flexibility in solution. A graduate student in Achim’s lab worked with Carnegie Mellon chemist Roberto Gil on the 2D NMR spectroscopy of the synthesized PNAs, which provided a matrix of distances between protons in the molecules.

From this data, and in order to derive an accurate 3D structure, PSC’s Madrid turned to MD, which simulates the movement of a biomolecule by tracking the forces between the atoms over time. In this case, Madrid, relying on PSC’s SGI Altix system called Pople, used “restrained” MD, a technique that made it possible, along with a package of molecular simulation programs called AMBER, to determine a family of PNA structures that fit with the NMR data.

The results, reported in 2010 in Molecular Biosystems, a journal of the Royal Society of Chemistry, provide for the first time the 3D structure of this PNA molecule, allowing Achim and her collaborators to embark on new, more detailed studies of electron transfer. Potential applications include attaching metal ions to the PNA scaffold to create metal-PNA complexes that can catalyze reactions.

Similarly, Achim foresees PNA being used to create “nanowires” (100,000 times finer than a human hair) for quantum circuitry, in which quantum characteristics can lead to electronics that are much faster than today’s integrated circuitry.

Achim expects that the new shared-memory Blacklight supercomputer at PSC will help to further advance her work. “I look forward to continuing the collaboration,” she says, “and to using this new resource.”

“Our collaboration with PSC was very beneficial,” says Achim, “not only for the research itself but also for educational purposes. Working with Marcela Madrid, my graduate students learned how to do molecular dynamics simulations.”

[Photo: Catalina Achim, Carnegie Mellon University, with her laboratory group. Courtesy: Carnegie Mellon University]

For more information: http://www.chem.cmu.edu/groups/achim/
Grant #s: NSF-CHE-0347140; TG-MCB0700070N

A Recipe for Science Success
Collaboration between Open Science Grid and XSEDE aims to give researchers the right tools

[Figure: An example of protein structure prediction; the experimental structure is shown in green and the prediction generated by Adhikari and his colleagues is shown in blue. Courtesy: Aashish Adhikari, University of Chicago]

Imagine cooking a gourmet meal from scratch using only a knife. With just that one tool, some steps, like mincing onions and slicing carrots, would be quick and easy because your tool was designed for those tasks. Other steps would be slow, producing sub-standard results, and some might not be possible at all: imagine trying to whip egg whites or taste the soup.

Computational research is no different. Some “recipes,” known as workflows in computing, involve few steps that require only one tool. Others involve multiple steps, each requiring different tools.

For example, Open Science Grid (OSG), jointly funded by the Department of Energy and the National Science Foundation, is optimized to perform high-throughput computing (HTC), while systems available through TeraGrid were designed for high-performance computing (HPC).

“In the past, researchers used either TeraGrid or Open Science Grid to run their large workflows,” says Paul Avery, co-chair of OSG Council and professor of physics at the University of Florida. “Now ExTENCI, a partnership between the two cyberinfrastructures, provides tools that help them to take advantage of both.”

ExTENCI, which stands for Extending Science Through Enhanced National Cyberinfrastructure, was launched in 2010 under the leadership of Avery and co-principal investigators Ralph Roskies, co-scientific director of the Pittsburgh Supercomputing Center, and Daniel S. Katz, senior fellow at the University of Chicago/Argonne National Laboratory. The project brings together 11 U.S. universities and national laboratories, including the University of Chicago, Clemson University, Louisiana State University, Purdue University, University of Wisconsin-Madison, Fermi National Accelerator Laboratory, Brookhaven National Laboratory, Florida State University, and Florida International University, to develop technology to enable researchers to more easily access resources through both OSG and TeraGrid/XSEDE.

“ExTENCI explored how to exploit the mutual capabilities of both TeraGrid and Open Science Grid,” says Roskies.

“Many TeraGrid users have a natural need for the high-throughput resources that the OSG provides. Similarly OSG users sometimes need access to high-performance computing resources such as those of TeraGrid,” says Michael Wilde, a fellow at the University of Chicago Computation Institute and software architect at Argonne National Laboratory.

“The ExTENCI project is working to make the use of both cyberinfrastructures more seamless, and easier for individual scientists and smaller collaborations to leverage concurrently.”

“We’ve begun to do this in a few concrete cases, with the aim of leveraging the investments of both NSF and DOE in cyberinfrastructure resources and thereby to improve the productivity of U.S. computational scientists,” says Roskies.

One of those concrete cases is the protein structure prediction project that operates the “Midway Folding Server,” a collaboration between the laboratories of Karl Freed and Tobin Sosnick of the University of Chicago and Jinbo Xu of the Toyota Technological Institute at Chicago.

The most widely used form of structure prediction uses the structure of known proteins as templates from which to compute the structure of similar unknown proteins. But that only works if there is a similar protein with a known structure. Nor does it give insight into how proteins fold in nature. Predicting a protein’s structure based solely on its amino acid sequence is more difficult and more computationally intensive.

“We’re trying to predict protein structures by mimicking how we think proteins fold in nature,” says Aashish Adhikari, a researcher at the Institute for Biophysical Dynamics at the University of Chicago. “Experiments suggest that proteins fold in a stepwise fashion, where subunits of structure we call ‘foldons’ form cooperatively and add on to the existing structure in a process called sequential stabilization. Our algorithm follows a similar principle.”

This works well for protein sequences less than 100 amino acids long, according to Adhikari. But “if you increase the number of amino acids, for every amino acid that you add the computation time increases exponentially. Our goal is to try to use our algorithm to fold increasingly bigger proteins.”

Because this process can make excellent use of both HTC and HPC resources, Wilde identified the group as a good match for ExTENCI. Today, the protein structure prediction project is regularly using resources from both OSG and XSEDE, allowing them to fold larger proteins than ever before.

For more information: http://sites.google.com/site/extenci/

Education, Outreach, and Training

Over the past decade, TeraGrid’s Education, Outreach, and Training activities reached tens of thousands of researchers, graduate, undergraduate and K-12 students, educators, and citizens, helping them harness powerful tools, understand the value of high-performance computing, and pursue education and careers in math, science, engineering, and technology.

The next few pages spotlight some of these activities and give a preview of what is to come in XSEDE, which will continue (and expand) some of the TeraGrid’s most successful efforts, such as the Campus Champions Program.

Building Skills that Count
A sampling of the education and outreach programs offered by TeraGrid partners

San Diego Supercomputer Center (SDSC)

SDSC has a clear vision for strengthening the 21st century workforce, one that their outreach program has been promoting for almost 20 years. In that time, SDSC has seen tremendous growth in, and national recognition for, its teacher outreach program, TeacherTECH, which brings science, technology, engineering, and math (STEM) resources to teachers around the world.

Based on the success of TeacherTECH, SDSC decided to expand the concept to middle- and high-school students with its StudentTECH program.

“The program was started in 2006 with one workshop, ‘Introduction to Maya and 3D Modeling,’ and we had 14 students,” explains Ange Mason, education program manager for SDSC. “This summer we will have attracted more than 300 students, and through our year-round programs we attract almost 1,000 students annually.”

“In the future, computers will be everything,” says Angela Li, a participating eighth-grader at Bishop’s School in La Jolla, California. “In order to be successful in the future, we need to be able to use everything that the future provides to the greatest extent. StudentTECH teaches us how to use that and more.”

National Center for Supercomputing Applications (NCSA)

Improving high school students’ understanding of chemistry through computation and visualization is the central goal of the Institute for Chemistry Literacy through Computational Science (ICLCS) at NCSA. For nearly five years, ICLCS has provided an immersive environment for teachers to increase their chemistry knowledge. The program’s goal is to teach teachers how to use computational science tools in the classroom, and to develop their pedagogical and curriculum development skills.

The program includes intensive training over the course of three years, including two-week summer institutes, and yearlong content courses delivered via an online professional learning environment.

Using a pool of about 120 teachers from more than 100 rural Illinois school districts, the ICLCS has found a statistically significant improvement in chemistry content knowledge by the teachers and their students over time, as measured by the standardized test of the American Chemical Society (ACS).

“Chemistry is a challenging subject to teach because it involves species (atoms and molecules) that fall outside the range of human perception,” according to NCSA director and ICLCS principal investigator Thom Dunning. “By emphasizing the use of computational tools that allow students to visualize and interact with these species, ICLCS enables teachers to convert exercises in abstract thinking into exercises involving concrete objects that can be manipulated and understood.”

Pittsburgh Supercomputing Center (PSC)

TeraGrid’s education, outreach, and training efforts have not only been geared toward workforce development. They also raise awareness of real-world concerns about 21st century digital literacy and safety. For instance, do Internet passwords protect personal information from unwanted intrusion? How can one be sure if someone online is who they say they are? Does anti-virus software really protect one’s hard-drive?

To help parents, educators, students, and individuals with these questions and many others associated with using the Internet, PSC in 2010 introduced SAFE-Net, a program funded by a National Science Foundation grant for Cyber Safety Awareness. Through SAFE-Net, PSC presents workshops that train educators and provide materials for classroom learning. These materials address cyber threats, measures of protection, and questions of cyber ethics that arise as a result of social networking and other uses of the Internet.

“Many Internet users lack an understanding of common threats they may face online,” notes Cheryl Begandy, PSC’s director of outreach and education. “Among parents, many lack confidence that their child is safe when using the Internet.”

In 2010, PSC held two ‘Train-the-Teacher’ workshops introducing SAFE-Net to 18 Pittsburgh-area teachers. The SAFE-Net website also provides free information, including classroom and parent materials about cyber-security issues, with lessons geared to grade levels 1-3, 4-6, and 7-12.

“SAFE-Net provides a wonderful repository of resources for educators as they attempt to address the issues of Internet safety,” says Norton Gusky, coordinator of educational technology for Fox Chapel Area School District, north of Pittsburgh. “For my presentations, I rely on the concise definitions and examples provided by SAFE-Net. All educators, K-12, respond positively to these materials and activities.”

The University of Texas at Austin/Texas Advanced Computing Center (TACC)

Sometimes, encouraging engagement is simply a question of letting students and professors use TeraGrid resources to enhance existing computational education. For the past three years, the TeraGrid has offered an education allocation to allow teachers to teach and students to learn using some of the most powerful computing systems on the planet.

The Freshman Research Initiative (FRI) at The University of Texas at Austin is considered a national leader in engaging undergraduates in scientific research. Of the 20 research tracks in FRI, three focus on the concepts of scientific computing through the lens of chemistry, biology, and physics.

These research experiences allow students to use the high-performance computing (HPC) resources at TACC, including Ranger and Lonestar 4, two of the top 30 systems in the world, for their coursework. During the past three years, students at The University of Texas at Austin have used more than 1.3 million CPU hours on these clusters to do original research in nanotechnology, materials science, and new energy solutions.

“The FRI program helps young students experience what it is to do research,” says Graeme Henkelman, professor of chemistry at The University of Texas at Austin. “And if they like it, it gives them the resources they need to excel.”

National Institute for Computational Sciences (NICS)

In East Tennessee and surrounding area, NICS is working to add computational thinking to the toolbox of area educators. Teachers from the fourth grade through the undergraduate level have attended NICS-sponsored workshops. Besides NICS personnel, experts from the Shodor Foundation and area Master Teachers were brought in to share techniques, experiences and tools to use in the classroom.

“The educators who attend these computational thinking workshops are under stress to cover all items on government-mandated tests, and so need tools that both provoke interest and curiosity, as well as mesh well with things the teachers are already doing,” says Jim Ferguson, NICS director of Education, Outreach & Training.

Indiana University

The staff at the Advanced Visualization Lab at Indiana University also had the needs of educators in mind as they worked with partners across the United States to produce several videos about the benefits of computational science and computer simulation. The videos employ stereoscopic 3D storytelling and feature both computer-generated and live-action imagery. Targeted toward students in grades 5-12 as well as the general public, the short videos can be used in classrooms, museums, or even viewed at home on 3D TV.

The first two videos have been shown to thousands of viewers at numerous locations and events throughout the United States. A third video is scheduled to be released in late 2011.

By continuing these successful education, outreach, and training initiatives and adding new ones, XSEDE will inspire the scientists of tomorrow and help keep the pipeline of students and teachers flowing for the professions of tomorrow.

For more information: https://www.xsede.org/education-and-outreach

Champions Help Campuses Connect
Swarthmore exemplifies how dedicated champions broaden TeraGrid/XSEDE reach

For the past several years of the TeraGrid program and now as XSEDE gets under way, more than 100 Campus Champions from 43 states have helped researchers, educators, and students take advantage of the national cyberinfrastructure.

The volunteer Champions fill diverse roles on their home campuses: faculty, information technology administrators, project managers, instructional designers, and high-performance computing specialists. Thirty-three of the Champions are from EPSCoR (NSF’s Experimental Program to Stimulate Competitive Research) jurisdictions and seven from minority-serving institutions.

Champions are sources of information about high-performance computing generally and XSEDE resources and services specifically, not just at their own campuses, but also regionally and nationally. Champions can also help researchers and educators on their campuses get start-up allocations so they can quickly begin using XSEDE resources.

For example, Swarthmore College Champion Andrew Ruether created accounts for all students in professor Tia Newhall’s course on Distributed and Parallel Computing, enabling them to use TeraGrid supercomputers for class projects. Two of the students continued their project after the course concluded, developing and testing a novel parallelization technique for solving the K-Nearest Neighbor problem. The algorithm can be applied to tasks such as discovering medical images that contain tumors from a large medical database, recognizing fingerprints from a national fingerprint database, and finding certain types of astronomical objects in a large astronomical database. Having access to TeraGrid resources allowed the students to run large-scale experiments necessary to demonstrate that their solution worked well for real-world problems. Their work was presented at the 2010 TeraGrid conference.

Ruether has also helped other Swarthmore researchers and students successfully use TeraGrid resources.

“I would not have been able to do this work without him,” says chemistry professor Paul Rablen, who used TeraGrid’s Cobalt system at the National Center for Supercomputing Applications to investigate rearrangements in highly reactive carbon molecules that are widely used as catalysts; this work was published by the Journal of Organic Chemistry in February 2011.

Michael Brown, physics, is working on ways to design fusion reactors and uses the TeraGrid to model the behavior of the plasma in a magnetic containment field. One of Brown’s students, Swarthmore senior Dan Dandurand, used TeraGrid to calculate the complex orbits of more than a billion energetic protons, calculations that helped shed light on magnetic confinement fusion. In another set of calculations, Dandurand determined the fraction of energetic protons collected by a simulated probe in the plasma. These calculations helped to calibrate and understand a probe used in experiments. Brown and Dandurand’s research was published in the Review of Scientific Instruments in 2011.

“Andrew has been a big help in the initial setup and then answering follow-up questions from the students,” Brown says, adding that he and his students plan to continue their research with the help of XSEDE.

In addition to the research successes that Champions facilitate, they also are a valuable source of feedback, helping the XSEDE leadership understand what resources are needed and what challenges need to be overcome at the campus level. “XSEDE’s vision is to enhance the productivity of scientists and engineers by providing them with new and innovative capabilities,” says XSEDE leader John Towns. “Campus Champions provide important local and regional feedback and are therefore an essential part of the development process.”

XSEDE will provide increased professional development opportunities to the Campus Champion community through a new Fellows Program. Fellows will work with XSEDE Extended Collaborative Support Services staff on real-world science and engineering projects. The expertise Fellows gain through these collaborative projects can then be shared with their peers, students, and others.

For more about the Campus Champions program, see https://www.xsede.org/campus-champions or contact Campus Champions coordinator Kay Hunt (kay@purdue.edu).

Campus Bridging

The XSEDE Campus Bridging effort will work with campuses to assist them in adopting and making effective use of the XSEDE system architecture, which will make many tasks much easier for researchers. XSEDE personnel will work with campus personnel to provide awareness, advice, training, and assistance with the installation of appropriate XSEDE architecture components to support their local research community.

As new XSEDE capabilities and tools are made available, the Campus Bridging effort will work with pilot sites to deploy, test, and refine the architecture to best benefit the community. Large-scale deployment will follow once the tools are proven to be effective and reliable.

The Campus Bridging effort will seek the cooperation of campuses to identify local Campus Champions who are committed to working with campus researchers to raise awareness of and use of the new tools and capabilities.

Compelling, Ferocious Beauty
Cosmic simulations and visualization skill contribute to acclaimed film ‘The Tree of Life’

This early parameter study shows an exploration of Volker Bromm’s supernova data with added density and detail layers. After further visual development, final high-resolution layers were provided to the film’s digital effects team for further processing. Courtesy: Advanced Visualization Laboratory, National Center for Supercomputing Applications.

Filmmaker Terrence Malick is often praised for the beauty of his films and their resonance with nature. Roger Ebert talks about his “painterly images.” For Janet Maslin, it’s his “visual genius” and his “intoxication with natural beauty.” In Malick’s new movie, “The Tree of Life,” some of that natural beauty came from an unlikely source: TeraGrid’s Ranger and Abe supercomputers and the visualization expertise of the National Center for Supercomputing Applications (NCSA).

NCSA’s Advanced Visualization Laboratory (AVL) has enlivened documentary television and IMAX movies for years. But “The Tree of Life” marks the center’s first work in a feature film. The movie won the Palme d’Or at the Cannes International Film Festival in May 2011.

“Cosmic events are powerful visual metaphors for the human condition, and we wanted to combine accurate science with artistic sensitivity,” says Donna Cox, who leads NCSA’s Advanced Visualization Laboratory.

For “The Tree of Life,” the AVL team collaborated with the filmmakers to create two animated visualizations that are based on scientific data, some of which was created by the University of Texas’ Volker Bromm using Ranger at the Texas Advanced Computing Center.

The work brings “heart and soul to the scientific visualizations. We collaborated closely over many months to design the shots in question, deeply respecting the underlying science while shaping it into emotional imagery,” says Dan Glass, visual effects supervisor on “The Tree of Life.”

One visualization shows an awe-inspiring flight through a highly detailed galaxy model created at NCSA. The other highlights Bromm’s work, showing how the very first stars appeared, illuminating the previously dark universe.

“The emergence of the first stars rapidly transformed the universe from a featureless, cold, and barren place to one teeming with complexity,” explains Bromm. “At the end of their brief life, the stars exploded as hyper-energetic supernovae. These primordial supernova explosions seeded the universe with the first heavy chemical elements, such as carbon, oxygen, silicon, and iron, thus setting the path toward planets, and, ultimately, beings like us.”

Photo: Volker Bromm, The University of Texas at Austin

Bromm simulated these events on the Ranger supercomputer over the course of 42 days. The calculation would have taken 114 years on a laptop. The results of the simulation were not only featured in a Palme d’Or winner, they were also published in Astrophysical Journal in 2010.

NCSA took that scientific data and turned it into visualizations for “The Tree of Life.” Stuart Levy processed the simulation results and extracted features that they chose to make visually prominent. Alex Betts developed custom capabilities for using scientific data to visualize cosmic gas and dust with great realism. Bob Patterson orchestrated the camera move and the design of the visualization.

They worked on an in-house cluster with about 200 processors. When they needed more power, they used NCSA’s Abe supercomputer.

“Scientific visualization helps to infuse ‘The Tree of Life’ with an authenticity that goes beyond any other movie,” says NCSA’s Cox. “We’ve employed the most advanced supercomputing, networking, and visualization technologies to bring some of the compelling, ferocious beauty of the universe to the big screen.”

An early version of the NCSA University of Illinois Milky Way galaxy model. NCSA worked directly with the film’s digital effects team to customize the flight and visual settings for the shot in “The Tree of Life.” Courtesy: Advanced Visualization Laboratory, National Center for Supercomputing Applications.

For more information: avl.ncsa.illinois.edu
Grant #: NSF AST-0708795, NSF AST-1009928, NASA NNX 08-AL43G, NASA NNX 09-AJ33G

Taking Training on the Road
Collaboration with Southeastern Universities Research Association provides visualization workshops to minority-serving institutions

Through a series of workshops at SURA institutions, more than 200 faculty researchers and students learned the basics of scientific visualization. Courtesy: Texas Advanced Computing Center

Not every college and university in the United States has its own computational cluster. This, in part, was the impetus for TeraGrid: to offer advanced computing resources to researchers all across the country, no matter how large or small their institution.

The annual TeraGrid conference, the Campus Champions program, and the education, outreach, and training programs developed by TeraGrid resource providers are just a few ways that TeraGrid brought scientists and students into the fold.

However, sometimes these programs are not enough to get researchers involved. Sometimes, one has to bring resources, both human and technological, to researchers on their home turf.

This was the idea behind a unique collaboration between TeraGrid and the Southeastern Universities Research Association (SURA), a consortium of more than 60 universities working to advance and exploit the transformative nature of information technology on the regional, national, and international fronts.

The Southeast region is home to nearly all of the nation’s historically black colleges and universities (HBCUs) and a large percentage of its minority-serving institutions (MSIs). The individuals who work and study at these institutions are sometimes underserved by the national science community, so the Texas Advanced Computing Center (TACC) and TeraGrid reached out to these schools to share knowledge and recruit new users.

Over the course of the last year, as part of an Extreme Digital (XD) Visualization grant from the National Science Foundation (NSF), TACC and SURA presented training sessions at SURA member institutions, including Norfolk State University (an HBCU), the University of Central Florida, University of Miami, and SURA offices in Washington, D.C. In addition to attendees from host institutions, the faculty-student research teams from Florida A&M University, Howard University, and Morgan State University attended the workshops as well.

“The bulk of the people who come to the workshops have had zero or very nominal exposure to TeraGrid,” notes Linda Akli, program manager of IT Initiatives for SURA, and the organizer of the workshops.

Through these workshops, and others at The University of Texas, TACC staff members taught more than 200 faculty researchers and students the basics of scientific visualization, a process of transforming data into images and animations that can be interpreted to derive insights.

“Visualization is a very important tool for science and engineering, but a lot of scientists and engineers have no visualization background, or are even aware of what’s capable with visualization,” says Dan Majchrzak, director of research computing at the University of South Florida. “The trainings informed the community at large about what’s available in TeraGrid and how it’s used.”

Participants learned how to use some of the most common scientific visualization software (including ParaView and VisIt), how to get an allocation on TeraGrid, and how to access the Longhorn Visualization Portal, a website that lets scientists visualize massive datasets from the comfort of their offices.

“We’re teaching them to use the tools and apply them to their research,” adds Greg P. Johnson, a visualization expert at TACC and one of the workshop trainers. “Attendees were able to get visual results and interact with their data to form new insights.”

Importantly, the workshops dovetail with the emerging ability to perform visualization remotely, which is an increasingly important capability supported by the NSF.

“For classes, for teaching, and for giving talks, you want to go to a place where you have the 3D, tiled display, and high-end visualization hardware,” Majchrzak says. “But researchers also want to be able to dump in their data, get the visualization out, and analyze it at their desk.”

In-person training paired with one-on-one consulting and remote access to powerful resources made it easy for new users to begin to take advantage of the computing tools available through TeraGrid. The leadership behind XSEDE expects this access to be even easier in the new program.

“If we’re not bringing in more folks from diverse and underrepresented communities, we’re not going to have much of a scientific and technological workforce,” says SURA’s Akli. “There’s an untapped pool of talent and it’s critical to get them engaged if we’re going to continue to have leadership in innovation and technology.”

Being ‘Smart’ at Home
TeraGrid storage and visualization resources aid ‘smart grid’ research

Power generation accounts for 40 percent of the U.S. carbon footprint. As global energy prices continue to rise, energy efficiency is an increasingly important priority. Now, more than ever, there is significant momentum from both the general public and government to develop “smart grids,” an ever-widening array of utility applications that enhance and automate the monitoring and control of electrical distribution for greater efficiency.

Two images of the Mueller project whole-house energy usage data, aggregated by census block. The shade and height of the blocks correspond to the energy usage of the houses in each census block. The two images show the dramatic increase in energy use between the morning and afternoon, primarily from A/C use during the heat of the day. These tools allow the researchers at Pecan Street Project (PSP) to compare aggregate use across the homes in the project, which will become even more useful in Phase 2, when the project incorporates 1,000 homes across Austin. Visualizations such as these will allow PSP to easily correlate energy use with locations across the city. Courtesy: Adam Kubach and Paul A. Navrátil, Texas Advanced Computing Center, The University of Texas at Austin

According to Michael Webber, associate director of the Center for International Energy and Environmental Policy at The University of Texas at Austin, utilities and energy companies are expected to spend $1 trillion to $2 trillion over the next few decades to build, update, and upgrade their grids nationwide. At the same time, energy consumers are expected to spend tens of billions of dollars on energy-related appliances in the home.

“Before smart grid advocates and companies ask customers to invest in new products and services, we all need a better understanding of what they want, what they’ll use, and what they’ll get excited about,” says Brewster McCracken, executive director of Pecan Street Project, an organization focused on developing and testing new technologies and business models for advanced energy management systems.

Hence, the creation of the Mueller Smart Grid Demonstration Project, a comprehensive energy consumer research study in Austin, Texas.

The Mueller smart grid project is generating complex and large datasets that require powerful supercomputers to capture, integrate, and verify the information, and to make sure that it is properly synchronized and analyzed.

Enter the Texas Advanced Computing Center (TACC), a TeraGrid resource provider in Austin.

“TACC has some of the world’s fastest computers, so we’re confident they can do any kind of crunching, rendering, or data manipulation,” notes Bert Haskell, technology director for the Mueller smart grid project. “They have the technical expertise to look at different database structures and know how to organize the data so it’s more efficiently managed. We’re very excited to work with TACC to come up with new paradigms on how to intuitively portray what’s going on with the grid and energy systems.”

With sensor installations in place at 100 homes, new data is generated every 15 seconds showing precisely how much energy individual circuits are using. In response, TACC developed a special data transfer format to pull all of the data into a database on the Corral storage system. To date, the database contains approximately 600 million individual power readings and continues to grow.

“We’re trying to create very rich resources for people to use in analyzing patterns of energy usage,” says Chris Jordan, a member of TACC’s Advanced Computing Systems group. “Over time, as the resources grow and become more varied, we expect whole new forms of research to be conducted. We’re really interested to see what people can do with it, such as how the data stream can transfer itself into a decision-making device for city planners and individual consumers.”

Members of the Mueller Smart Grid project (left to right), and Paul Navratil (far right) from TACC’s Data and Information Analysis team.

One of the weaknesses in smart grid systems is the way they visualize data, which is often not intuitive. Since TACC is a leader in providing visualization resources and services to the national science community, they were a perfect partner to remedy this problem.

Paul Navratil, research associate and manager of TACC’s Visualization Software group, says: “We’ve used our visualization expertise to translate the immense volume of data into images that convey insights to the researchers and their industry partners.”

Navratil notes that it is a massive data mining problem, but this is something the experts at TACC and all supercomputer centers work with on a daily basis. Overall, the Mueller smart grid demonstration project is trying to understand how energy management systems can be integrated into our lifestyle. Observes Haskell: “That’s what we want to figure out: how that future automated home environment will interface to the smart grid to provide the peak energy demand characteristics that the utility needs to run their network without creating a burden on the customer.”
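
The sensor figures quoted above (100 homes, a new reading every 15 seconds, roughly 600 million readings to date) can be turned into a rough data-rate estimate. The short Python sketch below is only a back-of-the-envelope illustration: the article does not say how many circuits are monitored in each home, so `circuits_per_home` is a hypothetical value chosen for the example, and the function names are illustrative rather than part of any TACC or Pecan Street tool.

```python
# Rough estimate of smart-grid sensor data volume.
# Figures from the article: 100 homes, one reading per circuit every 15 seconds.
# ASSUMPTION: the number of monitored circuits per home is not given in the
# article; the value used below is hypothetical.

SECONDS_PER_DAY = 24 * 60 * 60


def readings_per_day(homes: int, circuits_per_home: int, interval_s: int) -> int:
    """Individual power readings generated per day across all homes."""
    readings_per_circuit = SECONDS_PER_DAY // interval_s
    return homes * circuits_per_home * readings_per_circuit


def days_to_reach(total_readings: int, homes: int,
                  circuits_per_home: int, interval_s: int) -> float:
    """Days of continuous collection needed to accumulate total_readings."""
    return total_readings / readings_per_day(homes, circuits_per_home, interval_s)


if __name__ == "__main__":
    homes = 100              # from the article
    interval_s = 15          # from the article
    circuits_per_home = 10   # hypothetical, for illustration only

    per_day = readings_per_day(homes, circuits_per_home, interval_s)
    days = days_to_reach(600_000_000, homes, circuits_per_home, interval_s)
    print(f"Readings per day: {per_day:,}")
    print(f"Days to reach 600 million readings: {days:.1f}")
```

Under that assumed circuit count, the database would grow by several million readings a day, which gives a sense of why a dedicated transfer format and large-scale storage were needed. A different circuit count scales the estimate proportionally.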

For more information: www.pecanstreetproject.org

XSEDE Leadership

John Towns, project director
NATIONAL CENTER FOR SUPERCOMPUTING APPLICATIONS
University of Illinois at Urbana-Champaign

Tim Cockerill, associate project director
NATIONAL CENTER FOR SUPERCOMPUTING APPLICATIONS
University of Illinois at Urbana-Champaign

John Boisseau, director of user services
TEXAS ADVANCED COMPUTING CENTER
The University of Texas at Austin

Chris Hempel, deputy director, user services
TEXAS ADVANCED COMPUTING CENTER
The University of Texas at Austin

Nancy Wilkins-Diehr, director of Extended Collaborative Support Services-communities
SAN DIEGO SUPERCOMPUTER CENTER
University of California, San Diego

Ralph Roskies, director of Extended Collaborative Support Services-projects
PITTSBURGH SUPERCOMPUTING CENTER
Carnegie Mellon University/University of Pittsburgh

Sergiu Sanielevici, deputy director, Extended Collaborative Support Services-projects
PITTSBURGH SUPERCOMPUTING CENTER
Carnegie Mellon University/University of Pittsburgh

Victor Hazlewood, interim director of operations
NATIONAL INSTITUTE FOR COMPUTATIONAL SCIENCES
University of Tennessee Knoxville/Oak Ridge National Laboratory

Kathlyn Boudwin, manager, project management and reporting
Oak Ridge National Laboratory

Ian Foster, architect, architecture and design
ARGONNE NATIONAL LABORATORY

Andrew Grimshaw, architect, architecture and design
UNIVERSITY OF VIRGINIA

Kurt Wallnau, manager, software development and integration
SOFTWARE ENGINEERING INSTITUTE
Carnegie Mellon University

JP Navarro, deputy manager, software development and integration
ARGONNE NATIONAL LABORATORY

Janet Brown, manager, systems and software engineering
PITTSBURGH SUPERCOMPUTING CENTER
Carnegie Mellon University/University of Pittsburgh

Scott Lathrop, director, education and outreach
NATIONAL CENTER FOR SUPERCOMPUTING APPLICATIONS
University of Illinois at Urbana-Champaign

Steven Gordon, manager, education
OHIO SUPERCOMPUTER CENTER
The Ohio State University

Laura McGinnis, manager, outreach
PITTSBURGH SUPERCOMPUTING CENTER
Carnegie Mellon University/University of Pittsburgh

Dan Stanzione, manager, training
TEXAS ADVANCED COMPUTING CENTER
The University of Texas at Austin

Bill Bell, manager, external relations
NATIONAL CENTER FOR SUPERCOMPUTING APPLICATIONS
University of Illinois at Urbana-Champaign

Susan McKenna, communications coordinator, external relations
NATIONAL CENTER FOR SUPERCOMPUTING APPLICATIONS
University of Illinois at Urbana-Champaign

Carlton Bruett Design

xsede.org