the PRIMER Summer 2009 Volume 6 Issue 1

also in this issue Bugs, Big & Small, from On Sequencing, Finishing and Notes from the 4th Annual JGI Inside & Out, Join Sources of Analysis in Santa Fe ...... 2 User Meeting ...... 6-7 Supporting Life in Great Salt Improving Crops and Green Genes to Fill CSP Lake ...... 3 Feedstocks ...... 8 Brown-rot for Biofuels . . . . . 4 A Shortcut called ChIP-seq 9 2010 Pipeline Sequencing with a Algae as a climate change More than 70 projects were could be used to remove heavy Single Cell ...... 5 sensor ...... 10 selected for the 2010 Com- metal contaminants from fresh- munity Sequencing Program water streams. (CSP) portfolio. They involve “The information we generate organisms from regions as far from these projects promises north as the Arctic and south to improve the clean, renew- to New Zealand. They include able energy pathways being proposals to study microbial developed now as well as lend contaminants in alcohol that researchers more insight into could impact biofuel production, the global carbon cycle, options microbial communities in the for bioremediation, and biogeo- guts of insects from an area chemical processes,” said DOE geography scholar Jared JGI Director Eddy Rubin. “In processes that support life on cow, termite and Amazonian Diamond once described as translating DNA sequence data the planet, or imperil it.” stinkbird, Mike Taylor from “the nearest approach to life into biology, we generate valu- Following along the lines of New Zealand’s University of on another planet” and a able science that improves our previous CSP-approved meta- Auckland proposed sequencing novel bacterial isolate that understanding of the complex genomic projects involving the microbial communities inside cont. on page 11 Moving Microbial Forward

BY MASSIE SANTOS BALLON why two-thirds of the nearly In an article published in genomic data, still remain to To get an idea of how many 5,000 genome projects report- the July issue of the journal be defined and codified. Due microbes there are on Earth, ed in the Genomes OnLine Nature , Kyrpides to biases in the selection of imagine taking just the small- Database involve microbes. reflects on the role of microbial genomes for study, vast realms est type of microbe, the virus- Another factor is the myriad studies in the genomics revo- of biodiversity remain unex- es that infect other microbes, potential applications of micro- lution of the past decade, and plored. Data release policies and lining them all up in the bial research for targeted drug considers the factors that have very early on led to what he row. How far do you think that development, bioenergy, biore- hindered the advancement of terms “a violation of a long- line would reach? From San mediation, agriculture, and the field. Although nearly standing precept in all scientif- Francisco to Paris? To the other human endeavors. But 1,000 microbial genomes have ic fields” when reviewers of moon? To the sun? The far more could be done with been sequenced over the past submitted publications answer: A thousand times the microbial genomics, according 15 years, he noted that the couldn’t access the underlying distance across the Milky Way. to DOE JGI Genome Biology data obtained has been com- genomic data. Microbes are not only head Nikos Kyrpides, if promised by the lack of a com- Kyrpides offers numerous exceedingly numerous, but researchers would embrace mon language and standard- suggestions to meet these also extremely diverse, encom- the world of possibilities that ized procedures. Such stan- and other challenges that face passing most of Earth’s total extend beyond such human- dards, required for the genomics research in the biodiversity. This is one reason centric studies. exchange and integration of decade ahead. For example, cont. on page 11 2 / the PRIMER

Summer 2009 Vol. 6, Issue 1

Seeking Sequencing Standards in Santa Fe

BY MASSIE SANTOS BALLON “Data generation is becoming cheaper, something produced in a lab other than a One thing became clear fairly quickly at easier and faster,” he said, ”and this is only large sequencing center such as DOE JGI, the 4th Sequencing, Finishing, Analysis in getting worse.” Based on the current output, which typically generates what he called a the Future Meeting held in Santa Fe, NM Chain predicted that by 2012, there might be “high quality draft,” which has little or no from May 27-29, 2009: the name game 12,000 incomplete genome sequences being manual review. genome researchers play needs new rules. made available to researchers who might In an “improved high quality draft,” the Speaking immediately after the opening not recognize that the datasets may lack data has been reviewed both by people and keynote, Patrick Chain, head of Microbial the information they’re interested in using. machines to some extent so most of the Interactions program at DOE JGI, noted that Chain said another part of the problem genetic data is assembled correctly, but there have traditionally been only two cate- is that as new sequencing technologies some errors may still be present. An gories for sequenced genomes — “draft,” become the machines of choice, they need “annotation grade” sequence would present where the genetic data of an organism has to be evaluated so that researchers under- all the information in various gene regions been collected but not completely organized, stand that the information that comes out as accurately as possible, and the term or “finished,” in which the genetic code has of one company’s instrument may not be “non-contiguous finished” would apply to been correctly put together and can be read the same as the information provided by genomes that are practically finished except like a published book from beginning to end. other equipment. The tags applied to the for “recalcitrant regions” that are proving He then proposed adding four other labels data may also vary based on the sequenc- problematic. Such quirks, Chain said, so that researchers can determine the ing technology used. would be included in the comments when quality of the genomes they’re accessing. All of these changes, Chain said, have the genome sequence is made available. Unfortunately, Chain said, pitching the led to the Consortium’s proposal of four It’s worth wondering after all this what a idea on behalf of the International Genome genome quality categories, as well as some genome sequence has to do to be labeled Sequencing Standards Consortium, the suggested changes to the definitions of the “finished.” To achieve this pinnacle, Chain advent of new sequencing technologies traditional terms, none of which are pegged said, the sequence would have no more has increased the number of genomes to a particular sequencing technology. A than one error in every 100,000 base pairs. being sequenced every year. standard draft, for example, might be Chain noted that one of the hardest parts of coming up with categories of sequence status had been getting every- one in the consortium to agree on the news terms and definitions. “I can barely get agreement in my own group on some of these issues so this was really a fantastic achievement,” he said. He concluded by noting that the Consortium plans to submit their proposal for publication at a later date, and intends to publicize the stan- dards on the websites of the various insti- tutions affiliated with the group. A number of other speakers referenced the need for a standardized terminology that could describe a genome sequence and its various components in their talks over the three days of the conference. Among them was Deanna Church from the Meeting organizers (left to right): Michael Fitzgerald, Broad Institute; Patrick Chain, DOE National Center of Biotechnology Information, JGI; Bob Fulton, Washington University in St. Louis; Donna Muzny, Baylor College of who referred to the need for standard Medicine; Johar Ali, Ontario Institute for Cancer Research; Chris Detter, DOE JGI; Alla Lapidus, DOE JGI terms during her talk when she mentioned cont. on page 9 the PRIMER / 3

Summer 2009 Vol. 6, Issue 1

Salt Lake’s Briny Waters May Harbor Biological Treasures

Photo by Charles Uibel, GreatSaltLakePhotos.com

BY MADOLYN BOWMAN ROGERS mercury and other heavy metals, and high to provide a baseline picture of microbial In one of the 2009 Community volatile sulfur concentrations that give the diversity. Researchers will combine the Sequencing Projects, the DOE JGI are lake a distinctive rotten-eggs stink. DNA data with measurements of microbial partnering with a consortium of researchers Despite these compounds, the lake sup- metabolites and environmental conditions to sequence the genomes of microbial ports five million migrating birds each year such as salinity and oxygen to paint a communities in Utah’s Great Salt Lake. and is home to brine shrimp and several complete picture of the lake’s ecology. Researchers believe the lake’s harsh envi- kinds of algae including diatoms, as well “We’ve done a lot of diversity assess- ronment harbors a wealth of undiscovered as an untapped diversity of microbial life. ment of the lake in the last two years, biodiversity that may provide new solu- Because the microbes can survive in and we’re not even approaching the limit tions for carbon sequestration and biore- such an extreme environment, Weimer of microbial diversity,” Weimer said. “The mediation, and might even give clues to believes they may contain unique proteins services that the DOE JGI will provide are news the nature of life on other planets. that enable them to chemically fix sulfur going to eclipse anything that we could do “We believe we’re going to find new and carbon and to detoxify pollutants. in the next five to ten years.” enzymes and new metabolic routes that Their biochemistry may suggest new meth- have never been seen before,” said the ods for sequestering carbon and reducing FOR ADDITIONAL INFORMATION SEE: project’s lead investigator, Bart Weimer, a acid rain, Weimer said, as well as novel Weimer, Bart C., Giovanni Rompato, microbiologist at University of California, pathways for bioremediation of heavy met- Jacob Parnell, Reed Gann, Balasubramanian Davis. The collaboration also includes als, aromatic hydrocarbons, chlorinated Ganesan, Cristian Navas, Martin Gonzalez, researchers at the Utah State University, compounds and methylmercury (Weimer et Mario Clavel, Steven Albee-Scott. Microbial the U.S. Geological Survey, and the Utah al., 2009). Microbial life in the Great Salt biodiversity of Great Salt Lake, Utah. Division of Water Quality. Lake might also provide a model for life 2009. In Saline lakes around the world: The Great Salt Lake is the third saltiest on planets such as Mars, where sulfate unique systems with unique values. body of water on earth, and the saltiest to and salt concentrations are also very high. pp.15-22. Eds. Oren, A., D. L. Naftz, and support life. The salt content varies widely Since less than one percent of the W. A. Wurtsbaugh. S. J. and Jessie E. across the lake, but on average the lake microbes in the lake can be cultured, Quinney Natural Resources Research is four times saltier than the ocean, and most of the microorganisms have never Library, Utah State University Press, in the North Arm up to 20 times saltier. been studied before. The DOE JGI will Logan. (http://www.cnr.usu.edu/quin- The lake also contains petroleum seeps, sequence DNA from four sites in the lake ney/htm/publications/nrei) 4 / the PRIMER

Summer 2009 Vol. 6, Issue 1

to brown-rot fungi, which makes the wood look like a pile of cracked bricks. Brown-rot’s Researchers hope to identify the enzymes involved in attacking the wood and apply Bypass Benefits them toward other plant material. Biofuel Production “Nature offers some guidance here,” said Dan Cullen, U.S. Department of Agriculture Forest Service, Forest Products Laboratory (FPL) scientist and one of the senior authors on the PNAS paper. “Postia has, over its evolution, shed the conventional enzymatic machinery for attacking plant material. Instead, the evi- dence suggests that it utilizes an arsenal of small oxidizing agents that blast through plant cell walls to depolymerize the cellu- lose. This biological process opens a door to more effective, less-energy intensive and more environmentally-sound strategies for more lignocellulose deconstruction.” Randy Berka, another of the study’s senior authors and Director of Integrative Biology at Davis, Calif.-based Novozymes, Inc., noted that in sequencing the genome of brown-rot fungi, researchers have been Lignin is to plants what a skeleton is code of brown-rot fungi to find out how able to compare the genetic blueprints of for humans: an internal scaffold that keeps they can harness the fungi’s ability to various types of wood-decaying fungi for the organism upright and the softer insides break down wood to make biofuel produc- the first time. protected. As a result, one of the prob- tion more cost-effective. Brown-rot fungi is a type of basid- lems with making biofuels from plants is “The microbial world represents a little iomycete fungi like white-rot fungi, which getting past the lignin to break down the explored yet bountiful resource for enzymes had been previously studied by the DOE news cellulosic biomass into sugars that can that can play a central role in the decon- JGI. White-rot fungi degrades both the be fermented and distilled into liquid bio- struction of plant biomass—an early step lignin and the cellulose however, while fuel. Because the lignin prevents easy in biofuel production,” said Eddy Rubin, brown-rot fungi only works on the latter. access to the cellulosic material, breaking Director of the DOE JGI, where the brown- “Such comparisons will increase our down plant biomass has traditionally rot fungi’s genome was sequenced. “The understanding of the diverse mechanisms involved harsh chemicals and high heat, brown-rot Postia placenta’s genome offers and chemistries involved in lignocellulose processes which add energy to the wrong us a detailed inventory of the biomass- degradation,” Berka said. “This type of infor- side of the biofuel production equation. degrading enzymes that this and other mation may empower industrial biotechnol- The Department of Energy (DOE) and fungi possess.” ogists to devise new strategies to enhance its Bioenergy Research Centers are Brown-rot fungi affect softwood trees efficiencies and reduce costs associated approaching the problem from two differ- such as redwoods, pines, cedars and with biomass conversion for renewable ent angles: developing plants bred specifi- Douglas firs, removing practically all of the fuels and chemical intermediates.” cally to become biofuel feedstocks; and cellulose in the plant without removing the The report by appeared in the February identifying enzymes that can maximize the lignin as well. Interestingly, some researchers 4 online edition of the Proceedings of the biomass being converted into fuel. have noted that most brown-rot fungi can’t National Academy of Sciences (PNAS) and With the latter idea in mind, an interna- work on cellulose if the lignin is missing. involved over 50 scientists from Austria, tional team led by researchers from the Approximately 10 percent of the decay noted Canada, Chile, the Czech Republic, France, DOE JGI examined the complete genetic annually in American timber is attributed Germany, Spain and the United States. the PRIMER / 5

Summer 2009 Vol. 6, Issue 1

When One Cell went to work. The single cells were then blown open and a process called multiple is Enough displacement amplification was applied to make millions of copies of the bacterial Studying microbes in a lab setting usu- genomes for sequencing. ally involves growing large quantities of The resulting flavobacterial genome them for genetic studies. Researchers at sequences are approximately 80 to 90 the DOE JGI demonstrated they could percent complete. Woyke credited DOE assemble a high quality draft of two JGI’s Cliff Han and his team at Los marine microorganisms using just a single Alamos National Laboratory (LANL), which cell from each. worked on closing gaps in the assembly. “As long as you can isolate a single “Even without completed genome cell, pick it from the environment, lyse it assemblies, single cell sequencing offers (blow it open), you can generate millions radically new opportunities for the basic of copies of that genome and gain access research and biotechnology applications of to the information inside that organism,” the microbial ‘uncultured majority’,” said researchers access genomes of interest said DOE JGI researcher Tanja Woyke (pic- DOE JGI collaborator Stepanauskas. that are relevant to the DOE’s mission. tured, top right). More importantly, the single cell “The power of single cell genomics is Woyke and her colleagues at the DOE sequencing approach means researchers that it offers us the ability to sort out one JGI sequenced genomes of two uncul- don’t have to culture organisms in order cell from a complex environmental sam- tured flavobacteria, marine microorgan- to sequence them. This could prove useful ple, liberate the DNA from that cell, and isms collected in Maine’s Boothbay for studying organisms that can’t exist in enzymatically produce millions of copies Harbor for Bigelow Laboratory collabora- a lab environment. of that genome so that we have enough tors Ramunas Stepanauskas and Michael “We estimate that roughly 99.9 percent DNA to sequence it and characterize its Sieracki, who are particularly interested in of the microbes that exist on this planet metabolic potential,” he said. genes encoding proteorhodopsins, which currently elude standard culturing methods, The technique doesn’t just work on allow some cells to use energy from the denying us access to their genetic material, marine microorganisms either; Woyke said sun without applying photosynthesis. so we have to explore other methods to they are helping other DOE JGI collabora- After the Bigelow scientists had picked characterize them,” noted DOE JGI Director tors who study organisms from various out individual bacterial cells from their Eddy Rubin. As a national user facility, he environments but can’t provide enough bacterial samples, Woyke and her team said, the DOE JGI is dedicated to helping sample to apply other test methods. Woyke and her colleagues aren’t the user community faces first to work on the single cell genome sequencing technique, but they’ve been able to provide a higher quality result compared to previous efforts. Still, the method needs more work. “One of the key issues that still needs refining is the lysis step, since many microbes will not lyse with alkaline solutions, the most common agent for the job. But we are actively working on that,” she said. The study was published in the April 23 edition of the journal PLoS One. Tanja Woyke can be heard discussing single cell genome sequencing as it applies to this particular project and other points of inter- Coastal water sample from Boothbay Harbor, Maine collected by Bigelow Laboratory est at http://www.jgi.doe.gov/News/pod- team. Photo courtesy of Ramunas Stepanauskas (Bigelow Laboratory). cast090421_woyke.mp3. 6 / the PRIMER

Summer 2009 Vol. 6, Issue 1

JGI Annual User Meeting #4— Briefly In Review

harnessing nature, particularly the issue of carbon emissions from indirect land use, There are trillions of dollars which has dogged bioenergy researchers invested in vehicle infrastructure since the concern was raised in a Science paper (see sidebar and Searchinger et al, and to think we’ll switch overnight 2008; 319: 1238-1240.) published last to ethanol and hydrogen isn’t nec- A record 500 people attended the DOE year. And he painted a sobering picture of essarily in the cards. JGI Annual Genomics of Energy and the timelines involved in modifying plants Environment User Meeting held March 25- whose genomes were being sequenced to 27, 2009 at the Marriott Hotel in Walnut become better feedstocks, which he con- —George Church Creek, Calif. siders a long-term solution, as well as sequences “without any idea of where you This year the keynote speakers were when producing a billion gallons of cellu- are or where you’re going at all.” As an Energy Bioscience Institute director Chris losic ethanol annually would likely happen. example of how the system could be used, Somerville, Harvard University’s George During the talk, Somerville reminded he talked about an agricultural study in Church and founder of the eponymous people to think about the energy problem on which lycopene was added to a strain of genomic research institute J. Craig a global scale. He also noted that thinking tomato that lacked the antioxidant. Venter. All three speakers focused on a outside the box can lead to surprising col- Venter’s presenta- number of scientific and technological laborations and solutions being proposed. tion (http://www. techniques to move genomic sequencing “We learned it can be very productive scivee.tv/node/ forward in ways relevant to the DOE JGI’s to educate people who don’t know about a 10653) on the last mission of clean energy generation and topic about what the problems are,” he said. day of the User bioremediation. Church’s keynote Meeting combined Somerville’s talk (http://www.scivee. Somerville’s focus on (http://www.scivee. tv/node/10578) the genome sequencing tv/node/10674) following night to design and develop better bioenergy focused on the focused on the tech- sources with Church’s emphasis on the user meeting review development of cel- nologies needed to technological upgrades needed to move lulosic biofuels, which move genomics the science forward. He spoke about what is being heavily pro- research forward. He he termed “digitizing life,” moving toward moted by the Obama began by noting that the technology administration and last year’s keynote should be accessible enough that users speaker-turned-energy secretary Steven Chu. feel comfortable not just using the equip- If you’re looking for new He spoke about the challenges involved in ment but modifying it to suit their needs. “Think of the platform that reads mammalian genes, stop. Basically sequences as an automated microscope. they’ve all been found. My favorite among these That’s really what it is.” Adding a micromirror display, he said, would give Miscanthus —J. Craig Venter [energy] crops is . In an the machine added functionality, allowing average car you could drive around researchers to hold or release cells or reading the genetic code on the computer the world twice from fuel produced sequences of interest. and writing it. He discussed the chal- The Harvard researcher also described lenges of building a bacterial genome by one acre of biomass. a program he called Multiplex Automated from the ground up, which led to the Genome Engineering or MAGE that would development of computer software that —Chris Somerville allow researchers to study and design allows his researchers to act as digital 76678_USGPO:CR 7/8/09 9:40 AM Page 7

the PRIMER / 7

Summer 2009 Vol. 6, Issue 1

DNA designers to create the Venter said. When asked at the species needed to meet the ever- end about making the software or In a February 2008 issue of Science, Tim increasing demand for energy and the species being designed avail- Searchinger and his colleagues at Princeton the need to protect the environment. able, he quipped, “We are not University argued that most studies advocat- “We have discovered 20 million ready to release our software or ing biofuel use as a way to reduce green- genes and we’re using them as our bugs because the software house gas emissions did not factor in the design components of the future,” has bugs.” emissions resulting from the conversion of land to grow the additional energy crops. The researchers argued that carbon sinks Lessons Learned at the JGI Users Meeting such as forests and grasslands would become (excerpts from : http://phylogenomics.blogspot.com/ carbon sources when they were plowed up for 2009/03/lessons-learned-at-jgi-users-meeting.html) use as feedstock farmland. “At the broadest level, the key insight of our paper is that when “…here are my top lessons I got out of the meeting…” cropland-based feedstocks are used for biofuels, •NextGen sequencing con- the source of any potential greenhouse gas tinues to open up new benefit is the use of the land to take carbon windows into biology. out of the atmosphere,” Searchinger reiterated •Ecological and population in a letter published by Science six months genomics are truly the after the study was released. next big thing. “Leaving out emissions from land use

•Related to the above THE INDIRECT LAND USE DEBATE change for such biofuels assumes that land is point, one of the next rev- a carbon-free asset. We would argue that any olutions is going to be in calculation that ignores these emissions, however high throughput phenotyp- challenging it is to predict them with certainty, is too ing --- after all, we cannot incomplete to provide a basis for policy decisions.” solve the genotype-phenotype problem when we only know the geno- One of the examples Searchinger and his team type. used involved corn ethanol. Biofuel advocates had pre- •Model / reference organisms are still in, but every single organism on viously calculated that the energy crop would reduce the planet is now in play. emissions by 62 percent. The Princeton researchers •NextGen sequencing has completely outrun the ability of even good said that when land use emissions were factored in, people to keep up with the data and to use it well. corn ethanol would actually release double the emis- •Related to the above point, the NextNextGen (e.g., Pacific sions that were to have been sequestered. user meeting review Biosciences) seems to be barreling along and almost ready for prime The study drew criticisms from biofuels advocates time. What are we going to do in terms of informatics then? and researchers such as Chris Somerville, who noted •Following up on the above point- we desperately need a MASSIVE that the study was “based on naïve assumptions.” effort in the development of tools for "normal" biologists to make bet- The Princeton study, they said, only targeted land ter use of massive sequence databases. use emissions from biofuel production, instead of also •I am happy to report that just about everyone seems to be trying to considering the emissions from oil production. Another use an evolutionary perspective as part of their work - especially in point they raised was that the study assumed a worst- the selection of organisms for sequencing case scenario in which all the land being converted for •I am sorry to report that many of the evolutionary "perspectives" are a the added crops was pristine land. “According to indi- bit off kilter. rect land use,” Somerville commented during his •If you study a plant or an animal and are not studying the microbes keynote speech, “by not growing corn on unfarmed land, that live with them, you are missing something. we’re deforesting the Amazon.” Other factors, the crit- •If you study ANY organism and are ignoring epigenetics you are behind ics said, could make land available for biofuels without the curve destroying forests. •Reading DNA is being used in every which way imaginable. Searchinger and his colleagues have refuted the •Sequencing is definitely not over - it is just getting started. counter-arguments, but the debate continues to this day. •Next up – writing DNA. 8 / the PRIMER

Summer 2009 Vol. 6, Issue 1

JGI Publishes Sorghum Drought-Tolerant Biofuels Feedstock

For some 500 million people around important biofuel feedstock grass the world, sorghum is a staple grain. genomes⎯switchgrass, Miscanthus, and Approximately 60 million tons of sorghum, sugarcane⎯will be compared,” added which is more drought-tolerant than corn Andrew Paterson, the publication’s first and other cereal crops such as wheat and author and Director of the Plant Genome oats, is produced annually. Mapping Laboratory, University of Georgia. Sorghum also requires a third less Having the complete sorghum genome water to produce the same amount of already benefits a number of current ethanol per bushel compared to corn. research projects involving the grain. Even Because of its potential as a whole plant before they knew sorghum’s complete fiber based biofuel feedstock, the DOE genome sequence, for example, the U.S. JGI and several partner institutions Department of Agriculture had been work- worked on the genome sequence of ing on developing disease-resistant, low- sorghum. The grass’ complete genome lignin sorghum strains. USDA researchers was published in the January 29 edition based their studies on early sorghum of the journal Nature. genetics research that compared “This is an important step on the road sorghum’s genes to those of rice. to the development of cost-effective biofu- With less lignin in the plant, the sorghum els made from nonfood plant fiber,” said would be easier to digest, making it a bet- Anna C. Palmisano, DOE Associate Director ter livestock feed. It would also require of Science for Biological and Environmental less energy to be converted into ethanol Research. biofuel. Similar research projects in other “Sorghum is an excellent candidate for institutions would find the sorghum biofuels production, with its ability to with- genome’s availability equally useful. stand drought and prosper on more mar- The sorghum genome is only the sec- ginal land,” she said. “The fully sequenced ond grass genome to be completely genome will be an indispensable tool for sequenced after rice and has turned out notable publications researchers seeking to develop plant vari- to be nearly 75 percent larger. Part of the ants that maximize these benefits.” reason for the difference in size, noted Researchers plan to use the genetic Jeremy Schmutz from the DOE JGI partner data not only to develop better food and HudsonAlpha Institute for Biotechnology, feedstocks with improved drought tolerance is that plant genomes tend to have large and increased grain yield, but to produce sections of repetitive regions and sorghum grass strains suited specifically for biofuel is no exception. But the comparison production. Sorghum’s position in the grass between sorghum and rice genomes sug- family ⎯ it is closely related to maize and gests that there may be a common grass sugarcane, and more distantly to wheat and genome structure as well. rice ⎯ is expected to give researchers “We found that over 10,000 proposed helpful insights into other plant genomes rice genes are actually just fragments,” said that could be used as biofuels. DOE JGI’s Dan Rokhsar. “We are confident “Sorghum will serve as a template now that rice’s gene count is similar to genome to which the code of the other sorghum’s at 30,000, typical of grasses.” the PRIMER / 9

Summer 2009 Vol. 6, Issue 1

A Shortcut for Genome Sleuths

To identify potentially significant As DOE JGI postdoctoral fellow sequences in an organism’s genetic code, Matthew Blow put it, ChIP-seq such as those that control the genes allows scientists to directly responsible for an organism’s develop- identify a genome-wide set ment or key the development of better of the genetic switches biofuel feedstocks, genome researchers throughout the whole typically compare the target genome with genome rather than individ- the genomes of other, similar species and ually tracking down and look for sequences conserved in evolution. testing every gene for DOE JGI researchers teamed with col- its possible functions. leagues at Lawrence Berkeley National Len Pennacchio, Laboratory and the University of head of the DOE JGI’s California, San Diego to develop a tool Genomic Technologies depart- that provides genome researchers with a ment and senior author of the study, shortcut for searching out locations on added that ChIP-seq also provides The boxing gloves-like blue stain in this- the genome where particular gene func- researchers with an opportunity to learn mouse embryo indicates a specific gene function identified by researchers using a tions take place. The method is known as more about noncoding sequences referred tool refined by (left-to-right) senior Nature ChIP-seq and it can pinpoint the genetic to as genomic “dark matter.” paper author and JGI Genomic Technologies switches by looking at how DNA and pro- The research was described in the Program head Len Pennacchio and co-first teins interact. February 12 edition of the journal Nature. authors Matt Blow and Axel Visel.

Coming to Terms in Santa Fe cont. from page 2 that the same label might be applied to dif- explaining the process and the problems dependent on a person’s point of view, ferent things on different databases. And involved. which is in turn defined by their knowledge Owen White from University of Maryland Land echoed Chain’s comment from the and experience. suggested moving toward a common sys- first day of the conference on the steadily All of these variations led to Land’s call tem that could pull the best information on increasing list of genomes and sequences for annotation standards, which she didn’t bacterial genome sequences from a the being made available. There’s a movement think was as impossible a process as Chain notable publications various databases rather than relying on to make the annotation process automated had made it out to be during his talk. just a few during his keynote speech. in order to speed up the work and make it “We want a method of assessing and “Standards are going to be the key to more consistent. Unlike manual review, she designation quality as currently all annota- understanding some of the difference said, computers will always interpret the tion is deemed uniform in quality,” she said. between the things that are done well and data in the same way. And yet “consistent “We want a standard vocabulary for when the things that are sort of dubious,” said doesn’t always mean better,” Land noted. you’re talking about the same thing.” An Miriam Land, a DOE JGI researcher at Oak “Sometimes it’s consistently wrong too.” “annotation quality sequence,” she said in Ridge National Laboratory who annotates There are a number of genome databas- closing, deserves annotation of compara- microbial genomes sequenced in DOE JGI es out there that serve as references for ble quality. headquarters in Walnut Creek, Calif. annotation work, Land said, and the infor- Patrick Chain’s talk can be viewed online Land was the final speaker at the confer- mation in them is based on observations at http://www.scivee.tv/node/11406. ence and she addressed the need for stan- made by other people. Each of these tools, Miriam Land’s talk can be viewed online at dards in labeling the various components of however, was created with a specific pur- http://www.scivee.tv/node/11424. Other a genome sequence. “I often run into people pose in mind, which is why the data provid- talks from the “Sequencing, Finishing, from the sequencing end who assume that ed by each might not always be useful to Analysis in the Future” conference can be what they see on the computer is right when researchers. In that sense, she added, the found online at http://www.scivee.tv/ they look at annotation,” she said before quality of the annotation can become node/11378. 10 / the PRIMER

Summer 2009 Vol. 6, Issue 1

Ocean Sentinels for Climate Change

Scientists studying climate change approximately 10,000 genes were identi- only to the effective carbon cycling capa- often look for organisms that can serve fied in each isolate. bilities of green algae, but to those they as the local version of the canary in a DOE scientists plan to use the genetic have in common with land plants,” said coal mine. Such sentinels help information to develop algal biofuels, DOE JGI Director Eddy Rubin. researchers identify warning signs of envi- among other things. The algal sequences But the genomic results also revealed ronmental changes that are likely to upset could also provide researchers with a look that the algae weren’t as similar to each the balance of the ecosystem. In the at the encoded evolutionary record of how other as the researchers had previously Arctic region, the scientists watch the plants moved from the water to land. thought. polar bears. In the oceans, however, they “Genome sequencing of Micromonas “These two picoeukaryotes, often con- might be watching something much, much and the subsequent comparative analysis sidered to be the same species, only smaller. with other algae previously sequenced by share about 90 percent of their genes,” An algal genus known as Micromonas DOE JGI and Genoscope [France], have said Worden. To put this in perspective, thrives in waters ranging from the equator proven immensely powerful for elucidating humans and some primates have about to the poles and plays a key role in the the basic ‘toolkit’ of genes integral not 98 percent genes in common. global carbon cycle. These microorgan- The information suggests the algae in isms lend marine scientists each region may thus react insight on how climate change A 3D reconstruction of an electron tomograph- differently to coming environ- and rising ocean tempera- ic slice (0.5 microns thick) of one of the small- mental changes. est known eukaryotic algae, Micromonas: A.Z. tures affect marine ecosys- “By understanding which Worden, T. Deerinck, M. Terada, J. Obiyashi tems and which organisms and M. Ellisman (MBARI and NCMIR). genes a specific strain might become major players Background: Flavio Robles (LBNL). employs under certain condi- in the global carbon cycle. tions, we gain a view into Approximately a 50th of the factors that influence the the width of a human hair, success of one group over Micromonas could give another,” Worden said. “We Olympian Michael Phelps a may then be able to develop run for his gold medals, being models that could more notable publications able to move through water effectively predict a range of at the rate of 50 body possible future scenarios, lengths per second. The tiny that will result from current marine microorganism can climate change.” also move toward the sun- Worden thinks, for exam- light that powers it, much like ple, that the algae would sunflowers constantly keep adapt more easily than other their faces turned toward the marine species because it light throughout the day. has already shown that it The DOE JGI sequenced can thrive in a variety of tem- genomes of two Micromonas peratures. samples being studied by Only five other single Alexandra Z. Worden of celled marine microorgan- Monterey Bay Aquarium isms had had their genomes Research Institute (MBARI). sequenced prior to the publi- Taken from opposite ends of cation of the work, which the world — the South Pacific appeared in the April 10 and the English Channel — issue of the journal Science. the PRIMER / 11

Summer 2009 Vol. 6, Issue 1

CSP 2010 Selections cont. from page 1 gut samples taken from 11 insect species Pringle, whose proposal focuses on the endemic to New Zealand and 3 termite common Amanita thiersii which breaks species from Australia that potentially down cellulose. have novel wood-degrading enzymes. Other approved projects focus on the Another metagenome project involves fermentation process of alcohol production. the desert locust (Schistocerca gregaria), David Mills from the University of California, swarms of which spread over up to 20 Davis and Trevor Phister from North Carolina P. gigantea found fruiting on a red pine log percent of the world’s land mass and State University both proposed studying in a northern Minnesota forest, where the affect the livelihoods of up to 10 percent bacterial contaminants that destroy alcohols fungus, an aggressive saprophyte, is com- monly found on downed logs or cut timber. of the world’s population just by consum- and reduce biofuel production efficiencies: Photo by: Robert A. Blanchette, Professor, ing the equivalent of their body mass daily. the bacteria that turns wine into vinegar University of Minnesota DOE JGI researcher Falk Warnecke pro- (Acetobacter aceti) and the bacteria respon- posed sequencing the insect’s gut wall sible for the distinctive “horse-sweat taint” Another project related to the carbon and the microbial community inside, in in contaminated wine (Dekkera bruxellensis). cycle comes from Ramunas Stepanauskas part to find out if the locust or the microbes The DOE JGI is also using DNA sequen- at the Bigelow Laboratory for Ocean should get the credit for degrading ligno- cing for projects related to carbon cycling Sciences in Maine and involves microbial cellulose during these meals. and biogeochemistry. A project from Mark genomes collected from the Atlantic Ocean Many sequencing projects focus on Waldrop of the U.S. Geological Survey looks near South America. Stepanauskas intends breaking down plant cell walls to get more at microbes trapped in the Alaskan perma- to use single cell genome sequencing tech- energy out of biofuel feedstocks. Two frost, a carbon sink threatened by global niques, building on previous collaborations involving fungi come from Daniel Cullen, U.S. warming. The samples to be sequenced with DOE JGI research scientist Tanja Woyke Forest Service, Forest Products Laboratory, come from the USGS’ Yukon River Basin and her colleagues (see story on page 5). who proposed sequencing a lignin-degrading project, where geoscientists have been study- One sequencing project is led by white rot fungus known as Phlebiopsis ing the region’s response to climate change Volodymyr Dvornyk from the University of gigantea, and Harvard University’s Anne since 2001. Hong Kong and involves cyanobacterial cont. on page 12

Kyrpides on Microbial Genomics cont. from page 1 notable publications the list of microbial genomes for potential a common practice — will become infeasi- — a necessity for sequencing has been limited to the ble. To reduce the size of the datasets, he meaningful genome approximately one percent of organisms proposes a proxy approach in which one comparisons. that can be cultured in the lab and their protein from each protein family or one With new tools in DNA then extracted. He sees a way for- species from each genus represents the hand and internation- ward using single-cell genomics — a tech- group. Taking this one step farther, all the al initiatives for nique now being pursued in earnest by genes from all the sequenced strains in a increased collabora- DOE JGI researcher Tanja Woyke and her species — the pan-genome for that species tion underway, the field of microbial colleagues — in partnership with environ- — would constitute the genome repre- genomics is poised for a decade of excit- mental to provide a more senting that species for gene comparisons. ing advances. Sharing his vision for the holistic understanding of a community Echoing other researchers, most notably future, Kyrpides observes: “To explore and its individual members. DOE JGI’s Patrick Chain and Miriam Land and seek to understand how the Earth Kyrpides also suggests several innova- during the recent “Sequencing, Finishing, breaths, grows, evolves, renews, and sus- tive approaches for easing the data pro- Analysis in the Future” Conference, Kyrpides tains life — all essentially the work of the cessing bottleneck accompanying the calls for the development of genome microbial world — is the great adventure exponential increase in genomic data. All- annotation standards and their adoption now beckoning to us. Microbial genomics versus-all gene comparisons — previously by sequencing centers around the world paves the way forward.” 12 / the PRIMER

Summer 2009 Vol. 6, Issue 1

CSP 2010 Selections cont. from page 11 samples of Nostoc linckia collected from long-term stress affects the evolution of of a freshwater bacterium from the opposite faces of a valley known as Evolu- the bacterial genome. Siderocapsaceae family that can remove tion Canyon north of Tel Aviv, Israel. The Among approved projects that focus on manganese contaminants. south side of the valley receives 800 per- microbes that can help clean up the envi- For the complete list of CSP 2010 cent more solar radiation than the north ronment is one from Gillian Lewis at the sequencing projects, see: side of the valley, so the samples offer University of Auckland in New Zealand. http://www.jgi.doe.gov/sequencing/cspse researchers an opportunity to learn how She proposed sequencing a novel isolate qplans2010.html

Matthias Hess, DOE JGI postdoctoral fellow pre- sented the work on min- ing the metatranscrip- tome of the cow rumen microbiota for feedstock- targeted glycosyl hydro- lases, at the 31st Annual Part of DOE JGI's strong presence at the at the 4th Society of Industrial Sequencing, Finishing, Analysis in the Future Meeting 's held in Santa Fe, NM May 27-29. Left to Right: Stephan Symposium on Fuels and Trong, Steve Lowry, Kurt LaButti, Alicia Clum, Susan Chemicals May 4 in San Lucas, Alla Lapidus and Hui Sun. Francisco.

Twenty-three feet, four and a quarter inches and counting...At a recent gathering of poplar researchers at the base of DOE JGI’s sentinel Populus trichocarpa Jerry Tuskan (arms folded, above the helix), pre- dicted at least another six-foot growth spurt by year's end. U.S. House of Representatives Committee on Science and Technology staffers (left to right) Adam Rosenberg, Margaret Caravelli, Jetta Wong and Elizabeth Chapel, visited DOE JGI for a briefing on May 28.

2010 SAVESAVE TTHEHE DATE!DATE! MARCH 24-26 WALNUT CREEK, CA

CSO 16843

Contact The Primer David Gilbert, Editor / [email protected] / (925) 296-5643