the PRIMER Winter 2010 Volume 7 Issue 1

Also in this issue
Cassava spurs research grant
IMG update released
Probing life in oceanic “dead zones”
Soybean's sequence supplied
A Guide called GEBA
Tracking projects in progress
Delft University Chancellor visits

Revealing the Soybean Sequence: A Series of Firsts

The soybean, one of the most important global sources of protein and oil, has become the first legume to have a published complete draft genome sequence. In the January 14 issue of the journal Nature, a team of researchers from the U.S. Department of Energy Joint Genome Institute (DOE JGI), the U.S. Department of Agriculture-Agricultural Research Service (USDA-ARS), the National Science Foundation (NSF), the University of Missouri, Purdue University, and a dozen other institutions described the sequence and how the information might be applied to agricultural strategies and biodiesel production.

“The soybean genome’s billion-plus nucleotides afford us a better understanding of the plant’s capacity to turn sunlight, carbon dioxide, nitrogen and water into concentrated energy, protein, and nutrients for human and animal use,” said Anna Palmisano, Associate Director of the DOE Office of Science’s Office of Biological and Environmental Research. “This opens the door to crop improvements that are sorely needed for energy production, sustainable human and animal food production, and a healthy environmental balance in agriculture worldwide.”

Briefly setting aside the list of potential applications to be derived from the sequence, Jeremy Schmutz, the study’s first author and a DOE JGI scientist at the HudsonAlpha Institute for Biotechnology in Alabama, noted two other key points about the complete draft genome. “The soybean sequence project is the largest plant project done to date at the Joint Genome Institute,” he said. “It (cont. under “Soybean Sequence”)

Photo by Roy Kaltschmidt, LBNL

Now Available: Microbial Encyclopedia, Vol. 1

Nearly 2,000 microbes have been sequenced out of the estimated nonillion (10^30) in, on and around the Earth. And while the information is significantly impacting almost all aspects of microbiology, said DOE JGI Phylogenomics Program Head and University of California, Davis professor Jonathan Eisen, it is bypassing the ribosomal RNA Tree of Life, which allows researchers to track and understand how organisms are related to each other.

“We’ve done a very poor job of sampling across the tree in microbial studies,” said Eisen. “If you look at phylogenetic diversity in the bacterial kingdom, most of the available genomes come from just 3 of the 40 major phyla. The same trend holds for archaea, eukaryotes and viruses. The solution is to use the tree to guide us, going through phylogenetic diversity to explicitly fill in missing branches of the tree with actual data.”

To remedy the problem of insufficient phylogenetic diversity, Eisen and his colleagues sequenced 100 bacterial and archaeal genomes that represent little-studied branches of the Tree of Life. The work, considered the first “volume” of a Genomic Encyclopedia of Bacteria and Archaea or GEBA, was published in the December 24, 2009 issue of the journal Nature. The GEBA (cont. under “GEBA”)


IMG 3.0 Goes Live

The Integrated Microbial Genomes (IMG) system, featured in a recent edition of Nucleic Acids Research*, serves as a community resource for comparative analysis and annotation of all publicly available genomes from three domains of life in a uniquely integrated context. IMG 3.0, the 18th release, went live on December 21, 2009.

The content of IMG 3.0 has been updated with new microbial genomes available in RefSeq version 37 (June 02, 2009) and contains a total of 5,558 genomes consisting of 1,748 bacterial, 77 archaeal, 76 eukaryotic genomes, 2,606 viruses (including bacterial phages), and 1,051 plasmids that did not come from a specific microbial genome-sequencing project. Among these genomes, 4,650 are finished genomes, 904 are draft genomes, and four are permanent draft (i.e., will never be finished) genomes. Twenty-seven new fungal genomes have also been included in IMG 3.0. Compared with IMG 2.9, IMG 3.0 contains 7,540,500 genes, an increase of 1,026,256 genes.

MetaCyc and KEGG pathways in IMG 3.0 have been updated with MetaCyc version 13.5 and KEGG version 52.0 respectively. The Pfam collection of protein domain families has been updated based on Pfam version 24.0, and Pfam clans have been added as an additional classification of Pfam domain families.

In addition, chromosomal cassettes† have been recomputed together with estimates of their conservation across IMG genomes.

Genes in IMG involved in regulatory interaction experiments controlling their expression are now linked to RegTransBase (http://regtransbase.lbl.gov). RegTransBase is a database of regulatory sequences and regulatory interactions on the transcriptional and posttranscriptional levels in prokaryotic genomes that contains experimental data and predicted sites published in scientific journals.

IMG 3.0 also contains proteomic data from recent Arthrobacter chlorophenolicus, Cryptobacterium curtum, and Brachybacterium faecium studies.

The User Interface for IMG 3.0 has been extended with Scaffold Cart tools that facilitate the analysis of genomes at the level of individual scaffolds and contigs, such as individual chromosomes and plasmids. ACT (Artemis Comparison Tool), a viewer based on Artemis for pair-wise genome DNA sequence comparisons, has also been added to IMG’s suite of synteny viewers.

For additional information, see What’s New and Using IMG: http://img.jgi.doe.gov.

*Nucleic Acids Research, 2010, Vol. 38: http://nar.oxfordjournals.org/cgi/reprint/gkp887v1
†PLoS ONE 4(11): http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0007979
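
The genome tallies reported for IMG 3.0 can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are made up for this example, the figures are copied from the article, and the IMG 2.9 gene total is inferred by subtraction rather than taken from a published count.

```python
# Figures copied from the IMG 3.0 announcement; all names are illustrative.
genomes_by_type = {
    "bacterial": 1748,
    "archaeal": 77,
    "eukaryotic": 76,
    "viruses": 2606,
    "plasmids": 1051,
}
genomes_by_status = {
    "finished": 4650,
    "draft": 904,
    "permanent_draft": 4,
}
reported_total = 5558

# Each breakdown should add up to the reported genome total.
total_by_type = sum(genomes_by_type.values())
total_by_status = sum(genomes_by_status.values())

img30_genes = 7_540_500
gene_increase = 1_026_256
# Assumption: the IMG 2.9 total is inferred here, not quoted from a source.
img29_genes = img30_genes - gene_increase
percent_increase = 100 * gene_increase / img29_genes

print(f"Total by type:   {total_by_type} (reported {reported_total})")
print(f"Total by status: {total_by_status} (reported {reported_total})")
print(f"Implied IMG 2.9 genes: {img29_genes:,}")
print(f"Gene increase over IMG 2.9: {percent_increase:.1f}%")
```

Both breakdowns sum to the reported 5,558 genomes, and the gene figures imply roughly 6.5 million genes in IMG 2.9.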

Cassava draft genome sequence spurs Gates Foundation funding

Not long after the first draft of the annotated cassava genome sequence was made available on the DOE JGI’s Phytozome.net in November, the Bill & Melinda Gates Foundation announced a $1.3 million grant to fund the development of a genome variation database that will help farmers grow more disease-resistant and nutritious varieties of the root crop in less time.

A third of the cassava harvest in Africa is lost because of pathogens such as cassava brown streak disease (CBSD). A staple food for more than 750 million people around the world, cassava was sequenced by the DOE JGI as part of CSP 2007 to help develop a variety with improved resistance to CBSD and other diseases. Claude Fauquet, chair and co-founder of the Global Cassava Partnership and a researcher at the Danforth Center, said having the genome sequence of cassava will benefit the international food security situation as well as help improve the farmers’ health and economic growth.

Several varieties of cassava will be sampled by an international consortium that includes the University of Arizona, the DOE JGI, the University of Maryland and 454 Life Sciences, in collaboration with researchers in Kenya, Uganda and Tanzania, to identify genes that correspond to important traits and develop a genetic markers database.


Studying life in a “Dead Zone”

For researchers at the University of British Columbia (UBC), the Saanich Inlet off the coast of British Columbia, Canada is an ideal “living lab” to study the microbial communities in low oxygen waters. As these so-called “dead zones” expand in oceans worldwide, so does interest in understanding how the microorganisms that thrive in these regions affect and are impacted by the changes to their ecosystems.

In the October 23, 2009 issue of the journal Science, a team of UBC and DOE JGI researchers described the results of a study conducted over several seasons on the microbial communities of Saanich Inlet, which led to the identification of the most abundant organism, called SUP05. Study senior author and UBC professor Steven Hallam noted that the team obtained enough sequence coverage to assemble what they called “the SUP05 metagenome, a composite of the entire SUP05 population spanning the various environmental samples that we sequenced.”

The project is part of the DOE JGI’s Community Sequencing Program, established in 2004 to tackle mission-relevant genomics projects that support the goals of the U.S. Department of Energy to develop clean, sustainable bioenergy sources and characterize biological and environmental processes such as biogeochemistry and carbon cycling.

Susannah Tringe, a metagenomics scientist at the DOE JGI, said that oxygen minimum zones (OMZs) are sinks for nitrogen, an essential nutrient that marine organisms need to survive, as well as sources of the greenhouse gases methane and nitrous oxide. “By studying the genomes of the uncultivated microbes found in OMZs, we can better understand how they participate in global geochemical cycles such as the carbon and nitrogen cycles,” she said.

Hallam described SUP05 as a paradoxical organism, one that fixes carbon dioxide and removes toxic sulfides, but which might also be producing nitrous oxide, a more potent greenhouse gas than either carbon dioxide or methane. The researchers found that SUP05 is closely related to sulfur-eating gill symbionts of deep sea clams and mussels, though it utilizes nitrate rather than oxygen in its energy metabolism. Additionally, a comparative analysis revealed that 35 percent of the SUP05 genome is unique and is involved in helping the bacteria adapt to changing environmental conditions such as the seasonal increase and decrease of oxygen levels in Saanich Inlet, and the shifting balance of the nitrate and sulfide levels that are its key energy resources.

“Just as cyanobacteria play an essential role in producing atmospheric oxygen, in future oceans this could be one of those organisms that play similarly integral roles, albeit with different ecological outcomes,” said Hallam. He noted that the SUP05 microorganism and its relatives will become increasingly important as OMZs continue to expand, providing researchers with a biological indicator useful in monitoring the changing state of the global ocean.

“Global warming is changing the chemistry of the oceans and one of the byproducts of change is that the ocean pH is becoming acidic,” Hallam said. “Blooming SUP05 populations have the potential to help offset rising carbon dioxide levels that ultimately lead to ocean acidification.”

Hallam and his team intend to use their time-resolved studies as a basis for comparison in the context of another CSP project of Hallam’s, approved earlier this year, which focuses on an extensive OMZ in the eastern North Pacific Ocean. The team also plans to eventually compare their work in Saanich Inlet to data collected from other dead zones around the world.

Study first author David Walsh (left) and study second author Elena Zaikova (below), both at the University of British Columbia, on a water sampling trip at Saanich Inlet (right). Courtesy of the Hallam Lab.


Image courtesy of the United Soybean Board

Jeremy Schmutz, a DOE JGI scientist at the HudsonAlpha Institute for Biotechnology in Alabama

Soybean Sequence (cont.)

also happens to be the largest whole genome shotgun plant that’s ever been sequenced. We took the approximately 1.2 Gigabase genome, broke it apart and reassembled it like a puzzle.”

One major application of the soybean genome sequence Schmutz mentioned is for biodiesel production. Right now the plant doesn’t produce enough oil to compete with petroleum products, though he noted the legume is the major source of biodiesel worldwide.

Tom Clemente, a professor with appointments at the Center for Biotechnology and Center for Plant Science Innovation at the University of Nebraska, Lincoln, said the soybean genome sequence could offer solutions to the production problem. “We can now zero in on the control points governing carbon flow towards protein and oil,” he said. “With the combination of informatics, biochemistry and genetics we can target the development of a soybean with greater than 40 percent oil content.”

University of Missouri professor Gary Stacey, Director of the Center for Sustainable Energy and Associate Director of the National Center for Soybean Biotechnology, said he and his colleagues have also identified more than 46,000 genes from the sequence analysis, of which 1,110 are involved in lipid metabolism. “These genes and their associated pathways are the building blocks for soybean oil content and represent targets that can be modified to bolster output and lead to the increase of the use of soybean oil for biodiesel production,” he said.

Schmutz said another major application of the soybean genome sequence would be to provide a reference sequence for more than 20,000 legume species, helping agricultural researchers boost soybean yields and learn more about the nitrogen-fixing symbiosis so critical to successful agricultural crop rotation strategies.

In the past, he said, farmers picked plants in the field and bred them together to improve crop yields. “In recent years we’ve kind of tapped out on traditional soybean breeding and can’t seem to increase the yield anymore. Using genomics allows us to breed specific genes, identify specific traits such as drought tolerance, pathogen resistance and more seed production, and breed them back into soybean lines,” he said.

For example, the soybean genome sequence has already allowed researchers to identify the first resistance gene for Asian Soybean Rust (ASR), a disease that can reduce yields by as much as 80 percent in some countries. Another discovery effort spurred by the sequence has pinpointed a mutation that researchers can use to find soybean lines with lower levels of the sugar stachyose, which will improve the ability of animals and humans to digest soybeans.


Photo by Kim Closser; Studio3, Columbia, MO Image courtesy of the United Soybean Board

JGI collaborator Gary Stacey, Associate Director of the National Center for Soybean Biotechnology at the University of Missouri

A third research project that compared the genomes of soybean and corn has also led to the discovery of a single- mutation that reduces phytate production in soybean, which could in turn reduce the environmental runoff from livestock waste. Phytate is the form in which phosphorus is stored in plant tissue, and it isn’t absorbed by animals that eat feeds with soybean mixed in. This can lead to higher phosphorus levels in manure, which can become a major contaminant.

“This is a milestone for soybean research and promises to usher in a new era in soybean agronomic improvement,” said Stacey. “The genome provides a parts list of what it takes to make a soybean plant and, more importantly, helps to identify those genes that are essential for such important agronomic traits as protein and oil content.”

GEBA (cont.)

project was launched in May 2007 with the goal of first identifying unrepresented branches from the phylogenetic Tree and then identifying organisms from these branches that could provide DNA samples for sequencing. The DOE JGI team collaborated with researchers at the non-profit German Collection of Microorganisms and Cell Cultures, DSMZ (http://www.dsmz.de/), to sequence 100 bacterial and archaeal genomes for the pilot project, though Eisen said approximately 170 genomes have been sequenced and many of them are being finished.

The findings reveal that using the rRNA Tree of Life as a guide to phylogenetically select organisms, especially uncultured ones, allows diverse genomes to be sequenced, and in turn provides them for use in annotating other genome sequences. “You might not care about the genomes sequenced in this study,” said Eisen, “but they provide the ability to study other genomes you might care about more.”

He also noted that the GEBA project emphasizes the vastness of microbial diversity, pointing out that to even sample half of the known phylogenetic diversity, researchers would need to sequence still another 10,000 genomes of what are mostly uncultured organisms and then understand the biology of the organisms.

“Many people have talked about doing something like this for the last 10 years; no one tried it until JGI tried it,” said Eisen. “What this project is, is an example of the type of project that a large genome center like JGI can do in the new generation where sequencing is a lot cheaper and a lot faster, and how a place like JGI can set itself apart from all the other labs around the country. This has been an amazing project.”

More information available at http://www.jgi.doe.gov/News/news_09_12_23.html. For a list of the GEBA pilot project targets, see: http://www.jgi.doe.gov/sequencing/GEBAseqplans.html. Videos of Eisen discussing the GEBA project can be viewed at: http://www.scivee.tv/user/7476.

Photo and images by Roy Kaltschmidt, LBNL


DOE JGI Developing Integrated Tracking System

To keep up with changes to sequencing strategies and the resulting plethora of projects, the DOE JGI is developing a system that allows project managers and collaborators to check on the status of a project.

“For the decade JGI has existed, we used a single process for sequencing. In 2008, we added two other platforms and we increased production by 20-fold,” said DOE JGI Director Eddy Rubin while discussing the need for an Integrated Tracking System (ITS). “We used to be an in and out burger place; now we’re a Chinese restaurant with a million different dishes. We’ve changed what we’re doing and the variety of types of projects. We need to develop a tracking system to move forward.”

The ITS project is being led by Software Project Manager Steven Wilson, who noted that while several systems are in place to handle everything from proposal submission to post-production sequencing and annotation, the DOE JGI doesn’t as yet have a single-strategy scheduling component that can track a project’s status without using custom queries.

“What the system will be able to do is change the way information is presented,” said Wilson. “ITS is expected to improve the efficiency of the scheduling process with a better visualization and cost-tracking, improving the flexibility involved in bringing new types of projects and sequencing platforms into play.”

Software Project Manager Steven Wilson

DOE JGI's own Jim Bristow (center left) and Susannah Tringe (center right) appeared on a panel with Jay Keasling on September 28, 2009 at the Berkeley Repertory Theatre's Roda Stage. KTVU Channel 2 health and science editor John Fowler (left) moderated the talk entitled: “From the sun to your gas tank: A new breed of biofuels may help solve the global energy challenge and reduce the impact of fossil fuels on global warming,” discussing ways to convert the solar energy stored in plants into liquid fuels. A video of the talk can be viewed at http://www.youtube.com/watch?v=mRTwuxVurIE

Rector Magnificus Designatus Karel Luyben (on the right), Chancellor of the Delft University of Technology in the Netherlands, visited the DOE JGI on December 8, 2009 and was given a tour of the sequencing facilities by Sanger Coordinator Simon Roberts.

The second week of December 2009 brought a cold storm down from the north, dropping several inches of snow on the ridges surrounding Walnut Creek, Calif., including on the slopes of the 3,864-foot Mount Diablo, as seen from the front steps of the DOE JGI.


Fifth Annual Genomics of Energy & Environment
March 24-26, Walnut Creek, California

The 2010 Department of Energy Joint Genome Institute (DOE JGI) Genomics of Energy & Environment meeting will be held March 24-26 in Walnut Creek, California and will specifically emphasize the genomics of renewable energy strategies, biomass conversion to biofuels, environmental gene discovery, and engineering of fuel-producing organisms.

Confirmed keynote speakers include:
Rita Colwell, Distinguished Professor, University of Maryland and Johns Hopkins University Bloomberg School of Public Health
Jay Keasling, CEO, DOE Joint BioEnergy Institute

Scheduled speakers include: Cristina Cuomo, Broad Institute; Evan DeLucia, University of Illinois at Urbana-Champaign; Richard Flavell, Ceres; Steven Hallam, University of British Columbia; Dennis Hedgecock, University of Southern California; Madhu Khanna, University of Illinois at Urbana-Champaign; Steve Knapp, University of Georgia; Tom Mitchell-Olds, Duke University; Steve Moose, University of Illinois at Urbana-Champaign; Joseph Noel, Salk Institute for Biological Studies; Forest Rohwer, San Diego State University; Steven Savage, Cirrus Partners; Gary Stacey, University of Missouri; Jim Tiedje, Michigan State University; Adrian Tsang, Concordia University; Detlef Weigel, Max Planck Institute for Developmental Biology; Alexandra Worden, Monterey Bay Aquarium Research Institute

Under the leadership of Fungal Genomics Program head Igor Grigoriev, DOE JGI hosted the Comparative Genomics of Thermophilic Fungi Jamboree December 2-4, 2009. This Jamboree brought together 21 collaborators to explore the genomics and biology of thermophilic fungi by comparative analysis of three genomes: the thermophiles Thielavia terrestris (below) and Sporotrichum thermophilum, and the mesophile Chaetomium globosum. Image courtesy of Adrian Tsang, Concordia University

Contact The Primer David Gilbert, Editor / [email protected] / (925) 296-5643
