Sequencing, Sequencing, Sequencing: BGI-Shenzhen Tackles the Cute, the Edible, and Pretty Much Everything Else
Computational Biology

BY ALISSA POH

BIO•IT WORLD, NOVEMBER | DECEMBER 2009, www.bio-itworld.com

SHENZHEN, CHINA. Boxing Day 2008: The Luohu border between Hong Kong and mainland China is crowded, smoky, and noisy. With my visitor's visa to Shenzhen in hand, I'm cleared to visit the Beijing Genomics Institute's (BGI's) Shenzhen-based sequencing facility. Accompanied by my father and brother-in-law, I wave down the nearest cab.

The cab driver doesn't have the faintest idea where BGI is located. As none of us can handle his thick Mandarin accent, I'm forced to call Zhuo Li, vice president of BGI's health care division. I hand the phone to the driver, and happily we're deposited at BGI's main entrance. It's a tall gray-and-glass structure, distinctly newer and shinier than the neighboring buildings. My companions head across the street for a late breakfast (frog legs), and I wander in to meet Li.

[Photo caption: BGI currently houses dozens of Illumina sequencers and scores of China's 'best and brightest'.]

The lobby lacks a smiling receptionist, tasteful paintings, or piped-in music. Save for supercomputers humming away within a glass-enclosed area and several ping-pong tables (naturally) in a corner, it's Spartan. My sense is BGI's staff wasn't going to spend any time on décor that they could otherwise devote to research.

Li is tall, lean, and intense. He greets me in immaculate English, and escorts me on a whirlwind tour of the institute. It's eleven floors of long hallways, each with its respective research unit (cloning, bioinformatics and the like) on one side, posters papering the opposite wall. Lab-coated staff are everywhere, poring over printouts, peering into cell-culture hoods, shuttling racks of test tubes from one lab to another. Most ignore me, apart from the occasional half-diffident glance.

BGI Beginnings

Li succinctly answers my questions, but as I discover, Chinese scientists are rather more close-mouthed than their Western colleagues. Getting them to elaborate beyond the facts is akin to tooth extraction.

It all started in August 1998 with the Human Genome Project, which geneticist Yang Huanming and three like-minded countrymen, all recently returned from U.S. postdocs, saw as the perfect way to position China on the genomics and sequencing stage. Yang's plan was to utilize the Chinese Academy of Sciences (CAS), as its Institute of Genetics already had its own Human Genome Center. But he quickly concluded that CAS, bound by traditions, was lagging behind the rest of the world. In early 1999, he broke away, setting up BGI as a private, non-profit research organization. A few months later, at a conference at the Wellcome Trust Sanger Institute in the U.K., Yang announced China's intention of becoming a global player in genomics.

Naturally, he was asked whether he had the money to realize his vision. As he later confessed to Science, he lied. Just four months after the conference, CAS funded three Chinese sequencing centers to tackle 1 percent of the human genome, with BGI receiving over half of the total award. But Yang didn't know it at the time, standing at the podium and looking out at a sea of skeptical faces. He gambled on the funds somehow materializing, figuring that what the audience didn't know couldn't hurt them, or BGI's image.

Three years later, the genomics world took notice when BGI metamorphosed from one man's intangible dream to the cover of Science, having outraced its global competition to shotgun-sequence the indica rice genome. A reform-minded China proved that the will to succeed, spirited nationalism, and sheer manpower can be a potent combination. BGI split its sequencing team into 12-hour shifts so the machines could run 24/7 for the 74 days it took to finish indica. Dispensing with the commute between workplace and home, staff catnapped in hallways or simply dozed in their chairs.

By 2002, BGI had outgrown its initial home and relocated to an industrial park in Beijing, with an additional campus in Hangzhou. The original Beijing unit assumed responsibility for all commercial and outsourcing projects, while the Hangzhou branch focused on sequencing and academic research. Then in 2007, BGI made a major investment in next-gen sequencing technology (Illumina's Solexa) and moved its headquarters to Shenzhen. The director is 33-year-old Jun Wang, a handsome, highly decorated Ph.D. from Peking University whose interest in genomics dates back a decade to the Human Genome Project.

The Chinese Way

New employees at BGI-Shenzhen don't need reminding about the institute's game plan. It's right in their faces: poster-style and of billboard proportions, spanning an entire hallway. Printed in giant font, dead center, is a four-word slogan: "Sequencing is the basic!" It's the foundation for moving into broader biological systems and processes (analysis of DNA variation and global methylation, protein networks, and metagenomics), ultimately providing individualized health care and agricultural advances.

Large-scale research is a mainstay of BGI-Shenzhen, and the Tree of Life project among its most prominent. Silkworms, cucumbers, chickens, and pigs are but a few examples of organisms large and small that the institute's scientists have already sequenced. On the wall-sized poster, they're lumped into three groups: animals, plants, and microorganisms. Animals are labeled economic (ducks, for instance); endangered (the Chinese river dolphin); or model (Drosophila). Similarly, microorganisms are categorized as industrial, pathogenic, or environmental. Projects past, present, and future are annotated, respectively, by red flags, green stars, and yellow circles.

BGI-Shenzhen is perhaps best known for the panda genome, as well as a Han-Chinese individual whose genome was but the third announced and published worldwide, after Watson and Venter. Back in February 2008, the institute launched its International Giant Panda Genome project, aiming to sequence and assemble the draft sequence within six months. The honor fell to Jingjing, the prototype for the Beijing Olympics panda mascot. The project was wrapped up by October. This ranked among China's top ten technology accomplishments for 2008, and is viewed as a major step toward understanding why pandas eat only bamboo and have poor libido. It also suggested that rather than being related to raccoons, they likely hail from the bear family.

The first (human) Asian sequence is the starting point for BGI-Shenzhen's Yanhuang project, so named for the Mandarin saying yan huang zi sun, or descendants of Yan and Huang, two emperors from ancient times that many Chinese consider their earliest ancestors. The institute has its sights set on sequencing at least 100 additional Chinese genomes, to better study genetic variations among China's different populations.

"It got a lot of media attention," Li says of the November 2008 publication in Nature. "Not long afterwards, we received RMB10 million [$1.46 million] from an anonymous Chinese donor. He's interested in decoding personal genetic information to improve biomedical research, and wants to help this project move forward."

Information gleaned from Yanhuang will contribute to the 1000 Genomes project, aimed at creating the most finely-tuned reference map of human genetic variation to date, down to the 1 percent level. BGI-Shenzhen is one of the key players in this undertaking.

Other initiatives include a strategic alliance, since early 2008, with Knome, George Church's personal genomics company. The latter gets prime access to BGI's capabilities in whole-genome sequencing, assembly, and annotation for its private clients.

[Photo caption: From top: The Beijing Genomics Institute, Shenzhen. An artist's representation of BGI's next-planned home in a Shenzhen industrial park currently under construction in nearby Enshan village. Supercomputers at BGI. (ALISSA POH)]

BGI-Shenzhen is also one of 13 academic and industrial participants in MetaHIT, a four-year project financed by the European Commission to study connections between genes of the human intestinal microbiota and our health, zooming in on inflammatory bowel disease and obesity. In addition, a Sino-Danish diabetes project involves deep-sequencing of exons and other conserved genomic regions from more than 4,000 individuals, in an attempt to discover genetic variations linked with obesity, type 2 diabetes and hypertension.

"We're a completely private organization, with an annual budget of [$30 million]," Li says. "So to feed ourselves and carry out all our projects, we rely on revenue from these collaborations, and our spin-off companies [ten in total]."

Blueprint of the Supercomputer Center

Requirement
Genomics is a typical data-intensive computational application. The sequencing platform currently generates over 10 Tb of raw data every day.

Milestones
• To the end of 2008: 20 Tflops (tera floating-point operations per second), 1 PB storage
• To the midyear of 2009: 50 Tflops, 5 PB storage
• To the end of 2009: 100 Tflops, 10 PB storage

System Architecture
• Computational capability: >100 Tflops, Linux cluster system, 4 ways x 4 cores CPUs and 32-64 RAM per node, ~500 computing nodes;
• Storage: 10 PB, large-scale parallel file system, high speed I/O;
• Network: 10 Gb computing Ethernet, 100 Mb management Ethernet;
• System: professional high-performance Linux cluster system and job management;
• Software: bioinformatics software development by our own team

Budget
RMB60 million ($8.79 million)

BGI-Shenzhen also benefits from the generous support of Shenzhen's munici-

er, with some of the brightest stars barely out of college. Designing novel analysis tools capable of handling short-read se-

handmade cloning (HMC) technology, a cheaper and simpler alternative, to produce transgenic pigs.