Computational Tools to Battle HIV
Total Page:16
File Type:pdf, Size:1020Kb
Computational Tools to Battle HIV Bette T. M. Korber and Alan S. Perelson n 1981, AIDS, the acquired immune nations are projected to drop as much tion, and the human host ultimately deficiency syndrome, was initially as 20 years by 2010. Eurasia sits at the loses the race. The virus can quickly Idetected in a handful of gay men brink of the rapidly advancing storm. acquire drug-resistant mutations, partic- and reported in the Mortality and Russia, China, and India are facing bur- ularly when therapy is only partially Morbidity Report of the Centers for geoning epidemics with no cure and no successful, and the drug-resistant forms Disease Control. Once the disease had vaccine yet in hand. of the virus can be transmitted. This been defined and described, its pres- HIV has three peculiarities that rapid within-host evolution results in ence was rapidly recognized not only in make it a particularly challenging foe: extraordinary variation at the epidemic the homosexual community in the its latent period, its devastation of the level, and viruses from any two individ- United States but in other populations immune system, and its variability. It uals are quite distinct. and countries. By 1983, the etiological, has, on average, a decade-long latent The Theoretical Biology and or causative, agent was isolated: a retro- period during which infected individu- Biophysics Group at Los Alamos has virus1 called human immunodeficiency als carry the virus and can transmit the played an integral part in understanding virus, type I, or HIV-1. With that dis- virus but are not overtly ill; in fact, and defining the nature of both the covery, diagnostic tests soon became unless they are tested, they are likely to host-viral dynamics and the evolution available, and by the mid-1980s, it was be unaware of their infection. During of HIV. This article covers the history of clear that there was a new pathogen on this period, the immune system some of the ideas that were developed the loose in the world. It was soon rec- responds to the virus but is unable to by Los Alamos scientists in collabora- ognized that a global epidemic of terri- clear it, and the ability to make immune tion with experimental and theoretical ble magnitude was coming, although responses to any infection slowly dete- colleagues worldwide. the extent of the devastation now expe- riorates. Combinations of antiretroviral rienced in sub-Saharan Africa was therapies have been developed that can unimagined in those early days. The control viral replication and prolong The HIV-1 Databases: An World Health Organization established life and good health, but so far, these International Resource an international network drugs have not been able to clear the (www.unaids.org) to track this disease infection. Therapy is very expensive Los Alamos was the original home as it moved through populations, and and, because of side effects, can be dif- of GenBank, a database of all publicly through the years, it has maintained a ficult to take for years on end. available genetic sequences from virtu- grim record of HIV-1’s death toll and During the infection, HIV infects ally all organisms. GenBank is now spread. Current estimates indicate that and decimates CD4+ T lymphocytes, housed at the National Library of more than 65 million people have con- the very cells that are central to the Medicine at the National Institutes of tracted HIV-1, and 25 million of those immune response needed to counter a Health (NIH). During the mid-1980s, are already dead. Africa has been the viral infection. But the most serious HIV sequences began to be archived in most brutally affected. Life expectan- impediment to defeating the virus is its GenBank, arriving not only from the cies in some of the worst afflicted extraordinary variability. This virus United States and Europe, but from evolves during the course of every Africa as well. Gerald Myers, who was 1 A retrovirus is a virus that stores its infection. The immune system responds working at Los Alamos at the time, was genetic code in RNA rather than DNA, and inhibits one form of the virus, but a struck by the extraordinary diversity of and once in the host, copies its RNA into new form inevitably escapes from that the incoming HIV sequences. He real- DNA. The viral DNA is then incorporated into the host DNA and read to make new response. This cyclical response and ized, with great prescience, that given virus particles. evasion continues throughout the infec- this level of diversity, creating a vaccine 208 Los Alamos Science Number 28 2003 Computational Tools to Battle HIV was going to be a tremendous challenge and that the HIV research community would benefit from additional curatorial effort. He proposed to generate viral sequence alignments to enable compar- isons of similarities between the incom- ing and archived sequences, to correlate the sequences with any information about their gene products, to link sequences to additional information (for example, the health status of patients and geographic and phenotypic infor- mation), and to publish an annual com- pendium. Researchers with new HIV sequences could then quickly put them into the context of a global framework without having to extract and organize the information each time. The NIH, persuaded by Myers, subsequently funded a program to create the HIV sequence database. In the mid-1990s, Myers initiated databases for other pathogens, leaving the HIV project to Los Alamos researchers Bette Korber and Carla Kuiken. Sequencing technologies were Figure 1. Tools to Combat HIV rapidly improving, and HIV sequences Los Alamos is home to four databases that archive most of what is known about were being generated at an accelerating HIV and its associated immune responses, drug resistance, and vaccine trials. The pace. With a dedicated staff contribut- database website at www.hiv.lanl.gov also contains numerous analysis tools. This ing both computer and biological figure is a screenshot from a geography tool that provides a bird’s-eye view of viral expertise, the HIV sequence database diversity throughout a region, in this case the African continent. The pie chart indi- developed into an easily used, Web- cates the distribution of viral strains for different subregions. Clicking on an individ- accessible, relational database, which ual pie chart on the Web brings up more information about the local viral strains. Clicking on a strain accesses the sequence information contained in the database. now houses roughly 80,000 HIV sequences. Web search tools were added to retrieve data in new ways. For and added to the sequence database. It Alamos databases became an integral example, a researcher working on a allowed a researcher to integrate part of global HIV research efforts, pro- vaccine for South Africa can pull up an immune-response data with sequence viding a foundation for continuing sci- automated alignment of all regional variability data. In 1997, an HIV drug- entific work. sequences, organized by year of isola- resistance mutation database was tion, by subtype, or by phenotype (see added, and in 2002, a vaccine-trial data- Figure 1.) Pioneering computational base became part of the collection. The HIV, the Shape Changer tools were developed for handling such latter was developed by Los Alamos basic problems as detecting viral postdoctoral fellow John Mokili. It The database was developed to be recombination, hypermutation, and summarizes and allows direct compar- an international resource, but the inte- polymerase-chain-reaction contamina- isons of data sets from hundreds of vac- grated data it provides also serves as a tion, tools that served to improve vastly cine studies conducted in primates. The basis for our own research efforts, and the overall quality of the HIV scientific HIV database Web pages, with access we have used it to study the evolution literature. to all four databases, received more of the virus from many different per- In 1995, an immunology database than a million hits per month during the spectives. The exposed envelope pro- that contains information concerning peak use period in 2002, averaging tein of HIV (shown in Figure 2) is the HIV immune responses was created 35,137 pages accessed per day. The Los most antibody-vulnerable part of the Number 28 2003 Los Alamos Science 209 Computational Tools to Battle HIV virus. The envelope has two different protein components: gp41, which is locked into the outer membrane of the virus, and gp120, which is bound to gp41. The component gp120 projects from the surface of the virus, allowing HIV to recognize and bind to receptors on the human cell it infects. Gp120 is remarkably mutable, and sequences from different lineages, or subtypes, of HIV can differ in more than 30 percent of their amino acids. But even the high fraction of amino- acid changes does not reflect HIV’s true capacity to generate structural diversity. The virus has at least five mechanisms to alter its antigenic con- formation—the molecular shape that is relevant to an immune response—and thus exhibits an extraordinary (and Figure 2. Human Immunodeficiency Virus, Type 1 This illustration depicts the essential features of HIV-1. The virus stores its genetic unique) capacity to avoid triggering the information in two strands of RNA. These lie within a protein layer known as the immune system. Following are the five capsid, which is surrounded by the matrix protein. Also lying within the capsid is a mechanisms: metaprotein complex called pol, which contains four protein subunits: a reverse transcriptase, which will copy the RNA into DNA once the RNA has entered a host Mutational change through base cell; an integrase, which inserts the viral DNA into the host cell’s genome, an substitution.