NATIONAL INSTITUTES OF HEALTH

National Library of Medicine

Programs and Services

Fiscal Year 2004

U.S. Department of Health and Human Services
Public Health Service
Bethesda, Maryland

National Library of Medicine Catalog in Publication

Z 675.M4 U56an
National Library of Medicine (U.S.)
National Library of Medicine programs and services. -- 1977- . -- Bethesda, Md. : The Library, [1978-
v.: ill., ports.
Report covers fiscal year.
Continues: National Library of Medicine (U.S.). Programs and Services.
Vols. for 1977-78 issued as DHEW publication; no. (NIH) 78-256, etc.; for 1979-80 as NIH publication; no. 80-256, etc.
Vols. for 1981- available from the National Technical Information Service, Springfield, Va.
ISSN 0163-4569 = National Library of Medicine programs and services.

1. Information Services – United States – periodicals 2. Libraries, Medical – United States – periodicals I. Title II. Series: DHEW publication ; no. 80-256, etc.

DISCRIMINATION PROHIBITED: Under provisions of applicable public laws enacted by Congress since 1964, no person in the United States shall, on the ground of race, color, national origin, sex, or handicap, be excluded from participation in, be denied the benefits of, or be subjected to discrimination under any program or activity receiving Federal financial assistance. In addition, Executive Order 11141 prohibits discrimination on the basis of age by contractors and subcontractors in the performance of Federal contracts. Therefore, the National Library of Medicine must be operated in compliance with these laws and executive order.

Contents

Preface

Office of Health Information Programs Development
  Outreach and Consumer Health
  International Programs
  Planning and Analysis

Library Operations
  Program Planning and Management
  and Management
  Vocabulary Development and Standards
  Bibliographic Control
  Information Products
  Direct User Services
  Outreach

Specialized Information Services
  Resource Building
  AIDS Information Services
  Outreach/User Support
  Research and Development Initiatives

Lister Hill Center
  Organization
  Training Opportunities at the Lister Hill Center
  Language and Knowledge Processing
  Image Processing
  Information Systems
  Research Infrastructure and Support

National Center for Biotechnology Information
  GenBank®: The NIH Sequence Database
  The Human Genome
  Model Organisms for Research
  Literature Databases
  The BLAST® Suite of Sequence Comparison Programs
  Other Specialized Databases and Tools
  Database Access
  Research
  Outreach and Education
  Biotechnology Information in the Future

Extramural Programs
  Resource Grants
  Training and Fellowships
  Research Support
  Pan-NIH Projects
  EP Operating Units: Highlights

Office of Computer and Communications Systems
  Executive Summary
  Customer Services
  Desktop Support
  Network Support
  Systems Support
  IT Security
  Policies and Product Standards
  Quality Management and Configuration Control
  Computer Room Facilities
  Consumer Health
  Professional Health Information
  Research and Development Efforts
  NLM® Web Support
  Outreach
  Administrative Support Systems

Administration
  Personnel
  NLM Committee Activities

NLM Organization Chart (inside back cover)

Tables

Table 1. Growth of Collections
Table 2. Acquisition Statistics
Table 3. Cataloging Statistics
Table 4. Bibliographic Services
Table 5. Consumer Web Services
Table 6. Circulation Statistics
Table 7. Online Searches: PubMed and NLM Gateway
Table 8. Reference and Customer Services
Table 9. Preservation Activities
Table 10. History of Medicine Activities
Table 11. Extramural Grants
Table 12. Financial Resources and Allocations
Table 13. Full-time Equivalents (Staff)

Appendixes

1. Regional Medical Libraries
2. Board of Regents
3. Board of Scientific Counselors/LHC
4. Board of Scientific Counselors/NCBI
5. Biomedical Library and Informatics Review Committee
6. Literature Selection Technical Review Committee
7. PubMed® Central National Advisory Committee
8. Organizational Acronyms and Initialisms Used in this Report

Preface

The National Library of Medicine is playing a role of growing importance on the national health information scene. The registration of clinical trials, “” for published government-sponsored research information, and creating and providing access to national health data terminology standards are three notable examples in Fiscal Year 2004. Among the advances described by NLM programs in this report:

• ClinicalTrials.gov received Harvard University's prestigious “Oscar” of government awards, the Innovations in American Government Award. ClinicalTrials.gov was created and is maintained by the staff of the Lister Hill Center.
• MedlinePlus® and MedlinePlus en español ranked 1st and 2nd among all U.S. government Web sites in the American Customer Satisfaction Index surveys. The rate of page views of MedlinePlus has more than doubled this year, to 498 million. MedlinePlus is maintained by the Division of Library Operations.
• In August 2004 PubMed reached its 15 millionth citation. That system continues to be a tremendous resource for medical research, logging some two million searches a day. The Library Operations Division and the National Center for Biotechnology Information share the responsibility for maintaining this vital system.
• This year NLM successfully launched NIHSeniorHealth.gov with a demonstration in the Congress. This new information resource is a joint effort of the NLM and the National Institute on Aging.
• The Specialized Information Services Division this year introduced several new services, including the Asian American Health Web site and the Wireless Information System for Emergency Responders (WISER).
• The Office of Computer and Communications Systems established the NIH Consolidated Collocation Site, which provides crucial backup capability for NIH’s extensive computer-based operations.

We at the National Library of Medicine are conscious of carrying on a tradition of 169 years of service to the medical community and to the nation. It is a responsibility we are proud to undertake and the accomplishments detailed in this report are the result. I extend my thanks to the Library staff and to the many advisors and consultants we rely on.

Donald A.B. Lindberg, M.D.
Director

Office of Health Information Programs Development

Elliot R. Siegel, Ph.D.
Associate Director

The Office of Health Information Programs Development is responsible for three major functions:
• planning, developing, and evaluating a nationwide NLM outreach and consumer health program to improve access to NLM information services by all, including minority, rural, and other underserved populations;
• conducting NLM’s international programs; and
• establishing, planning, and implementing the NLM Long Range Plan and related planning and analysis activities.

Outreach and Consumer Health

NLM carries out a diverse set of activities directed at building awareness and use of its products and services by health professionals in general and by particular communities of interest. Considerable emphasis has been placed on reducing health disparities by targeting health professionals who serve rural and inner-city areas. Additionally, starting in 1998, NLM has undertaken new initiatives specifically devoted to addressing the health information needs of the public. These projects build on long experience with addressing the needs of health professionals and on targeted efforts aimed at making consumers aware of medical resources, particularly in the HIV/AIDS area.

NLM Coordinating Committee on Outreach, Consumer Health and Health Disparities

This office has convened and is chairing the NLM Coordinating Committee on Outreach, Consumer Health and Health Disparities. This Committee plans, develops, and coordinates NLM outreach and consumer health activities. A number of the activities described below are conducted under the auspices of the Committee.

American College of Physicians Physician Information Prescription Project

Doctors often prescribe medication after seeing a patient. But what if that doctor also wants to direct the patient to up-to-date, reliable, consumer-friendly information about a health concern? The American College of Physicians (ACP) Foundation teamed with the NLM in FY2003 to create the “Health Information Prescription” program. Doctors in several pilot states were given customized prescription pads that they can use to point patients to first-rate online health information in NLM’s MedlinePlus database.

The Information Rx project was launched nationally on April 22, 2004, the opening day of the American College of Physicians Annual Session in New Orleans. The joint project has been tested in Georgia and Iowa by more than 500 ACP internists and their patients. Among a variety of feedback tools yielding important findings, pre- and post-tests found that 97 percent of the participating internists made information referrals, with 59 percent using the prescription pads for information provided by ACPF and NLM. Twenty percent of participating physicians also reported an increase in patients bringing Internet information to the office visit.

Internists who participated in the pilot programs said that MedlinePlus empowers patients (54 percent), explains difficult concepts and procedures (43 percent), and improves patient-physician communication (42 percent). The project was modified for the third stage of the pilot program in Virginia in March 2004 to partner with Virginia libraries; a toolkit to facilitate participation by libraries was developed.

Diabetes: Consumer Health Diabetes Projects

NLM is exploring the use of new information technologies to enable diabetes patients to manage their disease and avoid or delay the onset of costly and debilitating complications, especially patients from minority and medically underserved populations. In particular, we seek to learn how the use of NLM’s MedlinePlus web site, and other computer-based health information resources, can be helpful to patients, their families, and members of the public to learn about and understand the latest research news on diabetes, nutritional requirements, tests, devices, secondary prevention techniques, and for obtaining answers to patient-specific questions.

In the clinical setting, the principal hypothesis is that MedlinePlus can reinforce and supplement the information provided by physicians, nurses and health educators. A related hypothesis is that a combination of individualized training and access to publicly available computer resources at hospital libraries and elsewhere in the community can help reduce the health disparities experienced by minority populations that have less ready access to computer-based health information in the home, school and workplace than the majority population.

The goal is to develop, design, implement and evaluate a comprehensive program of diabetes-focused outreach initiatives in collaboration with academic health science centers and libraries, clinical
centers, community-based organizations and voluntary health organizations.

The latest initiative, in collaboration with the Upper Cardozo Health Center in Washington, D.C. and George Washington University, undertakes a controlled field experiment with patients enrolled in the Diabetes Health Disparities Collaborative wherein diabetes patients will receive an individualized information technology-based intervention to complement their regular patient education program. Education and training will be given to (a) physicians on how to incorporate the Information Prescription Project into their daily office practice routine; (b) medical assistants to show patients how to access and use MedlinePlus in English and Spanish; and (c) clerks on how to instruct patients to use MedlinePlus in waiting rooms during clinic hours. A wealth of routinely collected clinical patient data will give NLM a unique opportunity to provide objective evidence of the impact of MedlinePlus use on patient health outcomes.

Web Evaluation

The Internet and World Wide Web now play a dominant role in disseminating NLM information services. And the web environment in which NLM operates is rapidly changing and intensely competitive. These two factors combined suggested the need for a more comprehensive and dynamic NLM Web planning and evaluation process. The continuing Web evaluation priorities of the OCHD include: a) quantitative and qualitative metrics of web usage; and b) measures of customer perception and use of NLM Web sites. During FY2004, the OCHD continued to pursue an integrated approach intended to encourage exchange of information and learning within NLM, and help better inform NLM management decision-making on Web site research, development, and implementation. The year’s evaluation activities included: online surveys of users of select NLM web sites; several online focus groups; access to a syndicated telephone survey of the public’s online and offline health information seeking behavior; analysis of NLM Web site log data; and access to Internet audience measurement estimates based on Web usage by user panels organized by private sector companies.

Also during FY2004, OHIPD collaborated with other units of NIH to initiate a trans-NIH online user survey project based on the American Customer Satisfaction Index (ACSI), with significant funding support from the NIH Office of Evaluation. The project will extend the ACSI online user survey methodology to about 60 NIH Web sites at about 28 different NIH units, with expected benefits to both. The project will include multi-level evaluation objectives:
• strengthening of each participating component’s Web evaluation capability and activity;
• sharing of web evaluation learning and experience on a trans-NIH basis;
• aggregation of ACSI results and learning on a trans-NIH basis; and
• sponsoring an NIH-wide staff workshop that will highlight the contributions and challenges of the ACSI from the NIH perspective, consolidate lessons learned, and identify future directions and opportunities.

The project will be managed by the ACSI Survey Leadership Team consisting of representatives from NLM and several other NIH components who will work closely with representatives from all participating units to be convened under the auspices of the NIH Web Authors Group. The primary ACSI contractor will be ForeseeResults Inc. via an agreement between NLM and the Federal Consulting Group/Department of the Treasury. An evaluation contractor will be selected to provide consultant support to the ACSI Leadership Team and participating NIH units.

Tribal Connections

NLM has continued to focus on improving Internet connectivity and access to health information services in American Indian and Alaskan Native communities. Phase 1 (Pacific Northwest) and Phase 2 (Pacific Southwest) of Tribal Connections are complete. Also, NLM has funded a Phase 3, in which more intensive community-based outreach and training are being implemented at select Phase 1 and 2 sites to assess if these community-based approaches significantly enhance the project impacts on health information, behavior, and outcomes. The Phase 3 evaluation report is being prepared for publication in 2005. Also NLM has funded a Phase 4 in FY2004 in collaboration with the University of Utah (Midcontinental Regional Medical Library), emphasizing the development of Web-based tribal health information resources in the Four Corners Region (AZ, CO, NM, UT).

A major new initiative during FY2004 was the planning and implementation of a Native American Listening Circle Project. Listening Circles are a Native American tradition for encouraging dialog and discussion and developing trust among various parties, in this case NLM and representatives of tribes and Native groups. The objectives of the Listening Circles are to promote open dialog between NLM and tribal leaders, share perspectives on each other’s capabilities and needs, and identify opportunities for collaborative projects. The idea of Listening Circles was brought to NLM’s attention by Dr. Ted Mala, a member of the original Tribal
Connections advisory committee, former President of the Association of American Indian Physicians, currently Director of Traditional Healing for the Southcentral Foundation, and a current member of the NIH Council of Public Representatives. The Listening Circles are consistent with DHHS and White House guidance on Federal agency consultation and coordination with Tribal Governments.

For the Listening Circles, NLM contracted with Cindy Lindquist and the National Indian Women’s Health Resource Center to assist with planning and organizing a series of three Listening Circles. Dr. Mala served as a senior advisor. The Resource Center in turn involved local tribal and Native groups in the organizing of each Listening Circle, working in collaboration with the OHIPD. During 2003–2004, three Listening Circles were planned, organized, and implemented: one each in the Dakotas (with American Indians); Hawaii (with Native Hawaiians); and Alaska (with Alaska Natives). The NLM delegation for all three Listening Circles was led by Dr. Donald A.B. Lindberg, accompanied by Dr. Elliot R. Siegel, Dr. Fred B. Wood, Ms. Gale Dutcher, and Dr. Rob Logan. The key facilitators were Ms. Lindquist and Dr. Mala working with local Native organizations.

Also, in 2004 the OHIPD again partnered with NIH and NLM Equal Employment Opportunity offices to participate in the NIH American Indian Pow-Wow Initiative. This included exhibiting at eight pow-wows in the Mid-Atlantic area including one pow-wow each in New Jersey and Pennsylvania and two in North Carolina. Also OHIPD participated with NIH in the Gathering of Nations Powwow in Albuquerque, NM. An estimated 24,000 persons visited the NLM booth over the course of these pow-wows. These activities proved to be another viable way to bring NLM’s health information to the attention of segments of the Native American community and the general public.

Outreach to Hispanics

The Lower Rio Grande Valley Hispanic Outreach Project was a collaboration with the University of Texas at San Antonio Health Sciences Center to conduct a needs assessment and various health information outreach projects with Hispanic-serving community, health, and educational institutions. This was the beginning of an intensified NLM effort to meet the health information needs of the Hispanic population in Texas and elsewhere. The initial Lower Rio Grande Project is complete. Based on the project results, NLM has funded a series of follow-on projects focusing on outreach to Hispanic populations in South Texas. One follow-on project involves Hispanic residents of the Lower Rio Grande Valley who live in colonias. This project involves collaboration with Texas A&M University as well as the University of Texas at San Antonio, and its Regional Academic Health Center in Harlingen, TX. Also, OHIPD funded two projects building on the very successful pilot project at MedHigh in the Lower Rio Grande Valley, where high school peer tutors were trained and then in turn taught their peers about health information resources available from NLM. The pilot project received several awards for outstanding performance and service to the students and surrounding communities. The follow-on projects extend the MedHigh concept and collaborations to other high schools in the LRGV and in the San Antonio and Laredo areas.

International Programs

MIMCom: A Malaria Research Network for Africa

NIH has led an international effort to provide malaria researchers in Africa with full access to the Internet and the resources of the World Wide Web. This project began with NIH’s leadership in the Multilateral Initiative on Malaria (MIM) in which African scientists identified electronic communication and access to scientific information as critical in the fight against the devastating and economically debilitating effects of malaria in developing countries.

As a part of MIM, NLM, working in partnership with organizations in Africa, the U.S., the United Kingdom, and Europe, has created MIMCom.Net, the first electronic malaria research network in the world. Using satellite technology, the network provides full access to the Internet and the resources of the World Wide Web, as well as access to current medical literature, for scientists working in Africa. The African research sites are of recognized high quality, require improved communications to accomplish ongoing research, and have the necessary resources to purchase equipment and sustain the system.

The web site, http://www.nlm.nih.gov/mimcom, comprises links to Medline®, a variety of free online journals, databases, malaria-related sites, and general information. An NLM reference librarian serves as the webmaster and is expanding the site to include special news releases and articles of interest to researchers.

MIMCom has evolved in order to support the shifting needs of the malaria research sites in Africa. In 1998, the network started with a microwave link to the Internet in Bamako, Mali, and has since assisted 19 other sites in 12 countries. In the intervening years, the telecommunications revolution has moved forward: technology has changed, along with the Internet itself, the latter now bringing us spam and viruses undreamt of in the not too distant past. Conversely, where there was once little or
nothing in terms of telecommunications options, there are now, in some but not all instances, a number of players providing useful services, resulting in competitive pricing. Additionally, some sites have experienced dramatic growth and are no longer properly served by the system as it currently exists.

It has become clear from recent meetings of MIM-TDR and the MIM Funders Forum that the most practical next phase for MIMCom is to focus on medical informatics. We have begun to do this with the Antimalarial Drug Resistance Network; this network has set up a secure server so that researchers can share raw data and post data summaries. It addresses the need for supporting scientists in making innovative use of their new research tools to facilitate new ways of working together.

MIMCom Evaluation

Evaluation continues to be an integral part of MIMCom development. The results of examinations of the project by independent panels and individuals have been critical to assessing the project’s effectiveness. Evaluations have concluded that “creation of MIMCom has provided isolated scientists with tools that bring the whole world closer. Reliable communication with collaborators and vastly improved access to the scientific literature have both increased the reach of African scientists and facilitated their participation in the broader scientific world, especially by improving their potential to publish in world-class journals, a key part of being a mainstream scientist.”

Another evaluation found that “MIMCom is viewed by many (86%) as one of the most successful and important contributions of the MIM, and it is strongly recommended that MIMCom continues to expand the network to new sites and be involved in the subsequent steps of IT-training and management…. Further expansion and development of associated activities such as more access to online journals and forming web-based research networks was very much desired. The training of on-site IT specialists is identified as very important for the sustainability of a functioning local network.”

Another evaluation studied the effects of enhanced connectivity on professional performance of malaria research staff with the use of a Web-based questionnaire. In summary, taking into account the explanations and open answers given in response to the questionnaire, it can be said that enhanced connectivity is generally experienced as a positive contribution to professional performance. It gives access to a world of up-to-date scientific information, facilitates efficient communication, allows effective coordination of research activities and offers improved possibilities for capacity building.

NLM has provided the leadership, resources, and glue for this first phase of MIMCom and has achieved the goals of the initial mandate. With generous support from Swedish SIDA, more sites will be able to benefit from the telecommunications tool. The goal is to promote further capacity building, resulting in strong African research characterized by networks and sharing of information.

African Medical Journal Editors Partnership Project

“I want my journal to be successful, to be a flag bearer for Africa. How do I get there?”
James Tumwine, Editor, African Health Sciences

The African Medical Journal Editors Partnership Project is a collaboration with the Fogarty International Center in which International Programs is working closely with Library Operations. The objective is to create four partnerships between four African medical journals and journals from the U.S. (3) and U.K. (2) for the purpose of strengthening the African journals. NLM’s specific objective is strengthening African journals so that they are able to get into MEDLINE, and, as a result, make African research available to the world.

Partnerships have been initiated between editors of the following journals: Ghana Medical Journal and The Lancet; Malawi Medical Journal and the Journal of the American Medical Association; African Health Sciences and British Medical Journal; Mali Medical Journal, the American Journal of Public Health and the Environmental Health Perspectives/National Institute of Environmental Health Sciences journal.

Partnerships can encompass sharpening of business, editorial, and technical aspects of: editorial skill development; training for authors, reviewers, and editorial board; sharing manuscripts; joint commissioning of articles; exchange of editorial content; staff exchange for skills and experience sharing; increased publication of local research; survey of journal’s target audience.

Training

NLM continues to be active in the training of medical librarians, including programs in which the trainers train others. NLM participates in the biannual Association for Health Information and Libraries in Africa conference by offering workshops in PubMed training and sponsoring the travel of African librarians associated with the MIMCom project above. Two librarians from Vietnam came to NLM for training in indexing and MeSH®, so they could begin to make their own collection available to physicians and health workers in that country. NLM’s Associate Program has an international fellow who returns home with expertise and resources to carry out projects locally.

International Network Partnerships

OHIPD is pursuing strategies to develop international network partnerships. One initial area for exploration is international DOCLINE®. In FY2003, letters of invitation to join DOCLINE were sent to selected libraries in Mexico, following the 1.5 release of DOCLINE, which added Region 21 for Mexico. The NN/LM South Central Region, housed at the Houston Academy of Medicine–Texas Medical Center Library, is serving as Region 21’s Regional Medical Library in its initial phases. A number of Mexican libraries have joined and they are now able to add holdings to SERHOLD®, enabling them to share resources among themselves and border libraries in Texas and other U.S. libraries agreeing to reciprocal borrowing with Mexico.

In addition to supporting international libraries, international network partnerships can support the international research community through programs such as the Multilateral Initiative on Malaria. NLM can share its expertise in designing and implementing telecommunications capacity with scientists in developing countries, enabling researchers to communicate in a timely manner, access biomedical information resources and databases, and collaborate on proposal preparation and research implementation with colleagues in industrialized countries.

Global Internet Connectivity

End-to-end performance of the Internet, on both national and global scales, continues to be important to NLM in part because the Internet is the primary vehicle for promoting access to and dissemination of health information. This includes the further exploration of the methods and metrics needed to better understand the quality of Internet performance from the end user perspective. NLM is a leader in this field, and several other research and technical organizations now recognize the importance of end-to-end Internet performance. Additionally, NLM has implemented Phase I of its own Internet connectivity performance monitoring network, starting with select U.S. sites (the eight Regional Medical Libraries) but envisioned to extend to other U.S. sites and some international sites in the medium term.

International MEDLARS® Centers

Bilateral agreements between the Library and more than 20 public institutions in foreign countries allow them to serve as International MEDLARS Centers. As such, they assist health professionals in accessing MEDLINE and other NLM databases, offer search training, provide document delivery, and perform other functions as biomedical information resource centers.

AUSTRALIA
National Library of Australia

CANADA
Canada Institute for Scientific and Technical Information (CISTI)

CHINA
Institute of Medical Information, Chinese Academy of Medical Sciences

EGYPT
ENSTINET, Academy of Scientific Research and Technology

FRANCE
INSERM

GERMANY
German Institute for Medical Documentation and Information

HONG KONG
The Chinese University of Hong Kong

INDIA
National Informatics Center, Ministry of Information Technology

ISRAEL
Hebrew University

ITALY
Istituto Superiore di Sanita

JAPAN
Japan Science and Technology Corporation (JST)

KOREA
Seoul National University

KUWAIT
Kuwait Institute for Medical Specialization

MEXICO
Centro Nacional de Informacion y Documentacion sobre Salud

NORWAY
University of Oslo

RUSSIA
The State Central Scientific Medical Library

SOUTH AFRICA
South African Medical Research Council

SWEDEN
Karolinska Institute Library

UNITED KINGDOM
The British Library

PAN AMERICAN HEALTH ORGANIZATION
BIREME/PAHO
Centro Latino Americano e de Caribe Informacao em Ciencias da Saude

INTERGOVERNMENTAL ORGANIZATION
Science and Technology Information Center
Taipei 10636, Taiwan

International Visitors
In FY2004 the Office of Communications and Public Liaison (and HMD) arranged for 331 tours: 107 regular daily (1:30 p.m.) tours and 224 specially arranged tours. There were 6,141 visitors in all. They came from the following 67 countries:

Antigua, Australia, Bangladesh, Belgium, Bosnia, Botswana, Brazil, Burundi, Canada, China, Colombia, Côte d’Ivoire, Croatia, Cuba, Ecuador, El Salvador, England, Eritrea, Finland, Georgia, Germany, Ghana, Haiti, Hungary, India, Indonesia, Iran, Ireland, Jamaica, Japan, Kazakhstan, Kenya, Kyrgyzstan, Korea, Malaysia, Mali, Marshall Islands, Mexico, Federated States of Micronesia, Morocco, The Netherlands, New Zealand, Nicaragua, Nigeria, Norway, Pakistan, Palau, Peru, Philippines, Poland, Portugal, Romania, Russia, Serbia, South Africa, Sweden, Switzerland, Tajikistan, Thailand, Trinidad, Turkey, Uganda, United States, Uzbekistan, Vietnam, Zambia, Zimbabwe.

Planning and Analysis

The NLM Long Range Plan 2000–2005, published in 2000, remains at the heart of NLM’s planning and budget activities. Its goals form the basis for NLM operating budgets each year. All of the NLM Long Range Plan documents are available on the NLM web site.

Based on the Long Range Plan, OHIPD documents NLM’s progress in achieving its goals for a variety of purposes, including the Government Performance and Results Act (GPRA) and appropriations hearings, as well as NLM’s involvement in a variety of disease and policy-related areas.

In September 2004, the NLM Board of Regents approved the initiation of a new effort to develop a Long Range Plan for FY 2005–2010, and appointed a Board Subcommittee on Planning, co-chaired by Hon. Newt Gingrich and Dr. William Stead, to oversee this undertaking.

In addition to specific outreach and consumer health projects outlined below, OHIPD has overall responsibility for developing and coordinating the NLM Health Disparities Plan. This plan outlines NLM strategies and activities undertaken in support of NIH efforts to understand and eliminate health disparities between minority and majority populations. A new Health Disparities Plan for FY2004–2008 was prepared and is available on the NLM web site.

It is important for NLM to be able to describe and analyze its outreach, consumer health, and health disparities projects in order to identify areas of opportunity, report on their progress, and plan for new initiatives. A major activity of the OCHD is the implementation of a database of NLM outreach, consumer health, and health disparities projects. This database, which includes projects from all of the Regional Medical Libraries as well as NLM, is a major source of data for the National Outreach Mapping Center, which is seeking to use mapping as an aid to ensuring the effective distribution of outreach services by the NLM and the National Network of Libraries of Medicine.

In line with its other planning activities, this office worked with senior NLM and Association of Academic Health Sciences Libraries members to plan a major and well-publicized joint symposium on “The Library as Place: Symposium on Building and Renovating Health Sciences Libraries in the Digital Age,” held at NLM November 5–6, 2003. Post-symposium activities in 2004 have included preparation of a DVD “proceedings” and website.

Library Operations

Betsy L. Humphreys
Associate Director

NLM’s Library Operations (LO) Division is responsible for ensuring access to the published record of the biomedical sciences and the health professions. LO acquires, organizes, and preserves NLM’s comprehensive archival collection of biomedical literature; creates and disseminates controlled vocabularies and a library classification scheme; produces authoritative indexing and cataloging records; builds and distributes bibliographic, directory, and full-text databases; provides national backup document delivery, reference service, and research assistance; helps people to make effective use of NLM products and services; and coordinates the National Network of Libraries of Medicine to equalize access to health information across the United States. These basic services support NLM’s outreach to health professionals and the general public, as well as focused programs in AIDS, molecular biology, health services research, public health, toxicology, and environmental health.

Library Operations also develops and mounts historical exhibitions; carries out an active research program in the history of medicine and public health; collaborates with other NLM program areas to develop, enhance, and publicize NLM products and services; conducts research related to current operations; directs and supports training and recruitment programs for health sciences librarians; and manages the development and dissemination of national health data terminology standards. LO staff members participate actively in efforts to improve the quality of work life at NLM, including the work of the NLM Diversity Council.

The multidisciplinary LO staff includes librarians, technical information specialists, subject experts, health professionals, historians, museum professionals, and technical and administrative support personnel. LO is organized into four major Divisions: Bibliographic Services, Public Services, Technical Services, and History of Medicine; three units: the Medical Subject Headings (MeSH) Section, the National Network of Libraries of Medicine Office, and the National Information Center on Health Services Research and Health Care Technology (NICHSR); and a small administrative staff. The activities of all these components receive essential support from a wide range of contractors.

Most LO activities are critically dependent on automated systems developed and maintained by NLM’s Office of Computer and Communications Systems (OCCS), National Center for Biotechnology Information (NCBI), or Lister Hill National Center for Biomedical Communications (LHC). LO staff work closely with these program areas on the design, development, and testing of new system features.

Program Planning and Management

LO sets priorities based on the goals and objectives in the NLM Long Range Plan, 2000–2005, and the closely related NLM Strategic Plan to Reduce Racial and Ethnic Disparities. In FY2004, LO contributed to plans for developing a new NLM Long Range Plan for 2006–2016, under the auspices of the NLM Board of Regents. The actual planning sessions, which will involve outside experts and representatives of the Library’s many constituent groups, will begin in FY2005.

The current NLM Long Range Plan has a strong focus on the opportunities and challenges arising from electronic publishing and the role of the Web and the Internet in locating and accessing health information. In FY2004, LO continued to review and revise policies, procedures, services, and organizational lines to reflect shifting workloads; to use electronic information to enhance basic operations and services; and to work with other NLM program areas to ensure permanent access to electronic information.

Based on an analysis of work currently performed by the Web Management Team and the Reference and Customer Service Section, LO’s Public Services Division (PSD) initiated a reorganization that will merge these two units in FY2005. LO assisted OCCS in developing priorities and schedules for replicating key databases and data creation systems at the new offsite backup computer facility. LO assisted NCBI and the Office of the Director in their work on the development of an NIH Public Access policy involving PubMed Central® by providing a variety of analyses of the volume of articles emanating from NIH-funded research, current journal publication practices and subscription prices, etc. Many other specific projects undertaken to enhance access and handling of electronic information are described throughout this chapter.

In FY2004, LO focused considerable attention on working with other NLM program areas to meet the Library’s expanded responsibility for distribution of standard clinical vocabularies within the UMLS® Metathesaurus®. There were major changes and enhancements to UMLS development, distribution, and user support materials that are described in many sections of this chapter and in the LHC chapter. A decision was made to transition responsibility for the Metathesaurus production system from LHC to OCCS, with work to begin in FY2005.

Although many LO efforts are devoted to dealing with electronic information and supporting NLM’s high-priority outreach initiatives, LO must

also devote substantial resources and attention to the care and handling of physical library materials and to the space and environment for staff, patrons, and physical and electronic collections. In FY2004, LO continued to contribute to plans for a new NLM building and for interim arrangements for housing staff and storing collections until a new building is available. In November 2003, an NLM/AAHSL (Association of Academic Health Sciences Libraries) symposium on “The Library as Place,” held in NLM’s Lister Hill Center, examined trends, issues, and lessons learned in building and renovating libraries and highlighted the continuing need for physical library buildings in an increasingly electronic era. LO co-chaired the Organizing Committee for this highly successful symposium and worked with LHC to produce an interactive DVD version of the proceedings.

In FY2004, LO’s Administrative Office continued to assist managers, supervisors, and staff with the transition to a range of new administrative systems and the consolidation of human resources functions within the National Institutes of Health. LO continued to encourage staff to take advantage of flexiplace work arrangements as appropriate. Nearly 70 LO employees now work at home at least one day per week.

Collection Development and Management

NLM’s comprehensive collection of biomedical literature is the foundation for many of the Library’s services. LO ensures that this collection meets the needs of current and future users by updating NLM’s literature selection policy; acquiring and processing relevant literature in all languages and formats; organizing and maintaining the collection to facilitate current use; and preserving it for subsequent generations. At the end of FY2004, the NLM collection contained 2,482,585 volumes and 5,469,662 other physical items, including manuscripts, microforms, pictures, audiovisuals, and electronic media.

Selection
In FY2004, NLM completed a total revision of the Collection Development Manual of the National Library of Medicine. Prepared with advice from an external oversight committee chaired by Alison Bunting, former Chair of the NLM Board of Regents, the revised manual is available as an interactive Website, with links to related documents (e.g., the NLM preservation policy, joint NLM/Library of Congress/National Agricultural Library collection statements) and cooperating institutions (e.g., the Kennedy Institute of Ethics Library). The revision of the Collection Development Manual was informed by the deliberations of two working groups established by the Board of Regents to review NLM’s coverage (in its collection, the MeSH vocabulary, and its databases) of the fields of bioethics and of biomedical imaging and bioengineering. Both working groups found that NLM’s coverage of these subjects was generally very good, but indicated that improvements were needed to facilitate retrieval of information in these subject areas.

Acquisitions
The Technical Services Division (TSD) received and processed 156,515 contemporary physical items (books, serial issues, audiovisuals, electronic media), which is slightly below last year’s total. The increase in electronic publishing has not yet had a significant effect on the number of physical items that NLM acquires. Net totals of 27,101 volumes and 427,921 other items (including nonprint media and manuscripts and pictures acquired by the History of Medicine Division (HMD)) were added to the NLM collection. Eighteen libraries offered NLM gifts of retrospective literature; a total of 3,220 journal issues and 1,504 bound volumes were added to the NLM collection as a result.

LO uses subscription agents and vendors to acquire current literature published around the world. In FY2004, TSD awarded new blanket purchase agreements for monographs to seven U.S. and international vendors. To address the increasing workload and complexity of licensing electronic resources, TSD increased staffing devoted to this activity and worked with NLM’s Office of Administration to streamline procedures for reviewing and approving licensing terms.

HMD acquired a wide variety of important printed books, manuscripts and modern archives, images, and historical films during FY2003. Among the books were Fabricius ab Aquapendente, De Respiratione et Eius Instrumentis (Padua, 1615), a work which William Harvey argued against when he later developed his new theory of circulation; Juan Luis Vives, De Anima et Vita Libri Tres (Basel, 1538), a Renaissance work about the relationship of emotions to remembering and forgetting; Relatione dell’Esperienze Fatte In Inghilterra, Francia, ed Italia (Rome, 1668), an extraordinary collection of letters and documents arguing for and against blood transfusion; and Cecilio Follio’s Sanguinis a Dextro in Sinistrum Cordis Ventriculum Defluentis Facilis Reperta Via (Venice, 1639), a work contesting William Harvey’s new theory of circulation.

Archives and modern manuscript collections acquired included the Food and Drug Administration’s “Notice of Judgment Files,” a collection of 2,679 linear feet documenting fraud prosecuted by the FDA during the first half of the twentieth century. It is one of the largest and most significant additions to the Library in many years. An

index to the collection accompanied it. Other major collections added were the archives of the American Association for the Surgery of Trauma; the papers of Charles Johnson (Dean, Meharry Medical College); and papers of John retired Program Director of the Artificial Heart Program of the NIH’s National Heart, Lung, and Blood Institute. Large additions to several existing collections included the papers of James Bosma (audiology), Adrian Kantrowitz (heart-assist devices), and James Harvey Young (history of quackery and regulation of food and drugs). NLM rented space at the University of Maryland Health Sciences Library for temporary storage and processing of several large modern manuscript collections, including the FDA files.

New prints and photograph acquisitions included public health posters and medical ephemera donated by William Helfand, additional photographs by Martha Tabor, and fine art prints by Katherine Du Teil and Rosamond Purcell. New videos and films added included master video tapes of NIH lectures and a large collection of films from the National Hansen’s Disease Center, Carville, Louisiana. NICHSR and LHC collaborated to expand the collection of interviews with eminent researchers as part of the effort to document the history of health services research.

Preservation and Collection Management
LO carries out a wide range of activities to preserve NLM’s archival collection and make it easily accessible for current use. These activities include binding, copying deteriorating materials on more permanent media, conservation of rare and unique items, book repair, maintenance of appropriate environmental and storage conditions, and disaster prevention and response. LO distributes data about NLM’s preservation copies to avoid costly duplicate effort by other libraries. LO works with other NLM program areas to develop digital preservation techniques and to promote the use of more permanent media and archival-friendly formats in new biomedical publications.

In FY2004, LO reviewed and revised NLM’s preservation priorities. The great majority of the Index Medicus® and Index Catalogue titles that were identified as brittle in a 1985 survey of the Library’s print collection have been microfilmed. Although the amount of brittle paper in the NLM collection is still substantial, digitization is emerging as an acceptable alternative to microfilming as appropriate commitments, procedures, and systems for preservation of digital information are established at NLM and elsewhere. In addition, more recent surveys of the condition of NLM’s audiovisual and picture collections have highlighted the need to focus more attention on preservation of these materials. Taking these factors into account, LO reallocated preservation resources to support duplication of historical audiovisuals and increased conservation of prints and photographs. Microfilming will be limited to filling in gaps in filmed Index Medicus and Index Catalogue titles and to other monographs and serials volumes, such as NLM’s unusual collection of pre-Revolutionary Russian materials, that are so deteriorated that they are at risk of text loss.

In FY2004, LO bound 18,311 volumes, microfilmed 2,603 volumes, repaired 1,688 items in the onsite repair and conservation laboratory, made 808 preservation copies of films and audiovisuals, and conserved 4,305 prints and photographs and 202 other rare or unique items. Guidelines were developed for selecting post-1970 audiovisuals for duplication, and procedures were established for inspecting newly produced audiovisual copies. A total of 802,069 items were shelved or re-shelved, and about 40,000 duplicate unbound journal issues and 7,365 bound volumes were removed from the collection. Stricter inspection procedures were established for bags carried out of the NLM library building.

Permanent Access to Electronic Information
NLM’s approach to addressing the unique challenges of preserving electronic information is to use its own electronic products and services as test-beds and to work with other national libraries, the Government Printing Office, the National Archives and Records Administration, and other interested organizations to develop, test, and implement strategies and standards for ensuring permanent access to electronic information. LO collaborates with other NLM program areas on activities related to the preservation of digital information.

PubMed Central, a digital archive of medical and life sciences journal literature developed by the National Center for Biotechnology Information, is NLM’s vehicle for ensuring permanent access to electronic journals and digitized backfiles. LO assists NCBI in soliciting participation of additional journals, particularly in the fields of clinical medicine, health policy, health services research, and public health. In FY2004, LO negotiated the specific terms of an agreement with the Wellcome Trust and the Joint Information Systems Committee in the United Kingdom, which will recruit participation of additional journals and fund the digitization of the complete backfiles of these journals. Journals recruited to date are: Annals of Surgery, Biochemical Journal, Journal of Physiology, Medical History, Journal of the Royal Society of Medicine, and the British Journal of General Practice. Negotiations are under way with publishers of other titles.

In FY2004, LO’s Public Services Division continued to work closely with NCBI to scan and add to

PubMed Central digitized backfiles of journals currently depositing newly published articles in the archive. PSD prepares back issues for scanning, ships them to the scanning contractor, and manages the human review portion of the quality control of the scanned images, accompanying OCR data, and XML-tagged citations for articles that pre-date current MEDLINE/PubMed coverage. In the initial two years, 25,000 issues have been assembled for scanning, more than 1.8 million pages have been scanned, and 156,000 XML citations created. Since bindings are cut to make scanning more efficient, NLM does not use volumes from its archival collection in this effort, but solicits copies from publishers and other libraries. NLM is particularly grateful to the Marine Biological Laboratory in Woods Hole for donating complete back runs of several titles in FY2004.

NLM is using its own main Web site as a test-bed for procedures and mechanisms for ensuring permanent access to electronic information published by government agencies and private non-profit institutions. With the redesign of NLM’s main Web site in FY2004, the Library established an Archives section, which now includes outdated web pages that are important in documenting the history of NLM. Items in the archive are retrievable, but they are segregated and clearly labeled to avoid confusing users about what is currently applicable. In cases where archived items have been replaced by newer versions (e.g., fact sheets), there are “Replaced by” and “Previous version” links between them. When a new NLM Web document is created, a “permanence level” is assigned. Those designated as Permanent with unchanging or stable content will be transferred into the Archive if and when they become outdated. Procedures are currently in place for labeling and archiving html documents on NLM’s main web site, using the Teamsite web management software. LO and OCCS are developing mechanisms to handle other types of documents, e.g., PDF, and expect to expand the project to incorporate other NLM web sites in the coming year.

Vocabulary Development and Standards

LO produces and maintains the Medical Subject Headings (MeSH), a subject thesaurus used by NLM and many other institutions to describe the subject content of biomedical literature and other types of information; develops, supports, or licenses for U.S. use vocabularies designed for use in electronic health records and clinical decision support systems; and works with the Lister Hill Center to produce the Unified Medical Language System® (UMLS) Metathesaurus, a large vocabulary database that includes many vocabularies, including MeSH and several others developed or supported by NLM. The Metathesaurus is a multi-purpose knowledge source licensed by NLM and many other organizations in production systems and informatics research. It serves as a common distribution vehicle for classifications, code sets, and vocabularies designated as standards for U.S. health data.

LO represents NLM in federal initiatives to select and promote use of standard clinical vocabularies in patient records and administrative transactions governed by the Health Insurance Portability and Accountability Act of 1996 (HIPAA). In this capacity, LO staff members serve on the Department of Health and Human Services Data Standards Committee, provide staff support to the National Committee on Vital and Health Statistics (NCVHS) Standards and Security Subcommittee, and participate in the Public Health Data Standards Consortium. In FY2004, in recognition of the Library’s standards activities and expertise in health information technology, the Secretary of Health and Human Services (HHS) acted upon an NCVHS recommendation and designated NLM as the coordinating center for standard clinical terminologies. Funds were transferred to NLM from other HHS agencies to assist with these responsibilities. The Secretary also selected NLM as the operational home of the Commission on Systemic Interoperability, which was established by the Medicare Modernization Act of 2003 to develop a comprehensive strategy for the adoption and implementation of health care information technology standards that includes a timeline and priorities. The Commission is expected to release its final report by the end of calendar year 2005.

Medical Subject Headings (MeSH)
The 2005 edition of MeSH contains 22,568 main headings, 83 subheadings or qualifiers, 129 publication types, and more than 146,000 supplementary records for chemicals and other substances. For the 2005 edition, the MeSH Section added 487 new descriptors, replaced 129 descriptors with more up-to-date terminology, deleted 60 descriptors, and added 340 entry terms or “see” references. The 2005 vocabulary reflects work to reorganize and update the vocabulary for macromolecular substances, including polymers and multiprotein complexes, and intracellular signaling peptides and proteins. Important revisions or additions were made to the terminology for digestive system diseases, cardiomyopathies, endocrine system diseases, morphogenesis, reproduction, and a number of types of organisms. A number of foreign brand names for drugs were added to MeSH supplementary concept records.

MeSH is translated into many other languages by organizations around the world,

including a number of NLM’s international MEDLARS partners. In FY2004, LO and OCCS released the first production version of the Web-based MeSH translations database and maintenance system, which can be used by remote translators to improve the currency and accuracy of their translations. The system allows translators to view and translate new terms as they are added by the MeSH Section throughout the year rather than waiting until a complete new edition is released. Three organizations used the system to prepare updated editions of MeSH for 2005. The system is also being used to prepare new translations in additional languages.

In FY2004, the MeSH Section developed and published on the Web files that explicitly document the citation and maintenance procedures that were performed on the MEDLINE database as a result of implementing the new version of MeSH.

Clinical Vocabularies
The MeSH Section and its contractors also produce RxNorm, a clinical drug vocabulary that provides standardized names for use in prescribing. It is released within the UMLS Metathesaurus. RxNorm was designated as a U.S. government-wide target clinical vocabulary standard by the Secretary of Health and Human Services in 2004. It represents the information that is typically known when a drug is prescribed, rather than the specific product and packaging details that are available at the time a medication is purchased or administered, and provides a mechanism for connecting information from different commercial drug information services. In FY2004, RxNorm was linked to additional commercial drug terminology within the UMLS Metathesaurus, and NLM established an agreement with FirstDataBank for regular electronic data feeds to assist in keeping RxNorm up-to-date. LO and OCCS made significant progress on the development of a system that will permit NLM to issue more frequent additions to RxNorm, between editions of the UMLS Metathesaurus. Documentation for RxNorm was published on the NLM Website.

Through LO’s NICHSR, NLM supports the continued development and free distribution of LOINC® (Logical Observation Identifiers, Names, Codes) by the Regenstrief Institute. LOINC was designated as a U.S. government-wide target clinical vocabulary standard in 2003. NLM also manages and pays the annual update fees for the U.S.-wide license for Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT®). In FY2004, work continued on the NLM-commissioned project to examine the overlap between the non-laboratory sections of LOINC and SNOMED CT and to recommend strategies for reducing it.

There is widespread agreement that the existence of authoritative electronic mappings from standard clinical vocabularies to administrative code sets should facilitate automated production of bills and statistical reports as a by-product of the capture of detailed patient data. In FY2004, NLM defined assumptions and parameters for such mappings; enlisted cooperation from relevant federal agencies and private organizations; initiated projects to map LOINC to Current Procedural Terminology (CPT) and SNOMED CT to CPT and to the International Classification of Diseases, 9th edition, Clinical Modification (ICD-9-CM); and began discussions about mappings from SNOMED CT to the Medical Dictionary for Regulatory Activities and from Medcin to SNOMED CT. To be credible, mapping efforts must involve both vocabulary producers and intended users, undergo technical review and testing within real clinical systems, and establish effective mechanisms for keeping mappings up-to-date and responding to user feedback.

UMLS Metathesaurus
The MeSH Section and its contractors are responsible for content editing of the UMLS Metathesaurus, using systems developed by the Lister Hill Center (LHC). In FY2004, Metathesaurus editors accomplished the enormous task of editing the insertion of both the English and Spanish editions of SNOMED CT, the largest single vocabulary ever incorporated into the Metathesaurus. A number of other vocabularies, including several drug vocabularies, were updated in the Metathesaurus. At the close of FY2004, the Metathesaurus contained more than 1 million concepts and 3.8 million concept names from 113 source vocabularies. LO staff assisted LHC in completing the specifications for a new Metathesaurus Rich Release Format, provided in addition to the existing format, that allows completely accurate representation of all relationships present in source vocabularies and supports distribution of purpose-specific mappings between vocabularies. The Bibliographic Services Division coordinated a complete rewrite of the UMLS documentation to reflect the major changes in the Metathesaurus distribution format, a new licensing agreement, and use of the Unicode UTF-8 character set and associated software tools, and assumed full responsibility for publication of the documentation effective with the 2004AC release. (See further information about UMLS activities in the Information Products section of this chapter and the UMLS section of the Lister Hill Center chapter.)

Bibliographic Control

LO produces authoritative indexing and cataloging records for journal articles, books, serial titles, films,

pictures, manuscripts, and electronic resources, using MeSH to describe their subject content. LO also maintains the NLM Classification, a scheme for arranging physical library collections by subject that is used by health sciences libraries worldwide. NLM’s authoritative bibliographic data improve access to the biomedical literature in the Library’s own collection, in thousands of other libraries, and in many electronic full-text repositories.

Cataloging
LO catalogs the biomedical literature acquired or selected by NLM to document what is available in the Library’s collection or on the Web and to provide cataloging and name authority records that minimize the cataloging effort required in other health sciences libraries. Cataloging is performed by TSD’s Cataloging Section, staff in HMD, and contractors. The Cataloging Section is responsible for the NLM Classification, coordinates the development and maintenance of the standard NLM metadata schema for web documents, and also performs name authority control for selected NLM web services.

In FY2004, the Cataloging Section cataloged 21,238 contemporary books, serial titles, nonprint items, and cataloging-in-publication galleys, a 6% increase from the previous year. The Section began to provide name authority control for organizations represented in ClinicalTrials.gov, in addition to the similar service already provided for MedlinePlus.gov. With the implementation of the archives of outdated but important NLM Web documents, the Cataloging Section assumed responsibility for ensuring that all archived documents and those prospectively labeled as permanent have standard and complete metadata and are represented in the NLM catalog.

The Cataloging Section consolidated, expanded, and updated NLM’s policies for cataloging subject analysis and classification and published them on the NLM Web site. FY2004 was the first year that the new NLM Classification maintenance system was used to incorporate and validate changes to the Classification’s MeSH index. The new system allowed the 2004 edition to be released in April, a great improvement from the previous year.

Significant progress was made in providing cataloging records for NLM’s historical and special collections. HMD completed a two-year project to catalog a collection of 22,000 unbound pamphlets and also cataloged 152 early monographs, 2,785 pictures, 5,134 historical audiovisuals, and 423 linear feet of manuscripts. New Profiles in Science® Web sites were released for C. Everett Koop, former U.S. Surgeon General, and Wilbur A. Sawyer, a major figure in international public health in the first half of the 20th century. HMD also began a project to create “chapter” cataloging records for medical biographies present in the American National Biography, the Oxford Dictionary of National Biography, and the Dictionary of Canadian Biography.

Indexing
LO indexes 4,839 biomedical journals for the MEDLINE/PubMed database to assist users in identifying articles on specific biomedical topics. The indexing workload increases steadily, in part due to the selection of additional journals to be indexed, but primarily because of increases in the number of articles published in journals already being indexed. A combination of Index Section staff, contractors, and cooperating U.S. and international institutions indexed 571,000 articles in FY2004, a 9% increase from the previous year. Previously indexed citations were updated to reflect 54 retractions, 5,362 corrections, and 30,678 comments found in subsequently published notices or articles.

In FY2004, indexers created 33,444 annotated links between newly indexed MEDLINE citations for articles describing gene function in selected organisms and corresponding gene records in the NCBI LocusLink database. During the year, additional organisms were incorporated into the gene indexing process, the software supporting gene indexing was improved, and the Index Section participated in testing for the transition from LocusLink to the new Entrez Gene database. The new database will support gene indexing for almost any organism for which information is reported in the published literature.

The Index Section completed installation of dual monitors for all in-house indexers and began providing dual monitors to contract indexers to speed indexing from the electronic versions of journals. Dual monitors allow indexers to have simultaneous full-screen views of the online indexing system, which already includes multiple windows for the MeSH vocabulary, PubMed, etc., and the text of the article being indexed. In the case of journals with identical electronic and print versions, indexing from the electronic version frees the print version for immediate use in fulfilling onsite and interlibrary loan document requests.

In FY2004, the Index Section completed the basic data analysis for the indexing consistency study conducted last year and will use the data to establish a baseline for evaluation of continuing efforts to improve automated assistance to the indexing process. Indexer use of the MeSH headings suggested by the Medical Text Indexer system is gradually increasing, and preliminary data indicate that use of the system shortens the time required to train new indexers. Experiments with extracting certain data (e.g., grant numbers) from the full text of electronic articles indicate that there are other ways to reduce or eliminate certain tasks now performed by human

experts. LO is continuing to work with other NLM program areas to enhance the efficiency and effectiveness of its critical and very high volume indexing operation.

Indexers perform their work after the initial data entry of citations and abstracts has been accomplished. Over the past eight years, great strides have been made in improving the efficiency of data entry. In FY2004, 74% of the citations and abstracts were received from publishers in electronic form (the fastest and most economical method), up from 60% last year; 17% were created by scanning and optical character recognition (OCR); and 9% were double-keyboarded. The combination of increased electronic submissions and enhancements made by LHC to the scanning/OCR system led LO to discontinue the keyboarding contract in June 2004. (Keyboarding was the sole method of indexing data entry from 1967 to 1996.) A total of 315 publishers are now supplying XML-tagged electronic data for 2,966 journals.

NLM selects journals for indexing with the advice of the Literature Selection Technical Review Committee (LSTRC) (Appendix 6), an NIH-chartered committee of outside experts. In FY2004, LSTRC reviewed 473 journals and rated 95 of them highly enough for NLM to begin indexing them immediately. Another 92 titles ranked sufficiently highly to be indexed if their publishers are able to supply electronic citation and abstract data. Following up on the special studies of NLM’s coverage of bioethics and of biomedical imaging and bioengineering, the LSTRC reviewed additional journals in these subject areas. NLM implemented a new policy that indexing of electronic-only journals is contingent on their publishers having a credible strategy for ensuring their permanent availability. Deposit in PubMed Central is one way to meet this criterion.

NLM continues to work with the Fogarty International Center and the editors of a number of prestigious Western medical and public health journals to assist African editors in improving the quality of their journals. NLM’s role is to improve communications support for African editors so they can use the Internet to recruit authors and reviewers, communicate with editors in other countries, and otherwise become connected to the worldwide scientific journal community.

Information Products

NLM produces databases, publications, and Web sites that provide access to the Library’s authoritative indexing, cataloging, and vocabulary data and link to other sources of high quality information. LO works with other NLM program areas to produce and disseminate some of the world’s most heavily used biomedical and health information resources.

Databases

LO manages the creation, quality assurance, and maintenance of the content of MEDLINE/PubMed, NLM’s database of electronic citations; the NLM catalog, which is now available to the public in two different databases; MedlinePlus and MedlinePlus en español, NLM’s primary information resources for patients, their families, and the general public; and a number of specialized databases, including several in the fields of health services research, public health, and history of medicine. These databases are richly interlinked with each other and with other important NLM resources, including PubMed Central, other Entrez databases, ClinicalTrials.gov, Genetics Home Reference™, as well as SIS toxicological, environmental health, and AIDS information services.

In FY2004, LO made significant progress in ongoing efforts to provide online access to NLM’s retrospective bibliographic data. Following a multi-year effort, the Library released all five series of the monumental Index-Catalogue of the Library of the Surgeon General’s Office in an Encompass database available via the Web. (Encompass is a product of the Endeavor company.) Considered an essential resource for the history of medicine and science, Index-Catalogue contains more than 3.7 million entries for books, journal articles, theses, and pamphlets, including many not available in other NLM databases. NLM also extended the coverage of PubMed further back in time by adding 243,000 indexed citations from NLM’s 1950–1952 printed indexes. Both of these developments improve access to older literature that is newly germane to current health care, including works on smallpox, anthrax, and tuberculosis.

LO and NCBI collaborated to develop a new Entrez database, NLM Catalog, using the new XML catalog distribution format defined and generated by LO and OCCS. The NLM Catalog database was created to provide search capabilities that are not available in LocatorPlus™, the version of the catalog in the Voyager integrated library system. LocatorPlus will continue to be used for cataloging, onsite circulation, and other library processing functions. The NLM Catalog database does not contain detailed holdings data or provide MARC-formatted output, but it provides links to LocatorPlus for these features.

Use of MEDLINE/PubMed increased to 678 million searches in FY2004, a 35% increase from the previous year, most directly in PubMed and some via the NLM Gateway. Page views totaled 2.5 billion, 39% more than last year. Google is now indexing selected PubMed content, which has contributed to
the growth. MEDLINE/PubMed now includes more than 15 million citations. BSD staff assisted NCBI with design, development, and testing of many enhancements to PubMed and also worked with LHC on the development and testing of many new features in the NLM Gateway. PubMed’s Clinical Queries page was updated to reflect refined search strategies developed by Brian Haynes and colleagues at McMaster University. Beta versions of PubMed filters that facilitate retrieval of evidence on cost and outcomes of health services were made available via the NICHSR web site.

Use of MedlinePlus and MedlinePlus en español also continued to increase dramatically. Almost 52 million unique visitors viewed a total of half a billion pages. The number of page views more than doubled and the number of visitors more than tripled in comparison to the previous year. More than 42,000 people subscribe to the weekly announcements of new additions to MedlinePlus content. MedlinePlus and MedlinePlus en español ranked 1st and 2nd among all U.S. government Web sites in the continuous American Customer Satisfaction Index (ACSI) surveys. Last year, PSD worked with SIS and the Office of Health Information Programs Development to obtain NIH evaluation funding for NLM participation in the ACSI program and to implement it as a test for the rest of NIH. Yahoo decided to use an XML file of MedlinePlus health topics, in both English and Spanish, to promote MedlinePlus search results above others due to the quality and authority of the content.

PSD and OCCS continued to expand and improve the content and features of the English and Spanish sites. Forty-seven new health topic pages were added to MedlinePlus to bring the total to 677; 38 were added to MedlinePlus en español for a total of 625. Fifteen new interactive tutorials were added in both languages. Other new features included “Find a Hospital” based on the American Hospital Association database and pages that provide access to all easy-to-read and low vision materials and to English and Spanish materials. “Go Local” was expanded to include a Missouri site that assembles community health service information. NLM released a new Go Local input system for those who wish to build NLM-hosted Go Local sites. A number of groups are actively entering information about local health service Web pages and more Go Local sites are expected to debut by mid-2005. An innovative “talking” version of NIHSeniorHealth was released with additional features and more topics provided by several NIH Institutes.

Under the direction of NICHSR, NLM continues to expand and enhance its databases for health services researchers and public health professionals. In FY2004, NICHSR worked with NCBI to move the entire contents of HSTAT (Health Services and Technology Assessment Text) to the Entrez systems, as part of the Bookshelf. This allows more robust linking between HSTAT documents (including all evidence reports produced by the Agency for Healthcare Research and Quality, CDC’s Guide to Preventive Services, etc.) and MEDLINE/PubMed, PubMed Central, and other Entrez databases. NICHSR continued to work through AcademyHealth and the Sheps Center at the University of North Carolina, Chapel Hill to expand the content of HSRProj (Health Services Research Projects) to incorporate work funded by additional foundations and states. Organizations contributing data for the first time in FY2004 included the Idaho Department of Health and Welfare and the states of Kansas and Utah. The HSRR (Health Services and Sciences Research Resources) database also continued to expand to cover additional datasets, surveys, other research instruments, and software packages used with datasets. Among the new resources added were the Health Utilities Index, American Stop Smoking Intervention Study, and the National Children’s Study.

HMD is also expanding the Entrez Bookshelf through the “Medicine in the Americas” project, which provides scanned historical American medical books and searchable versions of the texts. In another database effort, HMD has established History of Medicine: Online Syllabus Archive, which already includes 130 syllabi from more than 50 educational institutions in many countries. This new resource has been received with enthusiasm by educators.

Machine-Readable Data

NLM leases many of its electronic databases to other organizations to promote the broadest possible use of its authoritative bibliographic, vocabulary, and factual data. There is no charge for any NLM database, but recipients must abide by use conditions that vary depending on the database involved. The commercial companies, International MEDLARS Centers, universities, and other organizations that obtain NLM data use them in many different database and software products for a very wide range of purposes.

Demand for MEDLINE/PubMed data in XML format continues to increase. At the end of FY2004, there were 290 MEDLINE licensees, a 32% increase from the previous year. The majority use the data for research and data-mining. LHC and BSD collaborated to produce statistical reports covering the content of the 2002, 2003, and 2004 MEDLINE baseline databases and published them via the Web for use by licensees and other researchers. NLM made its cataloging data available in XML format in FY2004, as an alternative to the MARC format
distribution, which has been available since the early 1970s. NLM also redistributed its Chinese cataloging records in MARC format, following completion of the project to add pinyin transliteration to them. A relatively small number of organizations license NLM catalog records or one or more of the SIS toxicological or environmental health files in XML format. Many users execute the online Memorandum of Understanding that permits FTP transfer of the MeSH files in XML, ASCII, or MARC format.

In FY2004, BSD, OCCS, and LHC completely revamped the procedures for licensing UMLS data to allow users to establish licenses via the Web. To obtain the 2004AA release, all UMLS users had to execute a new UMLS license (now applicable to the Metathesaurus only) that incorporates new language covering the terms related to SNOMED CT. As of the end of FY2004, there were 2,115 UMLS Metathesaurus licensees. DVD replaced CD as the hard media distribution mechanism due to the effect of SNOMED CT and the additional distribution format on the size of the Metathesaurus. UMLS users may also obtain the UMLS Knowledge Sources and related programs via download, through an application programming interface, or an interactive Web interface, all from the UMLS Knowledge Source Server. During FY2004, BSD staff began assuming a greater role in quality assurance of UMLS releases.

Web and Print Publications

NLM’s databases and Web sites are its primary publication media. Demand for the Library’s print publications has declined dramatically due to increasing electronic access to NLM data throughout the U.S. and around the world. Reflecting this situation, NLM decided to cease publication of the monthly Index Medicus, effective with the December 2004 issue, after 125 years of publication. (The annual Cumulated Index Medicus ceased publication in 2000.) Launched by John Shaw Billings in 1879, Index Medicus was for many years an indispensable tool for medical librarians, researchers, and practitioners. The desire to publish it in a more timely fashion was the impetus for NLM’s pioneering work in automation in the early 1960s, which provided the foundation for the development of MEDLINE in 1971. With the spread of the Internet, the printed Index Medicus has outlived its usefulness, but it will survive as a searchable subset within MEDLINE/PubMed. The “Black and White” MeSH, published as a supplement to Index Medicus since the 1960s, still receives considerable use as a search tool and will continue to be published in print.

PSD coordinated a complete redesign of NLM’s main home page and the secondary pages to which it refers, which debuted in May 2004. The new design is based on feedback from NLM’s various customer groups and problems identified in usability testing of the previous version. The new home page has a more flexible three-column format that accommodates news, allows NLM to highlight time-sensitive content, and leads to several different types of portal pages, e.g., for broad subject groupings such as “health services research and public health” and “environmental health and toxicology”; for particular audiences (e.g., public, health professionals, librarians); and for types of NLM services (e.g., training and outreach). In FY2004, NLM’s main web site received more than 48 million page hits from users at more than 7.9 million unique Internet addresses. The number of page hits increased 30% from the previous year; the number of unique IP addresses increased 68%.

In conjunction with major changes to the UMLS formats, associated programs, documentation, and licensing procedures, NICHSR consolidated two separate UMLS websites previously maintained by LHC and LO into one revised and expanded set of pages under NLM’s main web site. The new UMLS site has an expanded set of resources for UMLS users, including links to information about key UMLS source vocabularies, and prominent links to the UMLS Knowledge Source Server, which is accessible to UMLS licensees only.

Publications available from the main Web site include recurring newsletters and bulletins, fact sheets, technical reports, and documentation for NLM databases. In FY2004, TSD published the List of Serials Indexed for Online Users in XML format for the first time. It was previously available in PDF only. BSD’s MEDLARS Management Section edits the NLM Technical Bulletin, which provides timely, detailed information about changes and additions to NLM’s databases and related policies, primarily for librarians and other information professionals. Published since 1969, the Technical Bulletin also serves as the historical record of the evolution of NLM’s online systems and databases.

PSD’s Reference and Customer Service Section edits Current Bibliographies in Medicine, a series of special bibliographies on topics of current interest to NLM or other federal agencies. Topics covered this year included health literacy and distance education in public health. In FY2004, PSD reallocated some of the resources previously devoted to Current Bibliographies to other high priority activities, such as periodic systematic review of MedlinePlus topic pages. This change was possible because NIH can now obtain literature search and analysis services for its Consensus Development meetings from the Evidence Centers identified by the Agency for Healthcare Research and Quality (AHRQ). In the past, NLM produced bibliographies for most of these NIH meetings. (The Evidence Reports produced by AHRQ-funded centers are one
of the series that NLM makes available online in the HSTAT collection on the Entrez Bookshelf.)

Direct User Services

In addition to producing heavily used electronic resources, LO is responsible for document delivery, reference, and customer service for both onsite users and remote users. LO provides document delivery to remote U.S. users via the National Network of Libraries of Medicine (NN/LM).

Document Delivery

LO retrieves documents requested by onsite patrons from NLM’s closed stacks and also provides interlibrary loan as a backup to document delivery services available from other libraries and information suppliers. In FY2004, PSD’s Collection Access Section processed 631,806 requests for contemporary documents. HMD handled 10,031 requests for rare books, manuscripts, pictures, and historical audiovisuals.

The number of onsite users is declining due in part to security measures which make access to NIH facilities more time-consuming and cumbersome, but onsite use of NLM’s collection is still significant. Main Reading Room users requested 272,229 contemporary documents from NLM’s closed stacks, a 6% decline from last year. Users of the HMD Reading Room requested 8,618 items from the historical and special collections. Paid printing at Main Reading Room workstations increased 31% to 395,915 pages, reflecting significant use of the electronic journals NLM makes available to onsite users. In FY2004, PSD moved the onsite viewing stations for non-print media from the Learning Resource Center to the Main Reading Room. Materials previously shelved in the Learning Resource Center were relocated to the stacks or, in some cases, to the Main Reading Room. Given declining onsite use of non-print materials, the new arrangement provides better service for patrons, is more efficient for staff, and frees up space for other purposes.

The Collection Access Section received 359,577 interlibrary loan requests, a 1% decrease from FY2003, but was able to fill 13,000 more requests than last year. The improvement in fill rate (from 74% to 78%) was due to a collaborative effort with the Index Section and TSD to make more issues of titles indexed for MEDLINE available for document delivery and a major shelf-reading project directed by the Preservation and Collection Management Section. The percentage of requests processed within 12 hours of receipt increased from 80% to 92%. NLM now delivers 92% of interlibrary loan requests electronically. Relais, the system NLM uses to scan and transmit documents, was upgraded to support electronic delivery of documents to libraries behind firewalls. The purchase order for document delivery and first search services was recompeted and a new 5-year procurement awarded.

A total of 3,260 libraries use DOCLINE, NLM’s interlibrary loan request and routing system, which received a major interface redesign in FY2004. DOCLINE users entered 2.7 million requests in FY2004, a 5.5% decline from last year; 91% of the requests were filled. Although the absolute number of interlibrary loan requests received by NLM declined slightly in FY2004, the Library’s share of all DOCLINE requests continues to increase by about half a percent each year, reaching 13.3% in FY2004. Individuals submitted 809,673 document requests to DOCLINE users via the Loansome Doc® feature in MEDLINE/PubMed and the NLM Gateway, a 6% decline from the previous year. Document request traffic continues to decline in all Regions of the NN/LM due to expanded availability of electronic full-text journals.

In FY2004, NLM expanded and improved the mechanisms for alerting DOCLINE and Loansome Doc users when articles they intend to request are freely available either in PubMed Central or on the Web site of any LinkOut™ provider. This decreased the number of document requests entered by more than 15,000. NCBI and staff at the Regional Medical Libraries continued to promote the use of PubMed’s LinkOut for Libraries and “Outside Tool” as means for libraries to customize PubMed to display their electronic and print holdings to their primary clientele. The number of libraries participating in LinkOut increased 31% to 1,091.

DOCLINE requests are routed to libraries automatically based on automated holdings data. At the end of FY2004, DOCLINE’s serial holdings database contained 1,401,060 holdings statements for 53,850 serial titles held by 3,049 libraries. In FY2004, LO and OCCS implemented automated transfer of holdings data from OCLC to NLM for NN/LM members who requested this service. Transfer of holdings data from NLM to OCLC was established last year.

NLM and the Regional Medical Libraries continued to encourage network libraries to use the Electronic Funds Transfer System (EFTS), operated for the NN/LM by the University of Connecticut, as a mechanism to reduce administrative costs associated with ILL billing. During FY2004, EFTS participation increased 14% to 949 libraries. Participants receive either a single net consolidated bill or a net consolidated payment each month. In FY2004, NLM reviewed the policy for the national maximum charge that Resource Libraries in the NN/LM may levy on network members for filling ILL requests. As a result of this review, resource libraries have the option to conduct a formal study to determine if their actual
costs exceed the national maximum and to charge more if the results justify it. NLM has arranged for resource libraries to make use of the ILL cost study methodology developed by the Association of Research Libraries if they wish to do so.

Reference and Customer Services

LO provides reference and research assistance to onsite and remote users as a backup to services available from other health sciences libraries. LO also has primary responsibility for responding to inquiries about NLM’s products and services and how to make use of them. With contract assistance, PSD’s Reference and Customer Services Section responds to initial inquiries and also handles the majority of questions requiring second-level attention. Staff from throughout LO and NLM assist with second-level service when their special expertise is required.

A total of 107,939 inquiries (excluding spam) were received in FY2004, up 2% from FY2003. The number of onsite inquiries declined 12% to 36,649, reflecting the decline in the number of onsite users. The number of remote inquiries increased 11% to 71,113, with the overwhelming majority arriving via email. NLM uses the Siebel software, integrated with a telephone call system, to track remote inquiries and then applies data-mining tools to analyze and characterize customer service inquiries stored (without personal identifiers) in the Siebel database.

PSD also continues to develop the [...] of “Cosmo,” a virtual customer service representative built with the NativeMinds software designed to answer frequently asked questions about NLM’s programs, products, and services. In FY2004, Cosmo responded to 3,678 questions that were within his job description and answered 87% of them correctly, up from 72% last year. Questions that Cosmo can’t answer are now transferred, at the user’s request, to the Reference staff for response.

In FY2004, PSD conducted customer satisfaction surveys for its telephone and email reference service and revised all Reference and Customer Service fact sheets, FAQs, and Reading Room handouts to reflect “plain language” principles.

Outreach

LO manages or contributes to many programs designed to increase awareness and use of NLM’s collections, programs, and services by librarians and other health information professionals, historians, researchers, educators, health professionals, and the general public. LO coordinates the National Network of Libraries of Medicine, which attempts to equalize access to health information services and information technology throughout the United States; serves as the secretariat for the Partners in Information Access for the Public Health Workforce; participates in NLM-wide efforts to develop and evaluate outreach programs for under-served minorities and the general public; produces major exhibitions and other special programs in the history of medicine; and conducts training programs for health sciences librarians and other information professionals. LO staff members give presentations, demonstrations, and classes at professional meetings and publish articles that highlight NLM programs and services.

National Network of Libraries of Medicine

The NN/LM works to provide timely, convenient access to biomedical and health information for U.S. health professionals, researchers, and the general public, irrespective of their geographic location. With 5,666 full and affiliate members, the Network is the core component of NLM’s outreach program and its efforts to reduce health disparities and to improve health [...]. Full members are libraries with health sciences collections, primarily in hospitals and academic medical centers. Affiliate members include some smaller hospitals, public libraries, and community organizations that provide health information service, but have little or no collection of health sciences literature. LO’s NN/LM Office (NNO) oversees network programs that are administered by eight Regional Medical Libraries, under contract to NLM. (See Appendix 1 for a list of the RMLs.)

In addition to the basic NN/LM contracts and the Electronic Funds Transfer System, NLM funds subcontracts for four national centers that serve the entire network. The activities of one of these centers, the National Online Training Center and Clearinghouse at the New York Academy of Medicine, are described elsewhere in this chapter. The Outreach Evaluation Resource Center at the University of Washington provides training and consulting services throughout the NN/LM and assists in designing methods for measuring overall network programs and individual outreach projects. In FY2004, the Center focused on refining the strategy for measuring progress on network-wide outreach goals for 2001–2006: to bring NLM and NN/LM services to the attention of every public library system and every public health department in the U.S. The National Outreach Mapping Center at Indiana University in Indianapolis assists NLM in displaying the geographic distribution and impact of NN/LM programs and services. In FY2004, work continued on collecting uniform outreach encounter data from all Regions and providing a Web-based tool to the RMLs for use in generating outreach maps. The Web-Services Technology Operations Center (Web-STOC) provides ongoing technical
management of the NN/LM Web sites and also investigates, recommends, and directs the implementation of additional Web technology for teleconferencing, Web broadcasting, distance education, online surveys, etc.

In FY2004, as part of the mid-course evaluation of current NN/LM operations, review teams conducted in-person or audio site visits with the four Centers. Their reports, with recommendations for NLM, the RMLs, and the Centers, will be submitted in early FY2005 in time to be considered in the development of the statement of work for the 2006–2011 NN/LM contracts.

In addition to the work on the public library and public health department outreach goals, the RMLs and other network members conduct many special projects to reach under-served health professionals and to improve the public’s access to high quality health information. Virtually all of these projects involve partnerships between health sciences libraries and other organizations, including public libraries, public health departments, professional associations, schools, churches, and other community-based groups. Some projects are identified by individual RMLs through regional solicitations or ongoing interactions with regional institutions; others are identified by periodic national solicitations for outreach proposals issued simultaneously in all NN/LM regions. In FY2004, the NNO initiated a new type of outreach award, the community outreach partnership planning award, to allow health science libraries and community-based organizations to explore opportunities for productive collaboration prior to developing full-fledged outreach project proposals. In all, the NN/LM issued 73 subcontracts for outreach projects in FY2004 as a result of national solicitations. The projects target many rural and inner city communities and special populations in 32 states and the District of Columbia.

With the assistance of other NN/LM members, the RMLs do most of the exhibits and demonstrations of NLM products and services at health professional, consumer health, and general library association meetings around the country. LO organizes the exhibits at the Medical Library Association annual meeting, the American Library Association annual meeting, some of the health professional and library meetings in the Washington, DC area, and some distant meetings focused on health services research, public health, and history of medicine. In FY2004, NLM and NN/LM services were exhibited at 150 national, regional, and state meetings across the U.S. These exhibits highlight all NLM services relevant to attendees, not just those to which LO contributes. In FY2004, NLM implemented a new exhibit database to track this activity.

As a result of input from network members at site visits to the eight RMLs in 2002–2003, NLM and the RMLs established an NN/LM Hospital Internet Access Task Force in FY2003 to identify: barriers to access to the Internet in hospitals; best practices for achieving the twin goals of easy access to the Internet and appropriate security for hospital patient data; and actions the NN/LM and NLM might take to assist hospital libraries in overcoming barriers. In FY2004, the Task Force teamed up with the Hospital Libraries Section of the Medical Library Association (MLA), both to obtain information about barriers and best practices and to disseminate best practices. The Task Force arranged a special open forum on these issues at the MLA Annual Meeting in May 2004.

Also as a result of input from the site visits, NLM and the RMLs established an E-licensing Working Group to identify: state and local group licensing resources available to network members, model licensing language, best practices for negotiating licenses, and methods for disseminating the information to network members. The Working Group, which is also coordinating its efforts with MLA, will submit an initial report in early FY2004.

Partners in Information Access for the Public Health Workforce

The NN/LM is a key member of the Partners in Information Access for the Public Health Workforce, a collaboration initiated by NLM, the Centers for Disease Control and Prevention, and the NN/LM in 1997 to help the public health workforce make effective use of electronic information sources and to equip health sciences librarians to provide better service to the public health community. The Agency for Healthcare Research and Quality and the Medical Library Association are the two newest members, joining 10 other federal agencies and public health associations.

The NICHSR coordinates the Partners for NLM; staff members from the National Network Office, SIS, and the Office of the Associate Director for Library Operations serve on the Steering Committee, as do representatives from several RMLs.

The Partners Web site (phpartners.org) provides unified access to public health information resources produced by all members of the Partnership, as well as other reputable organizations. In FY2004, the Web site was migrated from an NN/LM server to one at the NLM. One of the most popular resources on the site is the Healthy People 2010 Information Access project, which includes evidence-based PubMed search strategies and links to MedlinePlus topics for Healthy People 2010 objectives. During FY2004, strategies were completed and tested for objectives in 11 more focus areas, bringing the total number of objectives covered to more than 400, with every focus area represented.
The Partnership also devoted considerable effort to the development of additional training resources. Public Health Information and Data: A Training Manual was developed by staff from the New York City Department of Health and Mental Hygiene, the Midcontinental Region of the NNLM®, the University of Michigan, NICHSR, and NNO and made available on the Web site in PDF format. A training course based on its content was conducted at the fall 2004 annual meeting of the American Public Health Association, and a Web-based version of the tutorial is under development.

Special NLM Outreach Initiatives

LO participates actively in the Library’s Committee on Outreach, Consumer Health, and Health Disparities and in many NLM-wide outreach efforts designed to expand outreach and services to the public as well as to address racial and ethnic disparities. In FY2004, the Office of the Associate Director and BSD continued to work with other NLM components, the American College of Physicians Foundation, and the NN/LM to launch the Information Rx project nationwide in April at the ACP Annual Session. Information Rx provides physicians with materials to write prescriptions for information from MedlinePlus for their patients. Prior to the national launch, BSD developed an online site for physicians to order their materials and an NLM Library Associate Fellow developed a web-based Information Rx Toolkit for librarians with guidance from NLM staff and input from NN/LM librarians. In FY 2004, a total of 1,450 physicians and librarians requested promotional products for the Information Rx initiative.

The Office of the Associate Director, LO, the NNO, and BSD continued to work with the American Library Association and Public Library Association (PLA) to improve public library awareness of MedlinePlus and MedlinePlus en español. The Office of the Associate Director participated in a panel session at the PLA biennial meeting which focused on web resources available for providing health information to multicultural populations. The Office of the Associate Director also serves on an ALA Advisory Committee for the “Be Well Informed @ Your Library” program, which is funding 10 public library systems to conduct seminars on health education issues. BSD staff continued to support a direct mail and library exhibit program to provide all public and health sciences libraries with materials to promote MedlinePlus. Over the three-year program, 6,318 libraries ordered materials and more than 2.2 million bookmarks were distributed to their readers.

LO staff members continue to be involved in NLM’s partnership with the SCIMATECH Academy at Wilson High School in the District of Columbia. In FY2004, LO provided summer employment and training opportunities for several students and teachers.

Historical Exhibitions and Programs

HMD directs the development and installation of major historical exhibitions in the NLM rotunda, with assistance from LHC and the Office of the Director. As an important part of NLM’s outreach program, the exhibitions are designed to appeal to the interested public, as well as the specialist, and to highlight the Library’s rich historical resources. The current exhibition, Changing the Face of Medicine: Celebrating America’s Women Physicians, debuted on October 14, 2003, with a gala opening program featuring remarks by Dr. Elias Zerhouni, Director of NIH, Dr. Donna Christian-Christensen, delegate from the Virgin Islands, and Dr. Antonia Novello, Commissioner for Health for the State of New York and former U.S. Surgeon General, and a performance by a string quartet using instruments made by pediatrics pioneer Dr. Virginia Apgar.

This well-reviewed exhibition features more than 300 women physicians, living and dead, selected with advice from an advisory committee of eminent physicians (both women and men), chaired by Tenley Albright, M.D., former chair of the NLM Board of Regents. Girls who might be interested in pursuing an M.D. degree are one of the principal audiences for the exhibition, which illustrates the wide range of careers open to women physicians and shows that women from all segments of U.S. society have excelled in the field. The exhibition has a Web site, http://www.nlm.nih.gov/changingthefaceofmedicine, which provides information about the women physicians in the exhibition and educational and professional resources for those considering a career in medicine. The “Share Your Story” section of the Web site encourages people to provide information about outstanding women physicians they have encountered, whether family members, mentors, or their own doctors. To date, more than 6,200 visitors have seen the exhibition at NLM and the accompanying Web site has received 880,000 page hits. The American Library Association and NLM are collaborating on the development of a traveling version, funded by the NIH Office of Research on Women’s Health and the Library.

Previous NLM exhibitions live on through heavily used Web sites, printed catalogs, DVD editions, or traveling versions. Excluding Changing the Face of Medicine, exhibition web sites received more than 4.6 million page hits in FY 2004. The traveling version of Frankenstein: Penetrating the Secrets of Nature continued its two-year tour of public, academic, and health sciences libraries across the United States under the auspices of the American

Library Association and garnered favorable publicity at every stop.

In addition to the major exhibitions in the NLM rotunda, HMD installs “mini-exhibits,” generally in the cases near the entrance to the HMD Reading Room. Mini-exhibits mounted in FY 2004 included: John Eisenberg: A Life of Service, 1946-2002; C. Everett Koop: From Pediatric Surgeon to Surgeon General; Time, Tide, and Tonics: The Patent Medicine Almanac in America; and Francisco Goya at the National Library of Medicine, an exhibition of NLM’s 13 Goya prints. The Exhibition program also produced a traveling ten-panel exhibit entitled An Odyssey of Knowledge: Medieval Manuscripts and Early Printed Books from the National Library of Medicine. It premiered at the International Congress of Medical History in Bari, Italy in September 2004 and will go on tour.

In November 2003, HMD hosted a major symposium on Visual Culture and Public Health, which featured presentations by invited scholars who drew heavily on NLM’s collections. Other historical programs include a monthly series of seminars by historical scholars and several special historical lectures organized by HMD in conjunction with the Diversity Council and the EEO Office. HMD also hosted a number of visiting historical scholars.

HMD staff members continued to present historical papers at professional meetings and to publish the results of their scholarship in books, chapters, articles, and reviews, including the recurring features “Voices from the Past” and “Images of Health” for the American Journal of Public Health, which often feature materials from the NLM collection.

Training and Recruitment of Health Sciences Librarians

LO develops online training programs to teach the use of MEDLINE/PubMed and other NLM databases to health sciences librarians and other information professionals; oversees the activities of the National Online Training Center and Clearinghouse (NTCC) at the New York Academy of Medicine; directs the NLM Associate Fellowship program for post-masters librarians; and presents continuing education programs for librarians and others in health services research, public health, the UMLS resources, and other topics. LO also collaborates with the Medical Library Association, the Association of Academic Health Sciences Libraries, and the Association of Research Libraries to increase the diversity of those entering the profession, to provide leadership development opportunities, to promote multi-institution evaluation of library services, and to encourage specialist roles for health sciences librarians.

In FY 2004, the MEDLARS Management Section (MMS) and the NTCC trained 948 students in 76 classes covering PubMed, the NLM Gateway/ClinicalTrials.gov, TOXNET®, and the UMLS. Experiments with remote broadcasts of online training sessions as a means of providing training in more locations were only partially successful, so NLM and the NN/LM are investigating other approaches to filling this need. An average of about 31,000 unique users visited the Web-based PubMed Tutorial about 40,000 times each month. Three new animated Viewlet tutorials were created for basic PubMed search features. The PubMed tutorial files were made available on NLM’s ftp server in response to a request from the Life Science Library, Academia Sinica, Taipei, Taiwan. The UMLS for Librarians course was revised to reflect the new Metathesaurus release format and greatly enhanced MetaMorphoSys program. LHC and MMS staff presented a revised UMLS tutorial for informaticians at MedInfo in San Francisco in September 2004.

The UMLS Courses are one of a number of NLM training courses useful in preparing librarians for new and expanded roles. LO and the NTCC assist NCBI in arranging network venues, scheduling, and publicizing the Introduction to Molecular Biology Information Resources class, which helps to prepare library-based bioinformatics specialists. NCBI also offers an advanced workshop for Bioinformatics Information Specialists at NLM. Both courses were developed and are taught by librarians who serve as bioinformatics specialists in universities and at NLM. NICHSR continues to add to its suite of courses on health services research, public health, and health policy.

The NLM Associate Fellowship program had 14 participants in FY 2004: six 2nd year Associates at sites across the country and eight 1st year Fellows, who completed their year at NLM in August 2004. Seven of the latter also chose to participate in the optional 2nd year of the program at sites across the country: the University of Massachusetts, Georgetown University, the University of New Mexico, Johns Hopkins University, the Centers for Disease Control and Prevention, the University of Texas at San Antonio, and the University of Washington. Seven new Fellows began a year at NLM in September, including one International Fellow from the Medical Library, University of Zambia.

NLM works with several organizations on librarian recruitment and leadership development initiatives. Individuals from minority groups continue to be underrepresented in the library profession and a high percentage of current library leaders will retire within the next 5 to 10 years. LO has provided support for scholarships for minority students

available through the American Library Association, the Medical Library Association, and the Association of Research Libraries (ARL). LO also supports the NLM/AAHSL Leadership Development Program, which provides leadership training, mentorship, and site visits to the mentor’s institution for an annual cohort of 5 mid-career health sciences librarians. AAHSL contracts with ARL for the leadership training portion of the program. Based on the success of the first two years of the initial three-year pilot, LO has decided to fund the program for an additional three years.

Table 1
Growth of Collections

Collection                      Previous Total   Added FY 2004       New Total
                                     (9/30/03)                       (9/30/04)

Book Materials
  Monographs:
    Before 1500                            588               3             591
    1501-1600                            5,938              25           5,963
    1601-1700                           10,221              13          10,234
    1701-1800                           24,637              18          24,655
    1801-1870                           41,424              36          41,460
    Americana                            2,341               0           2,341
    1871-Present                       727,462          14,834         742,296
  Theses (historical)                  281,794               0         281,794
  Pamphlets                            172,021               0         172,021
  Bound serial volumes               1,269,541          19,301       1,288,842
  Volumes withdrawn                   (80,483)         (7,129)        (87,612)
  Total volumes                      2,455,484          27,101       2,482,585

Nonbook Materials
  Microforms:
    Reels of microfilm                 137,442           4,625         142,067
    Number of microfiche               447,374           3,029         450,403
    Total microforms                   584,816           7,654         592,470
  Audiovisuals                          72,965           1,738          74,703
  Computer software                      2,243             132           2,375
  Pictures                              58,010           2,422          60,432
  Manuscripts                        4,323,707         415,975       4,739,682*
  Total nonbook                      5,041,741         427,921       5,469,662

Total book & nonbook                 7,497,225         455,022       7,952,247

* Equivalent to 2,708 linear feet.
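
The arithmetic behind Table 1 can be checked mechanically. The short Python sketch below is illustrative only and is not part of NLM's reporting process. It re-enters the Book Materials rows by hand and confirms that each row's previous total plus its FY 2004 additions equals its new total, and that the columns add up to the reported "Total volumes" line. The names `BOOK_ROWS`, `rows_that_do_not_balance`, and `column_sums` are hypothetical, and withdrawn volumes are entered as negative numbers on the assumption that the parentheses in the table denote subtraction.

```python
# Book Materials rows from Table 1, re-entered by hand:
# (label, previous total 9/30/03, added in FY 2004, new total 9/30/04).
# Assumption: parenthesized "Volumes withdrawn" figures are negative values.
BOOK_ROWS = [
    ("Before 1500", 588, 3, 591),
    ("1501-1600", 5_938, 25, 5_963),
    ("1601-1700", 10_221, 13, 10_234),
    ("1701-1800", 24_637, 18, 24_655),
    ("1801-1870", 41_424, 36, 41_460),
    ("Americana", 2_341, 0, 2_341),
    ("1871-Present", 727_462, 14_834, 742_296),
    ("Theses (historical)", 281_794, 0, 281_794),
    ("Pamphlets", 172_021, 0, 172_021),
    ("Bound serial volumes", 1_269_541, 19_301, 1_288_842),
    ("Volumes withdrawn", -80_483, -7_129, -87_612),
]

# The "Total volumes" line as printed in Table 1.
REPORTED_TOTAL_VOLUMES = (2_455_484, 27_101, 2_482_585)


def rows_that_do_not_balance(rows):
    """Return the labels of rows where previous + added != new total."""
    return [label for label, prev, added, new in rows if prev + added != new]


def column_sums(rows):
    """Sum the previous, added, and new-total columns across all rows."""
    return tuple(sum(row[i] for row in rows) for i in (1, 2, 3))


if __name__ == "__main__":
    print("Rows that do not balance:", rows_that_do_not_balance(BOOK_ROWS))
    print("Columns match reported totals:",
          column_sums(BOOK_ROWS) == REPORTED_TOTAL_VOLUMES)
```

The same two checks could be applied to the Nonbook Materials rows by substituting those figures.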

Table 2
Acquisition Statistics

Acquisitions                     FY 2002       FY 2003       FY 2004

Serial titles received            20,350        20,476        20,769
Publications processed:
  Serial pieces                  133,908       134,579       132,192
  Other                           22,274        24,523        24,323
  Total                          156,182       159,102       156,515
Obligations for:
  Publications                $5,802,023    $6,217,417    $6,942,747
  (For rare books)            ($446,039)    ($297,894)    ($300,831)

Table 3
Cataloging Statistics

                                 FY 2002       FY 2003       FY 2004

Completed Cataloging              21,419        19,927        21,238

Table 4
Bibliographic Services

Services                                     FY 2002     FY 2003     FY 2004

Citations published in MEDLINE               502,056     526,338     571,000
  For Index Medicus                          459,558     492,911     537,469
Journals indexed for MEDLINE                   4,538       4,697       4,839
Journals indexed for Index Medicus             3,834       3,994       4,189
Total items archived in PubMed Central        72,683     109,910     347,680

Table 5
Consumer Web Services

Services                                  FY 2002       FY 2003       FY 2004

NLM Web Home Page
  Page Views                           40,607,752    37,166,023    48,335,875
  Unique Visitors                       5,300,363     4,792,482     7,934,966
MedlinePlus
  Page Views                          116,335,454   214,127,932   498,702,940
  Unique Visitors                       9,594,429    16,356,444    51,724,958
ClinicalTrials.gov
  Page Views                           23,288,683    26,010,359    33,651,851
  Unique Visitors                       1,422,734     2,387,487     3,190,813
Genetics Home Reference
  Page Views                                  ***           ***     8,410,455
  Unique Visitors (daily average)                                      25,617
Household Products Database
  Page Views                                  ***           ***     7,096,664
  Unique Visitors                                                   1,364,649
Tox Town
  Page Views                                  ***           ***     1,732,336
  Unique Visitors                                                     365,383

Table 6
Circulation Statistics

Activity                     FY 2002     FY 2003     FY 2004

Requests Received            705,069     653,916     631,806
  Interlibrary Loan          373,292     363,352     359,577
  Onsite                     331,777     290,564     272,229

Requests Filled              539,274     511,032     510,751
  Interlibrary Loan          268,816     268,714     281,543
  Onsite                     270,458     242,318     229,208

Table 7
Online Searches: PubMed and NLM Gateway

                                 FY 2002         FY 2003         FY 2004

Total online searches        382,000,000     504,000,000*    678,000,000

*Corrected figure

Table 8
Reference and Customer Services

Activity                 FY 2002     FY 2003     FY 2004

Offsite requests          49,153      64,010      71,290
Onsite requests           48,395      41,774      36,649
Total                     97,548     105,784     107,939

Table 9
Preservation Activities

Activity                             FY 2002     FY 2003     FY 2004

Volumes bound                         25,609      15,646      18,311
Volumes microfilmed                    5,255       2,795       2,603
Volumes repaired onsite                1,542       1,285       1,652
Audiovisuals preserved                   283         500         795
Historical volumes conserved              66         111         197

Table 10
History of Medicine Activities

Activity                                 FY 2002       FY 2003       FY 2004

Acquisitions:
  Books                                      424           314           498
  Modern manuscripts                     840,000       498,750     5,516,000*
  Prints and photographs                   3,176         1,000         1,591
  Historical audiovisuals                  1,361            97           757

Processing:
  Books cataloged                            368           215        13,621
  Modern manuscripts cataloged           984,025       203,000       740,250**
  Pictures cataloged                           0         1,048         2,758
  Citations indexed                          846           856         5,134

Public Services:
  Reference questions answered            14,898        14,693        18,701
  Onsite requests filled                   6,870        16,163         8,618

*Equivalent to 3,152 linear feet
**Equivalent to 423 linear feet


Specialized Information Services

Jack Snyder, M.D., J.D., Ph.D.
Associate Director

The Toxicology and Environmental Health Information Program (TEHIP), known originally as the Toxicology Information Program, was established 35 years ago within the NLM’s Division of Specialized Information Services (SIS). Over the years TEHIP has provided for the increasing need for toxicological and environmental health information by taking advantage of new computer and communication technologies to provide more rapid and effective access to a wider audience. We continue to move beyond the bounds of the physical National Library of Medicine, exploring ways to point and link users to relevant sources of toxicological and environmental health information wherever these sources may reside. Resources include chemical and environmental health databases and Web-based information resource collections. The Division’s HIV/AIDS information initiative now includes several collaborative efforts in information resource development and deployment, including a focus on the information needs of other special populations.

The SIS Web server provides a central point of access for the varied programs, activities, and services of the Division. Through this server (http://sis.nlm.nih.gov), users can access interactive retrieval services in toxicology and environmental health, HIV/AIDS information, or special population health information; find program descriptions and documentation; or be connected to outside related sources. Continuous refinements and additions to our Web-based systems are made to allow easy access to the wide range of information collected by this Division. Web usage has continued to increase over the past year.

In FY2004 SIS continued to balance efforts to enhance and re-engineer existing information resources with efforts to provide new services in emerging areas. We further developed various prototypes that rely on geographical information systems, innovative access and interfaces for consumers, and graphical display of data from information sources. Highlights for 2004 include:

• WISER, or Wireless Information System for Emergency Responders, a tool designed to provide critical chemical information quickly and conveniently on a Personal Digital Assistant (PDA) for use by emergency responders, especially during the first 24 hours in a “hot-zone”;
• ITER, or International Toxicity Estimates for Risk, a resource that presents chemical risk information from authoritative groups worldwide, including the U.S. Environmental Protection Agency, the U.S. Agency for Toxic Substances and Disease Registry, Health Canada, the Dutch National Institute of Public Health and the Environment, and the International Agency for Research on Cancer, as well as independent parties whose risk values have undergone peer review;
• TOXMAP, a prototype system that uses maps of the United States to help users visualize data about chemicals released into the environment and easily connect to related environmental health information;
• ChemIDPlus Lite, a streamlined version of ChemIDPlus that allows users to retrieve relevant substance records simply by typing chemical names or registry numbers into a single search box;
• A new Special Topic Web resource page that provides information on education, careers, and outreach programs in toxicology and environmental health;
• A new Special Topic information portal devoted to issues affecting the health and well-being of Native Americans;
• Continued support of PAHO/NLM Disaster Preparedness Information Centers in Honduras, Nicaragua, and El Salvador;
• Expanded Native American outreach initiatives; and
• Continuing minority outreach activities with the Historically Black Colleges and Universities, United Negro College Fund Special Projects, and the National Medical Association.

Resource Building

The wide range of SIS resources related to toxicology and environmental health information, HIV/AIDS information, and special populations information includes many databases that are created or acquired as well as other services and projects.

The Household Products Ingredients Database (http://householdproducts.nlm.nih.gov) provides a Web resource for consumers that links brand name household products (more than 4,000) with their ingredient chemicals (more than 2,000) and potential adverse health effects. Information derived from manufacturer’s Material Safety Data Sheets and from SIS databases can provide answers to various questions, including: what chemicals are contained in specific brands and in what percentage; which

products contain specified chemicals; who manufactures a specific brand and how can that manufacturer be contacted; what are the potential acute and chronic health effects of the chemical ingredients found in a specific brand; what other information is available about such chemicals in the toxicology-related databases of the National Library of Medicine?

In FY2004, SIS released TOXMAP, a prototype system that uses maps of the United States to help users visualize data about chemicals released into the environment. TOXMAP integrates data from the EPA’s Toxics Release Inventory (TRI) with information about health effects, research citations, etc. found in TOXNET databases. Users can create nationwide or local area maps that show where chemicals are released into the air, water, and ground. TOXMAP also integrates data from other sources, such as demographic data from the Census Bureau. TOXMAP provides region-specific links to chemical and bibliographic information.

In FY2004, SIS also released WISER (Wireless Information System for Emergency Responders), designed to provide critical chemical information quickly and conveniently on a Personal Digital Assistant for use by emergency responders (first 24 hours in hot-zone). The application is being developed in partnership with the Agency for Toxic Substances and Disease Registry, using ATSDR Medical Management Guidelines for Acute Chemical Exposures, which were developed to aid emergency department physicians and other emergency health care professionals who manage acute exposure following chemical incidents. The WISER prototype has focused on approximately 400 agents found in the Hazardous Substances Data Bank, and current deployment plans include a user’s guide, a tutorial, evaluation methodology, and “in-field” testing.

ITER (International Toxicity Estimates for Risk) is a TOXNET data file that contains data in support of human health risk assessments. It is compiled by Toxicology Excellence for Risk Assessment (TERA) and contains over 600 chemical records with key data from the Agency for Toxic Substances & Disease Registry (ATSDR), Health Canada, National Institute of Public Health & the Environment (The Netherlands), U.S. Environmental Protection Agency, and independent parties whose risk values have undergone peer review. ITER provides a comparison of international risk assessment information in a side-by-side format and explains differences in risk values derived by different organizations. ITER data, focusing on hazard identification and dose-response assessment, is extracted from each agency’s assessment and contains links to the source documentation. Among the key data provided in ITER are ATSDR’s minimal risk levels; Health Canada’s tolerable intakes/concentrations and tumorigenic doses/concentrations; EPA’s carcinogen classifications, unit risks, slope factors, oral reference doses, and inhalation reference concentrations; maximum permissible risk levels; and non-cancer and/or cancer risk values (that have undergone peer review) derived by independent parties.

The Haz-Map database, released in 2002 at http://hazmap.nlm.nih.gov, is an occupational toxicology database designed to link jobs and hazardous job tasks to occupational diseases and their symptoms. It is a relational database of chemicals, jobs, and diseases that averaged nearly 20,000 queries per month in 2004. A user may search this occupational database by chemical agent, occupational disease, or job type.

ChemIDplus (Chemical Identification File) is an NLM online chemical dictionary that contains nearly 370,000 records, primarily describing chemicals of biomedical and regulatory importance, and is available to users on the Internet at http://chem.sis.nlm.nih.gov/chemidplus. ChemIDplus features include chemical structure search and display for over 200,000 chemicals, and hyperlinked locator fields that retrieve data for a given chemical from other resources such as TOXLINE®, MEDLINE or HSDB® as well as EPA and ATSDR. Over 15,000 records of regulatory interest collectively known as SUPERLIST are also available and hyperlinked in ChemIDplus. During FY2004 over 75,000 queries per month were made of this database. To assist with spelling errors, a chemical spell checker helps users retrieve substances more efficiently by chemical name. The checker, which can be instantly revised using the SIS DBMaint2 online update system, contains spelling indices for more than 1.3 million chemical names and synonyms. The database was enhanced by the addition of various new locators pointing to international resources. In FY2004, the new ChemIDplus “Lite” and “Heavy” systems were released with new capabilities, including a simpler Web front-end that does not require plug-ins for structure display, and an advanced version that allows numeric searching by acute toxicity data and effect, and chemical/physical properties.

The Hazardous Substances Data Bank (HSDB) continues to be a highly used resource, averaging 60,000–70,000 searches each month (a 5% increase over FY2003). Increased emphasis continues to be placed on providing more data on human toxicology and clinical medicine within HSDB, in keeping with

past recommendations of the Board of Regents’ Subcommittee on TEHIP. In 2004, there has also been a continued emphasis on adding to HSDB new chemicals with the potential for high toxicity and high human exposure. Approximately 100 new chemicals were added in 2004, including new pesticides, drugs, and environmental pollutants. The emphasis on the addition of new chemicals will continue in the coming year. Newer sources of relevant data are being examined for incorporation into new and existing data fields within the current 4,757 HSDB records. Special summary information is being prepared to allow easier presentation of information at a health consumer level. The process of developing a new Web-based system for HSDB creation, review, and maintenance is continuing. As part of this effort, a relational HSDB database was created and a new client-server interface was programmed to allow easier updates. The new maintenance system is now poised for integration with other new features, including numeric searching and automatic indexing.

The Toxicology Data Network (TOXNET), NLM’s information system providing database management for many of its toxicology files, has moved from a networked microprocessor environment to a UNIX-based platform (Solaris Version 2.6) on a SUN Enterprise 3000 computer. SIS continues to integrate this configuration with other database creation systems and Web access to them. Further refinements of the SIS search interface (http://toxnet.nlm.nih.gov) enhance the ability of users to simultaneously search HSDB, TOXLINE, CCRIS, Gene-Tox, DART®/ETIC, IRIS, TRI and ChemIDplus from one input screen. Based on recommendations from the Institute of Medicine, users are presented with a basic search screen with just a single input box for searching, with customized screens for more sophisticated users. These advanced features include Boolean searching and the ability to limit search terms to specific fields. Feedback from TOXNET user online surveys has provided a basis for current and future planning, and as a result, SIS will implement a chemical spellchecker, automated indexing, and a virtual meta-search tool during the coming years.

Alternatives to Animal Testing (ALTBIB): SIS continues to compile and publish references from the MEDLARS files that were identified as relevant to methods or procedures that could be used to reduce, refine, or replace animals in biomedical research and toxicological testing. Staff members search, edit, and categorize citations to create a true value-added resource in this field. The 22 bibliographies issued during the past ten years are available on the Internet through the SIS Web server, and the primary distribution mechanism for this project is now the Internet, through a new online resource named ALTBIB, which allows search access to all of the 7,595 citations organized from previous bibliographies. ALTBIB uses the TOXNET search engine, and is available at http://toxnet.nlm.nih.gov/altbib.html. A user may search by keyword, author, or one of the 16 subdivisions such as “Quantitative Structure Activity Studies.”

TOXLINE (Toxicology Information Online) is a large NLM bibliographic database traditionally produced by merging “toxicology” subsets from secondary sources. By the end of FY2004, the database included over 3 million citations to toxicology literature dating back to 1965. In 2004, users accessed standard journal literature in toxicology and environmental health as part of the enlarging MEDLINE database, while NLM continued to add journals in the area of toxicology and environmental health to MEDLINE to cover some of the literature formerly provided by outside sources. For the non-standard journal literature in this area, SIS further enhanced a Web-based system on TOXNET that allows efficient acquisition and updating of these components. Easy access to this TOXLINE Special database and to TOXLINE Core, the standard journal literature on PubMed, is available from the improved TOXNET user interface.

DIRLINE® (Directory of Information Resources Online) is NLM’s online directory of resources including organizations, databases, bulletin boards, as well as projects and programs with special biomedical subject focus. These resources provide information to users which may not be available from one of the other NLM bibliographic or factual databases. DIRLINE continues to receive a high level of use (nearly 7000 searches per month) through an interface that supports direct links to the Web sites of the organizations listed in the database, as well as direct e-mail connections. The quality and utility of the database continue to improve as duplicates have been eliminated through changes in policy and streamlining of maintenance. More than 1000 records were either revised or verified in FY2004. Health Hotlines, the always popular publication of health-related toll-free telephone numbers, has a recently updated Web version which also indicates the availability of Spanish-speaking customer service representatives and Spanish language publications from the resources listed.

The Toxics Release Inventory (TRI) series of files now includes on-line files TRI86 through TRI2002. These files remain an important resource for environmental release data and are a useful

complement to other SIS databases. Mandated by the Emergency Planning and Community Right-to-Know Act, these EPA databases contain environmental release data for air, water, and soil for over 600 EPA-specified chemicals. These files are used in the new SIS R&D project using a geographical information system, TOXMAP.

The Chemical Carcinogenesis Research Information System (CCRIS) continues to be built, maintained, and made publicly accessible at NLM. This data bank is supported by the National Cancer Institute and has grown to over 8,000 records. The chemical-specific data covers the areas of carcinogenesis, mutagenesis, tumor promotion, and tumor inhibition.

The Integrated Risk Information System (IRIS), EPA’s official health risk assessment file, continues to experience high usage and be very popular with the user community. EPA has had a version of IRIS on the agency’s Web page since 1996, and we will continue to consider how best to integrate our Web service with what EPA provides. IRIS now contains 540 chemicals.

The GENE-TOX file is built directly on TOXNET by EPA scientific staff. This file contains peer-reviewed genetic toxicology (mutagenicity) studies for about 3,200 chemicals. GENE-TOX receives a high level of interest among users in other countries.

The Developmental and Reproductive Toxicology (DART) database now contains over 240,000 citations from literature published since 1989 on agents that may cause birth defects. DART is a continuation of the Environmental Teratology Information Center backfile database. In FY2004, next generation DART consisted of two subsets: DART Core on PubMed, containing over 170,000 citations to the journal literature, and DART Special, containing nearly 70,000 citations to specialized resources (including meeting abstracts, books, technical reports) in this subject area. In FY2004, more than 500 new records were added, and easy access to DART Special and to DART Core was maintained at the new TOXNET interface. DART is funded by NLM, the EPA, the National Institute of Environmental Health Sciences (NIEHS), and the FDA’s National Center for Toxicological Research, and is managed by NLM.

The Environmental Mutagen Information Center (EMIC) database contains over 24,000 citations to literature on agents that have been tested for genotoxic activity. A backfile for EMIC (EMICBACK) contains over 75,000 citations to the literature published from 1950–1991.

Handheld computer devices known as Personal Digital Assistants (PDAs) are increasingly being used in the fields of toxicology and environmental health. Moreover, software applications covering specialized subject matter in these fields are increasingly being made available to PDA users. In an effort to provide information on the main technical and content features of selected applications, the SIS has undertaken an ongoing Review of PDA Applications in Toxicology and Environmental Health. Individual reports in the review series are usually based on free, downloadable demos. Each individual review typically covers the following topics: general information, intended users, authorship/data source, contents, navigation, requirements, application type/price, availability, useful web links, and updates.

AIDS Information Services

NLM remains as the project manager for the multi-agency AIDS Clinical Trials Information Service (ACTIS) and the HIV/AIDS Treatment Information Service (ATIS), which were merged in December 2002 into a service entitled “AIDSinfo.” This service provides access to AIDS-related clinical trials information (through Clinicaltrials.gov) and federally approved treatment guidelines. The contract for this service also provides support services for Clinicaltrials.gov.

Evaluation of the AIDSinfo service (accuracy monitoring) was completed in FY2004 with the goal of assisting federal agencies in determining the future direction of the service, including the web site. The number of Live Help interactions continues to grow; users of this service find it very helpful in learning to navigate and locate information on the AIDSinfo, ClinicalTrials.gov, and SIS web sites. The usage level of the consumer fact sheets also continues to grow; the number of PDF downloads for these documents averages more than 10,000 per month, and project staff continue to evaluate options for optimizing the guidelines documents for PDAs.

Other Interagency Initiatives

In FY2004, SIS personnel continued their leadership of the Interagency Tox-to-Consumer Initiative, which completed an Inventory of Federal Government Consumer Environmental Health Resources.

Evaluation Activities

With funding from the NIH Office of Evaluation, SIS is using the American Customer Satisfaction Index (ACSI) to evaluate user satisfaction with AIDSinfo and TOXNET. Starting in November 2004, the Index provides continuous results that evaluate all aspects of the user’s Web experience based on an online
survey. Results are benchmarked against other federal Web sites and against private industry, and the results are published quarterly in the mainstream and trade press. AIDSinfo and TOXNET show very strong results, especially for their primary users: toxicologists, physicians, chemists and scientists. In response to feedback from the survey, TOXNET has made changes to its home page, enhanced ChemIDplus content, and uses the survey data to decide priorities for site improvement. AIDSinfo used the survey results to guide a home page redesign, improve search function, and better address the HIV/AIDS information needs of students and the public. SIS has a lead role in using the ACSI at NIH and was instrumental in expanding its use to an additional 60 NIH web sites through a 2004–6 project funded by the Office of Evaluation.

Outreach / User Support

Special Population Web Sites: The Arctic Health web site (http://arctichealth.nlm.nih.gov), initially developed by SIS staff, is now updated by the University of Alaska at Anchorage; the Asian-American Health web site will now be updated with assistance from the Asian American Pacific Islander Health Forum, and the Native American Health web site has been released. These Web sites include relevant policy, legislative, and organizational information as well as organized links to health and environmental issues of concern to the designated population.

NLM-Tox-Enviro-Health-L listserv was created in June 2003 to send announcements-only about SIS’s toxicology and environmental health programs and resources. Messages sent to the nearly 1200 subscribers include lists of new chemicals added to Hazardous Substances Databank, announcements about the new Household Products Database, and new environmental health topics for consumers added to Tox Town or MedlinePlus. The MedlinePlus Environmental Health listserv, created in FY2004, now sends messages to nearly 1400 subscribers.

In FY2003, the Toxicology Information Outreach Panel (TIOP) evolved a new strategic plan and was renamed the Environmental Health Information Outreach Panel (EnHIOP). Dr. Henry Lewis, Dean of the School of Pharmacy at Florida A&M University, became Chair of the new group. The new EnHIOP includes representation from additional Historically Black Colleges and Universities (HBCUs) as well as from Tribal Colleges and Hispanic Serving Educational Institutions. In FY2004, the panel members met twice, and individual awards of $5000 were made to 15 of the institutions participating in EnHIOP.

SIS continued its health information training programs at national and regional meetings of the National Medical Association. These programs cover all of NLM’s online resources, including TOXNET, PubMed, ClinicalTrials.gov, and MedlinePlus.

In FY2004, SIS continued its support of the Regional Disaster Information Center for Latin America and the Caribbean (CRID) to strengthen the capacity to collect, index, manage, store, and disseminate public health and medical information related to disasters. The countries involved are Nicaragua, Honduras, and El Salvador. The main objective of this project is to contribute to disaster reduction by capacity building activities in the area of disaster-related information management. Selected libraries and information centers have been provided with the knowledge, training and technology resources in order to act as reliable information providers to health professionals and others in their countries. Through this initiative, the participating libraries and information centers have been strengthened in several areas:
• Technological Infrastructure (Internet connectivity and computer equipment)
• Information Management (Health science librarian training)
• Information Product Development (Digital Library, Web sites)
This project is also assisting SIS in developing models for collecting and exchanging health information in geographically isolated and disaster-prone environments and for handling non-traditional or unpublished literature, in this case on the health aspects of disasters.

SIS exhibited at over 40 conferences in FY2004. Several of these provided opportunities for presentations or workshops about NLM’s information resources. In addition, SIS hosted the UNCFSP e-Health Conference for HBCU’s, Empowerment for Health Information, in Bethesda, Maryland.

Research and Development Initiatives

To meet the mission of providing information on toxicology, environmental health, and targeted biomedical topics to the world, SIS has been developing new ways of presenting the world of hazardous chemicals in our environment to a wider audience. For example:

The ToxTown (http://toxtown.nlm.nih.gov) project explores how best to provide environmental health information to a general audience. ToxTown is an interactive guide to commonly encountered toxic substances, your health, and the environment. It uses color, graphics, sounds and animation to convey connections between chemicals, the environment, and the public’s health. Tox Town is designed to provide:
• Facts on everyday locations where toxic chemicals might be found
• Information about how the environment can affect human health
• Non-technical descriptions of chemicals
• Links to authoritative chemical information on the Internet
• Internet resources on environmental health topics.

Tox Town helps users explore an ordinary town or city or farm to identify its common environmental hazards. The city, town, or farm can be toured by selecting “Location” or “Chemical” links. Locations, like the school, home or office building, can be opened for cutaway views and for detailed information about potentially hazardous chemicals that might be found there, as well as for links to environmental health resources. Tox Town also offers some resources in Spanish (http://toxtown.nlm.nih.gov/espanol/).

ToxSeek provides a virtual meta-search tool for simultaneous searching of target information systems, displaying search results from targeted systems, and harvesting related concepts. This tool can be configured to define a set of target information/search tools, which for SIS are T&EH databases and searchable resources on the web. Testing of the prototype is underway and a beta version will be ready for public release in FY2005.

The World Library of Toxicology, Chemical Safety, and Environmental Health is designed to provide a web portal to global information resources in toxicology, chemical safety, environmental health, and allied disciplines. The World Library is being designed, developed, and maintained by SIS staff, and will provide a cyberhome for an ongoing participatory project in which voluntary representatives from participating nations provide crucial input and feedback to assure credible and high-quality sources of information. With support from the Fogarty International Center, this project is scheduled to release fully developed information resources from approximately 15 nations in FY2005.

The Automated Indexing Project for selected HSDB data fields continues to identify appropriate search terms to use in comparing retrieval performance of the MeSH-indexed and non-MeSH-indexed versions of HSDB. Retrieval testing and evaluation has begun, with further work to be completed in FY2005.

In these and other new initiatives, SIS continues to search for new ways to be responsive to user needs in acquiring and using toxicology and environmental health, HIV/AIDS, and other specialized information resources.

Lister Hill National Center for Biomedical Communications

Alexa T. McCray, Ph.D.
Director

The Lister Hill National Center for Biomedical Communications, established by a joint resolution of the United States Congress in 1968, is a research and development division of the U.S. National Library of Medicine. Seeking to improve access to high quality biomedical information for individuals around the world, the Center continues its active research and development in support of NLM’s mission. The Center conducts and supports research and development in the dissemination of high quality imagery, medical language processing, high-speed access to biomedical information, intelligent database systems development, multimedia visualization, knowledge management, data mining and machine-assisted indexing. An external Board of Scientific Counselors meets biannually to review the Center’s research projects and priorities. The most current information about Lister Hill Center research activities can be found at http://lhncbc.nlm.nih.gov/.

Lister Hill Center research staff are drawn from a variety of disciplines, including medicine, computer science, library and information science, linguistics, engineering, and education. Research projects are generally conducted by teams of individuals of varying backgrounds and often involve collaboration with other divisions of the NLM, other institutes at the NIH, and academic and industry partners. Staff regularly publish their research results in the medical informatics, computer and information science, and engineering communities. The Center is often visited by researchers from around the world.

The Lister Hill Center is organized into five major components. The work of each is described below. An organization chart with the names of Branch and Office Chiefs is on the inside back cover of this report.

Organization

Computer Science Branch

The Computer Science Branch (CSB) applies techniques of computer science and information science to problems in the representation, retrieval and manipulation of biomedical knowledge. CSB projects involve both basic and applied research in such areas as intelligent gateway systems for simultaneous searching in multiple databases, intelligent agent technology, knowledge management, the merging of thesauri and controlled vocabularies, data mining, and machine-assisted indexing for information classification and retrieval. Research issues include knowledge representation, knowledge base structure, knowledge acquisition, and the human-machine interface for complex systems. Important components of the research include embedded intelligence systems that combine local reasoning with access to large-scale online databanks. CSB research staff include the team that has developed NLM’s Gateway, the team that annually produces the Unified Medical Language System Metathesaurus, and the staff who coordinate the Center’s training programs. The most current information about the Computer Science Branch can be found at http://lhncbc.nlm.nih.gov/csb/.

Cognitive Science Branch

The Cognitive Science Branch (CgSB) conducts research and development in computer and information technologies. Important research areas encompass the investigation of a variety of techniques, including linguistic, statistical, and knowledge-based methods for improving access to biomedical information. Branch members actively participate in the UMLS project and collaborate with other NLM research staff in the Indexing Initiative project, the goal of which is to develop automated and semi-automated techniques for indexing the biomedical literature. The branch also conducts research in digital libraries and collaborates with NLM’s History of Medicine Division on Profiles in Science, a project to digitize the archival collections of prominent biomedical scientists. Several branch projects address the challenges involved in providing health information to consumers. ClinicalTrials.gov is an important resource for the public and, additionally, serves as a testbed for conducting consumer health informatics research, and the Genetics Home Reference provides complex information about genes and diseases to the public in easily understood language. The most current information about the Cognitive Science Branch may be found at http://lhncbc.nlm.nih.gov/cgsb/.

Communications Engineering Branch

The Communications Engineering Branch (CEB) is engaged in applied research and development in image engineering and communications engineering motivated by NLM’s mission-critical tasks such as document delivery, archiving, automated production of MEDLINE records, Internet access to biomedical multimedia databases, and imaging applications in support of medical educational packages employing digitized radiographic, anatomic, and other imagery. In addition to applied research, the branch also developed and maintains operational systems for
production of bibliographic records for NLM’s flagship database, MEDLINE. Research areas include content-based image indexing and retrieval of biomedical images, document image analysis and understanding, image compression, image enhancement, image feature identification and extraction, image segmentation, image retrieval by image content, image transmission and video conferencing over networks implemented via asynchronous transfer mode and satellite technologies, optical character recognition, and man-machine interface design applied to automated data entry. CEB also maintains archives of large numbers of digitized spine x-rays and bit-mapped document images that are used for intramural and outside research purposes. Information about the Communications Engineering Branch can be found at http://lhncbc.nlm.nih.gov/ceb/.

Audiovisual Program Development Branch

The Audiovisual Program Development Branch (APDB) conducts media development activities with several specific objectives. As its most significant effort, the branch participates in the Center’s research, development, and demonstration projects with high quality video, audio, imaging, and graphics materials. From initial project concept through project implementation and final evaluation, a variety of forms and formats of visuals are developed, and staff activities include image creation, editing, enhancement, transfer and display. Consultation and materials development are also provided by the branch for NLM’s other information programs. From applications of optical media technologies and teleconferencing to support for Web distribution, the requirement for graphics, video, and audio materials continues to increase in quantity and diversification of format. In addition to the development of new techniques and processes, the facilities and hardware infrastructure must reflect state-of-the-art standards in a rapidly changing field. Included within APDB is the Office of the Public Health Service Historian. The office preserves and disseminates information about the history of Federal efforts devoted to public health. The most current information about the Audiovisual Program Development Branch can be found at http://lhncbc.nlm.nih.gov/apdb/.

Office of High Performance Computing and Communications

The Office of High Performance Computing and Communications (OHPCC) serves as the focal point for NLM’s High Performance Computing and Communications (HPCC) activities. OHPCC coordinates NLM’s HPCC planning, research and development activities with Federal, industrial, academic, and commercial organizations while collaborating with Lister Hill Center research branches and NLM divisions in the development, operation, evaluation and demonstration of HPCC research programs and projects. In addition, OHPCC plans, coordinates, and administers the interagency HPCC research and development program. Office staff serve as NLM’s liaison to scientific organizations at all levels of national, state and international government on planning and implementing research in High Performance Computing and Communications. The major research activities of the Office center on the Visible Human Project®, NLM’s Next Generation Internet program, telemedicine, the HPCC Collaboratory, and the 3D informatics research program. The most current information about the Office of High Performance Computing and Communications can be found at http://lhncbc.nlm.nih.gov/ohpcc/.

Training Opportunities at the Lister Hill Center

Working towards the future of biomedical informatics research and development, the Lister Hill Center provides training and mentorship for individuals at various stages in their careers. The LHNCBC Informatics Training Program (ITP), ranging from a few months to more than a year, is available for visiting scientists and students. Each fellow is matched with a mentor from the research staff. At the end of the fellowship period, fellows prepare a final paper and make a formal presentation which is open to all interested members of the NLM and NIH community.

In FY 2004 the Center provided training to 53 participants from 17 states and 8 countries. Participants worked on projects in the areas of biomedical knowledge discovery, content-based image retrieval, consumer health systems research, document imaging, image database research, research, medical illustration, natural language systems, ontology research, hand-held technology, Web services research, user interface research, telemedicine, ubiquitous computing, Unified Medical Language System research, and visualization research. The Center continues to offer a successful NIH Clinical Elective in Medical Informatics for third and fourth year medical students. The elective provides an overview of the state-of-the-art of medical informatics in a lecture series by nationally and internationally known speakers, and offers an opportunity for independent research under the mentorship of expert NIH research staff. The program maintains its focus on diversity through participation in programs supporting minority students, including the Hispanic Association of Colleges and Universities and the National Association for Equal Opportunity in Higher Education summer internship programs. Established in 2001, the NLM Rotation Program continues to
grow. The eight-week rotation program for trainees from NLM funded Medical Informatics programs provides these individuals an opportunity to learn about NLM programs and current Lister Hill Center research. The rotation includes a series of lectures and the opportunity for students to work closely with established scientists and meet fellows from other NLM funded programs.

Additional information about Lister Hill Center training opportunities is available at the Center’s Web site under “Training Opportunities.” Interested individuals will find descriptions of each of the training programs, including specific application procedures.

Language and Knowledge Processing

Developing SPECIALIST, an experimental natural language processing system for the biomedical domain, is the focus of the Center’s natural language processing work. The SPECIALIST system includes several modules based on the major components of natural language: lexicon, morphology, syntax, and semantics. The lexicon and morphological component are concerned with the structure of words and the rules of word formation. The syntactic component addresses the constituent structure of phrases and sentences, while the semantic component seeks to extract biomedical content from text. All components of the SPECIALIST system rely heavily on the domain knowledge in the Unified Medical Language System Knowledge Sources.

Terminology Research and Services

Lister Hill Center research staff build and maintain the SPECIALIST Lexicon, a large syntactic lexicon of medical and general English terminology released annually with the UMLS Knowledge Sources. New lexical items are continually added to the Lexicon using a lexicon building tool, LexBuild, developed and maintained by the lexical systems research team. LexBuild allows researchers to enter items directly into a central database via a Web browser. A new version of LexBuild featuring internal checks to prevent common data entry mistakes and logical inconsistencies was deployed in FY2004. The FY2005 SPECIALIST Lexicon release tables will be generated entirely using the new LexBuild tool. The SPECIALIST Lexicon increased by over 32% to 242,000 lexical items in the FY2004 release. Lexical access tools are also distributed as open source resources with each UMLS release. During this past year the group also developed several tools to manage diverse vocabularies for a range of language and information processing purposes. The team recently achieved a significant milestone in providing customized UMLS data to several projects, including ClinicalTrials.gov, Profiles in Science, and the Genetics Home Reference. A number of internal tools were also developed to handle data customization. These incorporate UMLS updates and provide client applications with periodic releases of customized data and the latest terminology enhancements.

Semantic Knowledge Representation

Innovative methods for providing more effective access to biomedical information depend on reliable representation of the knowledge contained in text. The Semantic Knowledge Representation project develops programs that extract usable semantic information from biomedical text by building on existing NLM resources, including the UMLS knowledge sources and the natural language processing tools provided by the SPECIALIST system. Two programs in particular, MetaMap and SemRep, are being used to address a variety of problems in biomedical language and information processing. MetaMap maps noun phrases in free text to concepts in the UMLS Metathesaurus. The MetaMap Technology Transfer program (MMTx) is an exportable, Java-based version of MetaMap that runs under Windows, Mac OS X or Unix/Linux and is provided as a resource to the bioinformatics community. Users are able to create MMTx data files independently of the UMLS. MetaMap Technology Transfer source code is included in the MMTx release, and an error reporting and tracking system ensures that problems reported by users are effectively addressed.

SemRep is a tool that uses the Semantic Network to determine the relationship asserted between concepts developed in MetaMap. SemRep serves as the basis for ongoing research initiatives in biomedical information management, such as projects for extracting medical and molecular biology information from text, processing clinical data in patient records, and research in knowledge summarization and visualization. Recent enhancements to SemRep’s linguistic coverage include the addition of a mechanism for interpreting hypernymic propositions. Current work addresses arguments of nominalizations, comparative structures, and coordination of predicates. Semantic predications produced by SemRep serve as the basis for continued work in automatic abstraction summarization of biomedical text, including MEDLINE citations and an online encyclopedia. SemGen, a modification of SemRep, is being developed for identifying and extracting semantic propositions on the causal interaction of genes and diseases from MEDLINE citations. Project staff are also developing methods for automatically suggesting appropriate images as illustrations for anatomically oriented text.

Indexing Initiative

The Indexing Initiative investigates concept-based indexing methods for the automatic selection of subject headings in both semi-automated and fully automated indexing environments at the NLM. The goal of the Indexing Initiative is to obtain retrieval performance equal to or better than performance of systems using manually assigned index terms. A prototype indexing system for testing indexing methods, the Medical Text Indexer (MTI), is being tested by NLM indexers. MTI recommendations are available to all indexers as an additional resource available through NLM’s Data Creation and Maintenance System. In addition, results of the MTI system are being used as keywords for AIDS/HIV, health sciences research, and space life sciences collections of meeting abstracts that are not manually indexed.

On-going improvements to MTI continue to be made. Short-term, incremental changes arise from requests made by indexing staff or by a desire to incorporate more of NLM indexing policy into the system. Longer term goals include a word sense disambiguation effort to improve MTI’s accuracy. The team has also begun to investigate the use of the full text of articles in addition to their work with MEDLINE titles and abstracts. Additional work investigates an approach to fully automated indexing based on NLM’s practice of maintaining a subject index to journal titles using a set of 122 MeSH terms, known as JDs (journal descriptors) corresponding to biomedical specialties. The JD system associates JDs with words in titles and abstracts in a three-year training set of 1,378,597 MEDLINE records. Each record “inherits” the JDs from the journal in the record. A word in the training set can then be described by a list of JDs ranked according to the number of co-occurrences between the word and the JDs. Text as input to the system can be indexed based on averaging the word-JD co-occurrences for the words in the text that are also in the training set, ranking the JDs in decreasing order of these averages. The journal descriptor approach was used as a broad filter to extract from a ten-year MEDLINE text collection of 4.59 million records those likely to be of genomics interest (39% of the collection), as part of NLM’s participation in the Text Retrieval Conference (TREC 2004).

Unified Medical Language System

Unified Medical Language System research regularly develops and distributes multi-purpose, electronic knowledge sources and associated lexical programs. The Metathesaurus, Semantic Network and SPECIALIST Lexicon are used by system developers to enhance patient data, create digital libraries, retrieve Web and bibliographic data, apply natural language processing, and improve decision support. The Metathesaurus represents multiple biomedical vocabularies organized as concepts in a common format providing a rich terminology resource in which terms and vocabularies are linked by meaning. The Semantic Network allows users to investigate relationships among semantic types and relations and retrieve a list of Metathesaurus concepts assigned to a particular semantic type. Finally, the data in the SPECIALIST Lexicon provides users with the syntactic and morphologic information about each of its lexical items.

The Metathesaurus continues to grow in size, scope, and mission. As of FY2004, there are more than 1 million concepts with 5 million names from 117 source vocabularies in 15 languages. The Metathesaurus is now released in a new “Rich Release” format that contains additional information allowing exact attribution of the sources for all its information. This allows specific mappings between vocabularies, correct inclusion and exclusion of specific sources, and simultaneous representation of a consistent UMLS view along with each source’s own view. Following the July 2003 announcement by the Secretary, HHS of a government license for nationwide use of SNOMED CT, this widely used standard vocabulary for US clinical medicine has been added to the Metathesaurus. The Metathesaurus installation and configuration program called MetamorphoSys has been enhanced to offer easy extraction of pre-computed subsets, for example all HIPAA (Health Insurance Portability and Accountability Act) vocabularies, or selected natural language processing names. This feature will assist users in many areas including regulatory compliance in electronic medical records. The Metathesaurus team has successfully met several new challenges including meeting increasing demand for frequent updates; developing methodologies for mappings between vocabularies; and the development of tools to meet the changing needs of an expanding community, especially of clinical users.

A significant change in the method of delivery of the UMLS Knowledge Sources to users has occurred along with the increase in size of the Metathesaurus to 18 gigabytes. Approximately one third of all users now access the UMLS through the UMLS Knowledge Source Server, one third request the files on DVD-ROM, and one third download the full Knowledge Sources online. The Metathesaurus group has developed a multi-platform Java program that allows users to decompress, customize, and install the Knowledge Sources on local machines, and has added browsers for users who create local subsets.

Modeling and Learning Methods

The Modeling and Learning Methods project seeks to develop new modeling methods that enable
researchers to rapidly construct effective computational models from large datasets. The objectives of the project are to develop machine learning methods that automate the process of constructing probabilistic models for identifying relevant information among large datasets, mapping identified information to networks of ontologies, accessing queried information accurately, and answering user queries through mining the data located in heterogeneous information sources. Interest in probabilistic models ranges over a wide spectrum of biomedical fields, including computational biology; biomedical, clinical, and healthcare informatics; and epidemiology. The objectives of the project will be evaluated with a set of suitable metrics, such as the receiver operating characteristic, that measure the performance of prospective models in terms of sensitivity and specificity in reaching their target functions. Depending on the domain of the models and the problems of interest, domain subjects and/or experts might be needed to determine the gold standards or the target functions for the performance evaluations of the models if such gold standards or target functions are not readily available. FY2004 research focused on identifying information represented in textual data (e.g., MEDLINE abstracts) using UMLS tools, the SPECIALIST parser, and MetaMap. New computational methods in modeling textual and numerical data are being developed. Staff participated in TREC 2004 and competed in the Physiological Data Modeling Contest at the 21st International Conference on Machine Learning.

Medical Ontology Research
While existing knowledge sources in the biomedical domain may be sufficient for information retrieval purposes, the organization of information in these resources is generally not suitable for reasoning. Automated inferencing requires the principled and consistent organization provided by ontologies. The objective of the Medical Ontology Research project is to develop methods whereby ontologies can be acquired from existing resources and validated against other knowledge sources. Although the UMLS is used as the primary source of medical knowledge, OpenGALEN, the Gene Ontology, and the Foundational Model of Anatomy are being explored as well.

During FY2004, the research team focused on foundational issues and explored the ontological properties of resources such as SNOMED CT and the Foundational Model of Anatomy. Non-lexical approaches to identifying dependence relations in ontologies were studied, with application to acquiring associative relations in the Gene Ontology. A generic framework was also developed for computing semantic similarity from a and the frequency of its nodes in a corpus, applicable to the Gene Ontology, MeSH, and WordNet. Finally, the team pursued work on visualization by developing RxNav, an application for navigating drug information in the RxNorm model. In the future, the research team will investigate a semantic similarity approach to comparing lists of MeSH descriptors assigned to MEDLINE documents and to identifying functionally related gene products annotated with the Gene Ontology.

Image Processing

The Lister Hill Center performs extensive research and development in the capture, storage, processing, retrieval, transmission, and display of biomedical documents and medical imagery. Areas of active investigation include image compression, image enhancement, image recognition and understanding, image transmission, and user interface design.

The Visible Human Project (VHP) image data sets are designed to serve as a common reference for the study of human anatomy, as a set of common public domain data for testing medical imaging algorithms, and as a test bed and model for the construction of image libraries that can be accessed through networks. VHP data sets are available through a free license agreement with the NLM. Data sets are distributed to licensees over the Internet at no cost and on DAT tape for a duplication fee. Worldwide use of the data sets continues to grow as they are applied to a wide range of educational, diagnostic, treatment planning, virtual reality, virtual surgeries, artistic, mathematical, and industrial uses by over 2,000 licensees in 48 countries. The Visible Human Project has been featured in well over 850 newspaper articles, news and science magazines, and radio and television programs worldwide.

FY2004 saw the continued maintenance of two databases to record information about Visible Human Project use. The first database logs information about VHP license holders and records their plans for using the images. The second database records information about the products that licensees are developing. The Insight Toolkit (ITK), a research and development initiative under the Visible Human Project, completed two official software releases in FY2004. ITK makes available a variety of open source image processing algorithms for computing segmentation and registration of high dimensional medical data on a variety of hardware platforms. Additional ITK awards have been made to extend the software infrastructure into clinical and research applications through the introduction of database management tools, workbenches for tumor volume measurement for possible use in clinical trials, and

the sponsorship of Web portals for sharing research data and publications. Non-funded researchers are now testing, developing and contributing to ITK in over 30 countries. Research institutions, including the , the Imperial College of London, Georgetown University, the University of Utah, Kitware, Harvard University, Cognitica, the University of Pennsylvania, and NLM staff participated in demonstrations and technology exhibits at the 2003 annual Radiological Society of North America conference in Chicago. Tutorials on how to use ITK were presented at the IEEE Vis2003 conference in Seattle, the SPIE Medical Imaging Conference in San Diego, and the MICCAI 2003 conference in Montreal. At the end of FY2004, the NIH Roadmap Initiative for Bioinformatics and Computational Biology awarded a 5-year cooperative agreement to the National Alliance for Medical Image Computing. This $20 million national center for biomedical computing has adopted ITK and its software engineering practices as part of its engineering core.

3D Informatics
During FY2004 the 3D Informatics Program has continued to mature and develop its in-house research efforts around problems encountered in the world of 3-dimensional and higher-dimensional, time-varying imaging. Research is continuing in the areas of image-based implicit rendering, research and systems trials for ITK, and haptic latency analysis for surgical simulation. The team has extended and enhanced its pilot project for creating the framework for an archive of volume image data, the National Online Volumetric Archive. This project includes the physical implementation of the pilot archive for volume image data, as well as a tutorial for data submission, metadata structure management tools using XML, and Web page structure. The metadata structure management tools were refined, published and presented at the 2004 SPIE Medical Imaging conference.

Research is continuing in an effort to create a software framework for artistic and non-photorealistic rendering of digital models, entitled Programmable Layered Architecture With Artistic Rendering. The framework will consist of a layered software architecture for implementing medical illustration techniques using computer graphics technologies. The framework adopted the infrastructure from the ITK software engineering methodologies in FY2004. Additional work includes research on implicit surfaces and their application to surface generation for efficient rendering of anatomic objects, research on a finite element modeling and simulation system for human colon straightening and its application in virtual colonography, and research on geometric mapping using the index-check-and-correct method for human colon polyp detection in helical CT datasets.

In September 2004, the HPCC office, together with representatives of the National Institute of Biomedical Imaging and Bioengineering and two directorates at the National Science Foundation, sponsored a workshop on visualization research challenges. The 28-member panel drew national and international participants from industry and academia to begin a discussion on the current grand challenges in visualization and imaging research.

AnatQuest
While the Visible Human images have been used by biomedical scientists and developers worldwide, the goal of this in-house project is to provide widespread access to the Visible Human images for a broad range of users, including the lay public. In line with this goal, a Web-mediated system, AnatQuest (available at anatquest.nlm.nih.gov), was developed. This system is based on a 3-tier architecture in which the first tier consists of Java applets for displaying thumbnails of the cross-section, sagittal and coronal images of the Visible Human Male, from which detailed full-resolution views are accessed. The second tier is a set of servlets that process user requests and compress the requested images prior to shipment back to the user. The third tier is the object-oriented database of high resolution VH images and rendered 3D anatomic objects. Low bandwidth connections are accommodated by a combination of adjustable viewing areas and image compression done on the fly as images are requested. Users may zoom and navigate through the images.

Current work is proceeding in two directions. The first is to increase the number and type of rendered images (beyond the current 300 surface-rendered structures) to make the collection more useful for the public. This would require registering all of the cryosection slice images, segmenting and labeling anatomic structures on each slice, and using these to create surface- and volume-rendered images. The second direction taken in this project addresses a long term NLM goal, that is, to transparently link the print library of functional-physiological knowledge with the image library of structural-anatomic knowledge into a single, unified resource for health information. This may add value to text resources such as PubMed and MedlinePlus by linking to anatomic images. For this purpose, project staff are developing a modular prototype system (Text to Image Linking Engine, or TILE) to serve as a testbed to investigate the alternatives in the functions needed to accomplish this linkage. These functions involve identifying biomedical terms in a document, identifying the relevant anatomical terms, identifying the images in the image database, and linking the identified terms to the images.
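The four TILE functions listed above (identify biomedical terms, keep the anatomical ones, find matching images, link terms to images) can be illustrated with a minimal sketch. This is not the TILE implementation; the term sets, image file names, and function names below are hypothetical stand-ins chosen for illustration, and a real system would presumably rely on UMLS-style vocabularies and the Visible Human image database instead of small in-memory dictionaries.

```python
import re

# Hypothetical stand-ins: toy vocabularies and a toy image index.
# None of these names or file names come from the TILE system itself.
BIOMEDICAL_TERMS = {"heart", "liver", "aspirin", "femur", "hypertension"}
ANATOMICAL_TERMS = {"heart", "liver", "femur"}
IMAGE_INDEX = {
    "heart": ["vh_male_heart_surface.png"],
    "femur": ["vh_male_femur_left.png", "vh_male_femur_right.png"],
}


def find_biomedical_terms(text):
    """Step 1: return biomedical terms in order of first appearance."""
    found = []
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in BIOMEDICAL_TERMS and word not in found:
            found.append(word)
    return found


def keep_anatomical(terms):
    """Step 2: keep only the terms that name anatomic structures."""
    return [term for term in terms if term in ANATOMICAL_TERMS]


def link_terms_to_images(text):
    """Steps 3 and 4: look up images and attach them to each term."""
    links = {}
    for term in keep_anatomical(find_biomedical_terms(text)):
        images = IMAGE_INDEX.get(term, [])
        if images:  # anatomical terms without an image stay unlinked
            links[term] = images
    return links


if __name__ == "__main__":
    sample = "Aspirin lowers risk to the heart; the femur and liver were unaffected."
    for term, images in link_terms_to_images(sample).items():
        print(term, images)
```

In this sketch "aspirin" is dropped at step 2 because it is not anatomical, and "liver" is dropped at step 3 because the toy index holds no image for it. Keeping each function separate mirrors the modular design the text describes, so that alternative approaches to any one step could be swapped in.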

WebMIRS
The Web-based Medical Information Retrieval System, a Java application, allows remote users to access data from two surveys conducted by the National Center for Health Statistics. These are the National Health and Nutrition Examination Surveys II and III (NHANES II and III), carried out during the years 1976–1980 and 1988–1994, respectively. The NHANES II database accessible through WebMIRS contains records for about 20,000 individuals, with about 2,000 fields per record; the NHANES III database contains records for about 30,000 individuals, with more than 3,000 fields per record. In addition, the 17,000 x-ray images collected in NHANES II may also be accessed with WebMIRS and displayed in low-resolution form. Through the WebMIRS graphical user interface, a user may construct a query for the NHANES II or NHANES III data. WebMIRS allows the user to save the returned data to the local disk drive, where it may be analyzed with statistical tools. The WebMIRS NHANES II database also contains vertebral boundary data that was collected by a board-certified radiologist for 550 of the 17,000 x-ray images in WebMIRS. This data consists of x,y coordinates for approximately 20,000 points on the vertebral boundaries in the cervical and lumbar spine images. Users may query radiological data, health survey data, or both.

WebMIRS enhancements include collaborative work with Texas Tech University to develop an advanced compression capability custom-tailored to the image characteristics of the x-ray images, to allow delivery of the WebMIRS images in compressed form rather than in the low-resolution form as at present. Software written in Java has been developed for the decompression at four different levels. Work is now under way to improve the performance efficiency of the decompression, before the code is incorporated into the WebMIRS system.

Significant progress was made toward the development of the next generation WebMIRS system, the Multimedia Database Tool. This system will provide a software framework for the incorporation of new text/image databases in a much more general way than the current WebMIRS and provide new features for the database end user that extend current WebMIRS capabilities. The specific framework that has been designed has the goal of accommodating new sets of text and images under a flexible database schema and GUI approach that is intended to allow new databases to be incorporated with work done only at the level of the database administrator, and not at the software modification level.

Online X-ray Archive
The complete set of 17,000 NHANES II x-ray images in the full-resolution form in which they were digitized was made publicly available in FY 2000. These images are available by FTP and have been accessed by researchers both within the U.S. and from international sites. Staff created the ImViewJ software, a downloadable Java application, which allows users to view images at their full spatial resolutions (e.g., 1463x1755 for the cervical spine images, 2048x2487 for the lumbar spine images). Coordinate data collected under the supervision of a radiologist at Georgetown University are also available on the FTP site for 550 images. This coordinate data defines landmark points for each vertebra in a manner commonly used in the field of vertebral morphometry, and serves as reference data to aid in creating and evaluating the performance of image processing algorithms for segmentation of the vertebrae. Users may access this coordinate data either through the FTP archive or through the WebMIRS system. The number of TIFF 8-bit images publicly available was increased to 1,000 in FY2004.

Content-Based Image Retrieval
The goal of the content-based image retrieval (CBIR) project is to develop methods for effective extraction of biomedical information from biomedical digital images, with the current concentration being on the NHANES II spine x-rays. The focus is both on indexing the image data and on searching those data. For example, for the 17,000 NHANES II images, the only indexing data originally available was the collateral alphanumeric data collected in the questionnaires and examinations; no indexing information derived directly from the images was originally available, and the high cost of employing radiological experts to compile such data by physically viewing and interpreting each image makes it unlikely that such information will ever be acquired by purely manual means. These circumstances could be reversed if reliable, biomedically validated software could produce image interpretations automatically. Even in the more likely case that only semi-automated methods should prove feasible, the reduction in labor costs could be sufficient to allow the creation of databases of significant biomedical information where this is not currently economically feasible. This is the motivation for research into computer-assisted image indexing. Computer-assisted image searching is a potential enabler of enhanced information extraction from a database that has already been indexed.

During the current year new and substantially extended CBIR capability was developed with the implementation of the latest version, CBIR3. Highlights of the system are that it can operate in networked or stand-alone modes, uses

XML for reporting, and allows the user to select either a more mature or an experimental version of the system. CBIR3 differs from its predecessors in that all data (text, images, and segmentations) are now stored on a centralized MySQL database. Each user is allocated a unique login that grants certain rights and privileges. The system supports access to multiple data sources that can be selected by the user. CBIR3 also provides a validation sub-mode for expert review, validation, and pathology indication for indexed images. CBIR3 currently allows vertebral shape segmentation using the Modified Active Contour Segmentation and LiveWire segmentation techniques. In addition it has a well defined interface allowing the addition of more techniques. It is now possible to segment images in a database-controlled sequential mode that remembers the user’s state when he/she stopped working. The last image and vertebra segmented are saved and automatically brought up the next time the same user segments. Another new feature of CBIR3 is that it allows text searching on the complete NHANES II dataset through the familiar WebMIRS interface. WebMIRS (standalone) has been linked with CBIR3 for allowing hybrid text and image searches. For image queries, CBIR3 supports query by sketch and query by image example. Query shape can be generated by sketch, choosing it from the existing shapes on the database, or by supplying an image and segmenting it to obtain a shape. The query shape can subsequently be edited by moving points, adding points, and removing points.

Digital Archive of Uterine Cervix Images
Work continued in FY2004 towards the creation of an archive database of the 60,000–100,000 digital images of the uterine cervix collected by the National Cancer Institute. This work included analysis of color models, standards and technology for the digital capture of color information from 35 mm slides with high color fidelity, and similar issues related to retaining the color across digital output devices such as monitors and printers. MATLAB programs were created to enable the comparison of images digitized at different scan densities or at different compression levels. A Nikon 4000 slide scanner was acquired, and 200 uterine cervix slides were scanned to generate evaluation data. For each of these slides, a medical expert labeled regions of interest with a MATLAB tool that was developed for that purpose. A compression study was conducted to allow the comparison of uncompressed uterine cervix images with those compressed using the Hybrid Multiscale Vector Quantization method developed by Texas Tech University. Multiple medical experts participated in the study, which used 50 test images compressed at eight different compression levels. A similar study was conducted to determine a suitable spatial resolution to use for digitizing the 35 mm slide collection.

Engineering Laboratories
The Image Processing Laboratory is equipped with a variety of high end servers, workstations and storage devices connected by a mix of 100 and 1000 Mb/s Ethernet. The laboratory supports the investigation of image processing techniques for both grayscale and color biomedical imagery at high resolution. In addition to computer and communications resources and image processing equipment to capture, process, transmit and display such high-resolution digital images, the laboratory also archives a variety of image content. The equipment includes a Sun Enterprise 4500 server with dual 400 MHz CPUs and 1.5 GB memory, and a SunFire 280R server with dual 1.2 GHz CPUs, 3 GB memory, and two internal 73 GB SCSI disks. Additional computers in the lab include two Sun Ultra 10 workstations, each with a 440 MHz CPU, 512 MB memory, and an external 36 GB SCSI disk; and two Sun Ultra 10s, each with a 300 MHz CPU and 512 MB memory. All of these machines run the Solaris 9 operating system. Large-scale magnetic storage is provided by a Network Appliance FAS960, which is a network-attached storage device connected by redundant Gb/s Ethernet connections and provides 24 TB of RAID storage. For the ultra-high-resolution display of x-ray images, two E-systems Megascan monitors provide image display at a spatial resolution of 2048x2560 pixels. The laboratory also contains specialized equipment and software for device calibration and color profile creation. This includes a USB-interfaced MonacoOPTIX colorimeter, capable of color measurement from emissive sources for CRT and LCD monitor color calibration, and used with MonacoOPTIX software; and a USB-interfaced GretagMacbeth Eye-One spectrophotometer, which measures color in the 380–730 nm range, with resolution of 10 nm, from both emissive and reflective sources, used with MonacoProof software, for the creation of standard color profiles which characterize the color I/O of devices such as scanners, monitors, and printers using the International Color Consortium standard.

The Document Imaging laboratory supports DocView, MARS and other research and design projects involving document imaging. Housed in this laboratory are advanced systems to electro-optically capture the digital images of documents and subsystems to perform image enhancement, segmentation, compression, OCR and storage on high density magnetic and optical disk media. The laboratory also includes high-end workstations connected by gigabit Ethernet for performing document image processing. Both in-house developed and commercial systems are integrated

and configured to serve as laboratory testbeds to support research into automated document delivery, document archiving, and techniques for image enhancement, manipulation, portrait vs. landscape mode detection, skew detection, segmentation, compression for high density storage and high speed transmission, omnifont text recognition, and related areas. The laboratory also contains rack-mounted, networked processors running all recent versions of Windows-based operating systems to support the DocView, DocMorph and MyMorph projects. This provides an easily configurable test platform for simulating a variety of potential user environments, including those with firewalls, for testing, modifying and improving software developed in these projects.

The Document Image Analysis Test Facility is an off-campus facility that houses high-end workstations and servers that constitute the MARS production system. While routinely used to produce bibliographic citations for MEDLINE, this facility also serves as a laboratory for research into techniques for the automatic zoning, labeling, and reformatting of bibliographic fields from document images, intelligent spellcheck by pattern recognition techniques, and other key elements of MARS. These techniques are fundamental to the automated extraction of descriptive metadata for the long term preservation of document images. Besides real time performance data, also collected and archived are large numbers of bitmapped document images, zoned images, labeled zones, and corresponding OCR output data. This collection serves as ground truth data for research in document image analysis and understanding.

Multimedia Research and Development
Multimedia research and development efforts concentrate on the engineering of technical improvements applied to issues such as image quality and resolution, color fidelity, transportability, storage, and visual communication. In addition to developing new methods and processes, LHC facilities and hardware infrastructure reflect state-of-the-art standards in the rapidly changing field of multimedia research and development. High definition video, for example, represents today’s standard for improved electronic, motion imaging quality. Multimedia systems, scientific visualization and networked media are being pursued for their performance, educational, and economic advantages. Three dimensional computer graphics, animation techniques, and photorealistic rendering methods have changed the tools and products of the artists in the branch. Digital video and image compression techniques are central to projects requiring storage of large images and rapid visual file transmission.

CD-ROM, DVD and DVD-ROM technologies for capturing media assets including video, audio, Web information, and computer text slides continue to be explored. Web links within these assets are used for updating program content and providing links to additional information tools. A template allowing the simultaneous viewing of multiple interactive windows, including speaker video, slides, and an interactive index, was developed to improve access to program content on CD-ROM, DVD and DVD-ROM technology. When the user selects any one slide from the index, the two other windows immediately synchronize to that point in the presentation. Using the new template technology, project staff developed a symposium DVD-ROM “The Library As Place: A Symposium on Building and Renovating Health Sciences Libraries in the Digital Age” and a Conference DVD “From Double Helix to Human Sequence—And Beyond” featuring over 10 hours of video, Web access, and additional information on each disc.

Together with NLM’s Office of Communications and Public Liaison and the HMD Exhibition Program, project staff have worked with MacNeil/Lehrer Productions to launch the Changing the Face of Medicine: Profiles in Achievement Web-enhanced DVD in FY 2004. The highly interactive DVD features 12 physician profiles, a mentoring program profile, and 200 Web links as an information resource tool for users. The interactive DVD was awarded a 2004 Web DVD Excellence Award by the DVD Association of America. As an element of the Changing the Face of Medicine exhibit, the NLM is working on the planning and production phases of video and Web programs featuring the Local Legends program, a collaborative project between the NLM and the American Medical Women’s Association (AMWA). The Local Legends Web site highlights congressionally nominated women physicians from 50 states. The Web site is designed to include video profiles of one representative from each state, as selected by a committee within the AMWA. The first video interview with the Washington, D.C. local legend, Janelle Goetcheus, M.D., was conducted at the Columbia Road Health Services clinic, and additional video content was produced in the clinic and on the streets of Washington. These materials were edited into an overview video featuring the NLM/AMWA program and presented at the annual AMWA meeting in San Diego, CA in February 2004. The overview video of Dr. Goetcheus won a 2004 Telly Award. Thirty-three on site video interviews with nominees were conducted at the annual meeting to select state representatives of the Local Legends program. All aspects of the Local Legends Web site design have been completed and approved by the NLM Local Legends development team and the AMWA. Future work will include the development of additional Local Legends video profiles.

Additional projects illustrate a variety of technological advancements. Project 20, a 15-minute videotape chronicling the last 20 years of the NLM, highlighted the growth of MEDLINE, the development of Grateful Med, Internet Grateful Med, Free MEDLINE, UMLS, the creation of NCBI, and other significant events in the history of NLM. A prototype DVD-ROM based on the NLM Dream Anatomy exhibition was completed in FY2004. The DVD features a video overview, a gallery and timeline, and a virtual tour of the exhibit. The narrated program also features high definition video of the exhibition, video graphics, and an original musical score. Web links to NLM’s Dream Anatomy exhibition Web site and a fully functional search tool are also available when the DVD is viewed on a computer. Additional DVDs were prepared in FY2004 including: (1) LHNCBC Research Projects Video DVD, (2) The 2004 Collen Award: Dr. McDonald's Life and Career, and (3) NLM Board of Regents presentation: Saving Lives and Saving Money, by the Honorable Newt Gingrich.

Information Systems

The Lister Hill Center performs extensive research in developing advanced computer technologies to facilitate the access, storage, and retrieval of biomedical information.

Profiles in Science
The Profiles in Science Web site uses innovative digital technology to make available the manuscript collections of prominent biomedical researchers, medical practitioners, and those fostering science and health. Database content is created in collaboration with the History of Medicine Division, which processes and stores the physical collections. Most collections have been donated to the NLM and contain published and unpublished materials, including books, journal volumes, pamphlets, diaries, letters, manuscripts, photographs, audio tapes and other audiovisual resources. The Visual Culture and Health Posters, as well as the collections of C. Everett Koop and Wilbur A. Sawyer, were added in FY2004, bringing the total number of archives for prominent biomedical researchers, medical practitioners, and those fostering science and health to 13: Christian B. Anfinsen, Oswald T. Avery, Julius Axelrod, Donald S. Fredrickson, C. Everett Koop, Joshua Lederberg, Barbara McClintock, Marshall W. Nirenberg, Linus Pauling, Martin Rodbell, Florence R. Sabin, Wilbur A. Sawyer and Fred L. Soper. The Reports of the Surgeon General (1964–2000), the history of the Regional Medical Programs (1964–1976), and Visual Culture and Health Posters are also available on Profiles in Science.

In FY2004, project staff continued to enhance the effectiveness of Profiles in Science. The Web site was upgraded to more powerful hardware with up-to-date applications and operating system software. Enhancements to the underlying digital library framework included a new database infrastructure, the creation of additional ways to view information, and faster methods for extracting records in ASCII format. New error detection and correction rules and methods for automatically updating data were also added. Protocols for digitizing collections at other institutions were developed and tested in collaboration with the Wellcome Library staff, United Kingdom. Development began in FY2004 on a Historical Events and Prominent Scientists Timeline to highlight the major historical events (e.g., political, medical, scientific, and social) that occurred at the time of the major achievements of the scientists represented in the collection. Changes to the current Metadata Entry and Editing Program were made in preparation for moving the program to a Web interface. Detailed analysis of workflow in obtaining copyright permissions identified changes needed in the database and user interface for tracking permissions. Finally, the development of an XML-based Web interface and transition to an XML-based search engine, as well as automated testing and verification tools, continue to be pursued.

MARS
Document image analysis and understanding research combined with database design, graphical user interface design for workstations, image processing, string pattern matching, lexical analysis, speech recognition and related areas underlie the development of MARS (Medical Article Records System), a system to automate the production of MEDLINE citation records from biomedical journals. MARS has evolved through several generations of increasing capability. Its core engine consists of daemons based on heuristic rule-based algorithms that use geometric and contextual features derived from OCR output to automatically segment scanned pages of journal articles into zones, assign logical labels to these zones, and reformat zone contents to adhere to MEDLINE conventions. For some years, its production version has been used to extract bibliographic data to populate MEDLINE. Two other techniques to obtain such data have been manual keyboarding and XML-tagged data directly from publishers.

To meet the NLM’s goal of discontinuing the keyboarding contract and thereby realizing savings, MARS design faced the challenge of having to process journals currently handled manually. These journals include ones with page backgrounds in color or gray shades, which greatly compromise OCR accuracy. Experiments were conducted with

grayscale scanners comparing different approaches to eliminating these atypical backgrounds, and the best approach was found to be the use of a library built on functions in the FineReader OCR toolkit. This library was embedded in the scan software developed in-house, and preliminary results from a test set of 101 articles showed that low-confidence characters occur at about the same rate with these grayscale scanners as with the monochrome scanners in production, effectively eliminating the deleterious effects of gray and color backgrounds. Following the completion of these tests, the grayscale scanners have been placed in production.

The scanning software has also been modified to improve quality control. Images from poorly scanned documents cause OCR errors and compromise downstream processes. Conventional QC relies on the operator viewing the images and deciding on their quality. This is highly subjective and is not always reliable. To make this step more robust, a commercial library from ScanSoft has been incorporated in the scan module to detect low-confidence characters and calculate those as a percentage of the total number of characters on the page. This figure provides the operator a quantitative measure of image quality. Another key element in allowing NLM to eliminate its keyboarding contract is the requirement for MARS to accommodate foreign language journals, which account for 11% of MEDLINE citations. This requirement introduces new rules to extract vernacular titles (in Roman script languages but not in others), and to process the second pages of articles (formerly only one page needed to be processed). These have been achieved by the FLEX software suite, which incorporates new code in several MARS workstations. Starting with journals in French, German, Italian, and Spanish, MARS enhanced by FLEX now processes five Western European languages using Roman script and three using Cyrillic script.

WebMARS is a system to extract bibliographic data from online journals. A prototype system has been developed to combine downloading and classification of journal articles followed by zoning, labeling and reformatting algorithms to identify and extract the data. The NLM Board of Regents was recently given a talk covering the history of automated bibliographic data extraction from 1996, when NLM’s keyboarding contract ran into difficulties, through the evolution and increasing automation in the MARS system, and focusing on the design and functions of WebMARS. A key point in the presentation was a comparison of the relative labor required in producing citations with keyboarding, MARS, XML citations from publishers, and WebMARS (which promises to result in the least amount of labor). WebMARS is undergoing testing with over 60 journal titles. Tests comparing WebMARS output against existing MEDLINE citations for past issues have been useful in refining the labeling and reformatting algorithms.

An additional prototype has been developed to handle meeting abstracts. Testing with four volumes was successful and the prototype is ready for demonstration. “Meeting abstracts” refers to the proceedings of important conferences in HIV, AIDS and other topics of current importance. The contents of these proceedings are not simply “abstracts” as conventionally understood, but include most other bibliographic information: title, author names, affiliations, etc. Most important for automation, meeting abstracts do not follow the familiar layouts of typical biomedical journal articles. The unconventional layout of meeting abstracts requires a modification of the existing zoning, layout and reformatting rules. For instance, since author names are arranged differently from a typical journal (all names in a single line, and separated by semicolons), the existing reformatting rules in MARS required changes to accommodate this format.

Ground Truth Data for Document Image Analysis
In August 2003, the Medical Article Records Groundtruth database was released for research in document image analysis and understanding techniques by the computer science and informatics communities. The data consists of over 1,000 bitmapped images of the first pages of articles from biomedical journals indexed in MEDLINE, falling into nine layout types encountered in MARS production. Included are the corresponding segmented and labeled zones, all in XML format. Also available from this Web site is Rover, an analytic tool that may be used to compare the results of a researcher’s program with the ground truth data. Rover has been enhanced to allow a visual comparison of researchers’ algorithmic results with the ground truth data, as well as some statistical metrics.

DocView
DocView facilitates the delivery of library documents directly to the patron via the Internet in multiple ways, but it is most commonly used by library patrons to receive scanned journal articles from libraries that use Ariel software for interlibrary loan services. While Ariel, developed by Research Libraries Group, and now a product of Infotrieve, is used by libraries and document suppliers routinely to send documents via Internet to similar organizations, there are few options for end users to directly receive them. DocView helps fill this void by allowing end users to receive documents sent by Ariel via a modified form of File Transfer Protocol. DocView also enables users to retain the received
documents in electronic form, view the images, organize them into “folders” and “file cabinets,” electronically bookmark selected pages, manipulate the images (zoom, pan, scroll), copy and paste images, and print them. In addition, DocView serves as a TIFF viewer for compressed images received through the Internet by other means, such as Web browsers. Users may receive document images either via Ariel FTP or Multipurpose Internet Mail Extensions protocols. With DocView, users may also forward documents to colleagues for collaborative work. DocMorph allows the conversion of more than 50 different file formats to PDF, for instance, to enable multi-platform delivery of documents. Also, by combining OCR with speech synthesis, DocMorph enables the visually impaired to use library information. The MyMorph Web service consists of Windows-based client software and modifications to DocMorph for accommodating the Simple Object Access Protocol. In-house testing has shown that MyMorph significantly improves user productivity compared to the conventional use of DocMorph through a Web browser, particularly for users who need to convert large numbers of files to PDF.

Document Preservation
Project staff have begun to design a flexible, modular software framework that may be used as a prototype for investigating techniques to preserve NLM’s digital resources in a cost-effective manner. A prototype system called SPER (System for the Preservation of Electronic Resources) has been developed. The system allows ingest, metadata extraction and file migration, and the identification of minimum required technical metadata for document files. Developing SPER required careful attention to proposed standards and models for digital preservation and preservation metadata schemas, including the NISO Z39.87 proposed standard for digital still images. SPER relies on open source, platform-independent components, as well as current open resources and tools which already provide some functionality required by SPER. A JavaServer Faces-based GUI was chosen to provide the Web interface for SPER users or operators. The SPER prototype was implemented in FY2004, with a first phase model designed to convert TIFF images to PDF documents and/or JPEG2000 images. The Profiles in Science collection and MARS document images will be used as test sets. Additional research and development efforts on metadata extraction and prototype design strategies of SPER will address issues with metadata elements, strengthen tools that automatically learn journal-specific rules using both geometric and contextual features, and strengthen systems that automatically learn the 2D layout models of document page images using Bayesian learning algorithms. In-house tools (e.g., DocMorph, MyMorph) are being studied as potential tools for electronic preservation. Modifications of DocMorph and MyMorph to produce PDF/A files from image-based files are being explored. This work may lead to a system, accessible from any point on the Internet, that allows users to mass-migrate image-based file collections to a standard archival format. Additional research is being conducted to identify key issues related to the preservation of video.

Turning the Pages Information Systems
Turning the Pages Information Systems research seeks to design more efficient methods to translate paper volumes from the NLM’s historic collection to photorealistic electronic form, extend the virtual books into information systems, and increase the accessibility of historical documents for the public. After the initial development of the Turning the Pages (TTP) format, research began to transform the design into a usable information system (TTP+). Research focused on a “discovery” and a “storyline” model as directions for TTP+. The TTP+ version of Blackwell’s Herbal uses the “discovery” model, retaining the photorealism of the original TTP while allowing a patron to “travel” to live sites on the Internet. For example, from highlighted text on the St. John’s Wort page, users can go to various search engines (e.g., PubMed, ClinicalTrials.gov, USDA) and obtain citations or general information on St. John’s Wort. The TTP+ version of Vesalius’ Anatomy in Photorealistic uses the “storyline” model and contains images from other sources (e.g., rendered Visible Human images, pictures of Italian cities, etc.). Images are interlinked to present the consumer with several multimedia “stories,” including Man of Padua and Modes of Portraying Anatomy.

Two methods have been investigated in order to combine all existing virtual books for kiosk display. A monolithic approach bundling all software into one file was pursued first, but memory limits imposed by the Windows OS rendered this method unscalable. On the other hand, a modular approach, where the code for each book is selected by the user, provided a scalable method more suitable for the addition of future books. In addition, while developed under the Windows OS environment, the TTP code has also been successfully tested for operation on a Mac computer running OS X. Future goals are to continue developing efficient, high quality methods for producing and distributing TTP books as more historical books are selected.

NLM Gateway
The NLM offers a number of Internet-based information resources, each with its own user interface. The NLM Gateway provides an easy to
use, “one-stop” search method that allows users to issue simultaneous searches in 15 NLM information resources using 5 retrieval methods from a single interface. The NLM Gateway continued to grow and evolve in FY2004 with several additions and enhancements. NLM Gateway access was added for the MedlinePlus Health Tutorials, MedlinePlus Current Health News, Online Mendelian Inheritance in Man, Hazardous Substances Data Bank, TOXLINE Special, and the Genetics Home Reference. NLM’s books, serials, and audiovisual materials were migrated from LocatorPlus to the new NLM Catalog under “Entrez,” substantially increasing the searching capabilities in the collection. The NLM Gateway language table was updated with the latest Machine Readable Cataloging language codes. Targeting PubMed, enhancements include the addition of a LinkOut feature for PubMed citations and direct links on the Document Ordering page for PubMed Central articles. A spell checker that automatically searches both British and American spellings of words was also incorporated. Author name truncation for searching was added to the Meeting Abstracts Collection and the Health Services Research Projects database, and approximately 15,000 abstracts were added to the Meeting Abstracts Collection.

A comprehensively redesigned NLM Gateway Version 2.0 entered early testing in FY2004. The new user interface will provide clearer, easier-to-understand navigation to different areas of the composite result set. At the same time, the new interface will continue to execute simultaneous searches in 15 information resources. The targeted release for the new user interface is early FY2005.

Consumer Health Informatics Research
Exploring consumer information needs, information seeking behavior, and cognitive strategies, consumer health informatics research projects utilize informatics methods and information technologies to study methods to develop, organize, integrate, and deliver accessible health information to consumers with all levels of health literacy.

ClinicalTrials.gov provides comprehensive, up-to-date information about federally and privately supported clinical trials throughout the US and many other parts of the world. The system grew out of 1997 legislation requiring the HHS, through the NIH, to establish a registry for both federally and privately funded trials “of experimental interventions for serious or life-threatening diseases and conditions,” thereby broadening the public’s access to information on potential interventions for a wide range of diseases. Launched in February 2000, ClinicalTrials.gov provides patients, families and members of the public easy access to information about the location of clinical trials, their design and purpose, criteria for participation and, in many cases, further information about the disease and intervention under study. There are also links to individuals responsible for recruiting participants to each study.

Because clinical trials bridge biomedical research conducted in laboratories and applied clinical research in humans, information in this area is often difficult for non-specialists to read. ClinicalTrials.gov is designed to help members of the public make sense of the information provided. The site includes general resources to help people understand what clinical trials are, including a glossary of common terms used to describe clinical trials, and a list of frequently asked questions about human research. In addition, each study is presented in a standard format that helps readers quickly identify important elements of a study, such as its purpose, criteria for participation, locations of the trial sites, and contact information. Furthermore, to provide additional context, study records also point users to relevant health topics at the NLM’s consumer health Web site, MedlinePlus, which contains easy-to-read information to help patients research their health questions. Some study records also contain links to published literature, either for background information or study results.

A Web-based Protocol Registration System allows providers to maintain and validate information about their trials. New views of protocol summaries are supported by geographical location, date added, and patient recruiting status. A Spanish-language prototype system using Spanish-English cross-language information retrieval technology was developed and is undergoing extensive testing.

ClinicalTrials.gov was the recipient of Harvard University’s prestigious 2004 Innovations in American Government Award in recognition of its significant achievements. HHS Secretary Tommy G. Thompson noted that ClinicalTrials.gov is a good example of how government can improve access to vital health care information for all Americans.

The Genetics Home Reference is an integrated Web-based information system designed for consumers and others to learn about specific genetic conditions and the genes or chromosomes associated with those conditions. The research results made possible by the Human Genome Project are increasingly being made available in scientific databases on the Internet, but because of the often highly technical nature of these databases, they are not readily accessible to the lay public. The goal is to provide a bridge between the clinical questions of the public and the richness of the data emanating from the Human Genome Project.

The Genetics Home Reference Web site provides basic information in a question and answer format on the nature of genes and how they give rise
to various conditions and diseases. The site currently includes more than 100 condition summaries and more than 160 gene summaries, over half of which were added during FY2004. Additional FY2004 improvements include a new feature that provides information about chromosomes and chromosomal disorders. Several new topics (e.g., pharmacogenomics, multifactorial disorders, and imprinting) were also added to “Help Me Understand Genetics,” the site’s genetics handbook. Genetics Home Reference achieved significant site navigation improvements in FY2004 with a redesigned home page, as well as newly designed browse, search, and help features. Targeted links were also added throughout the site. The site was integrated with MedlinePlus, Gateway, PubMed LinkOut, and the “What’s New” series in order to help consumers locate the Genetics Home Reference Web site.

Further Consumer Health Informatics Research focuses on understanding and improving access to online health information. Technologies are being developed that provide measures of text difficulty that help determine the suitability of health-related documents for consumers at different literacy levels. New approaches for providing timely access to consumer health information in order to accommodate the diverse needs of people in the U.S. and abroad are being pursued through cross-language information retrieval research. Finally, the Consumer Health Vocabularies project focuses on mapping words and phrases commonly used by consumers to technical medical terms and concepts.

Research Infrastructure and Support

The Lister Hill Center performs and supports research in developing and advancing infrastructure capabilities such as high-speed networks, nomadic computing, network management, wireless access, and improving the quality of service, security, and data privacy.

Communication and Collaborative Technologies
Lister Hill Center staff engages in research to develop technologies that will facilitate easy access to biomedical information through devices such as Personal Digital Assistants (PDAs), wireless portable computers, mobile phones, and other emerging devices.

PubMed on Tap is a research and development project to develop accessible biomedical information at the point of care through handheld devices used by clinicians and other mobile health care providers. User interface, content selection, content organization, and system performance are necessary for effective access to information. Initial research is focused on the design of a user interface for search and retrieval of MEDLINE bibliographic citations through PubMed. Initial content selection is involved with categorizing citations returned in response to a query, creating multi-document summaries for clusters of highly related documents, and single-document descriptions containing features specific only to a given document in the cluster. System performance research is focused on discovering design factors that ensure the speed and reliability of the hardware and software required for accurate and timely retrieval of data. Areas of investigation include choice of parsers, efficient use of a database to store recent queries and citations, and load testing.

A prototype system, developed for PDAs running the Palm operating system, was built and tested in FY2003. The software uses the PDA’s wireless communication interface and the HTTP protocol to communicate with a servlet residing on a proxy server. The proxy server communicates with PubMed through the Entrez programming utilities (e.g., ESearch, EFetch and ELink). The proxy server stores queries, results, and citations to provide a quick response to recurring queries and fast delivery of frequently requested citations. The proxy server also monitors performance measures and accumulates aggregate statistics to help in developing clustering and ranking tools. The client program is responsible for the user interface and for storing user-specific information, such as preferred search strategies or recurring queries. FY2004 upgrades, implemented as a result of user feedback, have significantly improved PubMed on Tap usability.

PubMed for Handhelds also explores handheld technology for use in the clinical setting. During FY2004 several new features were introduced. PICO (Patient/Problem, Intervention, Comparison, and Outcome) is a method used for developing well-formulated clinical queries. This format can also be used for structuring literature searches and may be helpful to those interested in evidence-based medicine. In support of users of newer handheld devices that feature WAP browsers (mobile phones, hybrid PDA-phones), the system has been reformatted. Current services offered are clinical queries, systematic reviews, PICO searching without filters, a journal abstracts browser, and access to ClinicalTrials.gov.

Additional projects targeting the use of handheld devices as a portal to information dissemination continued to expand in FY2004. The Biomedical Informatics and Pathology departments at the Uniformed Services University collaborate with Center staff to provide wireless (e.g., infrared, Bluetooth, 802.11b) PDA access to PubMed, MEDLINE, and other NLM databases during small medical student group discussions. PDAs will allow students to electronically submit reports and case
summaries, which is expected to enhance their interactions with teachers.

The ASKLEPiOS project (Access to Services and Knowledge, multiLingually, Everywhere, Portably, in Open Source) seeks to explore the integration of portable wireless hand-held devices together with non-mobile computer servers and telephones. The integration framework is built with open source tools and includes internet-based telephony, videoconferencing, wireless data services, speech recognition/synthesis services, and a robotic chat service. The framework provides the needed “middleware” layer upon which applications relevant to the mission of the NLM can be built. Portable personalized devices with visual and speech-based interfaces may prove helpful in delivering health care to an increasingly multicultural and multilingual society. Through collaboration with external groups, the project focuses on technologies such as information servers, speech synthesis and recognition software, handheld personal computing devices, wireless networking, and the public-switched telephone network.

The Collaboratory for High Performance Computing and Communication investigates innovative means for assisting health science institutions in their use of online distance learning technologies. The Collaboratory also explores advanced computer and network technologies for distance interactivity, including wireless technology and virtual reality research. Major upgrades to existing videoconferencing codecs were accomplished and new codecs were added in FY2004. Several significant demonstrations were performed using videoconferencing technology, both at NLM and off site at national meetings. Demonstrations of streaming and wireless Webcasting were done, and videoconferencing and Webcasting were employed routinely in program activities. One significant upgrade was the purchase of a Click-2-Meet videoconferencing server that allows end points to tunnel through firewalls. The new software required a significant upgrade in computer hardware. Hardware upgrades were also needed for Webcasting, and it appears that dual-processor machines are increasingly required.

Experiments continued using the conventional H.323 videoconferencing technology with Charles R. Drew University of Medicine and Science and its affiliated medical magnet high school for minorities, the King-Drew Medical Magnet High School. A pilot videoconference featuring NLM librarians was completed. The H.323 videoconferencing technology was also employed in a virtual site visit of the NLM-funded medical informatics program at the University of Missouri. As a result of the phase-out of Access Grid version 1.1, another major upgrade was undertaken with the Collaboratory Access Grid node. A commercially developed software application was purchased in order to use the commercial software for standard applications, while also experimenting with open source beta versions. In addition to utilizing the new software, the audio for the current node was upgraded with a state-of-the-art echo cancellation system. The Access Grid node was used in NLM’s tutorials on advanced networking at the 2003 Annual Meeting of the Radiological Society of North America, co-sponsored by NLM and Internet2.

The EtherMed database of Web-accessible health professions educational materials continued to be expanded through collaborations with colleagues at the University of Utah, UCLA, and the University of Oklahoma. A major FY2004 upgrade allows outside individuals to nominate Web sites and enter information for later review. After initial testing, this improvement is expected to simplify the task of identifying sites to be included in the database. Another major review of EtherMed was completed using an NLM-developed set of search queries. Additions to the database are being held until the research is complete.

Scalable Information Infrastructure
The purpose of the Scalable Information Infrastructure (SII) initiative is to encourage the development of health-related applications of scalable, network aware, wireless, geographic information systems, and identification technologies in a networked environment. The initiative focuses on situations that require, or will greatly benefit from, the application of these technologies in health care, medical decision-making, public health, large-scale health emergencies, health education, and biomedical, clinical and health services research. Projects must use test-bed networks linking one or more of the following: hospitals, clinics, health practitioners’ offices, patients’ homes, health professional schools, medical libraries, universities, medical research centers, laboratories, or public health authorities.

FY2004 was the first year of a three-year effort for 11 SII research contract awards. Several SII projects have already made notable progress: an early prototype system using wireless networks, GPS, RF tags, and handheld and wearable computers was developed by the University of California, San Diego; an auditorium-scale presentation of 3D anatomy and collaborative surgery with haptics was conducted with Stanford University and collaborators in Australia; a monitoring system was implemented for the Project Sentinel Collaboratory information security program at Georgetown University; a secure XML medical record template for individuals was developed at the Children’s Hospital in Boston; and significant progress was made in viewing and
manipulating 4D datasets through the “4D Visible Mouse Project” at the Pittsburgh Supercomputing Center, Carnegie Mellon University.

Telemedicine
The Telemedicine Information Exchange, sponsored by the NLM, is a Web-based resource of telemedicine and telemedicine related activities maintained by the Telemedicine Research Center in Portland, OR. During FY2004, approximately 727 non-NLM bibliographic citations and other records were delivered to the NLM. The University of Pennsylvania Dental School completed its NLM-sponsored project during this past year. Given the declining manpower in dentistry, limited training facilities, and the increasing cost of dental education, the project considered the feasibility of providing a distributed program of dental instruction.

The virtual microscope project has been initiated by in-house staff. The project team has developed a Web-based system that allows users to view an image in an interactive manner, simulating the experience of examining a slide under a microscope. Potential applications of the tool include medical education, quality control and diagnostic proficiency surveys, and telemedicine. Staff continue to participate in the monthly meetings of the multi-agency Joint Telemedicine Working Group. Participating in this group, Lister Hill Center staff made a formal presentation to Congress and the Administration on state-of-the-art Telemedicine and e-Health projects and solutions.

National Center for Biotechnology Information

David Lipman, M.D.
Director

The National Center for Biotechnology Information (NCBI), established in November 1988 by Public Law 100-607, is a division of the National Library of Medicine. The establishment of the NCBI by Congress reflected the important role information science and computer technology play in helping to elucidate and understand the molecular processes that control health and disease. Since the Center’s inception in 1988, NCBI has established itself as a leading resource, both nationally and internationally, for molecular biology information.

NCBI is charged with providing access to public data and analysis tools for studying molecular biology information. Over the past 16 years, the ability to integrate vast amounts of complex and diverse biological information created the scientific discipline of bioinformatics. It is now almost impossible to think of an experimental strategy in biomedicine that does not involve some dependence on bioinformatics. At the core of this shift is the flood of genomic data, most notably gene sequence and mapping information. NCBI will meet the challenge of collection, organization, storage, analysis, and dissemination of scientific data by designing, developing, and distributing the tools, databases and technologies that will enable the gene discoveries of the 21st century.

The Center meets these goals by:
• Creating automated systems for storing and analyzing information about molecular biology and genetics;
• Performing research into advanced methods of computer-based information processing for analyzing the structure and function of biologically important molecules and compounds;
• Facilitating the use of databases and software by researchers and health care personnel; and
• Coordinating efforts to gather biotechnology information worldwide.

NCBI supports a multidisciplinary staff of senior scientists, postdoctoral fellows, and support personnel. NCBI scientists have backgrounds in medicine, molecular biology, biochemistry, genetics, biophysics, structural biology, computer and information science, and mathematics. These multidisciplinary researchers conduct studies in computational biology as well as the application of this research to the development of public information resources.

NCBI programs are divided into three areas: (1) creation and distribution of databases to support the field of molecular biology; (2) basic research in computational molecular biology; and (3) dissemination and support of molecular biology databases, software, and services. Within each of these areas, NCBI has established a network of national and international collaborations designed to facilitate scientific discovery.

GenBank: The NIH Sequence Database

GenBank® is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. NCBI is responsible for all phases of GenBank production, support, and distribution, including timely and accurate processing of sequence records and biological review of both new sequence entries and updates to existing entries. Integrated retrieval tools have been built to search the sequence data housed in GenBank and to link the results of a search to other related sequences, bibliographic citations, and other related resources. Such features allow GenBank to serve as a critical research tool in the analysis and discovery of gene function as well as discoveries that lead to identification and cures for a number of diseases.

Important sources of data for GenBank are direct sequence submissions from individual scientists and genome sequencing centers, and substantial staff and resources are devoted to the analysis and assembly of genome data. NCBI produces GenBank from thousands of sequence records submitted directly from researchers and institutions prior to publication. Records submitted to NCBI’s international collaborators, EMBL (European Molecular Biology Laboratory) at Hinxton Hall, UK, and DDBJ (DNA Data Bank of Japan) at Mishima, are shared through an automated system of daily updates. Other cooperative arrangements, such as those with the U.S. Patent and Trademark Office for sequences from issued patents, augment the data collection effort and ensure the comprehensiveness of the database.

In FY2004, approximately 7 million sequences were added to GenBank, and the base count rose from 33 billion in August 2003 to 40 billion in August 2004. The 34 million sequences in GenBank represent data from over 130,000 organisms.

GenBank indexers with specialized training in molecular biology create the GenBank records and apply rigorous quality control procedures to the data. NCBI taxonomists consult on taxonomic issues, and, as a final step, senior NCBI scientists review the records for accuracy of biological information.
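
As a rough check on the FY2004 GenBank growth figures quoted above, the short Python sketch below works through the arithmetic. It uses only the rounded numbers given in this report (33 and 40 billion bases, 7 million sequences added, 34 million sequences in total). The function and variable names are illustrative and are not part of any NCBI tool. The base counts compare August to August while the sequence count is for the fiscal year, so the results should be read as approximate.

```python
# Rounded figures quoted in the report; these are the only inputs assumed here.
BASES_AUG_2003 = 33_000_000_000   # base count, August 2003
BASES_AUG_2004 = 40_000_000_000   # base count, August 2004
SEQUENCES_TOTAL = 34_000_000      # sequences in GenBank
SEQUENCES_ADDED = 7_000_000       # sequences added during FY2004


def percent_growth(old: float, new: float) -> float:
    """Return the percentage increase from old to new."""
    return (new - old) / old * 100.0


def genbank_summary() -> dict:
    """Derive a few illustrative ratios from the rounded report figures."""
    return {
        # Growth of the base count over the twelve months
        "base_growth_pct": percent_growth(BASES_AUG_2003, BASES_AUG_2004),
        # Share of the current sequence total that was added in FY2004
        "added_share_pct": SEQUENCES_ADDED / SEQUENCES_TOTAL * 100.0,
        # Average record length implied by the two totals
        "mean_bases_per_sequence": BASES_AUG_2004 / SEQUENCES_TOTAL,
    }


if __name__ == "__main__":
    for name, value in genbank_summary().items():
        print(f"{name}: {value:.1f}")
```

With these rounded inputs, the base count grew by about 21 percent over the year, the sequences added in FY2004 make up roughly a fifth of the total, and the average record holds on the order of 1,200 bases.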

Improving the biological accuracy of submitted data as well as updating and correcting existing entries are high priorities for the GenBank team. New releases of GenBank are made available every two months; daily updates are made available via the Internet and the World Wide Web.

When scientists submit their sequence data to GenBank, they receive an “accession number.” This number serves as a tracking device and allows the scientist to reference the sequence in a subsequent journal article. Sequence data submitted in advance of publication is maintained as confidential, if requested.

In FY2004, the restriction on sequence length for database records was removed. Previously, the International Nucleotide Sequence Database Collaborators (INSD) had agreed to a 350,000 base limit in order to maintain compatibility with various existing biology software packages. Newer software versions are able to analyze long sequences quickly, and by removing the length limitation, megabase comparisons can be performed more efficiently.

NCBI is continuously developing new tools, and enhancing existing ones, to improve access to, and the utility of, the enormous amount of data stored in GenBank. Sequence data, both nucleotide and protein, is supplemented by pointers to corresponding PubMed bibliographic information, including abstracts and publishers’ full-text documents. GenBank provides links to outside sources such as biological databases and sequencing centers. In addition to literature information, GenBank also provides links to related information in other Entrez databases. The availability of such links allows GenBank to serve as a key component in an integrated database system that offers researchers the capability to perform comprehensive and seamless searching across all available data.

The Third Party Annotation (TPA) database, created in conjunction with international counterparts EMBL and DDBJ, supports third party annotation of sequence data already available in public databases. Sequences in the TPA database are predicted or assembled from such sources as ESTs, genome data, and other unannotated sequences. Publication of the analysis in a peer-reviewed scientific journal is a requirement of this database. NCBI also accepts submissions from Whole Genome Shotgun (WGS) sequencing projects. Annotations are allowed in these assemblies and records are updated as sequencing progresses and new assemblies are computed. Forty-nine WGS sequencing projects were released during FY2004.

Improvement of NCBI’s sequence submission software continues to be a high priority. A new version of Sequin, NCBI’s stand-alone submission tool, was released in FY2004. In this new version, improvements were made to facilitate ease of TPA sequence submission, the alignment view in the Record Viewer was improved, and Batch submission features have increased functionality. The GenBank submission tool Sequin MacroSend allows submitters to upload a Sequin file from their computer directly to the GenBank indexing staff, where their submission is immediately given a temporary identification number. Guides for specialized submissions are also available on the GenBank site.

BankIt, another sequence submission software tool, is now in its tenth year of use. Some of the improvements made to BankIt this year include the ability to identify sequences appropriate for the TPA database, options for including strain name for mouse, rat, and Influenza virus, and a more explicit example of features that can be added to a record.

GenBank has evolved to contain several types of sequence information, from relatively short Expressed Sequence Tags (ESTs) to assembled genomic sequences that are several hundred kilobases in length. EST data obtained through cDNA sequencing are critical to understanding gene function and therefore continue to be heavily represented in GenBank. The Genome Survey Sequences (GSS) division of GenBank contains sequences that are genomic in origin, rather than cDNA. The Sequence Tagged Site (STS) division consists of short sequences that are operationally unique in the genome and used to generate mapping reagents. Expanded STS information can be found in the UniSTS database.

Entrez Genomes contains records representing over 2,000 species including bacteria, archaea, and eukaryotes, complete microbial genomes, a number of viroids, mitochondria, a broad host range of plasmids, and over 1,000 viruses. The genomes represent both completely sequenced organisms and those for which sequencing is in progress. Approximately 20 new complete genomes and over 900 records for viral, microbial, and organellar chromosomes were added to the database in FY2004. Twenty-two organism-specific genome resource pages are now available including chimpanzee, rat, mouse, chicken, cow, dog, pig, sheep, cat, and honey bee.

The Human Genome

NCBI is responsible for collecting, managing, and analyzing human genomic data generated from the sequencing and genome mapping initiatives of the public Human Genome Project. NCBI also plays a key role in assembling and annotating the human genome sequence. This resource is truly an international public sequencing effort due to the cooperation of scientists and sequencing centers from around the world. In FY2004, multiple annotated
builds of the human genome were released to the public. The latest Build 35, version 1, was released in June.

Assembling and Annotating the Human Genome

A team of NCBI scientists is engaged in annotating, or characterizing, the biologically important areas of the genome. In FY2004, annotation for genome builds was based on Gnomon, the new gene prediction program developed by NCBI scientists. Gnomon puts a greater emphasis on coding propensity and matches to existing proteins when predicting genes. To create a gene model, Gnomon finds the best self-consistent set of transcript and protein alignments to a genomic region and uses these alignments as constraints for a Hidden Markov Model (HMM)-based gene prediction. As a result of the Gnomon program, the number of genes for human, mouse, and rat genome builds has decreased significantly, while the number of models identified as pseudogenes has increased. The number of human genes is now predicted as low as 20,000 versus earlier estimates of 35,000.

NCBI Resources Designed to Support Analysis of the Human Genome

NCBI has developed a suite of genomic resources to support comprehensive analysis of the human genome, as well as the complete genomes of several model organisms. Specialized tools and databases have also been designed to facilitate researchers’ use of this data. NCBI maintains an expanding collection of specialized, yet integrated, database repositories that collectively capture and redistribute the biological relationships between genome sequences, expressed mRNAs and proteins, and individual sequence variations.

NCBI’s web resource, “Human Genome Resources,” serves as a nexus for the collection and storage of diverse human data. This online guide provides centralized access to a full range of genome resources, including links to BLAST, dbSNP, LocusLink, RefSeq, Map Viewer, Gene, Homology Maps, UniGene, HomoloGene, and GEO. NCBI’s Human Genome Sequencing site provides access to information on sequencing efforts and various other types of resources, such as chromosome-specific mapping information, and TaxPlot for genome similarity plotting.

NCBI’s Map Viewer provides a graphical display of features on assemblies of genomic sequence data as well as cytogenetic, genetic linkage, physical, and radiation hybrid maps, when available. Map features that can be seen along the sequence include NCBI contigs, the BAC tiling path, the location of genes, exons, STSs, FISH mapped clones, ESTs, GenomeScan models, SAGE tags, and sequence variation. Maps from other sequencing centers are also available. Genes or markers of interest can be found by submitting a query against a whole genome, or by querying one chromosome at a time. The results table includes links to a chromosome graphical view where the gene or marker can be seen in the context of additional data. The Evidence Viewer is a feature that provides graphical biological evidence supporting a particular gene model, and the Model Maker allows users to build a gene model using selected exons.

In FY2004, NCBI continued to improve its Map Viewer. A new Map Viewer home page was released, grouping the organisms for which map information is available. Twelve organisms were added to Map Viewer this year, bringing the total number of organisms to 34. An advanced search capability was added which allows restriction of searches to specific chromosomes or searching for objects based on specific attributes. A new comparative maps feature allows users to view maps of different organisms side-by-side for comparison.

Genes and Disease is a collection of articles designed to educate the lay public and students on how genes are inherited and cause disease and how an understanding of the human genome will contribute to improving diagnosis and treatment of disease. This collection, part of the NCBI Books site, contains descriptions for over 150 genetic diseases and links to databases and organizations for additional information. For each gene description there is a link to PubMed, the Online Mendelian Inheritance in Man database (OMIM), the Map Viewer, LocusLink, and BLink for related sequences.

OMIM is an electronic version of Dr. Victor McKusick’s “Mendelian Inheritance in Man,” a catalog of human genes and genetic disorders. The database, produced at Johns Hopkins School of Medicine, contains over 15,500 records. OMIM also contains two maps showing the cytogenetic location of disease genes. The “OMIM Morbid Map” is organized by disease, and the “OMIM Gene Map” is organized by chromosome. During the past year, information connecting diseases to sequence (genes or markers) was used to create a human sequence-based phenotype map. More than 1,600 diseases have been placed in sequence coordinates on the human genome.

The GeneTests database produced at the University of Washington is now being supported, as is OMIM, by contract from NCBI. GeneTests is used more than 25,000 times a day by genetics counselors and physicians for its comprehensive genetic testing information and genetic disease descriptions. Data produced by this database is now being integrated into NCBI data resources.

LocusLink, NCBI’s original single-query interface to curated sequences and descriptive information about genetic loci, continues to grow.
The number of genes represented expanded to 152,000, not counting genes predicted from NCBI’s genome annotation pipelines. In FY2004, organisms added to LocusLink include honey bee, chicken, dog, pig, purple sea urchin, African clawed frog, and Western clawed frog, bringing the total number of organisms to 15. LocusLink provides one of the windows into NCBI’s annotation of the human genome, with direct links to the Map Viewer, gene annotation, gene ontology terms, and links to other NCBI resources.

The Reference Sequence (RefSeq) database provides a comprehensive, integrated, non-redundant set of sequences, including genomic DNA, transcript (RNA), and protein products for major research organisms. These standards serve as a basis for medical, functional, and diversity studies by providing a stable reference for gene identification and characterization, mutation analysis, expression studies, polymorphism discovery, and comparative analysis. In FY2004, the NCBI RefSeq database grew by 48% and the full release of all NCBI RefSeq records includes over 1.1 million proteins from 2,558 organisms.

NCBI is working with other groups to compare and evaluate genome annotation data and identify the set of proteins, as annotated on genomic sequence, which pass quality tests and are consistently identified by different groups. A comparison between the human genome annotation provided by NCBI and the Ensembl groups determined that approximately 16,000 annotated proteins are identical between the two groups, 99% of which are annotated by sequences provided in the curated RefSeq database.

NCBI has created new infrastructure to support effective communication with research groups that work on genome sequencing, annotation, and biology. During FY2004, pre-existing collaborations continued for seven genomes, in addition to an established viral genomes advisors group, and new collaborations for eleven organisms plus fungi were established. From this effort, eleven sets of supporting web pages were added along with related map and/or genomic sequence information in the NCBI Map Viewer.

The Entrez Gene database debuted early in FY2004 and is a significant step toward providing a much larger scope of gene-specific data at NCBI. Entrez Gene integrates information about genes from LocusLink and gene features annotated on RefSeqs from Entrez Genomes. Currently more than 2,275 taxa are represented in the Gene database with a total of about 860,000 genes.

RNA interference (RNAi) is an emerging technology for silencing specific genes that is proving to be of great utility as a research tool and may have important clinical applications. In response to a request from Dr. Zerhouni, the NIH held an RNAi workshop earlier this year and established a cross-institute working group to aggressively pursue RNAi research. To fully realize this investment, the NCBI has established a database to store information on RNAi reagents and experimental results. NCBI scientists are currently working with other NIH scientists to enter the appropriate information, and the first public release of the database is planned for early 2005.

As the number of sequenced genomes continues to grow, there is increasing interest in comparative analysis of genes from represented species. The NCBI HomoloGene resource performs such large-scale comparison automatically and presents the results to scientists, obviating the need for individual analyses. Over the past year, the HomoloGene system has been completely revised to use genome-based information, as opposed to the transcript-based information that was available in the pre-genome era. In a recent release, comparisons of over 16 billion pairs of genes were performed, leading to 103,677 gene homology groups. HomoloGene was also added to the Entrez retrieval system in FY2004.

The dbSNP database of genetic variation is a comprehensive catalog of common human polymorphisms for the international research community. dbSNP continued to experience rapid growth in FY2004. New content was driven by ongoing surveys of human sequence variation for the International Haplotype Map Project (HapMap), major submissions by private companies, and variation analysis using whole genome shotgun reads for freshly sequenced organisms. dbSNP expanded support for genotype data in 2004 with a new schema and several rounds of intensive staff curation to uniquely identify the individuals represented by overlapping sets of cell line reagents and pedigree data. dbSNP group members worked in both production and advisory roles for two major NIH projects, the Mammalian Gene Collection (MGC) and HapMap, on issues of SNP mapping, sequence annotation, data interpretation, and final data quality assessment. The topic of haplotype representation in dbSNP and annotation of linkage disequilibrium on reference genome sequence will be a major development issue in FY2005 when HapMap’s phase I genotype data and high-resolution haplotype data are released for unrestricted distribution.

Quantitative trait loci (QTLs) have measurable effects on an organism’s “phenotype,” i.e., an open-ended set of measures of the natural processes of metabolism, growth and reproduction, or the abnormal processes of disease. In 2004, NCBI developed a new repository for phenotype data to include this crucial element of biological data in its public genome data infrastructure. NIH’s
commitment to understanding human health and disease through the increasingly powerful lens of genomic biology is producing large systematic datasets relating genotype/haplotype to phenotype for humans and comparative model organisms including yeast, fruit fly, mouse, worm and dog. A retrieval system is being designed to accommodate comparisons across heterogeneous submissions of experimental results; capture the multi-dimensional relationship between phenotype measures and data on sequence composition, haplotype variation or level of expression; and support retrieval across multiple scales of organization.

In 2004, NCBI designed a prototype XML schema for a proof-of-concept implementation. It will include publicly available data for human, mouse, rat, and fly. The model’s scope was deliberately confined to those organisms with extensive available genomic sequence and an established ontological representation of anatomy, developmental stage and disease. Phenotype will become a new Entrez database in 2005 to facilitate the association of phenotype data with other Entrez components such as dbSNP, Gene, OMIM, and GEO. In this way, putative risk factors like genetic variants or haplotypes, linked QTLs, drug treatments, diet, and epigenetic status can be associated with measurable traits and compared across studies to generate hypotheses as to the true etiology of diseases. Using the Map Viewer, QTLs can be aligned across organisms to identify syntenic regions of phenotype linkage. The phenotype database can become a new point of entry into the Entrez search space via the ontological concepts reflected in its controlled vocabulary.

Model Organisms for Research

The genomes of model organisms can provide genetic information for human development and gene regulation, genetic disease, and the evolutionary process. NCBI genome resource guides provide information on diverse organism-related resources from multiple centers including sequence, mapping, and clone information, when available. The guides also provide easy navigation to organism-specific BLAST pages, and other NCBI resources. NCBI currently provides genome resource guides for 21 organisms other than human. Resource guides added in FY2004 include Aspergillus (fungus), bee, chicken, cow, Dictyostelium, dog, frog, pig, sea urchin, and sheep.

The mouse genome was the first model organism available on the NCBI website. The mouse genome resource guide has links to mapping and BLAST pages as well as information on sequencing progress, sequencing centers, strain resources, and a monthly newsletter designed for the mouse research community. In FY2004, NCBI Build 33 was released and represents a third generation composite assembly. Rat genome Build 2 was released in FY2004 with new maps including an assembly map, EST alignment maps for human, rat, and mouse, and an ab initio map.

Literature Databases

PubMed is a web-based literature retrieval system developed by NCBI to provide access to citations and abstracts for biomedical science journal literature. It is the bibliographic component of the NCBI’s Entrez retrieval system and provides links to full-text journal articles at web sites of participating publishers, as well as to other related web resources.

Full-text journals with PubMed links have increased from 4,054 in September 2003 to over 4,400 in September 2004. Approximately 60% of all PubMed citations from 1990–2003 now have links to full-text. Usage of PubMed by the scientific and lay communities has also grown considerably since its introduction in 1997, with up to 2.8 million searches and over 300,000 users per day.

In August, the 15 millionth citation was added to PubMed, and during this year over 1.7 million OLDMEDLINE citations were added. The OLDMEDLINE citations were originally printed in the hardcopy indexes published from 1951 through 1965. The MeSH database was enhanced with terms that are identified by MeSH as pharmacological actions, and a direct link was added to the Clinical Queries page. The Clinical Queries page was also revised and filter strategies were updated. The History page now includes a menu from the search statement number to provide an easier way to combine, delete, and retrieve History statements. The truncation limit in PubMed was increased from 150 variations of a truncated term to 600. The PubMed Batch Citation Matcher was updated to include an email feature and the ability to upload a formatted file.

In September, NCBI released a new Entrez database, NLM Catalog. The NLM Catalog provides access to bibliographic data for over 1.2 million books, journals, audiovisuals, computer software, electronic resources, and other materials in the NLM collection via the Entrez retrieval system. The new database is an alternative search interface to the bibliographic records resident in NLM’s online catalog LocatorPlus and supports automated mapping features and MeSH term indexes.

LinkOut is a feature of Entrez designed to provide users with links from PubMed and other Entrez databases to a wide variety of relevant web-accessible online resources, including full-text publications, biological databases, consumer health information, research tools, and more. As of
September 2004, over 1,500 organizations have supplied links to their Web sites, representing a 40% increase from last year. Providers include over 1,000 libraries, 180 full-text providers, and 200 providers of non-bibliographic resources including biological databases. Together they provide links to 29 million Entrez records. LinkOut resources received more than 16 million hits per month, a 35% increase from last year.

The LinkOut for Libraries program continues to provide biomedical libraries the ability to link library patrons from a PubMed citation directly to the full-text of an article. Enhancements to the program include a new upload-holdings function that allows libraries to display print holdings in LinkOut. In addition, a new service, Outside Tool, directs users to a local tool where they can explore information local to their own environment. Approximately 100 institutions have registered to connect their users to internal OpenURL-based link resolvers.

The NCBI Bookshelf provides access to the full text of over 38 textbooks in the clinical and research areas of biomedicine. Books may be searched directly or found through links in PubMed abstracts. An innovative indexing approach developed by NCBI permits readers of electronic books to locate sets of related PubMed articles based on phrase matching. In addition to textbooks from commercial publishers, the Bookshelf also includes monographs authored by NCBI, NLM, and NIH staff. Use of the Books database has increased six-fold in the past year and about two million book pages per month are downloaded by users.

Seven new books were added to the database this year as well as over 100 chapters to continuously published books. One new book, HSTAT (Health Services/Technology Assessment Text), was a database transferred from the LO. HSTAT contains 741 entries including AHRQ Evidence Reports, AHCPR Supported Guidelines and Consumer Guides, Guides to Clinical Preventive Services, and NIH Consensus Development Programs. Two NCBI resources, Genes and Disease and Coffee Break, were also transferred to the Books site. Books added include Molecular Biology of the Cell, 6th Edition, Endocrinology: An Integrated Approach, and The Genetic Landscape of Diabetes. The Bookshelf is developing tools for publishing Microsoft Word documents to XML easily, with no technical knowledge required on the part of the author. The tools are already being used in collaborative projects with the creators of GeneTests (www.genetests.org), the NIH Roadmap Imaging Agent Database group, and the Fogarty Center/World Bank for their book Disease Control in Developing Countries.

PubMedCentral (PMC) is a web-based repository providing free and unrestricted access to full-text life sciences journal literature. This repository is based on a natural integration with the existing PubMed biomedical literature database of abstracts. As of August 2004, PMC included over 160 life science journals. Use of the service has increased by 50 percent relative to last year, reaching 830,000 unique users for the month of September 2004.

PubMedCentral has enhanced its value as a digital archive by scanning back issues of journals for online access. Approximately half of the 350,000 articles in PMC have come from the NLM back issue digitization project in the past year. The complete run of the Bulletin of the Medical Library Association (1911 forward) was released online in November 2003, making it the first of what will be many journals providing archival access.

The BLAST Suite of Sequence Comparison Programs

Comparison, whether of morphological features or protein sequences, lies at the heart of biology. The introduction of BLAST in 1990 made it easier to rapidly scan huge sequence databases for similar sequences and to statistically evaluate the resulting matches. In a matter of seconds, BLAST compares a user’s sequence with up to a million known sequences and determines the closest matches. BLAST also provides users the option of retrieving results with a request ID within 24 hours of searching.

The BLAST suite of programs is continuously enhanced for easier use. Many versions of the database were released this year, with BLAST 2.2.9 being the last build released in FY2004. BLAST genome pages were added for chicken, cow, pig, dog, sheep and cat as well as an environmental samples data page. The BLAST sequence searching server is one of NCBI’s most heavily used services and its usage continues to grow at a pace reflecting the growth of GenBank. Each day more than 200,000 BLAST searches are performed, with users submitting their requests through server/client programs and the Web. Additional hardware and improvements in the BLAST code have enabled response times to decrease despite increases in the size of the database and number of users.

Several programming changes to BLAST queuing and calculation of final alignments have improved the turnaround time for answering a user’s query and reduced the peak load on the formatting machines, allowing more searches with fewer resources. A new BLAST report formatter was made available to the public, improving the presentation and value of results. The improvements include a
new alignment style for closely related sequences as well as applying existing alignment styles to searches of translated nucleotides that were not previously supported. A new graphical viewer added the option of retrieving results in HTML format. This option makes it easier for users to store or even produce the results on their own computer and simplifies NCBI processing of formatting requests.

Improvements are routinely made in order to allow easier access to the tools and database by users. Standalone BLAST software is distributed to allow users to run BLAST searches within their own institution. FASTA BLAST database files were migrated from ZIP to GZIP compression format for improved efficiency and storage. Algorithmic improvements have been added to the MegaBLAST program, allowing it to run three times faster for some searches.

Other Specialized Databases and Tools

Documenting the interaction of human immunodeficiency virus type 1 (HIV-1) proteins with those of the host cell is crucial to our understanding of the processes of HIV-1 replication and pathogenesis. To meet this need, the Division of Acquired Immunodeficiency Syndrome of the National Institute of Allergy and Infectious Diseases, in collaboration with the Southern Research Institute and NCBI, has begun to compile a comprehensive “HIV Protein-Interaction Database” to provide a concise summary of documented interactions between HIV-1 proteins and host cell proteins, other HIV-1 proteins, or proteins from disease organisms associated with HIV/AIDS. The database, introduced in April of this year, has been designed to track information for each protein-protein interaction identified in the literature.

The new Cancer Chromosomes database, made public in March, integrates three databases, the NCI/NCBI SKY (Spectral Karyotyping)/M-FISH (Multiplex-FISH) and CGH (Comparative Genomic Hybridization) Database, the NCI Mitelman Database of Chromosome Aberrations in Cancer, and the NCI Recurrent Chromosome Aberrations in Cancer, into NCBI’s Entrez retrieval system. Cancer Chromosomes supports searches for cytogenetic, clinical, or reference information using the flexible Entrez search and retrieval system.

Searches in Cancer Chromosomes are based on case information and underlying cytogenetic features. From the results list, users can access the pull-down menu and display a variety of features, including the corresponding literature from PubMed, or the “Similarity Report” showing common elements relating to diagnosis, site, and other cytogenetic abnormalities. As in PubMed, there are pre-computed related links for each case related cytogenetically, diagnostically and/or textually.

The new Entrez Genome Project database is organized around cellular organism-specific genomic information, including but not limited to genome sequencing such as whole genome shotgun or BAC ends sequencing projects, large scale EST and cDNA projects, and assembly and annotation projects. The database is designed around a hub-and-spoke model, with an organism comprising the hub, and individual projects the spokes. This allows the collection of disparate data that all refer to a single organism, conveniently displayed for easy access with references to all subprojects. Currently the database contains 1,408 eukaryotic and 517 microbial genome projects. Thirty-seven complete microbial genomes were processed this year, which brings the total number to 192 complete genomes of important plant and human pathogens.

The protein clusters database (Proteus) is designed for reference sequence re-annotation by applying consistent and up-to-date annotation to every protein in a protein family across complete microbial and viral genomes. The intent is to increase the speed of re-annotation, increase accuracy and consistency by applying the same annotation across genomes, and automatically re-annotate proteins that enter the database from new genomes. Proteus, currently under development, contains approximately 550,000 protein sequences in about 40,000 low-level clusters.

Plant Genomes Central is an integrated, web-based portal to plant genomics data and tools. It provides access to large-scale genomic and EST sequencing projects and high resolution mapping projects. The plant genomic effort has one technical hurdle relative to other genomic efforts: the range of plant genome size is very large, extending from approximately the same size as the genome of many small animals to more than five times as large as the human genome. In September 2004, there were over 80 organisms included in the Plant Genomes database, many of which appear in the NCBI Map Viewer.

The Viral Genomes website provides a convenient way to retrieve, view and analyze complete genomes of viruses and phages. This site now contains over 1,600 records for more than 1,200 different species.

The Influenza Virus Resource was created at NCBI with data obtained from GenBank and the National Institute of Allergy and Infectious Diseases Influenza Genome Sequencing Project. This project aims to produce “real time” sequence information during flu season to provide assistance in flu vaccination decisions. This resource will prove to be valuable due to the rapid evolution of flu viruses and will include sequence analysis tools for flu sequences
as well as links to other resources on flu viruses. The database currently contains over 12,000 sequences.

The Gene Expression Omnibus, or GEO, is a high-throughput gene expression/molecular abundance data repository, as well as a curated, online resource for storage and retrieval of gene expression data. Currently, GEO contains over 30,000 user-submitted microarrays. GEO Profiles, previously Entrez GEO, contains seven million profiles accounting for hundreds of millions of individual expression points. GEO DataSet (GDS) contains dataset definitions to facilitate identification of experiments of interest. At this time there are 640 curated experiments in the database. Over 100 organisms are represented in these two databases. Graphical and text query tools for gene profiles and datasets have been developed. Multiple clustering methods are available and link to other resources such as HomoloGene and Entrez Gene.

Serial Analysis of Gene Expression, or SAGE, is an experimental technique designed to quantitatively measure gene expression. The SAGEmap tool compares computed gene expression profiles between SAGE libraries generated by the Cancer Genome Anatomy Project (CGAP) and submitted by others through GEO. SAGEmap also includes a comprehensive analysis of SAGE tags in human GenBank records. Data can be retrieved by tag, sequence, UniGene cluster ID, and library name. Links to genomic sequence via the Map Viewer are also available. SAGE includes a total of over six million tags from 12 organisms and 389 experiments.

The NCBI Taxonomy project provides a standard classification system used by the international nucleotide and protein sequence databases. The Taxonomy database contains the names and lineages of more than 130,000 organisms, both living and extinct, represented by at least one nucleotide or protein sequence in GenBank. A total of 40,326 taxa from newly submitted sequences were added to the taxonomy database over the past year, representing a 22% increase from the previous year. The Taxonomy browser allows searches for information on an organism or taxon’s lineage. Searches of the NCBI Taxonomy database may be made on the basis of whole, partial, or phonetically spelled organism names, with direct links to organisms commonly used in biological research also provided. The Taxonomy system also provides a ‘Common Tree’ function that builds a tree for a selection of organisms or taxa.

A major redesign in FY2004 includes the addition of genomic data and richer links to internal resources, the trace archive, and select external resources through the LinkOut program. A particularly productive collaboration has developed over the last year between the taxonomy group, the NCBI viral genomes project, and the International Committee on Taxonomy of Viruses (ICTV). The PubMedCentral archive is now being scanned and indexed with links to organisms in the Taxonomy database.

UniGene is NCBI’s system for automatically partitioning transcribed sequences into a non-redundant set of gene-oriented clusters. Each UniGene cluster contains sequences that represent a unique known or putative gene, as well as related information such as the tissue types in which the gene has been expressed, and map location. New organisms added to UniGene this year include: Malus x domestica (cultivated apple), Ovis aries (sheep), Apis mellifera (honey bee), Bombyx mori (domestic silkworm), Canis familiaris (dog), Helianthus annuus (sunflower), Lactuca sativa (garden lettuce), and Salmo salar (Atlantic salmon).

The PubChem project, a key component of the NIH Roadmap project in Molecular Libraries and Imaging, was initiated in FY2004. The PubChem database is designed to be a repository for small molecule data and the foundation for the massive amounts of bioactivity data that will be produced by NIH-sponsored chemical genomics centers. Following a rapid development cycle, a public search service became available in September. The PubChem database contains some 900,000 small molecules including their structures, properties, and activities. PubChem marks the first time that this type of comprehensive information on the chemical structures and biological activities of thousands of small molecules will be freely available to the public sector. PubChem in its first version contains legacy data from the NCI’s Developmental Therapeutics Program (250,000 compounds), NIST’s physical properties database (300,000), NLM’s ChemIDplus (100,000), and NIAID’s anti-HIV screening program (50,000). PubChem, part of the Entrez database, contains an extensive set of links to PubMed literature citations as well as links to the proteins and/or genes representing a protein they bind to. Compounds are searchable by chemical structures, by chemical properties, and by bioactivity.

NCBI’s Molecular Modeling DataBase (MMDB) is Entrez’s ‘Structure’ database, a compilation of all the structures in the Protein Data Bank (PDB). PDB is a collection of all publicly available three-dimensional protein structures, nucleic acids, carbohydrates and a variety of other complexes experimentally determined by X-ray crystallography and NMR and is maintained by the Research Collaboratory for Structural Bioinformatics and the European Bioinformatics Institute.

NCBI’s three-dimensional structure viewer, Cn3D, provides easy interactive visualization of molecular protein structures from Entrez. Cn3D also serves as a visualization tool for sequences and sequence alignments. What distinguishes Cn3D is its
ability to correlate structure and sequence information. Cn3D also features custom labeling options, coloring by alignment conservation, and a variety of file export formats that together make Cn3D a powerful tool for structural analysis.

The Conserved Domain Database (CDD) is an Entrez database of sequence alignments and profiles defining protein domains as recurrent evolutionary modules. Identification of conserved domains within a protein sequence is also available via the CD-search service, which is now run by default for each protein BLAST search.

VAST, or the Vector Alignment Search Tool, is a service that identifies similar protein three-dimensional structures of newly determined proteins. VAST compares new proteins to those in the MMDB/PDB database and computes a list of structure neighbors, or related structures, which allows a user to browse interactively, viewing superpositions and alignments in Cn3D.

An interagency agreement with the National Institute of Justice in 2003 commissioned NCBI to develop high-throughput forensic interpretation software for use in state crime labs and for mass fatalities. In 2004, the NCBI development team created and released two public domain software packages: BatchExtract, a software utility to convert DNA electropherogram instrument files into ASCII text for independent analysis, and a beta test version of OSIRIS, an Open Source Independent Review and Interpretation System. NCBI and the Florida Department of Law Enforcement are currently validating OSIRIS for state crime lab use through a series of concordance studies wherein the forensic genotype calls (i.e., DNA "fingerprint") for 20,000 samples from the instrument's genotyping software and the independent OSIRIS genotypes are compared. A new technology for compressing the fingerprint image will permit the immediate interpretation of database "hit" quality when suspect or convicted felon profiles are compared to crime scene samples, and thus reduce the wait time to act on potential leads from days to minutes.

Database Access

Entrez Retrieval System
The major database retrieval system at NCBI, Entrez, was originally developed for searching nucleotide and protein sequence databases and related MEDLINE citations. With Entrez, users can search gigabytes of sequence and literature data with techniques that are fast and easy to use. A key feature of the system is the concept of "neighboring," which permits a user to locate references or sequences which are related to a given citation or sequence. The ability to traverse the literature and molecular sequences via neighbors and links provides a very powerful and intuitive way of accessing the data.

At this time, Entrez consists of 27 integrated databases providing information on sequences, taxonomy, genes, and literature. Databases added in FY2004 include: HomoloGene, Cancer Chromosomes, NLM Catalog, PubChem Compound, PubChem BioAssay, and PubChem Substance. Entrez Global Query was expanded to include 27 databases for simultaneous searching.

Other Network Services
Usage of NCBI's Web services continues to expand as more information and services are added. NCBI staff continued to make access and usage easier with improved documentation and tutorials. A web usability group was established this year to address issues such as improving awareness of underutilized services, implementing a better and more consistent means to navigate the NCBI site, establishing a content management system, and evaluating user experience of all services.

The NCBI web site provides an integrated approach to accessing all of NCBI's databases and services as well as general information about NCBI, its research, data submissions, and updates. At the end of FY2004, NCBI's site was averaging over 40 million hits daily. Because of the mission-critical nature of NCBI's computing platforms for PubMed, Entrez, BLAST, and other services, extensive system monitoring is performed. Based on measurements taken every 15 minutes from 50 ISP monitoring sites across the U.S. and overseas, the average time to load the entire NCBI home page is 0.82 seconds, an average PubMed search takes less than 2.5 seconds, and availability has been better than 99.5 percent.

NCBI has a number of network services that provide programmatic access to several important NCBI databases. A monitoring program was developed to make sure all of these services are responsive and producing correct information. The program quickly notifies relevant staff members if any service for which they are responsible becomes unavailable or starts producing unexpected or incorrect results. The detailed diagnostic information provided by the program allowed coding bugs, configuration errors, and hidden dependencies to be understood and fixed rapidly, greatly increasing the reliability and utility of the services being monitored.

Software development in NCBI has largely shifted from the "C" to the "C++" programming language as the relatively new NCBI C++ Toolkit has matured and stabilized enough to replace the older NCBI Toolkit written in "C."

NCBI started providing sequence data in an XML format known as INSDSeq in FY2004. This format is an XML structured mapping of the GenBank flatfile fields that annotate DNA and

protein sequence records. It is designed to be used by academic groups and biotechnology companies, who have always parsed the data from the GenBank flatfile into their own analysis programs. One significant advantage to INSDSeq is that biological feature intervals are presented in an expanded format that is much easier to parse than the condensed form required in GenBank format.

Software used to extract the actual nucleotide and protein sequence letters from within a GenBank or RefSeq sequence record was also redesigned this year. The new code was placed in programs that produce FASTA files, write GenBank or INSDSeq format, and validate sequence records, and it decreased processing time by up to 40%. Changes to the GenBank flatfile generator greatly sped up the performance of the Entrez web site when producing GenBank reports on genomic sequences. By eliminating the need to reload components of a large genomic record, the overall speed of transfer improved by a factor of two.

The new NCBI computer room in the B2 level of Building 38A now houses a major part of NCBI's computing infrastructure. This room is connected to NCBI's portion of the NLM computer room by multiple gigabit Ethernet connections. FY2004 saw a major expansion and upgrade to the "NCBI Compute Farm," a batch queue processing system that functions as a virtual supercomputer and supports many CPU-intensive production and research activities. NCBI's plan to centralize storage using Network Attached Storage was substantially advanced with a total increase in network storage capacity to approximately 100 TB. Also, NCBI's Computers and Networks Section established the basic network infrastructure necessary to support the NIH Consolidated Collocation Site and began to provision public services at the site. This site, located in Sterling, Virginia, is in addition to NCBI's existing facility in NLM. It will also provide continuity of critical services such as PubMed, BLAST, and FTP in the event that the IT infrastructure in Building 38A is unavailable for any prolonged period.

Research

Research is at the core of NCBI's mission. The Computational Biology and Information Engineering Branches are the main research branches of NCBI, with the latter branch concentrating on applied Research and Development. Each Branch comprises a multidisciplinary team of scientists that carries out research on a broad range of fundamental problems in molecular biology by developing and applying mathematical, statistical, and other computational methods. Research conducted by NCBI investigators has strengthened applications and database work and has led to the development of many new theoretical and practical models that have opened doors to new areas of research.

NCBI's basic research group is within the Computational Biology Branch and consists of 70 senior scientists, staff scientists, research fellows, and postdoctoral fellows. Projects focus on new computer methods to accommodate the analysis of genome sequences and molecular sequence databases due to the rapid growth in large-scale sequencing efforts. Other projects focus on such techniques as the analysis of particular human disease genes and the genomes of several pathogenic bacteria, viruses, and other parasitic organisms, as well as collaborations with experimental laboratories. New areas of research include: development of novel amino acid substitution matrices for improved sensitivity of sequence alignment programs, evolutionary genetics, analysis of gene regulatory pathways, the development of new modeling tools for tumor DNA data, single nucleotide polymorphism data analysis, analysis of malaria genomes for vaccine development, evolutionary analysis of protein domains and comparative genomics, and development of mathematical models of genome evolution. New databases are also being designed for data on conserved protein domains and mRNA expression experimental results. Staff continued collaboration with other NIH institutes for sequence analysis, gene identification, and the analysis of experiments on gene expression. Collaboration was also continued with several institutes worldwide on genetic linkage analysis problems.

A Board of Scientific Counselors composed of extramural scientists meets twice a year to review the research activities of NCBI. The high caliber of the work of this group is evidenced by the number of peer-reviewed publications, over 100 this year with more in press. The staff participated in numerous oral presentations and mounted posters at various scientific meetings, and at universities worldwide. Presentations were also made to visiting delegations, oversight groups, and steering committees. NCBI also hosted numerous outside speakers.

The NCBI Postdoctoral Fellows program is designed to provide training for doctoral graduates in a variety of fields including molecular, computational, and structural biology as well as graduates in other fields who elect to obtain additional training in computational biology. The NCBI uses the NIH Intramural Research Training Award Program and the Fogarty Visiting Fellow mechanisms to recruit for this program.

Outreach and Education

NCBI continues to expand its outreach and education programs to increase awareness of its myriad public databases and specialized tools and services. Over the

past year NCBI staff maintained a general web site on NCBI resources; presented at numerous scientific exhibits, seminars and workshops; sponsored a number of training courses, both lecture and hands-on courses; and published and distributed various forms of printed information.

Education: NCBI Courses
In response to an ever-increasing demand for education and training in the use of the increasing diversity of NCBI's products and services, the course, "A Field Guide to GenBank and NCBI Resources," was expanded and is taught at NIH and throughout the United States as requested. The course consists of a three-hour lecture, a two-hour hands-on practicum, and one-on-one sessions if requested. In FY2004, additional modules were developed that focus on specific tools and databases including structures and gene expression. An extended two-day course was also presented that combined the main course and separate modules. The 11-member teaching staff presented 64 courses to over 5,000 people in FY2004.

Education: Mini-Courses and Lecture Presentations
NCBI offers 10 mini-courses to provide a practical introduction to various programs. Three new mini-courses introduced in FY2004 include "GenBank Quick Start," "Identification of Genes and Disease," and "Correlating Disease Genes and Phenotypes." This year, 28 mini-courses were offered to over 1,200 participants.

Education: Bioinformatics Training
To help NIH researchers make optimal use of computer science and technology to address problems in biology and medicine, the NCBI has an intramural Core Bioinformatics Facility (CoreBio), a network of bioinformatics specialists serving individual institutes within the NIH. Individual CoreBio Members are trained over a nine-week period in the use of bioinformatics tools provided to the research community by NCBI. The CoreBio Members, in turn, advise researchers within their respective institutes as to the best methods for conducting their bioinformatics analyses. Information exchange among the CoreBio Members and the NCBI faculty is facilitated by regular meetings and e-mail forums.

CoreBio has trained representatives from 15 research institutes at NIH, conducting eight 9-week training programs since the program began in 2001, two of them in the past year. Twenty-five update sessions and two special topic sessions for the institute representatives have also been held. One-on-one consultations are available on an ongoing basis for NIH scientists with NCBI faculty in the NCBI Learning Center, the NIH Library, and the NCI-Frederick Cancer Research and Development Center.

Education: Extramural Educational Collaborations
The educational collaboration program was established to train a network of bioinformatics support specialists who provide local educational and user support services for a wide range of users and needs. The university medical library is becoming a centralized point for providing these services at the local level, and members of the collaboration are based in institutions that are leading this trend.

The third "NCBI Advanced Workshop for Bioinformatics Information Specialists" was held in FY2004. The collaborators and course alumni offer a variety of year-round services at their universities, including workshops on NCBI resources, individual research consults and support, and web portals. Many of the workshops are based directly on materials presented in the Advanced Workshop, thereby extending the impact of original materials. Together, the collaborators and course alumni form the growing Bioinformatics Support Network, a group supported by NCBI which has been established for the purpose of communication and continuing education among members.

A regional training program throughout the country for the three-day introductory course was launched to complement the introductory course taught by NLM by increasing its accessibility nationwide. Four courses were offered with a total of 63 participants in addition to the 16 at the NLM course. Participants support users of NCBI resources at their institutions through instruction integrated into their training curriculums, introductory workshops, and direct clientele assistance. Regional courses were taught by Educollab members who also work with NCBI to teach the five-day Advanced Workshop. The purpose of both the introductory and advanced workshops, as well as the Educollab program, is to train the trainers, who then provide assistance with NCBI resources to thousands of end-users across the country.

Outreach: User Guides for NCBI Resources
NCBI has continued to develop a comprehensive list of fact sheets that outline the services and databases offered by NCBI. These fact sheets and guides are available for printing via the "About NCBI" site. In addition, a number of other informational and educational resources are available on the NCBI Web site. Links are available that discuss the fundamental principles of biomolecular research and underlying sequence similarity search tools. Interactive tutorials may be found for a number of databases and search and retrieval tools such as Entrez, PubMed, Structure, and BLAST.

NCBI News is a quarterly newsletter designed to inform the scientific community about NCBI's current research activities, as well as the availability of new database and software services. The newsletter contains information on user services, announcements of new or updated services and available genomes, NCBI investigator profiles, and a bibliography of recent staff publications. In FY2004, over 18,000 printed copies of the NCBI News were distributed quarterly. Access to the newsletter via the NCBI Web site has increased dramatically as more people have become aware of its availability online.

Biotechnology Information in the Future

Over the past few years, there has been an explosion in the volume of genomic data produced by the scientific community, most notably in the amount of whole genome, and gene sequence and mapping information. This is due in large part to the release of the human genome, as well as the release of whole-genome sequences from other model organisms. The commitment to providing the scientific community with both the resources and tools needed to fully explore this data as quickly as possible, as well as recent advances in molecular analysis technologies, promises that the exponential growth in genomic data will only increase. This reinforces the need to build and maintain a strong infrastructure of information support. NCBI, a leader in the fields of computational biology and bioinformatics, plays an active and collaborative role in deciphering the human genome, as well as other genomes, and in developing state-of-the-art software and databases for the storage, analysis, and dissemination of data. The genomic information resources developed and disseminated thus far by NCBI investigators have contributed significantly to the advancement of the basic sciences and serve as a wellspring of new methods and approaches for applied research activities. The value of these resources will continue to grow, as NCBI is committed to the challenge of designing, developing, disseminating, and managing the tools and technologies enabling the gene discoveries that will significantly impact health in the 21st century.

Extramural Programs

Milton Corn, M.D.
Associate Director

The Extramural Programs Division (EP) of NLM continues to receive its budget under two authorizing acts: the Medical Library Assistance Act (unique to NLM), and Public Health Law 301 (covers all of NIH). The funds are expended mainly as grants-in-aid, and in some instances as contracts, to the extramural community in support of NLM goals. Review and award procedures conform to NIH policies. The EP Web site at http://www.nlm.nih.gov/ep/funded.html lists grants awarded since 1997, with links to abstracts provided in the NIH CRISP database.

EP issues grants in a broad variety of programs, all of which pertain to informatics and information management with the exception of the Publication Grant program:
• Resource Grants for information management, often involving medical libraries
• Training and fellowship grants to support training of informaticians and information specialists
• Research Grants in informatics, information science, and biomedical computing
• Research Resource grants to support unique tools for informatics and bioinformatics
• Publication and Conference grants to enhance scientific & scholarly communication
• SBIR/STTR grants to support informatics innovations in small businesses
• Special Projects and collaborations with other agencies

Highlights of FY 2004:
• The number of applications assigned to NLM and reviewed by BLIRC continued to increase. Statistics show a drop in success rates for most NLM grant programs due to the increase in applications and a leveling of the budget after several years of substantial increases.
• A new support staffing model for extramural grants offices has been under development at NIH, following the A-76 competition, which was won by NIH. EP participated in early training for several of these people.
• EP continues to refine and sharpen its statements about priorities and interests for grant projects in biomedical informatics and bioinformatics.
• The new EP web site and new NLM web site are providing improved access to grants information for prospective applicants as well as simplified instructions for those new to preparing applications.

Resource Grants (MLAA)

Resource Grants, authorized by the Medical Library Assistance Act, support access to information, connect computer and communications systems, and promote collaboration in networking, integrating, and managing health-related information. The four Resource Grant programs range in complexity as well as in dollar amounts and duration. Internet Access to Digital Libraries (IADL), Information Systems, and Integrated Advanced Information Management Systems (IAIMS) grants are considered "seed" grants designed to initiate and deploy elements of the information environment that are expected to become self-sustaining after grant funding ends. Publication grants support the development of scholarly works in selected areas relevant to health and biomedical sciences. All Resource Grants are open to public and private, nonprofit health institutions engaged in health education, research, patient care, and administration. Many include health sciences or public librarians as active participants.

Internet Access to Digital Libraries Grants
IADL grants enable organizations to offer access to health-related information provided by NLM and others, to transfer files and images, and to interact by e-mail and videoconferencing with colleagues throughout the world. IADL grants provide up to $45,000 for a single institution and up to $8,000 each for up to 15 additional performance sites. The applicant may propose two years as the project period, but a longer project period does not increase the total size of the award. Forty-four applications for IADL grants were reviewed in FY2004, from which 7 new grants were funded. The average priority score of new IADL grants funded was 155. Nineteen of the applications were received from community organizations or health centers and seventeen from independent hospitals. Of those awarded, 6 went to community organizations or health centers. In addition, 2 grants were awarded that were approved for funding in FY2003 but not funded until FY2004.

Information Systems Grants
Information Systems Grants, which average $150,000 per year for up to three years, are suitable for a broad variety of information management projects. They emphasize the use of information technology to bring useful health-related information to end-users, professional and/or consumer. This flexible grant mechanism is often used to apply a new technology in a way that improves management of health information or to create unique digital information

resources and services. Eighty-four information system grant applications were reviewed in FY2004, and 15 new grants were funded. The average priority score for new information system grants funded was 163. Forty-three of the applications came from academic centers, 22 from community or health centers, and 13 from independent hospitals. Of those awarded, 8 went to community or health centers and 6 to academic centers. In addition, 4 grants were awarded that were approved for funding in FY2003 but not funded until FY2004.

Integrated Advanced Information Management Systems Planning Grants and Operations Grants
The NLM provides IAIMS grants to health-related organizations that seek to plan, design, test, and deploy systems and techniques for integrating data, information, and knowledge resources into a comprehensive networked information management system that crosses organizational and disciplinary boundaries.

The IAIMS program contains five options, of which two are funded with MLAA funds. IAIMS Planning Grants provide up to $150,000 per year for one or two years, with an optional infrastructure supplement of $100,000 in the second year; IAIMS Operations Grants provide up to $400,000 per year for up to four years. Twenty-three IAIMS grant applications were reviewed in FY2004, of which 14 were planning grants. Two awards were made for IAIMS Operations grants, and 3 new IAIMS planning grants were awarded. The average score of successful IAIMS planning grants was 169.

Publication Grant Program
The Publication Grant Program provides short-term financial support for scholarly research that will lead to a publication. Studies prepared or published under this NLM program include critical reviews or research monographs in the history of medicine and life sciences; special areas of biomedical research and practice; medical informatics, health information science, and biotechnology information. Unique at NIH, the publication grant is also unusual among NLM's grant programs in that it accepts applications from individuals without an organizational affiliation. Seventy publication grant applications were reviewed in FY2004, and 18 new grants were awarded. The average priority score of new publication grants funded was 167. In addition, one new grant was awarded that was approved for funding in FY2003 but not funded until FY2004.

Training and Fellowships (MLAA)

Overview
Exploiting the potential of computers and telecommunication for health care information requires investigators who understand biomedicine as well as fundamental problems of knowledge representation, decision support, and human-computer interface. NLM remains the principal source of support nationally for research training in the fields of biomedical informatics as applied to clinical medicine and to basic research. NLM provides both institutional and individual training support.

NLM-Supported Training Programs
Five-year institutional training grants support over 250 trainees at pre-doctoral and postdoctoral levels. Eighteen training programs were funded for a new five-year period beginning July 1, 2002. Eleven of the previous twelve were again funded, and seven new programs were added to the set. NLM is expanding its support for such programs in response to the marked recent interest in biomedical computing and the consequent need for trained informaticians. Among our programs, training for bioinformatics is now receiving significantly more attention and opportunity than in previous years, and, for the first time, a program dedicated to imaging informatics is included. For the latter, NLM receives some co-funding from NIBIB, the new NIH Institute for bioengineering and imaging. NIDR continues to contribute funds to NLM to help support slots at these training sites for applicants interested in dental informatics. The 18 programs currently funded are at the following universities: California (Irvine), California (Los Angeles), Columbia, Harvard, Indiana, Johns Hopkins, Minnesota, Missouri, Oregon Health Science, Pittsburgh, South Carolina, Stanford, Rice, Utah, Vanderbilt, Washington, Wisconsin, and Yale.

This program is scheduled to be re-competed in FY2006. To provide EP with a timely overview of what the programs are doing, EP embarked on a cycle of evaluative site visits in FY2004. Each site visited is asked to provide a pre-visit report, giving statistics and other background on the curriculum, students, and faculty. One-day visits were made by a team of three EP staff members plus an outside consultant to the following locations: California (Irvine), California (Los Angeles), Missouri, South Carolina, Rice, and Wisconsin. Following each visit, the principal investigator receives a letter which summarizes the findings of the team. This letter becomes part of the official grant file.

Individual Fellowships

Informatics research training
NLM offers two fellowships for informatics research training: an individual fellowship for basic or applied research (F37), which can be pre- or post-doctoral, and a senior fellowship intended for those with 10 or

more years of professional experience in an appropriate field (F38). In FY2004, 21 applications were received for the F37 program, of which four were awarded. The average score of successful proposals was 159 for the F37 program. Seven applications were reviewed for the F38 program, and one award was made.

Training for Informationists
In October 2003, NLM issued program announcements for two new fellowships, both aimed at supporting the training of in-context information specialists. These programs use the F37 and F38 mechanisms, but emphasize training for professional careers, not research training. One F37 application was received and it was funded.

IAIMS Fellowships
No applications were received for this program.

Early Career Development Awards
This program provides transition assistance for biomedical informaticians who are establishing their initial independent research programs. Applicants may apply without yet having identified their home institution; once a position is secured, the award process is completed. Fourteen applications were reviewed in FY2004 for this program, and 3 awards were made. The average score of a successful application was 165. Two K22 awards were issued that received approval in late FY2003 but could not be funded until FY2004.

Loan Repayment Program
NLM participates in NIH's loan repayment program by identifying applications it is willing to sponsor. These applications are reviewed for merit by a Special Emphasis panel. A central NIH office checks the suitability and substance of the applicant's debt and employment status. For FY2004, NLM funded 7 LRP awards of the 14 received.

Biomedical Ethics
Ethical issues in health care and research produce an enormous literature. This literature comes from law, medicine, public health, philosophy, and government publications. The National Reference Center for Bioethics Literature at Georgetown University continues to offer invaluable resources and guidance for workers in this area. A contract from NLM's Library Operations program area now supports the Center as well as the indexing and cataloging of materials cited in MEDLINE and LocatorPlus. Arrangements were completed in FY2004 to consolidate the two separate contracts previously managed by Extramural Programs and Library Operations into one. Transition will take place in early FY2005.

Research Support (PHS 301)

Research support is provided through a variety of mechanisms, including individual research grants and contracts, cooperative agreements, research resource grants, and others. NLM's research grants support both basic and applied projects involving the applications of computers and telecommunication technology to health-related issues in clinical medicine and in research.

Biomedical Informatics and Bioinformatics

Research Grant Program
In the early years of the R01 grant program, the majority of NLM's research support in informatics focused on the informatics of health care delivery with support both to applied projects (e.g., the electronic medical record, telemedicine) and related basic problems (e.g., natural language processing, data-mining, knowledge representation). In recent years there has been marked expansion in research support for informatics issues related to biological and medical research. Thus, the research grant program now has two "branches," both of which are funded from PHS 301 funds. In FY2004, a new program announcement was issued, updating the language and clarifying NLM's research interests in biomedical informatics and bioinformatics. Forty-eight applications were reviewed for this program, and 7 awards were made. The average score of awarded applications was 174. In addition, 7 grants were funded that were approved in FY2003 but could not be funded until FY2004. All but three research grant applications came from academic centers.

Small Grant Program
To complement its traditional R01 grants, in 2003 NLM issued a program announcement for small project research grants, a mechanism used by most of the NIH Institutes. These grants provide $50,000 per year for one or two years, and are designed to help researchers who are just starting out in an area of inquiry. Feasibility and proof of concept studies, and the gathering of preliminary data that might support a subsequent R01 study, are typical uses of the R03 grant. Thirty-nine R03 grant applications were reviewed in FY2004, and 4 new grants were funded. The average priority score of funded R03 grants was 148. Like R01 grants, most R03 applications came from academic organizations. In addition, 5 R03 grants were funded that were approved in FY2003 but could not be awarded until FY2004.

Informatics for Disaster Management
NLM's program of research grants exploring the application of informatics approaches in natural and man-made disasters, initially an R01 mechanism, is

now an R21 mechanism to better accommodate projects that are more akin to engineering research & development than to hypothesis-testing experimental research. During the formal change process for this program, two other institutes (National Institute of Mental Health and National Institute of Biomedical Imaging and Bioengineering) signed onto NLM’s program announcement. Nineteen new applications in this program were reviewed in FY2004, and no new awards were made. The average score of Informatics for Disaster Management applications reviewed was 308. Eleven of the new applications were from academic centers, four from for-profit firms.

NLM Exploratory/Developmental Grants

NLM’s new Exploratory/Developmental grant fills a niche between Resource and Research grants and was issued in concert with development of such a mechanism by NIH. Announced in April 2003, the R21 grant supports high risk/high yield projects, proof of concept, and work in new interdisciplinary areas. Preliminary data are not required for these grants, and emphasis in review is shifted from hypothesis testing to achievement of milestones during R&D. Eight applications were reviewed in FY2004, and 1 new award was made. The average score of new applications reviewed was 237.

Resource Grants for Biomedical Informatics/Bioinformatics

In August 2004, NLM issued a program announcement for an earlier, expired program of support for scientific research resources. This program, which uses the P41 grant mechanism, is similar to an R01 grant but contains a service component and support for maintenance of a resource or service. The applicant must demonstrate that the proposed resource or service is already actively used by researchers or clinicians across the US or the world. Seven new applications were reviewed in FY2004, and 3 were funded. The average score of the awards was 139.

Conference Grants

Support for conferences and workshops is intended to help scientific communities in focused areas of informatics and bioinformatics to identify research needs, share results, and prepare for productive new work. The average conference grant is about $10,000. The program allows multi-year awards. EP generally caps conference awards at $20,000 per year. To expedite processing of these grants, NIH permits a two-level review to be done by NLM staff. Of three applications received in FY2004, one was funded.

Pan-NIH Projects

NLM and Roadmap Activities

A major pan-NIH enterprise initiated by the Director, NIH, is resulting in requests related to three themes: New Pathways to Discovery, Research Teams of the Future, and Reengineering Clinical Research. NLM is a participant in all of the Roadmap initiatives, and EP staff was actively involved in NIH Roadmap teams for the National Centers for Biomedical Computing and a number of interdisciplinary research initiatives. Although NIH Roadmap grants are considered pan-NIH grants, and awards will be managed by teams of program officers, each Roadmap grant has a “home” Institute.

NCBC and BISTI

Initially, following the award of several planning grants for BISTI (Biomedical Information Science and Technology Initiative) Centers, NIH intended to issue an RFA to support a selection of those centers. Instead, the NIH Roadmap issued an RFA for National Centers for Biomedical Computing (NCBC). P.I.s on the existing Planning Grants were eligible to apply but were not accorded preference in the review. Forty-one grant proposals were received in response to the RFA, and 4 were awarded. NLM is administrative home for one NCBC grant. Because the Roadmap initiative provided only $12 million for Centers, NLM, NIGMS, and NCRR combined to contribute an additional $4 million so that 4 Centers could be funded, each at a total cost per year of $4 million. These cooperative agreements last for five years and can be renewed for another five years.

Special Multi-institute Projects

Multi-institute Program Announcements

In addition to its involvement in the NIH Roadmap, NLM also participates with other NIH and federal organizations in a number of multi-agency projects, including the Human Brain Project, the Pharmacogenetics Research Network, and a number of individual program announcements that focus on tool development, innovation in computational sciences for biomedicine, and other informatics-related topics. The applications for these programs are reviewed by the NIH Center for Scientific Review, and then participating institutes select grants for full or shared funding. NLM participation has been steady but is rarely more than one new grant each year, and in some years none is funded. The statistics for these programs are folded into regular grant program counts; most are R01 grants. An updated listing of the multi-institute initiatives in which NLM participates is available on the EP Web site.
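The application and award counts reported for the PHS 301 research grant programs can be compared more easily as funding rates. The following is a minimal sketch in Python, not part of the report: the dictionary name, the program labels, and the `funding_rate` helper are illustrative assumptions, and only the counts are taken from the text. Awards carried over from FY2003 approvals are left out, and the conference grant figure counts applications received rather than reviewed.

```python
# (applications reviewed, new FY2004 awards) as stated in the report text.
# Labels are illustrative; carry-over awards approved in FY2003 are excluded.
PROGRAMS = {
    "R01 research grants": (48, 7),
    "R03 small grants": (39, 4),
    "R21 disaster management": (19, 0),
    "R21 exploratory/developmental": (8, 1),
    "P41 resource grants": (7, 3),
    "Conference grants": (3, 1),  # report says "received", not "reviewed"
}


def funding_rate(reviewed, awarded):
    """Return the share of reviewed applications that received a new award."""
    if reviewed <= 0:
        raise ValueError("reviewed must be positive")
    return awarded / reviewed


rates = {name: funding_rate(*counts) for name, counts in PROGRAMS.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
```

These rates describe new awards only and say nothing about the priority scores quoted for each program.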

Informatics for the National Heart Attack Alert Program (Research Contracts)

Although some small supplements were added to several of these projects in FY2004, funding for the National Heart Attack Alert Informatics Program is essentially complete. A contractors’ conference was held in the spring of FY 2004. NLM’s involvement in the program has now ended.

Shared Funding for Research Grants

The NLM provides funding for Bioinformatics and Biomedical Informatics by its continuing support of collaborative extramural funding with other agencies. The NLM continues its support of the Protein Sequence Databank at Rutgers University jointly with the NSF and has increased its fiscal commitment to the project. This databank serves as the single worldwide repository for the processing and distribution of 3-D biological macromolecular structure data.

The NLM has collaborated with the Fogarty International Center (FIC) in support of their new International Training in Informatics by funding the “AMAUTA Health Informatics Research and Training Program” that involves the collaborative efforts of the University of Washington and the Universidad Peruana Cayetano Heredia (UPCH) in Lima, Peru. This training addresses informatics for global health and will help the UPCH to establish a health informatics research program within Peru. The FIC also received complementary co-funding support from the NLM for their International Training Program in Informatics for the “Informatics Training for Public Health in Tanzania.” This program includes the collaborative efforts of the Harvard School of Public Health and the Muhimbili University College of Health Sciences. This ten-year alliance has been focused on addressing major public health problems in Tanzania through multidisciplinary teams of investigators in research and training. This program will support advanced degree programs in public health and work towards a sustainable training program in Tanzania.

As part of the NLM’s support of its Training Program for Bioinformatics, the NLM receives ongoing co-funding support from the National Institute of Dental and Craniofacial Research for support of Dental Informatics trainees. The NLM also receives co-funding support from the National Institute of Biomedical Imaging and Bioengineering for trainees in bioinformatics.

The NLM has also provided joint funding to the National Institute of General Medical Sciences providing continued support of a cooperative agreement for the Stanford Pharmacogenetic Knowledge Base. In addition, NLM provided co-funding support for NIH Roadmap directed to the National Center of Biocomputing at the Brigham and Women’s Hospital, “Informatics for Integrating Biology and the Bedside.” This multifaceted cooperative agreement supports seven core projects and four research projects. The informatics domains include: clinical informatics, functional genomics and genetics of complex traits.

The National Human Genome Research Institute (NHGRI) is continuing to co-fund the United Protein Databases (UniProt) of the European Bioinformatics Institute (EBI) under a cooperative agreement. This support provides a single database for protein sequence and function linking existing information from other databases supporting protein structure information. The NLM has also provided co-funding support for an NHGRI grant entitled the “Oral History of Human Genetics: The Intelligent Archive” that will include a collection of over 100 oral histories from clinicians, scientists, theorists, organizational leaders and others covering the ethical, legal and social issues surrounding the field of human genetics. The NLM and the NHGRI jointly funded a new research grant entitled, “BioMediator: Biologic Data Integration & Analysis System” for searching across various genomic databases for the purpose of curating the GeneClinics Database formerly supported by an NLM grant. The GeneClinics project, under NLM contract support, has now been integrated into the NCBI as one of their genetics resource databases available to clinicians and research scientists.

The NLM provided co-funding to the National Center for Research Resources (NCRR) in support of a Neuroimaging Analysis Center that would support further development and extensions of the Insight Toolkit that will be used in working in a grid computing environment. This grid computing application will allow for computation of very large datasets that otherwise could not be analyzed effectively. The NLM also provided NCRR with co-funding support for the pan-NIH initiative on Electronic Research Administration. This cooperative agreement, “Electronic Submission of Grant Applications,” also supports the overall Federal Government’s e-Grants and e-Government initiatives.

The NLM provided support for the 7th Annual International Protégé Workshop hosted by Stanford University and jointly funded by the National Cancer Institute and the NLM through an intra-agency agreement.

In support of the Small Business Technology Transfer Research (STTR) Program at the NIH, the NLM provides co-funding support of a grant at the National Institute of Nursing Research for the continued development of a home caregiver device for people who are cognitively impaired, to prevent personal injury at night. The “Night Alert Prompting System” is a collaboration of the Amron Corporation and the University of Florida, as part of

the STTR program joining private companies with universities in support of a business-research partnership.

The NLM provided funding support as part of a co-funding agreement to the National Institute of Neurological Disorders and Stroke for a project entitled, “A Mature Brain Architecture Knowledge Management System (BAMS).” The objective of this project is to develop a user-friendly neuroinformatics workbench for the Web allowing the neuroscience community to access, evaluate and visualize neuroanatomical literature. BAMS will facilitate basic research into the cause and treatment of all diseases that affect the brain.

The NLM provided approximately $1.4 million in collaborative co-funding agreements in FY2004.

Joseph and Rose Kennedy Institute of Ethics/Georgetown University

The Division of Extramural Programs has continued its support of the National Reference Center for Bioethics Literature (NRCBL) in 2004. Early in the year the NRCBL staff worked with its counterpart in Bonn, Germany, Deutsches Referenzzentrum für Ethik in den Biowissenschaften, and extended its ethics classification scheme to include French and German in addition to the already existing English. The expanded classification table currently allows those searchers more familiar with French or German access to the “ETHX on the Web” database. For the National Information Resource on Ethics and Human Genetics, NRCBL staff evaluated its existing computer platform and concluded that a more modern and flexible platform (not provided by NLM contract) was required and that this would be incorporated with improvements during 2004. The NRCBL continues to publish volumes of its ongoing collection, New Titles in Bioethics. The primary measure of success of this ethics resource has been its use as a library, and as such, it has provided personal responses to requesters from around the world. There are approximately 1 million Web queries per year. NRCBL staff has also provided training sessions on the effective use of and access to the library databases and resources for a graduate seminar on nursing ethics taught at George Mason University. There are a number of new bioethics publications for 2004 including Digital Library Projects: Beyond the Beltway and Bioethics Searchers Guides: Using Databases of the National Library of Medicine. The NRCBL will continue to be supported by NLM in 2005 as one of the premier national and international bioethics literature and information resources.

GeneTests

The NLM has in the past awarded several grants to the University of Washington in support of clinical genetics databases titled Helix, GeneClinics and GeneTests, primarily in support of clinical health professionals. The competitive application for renewed funding support for the GeneTests project proposed the consolidation of all three databases. The renewal proposal was reviewed, positively scored, and considered for funding. The NCBI expressed interest in incorporating this GeneTests resource into the NLM database resources by converting the grant proposal to a contract proposal. The new sole-source contract provides public access through an NLM web interface.

The new website for GeneTests provides four categorical breakdowns for information. The GeneReviews portion provides online publications of expert-authored disease reviews. A laboratory directory provides an international directory of genetic testing laboratories and the clinic directory an international directory of genetics and prenatal diagnosis clinics. In addition, a repository for educational materials that includes an illustrated glossary, information about genetic services and PowerPoint slide presentations is available. Approximately 30,000 entries are viewed per day.

Shared Funding for Training

In June 2003, the Fogarty International Center issued an RFA for Informatics Training for Global Health. The review of applications was handled administratively by EP’s Scientific Review Unit. NLM is providing full funding ($250,000 per year) for one of the programs, and is co-funding a second.

There are discussions with the Robert Wood Johnson Foundation exploring possible RWJ support of training slots for Public Health Informatics Research at some of NLM’s existing Informatics Research Training Programs.

SBIR/STTR (PHS 301)

All NIH research grant programs, including NLM’s, by Congressional mandate allocate a fixed percentage of available funds every year to Small Business Innovation Research (SBIR) grants. These projects may involve a Phase I grant for product design and a Phase II grant for testing and prototyping. SBIR and STTR applications are reviewed by CSR. Sixty-three SBIR/STTR applications were assigned to NLM in FY2004 and reviewed by CSR. Two awards were made. Of these applications 35 were ‘unscored,’ indicating reviewer assessment that they were not in the top ½ of applications received. The average score for SBIR/STTR grants awarded was 199.

EP Operating Units - Highlights

Grants Management Office

The Grants Management staff reviews NLM grant applications for compliance with guidelines and

directives; prepares and disseminates grant awards; maintains official grant files for NLM; provides consultation and assistance to grantees on appropriate business management concepts; and advises NLM officials on grants management policy and procedures. The Grants Management staff, which consists of four employees, issued a total of 322 awards for FY2004, including grants, administrative supplemental awards, fellowships and administrative actions. Details of the grants are provided in Appendix 1, Table 2. Of these, 212 were for new and non-competing awards in NLM’s own grant programs. Grants Management staff continues to provide budget oversight for all awards, and prepares reports for NLM staff and Congress as requested.

Committee Management Activities

Board of Regents: The Board of Regents met three times in FY 2004 on February 10-11, May 19-20, and September 21-22. The Extramural Programs Subcommittee met prior to each of these meetings. The Board approved 373 grant applications, including any special reviews made by the EP Subcommittee. These special reviews are conducted when the recommended amount of financial support is larger than some predetermined amount; when at least two members of the scientific merit review group dissented from the majority; when a policy issue is identified; or when an application is from a foreign institution. The EP Subcommittee makes recommendations to the full Board, which votes on the applications. The Board Operating Procedures were reviewed and approved without change at the February 10-11, 2004 meeting.

Presentations of Programs to the Board of Regents in FY 2004

EP programs presented to the BOR for Concept Review and Approval:
• NLM participation in a collaborative effort, Interagency Opportunities in Multi-Scale Modeling in Biomedical, Biological, and Behavioral Systems. Partners and purpose of the initiative were described.
• NLM possible participation in a broad variety of NIH Roadmap activities, including (1) Re-engineering the Clinical Research Enterprise: Feasibility of Integrating and Expanding Clinical Research Networks, (2) Training for a New Interdisciplinary Research Workforce, (3) Interdisciplinary Health Research Training: Behavior, Environment and Biology, (4) Short Programs for Interdisciplinary Research Training Exploratory Centers (P20) for Interdisciplinary Research, and (5) National Centers for Biomedical Computing.
• NLM’s Informatics Research Program: approval of announcement describing the program.

EP programs presented to BOR as updates included the Publication Grant Program, the Loan Repayment Program, the NIH Roadmap Initiative, the Small Project Grant Program, National Centers for Biomedical Computing, and NLM’s Informatics Training Conference.

Scientific Review Office

NLM’s initial review group, the Biomedical Library and Informatics Review Committee (BLIRC), evaluates grant applications for scientific merit. BLIRC met three times in FY2004 and reviewed 204 applications. The Committee (see Appendix for roster of members) operates as a “flexible” review group. BLIRC reviews applications for medical informatics and biotechnology research projects, information systems, and publications. BLIRC has two standing subcommittees: the Networked Information Access Subcommittee and the Medical Informatics Subcommittee. The subcommittees consider applications for fellowships, and training awards in medical informatics and biotechnology information, respectively.

The Amended Charter of the Biomedical Library and Informatics Review Committee reflects the broader scope of research applications in the areas of clinical informatics, bioinformatics, biomedical computing, management of health science information, as well as .

Special Emphasis Panels: 18 Special Emphasis Panels were held during FY2004. These panels are convened on a one-time basis to review applications for which the regularly constituted review group lacks appropriate expertise, or when a conflict of interest exists between the applicant and a member of the BLIRC. Lately, due to the increase in number of applications received, the panels have also been convened to review applications that simply cannot be reviewed in the BLIRC. The panels reviewed a total of 178 applications during FY2004. One site visit to evaluate an IAIMS Operations application was also carried out by an ad hoc panel.

A Special Emphasis Panel was convened in February 2004, at the request of the Fogarty International Center, to review applications responding to their RFA for training grants, “Informatics Training for Global Health.”

A second level peer review of applications is performed by the Board of Regents as described above. One of the Board’s subcommittees, the Extramural Programs Subcommittee, meets the day before the full Board for the review of “special” grant

applications. Examples include applications for which the recommended amount of financial support is larger than some predetermined amount; when at least two members of the scientific merit review group dissented from the majority; when a policy issue is identified; and when an application is from a foreign institution. The Extramural Programs Subcommittee makes recommendations to the full Board, which votes on the applications.

Program Office

Program activities in FY2004 were focused on clarifying NLM’s research interests, evaluating the university-based informatics training programs, building new collaborations, and publicizing NLM’s grant programs.

Program referral guidelines

The creation of the National Institute of Biomedical Imaging and Bioengineering (NIBIB) brought a new set of research interests to NIH that overlap in several areas with those of NLM. New referral guidelines for EP were prepared this year and sent to the Center for Scientific Review. These will continue to be refined as the other “non-categorical” Institutes increase their funding in informatics. The primary overlaps are with NCRR, NIGMS, NHGRI and NIBIB.

Program announcements

NIH requires all standing programs to be re-issued every three years. Program staff identified all current and expired programs and developed a timetable for re-issuing expired programs. The expired announcements replaced in FY2004 were for the R01 Research Grants and the P41 Research Resource Grants. Draft text was completed for updates to the Information Systems grant and Publications Grant, which will be issued early in FY2005.

Collaborations

EP was a co-sponsor and active participant in organizing the 2004 BECON Symposium, entitled “Biomedical Informatics for Clinical Decision Support: a Vision for the 21st Century.” The meeting, held June 21–22, was well attended. In addition to participation in NIH Roadmap workgroups for the National Centers for Biomedical Computing and Interdisciplinary Research, EP program staff were involved in several new multi-agency grant program announcements.

Training-related Initiatives

The Annual Training Conference was held June 9 and 10 in Indianapolis. Poster sessions and break-out sessions were included in the meeting for the first time and with great success. At that meeting, Training Directors were briefed on the interest of the Robert Wood Johnson Foundation in partnering with NLM to provide training in public health informatics. Parameters of such a program were developed, to be presented to the RWJF Board for approval in November 2004.

Administration and Operations

Personnel Activities

EP has had several personnel changes over the past year and the Division experienced five losses of permanent staff. The NIH A-76 competition for grants management, review and program support staff was completed and officially implemented on October 4, 2004. Only one position was transferred to the NIH’s Division of Extramural Administrative Support. Three additional staffers from the NIH Division were also assigned to fill the remaining 3 support vacancies in EP.

Some Issues That Impact NLM Extramural Budget and Programs

• Increasing numbers of applications while the budget is flat may require some narrowing of NLM’s funding interests if a reasonable payline is to be maintained.
• NLM’s participation in Roadmap, BISTI and other multi-Institute computing initiatives inevitably decreases available funds for NLM’s own grant programs.
• Because some of the informatics areas NLM has supported for many years are now also being funded generously by other Institutes, the proper future focus for NLM’s grant programs in biomedical computing would benefit from reevaluation. The upcoming long-range plan meetings could provide a useful forum for such analysis.

EXTRAMURAL PROGRAMS
FY 2004 Final ($ in 000)

EXTRAMURAL PROGRAM BUDGET

                                                    NON-COMPETING      COMPETING          TOTAL
                                                      NO       AMT     NO       AMT     NO       AMT

MLAA                                                           ($)              ($)              ($)

IAIMS                                                  2      $904      6    $1,237      8    $2,141

TRAINING
  TRAINING PROGRAMS (T15)                             18   $14,210      0        $0     18   $14,210
  FELLOWSHIP (F37/F38)                                 5      $270      8      $598     13      $868
  CAREER (K22)                                         0        $0      5      $750      5      $750
TOTAL TRAINING                                        23   $14,480     13    $1,348     36   $15,828

PUBS (G13)                                            17    $1,144     20    $1,298     37    $2,442

RESOURCE
  IADL (G07)                                           3      $108      7      $452     10      $560
  INFO. SYS. (G08)                                    20    $2,765     16    $2,033     36    $4,798
TOTAL RESOURCE                                        23    $2,873     23    $2,485     46    $5,358

BIOETHICS (N01)*                                       0        $0      0        $0      0        $0
GENETESTS (N01)                                        0        $0      0        $0      0        $0
LOAN REPAYMENT (L30)                                   0        $0      7      $419      7      $419
NN/LM CONTRACTS (N01)                                  8   $12,884      0        $0      8   $12,884

TOTAL MLAA:                                           73   $32,285     69    $6,787    142   $39,072

PHS 301

BIOMED-INFORM. RESEARCH (R01/R03/R13/R21/R24/P41)     36    $9,334     11    $1,923     47   $11,257
PROTEIN SEQ. DATABANK (IAG)                            0        $0      1      $200      1      $200
CHAIRMAN'S GRANT (U09)                                 1      $200      0        $0      1      $200
BIOMED-INFORM. RESEARCH TOTAL                         37    $9,534     12    $2,123     49   $11,657

BIOINFORM. RESEARCH (R01/R03/R21)                     14    $4,240      7    $2,473     21    $6,713
BIOINFORM. RESOURCE (P41)                              2      $950      1    $2,293      3    $3,243
BISTI (R21/R33/P20/P41/U54)**                          7    $3,476      2      $967      9    $4,443
BIOINFORM. RESEARCH TOTAL                             23    $8,666     10    $5,733     33   $14,399

SBIR/STTR (R43/R44/R41/R42)***                         2      $955      2      $238      4    $1,193

TOTAL PHS 301:                                        62   $19,155     24    $8,094     86   $27,249

TOTAL EP:                                            135   $51,440     93   $14,881    228   $66,321

TABLE 11
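The subtotals in Table 11 can be cross-checked against the section and grand totals. The sketch below is illustrative only and not part of the report: the dictionary and function names are assumptions, the figures are copied from the table (dollar amounts in thousands), and the all-zero Bioethics and GeneTests rows are omitted because they do not affect the sums.

```python
# Each entry: (non-competing no., non-competing amt, competing no., competing amt).
# Amounts are in thousands of dollars, copied from Table 11.
MLAA = {
    "IAIMS": (2, 904, 6, 1237),
    "Total training": (23, 14480, 13, 1348),
    "Pubs (G13)": (17, 1144, 20, 1298),
    "Total resource": (23, 2873, 23, 2485),
    "Loan repayment (L30)": (0, 0, 7, 419),
    "NN/LM contracts (N01)": (8, 12884, 0, 0),
}
PHS_301 = {
    "Biomed-inform. research total": (37, 9534, 12, 2123),
    "Bioinform. research total": (23, 8666, 10, 5733),
    "SBIR/STTR": (2, 955, 2, 238),
}


def section_total(rows):
    """Sum the four columns of a section and append the combined no. and amt."""
    nc_no, nc_amt, c_no, c_amt = (sum(col) for col in zip(*rows.values()))
    return (nc_no, nc_amt, c_no, c_amt, nc_no + c_no, nc_amt + c_amt)


mlaa_total = section_total(MLAA)
phs_total = section_total(PHS_301)
ep_total = tuple(a + b for a, b in zip(mlaa_total, phs_total))

print("MLAA:   ", mlaa_total)
print("PHS 301:", phs_total)
print("EP:     ", ep_total)
```

With the figures as printed, the computed tuples match the TOTAL MLAA, TOTAL PHS 301, and TOTAL EP rows of the table.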

Office of Computer and Communications Systems

Simon Y. Liu, Ph.D.
Director

The Office of Computer and Communications Systems (OCCS) provides efficient, cost-effective computing and networking services, application development, technical advice, and collaboration in informational sciences to support NLM’s research and management programs.

OCCS develops and provides the NLM backbone computer networking facilities, and assists other NLM components in local area networking. The Division provides professional programming services and computational and data processing to meet NLM program needs; operates and maintains the NLM Computer Center; develops software; and provides extensive customer support, training courses, and documentation for computer and network users.

OCCS helps to coordinate, integrate, and standardize the vast array of computer services available throughout all of the organizations comprising NLM. The Division also serves as a technological resource for other parts of the NLM and for other Federal organizations with biomedical, statistical, and administrative computing needs.

Executive Summary

Enhanced MedlinePlus: OCCS continued an aggressive campaign of major MedlinePlus releases this year including Release 16 of the Go Local Input System and Release 15 of a public directory of health services. Major upgrades and enhancements to MedlinePlus included:
• Database software upgrade to Oracle 9i.
• Modifications and testing of the MedlinePlus input system to run in NLM’s disaster recovery/failover (NCCS) site in Sterling, Virginia.
• Modification and testing of MedlinePlus public pages to run in active-active (load-balancing) mode at the NCCS.
• Added Spanish-language news, a Spanish email listserver, and a Spanish language translation of ASHP drug information.

NIH Consolidated Collocation Site (NCCS): OCCS continued to lead the effort on the NIH Consolidated Collocation Site Project. The NCCS became operational in November 2003 in Sterling, Virginia. The facility provides disaster recovery and continuity of operations by reducing the risk of service interruptions due to a variety of unpredicted threats and disasters. Communications systems were planned and deployed to support the NCCS disaster recovery capabilities. These systems include access to Internet 1, a 622 Mbps link between NIH and the NCCS, load-balancing systems between NLM and the NCCS, and internal communications systems within the NCCS-NLM space.

High Speed Communication Network: OCCS improved the redundancy of equipment and network paths to eliminate single points of failure in the network. NLM’s network perimeter connections to external networks provide an aggregate of 2 gigabits per second (Gbps), while the interconnection between NLM and the NIH/CIT campus backbone operates at 1 Gbps. Also in FY2004, OCCS implemented the remote-access Citrix terminal (and cable modems) as an effective solution for NLM flexi-place workers. OCCS is also expanding secure wireless access to the Internet and internal applications.

A-76 Competitive Sourcing Review: The OCCS computer center was one of three NIH centers subject to an A-76 competitive sourcing review in FY 2004. After a streamlined review, the NIH Most Efficient Organization (MEO) won the competition with a cost savings of over 4 million dollars. The MEO is expected to increase the effectiveness and efficiency of computer center operations.

Multi-faceted IT Security Program: OCCS continued its multi-faceted and multi-layered IT security program that successfully prevented over 2.7 million virus attacks this year and detected more than 26,000 probes, scans, denial of service (DoS) attacks and other security events on a monthly basis. OCCS also performed a monthly cycle of vulnerability scanning, detection, and remediation; implemented automatic virus scanning and signature update mechanisms; implemented a perimeter firewall cluster; and implemented an automatic patch management system.

Enhanced DOCLINE: Expanded the functionality and improved the usability of DOCLINE, the NLM interlibrary loan system, to support 3,200 domestic and international libraries in processing approximately 3 million interlibrary loan transactions a year. Version 2.1 was released in April and Version 2.2 was released in August and included a total of 25 enhancements in response to user and Library Operations requests.

RxNorm Project: Designed and developed a prototype to prove the concept of RxNorm nomenclature management. This will standardize the labeling data mandated for clinical drugs by the FDA.

By the end of FY2004, development and planning were in an advanced stage.

Enhanced Medical Subject Headings (MeSH): Final development of the MeSH Translation Management System (MTMS) was completed in the first quarter of FY2004. Various foreign-language data sets, including Japanese, Spanish, Portuguese, and Dutch, were loaded. MTMS is an interlingual database of translations that permits automatic updating of the MeSH terminology tree in all languages.

NLM Main Page Redesign: OCCS played a significant role in the redesign of NLM’s Main Web site. The redesign of the site improved its look and feel. In addition, using an audience of the general public, health care providers/professionals and librarians, usability testing was conducted to improve site navigation.

NIH E-mail Consolidation: OCCS participated in the planning and transition of NLM e-mail accounts to the NIH Central Email System (CES), an NIH IT consolidation initiative. OCCS maintained responsibility for local e-mail clients, assuming the task of configuring, deploying, and managing the MS-Outlook 2002 client. This consolidation resulted in the retirement of the Novell GroupWise e-mail system that had been in use for over seven years at NLM.

Active Directory Consolidation Project: OCCS contributed to the smooth transition of user accounts from the NLM Active Directory to the NIH Active Directory, a project that affected virtually all NLM users, and required changes to login credentials and local machine settings. Through planning in June, testing in July, and cutover in August, this project was accomplished on schedule.

Enhanced Relais: Relais was modified to accommodate innovations in DOCLINE, including color copy service and the ability to enter alternate Ariel delivery addresses. Ariel is a scanner-based document transmission system from Infotrieve that uses Internet protocols and Adobe’s Portable Document Format (PDF) instead of telephone fax services for faster delivery.

Enhanced Voyager: OCCS completed initial development and testing of the new XML distribution of Voyager bibliographic data, which allows data sharing between Library Operations and NCBI’s PubMed Entrez search and retrieval system.

Enhanced Data Creation and Maintenance Systems (DCMS): Reengineered DCMS to improve functionality and maintainability. This included a new Java version of the XML Loader and Extractor that will support Meeting Abstracts and OLDMEDLINE data as well as the redesign of Gene Indexing to work with NCBI’s Gene Entrez database rather than LocusLink.

MEEC License Savings: OCCS renewed NLM’s participation in the Maryland Education Enterprise Consortium (MEEC) licensing agreement that provides a bundle of Microsoft products at the lowest cost available in the U.S. MEEC seat renewal, priced this year at $16, provides licenses and product updates for the current Windows operating system, Microsoft Office Professional, Visual Studio.Net, and BackOffice clients. By contrast, GSA prices for these same products total $1,712.

Computer Facility Reengineering: The NLM computer room has tripled its use of electrical power, cooling and data transmission capacity over the last three years due to the rapid growth in IT systems. Recognizing this growth will continue in the years ahead, OCCS began a detailed process for evaluating the safety, reliability and performance requirements of the computer room. Reengineering activities include:
• Expanded the Uninterruptible Power Supply (UPS) capacity to support the growing needs for electrical power protection and redundancy of systems housed in the NLM Computer Room.
• Developed plans to bring additional power to the computer room, and to streamline the delivery of electrical power to the IT systems.
• Initiated plans for a pre-action sprinkler system to improve the reliability and safety of the fire suppression system.
• Developed plans for an overhead ladder rack in the computer room, as a separate pathway for running data networking cables to improve the reliability, availability, and maintainability of data communication services.

The following describes in more detail OCCS accomplishments in FY2004:

Customer Services

Since the 2003 Help Desk consolidation with NIH’s IT Help Desk, NLM desktop and PC networking support requests are now channeled to the NIH IT Help Desk for initial ticket entry into the call tracking system. This year over 10,700 NLM ticket requests for IT support were entered and tracked. NLM IT staff resolved over 72% of the calls (7,700 tickets)

with 28% of support calls being completed by NIH staff. OCCS conducted over 80 desktop training courses this year, in topics such as “SPAMology,” “Outlook 2002 FUNdamentals,” and “Office XP Differences.” Additionally, public briefings were conducted in support of the Active Directory migration project and many one-on-one sessions were held in relation to Outlook PST file reduction.

Desktop Support

OCCS worked with the NIH Center for Information Technology (CIT) this year to transition to the CIT-managed Microsoft Exchange 5.5 services known as the NIH Central E-Mail System (CES). OCCS participated in the planning and transition of NLM e-mail accounts to the NIH Central Email System (CES), an NIH IT Consolidation initiative. In addition, OCCS maintained responsibility for local e-mail clients, assuming the task of configuring, deploying, and managing the MS-Outlook 2002 client, and developing a deployment model that kept disruption at NLM’s desktops to a minimum. In order to adopt the latest and most secure Microsoft e-mail client, Outlook 2002, OCCS developed and deployed an upgrade package for the MS-Office XP suite. This consolidation resulted in the retirement of the Novell GroupWise e-mail system that had been in use at NLM for over seven years.

OCCS staff contributed to the smooth transition of user accounts from the NLM Active Directory to the NIH Active Directory, a project that impacted virtually all NLM users and required changes to login credentials and local machine settings. Through planning in June, testing in July, and cutover in August, this project was accomplished on schedule. OCCS support contractors coordinated the project, created the detailed mapping of each NLM account to an NIH account, and developed the techniques for effectively delivering the revised credentials to each user’s PC. The re-mapping effort enabled the preservation of users’ network permissions and personality settings.

The Software Update Server security hotfix deployment solution introduced by OCCS makes possible the expeditious deployment of critical security updates, keeping 1,200 NLM systems better insulated from attack. The system enforces the application of previously released patches, ensuring continuous oversight and active management of security on the desktop. Vulnerability assessments now trend much more favorably. Two hundred nine (209) security patches are now consistently applied as needed to OCCS-supported desktops running the Windows operating system.

Network Support

OCCS continued to fulfill its mission of providing reliable LAN and Internet communications services, meeting the data communications needs for new IT systems, providing security services as well as end user assistance and training, implementing new network-based applications and operating systems, and exploring new technologies and plans to meet NLM’s continued growth in networking, services and communications. OCCS/STB/NES took steps to increase the capabilities and reliability of network services and storage, by providing for the following:
• NCCS data communications services
• Enhanced network and service monitoring and management
• Increased IT security
• New networked services to support the NLM user community
• Increased performance and throughput for networks
• Additional redundancy to eliminate single points of failure
• Enhanced backup for use in disaster recovery scenarios
• Expanded, centralized and efficient storage

Public Internet connectivity services continued to be provided through a contract with Level3/Genuity. Internet connectivity was provided via an OC3 (155Mbps) circuit to the Level3/Genuity network node in McLean, VA. The contract also provides an OC3 link for CIT/NIH to the Level3/Genuity network. NLM and NIH collaborate in using these links to back up each other’s Internet connectivity. The service features an automatic failover in the event of a scheduled or unscheduled outage of one Internet connection.

In addition to supporting the indexing system, the remote access Citrix terminal server solution has been implemented as an effective solution for NLM flexi-place workers. The terminal server system provides authentication into the NLM network, access to office and NLM business applications, network-based files, and the Internet. Network support continues to provide 56K dial-in access and cable modem access for a wide range of NLM staff and contractors. High-speed access is provided mainly through cable modems provided by COMCAST.

NLM consolidated wireless LAN networks into the support services of CIT. The initial wireless capabilities were implemented, and further expansion of the wireless systems will continue in selected areas. Wireless access to the Internet and public services of NLM and NIH is provided for guests and typical users. Through a Virtual Private Network,
authorized users can access internal applications in a secure manner.

Steps were taken to consolidate dial-up remote access services to the NIH Parachute system, whenever possible.

OCCS continued the use of iTRACS for documenting the LAN cabling and infrastructure. The data entry process (Phase I) was continued. Phase II, which includes layering the iTRACS information on AutoCAD drawings of NLM building plans, has also begun, and is expected to be completed during FY2005.

Systems Support

In order to protect NLM’s mission-critical systems, CIT and NLM have implemented an NIH Consolidated Collocation Site (NCCS) in Sterling, Virginia. Since November of 2003, NCCS has operated to reduce the risks of service interruptions due to a variety of unpredicted threats and disasters. At present, all systems under MEDLARS and TOXNET are in active/active, active/passive, or active/cold-backup mode, depending on their business requirements. In addition, NLM has established plans for tape backups. The Disaster Recovery and Business Continuity Plan for MEDLARS and TOXNET covers NCCS as the primary resource for system restoration and uninterrupted processing if the primary NLM computing facilities on the NIH campus are rendered unavailable by a disaster or other contingency.

OCCS deployed various applications at the NCCS this year, including the NLM Home Page, NIH Senior Health, the MeSH Browser, the Intranet, PHP Partners, and MedlinePlus Directories. Numerous servers and storage systems were deployed at the site to support these applications. In the coming months, OCCS staff will work with the Library Operations Division to deploy additional applications. The success achieved thus far shows that the rewards are well worth the efforts.

OCCS continued to make improvements to the UNIX architecture. Various upgrades in additional servers, increased memory, and subnet communication capacity were performed.

The ILS Oracle server was moved from direct attached storage to a high-speed network attached storage system, thereby increasing the capacity of the production ILS Oracle server. The Web servers for this application were also moved into a high-speed gigabit environment this year.

The production Oracle database server for DOCLINE and DCMS was replaced this year with a new system having much faster CPUs and more memory capacity. This server will accommodate additional growth in these applications as well as host added applications.

Several new servers and large storage systems were procured and deployed for the RxNorm Support project. Deployment and testing are currently under way.

Several new servers were deployed for the Siebel development system. This provides for increased capacity and better software life cycle management.

IT Security

Throughout the year, NLM continued to assess and strengthen its security posture based on NLM’s current business requirements and risk assessment. Security improvements continued.

The perimeter firewall cluster was implemented to enhance NLM perimeter defense capability. Testing the alternate firewalls continued in preparation for their eventual deployment at the perimeter. Basic multicast testing was completed. OCCS servers that host NLM public applications such as MedlinePlus, DCMS, DOCLINE, and NIH SeniorHealth were migrated behind the firewall on the OCCS public firewall boundary in order to improve access control. Strong consideration was given to implementing a defense-in-depth (DiD) “best practices” architecture that provides varied forms of defenses at the different layers of the NLM network. OCCS will continue to emphasize the DiD architecture concept in the ensuing years in order to proactively maintain a strong security posture at NLM.

OCCS performs a monthly cycle of vulnerability scanning, detection, and remediation to improve NLM security posture. The Internet Security Systems’ Internet Scanner provides network vulnerability assessment across servers, desktops, and infrastructure devices. Internet Scanner performs distributed probes of network services, operating systems, routers/switches, servers, firewalls, and application routers to identify potential risks.

OCCS implemented automatic virus scanning and signature update mechanisms to combat ever increasing cyber-attacks. OCCS utilizes antivirus software at the client level with McAfee Virus Scan where signature updates and scans and other various settings are set individually on the client.

Since the majority of all security breaches are caused by a missing patch, OCCS implemented an automatic patch management system to eliminate security breaches. Patch management of the Windows operating system is handled by a Windows SUS server. Settings for automatic update are controlled via group policy for all members of the NLM domain. Application patches are delivered by using industry standard delivery methods, such as
scripting and pushing through Active Directory and Novell Zenworks.

OCCS continued Web URL filtering this year in accordance with NIH Policy 2806. All NIH Institutes have been mandated to filter out access to inappropriate Web sites while simultaneously not affecting NIH business activity.

OCCS responded to IT security incidents that were observed on the intrusion detection system (IDS) console. These incidents were pursued by contacting the appropriate systems administrators and requesting them to take the necessary corrective action. The IDS rules were fine-tuned to reduce false positives. In addition, OCCS continued to run regularly scheduled monthly vulnerability assessments for OCCS, SIS, LHC, and the NCCS.

This year, OCCS successfully completed Inspector General reviews for both MEDLARS and TOXNET. Originally, MEDLARS included five major systems: the Voyager Integrated Library System (ILS), the Data Creation and Maintenance System (DCMS), Medical Subject Headings (MeSH), the Serials Extract File, and DOCLINE. In 2004, the MEDLARS umbrella grew to cover MedlinePlus and NIH Senior Health from OCCS; PubMed and Basic Local Alignment Search Tool (BLAST) from NCBI; and Clinical Trials from LHC.

The Office of Management and Budget requires that 100% of HHS computer users complete annual IT security awareness training. NLM has completed 100% of the mandatory FY04 Security Awareness Training for employees and contractors.

Policies and Product Standards

OCCS promoted the review and consideration by the Personal Computer Advisory (PCA) committee of the document OCCS (NLM) IT Support Policy for Remote Access from Non-NLM-Owned Computers. The committee reviewed the document, and has formed a subcommittee to discuss other issues relating to IT support.

OCCS, participating with the PCA, developed technical standards and product selections for two classes of notebook systems to join the PC Desktop selection in PCA consolidated orders. For one system, which can reliably be connected to AC power sources, high performance is paramount. For the other system, weight, size, and battery life are important, but moderate performance and functionality are also needed. These laptop systems will be available as offerings on the recurring consolidated PC purchases conducted by OCCS.

Quality Management (QM) and Configuration Control (CC)

OCCS convened a Configuration Control Board (CCB) to provide oversight of configuration changes made to production IT systems managed by the Systems Technology Branch. Implementation of the CCB concept across all of OCCS is anticipated in the next fiscal year. Quality management is a top priority of OCCS and quality management improvements are expected to lead to significantly greater maturity and repeatability in the day-to-day operations of the Division.

Computer Room Facilities

NLM systems continue to be supported in a safe environment in NLM’s computer facility, which is available 24x7x365. The Network Operations and Security Center (NOSC), which was established in 2002, continues to serve as a central point in IT system and service monitoring, IT system administration, IT security event monitoring, and after-hours Help Desk support.

The NOSC display system consists of four 32-inch plasma displays that are visible outside the computer room. The intended audience of this display system is the general public and NLM staff. The system consists of information “panels” with descriptive text, statistical charts and near real-time activity monitors. Each panel focuses on a particular NLM service or IT infrastructure component. The panels include near-real-time utilization counters for MedlinePlus and for PubMed/MedlinePlus, and NLM services as seen by remote users around the world. Near real-time utilization data for NLM’s Internet-1 and Internet-2 data communications links are also displayed.

The NLM computer room has tripled its use of electrical power, cooling and data transmission capacity over the last three years due to the dramatic growth in dependence on IT systems to deliver NLM's mission-critical applications. Recognizing that this rapid growth will continue in the years ahead, OCCS has begun a detailed reengineering process for evaluating the safety, reliability and performance requirements of the computer room. Those reengineering efforts include the following:
• Expanded the Uninterruptible Power Supply capacity to support the growing needs for electrical power protection and redundancy of systems housed in the NLM Computer Room. The Computer Room currently can maintain electrical power for up to 39 minutes after losing commercial power.
• Developed plans for an overhead ladder rack in the computer room, as a separate pathway for running data networking cables to improve the reliability, availability, and maintainability of data communication services.
• Initiated plans for the implementation of a pre-action sprinkler system to improve the reliability and safety of the fire suppression system. The current system is a traditional wet pipe system. The proposed pre-action sprinkler system would require two actions before water will be released onto the fire: First, the smoke detection system must identify a developing fire and then open the pre-action valve. Second, the sprinkler head must release to permit water onto the fire. Modern day computer rooms are using this approach with upwards of 80% of all computer rooms already in this category.
• OCCS staff worked closely with NIH to develop plans to bring additional power to the computer room, and to efficiently streamline the delivery of electrical power to the IT systems. This will be a multi-year effort.

Consumer Health

MedlinePlus: In 2004, OCCS continued an aggressive campaign of major MedlinePlus releases including the Go Local Input System (Release 16) and a public directory of health services (Release 15). The Go Local initiative debuted in December 2002, allowing users in North Carolina to search for local medical service providers while viewing descriptive material in MedlinePlus. Additional functionality and a site for Missouri were subsequently implemented. In FY 2004, the Go Local Input System, MedlinePlus Release 16, was released. This system can be used by organizations to input site records for links to local services for their areas. Users can associate site records with local service terms that are mapped to MedlinePlus health topics. NLM makes this service available remotely via a Web interface. Ultimately, localities lacking the resources to maintain local sites will be able to create them on the NLM MedlinePlus site.

The public directory of health care resources (Release 15) allows the public to search for and geographically locate hospitals. Setbacks in acquiring a mapping service from MapQuest and hospital data from the American Hospital Directory caused Release 15 to be delayed for seven months, but the first of several planned medical resource directory releases occurred in July.

Upgrades, fixes, and enhancements to MedlinePlus included a database software upgrade to Oracle 9i; modifications and testing of the MedlinePlus input system to run in NLM’s disaster recovery/failover site; modification and testing of MedlinePlus public pages to run in active-active (load-balancing) mode at the remote site; and a number of production changes requested by Library staff.

The MedlinePlus team also added Spanish-language news, a Spanish email listserver, and a Spanish language translation of ASHP drug information.

Senior Health Project: NIHSeniorHealth.gov is a joint NLM and National Institute on Aging project that provides health information on the Web using modes of delivery (video and narration) appropriate for older Americans with access limitations (low vision and low hearing, etc.). The system uses the Accent “Talking Web” module developed by OCCS to provide the accessibility enhancements. The TeamSite workflow application was integrated with Accent. Content originators can now preview new material by listening to it. When the originator is satisfied with the new or revised material, he or she can release it with a mouse-click. TeamSite automatically routes the new material for review and further revision. Finally, the new pages are permanently Accent-encoded and moved into production on the Senior Health site.

Virtual Customer Service (Native Minds/Cosmo): NLM adopted Frequently Asked Questions software from Native Minds to provide first-level automated customer assistance. Dubbed Cosmo, the system uses to answer customer questions in a conversational mode. Cosmo can answer hundreds of common questions, freeing reference librarians and other staff for more complex and demanding queries. The look and feel of Native Minds was redesigned to conform to the enhanced NLM Main Web site.

Professional Health Information

NLM Classification System: OCCS completed development of the NLM Classification System. The system allows public and institutional access to the NLM Classification and related services and includes a Classification Editor. The NLM Classification is updated annually in tandem with MeSH.

MeSH Browser: A DCMS connection to the MeSH Browser was made that allows DCMS users to enter MeSH terms directly into DCMS.

DOCLINE: DOCLINE, the NLM interlibrary loan system, supports 3,260 domestic and international libraries in processing approximately 3 million
interlibrary loan transactions a year. In FY2004, the DOCLINE team worked with WestLake Services to redesign the Loansome Doc component of DOCLINE. WestLake produced HTML in August, and OCCS developer coding and database redesign were well under way by the end of the fourth quarter. At the end of FY2004, DOCLINE version 2.3 (including an implementation of ISO’s ILL protocol) was in beta testing. Twenty-five enhancements were implemented during the fiscal year.

Relais: Relais was modified to accommodate innovations in DOCLINE, including color copy service and the ability to enter alternate Ariel delivery addresses. Ariel is a scanner-based document transmission system from Infotrieve that uses Internet protocols and Adobe’s Portable Document Format instead of telephone fax services for faster delivery. A new Relais version (4.1) was implemented during the year, enabling delivery of Ariel requests to servers behind firewalls and email security for PDF documents. OCCS provided new server hardware for Relais email functions and provided new scanner hardware to facilitate the Ariel transmission process. An Access application for Relais Express was created which allows staff to query any request they process.

UMLS Licensing System: NLM’s Unified Medical Language System provides UMLS Knowledge Sources (databases) and associated software tools (programs) for the development of computer systems that behave as if they "understand" the languages of biomedicine and health. In 2004, OCCS implemented an online system to license UMLS components.

Voyager Integrated Library System (ILS): OCCS completed initial development and testing of the new XML distribution of Voyager bibliographic data, which allows data sharing between Library Operations and NCBI’s PubMed Entrez search and retrieval system. The team also loaded the Voyager LocatorPlus database into NCBI’s Entrez NlmCatalog system. A Unicode test version of the next release of Voyager (2003.1) entered testing at the end of the quarter.

Literature Selection Technical Review Committee (LSTRC): The OCCS support team tuned and modified the application to improve performance and enhance functionality. LSTRC was successfully converted to ColdFusion MX.

OLDMEDLINE: In FY2004, the December 2003 OLDMEDLINE data set was made available to NLM licensees for the first time. During the fourth quarter of 2004, after final data modifications were made, over 1.7 million OLDMEDLINE citations were prepared for incorporation into the DCMS. In mid-September, 55,850 citations were sent to NCBI for inclusion into PubMed.

Medical Subject Headings: The MeSH Translation Management System (MTMS) is an interlingual database of translations that permits automatic updating of the MeSH terminology tree in all languages. Final development of the MTMS was completed in the first quarter of FY2004. Various foreign-language data sets, including Japanese, Spanish, Portuguese, and Dutch, were loaded. The Global Change Maintenance System (GCMS) allows propagation on demand. The MeSH component of GCMS (called MHGCMS), which entered production in FY 2003, is in use for propagating MeSH term changes. In FY 2004, development was completed on a Keyword maintenance system (called KWGCMS) which provides a similar capability for citations in SPACELINE and other specialty areas managed by NLM's collaborating partners. Numerous MeSH improvements, including changes related to YEP, were implemented in FY 2004.

Data Creation and Maintenance System (DCMS): The major year-end event for DCMS is the baseline extract, a re-release of all DCMS citations with the new MeSH headings. The baseline extract was separated for the first time into three groups based on publication year. Also in FY 2004, the DCMS team accomplished the following:
• Completed the DCMS-to-MeSH Browser connection. This allows the indexers to search and save terms from the MeSH Browser back to DCMS.
• Completed most testing of the new Java version of the XML Loader and Extractor for DCMS. This new version will support Meeting Abstracts and OLDMEDLINE data as well as the 2005 suite of DTDs. Changes are included to support invalid authors and new publishing models.
• Completed a new process to import suggested terms from the Lister Hill Medical Text Indexer (MTI) into the DCMS database. This allows the current DCMS function to be modified by an HTTP database-lookup call to Lister Hill Center's server.
• Completed the redesign of Gene Indexing to work with NCBI’s Gene Entrez database rather than LocusLink.
• Completed twelve monthly issues of Index Medicus.

Serials Extract File (SEF): The SEF team made the Serials Viewer compatible with the modified XML DTDs. The team worked with NCBI to perform its
journal database update using the modified DTDs employed by the Serials Viewer. The team also continued to process LSTRC History data into the SSM table and created a List of Serials Indexed for Online Users and a List of Journals Indexed in Index Medicus for the year 2004. Programs were created to correct errors in DCMS MeSH.

Research and Development Efforts

Advanced Search Engine: During 2004, the RecomMind MindServer™ Retrieval System, an advanced search engine, became a standard component of NLM’s Web services. The RecomMind search engine analyzes search terms entered by the user to infer meaning from the context. For example, a user researching work-related lung diseases might type in the terms “occupational lung diseases.” The string contains no exact matches in the database, but the search engine recognizes an associated concept and locates articles with information on occupational asthma, occupational cancer, etc. Such an expert or intelligent search greatly expands the information available to a researcher, either in the professional or the public health arena, and assists in the critical task of filtering through the ever-increasing amount of information available.

Load Testing: The SilkPerformer server load-testing product from Segue was extensively evaluated, tuned, and moved into production as a standard part of NLM’s application development and quality assurance toolkit. Load testing refers to the practice of modeling the performance of a software program through simulation of access by multiple concurrent users. SilkPerformer allows application developers and testers to predict the breaking points in applications and application infrastructures accurately before deployment.

Accent (Accessibility Enhancement) Project: Accent enables a Web server to provide content to vision-impaired users through text magnification and a machine-generated spoken version of Web-site text. The enhancements are produced at the server so the user requires no additional hardware/software. Accent was integrated with NIHSeniorHealth.gov during the first quarter of FY 2004. The application was integrated with TeamSite workflow management so authors and editors can preview text changes before they are final. Research continues into expanding the range of languages available via Accent, particularly Spanish. Also under study is further enhancement of enunciation and grammatical detail. Site navigation by speech has been prototyped and is being refined as FY2005 begins.

Digital Archive: A project is ongoing to ensure that “permanent” Web-based material remains accessible without adversely impacting searches for more current material. Development has been completed and implementation is expected to conclude early in FY2005. In FY2004, the Advanced Search Engine acquired Phase 1 ability to search the Digital Archive.

RxNorm Project: OCCS designed and developed a prototype to prove the concept of RxNorm nomenclature management. This will standardize the labeling data mandated for clinical drugs by the FDA. By the end of FY2004, development and planning were in an advanced stage. Final setup and testing of the RxNorm Testing (QA) J2EE servers, QA application server and QA Oracle database server and final setup and testing of the RxNorm Production J2EE servers, production application server, and production Oracle database server were well under way. Requirements and click-through mockups of the new RxNorm Editing System were also in an advanced stage of development. (The RxNorm Editing System allows NLM RxNorm editors to create Semantic Normal Forms for clinical drugs.)

NLM Web Support

Web Content Management: The TeamSite workflow feature was applied to previewing Accent “Talking Web” content. Numerous enhancements were made to TeamSite and additional TeamSite features were evaluated. Testing of TeamSite 6.1 began and continues into FY2005.

Web Statistics: The Web Support team installed the WebTrends Web site data analysis software during the first quarter and followed through with training, fixes, and enhancements as needed throughout the year. Library staff can now analyze Web statistics for leading NLM sites such as MedlinePlus, the NLM Main Web, Senior Health, NativeMinds, the NLM Intranet, and a variety of Web applications. An interface to export geographically oriented Web statistics to the MedMap Project was also developed.

NLM Main Web Redesign: The OCCS Web Support team worked with Westlake Solutions to carry out a redesign of the NLM Main Web. The Web Team significantly reworked Westlake’s code to improve accessibility and enhance site maintenance. The project was completed in approximately 60% of the scheduled time.

NLM Link Checker: A custom NLM Link Checker was launched to replace the ailing Mom Spider. During the course of the year, the team modified the
Link Checker code to optimize its running time and fix a number of outstanding problems.

Technical Bulletin: A template capable of generating a printable bulletin was provided and multiple changes to the template were implemented in response to ongoing requests.

Miscellaneous: The OCCS Web Support team provided detailed and intensive technical research, support, and development related to all NLM Web and Intranet pages, sites, configurations, and functions.

Outreach

Consumer Outreach and Health System: OCCS developed this system to support NLM’s Consumer Outreach and Health System (COHS). The system entered production in the spring of 2003. During FY2004, the team developed an interface to export Outreach Project data via HTTP post requests made by local Outreach systems. The data set is extracted on demand and converted to XML before being transferred to the requester. Additionally, the team upgraded the XML data service to provide project funding information, implemented rules updates for local partner organizations, and analyzed and fixed a local Output data discrepancy.

Web-based Exhibits: This system tracks OCCS exhibit activity and presence at national meetings. The database includes activities initiated both by internal NLM staff and the staffs of the Regional Medical Libraries. During FY2004, numerous enhancements were implemented, including a reformatted navigation menu with clusters of related functions, functionality for state and local exhibits, a method for identifying past national exhibits for which reports cannot be found, Add/Create User and Delete User functionality for administrators, and a number of database enhancements. New reports were also added.

HSRProj: In response to user requests, OCCS created a new interface for the HSRProj database and moved it into production at the end of May. The new interface included options that allow searchers to find Projects, Investigators, and Supporting Agencies quickly. A total of 324 new project records were added to the database in early May.

Health Services Research Resources (HSRR): HSRR is used by the National Information Center on Health Services Research & Health Care Technology to post information on datasets, instruments, and software frequently used in health services research and in the behavioral and social sciences. In 2004, OCCS implemented advanced search functionality for the HSRR Web site. By request of the customer, work on HSRR was then suspended pending the completion of the HSRProj and Public Health Partners sites.

Public Health Partners: A result of collaboration between U.S. government agencies, public health organizations, and health sciences libraries, Public Health Partners provides the public health workforce with timely, convenient access to information resources. During FY2004, the Public Health Partners site was converted from static HTML to a dynamic ColdFusion application.

Administrative Support Systems

Customer Service Support System: The latest compiled version of Customer Service was implemented during the first quarter of FY2004. A new, productivity-enhancing Smartscript agent now provides first-tier staff with a form for quick capture of caller information. A hierarchical view of service requests allows each manager to see requests for all agents in his or her managed department. A Firewall Service Request Management System, rolled out in September of 2004, greatly enhances the efficiency and manageability of network security operations.

Cataloging Statistics Management System (CSMS): This system entered production in the first quarter of FY2004. It comprises ColdFusion, Oracle, HTML, and JavaScript-based functionality to create individual and section-level statistical reports for monthly and yearly production and similar activities. Employee management functions and other enhancements were added during the course of the year.

Small Purchase Management System: This system received report modifications and other maintenance and enhancements during FY2004.
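
The Consumer Outreach and Health System export described in this section (project data extracted on demand in response to an HTTP post request, then converted to XML for the requester) can be illustrated with a minimal sketch. It is not NLM's implementation: the element names, the field names, and the `region` filter are all hypothetical, and the HTTP layer is reduced to a plain dictionary of posted form fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical field names; the report does not describe the real schema.
EXPORT_FIELDS = ("title", "partner", "funding")


def projects_to_xml(projects):
    """Convert a list of project dicts into one XML document (as a string)."""
    root = ET.Element("outreach_projects")
    for project in projects:
        node = ET.SubElement(root, "project", id=str(project["id"]))
        for field in EXPORT_FIELDS:
            if field in project:
                ET.SubElement(node, field).text = str(project[field])
    return ET.tostring(root, encoding="unicode")


def handle_export_request(form, records):
    """Select records on demand from posted form fields, then convert to XML."""
    region = form.get("region")  # hypothetical filter parameter
    selected = [r for r in records if region is None or r.get("region") == region]
    return projects_to_xml(selected)


if __name__ == "__main__":
    sample = [
        {"id": 1, "title": "Clinic training", "region": "MAR", "funding": 5000},
        {"id": 2, "title": "Library exhibit", "region": "PNR"},
    ]
    print(handle_export_request({"region": "MAR"}, sample))
```

Building the XML only when a request arrives, rather than maintaining a stored export file, matches the "extracted on demand" behavior the report mentions; an optional field such as funding is simply omitted when a record does not carry it.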

Administration

Jon G. Retzlaff
Executive Officer

Table 12
Financial Resources and Allocations, FY 2004
(Dollars in Thousands)

Budget Allocation:
Extramural Programs ...... $69,597
Intramural Programs ...... 237,119
    Library Operations ...... (89,895)
    Lister Hill National Center for Biomedical Communications ...... (60,744)
    National Center for Biotechnology Information ...... (72,324)
    Toxicology Information ...... (14,156)
Research Management and Support ...... 11,218
Total Appropriation ...... 317,997
Plus: Reimbursements ...... 11,101
Total Resources ...... $329,089

Table 13
FY 2003 Full-Time Equivalents

Office of the Director ...... 10
Office of Health Information Programs Development ...... 7
Office of Communication and Public Liaison ...... 8
Office of Administration ...... 41
Office of Computer and Communications Systems ...... 50
Extramural Programs ...... 15
Lister Hill National Center for Biomedical Communications ...... 81
National Center for Biotechnology Information ...... 143
Specialized Information Services ...... 35
Library Operations ...... 285
TOTAL FTEs ...... 675

Personnel

In August 2003, Kin Wah Fung, M.D., joined the Lister Hill Center staff as a postdoctoral fellow. He received his medical degree from the University of Hong Kong and a master’s degree in Medical Informatics from Columbia University. Dr. Fung has over 15 years of clinical practice in surgery. At the Hong Kong Hospital Authority, his informatics work was in user communication. At the Lister Hill Center, Dr. Fung is working on the UMLS project.

In November 2003, Hua Florence Chang was appointed Chief, Biomedical Files Implementation Branch, Division of Specialized Information Services. Ms. Chang earned an M.S. in Computer Science from Johns Hopkins, and a B.S. in Biology from the University of Maryland. She joined NLM in 2001 as a computer specialist and has played a key role in the design and implementation of several SIS products.

In December 2003, Malay Kumar Basu, Ph.D., joined the staff of the NCBI Computational Biology Branch as a Visiting Fellow. Dr. Basu received his B.Sc. and M.Sc. in zoology from the University of Calcutta. In 2003, he received his Ph.D. from the Center for Cellular and Molecular Biology, Hyderabad, India. Dr. Basu will conduct research on phylogenetic classification of genes and proteins and develop tools to advance the database of Cluster of Orthologous Groups of protein and other systems.

In January 2004, Liran Carmel, Ph.D., joined the staff of the Computational Biology Branch, NCBI as a Visiting Fellow. Dr. Carmel received his Master’s degree in physics from the Israel Institute of Technology, Israel and his Ph.D. in mathematics and computer science from the Weizmann Institute of Science, Israel in 2003. Dr. Carmel will conduct research on genome evolution.

In January 2004, Barend Johannes Mans, Ph.D., joined the staff of the Computational Biology Branch, NCBI as a Visiting Fellow. Dr. Mans received both his M.Sc. and his Ph.D. degrees in biochemistry from the University of Pretoria, South Africa. At NCBI, Dr. Mans will conduct research on the evolution of protein families on genome scale.

In February 2004, Rampriya Ramarathnam, Ph.D., joined the staff of the Computational Biology Branch, NCBI as a Visiting Fellow. She obtained her Ph.D. in bioengineering from the University of California, San Diego in 2003. Dr. Ramarathnam will conduct research on the classification of protein sequences and structures.

In April 2004, Alice E. Jacobs was appointed Acting Head of the Cataloging Section, Technical Services Division. Ms. Jacobs graduated Phi Beta Kappa from Washington University with a B.A. in French in 1972 and received her M.S. in library science from Simmons College in 1974. She came to the Cataloging Section of NLM in 1975, first working as an audiovisuals cataloger on the development of the AVLINE® database and later becoming Head of Unit
I in the section in 1986. Since 1991, Ms. Jacobs has served as Assistant Head of the Cataloging Section.

In May 2004, Sunghwan Sohn, Ph.D., joined the staff of the NCBI Computational Biology Branch as a Visiting Fellow. Dr. Sohn received his Master’s degree in computer engineering from the University of Missouri-Columbia and his Ph.D. in engineering management from the University of Missouri-Rolla. Dr. Sohn will focus his research on the problem of mining data from the literature relevant to classes of genes such as arise from gene expression arrays.

In June 2004, Patricia L. Gibbons joined NLM as Chief, Office of Acquisitions Management. Ms. Gibbons comes to NLM from the National Institute of Mental Health (NIMH). She brings with her 17 years of contracting experience and unlimited Contracting Officer authority. Ms. Gibbons received her BA in political science from Pennsylvania State University and is currently an MBA student at the University of Maryland.

In June 2004, Alvin L. Harris was appointed Acting Chief of the Office of Administrative and Management Analysis Services. Mr. Harris joined the NLM in 1971 and has served as the Deputy of the Office of Administrative and Management Analysis Services since 1988. Mr. Harris will provide continuity of service and firm leadership until a new permanent Chief of the Office of Administrative and Management Analysis Services is selected.

In June 2004, Janaki Ananth Mahadevan, Ph.D., joined the staff of the Computational Biology Branch, NCBI as an IRTA Fellow. Dr. Mahadevan obtained her M.Sc. degree in chemistry from the Indian Institute of Technology, Madras, India and her Ph.D. in computational chemistry, with honors, from the University of Kansas in 2001. Dr. Mahadevan will conduct research on new strategies based on genome-wide analysis of sequence and structural data.

In June 2004, Balaji Santhanam, Ph.D., joined the staff of the Computational Biology Branch, NCBI as a Visiting Fellow. Dr. Santhanam received both his M.Sc. in physics and Ph.D. degree in biophysics from the Indian Institute of Science, Bangalore, India. Dr. Santhanam will conduct research on the large scale analysis of protein structures and sequences using computational methods.

In July 2004, Joyce Backus assumed the position of Head, Reference and Customer Services, Public Services Division. Ms. Backus received her A.B. from Duke University and M.S. in library science from Catholic University of America. Ms. Backus has been with NLM since her 1985/86 NLM Associate Fellow year. She has served as a Reference Librarian and a Systems Librarian. Ms. Backus has made significant contributions to many NLM products and programs including Grateful Med, LocatorPlus, NIHSeniorHealth, the Intranet, and MedlinePlus.

In July 2004, Saikat Chakrabarti, Ph.D., joined the staff of the Computational Biology Branch, NCBI as a Visiting Fellow. Dr. Chakrabarti received his Master’s degree in biophysics, molecular biology and genetics as well as his Ph.D. in computational approaches to protein sciences from the National Centre for Biological Sciences, Bangalore, India. Dr. Chakrabarti will conduct research on development of automated structure-based multiple alignment techniques.

In July 2004, Haixia Du, Ph.D., joined the staff of the Lister Hill Center as a Postdoctoral Fellow. Dr. Du received her doctorate degree from the Department of Computer Science at Stony Brook University. At NLM Dr. Du will work with the Office of High Performance Computing and Communications as part of the 3D Informatics research program under the Visible Human Project.

In July 2004, Incheol Kim, Ph.D., joined the staff of the Lister Hill Center as a Postdoctoral Fellow. He received his doctorate degree in information processing engineering from Kyungpook National University, Taegu, Korea. At LHNCBC, Dr. Kim will conduct research in document metadata extraction using image processing and Web document analysis techniques.

In August 2004, Raja Jothi, Ph.D., joined the staff of the Computational Biology Branch, NCBI as a Visiting Fellow. Dr. Jothi obtained both his M.Sc. and his Ph.D. in computer science from the University of Texas at Dallas, in 2004. Dr. Jothi will do research on models and algorithms for studying protein networks and co-evolution of interacting proteins.

In August 2004, Jane Bortnick Griffith was appointed Acting Deputy Director, NLM. Ms. Bortnick Griffith joined NLM in 2000 as Assistant Director for Policy and Legislative Development. Ms. Bortnick Griffith holds a BA in American history from the University of Wisconsin and an MA in American history from Rutgers University. Prior to joining NLM, she worked as a senior specialist at the Library of Congress and served as director of a task force (under the aegis of National Academy of Sciences, National Academy of Engineering, and the
Institute of Medicine) that examined the goals, organization, and operational effectiveness of the National Research Council.

NLM Associate Fellowship Program

The NLM Associate Fellowship Program is a one-year training fellowship for recent graduates of master’s degree programs in library and information science. Fellows receive a comprehensive orientation to NLM programs and services during a structured 5-month curriculum phase, and conduct individual projects over the remaining 7-month period. Projects relate to key NLM program areas and are typically of a research, development, or evaluation nature. Six new Associate Fellows began their year at NLM on September 1, 2004.

Margaret A. Basket received her MSI in May 2004 from the University of Michigan. She has library intern experience at the Minnesota State Archives and the 3M Company, as well as experience as head librarian for a university residence hall library. Prior to beginning her career in librarianship, she had 10 years’ experience as a project and technical service engineer at the 3M Company. She also spent four years, from 1998 to 2002, as a Peace Corps volunteer in Senegal. Her undergraduate training was in Mechanical Engineering.

Stephanie N. Dennis received her MLS in May 2004 from the University of Maryland. As a Graduate Assistant, she gained experience in the digital conversion of paper-based records. She also has experience developing Web sites, creating Web tutorials, and categorizing online health information for a search engine development project. Prior to beginning her career in librarianship, she worked on a variety of projects within the Grants Resource Center of the American Association of State Colleges and Universities. Her undergraduate degree is in English Language and Literature.

Loren R. Frant received her MLIS in June 2004 from the University of California, Los Angeles. She has varied experience in libraries and museums, including cataloging visual history material, providing reference assistance, and conducting training sessions for library users. She also served as a volunteer librarian in South Africa during the summer of 2003. Prior to beginning her career in librarianship, she provided client support for companies delivering information management systems. Her undergraduate degree is in American Studies.

Rachel A. Gyore received her MLS in May 2004 from the University at Buffalo, the State University of New York. She has four years’ experience as a library assistant in the Edward G. Miner Library at the University of Rochester, working in reference, circulation, archives, and Web management. She also has three years’ experience as an assistant manager at Borders Books & Music. Her undergraduate degree is in History.

Lidia Y. Hutcherson received her MLIS in May 2004 from the University of Illinois at Urbana-Champaign. She has experience as a Graduate Assistant in the Library of the Health Sciences and in the University Library’s Office of Planning and Budgeting. She also has four years’ experience as a library technician, working in public services at Thomas Jefferson University and in technical services at Washington University in St. Louis. Her undergraduate degree is in History.

Sandy D. Tao received her MLIS degree in May 2004 from San Jose State University in California. She has experience in library automation, serving as a metadata support specialist at the Stanford University Library. Prior to beginning her career in librarianship, she had 5 years’ experience in information systems development, including database, Web site, and Web applications development. She has laboratory experience as a research technician on a human genome research project. Her undergraduate training is in Biology.

Retirements and Separations

In November 2003, Merlyn Rodrigues, M.D., departed NLM to join the National Center for Minority Health and Health Disparities, NIH. Dr. Rodrigues joined the Division of Extramural Programs as NLM’s Scientific Review Administrator in February 2001. At EP, Dr. Rodrigues expertly and efficiently arranged the timely review of all grant applications.

In December 2003, Maria Korab-Laskowska, Ph.D., resigned her Staff Scientist position with the NCBI. Dr. Korab-Laskowska joined NCBI’s Information Engineering Branch in December 1999. She was responsible for developing and maintaining the locusXref database.

In January 2004, John Parascandola, Ph.D., retired from the Federal government and his most recent position as Public Health Service Historian, Audiovisual Program Development Branch, LHNCBC. Prior to accepting that post, Dr. Parascandola served as Chief of NLM’s History of Medicine Division from 1983 to 1992. Dr. Parascandola’s contributions have been recognized by the PHS through such honors as the Surgeon
General’s Exemplary Service Award (1989 and 1996), the Assistant Secretary for Health’s Superior Service Award (1999), and the NIH Merit Award (1988). He is also the recipient of several awards in the history of science and medicine. His book The Development of American Pharmacology: John J. Abel and the Shaping of a Discipline was awarded the George Urdang Medal for distinguished pharmaco-historical writing by the American Institute of the History of Pharmacy in 1994.

In February 2004, Christa F.B. Hoffmann retired from the Federal government and from her position as Head of the Cataloging Section, Technical Services Division, LO, which she had held since October 1980. She came to NLM from the University of Nebraska–Lincoln Libraries where she was an Associate Professor of Library Science and head of the catalog department. During her career at NLM, Ms. Hoffmann led the Cataloging Section into a fully automated environment and played a key role in NLM’s participation in national bibliographic programs. In September 2003, Ms. Hoffmann received the Frank B. Rogers Award.

In April 2004, Duane W. Arenales retired from her position as Chief, Technical Services Division, Library Operations, after 34 years of service with the Federal government, 32 of them at NLM. She came to the Library in 1971 after receiving an MLIS from the University of Maryland. As Chief, Technical Services Division, Ms. Arenales was responsible for NLM collection development policy; for the selection, acquisition and cataloging of material for the NLM’s general collection; for overseeing the development and implementation of related processing systems; and for representing NLM in national bibliographic programs. She received the NIH Director’s Award in 1998.

In April 2004, Theodore E. Youwer retired from his position as Chief, Office of Administrative Management and Analysis Services, Office of Administration. This was Mr. Youwer’s second retirement from Federal service, as he initially retired from the U.S. Air Force. In 1990 he joined the NLM staff as Chief, OAMAS. During his tenure at NLM, Mr. Youwer managed a host of major projects that significantly improved both the functional and aesthetic appearance of the library buildings and grounds, and he directed major improvements that enhanced the quality of the workplace environment. Mr. Youwer received many accolades for his numerous contributions including the NIH Award of Merit and the prestigious NLM Director’s Award.

In July 2004, Kent Smith retired from his position as NLM Deputy Director after 42 years of service with the Federal government. He received his B.A. degree from Hobart College in mathematics and economics and his M.A. degree from the Johnson School of Management at Cornell University. During his tenure, Mr. Smith also held leadership positions on various committees, including President of the International Council of Scientific and Technical Information (ICSTI), Chair of the Policy Group of the Federal Library and Information Center Committee (FLICC), and Vice President of UNESCO General Information Program. He received numerous Senior Executive Service Achievement Awards, the Assistant Secretary for Health Exceptional Achievement Award, NLM Director’s Award, the HHS Superior Service Medal, 1997 Medical Library Association President’s Award, and the 1998 NFAIS Miles Conrad Lecture.

In July 2004, Robert H. Cross retired from his position as Education Specialist, Audiovisual Program Development Branch, LHNCBC after 40 years of service with the Federal government, 26 of which were with NLM. Between 1970 and 1980, he served as the Personnel Officer for various NIH Institutes including NLM and the U.S. Department of Agriculture. He returned to the NLM in 1980 as a Program Analyst for the Office of the Director, LHNCBC and in 1986 became a Staff Assistant in the Audiovisual Program Development Branch.

In September 2004, Jon G. Retzlaff resigned from his position as NLM Executive Officer. Mr. Retzlaff came to NLM in 2002 from the National Institute of Neurological Disorders and Stroke. While at NLM he provided advice to the Director and other senior staff on administrative management matters and directed the administrative programs and services of the NLM. Mr. Retzlaff accepted a position as Director of Legislative Relations with the Federation of American Societies for Experimental Biology.

In Memoriam

In July 2004, William Leonard, the NLM Audiovisual Information Officer, passed away suddenly. Mr. Leonard worked for years in the field of broadcast journalism, most notably for NBC, where he won four Emmy Awards. Mr. Leonard came to the NLM in the mid 1970s, where he worked on programs designed to connect poorly served rural communities with the latest in medical information. For the last two decades, Mr. Leonard acted as Producer and Director on scores of audiovisual programs highlighting important projects. Last year Mr. Leonard was the recipient of the NLM Director’s Award, the Library’s highest honor. He will be truly missed.

Awards

The 2004 Secretary’s Award for Distinguished Service was awarded to David J. Lipman, M.D. for exceptional leadership in establishing NIH as the major resource in the field of computational molecular biology.

The NIH Director’s Award was awarded to Martha R. Szczur for developing consumer information resources to assist in identifying harmful chemical and environmental hazards.

The NLM Board of Regents Award for Scholarship or Technical Achievement was awarded to Dr. Stuart J. Nelson for initiating, designing, and directing the development of RxNorm, a clinical drug nomenclature designated as a U.S. Government-wide interoperability standard.

The Frank B. Rogers Award recognizes employees who have made significant contributions to the Library’s fundamental operational programs and services. The recipient of the 2004 award was Ms. Gail A. Dutcher in recognition of significant contributions to many NLM programs including outreach to minority communities, consumer health and HIV/AIDS activities, and development of related health information resources.

The NLM Director’s Award, presented in recognition of exceptional contributions to the NLM mission, was awarded to three employees: Yuen-Yin Kathy Kwan (NCBI) for creating and managing the LinkOut project; Julia C. Royall (OHIPD) for unique contributions to strengthening NLM’s international outreach to developing countries through the Multilateral Initiative on Malaria; and Patricia Tuohy (LO) for outstanding management of the design, development, and installation of NLM’s major exhibitions and associated educational programs.

The NIH Merit Award was presented to four individuals and a group: Dr. Valerie Florance (EP) for her sustained and diligent excellence in administering, improving, and publicizing NLM’s grant programs; Ms. Judy C. Jordan (LO) for highly successful management of NLM’s ILL Serials Processing Group; Dr. Craig Locatis for his leadership and continuing support to NLM’s ongoing partnership with the Radiological Society of North America; Ms. Jane L. Rosov for superior management of NLM’s Licensing and Data Distribution Program, which extends access to MEDLINE data; and the Extramural Program Special Meeting Team (Ms. Christine C. Ireland, Ms. Michelle D. Krever, Ms. Susan Wilcox) for excellence in the organization, planning, and coordination of special meetings of critical importance to the Division of Extramural Programs.

The Philip C. Coleman Award recognizes significant contributions to the NLM by individuals who demonstrate outstanding ability to motivate colleagues. The recipient of the 2004 award was Ms. Deirdre A. Clarkin (LO) for the successful management and motivation of student employees in NLM’s Collection Access Section Onsite Unit.

The NLM EEO Special Achievement Award was presented to Dr. James E. Knoben for his work with the Diversity Council and for spearheading the implementation of the “English Language Program,” an initiative aimed at helping improve the language proficiency of employees whose first language is not English.

The Pehr Edman Award was presented by the International Association for Protein Structure Analysis and Proteomics to Dr. Stephen F. Altschul (NCBI) for outstanding contributions to protein and nucleic acid bioinformatics.

The 2004 Senior Scientist Accomplishment Award was presented by the International Society for Computational Biology to Dr. David J. Lipman for his contributions to the field of computational biology through research.

The 2004 Medical Library Association President’s Award was presented to Ms. Martha Fishel and Ms. Betsy Humphreys in recognition of their leadership and contributions to the professional development programs of the Association.

The Thomson Scientific/Frank Bradway Rogers Information Advancement Award was presented by the Medical Library Association to the following individuals: Ms. Joyce Backus, Ms. Paula Kitendaugh, Ms. Lori Klein, Ms. Eve Marie Lacroix, Ms. Wei Ma, Ms. Jennifer Marill, and Ms. Naomi Miller in recognition of distinguished professional contributions to the application of technology in the delivery of health care information in the development of MedlinePlus.

NLM Committee Activities

NLM Diversity Council

The NLM Diversity Council began 2004 by welcoming four new members: Patricia Carson, Melanie Modlin, Helen Ochej, and Bryant Pegram. Each will serve a two-year term from January 2004 through December 2005. Continuing on the Council
are: Kathleen Cravedi, Felicia Derricott, James Knoben, Tameka Gore, Renee McLean-Banks, Donald Jenkins, and Linda Tang. The Council continues to receive support from its ex-officio members: Ronald Stewart, Acting Executive Officer, David Nash from the Equal Employment Opportunity Office, and Nadgy Roey from the Office of Human Resources, as well as its distinguished alumni. Kathleen Cravedi accepted the responsibilities of Council Chair and James Knoben became Council Vice-Chair.

FY2004 Accomplishments:

• NLM Director’s Employee Education Fund: The NLM Diversity Council continued its coordination of the NLM Director’s Employee Education Fund. In FY2004, the Fund enabled 77 NLM staff to take 85 classes from 18 area schools. This is up from 46 staff taking 65 classes from 13 area schools in FY2003. Undergraduate classes made up the majority of classes supported. The school with the largest number of NLM enrollees was the University of Maryland (21 attendees) with Montgomery College coming in second (17 attendees). Course disciplines enrolled in included psychology, business, marketing, computer networking, chemistry, economics, and biology. In addition to traditional classroom instruction, some courses were taken on the Internet. The Diversity Council continues its effort to publicize the availability of the fund. In fact, the Director’s Employee Education Fund is featured under “Benefits” in a new NLM brochure entitled “Working at the NLM.”
• Facility Accessibility and Reasonable Accommodations: The Council continued efforts to upgrade access at NLM for people with disabilities. Accessibility features have now been added to many of the bathrooms in NLM to accommodate the disabled community, and Conference Room B has had an LED Caption Display installed to provide a scrolling LED display of CART and real-time captioning to be seen by everyone in the room. The Diversity Council has approved and is working with the NLM’s Office of Acquisitions Management to acquire an electric wheelchair for use by requesting patrons.
• Communication of NLM Diversity: The Diversity Council again collaborated with the Office of Communications and Public Liaison to promote various activities on the NLM Staff Bulletin Board located outside the cafeteria. This display has provided an excellent setting for celebrating the diversity found at the NLM. The Council voted to have OCPL staffer Fran Sandridge attend meetings on an ex-officio basis to assist in the design of needed bulletin displays.
• English Language Courses: The Council is continuing to support an English language program to enable NLM employees to improve their linguistic proficiency in speaking and writing English. Following the model used by local literacy programs, the NLM program offers one-on-one tutoring. NLM staff who volunteer to serve as tutors receive specialized training from the Literacy Council of Montgomery County. Four English language instructors and four students were selected in 2004, with one of those tutors currently on standby pending completion of tutor training. In 2005, the Diversity Council may consider whether to expand the program to include Spanish language instruction for those employees whose work involves that language.
• NLM Health Education Expo: In 2004, the Diversity Council sponsored its first annual Health Education Expo for Employees. The Expo, organized by Linda Eisenstadt and the Council, was titled “Keep a Check on Your Health.” While women were the primary audience, men were encouraged to attend. The program was held at Lister Hill Auditorium with a one-hour presentation by Dr. Patricia Davidson, followed by questions on “Hypertension and Heart Disease in Women and Preventive Measures.” The keynote presentation was followed by exhibits in the Lister Hill Lobby where numerous groups including the American Diabetes Association, the National Women’s Health Information Center, the Washington Health Center, among others, provided valuable health information to NLM employees. In addition, the American Heart Association was on hand to answer employee questions. Blood pressure and cholesterol screening were provided by NIH. There were raffles for door prizes, including a membership to the NIH Fitness Center. The Expo was a great success and the Council decided to make it an annual event.
• Diversity Council Honors NLMers with Awards and Ice Cream Social. The Diversity Council sponsored a “Laborless” Moment to honor NLMers whose volunteer activities helped to promote diversity and improve employment opportunities at the NLM in 2003–04. Ben and Jerry’s supplied
the ice cream on the patio adjacent to the Lister Hill Auditorium following a brief awards ceremony in the auditorium. About 400 NLMers attended. It was agreed that the Diversity Council awards ceremony and ice cream social should remain an annual event.
• Reading Club: The Diversity Council sponsors a reading club that meets regularly for interested employees.
• Looking Under Your Hood: In 2004, the Diversity Council sponsored a series of monthly lectures by Dr. Donald Jenkins, a member of the Council. The lectures, as the title suggests, provided a valuable and fascinating overview of regions of the body, as shown in the David Bassett archive of images of human cadaver anatomy. This lecture series was based on the belief that personal knowledge about the intricate structure of the human body is beneficial to health and well-being.
• Diversity Council Coat and Clothing Drive: The Diversity Council sponsored a coat and clothing drive during the 2004 Thanksgiving and Christmas holidays. Over three carloads of clothing and more than 400 coats were collected by the Diversity Council and delivered to the Shepherd’s Table in Silver Spring, Maryland. This Center provides food and clothing to approximately 150 needy people daily.

Board of Regents

The Board of Regents (BOR) met three times in FY 2004 on February 10–11, May 19–20, and September 21–22. The Extramural Programs Subcommittee and the Subcommittee on Outreach and Public Information met during each of these meetings. During the FY2004 meetings, the Board of Regents reviewed several new and ongoing projects:
• In February, the Board was given presentations on Entrez databases and features, NLM website user studies, Hawaii Access to Computerized Health Information grant program, a Native American internship outreach project, and outreach activities in Africa. The Regents approved a draft Board statement on the “Library of the 21st Century,” and approved the Board Operating Procedures for 2004.
• In May, the Board was given updates on the Information Rx Project, Just in Time Information for Clinicians Grant Project, a report from the Midcontinental Regional Medical Library, International Toxicity Estimates of Risk database, the Wireless Information System for Emergency Responders project, facilitating foreign language versions of MeSH, WebMARS data extraction from online journals, and an outreach project on high school students connecting with MedlinePlus. The Board gave concept approval for Roadmap Initiatives, a Multi-Agency Modeling Project, and a Specialized Information Services Public Health Law Information Project.
• In September, the Board reviewed several projects including the NCBI Bookshelf, the NCBI PubChem system (part of the Roadmap Initiative), a Hospital Elder Life grant project, the CINID-Model Disability Information Network grant project, consumer health information services, a revised version of the Collection Development Manual, and the Visible Human Project. The Board provided concept approval for NLM’s support of informatics research, and approval of the revised Collection Development Manual.

During all Board meetings, the committee performed the secondary peer review process for the NLM grant program. Other grant-related activities are listed under the Extramural Programs section of this annual report.

A new Chair was elected to the Board of Regents, Dr. William W. Stead, Professor of Biomedical Informatics at Vanderbilt University in Nashville, Tennessee. Two new members joined the Board in September: The Honorable Newt Gingrich, Chief Executive Officer of The Gingrich Group, in Washington, D.C., and Mr. Richard Chabran, Chair of the California Community Technology Policy Group, in Chino Hills, California.

APPENDIX 1: REGIONAL MEDICAL LIBRARIES

1. MIDDLE ATLANTIC REGION
The New York Academy of Medicine
1216 Fifth Avenue
New York, NY 10029–5283
(212) 822-7396 FAX (212) 534-7042
States served: DE, NJ, NY, PA
URL: http://www.nnlm.nih.gov/mar

2. SOUTHEASTERN/ATLANTIC REGION
University of Maryland at Baltimore
Health Science and Human Services Library
601 Lombard Street
Baltimore, MD 21201-1583
(410) 706-2855 FAX (410) 706-0099
States served: AL, FL, GA, MD, MS, NC, SC, TN, VA, WV, DC, VI, PR
URL: http://www.nnlm.nih.gov/sar

3. GREATER MIDWEST REGION
University of Illinois at Chicago
Library of the Health Sciences (M/C 763)
1750 West Polk Street
Chicago, IL 60612-7223
(312) 996-2464 FAX (312) 996-2226
States served: IA, IL, IN, KY, MI, MN, ND, OH, SD, WI
URL: http://www.nnlm.nih.gov/gmr

4. MIDCONTINENTAL REGION
University of Utah
Spencer S. Eccles Health Sciences Library
10 North 1900 East
Salt Lake City, Utah 84112-5890
Phone: (801) 581-8771
Fax: (801) 581-3632
States served: CO, KS, MO, NE, UT, WY
URL: http://nnlm.gov/mcr

5. SOUTH CENTRAL REGION
Houston Academy of Medicine-Texas Medical Center Library
1133 M.D. Anderson Boulevard
Houston, TX 77030-2809
(713) 799-7880 FAX (713) 790-7030
States served: AR, LA, NM, OK, TX
URL: http://www.nnlm.nih.gov/scr

6. PACIFIC NORTHWEST REGION
University of Washington
Regional Medical Library, HSLIC
Box 357155
Seattle, WA 98195-7155
(206) 543-8262 FAX (206) 543-2469
States served: AK, ID, MT, OR, WA
URL: http://www.nnlm.nih.gov/pnr

7. PACIFIC SOUTHWEST REGION
University of California, Los Angeles
Louise M. Darling Biomedical Library
Box 951798
Los Angeles, CA 90025-1798
(310) 825-1200 FAX (310) 825-5389
States served: AZ, CA, HI, NV and U.S. Territories in the Pacific Basin
URL: http://www.nnlm.nih.gov/psr

8. NEW ENGLAND REGION
University of Massachusetts Medical School
The Lamar Soutter Library
55 Lake Avenue, North
Worcester, MA 01655
(508) 856-2399 FAX (508) 856-5039
States served: CT, MA, ME, NH, RI, VT
URL: http://nnlm.gov/ner

APPENDIX 2: BOARD OF REGENTS

The NLM Board of Regents meets three times a year to consider Library issues and make recommendations to the Secretary of Health and Human Services affecting the Library.

Appointed Members:

STEAD, William W., M.D. (chair)
Professor of Biomedical Informatics
Vanderbilt University
Nashville, TN

BUCHANAN, Holly S., Ed. D.
Director and Professor
Health Sciences Library & Informatics Center
University of New Mexico
Albuquerque, NM

CARTER, Ernest L., M.D.
Director, Telehealth Sciences
Howard University
Washington, D.C.

CHABRAN, Richard, M.L.S.
Chair, California Community Technology Policy Group
3081 Sunrise Court
Chino Hills, CA

CONERLY SR., A. Wallace, M.D.
Dean, University of Mississippi School of Medicine
Jackson, MS

DEAN, Richard H., M.D.
President, Wake Forest University Health Sciences
Winston-Salem, NC

DETRE, Thomas, M.D.
Distinguished Service Prof. of Health Sciences
University of Pittsburgh
Pittsburgh, PA

GINGRICH, Newt, Ph.D.
Chief Executive Officer
The Gingrich Group
Washington, DC

KARLIS, Vasiliki, D.M.D., M.D.
Associate Professor
Department of Oral and Maxillofacial Surgery
New York University College of Dentistry
New York, NY

Ex Officio Members:

Librarian of Congress

Surgeon General
Public Health Service

Surgeon General
Department of the Air Force

Surgeon General
Department of the Navy

Surgeon General
Department of the Army

Under Secretary for Health
Department of Veterans Affairs

Assistant Director for Biological Sciences
National Science Foundation

Director
National Agricultural Library

Dean
Uniformed Services University of the Health Sciences

APPENDIX 3: BOARD OF SCIENTIFIC COUNSELORS/ LISTER HILL CENTER

The Board of Scientific Counselors meets periodically to review and make recommendations on the Library’s intramural research and development programs.

Members:

FULLER, Sherrilynne S., Ph.D. (Chair)
Professor of Biomedical & Health Informatics
University of Washington School of Medicine
Seattle, WA

CARTER, Jerome H., M.D.
Director, Division of Infectious Diseases
University of Alabama
Birmingham, AL

CHEN, Hsinchun, Ph.D.
Professor of Management Information Systems
University of Arizona
Tucson, AZ

FERRIN, Thomas E., Ph.D.
Professor of Pharmaceutical Chemistry
University of California
San Francisco, CA

FRIEDMAN, Carol, Ph.D.
Adjunct Professor, Dept. of Medical Informatics
Columbia University
New York, NY

GIUSE, Nunzia B., M.D.
Associate Professor of Biomedical Informatics
Vanderbilt University
Nashville, TN

SRIHARI, Sargur N., Ph.D.
Distinguished Professor
Computer Science & Engineering
State University of NY
Buffalo, NY

APPENDIX 4: BOARD OF SCIENTIFIC COUNSELORS/ NATIONAL CENTER FOR BIOTECHNOLOGY INFORMATION

The NCBI Board of Scientific Counselors meets periodically to review and make recommendations on the NLM’s biotechnology-related programs.

Members:

PREUSS, Daphne K., Ph.D. (Chair)
Assistant Professor
Molecular Genetics and Cell Biology
University of Chicago
Chicago, IL

FIRE, Andrew Z., Ph.D.
Staff Scientist
Department of Embryology
Carnegie Institution
Baltimore, MD

KWITEK, Anne E., Ph.D.
Assistant Prof., Dept. of Physiology
Human & Molecular Genetic Center
Medical College of Wisconsin
Milwaukee, WI

MACKAY, Trudy F., Ph.D.
Professor, Dept. of Genetics
North Carolina State University
Raleigh, NC

SALEMME, F. Raymond, Ph.D.
President
Imiplex, LLC
Yardley, PA

SALZBERG, Steven L., Ph.D.
Senior Director of Bioinformatics
The Institute for Genomic Research
Rockville, MD

TRASK, Barbara J., Ph.D.
Head, Human Biology Division
Fred Hutchinson Cancer Research Ctr.
Seattle, WA

APPENDIX 5: BIOMEDICAL LIBRARY AND INFORMATICS REVIEW COMMITTEE

The Biomedical Library and Informatics Review Committee meets three times a year to review applications for grants under the Medical Library Assistance Act.

Members:

HRIPCSAK, George, M.D. (chair)
Associate Professor
Department of Medical Informatics
Columbia University
New York, NY

ALTMAN, Russ B., M.D., Ph.D.
Associate Professor, Medical Informatics
Stanford Medical School
Stanford, CA

BALAS, Andrew, M.D., Ph.D.
Dean and Professor
College of Health Sciences
Old Dominion University
Norfolk, VA

BYRD, Gary D., Ph.D.
Director, Health Sciences Library
State University of NY at Buffalo
Buffalo, NY

CAMPBELL, James R., M.D.
Professor of Internal Medicine
University of Nebraska Medical Center
Omaha, NE

CLAYTON, Paul D., Ph.D.
Chief Medical Informatics Officer
Intermountain Health Care
University of Utah
Salt Lake City, UT

HUNTER, Lawrence, Ph.D.
Associate Professor of Pharmacology
University of Colorado Health Sciences Center
Aurora, CO

JENKINS, Carol G., M.L.S.
Director, Health Sciences Library
University of North Carolina
Chapel Hill, NC

KAZIC, Toni, Ph.D.
Associate Professor of Computer Engineering
University of Missouri-Columbia
Columbia, MO

KOHANE, Isaac S., M.D., Ph.D.
Associate Professor
Department of Medicine
Children’s Hospital
Boston, MA

McKNIGHT, Michelynn, Ph.D.
Assistant Professor
School of Library and Information Science
Louisiana State University
Baton Rouge, LA

OGUNYEMI, Omolola I., Ph.D.
Research Associate
Department of Radiology
Brigham and Women’s Hospital
Boston, MA

PRATT, Wanda, Ph.D.
Assistant Professor
Department of Biomedical & Health Informatics
University of Washington School of Medicine
Seattle, WA

SILVERSTEIN, Jonathan C., M.D.
Assistant Professor of Surgery
University of Chicago
Chicago, IL

SPACKMAN, Kent A., M.D., Ph.D.
Professor of Pathology
Oregon Health and Science University
Portland, OR

TAIRA, Ricky K., Ph.D.
Associate Professor, Dept. of Radiology
University of California
Los Angeles, CA

TANJI, Virginia M.
Library Resource Center
School of Medicine
University of Hawaii at Manoa
Honolulu, HI

TEMPLETON, Etheldra, M.L.S.
Executive Director
Library & Information Systems
Philadelphia College of Osteopathic Medicine
Philadelphia, PA

WONG, Stephen T.C., Ph.D.
Assistant Professor
Department of Radiology and Neurology
University of California, San Francisco
San Francisco, CA

YOKOTE, Gail A.
Associate University Librarian
Peter J. Shields Library
University of California
Davis, CA

ZHOU, Z. Hong, Ph.D.
Associate Professor of Pathology
University of Texas Health Science Center – Medical School
Houston, TX

APPENDIX 6: LITERATURE SELECTION TECHNICAL REVIEW COMMITTEE

The Literature Selection Technical Review Committee meets three times a year to select journals for indexing in Index Medicus and MEDLINE.

Members:

SHEPRO, David, Ph.D. (chair)
Professor, Depts. of Biology and Surgery
Boston University
Boston, MA

BRANDT, Cynthia A., M.D., Ph.D.
Assistant Professor
Center for Medical Informatics
Yale University
New Haven, CT

CHEN, Jinkun, DDS, Ph.D.
Professor of General Dentistry
Director, Oral Biology Division
Tufts University School of Dental Medicine
Boston, MA

DELCLOS, George L., M.D.
Associate Professor of Environmental & Occupational Health
University of Texas Health Science Center
Houston, TX

DOUGLAS, Janice E., M.D.
Professor of Medicine, Physiology & Physics
Case Western Reserve University
Cleveland, OH

FREY, John J., M.D.
Professor and Chair
Department of Family Medicine
University of Wisconsin
Madison, WI

KAPLAN, Jerry, Ph.D.
Professor of Pathology
University of Utah School of Medicine
Salt Lake City, UT

MANNING, Phil, M.D.
Professor of Medicine Emeritus (University of Southern California)
Corona del Mar, CA

MCCLURE, Lucretia W., M.A.
Special Assistant to the Director
Countway Library of Medicine
Harvard University
Boston, MA

SHARPS, Phyllis W., Ph.D.
Associate Professor
School of Nursing
Johns Hopkins University
Baltimore, MD

SIEGEL, Vivian, Ph.D.
Editor, Cell
Cell Press
Cambridge, MA

SOEHNER, Catherine B., M.L.S.
Head, Science & Engineering Library
University of California
Santa Cruz, CA

STERNBERG, Esther M., M.D.
Director, Integrative Neural Immune Program
National Institute of Mental Health
Bethesda, MD

TOM-ORME, Lillian, Ph.D.
Research Assistant Professor
Dept. of Family and Preventive Medicine
University of Utah
Salt Lake City, UT

WEISSMAN, Norman, Ph.D.
Professor, Health Services Administration
University of Alabama
Birmingham, AL

APPENDIX 7: PUBMED CENTRAL NATIONAL ADVISORY COMMITTEE

The PubMed Central National Advisory Committee meets twice a year to review and make recommendations about the information resource, PubMed Central.

WILLIAMS, James F. (chair)
Dean of Libraries
University of Colorado
Boulder, CO

DELAMOTHE, Anthony P., M.D.
Editor, British Medical Journal
London, England

EISEN, Michael B.
Genome Sciences
Lawrence Berkeley National Laboratory
University of California
Berkeley, CA

JOHNSON, Richard K.
Enterprise Director
Scholarly Publishing & Academic Resources Coalition
Washington, D.C.

JOSEPH, Heather D., M.A.
President and CEO
BioOne
Washington, D.C.

KAPLAN, Samuel, Ph.D.
Professor and Chair
Microbiology and Molecular Genetics
University of Texas Health Science Ctr.
Houston Medical School
Houston, TX

KAUFMAN, Paula T., M.B.A.
University Librarian
University of Illinois at Urbana-Champaign
Urbana, IL

KHOSLA, Chaitan S., Ph.D.
Prof. of Chemistry & Chemical Engineering
Stanford University
Stanford, CA

KIRSCHNER, Marc W., Ph.D.
Professor and Chair
Department of Cell Biology
Harvard Medical School
Boston, MA

LAPPIN, Debra R., J.D.
Consultant
Princeton Partners Ltd.
Englewood, CO

ROEHR, Bob, B.A.
Writer
Washington, D.C.

RUBIN, Gerald M., Ph.D.
Investigator
Howard Hughes Medical Institute
Chevy Chase, MD

THOMAS, Sarah E., Ph.D.
Carl A. Kroch University Librarian
Cornell University
Ithaca, NY

VARKI, Ajit P., M.D.
Professor of Cellular Biology & Molecular Medicine
University of California
San Diego, CA

WATSON, Linda A.
Director, Claude Moore Health Science Library
University of Virginia
Charlottesville, VA

APPENDIX 8: ORGANIZATIONAL ACRONYMS AND INITIALISMS USED IN THIS REPORT

AAHSL  Association of Academic Health Sciences Libraries
ACP  American College of Physicians
ACSI  American Customer Satisfaction Index
ACTIS  AIDS Clinical Trials Information Service
AHCPR  Agency for Health Care Policy and Research
AHRQ  Agency for Healthcare Research and Quality
ALTBIB  Alternatives to Animal Testing
AMPA  American Medical Publishers Association
AMWA  American Medical Women’s Association
APDB  Audiovisual Program Development Branch
ARL  Association of Research Libraries
ATIS  HIV/AIDS Treatment Information Service
ATSDR  Agency for Toxic Substances and Disease Registry
BISTI  Biomedical Information Science and Technology Initiative
BLAST  Basic Local Alignment Search Tool
BLIRC  Biomedical Library and Informatics Review Committee
BOR  Board of Regents
BSD  Bibliographic Services Division
CBIR  Content-Based Image Retrieval
CCB  Configuration Control Board
CCRIS  Chemical Carcinogenesis Research Information System
CDC  Centers for Disease Control and Prevention
CDD  Conserved Domain Database
CEB  Communications Engineering Branch
CES  (NIH) Central Email System
CGAP  Cancer Genome Anatomy Project
CgSB  Cognitive Science Branch
ChemIDplus  Chemical Identification File
CIT  Center for Information Technology
CPT  Current Procedural Terminology
CRID  Regional Disaster Information Center for Latin America and the Caribbean
CSB  Computer Science Branch
DART  Developmental and Reproductive Toxicology
DDBJ  DNA Data Bank of Japan
DCMS  Data Creation and Maintenance Systems
DHHS  Department of Health and Human Services
DiD  Defense-in-Depth
DIRLINE  Directory of Information Resources Online
DTD  Document Type Definition
EBI  European Bioinformatics Institute
EEO  Equal Employment Opportunity
EFTS  Electronic Funds Transfer Service
EMBL  European Molecular Biology Laboratory
EMIC  Environmental Mutagen Information Center
EnHIOP  Environmental Health Information Outreach Panel
EP  Extramural Programs
EPA  Environmental Protection Agency
EST  Expressed Sequence Tag
ETICBACK  Environmental Teratology Information Center backfile
FDA  Food and Drug Administration
FIC  Fogarty International Center
FNLM  Friends of the National Library of Medicine
GEO  Gene Expression Omnibus
GPRA  Government Performance and Results Act
GSA  General Services Administration
GSS  Genome Survey Sequences
GUI  Graphical User Interface
HapMap  International Haplotype Map Project
HBCU  Historically Black Colleges and Universities
HHS  Health and Human Services
HIPAA  Health Insurance Portability and Accountability Act
HMD  History of Medicine Division
HSDB  Hazardous Substances Data Bank
HPCC  High Performance Computing and Communications
HSRProj  Health Services Research Projects
HSRR  Health Services and Sciences Research Resources
HSTAT  Health Services and Technology Assessment Text
IADL  Internet Access to Digital Libraries
IAIMS  Integrated Advanced Information Management Systems
ICs  Institutes and Centers (of NIH)
ICTV  International Committee on Taxonomy of Viruses
ILL  Interlibrary Loan
ILS  Integrated Library System
INSD  International Nucleotide Sequence Database Collaborators
IRIS  Integrated Risk Information System
IT  Information Technology
ITER  International Toxicity Estimates for Risk
ITK  Insight Toolkit
ITP  Informatics Training Program
JD  Journal Descriptor
LAN  Local Area Network
LHC  Lister Hill Center
LHNCBC  Lister Hill National Center for Biomedical Communications
LO  Library Operations
LOINC  Logical Observations: Identifiers, Names, Codes
LSTRC  Literature Selection Technical Review Committee
MEDLARS  Medical Literature Analysis and Retrieval System
MEEC  Maryland Education Enterprise Consortium
MeSH  Medical Subject Headings
MGC  Mammalian Gene Collection
MIM  Multilateral Initiative on Malaria
MIRS  Medical Information Retrieval System
MLA  Medical Library Association
MLAA  Medical Library Assistance Act
MMDB  Molecular Modeling DataBase
MMS  MEDLARS Management Section
MMTx  MetaMap Technology Transfer
MTI  Medical Text Indexer
MTMS  MeSH Translation Management System
NCBC  National Centers for Biomedical Computing
NCBI  National Center for Biotechnology Information
NCCS  NIH Consolidated Collocation Site
NCI  National Cancer Institute
NCRR  National Center for Research Resources
NCVHS  National Committee on Vital and Health Statistics
NHANES  National Health and Nutrition Examination Surveys
NHGRI  National Human Genome Research Institute
NHII  National Health Information Infrastructure
NHLBI  National Heart, Lung, and Blood Institute
NIA  National Institute on Aging
NIAID  National Institute of Allergy and Infectious Diseases
NIBIB  National Institute of Biomedical Imaging and Bioengineering
NICHSR  National Information Center on Health Services Research and Health Care Technology
NIEHS  National Institute of Environmental Health Sciences
NIGMS  National Institute of General Medical Sciences
NIH  National Institutes of Health
NIOSH  National Institute for Occupational Safety and Health
NIST  National Institute of Standards and Technology
NLM  National Library of Medicine
NN/LM  National Network of Libraries of Medicine
NNO  National Network Office
NOSC  Network Operations and Security Center
NRCBL  National Reference Center for Bioethics Literature
NSF  National Science Foundation
NTCC  National Online Training Center and Clearinghouse
OAM  Office of Administrative Management
OCCS  Office of Computer and Communications Systems
OCPL  Office of Communications and Public Liaison
OCR  Optical Character Recognition
OD  Office of the Director
OHIPD  Office of Health Information Programs Development
OMB  Office of Management and Budget
OMIM  Online Mendelian Inheritance in Man (database)
OSIRIS  Open Source Independent Review and Interpretation System
PAHO  Pan American Health Organization
PCA  Personal Computer Advisory Committee
PDA  Personal Digital Assistant
PDB  Protein Data Bank
PDF  Portable Document Format
PHS  Public Health Service
PICO  Patient/Problem, Intervention, Comparison, and Outcome
PLA  Public Library Association
PMC  PubMed Central
PRS  Protocol Registration System
PSD  Public Services Division
QTL  Quantitative Trait Loci
RefSeq  Reference Sequence (database)
RML  Regional Medical Library
RNAi  RNA interference
RTECS  Registry of Toxic Effects of Chemical Substances
SAGE  Serial Analysis of Gene Expression
SBIR  Small Business Innovation Research
SEF  Serials Extract File
SEP  Special Emphasis Panel
SIS  Specialized Information Services
SNOMED CT  Systematized Nomenclature of Medicine Clinical Terms
SPER  System for the Preservation of Electronic Resources
SSEUS  SIS SQL Entry Update System
SSI  Scalable Information Infrastructure
STB  Systems Technology Branch
STTR  Small Business Technology Transfer Research
STS  Sequence Tagged Site
TEHIP  Toxicology and Environmental Health Information Program
TERA  Toxicology Excellence for Risk Assessment
TILE  Text to Image Linking Engine
TIOP  Toxicology Information Outreach Project
TOXLINE  Toxicology Information Online
TOXNET  Toxicology Data Network
TPA  Third Party Annotation (database)
TRI  Toxics Release Inventory
TSD  Technical Services Division
TTP  Turning the Pages
UMLS  Unified Medical Language System
UPS  Uninterrupted Power Supply
VAST  Vector Alignment Search Tool
VHP  Visible Human Project
Web-STOC  Web-Services Technology Operations Center
WGS  Whole Genome Shotgun
WISER  Wireless Information System for Emergency Responders

Further information about the programs described in this administrative report is available from the:

Office of Communications and Public Liaison
National Library of Medicine
8600 Rockville Pike
Bethesda, MD 20894
301-496-6308
E-mail: [email protected]
Web: www.nlm.nih.gov

Cover: Information Rx: “Health Information Prescription” program. A joint project with the American College of Physicians to encourage the use of MedlinePlus by patients.
