Chemical Information BULLETIN

Spring 2011 Volume 63 No. 1

Anaheim Chemical Information Bulletin Vol. 63(1) Spring 2011

Chemical Information Bulletin
A Publication of the Division of Chemical Information of the ACS
Volume 63 No. 1 (Spring) 2011 (updated March 24, 2011)
David Martinsen, Editor
American Chemical Society
[email protected]

In This Issue

Message from the Chair
Letter from the Editor
CINF Sponsorship
CINF at the ACS National Meeting
  Technical Program Highlights
  Committee Meetings and Social Events
  CINF Symposia
  Technical Program Listing (short)
  Technical Program Listing with abstracts
Committee Report
  Communications & Publications
Awards – Calls for nominations
Interviews
  James L. Mullins
  Michael Gordin
Reviews
Product Announcements
CINF Officers

Cover design by Mark Luchetti ISSN: 0364‐1910 Chemical Information Bulletin, ©Copyright 2011 by the Division of Chemical Information of the American Chemical Society.


Message from the Chair

It is a great honor to begin my tenure as Chair of the ACS Division of Chemical Information in 2011, the International Year of Chemistry. Yet while there is great excitement that comes with this momentous time for chemists worldwide, the continued malaise of the global economy engenders both uncertainty and hesitancy. It is in such uncertain times that participation in CINF makes all the more sense, whether it is for networking to find a new job or honing one's skills to keep a current job.

According to the U.S. Bureau of Labor Statistics, 70% of jobs are found through networking. Networking with colleagues in CINF and related ACS Divisions such as the Division of Computers in Chemistry (COMP) is a primary benefit of membership in CINF and of attending CINF events. The strong scientific programs that have been assembled by current CINF Program Chair Rachelle Bienstock and her predecessor Rajarshi Guha (CINF Chair Elect) provide another powerful reason to attend CINF meetings. Another significant reason to attend CINF events is that its members (present company excluded) are simply delightful human beings: dedicated, motivated, intelligent, charming, unpretentious, and interesting people.

If you have not been involved in CINF events, I urge you to get involved. We start out with the Long Range Planning Meeting and breakfast on Saturday, March 26th, followed immediately by committee meetings until noon, a luncheon at noon, and the Executive Committee Meeting in the afternoon. There is a Welcoming Reception on Sunday, Harry’s Party on Monday, and the CINF Luncheon on Tuesday, followed by the CINF Reception Tuesday evening. Everyone is welcome to attend any or all of these events (except for the Executive Committee Meeting).

Finally, I must thank all the recent CINF Chairs for their tremendous help and support: Dave Martinsen, Svetlana Korolev, and Carmen Nitsche are simply amazing. That said, I owe the biggest debt to Carmen, CINF Past Chair. Her dedication to CINF is truly inspiring, and her gracious help and guidance over the past year have been greatly appreciated.

I look forward to seeing you in Anaheim!

Warmest regards,

Gregory M. Banik, Ph.D., Chair ACS Division of Chemical Information (CINF)


Letter from the Editor

I would simply like to thank all those who made contributions to this issue of the Chemical Information Bulletin. Svetla Baykoucheva provided two thought‐provoking interviews and Bob Buntrock contributed several book reviews. Our new division chair, Greg Banik, shared his thoughts on the year (welcome, Greg!). Also included are highlights from the technical program from our program chair, Rachelle Bienstock, as well as the technical program itself. Be sure to check out the product announcements from our sponsors, as well as the information about submissions for awards. Thanks to Mark Luchetti for the cover page design. Finally, I would also like to recognize the efforts of our webmaster, Danielle Dennie, who designed the templates for the e‐CIB and flowed all of the content seamlessly into the site (well, at least it looked seamless to me).

Dave Martinsen Guest Editor

CINF Sponsors Spring 2011 March 2011

The American Chemical Society Division of Chemical Information (CINF) is very fortunate to receive generous financial support from our sponsors. Their support helps maintain the high quality of the Division’s programming, promotes communication between members at social functions at the ACS Spring 2011 National Meeting in Anaheim, CA, and supports other divisional activities during the year, including scholarships for graduate students in Chemical Information.

The Division gratefully acknowledges contributions from the following sponsors:

Gold: FIZ CHEMIE Berlin

Silver: Bio‐Rad Laboratories, InfoChem, RSC

Bronze: Accelrys, ACS Publications, CambridgeSoft, Thieme Publishers

Opportunities are available to sponsor Division of Chemical Information events, speakers, and material. Our sponsors are acknowledged on the CINF web site, in the Chemical Information Bulletin, on printed meeting materials, and at any events for which we use your contribution.

Please feel free to contact me if you would like more information about supporting CINF.

Graham Douglas Chair, Fundraising Committee Email: [email protected] Tel: 510‐407‐0769

The ACS CINF Division is a non‐profit tax‐exempt organization with taxpayer ID no. 52‐6054220.


Technical Program Highlights

ACS Chemical Information Division (CINF) Spring, 2011 ACS National Meeting Anaheim, CA (March 27‐31)

Dr. Martin Walker has organized a very special symposium for this Spring Anaheim meeting in honor of his mentor, Dr. James Hendrickson, "Fifty Years of Computers in Organic Chemistry: A Symposium in Honor of James B. Hendrickson". Dr. Hendrickson, Professor Emeritus of Chemistry, Brandeis University, was a pioneer in the field of computer‐aided organic synthesis design and was one of the early visionaries in this field. He designed the programs SYNGEN and WebReactions, and much current work in the field builds on the early work of his research group. Many of the successful students who trained with Dr. Hendrickson over his long career, or those whose work was built on ideas and concepts originating from Dr. Hendrickson's work, will be speaking in this symposium, including Dr. Paul A. Wender, Bergstrom Professor in Chemistry, Stanford University; Dr. Phil S. Baran, Professor, Scripps Research Institute; Dr. Valentina Eigner‐Pitto, InfoChem GmbH; and Dr. Orr Ravitz, SimBioSys Inc.

CINF was ahead of the Golden Globe awards this year when Dr. Steve Bachrach and Dr. Henry Rzepa planned our symposium "Internet and Chemistry: Social Networking". Use of the internet is pervasive and this symposium will focus on how it can be effectively used to promote the exchange of chemical ideas and chemical information. The symposium will feature presentations by Dr. Peter Murray‐Rust on "Collaborative Agile Internet Projects: The Green Chain Reaction" and Dr. Antony Williams, the developer of the successful and highly useful ChemSpider, as well as presentations on the CAS Registry by Dr. Roger Schenck, "Publishing and Consuming in a Digital, Device Agnostic World" by Dr. David Martinsen (ACS) and OpenTox by Dr. D.A. Gallagher.

Dr. X. Simon Wang has organized a symposium on "Natural Products and Drug Discovery" which will feature some interesting talks on screening traditional Indonesian herbs by Dr. D. Barlow, and identifying antiviral leads from nature for common cold and flu treatment by Dr. J. M. Rollinger, as well as presentations on cheminformatic analysis of natural product data by Dr. Jose Medina‐Franco’s group and data mining by Drs. Baker and Fourches with Dr. Alex Tropsha. Discoveries in the area of natural product cancer treatment will be presented by Dr. Lawrence Hurley, and natural product 5HT‐1A inhibitors will be discussed by Dr. X. Simon Wang. A discussion on patenting traditional medicines from natural products will be presented as well by Drs. Zabilski and Schenck.

Drs. Maciej Haranczyk and Jose Medina‐Franco have organized "Integration of Combinatorial Chemistry with Cheminformatics: Current Trends and Future Directions in Drug Discovery and Material Science". This symposium features presentations by Dr. Dimitris Agrafiotis and Dr. W. Zheng on combinatorial library design and presentations on high throughput screening by Dr. Peter Shenkin. There will also be presentations on managing combinatorial libraries by Dr. Carsten Detering and fragment based design by Dr. Meireles.

We are thankful to the CSA Trust for cosponsoring, and Drs. Irina Sens and Peter Rusch for organizing “Open Data, Open Science, Open Knowledge” featuring presentations on visual search in scientific research data by Dr. Sens, curated scientific data resources by Dr. C.R. Groom and Open Data by Peter Murray‐Rust.

Leah Solla, Robert McFarland, and Norah Xiao have organized "Data Archiving, E‐Science and Primary Data", which features presentations on Librarian 2.0 data management (Blanton‐Kent), PubChem (S. Swamidass), hosting a compound‐centric community resource for chemistry data by Tony Williams and data curation profiles by Jeremy Garritano.

Dr. Guenter Grethe has organized the selection and review of our student award posters for the CINF Scholarship for Scientific Excellence (sponsored by Accelrys), which will be presented on Sunday evening, March 27th.

We will also have some interesting presentations in our general papers session on Wednesday morning, March 30th, and a small CINF poster session as part of the Sci‐Mix session late on Monday evening.

The papers to be presented are too numerous to mention each by title and author, so I have only tried to give you a flavor of the rich variety of topics and material. The Spring 2011 Anaheim meeting is the first that I have organized as CINF program chairperson. I want to thank Dr. Rajarshi Guha, past program chair, for all his advice and assistance. With the capable assistance of all the symposium chairpersons, I think we have put together an interesting program which will cover the diverse interests of the CINF membership. Please come and experience it firsthand!

Rachelle Bienstock Chair, Program Committee


CINF Committee Meetings and Social Events Spring, 2011 ACS National Meeting, Anaheim, CA (March 27‐31)

Saturday, March 26

7:30 AM – 9:00 AM: Long Range Planning and Breakfast Meeting, 201 A, Anaheim Convention Center
9:00 AM – 12:00 PM: Program Committee Meeting, 201 B, Anaheim Convention Center
9:00 AM – 10:30 AM: Membership Committee Meeting, 202 B, Anaheim Convention Center
11:00 AM – 12:00 PM: Finance Committee Meeting, 202 A, Anaheim Convention Center
9:00 AM – 12:00 PM: Communications & Publications Committee, 203 A, Anaheim Convention Center
10:00 AM – 11:00 AM: Fundraising Committee Meeting, 202 A, Anaheim Convention Center
10:30 AM – 12:00 PM: Careers Committee Meeting, 202 B, Anaheim Convention Center
9:00 AM – 12:00 PM: Education Committee Meeting, 203 B, Anaheim Convention Center
9:00 AM – 10:00 AM: Awards Committee Meeting, 202 A, Anaheim Convention Center
12:00 PM – 1:00 PM: CINF Functionary Luncheon, 201 A, Anaheim Convention Center
1:00 PM – 5:30 PM: Executive Committee Meeting, 201 B, Anaheim Convention Center

Sunday, March 27

12:00 PM – 2:00 PM: CINF‐CSA Trust Group Meeting, Orangewood 3, Clarion Hotel Anaheim Resort
6:30 PM – 8:30 PM: CINF Welcoming Reception & Scholarships for Scientific Excellence Posters, Ballroom A, Anaheim Convention Center
Reception co‐sponsored by ACS Publications, Bio‐Rad Laboratories, CambridgeSoft, InfoChem & Thieme Chemistry
Scholarships for Scientific Excellence sponsored by Accelrys

Monday, March 28

4:20 – 4:30 PM: CINF Open Meeting, Room 204 A, Anaheim Convention Center
4:30 – 5:30 PM: ACS Publications/CAS Open Meeting, Room 204 A, Anaheim Convention Center
5:30 – 8:00 PM: Harry’s Party, 2nd Floor Suite, Sheraton Park Hotel at the Anaheim Resort
Sponsored exclusively by FIZ CHEMIE Berlin
* Use ACS Shuttle #2

Tuesday, March 29

12:00 PM – 1:30 PM: CINF Luncheon, Anaheim Marriott, Platinum Room 7
Sponsored exclusively by RSC Publishing
* Ticketed event
6:30 PM – 8:30 PM: CINF Reception, Room 204 B, Anaheim Convention Center
Reception hosted by the ACS Division of Chemical Information
* Cash bar

Wednesday, March 30

12:00 PM – 5:00 PM: CINF‐CIC Collaborative Working Group, 204 A, Anaheim Convention Center


CINF Symposia

ACS Chemical Information Division (CINF) Spring, 2011 ACS National Meeting Anaheim, CA (March 27‐31)

R. Bienstock, Program Chair

S M T W T  Session title
A . . . .  50 Years of Computers in Organic Chemistry: Symposium in Honor of James B. Hendrickson ‐ Cosponsored by ORGN
P . . . .  Integration of Combinatorial Chemistry with Cheminformatics: Current Trends and Future Directions in Drug Discovery and Material Science
P . . . .  50 Years of Computers in Organic Chemistry: Symposium in Honor of James B. Hendrickson ‐ Cosponsored by ORGN
E . . . .  CINF Scholarship for Scientific Excellence ‐ Financially supported by Accelrys
. A . . .  Natural Products and Drug Discovery: Cheminformatics and Computational Chemistry
. A . . .  Open Data, Open Science, Open Knowledge ‐ Financially supported by Chemical Structure Association Trust
. P . . .  Natural Products and Drug Discovery: Cheminformatics and Computational Chemistry
. P . . .  Data Archiving, E‐Science, and Primary Data
. E . . .  Sci‐Mix
. . A . .  Internet and Chemistry: Social Networking ‐ Cosponsored by YCC
. . P . .  Internet and Chemistry: Social Networking ‐ Cosponsored by YCC
. . . A .  Internet and Chemistry: Social Networking ‐ Cosponsored by YCC
. . . A .  General Papers
. . . P .  Internet and Chemistry: Social Networking ‐ Cosponsored by YCC

Legend: A = AM, P = PM, E = Evening

See also: Complete Program


Technical Program Listing

ACS Chemical Information Division (CINF) Spring, 2011 ACS National Meeting Anaheim, CA (March 27‐31) R. Bienstock, Program Chair

SUNDAY MORNING
Section A
Anaheim Convention Center, 213 C
50 Years of Computers in Organic Chemistry: Symposium in Honor of James B. Hendrickson ‐ Cosponsored by ORGN
M. Walker, Organizer, Presiding
9:00 Introductory Remarks.
9:10 1 James Hendrickson: A life‐long quest for systematizing organic synthesis. G. Grethe
10:00 2 Reaction classification, an enduring success story. V. Eigner‐Pitto, H. Kraut, H. Saller, H. Matuszczyk, P. Loew, G. Grethe
10:30 Intermission.
10:40 3 Back to the future of synthesis planning: How new technology and new resources revitalize the vision of computer aided synthesis design. J. Law, M. Mirzazadeh, A. P. Cook, O. Ravitz, P. A. Johnson, A. Simon

SUNDAY AFTERNOON
Section B
Anaheim Convention Center, 211 B
Integration of Combinatorial Chemistry with Cheminformatics: Current Trends and Future Directions in Drug Discovery and Material Science
J. Medina‐Franco, Organizer
M. Haranczyk, Organizer, Presiding
1:00 Introductory Remarks.
1:05 4 Experimental design for high throughput materials development. J. N. Cawse
1:30 5 High‐throughput strategies for synthesis and characterization of metal‐organic frameworks for CO2 capture. K. Sumida
1:55 6 Combinatorial library design revisited: Finding new uses for old tools. D. K. Agrafiotis, V. S. Lobanov
2:20 7 How to screen 10^14 cores per second. P. S. Shenkin, K. P. Lorton
2:45 Intermission.
3:00 8 Synergies of combinatorial chemistry and fragment‐based drug design for efficient generation of focused virtual libraries. L. Meireles, G. Mustata, I. Bahar
3:25 9 Chemical library design: From diversity, similarity, and multicriterion optimization to a versatile cheminformatics content management system (CCMS). W. Zheng
3:50 10 Six years of collaborative drug discovery in the cloud. B. Bunin, S. Ekins, M. Hohman, K. Gregory, B. Prom, S. Ernst
4:15 11 Managing giant combinatorial chemistry spaces in silico. C. Detering, H. Claussen, M. Lilienthal, C. Lemmen

SUNDAY AFTERNOON
Section A
Anaheim Convention Center, 213 C
50 Years of Computers in Organic Chemistry: Symposium in Honor of James B. Hendrickson ‐ Cosponsored by ORGN
M. Walker, Organizer, Presiding
1:30 12 Toward the ideal synthesis: The role of step economy and function oriented synthesis in first‐in‐class approaches to HIV eradication, overcoming cancer resistance and treating Alzheimer's disease. P. A. Wender
2:20 13 Aiming for the ideal synthesis. P. S. Baran
3:10 Final introduction.
3:25 14 Half a century of computers in chemistry. J. B. Hendrickson

SUNDAY EVENING
Section A
Anaheim Convention Center, Ballroom A
6:30 PM – 8:30 PM
CINF Scholarship for Scientific Excellence
Financially supported by Accelrys
G. Grethe, Organizer
6:30 ‐ 8:30
15 Exhaustive protocol with SAR‐based pose selection. F. Klepsch, G. F. Ecker
16 Comparison of weighted and unweighted consensus approaches in QSAR/QSPR. D. Zhuang, A. Lee, R. Fraczkiewicz, M. Waldman, B. Clark, W. Woltosz
17 When is chemical similarity significant? The statistical distribution of chemical similarity scores and its extreme values. P. Baldi, R. J. Nasr
18 Reaction prediction as ranking molecular orbital interactions. M. A. Kayala, C. A. Azencott, J. H. Chen, P. Baldi
19 Re‐examining the tubulin‐binding conformation of antitumor epothilones using QSAR and crystallographic refinement. S. A. Johnson, A. J. Smith, J. P. Snyder, K. N. Houk
20 Efficient core structure searches using various fingerprinting methodologies: Advantages, particularities and pitfalls. S. M. Furrer, D. J. Wild
21 DockingDB: A cyberinfrastructure for computer‐aided drug design based on ChemDB. P. M. Rigor

MONDAY MORNING
Section A
Anaheim Convention Center, 207 C
Natural Products and Drug Discovery: Cheminformatics and Computational Chemistry
R. Bienstock, Organizer
X. Wang, Organizer, Presiding
8:30 Introductory Remarks.
8:35 22 Protein Fold Topology: Will it aid drug discovery or is it the reason natural products have drug properties? R. J. Quinn, E. Kellenberger
9:05 23 Screening of herbs used in traditional Indonesian medicine for inhibitors of aldose reductase. D. Barlow, S. Naeem, P. Hylands
9:35 24 Common cold and flu: Computational strategies for the identification of antiviral leads from nature. J. M. Rollinger, J. Kirchmair, U. Grienke, D. Schuster, K. R. Liedl, M. Schmidtke
10:05 Intermission.
10:20 25 Chemoinformatic analysis of natural products: Towards the discovery of DNA methyltransferase inhibitors of natural origin. J. Medina‐Franco, F. López‐Vallejo, R. Guha, A. Bender, D. Kuck, F. Lyko
10:50 26 Lessons from covalent inhibitor modeling. O. Eidam, S. Bonazzi, S. Guttinger, J. Wach, I. Zemp, U. Kutay, K. Gademann

MONDAY MORNING
Section B
Anaheim Convention Center, 202 A
Open Data, Open Science, Open Knowledge ‐ Financially supported by Chemical Structure Association Trust
P. Rusch, Organizer
I. Sens, Organizer, Presiding
9:00 Introductory Remarks.
9:10 27 Open Data and the Panton Principles. P. Murray‐Rust
9:35 28 Making priors a priority. M. D. Segall, A. Chadwick
10:00 Intermission.
10:10 29 Ensuring sustainability of a comprehensive and highly curated scientific data resource. I. J. Bruno, C. R. Groom
10:35 30 Visual search in scientific research data. I. Sens, O. Koepler

MONDAY AFTERNOON
Section A
Anaheim Convention Center, 204 C
Natural Products and Drug Discovery: Cheminformatics and Computational Chemistry
X. Wang, Organizer
R. Bienstock, Presiding
1:30 31 Specific targeting of the G‐quadruplex in the c‐Myc promoter with ellipticine. T. A. Brooks, V. Gokhale, R. Brown, L. H. Hurley
2:00 32 Exploring natural products for drug discovery by mining biomedical information resources. N. Baker, N. Rice, D. Fourches, E. Muratov, A. Tropsha
2:30 33 In silico strategies in natural product research to combat inflammation and lifestyle diseases: Identification of FXR‐inducing triterpenes from Ganoderma lucidum. U. Grienke, J. Mihály‐Bison, D. Schuster, D. Guo, B. R. Binder, G. Wolber, H. Stuppner, J. M. Rollinger
3:00 Intermission.
3:15 34 Discovery of natural product‐derived 5HT‐1A receptor binders by QSAR modeling of known inhibitors, virtual screening and experimental validation. X. S. Wang
3:45 35 Traditional medicine lead to enhanced drug discovery derived from natural products. J. Zabilski, R. Schenck

MONDAY AFTERNOON
Section B
Anaheim Convention Center, 204 A
Data Archiving, E‐Science, and Primary Data
R. McFarland, N. Xiao, Organizers
L. Solla, Organizer, Presiding
1:30 Introductory Remarks.
1:40 36 Librarian2.0: Synthesizing data management and subject expertise. B. Blanton‐Kent, S. Lake, A. Sallans
2:05 37 Anatomy of a PubChem project. S. Swamidass, B. Calhoun, M. Browning
2:30 38 Evolution of the University of Minnesota Libraries' approach to e‐scholarship. M. Lafferty, L. Johnston
2:55 Intermission.
3:05 39 Hosting a compound centric community resource for chemistry data. A. J. Williams, V. Tkachenko, R. Kidd
3:30 40 Library data services in the social sciences: Lessons for science? K. Peter
3:55 41 Using Data Curation Profiles (DCPs) as a means of raising data management awareness. J. R. Garritano
4:20 CINF Open Meeting
4:30 Open Meeting with the Joint Board‐Council Committees on Publications and Chemical Abstracts Service

MONDAY EVENING
Section A
Anaheim Convention Center, Hall B
Sci‐Mix
R. Bienstock, Organizer
8:00 ‐ 10:00
16. See previous listings.
42 Synthesis of 3‐halo‐2‐butanones. J. Porter
43 Visualizing similarity. K. Boda

TUESDAY MORNING
Section A
Anaheim Convention Center, 204 A
Internet and Chemistry: Social Networking ‐ Cosponsored by YCC
H. Rzepa, Organizer
S. Bachrach, Organizer, Presiding
8:25 Introductory Remarks.
8:30 44 Collaborative agile Internet projects: The Green Chain Reaction. P. Murray‐Rust, S. E. Adams, L. Hawizy, D. M. Jessop
9:10 45 Re‐imagining scientific communication for the 21st century: Is chemistry low hanging fruit or the worst‐case scenario? C. Neylon
9:50 46 Quixote: An Internet project to build a distributed Open Knowledgebase for . P. Murray‐Rust, J. Thomas, P. Echenique, J. Estrada, M. D. Hanwell, S. E. Adams, W. Phadungsukanan, L. Westerhoff
10:30 Intermission.
10:40 47 Catching the mobile wave. S. M. Muskal
11:20 48 Chemistry in your pocket: Shrinking cheminformatics applications for mobile devices. A. M. Clark

CINF LUNCHEON
Anaheim Marriott, Platinum Room 7
12:00 PM – 1:30 PM (Ticketed Event)

TUESDAY AFTERNOON
Section A
Anaheim Convention Center, 204 A
Internet and Chemistry: Social Networking ‐ Cosponsored by YCC
H. Rzepa, Organizer
S. Bachrach, Organizer, Presiding
1:30 49 chemicalize.org: Adding chemistry to Web pages and predicted data and links to structures. A. Allardyce, A. Stracz, D. Bonniot, F. Csizmadia
2:10 50 Using Campus Guides for leveraging Web 2.0 technologies and promoting the chemistry and life sciences information resources. S. Baykoucheva
2:50 Intermission.
3:00 51 How the web has weaved a web of interlinked chemistry data. A. J. Williams
3:40 52 What is the Internet doing to chemistry and our brains? S. Heller

WEDNESDAY MORNING
Section A
Anaheim Convention Center, 204 B
Internet and Chemistry: Social Networking ‐ Cosponsored by YCC
H. Rzepa, Organizer
S. Bachrach, Organizer, Presiding
8:30 53 Bridging the gap: Publishing and consuming the scientific literature in a digital, device‐agnostic world. D. P. Martinsen
9:10 54 in chemistry: Information wants to be free? J. Kuras, B. Vickery, D. Kahn
9:50 Intermission.
10:00 55 OpenTox: An open‐source web‐service platform for toxicity prediction. D. A. Gallagher, B. Hardy, S. Chawla
10:40 56 CAS Registry: Maintaining the gold standard for chemical substance information. R. Schenck, J. Zabilski
11:20 57 Evolution of the science journal and the chemical publication. H. S. Rzepa

WEDNESDAY MORNING
Section B
Anaheim Convention Center, 201C
General Papers
R. Bienstock, Organizer, Presiding
9:00 Introductory Remarks.
9:05 58 Collaborative QSAR analysis of Ames mutagenicity. E. Muratov, D. Fourches, A. Artemenko, V. Kuz'min, G. Zhao, A. Golbraikh, P. Polischuk, E. Varlamova, I. Baskin, V. Palyulin, N. Zefirov, L. Jiazhong, P. Gramatica, T. Martin, F. Hormozdiari, P. Dao, C. Sahinalp, A. Cherkasov, T. Oberg, R. Todeschini, V. Poroikov, A. Zaharov, A. Lagunin, D. Filimonov, A. Varnek, D. Horvath, G. Marcou, C. Muller, L. Xi, H. Liu, X. Yao, K. Hansen, T. Schroeter, K. Muller, I. Tetko, I. Sushko, S. Novotarskyi, N. Baker, J. Reed, J. Barnes, A. Tropsha
9:25 59 How (not) to build a toxicity model. A. C. Lee, R. Clark, M. Waldman, J. Chung, R. Fraczkiewicz, W. S. Woltosz
9:45 60 Metabolic site prediction using artificial neural network ensembles. M. Waldman, R. Fraczkiewicz, J. Zhang, R. D. Clark, W. S. Woltosz
10:05 61 Withdrawn.
10:25 Intermission.
10:35 62 Use and results of using an online chemistry laboratory package in a large general chemistry course. R. L. Nafshun
10:55 63 Reaction prediction as ranking molecular orbital interactions. M. A. Kayala, C. A. Azencott, J. H. Chen, P. Baldi

WEDNESDAY AFTERNOON
Section A
Anaheim Convention Center, 204 B
Internet and Chemistry: Social Networking ‐ Cosponsored by YCC
H. Rzepa, Organizer
S. Bachrach, Organizer, Presiding
1:40 64 Automated semantic data embargo and publication by the CLARION project. S. E. Adams, N. Day, J. Downing, B. Brooks, P. Murray‐Rust
2:20 65 Chemical eCommerce. K. Gubernator
3:00 Intermission.
3:10 66 Waiting on the Chemical Internet. S. M. Bachrach
3:50 67 Rapid dissemination of chemical information for people and machines using Open Notebook Science. J. Bradley, A. S. Lang


CINF Symposia with Abstracts

ACS Chemical Information Division (CINF) Spring, 2011 ACS National Meeting Anaheim, CA (March 27‐31)

R. Bienstock, Program Chair

CINF 1 project was the development of an electronic James Hendrickson: A life‐long quest for version of the printed series systematizing organic synthesis “ChemInform” published by FIZ CHEMIE Berlin. Then in 1989 InfoChem acquired an Guenter Grethe(1), [email protected], exclusive license to a reaction database 352 Channing Way, Alameda CA 94502‐7409, (SPRESI) of (initially) 2.3 million records. Since United States . (1) Self employed, 352 CA the reaction database management systems 94502‐7409, United States (REACCS and ORAC) commercially available at During his long academic tenure, James that time could not handle more than Hendrickson was interested in applying logic 500,000 records, InfoChem was forced to and systematic characterization of conceive a concept for the selection of and reactions to organic synthesis. Starting in meaningful subsets of SPRESI. Based on a high the early 70's, his work gradually evolved quality reaction center detection module, from a mathematical presentation of the InfoChem's sophisticated reaction type structural and functional features of classification application, “Classify”, remains molecules and their reactions to the unique to this day. This concept allowed the development of systematic signatures for generation of widely used reaction type organic reactions. In this presentation we will databases such as ChemReact (400,000 discuss the individual steps along the way reaction types) and ChemSynth (100,000 illustrated by examples. Some recent reaction types). Classify also enables reaction developments by other groups in the area of type searching, and clustering of reaction reaction classification will be mentioned databases, and, in particular, it is the only way of linking different reaction databases. 
CINF 2 The world's major vendors of chemical Reaction classification, an enduring success information have adopted this technology to story enhance the reaction retrieval capabilities of Valentina Eigner‐Pitto(1), [email protected], their products. More recent developments at Landsberger Str. 408, Munich Bavaria 81241, InfoChem have resulted in a processing tool Germany ; Hans Kraut(1); Heinz Saller(1); for detecting name reactions in any reaction Heinz Matuszczyk(1); Peter Loew(1); Guenter database, and the retrosynthesis tool Grethe(2). (1) InfoChem GmbH, Munich ICSYNTH, both of which are based on the 81241, Germany (2) None, United States company's earlier fundamental work. This talk will briefly present the background and Beginning in the late 1980s InfoChem started technology of these software modules and to develop a deep understanding of the their efficient use in the field of modern storage and handling of chemical structure reaction planning. and reaction information. The first major

14

Chemical Information Bulletin Vol. 63(1) Spring 2011

CINF 3 Cawse and Effect LLC, Pittsfield MA 01201, Back to the future of synthesis planning: United States How new technology and new resources High‐throughput methods of chemical revitalize the vision of computer aided experimentation present a challenge to synthesis design experimental planning. Experiments run in Orr Ravitz(1), [email protected], 135 arrays of dozens to hundreds require Queen[apos]s Plate Dr., Unit 520, Toronto rethinking of the classic methods of Design of Ontario M9W 6V1, Canada ; James Law(1); Experiments. This talk will review the Mahdi Mirzazadeh(1); Anthony P Cook(2); adaptation of classical methods and Peter A Johnson(2); Aniko Simon(1). (1) improvisation of new methods for high SimBioSys Inc., Toronto Ontario M9W 6V1, throughput systems. These methods are Canada (2) School of Chemistry, University of becoming more important as laboratories for Leeds, Leeds LS2 9JT, United Kingdom chemistry and materials science are being equipped with the robots and high‐speed Sophisticated systems like LHASA and analytical tools for the acceleration of SYNGEN were regarded in the late 1980's as a research. In particular, the use of these great promise to the field of organic methods for effective protection of a synthesis. Their intent, as Hendrickson stated, chemical will be discussed. was “not to replace art … but to show where real art lies”. Sparked by the introduction of CINF 5 retrosynthetic analysis, the newborn field of High‐throughput strategies for synthesis and computer aided synthesis design proved that characterization of metal‐organic chemical perception and synthetic thinking frameworks for CO2 capture can be formulated in an algorithmic fashion. Kenji Sumida(1)(2), [email protected], However, the vision of routine use of such 214 Lewis Hall, Berkeley CA 94720, United tools has not materialized, and research in States . (1) Department of Chemistry, that area came to a lull in the early 1990's. 
University of California, Berkeley, Berkeley CA The major obstacle was the difficulty of 94720‐1460, United States (2) Materials generating high quality and up‐to‐date Sciences Division, Lawrence Berkeley National databases of synthetic transforms. We show Laboratory, Berkeley CA 94720‐1460, United how our retrosynthetic analysis system, States ARChem, capitalizes on the advent of comprehensive reaction databases and the High‐throughput methodologies are a dramatic progress in computing capabilities tremendously versatile platform for the to automatically generate expansive synthetic discovery of next‐generation materials rule‐sets, which pave the way to (metal‐organic frameworks) for CO2 capture. representation and application of synthetic However, the considerable impact that the strategies. reaction conditions employed in the synthetic step can have on the material properties CINF 4 results in a large number of synthetic trials, Experimental design for high throughput which result in tremendous quantities of data materials development from powder X‐ray diffraction and gas James N Cawse(1), adsorption experiments. An ideal [email protected], 132 Kittredge computational support system in this regard Rd, Pittsfield MA 01201, United States . (1) would allow rapid, automated identification of the highest performance materials, and


provide feedback to the high‐throughput synthetic step, such that the preparation of a material may be more rigorously optimized, and new target materials that might show high CO2 capture performance can be identified. Here, we discuss our overall progress towards this goal, and present a number of examples in which the system has been employed to discover the optimal synthetic conditions for the preparation of new metal‐organic frameworks for CO2 capture.

CINF 6
Combinatorial library design revisited: Finding new uses for old tools
Dimitris K Agrafiotis(1), [email protected], Welsh & McKean Roads, Spring House PA 19477, United States ; Victor S Lobanov(1). (1) Informatics, Johnson & Johnson Pharmaceutical Research & Development, LLC, Spring House PA 19477, United States

In the 15 years since our first publication on diversity analysis and library design, the field of combinatorial chemistry has traversed the entire length of the hype curve, from the initial excitement, to the peak of inflated expectations, to the trough of disillusionment, and finally to the plateau of productivity. Along the way, many of the tools that were originally developed for analyzing massive virtual libraries were either forgotten or adapted to the realities of modern pharmaceutical research. While the need to mine massive combinatorial libraries is no longer there, the tools have found a new life in supporting and automating smaller parallel synthesis efforts in lead generation and lead optimization. In this talk, we review some of these earlier technologies and describe their adaptation and integration in today's discovery workflows.

CINF 7
How to screen 10^14 cores per second
Peter S Shenkin(1), [email protected], 120 W. 45th St. Suite 1700, New York NY 10036, United States ; K Patrick Lorton(1). (1) Schrodinger, New York NY 10036, United States

We describe Schrodinger's attachment‐based core‐hopping method and present results achieved using it. The method starts with a template compound in which core and side‐chains are identified. The core is replaced by new cores from a library while maintaining side‐chain positions as well as possible. No receptor is required, but if a docked pose is available, receptor interactions can be conserved. Several scores are computed. These include a synthesizability score as well as a score reflecting how well side‐chain positions are maintained. A combination of GPU processing, multithreading, and automatic linker addition leads to an overall screening rate in excess of 1.0e14 unique cores per second.

CINF 8
Synergies of combinatorial chemistry and fragment‐based drug design for efficient generation of focused virtual libraries
Lidio Meireles(1), [email protected], 3064 BST3, 3501 Fifth Ave, Pittsburgh PA 15260, United States ; Gabriela Mustata(1); Ivet Bahar(1). (1) Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh PA 15260, United States

While combinatorial chemistry used to emphasize rapid synthesis and screening of large libraries of compounds, the current trend is to synthesize much smaller focused compound libraries. In this talk, we present our recently developed computational strategy that combines combinatorial chemistry and fragment‐based drug design techniques, fragment linking and fragment


growing, to generate focused virtual libraries more efficiently. Once combinatorial chemistry scaffolds are placed in the binding site, fragments can be grown from and/or linked to the scaffold side chains to maximize favorable interactions with the target protein. Different methods for placing the scaffold on the binding site will be discussed along with rules that are essential for effective filtering. One advantage offered by our strategy is that it can also be universally applied to design compounds that replicate onto combinatorial chemistry scaffolds the essential binding features of proteins, peptides and small molecules. The application of the methodology to designing inhibitors of c‐Myc‐Max protein interaction will be presented.

CINF 9
Chemical library design: From diversity, similarity, and multicriterion optimization to a versatile cheminformatics content management system (CCMS)
Weifan Zheng(1), [email protected], 1801 Fayetteville Street, BRITE Building, Durham North Carolina 27707, United States . (1) Pharmaceutical Sciences, North Carolina Central University, Durham North Carolina 27707, United States

Combinatorial chemistry and high throughput screening research often involve the generation, storage and analysis of large datasets. These data are often complex and heterogeneous in nature. To enable the most efficient design of chemical libraries and biological assays, various computational methods have been developed in the past 15 years. More recent research in chemical genomics and systems chemical biology requires the integration of different data sources and computational tools. For example, target family‐ and pathway‐based library design may require information about biological targets and pathways. These requirements call for an integrated system that can organize data, models and computational tools in a flexible and extensible fashion. In this talk, I will first briefly review some concepts for library design, and then describe our effort to develop a flexible cheminformatics content management system (with tagging, sharing as well as user uploading of data and tools).

CINF 10
Six years of collaborative drug discovery in the cloud
Barry Bunin(1), [email protected], 1633 Bayshore Highway, Suite 342, Burlingame CA 94010, United States ; Sean Ekins(1); Moses Hohman(1); Kellan Gregory(1); Barry Prom(1); Sylvia Ernst(1). (1) Collaborative Drug Discovery (CDD, Inc.), Burlingame CA 94010, United States

Collaborative Drug Discovery hosts a widely used drug discovery data cloud platform with advanced collaborative capabilities for distributed researchers. The CDD Vault, Collaborate, and Public together host private, collaborative (selectively shared), and public data spanning the competitive, precompetitive, and neglected disease domains including publicly disclosed collaborations with GlaxoSmithKline, Pfizer, and the Bill & Melinda Gates Foundation, as well as with hundreds of academic and biotech startup companies. CDD provides a novel, collaborative approach for integration experimental and computational screening with distributed data collection, storage, visualization and analysis – balancing privacy‐security with encouraging collaborations, when desired. Experiences will be shared with researchers using the “CDD Vault” – a secure, private industrial‐strength database combining traditional drug discovery informatics (registration and SAR) with social networking capabilities. CDD Collaborate enables real‐time collaboration by securely


exchanging selected confidential data. Traditional drug discovery capabilities include the ability to import/export to Excel™ and sdfiles, Boolean queries for potency, selectivity, and therapeutic windows for small molecule enzyme, cell, and animal data, substructure and Tanimoto similarity search, physical chemical property search, as well as IC50 calculation/curve generation, heat‐maps, and Z/Z' statistics for archived data (protocols, molecules, plates, hyperlinked files). CDD Public has unique, constantly growing drug discovery SAR content.

CINF 11
Managing giant combinatorial chemistry spaces in silico
Carsten Detering(1), [email protected], An der Ziegelei 79, 53757 Sankt Augustin Germany, Germany ; Holger Claussen(1); Markus Lilienthal(1); Christian Lemmen(1). (1) BioSolveIT, Sankt Augustin NRW 53757, Germany

We will introduce a method which catches the two aforementioned birds (chemical complexity and chemical universe) with one stone: by cleverly searching a fragment space on the fly without the need to enumerate compounds, the computational overhead is kept to a minimum, and thus, search times are low (minutes for 10^10 molecules). Secondly, if the fragment space is composed of the inhouse available chemistry, results obtained are much more likely to be synthesizable, as the chemical reaction protocol is automatically delivered together with the hits. We will show a few validation cases from the industry, and look at the properties of one publicly available fragment space which contains 12 billion molecules.

CINF 12
Toward the ideal synthesis: The role of step economy and function oriented synthesis in first‐in‐class approaches to HIV eradication, overcoming cancer resistance and treating Alzheimer's disease
Paul A Wender(1), [email protected], 333 Campus Drive, Mudd Building, Room 121, Stanford CA 94305‐5080, United States . (1) Department of Chemistry, Stanford University, Stanford CA 94305, United States

Jim Hendrickson has had a major impact on how we think about synthesis. He was also an inspiring influence on my early career. Evolving from that time are programs in our group directed at the eradication of HIV (Science 2008,649), overcoming resistant cancer (PNAS 2008 12128, the major cause of chemotherapy failure) and novel strategies for treating Alzheimer's disease (Neurobiology of Disease 2009, 332). A major aspect of these programs is the singular importance of step economy in synthesis and how that can be achieved by computational analysis, new reactions and function oriented synthesis (Accounts 2008 40). In this lecture we will show three case studies of how step economy provides a key to addressing major therapeutic challenges of our time.

CINF 13
Aiming for the ideal synthesis
Philip S Baran(1), [email protected], 10550 North Torrey Pines Road, La Jolla CA 92037, United States . (1) Department of Chemistry, Scripps Research Institute, La Jolla CA 92037, United States

Our laboratory is focused on the practical total synthesis of complex natural products such as and terpenes by aiming to achieve the “ideal synthesis”. Hendrickson defined such a synthesis in 1975, stating: “The ideal synthesis creates a complex molecule . . . . . in a sequence of only construction reactions


involving no intermediary refunctionalizations, leading directly to the target, not only its skeleton but also its correctly placed functionality.” (JACS 1975, 97, 5784). In order to achieve this level of efficiency one must minimize superfluous refunctionalization steps such as protecting group and non‐strategic redox chemistry. Such considerations require exquisite control of chemoselectivity by the invention of chemistry and logical frameworks to aid in the planning of such routes. This invention‐oriented approach to total synthesis will be illustrated with several case studies from our laboratory.

CINF 14
Half a century of computers in chemistry
James B Hendrickson(1), [email protected], MS 015, 415 South Street, Waltham MA 02453, United States . (1) Department of Chemistry, Brandeis University, Waltham MA 02453, United States

My half‐century of chemistry and computers may be divided into three areas. The first was to calculate the lowest‐energy conformations of the 6‐10‐membered cycloalkane rings, and then their pseudorotation energies, to assist in synthesis planning. The second area was to define a process to seek the optimal plans for efficient synthesis design. We developed a process to find just the few shortest synthesis routes to any input target structure and this has resulted in the SynGen program. This effort led to the third area, the development of a general system to afford a unique, linear string to describe any organic reaction, defined by its input reactant and product structures, irrespective of mechanism or number of operational steps in the reaction. This has afforded a program to assign a unique “signature” for any given reaction and has the important feature of providing searchable indexing for any reaction database.

CINF 15
Exhaustive docking protocol with SAR‐based pose selection
Freya Klepsch(1), [email protected], Althanstrasse 14, Vienna Vienna 1090, Austria ; Gerhard F Ecker(1). (1) Department of Medicinal Chemistry, University of Vienna, Vienna 1090, Austria

The polyspecific nature of the transmembrane drug efflux pump P‐glycoprotein (P‐gp) represents a great impediment for standard docking protocols. Furthermore, a ~6000 Å³ large transmembrane binding cavity, consisting of several binding sites, the high flexibility of P‐gp and the lack of structural information render the correct ranking of docking poses a quite challenging task. Thus, we present a docking protocol that combines exhaustive conformational sampling of propafenone‐type P‐gp inhibitors with common scaffold clustering and SAR‐based pose selection. The resultant binding hypotheses are in agreement with experimental data, which strengthens the validity of this approach. Analogous protocols were performed with other membrane proteins, like the GABAA receptor and the serotonin transporter. We acknowledge financial support provided by the Austrian Science Fund, grant F03502.

CINF 16
Comparison of weighted and unweighted consensus approaches in QSAR/QSPR
Dechuan Zhuang(1), dechuan@simulations‐plus.com, 42505 10th Street West, Lancaster CA 93534, United States ; Adam Lee(1); Robert Fraczkiewicz(1); Marvin Waldman(1); Bob Clark(1); Walt Woltosz(1). (1) Life Science, Simulations Plus, Inc, Lancaster CA 93534, United States

Two flavors of making consensus categorical predictions in QSAR/QSPR, 'unweighted consensus' and the 'weighted consensus'


approaches, were compared with several datasets using ADMET Predictor(TM). While the unweighted method gives equal weight to every member model, the weighted method implicitly assigns different weights to the outcomes of its member models. To find out if there is any benefit of using one approach over the other, we constructed several datasets, which have different structural characteristics (balanced, imbalanced, diverse, non‐diverse, etc.), and built predictive models from them. The performances of the two approaches on these datasets were compared head‐to‐head using a paired t‐test. Our results show that the performances of the two approaches on the selected datasets are statistically equal, and thus in general there is no clear advantage of using one approach over the other. Possible reasons for the observation will be discussed.

CINF 17
When is chemical similarity significant? The statistical distribution of chemical similarity scores and its extreme values
Ramzi J. Nasr(1), [email protected], University of California, Irvine, Irvine CA 92697, United States ; Pierre Baldi(1). (1) Department of Computer Science, University of California, Irvine, Irvine CA 92697, United States

As repositories of chemical molecules continue to expand and become more open, it becomes increasingly important to develop tools to search them efficiently and assess the statistical significance of chemical similarity scores. Here, we develop a framework for modeling, predicting, and approximating the distributions of chemical similarity scores and their extreme values in large databases. From the distributions of the scores and their analytical forms, Z‐scores, E‐values, and p‐values are derived to assess the significance of similarity scores. In addition, the framework also allows one to predict the value of standard chemical retrieval metrics, such as sensitivity and specificity at fixed thresholds, or receiver operating characteristic (ROC) curves at multiple thresholds, and to detect outliers in the form of atypical molecules. Numerous and diverse experiments that have been performed, in part with large sets of molecules from the ChemDB, show remarkable agreement between theory and empirical results.

CINF 18
Reaction prediction as ranking molecular orbital interactions
Matthew A Kayala(1), [email protected], 243 ICS 2, Irvine CA 92697, United States ; Chloe A Azencott(1); Jonathan H Chen(1); Pierre Baldi(1). (1) Department of Computer Science, University of California, Irvine, Irvine CA 92697, United States

Being able to predict the course of chemical reactions is essential to the practice of chemistry. While computational approaches to this problem have been extensively studied in the past, a fast, accurate, and scalable solution has yet to be described. Here, we propose a novel formulation of reaction prediction as a machine learning ranking problem: given a set of molecules and a description of conditions, learn a ranking over potential filled to unfilled molecular orbital (MO) interactions approximating the corresponding transition state energy ranking. Using an existing rule‐based expert system (ReactionExplorer), we derive a restricted chemistry dataset consisting of 1300 full multi‐step reactions with 2200 distinct starting materials and intermediates. This yields 3600 predicted MO interactions and 14 million unpredicted MO interactions. A two‐stage machine learning scheme is used to learn the model. First, we train reactive site predictors using a combination of topological and real‐valued global features to filter out 61% and 44% of non‐predicted filled and unfilled MOs with a 0.0001% error rate.


Then various ranking models are trained on the MO interactions using features engineered to approximate transition state entropy and enthalpy. Using cross‐validation, current best models recover a perfect‐ranking 61% of the time and recover a within‐4‐ranking 95% of the time.

CINF 19
Re‐examining the tubulin‐binding conformation of antitumor epothilones using QSAR and crystallographic refinement
Scott A Johnson(1), [email protected], 607 Charles E. Young Drive E., Los Angeles CA 90095, United States ; Adam JT Smith(1); James P Snyder(2); Kendall N Houk(1). (1) Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles CA 90095, United States (2) Department of Chemistry, Emory University, Atlanta GA 30322, United States

Several different bioactive conformations of epothilones, potent anti‐tumor compounds, have been reported in the literature. We proposed to provide additional support to one of these conformations using a QSAR‐based approach. By assuming a common pharmacophore for a set of epothilone analogs, we clustered conformations of these analogs using dihedral angles responsible for orienting functional groups with known SAR effects. We identified clusters common among the most active compounds, and developed simple QSAR models that relate the experimental IC50 values to the conformational strain energy. The resulting epothilone conformer that minimizes strain energy in the active epothilone analogs is different from previously proposed conformers. This conformation demonstrates good agreement when refined in the experimental electron crystallographic density for tubulin‐bound epothilone.

CINF 20
Efficient core structure searches using various fingerprinting methodologies: Advantages, particularities and pitfalls
Stefan M Furrer(1)(2), [email protected], 1199 Edison Drive, Cincinnati OH 45215, United States ; David J Wild(2). (1) Science & Technology, Givaudan Flavors Corp, Cincinnati OH 45216, United States (2) School of Informatics and Computing, Indiana University, Cincinnati OH 45216, United States

The complexity of medicinal chemistry patent applications as well as the number of compounds enumerated as examples have increased spectacularly in recent years. Finding the structures of major interest using traditional methods is often a difficult task. Molecular fingerprinting methods are excellent tools to rapidly organize chemical information. Different fingerprinting methods however represent structural characteristics in different ways. Multiple fingerprinting methodologies were evaluated in their capacity to differentiate and isolate core compounds in chemical patents. It was found that the fingerprint designs as well as medicinal chemistry approaches have significant impact on the overall performance: different tools shed different "lights" over the molecular landscape. Modal fingerprints were investigated to focus on core compounds in patents, through a relative over‐expression of co‐occurring molecular features. Concrete examples will be given based on several major patent cases.

CINF 21
DockingDB: A cyberinfrastructure for computer‐aided drug design based on ChemDB
Paul M Rigor(1), [email protected], 248 ICS2 Bldg, Irvine CA 92697, United States . (1) School of Information and Computer Sciences,


University of California in Irvine, Irvine CA 92697, United States

Although there are several open‐source and commercially available computational tools for virtual high‐throughput drug screening, including DOCK, Autodock and Schroedinger's Maestro, there is still a lack of a more general, tool‐agnostic and scalable framework that is able to leverage the advantages offered by readily available docking and programs in a high‐performance computing (HPC) environment. We have developed a cyber‐infrastructure built on top of an HPC pipeline and existing proteomics and chemical informatics tools, such as ChemDB and SCRATCH, to support an iterative computer‐aided drug design methodology. We have applied our approach to two biological problems and describe preliminary results. Moreover, growing extensions to the pipeline and related tools are discussed.

CINF 22
Protein Fold Topology: Will it aid drug discovery or is it the reason natural products have drug properties?
Ronald J Quinn(1), [email protected], Nathan Campus, Brisbane Queensland 4111, Australia ; Esther Kellenberger(2). (1) Eskitis Institute, Griffith University, Brisbane Queensland 4111, Australia (2) Université de Strasbourg, Illkirch F‐67400, France

Natural products are made by nature through interacting with biosynthetic enzymes. Natural products also exert their effect as drugs by interaction with proteins. We have explored the question does the recognition of the natural product by biosynthetic enzymes translate to recognition of the therapeutic target. Molecular modeling of flavonoid biosynthetic enzymes and protein kinases with a series of natural product kinase inhibitors led to the development of the concept of Protein Fold Topology (PFT). PFT describes cavity recognition points unrelated to protein fold similarity. The topology or spatial properties are preserved even though there is deformation of the protein elements that participate in the protein‐ligand interactions. We observe helices or β‐sheets as equivalent in providing the invariant topology for protein‐ligand interaction and, as such, are seeking to find automated methods to interrogate these interactions.

CINF 23
Screening of herbs used in traditional Indonesian medicine for inhibitors of aldose reductase
Dave Barlow(1), [email protected], Franklin‐Wilkins Building, 150 Stamford Street, London London SE1 9NH, United Kingdom ; Sadaf Naeem(1); Peter Hylands(1). (1) Pharmacy, King's College London, London London SE1 9NH, United Kingdom

Virtual screening of phytochemical constituents of herbs used in traditional Indonesian medicine has been performed to search for novel leads active against the enzyme aldose reductase (AR). The screening was performed using the docking software, MolDock, and the activities (IC50s) of the docked compounds predicted using an artificial neural network (ANN) trained using the crystallographic data for AR complexes involving inhibitors of known potency. The ANN gave a mean accuracy of ~ 98% for the activities of those compounds involved in the known protein structures. The trained ANN was used to predict the IC50s for all carboxyl containing compounds in the database of Indonesian herbal constituents, and the predicted IC50 values ranged from 17 nM to 118 mM. Selected hits were subsequently tested in vitro against human recombinant AR and while some of these proved to be about as active as predicted, others proved significantly less potent than predicted.
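
To put the predicted range quoted in the CINF 23 abstract (17 nM to 118 mM) in perspective, potencies are often compared on a logarithmic scale. The sketch below is not part of the authors' workflow. It assumes the common pIC50 convention (the negative base‐10 logarithm of the IC50 expressed in mol/L), and the helper name `pic50` and the unit table are illustrative only.

```python
import math

# Unit prefixes expressed in mol/L (illustrative helper, not from the abstract)
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9}


def pic50(value: float, unit: str) -> float:
    """Return pIC50 = -log10(IC50 in mol/L)."""
    if value <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(value * UNIT_TO_MOLAR[unit])


# The two extremes of the predicted range quoted in the abstract
most_potent = pic50(17, "nM")
least_potent = pic50(118, "mM")
span = most_potent - least_potent  # width of the range in log units

print(f"pIC50 of 17 nM:  {most_potent:.2f}")
print(f"pIC50 of 118 mM: {least_potent:.2f}")
print(f"span: {span:.2f} orders of magnitude")
```

Under this convention the quoted range covers nearly seven orders of magnitude, which is one way to read the abstract's remark that some tested hits were far less potent than predicted.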


CINF 24
Common cold and flu: Computational strategies for the identification of antiviral leads from nature
Judith M. Rollinger(1), [email protected], Innrain 52c, Innsbruck Tirol 6020, Austria ; Johannes Kirchmair(2); Ulrike Grienke(1); Daniela Schuster(1); Klaus R. Liedl(2); Michaela Schmidtke(3). (1) Institute of Pharmacy and Center for Molecular Biosciences, University of Innsbruck, Innsbruck 6020, Austria (2) Institute of Theoretical Chemistry and Center for Molecular Biosciences, University of Innsbruck, Innsbruck 6020, Austria (3) Institute of Virology and Antiviral Therapy, Friedrich Schiller University, Jena 07745, Germany

The search for new drug leads against respiratory viruses remains an area of active investigations. In this regard natural products offer a tremendous potential as source for antivirals. In our lab several virtual screening campaigns on 3D natural product databases such as pharmacophore searches, similarity‐based approaches and docking have proven to be highly efficient for the target‐oriented identification of bioactive candidates. Integration of these heuristic approaches with empirical ones, like ethnopharmacology and in vitro extract screening, are helpful strategies for prioritizing compounds to be isolated from natural sources and pharmacologically tested. Here we demonstrate the application of different in silico techniques for the discovery of new anti‐rhinoviral and anti‐influenza virus natural compounds using well defined molecular targets, such as the hydrophobic pocket in the rhinoviral capsid and the influenza virus neuraminidase.

CINF 25
Chemoinformatic analysis of natural products: Towards the discovery of DNA methyltransferase inhibitors of natural origin
Jose Medina‐Franco(1), [email protected], 11350 SW Village Parkway, Port St. Lucie Florida 34987, United States ; Fabian López‐Vallejo(1); Rajarshi Guha(2); Andreas Bender(3); Dirk Kuck(4); Frank Lyko(4). (1) Torrey Pines Institute for Molecular Studies, Port St. Lucie Florida 34987, United States (2) NIH Chemical Genomics Center, Rockville Maryland 20850, United States (3) Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom (4) Division of Epigenetics, Deutsches Krebsforschungszentrum, Heidelberg 69120, Germany

A comparative diversity analysis of natural products, drugs, the Molecular Libraries Small Molecule Repository (MLSMR), and combinatorial libraries is presented in this work. To this end, a multiple criteria strategy was employed including physicochemical properties, scaffolds and different fingerprints as molecular descriptors. The approach enabled a comprehensive analysis of property space coverage, the degree of overlap between collections, scaffold and structural diversity and overall structural . Since several natural products contained in dietary products are implicated in the inhibition of DNA methyltransferases (DNMTs), which are emerging targets for the treatment of cancer, we conducted a docking‐based virtual screening of a natural product database with a homology model of the catalytic domain of DNMT1. Herein we discuss the results of the virtual screening that represents a first step towards the systematic screening of compounds with natural origin targeting DNMTs.
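
A fingerprint‐based diversity comparison of the kind described in the CINF 25 abstract can be sketched as follows. The abstract does not say which similarity measure was used, so this example assumes the Tanimoto coefficient on fingerprints represented as sets of "on" bit indices. The toy fingerprints and the function names are hypothetical and stand in for real compound collections.

```python
from itertools import combinations
from typing import FrozenSet, Sequence

Fingerprint = FrozenSet[int]  # indices of the bits set to 1


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient |a & b| / |a | b| (taken as 0.0 for two empty sets)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def mean_pairwise_similarity(collection: Sequence[Fingerprint]) -> float:
    """Average Tanimoto similarity over all distinct pairs; lower means more diverse."""
    pairs = list(combinations(collection, 2))
    if not pairs:
        raise ValueError("need at least two fingerprints")
    return sum(tanimoto(a, b) for a, b in pairs) / len(pairs)


# Toy data standing in for two compound collections (made-up bit indices)
library_a = [frozenset({1, 2, 3, 4}), frozenset({1, 2, 3, 5}), frozenset({1, 2, 4, 5})]
library_b = [frozenset({1, 2}), frozenset({7, 8, 9}), frozenset({3, 10})]

mean_a = mean_pairwise_similarity(library_a)  # closely related members
mean_b = mean_pairwise_similarity(library_b)  # members share no bits

print(mean_a, mean_b)
```

A lower mean pairwise similarity indicates a structurally more diverse collection under this measure; a real analysis would also need the property and scaffold criteria the abstract mentions.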


CINF 26
Lessons from covalent inhibitor modeling
Oliv Eidam(1), [email protected], 1700 4th St, San Francisco CA, United States ; Simone Bonazzi(2); Stephan Guttinger(3); Jean‐Yves Wach(2); Ivo Zemp(3); Ulrike Kutay(3); Karl Gademann(2). (1) Department of Pharmaceutical Chemistry, UCSF, San Francisco CA, United States (2) Chemical Synthesis Laboratory, EPFL, Lausanne VD, Switzerland (3) Institut fur Biochemie, ETHZ, Zurich ZH, Switzerland

Leptomycin B (LMB) has antifungal, antibacterial and anti‐tumor activity and is an important “tool compound” in cell biology. It inhibits the export of certain proteins from the nucleus through specific alkylation of Cys528 of human CRM1. The recently published x‐ray structure of CRM1 motivated us to model LMB to rationalize the activity of recently discovered LMB analogues. A manual modeling approach combined with all‐atom energy minimizations was used. We found that modeling was largely guided by the structural environment, and steric and geometric restraints imposed both from the binding site and the ligand. Mechanistic considerations of covalent inhibitor binding highlight important residues in the binding site, and the internal energy of the ligand may play a crucial role in the binding mode of covalent inhibitors. Perhaps the most important lesson is that manual modeling can generate models useful for the design of future analogues.

CINF 27
Open Data and the Panton Principles
Peter Murray‐Rust(1), [email protected], Lensfield Road, Cambridge Cambridgeshire CB2 1EW, United Kingdom . (1) Department of Chemistry, University of Cambridge, Cambridge Cambridgeshire CB2 1EW, United Kingdom

Although an increasing amount of chemical data is becoming visible on the Internet it cannot be re‐used without explicit permission to avoid potentially breaking copyright. The Open Knowledge Foundation and Science Commons have collaborated on a definition of Open Data and produced a set of principles and practices (Panton Principles) to help authors and publishers assert that their published data is truly Open. An example of fully Open Data is shown in Crystaleye http://wwmm.ch.cam.ac.uk/crystaleye with over 200,000 crystallographic datasets from the literature. Several publishers are adopting Panton, and this presentation will show the advantages of doing so.

CINF 28
Making priors a priority
Matthew D Segall(1), [email protected], CB25 9TL, Cambridge Cambridgeshire CB25 9TL, United Kingdom ; Andrew Chadwick(2). (1) Optibrium Ltd., Cambridge CB25 9TL, United Kingdom (2) Tessella plc., Burton upon Trent Staffs DE15 0YZ, United Kingdom

When we build a predictive model of a drug property we rigorously assess its predictive accuracy, but we are rarely able to address the most important question, “How useful will the model be in making a decision in a practical context?” To answer this requires an understanding of the prior probability distribution and hence prevalence of negative outcomes due to the property. We will illustrate the importance of the prior to assess the of a model to select or eliminate compounds for further investigation. A better understanding of the prior probabilities of adverse events due to key factors will improve our ability to make good decisions in drug discovery, finding higher quality molecules more efficiently. As the data necessary to estimate these priors does not include proprietary compound

structures, this presents an opportunity for collaboration to improve the basis for good decision‐making for all.

CINF 29
Ensuring sustainability of a comprehensive and highly curated scientific data resource
Colin R Groom(1), [email protected], 12 Union Rd, Cambridge Cambridgeshire CB2 1EZ, United Kingdom ; Ian J Bruno(1). (1) CCDC, Cambridge Cambridgeshire CB2 1EZ, United Kingdom

The Cambridge Crystallographic Data Centre (CCDC) has been established as the primary repository for the experimentally determined 3D structures of organic and organometallic compounds for over 45 years. Individual data sets are available to the scientific community free of charge through CCDC's structure request service. Additionally structures are made available as part of the Cambridge Structural Database (CSD). Structures in the CSD are expertly curated by editorial staff so as to facilitate reliable and sophisticated retrieval, visualisation and analysis by software that the centre also develops. The CSD and associated software are made available on a subscription basis with significant discounts applied for academic institutions. The income generated from subscriptions has ensured until now the sustainability of a comprehensive and highly curated scientific resource. This presentation will discuss the implications that increasing throughput and scientific complexity have for the way CCDC must operate, opportunities for alternative distribution models that respond to evolving expectations of the scientific community, and the pitfalls we must avoid to ensure sustainability in the years ahead.

CINF 30
Visual search in scientific research data
Irina Sens(1), [email protected]‐hannover.de, Welfengarten 1B, Hannover Lower‐Saxony, Germany ; Oliver Koepler(1). (1) German National Library of Science and Technology, Hannover 30167, Germany

In recent discussions among research institutions and research funding agencies, scientific research data has been identified as of strategic interest. As a consequence there are ongoing efforts to establish an infrastructure to support storage, long‐term preservation, and accessing of scientific research data. Registration of datasets with DOI names makes research data citable and searchable. To date a number of operational Digital Library systems for scientific research data already exist. Datasets often comprise numeric data on continuous or discrete scales and are often associated with textual metadata including data description, author and origin information. While searching in textual metadata is commonly available, content‐based access to the research data is an open challenge. Thereby visualisation and visual analysis of numeric data is common when processing scientific research data. To close this gap in the information retrieval process we report on a concept and first implementations to support visual retrieval and exploration in a specific class of primary research data, namely, time‐oriented data. The concept discusses relevant challenges for a general approach to scientific primary data and we present first implementations on a real‐world dataset.

CINF 31
Specific targeting of the G‐quadruplex in the c‐Myc promoter with ellipticine
Laurence H. Hurley(1)(2)(3), [email protected], 1703 E. Helen St., Tucson Arizona 85721, United States ; Tracy A. Brooks(1)(2)(3); Robert Brown(1);

Vijay Gokhale(1)(2)(3). (1) College of Pharmacy, University of Arizona, United States (2) Arizona Cancer Center, University of Arizona, United States (3) BIO5 Institute, University of Arizona, United States

Previous studies have shown that the G‐quadruplex in the c‐Myc promoter is the silencer element for transcriptional control. More recent studies have shown the involvement of NM23‐H2 and nucleolin in the activation and silencing of c‐Myc transcription. Using a computational overlay of c‐Myc G‐quadruplex‐binding compounds and virtual screening, we have identified ellipticine as a potential G‐quadruplex‐interactive compound. Then, by taking advantage of a Burkitt's lymphoma cell line in which only the non‐translocated allele is under the direct control of the promoter containing the G‐quadruplex, we were able to show that the c‐Myc‐lowering effect is directly due to interaction with the G‐quadruplex. In follow‐up studies using CADD we designed further ellipticine analogs. These studies provide the best available cellular evidence not only for the presence of G‐quadruplex in the promoter elements of oncogenes such as MYC but also that inhibition of specific transcription can be mediated by small molecules that bind to this promoter element.

CINF 32
Exploring natural products for drug discovery by mining biomedical information resources

Nancy Baker(1), [email protected], Beard Hall, South Columbia Street, CHAPEL HILL NORTH CAROLINA 27599, United States; Denis Fourches(1), [email protected], Beard Hall, South Columbia Street, CHAPEL HILL NORTH CAROLINA 27599, United States; Natalie Rice(1); Eugene Muratov(2); Alexander Tropsha(1). (1) Laboratory for Molecular Modeling, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, CHAPEL HILL NORTH CAROLINA 27599, United States (2) Laboratory of Theoretical Chemistry, Department of Molecular Structure, Bogatsky Physical‐Chemical Institute NAS of Ukraine, Odessa 65080, Ukraine

Parallel screening of Natural Products (NPs) is a typical approach for identifying drug candidates and their targets. However, biomolecular targets of NPs are often discovered serendipitously. We report on the use of Chemotext, a database of assertions extracted from biomedical literature that link chemicals, targets, and diseases [J Biomed Inform 2010, 43:510‐9] to rationalize the search for NP targets in the context of the Systems Chemical Biology paradigm [Nat Chem Biol 2007, 3:447‐50]. We have identified similar biochemical pathways that NPs are known to interact with in both plants and humans. Through this analysis, we can deduce novel compound‐target‐disease associations as well as novel molecular targets for NP‐derived compounds. Using Chemotext, we have collected and integrated cross‐species NP‐target associations. We present the case studies of Diabetes mellitus for predicting new compound‐target interactions and Tacrolimus‐Binding Proteins for detecting similar biochemical pathways in both plants and animals/humans.

CINF 33
In silico strategies in natural product research to combat inflammation and lifestyle diseases: Identification of FXR‐inducing triterpenes from Ganoderma lucidum

Judith M. Rollinger(1), [email protected], Innrain 52c, Innsbruck Tirol 6020, Austria; Ulrike Grienke(1); Judit Mihály‐Bison(2); Daniela Schuster(1); De‐An Guo(3); Bernd R. Binder(2); Gerhard Wolber(1); Hermann Stuppner(1). (1)

Institute of Pharmacy and Center for Molecular Biosciences, University of Innsbruck, Innsbruck 6020, Austria (2) Center of Biomolecular Medicine and Pharmacology, Department of Vascular Biology and Thrombosis Research, Medical University of Vienna, Vienna 1090, Austria (3) Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China

Farnesoid X receptor (FXR) is a ligand‐activated transcription factor. The available structural information and the importance of FXR in controlling endogenous pathways related to inflammation and lifestyle diseases, like metabolic syndrome, dyslipidemia, atherosclerosis and type 2 diabetes, render FXR an attractive target for computational approaches. Virtual screenings of our in‐house Chinese Herbal Medicine database with structure‐based pharmacophore models revealed mainly triterpenes of the famous TCM fungus Ganoderma lucidum Karst. as putative FXR ligands. Ganoderma fruit body extracts verified the predicted FXR‐inducing effect in a reporter gene assay, which prompted us to determine its bioactive constituents. Five out of 25 secondary metabolites from G. lucidum, i.e. ergosterol peroxide, lucidumol A, ganoderic acid TR, ganodermanontriol, and ganoderiol F, dose‐dependently induced FXR in the low micromolar range. To rationalize the binding interactions, additional molecular docking studies were performed, which allowed establishing a first structure activity relationship of the investigated triterpenes.

CINF 34
Discovery of natural product‐derived 5HT‐1A receptor binders by QSAR modeling of known inhibitors, virtual screening and experimental validation

Xiang S. Wang(1), [email protected], Laboratory of Cheminformatics and Drug Design, 2300 4th St. NW, Washington DC 20059, United States. (1) Department of Pharmaceutical Sciences, Howard University, Washington DC 20059, United States

The 5‐Hydroxytryptamine receptor subtype 1A (5‐HT1A) has been an attractive target to treat mood disorders such as anxiety and depression. In this study we have developed combinatorial Quantitative Structure‐Activity Relationship (QSAR) models for 105 5‐HT1A binders and 61 non‐binders retrieved from the Psychoactive Drug Screening Program (PDSP) Ki database. Three advanced methods, k‐Nearest Neighbor (kNN), Random Forest (RF) and Support Vector Machine (SVM), were employed for model building. The robust QSAR models of 5‐HT1A binders were then used to mine major natural product libraries such as the TimTec Natural Product Library (NPL) and Natural Derivatives Library (NDL). Multiple potential hits were identified and are currently being examined by the PDSP for experimental validation. The success ratios, chemical diversities and structural novelties of the natural product libraries for the purpose of virtual screening were further explored in comparison with other types of screening libraries, i.e. drug‐like libraries, targeted libraries and diversity libraries.

CINF 35
Traditional medicine patents lead to enhanced drug discovery derived from natural products

John Zabilski(1), [email protected], PO Box 3012, Columbus OH 43125, United States; Roger Schenck(1), [email protected], PO Box 3012, Columbus OH 43202, United States. (1) Content Planning, CAS, Columbus OH 43202, United States

Since ancient times natural products have provided relief from numerous ailments. Hippocrates, the father of modern medicine, noted that powder derived from the bark of the willow tree helped heal pain and headaches. In the 1800's, chemists isolated

the beneficial substance as salicylic acid and refined it by buffering sodium salicylate with acetyl chloride to create acetylsalicylic acid or aspirin. In more recent years, Traditional Medicine patents have increasingly delved into the rich vein of natural products for potential drug discovery. The CAS databases have mined this wealth by adding more than 50,000 new traditional patent records from several countries. This presentation will illustrate the vast content available and methods to easily explore it by using SciFinder or STN.

CINF 36
Librarian2.0: Synthesizing data management and subject expertise

Beth Blanton‐Kent(1), [email protected], P.O. Box 400124, Charlottesville VA 22904, United States; Sherry Lake(1); Andrew Sallans(1). (1) University of Virginia Library, Charlottesville VA 22903, United States

The University of Virginia Library is working to support new data management requirements in science and engineering by developing a model that first draws upon close collaboration between data experts and subject librarians, and culminates in policy and infrastructure recommendations to the University's Office of the Vice President for Research (VPR) and the Office of the Vice President/Chief Information Officer (VP/CIO). This model begins with a data interview to assess the researcher's data management practices and needs and to establish a baseline awareness of current practice. After collecting this information, the results are furnished to the institutional repository team and NSF Data Management Plan working group to inform their processes. In aggregate form, this information is provided to the VPR and VP/CIO as policy and infrastructure recommendations. Ultimately, the entire process cycles back to the researcher. This presentation will offer a case study following a chemist/chemical engineer through this process.

CINF 37
Anatomy of a PubChem project

S. Joshua Swamidass(1), [email protected], 606 S. Euclid, Box 8118, St Louis MO 63110, United States; Bradley Calhoun(1); Michael Browning(1). (1) Department of Pathology and Immunology, Washington University in St Louis, St Louis MO 63110, United States

More raw data from high‐throughput screens is made available to the public every day, often through repositories like PubChem. This data, however, is often unorganized and incompletely annotated. Of particular interest, often several screens are components of a larger project. Each screen is a step in the project's workflow, its anatomy. Knowledge of the project's workflow includes non‐obvious but valuable information: for instance, the scaffolds the project team chose to pursue and how exactly compounds were chosen for follow‐up testing. Although these details are not well annotated in PubChem projects, it is possible to infer them from the raw screening data using a collection of statistical techniques. Moreover, inferred workflows can be used to automatically discover additional active molecules, inform useful views of screening data, and identify methodological errors.

CINF 38
Evolution of the University of Minnesota Libraries' approach to e‐scholarship

Meghan Lafferty(1), [email protected], 108 Walter Library, 117 Pleasant St SE, Minneapolis MN 55455, United States; Lisa Johnston(1). (1) Science and Engineering Library, University of Minnesota, Minneapolis MN 55455, United States

Libraries have struggled with how best to respond to the challenges of e‐science since

the middle of the last decade. The University of Minnesota Libraries' approach to e‐science and other cyberinfrastructure issues has changed multiple times since our initial response in 2006; it has primarily taken the form of groups rather than a dedicated position. We have more recently expanded our focus beyond e‐science to e‐scholarship in order to include areas such as the digital humanities. The talk will address the evolution of group structures and their primary emphases over the past 5 years, the rationales for different changes, and potential future directions.

CINF 39
Hosting a compound centric community resource for chemistry data

Antony J Williams(1), [email protected], 904 Tamaras Circle, Wake Forest NC 27587, United States; Valery Tkachenko(1); Richard Kidd(2). (1) ChemSpider, Royal Society of Chemistry, Wake Forest NC 27587, United States (2) Informatics, Royal Society of Chemistry, Cambridge, United Kingdom

Laboratories around the world continue to generate immense amounts of data that are non‐proprietary and of value to the community. If available, these data could dramatically reduce costs by minimizing rework and ultimately facilitating faster research. High quality reference data collections of chemical compound dictionaries, properties and spectra have been generated over many decades. With the advent of social networking tools and platforms such as Wikipedia, the community has an opportunity to contribute. The ChemSpider platform hosted by the Royal Society of Chemistry is a compound centric database with associated data. It is already populated with almost 25 million unique compounds, and the community can deposit and host their own data, and curate and annotate existing data including those generated in Open Notebook Science Efforts. This presentation will provide an overview of progress to date and outline the vision of this community platform for chemistry and ensuring the longevity of chemistry reference data.

CINF 40
Library data services in the social sciences: Lessons for science?

Katharin Peter(1), [email protected], VKC Library, B40a USC, Los Angeles CA 90089, United States. (1) University of Southern California Libraries, University of Southern California, Los Angeles CA 90089, United States

Social science data have a rich history within universities: aggregate statistical publications, such as Statistical Abstract of the United States, and even more detailed U.S. decennial census results, have long held a place within academic depository library collections. Following the development of Machine Readable Data Files, social science data archives were established within several universities across the United States, notably the Inter‐university Consortium for Political and Social Research and the Roper Center for Public Opinion Research. Although differences between social science and science data are not insignificant (for example, average file size), as data librarians we face similar obstacles: outreach, access, archiving and management and, in general, effectively creating a place within libraries for data and data services. This presentation will outline current library services and service models for social science data in hopes of launching a dialog and skill‐share between social science and sciences data professionals.

CINF 41
Using Data Curation Profiles (DCPs) as a means of raising data management awareness

Jeremy R Garritano(1), [email protected], 504 West State St., West Lafayette IN 47907, United States. (1) Purdue University, West Lafayette IN 47907, United States

While one can discuss data management plans in a general sense, there is no single solution for managing the diverse data generated by various disciplines and projects. Therefore one possible solution is to determine best practices for individual data management plans guided by a more general Data Curation Profile (DCP). The DCPs were created at Purdue University and the University of Illinois Urbana‐Champaign through a grant from the Institute of Museum and Library Services. Using a DCP, librarians and/or researchers explore various data management issues. Once a profile has been completed, not only will the librarian have a richer understanding of the kind and quantity of data that might have to be curated and archived, but the researcher will have a better understanding of their data preferences related to sharing and , regardless of where the data ultimately resides. Current applications of the DCP at Purdue will be discussed.

CINF 42
Synthesis of 3‐halo‐2‐butanones

Joseph Porter(1), [email protected], 300 North Broadway, Lexington KY 40508, United States. (1) Transylvania University, United States

This study is attempting to find out the effect of adding a halide group to a ketone. The main molecules I worked with were 3‐halo‐2‐butanones. I used ether as a solvent and performed Grignard reactions under nitrogen, adding ethynyl Grignards as the nucleophiles. I was measuring diastereomeric ratios using GC‐MS, H1 and C13 NMR, and GC. Unexpectedly, results showed that ratios were similar to those found using LiAlH4 as the nucleophile. Future experiments will be working with larger nucleophiles as well as using larger ketones.

CINF 43
Visualizing molecule similarity

Krisztina Boda(1), [email protected], 9 Bisbee Court, Santa Fe New Mexico 87508, United States. (1) OpenEye Scientific Software, Santa Fe New Mexico 87508, United States

Similarity searching based on fingerprint similarity is one of the most common approaches for virtual screening. The main advantage of the method is that it provides a rapid calculation of similarity scores to identify molecules that are similar to the reference structure. However, most fingerprint methods do not provide any insight into molecule similarity beyond a single numerical score. The poster will present a method where molecular graphs are highlighted using a color gradient scheme that emphasizes shared fragments encoded into fingerprints. This representation not only makes molecular similarity immediately apparent but also reveals information about the underlying fingerprint method. The method is utilized to analyze the hit‐lists using different fingerprint methods on datasets of previously published benchmarks. The 2D graphics are generated using OpenEye's Ogham package that provides a framework to construct molecular diagrams. The poster will also present various Ogham functionalities that allow the customization of molecule depiction.

CINF 44
Collaborative agile Internet projects: The Green Chain Reaction

Peter Murray‐Rust(1), [email protected], Lensfield Road, Cambridge Cambridgeshire, United Kingdom; Samuel E Adams(1); Lezan Hawizy(1); David M Jessop(1). (1) Department of Chemistry, University of Cambridge,

Cambridge Cambridgeshire CB 2 1EW, United Kingdom

An Open Science project was designed, implemented and completed within a month to investigate whether chemical reactions were using "greener" solvents than formerly. 10 volunteers wrote or implemented code to extract recipes from European patents. The recipes were analysed by OSCAR and chemical Natural Language processing using medium‐depth parsing to extract solvents, with high precision. The volunteers crawled the patent website, analysed over 100,000 recipes and posted the results to a communal, Open server, using the Lensfield "make/build" philosophy. The solvent information was then aggregated and presented for the years 2000 to 2010. There is no obvious trend showing that "green" solvents are becoming commoner.

CINF 45
Re‐imagining scientific communication for the 21st century: Is chemistry low hanging fruit or the worst‐case scenario?

Cameron Neylon(1), [email protected], Rutherford Appleton Laboratory, HSIC, Didcot OX11 0QX, United Kingdom. (1) ISIS Neutron Source, Science and Technology Facilities Council, Didcot OX11 0QX, United Kingdom

We are told that “the web changes everything” but scientific communication still owes more to the 17th century than to the 20th. The central problem with current practice is the view of “the paper” as a monolithic object, and the only form of communication that is rewarded. We need both to technically enable the publication of many different research objects and to create tools to aggregate these together into large narrative works that retain the structure and meaning of internal links. Along with this we need both technical and social infrastructure to help us filter and discover this large range of items. I will argue that chemistry, and in particular synthetic organic chemistry, is a special case with its own particular difficulties, but that the inherent structure and regularity of synthetic research makes it a good target for testing and demonstrating new approaches to scholarly communication.

CINF 46
Quixote: An Internet project to build a distributed Open Knowledgebase for quantum chemistry

Jens Thomas(1), [email protected], Daresbury Laboratory, Daresbury Science and Innovation Campus, Warrington Cheshire WA4 4AD, United Kingdom; Peter Murray‐Rust(2); Pablo Echenique(3); Jorge Estrada(3); Marcus D Hanwell(4); Samuel E Adams(2); Weerapong Phadungsukanan(5); Lance Westerhoff(6). (1) Computational Science and Engineering Department, Science and Technology Facilities Council, Daresbury Laboratory, Daresbury Cheshire WA4 4AD, United Kingdom (2) Department of Chemistry, University of Cambridge, Cambridge Cambridgeshire CB 2 1EW, United Kingdom (3) Instituto de Química Física "Rocasolano", CSIC, Madrid E‐28006 Madrid, Spain (4) Department of Scientific Visualization, Kitware, Inc, Clifton Park NY 12065, United States (5) Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge Cambridgeshire CB2 3RA, United Kingdom (6) QuantumBio Inc., State College PA 16803‐6602, United States

Quixote is a distributed semantic knowledgebase for quantum chemistry deliberately prototyped within a month by distributed volunteers. It uses a wide range of existing tools such as from the collection and uses them to translate conventional QC files (log, punch, archive, input) into semantic form. The semantics are controlled by per‐program

dictionaries which are created by program experts. The process is controlled by and rests heavily on modern Internet approaches such as Etherpad, Skype, Wiki, REST, HTTP, RDF and SPARQL. Parsing is through ANTLR and recursive descent. Semantics are provided by namespaced dictionaries, elements and attributes allowing lossless transmission of information. The system is completely Open/free and allows anyone to clone and run a node, on a peer‐to‐peer system with as much or little security as desired.

CINF 47
Catching the mobile wave

Steven M Muskal(1), smuskal@eidogen‐sertanty.com, 3460 Marron Road, Oceanside CA 92056, United States. (1) Eidogen‐Sertanty, Oceanside CA 92056, United States

With the explosive growth of mobile computing environments, including the iPhone, Android‐based devices, the iPad, and its fast‐followers, it has become important for scientific software companies to enable technology and content access on these ubiquitous devices. Coupled with cloud computing environments (e.g. Amazon's EC2 and RDS environments), these platforms represent the new frontier for scientific computing. We will describe both technical and business challenges and lessons learned as we developed our mobile apps ‐ iKinase, iKinasePro, iProtein, and MobileReagents.

CINF 48
Chemistry in your pocket: Shrinking cheminformatics applications for mobile devices

Alex M Clark(1), [email protected], 1900 St. Jacques West #302, Montreal Quebec H3J2S1, Canada. (1) Molecular Materials Informatics, Montreal Quebec H3J2S1, Canada

Internet resources are now a routine part of the workflow of a research chemist, and in recent years many of these services have been made accessible from ultra‐portable devices such as smartphones and tablet computers. Efforts have been hampered by the need to draw chemical structures to access certain functionality, e.g. searching databases by structure. To a large extent mobile devices have been limited to use for content consumption. Implementing a chemical structure sketching interface on a tiny device is difficult, because the traditional paradigm requires an accurate pointing device, such as a mouse. A finger on a touchscreen is simply too clumsy for standard structure drawing techniques, and many devices lack a pointing device entirely. This presentation will describe a new approach to drawing 2D chemical structures, which reevaluates the traditional drawing techniques in order to make them work well with input‐constrained devices. This is accomplished by using a high degree of automation and inference, which is provided by newly developed algorithms. The end result is a mobile application which can be used to create publication quality 2D sketches with a small number of steps, which is convenient to use on a variety of current smartphones and tablets, including BlackBerry, iPhone and iPad devices. Also discussed will be some of the internet‐based applications which are possible now that a viable structure editor is available. With this hurdle removed, a large number of desktop‐based cheminformatics applications can be migrated to smaller devices by splitting the interface between a mobile client and web‐based services. Mobile devices can now be used for creating, managing, viewing and sharing chemical information.

CINF 49
chemicalize.org: Adding chemistry to Web pages and predicted data and links to structures

Alex Allardyce(1), [email protected], Maramaros Koz 3/a, Budapest Pest, Hungary; Andras Stracz(1); Daniel Bonniot(1); Ferenc Csizmadia(1). (1) ChemAxon, Budapest H1037, Hungary

chemicalize.org is a new free online service developed by ChemAxon which adds chemistry to Web pages as well as data and Web pages to structures. The primary use is to parse chemical names from Web page text and serve an annotated Web page version which includes structure images hyper‐linked from the chemical name source. By storing structures and Web page URLs we can search the database to find those Web pages containing any given structure query. For each structure users can also generate structure based prediction results within a user customizable report; predictions include logP, pKa, logD etc. Current developments center around user profiles, 'tracking' structures in newly chemicalized pages and presenting chemicalize.org user activity to give a snapshot of current Web pages and structures that are interesting to chemists online. This presentation will outline the aims of the development, describe the service and current developments, and give an overview of use and user feedback.

CINF 50
Using Campus Guides for leveraging Web 2.0 technologies and promoting the chemistry and life sciences information resources

Svetla Baykoucheva(1), [email protected], White Memorial Chemistry Library, College Park MD 20742, United States. (1) White Memorial Chemistry Library, University of Maryland, College Park MD 20742, United States

The introduction of Campus Guides and a “lighter” version of this program, Lib Guides, in the last few years has created many exciting opportunities for science librarians to promote the chemistry and life sciences information resources in a new way using multimedia and social networking tools. The flexibility and the wide range of solutions these programs provide have tempted librarians to use them in many innovative ways, which has not been possible to do in static web pages controlled by rigid rules and other external factors. This presentation will show how users have responded to the new dynamic information environment created with Campus Guides and what the statistical data show about their preferences toward particular information resources in chemistry and the life sciences.

CINF 51
How the web has weaved a web of interlinked chemistry data

Antony J Williams(1), [email protected], 904 Tamaras Circle, Wake Forest NC 27587, United States. (1) ChemSpider, Royal Society of Chemistry, Wake Forest NC 27587, United States

The internet has provided access to unprecedented quantities of data. In the domain of chemistry specifically, over the past decade the web has become populated with tens of millions of chemical structures and related properties of assays together with tens of thousands of spectra and syntheses. The data have, to a large extent, remained disparate and disconnected. In recent years with the wave of Web 2.0 participation, any chemist can contribute to both the sharing and validation of chemistry‐related data whether it be via Wikipedia, the online encyclopedia, or one of the multiple public compound databases. This presentation will offer a perspective of what is available today, our experiences of building a public compound database to link together the internet, and a suggested path forward for enabling even greater integration and connectivity for chemistry data for the

masses to both use and participate in developing.

CINF 52
What is the Internet doing to chemistry and our brains?

Stephen Heller(1), [email protected], 100 Bureau Drive, Bldg 221, Gaithersburg Maryland 20899, United States. (1) NIST, Gaithersburg Maryland 20899, United States

The Internet, like any technology, has good, bad, and ugly sides to it. This lecture will attempt to talk about these aspects with examples in chemistry that should both enlighten and disturb.

CINF 53
Bridging the gap: Publishing and consuming the scientific literature in a digital, device‐agnostic world

David P Martinsen(1), [email protected], 1155 16th ST NW, Washington DC 20036, United States. (1) American Chemical Society, Washington DC 20036, United States

Scientific publishing has seen a steady transition from the primarily paper‐based model of the pre‐2000 era to the digital world of the late 1990s and now the first decade of the 21st century. While usage analysis, as well as end‐user studies, indicate that paper, or at least PDF files printed out on paper, are still the preferred way for most scientists to interact with the scholarly literature, there is a growing percentage of scientists who are asking for more. New data formats, new devices, and new applications present a challenge for publishers as well as authors and readers. Publishers try to keep up with the demands of authors and readers who want to push the technology, while at the same time addressing the more modest concerns of the majority of scientists who just want to get the article text and not be bothered with bells and whistles. While some call for a revolution in publishing, the reality is a much slower evolution. Publishers, authors, editors, reviewers, and readers all make inputs into the ecosystem, and each responds, sometimes in unexpected ways, to the changes that are made. As the journal of the future and the article of the future emerge from the old models, it is useful to consider the impact of those changes.

CINF 54
Open access in chemistry: Information wants to be free?

Jan Kuras(1), [email protected], 236 Gray's Inn Road, London WC1X 8HB, United Kingdom; Bryan Vickery(1); Deborah Kahn(1). (1) Chemistry Central, London, United Kingdom

The open access (OA) publishing movement was motivated by a desire to increase visibility and dissemination of scientific information. and the advent of the Internet helped establish and accelerate the growth of OA in the early 2000s. Acceptance and uptake was significant amongst e.g. the high‐energy physics and biomedical research communities as demonstrated by the success of initiatives such as ArXiv, BioMed Central, and the Public Library of Science. In chemistry, the growth of OA has been more conservative. This presentation will review the development of OA in chemistry, examine the current situation with reference to recent studies, and look forward to future directions, in particular with the emergence of other open data initiatives and Web technologies.

CINF 55
OpenTox: An open‐source web‐service platform for toxicity prediction

David A Gallagher(1), [email protected], 13690 SW Otter Lane, Beaverton Oregon, United States; Barry Hardy(2); Sunil Chawla(3). (1) CAChe Research

34

Chemical Information Bulletin Vol. 63(1) Spring 2011

LLC, Beaverton Oregon, United States (2) Douglas Connect, Zeiningen, Switzerland (3) Seascape Learning LLC, Cupertino California, United States

The new European Union (EU) REACH chemical legislation will require 3.9 million additional test animals, if no alternative methods for toxicity prediction are accepted. However, the number of test animals could be significantly reduced by utilizing existing experimental data in conjunction with (Quantitative) Structure Activity Relationship ((Q)SAR) models. To address the challenge, the European Commission has funded the OpenTox (www.OpenTox.org) project to develop an open source web‐service‐based framework that provides unified access to experimental toxicity data, in silico models (including (Q)SAR), and validation/reporting procedures. Now, in the final year of the initial three‐year project, the current state of architecture, Open API, algorithms, ontologies, and approach to web services will be presented. Our experiences on current collaborative approaches aiming to combine OpenTox with other systems such as CERF, , CDK, and SYNERGY to create “super‐interoperable K‐infrastructure” will be discussed both in terms of conceptual promise and implementation reality.

CINF 56
CAS Registry: Maintaining the gold standard for chemical substance information

Roger Schenck(1), [email protected], 2540 Olentangy River Road, Columbus OHIO 43202‐1505, United States; John Zabilski(1). (1) Department of Content Planning, Chemical Abstracts Service, Columbus OHIO 43202‐1505, United States

CAS has traditionally built its databases from the journal and patent literature. With the advent of the Internet, CAS now has another major source of chemical substance information. This presentation will discuss these internet resources and how CAS evaluates them for inclusion in CAS REGISTRY, while maintaining its quality standards. Since 1965, the scientific experts at CAS have identified more than 56 million organic and inorganic substances. This presentation will examine the sources of this growth and illustrate what CAS is doing to keep pace with this explosion in small molecule chemistry.

CINF 57
Evolution of the science journal and the chemical publication

Henry S Rzepa(1), [email protected], Exhibition Road Campus, London London SW7 2AY, United Kingdom. (1) Department of Chemistry, Imperial College London, London SW7 2AY, United Kingdom

The concept of a modern journal becomes 346 years old in 2011 (DOI: 10.1098/rstl.1665.0001), although only since 1994 has the journal article been embedded in the Internet and Web era (DOI: 10.1039/C39940001907). Although the structure of the article itself morphed little during the first part of the Internet age, there are now signs that many aspects of its creation and dissemination are starting to evolve more rapidly. Here, several potential future enhancements are reviewed, including the role of the scientific blog in augmenting the effectiveness of the peer‐review processes, the role of data‐integrity within the article, integration of Web‐enhanced and other data‐rich and functional objects, the role of open digital repositories, article semantification, and delivery and re‐functionalisation of the re‐invented article via new generations of mobile personal devices.

CINF 58
Collaborative QSAR analysis of Ames mutagenicity

Eugene Muratov(1)(2), [email protected], Beard Hall, CB7568, Chapel Hill NC 27599,
United States; Denis Fourches(1); Anatoly Artemenko(2); Victor Kuz'min(2); Guiyu Zhao(1); Alexander Golbraikh(1); Pavel Polischuk(2); Ekaterina Varlamova(2); Igor Baskin(3); Vladimir Palyulin(3); Nikolai Zefirov(3); Li Jiazhong(4); Paola Gramatica(4); Todd Martin(5); Farhad Hormozdiari(6); Phuong Dao(6); Cenk Sahinalp(6); Artem Cherkasov(7); Tomas Oberg(8); Roberto Todeschini(9); Vladimir Poroikov(10); Alexey Zaharov(10); Alexey Lagunin(10); Dmitriy Filimonov(10); Alexandre Varnek(11); Dragos Horvath(11); Gilles Marcou(11); Cristophe Muller(11); Lili Xi(12); Huanxiang Liu(12); Xiaojun Yao(12); Katja Hansen(13); Timon Schroeter(13); Klaus‐Robert Muller(13); Igor' Tetko(14); Iurii Sushko(14); Sergii Novotarskyi(14); Nancy Baker(1); Jane Reed(15); Julia Barnes(15); Alexander Tropsha(1). (1) University of North Carolina, Chapel Hill NC, United States (2) A.V. Bogatsky Physical‐Chemical Institute NAS of Ukraine, Chapel Hill NC, United States (3) Moscow State University, Moscow, Russian Federation (4) University of Insubria, Varese, Italy (5) US Environmental Protection Agency, Cincinnati OH, United States (6) Simon Fraser University, Burnaby, Canada (7) University of British Columbia, Vancouver, Canada (8) University of Kalmar, Kalmar, Sweden (9) University of Milano‐Bicocca, Milan, Italy (10) Institute of Biomedical Chemistry RAS, Moscow, Russian Federation (11) University of Strasbourg, Strasbourg, France (12) Lanzhou University, Lanzhou, China (13) Technical University of Berlin, Berlin, Germany (14) Institute for Bioinformatics, Nuremberg, Germany (15) BioWisdom Ltd, Cambridge, United Kingdom

We report the results of a collaborative QSAR modeling project between 15 teams to develop predictive computational QSAR models of in vitro Ames mutagenicity induced by organic compounds. The Ames dataset consisted of 6542 compounds (after curation). In total, 32 predictive classification QSAR models were developed using different combinations of chemical descriptors and machine learning approaches, representing the most extensive combinatorial QSAR modeling study ever done in the cheminformatics field in the public domain. The resulting consensus model had the highest external predictive power, nearly reaching the experimental reproducibility of 85% for the Ames test. In addition, we found published evidence indicating that 31 of 130 outliers (29 mutagens and 2 non‐mutagens) were erroneously annotated in the original dataset. This work presents a model of collaboration that integrates the expertise of participating laboratories to establish the best practices and most reliable solutions for difficult problems in chemical and computational toxicology.

CINF 59
How (not) to build a toxicity model

Adam C Lee(1), adam@simulations‐plus.com, 42505 10th Street West, Lancaster CA 93534, United States; Robert Fraczkiewicz(1); Robert Clark(1); Walter S Woltosz(1); Marvin Waldman(1); John Chung(1). (1) Department of Life Sciences, Simulations Plus, Inc., Lancaster CA 93534, United States

When a seemingly well‐curated chemical data set hits the press, a modeler's first impulse is to apply their preferred QSAR method to the data in hopes of building a model that exhibits superior statistics to other published models. Occasionally, the results appear too good to be true. Are these models useful? This work details a procedure for building a useful and well‐validated model, using respiratory sensitization data. We highlight the do's and don'ts of data selection, pre‐ and post‐ data curation, QSAR methodologies, and validation strategies implemented from 1984 to present. The examples demonstrate how to identify a narrow sampling of chemical space by examining good‐looking
models, applying a model to (believable) real‐world data in order to determine its usefulness both inside and outside the model's applicability domain, and techniques that modelers (should) use to validate as well as assess the robustness of a model.

CINF 60
Metabolic site prediction using artificial neural network ensembles

Marvin Waldman(1), marv@simulations‐plus.com, 42505 10th Street West, Lancaster CA 93534, United States; Robert Fraczkiewicz(1); Jinhua Zhang(1); Robert D. Clark(1); Walter S. Woltosz(1). (1) Simulations Plus, Inc., Lancaster CA 93534, United States

Hepatic first‐pass metabolism of drugs and prodrugs plays a key role in oral bioavailability, and the cytochrome P450 enzymes are responsible for metabolism of most drugs. Knowledge of likely sites of metabolic attack in a drug molecule can aid in designing out unwanted metabolic liabilities early on in the drug discovery process as well as in the design of prodrugs where metabolic transformation is desired. Using datasets constructed from literature compilations and commercially available databases, we have constructed models based on artificial neural network ensembles that predict one or more likely sites of metabolism for a given molecule for several CYP isoforms including 2C9, 2D6, and 3A4. The models employ atomic descriptors describing charge, reactivity, steric accessibility, and other properties of the candidate atom and its local environment. Model performance will be shown based on various statistical criteria as well as specific examples demonstrating scope and limitations.

CINF 61
WITHDRAWN

CINF 62
Use and results of using an online chemistry laboratory package in a large general chemistry course

Richard L Nafshun(1), [email protected], 139 Gilbert Hall, Corvallis Oregon 97331, United States. (1) Department of Chemistry, Oregon State University, Corvallis Oregon 97331, United States

In addition to traditional on‐campus general chemistry courses, the Department of Chemistry at Oregon State University has been offering an online general chemistry sequence since 2003. We have struggled to identify a method of facilitating an appropriate distance laboratory program. We have investigated a "kitchen" chemistry kit and various online virtual toolboxes. We are currently using a virtual laboratory package (www.onlinechemlabs.com) which presents the user with a split screen: one side contains chemistry laboratory tools and the other is text. The tools include standard experimental equipment such as an analytical balance, flasks, pipettes, and reagents, as well as more complex analytical instruments or reaction equipment such as an absorbance spectrophotometer, calorimeter, NMR, and a combustion chamber. The logical progress (or flow) of these tools in experiments is analogous to that in classroom labs. The tools incorporate both random and systematic error, providing data simulations where detailed error analyses can be performed that are analogous to those in classroom laboratory experiments. Each of these features allows for a significant enhancement in instructional capabilities, and could integrate very well with the instructional modalities of models and argumentation that have been recently developed and outlined in more detail below. Results of the use of the online chemistry laboratory package in three different modes
(fully online/hybrid/supplemental) and methods of use will be discussed.

CINF 63
Reaction prediction as ranking molecular orbital interactions

Matthew A Kayala(1), [email protected], 243 ICS 2, Irvine CA 92697, United States; Chloe A Azencott(1); Jonathan H Chen(1); Pierre Baldi(1). (1) Department of Computer Science, University of California, Irvine, Irvine CA 92697, United States

Being able to predict the course of chemical reactions is essential to the practice of chemistry. While computational approaches to this problem have been extensively studied in the past, a fast, accurate, and scalable solution has yet to be described. Here, we propose a novel formulation of reaction prediction as a machine learning ranking problem: given a set of molecules and a description of conditions, learn a ranking over potential filled to unfilled molecular orbital (MO) interactions approximating the corresponding transition state energy ranking. Using an existing rule‐based expert system (ReactionExplorer), we derive a restricted chemistry dataset consisting of 1300 full multi‐step reactions with 2200 distinct starting materials and intermediates. This yields 3600 predicted MO interactions and 14 million unpredicted MO interactions. A two‐stage machine learning scheme is used to learn the model. First, we train reactive site predictors using a combination of topological and real‐valued global features to filter out 61% and 44% of non‐predicted filled and unfilled MOs with a 0.0001% error rate. Then various ranking models are trained on the MO interactions using features engineered to approximate transition state entropy and enthalpy. Using cross‐validation, current best models recover a perfect‐ranking 61% of the time and recover a within‐4‐ranking 95% of the time.

CINF 64
Automated semantic data embargo and publication by the CLARION project

Samuel E Adams(1), [email protected], Department of Chemistry, Lensfield Road, Cambridge Cambridgeshire CB2 1EW, United Kingdom; Nick Day(1); Jim Downing(1); Peter Murray‐Rust(1); Brian Brooks(1). (1) Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Cambridge Cambridgeshire CB2 1EW, United Kingdom

The CLARION project has created the infrastructure to enable research chemists to make selected data available as Open Data, shared over the Semantic Web, without requiring technical expertise themselves. Data is automatically collated from central services, such as the Departmental Crystallographic Service, and chemists' Electronic Lab Notebooks. An Embargo Manager application presents research groups with a view of the data they own, and allows them to set embargo conditions and add additional metadata. Once the embargo period expires, data is automatically semantified and deposited as Open Data in a public Chem# repository.

CINF 65
Chemical eCommerce

Klaus Gubernator(1), [email protected], 380 Stevens Avenue #311, Solana Beach CA 92075, United States. (1) eMolecules, Inc., Solana Beach CA 92075, United States

Chemists are late adopters of the internet. The main obstacle is that search engines and eCommerce systems are text‐based and as such inherently inadequate to handle chemical structures. Also, chemical nomenclature and names are poorly standardized and inconsistently used by both suppliers and buyers of chemicals. Therefore, only the combination of a chemical search
engine and a chemical eCommerce system can address the needs of the market. Such a system has to handle millions of chemical structures, return results in seconds, and provide tools to handle lists of thousands of molecules. In addition, user expectations are created by their experiences with Amazon and eBay: Prices and availability should be online. The purchasing process is expected to be predictable: you get what you order on time. Implementing and operating a chemical eCommerce system therefore requires a paradigm shift in the quality of the entire purchasing process.

CINF 66
Waiting on the Chemical Internet

Steven M. Bachrach(1), [email protected], 1 Trinity Place, San Antonio TX 78212, United States. (1) Department of Chemistry, Trinity University, San Antonio TX 78212, United States

The chemical internet dates back roughly to 1994. Over that time the impact of the Internet and the web on society in general has been overwhelming. Businesses have come and gone, communication has evolved from web sites to blogs to tweets. But for chemists, the impact has been of much less significance. The talk will present some of the causes of the slow uptake of the Internet by chemists and what potentially the future might hold for us.

CINF 67
Rapid dissemination of chemical information for people and machines using Open Notebook Science

Jean‐Claude Bradley(1), [email protected], [email protected], 32nd and Chestnut streets, Philadelphia PA 19104, United States; Andrew SID Lang(2). (1) Department of Chemistry, Drexel University, Philadelphia PA 19104, United States (2) Department of Mathematics, Oral Roberts University, Tulsa OK 74171, United States

This presentation will cover methods and tools used to collect, record and disseminate chemical information using Open Notebook Science, the practice of making a laboratory notebook and all associated raw data available publicly in as close to real time as possible. Both solubility measurements and organic chemistry reactions are handled in this way. The recording of laboratory data is handled primarily using free and hosted services such as Wikispaces and Google Spreadsheets. The information is made discoverable using redundant communication channels, including Google, , Wikipedia and other vehicles. The abstraction of key elements from the solubility measurements and the chemical reactions allows for the use of live machine‐readable feeds and web services. The implications for the future of the automation of the scientific process based on Open Data and Open Services will be discussed.

Committee Report

CINF Communications & Publications Committee

Transfer from the CINF Yahoo! Group to the ACS Network

The decision to close the CINF Yahoo! Group and transfer all CINF Division business to the ACS Network has been implemented. The CINF Yahoo! Group still exists but access is limited to the former group moderators and the group will not be closed until we can decide how to preserve the email archive.

The CINF Division group on the ACS Network has now grown to 127 members and CINF members have begun to use the group in a serious manner after some initial reluctance due to unfamiliarity with the new network. The discussion on the new CINF website has generated almost 900 views and a large number of postings. This group is open to all ACS Network members. There are also closed groups for the CINF Executive and also for this committee where private business can be conducted.

Switchover to the new CINF website

In January, Danielle Dennie, the new CINF webmaster, reported as follows:

“Ideally, I would have liked to survey or talk with members of CINF to ask you how you use the site and what you would like to see in a new site. This would have meant that I would have kept the old site while gathering data that could be used to create the new site. Unfortunately, because of my limited knowledge of the software that was used to create the old site, I could not make any edits to it. Which means that if there were any updates to be made to the old site before a new site could be built, I would not have been able to make them.

“Therefore, I quickly designed a new template that I could work with. To make it easy for myself, and for users of the site, I kept the same logical organization that was on the old site. This means that the menu on the left hand side is practically (with minor exceptions) the same as the old menu, as well as the organization of the secondary pages.

“That being said, I was not able to transfer over all content. Specifically, there are 3 sections that I could not transfer:

“Because the old site used a database to generate past meetings information, I have not, for the moment transferred over the tremendous amount of content that was in that section of the site. Therefore, I simply link to it from the new site (http://acscinf.org/meetings/past.php).

“I did not transfer over volume 62 of the e‐CIB newsletter (http://acscinf.org/publications/bulletin.php). However, for the upcoming e‐CIB, I will create a new template. Perhaps once a new template is agreed upon, I will be able to transfer over volume 62.

“The CINF electronic newsletter has not, as yet, been transferred to the new template (again, because of the tremendous amount of content). A link was made to the individual newsletters on the old site (http://acscinf.org/publications/enews.php).

“Furthermore, there are a couple of links in the left hand menu that I did not add to the new site. If these are needed, please let me know and I will add them:

Surveys
Metrics
Disclaimers

“Otherwise, I went through every section of the old site, and recoded each page that I came across to fit the new template. If there were files on the old site that were orphan pages (i.e. not linked to from any other page), these files were unfortunately probably not transferred over. I hope I did not miss anything too important. If you notice anything glaring, please let me know.

“Overall, there are still some little tweaks that I need to bring to the site, but the bulk of the work is completed. I look forward to working with everyone to make the site as user‐friendly as possible. If you have any questions or concerns, please don’t hesitate to contact me.”

The new website has been well received by CINF members, but we still have some way to go to achieve our vision of a website to which it will be easier for CINF members to post content themselves.

eCIB editorship for 2011

At the end of 2010, Svetla Baykoucheva retired as eCIB editor but will continue to contribute actively to future editions.

For Spring 2011, David Martinsen has agreed to be guest editor and will experiment with new workflows based on using the ACS Network. Svetlana Korolev will edit the Summer and Winter editions, which follow and report on National ACS meetings, and Judith Currano will edit the Fall edition.

Awards and Scholarships

2011 CINF Scholarship for Scientific Excellence Sponsored by FIZ Chemie Berlin

The scholarship program of the Division of Chemical Information (CINF) of the American Chemical Society (ACS) funded by FIZ Chemie Berlin is designed to reward graduate and postdoctoral students in chemical information and related sciences for scientific excellence and to foster their involvement in CINF.

Up to three scholarships valued at $1,000 each will be presented at the 242nd ACS National Meeting in Denver, CO, August 28 – September 1, 2011. Applicants must be enrolled at a certified college or university, and they will present a poster during the Welcoming Reception of the division on Sunday evening at the National Meeting. Additionally, they will have the option to also show their poster at the Sci‐Mix session on Monday night. Abstracts for the poster must be submitted electronically through PACS, the abstract submission system of ACS.

To apply, please inform the Chair of the selection committee, Guenter Grethe at [email protected], that you are applying for a scholarship. Submit your abstract to http://abstracts.acs.org using your ACS ID. If you do not have an ACS ID, follow the registration instructions and submit your abstract for "CINF Scholarship for Scientific Excellence". The deadline for submitting an abstract to PACS is April 1, 2011. Additionally, please send a 2,000‐word abstract describing the work to be presented in electronic form to the Chair of the selection committee by June 30, 2011. Any questions related to applying for one of the scholarships should be directed to the same e‐mail address.

Winners will be chosen based on contents, presentation and relevance of the poster and they will be announced during the reception. The contents will reflect upon the student’s work and describe research in the field of cheminformatics and related sciences. Winning posters will be marked "Winner of FIZ Chemie‐CINF Scholarship for Scientific Excellence" at the poster session.

Guenter Grethe

Awards and Scholarships

Applications Invited for CSA Trust Jacques‐Émile Dubois Grants for 2012

The Chemical Structure Association (CSA) Trust is an internationally recognized organization established to promote the critical importance of chemical information to advances in chemical research. In support of its charter, the Trust has created a unique Grant Program, renamed in honor of Professor Jacques‐Émile Dubois, who made significant contributions to the field of cheminformatics. The Trust is currently inviting the submission of grant applications for 2012.

Purpose of the Grants: The Grant Program has been created to provide funding for the career development of young researchers who have demonstrated excellence in their education, research or development activities that are related to the systems and methods used to store, process and retrieve information about chemical structures, reactions and compounds. A Grant will be awarded annually up to a maximum of five thousand U.S. dollars ($5,000). Grants are awarded for specific purposes, and within one year each grantee is required to submit a brief written report detailing how the grant funds were allocated. Grantees are also requested to recognize the support of the Trust in any paper or presentation that is given as a result of that support.

Who is Eligible? Applicant(s), age 35 or younger, who have demonstrated excellence in their chemical information related research and who are developing careers that have the potential to have a positive impact on the utility of chemical information relevant to chemical structures, reactions and compounds, are invited to submit applications. While the primary focus of the Grant Program is the career development of young researchers, additional bursaries may be made available at the discretion of the Trust. All requests must follow the application procedures noted below and will be weighed against the same criteria.

Which Activities are Eligible? Grants may be awarded to acquire the experience and education necessary to support research activities; e.g. for travel to collaborate with research groups, to attend a conference relevant to one’s area of research, to gain access to special computational facilities, or to acquire unique research techniques in support of one's research.

Application Requirements: Applications must include the following documentation:

1. A letter that details the work upon which the Grant application is to be evaluated as well as details on research recently completed by the applicant;

2. The amount of Grant funds being requested and the details regarding the purpose for which the Grant will be used (e.g. cost of equipment, travel expenses if the request is for financial support of meeting attendance, etc.). The relevance of the above‐stated purpose to the Trust’s objectives and the clarity of this statement are essential in the evaluation of the application;

3. A brief biographical sketch, including a statement of academic qualifications;

4. Two reference letters in support of the application.

Additional materials may be supplied at the discretion of the applicant only if relevant to the application and if such materials provide information not already included in items 1‐4. Three copies of the complete application document must be supplied for distribution to the Grants Committee.

Deadline for Applications: Applications must be received no later than March 14, 2012. Successful applicants will be notified no later than May 2, 2012.

Address for Submission of Applications: Three copies of the application documentation should be forwarded to: Bonnie Lawlor, CSA Trust Grant Committee Chair, 276 Upper Gulph Road, Radnor, PA 19087, USA. If you wish to enter your application by e‐mail, please contact Bonnie Lawlor at [email protected] prior to submission so that she can contact you if the e‐mail does not arrive.

What Do Libraries Have to Do with e‐Science?

An Interview with James L. Mullins, Dean of Purdue University Libraries

By Svetla Baykoucheva

James L. Mullins has been Dean of Libraries and professor of library science at Purdue University since 2004. Before that he was associate director for administration of the Massachusetts Institute of Technology (MIT) Libraries. His career of more than thirty years includes administrative positions at Villanova University and Indiana University. He earned BA and MALS degrees from the University of Iowa and a PhD from Indiana University.

Dr. Mullins has served in leadership positions within the American Library Association (ALA) and the Association of Research Libraries (ARL) and presently is an elected member of the ARL board of directors and chair of the e‐Science Working Group. Presently he serves on the editorial board of the journal College and Research Libraries. He is also on the board of directors of the International Association of Scientific and Technological University Libraries (IATUL) and the Center for Research Libraries (CRL), and is a delegate to the Science and Technology Section of the International Federation of Library Associations (IFLA). Last June, Purdue was host to the 2010 IATUL Conference, which focused on the role of libraries in e‐science. He was a signatory to the formation in December 2009 of DataCite, an international consortium assigning digital object identifiers (DOI) to datasets for citation.

Dr. Mullins is a frequent contributor to the professional literature, speaks at national and international conferences, and consults with research libraries and universities internationally on challenges facing research communication and dissemination. He has served on National Science Foundation (NSF) panels, including one in 2006 recommending that data management plans be required for NSF research funding.

Svetla Baykoucheva: The new buzzword in academic libraries is "e‐Science." It is also called "eScience." We are seeing job announcements for e‐Science librarians, conferences on e‐Science being organized, the Association of Research Libraries (ARL) publishing a on it, and NSF introducing new requirements for data management. What is e‐Science?

James L. Mullins: In 1999, John Taylor, the Director General of the United Kingdom's Office of Science and Technology, created the term to describe computationally‐intensive science that draws upon large data sets and, through modeling and algorithms, tests assumptions. In today's world, scientists rarely use the term e‐Science since computational methodologies have become so embedded in the research process that it hardly warrants distinctive nomenclature.

SB: Last year you organized a conference on e‐Science. What were the topics discussed at this conference? Could you point to some future conference on e‐Science?

JM: Purdue was host to the 31st Annual Conference of IATUL (International Association of Scientific and Technological University Libraries); the theme of the program was: “The Evolving World of e‐Science: Impact and Implications for Science and Technology Libraries.” The intent of the conference was to start with the broadest concept—what is e‐science/computational science, what is the role of data in computational science and how are scientists coping (or not) with managing data? The keynote speaker was Dr. Dan Kleppner of MIT who co‐chaired a task force for the National Academies on issues related to data. In addition, Dr. Arden Bement, who had stepped down as director of the NSF a few weeks before the conference, spoke about the interest the funding agencies have in ensuring that data generated through sponsored research would be available generally to researchers. Dr. Bement assumed the position of executive director of the Global Policy Research Institute at Purdue earlier that month, so his interest was twofold: the management of data and the need to create a global policy on data management to facilitate research. Most of the program was focused on how data can be managed and what the role can or should be for librarians; so it wasn’t just a theoretical discussion, as it provided an opportunity for librarians to gain knowledge of the processes that could assist them in developing e‐science programs in their institutions. Rather than having me provide a complete summary of the program, it would be easy for readers who are interested in the topics to go to the website: http://blogs.lib.purdue.edu/iatul2010/program/.

There are many organizations that have a focus on e‐science/data management within the international library community, especially the Digital Curation Centre (DCC) in the United Kingdom: http://www.dcc.ac.uk/events. In the United States, the Distributed Data Curation Center (D2C2) at Purdue is a research center focused on exploring and researching ways in which data can be accessed and archived. A fuller description is available at http://d2c2.lib.purdue.edu/index.php. The Coalition for Networked Information (CNI) often has papers focused on e‐science and data management at its twice‐annual briefing sessions. Also, the CNI website (http://www.cni.org/regconfs/) has a list of upcoming conferences and workshops, including ones on e‐science/data management.

Finally, the Association of Research Libraries (ARL) and the Digital Library Federation (DLF) are in the early stages of developing an e‐science institute planned for fall, 2011. Initially the Institute will be open to sponsoring libraries (ARL/DLF members), but the intent is that it will be repeated for the broader community in 2012.

SB: How do you see the role that librarians could play in this new area? What kind of expertise will be required from them?

JM: Working in the area of data management draws upon the principles of library and archival sciences. Our ability to see structure to overlay on a mass of disparate “parts,” as well as the ability to identify taxonomies to create a defined language for accessing and retrieving data is what is needed from us. The challenge will be for librarians to understand that we have collections that we cannot see and may not actually understand the importance of, but that we will have a responsibility to steward and preserve for researchers now and in the future. Archival science is important since there are requirements and expectations from investigators that there will be limited access to data that will require that an embargo be in place. Just as people can give their personal papers to archives with an expectation that access will be limited to specific researchers or closed for a period of time, researchers may similarly want to protect their intellectual property by creating an embargo. For librarians this would normally be unacceptable, while for archivists this is standard procedure. I also think it helps us to think about our present print archives as being raw bits of data, until a researcher (typically a humanist or social scientist) "mines" them to answer a research question, which is similar to scientists or engineers consulting digital data in their research.

SB: Will e‐Science change the way academic libraries function? Will it change the infrastructure and the services libraries provide?

JM: Many of our librarians (even those working in scientific and engineering disciplines) often have humanities or social sciences backgrounds. However, the trepidation that many librarians may have about sitting down with researchers and discussing their data management needs shouldn’t be a controlling factor. Once a librarian has the experience of talking with researchers about their research and the challenges they have with managing data, it becomes clear that the most important factor is not our subject expertise (although some subject understanding is needed) but rather the librarian’s knowledge of metadata and taxonomies. In the old days we would have said that this is “cataloging and classification,” but today, to convey that we have morphed into a new role, it is best to use the more technical terminologies since it may help identify our “new” role as a cutting edge initiative and not be encumbered with past misperceptions. In fact, a few times I have seen researchers frustrated by librarians with significant subject expertise, who more or less intrude their subject knowledge into what the investigator is researching, while what investigators want is the library/archival science contribution to their team. We need to remember that and be proud of the special expertise that we as librarians bring to the research team. The impact for libraries in the broadest sense is the recognition that we have an important role to ensure the archiving and preservation of important data sets that initially may not be apparent to the researchers or us. We need to be able to think of treating these data sets as important collections, which is not that dissimilar to how we have stewarded our print book and serial collections or our archives. 
Responsibility for digital data brings new challenges and cost models—ones that we will need to work through with our university administrations and develop further collaboration with our colleagues in research administration and information technology.

SB: What kind of problems do you see for librarians to be able to get involved in e‐Science? Will faculty be willing to share raw data with outsiders and how could this potentially affect intellectual property rights?

JM: I have touched on some of the problems for librarians to become involved with e‐Science; so I will focus on the second part of your question. And the simple answer, from my perspective, is, "it depends." The one thing we have learned from the work we have done so far with disciplinary faculty and their research is that no two disciplines have identical policies or principles guiding them about sharing data. When we at Purdue embarked on this work six years ago, we thought it was going to be simple to help researchers manage and share their data. However, that naïve assumption was soon disproved. Some disciplines share data through a central database available to all, while others keep their data "close to the vest" while the research is being undertaken and are willing to share it only when it is needed to document findings in a published research article.

The mandate by the NSF and the likelihood this will be adopted by other funding agencies will trump, possibly, the traditions of data sharing (or not) within a field. It will take some time before it becomes an accepted, required step of the process. The NSF mandate is a start, but ultimately it will gain acceptance when researchers themselves begin to see benefits of sharing data beyond what they have done in the past.

SB: How will e‐Science affect the way research is performed and reported? What will be the consequences for the science and technology publishing field?

JM: Some of the effects have been discussed above; so I won’t go back over them here. But I will amplify some of the potential impact that may come from the availability of data and the requirements necessary to provide that access. During the past several years, the publishing industry has begun to assign digital object identifiers (DOIs) through the service provided by CrossRef. This has been very successful as it assigns a persistent identifier that will tag an article for retrieval, now and far into the future. The DOI serves somewhat like a barcode or ISBN, a unique tag that provides access to the article. So, with this ability to identify the article, there comes the concurrent need or desire to link relevant data to it. That initiative has been taken on by libraries around the world, through the development of the international organization called DataCite (http://datacite.org/). Its charge is to create a registry available to researchers throughout the world to permanently tag a data set, and provide enough description to allow for access and retrieval, if desired by a researcher. In the United States, the coordination and assignment of DOIs through DataCite is being undertaken by the California Digital Library (CDL), Purdue University Libraries and the Office of Scientific and Technical Information (OSTI) of the Department of Energy (DOE).

Creating DataCite and assigning DOIs is a major undertaking, not unlike what took place forty years ago with ISBN; the difference is that ISBN was a collaboration between publishers and national libraries, which had the reach and the clout to make it a standard in a short time and which were dealing with a finished product (a book). For DataCite, it is a few international libraries banding together to try to get this elephant headed in the right direction. At this time, the DOI assignment to a data set is not mandatory. There is a possibility, however, with OSTI recently joining DataCite, that the DOI assignment will become a requirement by funding agencies.

SB: I have done many interviews for the Chemical Information Bulletin, but this is the first time I am interviewing a dean of libraries. And I would like to ask you a question that all academic librarians are asking: how do you see academic libraries and the work librarians are doing changing in the next few years? As dean of libraries at such a prominent institution as Purdue, what changes are underway in your own libraries?

JM: There is a shift from the trend that was happening ten years ago, which was the reduction of the number of librarians and other professionals and the increase in the number of clerical and student staff. In the "post print" world, the effort necessary to acquire, check‐in, catalog, bind, and manage print collections has been significantly reduced. However, the work that needs to be done in collaboration with the faculty in the classroom and lab has increased.

At Purdue, librarians are full members of the professorial faculty, and with that comes an expectation that they not only ensure that the Purdue Libraries operate using sound library science principles, but that the latest initiatives be evaluated and integrated if deemed appropriate into the operations and services of the Libraries. However, in order to extend the work of the librarians, it is becoming clearer and clearer that we need to move much of the day‐to‐day management and such services as reference and cataloging/metadata operations to another tier of professional and clerical staff, trained and able to do these operations. This frees up the librarians to collaborate on information literacy instruction, research team collaboration, and research in the areas of changing scholarly communication models. If anyone came or is coming to librarianship thinking it would be a static, complacent, and quiet place to work, they may want to reconsider!

Photo: Jeremy Garritano, Associate Professor of Library Science and Chemical Information Specialist, at the M. G. Mellon Library of Chemistry at Purdue University, instructing a class in the Mellon Chemistry Library's Cyber Lab.

SB: On a personal note, could you tell us about something that interests you besides information science and librarianship?

JM: One of the great advantages of being a librarian is that we have the ability to explore so many aspects of knowledge and to follow the curiosity that I believe is an important trait that all librarians must have. Although I have a great love of travel and a commitment to international librarianship through participation in IFLA and IATUL, I don't consider that as my sideline interest, as it is still, for the most part, professional. I can give you an example of what I am reading for pleasure, pure enjoyment, and that is about the beginning of the Cold War, from the end of World War II and through the 1960's, into the Vietnam Era. Being a child during the 1950's, I remember so well our fear of the Chinese and the Soviets/Russians and the competition that was in place to out‐achieve the Soviets in science and technology. We were aware that we could be destroyed any day by nuclear war, but as a child I really had no idea what the reason was. I remember watching as a boy in the 1950's an old WWII movie made during the War, where the sailors on an American ship began cheering when they realized that the planes they saw overhead were Russian and not Japanese. I remember asking my mother how that could be, and her answer was that they were our allies in the War. In the 1950s that seemed inconceivable. A little like today when we think of Iran. Therefore, I am reading about the beginning of the Cold War period and just finished an excellent book, The Lost Peace: Leadership in a Time of Horror and Hope, 1945‐1953, by Robert Dallek.

SB: It is an interesting coincidence that for this issue I also interviewed Dr. Michael Gordin, who has done extensive research on the beginning of the Cold War and has published on that period. Thank you, Dean Mullins, for discussing e‐Science and for your personal insights.

Political, Cultural, and Technological Impacts on Chemistry

An Interview with Michael Gordin, Director of Graduate Studies of the Program in the History of Science, Princeton University

By Svetla Baykoucheva

Michael Gordin is the Director of Graduate Studies of the Program in the History of Science at Princeton University. He has done extensive research on the history of the modern physical sciences and Russian history. He earned his A.B. (1996) and his Ph.D. (2001) from Harvard University and served a term at the Harvard Society of Fellows. He has published articles on the introduction of science into Russia in the early 18th century, the history of biological warfare in the late Soviet period, the relations between Russian literature and science, and a series of studies on Dmitrii I. Mendeleev. His book on the life and chemistry of Mendeleev1 is considered the most comprehensive and authoritative study published on the formulator of the periodic table of elements. Dr. Gordin has also worked extensively in the early history of nuclear weapons and is the author of Five Days in August: How World War II Became a Nuclear War2 (2007), a history of the atomic bombings of Japan during World War II, and an international history of nuclear intelligence, Red Cloud at Dawn: Truman, Stalin, and the End of the Atomic Monopoly (2009)3. He has also co‐edited the four‐volume Routledge History of the Modern Physical Sciences (2001), Intelligentsia Science: The Russian Century, 1860‐1960 (2008)4, and Utopia/Dystopia: Conditions of Historical Possibility (Princeton, 2010)5. He is now working on a history of the modern category of "pseudoscience" in postwar America, from the age of McCarthy to the counterculture, centering on the sensational career of Immanuel Velikovsky (1895‐1979), whose 1950 best‐seller, Worlds in Collision,6 sparked three decades of controversy over the boundaries of legitimate science. Professor Gordin teaches lecture courses in the history of modern science, technology and society, and translation in the history of science, as well as seminars on nuclear‐weapons history, the history of pseudoscience, the Soviet science system, and biography.

Svetla Baykoucheva: The United Nations has designated 2011 as the International Year of Chemistry, and I am very pleased to be able to interview someone who has performed such extensive research in the field of history of chemistry. Your book on Dmitrii Mendeleev1 shows deep understanding not only of chemistry, but also of the socio‐political environment in Russia at the time. How does the cultural milieu of an epoch, a country, a region, or an organization influence the developments in science and the public attitude about it?

Michael Gordin: This is a great question, and in many ways it is the central concern of the history of science, and clearly there is no straightforward answer to it. There are many factors that influence the development of science at any particular time and place: the experimental equipment and resources available to the scientist, his or her level of education and preparation, access to communication from other scientists, and the general state of science at the time, to name just a few. Some of these factors are pretty tightly bound with intellectual matters, and some of them are more broadly social or cultural, and I think it would be an error to rule out any particular factor by fiat. In some cases, such as Mendeleev's, the need to reform the pedagogy of chemistry for students in St. Petersburg proved crucial to his creating a framework for organizing the elements which eventually grew into the periodic system we know today. The concerns were both social and political (how do you educate a large number of students who have inadequate preparation) and intellectual (the rapidly expanding knowledge of the properties of elements, especially their atomic weights, in the 1860s). That’s not to say we wouldn’t have a periodic table without educational reform in Russia — far from it, as we know by the existence of multiple competing systems. Rather, I mean to say that the form we received has a great deal to do with the specifics of that time and place; the content is a more nuanced philosophical matter. The purpose of the history of science is to elucidate all these various factors and point to their relative weights in specific episodes.

SB: Two of your books (Five Days in August: How World War II Became a Nuclear War2 and Red Cloud at Dawn: Truman, Stalin, and the End of the Atomic Monopoly3) were devoted to nuclear proliferation in the context of the Cold War. How do these topics relate to the history of chemistry?

MG: My colleagues often ask me the same thing. Nuclear weapons in the early Cold War, after all, are indeed a long way from Mendeleev and Imperial St. Petersburg. Certainly as topics they are pretty different, but as ways of investigating the past they are not that far apart. One of the great challenges in writing the history of science is avoiding what we call "Whiggish" interpretations of history; that is, writing a history of the past which leads inevitably to the present, placing the end of the story right there in the beginning. This kind of presentist version of history is very tempting in the history of science, because science’s achievements are so obvious, and seem so inalterable. The important point, from the historical point of view, is that they were not obvious to the scientists engaged in making the discoveries. They were beset by uncertainties, alternatives, doubts, and vigorous arguments. It is the historian's task to capture those uncertainties and show the past as it unfolded, not tell a just‐so story for the present. Well, after publishing the Mendeleev book, I found myself grabbed by a set of questions concerning the early nuclear arms race, and wanted to see if the same approach would yield results there, even if these weren’t, strictly speaking, classic "history of science" questions. For example, in Five Days in August, I focused on how American military officials, politicians, and scientists thought about the atomic bomb in the period before surrender of the Japanese government in August 1945, and especially in the five days between the bombing of Nagasaki and that surrender. At that time, no one could say that the bomb "ended the war," because the war was not yet over; so how did they think about it? Was it a revolutionary weapon or not? And in Red Cloud at Dawn, I concentrated on the period between the end of World War II and the detonation of the first Soviet atomic device in August 1949, in order to explore how people on both the American and Soviet sides evaluated the arms race before, strictly speaking, any such race existed. The approach is heavily indebted to the history of chemistry, even if the topics aren’t. To be honest, I’m looking forward to returning to more chemical questions now that I have spent all this time with nuclear weapons.

SB: You are the co‐editor of a book, Intelligentsia Science: The Russian Century, 1860‐1960, for which you also wrote an article on the Heidelberg Circle, a group of Russian chemists who specialized in Germany and who later founded the Russian Chemical Society. Who were these people and what impact did they have on the development of chemistry both in and outside Russia? What motivated them to choose chemistry as a career? What was the role of learned societies at that time?

MG: Russia entered the decade of the 1860s facing a series of severe challenges. In 1856 it had lost the Crimean War, a defeat which was interpreted by the elite and the intelligentsia as a sign that Russia was "backward" in significant ways with respect to the Western powers. They began to promote a series of military and fiscal reforms in an effort to modernize the state, the most famous of which was the abolition of serfdom in February, 1861. But the problem of technical modernization also occupied these decision makers, and they initiated a program to sponsor talented young scientists (and other scholars, like lawyers and physicians) to study abroad, absorb the very latest word in their specialties, and then return to Russia to help rebuild a self‐sustaining community at home. And, to a great degree, it worked. Many of the leading lights of Russian chemistry, to pick the example I know best, and those behind the formation of the Russian Chemical Society in 1868, were part of this temporary emigration: Dmitrii Mendeleev, Aleksandr Borodin, Vladimir Markovnikov, and others. Each was drawn to chemistry for different personal reasons, but the choice was in a sense no surprise: chemistry was the most dynamic and exciting science at mid‐century, and it was the science most well established in both St. Petersburg and Kazan, which trained these individuals to a level where they could take advantage of their sojourn abroad. As for learned societies, we see a proliferation of chemical societies all across Europe during this time period, and they served a crucial role in creating a national community of scholars who could communicate with each other, establish journals, and lobby their states and national industries for greater support of chemistry. As a step in the professionalization of chemistry, these societies were vital.

SB: In an article published in the same book, you characterized the Russian national style of scientific discourse as "theoretical, bold, impulsive, and stridently argumentative. It was the style of D. I. Mendeleev and V. V. Markovnikov. It was also the style of Emil Erlenmeyer." Are there national differences in the way scientists perform research and discuss scientific ideas and experimental results?

MG: Yes and no. At almost any point in the past two centuries (although, interestingly, not so much before then), you can find cases of scientists claiming that their work bears some specific "national style" in a laudatory sense, or that the manner of research of their competitors from another national context bears a deleterious national style. We can easily jot down a number of these crude stereotypes: Russians are impulsive and bold; Germans are nit‐picking and meticulous; the French are abstract and conceptual; the Americans are pragmatic and application‐oriented. I do not endorse any of these points of view as being accurate descriptions of how people really were or are. Instead, in the article you mentioned I point to how certain Russians chose to brand themselves as being bold and speculative; the irony being that the person they were patterning themselves on most was Erlenmeyer, a German. These assertions of "national styles" have been over the years very important aspects of how scientists have understood their own activity, and as such they are significant for the historian to analyze. Some of them (such as the high level of mathematics found in certain chemical communities) can be traced to national educational systems and thus are more likely to bear a relationship to deeper processes, but many of the others are rhetoric. But, at the risk of belaboring a point: just because something is rhetoric doesn't mean it is historically insignificant.

SB: Which events and discoveries in the history of chemistry have happened unexpectedly and have become turning points for the development of science?

MG: This is a great question, and one that opens up a number of very interesting issues about how science has evolved over time. No one would doubt that unexpected events happen in the laboratory all the time — Becquerel leaving his uranium salts on top of some film in a drawer, for example. But it is pretty rare for something completely unexpected to happen, since the chemist has a certain collection of equipment and reagents available and is usually trying to accomplish something particular in the laboratory that day. As anyone who has spent any time in a laboratory knows, you don't always get what you expected, but that doesn't mean that the choices you have made have no impact on the set of unexpected outcomes that result. And if something completely unexpected were to happen, one which would have no framework in the concepts available to chemists at the time, then it would surely meet with a lot of resistance, as one finds with the way established chemists objected to the discovery of noble gases. (Mendeleev initially thought argon had to be N3, since the notion of an element that was chemically inert made no sense to him.) Generally, when an unexpected finding comes along in the historical record, closer investigation reveals that a certain group of chemists made a concerted effort to claim that it was a revolution in the science, and argued for thinking about this "unexpected" discovery as a confirmation of their prior theoretical arguments. This interplay between the serendipity of discovery and the hopefulness of theoretical speculation is one of the wellsprings of scientific creativity.

SB: You have taught a course on pseudoscience. What did you cover in that course?

MG: I find the topic of pseudoscience fascinating, and when I’ve taught this course I’ve covered a wide variety of topics that have been variously classified (not without controversy) as pseudosciences: astrology, alchemy, phrenology, mesmerism, spiritualism, creationism, cold fusion, Lysenkoism, eugenics, and others. In the course, we emphasized what we can learn about how science works from these rejected domains of knowledge. After all, no one calls themselves a "pseudoscientist"; every single person so designated thinks that they are engaged in real scientific work. They don’t have to be right about that, but there is a lot of interest in trying to understand them in their own terms.

SB: Although scientific fraud is seen much less often in chemistry than in the life sciences, cases like that of Hendrik Schön, from Bell Labs, shook the chemical community several years ago. Schön had published numerous articles before it was discovered that he had submitted the same data repeatedly. Many of his papers had to be retracted, including ones that were published in reputable journals such as Science and Nature. How can scientific fraud be prevented or, once it has happened, punished? Do you consider peer review capable of filtering out bad science?

MG: With regard to Schön, there is an important distinction to be made. On the one hand, we have the category of "pseudoscience," which can be roughly defined as something that is not science but tries very hard to look like science and adopt its methods and approaches. That is not quite the same thing as "fraud," which connotes a level of insincerity that one doesn’t find, for example, among seventeenth‐century alchemists. (There is a third category, the hoax, which is something else again.) Now, as to what can be done about any of these things, I do not have any particular insights. Wherever you find science, you will find something that scientists label pseudoscience; the two always come together. Fraud, if one subscribes to a particular model of psychology, is a matter of incentives, and it is possible that with intensified safeguards, one can reduce its occurrence. But we almost certainly can’t eliminate it altogether. Peer review, as you mention, is often put forward as a solution to this problem, and it is likely better than having no safeguard at all — at least this guarantees that a few scientists read over the piece before it is published — but the evidence of recent years has shown that it is far from foolproof in catching fraud. But, as in the case of Schön, eventually the misdeeds come to light. Time seems to be our best tool in this matter.

SB: There are some historians who are very passionate about "The Kekulé Riddle".7 To chemists, the notion that it was Archibald Scott Couper and not Kekulé who found out that carbon is tetravalent, and that it was Johann Josef Loschmidt who drew the benzene ring for the first time, is quite surprising. What do you think of the claims that Kekulé has received credit for concepts in structural organic chemistry that had actually already been developed by others, such as Couper, Loschmidt, Ladenburg, Frankland and Butlerov? And does it matter, from a science historian's point of view, who was the first to make the discovery?

MG: Being first in making a discovery certainly matters to the scientist! And, in that sense, it does matter for historians of science, since the passions and debates of the scientists are one of the most important things we investigate. Personally, I have spent a lot of time researching the priority dispute over the periodic system between Mendeleev and Julius Lothar Meyer. I am not interested in deciding who was "right" — I don't think historians are in the business of awarding prizes or credit — but the fact that this fight took place, and the kinds of arguments Mendeleev and Meyer used to argue for who was first, makes for a fascinating story to uncover. For better or worse, our system of assigning credit in the sciences centers on priority, and the historian is obligated to explore why that particular system emerged, and what its consequences have been. With respect to Couper, Butlerov, Kekulé, and others — I’m afraid I am a spectator in that historiography and am not going to weigh in on one side or the other, but I can tell you my own particular approach to this kind of question. The fact remains that Kekulé was awarded the credit by his peers. I am personally more interested in why they thought he should receive the credit, rather than in adjudicating whether they were correct or incorrect in doing so.

SB: How are the current conditions in academia (I have in mind such things as wider collaborations, struggling for grants, requirements for tenure that include publishing in high‐impact journals, pre‐prints, open‐access, etc.) changing the way research is performed, reported and credited?

MG: It's generally a bad idea for a historian to speculate on the future, but there is no question that there have been significant transformations in the way of doing science both inside and outside academia that are bound to have important implications for how various disciplines develop in the future. One obvious factor has to do with funding. On the one hand, science is continually becoming more expensive, and there are more scientists competing for a fixed (or in some cases shrinking) pool of funds. On the other hand, the linkages between academia and industry are becoming tighter now than they have typically been (at least in the American context, with which I am most directly familiar), and this is shaping questions that are asked within universities as well as those asked in industrial laboratories. Conditions of publication are also changing in interesting ways. The problem of "information overload" has been with us as long as we have had journals (which is over three hundred years), and probably even longer than that. There is simply so much information for researchers to keep abreast of, so many venues where it appears, and not enough time in the world to track it. Managing this volume of information is a tremendous challenge, and the Internet has both provided tools for addressing this issue and in other ways also compounded the problem. We are seeing strains in the peer review system (exemplified in the use of the pre‐print server among physicists, as well as other experiments in open‐access) and also mounting costs for libraries. Without sufficient funding for research and access to information, science will suffer, or at the very least be forced to adapt. But I am not a pessimist on these questions. One of the most inspiring things about the history of science is how flexible scientists have been in adjusting to different conditions, and I am confident that while science will look different in thirty years than it did thirty years ago, the developments are going to be quite exciting.

SB: What projects have you been working on recently? What are you going to work on in the near future?

MG: I'm now beginning a large research project that connects with your query about the changes in chemistry in recent years. One of the most significant transformations in science over the last two hundred years has been the replacement of a polyglot community with an increasingly monoglot one. To take the example of chemistry, which is the focus of my research, in 1850 a chemist would be expected to be able to read, and to a lesser degree speak, German, English, and French. Today, almost no PhD program in chemistry requires any foreign‐language competence at all, as the global production in chemistry becomes increasingly Anglophone. This is an extremely important development, and I believe there has not been enough attention to it aside from a dedicated group of sociological linguists based mostly in Germany. I am planning to write a history that spans from the decline of Latin as a language of scientific communication in the early eighteenth century, through the rise of national languages (including Russian), experiments with artificial languages like Esperanto, the fate of German (almost certainly the most important language in chemistry in the early twentieth century), and the current ascendancy of English. Before embarking on that, however, I am finishing another project related to my interests in pseudoscience as a way of exploring the history of science, with a book on the debates over the theories of Immanuel Velikovsky in Cold War America.

SB: Thank you for promoting the history of chemistry to a broad audience and for agreeing to discuss these interesting topics.

References

1. Gordin, M. D., A Well‐ordered Thing: Dmitrii Mendeleev And The Shadow Of The Periodic Table. Basic Books: 2004; p 384.
2. Gordin, M. D., Five Days in August: How World War II Became a Nuclear War. Princeton University Press: 2007; p 226.
3. Gordin, M. D., Red Cloud at Dawn: Truman, Stalin, and the End of the Atomic Monopoly. Farrar, Straus and Giroux: 2009; p 416.
4. Intelligentsia Science: The Russian Century, 1860‐1960. Gordin, M.; Hall, K.; Kojevnikov, A., Eds. University of Chicago Press Journals: 2008; p 316.
5. Utopia/Dystopia: Conditions of Historical Possibility. Princeton University Press: 2010; p 264.
6. Velikovsky, I., Worlds in Collision. Paradigma Ltd: 2009; p 436.
7. Wotiz, J. H., Ed., The Kekulé riddle: a challenge for chemists and psychologists. Cache River Press: Clearwater, FL, 1993; p 329.

Book Reviews: Scientific Writing

Robert E. Buntrock [email protected]

For this issue, several books on scientific writing will be covered either with brief reviews or by citation. Writing is not only fundamental to dissemination of information but it is a viable alternative career path for chemists and other scientists.

The first is on scientific communication, both written and oral: Harmon, Joseph E.; Gross, Alan G. The Craft of Scientific Communication; University of Chicago Press: Chicago, 2010. 240 p. $55 (Hardcover), ISBN: 978‐0‐226‐31661‐1; $20 (Paper), ISBN: 978‐0‐226‐31662‐8; $7 rent, $20 (Electronic), ISBN: 978‐0‐226‐31663‐5.

Although little information is given on writing for chemistry, this is a good text and reference for writing and presenting science to both scientific and public audiences. The crafting of a scientific article is described, followed by four examples. Research proposals and communications to a lay audience are described next, followed by a discussion of style based on how good scientists actually write. Not all sentences need to be short and in the active voice, nor do long sentences need to be split (there is no mention of the "Fog Index", recommended by some technical editors for "executive summaries"). Method descriptions are similar to Julia Child’s recipes. Exercises, with answers, follow each chapter. Another deficiency is the lack of mention of poor, cluttered, low‐contrast PowerPoint slides. (Previously reviewed by J. Kovac, J. Chem. Educ., 87(11), 1139‐1140, 2010, doi: 10.1021/ed100882.)

The second review covers communication of scientific information to the public: Introducing Scientific Communication: A Practical Guide; Brake, Mark L., Weitkamp, Emma, Eds.; Palgrave Macmillan: New York, 2010. $33.95. (Hardcover) 177 pp., ISBN 978‐0‐230‐57386‐4.

Excellent text or reference for scientific journalism, for presentation of science to policy makers and the general public. Scientific journalism is a viable but underutilized alternative career path for chemists and scientists for which a few universities are developing courses. (Previously reviewed by R. Buntrock, J. Chem. Educ., 87(11), 1138‐1139, 2010, doi:10.1021/ed100855.)

Also reviewed in that issue of J. Chem. Educ. (by L. Montes, J. Chem. Educ., 87(11), 1138, 2010, doi: 10.1021/ed100864) is The Oxford Book of Modern Scientific Writing, by Richard Dawkins. Shown and discussed are more than 80 examples of writing by prominent scientists, including some Nobel Prize winners.

For chemists, the benchmark reference remains The ACS Style Guide: Effective Communication of Scientific Information, 3rd edition, by A. M. Coghill and L. R. Garson. (Previously reviewed by R. Buntrock, J. Chem. Inf. Model., 47(2), 703‐704, 2007, doi: 10.1021/ci600536.)

As before, we’re always open to suggestions for books to review as well as volunteer reviewers. With the demise of book reviews in JCIM, it’s up to us to "carry the torch" for book reviews on chemical information and related topics.

Product Announcements

Accelrys Draw 4.0

Accelrys Draw 4.0, the latest release of Accelrys's chemical drawing application, is now available for download at no charge for academic and non‐commercial personal use at www.symyx.com/getdraw.

Accelrys Draw 4.0 features the following enhancements:

• Biological sequence editor
• Multi‐tabbed user interface
• Structure resolver ‐ extends name‐to‐structure conversion to DiscoveryGate, ChemSpider, PubChem, and NCI/CADD web resources
• View files with structure‐based thumbnail images in Windows Explorer
• Dynamic toolstrips
• Customizable atom toolstrip
• Automatic, customizable coloring of atom labels
• Reading of ChemDraw CDX files
• Enhanced stereochemistry labels

The no‐fee download of Accelrys Draw 4.0 for academic and non‐commercial personal use contains all the functionality of the commercial version and is available now. For more information, visit www.symyx.com/getdraw. (Note: these URLs will move to the Accelrys domain in the near future, but redirects will be put in place to maintain access.)

Contact Information:

Keith T Taylor PhD, MRSC Advisory Product Manager, Chemistry

Accelrys, Inc. 2440 Camino Ramon, Suite 300 San Ramon, CA 94583 Ph: +1‐925‐543‐7525 Cell: +1 209 221 9415 Fax: +1‐925‐543‐7553

Product Announcements

Reaxys and IDBS: working together to provide seamless environments for researchers

Elsevier and IDBS recently announced that Reaxys is now interoperable with the IDBS E‐WorkBook Suite. This partnership creates a new mechanism that integrates the best‐in‐class content from journals and patents with documented proprietary scientific results.

E‐WorkBook users searching for relevant chemical data can now smoothly transition into Reaxys, the workflow solution that provides extensive information on chemical compounds, related physical and pharmacological properties, and synthesis information, and then save their data and findings in their workflow. With Reaxys available via E‐WorkBook, a new group of researchers can now access this extensive repository of experimentally validated data.

"There is a genuine need for automatically integrating relevant chemistry information directly into the research process,” said Neil Kipling, founder and CEO of IDBS. “Our partnership with Elsevier delivers essential chemical information to scientists just in time and at the point of use."

Mark van Mierle, Managing Director of Elsevier Information Systems GmbH, added: "Our customers want seamless, interoperable environments for their researchers. Our partnership with IDBS responds to customer needs and should further improve individual workflows and company productivity."

"Bringing information together in a federated search reduces the time to make decisions on which molecule to make next,” said Robert Glen, Professor of Molecular Sciences Informatics at Cambridge University. “The integration of E‐WorkBook and Reaxys provides an exciting new approach to improving productivity."

IDBS and Elsevier will continue to work closely together to provide additional innovative content‐related functions to Reaxys and E‐WorkBook to significantly improve how researchers interact with the world’s best data.

For more information on Reaxys and IDBS, please visit our website.

Product Announcements

The Reaxys 2011 PhD Prize is open: celebrating innovation and creativity in young chemists

Elsevier Properties SA recently announced that the 2011 Reaxys PhD Prize, a global competition for candidates currently studying for a PhD or having completed a PhD within the last 12 months, is open, with a final submission date of 28th February 2011.

The prize will be awarded for original and innovative research in organic, organometallic and inorganic chemistry to the candidates that demonstrate excellence in methodology and approach in a peer‐reviewed publication. Three prize winners will each receive a check for $2000 and be invited to present their research at the Winners’ Symposium, to be held during the 14th Asian Chemical Congress, 5 – 8 September, 2011 in Bangkok, Thailand.

David Evans PhD, Scientific Affairs Director at Elsevier Properties SA, said, "The Reaxys PhD Prize celebrates innovation and creativity in chemistry research from around the world, values which lie at the heart of Reaxys itself." He continued, "In 2010 we received over 300 submissions from around the world covering the breadth of modern chemistry including representatives from most of the leading chemistry universities. The quality of research was outstanding, and the finalists and winners are clearly at the cutting edge of chemistry research. A high bar has been set for 2011."

All entries will be evaluated by a review board of leading international chemists, chaired by the following members of the Reaxys Advisory Board:

• Professor A. G. M. Barrett ‐ Imperial College London, UK
• Professor B. M. Trost ‐ Stanford University, USA
• Professor H. N. C. Wong ‐ Chinese University of Hong Kong, China

Submissions are reviewed based upon originality, innovation, importance to the field, applicability, rigor of approach and publication quality.

For more information, including submission details and requirements, please visit our website.

Reaxys is a registered trademark owned and protected by Elsevier Properties SA and used under license

Product Announcements

Chemisches Zentralblatt now Searchable by Structure

FIZ CHEMIE has digitized the entire contents of the first and oldest abstracts journal published in the field of chemistry, the German Chemisches Zentralblatt. Beginning from 1830, approximately 900,000 page images with about 2 million abstracts cover 140 years of research progress in pharmaceutical science and chemistry. InfoChem, a software company for chemoinformatics, has applied advanced data mining technologies to the whole content and created a database that allows combined full‐text, structure and substructure search throughout the page images. Thus, researchers are able to scan 140 years of scientific knowledge and patents published in the time period from 1830 to 1969.

The Chemisches Zentralblatt Structure Database is offered optionally as a web application or as an in‐house system. The web application is hosted on the InfoChem server. Access is provided on a licence basis. By purchasing the in‐house solution, customers get the database together with the original PDF files to integrate them into their company systems. Customized solutions are offered as packages.

For more information, please visit our website or http://www.infochem.de/content/downloads/czb.pdf.

Product Announcements

Complimentary access to the Journal of Chemical Information and Modeling 2011 Sample Issue

ACS Publications invites you to explore the 2011 sample issue of the Journal of Chemical Information and Modeling, available now online free until the end of the year.

On Bibliometric Analysis of Chinese Research on Cyclization, MALDI-TOF, and Antibiotics: Methodical Concerns
Petr Heneberg

Comments on “On Bibliometric Analysis of Chinese Research on Cyclization, MALDI-TOF, and Antibiotics: Methodological Concerns”
Jiang Li and Peter Willett

Classifying Large Chemical Data Sets: Using A Regularized Potential Function Method
Hamse Y. Mussa, Lezan Hawizy, Florian Nigsch, and Robert C. Glen

Cross-Target View to Feature Selection: Identification of Molecular Interaction Features in Ligand−Target Space
Satoshi Niijima, Hiroaki Yabuuchi, and Yasushi Okuno

New Fragment Weighting Scheme for the Bayesian Inference Network in Ligand-Based Virtual Screening
Ammar Abdo and Naomie Salim

Molecular Docking and Pharmacophore Filtering in the Discovery of Dual-Inhibitors for Human Leukotriene A4 Hydrolase and Leukotriene C4 Synthase
Sundarapandian Thangapandian, Shalini John, Sugunadevi Sakkiah, and Keun Woo Lee

Would the Pseudocoordination Centre Method Be Appropriate To Describe the Geometries of Lanthanide Complexes?
Danilo A. Rodrigues, Nivan B. da Costa, Jr., and Ricardo O. Freire

Transplant−Insert−Constrain−Relax−Assemble (TICRA): Protein−Ligand Complex Structure Modeling and Application to Kinases
Siavash Meshkat, Anthony E. Klon, Jinming Zou, Jeffrey S. Wiseman, and Zenon Konteatis

Discovery of Chemical Compound Groups with Common Structures by a Network Analysis Approach (Affinity Prediction Method)
Shigeru Saito, Takatsugu Hirokawa, and Katsuhisa Horimoto

Assessing the Performance of the MM/PBSA and MM/GBSA Methods. 1. The Accuracy of Binding Free Energy Calculations Based on Molecular Dynamics Simulations
Tingjun Hou, Junmei Wang, Youyong Li, and Wei Wang

StructRank: A New Approach for Ligand-Based Virtual Screening
Fabian Rathke, Katja Hansen, Ulf Brefeld, and Klaus-Robert Müller

Quantum Mechanics/Molecular Mechanics Strategies for Docking Pose Refinement: Distinguishing between Binders and Decoys in Cytochrome c Peroxidase
Steven K. Burger, David C. Thompson, and Paul W. Ayers

Comments on the Article “Evaluation of pKa Estimation Methods on 211 Druglike Compounds”
John C. Shelley, David Calkins, and Arron P. Sullivan

Calculation of the Solvation Free Energy of Neutral and Ionic Molecules in Diverse Solvents
Sehan Lee, Kwang-Hwi Cho, Chang Joon Lee, Go Eun Kim, Chul Hee Na, Youngyong In, and Kyoung Tai No

Sequence, Structure, and Active Site Analyses of p38 MAP Kinase: Exploiting DFG-out Conformation as a Strategy to Design New Type II Leads
Preethi Badrinarayan and G. Narahari Sastry

Rational Approaches for the Design of Effective Human Immunodeficiency Virus Type 1 Nonnucleoside Reverse Transcriptase Inhibitors
Sergio R. Ribone, Mario A. Quevedo, Marcela Madrid, and Margarita C. Briñón

Importance of Receptor Flexibility in Binding of Cyclam Compounds to the Chemokine Receptor CXCR4
Alfonso R. Lam, Supriyo Bhattacharya, Kevin Patel, Spencer E. Hall, Allen Mao, and Nagarajan Vaidehi

Automated Selection of Compounds with Physicochemical Properties To Maximize Bioavailability and Druglikeness
Taiji Oashi, Ashley L. Ringer, E. Prabhu Raman, and Alexander D. MacKerell, Jr.

Bacterial Carbohydrate Structure Database 3: Principles and Realization
Philip V. Toukach

CYANOS: A Data Management System for Natural Product Drug Discovery Efforts Using Cultured Microorganisms
George E. Chlipala, Aleksej Krunic, Shunyan Mo, Megan Sturdy, and Jimmy Orjala

ThermoData Engine (TDE): Software Implementation of the Dynamic Data Evaluation Concept. 5. Experiment Planning and Product Design
Vladimir Diky, Robert D. Chirico, Andrei F. Kazakov, Chris D. Muzny, Joseph W. Magee, Ilmutdin Abdulagatov, Jeong Won Kang, Kenneth Kroenlein, and Michael Frenkel

Free access to other ACS Sample Issues. ACS Publications offers complimentary access to the first issue of the year for all 39 of its journals.

Product Announcements

Thieme Chemistry

Thieme Chemistry has published highly evaluated information about synthetic and general chemistry for professional chemists and advanced students since 1909. Our portfolio of products includes the well‐known journals SYNFACTS, SYNLETT and SYNTHESIS, the renowned synthetic methodology reference work Science of Synthesis, RÖMPP, the largest and most renowned chemical encyclopedia published in German, as well as a selected range of . www.thieme‐chemistry.com

Product Announcements

RSC Publishing Platform reaches the one million milestone

The one millionth publication to appear on the RSC Publishing Platform went online recently in a landmark achievement for the . The seven figure milestone was reached as the RSC's exceptional range of peer‐reviewed journals, magazines, books, databases and publishing services to the chemical science community more than doubled in output in the last three years.

Royal Society of Chemistry editorial director James Milne said: "This marks a significant landmark for the RSC Publishing Platform. Delivering the millionth record, a paper published in the journal Nanoscale, demonstrates not only the significance of the RSC in terms of disseminating high‐quality research content worldwide but also with many millions of article downloads each year, the value researchers place on being able to access this content through our new publishing platform."

In the last four years RSC Publishing has gone from being the fifth largest publisher in chemistry to challenging Wiley in third place.

Read more about this growth at: http://www.rsc.org/AboutUs/News/PressReleases/2011/Million.asp

View the one millionth publication, "Controlled assembly of plasmonic colloidal nanoparticle clusters", at: http://pubs.rsc.org/en/Content/ArticleLanding/2011/NR/C0NR00804D

Product Announcements

InfoChem Launches Chemisches Zentralblatt Structure Database

At the end of 2010, InfoChem GmbH launched the structure searchable version of Chemisches Zentralblatt, a powerful new way of gaining information from an essential resource for chemists, researchers and intellectual property professionals.

Chemisches Zentralblatt is the first and oldest abstracts journal published in chemistry, covering the literature from 1830 to 1969 and describing the "birth" of chemistry as a science. Over the period of 140 years, Chemisches Zentralblatt has published 900,000 pages, containing two million abstracts. InfoChem was able to identify one million unique names and 500,000 unique structures in these documents. Now, the structure searchable database provides non‐German speaking users with the opportunity to query this valuable source in the language of chemistry.

Using modern scanning technology, FIZ CHEMIE has digitized the entire content of Chemisches Zentralblatt. Then InfoChem produced the structure searchable database by applying specialized software tools for OCR, chemical named entity extraction and name to structure conversion. InfoChem used its exceptional skills and experience in German naming conventions to achieve optimal conversion results.

Chemisches Zentralblatt is available as a web‐based application or as an in‐house database. Scientists can search structures, substructures and full‐text. Then from the hit list, users can link directly to the original page in Chemisches Zentralblatt containing the information. Applications may include preparative chemistry and searches.

About InfoChem GmbH

Founded in 1989 and based in Munich (Germany), InfoChem has over 20 years' experience in the development and integration of sophisticated software tools for the storage and handling of structure and reaction information. For more information, please visit our website.

InfoChem is pleased to announce that we now have a representative in the UK. Dr Stephanie North has over 25 years’ experience in chemical information within the and is delighted to be working with the InfoChem team.

Contact address: PO Box 240, Royston, Hertfordshire SG8 1DA, UK; e‐mail:[email protected].

CINF Officers 2011

Executive Committee

Member: Gregory M. Banik
Function: Chair
Tenure: 2011
Contact: Bio‐Rad Laboratories, Inc., Informatics Division, Two Penn Center Plaza, Suite 800, 1500 John F. Kennedy Blvd., Philadelphia, PA 19102; 267‐322‐6952 (voice), 267‐322‐6953 (fax)

Member: Carmen Nitsche
Function: Past Chair
Tenure: 2011
Contact: Accelrys, Inc., 254 Rockhill Drive, San Antonio, TX 78209; 210‐820‐3459 (voice), 210‐820‐3459 (fax), 510‐589‐3555 (cell)

Member: Rajarshi Guha
Function: Chair‐Elect
Tenure: 2011
Contact: NIH Chemical Genomics Center, 9800 Medical Center Drive, Rockville, MD 20852; 814‐404‐5449 (voice), 812‐856‐3825 (fax)

Member: Leah R. Solla
Function: Secretary
Tenure: 2009‐2010
Contact: Cornell University, Clark Library, 283 Clark Hall, Ithaca, NY 14853‐2501; 607‐255‐1361 (voice), 607‐255‐5288 (fax), 607‐229‐0287 (cell)

Member: Meghan Lafferty
Function: Treasurer
Tenure: 2011‐2012
Contact: University of Minnesota, Science & Engineering Library, 108 Walter Library, 117 Pleasant Street SE, Minneapolis, MN 55455; 612‐624‐9399 (voice), 612‐625‐5583 (fax)

Member: Bonnie Lawlor
Function: Councilor
Tenure: 2010‐2012
Contact: National Federation of Advanced Information Services (NFAIS), 276 Upper Gulph Road, Radnor, PA 19087‐2400; 215‐893‐1561 (voice), 215‐893‐1564 (fax)

Member: Andrea Twiss‐Brooks
Function: Councilor
Tenure: 2009‐2011
Contact: University of Chicago, 4824 S. Dorchester Avenue, Apt. 2, Chicago, IL 60615‐2034; 773‐702‐8777 (voice), 773‐702‐3317 (fax)

Member: Guenter Grethe
Function: Alternate Councilor
Tenure: 2010‐2012
Contact: 352 Channing Way, Alameda, CA 94502‐7409; 510‐865‐5152 (voice), 510‐865‐5152 (fax), 510‐333‐7526 (cell)

Member: Charles F. Huber
Function: Alternate Councilor
Tenure: 2009‐2011
Contact: University of California, Santa Barbara, Davidson Library, Santa Barbara, CA 93106‐9010; 805‐893‐2762 (voice), 805‐893‐8620 (fax)

Member: Dr. Rachelle Bienstock
Function: Program Chair
Tenure: 2011‐2012
Contact: Senior Research Scientist, National Institute of Environmental Health Sciences, PO Box 12233 MD F0‐011, Research Triangle Park, NC 27709; 919‐541‐3397 (voice)

Member: Jan Carver
Function: Membership Chair
Tenure: 2009‐2011
Contact: University of Kentucky, Chemistry Physics Library, 150 Chem Phys Bldg, Lexington, KY 40506‐0001; 859‐257‐4074 (voice), 859‐323‐4988 (fax)

Committee Chairs

Chair: Jody Kempf
Committee: Audit
Tenure: 2009‐2011
Contact: University of Minnesota, Science and Engineering Library, 108 Walter Library, 117 Pleasant St. SE, Minneapolis, MN 55455; 612‐624‐9399 (voice), 612‐625‐5583 (fax)

Chair: Phil J. McHale
Committee: Awards
Tenure: 2009‐2011
Contact: CambridgeSoft Corporation, 375 Hedge Road, Menlo Park, CA 94025‐1713; 650‐235‐6169 (voice), 650‐362‐2104 (fax)

Chair: Patricia Meindl
Committee: Careers
Tenure: 2009‐2011
Contact: University of Toronto, A. D. Allen Chemistry Library, 80 St George Street, Rm 480, Toronto, ON M5S 3H6, Canada; 416‐978‐3587 (voice), 416‐946‐8059 (fax)

Chair: Susanne Redalje
Committee: Constitution, Bylaws, and Procedures
Tenure: 2007‐
Contact: University of Washington, Chemistry Library, BOX 351700, Seattle, WA 98195; 206‐543‐2070 (voice)

Chair: Meghan Lafferty
Committee: Finance
Tenure: 2011‐2012
Contact: University of Minnesota, Science & Engineering Library, 108 Walter Library, 117 Pleasant Street SE, Minneapolis, MN 55455; 612‐624‐9399 (voice), 612‐625‐5583 (fax)

Chair: Graham C. Douglas
Committee: Fundraising
Tenure: 2011
Contact: Scientific Information Consulting, Belmont, CA; 510‐407‐0769 (voice)

Chair: Jan Carver
Committee: Membership
Tenure: 2009‐2011
Contact: University of Kentucky, Chemistry Physics Library, 150 Chem Phys Bldg, Lexington, KY 40506‐0001; 859‐257‐4074 (voice), 859‐323‐4988 (fax)

Chair: Carmen Nitsche
Committee: Nominating
Tenure: 2011
Contact: Accelrys, Inc., 254 Rockhill Drive, San Antonio, TX 78209; 210‐820‐3459 (voice), 210‐820‐3459 (fax), 510‐589‐3555 (cell)

Chair: Dr. Rachelle Bienstock
Committee: Program
Tenure: 2011‐2012
Contact: Senior Research Scientist, National Institute of Environmental Health Sciences, PO Box 12233 MD F0‐011, Research Triangle Park, NC 27709; 919‐541‐3397 (voice)

Chair: William G. Town
Committee: Communications and Publications
Tenure: 2009‐2011
Contact: Kilmorie Consulting, 24A Elsinore Rd., London, SE23 2SL, England; +44 20 8699 9764 (voice)

Divisional Representatives and Liaisons

Representative: Susan K. Cardinal
Division: SLA DCHE
Tenure: 2006‐
Contact: University of Rochester, Carlson Library, Box 270236, Rochester, NY 14627; 585‐275‐9007 (voice), 585‐273‐4656 (fax)

Representative: Guenter Grethe
Division: ACS Multidisciplinary Program Planning Group
Tenure: 2007‐
Contact: 352 Channing Way, Alameda, CA 94502‐7409; 510‐865‐5152 (voice), 510‐865‐5152 (fax), 510‐333‐7526 (cell)

Representative: Guenter Grethe
Division: Biotechnology Secretariat
Tenure: 2002‐
Contact: 352 Channing Way, Alameda, CA 94502‐7409; 510‐865‐5152 (voice), 510‐865‐5152 (fax), 510‐333‐7526 (cell)

Representative: Erja Kajosalo
Division: ASIS&T STI
Tenure: 2006‐
Contact: Massachusetts Institute of Technology, MIT Libraries 14S‐134, 77 Massachusetts Ave., Cambridge, MA 02139‐4307; 617‐253‐9795 (voice), 617‐253‐6365 (fax), 781‐223‐3869 (cell)

Representative: Peter F. Rusch
Division: ACS Committee on Nomenclature, Terminology, and Symbols
Tenure: 2006‐
Contact: Rusch Consulting Group, 162 Holland Court, Mountain View, CA 94040‐3864; 650‐961‐8120 (voice), 650‐961‐8120 (fax)

Representative: Mitchell C. Brown
Division: ACRL STS
Tenure: 2009
Contact: University of California at Irvine, Irvine, CA 92697‐8200; 949‐824‐9732 (voice), 949‐824‐3114 (fax)

Other Functionaries

Member: Bonnie Lawlor
Function: Archivist/Historian
Tenure: 2006‐
Contact: National Federation of Advanced Information Services (NFAIS), 276 Upper Gulph Road, Radnor, PA 19087‐2400; 215‐893‐1561 (voice), 215‐893‐1564 (fax)

Member: Danielle Dennie
Function: Webmaster
Tenure: 2011‐2013
Contact: Concordia University, Vanier Library Building, 7141 Sherbrooke St. W., Montréal (QC), H4B 1R6, Canada; 514.848.2424 x 5237 (voice)
