
Course Title: Cheminformatics

General Information:
Instructor: Dr. Hao Zhu
Office: Business and Science Building (BSB) #421
Tel: 856-225-6781
E-mail: [email protected]
Office hours: Wednesday morning (10am-12pm)
Course pre/co-requisites: None
Required Text: None

Rationale: Cheminformatics has been defined as the science of examining the structure and function of chemicals through the use of computational analysis, machine learning, statistics, and pattern recognition. A number of recent workforce studies have shown that there is a high and currently unmet demand for people trained to various levels of expertise in informatics, from technicians and technical librarians to developers of new and improved methodologies and applications.

Course Description: The course mainly teaches the computational tools that students may use or access in future Computer-Aided Drug Design (CADD) work at pharmaceutical companies and academic institutes. Key topics in the second part of the class include chemical descriptors, chemical similarity, Quantitative Structure Activity Relationship (QSAR) modeling, and model applicability domain. Students will learn how to use cheminformatics knowledge to create machine learning models and how to use those models to identify novel drug-like compounds.

Several popular computational tools and machine learning approaches, such as Molecular Operating Environment (MOE) (www.chemcomp.com), Random Forest (RF), Support Vector Machine (SVM) and k Nearest Neighbor (kNN), will be used in this class. Students are encouraged to apply what they learn in the lectures by using these tools and approaches during lab time. Furthermore, the information covered in lectures will be expanded upon and reinforced by group discussions and the use of available computational resources.

Learning goals: Upon completion of this course, the student should be able to:

- Know the basic ligand-based and structure-based approaches.
- Understand the basic algorithms used in established software to carry out the most common CADD projects.
- Understand the importance of properly using the various parameters in cheminformatics application programs.
- Make practical use of the computational tools available for computer-aided drug design, including 2D/3D structural databases.

COURSE SESSION OUTLINE

W1 (Jan. 22) Course Introduction

W1-LAB No Lab

W2 (Jan. 29) Introduction to chemical structure representation and optimization: 2D & 3D

W2-LAB Computer practice: ChemOffice, MS-Excel and the SDF file format

W3 (Feb. 5) Linux system and R

W3-LAB Practicing Linux system and R

W4 (Feb. 12) Molecular Descriptors

W4-LAB Generating chemical descriptors using MOE software/Microsoft Excel/ChemDraw

W5 (Feb. 19) Distances and compound similarity

W5-LAB Calculating chemical similarity

W6 (Feb. 26) Machine Learning and QSAR

W6-LAB Simple QSAR model development

W7 (March 5) QSAR modeling

W7-LAB Practicing QSAR tools

W8 (March 12) Spring Break. No Class

W9 (March 19) Chemical information on the internet

W9-LAB PubChem and other internet chemistry sources

W10 (March 26) Predictive modeling (1)

W10-LAB Develop your own machine learning models

W11 (April 2) Predictive modeling (2)

W11-LAB Using machine learning models to make predictions

W12 (April 9)

W12-LAB Generating pharmacophores using MOE

W13 (April 16) Structure-based drug discovery

W13-LAB Using a commercial cheminformatics tool: CASE Ultra

W14 (April 23) The future of cheminformatics

W14-LAB No lab

W15 (Final) Final presentation (the final paper/report is due at the same time)

Course Requirements and Grading: Students are expected to attend all classes, participate in each computer lab practice, and complete all assignments within the time allotted. Each student is expected to give a presentation based on a practical project or term paper at the end of the semester.

Requirements for paper reading:

This class has no textbook. The instructor will assign papers to read after each lecture. The papers are important for helping students understand the contents of the lecture or prepare for the lab practice. For each reading, students are required to write a document (single-spaced, between half a page and one page long) showing their understanding of the papers.

Requirements for assignments:

The assignments will be 3-4 small projects, each to be completed by students within a two-week period.

Requirements for the term paper and project:

To be announced by week 8 of the semester.

Grading:

Paper Reading (3-4 times): 20%

Assignments (3-4 times): 30%

Term Project: 50% (Report 15%, Slides 15%, Presentation 20%)

TOTAL: 100%
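As an illustration of how the weights above combine, here is a minimal Python sketch. It assumes each component is scored on a 0-100 scale and that the term project parts are weighted individually (Report 15%, Slides 15%, Presentation 20%). The function name `final_grade` and the example scores are hypothetical, not part of the official grading procedure.

```python
# Weights taken from the grading breakdown above (fractions of the final grade).
WEIGHTS = {
    "paper_reading": 0.20,
    "assignments": 0.30,
    "report": 0.15,
    "slides": 0.15,
    "presentation": 0.20,
}


def final_grade(scores):
    """Combine component scores (assumed 0-100 each) into a weighted final grade."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {
    "paper_reading": 90,
    "assignments": 80,
    "report": 70,
    "slides": 100,
    "presentation": 85,
}
print(round(final_grade(example), 2))
```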

Three important issues about grading:

1) Cheating is strictly prohibited in this class. The instructor will use detection services (such as turnitin.com) to investigate suspected cheating cases. Once a cheating case is confirmed, it will be reported to the Dean and will result in immediate failure of the class.
2) Absences from the final presentation will be excused only if Dr. Zhu is notified in advance and a note from a physician or a Dean is provided.
3) Late submissions (either homework or the final project) will result in a grade penalty (20% grade deduction for every 24-hour delay).
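The late penalty in item 3 can be sketched as simple arithmetic. The Python example below is only one possible reading and rests on assumptions the policy does not spell out: any started 24-hour period counts as a full period, the deduction is 20% of the earned score per period, and the score cannot drop below zero. The function name `apply_late_penalty` is hypothetical; the instructor's actual calculation may differ.

```python
import math


def apply_late_penalty(score, hours_late, rate=0.20):
    """Return the score after the late deduction.

    Assumptions (not stated in the policy): a partial 24-hour period
    counts as a full one, the deduction is a fraction of the earned
    score, and the result is floored at zero.
    """
    if hours_late <= 0:
        return float(score)
    periods = math.ceil(hours_late / 24)
    remaining_fraction = max(0.0, 1.0 - rate * periods)
    return score * remaining_fraction


# Hypothetical example: a score of 90 submitted 30 hours late
# falls into the second 24-hour period under these assumptions.
print(apply_late_penalty(90, 30))
```

Under this reading, a submission that is five or more full periods late receives no credit.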

Class policy:
1) Cell phone use is not allowed during lecture and lab time (please mute your phone or turn it off).
2) Absences from lectures require Dr. Zhu's permission at least one day before the class.
3) Homework must be completed independently. The final presentation must clearly state the contribution of each teammate.