Course Title: Cheminformatics General Information
Instructor: Dr. Hao Zhu
Office: Business and Science Building (BSB) #421
Tel: 856-225-6781
E-mail: [email protected]
Office hours: Wednesday morning (10am-12pm)
Course pre/co-requisites: None
Required Text: None

Rationale: Cheminformatics has been defined as the science of examining the structure and function of chemicals through the use of computational analysis, machine learning, statistics, and pattern recognition. A number of recent workforce studies have shown that there is a high and currently unmet demand for people trained to various levels of expertise in informatics, from technicians and technical librarians to developers of new and improved methodologies and applications.

Course Description: The course mainly teaches the machine learning algorithms, workflows, and other relevant computational tools that students may use or encounter in future Computer-Aided Drug Discovery (CADD) work at pharmaceutical companies and academic institutions. Key topics in the second part of the class include chemical descriptors, chemical similarity, Quantitative Structure-Activity Relationship (QSAR) modeling, model applicability domain, and virtual screening. Students will learn how to apply cheminformatics knowledge to create machine learning models and use those models to identify novel drug-like compounds. Several popular computational tools and machine learning approaches, such as Molecular Operating Environment (MOE) (www.chemcomp.com), Random Forest (RF), Support Vector Machine (SVM), and k Nearest Neighbor (kNN), will be used in this class. Students are encouraged to put what they learn in the lectures into practice by using these tools and approaches during lab time. Furthermore, the information covered in lectures will be expanded upon and reinforced by group discussions and the use of available computational resources.
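To make the ideas of chemical similarity and kNN in the course description more concrete, here is a minimal sketch in plain Python. It is not course material: the fingerprints are hypothetical toy data written as sets of "on" bit indices, the function names `tanimoto` and `knn_predict` are made up for illustration, and the Tanimoto coefficient is assumed as the similarity measure. In the labs, descriptors and fingerprints would instead come from tools such as MOE.

```python
from collections import Counter


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention assumed here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def knn_predict(query_fp, training_set, k=3):
    """Predict a label by majority vote among the k most similar training compounds."""
    ranked = sorted(
        training_set,
        key=lambda item: tanimoto(query_fp, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy training set: (fingerprint, activity label)
training_set = [
    ({1, 2, 3, 4}, "active"),
    ({1, 2, 3, 5}, "active"),
    ({2, 3, 4, 6}, "active"),
    ({7, 8, 9}, "inactive"),
    ({7, 8, 10, 11}, "inactive"),
]

query = {1, 2, 3, 6}
print(tanimoto(query, training_set[0][0]))
print(knn_predict(query, training_set, k=3))
```

In this toy example the query shares three of five distinct bits with each "active" compound and none with the "inactive" ones, so the three nearest neighbors all vote "active". A real QSAR workflow would also need external validation and an applicability domain check, which this sketch leaves out.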
Learning goals: Upon completion of this course, the student should be able to:
1) Know the basic ligand-based and structure-based drug design approaches.
2) Understand the basic algorithms used in established software to carry out the most common CADD projects.
3) Understand the importance of properly using the various parameters in cheminformatics application programs.
4) Make practical use of the various computational tools available for computer-aided drug design, including 2D/3D structural databases.

COURSE SESSION OUTLINE
W1 (Jan. 22): Course Introduction
    Lab: No lab
W2 (Jan. 29): Introduction to chemical structure representation and optimization: 2D & 3D
    Lab: Teach computer chemistry: ChemOffice, MS-Excel and SDF
W3 (Feb. 5): Linux system and R
    Lab: Practicing Linux system and R
W4 (Feb. 12): Molecular Descriptors
    Lab: Generating chemical descriptors using MOE software/Microsoft Excel/ChemDraw
W5 (Feb. 19): Distances and compound similarity
    Lab: Calculating chemical similarity
W6 (Feb. 26): Machine Learning and QSAR
    Lab: Simple QSAR model development
W7 (March 5): QSAR modeling
    Lab: Practicing QSAR tools
W8 (March 12): Spring Break. No Class
W9 (March 19): Chemical information on the internet
    Lab: PubChem and other internet chemistry sources
W10 (March 26): Predictive modeling (1)
    Lab: Develop your own machine learning models
W11 (April 2): Predictive modeling (2)
    Lab: Using machine learning models to make predictions
W12 (April 9): Pharmacophores
    Lab: Generating pharmacophores using MOE
W13 (April 16): Structure-based drug discovery
    Lab: Using a commercial cheminformatics tool: CASE Ultra
W14 (April 23): The future of cheminformatics
    Lab: No lab
W15 (Final): Final presentation (the final paper/report is due at the same time)

Course Requirements and Grading:
Students are expected to attend all classes, participate in each computer lab practice, and complete all assignments within the time allotted.
Each student is expected to give a presentation based on a practical project or term paper at the end of the semester.

Requirements for paper reading: This class has no textbook. The instructor will assign papers to read after each lecture. The papers are important for students to further understand the contents of the lecture or to prepare for the lab practice. Students are required to write a document (single-spaced, no more than one page but no less than half a page) to show their understanding of the papers.

Requirements for assignments: The assignments will be 3-4 small projects, each to be finished by students within a two-week period.

Requirements for the term paper and project: To be announced by week 8 of the semester.

Grading:
Paper Reading (3-4 times): 20%
Assignments (3-4 times): 30%
Term Project: 50% (Report 15%, Slides 15%, Presentation 20%)
TOTAL: 100%

Three important issues about grading:
1) Cheating is strictly prohibited in this class. The instructor will use advanced techniques (such as the services of turnitin.com) to investigate suspected cheating. Once a cheating case is confirmed, it will be reported to the Dean and will result in immediate failure of the class.
2) Absences from the final presentation will be excused only if Dr. Zhu is notified in advance and a note from a physician or a Dean is provided.
3) Late submissions (either homework or final project) will be penalized: 20% grade deduction for every 24-hour delay.

Class policy:
1) No cell phone use is allowed during lecture and lab time (please mute your phone or turn it off).
2) Absences from lectures require the permission of Dr. Zhu at least one day before the class.
3) Homework must be completed independently. The final presentation must state the contribution of each teammate.