COURSE DESCRIPTION

Course Number: CSCI 4317 (cross-listed)
Course Title: Genome Databases: Mining and Management
Semester Hours: 3
Course Coordinator: Rob Knight
Course URL: http://www.cs.colorado.edu/courses/csci4317.html

Current Catalog Description
Develops essential skills for performing genomic analyses, with focus on developing practical research tools. Introduces human genome and microbiome projects, Python/SQL scripting, accessing and understanding genomic data, sequence alignment and search, evolutionary models, expression data, biological networks, and macromolecular structure. Prereqs., MCDB 3500, CSCI 3104, or CHEM 4711; coreq., CSCI 2270. Same as CSCI 5317. Credit not granted for this course and CHEM 4621 or MCDB 4621.

Textbook: None
References: Varies from year to year: 1-2 required readings per session
Instructors (for the last 3 years: Fall 2006 to Spring 2009): Rob Knight
Meeting Times (Number and Duration of Sessions per Week): Two lectures, each 1 hour 15 min.

Course Outcomes
- Learn fundamentals of types of biological data commonly stored in databases: genomes, DNA, RNA and protein sequences, expression data
- Create simple databases using own data
- Understand and use key bioinformatics algorithms such as BLAST, phylogeny construction, alignment, etc.
- Understand and apply statistical and machine learning methods for analyzing genome-scale databases, and communicate research results

Relationship between Course Outcomes and Program Outcomes

Outcomes: A. Apply Knowledge; B. Computing Requirements; C. Design System; D. Team Work; E. Professional Issues; F. Communicate Effectively

Bio data fundamentals        ✓
Create databases             ✓ ✓ ✓
Bioinformatics algorithms    ✓ ✓ ✓
DB analysis methods          ✓ ✓ ✓ ✓

Outcomes: G. Analyze Impacts; H. Professional Development; I. Current Techniques; J. Design Tradeoffs; K. Design & Development

Bio data fundamentals
Create databases             ✓ ✓ ✓
Bioinformatics algorithms    ✓ ✓
DB analysis methods          ✓ ✓

Prerequisites by Topic
- Genetics, transcription, translation, gene expression: MCDB 3500
- Basics of data structures: CSCI 2270
- Basics of algorithm analysis and computational complexity: CSCI 3104
- Understanding of biomolecules like DNA, RNA, protein: CHEM 4711
Proficiency in ONE of these areas is sufficient.

Major Topics Covered in the Course
- Introduction: genes, genomes and databases
- The human genome and microbiome
- Ancient DNA and sequence search
- Molecular evolution
- Sequence alignment
- Setting up your own databases
- Sequence composition
- Phylogeny
- Horizontal gene transfer
- Functional annotation and pathways
- The genetic code and gene expression
- Expression analysis
- Binding site searches
- RNA structure
- Protein structure

Assessment Plan for the Course

Assessment: Students will:

(i) Read assigned literature and submit written and oral literature summaries for homework assignments (15%). Each student will present one paper in class, either “for” or “against” (schedule to be determined; counts for 5% of grade), and will submit one-page literature summaries (one will be assigned per lecture and any number may be submitted; the best 5 will count for 2% of your grade each).

(ii) Complete quizzes and problem sets covering biological and computational concepts (30%). There will be three problem sets, evenly spaced throughout the semester. The best two will count towards 20% of your grade (10% each). There will be short, 3-question quizzes covering basic concepts from the previous lecture at the start of each lecture except the first. The best 10 will count for 1% of your grade each (although any number can be submitted).

(iii) Work in groups to perform an assigned sequence analysis project for the midterm project (20%).

(iv) Work either individually or in groups to develop a real-world application for the final project (35%).

How is Data from this Course used to Assess Program Outcomes?
The instructor retains copies of student homework assignments, midterms, and semester projects. These materials are then evaluated by the department’s external advisory board for examples that demonstrate fulfillment of the program outcomes.

Curriculum Category Content (Semester Hours)

Area                                      Core / Advanced
Algorithms                                2.0
Data Structures
Computer Organization and Architecture
Software Design                           1.0
Concepts of Programming Languages
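
The weighting in the Assessment Plan above (5% presentation, best 5 summaries at 2% each, best 2 problem sets at 10% each, best 10 quizzes at 1% each, 20% midterm project, 35% final project) adds up to 100%. As a minimal sketch of that arithmetic, the Python below combines the components into one percentage. The function names and the convention that each item is scored as a fraction between 0.0 and 1.0 are assumptions made for illustration; the course description does not say how individual items are marked.

```python
def best_n_total(scores, n, weight_each):
    """Add up the n highest fractional scores, each worth weight_each percent.

    If fewer than n items were submitted, only those items contribute.
    """
    top = sorted(scores, reverse=True)[:n]
    return sum(score * weight_each for score in top)


def course_grade(presentation, summaries, problem_sets, quizzes, midterm, final):
    """Return the overall course percentage under the stated weights.

    All inputs are fractions in [0.0, 1.0] (an assumed convention);
    summaries, problem_sets and quizzes are lists of such fractions.
    """
    total = 0.0
    total += presentation * 5.0                    # in-class paper presentation
    total += best_n_total(summaries, 5, 2.0)       # best 5 literature summaries
    total += best_n_total(problem_sets, 2, 10.0)   # best 2 of 3 problem sets
    total += best_n_total(quizzes, 10, 1.0)        # best 10 quizzes
    total += midterm * 20.0                        # midterm group project
    total += final * 35.0                          # final project
    return total
```

For example, a student with full marks on every component except a missed third problem set would still reach 100, because only the best two problem sets count.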