Information Theory for Genetics

ECEN 4012/5012 Fall 2005 Information Theory for Genetics (Information Theory for Molecular Biology)

Instructor: OLGICA MILENKOVIC
Office: ECOT 253
Class hours: M/W/F 10:00-10:50 a.m., ECEE 265 (Rescheduling is possible!)
Office hours: To be scheduled in class

CLASS RULE #1: YOU HAVE TO HAVE FUN DURING THE LEARNING PROCESS!
CLASS RULE #2: YOU HAVE TO WORK HARD (WHILE HAVING FUN)!
CLASS RULE #3: YOU HAVE TO LOVE MATHEMATICS (AT LEAST SOMEWHAT)!
CLASS RULE #4: YOU ARE NOT SUPPOSED TO BELIEVE IN INTELLIGENT DESIGN!

Charles Darwin (1809-1882)

TEXTBOOK: None is required; several are recommended. You will receive plenty of handouts and possibly class notes for this special topic. This is a new subject that is not completely covered in any known text on bioinformatics, information theory or genetics.

1. J. Pevsner, Bioinformatics and Functional Genomics, Wiley-Liss, 2003.
2. D. Gusfield, Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology.
3. M. Waterman, Introduction to Computational Biology, 1995.
4. R.S. Hawley and C. Mori, The Human Genome, Harcourt Academic Press, 1999.
5. J. Percus, Mathematics of Genome Analysis, Cambridge Studies in Mathematical Biology, 2002.
6. A. M. Findley, S. P. McGlynn, and G. L. Findley, The Geometry of Genetics, 1989.
7. C. Adams, The Knot Book: An Elementary Introduction to the Mathematical Theory of Knots, American Mathematical Society, 2004.

Gregor Mendel (1822-1884)

Homework, Exam and Class Project Policy: There will be two (take-home) midterm exams, bi-weekly homework assignments and a final class project. Each exam carries 25% of the final grade, the class project carries 30%, and homework accounts for the remaining 20%. The homework and project assignments will be slightly different for undergraduate and graduate students in the class.
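As a quick check of how the weights above combine, here is a minimal sketch in Python. The function name, the dictionary keys and the 0-100 score scale are assumptions made for illustration only; the syllabus does not specify how component scores are scaled or rounded.

```python
# Weights taken from the policy above; all names here are hypothetical.
WEIGHTS = {
    "midterm_1": 0.25,
    "midterm_2": 0.25,
    "project": 0.30,
    "homework": 0.20,
}


def final_score(scores):
    """Return the weighted final score for component scores on a 0-100 scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# The four weights should account for the whole grade.
assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9

# Made-up scores, used only to show the call.
example = {"midterm_1": 80, "midterm_2": 90, "project": 85, "homework": 95}
print(final_score(example))
```

Because the project carries the largest single weight, a given change in the project score moves the final score more than the same change in either midterm or in homework.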

Rosalind Franklin (1920-1958)

Living beings are endowed with highly complex information storage and processing systems that are regulated at many different levels, including the macromolecular code stratum. Although some of the components of such systems are well-analyzed from the biochemical point of view, very little is known about their governing information-theoretic principles. It is therefore important to develop an understanding of the mathematical, combinatorial, coding-theoretic, signaling and communication system aspects of biological units that lead to their observable self-organization and information processing traits. Furthermore, it is of interest to investigate how known techniques from the area of information-content based data analysis can be applied in the context of studying biological carriers of information.

The RNA Tie Club

The goal of this class is to provide a basic introduction to the problems encountered in modern molecular biology that can be investigated from the perspective of information theory. The topics to be covered include:

• A short introduction to the theory of biological macromolecules: DNA, RNA and proteins; you will receive a handout in the form of a disk by Roche Genetics outlining the basic principles of genetics;
• Mendel’s laws of inheritance and some elementary probability;
• Sequence and multiple sequence alignment: micro-arrays and gene expressions;
• Basic Local Alignment Search Tool (BLAST);
• Molecular phylogeny and evolution;
• DNA knots and knot invariants: an overview of related coding and statistical physics problems;
• A description of non-extensive entropies and their application for measuring the information content of DNA strands;
• An introduction to fractal and multi-fractal sequences and combinatorial modeling of DNA strand properties; a description of the problem of DNA compression;
• Coding theory and new distance measures for genetic sequence comparison and analysis;
• Information-theoretic aspects of DNA proofreading;
• A treatment of gene regulatory networks and novel modeling techniques for tumorigenesis, with application to disease diagnostics and treatment;
• Fault tolerance analysis of genetic networks;
• The information-theoretic aspects of RNA/DNA folding, and the related problem of protein folding, with applications to DNA computing/self-assembly;
• Several special topics to be chosen by the class!

(1) If you qualify for accommodations because of a disability, please submit to me a letter from Disability Services in a timely manner so that your needs may be addressed. Disability Services determines accommodations based on documented disabilities. Contact: 303-492-8671, Willard 322, and http://www.Colorado.EDU/disabilityservices

(2) Campus policy regarding religious observances requires that faculty make every effort to reasonably and fairly deal with all students who, because of religious obligations, have conflicts with scheduled exams, assignments or required attendance. In this class, {{insert your procedures here}} See full details at http://www.colorado.edu/policies/fac_relig.html

(3) Students and faculty each have responsibility for maintaining an appropriate learning environment. Students who fail to adhere to such behavioral standards may be subject to discipline. Faculty have the professional responsibility to treat all students with understanding, dignity and respect, to guide classroom discussion and to set reasonable limits on the manner in which they and their students express opinions. Professional courtesy and sensitivity are especially important with respect to individuals and topics dealing with differences of race, culture, religion, politics, sexual orientation, gender variance, and nationalities. Class rosters are provided to the instructor with the student's legal name. I will gladly honor your request to address you by an alternate name or gender pronoun. Please advise me of this preference early in the semester so that I may make appropriate changes to my records. See policies at http://www.colorado.edu/policies/classbehavior.html and at http://www.colorado.edu/studentaffairs/judicialaffairs/code.html#student_code

(4) All students of the University of Colorado at Boulder are responsible for knowing and adhering to the academic integrity policy of this institution. Violations of this policy may include: cheating, plagiarism, aid of academic dishonesty, fabrication, lying, bribery, and threatening behavior. All incidents of academic misconduct shall be reported to the Honor Code Council at [email protected]; 303-725-2273. Students who are found to be in violation of the academic integrity policy will be subject to both academic sanctions from the faculty member and non-academic sanctions (including but not limited to university probation, suspension, or expulsion). Other information on the Honor Code can be found at http://www.colorado.edu/policies/honor.html and at http://www.colorado.edu/academics/honorcode/

(5) The University of Colorado at Boulder policy on Discrimination and Harassment (see http://www.colorado.edu/policies/discrimination.html), the University of Colorado policy on Sexual Harassment and the University of Colorado policy on Amorous Relationships apply to all students, staff and faculty. Any student, staff or faculty member who believes s/he has been the subject of discrimination or harassment based upon race, color, national origin, sex, age, disability, religion, sexual orientation, or veteran status should contact the Office of Discrimination and Harassment (ODH) at 303-492-2127 or the Office of Judicial Affairs at 303-492-5550. Information about the ODH and the campus resources available to assist individuals regarding discrimination or harassment can be obtained at http://www.colorado.edu/odh

List of handouts

1) Class Syllabus
2) The Cell (lecture notes)
3) The Guide: The famous DNA experiments http://www.dnaftb.org/dnaftb/
4) The Guide: The packing of DNA in eukaryotes http://library.thinkquest.org/27819/ch6_2.shtml
5) Chromatin Structure: Section D http://www.web-books.com/MoBio/Free/Ch3D.htm
6) Ehrenfeucht et al.: Computation in Living Cells
7) Glossary (from the book by Hawley and Mori)
8) Signal Processing in the Genetic Channel
9) The Genetic Code (Geometry of Genetics)
10) Mutations: Damage and Repair of DNA (Chapter 7), Maroni
11) World’s Toughest Bacterium has a Taste for Waste
12) Male Chromosome to Stick Around
13) DNA Replication, Repair, and Recombination (Garland Science, Chapter 6) http://www.bios.co.uk/textbooks/081533480X/pdf/ch06.pdf
14) Control of Gene Expression http://staff.jccc.net/pdecell/expression/control.html
15) The Cell Cycle: A Universal Cellular Division Program http://www.bioteach.ubc.ca/CellBiology/TheCellCycle/
16) Cancer: Basic Facts (NIH Web Page)
17) Genes and Cancer, Chapter 17 (from the book by Hawley and Mori)
18) Gene Regulatory Network – Circuit Representation of the Cytokine Complex
19) How do we Sequence DNA? http://seqcore.brcf.med.umich.edu/doc/educ/dnapr/sequencing.html
20) Nano-technology for DNA sequencing
21) Whole Genome DNA Sequencing (Gene Myers)
22) Towards Simplifying and Accurately Formulating Fragment Assembly (Gene Myers)
23) Saad Mneimneh’s web-page http://engr.smu.edu/~saad/courses/cse8354/ Lecture 3,4,15 Slides 15
