Toponomical Proteomics
Total Page:16
File Type:pdf, Size:1020Kb
AUTOMATED QUANTITATIVE ANALYSIS METHODS FOR TRANSLOCATION OF BIOMOLECULES IN RELATION TO MEMBRANE STRUCTURES Von der Fakult¨atf¨urMathematik, Informatik und Naturwissenschaften der RWTH Aachen University zur Erlangung des akademischen Grades einer Doktorin der Naturwissenschaften genehmigte Dissertation vorgelegt von Master of Science in Life Science Informatics OLGA DOMANOVA aus Sankt Petersburg, Russische F¨oderation Berichter: Universit¨atsprofessor Dr. rer. nat. Thomas Berlage Universit¨atsprofessorDr. rer. pol. Matthias Jarke Universit¨atsprofessorDr. med. Fabian Kießling Tag der m¨undlichen Pr¨ufung:18. Oktober 2013 Diese Dissertation ist auf den Internetseiten der Hochschulbibliothek online verf¨ugbar. ii Abstract Biological processes are complex study objects due to their dynamic nature and structural diversity of living organisms. To study dynamic processes statistically, numerous experiments with multiple observations have to be performed, and data have to be analyzed and evaluated. Owing to great technological advances, gigabytes of data are being acquired both in research and industry. Slow and subjective manual analyses are not sufficient anymore, and automated evaluation methods are required. The distribution of biomolecules provides valuable information on a current biological state. The distribution of biomolecules depends on and is influenced by functions of biomolecules, and may thus be used to detect abnormalities. The relatively young research field toponomics de- scribes the laws of spatial arrangement of molecules. Several evaluation methods have previously been developed, automatized and standardized. However, no standard evaluation methods have been reported to quantitatively analyze such an important biological process like translocation of biomolecules. Translocation processes are vital for living organisms. For instance, substance inclusion into a cell or exclusion from it represent a translocation. Furthermore, signaling biomolecules translocate from the cytoplasm across the nuclear membrane into the nucleus to influence gene and protein expression. Investigating translocation processes may help to understand complex biological functions. It may also be used to analyze signaling events, or may even be employed for diagnostics and therapy monitoring. Manual and case-specific methods for quantitative translocation analysis are known, but fail to be generally applicable. Therefore, I have developed a novel generic automated approach. The method is based on microscopy images of biological samples. I have defined a generic method to quantitatively express distribution of biomolecules in numeric descriptors. Herewith, changes in distribution may be analyzed using different biological samples. Thus, the samples analyzed do not necessarily have to belong to a time series. Furthermore, not only cell cultures, but also tissue samples can be used for the analysis. Evaluations of cell cultures are simpler due to homogeneity and spatial separation of individual objects. However, structural polarity of the cells can be seen only in tissues. I have developed two workflows based on numeric descriptors for the distribution of bio- molecules. The first workflow uses structure detection in images to localize the objects for evaluation. The second workflow avoids this complex operation by a structure-independent information extraction strategy. Both workflows are generic and may be applied to quantify a wide range of translocation processes. iii iv Kurzfassung Biologische Prozesse sind komplex aufgrund ihrer Dynamik und struktureller Vielfalt leben- der Organismen. Um dynamische Prozesse statistisch zu untersuchen, sollten zahlreiche Exper- imente mit mehreren Beobachtungen durchgef¨uhrtwerden. Dank der großen technologischen Fortschritte werden Gigabyte von Daten erfasst, sowohl in der Forschung als auch in der In- dustrie. Diese m¨ussenanalysiert und ausgewertet werden. M¨uhsameund subjektive manuelle Analysen sind nicht mehr ausreichend, und automatisierte Auswerteverfahren sind unerl¨asslich. Die Verteilung von Biomolek¨ulenliefert wertvolle Informationen zum aktuellen biologischen Zustand von Organismen. Der biologische Zustand h¨angtvon der Aktivit¨atvon Biomolek¨ulen ab. Deswegen kann die Verteilung von Biomolek¨ulenanalysiert werden um Anomalien zu de- tektieren. Das relativ junge Forschungsfeld Toponomics beschreibt die Gesetze der r¨aumlichen Anordnung von Molek¨ulen.Mehrere in der Literatur beschriebene Auswerteverfahren wurden automatisiert und standardisiert. Es gibt jedoch keine Standardauswerteverfahren f¨urquanti- tative Analyse eines wichtigen biologischen Prozesses wie die Translokation von Biomolek¨ulen. Translokationsprozesse sind lebenswichtig. Aufnahme von Substanzen in eine Zelle oder deren Ausschluss stellen, zum Beispiel, eine Translokation dar. Außerdem translozieren Signal- molek¨uleaus dem Zytoplasma durch die Kernmembran in den Zellkern, um Gen-und Protein- expression zu beeinflussen. Untersuchung der Translokationsprozesse k¨onnte helfen, komplexe biologische Funktionen zu verstehen. Translokationsprozesse k¨onnten auch analysiert werden, um Signalwege zu identifizieren, oder sogar um Diagnose- und Therapieverfahren zu entwickeln. Unterschiedliche Methoden f¨urdie quantitative Analyse von Translokation sind bekannt, aber meist manuell oder fallspezifisch und damit nicht allgemein anwendbar. Deshalb habe ich einen neuen generischen automatisierten Ansatz entwickelt. Das Verfahren basiert auf Mikroskopiebildern von biologischen Proben. Ich habe eine generische Methode definiert, um die Verteilung von Biomolek¨ulenin numerischen Deskriptoren quantitativ auszudr¨ucken. Hiermit k¨onnenVer¨anderungenin der Verteilung in unterschiedlichen biologischen Proben analysiert werden. Dies ist vorteilhaft, da die Methode unabh¨angigdavon funktioniert, ob Einzelbilder vorliegen, oder Zeitreihen. Dar¨uber hinaus k¨onnennicht nur Zellkulturen, sondern auch Gewebeproben zur Analyse verwendet werden. Auswertungen von Zellkulturen sind einfacher aufgrund der Homogenit¨at und r¨aumlicher Trennung der einzelnen Objekte. Allerdings kann strukturelle Polarit¨atder Zellen nur in Gewebe sichtbar sein. Ich habe zwei Workflows basierend auf numerischen Deskriptoren f¨urdie Verteilung von Biomolek¨ulenentwickelt. Der erste Workflow setzt Strukturerkennung in Bildern ein, um Ob- jekte f¨urAuswertung zu lokalisieren. Der zweite Workflow vermeidet diese komplexe Operation durch eine strukturunabh¨angigeInformationsextraktionstrategie. Beide Workflows sind gener- isch und k¨onnten angewendet werden, um eine breite Palette von Translokationsprozessen zu quantifizieren. v vi Acknowledgements The work on this dissertation was a great experience due to a successful combination of an interesting research topic, available facilities and a motivating working environment. I would like to thank my colleagues at Fraunhofer FIT who helped me to get acquainted with the available technical basis and were supportive at all times. My supervisor Professor Thomas Berlage has inspired and guided my work by interesting and constructive discussions. His comments and critics have significantly influenced the whole development and this writing in particular. He was the first professor that I met in my (not very long yet) academic career, who invested so much time in reading and correcting. Stefan Borbe was a great support for the programming side, immediately resolving any arising issue. Dr. Peter Wißkirchen has not only advised me on mathematical and statistical questions but also shared his great experience in image processing. Interdisciplinary research is strongly influenced by and even dependent on the existing collab- orations. Therefore, I would like to thank our colleagues from the University Clinic D¨usseldorf for their effort. Professor Dieter H¨aussingerand Professor Ralf Kubitz supervised the biolog- ical research and experiments, while also being very helpful in interpreting statistical results. Dr. Stefanie Kluge has standardized biological SOPs which are the basis of all our experi- ments. Fruitful discussions with her made it possible to translate some biological knowledge into technical descriptions, which is a great achievement. To my teachers and professors in Saint-Petersburg, I am also very thankful. They showed me to enjoy studying, to have fun with it and to be sure that the more I learn, the more is still left to be learned. Further, I would like to thank DAAD for the scholarship that permitted me to conduct my master studies. It was the first step towards this doctoral thesis. I am especially thankful to Doctor Markus Mathyl, the former DAAD lector in Saint-Petersburg. He was very supportive, helped me a lot in finding information and going through the Russian-German burocracy. This work, as many other things in my life, would not have been possible without the support of my family. I am very grateful to my husband for motivating me all the time, reminding me that there's a life beyond the laptop screen, and, of course, for the technical support at home. A very special gratitude goes to my mother who was and is always there for me, sacrificing everything just to make my dreams come true. My son, my little sunshine, I thank you for the endless happiness that I feel only from looking at you :) You are the one who showed me the importance of time management better than any professor ever could :) And last but not least, DFG (KFO-217) financially supported the project that included my doctoral thesis. I also am very thankful to the B-IT Research