Vol. 16 no. 1 2000 BIOINFORMATICS Pages 48–64

Serendipity in bioinformatics, the tribulations of a Swiss bioinformatician through exciting times!

Amos Bairoch

Swiss Institute of Bioinformatics, Centre Medical Universitaire, 1 rue Michel Servet, 1211 Geneva 4, Switzerland; E-mail: [email protected]

© Oxford University Press 2000

Introduction

This is a personal recollection of the events that led me to develop software tools and databases in the context of what has recently been termed proteomatics (bioinformatics in the context of proteomics). As will be manifest from this article, the creations of PC/Gene, SWISS-PROT, PROSITE and ExPASy were mostly serendipitous, unplanned events. From the very beginning of my biochemistry studies in 1978 up to today, I was extremely lucky to be able to pursue my combined interests in proteins and computer analysis and to be able to follow new avenues when they opened up. I also feel privileged to have met and collaborated with many researchers whose work has been instrumental in the emergence of the field of bioinformatics. More significantly, many of these people became much more than colleagues. They are friends scattered around the world, united by a common passion, uncovering the meaning of the genetic information. It is to these friends that I dedicate this article.

The early days

As a teenager, I was interested in space exploration and the search for extraterrestrial life. After graduation from high school in 1977 I thought that studying for a university degree in biochemistry was a good way to train to be an exobiologist! In high school I had also become acquainted with computers. We had access to a time-sharing Honeywell Bull mainframe system that we could program in FORTRAN using a teletype console. We could also use a Wang series 700 programmable calculator that had a capacity of about 1000 programming steps. It was a wonderful feeling to be able to program a machine that was not dependent on a large computing center and one that I could fully exploit without time constraints.

In 1978, during my first year at university, my father bought me a Radio Shack TRS-80 microcomputer. With it I wrote game programs using the first versions of the Microsoft Basic programming language. During the summer I worked in a Lausanne medical research laboratory that was developing radio-immunoassays. This led me to write programs to analyze the results of this type of experiment. When I returned to Geneva, I contacted the Clinical Biochemistry Institute (IBC) of the University of Geneva which, I thought, might be interested in these programs.

I was very lucky! The day I visited the IBC I was introduced to a visiting professor from Oxford, Robin Offord. Robin Offord, a biochemist and previously a nuclear physicist, is an expert in the semi-synthesis of proteins. In addition, Robin has a long-lasting interest in and knowledge of computer systems that started with Mercury Autocode in 1959. Robin convinced the head of the IBC, Albert Renold, to hire me part-time to write data analysis programs. He asked me what computer I thought the lab should buy. I indicated that a small Californian company had recently announced a new microcomputer that seemed to correspond to the needs of the lab. I got the green light to buy what was the first Apple II microcomputer in Geneva. From 1979 to the beginning of 1981 I pursued my studies to obtain my bachelor's degree in biochemistry and wrote programs for the IBC during my spare time.

In 1980 Robin Offord moved from Oxford to Geneva to head the Department of Medical Biochemistry in the medical faculty. When I obtained my BSc in 1981, I started to look for a lab where I could do a Master’s degree that would allow me to do both ‘wet’ (laboratory benchwork) and ‘dry’ (computer) work. Robin suggested that I do my thesis in his department under the supervision of Keith Rose. Keith is an expert in mass spectrometry and he had just received a brand new machine from Kratos. Unfortunately it was not working according to the specifications and a Kratos engineer was spending most of his time in Geneva troubleshooting the machine. As I waited to start the experimental part of my master’s thesis, I decided to tackle the computer system that was running the mass spectrometer. For that time, and especially for a young student such as myself, this was an impressive Digital Research NOVA-4 minicomputer with a 5 Mb removable hard disk cartridge. Furthermore it was hooked up to an HP graphic terminal. We hoped that when we did manage to get the mass spectrometer running, we would use it to characterize and even, in some cases, acquire bits of protein sequence data. On the premise of doing data analysis with the NOVA-4, I started developing programs for protein sequence analysis. This software suite included implementations of the Needleman and Wunsch similarity search and the Garnier secondary structure prediction algorithms. Also, I wrote programs to identify proteins on the basis of their amino-acid composition and to simulate the cleavage of proteins by a series of enzymatic or chemical methods.

In the process of building up this set of software tools I also typed in more than 1000 protein sequences. Some of them were entered from literature reports but the large majority were printed in a wonderful series of books, the Atlas of Protein Sequence and Structure. The ‘Atlas’, as it was then known, had first been published in 1965 by Margaret Dayhoff from the National Biomedical Research Foundation (NBRF) in Washington DC. In 1981 the Atlas consisted of a large book (the fifth edition of 1978) and three supplements. In total it listed 1660 protein sequences. At the time the Atlas was not available on any computer media.

I never had a chance to use the mass spectrometer and from then on I left the ‘wet’ lab and became what is now called a ‘bioinformatician’. Of course this term did not exist at that time and people like me were generally thought of as failed researchers playing around with computers!

In February 1982 I published my first paper; it was a letter to the Biochemical Journal suggesting that research groups publishing protein sequences should compute and print a simple checksum that would ‘facilitate the detection of typographical and keyboard errors’. To the best of my knowledge this checksum system was never used in a publication (but was implemented for more than 10 years in the SWISS-PROT SQ line) and nobody ever noticed that the example peptide sequence in the paper spelled out ‘Help I hate math’! I also learned a useful lesson from that paper: standards or nomenclature are only used when they can be enforced!

The Sirius-1

In late 1981 I saw an article in a US newspaper describing a new microcomputer based on the Intel 8088 processor. It was called the Sirius-1. Sirius, the company that built it, had been created by Chuck Peddle, the designer of the Motorola 6502 CPU chip that powered the Apple II and the Commodore Pet, one of the first affordable personal computers. [...] convinced Robin to allow me to buy this machine for the department and managed to obtain from the Swiss distributor the first exemplar to be imported into the country. The decision to buy such a microcomputer and use it for scientific applications at a time when this was the hallmark of centralized time-sharing systems and expensive minicomputers had far-reaching consequences.

The BIONET user group

The first consequence of the arrival of the Sirius-1 in our department was that it attracted a lot of interest from young scientists in neighboring departments. It prompted me to create a user club for life science users of microcomputers. I named it BIONET, short for Biology Network. By coincidence this name was later independently used in the USA to name an on-line forum of life science users. BIONET was meant to be a forum to share experience in the use of the Sirius-1. It was also supposed to sponsor the shared development and distribution of software tools for sequence and statistical analysis. BIONET did achieve these goals but it soon grew in unexpected directions. It expanded geographically by attracting users from Lausanne and Fribourg and it quickly attracted users outside the life science fields. It also diversified in terms of its hardware support. However, the tide of IBM PC compatible computers slowly but inexorably swamped the far superior Sirius-1.

BIONET also quickly became a forum for exchanging ‘informally obtained’ copies of commercial software packages. At the time the high cost of microcomputers and software made it impossible for academic laboratories to buy full-price licenses for most if not all the necessary software tools. In fact most software companies at that time were acutely aware of this problem and turned a blind eye to these activities. Some of them used the BIONET group as a source of expertise at a time when such expertise in PC-based software and hardware was very scarce and made their products available to our group on an unofficial basis.

In 1985 the price of software and computers went down and software and operating systems became easier to use. The University of Geneva finally recognized that microcomputers were here to stay and their usefulness went beyond ‘playing games’. With the installation of a local area network spanning all the university’s buildings and a sup-