<<

Review Evolutionary Steps in the Emergence of Life Deduced from the Bottom-Up Approach and GADV Hypothesis (Top-Down Approach)

Kenji Ikehara 1,2

1 G & L Kyosei Institute, Keihannna Labo-401, Hikaridai 1-7, Seika-cho, Sorakugun, Kyoto 619-0237, Japan; [email protected]; Tel./Fax: +81-774-73-4478 2 International Institute for Advanced Studies of Japan, Kizugawadai 9-3, Kizugawa, Kyoto 619-0225, Japan

Academic Editors: David Deamer, Bruce Damer and Niles Lehman Received: 1 November 2015; Accepted: 18 January 2016; Published: 26 January 2016

Abstract: It is no doubt quite difficult to solve the riddle of the origin of life. So, firstly, I would like to point out the kinds of obstacles there are in solving this riddle and how we should tackle these difficult problems, reviewing the studies that have been conducted so far. After that, I will propose that the consecutive evolutionary steps in a timeline can be rationally deduced by using a common event as a juncture, which is obtained by two counter-directional approaches: one is the bottom-up approach through which many researchers have studied the origin of life, and the other is the top-down approach, through which I established the [GADV]-protein world hypothesis or GADV hypothesis on the origin of life starting from a study on the formation of entirely new genes in extant microorganisms. Last, I will describe the probable evolutionary process from the formation of Earth to the emergence of life, which was deduced by using a common event—the establishment of the first genetic code encoding [GADV]-amino acids—as a juncture for the results obtained from the two approaches.

Keywords: origin of life; [GADV]-protein world; GADV hypothesis; pseudo-replication; protein 0th-order structure

1. Introduction In order to solve the riddle on the origin of life, it is important to understand the process by which the fundamental life system was formed, which is composed of three major elements: genes, genetic code, and proteins. However, it seems that even researchers studying the origin of life have not paid much attention to the formation process of the system, probably because it is quite difficult to even start to understand the process and because, when researchers address this problem, they would generally first try to understand what happened on the primitive Earth, where the first life emerged. Actually, many studies on the origin of life have been carried out from such a standpoint, the bottom-up approach, which has been intended to make clear the steps extending from the formation of Earth to the emergence of life (Figure1). The bottom-up approach is surely a general way for studying the origin of life, but it is incapable of solving this problem alone, principally because it is not able to elucidate the formation process of the fundamental life system composed of genes, genetic code, and proteins. On the other hand, the RNA world hypothesis has been the main focus for studies on the origin of life, since the RNA world hypothesis was proposed by W. Gilbert [1] to solve the so-called “chicken-egg relationship” between genes and proteins just after several catalytic RNAs or ribozymes were discovered [2,3]. The hypothesis assumes that the RNA world was first formed by self-replication of catalytic RNA(s) and, later, genetic information and catalytic function on the RNA were transferred to DNA and protein, respectively.

Life 2016, 6, 6; doi:10.3390/life6010006 www.mdpi.com/journal/life Life 2016, 6, 6 2 of 15 Life 2016, 6, 6 2 of 15

FigureFigure 1. General 1. General approach approach for for studies studies on on the the origin origin ofof life (bottom-up approach), approach), which which are arecarried carried out out to elucidateto elucidate what what happened happened on on the the primitive primitive Earth, Earth, wherewhere the first first life life emerged. emerged. Molecular Molecular evolution evolution from simple inorganic compounds would occur in order of physical evolution (dotted arrow), chemical from simple inorganic compounds would occur in order of physical evolution (dotted arrow), chemical and biochemical evolutions (broken arrow), and biological evolution (solid arrow). The difficulty in and biochemical evolutions (broken arrow), and biological evolution (solid arrow). The difficulty in understanding the evolutionary processes with the bottom-up approach is shown through the order of understandingthe thick arrows, the evolutionary from white toprocesses gray to black. with The the thin bottom-up broken arrow approach represents isshown the direction through of the the study order of the thickwith arrows,bottom-up from approach. white to gray to black. The thin broken arrow represents the direction of the study with bottom-up approach. However, there are many weak points in this hypothesis. For example, (1) nucleotides have not However,been produced there from are manysimple weak inorganic points compounds in this hypothesis. through prebiotic For example, means (1) and nucleotides have not been have not detected in any meteorites, although a small quantity of nucleobases can be obtained [4]. (2) It is quite been produced from simple inorganic compounds through prebiotic means and have not been detected difficult or most likely impossible to synthesize nucleotides and RNA through prebiotic means. (3) It in anymust meteorites, also be impossible although to a smallself-replicate quantity RNA of nucleobaseswith catalytic canactivity be obtained on the same [4]. RNA (2) It molecule. is quite difficult(4) or mostIt would likely be impossible impossible to to synthesize explain the nucleotidesformation pr andocess RNA of genetic through information prebiotic according means. to (3) the It mustRNA also be impossibleworld hypothesis, to self-replicate because the RNA information with catalytic is comprised activity of on triplet the same codon RNA sequence, molecule. which (4) would It would be impossiblenever be stochastically to explain theproduced formation by joining process of mo ofnonucleotides genetic information one by one. according (5) The formation to the RNAprocess world hypothesis,of the first because genetic thecode information cannot be explained is comprised by the hypothesis of triplet either, codon because sequence, a genetic which code would composed never be stochasticallyof around produced60 codons must by joiningbe prepared of mononucleotides to synthesize proteins one from by the one. be (5)ginning. The formation(6) It is also impossible process of the first geneticto transfer code catalytic cannot activity be explained from a folded by theRNA hypothesis ribozyme to either, a protein because with a atertiary genetic structure. code composed of around 60Thus, codons it is must only bea matter prepared of time to synthesize before it is proteinsrecognized from that the it must beginning. be impossible (6) It is to also solve impossible the riddle of the origin of life from the standpoint of the RNA world hypothesis. Actually, some to transfer catalytic activity from a folded RNA ribozyme to a protein with a tertiary structure. researchers have started to search for new ways to solve the riddle of the origin of life, especially in Thus,recent ityears. is only For a matterexample, of co-evolution time before ittheory is recognized between thatpeptides it must and be RNA impossible has been to provided solve the by riddle of theB.R. origin Francis, of life in fromorder the to standpointreinforce the of instability the RNA worldof RNA hypothesis. by hypothesizing Actually, that some the researchersearliest havepredecessor started to search of the fornucleic new acids ways was to solve a β-linked the riddle polyester of the made origin from of life,malic especially acid [5]. Restoration in recent years. For example,from the gene-early co-evolution theory theory into the between metabolism-early peptides theory and RNAhas also has been been considered provided by P.T.S. by B.R. van Francis,der in orderGulik to and reinforce D. Speijer the [6,7]. instability Caetano-Anollés’s of RNA group by hypothesizing suggest that prim thatordial the metabolic earliest domains predecessor evolved of the nucleicearlier acids than was informational a β-linked polyesterdomains madeinvolved from in malictranslation acid [and5]. Restorationtranscription, fromsupporting the gene-early the theorymetabolism-first into the metabolism-early hypothesis rather than theory the RNA has world also beenscenario considered [8]. by P.T.S. van der Gulik and D. SpeijerI[ 6consent,7]. Caetano-Anollés’s to their ideas because group it suggestsis important that to primordial suppose a stable metabolic predecessor domains of evolvedRNA or to earlier stabilize fragile RNA with peptides, especially in the watery circumstances on the primitive Earth, than informational domains involved in translation and transcription, supporting the metabolism-first and RNA having a complex chemical structure would not be synthesized without a metabolism using hypothesiscatalytic rather peptides than and/or the RNA proteins world as their scenario aggregates. [8]. However, it must also be impossible to solve the Iriddle consent of the to origin their ideasof life even because using it both is important the co-evolution to suppose theory aand stable the metabolism-early predecessor of theory, RNA or to stabilizesince fragile genetic RNA information with peptides,composed especiallyof triplet base in sequences the watery would circumstances never be stochastically on the primitive formed on Earth, and RNAthe RNA, having which a complex is synthesized chemical by structure joining mononu wouldcleotides not be synthesized one by one—even without if apeptides metabolism could using catalyticstabilize peptides the RNA and/or sequence proteins and aas metabolism their aggregates. with peptide However, catalysts it could must assist also to be accumulate impossible RNA to solve the riddleon the of primitive the origin Earth. of life In evenfact, Kurland using both is also the considering co-evolution that theory RNA coding and the is metabolism-earlynot a sine qua non for theory, sincethe genetic accumulation information of catalytic composed polypeptides of triplet [9]. base It sequencesis also proposed would by never Maury be stochasticallythat the amyloid formed on theworld—which RNA, which would is synthesized be stable, self-replicative, by joining mononucleotides environmentally one responsive, by one—even and evolvable if peptides under could early Earth conditions—preceded the RNA world, since it is unlikely that functional RNA could stabilize the RNA sequence and a metabolism with peptide catalysts could assist to accumulate RNA have existed under the presumably very harsh conditions on the early Earth [10,11]. In addition, on theCarter primitive and Wächtershäuser Earth. In fact, also Kurland give critic isisms also against considering the RNA that world RNA theory coding [12,13]. isnot a sine qua non for the accumulationUnder such a ofrecent catalytic situation, polypeptides a question, “Why [9]. It is is the also origin proposed of life still by a Maurymystery?”, that was the given amyloid world—whichin a workshop would of OQOL be stable, 2014, which self-replicative, was held atenvironmentally the International Institute responsive, for Advanced and evolvable Studies of under earlyJapan, Earth conditions—precededKyoto, Japan, 12–13 July the 2014. RNA In the world, worksh sinceop, itP.L. is unlikely Luisi [14] that presented functional a lecture RNA entitled could have existed under the presumably very harsh conditions on the early Earth [10,11]. In addition, Carter and Wächtershäuser also give criticisms against the RNA world theory [12,13]. Under such a recent situation, a question, “Why is the origin of life still a mystery?”, was given in a workshop of OQOL 2014, which was held at the International Institute for Advanced Studies of Life 2016, 6, 6 3 of 15 Life 2016, 6, 6 3 of 15

Japan,“The need Kyoto, of Japan,a new 12–13 start Julyfrom 2014. ground In the zero,” workshop, and S. P.L.Benner Luisi [15] [14 provided] presented five a lecture paradoxes entitled on “Thethe needorigin of of a life. new Therefore, start from it ground is obvious zero,” that and the S. riddle Benner of the [15 ]origin provided of life five still paradoxes remains unanswered on the origin in of life.spite Therefore, of the strenuous it is obvious efforts that made the riddleby many of the resear originchers of lifeworking still remains in the field. unanswered Thus, I inbelieve spite ofthat the strenuouselucidation efforts of the made riddle by will many be researchersdelayed if workingresearchers in thecontinue field. Thus,to persist I believe with thatthe elucidationRNA world of thehypothesis. riddle will be delayed if researchers continue to persist with the RNA world hypothesis. Then,Then, wherewhere do we start in solving the riddle riddle of of the the origin origin of of life? life? Adoption Adoption of of a anew new approach approach thatthat nobodynobody hashas triedtried so far could lead to to solving solving the the riddle riddle of of the the origin origin of of life. life. This This new new approach approach mightmight bebe aa top-down top-down approach, approach, studying studying first first the the fundamental fundamental genetic genetic system system of of extant extant organisms organisms and theand subsequent the subsequent emergence emergence of life of (Figurelife (Figure2). 2).

FigureFigure 2.2. Top-down approach carried out out to to highlight highlight th thee direction direction of of evolution evolution of of the the modern modern life life systemsystem intointo thethe emergence of life. Molecular Molecular evolution evolution would would occur occur in in order order of of biochemical biochemical evolution evolution (broken(broken arrow)arrow) and and biological biological evolution evolution (solid (solid arrow). arrow). It It becomes becomes difficult difficult to maketo make clear clear the the steps steps with thewith top-down the top-down approach approach in the orderin the of order the biological of the biological and biochemical and biochemical evolutions, evolutions, shown by shown white by and blackwhite thickand arrows,black thick respectively. arrows, resp Theectively. thin broken The arrowthin broken represents arrow the represents direction the of the direction study withof the the top-downstudy with approach. the top-down approach.

In this review, I would like to propose that cooperation of two counter-directional approaches In this review, I would like to propose that cooperation of two counter-directional approaches would be effective in solving the riddle on the origin of life, and I will describe probable steps in the would be effective in solving the riddle on the origin of life, and I will describe probable steps in the emergence of life deduced from the outline in the last section of this review. emergence of life deduced from the outline in the last section of this review.

2.2. General General Approach Approach for for the the Study Study of of th thee Origin Origin of of Life Life (Bottom-Up (Bottom-Up Approach) Approach)

2.1.2.1. ChemicalChemical Analysis of Extracts from Old Rocks Rocks and and Meteorites Meteorites ManyMany researchersresearchers analyzed chemically organic organic compounds compounds in in extracts extracts from from rocks rocks and and fossils fossils on on thethe primitiveprimitive EarthEarth andand investigatedinvestigated extracts extracts of of carbonaceous carbonaceous meteorites meteorites from from space, space, which which might might provideprovide insightinsight into into the the situation situation of of the the primitive primitive Earth. Earth. Thus, Thus, many many early early works works were were carried carried out out from thefrom standpoint the standpoint of the bottom-upof the bottom-up approach, approach, and important and important experimental experimental results haveresults been have obtained, been asobtained, follows: as follows: 1. Twelve amino acids were found from about 3.1 billion-year-old pre-Cambrian Fig Tree Chart 1. Twelve amino acids were found from about 3.1 billion-year-old pre-Cambrian Fig Tree Chart from from the mining area of eastern Transvaal, South Africa [16]. Extracts from 3.8 billion-year-old the mining area of eastern Transvaal, South Africa [16]. Extracts from 3.8 billion-year-old Isua Isua rock formed on the primitive Earth were also chemically analyzed, and both simple amino rock formed on the primitive Earth were also chemically analyzed, and both simple amino acids acids and hydrocarbons were detected from the extracts [17]. However, Nagy et al. have and hydrocarbons were detected from the extracts [17]. However, Nagy et al. have concluded, concluded, based on the extent of amino acid racemization, that the amino acids may be modern based on the extent of amino acid racemization, that the amino acids may be modern to a few tens to a few tens of thousands of years old [17]. Van Zuilen et al. also asserted that previously of thousands of years old [17]. Van Zuilen et al. also asserted that previously presented evidence presented evidence for ancient traces of life in the highly metamorphosed Early Archaean rock, for ancient traces of life in the highly metamorphosed Early Archaean rock, or the metasomatic or the metasomatic rock records, should be reassessed, which were earlier thought to provide therock basis records, for inferences should be about reassessed, early life which [18]. were earlier thought to provide the basis for inferences 2. Severalabout early amino life acids [18]. were detected in extracts of meteorites from space, such as Marchison 2. meteorite,Several amino which acids may werebe similar detected to rocks in extracts on the of primitive meteorites Earth from [19]. space, Therefore, such as nowadays, Marchison aminometeorite, acids which identified may be in similar the Murchison to rocks on chondroritic the primitive meteorite Earth [19]. are Therefore, thought nowadays, to have aminobeen deliveredacids identified to the early in the Earth Murchison by meteorites, chondroritic astero meteoriteids, comets, are and thought interplanetary to have been dust delivered particles, to whichthe early may Earth trigger by meteorites,the appearance asteroids, of life comets,by assisting and interplanetaryin the synthesis dust of proteins particles, via which prebiotic may polymerizationtrigger the appearance reactions of [20 life–22]. by assisting in the synthesis of proteins via prebiotic polymerization reactions [20–22]. Life 2016, 6, 6 4 of 15

2.2. Physical Evolution Experiments Molecular evolution from simple inorganic compounds would occur in order of physical evolution, chemical evolution, and biochemical evolution to produce biologically important organic compounds, such as amino acids and peptides. However, inorganic compounds in the primitive atmosphere cannot generally react to produce organic compounds if energy is not supplied, because of their poor reactivities. Therefore, physical evolution experiments were carried out using various energy suppliers.

1. About 60 years ago, Miller published results showing that amino acids and nucleobases were synthesized by repeated electrical discharging into a reducing gas mixture containing CH4, NH3, H2, and H2O, which imitates lightning in the primitive atmosphere [4,23–25]. Many geologists today consider that the early atmosphere was rather weakly reducing or even neutral, composed of mainly CO2 and N2, based on later studies on the primitive atmosphere [26]. However, it has been confirmed that significant amounts of amino acids can still be synthesized even with weakly reducing or neutral primitive atmospheric gas [27,28]. 2. It is supposed that deep-sea hydrothermal environments were important sites for the synthesis of bioorganic molecules, leading to the emergence of life [29]. Simple amino acids were synthesized with experimental equipment mimicking hydrothermal vents in a deep sea on the primitive Earth [30]. It has been also demonstrated that oligopeptides were synthesized from glycine in a flow reactor simulating a submarine hydrothermal system [31–33]. 3. Oró et al. suggested that cometary collisions with the primitive Earth provided the planet with both free energy and volatiles as important sources for creation of transient, gaseous environments, in which prebiotic synthesis may have taken place [21]. Experiments reproducing meteorite impacts or heavy bombardments to the primitive Earth were also carried out. After the impact, numerous organic molecules, including fatty acids, amines, nucleobases, and amino acids were detected. So, it is considered that organic molecules on the early Earth may have arisen from such impact syntheses [34,35].

2.3. Planetary Exploration and Astronomical Observation The cosmic ancestry hypothesis—like the hypothesis, assuming that life on Earth was seeded from space—has been proposed as an explanation of the origin and evolution of life on Earth. In recent years, the idea has been expanded to include the introduction of organic compounds as amino acids from space onto the primitive Earth. Such being the case, nowadays, many investigations—for example, space exploration of planets, asteroids, satellites, and comets with a probe and observation of them using various kinds of radio telescopes—have been prosperously carried out as one main current of studies for solving the riddle of the origin of life. Actually, sand and dust, which were carried back by space exploration of the asteroid Itokawa, were chemically analyzed [36]. Scientific analyses have been also carried out by instruments on the probe, which were soft-landed onto the comet Churyumov–Gerasimenko [37], because the comet may have some analogy with the primitive Earth. Ethylene oxide and acetaldehyde were detected through multiple lines in the hot cores NGC 6334F, G327.3 ´ 0.6, G31.41 + 0.31, and G34.3 + 0.2, as shown in the results of spectrum analysis using a telescope of the atmosphere of planets similar to the primitive Earth [38]. However, any convincing facts that explain the origin of life have not been discovered by the space extrapolations and astronomical observations so far.

2.4. Chemical Evolution Experiments A variety of organic compounds including [GADV]-amino acids, Gly [G], Ala [A], Asp [D], and Val [V] could be synthesized by lightning in the primitive atmosphere and by comet and/or meteorite impacts on the surface of the primitive Earth. In addition, those compounds were also delivered by Life 2016, 6, 6 5 of 15 meteorites and asteroids, as described above. So, it is considered that amino acids were naturally abundant in various prebiotic contexts [39]. Wächtershäuser also reported that many kinds of organic compounds including aspartic acid would be produced by pyrite-pulled reduction in the presence of H2S and NH3 and accumulated on the primitive Earth [40]. Peptides could be produced with amino acids accumulated on the planet in the hot and salty environments of the primordial earth, as shown with the salt-induced peptide formation in a hot prebiotic ocean in the presence of metal ions as Cu2+ [41–43].

2.5. Biochemical Evolution Experiments It has been confirmed that some peptides have various kinds of catalytic activities. For example, not only dipeptide, Ser-His, but also Gly-Gly and Gly-Gly-Gly catalyze peptide bond formation between amino acids, although less efficiently [44]. It has also been reported that simple peptides, especially His-containing peptides, could polymerize not only amino acids but also nucleotides into short oligomers under plausible prebiotic conditions [41,45]. However, Ser-His dipeptide could not play any role in biochemical evolutionary process on the primitive Earth, even if the dipeptide has a high catalytic activity, because it would be hard to obtain the dipeptides on the primitive Earth due to the low abundance of His. On the contrary, Gly-Gly dipeptide and Gly-Gly-Gly tripeptide would play a role, even if these peptides have only a low catalytic activity. The low activity of the simple peptide catalysts is out of the question, because it is sufficient for synthesis of organic compounds in the absence of any effective proteineous catalyst on the primitive Earth.

2.5.1. Catalytic Activities of [GADV]-Peptides Produced by Repeated Heat-Drying Processes [GADV]-P30 was produced by repeated heat-drying [GADV]-amino acid solution 30 times [42]. The repeated heat-drying cycles were carried out to mimic [GADV]-peptide synthesis in depressions on rocks or tide pools on the primitive Earth. As the number of the cycles increases, yellowish fluorescent light became strong, suggesting that some cyclic compounds were formed [46]. [GADV]-P30 hydrolyzed β-galactoside bond in MetU-Gal and amide bond (peptide bond) in Gly-pNA, resulting in production of MetU and pNA, respectively. The [GADV]-P30 did hydrolyze a peptide bond in a natural protein, bovine serum albumin (BSA) [46]. All three catalytic activities examined with [GADV]-P30 were detected, suggesting that [GADV]-peptides and/or [GADV]-proteins as their aggregates could exhibit various catalytic activities.

2.5.2. Catalytic Activities of [GADV]-Random Octapeptides [GADV]-octapeptides, which were randomly joined with solid-phase peptide synthesis, hydrolyzed BSA to give several fragments of the protein. In addition, the hydrolytic activity was much higher than that of [GADV]-P30 prepared by the repeated heat-drying treatment [46]. This indicates that even [GADV]-peptides without cyclic compounds such as diketopiperazine, which is formed during the repeated dry-heating process, can exhibit catalytic function of hydrolysis toward peptide bonds. The higher hydrolytic activity of the random octapeptides suggests that the appearance of [GADV]-peptides/proteins produced after the GNC primeval genetic code could cause the steps to the emergence of life to proceed at a higher rate than before establishment of the primeval genetic code.

2.6. Limitations of Bottom-Up Approaches The bottom-up approaches, which are generally carried out by experiments, are undoubtedly important to understand how life emerged on the primitive Earth. Indeed, the important insight that [GADV]-amino acids would accumulate on the primitive Earth was obtained through experiments [47,48]. The evolutionary processes deduced from the results obtained by the bottom-up approaches are shown in Figure3. However, on the other hand, it would be quite difficult or most likely impossible to solve the riddle of the origin of life using only bottom-up approaches. The principal Life 2016, 6, 6 6 of 15 reason is because it is impossible to know how the fundamental life system, which is composed of genes, geneticLife 2016, 6 code,, 6 and proteins, was formed. Of course, this would not be the case if6 vestigesof 15 of newly-born possessing a primordial life system were discovered from an old rock, meteorite, or hydrothermalmeteorite, or vent, hydrothermal and so vent, on. and However, so on. However, no vestige no vestige has been has been discovered discovered so so far.far. It It might might also be problematicalso be problematic that the that bottom-up the bottom-up approach approach tends tends to to focus focus on experiments experiments and, and, inevitably, inevitably, lacks lacks theoretical considerations. theoretical considerations.

Figure 3.FigureSummarized 3. Summarized results results obtained obtained by bythe the bottom-up approach approach for study for study on the on origin the originof life. of life. Physical,Physical, chemical, chemical, and biochemical and biochemical evolutions evolutions indicated indicated by by broken broken arrows arrows would would take take place place inin parallel in someparallel cases fromin some the cases accumulation from the accumulation of glycine of (Gly) glycine to the(Gly) emergence to the emergence of life. of The life. steps The steps made clear with themade bottom-up clear with approach the bottom-up are shownapproach by are thick shown white by thick arrows. white arrows. Assumed Assumed and unknownand unknown steps are steps are shown with thick gray and black arrows, respectively. shown with thick gray and black arrows, respectively. 3. A Top-down Approach for Understanding the Origin of Life ([GADV]-Protein World Hypothesis) 3. A Top-down Approach for Understanding the Origin of Life ([GADV]-Protein World Hypothesis) About 20 years ago, I thought of studying the mechanism behind how entirely new genes have Aboutbeen 20formed years in ago,presently I thought existing of microorganisms. studying the mechanismThat, fortunately, behind led me how to research entirely the new origin genes of have been formedlife. Consequently, in presently I adopted existing the microorganisms. top-down approach That, for the fortunately, research by ledchance. me The to research progress of the my origin of study is described briefly below. life. Consequently, I adopted the top-down approach for the research by chance. The progress of my study is3.1. described Origin of brieflyEntirely New below. Genes

3.1. Origin ofI Entirelystarted aNew study Genes to understand how entirely new genes—i.e., the first ancestor genes in families— are formed on the present Earth, independent of the origin of life. For that purpose, data I startedon microbial a study water-solubl to understande globular proteins how entirely and their new genes genes— obtainedi.e from., the GenomeNet first ancestor Databases genes in families—arewere analyzed formed using on the six presentproperty Earth,conditions independent (hydropathy, of α the-helix, origin β-sheet of life.and turn/coil For that structure purpose, data on microbialformations, water-soluble acidic amino globular acid, and basic proteins amino and acid their compositions). genes obtained The four protein from GenomeNet structural indexes Databases for hydrophobicity, α-helix, β-sheet, and turn/coil formations were calculated by summation using the were analyzed using six property conditions (hydropathy, α-helix, β-sheet and turn/coil structure following Equation; I(x)t = ΣI(x)a na/nt, where I(x)t, I(x)a, na, and nt are total index of a protein, index for formations,each acidicamino acid, amino each acid, amino and acid basic number amino and acidtotal compositions).amino acid number The in a four protein, protein respectively structural [49]. indexes for hydrophobicity,Necessary hydropathyα-helix, indexβ-sheet, and andsecondary turn/coil structure formations indexes wereof an amino calculated acid were by summation obtained from using the followingStryer’s Equation; textbookI(x )[50].t = Σ AcidicI(x)a n(aspartica/nt, where and glutamicI(x)t, I(x acids))a, na ,and and basicnt are (histidine, total index lysine, of and a protein, arginine) index for each aminoamino acid, acid eachcontents amino were acid arithmetically number calculated. and total From amino the acidresults, number it was found in a protein,that the six respectively structural [49]. indexes of water-soluble globular protein obtained with data of seven microbial genomes Necessary hydropathy index and secondary structure indexes of an amino acid were obtained from (Mycobacterium tuberculosis, Aeropyrum pernix, Escherichia coli, Bacillus subtilis, Haemophilus influenzae, Stryer’sMethanococcus textbook [50 genitarium,]. Acidic and (aspartic Borrelia andburgdorferi glutamic) having acids) a different and basic GC content (histidine, from each lysine, other and were arginine) amino acidall roughly contents constant, were irrespective arithmetically of a significant calculated. change of From the GC the content results, of a gene it was [49,51]. found So, the that six the six structuralstructural indexes indexes of water-soluble were used as globularthe conditions protein for judging obtained whether with dataa polypeptide of seven can microbial fold into genomesa (Mycobacteriumwater-soluble tuberculosis globular ,structure.Aeropyrum pernix, Escherichia coli, Bacillus subtilis, Haemophilus influenzae, MethanococcusFurthermore, genitarium it, was and foundBorrelia that burgdorferi an imaginary) having protein aencoded different by GCa non-stop content frame from on eachantisense other were strand of GC-rich gene (GC-NSF(a)) satisfies the six conditions at a high probability. It was also all roughlyconfirmed constant, that non-stop irrespective frame ofis produced a significant at a hi changegh probability, of the since GC contentthree termination of a gene codons [49,51 are]. So, the six structuralAT-rich indexes and GC-NSF(a) were usedis literally as the GC-rich conditions [49,51,52]. for This judging means whetherthat all imaginary a polypeptide proteins, canwhich fold into a water-solubleare encoded globular by any structure.GC-NSF(a), fold into a water-soluble globular protein structure, although they Furthermore,have several itdifferent was found amino that acid an sequences. imaginary On proteine of the encoded reasons byis because a non-stop rather frame similar on antisenseor strand ofsymmetrical GC-rich base gene compositions (GC-NSF(a)) at three satisfies codon the positions six conditions are observed at abetween high probability.GC-rich gene on It wasa also confirmed that non-stop frame is produced at a high probability, since three termination codons are AT-rich and GC-NSF(a) is literally GC-rich [49,51,52]. This means that all imaginary proteins, which Life 2016, 6, 6 7 of 15 are encoded by any GC-NSF(a), fold into a water-soluble globular protein structure, although they have different amino acid sequences, respectively. One of the reasons is because rather similar or symmetrical base compositions at three codon positions are observed between GC-rich gene on a sense strand and antisense sequence of GC-rich gene (GC-NSF(a)) [49,52,53]. Thus, I finally reached the conclusionLife 2016 that, 6, 6 entirely new genes are produced from non-stop frames on antisense strands7 of 15 of, not AT-rich,sense but strand GC-rich and microbial antisense sequence genes [49 of, 52GC-rich–55]. gene (GC-NSF(a)) [50,53,54]. Thus, I finally reached the conclusion that entirely new genes are produced from non-stop frames on antisense strands of, not 3.2. OriginAT-rich, of the but Genetic GC-rich Code microbial genes [49,52–55]. In the process of analyzing base compositions at three codon positions of GC-rich genes and the 3.2. Origin of the Genetic Code corresponding antisense sequences, I noticed that the base composition formats of both highly GC-rich genes (65%–70%)In the process and GC-NSF(a) of analyzing sequences base compositions are roughly at three SNS, codon where positions S means of GC-rich guanine: genes and G, orthe cytosine: corresponding antisense sequences, I noticed that the base composition formats of both highly GC-rich C. Then, it was confirmed by computer analysis that imaginary proteins produced under SNS code genes (65–70%) and GC-NSF(a) sequences are roughly SNS, where S means guanine: G, or cytosine: C. encodingThen, 10 it amino was confirmed acids ([GADV]-amino by computer analysis acids th plusat imaginary Glu [E], proteins Leu [L], produced Pro [P], under His [H], SNS Glncode [Q], and Arg [R])encoding satisfy 10 the amino six conditions acids ([GADV]-amino for formation acids ofplus a water-solubleGlu [E], Leu [L], globular Pro [P], His structure. [H], Gln At[Q], this and time, we publishedArg the[R]) SNSsatisfy primitive the six conditions genetic for code formation hypothesis of a water-soluble [56]. globular structure. At this time, we Further,published we the looked SNS primitive for a minimum genetic code set hypothesis of amino [56]. acids that could produce proteins satisfying four Further, we looked for a minimum set of amino acids that could produce proteins satisfying conditions (hydropacy plus three formabilities of α-helix, β-sheet, and turn/coil), in order to confirm four conditions (hydropacy plus three formabilities of α-helix, β-sheet, and turn/coil), in order to whetherconfirm there existedwhether athere genetic existed code a genetic more ancientcode more than ancient the SNSthan code,the SNS or not.code, Fromor not. the From results, the it was found thatresults, [GADV]-proteins it was found that encoded [GADV]-proteins by GNC encoded code satisfy by GNC the code four satisfy conditions, the four when conditions, about equal amountswhen of [GADV]-aminoabout equal amounts acids of [GADV]-amino are contained acids in the are proteins contained [49 in]. the This proteins means [49]. that This a groupmeans of four [GADV]-aminothat a group acids of could four produce[GADV]-amino proteins acids basically could produce comparable proteins to contemporarybasically comparable proteins. to Althoughcontemporary [GADV]-proteins proteins. do not contain basic amino acid, divalent metal ions, such as Mg2+, Although [GADV]-proteins do not contain basic amino acid, divalent metal ions, such as Mg2+, Mn2+ and Cu2+, could compensate for the lack of positive charge in the acidic [GADV]-protein. Mn2+ and Cu2+, could compensate for the lack of positive charge in the acidic [GADV]-protein. In this In this way,way, I reachedreached a a GNC-SNS GNC-SNS primitive primitive genetic genetic code code hypothesis, hypothesis, suggesting suggesting that the that universal the universal or or standardstandard genetic genetic code code originated originated from from GNC GNC codecode encoding encoding four four [GADV]-amino [GADV]-amino acids, acids,through through SNS SNS code (Figurecode (Figure4)[49 ].4) [49].

FigureFigure 4. GNC-SNS 4. GNC-SNS primitive primitive geneticgenetic code code hypothesis, hypothesis, assuming assuming that the universal that the or universalstandard genetic or standard code (both formally and substantially triplet code) originated from GNC primeval genetic code (formally genetic code (both formally and substantially triplet code) originated from GNC primeval genetic triplet but substantially singlet code) through SNS primitive genetic code composed of 10 amino acids code (formallyencoded by triplet 16 codons but substantially(formally triplet singlet but substantially code) through doublet SNScode). primitive genetic code composed of 10 amino acids encoded by 16 codons (formally triplet but substantially doublet code). This indicates that the frameworks or protein 0th-order structures for creation of entirely new Thisproteins indicates are implicitly that the written frameworks in the genetic or protein code table. 0th The-order GNC-SNS structures hypothesis for creation is quite important of entirely new for the elucidation of the origin of life too, since elucidation of the origin of the genetic code, which proteinsis arelocated implicitly at a core written position inconnecting the genetic genetic code information table. The on GNC-SNSa nucleotide hypothesissequence of RNA is quite and/or important for the elucidationDNA with the of catalytic the origin function of life of too,protein, since could elucidation lead to making of the originclear the of origins the genetic of genes code, and which is locatedproteins. at a core position connecting genetic information on a nucleotide sequence of RNA and/or DNA with the catalytic function of protein, could lead to making clear the origins of genes and proteins. 3.3. [GADV]-Protein World Hypothesis: GADV Hypothesis 3.3. [GADV]-ProteinI had considered World about Hypothesis: 15 years GADV ago that Hypothesis the first proteins must be composed of four simple [GADV]-amino acids, if the GNC-SNS primitive genetic code hypothesis is correct. At that time, I I had considered about 15 years ago that the first proteins must be composed of four simple suddenly came across the [GADV]-protein world hypothesis on the origin of life or the GADV [GADV]-amino acids, if the GNC-SNS primitive genetic code hypothesis is correct. At that Life 2016, 6, 6 8 of 15 time, I suddenly came across the [GADV]-protein world hypothesis on the origin of life or the GADVLife hypothesis. 2016, 6, 6 In other words, I noticed that [GADV]-proteins could be pseudo-replicated8 of 15 through [GADV]-protein synthesis with a [GADV]-protein having catalytic activity for peptide bond hypothesis. In other words, I noticed that [GADV]-proteins could be pseudo-replicated through formation,[GADV]-protein suggesting synthesis that various with a water-soluble [GADV]-protein globular having catalytic [GADV]-proteins activity for peptide could bond be producedformation, by the Life 2016, 6, 6 8 of 15 pseudo-replication.suggesting that The various term “pseudo-replication”water-soluble globular [GADV]-proteins means a process could whereby be produced proteins by comprised the of the samehypothesis.pseudo-replication. constituent In other set The of words, aminoterm “pseudo-replication”I acidsnoticed (composition) that [GADV]-proteins means that a process possess could whereby similarbe pseudo-replicated proteins but different comprised through water-soluble of the globular[GADV]-proteinsame structures constituent are synthesis set generated of amino with a by acids[GADV]-protein arandom (composition process having) that catalytic withoutpossess activitysimilar resorting forbut peptide different to any bond geneticwater-soluble formation, system [57]. The factsuggestingglobular that even structures that simple various are Gly-Gly generated water-soluble dipeptides by a random globular canproces catalyze s[GADV]-proteins without peptide resorting bond couldto any formation geneticbe produced system supports [57].by Thethe the idea fact that even simple Gly-Gly dipeptides can catalyze peptide bond formation supports the idea of this of this processpseudo-replication. [44]. The The notion term “pseudo-replication” of the random polymerization means a process ofwhereby [GADV]-amino proteins comprised acids of without the any process [44]. The notion of the random polymerization of [GADV]-amino acids without any genetic geneticsame system constituent led to aset new of amino theory acids on the(composition origin of) that life: possess the GADV similar hypothesis,but different water-soluble assuming that life globularsystem led structures to a new are theory generated on the by origin a random of life: proces the sGADV without hypothesis, resorting to assumi any geneticng that system life originated [57]. The originated from a [GADV]-protein world, not from the RNA world as previously thought [51,57,58]. factfrom that a [GADV]-proteineven simple Gly-Gly world, dipeptides not from can the cataly RNAze peptideworld asbond previously formation thought supports [51,57,58]. the idea Thus, of this I Thus, Iprocessformed [44].an an originalThe original notion idea, idea, of thethe the randomGADV GADV polymerizahypothesis, hypothesis,tion which of which[GADV]-amino was wasobtained obtained acids through without through studies any studiesgeneticon the on the formationsystemformation process led processto ofa new the of theory fundamentalthe fundamental on the origin life life of system system life: the fr from GADVom the the hypothesis,present present to the toassumi thepast,ng past, going that going lifeback originated in back a time in a time lapse orfromlapse with aor a[GADV]-protein with top-down a top-down approach world, approach not (Figure (Figurefrom 5the)[ 5) 57 RNA[57,58].–59 world]. as previously thought [51,57,58]. Thus, I formed an original idea, the GADV hypothesis, which was obtained through studies on the formation process of the fundamental life system from the present to the past, going back in a time lapse or with a top-down approach (Figure 5) [57,58].

Figure 5.FigureSummarized 5. Summarized results results obtained obtained by by top-down top-down approach approach for for study study of ofthe the origin origin of life. of life.Periods Periods of biochemicalof biochemical and biological and biological evolution evolution are shown are show byn brokenby broken and and solid solid arrows, arrows, respectively.respectively. The The steps

made clearsteps with made the clear top-down with the top-down approach approach are shown are withshown thick with whitethick white arrows. arrows. Steps Steps deduced deduced from the from the top-down approach are shown with thick gray arrows in the figure. top-downFigure approach 5. Summarized are shown results with obtained thick by gray top-down arrows approa in thech figure.for study of the origin of life. Periods of biochemical and biological evolution are shown by broken and solid arrows, respectively. The The probable evolutionary process from a [GADV]-protein world through the emergence of life to steps made clear with the top-down approach are shown with thick white arrows. Steps deduced the modern life system, deduced from a GADV hypothesis, are shown in Figure 6. The evolutionary The probablefrom the top-down evolutionary approach process are shown from with a thick [GADV]-protein gray arrows in the world figure. through the emergence of life to the modernsteps from life accumulation system, deduced of nucl fromeotides a to GADV formation hypothesis, of ds-(GNC) aren showngenes are in only Figure assumed6. The without evolutionary experimental results at this time. However, the steps follow a logical sequence, if the GNC primeval steps from accumulationThe probable evolutionary of nucleotides process tofrom formation a [GADV]-protein of ds-(GNC) world thgenesrough the are emergence only assumed of life to without genetic code hypothesis is correct. The first reason is because GNC coden must be established before experimentalthe modern results life system, at this time.deduced However, from a GADV the stepshypothesis, follow are a shown logical insequence, Figure 6. The if theevolutionary GNC primeval formation of (GNC)n genes, since any gene cannot be expressed without genetic code. The second steps from accumulation of nucleotides to formation of ds-(GNC)n genes are only assumed without genetic code hypothesis is correct. The first reason is because GNC code must be established before experimentalreason is because results the atgenetic this time. info rmationHowever, composed the steps of follow triplet a codon logical sequence sequence, ca nnotif the be GNC stochastically primeval formationgeneticproduced of (GNC)code by hypothesisjoiningn genes, of mononucleotidesis since correct. any The gene first one cannotreason by one. is be becauseThe expressed latter GNC reason code without suggests must geneticbe that established the code.first geneticbefore The second information must be created by joining GNC sequences. I would like to stress here that my reason isformation because of the (GNC) geneticn genes, information since any gene composed cannot ofbe tripletexpressed codon without sequence genetic cannotcode. The be second stochastically producedreasonpropositions by is joining because are ofbased the mononucleotides genetic on analyses information of databases one composed by of one. genes of tr The andiplet latterproteins. codon reason sequence So, I believe suggests cannot that be there that stochastically theis nofirst way genetic to explain the emergence of life, except for the steps deduced from the GADV hypothesis. informationproduced must by be joining created of mononucleotides by joining GNC one sequences. by one. The I would latter reason like to suggests stress herethat the that first my genetic propositions are basedinformation on analyses must of be databases created by of joining genes andGNC proteins.sequences. So, I Iwould believe like that to therestress ishere no that way my to explain propositions are based on analyses of databases of genes and proteins. So, I believe that there is no way the emergenceto explain of the life, emergence except of for life, the except steps for deduced the steps deduced from the from GADV the GADV hypothesis. hypothesis.

Figure 6. The evolutionary process from formation of a [GADV]-protein world to the modern life system through the emergence of life, which is deduced from the GADV hypothesis. Periods of biochemical and biological evolutions are indicated by broken and solid arrows. Italic letters in boxes represent evolutionary steps assumed from the GADV hypothesis. FigureFigure 6. The 6. evolutionary The evolutionary process process from fromformation formation of of a a[GADV]-protein [GADV]-protein world world to the tomodern the modern life life systemsystem through through the emergencethe emergence of of life, life, which which isis deduced from from the the GADV GADV hypothesis. hypothesis. Periods Periods of of biochemicalbiochemical and biological and biological evolutions evolutions are are indicated indicated by by broken broken and and solid solid arrows. arrows. Italic letters Italic lettersin boxes in boxes represent evolutionary steps assumed from the GADV hypothesis. represent evolutionary steps assumed from the GADV hypothesis. Life 2016, 6, 6 9 of 15

3.4. Strengths of the GADV Hypothesis I have introduced two new concepts for proposing the GADV hypothesis: protein 0th-order structure and pseudo-replication of [GADV]-protein [51,57–61]. Owing to the introduction of the two concepts, a novel idea about the origin of life, the GADV hypothesis, emerged. Based on the GADV hypothesis, the process by which the fundamental life system composed of genes, genetic code, and proteins was formed can be understood without any major contradictions. In addition, the formation process of the “chicken and egg” relationship between gene and protein can be also explained according to the GADV hypothesis, as moving from the lower ([GADV]-protein synthesis) to the upper stream (creation of (GNC)n genes) of the genetic flow in the fundamental life system [57,59].

3.5. Limitations of the GADV Hypothesis Although there are many strong points in the GADV hypothesis as described above, there are several weak points in the hypothesis, too. The biggest weakness is that only limited experimental results have been obtained thus far, such as detection of enzymatic activities of [GADV]-peptides/proteins as aggregates and of random [GADV]-octapeptides, for hydrolysis of peptide bond in a natural protein, bovine serum albumin (BSA) [46]. The reason is partly because we have mainly concentrated on the studies using computational analysis of databases of microbial genes and proteins. However, I would like to emphasize here that the hypothesis is not a purely theoretical idea, since the hypothesis was based on databases, which were obtained by experiments, and is based on protein 0th-order structure as [GADV]-amino acids, which satisfies the four structural conditions obtained from analysis of experimental data of proteins in extant micro-organisms [51,58,61]. However, it is difficult to get a clear answer for the origin of life according to only the GADV hypothesis or the top-down approach, because we are not able to fundamentally understand how, and from what kinds of simple inorganic and organic compounds, [GADV]-amino acids and [GADV]-proteins were synthesized and accumulated on the primitive Earth (Figure2).

4. The Necessity of Combining the Bottom-Up and Top-Down Approaches It is obviously quite difficult to solve the riddle of the origin of life only from the top-down approach or only from the bottom-up approach. However, all events from the formation of Earth through to the emergence of life and the evolution of modern lives occurred on a straight timeline. Therefore, in order to overcome the difficulties in elucidating the origin of life, it is essential to take the results obtained from the two counter-directional approaches into consideration and to search for consistent evolutionary processes leading to the present life system from accumulation of simple organic compounds on the primitive Earth. Then, it is also important to understand the establishment process of the first genetic code, GNC, because tRNA realizing genetic code connects the genetic function expressed by anticodon of tRNA with the catalytic function of proteins synthesized using amino acids on the tRNA.

4.1. The Most Significant Evolutionary Steps according to Bottom-Up Approaches As described in Section2, it was confirmed that endogenous, exogenous, and impact-shock sources of organics like amino acids could each have made a significant contribution to the origins of life. Although, of course, which organic compounds were synthesized with prebiotic means at quantitatively high amounts depends on primitive Earth’s conditions, such as the composition of the early terrestrial atmosphere [22]. Van der Gulik et al. have indicated that the [GADV]-amino acids are abundantly produced in Miller-type prebiotic synthesis experiments, and that [GADV]-amino acids are contained abundantly in carbonaceous meteorites [47]. Therefore, they have conjectured that the early functional peptides were made of [GADV]-amino acids. Parker et al. reported that the overall abundances of the synthesized amino acids in the presence of H2S are very similar to those found in some carbonaceous meteorites [28]. Higgs concluded that [GADV]-amino acids were used to Life 2016, 6, 6 10 of 15 synthesize early peptide catalysts [48], and proposed a four-column hypothesis, xNx code encoding four [GADV]-amino acids. The small letter x and the capital letter N mean all and any one of four nucleobases (A, U, G, and C), respectively. Francis suggests that the singlet coding system evolved into a four nucleotide/four amino acid process (AMP = aspartic acid, GMP = glycine, UMP = valine, CMP = alanine)[5] and Di Giulio insists that the universal genetic code originated from a primitive code, GNS and GNN, encoding four [GADV]-amino acids [62]. Certainly, there are some slight differencesLife among 2016, 6, 6 those ideas. However, it is important to note that the common conclusion—that10 of 15 the first proteins were produced with [GADV]-amino acids—has been independently obtained with encoding four [GADV]-amino acids. The small letter x and the capital letter N mean all and any one bottom-upof approaches. four nucleobases (A, U, G, and C), respectively. Francis suggests that the singlet coding system Van derevolved Gulik into aet four al. nucleotide/four[47] assumed amino that acid process some (AMP traces = aspartic of prebiotic acid, GMP peptides= glycine, UMP composed of [GADV]-amino= valine, acids CMP still = alanine) exist [5] in and the Di form Guilio of insists active that sites the inuniversal present-day genetic code proteins. originated After from searching a for primitive code, GNS and GNN, encoding four [GADV]-amino acids [62]. Certainly, there are some such proteinsslight as prebioticdifferences peptideamong those candidates, ideas. However, they identified it is important three main to note classes that ofthe motifs: common D(F/Y)DGD correspondingconclusion—that to the active the sitefirst in proteins RNA polymerases,were produced DGD(G/A)D with [GADV]-amino in some acids—has kinds ofbeen mutases, and DAKVGDGDindependently in dihydroxyacetone obtained with bottom-up kinase, approaches. where all three classes manipulate phosphate groups Van der Gulik et al. [47] assumed that some traces of prebiotic peptides composed of required for nucleotide metabolism in the very first stages of life. The results also support the idea that [GADV]-amino acids still exist in the form of active sites in present-day proteins. After searching for the [GADV]-aminosuch proteins acids, as prebiotic which accumulatedpeptide candidates, at meaningfully they identified high three amounts main classes on theof motifs: primitive Earth, were used forD(F/Y)DGD synthesis corresponding of the most to the primitive active site proteins, in RNA polymerases, leading to DGD(G/A)D the emergence in some of kinds life. of mutases, and DAKVGDGD in dihydroxyacetone kinase, where all three classes manipulate 4.2. The Mostphosphate Ancient groups Event required Deduced for nucleotide Using the metabolism GADV Hypothesis in the very first (Top-Down stages of life. Approach) The results also support the idea that the [GADV]-amino acids, which accumulated at meaningfully high amounts As describedon the primitive in Section Earth,3, were the GADVused for hypothesis synthesis of isthe an most idea primitive mainly proteins, based onleading the to GNC the primeval genetic codeemergence hypothesis, of life. which was obtained by analysis of databases of genes and proteins, starting from studies4.2. on The the Most formation Ancient Event mechanism Deduced Using of the entirely GADV Hypothesis new genes (Top-Down in presently Approach) existing microorganisms. So, the most ancientAs described event in deduced Section 3, the from GADV the hypothesis GADV hypothesis is an idea mainly is the based establishment on the GNC primeval of GNC primeval genetic codegenetic encoding code hypothesis, four [GADV]-amino which was obtained acids (Figureby analysis5). Thisof databases means of that genes the and GNC proteins, code encoding [GADV]-aminostarting acids from can studies be usedon the as formation the common mechanism event of deducedentirely new from genes the in bottom-up presently existing approach, when microorganisms. So, the most ancient event deduced from the GADV hypothesis is the the consistent evolutionary process from the birth of the Earth through to the emergence of life and the establishment of GNC primeval genetic code encoding four [GADV]-amino acids (Figure 5). This modern lifemeans system that is the considered. GNC code encoding [GADV]-amino acids can be used as the common event deduced from the bottom-up approach, when the consistent evolutionary process from the birth of 4.3. Deducedthe Evolutionary Earth through Stepsto the emergence from the Formationof life and the of modern Earth tolife the system Emergence is considered. of Life

As described4.3. Deduced above, Evolutionary the accumulation Steps from the Formation of [GADV]-amino of Earth to the Emergence acids andof Life the establishment of the first genetic code orAs GNC described primeval above, geneticthe accumu codelation encoding of [GADV]-amino the amino acids acidsand the are establishment assumed of as the the common event supposedfirst genetic from code the results,or GNC primeval which weregenetic obtained code encoding from the two amino counter-directional acids are assumed as (bottom-up the and top-down)common approaches. event supposed Thus, the from consecutive the results, which evolutionary were obtained steps from from two the counter-directional accumulation of simple (bottom-up and top-down) approaches. Thus, the consecutive evolutionary steps from the organic compoundsaccumulation on of the simple primitive organic co Earthmpounds to the on emergencethe primitive Earth of life to can the beemergence rationally of life deduced can be by using a commonrationally event as deduced a juncture, by using as a shown common in event Figure as a7 juncture,. as shown in Figure 7.

Figure 7. EvolutionaryFigure 7. Evolutionary steps fromsteps from the the formation formation of Earth Earth to tothe the emergence emergence of life, ofdeduced life, deducedfrom the from the bottom-upbottom-up approach approach and GADV and GADV hypothesis hypothesis (the (the top-downtop-down approach). approach). Periods Periods of physical-chemical, of physical-chemical, biochemical, and biological processes are shown by dotted, broken, and solid arrows, respectively. biochemical,A juncture and biological (the establishment processes of GNC are shownprimeval bygenetic dotted, code), broken, which is and used solid to determine arrows, the respectively. A juncture (the establishment of GNC primeval genetic code), which is used to determine the evolutionary steps that are consistent between the bottom-up and the top-down approaches, is indicated by bold italic letters. Life 2016, 6, 6 11 of 15

5. Discussion As repeatedly mentioned in this review article, the most important step in solving the riddle of the origin of life is to make clear the process by which the fundamental life system, composed of genes, genetic code, and proteins, was established on the primitive Earth. Now, I think that the evolutionary steps from simple inorganic and organic compounds as CO2,H2O, H2,N2, and CH4 to the modern life system have come into considerably clear view (Figure7). However, there are still some other difficult problems that must be overcome to explain the evolutionary process to the emergence of life.

5.1. Why Can [GADV]-Amino Acids Accumulate by Prebiotic Means? If a product, such as one of the [GADV]-amino acids, which was absent at the time, was synthesized with simpler chemical compounds, which were present in high amounts at the time, the product should gradually accumulate according to the law of chemical equilibrium, although it is necessary for reactants to receive sufficient energy to pass through an energy barrier of the physical or chemical reaction and for the product to have sufficiently high stability under the condition. In addition, random processes with prebiotic means generally cause both synthesis and degradation of organic compounds. Even in such a case, [GADV]-amino acids leading to the emergence of life could be accumulated on the primitive Earth. The reason is because accumulation of the amino acids could be observed as a result of a one-directional phenomenon, as [GADV]-amino acids are sufficiently stable. The accumulation could be further accelerated if the amino acids were excluded from the reaction system, for example by drying the products. Thus, a repeated heat-drying or dehydration process in depressions of rocks on seashores and tide pools on the primitive Earth could lead to accumulation rather than disappearance of the amino acids. So, the amino acids could easily accumulate under the circumstances on the primitive Earth in order of simple chemical structure of amino acids, as Gly [G], Ala [A], Asp [D], and Val [V]. The stronger the synthetic activity of a chemical compound becomes, the faster the product accumulates, even when synthesis and degradation of the product proceed with prebiotic means in parallel. Therefore, formation of a system more efficiently producing a product could lead to the development of an evolutionary step toward the emergence of life, in the order of simpler and more imperfect systems to more complex and more perfect systems, step by step.

5.2. Why Can Water-Soluble Globular Proteins Be Formed in the Absence of Any Genetic System? I have defined protein 0th-order structure as a specific amino acid composition in which water-soluble globular proteins, a prerequisite for high catalytic activity, can be produced at a high probability even by random polymerization of amino acids [51,57,58,61]. One of the protein 0th-order structures is a composition containing [GADV]-amino acids at a roughly equal ratio or about one-fourth each. Owing to the protein 0th-order structure, water-soluble globular proteins with weak but sufficiently high catalytic activity could be produced at a high probability even in the absence of genetic function or before creation of the first (GNC)n gene. Evolutionary processes leading to the emergence of life through the formation of the [GADV]-protein world always proceeded to produce more perfect [GADV]-proteins than those one step before, under the protein 0th-order structure. For example, the appearance of [GADV]-peptides/proteins as aggregates could promote biochemical reactions to accumulate various organic compounds including [GADV]-peptides and [GADV]-proteins themselves. As a result, the [GADV]-protein world was formed by pseudo-replication, with [GADV]-peptides/proteins having catalytic activity for peptide bond formation. The most primitive metabolic system was formed in the protein world. [GADV]-peptides/proteins with high substrate selectivities for nucleobases, ribose, and phosphate could give rise to syntheses of nucleotides and oligonucleotides. Accumulation of oligonucleotides containing GNC led to the establishment of the first genetic code, GNC. The establishment of the GNC code made it possible to synthesize more perfect Life 2016, 6, 6 12 of 15

[GADV]-peptides and/or [GADV]-proteins containing roughly equal amounts of [GADV]-amino acids and to exclude non-natural amino acids and compounds other than [GADV]-amino acids. Furthermore, the first single-stranded (GNC)n gene was formed by joining two GNC triplets in complexes of oligonucleotide-containing GNC with one [GADV]-amino acid. This made possible the synthesis of a number of [GADV]-peptides/proteins with the same amino acid sequence. It is important to understand that [GADV]-proteins or aggregates of [GADV]-peptides, not merely [GADV]-peptides, would be required for synthesis of (GNC)n gene, because two nucleotides must be bound to the proteins to form a phosphodiester bond between two nucleotides. Finally, ds-(GNC)n genes were produced by [GADV]-proteins with high catalytic activities in the protein world. The formation of the first ds-(GNC)n gene made it possible to produce homologous genes and to create entirely new genes from sense and antisense sequences of the gene after gene duplication, respectively. With the formation of ds-(GNC)n gene, the replication system was established and the first life could emerge through acquisition of various kinds of genes and of the replication system. Therefore, it is rationally assumed that the fundamental life system was formed from the primitive system as going up a spiral staircase, and water-soluble globular proteins were th always produced under the protein 0 -order structure. Transcription and translation of ds-(GNC)n genes could presumably be carried out by a [GADV]-protein (a primitive RNA polymerase) and an oligonucleotide-[GADV]-protein complex (a primitive ribosome).

5.3. Life Could Emerge with Some Good Luck and Trial and Error Of course, random processes did not always lead directly to the formation of the life system. Various kinds of chemical reactions would randomly occur in an innumerable number of depressions of rocks and tide pools to produce various organic compounds independently of the emergence of life. GNC primeval genetic code could be successfully established and the first double-stranded (GNC)n genes could be fortunately produced in only one depression of a rock or a tide pool on the vast primitive Earth. As a result, the first life could emerge by chance in a depression or tide pool, as one of countless reaction systems. Chemical compounds in other depressions and tide pools could be used as nutrients for the first life and its descendants. I have considered the steps to the emergence of life as above, but it is important to confirm experimentally whether the idea is correct or not. Thus, experiments are undoubtedly essential to elucidate how life emerged on the planet. However, it is also true that it would be impossible to solve the riddle of the origin of life only from experiments based on the bottom-up approach (Figure1). So, it must be essential to solve the riddle using two counter-directional (bottom-up and top-down) approaches, which are based mainly on experiments and computational analyses of databases of genes and proteins or theoretical considerations, respectively.

6. Conclusions Until now, researchers have tried to understand, using the bottom-up approach, how the first life emerged on the primitive Earth, but it has been so far proven impossible to solve the riddle of the origin of life. It is principally impossible to elucidate the formation process of the fundamental life system composed of genes, genetic code, and proteins. On the other hand, I started with a study on the formation of entirely new genes in presently existing microorganisms, and finally reached the conclusion that life emerged not from the RNA world but from the [GADV]-protein world ([GADV]-protein world hypothesis or GADV hypothesis). Therefore, I accidentally adopted a top-down approach for solving the riddle. Furthermore, I noticed that the riddle of the origin of life could be solved by using a common event, which was obtained by the two counter-directional approaches as a juncture. The probable evolutionary steps from the formation of the Earth to the emergence of life, which were deduced from the results obtained with the two approaches, are as follows: (1) The primitive atmosphere composed of CO2,H2,H2O, N2, CH4, NH3, etc. was formed. (2) Simple amino acids, such Life 2016, 6, 6 13 of 15 as glycine [G], alanine [A], aspartic acid [D], and valine [V], were physically and chemically synthesized and accumulated on the primitive Earth. (3) Peptide catalysts, such as Gly-Gly, Gly-Asp, and [GADV]-peptides, were produced. (4) The [GADV]-protein world was formed by pseudo-replication with [GADV]-peptides/proteins. (5) Nucleotides and oligonucleotides (RNA) were synthesized and accumulated in the protein world. (6) GNC primeval genetic code encoding [GADV]-amino acids was established. (7) Single-stranded (GNC)n gene(s) and successively double-stranded (GNC)n gene(s) were formed. (8) Finally, the first life emerged on the primitive Earth.

Acknowledgments: I am grateful to Tadashi Oishi (G & L Kyosei Institute, Emeritus professor of Nara Women’s University) for the encouragement throughout my research on the GADV hypothesis on the origin of life. Conflicts of Interest: The author declares no conflict of interest.

References

1. Gilbert, W. The RNA world. Nature 1986, 319, 618. [CrossRef] 2. Kruger, K.; Grabowski, P.J.; Zaug, A.J.; Sands, J.; Gottschling, D.E.; Cech, T.R. Self-splicing RNA: Autoexcision and autocyclization of ribosomal RNA intervening sequence of Tetrahymena. Cell 1982, 31, 147–157. [CrossRef] 3. Guerrier-Takada, C.; Gardiner, K.; Marsh, T.; Pace, N.; Altman, S. The RNA moiety of ribonuclease P is catalytic subunit of the enzyme. Cell 1983, 35, 849–857. [CrossRef] 4. Miller, S.L.; Orgel, L.E. The Origins of Life on the Earth; Prentice-Hall: Englewood Cliffs, NJ, USA, 1974. 5. Francis, B.R. The Hypothesis that the genetic code originated in coupled synthesis of proteins and the evolutionary predecessors of nucleic acids in primitive cells. Life 2015, 5, 467–505. [CrossRef][PubMed] 6. Van der Gulik, P.T.; Speijer, D. How amino acids and peptides shaped the RNA world. Life 2015, 5, 230–246. [CrossRef][PubMed] 7. Luisi, P.L. Prebiotic metabolic networks? Mol. Syst. Biol. 2014, 10, 729–730. [CrossRef][PubMed] 8. Kim, K.M.; Caetano-Anollés, G. The evolutionary history of protein fold families and proteomes confirms that the archaeal ancestor is more ancient than the ancestors of other superkingdoms. BMC Evol. Biol. 2012, 12, 13–29. [CrossRef][PubMed] 9. Kurland, C.G. The RNA dreamtime: Modern cells feature proteins that might have supported a prebiotic polypeptide world but nothing indicates that RNA world ever was. Bioessays 2010, 32, 866–871. [CrossRef] [PubMed] 10. Maury, C.P. Self-propagating β-sheet polypeptide structures as prebiotic informational molecular entities: The amyloid world. Orig. Life Evol. Biosph. 2009, 39, 141–150. [CrossRef][PubMed] 11. Maury, C.P. Origin of life. Primordial genetics: Information transfer in a pre-RNA world based on self-replicating beta-sheet amyloid conformers. J. Theor. Biol. 2015, 382, 292–297. [CrossRef][PubMed] 12. Carter, C.W. What RNA World? Why a Peptide/RNA Partnership Merits Renewed Experimental Attention. Life 2015, 5, 294–320. [CrossRef][PubMed] 13. Wächtershäuser, G. The place of RNA in the origin and early evolution of the genetic machinery. Life 2014, 4, 1050–1091. [CrossRef][PubMed] 14. Luisi, P.L. A new start from ground zero? Orig. Life Evol. Biosph. 2014, 44, 303–306. [CrossRef][PubMed] 15. Benner, S.A. Paradoxes in the origin of life. Orig. Life Evol. Biosph. 2014, 44, 339–343. [CrossRef][PubMed] 16. Kvenvolden, K.A.; Peterson, E.; Pollock, G.E. Optical Configulation of Amino-Acids in Pre-Cambrian Fig Tree Chert. Nature 1969, 221, 141–143. [CrossRef] 17. Nagy, B.; Engel, M.H.; Zumberge, J.E.; Ogiono, H.; Chang, S.Y. Amino acids and hydrocarbons ~3,800-Myr old in the Isua Rocks, southwestern Greenland. Nature 1981, 289, 53–56. [CrossRef] 18. Van Zuilen, M.A.; Lepland, A.; Arrhenius, G. Reassessing the evidence for the earliest traces of life. Nature 2002, 418, 627–630. [CrossRef][PubMed] 19. Botta, O.; Glavin, D.P.; Kminek, G.; Bada, J.L. Relative amino acid concentrations as a signature for parent body processes of carbonaceous chondrites. Orig. Life Evol. Biosph. 2002, 32, 143–163. [CrossRef][PubMed] 20. Meierhenrich, U.J.; Muñoz Caro, G.M.; Bredehöft, J.H.; Jessberger, E.K.; Thiemann, W.H.-P. Identification of diamino acids in the Murchison meteorite. Proc. Natl. Acad. Sci. USA 2004, 101, 9182–9186. [CrossRef] [PubMed] Life 2016, 6, 6 14 of 15

21. Oró, J.; Mills, T.; Lazcano, A. Comets and the formation of biochemical compounds on the primitive Earth—A review. Orig. Life Evol. Biosph. 1992, 21, 267–277. [CrossRef][PubMed] 22. Chyba, C.F.; Sagan, C. Endogenous production, exogenous delivery and impact-shock synthesis of organic molecules: An inventory for the origins of life. Nature 1992, 355, 125–132. [CrossRef][PubMed] 23. Miller, S.L. Production of some organic compounds under possible primitive earth conditions. Science 1953, 117, 528–529. [CrossRef][PubMed] 24. Miller, S.L.; Urey, H.C. Organic compound synthesis on the primitive earth. Science 1959, 130, 245–251. [CrossRef][PubMed] 25. Plankensteiner, K.; Reiner, H.; Rode, B.M. Amino acids on the rampant primordial Earth: Electric discharges and the hot salty ocean. Mol. Divers. 2006, 10, 3–7. [CrossRef][PubMed] 26. Kasting, J.F. Earth’s early atmosphere. Science 1993, 259, 920–926. [CrossRef][PubMed] 27. Cleaves, H.J.; Chalmers, J.H.; Lazcano, A.; Miller, S.L.; Bada, J.L. A reassessment of prebiotic organic synthesis in neutral planetary atmosphere. Orig. Life Evol. Biosph. 2008, 38, 105–115. [CrossRef][PubMed] 28. Parker, E.T.; Cleaves, H.J.; Dworkin, J.P.; Glavin, D.P.; Callahan, M.; Aubrey, A.; Lazcano, A.; Bada, J.L.

Primodial synthesis of amines and amino acids in a 1958 Miller H2S-rich spark discharge experiment. Proc. Natl. Acad. Sci. USA 2011, 108, 5526–5531. [CrossRef][PubMed] 29. Corliss, J.B.; Dymond, J.; Gordon, L.I.; Edmond, J.M.; Herzen, R.P.; Ballard, R.D.; Green, K.; Williams, D.; Bainbridge, A.; Crane, K.; et al. Submarine thermal springs on the galapagos rift. Science 1979, 203, 1073–1083. [CrossRef][PubMed] 30. Holm, N.G.; Andersson, E. Hydrothermal simulation experiments as a tool for studies of the origin of life on Earth and other terrestrial planets: A review. 2005, 5, 444–460. [CrossRef][PubMed] 31. Chandru, K.; Obayashi, Y.; Kaneko, T.; Kobayashi, K. Formation of amino acid condensates partly having peptide bonds in a simulated submarine hydrothermal environment. Viva Origin. 2013, 41, 24–28. 32. Imai, E.; Honda, H.; Hatori, K.; Brack, A.; Matsuno, K. Elongation of oligopeptides in a simulated submarine hydrothermal system. Science 1999, 283, 831–833. [CrossRef][PubMed] 33. Ogata, Y.; Imai, E.; Honda, H.; Hatori, K.; Matsuno, K. Hydrothermal circulation of seawater through hot vents and contribution of interface chemistry to prebiotic synthesis. Orig. Life Evol. Biosph. 2000, 30, 527–537. [CrossRef][PubMed] 34. Otake, T.; Taniguchi, T.; Furukawa, Y.; Kawamura, F.; Nakazawa, H.; Kakegawa, T. Stability of amino acids and their oligomerization under high-pressure conditions: implications for prebiotic chemistry. Astrobiology 2011, 11, 799–813. [CrossRef][PubMed] 35. Furukawa, Y.; Sekine, T.; Oba, M.; Kakegawa, T.; Nakazawa, H. Biomolecule formation by oceanic impacts on early Earth. Nat. Geosci. 2009, 2, 62–66. [CrossRef] 36. Ebihara, M.; Sekimoto, S.; Shirai, N.; Hamajima, Y.; Yamamoto, M.; Kumagai, K.; Oura, Y.; Ireland, T.R.; Kitajima, F.; Nagao, K.; et al. Neutron activation analysis of a particle returned from asteroid Itokawa. Science 2011, 333, 1119–1121. [CrossRef][PubMed] 37. Schulz, R.; Hilchenbach, M.; Langevin, Y.; Kissel, J.; Silen, J.; Briois, C.; Engrand, C.; Hornung, K.; Baklouti, D.; Bardyn, A.; et al. Comet 67P/Churyumov-Gerasimenko sheds dust coat accumulated over the past four years. Nature 2015, 518, 216–218. [CrossRef][PubMed] 38. Nummelin, A.; Dickens, J.E.; Bergman, P.; Hjalmarson, A.; Irvine, W.M.; Ikeda, M.; Ohishi, M. Abundances of ethylene oxide and acetaldehyde in hot molecular cloud cores. Astron. Astrophys. 1998, 337, 275–286. [PubMed] 39. Zaia, D.A.; Zaia, C.T.; de Santana, H. Which amino acids should be used in prebiotic chemistry studies? Orig. Life Evol. Biosph. 2008, 38, 469–488. [CrossRef][PubMed] 40. Wächtershäuser, G. Evolution of the first metabolic cycles. Proc. Natl. Acad. Sci. USA 1990, 87, 200–204. [CrossRef][PubMed] 41. Rode, B.M. Peptides and the origin of life. Peptides 1999, 20, 773–786. [CrossRef] 42. Shuwannachot, Y.; Rode, B.M. Catalysis of dialanine formation by glycine in the salt-induced peptide formation reaction. Orig. Life Evol. Biosph. 1998, 28, 79–90. [CrossRef] 43. Plankensteiner, K.; Reiner, H.; Rode, B.M. Catalytically increased prebiotic peptide formation: Ditryptophan, dilysine, and diserine. Orig. Life Evol. Biosph. 2005, 35, 411–419. [CrossRef][PubMed] 44. Gorlero, M.; Wieczorek, R.; Adamala, K.; Giorgi, A.; Schininà, M.E.; Stano, P.; Luisi, P.L. Ser-His catalyses the formation of peptides and PNAs. FEBS Lett. 2009, 583, 153–156. [CrossRef][PubMed] Life 2016, 6, 6 15 of 15

45. Wieczorek, R.; Dörr, M.; Chotera, A.; Luisi, P.L.; Monnard, P.A. Formation of RNA Phosphodiester Bond by Histidine-Containing Dipeptides. ChemBioChem 2013, 14, 217–223. [CrossRef][PubMed] 46. Oba, T.; Fukushima, J.; Maruyama, M.; Iwamoto, R.; Ikehara, K. Catalytic activities of [GADV]-peptiders. Formation and establishment of [GADV]-protein world for the emergence of life. Orig. Life Evol. Biosph. 2005, 35, 447–460. [CrossRef][PubMed] 47. Van der Gulik, P.; Massar, S.; Gilis, D.; Buhrman, H.; Rooman, M. The First Peptides: The evolutionary transition between prebiotic amino acids and early proteins. J. Theor. Biol. 2009, 261, 531–539. [CrossRef] [PubMed] 48. Higgs, P.G. A four-column theory for the origin of the genetic code: Tracing the evolutionary pathways that gave rise to an optimized code. Biol. Direct. 2009, 24, 4–16. [CrossRef][PubMed] 49. Ikehara, K.; Omori, Y.; Arai, R.; Hirose, A. A novel theory on the origin of the genetic code: A GNC-SNS hypothesis. J. Mol. Evol. 2002, 54, 530–538. [CrossRef][PubMed] 50. Stryer, L. Biochemistry, 3rd ed.; W.H. Freeman and Company: New York, NY, USA, 1988. 51. Ikehara, K. Origins of gene, genetic code, protein and life: Comprehensive view of life system from a GNC-SNS primitive genetic code hypothesis. J. Biosci. 2002, 27, 165–186. [CrossRef][PubMed] 52. Ikehara, K.; Amada, F.; Yoshida, S.; Mikata, Y.; Tanaka, A. A possible origin of newly-born bacterial genes: Significance of GC-rich nonstop frame on antisense strand. Nucleic Acids Res. 1996, 24, 4249–4255. [CrossRef] [PubMed] 53. Ikehara, K.; Okazawa, E. Unusually biased nucleotide sequences on sense strands of Flavobacterium sp. genes produce nonstop frames on the corresponding antisense strands. Nucleic Acids Res. 1993, 21, 2193–2199. [PubMed] 54. Ikehara, K. Simulation of gene evolution: Evidence for GC-NSF(a) hypothesis on the origin of genes. Viva Origin. 2003, 31, 201–215. 55. Ikehara, K. Mechanisms for creation of “original ancestor genes”. J. Biol. Macromol. 2005, 5, 21–30. 56. Ikehara, K.; Yoshida, S. SNS hypothesis on the origin of the genetic code. Viva Origin. 1996, 26, 301–310. 57. Ikehara, K. Pseudo-replication of [GADV]-proteins and origin of life. Int. J. Mol. Sci. 2009, 10, 1525–1537. [CrossRef][PubMed] 58. Ikehara, K. Possible steps to the emergence of life: The [GADV]-protein world hypothesis. Chem. Rec. 2005, 5, 107–118. [CrossRef][PubMed] 59. Ikehara, K. [GADV]-Protein World Hypothesis on the Origin of Life. In Genesis—In the Beginning; Seckbach, J., Ed.; Springer: Berlin, Germany, 2012; pp. 107–121. 60. Ikehara, K. [GADV]-protein world hypothesis on the origin of life. Orig. Life Evol. Biosph. 2014, 44, 299–302. [CrossRef][PubMed] 61. Ikehara, K. Protein ordered sequences are formed by random joining of amino acids in protein 0th-order structure, followed by evolutionary process. Orig. Life Evol. Biosph. 2014, 44, 279–281. [CrossRef][PubMed] 62. Di Giulio, M. An extension of the coevolution theory of the origin of the genetic code. Biol. Direct 2008, 3, 37. [CrossRef][PubMed]

© 2016 by the author; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons by Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).