The protein “A. superbus venom factor 1”, present in the venom of Austrelaps superbus and harmful to the complement system of immunity, is analogous to amino acid sequences found in SARS-CoV-2.

Cesar Fernandes Geraldes1 (Bachelor of Quality in Life, Mackenzie Presbyterian University; Civil Geotechnical Engineer, Uninove)

1 Postgraduate student in Data Science at Uninove, São Paulo, Brazil

Correspondence: [email protected]

Abstract

The study compared the amino acid sequences present in SARS-CoV-2 with the protein “A. superbus venom factor 1” from the snake Austrelaps superbus. The pairwise sequence alignment was performed with matrix algorithms in high-performance processing, using the UGENE software, which is distributed under the terms of the GNU General Public License. The “A. superbus venom factor 1”, present in the venom of Austrelaps superbus and harmful to the complement system of immunity, is analogous to amino acid sequences found in SARS-CoV-2. The result was several amino acid sequences with harmful potential, the most relevant being represented by the letters “LYID”, “TAYA” and “NTLT”.


Introduction

As described in the English version of Wikipedia, the lowland copperhead or lowlands copperhead (Austrelaps superbus) is a species of venomous snake in the family Elapidae, found in southeastern Australia and Tasmania. It is commonly referred to as the copperhead.

The complement system is a fundamental part of innate immunity and contributes to the removal of immune complexes and the activation of inflammatory processes. Its proteins represent a fast and efficient means of protecting the host against invading microorganisms. Associations between complement and disease are observed in situations of complement deficiency, abnormalities in complement regulation, and inflammation. 2

Data science, through the use of computational algorithms and databases, has been a great ally in the development of new knowledge. Bioinformatics characteristically moves between disciplines, as in the present study between the exact and biological sciences. The health field could not be left out: at such a crucial moment for humanity, affected by a pandemic of global proportions, the use of computer technologies has added speed and precision to the development of solutions.

The objective of this study is to identify amino acid sequences present in SARS-CoV-2 that are potentially harmful to humans, through an analysis of their similarity to the amino acid sequences of agents found in nature. The snake chosen for analysis was Austrelaps superbus, the lowland copperhead, and the protein analyzed was “A. superbus venom factor 1”.

1. Does SARS-CoV-2 contain amino acid sequences potentially harmful to the complement system of immunity? Are they as venomous as those of the snake?

Recent reports linked to the origin of SARS-CoV-2 raised the question of whether elements that are natural agents of venomous animals, such as amino acid sequences or adjunct components, are present in SARS-CoV-2, and whether they are potentially harmful to the complement system of immunity.

1.1 Protein “A. superbus venom factor 1” of the lowland copperhead, Austrelaps superbus

Cobra venom factor (CVF) is an unusual component of the venom of many elapid snakes. Although CVF occurs in the venom, it is not a toxin but a protein component that facilitates the entry of the venom's toxic components into the bloodstream by inducing complement activation at the site of envenomation. 3 Such proteins act by activating, inhibiting, or mimicking proteins of the complement system and are found mainly in the venoms of snakes and spiders. 3 The protein is involved in the localized protective response to tissue damage, microbial infection, or the presence of foreign matter. This response is characterized by edema, redness, heat, and pain and involves a complex series of events, including vascular changes and the accumulation of blood cells, such as neutrophil leukocytes and mononuclear phagocytes, at the site of the lesion. 4


2. Materials & Methods

All sequences were placed in a single large alignment matrix using the “Sequence Alignment” functionality of Unipro UGENE, a free multiplatform genome analysis suite. The version used was Unipro UGENE v35.1, 64-bit, June 2020. It is distributed under the terms of the GNU General Public License. The software provides several alignment algorithms; the alignment here was run through the ClustalW package. 5

Figure 1: Record “NC_045512”, used to locate the “YP_009724389.1” section of the analyzed SARS-CoV-2.

The analyzed section of SARS-CoV-2 is identified by its accession (YP_009724389.1) and was retrieved from the NCBI GenBank database (Protein). The file for Q0ZZJ6 was also downloaded, in “.fasta” format. 3

Table 2. Protein “A. superbus venom factor 1” from Austrelaps superbus. File Q0ZZJ6.fasta. 6

>sp|Q0ZZJ6|VCO31_AUSSU A.superbus venom factor 1 OS=Austrelaps superbus OX=29156 PE=1 SV=1
MEGMALYLVAALLIGFPGSSHGALYTLITPGVLRTDTEEQILVEAHGDSVPKQAVISIHD
FPRRQKTLFQTRVDMNPAGGMLVTPTIKIPAKELNKESRQNQYVVVKVSGLPLELEKVVL
LSYQSGFVFIQTDKGIYTPGSPVRYRVFSMDYNMHRMDKTVIVEFQTPEGVVVSSNPVNP
SSVLIRPYNLPELVSFGTWKAVAKYEHSPEESYTAYFDVREYVLPSFEVRLQPSDKFLYI
DGNKNFHVSITARYLYGKKVEGVAFVLFGVKIDDAKKSIPDSLTRIPIIDGDGEAILKRD
TLRSRFQNLNELVGHTLYASVTVMTESGSDMVVTEQSGIHIVTSPYQIYFTKTPKYFKPG
MPYELTVYVTNPDGSPAANVPVVSEAIRSEGTTLSDGTAKLILNTPLNTQSLPITVRTNH
RDLPSERQATKSMTATAYQTQGGSGNYLHVAITSAEIKAGDNLPVNFNVRGNANSLNQIK
YFTYLILTKGKIFKVGRQPKGEGQNLVTMNLRITPDLIPAFRFVAYYQVGNNEIVADSVW
VDVKDTCMGTLVVKGASSRDNRIQKPGAAMKIKLEGDPGARVGLVAVDKAVYVLNDKYKI
SQAKIWDTIEKSDFGCTAGGGQNNLGVFEDAGLALTTSTNLNTKQRSVATCPQPTNRRRR
SSVLLLDSKANKAAQFQDQNLRKCCEDGMHENPMGYTCEKRAKYIQEGDACKAAFLECCR
YIKGIRDENQRESELFLARSDFEDELFGEDNIISRSDFPESWLWLTEDLKEPPNSQGISS
KTLSFYLRDSITTWEVLAVSIAPTKGICVAEPYEITVMKDFFIDLRVPYSVVKNEQVEIR
AVLYNYADEDIYVRVELLYNPAFCSASTEGQRYRVQVPVKALSSWAVPFVIVPLEQGLHD
VEVKASVRGELASDGVRKKLKVVPEGERKNIVTVIELDPSVKGVDGTQEQTVIANKLDDK
VPETEIETKISVLGDPVAQIIENSIDGSKLNHLIITPSGCGEQNMITMTPSVIATYYLDA
TGQWENLGVDRRTEAVKQIMKGYAQQMVYKKADHSYAAFPNRASSSWLTAYVVKVFAMAA
KIVKDIKHEIICGGVKWLILNRQQPDGVFKENAPVIHGEMLGGTKGAEPEVSLTAFILTA
LLESRSVCNEHINILDSSINKAIDYLLKKYEKLQRPYTTALTAYALAAAERLNDDRVLMA
ASTGRDRWEEHNARTHNIEGTSYALLALLKMKKFAEAGPVVKWLIDQKYYGGTYGQTQAT
VMVFQALAEYEIQIPTHKDLNLDISINLPEREVPLRYSINYGNALVARTAETKLNEDFTV
SASGDGKATMTILTVYNAQLREDANVCNKFHLDVSVENAQLNSKQAKGAKDTLRLKICTR
YLGEVDSTMTIIDVSMLTGFLADAEDLTRLSKGVDRYISKFEIDNNMVQKGTVVIYLDKV
SHSEVECLHFKIHKHFEVGFIQPGSVKVYSYYNLDEQCTKFYHPDKGTGLLNKICHGNIC
RCAEESCSLLNQQKKIDLQLRIQKACAPNVDYVYKTKLLQIEEKDGNDIYVMDVLEVIKG
GTDRNPQAKARQYVSQRKCQEALNLKLNNDYLIWGLSSDLWPRKNDISYLITKNTWIERW
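As an illustration of how a file such as Q0ZZJ6.fasta can be read before alignment, the following is a minimal sketch in plain Python. The function name parse_fasta is hypothetical and is not part of UGENE; the example string holds only the header and the first two sequence lines of the record, not the full protein.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines are joined without the line breaks.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Excerpt only: header plus the first two sequence lines of Q0ZZJ6.fasta.
example = """>sp|Q0ZZJ6|VCO31_AUSSU A.superbus venom factor 1 OS=Austrelaps superbus OX=29156 PE=1 SV=1
MEGMALYLVAALLIGFPGSSHGALYTLITPGVLRTDTEEQILVEAHGDSVPKQAVISIHD
FPRRQKTLFQTRVDMNPAGGMLVTPTIKIPAKELNKESRQNQYVVVKVSGLPLELEKVVL
"""

records = parse_fasta(example)
header, sequence = records[0]
accession = header.split("|")[1]  # second field of the UniProt-style header
print(accession, len(sequence))
```

Joining the sequence lines matters for the later motif search, because a short string can straddle a line break in the file.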


Clustal is a widely used multiple sequence alignment program. Here it was applied to a pair of sequences through the ClustalW function. It is used for nucleotide and protein sequences. ClustalW is the command-line version of the program. 5

Regarding ClustalW:

“The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved for the alignment of divergent protein sequences. Firstly, individual weights are assigned to each sequence in a partial alignment in order to down-weight near-duplicate sequences and up-weight the most divergent ones. Secondly, amino acid substitution matrices are varied at different alignment stages according to the divergence of the sequences to be aligned. Thirdly, residue-specific gap penalties and locally reduced gap penalties in hydrophilic regions encourage new gaps in potential loop regions rather than regular secondary structure. Fourthly, positions in early alignments where gaps have been opened receive locally reduced gap penalties to encourage the opening up of new gaps at these positions. These modifications are incorporated into a new program, CLUSTAL W which is freely available.” 6
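The pairwise step on which progressive aligners build can be sketched with a basic dynamic-programming matrix. The code below is a minimal Needleman-Wunsch global alignment, not the ClustalW implementation: it assumes a flat match/mismatch score and a linear gap penalty (the values 1, -1 and -2 are arbitrary illustrative choices), whereas ClustalW uses amino acid substitution matrices and the position-specific gap penalties described in the quotation above.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns (score, aligned_a, aligned_b). Scoring values are illustrative.
    """
    def sub(x, y):
        return match if x == y else mismatch

    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]),  # align residues
                score[i - 1][j] + gap,                          # gap in b
                score[i][j - 1] + gap,                          # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    top, bottom = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]):
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(top)), "".join(reversed(bottom))


print(global_align("LYID", "LID"))  # (1, 'LYID', 'L-ID')
```

The matrix grows with the product of the two sequence lengths, which is why alignments of long proteins are usually run in optimized tools rather than in a sketch like this one.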

A feature of the MUSCLE integration in UGENE is the ability to align only part of the model. 5 When the column range item is selected, only the region of the specified columns is passed to the MUSCLE alignment engine. The resulting alignment is then inserted into the original one. 5,7

3. Results

The excerpts presented in the figures below are the most relevant to the study. The first line shows “A. superbus venom factor 1”, labeled VCO31_AUSSU. The second line shows the SARS-CoV-2 section YP_009724389.1.

Figure 5. Software screen showing the sequence of four amino acids “LYID”.

Figure 6. Software screen showing the sequence of four amino acids “TAYA”.


Figure 7. Software screen showing the sequence of four amino acids “NTLT”.
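Short shared strings of the kind shown in the figures can also be located without an alignment tool. The sketch below, with the hypothetical function name shared_kmers, lists every string of length k that occurs in both sequences together with its 0-based positions. The first input is an excerpt of Q0ZZJ6 from Table 2; the second is an invented toy string used only to exercise the code and is not a SARS-CoV-2 sequence.

```python
def shared_kmers(seq_a, seq_b, k=4):
    """Map each length-k string found in both sequences to its positions.

    Returns {kmer: (positions_in_a, positions_in_b)} with 0-based indices.
    """
    # Index every k-mer of the first sequence by its start positions.
    positions_a = {}
    for i in range(len(seq_a) - k + 1):
        positions_a.setdefault(seq_a[i:i + k], []).append(i)

    # Scan the second sequence and keep only k-mers seen in the first.
    hits = {}
    for j in range(len(seq_b) - k + 1):
        kmer = seq_b[j:j + k]
        if kmer in positions_a:
            hits.setdefault(kmer, (positions_a[kmer], []))[1].append(j)
    return hits


# Excerpt of Q0ZZJ6 (joined across a line break of the FASTA file).
venom_excerpt = "EVRLQPSDKFLYIDGNKNFHVS"
# Hypothetical toy query, NOT taken from SARS-CoV-2.
toy_query = "MKLYIDAAGT"

print(shared_kmers(venom_excerpt, toy_query))  # {'LYID': ([10], [2])}
```

With the full Q0ZZJ6 and YP_009724389.1 sequences as inputs, the same function would list every four-letter string the two proteins have in common.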

4. Discussion

Improper or excessive activation of complement can lead to harmful consequences due to severe inflammatory tissue destruction. Clinical and experimental evidence highlights the role of complement in the pathogenesis of numerous inflammatory diseases, which include not only immune-complex and autoimmune diseases, but also organ failure following sepsis, multiple trauma, and burns. 7

The “A. superbus venom factor 1” is a complement-activating protein in snake venom and interferes with the complement system. 3 The possibility that the amino acid sequences were inserted into SARS-CoV-2 is not ruled out, including by publications 1,8. Nor is the possibility ruled out that they are of natural origin. Further studies are needed.

5. Conclusion

Therefore, the “A. superbus venom factor 1”, present in the venom of the snake Austrelaps superbus and harmful to the complement system of immunity, is analogous to amino acid sequences found in SARS-CoV-2. The study identified the strings “LYID”, “TAYA” and “NTLT” as the most relevant. The study was restricted to bioinformatics; analyses of the harmfulness or toxicity of these elements are left to professionals and researchers duly qualified for that task.

6. Acknowledgements

I thank all those who contributed to the discovery of several facts associated with the origin of SARS-CoV-2.

7. References

1. Geraldes, Cesar Fernandes. (2020, October 9). Sequencing of amino acids with toxic potential, presented in SARS-CoV-2, are analogous to those found in the venom of the Black Widow Spider. Zenodo. http://doi.org/10.5281/zenodo.4075055
2. Utiyama Shirley Ramos da Rosa, Reason Iara Taborda de Messias, Kotze Lorete Maria da Silva. O sistema complemento nas doenças: genética e patogenia. Rev. Bras. Reumatol. [Internet]. 2004 Aug [cited 2020 Oct 19]; 44(4): 277-286. Available from: http://www.scielo.br/scielo.php?script=sci_arttext&pid=S0482-50042004000400006&lng=en. https://doi.org/10.1590/S0482-50042004000400006
3. UniProt. https://www.uniprot.org/uniprot/Q0ZZJ6 (2020).
4. Vogel, C.-W., & Fritzinger, D. C. (2017). Cobra Venom Factor: The Unique Component of Cobra Venom That Activates the Complement System. In Snake Venoms (H. Inagaki, C.-W. Vogel, A. K. Mukherjee, & T. R. Rahmy (eds.); pp. 345–404). Springer Netherlands. https://doi.org/10.1007/978-94-007-6410-1_4
5. Unipro UGENE User Manual (https://ugene.net/wiki/display/UM36/Unipro+UGENE+User+Manual, 2020).
6. Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994 Nov 11;22(22):4673-80. doi: 10.1093/nar/22.22.4673. PMID: 7984417; PMCID: PMC308517.
7. Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004 Mar 19;32(5):1792-7. doi: 10.1093/nar/gkh340. PMID: 15034147; PMCID: PMC390337.
8. Robinson, C. Was the COVID-19 virus genetically engineered? (https://gmwatch.org/en/news/latestnews/19383, 2020).
9. Yan, Li-Meng, Kang, Shu, Guan, Jie, & Hu, Shanchang. (2020, September 14). Unusual Features of the SARS-CoV-2 Genome Suggesting Sophisticated Laboratory Modification Rather Than Natural Evolution and Delineation of Its Probable Synthetic Route. Zenodo. http://doi.org/10.5281/zenodo.4028830
