338 Int'l Conf. Security and Management | SAM'16 |

A DNA-Based Cryptographic Key Generation Algorithm

Shakir M. Hussain and Hussein Al-Bahadili
Department of CIS and Computer Network, University of Petra, Amman, Jordan

Abstract: This paper presents a detailed description of a new DNA-based cryptographic key generation algorithm that can be used to generate strong cryptographic key(s) for symmetric ciphering applications. The algorithm uses an initial private/secret key as an input to the Key-Based Random Permutation (KBRP) algorithm to generate a permutation of size n, which is half the size of the required cryptographic key, and to derive four vectors of size n representing the DNA bases (A, C, G, and T) of the private key. The DNA vectors are mathematically processed using a linear formula to generate the cryptographic key. The generated bases are re-permuted using the same permutation vector and re-processed to determine new cryptographic keys, and this can be repeated for as many new cryptographic keys as are required. The performance of the new algorithm is evaluated in two different scenarios that demonstrate its high potential for providing highly random cryptographic key(s). The results show that the generated cryptographic keys always have ≈0.7 entropy, and acceptable maximum and average run lengths for both 0's and 1's for various key lengths and private keys.

Keywords: DNA, DNA key generation, key generation, strong key, random permutation, KBRP.

1 Introduction

There has been a tremendous growth in the number and type of attacks that data security specialists must deal with to protect sensitive valuable data, or data vulnerable to unauthorized disclosure or undetected modification, during transmission or while in storage [1]. Cryptography is a method of coding/decoding data so that it becomes unreadable and inaccessible to unauthorized users, and it is often used to protect data during transmission or while in storage [2]. Cryptography relies upon two main components: a cryptographic algorithm and a cryptographic key. The algorithm is a mathematical function, and the key is a parameter used by that function [3].

Cryptographic algorithms can be classified into symmetric and asymmetric algorithms. Symmetric algorithms use the same key to encrypt and decrypt data. This key must be kept secret and disclosed only to authorized parties; therefore it is referred to as the secret key or private key [4]. A symmetric algorithm processes data (plaintext) with the secret key to create encrypted data (ciphertext). Examples of symmetric algorithms are DES, RC2, 3DES, AES, etc. [5]. These algorithms process the secret key to generate the required cryptographic key or keys. They are extremely fast and well suited for large data encryption. However, they face the problem of how to secure the secret key and how to securely exchange it between communicating parties across insecure communication channels.

Asymmetric algorithms use two mathematically related keys: one is disclosed to the public (hence it is referred to as the public key), and the other is kept by and known only to the user (hence it is referred to as the private key) [4]. In such algorithms, data encrypted with either of these two keys can only be decrypted using the other key. Which of the keys is used for encryption depends on the targeted security service (confidentiality or authentication). Examples of asymmetric algorithms include Rivest-Shamir-Adleman (RSA), Diffie-Hellman (DH), ElGamal, etc. [4, 5]. They use mathematical functions for encryption/decryption and key generation; therefore, they are relatively slow and are mainly used for securing key exchange over insecure communication channels.

Symmetric algorithms can be classified into block ciphers and stream ciphers. A block cipher (such as DES, 3DES, AES, etc.) applies a deterministic and computable function repeatedly to encrypt a block of data at once as a group, using a different fixed-length cryptographic key for each cryptographic round. A stream cipher combines a plaintext stream with a cryptographic key stream to produce a cipher stream, where each digit of the plaintext is encrypted one at a time with the corresponding digit of the cryptographic key stream to give a digit of the ciphertext stream. The keys are generated using logical procedures or mathematical functions, which normally use some initial value or password [4, 5].

It must be well understood that a lack of randomness in the logical procedures or mathematical functions of the key generators, or weak passwords, is disastrous and may lead to cryptanalytic breaks. Therefore, a number of high-randomness, strong key generators have been developed [6, 7]. However, due to the exponential growth in the processing power of computing systems and the tremendous advancement of cryptanalysis techniques, increasingly powerful cryptographic algorithms and key generators are required.

Thus, in line with the growing need for powerful cryptography, new cryptography techniques have emerged, such as quantum cryptography and DNA cryptography. Quantum cryptography (QC) exploits quantum mechanical properties (e.g., the counterintuitive behavior of elementary particles such as photons) to perform cryptographic tasks [8]. The best known example of this type of cryptography is quantum key distribution (QKD), which offers a high-security solution to the key exchange problem rather than to data encryption [9]. However, it has been discovered that QC may not be as secure as it was presumed to be: energy-time entanglement, which forms the basis for many QC systems, has been found to be vulnerable to attack [10, 11].

ISBN: 1-60132-445-6, CSREA Press ©

DNA cryptography, which builds on the concept of DNA computing, is emerging as a promising new cryptographic field, where DNA is used to carry information or as an alternative data encoding approach [12]. During the last two decades, many DNA-based algorithms have been developed and used for data cryptography and cryptographic key generation [13].

In this paper, we present a detailed description of a new DNA-based cryptographic key generation algorithm that can be used to generate strong cryptographic key(s) for symmetric ciphering applications. The performance of the algorithm is evaluated through two different scenarios to demonstrate its high potential for providing strong cryptographic key(s). The performance measures used to evaluate the algorithm and compare it against other key generation algorithms are the minimum, maximum, and average run length of 0's and 1's, and the entropy of the key's binary sequence.

This paper is divided into six sections. This section presents the main theme of the paper. The next section provides a brief background on the concept of DNA. Section 3 reviews some of the most recent and related research on DNA cryptography. The new DNA-based cryptographic key generation algorithm is given in Section 4. Section 5 describes two different scenarios that are used to evaluate the randomness of the generated cryptographic keys. Finally, in Section 6, conclusions are drawn and recommendations for future research are pointed out.

2 DNA Background

Deoxyribonucleic acid (DNA) is a molecule that represents the genetic material of all living organisms. It is the information carrier of all life forms and is considered the genetic blueprint of any living creature. DNA molecules consist of two long chains held together by complementary base pairs, twisted around each other to form a double-stranded helix with the bases on the inside. A DNA sequence consists of four nucleic acid bases, A (adenine), C (cytosine), G (guanine), and T (thymine), where A and T are complementary, and C and G are complementary [3]. The base pairing mechanism is the basis for DNA replication, which is shown in Figure 1 [1].

Figure 1: Structure of DNA.

One of the most basic attributes of a DNA strand is that it has an orientation, and strands with different orientations are distinct from each other, e.g., TCCGAATGC is distinct from ATCGATCGC. Another basic attribute is the reverse complement, which is obtained in two stages: first, reverse the order of the bases of the DNA strand; second, take the complement of each base of the reversed strand, where the complement of A is T, the complement of C is G, and vice versa. For example, the reverse complement of AGCTAACC is GGTTAGCT [13].

The DNA bases {A, C, G, T} are represented in binary code using the simplest coding pattern of four digits 0, 1, 2, and 3, respectively. Each digit is represented as a 2-bit pattern as follows: 0 as A→00, 1 as C→01, 2 as G→10, and 3 as T→11.

The DNA sequence ACGT has 4! = 24 possible orderings, each of which has a different numeric encoding format (e.g., 0123 for ACGT, 0132 for ACTG, 0213 for AGCT, etc.), and consequently each encoding format has a different binary representation [14].

3 Literature Review

A number of key derivation approaches have been developed throughout the years, such as functional-based, biometric-based, voice-based, etc.; a review of some of these techniques is given in [7]. More recently, a new DNA-based approach has been identified. DNA cryptography is a promising research approach that emerged with the evolution of the DNA computing field. DNA can be used to store and transmit information and also to perform computation. The extensive parallelism and extraordinary information density built into this molecule can be exploited for cryptographic purposes. Several DNA-based algorithms have been proposed and used in many applications, such as encryption, key generation, authentication, etc. [12]. This section briefly reviews some of the most recent research in this area.

Gupta and Jain [15] developed a method for image encryption based on DNA computation technology. In this method, a secret key is first generated using a DNA sequence and modular arithmetic operations. Then each pixel value of the image undergoes the encryption process using the key and DNA computation methods. The algorithm demonstrates a satisfactory computing security level in the encryption security estimating system. Zhang et al. [16] proposed an image encryption algorithm based on the DNA sequence addition operation. The results and security analysis show that the algorithm achieves a good encryption effect and can resist exhaustive, statistical, and differential attacks.

Al-Wattar et al. [17, 18] presented alternative key-dependent DNA-based approaches for the MixColumns and ShiftRows transformations used in the AES algorithm, which have characteristics identical to those of the original AES algorithm while increasing its resistance against attack. Varma and Raju [14] analyzed a different approach to DNA cryptography based on matrix manipulation and a secure key generation scheme.
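The 2-bit base encoding and the reverse complement described in Section 2 can be illustrated with a short Python sketch. The function and constant names below are hypothetical and chosen only for illustration; the mapping itself (A→00, C→01, G→10, T→11) and the two-stage reverse complement follow the text.

```python
from itertools import permutations

# 2-bit coding pattern from Section 2: A->00, C->01, G->10, T->11.
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}

# Complementary base pairs: A<->T and C<->G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def encode(sequence: str) -> str:
    """Map a DNA string to its binary representation, two bits per base."""
    return "".join(BASE_TO_BITS[base] for base in sequence)


def decode(bits: str) -> str:
    """Map a binary string (even length) back to DNA bases."""
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def reverse_complement(sequence: str) -> str:
    """Stage 1: reverse the strand. Stage 2: complement every base."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


# The 4! = 24 orderings of the four bases, each giving a different coding rule.
ALL_ORDERINGS = ["".join(p) for p in permutations("ACGT")]

print(encode("ACGT"))                  # binary form of the four bases in order
print(reverse_complement("AGCTAACC"))  # the example strand from Section 2
```

Only the first of the 24 orderings (ACGT as 0, 1, 2, 3) is used for encoding here, which matches the coding pattern adopted in the text.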


Liu et al. [19] developed an encryption method using the DNA complementary rule, where a piecewise linear chaotic map is used for permutation and substitution is then performed using the complementary rule. An extensive review of DNA cryptography and its basic encryption techniques is presented in [12, 20].

4 The Proposed Algorithm

A private key may be considered as a living creature with a genetic blueprint (i.e., DNA) that can be derived and used as a cryptographic key in single-key symmetric algorithms. The DNA can also be used to derive further cryptographic keys for multi-key symmetric algorithms. For example, DES requires sixteen 48-bit keys and AES requires ten 128-bit keys [4, 5].

The proposed DNA-based cryptographic key generation algorithm can be summarized as follows:

1. A private key is used to generate a permutation P of size n, where n is half the size of the required cryptographic key (k), using any permutation generation algorithm. In this work we use the KBRP algorithm [21]. The KBRP method derives one permutation of size n out of the n! possible permutations for any given private key or password. For a k-bit key, n = k/2 (e.g., for DES, since k = 56, n = 28).

2. The permutation P is used to generate the DNA-based cryptographic key as follows:

   a. Convert each of the n values of the permutation P to its equivalent binary value (one byte each).

   b. Convert each two consecutive bits to an integer value between 0 and 3.

   c. Store these integer values in a vector V of size 4n.

   d. Split the vector V into four vectors (V1, V2, V3, and V4), each of size n.

   e. Permute the vectors (V1, V2, V3, V4) using the permutation P to produce the permuted vectors (PV1, PV2, PV3, PV4).

   f. For a single cryptographic key application, the n elements of the DNA can be calculated as:

      For u = 1 To n
         DNA(u) = (PV1(u) + PV2(n-u+1) + PV3(u) + PV4(n-u+1)) % 4      (1)
      Next u

      For a multi cryptographic key application, the DNA bases can be calculated as:

      For v = 1 To m
         For u = 1 To n
            DNA(u,v) = (PV1(u) + PV2(n-u+1) + PV3(u) + PV4(n-u+1)) % 4      (2)
         Next u
         Permute PV1, PV2, PV3, and PV4 using the permutation P
      Next v

      where m represents the number of required cryptographic keys, for example 16 cryptographic round keys, one for each round of the DES algorithm, or 10 cryptographic round keys for the AES algorithm. Each DNA vector has n values, each between 0 and 3, which can be converted to DNA bases.

   g. Convert each DNA base to its 2-bit equivalent value (A as 0→00, C as 1→01, G as 2→10, and T as 3→11). This yields the k-bit cryptographic key(s).

In this method the DNA components are randomly distributed over the generated DNA-based key without any prior knowledge of the occurrence of each DNA component.

5 Performance Evaluation

To demonstrate the potential and evaluate the statistical performance of the new DNA-based cryptographic key generation algorithm, we develop two scenarios. In the first scenario (Scenario #1), we determine the statistical parameters (minimum, maximum, and average run length of 0's and 1's, and entropy) for a number of cryptographic keys generated by the new DNA-based algorithm using different private keys, namely "Computer", "Ad-Hoc", and "CDMA&2000". Cryptographic keys of different sizes (64, 128, 256, 512, and 1024 bits) are generated using the same set of private keys. The statistical features of the generated cryptographic keys are shown in Table (1).

In particular, the results show that the generated cryptographic keys have entropy between 0.672 and 0.693 (≈0.7), a controlled run length for both 0's and 1's for all key lengths, and an acceptable average run length. For example, across the three private keys, the longest run of 1's in a 1024-bit key is 14, which is equivalent to 1.4% of the total key length.

In the second scenario (Scenario #2), we use the new algorithm to generate the cryptographic (round) keys for the DES algorithm (16 rounds), and compare the statistical parameters of the generated keys against those generated using the standard DES key generator [4]. The results are presented in Table (2) for the new algorithm and in Table (3) for the DES key generator. The private key used in this scenario is "Computer", from which sixteen 48-bit round keys are generated.

It can be clearly seen from Tables (2) and (3) that the new algorithm provides promising statistical results for the keys in terms of entropy and minimum, maximum, and average run length for both 0's and 1's. The features are very competitive with those of the standard DES key generator.
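Steps 1 and 2(a) to 2(g) of Section 4, including Eqs. (1) and (2), can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. In particular, `stand_in_permutation` is a hypothetical placeholder that seeds a standard shuffle from a hash of the private key; it is not the KBRP method [21], so the keys it produces will differ from those reported in the tables. The sketch also assumes that permutation values run from 0 to n-1, that "permuting a vector with P" means output[i] = vector[P[i]], and that n ≤ 256 so each value fits in one byte.

```python
import hashlib
import random


def stand_in_permutation(private_key: str, n: int) -> list[int]:
    """Hypothetical placeholder for KBRP: a key-seeded shuffle of 0..n-1."""
    digest = hashlib.sha256(private_key.encode("utf-8")).digest()
    perm = list(range(n))
    random.Random(int.from_bytes(digest, "big")).shuffle(perm)
    return perm


def permute(vector: list[int], perm: list[int]) -> list[int]:
    """Assumed convention: output[i] = vector[perm[i]]."""
    return [vector[p] for p in perm]


def dna_keys(private_key: str, k: int, m: int = 1) -> list[str]:
    """Return m cryptographic keys of k bits each, as strings of '0'/'1'."""
    if k % 2 != 0 or not 2 <= k <= 512:
        raise ValueError("sketch supports even k with n = k/2 <= 256")
    n = k // 2
    perm = stand_in_permutation(private_key, n)              # step 1

    # Steps a-c: one byte per permutation value, read as four 2-bit digits.
    v = [(value >> shift) & 0b11 for value in perm for shift in (6, 4, 2, 0)]

    # Step d: split V (size 4n) into V1..V4, each of size n.
    parts = [v[i * n:(i + 1) * n] for i in range(4)]

    # Step e: permute each vector with P to obtain PV1..PV4.
    pv1, pv2, pv3, pv4 = (permute(part, perm) for part in parts)

    keys = []
    for _ in range(m):
        # Step f, Eqs. (1)/(2). The 1-based index n-u+1 becomes n-1-u here.
        dna = [(pv1[u] + pv2[n - 1 - u] + pv3[u] + pv4[n - 1 - u]) % 4
               for u in range(n)]
        # Step g: each base value 0..3 (A, C, G, T) becomes two bits.
        keys.append("".join(format(base, "02b") for base in dna))
        # Re-permute PV1..PV4 with P before deriving the next key.
        pv1, pv2, pv3, pv4 = (permute(x, perm) for x in (pv1, pv2, pv3, pv4))
    return keys


# Scenario #2 style usage: sixteen 48-bit round keys from one private key.
round_keys = dna_keys("Computer", k=48, m=16)
print(len(round_keys), len(round_keys[0]))
```

With m = 1 the loop reduces to Eq. (1); with m > 1 it follows Eq. (2), re-permuting the four vectors after each key.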


Table (1) – Scenario #1.

Private key   Key length   Run length for 0         Run length for 1         Entropy
                           Min.   Max.   Avg.       Min.   Max.   Avg.
Computer      64-bit       1      8      2.375      1      3      1.733      0.675
Computer      128-bit      1      9      2.567      1      5      1.645      0.672
Computer      256-bit      1      7      2.200      1      6      2.033      0.693
Computer      512-bit      1      11     2.205      1      7      2.008      0.692
Computer      1024-bit     1      8      1.909      1      11     2.122      0.692
Ad-Hoc        64-bit       1      4      2.063      1      5      2.067      0.693
Ad-Hoc        128-bit      1      6      1.968      1      9      2.094      0.692
Ad-Hoc        256-bit      1      5      1.955      1      6      1.838      0.693
Ad-Hoc        512-bit      1      10     2.235      1      9      2.067      0.692
Ad-Hoc        1024-bit     1      9      1.962      1      8      1.924      0.693
CDMA&2000     64-bit       1      5      2.467      1      5      1.800      0.681
CDMA&2000     128-bit      1      6      1.935      1      6      2.194      0.691
CDMA&2000     256-bit      1      5      1.625      1      6      1.931      0.689
CDMA&2000     512-bit      1      9      1.963      1      6      1.830      0.693
CDMA&2000     1024-bit     1      7      2.107      1      14     2.098      0.693

Table (2) – Scenario #2. Statistical parameters using the DNA-based key generator (private key is "Computer").

Round   Run length for 0         Run length for 1         Entropy
        Min.   Max.   Avg.       Min.   Max.   Avg.
1       1      6      2.063      1      3      1.438      0.677
2       1      6      2.071      1      6      1.800      0.693
3       1      7      2.615      1      4      1.692      0.670
4       1      5      2.267      1      4      1.571      0.670
5       1      4      1.625      1      4      2.000      0.691
6       1      6      2.200      1      5      1.438      0.677
7       1      5      2.133      1      5      1.600      0.683
8       1      7      2.357      1      4      1.533      0.677
9       1      6      2.143      1      7      1.857      0.691
10      1      6      2.385      1      4      1.786      0.687
11      1      5      1.929      1      5      2.071      0.693
12      1      6      2.308      1      3      2.000      0.691
13      1      5      2.429      1      6      1.692      0.670
14      1      4      1.944      1      3      1.235      0.662
15      1      4      2.231      1      8      2.250      0.693
16      1      4      1.722      1      3      1.389      0.687

Table (3) – Scenario #2. Statistical parameters using the DES key generator (private/secret key is "Computer").

Round   Run length for 0         Run length for 1         Entropy
        Min.   Max.   Avg.       Min.   Max.   Avg.
1       1      9      2.400      1      4      2.400      0.683
2       1      5      1.769      1      5      1.923      0.677
3       1      8      2.300      1      7      2.273      0.677
4       1      4      1.909      1      4      2.250      0.662
5       1      3      1.571      1      3      1.733      0.670
6       1      4      1.769      1      6      1.923      0.677
7       1      8      2.556      1      6      3.125      0.677
8       1      9      1.917      1      6      2.083      0.677
9       1      3      1.438      1      5      1.667      0.677
10      1      4      1.833      1      6      2.364      0.670
11      1      6      1.846      1      3      1.846      0.683
12      1      4      1.769      1      6      1.923      0.677
13      1      7      1.643      1      4      1.786      0.677
14      1      8      2.273      1      4      1.917      0.687
15      1      6      1.917      1      5      2.083      0.677
16      1      8      2.889      1      5      2.444      0.691
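The statistical parameters reported in Tables (1) to (3) can be computed along the following lines. This is a hedged sketch: the paper does not state its entropy formula, so the code assumes the first-order Shannon entropy of the bit frequencies measured in nats (natural logarithm), a reading suggested by the fact that the largest tabulated value, 0.693, equals ln 2 to three decimals. It may not reproduce every tabulated value exactly. All function names are hypothetical.

```python
import math
from itertools import groupby


def run_lengths(bits: str, symbol: str) -> list[int]:
    """Lengths of all maximal runs of `symbol` ('0' or '1') in a bit string."""
    return [len(list(group)) for value, group in groupby(bits) if value == symbol]


def run_stats(bits: str, symbol: str):
    """Return (min, max, average) run length of `symbol`, or None if absent."""
    runs = run_lengths(bits, symbol)
    if not runs:
        return None
    return min(runs), max(runs), sum(runs) / len(runs)


def entropy_nats(bits: str) -> float:
    """Assumed definition: Shannon entropy of the 0/1 frequencies, in nats."""
    p_one = bits.count("1") / len(bits)
    return -sum(p * math.log(p) for p in (p_one, 1.0 - p_one) if p > 0.0)


sample = "0011101"
print(run_stats(sample, "0"))   # runs of 0's are "00" and "0"
print(run_stats(sample, "1"))   # runs of 1's are "111" and "1"
print(round(entropy_nats("01" * 24), 3))  # balanced 48-bit string
```

Under this assumed definition the entropy is bounded above by ln 2 ≈ 0.693, reached when 0's and 1's are equally frequent, so it measures bit balance only; the run-length statistics are what capture how the bits are arranged.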


6 Conclusions

This paper presents a detailed description of a new DNA-based cryptographic key generation algorithm that can be used to generate strong cryptographic key(s) for symmetric ciphering applications. The algorithm is used in two different scenarios to demonstrate its high potential for providing strong cryptographic key(s). The two scenarios show that the generated cryptographic keys always have an entropy of approximately 0.7, a controlled run length for both 0's and 1's for all key lengths, and an acceptable average run length. For a 48-bit cryptographic key, the maximum run length is 14% of the key length for 0's and 9% for 1's, and the average run length is 4% for both 0's and 1's. These percentages decrease with increasing key length.

This algorithm is at an early stage of development and opens up an area of interesting research. For example: (1) develop and perform more evaluation procedures and techniques, and (2) use the algorithm as a cryptographic key generator for the standard symmetric encryption algorithms (e.g., DES, 3DES, AES, IDEA, etc.) and compare the statistical randomness of the produced ciphertext against that obtained with the standard key generator of each of these algorithms.

REFERENCES

[1] M. Zhang, M. X. Cheng, and T. J. Tarn. A mathematical formulation of DNA computation. IEEE Transactions on NanoBioscience, Vol. 5, No. 1, pp. 32-40, 2006.

[2] P. Saxena, A. Singh, and S. Lalwani. Use of DNA for computation, storage and cryptography of information. International Journal of Innovative Technology and Exploring Engineering (IJITEE), Vol. 3, Issue 2, pp. 2278-3075, 2013.

[3] B. Roy, G. Rakshit, and R. Chakraborty. Enhanced key generation scheme based cryptography with DNA logic. International Journal of Information and Communication Technology Research, Vol. 1, No. 8, December 2011.

[4] B. A. Forouzan. Introduction to Cryptography and Network Security. McGraw-Hill (International Ed.), 2008.

[5] W. Stallings. Cryptography and Network Security: Principles and Practice. Prentice Hall (6th Ed.), 2014.

[6] E. Barker and A. Roginsky. Recommendation for Cryptographic Key Generation. NIST Special Publication 800-133, 2012.

[7] S. M. Hussain and H. Al-Bahadili. A password-based key derivation algorithm using the KBRP method. American Journal of Applied Sciences, Vol. 5, No. 7, pp. 777-782, 2008.

[8] L. Chen, S. Jordan, Y.-K. Liu, D. Moody, R. Peralta, R. Perlner, and D. Smith-Tone. Report on post-quantum cryptography. National Institute of Standards and Technology Internal Report, NISTIR 8105, February 2016.

[9] A. Mink, S. Franke, and R. Perlner. Quantum Key Distribution (QKD) and Commodity Security Protocols: Introduction and Integration. International Journal of Network Security & Its Applications (IJNSA), Vol. 1, No. 2, pp. 101-112, July 2009.

[10] L. Lydersen, C. Wiechers, C. Wittmann, D. Elser, J. Skaar, and V. Makarov. Hacking commercial quantum cryptography systems by tailored bright illumination. Nature Photonics, Vol. 4, pp. 686-689, 2010.

[11] J. Jogenfors, A. M. Elhassan, J. Ahrens, M. Bourennane, and J. A. Larsson. Hacking the Bell test using classical light in energy-time entanglement-based quantum key distribution. Science Advances, Vol. 1, No. 11, 2015.

[12] T. Mandge and V. Choudhary. A review on emerging cryptography technique: DNA cryptography. International Journal of Computer Applications (IJCA), Vol. 13, pp. 9-13, February 2013.

[13] B. B. Raj and V. Panchami. DNA-based cryptography using permutation and random key generation method. International Journal of Innovative Research in Science, Engineering and Technology, Vol. 3, Issue 5, pp. 263-267, July 2014.

[14] P. S. Varma and K. G. Raju. Cryptography based on DNA using random key generation scheme. International Journal of Science Engineering and Advance Technology (IJSEAT), Vol. 2, Issue 7, pp. 168-175, July 2014.

[15] R. Gupta and A. Jain. A new image encryption algorithm based on DNA approach. International Journal of Computer Applications, Vol. 85, No. 18, pp. 27-31, January 2014.

[16] Q. Zhang, L. Guo, X. Xue, and X. Wei. An image encryption algorithm based on DNA sequence addition operation. Proceedings of the 4th International Conference on Bio-Inspired Computing (BIC-TA '09), pp. 1-5, Beijing, China, 16-19 October 2009.

[17] A. H. Al-Wattar, R. Mahmod, Z. A. Zukarnain, and N. Udzir. A new DNA based approach of generating key dependent MixColumns transformation. International Journal of Computer Networks & Communications (IJCNC), Vol. 7, No. 2, pp. 93-102, March 2015.

[18] A. Al-Wattar, R. Mahmod, Z. Zukarnain, and N. Udzir. A new DNA based approach of generating key-dependent ShiftRows transformation. International Journal of Network Security and Its Applications (IJNSA), Vol. 7, No. 1, January 2015.

[19] H. Liu, X. Wang, and A. Kadir. Image encryption using DNA complementary rule and chaotic maps. Applied Soft Computing, Vol. 12, pp. 1457-1466, 2012.

[20] P. Paganini. The future of data security: DNA cryptography and . Retrieved from http://securityaffairs.co/wordpress/33879/security/dna-cryptography.html on 20th February 2015.

[21] S. M. Hussain and N. M. Ajlouni. Key-based random permutation (KBRP). Journal of Computer Science, Vol. 2, No. 5, pp. 419-421, 2006.
