SCIS & ISIS 2010, Dec. 8-12, 2010, Okayama Convention Center, Okayama, Japan Proliferation of in Clonal Selection Algorithm with Cell

Takumi Ichimura Takashi Yamaguchi & Kenneth James Mackin Faculty of Management and Information Systems, Faculty of Informatics, Prefectural University of Hiroshima Tokyo University of Information Sciences 1-1-71, Ujina-Higashi, Minami-ku, 4-1 Onaridai, Wakaba-ku, Chiba, 265-8501 Japan Hiroshima, 734-8559, Japan Email: {tyamagu, mackin}@rsch.tuis.ac.jp Email: [email protected]

Abstract—The clonal selection principle explains the basic Experimental results for classification of the medical database features of an adaptive to a antigenic stimulus. will be reported in Section 5. In Section 6, we give some It established the idea that only those cells that recognize the discussions to conclude this paper. are selected to proliferate and differentiate. This paper explains a computational implementation of the clonal selection II.THECLONALSELECTIONTHEORY principle that explicitly takes into account the affinity maturation of the immune response. The clonal selection algorithm in this Burnet proposed the clonal selection theory in order to paper finds the by the 3 kinds of combination types of explain the essential features of adaptive immune response Immunological Memory cells and the proliferation of immune [13][14]. The basic idea of this theory interprets the response cells. This paper tried to classify the medical database of Heart Disease database in the the UCI Repository. of in the face of an antigenic stimulus. Any molecule that can be recognized by the adaptive I.INTRODUCTION is known as antigens (Ags). Some subpopula- In a few decades, the area of artificial immune system (AIS) tions of its bone-marrow-derived cells responds by producing has been an ever-increasing interested in not only theoret- antibody (Ab). Abs are molecules attached primarily to the ical works but applications in pattern recognition, network surface of B cells. The aim of is to recognize and bind security, and optimizations [1],[2],[3],[4],[5],[6],[7]. AIS uses to Ags. Each B cell (B lymphocytes) secretes a single type of ideas gleaned from in order to develop adaptive Ab. By binding to these Abs and with a second signal from systems capable of performing a wide range of tasks in various T-helper cell, the Ag stimulates the B cell to proliferate and research areas. mature into terminal Ab secreting cells called plasma cells. The biological basis of the Clonal Selection Theory was Proliferation of the B cell is a mitotic process whereby the proposed by Burent [13],[14] in 1959. The theory interprets cells divide themselves, creating a set of clones identical to the response of lymphocytes in the face of antigenic stimulus. the parent cell. The proliferation rate is directly proportional Only the immune cells with high affinity are selected to to the affinity level. That is, the higher affinity level of B proliferate, while those low affinity cells must be efficiently cells, the more of them will be readily selected for cloning deleted or become anergic. The hypermutation is allowed and cloned in larger numbers. to improve the affinity of the selected cells to the selective In addition to proliferating and differentiating into plasma antigens. Receptor Editing as a mechanism of immune cell cells, B cells can differentiate into long lived B memory cells. tolerance is reported [9],[10],[11]. Memory cells circulate through the blood, lymph, and tissues. Gao indicated the complementary roles of somatic hypermu- By exposing to a second antigenic stimulus, they commence tation (HM) and receptor editing (RE) and presented a novel to differentiate into plasma cells capable of producing high clonal selection algorithm called RECSA model by incorpo- affinity Abs, which are preselected for the specific Ag that rating the Receptor Editing method [8]. In [8], they discussed stimulated the primary response. the relationship between HM and RE through utilizing them Gao has been proposed the clonal selection method by to solve the TSPs. Because a valid tour in TSP is represented incorporating receptor editing which can diversify the reper- by a permutation of N cities, the number of states to feasible toire of antigen activated B cells during asexual reproduction tours is (N −1)! [1] solves the problems for finding an optimal [8]. The proposed method has two mechanisms of somatic set in the search space consisted of the set of (N − 1)! valid hypermutation and receptor editing [15][16][17]. tours in TSPs. The remainder of this paper is organized as follows. In the III.CLONAL SELECTION ALGORITHMWITH next serction, the clonal selection theory will be explained IMMUNOLOGICAL MEMORY briefly. Section 3 will explain RECSA model proposed in [8]. Clonal Selection Algorithm with Immunological Mem- The response by memory cells will be discussed in Section 4. ory(CSAIM) model has been proposed to introduce an idea of

1329 immunologicaly memory into the RECSA model. This section Step9) Replace the worst c elite pools with new random describes RECSA model and CSAIM model briefly. once every t generations to introduce diversity and prevent the search from being trapped A. RECSA model[8] in local optima. The shape-space model aims at quantitatively describing the Step10)Determine if the maximum number of generation interactions among Ags and Abs (Ag-Ab) [18],[19]. The set of Gmax to evolve is reached. If it is satisfied with this features that characterize a molecule is called its generalized condition, it terminates and returns the best antibody. shape. The Ag-Ab codification determines their spatial repre- Otherwise, go to Step 4). sentation and a distance measure is used to calculate the degree B. CSAIM model of interaction between these molecules. The Gao’s model [8] can be described as follows. 1) Structure of antibody for the medical diagnosis: Fig.1 shows the structure of antibody in the classification problem m Step1) Create an initial pool of antibodies as candidate of Heart Disase Database which is given in the the UC (Ab , Ab , ··· , Ab ) solutions 1 2 m . Irvine Machine Learning Repository [32]. Each of the datasets Step2) Compute the affinity of all antibodies: consists of ten data items as shown in Table I. (D(Ab1 ),D(Ab2 ), ··· ,D(Abm )). D() means the function to compute the affinity. Step3) Select n best individuals based on their affinities HD from the m original antibodies. These antibodies will q be referred to as the elites. Step4) Sort the n selected elites in n separate and distinct w1 w pools in ascending order. They will be referred to as k the elite pools. Step5) the elites in the pool with a rate proportional to

INP its fitness. The amount of clone generated for these INP INPU IN INP IN I

NPU

PUT PUT

Tk UT antibodies is given by Eq.(1). UT UT

k T 2 T

1 3 4 k-2 n − i -1 P = round( × Q) (1) i n i is the ordinal number of the elite pools, Q Fig. 1. The antibody structure is a multiplying factor for determining the scope of the clone and round() is the operator that rounds towards the closest integer. Then, we ∑ w1 w2 wk-1 wk q can obtain pi antibodies as ((Ab1 ,1 , Ab1 ,2 , ··· , ··· ··· Ab1 ,p1 ), , (Abn,1 , Abn,2 , , Abn,pn )). Step6) Subject the clones in each pools through either hyper-

mutation or receptor editing process. The mutation w1 wk-1 w2 wk q rates, Phm for hypermutation and Pre for receptor editing given by Eq.(2) and Eq.(3), are inversely Fig. 2. RE for w2, wk−1 proportional to the fitness of the parent antibody,

Phm = a/D() (2) 2) and Receptor Editing: HM up- dates the randomly selected wi and θ for a P = Pre = (D() − a)/D() (3) (w1, ..., wk, θ) as follows. , where D() is the affinity of the current parent wi = wi + ∆w, θ = θ + ∆θ antibody and a is an appropriate numerous value. , where ∆w, ∆θ are −γ < ∆w < γ , −γ < ∆θ < γ , Step7) Determine the fittest individual Bi in each elite pool w w θ θ respectively. from amongst its mutated clones. The Bi is satisfied with the following equation. RE makes a crossover of 2 set of wi for a paratope as shown in Fig.2. ··· ··· D(Bi) = max(D(Abi,1 , , Abi,pi )), i = 1, 2, , n 3) Affinity: The system calculates the degree of affinities (4) between antibody and antigen by using Eq.(5) and Eq.(6). { ∑ Step8) Update the parent antibodies in each elite pool with k 1 if w xp ≥ θ the fittest individual of the clones and the proba- f(xp) = i=1 i i (5) → 0 otherwise bility P (Abi , Bi) is according to the roles: if { ≥ p p D(Ab1 ) < D(Bi) then P = 1, if D(Abi ) D(Bi) p 1 if f(x ) = xCHD D(Bi)−D(Abi ) g(x ) = (6) then P = 0, otherwise exp( α ). 0 otherwise

1330 Fig. 3. A flowchart of CSAIM model

Eq.(7) calculates the degree of affinity. 6) The procedure from 3 to 5 is executed for all the cases in Tr till the given number of iterations tr∑num o reaches or error becomes less than the expected Affinity = g(xp) (7) value. The good individuals after training are kept p=1 in the memory. , where xp means the pth example in tr num training cases p Fig. 4. The learning algorithm of CSAIM and xCHD is 1 if an example is a CHD, otherwise 0. Fig.3 shows the immune process of CSAIM model. 4) Learning algorithm in CSAIM: The learning algrithm in 5) Immunological Memory: The memory cells which spe- CSAIM is as follows. cializes to react to a partial training case are generated. They can classify similar cases when they meet its pattern again, 1) Select the elite individual without improving the because it is decided the representative case to learn the degree of affinity during τ generations. p characteristics for partial set of training cases in database. 2) Let Tro be training cases that g(x ) in Eq.(6) equals num In addition, the insertion of individuals are implemented in to 0. T is the number of training cases in Tr . ro o every t generations and the individuals are not only randomly 3) For each training case in T , calculate the output ro retrieved but also generated from memory cells. The swap of of the selected individual by using the following good part of individuals are executed and then the system can equation. ∑k make an efficient search. O = wixi (8) i=1 The representative individual is selected when the training

, where wi is a weight for each paratope. results are memorized. The correct ratio to the training case 4) Calculate the difference δ by using output value and is calculated. Then, not only the weight, w and the threshold, θ. θ, but also the representaitive case and its correct ratio for the 1 1 E = δ2 = (θ − O)2 (9) individual in memory cell are recorded. 2 2 5) Update weights. Moreover, the system compares the memory cell with the given training case and measure the distance between the input w = w + ηδx (10) i i i terms in the representative individual and that of the training , where η takes a real value in [0.1, 1.0]. case.

1331 IV. RESPONSEBY MEMORY CELL Antigen : Disgest A. Agreement degree between memory cells and training cases IFN- Activation We must consider which memory cell reacts for a given APC :Cell case. The agreement degree of the memory cell and training Antigen Antigen Presenting Rel ease case is calculated in the following procedure. Presenting IL-12 Step1) Let di be an input value of each attribute in a Promotion IL-10 given training case and hi be the value of each Prom otion attribute in the representative case of memorized cell. CD8 The representative case means that the memory cells CD4 T Cell Promotion IFN- IL-12,IFN- IL-4 IL-12 reacts the case in the past learning. Release Differentiation Differentiation 0 Differentiation hj Step2) If di =6 0 ∧ hi =6 0, calculate d = di × in the Prom otion i dj smallest case of hi. 0 CTL Th1 Th2 Step3) In IV-A, if the distance between di and hi is smaller Rele ase IL-10 than θ, the memory cell reacts to the training case. θ Deactivation Release is a parameter which determines the reacting region IFN- Deactivation IFN- IL-4 of memory cell. IFN- Deactivation IL-10 Activation B. Non antigen specific effects of T-cell memory Recent several research data have suggested that immune B Cell responsiveness is controlled by the state of maturity in T- cell memory populations. Especially, CD8 memory T cells are Fig. 5. The relation between CD8 and CD4 different from CD4 counterparts in several important respects. Resting memory CD8, but not CD4 cells are able to be acti- vated by alone in the absence of antigen [28],[29]. of signals among immune cells and propose a new model This enables CD8 memory cells induced by any antigen in the representing their relations as shown in Fig.6. similar to NK cells, and so on. All these cell types release IFN- γ immediately after stimulation, rather than requiring priming Signal 1 and restimulation. They are responsible for the early release Tcell receptor Specificity Memory of Antigen of cytokines at the start of an acquired immune response. Signal 2 This immediate IFN-γ production is not necessarily directed Magnitude Constimulation Longevity at the pathogen itself, but acts as a general alert signal and Nature of response Signal 3 directs CD4 cells into a Th1 developmental pathway [30]. By Cytokines contrast to CD4 cells, CD8 populations require only activation Memory of danger to develop IFN-γ-secreting capacity, even when was born. CD8 memory cells activated by the Th1-driving cytokines Fig. 6. T cell mediated IL-12+IL-18 are high secretors of IFN-γ that also prolifer- ate and differentiate into into effector cells producing even The signal 1 in Fig.6 controls specificity of an immune higher levels when restimulated by cytokines. CD8 memory response, and memory is antigen-specific, whereas the signals cells also proliferate in response to stimulation by Toll-like 2 and 3 control the nature of the response but only affect receptor(TLR) agonists. Although Th1 CD4 effectors can also the size of the specific memory pool. In the proposed model, be activated by cytokines such IL-12+IL-18 and thus act in a nonspecific form of memory which accumulates over time and bystander fashion, only CD8 cells seem to retain memory of regulates the nature and intensity of primary T-cell responses this exposure to cytokines as they become long-lived memory to unrelated antigens. That is, we divided the memory cells cells. In other words, CD8 memory populations are able to into the 3 following categories: mediate immunological memory not just for specific antigens, but also for the amount of nonspecific stimulation or • An antigen-specific memory cells the host has been exposed to over a prolonged time period. • A correspondence to T-cells with a variant of memory Fig.5 shows the model of these T cell mediated immunity [31]. cells • Memory cells that response to ambiguous antigens C. Generating antibody from immune memory cells Therefore, if memory cells consist c2 individuals and c1 − The RECSA model generates c new individuals randomly c3(= c4) and c3 proliferated individuals by using HM and RE. at each t generations and swaps them for the already existing Then, the number of new individuals at each generation equals antibodies. CSAIM model generates c1 randomly and prolif- to c = c1 + c2, where c1 = c3 + c4. erates c2 individuals by using memory cells among c new Moreover, the proliferated individuals from memory cells individuals. In this paper, we assume that there are 3 kinds fixes the value of θ and set the modification value ∆w to small

1332 range, because the search algorithm finds the fittest individuals 0.91 within a small fluctuation. 0.9

0.89 V. EXPERIMENTAL RESULTS 0.88 A. Database Design 0.87 This paper describes the classification result of medical database of heart disease data set, which is provided in the the 0.86 0.85

UC Irvine Machine Learning Repository [32]. This directroy Classification Ratio of UCI contains 4 databases concerning heart disease diag- 0.84 nosis gathering from the Cleveland Clinic Foundation(303 in- 0.83 stances), the Hungarian Institute of Cardiology(294 instances), the University Hospital in Switzerland(123 instances), and the 0.82 V.A. Medical Center(200 instances). We tested our proposed 0.81 0 200 400 600 800 1000 1200 1400 1600 1800 2000 CSAIM for the databases Cleveland. Each database has the Generation same instance format. While the databases have 76 raw attributes, only 14 of them are actually used. The 14 attribute Fig. 8. Classification Result information is given in the Table I. TABLE II B. Experimental Results PERFORMANCEOFTHEEXTRACTEDRULEBASES Technique Accuracy We use the programming language Java using NetBeans NN RB 76.9% IDE 6.8 on Fedora 13. All training cases are tested on a PC GA RB 83.7% with Core i7, 2.80GHz CPU and 4.0GB RAM. Fig.7 shows the Rough Set RB 81.8% C4.5 RB 81.1% increasing memory cells as the andtibody generated by HM and RE algorithm are stored in the memory pool of immune system. Fig.8 shows the classification capability of Cleveland database. VI.CONCLUSIVEDISCUSSION The system memorizes the memory cell corresponded to The clonal selection principle established the idea that the data sample as the generation increases. As a result, only those cells that recognize the antigens are selected to the classification capability of Cleveland database becomes proliferate and differentiate. The method by [8], RECSA, good, because the memory cell can classify the corresponding explicitly takes into account the affinity maturation of the im- antigen which it met once during learning. mune response by incorporating receptor editing method. This Table II shows the accuracy ratio by reasoning and/or paper describes the classification capability of Heart Disease methods in the research field of computer intelligence in [33]. Databases in UCI Machine Learning Repository. The compu- These ratios are results by using rule based reasoning but not tational results show about 90% correct ratio. The RECSA equal to classification results. We will check the accuracy by model can classify the medical dataset, where includes some using same data set for the testing. ambiguous data. However, the correct ratio was not high in other dataset of CHD DB [12] where the noise training cases 120 Gerated Memory Cells are included. We will challenge to improve our IMCIA which Matched Record by Memory Cells can perform higher classification for the CHD DB dataset. 100 Moreover, future works will focus the lateral interactive clonal selection algorithm [26] to enable the knowledge acquisition 80 automatically.

REFERENCES 60 Number [1] L.N. de Castro and J. Timmis, “Artificial immune systems: A new computational Intelligence Approach,” Springer-Verlag(1996) 40 [2] L.N. de Castro and F.V. Zuben, “An evolutionary immune network for data clustering, ” Proc. IEEE SBRN, pp.84-89

20 [3] L.N. de Castro and F.V. Zuben, “Learning and optimization using clonal selection principle, ” IEEE Trans. Evolutionary Computation, Vol.6, No.3, pp.239-251(2002)

0 [4] J.E. Hunt and D.E. Ciijem “Learning using an artificial immune system, 0 200 400 600 800 1000 1200 1400 ” J. Network Comput. Applicat., Vol.19, No.2, pp.189-212(1996) Generation [5] D. Dasgupta, “Artificial immune systems and their applications, ” Springer-Verlag(1999) Fig. 7. Increasing Memory Cell [6] S.A. Hofmeyr and S. Forrest, “Immunity by design: An artificial immune system, ” Proc. of Genetic and Evolutionary Computation Conf., pp.1289- 1296(1999)

1333 TABLE I DATA ITEMS OF CORONARY HEART DISEASE DATABASE Name Value age age in years sex sex (1 = male; 0 = female) cp: chest pain type 1 = typical angina, 2 = atypical angina, 3 = non-anginal pain, 4 = asymptomatic trestbps resting blood pressure(mmHg) chol serum cholestoral(mg/dl) fbs: fasting blood sugar 1 = true; 0 = false (> 120 mg/dl) restecg 0 = normal, 1 = having ST-T wave abnormality, resting electrocardiographic results 2 = showing probable or definite left ventricular hypertrophy thalach maximum heart rate achieved exang: exercise induced angina 1 = yes, 0 = no oldpeak ST depression induced by exercise relative to rest slope: 1 = upsloping, 2 = flat, 3 = downsloping the slope of the peak exercise ST segment ca number of major vessels (0-3) colored by flourosopy thal 3 = normal, 6 = fixed defect, 7 = reversable defect diagnosis of heart disease 0 = < 50% diameter narrowing, 1 = > 4 50%

[7] Y. Ishida, “The immune system as a prototype of autonomous de- [25] M. Suka, S. Oeda, T. Ichimura, K. Yoshida, et.al., “Comparison of centralized systems: An overview,” Proc. Intl. Sympo. on Autonomous proportional hazard model and neural network models in a real data set of Decentralized Systems, pp.85-92(1997) intensive care unit patients”, Proc.of MEDINFO2004(the World Congress [8] S.Gao, H,Dai, G.Yang, and Z.Tang, “A novel clonal selection algorithm on Medical Informatics), pp.741-745(2004) and its application to travelling salesman problem,” IEICE Trans. Funda- [26] S.Gao, H,Dai, J.Zhang, and Z.Tang, “An expanded lateral interactive mentals, Vol.E90-A, pp.2318-2325(2007) clonal selection algorithm and its application,” IEICE Trans. Fundamen- [9] L.K. Verkoczy, A.S. Martensson, and D. Nemazee, “The scope of receptor tals, Vol.E91-A, pp.2223-2231(2008) editing and its association with ,” Current Opinion in [27] S. Oeda, T. Ichimura,and K.Yoshida, “Immune Multi Agent Neural Immunology, Vol.16, pp.808-814(2004) Network and Its Application to Coronary Heart Disease Database”, Proc. [10] R. Pelanda and R.M. Torres, “Receptor editing for better or for worse,” of the 8th International Conference on Knowledge-Based Intelligent Infor- Current Opinion in Immunology, Vol.18, pp.164-190(2006) mation & Engineering Systems (KES2004), Vol.2, pp.1097-1105(2004) [28] D.F.Tough et al., “An IFN-gamma-dependent pathway controls stimula- [11] N.S. Longo and P.E. Lipsky, “Why do B cells mutate their immunoglob- + ulin receptors? ,” TRENDS in Immunology, Vol.27, No.8, pp.374- tion of memory phenotype CD8 T cell turnover in vivo by IL-12, IL-18, and IFN-gamma”, Journal of Immunology, Vol.166, pp.6007-6011(2001) 380(2006) + [12] M. Suka, T. Ichimura and K. Yoshida, ”Development of coronary [29] G.Lertmemongkolchai et al., “Bystander activation of CD8 T cells heart disease databases”, Proc. of the 8th International Conference contributes to the rapid production of IFN-gamma in response to bacterial on Knowledge-Based Intelligent Information & Engineering Systems pathogens”, Journal of Immunology, Vol.166, pp.1097-1105(2001) (KES2004), Vol.2, pp.1081-1088(2004) [30] M.Afkarian, et al., “T-best is a STAT1-induced regulator of IL-12R expression in native CD4+ T cells”, Nature Immunology, Vol.3, 549- [13] F.M. Burnet, “Clonal selection and after,” in Theoretical Immunology, 557(2002) G.I. Bell, A.S. Perelson, and G.H. Pimbley Jr. Eds. New York; Marcel [31] A. Noble, “Do we have memory of danger as well as antigen?”, Trends Dekker, pp.63-85(1978) in Immunology, Vol.30, No.4, pp.150-156(2009) [14] F.M. Burnet, The Clonal Selection Theory of Acquired Immunity. [32] David W. Aha, “Heart Disease DataSet”, Learning Repository, Cambridge, U.K.: Cambridge Univ. Press(1959) http://archive.ics.uci.edu/ml/datasets/Heart+Disease(1988) [15] M.C. Nussenzweig, “Immune receptor editing: Revise and select,” Cell, [33] I.Jagielska, C.Matthews, T.Whitfort, “An investigation into the applica- Vol.95, pp.875-878(1998) tion of neural networks, fuzzy logic, genetic algorithms, and rough sets to [16] S. Tonegawa, “somatic generation of antibody diversity,” Nature, automated knowledge acquisition for classication problems” Neurocom- Vol.302, pp.575-581(1983) puting Vol.24,pp.37-54(1999) [17] V. Kouskoff and D. Nemazee, “Role of receptor editing and revision in shaping the B and T repetorie,” Life Sciences, Vol.69, pp.1105-1113(2001) [18] A.S. Perelson and G.F. Oster, “Theoretical studies of clonal selection: Minimal antibody repertoire size and reliability of self-nonself discrimi- nation,” J. Theor. Biol., Vol.81, No.4, pp. 645-670(1979) [19] A.S. Perelson, “Immune network theory,” Immunological Review, Vol.110, pp.5-36(1993) [20] P.J. Lucas and A. Abu-Hanna, “Prognostic methods in medicine,” Artif. Intell. Med. Vol.15, pp.105-119(1999) [21] W. Penny and D. Frost, “Neural networks in clinical medicine,” Med. Decis. Making Vol.16, pp.386-398(1996) [22] P.W.F. Wilson, R.B. D’Agostino, D. Levy, A.M. Belanger, et.al., “Pre- diction of coronary heart disease using risk factor categories,” Circulation Vol.97, pp.1837-1847(1998) [23] T.R. Dawber, W.B. Kannel, N. Revotskie, J. 3rd. Stokes, et.al., “Some factors associated with the development of coronary heart disease: six year’s follow-up experience in the Framingham Study,” Am. J. Public Health, Vol.49, pp.1349-1356(1959) [24] W.B. Kannel, T.R. Dawber, A. Kagan, N. Revotskie, et.al., “Factors of risk in the development of coronary heart disease: six-year follow-up experience,”, The Framingham Study. Ann. Intern. Med., Vol.55, pp.33- 50(1961)

1334