
Applied Mathematics, 2021, 12, 131-239 https://www.scirp.org/journal/am ISSN Online: 2152-7393 ISSN Print: 2152-7385

Table of Contents

Volume 12 Number 3 March 2021

Uniqueness of Positive Radial Solutions for a Class of Semipositone Systems on the Exterior of a Ball
A. Mohamed, K. A. Abbakar, A. Awad, O. Khalil, B. M. Acyl, A. A. Youssouf, M. Mousa (p. 131)

Discrete Model of Plasticity and Failure of Crystalline Materials
V. L. Busov (p. 147)

Tuning of Prior Covariance in Generalized Least Squares
W. Menke (p. 157)

Information Models for Forecasting Nonlinear Economic Dynamics in the Digital Era
A. Akaev, V. Sadovnichiy (p. 171)

A Stochastic SVIR Model for Measles
M. Seydou, O. Moussa Tessa (p. 209)

An Oracle Bone Inscription Detector Based on Multi-Scale Gaussian Kernels
G. Y. Liu, S. H. Chen, J. Xiong, Q. J. Jiao (p. 224)

Applied Mathematics (AM) Journal Information


COPYRIGHT

Copyright and reuse rights for the front matter of the journal: Copyright © 2021 by Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/

Copyright for individual papers of the journal: Copyright © 2021 by author(s) and Scientific Research Publishing Inc.

Reuse rights for individual papers: Note: At SCIRP authors can choose between CC BY and CC BY-NC. Please consult each paper for its reuse rights.

Disclaimer of liability Statements and opinions expressed in the articles and communications are those of the individual contributors and not the statements and opinion of Scientific Research Publishing, Inc. We assume no responsibility or liability for any damage or injury to persons or property arising out of the use of any materials, instructions, methods or ideas contained herein. We expressly disclaim any implied warranties of merchantability or fitness for a particular purpose. If expert assistance is required, the services of a competent professional person should be sought.

Applied Mathematics, 2021, 12, 131-146 https://www.scirp.org/journal/am ISSN Online: 2152-7393 ISSN Print: 2152-7385

Uniqueness of Positive Radial Solutions for a Class of Semipositone Systems on the Exterior of a Ball

Alhussein Mohamed1*, Khalid Ahmed Abbakar1,2, Abuzar Awad1, Omer Khalil1,3, Bechir Mahamat Acyl1, Abdoulaye Ali Youssouf1, Mohammed Mousa1

1 College of Mathematics and Statistics, Northwest Normal University, Lanzhou, China
2 Department of Mathematics and Physics, Faculty of Education, University of Gadarif, Gadarif, Sudan
3 Department of Science, College of Education, Sudan University of Science and Technology, Khartoum, Sudan

How to cite this paper: Mohamed, A., Abbakar, K.A., Awad, A., Khalil, O., Acyl, B.M., Youssouf, A.A. and Mousa, M. (2021) Uniqueness of Positive Radial Solutions for a Class of Semipositone Systems on the Exterior of a Ball. Applied Mathematics, 12, 131-146. https://doi.org/10.4236/am.2021.123009

Received: November 16, 2020
Accepted: March 9, 2021
Published: March 12, 2021

Copyright © 2021 by author(s) and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY 4.0). http://creativecommons.org/licenses/by/4.0/ Open Access

Abstract

In this paper, we study the positive radial solutions for elliptic systems to the nonlinear BVP:

$$
\begin{cases}
-\Delta u = \lambda K_1(|x|)\, f_1(u,v) & \text{in } \Omega,\\
-\Delta v = \lambda K_2(|x|)\, f_2(u,v) & \text{in } \Omega,\\
u(x) \to 0,\ v(x) \to 0 & \text{as } |x| \to \infty,\\
\dfrac{\partial u}{\partial \eta} + \tilde c_1(u)\,u = 0 & \text{on } |x| = r_0,\\
\dfrac{\partial v}{\partial \eta} + \tilde c_2(v)\,v = 0 & \text{on } |x| = r_0,
\end{cases}
$$

where $\Delta u = \operatorname{div}(\nabla u)$ and $\Delta v = \operatorname{div}(\nabla v)$ are the Laplacians of $u$ and $v$, $\lambda$ is a positive parameter, $\Omega = \{x \in \mathbb{R}^N : N > 2,\ |x| > r_0,\ r_0 > 0\}$, and for $i = 1, 2$, $K_i : [r_0, \infty) \to (0, \infty)$ is a continuous function such that $\lim_{r\to\infty} K_i(r) = 0$, $\partial/\partial\eta$ is the outward normal derivative, and $\tilde c_i : [0,\infty) \to (0,\infty)$ is a continuous function. We discuss existence and multiplicity results for classes of $f$ with a) $f_i(0) > 0$, b) $f_i(0) < 0$, and c) $f_i(0) = 0$. We base our existence and multiplicity results on the method of sub-super solutions. We also discuss some uniqueness results.

Keywords Elliptic System, Positive Radial Solution, Exterior Domains, Fixed Point Index

1. Introduction

In reaction diffusion processes, steady states define the long term dynamics.

DOI: 10.4236/am.2021.123009 Mar. 12, 2021 131 Applied Mathematics

A. Mohamed et al.

Here we consider a steady state reaction diffusion equation on an exterior domain with a nonlinear boundary condition on the interior boundary. Namely, we study positive radial solutions to:

$$
\begin{cases}
-\Delta u = \lambda K_1(|x|)\, f_1(u,v) & \text{in } \Omega,\\
-\Delta v = \lambda K_2(|x|)\, f_2(u,v) & \text{in } \Omega,\\
u(x) \to 0,\ v(x) \to 0 & \text{as } |x| \to \infty,\\
\dfrac{\partial u}{\partial \eta} + \tilde c_1(u)\,u = 0 & \text{on } |x| = r_0,\\
\dfrac{\partial v}{\partial \eta} + \tilde c_2(v)\,v = 0 & \text{on } |x| = r_0,
\end{cases}
\tag{1.1}
$$

where $\Delta u = \operatorname{div}(\nabla u)$ and $\Delta v = \operatorname{div}(\nabla v)$ are the Laplacians of $u$ and $v$, $\lambda$ is a positive parameter, $\Omega = \{x \in \mathbb{R}^N : N > 2,\ |x| > r_0,\ r_0 > 0\}$, and for $i = 1, 2$, $K_i : [r_0, \infty) \to (0, \infty)$ is a continuous function such that $\lim_{r\to\infty} K_i(r) = 0$, $\partial/\partial\eta$ is the outward normal derivative, and $\tilde c_i : [0,\infty) \to (0,\infty)$ is a nondecreasing (increasing) function. Here the reaction term $f_i : [0,\infty)\times[0,\infty) \to \mathbb{R}$ is a $C^1$ function. In the case $f_i(0) < 0$ (semipositone problems), it is known (see [1] [2]) that the study of positive solutions is considerably more challenging than in the case $f_i(0) > 0$ (positone problems). For a rich history of semipositone problems with Dirichlet boundary conditions on bounded domains, see [3]-[8]; for domains exterior to a ball, see [9] [10] [11]. Such nonlinear boundary conditions occur very naturally in applications; see [12] for a detailed description of a model arising in combustion theory. Recently, the existence of a radial positive solution for (1.1) when $\lambda \gg 1$ has been established in [13] via the method of sub-super solutions. Here we discuss the uniqueness of this radial solution when some additional assumptions hold. In [14], the authors study such a uniqueness result for the case of a Dirichlet boundary condition on $|x| = r_0$. Our focus in this paper is the uniqueness result for the semipositone problem when a class of nonlinear boundary conditions is satisfied at $|x| = r_0$. The fact that we no longer have a fixed value of $u$ on $|x| = r_0$ makes it quite a challenge to extend the results in [15].

Namely, we need to establish a detailed behavior of $u$ at $|x| = r_0$ to achieve our goal. Instead of working directly with (1.1), we note that the change of variables $r = |x|$ and $t = (r/r_0)^{2-N}$ transforms (1.1) into the following boundary value problem:

$$
\begin{cases}
-u''(t) = \lambda\, \tilde a_1(t)\, f_1(u(t), v(t)), & t \in (0,1),\\
-v''(t) = \lambda\, \tilde a_2(t)\, f_2(u(t), v(t)), & t \in (0,1),\\
\dfrac{N-2}{r_0}\, u'(1) + \tilde c_1(u(1))\, u(1) = 0,\\
\dfrac{N-2}{r_0}\, v'(1) + \tilde c_2(v(1))\, v(1) = 0,\\
u(0) = v(0) = 0,
\end{cases}
\tag{1.2}
$$

where

$$
\tilde a_i(t) = \frac{r_0^2}{(2-N)^2}\, t^{-\frac{2(N-1)}{N-2}}\, K_i\!\left(r_0\, t^{\frac{1}{2-N}}\right).
$$

We will only assume $K_i(r) \le \dfrac{1}{r^{N+\mu}}$ for $r \gg 1$ and for some $\mu \in (0, N-2)$. Then $\tilde a_i \in C((0,1], (0,\infty))$ could be singular at 0. If $\mu \ge N-2$, $\tilde a_i$ will be nonsingular at 0 and it will be an easier case to study. Note that $\inf_{t\in(0,1]} \tilde a_i(t) > 0$ and there exists a constant $d > 0$ such that $\tilde a_i(t) \le \dfrac{d}{t^{\alpha}}$ for all $t \in (0,1]$, where $\alpha = \dfrac{N-2-\mu}{N-2}$. Motivated by the above discussion, in this paper we will study positive solutions in $C^2(0,1) \cap C^1[0,1]$ to the following boundary value problem:

$$
\begin{cases}
-u''(t) = \lambda\, a_1(t)\, f_1(u(t), v(t)), & t \in (0,1),\\
-v''(t) = \lambda\, a_2(t)\, f_2(u(t), v(t)), & t \in (0,1),\\
u'(1) + c_1(u(1))\, u(1) = 0,\\
v'(1) + c_2(v(1))\, v(1) = 0,\\
u(0) = v(0) = 0,
\end{cases}
\tag{1.3}
$$

where $c_i : [0,\infty) \to (0,\infty)$ is a continuous function and $a_i \in C((0,1], (0,\infty))$ is such that:

(H1) $\underline{a}_i := \inf_{t\in(0,1]} a_i(t) > 0$;
(H2) there exists a constant $d > 0$ such that $a_i(t) = \dfrac{d}{t^{\alpha}}$ for all $t \in (0,\varepsilon]$, where $\alpha \in (0,1)$ and $\varepsilon \approx 0$;
(H3) $a_i$ is decreasing.

We consider various $C^1$ classes of the reaction term $f_i : [0,\infty)\times[0,\infty) \to \mathbb{R}$ satisfying the following, for $i = 1, 2$:

(F1) $f_i(0) < 0$ and $\lim_{s\to\infty} \dfrac{f_i(s)}{s} = 0$;
(F2) $f_i$ is increasing and $\lim_{s\to\infty} f_i(s) = \infty$;
(F3) $f_i$ is concave on $[0,\infty)$.

Theorem 1.1. Assume (H1)-(H3) and (F1)-(F3). Then (1.3) has a unique positive solution for all $\lambda$ sufficiently large.

In Section 2 we establish important a priori estimates. We first recall some results from [8], where the authors studied the case of a Dirichlet boundary condition, or equivalently (1.3) with the boundary condition at $t = 1$ replaced by $u(1) = v(1) = 0$. These results do not depend on the boundary condition at $t = 1$ and hence also hold for solutions of (1.3). For the reader's convenience we include the proofs of these results. In Section 3, we prove Theorem 1.1.
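The weight produced by the change of variables in this section can be checked numerically. The following is a minimal sketch, assuming the model weight $K(r) = r^{-(N+\mu)}$ and the sample values $N = 3$, $\mu = 0.5$, $r_0 = 1$ (all hypothetical choices made for illustration, not taken from the paper). It evaluates the transformed weight and confirms that it behaves like $d/t^{\alpha}$ with $\alpha = (N-2-\mu)/(N-2)$.

```python
# Hypothetical sample data (not from the paper).
N, MU, R0 = 3, 0.5, 1.0


def K(r):
    """Model weight with the decay K(r) = r**(-(N + mu)); an assumption."""
    return r ** (-(N + MU))


def a_tilde(t):
    """Weight obtained from K under t = (r / r0)**(2 - N)."""
    r = R0 * t ** (1.0 / (2 - N))
    return R0 ** 2 / (2 - N) ** 2 * t ** (-2.0 * (N - 1) / (N - 2)) * K(r)


# Exponent and constant predicted by the power-law computation.
ALPHA = (N - 2 - MU) / (N - 2)
D = R0 ** (2 - N - MU) / (N - 2) ** 2

for t in (1e-4, 1e-2, 0.5, 1.0):
    # a_tilde(t) * t**ALPHA should not depend on t.
    print(t, a_tilde(t) * t ** ALPHA)
```

For this power-law weight the product `a_tilde(t) * t**ALPHA` is constant, so the singularity at $t = 0$ is exactly of order $t^{-\alpha}$ with $\alpha \in (0,1)$, which is integrable.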

2. A Priori Estimates

Let $F(s) = \int_0^s f_i(t)\,dt$. Note that there exist unique positive numbers $\beta, \theta$ such that $f_i(\beta) = 0$ and $F(\theta) = 0$, and $\beta < \theta$.

Theorem 2.1. (See [8].) Let $(u, v)$ be a positive solution of (1.3). Then $u$ and $v$ each have only one interior maximum in $(0,1)$, say at $t_m$, depending on $\lambda$, and $u(t_m) > \theta$, $v(t_m) > \theta$.

Proof. Let

$$
\begin{cases}
E_1(t) = \lambda F(u(t))\, a_1(t) + \dfrac{[u'(t)]^2}{2}, & t \in (0,1),\\[2mm]
E_2(t) = \lambda F(v(t))\, a_2(t) + \dfrac{[v'(t)]^2}{2}, & t \in (0,1);
\end{cases}
\tag{1.4}
$$

then

$$
\begin{cases}
E_1'(t) = \lambda F(u(t))\, a_1'(t), & t \in (0,1),\\
E_2'(t) = \lambda F(v(t))\, a_2'(t), & t \in (0,1).
\end{cases}
\tag{1.5}
$$

Note that by (H3), $a_1'(t) < 0$ and $a_2'(t) < 0$ for all $t \in (0,1]$. Hence $E_1(t)$ and $E_2(t)$ are increasing when $u(t) < \theta$, $v(t) < \theta$ and decreasing when $u(t) > \theta$, $v(t) > \theta$.

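Let $t_m \in (0,1)$ be the first point at which $u$ has a local maximum and assume that $u(t) \le \theta$ and $v(t) \le \theta$ for all $t \in [0, t_m]$. Then $E_1(t)$ and $E_2(t)$ are increasing in $[0, t_m]$. Now integrating (1.3) from $t$ to $t_m$, for $t < t_m$,

$$
\begin{cases}
u'(t) = \lambda \displaystyle\int_t^{t_m} a_1(s) f_1(u(s))\,ds \le \lambda f_1(\theta) \int_t^{t_m} \frac{c_1}{s^{\alpha}}\,ds \le \lambda\, \frac{c_1 f_1(\theta)}{1-\alpha},\\[3mm]
v'(t) = \lambda \displaystyle\int_t^{t_m} a_2(s) f_2(v(s))\,ds \le \lambda f_2(\theta) \int_t^{t_m} \frac{c_2}{s^{\alpha}}\,ds \le \lambda\, \frac{c_2 f_2(\theta)}{1-\alpha},
\end{cases}
\tag{1.6}
$$

where the constants $c_i > d$ are such that $a_i(t) \le \dfrac{c_i}{t^{\alpha}}$ for all $t \in (0,1)$, using (H2). Integrating (1.6) again from 0 to $t$, $t \le t_m$,

$$
\begin{cases}
u(t) \le \lambda \displaystyle\int_0^{t} \frac{c_1 f_1(\theta)}{1-\alpha}\,ds \le \lambda C_0\, t,\\[3mm]
v(t) \le \lambda \displaystyle\int_0^{t} \frac{c_2 f_2(\theta)}{1-\alpha}\,ds \le \lambda C_0\, t,
\end{cases}
\qquad C_0 := \max_{i=1,2} \frac{c_i f_i(\theta)}{1-\alpha}.
\tag{1.7}
$$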

Since $f_i$ are continuous, there exists $K_0 > 0$ such that $|F(u)| \le K_0 u$ and $|F(v)| \le K_0 v$ for all $u, v \in [0, \theta]$. Hence

$$
\lim_{t\to 0^+} \lambda |F(u(t))|\, a_1(t) \le \lim_{t\to 0^+} \lambda K_0\, u(t)\, a_1(t) \le \lim_{t\to 0^+} \lambda^2 K_0 C_0 c_1\, t^{1-\alpha} = 0,
$$

$$
\lim_{t\to 0^+} \lambda |F(v(t))|\, a_2(t) \le \lim_{t\to 0^+} \lambda K_0\, v(t)\, a_2(t) \le \lim_{t\to 0^+} \lambda^2 K_0 C_0 c_2\, t^{1-\alpha} = 0.
$$

Hence $\lim_{t\to 0^+} E_i(t) \ge 0$. Since $E_i(t)$ increases on $[0, t_m]$, $E_1(t_m) = \lambda F(u(t_m))\, a_1(t_m) > 0$ and $E_2(t_m) = \lambda F(v(t_m))\, a_2(t_m) > 0$, which is a contradiction if $u(t_m) \le \theta$ and $v(t_m) \le \theta$.

Suppose that $u$ and $v$ have two interior maxima. Then there exists $\bar t \in (t_m, 1)$ such that $u'(\bar t) = 0$, $v'(\bar t) = 0$ and $u''(\bar t) \ge 0$, $v''(\bar t) \ge 0$. Since $u''(\bar t) = -\lambda a_1(\bar t) f_1(u(\bar t)) \ge 0$ and $v''(\bar t) = -\lambda a_2(\bar t) f_2(v(\bar t)) \ge 0$, we have $f_1(u(\bar t)) \le 0$ and $f_2(v(\bar t)) \le 0$, which implies that $u(\bar t) \le \beta$ and $v(\bar t) \le \beta$. Thus $E_i(\bar t) < 0$. Let $t_\theta \in (t_m, \bar t)$ be such that $u(t_\theta) = \theta$ and $v(t_\theta) = \theta$. Then $E_1(t_\theta) = \dfrac{[u'(t_\theta)]^2}{2} \ge 0$, $E_2(t_\theta) = \dfrac{[v'(t_\theta)]^2}{2} \ge 0$, and $E_i$ increases in $(t_\theta, \bar t)$ since $u(t) < \theta$ and $v(t) < \theta$ in $(t_\theta, \bar t)$. Hence $E_i(\bar t) > 0$, which is a contradiction. Therefore, we have only one interior maximum and that maximum value is larger than $\theta$.
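The quantities $\beta$, $\theta$ and the energy identity (1.5) used in this proof can be illustrated on a scalar model. The sketch below assumes the sample nonlinearity $f(s) = \sqrt{s+1} - 2$ (which satisfies $f(0) < 0$ and is increasing, concave and sublinear), the weight $a(t) = t^{-1/2}$, the value $\lambda = 5$, and arbitrary initial data at $t = 0.1$; all of these are hypothetical choices, not data from the paper. It locates the zeros of $f$ and of its primitive $F$ by bisection, then integrates $-u'' = \lambda a(t) f(u)$ with a classical Runge-Kutta step and compares the change in $E$ with the integral of $\lambda F(u) a'(t)$.

```python
import math

LAM = 5.0  # hypothetical value of the parameter lambda


def f(s):
    return math.sqrt(s + 1.0) - 2.0


def F(s):
    # Primitive of f with F(0) = 0.
    return (2.0 / 3.0) * ((s + 1.0) ** 1.5 - 1.0) - 2.0 * s


def a(t):
    return t ** -0.5


def da(t):
    return -0.5 * t ** -1.5


def bisect(g, lo, hi, tol=1e-12):
    """Root of g in [lo, hi], assuming a single sign change there."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g_lo * g(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


beta = bisect(f, 0.0, 10.0)   # zero of f
theta = bisect(F, 1.0, 20.0)  # positive zero of F


def energy(t, u, du):
    return LAM * F(u) * a(t) + 0.5 * du * du


def accel(t, u):
    # u'' from the equation -u'' = lambda * a(t) * f(u).
    return -LAM * a(t) * f(u)


def rk4_step(t, u, du, h):
    k1u, k1v = du, accel(t, u)
    k2u, k2v = du + 0.5 * h * k1v, accel(t + 0.5 * h, u + 0.5 * h * k1u)
    k3u, k3v = du + 0.5 * h * k2v, accel(t + 0.5 * h, u + 0.5 * h * k2u)
    k4u, k4v = du + h * k3v, accel(t + h, u + h * k3u)
    return (u + h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6.0,
            du + h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0)


def energy_check(t0=0.1, t1=1.0, u0=0.5, du0=3.0, n=9000):
    """Return (E(t1) - E(t0), trapezoid integral of lambda*F(u)*a')."""
    h = (t1 - t0) / n
    t, u, du = t0, u0, du0
    e_start = energy(t, u, du)
    integral = 0.0
    for k in range(n):
        g_left = LAM * F(u) * da(t)
        u, du = rk4_step(t, u, du, h)
        t = t0 + (k + 1) * h
        g_right = LAM * F(u) * da(t)
        integral += 0.5 * h * (g_left + g_right)
    return energy(t, u, du) - e_start, integral


change, integral = energy_check()
print(beta, theta, change, integral)
```

For this sample $f$ the zeros are available in closed form, $\beta = 3$ and $\theta = 3 + 2\sqrt{3}$, which gives an independent check on the bisection. The two numbers returned by `energy_check` should agree up to discretization error, in line with (1.5).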


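Theorem 2.2. (See [8].) Let $u$ and $v$ be a positive solution of (1.3) and let $t_\beta \in (0, t_m)$ be such that $u(t_\beta) = \beta$ and $v(t_\beta) = \beta$. Then $t_\beta = O(\lambda^{-1/2})$ as $\lambda \to \infty$.

Proof. Let $t_{\beta/2} \in (0, t_\beta)$ be the point such that $u(t_{\beta/2}) = \dfrac{\beta}{2}$ and $v(t_{\beta/2}) = \dfrac{\beta}{2}$. Integrating (1.3) from 0 to $t$ for some $t < t_{\beta/2}$,

$$
u'(t) = u'(0) - \lambda \int_0^t a_1(s) f_1(u(s))\,ds \ge -\lambda\, \underline{a}_1\, f_1\!\left(\tfrac{\beta}{2}\right) t,
$$

$$
v'(t) = v'(0) - \lambda \int_0^t a_2(s) f_2(v(s))\,ds \ge -\lambda\, \underline{a}_2\, f_2\!\left(\tfrac{\beta}{2}\right) t.
$$

Integrating the above again from 0 to $t_{\beta/2}$,

$$
t_{\beta/2} \le \hat c_1 \lambda^{-1/2}, \qquad t_{\beta/2} \le \hat c_2 \lambda^{-1/2}, \qquad \text{where } \hat c_i = \left(\frac{-\beta}{\underline{a}_i\, f_i\!\left(\frac{\beta}{2}\right)}\right)^{1/2} > 0.
\tag{1.8}
$$

By the mean value theorem, there exists a $\bar t \in (0, t_{\beta/2})$ such that $u(t_{\beta/2}) - u(0) = u'(\bar t)\, t_{\beta/2}$ and $v(t_{\beta/2}) - v(0) = v'(\bar t)\, t_{\beta/2}$, and by (1.8), $u'(\bar t) \ge \dfrac{\beta}{2 \hat c_1} \lambda^{1/2}$ and $v'(\bar t) \ge \dfrac{\beta}{2 \hat c_2} \lambda^{1/2}$. Since $u'$ and $v'$ are increasing in $(0, t_\beta)$,

$$
\begin{cases}
u'(t) \ge \dfrac{\beta}{2 \hat c_1} \lambda^{1/2}, & t \in [t_{\beta/2}, t_\beta],\\[3mm]
v'(t) \ge \dfrac{\beta}{2 \hat c_2} \lambda^{1/2}, & t \in [t_{\beta/2}, t_\beta].
\end{cases}
\tag{1.9}
$$

Integrating (1.9) from $t_{\beta/2}$ to $t_\beta$, we have that $t_\beta - t_{\beta/2} \le \hat c_1 \lambda^{-1/2}$ and $t_\beta - t_{\beta/2} \le \hat c_2 \lambda^{-1/2}$. This implies $t_\beta = O(\lambda^{-1/2})$ by (1.8).

Lemma 2.3. Let $u$ and $v$ be a positive solution of (1.3). Then $u(1) \to \infty$ and $v(1) \to \infty$ as $\lambda \to \infty$.

Proof. We first claim that $u(1) \ge \dfrac{\beta+\theta}{2}$ and $v(1) \ge \dfrac{\beta+\theta}{2}$ for $\lambda \gg 1$. Assume that $u(1) < \dfrac{\beta+\theta}{2}$ and $v(1) < \dfrac{\beta+\theta}{2}$. Then there exists a $t_\theta \in (t_m, 1)$ such that $u(t_\theta) = \theta$ and $v(t_\theta) = \theta$, where $t_m$ is the point at which $u, v$ achieve their maximum, and $u(t_m) > \theta$, $v(t_m) > \theta$ by Theorem 2.1. From (1.4) and (1.5), $E_1(t_\theta) = \dfrac{[u'(t_\theta)]^2}{2} > 0$, $E_2(t_\theta) = \dfrac{[v'(t_\theta)]^2}{2} > 0$ and $E_i'(t) \ge 0$ on $[t_\theta, 1]$ since $u(t) \le \theta$ and $v(t) \le \theta$ in $[t_\theta, 1]$. Hence we obtain that

$$
E_1(1) = \lambda F(u(1))\, a_1(1) + \frac{[u'(1)]^2}{2} > 0, \qquad
E_2(1) = \lambda F(v(1))\, a_2(1) + \frac{[v'(1)]^2}{2} > 0,
$$

and from (1.3), we have

$$
\begin{cases}
c_1(u(1))\, u(1) = -u'(1) \ge \sqrt{-2\lambda F(u(1))\, a_1(1)},\\
c_2(v(1))\, v(1) = -v'(1) \ge \sqrt{-2\lambda F(v(1))\, a_2(1)}.
\end{cases}
\tag{1.10}
$$

This cannot hold unless $u(1) \to 0$ and $v(1) \to 0$ as $\lambda \to \infty$. However, rewriting (1.10) as

$$
\begin{cases}
c_1(u(1))\, [u(1)]^{1/2} \ge \sqrt{-2\lambda\, \dfrac{F(u(1))}{u(1)}\, a_1(1)},\\[3mm]
c_2(v(1))\, [v(1)]^{1/2} \ge \sqrt{-2\lambda\, \dfrac{F(v(1))}{v(1)}\, a_2(1)},
\end{cases}
\tag{1.11}
$$

we obtain a contradiction when $\lambda \gg 1$, since $\dfrac{F(u(1))}{u(1)} \to f_1(0)$ and $\dfrac{F(v(1))}{v(1)} \to f_2(0)$ if $u(1) \to 0$ and $v(1) \to 0$ as $\lambda \to \infty$. Hence,

$$
u(1) \ge \frac{\beta+\theta}{2} \quad \text{and} \quad v(1) \ge \frac{\beta+\theta}{2} \quad \text{for } \lambda \gg 1.
\tag{1.12}
$$

Next, we claim that $u(t_m) = \|u\|_\infty \to \infty$ and $v(t_m) = \|v\|_\infty \to \infty$ as $\lambda \to \infty$. Let $h := u - \beta$ and $w := v - \beta$; then $h > 0$, $w > 0$ in $(t_\beta, 1]$ and they satisfy

$$
\begin{cases}
-h'' = \lambda\, a_1(t)\, \dfrac{f_1(u)}{u - \beta}\, h, & t \in (t_\beta, 1),\\[3mm]
-w'' = \lambda\, a_2(t)\, \dfrac{f_2(v)}{v - \beta}\, w, & t \in (t_\beta, 1),\\[2mm]
h(t_\beta) = w(t_\beta) = 0,\\
h(1) = u(1) - \beta > 0,\\
w(1) = v(1) - \beta > 0.
\end{cases}
\tag{1.13}
$$

Let $\psi := \sin\!\left(\dfrac{\pi (t - t_\beta)}{1 - t_\beta}\right)$. Then $\psi$ satisfies:

$$
\begin{cases}
-\psi'' = \dfrac{\pi^2}{(1 - t_\beta)^2}\, \psi, & t \in (t_\beta, 1),\\[2mm]
\psi(t_\beta) = \psi(1) = 0.
\end{cases}
\tag{1.14}
$$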

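Multiplying (1.13) by $\psi$ and (1.14) by $h$ and $w$, integrating both from $t_\beta$ to 1 and subtracting, we have

$$
\int_{t_\beta}^1 (h''\psi - \psi'' h)\,dt = \int_{t_\beta}^1 \left[\frac{\pi^2}{(1 - t_\beta)^2} - \lambda\, a_1(t)\, \frac{f_1(u)}{u - \beta}\right] h \psi\,dt,
$$

$$
\int_{t_\beta}^1 (w''\psi - \psi'' w)\,dt = \int_{t_\beta}^1 \left[\frac{\pi^2}{(1 - t_\beta)^2} - \lambda\, a_2(t)\, \frac{f_2(v)}{v - \beta}\right] w \psi\,dt.
$$

Since $\int_{t_\beta}^1 (h''\psi - \psi'' h)\,dt = -\psi'(1) h(1) > 0$ and $\int_{t_\beta}^1 (w''\psi - \psi'' w)\,dt = -\psi'(1) w(1) > 0$ by integration by parts, we can see that

$$
\frac{\pi^2}{(1 - t_\beta)^2} > \lambda\, a_1(t)\, \frac{f_1(u)}{u - \beta} \quad \text{and} \quad \frac{\pi^2}{(1 - t_\beta)^2} > \lambda\, a_2(t)\, \frac{f_2(v)}{v - \beta} \quad \text{for some } t \in (t_\beta, 1).
\tag{1.15}
$$

Note that $\inf_{(0,1]} a_1(t) > 0$, $\inf_{(0,1]} a_2(t) > 0$, and from Theorem 2.2 we can assume $(1 - t_\beta) > \dfrac{1}{2}$. Thus (1.15) is only true if $\dfrac{f_1(u)}{u - \beta} \to 0$ and $\dfrac{f_2(v)}{v - \beta} \to 0$ when $\lambda \gg 1$. By (F1) and (F2), we conclude that $u(t_m) = \|u\|_\infty \to \infty$ and $v(t_m) = \|v\|_\infty \to \infty$ as $\lambda \to \infty$. Notice that since $u'' < 0$ and $v'' < 0$ in $(t_\beta, 1]$, we have

$$
u(t) \ge \frac{u(t_m) - \beta}{t_m - t_\beta}\,(t - t_\beta) + \beta, \qquad
v(t) \ge \frac{v(t_m) - \beta}{t_m - t_\beta}\,(t - t_\beta) + \beta, \qquad t \in [t_\beta, t_m],
$$

$$
u(t) \ge \frac{u(t_m) - \frac{\beta+\theta}{2}}{1 - t_m}\,(1 - t) + \frac{\beta+\theta}{2}, \qquad
v(t) \ge \frac{v(t_m) - \frac{\beta+\theta}{2}}{1 - t_m}\,(1 - t) + \frac{\beta+\theta}{2}, \qquad t \in [t_m, 1].
$$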

Since $u(t_m) \to \infty$, $v(t_m) \to \infty$ and $t_\beta \to 0$ as $\lambda \to \infty$, it is true that

$$
v(t) \ge \frac{\beta+\theta}{2} \quad \text{and} \quad u(t) \ge \frac{\beta+\theta}{2} \quad \text{in } \left[\tfrac{1}{4}, 1\right] \text{ for } \lambda \gg 1.
\tag{1.16}
$$

Now we show that $u(1) \to \infty$ and $v(1) \to \infty$ as $\lambda \to \infty$. Since $(u, v)$ is a solution of (1.3), $u$ and $v$ can be written as (see Appendix 8.2 in [5] for details)

$$
u(t) = \lambda \int_0^1 G(t,s)\, a_1(s)\, f_1(u(s))\,ds - c_1(u(1))\, u(1)\, t,
\tag{1.17}
$$

$$
v(t) = \lambda \int_0^1 G(t,s)\, a_2(s)\, f_2(v(s))\,ds - c_2(v(1))\, v(1)\, t,
\tag{1.18}
$$

where

$$
G(t,s) = \begin{cases} s, & 0 \le s \le t \le 1,\\ t, & 0 \le t \le s \le 1. \end{cases}
$$

Let $t = 1$. Then from (1.17) and (1.18), we have

$$
[1 + c_1(u(1))]\, u(1) = \lambda \left[\int_0^{t_\beta} G(1,s)\, a_1(s)\, f_1(u(s))\,ds + \int_{t_\beta}^1 G(1,s)\, a_1(s)\, f_1(u(s))\,ds\right],
\tag{1.19}
$$

$$
[1 + c_2(v(1))]\, v(1) = \lambda \left[\int_0^{t_\beta} G(1,s)\, a_2(s)\, f_2(v(s))\,ds + \int_{t_\beta}^1 G(1,s)\, a_2(s)\, f_2(v(s))\,ds\right].
\tag{1.20}
$$

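Then using the fact that $G(1,s) = s$ and $t_\beta \to 0$ as $\lambda \to \infty$, for $\lambda$ large we obtain

$$
\begin{aligned}
[1 + c_1(u(1))]\, u(1) &= \lambda \left(\int_0^{t_\beta} s\, a_1(s) f_1(u(s))\,ds + \int_{t_\beta}^1 s\, a_1(s) f_1(u(s))\,ds\right)\\
&\ge \lambda \left(\int_0^{t_\beta} s\, a_1(s) f_1(u(s))\,ds + \int_{1/4}^1 s\, a_1(s) f_1(u(s))\,ds\right)\\
&\ge \frac{\lambda}{2}\, f_1\!\left(\frac{\beta+\theta}{2}\right) \int_{1/4}^1 s\, a_1(s)\,ds,
\end{aligned}
$$

$$
\begin{aligned}
[1 + c_2(v(1))]\, v(1) &= \lambda \left(\int_0^{t_\beta} s\, a_2(s) f_2(v(s))\,ds + \int_{t_\beta}^1 s\, a_2(s) f_2(v(s))\,ds\right)\\
&\ge \lambda \left(\int_0^{t_\beta} s\, a_2(s) f_2(v(s))\,ds + \int_{1/4}^1 s\, a_2(s) f_2(v(s))\,ds\right)\\
&\ge \frac{\lambda}{2}\, f_2\!\left(\frac{\beta+\theta}{2}\right) \int_{1/4}^1 s\, a_2(s)\,ds,
\end{aligned}
$$

where the last inequality is obtained by (1.16). Hence, we have

$$
[1 + c_1(u(1))]\, u(1) \ge \lambda K_1 \quad \text{and} \quad [1 + c_2(v(1))]\, v(1) \ge \lambda K_2,
\tag{1.21}
$$

where $K_1 = \dfrac{1}{2} f_1\!\left(\dfrac{\beta+\theta}{2}\right) \displaystyle\int_{1/4}^1 s\, a_1(s)\,ds > 0$ and $K_2 = \dfrac{1}{2} f_2\!\left(\dfrac{\beta+\theta}{2}\right) \displaystyle\int_{1/4}^1 s\, a_2(s)\,ds > 0$. Now, from (1.21), clearly $u(1) \to \infty$, $v(1) \to \infty$ as $\lambda \to \infty$.

Lemma 2.4. Let $u$ and $v$ be a positive solution of (1.3). Then there exists $[\alpha, \mu] \subset \left[\tfrac{1}{4}, 1\right]$, $\alpha \ne \mu$, both independent of $\lambda$, such that $\inf_{[\alpha,\mu]} u(t) \to \infty$ and $\inf_{[\alpha,\mu]} v(t) \to \infty$ as $\lambda \to \infty$.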

Proof. As $\lambda \to \infty$, $t_m$ may converge to 1 or to any other point in $(0,1)$. First we consider the case $t_m \not\to 1$ as $\lambda \to \infty$. Since $u(1) < u(t_m)$ and $v(1) < v(t_m)$, clearly there exists $\alpha < 1$ such that $\inf_{[\alpha,1]} u(t) \to \infty$ and $\inf_{[\alpha,1]} v(t) \to \infty$ as $\lambda \to \infty$ by Lemma 2.3.

Now, let us consider the case when $t_m \to 1$ as $\lambda \to \infty$. By differentiating (1.17) and (1.18) (or integrating (1.3)), we obtain

$$
u'(t) = \lambda \int_t^1 a_1(s) f_1(u(s))\,ds - c_1(u(1))\, u(1), \qquad t \in [0,1],
$$

$$
v'(t) = \lambda \int_t^1 a_2(s) f_2(v(s))\,ds - c_2(v(1))\, v(1), \qquad t \in [0,1],
$$

which gives us that

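$$
u'(t_m) = \lambda \int_{t_m}^1 a_1(s) f_1(u(s))\,ds - c_1(u(1))\, u(1) = 0,
\tag{1.22}
$$

$$
v'(t_m) = \lambda \int_{t_m}^1 a_2(s) f_2(v(s))\,ds - c_2(v(1))\, v(1) = 0.
\tag{1.23}
$$

Now we rewrite (1.17) and (1.18) by using (1.22), (1.23) as:

$$
\begin{aligned}
u(t) &= \lambda \int_0^1 G(t,s)\, a_1(s) f_1(u(s))\,ds - \left(\lambda \int_{t_m}^1 a_1(s) f_1(u(s))\,ds\right) t\\
&= \lambda \int_0^{t_\beta} G(t,s)\, a_1(s) f_1(u(s))\,ds + \lambda \int_{t_\beta}^{t_m} G(t,s)\, a_1(s) f_1(u(s))\,ds + \lambda \int_{t_m}^1 [G(t,s) - t]\, a_1(s) f_1(u(s))\,ds,
\end{aligned}
$$

$$
\begin{aligned}
v(t) &= \lambda \int_0^1 G(t,s)\, a_2(s) f_2(v(s))\,ds - \left(\lambda \int_{t_m}^1 a_2(s) f_2(v(s))\,ds\right) t\\
&= \lambda \int_0^{t_\beta} G(t,s)\, a_2(s) f_2(v(s))\,ds + \lambda \int_{t_\beta}^{t_m} G(t,s)\, a_2(s) f_2(v(s))\,ds + \lambda \int_{t_m}^1 [G(t,s) - t]\, a_2(s) f_2(v(s))\,ds.
\end{aligned}
$$

Note that if $t \in [0, t_m]$, then

$$
\int_{t_m}^1 [G(t,s) - t]\, a_1(s) f_1(u(s))\,ds = 0 \quad \text{and} \quad \int_{t_m}^1 [G(t,s) - t]\, a_2(s) f_2(v(s))\,ds = 0,
$$

since $t \le t_m \le s \le 1$ implies $G(t,s) = t$. Now $t_\beta \to 0$ and $t_m \to 1$ as $\lambda \to \infty$. Hence, for $t \in \left[\tfrac{1}{4}, \tfrac{3}{4}\right]$ and $\lambda$ large we obtain

$$
\begin{aligned}
u(t) &= \lambda \left(\int_0^{t_\beta} G(t,s)\, a_1(s) f_1(u(s))\,ds + \int_{t_\beta}^{t_m} G(t,s)\, a_1(s) f_1(u(s))\,ds\right)\\
&\ge \lambda \left(\int_0^{t_\beta} G(t,s)\, a_1(s) f_1(u(s))\,ds + \int_{1/4}^{3/4} G(t,s)\, a_1(s) f_1(u(s))\,ds\right)\\
&\ge \lambda\, \frac{\underline{a}_1}{2}\, f_1\!\left(\frac{\beta+\theta}{2}\right) \int_{1/4}^{3/4} G(t,s)\,ds,
\end{aligned}
$$

$$
\begin{aligned}
v(t) &= \lambda \left(\int_0^{t_\beta} G(t,s)\, a_2(s) f_2(v(s))\,ds + \int_{t_\beta}^{t_m} G(t,s)\, a_2(s) f_2(v(s))\,ds\right)\\
&\ge \lambda \left(\int_0^{t_\beta} G(t,s)\, a_2(s) f_2(v(s))\,ds + \int_{1/4}^{3/4} G(t,s)\, a_2(s) f_2(v(s))\,ds\right)\\
&\ge \lambda\, \frac{\underline{a}_2}{2}\, f_2\!\left(\frac{\beta+\theta}{2}\right) \int_{1/4}^{3/4} G(t,s)\,ds.
\end{aligned}
$$

Thus,

$$
u(t) \ge \lambda\, \frac{\underline{a}_1}{2}\, f_1\!\left(\frac{\beta+\theta}{2}\right) \inf_{t \in [1/4,\, 3/4]} \int_{1/4}^{3/4} G(t,s)\,ds
\quad \text{and} \quad
v(t) \ge \lambda\, \frac{\underline{a}_2}{2}\, f_2\!\left(\frac{\beta+\theta}{2}\right) \inf_{t \in [1/4,\, 3/4]} \int_{1/4}^{3/4} G(t,s)\,ds
$$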

on $\left[\tfrac{1}{4}, \tfrac{3}{4}\right]$, which means that $u(t) \to \infty$ and $v(t) \to \infty$ for all $t \in \left[\tfrac{1}{4}, \tfrac{3}{4}\right]$ as $\lambda \to \infty$.

Lemma 2.5. Let $u$ and $v$ be a positive solution of (1.3). Then there exists $\bar\lambda$ such that if $\lambda > \bar\lambda$, then

$$
u(t) \ge \lambda C\, d(t, \partial\Omega) \quad \text{and} \quad v(t) \ge \lambda C\, d(t, \partial\Omega)
\tag{1.24}
$$

for some positive constant $C$, independent of $\lambda$. Here $\Omega = (0,1)$.

Proof. Let $\phi_i$ be the unique solution of the problem

$$
\begin{cases}
-\phi_i'' = \omega_{[\alpha,\mu]}\, a_i(t), & t \in (0,1),\\
\phi_i(0) = \phi_i(1) = 0, & i = 1, 2,
\end{cases}
\tag{1.25}
$$

where $\omega$ is the characteristic function. By the Hopf maximum principle there exists $\bar c_i > 0$ such that $\phi_i(t) \ge \bar c_i\, e_i(t)$ for all $t \in [0,1]$, where $e_i$ is the solution of

$$
\begin{cases}
-e_i'' = a_i(t), & t \in (0,1),\\
e_i(0) = e_i(1) = 0, & i = 1, 2.
\end{cases}
\tag{1.26}
$$

Let $H > 0$ be such that $D := \bar c_i f_i(H) + f_i(0) > 0$; this is possible by (F2). Let $u_1, v_1$ and $u_2, v_2$ satisfy

$$
-u_1'' = \lambda f_1(H)\, \omega_{[\alpha,\mu]}\, a_1(t), \quad t \in (0,1), \qquad u_1(0) = 0 = u_1(1),
$$

$$
-v_1'' = \lambda f_2(H)\, \omega_{[\alpha,\mu]}\, a_2(t), \quad t \in (0,1), \qquad v_1(0) = 0 = v_1(1),
$$

and

$$
-u_2'' = -\lambda f_1(0)\, a_1(t), \quad t \in (0,1), \qquad u_2(0) = 0 = u_2(1),
$$

$$
-v_2'' = -\lambda f_2(0)\, a_2(t), \quad t \in (0,1), \qquad v_2(0) = 0 = v_2(1).
$$

Now by Lemma 2.4, there exists $\bar\lambda > 0$ such that if $\lambda > \bar\lambda$, then

$$
u(t) \ge H \quad \text{and} \quad v(t) \ge H \quad \text{on } [\alpha, \mu].
\tag{1.27}
$$

Hence, by (1.27), for $\lambda \gg 1$ we have that for $t \in (0,1)$

$$
\begin{aligned}
-u'' = \lambda f_1(u)\, a_1(t) &\ge \lambda f_1(u)\, a_1(t)\, \omega_{[0, t_\beta]} + \lambda f_1(u)\, a_1(t)\, \omega_{[\alpha,\mu]}\\
&\ge \lambda f_1(0)\, a_1(t) + \lambda f_1(H)\, a_1(t)\, \omega_{[\alpha,\mu]} = -(u_1 - u_2)''(t),
\end{aligned}
$$

$$
\begin{aligned}
-v'' = \lambda f_2(v)\, a_2(t) &\ge \lambda f_2(v)\, a_2(t)\, \omega_{[0, t_\beta]} + \lambda f_2(v)\, a_2(t)\, \omega_{[\alpha,\mu]}\\
&\ge \lambda f_2(0)\, a_2(t) + \lambda f_2(H)\, a_2(t)\, \omega_{[\alpha,\mu]} = -(v_1 - v_2)''(t).
\end{aligned}
$$

Moreover, $u(0) - (u_1 - u_2)(0) = 0$, $v(0) - (v_1 - v_2)(0) = 0$ and $u(1) - (u_1 - u_2)(1) = u(1) > 0$, $v(1) - (v_1 - v_2)(1) = v(1) > 0$. By the maximum principle, $u(t) \ge u_1(t) - u_2(t) = \lambda f_1(H)\, \phi_1(t) + \lambda f_1(0)\, e_1(t)$ and $v(t) \ge v_1(t) - v_2(t) = \lambda f_2(H)\, \phi_2(t) + \lambda f_2(0)\, e_2(t)$ in $[0,1]$. Hence $u(t) \ge \lambda f_1(H)\, \bar c_1 e_1(t) + \lambda f_1(0)\, e_1(t) = \lambda D\, e_1(t)$ and $v(t) \ge \lambda f_2(H)\, \bar c_2 e_2(t) + \lambda f_2(0)\, e_2(t) = \lambda D\, e_2(t)$ for all $t \in [0,1]$.

Note that there exists $L > 0$ such that $e_1(t) \ge L\, d(t, \partial\Omega)$ and $e_2(t) \ge L\, d(t, \partial\Omega)$ for all $t \in [0,1]$. Hence, for $\lambda$ large, $u(t) \ge \lambda C\, d(t, \partial\Omega)$ and $v(t) \ge \lambda C\, d(t, \partial\Omega)$ for all $t \in [0,1]$, where $C := DL > 0$.
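The last step of this proof relies on a bound of the form $e(t) \ge L\, d(t, \partial\Omega)$. A minimal sketch for one concrete case follows, assuming the sample weight $a(t) = t^{-1/2}$ (a hypothetical choice, not data from the paper). For this weight the solution of $-e'' = a$, $e(0) = e(1) = 0$ has the closed form $e(t) = \tfrac{4}{3}(t - t^{3/2})$, and the code estimates the best constant $L$ on a grid.

```python
def a(t):
    # Sample singular weight (an assumption for illustration).
    return t ** -0.5


def e(t):
    # Closed-form solution of -e'' = t**(-1/2), e(0) = e(1) = 0.
    return (4.0 / 3.0) * (t - t ** 1.5)


def dist(t):
    # Distance from t to the boundary of Omega = (0, 1).
    return min(t, 1.0 - t)


# Finite-difference check of the differential equation at t = 0.5.
H_FD = 1e-3
residual = -(e(0.5 + H_FD) - 2.0 * e(0.5) + e(0.5 - H_FD)) / H_FD ** 2 - a(0.5)

# Smallest ratio e(t) / d(t, boundary) over an interior grid.
ratios = [e(k / 1000.0) / dist(k / 1000.0) for k in range(1, 1000)]
L_CONST = min(ratios)
print(residual, L_CONST)
```

For this weight the ratio is smallest at $t = 1/2$, where it equals $\tfrac{4}{3}(1 - 1/\sqrt{2}) \approx 0.3905$, so any $L$ below that value works in the bound.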

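Lemma 2.6. Let $u$ and $v$ be a positive solution of (1.3). Then there exists $H_\lambda$ such that $\|u\|_\infty \le H_\lambda$ and $\|v\|_\infty \le H_\lambda$.

Proof. Let $B = \int_0^1 a_i(s)\,ds$. Then $B < \infty$ since $a_i(t) \le \dfrac{c_i}{t^{\alpha}}$ for all $t \in (0,1)$ for some $c_i > 0$. Now for each given $\lambda > 0$, there exists $W_\lambda > 0$ such that if $W > W_\lambda$, then $\dfrac{f_i(W)}{W} \le \dfrac{1}{2\lambda B}$, due to the hypothesis (F1). Also, since $f_i \in C^1([0,\infty), \mathbb{R})$, there exists $K_\lambda > 0$ such that $f_i(W) \le K_\lambda$ on $[0, W_\lambda]$. Hence,

$$
f_i(W) \le \frac{W}{2\lambda B} + K_\lambda, \qquad W \in [0, \infty).
\tag{1.28}
$$

Now by Theorem 2.1 and (1.28), we have

$$
\begin{aligned}
\|u\|_\infty = u(t_m) &= \lambda \int_0^1 G(t_m, s)\, a_1(s) f_1(u(s))\,ds - c_1(u(1))\, u(1)\, t_m\\
&\le \lambda \int_0^1 G(t_m, s)\, a_1(s) f_1(u(s))\,ds\\
&\le \lambda \int_0^1 G(t_m, s)\, a_1(s) \left[\frac{u(t_m)}{2\lambda B} + K_\lambda\right] ds\\
&\le \lambda \int_0^1 a_1(s) \left[\frac{u(t_m)}{2\lambda B} + K_\lambda\right] ds \quad (\text{since } G(t,s) \le 1 \text{ in } [0,1]\times[0,1])\\
&= \frac{1}{2}\, u(t_m) + \lambda B K_\lambda,
\end{aligned}
$$

$$
\begin{aligned}
\|v\|_\infty = v(t_m) &= \lambda \int_0^1 G(t_m, s)\, a_2(s) f_2(v(s))\,ds - c_2(v(1))\, v(1)\, t_m\\
&\le \lambda \int_0^1 G(t_m, s)\, a_2(s) f_2(v(s))\,ds\\
&\le \lambda \int_0^1 G(t_m, s)\, a_2(s) \left[\frac{v(t_m)}{2\lambda B} + K_\lambda\right] ds\\
&\le \lambda \int_0^1 a_2(s) \left[\frac{v(t_m)}{2\lambda B} + K_\lambda\right] ds\\
&= \frac{1}{2}\, v(t_m) + \lambda B K_\lambda.
\end{aligned}
$$

Hence, for each $\lambda > 0$, $\|u\|_\infty \le H_\lambda$ and $\|v\|_\infty \le H_\lambda$, where $H_\lambda = 2\lambda B K_\lambda$.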
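The Green's function representation used in the estimates above can be checked on the simplest right-hand side. The sketch below assumes $G(t,s) = \min(t,s)$ as in Section 2 and the test function $g \equiv 1$ (a hypothetical choice for illustration); then $w(t) = \int_0^1 G(t,s)\, g(s)\,ds$ should equal $t - t^2/2$, which satisfies $-w'' = 1$, $w(0) = 0$, $w'(1) = 0$.

```python
def G(t, s):
    # Green's function of -w'' = g with w(0) = 0 and w'(1) = 0.
    return min(t, s)


def green_solution(g, t, n=1000):
    """Midpoint-rule approximation of the integral of G(t, s) * g(s) over [0, 1]."""
    h = 1.0 / n
    return sum(G(t, (k + 0.5) * h) * g((k + 0.5) * h) for k in range(n)) * h


def one(s):
    return 1.0


w_03 = green_solution(one, 0.3)
w_1 = green_solution(one, 1.0)
# One-sided difference quotient approximating w'(1).
slope_at_1 = (w_1 - green_solution(one, 0.999)) / 0.001
print(w_03, w_1, slope_at_1)
```

The bound $G(t,s) \le 1$ on $[0,1] \times [0,1]$, used in the proof of Lemma 2.6, is immediate from the formula $G(t,s) = \min(t,s)$.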

3. Proof of Theorem 1.1

We first claim that (1.3) has a maximal positive solution $(\bar u, \bar v)$ for $\lambda \gg 1$. Let $\varphi_i$ be the solution of the problem

$$
\begin{cases}
-\varphi_i'' = a_i(t), & t \in (0,1),\\
\varphi_i'(1) + c_i(\varphi_i(1))\, \varphi_i(1) = 0,\\
\varphi_i(0) = 0, & i = 1, 2.
\end{cases}
\tag{1.29}
$$

Note that (1.29) has a unique solution since $e_i$ and $\varphi_0$ are a sub solution and a super solution of (1.29), respectively, where $e_i$ is defined in (1.26) and $\varphi_0$ is the solution of the problem with linear boundary condition

$$
\begin{cases}
-\varphi_0'' = a_i(t), & t \in (0,1),\\
\varphi_0'(1) + c_i(0)\, \varphi_0(1) = 0,\\
\varphi_0(0) = 0, & i = 1, 2.
\end{cases}
\tag{1.30}
$$

Since $f_i$ satisfies (F1), given $\lambda > 0$, we can choose $Z_\lambda \ge 1$ such that $Z_\lambda > \lambda f_i(Z_\lambda \|\varphi_i\|_\infty)$ and $Z_\lambda > \lambda f_i(H_\lambda)$, where $H_\lambda$ is as in Lemma 2.6. Then $Z_\lambda \varphi_i$ are super solutions of (1.3) since

$$
\begin{cases}
-(Z_\lambda \varphi_i)'' = Z_\lambda a_i(t) \ge \lambda a_i f_i(Z_\lambda \|\varphi_i\|_\infty) \ge \lambda a_i f_i(Z_\lambda \varphi_i), & t \in (0,1),\\[1mm]
(Z_\lambda \varphi_i)'(1) + c_i(Z_\lambda \varphi_i(1))\, Z_\lambda \varphi_i(1) = Z_\lambda \big(\varphi_i'(1) + c_i(Z_\lambda \varphi_i(1))\, \varphi_i(1)\big)\\
\qquad \ge Z_\lambda \big(\varphi_i'(1) + c_i(\varphi_i(1))\, \varphi_i(1)\big) = 0,\\[1mm]
Z_\lambda \varphi_i(0) = 0, & i = 1, 2.
\end{cases}
$$

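Next, we show that this super solution $Z_\lambda \varphi_i$ is larger than any positive solution of (1.3). Let $\theta_i$ be any positive solution of (1.3). From Lemma 2.6, we have

$$
-(Z_\lambda \varphi_i - \theta_i)'' = Z_\lambda a_i(t) - \lambda a_i f_i(\theta_i) = a_i \big[Z_\lambda - \lambda f_i(\theta_i)\big] \ge a_i \big[Z_\lambda - \lambda f_i(H_\lambda)\big] > 0, \qquad t \in (0,1),
$$

by the choice of $Z_\lambda$. Note that $(Z_\lambda \varphi_i - \theta_i)(0) = 0$. Now we show that $(Z_\lambda \varphi_i - \theta_i)(1) \ge 0$. Indeed, since $Z_\lambda \varphi_i'(1) + c_i(Z_\lambda \varphi_i(1))\, Z_\lambda \varphi_i(1) \ge 0 = \theta_i'(1) + c_i(\theta_i(1))\, \theta_i(1)$, we have

$$
Z_\lambda \varphi_i'(1) - \theta_i'(1) + c_i(Z_\lambda \varphi_i(1))\, Z_\lambda \varphi_i(1) - c_i(\theta_i(1))\, \theta_i(1) \ge 0.
\tag{1.31}
$$

If we assume that $Z_\lambda \varphi_i(1) - \theta_i(1) < 0$, then $c_i(Z_\lambda \varphi_i(1))\, Z_\lambda \varphi_i(1) - c_i(\theta_i(1))\, \theta_i(1) < 0$ since $c_i$ increases. Hence from (1.31) we obtain $Z_\lambda \varphi_i'(1) - \theta_i'(1) > 0$. However, $-(Z_\lambda \varphi_i - \theta_i)'' > 0$ in $(0,1)$, $(Z_\lambda \varphi_i - \theta_i)(0) = 0$ and $(Z_\lambda \varphi_i - \theta_i)(1) < 0$ imply that $(Z_\lambda \varphi_i - \theta_i)'(1) < 0$, which is a contradiction. Hence $Z_\lambda \varphi_i(1) - \theta_i(1) \ge 0$. Therefore, by the maximum principle, $Z_\lambda \varphi_i \ge \theta_i$ in $[0,1]$. Therefore, (1.3) has a maximal positive solution $(\bar u, \bar v)$.

Now, let $u$ and $v$ be any other positive solution of (1.3). To establish our theorem, we will show that $u \equiv \bar u$ and $v \equiv \bar v$ for $\lambda \gg 1$. Since $(u, v)$ and $(\bar u, \bar v)$ are solutions of (1.3), we obtain

$$
\begin{cases}
-(\bar u - u)''(t) = \lambda a_1(t) \big(f_1(\bar u(t)) - f_1(u(t))\big), & t \in (0,1),\\
-(\bar v - v)''(t) = \lambda a_2(t) \big(f_2(\bar v(t)) - f_2(v(t))\big), & t \in (0,1),\\
(\bar u - u)(0) = (\bar v - v)(0) = 0,\\
(\bar u - u)'(1) + c_1(\bar u(1))\, \bar u(1) - c_1(u(1))\, u(1) = 0,\\
(\bar v - v)'(1) + c_2(\bar v(1))\, \bar v(1) - c_2(v(1))\, v(1) = 0.
\end{cases}
$$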

By the mean value theorem, there exists $\xi$ such that $u \le \xi \le \bar u$ and $v \le \xi \le \bar v$ in $[0,1]$ and

$$
\begin{cases}
-(\bar u - u)''(t) = \lambda a_1(t) f_1'(\xi) \big(\bar u(t) - u(t)\big), & t \in (0,1),\\
-(\bar v - v)''(t) = \lambda a_2(t) f_2'(\xi) \big(\bar v(t) - v(t)\big), & t \in (0,1),\\
(\bar u - u)(0) = (\bar v - v)(0) = 0,\\
(\bar u - u)'(1) + c_1(\bar u(1))\, \bar u(1) - c_1(u(1))\, u(1) = 0,\\
(\bar v - v)'(1) + c_2(\bar v(1))\, \bar v(1) - c_2(v(1))\, v(1) = 0.
\end{cases}
\tag{1.32}
$$

By multiplying (1.3) by $(\bar u - u)$, $(\bar v - v)$ and (1.32) by $u$, $v$ and integrating, we first obtain, using integration by parts,

$$
\begin{aligned}
\int_0^1 \big[(\bar u - u)'' u - (\bar u - u)\, u''\big]\,dt &= \bar u'(1)\, u(1) - u'(1)\, \bar u(1)\\
&= -u(1)\, c_1(\bar u(1))\, \bar u(1) + \bar u(1)\, c_1(u(1))\, u(1)\\
&= -u(1)\, \bar u(1) \big[c_1(\bar u(1)) - c_1(u(1))\big] \le 0,
\end{aligned}
$$

$$
\begin{aligned}
\int_0^1 \big[(\bar v - v)'' v - (\bar v - v)\, v''\big]\,dt &= \bar v'(1)\, v(1) - v'(1)\, \bar v(1)\\
&= -v(1)\, c_2(\bar v(1))\, \bar v(1) + \bar v(1)\, c_2(v(1))\, v(1)\\
&= -v(1)\, \bar v(1) \big[c_2(\bar v(1)) - c_2(v(1))\big] \le 0,
\end{aligned}
$$

since $c_i$ are increasing. Using that $f_i$ is concave, we also have

$$
\begin{aligned}
\int_0^1 \big[(\bar u - u)'' u - (\bar u - u)\, u''\big]\,dt &= \lambda \int_0^1 a_1(t) \big[f_1(u) - f_1'(\xi)\, u\big] (\bar u - u)\,dt\\
&\ge \lambda \int_0^1 a_1(t) \big[f_1(u) - f_1'(u)\, u\big] (\bar u - u)\,dt,
\end{aligned}
\tag{1.33}
$$

$$
\begin{aligned}
\int_0^1 \big[(\bar v - v)'' v - (\bar v - v)\, v''\big]\,dt &= \lambda \int_0^1 a_2(t) \big[f_2(v) - f_2'(\xi)\, v\big] (\bar v - v)\,dt\\
&\ge \lambda \int_0^1 a_2(t) \big[f_2(v) - f_2'(v)\, v\big] (\bar v - v)\,dt.
\end{aligned}
\tag{1.34}
$$

From (F1), there exist $r_i > 0$, $b_i > 0$ such that $f_i(s) - f_i'(s)\, s \ge b_i$ whenever $s \ge r_i$. From (1.24), for $\lambda \gg 1$, $u(t) \ge r_i$ and $v(t) \ge r_i$ if $d(t, \partial\Omega) \ge \dfrac{r_i}{\lambda C}$. Let $\Omega_+ = \left[\dfrac{r_i}{\lambda C},\, 1 - \dfrac{r_i}{\lambda C}\right]$ and $\Omega_- = \left[0, \dfrac{r_i}{\lambda C}\right) \cup \left(1 - \dfrac{r_i}{\lambda C},\, 1\right]$. Then from (1.33) and (1.34), we have

$$
\begin{aligned}
0 &\ge \lambda \int_{\Omega_+} a_1(t)\, b_1 (\bar u - u)\,dt + \lambda \int_{\Omega_-} a_1(t)\, f_1(0) (\bar u - u)\,dt,\\
0 &\ge \lambda \int_{\Omega_+} a_2(t)\, b_2 (\bar v - v)\,dt + \lambda \int_{\Omega_-} a_2(t)\, f_2(0) (\bar v - v)\,dt,
\end{aligned}
\tag{1.35}
$$

since when $f_i$ is concave, $f_i(W) - W f_i'(W) \ge f_i(0)$ for all $W \ge 0$. Next, for $i = 1, 2$, let $m_i$ and $h_i$ satisfy

$$
-m_i''(t) = \omega_{\Omega_+}\, a_i(t), \quad t \in (0,1), \qquad m_i(0) = m_i(1) = 0,
$$

and

$$
-h_i''(t) = \omega_{\Omega_-}\, a_i(t), \quad t \in (0,1), \qquad h_i(0) = h_i(1) = 0,
$$

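respectively. Now multiplying (1.32) by $b_i m_i + f_i(0) h_i$ and integrating, we obtain

$$
\begin{aligned}
I &:= \int_0^1 -(\bar u - u)'' \big[b_1 m_1 + f_1(0) h_1\big]\,dt\\
&= \big(\bar u(1) - u(1)\big) \big[b_1 m_1'(1) + f_1(0) h_1'(1)\big] + \int_{\Omega_+} a_1(t)\, b_1 (\bar u - u)\,dt + \int_{\Omega_-} a_1(t)\, f_1(0) (\bar u - u)\,dt\\
&= I_1 + I_2,
\end{aligned}
$$

$$
\begin{aligned}
J &:= \int_0^1 -(\bar v - v)'' \big[b_2 m_2 + f_2(0) h_2\big]\,dt\\
&= \big(\bar v(1) - v(1)\big) \big[b_2 m_2'(1) + f_2(0) h_2'(1)\big] + \int_{\Omega_+} a_2(t)\, b_2 (\bar v - v)\,dt + \int_{\Omega_-} a_2(t)\, f_2(0) (\bar v - v)\,dt\\
&= J_1 + J_2,
\end{aligned}
\tag{1.36}
$$

where $I_1$, $J_1$ denote the boundary terms and $I_2$, $J_2$ the sums of the two integrals. Note that as $\lambda \to \infty$, $m_i \to e_i$ and $h_i \to 0$ in $C^1[0,1]$. Hence, for $\lambda$ large, we obtain

$$
b_i m_i + f_i(0) h_i > 0 \quad \text{in } (0,1), \quad i = 1, 2,
\tag{1.37}
$$

and

$$
b_i m_i'(1) + f_i(0) h_i'(1) < 0.
\tag{1.38}
$$

Hence for $\lambda \gg 1$, (1.38) implies $I_1 \le 0$, $J_1 \le 0$, and combining with (1.35) (which implies $I_2 \le 0$ and $J_2 \le 0$) we have $I \le 0$ and $J \le 0$. However, by (1.32), we also have

$$
I = \int_0^1 -(\bar u - u)'' \big[b_1 m_1 + f_1(0) h_1\big]\,dt = \lambda \int_0^1 a_1(t)\, f_1'(\xi) \big(\bar u(t) - u(t)\big) \big[b_1 m_1 + f_1(0) h_1\big]\,dt,
$$

$$
J = \int_0^1 -(\bar v - v)'' \big[b_2 m_2 + f_2(0) h_2\big]\,dt = \lambda \int_0^1 a_2(t)\, f_2'(\xi) \big(\bar v(t) - v(t)\big) \big[b_2 m_2 + f_2(0) h_2\big]\,dt.
$$

Now for $\lambda \gg 1$, using (1.37), $a_i > 0$, and $f_i' \ge 0$ we get $I \ge 0$ and $J \ge 0$. Hence, we conclude that $I = 0$ and $J = 0$ for $\lambda \gg 1$, which implies that $\bar v \equiv v$ and $\bar u \equiv u$ in $[0,1]$. This proves that (1.3) has a unique positive solution for all $\lambda$ large.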
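The two consequences of concavity used in this proof, namely $f(W) - W f'(W) \ge f(0)$ for all $W \ge 0$ and $f(s) - f'(s)\, s \ge b$ for $s \ge r$, can be checked on a sample nonlinearity. The sketch below assumes $f(s) = \sqrt{s+1} - 2$ together with the constants $b = 1$ and $r = 33$ (all hypothetical choices for illustration, not data from the paper).

```python
import math


def f(s):
    # Sample semipositone nonlinearity: f(0) = -1, increasing, concave.
    return math.sqrt(s + 1.0) - 2.0


def df(s):
    return 0.5 / math.sqrt(s + 1.0)


def gap(s):
    """The quantity f(s) - s * f'(s) appearing in the uniqueness argument."""
    return f(s) - s * df(s)


B_CONST, R_CONST = 1.0, 33.0  # assumed constants b and r for this f
grid = [0.05 * k for k in range(0, 4001)]  # sample points in [0, 200]

lower_bound_ok = all(gap(s) >= f(0.0) - 1e-12 for s in grid)
large_s_ok = all(gap(s) >= B_CONST for s in grid if s >= R_CONST)
print(lower_bound_ok, large_s_ok, gap(R_CONST))
```

For this $f$ one has $f(s) - s f'(s) = \dfrac{s+2}{2\sqrt{s+1}} - 2$, which equals $f(0) = -1$ at $s = 0$ and increases without bound, so both checks are expected to succeed.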

4. Conclusion

In this paper, we studied positive radial solutions for elliptic systems with nonlinear boundary conditions. Using Theorem 1.1 and Theorem 2.2, we can obtain a solution of problem (1.3). Moreover, for all $\lambda \gg 1$, (1.3) has a unique positive solution.

Acknowledgements

The authors would like to thank the anonymous referees for their helpful comments.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References [1] Berestycki, H., Caffarelli, L.A. and Nirenberg, L. (1996) Inequalities for Second- Order Elliptic Equations with Applications to Unbounded Domains, I. Duke Ma- thematical Journal, 81, 467-494. [2] Lions, P.L. (1982) On the Existence of Positive Solutions of Semilinear Elliptic Equ- ations. SIAM Review, 24, 441-467. https://doi.org/10.1137/1024101 [3] Ali, J., Castro, A. and Shivaji, R. (1993) Uniqueness and Stability of Nonnegative Solutions for Semipositone Problems in a Ball. Proceedings of the American Ma- thematical Society, 117, 775-782. https://doi.org/10.1090/S0002-9939-1993-1116249-5 [4] Anuradha, V., Hai, D.D. and Shivaji, R. (1996) Existence Results for Superlinear Semipositone Boundary Value Problems. Proceedings of the American Mathemati- cal Society, 124, 757-763. https://doi.org/10.1090/S0002-9939-96-03256-X [5] Castro, A., Gadam, S. and Shivaji, R. (1997) Positive Solution Curves of Semiposi- tone Problems with Concave Nonlinearities. Proceedings of the Royal Society of Edinburgh Section A: Mathematics, 127, 921-934. https://doi.org/10.1017/S0308210500026809 [6] Castro, A. and Shivaji, R. (1989) Nonnegative Solutions for a Class of Radially Sym- metric Nonpositone Problems. Proceedings of the American Mathematical Society, 106, 735-740. https://doi.org/10.1090/S0002-9939-1989-0949875-3 [7] Dancer, E.N. and Shi, J. (2006) Uniqueness and Nonexistence of Positive Solutions to Semipositone Problems. Bulletin of the London Mathematical Society, 38, 1033- 1044. https://doi.org/10.1112/S0024609306018984 [8] Oruganti, S., Shi, J. and Shivaji, R. (2002) Diffusive Logistic Equation with Constant Yield Harvesting, I: Steady States. Transactions of the American Mathematical So- ciety, 354, 3601-3619. https://doi.org/10.1090/S0002-9947-02-03005-2 [9] Castro, A., Sankar, L. and Shivaji, R. (2012) Uniqueness of Nonnegative Solutions for Semipositone Problems on Exterior Domains. Journal of Mathematical Analysis and Applications, 394, 432-437. 
https://doi.org/10.1016/j.jmaa.2012.04.005 [10] Ko, E., Lee, E. and Shivaji, R. (2013) Multiplicity Results for Classes of Singular Problems on an Exterior Domain. Discrete and Continuous Dynamical Systems, 33, 5153-5166. https://doi.org/10.3934/dcds.2013.33.5153 [11] Sankar, L., Sasi, S. and Shivaji, R. (2013) Semipositone Problems with Falling Zeros on Exterior Domains. Journal of Mathematical Analysis and Applications, 401, 146- 153. https://doi.org/10.1016/j.jmaa.2012.11.031


[12] Gordon, P.V., Ko, E. and Shivaji, R. (2014) Multiplicity and Uniqueness of Positive Solutions for Elliptic Equations with Nonlinear Boundary Conditions Arising in a Theory of Thermal Explosion. Nonlinear Analysis: Real World Applications, 15, 51-57. https://doi.org/10.1016/j.nonrwa.2013.05.005
[13] Butler, D., Ko, E., Lee, E. and Shivaji, R. (2014) Positive Radial Solutions for Elliptic Equations on Exterior Domains with Nonlinear Boundary Conditions. Communications on Pure and Applied Analysis, 13, 2713-2731. https://doi.org/10.3934/cpaa.2014.13.2713
[14] Castro, A., Hassanpour, M. and Shivaji, R. (1995) Uniqueness of Non-Negative Solutions for a Semipositone Problem with Concave Nonlinearity. Communications in Partial Differential Equations, 20, 1927-1936. https://doi.org/10.1080/03605309508821157
[15] Ambrosetti, A., Arcoya, D. and Buffoni, B. (1994) Positive Solutions for Some Semi-Positone Problems via Bifurcation Theory. Differential and Integral Equations, 7, 655-663. https://projecteuclid.org/euclid.die/1370267698


Applied Mathematics, 2021, 12, 147-156 https://www.scirp.org/journal/am ISSN Online: 2152-7393 ISSN Print: 2152-7385

Discrete Model of Plasticity and Failure of Crystalline Materials

V. L. Busov

Donbass State Engineering Academy, Kramatorsk,

How to cite this paper: Busov, V.L. (2021) Discrete Model of Plasticity and Failure of Crystalline Materials. Applied Mathematics, 12, 147-156. https://doi.org/10.4236/am.2021.123010

Received: January 29, 2021
Accepted: March 9, 2021
Published: March 12, 2021

Copyright © 2021 by author(s) and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY 4.0). http://creativecommons.org/licenses/by/4.0/ Open Access

Abstract

Within the framework of a discrete model of the nuclei of linear and planar defects, the variational principles of sliding in translational and rotational plasticity, fracture by separation (cleavage) and shear (shearing) in crystalline materials are considered. The analysis of mass transfer fluxes near structural kinetic transitions of slip bands into cells, cells into fragments of deformation origin, destruction by separation and shear for fractal spaces using fractional Riemann-Liouville derivatives, local and global criteria of destruction is carried out. One of the possible schemes of the crack initiation and growth mechanism in metals is disclosed. It is shown that the discrete model of plasticity and fracture does not contradict the known dislocation models of fracture and makes it possible to abandon the kinetic concept of thermofluctuation rupture of interatomic bonds at low temperatures.

Keywords Variational Principles of Plasticity and Destruction, Photoelectrons, Conduction Electrons, Injected Electrons, Fractal Space, Fracture Criteria

1. Introduction

The analysis of works [1] [2] [3] shows that during the generation of nuclei of linear and planar defects, two subsystems of electrons arise: photoelectrons knocked out of cations by an intermittent field, and intrinsic electrons of a solid. In metals, these are conduction electrons; in dielectrics and semiconductors, they are injected into the volumes of shock waves under the influence of external strong electric fields, and also arise when impurity donor atoms are introduced into the material. Here the subsystem of intrinsic electrons with thermal velocities $v_{ep}$ ($p = d, s$) and matrix cations is a solid-state plasma, and the subsystem of photoelectrons is a set of plane beams with velocities $V_{phe}$, while the subsystem of pairs of photoelectrons and the cations from which they were knocked out, weakly coupled by Coulomb attraction, is a deformation plasma of beams.

DOI: 10.4236/am.2021.123010 Mar. 12, 2021 147 Applied Mathematics

At large plastic deformations leading to the formation of stable fragmented structures up to critical ones, the average electron densities $n_{ep}$ and $n_{phe}$ and the corresponding plasma frequencies $\Omega_{ep}$ and $\Omega_{phe}$ are quantities of the same order, which leads to a fundamentally new distribution of the dielectric constant tensor $\varepsilon_{\alpha\beta}(\omega, \mathbf{k})$ both in space and in time. In this case, a new branch of the spectrum associated with the presence of beams is added to the main branch of the spectrum of longitudinal oscillations of the intrinsic plasma, where $\Omega_{ep}$ reflects the collective natural oscillations, and $\Omega_{phe}$ reflects oscillations and rotations in the additional potential relief of the nuclei of linear and planar defects. Here it is necessary to note the fundamental difference in the nature of the motion of electrons in these subsystems: in the intrinsic plasma, the directions of thermal motion of electrons are equally distributed over the total solid angle, and the values of their velocities in metals are not lower than $v_F$, while in dielectrics and semiconductors, in the volumes of shock waves at electric fields near breakdown, they tend to the rates of local metallization $v_{ms}$, $v_{md}$. On the contrary, in the plasma of beams, the alternating (intermittent) field creates dynamic anisotropy, while the directions of the velocities $V_{phe}$ lie in the slip planes of single crystals, and in polycrystals the appearance of a subsystem of beams is possible at threshold values of the projections of these velocities on the slip plane in individual crystallites.

Optical and electron microfractography of the surfaces of fatigue brittle and viscous-plastic fractures of specimens from a wide range of metals and their alloys [4] [5] suggests that the geometry of such surfaces can be described by fractal functions such as those of Weierstrass, Takagi, and Riemann [6]. Brittle fractures, as a combination of terraces and steps, are reduced to a superposition of saw-tooth functions or condensation of singularities, and viscous-plastic fractures to a countable number of peaks, where the left-side and right-side derivatives of the surface profile tend either to $+\infty$ on the left and to $-\infty$ on the right, or vice versa. It is also known that the macroscopic curves of tension [7] [8] and creep [9] have an intermittent jump-like shape and thus reflect the fractal nature of deformation processes of plasticity and fracture, where, at small deformations, the density of jumps $\eta_{jump}$ and the depth of load decay per jump $\Delta\sigma$ are small, and at large deformations $\eta_{jump}$ and $\Delta\sigma$ increase by up to one order of magnitude.

Currently, there are several ways to describe the structural kinetic transitions “cell-fragment” and “fragment-microcrack”: 1) within the framework of a synergetic approach using scale invariance [10] [11], which makes it possible to relate the fractal properties of an open system far from equilibrium with deformation processes; 2) with the help of the phase transition of the crystalline state to the amorphous one [12] [13], which leads to the appearance of submicrocracks in front of the tip of the growing crack; 3) the kinetic concept of thermofluctuation rupture of interatomic bonds [14]. In these approaches, the transitions are described without using the discrete model of charged particles that oscillate and rotate in potential valleys of the additional potential relief of the nuclei of linear and planar defects arising due to the distribution of conduction electrons [1] [2] [3]. The aim of this work is to build a physical and mathematical discrete model of structural kinetic transitions taking into account the fractality of deformation processes.
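The Introduction describes brittle fracture surfaces as a superposition of saw-tooth functions and names the Takagi function as one fractal model. As an illustration only, the following minimal Python sketch builds partial sums of the Takagi function from scaled saw-tooth waves. The function names and the number of terms are assumptions made for this sketch and do not come from the paper.

```python
def sawtooth(x: float) -> float:
    """Distance from x to the nearest integer: a unit saw-tooth wave."""
    return abs(x - round(x))


def takagi(x: float, terms: int = 40) -> float:
    """Partial sum of the Takagi function, sum over n of sawtooth(2^n x) / 2^n.

    `terms` (an assumed truncation depth) controls how many saw-tooth
    harmonics are superposed; the neglected tail is below 2^(-terms + 1).
    """
    return sum(sawtooth((2 ** n) * x) / (2 ** n) for n in range(terms))


# Sample a coarse profile on [0, 1]; each extra term adds finer "steps".
profile = [takagi(i / 16) for i in range(17)]
```

Increasing `terms` adds ever finer teeth without smoothing the coarse ones, which is the self-similar behaviour the text attributes to fracture profiles.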

2. Theoretical Model

Let us consider variations in the potential relief of a crystal $V_c(\mathbf{r}, t)$ as a functional of external currents $J^{e}_{osc}$, $J^{e}_{turn}$, $J^{cat}_{osc}$, $J^{cat}_{turn}$ [15]. When strip, cellular, and fragmented structures that are stable at a given level of latent energy are formed, the necessary condition for an extremum is fulfilled:

$$\delta V_c\left(J^{\nu}_{\mu}\right) = 0, \quad \nu = e, cat; \quad \mu = osc, turn \qquad (1)$$

On the contrary, the processes of translational and rotational plasticity and of destruction are transient processes caused by the non-equilibrium of the system under the influence of external and internal electromagnetic fields when the threshold values of extraneous currents of photoelectrons (e) and cations (cat) are successively reached: $J^{thr}_{e\,osc}(trpl)$ and $J^{thr}_{cat\,osc}(trpl)$ are the currents of oscillations and rotations with translational plasticity (trpl); $J^{thr}_{e\,turn}(rotpl)$ and $J^{thr}_{cat\,turn}(rotpl)$ are the currents of oscillations and rotations with rotational plasticity (rotpl); $J^{thr}_{\nu\mu}(dst)$ are the currents of oscillations and rotations during destruction (dst). Here equality (1) turns into an inequality, and the variations $\delta J^{\nu}_{\mu}(\eta)$ are connected in pairs

$$\delta J^{e}_{osc}(trpl) \leftrightarrow \delta J^{cat}_{osc}(trpl) \qquad (2.1)$$

$$\delta J^{e}_{turn}(trpl) \leftrightarrow \delta J^{cat}_{turn}(trpl) \qquad (2.2)$$

$$\delta J^{e}_{osc}(rotpl) \leftrightarrow \delta J^{cat}_{osc}(rotpl) \qquad (2.3)$$

$$\delta J^{e}_{turn}(rotpl) \leftrightarrow \delta J^{cat}_{turn}(rotpl) \qquad (2.4)$$

$$\delta J^{e}_{osc}(dst) \leftrightarrow \delta J^{cat}_{osc}(dst) \qquad (2.5)$$

$$\delta J^{e}_{turn}(dst) \leftrightarrow \delta J^{cat}_{turn}(dst) \qquad (2.6)$$

where $\eta = trpl, rotpl, dst$, and in the region of structural kinetic transitions they asymptotically tend to step functions. A natural question arises: what is the physical and mathematical model of such transitions, taking into account the fractality of deformation processes? Here we assume that the deformed volume of the material is considered as a fractal space, where the equations of mass transfer with the help of electron and ion plasma waves ([3], Formulas (20), (21)) are generalized by replacing the usual differentiation operators $\partial/\partial t$, $\partial/\partial x_j$ with operators of fractional derivative (Riemann-Liouville operators) $(\partial/\partial t)^{\alpha}$, $(\partial/\partial x_j)^{\alpha}$ with fractional exponent $0 < \alpha < 1$ ([6], p. 75). The fractional Riemann-Liouville derivatives of order $\alpha$ are left-handed

$$\left(D^{\alpha}_{a+} f\right)(x) = \frac{1}{\Gamma(1-\alpha)} \frac{d}{dx} \int_{a}^{x} \frac{f(t)\,dt}{(x-t)^{\alpha}} \qquad (3)$$

and right-handed operators

$$\left(D^{\alpha}_{b-} f\right)(x) = -\frac{1}{\Gamma(1-\alpha)} \frac{d}{dx} \int_{x}^{b} \frac{f(t)\,dt}{(t-x)^{\alpha}} \qquad (4)$$

where $f \equiv f_{inj}, f_{phe}, f_{cat}$ are the distribution functions of injected electrons, photoelectrons and cations, respectively, and $\Gamma(1-\alpha)$ is the Gamma function. The right-hand sides of (3) and (4) contain convolution integrals; therefore, it is more convenient to consider fractional operators in $(\omega, \mathbf{k})$-space. If we introduce the linear operator $d/dx$ under the sign of this integral, then its Fourier transform as $a \to -\infty$ and $b \to +\infty$ leads to the product of the Fourier components of the function $f^{\nu}_{\mu}$ and the power hyperbolic function $F_{gp}$ with fractional exponent. This transformation is applicable only in undeformed dielectrics and pure undoped semiconductors, where, due to the low density of free carriers, relaxation processes proceed extremely slowly. At the same time, such processes in metals are fast, with spatio-temporal intervals $[a, b] \approx \tau_{re}, (5 \div 10)\,a_0$, which is caused by the selection equations for the frequencies $\omega_{pw}$ and wave vectors $\mathbf{k}_{pw}$ of plasma waves [3]

$$\omega_{pw} - \mathbf{k}_{pw} \cdot \mathbf{V}_e = 0 \qquad (5)$$

$$\varepsilon_l\left(\mathbf{k}_{pw}, \omega_{pw}\right) = 0 \qquad (6)$$

when generating linear defects. The dielectric constant near plane defects has a tensor representation $\varepsilon_{\alpha\beta}(\omega, \mathbf{k})$. Here, for low-angle tilt and twist boundaries, in the principal (normal and tangential) axes of the matrix $\varepsilon_{\alpha\beta}$, two functions $\varepsilon_n$ and $\varepsilon_t$ can be distinguished and, accordingly, two selection equations. For high-angle “interfragment” boundaries of deformation origin, in particular, multi-wall Chalmers boundaries ([16], p. 456), the number of functions $\varepsilon_i$ and selection equations increases to three. In $(\omega, \mathbf{k})$-space near the regions of structural kinetic transitions, by analogy with [17], there is a spectrum of threshold values of frequencies $\omega_i$ and wave vectors $\mathbf{k}_j$, relative to which, both on the left and on the right, the dependence

Fgp (ω,k ) is completely determined by the fractional exponent α . We represent α as a power series ββ  nnδδ Vphei V α =AAli ⋅−11ννi +  ⋅− i , ν = e, cat (7) i ni nnVi  VV  gli ννgli gli gli

where the distribution functions of photoelectrons $f_{phe}$ and cations $f_{cat}$, from which the photoelectrons were knocked out, are averaged over the local test volume $V_l$ (volumes near slip bands and boundaries of blocks, grains, and fragments) and the global volume $V_{gl}$ (fragment, grain, crystal): $\delta n_{\nu i} = \langle f_{\nu i} \rangle_{V_l} - \langle f_{\nu i} \rangle_{V_{gl}} \equiv n_{\nu l i} - n_{\nu gl i}$; the variations in the velocities of propagation of charged particles are $\delta V_{\nu i} = \langle V_{\nu i} \rangle_{V_l} - \langle V_{\nu i} \rangle_{V_{gl}} \equiv V_{\nu l i} - V_{\nu gl i}$ for the $i$-th structural kinetic transition. Here, the numerical coefficients lie between 0 and 1, and the exponent $\beta$ depends primarily on the density of conduction electrons $n_{ec}$. On the other hand, $\beta$ plays the role of the Hausdorff-Besicovitch fractal dimension ([6], pp. 15-56), therefore $\beta = 1 \div 3$. Hence, in dielectrics in the absence of shock waves and strong external fields, the $\beta(n_{ec})$ dependence is linear; in semiconductors, it is weakly nonlinear: $\beta = 1 \div 2$; and in metals it goes from a quadratic to a cubic parabola: $\beta = 2 \div 3$.
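To make the left-handed Riemann-Liouville operator of this section concrete, the sketch below approximates it numerically and compares the result with the known closed form for a power function, $D^{\alpha} x^{\mu} = \frac{\Gamma(\mu+1)}{\Gamma(\mu+1-\alpha)} x^{\mu-\alpha}$ for lower limit $a = 0$. It uses the Grünwald-Letnikov finite-difference scheme, which is a substitute technique chosen here for simplicity and not the author's method; for smooth functions it converges to the Riemann-Liouville derivative. All function names and the step count are assumptions of this sketch.

```python
import math


def gl_fractional_derivative(f, x, alpha, a=0.0, steps=2000):
    """Grunwald-Letnikov approximation of the left-handed fractional
    derivative of order alpha (0 < alpha < 1) of f on [a, x].

    The weights w_k = (-1)^k * binom(alpha, k) are built by the
    recurrence w_k = w_{k-1} * (1 - (alpha + 1) / k), with w_0 = 1.
    """
    h = (x - a) / steps
    weight = 1.0
    total = f(x)
    for k in range(1, steps + 1):
        weight *= 1.0 - (alpha + 1.0) / k
        total += weight * f(x - k * h)
    return total / h ** alpha


def rl_power_law(x, mu, alpha):
    """Closed-form Riemann-Liouville derivative of t**mu at x (a = 0)."""
    return math.gamma(mu + 1) / math.gamma(mu + 1 - alpha) * x ** (mu - alpha)


# Half-derivative of f(t) = t at x = 1: numerical value vs closed form.
approx = gl_fractional_derivative(lambda t: t, 1.0, 0.5)
exact = rl_power_law(1.0, 1.0, 0.5)
```

The scheme is only first-order accurate in the step size, so `steps` trades run time against agreement with the closed form.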

2.1. Borders of Deformation Origin

At the first stage of the theory, let us return to the well-known island model of the hereditary boundaries of a polycrystal in metals [18], where a periodic sequence of regions of good and bad conjugation of bicrystal atoms is considered. Following the adiabaticity of the model, periodic thickening and rarefaction of cations lead to corresponding variations in the density of conduction electrons. On the contrary, when boundaries of deformation origin appear according to [3], electron plasma waves are first formed according to the rules for selecting frequencies and wave vectors, to which their ionic plasma waves are shifted and stabilized. The process of redistribution of dislocations occurs in two directions: an increase in the density of such boundaries and the appearance of multi-wall boundaries, which leads to crystal fragmentation. The main feature of these boundaries is the fulfillment of the condition

$$\left\langle f_{ec} + f_{phe} \right\rangle_{V_{lb}} \equiv \left(n_{l\,ec} + n_{l\,phe}\right) > n_{gl} \equiv \left\langle f_{ec} \right\rangle_{V_c} \qquad (8)$$

where $V_{lb}$ and $V_c$ are the volumes adjacent to the interfragment boundary with the average fragment diameter $L_{fr} \approx 100 \div 200$ nm [5] [19] and the entire crystal, respectively.

Hence, the mechanism of motion of the boundary separating fragment 1 and fragment 2 is clear: if the average density of electrons in the first fragment is lower than that in the second, $n_{gl1} < n_{gl2}$, the boundary moves parallel to itself towards the first fragment. Adjacent fragments form edges and junctions, at which the selective frequencies and wave vectors of plasma waves must be matched with similar values at intersecting boundaries. It should be noted that the interaction of the deformation plasma of the beams and the intrinsic plasma of a solid at large plastic deformations leads to a significant increase in the large-scale correlation energy of the relative rotation of injected electrons and photoelectrons with respect to the distribution of conduction electrons. Here, in the region of the volume of fragment boundaries, the sizes of vacancy volumes and trajectories of rotation of injected electrons and photoelectrons are of the same order [2], the moduli $V_{phe}$, $V_{inj}$ approach and coincide with $v_F$ of conduction electrons, and in critical fragmented structures they exceed it.

During the “slip band-cell” transition, the dislocations interact and group into rows chaotically distributed within the cell wall, where dislocations of different signs annihilate. Dislocations of the same sign, each of which, according to the AFM surface profile ([3], Figure 1, two extreme sections), consists of a central valley and two protrusions along its edges, interact in such a way that at first the overlapping protrusions of the harmonics of the electron plasma waves of the nuclei of these dislocations merge into one protrusion-peak, and then their ionic plasma waves shift to this peak and stabilize it, forming a powerful linear ridge of cations of the substance, while the neighboring valleys deepen significantly. At high dislocation densities $\rho = 10^{10} \div 10^{11}$ cm$^{-2}$ [5] [20], this process can occur many times.

As a result, we arrive at the “cell-fragment” transition. Near this transition on the left, the density and velocity of mobile dislocations generated both upon impact and under static loading change abruptly, which means that the variations $\delta n_{inj}$, $\delta n_{phe}$, $\delta V_{inj}$, $\delta V_{phe}$ are jumps, and from expression (7) it follows that the exponent $\alpha \to 0$ and reaches values of 0.1 - 0.2, confirmed by experiment [17]. To the right of the transition, the decisive role is played by relaxation processes, where the variations $\delta n_{inj}$, $\delta n_{phe}$ change signs. In metals, both on the left and on the right, the values of $\alpha$ are close or practically the same; in dielectrics on the right they are close to unity; and in semiconductors they have intermediate values.

2.2. Destruction of Crystalline Materials

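The most general global criteria for the destruction of crystalline materials by detachment and shearing are

$$\left\langle f_{inj} + f_{phe} \right\rangle_{V_l} \equiv n_{inj} + n_{phe} > n_{ec} \equiv \left\langle f_{ec} \right\rangle_{V_{gl}} \qquad (9)$$

$$n_{inj} + n_{phe} > n_{eff} \qquad (10)$$

$$V_{phe}, V_{inj} > v_F, v_{ms}, v_{md}, \quad x_j \in S \qquad (11)$$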

for a part of the surface of the interfragment boundary or the cleavage plane. Here $n_{eff}$ is the limiting density of the electronic subsystem including photoelectrons, injected electrons, and conduction electrons, reflecting the small-scale correlation energy due to the Pauli principle; at interelectronic distances $r_{eff} \le (0.1 \div 0.2)$ nm [2], the potential of interelectronic repulsion $U_{ee}$ corresponds to the power function $r^{-m}$ ($m = 6 \div 9$); in semiconductors and dielectrics, injected electrons with velocities $V_{inj}$ appear first, and then, as the deformations grow, photoelectrons with $V_{phe}$; also, the velocity $v_F$ of conduction electrons must be replaced by the velocity of injected electrons at the rates of local metallization $v_{ms}$, $v_{md}$.

In dielectrics, when fractured by cleavage along the cleavage planes, criteria (10) and (11) are satisfied. Similarly, in metals, the same criteria take place during the initiation of microcracks at interfragment boundaries ([5], p. 118). Here, microcracks even in very plastic metals (Al, Ag) appear explosively, in a brittle manner. With the growth of a crack, a so-called plastic zone is formed in front of its tip ([4], p. 272), where the same sequence of deformation processes is repeated, proceeding from the beginning of the application of the load to the opening of a new section of the crack across the plastic zone. This is precisely the self-similarity of plasticity and fracture, or the scaling inherent in fractal structures. It should be noted here that the fulfillment of fracture criteria (9) - (11) in itself does not automatically mean crack opening. The opening time intervals $\Delta t_{op}$ caused by the interelectronic repulsive potential $U_{ee}$ should be rather small compared to the time of approach of photoelectrons, injected electrons and cations under the influence of the potential of electrostatic attraction $U_{ecat}$ for their stabilization. In other words, the forces from $U_{ee}$ must be large enough compared to the forces from $U_{ecat}$ to prevent the incipient crack from collapsing and to allow internal free surfaces with an electric double layer and surface (Tamm) states of electrons to form.

On the other hand, consider the process of forming a new section of a growing crack, which consists of two stages. At the first stage, in metals, electrons concentrated within the sharp peak of electron plasma waves $f_{phe} + f_{inj}$ at the interfragment boundary are accelerated under the influence of $U_{ee}$ to velocities $V_{phe}, V_{inj} \ge v_F$ along the normal to this boundary. Here the conduction electrons, although they create an additional potential relief, cannot completely limit and stabilize the emerging electron fluxes near the boundary: the fluxes either slow down to zero at distances of the order of $\Delta L_{ee} \approx 70 \div 1200$ nm from the boundary, or, decelerating, are reflected back from an adjacent interfragment boundary parallel to the original boundary ([5], Figure 32). At the second stage, an accumulation of cations arises near the interfragment boundary, in which the electroneutrality condition is violated, which leads to the appearance of a repulsive potential $U_{catcat}$ inside this cluster, where the corresponding ion fluxes appear, directed from the boundary into the fragment with velocities $v_{cat} \ll V_{phe}, V_{inj}$. As a result, an internal cavity appears, the opening $\Delta l_{cr}$ of which should be sufficient to return the hindered electron beams with subsequent stabilization of the generated ion beams for the formation of an electric double layer of the inner free surface. If the dependencies $U_{ee}(r_{ee}, t)$ and $U_{catcat}(r_{catcat}, t)$ remain similar in the process of crack opening, then the condition for the stability of the emerging crack section is maximally simplified:

$$\frac{\Delta l_{cr}}{v_{cat}(t)} \ge \frac{\Delta L_{ee}}{V_e(t)}, \quad e \equiv phe, inj \qquad (12)$$

where $v_{cat}(t) = \frac{1}{M_{cat}} \int_{0}^{\Delta t_{op}} \frac{\partial U_{catcat}(r, t)}{\partial r_{catcat}}\,dt$; $V_e(t) = \frac{1}{m_e} \int_{0}^{\Delta t_{op}} \frac{\partial U_{ee}(r, t)}{\partial r_{ee}}\,dt$; and $\Delta t_{op}$ is the time interval of crack opening. Analysis of (12) shows that at $\Delta l_{cr} \approx 0.7 \div 12$ nm ([5], Figure 31), the cations should move with supersonic speeds $v_{cat} \approx 10^4$ m/s. The fractal shape of the emerging crack surfaces is the most acceptable for its stabilization ([4], Figure 5.47 - 5.59; [5], Figure 34, Figure 35).
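The order-of-magnitude estimate for the cation speed can be checked with simple arithmetic. The sketch below treats stability condition (12) as an equality and solves it for the cation speed, using the ranges quoted in the text for the crack opening and the electron stopping distance. The electron speed is set to an assumed typical Fermi velocity of about $10^6$ m/s, a value not stated in the paper, and pairing the lower ends of the two ranges with each other (and likewise the upper ends) is also an assumption of this sketch.

```python
def limiting_cation_speed(dl_cr_nm: float, dL_ee_nm: float, v_e: float) -> float:
    """Cation speed (m/s) at which dl_cr / v_cat equals dL_ee / v_e.

    Lengths are in nanometres (only their ratio matters); v_e is in m/s.
    """
    return v_e * dl_cr_nm / dL_ee_nm


V_FERMI = 1.0e6  # m/s; assumed typical Fermi velocity in metals (hypothetical input)

# Lower ends of both quoted ranges, then upper ends (assumed pairing).
v_cat_low = limiting_cation_speed(0.7, 70.0, V_FERMI)
v_cat_high = limiting_cation_speed(12.0, 1200.0, V_FERMI)
```

Under these assumptions both pairings give a limiting speed of the order of $10^4$ m/s, the same order as the value quoted in the text.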

3. Discussion of Results

At first glance, the huge variety of structures and associated deformation and relaxation processes, which take place in a wide range of loads and deformations, up to destruction, seems absolutely amazing.

To date, there is no unified theory in the literature for their description, despite a seemingly rather simple combination of electronic and cationic subsystems for a metal bond, semiconductor atoms for a covalent bond, and cations and anions of dielectrics for an ionic bond. The nature of plasticity and fracture in crystalline materials therefore still remains undisclosed.

In this work, only a qualitative description of deformation processes is presented. For a quantitative consideration, it will be necessary to solve systems of equations by numerical methods.

A fundamental question arises: is a fragment of deformation origin a quantum dot [21] [22]? First of all, we note that a quantum dot consists of a core in the form of a semiconductor (or conductor) three-dimensional microcrystal with a diameter of $d_{cp} = 2 \div 10$ nm inside a thin shell with a significantly larger energy gap than that of its core. The energy spectrum of electrons in the core of this dot is discrete, with an interval between neighboring levels $\Delta\varepsilon_{cp} \sim d_{cp}^{-2}$, which is characteristic of a three-dimensional potential well at $\varepsilon_{cp} < 0$. On the contrary, the distribution of injected electrons, photoelectrons and conduction electrons inside a fragment with $L_{fr} = 100 \div 300$ nm has a continuous spectrum, while the interfragment boundary, in contrast to the free (impenetrable) surface, is partially permeable to these electrons. The ratio of the reflection and transmission coefficients of electrons in the beams, and separately its dependence on the misorientation angle for low-angle and high-angle boundaries of deformation origin, has yet to be found.

The scheme of the crack initiation and growth mechanism has a number of fundamental distinguishing features:
• It is based on a discrete model of charged particles;
• It reflects the interatomic potential and the type of bond between particles inherent in a given material;
• It does not contradict the well-known dislocation models of brittle fracture by Zener, Stroh, and Cottrell [23] and the model of dislocation replenishment by A.N. Orlov [24];
• It allows abandoning the kinetic concept of thermal fluctuation rupture of interatomic bonds [14], at least at low and room temperatures.

The model of plasticity and fracture presented in this work shows that the pumping energy under shock loads is redistributed as follows: the generalized space of rectangular pulses along lines is replaced by a similar space along planes and curved surfaces, and the superposition of step functions in the form of terraces and steps is replaced by a superposition of non-differentiable peaks and ridges in the form of grooved and pit relief of fractures. At the same time, in real conditions, mixed structures of both types of relief occur most often, and with the growth of cracks it is precisely the self-similarity of deformation processes that appears. As a result, according to the destruction criteria (9) - (11), material objects are divided into separate fragments, inside which the properties are preserved. This is precisely the fractality of these objects under deformation.

Conflicts of Interest

The author declares no conflicts of interest regarding the publication of this paper.

References

[1] Busov, V.L. (2020) On Separation of Charges and Formation of Linear Structures in the Nuclei of Dislocations in Metals. Applied Mathematics, 11, 739-752. https://doi.org/10.4236/am.2020.118049
[2] Busov, V.L. (2020) On the Relationship of the Discrete Model of the Nuclei of Linear and Planar Defects and the Continuum Models of Defects in Crystalline Materials. Applied Mathematics, 11, 862-875. https://doi.org/10.4236/am.2020.119056
[3] Busov, V.L. and Grechkina, M.V. (2020) Plasma Model of Generation and Slip of Linear Defects in Crystalline Materials. Applied Mathematics, 11, 1167-1177. https://doi.org/10.4236/am.2020.1111079
[4] Kocanda, S. (1985) Fatigue Cracking of Metals. Wydawnictwa Naukowo Techniczne, Warszawa.
[5] Rybin, V.V. (1986) Large Plastic Deformation and Destruction of Metals. Metallurgy, Moscow.
[6] Potapov, A.A. (2005) Fractals in Radio Physics and Radar. Sampling Topology, Publishing House University Book, Moscow.
[7] Klyavin, O.V. (1974) Features of Plastic Deformation of Crystalline Bodies at Helium Temperatures. In: Startsev, V.I., Ed., Physical Processes of Plastic Deformation at Low Temperatures, the Collection of Articles, Naukova Dumka, Kyiv, 5-30.
[8] Didenko, D.A. (1974) On the Mechanism of Low-Temperature Jump-Like Deformation of Aluminum. In: Startsev, V.I., Ed., Physical Processes of Plastic Deformation at Low Temperatures, the Collection of Articles, Naukova Dumka, Kyiv, 129-138.
[9] Koval, V.A., Soldatov, V.P. and Startsev, V.I. (1974) Creep of Copper at Temperatures of 1.4-4.2 K. In: Startsev, V.I., Ed., Physical Processes of Plastic Deformation at Low Temperatures, the Collection of Articles, Naukova Dumka, Kyiv, 339-345.
[10] Stanley, H., Coniglio, A., Klein, W., et al. (1984) Critical Phenomena: Past, Present and Future. Synergetics. Mir, Moscow.
[11] Ivanova, V.S. (1989) Fracture Synergetics and Mechanical Properties. In: Ivanova, V.S., Ed., Synergetics and Fatigue Destruction of Metals, the Collection of Scientific Works, Science, Moscow, 6-29.
[12] Pavlov, V.A. (1985) Amorphization of the Structure of Metals and Alloys with an Extremely High Degree of Plastic Deformation. Physics of Metals and Metal Science, 59, 629-649.
[13] Tutnov, A.A., Dorovskiy, V.M. and Elesin, L.A. (1989) Amorphization of Crystalline Materials in the Zone in Front of the Tip of the Developing Crack. In: Ivanova, V.S., Ed., Synergetics and Fatigue Destruction of Metals, the Collection of Scientific Works, Nauka, Moscow, 45-57.
[14] Regel, V.R., Slutsker, A.I. and Tomashevsky, E.E. (1974) The Kinetic Nature of the Strength of Solids. Science, Moscow.
[15] Busov, V.L. (2019) Dynamic Evolution Equations for the Cores of Linear Defects of Crystalline Materials in Colliding Solids. Physical Mesomechanics, 22, 91-96.
[16] Mirkin, L.I. (1968) Physical Foundations of Strength and Plasticity. Publishing House of , Moscow.
[17] Busov, V.L. and Mikheenko, D.Yu. (2015) On the Mechanism of Destruction of the Rolling Roll. Theoretical Model. Physical Mesomechanics, 18, 72-78.
[18] Orlov, A.N., Perevezentsev, V.N. and Rybin, V.V. (1980) Borders of Grains in Metals. Metallurgy, Moscow.
[19] Glezer, A.M. and Metlov, L.S. (2008) Megaplastic Deformation of Solids. Physics and Technology of High Pressure, 18, 21-35.
[20] Panin, V.E., Likhachev, V.A. and Grinyaev, Yu.V. (1985) Structural Levels of Deformation of Solids. Nauka Siberian Department, Novosibirsk.
[21] Yekimov, A.I. and Onushchenko, A.A. (1981) Quantum Size Effect in Three-Dimensional Microcrystals of Semiconductors. Journal of Experimental and Theoretical Physics Letters, 34, 363-366.
[22] Reed, M.A., Randall, J.N., Aggarwal, R.J., Matyi, R.J., Moore, T.M. and Wetsel, A.E. (1988) Observation of Discrete Electronic States in a Zero-Dimensional Semiconductor Nanostructure. Physical Review Letters, 60, 535-537. https://doi.org/10.1103/PhysRevLett.60.535
[23] Hirth, J. and Lothe, J. (1972) Dislocation Theory. Atomizdat, Moscow.
[24] Orlov, A.N. (1983) Introduction to the Theory of Defects in Crystals. Higher School, Moscow.


Applied Mathematics, 2021, 12, 157-170 https://www.scirp.org/journal/am ISSN Online: 2152-7393 ISSN Print: 2152-7385

Tuning of Prior Covariance in Generalized Least Squares

William Menke

Lamont-Doherty Earth Observatory of Columbia University, New York, USA

How to cite this paper: Menke, W. (2021) Tuning of Prior Covariance in Generalized Least Squares. Applied Mathematics, 12, 157-170. https://doi.org/10.4236/am.2021.123011

Received: February 5, 2021
Accepted: March 14, 2021
Published: March 17, 2021

Copyright © 2021 by author(s) and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY 4.0). http://creativecommons.org/licenses/by/4.0/ Open Access

Abstract

Generalized Least Squares (least squares with prior information) requires the correct assignment of two prior covariance matrices: one associated with the uncertainty of measurements; the other with the uncertainty of prior information. These assignments often are very subjective, especially when correlations among data or among prior information are believed to occur. However, in cases in which the general form of these matrices can be anticipated up to a set of poorly-known parameters, the data and prior information may be used to better-determine (or “tune”) the parameters in a manner that is faithful to the underlying Bayesian foundation of GLS. We identify an objective function, the minimization of which leads to the best-estimate of the parameters and provide explicit and computationally-efficient formula for calculating the derivatives needed to implement the minimization with a gradient descent method. Furthermore, the problem is organized so that the minimization need be performed only over the space of covariance parameters, and not over the combined space of model and covariance parameters. We show that the use of trade-off curves to select the relative weight given to observations and prior information is not a form of tuning, because it does not, in general maximize the posterior probability of the model parameters, and can lead to a different weighting than the procedure described here. We also provide several examples that demonstrate the viability, and discuss both the advantages and limitations of the method.

Keywords Bayesian Inference, Covariance, Error, Generalized Least Squares, Gradient Descent, Interpolation, Regularization, Trade-Off Curve, Variance

1. Introduction

Generalized Least Squares (GLS, also called least-squared with prior information)

DOI: 10.4236/am.2021.123011 Mar. 17, 2021 157 Applied Mathematics

W. Menke

is a tool for statistical inference [1]-[6] that is widely used in geotomography [7]-[12] and geophysical inversion [13] [14], as well as other areas of the physical sciences and engineering. One of the attractive features of GLS that makes it especially useful in the imaging of multidimensional fields (for example, density, velocity, viscosity) is its ability to implement, in a natural and versatile way, prior information on the behavior of the field. Widely-used types of prior information include the field being smooth, as quantified by its low-order derivatives [15], having a specified power spectral density or autocovariance [7] [15], and satisfying a specified partial differential equation (such as the geostrophic flow equation [16] or the diffusion equation [4]). The word "regularization" sometimes is used to describe the effect of prior information on the solution process [17].

We review the Generalized Least Squares (GLS) method here, following the notation in [6], in order to provide context and to establish nomenclature. In GLS, observations (or data) and prior information (or inferences) are combined to arrive at a best-estimate of initially-unknown model parameters (which might, for example, represent a field sampled on a regular grid). The data are assumed to satisfy the linear equation $\mathbf{G}\mathbf{m} = \mathbf{d}$, where $\mathbf{d} \in \mathbb{R}^N$ is a vector of data, $\mathbf{m} \in \mathbb{R}^M$ is a vector of model parameters, and $\mathbf{G}$ is a known "kernel" matrix associated with the data. Prior information is assumed to satisfy a linear equation $\mathbf{H}\mathbf{m} = \mathbf{h}$, where $\mathbf{h} \in \mathbb{R}^K$ is a vector of prior values and $\mathbf{H}$ is a kernel matrix associated with the prior information. GLS problems are assumed to be over-determined, with $N + K > M$. For observed data $\mathbf{d}^{obs}$, known prior information $\mathbf{h}^{pri}$ and a specified model $\mathbf{m}$, the prediction error is $\mathbf{e} \equiv \mathbf{d}^{obs} - \mathbf{G}\mathbf{m}$ and the prior information error is $\boldsymbol{\ell} \equiv \mathbf{h}^{pri} - \mathbf{H}\mathbf{m}$. These errors are assumed to be Normally-distributed with zero mean and prior covariance $\mathbf{C}_d$ and $\mathbf{C}_h$, respectively. Then, the normalized errors $\tilde{\mathbf{e}} \equiv \mathbf{C}_d^{-1/2}\mathbf{e}$ and $\tilde{\boldsymbol{\ell}} \equiv \mathbf{C}_h^{-1/2}\boldsymbol{\ell}$ are independent and identically-distributed Normal random variables with zero mean and unit variance. Bayes' theorem can be used to show that the best estimate $\mathbf{m}^{est}$ of the solution is the one that minimizes the generalized error $\Phi \equiv E + L$, with $E \equiv \tilde{\mathbf{e}}^T\tilde{\mathbf{e}}$ and $L \equiv \tilde{\boldsymbol{\ell}}^T\tilde{\boldsymbol{\ell}}$ [1] [2] [5]. The solution can be expressed in a variety of equivalent forms, among which is the widely-used version [6]:

$$\mathbf{m}^{est} = \mathbf{Z}^{-1}\left(\mathbf{G}^T\mathbf{C}_d^{-1}\mathbf{d}^{obs} + \mathbf{H}^T\mathbf{C}_h^{-1}\mathbf{h}^{pri}\right) \quad\text{with}\quad \mathbf{Z} \equiv \mathbf{G}^T\mathbf{C}_d^{-1}\mathbf{G} + \mathbf{H}^T\mathbf{C}_h^{-1}\mathbf{H} \tag{1}$$
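The GLS solution (1) can be sketched in a few lines of NumPy. This is a minimal illustration, not the author's implementation: the function name `gls_solve` and the toy straight-line problem (its data values, variances and sizes) are assumptions made here for demonstration only. Linear solves are used in place of explicit matrix inverses.

```python
import numpy as np

def gls_solve(G, H, d_obs, h_pri, Cd, Ch):
    """Evaluate equation (1); returns (m_est, Z)."""
    Cd_inv_G = np.linalg.solve(Cd, G)   # Cd^{-1} G, without forming Cd^{-1}
    Ch_inv_H = np.linalg.solve(Ch, H)   # Ch^{-1} H
    Z = G.T @ Cd_inv_G + H.T @ Ch_inv_H
    # The transposes below are valid because Cd and Ch are symmetric.
    rhs = Cd_inv_G.T @ d_obs + Ch_inv_H.T @ h_pri
    return np.linalg.solve(Z, rhs), Z

# Hypothetical toy problem: straight-line fit with a weak "small model" prior.
x = np.linspace(0.0, 1.0, 5)
G = np.column_stack([np.ones_like(x), x])     # N = 5 data, M = 2 parameters
d_obs = np.array([1.1, 1.4, 2.1, 2.4, 3.0])
H, h_pri = np.eye(2), np.zeros(2)             # K = 2 prior equations
Cd = 0.1**2 * np.eye(5)
Ch = 10.0**2 * np.eye(2)

m_est, Z = gls_solve(G, H, d_obs, h_pri, Cd, Ch)

# At the minimum of Phi = E + L the gradient with respect to m vanishes.
grad = (-2.0 * G.T @ np.linalg.solve(Cd, d_obs - G @ m_est)
        - 2.0 * H.T @ np.linalg.solve(Ch, h_pri - H @ m_est))
```

The vanishing gradient is a quick sanity check that the returned `m_est` is the minimizer of the generalized error.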

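The assumption of linear kernels $\mathbf{G}$ and $\mathbf{H}$ is a very restrictive one. In the well-studied nonlinear generalization [1] [6], the products $\mathbf{G}\mathbf{m}$ and $\mathbf{H}\mathbf{m}$ are replaced with vector functions $\mathbf{g}(\mathbf{m})$ and $\mathbf{h}(\mathbf{m})$. Then, a common solution method is to linearize the data and prior information equations around a trial solution $\mathbf{m}^{(0)}$:

$$\begin{aligned}
\mathbf{G}^{(0)}\Delta\mathbf{m} &= \Delta\mathbf{d} \quad\text{with}\quad G^{(0)}_{ij} \equiv \left.\frac{\partial g_i}{\partial m_j}\right|_{\mathbf{m}^{(0)}} \quad\text{and}\quad \Delta\mathbf{d} \equiv \mathbf{d}^{obs} - \mathbf{g}(\mathbf{m}^{(0)}) \\
\mathbf{H}^{(0)}\Delta\mathbf{m} &= \Delta\mathbf{h} \quad\text{with}\quad H^{(0)}_{ij} \equiv \left.\frac{\partial h_i}{\partial m_j}\right|_{\mathbf{m}^{(0)}} \quad\text{and}\quad \Delta\mathbf{h} \equiv \mathbf{h}^{pri} - \mathbf{h}(\mathbf{m}^{(0)})
\end{aligned} \tag{2}$$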


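and $\Delta\mathbf{m} = \mathbf{m} - \mathbf{m}^{(0)}$. The solution is then found by iterative application of (1) to (2); that is, by the Gauss-Newton method [3]. Alternatively, a gradient-descent method [18] can be used that employs:

$$\left.\nabla_{\mathbf{m}}\Phi\right|_{\mathbf{m}^{(0)}} = -2\,\mathbf{G}^{(0)T}\mathbf{C}_d^{-1}\left(\mathbf{d}^{obs} - \mathbf{G}\mathbf{m}^{(0)}\right) - 2\,\mathbf{H}^{(0)T}\mathbf{C}_h^{-1}\left(\mathbf{h}^{pri} - \mathbf{H}\mathbf{m}^{(0)}\right) \tag{3}$$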

The latter approach is preferred for very large $M$, since the convergence rate of gradient descent is independent of its dimension [18], whereas the effort required to solve the $M \times M$ system (1) by a direct method scales as $M^3$ [19].

We now discuss issues related to the covariance matrices that appear in GLS.

The data covariance $\mathbf{C}_d$ quantifies the uncertainty of the observations and the information covariance $\mathbf{C}_h$ quantifies the uncertainty of the prior information. Prior knowledge of the inherent accuracy of the measurement technique is needed to assign $\mathbf{C}_d$, and prior knowledge of the physically-plausible solutions, perhaps stemming from an understanding of the underlying physics, is needed to assign $\mathbf{C}_h$. These assignments are often very subjective, especially when correlations are believed to occur (that is, $\mathbf{C}_d$ and $\mathbf{C}_h$ have non-zero off-diagonal elements). For example, one geotomographic study [7] reconstructs a two-dimensional field using a $\mathbf{C}_h$ that represents autocovariance of the field and that is dependent upon a scale length $q$. The value of $q$ is chosen on the basis of broad physical arguments that, while plausible, leave considerable room for subjectivity.

The matrices $\mathbf{C}_d$ and $\mathbf{C}_h$ together contain $\tfrac{1}{2}N(N+1) + \tfrac{1}{2}K(K+1)$ elements, many more than the $(N+K)$ constraints imposed by the data $\mathbf{d}$ and prior information $\mathbf{h}$. Consequently, insufficient information is available to uniquely solve for all the elements of $\mathbf{C}_d$ and $\mathbf{C}_h$. However, it sometimes may be possible to parameterize $\mathbf{C}_d(\mathbf{q})$ and/or $\mathbf{C}_h(\mathbf{q})$ in terms of $\mathbf{q} \in \mathbb{R}^J$, and ask whether an initial estimate of $\mathbf{q}$ can be improved. As long as $(M+J) < (N+K)$, adequate information may be available to determine a best estimate $\mathbf{q}^{est}$. We refer to the process of determining $\mathbf{q}^{est}$ as "tuning", since in typical practice it requires that the covariances be close to their true values.

As an example of a parametrized covariance, we consider the case where the model parameters represent a sampled version of a continuous function $m(x)$, where $x \in \mathbb{R}$ is an independent variable; that is, $m_n = m(x_n)$, with $x_n \equiv n\Delta x$ and $\Delta x$ the sampling interval. The prior information that $m(x)$ is approximately oscillatory with wavenumber $q$ can be modeled by:

$$\mathbf{H} = \mathbf{I} \quad\text{and}\quad \mathbf{h}^{pri} = \mathbf{0} \quad\text{and}\quad [\mathbf{C}_h]_{nm} = \sigma_h^2\cos\left(q(x_n - x_m)\right) \tag{4}$$

In this case, $\mathbf{C}_h$ approximates the autocovariance of $m(x)$, which is assumed to be stationary. The goal of tuning is to provide a best-estimate $q^{est}$, as well as a best-estimate $\mathbf{m}^{est}$ of the model parameters. This problem is further developed in Example 4, below.

Although the GLS formulation is widely used in geotomography and geophysical imaging, the tuning of variance is typically implemented in a very limited fashion, through the use of trade-off curves [7]-[12]. In this procedure, a scalar parameter $q$ controls the relative size of $\mathbf{C}_d$ and $\mathbf{C}_h$, that is, $\mathbf{C}_h(q) = q\,\mathbf{C}_h^{(0)}$, where $\mathbf{C}_h^{(0)}$ is specified [20]. The GLS problem is then solved for a suite of $q$s, the functions $E(q)$ and $L(q)$ are tabulated and the resulting trade-off curve $E(L)$ is used to identify a solution $\mathbf{m}(q_0)$ that has acceptably low $E$ and $L$ (for example, Figure 1 of [20]). As we will show below, this ad hoc procedure is not a consistent extension of GLS, because it results in a different $q$ than the one implied by Bayes' principle. A more consistent approach is to apply Bayes' theorem directly to estimate both the model parameters $\mathbf{m}$ and the covariance parameters $\mathbf{q}$. Such an approach has been implemented in the context of ordinary least squares [21] and the Markov chain Monte Carlo (MCMC) inversion method [22] (which is a computationally-intensive alternative to GLS). An important and novel result of this paper is a computationally-efficient procedure for tuning GLS in a Bayes-consistent manner.

2. Bayesian Extension of GLS

The general process of using Bayes' theorem to construct a posterior probability density function (p.d.f.) that depends on unknown parameters and of estimating those parameters through the maximization of probability is very well understood [23]. In the current case, the p.d.f. has $M$ model parameters and $J$ covariance parameters, so the maximization process (implemented, say, with a gradient ascent method) must search an $(M+J)$-dimensional space. Our main purpose here is to show that the process can be organized in a way that makes use of the GLS solution (1) and thus reduces the dimensionality of the searched space to $J$.

The GLS solution (1) yields the $\mathbf{m}$ that minimizes the generalized error $\Phi(\mathbf{m})$, or equivalently, the $\mathbf{m}$ that maximizes the Normal posterior probability density function (p.d.f.) $p(\mathbf{m}\,|\,\mathbf{d}^{obs}, \mathbf{h}^{pri})$:

$$\mathbf{m}^{est} = \arg\max_{\mathbf{m}}\; p(\mathbf{m}\,|\,\mathbf{d}^{obs}, \mathbf{h}^{pri}) \quad\text{with}\quad p(\mathbf{m}\,|\,\mathbf{d}^{obs}, \mathbf{h}^{pri}) \propto p(\mathbf{d}^{obs}\,|\,\mathbf{m})\; p(\mathbf{h}^{pri}\,|\,\mathbf{m}) \tag{5}$$

Here, Bayes' theorem [23] is used to relate the Normal posterior p.d.f. $p(\mathbf{m}\,|\,\mathbf{d}^{obs}, \mathbf{h}^{pri})$ to the Normal likelihood $p(\mathbf{d}^{obs}\,|\,\mathbf{m})$ and the Normal prior $p(\mathbf{h}^{pri}\,|\,\mathbf{m})$. When poorly known parameters $\mathbf{q}$ are added to the problem, they must be treated as additional random variables [22]. Writing $\mathbf{q} \equiv [\mathbf{q}^{(d)}, \mathbf{q}^{(h)}]^T$, with $\mathbf{q}^{(d)}$ appearing in the likelihood and $\mathbf{q}^{(h)}$ appearing in the prior, we have:

$$\mathbf{m}^{est}, \mathbf{q}^{est} = \arg\max_{\mathbf{m},\,\mathbf{q}^{(d)},\,\mathbf{q}^{(h)}}\; p(\mathbf{m}, \mathbf{q}\,|\,\mathbf{d}^{obs}, \mathbf{h}^{pri}) \quad\text{with}\quad p(\mathbf{m}, \mathbf{q}\,|\,\mathbf{d}^{obs}, \mathbf{h}^{pri}) \propto p(\mathbf{d}^{obs}\,|\,\mathbf{m}, \mathbf{q}^{(d)})\; p(\mathbf{h}^{pri}\,|\,\mathbf{m}, \mathbf{q}^{(h)})\; p(\mathbf{q}^{(d)})\; p(\mathbf{q}^{(h)}) \tag{6}$$

Here, we have assumed that $\mathbf{q}$ and $\mathbf{m}$ are not correlated with one another. The maximization with respect to the two variables can be performed as a sequence of two single-variable maximizations:


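$$\mathbf{m}(\mathbf{q}_0) = \arg\max_{\mathbf{m}}\; p(\mathbf{m}, \mathbf{q}_0\,|\,\mathbf{d}^{obs}, \mathbf{h}^{pri}) \quad (\text{at fixed } \mathbf{q}_0) \tag{7a}$$

$$\mathbf{q}^{est} = \arg\max_{\mathbf{q}_0}\; p(\mathbf{m}(\mathbf{q}_0), \mathbf{q}_0\,|\,\mathbf{d}^{obs}, \mathbf{h}^{pri}) \tag{7b}$$

$$\mathbf{m}^{est} = \mathbf{m}(\mathbf{q}^{est}) \tag{7c}$$

In the special case of the uniform prior $p(\mathbf{q}^{(d)})\,p(\mathbf{q}^{(h)}) \propto \text{constant}$, the maximization in (7a) is the GLS solution at fixed $\mathbf{q}_0$. For the Normal p.d.f.:

$$p(\mathbf{m}(\mathbf{q}_0), \mathbf{q}_0\,|\,\mathbf{d}^{obs}, \mathbf{h}^{pri}) = (2\pi)^{-\frac{1}{2}(N+K)}\,(\det\mathbf{C}_d)^{-\frac{1}{2}}\,(\det\mathbf{C}_h)^{-\frac{1}{2}}\exp\left\{-\tfrac{1}{2}E - \tfrac{1}{2}L\right\} \tag{8}$$

the maximization (7b) is equivalent to the minimization of an objective function $\Psi(\mathbf{q})$, defined as:

$$\Psi \equiv -2\left\{\ln p + \tfrac{1}{2}(N+K)\ln(2\pi)\right\} = \ln(\det\mathbf{C}_d) + \ln(\det\mathbf{C}_h) + E + L \tag{9}$$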

The quantity $\ln(\det\mathbf{C}_d)$ is best computed by finding the Choleski decomposition $\mathbf{C}_d = \mathbf{D}\mathbf{D}^T$, the algorithm [24] for which is implemented in many software environments, including MATLAB® and PYTHON/linalg. Then, $\ln(\det\mathbf{C}_d) = 2\sum_n \ln(D_{nn})$ (and similarly for $\ln(\det\mathbf{C}_h)$). The nonlinear optimization problem of minimizing $\Psi(\mathbf{q})$ can be implemented using a gradient descent method, provided that the derivative $\partial\Psi/\partial q_m$ can be calculated [18]. In the next section, we derive analytic formulas for this and related derivatives.
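The Choleski route to the log-determinant can be sketched as follows. The helper name `logdet_spd` and the exponential test covariance are illustrative assumptions, not part of the paper; `numpy.linalg.cholesky` returns the lower-triangular factor $\mathbf{D}$ with $\mathbf{C} = \mathbf{D}\mathbf{D}^T$.

```python
import numpy as np

def logdet_spd(C):
    """ln(det C) for a symmetric positive-definite C via C = D D^T."""
    D = np.linalg.cholesky(C)                 # lower-triangular factor
    return 2.0 * np.sum(np.log(np.diag(D)))   # 2 * sum_n ln(D_nn)

# Illustrative covariance (an assumption): exponential autocovariance on a grid.
x = np.linspace(0.0, 1.0, 50)
C = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.2)

value = logdet_spd(C)
reference = np.linalg.slogdet(C)[1]           # independent check
```

Working with the logarithms of the diagonal of $\mathbf{D}$ avoids the overflow or underflow that can occur when the determinant itself is formed for large matrices.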

3. Solution Method and Formula for Derivatives

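The process of simultaneously estimating the covariance parameters $\mathbf{q}^{est}$ and model parameters $\mathbf{m}^{est}$ consists of six steps. First, the analytic forms of the covariance matrices $\mathbf{C}_d(\mathbf{q})$ and $\mathbf{C}_h(\mathbf{q})$ are specified, and their derivatives $\partial\mathbf{C}_d/\partial q_m$ and $\partial\mathbf{C}_h/\partial q_m$ are computed analytically. Second, an initial estimate $\mathbf{q}^{(0)}$ is identified. Third, the covariance matrices $\mathbf{C}_d(\mathbf{q}^{(0)})$ and $\mathbf{C}_h(\mathbf{q}^{(0)})$ are inserted into (1), yielding model parameters $\mathbf{m}(\mathbf{q}^{(0)})$. Fourth, using formulas developed below, the value of the derivative $\partial\Psi/\partial q_m$ is calculated at $\mathbf{q}^{(0)}$. Fifth, a gradient descent method employing $\partial\Psi/\partial q_m$ is used to iteratively perturb $\mathbf{q}^{(0)}$ towards the minimum of $\Psi$ at $\mathbf{q}^{est}$ (and in the process, repeating steps three through five many times). Sixth, the estimated model parameters are computed as $\mathbf{m}^{est} = \mathbf{m}(\mathbf{q}^{est})$. This process is depicted in Figure 1.

Our derivation of $\partial\Psi/\partial q_m$ uses three matrix derivatives, $\partial\mathbf{M}^{-1}/\partial q$, $\partial\mathbf{M}^{-1/2}/\partial q$ and $\partial\ln(\det\mathbf{M})/\partial q$, that may be unfamiliar to some readers, so we derive them here for completeness. Let $\mathbf{M}(\mathbf{q})$ be a square, invertible, differentiable matrix. Differentiating $\mathbf{M}^{-1}\mathbf{M} = \mathbf{I}$ yields $[\partial\mathbf{M}^{-1}/\partial q_m]\,\mathbf{M} + \mathbf{M}^{-1}[\partial\mathbf{M}/\partial q_m] = \mathbf{0}$, which can be rearranged into ([25], their (36)):

$$\frac{\partial\mathbf{M}^{-1}}{\partial q_m} = -\mathbf{M}^{-1}\frac{\partial\mathbf{M}}{\partial q_m}\mathbf{M}^{-1} \tag{10}$$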


Figure 1. Schematic depiction of solution process. (a) The GLS solution $\mathbf{m}^{est}$ (red curve) is considered a function of the covariance parameters $\mathbf{q}$ and its derivative $\partial\mathbf{m}^{est}/\partial q_n$ (blue line) at a point $\mathbf{q}^{(0)}$ is computed by analytic differentiation of GLS equation (1); (b) The objective function $\Psi$ (colors) is considered a function of $\mathbf{q}$. The results of (a) are used to compute its gradient $\nabla_{\mathbf{q}}\Psi$ at the point $\mathbf{q}^{(0)}$. The gradient descent method is used to iteratively perturb this point anti-parallel to the gradient until it reaches the minimum $\Psi^{min}$ of the objective function, resulting in the best-estimate $\mathbf{q}^{est}$. This value is then used to determine a best-estimate of the model parameters $\mathbf{m}^{est}$, as depicted in (a).

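Similarly, differentiating $\mathbf{M}^{-1/2}\mathbf{M}^{-1/2} = \mathbf{M}^{-1}$ and applying (10) yields the Sylvester equation:

$$\frac{\partial\mathbf{M}^{-1/2}}{\partial q_m}\mathbf{M}^{-1/2} + \mathbf{M}^{-1/2}\frac{\partial\mathbf{M}^{-1/2}}{\partial q_m} = \frac{\partial\mathbf{M}^{-1}}{\partial q_m} = -\mathbf{M}^{-1}\frac{\partial\mathbf{M}}{\partial q_m}\mathbf{M}^{-1} \tag{11}$$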
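For a symmetric positive-definite $\mathbf{M}$, equation (11) can be solved with nothing more than an eigendecomposition. The sketch below is one possible route, not the general-purpose Sylvester solver cited in the text: in the eigenbasis of $\mathbf{M}$ the equation decouples element by element. The function names and the small test matrices `M0` and `M1` are assumptions chosen for illustration.

```python
import numpy as np

def inv_sqrt_spd(M):
    """Symmetric (principal) M^{-1/2} for symmetric positive-definite M."""
    w, V = np.linalg.eigh(M)
    return (V / np.sqrt(w)) @ V.T              # V diag(w^{-1/2}) V^T

def d_inv_sqrt_spd(M, dM):
    """Solve X A + A X = -M^{-1} dM M^{-1} for X = d(M^{-1/2})/dq, with A = M^{-1/2}."""
    w, V = np.linalg.eigh(M)
    a = 1.0 / np.sqrt(w)                       # eigenvalues of A
    M_inv = (V / w) @ V.T
    B = -M_inv @ dM @ M_inv                    # right-hand side of (11), from (10)
    B_t = V.T @ B @ V                          # rotate into the eigenbasis
    X_t = B_t / (a[:, None] + a[None, :])      # (a_i + a_j) X_ij = B_ij
    return V @ X_t @ V.T

# Hypothetical one-parameter family M(q) = M0 + q * M1.
M0 = np.array([[2.0, 0.5, 0.0], [0.5, 1.5, 0.3], [0.0, 0.3, 1.0]])
M1 = np.array([[0.4, 0.1, 0.0], [0.1, 0.2, 0.0], [0.0, 0.0, 0.3]])
q, h = 0.7, 1e-6

analytic = d_inv_sqrt_spd(M0 + q * M1, M1)
finite_diff = (inv_sqrt_spd(M0 + (q + h) * M1)
               - inv_sqrt_spd(M0 + (q - h) * M1)) / (2.0 * h)
```

The comparison against a central finite difference is the same kind of numerical check the paper reports for its formulas.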

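We have not been able to determine a source for this equation, but in all likelihood, it has been derived previously. In practice, (11) is not significantly harder to compute than (10), because efficient algorithms for solving Sylvester equations [26] and for computing a symmetric (principal) square root [27] are widely available and implemented in many software environments, including MATLAB® and PYTHON/linalg. The derivative of $\ln(\det\mathbf{C}_d)$ is derived starting with Jacobi's formula [12]:

$$\frac{\partial\det(\mathbf{M})}{\partial q} = \mathrm{tr}\left\{\mathrm{adj}(\mathbf{M})\frac{\partial\mathbf{M}}{\partial q}\right\} = \mathrm{tr}\left\{\det(\mathbf{M})\,\mathbf{M}^{-1}\frac{\partial\mathbf{M}}{\partial q}\right\} = \det(\mathbf{M})\;\mathrm{tr}\left\{\mathbf{M}^{-1}\frac{\partial\mathbf{M}}{\partial q}\right\} \tag{12}$$

where $\mathrm{adj}(\cdot)$ is the adjugate and $\mathrm{tr}(\cdot)$ is the trace, applying Laplace's identity [28] $\mathrm{adj}(\mathbf{C}_d) = \det(\mathbf{C}_d)\,\mathbf{C}_d^{-1}$ and the rule $\mathrm{tr}(c\mathbf{M}) = c\,\mathrm{tr}(\mathbf{M})$ (where $c$ is a scalar and $\mathbf{M}$ is a matrix) [29]. Finally, the determinant is moved to the left-hand side and the well-known relationship $\partial\ln(f)/\partial q = f^{-1}(\partial f/\partial q)$, for a differentiable function $f(q)$, is applied, yielding ([25], their (38)):

$$\frac{\partial\ln(\det\mathbf{M})}{\partial q} = \frac{1}{\det(\mathbf{M})}\frac{\partial\det(\mathbf{M})}{\partial q} = \mathrm{tr}\left\{\mathbf{M}^{-1}\frac{\partial\mathbf{M}}{\partial q}\right\} \tag{13}$$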
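Identities (10) and (13) are easy to confirm numerically. The following sketch compares each analytic derivative with a central finite difference for a hypothetical one-parameter matrix family $\mathbf{M}(q) = \mathbf{M}_0 + q\,\mathbf{M}_1$; the matrices and step size are assumptions made for the illustration.

```python
import numpy as np

# Hypothetical symmetric positive-definite family M(q) = M0 + q * M1.
M0 = np.array([[2.0, 0.5, 0.0], [0.5, 1.5, 0.3], [0.0, 0.3, 1.0]])
M1 = np.array([[0.4, 0.1, 0.0], [0.1, 0.2, 0.0], [0.0, 0.0, 0.3]])

def M_of(q):
    return M0 + q * M1

q, h = 0.7, 1e-6
M = M_of(q)
M_inv = np.linalg.inv(M)

# Equation (10): derivative of the inverse.
dMinv_analytic = -M_inv @ M1 @ M_inv
dMinv_fd = (np.linalg.inv(M_of(q + h)) - np.linalg.inv(M_of(q - h))) / (2.0 * h)

# Equation (13): derivative of ln(det M).
dlogdet_analytic = np.trace(np.linalg.solve(M, M1))
dlogdet_fd = (np.linalg.slogdet(M_of(q + h))[1]
              - np.linalg.slogdet(M_of(q - h))[1]) / (2.0 * h)
```

Using `numpy.linalg.slogdet` rather than `det` keeps the check well-behaved even when the determinant itself is very large or very small.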

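We begin the main derivation by considering the case in which data variance $\mathbf{C}_d(\mathbf{q})$ depends on a parameter vector $\mathbf{q}$, and the information variance $\mathbf{C}_h$ is constant. The derivative of the GLS solution can be found by applying the chain rule to (1):

$$\begin{aligned}
\frac{\partial\mathbf{m}^{est}}{\partial q_m} &= \frac{\partial\mathbf{Z}^{-1}}{\partial q_m}\mathbf{G}^T\mathbf{C}_d^{-1}\mathbf{d}^{obs} + \mathbf{Z}^{-1}\mathbf{G}^T\frac{\partial\mathbf{C}_d^{-1}}{\partial q_m}\mathbf{d}^{obs} + \frac{\partial\mathbf{Z}^{-1}}{\partial q_m}\mathbf{H}^T\mathbf{C}_h^{-1}\mathbf{h}^{pri} \\
&= \mathbf{Z}^{-1}\left\{\mathbf{G}^T\frac{\partial\mathbf{C}_d^{-1}}{\partial q_m}\mathbf{d}^{obs} - \frac{\partial\mathbf{Z}}{\partial q_m}\mathbf{m}^{est}\right\} \\
\text{with}\quad \frac{\partial\mathbf{Z}}{\partial q_m} &= \mathbf{G}^T\frac{\partial\mathbf{C}_d^{-1}}{\partial q_m}\mathbf{G} \quad\text{and}\quad \frac{\partial\mathbf{C}_d^{-1}}{\partial q_m} = -\mathbf{C}_d^{-1}\frac{\partial\mathbf{C}_d}{\partial q_m}\mathbf{C}_d^{-1}
\end{aligned} \tag{14}$$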

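Note that we have used (10). The derivatives of the normalized prediction error $\tilde{\mathbf{e}} \equiv \mathbf{C}_d^{-1/2}(\mathbf{d}^{obs} - \mathbf{G}\mathbf{m}^{est})$ and total error $E \equiv \tilde{\mathbf{e}}^T\tilde{\mathbf{e}}$ are:

$$\begin{aligned}
\frac{\partial\tilde{\mathbf{e}}}{\partial q_m} &= -\mathbf{C}_d^{-1/2}\mathbf{G}\frac{\partial\mathbf{m}^{est}}{\partial q_m} + \frac{\partial\mathbf{C}_d^{-1/2}}{\partial q_m}\left(\mathbf{d}^{obs} - \mathbf{G}\mathbf{m}^{est}\right) \quad\text{and}\quad \frac{\partial E}{\partial q_m} = 2\,\tilde{\mathbf{e}}^T\frac{\partial\tilde{\mathbf{e}}}{\partial q_m} \\
\text{with}\quad &\frac{\partial\mathbf{C}_d^{-1/2}}{\partial q_m}\mathbf{C}_d^{-1/2} + \mathbf{C}_d^{-1/2}\frac{\partial\mathbf{C}_d^{-1/2}}{\partial q_m} = -\mathbf{C}_d^{-1}\frac{\partial\mathbf{C}_d}{\partial q_m}\mathbf{C}_d^{-1}
\end{aligned} \tag{15}$$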

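Here, the Sylvester equation arises from (11). An alternate way of differentiating $E$ that does not require solving a Sylvester equation, written in terms of the un-normalized prediction error $\mathbf{e} = \mathbf{d}^{obs} - \mathbf{G}\mathbf{m}^{est}$, is:

$$\frac{\partial E}{\partial q_m} = \frac{\partial}{\partial q_m}\left(\mathbf{e}^T\mathbf{C}_d^{-1}\mathbf{e}\right) = -\frac{\partial\mathbf{m}^{est\,T}}{\partial q_m}\mathbf{G}^T\mathbf{C}_d^{-1}\mathbf{e} + \mathbf{e}^T\frac{\partial\mathbf{C}_d^{-1}}{\partial q_m}\mathbf{e} - \mathbf{e}^T\mathbf{C}_d^{-1}\mathbf{G}\frac{\partial\mathbf{m}^{est}}{\partial q_m} \tag{16}$$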

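The derivatives of the normalized error in prior information $\tilde{\boldsymbol{\ell}} = \mathbf{C}_h^{-1/2}(\mathbf{h}^{pri} - \mathbf{H}\mathbf{m}^{est})$ and total error $L \equiv \tilde{\boldsymbol{\ell}}^T\tilde{\boldsymbol{\ell}}$ are:

$$\frac{\partial\tilde{\boldsymbol{\ell}}}{\partial q_m} = -\mathbf{C}_h^{-1/2}\mathbf{H}\frac{\partial\mathbf{m}^{est}}{\partial q_m} \quad\text{and}\quad \frac{\partial L}{\partial q_m} = 2\,\tilde{\boldsymbol{\ell}}^T\frac{\partial\tilde{\boldsymbol{\ell}}}{\partial q_m} \tag{17}$$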

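Finally, since $\Psi = \ln(\det\mathbf{C}_d) + \ln(\det\mathbf{C}_h) + E + L$, we have:

$$\frac{\partial\Psi}{\partial q_m} = \frac{\partial\ln(\det\mathbf{C}_d)}{\partial q_m} + \frac{\partial E}{\partial q_m} + \frac{\partial L}{\partial q_m} = \mathrm{tr}\left\{\mathbf{C}_d^{-1}\frac{\partial\mathbf{C}_d}{\partial q_m}\right\} + \frac{\partial E}{\partial q_m} + \frac{\partial L}{\partial q_m} \tag{18}$$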

Note that we have applied (13).
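The chain of formulas (14), (16), (17) and (18) can be assembled into a single routine and checked against a finite difference. The sketch below assumes a one-parameter diagonal data covariance with variance $1 + q\,x_n$ and a weak identity prior; these choices, the synthetic data and the function names are illustrative assumptions, not the author's code. The un-normalized forms $E = \mathbf{e}^T\mathbf{C}_d^{-1}\mathbf{e}$ and $L = \boldsymbol{\ell}^T\mathbf{C}_h^{-1}\boldsymbol{\ell}$ are used so that no Sylvester equation is needed.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50
x = np.linspace(0.0, 1.0, N)
G = np.column_stack([np.ones(N), np.sqrt(x)])
H, h_pri = np.eye(2), np.zeros(2)
Ch = 1000.0**2 * np.eye(2)
# Synthetic data (assumption): noise variance grows linearly with x.
d_obs = G @ np.array([1.0, 2.0]) + np.sqrt(1.0 + 0.7 * x) * rng.standard_normal(N)

def Cd_of(q):
    return np.diag(1.0 + q * x)

dCd = np.diag(x)   # dCd/dq, independent of q for this parameterization

def psi_and_dpsi(q):
    Cd_inv, Ch_inv = np.linalg.inv(Cd_of(q)), np.linalg.inv(Ch)
    Z = G.T @ Cd_inv @ G + H.T @ Ch_inv @ H
    m = np.linalg.solve(Z, G.T @ Cd_inv @ d_obs + H.T @ Ch_inv @ h_pri)  # (1)
    e, l = d_obs - G @ m, h_pri - H @ m
    E, L = e @ Cd_inv @ e, l @ Ch_inv @ l
    psi = np.linalg.slogdet(Cd_of(q))[1] + np.linalg.slogdet(Ch)[1] + E + L
    dCd_inv = -Cd_inv @ dCd @ Cd_inv                            # (10)
    dZ = G.T @ dCd_inv @ G
    dm = np.linalg.solve(Z, G.T @ dCd_inv @ d_obs - dZ @ m)     # (14)
    dE = -2.0 * (e @ Cd_inv @ G @ dm) + e @ dCd_inv @ e         # (16)
    dL = -2.0 * (l @ Ch_inv @ H @ dm)                           # (17), un-normalized
    dpsi = np.trace(Cd_inv @ dCd) + dE + dL                     # (18)
    return psi, dpsi

q0, step = 0.3, 1e-5
psi0, dpsi0 = psi_and_dpsi(q0)
fd = (psi_and_dpsi(q0 + step)[0] - psi_and_dpsi(q0 - step)[0]) / (2.0 * step)
```

Agreement between `dpsi0` and `fd` indicates that the analytic derivative is consistent with the objective function for this test case.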

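Next, we consider the case in which the information variance $\mathbf{C}_h(\mathbf{q})$ depends on parameters $\mathbf{q}$, and $\mathbf{C}_d$ is constant. Since the data and prior information play completely symmetric roles in (1), the derivatives can be obtained by interchanging the roles of $\mathbf{C}_d$ and $\mathbf{C}_h$, $\mathbf{G}$ and $\mathbf{H}$, $\mathbf{d}^{obs}$ and $\mathbf{h}^{pri}$, $\mathbf{e}$ and $\boldsymbol{\ell}$, and $E$ and $L$, in the equations above, yielding:

$$\begin{aligned}
\frac{\partial\mathbf{m}^{est}}{\partial q_m} &= \mathbf{Z}^{-1}\left\{\mathbf{H}^T\frac{\partial\mathbf{C}_h^{-1}}{\partial q_m}\mathbf{h}^{pri} - \frac{\partial\mathbf{Z}}{\partial q_m}\mathbf{m}^{est}\right\} \\
\text{with}\quad \frac{\partial\mathbf{Z}}{\partial q_m} &= \mathbf{H}^T\frac{\partial\mathbf{C}_h^{-1}}{\partial q_m}\mathbf{H} \quad\text{and}\quad \frac{\partial\mathbf{C}_h^{-1}}{\partial q_m} = -\mathbf{C}_h^{-1}\frac{\partial\mathbf{C}_h}{\partial q_m}\mathbf{C}_h^{-1} \\
\frac{\partial\tilde{\mathbf{e}}}{\partial q_m} &= -\mathbf{C}_d^{-1/2}\mathbf{G}\frac{\partial\mathbf{m}^{est}}{\partial q_m} \quad\text{and}\quad \frac{\partial E}{\partial q_m} = 2\,\tilde{\mathbf{e}}^T\frac{\partial\tilde{\mathbf{e}}}{\partial q_m} \\
\frac{\partial\tilde{\boldsymbol{\ell}}}{\partial q_m} &= -\mathbf{C}_h^{-1/2}\mathbf{H}\frac{\partial\mathbf{m}^{est}}{\partial q_m} + \frac{\partial\mathbf{C}_h^{-1/2}}{\partial q_m}\left(\mathbf{h}^{pri} - \mathbf{H}\mathbf{m}^{est}\right) \\
\frac{\partial L}{\partial q_m} &= 2\,\tilde{\boldsymbol{\ell}}^T\frac{\partial\tilde{\boldsymbol{\ell}}}{\partial q_m} = -\frac{\partial\mathbf{m}^{est\,T}}{\partial q_m}\mathbf{H}^T\mathbf{C}_h^{-1}\boldsymbol{\ell} + \boldsymbol{\ell}^T\frac{\partial\mathbf{C}_h^{-1}}{\partial q_m}\boldsymbol{\ell} - \boldsymbol{\ell}^T\mathbf{C}_h^{-1}\mathbf{H}\frac{\partial\mathbf{m}^{est}}{\partial q_m} \\
&\frac{\partial\mathbf{C}_h^{-1/2}}{\partial q_m}\mathbf{C}_h^{-1/2} + \mathbf{C}_h^{-1/2}\frac{\partial\mathbf{C}_h^{-1/2}}{\partial q_m} = -\mathbf{C}_h^{-1}\frac{\partial\mathbf{C}_h}{\partial q_m}\mathbf{C}_h^{-1} \\
\frac{\partial\ln(\det\mathbf{C}_h)}{\partial q_m} &= \mathrm{tr}\left\{\mathbf{C}_h^{-1}\frac{\partial\mathbf{C}_h}{\partial q_m}\right\} \\
\frac{\partial\Psi}{\partial q_m} &= \mathrm{tr}\left\{\mathbf{C}_h^{-1}\frac{\partial\mathbf{C}_h}{\partial q_m}\right\} + \frac{\partial E}{\partial q_m} + \frac{\partial L}{\partial q_m}
\end{aligned} \tag{19}$$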

These formulas have been checked numerically.

4. Examples with Discussion

In the first example, we examine the simplistic case in which the parameter $q$ represents an overall scaling of variance; that is, $\mathbf{C}_d(q) = q\,\mathbf{C}_d^{(0)}$ and $\mathbf{C}_h(q) = q\,\mathbf{C}_h^{(0)}$, with specified $\mathbf{C}_d^{(0)}$ and $\mathbf{C}_h^{(0)}$. The solution $\mathbf{m}^{est}$ is independent of $q$, as can be verified by substitution into (1). The parameter $q$ can then be found by direct minimization of (9), which simplifies to:

$$\Psi = \ln\left(q^N\det\mathbf{C}_d^{(0)}\right) + \ln\left(q^K\det\mathbf{C}_h^{(0)}\right) + q^{-1}E_0 + q^{-1}L_0 \tag{20}$$

Here, we have used the rule $\det(q\mathbf{M}) = q^N\det(\mathbf{M})$ [25], valid for any $N \times N$ matrix $\mathbf{M}$, and have defined $E_0 \equiv E(q=1)$ and $L_0 \equiv L(q=1)$. The minimum occurs when:

$$\frac{\partial\Psi}{\partial q} = 0 = (N+K)\,q^{-1} - (E_0 + L_0)\,q^{-2} \quad\text{or}\quad q = \frac{E_0 + L_0}{N+K} \tag{21}$$
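The two claims of this example, that $\mathbf{m}^{est}$ does not depend on $q$ and that (21) locates the minimum of $\Psi$, can be checked on a small random problem. The sizes, covariances and helper names below are arbitrary assumptions chosen only for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, M = 8, 3, 3
G, H = rng.standard_normal((N, M)), np.eye(M)
d_obs, h_pri = rng.standard_normal(N), np.zeros(K)
Cd0, Ch0 = 0.5 * np.eye(N), 4.0 * np.eye(K)    # the specified Cd^(0) and Ch^(0)

def solve(q):
    """GLS solution (1) and the errors E, L for covariances q*Cd0 and q*Ch0."""
    Cd_inv, Ch_inv = np.linalg.inv(q * Cd0), np.linalg.inv(q * Ch0)
    Z = G.T @ Cd_inv @ G + H.T @ Ch_inv @ H
    m = np.linalg.solve(Z, G.T @ Cd_inv @ d_obs + H.T @ Ch_inv @ h_pri)
    e, l = d_obs - G @ m, h_pri - H @ m
    return m, e @ Cd_inv @ e, l @ Ch_inv @ l

def psi(q):
    """Objective function (9) evaluated at the GLS solution for this q."""
    _, E, L = solve(q)
    return np.linalg.slogdet(q * Cd0)[1] + np.linalg.slogdet(q * Ch0)[1] + E + L

m_at_1, E0, L0 = solve(1.0)
m_at_other, _, _ = solve(3.7)          # same solution for a different scaling
q_star = (E0 + L0) / (N + K)           # equation (21)
```

Evaluating `psi` slightly to either side of `q_star` gives larger values, consistent with (21) being a minimum.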

This is a generalization of the well-known maximum likelihood estimate of the sample variance [30]. As long as $(E_0 + L_0)$ exists, the minimization in (21) is well-behaved and the overall scaling $q$ is uniquely determined.

In the second example, we examine another simplistic case in which a parameter $q$ represents the relative weighting of variance; that is, $\mathbf{C}_d(q) = q^{-1}\mathbf{I}$ and $\mathbf{C}_h(q) = (1-q)^{-1}\mathbf{I}$. We consider the problem of estimating the mean $m_1$ of data given observations $\mathbf{d} = \mathbf{1}$ and prior information $\mathbf{h} = \mathbf{0}$ (where $\mathbf{0}$ and $\mathbf{1}$ are vectors of zeros and ones, respectively), when $N = K$, $M = 1$ and $\mathbf{G} = \mathbf{H} = \mathbf{1}$. Applying (1), we find that $m^{est} = q$. Then, the objective function is $\Psi = \ln(q^{-N}) + \ln((1-q)^{-N}) + Nq(1-q)$ and its derivative is $\partial\Psi/\partial q = N\{-q^{-1} + (1-q)^{-1} + (1-q) - q\}$. The solution to $\partial\Psi/\partial q = 0$ is $q^{est} = 1/2$, as can be verified by direct substitution. Thus, the solution splits the difference between the observations and the prior values, and yields prior variances $\mathbf{C}_d$ and $\mathbf{C}_h$ that are equal. While simplistic, this problem illustrates that, at least in some cases, GLS is capable of uniquely determining the relative sizes of $\mathbf{C}_d$ and $\mathbf{C}_h$. Because trade-off curves, as defined in the Introduction, are based on the behavior of $E$ and $L$, and not the complete objective function $\Psi$, the weighting parameter $q_0$ estimated from them in general will be different from $q^{est}$. Consequently, the trade-off curve procedure is not consistent with the Bayesian framework upon which GLS rests.
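The second example is simple enough to verify with a grid search. The sketch below evaluates the objective function $\Psi(q) = -N\ln q - N\ln(1-q) + Nq(1-q)$ and its derivative for an arbitrarily chosen $N$ (an assumption; the location of the minimum does not depend on it).

```python
import math

N = 10  # number of data, equal to the number of prior values (N = K)

def psi(q):
    """Objective function of the second example."""
    return -N * math.log(q) - N * math.log(1.0 - q) + N * q * (1.0 - q)

def dpsi(q):
    """Its derivative with respect to q."""
    return N * (-1.0 / q + 1.0 / (1.0 - q) + (1.0 - q) - q)

# Grid search over the open interval (0, 1), where both variances are positive.
grid = [i / 1000.0 for i in range(1, 1000)]
q_best = min(grid, key=psi)
```

Because the second derivative $N\{q^{-2} + (1-q)^{-2} - 2\}$ is positive on $(0, 1)$, the stationary point at $q = 1/2$ is the unique minimum.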

Our third example demonstrates the tuning of data covariance $\mathbf{C}_d(q)$. In many cases, observational error increases during the course of an experiment, due to degradation of equipment or to worsening environmental conditions. The example demonstrates that the method is capable of accurately quantifying the fractional rate of increase $q$ of the variance $\sigma_{dn}^2$, which is assumed to vary with position $x_n$. In our simulation, we consider $N = 201$ synthetic data, evenly-spaced on the interval $0 \le x_i \le 1$, which scatter around the curve $d_i = m_1 + m_2 x_i^{1/2}$ (Figure 2). The covariance of the data is modeled as $[\mathbf{C}_d]_{mn} = \sigma_{dn}^2\delta_{mn}$, where $\sigma_{dn}^2 = (1)^2\left(1 + q(x_n - x_1)\right)$ and $\delta_{mn}$ is the Kronecker delta; that is, the data are uncorrelated and their variance increases linearly with $x$. The derivative of the covariance is $(\partial/\partial q)[\mathbf{C}_d]_{mn} = (1)^2(x_n - x_1)\,\delta_{mn}$. We have included prior information with $\mathbf{H} = \mathbf{I}$ and $\mathbf{h}^{pri} = \mathbf{0}$, which implements the notion that the model parameters are small. The corresponding covariance is chosen to be large, $\mathbf{C}_h = (1000)^2\mathbf{I}$, indicating that this information is weak. The goal is to tune the rate of increase of variance and to arrive at a best-estimate of the two model parameters. The starting value is taken to be $q_0 = 0$, which corresponds to uniform variance. It is successively improved by a gradient descent method that minimizes $\Psi$, yielding an estimated value $q^{est} \approx 0.709$. This estimate differs from the true value $q^{true} = 0.700$ by about 1%. The estimated solution $\mathbf{m}^{est}$ differs from $\mathbf{m}(q = 0)$ by a few tenths of a percent, which may be significant in some applications.
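A tuning loop in the spirit of this example might look like the sketch below. It is not the author's MATLAB code: the random seed, the true model parameters, the variance law $1 + q\,x_n$ (taking $x_1 = 0$), and the backtracking step-size rule are all assumptions made for illustration. One simplification is used and should be noted: because $\mathbf{m}(q)$ minimizes $E + L$ at fixed $q$, the terms of (18) proportional to $\partial\mathbf{m}^{est}/\partial q$ cancel, leaving only the explicit dependence of $\Psi$ on $q$.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 201
x = np.linspace(0.0, 1.0, N)
G = np.column_stack([np.ones(N), np.sqrt(x)])
H, h_pri = np.eye(2), np.zeros(2)
Ch_inv = np.eye(2) / 1000.0**2
q_true = 0.7
d_obs = G @ np.array([1.0, 2.0]) + np.sqrt(1.0 + q_true * x) * rng.standard_normal(N)

def psi_and_grad(q):
    var = 1.0 + q * x                       # diagonal of Cd(q); dCd/dq = diag(x)
    w = 1.0 / var
    Z = G.T @ (w[:, None] * G) + H.T @ Ch_inv @ H
    m = np.linalg.solve(Z, G.T @ (w * d_obs) + H.T @ Ch_inv @ h_pri)
    e, l = d_obs - G @ m, h_pri - H @ m
    # ln(det Ch) is constant in q and is omitted from psi.
    psi = np.sum(np.log(var)) + e @ (w * e) + l @ Ch_inv @ l
    # tr(Cd^-1 dCd) + e^T d(Cd^-1) e; the dm/dq terms cancel at the GLS solution.
    grad = np.sum(x * w) - np.sum(x * (w * e) ** 2)
    return psi, grad, m

q, step = 0.0, 0.01                         # start from uniform variance
psi, grad, m = psi_and_grad(q)
psi_start = psi
for _ in range(500):
    if abs(grad) < 1e-8:
        break
    q_new = q - step * grad
    if q_new <= -0.99:                      # keep every variance positive
        step *= 0.5
        continue
    psi_new, grad_new, m_new = psi_and_grad(q_new)
    if psi_new < psi:                       # accept only steps that lower psi
        q, psi, grad, m = q_new, psi_new, grad_new, m_new
        step *= 1.2
    else:
        step *= 0.5
```

After the loop, `q` holds the tuned rate and `m` the corresponding model parameters; how close `q` comes to `q_true` depends on the particular noise realization.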

Figure 2. Example of tuning $\mathbf{C}_d(q)$. (a) Plot of synthetic data (red dots) and predicted data (green curve); (b) The starting value $q_0 = 0$ corresponds to uniform variance (black curve). The estimate $q^{est}$ corresponds to increasing variance (green curve); (c) Generalized error $\Phi(q)$ (black curve). The starting value $q_0$ (black circle) is successively improved (red circles) by a gradient descent method, yielding an estimate $q^{est}$ (green circle); (d) The gradient $\partial\Phi/\partial q$, computed using the formulas developed in the text; (e) The first model parameter $m_1(q)$, highlighting the initial value (black circle) and estimated value (green circle); (f) Same as (e), except for the second model parameter $m_2(q)$.


The fourth example demonstrates tuning of information covariance $\mathbf{C}_h(q)$. In many instances, one may need to "reconstruct" or "interpolate" a function on the basis of unevenly and sparsely sampled data. In this case, prior information on the autocovariance of the function can enable a smooth interpolation. Furthermore, it can enforce a covariance structure that may be required, say, by the underlying physics of the problem. In our example, we suppose that the function is known to be oscillatory on physical grounds, but that the wavenumber of those oscillations is known only imprecisely. The goal is to tune prior knowledge of wavenumber to arrive at a best-estimate of the reconstructed function. In our simulation, a total of $M = 101$ model parameters $m_j$ are uniformly spaced on the interval $0 \le x \le 100$ and represent a sampled version of a continuous, sinusoidal function $m(x)$ with wavenumber $q^{true} = 0.1571$ (Figure 3). Synthetic data $d_i^{obs}$ with uncorrelated error with variance $\sigma_d^2 = (0.01)^2$ are available for $N = 40$ randomly-chosen points $x_{j(i)}$, where the index function $j(i)$ aligns in $x$ observations to model parameters. The data kernel is $G_{ij} = \delta_{j,j(i)}$. The prior information is given in (4), with autocovariance $[\mathbf{C}_h]_{nm} = \sigma_h^2\cos\left(q(x_n - x_m)\right)$ and $\sigma_h^2 = (10)^2$. The derivative is $(\partial/\partial q)[\mathbf{C}_h]_{nm} = -\sigma_h^2\,(x_n - x_m)\sin\left(q(x_n - x_m)\right)$. An initial guess $q_0 = 0.95\,q^{true}$ is improved using a gradient descent method, yielding an estimated value of $q^{est} = 0.1571$ that differs from $q^{true}$ by less than 0.01%. The reconstructed function is smooth and sinusoidal and the fit to the data is much improved.

Examples three and four were implemented in MATLAB® and executed in less than 5 s on a notebook computer. They confirm the flexibility, speed and effectiveness of the method. An ability to tune prior information on autocovariance may be of special utility in seismic exploration applications, where three-dimensional waveform datasets are routinely interpolated.

Figure 3. Example of tuning $\mathbf{C}_h(q)$. Sparsely-sampled synthetic data $d_i^{obs}$ (red dots) are oscillatory. (a) A regularly-sampled version $m_j^{est}$ is created by imposing the oscillatory covariance $[\mathbf{C}_h]_{nm} = \sigma_h^2\cos\left(q(x_n - x_m)\right)$. With the starting value $q_0 = 0.9500\,q^{true}$, the reconstruction poorly fits the data (black curve). Tuning leads to a better fit (green curve with dots), as well as a precise estimate of wavenumber $q^{est} \approx 0.9999\,q^{true}$; (b) Decrease in $\Psi_n$ with iteration number $n$ during the gradient descent process.

A limitation of this overall "parametric" approach is that the solution is dependent on the choice of parameterization, which must be guided by prior knowledge of the general properties of the covariance matrices in the particular problem being solved. In Example 3, we were able to recognize (say, by visually examining the data plotted in Figure 2(a)) that observational error increases with $x$ and chose $[\mathbf{C}_d]_{mn} = \sigma_d^2\left(1 + q(x_n - x_1)\right)\delta_{mn}$ that matched this scenario. If, instead, the degree of correlation between successive data increased with $x$, this pattern might be less expected, more difficult to detect, and require a different parameterization, say, $[\mathbf{C}_d(q)]_{nm} = \sigma_d^2\exp\left\{-\tfrac{1}{2}q\,(x_n + x_m)\,|x_n - x_m|\right\}$.

Not every parameterization of $\mathbf{C}_d$ (or $\mathbf{C}_h$) is necessarily well-behaved. To avoid poor behavior, the parameterization must be chosen so its determinant does not have zeros at values of $\mathbf{q}$ that will prevent the steepest descent process from converging to the global minimum. That this choice can be problematical is illustrated by the simple Toeplitz version of $\mathbf{C}_d$ (with $N = 10$, $J = 9$):

$$\mathbf{C}_d = \begin{bmatrix}
1 & q_1 & q_2 & q_3 & \cdots & q_9 \\
q_1 & 1 & q_1 & q_2 & \cdots & q_8 \\
q_2 & q_1 & 1 & q_1 & \cdots & q_7 \\
q_3 & q_2 & q_1 & 1 & \cdots & q_6 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
q_9 & q_8 & q_7 & q_6 & \cdots & 1
\end{bmatrix} \tag{22}$$

with $|q_i| < 1$. This form is useful for quantifying correlations within a stationary sequence of data [31]. Yet as is illustrated in Figure 4, the $\mathbb{R}^J$ volume is crossed by many $\det\mathbf{C}_d = 0$ surfaces that correspond to surfaces of singular objective function $\Psi$. Their presence suggests that the steepest descent path between a starting value $\mathbf{q}^{(0)}$ and the global minimum at $\mathbf{q}^{est}$ may be very convoluted (if, indeed, such a path exists) unless $\mathbf{q}^{(0)}$ is very close to $\mathbf{q}^{est}$.

Figure 4. The function $\det\mathbf{C}_d(\mathbf{q}) = 0$ for the case given by (22). (a) The $(q_1, q_2)$ surface for $q_3 = -0.95$ and the other $q$s randomly assigned; (b) Same as (a), but with $q_3 = 0.00$; (c) Same as (a), but with $q_3 = 0.95$; (d) Perspective view of the surfaces in the $(q_1, q_2, q_3)$ volume. The positions of the three slices in (a), (b) and (c) are noted on the $q_3$-axis (green arrows). A question posed in the text is whether, given an arbitrary point $\mathbf{q}^{(0)}$ and the global minimum of the objective function, say at $\mathbf{q}^{est}$ (and with both points satisfying $\det\mathbf{C}_d > 0$), a steepest-descent path necessarily exists between them.
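The occurrence of $\det\mathbf{C}_d = 0$ surfaces can be seen even in a much smaller case than (22). The sketch below builds the symmetric Toeplitz matrix for an arbitrary parameter vector and evaluates its determinant at two points of a hypothetical $N = 3$, $J = 2$ problem; the function name and the chosen points are assumptions for illustration. For this small case the determinant is $1 - 2q_1^2 + 2q_1^2 q_2 - q_2^2$, which vanishes on the curve $q_2 = 2q_1^2 - 1$.

```python
import numpy as np

def toeplitz_cov(q):
    """Symmetric Toeplitz matrix with first column (1, q_1, ..., q_J)."""
    c = np.concatenate(([1.0], np.asarray(q, dtype=float)))
    lag = np.abs(np.arange(c.size)[:, None] - np.arange(c.size)[None, :])
    return c[lag]                      # element (i, j) depends only on |i - j|

# Two points with |q_i| < 1 that lie on opposite sides of a det = 0 surface.
det_inside = np.linalg.det(toeplitz_cov([0.5, 0.25]))    # positive-definite side
det_outside = np.linalg.det(toeplitz_cov([0.9, 0.0]))    # determinant is negative
```

A descent step that moves $\mathbf{q}$ from the first region into the second would yield a matrix that is no longer a valid covariance, so an implementation would need to reject such steps (for example, by detecting a failed Choleski decomposition).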

5. Conclusion

Generalized Least Squares requires the assignment of two prior covariance matrices, the prior covariance of the data and the prior covariance of the prior information. Making these assignments is often a very subjective process. However, in cases in which the forms of these matrices can be anticipated up to a set of poorly-known parameters, information contained within the data and prior information can be used to improve knowledge of them, a process we call "tuning". Tuning can be achieved by minimizing an objective function that depends on both the generalized error and determinants of the covariance matrices to arrive at a best estimate of the parameters. Analytic and computationally-tractable formulas are derived for the derivative needed to implement the minimization via a gradient descent method. Furthermore, the problem is organized so that the minimization need be performed only over the space of covariance parameters, and not over the typically-much-larger space of model and covariance parameters. Although some care needs to be exercised as the covariance matrices are parametrized, the minimization is tractable and can lead to better estimates of the model parameters. An important outcome of this study is the recognition that the use of trade-off curves to determine relative weighting of covariance (a practice ubiquitous in geophysical imaging) is not consistent with the underlying Bayesian framework of Generalized Least Squares. The strategy outlined here provides a consistent solution.

Acknowledgements

The author thanks Roger Creel for helpful discussion.

Conflicts of Interest

The author declares no conflicts of interest regarding the publication of this paper.

References

[1] Tarantola, A. and Valette, B. (1982) Generalized Non-Linear Inverse Problems Solved Using the Least Squares Criterion. Reviews of Geophysics and Space Physics, 20, 219-232. https://doi.org/10.1029/RG020i002p00219
[2] Tarantola, A. and Valette, B. (1982) Inverse Problems = Quest for Information. Journal of Geophysics, 50, 159-170. https://n2t.net/ark:/88439/y048722
[3] Menke, W. (2018) Geophysical Data Analysis: Discrete Inverse Theory. 4th Edition, Elsevier, 350 p.
[4] Menke, W. and Menke, J. (2016) Environmental Data Analysis with MATLAB. 2nd Edition, Elsevier, 3342 p. https://doi.org/10.1016/B978-0-12-804488-9.00001-X
[5] Tarantola, A. (2005) Inverse Problem Theory and Methods for Model Parameter Estimation. SIAM: Society for Industrial and Applied Mathematics, 342 p. https://doi.org/10.1137/1.9780898717921
[6] Menke, W. (2014) Review of the Generalized Least Squares Method. Surveys in Geophysics, 36, 1-25. https://doi.org/10.1007/s10712-014-9303-1
[7] Abers, G. (1994) Three-Dimensional Inversion of Regional P and S Arrival Times in the East Aleutians and Sources of Subduction Zone Gravity Highs. Journal of Geophysical Research, 99, 4395-4412. https://doi.org/10.1029/93JB03107
[8] Schmandt, B. and Lin, F.-C. (2014) P and S Wave Tomography of the Mantle beneath the United States. Geophysical Research Letters, 41, 6342-6349. https://doi.org/10.1002/2014GL061231
[9] Menke, W. (2005) Case Studies of Seismic Tomography and Earthquake Location in a Regional Context. Geophysical Monograph 157. American Geophysical Union, Washington DC. https://doi.org/10.1029/157GM02
[10] Nettles, M. and Dziewonski, A.M. (2008) Radially Anisotropic Shear Velocity Structure of the Upper Mantle Globally and Beneath North America. Journal of Geophysical Research, 113, B02303. https://doi.org/10.1029/2006JB004819
[11] Chen, W. and Ritzwoller, M.H. (2016) Crustal and Uppermost Mantle Structure Beneath the United States. Journal of Geophysical Research, 121, 4306-4342. https://doi.org/10.1002/2016JB012887
[12] Humphreys, E.D., Dueker, K.G., Schutt, D.L. and Smith, R.B. (2000) Beneath Yellowstone: Evaluating Plume and Nonplume Models Using Teleseismic Images of the Upper Mantle. GSA Today, 10, 1-7. https://www.geosociety.org/gsatoday/archive/10/12/
[13] Gillet, N., Schaeffer, N. and Jault, D. (2011) Rationale and Geophysical Evidence for Quasi-Geostrophic Rapid Dynamics within the Earth's Outer Core. Physics of the Earth and Planetary Interiors, 187, 380-390. https://doi.org/10.1016/j.pepi.2011.01.005
[14] Zhao, S. (2013) Lithosphere Thickness and Mantle Viscosity Estimated from Joint Inversion of GPS and GRACE-Derived Radial Deformation and Gravity Rates in North America. Geophysical Journal International, 194, 1455-1472. https://doi.org/10.1093/gji/ggt212
[15] Menke, W. and Eilon, Z. (2015) Relationship between Data Smoothing and the Regularization of Inverse Problems. Pure and Applied Geophysics, 172, 2711-2726. https://doi.org/10.1007/s00024-015-1059-0
[16] Voorhies, C.F. (1986) Steady Flows at the Top of Earth's Core Derived from Geomagnetic Field Models. Journal of Geophysical Research, 91, 12444-12466. https://doi.org/10.1029/JB091iB12p12444
[17] Yao, Z.S. and Roberts, R.G. (1999) A Practical Regularization for Seismic Tomography. Geophysical Journal International, 138, 293-299. https://doi.org/10.1046/j.1365-246X.1999.00849.x
[18] Snyman, J.A. and Wilke, D.N. (2018) Practical Mathematical Optimization: Basic Optimization Theory and Gradient-Based Algorithms. Springer Optimization and Its Applications, 2nd Edition, Springer, New York, 340 p.
[19] Hildebrand, F.B. (1987) Introduction to Numerical Analysis. 2nd Edition, Dover Publications, New York.
[20] Zaroli, C., Sambridge, M., Lévêque, J.-J., Debayle, E. and Nolet, G. (2013) An Objective Rationale for the Choice of Regularization Parameter with Application to Global Multiple-Frequency S-Wave Tomography. Solid Earth, 4, 357-371. https://doi.org/10.5194/se-4-357-2013
[21] Malinverno, A. and Parker, R.L. (2006) Two Ways to Quantify Uncertainty in Geophysical Inverse Problems. Geophysics, 71, W15-W27. https://doi.org/10.1190/1.2194516
[22] Malinverno, A. and Briggs, V.A. (2004) Expanded Uncertainty Quantification in Inverse Problems: Hierarchical Bayes and Empirical Bayes. Geophysics, 69, 877-1103. https://doi.org/10.1190/1.1778243
[23] Box, G.E.P. and Tiao, G.C. (1992) Bayesian Inference in Statistical Analysis. Wiley, New York, 589 p. https://doi.org/10.1002/9781118033197
[24] Schmidt, E. (1973) Cholesky Factorization and Matrix Inversion, National Oceanic and Atmospheric Administration Technical Report NOS-56. US Government Printing Office, Washington DC. https://books.google.com/books?id=MiRHAQAAIAAJ
[25] Petersen, K.B. and Pedersen, M.S. (2008) The Matrix Cookbook, 71 p. https://archive.org/details/imm3274
[26] Bartels, R.H. and Stewart, G.W. (1972) Solution of the Matrix Equation AX + XB = C. Communications of the ACM, 15, 820-826. https://doi.org/10.1145/361573.361582
[27] Higham, N.J. (1987) Computing Real Square Roots of a Real Matrix. Linear Algebra and its Applications, 88-89, 405-430. https://doi.org/10.1016/0024-3795(87)90118-2
[28] Magnus, J.R. and Neudecker, H. (1999) Matrix Differential Calculus with Applications in Statistics and Econometrics, Revised Edition. John Wiley and Sons, New York, 424 p.
[29] Gantmacher, F.R. (1960) The Theory of Matrices, Volume 1. Chelsea Publishing, New York, 374 p.
[30] Fisher, R.A. (1925) Theory of Statistical Estimation. Mathematical Proceedings of the Cambridge Philosophical Society, 22, 700-725. https://doi.org/10.1017/S0305004100009580
[31] Claerbout, J.F. (1985) Fundamentals of Geophysical Data Processing with Applications to Petroleum Prospecting. Blackwell Scientific Publishing, Oxford, UK, 267 p.

DOI: 10.4236/am.2021.123011 170 Applied Mathematics

Applied Mathematics, 2021, 12, 171-208 https://www.scirp.org/journal/am ISSN Online: 2152-7393 ISSN Print: 2152-7385

Information Models for Forecasting Nonlinear Economic Dynamics in the Digital Era

Askar Akaev, Viktor Sadovnichiy

Moscow State University, Moscow,

How to cite this paper: Akaev, A. and Sadovnichiy, V. (2021) Information Models for Forecasting Nonlinear Economic Dynamics in the Digital Era. Applied Mathematics, 12, 171-208. https://doi.org/10.4236/am.2021.123012

Received: January 21, 2021
Accepted: March 16, 2021
Published: March 19, 2021

Copyright © 2021 by author(s) and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY 4.0). http://creativecommons.org/licenses/by/4.0/ Open Access

Abstract

The aim of this study was to develop an adequate mathematical model for long-term forecasting of technological progress and economic growth in the digital age (2020-2050). In addition, the task was to develop a model for forecast calculations of labor productivity in the symbiosis of “man + intelligent machine”, where an intelligent machine (IM) is understood as a computer or robot equipped with elements of artificial intelligence (AI), as well as in the digital economy as a whole. In the course of the study, it was shown that, for these goals, the Schumpeter-Kondratiev innovation and cycle theory of the formation of long waves (LW) of economic development, driven by a powerful cluster of technologies engendered by industrial revolutions, is the most appropriate basis for long-term forecasting of technological progress and economic growth. The Solow neoclassical model of economic growth, synchronized with LW, makes it possible to forecast the economic dynamics of technologically advanced countries with greater precision over a horizon of up to 30 years, which corresponds to the duration of an LW. In the information and digital age, the key role among the main factors of growth (capital, labour and technological progress) is played by the latter. The authors have developed an information model which allows for forecasting technological progress based on the growth rates of endogenous technological information in the economy. The main regimes of producing technological information, corresponding to the eras of the information and digital economies, are given in the article, as well as the Lagrangians that engender them. The model was verified on the example of the 5th information LW for the US economy (1982-2018) and approximated both technological progress and economic growth with high accuracy. A number of new results were obtained using the developed information models for forecasting technological progress. The forecast trajectory of economic growth of developed countries (on the example of the USA) on the upward stage of the 6th LW (2018-2042), engendered by the digital technologies of the 4th Industrial Revolution, is given. It is also demonstrated that the symbiosis of human and intelligent machine (IM) is the driving force in the digital economy, where the human plays the leading role in organizing effective and efficient joint work. The authors suggest a mathematical model for calculating labour productivity in the digital economy, where the symbiosis of “human + IM” is widely used. The calculations carried out with the help of the model show: 1) the symbiosis of “human + IM” makes it possible from the very beginning to realize the potential of digital technologies for increasing work performance in the economy; 2) the highest labour productivity is achieved in the symbiosis of “human + IM” where human labour prevails, and the lowest labour productivity is seen where the largest part of the work is performed by IM; 3) developed countries may achieve labour productivity growth of 3% per year by the mid-2020s, which is likely to persist until the 2040s.

DOI: 10.4236/am.2021.123012 Mar. 19, 2021 171 Applied Mathematics

Keywords The Schumpeter-Kondratiev Innovation and Cycle Theory of Economic Development, The Solow Neoclassical Model of Economic Growth, Information Model of Technological Progress, Symbiosis of “Human + Intelligent Machine”, Labour Productivity in the Symbiosis of “Human + IM” and the Digital Economy

1. Introduction

The widespread use of digital technologies has led to increasing attention being paid to issues such as the relationship between the digital economy and the development of new business models [1], the role of new breakthrough technologies in changing economic relations [2], and reducing the main costs associated with digital economic activity [3]. As a first step in understanding the nature of digital transformation, it is proposed to distinguish between the “digital sector” and the increasingly expanding trend of digitalization of the modern economy, which is often called the “digital economy” [4]. New approaches to classifying sectors are being formed depending on the degree of their transition to digital technologies [5]. A true “digital economy” is now defined as “a part of the economic output generated exclusively or predominantly by digital technologies and business models based on digital goods or services” [6]. Since digital technologies cause accelerated changes in the organization of labor, proposals for new social approaches to the development and use of innovative models in the digital economy are of great interest [7]. One of the most recent publications [8] addresses the issues of creating a unified system for measuring the digital economy, including a set of existing indicators for measuring jobs, employee skills, as well as the infrastructure for implementing digital technologies. However, we did not find any work on modeling and forecasting the dynamics of the digital economy for the future, which is necessary for developing a long-term development strategy.


It should also be noted that certain aspects of the digital economy are evaluated differently by the expert community. Some believe that the future belongs to robots and people working together, as automation gives people the opportunity to focus on more skilled and highly paid tasks [9]. However, measured productivity growth in the US decreased by half over the past decade, whereas the real income of most Americans has not changed since the 1990s. In this regard, the authors of the study [10] believe that the impressive capabilities of intelligent machines will not be fully realized until special additional innovations are developed and implemented. That is why it is proposed to resolve the discrepancy between the skill requirements for working with new technologies and the possibility of implementing automation with elements of artificial intelligence by means of other technologies that increase productivity [11]. Unfortunately, there is still no model for calculating labor productivity in the symbiosis of “man + intelligent machine” with which to evaluate the effectiveness of the methods of organizing labor in the digital economy proposed in the aforementioned studies.

As for the impact of IT on economic growth and the growth of aggregate factor productivity, it should be noted that such estimates already exist [12]. There is also a study of the quantitative impact of IT on economic growth based on data from 39 African countries for 2012-2016 [13]. However, models for calculating and predicting growth in the digital age do not yet exist.

As can be seen from this brief review, many problematic issues of digital economy formation have already been investigated, and quite interesting results answering these questions have been obtained. In our work, we aimed to supplement and expand the existing methods of studying the digital economy, in particular on the basis of the information model developed by us [14], which allows us to predict technological progress on a long-term basis using the growth rate of endogenous technological information in the economy. The article presents the main modes of production of technological information corresponding to the epochs of informatization and digitalization of the economy, as well as the Lagrangians that generate them. The model was tested (verified) on the example of the 5th information LW for the US economy (1982-2018) and showed a very high accuracy of approximation both for technological progress and for calculating economic growth. Using the developed model, we give an example of calculating the forecast trajectory of economic growth for the United States at the stage of digital economy formation (2018-2042), as well as a forecast calculation of labor productivity in the symbiosis of “man + intelligent machine”.

2. The Schumpeter-Kondratiev Innovation and Cycle Theory of Economic Development

The crisis of the world financial and economic system in 2008-2009, which led to the Great Recession in the USA and to a downturn in the majority of developed economies, and the long global depression that followed and lasted for almost 10 years, reminded politicians, economists and businessmen that the market economy develops in an uneven, unstable and cyclic manner. Crises and depressions are a regular feature of it, and governments must forecast them and try to mitigate them. In such periods of cyclic recessions and depressions in the world economy, scholars tend to turn to the Kondratiev theory of long waves (LW) of economic development [15], which last for 30 - 40 years in the modern era. It was so in the period of the Great Depression of the 1930s, and it happened again in the 1970s and 1980s during the structural crisis of the world economy, when profound works on the LW theory were created. A comprehensive analysis of these works can be found in [16]. In [17] we developed a complete closed mathematical model to describe and calculate the Kondratiev long wave of economic development, which allows us to forecast all major economic variables over a long-term horizon of up to 30 years, which corresponds to the length of an LW. We can say that nowadays there exists a satisfactory and comprehensive explanation of the LW theory, or the theory of long economic cycles.

Kondratiev stated that long economic cycles have an endogenous character, i.e. they are internally characteristic of the capitalist economy. It is important to state that he was the first to understand that wave-like cyclic movements of the economy are deviations from the equilibrium which the capitalist economy tries to preserve. Consequently, most of the time a healthy dynamic economy develops in disequilibrium, while classical economic theory stated the contrary. Kondratiev himself identified the first (1780/90-1845/51), second (1845/51-1890/96) and third (1890/96-1940/46) long economic cycles. Besides, at the beginning of the 1920s he made the assumption that a cyclic crisis should take place at the end of the 1920s, followed by a depression, which indeed happened in 1929 and the 1930s.

Another great economist of the 20th century, Joseph Schumpeter, continued Kondratiev’s study of LW and developed an innovation theory of long waves, integrating it into his general innovation theory of economic development [18]. Schumpeter saw cycles as the direct consequence of innovative processes determined by technological progress and, like Kondratiev, he believed that the cyclical movement of output was a deviation from equilibrium.

It should be mentioned that Schumpeter highlighted that the main driving force of the capitalist economy was innovation and entrepreneurship, not capital per se, as many economists of that time thought. Schumpeter stated that capital is useless and powerless to cause economic growth without innovations and the initiative, will and perseverance of the entrepreneur. Schumpeter believed that spontaneous “blobs” of innovation, by forming large clusters, caused radical changes in the economy, taking it away from the original equilibrium trajectory, which is seen only in periods of stagnation. Moreover, the system never returns to the original equilibrium. A new cycle begins at the end of another depression at a new equilibrium level.

According to Schumpeter, the change in equilibrium levels determines the long-term trajectory of economic development, during which the economic system is in a dynamic, not a stationary, equilibrium. Both Kondratiev and Schumpeter believed that there existed three types of equilibrium and, consequently, three oscillating movements: short Kitchin cycles (3 - 5 years), provoked by oscillations of inventories; medium industrial Juglar cycles (7 - 11 years); and long Kondratiev cycles (30 - 40 years). These three waves are superimposed on the trend trajectory of economic growth and, according to Schumpeter, their superposition shows the general state of economic development at every moment [18].

Schumpeter was the first to assume that innovations appear unevenly in time and then spontaneously unite in clusters of innovations. He distinguished between basic and improving innovations. He highlighted the key role of basic innovations in the cyclic dynamics of long waves of economic development, seeing them as the main driving force of the capitalist economy. In fact, he predicted the 4th post-war LW (1946-1982), generated by a large cluster of epoch-making basic innovations, including computers, electronics, televisions, jet and rocket engines, as well as nuclear energy. Since the long economic cycle concept plays a highly important role in the Schumpeter innovation theory of economic development, and taking into account the fact that Schumpeter himself saw it as a cornerstone of his theory, we decided to call the latter “the Schumpeter-Kondratiev innovation and cycle theory of economic development.”

The Schumpeter-Kondratiev innovation and cycle theory of economic development is valuable because it suggests an effective mechanism to overcome the global cyclic crisis followed by the depression through “the launch and comprehensive stimulating of storm of the new generation of highly effective basic technological innovations” [19], aimed at replacing outdated industrial technologies and forms of organizing production. It is also important that this theory indicates the timing of crises and depressions, and that it has an innovation paradigm for forecasting the beginning of a new cycle [20].
The success of the Schumpeter-Kondratiev theory was already evident in the 1980s. Firstly, it should be noted that Mensch, at the beginning of the 1970s, predicted the onset of a cyclic structural crisis of the world economy at the end of that decade. Secondly, he accurately pointed out the characteristic feature of the forthcoming crisis, “stagflation”, which meant that economic stagnation would be coupled with a rise in prices, i.e. inflation, and not their decrease as before. Thirdly, he clarified that under such circumstances monetary and credit policy could not help in overcoming the crisis [19]. Mensch and other adherents of the Schumpeter-Kondratiev theory suggested launching the process of mastering the basic innovations of a new technological paradigm based on the achievements of microelectronics and information technology.

The Schumpeter-Kondratiev innovation and cycle theory was validated during the 5th LW (1982-2018). The core of the 5th technological paradigm was microelectronics, personal computers, information and communication technologies (ICT) and biotechnology [20]. Microprocessors and computers became widespread, and their use turned out to be a breakthrough in the production of goods in every field of the economy and in the control of dynamic objects. It was demonstrated in [21] that the beginning of the 5th long economic cycle dated back to 1982. Indeed, it was in 1982 that the world economy began to rise, which then turned into a lengthy period (1982-1994) of stable and quite rapid economic growth with average annual rates of 3.4 percent, which ended with a slight fall in 1995. Then the economy flourished from 1996 to 2006, when labour productivity growth rates reached 2.8% per year, almost twice the figure for the previous decade (1985-1995).

These achievements can be explained by the use of ICT and an upsurge in investment in that sphere. This explains the abrupt rise in labour productivity growth in the second half of the 1990s in developed countries. At the beginning of the 21st century the global ICT market exceeded 1 billion US dollars. At that time talk of the emergence of a new economy, the “knowledge economy”, began in developed countries. However, by the mid-2000s the growth of production caused by ICT stopped. According to Mensch, it meant that the 5th long economic cycle had reached its peak and it was necessary to search for innovative technologies and next-generation products. In 2006-2007 economic growth rates in OECD countries declined, which meant the transition from the upward stage of the 5th LW to the downward one. Thus, 2004-2005 was the upper turning point of the 5th LW. The duration of the upward stage of the 5th LW was, as expected, 22 - 24 years (1982-2006).

Less than three years had passed before the sudden world financial and economic crisis of 2008-2009 broke out, which resembled the crisis of 1929 followed by the Great Depression of the 1930s, and was thus called “The Great Recession”. In [22] we showed that an explosive rise in the prices of such highly liquid commodities as oil and gold is a forerunner of global cyclic financial and economic crises, and we also developed a nonlinear dynamic model for forecasting the onset of a crisis.
With the help of this model we successfully predicted the date of the second wave of the global financial crisis of August 2011 nine months before it actually took place, with an error of only two weeks, and showed that the crisis of 2008-2009 could also have been predicted beforehand. Then, with the help of the Hirooka innovation paradigm [20], we also described the trajectory of development of the basic technologies of the 6th technological paradigm and predicted the beginning of the 6th LW in 2017-2018. Next, we calculated the economic potential of NBIC-technologies [23]. Since NBIC-technologies are mutually convergent, a significant synergetic effect is achieved due to their cooperative action, and this effect is to accelerate the rates of technological progress in developed countries up to 3% and higher by 2030, which is much better than the same rate during the rise of the 5th long economic cycle (1982-1994), which was 2.3%. Consequently, the basic technologies of the 6th technological paradigm will be able to ensure record rates of economic growth, close to those of the 1950s-1960s.

Indeed, as was predicted by the Schumpeter-Kondratiev theory, in 2017 there was simultaneous growth of the leading world economies. According to the IMF, by the end of 2017 all eight leading world economies (the USA, China, Japan, Germany, India, Russia, France and the UK) had grown by over 1.5% [IMF, 2018]. The world economy as a whole also grew by 3.8% in 2017 against 3.2% in 2016. In 2018 simultaneous GDP growth was seen in almost every one of the 45 countries observed by the OECD. The IMF points out that such an event has occurred only twice over the past 40 years and adds that previously such periods of simultaneous growth lasted, as a rule, for several years. For example, the world economy grew at a rate of 4% annually in 1984-1989 and 2004-2007, i.e. at the beginning of the development and at the peak of the upward stage of the 5th LW. It is evident that since 2018 we have been observing the development of the 6th LW, which will last for about thirty years.

The question arises: will the current simultaneous growth of developed world economies be stable in the medium term? Kondratiev and Schumpeter noted in their classical works that economic growth at the beginning of the upward stage of a long economic cycle is subject to various risks, which make it unstable, and recommended that governments assist entrepreneurs in overcoming such risks. The main risks that the current stable economic growth faces are: the large total debt of governments, households, and the corporate and financial sectors; the widening gap between the real economy and the financial sphere; the accelerating growth of excessive income inequality; an acute shortage of consumer demand; instability of the financial system; a sharp increase in protectionism on the part of developed countries, which has turned into trade wars; and increased environmental threats. It is possible to solve all these problems only globally, for instance, at the G20 summits. However, in contrast to the crisis years of 2008-2009, there is no spirit of global cooperation nowadays. At that time the G20 countries acted as a single whole, which played a unique role in preventing the worst consequences of the financial crisis. Now we can see trade and ideological conflicts between leading countries, which hinder constructive cooperation in the context of the G20. That is why there are reasons to believe that the development of the 6th LW will be quite unstable and may be interrupted by crisis recessions, though not as long and deep as “The Great Recession” of 2009.

3. Mathematical Models for Describing and Forecasting the Dynamics of Information and Digital Economy

To describe the long-term economic dynamics of a technologically developed country we can successfully use a classical Cobb-Douglas production function (PF) with labour-saving technological (technical) progress [14]:

$$Y(t) = \gamma \cdot K^{\alpha}(t) \cdot \left(A(t) \cdot h \cdot L(t)\right)^{1-\alpha+\delta}, \tag{1}$$

where $Y(t)$ is current national income (GDP); $K(t)$ is productive capital; $L(t)$ is the number of employees in the economy; $h$ is the average level of human capital; $A(t)$ is technological progress; $\alpha$ is the capital share in GDP; $\delta$ is the parameter characterizing increasing returns to scale ($\delta > 0$); $\gamma$ is a constant normalizing coefficient. This choice is explained by the fact that the information economy, like the future digital economy, is a high-technology economy in which the key role in raising the productivity of the main factors, capital and labour, is played by ICT, digital technologies and platforms. For retrospective calculations with PF (1), we used data from the following sources:

$Y(t)$ and $L(t)$: [https://apps.bea.gov/iTable/iTable.cfm?reqid=19&step=2&isuri=1&1921=servey];
$K(t)$: [https://apps.bea.gov/iTable/iTable.cfm?ReqID=10&isuri=1&step=1%20%20reqid%3D10#reqid=10&step=1&isuri=1];
$A(t)$: [https://fred.stloisfed.org/series/RTFPNAUSA632NRUG]. (1a)

In PF (1), in the era of the information and digital economy, capital and labour may still be described with the help of traditional methods and models ([24], §2.1). However, technological progress $A(t)$ cannot be described using traditional models ([24], ch.5), since they lack the main factor of the information and digital economy: the speed and scope of the production of technological information. In what follows we give an account of mathematical models for calculating and forecasting the dynamics of technological progress in the information and digital economy, based on the rates of production of technological information. These information models first appeared in our work [14] and were developed further in [25]. For practical application it is better to write and use PF (1) in growth-rate form, obtained by logarithmic differentiation:

$$q_Y = \alpha \cdot q_K + (1-\alpha+\delta) \cdot (q_A + q_L), \tag{2}$$

where

$$q_Y = \frac{\dot{Y}}{Y}; \quad q_K = \frac{\dot{K}}{K}; \quad q_L = \frac{\dot{L}}{L}; \quad q_A = \frac{\dot{A}}{A}. \tag{2a}$$

If the functions describing the main economic variables in PF (1) are known, it is easy to find their rates of growth by using Formulas (2a). Vice versa, if the growth rates of the variables (2a) are given, there is no difficulty in determining the variables by the formulas

$$\text{a) } Y(t) = Y_0 \cdot \exp\left[\int_{T_0}^{t} q_Y(\tau)\,d\tau\right] \quad \text{or} \quad \text{b) } A(t) = A_0 \cdot \exp\left[\int_{T_0}^{t} q_A(\tau)\,d\tau\right]. \tag{3}$$
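The growth-rate form (2) and the reconstruction of levels by (3) can be illustrated with a short numerical sketch. The parameter values $\alpha$ and $\delta$ are those the paper reports for the US economy in (4); the constant growth rates fed into the example and the helper names `output_growth_rate` and `level_from_rates` are assumptions made for illustration only and are not taken from the paper.

```python
import numpy as np

# Parameter values reported in the paper for the US economy, see (4).
ALPHA = 0.622
DELTA = 0.167


def output_growth_rate(q_k, q_a, q_l, alpha=ALPHA, delta=DELTA):
    """Equation (2): q_Y = alpha*q_K + (1 - alpha + delta)*(q_A + q_L)."""
    return alpha * q_k + (1.0 - alpha + delta) * (q_a + q_l)


def level_from_rates(x0, rates, years):
    """Equation (3): X(t) = X0 * exp(integral of q_X from T0 to t).

    The integral is approximated with the cumulative trapezoidal rule.
    """
    rates = np.asarray(rates, dtype=float)
    years = np.asarray(years, dtype=float)
    increments = 0.5 * (rates[1:] + rates[:-1]) * np.diff(years)
    integral = np.concatenate(([0.0], np.cumsum(increments)))
    return x0 * np.exp(integral)


# Hypothetical constant growth rates (illustrative, not from the paper).
years = np.arange(2018, 2029)
q_y = output_growth_rate(q_k=0.02, q_a=0.015, q_l=0.005)
y_path = level_from_rates(1.0, np.full(years.size, q_y), years)
```

Here `y_path` is an index of output normalized to 1 in the first year; replacing the constant rates with forecast series for $q_K$, $q_A$ and $q_L$ gives the corresponding output trajectory.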

Note that here and above $A(t)$ denotes the average technological level over the whole economy, which is clearly determined by the high technological level of newly formed innovative sectors of the economy. In our case these are the sectors of the information and digital economies. PF (1) was verified for a number of developed OECD countries and proved to work well. For example, for the US economy from 1946 to 2018, we obtained the following values for the coefficient $\gamma$ and the parameters $\alpha$ and $\delta$ in PF (1), based on the series of actual values of the main factors ($Y$, $K$, $A$ and $L$) taken from the databases (1a) in corresponding prices, and using the method of least squares:


$$\gamma = 0.069; \quad \alpha = 0.622; \quad \delta = 0.167. \tag{4}$$
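One common way to obtain such estimates is ordinary least squares on the logarithm of PF (1) with $h = 1$, i.e. on $\ln Y = \ln\gamma + \alpha \ln K + (1-\alpha+\delta)\ln(A L)$. The sketch below follows that route; it is an assumption, since the paper does not state whether the fit was done in logs or in levels. The series are synthetic, generated from PF (1) with the values in (4) so that the fit can be checked against them, and the function name `fit_production_function` is hypothetical.

```python
import numpy as np


def fit_production_function(y, k, a, l):
    """Estimate gamma, alpha, delta of PF (1) with h = 1 by OLS in logs.

    Model: ln Y = ln(gamma) + alpha*ln K + (1 - alpha + delta)*ln(A*L).
    """
    y, k, a, l = (np.asarray(v, dtype=float) for v in (y, k, a, l))
    design = np.column_stack([np.ones_like(k), np.log(k), np.log(a * l)])
    coef, *_ = np.linalg.lstsq(design, np.log(y), rcond=None)
    ln_gamma, alpha, beta = coef
    delta = beta - (1.0 - alpha)      # beta = 1 - alpha + delta
    return np.exp(ln_gamma), alpha, delta


# Synthetic factor series (illustrative only, not the BEA/FRED data).
rng = np.random.default_rng(0)
t = np.arange(73)                     # 73 annual points, as in 1946..2018
k = 5.0 * np.exp(0.030 * t + rng.normal(0.0, 0.1, t.size))
a = 1.0 * np.exp(0.012 * t + rng.normal(0.0, 0.1, t.size))
l = 60.0 * np.exp(0.015 * t + rng.normal(0.0, 0.1, t.size))
y = 0.069 * k**0.622 * (a * l) ** (1.0 - 0.622 + 0.167)

gamma_hat, alpha_hat, delta_hat = fit_production_function(y, k, a, l)
```

Because the synthetic output is generated without noise, the fit recovers the generating parameters; with real data the residuals of this regression would give the approximation error.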

We set $h = 1$ in (1), assuming that its actual value is absorbed into the estimated value of the normalizing factor $\gamma$. Figure 1 shows the trajectory of the US GDP calculated on the basis of PF (1) with the constant parameter values (4), together with the actual trajectory. As shown in the figure, the two trajectories coincide closely at every stage. Besides, the mean square error of approximation did not exceed 0.05%.

Thus, PF (1) may be successfully employed in long-term forecasting of the trajectory of economic growth when there are reliable long-term forecasts of the dynamics of the key growth factors $K$, $L$ and $A$. It is necessary to take into account the changes in long-term tendencies in the development of the modern economy, which were profoundly explored by the French economist Thomas Piketty [26]. In the 20th century there was a range of empirical regularities corresponding to the process of long-run economic growth, which hold over long periods, when the effects of various economic and financial crises are smoothed out. A number of these regularities were first formulated by the British economist Nicholas Kaldor [27], and some of them still remain in force nowadays. However, some of these empirical regularities do not work anymore, which signals changes in the tendencies of development of the leading economies of the 21st century.

For our further analysis the following Kaldor regularities are of special importance:

1) The ratio of physical capital to output is nearly constant, i.e.

$$\text{a) } K = \sigma_K \cdot Y, \quad \sigma_K = \text{const}; \qquad \text{b) } Y = k_K K, \quad k_K = \text{const}, \tag{5}$$

where $\sigma_K$ is the index of capital intensity; $k_K$ is the index of capital productivity; in this connection, $\sigma_K = k_K^{-1}$;

2) The shares of capital and labour in national income are nearly constant, i.e.

$$\alpha = \text{const} \quad \text{and} \quad 1-\alpha+\delta = \text{const}; \tag{6}$$

3) If the labour share in GDP is constant, workers’ wages grow in proportion to labour productivity, i.e.

Figure 1. The trajectory of the US GDP movement for the last 70 years.


$$w(t) = a_0 \cdot A(t), \tag{7}$$

where $w(t)$ is the current average wage; $a_0$ is a normalizing coefficient. Such wage growth was seen in the golden age of the capitalist economy (1948-1973).

3.1. New Empirical Regularities of Economic Development in the First Half of the 21st Century

The new tendencies of capital accumulation, economic growth, and income inequality emerging at the beginning of the 21st century were comprehensively studied by Thomas Piketty. First of all, Piketty convincingly showed that in the most developed countries (the USA, the UK, Germany, France, etc.) capital intensity ($\sigma_K$) traced a huge U-shaped curve in the 20th century and returned at the beginning of the 21st century to its maximum values, close to those observed at the end of the 19th century ([26], ch.2, 3). In the 18th-19th centuries the value of $\sigma_K$ in the leading European economies was quite stable and amounted to $\sigma_K = 7$ in France and the UK, $\sigma_K = 6.5$ in Germany, and $\sigma_K = 4.5$ in the USA ([26], ch.3, 4). In the middle of the 20th century the value of $\sigma_K$ in European countries decreased to a minimum between 3 and 3.5, and in the USA it stopped at the level of $\sigma_K = 3.3$. As we can see, changes in capital intensity in the 20th-century United States were more limited than in Western European countries, which gave the impression of its stationarity, recorded by Kaldor in the first empirical regularity (5).

The return of the value of capital intensity $\sigma_K$ in developed countries in the 21st century to its maximum value means that it stabilizes again, at least until the middle of the century ([26], ch.5). Hence, the first empirical regularity of Kaldor (5) remains valid in the first half of the 21st century and will be the determining factor in transformations of PF (1). Our calculations of the value of capital intensity in the US economy for the entire post-war period (1946-2017), presented in Figure 2 with two graphs (current and comparable prices), show that the US capital intensity remained practically constant throughout the whole period, stabilizing at the beginning of the 21st century around the stationary value

Figure 2. Changes in the value of capital intensity/capital-output ratio σK in the US economy from 1946 to 2017.


$$\sigma_K \cong 3.3, \quad (k_K = 0.3). \tag{8}$$

As for the second empirical regularity of Kaldor (6), in the 21st century it ceases to operate in practice: the share of capital income in GDP (α) will no longer be a constant value, but will grow as Piketty claims [26]. For example, in Western European countries it has already risen from 20% - 25%, characteristic of the mid-20th century, to 25% - 30% by the beginning of the 21st century. Piketty believes that the share of capital income at the global level will reach 30% - 40% by the middle of the century, i.e. the level close to the indicators of the 18th-19th centuries ([26], ch.6) with average profitability of capital of 4% - 5%. Such precedents have already taken place in economic history: an increase of 10 percentage points from 35% - 40% at the turn of the 18th-19th centuries up to 45% - 50% by the middle of the 19th century. As for the labour share, it fell during such periods accordingly. For example, in the United States, the share of labour in GDP fell from 65% to 55% from 1970 to 2015, i.e. in just 45 years. This is exactly what caused the stagnation of worker wages in the United States, which has been observed since the 1970s, instead of their growth in accordance with Equation (7).

3.2. Mathematical Models for Forecasting the Trajectory of Capital Accumulation in the 21st Century

Taking into account the new tendencies in the development of leading economies in the 21st century mentioned above, we shall consider models for long-term forecasting of the main factors of economic growth—K, L and A. Let us start with capital accumulation. The growth of capital was the most important feature of capitalism in the 19th and 20th centuries. Piketty argues [26] that this process will become more intense in the 21st century. Let us consider the patterns of accumulation of fixed physical capital in the 21st century with the help of the classical equation of capital accumulation ([24], §2.1): Kt( ) =−=− IKK( t) µµ Kt( ) sYt K( ) K Kt( ) , (9)

where $I_K(t)$ is production investment; $\mu_K$ is the rate of capital outflow; $s_K$ is the accumulation rate. Here we are talking about the deterioration of infrastructure capital within the framework of one LW. For the US economy during the 5th LW period (1982-2018), we have (1a):

$$\mu_K = 0.035; \quad s_K = 0.186. \tag{9a}$$

We estimated the $\mu_K$ value as the regression index in Equation (9) within the 5th LW (1982-2018), and the $s_K$ value was taken from the above World Bank database (1a). Taking into account that the first empirical rule of Kaldor (5) remains valid in the first half of the 21st century, Equation (9) can be rewritten as:

$$\dot K(t) = (s_K \cdot k_K - \mu_K) K(t). \tag{10}$$

The solution for this equation is exponential capital growth:

DOI: 10.4236/am.2021.123012 181 Applied Mathematics

A. Akaev, V. Sadovnichiy

$$K(t) = K_0 \cdot \exp\left[(s_K \cdot k_K - \mu_K)(t - T_0)\right]. \tag{11}$$

If we take $T_0 = 2018$ as the beginning of the 6th LW in the US economy, then we find in the World Bank database that $K_0 = 59.3$ trillion US dollars. Taking into account the specific values of the parameters $k_K$ (8), $\mu_K$, $s_K$ (9a), we find that $s_K \cdot k_K - \mu_K \cong 0.021$. Thus, under the conditions of the Kaldor empirical regularity (5), in the first half of the 21st century there will be an exponential growth of accumulated capital (11). However, according to the LW theory [17], the effect of capital saturation should happen at its downward stage in the 2040s. Therefore capital accumulation will take place according to a logistic trajectory:

$$K(t) = K_1 + \frac{K_2}{1 + u_K \cdot \exp\left[-\vartheta_K \cdot (t - T_0)\right]}, \tag{12}$$

where $K_1$, $K_2$, $u_K$ and $\vartheta_K$ are constant parameters; $T_0 = 2018$. All these parameters are easily determined using the method of least squares, provided that the exponential trajectory of capital accumulation (11) coincides with the first half of the logistic curve (12), up to the inflection point (2018-2034), which is located in the middle of the 6th LW (2018-2050). Hence, the following estimates are obtained as a result of the calculations for the US economy:

$$K_1 = 48.46 \text{ trillion dollars}; \quad K_2 = 105 \text{ trillion dollars}; \quad u_K = 5; \quad \vartheta_K = 0.1. \tag{12a}$$
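As an illustration only (not part of the original calculations), the two capital trajectories (11) and (12) can be evaluated side by side with the parameter values quoted above. The function and variable names are hypothetical; the sketch checks the growth exponent $s_K k_K - \mu_K$ and the position of the inflection point of the logistic curve, and makes no claim about how the least-squares fit itself was carried out.

```python
import math

# Parameter values quoted in the text for the US economy.
K0, T0 = 59.3, 2018                              # trillion USD; start of the 6th LW
s_K, k_K, mu_K = 0.186, 0.3, 0.035               # (9a) and (8)
K1, K2, u_K, theta_K = 48.46, 105.0, 5.0, 0.1    # (12a)

growth_rate = s_K * k_K - mu_K                   # exponent in Equations (10)-(11)


def capital_exponential(t):
    """Equation (11): exponential capital growth."""
    return K0 * math.exp(growth_rate * (t - T0))


def capital_logistic(t):
    """Equation (12): logistic capital accumulation."""
    return K1 + K2 / (1.0 + u_K * math.exp(-theta_K * (t - T0)))


# Inflection point of (12): u_K * exp(-theta_K * (t - T0)) = 1.
t_inflection = T0 + math.log(u_K) / theta_K

print(f"growth rate = {growth_rate:.4f}, inflection year = {t_inflection:.1f}")
for year in (2018, 2026, 2034, 2042, 2050):
    print(year, round(capital_exponential(year), 1), round(capital_logistic(year), 1))
```

The exponent evaluates to 0.0208, in line with the rounded value 0.021, and the inflection point falls in 2034, in line with the interval 2018-2034 mentioned above.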

3.3. Forecast Models for Calculating Potential Workplaces in Digital Economy

Let us now pass on to the models for predictive calculation of potential workplaces in the economy. There are two possible forecasting options: 1) based on a theoretical model; 2) based on empirical regularities. We shall begin with the first option. Let us take PF (1) as a theoretical model. At the beginning of §2 it was verified for the 5th Kondratiev LW (1982-2018) on the example of the US economy, and it was shown that the PF can be employed in long-term forecasting within the 6th Kondratiev LW (2018-2050). We also assumed $h \equiv 1$ there. If we now assume that the first Kaldor regularity (5) will be fulfilled, which means balanced economic growth, when $q_Y = q_K$, then, substituting relation (5c) into PF (1), we obtain the equation for determining the effective amount of labour:

$$AL = \left(\frac{k_K}{\gamma}\right)^{\frac{1}{1-\alpha+\delta}} \cdot K^{\frac{1-\alpha}{1-\alpha+\delta}}. \tag{13}$$

Solving this equation with respect to L, we get the formula for the predictive calculation of workplaces in the economy in terms of its balanced growth:

$$L_{pt}(t) = \left(\frac{k_K}{\gamma}\right)^{\frac{1}{1-\alpha+\delta}} \cdot \frac{1}{A(t)} \cdot K(t)^{\frac{1-\alpha}{1-\alpha+\delta}}. \tag{14}$$
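A short consistency check of Formulas (13) and (14) may be helpful here. The sketch below assumes that PF (1) with $h \equiv 1$ has the Cobb-Douglas form $Y = \gamma K^{\alpha}(AL)^{1-\alpha+\delta}$ used in Equation (44); the values of $\alpha$, $\delta$, $\gamma$ are the estimates at comparable prices given in (45), while K and A are hypothetical inputs chosen only for the check. Function names are illustrative.

```python
def effective_labour(K, alpha, delta, gamma, k_K):
    """Equation (13): effective labour A*L under balanced growth Y = k_K * K."""
    beta = 1.0 - alpha + delta
    return (k_K / gamma) ** (1.0 / beta) * K ** ((1.0 - alpha) / beta)


def potential_workplaces(K, A, alpha, delta, gamma, k_K):
    """Equation (14): L_pt(t) = A*L / A(t)."""
    return effective_labour(K, alpha, delta, gamma, k_K) / A


def output(K, A, L, alpha, delta, gamma):
    """Assumed form of PF (1) with h = 1: Y = gamma * K**alpha * (A*L)**(1 - alpha + delta)."""
    return gamma * K ** alpha * (A * L) ** (1.0 - alpha + delta)


# alpha, delta, gamma: estimates at comparable prices (45b); k_K from (8).
# K and A are hypothetical inputs used only to test the algebra.
alpha, delta, gamma, k_K = 0.63, 0.19, 0.07, 0.3
K, A = 59.3, 2.0

L_pt = potential_workplaces(K, A, alpha, delta, gamma, k_K)
Y = output(K, A, L_pt, alpha, delta, gamma)
print(Y, k_K * K)  # the two numbers should agree if (14) inverts the PF correctly
```

Substituting the computed $L_{pt}$ back into the production function returns $Y = k_K K$, which is the balanced-growth condition from which (13) was derived.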

Thus, according to Formulas (13) and (14), we can carry out a predictive calculation of the growth trajectory of effective labour AL (13), since the predicted trajectory of capital $K(t)$ is already known and is calculated by Formula (12), or of the predicted trajectory of growth of potential workplaces $L_{pt}(t)$ (14) if $A(t)$ is known (see Figure 3). The trajectory of potential workplaces in the US economy in the digital era, $L_2(t)$, calculated by Formula (14) with the already known function of technological progress $A(t)$, is shown in Figure 3. As can be seen, the number of workplaces grows until the early 2030s and then begins to decline sharply.

Sometimes it is possible to obtain empirical regularities for the creation of new workplaces and the reduction of existing ones through technological replacement. For example, in the work ([28], ch.3) the authors provide the results of empirical extrapolation forecasting for the next 10 - 15 years: a new stage of automation will, on average, eliminate 12% of existing workplaces and create 13% of new vacancies, which will eventually lead to the creation of 2 million additional workplaces by 2030. This empirical regularity can be most easily approximated by a linear predictive function:

$$L_{pe}(t) = L_{bd} + \lambda (t - T_{bd}), \tag{15}$$

where $L_{bd}$ is the number of employed (workplaces) in the economy in the initial year ($T_{bd} = 2018$) of the rise of the 6th Kondratiev LW (2018-2050); the index $\lambda$ is determined from the condition that by 2030 the number of employed will increase by 2 million workers.

Hence, for the US economy, where $L_{bd} = 136.6$ million workers, $\lambda = 0.17$ million workers per year. The graph of employment growth $L_1$ in the US economy in the 2020s and 2030s, calculated with the help of the empirical forecast Formula (15), is shown in Figure 3. Comparing it with the trajectory of actual employment growth during the 5th Kondratiev LW (1982-2018) (see Figure 3), it is clear that during the rise of the digital economy (2018-2040) the number of workplaces will practically stagnate. The trajectory of employment with balanced economic growth $L_2$, calculated using Formula (14) and plotted in the same Figure 3, also supports the conclusion about the significant technological replacement of workplaces in the digital economy [29].
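The value of $\lambda$ follows from simple arithmetic, which the following sketch reproduces. It only restates the condition given in the text (2 million additional workplaces between 2018 and 2030); the variable names are illustrative.

```python
L_bd = 136.6          # million workers employed in 2018 (quoted in the text)
T_bd = 2018           # initial year of the 6th Kondratiev LW
target_year = 2030
extra_jobs = 2.0      # million additional workplaces expected by 2030

# Slope of the linear forecast (15), in million workers per year.
lam = extra_jobs / (target_year - T_bd)


def employment_linear(t):
    """Equation (15): L_pe(t) = L_bd + lambda * (t - T_bd)."""
    return L_bd + lam * (t - T_bd)


print(round(lam, 2), round(employment_linear(2030), 1))
```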

Figure 3. Forecast trajectories of the employment in the US economy according to theoretical (L2) and empirical (L1) models.


3.4. Forecasting the Trajectory of Economic Growth

Since from the very beginning we assumed long-term balanced growth for the economies of developed countries, the forecast trajectory of GDP $Y(t)$ can be calculated using the empirical Kaldor formula $Y = k_K \cdot K$ (5c) from the known forecast trajectories of capital accumulation (11) or, more exactly, (12). If we consider only balanced trajectories of economic growth with effective labour (13), then the forecast trajectory of GDP is naturally unique. Such a balanced trajectory of economic growth $Y_2$, calculated according to the exponential trajectory of capital accumulation (11) and Formula (5c), is shown in Figure 4. For comparison, the same figure shows the forecast trajectory of GDP $Y_1(t)$, corresponding to the growth of employment $L_1(t)$ according to the empirical regularity (15), calculated on the basis of the initial PF (1) with the help of the forecast trajectory of capital accumulation (11) and the known function of technological progress $A(t)$.

Figure 5 shows the projected growth rates of the US economy $q_1$ and $q_2$ at the upward stage (2018-2042) of the 6th LW (2018-2050), calculated according to the abovementioned forecast trajectories $Y_1$ and $Y_2$. As can be seen from the graphs of $q_1(t)$ and $q_2(t)$, they amount to about 2% per year in the next 15 years, the same as in the years of depression (2010-2017), which is directly seen in Figure 5. This was to be expected, since we took into account only technological progress, but did not consider its impact on the productivity of workers. We shall return to this issue later.

Figure 4. Forecast trajectories of economic growth according to theoretical (Y2) and empirical (Y1) models.

Figure 5. The movement of projected economic growth rates according to theoretical (q2) and empirical (q1) models.

Above, we forecasted long-term balanced economic growth using the well-known Solow neoclassical model in the form of the Cobb-Douglas PF (1). Since the two main factors of growth, capital and labour, can be described using the classical models discussed above, it is now necessary to find a model that can reliably forecast technological progress $A_d(t)$ in the digital age, which can no longer be described with the help of traditional models. This is explained by the peculiarities of the digital economy and technologies. In the second half of the 20th century technological progress $A(t)$ in the economy was determined mainly by the efficiency of the R & D system. In this regard, various R & D models have been proposed by a number of authors ([24], ch.5). We have developed an improved R & D equation, invariant to the scale of the economy, which turned out to be very effective and provided a sufficiently accurate solution that allows us to reliably predict both technological progress and long-term economic dynamics [30]. However, in the future digital economy, technological progress will be determined by the dynamics of the production of technological information. That is why an information formula will be required to determine technological progress, which we obtained in the work [14] and verified for the information age (1980-2018). The mathematical models of economic dynamics and technological progress that are most suitable for the age of the digital economy are given below [25].

The digital economy is a new paradigm for accelerating economic development and improving the quality and usefulness of goods and services. The digital economy is a real developed economy in which a key role is played by digital platforms, platform business models and digital technologies, designed to increase the productivity of economic factors, minimize the costs of materials and resources, and, most importantly, improve the accuracy of forecasting consumer demand and ensure full compliance with consumers' preferences and requirements regarding the characteristics of goods and services.

Therefore, in describing the economic dynamics in the digital age, technological progress, which determines the total productivity of factors, will play a key role, and it is important that it is directly determined by the dynamics of the production of technological information, since the main factors of the digital economy are knowledge and know-how embodied in digital technologies.

3.5. Mathematical Models for Describing and Forecasting Technological Progress in the Information and Digital Age

Technological progress A(t), which determines the total productivity of the main economic factors, physical capital K(t) and labour L(t), in the information and digital age should naturally depend on the dynamics of production of technological information S(t). Since 80% or more of the rate of economic growth is determined by technological progress (the total productivity of factors) ([31], ch.1), it is extremely important to establish the desired functional relationship. We shall use the information model first proposed by A. I. Yablonsky ([32], p.163) as a basic model for calculating the rate of technological progress in the information and digital age $q_{Ad}(t)$,

$$q_{Ad}(t) = \xi \cdot \frac{I_d}{K_d} \cdot \frac{\dot S_{Ad}}{S_{Ad}} = \xi \cdot \varepsilon_d(t) \cdot q_S(t), \tag{16}$$

where $I_d(t)$ is current investment in physical capital $K_d(t)$ of the information and digital economy; $S_{Ad}(t)$ is a function describing the dynamics of the accumulation of industrial technological knowledge, which determines the advanced technological level in the economy $A_d(t)$; $\varepsilon_d(t) = I_d / K_d$ is the relative level of investment in the economy; $q_S(t) = \dot S_{Ad} / S_{Ad}$ is the growth rate of production technological information in the economy; $\xi$ is a constant calibration index.

Although the ideas underlying the construction of Formula (16) are absolutely correct, Formula (16) itself is incorrect, since it does not take into account the correspondence between the dimensions of the right and left sides of the equation. Based on the π-theorem of the theory of dimensions [33], Formula (16) can be correctly written as:

$$q_{Ad}(t) = \frac{\dot A_d(t)}{A_d(t)} = \xi \cdot \frac{I_d}{K_d} \cdot \frac{\dot S_{Ad}}{S_{Ad}} = \xi\,\varepsilon_d(t) \cdot q_S(t). \tag{17}$$

Since information and digital technologies are growing exponentially, it can be assumed that $S_{Ad}(t)$ will also be an exponential function. Indeed, we have established (for more details see [25]) that $S_{Ad}$ is an exponential function written as

$$S_{Ad}(t) = S_{Ad0} \exp\left[g(t)\right]. \tag{18}$$

The fact that the accumulation of industrial technological knowledge in the information and digital sectors of the economy occurs according to the exponential law was previously noted by Raymond Kurzweil ([34], pp. 491-496). Most works usually consider the simplest version of the function, when $g(t) = g_0 t$ and $g_0 = \text{const}$, i.e. the case of a linear function. In this paper, we consider the general case when g(t) is a nonlinear but differentiable function. Hence,

$$\dot S_{Ad}(t) = S_{Ad0} \cdot \dot g(t) \cdot \exp\left[g(t)\right]. \tag{19}$$

If we substitute representations (19) and (18) into Formula (17), we get a compact formula for calculating the required growth rates of technological progress:

$$q_{Ad}(t) = \xi\,\varepsilon_d(t)\,\dot g(t). \tag{20}$$

Thus, the rates $q_{Ad}$ of technological progress, which determine the dynamics of the total productivity growth of the main economic factors, in the information and digital age will mainly depend on the rates of production of technological information

$$q_S(t) = \frac{\dot S_{Ad}(t)}{S_{Ad}(t)} = \dot g(t). \tag{21}$$

If we know $q_{Ad}(t)$, it is quite easy to calculate the trajectory of the technological progress $A_d(t)$ itself. Indeed, it follows from Equation (17) that

$$A_d(t) = A_{d0} \exp\left[\int_0^t q_{Ad}(\tau)\,d\tau\right]. \tag{22}$$

If we substitute here the expression for $q_{Ad}$ (20), we finally have:

$$A_d(t) = A_{d0} \exp\left[\xi \int_0^t \varepsilon_d(\tau)\,\dot g(\tau)\,d\tau\right]. \tag{23}$$
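Formula (23) is straightforward to evaluate numerically once $\varepsilon_d$ and $\dot g$ are given. The following minimal sketch uses the composite trapezoid rule; the particular inputs (a linear investment level and a constant rate of information production) are hypothetical and are chosen only because the integral then has a closed form against which the quadrature can be checked. It is not the numerical procedure used by the authors.

```python
import math


def technological_progress(t_end, eps, g_dot, A0=1.0, xi=1.0, steps=1000):
    """Equation (23): A_d(t) = A0 * exp(xi * integral_0^t eps(tau) * g_dot(tau) dtau),
    with the integral approximated by the composite trapezoid rule."""
    h = t_end / steps
    total = 0.0
    for i in range(steps + 1):
        tau = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * eps(tau) * g_dot(tau)
    return A0 * math.exp(xi * h * total)


# Hypothetical inputs: linear relative investment and a constant rate v0.
eps = lambda tau: 0.09 + 0.002 * tau
v0 = 0.1
A_36 = technological_progress(36.0, eps, lambda tau: v0)

# Closed form for these inputs: integral = v0 * (0.09 * t + 0.001 * t**2).
A_exact = math.exp(v0 * (0.09 * 36.0 + 0.001 * 36.0 ** 2))
print(A_36, A_exact)
```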

In the work [14] we considered a number of real dynamical regimes of production of technological information. When deriving the equations for the production of technological information, we use the “principle of the minimum production of entropy in self-organization processes” formulated by Yuri Klimontovich ([35], p.36), which in relation to the digital economy can be reformulated as follows: “The principle of the maximum production of information in self-organization processes on the upward stage of the Kondratiev LW.” This means that the economic system at the upward stage produces a maximum of information (goods) thanks to the self-organization of agents and using available resources. In mathematical form this means that there exists a certain Lagrangian, the functional of which, in accordance with the Klimontovich principle, takes an extreme value at the upward stage of the Kondratiev LW. The corresponding Lagrange equation generates the required dynamical regime of production of technological information. Let us give here four dynamical regimes of production of technological information that are of greatest interest from the point of view of describing the information-age retrospective and forecasting the digital future.

3.5.1. The Invariable Dynamical Regime

Information is produced at a constant growth rate $\dot g = v_0 = \text{const}$. The Lagrange function has the form

$$L(g, \dot g, t) = \dot g^{2}. \tag{24}$$

The corresponding Lagrange equation

$$\frac{d}{dt}\frac{\partial L}{\partial \dot g} - \frac{\partial L}{\partial g} = 2\ddot g = 0 \tag{24a}$$

has the solution

$$g(t) = g_0 + v_0 t, \quad \dot g_0 = v_0. \tag{24b}$$

In this case, $S_d(t) = S_{d0}\exp(v_0 t)$. The process of accumulation of industrial technological knowledge occurs according to the simplest exponential law, which characterizes the steady state within the framework of one Kondratiev long economic cycle, i.e. a long wave of economic development lasting for 30 - 40 years, formed under the influence of another technological revolution. The simplest dynamical regime of production of technological information described above was characteristic of the symbiosis “human + computer” at the initial stage of the development of computer technology in the period of the 4th long economic cycle (1946-1982).

3.5.2. The Dynamical Regime with Aggravation

Growth rates of information production grow exponentially with its accumulation ([34], p. 492), i.e. $\dot g \sim e^{g}$. The following Lagrangian leads to this regime:

$$L(g, \dot g, t) = \dot g^{2} e^{-2g}. \tag{25}$$

The corresponding Lagrange equation has the following form: $\ddot g = \dot g^{2}$. The solution of this equation under the initial conditions $g(t = 0) = g_0$ and $\dot g(t = 0) = v_0$ leads to a hyperbolic increase in growth rates of information production:

$$\dot g(t) = \frac{1}{T_S - t}; \quad g(t) = g_0 - \ln\left(1 - \frac{t}{T_S}\right); \quad T_S = \frac{1}{v_0}, \tag{25a}$$

where $T_S$ is a point of singularity.

Equation (25a) resembles the hyperbolic equation of demographic dynamics, first obtained by Heinz von Foerster, Patricia Mora and Lawrence Amiot [36], with a point of singularity at $T_S = 2026$. In reality, the explosive demographic growth was replaced by a stabilization dynamical regime, the demographic transition [37]. The same happened with the dynamical regime of information production in the 5th LW (1982-2018), when instead of explosive growth (25a), smooth logistic growth was realized. We assume that during the 6th LW (2018-2050), the dynamical regime with aggravation is implemented with subsequent stabilization.
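The closed-form solution (25a) can be checked against a direct numerical integration of the Lagrange equation $\ddot g = \dot g^{2}$. The sketch below uses a hand-written fourth-order Runge-Kutta step; the initial values $g_0 = 0$ and $v_0 = 0.02$ (so that $T_S = 50$) and the horizon of 36 time units are taken as illustrative inputs, and all function names are hypothetical.

```python
import math


def rk4_step(rhs, y, h):
    """One classical Runge-Kutta step for an autonomous system y' = rhs(y)."""
    k1 = rhs(y)
    k2 = rhs([yi + 0.5 * h * ki for yi, ki in zip(y, k1)])
    k3 = rhs([yi + 0.5 * h * ki for yi, ki in zip(y, k2)])
    k4 = rhs([yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6.0 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def aggravation_rhs(y):
    """State y = [g, v] with v = dg/dt; the Lagrange equation gives dv/dt = v**2."""
    g, v = y
    return [v, v * v]


g0, v0 = 0.0, 0.02          # illustrative initial values
T_S = 1.0 / v0              # point of singularity, Equation (25a)
t_end, steps = 36.0, 3600

state = [g0, v0]
for _ in range(steps):
    state = rk4_step(aggravation_rhs, state, t_end / steps)
g_num, v_num = state

g_exact = g0 - math.log(1.0 - t_end / T_S)   # Equation (25a)
v_exact = 1.0 / (T_S - t_end)
print(g_num, g_exact, v_num, v_exact)
```

With these inputs the numerical and analytical values of $g$ and $\dot g$ agree to many digits, and $g$ after 36 time units is approximately 1.273.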

3.5.3. The Dynamical Regime with Stabilization

This regime is a combination of the dynamical regime with aggravation $\dot g \sim e^{g}$, which is implemented at the initial stage of development, and the invariable dynamical regime $\dot g = \text{const}$ at the final stage: $\dot g \sim \frac{e^{g}}{1 + e^{g}}$. The Lagrangian has the form

$$L(g, \dot g, t) = \frac{\dot g^{2} e^{-2g}}{1 - \dot g}, \tag{26}$$

and the Lagrange equation is

$$\ddot g = \dot g^{2}(1 - \dot g). \tag{26a}$$

Usually the solution needs to be scaled to make it adequate for the problem at hand. This is done by introducing new variables

$$g = \frac{\bar g}{s_g}; \quad t = \frac{\bar t}{s_t}, \tag{26b}$$

where $s_g$ and $s_t$ are scaling factors. The scaled solution of the latter Lagrange equation has the form:

$$\text{a)}\ \dot g(t) = \frac{1}{s_g}\left[1 + c_1 e^{-s_g g(t)}\right]^{-1}; \quad c_1 = e^{s_g g_0}\left(\frac{1}{v_0} - 1\right); \tag{27}$$

$$\text{b)}\ s_t \cdot t = s_g g(t) - c_1 e^{-s_g g(t)} + c_2; \quad c_2 = \frac{1}{v_0} - 1 - s_g g_0.$$

Here, as we can see from Equation (27a), rates of production of technological information monotonically increase according to the logistic law with a variable rate, because, as follows from Equation (27b), g(t) is not a strictly linear function of argument t.

This is the dynamical regime of production of technological information which was observed during the 5th LW. Indeed, Figure 6 shows the graph of the $q_{Ad}(t)$ function calculated by Formula (20) with respect to (27) together with the curve with points characterizing the actual values of the contribution of ICT to the rates of technical progress [38]; the two coincide with a high index of determination ($R^2 = 0.998$).
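To make the logistic character of this regime concrete, the Lagrange equation (26a) can be integrated numerically and compared with the closed-form relations (27). The sketch below works in scaled variables, i.e. it assumes $s_g = s_t = 1$; the initial values $g_0 = 1.273$, $v_0 = 0.1$ and the integration horizon of 13.2 scaled time units are illustrative choices, and the function names are hypothetical.

```python
import math


def stabilization_rhs(g, v):
    """Equation (26a) as a first-order system: dg/dt = v, dv/dt = v**2 * (1 - v)."""
    return v, v * v * (1.0 - v)


def integrate(g, v, t_end, steps=20000):
    """Classical fourth-order Runge-Kutta integration from t = 0 to t_end."""
    h = t_end / steps
    for _ in range(steps):
        k1g, k1v = stabilization_rhs(g, v)
        k2g, k2v = stabilization_rhs(g + 0.5 * h * k1g, v + 0.5 * h * k1v)
        k3g, k3v = stabilization_rhs(g + 0.5 * h * k2g, v + 0.5 * h * k2v)
        k4g, k4v = stabilization_rhs(g + h * k3g, v + h * k3v)
        g += h / 6.0 * (k1g + 2 * k2g + 2 * k3g + k4g)
        v += h / 6.0 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return g, v


g0, v0 = 1.273, 0.1                      # illustrative scaled initial values
c1 = math.exp(g0) * (1.0 / v0 - 1.0)     # integration constants of (27) with s_g = 1
c2 = 1.0 / v0 - 1.0 - g0

t_end = 13.2
g_end, v_end = integrate(g0, v0, t_end)

v_closed = 1.0 / (1.0 + c1 * math.exp(-g_end))    # (27a)
t_implicit = g_end - c1 * math.exp(-g_end) + c2   # (27b)
print(g_end, v_end, v_closed, t_implicit)
```

The integrated rate $\dot g$ matches (27a) and the implicit relation (27b) returns the integration time, while the rate itself rises from 0.1 towards the stationary level 1 without exceeding it.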

Figure 6. ICT contribution to total labour productivity.

3.5.4. The Aggravation Dynamical Regime with Return to the Stable Level

In this scenario, at the initial stage, the process escalates sharply ($\dot g \sim e^{g}$) and, due to inertia, overshoots the stationary mode ($\dot g = \text{const}$); then, having reached a certain maximum value $\dot g_m$, it returns asymptotically to the stable level. This regime can be described by the relation $\dot g \sim \frac{e^{g}}{1 + z(g) e^{g}}$, where $z(g)$ is the deceleration function, which in the simplest case has the form $z(g) = 1 - \frac{1}{1-\rho} e^{-\rho g}$, where $\rho = \text{const}$ and $\rho \neq 1$. If $\rho \to 0$, the dynamical regime with aggravation is obtained, while if $\rho \to \infty$, the dynamical regime with stabilization is obtained. This regime is produced by the Lagrangian $L(g, \dot g, t) = \frac{\dot g^{2} e^{-2g}}{1 - z(g)\dot g}$, and the corresponding Lagrange equation has the form

$$\ddot g = \dot g^{2}\left[1 - \dot g\left(z(g) + \frac{dz}{dg}\right)\right]. \tag{28a}$$

The scaled solution of such an equation has the following form:

$$\text{a)}\ \dot g(t) = \frac{1}{s_g}\left[1 - \frac{e^{-\rho s_g g}}{1-\rho} + c_1 e^{-s_g g}\right]^{-1}; \quad c_1 = e^{s_g g_1}\left[\frac{1}{v_1} - 1 + \frac{e^{-\rho s_g g_1}}{1-\rho}\right]; \tag{28}$$

$$\text{b)}\ s_t \cdot t = s_g g + \frac{e^{-\rho s_g g}}{\rho(1-\rho)} - c_1 e^{-s_g g} + c_2; \quad c_2 = \frac{1}{v_1} - 1 - s_g g_1 - \frac{1}{\rho} e^{-\rho s_g g_1}.$$

Here $v_1 = \dot g(t = 2018)$ and $g_1 = g(t = 2018)$ are the initial values for the process of production of technological information in the 6th LW; $s_g$ is the scaling factor. It is this scenario of growth with a return according to which the population of individual countries stabilizes in the 21st century [39].

Graphs of growth rates of technological information $v(t) = \dot g(t)$ described in Equations (27a) and (28a) are shown in Figure 7. All of them are S-shaped curves possessing all the characteristics of the growth curves of technological progress at the upward stage of the LW. However, only $v_1(t)$, calculated by Equations (27), is a classical logistic function that asymptotically tends to the stationary level from below, while the other two curves, $v_2$ ($\rho_2 = 0.2$) and $v_3$ ($\rho_3 = 0.1$), overshoot the stationary level due to powerful acceleration and then, due to the activation of the deceleration mechanism, return asymptotically to the stationary level from above. The method of calculating constant parameters and indexes in Equations (27) and (28) will be explained in the following paragraphs. In the next section, by verifying technological progress and economic dynamics at the 5th informational LW (1982-2018), we will show that curve $v_1(t)$ allows us to accurately describe the technological progress in the developed economy.

4. Verification of the Information Model of Technological Progress and Economic Dynamics at the 5th LW (1982-2018)

Figure 7. Different trajectories of the growth rates of technological information.

In the previous paragraph (§3.5), mathematical models were given for describing and forecasting technological progress in the age of the information and digital economy. The models were based on various production regimes of technological information. As a result, they are called information models. The main formula to be verified is the formula for calculating the growth rates of technological progress (20):

$$q_{Ad}(t) = \xi\,\varepsilon_d(t) \cdot \dot g(t), \tag{29}$$

where $\dot g(t)$ is the production rate of technological information in the economy; $\varepsilon_d(t)$ is the relative investment $I_d(t)$ in the fixed productive capital $K_d(t)$ of the information and digital economy. Here and in what follows, the normalizing index $\xi$ is assumed to be equal to unity ($\xi = 1$), since normalization can be carried out in terms of constant indexes of the function $\dot g(t)$. We will verify Formula (29) for the 5th Kondratiev LW in the world economy (1982-2018), which is characterized as the age of formation of the information economy or the “knowledge economy” in developed countries and for which there exist verified and sufficiently accurate data on both technological progress and economic growth for all OECD countries.

We obtained the actual data on the growth rates of technological progress $q_{Ad}(t)$ by aggregating the geometric mean series according to the initial data from the two most reliable sources [Bureau of Labor Statistics USA; University of Groningen]. They are presented graphically in Figure 8. In our work [14] an approximate analytical expression was given for approximating $\varepsilon_d(t)$ for the US economy:

$$\varepsilon_d(t) = \varepsilon_0 + \varepsilon_1 (t - T_0), \tag{30}$$

where $T_0 = 1982$; $\varepsilon_0 = 0.09$; $\varepsilon_1 = 0.002$. To calculate $\dot g(t)$, the following equations were obtained in the previous paragraph (§3.5) (27):

$$\text{a)}\ v(t) = \dot g(t) = \frac{1}{s_g}\left[1 + c_1 \cdot e^{-s_g \cdot g(t)}\right]^{-1}; \quad c_1 = \left(\frac{1}{s_g v_0} - 1\right) e^{s_g g_0}; \tag{31}$$

$$\text{b)}\ t = \frac{1}{s_t}\left[s_g \cdot g(t) - c_1 \cdot e^{-s_g \cdot g(t)} + c_2\right]; \quad c_2 = \frac{1}{s_g v_0} - 1 - s_g g_0.$$

Let us introduce scaled values:

$$\bar t = s_t \cdot t; \quad \bar g(t) = s_g \cdot g(t); \quad \dot{\bar g}(t) = s_g \cdot \dot g(t). \tag{31a}$$

Figure 8. Rates of technological progress according to the information model against the background of the actual curve $q_{Ad}(t)$.

First of all, it is necessary to determine the initial values $g_0$ and $v_0$, characterizing the production of technological information. To determine $v_0$, we use the hyperbolic equation for the growth of rates of information production during the initial period of information technology development (25d), according to which the point of singularity is $T_s = v_0^{-1}$. It is natural to take 1946 as the beginning of the information age, when the world’s first universal computer was put into operation. The explosive growth of the influence of information and communication technologies (ICT) on the economic development of leading countries was first observed in 1995-1997. Therefore, $T_s \cong 50$ years and $v_0' = 0.02$ at the initial stage. In turn, $g_0$ is determined from the solution (25b), assuming that $g_0' = 0$ for 1946. Then, for the initial year of the upward stage of the 5th LW (1982), from (25b) we get:

$$g_0 = -\ln\left(1 - \frac{36}{50}\right) \cong 1.273. \tag{32}$$

Next, since the logistic function (31a), which describes the production rate of technological information, takes at the beginning of the 5th LW (1982) and at the upper turning point (2004, see Figure 8) its minimum and maximum values, equal to 0.1 and 0.9 respectively, as is generally accepted, we get two equations:

$$\text{a)}\ T_r = 2004; \quad v_r = \dot g_r = 0.9 = \left[1 + \left(\frac{1}{v_0} - 1\right) e^{-(g_r - g_0)}\right]^{-1}; \tag{33}$$

$$\text{b)}\ T_0 = 1982; \quad v_0 = \dot g_0 = 0.1 = \left[1 + \left(\frac{1}{v_0} - 1\right) e^{-(g_0 - g_0)}\right]^{-1}.$$

It follows from the second equation that at

$$T_0 = 1982, \quad v_0 = 0.1. \tag{33c}$$

From the first equation we get:

$$T_r = 2004; \quad g_r = 5.67. \tag{33d}$$

Knowing $v_0 = 0.1$ and $g_r = 5.67$, we can calculate the numerical values of the integration constants $c_1$ (31a) and $c_2$ (31b):

$$c_1 = 32.14; \quad c_2 = 7.73. \tag{34}$$

Now let us move on to determining the numerical values of the scaling indexes $s_t$ and $s_g$ (3c). Substituting the expressions for $c_1$ (31a) and $c_2$ (31b) into Equation (31b), we get the following equation at the upper turning point ($T_r = 2004$) of the upward wave of the 5th LW:

$$s_t = \frac{1}{T_r - T_0}\left[g_r - g_0 - \left(\frac{1}{v_0} - 1\right) e^{-(g_r - g_0)} + \frac{1}{v_0} - 1\right]. \tag{35}$$

Substituting here the numerical values $T_r$, $T_0$, $g_0$ (32) and $g_r$ (33d), we get

$$s_t = 0.6. \tag{36}$$
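The chain of numerical values (32)-(36) can be reproduced in a few lines. The sketch below is only an arithmetic check under the relations stated in the text: $g_r$ is obtained by inverting (33a), $c_1$ and $c_2$ are evaluated in scaled variables as $e^{g_0}(1/v_0 - 1)$ and $1/v_0 - 1 - g_0$, and $s_t$ follows from (35). Variable names are illustrative.

```python
import math

T0, Tr = 1982, 2004
v0, v_r = 0.1, 0.9                      # minimum and maximum of the logistic rate

g0 = -math.log(1.0 - 36.0 / 50.0)       # Equation (32)

# Invert (33a): 1/v_r - 1 = (1/v0 - 1) * exp(-(g_r - g0)).
g_r = g0 + math.log((1.0 / v0 - 1.0) / (1.0 / v_r - 1.0))

c1 = math.exp(g0) * (1.0 / v0 - 1.0)    # integration constant c1
c2 = 1.0 / v0 - 1.0 - g0                # integration constant c2

# Equation (35) written with c1 and c2.
s_t = (g_r - c1 * math.exp(-g_r) + c2) / (Tr - T0)

print(round(g0, 3), round(g_r, 2), round(c1, 2), round(c2, 2), round(s_t, 2))
```

The rounded results agree with the values 1.273, 5.67, 32.14, 7.73 and 0.6 quoted in (32)-(36).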


Substituting the expression for the production rate of technological information $\dot g(t)$ from (31a) and (31b) into Formula (29), we get:

$$q_{Ad}^{(1)}(t) = \frac{\varepsilon_d(t)}{s_g\left[1 + g(t) - s_t \cdot t + c_2\right]}, \tag{37}$$

where $t = T - T_0$; $T_0 = 1982$; $T_0 \le T \le T_r = 2004$. Here the function $g(t)$ is obtained in numerical form by solving the nonlinear Equation (31b) at each point t of the given timespan (37):

$$s_t \cdot t = g(t) - c_1 e^{-g(t)} + c_2, \tag{38}$$

where $g_0 \le g(t) \le g_r$.

Since the left side of Equation (37) is set by its actual values, which have already been presented in Figure 8, and only the value of the scaling index $s_g$ remains unknown on the right side, it can be estimated using the method of least squares. It turned out to be

$$s_g = 412.33. \tag{39}$$
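The numerical step described above, solving (38) for $g(t)$ and inserting the result into (37), can be sketched as follows. Bisection is used here as one possible root-finding method (the text does not say which method was applied), the rounded constants from (34), (36) and (39) are taken as given, and $\varepsilon_d$ follows (30). All function names are hypothetical.

```python
import math

T0 = 1982
c1, c2 = 32.14, 7.73            # (34)
s_t, s_g = 0.6, 412.33          # (36), (39)
eps0, eps1 = 0.09, 0.002        # (30)


def eps_d(T):
    """Equation (30): relative investment level."""
    return eps0 + eps1 * (T - T0)


def solve_g(T, lo=0.0, hi=50.0, tol=1e-12):
    """Solve Equation (38), s_t*t = g - c1*exp(-g) + c2, for g by bisection.
    The right side increases monotonically in g, so the root is unique."""
    target = s_t * (T - T0)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid - c1 * math.exp(-mid) + c2 > target:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def q_Ad(T):
    """Equation (37): rate of technological progress at the upward stage."""
    g = solve_g(T)
    return eps_d(T) / (s_g * (1.0 + g - s_t * (T - T0) + c2))


for year in (1982, 1993, 2004):
    print(year, round(solve_g(year), 3), q_Ad(year))
```

Because (38) implies $c_1 e^{-g} = g - s_t t + c_2$, the denominator in (37) equals $s_g\left[1 + c_1 e^{-g(t)}\right]$, so the same values are obtained from the logistic form (31a) multiplied by $\varepsilon_d(t)$.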

Now we can build a trend trajectory of the rates of technological progress by calculating $q_{Ad}(t)$ on the right side of Formula (37) with the concrete value $s_g = 412.33$ (39). The trend trajectory of the rates of technological progress $q_{Ad}(t)$ at the upward stage of the 5th LW (1982-2004) is shown in Figure 8.

At the downward stage of the 5th LW (2004-2018), to approximate the trajectory of technological progress, we shall first use the formula for deceleration in technological progress in the recession phase, obtained in the work [17]:

 (21) ( ) 1 −−λ (tT) q( t) = q ⋅exp −− 1 λ tT −+ e 0 r , (40) Ad Adr 0 r λ  0 

where $q_{Adr}^{(1)}$ is the value of $q_{Ad}^{(1)}(t)$ (37) at the upper turning point ($t = T_r$), $T_r = 2004$. Using the actual data $q_{Ad}(t)$ given in Figure 8 for the phases of recession and depression in the 5th LW (2004-2014), with the help of the method of least squares we get:

$$\lambda_0 = 2.73. \tag{40a}$$

The corresponding part of the trend trajectory of technological progress in the phases of recession and depression is also shown in Figure 8 (2004-2014). It remains to find an approximate analytical description of technological progress in the recovery phase (2014-2018) of the 5th LW, which is best approximated by an exponential curve:

$$q_{Ad}^{(3)}(t) = q_{Ade}^{(2)} \exp\left[\omega (t - T_{re})\right], \tag{41}$$

where $q_{Ade}^{(2)}$ is the value of $q_{Ad}^{(2)}$ (40) at the lower turning point ($T_{re} = 2014$). Proceeding from the actual data $q_{Ad}(t)$ in the recovery phase (2014-2018, see Figure 8), with the help of the method of least squares we estimated

$$\omega \cong 0. \tag{41a}$$

Thus, we approximated the rates of technological progress $q_{Ad}(t)$ of the 5th LW both at the upward (37) and downward (40) and (41) stages. Now we can calculate the trajectories of both actual and model technological progress $A_d(t)$ by the formula

$$A_d(t) = A_{d0} \cdot \exp\left[\int_{T_0}^{T} q_{Ad}(t)\,dt\right]. \tag{42}$$

The initial value $A_{d0}$ is calculated from the production function (1) using the method of least squares, so that at the initial timepoint $Y_0$, $K_0$ and $L_0$ coincide in their actual values with the best approximation of the formula

$$A_{d0} = \frac{1}{L_0}\left(\frac{Y_0}{\gamma K_0^{\alpha}}\right)^{\frac{1}{1-\alpha+\delta}}. \tag{43}$$

The trajectory of the actual technological progress is obtained by numerical integration (42), using the actual data $q_{Ad}(t)$ presented in Figure 8. The trajectory of the model technological progress is calculated by Formula (42) with the sequential use of the approximating functions $q_{Ad}(t)$ (37), (40) and (41). The calculation results are presented graphically in Figure 9, in the form of a growth trajectory of the normalized level of technological progress $a_d(t) = A_d(t) / A_{d0}$. The mean square error turned out to be $\sigma_A = 1.95\%$. As we can see, the proposed information model of technological progress (29) provides a sufficiently high accuracy of approximation. Now let us consider how this model error can affect the further calculations of the trajectory of economic growth by the production function (1):

$$Y(t) = \gamma K^{\alpha}(t)\left[A(t) \cdot L(t)\right]^{1-\alpha+\delta}. \tag{44}$$

Here we have just calculated and verified $A_d(t)$ with high accuracy, with a mean square error not exceeding 2%, over a time span of 36 years. The actual data for the main economic variables $Y(t)$, $K(t)$ and $L(t)$ for the US economy were taken from sources (1a). First, the optimal values of the production function parameters (44) $\gamma$, $\alpha$ and $\delta$ were calculated:

a) At current prices: $\gamma = 0.04$; $\alpha = 0.73$; $\delta = 0.33$ (45)

b) At comparable prices: $\gamma = 0.07$; $\alpha = 0.63$; $\delta = 0.19$.

Figure 9. Technological progress in the information age.


Having calculated both variants, we obtained a mean square error at current prices of $\sigma_Y^{(1)} = 0.3\%$ and at comparable prices of $\sigma_Y^{(2)} = 0.13\%$. The trajectories of economic growth are shown in Figure 10 and Figure 11 respectively. As can be seen from the graphs in Figure 10 and Figure 11, the accuracy of model (49) in calculating the trajectory of economic growth is much higher than in calculating technological progress, and the accuracy is two times higher for calculating economic growth at comparable prices than at current prices. Therefore, forecast calculations are also recommended to be carried out at comparable prices.

In conclusion, let us calculate the growth trajectory of the production of technological information $g(t)$ from Formula (40):

2 T qtAd ( ) gt( ) = g0 + st ⋅ d t, T0 ≤≤ TTbd ; (46) ∫T0 ε d (t)

$T_0 = 1982$; $T_{bd} = 2018$; $g_0 = \bar g_0 / s_g$; $\bar g_0 = 1.27$; $s_g = 412.33$; $s_t = 0.6$.

Figure 10. Verification of the calculated trajectory of the US GDP movement (at current prices). A mean square error $\sigma_Y = 0.298\%$.

Figure 11. Verification of the calculated trajectory of the US GDP movement (at comparable prices). A mean square error $\sigma_Y = 0.133\%$.

Here, in the integrand, we successively use approximating functions (37), (40) and (41). As a result of the calculations, we get the trajectory $g(t)$ shown in Figure 12. As we can see from a comparison of the trajectories $g(t)$ (Figure 12) and $q_{Ad}(t)$ (Figure 8), the growth rate of the production of technological information should significantly surpass the rates of technological progress. As expected, the production growth of technological information is a monotonically increasing function, and its maximum value at the end point $T_{bd} = 2018$, calculated by Formula (46), is

$$g_{bd} \cong 0.016 \quad \text{or} \quad \tilde g_{bd} \cong 6.6. \tag{46a}$$

5. Forecasting the Technological and Economic Dynamics at the Upward Stage of the 6th Kondratiev LW (2018-2042)

The dynamics of technological progress at the upward stage of the 6th Kondratiev LW in world economic development is determined by the accelerated production regime of technological information with a return to the stationary Equation (28):

$$\text{a)}\quad \nu^{-1}(t) = \dot{\tilde g}^{-1}(t) = 1 - \frac{e^{-\rho \tilde g(t)}}{1-\rho} + c_3 \cdot e^{-\tilde g(t)}; \quad c_3 = e^{\tilde g_{bd}}\left[\frac{1}{\nu_{bd}} - 1 + \frac{e^{-\rho \tilde g_{bd}}}{1-\rho}\right]; \tag{47}$$

$$\text{b)}\quad \tilde t = \tilde g(t) + \frac{e^{-\rho \tilde g(t)}}{\rho(1-\rho)} - c_3 \cdot e^{-\tilde g(t)} + c_4; \quad c_4 = \frac{1}{\nu_{bd}} - 1 - \tilde g_{bd} - \frac{e^{-\rho \tilde g_{bd}}}{\rho}.$$

Here $\tilde g(t) = s_g \cdot g(t)$; $\tilde t = s_t \cdot t$; $\tilde g_{bd} = \tilde g(t = T_{bd})$; $T_{bd} = 2018$; $\tilde g_{bd} \cong 6.6$ (46a); $\rho = 0.2$. The scale factors $s_t$ and $s_g$ have already been defined for the information age: $s_t = 0.6$ (36) and $s_g = 412.33$ (39). They also retain their values in the digital age. Taking into account that at the initial point of the upward stage of the 6th LW ($T_{bd} = 2018$) the value of the logistic function $\nu(t)$ should equal $\nu(t)$ (47a), and solving Equation (47a) at this point, we make sure that the latter turns into an identity. Therefore,

$$\nu_{bd} = 0.1. \tag{48a}$$

Then we can easily find the numerical values of the integration constants:

$$c_3 \cong 6559.3; \quad c_4 \cong 1.1. \tag{48b}$$
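The integration constants can be checked with a few lines of Python. The sketch below reads (47a) and (47b) as $\nu^{-1} = 1 - e^{-\rho\tilde g}/(1-\rho) + c_3 e^{-\tilde g}$ and $\tilde t = \tilde g + e^{-\rho\tilde g}/(\rho(1-\rho)) - c_3 e^{-\tilde g} + c_4$, and fixes $c_3$, $c_4$ by requiring $\nu = \nu_{bd}$ and $\tilde t = 0$ at $\tilde g = \tilde g_{bd}$. That reading and the function names are assumptions of this sketch.

```python
import math


def integration_constants(g_bd, nu_bd, rho):
    """Constants c3, c4 of (47), fixed by nu = nu_bd and t~ = 0 at g~ = g_bd."""
    decay = math.exp(-rho * g_bd)
    c3 = math.exp(g_bd) * (1.0 / nu_bd - 1.0 + decay / (1.0 - rho))
    c4 = 1.0 / nu_bd - 1.0 - g_bd - decay / rho
    return c3, c4


def t_scaled(g, c3, c4, rho):
    """Right-hand side of (47b): scaled time as a function of g~."""
    return g + math.exp(-rho * g) / (rho * (1.0 - rho)) - c3 * math.exp(-g) + c4


if __name__ == "__main__":
    c3, c4 = integration_constants(g_bd=6.6, nu_bd=0.1, rho=0.2)
    # The last value should be (numerically) zero: t~ = 0 at the start point.
    print(c3, c4, t_scaled(6.6, c3, c4, 0.2))
```

With the rounded inputs $\tilde g_{bd} = 6.6$ and $\nu_{bd} = 0.1$ this gives $c_3$ of roughly $6.9 \cdot 10^3$ and $c_4$ of about 1.06. Because $c_3$ contains the factor $e^{\tilde g_{bd}}$, it is sensitive to rounding: the quoted $c_3 \cong 6559.3$ is recovered with a slightly smaller, unrounded $\tilde g_{bd}$ of about 6.55.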

Figure 12. The production growth of technological information in the information age.


Since the production regime of technological information in question is a regime with a return, the upper turning point of the 6th LW will be higher than the stationary level; therefore, Equation (47a) at the point of return (rt, “return to”) $t = T_{rt}$ reduces to a simplified equation:

$$\nu_{rt}^{-1} = \dot{\tilde g}_{rt}^{-1} = 1.1^{-1} = 1 - \frac{e^{-\rho \tilde g_{rt}}}{1-\rho} + \left[\frac{1}{\nu_{bd}} - 1 + \frac{e^{-\rho \tilde g_{bd}}}{1-\rho}\right] e^{-(\tilde g_{rt} - \tilde g_{bd})}. \tag{49}$$

Here all values except for $\tilde g_{rt}$ are known. Solving the equation, we find that

$$\tilde g_{rt} \cong 12.72. \tag{49a}$$

To find the duration of the upward stage of the 6th Kondratiev LW, $t_{rt} = T_{rt} - T_{bd}$, we use Equation (47b) at the upper turning point ($t = T_{rt}$):

$$\tilde t_{rt} = \tilde g_{rt} + \frac{e^{-\rho \tilde g_{rt}}}{\rho(1-\rho)} - c_3 \cdot e^{-\tilde g_{rt}} + c_4 \cong 14.3.$$

Since the scale factor is $s_t = 0.6$ (36), then $t_{rt} = \tilde t_{rt}/s_t \cong 23.8$ years. Therefore, we get

$$t_{rt} \cong 24 \text{ years and } T_{rt} = 2042. \tag{49b}$$
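The arithmetic behind (49b) can be reproduced directly from the quoted constants. The sketch below evaluates (47b), read as $\tilde t = \tilde g + e^{-\rho\tilde g}/(\rho(1-\rho)) - c_3 e^{-\tilde g} + c_4$, at $\tilde g_{rt}$; the function names are illustrative assumptions.

```python
import math


def scaled_time(g, c3, c4, rho):
    """Scaled time t~ as a function of scaled information g~, cf. (47b)."""
    return g + math.exp(-rho * g) / (rho * (1.0 - rho)) - c3 * math.exp(-g) + c4


def upward_stage(g_rt=12.72, c3=6559.3, c4=1.1, rho=0.2, s_t=0.6, t_bd=2018):
    """Return (t~_rt, duration in years, calendar year of the turning point)."""
    t_scaled_rt = scaled_time(g_rt, c3, c4, rho)
    years = t_scaled_rt / s_t          # undo the time scaling t~ = s_t * t
    return t_scaled_rt, years, t_bd + round(years)


if __name__ == "__main__":
    print(upward_stage())
```

With the defaults taken from (48b), (49a) and (36), the scaled time comes out near 14.3 and the duration near 23.8 years, which rounds to the 24 years stated in the text.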

Thus, the duration of the diffusion of digital technologies into the economy is standard for innovative technologies and amounts to 24 years. Solving Equation (47b) numerically with respect to $g(t)$ in the range $T_{bd} \le t \le T_{rt}$, we obtain the numerical values of the function $g(t)$, the growth trajectory of which is shown in Figure 13, which is a supplement to Figure 12. Having determined all the constant parameters and indexes of the function $g(t)$ (47), describing the production rates of technological information, we can turn to calculating the predicted trajectory of the growth rates of technological progress $q_{Ad}(t)$ (29). Substituting the expression for $g(t)$ (47) into the original Formula (29) and setting $\xi = 1$, we obtain the following formula for predictive calculations of $q_{Ad}(t)$:

Figure 13. Production growth of technological information in the information and digital age.


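$$q_{Ad}(t) = \frac{\varepsilon_d(t)}{s_g s_t \cdot \left[1 + c_4 + \tilde g(t) + \dfrac{1}{\rho}\, e^{-\rho \tilde g(t)} - s_t \cdot t\right]}, \tag{50}$$

where $t = T - T_{bd}$, $T_{bd} = 2018$, $T_{bd} \le T \le T_{rt} = 2042$; $\varepsilon_d(t) = \varepsilon_0 + \varepsilon_1 (T - T_0)$, $T_0 = 1982$, $\varepsilon_0 = 0.09$, $\varepsilon_1 = 0.002$. The forecasted values of the function $g(t)$ have already been calculated and presented in graphical form in Figure 13.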
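The text does not say which numerical method was used to invert (47b). A minimal sketch, assuming plain bisection and the quoted constants from (48b), is given below; the bracket $[6, 20]$ and the function names are choices of this sketch, and (47b) is read as $\tilde t = \tilde g + e^{-\rho\tilde g}/(\rho(1-\rho)) - c_3 e^{-\tilde g} + c_4$ with $\tilde t = s_t (T - T_{bd})$.

```python
import math

RHO, C3, C4 = 0.2, 6559.3, 1.1   # values quoted in (47) and (48b)
S_T, T_BD = 0.6, 2018            # time scale factor (36) and start year


def t_scaled(g):
    """Scaled time t~ as a function of scaled information g~, cf. (47b)."""
    return g + math.exp(-RHO * g) / (RHO * (1.0 - RHO)) - C3 * math.exp(-g) + C4


def g_of_year(year, lo=6.0, hi=20.0, tol=1e-10):
    """Solve t_scaled(g) = S_T * (year - T_BD) for g by bisection.

    t_scaled is increasing on [lo, hi] for these constants, so the root is
    unique whenever the target lies between t_scaled(lo) and t_scaled(hi).
    """
    target = S_T * (year - T_BD)
    if not (t_scaled(lo) <= target <= t_scaled(hi)):
        raise ValueError("year outside the bracketed range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if t_scaled(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for year in (2018, 2024, 2030, 2036, 2042):
        print(year, round(g_of_year(year), 3))
```

Bisection is used here only because the function is monotone on the bracket and no derivative is needed; any standard root finder would serve equally well.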

The predicted growth rates of technological progress $q_{Ad}(t)$, calculated by Formula (50) for the upward stage of the 6th Kondratiev LW (2018-2042) for the US economy, are presented in graphical form in Figure 14, in addition to the similar indicator in the information age (1982-2018), shown earlier in Figure 8. Next, using Formula (42), we can easily calculate the predicted trajectory of technological progress $A_d(t)$ itself from the known $q_{Ad}(t)$ (50):

$$A_d(t) = A_{bd} \cdot \exp\left[\int_{T_{bd}}^{T} q_{Ad}(t)\,\mathrm{d}t\right]. \tag{51}$$

Finally, we turn to forecasting the dynamics of economic growth at the upward stage of the 6th Kondratiev LW (2018-2042) using PF (1):

$$Y(t) = \gamma K^{\alpha}(t)\,\bigl[A_d(t)\, L_{pe}(t)\bigr]^{1-\alpha+\delta}. \tag{52}$$

The predicted growth trajectory of technological progress in the digital age $A_d(t)$ has already been calculated above (51). Capital accumulation $K(t)$, as was shown earlier, is forecasted by Formulas (11) or (12). Let us choose the simplest exponential law of capital accumulation (11):

$$K = K_{bd} \cdot \exp\left[0.021\,(T - T_{bd})\right], \tag{53}$$

where $K_{bd} = 59.3$ trillion US dollars; $T_{bd} = 2018$. Employment dynamics is forecasted using Formulas (14) or (15). Let us choose the predictive Formula (15) based on an empirical pattern:

$$L_{pe}(t) = L_{bd} + \lambda\,(T - T_{bd}), \tag{54}$$

where $L_{bd} = 136.6$ million workers; $\lambda = 0.17$. The constant parameters $\alpha$ and $\delta$ and the normalizing factor $\gamma$ have already been given (45b).
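The two auxiliary forecasts (53) and (54) are simple enough to tabulate directly. The sketch below does so; the function and constant names are illustrative, and it is assumed that $\lambda$ is measured in million workers per year.

```python
import math

T_BD = 2018
K_BD = 59.3              # trillion US dollars, from (53)
CAPITAL_GROWTH = 0.021   # exponent coefficient in (53)
L_BD = 136.6             # million workers, from (54)
LAMBDA = 0.17            # from (54); assumed to be million workers per year


def capital(year):
    """Exponential law of capital accumulation (53)."""
    return K_BD * math.exp(CAPITAL_GROWTH * (year - T_BD))


def employment(year):
    """Linear empirical employment forecast (54)."""
    return L_BD + LAMBDA * (year - T_BD)


if __name__ == "__main__":
    for year in (2018, 2030, 2042):
        print(year, round(capital(year), 2), round(employment(year), 2))
```

Under these formulas capital grows to about 98.2 trillion US dollars by 2042, while employment rises only to about 140.7 million workers, so most of the projected growth in (52) must come from capital and technological progress.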

Figure 14. Growth rates of technological progress in the information and digital age.


Figure 15 shows the trajectory of the US GDP, calculated using PF (52) based on the predicted trajectories of the main growth factors $A_d$ (51), $K$ (53) and $L_{pe}$ (54) with constant parameters (45b) at the upward stage of the 6th LW (2018-2042). As shown there, economic growth is inertial, continuing the growth trend that developed during the depression years (2010-2016) after the Great Recession of 2009, although digital technologies are expected to provide a significant acceleration. However, we have not yet considered the possible increase in labour productivity through the effective use of the symbiosis “human + intelligent machine”.

5.1. The Main Driving Force of the Digital Economy—The Symbiosis “Human + IM”

There are enough reasons to believe that most of the work in the future cannot be done without humans. Indeed, any cognitive work can be fragmented into a certain set of tasks, some of which are programmable and can therefore be automated and transferred for execution to the IM or bot programmes, while others cannot be automated and will be fulfilled by people. Moreover, the latter, as noted above, are likely to be supplemented and expanded. The book cited above [28] gives the results of special studies indicating that in many cases about 20% of routine tasks will go to the IM. In general, the dispersion covers from 25% to 50%. Thus, it is concluded that the upcoming digital automation would take away from people a maximum of 25% - 50% of routine and boring work ([28], ch.3). On the other hand, by freeing themselves from such work, humans can redouble their efforts on the rest of the work and significantly increase labour productivity or the quality of performance. These considerations will form the basis of our mathematical model for calculating the impact of highly qualified human capital on labour productivity in the digital economy.

Figure 15. The trajectory of the US GDP movement in the information and digital age.

The above resembles the fragmentation of industrial production processes that began in the 1980s and the transfer of some of them, usually the most labour-intensive, to developing countries with relatively cheap labour. Of course, a progressive step on the part of developed countries was that the process was accompanied by the transfer of innovative technologies and know-how, albeit with the aim of ensuring the high quality standards adopted in developed countries. The combination of high technologies and low wages drastically reduced the cost of manufacturing products and sharply increased the volume of profits. Good workplaces with high wages, requiring highly skilled workers, remained in the developed countries. Such a situation combined the sources of competitiveness of developed countries (their technological, managerial and marketing know-how) and the comparative advantage of developing countries (their cheap labour). In the end, everyone won. Moreover, the practice of transferring labour-intensive routine work from developed to developing countries has convincingly shown that in the end this part of production could be fully automated. In the age of digital intelligent machines, work will be fragmented into tasks; routine tasks will be transferred to the IM, and the interesting creative part will remain for humans. In addition, the overall supervision of any work performance will certainly remain with people, and this will require highly qualified managers.

5.2. Mathematical Models for Describing and Calculating the Dynamics of Labour Productivity Growth in the Symbiosis “Human + IM”

As can be seen in Figure 14, technological progress based on digital technologies will grow very slowly in the 2020s, reaching 1% by the end of the decade; only in the 2030s will it accelerate significantly, and by the beginning of the 2040s its rate will double and exceed 2% per year. However, in the 2020s it is possible to significantly increase labour productivity in the economy by effectively using the symbiosis “human + IM”. The main PF (1) contains the value $h$, the level of human capital, which we averaged and equated to unity, shifting its real value to the normalizing factor $\gamma$. In fact, it is a variable, ranging from a minimum value (low skill level) to a maximum (high skill level). With a high level of human capital $h$ (a highly qualified employee), it is possible to organize effective joint work of a person and an intelligent machine ($A_d \cdot h$), which can significantly increase labour productivity in the economy $A_h(t)$. Let us show this with the help of a mathematical model.

Labour productivity in the symbiosis “human + IM” $A_h(t)$ in the age of the digital economy can be described by the equation ([24], ch.6):

$$\dot A_h(t) = q_h(t)\, A_d^{\beta}(t) \cdot A_h^{1-\beta}(t), \tag{55}$$

where $A_d(t)$ is digital technological progress (the maximum level of world digital technologies); $q_h(t)$ is a function determining the transition process in establishing effective joint work of the human and the IM. Usually this process takes one medium business cycle of 6 - 10 years. It can be roughly described with the help of a logistic function:

$$q_h(t) = \frac{q_{hm}}{1 + \eta \cdot \exp\left[-\vartheta\,(t - T_{bd})\right]}. \tag{56}$$


If we accept that the duration of a business cycle is 8 years, then the following estimates are easily obtained for the parameters $\eta$ and $\vartheta$:

$$\eta = 9; \quad \vartheta = 0.55. \tag{56a}$$
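The text does not show how the values in (56a) are obtained. One reading that reproduces them, offered here as an assumption rather than as the authors' derivation, is that $q_h$ starts at 10% of its ceiling $q_{hm}$ at $T_{bd}$ and reaches 90% of it after the 8-year cycle. The function names below are illustrative.

```python
import math


def logistic_params(cycle_years, start_share=0.1, end_share=0.9):
    """Fit eta and theta of q_h = q_hm / (1 + eta * exp(-theta * (t - T_bd))).

    Assumed conditions: q_h(T_bd) = start_share * q_hm and
    q_h(T_bd + cycle_years) = end_share * q_hm.
    """
    eta = 1.0 / start_share - 1.0
    theta = math.log(eta * end_share / (1.0 - end_share)) / cycle_years
    return eta, theta


def q_h(t, t_bd, q_hm, eta, theta):
    """Logistic transition function (56)."""
    return q_hm / (1.0 + eta * math.exp(-theta * (t - t_bd)))


if __name__ == "__main__":
    eta, theta = logistic_params(cycle_years=8)
    print(eta, round(theta, 3))
```

Under this reading $\eta = 9$ and $\vartheta = \ln(81)/8 \approx 0.549$, which rounds to the quoted 0.55.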

$q_{hm}$ will be estimated later. Let us consider differential Equation (55) in greater detail. The parameter $\beta$ in this equation characterizes the part of the work that is automated and given to the IM, while $(1 - \beta)$ is the remaining part of the work that is fulfilled by the human. If $\beta = 0$, Equation (55) reduces to the simplest equation

$$\dot A_h(t) = q_h(t)\, A_h(t), \tag{57}$$

which describes the growth of employee productivity without the use of digital technologies and IM. This equation is identical to the one describing the accumulation of human capital $h(t)$ during the years of work after graduation ([40], ch.10.2):

$$\dot h(t) = q_h(t) \cdot h(t). \tag{58}$$

After graduation a person has some knowledge described by the formula ([24], ch.3):

$$h = \exp(\psi \cdot u), \tag{59}$$

where $u$ is the average number of years of study; $\psi$ is the index of return on education. Empirical estimates of this index have shown that $0.06 \le \psi \le 0.1$ ([24], ch.3). It means that an additional year of studying increases human capital by 6-10%. In further calculations we shall use the lower bound: $\psi \cong 0.06$. Since in Equation (59) $u \sim t$ during the years of study, then $\dot h = \psi \exp(\psi \cdot u) = \psi h$. Hence, it follows that $q_{hm} = \dot h / h = \psi$. Thus, the maximum value of $q_h(t)$ in Formula (56), at which this function stabilizes, is:

$$q_{hm} = \psi = 0.06. \tag{60}$$

When $\beta = 1$, Equation (55) becomes:

$$\dot A_h(t) = q_h(t)\, A_d(t). \tag{61}$$

Here, the growth rate of an employee’s labour productivity depends solely on digital technological progress, since a person does not participate in work performance, but observes or, at best, controls the work of IM. For any other values $0 < \beta < 1$, Equation (55) shows that labour productivity in the economy grows in proportion to the weighted average of the employee’s productivity and the level of digital technological progress. In what follows, we shall consider three values of $\beta$:

$$\beta_1 = \frac{1}{3}; \quad \beta_2 = \frac{1}{2} \quad \text{and} \quad \beta_3 = \frac{2}{3}. \tag{62}$$

In the first case fragments of work performed by a person prevail, in the third those performed by a machine, while in the second they are equally distributed. It is important for us to know in which of these options we can speak of the maximum labour productivity.


The solution to Equation (55) has the form:

$$A_h(t) = \left[A_{h0}^{\beta} + \beta \int_{T_{bd}}^{T} q_h(t)\, A_d^{\beta}(t)\,\mathrm{d}t\right]^{1/\beta}. \tag{63}$$

The employee’s labour productivity at the initial moment, $A_{h0}$, is determined by the human capital accumulated in the learning process (59) and further practice. We assume that a highly qualified person has completed a master’s program, i.e. has studied for 18 years or more ($u = 18$). Suppose it took the employee a further 3 years with the highest return ($\psi = 0.1$) to fully master the knowledge and skills of working with IM. Then $u = 21$ and, with $\psi = 0.1$, Formula (59) gives

$$A_{h0} = 8.17. \tag{64}$$

Thus, we have obtained Formula (63), which allows us to calculate the dynamics of labour productivity growth in the digital economy for any value of the parameter $\beta$.

In practice, for a comparative analysis, it is not the trajectories of labour productivity growth (63) that are most suitable, but their growth rates $q_{Ah}(t)$, which can be most easily obtained from Equation (55):

$$q_{Ah}(t) = \frac{\dot A_h(t)}{A_h(t)} = q_h(t) \left[\frac{A_d(t)}{A_h(t)}\right]^{\beta}. \tag{65}$$
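The closed-form solution can be cross-checked against a direct numerical integration of (55). The sketch below does this with the factor $\beta$ in front of the integral, which is what separating variables in (55) gives. The trajectory `a_d` is a hypothetical stand-in (constant 1% growth from a unit level), not the paper's $A_d(t)$ from (51), and all function names are illustrative.

```python
import math

T_BD, Q_HM, ETA, THETA = 2018.0, 0.06, 9.0, 0.55   # (56), (56a), (60)
A_H0 = math.exp(0.1 * 21)                          # (59) with psi = 0.1, u = 21


def q_h(t):
    """Logistic transition function (56)."""
    return Q_HM / (1.0 + ETA * math.exp(-THETA * (t - T_BD)))


def a_d(t, a_bd=1.0, rate=0.01):
    """Hypothetical stand-in for A_d(t); NOT the trajectory computed in the paper."""
    return a_bd * math.exp(rate * (t - T_BD))


def a_h_closed(t, beta, steps=2000):
    """Closed form: [A_h0^beta + beta * int q_h A_d^beta dt]^(1/beta), trapezoid rule."""
    h = (t - T_BD) / steps
    f = lambda s: q_h(s) * a_d(s) ** beta
    inner = sum(f(T_BD + i * h) for i in range(1, steps))
    integral = h * (0.5 * f(T_BD) + inner + 0.5 * f(t))
    return (A_H0 ** beta + beta * integral) ** (1.0 / beta)


def a_h_ode(t, beta, steps=2000):
    """Integrate (55) directly with classical RK4 as an independent check."""
    h = (t - T_BD) / steps
    rhs = lambda s, a: q_h(s) * a_d(s) ** beta * a ** (1.0 - beta)
    s, a = T_BD, A_H0
    for _ in range(steps):
        k1 = rhs(s, a)
        k2 = rhs(s + h / 2, a + h * k1 / 2)
        k3 = rhs(s + h / 2, a + h * k2 / 2)
        k4 = rhs(s + h, a + h * k3)
        a += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        s += h
    return a


def growth_rate(t, beta):
    """Growth rate (65): q_Ah = q_h * (A_d / A_h)^beta."""
    return q_h(t) * (a_d(t) / a_h_closed(t, beta)) ** beta


if __name__ == "__main__":
    for beta in (1 / 3, 1 / 2, 2 / 3):
        print(beta, a_h_closed(2042.0, beta), a_h_ode(2042.0, beta))
```

The value `A_H0` evaluates to about 8.17, matching (64). In (65), whenever $A_d(t) < A_h(t)$ the factor $(A_d/A_h)^{\beta}$ shrinks as $\beta$ grows, so for equal $A_h$ a smaller machine share gives a higher growth rate; whether that condition holds depends on the actual $A_d(t)$, which this sketch does not reproduce.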

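Substituting the expressions for $A_d(t)$ (51), $A_h(t)$ (63) and $q_h(t)$ (56) into Formula (65), we get:

$$q_{Ah}(t) = \frac{q_h(t) \cdot A_d^{\beta}(t)}{A_{h0}^{\beta} + \beta \displaystyle\int_{T_{bd}}^{T} q_h(t)\, A_d^{\beta}(t)\,\mathrm{d}t},$$

where

$$q_h(t) = \frac{q_{hm}}{1 + \eta \cdot \exp\left[-\vartheta\,(t - T_{bd})\right]}. \tag{66}$$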

Here $A_{h0} = 8.17$ (64); $\eta = 9$, $\vartheta = 0.55$ (56a); $q_{hm} = 0.06$ (60).

The growth rates of labour productivity in the digital economy with the widespread and effective use of the symbiosis “human + IM”, calculated by Formula (66), are presented graphically in Figure 16 for three values of the parameter $\beta$: $\beta_1 = 1/3$; $\beta_2 = 1/2$; $\beta_3 = 2/3$. The corresponding curves of labour productivity growth in the digital economy are denoted by $q_1$, $q_2$ and $q_3$ (see Figure 16).

As is seen from the growth curves of labour productivity in Figure 16, the highest labour productivity is achieved at $\beta = 1/3$, when human labour dominates, and the lowest labour productivity is observed at $\beta = 2/3$, when the share of work performed by machines prevails. In any case, we can see that the symbiosis “human + IM” allows for effective use of the potential of digital technologies at the initial stage of the formation of the digital economy, ensuring the growth of labour productivity to potential values already in the mid-2020s (see Figure 16). At the same time, digital technologies themselves reach their full effect only from the mid-2030s, i.e. a decade later.

If we now use the predictive Formulas (52)-(54) to calculate the trajectory of GDP in the digital era (2018-2042), replacing $A_d(t)$ in the production function (52) by $A_h(t)$ (63), we get the graphs $Y_1$, $Y_2$ and $Y_3$ shown in Figure 17. As we can see directly from Figure 17, economic growth receives an additional acceleration compared to the basic trajectory of economic growth ($Y_{basic}$). Using these graphs, it is easy to calculate the rates of economic growth $q_1$, $q_2$ and $q_3$, which are shown in Figure 18. The graphs of economic growth rates (see Figure 18) demonstrate that with the effective use of the symbiosis “human + IM” ($q_1$), the growth rates of the digital economy in the 2020s may confidently exceed 3% and maintain this level in the 2030s. The US economy thus has a good chance of growing at a high rate over the long term, as it did in the 2000s.

Figure 16. Rates of technological progress in the information and digital age and labour productivity in the digital economy ($q_1, q_2, q_3$).

Figure 17. Trajectory of the US GDP movement in the digital age (2018-2042) under different dynamical regimes of the use of the symbiosis “human + IM” ($Y_1, Y_2, Y_3$).


Figure 18. Predicted rates of economic growth ($q_1, q_2, q_3$) in the digital age under different dynamical regimes of the use of the symbiosis “human + IM”.

6. Conclusions

1) For long-term forecasting of technological progress and economic growth, it is more suitable to use the Schumpeter-Kondratiev innovation and cycle theory on the formation of long waves (LW) of economic development, each of which lasts for about 30 years under the influence of a powerful cluster of innovative technologies generated by cyclically arising industrial revolutions. The Solow neoclassical economic growth model, tied to the LW, makes it possible to accurately predict the economic dynamics of technologically developed countries with the longest forecasting horizon of up to 30 years. Currently, leading countries have entered the upward phase of the 6th LW (2018-2042) under the influence of digital technologies of the 4th Industrial Revolution.

2) In the information and digital age, technological progress plays a key role among the main factors of economic growth (capital, labour and technological progress). The authors have developed an information model which allows for forecasting technological progress based on the growth rates of endogenous technological information in the economy. The main dynamical regimes of producing technological information corresponding to the eras of the information and digital economies are highlighted, and the Lagrangians that generate them are constructed. The information model of technological progress was verified on the example of the 5th information LW in the economies of developed countries (1982-2018) for the US economy and proved highly accurate. High accuracy is also obtained when predicting the trajectory of economic growth using the information model for calculating technological progress. It is important to note that forecasts of the accumulation of productive capital and of the dynamics of the number of employed in the economy are carried out according to the classical models.


3) Most of the cognitive work in the digital age will be fulfilled by people, since all of it, as a rule, is fragmented into non-programmable tasks that require creative, highly skilled human labour to solve them, and routine programmable tasks that can be automated and transferred to IM. In this regard, digital competencies and a person’s ability to work in a symbiosis with IM are in great demand. All this shows the inconsistency of numerous predictions and hypotheses that in the digital age most workplaces will be taken by IM.

4) The main driving force of the digital economy will be the symbiosis “human + IM”, which works effectively under the leadership of a person. On the basis of a mathematical model it is shown that, from the very beginning of the formation of the digital economy, the potential of digital technologies to increase labour productivity is realized precisely due to the high level of human capital and its effective interaction with IM. Moreover, it turned out that the highest labour productivity is achieved in the symbiosis “human + IM” where highly skilled human labour dominates, and the lowest labour productivity is observed where the programmed work performed by IM prevails. It is also calculated that for developed economies, which are the leaders in the formation of the digital economy, growth rates of labour productivity equal to 3% per year can be achieved by the mid-2020s, with a good chance of remaining at this level until the 2040s.

5) Since the main driving force of the digital economy is the symbiosis “human + IM”, scholars and developers need to ensure that IM is extremely friendly to people, serves to improve and complement human labour, and enhances human cognitive ability. On the other hand, the education system in the digital age should, along with the formation of deep professional knowledge and solid work skills in people, provide them with good mathematical knowledge, engineering thinking, teamwork skills and sufficient competencies in the field of digital technologies, so that future specialists can work together with IM successfully and effectively.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] Organisation for Economic Co-Operation and Development (2014) The Digital Economy, New Business Models and Key Features. In: Addressing the Tax Challenges of the Digital Economy, OECD Publishing, Paris.
[2] Inter-American Development Bank (2018) Exponential Disruption in the Digital Economy. Inter-American Development Bank, Washington DC.
[3] Goldfarb, A. and Tucker, C. (2017) Digital Economics. Working Paper No. 23684. National Bureau of Economic Research, Cambridge. http://pinguet.free.fr/nber23684.pdf https://doi.org/10.3386/w23684
[4] International Monetary Fund (2018) Measuring the Digital Economy. Staff Report. International Monetary Fund, Washington DC.
[5] Calvino, F., Criscuolo, C., Marcolini, L. and Squicciarini, M. (2018) A Taxonomy of Digital Intensive Sectors. OECD Science, Technology and Industry Working Papers, No. 2018/14, OECD Publishing, Paris.
[6] Bukht, R. and Heeks, R. (2017) Defining, Conceptualising and Measuring the Digital Economy. International Organisations Research Journal, 13, 143-172. https://doi.org/10.17323/1996-7845-2018-02-07
[7] Jabłoński, A. and Jabłoński, M. (2020) Social Business Models in the Digital Economy: New Concepts and Contemporary Challenges. Springer Nature. https://doi.org/10.1007/978-3-030-29732-9
[8] Organisation for Economic Co-Operation and Development (2020) A Roadmap toward a Common Framework for Measuring the Digital Economy. http://www.oecd.org/sti/roadmap-toward-a-common-framework-for-measuring-the-digital-economy.pdf
[9] International Federation of Robotics (2017) The Impact of Robots on Productivity, Employment and Jobs: A Positioning Paper by the International Federation of Robotics April 2017. https://ifr.org/img/office/IFR_The_Impact_of_Robots_on_Employment.pdf
[10] Brynjolfsson, E., Rock, D. and Syverson, C. (2017) Artificial Intelligence and the Modern Productivity Paradox: A Clash of Expectations and Statistics. Working Paper No. 24001. National Bureau of Economic Research, Cambridge. https://www.nber.org/papers/w24001 https://doi.org/10.3386/w24001
[11] Acemoglu, D. and Restrepo, P. (2018) Artificial Intelligence, Automation and Work. Working Paper No. 24196. National Bureau of Economic Research, Cambridge. https://www.nber.org/system/files/working_papers/w24196/w24196.pdf https://doi.org/10.3386/w24196
[12] Hernandez, K., Faith, B., Prieto Martín, P. and Ramalingam, B. (2016) The Impact of Digital Technology on Economic Growth and Productivity, and Its Implications for Employment and Equality: An Evidence Review. IDS Evidence Report 207, Institute of Development Studies, Brighton.
[13] Solomon, E.M. and van Klyton, A. (2020) The Impact of Digital Technology Usage on Economic Growth in Africa. Utilities Policy, 67, Article ID: 101104. https://doi.org/10.1016/j.jup.2020.101104
[14] Akaev, A.A. and Sadovnichiy, V.A. (2018) Mathematical Models for Calculating the Development Dynamics in the Era of Digital Economy. Doklady Mathematics, 98, 526-531. https://doi.org/10.1134/S106456241806011X
[15] Kondratiev, N.D. (1935) The Long Waves in Economic Life. Review of Economics and Statistics, 17, 105-115. https://doi.org/10.2307/1928486
[16] Menshikov, S.M. and Klimenko, L.A. (1989) Long Waves in the Economy. Mezhdunarodnye Otnosheniya, Moscow. [In Russian]
[17] Akaev, A.A. and Sadovnichiy, V.A. (2016) A Closed Dynamic Model to Describe and Calculate the Kondratiev Long Wave of Economic Development. Herald of the Russian Academy of Sciences, 86, 371-383. https://doi.org/10.1134/S1019331616050014
[18] Schumpeter, J.A. (1939) Business Cycles. A Theoretical, Historical and Statistical Analysis of the Capitalist Process. McGraw-Hill Book Company Inc., New York.
[19] Mensch, G. (1979) Stalemate in Technology. Cambridge University Press, Cambridge.
[20] Hirooka, M. (2006) Innovation Dynamism and Economic Growth. A Nonlinear Perspective. Edward Elgar, Cheltenham, Northampton. https://doi.org/10.4337/9781845428860
[21] Van Duijn, J.J. (1983) The Long Wave in Economic Life. George Allen and Unwin, London, Boston.
[22] Akaev, A.A., Sadovnichiy, V.A. and Korotaev, A.V. (2011) Huge Rise in Gold and Oil Prices as a Precursor of a Global Financial and Economic Crisis. Doklady Mathematics, 83, 243-246. https://doi.org/10.1134/S1064562411020372
[23] Akaev, A.A. and Rudskoi, A.I. (2015) A Mathematical Model for Predictive Computations of the Synergy Effect of NBIC Technologies and the Evaluation of its Influence on the Economic Growth in the First Half of the 21st Century. Doklady Mathematics, 91, 182-185. https://doi.org/10.1134/S1064562415020209
[24] Jones, Ch.I. and Vollrath, D. (2013) Introduction to Economic Growth. W.W. Norton & Company, New York, London.
[25] Akaev, A.A. and Sadovnichiy, V.A. (2019) On the Choice of Mathematical Models for Describing the Dynamics of Digital Economy. Differential Equations, 55, 729-738. https://doi.org/10.1134/S0012266119050136
[26] Piketty, T. (2014) Capital in the Twenty First Century. Harvard University Press, Cambridge and London. https://doi.org/10.4159/9780674369542
[27] Kaldor, N. (1961) Capital Accumulation and Economic Growth. In: Hague, D.C., Ed., The Theory of Economic Growth, St. Martin’s Press, New York, 177-222. https://doi.org/10.1007/978-1-349-08452-4_10
[28] Frank, M., Roehrig, P. and Pring, B. (2017) What to Do When Machines Do Everything. John Wiley & Sons Inc., New York.
[29] Brynjolfsson, E. and McAfee, A. (2014) The Second Machine Age. W.W. Norton & Company, New York and London.
[30] Akaev, A.A., Sadovnichiy, V.A. and Anufriev, I.E. (2011) Mathematic Models for the Longterm Forecasting of Demographic, Economic, Eco-Energy Development of the World/Problems of Contemporary World Futurology. Cambridge Scholars Publishing, 76-154.
[31] Barro, R.J. and Sala-i-Martin, X. (2004) Economic Growth. MIT Press, London and Cambridge.
[32] Yablonsky, A.I. (1986) Mathematical Models in Science Research. Mysl, Moscow. [In Russian]
[33] Barenblatt, G.I. (2003) Scaling. Cambridge University Press, Cambridge.
[34] Kurzweil, R. (2005) The Singularity Is Near. Viking Books, New York.
[35] Klimontovich, Y.A. (2019) Introduction to the Physics of Open Systems. Yanus-K, Moscow. [In Russian]
[36] von Foerster, H., Mora, P. and Amiot, L. (1960) Doomsday: Friday, 13 November, A.D. 2026. Science, 132, 1291-1295. https://doi.org/10.1126/science.132.3436.1291
[37] Kapitsa, S.P. (2008) An Outline of the Theory of Human Growth. Nikitsky Club, Moscow. [In Russian]
[38] Organisation for Economic Co-Operation and Development (2020) Key Short-Term Economic Indicators. https://stats.oecd.org/index.aspx?DatasetCode=KEI#
[39] Akaev, A.A. and Sadovnichiy, V.A. (2010) Mathematical Model of Population Dynamics with the World Population Size Stabilizing about a Stationary Level. Doklady Mathematics, 82, 978-981. https://doi.org/10.1134/S1064562410060360
[40] Acemoglu, D. (2009) Introduction to Modern Economic Growth. Princeton University Press, Princeton, New York.


Applied Mathematics, 2021, 12, 209-223 https://www.scirp.org/journal/am ISSN Online: 2152-7393 ISSN Print: 2152-7385

A Stochastic SVIR Model for Measles

Moussa Seydou, Ousmane Moussa Tessa

Department of Mathematics and Informatics, Abdou Moumouni University, Niamey, Niger

How to cite this paper: Seydou, M. and Moussa Tessa, O. (2021) A Stochastic SVIR Model for Measles. Applied Mathematics, 12, 209-223. https://doi.org/10.4236/am.2021.123013

Received: February 4, 2021
Accepted: March 27, 2021
Published: March 30, 2021

Copyright © 2021 by author(s) and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY 4.0). http://creativecommons.org/licenses/by/4.0/

Open Access

Abstract

In this article, we consider the construction of an SVIR (Susceptible, Vaccinated, Infected, Recovered) stochastic compartmental model of measles. We prove that the deterministic solution is asymptotically the average of the stochastic solution in the case of small population size. The choice of this model takes into account the random fluctuations inherent to the epidemiological characteristics of rural populations of Niger, notably a high prevalence of measles in children under 5, coupled with a very low immunization coverage.

Keywords

Measles, Compartmental Model, SVIR, Basic Reproductive Number, Markov Chains, Lyapunov Function, Stochastic Stability, Stochastic Simulation, Niger

1. Introduction

Measles is caused by a virus belonging to the morbillivirus group. It may infect other primates, but is largely specialized on its human host. It is transmitted by direct contact with an infected person or through the air [1] [2]. Upon infection, the patient passes through a latent period of 6 to 9 days, followed by an infective period of 6 to 7 days [3]. The infection results in either death or full recovery of the host. In the latter case, the host develops lifelong immunity. However, immunity can also be acquired by vaccination before infection. Before the introduction of the measles vaccine in 1963 and widespread vaccination, major epidemics occurred approximately every two or three years and measles caused an estimated 2.6 million deaths each year [1]. In developing countries such as Niger, measles remains one of the main causes of infant mortality: children under 5 remain the most affected, and 90% of those who die are under 5 years old [1] [4] [5] [6]. In sub-Saharan Africa, especially in areas where vaccination coverage is not optimal, the case fatality rate is among the highest, at 5% - 10%, whereas in high-income countries there is 1 death in this age group per 1000 measles cases [7] [8] [9].

DOI: 10.4236/am.2021.123013 Mar. 30, 2021 209 Applied Mathematics

M. Seydou, O. Moussa Tessa

A fundamental concept that has come out of the study of the measles transmission process is that of the basic reproduction number $R_0$. It is defined as the average number of secondary infections produced when one infected individual is introduced into a host population where everyone is susceptible [10] [11]. $R_0$ is a threshold parameter in the course of the spread of measles; indeed, if $R_0 < 1$, the disease will eventually disappear from the population, while if $R_0 > 1$, the disease can spread as an epidemic in the absence of health interventions. In a small, isolated population, a measles epidemic cannot persist [12] [13] [14], even if the basic reproduction number is initially greater than 1. Indeed, the spread of the disease eventually subsides, due to the progressive immunization of a growing proportion of the population. Thus, in such a context, measles can only be endemic through regular importation of the virus, generally by infected people from large urban centers [15].

Most models used for infectious diseases are compartmental models, originally introduced by Kermack and McKendrick, and their variants [5] [10] [16] [17]. They are based on the partition of the population into distinct classes (or compartments) according to epidemiological status. Hosts can move from one class to another (transition). In the case of the SIR (Susceptible, Infected, Recovered) model, an infection is the transition which moves an individual from the susceptible class to the infected class, and a recovery moves an infected person to the recovered compartment [5] [10] [11]. In general, the transition rate, which expresses the probability that an individual passes from one class to another per unit of time, depends essentially on the state of the system at a given moment, in particular on the number of individuals in the different compartments and on the force of infection [6] [10].

In our stochastic SVIR model, we consider the effective reproduction number $R_p$, which characterizes the vaccination effort to control the spread of the disease, where $p$ is the proportion of newborns vaccinated and immunized. In the total absence of vaccination against measles ($p = 0$), the basic reproduction number $R_0$ is estimated to lie between 10 and 18 [3] [5].

The rest of the paper is organized as follows. Section 2 describes in detail the deterministic SVIR model and the equilibrium points of its system of differential equations. In Section 3, we formulate our stochastic SVIR model by means of the Kolmogorov forward equations, and more precisely by means of a system of differential equations for the mathematical expectations of the numbers of susceptible, infected and immune (recovered and vaccinated) individuals. Section 4 is devoted to the study of the asymptotic behavior of our stochastic model, followed by numerical simulations in Section 5. Finally, in the last sections, we discuss our stochastic approach and our conclusions.

2. The Deterministic SVIR Model

In what follows, $S(t)$, $I(t)$, $R(t)$ denote respectively the numbers of susceptible, infected and immunized individuals (vaccinated susceptibles and recovered patients) at time $t$.


In this model, new susceptibles (newborns) are introduced at a constant rate $n$. A fraction $pn$ of newborns has acquired immunity by vaccination. The other fraction $(1-p)n$ remains susceptible. In addition, we assume that:
• The natural death rate is $\delta$ for each compartment.
• Infectious patients recover at the rate $\gamma$.
• Infectious patients have an additional death rate $\mu$ from measles.
• We consider the standard incidence $f(I, S) = \beta S I$, where $\beta$ is the disease transmission coefficient: the average probability of an adequate contact (contact sufficient for transmission) between an infected and a susceptible individual per unit of time.

In Figure 1, a compartmental diagram of the transitions illustrates the relationship between the three classes. The dynamics of a well-mixed population can be described by the differential equations:

$$
\begin{cases}
\dfrac{dS}{dt} = n(1-p) - \beta S I - \delta S \\[4pt]
\dfrac{dI}{dt} = \beta S I - (\delta + \mu + \gamma) I \\[4pt]
\dfrac{dR}{dt} = np + \gamma I - \delta R
\end{cases}
\tag{1}
$$

Remark. 1) In the disease-free case, system (1) admits an equilibrium point $(S_0^*, I_0^*, R_0^*)$ with

$$
S_0^* = \frac{(1-p)n}{\delta}, \quad I_0^* = 0 \quad \text{and} \quad R_0^* = \frac{np}{\delta} \tag{2}
$$

Setting $R_0 = \dfrac{\beta n}{\delta(\delta + \mu + \gamma)}$ and $R_p = (1-p)R_0$, this equilibrium point is asymptotically stable [18] if $R_p < 1$. In addition, we have $R_p < R_0$, and $R_p < 1$ if and only if $p > 1 - \dfrac{1}{R_0}$. We say that $p_c = 1 - \dfrac{1}{R_0}$ is the critical vaccination coverage of newborns.

2) If $R_p > 1$, an asymptotically stable [18] endemic equilibrium point $(S_e^*, I_e^*, R_e^*)$ appears, where

$$
S_e^* = \frac{\delta + \mu + \gamma}{\beta}, \quad I_e^* = \frac{(R_p - 1)\delta}{\beta} \quad \text{and} \quad R_e^* = \frac{np\beta + \gamma\delta(R_p - 1)}{\delta\beta} \tag{3}
$$

Figure 1. Compartment diagram of model SVIR.
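The threshold quantities and equilibria above can be checked numerically. The following is a minimal sketch, not the authors' code: the function names are illustrative, and the parameter values are the ones quoted in the simulation section of the paper.

```python
# Threshold quantities and equilibria of the deterministic SVIR system (1).
# Function names are illustrative; formulas follow the Remark, (2) and (3).


def reproduction_numbers(beta, delta, mu, gamma, n, p):
    """Return (R0, Rp, p_c) from the formulas in the Remark."""
    r0 = beta * n / (delta * (delta + mu + gamma))
    rp = (1.0 - p) * r0
    p_c = 1.0 - 1.0 / r0
    return r0, rp, p_c


def disease_free_equilibrium(delta, n, p):
    """Equilibrium (2): no infected individuals."""
    return (1.0 - p) * n / delta, 0.0, n * p / delta


def endemic_equilibrium(beta, delta, mu, gamma, n, p):
    """Equilibrium (3); epidemiologically meaningful only when Rp > 1."""
    _, rp, _ = reproduction_numbers(beta, delta, mu, gamma, n, p)
    s_e = (delta + mu + gamma) / beta
    i_e = (rp - 1.0) * delta / beta
    r_e = (n * p * beta + gamma * delta * (rp - 1.0)) / (delta * beta)
    return s_e, i_e, r_e


# Parameter values quoted in the simulation section of the paper.
params = dict(beta=0.69, delta=0.25, mu=0.02, gamma=0.5, n=3.5, p=0.51)
r0, rp, p_c = reproduction_numbers(**params)
s_e, i_e, r_e = endemic_equilibrium(**params)
print(f"R0 = {r0:.3f}, Rp = {rp:.3f}, p_c = {p_c:.3f}")
print(f"endemic equilibrium: S = {s_e:.4f}, I = {i_e:.4f}, R = {r_e:.4f}")
```

With these values $R_p$ exceeds 1, so the endemic equilibrium is the relevant one; substituting it back into the right-hand sides of (1) gives residuals that vanish up to rounding.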


3. The Continuous Stochastic SVIR Model

Let $X_t = (S(t), I(t))_{t \ge 0}$ be a continuous-time homogeneous Markov chain on the denumerable state space $\mathbb{N}^2 = \{0, 1, 2, \ldots\}^2$. First, assume that $\Delta t$ can be chosen sufficiently small such that at most one change in state occurs during the time interval $\Delta t$. In particular, there can be either a new infection, a birth, a death, or a recovery. From the state $\{X_t = (s, i)\}$, only the following states are accessible:

$$(s, i);\ (s+1, i);\ (s, i-1);\ (s-1, i);\ (s-1, i+1),$$

corresponding to the possible transitions starting from the state $(s, i)$ (see Figure 2). $X_t$ has an absorbing set corresponding to the disease-free equilibrium states

$$E_0 = \{(s, i),\ s \ge 0;\ i = 0\}.$$

Let $V(s, i)$ be the set of neighbors of state $(s, i)$:

$$V(s, i) = \{(s+1, i);\ (s-1, i+1);\ (s-1, i);\ (s, i-1)\}$$

Setting $\tau(s, i) = n(1-p) + \beta i s + \delta s + (\mu + \delta + \gamma) i$, the transition rates are defined by:

$$
\tau_{(s,i),(k,l)} =
\begin{cases}
n(1-p) & (k, l) = (s+1, i),\ s \ge 0,\ i \ge 0 \\
\beta i s & (k, l) = (s-1, i+1),\ s \ge 1,\ i \ge 0 \\
\delta s & (k, l) = (s-1, i),\ s \ge 1,\ i \ge 0 \\
(\mu + \delta + \gamma) i & (k, l) = (s, i-1),\ s \ge 0,\ i \ge 1
\end{cases}
\tag{4}
$$

The transition probabilities of $X_t = (S(t), I(t))$ are defined by

$$P_{(s,i),(k,l)}(\Delta t) = \mathbb{P}\{X_{t+\Delta t} = (k, l) \mid X_t = (s, i)\}.$$

For all $s \ge 0$ and all $i > 0$, we have

$$
P_{(s,i),(k,l)}(\Delta t) =
\begin{cases}
\tau_{(s,i),(k,l)} \Delta t + o(\Delta t) & \text{if } (k, l) \in V(s, i) \\
1 - \tau(s, i) \Delta t + o(\Delta t) & \text{if } (k, l) = (s, i)
\end{cases}
\tag{5}
$$

and, for $i = 0$, $P_{(s,0),(s,0)}(\Delta t) = 1$.

Figure 2. States transition.
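The transition rates (4) are enough to simulate one sample path of $X_t$ with a standard event-driven (Gillespie-type) scheme. The sketch below is an illustration, not the MATLAB code used by the authors: the function names, the seed and the choice to stop as soon as $I = 0$ (treating $E_0$ as absorbing, as in (5)) are assumptions.

```python
import random

# Parameter values quoted in the simulation section of the paper.
PARAMS = dict(beta=0.69, delta=0.25, mu=0.02, gamma=0.5, n=3.5, p=0.51)


def transition_rates(s, i, beta, delta, mu, gamma, n, p):
    """Rates (4) out of state (s, i), keyed by the jump (ds, di)."""
    return {
        (1, 0): n * (1.0 - p),              # birth of an unvaccinated susceptible
        (-1, 1): beta * s * i,              # infection
        (-1, 0): delta * s,                 # natural death of a susceptible
        (0, -1): (mu + delta + gamma) * i,  # death or recovery of an infected
    }


def simulate_path(s0, i0, t_max, rng, **params):
    """Simulate (S, I) until I hits 0 or until time t_max (hypothetical helper)."""
    t, s, i = 0.0, s0, i0
    path = [(t, s, i)]
    while i > 0:
        rates = transition_rates(s, i, **params)
        total = sum(rates.values())
        t += rng.expovariate(total)   # exponential holding time with rate tau(s, i)
        if t > t_max:
            break
        # Pick the jump with probability proportional to its rate.
        u = rng.random() * total
        acc = 0.0
        for jump, rate in rates.items():
            acc += rate
            if u < acc:
                break
        s, i = s + jump[0], i + jump[1]
        path.append((t, s, i))
    return path


path = simulate_path(100, 2, 26.0, random.Random(1), **PARAMS)
print(len(path) - 1, "events; final state:", path[-1])
```

Averaging $I(t)$ over many such paths on a common time grid gives the kind of Monte Carlo mean that the simulation section calls mI.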


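The distribution of $X_t$ is $P_{s,i}(t) = 0$ if $s < 0$ or $i < 0$, and $P_{s,i}(t) = \mathbb{P}\{X_t = (s, i)\}$ if $s \ge 0$, $i \ge 0$. Therefore, the marginal distributions are given by:

$$\mathbb{P}\{I(t) = i\} = \sum_{s \ge 0} P_{s,i}(t) \quad \text{and} \quad \mathbb{P}\{S(t) = s\} = \sum_{i \ge 0} P_{s,i}(t)$$

From Equation (5), we obtain the Kolmogorov forward equations, for all $s \ge 0$ and $i \ge 0$:

$$
\begin{aligned}
\frac{dP_{s,i}}{dt} ={}& n(1-p)\left[P_{s-1,i} - P_{s,i}\right] + \beta\left[(s+1)(i-1)P_{s+1,i-1} - s i P_{s,i}\right] \\
&+ (\mu + \gamma + \delta)\left[(i+1)P_{s,i+1} - i P_{s,i}\right] + \delta\left[(s+1)P_{s+1,i} - s P_{s,i}\right]
\end{aligned}
\tag{6}
$$

Hence the system of differential equations satisfied by the mathematical expectations:

$$
\begin{cases}
\dfrac{d\bar{S}}{dt} = (1-p)n - \beta \bar{S}\bar{I} - \delta \bar{S} - \beta\,\mathrm{cov}_{SI} \\[4pt]
\dfrac{d\bar{I}}{dt} = \beta \bar{S}\bar{I} - (\mu + \delta + \gamma)\bar{I} + \beta\,\mathrm{cov}_{SI} \\[4pt]
\dfrac{d\bar{R}}{dt} = np + \gamma \bar{I} - \delta \bar{R}
\end{cases}
\tag{7}
$$

where $\bar{S}(t)$, $\bar{I}(t)$ and $\bar{R}(t)$ denote the mathematical expectations of $S(t)$, $I(t)$ and $R(t)$:

$$\bar{S}(t) = \sum_{s=0}^{+\infty}\sum_{i=0}^{+\infty} s P_{s,i}(t), \qquad \bar{I}(t) = \sum_{s=0}^{+\infty}\sum_{i=0}^{+\infty} i P_{s,i}(t),$$

$$\mathrm{cov}_{SI}(t) = \sum_{s=0}^{+\infty}\sum_{i=0}^{+\infty} s i P_{s,i}(t) - \bar{S}(t)\bar{I}(t) \quad \text{and} \quad \bar{R}(t) = \sum_{r=0}^{+\infty} r\,\mathbb{P}\{R(t) = r\}$$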
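To make the role of the covariance term concrete, the sketch below computes the two means and $\mathrm{cov}_{SI}$ from a joint distribution $P_{s,i}$ at a fixed time and evaluates the right-hand side of system (7). The function names and the two-point toy distribution are illustrative assumptions, not data from the paper.

```python
def moments(dist):
    """dist maps (s, i) -> P_{s,i}(t) at a fixed time t; returns (mean S, mean I, cov_SI)."""
    mean_s = sum(s * prob for (s, i), prob in dist.items())
    mean_i = sum(i * prob for (s, i), prob in dist.items())
    cov = sum(s * i * prob for (s, i), prob in dist.items()) - mean_s * mean_i
    return mean_s, mean_i, cov


def mean_field_rhs(mean_s, mean_i, mean_r, cov, beta, delta, mu, gamma, n, p):
    """Right-hand side of system (7) for the mathematical expectations."""
    ds = (1.0 - p) * n - beta * mean_s * mean_i - delta * mean_s - beta * cov
    di = beta * mean_s * mean_i - (mu + delta + gamma) * mean_i + beta * cov
    dr = n * p + gamma * mean_i - delta * mean_r
    return ds, di, dr


# Toy distribution (an assumption): two equally likely states.
toy = {(2, 1): 0.5, (1, 2): 0.5}
mean_s, mean_i, cov = moments(toy)
print(mean_s, mean_i, cov)
```

When the covariance is zero, the first two components of (7) reduce to the deterministic equations (1), which is the observation used later in the proofs of Theorems 7 and 8.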

4. Asymptotic Behavior

In this part, we establish that the extinction of the epidemic occurs almost surely, independently of the value of $R_p$, although this is not a priori guaranteed in infinite dimension.

Let us consider the embedded process $(Y_k)_{k \in \mathbb{N}}$ of $(X_t)_{t \ge 0}$, which is a discrete Markov chain representing the sequence of values taken by $(X_t)_{t \ge 0}$ at transition times. Setting $\Delta Y_k = Y_{k+1} - Y_k$, we have:

$$
\mathbb{P}\{\Delta Y_k = (e_1, e_2) \mid Y_k = (s, i)\} =
\begin{cases}
\dfrac{n(1-p)}{\tau(s, i)} & \text{if } (e_1, e_2) = (1, 0) \\[6pt]
\dfrac{\beta s i}{\tau(s, i)} & \text{if } (e_1, e_2) = (-1, 1) \\[6pt]
\dfrac{\delta s}{\tau(s, i)} & \text{if } (e_1, e_2) = (-1, 0) \\[6pt]
\dfrac{(\mu + \gamma + \delta) i}{\tau(s, i)} & \text{if } (e_1, e_2) = (0, -1)
\end{cases}
\tag{8}
$$

Note that $\tau(s, i) = n(1-p) + \beta s i + \delta s + (\mu + \gamma + \delta) i$.

To establish our results, we need Proposition 1 and Lemmas 2 to 5, which are obtained by following the proof of the criterion of ergodicity and recurrence of Markov chains given by Rosenkrantz [19]. These assertions are essentially based on the Lyapunov-Foster ergodicity criterion, which shows that a Markov chain is positive recurrent. This criterion was subsequently extended by Meyn and Tweedie [20] [21]. The proofs of the lemmas are given in the Appendix.

Proposition 1. Set $s_0 = \max\left\{\dfrac{n(1-p)}{\delta}, \dfrac{\mu + \gamma + \delta}{\beta}\right\}$; $D_0 = \{(s, i),\ s > s_0,\ i > 0\}$, $D_1 = \{(s, 0),\ s > s_0\}$, $D_2 = \{(s_0, i),\ i > 0\}$, and let $d(s, i) = \mathbb{E}[\Delta Y_k \mid Y_k = (s, i)]$ be the drift vector, with $d_j(s, i) = d(s, i)$ for $(s, i) \in D_j$, $0 \le j \le 2$. Then

$$
d(s, i) = \left( \frac{n(1-p) - \beta s i - \delta s}{\tau(s, i)},\ \frac{\beta s i - (\mu + \gamma + \delta) i}{\tau(s, i)} \right) \tag{9}
$$
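The jump probabilities (8) and the drift (9) are straightforward to evaluate. The following sketch, with illustrative function names and the parameter values of the simulation section, computes the drift as an expectation over the four jumps and compares it with the closed form (9).

```python
PARAMS = dict(beta=0.69, delta=0.25, mu=0.02, gamma=0.5, n=3.5, p=0.51)


def tau(s, i, beta, delta, mu, gamma, n, p):
    """Total rate out of state (s, i)."""
    return n * (1.0 - p) + beta * s * i + delta * s + (mu + gamma + delta) * i


def jump_probabilities(s, i, beta, delta, mu, gamma, n, p):
    """Transition probabilities (8) of the embedded chain, keyed by (e1, e2)."""
    t = tau(s, i, beta, delta, mu, gamma, n, p)
    return {
        (1, 0): n * (1.0 - p) / t,
        (-1, 1): beta * s * i / t,
        (-1, 0): delta * s / t,
        (0, -1): (mu + gamma + delta) * i / t,
    }


def drift(s, i, **params):
    """Drift vector as the expected jump E[dY | Y = (s, i)]."""
    probs = jump_probabilities(s, i, **params)
    dx = sum(e1 * q for (e1, e2), q in probs.items())
    dy = sum(e2 * q for (e1, e2), q in probs.items())
    return dx, dy


def drift_closed_form(s, i, beta, delta, mu, gamma, n, p):
    """Closed form (9)."""
    t = tau(s, i, beta, delta, mu, gamma, n, p)
    return ((n * (1.0 - p) - beta * s * i - delta * s) / t,
            (beta * s * i - (mu + gamma + delta) * i) / t)


print(drift(10, 3, **PARAMS), drift_closed_form(10, 3, **PARAMS))
```

For these parameters $s_0 = 6.86$, so the state $(10, 3)$ lies in $D_0$; there the first drift component is negative and the second is positive.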

Lemma 2. For all $j \in \{0, 1, 2\}$, we set $d_j = d(s, i)$ where $(s, i) \in D_j$. We denote by $\psi = (n_1, d_0)$ the angle between $n_1$ and $d_0$, by $\psi_1 = (n_1, d_1)$ the angle between $n_1$ and $d_1$, and by $\psi_2 = (n_2, d_2)$ the angle between $n_2$ and $d_2$, where $n_1 = (0, 1)$ and $n_2 = (1, 0)$. We have the following results:
1) $0 < \psi < \psi_1 = \dfrac{\pi}{2} < \psi_2 \le \pi$.
2) If $R_p \le 1$: $s_0 = \dfrac{\mu + \gamma + \delta}{\beta}$, $\psi_2 = \pi$ and $\dfrac{\pi}{4} < \psi < \psi_1 = \dfrac{\pi}{2}$.
3) If $R_p > 1$: $s_0 = \dfrac{n(1-p)}{\delta}$, $\dfrac{\pi}{2} < \psi_2 < \pi$.
Proof: See Appendix.

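Definition 4.1. Let $\phi(r, \theta) = r^{\alpha} \cos(\alpha\theta - \theta_1)$ where $\alpha = \dfrac{2(\theta_1 + \theta_2)}{\pi}$, for all reals $r \ge 0$ and $\theta \in \left[0, \dfrac{\pi}{2}\right]$, with
• $\theta_1 \in \left(0, \dfrac{\pi}{4}\right)$ and $\theta_2 \in \left(\dfrac{\pi}{2} - \theta_1, \dfrac{\pi}{2}\right)$, in the case where $R_p \le 1$.
• $\theta_1 \in \left(-\dfrac{\pi}{2}, \inf\left(\psi - \dfrac{\pi}{2}, -\dfrac{\pi}{4}\right)\right)$ and $\theta_2 \in \left(-\theta_1, \dfrac{\pi}{2}\right)$, in the case where $R_p > 1$.
Here $\psi = (n_1, d_0)$ is the angle between $n_1$ and $d_0$. We say that $\phi$ is the Lyapunov function used in the study of the recurrence-transience of $X_t$.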
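As a sanity check on the function $\phi$, the sketch below evaluates it in Cartesian coordinates and compares a finite-difference gradient with the polar chain rule $D_1\phi = \cos\theta\,\phi_r - \frac{\sin\theta}{r}\phi_\theta$, $D_2\phi = \sin\theta\,\phi_r + \frac{\cos\theta}{r}\phi_\theta$. The values of $\theta_1$ and $\theta_2$ are illustrative assumptions chosen so that $1 < \alpha < 2$; they are not taken from the paper, and the function names are hypothetical.

```python
import math

# Illustrative angles (assumptions): they give alpha = 1.125.
THETA1 = math.pi / 8
THETA2 = 7 * math.pi / 16


def alpha(theta1, theta2):
    return 2.0 * (theta1 + theta2) / math.pi


def phi(x1, x2, theta1=THETA1, theta2=THETA2):
    """phi(r, theta) = r**alpha * cos(alpha*theta - theta1) in Cartesian coordinates."""
    r = math.hypot(x1, x2)
    theta = math.atan2(x2, x1)
    a = alpha(theta1, theta2)
    return r ** a * math.cos(a * theta - theta1)


def grad_phi_chain_rule(x1, x2, theta1=THETA1, theta2=THETA2):
    """Gradient from the polar chain rule."""
    r = math.hypot(x1, x2)
    theta = math.atan2(x2, x1)
    a = alpha(theta1, theta2)
    phi_r = a * r ** (a - 1) * math.cos(a * theta - theta1)
    phi_theta = -a * r ** a * math.sin(a * theta - theta1)
    d1 = math.cos(theta) * phi_r - math.sin(theta) / r * phi_theta
    d2 = math.sin(theta) * phi_r + math.cos(theta) / r * phi_theta
    return d1, d2


def grad_phi_numeric(x1, x2, h=1e-6):
    """Central finite differences, used only as a cross-check."""
    d1 = (phi(x1 + h, x2) - phi(x1 - h, x2)) / (2 * h)
    d2 = (phi(x1, x2 + h) - phi(x1, x2 - h)) / (2 * h)
    return d1, d2


print(alpha(THETA1, THETA2), phi(3.0, 4.0))
print(grad_phi_chain_rule(3.0, 4.0), grad_phi_numeric(3.0, 4.0))
```

The same check can be repeated with angles giving $0 < \alpha < 1$ to explore the regime that the Remark below associates with $R_p > 1$.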

Remark. If $R_p \le 1$, we obtain $1 < \alpha < 2$, whereas if $R_p > 1$, $0 < \alpha < 1$.

Lemma 3. Let $\phi$ be the Lyapunov function. For all reals $r \ge 0$ and $\theta \in \left[0, \dfrac{\pi}{2}\right]$, we have the following results:
1) $\nabla\phi(r, \theta) \cdot d_0 < 0$.
2) There are real constants $C_0$ and $C_1$ such that, uniformly in $\theta$, we have:
a) $\limsup_{r \to +\infty} r^{1-\alpha}\, \nabla\phi(r, \theta) \cdot d_0 \le C_0 < 0$
b) $\limsup_{r \to +\infty} r^{2-\alpha}\, |D_{lj}\phi(r, \theta)| \le C_1$
c) $\limsup_{r \to +\infty} \phi(r, \theta) = +\infty$
3) $\limsup_{r \to +\infty} r^{1-\alpha}\, \nabla\phi(r, 0) \cdot d_1 \le C_0 < 0$ and $\limsup_{r \to +\infty} r^{1-\alpha}\, \nabla\phi(r, \pi/2) \cdot d_2 \le C_0 < 0$.

Here $D_{lj}\phi(r, \theta)$ denote the second partial derivatives of $\phi(r, \theta)$ with respect to $x_l$ $(l = 1, 2)$ and $x_j$ $(j = 1, 2)$; $r$ and $\theta$ are the polar coordinates of $x = (x_1, x_2)$.
Proof: See Appendix.

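Remark. Let $x = (x_1, x_2) \in D_j$, $0 \le j \le 2$, and $A_j(x) = (\Delta Y_k \mid Y_k = x)$; we obtain $d_j(x) = \mathbb{E}[A_j(x)]$ and $A_j(x) \in \{(1, 0), (-1, 1), (-1, 0), (0, -1)\}$.

On $\{Y_k = (s, i)\}$ we have $\mathbb{P}\{\|\Delta Y_k\|^2 = 2\} = 1 - \mathbb{P}\{\|\Delta Y_k\|^2 = 1\} = \dfrac{\beta s i}{\tau(s, i)}$, and therefore $\mathbb{E}\left[\|A_0(s, i)\|^2\right] = 1 + \dfrac{\beta s i}{\tau(s, i)} < 2$; likewise $\mathbb{E}\left[\|A_j(x)\|^2\right] \le 2$ for $j = 1, 2$.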

An immediate consequence of Lemma 3 is:

Lemma 4. Let $x = (x_1, x_2)$ and $y = (y_1, y_2)$ be two vectors of the plane, with $|x| = \sqrt{x_1^2 + x_2^2}$ and $x \cdot y = x_1 y_1 + x_2 y_2$. Then there are $\varepsilon > 0$ and $K > 0$ such that
1) If $R_p > 1$, then $\forall\, |x| \ge K$, $\mathbb{E}[\phi(Y_{k+1}) - \phi(Y_k) \mid Y_k = x] \le 0$.
2) If $R_p \le 1$, then $\forall\, |x| \ge K$, $\mathbb{E}[\phi(Y_{k+1}) - \phi(Y_k) \mid Y_k = x] \le -\varepsilon$.
Proof: See Appendix.

Lemma 5. Let $(Y_k)_{k \in \mathbb{N}}$ be the embedded process of $(X_t)_{t \ge 0}$, which is a discrete Markov chain representing the sequence of values taken by $(X_t)_{t \ge 0}$ at transition times. Then
1) If $R_p \le 1$, then the Markov chain $(Y_k)_{k \in \mathbb{N}}$ is positive recurrent.
2) If $R_p > 1$, then the Markov chain $(Y_k)_{k \in \mathbb{N}}$ is null recurrent.
Proof: See Appendix.

We can now state our main results:

Theorem 6. Let $T_0 = \inf\{t \ge 0,\ I(t) = 0\}$ with $\inf \emptyset = +\infty$. Then, for all $i \in \mathbb{N}^*$, $\mathbb{P}_i[T_0 < +\infty] = 1$ and $\mathbb{P}_i\left[\lim_{t \to +\infty} I(t) = 0\right] = 1$.
Proof: This result is a consequence of Lemma 5 and of the properties of recurrent Markov chains with a nonempty absorbing set of states (see [22], Proposition 5-15). It reflects the absorbing nature of the Markov chain. □

Theorem 7. Let $T_0 = \inf\{t \ge 0,\ I(t) = 0\}$ with $\inf \emptyset = +\infty$ and

$$S_0^* = \frac{(1-p)n}{\delta}, \quad I_0^* = 0, \quad R_0^* = \frac{np}{\delta}.$$

If $R_p \le 1$, then (1) $\mathbb{E}[T_0] < +\infty$ and (2) $\lim_{t \to +\infty} \left(\bar{S}(t), \bar{I}(t), \bar{R}(t)\right) = (S_0^*, I_0^*, R_0^*)$, where the bars denote mathematical expectations.
Proof: The first result reflects the positive recurrence obtained from Lemma 5. The second assertion follows from the fact that the Markov chain is absorbing and that, once in the absorbing state, the covariance between $S(t)$ and $I(t)$ is identically zero. Therefore, asymptotically, the deterministic equations and the equations for the mathematical expectations have the same equilibrium points. □


Theorem 8. Let $T_0 = \inf\{t \ge 0,\ I(t) = 0\}$, $\inf \emptyset = +\infty$ and

$$S_e^* = \frac{\delta + \mu + \gamma}{\beta}, \quad I_e^* = \frac{(R_p - 1)\delta}{\beta}, \quad R_e^* = \frac{np\beta + \gamma\delta(R_p - 1)}{\delta\beta}.$$

If $R_p > 1$, then (1) $\mathbb{E}[T_0] = +\infty$ and (2) $\lim_{t \to +\infty} \left(\bar{S}(t), \bar{I}(t), \bar{R}(t)\right) = (S_e^*, I_e^*, R_e^*)$, where the bars denote mathematical expectations.
Proof: The first assertion is proved by observing that there are asymptotically two distinct equilibrium points, and necessarily $\mathbb{E}(T_0) = +\infty$ in the case $R_p > 1$; otherwise the two equilibrium points would coincide, by uniqueness of the stationary measure. The proof of the second assertion is similar to that of the second assertion of Theorem 7. □

5. Simulation

In what follows, we denote by $\bar{I}$ and dI the numerical solutions of Equations (7) and (1) respectively. The average of the simulated realizations of the number of infected $I(t)$ is denoted by mI. We used MATLAB for the Monte Carlo simulations and R for the graphics.

Consider an initial population of $S_0 = 100$ susceptibles with an initial number of $I_0 = 2$ infected, for the following values of the parameters:

$$\beta = 0.69;\ \delta = 0.25;\ \mu = 0.02;\ \gamma = 0.5;\ n = 3.5;\ p = 0.51;\ R_p = 6.15$$

In Figure 4, we have the estimate of the covariance $\mathrm{cov}_{SI}(t)$ from 50 simulations. Figure 5 gives a comparison of $\bar{I}$, dI and mI on the time interval $[0, 26]$. The time interval is then varied for the same values of the parameters. It appears that for large values of $t$, we obtain $\bar{I}(t) \approx dI(t) \approx I_e^*$, where $I_e^*$ is the endemic equilibrium of Equation (1), the expected asymptotic value when $R_p > 1$. For the considered values of the parameters, the endemic equilibrium value is $I_e^* = 1.8659$. The simulations gave the following values:

$\bar{I}(26) \approx 1.8648$, $dI(26) \approx 1.8650$, $mI(26) \approx 0$ (see Figure 5)
$\bar{I}(52) \approx 1.8650$, $dI(52) \approx 1.8650$, $mI(52) \approx 0$ (see Figure 6)

Figure 3 shows two sample paths of $I(t)$, their mean and the deterministic solution. Figure 4 shows the estimated covariance function from 50 sample paths of $X_t = (S(t), I(t))$, and Figure 5 shows the deterministic solution (dI), the solution of the mathematical expectations ($\bar{I}$) and the mean of 50 sample paths of $I(t)$ (mI). All three figures use the following values:

$$S_0 = 100;\ I_0 = 2;\ \beta = 0.69;\ \delta = 0.25;\ \mu = 0.02;\ \gamma = 0.5;\ n = 3.5;\ p = 0.51;\ t \in [0, 26];\ R_p = 6.15$$
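The deterministic curve dI can be reproduced approximately with any standard ODE integrator. The sketch below uses a hand-written classical Runge-Kutta (RK4) step in pure Python rather than the authors' MATLAB code; the step size, the function names and the initial value $R(0) = 0$ are assumptions (the paper does not state $R(0)$, and $I(t)$ does not depend on it).

```python
PARAMS = dict(beta=0.69, delta=0.25, mu=0.02, gamma=0.5, n=3.5, p=0.51)


def svir_rhs(state, beta, delta, mu, gamma, n, p):
    """Right-hand side of the deterministic system (1)."""
    s, i, r = state
    return (n * (1.0 - p) - beta * s * i - delta * s,
            beta * s * i - (delta + mu + gamma) * i,
            n * p + gamma * i - delta * r)


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(x + 0.5 * dt * k for x, k in zip(state, k1)))
    k3 = f(tuple(x + 0.5 * dt * k for x, k in zip(state, k2)))
    k4 = f(tuple(x + dt * k for x, k in zip(state, k3)))
    return tuple(x + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate(state, t_end, dt=0.005, **params):
    """Integrate system (1) from t = 0 to t_end with a fixed step (an assumption)."""
    f = lambda y: svir_rhs(y, **params)
    for _ in range(round(t_end / dt)):
        state = rk4_step(f, state, dt)
    return state


initial = (100.0, 2.0, 0.0)  # S0 and I0 from the paper; R(0) = 0 is assumed
s26, i26, r26 = integrate(initial, 26.0, **PARAMS)
s52, i52, r52 = integrate(initial, 52.0, **PARAMS)
print(f"dI(26) = {i26:.4f}, dI(52) = {i52:.4f}")
```

Under these assumptions the infected component settles close to the endemic value $I_e^*$ well before $t = 26$, which is in line with the values of dI reported above.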


Figure 3. Two sample paths of I(t), their mean (the average of the simulated values calculated at each instant, an estimate of the mathematical expectation of I(t)) and the deterministic solution.

Figure 4. Estimated covariance function from 50 sample paths of $X_t = (S(t), I(t))$.

Figure 5. Deterministic solution (dI), solution of mathematical expectations ($\bar{I}$) and mean of 50 sample paths of I(t) (mI).

In Figure 6, the mean of 50 sample paths of $I(t)$ and the solution of the mathematical expectations ($\bar{I}$) for the previous parameter values are compared to the deterministic solution (dI) over the time interval $[0, 52]$.


Figure 6. Deterministic solution (dI), solution of mathematical expectations ($\bar{I}$) and mean of 50 sample paths of I(t) (mI) for the previous values but with $t \in [0, 52]$.

6. Discussions

This paper presents a stochastic compartmental SVIR model of measles. A comparison of our stochastic model with the corresponding deterministic model indicates that the deterministic solution is asymptotically the mean of the stochastic solution. It is well known that mI, obtained by random sampling (Monte Carlo methods) before extinction, is an estimate of $\bar{I}$. Our result shows that the three trajectories of $\bar{I}$, dI and mI asymptotically coincide.

In addition, unlike in the deterministic approach, we show that the epidemic goes extinct with probability 1, independently of the threshold $R_p$. More precisely, if $R_p \le 1$ extinction occurs in a time of finite mean, and if $R_p > 1$ the disease eventually disappears in a time of infinite mean.

One of the peculiarities of our model is that the size of the population is not constant and can be quite large. The extinction of the process in this case is not guaranteed, unlike in the case where the size of the population is constant. This led us to focus on the probability of absorption of the process.

On the other hand, when $R_0 > 1$, it is well known for the constant-population SIR model [23] that the average duration of the epidemic increases exponentially with the size of the population. This fact is consistent with assertion (1) of Theorem 8: extinction occurs in a time of infinite mean when $R_p > 1$.

7. Conclusions

To understand the dynamics of the system before absorption, a commonly used measure is the quasi-stationary distribution [24]. The term quasi-stationarity refers to the distribution of the Markov chain conditioned on the event that absorption has not yet occurred. It gives a good measure of the behavior before absorption when the absorption time is very long.

If the set of transient states is finite and irreducible, it is well known that the quasi-stationary distribution exists [25]. But if this set is infinite, the existence of a quasi-stationary distribution is not guaranteed, and even if it does exist, it is practically impossible to determine it explicitly. To elucidate this situation, an extension of our work would be the study of the process in the quasi-stationary regime.

The emergence of epidemics often reveals complex dynamic relationships between susceptible individuals, pathogens and their environments. The complex dynamic relationships that result in seasonal epidemic cycles vary over time [26]. In Niger, recent studies [7] reveal two main periodicities of measles: a more pronounced annual periodicity, probably due to seasonal agricultural labour migration, and a weak and unstable periodicity of 2 to 3 years, which is partly explained by heterogeneity in vaccination coverage. To account for this temporal and environmental variability, it would be necessary to extend our study to the analysis of time series of cases of infection. The stochastic approach takes better account of these temporal and environmental fluctuations and may provide a framework to improve our understanding of the complex dynamics of measles epidemics.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] World Health Organization (2019) Fact about Measles.
[2] Kitengeso, R.E., et al. (2015) A Mathematical Model for Control and Elimination of the Transmission Dynamics of Measles. Applied and Computational Mathematics, 4, 396-408. https://doi.org/10.11648/j.acm.20150406.12
[3] Modèles compartimentaux en épidémiologie - Wikipédia, Oct 2019.
[4] Alkassoum, S., et al. (2016) Surveillance épidémiologique de la rougeole au Niger: Analyse de la base de données des maladies à déclaration obligatoire (MDO) de 2003 à 2015. International Journal of Innovation and Scientific Research, 17, 264-274.
[5] Mitku, S.N. and Koya, P.R. (2017) Mathematical Modeling and Simulation Study for the Control and Transmission Dynamics of Measles. American Journal of Applied Mathematics, 5, 99-107. https://doi.org/10.11648/j.ajam.20170504.11
[6] Moussa Tessa, O. (2006) Mathematical Model for Control of Measles by Vaccination. Proceedings of MSAS 06 Conference, Bamako, August 2006, 31-36.
[7] Blake, A., Djibo, A., Guindo, O. and Bharti, N. (2020) Investigating Persistent Measles Dynamics in Niger and Associations with Rainfall. Journal of the Royal Society Interface, 17, 20200480. https://doi.org/10.1098/rsif.2020.0480
[8] Moss, W.J. (2017) Measles. The Lancet, 390, 2490-2502. https://doi.org/10.1016/S0140-6736(17)31463-0
[9] Moss, W.J. (2007) Measles Still Has a Devastating Impact in Unvaccinated Populations. PLOS Medicine, 4, e24. https://doi.org/10.1371/journal.pmed.0040024
[10] Hethcote, H.W. (2000) The Mathematics of Infectious Diseases. SIAM Review, 42, 599-653. https://doi.org/10.1137/S0036144500371907
[11] van den Driessche, P. and Watmough, J. (2002) Reproduction Numbers and Sub-Threshold Endemic Equilibria for Compartmental Models of Disease Transmission. Mathematical Biosciences, 180, 29-48. https://doi.org/10.1016/S0025-5564(02)00108-6
[12] Finkenstädt, B.F. (2002) A Stochastic Model for Extinction and Recurrence of Epidemics: Estimation and Inference for Measles Outbreaks. Biostatistics, 3, 493-510. https://doi.org/10.1093/biostatistics/3.4.493
[13] Anderson, R.M. and Rhodes, C.J. (1996) Power Laws Governing Epidemics in Isolated Populations. Nature, 381, 600-602. https://doi.org/10.1038/381600a0
[14] Anderson, R.M., Jensen, H.J. and Rhodes, C.J. (1997) On the Critical Behavior of Simple Epidemics. Proceedings of the Royal Society of London B, 264, 1639-1646. https://doi.org/10.1098/rspb.1997.0228
[15] Cliff, A.D., Haggett, P. and Smallman-Raynor, M. (1993) Measles: An Historical Geography of a Major Human Viral Disease from Global Expansion to Local Retreat, 1840-1990. Blackwell, Oxford.
[16] Kitengeso, R.E. (2016) Stochastic Modelling of the Transmission Dynamics of Measles with Vaccination Control. International Journal of Theoretical and Applied Mathematics, 2, 60-73.
[17] Graham, M., et al. (2019) Measles and the Canonical Path to Elimination. Science, 364, 584-587. https://doi.org/10.1126/science.aau6299
[18] Lahrouz, A., Omari, L. and Kiouach, D. (2011) Global Analysis of a Deterministic and Stochastic Nonlinear SIRS Epidemic Model. Nonlinear Analysis: Modelling and Control, 16, 59-76. https://doi.org/10.15388/NA.16.1.14115
[19] Rosenkrantz, W.A. (1989) Ergodicity Conditions for Two-Dimensional Markov Chains on the Positive Quadrant. Probability Theory and Related Fields, 83, 309-319. https://doi.org/10.1007/BF00964367
[20] Meyn, S.P. and Tweedie, R.L. (1992) Stability of Markovian Processes I: Criteria for Discrete-Time Chains. Advances in Applied Probability, 24, 542-574. https://doi.org/10.1017/S000186780002440X
[21] Meyn, S.P. and Tweedie, R.L. (1993) Markov Chains and Stochastic Stability. Springer-Verlag, London. https://doi.org/10.1007/978-1-4471-3267-7
[22] Kemeny, J.G., Laurie Snell, J. and Knapp, A.W. (1982) Denumerable Markov Chains. Springer-Verlag, New York.
[23] Allen, J.S. and Burgin, M. (2000) Comparison of Deterministic and Stochastic SIS and SIR Models in Discrete Time. Mathematical Biosciences, 163, 1-33. https://doi.org/10.1016/S0025-5564(99)00047-4
[24] Darroch, J.N. and Seneta, E. (1967) On Quasi-Stationary Distributions in Absorbing Continuous-Time Finite Markov Chains. Journal of Applied Probability, 4, 192-196. https://doi.org/10.1017/S0021900200025341
[25] Artalejo, J.R. and Lopez-Herrero, M.J. (2010) Quasi-Stationary and Ratio of Expectations Distributions: A Comparative Study. Journal of Theoretical Biology, 266, 264-274. https://doi.org/10.1016/j.jtbi.2010.06.030
[26] Bartlett, M.S. (1992) Measles Periodicity and Community Size. Journal of the Royal Statistical Society: Series A, 120, 48-70. https://doi.org/10.2307/2342553
[27] Foster, F.G. (1982) On the Stochastic Matrices Associated with Certain Queueing Processes. Annals of Mathematical Statistics, 24, 295-308.


Appendix

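Proof of Lemma 2:

The lemma is a consequence of the definition of $R_p$ and of the expressions of $d_0$, $d_1$, $d_2$:

$$
\begin{aligned}
d_0 &= \left( \frac{n(1-p) - \beta s i - \delta s}{n(1-p) + \beta s i + \delta s + (\mu + \gamma + \delta) i},\ \frac{\beta s i - (\mu + \gamma + \delta) i}{n(1-p) + \beta s i + \delta s + (\mu + \gamma + \delta) i} \right), \\
d_1 &= \left( \frac{n(1-p) - \delta s}{n(1-p) + \delta s},\ 0 \right), \\
d_2 &= \left( \frac{n(1-p) - \beta s_0 i - \delta s_0}{n(1-p) + \beta s_0 i + \delta s_0 + (\mu + \gamma + \delta) i},\ \frac{\beta s_0 i - (\mu + \gamma + \delta) i}{n(1-p) + \beta s_0 i + \delta s_0 + (\mu + \gamma + \delta) i} \right)
\end{aligned}
\tag{10}
$$

We can easily determine the signs of the abscissas and ordinates of $d_j$. Indeed, with

$$s_0 = \max\left\{\frac{n(1-p)}{\delta}, \frac{\mu + \gamma + \delta}{\beta}\right\}; \qquad R_p = \frac{\beta n(1-p)}{\delta(\mu + \gamma + \delta)},$$

we have:

1) $d_{0x} < 0 < d_{0y}$; $d_{1y} = 0$; $d_{2x} < 0 \le d_{2y}$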

2) If Rp ≤ 1 we have: 0

3) If Rp > 1 we have: 0

To establish the result, we distinguish the two cases Rp ≤ 1 and Rp > 1 .

• If Rp ≤1 , the angle between d0 and ∇φ (r,θ ) is π ππ a(θ) =− θ1 αθ − ψ +− θ  . From −−−≤θψ21a( θ) ≤( θψ −) −, the  2  22 α −1 angle between d1 and ∇=φ (rr,0) α( cos θθ11 ,sin ) is equal to

a11=θ − π . furthermore, we show that the angle between d2 and α −1 ∇=φ (rr,π 2) α( cos θθ22 , − sin ) is equal to a22=−−θ π .

The choice of θ1 and θ2 allows to have: 5π 33 ππ 3ππ − <θ −π <−, − <−θθ −π <− <− 41 42 212 2 π 53πππ and for any θθ,0<<, we obtain − 1 , we have good θψ−−≤a( θ) ≤ θψ − et a =θ − π . p 112 11

For d2 , we find a2=−−θψ 22. The choice of θ1 and θ2 leads to 33ππππ − <θ −ππ <−et − <−θψ − <− θ − <− 21 2 22 222 π 3πππ and for any θθ,0<< et − <−−≤θψa( θ) ≤− θψ<− . Thus 2 2211 2


− 2  π  cosaa1< , cos 22 < cos −−θ <0etcos(a(θ)) < cos( θψ1 −) < 0. 22

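Finally, for any value of $R_p$ and for any $0 \le \theta \le \pi/2$, we deduce that

$$\nabla\phi(r, \theta) \cdot d_0 < 0, \quad \nabla\phi(r, 0) \cdot d_1 < 0 \quad \text{and} \quad \nabla\phi(r, \pi/2) \cdot d_2 < 0. \tag{11}$$

Assertions 1, 2(a) and 3 follow. To establish assertion 2(b), we consider the partial derivatives of $\phi$ with respect to $x_1$ and $x_2$:

$$
\begin{aligned}
D_1\phi &= \cos\theta\, \phi_r - \frac{\sin\theta}{r}\, \phi_\theta = \alpha r^{\alpha - 1} \cos((\alpha - 1)\theta - \theta_1) \\
D_2\phi &= \sin\theta\, \phi_r + \frac{\cos\theta}{r}\, \phi_\theta = -\alpha r^{\alpha - 1} \sin((\alpha - 1)\theta - \theta_1)
\end{aligned}
\tag{12}
$$

where $\phi_r$ and $\phi_\theta$ are the partial derivatives of $\phi$ with respect to $r$ and $\theta$. The matrix of second partial derivatives of $\phi$ is:

$$
D_{lj}\phi = \alpha(\alpha - 1) r^{\alpha - 2}
\begin{pmatrix}
\cos((\alpha - 2)\theta - \theta_1) & -\sin((\alpha - 2)\theta - \theta_1) \\
-\sin((\alpha - 2)\theta - \theta_1) & -\cos((\alpha - 2)\theta - \theta_1)
\end{pmatrix}
\tag{13}
$$

so that each entry is bounded in absolute value by $\alpha|\alpha - 1| r^{\alpha - 2}$. Assertion 2(c) follows from the definition of $\phi$ and the fact that $\cos(\alpha\theta - \theta_1) > 0$; indeed, for any $\theta$ with $0 \le \theta \le \dfrac{\pi}{2}$, we have $-\dfrac{\pi}{2} < \alpha\theta - \theta_1 < \dfrac{\pi}{2}$. Hence Lemma 3. □

Proof of Lemma 4:

The proof is analogous to that of Theorem 3 of [19]. The Taylor formula for the function $\phi$ is:

$$\phi(x + h) - \phi(x) = \nabla\phi(x) \cdot h + R(x, h)$$

where $h = (h_1, h_2)$ and $R(x, h) = \dfrac{1}{2} \sum_{l, j = 1, 2} D_{lj}\phi(x + \eta h)\, h_l h_j$ is the Taylor remainder, with $0 < \eta < 1$.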

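For $l \in \{0, 1, 2\}$, when we replace $h$ by $A_l(x)$, we get:

$$\mathbb{E}[\phi(Y_{k+1}) - \phi(Y_k) \mid Y_k = x \in D_l] = \nabla\phi(x) \cdot d_l + \mathbb{E}[R(x, A_l(x))]$$

Applying Lemma 3 and the Remark that follows it, we have

$$\mathbb{E}[\phi(Y_{k+1}) - \phi(Y_k) \mid Y_k = x \in D_l] = \nabla\phi(x) \cdot d_l + O\left(|x|^{\alpha - 2}\right)$$

• If $R_p \le 1$, we have $1 < \alpha < 2$ and $\limsup_{|x| \to \infty} |x|^{-\alpha + 1}\, \nabla\phi(x) \cdot d_l \le C_0 < 0$. Therefore:

$$\exists\, \varepsilon > 0 \text{ and } K > 0, \quad \mathbb{E}[\phi(Y_{k+1}) - \phi(Y_k) \mid Y_k = x] \le -\varepsilon \quad \forall\, |x| \ge K.$$

• If $R_p > 1$, it turns out that $0 < \alpha < 1$, and we can only conclude that

$$\mathbb{E}[\phi(Y_{k+1}) - \phi(Y_k) \mid Y_k = x] \le 0 \quad \forall\, |x| \ge K.$$

This completes the proof. □

Proof of Lemma 5:

Let us show the recurrence in the case $R_p > 1$.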

We set $B = \{x : |x| \le K\}$, $T = \inf\{k \ge 0,\ Y_k \in B\}$ and $Z_k = \phi(Y_k)\, \mathbf{1}_{\{T > k\}}$, where $\mathbf{1}_A$ denotes the indicator function of $A$. Let $(\mathcal{F}_k)_{k \in \mathbb{N}}$ be the filtration associated with $(Y_k)_{k \in \mathbb{N}}$. Knowing that $\mathbf{1}_{\{T > k+1\}} \le \mathbf{1}_{\{T > k\}}$, we can write:

$$\mathbb{E}[Z_{k+1} \mid \mathcal{F}_k] \le \mathbb{E}\left[\phi(Y_{k+1})\, \mathbf{1}_{\{T > k\}} \mid \mathcal{F}_k\right] = \mathbf{1}_{\{T > k\}}\, \mathbb{E}[\phi(Y_{k+1}) \mid \mathcal{F}_k] \le \mathbf{1}_{\{T > k\}}\, \phi(Y_k) = Z_k.$$

In this last expression, the last inequality is obtained from the assertion of Lemma 4 for the case $R_p > 1$. Hence $(Z_k)_{k \in \mathbb{N}}$ is a positive supermartingale and therefore $\mathbb{P}\left[\lim_{k \to +\infty} Z_k = 0\right] = 1$.

On the other hand, because the Markov chain $(Y_k)_{k \in \mathbb{N}}$ is irreducible, we have $\mathbb{P}\left[\limsup_{k \to +\infty} |Y_k| = \infty\right] = 1$. In this case, on $\{T = +\infty\}$, it follows that $\lim_{k \to +\infty} Z_k = \lim_{k \to +\infty} \phi(Y_k) = +\infty$, thus $\mathbb{P}[T = +\infty] = 0$. In other words, the finite set $B$ is visited an infinite number of times by the Markov chain $(Y_k)_{k \in \mathbb{N}}$, which corresponds to recurrence. Finally, the positive recurrence in the case $R_p \le 1$ is a consequence of the assertion of Lemma 4 for that case and of Foster's positive recurrence criterion [27]. □


Applied Mathematics, 2021, 12, 224-239 https://www.scirp.org/journal/am ISSN Online: 2152-7393 ISSN Print: 2152-7385

An Oracle Bone Inscription Detector Based on Multi-Scale Gaussian Kernels

Guoying Liu1, Shuanghao Chen2*, Jing Xiong1, Qingju Jiao1

1School of Computer and Information Engineering, Anyang Normal University, Anyang, China 2School of Computer and Engineering, Zhengzhou University, Zhengzhou, China

How to cite this paper: Liu, G.Y., Chen, S.H., Xiong, J. and Jiao, Q.J. (2021) An Oracle Bone Inscription Detector Based on Multi-Scale Gaussian Kernels. Applied Mathematics, 12, 224-239. https://doi.org/10.4236/am.2021.123014

Received: February 22, 2021
Accepted: March 27, 2021
Published: March 30, 2021

Copyright © 2021 by author(s) and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY 4.0). http://creativecommons.org/licenses/by/4.0/
Open Access

Abstract

The detection of Oracle Bone Inscriptions (OBIs) is one of the most fundamental tasks in the study of Oracle Bone, which aims to locate the positions of OBIs on rubbing images. The existing methods are based on the scheme of anchor boxes, involving complex network design and a great number of anchor boxes. In order to overcome the problem, this paper proposes a simpler but more effective OBIs detector by using an anchor-free scheme, where shape-adaptive Gaussian kernels are employed to represent the spatial regions of different OBIs. More specifically, to address the problem of misdetection caused by regional overlapping between some tightly distributed OBIs, the character regions are simultaneously represented by multiscale Gaussian kernels to obtain regions with sharp edges. Besides, based on the kernel predictions of different scales, a novel post-processing pipeline is used to obtain accurate predictions of bounding boxes. Experiments show that our OBIs detector has achieved significant results on the OBIs dataset, which greatly outperforms several mainstream object detectors in both speed and efficiency. Dataset is available at http://jgw.aynu.edu.cn.

Keywords Oracle Bone Inscriptions, Deep Learning, Object Detection, Hourglass Network

1. Introduction

Oracle Bone Inscriptions (OBIs) are of the oldest and the most mysterious an- cient characters in china, which record a large number of unknown ancestors’ lives, thoughts, and social states about 3600 years ago. They are very important historical materials for understanding the emergence and development of an-

DOI: 10.4236/am.2021.123014 Mar. 30, 2021 224 Applied Mathematics

G. Y. Liu et al.

cient China. The cues of OBIs’ locations are valuable for the interpretation of these ancient characters. Therefore, the detection of OBIs is of the most funda- mental tasks in the field of Oracle Bone study, which tries to locate the positions of OBIs on rubbing images. At present, few people pay attention to the automat- ic detection of OBIs, and OBI experts have to locate the OBIs only according to their knowledges and experiences, which is rather boring and time-consuming. In this paper, we mainly focus on the automatic detection of OBIs and attempt to explore a simple but efficient method to find out the precise positions of OBIs on rubbing images. Currently, there are only a few methods for the OBIs detection task in the field of image processing. For example, Meng [1] build a single-stage OBIs detector via extending SSD300 to SSD1024. Wang [2] introduced a region-based full convolutional network and proposed a novel auxiliary detection algorithm based on character recognition, which can help the detection model reduce the false positive of cracks. In our earlier works [3] [4], we also did some simple explora- tions on the OBIs detection. We applied several state-of-art object detection models on OBIs dataset and compared and analyzed their detection results. Later, based on the statistical characteristics of the characters in scale size, we redesigned the size and aspect ratio of the anchor and proposed and Spatial Block to stabilize the features and alleviate noise interference during training. Although these methods have achieved good detection results on the OBIs da- taset, there are still certain limitations in accuracy and efficiency. First, due to the lack of character-level class labels in the OBIs dataset, the semantic informa- tion of the character is not easily captured through position regression. 
So, some special characters may be mis-detected by the detection model. For example, some compound characters composed of multiple parts are easily mis-detected as multiple characters, as shown in Figure 1 (Left). Similarly, multiple characters are easily detected as one compound character, as shown in Figure 1 (Right). Second, most algorithms are based on anchor boxes, which involve complex network design and require a large number of anchor boxes; for example, the number of anchor boxes exceeds 40 k in DSSD [6] and 100 k in RetinaNet [7]. To some extent, this reduces the detection efficiency of the model. In this work, our main goal is to explore a simpler OBIs detector and improve the detection accuracy.

We are motivated by the recently proposed CRAFT (Character Region Awareness for Text Detection) [8]. This work uses adaptively shaped Gaussian kernels to represent character regions, so that the detection of text instances is converted to the prediction of the corresponding Gaussian map. Thus, it not only bypasses the need for anchor boxes but also enables the detection model to learn character spatial regions. In our work, we follow this formulation: we represent each Oracle Bone Character region by an adaptively shaped Gaussian kernel and directly output the Gaussian prediction of the character region, as shown in Figure 2. However, experiments show that the Gaussian kernel representation performs well only when dealing with character regions that are not rigidly bounded, and it is prone to regional overlapping for some tightly distributed oracle characters, as shown in Figure 3. To overcome this problem, we represent a single character using Gaussian kernels of multiple scales simultaneously, where the smaller the scale, the larger the margin between character regions. Then, based on these kernel predictions, a progressive scale expansion strategy is used to obtain accurate character bounding boxes. Experimental results show that, compared with some state-of-the-art object detectors, our character detector based on multi-scale Gaussian kernels achieves more accurate results on the OBIs dataset.

The main contributions of this work are summarized as follows:

Figure 1. Examples of false detections by Faster R-CNN [5]. The red and blue boxes indicate the predicted and ground truth bounding boxes, respectively.

Figure 2. Visualization of character detection based on the Gaussian kernel representation. Left: heatmaps predicted by our proposed framework. Right: segmentation result based on the predicted heatmaps.

Figure 3. Outputs of our proposed framework when using only a single-scale Gaussian kernel. (a) and (b) show heatmaps for characters that are not rigidly bounded and for tightly distributed characters, respectively.


• We first propose an anchor-free detector for OBIs detection. The detector uses the Gaussian kernel to represent the character spatial region, which not only bypasses the need for anchor boxes but also enables the detection model to learn character spatial regions.
• To overcome the problem of misdetection caused by regional overlapping between some tightly distributed oracle characters, we represent the character region using Gaussian kernels of multiple scales simultaneously. Then, based on these kernel predictions, character regions with sharp edges are obtained by progressive scale expansion.
• Experiments show that, compared with some state-of-the-art object detectors, our character detector based on the multi-scale Gaussian kernel representation achieves excellent detection results in accuracy and efficiency on the OBIs dataset.

2. Related Work

2.1. Traditional Object Detection Methods

In the early days, most object detection methods [9] [10] adopted the detection routes of Sliding Window or Connected Components Analysis. Methods based on the Sliding Window usually slide windows of different scales densely over the input image while classifying the content of each window with a classifier or hand-crafted rules. Methods based on Connected Components Analysis usually first obtain candidate connected regions in a variety of ways (e.g., color clustering or extreme region extraction) and then filter out non-object regions among the candidates based on hand-designed rules. As one of the most successful detection methods, [11] uses Haar features and AdaBoost [12] to train a series of cascaded classifiers for face detection, achieving high efficiency and satisfactory accuracy. DPM [13] is another popular method that maintained the best results on PASCAL VOC [14] for many years. It uses a mixture of multi-scale deformable part models to represent highly variable object classes. Later, some methods further improved the accuracy of object detection based on knowledge of morphological operations [15], conditional random fields [16], and graphs [17].
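The dense scanning used by Sliding Window methods can be illustrated with a minimal sketch. The window sizes, the stride ratio, and the function name `sliding_windows` are illustrative assumptions, not values taken from the cited works.

```python
from typing import Iterator, Tuple


def sliding_windows(width: int, height: int,
                    sizes=(32, 64),            # assumed window sizes in pixels
                    stride_ratio: float = 0.5  # assumed stride as a fraction of window size
                    ) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (x, y, w, h) square windows that densely cover the image at several scales."""
    for size in sizes:
        stride = max(1, int(size * stride_ratio))
        for y in range(0, height - size + 1, stride):
            for x in range(0, width - size + 1, stride):
                yield (x, y, size, size)


# Each window would then be passed to a classifier or a set of hand-crafted rules.
windows = list(sliding_windows(128, 128))
print(len(windows))
```

Even on a small 128 × 128 image with two scales, dozens of windows must be classified, which is why these methods are costly on large images.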

2.2. Object Detection in Deep Learning

Motivated by the thriving of deep learning-based object [18] and text [19] detection architectures, we believe that oracle characters, as a particular kind of object, can benefit from these fields. There are two main trends in the field of object detection: two-stage and one-stage.

Two-stage approaches divide the object detection task into two stages: generating ROIs (Regions of Interest) and then classifying and regressing the ROIs. The two-stage approach was introduced and popularized by R-CNN [20]. It generates ROIs using a low-level vision algorithm and then uses a DCN-based region-wise classifier to classify the ROIs independently. Later, SPP-Net [21] and Fast-RCNN [22] improved R-CNN by extracting ROIs from the feature maps. However, both still rely on separate proposal algorithms and cannot be trained end-to-end. Faster-RCNN [5] can be trained end-to-end thanks to the RPN (region proposal network). The RPN generates proposals from a set of pre-determined candidate boxes, usually known as anchor boxes, which not only makes the detector more efficient but also allows it to be trained end-to-end. Mask-RCNN [23] further improves Faster-RCNN by adding a mask prediction branch and can thereby detect objects and predict their masks at the same time. Other works focus on architecture design, contextual relationships, and improving speed.

One-stage approaches remove the ROI extraction process and directly classify and regress the candidate anchor boxes. YoLo [24] uses a single feed-forward convolutional network to directly predict object classes and locations, which is extremely fast. After that, YoLov2 [25] further improves YoLo by using more anchor boxes and a new bounding box regression method. DSSD [6] and RON [2] adopt networks similar to the Hourglass Network [26], enabling them to combine low-level and high-level features via skip connections to predict bounding boxes more accurately. RefineDet [27] refines the locations and sizes of the anchor boxes twice, exploiting the merits of both one-stage and two-stage approaches. CornerNet [28] and CenterNet [29] are keypoint-based approaches that detect an object directly from keypoints such as a pair of corners. Although these methods achieve high performance, they still have room for improvement.
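To give a feel for why anchor-based detectors need so many candidate boxes, the following sketch counts the anchors of a hypothetical detector. The input size, the feature strides, and the number of anchors per cell are assumed for illustration and do not reproduce the configuration of any detector cited above.

```python
def count_anchors(image_size: int, strides, anchors_per_cell: int) -> int:
    """Count anchor boxes for a square input, with one feature map per stride."""
    total = 0
    for stride in strides:
        cells_per_side = image_size // stride      # feature map side length
        total += cells_per_side ** 2 * anchors_per_cell
    return total


# Hypothetical setup: 512 x 512 input, five feature levels, 9 anchors per cell.
n_anchors = count_anchors(512, strides=(8, 16, 32, 64, 128), anchors_per_cell=9)
print(n_anchors)
```

Every one of these boxes has to be classified and regressed, which is the overhead an anchor-free formulation avoids.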

2.3. Related Works of OBIs Detection

Up to now, there are only a few methods for the OBIs detection task in the field of image processing. Meng [1] built a single-stage OBIs detector by extending SSD300 to SSD1024. Wang [2] introduced a region-based fully convolutional network and proposed a novel auxiliary detection algorithm based on character recognition, which helps the detection model reduce false positives caused by cracks. In our earlier works [3] [4], we also did some simple explorations of OBIs detection. We applied several state-of-the-art object detection models to the OBIs dataset and compared and analyzed their detection results. Later, based on the statistical characteristics of character scale, we redesigned the size and aspect ratio of the anchors and proposed the Spatial Block to stabilize the features and alleviate noise interference during training.

However, most of these methods are simple explorations that migrate slightly modified classic object detection models to the OBIs dataset. Thus, there are still certain limitations in accuracy and efficiency. First, as mentioned above, most algorithms are based on anchor boxes, which involve complex network design and require a large number of anchor boxes. Second, some special characters (such as compound characters) may be mis-detected by the detection model. In this work, our main goal is to explore a simpler OBIs detector and improve the detection accuracy.

3. Methodology

3.1. The Pipeline of Our Character Detection Model

Our character detection model regards oracle bone characters as special key points and aims to predict complete and separated character regions. The overall data stream of the model is shown in Figure 4. Firstly, the rubbing input $I_O$ passes through a convolutional neural network to predict a feature map $I_F \in \mathbb{R}^{H \times W \times C}$ that incorporates multi-layer context information of feature maps. The feature map $I_F$ is mapped to $n$ branches by the region prediction module, whose outputs are used to generate $n$ scale region maps $S_1, S_2, \ldots, S_n$, where each $S_i$ represents a character region score map of one scale size. $S_1$ represents the character region prediction of the minimal scale, and $S_n$ represents the character region prediction of the maximal scale. Finally, based on these multi-scale Gaussian region predictions, the final accurate character bounding boxes are obtained after a series of simple post-processing operations.

3.2. Architecture of Detection Network

The OBIs detector uses the Hourglass Network [26] as its backbone. The Hourglass Network is a fully convolutional neural network with a cascade structure, composed of one or more Hourglass modules. The Hourglass module resembles a lightweight encoder-decoder network: it down-samples the input features through a series of convolution and max-pooling layers and then restores the original resolution through a series of up-sampling and convolutional layers. To reduce the loss of detail caused by the max-pooling operation, skip connections bring the details back to the up-sampled features. Besides, a single Hourglass module can capture global and local features in a unified structure. When multiple Hourglass modules are stacked in the network, the Hourglass model can reprocess features to obtain higher-level information.

Figure 4. The overall structure of the OBIs detector based on multi-scale Gaussian kernels.


In our character detector, we stack two Hourglass modules and make a few modifications to the overall Hourglass network. Specifically, before the features are input to the Hourglass module, we replace the 7 × 7 convolution of the original network with a 3 × 3 convolutional layer with stride 2, which scales the input image to 1/2 of its size. Similarly, inside the Hourglass module, a 3 × 3 convolution with stride 2 replaces the max pooling of the original module to down-sample the input features. At the end of the Hourglass module, we add an up-sampling layer to restore the output to the original input resolution.
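The halving effect of a stride-2 3 × 3 convolution follows from the standard output-size formula for convolutions. The sketch below applies it to a 512 × 512 input; the padding of 1 is an assumption, since the padding is not stated in the text.

```python
def conv_out_size(size: int, kernel: int = 3, stride: int = 2, padding: int = 1) -> int:
    """Spatial output size of a convolution: floor((size + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


stem = conv_out_size(512)     # stride-2 stem convolution before the Hourglass module
inner = conv_out_size(stem)   # one stride-2 down-sampling step inside the module
print(stem, inner)
```

With these assumed settings, each stride-2 convolution halves the resolution, so a final 2× up-sampling layer is what brings the prediction maps back to the input size.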

3.3. Loss Functions

The overall loss function of the OBIs detection model is expressed as follows:

$$L = \lambda L_{FullMap} + (1 - \lambda) L_{ZoomMap} \tag{1}$$

where $L_{FullMap}$ and $L_{ZoomMap}$ represent the loss of the character region instance with complete shape and the loss of the multiple shrinking character region instances, respectively, and $\lambda$ is used to balance the weight of $L_{FullMap}$ and $L_{ZoomMap}$.

$$L_{FullMap} = L_{Pix}\left(S(p), S^{*}(p)\right) \tag{2}$$

where $p$ represents the coordinate position of a pixel, $S(p)$ represents the predicted character region score with complete shape, and $S^{*}(p)$ represents the corresponding ground truth score.

$$L_{Pix}\left(T(p), T^{*}(p)\right) = \sum_{p} \left\| T(p) - T^{*}(p) \right\|_2^2 \tag{3}$$

$$L_{ZoomMap} = \sum_{i=1}^{N-1} L_{Pix}\left(Z_i(p), Z_i^{*}(p)\right) \tag{4}$$

where $N$ represents the number of scales, $Z_i(p)$ represents the predicted character region score of scale $i$, and $Z_i^{*}(p)$ represents the ground truth score of scale $i$.

In addition to the character features, the rubbing image contains many disturbances that look very similar to character features, such as background noise and cracks. To enable the detection model to learn to distinguish these patterns, Online Hard Example Mining (OHEM) [30] is applied to enforce a 1:3 ratio of positive to negative pixels in the detection loss $L_{FullMap}$.
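The structure of Equations (1)-(4) and the 1:3 hard-negative selection can be sketched in plain Python on flattened score maps. This is a minimal illustration under stated assumptions: the value λ = 0.5, the rule that a pixel is positive when its ground truth score is above zero, and the function names are all hypothetical choices, not the authors' implementation.

```python
def pixel_loss(pred, target):
    """Equation (3): sum of squared differences over all pixels."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def ohem_pixel_loss(pred, target, neg_ratio=3):
    """Pixel loss that keeps every positive pixel and only the hardest negatives (1:3)."""
    pos = [(p - t) ** 2 for p, t in zip(pred, target) if t > 0]    # assumed positive rule
    neg = sorted(((p - t) ** 2 for p, t in zip(pred, target) if t == 0), reverse=True)
    # If there are no positives, still keep neg_ratio negatives so the loss is defined.
    kept = neg[: neg_ratio * max(1, len(pos))]
    return sum(pos) + sum(kept)


def total_loss(full_pred, full_gt, zoom_preds, zoom_gts, lam=0.5):
    """Equation (1): lam * L_FullMap + (1 - lam) * L_ZoomMap (lam = 0.5 is assumed)."""
    l_full = ohem_pixel_loss(full_pred, full_gt)                                  # Eq. (2) with OHEM
    l_zoom = sum(pixel_loss(z, zt) for z, zt in zip(zoom_preds, zoom_gts))        # Eq. (4)
    return lam * l_full + (1 - lam) * l_zoom


loss = total_loss(
    full_pred=[0.5, 0.1, 0.2, 0.3, 0.4], full_gt=[1.0, 0.0, 0.0, 0.0, 0.0],
    zoom_preds=[[0.0, 1.0]], zoom_gts=[[0.0, 0.5]],
)
print(loss)
```

In the toy call, the easiest negative pixel (error 0.1) is dropped from the full-map term, so noise-like background that the model already handles well does not dominate the loss.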

3.4. Ground Truth Label Generation

For each training image, we generate the ground truth labels of the region score with complete shape and of the n shrinking regions using the character-level bounding boxes provided by the OBIs dataset, as shown in Figure 5. The detailed steps are as follows:

1) According to the character-level bounding boxes provided by the OBIs dataset and following the shrinking principle in [8], set up n shrinking pixel spacings $D = \{d_1, d_2, \ldots, d_n\}$.
2) Based on the shrinking spacings D, shrink inward along the original bounding boxes to obtain n bounding box sets of different scales.
3) Prepare a 2D isotropic Gaussian kernel.
4) Calculate the perspective transformation matrix M between the Gaussian kernel and each character box.
5) Based on the perspective transformation matrix M, warp the Gaussian map to the box area.
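The label generation steps can be sketched for the simplest case of axis-aligned boxes, where the perspective warp reduces to scaling and translating the isotropic kernel. The value of `sigma`, the box coordinates, and the shrinking spacings below are hypothetical; for general quadrilateral boxes one would instead compute and apply a true perspective transform.

```python
import math


def shrink_box(box, d):
    """Shrink an axis-aligned box (x0, y0, x1, y1) inward by d pixels on every side."""
    x0, y0, x1, y1 = box
    return (x0 + d, y0 + d, x1 - d, y1 - d)


def gaussian_label(width, height, box, sigma=0.4):
    """Render an isotropic Gaussian stretched to fill `box` (inclusive pixel coordinates).

    sigma is expressed in normalized box units and is an assumed value.
    """
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half_w = max((x1 - x0) / 2, 0.5)   # guard against degenerate boxes
    half_h = max((y1 - y0) / 2, 0.5)
    label = [[0.0] * width for _ in range(height)]
    for y in range(max(0, y0), min(height - 1, y1) + 1):
        for x in range(max(0, x0), min(width - 1, x1) + 1):
            u, v = (x - cx) / half_w, (y - cy) / half_h   # normalized kernel coordinates
            label[y][x] = math.exp(-(u * u + v * v) / (2 * sigma ** 2))
    return label


box = (2, 2, 10, 10)                              # hypothetical character box
spacings = (1, 2)                                 # hypothetical shrinking spacings D
scaled_boxes = [shrink_box(box, d) for d in spacings]
labels = [gaussian_label(12, 12, b) for b in [box] + scaled_boxes]
print(scaled_boxes)
```

Each label peaks at 1.0 in the box center and decays toward the box border, and the shrunken boxes produce progressively tighter maps around the same center.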

3.5. Inference

During inference, the detection model outputs n character region maps of different scales. In this section, we briefly describe how to predict accurate character-level bounding boxes based on these region score maps.

The key of the post-processing pipeline is a scale extension algorithm from [31], which adopts a novel progressive extension strategy to detect dense scene text. It uses the adjacency relationship between Gaussian heatmaps of different scales to gradually expand from the text region with the minimal kernel to the maximal kernel with complete shape. On this basis, we add some steps and make a few modifications to suit our character detection task. First, we perform simple pre-processing on the original multi-scale Gaussian map predictions and reduce the noise in the Gaussian maps through some morphological operations (opening operation, distanceTransform). Second, for the separated character regions K obtained by the scale extension algorithm, we calculate their connected components C and assign different labels Label. Finally, based on the assigned Label, the minimum enclosing rectangle of each connected component is calculated to obtain the final accurate bounding box. Functions such as connectedComponents, morphologyEx, and minAreaRect provided by OpenCV can be applied for this purpose. The details are shown in Algorithm 1.
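The labeling and box extraction steps can be illustrated without any image library. The sketch below is a simplified stand-in: it labels 4-connected components with a breadth-first search and returns axis-aligned boxes, whereas OpenCV's minAreaRect returns rotated rectangles. The function names and the toy mask are hypothetical.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected foreground regions of a binary mask; returns (count, labels)."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or labels[sy][sx]:
                continue
            count += 1
            labels[sy][sx] = count
            queue = deque([(sx, sy)])
            while queue:
                x, y = queue.popleft()
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = count
                        queue.append((nx, ny))
    return count, labels


def boxes_from_labels(count, labels):
    """Axis-aligned box (x0, y0, x1, y1) of every labeled component, in label order."""
    boxes = {}
    for y, row in enumerate(labels):
        for x, lab in enumerate(row):
            if lab:
                x0, y0, x1, y1 = boxes.get(lab, (x, y, x, y))
                boxes[lab] = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
    return [boxes[i] for i in range(1, count + 1)]


mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
count, labels = label_components(mask)
print(boxes_from_labels(count, labels))
```

In a real pipeline the mask would come from thresholding and expanding the predicted Gaussian maps rather than from a hand-written array.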

Figure 5. The generation process of ground truth label.

Algorithm 1. Post-processing pipeline of detection model.

Input: Kernel predictions $Z = \{Z_1, Z_2, \ldots, Z_n\}$
Output: Bounding box list L
Function prediction(Z)
1) Initialize a set of zero arrays $M = \{M_1, M_2, \ldots, M_n\}$
2) For i = 1 to n do
3)     If $Z_i(p) > \delta$ Then $M_i(p)$ = True // δ is a threshold with value of 0.35
4) M ← morphologyEx(M)
5) K ← scaleExpanded(M)
6) C, Label ← connectedComponents(K)
7) L ← minAreaRectByLabel(C, Label)
8) return L
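The scaleExpanded step of Algorithm 1 can be sketched as a breadth-first expansion in the spirit of the progressive scale expansion of [31]. This is one plausible reading rather than the authors' code: the function name `scale_expand`, the 4-neighborhood, and the first-come-first-served rule for contested pixels are assumptions.

```python
from collections import deque


def scale_expand(seed_labels, kernels):
    """Grow labeled seeds through successively larger binary kernels.

    seed_labels: 2-D list of ints labeling the smallest kernel (0 = background).
    kernels: binary masks ordered from the second-smallest to the largest scale.
    """
    h, w = len(seed_labels), len(seed_labels[0])
    labels = [row[:] for row in seed_labels]
    for kernel in kernels:
        # Start from every pixel that already belongs to a character.
        queue = deque((x, y) for y in range(h) for x in range(w) if labels[y][x])
        while queue:
            x, y = queue.popleft()
            for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                inside = 0 <= nx < w and 0 <= ny < h
                if inside and kernel[ny][nx] and not labels[ny][nx]:
                    labels[ny][nx] = labels[y][x]   # contested pixels go to the first claimant
                    queue.append((nx, ny))
    return labels


# Two characters whose largest kernels touch, shown on a one-row toy image.
seeds = [[1, 0, 0, 0, 0, 2]]
kernels = [
    [[1, 1, 0, 0, 1, 1]],   # middle scale: regions still separated
    [[1, 1, 1, 1, 1, 1]],   # largest scale: regions merged into one blob
]
expanded = scale_expand(seeds, kernels)
print(expanded)
```

Although the largest kernel is a single connected blob, the labels inherited from the smallest kernels keep the two characters apart, which is the reason for predicting several scales.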


4. Experiments

4.1. Oracle Bone Inscriptions Dataset

In this paper, all experiments are based on the OBIs dataset provided by the Key Laboratory of the Ministry of Education for Oracle Information Processing, Anyang Normal University. The dataset focuses on the task of OBIs detection and mainly includes two parts: up to 9500 oracle bone rubbing images collected from the OBIs literature collection using a high-resolution scanner, and hand-made character-level bounding boxes. Different from general natural scene images, the rubbing images mainly have the following characteristics:

High noise: Oracle bone rubbing, as the main carrier of OBIs, was buried in the ruins of Anyang for a long time and was not discovered until 120 years ago. Therefore, there is inevitably a certain degradation of the rubbing appearance. The most significant degradation is a large amount of noise on the rubbing. This noise follows different patterns and is densely distributed on the rubbing image, which brings great challenges to the task of OBIs detection.

Cracks: Due to the burial environment and private excavations, many of the unearthed oracle bone rubbings are broken, and various cracks have appeared on their surfaces. These cracks are very similar to characters in texture and are easily mistaken for oracle bone characters.

Distribution: The characters on the same rubbing image have different sizes, different directions, and a random distribution. Besides, in the 56,743 oracle bone rubbings, there are 1425 words. Among them, 366 are common characters, 500 are not usually used, and 559 are rare.

There are up to 9500 oracle rubbing records in the OBIs dataset. In this experiment, the training set, validation set, and test set contain 8287, 436, and 411 data records, respectively.

4.2. Experimental Environment

In this experiment, the source code of all models is based on the PyTorch deep learning framework, and the models are trained on four NVIDIA TITAN X GPUs. In particular, due to the lack of character category information in the OBIs dataset, a class-agnostic strategy is adopted: by default, all characters are treated as a single category and assigned the same category label. During training, the rubbing image is scaled to a resolution of 512 × 512, and the Adam optimizer is used to update and optimize the parameters. We start Adam at a learning rate of 0.0001 and empirically use a momentum of 0.9 and a weight decay of 0.0001.

4.3. Evaluation Indicators

We mainly evaluate the overall performance of the character detection model from the perspectives of efficiency and accuracy. Three indicators, namely network weight parameters, floating-point operations, and inference speed, are used to evaluate the overall detection efficiency of the model. Precision (P), Recall (R), and F-Measure (F), which are commonly used measurement indicators in mainstream object detection methods, are used to measure the detection accuracy of the model. These indicators are calculated as follows:

$$P = \frac{TP}{TP + FP} \tag{5}$$

$$R = \frac{TP}{TP + FN} \tag{6}$$

$$F = \frac{2 \cdot P \cdot R}{P + R} \tag{7}$$

where TP, FP, and FN represent True Positive, False Positive, and False Negative, respectively.
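Equations (5)-(7) translate directly into a few lines of code. The counts in the example call are hypothetical and serve only to show the arithmetic; the zero-division guards are an added convention, not part of the formulas.

```python
def precision_recall_f(tp: int, fp: int, fn: int):
    """Return (P, R, F) from true positive, false positive, and false negative counts."""
    p = tp / (tp + fp) if tp + fp else 0.0      # Equation (5)
    r = tp / (tp + fn) if tp + fn else 0.0      # Equation (6)
    f = 2 * p * r / (p + r) if p + r else 0.0   # Equation (7)
    return p, r, f


# Hypothetical counts: 90 correct boxes, 10 spurious boxes, 30 missed characters.
p, r, f = precision_recall_f(tp=90, fp=10, fn=30)
print(round(p, 3), round(r, 3), round(f, 3))
```

Because F is the harmonic mean of P and R, it stays closer to the lower of the two values, so a detector cannot score well by excelling on only one of them.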

4.4. Ablation Experiments

The validity of Gaussian kernel representation: Besides Gaussian kernels, a binary mask is another option for representing character regions. To compare the two representations, we compare the character detection model (using only a single-scale Gaussian kernel) with the state-of-the-art semantic segmentation model DeepLabv3 [32]. Specifically, we roughly divide the rubbing image into foreground and background regions according to whether the pixels are inside the character-level box annotations provided by the OBIs dataset, and then use the trained segmentation model directly to predict the foreground character regions. The visualization of these models' outputs is shown in Figure 6. The binary mask represents the character regions using discrete values without distinction, and the obtained predictions have more regional overlapping. In contrast, the Gaussian kernel encodes the character region based on the distance to the center pixel, and the obtained character regions have clearer boundaries.

After obtaining these binary and Gaussian region predictions, we use some simple post-processing operations (including connectedComponents and minAreaRect) to get the character bounding boxes and then calculate their P, R, and F indicators. The quantitative results are shown in Table 1. The method based on the Gaussian kernel scores higher than the binary mask representation on all indicators, most clearly in precision (0.776 versus 0.626) and F-measure (0.705 versus 0.632). This shows once again that the Gaussian kernel representation has obvious advantages and is more conducive to expressing tightly distributed character regions.
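The difference between the two representations can be seen in a one-dimensional toy example with two touching characters. The box positions and the kernel width `sigma` are hypothetical; the threshold 0.35 is the value of δ used in Algorithm 1.

```python
import math


def render(boxes, length, gaussian, sigma=0.4):
    """Render 1-D character regions as a binary mask or as per-box Gaussian bumps."""
    out = [0.0] * length
    for x0, x1 in boxes:                       # inclusive pixel ranges
        center, half = (x0 + x1) / 2, (x1 - x0) / 2
        for x in range(x0, x1 + 1):
            if gaussian:
                value = math.exp(-((x - center) / half) ** 2 / (2 * sigma ** 2))
            else:
                value = 1.0
            out[x] = max(out[x], value)
    return out


def count_runs(values, threshold):
    """Count maximal runs of consecutive values above the threshold."""
    runs, inside = 0, False
    for v in values:
        above = v > threshold
        if above and not inside:
            runs += 1
        inside = above
    return runs


boxes = [(0, 9), (10, 19)]                     # two adjacent characters
binary_regions = count_runs(render(boxes, 20, gaussian=False), 0.35)
gaussian_regions = count_runs(render(boxes, 20, gaussian=True), 0.35)
print(binary_regions, gaussian_regions)
```

The binary mask fuses the two adjacent characters into one region, while the thresholded Gaussian map keeps them separate because its score decays toward each box border.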

Table 1. Quantitative results based on the binary mask and Gaussian kernel representations.

Methods           Precision (P)   Recall (R)   F-Measure (F)
DeepLabv3 [32]    0.626           0.638        0.632
Gaussian (ours)   0.776           0.646        0.705


Is multi-scale Gaussian kernel necessary? To answer this question, we retrain the detection model with different numbers of scales. The assessment results are shown in Figure 7, from which we can see that as n grows, the F-measure keeps rising and begins to drop when n > 6. This result suggests that a larger number of scales is not always better. The detection model achieves the highest F-measure when n = 6, so six scales are the most beneficial for the task of OBIs detection. Besides, although the F-measure declines somewhat as n continues to grow, it is significantly higher for n > 1 than with a single-scale Gaussian kernel. This shows to some extent that the design of multiple kernel scales is essential and effective.

4.5. Accuracy Comparison

To better evaluate the detection performance of our character detection model, we compare it with several mainstream object detection models, which include not only two-stage object detectors such as Faster RCNN [5] but also single-stage object detectors such as YoLov3 [35] and RBFNet [34].

Figure 6. Comparison results based on the binary mask and Gaussian kernel representations. (a) Rubbing input; (b) Binary mask prediction by DeepLabv3 [32]; (c) Gaussian region prediction by our model.

Figure 7. Ablation study on the number of scales n.


Table 2 shows the quantitative comparison with these state-of-the-art detection models. In terms of precision, our detector achieves the highest score of 89.7%, which is significantly better than the second place, with a gap of 12%. However, in terms of recall, our model performs relatively weakly, almost at the bottom of all the models. We believe a possible reason is that, for detection methods based on anchor boxes, the non-maximum suppression (NMS) operation uses a manually set threshold to filter out invalid candidate boxes; some candidate boxes may escape this filtering, which results in a higher recall rate. To evaluate the detection performance more accurately, we further compare the F-measure, which balances precision and recall. Here our model still achieves the best result, exceeding the second place by 5%. This reflects the advantage of our model in accuracy to some degree. It is also plausible that, by directly using Gaussian kernels to represent the character regions, our model captures more semantic information about the characters and gains character area awareness, so it obtains more accurate detection results.

4.6. Efficiency Comparison

We evaluate the detection efficiency of our character detector by measuring its inference speed, weight parameters, and floating-point operations, and then compare them with several state-of-the-art detectors.

Table 3 shows the efficiency comparison with these models. In inference speed, our model achieves the fastest speed of 23 FPS, which is 6 FPS higher than the second-place YoLov3 [35] (17 FPS). In weight parameters, our model requires the fewest parameters, only 12.73 M, which is much lower than the 26.29 M of the second-best model, SSD [19]. In terms of floating-point operations, our model is second only to YoLov3 [35], at 57.34 GMac versus 50.06 GMac, which is still far lower than the other state-of-the-art detection models. Overall, our model achieves faster inference speed with a lighter computational burden.

Table 2. Quantitative accuracy comparison with state-of-the-art detection models.

Methods           Precision (P)   Recall (R)   F-Measure (F)
FasterRCNN [5]    0.754           0.778        0.766
SSD [19]          0.748           0.758        0.753
RefineDet [33]    0.752           0.805        0.778
RBFNet [34]       0.761           0.789        0.775
YoLov3 [35]       0.776           0.784        0.780
Ours              0.897           0.775        0.832
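The gaps quoted in Section 4.5 can be reproduced from the precision and recall columns of Table 2. The short check below recomputes the F-measure with Equation (7) and derives the margins over the second-best model; the variable names are illustrative.

```python
# (precision, recall) pairs copied from Table 2.
table2 = {
    "FasterRCNN": (0.754, 0.778),
    "SSD": (0.748, 0.758),
    "RefineDet": (0.752, 0.805),
    "RBFNet": (0.761, 0.789),
    "YoLov3": (0.776, 0.784),
    "Ours": (0.897, 0.775),
}

# Recompute F = 2PR / (P + R) and round to the three decimals used in the table.
f_scores = {name: round(2 * p * r / (p + r), 3) for name, (p, r) in table2.items()}

precisions = sorted((p for p, _ in table2.values()), reverse=True)
precision_gap = precisions[0] - precisions[1]          # best minus second best

f_sorted = sorted(f_scores.values(), reverse=True)
f_gap = f_sorted[0] - f_sorted[1]

# Rank of our recall among all models, counted from the lowest value.
recall_rank_from_bottom = sorted(r for _, r in table2.values()).index(table2["Ours"][1]) + 1

print(f_scores)
print(round(precision_gap, 3), round(f_gap, 3), recall_rank_from_bottom)
```

The recomputed F-measures agree with the published column, the precision and F-measure margins come out at roughly 12 and 5 percentage points, and the recall of our model is the second lowest of the six.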


Table 3. Comparison of detection efficiency with state-of-the-art detection models.

Methods           Speed (FPS)   Parameters (M)   Flops (GMac)
Faster RCNN [5]   3             41.37            129.27
SSD [19]          9             26.29            90.40
RefineDet [33]    14            34.44            97.94
RBFNet [34]       15            36.64            103.65
YoLov3 [35]       17            61.92            50.06
Ours              23            12.73            57.34

5. Conclusion

In this paper, we first propose an anchor-free detector for OBIs detection. The detector uses adaptively shaped Gaussian kernels to represent the spatial regions of the characters, which not only bypasses the need for anchor boxes but also enables the detection model to learn character spatial regions. Furthermore, to address the problem of misdetection caused by regional overlapping between some tightly distributed characters, the character region is simultaneously represented by multi-scale Gaussian kernels to obtain character regions with sharp edges. Finally, based on these kernel predictions of different scales, a novel post-processing pipeline is used to obtain accurate bounding box predictions. The experimental results show that our OBIs detector achieves good detection results on the OBIs dataset.

Fund

This work is supported by the joint fund of National Natural Science Foundation of China (NSFC) and Henan Province of China under Grant U1804153, and partly supported by the Scientific and Technological Research Projects in Henan province under Grant 212102310545 and 212102210502 and the Anyang Science and Technology Plan Project under Grant 2021C01GX020.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References [1] Meng, L. (2017) Two-Stage Recognition for Oracle Bone Inscriptions. Lecture Notes in Computer Science, 10485, 672-682. https://doi.org/10.1007/978-3-319-68548-9_61 [2] Hao, W. (2019) Research on Oracle Detection and Recognition Based on Deep Learning. South China University of Technology, Guangzhou. [3] Xing, J., Liu, G. and Xiong, J. (2019) Oracle Bone Inscription Detection: A Survey of Oracle Bone Inscription Detection Based on Deep Learning Algorithm. Proceedings

DOI: 10.4236/am.2021.123014 236 Applied Mathematics

G. Y. Liu et al.

of the International Conference on Artificial Intelligence, Information Processing and Cloud Computing, Sanya, December 2019, Article No. 39. https://doi.org/10.1145/3371425.3371434 [4] Liu, G., Xing, J. and Xiong, J. (2020) Spatial Pyramid Block for Oracle Bone Inscrip- tion Detection. ICSCA 2020: Proceedings of the 2020 9th International Conference on Software and Computer Applications, February 2020, 133-140. https://doi.org/10.1145/3384544.3384561 [5] Ren, S., He, K. and Girshick, R. (2016) Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 1137-1149. https://doi.org/10.1109/TPAMI.2016.2577031 [6] Fu, C.Y., Liu, W., Ranga, A., Tyagi, A. and Berg, A.C. (2017) DSSD: Deconvolution- al Single Shot Detector. [7] Lin, T.Y., Goyal, P., Girshick, R., He, K. and Dollar, P. (2020) Focal Loss for Dense Object Detection. EEE Transactions on Pattern Analysis and Machine Intelligence, 42, 318-327. https://doi.org/10.1109/TPAMI.2018.2858826 [8] Baek, Y., Lee, B., Han, D., Yun, S. and Lee, H. (2019) Character Region Awareness for Text Detection. IEEE/CVF Computer Society Conference on Computer Vision and Pattern Recognition, Long Beach, 15-20 June 2019, 9357-9366. https://doi.org/10.1109/CVPR.2019.00959 [9] Epshtein, B., Ofek, E. and Wexler, Y. (2010) Detecting Text in Natural Scenes with Stroke Width Transform. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, 13-18 June 2010, 2963-2970. https://doi.org/10.1109/CVPR.2010.5540041 [10] Huang, W., Lin, Z., Yang, J. and Wang, J. (2013) Text Localization in Natural Im- ages Using Stroke Feature Transform and Text Covariance Descriptors. IEEE In- ternational Conference on Computer Vision, Sydney, 1-8 December 2013, 1241-1248. https://doi.org/10.1109/ICCV.2013.157 [11] Papageorgiou, C.P., Oren, M. and Poggio, T. (1998) A General Framework for Ob- ject Detection. 
Sixth International Conference on Computer Vision, Bombay, 7 Janu- ary 1998, 555-562. https://doi.org/10.1109/ICCV.1998.710772 [12] Schapire, R.E. (2013) Explaining AdaBoost: Empirical Inference. Springer, Berlin, Heidelberg. [13] Felzenszwalb, P.F., Girshick, R.B., McAllester, D. and Ramanan, D. (2010) Object Detection with Discriminatively Trained Part Based Models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32, 1627-1645. https://doi.org/10.1109/TPAMI.2009.167 [14] Everingham, M., Eslami, S.A., Van Gool, L., Williams, C.K., Winn, J. and Zisser- man, A. (2015) The Pascal Visual Object Classes Challenge: A Retrospective. Inter- national Journal of Computer Vision, 111, 98-136. https://doi.org/10.1007/s11263-014-0733-5 [15] Lee, J.J., Lee, P.H., Lee, S.W., Yuille, A. and Koch, C. (2011) AdaBoost for Text De- tection in Natural Scene. IEEE International Conference on Document Analysis and Recognition, Beijing, 18-21 September 2011, 429-434. https://doi.org/10.1109/ICDAR.2011.93 [16] Wang, K., Babenko, B. and Belongie, S. (2011) End-to-End Scene Text Recognition. IEEE International Conference on Computer Vision, Barcelona, 1457-1464. [17] Wang, T., Wu, D.J., Coates, A. and Ng, A.Y. (2012) End-to-End Text Recognition with Convolutional Neural Networks. Proceedings of the 21st International Confe-

DOI: 10.4236/am.2021.123014 237 Applied Mathematics

G. Y. Liu et al.

rence on Pattern Recognition, Tsukuba, Japan, 11-15 November 2012, 3304-3308.

[18] Li, Y., He, K., Sun, J., et al. (2016) R-FCN: Object Detection via Region-Based Fully Convolutional Networks. Proceedings of the 30th International Conference on Neural Information Processing, Morehouse Lane, Red Hook, December 2016, 379-387.

[19] Wang, Y., Xie, H., Zha, Z.-J., Xing, M., Fu, Z. and Zhang, Y. (2020) ContourNet: Taking a Further Step toward Accurate Arbitrary-Shaped Scene Text Detection. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, 13-19 June 2020, 11753-11762. https://doi.org/10.1109/CVPR42600.2020.01177

[20] Liu, W., Fu, C.H., Reed, S., et al. (2016) SSD: Single Shot Multi-Box Detector. In: Leibe, B., Matas, J., Sebe, N. and Welling, M., Eds., Computer Vision – ECCV 2016, Springer, Cham, 21-37. https://doi.org/10.1007/978-3-319-46448-0_2

[21] He, K., Zhang, X., Ren, S. and Sun, J. (2016) Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37, 1904-1916. https://doi.org/10.1109/TPAMI.2015.2389824

[22] Girshick, R. (2015) Fast R-CNN. 2015 IEEE International Conference on Computer Vision, Santiago, 7-13 December 2015, 1440-1448. https://doi.org/10.1109/ICCV.2015.169

[23] He, K., Gkioxari, G., Dollár, P. and Girshick, R. (2020) Mask R-CNN. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42, 386-397. https://doi.org/10.1109/TPAMI.2018.2844175

[24] Redmon, J., Divvala, S., Girshick, R. and Farhadi, A. (2016) You Only Look Once: Unified, Real-Time Object Detection. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Las Vegas, 27-30 June 2016, 779-788. https://doi.org/10.1109/CVPR.2016.91

[25] Redmon, J. and Farhadi, A. (2017) YOLO9000: Better, Faster, Stronger. 30th IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, 21-26 July 2017, 6517-6525. https://doi.org/10.1109/CVPR.2017.690

[26] Newell, A., Yang, K. and Deng, J. (2016) Stacked Hourglass Networks for Human Pose Estimation. In: Leibe, B., Matas, J., Sebe, N. and Welling, M., Eds., Computer Vision – ECCV 2016, Springer, Cham, 483-499. https://doi.org/10.1007/978-3-319-46484-8_29

[27] Lin, G., Milan, A., Shen, C. and Reid, I. (2017) RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42, 1228-1242. https://doi.org/10.1109/CVPR.2017.549

[28] Law, H. and Deng, J. (2020) CornerNet: Detecting Objects as Paired Keypoints. International Journal of Computer Vision, 128, 642-656. https://doi.org/10.1007/s11263-019-01204-1

[29] Duan, K., Bai, S., Xie, L., Qi, H., Huang, Q. and Tian, Q. (2019) CenterNet: Keypoint Triplets for Object Detection. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 27 October-2 November 2019, 6569-6578. https://doi.org/10.1109/ICCV.2019.00667

[30] Shrivastava, A., Gupta, A. and Girshick, R.B. (2016) Training Region-Based Object Detectors with Online Hard Example Mining. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, 27-30 June 2016, 761-769. https://doi.org/10.1109/CVPR.2016.89

[31] Wang, W., Li, X. and Liu, T. (2019) Shape Robust Text Detection with Progressive Scale Expansion Network. IEEE/CVF Computer Society Conference on Computer Vision and Pattern Recognition, Long Beach, 15-20 June 2019, 9336-9345. https://doi.org/10.1109/CVPR.2019.00956

[32] Chen, L.-C., Papandreou, G., Kokkinos, I., Murphy, K. and Yuille, A.L. (2018) DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40, 6834-6848. https://doi.org/10.1109/TPAMI.2017.2699184

[33] Zhang, S., Wen, L., Bian, X., Lei, Z. and Li, S.Z. (2018) Single-Shot Refinement Neural Network for Object Detection. IEEE Transactions on Circuits and Systems for Video Technology, 31, 674-687. https://doi.org/10.1109/TCSVT.2020.2986402

[34] Liu, S., Huang, D. and Wang, Y. (2018) Receptive Field Block Net for Accurate and Fast Object Detection. Computer Vision. In: Ferrari, V., Hebert, M., Sminchisescu, C. and Weiss, Y., Eds., Computer Vision – ECCV 2018, Springer, Cham, 404-419. https://doi.org/10.1007/978-3-030-01252-6_24

[35] Redmon, J. and Farhadi, A. (2018) YOLOv3: An Incremental Improvement.

