arXiv:2011.06156v1 [cs.AI] 12 Nov 2020 nrlto oteG rbesatog oto hmd not are do machines AI them if of both that with most demonstrate coupling We a although by GCs. simplified problems of studi GC notion the the the review take to We machines. relation about in levels comprehension of xlr oe ujcsi I ree nmteais uhas such mathematics, in an even perceive or us AI, help learning constraint vie in will generalized in subjects problem studies novel constraint the explore tw that generalized the show a between to form of given coupling are the in Examples as submodels. well as submodel driven data-driven cdm fSineadTcnlg,Biig 004 China. 100094, Beijing, Technology, and Science of Academy fAtmto,CieeAaeyo cec,adUiest o University and China. Science, 100091, Beijing, of Sciences, Academy of Academy Chinese Automation, of Explainability ity, mathematics” that Kalman s E. philosophy Rudolf the a following from perspective, (AI) modeling intelligence ematical artificial in problem mathematical Hu) h eae akrud r ie below. given are backgrounds related The om.Frhroe edsusteutmt ol fA and AI of goals ultimate the with discuss CCs redefine we than encoun rather Furthermore, basically modeling, we forms. for machines, the GCs AI with often of more them construction compare the bette any we In understand describe problem, CCs. To to general a term modelings. constraints conventional be general in to a information GCs as about prior GCs of adopt type we and (GCs)”, ( AI” in problem general, yet be: level a mathematical, to seems application at question ( at AI one learning wave, of stay deep fundamental to when the than Therefore, seek rather The to level language. us mathematical computer direct a will from mo statement programmed systems, demon- terms) tools, AI (or other lives, machines in by other intelligence or its humans strates by displayed intelligence td fatfiilitliec ( intelligence artificial of study O GCs .B ui ihBiigIsiueo e ehooyApplica Technology New of Institute Beijing with is Qu H.-B. .G ui ihNtoa aoaoyo atr Recognition Pattern of Laboratory National with is Hu B.-G. aucitcetdSpebr1,22.CrepnigAut 2020.(Corresponding 19, September created Manuscript ne Terms Index Abstract . nti ae,w aetento f“ of notion the take we paper, this In eeaie osrit sANwMathematical New A as Constraints Generalized rbe nAtfiilItliec:ARve and Review A Intelligence: Artificial in Problem ”adcnie ta e ahmtclpolmi AI. in problem mathematical new a as it consider and )” 1.Ti ttmn yKla spriual ret the to true mathematics particularly is is rest Kalman the by right, statement This physics [1]. the get you NCE transparent I hscmrhnierve,w ecieanew a describe we review, comprehensive this —In umdl C ilpa rtclrl naknowledge- a in role critical a play will GCs submodel, h e rbe scle “ called is problem new The . Cntans ro,Tasaec,Interpretabil- Transparency, Prior, —Constraints, , interpretable .I I. a-agHu, Bao-Gang Cs n itteretacalne over challenges extra their list and (CCs) Oc o e h hsc ih,ters is rest the right, physics the get you “Once NTRODUCTION (GCL). AI and , knowledge-driven D eecutraynew any encounter we ”Do DL .I otatt h natural the to contrast In ). dacdA oanew a to AI advanced ) eeaie Constraints Generalized xlial AI explainable eirMme,IEEE, Member, Senior eeaie Constraints Generalized ? umdland submodel Perspective o:Bao-Gang hor: well-defined in,Beijing tions, nterms in Institute , Chinese f ae by tated math- dels ter es w s. 
d o r n a-igQu, Han-Bing and ro fact prior Ia eea emwihmyb called be may which term general ta a we the as discussions, When mathemat simplifying PI PI. a For stresses to modeling. GC equal in term is meaning the cal life, GC daily and in GC, appears of PI term subset a is CC where hwdsvrleape bu nopeeifrainin information incomplete as, about such T GCs, examples representation. one several its exhibits incomplet in usually showed as form PI such unstructured that modelings, and in out information features pointed of al. combination et a or Hu [2], In etc. invariance .Etacalne fGsoe CCs over GCs of challenges Extra B. parameters. notation: set a using by infor prior related tion. any describe to modelings mathematical osriti hc t ersnaini ul nw n gi form. and known structured fully a is in representation its which in constraint ieaue uha ae yGen 5 n16.I h earli the In 1966. in [5] Greene by paper a as such literature, constraints generalized of notation Mathematical C. sections. later the in further challenges extra those .Trecr terms core Three A. eas n a aea“ a face may difficul one a because is linguistic which between For representation, computational GCs. transformation and studies. of a prior involve study AI may the GCs from in of appear Some challenges may subjects extra add new application also the of the example, but amount enlarge CCs, over only significant forms not a representation do overall the GCs and that an (Tabl domains see aspects present several can to One we respect I). GCs, in them and between difficult comparison CCs and subtle about more understanding far be can knowledge at t. hti nw o someone. for known dat problem, is as that (such etc.) things fact, particular the about knowledge uae l one u ht[] “ [3]: that out pointed al. et Duda relations their describe can we above, definitions the From ento 2: Definition nfc,tetr C sntanwcnetadi perdin appeared it and concept new a not is GCs term the fact, In ento 1: Definition ento 3: Definition , , specification idea 1 eeaie osrit( constraint Generalized , − ro nomto ( information Prior hypothesis ovninlcntan ( constraint Conventional ax 2 ebr IEEE, Member, − PI , by bias = eatcgap semantic 2 , GC > principle , hint 0 ⊃ where , , CC PI context , sayifrainor information any is ) , theory 4.W ildiscuss will We [4]. ” GC a noprtn prior incorporating and , satr sdin used term a is ) CC ro knowledge prior ieinformation side , omnsense common b .Frabetter a For ”. eest a to refers ) r unknown are task t ma- hey ven (1) ke er a, i- e e 1 , , , 2

TABLE I COMPARISONS BETWEEN CCS AND GCS

Problem Representation Completeness Constraint Related Given Domain(s) Forms Features Transformation Tasks Examples Within computational Transformation Within representations with Fully known Transformation Conventional only within the x 2 optimization structured and well constraint into dual g( ) = x1 + x2 = 0 Constraints computational h(x)= x1 + x2 ≥ 0 domain defined forms, i.e., representations problems representations equality and/or inequality functions Covering Including natural Possibly Possibly The system output a large language requiring requiring at the next step will spectrum of descriptions transformations constraint be a function of the domains in AI and computational between natural mathematization, current output, as Including modelings, representations language constraint well as of the fully and/or such as, with (un)structured, descriptions coupling output with a time Generalized partially mathematics, and/or ill (well) and selection, delay τ which is a Constraints known cognition, defined forms, such computational parameter positive integer. constraint neuroscience, as, rule, graph, representations identifiability, (Mathematization: representations psychology, table, functional, and/or from and/or x(t +1) = linguistics, equation, unstructured generalized f(x(t),x(t − τ)), social science, (sub)model, form into constraint τ(∈ Z+) is unknown physics, etc. virtual data, etc. structured one learning studies for identification.) studies, GC was used as a term without a formal definition until the studies by Zadeh [6]–[8] in 1986, 1996, and 2011, respectively. Zadeh presented a mathematical formulation of GCs in a canonical form [7]: X isr R, (2) where isr (pronounced “ezar”) is a variable copula which defines the way in which R constrains a variable X. He stated Fig. 1. Schematic diagram of mathematical spaces studied in machine learning that “the role of R in relation to X is defined by the value of or artificial intelligence. The diagram is modified from [11] by adding the part the discrete variable r”. The values of r for GCs are defined of “Constraint Space”. by probabilistic, probability, usuality, fuzzy set, rough set, etc; so that a wide variety of constraints can be included. Zadeh Network (HNN)” models [9], [10], but adopted “generalized used GCs for proposing a new framework called “Generalized constraint” rather than “hybrid” as a descriptive term so that Theory of Uncertainty (GTU)”. He described that “The con- the mathematical meaning was clear and stressed. cept of a generalized constraint is the center piece of GTU” [8]. The idea by Zadeh is very stimulating in the sense that we need to utilize all kinds of prior, or GCs, in AI D. A view from mathematical spaces modelings. Following the philosophy of Kalman [1], we can view any In 2009, Hu et al. [2] proposed so called “Generalized study in machine learning (ML) or in AI is a mapping study Constraint Neural Network (GCNN)” model, in which the among the mathematical spaces as shown in Fig. 1. GCs considered by them were “partially known relationship The all spaces are interactive to each other, such as a (PKR)” knowledge about the system being studied. Without knowledge space (i.e., a set of knowledge) which is connected awareness of the pioneer work by Zadeh, they proposed the to other spaces via either an input (i.e., embedded knowledge) formulation for a regression problem in a form of: or an output (i.e., derived knowledge) means. 
In fact, a knowledge space is not mutually exclusive with other spaces min y g(x,θ) k − k (3) in the sense of sets. The diagram is given for a schematically subjectto g(x,θ) Ri f ; i =1, 2, ... ∝ h i understanding about the information flows in modelings as where x and y are the input and output for a target function well as in an intelligent machine. The interactive and feedback f to be estimated, g(x,θ) is the approximate function with relationships imply that intelligence is a dynamic process. a parameter set θ, Ri f is the ith PKR of f, the symbol Most machine learning systems can be seen as a study on “ ” represents the termh i “is compatible with” so that “hard” deriving a hypothesis space (i.e., a set of hypotheses) from a or∝ “soft” constraints can be specified to the PKR, and the data space (i.e., a set of observations or even virtual datasets) symbol denotes “about” because some PKRs cannot be [12]. The systems are also viewed as parameter learning expressedh·i by mathematical functions. They adopted the term machines if the concern is focused on a parameter space (i.e., “relationships” rather than “functions”, for the reason to ex- a set of parameters) [11]. This paper will focus on a constraint press a variety of types of prior knowledge in a wider sense. space (i.e., a set of constraints) but highlight the problem of Their work was inspired by the studies called “Hybrid Neural GCs in the context of mathematical modeling in AI. 3

E. GCs as a new mathematical problem TABLE II FIVE TRIBES WITHIN AI MACHINES (FROM [15]) AI machines generally involve optimizations [13]. However, most existing studies in AI concern more on CCs. There are Tribe Representation Evaluation Optimization Inverse primarily three types of constraints in CCs, that is, equality, Symbolists Logic Accuracy inequality, and integer constraints, respectively. All of them Deduction Neural Squared Gradient Connectionists are given in a structured form. It is understandable that any Networks Errors Descent Genetic Genetic modeling will involve the application of PI. In a real-world Evolutionaries Fitness setting, PI does not always show it in a mathematical form of Programs Search Graphical Posterior Probabilistic Bayesians CCs. Therefore, in the designs of AI machines, we basically Models Probability Inference Supported Constrained encounter a GC, rather than a CC, problem. However, GCs Analogizers Margin are still considered as a new mathematical problem due to the Vectors Optimization following facts. • From a mathematical viewpoint, we still miss “a mathe- He suggest that the all tribes should be evolved into so called matical theory of AI” if following the position of Shannon “the Master ” in future, which refers to a general- [14]. In other words, we have no theory to deal with GCs. purpose learner. Table II lists the five tribes with respect to For the given image, we can tell if a cat or a dog. This the different features (or, GCs). Using a schematic plot, he process is strongly related to the specific GCs embedded illustrated the tribes by five circular sectors surrounding a in our brains. However, for either deep learning or human core circle which is the Master Algorithm. He proposed a brains, we are far away from understanding the specific hypothesis below [15]: GCs in a mathematical form. A theoretical study of GCs “All knowledge—past, present, and future—can be derived is needed, such as its notation and formalization as a from data by a single, universal learning algorithm.” general problem in AI. The hypothesis suggests one important implication in future • From an application viewpoint, we need an applicable AI, that is, yet simple modeling tool that is able to incorporate any • The future AI machines should be able to deal with any form of PI maximally and explicitly for understanding AI form of prior knowledge with a single algorithm. machines. Although application studies in AI encounter The implication presents the necessary conditions of future more GCs than CCs, not much studies are given under AI machines in regardless of the existence of the Master the notion of GCs. The GC problem is still far away Algorithm. In fact, there exist other terms to describe future from awareness for every related community, and mostly AI machines, such as “artificial general intelligence (AGI)” concerned within a case-by-case procedure. Moreover, the [16]. All of them have implied the utilization of GCs as a extra challenges listed in GCs have not been addressed general tool. systematically. There are extensive literature implicitly covering GCs. The B. Reasoning methods goal of this paper attempts to provide a comprehensive review In 1976, Box [17] illustrated the reasoning procedures (Fig. about GCs so that readers will be able to understand the 2) for “The advancement of learning”. In Fig. 2 (a), if the basics about them. 
The paper is not aiming at an extensive upper part and the lower part refer to data and knowledge, review of the existing works nor a rigorous about most respectively, an induction procedure follows a direction from terms and problems. For simplification of discussions, we will data to knowledge and a deduction procedure follows a di- apply most terms directly and suggest to consider them in a rection from knowledge to data. He called them “an iteration broader sense. Only a few of terms are given or redefined between theory and practice”. In Fig. 2 (b), he clearly showed for clarification. We take a “top-down” way, that is, from that learning is an information processing involving both in- outlooks to methods in the review. Therefore, in Section II, duction and deduction inferences as a dynamic system having we will present outlooks given by different researchers, so that a feedback loop. Usually, induction refers to a “bottom-up” one can understand why GCs are critical in AI. Sections III reasoning approach, and deduction a “top-down” reasoning and IV introduce the related methods in embedding GCs and approach [18]. Hence, the positions for upper part of the set extracting GCs, respectively. The summary and final remarks “Practice et al.” and the lower part of the set “Hypothesis are given in Section V. et al.” are better to be exchanged in Fig. 2 (a), so that the meanings for the top and the bottom are correct. II.AI MACHINESANDTHEIRFUTURES Hu et al. [19] adopted an image cognition example in [20] This section will present overall understandings about AI for explanations of inferences. One can guess what it is about machines and their futures. Because there exist numerous in Fig. 3. If one cannot provide a guess, it is better to see Fig. descriptions to the subjects, only a few of them are presented 16 in Appendix, and then re-examine Fig. 3. In general, most for the purpose of justifying the notion of GCs. people can present a correct answer after seeing both figures. Only a small portion of people are able to tell the answer of A. Five tribes and the Master Algorithm Fig. 3 directly in the first-time seeing. Domingos [15] gave the systematic descriptions about the If Figs. 3 and 16 represent the original data and prior current learning machines and divided them with five “tribes”. knowledge respectively, the guessing answer of Fig. 3 is a 4

Fig. 4. The relationship between current intelligent models and the future AI machines (modified from [23]). The position for each model is given only in a non-rigorous sense.

Being” in both data utilization and knowledge utilization (Fig. Fig. 2. Schematic illustrations from [17] for “The advancement of learning”. 4). When the term “big knowledge” have been appeared, such as in [22], we provided a formal definition based on [19] below: Definition 4: Big knowledge is a term for a that is given on multiple discipline subbases, for multiple application domains, and with multiple forms of knowledge representations. The knowledge role in future AI can be justified from the ultimate goals of AI. There are numerous understanding about the goals of AI in different communities. For exam- ple, Marr [24] described that “the goal of AI is to identify and solve tractable information processing problems”. Other researchers considered it as “passing the Turing Test” [25], (GPS) [26], or AGI [16], [27]. Some Fig. 3. An image taken from [20]. One may guess what it is about after researchers proposed two ultimate goals, namely, scientific carefully seeing this image. goal and engineering goal [28], [29]. Following their positions and descriptions, we redefine the two goals of AI below: hypothesis. We can understand a cognition process will involve • Engineering goal: To create and use intelligent tools for both induction and deduction procedures. This is true for helping humans maximumly for good. all people in either seeing or not seeing Fig. 16. Generally, • Scientific goal: To gain knowledge, better in depth, about without human face prior in one’s mind, the one is unable to humans themselves and other worlds. provide a correct answer to the image in Fig. 3. This cognition When the scientific goal suggests an explicit role of knowl- example confirms the proposal of Box [17] in Fig. 2. The edge as an output of AI machines, the engineering goal study of Box and the cognition example provide the following requires knowledge as an input in constructing the machines. implications in relation to GCs: For example, without knowledge, modelers will fail to reach • The knowledge part in Fig. 2 (a) can be viewed as GCs, the engineering goal in terms of “maximumly for good”. From but how to formalize GCs in Fig. 16 is still a challenge. the discussions above, we can understand that GCs will play • The GCs are generally updated within a feedback loop, an important role in the studies of knowledge utilization, big particularly when more data comes in. knowledge, or realizing the ultimate goals of AI.

C. Knowledge role in future AI machines D. Comprehension levels in AI machines Niyogi et al. [21] pointed out that “incorporation of prior Currently, AI machines are dominated by black-box DL knowledge might be the only way for learning machines to models. For understanding AI machines as a rigorous science tractably generalize from finite data”. The statement suggested [30], more investigations have been reported, such as [31]– utilization of prior knowledge in a minimum sense. In a [35], in together with review papers, like [36]–[39]. In those maximum sense, Ran and Hu [11] suggested the future AI investigations, several terms were given to define the different machines should go to a higher position over “Average Human classes of AI machines, mostly based on application purposes. 5

TABLE III AN EXAMPLE OF THE RELATIONS BETWEEN REPRESENTATION FORMS ANDKNOWLEDGELEVELS.

Represen. Knowl. Unique. of Example Form Level Compre. The acceleration Natural No Medium of an object is Language or Shallow proportional to the Representation Yes force imposed. Mathematical Deep Yes F = m a Representation Fig. 5. An Euler diagram of the three specific classes of AI machines.

world” and “mutual information”, respectively, in processing We present novel definitions below to the three specific classes adjacent patches of images [40]. of AI and show their relations in a mathematical notation. Definition 5: Transparent AI (TAI) refers to a class of III. EMBEDDING GCS methods in AI that are constructed by adding transparency over their counterparts without such operation. In a study of adding transparency to artificial neural net- Definition 6: Interpretable AI (IAI) refers to a class of works (ANNs), Hu et al. [41], [42] described two main methods in AI that are comprehensible for humans with strategies, respectively: interpretations from any natural language. • Strategy I: Embedding prior information into the models; Definition 7: Explainable AI (XAI) refers to a class of • Strategy II: Extracting knowledge or rules from the methods in AI that are comprehensible for humans with models; mathematical explanations. where a hierarchical diagram (Fig. 6) was given so that one can Fig. 5 shows an Euler diagram of the different classes of get a simple and direct knowledge about the working format AI machines. One can see their strict subset relations as behind the method. The two strategies differ in the nature of the knowledge flow with respect to its model. Whereas the first AI TAI IAI XAI. (4) strategy applies the available knowledge more like an input ⊃ ⊃ ⊃ into the model, the second strategy obtains explicit knowledge We can set the four classes of AI in an ordered way of com- as an outcome from the system. In this section, we will focus prehension (or knowledge) levels from “unknown”, “shallow”, on the first strategy in the notion of GCs. “medium shallow” to “deep”, respectively. Their subset rela- tions will remove the ambiguity in examination of methods, A. Generic Priors in GCs and provide a respective link to the comprehension levels in an ordered way. A black-box AI may be a preferred tool for some Bengio et al. [43] adopted a notion of “generic priors users. An autofocus camera for dummies is a good example. (GPs)” for representation learning study in AI. They reviewed In this situation, AI tools with an unknown comprehension are the works in lines of embedding the ten types of GPs into the fine for the users. models, that is: When TAI is a necessary condition of both IAI and XAI, we 1) smoothness, distinguish the both by their representation forms. A natural 2) multiple explanatory factors, language representation is a naturally-evolved form for com- 3) a hierarchical organization of explanatory factors, munication among humans. A mathematical representation is 4) semi-supervised learning, a structured form for communication among mathematicians. 5) shared factors across tasks, The differences of knowledge levels for IAI and XAI are due 6) manifolds, to the two forms of representations. We use Newton’s second 7) natural clustering, law as an example (Table III) to show the differences in repre- 8) temporal and spatial coherence, sentation forms and knowledge levels. The knowledge levels 9) sparsity, are given for a relative comparison. We can understand that a 10) simplicity. natural language representation will suffer the problems of in- They also considered GPs to be “Generic AI-level Priors”. completeness, ambiguity, unclarity, and/or inconsistency. The We advocate this notion and attempt to provide its formal problems will lead us for a shallow or incorrect understanding definition below: about the knowledge described. 
However, a mathematical rep- Definition 8: Generic priors (GPs) are a class of GCs which resentation will present a better form in describing knowledge exhibits the generality in AI machines and is independent to exactly and completely. Although one is able to apply a natural the specific applications. language representation for describing Newton’s second law Note that GPs may be given in a linguistic form so that they exactly and completely, there exists no equality of sets for are not CCs. No fixed boundary exists between GPs and non- the two forms of representations. We need to note that GCs GPs (or called specific priors). Numerous studies have been may be given in terms of either IXI or XAI, such as “different reported on seeking GPs. Only some of them are given below parts of perceptual input have common causes in the external with different backgrounds. 6

Fig. 6. Hierarchical diagram of classifying methods in adding transparency to black-box models [41], [42].

1) Objective priors: When using Bayesian tools, nonin- • A mathematical and canonical representation of hints is formative prior [44] and maximum entropy prior [45] are necessary so that a learning algorithm is able to deal with suggested for realizing a higher degree of objectivity in hints numerically in a unified form. reasoning. 5) GPs in unsupervised learning: In [60], Jain presented 2) Regularization: When addressing an inverse problem, several GPs in clustering studies, such as, criteria based on regularization is an important set of GPs to solve such ill-posed the Occam’s razor principle for determining the number of problem [46]–[48]. When L1 and L2 penalties are used for the clusters (say, AIC, MDL, MML, etc), sample priors (say, Lasso and ridge methods, they are corresponding to different must-link, cannot-link, seeding, etc). In unsupervised ranking GPs, sparseness [49] and smoothness [50], respectively. For of web pages in search engine studies, Google PageRank tensor (or matrix) approximations, a regularization penalty applied a semantic GP: “More important websites are likely can be a low-rank [51], or further GPs, such as, structured to receive more links from other websites” [61], based on the low-rank with nonnegative elements [52]. A semantic based link-structure data. When without such data in unsupervised regularization is expressed by a set of first-order logic (FOL) ranking of multi-attribute objects, Li et al. [62] suggested five clauses [53]. GCs, in the notion of “meta rules”, that should be satisfied for 3) Knowledge/fuzzy rules and GTU: Knowledge or fuzzy the ranking function, that is, rules [54], [55] provide a structured representation in embed- 1) scale and translation invariance, ding human semantic knowledge into the machines. Zadeh 2) strict monotonicity, proposed a GTU framework [7], [8] in order to enlarge GCs 3) compatibility of linearity and nonlinearity, by covering other theoretical tools like rough sets [56], belief 4) smoothness, functions [57], etc. 5) explicitness of parameter size. 4) Hints: Abu-Mostafa [58] was one of pioneers of propos- The important implication of the studies in unsupervised ing GCs in AI, but used notions of “intelligent hints” and ranking is gained below. “common sense hints” [59]. He summarized five types of hints, • Meta rules, or GPs, are important for assessing namely, learning tools in terms of “high-level knowledge” [62], 1) invariance, and become critically necessary when no ground truth 2) monotonicity, (say, labels) exists. 3) approximation, 6) GPs in classification: In [63], Lauer and Bloch presented 4) binary, a comprehensive review of incorporating priors into SVM 5) examples, machines. They considered two main groups of priors, namely, in learning unknown functions, and developed a canonical class invariance group and knowledge on data group. In [64], representation of the hints [58]. He pointed out that “Providing Krupka and Tishby took the notion of meta features, and listed the machine with hints can make learning faster and easier” several of them in association with the specific tasks. For [59]. The important implications of his study are: example, the suggested GPs in the handwritten recognition • In AI modeling, one needs to apply various types of hints, are or GCs, as much as possible “from simple observations 1) position, to sophisticated knowledge” [59]. 2) orientation, 7

3) geometric shape descriptors; catalog them with a few set of simple groups. We take a notion and in text classification are of “embedding modes” [42] and definite it below. 1) stem, Definition 9: Embedding modes refer to a few and specific 2) semantic meaning, means of embedding GCs into a model. 3) synonym, Hu et al. [41], [42] listed the three basic embedding modes 4) co-occurrence, (Fig. 6), namely, data, algorithm, and structural modes, re- 5) part of speech, spectively. Their classification of the basic embedding modes 6) is stop word. is roughly correct since some methods may share the different modes simultaneously. 7) PKR in regression: Considering a regression problem, Significant benefits will be gained from using the mode Hu et al. [2] considered GCs in the notion of PKR. They listed notion. First, one is able to reach a better understanding about ten types of PKR about nonlinear functions in a regression numerous methods in terms of the simple modes. Second, problem, that is, each basic mode presents a different level of explicitness 1) constants or coefficients, for knowing the priors embedded. Second, the best level is 2) components, suggested to be the structural mode, then followed sequentially 3) superposition, by the algorithm mode, and the data mode as the last [42]. 4) multiplication, When Mitchell [71] described “three different methods for 5) derivatives and integrals, using prior knowledge”, that is, “to initiate the hypothesis, 6) nonlinear properties, to alter the search objective, and to augment search opera- 7) functional form, tors”, we can say the three methods are basically within the 8) boundary conditions, algorithm mode. Lauer and Bloch [63] proposed a hierarchy 9) equality conditions, of embedding methods with three groups, namely, “samples”, 10) inequality conditions. “kernel”, and “optimization problem”. The first group is the For example, the first type of PKR, a physical constant, data mode and the last two are within the algorithm mode. For was studied from the Mackey-Glass dynamic problem in a a better understanding, we list a few of approaches in terms difference form: of their embedding mode below. ax(t−τ) 1) Data mode: This is a basic mode of incorporating GCs x(t +1)= 10 + (1 b)x(t). (5) 1+x (t−τ) − into a model mostly from its data used. In apart from the Within the observation data of the dynamics, one is able methods in the virtual samples discussed previously, the other to have the prior for a time delay τ, to be a positive, yet methods appeared, such as, unknown, integer in the problem. They showed solutions to 1) Data with labels [3] or ranking list [72], gain the estimations of physical constants τ and b using GCNN 2) Seeding data in learning [73], model based on radius basis function (RBF) networks. In 3) Weight prior [74], [42], Qu and Hu further studied “linear priors (LPs)” within 4) Cost matrix in imbalance learning [75], GCs, which was defined as “a class of prior information that 5) Universum data [76], exhibits a linear relation to the attributes of interests, such as 6) Spatial context in image patches [77]. variables, free parameters, or their functions of the models”. 2) Algorithm mode: This is a basic mode of incorporating The total 25 types of LPs were listed in regressions. GCs into a model mostly from its algorithm used. 
The methods 8) Virtual samples: Virtual samples are one of the im- mentioned previously, say, in objective prior or regularization, portant GPs given in either an explicit or implicit form in fall within this mode. The other methods can be: modeling. A few of approaches are listed below. 1) Objective functions and constraints [65], 1) Virtual views [21]: generating virtual views from a real 2) Activation functions or kernels [78], [79], view image, 3) Weight initialization [80], [81], 2) Adding noise [63], [65]: adding noise types and levels 4) Searching steps [82], into the input, output, and connection weights of ANNs 5) Meta learning [83], for robust predictions, 6) Transfer learning [84], 3) Graphical model [66], [67]: a framework of generating 7) Curriculum learning and self-paced learning [85], [86], virtual data with probability, 8) Dropout [87]. 4) SMOTE [68]: an approach of generating samples for the 3) Structural mode: This is a basic mode of incorporating minority class in imbalanced classification, GCs into a model mostly from its structure used, such as 5) Adversarial examples [69], [70]: generating adversarial examples for better representation learning in deep neu- 1) Decision tree [88], ral networks. 2) Neocognitron [89], 3) Convolutional neural network [90], 4) Fuzzy system [91], B. Embedding modes and coupling forms 5) Long short-term memory [92], When GCs are given, there exist numerous methods of 6) Bayesian networks [93], incorporating them. Some researchers [41], [63] suggested to 7) Markov networks [94], 8

the first principle or empirical function to be KD submodels. When simulating a bioreaction process, they solved partial differential equations (PDEs) by a conventional approach except that a vector of physical parameters was estimated by a neural network submodel. Their HNN model applied the coupling between the two submodels in a composition form:

Composition I : f(x,θ)= fk(x,θk = fd(x,θd)), (7)

where the prior was applied implicitly in which the physical parameter vector θ was a nonlinear function to the input Fig. 7. Schematic diagram of GC model including KD submodel and DD variables, rather than constant. We can call this form to submodel [11], [101]. Two sets of parameters, θk and θd are associated with be Composition I (with parameter for learning), which can the two submodels, respectively. . include a whole set or a part set of parameters for learning. This study is very enlightening to show an important direction 8) [95], to advance the modeling approach by the following implemen- 4) Hybrid mode: This is a mixed mode by including at lest tations: two basic modes for incorporating GCs into a model, such as • Any theoretical or empirical model can be used as KD 1) First-principle + ANN models [9], submodel so that the whole model keeps a physical 2) Neuro-fuzzy models [96], explanation to the system or process investigated. 3) Connectionist-symbolic models [97], [98], • The DD submodel, using either ANNs or other nonlinear 4) Graphical models [66], tools, is integrated so that some unknown relations, 5) SMOTE + C4.5 [68], nonlinearity, or parameters of the system investigated can 6) Markov logic networks [99], be learnt. 7) GCNNs [2], From the implementations above we can see that coupling 8) Generative adversarial nets [69], form seems to be an extra challenge in the design of HNN or 9) Graph Convolutional Network [100]. GC models. In 1994, Thompson and Kramer [10] presented Note that the classification above may be roughly correct an extensive discussions about synthesizing HNN models for to some approaches. For example, we put Bayesian networks various types of prior knowledge. After combining a paramet- within a structural mode to stress on its structural information, ric KD submodel and a nonparametric DD submodel, they but put graphical models within a hybrid mode to stress on its called the whole mode to be a semiparametric model. They structural information and generation of virtual data. considered two structures, parallel and serial, in coupling two For coupling forms, we adopt the notion of “Generalized submodels. Based on the two structures, Hu et al. [2] showed Constraint (GC)” model [11] for explanations. Fig. 7 depicts the four coupling forms in Fig. 8, and their mathematical a GC model, which basically consists of two modules, namely, expressions: knowledge-driven (KD) submodel and data-driven (DD) sub- model. For simplifying the discussion, a time-invariant model Superposition : f(x,θ)= fk(x,θk)+ fd(x,θd), Multiplication : f(x,θ)= fk(x,θk) fd(x,θd), is considered. A two-way coupling connection is applied be- ∗ (8) tween two submodels. For stressing on the modeling paradigm, Composition II : f(x,θ)= fd(fk(x,θk),θd), a GC model is considered within the KDDM approach [101]. Composition III : f(x,θ)= fk(fd(x,θd),θk). The general description of a GC model is given in a form of The four forms are simple and common in applications. In [11]: a parallel structure with a superposition form (Fig. 8(a)), when a KD submodel serves as a core element to simulate the main y = f(x,θ)= fk(x,θk) fd(x,θd), ⊕ (6) trends about the process investigated, a DD submodel works θ = (θk,θd), θk θd = , ∩ ∅ as an error compensator for uncertainties from the residual where x n and y m are the input and output vectors, ∈ R ∈ R between the KD submodel output and the target output. 
One f is a function for a complete model relation between x and good example is a design of a nonlinear Kalman filter in [102], y, fk and fd are the functions associated to the KD and DD where they applied a linear Kalman filter as a KD submodel (p+q) submodels, respectively. θ is the parameter vector and a neural network as a DD submodel to compensate for p∈ R q of the function f, θk and θd are the parameter ∈ R ∈ R nonlinear deviations of the state variables. A multiplication vectors associated to the functions fk and fd, respectively. The form (Fig. 8(b)) was shown on a “Sinc” function in [2]: symbol “ ” represents a coupling operation between the two submodels.⊕ sin √x2+y2 f(x, y)= 2 2 , (9) Definition 10: Coupling forms refer to a few and specific x +y means of integrating the KD and DD submodels when they where the upper part was known, and the lower part was are available. approximated by a RBF submodel. Hu et al. [2] used this In fact, we can view the given GCs to be a KD submodel. example to show a possible problem introduced by a coupling In 1992, Psichogios and Ungar [9] proposed an idea of using which is discussed in the next section. 9

Definition 13 ( [110]): “A problem is ill-defined when essen- tial concepts, relations, or solution criteria are un- or under- specified, open-textured, or intractable, requiring a solver to frame or recharacterize it. This recharacterization, and the resulting solution, are subject to debate.” Hadamard [111] was the first to define a mathematical term well-posed problems in mathematical modeling. Three criteria are given about well-posed definition to reflect three Fig. 8. Four coupling forms of GC models [2]. Parallel structure: (a) mathematical properties, namely, existence, uniqueness and Superposition, (b) Multiplication. Serial structure: (c) Composition II (with continuity, respectively. Based on that, Poggio et al. proposed inner function known), (d) Composition III (with outer function known). a simply definition of ill-posed problems below: Definition 14 ( [47]): “Mathematically ill-posed problems In a serial structure, another two composition forms are are problems where the solution either does not exist or is not shown in eq. (8). The form of Composition II (Fig. 8(c)) unique or does not depend continuously on the data.” corresponds to the inner function to be known as a KD sub- Any violation of one of the three properties will result in model, such as a wavelet preprocessor before a neural network an ill-posed problem. Constraint mathematization works like a submodel in EEG analysis [103]. The idea of Composition III direct way of semantic gap, in which GCs may be ill defined. appeared in [104] to force the output to be consistent with However, object recognition or constraint learning is an inverse way in which one generally encounters the problems with both a distal teacher, like a KD submodel fk(z,θk) in Fig. 8(d). Coupling forms can go quite complicated if considering more ill-defined and ill-posed characterizations. other structures, such as a cyclic loop in a dynamic process In [53], [112], good examples were shown in constraint [105], or using a combination of various forms [106], [107]. mathematization of first-order logic GCs for kernel machines and DNNs, respectively. However, it may be not easy to conduct constraint mathematization directly. For example, C. Extra challenges in embedding GCs affective computing [113] will face more serious problems of This subsection will discuss three extra procedures, or chal- semantic gap and ill-defined problems. The main difficulty lenges, which are not discussed in the conventional ML study. sources mostly come from ambiguity and subjectivity of the More challenges exists, such as the mathematical conditions linguistic representations about emotional or mental entities in of guaranteed better performance for GC models over the modeling. In a study of emotion/mood recognition of music models without using GCs [2]. [114], the linguistic terms were used, such as “valence” and 1) Constraint mathematization: GC can be given in any “arousal” in Thayer’s two-dimensional emotion model [115], representation forms shown in Table I. Actually, GCs in mod- and emotion states like “happy”, “sad”, “calm”, etc. Many eling are initially described by natural language. Therefore, in low-level and high-level features were used for establishing general, a procedure will be involved as defined below: the relationship between music and emotion states. This study Definition 11: Constraint mathematization refers to a pro- shows that we may need a model for constraint mathematiza- cedure in modeling to transform GCs from natural language tion. 
descriptions into mathematical representations. 2) Coupling form selection: If a GC model (Fig. 7) is used, If GCs are given in a form of mathematical representations, a new procedure will appear in modeling as: the procedure above will be passed in modeling. However, Definition 15: Coupling form selection (CFS) refers to a when GCs are known only in a form of natural language procedure in modeling to select a coupling form between the descriptions, this procedure must be processed, like the given KD and DD submodels when they are available. example in Table I. The most difficulty in constraint math- The subject of coupling form selection has not received ematization is due to the problem called semantic gap. When much attention in ML or AI studies. Sometimes, the selection various definitions about it exist from different contexts, such is determined by the problem given, like the multiplication as from retrieval applications in [108], we adopt the general form in approximation of “Sinc” function in eq. (9), where definition: its upper part is known. In a general case, for the same Definition 12 ( [109]): “A semantic gap is the difference GC or KD submodel, there may exist several, yet different, in meanings between constructs formed within different repre- coupling forms to combine with a DD submodel. One example sentation systems.” is about monotonicity of the prediction function with respect In [4], Hu suggested consider the semantic gap further by to some of the inputs. In [116], Gupta et al. presented 20 distinguishing two ways of transformations in modeling. A methods, including their own method. They divided those direct way is to transform natural language descriptions into methods within four categories: A. constrain monotonic func- mathematical representations. An inverse way is opposite to tions, B. post-process after training, C. penalize monotonicity the direct one. Hence, the problem of semantic gap can be violations when training, D. re-label samples to be monotonic further extended by the two mathematical problems, namely, before training. The four categories are better to be viewed ill-defined and ill-posed problems. In [110], Lynch et al. in a structural mode with different coupling forms, such as discussed various definitions of an ill-defined problem from Categories A and C are a superposition in Fig. 8(a), B in Fig. the different researchers and proposed their definition below: 8(d), and D in Fig. 8(c), respectively. 10

Fig. 3 again, if one relies on Fig. 16 for the correct recognition, we still do not know with which mathematical methods do our brains apply for describing and imposing the constraints. In [117], Cao et al. attempted to address this question based on the “Locality Principle (LP) ” [118], [119]. LP is originally come from classical physics in the law of gravity and states that an object is mostly influenced by its immediate sur- roundings. In , Denning and Schwartz [118] pointed out in 1972 that “Programs, to one degree or another, obey the principle of locality” and“The locality principle flows from human cognitive and coordinative behaviory.” In 2005, Denning further described that: “The locality principle found application well beyond virtual memory” and “Locality of reference is a fundamental principle of computing with many applications.” In neuroscience, LP can be supported from the knowledge: certain locations of a brain are responsible for certain functions (e.g. language, perception, etc.) [120]. Hence, Cao et al. [117] stated that “All constraints can be viewed as memory. The principle provides both time efficiency and energy efficiency, which implies that constraints are better to be imposed through a local means.” They proposed a non- Lagrange multiplier method for equality constraints, falling Fig. 9. The knowledge-and-data-driven model (KDDM) with two cases of within a locally imposing scheme (LIS). The main idea coupling: (a) superposition coupling operator and (b) composition coupling behind the proposed method was to make local modifications operator [101]. to the solutions after constraints were added. The method transformed equality constraint problems into unconstrained ones and solved them by a linear approach, so that convexity In [101], Fan et al. showed a study of CFS in their KDDM of constraints was no more an issue. The numerical examples approach (Fig. 9). When a KD submodel (called GreenLab) were given on solving PDEs with Dirichlet or Neumann was given, they used RBF networks as a DD submodel. Two boundary constraints. The proposed method was possible to models, KDDMsup in Fig. 9(a) and KDDMcom in Fig. 9(b), achieve an exact satisfaction of the equality constraints, while were investigated. The input variables were five environmental the Lagrange multiplier method presented only approxima- factors and the output was the total growth yield of tomato tions. They further compared the two methods numerically crop. After using the real-world data of tomato growth datasets on a one dimensional “Sinc”, f(x) = sin(x)/x, with the from twelve greenhouse experiments over five years in a 12- interpolation constraints as: f(0) = 1 and f(π/2)=2/π. The fold cross-validation testing, KDDMcom was selected because comparison aimed at “how to discover Lagrange multiplier it showed a better prediction performance over KDDMsup. In method to be globally imposing scheme (GIS) or LIS?” They fact, KDDMcom demonstrated a better interpretation about one found the answers to be interpretation form depended. Suppose variable, E, in plant growth dynamics than that of KDDMsup. an interpretation form is given in the following expression: Their study also demonstrated several advantages of KDDM approach over the conventional KD or DD models, such as f(x)= fwc(x)+ fm(x), (10) predictions of yields from different types of organs even some where fwc(x) is the RBF output without constraints and fm(x) training data were missing. is the modification output after constraints are added in. 
From One can see that CFS is an extra challenge over the Fig. 10, one can see that fm(x) shows a global modification, conventional modeling studies, but required to be explored or GIS, when using the Lagrange multiplier, but a local and systematically. We still need to define related criteria in the smooth modification, or LIS, when using the proposed method. selection. When a performance is a main concern, some other This interpretation form provided a locality interpretation from issues may be also considered, such as learning speed, or a“signal modification” sense. They also investigated the other interpretation of a model. CFS presents a new subject in AI interpretation form, but failed to give the clear conclusion from studies when more models or tribes are merged together a “synaptic weight changes” sense. as the Master Algorithm. The study in [117] is very preliminary, but shows an 3) Bio-inspired scheme for imposing constraints: In the important direction for the future ML or AI studies by the conventional ML study, the common scheme for imposing following aspects: constraints is the Lagrange multiplier as a standard method • Locality principle is one of the important bases for [47], [48], [74]. The question can be asked like “When ANNs realizing time efficiency and energy efficiency of bio- emulate the synaptic plasticity functions of human brains, inspired machines. We need to transfer principles, or does a human brain apply the Lagrange multiplier if giving high-level GCs, into mathematical methods in AI ma- a new set of constraints”? Reviewing the image example of chine designs. Convolutional neural networks (CNNs) 11

about extracting approaches. Furthermore, extracting GCs can be viewed as another subject defined below: Definition 16: Generalized constraint learning (GCL) refers to a problem in machine learning to obtain a set of generalized constraint(s) which is embedded in the dataset(s) and/or the system(s) investigated. GCL is an extension of prior learning (PL) [134], con- strained based approach (CBA) [48], and constraint learning (CL) [135]. It stresses GCs from both data and system sources. GCL suggests that AI machines should serve as a tool Fig. 10. Modification plots, fm(x), for a Sinc function in which two constraints are added at x = 0 and x = π/2, respectively, (a) a global for scientific discovery [136], such as on human brains. modification using the Lagrange multiplier and (b) a local and smooth Supposing that a recognition mechanism, or an objective modification using the proposed method [117]. function, is fixed in a brain, GCs will explain why we identify a man, rather than a woman, from Fig. 3. In the followings, [90] and RBFs [121] show the good examples in terms of we review the extracting approaches, or GCL, according to the principle, satisfying a restricted region of the receptive the representation formats in Fig. 6. field in biological processes [122]. More investigations are needed to explore LP in a wider sense for designs, A. Extracting linguistic GCs such as LIS on the inequality constraints, or its role The form of linguistic representation of knowledge consid- in continuous learning with another important GC: no ered here is symbolic rules, that is, catastrophic forgetting (NCF) [123], [124]. • The hypothesis and selection of LIS and/or GIS sug- Rule : If < condition > Then < result > . (11) gest a new subject in the study of brain modeling or In 1988, Gallant [137] was a pioneer of extracting rules bio-inspired machines. When LP is rooted on the clas- using ANN machines, for the purpose of generating an expert sical physics, quantum mechanics might violate locality system automatically from data. He called such machine from the entanglement phenomenon [125]. The different connectionist expert system (CES). The rules extracted were studies have been reported about brain modeling based conventional (Boolean) symbolic ones. This work was very on quantum theory, such as quantum mind [126], [127] stimulating by showing a novel way for ANN modeling to and quantum cognition [128]. Those studies suggested obtain explicit knowledge: that consciousness should be a quantum-type prior, and should be treated with non-classical methods. The subject Explicit knowledge Data ANNs CES (12) shows that LP or nonlocality will be a main issue for → → → with extracted rules, realizing a brain-inspired machine, and GCs should be in comparison with a conventional way: treated according to their principle behind. Implicit knowledge Data ANNs (13) IV. EXTRACTING GCS → → with parameters. In 1983, Scott [129] examined the nature of learning and According to Haykin [78], ANNs gain knowledge from data presented the definition below: learning and store the knowledge with its model parameters, “Learning is any process through which a system acquires i.e. synaptic weights. Dienes and Pemer [138] considered synthetic a posteriori knowledge.” that ANNs produce a type of implicit knowledge from those The definition exactly reflects the purpose of learning in parameters. AI systems. 
After a posterioi knowledge is acquired, AI After the study of Gallant [137], a number of investigations systems are better to take it as a priori so that the systems [98], [139]–[145] appeared on rule extraction from ANNs. In are able to advance along with explicit knowledge updated [139], Wang and Mendel developed an approach of generating and increased through an automatic or interactive means. fuzzy rules from numerical data. Fuzzy rules are a linguistic Extracting knowledge is a main concern in the areas of form to represent vague and imprecise information. In their knowledge discovery in databases (KDD) and data mining fuzzy model, its fuzzy rule base was able to be formed by [130]–[133]. It is also an important strategy in the context experts or rules extraction from the data. Fuzzy models are the of adding transparency of ANN modeling [2] or IAI/XAI same as ANNs in the sense of universal approximator [91], [33]. Fig. 6 shows a selection of representation formats (or [146], but beneficial with regards to rules (or linguistic GCs) forms) to be a first procedure within knowledge (or GCs) in modeling. In [142], Tsukimoto extracted rules applicable extracting approaches. Three basic forms are listed, namely, for both continuous and discrete values from either recurrent linguistic, mathematic and graphic, but the mathematic form neural network (RNN) or multilayer perceptron (MLP). Sev- is considered in a narrow sense here for not including the other eral review papers [147]–[149] reported on rule extractions two forms. The other form can be a non- or a and presented more different approaches. combination of the basic forms. This classification is roughly In evaluation of ANN based approaches, Andrew et al. [147] correct for the purpose of distinguishing the knowledge forms provided a taxonomy of five features. Among the features, the 12 one called quality seems mostly important, for which they FFS has received little attention in AI communities. For proposed four criteria in rule quality examinations, namely, simplification, most of the existing AI models directly apply accuracy, fidelity, consistency, and comprehensibility. They a preferred functional without involving FFS. If considering suggested the measurements of the criteria. This work shows AI to be a discovering tool, FFS should be considered first in an importance to evaluate quality of extracted GCs from modeling, particularly to complex processes, say, in economics different aspects. [150]. In view of system identification [151], [152], FFS will be more basic and difficult than parameter estimations in nonlinear modeling. The study in [34] shows an important B. Extracting mathematic GCs “gateway” in FFS, as well as in obtaining more fundamental In this subsection, we refer mathematic GCs in a narrow knowledge, functional forms, about physical systems. Similar sense without including the forms of linguistic and graphic to the subject of CFS, FFS shows another open problem representations. We present several investigations which fall requiring a systematic study for ML or AI communities. within the subjects of extracting mathematic GCs. 3) Parameter identifiability: Bellman and Astr¨om˚ [153] 1) GCL for unknown parameters or properties: In [2], a proposed the definition of structural identifiability (SI) in GC was given first in a linguistic form, as shown by the GCs modeling. The term “structural” means the internal structure example in Table 1. 
This example describes the Mackey-Glass of a model, so that SI is independent of the data applied. dynamic process, shown in eq. (5). The linguistic GC (or Because any ML model with a finite set of parameters can hypothesis) was finally transformed into a mathematic form be viewed as a “parameter learning machine” [11], Ran as and Hu adopted the notion of parameter identifiability to be x(t +1)= f (x τ) + (1 b)x(t). (14) the theoretical uniqueness of model parameters in statistical { − } − learning [154]. Yang et al. [155] showed an example why For the given time series dataset with τ = 17 and b =0.1, Hu parameter (or structural) identifiability becomes a prerequisite et al. [2] used their GCNN model for the prediction and also before estimating a physical parameter in GCNN model (Fig. obtained the solutions τˆ = 17 and ˆb =0.1096. 8(b)) in a form of: In the same paper, Hu et al. also used GCNN as a hypothesis −αx y = fk(x, θk) fd(x, θd)= e fd(x, θd), (15) tool to discover the hidden property of data in approximating × × “Sinc” function, eq. (9). For the given GC, a GCNN model where damping coefficient α (> 0) is a single physical exhibited an abrupt change in the output f when x = y =0. parameter in the KD submodel fk(x, θk). They demonstrated They suggested the hypothesis of “removable singularity” for that α is unidentifiable if fd(x, θd) is a RBF submodel, which the abrupt change, and confirmed it from the data and analysis suggests no guarantee of convergence to the true value of α. procedures. Their study showed that ML models can be a tool However, α becomes identifiable if a sigmoidal feed-forward for identifying unknown parameters and hidden properties of network (FFNs) submodel is applied. In a review paper [154], physical systems. Ran and Hu presented the reasons of identifiability study in 2) GCL for unknown functionals: Recently, Alaa and van ML models: der Schaar [34] proposed a novel and encouraging direction “Identifiability analysis is important not only for models to extract mathematic GCs in form of analytic functionals whose parameters have physically interpretable meaning, but from ANN models. In their study, a black-box ANN model also for models whose parameters have no physical im- was described by f(x). They formed a symbolic metamodel plications, because identifiability has a significant influence g(x) using Meijer G-functions. The class of Meijer G- on many aspects of a learning problem, such as estimation functions includes a wide spectrum of common functions used theory, hypothesis testing, model selection, learning algorithm, in modeling, such as, polynomial, exponential, logarithmic, learning dynamic, and Bayesian inference.” trigonometric, and Bessel functions. Moreover, Meijer G- For example, Amari et al. [156] showed that FNNs or Gaus- functions are analytic and closed-form functions, even for sian mixture models (GMMs) will exhibit very slow dynamics their differential forms. By minimizing a “metamodeling loss” of learning due to the nonidentifiability of parameters from l(g(x),f(x)), they obtained g(x) with a parameterized repre- plateaus of neuromanifold. Hence, parameter identifiability sentation of symbolic expressions. They conducted numerical provides mathematical understanding, or explainability, experiments on four different functions as the given GCs, about learning machines in terms of their parameter that is, exponential, rational, Bessel and sinusoidal functions, spaces involved. For details about parameter identifiability in respectively. 
2) GCL for unknown functionals: Recently, Alaa and van der Schaar [34] proposed a novel and encouraging direction to extract mathematic GCs in the form of analytic functionals from ANN models. In their study, a black-box ANN model was described by f(x). They formed a symbolic metamodel g(x) using Meijer G-functions. The class of Meijer G-functions includes a wide spectrum of common functions used in modeling, such as polynomial, exponential, logarithmic, trigonometric, and Bessel functions. Moreover, Meijer G-functions are analytic and closed-form functions, even for their differential forms. By minimizing a "metamodeling loss" l(g(x), f(x)), they obtained g(x) with a parameterized representation of symbolic expressions. They conducted numerical experiments on four different functions as the given GCs, that is, exponential, rational, Bessel and sinusoidal functions, respectively. Their approach was able to figure out the first three functional forms among the four functions.
In Fig. 1, we show that functional space is among the primary concerns in ML or AI. When the true functional form of the system investigated is unknown, modelers will involve a subject defined below:
Definition 17: Functional form selection (FFS) refers to a procedure in modeling to select a suitable functional form for describing the system investigated from a set of candidate functionals.
FFS has received little attention in AI communities. For simplification, most of the existing AI models directly apply a preferred functional without involving FFS. If AI is considered to be a discovery tool, FFS should be considered first in modeling, particularly for complex processes, say, in economics [150]. In view of system identification [151], [152], FFS will be more fundamental and more difficult than parameter estimation in nonlinear modeling. The study in [34] shows an important "gateway" in FFS, as well as in obtaining more fundamental knowledge, namely functional forms, about physical systems. Similar to the subject of CFS, FFS is another open problem requiring a systematic study by the ML or AI communities (a minimal sketch of an FFS step is given after the identifiability discussion below).
3) Parameter identifiability: Bellman and Åström [153] proposed the definition of structural identifiability (SI) in modeling. The term "structural" means the internal structure of a model, so that SI is independent of the data applied. Because any ML model with a finite set of parameters can be viewed as a "parameter learning machine" [11], Ran and Hu adopted the notion of parameter identifiability as the theoretical uniqueness of model parameters in statistical learning [154]. Yang et al. [155] showed an example of why parameter (or structural) identifiability becomes a prerequisite before estimating a physical parameter in a GCNN model (Fig. 8(b)) in the form of:

y = fk(x, θk) × fd(x, θd) = e^(−αx) × fd(x, θd),    (15)

where the damping coefficient α (> 0) is a single physical parameter in the KD submodel fk(x, θk). They demonstrated that α is unidentifiable if fd(x, θd) is an RBF submodel, which implies no guarantee of convergence to the true value of α. However, α becomes identifiable if a sigmoidal feed-forward network (FFN) submodel is applied. In a review paper [154], Ran and Hu presented the reasons for studying identifiability in ML models:
"Identifiability analysis is important not only for models whose parameters have physically interpretable meaning, but also for models whose parameters have no physical implications, because identifiability has a significant influence on many aspects of a learning problem, such as estimation theory, hypothesis testing, model selection, learning algorithm, learning dynamic, and Bayesian inference."
For example, Amari et al. [156] showed that FFNs or Gaussian mixture models (GMMs) will exhibit very slow learning dynamics due to the nonidentifiability of parameters on the plateaus of the neuromanifold. Hence, parameter identifiability provides mathematical understanding, or explainability, about learning machines in terms of the parameter spaces involved. For details about parameter identifiability in ML, see [67], [154], [157] and the references therein.
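The following is a minimal numerical illustration of the identifiability issue around eq. (15), under assumed settings rather than the exact GCNN analysis of [155]: when the data-driven submodel is an RBF network, refitting its weights with α held at different values yields an almost flat error profile, i.e., the data barely constrain α.

```python
# A minimal illustration of the identifiability issue around eq. (15):
# for y = exp(-alpha*x) * f_d(x) with f_d an RBF submodel, refitting the RBF
# weights for each fixed alpha gives a nearly flat error profile, so alpha is
# practically non-identifiable.  All settings below are assumptions, not [155].
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 3.0, 300)
alpha_true = 1.0
g = 1.0 + 0.5 * np.sin(2 * x)                        # stand-in "true" data-driven part
y = np.exp(-alpha_true * x) * g + 0.01 * rng.normal(size=x.size)

centers = np.linspace(0.0, 3.0, 25)
Phi = np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2 * 0.2 ** 2))   # RBF features

def profile_mse(alpha):
    """Best achievable training MSE when alpha is fixed and the RBF weights are refit."""
    A = np.exp(-alpha * x)[:, None] * Phi            # model: exp(-alpha*x) * sum_j w_j phi_j(x)
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.mean((A @ w - y) ** 2)

for alpha in (0.25, 0.5, 1.0, 1.5, 2.0):
    print(f"alpha = {alpha:4.2f}  ->  refit MSE = {profile_mse(alpha):.2e}")
# A near-constant MSE across alpha values is the practical signature of a
# non-identifiable physical parameter in this knowledge-and-data coupling.
```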

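Returning to Definition 17, a minimal sketch of a functional form selection step is given below: several candidate closed forms are fit to the same data and compared with an AIC-style score. The candidate forms, the synthetic data, and the scoring rule are illustrative assumptions only.

```python
# A minimal sketch of functional form selection (FFS, Definition 17): fit several
# candidate closed forms to the same data and compare them with a simple AIC score.
# The candidates, the synthetic data, and the scoring rule are assumptions only.
import numpy as np
from scipy.optimize import curve_fit

candidates = {
    "exponential": lambda x, a, b: a * np.exp(-b * x),
    "power law":   lambda x, a, b: a * np.power(x + 1.0, -b),
    "rational":    lambda x, a, b: a / (1.0 + b * x),
}

rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 120)
y = 2.0 * np.exp(-1.3 * x) + 0.03 * rng.normal(size=x.size)   # the "unknown" process

def aic(n, k, rss):
    """Akaike information criterion for a Gaussian-noise least-squares fit."""
    return n * np.log(rss / n) + 2 * k

for name, f in candidates.items():
    params, _ = curve_fit(f, x, y, p0=[1.0, 1.0], maxfev=10000)
    rss = np.sum((f(x, *params) - y) ** 2)
    print(f"{name:12s}  params = {np.round(params, 3)}  AIC = {aic(x.size, 2, rss):7.1f}")
# The lowest AIC points to the functional form best supported by the data; in a GC
# setting this selected form can then serve as a knowledge-driven submodel.
```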
C. Extracting graphic GCs
Humans are more efficient at gaining knowledge through graphic means than through other means, such as linguistic ones. This paper considers graphic GCs in a wider sense. We refer to graphic GCs as GCs described by a graphic representation as well as shown by its graphic form. Hence, this subsection also includes graphic approaches for extracting GCs.

Fig. 11. Visualization results with t-SNE on the MNIST dataset (modified based on [162] by adding labels to the clusters).

1) Data visualization for GCs: Data visualization (DV) is a graphic representation of data for modelers/users to understand the features of the data. In ML studies, DV is one of the most common techniques in modeling, so that modelers are able to gain the related GCs both as design guidelines and for interpreting the responses of models [158]. This technique can extend to the subjects of information visualization (IV) [159] or visual analytics (VA) [160]. Some DV tools are able to present knowledge from data in a mathematic form, such as a histogram, heatmap, etc. We present an example below to show that this is not the case in general. DV tools usually exhibit knowledge in a form of GCs, which users have to seek and abstract. Constraint mathematization will be another challenge in the abstraction of knowledge.
When high-dimensional data are difficult to interpret, researchers have developed many approaches for viewing their intrinsic structures in a low-dimensional space [161]. In [162], van der Maaten and Hinton proposed an approach called t-SNE (t-Distributed Stochastic Neighbour Embedding) for visualization of high-dimensional data in a 2-D or 3-D space. Different from a PCA approach, t-SNE is a nonlinear dimensionality reduction technique. For the handwritten digits 0 to 9 in the MNIST dataset, they showed the 2-D visualization results with t-SNE (Fig. 11). The raw data are given in a high-dimensional space with a size of 784 (28 × 28 pixels). The t-SNE plot of Fig. 11 clearly demonstrates the hidden patterns of clusters in the high-dimensional data. The patterns present a rather typical form of GCs instead of CCs. Modelers need to summarize them in the form of either natural language or mathematical descriptions, such as visual outliers [163] in classifications, or semantic emergence, shift, or death in dynamic word embeddings [164] from t-SNE plots.
The biggest challenge in visualization is to develop or seek a DV tool suitable for revealing intrinsic properties, or theoretical understanding, of the data. One good example is the study by Shwartz-Ziv and Tishby [165] proposing a DV tool called the information plane. This tool presented a novel visualization of data from an information-theoretic understanding, and was used for explaining the structural properties and learning properties of DNNs.
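A minimal sketch of the DV step behind Fig. 11 is given below, using scikit-learn's small 8 × 8 digits data as a stand-in for the full 28 × 28 MNIST set; the perplexity and initialization settings are assumptions.

```python
# A minimal sketch of the data-visualization step behind Fig. 11: embed handwritten
# digits into 2-D with t-SNE [162] and plot them by class.  scikit-learn's small
# 8x8 digits set stands in for MNIST; the t-SNE settings are assumptions.
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

digits = load_digits()                       # 1797 samples, 64 features (8x8 images)
emb = TSNE(n_components=2, perplexity=30, init="pca",
           random_state=0).fit_transform(digits.data)

plt.figure(figsize=(6, 5))
sc = plt.scatter(emb[:, 0], emb[:, 1], c=digits.target, s=6, cmap="tab10")
plt.colorbar(sc, label="digit class")
plt.title("t-SNE embedding of the digits data")
plt.tight_layout()
plt.show()
# The clusters that emerge are exactly the kind of unstructured GC that a modeler
# still has to summarize in linguistic or mathematical form.
```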
2) Model visualization for GCs: Model visualization (MV) is a graphic representation of ML/AI models, regarding their architectures, learning processes, decision patterns, or related entities, for modelers/users to understand the properties of the models. MV is another means to foster insights into models, and it may often involve techniques from DV, such as learning trajectories in optimizations [166], or a heatmap for explaining convolutional neural networks (CNNs) [167], [168].
Setting aside the common representation of architecture diagrams of models, it was reported that the Hinton diagram [169] was among the first tools for visualizing ANN models [170]. This tool was designed for viewing connection strengths in ANNs, using color to denote sign and area to denote magnitude. With the help of the Hinton diagram, Wu et al. [171] gained knowledge about unimportant weights for pruning ANN architectures. In 1992, Craven and Shavlik [170] reviewed a number of visualization techniques for understanding the decision-making processes of ANNs, such as Hinton diagrams [169], bond diagrams and trajectory diagrams [172], etc. They also developed a tool, called Lascaux, for visualizing ANNs. Graphic symbols were used to show the connections and neurons with activation and error signals, so that modelers were able to gain some insight into the learning behavior of ANNs. In the early studies of MV, some researchers proposed "illuminating" or "opening the black box" of ANN models [173], [174]. In [173], Olden and Jackson applied the neural interpretation diagram (NID), Garson's algorithm [175], and sensitivity analysis, so that variable selections could be understood by visual perception of modelers. In [174], Tzeng and Ma used a new MV technique for revealing the working architectures of ANNs when deciding whether a voxel of an image corresponds to brain material or not. The different architectures provided relational knowledge between the important variables and voxel types.
As deep learning (DL) models emerged, more MV studies were reported on understanding the inner workings of this type of black-box model [176], [177]. In [178], Liu et al. developed an MV tool, called CNNVis, for visualizing deep CNNs. Fig. 12 shows a CNN model including four groups of convolutional layers and two fully connected layers. The model consisted of thousands of neurons and millions of connections in its architecture. Using CNNVis to simplify large graphs, they were able to show representations in clusters for a better and faster understanding of the pattern knowledge of CNN models. They conducted case studies on the influence of network architectures and the diagnosis of a failed training process to demonstrate the benefits of using CNNVis (a much-simplified sketch of this kind of activation-level inspection is given after Fig. 12 below). In a different GC form, Zhang et al. [179] applied decision trees in a semantic form to interpret CNN models. For other DL models, one can refer to the studies on recurrent neural networks (RNNs) [180], [181] and reinforcement learning (RL) [182], [183]. Some review papers [177], [184], [185] have appeared recently on the visualization of DL models. In general, MV techniques provide intuitive GCs in an unstructured form, and present a human-in-the-loop interaction with AI [185], such as to modify models or to adjust learning processes [160], [186].
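A minimal Hinton-diagram-style view in the sense of [169], [170] can be sketched as below: each square's area encodes |w| and its colour encodes the sign of the weight. The random weight matrix is only a stand-in for a trained layer.

```python
# A minimal Hinton-diagram-style view of a weight matrix in the sense of [169], [170]:
# square area encodes |w|, colour encodes the sign.  The random matrix is a stand-in
# for a trained ANN layer.
import numpy as np
import matplotlib.pyplot as plt

def hinton(W, ax=None):
    ax = ax or plt.gca()
    ax.set_facecolor("gray")
    max_w = np.abs(W).max()
    for (i, j), w in np.ndenumerate(W):
        size = np.sqrt(abs(w) / max_w)                 # area proportional to |w|
        color = "white" if w > 0 else "black"          # colour encodes sign
        ax.add_patch(plt.Rectangle((j - size / 2, i - size / 2), size, size,
                                   facecolor=color, edgecolor=color))
    ax.set_xlim(-1, W.shape[1]); ax.set_ylim(-1, W.shape[0])
    ax.invert_yaxis(); ax.set_aspect("equal")
    ax.set_xticks([]); ax.set_yticks([])

W = np.random.default_rng(0).normal(size=(8, 12))      # stand-in layer weights
hinton(W)
plt.title("Hinton-style diagram of a weight matrix")
plt.show()
# Small squares flag weights that contribute little -- the kind of visual GC that
# guided the pruning study of Wu et al. [171].
```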

Fig. 12. Diagram of CNNVis for visualizing deep convolutional neural networks [178], in which one is able to see the pattern flow in clusters.
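As a much-simplified illustration of the activation-level inspection that tools such as CNNVis [178] aggregate and cluster, the sketch below registers forward hooks on a tiny, randomly initialized PyTorch CNN and reduces the recorded feature maps to per-channel mean activations; the architecture and the random input are assumptions only.

```python
# A much-simplified illustration of the activation-level inspection that tools such
# as CNNVis [178] aggregate and cluster: forward hooks on a tiny PyTorch CNN record
# feature maps, which are reduced to per-channel mean activations.  The randomly
# initialized network and the random input are assumptions only.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10),
)

activations = {}
def hook(name):
    def _hook(module, inputs, output):
        activations[name] = output.detach()            # record the layer's feature maps
    return _hook

for idx, layer in enumerate(model):
    if isinstance(layer, nn.Conv2d):
        layer.register_forward_hook(hook(f"conv{idx}"))

with torch.no_grad():
    model(torch.randn(4, 1, 28, 28))                   # a batch of fake 28x28 images

for name, act in activations.items():
    per_channel = act.mean(dim=(0, 2, 3))              # average activation per channel
    print(name, "channel means:", per_channel.numpy().round(3))
# CNNVis-style tools cluster neurons/channels by such statistics so that a human can
# read off pattern knowledge (GCs) from a model with millions of connections.
```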

Fig. 13. Diagram of ExpressGNN for combining MLN and GNN using the variational EM framework [187].

3) Discovering GCs from graphical models: In general, we can say that graphical models (GMs), using graphic representations, are the best and most direct means to describe worlds, either physical or cyber ones. The main reason lies in the facts below: (i) most objects are distinguished by their geometries, and (ii) they may be linked internally through their own entities or externally to each other. For every object or dataset, the underlying graph structure is the most important knowledge we need to explore [188]. Buntine [189] pointed out that "Probabilistic graphical models are a unified qualitative and quantitative framework for representing and reasoning with probabilities and independencies", and "are an attractive modeling tool for knowledge discovery". We will present two sets of studies below to show that the knowledge discovered in GMs is often given in a form of GCs, instead of CCs.
In [187], Zhang et al. developed a tool, called ExpressGNN, by combining Markov logic networks (MLNs) [99] and graph neural networks (GNNs). The motivation came from the facts that MLN inference is computationally intractable (an NP-complete problem) and that data-driven GNNs are unable to leverage the domain knowledge. Fig. 13 shows the working principle of ExpressGNN, in which a knowledge graph (KG) is a knowledge base representing a collection of interlinked descriptions of entities. ExpressGNN scales MLN inference to KG problems and applies a GNN for variational inference. With the GNN controllable embeddings, ExpressGNN was able to achieve more efficiency than MLNs in logical reasoning as


Fig. 14. Example of zero-shot learning [190] for classifying unseen classes in the training data, where the unseen classes are truck and cat with some semantic information given. well as a balance between expressiveness and simplicity of the model. In the numerical investigations, they demonstrated that ExpressGNN was able to fulfill zero-shot learning (ZSL). Fig. 14 illustrates an example [190] about ZLS, which is a learning to recognize unseen visual classes with some semantic information (or GCs) given. The term “zero-shot” means no instance to be given in the training dataset. ZLS is a kind (b) of GCL because semantic information and/or unseen classes may be ill-defined or incompletely described in a form of GCs. Fig. 15. Time-varying Trajectory in 3-D embedding for (a) Newcentury Financial Bankruptcy (4th April 2007) and (b) Lehman Brothers Bankruptcy Semantic gap may be involved for GMs, both in encoding and (15th September 2008), where the data in 90 trading days (i.e., 90 points) decoding of GCs, such as discovering unseen classes. were taken around each of the two events. [191]. Another example is about studying co-evolving financial time-serious problems from GMs. In [191], Bai et al. proposed a tool, called EDTWK (Entropic Dynamic Time Warping achieved much better performance than the conventional NLP Kernels), for time-varying financial networks. They computed models. Due to a transparent feature of GMs [67], more studies the commute time matrix on each of the network structures for have been reported, such as on GNNs [100], [196], [197] and satisfying a GC as “the financial crises are usually caused by references therein. a set of the most mutually correlated stocks while having less uncertainty [191], [192]”. Based on the commute time matrix, V. SUMMARY AND FINAL REMARKS a dominant correlated stock set was identified for processing. In this paper, we presented a comprehensive review of GCs They conducted numerical investigations from the New York as a novel mathematical problem in AI modeling. Comparisons Stock Exchange (NYSE) dataset. For the given data, EDTWK between CCs and GCs were given from several aspects, such was formed as a family of time-varying financial network with as problem domains, representation forms, related tasks, etc. a fixed number of 347 vertices from 347 stocks and varying in Table I. One can observe that GCs provide a much larger edge weights for the 5976 trading days. They used EDTWK to space over CCs in the future AI studies. We introduced the classify the time-varying financial networks into corresponding history of GCs and considered the idea behind Zadeh [7] on stages of each financial crisis period. Five sets of the crisis GCs quite stimulating for us to go further. Various methods periods were studied, that is, Black Monday, Dot-com Bubble, were provided along a hierarchical way in Fig. 6, so that 1997 Asia Financial Crisis, Newcentury Financial Bankruptcy, one can get a direct knowledge about working format behind and Lehman Crisis. EDTWK was able to detect them from the every method. Although most researchers have not applied given data and to characterize different stages in time-varying the notion of GCs, they originally encountered problems of financial network evolutions. Some crisis events were shown GCs, rather than CCs. The given examples demonstrated GCs in 3-D embedding, such as the two plots from two events in to be a general problem in constructions of AI machines, Fig. 15. 
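For reference, the commute time matrix used by EDTWK [191] can be computed from the Moore–Penrose pseudoinverse of the graph Laplacian, C_ij = vol(G)(L⁺_ii + L⁺_jj − 2L⁺_ij) with vol(G) the sum of node degrees. The sketch below does this for a small synthetic weighted graph that merely stands in for the stock-correlation networks of [191].

```python
# The commute time matrix used in the EDTWK study [191] can be obtained from the
# Moore-Penrose pseudoinverse of the graph Laplacian:
#   C_ij = vol(G) * (L+_ii + L+_jj - 2 L+_ij),  vol(G) = sum of node degrees.
# A minimal sketch on a small synthetic weighted graph; the graph is only a stand-in
# for the stock-correlation networks of [191], not their actual construction.
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.uniform(0.1, 1.0, size=(n, n))
A = np.triu(A, 1)
A = A + A.T                                     # symmetric weighted adjacency, no self-loops

degrees = A.sum(axis=1)
L = np.diag(degrees) - A                        # graph Laplacian
L_plus = np.linalg.pinv(L)                      # Moore-Penrose pseudoinverse

vol = degrees.sum()
d = np.diag(L_plus)
C = vol * (d[:, None] + d[None, :] - 2 * L_plus)   # commute time matrix

print(np.round(C, 2))
# Small commute times indicate tightly coupled (highly correlated) nodes -- the
# quantity EDTWK uses to single out a dominant correlated stock set over time.
```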
One can see that the main challenge is still about and exhibited extra challenges. One may question about the how to discover and abstract GCs from the model and its “new mathematical problem” proposed in the paper, because visualization, either for modelers or for machines. it is significantly different with the conventional descriptions The examples given above indicate that, in AI modeling, it in mathematics. The history that Shannon [14] developed is a promising direction of merging other models with GMs, a mathematical theory of communication from seeking new or utilizing graphic representations as much as possible. The problems inspired the authors to identify the new mathematical studies in natural language processing (NLP) [193]–[195] have problem as a starting point for a mathematical theory of AI. demonstrated that GMs can process natural language and have However, defining new mathematical problems of AI will 16 be far more complicated than the conventions by covering a broad spectrum of disciplines. This paper attempted to support that GCs should be a necessary set in learning target selection [4] towards a mathematical theory of AI. GCs will define every learning approaches and their outcomes fundamentally. In the paper, we also presented our perspectives about future AI machines in related to GCs. We advocated the notion of big knowledge and redefined XAI for pursuing deep knowledge. GCs will provide a critical solution to achieving XAI. We expect that AI machines will advance themselves in terms of explicit knowledge via increased GCs for its big and deep knowledge goals. However, increased GCs will bring more theoretical challenges, such as: . • Theoretical Challenge 1: What kinds of GCs are intrin- Fig. 16. The approximation [201], or GCs, of the image in Fig. 3. sically undecidable [198], or unknowable [199]? The challenge itself falls within XAI and implies importance [7] ——, “Fuzzy logic = computing with words,” IEEE Trans. Fuzzy Syst., in seeking a theoretical boundary of AI. An excellent example vol. 4, no. 2, pp. 103–111, May 1996. is from Turing’s study on the for Turing [8] ——, “Generalized theory of uncertainty: Principal concepts and machines [200]. This theoretical study suggests that we need ideas,” in Fundamental Uncertainty, S. M. D. Brandolini and R. Scazz- to avoid the efforts of generating a perpetual motion machine ieri, Eds. London: Palgrave Macmillan, 2011, pp. 104–150. [9] D. Psichogios and L. H. Ungar, “A hybrid neural network – first in AI applications. For the concerns or studies like “Can AI principles approach to process modeling,” AIChE J., vol. 38, no. 10, takeover humans?”or “superhuman intelligence”, we suggest p. 1499–1511, Oct. 1992. that they should be finally answered in explanations of mathe- [10] M. L. Thompson and M. A. Kramer, “Modeling chemical processes using prior knowledge and neural networks,” AIChE J., vol. 40, no. 8, matics, rather than only in descriptions of linguistic arguments. p. 1328–1340, Aug. 1994. For the final remarks of the paper, we cite one saying from [11] Z.-Y. Ran and B.-G. Hu, “Determining structural identifiability of Confucius and present brief discussions on it below: learning machines,” Neurocomputing, vol. 127, pp. 88–97, Mar. 2014. [12] H. Blockee, “Hypothesis language,” in Encyclopedia of Machine “Learning without thinking results in Learning, C. Sammut and G. I. Webb, Eds. Boston, MA: Springer, confusion (knowledge). 2001. Thinking without learning ends in [13] S. J. Russell and P. 
Norvig, Artificial Intelligence-A Modern Approach, 3rd ed. Pearson, 2010. empty (new knowledge).” [14] C. E. Shannon, “A mathematical theory of communication,” Bell Syst. by Confucius (551–479 BCE) Tech. J., vol. 27, no. 3, pp. 379–423, 1948. When Confucius suggested the saying for his students, we take [15] P. Domingos, The Master Algorithm: How The Quest for The Ultimate Learning Machine Will Remake Our World. New York: Basic Books, it by adding key words “knowledge” and “new knowledge” 2015. from AI perspective. The saying above exactly suggests that [16] P. Wang and B. Goertzel, “Introduction: Aspects of artificial general human-level AI should have both procedures for their develop- intelligence,” in Advance of Artificial General Intelligence, B. Goertzel and P. Wang, Eds. Amsterdam: IOS Press, 2007, pp. 1–16. ments, i.e., bottom-up learning and top-down thinking, where [17] G. E. Box, “Science and statistics,” J. Am. Stat. Assoc., vol. 71, no. GCs will play a key role in mathematical modeling at both 356, pp. 791–799, 1976. input and output levels for future AI machines. [18] W. M. Trochim and J. P. Donnelly, Research Methods Knowledge Base (Vol.2). Cincinnati, OH: Atomic Dog Publishing, 2001. [19] B.-G. Hu, W.-M. Dong, and F.-Y. Huang, “Towards a big-data-and- APPENDIX big-knowledge based modeling approach for future artificial general intelligence,” Comm. CAA, vol. 37, no. 3, pp. 25–31, 2016. GCS FOR FIG.3 [20] P. B. Porter, “Find the hidden man,” Am. J. Psychol., vol. 67, pp. 550– REFERENCES 551, 1954. [21] P. Niyogi, F. Girosi, and T. Poggio, “Incorporating prior information in [1] R. E. Kalman, “Kailath lecture, may 11,” Stanford University, Tech. machine learning by creating virtual examples,” in Proc. IEEE, vol. 86, Rep., 2008. no. 11, Nov. 1998, pp. 2196–2209. [2] B.-G. Hu, H.-B. Qu, Y. Wang, and S.-H. Yang, “A generalized con- [22] X.-D. Wu, J. He, R.-Q. Lu, and N.-N. Zheng, “From big data to big straint neural network model: Associating partially known relationships knowledge: Hace + bigke,” Acta Automatica Sinica, vol. 42, no. 7, pp. for nonlinear regressions,” Inf. Sci., vol. 179, no. 12, pp. 1929–1943, 965–982, Jun. 2016. May 2009. [23] Z.-Y. Ran and B.-G. Hu, “Parameter identifiability and its key issues in [3] R. O. Duda, P. E. Hart, and D. Stork, Pattern Classification, 2nd ed. statistical machine learning,” Acta Automatica Sinica, vol. 43, no. 10, New York, NY, USA: Wiley, 2001. pp. 1677–1686, Oct. 2017. [4] B.-G. Hu, “Information theory and its relation to machine learning,” in [24] D. Marr, “Artificial intelligence - a personal view,” Artif. Intell., vol. 9, Proc. Chin. Intell. Autom. Conf. Berlin, Heidelberg: Springer, 2015, no. 1, pp. 37–48, Aug. 1977. pp. 1–11. [25] A. Saygin, I. Cicekli, and V. Akman, “Turing test: 50 years later,” [5] B. E. Greene, “Application of generalized constraints in the stiffness Minds Mach., vol. 10, p. 463–518, Nov. 2000. method of structural analysis,” AIAA J., vol. 4, no. 9, pp. 1531–1537, [26] A. Newell and H. Simon, “Computer science as empirical inquiry: Sep. 1966. Symbols and search,” Commun. ACM, vol. 19, no. 3, pp. 113–126, [6] L. A. Zadeh, “Outline of a computational approach to meaning and Mar. 1976. knowledge representation based on the concept of a generalized as- [27] B. Goertzel, “Artificial general intelligence: Concept, state of the art, signment statement,” in Proc. Intl. Conf. Artif. Intell. Man-Mach. Syst. and future prospects,” J. Artif. General Intelli., vol. 5, no. 1, pp. 1–48, Springer, 1986, pp. 198–211. Dec. 
2014. 17

[28] R. A. Brooks and L. A. Stein, “Building brains for bodies,” Auton. [57] P. Smets, “Belief functions: The disjunctive rule of combination and Robot., vol. 1, pp. 7–25, 1994. the generalized bayesian theorem,” Int. J. Approx. Reasoning, vol. 9, [29] P. Wang, “The goal of artificial intelligence,” in Rigid Flexibility: The no. 1, pp. 1–35, Aug. 1993. Logic of Intelligence. Netherlands: Springer, 2006, pp. 3–27. [58] Y. S. Abu-Mostafa, “A method for learning from hints,” in Proc. Adv. [30] F. Doshi-Velez and B. Kim, “Towards a rigorous science of inter- Neural Inf. Process Syst.(NeurIPS), San Francisco, CA, USA, 2018, pretable machine learning,” arXiv preprints arXiv:1702.08608, 2017. pp. 73–80. [31] M. T. Ribeiro, S. Singh, and C. Guestrin, “Why should i trust you? [59] ——, “Machines that learn from hints,” Sci. Am., vol. 272, no. 4, pp. explaining the predictions of any classifier,” in Proc. 22nd ACM 64–69, Apr. 1995. SIGKDD Int. Conf., Aug. 2016, pp. 1135–1144. [60] A. K. Jain, “Data clustering: 50 years beyond k-means,” Pattern [32] S. M. Lundberg and S. I. Lee, “A unified approach to interpreting Recognit. Lett., vol. 31, no. 8, pp. 651–666, Jun. 2010. model predictions,” in Proc. Adv. Neural Inf. Process Syst. (NeurIPS), [61] A. Peretti and A. Roveda, “On the mathematical background of google 2017, pp. 4765–4774. pagerank algorithm,” University of Verona, Department of Economics, [33] W. J. Murdoch, C. Singh, K. Kumbier, R. Abbasi-Asl, and B. Yu, “In- Tech. Rep. 25, 2014. terpretable machine learning: Definitions, methods, and applications,” [62] C.-G. Li, X. Mei, and B.-G. Hu, “Unsupervised ranking of multi- arXiv preprints arXiv:1901.04592, 2019. attribute objects based on principal curves,” IEEE Trans. Knowl. Data [34] A. M. Alaa and M. van der Schaar, “Demystifying black-box mod- Eng., vol. 27, no. 12, pp. 3404–3416, 2015. els with symbolic metamodels,” in Proc. Adv. Neural Inf. Process [63] F. Lauer and G. Bloch, “Incorporating prior knowledge in support Syst.(NeurIPS), 2019, pp. 11 304–11 314. vector regression,” Mach. Learn., vol. 70, pp. 89–118, 2008. [35] Z. Yang, A. Zhang, and A. Sudjianto, “Enhancing explainability of [64] E. Krupka and N. Tishby, “Incorporating prior knowledge on features neural networks through architecture constraints,” IEEE Trans. Neural into learning,” in Proc. Mach. Learn. Res., M. Meila and X.-T. Shen, Netw. Learn. Syst., pp. 1–12, Jul. 2020. Eds., vol. 2. San Juan, Puerto Rico: PMLR, Mar. 2007, pp. 227–234. [36] D. Doran, S. Schulz, and T. R. Besold, “What does explainable AI [65] G. An, “The effects of adding noise during backpropagation training really mean? A new conceptualization of perspectives,” arXiv preprints on a generalization performance,” Neural Comput., vol. 8, no. 3, pp. arXiv:1710.00794, 2017. 643–674, Apr. 1996. [37] R. Guidotti, A. Monreale, S. Ruggieri, F. Turini, F. Giannotti, and [66] M. I. Jordan, “Graphical models,” Stat. Sci., vol. 19, no. 1, p. 140–155, D. Pedreschi, “A survey of methods for explaining black box models,” Feb. 2004. ACM Comput. Surv., vol. 51, no. 5, pp. 1–42, 2018. [67] D. Koller and N. Friedman, Probabilistic Graphical Models. Mas- [38] S. Mohseni, N. Zarei, and E. D. Ragan, “A multidisciplinary survey sachusetts: MIT Press, 2009. and framework for design and evaluation of explainable ai systems,” [68] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, arXiv preprints arXiv:1811.11839, 2018. “Smote: Synthetic minority over-sampling technique,” J. Artif. Intell. [39] N. Xie, G. Ras, M. 
van Gerven, and D. Doran, “Explainable Res., vol. 16, pp. 321–357, 2002. deep learning: A field guide for the uninitiated,” arXiv preprints [69] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, arXiv:2004.14545, 2020. S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in [40] S. Becker and G. Hinton, “A self-organizing neural network that Proc. Adv. Neural Inf. Process Syst.(NeurIPS), 2014, pp. 2672–2680. discovers surfaces in random-dot stereograms,” Nature, vol. 355, pp. [70] X. Yuan, P. He, Q. Zhu, and X. Li, “Adversarial examples: Attacks and 161–163, Jan. 1992. defenses for deep learning,” IEEE Trans. Neural Netw. Learn. Syst., [41] B.-G. Hu, Y. Wang, H.-B. Qu, and S.-H. Yang, “How to add trans- vol. 30, no. 9, pp. 2805–2824, Jan. 2019. parency to artificial neural networks?” Pattern Recognit. Artif. Intell., [71] T. M. Mitchell, Machine Learning. New York: McGraw-Hill, 1997. vol. 20, no. 1, pp. 72–84, Mar. 2007. [72] T.-Y. Liu, Learning to Rank for Information Retrieval. Berlin, [42] Y.-J. Qu and B.-G. Hu, “Generalized constraint neural network regres- Heidelberg: Springer-Verlag, 2011. sion model subject to linear priors,” IEEE Trans. Neural Netw., vol. 22, no. 12, p. 2447–2459, Sep. 2011. [73] S. Basu, A. Banerjee, and R. Mooney, “Semi-supervised clustering by Proc. Int. Conf. Mach. Learn. (ICML) [43] Y. Bengio, A. Courville, and P. Vincent, “Representation learning: A seeding,” in , San Francisco, CA, USA, 2002, p. 27–34. review and new perspectives,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 7, pp. 1798–1828, Aug. 2013. [74] Y. LeCun, D. Touresky, G. Hinton, and T. Sejnowski, “A theoretical [44] H. Jeffreys, “An invariant form for the prior probability in estimation framework for back-propagation,” in Proceedings of the 1988 connec- problems,” Proceedings of the Royal Society of London. Series A. tionist models summer school, vol. 1. CMU, Pittsburgh, Pa: Morgan Mathematical and Physical Sciences, vol. 186, no. 1007, pp. 453–461, Kaufmann, 1988, pp. 21–28. 1946. [75] H. He and E. A. Garcia, “Learning from imbalanced data,” IEEE Trans. [45] J. M. Bernardo and A. F. Smith, Bayesian theory. John Wiley & Sons, Knowl. Data Eng., vol. 21, no. 9, p. 1263–1284, Sep. 2009. 2009, vol. 405. [76] J. Weston, R. Collobert, F. Sinz, L. Bottou, and V. Vapnik, “Inference [46] A. N. Tikhonov, “On the solution of ill-posed problems and the method with the universum,” in Proc. Int. Conf. Mach. Learn. (ICML), 2006, of regularization,” Dokl. Akad. Nauk SSSR, vol. 151, no. 3, pp. 501– pp. 1009–1016. 504), 1963. [77] C. Doersch, A. Gupta, and A. A. Efros, “Unsupervised visual rep- [47] T. Poggio, H. Voorhees, and A. Yuille, “A regularized solution to edge resentation learning by context prediction,” in Proc. IEEE Int. Conf. detection,” J. Complex., vol. 4, no. 2, pp. 106–123, Jun. 1988. Comput. Vis. (ICCV), 2015, pp. 1422–1430. [48] M. Gori, Machine Learning: A Constrained-Based Approach. Morgan [78] S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed. Kauffman, 2018. Englewood Cliffs, NJ: Prentice-Hall, 1999. [49] R. Tibshirani, “Regression shrinkage and selection via the lasso,” J. [79] J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Anal- Royal Stat. Soc. Ser. B (Methodol.), vol. 58, no. 1, pp. 267–288, 1996. ysis. Cambridge University Press, 2004. [50] C. M. Bishop, “Curvature-driven smoothing: a learning algorithm for [80] W. H. Joerding and J. L. Meador, “Encoding a priori information in feedforward networks,” IEEE Trans. 
Neural Netw., vol. 4, no. 5, pp. feedforward networks,” Neural Netw., vol. 4, no. 6, pp. 847–856, 1991. 882–884, Sep. 1993. [81] J. Y. F. Yam and T. W. S. Chow, “A weight initialization method for [51] Q. Zhao, G. Zhou, L. Zhang, A. Cichocki, and S. I. Amari, “Bayesian improving training speed in feedforward neural network,” Neurocom- robust tensor factorization for incomplete multiway data,” IEEE Trans. puting, vol. 30, no. 1-4, pp. 219–232, Jan. 2000. Neural Netw. Learn. Syst., vol. 27, no. 4, pp. 736–748, Apr. 2015. [82] R. Battiti, “Accelerated backpropagation learning: Two optimization [52] Y. Wang, L. Wu, X. Lin, and J. Gao, “Multiview spectral clustering methods,” Complex Systems, vol. 3, no. 4, pp. 331–342, 1989. via structured low-rank matrix factorization,” IEEE Trans. Neural Netw. [83] R. Vilalta and Y. Drissi, “A perspective view and survey of meta- Learn. Syst., vol. 29, no. 10, pp. 4833–4843, Oct. 2018. learning,” Artif. Intell. Rev., vol. 18, no. 2, pp. 77–95, Jun. 2002. [53] M. Diligenti, M. Gori, M. Maggini, and L. Rigutini, “Bridging logic [84] S. J. Pan and Q. Yang, “A survey on transfer learning,” IEEE Trans. and kernel machines,” Mach. Learn., vol. 86, no. 1, pp. 57–88, 2012. Knowl. Data Eng., vol. 22, no. 10, pp. 1345–1359, Oct. 2009. [54] G. G. Towell and J. W. Shavlik, “Knowledge-based artificial neural [85] Y. Bengio, J. Louradour, R. Collobert, and J. Weston, “Curriculum networks,” Artif. Intell., vol. 70, no. 1-2, pp. 119–165, Oct. 1994. learning,” in Proc. Int. Conf. Mach. Learn. (ICML), Montreal, Quebec, [55] H. Ying, Fuzzy Control and Modeling:Analytical Foundations and Canada, 2009, p. 41–48. Applications. Wiley-IEEE Press, Aug. 2000. [86] M. P. Kumar, B. Packer, and D. Koller, “Self-paced learning for latent [56] Z. Pawlak, Rough Sets. In Theoretical Aspects of Reasoning About variable models,” in Proc. Adv. Neural Inf. Process Syst.(NeurIPS), Data. Netherlands: Kluwer, 1991. 2010, pp. 1189–1197. 18

[87] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and [114] S. Rho and S. S. Yeo, “Bridging the semantic gap in multimedia R. R. Salakhutdinov, “Improving neural networks by preventing co- emotion/mood recognition for ubiquitous computing environment,” J. adaptation of feature detectors,” arXiv preprints arXiv:1207.0580, Supercomput., vol. 65, pp. 274–286, Jul. 2013. 2012. [115] R. E.Thayer, The Biopsychology of Mood and Arousal. New York: [88] J. R. Quinlan, “Induction of decision trees,” Mach. Learn., vol. 1, p. Oxford University Press, 1989. 81–106, Mar. 1986. [116] M. Gupta, A. Cotter, J. Pfeifer, K. Voevodski, K. Canini, A. Mangylov, [89] K. Fukushima, “Neocognitron: A self-organizing neural network model W. Moczydlowski, and A. V. Esbroeck, “Monotonic calibrated inter- for a mechanism of pattern recognition unaffected by shift in position,” polated look-up tables,” J. Mach. Learn. Res., vol. 17, no. 109, pp. Biol. Cybern., vol. 36, no. 4, p. 193–202, Apr. 1980. 3790–3836, 2016. [90] Y. LeCun, B. Boser, J. S. Denker, D. H. R. E. Howard, W. Hubbard, [117] L. L. Cao, R. He, and B.-G. Hu, “Locally imposing function for gener- and L. D. Jackel, “Back-propagation applied to handwritten zip code alized constraint neural networks - a study on equality constraints,” in recognition,” Neural Comput., vol. 1, no. 4, p. 541–551, Dec. 1989. Proc. IEEE Int. Joint Conf. Neur. Net.(IJCNN), 2016, pp. 4795–4802. [91] L.-X. Wang, A Course in Fuzzy Systems and Control. Prentice-Hall, [118] P. J. Denning and S. C. Schwartz, “Properties of the working-set 1996. model,” Commun. ACM, vol. 15, no. 3, pp. 191–198, Mar. 1972. [92] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural [119] P. J. Denning, “The locality principle,” Commun. ACM, vol. 48, no. 7, Comput., vol. 9, no. 8, p. 1735–1780, Nov. 1997. pp. 19–24, Jul. 2005. [93] J. Pearl, Causality: Models, Reasoning, and Inference. Cambridge [120] M. F. Bear, B. W. Connors, and M. A. Paradiso, Neuroscience: University Press, 2000. Exploring the Brain, 4th ed. Philadelphia: Wolters Kluwer, 2006. [94] S. Z. Li, Markov Random Field Modeling in Image Analysis. London: [121] D. S. Broomhead and D. Lowe, “Multivariable functional interpolation Springer, 2009. and adaptive networks,” Complex Systems, vol. 2, no. 3, pp. 321–355, [95] A. Singha, “Introducing the knowledge graph: Things, not strings.” 1988. Official Google Blog, Tech. Rep. 16, 2012. [122] D. H. Hubel and T. N. Wiesel, “Receptive fields and functional [96] J.-S. R. Jang, C. T. Sun, and E. Mizutani, Neuro-Fuzzy and Soft architecture of monkey striate cortex,” J. Physiol., vol. 195, no. 1, p. Computing: A Computational Approach to Learning and Machine 215–243, 1968. Intelligence. Englewood Cliffs, NJ: Prentice-Hall, 1996. [123] M. McCloskey and N. Cohen, “Catastrophic interference in connec- [97] R. Sun and L. A. Bookman, Computational Architectures Integrating tionist networks: The sequential learning problem,” Psychol. Learn. Neural and Symbolic Processes. Boston, MA: Kluwer Academic Motiv., vol. 24, pp. 109–164, 1989. Publishers, 1995. [124] J. Kirkpatricka, R. Pascanu, N. Rabinowitz, J. Veness, G. Desjardins, [98] J. Townsend, T. Chaton, and J. Monteiro, “Extracting relational expla- A. A. Rusu, K. Milan, J. Quan, T. Ramalho, A. Grabska-Barwinska, nations from deep neural networks: A survey from a neural-symbolic D. Hassabis, C. Clopath, D. Kumaran, and R. Hadsell, “Overcoming perspective,” IEEE Trans. Neural Netw. Learn. Syst., vol. 31, no. 9, pp. 
catastrophic forgetting in neural networks,” in Proc. Natl. Acad. Sci.- 3456–3470, Sep. 2000. USA, vol. 114, no. 13, 2017, p. 3521–3526. [99] M. Richardson and P. Domingos, “Markov logic networks,” Mach. [125] A. Einstein, B. Podolsky, and N. Rosen, “Can quantum-mechanical Learn., vol. 62, pp. 107–136, Jan. 2006. description of physical reality be considered complete?” Phys. Rev., [100] Z. Zhang, P. Cui, and W. Zhu, “Deep learning on graphs: A survey,” vol. 47, no. 10, p. 777–780, 1935. IEEE Trans. Knowl. Data Eng., 2020. [126] S. R. Hameroff, “Quantum coherence in microtubules: A neural basis [101] X.-R. Fan, M.-Z. Kang, P. de Reffe, E. Heuvelink, and B.-G. Hu, “A for emergent consciousness?” J. Conscious. Stud., vol. 1, no. 1, pp. knowledge-and-data -driven modeling approach for simulating plant 91–118, 1994. growth: A case study on tomato growth,” Ecol. Model., vol. 312, pp. [127] C. Koch and K. Hepp, “Quantum mechanics in the brain,” Nature, vol. 363–373, Sep. 2015. 440, p. 611, Mar. 2006. [102] J. A. Wilson and L. Zorzetto, “A generalised approach to process state [128] J. R. Busemeyer and P. D. Bruza, Quantum Models of Cognition and estimation using hybrid artificial neural network/mechanistic model,” Decision. Cambridge: Cambridge University Press, 2012. Comput. Chem. Eng., vol. 21, no. 9, p. 951–963, Jun. 1997. [129] P. D. Scott, “Learning: The construction of a posteriori knowledge [103] T. Kalayci and O. Ozdamar, “Wavelet preprocessing for automated structures,” in Proc. AAAI’83, 1983, pp. 359–363. neural network detection of eeg spikes,” IEEE Eng. Med. Biol. Mag., [130] K. J. Cios, W. Pedrycz, and R. Swiniarski, Data Mining Methods for vol. 14, no. 2, pp. 160–166, Mar. 1995. Knowledge Discovery. New York, US: Springer, 1998. [104] M. I. Jordan and D. E. Rumelhart, “Forward models: Supervised [131] Y. Bengio, J. M. Buhmann, M. J. Embrechts, and J. M. Zurada, learning with a distal teacher,” Cogn. Sci., vol. 16, no. 3, pp. 307– “Introduction to the special issue on neural networks for data mining 354, Jul. 1992. and knowledge discovery,” IEEE Trans. Neural Netw., vol. 11, no. 3-4, [105] D. Chaffart and L. A. Ricardez-Sandoval, “Optimization and control pp. 545–549, May 2000. of a thin film growth process: A hybrid first principles/artificial neural [132] S. Mitra, S. K. Pal, and P. Mitra, “Data mining in soft computing network based multiscale modelling approach,” Comput. Chem. Eng., framework: A survey,” IEEE Trans. Neural Netw., vol. 13, no. 1, pp. vol. 119, pp. 465–479, Nov. 2018. 3–14, Aug. 2002. [106] M. V. Stosch, R. Oliveira, J. Peres, and S. F. de Azevedo, “Hybrid semi- [133] J. Han, J. Pei, and M. Kamber, Data Mining: Concepts and Techniques. parametric modeling in process systems engineering: Past, present and Elsevier, 2011. future,” Comput. Chem. Eng., vol. 60, no. 10, pp. 86–101, Jan. 2014. [134] S. C. Zhu and D. Mumford, “Prior learning and gibbs reaction- [107] S. Zendehboudi, N. Rezaei, and A. Lohi, “Applications of hybrid diffusion,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, no. 11, models in chemical, petroleum, and energy systems: A systematic pp. 1236–1250, Nov. 1997. review,” Appl. Energy, vol. 228, pp. 2539–2566, Oct. 2018. [135] L. D. Raedt, A. Passerini, and S. Teso, “Learning constraints from [108] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, examples,” in Proc. 32nd AAAI Conf. Artif. Intell., S. A. McIlraith and “Content-based image retrieval at the end of the early years,” IEEE K. Q. Weinberger, Eds. AAAI Press, 2018, pp. 
7965–7970. Trans. Pattern Anal. Mach. Intell., vol. 22, no. 12, pp. 1349–1380, [136] A. Karpatne, G. Atluri, J. H. Faghmous, M. Steinbach, A. Banerjee, Dec. 1993. A. Ganguly, S. Shekhar, N. Samatova, and V. Kumar, “Theory-guided [109] A. Hein, “Identification and bridging of semantic gaps in the context data science: A new paradigm for scientific discovery from data,” IEEE of multi-domain engineering,” in Forum Philos., Eng. and Technol., Trans. Knowl. Data Eng., vol. 29, no. 10, pp. 2318–2331, 2017. 2010, pp. 58–57. [137] S. I. Gallants, “Connectionist expert systems,” Commun. ACM, vol. 31, [110] C. Lynch, K. D. Ashley, N. Pinkwart, and V. Aleven, “Concepts, no. 2, pp. 152–169, Feb. 1988. structures, and goals: Redefining ill-definedness,” Int. J. Artif. Intell. [138] Z. Dienes and J. Pemer, “Implicit knowledge in people and connec- Educ., vol. 19, no. 3, pp. 253–266, 2009. tionist networks,” in Implicit Cognition, G. Underwood, Ed. Oxford [111] J. Hadamard, Lectures on The Cauchy Problem in Linear Partial University Press, 1996, pp. 227–256. Differential Equations. Yale University Press, 1923. [139] L.-X. Wang and J. M. Mendel, “Generating fuzzy rules by learning [112] Z.-T. Hu, X.-Z. Ma, Z.-Z. Liu, E. Hovy, and E. Xing, “Harnessing from examples,” IEEE Trans. Syst., Man, Cybern., vol. 22, no. 6, pp. deep neural networks with logic rules,” in Proc. 54th Annual Meeting 1414–1427, Nov. 1992. ACL (Vo.1: Long Papers). Association for Computational Linguistics, [140] G. G. Towell and J. W. Shavlik, “Extracting refined rules from Aug. 2016, pp. 2410–2420. knowledge-based neural networks,” Mach. Learn., vol. 13, no. 1, pp. [113] R. W. Picard, Affective Computing. MIT Press, 2000. 71–101, Oct. 1993. 19

[141] L. M. Fu, “Rule generation from neural networks,” IEEE Trans. Syst., [167] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba, “Learning Man, Cybern., vol. 24, no. 8, pp. 1114–1124, Aug. 1994. deep features for discriminative localization,” in Proc. IEEE Conf. [142] H. Tsukimoto, “Extracting rules from trained neural networks,” IEEE Comput. Vis. Pattern Recognit. (CVPR), 2016, p. 2921–2929. Trans. Neural Netw., vol. 11, no. 2, pp. 377–389, Mar. 2000. [168] W. Samek, A. Binder, G. Montavon, S. Lapuschkin, and K. R. Muller, [143] Z.-H. Zhou, Y. Jiang, and S.-F. Chen, “Extracting symbolic rules from “Evaluating the visualization of what a deep neural network has trained neural network ensembles,” AI Commun., vol. 16, no. 1, pp. learned,” IEEE Trans. Neural Netw. Learn. Syst., vol. 28, no. 11, pp. 3–15, 2003. 2660–2673, Aug. 2016. [144] E. Kolman and M. Margaliot, “Are artificial neural networks white [169] G. E. Hinton, J. L. McClelland, and D. E. Rumelhart, “Distributed boxes?” IEEE Trans. Neural Netw., vol. 16, no. 4, pp. 844–852, Jun. representations,” in Parallel Distributed Processing: Explorations in 2005. the Microstructure of Cognition, D. E. Rumelhart and J. L. McClelland, [145] S. N. Tran and S. S. d’Avila Garcez, “Deep logic networks: Inserting Eds. MIT Press, 1986, ch. 1, pp. 77–109. and extracting knowledge from deep belief networks,” IEEE Trans. [170] M. W. Craven and J. W. Shavlik, “Visualizing learning and Neural Netw. Learn. Syst., vol. 29, no. 2, pp. 246–258, 2016. in artificial neural networks,” Int. J. Artif. Intell. Tools, vol. 1, no. 3, [146] K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward pp. 399–425, Sep. 1992. networks are universal approximators,” Neural Netw., vol. 2, no. 5, pp. [171] W. Wu, B. Walczak, D. L. Massart, S. Heuerding, F. Erni, I. R. Last, 359–366, 1989. and K. A. Prebble, “Artificial neural networks in classification of nir [147] R. Andrews, J. Diederich, and A. B. Tickle, “Survey and critique of spectral data: Design of the training set,” Chemometrics Intell. Lab. techniques for extracting rules from trained artificial neural networks,” Syst., vol. 33, no. 1, pp. 35–46, May 1996. Knowledge-Based Syst., vol. 8, no. 6, pp. 373–389, Dec. 1995. [172] J. Wejchert and C. Tesauro, “Neural network visualization,” in Proc. [148] S. Mitra and Y. Hayashi, “Neuro-fuzzy rule generation: Survey in soft Adv. Neural Inf. Process Syst. (NeurIPS). San Mateo, CA: Morgan, computing framework,” IEEE Trans. Neural Netw., vol. 11, no. 3, pp. Kaufmann, 1990, pp. 465–472. 748–768, May 2000. [173] J. D. Olden and D. A. Jackson, “Illuminating the “black box”: A [149] H. Jacobsson, “Rule extraction from recurrent neural networks: A randomization approach for understanding variable contributions in taxonomy and review,” Neural Comput., vol. 17, no. 6, pp. 1223–1263, artificial neural networks,” Ecol. Model., vol. 154, no. 1-2, pp. 135– Jun. 2005. 150, Aug. 2002. [150] R. C. Griffin, J. M. Montgomery, and M. E. Rister, “Selecting func- [174] F. Y. Tzeng and K. L. Ma, “Opening the black box - data driven tional form in production function analysis,” West. J. Agric. Econ., visualization of neural networks,” in 2005 IEEE Visualization, 2005, vol. 12, no. 2, pp. 216–227, Dec. 1987. pp. 383–390. [151] O. Nelles, Nonlinear System Identification: From Classical Approaches [175] G. D. Garson, “Interpreting neural-network connection weights,” Artif. to Neural Networks and Fuzzy Models. New York: Springer Sci- Intell. Exp., vol. 6, no. 4, pp. 47–51, Apr. 
1991. ence+Business Media, 2013. [176] J. Yosinski, J. Clune, A. Nguyen, T. Fuchs, and H. Lipson, “Under- [152] J. Schoukens and L. Ljung, “Nonlinear system identification: A user- standing neural networks through deep visualization,” in Proc. Intl. oriented road map,” IEEE Control Syst. Mag., vol. 39, no. 6, pp. 28–99, Conf. Mach. Learn. (ICML), 2015. Nov. 2019. [177] M. Kahng, P. Andrews, A. Kalro, and D. H. Chau, “Activis: Visual [153] R. Bellman and K. J. Astrom, “On structural identifiability,” Mathe- exploration of industry-scale deep neural network models,” IEEE Trans. matical Bio-science, vol. 7, no. 3-4, pp. 329–339, Apr. 1970. Vis. Comput. Graphics, vol. 24, no. 1, p. 88–97, Jan. 2018. [154] Z.-Y. Ran and B.-G. Hu, “Parameter identifiability in statistical machine [178] M. Liu, J. X. Shi, Z. Li, C. X. Li, J. Zhu, and S. X. Liu, “Towards learning: A review,” Neural Comput., vol. 29, no. 5, pp. 1151–1203, better analysis of deep convolutional neural networks,” IEEE Trans. May 2017. Vis. Comput. Graphics, vol. 23, no. 1, p. 91–100, Aug. 2017. [155] S.-H. Yang, B.-G. Hu, and P. H. Courn`ede, “Structural identifiability of [179] Q. Zhang, Y. Yang, H. Ma, and Y. N. Wu, “Interpreting cnns via deci- generalized constraint neural network models for nonlinear regression,” sion trees,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit.(CVPR), Neurocomputing, vol. 72, no. 1-3, pp. 392–400, Dec. 2008. 2019, pp. 6261–6270. [156] S. I. Amari, H. Park, and T. Ozeki, “Singularities affect dynamics of learning in neuromanifolds,” Neural Comput., vol. 18, no. 5, pp. 1007– [180] A. Karpathy, J. Johnson, and F.-F. Li, “Visualizing and understanding 1065, May 2006. recurrent networks,” in Int. Conf. Learn. Represent.(ICLR) Workshop, San Juan, Puerto Rico, 2016. [157] C. D. M. Paulino and C. A. de Bragana¸Pereira, “On identifiability of parametric statistical models,” J. Ital. Stat. Soc., vol. 22, p. 125–151, [181] H. Strobelt, S. Gehrmann, H. Pfister, and A. M. Rush, “Lstmvis: A Feb. 1994. tool for visual analysis of hidden state dynamics in recurrent neural [158] C. Turkay, R. Laramee, and A. Holzinger, “On the challenges and networks,” IEEE Trans. Vis. Comput. Graphics, vol. 24, no. 1, pp. opportunities in visualization for machine learning and knowledge 667–676, Aug. 2017. extraction: A research agenda,” in Mach. Learn. Knowl. Extrac., [182] S. Greydanus, A. Koul, J. Dodge, and A. Fern, “Visualizing and A. Holzinger, P. Kieseberg, A. Tjoa, and E. Weippl, Eds., vol. 10410. understanding atari agents,” in Proc. Int. Conf. Mach. Learn. (ICML), Springer LNCS, 2017, pp. 191–198. 2018, pp. 1792–1801. [159] S. Liu, W. Cui, Y. Wu, and M. Liu, “A survey on information visu- [183] C. Rupprecht, C. Ibrahim, and C. J. Pal, “Finding and visualizing alization: Recent advances and challenges,” Visual Comput., vol. 30, weaknesses of deep reinforcement learning agents,” in Proc. Int. Conf. no. 12, pp. 1373–1393, Jan. 2014. Learn. Represent. (ICLR), 2020. [160] A. Endert, W. Ribarsky, C. Turkay, B. Wong, I. Nabney, I. D. Blanco, [184] Q. S. Zhang and S. C. Zhu, “Visual interpretability for deep learning: and F. Rossi, “The state of the art in integrating machine learning into A survey,” Front. Inf. Technol. Electron. Eng., vol. 19, no. 1, pp. 27–39, visual analytics,” Comput. Graph. Forum, vol. 36, no. 8, p. 458–486, Jan. 2018. Mar. 2017. [185] F. Hohman, M. Kahng, R. Pienta, and D. H. Chau, “Visual analytics [161] X. Geng, D. C. Zhan, and Z. H. 
Zhou, “Supervised nonlinear dimen- in deep learning: An interrogative survey for the next frontiers,” IEEE sionality reduction for visualization and classification,” IEEE Trans. Trans. Vis. Comput. Graphics, vol. 25, no. 8, pp. 2674–2693, Aug. Syst., Man, Cybern. B, vol. 35, no. 6, pp. 1098–1107, Nov. 2005. 2019. [162] L. V. D. Maaten and G. Hinton, “Visualizing data using t-sne,” J. Mach. [186] H. Strobelt, S. Gehrmann, M. Behrisch, A. Perer, H. Pfister, and A. M. Learn. Res., vol. 9, pp. 2579–2605, Nov. 2008. Rush, “Seq2seq-vis: A visual debugging tool for sequence-to-sequence [163] P. E. Rauber, S. G. Fadel, A. X. Falcao, and A. C. Telea, “Visualizing models,” IEEE Trans. Vis. Comput. Graphics, vol. 25, no. 1, pp. 353– the hidden activity of artificial neural networks,” IEEE Trans. Vis. 363, Oct. 2018. Comput. Graphics, vol. 23, no. 1, p. 101–110, Jan. 2017. [187] Y.-Y. Zhang, X.-S. Chen, Y. Yang, A. Ramamurthy, B. Li, Y. Qi, [164] Z. Yao, Y. Sun, W. Ding, N. Rao, and H. Xiong, “Dynamic word and L. Song, “Efficient probabilistic logic reasoning with graph neural embeddings for evolving semantic discovery,” in Proc. 7th ACM Intl. networks,” in Proc. Int. Conf. Learn. Represent. (ICLR), 2020. Conf. WSDM, 2018, pp. 673–681. [188] F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini, [165] R. Shwartz-Ziv and N. Tishby, “Opening the black box of deep neural “The graph neural network model,” IEEE Trans. Neural Netw., vol. 20, networks via information,” arXiv preprints arXiv:1703.00810, 2017. no. 1, p. 61–80, Jan. 2009. [166] M. Gallagher and T. Downs, “Visualization of learning in multilayer [189] W. Buntine, “Theory refinement on Bayesian networks,” in Proc. 7th perceptron networks using principal component analysis,” IEEE Trans. Conf. Uncertainty in Artificial Intelligence. San Francisco, CA, USA: Syst., Man, Cybern. B, vol. 33, no. 1, pp. 28–33, Feb. 2003. Morgan Kaufmann Publishers Inc., 1991, pp. 52–60. 20

[190] R. Socher, M. Ganjoo, C. D. Manning, and A. Ng, “Zero-shot learning through cross-modal transfers,” in Proc. Adv. Neural Inf. Process Syst. (NeurIPS), 2013, pp. 935–943. [191] L. Bai, L. Cui, Z. Zhang, L. Xu, Y. Wang, and E. R. Hancock, “Entropic dynamic time warping kernels for co-evolving financial time series analysis,” IEEE Trans. Neural Netw. Learn. Syst., vol. preprint, pp. 1–15, Jul. 2020. [192] J. G. Haubrich and A. W. Lo, Quantifying Systemic Risk. The University of Chicago Press, 2013. [193] T. Mikolov, K. Chen, G. Corrado, and J. Dean, “Efficient esti- mation of word representations in vector space,” arXiv preprints arXiv:1301.3781, p. arXiv:1301.3781, 2013. [194] Q. Le and T. Mikolov, “Distributed representations of sentences and documents,” arXiv preprints arXiv:405.4053, 2014. [195] H. Peng, J. Li, Y. He, Y. Liu, M. Bao, L. Wang, Y. Song, and Q. Yang, “Large-scale hierarchical text classification with recursively regularized deep graph-cnn,” in Proc. 2018 WWW Conf., 2018, p. 1063–1072. [196] J. Zhou, G. Cui, Z. Zhang, C. Yang, Z. Liu, L. Wang, C. Li, and M. Sun, “Graph neural networks: A review of methods and applications,” arXiv preprints arXiv:1812.08434, 2018. [197] Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang, and S. Y. Philip, “A comprehensive survey on graph neural networks,” IEEE Trans. Neural Netw. Learn. Syst., pp. 1–13, Mar. 2020. [198] S. B. Cooper, . Chapman & Hall/CRC, 2017. [199] J. D. E. Geer, “Unknowable unknowns,” IEEE Security Privacy, vol. 17, no. 2, pp. 80–79, Apr. 2019. [200] A. M. Turing, “On computable numbers, with an application to the entscheidungsproblem,” Proc. London Math. Soc., vol. s2-42, no. 1, pp. 230–265, Jan. 1937. [201] I. E. Gordon, Theories of Visual Perception. London: Psychology Press, 2004.