From a Set of Parts to an Indivisible Whole. Part I: Operations in a Closed Mode

Leonid Andreev

Equicom, Inc., 10273 E Emily Dr, Tucson, AZ 85730, U.S.A. Email: [email protected]

February 29, 2008

Abstract

This paper provides a description of a new method for data processing based on a holistic approach wherein analysis is a direct product of synthesis. The core of the method is iterative averaging of all the elements of a [...] according to all the parameters describing the elements. It appears that, contrary to common [...], the iterative averaging of a system's elements does not result in homogenization of the system; instead, it causes an obligatory subdivision of the system into two alternative subgroups, leaving no outliers. Within each of the formed subgroups, similarity coefficients between the elements reach the value of 1, whereas similarity coefficients between the elements of different subgroups equal a certain constant value greater than 0 and less than 1. When subjected to iterative averaging, any system consisting of three or more elements, of which at least two are not completely identical, undergoes such a process of bifurcation, which occurs nonlinearly. Successive iterative averaging of each of the forming subgroups eventually provides a hierarchical system that reflects relationships between the elements of an input system under analysis. We propose and discuss a definition of a natural [...] that can exist only in conditions of closeness of a system and can be discovered upon providing such an effect onto a system which allows its elements to interact with each other based on the principle of self-organization. We show that self-organization can be achieved through an overall and total cross-averaging of a system's elements. We propose an algorithm for performing such cross-averaging through iterative averaging transformations of a system's similarity matrix, wherein the very first of the iterative transformations turns any system under processing into a closed-type system that does not allow an addition of new elements or removal of any of its existing elements, as it would result in drastic changes as compared to the original state of the input data system.
A system subdivision into groups occurring in the course of iterative averaging performed in an autonomous unsupervised mode displays a highly intelligent analysis of part-whole relations within the system, which proves that the resulting hierarchical structures reflect the system's natural hierarchy. This method for data processing, named by us 'matrix reasoning', can be effectively utilized for analysis of any kind and any combination of data. We demonstrate new methods for construction of hierarchical trees, dendrograms, and isohierarchical structures which allow effective visualization of results of a hierarchical analysis in the form of a holistic picture. We demonstrate the application potentials of the proposed technology on a number of examples, including a system of scattered points, randomized datasets, as well as meteorological and demographical datasets.

Keywords: Iterative averaging algorithm, Nonlinearity, , Natural hierarchy, Similarity matrix, Metrics, Scattered points, Random systems, Meteorology, Demography

1. Introduction

Part-whole relations are one of the structuring bases of the [...]. One may assume that since the ancient times the problem of 'part-whole' relations has been most stimulating for development of philosophical understanding of the nature of [...] and the environment. The principle of approaching a whole from the standpoint of its parts and treating the properties of a whole as the sum of properties of its parts is known as merism (from the Greek 'meros', 'part') and is the subject of studies in [1-3]. Alternatively, a position which emphasizes the inequality between a whole and the sum of its parts because a whole, due to its parts, acquires new properties as compared to its parts, is known as holism (from the Greek word 'holos', 'whole') [4-5]. Holism is based on the idea that all properties of a given system cannot be determined or explained by the sum of its component parts alone. Instead, the system as a whole determines in an important way how the parts behave.

Holism as a philosophical [...] is aimed at a "holistic [...] of the world", i.e. at resolving the conflict between the subjective and objective, between irrational and rational. The holistic approach concerns all the areas of [...] as it deals with the general principles of scientific discovery of knowledge. 'Wholeness' as a display of the properties of a subject under investigation, hence the entire cognizable world, is viewed by holistic science not as something that directly and obviously follows from interrelations between the elements of a system, but as something that is manifested in the [...] of specific and stable properties of such wholeness. Holism maintains that one of the fundamentally important properties of a whole is nonreducibility of the properties of a whole to the properties of its component parts, which is in contrast to the analytic tradition of establishing the properties of a whole through analysis of cause-and-effect relations and relationships between the parts of a whole which have a fixed set of properties. From the standpoint of holism, the requirement of logical deducibility of properties of a whole from initially set conditions, as it is maintained by traditional scientific [...], is the reason why all the diverse and extensive domains of knowledge about Nature's objects and phenomena still fail to provide a holistic picture of the world. Extreme forms of the [...] of holism exist largely due to unavailability of scientifically grounded methods – or even ideas that would promise a potential capability of development of such methods – for synthesis of a whole which could provide that a resulting whole, rather than being a sum of the component elements, would acquire new properties that were not present in the component elements.

Analysis of the behavior of parts from the viewpoint of the whole is not characteristic of the classical science. In classical science, analysis is based on breaking down a phenomenological whole into its parts and examining the parts, and this reductionist process is not complemented with a reverse process – from parts to a whole – and therefore it does not provide an integral picture of a system under analysis. The problem of relationships between real, identifiable subjects and the appearances whose [...] supposedly involves participation of those subjects seems to be the most critical and complicated issue of the modern scientific [...]. As was emphasized by Craig Dilworth [6], "The debate over empiricism and realism concerns the very nature of modern science: what it is or what it ought to be. Empiricism, in its extreme form, claims that there is no reality behind appearances and that it is the task of science to determine what the appearances are and what the formal relations are that obtain among them." A functional whole may contain any kind of elements (elements are parts of a system which do not consist of subsystems, and the notion of 'element' includes also peculiarities of an element's interactions with other elements of the whole), even something that is unknown to science and lives and evolves on its own, independently from us, the humankind.

Notwithstanding the overwhelming amount of publications on philosophical understanding of the problem of 'whole-part' from the standpoint of holism, science does not know of mathematical ideas and exact methods that would be able to demonstrate, on the quantitative level, the relations between parts which underlie the functioning of an integral indivisible whole. This problem is profoundly important and hardly

solvable in general from the position of linear logic prevailing in science. Kurt Koffka, one of the classics of Gestalt psychology, wrote, "It has been said: The whole is more than the sum of its parts. It is more correct to say that the whole is something else than the sum of its parts, because summing up is a meaningless procedure, whereas the whole-part relationship is meaningful." [7]. Eighty years ago, Jan Smuts, who coined the term 'holism' and made an important contribution to the philosophy of holism, outlined the problem that would be faced by the science were it to attempt a description of holism: "This is … the case where cell a unites with cell b to form a new entity, in which both a and b disappear finally and irrecoverably, and whose character and behaviour cannot be traced mathematically or mechanically to those of a and b." [8]. It is "…impossible to say where the whole ends and the parts begin, so intimate is their interaction and so profound their mutual influence" [9].

Due to its universality, the problem of whole-part relations is inexhaustible for analysis and understanding. A. J. Bahm [10], for instance, points out five kinds of whole-part relations: atomism, holism, emergentism, structuralism, and organicism, and describes the relations between them. The diversity of relations between a whole and its parts is discussed in [1] as well as in other sources. Scientific systematics of whole-part relations would require a capability to identify at least statistically verifiable gaps between subsystems or the existence of some other criteria for differentiation between subsystems. However painful it might be for the scientific community to admit it, the systematics of whole-part relations becomes hardly feasible as one moves forward from additivity towards wholeness. The objective reason for this problem lies in the fact that the 'whole-parts' theory, especially in the part concerning an indivisible whole, is essentially based on notions – such as systemity, [...], [...], hierarchy, chaos, cooperative relations, etc. – that are all-embracing and difficult for interpretation, not only quantitatively, but qualitatively, too.

Advances in the theory of 'parts-whole' relations greatly impact the science and technologies of the future, particularly, such complex and controversial areas of science as artificial and natural intelligence, risk assessment, modeling of unpredictable situations, catastrophe theory, statistics and economics, medical and social psychology, including [...] behavior, quantum physics, astrophysics, and many others. Solutions for many critical problems of present-day science directly depend on novel approaches to holistic processing of information. There has to be a certain universal [...] providing a capability to objectively evaluate a given set of elements from the point of view of its ability to become – upon certain variations in conditions of their interactions – a nonadditive, indivisible, specific entity, i.e. a phenomenon whose scientific analysis allowed for distinction of those elements. Currently, a determination on how relevant a given set of subjects is to the emergence of an indivisible whole under analysis is made by using specific and subjective approaches that are based on and result from a combination of such factors as [...], opinions and beliefs of the analyst or a team of analysts, as well as methods in mathematical statistics which are completely incompatible with the paradigm of holism.

The purpose of this research was the development of a universal methodology for synthesis – from analytically discovered and independently identified and described elements of a functional whole – of the initial intact whole that possesses stable and specific properties that are not present in either its component elements or a mechanical totality of those elements. The said methodology utilizes a new original data processing technology based on iterative averaging, over all available parameters, of each and all elements of a system under analysis [11]. As a result of the iterative averaging, all the elements of the system divide into two subgroups, without any outliers or transient elements. The absence of transient elements is an important peculiarity of this technology as it completely excludes the possibility of subjective interpretation of results and allows for unsupervised autonomous data processing. A dichotomy resulting from the iterative averaging provides a 100% similarity of elements within a subgroup, whereas a similarity between the subgroups may widely vary depending on initial properties of the system's elements, and it does

not affect the division into subgroups. The iterative averaging provides an evolutionary transformation of the system under analysis, involving both convergence and divergence processes. Successive evolutionary transformations of each of the emerging subgroups of elements eventually produce a hierarchical tree that shows the properties of the whole, which are not present in any of its individual elements or in a mechanical totality of those elements. Thus, while reductionist methodology offers a one-way path to knowledge by following from the root of a hierarchical tree to its leaves, and while any network is a certain form of presentation of a hidden hierarchy, namely a "leaf distance matrix" whose informational potential dramatically decreases in conditions of dynamical existence of a given hierarchical system, the proposed methodology of construction of hierarchical trees by analyzing the properties of their leaves can be viewed as a new paradigm of [...], congruent with the paradigm of holism.

The technology described in this paper involves three principally different approaches to synthesis of a whole from a totality of its component parts, and therefore the paper consists of three parts: Part I describes "closed mode operations", Part II deals with "open comparative mode operations", and Part III provides a method for analysis of holistic space of multi-object relations. In case of a closed mode, the very first step of information processing with the use of evolutionary transformation of similarity matrices turns the totality of the initial elements (parts) of the system into a closed system wherein the number and [...] of elements should not and cannot change. In case of the open comparative mode, each individual element of the system is consecutively compared to a certain outside element (i.e. not belonging to a given system) or a set of outside elements that carries a certain meaning for the analyst. The technique for such comparison, comprising the algorithm of evolutionary transformation of similarity matrices, represents in general a system of algorithms named by us as "information thyristor" [12]. Essentially, this technique enables the analyst to establish on a quantitative level whether or not a certain idea may apply to a given set of elements of a system, and thus it provides a new approach to hypothesis generation and verification. Part III describes a methodology that involves the use of an outside "drifter" object that chaotically and unrestrictedly moves within the system's space while being compared, at its every move, to all of the elements of the system. Coupled with the method of evolutionary transformation of similarity matrices, the use of a drifter provides unique information about interactions between the elements of a system and, in particular, allows the assessment of the size and shape (i.e. aura) of the system's space beyond which the intra-system interactions cease to exist. It appears that intra-system interactions are described by a very complex structure of closed attractor membranes which depends on parametric characteristics of the system's elements and the number of those elements.

It would be impossible within the framework of a single article to thoroughly discuss those aspects of evolution, metaphysics, hierarchy and systemity which are directly connected with the problem of 'part-whole' relations, hence with the technology presented in these papers. Nonetheless, in order to better understand the proposed technology and its implications for fundamental and applied science, it will be necessary to touch upon some of the above-said issues. In this paper, we will consider some of the theoretical problems that are important in the context of the technologies presented in this series of three articles.

2. Holistic perception of information and construction natural [...]

2.1. Evolution

Computer science deliberately uses the "evolutionary" epithet, such as in 'evolutionary computation', 'evolutionary algorithms', 'evolutionary programming', etc. This comes primarily from a desire to state that a certain software product is not merely an additive set of commands based on linear logic but a system capable of independent development and self-organization. In this context, we would like to point out a certain criterion that can indicate whether or not a given computer program indeed can perform actions similar to Darwinian natural selection.

In Chapter 6 of "The Origin of Species", titled "Difficulties on Theory", Charles Darwin remarked that there were some problems that, in his opinion, could be fatal to his theory: "These difficulties and objections may be classed under the following heads: Firstly, why, if species have descended from other species by insensibly fine gradations, do we not everywhere see innumerable transitional forms? Why is not all nature in confusion instead of the species being, as we see them, well defined?" [13]. In Darwin's time, the science of paleontology was yet at the embryonic stage of its development and the fossil record was poorly known, so Darwin could only hope that in the future the situation would change for the better. However, within the past 150 years, not much has changed in that respect. "One of the most surprising negative results of paleontological research in the last century is that such transitional forms seem to be inordinately scarce. In Darwin's time this could perhaps be ascribed with some justification to the incompleteness of the paleontological record and to lack of knowledge, but with the enormous number of fossil species which have been discovered since then, other causes must be found for the almost complete absence of transitional forms." [14]. Thus, what has been proven on a large number of examples should be accepted as an axiom or law: the most important point of Darwinian natural evolution is that species formation represents a discontinuous function or – more precisely – a process of sharp dichotomization.

It clearly follows from the above that the adjective "evolutionary" in reference to a computer program must be used responsibly. To claim that a certain computer-based procedure for data processing has semblance to a natural process of evolution, the "novel and coherent structures, patterns and properties" [15] arising in the course of data processing must be free from transitional forms. The fact that the iterative averaging of properties of a system's elements, as described below, leads to the system's sharp dichotomy with no transient elements, makes it fully conform to the criterion of natural evolution. Reconstruction of phylogenesis, i.e. moving from the leaves of a hierarchical tree towards its root, as provided by the algorithm of evolutionary transformation, consists in successive averaging of the properties of antipodes that evolved from their predecessor and represents a perfect illustration of a holistic perception of evolutionary processes.

2.2. Metaphysics

"The question of holism must be approached from a metaphysical point of view: as the task of determining the level up to which a property is constituted by its relation to other properties" [16]. Indeed, metaphysics as a branch of philosophy which studies the nature of the universe as a whole, the science of being and knowing, is fully compatible with the paradigm of holism since any doctrine that emphasizes the priority of a whole over its parts is holism. Metaphysics that deals with fundamental problems of Weltanschauung has always influenced concrete science as no single scientific theory can be tested in isolation. It seems to be quite natural that metaphysics must have emerged and developed due to an expressed inability of the average human mind to perceive new knowledge in a way other than viewing it as an additive construction based on the existing knowledge. At the level of linear logic, it is impossible to perceive the universality of regularities of formation of antipodes within a system whose elements undergo iterative averaging, and this seems to be the most clear and compelling proof of the legitimacy and validity of the metaphysical view of the surrounding world. Therefore, analysis of the history of relationships between metaphysics and specific fields of science can help see the tendencies in the development of methodology of scientific knowledge with regard to the 'parts-whole' problem.

Unlike science, metaphysics strives to arrive at an ultimate and overall perspective from which it would be possible to explain all the aspects of existence as it is. Therefore, metaphysical knowledge is more conservative and hard to refute, hence more stable than scientific knowledge. As the fundamentals of scientific knowledge get more complex, the finding of formal relations between appearances becomes more complicated due to limitations in the human

mind capabilities and potentials, whereas the need in visualization capabilities upon demonstration of scientific results that go above and beyond the frames of strictly scientific relations is constantly growing. Science cannot exist without criteria of veracity of knowledge. During smooth periods of the development of science, the criteria of veracity are based on the scientific Weltanschauung supported by academic schools of thought and scientific authorities. However, in the times of its turbulent development, science leans on metaphysics for criteria of veracity of the new knowledge. The late 19th and especially the early 20th centuries were the time of unparalleled revolutionary discoveries in almost all fields of science, as well as of unprecedentedly high rate of development of theoretical knowledge and precision engineering. As a consequence, during the recent decades, science has shown a global tendency toward integration, or rather eclectic combination, of scientific and philosophical knowledge, or – to be more precise – toward imitation of metaphysical approaches and methods, which has produced such new areas of science as the theory of self-organization [17], the general theory of systems [18], nonequilibrium thermodynamics and the theory of dissipative structures [19], synergetics [20], fractal geometry [21], superstring theory [22], as well as some new notions, such as complexity [23], emergence [15], etc. The emergence of these quasi-metaphysical theories and terms is both natural and symbolical, which has been already commented on in philosophical publications. J. W. N. Watkins [24] noted, "The counter-revolution against the logical empiricist philosophy of science seems to have triumphed: I have the impression that it is now almost as widely agreed that metaphysical ideas are important in science as it is that mathematics is". It should be also emphasized that, for understandable reasons, putting a purely scientific knowledge into a quasi-metaphysical form always stirs enthusiasm in the scientific community but does not lead to perfection of scientific tools and means.

There is something in common in the afore-mentioned quasi-metaphysical theories, as well as in all other quasi-metaphysical theories. One of their common features is that they represent a non-controversial and logical expansion of science to a scale that is so large that it seems to be a metaphysical scale, even if it has nothing to do with metaphysics which studies what lies beyond the boundaries of physical phenomena. Thus made artificial stretching of scientific knowledge ought to be non-controversial in order to avoid outright rejection. In other words, it cannot afford being radically novel or too contradictory to commonly accepted scientific [...]. For example, after Haken's monograph [20], the concept of synergetics can be effectively used in scientific discussions in any field of science. The term 'synergetics' is known to fit any scientific topic, without an immediate risk of its improper use. It is a convenient word to use when speaking of the universe as a complexly organized live entity in constant dynamical and evolutionary self-development, but, in fact, all it represents is just a way of supra-scientific interpretation of reality and essentially is profanation of metaphysical knowledge. The synergetics-based way of thinking is supported mainly by generalized symbols of certain, yet to be acquired knowledge, nonlinear logic and other attractants of idealized science, rather than by concrete particulars of the nonlinear world which could be studied today and to the point.

Another quasi-metaphysical theory – the superstring or M-theory [22] – has become a part of elementary [...] physics. In certain conditions, superstrings may reflect the properties of ordinary particles, but at the discretion of a physics theoretician – for instance, upon the increase of dimensionality – they may become extremely complex and acquire most unexpected and unpredictable properties and thus stimulate the researcher to look at an old problem in a new way since a traditional approach to the problem turns out to be too complicated. According to P. Woit, a quantum field theorist, currently working in mathematics [25], no one has ever been able to make any experimental predictions based on the string theory, and there is a legitimate question of whether the string theory is a scientific theory at all. In Woit's opinion, the only area in which the string theory is really strong is public relations (which is true for all and any other quasi-metaphysical theories, and which distinguishes the latter from the true metaphysical outlook that has always had tremendous problems in the area of

public relations). Woit, admitting that the leading theorists in the string theory were undoubtedly geniuses, has thoroughly analyzed the reasons why the string theory had caused such overwhelming enthusiasm in quantum physics. It was quite understandable as the disproportionately drastic formal expansion of science brought about high expectations in respect of many problems that theretofore seemed to be irresolvable. First and foremost, it applies to the problem of quantum gravitation theory that has always attracted the most talented physicists. Another example of a quasi-metaphysical approach based on an unlimited non-controversial expansion of theoretical knowledge is fractal geometry effectively used by B. Mandelbrot for development of reproducible techniques in recognition of indefinitely expanding elements of fractality in various mathematical and natural forms [21].

The use of the technique of non-controversial formal expansion of theoretical knowledge could be also demonstrated on examples from all of the above-named areas of science which exploit the imitation of the metaphysical level of knowledge. This has become a stable tendency in modern science, and it has a strong motivational effect on scientists but very limited potentials in producing new concrete knowledge about objects under study, unless, of course, 'new knowledge' means new ways of presentation of widely known [...]. It is easy to prove that the more elaborate are the attempts to tightly bind scientific knowledge with metaphysics, the slower is the [...] in fundamental scientific knowledge, and the faster is the development of applied sciences where fundamental knowledge, despite its overall positive role, has an inhibiting effect on innovativeness.

Notwithstanding the congruence between metaphysics and holism, they pursue fundamentally different goals in acquisition of knowledge. Unlike metaphysics, the paradigm of holism is aimed at concrete knowledge, and therefore there is no such thing as quasi-holistic science. If a union between holism and specific fields of science will ever be built, it may exist only based on exact knowledge, and therefore the prefix "quasi" will not be applicable. The emergence of quasi-metaphysical theories clearly demonstrates that the ever-existing balance between holism and greedy reductionism has a strong tendency of shifting towards holism.

2.3. The concept of system

The term 'system' is one of the most widely used terms both in science and life, not only due to its semantic plasticity, but also because it reflects some of the very important universalities of being. Among the numerous definitions of 'system', the key words are a "united whole" wherein all of the constituent parts work together to perform a certain concrete function. Hence, the inevitably following conclusion is: any functioning system determines "by itself" what it needs in order to perform its function, and what it does not need. If a certain element of the system does not need to be engaged at a certain given moment of time, but it will be needed at a later time or upon variations in the system's environment, that element should necessarily be considered as a systemic element, as without it the system would lose its functional potentials. Thus, all things considered, a functioning system is a closed formation, even if it includes, along with obligate elements, facultative ones, which creates certain diffusion of a conventional border between the system and outer environment, and therefore a word combination "open system" is nonsense. From the standpoint of holism, elements that are responsible for a system's functioning as a whole are the system's immanent parts. "Open" or "closed" should rather refer to the operational mode applied to analysis of a system, but not a system itself which by definition can only be a closed formation.

The "independence" of functioning systems was conceptually reflected in one of the earliest quasi-metaphysical theories – W. R. Ashby's theory of self-organization [17] in regard to processes in which dynamic functional systems become more organized over time and on their own, i.e. without management by outside agents. The concept of self-organization – or "order for free", as was well put by S. Kauffman [26] – is one of the most difficult issues in modern scientific knowledge as the theory of self-organization is essentially a collection of observations, precedents and assumptions, rather than a scientific hypothesis providing instructions on how to

discover concrete mechanisms of self-organization in real-world systems. Probably the most impressive results in the study of self-organization were obtained by the Nobel laureate Jean-Marie Lehn in his numerous works on supramolecular structure self-assembly (see, e.g. [27]). J.M. Lehn has convincingly demonstrated that information stored in the covalent framework of the components of a complex mixture of […] is a source of corollaries, instructions and programs recognized at the supramolecular level. "Self-organization … may be directed by the design of both these components and their mode of assembly, i.e. by the molecular information stored in the components and by the supramolecular processing of this information through the interactional algorithm (the interaction pattern involved)" [28]. One of the factors of J.M. Lehn's success in the study of the processes of self-organization was that he investigated systems with a comparatively limited and well-studied diversity of interaction between elements (molecules). However, in real-world systems, interactions between elements are overwhelmingly diverse and, as a rule, imperceptible by reductionist methods of investigation. Therefore, most quasi-metaphysical theories refer to such interactions by using the term "cooperative relations", which is too vague to be a scientific term [17-20].

"Independence" of functioning systems is quite a nuisance for science as it imposes insurmountable limitations. If a system "decides on its own" how to structure itself in order to be able to perform its functions, then a reduction of the system, with the purpose of understanding how it does all that, will make it incapable of "making decisions", hence revealing its secrets. This explains why one of the approaches used by quasi-metaphysical theories, e.g. general systems theory [18], the theory of dissipative structures [19], and synergetics [20], is based on distinguishing between open and closed systems. In physics, an "open system" refers to a system that exchanges its substance and energy with the outer environment, whereas a "closed system" means a system that does not allow for such an exchange. Clearly, there is deliberate confusion between two absolutely different problems: (1) how a system acquires substances and energy needed for its functioning, and (2) how it functions, i.e. what mechanisms start the spontaneous interactions between individual elements and subsystems. In the course of study of any functioning system, it will immediately become clear that these two problems are interconnected in a very complex way and cannot be considered apart from each other. This circumstance is the root of the fundamental difficulty of theoretical understanding of systems behavior. Differentiation between open and closed type systems is an artificial separation of the two aforesaid problems which cannot provide any input to understanding of concrete mechanisms of systems functioning.

L. von Bertalanffy [18] believed that many characteristics of living systems which are paradoxical in view of the laws of physics were exactly due to their being open systems, as living systems cannot, for instance, live without consumption of oxygen and other substrates from the environment and without releasing the products of their metabolism into the environment. I dare think that to a certain extent this was a psychologically motivated position meant to state that we would not be able to understand the origin of life and the ultimate nature of life processes mostly because any living system is a functional part of a very complex surrounding world. If we place an obligate aerobic organism in a low-oxygen medium, then, irrespective of the way it consumes oxygen – through gills, lungs, or diffusion – the organism's reaction to hypoxia will involve certain evolutionarily determined metabolic shifts that are inherent in a given biological system and are not specific for a particular medium. In conditions of anoxia, an organism, as an independently functioning system, will struggle with a lack or absence of oxygen till the very end by mobilizing its own internal mechanisms, and no changes in the environment can revive the organism post mortem. The environment that provides oxygen for an organism does not participate in the organism's struggle for survival in the conditions of hypoxia. After all, we cannot refer to an airplane as an open system even though it uses an external source of energy and releases the products of fuel combustion into the environment. In Part III of this series of articles
"From a set of parts to an indivisible whole", it will be visually demonstrated that a functioning system cannot be an open system.

According to I. Prigogine [19], systems that display dynamical self-organization are open nonlinear systems that are far from thermodynamic equilibrium and resistant to minor disturbances. An increase of order in such systems is coupled with a decrease of order in their environment. In other words, in a system undergoing self-organization, the entropy continuously increases but it concurrently dissipates or is exported to the environment, thus making the second law of thermodynamics valid for any kinds of situations, especially in case of such dissipative structures as living organisms. As noted by F. Heylighen [29], "the export of entropy does not explain how or why self-organization takes place"; however, that question, crucial for theoretical science, does not seem to have been of major concern to I. Prigogine, as his works were aimed at pushing classical thermodynamics to the level of metaphysics and turning it into a universal doctrine of the metaphysical scale so it could explain everything, including the origin of life, the space-time relation, the nature of chaos, and solve many other problems of physics and biology which have not yet been solved and will hardly be solved in the foreseeable future.

Summing up the above brief discussion of the global tendency toward cross-breeding of science and metaphysics, I should like to emphasize that this is not a random phenomenon but a perfectly natural trend showing that the interest in the paradigm of holism in the context of exact sciences has been invariably growing during the past few decades and that research and establishment of generalized mechanisms of synthesis of an indivisible whole from its individual parts is a fundamentally important task.

2.4. Hierarchical systems

In his often-quoted article H. H. Pattee [30] wrote, "Evolution requires the genotype-phenotype distinction, a primeval epistemic cut that separates energy-degenerate, rate-independent genetic symbols from the rate-dependent dynamics of construction that they control". The use of the word 'evolution' makes this statement lose its apparent original meaning. In fact, evolution is directly connected with genetic variability, i.e. with random mutations wherein a gene is a unit of variability. A genotype directly determines and controls a phenotype at the individual organism level, whereas the basis of natural evolution is a feedback that occurs at the population level via natural selection and heredity. Unlike the case with ontogenesis, in evolution, genetic symbols are rate-dependent. Essentially, the cause of the complexity of life is not so much the diversity of elements, i.e. genes and variants of their phenotype expression, but the diversity of combinations of those elements in populations of organisms. Although molecular biology offers many spectacular successes, it is clear that the detailed inventory of genes, proteins, and metabolites is not sufficient to understand, at the level of additive perception, the cell's complexity [31]. Therefore, the study of those combinations and their hierarchies has become a priority in biology. Structural and functional analysis of hierarchies of genomic and post-genomic data is of primary importance in proteome research [32], studies on metabolism, transcription – i.e. everything that both divides and joins genome and phenome. Thus, Pattee's above-quoted statement rather applies to the epistemic cut between potentially measurable properties of a set of a system's elements and the unaccountable changeability of the system's properties due to its elements' interaction, given that in reality each unique set of elements corresponds to a single specific system with its given unique set of properties. That is why it is so difficult and actually impossible, by using the currently available scientific methods, to calculate that single specific solution that is hidden among an infinite multitude of possible solutions. Ultimately, and in a broader sense, it is all about the epistemic cut between scientific and metaphysical (holistic) views on the nature of things and phenomena and the incomparability of the two views – holistic and reductionist – for the known and the knower. The epistemic cut is of paramount importance, which is especially evident in studies of hierarchical systems.

Any system where there is a parent-child relationship, any ascending or descending series of elements ranked according to their value can
be viewed as a hierarchical system. However, upon thorough consideration it appears that hierarchy, especially concerning biological objects, is an extremely complex notion from the standpoint of philosophical understanding. The fact is that pathways along the branches of hierarchical trees reflect the precise history of node organizations. Individual nodes of a hierarchical tree, even closely adjacent ones, may significantly differ by character and functionality. A. Koestler [33] referred to the notion of a holon, which means an entity in a hierarchy that is at once a whole and at the same time a part. Thus a holon at once operates as a quasi-autonomous whole that integrates its parts. Traditional scientific approaches to analysis of complex hierarchical systems are mostly based on reductionism, which suggests that the nature of complex entities can always be understood by breaking them down into simpler or more fundamental components. By doing so it is easy to overlook a certain holon or a number of holons that not only connect but also divide the nodes of a higher than root hierarchical level, which would make them irreversibly lose their hierarchical character. For example, the Chilean economist and philosopher M. Max-Neef [34] has argued that such a complex structure as fundamental human needs is non-hierarchical as human needs are ontologically universal and invariant in nature. However, we all know that any given individual has a specific and distinctly expressed hierarchy of needs. Which one of these views should be the correct one? In fact, both are correct. The problem with Max-Neef's reasoning on a hierarchy of needs is that he disregarded very important holons, psychotypes (according to C. G. Jung), which determine the hierarchy of needs of individuals. A hierarchy of human needs is not built from abstract needs; instead, it can only exist in connection with an individual whose psychological and physiological specifics and social status determine the pattern of needs. Thus, the conclusion made by M. Max-Neef is valid as an example of utilitarian statistical generalization, while the word 'hierarchy' in this example was used as a trope, a non-literal reference to inequality, as, for instance, a hierarchy of the three ascending ranks of angels in Medieval Christian theology. Max-Neef's "hierarchy" of needs has nothing to do with the scientific notion of natural hierarchy. There are many other similar examples of use of the word 'hierarchy' which is not science-related, i.e. when the term 'hierarchy' is used as an equivalent of assessment of 'more' vs. 'less', 'higher' vs. 'lower', 'more important' vs. 'less important', etc., e.g. as in the so-called 'dominance hierarchy', which is a form of animal social structure in which a linear or nearly linear ranking exists, or as 'memory hierarchy' in computer science – and does not correspond to the meaning of 'hierarchy' as a scientific term.

Oftentimes, a hierarchy is assessed based on analysis of non-commutative properties; or by comparing events that are characterized by conditions and properties that are absent in the preceding node and make other events unavoidable; or by deliberately establishing the nodes of a hierarchical tree and the following re-evaluation of them based on further analysis and consideration; and sometimes, a hierarchical analysis is aimed at simply having a complex system of large parts broken down into a simpler system of smaller parts so it becomes easier for overview and perception. The above clearly brings up a question of what a natural hierarchy is and whether it can be scientifically explained as a phenomenon. This problem is all the more important as there is no unanimous opinion on whether or not a natural hierarchy exists at all.

Most research papers that attempt to find out the peculiarities of a natural hierarchy mainly deal with biological objects or phenomena of the biological level of complexity. However, as will be demonstrated in the experimental section of this paper, a natural hierarchy may exist in any, not necessarily biological, group of objects or phenomena. A. Koestler, in his paper on the theory of self-regulating open hierarchical order (SOHO) [35,36], makes an attempt at highlighting some of the characteristics of the natural hierarchy by describing the properties of holons on the examples of mostly biological objects. Leaving aside the fact that the said description includes 66 positions, which by itself excludes the possibility of providing clearly defined criteria of a natural hierarchy, many of the points of the SOHO theory are equally applicable to both a natural hierarchy and a hierarchy that by no
means can be considered as natural. For instance, one (and probably the most significant and interesting of all 66 points) of Koestler's points is: "Every holon has the dual tendency to preserve and assert its individuality as a quasi-autonomous whole and to function as an integral part of an (existing or evolving) larger whole. The polarity between the Self-Assertive (SA) and Integrative (INT) tendencies is inherent in the concept of hierarchical order, and a universal characteristic of life" [35]. This is certainly correct; however, the same is true, for instance, for a military hierarchy that is not a natural hierarchy.

It is not difficult to establish that there are two types of hierarchy: descending (D) and ascending (A). In a D-type hierarchy, the root of the hierarchical tree has the most complex organization as compared to higher nodes and end leaves; and conversely, in A-hierarchies, leaves of a hierarchical tree represent more complex structures than the root. For the purpose of analysis, any hierarchy can be considered from the viewpoint of a descending or ascending hierarchy, depending on the analysis vector, i.e. either from root to leaves (d), or from leaves to root (a). An example of A-hierarchy is a phylogenetic tree. The concept of D-hierarchy can be illustrated by the example of an effort to understand how a mechanical clock works by disassembling it into parts. A "working mechanical clock" root produces two nodes: node 1.1, a winding key; and node 1.2, a clock in a working condition but unwound. Node 1.1 is a dead-end node and has no branches. Node 1.2 leads to node 2.1, a clock face that is also a dead-end node, and node 2.2, a working clock which, when wound, can perform the clock's main function, i.e. allow us to approximately determine time by the position of the clock hands. Ultimately, the leaves of the Da-hierarchy are: a spring, gears, screws, and other parts that cannot be further broken down.

The Da-type hierarchical analysis drastically differs from the Dd-type analysis, which can be illustrated by the process of assembly of a clock from a set of individual parts. A Dd-type analysis requires the knowledge of or instructions on performing an assembly. A Da-type hierarchy analysis is based on reductionism, whereas a Dd-type analysis is based on holism. Between the Da- and Dd-approaches, there lies an epistemic cut that may be hard or impossible to overcome: a disassembly is by far easier than an assembly. Therefore, of the four above-said types of hierarchies, the Dd-type hierarchy in any kind of system is always the most difficult for analysis, especially in comparison with Aa- and Ad-hierarchies in biological systems. Due to homomorphism of nodes in biological hierarchies, and because a lack of information about some of the intermediary nodes does not significantly affect the hierarchical analysis due to distinct homology between the organisms of successive evolutionary levels, any possible ambiguities in systematics and taxonomy of organisms are mostly local ambiguities that do not interfere with an overall result. In this case, a cognitive process is directed not at the construction of a hierarchy but at the discovery of a natural hierarchy that exists independently from our […] and has emerged in the process of evolution.

The notions of 'holon' and 'holarchy' [35,36], accepted by the scientific community with much enthusiasm (see, e.g. [37]), essentially can only serve as an indirect confirmation of the well-known […] that a whole does not equal the sum of its parts. These new terms, having given rise to speculative tendencies in the study of the 'parts – whole' problem, have not provided any new input into understanding of the universal nature of hierarchy and, particularly, of the mechanism of Dd-hierarchy. Hardly ever can anyone verify, theoretically or practically, A. Koestler's idea that "the concept of the holon is intended to reconcile the atomistic and holistic approaches" [36]. The term 'holon' can be more effectively applied to nodes that can undergo further transformation (evolution). For instance, in the aforementioned example with a mechanical clock, nodes 1.2 and 2.2 can be considered as holons, while nodes 1.1 and 2.1 are not holons. There is at least one more point where the idea of 'holon' may prove useful. As was already mentioned, in a natural hierarchical system, holons located at different levels of a hierarchical tree should be homological due to their quasi-autonomy. For instance, holons-nodes of the evolutionary phylogenetic tree are homological because each of the objects of that hierarchy represents individual whole cells or organisms made of whole cells. Therefore, the
question put forth by H. Pattee [30], "Is it possible for us to distinguish the living from lifeless if we can describe both conceptually by the motion of inorganic corpuscles?", cannot be answered by means of a hierarchical analysis of living forms because the root of the phylogenetic tree of living forms is not a set of macromolecules but a certain hypothetical, most ancient prokaryote-protobiont that had a cellular membrane structure and an established metabolic system. Further, it gave the second branch of the hierarchical tree by having evolved to a eukaryote, a cellular structure in which the DNA replication and ATP synthesis were – with the well-known evolutionary benefit – isolated from the processes occurring on the cytoplasmic membrane, by means of respective membrane structures supported by the endoplasmic reticulum network. In any hierarchical system, the degree of heteromorphism among the root and the leaves steadily increases; however, in natural hierarchical systems, the increase of heteromorphism must be strictly successive and regular, and with all the differences between the nodes of various levels, they still must share a certain common feature. In the case of the evolutionary phylogenetic tree, such a common feature is the cellular structure.

2.5. Construction of hierarchies

Construction of hierarchical systems is routinely used in research and technology and represents an important part of human cognitive and intellectual activities. The problem of synthesis of an indivisible whole from a set of parts cannot be tackled without the understanding of mechanisms of construction of hierarchical systems. As for those mechanisms, they cannot be established without the criteria for a natural hierarchy. Certainly, such criteria cannot be determined based on purely theoretical knowledge; however, the vast amount of empirical knowledge accumulated by humankind appears to be sufficient for establishing the necessary minimum of such criteria and, by that, demonstrating that modern science does not know the mechanisms of construction of natural hierarchies for the purpose of knowledge acquisition.

Without claiming a comprehensive analysis of all the reasonable criteria of a natural hierarchy, we will point out the two most important positions:

A. A natural hierarchy is always a closed system representing an indivisible whole. It cannot comprise entities that are not interconnected through branches of the hierarchical tree, i.e. through hierarchical relations. The distinction of a hierarchical system lies in the fact that it can be considered as an element of a higher level system, and that each of its elements, in turn, represents a lower level system. A total interrelatedness is exactly what makes any hierarchical system a closed system. Any hierarchical system can be partially or completely destroyed by the removal of even one of its elements or by the addition of an element that is foreign for a given system.

B. Creation of a natural hierarchy is possible only through an objective, uniform and equal impact on all of the system's entities so as to allow the construction of a hierarchical tree to occur based on the principle of self-organization. In a natural hierarchy, the dichotomy of a node is always asymmetrical, unpredictable and dependent on the entirety of all the elements and their interrelations. While it is possible to thoroughly study the mechanism of dichotomization, as has been done, for example, for arterial lines as a system of dichotomous branching [38], one cannot predict the site and time of the occurrence of dichotomy, nor the proportion of the diameter of a new branch to the diameter of the major artery. Dichotomy of nodes is always preceded by a state when the "soon-to-be" groups can completely dissolve in each other and represent a whole. In a natural hierarchy, the formation of branches from an initial node cannot occur through a selective and subjective impact on the system.

Now let us look at the most common techniques used by modern science for construction of Dd-hierarchical systems, or in other words, for reconstruction of a Dd-hierarchy from a Da-hierarchy (synthesis vs. analysis). These techniques are referred to as clustering techniques, even though a hierarchy and a clustering are not one and the same phenomenon. Clustering is partitioning of a data set into subsets, i.e. division into groups of similar subjects. Clearly, clustering does not necessarily imply the presence of a hierarchy, whereas a hierarchy always involves clustering. A term
"hierarchical clustering" was coined by S. C. Johnson [39] in 1967. There are two types of hierarchical clustering – agglomerative and divisive. Agglomerative clustering starts with taking each entity as a single cluster and then building bigger and bigger clusters by grouping similar entities together until the entire dataset is encapsulated into one final cluster. Conversely, a divisive hierarchical clustering starts with all objects in one cluster and then subdivides them into smaller units. Divisive clustering methods are not as common as agglomerative ones. Hierarchical clustering is well covered in science literature, including papers, patents and books (e.g. [40, 41]). Here, we will point out those aspects of it which show that clustering, in its common meaning, performed by the currently available methods, including those which are referred to as "hierarchical clustering", in fact has nothing to do with a natural hierarchy. For that purpose, we will consider hierarchical clustering from the position of the above-specified criteria of the natural hierarchy: closedness and self-organization.

Hierarchical clustering is usually performed based on pairwise similarities (or dissimilarities) between the elements of a system under analysis, i.e. by establishing similarity (or dissimilarity) scores for pairs of elements or pairs of groups of elements of the system. A resulting similarity matrix represents an open system, which is expressed in the fact that a removal of any number of elements from an initial system will not affect the (dis)similarities between the remaining elements; likewise, an addition of new elements to a system will not change the relationships between the existing elements. This means that in the process of hierarchical clustering, it is impossible to create a closed hierarchical system, i.e. a system that represents a united indivisible whole. Also, agglomeration and division in the process of hierarchical clustering are determined by algorithms and not by cooperative interactions between the elements of a system, hence they are not a result of self-organization. Moreover, those algorithms allow the use of different techniques for determining the similarities (or dissimilarities) between different elements or groups of elements of the system. Clustering performed in this way cannot reflect the natural hierarchy in a system under analysis, as it excludes any possibility for the elements of the system to display their relationships through self-organization.

2.6. Bifurcation (dichotomy) as a result of iterative averaging

Previously, we discovered and described a phenomenon [11] that may seem to be contrary to common sense: iterative averaging (either arithmetic or geometric) of properties of the elements of a system leads to dichotomy (bifurcation). By common-sense logic, averaging of the elements of a system should eventually make them indistinguishable. Instead, it causes an asymmetric division of the system into two alternative subgroups. Clearly, this paradox has a general informative value and can provide better understanding of many physical processes, natural evolution, and cognitive processes. Since a study of any paradox should start with a solid proof of its existence, which is by far more important than the understanding and explanation of its nature and causes, in this paper we will only focus on the methodology that provides the reproducibility of the said effect and allows its investigation, leaving aside the issues of the metaphysical role of that paradox and its mathematical nature.

We will show on the examples of analysis of real-world and artificial datasets that the discovered effect is universal and true for any kind of system, and that it can be easily demonstrated on any set of data consisting of more than three elements that differ, even very slightly, from each other. The said effect can also be demonstrated on any set of random data points, which distinguishes our methodology for discovery of natural hierarchies in systems under analysis from the so-called hierarchical clustering that cannot be applied to systems of random data points. The proposed method ideally suits the task of construction of natural hierarchies, in an automated mode, by a standardized procedure. First of all, after the very first operation of averaging, any set of data becomes a closed system, which means that none of its elements can be removed from it and no new elements can be added to it, as it would result in distortion of the input dataset. Second of all, the algorithm of iterative averaging provides an
absolutely the same effect on each of the elements of the system, which is performed autonomously and independently from a human operator. Iterative averaging of the system's elements evokes processes within the system which are similar to what is considered to be self-organization processes. These self-organization processes determine the number of clusters in the hierarchical system that emerges from the input dataset. Thus, the said method meets the two earlier defined criteria of natural hierarchy as a certain wholesome structure that is inherent solely in a given system of data.

After completion of the first cycle of iterative averaging, a dataset under analysis becomes divided into two alternative subgroups of elements, without outliers. The following processing by iterative averaging of each of the successively emerging subgroups ultimately provides a hierarchical system that can be graphically presented as a tree or dendrogram wherein the lengths of branches are proportional to the logarithms of the numbers of iterations that have led to the dichotomy of a given set of elements. Thus, the above-described operational […] kind transformations provide images of logical spaces through a visual and holistic representation of the end result of hierarchical construction, which, particularly, allows one to see whether a succession of dichotomies, unpredictable to the human mind, correlates with the human perception of the same system. It is apparent that in such a kind of data processing, even one wrong subdivision into alternative groups, especially at the early stages of the processing, would lead to a completely wrong end result. In the meantime, the processing of data on a hundred objects can involve dozens of such alternative subdivisions in the course of self-organization of the input dataset. Even the few examples presented in this paper are sufficient for demonstrating that the end results provided by this quite simple method of data processing are congruent with what is meant by the term 'artificial intelligence'. Certainly, it happens only when an input dataset carries a certain meaning even if it cannot be discovered by any other data processing methods.

The mechanism of transition from the leaves of a hierarchical tree (Dd-hierarchy) to its root, or, in other words, synthesis of an indivisible whole from a set of individual parts, is provided by iterative averaging effected by the evolutionary transformation algorithm and does not involve any kind of additional techniques. However, in the course of development of a universal algorithm for the process of iterative averaging of data points, we had to face a number of problems caused by the shortcomings of the currently available techniques for establishing (dis)similarities between objects. First of all, with all the numerous metrics currently available for computation of (dis)similarities (see e.g. [40, 41]), there is no logically clear approach to grouping of attributes from the standpoint of applying the most optimal metrics for each type of attributes. For instance, the most commonly and almost universally used metric is Euclidean distance, i.e. the distance between objects in a multidimensional space of parameters describing those objects. However, such operations often appear senseless. For example, it is impossible to establish distances between values of such parameters as concentration of a substance or intensity of display of a certain quality of an object. Another example of improper use of Euclidean distance would be a comparison between different levels of household income: in terms of Euclidean distances, the resulting distance between annual incomes of $10,000 and $50,000 is the same as between $510,000 and $550,000, which clearly does not correlate with the actual differences between these values.

In order to normalize and standardize the use of metrics in computation of similarities between objects, we have developed two universal metrics: the XR-metric and the R-metric [11] (see Methods below) that are applied depending on whether a given parameter reflects a shape or power of objects. Both metrics provide computation of similarities between objects and are normalized from 0 to 1. The XR-metric allows computation of similarities in strict conformity with linear distances between respective objects and provides results that are identical to results obtained for the same objects with the use of Euclidean distances. The examples provided in this paper, as well as our years-long practical application of these metrics, demonstrate their […] and efficacy.
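The iterative averaging discussed in Section 2.6 is stated formally as Eq. (1) in Section 3.1. A minimal Python sketch of one transformation step is given here as an illustration only: the function name `etsm_step`, the `mode` argument, the toy matrices, and the handling of zero entries are assumptions of this sketch and are not part of the original method description. The sketch does not attempt to reproduce the reported bifurcation; it only shows that each transformed entry depends on all objects in the dataset, so that removing an object changes the remaining entries.

```python
import math


def etsm_step(S, mode="GM"):
    """Apply one transformation of Eq. (1) to a square similarity matrix S.

    S is a list of lists; mode is "GM" (geometric mean) or "AM" (arithmetic mean).
    """
    n = len(S)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            ratios = []
            for k in range(n):
                lo = min(S[i][k], S[j][k])
                hi = max(S[i][k], S[j][k])
                # Assumption of this sketch: two zero entries count as identical.
                ratios.append(1.0 if hi == 0 else lo / hi)
            if mode == "AM":
                out[i][j] = sum(ratios) / n
            elif min(ratios) == 0:
                # Assumption: a geometric mean with a zero factor is zero.
                out[i][j] = 0.0
            else:
                out[i][j] = math.exp(sum(math.log(r) for r in ratios) / n)
    return out


# Toy S-matrix for three objects (values chosen only for illustration).
S3 = [[1.0, 0.25, 0.5],
      [0.25, 1.0, 0.5],
      [0.5, 0.5, 1.0]]
# The same data with object 2 removed.
S2 = [[1.0, 0.25],
      [0.25, 1.0]]

A3 = etsm_step(S3, mode="AM")
A2 = etsm_step(S2, mode="AM")
print(A3[0][1], A2[0][1])
```

In this toy case the directly computed similarity between objects 0 and 1 is 0.25 whether or not object 2 is present, whereas the transformed value in AM-mode is 0.5 with three objects and 0.25 with two.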

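The household income comparison made in Section 2.6 can be checked numerically. The sketch below contrasts the absolute difference (which is what Euclidean distance reduces to for a single parameter) with a min/max ratio. The ratio is used here only as a hypothetical stand-in for a scale-aware similarity; it is not the XR-metric or the R-metric, whose definitions are not reproduced in this section, and the function names are assumptions of this example.

```python
def euclidean_1d(a, b):
    """Euclidean distance between two values of a single parameter."""
    return abs(a - b)


def ratio_similarity(a, b):
    """Hypothetical scale-aware similarity in (0, 1] for positive values."""
    return min(a, b) / max(a, b)


pairs = [(10_000, 50_000), (510_000, 550_000)]
for a, b in pairs:
    print(a, b, euclidean_1d(a, b), round(ratio_similarity(a, b), 3))
```

Both pairs are 40,000 apart in Euclidean terms, while the ratio treats the first pair as far less similar than the second.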
Another problem that has been successfully solved in the course of development of the presented methodology is the effect known as the "curse of dimensionality" [43]. The term "curse of dimensionality" is used to describe a problem that occurs when similarities are established based on distances measured in a high-dimensional space of parameters: the higher the number of parameters, the less meaning there is in similarities computed based on distances between the objects. The algorithm of iterative averaging has removed the curse of dimensionality due to the procedure of hybridization of a set of "monomer" matrices computed separately for each of the parameters [42]. This procedure is described in detail in Methods, Section 3.3.

3. Methods

Below we describe the algorithm of evolutionary transformation of similarity matrices (ETSM), which is the central point of the methodology of data processing and interpretation presented in this work. All other algorithms provided further in this paper play a secondary, auxiliary role. They contribute to computation quality and accuracy and provide proper visualization of end results.

3.1. Evolutionary transformation of similarity matrices

The ETSM algorithm is described by the following equation [11]:

$$[S_{ij}]_{T+1} = \operatorname{Aver}\left(\frac{\min\left([S_{in}]_T,\,[S_{jn}]_T\right)}{\max\left([S_{in}]_T,\,[S_{jn}]_T\right)},\; n\right) \qquad (1)$$

where "Aver" is a geometric (GM) or arithmetic (AM) mean value function, T is the number of similarity matrix transformations according to Eq. (1), [S_ij]_T is the pair-wise similarity between objects i and j after T transformations, and n is the number of objects in a dataset, hence the number of elements in a (dis)similarity matrix under processing. It is important to point out that, as is evident from Eq. (1), the similarity between objects i and j, denoted by [S_ij], is an indicator that is qualitatively different from a pair-wise similarity between i and j computed in a conventional way based on direct comparison of each object's attributes. [S_ij] is computed based on each object's averaged similarity to each of the rest of the system's objects, and therefore it represents a polyvalent similarity between i and j which can be conditionally referred to as 'averaged similarity' (A-similarity). Unlike A-similarities, denoted as [S], similarities computed in a conventional way based on direct assessment of similarities between two objects are denoted as S without square brackets. [S]-similarity matrices are computed based on initial S-similarity matrices. The computation technique is provided in Section 3.3. In computation of [S_ij] according to Eq. (1), the GM-mode is more practicable, albeit the AM-mode provides results that are qualitatively comparable and non-contradictory to those produced in the GM-mode. Thus, Eq. (1) provides an averaging, normalized to the interval 0 – 1, of the similarity of each object of the system to all other objects of the system. When using ETSM in the GM-mode for S-matrices based on Euclidean distances, which have zero diagonal elements, the first transformation of S to [S] needs to be performed by using arithmetic means, after which all the subsequent transformations are done with the use of geometric means.

A detailed demonstration of the process of iterative averaging is provided in Section 4.1 on the example of a set of scattered points. Here, in the description of methods, we should like to point out two important properties of the ETSM algorithm.

First of all, the fact of the formation of two alternative groups without outliers does not depend on how the S values were computed. They may be computed based on distances, similarities, dissimilarities, proximities, or any other way of comparison of two objects, including the ways of comparison commonly used by humans or animals for various kinds of assessment of objects. The important thing about the ETSM process is that two objects are compared to each other not directly, as is done in any other clustering method, but by their relations to all of the other objects in the system under analysis. As soon as the first transformation of a set of input data is done, each cell of a square matrix under processing reflects, to a certain degree, the relationships within the whole set of n objects, and not only between objects i and j to whose similarity a given cell corresponds. Thus, as was mentioned earlier, the very first transformation by means of Eq. (1) turns the dataset into a closed system. An addition of a new element to such a system or removal of even one of its existing elements is both senseless and impossible as it would lead to a conflict between the input dataset and the current state of the system under processing, and the magnitude of such a conflict cannot be predicted and pre-assessed.

Secondly, in the course of the iterative averaging, similarities between objects within each of the alternative groups always asymptotically tend to 1 (i.e. to maximum), whereas similarities between objects of different groups asymptotically tend to a certain end value that depends on the values of parameters describing the objects underlying the input dataset. That end value, denoted by […], is a very important indicator of the process of iterative averaging, which will be demonstrated by us on the examples of various practical applications in the following articles of this series. When the objects within an alternative group are identical, the end values may vary from being close to 0 to being close to 1. Upon analysis of datasets describing objects that are very similar to each other, especially when they are described in a multidimensional space of parameters, the difference between 1 and the end value may be diminutively low, down to 10^-4 to 10^-10. Although such low values do not affect the accuracy and the character of the process of bifurcation, in practice it is more convenient to observe the bifurcation process by using our special 'contrasting technique' which serves as a "magnifying glass".

3.2. The function of Contrast

The function of contrast, C, helps differentiate between the end value and 1, no matter how close to 1 the end value may be.

[…] domain, the contrast function can be applied within the range from 0 to 200, thus allowing any A-similarity coefficient in a similarity matrix processed by the method of evolutionary transformation to be represented as either 0 or 1. The effect of the contrast function is illustrated by the plot shown in Fig. 1.

[Figure 1: attenuated [S] similarity coefficient plotted against contrast value (C).]

FIG. 1. Dependence of [S]_C on the value of contrast C (see Eq. (2)). Initial [S] values are shown on each of the 10 curves.

3.3. Computation of conventional similarity matrices

As was already mentioned, upon computation of similarity matrices by conventional methods, the increase of the number of parameters describing the objects under comparison, i.e. the increase of dimensionality, leads to a predicament commonly referred to as the "curse of dimensionality". The problem is caused by the fact that as the number of parameters increases the distances between objects in the n-dimensional space appear to
The contrast function is designed to become progressively lesser parts of the entire attenuate similarity coefficients according to volume of the ndimensional space, thus turning, equation (2) [11]: for instance, Euclidian distances into less and less informative measure of dissimilarities between the exp(expS − )1 .0 082C ) −1 [S] = objects in a highdimensional space of parameters. C exp(e − )1 .0 082C −1 (2), We have developed a method for computation of similarity matrices [42] which eliminates the where [ S]C is an Asimilarity coefficient [ S] attenuated by the contrast function C, and e is a abovesaid problem. The method involves natural number. In practice, in the real numbers computation and hybridization of socalled monomer similarity matrices [42]. Monomer 16 similarity matrices are based on similarities characteristics. The use of XRmetric is optimal according to one parameter only and are for parameters that reflect a system's shape, a computed for each parameter. Then, for each distance between individual points within a pairwise similarity in each monomer matrix, a system. An important property of XRmetric lies geometric mean is calculated, which is then used in the fact that variations in the value of constant for construction of the similarity matrix for the B do not affect the bifurcation into alternative entire set of objects. Thus obtained hybridized groups. Thus, by changing the B constant from similarity matrix additively reflects objects' values close to 1 up to values of the order of similarities based on an unlimited number of magnitude of tens, it is possible to evaluate parameters. Moreover, it provides a capability to objects' similarities based on parameters whose easily change the weight of any of the parameters values may vary within wide ranges up to many by changing the share of a respective monomer orders of magnitude. Coupled with the method of similarity matrix in the hybrid matrix. 
It also monomer similarity matrix hybridization, these allows the use of an appropriate metric for each two metrics provide the advantage of dealing with individual parameter. dimensionless similarity values, which allows for fusion of parameters of any dimensionality, no 3.4. Metrics for shape and power matter how different and incompatible the In construction of monomer, hence, hybrid parameters may be. Thus, unlike Euclidian similarity matrices, Euclidian distances are not distances, the use of XRmetric warrants that appropriate as they would transform into city neither variations in parameter values nor the block metrics. Previously, we have shown that the increase of the number of parameters or objects entire diversity of parameters can be adequately can affect the validity of analysis results. reflected by using only two metrics. One of them is Rmetric ("R" for 'ratio') which is calculated by 3.5. Construction of dendrograms and trees the formula: In construction of dendrograms and hierarchical trees, branch lengths are proportional to a natural

Rij =min (Vi,V j)/ max (Vi,V j) (3), logarithm of the number of transformations involved in a complete cycle of asymptotic where Vi and Vj are values of parameter V for division of input data into two subgroups. In objects i and j. Here, similarity values are construction of hierarchical trees, the angles calculated as the ratio of the lower value to the between the branches can be computed according higher value of the parameter of each of the two to equation (5): objects. Another metric is referred to as XR metric ("XR" stands for 'exponential ratio') and is α= ArcCos│−exp( −1)│ (5), calculated by the formula: where α is the angle between the hierarchical tree −│Vi − Vj │ XR ij =B (4), branches that represent each of the two subgroups, а is the limit value of similarity where Vi and Vj are values of parameter V for between two subgroups of objects which is objects i and j, and B (which stands for 'base') is a reached at full completion of the formation of constant that is higher than 1. XRmetric is two alternative subgroups. The α value varies designed so that it provides a computation of from 0 degrees when similarities between two distances between objects according to desired subgroups equal 1, up to 180 degrees when parameters. Results obtained by using XRmetric similarities between two groups equal 0. fully correspond to those obtained based on Pythagorean Theorem. Unlike Euclidian distances 3.5. Software that reflect dissimilarities, R and XR metrics All the analyses reported in this paper were done provide similarity coefficients. Rmetric is applied with the use of computer program MeaningFinder to parameters that reflect signal strength, 2.2 (Equicom, Inc). concentration, power, or other intensiveness
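As a rough illustration of Eq. (1) in section 3.1, the iterative averaging could be sketched in plain Python as follows. This is not the MeaningFinder implementation; the function names (`xr_similarity`, `etsm_step`, `etsm`), the toy one-parameter data, and the convention that a pair of zero similarities counts as a ratio of 1 are all assumptions made for the example. The first transformation uses the arithmetic mean and later ones the geometric mean, following the remark in section 3.1.

```python
import math

def xr_similarity(values, base=1.1):
    """Toy input matrix: XR-metric of Eq. (4) for a single parameter."""
    return [[base ** (-abs(a - b)) for b in values] for a in values]

def etsm_step(s, mode="gm"):
    """One transformation of Eq. (1) applied to a square similarity matrix s."""
    n = len(s)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            ratios = []
            for k in range(n):
                lo, hi = sorted((s[i][k], s[j][k]))
                # Assumption: two zero similarities are treated as identical.
                ratios.append(lo / hi if hi > 0 else 1.0)
            if mode == "am":
                out[i][j] = sum(ratios) / n
            elif min(ratios) == 0.0:
                out[i][j] = 0.0  # a geometric mean with a zero factor is zero
            else:
                out[i][j] = math.exp(sum(math.log(r) for r in ratios) / n)
    return out

def etsm(s, steps, first_am=True):
    """Repeat Eq. (1); optionally use the arithmetic mean for the first step."""
    for t in range(steps):
        mode = "am" if (first_am and t == 0) else "gm"
        s = etsm_step(s, mode)
    return s

# Hypothetical data: two loose groups of values along one parameter.
values = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
s0 = xr_similarity(values)
s1 = etsm_step(s0, mode="am")   # first transformation, S -> [S]
s_final = etsm(s0, steps=50)    # repeated transformations

for row in s_final:
    print([round(x, 5) for x in row])
```

If the behaviour described in section 3.1 holds for this toy input, the printed matrix should show within-group entries near 1 and a common between-group value; the sketch itself only guarantees that every transformed matrix is symmetric, has a unit diagonal, and stays within the interval from 0 to 1.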

4. Experiments

In this section, we provide a number of examples of analysis of datasets to demonstrate how our methodology is applied to the synthesis of an indivisible whole from its parts. In other words, we will demonstrate how it provides, in an autonomous computing mode, a thorough investigation into complex datasets on objects presented in a multidimensional space of attributes. The examples include both complex ones, such as a comparative study of the climates of 100 cities of 42 U.S. states based on 108 meteorological parameters, and relatively simple ones, such as scattered points in 3D space. As was already mentioned, the uniqueness of the presented methodology is manifested in its capability to handle an input dataset as a closed self-organizing system wherein all of its elements interact with each other according to a certain intrinsic logic that does not depend on the will of a data analyst or programmer. This phenomenal feature of the ETSM methodology is evident from the few examples presented in this paper.

When a programmer or a team of programmers develops a data processing flow, it is always a set of additive operations, no matter how many steps it may include and however complex the mathematics involved in each of the individual steps. In the resulting program, these steps may be performed consecutively or concurrently, but they can always be sorted out, i.e. separated from each other. Certainly, they are connected by a certain logic that underlies the given program. Except for intuitive reasoning, where information is usually supplied in the form of entangled blocks of loosely connected fragments, a similar approach is used in classical science procedures utilized in the course of solving certain research tasks. As far as the method of iterative averaging is concerned, it does not have any counterparts in the techniques used by the human mind in the process of cognition: here, in order to understand a system or an event or a phenomenon, we take all of its individual elements, bond them together into an indivisible whole and then subject it to processing, which represents an absolutely and ultimately holistic approach. This is the first and most important peculiarity of the ETSM method. The second important peculiarity of the method is that the ETSM data processing represents a series of absolutely identical operations performed according to one and the same Equation (1). And finally, the third peculiarity of the method lies in the fact that the end results of analysis by the ETSM method are neither reducible to, nor deducible from, the original input data.

The ETSM method represents a new, heretofore unavailable method of knowledge discovery through a sort of "matrix reasoning". It can be used to solve any information analysis task (as was mentioned earlier, hybridization of monomer matrices [42] removes any limitations in terms of the number of parameters describing the objects under analysis). Datasets that lack any logical meaning, or the presence of several overlapping centers of conflicting information in an input database, will present a problem, just as they do for traditional methods of cognition. In the following publications of this series, we will demonstrate some of the techniques that allow the ETSM method to correct such problems.

4.1. Analysis of scattered points

To demonstrate a result of the iterative action of the ETSM algorithm described by Eq. (1), we will refer to an example illustrated in Figs. 2a – 2b. Fig. 2a shows a set of 36 scattered points which clearly look like four distinct groups of points; we have labeled the four groups as A, B, C, and D, and marked out two points in each group: a and a1, b and b1, c and c1, d and d1. Further, pair-wise A-similarities between the points are denoted by: aa for a and a1, bb for b and b1, and so on; ab for a and b, and so on.

FIG. 2: 3D space of 36 scattered points before and after ETSM processing. a) 36 scattered points located as four groups: A, B, C and D. b) Same 36 scattered points after ETSM processing presented in the form of isohierarchies: the greater the affinity between the objects, the darker the shading of the plane that connects the objects. [Plots: axes X, Y, Z.]

The input S-similarity matrix of the 36 scattered points was computed based on Euclidean distances and was further processed according to Eq. (1) using arithmetic means for the first transformation of S to [S] and geometric means for subsequent transformations (see Methods, section 3.1). After the first 300 transformations, similarity coefficients for pairs aa, bb, cc, dd, bc, and cd appear to equal 1 with an accuracy of up to the 7th decimal place, whereas similarity coefficients for pairs ab, ac and ad (i.e. the end value) equal 0.89459. Upon completion of 500 transformations, changes in the end value occur only in the ninth decimal place. By applying the contrast function (see Methods, section 3.2) at C=80, the result of the evolutionary transformation of the similarity matrix of this data point set can be presented, practically, by 0 and 1, thus providing a complete separation of the 16 points of group A from the remaining 20 points of groups B, C and D. The evolutionary transformation of the 20 points of groups B, C and D after their separation from the points of group A results in formation of two loci: 1) 8 points of group B, and 2) 12 points of groups C and D. Upon the evolutionary transformation of the 12 points of the second locus, 8 points of group B get separated from 4 points of group C and 4 points of group D. Fig. 2b shows an isohierarchical picture of the process of division of the set of 36 scattered points: the higher the number of transformation-division-transformation cycles required for formation of a certain node of the hierarchical tree, the darker the area of points joined in that subcluster, and vice versa. Isohierarchies, as well as hierarchical trees and dendrograms, provide visualization of hierarchical structures constructed through the use of the ETSM algorithm.

Figs. 3a – 3c demonstrate the nonlinear dynamics of the evolutionary transformation, at the contrast value of 50, resulting in subdivision of the A-similarity values. The result shown in Fig. 3a was obtained by applying Eq. (1) in the AM-mode, whereas the results presented in Figs. 3b and 3c were produced in the GM-mode (see Methods, section 3.1). As is seen upon comparison of Figs. 3a and 3b, the dynamics of the evolutionary transformation are qualitatively the same in both modes, with only slight quantitative differences. A comparison of Figs. 3b and 3c shows that the dynamics of the separation of the group B points from groups C and D have the same regularities that were observed in Fig. 3b upon separation of the group A points from the rest of the points.

FIG. 3: Dynamics of evolutionary transformation of similarity coefficients of 36 points (shown in FIG. 2a) in the course of iterative processing by the ETSM algorithm. A-similarities [S] are shown to change depending on the number of transformations (T). Intergroup similarities were determined based on points "a", "b", "c" and "d", and intragroup similarities were determined based on pairs "a" and "a1", "b" and "b1", "c" and "c1", "d" and "d1" and are denoted as "aa", "bb", "cc" and "dd", respectively. Dissimilarity matrices were computed based on Euclidean distances. The value of contrast was 50 (see Methods, section 3.2). 3a) Evolutionary transformation of 36 scattered points in the AM-mode; 3b) Same in the GM-mode; 3c) Evolutionary transformation of 20 points of groups B, C and D in the GM-mode after separation from the group A points. [Plots: [S] from 0 to 1.0 versus T from 0 to 30.]

The above-demonstrated method for fully unsupervised hierarchical analysis significantly differs from the heretofore known clustering methods, including those that are commonly referred to as "unsupervised". Firstly, upon the very first transformation of the similarity matrix in the above example, the objects of the system under analysis become objects of a closed (i.e. isolated) cooperative system that immediately starts evolving into two separate closed subsystems representing the first two branches of the system's hierarchical tree. Secondly, due to the above-mentioned peculiarities of the ETSM algorithm, no object can have affinity to both subsystems, and therefore, after a certain number of transformations, no outliers, i.e. objects that tend to both of the two loci, are left. Thirdly, the number of nodes of the hierarchical tree (see, e.g., Fig. 2) corresponds to the number of successive transformation-division-transformation cycles, each of which results in formation of two loci, whereupon each newly formed locus is subjected to ETSM; thus, the number of nodes depends solely on the innate structure of an input data system and by no means is set at the analyst's discretion, as is the case with, for instance, k-means clustering. Finally, unlike all of the commonly accepted clustering methods that are aimed at organizing the diversity of a system's objects by sorting them, this method provides the evolution of the diversity of a system's objects as a whole, leading the system to transformation into two opposite subsystems. The above-described ETSM process represents a peculiar combination of convergent and divergent evolution of a system's objects, which is stimulated by the averaging of the objects' properties.

It is important to point out the following. Evolutionary transformation of any similarity matrix, regardless of the number of objects and parameters, occurs according to one and the same scenario. The above-demonstrated examples of nonlinear dynamics of ETSM (Figs. 3a – 3c) clearly indicate that there may be multitudes of similarity matrices whose evolutionary transformation will produce one and the same result. Thus, unlike chaotic nonlinear dynamics,

the ETSM nonlinear dynamics results not in an increase of diversity but, on the contrary, in unification of the objects of a complex system and in a dichotomy of trajectories of variations in a multiplicity of similarity matrices.

The above simple example of analysis of a dataset of scattered points demonstrates a mechanism of unsupervised construction of a hierarchical system that, in our opinion, meets the criteria for a natural hierarchical system. This mechanism can be used for discovery of hierarchies in any kind of database, including 3D and multidimensional systems, as well as spatial-temporal systems. The XR-metric (see Methods, section 3.4) coupled with construction of similarity matrices through hybridization of monomer matrices (see Methods, section 3.3) removes the problems that occur in such computations upon the use of Euclidean distances. As is seen from Fig. 4a, upon the increase of the number of scattered points to 115 and the use of Euclidean distances, the division into hierarchical groups is partially incomplete. The use of Euclidean distances in multidimensional systems provides even worse results. As is seen from Fig. 4b, the XR-metric is free from that drawback.

FIG. 4. Isohierarchies of 115 scattered points obtained with the use of the ETSM algorithm: 4a) based on a dissimilarity matrix computed by using Euclidean distances between the points in a 3D space and 4b) based on a hybrid similarity matrix computed with the use of the XR-metric (B = 1.1). [Plots: axes X, Y, Z.]

4.2. Hierarchical analysis of randomized datasets

It is important to realize that hierarchical grouping (subdivision) of mathematical points even in a 3D space depends on completely unpredictable factors. Those factors are determined by the relationships between all of the elements of a system under analysis, i.e. by their overall cooperative interactions. Certainly, those interactions are inherent, although in a hidden form, in any dataset under analysis, but they become detectable only after the first cycle of iterative averaging, which transforms the initial dataset into a closed system, i.e. a system in which the principle of holism is manifested in full.

It should be emphasized that the ETSM algorithm is not a modeling tool or a tool that enables a data analyst to choose the optimal solution among a number of possible solutions. This is the fundamental distinction of the ETSM algorithm from all the algorithms for data clustering. Data analysis performed through the iterative averaging procedure provides only one, final, result. That result will be logically meaningful only when the input data inherently contain a certain logical foundation, a certain meaning that needs to be extracted and presented for understanding. This factor is essential for understanding the unique capabilities of the 'matrix reasoning' technology as a new approach to data processing which seems to have come very close to that vague concept that is generally referred to as 'artificial intelligence'.

Clearly, it is impossible to provide a theoretical proof of the unique capabilities of 'matrix reasoning', since a conceptual plan contained in a given dataset cannot be calculated a priori, nor can those capabilities be proven by merely demonstrating a limited number of examples of practical application, as we do in this paper. The capabilities of this method can be evaluated only by trying it in processing of various kinds of information, using the techniques described in this paper. In this section, we will provide an example ex contrario by analyzing a dataset which, by definition, cannot carry any meaningful information. The example below is based on a set of randomized data.

Fig. 5 shows three hierarchical trees obtained by ETSM processing of data tables for 500 objects described by 500 parameters whose values in the range of 1 – 500 were generated by a random number generator. A dichotomous subdivision will certainly occur as a result of iterative averaging even if among the 500 objects there is only one pair of distinguishable objects. Therefore, there is nothing unusual in the fact that a set of randomized data can actually form hierarchical trees. However, as one can assume, hierarchical trees of randomized sets of the same number of data points will not greatly differ in the number of nodes and leaves and cannot convey any meaningful information, and, most importantly, it is practically impossible to obtain two identical hierarchical trees, as any randomized dataset will produce a unique hierarchical picture and can be easily identified. These assumptions proved to be true when we analyzed about 200 randomized datasets. Each of the resulting hierarchical trees was different. The three hierarchical trees shown in Fig. 5 have different shapes and consist of 39, 29 and 31 nodes and 304, 310 and 309 leaves, respectively.

FIG. 5. Hierarchical trees obtained through iterative averaging of three datasets, each of 500 objects described by 500 parameters whose values in the range from 1 to 500 were produced by a random number generator. Similarity matrices were computed by using the XR-metric (see section 3, Methods).

4.3. Comparative analysis of climates of U.S. cities

It follows from the above description of the ETSM methodology that formation of natural hierarchical structures for any given dataset processed by the iterative averaging algorithm always occurs according to one and the same scenario involving the same standard procedures, since our system of data processing is engineered to take scalability into account. This is achieved through the way we construct similarity matrices by hybridization of monomer similarity matrices computed for each individual parameter. Thus, the resulting hybridized similarity matrices are based on dimensionless similarity criteria. The analysis of data processed in this way is essentially based on comparison of the positions of objects in the rows, normalized within a range of 0 to 1, according to individual dimensionless characters. This allows concurrent processing of an unlimited number of parameters expressed in different units and thus eliminates the necessity of reducing the number of parameters by selecting the most representative ones, for which mathematical statistics usually employs discriminant analysis [44].

In this section, we will demonstrate quite a complex example of multiparameter computations by the ETSM method. We processed comparative climatic data provided by the U.S. National Climatic Data Center [45] for 100 U.S. cities of 42 states, including the following 108 climatic characteristics based on thirty-year averages for each parameter: morning and afternoon values of relative humidity, in per cent, for each month of the year (the total of 24 parameters); relative cloudiness, in per cent, based on average percentage of clear, partly cloudy and cloudy days per month (the total of 36 parameters); normal daily mean, minimum, and maximum temperatures in degrees of Fahrenheit

22 (the total of 36 parameters); and normal monthly Even a simplified example based on only three precipitation, in inches (the total of 12 meteorological parameters for only one month, parameters). All data are based on multiyear March, (see Fig. 6) shows that there are no records from the year 1970 through 2000. A apparent correlations between geographical similarity matrix was constructed according to the locations and the multiyear averages of the three method for hybridization of monomer similarity meteorological parameters: normal daily matrices with the use of XRmetric at B=1.5 (see maximum temperature ( oF), cloudy days per Methods sections 3.3. and 3.4.). month, and normal monthly precipitation (in The said computations involve а total of inches). 10,800 data points. Taking into account that each Fig. 7 shows a dendrogram of 100 U.S. cities data point represents an average based on 30 which was obtained based on all 108 abovesaid measurements, a total of the underlying data parameters processed by the ETSMalgorithm in points is over onethird of a million data points. It the unsupervised mode. As is seen from the should be also taken into account that deviations dendrogram, cities located in same states appear from average values significantly vary from to be in the same subclusters, and there are six parameter to parameter and from location to distinct climatic groups of states. location, thus making the input dataset an 1 AZ, Phoenix AZ, Yuma extremely chaotic system. A discovery of a AZ, Tucson NV, Las Vegas reasonably distinct correlation between the LA, Lake Charles LA, Baton Rouge GA, Macon FL, Apalachicola FL, Pensacola parameter values and geographic locations would FL, Jacksonville FL, Tallahassee MS, Jackson MS, Meridian be impossible by applying any of the currently 2 GA, Columbus GA, Augusta GA, Macon known data processing methods. 
The difficulty of OK, Oklahoma City OK, Tulsa TX, Dallas-Fort Worth establishing correlations in such a system is TX, San Angelo TX, Wichita Falls AL, Birmingham AP AL, Huntsville caused by the fact that the dynamics of variations AR, Fort Smith AR, Little Rock NC, Charlotte SC, Greenville-Spartanburg AP in values of meteorological parameters is NC, Raleigh WA, Quillayute extremely nonlinear, and each parameter's mean NM, Albuquerque 3 NV, Winnemucca NV, Elko NV, Ely values for any particular year are greatly CA, Sacramento CA, Stockton CA, Fresno influenced by that year's meteorological specifics. CA, Bakersfield CA, Redding IN, Fort Wayne IN, South Bend OH, Toledo OH, Dayton OH, Akron OH, Cleveland PA, Avoca PA, Pittsburg 10 PA, Williamsport Z MI, Muskegon NY, Buffalo NY, Binghamton WI, Madison WI, Milwaukee MI, Grand Rapids 8 2 2 6 MI, Flint MI, Lansing 2 2 2 IL, Springfield 22 4 NE, Lincoln NE, Omaha (North) 2 5 3 IL, Moline IL, Peoria 6 5 22 IA, Des Moines IA, Sioux City 6 5 2 2 4 2 MA, Worcester ME, Portland 4 5 2 2 VT 5 55 3 2 , Burlington 4 4 444 5 2 MN, Rochester 4 4 4 55 2 2 4 4 4 4 MN, Saint Cloud 4 4 4 4 2 ND, Bismarck SD, Aberdeen 4 4 4 4 5 5 2 4 4 4 5 3 KY, Lexington KY, Louisville 6 3 4 4 66 6 65 66 6 3 2 KY, Jackson WV, Charleston WV, Huntington 4 6 3 TN TN 2 6 6 3 3 , Knoxville , Nashville 5 6 VA, Lynchburg VA, Roanoke 2 22 1 MO, Springfield MO, St. Louis 3 3 1820 1 1 16 5 MO, Columbia MO, Kansas City 14 KS KS 1 10 12 8 , Concordia , Wichita 40 KS, Dodge City 50 60 6 70 80 Y KS, Goodland OR, Eugene OR, Medford X OR, Sexton Summit ID, Lewiston OR, Pendleton 6 ID, Boise UT, Salt Lake City ID, Pocatello FIG. 6. Relations between 30year (1970 – 2000) MT, Kalispell MT, Missoula CO, Pueblo MT, Glasgow averages of monthly (March) normal means of WY, Casper WY, Cheyenne daily maximum temperatures in oF ( x), cloudy days per month ( y) and precipitation (in inches) FIG. 7. Dendrogram of 100 cities of 42 states of (z) for 100 cities of 42 states of the USA. 
the USA, produced by the method of iterative averaging, based on 108 climatic parameters

(specified in the text). The resulting subclusters are indicated by numbers 1 through 6.

On the USA map in Fig. 8, we have labeled each of the states involved in the analysis by numbers 1 through 6 in accordance with the grouping shown by the dendrogram in Fig. 7. As is seen, the six groups cover the totality of groups of northern, central, southern, and western states, thus demonstrating perfect dechaotization of the input data. The fact that the obtained grouping is not a result of mechanical mathematical operations is especially obvious from viewing group 5, which joins together the central states as a narrow layer between the northern (group 4) and southern (group 2) states.

[Figure 8: map of the USA with each analyzed state labeled by its group number, 1 through 6.]

FIG. 8. Grouping of 42 states based on 108 climatic parameters. The states are labeled by the numbers that correspond to the group numbers in the dendrogram of the cities located in the respective states (Fig. 7).

Thus, we have demonstrated, on an example of a real-world dataset, that it is possible to synthesize an indivisible whole from a set of parts. In this example, the indivisible whole is the dendrogram of the cities according to their climatic peculiarities, together with the U.S. map showing the grouping of the U.S. states based on the same characteristics. Taking into account the high number and diversity of the involved parameters, this grouping is definitely not a result of any accidental coincidences.

4.4 Hierarchical analysis of population pyramids

Population pyramids are a graphic way to show the age/gender composition of a population and its age/gender structure (usually composed of 5-year cohorts) [46]. They look like nearly symmetrical bell curves and represent a basic tool in demography. The shape of a population pyramid is basically a result of birth, death, and migration rates. However, quantitative characteristics of evolutionary population pyramids significantly depend on various factors: ethnic, socioeconomic, ecological, climatic, political, and others, as well as on many events that are extremely difficult to evaluate and take into account. Therefore, population growth results always contain a great deal of unpredictable, chaotic components and dynamic instabilities. Even though population pyramids are believed to be a self-explanatory reflection of the state of a country's population, the use of population pyramids in correlative analysis is highly complicated. An example of demographic analysis presented in this section provides one more compelling demonstration of the efficiency of iterative averaging in extraction of information from highly complex data.

We processed demographic data on 72 countries, including 50 demographic parameters according to U.S. Census Bureau data for the year 2000 [47]. The 50 parameters are: population pyramid sections reflecting percentages of various age groups (in 5-year cohorts) from 0 to over 80 year-olds in the total male and female populations, respectively (total of 34 parameters); birth and death rates per 1000; life expectancy at birth; infant deaths; total fertility factor (total of 5 parameters); and dynamics of population growth in various years (in 1980, 1990-1999) compared to the year 2000 (total of 11 parameters). The list of 72 countries includes: 34 countries with predominantly Muslim populations; 21 European countries of the former Soviet bloc and former USSR republics with predominantly Christian populations; Israel, with predominantly Jewish population; and 16 European countries with free market economy and predominantly Christian populations. A similarity matrix for 72 countries was computed by using R-metric (see Methods, subsection III.4). As in all of the examples provided above, the entirety of the data was subjected to automated unsupervised processing by the ETSM algorithm.
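The parameter and country counts quoted for the demographic dataset can be cross-checked with a short bookkeeping sketch. It is only an illustration under stated assumptions: the cohort boundaries (0-4, 5-9, ..., 75-79, plus an open-ended 80+ group) are inferred from the phrase "5-year cohorts from 0 to over 80", and every name used here (`cohort_labels`, `PARAMETER_BLOCKS`, `COUNTRY_GROUPS`, the parameter labels) is hypothetical and not taken from the paper or from the U.S. Census Bureau files.

```python
# Bookkeeping sketch: reproduce the parameter and country counts stated in
# the text. Cohort boundaries and all names below are assumptions.

def cohort_labels(last_closed_start=75, width=5):
    """Return labels for 5-year cohorts 0-4 ... 75-79 plus an open 80+ group."""
    labels = [f"{a}-{a + width - 1}"
              for a in range(0, last_closed_start + 1, width)]
    labels.append(f"{last_closed_start + width}+")
    return labels


COHORTS = cohort_labels()

# Pyramid sections: one percentage per cohort, for males and for females.
pyramid = [f"{sex}_{c}" for sex in ("male", "female") for c in COHORTS]

# Vital-statistics block listed in the text (5 parameters).
vital = [
    "birth_rate_per_1000",
    "death_rate_per_1000",
    "life_expectancy_at_birth",
    "infant_deaths",
    "total_fertility",
]

# Population in 1980 and in 1990-1999, each compared to the year 2000.
growth_years = [1980] + list(range(1990, 2000))
growth = [f"population_{y}_vs_2000" for y in growth_years]

PARAMETER_BLOCKS = {"pyramid": pyramid, "vital": vital, "growth": growth}
N_PARAMETERS = sum(len(block) for block in PARAMETER_BLOCKS.values())

# Country groups and their sizes as given in the text.
COUNTRY_GROUPS = {
    "predominantly Muslim": 34,
    "former Soviet bloc and former USSR republics": 21,
    "Israel": 1,
    "European free market economies": 16,
}
N_COUNTRIES = sum(COUNTRY_GROUPS.values())

if __name__ == "__main__":
    for name, block in PARAMETER_BLOCKS.items():
        print(name, len(block))
    print("parameters:", N_PARAMETERS, "countries:", N_COUNTRIES)
```

Under these assumptions the three blocks add up to the 50 parameters and the four groups to the 72 countries reported in the text, so the input to the similarity computation would be a 72 x 50 table.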

The fact that the database under analysis was highly heterogeneous and extremely difficult for extraction of information is demonstrated by a small example in Fig. 9. It shows a dependency plot of two parameters that are very similar in their nature: the portions of males in the age groups of 30-34 and 35-39 in population pyramids of 72 countries. As is seen in Fig. 9, even such closely related parameters as male portions in the age groups of 30-34 and 35-39 of 72 countries do not display any definitive correlations.

[Figure 9: scatter plot. Horizontal axis: Male 30-34 age group; vertical axis: Male 35-39 age group; both axes range from 0.04 to 0.10.]

FIG. 9. Relations between male portions in the age groups of 30-34 and 35-39 of populations of 72 countries. Triangle-shape data points correspond to Muslim countries; squares, to the countries of the former Soviet bloc and former USSR republics; and circles, to the countries with historically free market economy.

An ETSM-produced dendrogram presented in Fig. 10 shows three distinct loci that exactly correspond to the above-indicated three groups of countries. It clearly distinguishes Muslim countries from the rest of the countries. The latter, in their turn, clearly show two groups consisting of: former Soviet bloc countries, and the countries with historically free market economy. Israel appears to be in the same subcluster as the former Soviet bloc countries, which seems to be logical, given a high percentage of immigrants from those countries in the total population of Israel, as well as certain peculiarities of its social policy. There are some other interesting "coincidences" in the grouping of the countries: e.g. a group including all the seaside countries of southern Europe (Italy, Portugal, Spain, and Greece); a group of countries that are different demographically and economically but are close geographically, historically and culturally (Jordan, Yemen, Iraq, Syria, Djibouti, Oman, Saudi Arabia); countries of the Maghrib; countries with oil-based economy; Muslim countries of South East Asia, including former USSR republics in Middle Asia; a well-defined grouping of Russia, Ukraine, Belarus and the Baltic states, etc. This data processing example is not aimed at interpretation of the obtained results; its purpose is to demonstrate that the above-presented unsupervised ETSM-analysis of demographic data has successfully identified three distinctive loci in the group of 72 countries and discovered certain natural logic (in the form of religious, cultural, political, economic, geographical, and other correlations) in the grouping of countries within the three main loci.

[Figure 10: dendrogram of the 72 countries with three main clusters marked 1, 2, and 3. Leaf labels as recovered from the figure: Afghanistan, Djibouti, Sudan, Somalia, Uzbekistan, Oman, Saudi Arabia, Iraq, Syria, Jordan, Yemen, Bahrain, Brunei, Kuwait, Qatar, Albania, Lebanon, Tunisia, Turkey, Azerbaijan, Kazakhstan, Algeria, Morocco, Iran, Libya, Bangladesh, Egypt, Malaysia, Indonesia, Kyrgyzstan, Turkmenistan, Uzbekistan, Pakistan, Tajikistan, United Arab Emirates, Armenia, Moldova, Bosnia & Herzegovina, Macedonia, Montenegro, Israel, Belarus, Russia, Ukraine, Lithuania, Estonia, Latvia, Bulgaria, Romania, Serbia, Poland, Slovakia, Croatia, Slovenia, Czech Republic, Hungary, Georgia, Austria, Switzerland, Luxembourg, Netherlands, Norway, Belgium, France, Denmark, UK, Finland, Sweden, Germany, Greece, Spain, Italy, Portugal.]

FIG. 10. Dendrogram of 72 countries, obtained by BfC-clustering based on 50 demographic characteristics.

All of these details reflect the successful dechaotization of the input data as a result of processing them with the ETSM algorithm. A simple mathematical process applied to an entirely entangled data array has produced a crystal clear picture of similarities and differences between the countries described by the data under analysis, showing that the quantitatively measured parameters perfectly corroborate common knowledge of the kind that cannot be reduced to plainly quantitative characteristics.

5. Some concluding remarks

The foregoing is a description of a fundamentally new approach to data processing, which has no analogs in present-day information science. The algorithmic basis of the presented method is fairly simple, and a reasonably informed reader who possesses basic programming skills can easily verify, even without the use of a computerized implementation of the method, such as, e.g. MeaningFinder 2.2, that iterative averaging concurrently applied to all of a system's elements always results in subdivision into two alternative groups. However, unlike the final result, the process of achieving it is very difficult to track down and monitor. Dichotomization, being a highly nonlinear process, is practically impossible to visualize, which certainly can slow down the assimilation of this very promising technology. A specialist educated on linear cause-effect principles will definitely have trouble perceiving the fact that in order to have a system subdivide into certain meaningful substructures, all of its elements need to be mixed into an indivisible whole.

Another quite puzzling peculiarity of the iterative averaging algorithm is the fact that, as is seen from the earlier provided practical application examples, despite its relative simplicity, this algorithm displays an expressly intelligent response, which would not be totally unexpected should it be some kind of a highly sophisticated computer program and not merely an algorithm based on a mechanical repetition of one and the same operation.

And finally, there is a third question that cannot be avoided. We have demonstrated that the averaging of a system's elements results in the formation of two heterogeneous groups, instead of homogenization of the system as one would expect based on common sense. This equals an assertion that there should be a certain underlying physical principle that, obviously, should be discoverable through adequate physical methods, for instance, in the course of studies on turbulence, quantum-mechanical effects, biological evolution processes, etc.

Each of the aforementioned issues certainly requires in-depth consideration and explanation; however, the purpose of this paper was to provide a detailed description of the new technology and to emphasize that the principle of holism, on which this technology is based every whit, is not just a tool for philosophical understanding of reality; instead, it is a reality of modern science and technology.

Even if, in addition to the few examples provided in this paper, we had given a few dozen examples from our decades-long work with the algorithm of iterative averaging, it would not add an iota of further knowledge that has yet to be discovered on this phenomenon, as the discovery of such knowledge would require a different kind of investigation. The fact that the algorithm of iterative averaging provided in this paper represents the most natural way of transition from linearity to nonlinearity in the real world is obvious. This fact has yet to be investigated from the standpoint of mathematics.

The remarkable intelligence potentials of the algorithm of iterative averaging, only in part demonstrated in this paper, also require special studies. Apparently, the key factor here is a natural hierarchy whose manifestation is facilitated by the algorithm. Although a strictly scientific and precise definition of natural hierarchy is probably impossible to provide, we pointed out the two most significant characteristics of a natural hierarchy: a system's closeness and capability for self-organization. If one applied the algorithm of iterative averaging to a database of words with known numbers and locations of characters, it would act only as a search engine sorting the words by length and character composition, since there is no natural hierarchy in such a database: the meanings of words are not a natural result of their character composition; instead, they were conventionally assigned to words by users of a respective language.

On a final note, it should be added that the method of data processing by iterative averaging offers a possibility of various implementations in knowledge discovery, the most interesting of which we will describe in the following papers of this series.

Acknowledgments

Thanks are due to Oleg Rogachev for his work on the software implementation of the presented methodology. I am also grateful to Michael Andreev for the graphic visualization work.

References

[1] A. C. Varzi. Mereology, Stanford Encyclopedia of Philosophy, 2003.
[2] A. C. Varzi. Mereological Commitments, Dialectica, 54:283-305, 2000.
[3] M. Winston, R. Chaffin, and D. Herrmann. A Taxonomy of Part-Whole Relations, Cognitive Science, 11:417-444, 1987.
[4] J. Fodor and E. Lepore. Holism: A Shopper's Guide. Basil Blackwell Inc., Cambridge, 1992.
[5] M. Esfeld. Holism in Philosophy of Mind and Philosophy of Physics. Kluwer Academic Publishers, Dordrecht, 2001.
[6] C. Dilworth. The Metaphysics of Science. An Account of Modern Science in Terms of Principles, Laws and Theories. Springer, p. 49, 1996.
[7] K. Koffka. Principles of Gestalt Psychology. Harcourt-Brace, New York, p. 176, 1935.
[8] Hon. J. C. Smuts. Holism and Evolution. Osmania University, Macmillan and Company Limited, 2nd edition, p. 158, 1927.
[9] Hon. J. C. Smuts. Holism and Evolution. Osmania University, Macmillan and Company Limited, 2nd edition, p. 152, 1927.
[10] A. J. Bahm. International Journal of General Systems, vol. 6, Issue 4, March, pp. 233-237, 1981.
[11] L. Andreev. Unsupervised automated hierarchical data clustering based on simulation of a similarity matrix evolution. U.S. Patent 6,640,227 (2003).
[12] L. Andreev and M. Andreev. Method and computer-based system for non-probabilistic hypothesis generation and verification. U.S. Patent 7,062,508 (2006).
[13] Ch. Darwin. The Origin of Species: A Facsimile of the First Edition, Harvard University Press, p. 172, 1964.
[14] A. Brouwer. General Paleontology. Translated from the 1959 edition by Kaye R.H., Oliver & Boyd, Edinburgh, London, p. 162, 1967.
[15] J. Goldstein. Emergence as a construct: history and issues. Emergence: Complexity and Organization, vol. 1, pp. 49-72, 1999.
[16] A. Blanco Salgueiro. Holism and dependency among properties. Logica Trianguli, 2, pp. 17-30, 1998.
[17] W. R. Ashby. Principles of the self-organizing dynamic system. Journal of General Psychology, vol. 37, pp. 125-128, 1947.
[18] L. von Bertalanffy. General System Theory. George Braziller, New York, 1969.
[19] I. Prigogine, G. Nicolis. Self-Organization in Non-Equilibrium Systems, Wiley & Sons, 1977.
[20] H. Haken. Synergetik. Springer-Verlag, Berlin, Heidelberg, New York, 1982.
[21] B. Mandelbrot. The Fractal Geometry of Nature. W. H. Freeman & Co., 1983.
[22] L. Smolin. The Trouble with Physics. Houghton Mifflin, 2006.
[23] M. Mitchell Waldrop. Complexity: The Emerging Science at the Edge of Order and Chaos. Touchstone, New York, 1993.
[24] J. W. N. Watkins. Metaphysics and the Advancement of Science, Brit. J. Phil. Sci., 26, pp. 91-121, 1975.
[25] P. Woit. Not Even Wrong: The Failure of String Theory & the Continuing Challenge to Unify the Laws of Physics, Jonathan Cape, London, 2006.
[26] S. Kauffman. At Home in the Universe: The Search for the Laws of Self-Organization and Complexity. Oxford University Press, 1995.
[27] J.-M. Lehn. Programmed chemical systems: Multiple subprograms and multiple processing/expression of molecular information. Chem. Eur. J., 6:2097-2102, 2000.
[28] J. R. Nitschke and J.-M. Lehn. Self-organization by selection: Generation of a metallosupramolecular grid architecture by selection of components in a dynamic library of ligands. Proc. Natl. Acad. Sci. USA, 100:11970-11974, 2003.

[29] F. Heylighen. The Science of Self-organization and Adaptivity. In: L. D. Kiel (ed.), Knowledge Management, Organizational Intelligence and Learning, and Complexity, in: The Encyclopedia of Life Support Systems (EOLSS). Eolss Publishers, Oxford, 2001. [http://www.eolss.net]
[30] H. H. Pattee. The Physics of Symbols: Bridging the Epistemic Cut. Biosystems, vol. 60, pp. 5-21, 2001.
[31] Z. N. Oltvai and A.-L. Barabasi. Life's complexity pyramid. Science, 298:763-764, 2002.
[32] M. R. Wilkins, K. L. Williams, R. D. Appel, and D. F. Hochstrasser (Eds). Proteome Research: New Frontiers in Functional Genomics. Springer-Verlag, 1997.
[33] A. Koestler. The Ghost in the Machine. Hutchinson, London, p. 68, 1967.
[34] M. Max-Neef. Human Scale Development, The Apex Press, 1991.
[35] A. Koestler and J. R. Smythies (Eds). Beyond Reductionism. New Perspectives in the Life Sciences. The Alpbach Symposium 1968. Hutchinson, London, 1969.
[36] A. Koestler. Janus: A Summing Up. Random House, New York, 1978.
[37] K. Wilber. Sex, Ecology, Spirituality: The Spirit of Evolution. Shambhala, Boston, 1995.
[38] V. A. Glotov. Structural analysis of microvascular bifurcations (In Russian. Strukturny analiz mikrovaskulyarnykh bifurkatsiy), AO Amipress, Smolensk, p. 178, 1995.
[39] S. C. Johnson. Hierarchical Clustering Schemes. Psychometrika, 2:241-254, 1967.
[40] B. Mirkin. Mathematical Classification and Clustering. Kluwer Academic Publishers, Boston, 1996.
[41] J. Han and M. Kamber. Data Mining. Morgan Kaufmann Publishers, 2001.
[42] L. Andreev. High-dimensional data clustering with the use of hybrid similarity matrices. U.S. Patent 7,003,509 (2006).
[43] R. E. Bellman. Adaptive Control Processes. Princeton University Press, Princeton, NJ, 1961.
[44] W. R. Klecka. Discriminant Analysis (Quantitative Applications in the Social Sciences), Sage Publications Inc., London, 1980.
[45] Comparative Climatic Data for the United States through 2005. National Climatic Data Center. http://www.ncdc.noaa.gov/oa/ncdc.html
[46] N. Keyfitz. Applied Mathematical Demography. Springer, New York, 1985.
[47] U.S. Census Bureau, IDB Summary Demographic Data and IDB Population Pyramids. http://www.census.gov/ipc/www/idb/
