Towards an Understanding of Ideas by Machines
Piotr Andrzej Felisiak, Beishui Liao
Institute of Logic and Cognition, Xixi Campus, Zhejiang University
February 8, 2019

Presentation Outline
1. Introduction
    Previous work: generalized multiset theory
    Automated idea understanding
2. Semantic Text Comparator
    The Empiria system: implementation of relations between ideas
    Training methods
3. Scientific Investigation
    Research plan
    Summary
4. References

Multisets
Informal definition: a multiset is a collection of objects in which an element may occur more than once.
Figure: An Euler diagram of the multiset {x, x, y, y, y, z}.

The number of times an element occurs in a multiset is called its multiplicity and belongs to the set of all natural numbers.
Multisets may be considered a generalization of sets.
A formal (axiomatic) theory of multisets has been given by W. D. Blizard [1].

Axiomatic generalizations of multiset theories
Multiplicities of elements belong to the set of all positive and negative integers (which implies the possibility of negative membership) [3].
Multiplicities belong to the set of all positive real numbers [2].
Multiplicities belong to the set of all positive and negative real numbers [4].

Additive union of generalized multisets
Figure: An Euler diagram of the additive union of a generalized multiset {{x, x}, {z, x, z}, {y}}.

Conjectures
1. Simple ideas¹ can be produced by neural networks.
2. Neural networks can evaluate relations among ideas².
3. Understanding a newly perceived sensation, i.e. sensory input, is a process of finding a similarity relation between the perceived sensation and one or more previously experienced sensations or ideas.
¹ In the sense of John Locke.
² Thus we can construct complex ideas in the sense of John Locke.
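As a concrete illustration of the multiset notions above, the following Python sketch represents a multiset as a mapping from elements to multiplicities. It assumes that the additive union sums multiplicities element-wise over the member multisets; the function names `multiplicities` and `additive_union`, and the signed example at the end, are illustrative and not taken from the slides.

```python
from collections import Counter


def multiplicities(elements):
    """Return {element: multiplicity} for an ordinary multiset given as a list."""
    return dict(Counter(elements))


def additive_union(members):
    """Sum multiplicities element-wise over an iterable of multisets.

    Assumption: each member is a dict {element: multiplicity}. Plain dict
    arithmetic is used instead of Counter addition because `Counter + Counter`
    discards non-positive counts, which generalized multisets must keep.
    """
    total = {}
    for member in members:
        for element, multiplicity in member.items():
            total[element] = total.get(element, 0) + multiplicity
    return total


# The multiset {x, x, y, y, y, z} from the first figure.
m = multiplicities(["x", "x", "y", "y", "y", "z"])

# The members {x, x}, {z, x, z}, {y} from the additive-union figure.
family = [
    multiplicities(["x", "x"]),
    multiplicities(["z", "x", "z"]),
    multiplicities(["y"]),
]
union = additive_union(family)

# Hypothetical generalized multisets with negative and real multiplicities.
signed = additive_union([{"x": 2}, {"x": -3, "y": 0.5}])

print(m, union, signed)
```

Under this reading, the additive union of the three members gives x a multiplicity of 3, z a multiplicity of 2 and y a multiplicity of 1, while the signed example shows how a negative multiplicity can arise.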
A general example

Semantic relationship
A more specific example: semantic comparator

Core architecture
A hybrid of CNN and MLP

Convolutional modules

Perceptrons

Training rule
Steepest descent algorithm:
$$P_{\mathrm{new}} = P_{\mathrm{old}} - \alpha \frac{\partial E}{\partial P} \qquad (1)$$
where:
$P$ is a network parameter, e.g. a weight, a bias or an element of a convolutional filter, $P \in \mathbb{R}$,
$E$ is the squared error between desired and actual output, $E \in \mathbb{R}$,
$\alpha$ is the learning rate, $\alpha \in \mathbb{R}$, $\alpha > 0$.
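The training rule (1) can be illustrated on a single parameter. The sketch below is a minimal Python example, not the Empiria implementation: it assumes a hypothetical one-weight model y = P·x with squared error E = (P·x − t)², for which the derivative is available in closed form. The names `steepest_descent_step`, `squared_error_grad`, and the values of x, t and α are chosen only for illustration.

```python
def steepest_descent_step(p_old, grad, alpha):
    """Training rule (1): P_new = P_old - alpha * dE/dP."""
    return p_old - alpha * grad


def squared_error(p, x, target):
    """E = (p * x - target)^2 for the hypothetical one-weight model y = p * x."""
    return (p * x - target) ** 2


def squared_error_grad(p, x, target):
    """Analytic derivative dE/dp = 2 * (p * x - target) * x."""
    return 2.0 * (p * x - target) * x


# Hypothetical training pair and learning rate.
x, target, alpha = 2.0, 6.0, 0.1

p = 0.0
first_step = steepest_descent_step(p, squared_error_grad(p, x, target), alpha)
for _ in range(50):
    p = steepest_descent_step(p, squared_error_grad(p, x, target), alpha)

print(first_step, p, squared_error(p, x, target))
```

With these values each step moves P towards 3, the value at which the error for this training pair vanishes; a larger α would converge faster up to a point and then diverge.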
The Empiria training method
Algorithm
Suppose that we have a set $\{P_1, P_2, \ldots, P_n\}$ of network parameters, $E$ is measured for a single training pair, and $\Delta > 0$. Then:
1. Set $i$ to 1.
2. Measure $E(P_i)$.
3. Measure $E(P_i + \Delta)$.
4. Estimate $\frac{\partial E}{\partial P_i}$ as $\frac{E(P_i + \Delta) - E(P_i)}{\Delta}$.
5. Apply the training rule (1) to obtain a new value of $P_i$.
6. If $i < n$, then increment $i$ by 1 and go to step 2; otherwise finish.

The Empiria training method
Benefits
1. The algorithm is conceptually much simpler than backpropagation and thus easier to implement.
2. The simplicity and generality of the algorithm allow easy modification of the network architecture without significant changes to the training algorithm.
3. The algorithm does not require a differentiable activation function.
Drawbacks
1. Tests showed that training with the Empiria training algorithm is slower than training with backpropagation.

Scientific questions
1. To what degree are the conjectures true? First of all, is that question answerable?
2. Can a machine understand a text by evaluating a semantic similarity relation with respect to another text or other data?
3. A universe of philosophical questions; e.g. does our AI machine actually perceive the world (as humans do)?
If not, then, according to George Berkeley, our machine does not exist, since it is only an idea in the minds of perceivers; may an idea then exist in the machine, viz. can an idea exist inside an idea?

Research plan: Development of semantic comparator prototype
1. Development of:
    convolutional layers,
    pooling layers,
    multilayer perceptron,
    network training algorithms,
    a system for pre-processing of input data, including e.g. file reading programs and word embedding,
    a graphical user interface,
2. parallelization of calculations,
3. preparation of a set of training examples,
4. training of the network,
5. evaluation of the network performance,
6. manual adjustment of the network hyperparameters.

Research plan: Improvements and extension to arbitrary relations
1. Experiments involving relations between ideas derived from text and ideas derived from other sources, such as pictures or speech (heterogeneous sensations),
2. an application of genetic algorithms to the training problem, since such an approach is still in its infancy [6],
3. application of the dropout method to prevent overfitting,
4. an introduction of the technology of capsules of neurons [5],
5. an additional CNN for recognition of words and symbols in source texts, which may facilitate exploitation of the graphical features of text and make it possible to avoid word embedding.
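As a concrete reference point for the training-related items in this plan, the Empiria training method (steps 1 to 6 of the algorithm above) can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' code: the function name `empiria_pass`, the toy linear model y = w·x + b, the single training pair, and the values of α and Δ are all hypothetical.

```python
def empiria_pass(params, error, alpha, delta):
    """Run steps 1-6 once over `params` (a list of floats), updating in place.

    `error` is any callable mapping the full parameter list to the squared
    error E for a single training pair; no derivative of it is needed.
    """
    for i in range(len(params)):                      # steps 1 and 6: i = 1..n
        p_old = params[i]
        e_base = error(params)                        # step 2: E(P_i)
        params[i] = p_old + delta
        e_shifted = error(params)                     # step 3: E(P_i + delta)
        grad_estimate = (e_shifted - e_base) / delta  # step 4: forward difference
        params[i] = p_old - alpha * grad_estimate     # step 5: training rule (1)
    return params


# Hypothetical toy model y = w * x + b and a single training pair (x, target).
x, target = 2.0, 7.0


def error(params):
    w, b = params
    return (w * x + b - target) ** 2


params = [0.0, 0.0]
initial_error = error(params)
for _ in range(100):
    empiria_pass(params, error, alpha=0.05, delta=1e-4)
final_error = error(params)

print(initial_error, final_error)
```

In this sketch every parameter update costs two evaluations of the error, so one pass over n parameters needs 2n forward evaluations. That cost may be one reason for the slower training reported among the drawbacks, although the slides do not state the cause.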
Research plan: Automatic optimization of hyperparameters
Development of a genetic algorithm for automatic optimization of hyperparameters. This would involve:
design of:
1. a data structure for representation of network hyperparameters,
2. a method for creation of new generations of solutions, using elitism, selection, crossover and mutation,
3. a fitness function,
4. a method for adaptive tuning of genetic algorithm parameters (probabilities of mutation and crossover); this step may be optional,
implementation and evaluation of the genetic algorithm.

Summary
1. Scientific questions concerning cognitive problems have been formulated.
2. A research plan, motivated by the desire to answer these questions, has been proposed.
3. The research plan has been partially realised.
4. The first interesting result of the endeavor is development and preliminary