Towards an Understanding of Ideas by Machines

Piotr Andrzej Felisiak, Beishui Liao

Institute of Logic and Cognition, Xixi Campus, Zhejiang University

February 8, 2019

Presentation Outline

1 Introduction
    Previous work: generalized multiset theory
    Automated idea understanding

2 Semantic Text Comparator
    The Empiria system: implementation of relations between ideas
    Training methods

3 Scientific Investigation
    Research plan
    Summary

4 References

Multisets

Informal definition: a multiset is a collection of objects in which an element may occur more than once.

Figure: An Euler diagram of a multiset {x, x, y, y, y, z}.

The number of times an element occurs in a multiset is called its multiplicity and belongs to the set of all natural numbers. Multisets may be considered a generalization of sets. A formal (axiomatic) theory of multisets has been given by W. D. Blizard [1].
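As a small illustration of multiplicity, the multiset {x, x, y, y, y, z} from the figure can be stored as a mapping from elements to natural-number counts. The sketch below uses Python's `collections.Counter` for this; the helper name `multiplicity` is an assumption introduced here and does not come from the presentation.

```python
from collections import Counter

# The multiset {x, x, y, y, y, z}, stored as element -> multiplicity.
multiset = Counter(["x", "x", "y", "y", "y", "z"])

def multiplicity(ms, element):
    """Number of times `element` occurs in `ms` (0 if it does not occur)."""
    return ms[element]

print(multiplicity(multiset, "y"))
print(multiplicity(multiset, "w"))
# An ordinary set is the special case in which every multiplicity is 1.
print(set(multiset))
```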

Axiomatic generalizations of multiset theories

Multiplicities of elements belong to the set of all positive and negative integers, which implies the possibility of negative membership [3].
Multiplicities belong to the set of all positive real numbers [2].
Multiplicities belong to the set of all positive and negative real numbers [4].
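The following sketch only illustrates what wider multiplicity domains look like as data; it is not an implementation of any of the cited axiomatic theories. The names `signed`, `real_valued` and `multiplicity` are assumptions made for this example.

```python
from collections import Counter

# Integer multiplicities that may become negative, in the spirit of [3].
# Counter.subtract keeps zero and negative counts instead of dropping them.
signed = Counter({"x": 2, "y": 1})
signed.subtract({"y": 3})

# Real-valued multiplicities, in the spirit of [2] and [4], as a plain dict.
real_valued = {"x": 0.5, "y": -1.25}

def multiplicity(ms, element):
    """Multiplicity of `element` in `ms`; unlisted elements have multiplicity 0."""
    return ms.get(element, 0)

print(multiplicity(signed, "y"))
print(multiplicity(real_valued, "z"))
```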

Additive union of generalized multisets

Figure: An Euler diagram of the additive union of a generalized multiset {{x, x}, {z, x, z}, {y}}.
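The figure's example can be sketched in code under one assumption: that the additive union of a multiset of multisets adds up, element by element, the multiplicities found in its members. The function name `additive_union` is hypothetical, and the sketch covers only natural-number multiplicities.

```python
from collections import Counter

def additive_union(family):
    """Sum the multiplicities of every element over all member multisets."""
    total = Counter()
    for member in family:
        total.update(member)  # Counter.update adds counts rather than replacing them
    return total

# The family {{x, x}, {z, x, z}, {y}} from the figure caption.
family = [Counter("xx"), Counter("zxz"), Counter("y")]
result = additive_union(family)
print(dict(result))  # x occurs 3 times, z twice, y once
```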

Conjectures

1 Simple ideas¹ can be produced by neural networks.
2 Neural networks can evaluate relations among ideas².
3 Understanding a newly perceived sensation, i.e. a sensory input, is the process of finding a similarity relation between that sensation and one or more previously experienced sensations or ideas.

¹ In the sense of John Locke.
² Thus we can construct complex ideas in the sense of John Locke.

A general example

Semantic relationship
A more specific example: semantic comparator

Core architecture
A hybrid of CNN and MLP

Convolutional modules

Perceptrons

Training rule
Steepest descent algorithm

$$P_{\text{new}} = P_{\text{old}} - \alpha \frac{\partial E}{\partial P} \tag{1}$$

where:
$P$ is a network parameter, e.g. a weight, a bias or an element of a convolutional filter, $P \in \mathbb{R}$,
$E$ is the squared error between the desired and actual output, $E \in \mathbb{R}$,
$\alpha$ is the learning rate, $\alpha \in \mathbb{R}$, $\alpha > 0$.
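A minimal sketch of training rule (1) for a single parameter follows. The quadratic error E(P) = (P - 3)^2 and the values of the learning rate and iteration count are toy assumptions chosen for illustration; they are not taken from the Empiria system.

```python
def steepest_descent_step(p_old, grad, alpha):
    """Training rule (1): P_new = P_old - alpha * dE/dP."""
    if alpha <= 0:
        raise ValueError("the learning rate alpha must be positive")
    return p_old - alpha * grad

# Toy error (assumption): E(P) = (P - 3)^2, so dE/dP = 2 * (P - 3).
p = 0.0
for _ in range(50):
    p = steepest_descent_step(p, 2.0 * (p - 3.0), alpha=0.1)

print(p)  # approaches the minimizer P = 3
```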

The Empiria training method
Algorithm

Suppose that we have a set $\{P_1, P_2, \ldots, P_n\}$ of network parameters, $E$ is measured for a single training pair, and $\Delta > 0$. Then:
1 Set $i$ to 1.
2 Measure $E(P_i)$.
3 Measure $E(P_i + \Delta)$.
4 Estimate $\frac{\partial E}{\partial P_i}$ as $\frac{E(P_i + \Delta) - E(P_i)}{\Delta}$.
5 Apply the training rule (1) to obtain a new value of $P_i$.
6 If $i < n$, then increment $i$ by 1 and go to step 2; otherwise finish.
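Steps 1 to 6 can be sketched as a single pass over the parameters, as below. The one-neuron linear model, the training pair, and the values of `alpha` and `delta` are toy assumptions made for this example, and the function name `empiria_pass` is hypothetical; the actual Empiria implementation is not shown in the slides.

```python
def empiria_pass(params, error_fn, alpha, delta):
    """One pass of steps 1-6: update each parameter using a forward difference."""
    for i in range(len(params)):                      # steps 1 and 6: i = 1..n
        e_base = error_fn(params)                     # step 2: E(P_i)
        original = params[i]
        params[i] = original + delta
        e_shifted = error_fn(params)                  # step 3: E(P_i + delta)
        grad_estimate = (e_shifted - e_base) / delta  # step 4: estimate dE/dP_i
        params[i] = original - alpha * grad_estimate  # step 5: training rule (1)
    return params

# Toy setting (assumption): output = w * x + b, one training pair (x, target).
x, target = 2.0, 1.0

def squared_error(params):
    w, b = params
    return (w * x + b - target) ** 2

params = [0.0, 0.0]
for _ in range(200):
    empiria_pass(params, squared_error, alpha=0.05, delta=1e-4)

print(squared_error(params))  # close to zero after training
```

The pass only ever evaluates the error, never an analytic derivative, which is why the network architecture can change without touching the training code. The cost is two error measurements per parameter per pass.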

Benefits
1 The algorithm is conceptually much simpler than backpropagation and thus easier to implement.
2 The simplicity and generality of the algorithm allow the network architecture to be modified easily without significant changes to the training algorithm.
3 The algorithm does not require a differentiable activation function.

Drawbacks
1 Tests showed that training with the Empiria training algorithm is slower than training with backpropagation.

Scientific questions

1 To what degree are the conjectures true? First, is that question answerable?
2 Can a machine understand a text by evaluating a semantic similarity relation with respect to another text or other data?
3 A universe of philosophical questions; e.g. does our AI machine actually perceive the world (as humans do)? If not, then, according to George Berkeley, our machine does not exist, since it is only an idea in the minds of perceivers; can an idea then exist in the machine, i.e. can an idea exist inside an idea?

Research plan
Development of semantic comparator prototype

1 Development of: convolutional layers, pooling layers, a multilayer perceptron, network training algorithms, a system for pre-processing input data (e.g. file-reading programs, word embedding, etc.), and a graphical user interface,
2 parallelization of calculations,
3 preparation of a set of training examples,
4 training of the network,
5 evaluation of the network performance,
6 manual adjustment of the network hyperparameters.

Research plan
Improvements and extension to arbitrary relations

1 Experiments involving relations between ideas derived from text and ideas derived from other sources, such as pictures or speech (heterogeneous sensations),
2 application of genetic algorithms to the training problem, given that such an approach is still in its infancy [6],
3 application of the dropout method to prevent overfitting,
4 introduction of the technology of capsules of neurons [5],
5 an additional CNN for recognition of words and symbols in source texts, which may make it possible to exploit graphical features of the text and to avoid the need for word embedding.

Research plan
Automatic optimization of hyperparameters

Development of a genetic algorithm for automatic optimization of hyperparameters. This would involve:

Design of:
1 a data structure for representation of network hyperparameters,
2 a method for creation of new generations of solutions, using elitism, selection, crossover and mutation,
3 a fitness function,
4 a method for adaptive tuning of genetic algorithm parameters (probabilities of mutation and crossover); this step may be optional.

Implementation and evaluation of the genetic algorithm.
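To make design items 1 to 3 concrete, here is a minimal sketch of such a genetic algorithm. Everything specific in it is an assumption: the two hyperparameters in `SPACE`, the placeholder `fitness` function (a real one would train and evaluate the network), tournament selection, uniform crossover, and all numeric settings. The optional adaptive tuning of item 4 is left out.

```python
import random

# Hypothetical search space: hyperparameter name -> (lower bound, upper bound).
SPACE = {"learning_rate": (1e-4, 1e-1), "hidden_units": (4, 128)}

def random_individual(rng):
    """Item 1: a dict is used as the data structure for one set of hyperparameters."""
    return {"learning_rate": rng.uniform(*SPACE["learning_rate"]),
            "hidden_units": rng.randint(*SPACE["hidden_units"])}

def fitness(ind):
    """Item 3 (placeholder): peaks at learning_rate = 0.01 and hidden_units = 32."""
    return (-abs(ind["learning_rate"] - 0.01) * 100
            - abs(ind["hidden_units"] - 32) / 32)

def tournament(pop, rng, k=3):
    """Selection: the fittest of k randomly drawn individuals."""
    return max(rng.sample(pop, k), key=fitness)

def crossover(a, b, rng):
    """Uniform crossover: each hyperparameter is copied from one parent at random."""
    return {key: (a if rng.random() < 0.5 else b)[key] for key in a}

def mutate(ind, rng, p_mut):
    """Mutation: perturb each hyperparameter with probability p_mut, within bounds."""
    child = dict(ind)
    if rng.random() < p_mut:
        lo, hi = SPACE["learning_rate"]
        scaled = child["learning_rate"] * rng.uniform(0.5, 2.0)
        child["learning_rate"] = min(hi, max(lo, scaled))
    if rng.random() < p_mut:
        lo, hi = SPACE["hidden_units"]
        shifted = child["hidden_units"] + rng.randint(-8, 8)
        child["hidden_units"] = min(hi, max(lo, shifted))
    return child

def evolve(generations=30, pop_size=20, n_elite=2, p_cross=0.9, p_mut=0.2, seed=0):
    """Item 2: build new generations using elitism, selection, crossover, mutation."""
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        next_pop = pop[:n_elite]                      # elitism: keep the best as-is
        while len(next_pop) < pop_size:
            a, b = tournament(pop, rng), tournament(pop, rng)
            child = crossover(a, b, rng) if rng.random() < p_cross else dict(a)
            next_pop.append(mutate(child, rng, p_mut))
        pop = next_pop
    return max(pop, key=fitness)

best = evolve()
print(best, fitness(best))
```

Because the elite individuals are carried over unchanged, the best fitness in the population cannot decrease from one generation to the next.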

Summary

1 Scientific questions concerning cognitive problems have been formulated.
2 A research plan, motivated by the desire to answer these questions, has been proposed.
3 The research plan has been partially carried out.
4 The first interesting result of the endeavor is the development and preliminary evaluation of a non-classical training algorithm.

References

[1] Wayne D. Blizard. Multiset theory. Notre Dame J. Formal Logic, 30(1):36–66, December 1988.

[2] Wayne D. Blizard. Real-valued multisets and fuzzy sets. Fuzzy Sets and Systems, 33(1):77–97, 1989.

[3] Wayne D. Blizard. Negative membership. Notre Dame J. Formal Logic, 31(3):346–368, June 1990.

[4] Piotr Andrzej Felisiak, Kaiyu Qin, and Gun Li. Generalized multiset theory. 2018. (Unpublished; under review by Fuzzy Sets and Systems.)

[5] Sara Sabour, Nicholas Frosst, and Geoffrey E. Hinton. Dynamic routing between capsules. In Proceedings of the 31st Conference on Neural Information Processing Systems, 2017.

[6] Felipe Petroski Such, Vashisht Madhavan, Edoardo Conti, Joel Lehman, Kenneth O. Stanley, and Jeff Clune. Deep neuroevolution: Genetic algorithms are a competitive alternative for training deep neural networks for reinforcement learning. CoRR, abs/1712.06567, 2017.