Quantum Probabilistic Graphical Models for Cognition and Decision
Catarina Alexandra Pinto Moreira
Supervisor: Doctor Andreas Miroslaus Wichert
Thesis specifically prepared to obtain the PhD Degree in Information Systems and Computer Engineering
Draft
August, 2017
Dedicated to all who contributed to my education
"If you can dream - and not make dreams your master;
If you can think - and not make thoughts your aim;
If you can meet with Triumph and Disaster,
And treat those two impostors just the same;
(...)
If you can force your heart and nerve and sinew
To serve your turn long after they are gone,
And so hold on when there is nothing in you
Except the Will which says to them: 'Hold on!'
(...)
Then, yours is the Earth and everything that's in it,
And - which is more - you'll be a Man, my son!"
"If -" by Rudyard Kipling
Title: Quantum-Like Probabilistic Graphical Models for Cognition and Decision
Name: Catarina Alexandra Pinto Moreira
PhD in Information Systems and Computer Engineering
Supervisor: Doctor Andreas Miroslaus Wichert
Abstract
Cognitive scientists focus mainly on developing models and cognitive structures that can represent processes of the human mind. One of these processes is human decision making. In recent decades, the literature has reported several situations in which human decisions could not be easily modelled by classical models, because humans constantly violate the laws of probability theory in situations with high levels of uncertainty. In this sense, quantum-like models started to emerge as an alternative framework, based on the mathematical principles of quantum mechanics, to model and explain paradoxical findings that cognitive scientists were unable to explain using the laws of classical probability theory.
Although quantum-like models have succeeded in explaining many paradoxical decision-making scenarios, they still suffer from three main problems. First, they cannot scale to more complex decision scenarios, because the number of quantum parameters grows exponentially. Second, they cannot be considered predictive, since they require that we know a priori the outcome of a decision problem in order to manually set the quantum parameters. Third, how to set these quantum parameters is still unexplored and remains an open research question in the Quantum Cognition literature.
This work focuses on quantum-like probabilistic graphical models by surveying the most important aspects of classical probability theory, quantum-like models applied to human decision making, and probabilistic graphical models. We also propose a Quantum-Like Bayesian Network that can easily scale up to more complex decision-making scenarios due to its network structure. To address the problem of exponentially many quantum parameters, we also propose heuristic functions that can set an exponential number of quantum parameters without a priori knowledge of experimental outcomes. This makes the proposed model general and predictive, in contrast with current state-of-the-art models, which cannot be generalised to more complex decision-making scenarios and can only provide an explanatory account of the observed paradoxes.
Keywords: Quantum Cognition, Quantum-Like Bayesian Networks, Quantum Probability, Quantum Interference Effects, Quantum-Like Models
Title: Modelos Gráficos Probabilísticos Quânticos para Cognição e Decisão
Name: Catarina Alexandra Pinto Moreira
Degree: Doutoramento em Engenharia Informática e de Computadores
Supervisor: Doctor Andreas Miroslaus Wichert
Resumo (Summary)
Cognitive scientists focus mainly on developing models and cognitive structures capable of representing processes of the human mind. One of these processes concerns how humans make decisions. In recent decades, the literature has reported several situations of human decision making that cannot be easily modelled by classical models, because humans constantly violate the laws of classical probability theory in situations with high levels of uncertainty. In this sense, quantum models began to emerge as an alternative approach, based on the mathematical principles of quantum mechanics, to model and explain paradoxical situations that cognitive scientists cannot explain using the laws of classical probability theory.
Although quantum models have managed to explain many paradoxical scenarios of human decision making, they still suffer from three main problems. First, they cannot scale to more complex decision scenarios, because the number of quantum parameters grows exponentially with the complexity of the decision problem. Second, they cannot be considered predictive, since they require that we know a priori the outcome of a decision problem in order to manually set the quantum parameters that serve to explain the paradoxical results. Third, the way these quantum parameters can be set is an unexplored field and remains an open research question in the literature on quantum cognitive models.
This work focuses on quantum probabilistic graphical models and consists of a survey of the most important aspects of classical probability theory, of quantum models applied to human decision making, and of classical probabilistic graphical models. We also propose a quantum Bayesian network that can easily scale to more complex decision scenarios thanks to its network structure. To address the problem of assigning values to an exponential number of quantum parameters, we also propose heuristic functions that can set an exponential set of quantum parameters without a priori knowledge of experimental results. This makes the proposed model general and predictive, in contrast with current state-of-the-art models, which cannot be generalised to more complex decision-making scenarios and can only provide an explanatory account of the observed paradoxes.
Keywords: Quantum Cognition, Quantum Bayesian Networks, Quantum Probability, Quantum Interference Effects, Quantum Models
Resumo Estendido (Extended Summary)
Quantum cognition is a research area that aims to use the mathematical principles of quantum mechanics to model cognitive systems for human decision making. Because classical probability theory is very rigid, in the sense that it imposes many constraints and assumptions (the single path trajectory principle, adherence to set theory, etc.), it becomes very limiting (or even impossible) to develop simple models that capture human judgments and decisions, since people can violate the laws of logic and of probability theory [33, 37, 6].
Quantum probability theory has several advantages over the classical theory. It can represent events in vector spaces. Consequently, it can take order effects into account [202, 188] and represent the amplitudes of experimental outcomes simultaneously through a superposition. Psychologically, the superposition effect can be related to a feeling of confusion, uncertainty or ambiguity. That is, it can represent the notion of belief as an indefinite state [34]. Moreover, this vector space representation obeys neither the distributive axiom of Boolean logic nor the law of total probability. This allows the construction of more general models that can mathematically explain cognitive phenomena such as conjunction/disjunction errors [40, 73] or violations of the Sure Thing Principle [164, 110], which is the main focus of this work.
One problem of current probabilistic systems is that they cannot make accurate predictions in situations where the laws of classical probability are violated. These situations occur frequently in systems that try to model human decisions in scenarios where the Sure Thing Principle [170] is violated. This principle is fundamental in classical probability theory and states that if one prefers action A over action B under the state of the world X, and also prefers action A over B under the complementary state of the world ¬X, then one should always prefer action A over B, even when the state of the world is not known. Violations of the Sure Thing Principle imply violations of the classical law of total probability [193, 196, 198, 9, 26].
Accordingly, this work proposes a quantum Bayesian network inspired by the formalism of Feynman's path integrals [72]. A Bayesian network can be understood as a directed acyclic graph in which each node represents a random variable and each edge represents a direct influence of the source node on the target node (conditional dependence). Feynman's path integrals, in turn, represent all possible paths that a particle can take to reach a destination point, taking into account that all these paths can produce quantum interference effects among them.
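The contrast between the classical law of total probability and a quantum-like inference with an interference term can be illustrated with a minimal Python sketch. It assumes the commonly used two-path form, in which squaring the magnitude of the summed amplitudes adds a term 2·sqrt(p1·p2)·cos(θ) to the classical sum, followed by normalisation. The function names, the shared θ for both outcomes, and the numeric values are hypothetical illustrations, not values taken from the experiments discussed in this thesis.

```python
import math


def classical_total_probability(p_x, p_a_given_x, p_a_given_not_x):
    """Classical law of total probability: Pr(A) = Pr(X)Pr(A|X) + Pr(not X)Pr(A|not X)."""
    return p_x * p_a_given_x + (1 - p_x) * p_a_given_not_x


def quantum_like_probability(p_x, p_a_given_x, p_a_given_not_x, theta):
    """Quantum-like Pr(A): classical sum plus an interference term, then normalised.

    Assumption: the same phase theta is used for both outcomes (A and not A).
    The normalisation factor can be zero in degenerate cases, which this
    sketch does not guard against.
    """
    def unnormalised(cond_x, cond_not_x):
        path_1 = p_x * cond_x                 # path through X
        path_2 = (1 - p_x) * cond_not_x       # path through not X
        interference = 2 * math.sqrt(path_1 * path_2) * math.cos(theta)
        return path_1 + path_2 + interference

    a = unnormalised(p_a_given_x, p_a_given_not_x)
    not_a = unnormalised(1 - p_a_given_x, 1 - p_a_given_not_x)
    return a / (a + not_a)


# Hypothetical numbers: action A is chosen with probability 0.7 when X is
# known and 0.6 when not X is known; X and not X are equally likely.
classical = classical_total_probability(0.5, 0.7, 0.6)
no_interference = quantum_like_probability(0.5, 0.7, 0.6, math.pi / 2)
destructive = quantum_like_probability(0.5, 0.7, 0.6, math.pi)
print(classical, no_interference, destructive)
```

The classical result is a weighted average, so it always lies between the two conditional probabilities. With θ = π/2 the interference term vanishes and the classical value is recovered, whereas with destructive interference (θ = π) the result falls below both conditionals, which is the kind of pattern associated with violations of the Sure Thing Principle.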
Building this kind of quantum Bayesian network, together with applying Feynman's path integrals, raises some difficulties, namely the exponential number of free parameters that result from quantum interference effects. These parameters must be assigned values that accommodate the decision scenarios in which the Sure Thing Principle is violated.
To address this problem, we also propose a set of similarity heuristics to compute this exponential number of quantum interference parameters. Note that a heuristic is simply a shortcut that usually yields good results in many situations, at the cost of occasionally giving results that are not very accurate [173].
Note that the current models in the literature require a manual search for parameters that lead to the desired results. That is, one must know the outcome of the decision scenario a priori in order to manually assign values to these parameters [93, 96, 101, 164, 44, 41]. The proposed network is intended as a scalable and predictive model, in contrast with current models, which are explanatory in nature.
The heuristics proposed in this work are of three types: (1) based on the probability distribution of the data, (2) based on the contents of the data, and (3) based on semantic relations.
Similarity Heuristic Based on Probability Distributions
The goal of the similarity heuristic is to determine an angle between the probability vectors associated with the marginalisation of the positive and negative assignments of the query variable. In other words, when performing a probabilistic inference from a full joint probability distribution table, we select from this table all the probabilities that match the assignments of the query variable and, if given, of the observed variables. Summing these probabilities yields the final classical probabilistic inference. Adding an interference term to this classical inference yields a quantum probabilistic inference. In this case, we can use these probability vectors to obtain additional information for computing the quantum interference parameters. The general idea of the similarity heuristic is to use the marginal probability distributions as probability vectors and to measure their similarity through the law of cosines, a well-known similarity measure in Computer Science that is widely used in Information Retrieval [23]. Based on this degree of similarity, we apply a mapping function of a heuristic nature, which outputs the value of the quantum interference parameter, taking into account a prior study of the probability distributions of the data from several experiments reported throughout the literature. The results showed an average error between 6.4% and 7.9% in predicting human decisions in several experiments from the literature in which violations of the Sure Thing Principle were reported.
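The first step of this heuristic, measuring the angle between two probability vectors, can be sketched in Python as follows. This is only an illustration of cosine similarity. The helper names and the example vectors are hypothetical, and the heuristic mapping from the resulting angle to the interference parameter θ is not specified in this summary, so it is deliberately left out.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def angle_between(u, v):
    """Angle in radians between two vectors, clamped against rounding error."""
    cos_sim = max(-1.0, min(1.0, cosine_similarity(u, v)))
    return math.acos(cos_sim)


# Hypothetical joint probabilities selected from a full joint distribution:
# one vector for the positive assignment of the query variable, one for the
# negative assignment (illustrative values only).
positive = [0.35, 0.30]
negative = [0.15, 0.20]

similarity = cosine_similarity(positive, negative)
angle = angle_between(positive, negative)
print(similarity, angle)
```

Because all entries are non-negative probabilities, the angle always lies between 0 and π/2, so a separate mapping step is needed to reach the full range of phases that an interference parameter can take.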
Similarity Heuristic Based on the Contents of the Data
This heuristic represents objects (or events) in an N-dimensional vector space, which allows them to be compared through similarity functions. The similarity value is used to compute the quantum interference parameters. As in the work of Pothos et al. [166], we do not restrict our model to a vector in a multidimensional psychological space, but allow an arbitrary multidimensional space. The similarities computed between two vectors that represent the contents of events (in this case, the events are images and their contents are the pixels that compose them) can be used to set quantum interference parameters, since both are obtained by computing the inner product between two random variables. This suggests a mathematical equivalence between the θ parameters computed from the cosine similarity and the θ parameters corresponding to quantum interference effects. This assumption is based on the book by Busemeyer & Bruza [34], which states that the θ parameter arising in quantum interference effects corresponds to the phase of the angle of the inner product between the projectors of two random variables. The authors also state that the inner product provides a measure of similarity between two vectors (where each vector corresponds to a superposition of events). If the vectors have unit length, then the cosine similarity reduces to the inner product. Given all these relations, we can assume that the similarities computed between two vectors representing images (used in the experiment of Busemeyer et al. [41]) can be used to set quantum interference parameters. The results of the simulations applied to the work of Busemeyer et al. [41] showed that the proposed heuristic was able to reproduce the experimental observations of violations of the Sure Thing Principle with a small percentage of error (between 4% and 5%).
Semantic Similarity Heuristic
This heuristic seeks to determine the impact of semantic dependence relations between events. These semantic similarities add new dependencies between the nodes of a Bayesian network that do not necessarily involve direct causal relations. We use this additional semantic information to compute quantum interference effects in order to accommodate violations of the Sure Thing Principle.
Under the principle of causality, two events that are not causally connected should not produce any effect. When acausal events occur and produce an effect, this is called a coincidence. Carl Jung believed that all events had to be connected to each other, not in a causal setting, but through their meaning, suggesting some kind of semantic relation between events. This notion is known as the synchronicity principle [87].
We define the semantic similarity heuristic in a way similar to the synchronicity principle: two variables are said to be synchronised if they share a semantic connection. This connection can be obtained from a semantic network representation of the variables in question. This allows new meaningful dependencies to emerge that would not exist if only cause/effect relations were considered. The quantum parameters are then assigned using this additional information so that the angle formed by these two variables in a Hilbert space is as small as possible (high similarity), thereby forcing acausal events to be correlated. The results of the simulations applied to the work of Busemeyer et al. [41] showed that the proposed heuristic was able to reproduce the experimental observations of violations of the Sure Thing Principle with a small percentage of error (between 3% and 6%).
Contents
Abstract
Resumo
Resumo
List of Tables
List of Figures
1 Introduction
  1.1 Violations to Normative Theories of Rational Choice
  1.2 The Emergence of Quantum Cognition
  1.3 Motivation: Violations to the Sure Thing Principle
  1.4 Why Quantum Cognition?
  1.5 Challenges of Current Quantum-Like Models
  1.6 Thesis Proposal
    1.6.1 Why Bayesian Networks?
    1.6.2 Quantum-Like Bayesian Networks for Disjunction Errors
    1.6.3 Comparison with Existing Quantum-Like Models
  1.7 Advantages of Quantum-Like Models
    1.7.1 Research Questions
  1.8 Contributions
    1.8.1 Conference Papers, Extended Abstracts and Posters
  1.9 Organisation
2 Quantum Cognition Fundamentals
  2.1 Introduction to Quantum Probabilities
    2.1.1 Representation of Quantum States
    2.1.2 Space
    2.1.3 Events
    2.1.4 System State
    2.1.5 State Revision
    2.1.6 Compatibility and Incompatibility
  2.2 Interference Effects
    2.2.1 The Double Slit Experiment
    2.2.2 Derivation of Interference Effects from Complex Numbers
  2.3 Time Evolution
  2.4 Path Diagrams
    2.4.1 Single Path Trajectory Principle
    2.4.2 Multiple Indistinguishable Paths
    2.4.3 Multiple Distinguishable Paths
  2.5 Born's Rule
  2.6 Why Complex Numbers?
  2.7 Summary and Final Discussion
3 Fundamentals of Bayesian Networks
  3.1 The Naïve Bayes Model
  3.2 Bayesian Networks
    3.2.1 Example of Inferences in Bayesian Networks
  3.3 Reasoning Factors
    3.3.1 Causal Reasoning
    3.3.2 Evidential Reasoning
    3.3.3 Intercausal Reasoning
  3.4 Flow of Probabilistic Inference
  3.5 Summary and Final Discussion
4 Paradoxes and Fallacies for Cognition and Decision-Making
  4.1 Utility Functions
    4.1.1 Expected Utility Theory
    4.1.2 Subjective Expected Utility
  4.2 Paradoxes
    4.2.1 Ellsberg Paradox
    4.2.2 Allais Paradox
    4.2.3 Three Color Ellsberg Paradox
  4.3 Conjunction and Disjunction Errors
    4.3.1 The Linda Problem
  4.4 Disjunction Effects
    4.4.1 The Two Stage Gambling Game
    4.4.2 The Prisoner's Dilemma Game
  4.5 Order of Effects
  4.6 Summary and Final Discussion
5 Related Work
  5.1 Disjunction Fallacy: The Prisoner's Dilemma Game
  5.2 A Classical Markov Model of the Prisoner's Dilemma Game
  5.3 The Quantum-Like Approach
    5.3.1 Contextual Probabilities: The Växjö Model
    5.3.2 The Hyperbolic Interference
    5.3.3 Quantum-Like Probabilities as an Extension of the Växjö Model
    5.3.4 Modelling the Prisoner's Dilemma using the Quantum-Like Approach
  5.4 The Quantum Dynamical Model
  5.5 The Quantum Prospect Decision Theory
    5.5.1 Choosing the Uncertainty Factor
    5.5.2 The Quantum Prospect Decision Theory Applied to the Prisoner's Dilemma Game
  5.6 Probabilistic Graphical Models
    5.6.1 Classical Bayesian Networks
    5.6.2 Classical Bayesian Networks for the Prisoner's Dilemma Game
    5.6.3 Quantum-Like Bayesian Networks in the Literature
  5.7 Discussion of the Presented Models
    5.7.1 Discussion in Terms of Interference, Parameter Tuning and Scalability
    5.7.2 Discussion in Terms of Parameter Growth
  5.8 The Quantum-Like Approach Over the Literature
  5.9 The Quantum Dynamical Model Over the Literature
  5.10 A Model of Neural Oscillators for Quantum Cognition and Negative Probabilities
  5.11 A Quantum-Like Agent-Based Model
  5.12 Summary and Final Discussion
6 Quantum-Like Bayesian Networks for Cognition and Decision
  6.1 Classical Bayesian Networks
    6.1.1 Classical Conditional Independence
    6.1.2 Classical Random Variables
    6.1.3 Example of Application in the Two-Stage Gambling Game
    6.1.4 Classical Full Joint Distributions
    6.1.5 Classical Marginalization
  6.2 Quantum-Like Bayesian Networks
    6.2.1 Quantum Random Variables
    6.2.2 Quantum State
    6.2.3 Quantum-Like Full Joint Distribution
    6.2.4 Quantum-Like Marginalisation: Exact Inference
  6.3 The Impact of the Phase θ
  6.4 A Cognitive Interpretation of Quantum-Like Bayesian Networks
  6.5 Summary of the Quantum-Like Bayesian Network Model
  6.6 Inference in More Complex Networks: The Burglar/Alarm Network
  6.7 Discussion of Experimental Results
  6.8 Summary and Final Discussion
7 Heuristical Approaches Based on Data Distribution
  7.1 The Vector Similarity Heuristic
    7.1.1 Acquisition of Additional Information
    7.1.2 Definition of the Heuristical Function
    7.1.3 Algorithm
    7.1.4 Summary
  7.2 Example of Application
  7.3 Similarity Heuristic Applied to the Prisoner's Dilemma Game
    7.3.1 The Special Case of Crosson's (2009) Experiments
    7.3.2 Analysing Li's et al. (2002) Experiments
  7.4 Similarity Heuristic Applied to the Two Stage Gambling Game
  7.5 Comparing the Similarity Heuristic with other Works of the Literature
  7.6 Summary and Final Discussion
8 Heuristical Approaches Based on Contents of the Data
  8.1 A Vector Similarity Model to Extract Quantum Parameters
    8.1.1 Using Cosine Similarity to Determine Quantum Parameters
  8.2 Application to the Categorisation-Decision Experiment
    8.2.1 Categorisation - Decision Making Experiment
    8.2.2 Modelling the Problem using Quantum-Like Bayesian Networks
    8.2.3 Computation of the Probability of Narrow Faces
    8.2.4 Computing Quantum Interference Terms
    8.2.5 The Impact of the Conversion Threshold
    8.2.6 Results and Discussion
  8.3 Algorithm
  8.4 Summary and Final Discussion
9 Heuristical Approaches Based on Semantic Similarities
  9.1 Synchronicity: an Acausal Connectionist Principle
  9.2 Combining Causal and Acausal Principles for Quantum Cognition
    9.2.1 Semantic Networks
    9.2.2 The Semantic Similarity Heuristic
  9.3 The Semantic Similarity Heuristic in the Categorisation/Decision Experiment
    9.3.1 Application of the Synchronicity Heuristic: Narrow Faces
    9.3.2 Results and Discussion
  9.4 Application to More Complex Bayesian Networks: The Lung Cancer Network
    9.4.1 Deriving a Semantical Network
    9.4.2 Inference in Quantum Bayesian Networks
    9.4.3 Results with No Evidences Observed: Maximum Uncertainty
    9.4.4 Results with One Piece of Evidence Observed
  9.5 Application to More Complex Bayesian Networks: The Burglar/Alarm Network
    9.5.1 Semantic Networks: Incorporating Acausal Connections
  9.6 Summary and Final Discussion
10 Classical and Quantum Models for Order Effects
  10.1 The Gallup Poll Problem
  10.2 A Quantum Approach for Order Effects
    10.2.1 The Quantum Projection Model
    10.2.2 Discussion of the Quantum Projection Model
  10.3 The Relativist Interpretation of Parameters
  10.4 Do We Need Quantum Theory for Order Effects?
    10.4.1 A Classical Approach for Order Effects
    10.4.2 Analysis of the Classical Projection Model
    10.4.3 Explaining Several Order Effects using the Classical and Quantum Projection Models
    10.4.4 Occam's Razor
  10.5 Summary and Final Discussion
11 Classical Models with Hidden Variables
  11.1 Latent Variables
  11.2 Classical Bayesian Network with Latent Variables
    11.2.1 Estimating the Parameters
    11.2.2 Increasing the Dimensionality of a Classical Bayesian Network
  11.3 Quantum-Like Bayesian Networks as an Alternative Model
  11.4 Summary and Final Discussions
12 Conclusions
13 Future Work
  13.1 A Quantum-Like Analysis of a Real Life Financial Scenario: The Dutch's Bank Loan Application
  13.2 Quantum-Like Influence Diagrams: Incorporating Expected Utility in Quantum-Like Bayesian Networks
  13.3 Neuroeconomics: quantum probabilities towards a unified theory of decision making
Bibliography
List of Tables
3.1 Summary of all possible active trails in a Bayesian Network.
4.1 Allais Paradox Experiment 1.
4.2 Allais Paradox Experiment 2.
4.3 Three Color Ellsberg Paradox Experiment 1.
4.4 Three Color Ellsberg Paradox Experiment 2.
4.5 Results of the two-stage gambling game reported by different works from the literature.
4.6 Works of the literature reporting the probability of a player choosing to defect under several conditions. (a) corresponds to the average of the results reported in the first two payoff matrices of the work of Crosson [55]. (b) corresponds to the average of all seven experiments reported in the work of Li & Taplin [125].
4.7 Summary of the results obtained in the work of Moore [134].
4.8 Results obtained from the medical decision experiment in Bergus et al. [25].
4.9 Results reported by Trueblood & Busemeyer [188] of the experiments performed by McKenzie et al. [132].
5.1 Average results of several different experiments of the Prisoner's Dilemma Game reported in Section 4.4.2.
5.2 Classical full joint probability distribution representation of the Bayesian Network in Figure 5.4.
5.3 Relation between classical and quantum probabilities used in the work of Leifer & Poulin [124].
5.4 Comparison of the different models proposed in the literature.
6.1 Full joint distribution of the Bayesian Network in Figure 6.1 representing the average results reported over the literature for the Two Stage Gambling Game (Table 4.5). The random variable G1 corresponds to the outcome of the first gamble and the variable G2 corresponds to the decision of the player of playing/not playing the second gamble.
6.2 Full joint distribution of the Bayesian Network in Figure 6.2 representing the average results reported over the literature for the Two Stage Gambling Game (Table 4.5). The random variable G1 corresponds to the outcome of the first gamble and the variable G2 corresponds to the decision of the player of playing/not playing the second gamble.
6.3 Probabilities obtained when performing inference on the classical Bayesian Network of Figure 6.6.
6.4 Probabilities obtained when performing inference on the quantum Bayesian Network of Figure 6.7.
6.5 Optimum θ's found for each variable from the burglar/alarm Bayesian network (Figure 6.6).
7.1 Table representation of a quantum full joint probability distribution.
7.2 Full joint distribution of the Bayesian Network in Figure 6.2 representing the average results reported over the literature for the Two Stage Gambling Game (Table 4.5). The random variable G1 corresponds to the outcome of the first gamble and the variable G2 corresponds to the decision of the player of playing/not playing the second gamble.
7.3 Analysis of the quantum θ parameters computed for each work of the literature using the proposed similarity function. Expected θ corresponds to the quantum parameter that leads to the observed probability value in the experiment. Computed θ corresponds to the quantum parameter computed with the proposed heuristic. (b) corresponds to the average of all seven experiments reported.
7.4 Results for the two games reported in the work of Crosson [55] for the Prisoner's Dilemma Game for several conditions: when the action of the second player was guessed to be Defect (Guessed to Defect), when the action of the second player was guessed to be Cooperate (Guessed to Collaborate), and when the action of the second player was not known (Unknown).
7.5 Experimental results reported in the work of Li & Taplin [125] for the Prisoner's Dilemma game for several conditions: when the action of the second player is known to be Defect (Known to Defect), when the action of the second player is known to be Cooperate (Known to Collaborate), and when the action of the second player is not known (Unknown). The column Violations of STP indicates whether the collected results violate the Sure Thing Principle.
7.6 Experimental results reported in the work of Li & Taplin [125] for the Prisoner's Dilemma game. The highlighted entries correspond to games that do not violate the Sure Thing Principle. Expected θ corresponds to the quantum parameter that leads to the observed probability value in the experiment. Computed θ corresponds to the quantum parameter computed with the proposed heuristic.
7.7 Comparison between the Quantum Prospect Decision Theory (QPDT) model and the proposed Quantum-Like Bayesian Network (QLBN) for different works of the literature reporting violations to the Sure Thing Principle. (b) corresponds to the average of all seven experiments reported.
7.8 Comparison between the Quantum Prospect Decision Theory (QPDT) model and the proposed Quantum-Like Bayesian Network (QLBN) for all the different experiments performed in the work of Li & Taplin [125].
8.1 Empirical data collected in the experiment of Busemeyer et al. [41].
8.2 Results from the application of the Quantum-Like Bayesian Network (QLBN) model to the Categorisation/Decision experiment and comparison with the Quantum Dynamical Model (QDM) proposed in the work of Busemeyer et al. [41].
9.1 Full joint probability distribution. Pr(C,D) corresponds to the classical probability and ψ(C,D) corresponds to the respective quantum amplitude.
9.2 Comparison between a Quantum Markov Model and the proposed Bayesian Network.
9.3 Probabilities obtained when performing inference on the Bayesian Network of Figure 9.4.
9.4 Probabilities obtained when performing inference on the Bayesian Network of Figure 9.6.
10.1 Summary of the results obtained in the work of [134] for the Clinton-Gore Poll, showing an Assimilation Effect.
10.2 Summary of the results obtained in the work of [134] for the Gingrich-Dole Poll, showing a Contrast Effect.
10.3 Summary of the results obtained in the work of [134]. The table reports the probability of answering All or Many to the questions. The results show the occurrence of an Additive Effect.
10.4 Summary of the results obtained in the work of [134] for the Rose-Jackson Poll, showing a Subtractive Effect.
10.5 Prediction of the geometric approach using different φ rotation parameters to explain the different types of order effects reported in the work of [134]. The columns Pr(1st ans) vs Pr(1st ans exp) represent the answer to the first question obtained using the projection models and the value reported in [134], respectively. Pr(2nd ans) vs Pr(2nd ans exp) represent the answer to the second question obtained using the projection models and the value reported in [134].
11.1 Full joint probability distribution for the general Bayesian Network from Figure 11.2, which models the Prisoner's Dilemma game. Note that rs stands for risk seeking, ra for risk averse, d for defect and c for cooperate.
11.2 Full joint probability distribution table of the Quantum-Like Bayesian Network in Figure 11.5.
11.3 Analysis of the quantum θx parameters computed for each work of the literature in order to reproduce the observed and unobserved conditions of the Prisoner's Dilemma Game. (b) corresponds to the average of all seven experiments reported.
List of Figures
1.1 Representation of the knowable conditions of the Two Stage Gambling Game experiment of Tversky & Shafir [198].
1.2 Representation of the unknowable conditions of the Two Stage Gambling Game experiment of Tversky & Shafir [198].
1.3 Representation of the unknowable conditions of the Two Stage Gambling Game experiment conducted by Tversky & Shafir [198].
2.1 Sample Space (classical probability theory).
2.2 Hilbert Space (quantum probability theory).
2.3 Example of a representation of an event on a Hilbert Space.
2.4 Example of a quantum system state.
2.5 The double slit experiment. Electrons are fired and they can pass through one of the slits (either s1 or s2) to reach a detector screen in points d1 or d2. If we measure which slit the electron went through, then the pattern in the detector will have the shape and size of the two slits, suggesting a particle behaviour of the electron. If we do not measure which slit the electron is going through, then the electron behaves as a wave and produces an interference pattern in the detector screen, with one point detecting constructive interference and another point detecting destructive interference.
2.6 Classical Principle of Least Action. The path that a particle chooses between a starting and ending position is always the one that requires the least energy (left). Quantum version of the Principle of Least Action. A particle can be on different paths at the same time and use them to find the optimal path (the one that requires less energy) between a starting and final position (right).
2.7 Single Path Trajectory (left). Multiple distinguishable paths (center). Multiple indistinguishable paths (right).
2.8 Representation of the projections, Pi, of a qubit ψ, to either the |0⟩ state subspace S0 or the |1⟩ state subspace S1.
2.9 Example of a distance between two points in L1-norm, also known as the Manhattan distance.
2.10 Example of a distance between two points in L2-norm, also known as the Euclidean distance.
3.1 Naïve Bayes Model, where node C represents the class variable and the set of random variables {X1, X2, ..., Xn} represent the features.
3.2 The Burglar Bayesian Network proposed in the book of [168].
3.3 Difference between causal reasoning and evidential reasoning.
3.4 Indirect Causal Effect.
3.5 Indirect Evidential Effect.
3.6 Common Cause Effect.
3.7 V-Structure.
4.1 Linda is feminist and bank teller. Notice that Pr(F ∩ B) always has to be smaller than Pr(B).
4.2 Linda is feminist and bank teller. Notice that Pr(F ∪ B) always has to be bigger than Pr(F).
4.3 The two-stage gambling experiment proposed by Tversky & Shafir [198].
4.4 Example of a payoff matrix for the Prisoner’s Dilemma Game.
4.5 The Prisoner’s Dilemma game experiment proposed by Tversky & Shafir [198].
5.1 Illustration of the probabilities that can be obtained by varying the parameters γ and µd.
5.2 Illustration of the probabilities that can be obtained by varying the parameters γ and µc.
5.3 Illustration of the probabilities that can be obtained by varying the parameters γ and µB.
5.4 Bayesian Network representation of the Average of the results reported in the literature (last row of Table 8.2). The random variables that were considered are P1 and P2, corresponding to the actions chosen by the first participant and second participant, respectively.
6.1 Classical Bayesian Network representation of the average results reported over the literature for the Two Stage Gambling Game (Section 4.4.1, Table 4.5).
6.2 Quantum-Like Bayesian Network representation of the average results reported over the literature for the Two Stage Gambling Game (Section 4.4.1, Table 4.5). The ψ(x) represents a complex probability amplitude.
6.3 Example of constructive interference: two waves collide forming a bigger wave.
6.4 Example of destructive interference: two waves collide cancelling each other.
6.5 The various quantum probability values that can be achieved by varying the angle θ in Equation 6.14. Note that quantum probability can achieve much higher/lower values than the classical probability.
6.6 Burglar/Alarm classical Bayesian Network proposed in the book of Russell & Norvig [168].
6.7 Quantum-like counterpart of the Burglar/Alarm Bayesian Network proposed in the book of Russell & Norvig [168].
6.8 Possible probabilities when querying "MaryCalls = t" with no evidence. Parameters used were: {θ1, θ2, θ3, θ5, θ7, θ8} → {0, 0, 0, 0, 3.1, 0}. Maximum probability for {θ1, θ2} → {0, 3.1}.
6.9 Possible probabilities when querying "Burglar = t" with no evidence. Parameters used were: {θ1, θ2, θ3, θ5, θ6, θ7} → {0, 0, 0, 6.2, 0.1, 3.1}. Maximum probability for {θ4, θ8} → {0, 3.2}.
6.10 Possible probabilities when querying "JohnCalls = t" with no evidence. Parameters used were: {θ1, θ3, θ4, θ5, θ7, θ8} → {1.9, 0, 2.3, 0.5, 4.5, 2.4}. Maximum probability for {θ2, θ6} → {2.3, 5.5}.
6.11 Possible probabilities when querying "Alarm = t" with no evidence. Parameters used were: {θ1, θ3, θ4, θ5, θ7, θ8} → {0, 0, 0.8, 6.2, 3.1, 4.3}. Maximum probability for {θ2, θ6} → {0.2, 0.5}.
7.1 Vector representation of two vectors representing a certain state.
7.2 Illustration of the different 2-dimensional vectors that will be generated for each step of iteration during the computation of the quantum interference term.
7.3 Vector representation of vectors G2play and G2nplay plus the Euclidean distance vector c.
7.4 Comparison of the results obtained, for different works of the literature concerned with the Prisoner’s Dilemma game.
7.5 Possible probabilities that can be obtained from Game 1 (left), Game 2 (center) and the average of the Games of the work of Crosson [55], using the quantum law of total probability.
7.6 Comparison of the results obtained, for different experiments reported in the work of Li & Taplin [125] in the context of the Prisoner’s Dilemma game.
7.7 Possible probabilities that can be obtained in Game 2 of the work of Li & Taplin [125] (left). Possible probabilities that can be obtained in Game 6 of the work of Li & Taplin [125] (center). Possible probabilities that can be obtained in the work of Busemeyer et al. [39] (right).
7.8 Comparison of the results obtained, for different works of the literature concerned with the Two-Stage Gambling game.
7.9 Error percentage obtained in each experiment of the Two Stage Gambling game.
7.10 Possible probabilities that can be obtained in the work of Lambdin & Burdsal [122]. The probabilities observed in their experiment and the one computed with the proposed Quantum-Like Bayesian Network are also represented.
8.1 Vector normalization to obtain quantum destructive interferences.
8.2 Example of Wide faces used in the experiment of Busemeyer et al. [41].
8.3 Example of Narrow faces used in the experiment of Busemeyer et al. [41].
8.4 Summary of the probability distribution of the Good / Bad faces in the experiment of Busemeyer et al. [41].
8.5 Representation of the Narrow faces experiment (left) and Wide faces experiment (right) in a Bayesian Network with classical probabilities and quantum amplitudes. The classical probabilities are given by Pr(X) and the quantum amplitudes by ψx.
8.6 Conversion of a dataset image into a binary image. Conversion with a small threshold (left). Conversion with a high threshold (right).
8.7 Impact of the threshold when converting an image into a binary image. Threshold ranges from 0.2 (left) to 0.8 (right).
8.8 Distribution of Pr(Attack) using threshold 0.2.
8.9 Distribution of Pr(Attack) using threshold 0.3.
8.10 Distribution of Pr(Attack) using threshold 0.4.
8.11 Distribution of Pr(Attack) using threshold 0.5.
8.12 Distribution of Pr(Attack) using threshold 0.6.
8.13 Distribution of Pr(Attack) using threshold 0.7.
8.14 Distribution of Pr(Attack) using threshold 0.8.
8.15 Probability distribution of the 100 simulations performed when converting a grayscale image into a binary one with a threshold of 0.4.
9.1 Encoding of the synchronised variables with their respective angles (left). Two synchronised events forming an angle of π/4 between them (right).
9.2 Representation of the Synchronicity heuristic in the Hilbert Space. Vector i corresponds to the event C = Good, D = Attack. Vector j corresponds to the event C = Bad, D = Attack. The computed angle for the Attack (left) and Withdraw (right) actions is θ = 3π/4.
9.3 Semantic Network for the Lung Cancer Bayesian Network.
9.4 Lung Cancer Bayesian Network.
9.5 Probabilities obtained using classical and quantum inferences for different queries for the Lung Cancer Bayesian Network (Figure 9.4).
9.6 Example of a Quantum-Like Bayesian Network [168]. ψ represents quantum amplitudes. Pr corresponds to the real classical probabilities.
9.7 Semantic Network representation of the network in Figure 9.6.
9.8 Results for various queries comparing probabilistic inferences using classical and quantum probability when no evidence is observed: maximum uncertainty.
10.1 Example of the application of the quantum projection approach for a sequence of two binary questions A and B. We start in a superposition state and project this state into the yes basis of question A (left). Then, starting in this basis, we project into the basis corresponding to the answer yes of question B (center). We can then have a different result if we reverse the order of the projections (right).
10.2 Relation between the rotation parameter φ and the quantum probability amplitude s0 of Equation 10.15. The amplitude s1 was set to s1 = 1 − s0. We can simulate several order effects by varying the parameter φ.
10.3 Relation between the rotation parameter φ and the quantum probability amplitude s0 of Equation 10.12. The amplitude s1 was set to s1 = 1 − s0. We can simulate several order effects by varying the parameter φ.
10.4 Example of the Relativistic Interpretation of Quantum Parameters. Each person reasons according to an N-dimensional personal basis state without being aware of it. The representation of the beliefs between different people consists in rotating the personal belief state by φ radians.
11.1 Example of a Bayesian Network with a latent variable H and a random variable X.
11.2 A classical Bayesian Network with a latent variable to model the Prisoner’s Dilemma game. P1 and P2 are both random variables. P1 represents the decision of the first player and P2 represents the decision of the second player (either to cooperate or to defect). H is the hidden state or latent variable and represents some unmeasurable factor that can influence the participant’s decisions.
11.3 Classical Bayesian Network to model the observed conditions for the Prisoner’s Dilemma Game. OutP1 and P2 are both random variables that represent the outcome (or decision) of the first player and the decision of the second player. The decisions can either be defect, which is represented by d, or cooperate, represented by c. H2 represents a latent (hidden) unmeasurable variable that corresponds to the personality of the second player: either risk averse (ra) or risk seeking (rs).
11.4 A general classical Bayesian Network with two latent variables, H1 and H2, to express both unobserved and observed conditions for the Prisoner’s Dilemma game. Random variables P1U and P1 represent the first player’s decision according to the unobserved and observed conditions, respectively. Random variables P2U and P2 represent the second player’s decision according to the unobserved and observed conditions, respectively. The assignments ra stand for risk averse, rs risk seeking, d defect and c cooperate.
11.5 Example of a Quantum-Like Bayesian Network. The terms ψ correspond to quantum probability amplitudes. The variables P1 and P2 correspond to random variables representing the first and the second player, respectively.
Chapter 1
Introduction
The purpose of this thesis is to explore applications of the formalisms of quantum mechanics in areas outside of physics. More specifically, a quantum-like decision model based on a network structure is proposed to accommodate and predict several paradoxical findings reported in the literature [193, 89, 195, 197, 198]. Note that the term quantum-like is simply the designation used to refer to any model that is applied in domains outside of physics and that makes use of the mathematical formalisms of quantum mechanics, abstracting them from any physical meaning or interpretation. The paradoxes reported in the literature suggest that human behaviour does not follow normative rational choices. In other words, people usually do not choose the options that lead to maximum utility in a decision scenario, and consequently they consistently violate the axioms of expected utility theory and the laws of classical probability theory. When observations contradict one of the most significant and predominant decision theories, such as Expected Utility Theory, this often suggests that something is missing in the theory. When dealing with preferences under uncertainty, it seems that models based on normative theories of rational choice tend to tell how individuals must choose, instead of telling how they actually choose [129]. This thesis therefore provides a set of contributions of quantum-based models applied to decision scenarios, as an alternative mathematical approach to human decision-making and cognition, in order to better understand the structure of human behaviour.
1.1 Violations to Normative Theories of Rational Choice
The process of decision-making is a research field that has always attracted a vast amount of interest across the scientific community. Over time, many frameworks for decision-making have been developed. In the early 1930s, economic models focused on the mathematical structure of preferences, which takes choices as primitives and investigates whether these choices can be represented by some utility function. The biggest consequence of this approach was the separation of economics from psychology: human psychological processes became irrelevant as long as human decision-making obeyed some set of axioms [77]. According to these strong normative models, a person is assumed to maximise his/her utility function and, by doing so, to act in a rational manner. In 1944, Expected Utility theory was axiomatised by the mathematician John von Neumann and the economist Oskar Morgenstern, and it became one of the most significant and predominant rational theories of decision-making [201]. The Expected Utility hypothesis is characterised by a specific set of axioms that enable the computation of a person’s preferences with regard to choices under risk [74]. By risk, we mean choices that can be measured and quantified, in other words, choices based on objective probabilities. However, in 1953, Allais proposed an experiment showing that human behaviour does not follow these normative rules and violates the axioms of Expected Utility, leading to the well-known Allais paradox [13]. Later, in 1954, the mathematician Leonard Savage proposed an extension of the Expected Utility hypothesis, giving rise to the Subjective Expected Utility theory [170]. Instead of dealing with decisions under risk, the Subjective Expected Utility theory deals with uncertainty. Uncertainty is specified by subjective probabilities and is understood as choices that cannot be quantified and are not based on objective probabilities. But once more, in 1961, Daniel Ellsberg proposed an experiment showing that human behaviour also violates the axioms of the Subjective Expected Utility theory, leading to the Ellsberg paradox [70]. In the end, the Ellsberg and Allais paradoxes show that human behaviour is not normative and tends to violate the axioms of rational decision theories.
1.2 The Emergence of Quantum Cognition
Later, in the 1970s, cognitive psychologists Amos Tversky and Daniel Kahneman decided to put the axioms of the Expected Utility hypothesis to the test. They performed a set of experiments in which they demonstrated that people usually violate the Expected Utility hypothesis and the laws of logic and probability in decision scenarios under uncertainty [193, 195, 197, 90, 89]. From these experiments, several paradoxes were reported, such as disjunction / conjunction fallacies, order effects, etc. Motivated by these findings, researchers started to look for alternative mathematical representations in order to accommodate these violations. Although in the 1940s Niels Bohr was already convinced that the general notions of quantum mechanics could be applied in fields outside of physics [150], it was only in the 1990s that researchers started to actually apply the formalisms of quantum mechanics to problems in the social sciences. It was the pioneering work of Aerts & Aerts [7] that gave rise to the field of Quantum Cognition. In their work, Aerts & Aerts [7] designed a quantum machine that was able to represent the evolution from a quantum structure to a classical one, depending on the degree of knowledge regarding the decision scenario. The authors also performed several experiments to test the variation of probabilities when posing yes/no questions. According to their experiment, most participants formed their answer at the moment the question was posed. This behaviour goes against classical theories, because in classical probability it would be expected that the participants have a predefined answer to the question (or a prior) rather than forming it at the moment of the question. A further discussion of this study can be found in the works of [4, 11, 12, 76, 8]. Quantum cognition has emerged as a research field that aims to build cognitive models using the mathematical principles of quantum mechanics. Given that classical probability theory is very rigid, in the sense that it poses many constraints and assumptions (single trajectory principle, obedience to set theory, etc.), it becomes too limited (or even unable) to provide simple models that can capture human judgments and decisions, since people constantly violate the laws of logic and probability theory [33, 37, 6].
1.3 Motivation: Violations to the Sure Thing Principle
Although there are many paradoxical situations reported in the literature, in this work we focus on one of the most predominant human decision-making errors, which still persists nowadays: the disjunction effect [198]. The disjunction effect occurs whenever the Sure Thing Principle is violated. This principle is fundamental in classical probability theory and states that, if one prefers action A over B under the state of the world X, and if one also prefers A over B under the complementary state of the world ¬X, then one should always prefer action A over B even when the state of the world is not known [170]. Violations of the Sure Thing Principle imply violations of the classical law of total probability. In order to put the Sure Thing Principle to the test, Tversky & Shafir [198] conducted an experiment called the Two Stage Gambling Game. In this experiment, participants were asked to make a set of two consecutive gambles. At each stage, they were asked to decide whether or not to play a gamble that has an equal chance of winning $200 or losing $100. Three conditions were tested:
1. Participants were informed if they had won the first gamble;
2. Participants were informed if they had lost the first gamble;
3. Participants did not know the outcome of the first gamble.
The results obtained showed that participants who knew they had won the first gamble decided to play the second gamble. Participants who knew they had lost the first gamble also decided to play the second gamble. We will refer to these two conditions as the knowable conditions. According to Savage’s Sure Thing Principle, it would be expected that the participants would choose to play the second gamble even when they did not know the outcome of the first gamble. However, the results obtained showed that the majority of the participants became risk averse and chose not to play the second gamble, leading to a violation of the Sure Thing Principle. We will refer to this experimental condition as the unknowable condition. Figures 1.1 and 1.2 represent the knowable and unknowable conditions, respectively. Tversky & Shafir [198] explained these findings in the following way: when the participants knew that they had won, they had extra house money to play with and decided to play the second gamble. When the participants knew that they had lost, they decided to play again in the hope of recovering the lost money. But when the participants did not know whether they had won or lost the first gamble, these thoughts did not arise in their minds and consequently they ended up not playing the second gamble.

Figure 1.1: Representation of the knowable conditions of the Two Stage Gambling Game experiment of Tversky & Shafir [198].
Figure 1.2: Representation of the unknowable conditions of the Two Stage Gambling Game experiment of Tversky & Shafir [198].

From a mathematical point of view, a person acts in a rational and consistent way if, under the unknowable condition, he/she chooses to play the second gamble. Let Pr(G2 = play | G1 = win) and Pr(G2 = play | G1 = lose) be the probabilities of a player choosing to play the second gamble given that it is known that he/she won or lost the first gamble, respectively. And let Pr(G2 = play) be the probability of the player choosing to play the second gamble without knowing the outcome of the first gamble. Assuming a neutral prior and that the gamble is fair and not biased (50% chance of either winning or losing the first gamble), it would be expected that:
Pr(G2 = play | G1 = win) ≥ Pr(G2 = play) ≥ Pr(G2 = play | G1 = lose)
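The reason the unknowable condition is expected to sit between the two knowable conditions can be made explicit with the classical law of total probability. The short derivation below is a sketch under the stated assumption of a fair first gamble; the shorthand p_w and p_l is introduced here only for brevity and is not notation used elsewhere in the thesis.

```latex
\begin{align}
\Pr(G_2 = play) &= \Pr(G_1 = win)\,\Pr(G_2 = play \mid G_1 = win)
                 + \Pr(G_1 = lose)\,\Pr(G_2 = play \mid G_1 = lose) \\
                &= \tfrac{1}{2}\,p_w + \tfrac{1}{2}\,p_l ,
\qquad p_w = \Pr(G_2 = play \mid G_1 = win),\quad
       p_l = \Pr(G_2 = play \mid G_1 = lose), \\
\min(p_w, p_l) &\le \tfrac{1}{2}\,p_w + \tfrac{1}{2}\,p_l \le \max(p_w, p_l).
\end{align}
```

Since the classical value is an average of the two conditional probabilities, it can never fall below the smaller of them.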
However, this is not consistent with the experimental results reported in the work of Tversky & Shafir [198]. What was observed in their experiments was that the probability under the unknowable condition was much lower than under the knowable conditions:
Pr(G2 = play | G1 = win) ≥ Pr(G2 = play | G1 = lose) ≥ Pr(G2 = play)
This led to a violation of the laws of classical probability theory. Classical mechanics was likewise unable to accommodate many paradoxical findings that were being observed in several experimental settings, which gave rise to the axiomatisation of the theory of quantum mechanics. In this thesis, we explore these paradoxical decision scenarios in the same way, by using quantum probability theory as an alternative mathematical formalism. From a quantum cognition perspective, the third experimental condition, the unknowable condition, can be mathematically explained by quantum interference effects. In quantum mechanics, electrons that are in an undefined state can interfere with each other. From a quantum cognitive point of view, if we consider that the beliefs of the participants are in an undefined state, then they can also interfere with each other, causing the final probabilities to be disturbed, either increasing them (constructive interference) or decreasing them (destructive interference). The latter is the type of interference that results in violations of the Sure Thing Principle. Figure 6.2 represents the third experimental condition from Tversky & Shafir [198] from a quantum cognitive point of view, with interference effects being generated when the outcome of the first gamble is not known.
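To illustrate how an interference term can push the probability of the unknowable condition below both knowable conditions, the following Python sketch compares the classical law of total probability with a quantum-like version. It is a minimal sketch under assumptions that are not stated in this section: each classical joint probability is replaced by an amplitude equal to its square root, a single relative phase theta is applied between the "won" and "lost" paths, and the result is normalised over the two actions. The function names and the input values 0.7 and 0.6 are hypothetical and chosen only for illustration; they are not experimental data.

```python
import cmath
import math


def classical_play_probability(p_win, p_play_given_win, p_play_given_lose):
    """Classical law of total probability for Pr(G2 = play)."""
    return p_win * p_play_given_win + (1.0 - p_win) * p_play_given_lose


def quantum_like_play_probability(p_win, p_play_given_win, p_play_given_lose, theta):
    """Quantum-like Pr(G2 = play) with one relative phase theta (radians).

    Assumption: amplitudes are square roots of the classical joint
    probabilities, and the same phase is used for both actions.
    The inputs must not make both actions cancel exactly to zero.
    """
    p_lose = 1.0 - p_win
    # Classical joint probabilities of each action via the two hidden paths.
    joint = {
        "play": (p_win * p_play_given_win, p_lose * p_play_given_lose),
        "not_play": (p_win * (1.0 - p_play_given_win),
                     p_lose * (1.0 - p_play_given_lose)),
    }
    unnormalised = {}
    for action, (via_win, via_lose) in joint.items():
        # Add amplitudes first, then square: this creates the interference term
        # 2 * sqrt(via_win * via_lose) * cos(theta).
        amplitude = math.sqrt(via_win) + cmath.exp(1j * theta) * math.sqrt(via_lose)
        unnormalised[action] = abs(amplitude) ** 2
    return unnormalised["play"] / sum(unnormalised.values())


if __name__ == "__main__":
    for theta in (0.0, math.pi / 2, math.pi):
        print(theta, quantum_like_play_probability(0.5, 0.7, 0.6, theta))
```

With theta = π/2 the interference term vanishes and the classical value is recovered, whereas phases close to π produce destructive interference and a probability below both conditional probabilities.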
Figure 1.3: Representation of the unknowable conditions of the Two Stage Gambling Game experiment conducted by Tversky & Shafir [198].
1.4 Why Quantum Cognition?
It is not the purpose of this thesis to argue whether quantum-like models should be preferred over classical models. As will be discussed in later chapters of this work, the advantages of applying quantum-like models depend on the type of decision problem (Chapters 10 and 11). Following the line of thought of Sloman [179], people have to deal with missing / unknown information. This lack of information can be translated into feelings of ambiguity, uncertainty, vagueness, risk, ignorance, etc. [216], and each of them may require a different mathematical approach to build adequate cognitive / decision models. Quantum probability theory can be seen as an alternative mathematical approach to model such cognitive phenomena. Some researchers argue that quantum-like models do not account for many underlying aspects of human cognition (like perception, reasoning, etc.): they are merely mathematical models used to fit data, and for this reason they are able to accommodate many paradoxical findings [123]. Indeed, quantum-like models provide a more general probability theory that uses quantum interference effects to model decision scenarios; however, they are also consistent with other psychological phenomena (for instance, order effects) [179]. In the book of Busemeyer & Bruza [34], for instance, the feeling of uncertainty or ambiguity is associated with quantum superposition, which assumes that all beliefs of a person occur simultaneously, in contrast to the classical approach, which considers that a single belief occurs in each time frame. The book of Busemeyer & Bruza [34] provides a set of quantum phenomena that can be associated with psychological processes and that support the application of quantum-like models to cognition.
• Violation of Classical Laws: The biggest motivation for the application of quantum formalisms in areas outside of physics is the need to explain paradoxical findings that are hard to explain through classical theory: violations to the Sure Thing Principle, disjunction/conjunction errors, Ellsberg / Allais paradoxes. Quantum theory is a more general framework that allows the accommodation and explanation of scenarios violating the laws of classical probability and logic [33, 31, 38].
• Superposition: From a classical point of view, cognitive models assume that, at each moment in time, a person is in a definite state. For instance, while judging whether or not to buy a car, a person is either in a state corresponding to the judgment buy car or in the state not buy car at each instant of time. In quantum cognition, it is assumed that the human thought process works like a wave until a decision is made. A person can be in an indefinite state, that is, due to the wave-like structure, a person can be in a superposition of thoughts. At each instant of time, a person can be in the state buy car and not buy car. This wave-like paradigm enables a clearer representation of conflicting, ambiguous and uncertain thoughts, as well as vagueness [28].
• The Principle of Unicity: In classical theory, when the path of a particle is unknown, it is assumed that the particle takes either one path or the other with probability 1/2. In quantum theory, when the path is unknown, the particle enters a superposition state, taking all paths at the same instant of time and generating interference effects that alter the final probabilities of the particle.
• Sensitivity to Measurement: In quantum mechanics, the act of measuring disturbs a quantum superposition state, making it collapse into one definite state. A measurement on a system creates rather than records a property of the system [162]. In the scope of quantum cognition, the measurement process can be used to explain decisions if we assume that human thoughts are represented by a wave in superposition. For instance, if we ask a person whether he/she will buy a car, immediately before the question is posed the person is in a superposition state. When the question is posed, the superposition state collapses into one of two states: one in which the answer is yes, the other in which the answer is no. The answer is created from the interaction of the superposition state and the question. In classical mechanics, this act of creation does not exist. Since a state is always considered definite, the properties of a system are recorded rather than created.
• Measurement Incompatibility: In classical theory, asking a sequence of two questions should yield the same answers as posing the same questions in reversed order. Empirical experiments have shown that this is not the case: the act of answering the first question changes the context of the second question, leading people to give different answers according to the order in which the questions are posed. Quantum theory explains order effects intuitively, since operations in quantum theory are non-commutative.
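The non-commutativity mentioned in the last point of the list above can be illustrated with a small two-dimensional example. This is only a sketch under simplifying assumptions: beliefs are real two-dimensional unit vectors, the answer yes to each question is a one-dimensional subspace, and the angles used (π/6 for the initial belief state, π/4 between the two questions) are hypothetical values chosen for illustration. The function names are likewise hypothetical.

```python
import math


def project(state, basis):
    """Project a 2-d real state onto the ray spanned by a unit basis vector."""
    amplitude = state[0] * basis[0] + state[1] * basis[1]
    return (amplitude * basis[0], amplitude * basis[1])


def squared_norm(vector):
    return vector[0] ** 2 + vector[1] ** 2


def sequence_probability(state, first_yes, second_yes):
    """Probability of answering yes to the first and then yes to the second question."""
    return squared_norm(project(project(state, first_yes), second_yes))


# Hypothetical set-up: question A's yes-axis is the x-axis, question B's
# yes-axis is rotated by pi/4, and the initial belief state sits at pi/6.
yes_a = (1.0, 0.0)
yes_b = (math.cos(math.pi / 4), math.sin(math.pi / 4))
belief = (math.cos(math.pi / 6), math.sin(math.pi / 6))

p_a_then_b = sequence_probability(belief, yes_a, yes_b)
p_b_then_a = sequence_probability(belief, yes_b, yes_a)

if __name__ == "__main__":
    print("A then B:", p_a_then_b)
    print("B then A:", p_b_then_a)
```

Because the two projections do not commute, the two orderings give different probabilities for the same pair of answers, which is the geometric mechanism behind order effects.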
1.5 Challenges of Current Quantum-Like Models
Although recent research shows the successful application of quantum-like models in many different decision scenarios of the literature [20, 16, 54, 50, 187], there are still many concerns that hinder the acceptance and usage of these models. Some of the current challenges that quantum-like models face can be summarised in the following points.
• Prediction. Although many cognitive models have been successfully applied to accommodate several paradoxical findings, they cannot be considered predictive. Most of the quantum-like models proposed need to have a priori knowledge of the outcome of the probability distributions of the experiment in order to fit parameters to explain the paradoxical results. For this reason, it is considered that these models have an explanatory nature rather than a predictive one.
• Scalability. Although there are many experiments that report paradoxical findings [164, 41, 120, 122], these experiments consist of very small scenarios that are modelled by, at most, two random variables. Therefore, many of the proposed models in the literature are only effective under such small scenarios and become impractical (or even intractable) for more complex situations. Either the number of quantum interference parameters grows exponentially [93, 96, 101], or there are computational constraints in the computation of very large unitary operators [164, 44, 41].
• What can be considered Quantum-Like? Since the emergence of the Quantum Cognition field, many researchers have been attempting to apply the mathematical formalisms of quantum mechanics in many different research areas, ranging from Biology [20, 16], Economics [102, 82], Finance [81], Perception [54, 50] and Jury duty [187] to domains such as Information Retrieval [133]. Regarding this last field, quantum-like versions of geometric-based projection models have been proposed, which measure the similarity between entities (documents, concepts, etc.). However, it is still not clear whether applying a quantum-like projective approach has any advantage over the classical models, since these models accommodate the paradoxical findings through a rotation of the vector space rather than through quantum interference [144].
• Classical vs. Quantum-Like. Recent research shows that quantum-based probabilistic models are able to explain and accommodate decision scenarios that cannot be explained by pure classical models [38, 31]. However, there is still considerable resistance in the scientific literature to accepting these quantum-based models [123, 184]. Many researchers believe that one can model scenarios that violate the laws of probability and logic through traditional classical decision models [151]. Classical models can indeed simulate many of the paradoxical findings reported throughout the literature [144]. This raises the question of the advantages, or even the applicability, of quantum models over classical ones.
• Non-Kolmogorovian Models. Quantum-like models make use of quantum interference effects in order to accommodate paradoxical decision scenarios [165]. Since pure classical probabilistic models are constrained by the limitations of set theory, it is difficult (or even impossible) to represent these paradoxes. But if the limitations lie in the constraints of set theory, then non-Kolmogorovian theories such as Dempster-Shafer (D-S) theory [171] should also be able to accommodate the same decision problems that quantum-like models are able to. The Dempster-Shafer theory of evidence differs from classical set theory in the sense that it is possible to associate measures of uncertainty with sets of hypotheses, this way enabling the theory to distinguish uncertainty from ignorance [128]. This distinction has been demonstrated throughout the literature and has shown accurate predictions in sensor fusion models [136]. The uncertainty in the D-S theory is specified by allowing probabilities to be assigned to sets of events, instead of being constrained to assigning probabilities to atomic events (as in classical probability theory). It is still an open research question in the literature whether there are any relations between quantum-like models and other non-Kolmogorovian probability theories. If such relations hold, then the accommodation of the paradoxical decision scenarios does not come from the unique characteristics of quantum-like models and quantum interference effects, but from the limitations of set theory.
1.6 Thesis Proposal
In order to overcome the above challenges, this thesis proposes a quantum-like Bayesian Network formalism, which consists in replacing classical probabilities by quantum complex probability amplitudes. However, since this approach also suffers from the problem of exponential growth of quantum parameters that need to be fit, a similarity heuristic [173] is also proposed, which automatically computes this exponential number of quantum parameters through vector similarities. A Bayesian Network can be understood as an acyclic directed graph, in which each node represents a random variable and each edge represents a direct causal influence from the source node to the target node (conditional dependence). Under the proposed network, if a node (event) is unobserved, then it can enter a superposition state and produce interference effects. These effects provide some explanation in terms of cognition, since they can be seen as the feeling of confusion or ambiguity [34].
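As a rough illustration of the idea described above, the following Python sketch contrasts the classical marginal of a child node B, whose binary parent A is unobserved, with a quantum-like marginal built from complex amplitudes. The construction used here (amplitudes taken as square roots of the classical probabilities multiplied by a phase factor, followed by a normalisation), the function names and the numerical values are all assumptions made for illustration. They are not taken from this section, and the phases are set by hand rather than by the similarity heuristic.

```python
import numpy as np

# Illustrative (made-up) probabilities for a two-node network A -> B.
prior_a = np.array([0.5, 0.5])            # Pr(A=0), Pr(A=1)
cond_b_given_a = np.array([[0.8, 0.2],    # Pr(B=0|A=0), Pr(B=1|A=0)
                           [0.6, 0.4]])   # Pr(B=0|A=1), Pr(B=1|A=1)

def classical_marginal(prior, cond):
    """Law of total probability: Pr(B=b) = sum_a Pr(a) Pr(b|a)."""
    return prior @ cond

def quantum_like_marginal(prior, cond, phases):
    """Assumed construction: one amplitude sqrt(Pr(a) Pr(b|a)) * exp(i*theta_a)
    per value of the unobserved parent. Amplitudes are summed before squaring,
    and the result is renormalised so that it sums to one."""
    amplitudes = np.sqrt(prior[:, None] * cond) * np.exp(1j * phases)[:, None]
    unnormalised = np.abs(amplitudes.sum(axis=0)) ** 2
    return unnormalised / unnormalised.sum()

classical = classical_marginal(prior_a, cond_b_given_a)

# A phase difference of pi/2 cancels the interference term (cos(pi/2) = 0).
no_interference = quantum_like_marginal(prior_a, cond_b_given_a,
                                        np.array([0.0, np.pi / 2]))

# A phase difference of pi shifts the marginal away from the classical one.
with_interference = quantum_like_marginal(prior_a, cond_b_given_a,
                                          np.array([0.0, np.pi]))
```

In this sketch, a phase difference of π/2 reproduces the classical marginal, while other phase differences move the result away from it, which is why the phases play the role of the free quantum parameters that must be fitted or computed.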
1.6.1 Why Bayesian Networks?
Bayesian Networks are one of the most powerful structures known by the Computer Science community for deriving probabilistic inferences (for instance, in medical diagnosis, spam filtering, image segmentation, etc.) [116]. Bayesian Networks were chosen because they provide a link between probability theory and graph theory. A fundamental property of graph theory is its modularity: one can build a complex system by combining smaller and simpler parts. It is easier for a person to combine pieces of evidence and to reason about them than to calculate all possible events and their respective beliefs [79]. In the same way, Bayesian Networks represent the decision problem in small modules that can be combined to perform inferences. Only the probabilities which are actually needed to perform the inferences are computed. This process can resemble human cognition [79]. While reasoning, humans cannot process all possible information, because of their limited capacity [90]. Consequently, they combine several smaller pieces of evidence in order to reach a final decision. A Bayesian Network works in a similar way. It provides a relation mechanism between human cognition and inductive inference [161].
1.6.2 Quantum-Like Bayesian Networks for Disjunction Errors
This thesis addresses the problem of violations of the Sure Thing Principle (which are a consequence of disjunction errors) by examining two major problems in which these violations were verified: the Prisoner’s Dilemma game and the Two Stage Gambling game. These violations were initially reported by Tversky & Shafir [198] and later reproduced in several works in the literature that also reported similar results [125, 39, 86]. It will be demonstrated how the current classical models fail to explain the paradoxical findings implied in the violations of the Sure Thing Principle, and a new quantum-like structure based on Bayesian Networks will be proposed. This model has the advantage of being flexible enough to be easily extended to more complex decision scenarios (containing more than two random variables) and provides mechanisms to automatically compute quantum parameters derived from quantum interference effects. This way, the proposed model differs from the models proposed in the literature in that it does not require a priori knowledge of the outcome of the empirical experiments to accommodate the paradoxical results.
1.6.3 Comparison with Existing Quantum-Like Models
In this thesis, an overview was performed of the most important quantum models in the literature that are used to make predictions under scenarios where the Sure Thing Principle is being violated. These models were evaluated in terms of three metrics: interference effects, parameter tuning and scalability. The first examines whether the analysed model makes use of any type of quantum interference to explain human decision-making. The second is concerned with the assignment of values to a large number of quantum parameters. The last one consists of analysing the ability of the models to be extended and generalised to more complex scenarios. We also studied the growth of the quantum parameters when the complexity and the levels of uncertainty of the decision scenario increase. Finally, we compared these quantum models with traditional classical models from the literature. We conclude with a discussion of why the models addressed in this thesis can only deal with very small decision problems and why they cannot scale well to larger and more complex decision scenarios.
1.7 Advantages of Quantum-Like Models
Recent research shows that quantum-based probabilistic models are able to explain and predict scenarios that cannot be explained by pure classical models [33, 31]. However, there is still considerable resistance in the scientific literature to accepting these quantum-based models. Many researchers believe that one can model scenarios that violate the laws of probability and logic through traditional classical decision models [123, 184]. Although the quantum cognition field is recent, several different quantum-like models have been proposed in the literature. These models range from dynamical models [44, 41, 164], which make use of unitary operators to describe the time evolution from the moment a participant is given a problem (or asked a question) until he/she makes a decision, to models that are based on contextual probabilities [7, 105, 215]. Quantum-like dynamical models have also been proposed in the literature to accommodate violations in the Prisoner’s Dilemma Game [164], to study the evolution of the interaction of economical agents in markets [102, 82] or even to specify a formal description of the dynamics of epigenetic states of cells interacting with an environment [19]. On the other hand, quantum-like models based on contextual probabilities explore the application of complex probability amplitudes in order to define contexts that can interfere with the decision-maker [99, 106, 105]. For a survey about the applications of quantum-like models for the Sure Thing Principle, the reader can refer to Moreira & Wichert [142].
1.7.1 Research Questions
With this thesis, a set of research questions are posed. Their answers will be explored throughout the chapters of this work and will be answered with more detail in the final chapter of this thesis (Chapter 12).
• [RQ1] So far, several quantum-like models have been proposed throughout the literature, ranging from models that are dynamical [44, 164, 3] or based on contextual probabilities [105] to models based on expected utility theories [149, 10]. What is the advantage of the proposed approach?
• [RQ2] Different quantum-like models use different types of quantum interference terms: either from the usage of Hamiltonians or by the usage of Feynman’s Path Integrals. But what is the psychological interpretation of quantum interference effects under these approaches?
• [RQ3] Are quantum projection models really quantum? Or are they merely a representation of a classical model with a rotation of the basis vectors?
• [RQ4] It is a legitimate thought that pure classical models fail to accommodate paradoxical decision scenarios because there could be some hidden variables influencing the results whose data, due to physical limitations, cannot be gathered. Can a classical model with hidden variables be used to accommodate the paradoxical findings reported in the literature? Or is this accommodation specific to quantum-like models?
1.8 Contributions
During this work, several contributions were proposed to the scientific community:
1. Catarina Moreira and Andreas Wichert, Are Quantum-Like Bayesian Networks More Powerful than Classical Bayesian Networks? (Major Revisions).
2. Diederik Aerts, Suzette Geriente, Catarina Moreira and Sandro Sozzo, Testing Ambiguity and Machina Preferences Within a Quantum-theoretic Framework for Decision-making, (under review) [10]
3. Catarina Moreira and Andreas Wichert, Are Quantum Models for Order Effects Quantum?, International Journal of Theoretical Physics, 1-18, 2017 [144]
4. Catarina Moreira and Andreas Wichert, Exploring the Relations Between Quantum-Like Bayesian Networks and Decision-Making Tasks with Regard to Face Stimuli, Journal of Mathematical Psychology, 78, 86-95, 2017 [145]
5. Catarina Moreira and Andreas Wichert, Quantum Probabilistic Models Revisited: the Case of Disjunction Effects in Cognition, Frontiers in Physics: Interdisciplinary Physics, 4, 1-26, 2016 [142]
6. Catarina Moreira and Andreas Wichert, Quantum-Like Bayesian Networks for Modelling Decision Making, Frontiers in Psychology, 7, 2016 [141]
7. Catarina Moreira and Andreas Wichert, The Synchronicity Principle Under Quantum Probabilistic Inferences, NeuroQuantology, 13,111-133, 2015 [140]
8. Catarina Moreira and Andreas Wichert, Interference Effects in Quantum Belief Networks, Journal of Applied Soft Computing, 25, 64-85, 2014 [137]
1.8.1 Conference Papers, Extended Abstracts and Posters
1. Catarina Moreira, Quantum-Like Influence Diagrams: Incorporating Expected Utility in Quantum-Like Bayesian Networks, International Symposium Worlds of Entanglement, Belgium, 2017 (extended abstract) [135].
2. Catarina Moreira, Emmanuel Haven, Sandro Sozzo, Andreas Wichert, A Quantum-Like Analysis of a Real Financial Scenario: The Dutch’s Bank Loan Application, In Proceedings of the 13th Econophysics Colloquium, Poland, 2017 (extended abstract) [146].
3. Catarina Moreira and Andreas Wichert, When to use Quantum Probabilities in Quantum Cognition? A Discussion, In Proceedings of the 12th Biennial International Quantum Structures Association Conference, United Kingdom, 2016 (extended abstract) [143].
4. Catarina Moreira and Andreas Wichert, Application of Quantum-Like Bayesian Networks in Social Sciences, 4th Champalimaud NeuroScience Symposium, Champalimaud Center of the Unknown, Portugal, 2015 (poster)
5. Catarina Moreira and Andreas Wichert, Quantum-Like Bayesian Networks using Feynman’s Path Diagram Rules, In Proceedings of the 16th Växjö Conference on Quantum Theory: from Foundations to Technologies, Sweden, 2015 (extended abstract) [138].
6. Catarina Moreira and Andreas Wichert, The Relation Between Acausality and Interference in Quantum-Like Bayesian Networks, In Proceedings of the 9th International Conference on Quantum Interactions, Switzerland, 2015 [139].
1.9 Organisation
The present work is organised as follows:
• Chapter 1 presents an introduction and motivation of the current thesis by giving an overview of the scientific historical aspects that contributed to the emergence of the field of Quantum Cognition. An example showing violations of the Sure Thing Principle is also presented as a motivation for the topics and problems that will be addressed throughout this work.
• Chapter 2 presents the fundamental concepts of quantum probability theory and gives an introduction to the field of Quantum Cognition.
• Chapter 3 gives a general overview of the basic concepts related to Bayesian Networks, which are fundamental for the understanding of this work.
• Chapter 4 gives a short overview of the most relevant paradoxes and fallacies that occur in decision-making scenarios and provides a brief literature overview of current approaches that attempt to address them.
• Chapter 5 presents the first contribution of this work. It provides an exhaustive overview and discussion of the most important state-of-the-art quantum cognitive models that are able to explain the paradoxical findings of experiments that violate the Sure Thing Principle. It also presents an in-depth comparison and discussion of several quantum models in terms of three elements: (1) incorporation of quantum interference effects, (2) how to find quantum parameters and (3) scalability of the model for more complex decision problems. This study has been published in Moreira & Wichert [142].
• Chapter 6 presents the second contribution of this work: a quantum-like Bayesian Network formalism, which consists in replacing classical probabilities by quantum probability amplitudes. The proposed model takes advantage of its modular network structure to scale to more complex decision scenarios and generates an exponential number of quantum interference effects. An initial study of the impact of these interference terms is performed. This study has been published in Moreira & Wichert [137].
• Chapter 7 presents the third contribution of this work. We complement the study of the quantum-like Bayesian network by proposing a vector similarity heuristic that is based on the probability distribution of the data collected for many different experiments reported in the literature. This heuristic takes into account the vector similarity between random variables and, based on this information, attempts to derive a value for the quantum interference term. This heuristic makes the proposed model predictive, contrary to current state-of-the-art approaches, which have an explanatory nature. This study has been published in Moreira & Wichert [141].
• Chapter 8 presents the fourth contribution of this work. We complement the study of the quantum-like Bayesian network by proposing a heuristic to model the Categorisation / Decision experiment from Busemeyer et al. [41]. The model consists in representing objects (or events) in an arbitrary n-dimensional vector space, enabling their comparison through similarity functions. The computed similarity value is used to set the quantum parameters in the quantum-like Bayesian Network model. The difference between this approach and the vector similarity heuristic is that the former is based on the contents of the data objects used to perform inference, whereas the latter makes use of a previous probability distribution analysis of several experiments reported in the literature. This study has been published in Moreira & Wichert [145].
• Chapter 9 presents the fifth contribution of this work. It analyses a quantum-like Bayesian Network that puts together cause/effect relationships and semantic similarities between events. These semantic similarities constitute acausal connections according to the Synchronicity principle and provide new relationships for quantum-like Bayesian networks. As a consequence, beliefs (or any other event) can be represented in vector spaces, in which quantum parameters are determined by the similarities that these vectors share between them. Events attached by a semantic meaning do not need to have an explanation in terms of cause and effect. This study has been published in Moreira & Wichert [140, 139].
• Chapter 10 presents the sixth contribution of this work. We analysed several order effect situations (additive, subtractive, assimilation and contrast effects) using the Gallup reports collected in the work of Moore [134]. In the end, we show that order effects can be explained intuitively by both classical and quantum frameworks, since both models are similar and take advantage of the fact that matrix multiplications are non-commutative. Depending on how one sets the rotation operator, one can simulate any effect reported in Moore [134]. This study has been published in Moreira & Wichert [144].
• Chapter 11 presents the seventh contribution of this work. In this chapter it is discussed that, although the classical models with latent variables could explain the paradoxical findings under the Prisoner’s Dilemma game, the same model could not simulate the choice of the player when a piece of evidence was given, that is, when it was known which action the first player chose. This leads to the dilemma: either one creates a classical model just to account for observed evidence or one creates the model just to explain the paradoxical findings. This study is currently under review.
• Chapter 12 makes a general summary and some final discussions about this work by answering the research questions posed in Section 1.7.1.
• Chapter 13 makes a discussion of some promising directions for future work that we have already established in preliminary research [135, 146].
Chapter 2
Quantum Cognition Fundamentals
Since the paradoxical findings of Tversky and Kahneman [193, 195, 197, 90, 89], researchers started to look for alternative frameworks to accommodate decision scenarios that violated the laws of probability theory and logic. In the same way quantum physics was created to explain several paradoxical findings that could not be explained through classical physics, quantum cognition emerged as a research field that aims to explain paradoxical decision scenarios by building cognitive models with the underlying mathematical principles of quantum mechanics. In this sense, psychological (and cognitive) models benefit from the usage of quantum probability principles because they have many advantages over classical counterparts [42]. In quantum theory, events are represented as multidimensional vectors in a Hilbert space. This vector representation comprises the occurrence of all events at the same time. In quantum mechanics, this property is referred to as the superposition principle. From a psychological point of view, a quantum superposition can be related to the feeling of confusion, uncertainty or ambiguity [34]. This vector representation obeys neither the distributive axiom of Boolean logic nor the law of total probability. It also enables the construction of more general models that can mathematically explain cognitive phenomena such as violations of the Sure Thing Principle [110, 131], which is the focus of this thesis. Quantum probability principles have also been successfully applied in many different fields of the literature, namely in biology [20, 16], economics [102, 82], perception [54, 50], jury duty [187], game theory [148, 30], order effects [206], opinion polls [111, 109], etc.
This chapter presents the fundamental concepts of quantum cognition that are necessary for the understanding of this work. In Section 2.1, we introduce the main concepts of quantum probability theory by comparing it with the classical theory axiomatised by Kolmogorov [117], giving illustrative examples of how to apply this theory. The derivation of quantum interference terms from complex probability amplitudes is detailed in Section 2.2. A parallel analysis of how classical and quantum systems evolve through time is presented in Section 2.3. Section 2.4 compares the calculation of probabilities along path trajectories in a classical and in a quantum setting. In Sections 2.5 and 2.6, we explain how Born’s rule was derived and why complex amplitudes are important for quantum mechanics and quantum cognitive models. We also discuss these subjects, since they are still open research questions in the scientific community. Finally, Section 2.7 summarises all the important concepts that were addressed throughout this chapter.
2.1 Introduction to Quantum Probabilities
In this section, the main differences between classical and quantum probability theory are presented. The concepts will be introduced by an example concerning jury duty. Suppose you are a juror and you must decide whether a defendant is guilty or innocent. The following sections describe how to represent this problem according to classical probability theory and quantum probability theory. This comparison is based on the book of Busemeyer & Bruza [34].
2.1.1 Representation of Quantum States
The notation adopted in quantum theory is the Dirac notation [67], also known as the ”bra-ket” notation. Instead of writing states using column vectors, in quantum theory column vectors are written in a linear and compact way. For instance, a k-dimensional column vector A:
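\[
A = \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \vdots \\ \alpha_{k-1} \end{pmatrix},
\]

can be written in terms of ket notation, as $|A\rangle$, in the following way:

\[
|A\rangle = \alpha_0 |0\rangle + \alpha_1 |1\rangle + \dots + \alpha_{k-1} |k-1\rangle
\]

where

\[
|0\rangle = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad \dots, \quad |k-1\rangle = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}.
\]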
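The bra notation of the same vector, $\langle A|$, corresponds to the conjugate transpose (represented by the symbol †) of the ket representation $|A\rangle$ and vice versa:

\[
\langle A|^{\dagger} = |A\rangle .
\]

This means that the inner product of vector A can be written in terms of Dirac notation as:

\[
\langle A|A\rangle = \begin{pmatrix} \alpha_0^{*} & \alpha_1^{*} & \dots & \alpha_{k-1}^{*} \end{pmatrix} \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \vdots \\ \alpha_{k-1} \end{pmatrix} = |\alpha_0|^2 + |\alpha_1|^2 + \dots + |\alpha_{k-1}|^2 .
\]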
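In the same way, the outer product of vector A, which is known as the quantum Projection operator, can be written as:

\[
|A\rangle\langle A| = \begin{pmatrix} \alpha_0 \\ \alpha_1 \\ \vdots \\ \alpha_{k-1} \end{pmatrix} \begin{pmatrix} \alpha_0^{*} & \alpha_1^{*} & \dots & \alpha_{k-1}^{*} \end{pmatrix} = \begin{pmatrix} |\alpha_0|^2 & \alpha_0\alpha_1^{*} & \dots & \alpha_0\alpha_{k-1}^{*} \\ \alpha_1\alpha_0^{*} & |\alpha_1|^2 & \dots & \alpha_1\alpha_{k-1}^{*} \\ \vdots & \vdots & \ddots & \vdots \\ \alpha_{k-1}\alpha_0^{*} & \alpha_{k-1}\alpha_1^{*} & \dots & |\alpha_{k-1}|^2 \end{pmatrix} .
\]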
The main advantage of using Dirac notation is that it enables the explicit labelling of the basis vectors. In quantum theory, this is a great benefit since it allows the representation of a quantum system by a vector while at the same time explicitly writing the physical quantity of interest (the energy level, position, spin, etc., of the electron). Writing vectors in Dirac notation often saves space. In particular, when writing sparse vectors, Dirac notation enables a compact representation by labelling a basis vector with a binary string of length k, instead of writing a column vector with $2^k$ components [2].
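The relations between kets, bras, inner products and outer products described in this section can be checked numerically. The following is a minimal NumPy sketch. The three amplitudes are arbitrary values chosen so that the state is normalised, and the variable names are illustrative only.

```python
import numpy as np

# An arbitrary normalised 3-dimensional state (amplitudes chosen for illustration).
ket_a = np.array([[0.5], [0.5j], [(1 + 1j) / 2]])   # |A> as a column vector
bra_a = ket_a.conj().T                               # <A| is the conjugate transpose

inner = (bra_a @ ket_a).item()      # <A|A>, a scalar
projector = ket_a @ bra_a           # |A><A|, a k x k matrix

# The basis kets |0>, |1>, |2> are the columns of the identity matrix,
# so |A> can be rebuilt as alpha_0|0> + alpha_1|1> + alpha_2|2>.
basis = np.eye(3)
rebuilt = sum(ket_a[i, 0] * basis[:, [i]] for i in range(3))
```

Because the chosen state has unit length, the outer product is idempotent (applying it twice gives the same result as applying it once), which is the defining property of a projection operator.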
2.1.2 Space
In classical probability theory, events are contained in a Sample Space. A Sample Space Ω corresponds to the set of all possible outcomes of an experiment or random trial [64]. For example, when judging whether a defendant is guilty or innocent, the sample space is given by Ω = {Guilty, Innocent} (Figure 2.1). In quantum probability theory, events are contained in a Hilbert Space. A Hilbert space can be defined as a generalisation and extension of the Euclidean space into spaces with any finite or infinite number of dimensions. It is a vector space of complex numbers and offers the structure of an inner product to enable the measurement of angles and lengths [84]. The space is spanned by a set of orthonormal basis vectors, which form a basis for the space. In the jury duty example, we can define a set of orthonormal basis vectors as $H = \{|Guilty\rangle, |Innocent\rangle\}$, where

\[
|Guilty\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \qquad |Innocent\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} .
\]
Figure 2.2 presents a diagram showing the Hilbert space of a defendant being guilty or innocent [34]. Inferences are then calculated through similarities between vectors, and events are defined as feature vectors just like in many cognitive systems [194, 119, 35]. Since a Hilbert space enables the usage of complex numbers, then, in order to represent the events Guilty and Innocent, one would need two dimensions for each event (one for the real part and another for the imaginary part). In quantum decision theory, one usually ignores the imaginary component in order to be able to visualise geometrically all vectors in a 2-dimensional space.
Figure 2.1: Sample Space (classical probability theory)
Figure 2.2: Hilbert Space (quantum probability theory)
2.1.3 Events
In classical probability theory, events can be defined by a set of outcomes to which a probability is assigned. They correspond to a subset of the sample space Ω in which they are contained. Events can be mutually exclusive and they obey set theory. This means that operations such as intersection or union of events are well defined, as well as the distributive axiom between sets. In the jury duty example, since a person cannot be both guilty and innocent, these events are defined as being two mutually exclusive sets, that is Guilty ∩ Innocent = ∅. We also have the union property between sets defined: Guilty ∪ Innocent = {Guilty, Innocent}. And we could also apply the distributive axiom. Let Z be some event (like the person carrying the crime scene weapon), then the distributive axiom would allow us to compute: Z ∩ (Guilty ∪ Innocent) = (Z ∩ Guilty) ∪ (Z ∩ Innocent). In quantum probability theory, events are defined geometrically and correspond to a subspace spanned by a subset of the basis vectors contained in the Hilbert Space. This geometric representation enables the definition of events as a superposition state vector $|S\rangle$, which comprises the occurrence of all events. This representation of events is an important difference between classical and quantum probability and a consequence of the quantum vector space representation. While in the classical theory we can only represent each event at each time frame as a set, in quantum probability we can represent all possible events at the same time through a vector. It follows that mutually exclusive events are represented by orthonormal vectors. Also, operations such as intersection and union of events can be represented geometrically and are defined if and only if the events are contained in the same subspace [34]. If two events are represented by different basis vectors (that is, they are contained in different subspaces), then the intersection and union of events are not defined.
It also follows that the distributive axiom does not hold [32]. This fact constitutes another major difference between classical and quantum probability theory.
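One way to see how distributivity can fail is to read events as subspaces, with "and" taken as the intersection of subspaces and "or" taken as their span. The Python sketch below follows this lattice-of-subspaces reading, which is an assumption made here for illustration (it assigns a meaning to conjunctions and disjunctions even when the events belong to different bases) and is not necessarily the formalisation used in [32]. The event Z, taken as the line spanned by an equal superposition of Guilty and Innocent, and the helper names are hypothetical.

```python
import numpy as np

TOL = 1e-10

def span_projector(matrix):
    """Projector onto the range of a Hermitian positive semi-definite matrix."""
    eigenvalues, eigenvectors = np.linalg.eigh(matrix)
    kept = eigenvectors[:, eigenvalues > TOL]
    return kept @ kept.conj().T

def join(p, q):
    """'Or': projector onto the span of the two subspaces."""
    return span_projector(p + q)

def meet(p, q):
    """'And': projector onto the intersection of the two subspaces,
    computed as the complement of the span of the two complements."""
    identity = np.eye(p.shape[0])
    return identity - join(identity - p, identity - q)

guilty = np.array([[1.0], [0.0]])
innocent = np.array([[0.0], [1.0]])
s = (guilty + innocent) / np.sqrt(2)

P_G = guilty @ guilty.T        # event Guilty
P_I = innocent @ innocent.T    # event Innocent
P_Z = s @ s.T                  # hypothetical event Z (the line spanned by |S>)

lhs = meet(P_Z, join(P_G, P_I))              # Z and (Guilty or Innocent)
rhs = join(meet(P_Z, P_G), meet(P_Z, P_I))   # (Z and Guilty) or (Z and Innocent)
```

Under these assumptions the left-hand side is Z itself, because Guilty and Innocent together span the whole space, whereas the right-hand side is the zero subspace, because the line Z shares only the origin with each basis axis.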
Figure 2.3: Example of a representation of an event on a Hilbert Space
Concerning the jury duty example, the superposition state $|S\rangle$ can be defined as a superposition of a defendant being both Guilty and Innocent. Figure 2.3 presents a geometric visualisation of event $|S\rangle$ in a Hilbert space.

\[
|S\rangle = \frac{e^{i\theta_G}}{\sqrt{2}} |Guilty\rangle + \frac{e^{i\theta_I}}{\sqrt{2}} |Innocent\rangle \tag{2.1}
\]

In Equation 2.1, the values $\frac{e^{i\theta}}{\sqrt{2}}$ are quantum probability amplitudes. They correspond to the amplitudes of a wave and are described by complex numbers. A complex number is a number that can be expressed in the form $z = a + ib$, where a and b are real numbers and i is the imaginary unit, such that $i^2 = -1$. Alternatively, a complex number can be described in the form $z = |r| e^{i\theta}$, where $|r| = \sqrt{a^2 + b^2}$. The $e^{i\theta}$ term is the phase factor of the amplitude, where θ is the angle between the positive real axis and the line from the origin of the plane to the point (a, b). These amplitudes are related to classical probability by taking the squared magnitude of these amplitudes through Born’s rule (more details will be given in Section 2.5). This is achieved by multiplying the amplitude by its complex conjugate.
\[
\Pr(Guilty) = \left| \frac{e^{i\theta_G}}{\sqrt{2}} \right|^2 = \frac{e^{i\theta_G}}{\sqrt{2}} \cdot \left( \frac{e^{i\theta_G}}{\sqrt{2}} \right)^{*} = \frac{e^{i\theta_G}}{\sqrt{2}} \cdot \frac{e^{-i\theta_G}}{\sqrt{2}} = e^{i(\theta_G - \theta_G)} \left( \frac{1}{\sqrt{2}} \right)^2 = 0.5 \tag{2.2}
\]

In quantum theory, it is required that the sum of the squared magnitudes of all amplitudes equals 1. This axiom is called the normalisation axiom and corresponds to the classical theory constraint that the probabilities of all events in a sample space should sum to one.

\[
\left| \frac{e^{i\theta_G}}{\sqrt{2}} \right|^2 + \left| \frac{e^{i\theta_I}}{\sqrt{2}} \right|^2 = 1 \tag{2.3}
\]

If we consider the situation where the jury needs to reason about the guilt of two defendants, who were accused of committing a crime together, then one could write their combined state using an operation called tensor product, which is represented by the symbol ⊗. The tensor product corresponds to a mathematical method that enables the construction of a Hilbert space from the combination of individual spaces. If we have two defendants who could each be either Guilty or Innocent, then we could represent them as a complex linear combination of these states as:
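\[
(\alpha_0 |G\rangle + \alpha_1 |I\rangle) \otimes (\beta_0 |G\rangle + \beta_1 |I\rangle) = \alpha_0\beta_0 |GG\rangle + \alpha_0\beta_1 |GI\rangle + \alpha_1\beta_0 |IG\rangle + \alpha_1\beta_1 |II\rangle
\]

where G and I represent the basis states for Guilty and Innocent, respectively, and $\alpha_i$, $\beta_j$ represent the quantum complex amplitudes associated with the first and second defendants, respectively.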
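The combined state of the two defendants can be sketched numerically with the Kronecker product, which is the matrix form of the tensor product for coordinate vectors. In the following minimal example the amplitude values for each defendant are hypothetical, and the names `first`, `second` and `joint` are illustrative only.

```python
import numpy as np

# Hypothetical amplitudes for each defendant over the basis (|G>, |I>).
first = np.array([np.sqrt(0.7), np.sqrt(0.3)])          # first defendant
second = np.array([np.sqrt(0.4), np.sqrt(0.6) * 1j])    # second defendant

# np.kron orders the joint basis as |GG>, |GI>, |IG>, |II>.
joint = np.kron(first, second)

labels = ["GG", "GI", "IG", "II"]
joint_probabilities = dict(zip(labels, np.abs(joint) ** 2))
```

Each joint amplitude is the product of one amplitude of the first defendant and one of the second, and the squared magnitudes of the four joint amplitudes still sum to one whenever the two individual states are normalised.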
2.1.4 System State
A system state is a probability function Pr which maps events into probability numbers, i.e., real numbers between 0 and 1.

In classical theory, the system state corresponds exactly to this definition. There is a function that is responsible for assigning a probability value to the outcome of an event. If the event corresponds to the sample space, then the system state assigns a probability value of 1 to the event. If the event is empty, then it assigns a probability of 0. In our example, if nothing else is told to the juror, then the probability of the defendant being guilty is Pr(Guilty) = 0.5.

In quantum theory, the probability of a defendant being $|Guilty\rangle$ is given by the squared magnitude of the projection from the superposition state $|S\rangle$ to the subspace containing the observed event $|Guilty\rangle$. Figure 2.4 shows an example. If nothing is told to the juror about the guilt of a defendant, then according to quantum theory, the decision-maker is unsure and is in a superposition state $|S\rangle$ between these events.
Figure 2.4: Example of a quantum system state.
When someone asks whether the defendant is guilty, then the decision-maker needs to reason and to answer the question. Under the geometric representation of quantum events, a decision-maker makes up his mind by projecting the superposition state $|S\rangle$ into the subspace that is related to this answer. If the decision-maker thinks the defendant is guilty, then a projection operator $P_G$ is applied, which maps the superposition vector $|S\rangle$ to the $|Guilty\rangle$ subspace. In the same way, if the decision-maker thinks the defendant is innocent, then another projection operator $P_I$ is applied, mapping the superposition vector $|S\rangle$ to the $|Innocent\rangle$ subspace (Figure 2.4). This act of projecting the wave function into some subspace corresponds to the projection postulate of quantum mechanics.

\[
|S\rangle = \frac{e^{i\theta_G}}{\sqrt{2}} |Guilty\rangle + \frac{e^{i\theta_I}}{\sqrt{2}} |Innocent\rangle
\]

The projection operators $P_G$ and $P_I$ correspond to the outer product (also called tensor product) of the basis states $|G\rangle$ and $|I\rangle$, respectively.

\[
P_G = |Guilty\rangle\langle Guilty| = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \qquad P_I = |Innocent\rangle\langle Innocent| = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}
\]
In order to compute the probability of a defendant being guilty or innocent, under the geometric approach of quantum theory, we measure the length of the vector that results from projecting the superposition vector $|S\rangle$ onto the desired subspace. The squared length (squared magnitude) of this vector is interpreted as the probability of the event. The interpretation of the squared magnitude of the wave function as the probability of finding a particle in a given region of space was proposed by Max Born in 1926 (The Statistical Interpretation of Quantum Mechanics [185]). This relation is known as Born's rule. We will give more details about this rule in Section 2.5.

$$Pr(\text{Guilty}) = \|P_G|S\rangle\|^2 = (P_G|S\rangle)^{\dagger}(P_G|S\rangle) = \langle S|P_G|S\rangle, \qquad Pr(\text{Guilty}) = \left|\frac{e^{i\theta_G}}{\sqrt{2}}\right|^2 = 0.5$$

$$Pr(\text{Innocent}) = \|P_I|S\rangle\|^2 = (P_I|S\rangle)^{\dagger}(P_I|S\rangle) = \langle S|P_I|S\rangle, \qquad Pr(\text{Innocent}) = \left|\frac{e^{i\theta_I}}{\sqrt{2}}\right|^2 = 0.5$$

Note that, in quantum mechanics, the act of measurement corresponds to the collapse of the wave function onto a certain subspace with some probability. Although the evolution of the wave function is fully deterministic, the outcome of a measurement is random.
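The projection and Born's rule computation above can be sketched numerically. The following is a minimal illustration in plain Python, not part of the original model: the helper names (`project`, `squared_norm`, `superposition`) and the phase values are hypothetical and chosen only for this example.

```python
import cmath
import math

def project(projector, state):
    """Apply a 2x2 projector (nested lists) to a 2-component state vector."""
    return [sum(projector[r][c] * state[c] for c in range(2)) for r in range(2)]

def squared_norm(vector):
    """Squared length of a complex vector."""
    return sum(abs(component) ** 2 for component in vector)

# Projectors onto the |Guilty> and |Innocent> basis states.
P_G = [[1, 0], [0, 0]]
P_I = [[0, 0], [0, 1]]

def superposition(theta_g, theta_i):
    """Equal-weight superposition |S> with arbitrary phases theta_g, theta_i."""
    return [cmath.exp(1j * theta_g) / math.sqrt(2),
            cmath.exp(1j * theta_i) / math.sqrt(2)]

# The phase values below are arbitrary (assumed for illustration only).
S = superposition(theta_g=0.3, theta_i=1.7)

# Born's rule: probability = squared length of the projected vector.
pr_guilty = squared_norm(project(P_G, S))
pr_innocent = squared_norm(project(P_I, S))
print(pr_guilty, pr_innocent)
```

Whatever phases are chosen, both probabilities come out as 0.5 up to floating-point error, since a single phase factor has unit magnitude and disappears when the squared magnitude is taken.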
2.1.5 State Revision
State revision corresponds to the situation where after observing an event, we are interested in observing other events given that the previous one has occurred.
In classical theory, this is addressed through the conditional probability formula $Pr(B|A) = \frac{Pr(A \cap B)}{Pr(A)}$. So, returning to our example, suppose that some evidence has been given to the juror proving that the defendant is actually guilty. What is then the probability of him being innocent, given that it has been proved that he is guilty? This is computed in the following way.

$$Pr(\text{Innocent}|\text{Guilty}) = \frac{Pr(\text{Innocent} \cap \text{Guilty})}{Pr(\text{Guilty})} = 0$$
Since the events Guilty and Innocent are mutually exclusive, their intersection is empty, leading to a zero probability value. In quantum probability theory, the state revision is given by first projecting the superposition state $|S\rangle$ onto the subspace representing the observed event. Then, the projection is normalised such that the resulting vector has unit length. Again, if we want to determine the probability of a defendant being innocent, given he was found guilty, the calculations are performed as follows. We first start in the superposition state vector $|S\rangle$.

$$|S\rangle = \frac{1}{\sqrt{2}}\,|\text{Guilty}\rangle + \frac{1}{\sqrt{2}}\,|\text{Innocent}\rangle$$

Then, we observe that the defendant is guilty, so we project the state vector $|S\rangle$ onto the Guilty subspace and normalise the resulting projection.

$$|S_g\rangle = \frac{(1/\sqrt{2})\,|\text{Guilty}\rangle}{\sqrt{\left|1/\sqrt{2}\right|^2}}$$

$$|S_g\rangle = 1\,|\text{Guilty}\rangle + 0\,|\text{Innocent}\rangle$$

From the resulting state, we extract the probability of being innocent by simply squaring the respective probability amplitude, and find that we achieve the same result as in classical theory.

$$Pr(\text{Innocent}|\text{Guilty}) = 0^2 = 0$$
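The state revision steps above (project, normalise, read off the squared amplitude) can be sketched as follows. This is a minimal illustration in plain Python; the function names and the index constants `GUILTY` and `INNOCENT` are hypothetical and introduced only for this example.

```python
import math

GUILTY, INNOCENT = 0, 1  # positions of the basis states in the state vector

def project_onto_basis(state, index):
    """Keep only the component of `state` along basis vector `index`."""
    return [amp if i == index else 0 for i, amp in enumerate(state)]

def normalise(vector):
    """Rescale a vector to unit length."""
    length = math.sqrt(sum(abs(a) ** 2 for a in vector))
    return [a / length for a in vector]

# Initial superposition |S> = (1/sqrt 2)|Guilty> + (1/sqrt 2)|Innocent>.
S = [1 / math.sqrt(2), 1 / math.sqrt(2)]

# Observe "Guilty": project onto the Guilty subspace, then normalise.
S_g = normalise(project_onto_basis(S, GUILTY))

# Probability of Innocent in the revised state (squared amplitude).
pr_innocent_given_guilty = abs(S_g[INNOCENT]) ** 2
print(S_g, pr_innocent_given_guilty)
```

The revised state has all of its amplitude on the Guilty component, so the probability of Innocent is zero, in agreement with the classical conditional probability.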
2.1.6 Compatibility and Incompatibility
So far, although classical theory and quantum probability theory are different, they achieved the same results in our juror/defendant example. This happened because the events were compatible. Classical theory follows the principle of unicity [78], which states that, for any given experiment, a single sample space contains all events. This means that a single probability function is sufficient to assign probabilities to all events. As a consequence, operations such as intersection or union between events are always possible and well defined. Quantum theory follows the unicity principle only if the events are spanned by the same basis vectors, that is, if the events are compatible. When events are spanned by a common set of basis vectors, intersection and union operations are also possible and well defined, and a single probability function is enough to assign probability values to outcomes. This means that, when we are dealing with compatible events, quantum theory reduces to classical probability theory [34].
The incompatibility phenomenon occurs only under quantum probability theory. If two events are spanned by different basis vectors, then operations such as intersection and union are not defined [34]. This is the major difference between the classical theory and the quantum theory. In order to compute probabilities of incompatible events, one should turn to Lüders' rule [153]. This rule states that, in order to compute the probability of two incompatible events A and B, one should treat them as a sequence: first compute the probability of observing event A, and then the probability of observing event B given the state revised by A. Following the steps from Nielsen & Chuang [152] and Busemeyer & Bruza [34], the probability of two incompatible events is given as follows. Let A be an event spanned by the basis vectors $V = \{|V_i\rangle\}$, $i = 1, \dots, N$, and let
B be an event spanned by the basis vectors $W = \{|W_i\rangle\}$, $i = 1, \dots, N$. After observing event A, we obtain the revised state:

$$|S_A\rangle = \frac{P_A|S\rangle}{\|P_A|S\rangle\|}$$

The probability of event A is simply the squared length of the projection of the state onto the subspace of A, that is, $Pr(A) = \|P_A|S\rangle\|^2$. The probability of event B, given that we observed event A, is given by the squared length of the projection of the revised state vector $|S_A\rangle$ onto the subspace related to event B, that is, $Pr(B|A) = \|P_B|S_A\rangle\|^2$. So, according to Lüders' rule, the probability of A followed by B is given by $Pr(A) \cdot Pr(B|S_A)$, which expands as follows [34].

$$\begin{aligned}
Pr(A) \cdot Pr(B|S_A) &= \|P_A|S\rangle\|^2 \cdot \|P_B|S_A\rangle\|^2 \\
&= \|P_A|S\rangle\|^2 \cdot \left\| P_B \frac{P_A|S\rangle}{\|P_A|S\rangle\|} \right\|^2 \\
&= \|P_A|S\rangle\|^2 \cdot \frac{1}{\|P_A|S\rangle\|^2} \cdot \|P_B P_A|S\rangle\|^2 \\
&= \|P_B P_A|S\rangle\|^2
\end{aligned}$$
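The identity derived above can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: it uses a two-dimensional real vector space, a hypothetical initial state `S = [0.6, 0.8]`, and a hypothetical angle `phi` between the basis vector spanning A and the one spanning B. None of these values come from the text.

```python
import math

def outer(v):
    """Projector |v><v| for a real unit vector v, as nested lists."""
    return [[a * b for b in v] for a in v]

def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def sq_norm(vector):
    """Squared length of a vector."""
    return sum(abs(x) ** 2 for x in vector)

def normalise(vector):
    """Rescale a vector to unit length."""
    length = math.sqrt(sq_norm(vector))
    return [x / length for x in vector]

phi = math.pi / 6                              # assumed angle between the bases
P_A = outer([1.0, 0.0])                        # event A: spanned by |V_1>
P_B = outer([math.cos(phi), math.sin(phi)])    # event B: spanned by a rotated |W_1>
S = [0.6, 0.8]                                 # assumed unit-length initial state

pr_A = sq_norm(apply(P_A, S))                  # Pr(A) = ||P_A|S>||^2
S_A = normalise(apply(P_A, S))                 # revised state after observing A
pr_B_given_A = sq_norm(apply(P_B, S_A))        # Pr(B|S_A) = ||P_B|S_A>||^2

sequential = pr_A * pr_B_given_A               # left-hand side of the derivation
direct = sq_norm(apply(P_B, apply(P_A, S)))    # right-hand side ||P_B P_A|S>||^2
reversed_order = sq_norm(apply(P_A, apply(P_B, S)))  # B first, then A

print(sequential, direct, reversed_order)
```

In this example the two sides of the derivation agree, while applying the projectors in the opposite order gives a different value. This is one way to see why a single joint event "A and B" is not defined for incompatible events: the result depends on the order of the sequence.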
2.2 Interference Effects
Quantum interference is a challenging principle that essentially states that an individual elementary particle, such as a photon, can cross its own trajectory and interfere with the direction of its path, causing destructive or constructive effects. Mapping these effects to quantum cognition, the reasoning process of a decision-maker can be modelled by a set of waves moving across time over a state space until a final decision is made. Under this perspective, interference can be regarded as a chain of waves in a superposition state, coming from different directions. When these waves collide, one can experience a destructive effect (one wave cancels the other) or a constructive effect (one wave merges with another). In either case, the final probabilities associated with each wave are affected.
2.2.1 The Double Slit Experiment
The most famous example of quantum interference is the double slit experiment [56]. The experiment consisted of firing electrons towards a barrier with two slits. A detector screen would then record the pattern produced by the electrons. What makes the experiment interesting is that, when a detector was put near the slits to check which slit each electron was crossing, the detector screen showed a pattern with the size and shape of the two slits, with no interference. This means that the electrons behaved like particles. When no detector was put near the slits (that is, no measurement was being made), the detector screen revealed an interference pattern, characteristic of waves, which can collide with each other, causing destructive interference (when they cancel each other) or constructive interference (when they merge and their amplitude increases). Mathematically, this can be represented as the application of a unitary matrix to a superposition state. A unitary matrix can be seen as a rotation of the vector space (possibly combined with a reflection) that preserves the length of the vectors; it has orthonormal eigenvectors and eigenvalues of unit magnitude. Let's consider a unitary matrix, U, that maps each slit state to an equal-weight combination of the detector states (a 45° rotation of the vector space combined with a reflection) and simply represents the electron passing from slit s1 or s2 to the top or bottom of the detector, d1 and d2, respectively. Let's also define the initial superposition vector of the electron, which can go through either slit s1 or s2, as represented in Figure 2.5.
Figure 2.5: The double slit experiment. Electrons are fired and they can pass through one of the slits (either s1 or s2) to reach a detector screen at points d1 or d2. If we measure which slit the electron went through, then the pattern on the detector will have the shape and size of the two slits, suggesting a particle behaviour of the electron. If we do not measure which slit the electron is going through, then the electron behaves as a wave and produces an interference pattern on the detector screen, with one point detecting constructive interference and another point detecting destructive interference.
Then, if we observe (or measure) that an electron passed through the first slit, s1, one can compute the resulting quantum state corresponding to the electron reaching the detector as:

$$U|S_{s_1}\rangle = \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix} \rightarrow Pr(d_1) = \left|1/\sqrt{2}\right|^2 = 0.5 = Pr(d_2),$$

which gives a 50% chance of reaching either the top or the bottom part of the detector and shows that the electron behaved like a particle. If, however, we do not observe which slit the electron passed through, then

$$U|S\rangle = \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix} \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \rightarrow Pr(d_1) = 1 \text{ and } Pr(d_2) = 0,$$

which shows a destructive interference of waves for the bottom of the detector and a constructive interference effect when reaching its top. This means that applying a unitary matrix that has a randomising effect on the electron (we do not know which slit the electron went through) to a random initial superposition state of the electron leads to a fully deterministic outcome (Aaronson [2]). This effect does not occur under a classical setting, because we cannot specify negative probabilities in classical probability theory. But in quantum probability theory, we can specify positive numbers, negative numbers and complex numbers through quantum probability amplitudes. These negative amplitudes are the reason why waves cancel each other or interfere constructively and are the core of quantum mechanics.
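The two cases above can be reproduced with a few lines of plain Python. This is a minimal sketch; the variable names (`measured_s1`, `unmeasured`) and the helper functions are hypothetical and used only for illustration.

```python
import math

r = 1 / math.sqrt(2)

# Unitary matrix U from the text: rows map slit amplitudes to detector amplitudes.
U = [[r, r],
     [r, -r]]

def apply(matrix, vector):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def probabilities(vector):
    """Born's rule applied to each component: squared magnitude of each amplitude."""
    return [abs(a) ** 2 for a in vector]

measured_s1 = [1.0, 0.0]   # we observed the electron at slit s1
unmeasured = [r, r]        # no observation: superposition of s1 and s2

pr_measured = probabilities(apply(U, measured_s1))   # [Pr(d1), Pr(d2)]
pr_unmeasured = probabilities(apply(U, unmeasured))  # [Pr(d1), Pr(d2)]
print(pr_measured)
print(pr_unmeasured)
```

In the unmeasured case, the amplitude for d2 is the sum of a positive and a negative term of equal size, which cancel exactly, while the two positive terms for d1 add up. The cancellation happens at the level of amplitudes, before the squared magnitude is taken.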
2.2.2 Derivation of Interference Effects from Complex Numbers
Interference effects can be naturally derived from the rules of complex numbers. The relation between the classical probability of some event A, $Pr(A)$, and its quantum probability amplitude, $e^{i\theta_A}\psi(A)$, is given by Born's rule in Equation 2.4. A quantum amplitude corresponds to the amplitude of a wave and is described by a complex number. The term $e^{i\theta_A}$ is the phase of the amplitude.

$$Pr(A) = \left| e^{i\theta_A}\psi(A) \right|^2 \qquad (2.4)$$
Suppose that events $A_1, A_2, \dots, A_N$ form a set of mutually disjoint events whose union is the whole sample space, $\Omega$. Then, for any other event B, the classical law of total probability can be formulated as in Equation 2.5.

$$Pr(B) = \sum_{i=1}^{N} Pr(A_i)\,Pr(B|A_i), \quad \text{where } \sum_{i=1}^{N} Pr(A_i) = 1 \qquad (2.5)$$

The quantum law of total probability can be derived from Equation 2.5 by applying Born's rule (Equation 2.4):

$$Pr(B) = \left| \sum_{x=1}^{N} e^{i\theta_x}\psi(A_x)\,\psi(B|A_x) \right|^2, \quad \text{where } \sum_{x=1}^{N} \left| e^{i\theta_x}\psi(A_x) \right|^2 = 1 \qquad (2.6)$$

For simplicity, we will expand Equation 2.6 for N = 3 and only later find the general formula for N events:

$$Pr(B) = \left| e^{i\theta_1}\psi(A_1)\psi(B|A_1) + e^{i\theta_2}\psi(A_2)\psi(B|A_2) + e^{i\theta_3}\psi(A_3)\psi(B|A_3) \right|^2 \qquad (2.7)$$
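Before expanding Equation 2.7 symbolically, the difference between Equations 2.5 and 2.6 can be seen numerically. The sketch below is only an illustration under stated assumptions: the probability values and phases are hypothetical, and the amplitudes are assumed to have magnitudes $\psi(A_x) = \sqrt{Pr(A_x)}$ and $\psi(B|A_x) = \sqrt{Pr(B|A_x)}$, so that Born's rule recovers the classical values term by term.

```python
import cmath
import math

def classical_total(pr_a, pr_b_given_a):
    """Classical law of total probability (Equation 2.5)."""
    return sum(pa * pb for pa, pb in zip(pr_a, pr_b_given_a))

def quantum_total(pr_a, pr_b_given_a, thetas):
    """Squared magnitude of the summed amplitudes (Equation 2.6).

    Assumes amplitude magnitudes are square roots of the classical values.
    """
    amplitude = sum(cmath.exp(1j * t) * math.sqrt(pa) * math.sqrt(pb)
                    for pa, pb, t in zip(pr_a, pr_b_given_a, thetas))
    return abs(amplitude) ** 2

# Hypothetical values for N = 3 (not taken from any experiment).
pr_a = [0.5, 0.3, 0.2]           # Pr(A_1), Pr(A_2), Pr(A_3); sums to 1
pr_b_given_a = [0.1, 0.2, 0.3]   # Pr(B|A_1), Pr(B|A_2), Pr(B|A_3)
thetas = [0.0, 1.0, 2.5]         # arbitrary phases theta_1..theta_3

classical = classical_total(pr_a, pr_b_given_a)
quantum = quantum_total(pr_a, pr_b_given_a, thetas)
print(classical, quantum, quantum - classical)
```

The two results differ because the squared magnitude is taken after the amplitudes are summed, not before. Shifting all three phases by the same constant leaves the quantum result unchanged, so only the phase differences matter.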
Next, we compute the magnitude in Equation 2.7 by multiplying it with its complex conjugate: