Causality Through Directed Information

Young-Han Kim, University of California, San Diego
SNU Institute for Research in Finance and Economics, April

Joint work with Jiantao Jiao (Stanford), Haim Permuter (Ben Gurion), Tsachy Weissman (Stanford), and Lei Zhao (Jump Operations).

Supported in part by the National Science Foundation (NSF), the US–Israel Binational Science Foundation (BSF), and the BSF Bergmann Memorial Award.

Related publications
∙ Haim H. Permuter, Young-Han Kim, and Tsachy Weissman, “Interpretations of directed information in portfolio theory, data compression, and hypothesis testing,” IEEE Transactions on Information Theory.
∙ Tsachy Weissman, Young-Han Kim, and Haim H. Permuter, “Directed information, causal estimation, and communication in continuous time,” IEEE Transactions on Information Theory.
∙ Jiantao Jiao, Lei Zhao, Haim H. Permuter, Young-Han Kim, and Tsachy Weissman, “Universal estimation of directed information,” to appear in IEEE Transactions on Information Theory.

For more information, visit http://circuit.ucsd.edu/~yhk

Shannon’s information measures
∙ Entropy: “uncertainty in a random variable X”
    H(X) = ∑_x p(x) log(1/p(x))
∙ Mutual information: “information about X provided by Y”
    I(X; Y) = H(X) + H(Y) − H(X, Y)
∙ Relative entropy (Kullback–Leibler divergence): “distinction between p and q”
    D(p ‖ q) = ∑_x p(x) log(p(x)/q(x))

Where do they come from?
∙ Mathematical communication theory (Shannon)
  – Fundamental limits on communication and compression
  – Probability theory and statistics
∙ Axiomatic definitions (Aczél–Daróczy)
  – “Reasonable” properties for information measures
  – Functional equations: f(p × q) = f(p) + f(q) ⇒ f ≅ H
∙ How about finance and economics?

Gambling in horse races
∙ Horses: 1, 2, ..., m
∙ Odds: o(1), o(2), ..., o(m) (say, o(x) ≡ m)
∙ Win probabilities: p(1), p(2), ..., p(m)

Optimal gambling
∙ Bets: b(1), b(2), ..., b(m)
  – No short: b(x) ≥ 0, x = 1, 2, ..., m
  – No margin: ∑_x b(x) = 1
  – In other words, b(x) lies in the probability simplex
∙ Payoff: if horse x wins (with probability p(x)), each unit of wealth turns into b(x) o(x)

Question: How should we choose our portfolio b(x)?
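To make the question concrete before the answer on the next slide, here is a minimal numerical sketch (not from the talk; the horse count, odds, and win probabilities are made-up assumptions). It evaluates the expected log-growth E[log(b(X) o(X))] of two candidate portfolios under uniform odds o(x) ≡ m.

```python
# A minimal sketch of the portfolio question (illustrative numbers, not
# from the talk): m = 3 horses, uniform odds o(x) = m, and a hypothetical
# win distribution p. Growth is measured in bits (log base 2).
import numpy as np

m = 3
p = np.array([0.5, 0.3, 0.2])      # hypothetical win probabilities
o = np.full(m, float(m))           # uniform odds o(x) = m

def growth_rate(b, p, o):
    """Expected log-growth E[log(b(X) o(X))] of a portfolio b in the simplex."""
    return np.sum(p * np.log2(b * o))

# Candidate 1: bet (almost) everything on the favorite.
b_all_in = np.array([1.0 - 2e-9, 1e-9, 1e-9])
# Candidate 2: spread bets uniformly.
b_uniform = np.full(m, 1.0 / m)

print(growth_rate(b_all_in, p, o))   # large negative: near-ruin whenever the favorite loses
print(growth_rate(b_uniform, p, o))  # 0.0: log(b(x) o(x)) = log 1 for every x
```

Going all-in is ruinous on any loss, while the uniform bet merely preserves wealth; the optimal choice is given next.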
Kelly gambling and log-optimal portfolio
∙ Kelly, “A new interpretation of information rate”: b*(x) = p(x)
∙ Maximize E[log(b(X) o(X))]
  – Logarithmic utility
  – Growth rate optimality
  – Competitive optimality (Bell–Cover)
  – Other properties (MacLean–Thorp–Ziemba)
∙ Optimal growth rate:
    W*(X) = max_{b(x)} E[log(b(X) o(X))] = E[log o(X)] − H(X)
∙ With o(x) ≡ m,
    W*(X) = log m − H(X)

Entropy
    H(X) = ∑_x p(x) log(1/p(x)) = E[log(1/p(X))]
∙ Amount of randomness (information, uncertainty) in X
∙ Fundamental limit on lossless compression (Shannon)
∙ Can be generalized to measures other than the counting measure
∙ Conditional entropy:
    H(X|Y) = ∑_{x,y} p(x, y) log(1/p(x|y)) = E[log(1/p(X|Y))]

Gambling with side information
∙ Side information Y about the horse race outcome X
∙ Bets: b(x|y), x = 1, 2, ..., m
∙ Kelly gambling: b*(x|y) = p(x|y)
∙ Optimal growth rate:
    W*(X|Y) = max_{b(x|y)} E[log(b(X|Y) o(X))] = E[log o(X)] − H(X|Y)
∙ Value of side information (Kelly):
    ΔW = W*(X|Y) − W*(X) = I(X; Y)

Mutual information
    I(X; Y) = H(X) + H(Y) − H(X, Y) = H(X) − H(X|Y) = H(Y) − H(Y|X)
∙ Amount of information about X provided by Y (and vice versa)
∙ For a general stock market (Barron–Cover): ΔW ≤ I(X; Y)
∙ Fundamental limit on communication (Shannon)
∙ Fundamental limit on lossy compression/quantization (Shannon)
∙ Can be generalized to any pair of random objects
∙ Conditional mutual information:
    I(X; Y|Z) = H(X|Z) + H(Y|Z) − H(X, Y|Z)
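Kelly’s identity ΔW = W*(X|Y) − W*(X) = I(X; Y) can be checked exactly from any joint pmf. The sketch below does so for uniform odds o(x) ≡ m, using an illustrative joint pmf p(x, y) (the numbers are assumptions, not from the talk).

```python
# Exact check of Delta W = I(X; Y) for uniform odds o(x) = m, using a small
# made-up joint pmf p(x, y). Any strictly positive joint pmf works; all
# quantities are in bits.
import numpy as np

p_xy = np.array([[0.30, 0.10],     # p(x, y): rows x = 1..3, columns y = 1..2
                 [0.05, 0.25],
                 [0.15, 0.15]])
m = p_xy.shape[0]

def H(q):
    """Entropy in bits of a pmf q (assumed strictly positive)."""
    return -np.sum(q * np.log2(q))

p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)
H_x_given_y = H(p_xy) - H(p_y)            # H(X|Y) = H(X, Y) - H(Y)

W_star      = np.log2(m) - H(p_x)         # W*(X)   = log m - H(X)
W_star_side = np.log2(m) - H_x_given_y    # W*(X|Y) = log m - H(X|Y)
I_xy        = H(p_x) - H_x_given_y        # I(X;Y)  = H(X) - H(X|Y)

print(W_star_side - W_star, I_xy)         # identical: Delta W = I(X; Y)
```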
Repeated gambling in horse races with memory
∙ Win probabilities: p(x_1), p(x_2|x_1), p(x_3|x_1, x_2), ..., p(x_n|x^{n−1})
∙ Odds: o(x_i) ≡ m
∙ Bets: b(x_1), b(x_2|x_1), b(x_3|x_1, x_2), ..., b(x_n|x^{n−1})
∙ Kelly gambling: b*(x_i|x^{i−1}) = p(x_i|x^{i−1}), i = 1, 2, ...
∙ Optimal growth rate:
    W*(X^n) = log m − (1/n) H(X^n) = log m − (1/n) ∑_{i=1}^n H(X_i|X^{i−1})
∙ If the horse race process {X_n} is stationary ergodic, then
  – (1/n) H(X^n) → H*(X) (the entropy rate)
  – W*(X^n) → W*(X)
  – wealth ≐ 2^{n W*} almost surely (Shannon, McMillan, Breiman)

Gambling with causal side information
∙ Side information Y_1, Y_2, ...
∙ Bets: b(x_i|x^{i−1}, y^i)
∙ Kelly gambling: b*(x_i|x^{i−1}, y^i) = p(x_i|x^{i−1}, y^i), i = 1, 2, ...
∙ Optimal growth rate:
    W*(X^n ‖ Y^n) = log m − (1/n) ∑_{i=1}^n H(X_i|X^{i−1}, Y^i) = log m − (1/n) H(X^n ‖ Y^n)
∙ If {(X_n, Y_n)} is stationary ergodic, then (1/n) H(X^n ‖ Y^n) → H*(X ‖ Y)
∙ Value of causal side information (Permuter–K–Weissman):
    ΔW = W*(X^n ‖ Y^n) − W*(X^n) = (1/n)(H(X^n) − H(X^n ‖ Y^n)) = (1/n) I(Y^n → X^n)

Directed information
    I(Y^n → X^n) = H(X^n) − H(X^n ‖ Y^n) = ∑_{i=1}^n I(Y^i; X_i|X^{i−1})
∙ Amount of information about X causally provided by Y
∙ For a general stock market: ΔW ≤ (1/n) I(Y^n → X^n)
∙ Arrow of time: directed and asymmetric
    I(Y^n → X^n) ≠ I(X^n → Y^n)
∙ Fundamental limit on feedback communication (Tatikonda–Mitter, K, Permuter–Weissman–Goldsmith)
∙ Can be generalized to continuous time (Weissman–K–Permuter)

Test for causal dependence
[Figure: two coupled systems, each with a controller p(x_i|x^{i−1}, y^{i−1}) and an output generator; under causal dependence the output generator is p(y_i|x^i, y^{i−1}), and under no causal dependence it is p(y_i|y^{i−1}).]
∙ Type-I and type-II error probabilities: α = P(A^c | causal dependence), β = P(A | no causal dependence)
∙ Chernoff–Stein lemma for the causal dependence test:
    β* = min_{A ⊆ 𝒳^n × 𝒴^n : α < ε} β ≐ 2^{−I(X^n → Y^n)}

Brief history
∙ Marko, “The bidirectional communication theory: A generalization of information theory”
  – Direction of information flow for mutually coupled statistical systems
  – Cybernetics: group behavior with monkeys
∙ Massey, “Causality, feedback, and directed information”

Relationship to other notions of causality
∙ Granger causality (Granger, Geweke):
    G(X^n → Y^n) = ∑_{i=1}^n log( LMMSE(Y_i | Y_{i−p}^{i−1}) / LMMSE(Y_i | Y_{i−p}^{i−1}, X_{i−p}^{i}) )
  – The higher G(X^n → Y^n) is, the more X influences Y
  – If {(X_n, Y_n)} is Gauss–Markov of order p, then I(X^n → Y^n) ≡ G(X^n → Y^n)
∙ Transfer entropy (Schreiber):
    T_i(X → Y) = I(X^{i−1}; Y_i | Y^{i−1})
  – The higher T_i(X → Y) is, the more X influences Y (with one step delay)
  – If {(X_n, Y_n)} is stationary, then (1/n) I(X^{n−1} → Y^n) → T(X → Y)

Causal conditioning
∙ Causally conditional probability (Kramer):
    p(y^n ‖ x^n) = ∏_{i=1}^n p(y_i | x^i, y^{i−1})
    p(y^n ‖ x^{n−1}) = ∏_{i=1}^n p(y_i | x^{i−1}, y^{i−1})
∙ Causally conditional entropy:
    H(Y^n ‖ X^n) = −E[log p(Y^n ‖ X^n)]
    H(Y^n ‖ X^{n−1}) = −E[log p(Y^n ‖ X^{n−1})]
∙ Chain rules:
    p(x^n, y^n) = p(x^n ‖ y^n) p(y^n ‖ x^{n−1}) = p(x^n ‖ y^{n−1}) p(y^n ‖ x^n)
    H(X^n, Y^n) = H(X^n ‖ Y^n) + H(Y^n ‖ X^{n−1}) = H(X^n ‖ Y^{n−1}) + H(Y^n ‖ X^n)

Properties of directed information
    I(X^n → Y^n) = H(Y^n) − H(Y^n ‖ X^n)
    I(X^{n−1} → Y^n) = H(Y^n) − H(Y^n ‖ X^{n−1})
∙ I(X^n → Y^n) ≤ I(X^n; Y^n)
∙ I(X^n → Y^n) = I(X^n; Y^n) if p(x^n ‖ y^{n−1}) = p(x^n)
∙ I(X^n → Y^n) = I(X^n; Y^n) = n I(X; Y) if {(X_n, Y_n)} is IID
∙ Conservation law:
    I(X^n; Y^n) = I(X^n → Y^n) + I(Y^{n−1} → X^n) = I(X^{n−1} → Y^n) + I(Y^n → X^n)
∙ Measure of causal influence
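The conservation law above is a finite, exactly checkable identity. The following sketch verifies I(X^n; Y^n) = I(X^n → Y^n) + I(Y^{n−1} → X^n) for n = 2 and binary alphabets, using a randomly drawn joint pmf (purely illustrative); the causally conditional entropies are expanded term by term from their definitions.

```python
# Exact check of the conservation law for n = 2 and binary X, Y, with a
# randomly drawn joint pmf p(x1, x2, y1, y2) (purely illustrative).
import numpy as np

rng = np.random.default_rng(0)
p = rng.random((2, 2, 2, 2))
p /= p.sum()                       # axes: 0 = x1, 1 = x2, 2 = y1, 3 = y2
X1, X2, Y1, Y2 = 0, 1, 2, 3

def H(axes):
    """Joint entropy (bits) of the variables sitting on the given axes of p."""
    drop = tuple(a for a in range(4) if a not in axes)
    q = p.sum(axis=drop) if drop else p
    return -np.sum(q * np.log2(q))

# Causally conditional entropies, expanded term by term via H(A|B) = H(A,B) - H(B):
# H(Y^2 || X^2) = H(Y1|X1) + H(Y2|X1, X2, Y1)
H_Y_causal_X  = (H((X1, Y1)) - H((X1,))) + (H((X1, X2, Y1, Y2)) - H((X1, X2, Y1)))
# H(X^2 || Y^1) = H(X1) + H(X2|X1, Y1)   (side information delayed one step)
H_X_causal_Y1 = H((X1,)) + (H((X1, X2, Y1)) - H((X1, Y1)))

I_X_to_Y  = H((Y1, Y2)) - H_Y_causal_X          # I(X^2 -> Y^2)
I_Y1_to_X = H((X1, X2)) - H_X_causal_Y1         # I(Y^1 -> X^2)
I_mutual  = H((X1, X2)) + H((Y1, Y2)) - H((X1, X2, Y1, Y2))

print(I_mutual, I_X_to_Y + I_Y1_to_X)           # conservation law: equal
```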
Universal estimation of directed information
∙ In reality, the probability distribution may not be known or may not even exist
∙ Something out of nothing:
  – Can we perform as if the distribution were known?
  – Can we perform as well as the best estimator in a given class?
∙ Answer: Yes! (Jiao–Zhao–Permuter–K–Weissman)

Universal probability assignments
∙ Probability assignment: q(x^n)
∙ Sequential probability assignment: q(x_1), q(x_2|x_1), q(x_3|x_1, x_2), ..., q(x_n|x^{n−1})
∙ A probability assignment q is universal if
    lim_{n→∞} (1/n) D(p(x^n) ‖ q(x^n)) = 0
  for every stationary distribution p
∙ A probability assignment q is pointwise universal if
    lim sup_{n→∞} (1/n) log(p(X^n)/q(X^n)) ≤ 0  p-a.s.
  for every stationary ergodic distribution p
∙ (Pointwise) universal probability assignments:
  – Compression-based approaches: Ziv–Lempel, Willems–Shtarkov–Tjalkens
  – Ergodic-theoretic approaches: Ornstein, Morvai–Yakowitz–Algoet

Algorithm
    Î(X^n → Y^n) = Ĥ(Y^n) − Ĥ(Y^n ‖ X^n)
∙ Ĥ(Y^n) = −(1/n) log q(Y^n) and Ĥ(Y^n ‖ X^n) = −(1/n) log q(Y^n ‖ X^n)
∙ Consistency: with a (pointwise) universal q, Î(X^n → Y^n) converges to the directed information rate for stationary ergodic processes (under mild conditions)
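As a toy illustration of this plug-in estimator: the paper pairs it with a genuinely universal assignment such as context-tree weighting; the sketch below instead substitutes a Krichevsky–Trofimov (add-1/2) assignment with order-1 contexts, which is adequate only for first-order data. That substitution, and the toy channel in the usage example, are assumptions of this sketch, not the paper’s algorithm.

```python
# A toy plug-in version of the estimator above, with a Krichevsky-Trofimov
# (add-1/2) sequential assignment over order-1 contexts standing in for a
# universal assignment such as context-tree weighting.
import numpy as np
from collections import defaultdict

def kt_log2_prob(symbols, contexts):
    """Sequential assignment: sum_i log2 q(symbol_i | past counts in its context)."""
    counts = defaultdict(lambda: np.zeros(2))    # binary alphabet assumed
    total = 0.0
    for s, c in zip(symbols, contexts):
        n = counts[c]
        total += np.log2((n[s] + 0.5) / (n.sum() + 1.0))   # KT add-1/2 rule
        n[s] += 1
    return total

def directed_info_rate(x, y):
    """Estimate (1/n) I(X^n -> Y^n) = H_hat(Y^n) - H_hat(Y^n || X^n)."""
    n = len(y)
    ctx_y  = [(y[i - 1] if i else -1,) for i in range(n)]          # y_{i-1}
    ctx_yx = [(y[i - 1] if i else -1, x[i]) for i in range(n)]     # (y_{i-1}, x_i)
    return (-kt_log2_prob(y, ctx_y) + kt_log2_prob(y, ctx_yx)) / n

# Toy usage: Y_i equals X_i flipped with probability 0.1, so X drives Y.
rng = np.random.default_rng(1)
x = rng.integers(0, 2, 100_000)
y = np.where(rng.random(x.size) < 0.9, x, 1 - x)
print(directed_info_rate(x, y))                           # ~ 1 - H_b(0.1) = 0.53 bits
print(directed_info_rate(x, rng.integers(0, 2, x.size)))  # independent Y: ~ 0
```

On the toy channel the estimate approaches 1 − H_b(0.1) ≈ 0.53 bits, and drops to about 0 when Y is independent of X.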
