Part II: Variational Bayes and beyond

Tamara Broderick ITT Career Development Assistant Professor, MIT

http://www.tamarabroderick.com/tutorial_2018_lugano.html Approximate Bayesian inference

qAAAB6nicbVBNS8NAEJ34WetX1aOXxSKIh5KIoMeiF48V7Qe0sWy2k3bpZhN3N0IJ/QlePCji1V/kzX/jts1BWx8MPN6bYWZekAiujet+O0vLK6tr64WN4ubW9s5uaW+/oeNUMayzWMSqFVCNgkusG24EthKFNAoENoPh9cRvPqHSPJb3ZpSgH9G+5CFn1Fjp7vHhtFsquxV3CrJIvJyUIUetW/rq9GKWRigNE1Trtucmxs+oMpwJHBc7qcaEsiHtY9tSSSPUfjY9dUyOrdIjYaxsSUOm6u+JjEZaj6LAdkbUDPS8NxH/89qpCS/9jMskNSjZbFGYCmJiMvmb9LhCZsTIEsoUt7cSNqCKMmPTKdoQvPmXF0njrOK5Fe/2vFy9yuMowCEcwQl4cAFVuIEa1IFBH57hFd4c4bw4787HrHXJyWcO4A+czx/01o2R ⇤ p( y) Use to approximate AAAB83icbVBNS8NAEJ3Ur1q/qh69LBahXkoigh6LXjxWsB/QhLLZbtqlm82yuxFC7N/w4kERr/4Zb/4bt20O2vpg4PHeDDPzQsmZNq777ZTW1jc2t8rblZ3dvf2D6uFRRyepIrRNEp6oXog15UzQtmGG055UFMchp91wcjvzu49UaZaIB5NJGsR4JFjECDZW8mXdJ8PEoCeUnQ+qNbfhzoFWiVeQGhRoDapf/jAhaUyFIRxr3fdcaYIcK8MIp9OKn2oqMZngEe1bKnBMdZDPb56iM6sMUZQoW8Kgufp7Isex1lkc2s4Ym7Fe9mbif14/NdF1kDMhU0MFWSyKUo5MgmYBoCFTlBieWYKJYvZWRMZYYWJsTBUbgrf88irpXDQ8t+HdX9aaN0UcZTiBU6iDB1fQhDtoQRsISHiGV3hzUufFeXc+Fq0lp5g5hj9wPn8A5+eQ7g== ·| Optimization q⇤ = argminq Qf(q( ),p( y)) 2 · ·| Variational Bayes q⇤ = argminq QKL(q( ) p( y)) 2 · || ·| Mean-field variational Bayes

q⇤ = argminq Q KL(q( ) p( y)) 2 MFVB · || ·| • Coordinate descent • Stochastic variational inference (SVI) [Hoffman et al 2013] • Automatic differentiation variational inference (ADVI) [Kucukelbir et al 2015, 2017]

7 Roadmap

• Bayes & Approximate Bayes review • What is: • Variational Bayes (VB) • Mean-field variational Bayes (MFVB) • Why use MFVB? • When can we trust MFVB? • Where do we go from here? Roadmap

• Bayes & Approximate Bayes review • What is: • Variational Bayes (VB) • Mean-field variational Bayes (MFVB) • Why use MFVB? • When can we trust MFVB? • Where do we go from here? Midge wing length • Catalogued midge wing lengths (mm) y =(y1,...,yN )

[CSIRO 2004]

8 [Hoff 2009; Grogan, Wirth 1981; MacKay 2003; Bishop 2006] Midge wing length • Catalogued midge wing lengths (mm) y =(y1,...,yN ) ! • Model: p(y ✓): y iid (µ, 2),n=1,...,N AAACS3icbVBNaxRBFOxZE43r16pHL00WYQPLMhMERRCCXjyFBLJJYGdd3vS83W3S3TN2vxGGdv6fFy/e/BNePCiSQ3o3c8iHBQ1FVT36vcpKJR3F8c+oc2dj8+69rfvdBw8fPX7Se/rs2BWVFTgWhSrsaQYOlTQ4JkkKT0uLoDOFJ9nZh5V/8gWtk4U5orrEqYaFkXMpgII062XloOZfeUpLJNh5y9PPFeS8nnnT8NQRiDOLykuZNz51UgdRAy0FKL/fDFJdDUNKLjR88rvNzrAdN/wdT4apygtyw/1Zrx+P4jX4bZK0pM9aHMx6P9K8EJVGQ0KBc5MkLmnqwZIUCptuWjksw2awwEmgBjS6qV930fCXQcn5vLDhGeJr9eqEB+1crbOQXF3ibnor8X/epKL5m6mXpqwIjbj8aF4pTgVfFctzaVGQqgMBYWXYlYslWBAU6u+GEpKbJ98mx7ujJB4lh6/6e+/bOrbYC7bNBixhr9ke+8gO2JgJ9o39Yn/Y3+h79Dv6F51fRjtRO/OcXUNn8wKEGLJq | n ⇠ N

[CSIRO 2004]

8 [Hoff 2009; Grogan, Wirth 1981; MacKay 2003; Bishop 2006] Midge wing length • Catalogued midge wing lengths (mm) y =(y1,...,yN ) • Parameters of interest: population mean and variance 2 • Model: ✓ =(µ, ) p(y ✓): y iid (µ, 2),n=1,...,N AAACS3icbVBNaxRBFOxZE43r16pHL00WYQPLMhMERRCCXjyFBLJJYGdd3vS83W3S3TN2vxGGdv6fFy/e/BNePCiSQ3o3c8iHBQ1FVT36vcpKJR3F8c+oc2dj8+69rfvdBw8fPX7Se/rs2BWVFTgWhSrsaQYOlTQ4JkkKT0uLoDOFJ9nZh5V/8gWtk4U5orrEqYaFkXMpgII062XloOZfeUpLJNh5y9PPFeS8nnnT8NQRiDOLykuZNz51UgdRAy0FKL/fDFJdDUNKLjR88rvNzrAdN/wdT4apygtyw/1Zrx+P4jX4bZK0pM9aHMx6P9K8EJVGQ0KBc5MkLmnqwZIUCptuWjksw2awwEmgBjS6qV930fCXQcn5vLDhGeJr9eqEB+1crbOQXF3ibnor8X/epKL5m6mXpqwIjbj8aF4pTgVfFctzaVGQqgMBYWXYlYslWBAU6u+GEpKbJ98mx7ujJB4lh6/6e+/bOrbYC7bNBixhr9ke+8gO2JgJ9o39Yn/Y3+h79Dv6F51fRjtRO/OcXUNn8wKEGLJq | n ⇠ N

[CSIRO 2004]

8 [Hoff 2009; Grogan, Wirth 1981; MacKay 2003; Bishop 2006] Midge wing length • Catalogued midge wing lengths (mm) y =(y1,...,yN ) • Parameters of interest: population mean and variance 2 • Model: ✓ =(µ, ) p(y ✓): y iid (µ, 2),n=1,...,N AAACS3icbVBNaxRBFOxZE43r16pHL00WYQPLMhMERRCCXjyFBLJJYGdd3vS83W3S3TN2vxGGdv6fFy/e/BNePCiSQ3o3c8iHBQ1FVT36vcpKJR3F8c+oc2dj8+69rfvdBw8fPX7Se/rs2BWVFTgWhSrsaQYOlTQ4JkkKT0uLoDOFJ9nZh5V/8gWtk4U5orrEqYaFkXMpgII062XloOZfeUpLJNh5y9PPFeS8nnnT8NQRiDOLykuZNz51UgdRAy0FKL/fDFJdDUNKLjR88rvNzrAdN/wdT4apygtyw/1Zrx+P4jX4bZK0pM9aHMx6P9K8EJVGQ0KBc5MkLmnqwZIUCptuWjksw2awwEmgBjS6qV930fCXQcn5vLDhGeJr9eqEB+1crbOQXF3ibnor8X/epKL5m6mXpqwIjbj8aF4pTgVfFctzaVGQqgMBYWXYlYslWBAU6u+GEpKbJ98mx7ujJB4lh6/6e+/bOrbYC7bNBixhr9ke+8gO2JgJ9o39Yn/Y3+h79Dv6F51fRjtRO/OcXUNn8wKEGLJq n | 2 1⇠ N p(✓): ( ) Gamma(a ,b ) ⇠ 0 0 µ 2 (µ , 2) | ⇠ N 0 0

[CSIRO 2004]

8 [Hoff 2009; Grogan, Wirth 1981; MacKay 2003; Bishop 2006] Midge wing length • Catalogued midge wing lengths (mm) y =(y1,...,yN ) • Parameters of interest: population mean and variance 2 • Model: ✓ =(µ, ) p(y ✓): y iid (µ, 2),n=1,...,N AAACS3icbVBNaxRBFOxZE43r16pHL00WYQPLMhMERRCCXjyFBLJJYGdd3vS83W3S3TN2vxGGdv6fFy/e/BNePCiSQ3o3c8iHBQ1FVT36vcpKJR3F8c+oc2dj8+69rfvdBw8fPX7Se/rs2BWVFTgWhSrsaQYOlTQ4JkkKT0uLoDOFJ9nZh5V/8gWtk4U5orrEqYaFkXMpgII062XloOZfeUpLJNh5y9PPFeS8nnnT8NQRiDOLykuZNz51UgdRAy0FKL/fDFJdDUNKLjR88rvNzrAdN/wdT4apygtyw/1Zrx+P4jX4bZK0pM9aHMx6P9K8EJVGQ0KBc5MkLmnqwZIUCptuWjksw2awwEmgBjS6qV930fCXQcn5vLDhGeJr9eqEB+1crbOQXF3ibnor8X/epKL5m6mXpqwIjbj8aF4pTgVfFctzaVGQqgMBYWXYlYslWBAU6u+GEpKbJ98mx7ujJB4lh6/6e+/bOrbYC7bNBixhr9ke+8gO2JgJ9o39Yn/Y3+h79Dv6F51fRjtRO/OcXUNn8wKEGLJq n | 2 1⇠ N p(✓): ( ) Gamma(a ,b ) ⇠ 0 0 µ 2 (µ , 2) | ⇠ N 0 0

[CSIRO 2004]

8 [Hoff 2009; Grogan, Wirth 1981; MacKay 2003; Bishop 2006] Midge wing length • Catalogued midge wing lengths (mm) y =(y1,...,yN ) • Parameters of interest: population mean and precision ✓ =(µ, ⌧) • Model: AAAB/XicbVDLSsNAFJ3UV62v+Ni5GSxCBSmJCLoRim5cVrAPaEKZTCft0MkkzNwINRR/xY0LRdz6H+78G6dtFtp64MLhnHu5954gEVyD43xbhaXlldW14nppY3Nre8fe3WvqOFWUNWgsYtUOiGaCS9YADoK1E8VIFAjWCoY3E7/1wJTmsbyHUcL8iPQlDzklYKSufeDBgAHBV7jiRekp9oCkJ1277FSdKfAicXNSRjnqXfvL68U0jZgEKojWHddJwM+IAk4FG5e8VLOE0CHps46hkkRM+9n0+jE+NkoPh7EyJQFP1d8TGYm0HkWB6YwIDPS8NxH/8zophJd+xmWSApN0tihMBYYYT6LAPa4YBTEyhFDFza2YDogiFExgJROCO//yImmeVV2n6t6dl2vXeRxFdIiOUAW56ALV0C2qowai6BE9o1f0Zj1ZL9a79TFrLVj5zD76A+vzB2iqk+M= p(y ✓): y iid (µ, 2),n=1,...,N AAACS3icbVBNaxRBFOxZE43r16pHL00WYQPLMhMERRCCXjyFBLJJYGdd3vS83W3S3TN2vxGGdv6fFy/e/BNePCiSQ3o3c8iHBQ1FVT36vcpKJR3F8c+oc2dj8+69rfvdBw8fPX7Se/rs2BWVFTgWhSrsaQYOlTQ4JkkKT0uLoDOFJ9nZh5V/8gWtk4U5orrEqYaFkXMpgII062XloOZfeUpLJNh5y9PPFeS8nnnT8NQRiDOLykuZNz51UgdRAy0FKL/fDFJdDUNKLjR88rvNzrAdN/wdT4apygtyw/1Zrx+P4jX4bZK0pM9aHMx6P9K8EJVGQ0KBc5MkLmnqwZIUCptuWjksw2awwEmgBjS6qV930fCXQcn5vLDhGeJr9eqEB+1crbOQXF3ibnor8X/epKL5m6mXpqwIjbj8aF4pTgVfFctzaVGQqgMBYWXYlYslWBAU6u+GEpKbJ98mx7ujJB4lh6/6e+/bOrbYC7bNBixhr9ke+8gO2JgJ9o39Yn/Y3+h79Dv6F51fRjtRO/OcXUNn8wKEGLJq n | 2 1⇠ N p(✓): ( ) Gamma(a ,b ) ⇠ 0 0 µ 2 (µ , 2) | ⇠ N 0 0

[CSIRO 2004]

8 [Hoff 2009; Grogan, Wirth 1981; MacKay 2003; Bishop 2006] Midge wing length • Catalogued midge wing lengths (mm) y =(y1,...,yN ) • Parameters of interest: population mean and precision ✓ =(µ, ⌧) • Model: AAAB/XicbVDLSsNAFJ3UV62v+Ni5GSxCBSmJCLoRim5cVrAPaEKZTCft0MkkzNwINRR/xY0LRdz6H+78G6dtFtp64MLhnHu5954gEVyD43xbhaXlldW14nppY3Nre8fe3WvqOFWUNWgsYtUOiGaCS9YADoK1E8VIFAjWCoY3E7/1wJTmsbyHUcL8iPQlDzklYKSufeDBgAHBV7jiRekp9oCkJ1277FSdKfAicXNSRjnqXfvL68U0jZgEKojWHddJwM+IAk4FG5e8VLOE0CHps46hkkRM+9n0+jE+NkoPh7EyJQFP1d8TGYm0HkWB6YwIDPS8NxH/8zophJd+xmWSApN0tihMBYYYT6LAPa4YBTEyhFDFza2YDogiFExgJROCO//yImmeVV2n6t6dl2vXeRxFdIiOUAW56ALV0C2qowai6BE9o1f0Zj1ZL9a79TFrLVj5zD76A+vzB2iqk+M= p(y ✓): y iid (µ, 2),n=1,...,N AAACS3icbVBNaxRBFOxZE43r16pHL00WYQPLMhMERRCCXjyFBLJJYGdd3vS83W3S3TN2vxGGdv6fFy/e/BNePCiSQ3o3c8iHBQ1FVT36vcpKJR3F8c+oc2dj8+69rfvdBw8fPX7Se/rs2BWVFTgWhSrsaQYOlTQ4JkkKT0uLoDOFJ9nZh5V/8gWtk4U5orrEqYaFkXMpgII062XloOZfeUpLJNh5y9PPFeS8nnnT8NQRiDOLykuZNz51UgdRAy0FKL/fDFJdDUNKLjR88rvNzrAdN/wdT4apygtyw/1Zrx+P4jX4bZK0pM9aHMx6P9K8EJVGQ0KBc5MkLmnqwZIUCptuWjksw2awwEmgBjS6qV930fCXQcn5vLDhGeJr9eqEB+1crbOQXF3ibnor8X/epKL5m6mXpqwIjbj8aF4pTgVfFctzaVGQqgMBYWXYlYslWBAU6u+GEpKbJ98mx7ujJB4lh6/6e+/bOrbYC7bNBixhr9ke+8gO2JgJ9o39Yn/Y3+h79Dv6F51fRjtRO/OcXUNn8wKEGLJq n | 2 1⇠ N p(✓): ( ) Gamma(a ,b ) ⇠ 0 0 µ 2 (µ , 2) | ⇠ N 0 0

[CSIRO 2004]

8 [Hoff 2009; Grogan, Wirth 1981; MacKay 2003; Bishop 2006] Midge wing length • Catalogued midge wing lengths (mm) y =(y1,...,yN ) • Parameters of interest: population mean and precision ✓ =(µ, ⌧) • Model: AAAB/XicbVDLSsNAFJ3UV62v+Ni5GSxCBSmJCLoRim5cVrAPaEKZTCft0MkkzNwINRR/xY0LRdz6H+78G6dtFtp64MLhnHu5954gEVyD43xbhaXlldW14nppY3Nre8fe3WvqOFWUNWgsYtUOiGaCS9YADoK1E8VIFAjWCoY3E7/1wJTmsbyHUcL8iPQlDzklYKSufeDBgAHBV7jiRekp9oCkJ1277FSdKfAicXNSRjnqXfvL68U0jZgEKojWHddJwM+IAk4FG5e8VLOE0CHps46hkkRM+9n0+jE+NkoPh7EyJQFP1d8TGYm0HkWB6YwIDPS8NxH/8zophJd+xmWSApN0tihMBYYYT6LAPa4YBTEyhFDFza2YDogiFExgJROCO//yImmeVV2n6t6dl2vXeRxFdIiOUAW56ALV0C2qowai6BE9o1f0Zj1ZL9a79TFrLVj5zD76A+vzB2iqk+M= iid 1 p(y ✓): y (µ, ⌧ ),n=1,...,N AAACRHicbVBNaxRBFOxJ/Ijr1xqPXhoXYQPrMiOCIgSCXjyFCG4S2F6XNz1vs026e8bu18IwmR/nxR/gzV/gxYMiXsXezRw0saChqHpFv1d5pZWnNP2SbGxeuXrt+taN3s1bt+/c7d/bPvRlcBInstSlO87Bo1YWJ6RI43HlEEyu8Sg/fbXyjz6g86q0b6mucGbgxKqFkkBRmven1bA+E7REgp0XXLwPUPB6brnwBPLUoW6UKtpGeGVaLgzQUoJu9tuhMGHEBUF41zzO2p1Rl7W72UjooiQ/2p/3B+k4XYNfJllHBqzDwbz/WRSlDAYtSQ3eT7O0olkDjpTU2PZE8FjFteAEp5FaMOhnzbqElj+KSsEXpYvPEl+rfycaMN7XJo+TqzP8RW8l/s+bBlo8nzXKVoHQyvOPFkFzKvmqUV4oh5J0HQlIp+KuXC7BgaTYey+WkF08+TI5fDLO0nH25ulg72VXxxZ7wB6yIcvYM7bHXrMDNmGSfWRf2Xf2I/mUfEt+Jr/ORzeSLnOf/YPk9x+BkrEQ | n ⇠ N p(✓): ⌧ Gamma(a ,b ) ⇠ 0 0 1 µ ⌧ (µ , (⇢ ⌧) ) AAACXnicbVHPaxQxGM2MWuvU2lUvgpfgosyCLplSqHgq9lBP0oLbFjbr8E022w1NZsbki7iM80/2Vnrpn2Jmu4f+8EHg8d77yJeXotbKIWOXUfzo8ZO1p+vPko3nmy+2ei9fHbvKWyFHotKVPS3ASa1KOUKFWp7WVoIptDwpzvc7/+S3tE5V5Q9c1HJi4KxUMyUAg5T3fJ1ynEuEwRfKf3mY0g+UI3jKnTKByT9oTXMAxkCbQs4+0iJnA8p5Qruk8fTv7bwBnAvQzfc2DV4XT7mdVzlbhgY/m09ZSwdJ3uuzIVuCPiTZivTJCod574JPK+GNLFFocG6csRonDVhUQss24d7JGsQ5nMlxoCUY6SbNsp6Wvg/KlM4qG06JdKnenmjAOLcwRUh2+7v7Xif+zxt7nH2eNKqsPcpS3Fw085piRbuu6VRZKVAvAgFhVdiVijlYEBh+pCshu//kh+R4e5ixYXa009/7uqpjnbwl70hKMrJL9sg3ckhGRJCrKIqSaCO6jtfizXjrJhpHq5nX5A7iN/8A2XSwHg== | ⇠ N 0 0

[CSIRO 2004]

8 [Hoff 2009; Grogan, Wirth 1981; MacKay 2003; Bishop 2006] Midge wing length • Catalogued midge wing lengths (mm) y =(y1,...,yN ) • Parameters of interest: population mean and precision ✓ =(µ, ⌧) • Model: AAAB/XicbVDLSsNAFJ3UV62v+Ni5GSxCBSmJCLoRim5cVrAPaEKZTCft0MkkzNwINRR/xY0LRdz6H+78G6dtFtp64MLhnHu5954gEVyD43xbhaXlldW14nppY3Nre8fe3WvqOFWUNWgsYtUOiGaCS9YADoK1E8VIFAjWCoY3E7/1wJTmsbyHUcL8iPQlDzklYKSufeDBgAHBV7jiRekp9oCkJ1277FSdKfAicXNSRjnqXfvL68U0jZgEKojWHddJwM+IAk4FG5e8VLOE0CHps46hkkRM+9n0+jE+NkoPh7EyJQFP1d8TGYm0HkWB6YwIDPS8NxH/8zophJd+xmWSApN0tihMBYYYT6LAPa4YBTEyhFDFza2YDogiFExgJROCO//yImmeVV2n6t6dl2vXeRxFdIiOUAW56ALV0C2qowai6BE9o1f0Zj1ZL9a79TFrLVj5zD76A+vzB2iqk+M= iid 1 p(y ✓): y (µ, ⌧ ),n=1,...,N AAACRHicbVBNaxRBFOxJ/Ijr1xqPXhoXYQPrMiOCIgSCXjyFCG4S2F6XNz1vs026e8bu18IwmR/nxR/gzV/gxYMiXsXezRw0saChqHpFv1d5pZWnNP2SbGxeuXrt+taN3s1bt+/c7d/bPvRlcBInstSlO87Bo1YWJ6RI43HlEEyu8Sg/fbXyjz6g86q0b6mucGbgxKqFkkBRmven1bA+E7REgp0XXLwPUPB6brnwBPLUoW6UKtpGeGVaLgzQUoJu9tuhMGHEBUF41zzO2p1Rl7W72UjooiQ/2p/3B+k4XYNfJllHBqzDwbz/WRSlDAYtSQ3eT7O0olkDjpTU2PZE8FjFteAEp5FaMOhnzbqElj+KSsEXpYvPEl+rfycaMN7XJo+TqzP8RW8l/s+bBlo8nzXKVoHQyvOPFkFzKvmqUV4oh5J0HQlIp+KuXC7BgaTYey+WkF08+TI5fDLO0nH25ulg72VXxxZ7wB6yIcvYM7bHXrMDNmGSfWRf2Xf2I/mUfEt+Jr/ORzeSLnOf/YPk9x+BkrEQ | n ⇠ N p(✓): ⌧ Gamma(a ,b ) ⇠ 0 0 1 µ ⌧ (µ , (⇢ ⌧) ) AAACXnicbVHPaxQxGM2MWuvU2lUvgpfgosyCLplSqHgq9lBP0oLbFjbr8E022w1NZsbki7iM80/2Vnrpn2Jmu4f+8EHg8d77yJeXotbKIWOXUfzo8ZO1p+vPko3nmy+2ei9fHbvKWyFHotKVPS3ASa1KOUKFWp7WVoIptDwpzvc7/+S3tE5V5Q9c1HJi4KxUMyUAg5T3fJ1ynEuEwRfKf3mY0g+UI3jKnTKByT9oTXMAxkCbQs4+0iJnA8p5Qruk8fTv7bwBnAvQzfc2DV4XT7mdVzlbhgY/m09ZSwdJ3uuzIVuCPiTZivTJCod574JPK+GNLFFocG6csRonDVhUQss24d7JGsQ5nMlxoCUY6SbNsp6Wvg/KlM4qG06JdKnenmjAOLcwRUh2+7v7Xif+zxt7nH2eNKqsPcpS3Fw085piRbuu6VRZKVAvAgFhVdiVijlYEBh+pCshu//kh+R4e5ixYXa009/7uqpjnbwl70hKMrJL9sg3ckhGRJCrKIqSaCO6jtfizXjrJhpHq5nX5A7iN/8A2XSwHg== | ⇠ N 0 0

[CSIRO 2004]

8 [Hoff 2009; Grogan, Wirth 1981; MacKay 2003; Bishop 2006] Midge wing length • Catalogued midge wing lengths (mm) y =(y1,...,yN ) • Parameters of interest: population mean and precision ✓ =(µ, ⌧) • Model: AAAB/XicbVDLSsNAFJ3UV62v+Ni5GSxCBSmJCLoRim5cVrAPaEKZTCft0MkkzNwINRR/xY0LRdz6H+78G6dtFtp64MLhnHu5954gEVyD43xbhaXlldW14nppY3Nre8fe3WvqOFWUNWgsYtUOiGaCS9YADoK1E8VIFAjWCoY3E7/1wJTmsbyHUcL8iPQlDzklYKSufeDBgAHBV7jiRekp9oCkJ1277FSdKfAicXNSRjnqXfvL68U0jZgEKojWHddJwM+IAk4FG5e8VLOE0CHps46hkkRM+9n0+jE+NkoPh7EyJQFP1d8TGYm0HkWB6YwIDPS8NxH/8zophJd+xmWSApN0tihMBYYYT6LAPa4YBTEyhFDFza2YDogiFExgJROCO//yImmeVV2n6t6dl2vXeRxFdIiOUAW56ALV0C2qowai6BE9o1f0Zj1ZL9a79TFrLVj5zD76A+vzB2iqk+M= iid 1 p(y ✓): y (µ, ⌧ ),n=1,...,N AAACRHicbVBNaxRBFOxJ/Ijr1xqPXhoXYQPrMiOCIgSCXjyFCG4S2F6XNz1vs026e8bu18IwmR/nxR/gzV/gxYMiXsXezRw0saChqHpFv1d5pZWnNP2SbGxeuXrt+taN3s1bt+/c7d/bPvRlcBInstSlO87Bo1YWJ6RI43HlEEyu8Sg/fbXyjz6g86q0b6mucGbgxKqFkkBRmven1bA+E7REgp0XXLwPUPB6brnwBPLUoW6UKtpGeGVaLgzQUoJu9tuhMGHEBUF41zzO2p1Rl7W72UjooiQ/2p/3B+k4XYNfJllHBqzDwbz/WRSlDAYtSQ3eT7O0olkDjpTU2PZE8FjFteAEp5FaMOhnzbqElj+KSsEXpYvPEl+rfycaMN7XJo+TqzP8RW8l/s+bBlo8nzXKVoHQyvOPFkFzKvmqUV4oh5J0HQlIp+KuXC7BgaTYey+WkF08+TI5fDLO0nH25ulg72VXxxZ7wB6yIcvYM7bHXrMDNmGSfWRf2Xf2I/mUfEt+Jr/ORzeSLnOf/YPk9x+BkrEQ | n ⇠ N p(✓): ⌧ Gamma(a ,b ) ⇠ 0 0 1 µ ⌧ (µ , (⇢ ⌧) ) AAACXnicbVHPaxQxGM2MWuvU2lUvgpfgosyCLplSqHgq9lBP0oLbFjbr8E022w1NZsbki7iM80/2Vnrpn2Jmu4f+8EHg8d77yJeXotbKIWOXUfzo8ZO1p+vPko3nmy+2ei9fHbvKWyFHotKVPS3ASa1KOUKFWp7WVoIptDwpzvc7/+S3tE5V5Q9c1HJi4KxUMyUAg5T3fJ1ynEuEwRfKf3mY0g+UI3jKnTKByT9oTXMAxkCbQs4+0iJnA8p5Qruk8fTv7bwBnAvQzfc2DV4XT7mdVzlbhgY/m09ZSwdJ3uuzIVuCPiTZivTJCod574JPK+GNLFFocG6csRonDVhUQss24d7JGsQ5nMlxoCUY6SbNsp6Wvg/KlM4qG06JdKnenmjAOLcwRUh2+7v7Xif+zxt7nH2eNKqsPcpS3Fw085piRbuu6VRZKVAvAgFhVdiVijlYEBh+pCshu//kh+R4e5ixYXa009/7uqpjnbwl70hKMrJL9sg3ckhGRJCrKIqSaCO6jtfizXjrJhpHq5nX5A7iN/8A2XSwHg== | ⇠ N 0 0 • Exercise: check p(µ, ⌧ y) = f (µ, y)f (⌧,y) AAACFnicbVDLSsNAFJ34rPUVdelmsAgVtCRF0GXRjcsK9gFNCZPppB06mYR5CCH2K9z4K25cKOJW3Pk3TtostPXAwLnn3Mude4KEUakc59taWl5ZXVsvbZQ3t7Z3du29/baMtcCkhWMWi26AJGGUk5aiipFuIgiKAkY6wfg69zv3REga8zuVJqQfoSGnIcVIGcm3z5KqF+lT6Cmk4QNMT6DHCQx9dyabOvTr1dzNC9+uODVnCrhI3IJUQIGmb395gxjriHCFGZKy5zqJ6mdIKIoZmZQ9LUmC8BgNSc9QjiIi+9n0rAk8NsoAhrEwjys4VX9PZCiSMo0C0xkhNZLzXi7+5/W0Ci/7GeWJVoTj2aJQM6himGcEB1QQrFhqCMKCmr9CPEICYWWSLJsQ3PmTF0m7XnOdmnt7XmlcFXGUwCE4AlXgggvQADegCVoAg0fwDF7Bm/VkvVjv1sesdckqZg7AH1ifPwLUnB8= | 6 1 2 [CSIRO 2004]

8 [Hoff 2009; Grogan, Wirth 1981; MacKay 2003; Bishop 2006] Midge wing length • Catalogued midge wing lengths (mm) y =(y1,...,yN ) • Parameters of interest: population mean and precision ✓ =(µ, ⌧) • Model: AAAB/XicbVDLSsNAFJ3UV62v+Ni5GSxCBSmJCLoRim5cVrAPaEKZTCft0MkkzNwINRR/xY0LRdz6H+78G6dtFtp64MLhnHu5954gEVyD43xbhaXlldW14nppY3Nre8fe3WvqOFWUNWgsYtUOiGaCS9YADoK1E8VIFAjWCoY3E7/1wJTmsbyHUcL8iPQlDzklYKSufeDBgAHBV7jiRekp9oCkJ1277FSdKfAicXNSRjnqXfvL68U0jZgEKojWHddJwM+IAk4FG5e8VLOE0CHps46hkkRM+9n0+jE+NkoPh7EyJQFP1d8TGYm0HkWB6YwIDPS8NxH/8zophJd+xmWSApN0tihMBYYYT6LAPa4YBTEyhFDFza2YDogiFExgJROCO//yImmeVV2n6t6dl2vXeRxFdIiOUAW56ALV0C2qowai6BE9o1f0Zj1ZL9a79TFrLVj5zD76A+vzB2iqk+M= iid 1 p(y ✓): y (µ, ⌧ ),n=1,...,N AAACRHicbVBNaxRBFOxJ/Ijr1xqPXhoXYQPrMiOCIgSCXjyFCG4S2F6XNz1vs026e8bu18IwmR/nxR/gzV/gxYMiXsXezRw0saChqHpFv1d5pZWnNP2SbGxeuXrt+taN3s1bt+/c7d/bPvRlcBInstSlO87Bo1YWJ6RI43HlEEyu8Sg/fbXyjz6g86q0b6mucGbgxKqFkkBRmven1bA+E7REgp0XXLwPUPB6brnwBPLUoW6UKtpGeGVaLgzQUoJu9tuhMGHEBUF41zzO2p1Rl7W72UjooiQ/2p/3B+k4XYNfJllHBqzDwbz/WRSlDAYtSQ3eT7O0olkDjpTU2PZE8FjFteAEp5FaMOhnzbqElj+KSsEXpYvPEl+rfycaMN7XJo+TqzP8RW8l/s+bBlo8nzXKVoHQyvOPFkFzKvmqUV4oh5J0HQlIp+KuXC7BgaTYey+WkF08+TI5fDLO0nH25ulg72VXxxZ7wB6yIcvYM7bHXrMDNmGSfWRf2Xf2I/mUfEt+Jr/ORzeSLnOf/YPk9x+BkrEQ | n ⇠ N p(✓): ⌧ Gamma(a ,b ) ⇠ 0 0 1 µ ⌧ (µ , (⇢ ⌧) ) AAACXnicbVHPaxQxGM2MWuvU2lUvgpfgosyCLplSqHgq9lBP0oLbFjbr8E022w1NZsbki7iM80/2Vnrpn2Jmu4f+8EHg8d77yJeXotbKIWOXUfzo8ZO1p+vPko3nmy+2ei9fHbvKWyFHotKVPS3ASa1KOUKFWp7WVoIptDwpzvc7/+S3tE5V5Q9c1HJi4KxUMyUAg5T3fJ1ynEuEwRfKf3mY0g+UI3jKnTKByT9oTXMAxkCbQs4+0iJnA8p5Qruk8fTv7bwBnAvQzfc2DV4XT7mdVzlbhgY/m09ZSwdJ3uuzIVuCPiTZivTJCod574JPK+GNLFFocG6csRonDVhUQss24d7JGsQ5nMlxoCUY6SbNsp6Wvg/KlM4qG06JdKnenmjAOLcwRUh2+7v7Xif+zxt7nH2eNKqsPcpS3Fw085piRbuu6VRZKVAvAgFhVdiVijlYEBh+pCshu//kh+R4e5ixYXa009/7uqpjnbwl70hKMrJL9sg3ckhGRJCrKIqSaCO6jtfizXjrJhpHq5nX5A7iN/8A2XSwHg== | ⇠ N 0 0 • Exercise: check p(µ, ⌧ y) = f (µ, y)f (⌧,y) AAACFnicbVDLSsNAFJ34rPUVdelmsAgVtCRF0GXRjcsK9gFNCZPppB06mYR5CCH2K9z4K25cKOJW3Pk3TtostPXAwLnn3Mude4KEUakc59taWl5ZXVsvbZQ3t7Z3du29/baMtcCkhWMWi26AJGGUk5aiipFuIgiKAkY6wfg69zv3REga8zuVJqQfoSGnIcVIGcm3z5KqF+lT6Cmk4QNMT6DHCQx9dyabOvTr1dzNC9+uODVnCrhI3IJUQIGmb395gxjriHCFGZKy5zqJ6mdIKIoZmZQ9LUmC8BgNSc9QjiIi+9n0rAk8NsoAhrEwjys4VX9PZCiSMo0C0xkhNZLzXi7+5/W0Ci/7GeWJVoTj2aJQM6himGcEB1QQrFhqCMKCmr9CPEICYWWSLJsQ3PmTF0m7XnOdmnt7XmlcFXGUwCE4AlXgggvQADegCVoAg0fwDF7Bm/VkvVjv1sesdckqZg7AH1ifPwLUnB8= | 6 1 2 [CSIRO 2004] • MFVB approximation: q⇤(µ, ⌧)=q⇤(µ)q⇤(⌧) = argmin KL(q( ) p( y))

AAACYHicbVHRTtswFHUCG10Ho4w3eLGoJjUIVQlC2l6QEEgIaSCBtBakposc1y0WtpPaN4wqzU/yxsNe9iW4SR4YcCVL555zrnx9HKeCG/D9J8ddWv7wcaXxqfl5de3Lemvja98kmaasRxOR6JuYGCa4Yj3gINhNqhmRsWDX8d3JQr++Z9rwRP2CWcqGkkwUH3NKwFJR68/0924nlNkeDoFkHj7E0yi3fVHRXtlapexrRwjsAbTMiZ5Irooon+KQK3y1cFbKxWn/uCgK/PO8M+2EdJSAh+dznFZ4PvO8qNX2u35Z+C0IatBGdV1GrcdwlNBMMgVUEGMGgZ/C0O4AnApWNMPMsJTQOzJhAwsVkcwM8zKgAn+zzAiPE22PAlyyLydyIo2Zydg6JYFb81pbkO9pgwzGP4Y5V2kGTNHqonEmMCR4kTYecc0oiJkFhGpud8X0lmhCwf5J04YQvH7yW9Df7wZ+N7g6aB8d13E00DbaQR0UoO/oCJ2hS9RDFP11lpxVZ8355zbcdXejsrpOPbOJ/it36xn+5bQ2 µ ⌧ q Q 2 MFVB · || ·|

8 [Hoff 2009; Grogan, Wirth 1981; MacKay 2003; Bishop 2006] Midge wing length • Catalogued midge wing lengths (mm) y =(y1,...,yN ) • Parameters of interest: population mean and precision ✓ =(µ, ⌧) • Model: AAAB/XicbVDLSsNAFJ3UV62v+Ni5GSxCBSmJCLoRim5cVrAPaEKZTCft0MkkzNwINRR/xY0LRdz6H+78G6dtFtp64MLhnHu5954gEVyD43xbhaXlldW14nppY3Nre8fe3WvqOFWUNWgsYtUOiGaCS9YADoK1E8VIFAjWCoY3E7/1wJTmsbyHUcL8iPQlDzklYKSufeDBgAHBV7jiRekp9oCkJ1277FSdKfAicXNSRjnqXfvL68U0jZgEKojWHddJwM+IAk4FG5e8VLOE0CHps46hkkRM+9n0+jE+NkoPh7EyJQFP1d8TGYm0HkWB6YwIDPS8NxH/8zophJd+xmWSApN0tihMBYYYT6LAPa4YBTEyhFDFza2YDogiFExgJROCO//yImmeVV2n6t6dl2vXeRxFdIiOUAW56ALV0C2qowai6BE9o1f0Zj1ZL9a79TFrLVj5zD76A+vzB2iqk+M= iid 1 p(y ✓): y (µ, ⌧ ),n=1,...,N AAACRHicbVBNaxRBFOxJ/Ijr1xqPXhoXYQPrMiOCIgSCXjyFCG4S2F6XNz1vs026e8bu18IwmR/nxR/gzV/gxYMiXsXezRw0saChqHpFv1d5pZWnNP2SbGxeuXrt+taN3s1bt+/c7d/bPvRlcBInstSlO87Bo1YWJ6RI43HlEEyu8Sg/fbXyjz6g86q0b6mucGbgxKqFkkBRmven1bA+E7REgp0XXLwPUPB6brnwBPLUoW6UKtpGeGVaLgzQUoJu9tuhMGHEBUF41zzO2p1Rl7W72UjooiQ/2p/3B+k4XYNfJllHBqzDwbz/WRSlDAYtSQ3eT7O0olkDjpTU2PZE8FjFteAEp5FaMOhnzbqElj+KSsEXpYvPEl+rfycaMN7XJo+TqzP8RW8l/s+bBlo8nzXKVoHQyvOPFkFzKvmqUV4oh5J0HQlIp+KuXC7BgaTYey+WkF08+TI5fDLO0nH25ulg72VXxxZ7wB6yIcvYM7bHXrMDNmGSfWRf2Xf2I/mUfEt+Jr/ORzeSLnOf/YPk9x+BkrEQ | n ⇠ N p(✓): ⌧ Gamma(a ,b ) ⇠ 0 0 1 µ ⌧ (µ , (⇢ ⌧) ) AAACXnicbVHPaxQxGM2MWuvU2lUvgpfgosyCLplSqHgq9lBP0oLbFjbr8E022w1NZsbki7iM80/2Vnrpn2Jmu4f+8EHg8d77yJeXotbKIWOXUfzo8ZO1p+vPko3nmy+2ei9fHbvKWyFHotKVPS3ASa1KOUKFWp7WVoIptDwpzvc7/+S3tE5V5Q9c1HJi4KxUMyUAg5T3fJ1ynEuEwRfKf3mY0g+UI3jKnTKByT9oTXMAxkCbQs4+0iJnA8p5Qruk8fTv7bwBnAvQzfc2DV4XT7mdVzlbhgY/m09ZSwdJ3uuzIVuCPiTZivTJCod574JPK+GNLFFocG6csRonDVhUQss24d7JGsQ5nMlxoCUY6SbNsp6Wvg/KlM4qG06JdKnenmjAOLcwRUh2+7v7Xif+zxt7nH2eNKqsPcpS3Fw085piRbuu6VRZKVAvAgFhVdiVijlYEBh+pCshu//kh+R4e5ixYXa009/7uqpjnbwl70hKMrJL9sg3ckhGRJCrKIqSaCO6jtfizXjrJhpHq5nX5A7iN/8A2XSwHg== | ⇠ N 0 0 • Exercise: check p(µ, ⌧ y) = f (µ, y)f (⌧,y) AAACFnicbVDLSsNAFJ34rPUVdelmsAgVtCRF0GXRjcsK9gFNCZPppB06mYR5CCH2K9z4K25cKOJW3Pk3TtostPXAwLnn3Mude4KEUakc59taWl5ZXVsvbZQ3t7Z3du29/baMtcCkhWMWi26AJGGUk5aiipFuIgiKAkY6wfg69zv3REga8zuVJqQfoSGnIcVIGcm3z5KqF+lT6Cmk4QNMT6DHCQx9dyabOvTr1dzNC9+uODVnCrhI3IJUQIGmb395gxjriHCFGZKy5zqJ6mdIKIoZmZQ9LUmC8BgNSc9QjiIi+9n0rAk8NsoAhrEwjys4VX9PZCiSMo0C0xkhNZLzXi7+5/W0Ci/7GeWJVoTj2aJQM6himGcEB1QQrFhqCMKCmr9CPEICYWWSLJsQ3PmTF0m7XnOdmnt7XmlcFXGUwCE4AlXgggvQADegCVoAg0fwDF7Bm/VkvVjv1sesdckqZg7AH1ifPwLUnB8= | 6 1 2 [CSIRO 2004] • MFVB approximation: q⇤(µ, ⌧)=q⇤(µ)q⇤(⌧) = argmin KL(q( ) p( y))

AAACYHicbVHRTtswFHUCG10Ho4w3eLGoJjUIVQlC2l6QEEgIaSCBtBakposc1y0WtpPaN4wqzU/yxsNe9iW4SR4YcCVL555zrnx9HKeCG/D9J8ddWv7wcaXxqfl5de3Lemvja98kmaasRxOR6JuYGCa4Yj3gINhNqhmRsWDX8d3JQr++Z9rwRP2CWcqGkkwUH3NKwFJR68/0924nlNkeDoFkHj7E0yi3fVHRXtlapexrRwjsAbTMiZ5Irooon+KQK3y1cFbKxWn/uCgK/PO8M+2EdJSAh+dznFZ4PvO8qNX2u35Z+C0IatBGdV1GrcdwlNBMMgVUEGMGgZ/C0O4AnApWNMPMsJTQOzJhAwsVkcwM8zKgAn+zzAiPE22PAlyyLydyIo2Zydg6JYFb81pbkO9pgwzGP4Y5V2kGTNHqonEmMCR4kTYecc0oiJkFhGpud8X0lmhCwf5J04YQvH7yW9Df7wZ+N7g6aB8d13E00DbaQR0UoO/oCJ2hS9RDFP11lpxVZ8355zbcdXejsrpOPbOJ/it36xn+5bQ2 µ ⌧ q Q 2 MFVB · || ·| • Coordinate descent (derivation shortly) [Bishop 2006, Sec 10.1.3]

8 [Hoff 2009; Grogan, Wirth 1981; MacKay 2003; Bishop 2006] Midge wing length • Catalogued midge wing lengths (mm) y =(y1,...,yN ) • Parameters of interest: population mean and precision ✓ =(µ, ⌧) • Model: AAAB/XicbVDLSsNAFJ3UV62v+Ni5GSxCBSmJCLoRim5cVrAPaEKZTCft0MkkzNwINRR/xY0LRdz6H+78G6dtFtp64MLhnHu5954gEVyD43xbhaXlldW14nppY3Nre8fe3WvqOFWUNWgsYtUOiGaCS9YADoK1E8VIFAjWCoY3E7/1wJTmsbyHUcL8iPQlDzklYKSufeDBgAHBV7jiRekp9oCkJ1277FSdKfAicXNSRjnqXfvL68U0jZgEKojWHddJwM+IAk4FG5e8VLOE0CHps46hkkRM+9n0+jE+NkoPh7EyJQFP1d8TGYm0HkWB6YwIDPS8NxH/8zophJd+xmWSApN0tihMBYYYT6LAPa4YBTEyhFDFza2YDogiFExgJROCO//yImmeVV2n6t6dl2vXeRxFdIiOUAW56ALV0C2qowai6BE9o1f0Zj1ZL9a79TFrLVj5zD76A+vzB2iqk+M= iid 1 p(y ✓): y (µ, ⌧ ),n=1,...,N AAACRHicbVBNaxRBFOxJ/Ijr1xqPXhoXYQPrMiOCIgSCXjyFCG4S2F6XNz1vs026e8bu18IwmR/nxR/gzV/gxYMiXsXezRw0saChqHpFv1d5pZWnNP2SbGxeuXrt+taN3s1bt+/c7d/bPvRlcBInstSlO87Bo1YWJ6RI43HlEEyu8Sg/fbXyjz6g86q0b6mucGbgxKqFkkBRmven1bA+E7REgp0XXLwPUPB6brnwBPLUoW6UKtpGeGVaLgzQUoJu9tuhMGHEBUF41zzO2p1Rl7W72UjooiQ/2p/3B+k4XYNfJllHBqzDwbz/WRSlDAYtSQ3eT7O0olkDjpTU2PZE8FjFteAEp5FaMOhnzbqElj+KSsEXpYvPEl+rfycaMN7XJo+TqzP8RW8l/s+bBlo8nzXKVoHQyvOPFkFzKvmqUV4oh5J0HQlIp+KuXC7BgaTYey+WkF08+TI5fDLO0nH25ulg72VXxxZ7wB6yIcvYM7bHXrMDNmGSfWRf2Xf2I/mUfEt+Jr/ORzeSLnOf/YPk9x+BkrEQ | n ⇠ N p(✓): ⌧ Gamma(a ,b ) ⇠ 0 0 1 µ ⌧ (µ , (⇢ ⌧) ) AAACXnicbVHPaxQxGM2MWuvU2lUvgpfgosyCLplSqHgq9lBP0oLbFjbr8E022w1NZsbki7iM80/2Vnrpn2Jmu4f+8EHg8d77yJeXotbKIWOXUfzo8ZO1p+vPko3nmy+2ei9fHbvKWyFHotKVPS3ASa1KOUKFWp7WVoIptDwpzvc7/+S3tE5V5Q9c1HJi4KxUMyUAg5T3fJ1ynEuEwRfKf3mY0g+UI3jKnTKByT9oTXMAxkCbQs4+0iJnA8p5Qruk8fTv7bwBnAvQzfc2DV4XT7mdVzlbhgY/m09ZSwdJ3uuzIVuCPiTZivTJCod574JPK+GNLFFocG6csRonDVhUQss24d7JGsQ5nMlxoCUY6SbNsp6Wvg/KlM4qG06JdKnenmjAOLcwRUh2+7v7Xif+zxt7nH2eNKqsPcpS3Fw085piRbuu6VRZKVAvAgFhVdiVijlYEBh+pCshu//kh+R4e5ixYXa009/7uqpjnbwl70hKMrJL9sg3ckhGRJCrKIqSaCO6jtfizXjrJhpHq5nX5A7iN/8A2XSwHg== | ⇠ N 0 0 • Exercise: check p(µ, ⌧ y) = f (µ, y)f (⌧,y) AAACFnicbVDLSsNAFJ34rPUVdelmsAgVtCRF0GXRjcsK9gFNCZPppB06mYR5CCH2K9z4K25cKOJW3Pk3TtostPXAwLnn3Mude4KEUakc59taWl5ZXVsvbZQ3t7Z3du29/baMtcCkhWMWi26AJGGUk5aiipFuIgiKAkY6wfg69zv3REga8zuVJqQfoSGnIcVIGcm3z5KqF+lT6Cmk4QNMT6DHCQx9dyabOvTr1dzNC9+uODVnCrhI3IJUQIGmb395gxjriHCFGZKy5zqJ6mdIKIoZmZQ9LUmC8BgNSc9QjiIi+9n0rAk8NsoAhrEwjys4VX9PZCiSMo0C0xkhNZLzXi7+5/W0Ci/7GeWJVoTj2aJQM6himGcEB1QQrFhqCMKCmr9CPEICYWWSLJsQ3PmTF0m7XnOdmnt7XmlcFXGUwCE4AlXgggvQADegCVoAg0fwDF7Bm/VkvVjv1sesdckqZg7AH1ifPwLUnB8= | 6 1 2 [CSIRO 2004] • MFVB approximation: q⇤(µ, ⌧)=q⇤(µ)q⇤(⌧) = argmin KL(q( ) p( y))

AAACYHicbVHRTtswFHUCG10Ho4w3eLGoJjUIVQlC2l6QEEgIaSCBtBakposc1y0WtpPaN4wqzU/yxsNe9iW4SR4YcCVL555zrnx9HKeCG/D9J8ddWv7wcaXxqfl5de3Lemvja98kmaasRxOR6JuYGCa4Yj3gINhNqhmRsWDX8d3JQr++Z9rwRP2CWcqGkkwUH3NKwFJR68/0924nlNkeDoFkHj7E0yi3fVHRXtlapexrRwjsAbTMiZ5Irooon+KQK3y1cFbKxWn/uCgK/PO8M+2EdJSAh+dznFZ4PvO8qNX2u35Z+C0IatBGdV1GrcdwlNBMMgVUEGMGgZ/C0O4AnApWNMPMsJTQOzJhAwsVkcwM8zKgAn+zzAiPE22PAlyyLydyIo2Zydg6JYFb81pbkO9pgwzGP4Y5V2kGTNHqonEmMCR4kTYecc0oiJkFhGpud8X0lmhCwf5J04YQvH7yW9Df7wZ+N7g6aB8d13E00DbaQR0UoO/oCJ2hS9RDFP11lpxVZ8355zbcdXejsrpOPbOJ/it36xn+5bQ2 µ ⌧ q Q 2 MFVB · || ·| • Coordinate descent (derivation shortly) [Bishop 2006, Sec 10.1.3] 1 q⇤(µ)= (µ µ , ⇢ ) q⇤(⌧) = Gamma(⌧ a ,b ) µ N AAACGXicbZBNSwMxEIazftb6VfXoJVgEK1J2RdCLUPSgJ1GwrdDWZTZNNTTZXZNZsaz9G178K148KOJRT/4b04+DVl8IPLwzw2TeIJbCoOt+OWPjE5NT05mZ7Ozc/MJibmm5YqJEM15mkYz0RQCGSxHyMgqU/CLWHFQgeTVoH/bq1VuujYjCc+zEvKHgKhQtwQCt5efcG7+OkFymm92NHhToPq0jv0Ot0iNQCgb2PfgnWzTwTwp+Lu8W3b7oX/CGkCdDnfq5j3ozYoniITIJxtQ8N8ZGChoFk7ybrSeGx8DacMVrFkNQ3DTS/mVdum6dJm1F2r4Qad/9OZGCMqajAtupAK/NaK1n/lerJdjaa6QijBPkIRssaiWSYkR7MdGm0Jyh7FgApoX9K2XXoIGhDTNrQ/BGT/4Lle2i5xa9s5186WAYR4askjWyQTyyS0rkmJySMmHkgTyRF/LqPDrPzpvzPmgdc4YzK+SXnM9vD6ufrA== N N AAACHXicbVDLSsNAFJ3UV62vqEs3g0VoRUsiBd0IRTeuSgX7gCYNk+nUDp08nJkIJeZH3Pgrblwo4sKN+DdO2iy09cAMh3Pu5d573JBRIQ3jW8stLC4tr+RXC2vrG5tb+vZOSwQRx6SJAxbwjosEYdQnTUklI52QE+S5jLTd0WXqt+8JFzTwb+Q4JLaHbn06oBhJJTl69c6xvKh3WFJ/GZ5Dy0NyiBGL60kqwQelRE79CFp8GDj1XnxsJmVHLxoVYwI4T8yMFEGGhqN/Wv0ARx7xJWZIiK5phNKOEZcUM5IUrEiQEOERuiVdRX3kEWHHk+sSeKCUPhwEXD1fwon6uyNGnhBjz1WV6fJi1kvF/7xuJAdndkz9MJLEx9NBg4hBGcA0KtinnGDJxoogzKnaFeIh4ghLFWhBhWDOnjxPWicV06iY19Vi7SKLIw/2wD4oAROcghq4Ag3QBBg8gmfwCt60J+1Fe9c+pqU5LevZBX+gff0Al86gWQ== N | N ⌧ |

8 [Hoff 2009; Grogan, Wirth 1981; MacKay 2003; Bishop 2006] Midge wing length • Catalogued midge wing lengths (mm) y =(y1,...,yN ) • Parameters of interest: population mean and precision ✓ =(µ, ⌧) • Model: AAAB/XicbVDLSsNAFJ3UV62v+Ni5GSxCBSmJCLoRim5cVrAPaEKZTCft0MkkzNwINRR/xY0LRdz6H+78G6dtFtp64MLhnHu5954gEVyD43xbhaXlldW14nppY3Nre8fe3WvqOFWUNWgsYtUOiGaCS9YADoK1E8VIFAjWCoY3E7/1wJTmsbyHUcL8iPQlDzklYKSufeDBgAHBV7jiRekp9oCkJ1277FSdKfAicXNSRjnqXfvL68U0jZgEKojWHddJwM+IAk4FG5e8VLOE0CHps46hkkRM+9n0+jE+NkoPh7EyJQFP1d8TGYm0HkWB6YwIDPS8NxH/8zophJd+xmWSApN0tihMBYYYT6LAPa4YBTEyhFDFza2YDogiFExgJROCO//yImmeVV2n6t6dl2vXeRxFdIiOUAW56ALV0C2qowai6BE9o1f0Zj1ZL9a79TFrLVj5zD76A+vzB2iqk+M= iid 1 p(y ✓): y (µ, ⌧ ),n=1,...,N AAACRHicbVBNaxRBFOxJ/Ijr1xqPXhoXYQPrMiOCIgSCXjyFCG4S2F6XNz1vs026e8bu18IwmR/nxR/gzV/gxYMiXsXezRw0saChqHpFv1d5pZWnNP2SbGxeuXrt+taN3s1bt+/c7d/bPvRlcBInstSlO87Bo1YWJ6RI43HlEEyu8Sg/fbXyjz6g86q0b6mucGbgxKqFkkBRmven1bA+E7REgp0XXLwPUPB6brnwBPLUoW6UKtpGeGVaLgzQUoJu9tuhMGHEBUF41zzO2p1Rl7W72UjooiQ/2p/3B+k4XYNfJllHBqzDwbz/WRSlDAYtSQ3eT7O0olkDjpTU2PZE8FjFteAEp5FaMOhnzbqElj+KSsEXpYvPEl+rfycaMN7XJo+TqzP8RW8l/s+bBlo8nzXKVoHQyvOPFkFzKvmqUV4oh5J0HQlIp+KuXC7BgaTYey+WkF08+TI5fDLO0nH25ulg72VXxxZ7wB6yIcvYM7bHXrMDNmGSfWRf2Xf2I/mUfEt+Jr/ORzeSLnOf/YPk9x+BkrEQ | n ⇠ N p(✓): ⌧ Gamma(a ,b ) ⇠ 0 0 1 µ ⌧ (µ , (⇢ ⌧) ) AAACXnicbVHPaxQxGM2MWuvU2lUvgpfgosyCLplSqHgq9lBP0oLbFjbr8E022w1NZsbki7iM80/2Vnrpn2Jmu4f+8EHg8d77yJeXotbKIWOXUfzo8ZO1p+vPko3nmy+2ei9fHbvKWyFHotKVPS3ASa1KOUKFWp7WVoIptDwpzvc7/+S3tE5V5Q9c1HJi4KxUMyUAg5T3fJ1ynEuEwRfKf3mY0g+UI3jKnTKByT9oTXMAxkCbQs4+0iJnA8p5Qruk8fTv7bwBnAvQzfc2DV4XT7mdVzlbhgY/m09ZSwdJ3uuzIVuCPiTZivTJCod574JPK+GNLFFocG6csRonDVhUQss24d7JGsQ5nMlxoCUY6SbNsp6Wvg/KlM4qG06JdKnenmjAOLcwRUh2+7v7Xif+zxt7nH2eNKqsPcpS3Fw085piRbuu6VRZKVAvAgFhVdiVijlYEBh+pCshu//kh+R4e5ixYXa009/7uqpjnbwl70hKMrJL9sg3ckhGRJCrKIqSaCO6jtfizXjrJhpHq5nX5A7iN/8A2XSwHg== | ⇠ N 0 0 • Exercise: check p(µ, ⌧ y) = f (µ, y)f (⌧,y) AAACFnicbVDLSsNAFJ34rPUVdelmsAgVtCRF0GXRjcsK9gFNCZPppB06mYR5CCH2K9z4K25cKOJW3Pk3TtostPXAwLnn3Mude4KEUakc59taWl5ZXVsvbZQ3t7Z3du29/baMtcCkhWMWi26AJGGUk5aiipFuIgiKAkY6wfg69zv3REga8zuVJqQfoSGnIcVIGcm3z5KqF+lT6Cmk4QNMT6DHCQx9dyabOvTr1dzNC9+uODVnCrhI3IJUQIGmb395gxjriHCFGZKy5zqJ6mdIKIoZmZQ9LUmC8BgNSc9QjiIi+9n0rAk8NsoAhrEwjys4VX9PZCiSMo0C0xkhNZLzXi7+5/W0Ci/7GeWJVoTj2aJQM6himGcEB1QQrFhqCMKCmr9CPEICYWWSLJsQ3PmTF0m7XnOdmnt7XmlcFXGUwCE4AlXgggvQADegCVoAg0fwDF7Bm/VkvVjv1sesdckqZg7AH1ifPwLUnB8= | 6 1 2 [CSIRO 2004] • MFVB approximation: q⇤(µ, ⌧)=q⇤(µ)q⇤(⌧) = argmin KL(q( ) p( y))

AAACYHicbVHRTtswFHUCG10Ho4w3eLGoJjUIVQlC2l6QEEgIaSCBtBakposc1y0WtpPaN4wqzU/yxsNe9iW4SR4YcCVL555zrnx9HKeCG/D9J8ddWv7wcaXxqfl5de3Lemvja98kmaasRxOR6JuYGCa4Yj3gINhNqhmRsWDX8d3JQr++Z9rwRP2CWcqGkkwUH3NKwFJR68/0924nlNkeDoFkHj7E0yi3fVHRXtlapexrRwjsAbTMiZ5Irooon+KQK3y1cFbKxWn/uCgK/PO8M+2EdJSAh+dznFZ4PvO8qNX2u35Z+C0IatBGdV1GrcdwlNBMMgVUEGMGgZ/C0O4AnApWNMPMsJTQOzJhAwsVkcwM8zKgAn+zzAiPE22PAlyyLydyIo2Zydg6JYFb81pbkO9pgwzGP4Y5V2kGTNHqonEmMCR4kTYecc0oiJkFhGpud8X0lmhCwf5J04YQvH7yW9Df7wZ+N7g6aB8d13E00DbaQR0UoO/oCJ2hS9RDFP11lpxVZ8355zbcdXejsrpOPbOJ/it36xn+5bQ2 µ ⌧ q Q 2 MFVB · || ·| • Coordinate descent (derivation shortly) [Bishop 2006, Sec 10.1.3] 1 q⇤(µ)= (µ µ , ⇢ ) q⇤(⌧) = Gamma(⌧ a ,b ) µ N AAACGXicbZBNSwMxEIazftb6VfXoJVgEK1J2RdCLUPSgJ1GwrdDWZTZNNTTZXZNZsaz9G178K148KOJRT/4b04+DVl8IPLwzw2TeIJbCoOt+OWPjE5NT05mZ7Ozc/MJibmm5YqJEM15mkYz0RQCGSxHyMgqU/CLWHFQgeTVoH/bq1VuujYjCc+zEvKHgKhQtwQCt5efcG7+OkFymm92NHhToPq0jv0Ot0iNQCgb2PfgnWzTwTwp+Lu8W3b7oX/CGkCdDnfq5j3ozYoniITIJxtQ8N8ZGChoFk7ybrSeGx8DacMVrFkNQ3DTS/mVdum6dJm1F2r4Qad/9OZGCMqajAtupAK/NaK1n/lerJdjaa6QijBPkIRssaiWSYkR7MdGm0Jyh7FgApoX9K2XXoIGhDTNrQ/BGT/4Lle2i5xa9s5186WAYR4askjWyQTyyS0rkmJySMmHkgTyRF/LqPDrPzpvzPmgdc4YzK+SXnM9vD6ufrA== N N AAACHXicbVDLSsNAFJ3UV62vqEs3g0VoRUsiBd0IRTeuSgX7gCYNk+nUDp08nJkIJeZH3Pgrblwo4sKN+DdO2iy09cAMh3Pu5d573JBRIQ3jW8stLC4tr+RXC2vrG5tb+vZOSwQRx6SJAxbwjosEYdQnTUklI52QE+S5jLTd0WXqt+8JFzTwb+Q4JLaHbn06oBhJJTl69c6xvKh3WFJ/GZ5Dy0NyiBGL60kqwQelRE79CFp8GDj1XnxsJmVHLxoVYwI4T8yMFEGGhqN/Wv0ARx7xJWZIiK5phNKOEZcUM5IUrEiQEOERuiVdRX3kEWHHk+sSeKCUPhwEXD1fwon6uyNGnhBjz1WV6fJi1kvF/7xuJAdndkz9MJLEx9NBg4hBGcA0KtinnGDJxoogzKnaFeIh4ghLFWhBhWDOnjxPWicV06iY19Vi7SKLIw/2wD4oAROcghq4Ag3QBBg8gmfwCt60J+1Fe9c+pqU5LevZBX+gff0Al86gWQ== N | N ⌧ | “variational parameters” 8 [Hoff 2009; Grogan, Wirth 1981; MacKay 2003; Bishop 2006] Midge wing length • Catalogued midge wing lengths (mm) y =(y1,...,yN ) • Parameters of interest: population mean and precision ✓ =(µ, ⌧) • Model: AAAB/XicbVDLSsNAFJ3UV62v+Ni5GSxCBSmJCLoRim5cVrAPaEKZTCft0MkkzNwINRR/xY0LRdz6H+78G6dtFtp64MLhnHu5954gEVyD43xbhaXlldW14nppY3Nre8fe3WvqOFWUNWgsYtUOiGaCS9YADoK1E8VIFAjWCoY3E7/1wJTmsbyHUcL8iPQlDzklYKSufeDBgAHBV7jiRekp9oCkJ1277FSdKfAicXNSRjnqXfvL68U0jZgEKojWHddJwM+IAk4FG5e8VLOE0CHps46hkkRM+9n0+jE+NkoPh7EyJQFP1d8TGYm0HkWB6YwIDPS8NxH/8zophJd+xmWSApN0tihMBYYYT6LAPa4YBTEyhFDFza2YDogiFExgJROCO//yImmeVV2n6t6dl2vXeRxFdIiOUAW56ALV0C2qowai6BE9o1f0Zj1ZL9a79TFrLVj5zD76A+vzB2iqk+M= iid 1 p(y ✓): y (µ, ⌧ ),n=1,...,N AAACRHicbVBNaxRBFOxJ/Ijr1xqPXhoXYQPrMiOCIgSCXjyFCG4S2F6XNz1vs026e8bu18IwmR/nxR/gzV/gxYMiXsXezRw0saChqHpFv1d5pZWnNP2SbGxeuXrt+taN3s1bt+/c7d/bPvRlcBInstSlO87Bo1YWJ6RI43HlEEyu8Sg/fbXyjz6g86q0b6mucGbgxKqFkkBRmven1bA+E7REgp0XXLwPUPB6brnwBPLUoW6UKtpGeGVaLgzQUoJu9tuhMGHEBUF41zzO2p1Rl7W72UjooiQ/2p/3B+k4XYNfJllHBqzDwbz/WRSlDAYtSQ3eT7O0olkDjpTU2PZE8FjFteAEp5FaMOhnzbqElj+KSsEXpYvPEl+rfycaMN7XJo+TqzP8RW8l/s+bBlo8nzXKVoHQyvOPFkFzKvmqUV4oh5J0HQlIp+KuXC7BgaTYey+WkF08+TI5fDLO0nH25ulg72VXxxZ7wB6yIcvYM7bHXrMDNmGSfWRf2Xf2I/mUfEt+Jr/ORzeSLnOf/YPk9x+BkrEQ | n ⇠ N p(✓): ⌧ Gamma(a ,b ) ⇠ 0 0 1 µ ⌧ (µ , (⇢ ⌧) ) AAACXnicbVHPaxQxGM2MWuvU2lUvgpfgosyCLplSqHgq9lBP0oLbFjbr8E022w1NZsbki7iM80/2Vnrpn2Jmu4f+8EHg8d77yJeXotbKIWOXUfzo8ZO1p+vPko3nmy+2ei9fHbvKWyFHotKVPS3ASa1KOUKFWp7WVoIptDwpzvc7/+S3tE5V5Q9c1HJi4KxUMyUAg5T3fJ1ynEuEwRfKf3mY0g+UI3jKnTKByT9oTXMAxkCbQs4+0iJnA8p5Qruk8fTv7bwBnAvQzfc2DV4XT7mdVzlbhgY/m09ZSwdJ3uuzIVuCPiTZivTJCod574JPK+GNLFFocG6csRonDVhUQss24d7JGsQ5nMlxoCUY6SbNsp6Wvg/KlM4qG06JdKnenmjAOLcwRUh2+7v7Xif+zxt7nH2eNKqsPcpS3Fw085piRbuu6VRZKVAvAgFhVdiVijlYEBh+pCshu//kh+R4e5ixYXa009/7uqpjnbwl70hKMrJL9sg3ckhGRJCrKIqSaCO6jtfizXjrJhpHq5nX5A7iN/8A2XSwHg== | ⇠ N 0 0 • Exercise: check p(µ, ⌧ y) = f (µ, y)f (⌧,y) AAACFnicbVDLSsNAFJ34rPUVdelmsAgVtCRF0GXRjcsK9gFNCZPppB06mYR5CCH2K9z4K25cKOJW3Pk3TtostPXAwLnn3Mude4KEUakc59taWl5ZXVsvbZQ3t7Z3du29/baMtcCkhWMWi26AJGGUk5aiipFuIgiKAkY6wfg69zv3REga8zuVJqQfoSGnIcVIGcm3z5KqF+lT6Cmk4QNMT6DHCQx9dyabOvTr1dzNC9+uODVnCrhI3IJUQIGmb395gxjriHCFGZKy5zqJ6mdIKIoZmZQ9LUmC8BgNSc9QjiIi+9n0rAk8NsoAhrEwjys4VX9PZCiSMo0C0xkhNZLzXi7+5/W0Ci/7GeWJVoTj2aJQM6himGcEB1QQrFhqCMKCmr9CPEICYWWSLJsQ3PmTF0m7XnOdmnt7XmlcFXGUwCE4AlXgggvQADegCVoAg0fwDF7Bm/VkvVjv1sesdckqZg7AH1ifPwLUnB8= | 6 1 2 [CSIRO 2004] • MFVB approximation: q⇤(µ, ⌧)=q⇤(µ)q⇤(⌧) = argmin KL(q( ) p( y))

AAACYHicbVHRTtswFHUCG10Ho4w3eLGoJjUIVQlC2l6QEEgIaSCBtBakposc1y0WtpPaN4wqzU/yxsNe9iW4SR4YcCVL555zrnx9HKeCG/D9J8ddWv7wcaXxqfl5de3Lemvja98kmaasRxOR6JuYGCa4Yj3gINhNqhmRsWDX8d3JQr++Z9rwRP2CWcqGkkwUH3NKwFJR68/0924nlNkeDoFkHj7E0yi3fVHRXtlapexrRwjsAbTMiZ5Irooon+KQK3y1cFbKxWn/uCgK/PO8M+2EdJSAh+dznFZ4PvO8qNX2u35Z+C0IatBGdV1GrcdwlNBMMgVUEGMGgZ/C0O4AnApWNMPMsJTQOzJhAwsVkcwM8zKgAn+zzAiPE22PAlyyLydyIo2Zydg6JYFb81pbkO9pgwzGP4Y5V2kGTNHqonEmMCR4kTYecc0oiJkFhGpud8X0lmhCwf5J04YQvH7yW9Df7wZ+N7g6aB8d13E00DbaQR0UoO/oCJ2hS9RDFP11lpxVZ8355zbcdXejsrpOPbOJ/it36xn+5bQ2 µ ⌧ q Q 2 MFVB · || ·| • Coordinate descent (derivation shortly) [Bishop 2006, Sec 10.1.3] 1 q⇤(µ)= (µ µ , ⇢ ) q⇤(⌧) = Gamma(⌧ a ,b ) µ N AAACGXicbZBNSwMxEIazftb6VfXoJVgEK1J2RdCLUPSgJ1GwrdDWZTZNNTTZXZNZsaz9G178K148KOJRT/4b04+DVl8IPLwzw2TeIJbCoOt+OWPjE5NT05mZ7Ozc/MJibmm5YqJEM15mkYz0RQCGSxHyMgqU/CLWHFQgeTVoH/bq1VuujYjCc+zEvKHgKhQtwQCt5efcG7+OkFymm92NHhToPq0jv0Ot0iNQCgb2PfgnWzTwTwp+Lu8W3b7oX/CGkCdDnfq5j3ozYoniITIJxtQ8N8ZGChoFk7ybrSeGx8DacMVrFkNQ3DTS/mVdum6dJm1F2r4Qad/9OZGCMqajAtupAK/NaK1n/lerJdjaa6QijBPkIRssaiWSYkR7MdGm0Jyh7FgApoX9K2XXoIGhDTNrQ/BGT/4Lle2i5xa9s5186WAYR4askjWyQTyyS0rkmJySMmHkgTyRF/LqPDrPzpvzPmgdc4YzK+SXnM9vD6ufrA== N N AAACHXicbVDLSsNAFJ3UV62vqEs3g0VoRUsiBd0IRTeuSgX7gCYNk+nUDp08nJkIJeZH3Pgrblwo4sKN+DdO2iy09cAMh3Pu5d573JBRIQ3jW8stLC4tr+RXC2vrG5tb+vZOSwQRx6SJAxbwjosEYdQnTUklI52QE+S5jLTd0WXqt+8JFzTwb+Q4JLaHbn06oBhJJTl69c6xvKh3WFJ/GZ5Dy0NyiBGL60kqwQelRE79CFp8GDj1XnxsJmVHLxoVYwI4T8yMFEGGhqN/Wv0ARx7xJWZIiK5phNKOEZcUM5IUrEiQEOERuiVdRX3kEWHHk+sSeKCUPhwEXD1fwon6uyNGnhBjz1WV6fJi1kvF/7xuJAdndkz9MJLEx9NBg4hBGcA0KtinnGDJxoogzKnaFeIh4ghLFWhBhWDOnjxPWicV06iY19Vi7SKLIw/2wD4oAROcghq4Ag3QBBg8gmfwCt60J+1Fe9c+pqU5LevZBX+gff0Al86gWQ== N | N ⌧ | • (µ , ⇢ )=f(a ,b ) “variational Iterate: AAACBnicbVDLSgMxFM3UV62vUZciBItQQcqMCLoRim5clQr2AZ1hyKSZNjTJDElGKKUrN/6KGxeKuPUb3Pk3ZtpZaOuBCyfn3EvuPWHCqNKO820VlpZXVteK66WNza3tHXt3r6XiVGLSxDGLZSdEijAqSFNTzUgnkQTxkJF2OLzJ/PYDkYrG4l6PEuJz1Bc0ohhpIwX2YcXjaVA/hZ4cxEH9BF7BqIIyITSvwC47VWcKuEjcnJRBjkZgf3m9GKecCI0ZUqrrOon2x0hqihmZlLxUkQThIeqTrqECcaL88fSMCTw2Sg9GsTQlNJyqvyfGiCs14qHp5EgP1LyXif953VRHl/6YiiTVRODZR1HKoI5hlgnsUUmwZiNDEJbU7ArxAEmEtUmuZEJw509eJK2zqutU3bvzcu06j6MIDsARqAAXXIAauAUN0AQYPIJn8ArerCfrxXq3PmatBSuf2Qd/YH3+AAdLlkQ= N N N N (a ,b )=g(µ , ⇢ ) parameters” AAACBnicbVDLSgMxFM3UV62vUZciBItQQcqMCLoRim5clQr2AZ1hyKRpG5pMhiQjlKErN/6KGxeKuPUb3Pk3ZtpZaOuBCyfn3EvuPWHMqNKO820VlpZXVteK66WNza3tHXt3r6VEIjFpYsGE7IRIEUYj0tRUM9KJJUE8ZKQdjm4yv/1ApKIiutfjmPgcDSLapxhpIwX2YQUF9VMYBvUTeAUHFY8n2duTQ2GkwC47VWcKuEjcnJRBjkZgf3k9gRNOIo0ZUqrrOrH2UyQ1xYxMSl6iSIzwCA1I19AIcaL8dHrGBB4bpQf7QpqKNJyqvydSxJUa89B0cqSHat7LxP+8bqL7l35KozjRJMKzj/oJg1rALBPYo5JgzcaGICyp2RXiIZIIa5NcyYTgzp+8SFpnVdepunfn5dp1HkcRHIAjUAEuuAA1cAsaoAkweATP4BW8WU/Wi/VufcxaC1Y+sw/+wPr8Af0glkU= N N N N 8 [Hoff 2009; Grogan, Wirth 1981; MacKay 2003; Bishop 2006] Midge wing length • Catalogued midge wing lengths (mm) y =(y1,...,yN ) • Parameters of interest: population mean and precision ✓ =(µ, ⌧) • Model (conjugate prior): AAAB/XicbVDLSsNAFJ3UV62v+Ni5GSxCBSmJCLoRim5cVrAPaEKZTCft0MkkzNwINRR/xY0LRdz6H+78G6dtFtp64MLhnHu5954gEVyD43xbhaXlldW14nppY3Nre8fe3WvqOFWUNWgsYtUOiGaCS9YADoK1E8VIFAjWCoY3E7/1wJTmsbyHUcL8iPQlDzklYKSufeDBgAHBV7jiRekp9oCkJ1277FSdKfAicXNSRjnqXfvL68U0jZgEKojWHddJwM+IAk4FG5e8VLOE0CHps46hkkRM+9n0+jE+NkoPh7EyJQFP1d8TGYm0HkWB6YwIDPS8NxH/8zophJd+xmWSApN0tihMBYYYT6LAPa4YBTEyhFDFza2YDogiFExgJROCO//yImmeVV2n6t6dl2vXeRxFdIiOUAW56ALV0C2qowai6BE9o1f0Zj1ZL9a79TFrLVj5zD76A+vzB2iqk+M= iid 1 p(y ✓): y (µ, ⌧ ),n=1,...,N AAACRHicbVBNaxRBFOxJ/Ijr1xqPXhoXYQPrMiOCIgSCXjyFCG4S2F6XNz1vs026e8bu18IwmR/nxR/gzV/gxYMiXsXezRw0saChqHpFv1d5pZWnNP2SbGxeuXrt+taN3s1bt+/c7d/bPvRlcBInstSlO87Bo1YWJ6RI43HlEEyu8Sg/fbXyjz6g86q0b6mucGbgxKqFkkBRmven1bA+E7REgp0XXLwPUPB6brnwBPLUoW6UKtpGeGVaLgzQUoJu9tuhMGHEBUF41zzO2p1Rl7W72UjooiQ/2p/3B+k4XYNfJllHBqzDwbz/WRSlDAYtSQ3eT7O0olkDjpTU2PZE8FjFteAEp5FaMOhnzbqElj+KSsEXpYvPEl+rfycaMN7XJo+TqzP8RW8l/s+bBlo8nzXKVoHQyvOPFkFzKvmqUV4oh5J0HQlIp+KuXC7BgaTYey+WkF08+TI5fDLO0nH25ulg72VXxxZ7wB6yIcvYM7bHXrMDNmGSfWRf2Xf2I/mUfEt+Jr/ORzeSLnOf/YPk9x+BkrEQ | n ⇠ N p(✓): ⌧ Gamma(a ,b ) ⇠ 0 0 1 µ ⌧ (µ , (⇢ ⌧) ) AAACXnicbVHPaxQxGM2MWuvU2lUvgpfgosyCLplSqHgq9lBP0oLbFjbr8E022w1NZsbki7iM80/2Vnrpn2Jmu4f+8EHg8d77yJeXotbKIWOXUfzo8ZO1p+vPko3nmy+2ei9fHbvKWyFHotKVPS3ASa1KOUKFWp7WVoIptDwpzvc7/+S3tE5V5Q9c1HJi4KxUMyUAg5T3fJ1ynEuEwRfKf3mY0g+UI3jKnTKByT9oTXMAxkCbQs4+0iJnA8p5Qruk8fTv7bwBnAvQzfc2DV4XT7mdVzlbhgY/m09ZSwdJ3uuzIVuCPiTZivTJCod574JPK+GNLFFocG6csRonDVhUQss24d7JGsQ5nMlxoCUY6SbNsp6Wvg/KlM4qG06JdKnenmjAOLcwRUh2+7v7Xif+zxt7nH2eNKqsPcpS3Fw085piRbuu6VRZKVAvAgFhVdiVijlYEBh+pCshu//kh+R4e5ixYXa009/7uqpjnbwl70hKMrJL9sg3ckhGRJCrKIqSaCO6jtfizXjrJhpHq5nX5A7iN/8A2XSwHg== | ⇠ N 0 0 • Exercise: check p(µ, ⌧ y) = f (µ, y)f (⌧,y) AAACFnicbVDLSsNAFJ34rPUVdelmsAgVtCRF0GXRjcsK9gFNCZPppB06mYR5CCH2K9z4K25cKOJW3Pk3TtostPXAwLnn3Mude4KEUakc59taWl5ZXVsvbZQ3t7Z3du29/baMtcCkhWMWi26AJGGUk5aiipFuIgiKAkY6wfg69zv3REga8zuVJqQfoSGnIcVIGcm3z5KqF+lT6Cmk4QNMT6DHCQx9dyabOvTr1dzNC9+uODVnCrhI3IJUQIGmb395gxjriHCFGZKy5zqJ6mdIKIoZmZQ9LUmC8BgNSc9QjiIi+9n0rAk8NsoAhrEwjys4VX9PZCiSMo0C0xkhNZLzXi7+5/W0Ci/7GeWJVoTj2aJQM6himGcEB1QQrFhqCMKCmr9CPEICYWWSLJsQ3PmTF0m7XnOdmnt7XmlcFXGUwCE4AlXgggvQADegCVoAg0fwDF7Bm/VkvVjv1sesdckqZg7AH1ifPwLUnB8= | 6 1 2 [CSIRO 2004] • MFVB approximation: q⇤(µ, ⌧)=q⇤(µ)q⇤(⌧) = argmin KL(q( ) p( y))

AAACYHicbVHRTtswFHUCG10Ho4w3eLGoJjUIVQlC2l6QEEgIaSCBtBakposc1y0WtpPaN4wqzU/yxsNe9iW4SR4YcCVL555zrnx9HKeCG/D9J8ddWv7wcaXxqfl5de3Lemvja98kmaasRxOR6JuYGCa4Yj3gINhNqhmRsWDX8d3JQr++Z9rwRP2CWcqGkkwUH3NKwFJR68/0924nlNkeDoFkHj7E0yi3fVHRXtlapexrRwjsAbTMiZ5Irooon+KQK3y1cFbKxWn/uCgK/PO8M+2EdJSAh+dznFZ4PvO8qNX2u35Z+C0IatBGdV1GrcdwlNBMMgVUEGMGgZ/C0O4AnApWNMPMsJTQOzJhAwsVkcwM8zKgAn+zzAiPE22PAlyyLydyIo2Zydg6JYFb81pbkO9pgwzGP4Y5V2kGTNHqonEmMCR4kTYecc0oiJkFhGpud8X0lmhCwf5J04YQvH7yW9Df7wZ+N7g6aB8d13E00DbaQR0UoO/oCJ2hS9RDFP11lpxVZ8355zbcdXejsrpOPbOJ/it36xn+5bQ2 µ ⌧ q Q 2 MFVB · || ·| • Coordinate descent (derivation shortly) [Bishop 2006, Sec 10.1.3] 1 q⇤(µ)= (µ µ , ⇢ ) q⇤(⌧) = Gamma(⌧ a ,b ) µ N AAACGXicbZBNSwMxEIazftb6VfXoJVgEK1J2RdCLUPSgJ1GwrdDWZTZNNTTZXZNZsaz9G178K148KOJRT/4b04+DVl8IPLwzw2TeIJbCoOt+OWPjE5NT05mZ7Ozc/MJibmm5YqJEM15mkYz0RQCGSxHyMgqU/CLWHFQgeTVoH/bq1VuujYjCc+zEvKHgKhQtwQCt5efcG7+OkFymm92NHhToPq0jv0Ot0iNQCgb2PfgnWzTwTwp+Lu8W3b7oX/CGkCdDnfq5j3ozYoniITIJxtQ8N8ZGChoFk7ybrSeGx8DacMVrFkNQ3DTS/mVdum6dJm1F2r4Qad/9OZGCMqajAtupAK/NaK1n/lerJdjaa6QijBPkIRssaiWSYkR7MdGm0Jyh7FgApoX9K2XXoIGhDTNrQ/BGT/4Lle2i5xa9s5186WAYR4askjWyQTyyS0rkmJySMmHkgTyRF/LqPDrPzpvzPmgdc4YzK+SXnM9vD6ufrA== N N AAACHXicbVDLSsNAFJ3UV62vqEs3g0VoRUsiBd0IRTeuSgX7gCYNk+nUDp08nJkIJeZH3Pgrblwo4sKN+DdO2iy09cAMh3Pu5d573JBRIQ3jW8stLC4tr+RXC2vrG5tb+vZOSwQRx6SJAxbwjosEYdQnTUklI52QE+S5jLTd0WXqt+8JFzTwb+Q4JLaHbn06oBhJJTl69c6xvKh3WFJ/GZ5Dy0NyiBGL60kqwQelRE79CFp8GDj1XnxsJmVHLxoVYwI4T8yMFEGGhqN/Wv0ARx7xJWZIiK5phNKOEZcUM5IUrEiQEOERuiVdRX3kEWHHk+sSeKCUPhwEXD1fwon6uyNGnhBjz1WV6fJi1kvF/7xuJAdndkz9MJLEx9NBg4hBGcA0KtinnGDJxoogzKnaFeIh4ghLFWhBhWDOnjxPWicV06iY19Vi7SKLIw/2wD4oAROcghq4Ag3QBBg8gmfwCt60J+1Fe9c+pqU5LevZBX+gff0Al86gWQ== N | N ⌧ | • (µ , ⇢ )=f(a ,b ) “variational Iterate: AAACBnicbVDLSgMxFM3UV62vUZciBItQQcqMCLoRim5clQr2AZ1hyKSZNjTJDElGKKUrN/6KGxeKuPUb3Pk3ZtpZaOuBCyfn3EvuPWHCqNKO820VlpZXVteK66WNza3tHXt3r6XiVGLSxDGLZSdEijAqSFNTzUgnkQTxkJF2OLzJ/PYDkYrG4l6PEuJz1Bc0ohhpIwX2YcXjaVA/hZ4cxEH9BF7BqIIyITSvwC47VWcKuEjcnJRBjkZgf3m9GKecCI0ZUqrrOon2x0hqihmZlLxUkQThIeqTrqECcaL88fSMCTw2Sg9GsTQlNJyqvyfGiCs14qHp5EgP1LyXif953VRHl/6YiiTVRODZR1HKoI5hlgnsUUmwZiNDEJbU7ArxAEmEtUmuZEJw509eJK2zqutU3bvzcu06j6MIDsARqAAXXIAauAUN0AQYPIJn8ArerCfrxXq3PmatBSuf2Qd/YH3+AAdLlkQ= N N N N (a ,b )=g(µ , ⇢ ) parameters” AAACBnicbVDLSgMxFM3UV62vUZciBItQQcqMCLoRim5clQr2AZ1hyKRpG5pMhiQjlKErN/6KGxeKuPUb3Pk3ZtpZaOuBCyfn3EvuPWHMqNKO820VlpZXVteK66WNza3tHXt3r6VEIjFpYsGE7IRIEUYj0tRUM9KJJUE8ZKQdjm4yv/1ApKIiutfjmPgcDSLapxhpIwX2YQUF9VMYBvUTeAUHFY8n2duTQ2GkwC47VWcKuEjcnJRBjkZgf3k9gRNOIo0ZUqrrOrH2UyQ1xYxMSl6iSIzwCA1I19AIcaL8dHrGBB4bpQf7QpqKNJyqvydSxJUa89B0cqSHat7LxP+8bqL7l35KozjRJMKzj/oJg1rALBPYo5JgzcaGICyp2RXiIZIIa5NcyYTgzp+8SFpnVdepunfn5dp1HkcRHIAjUAEuuAA1cAsaoAkweATP4BW8WU/Wi/VufcxaC1Y+sw/+wPr8Af0glkU= N N N N 8 [Hoff 2009; Grogan, Wirth 1981; MacKay 2003; Bishop 2006] Midge wing length approximation ⌧AAAB63icbVA9SwNBEJ2LXzF+RS1tFoNgFe5E0DJoYxnBxEByhL3NJlmyu3fszgnhyF+wsVDE1j9k579xL7lCEx8MPN6bYWZelEhh0fe/vdLa+sbmVnm7srO7t39QPTxq2zg1jLdYLGPTiajlUmjeQoGSdxLDqYokf4wmt7n/+MSNFbF+wGnCQ0VHWgwFo5hLPaRpv1rz6/4cZJUEBalBgWa/+tUbxCxVXCOT1Npu4CcYZtSgYJLPKr3U8oSyCR3xrqOaKm7DbH7rjJw5ZUCGsXGlkczV3xMZVdZOVeQ6FcWxXfZy8T+vm+LwOsyETlLkmi0WDVNJMCb542QgDGcop45QZoS7lbAxNZShi6fiQgiWX14l7Yt64NeD+8ta46aIowwncArnEMAVNOAOmtACBmN4hld485T34r17H4vWklfMHMMfeJ8/IS2OSA== exact posterior

µAAAB6nicbVDLSgNBEOyNrxhfUY9eBoPgKeyKoMegF48RzQOSJcxOZpMhM7PLTK8QQj7BiwdFvPpF3vwbJ8keNLGgoajqprsrSqWw6PvfXmFtfWNzq7hd2tnd2z8oHx41bZIZxhsskYlpR9RyKTRvoEDJ26nhVEWSt6LR7cxvPXFjRaIfcZzyUNGBFrFgFJ300FVZr1zxq/4cZJUEOalAjnqv/NXtJyxTXCOT1NpO4KcYTqhBwSSflrqZ5SllIzrgHUc1VdyGk/mpU3LmlD6JE+NKI5mrvycmVFk7VpHrVBSHdtmbif95nQzj63AidJoh12yxKM4kwYTM/iZ9YThDOXaEMiPcrYQNqaEMXTolF0Kw/PIqaV5UA78a3F9Wajd5HEU4gVM4hwCuoAZ3UIcGMBjAM7zCmye9F+/d+1i0Frx85hj+wPv8AV1ejdY=

9 [Bishop 2006] Midge wing length approximation ⌧AAAB63icbVA9SwNBEJ2LXzF+RS1tFoNgFe5E0DJoYxnBxEByhL3NJlmyu3fszgnhyF+wsVDE1j9k579xL7lCEx8MPN6bYWZelEhh0fe/vdLa+sbmVnm7srO7t39QPTxq2zg1jLdYLGPTiajlUmjeQoGSdxLDqYokf4wmt7n/+MSNFbF+wGnCQ0VHWgwFo5hLPaRpv1rz6/4cZJUEBalBgWa/+tUbxCxVXCOT1Npu4CcYZtSgYJLPKr3U8oSyCR3xrqOaKm7DbH7rjJw5ZUCGsXGlkczV3xMZVdZOVeQ6FcWxXfZy8T+vm+LwOsyETlLkmi0WDVNJMCb542QgDGcop45QZoS7lbAxNZShi6fiQgiWX14l7Yt64NeD+8ta46aIowwncArnEMAVNOAOmtACBmN4hld485T34r17H4vWklfMHMMfeJ8/IS2OSA== exact posterior

µAAAB6nicbVDLSgNBEOyNrxhfUY9eBoPgKeyKoMegF48RzQOSJcxOZpMhM7PLTK8QQj7BiwdFvPpF3vwbJ8keNLGgoajqprsrSqWw6PvfXmFtfWNzq7hd2tnd2z8oHx41bZIZxhsskYlpR9RyKTRvoEDJ26nhVEWSt6LR7cxvPXFjRaIfcZzyUNGBFrFgFJ300FVZr1zxq/4cZJUEOalAjnqv/NXtJyxTXCOT1NpO4KcYTqhBwSSflrqZ5SllIzrgHUc1VdyGk/mpU3LmlD6JE+NKI5mrvycmVFk7VpHrVBSHdtmbif95nQzj63AidJoh12yxKM4kwYTM/iZ9YThDOXaEMiPcrYQNqaEMXTolF0Kw/PIqaV5UA78a3F9Wajd5HEU4gVM4hwCuoAZ3UIcGMBjAM7zCmye9F+/d+1i0Frx85hj+wPv8AV1ejdY=

9 [Bishop 2006] Midge wing length approximation ⌧AAAB63icbVA9SwNBEJ2LXzF+RS1tFoNgFe5E0DJoYxnBxEByhL3NJlmyu3fszgnhyF+wsVDE1j9k579xL7lCEx8MPN6bYWZelEhh0fe/vdLa+sbmVnm7srO7t39QPTxq2zg1jLdYLGPTiajlUmjeQoGSdxLDqYokf4wmt7n/+MSNFbF+wGnCQ0VHWgwFo5hLPaRpv1rz6/4cZJUEBalBgWa/+tUbxCxVXCOT1Npu4CcYZtSgYJLPKr3U8oSyCR3xrqOaKm7DbH7rjJw5ZUCGsXGlkczV3xMZVdZOVeQ6FcWxXfZy8T+vm+LwOsyETlLkmi0WDVNJMCb542QgDGcop45QZoS7lbAxNZShi6fiQgiWX14l7Yt64NeD+8ta46aIowwncArnEMAVNOAOmtACBmN4hld485T34r17H4vWklfMHMMfeJ8/IS2OSA== exact posterior

µAAAB6nicbVDLSgNBEOyNrxhfUY9eBoPgKeyKoMegF48RzQOSJcxOZpMhM7PLTK8QQj7BiwdFvPpF3vwbJ8keNLGgoajqprsrSqWw6PvfXmFtfWNzq7hd2tnd2z8oHx41bZIZxhsskYlpR9RyKTRvoEDJ26nhVEWSt6LR7cxvPXFjRaIfcZzyUNGBFrFgFJ300FVZr1zxq/4cZJUEOalAjnqv/NXtJyxTXCOT1NpO4KcYTqhBwSSflrqZ5SllIzrgHUc1VdyGk/mpU3LmlD6JE+NKI5mrvycmVFk7VpHrVBSHdtmbif95nQzj63AidJoh12yxKM4kwYTM/iZ9YThDOXaEMiPcrYQNqaEMXTolF0Kw/PIqaV5UA78a3F9Wajd5HEU4gVM4hwCuoAZ3UIcGMBjAM7zCmye9F+/d+1i0Frx85hj+wPv8AV1ejdY=

9 [Bishop 2006] Midge wing length approximation ⌧AAAB63icbVA9SwNBEJ2LXzF+RS1tFoNgFe5E0DJoYxnBxEByhL3NJlmyu3fszgnhyF+wsVDE1j9k579xL7lCEx8MPN6bYWZelEhh0fe/vdLa+sbmVnm7srO7t39QPTxq2zg1jLdYLGPTiajlUmjeQoGSdxLDqYokf4wmt7n/+MSNFbF+wGnCQ0VHWgwFo5hLPaRpv1rz6/4cZJUEBalBgWa/+tUbxCxVXCOT1Npu4CcYZtSgYJLPKr3U8oSyCR3xrqOaKm7DbH7rjJw5ZUCGsXGlkczV3xMZVdZOVeQ6FcWxXfZy8T+vm+LwOsyETlLkmi0WDVNJMCb542QgDGcop45QZoS7lbAxNZShi6fiQgiWX14l7Yt64NeD+8ta46aIowwncArnEMAVNOAOmtACBmN4hld485T34r17H4vWklfMHMMfeJ8/IS2OSA== exact posterior

µAAAB6nicbVDLSgNBEOyNrxhfUY9eBoPgKeyKoMegF48RzQOSJcxOZpMhM7PLTK8QQj7BiwdFvPpF3vwbJ8keNLGgoajqprsrSqWw6PvfXmFtfWNzq7hd2tnd2z8oHx41bZIZxhsskYlpR9RyKTRvoEDJ26nhVEWSt6LR7cxvPXFjRaIfcZzyUNGBFrFgFJ300FVZr1zxq/4cZJUEOalAjnqv/NXtJyxTXCOT1NpO4KcYTqhBwSSflrqZ5SllIzrgHUc1VdyGk/mpU3LmlD6JE+NKI5mrvycmVFk7VpHrVBSHdtmbif95nQzj63AidJoh12yxKM4kwYTM/iZ9YThDOXaEMiPcrYQNqaEMXTolF0Kw/PIqaV5UA78a3F9Wajd5HEU4gVM4hwCuoAZ3UIcGMBjAM7zCmye9F+/d+1i0Frx85hj+wPv8AV1ejdY=

9 [Bishop 2006] Midge wing length • Catalogued midge wing lengths (mm) y =(y1,...,yN ) • Parameters of interest: population mean and precision ✓ =(µ, ⌧) • Model: AAAB/XicbVDLSsNAFJ3UV62v+Ni5GSxCBSmJCLoRim5cVrAPaEKZTCft0MkkzNwINRR/xY0LRdz6H+78G6dtFtp64MLhnHu5954gEVyD43xbhaXlldW14nppY3Nre8fe3WvqOFWUNWgsYtUOiGaCS9YADoK1E8VIFAjWCoY3E7/1wJTmsbyHUcL8iPQlDzklYKSufeDBgAHBV7jiRekp9oCkJ1277FSdKfAicXNSRjnqXfvL68U0jZgEKojWHddJwM+IAk4FG5e8VLOE0CHps46hkkRM+9n0+jE+NkoPh7EyJQFP1d8TGYm0HkWB6YwIDPS8NxH/8zophJd+xmWSApN0tihMBYYYT6LAPa4YBTEyhFDFza2YDogiFExgJROCO//yImmeVV2n6t6dl2vXeRxFdIiOUAW56ALV0C2qowai6BE9o1f0Zj1ZL9a79TFrLVj5zD76A+vzB2iqk+M= iid 1 p(y ✓): y (µ, ⌧ ),n=1,...,N AAACRHicbVBNaxRBFOxJ/Ijr1xqPXhoXYQPrMiOCIgSCXjyFCG4S2F6XNz1vs026e8bu18IwmR/nxR/gzV/gxYMiXsXezRw0saChqHpFv1d5pZWnNP2SbGxeuXrt+taN3s1bt+/c7d/bPvRlcBInstSlO87Bo1YWJ6RI43HlEEyu8Sg/fbXyjz6g86q0b6mucGbgxKqFkkBRmven1bA+E7REgp0XXLwPUPB6brnwBPLUoW6UKtpGeGVaLgzQUoJu9tuhMGHEBUF41zzO2p1Rl7W72UjooiQ/2p/3B+k4XYNfJllHBqzDwbz/WRSlDAYtSQ3eT7O0olkDjpTU2PZE8FjFteAEp5FaMOhnzbqElj+KSsEXpYvPEl+rfycaMN7XJo+TqzP8RW8l/s+bBlo8nzXKVoHQyvOPFkFzKvmqUV4oh5J0HQlIp+KuXC7BgaTYey+WkF08+TI5fDLO0nH25ulg72VXxxZ7wB6yIcvYM7bHXrMDNmGSfWRf2Xf2I/mUfEt+Jr/ORzeSLnOf/YPk9x+BkrEQ | n ⇠ N p(✓): ⌧ Gamma(a ,b ) ⇠ 0 0 1 µ ⌧ (µ , (⇢ ⌧) ) AAACXnicbVHPaxQxGM2MWuvU2lUvgpfgosyCLplSqHgq9lBP0oLbFjbr8E022w1NZsbki7iM80/2Vnrpn2Jmu4f+8EHg8d77yJeXotbKIWOXUfzo8ZO1p+vPko3nmy+2ei9fHbvKWyFHotKVPS3ASa1KOUKFWp7WVoIptDwpzvc7/+S3tE5V5Q9c1HJi4KxUMyUAg5T3fJ1ynEuEwRfKf3mY0g+UI3jKnTKByT9oTXMAxkCbQs4+0iJnA8p5Qruk8fTv7bwBnAvQzfc2DV4XT7mdVzlbhgY/m09ZSwdJ3uuzIVuCPiTZivTJCod574JPK+GNLFFocG6csRonDVhUQss24d7JGsQ5nMlxoCUY6SbNsp6Wvg/KlM4qG06JdKnenmjAOLcwRUh2+7v7Xif+zxt7nH2eNKqsPcpS3Fw085piRbuu6VRZKVAvAgFhVdiVijlYEBh+pCshu//kh+R4e5ixYXa009/7uqpjnbwl70hKMrJL9sg3ckhGRJCrKIqSaCO6jtfizXjrJhpHq5nX5A7iN/8A2XSwHg== | ⇠ N 0 0 • Exercise: check p(µ, ⌧ y) = f (µ, y)f (⌧,y) AAACFnicbVDLSsNAFJ34rPUVdelmsAgVtCRF0GXRjcsK9gFNCZPppB06mYR5CCH2K9z4K25cKOJW3Pk3TtostPXAwLnn3Mude4KEUakc59taWl5ZXVsvbZQ3t7Z3du29/baMtcCkhWMWi26AJGGUk5aiipFuIgiKAkY6wfg69zv3REga8zuVJqQfoSGnIcVIGcm3z5KqF+lT6Cmk4QNMT6DHCQx9dyabOvTr1dzNC9+uODVnCrhI3IJUQIGmb395gxjriHCFGZKy5zqJ6mdIKIoZmZQ9LUmC8BgNSc9QjiIi+9n0rAk8NsoAhrEwjys4VX9PZCiSMo0C0xkhNZLzXi7+5/W0Ci/7GeWJVoTj2aJQM6himGcEB1QQrFhqCMKCmr9CPEICYWWSLJsQ3PmTF0m7XnOdmnt7XmlcFXGUwCE4AlXgggvQADegCVoAg0fwDF7Bm/VkvVjv1sesdckqZg7AH1ifPwLUnB8= | 6 1 2 [CSIRO 2004] • MFVB approximation: q⇤(µ, ⌧)=q⇤(µ)q⇤(⌧) = argmin KL(q( ) p( y))

AAACYHicbVHRTtswFHUCG10Ho4w3eLGoJjUIVQlC2l6QEEgIaSCBtBakposc1y0WtpPaN4wqzU/yxsNe9iW4SR4YcCVL555zrnx9HKeCG/D9J8ddWv7wcaXxqfl5de3Lemvja98kmaasRxOR6JuYGCa4Yj3gINhNqhmRsWDX8d3JQr++Z9rwRP2CWcqGkkwUH3NKwFJR68/0924nlNkeDoFkHj7E0yi3fVHRXtlapexrRwjsAbTMiZ5Irooon+KQK3y1cFbKxWn/uCgK/PO8M+2EdJSAh+dznFZ4PvO8qNX2u35Z+C0IatBGdV1GrcdwlNBMMgVUEGMGgZ/C0O4AnApWNMPMsJTQOzJhAwsVkcwM8zKgAn+zzAiPE22PAlyyLydyIo2Zydg6JYFb81pbkO9pgwzGP4Y5V2kGTNHqonEmMCR4kTYecc0oiJkFhGpud8X0lmhCwf5J04YQvH7yW9Df7wZ+N7g6aB8d13E00DbaQR0UoO/oCJ2hS9RDFP11lpxVZ8355zbcdXejsrpOPbOJ/it36xn+5bQ2 µ ⌧ q Q 2 MFVB · || ·| • Coordinate descent (derivation shortly) [Bishop 2006, Sec 10.1.3] 1 q⇤(µ)= (µ µ , ⇢ ) q⇤(⌧) = Gamma(⌧ a ,b ) µ N AAACGXicbZBNSwMxEIazftb6VfXoJVgEK1J2RdCLUPSgJ1GwrdDWZTZNNTTZXZNZsaz9G178K148KOJRT/4b04+DVl8IPLwzw2TeIJbCoOt+OWPjE5NT05mZ7Ozc/MJibmm5YqJEM15mkYz0RQCGSxHyMgqU/CLWHFQgeTVoH/bq1VuujYjCc+zEvKHgKhQtwQCt5efcG7+OkFymm92NHhToPq0jv0Ot0iNQCgb2PfgnWzTwTwp+Lu8W3b7oX/CGkCdDnfq5j3ozYoniITIJxtQ8N8ZGChoFk7ybrSeGx8DacMVrFkNQ3DTS/mVdum6dJm1F2r4Qad/9OZGCMqajAtupAK/NaK1n/lerJdjaa6QijBPkIRssaiWSYkR7MdGm0Jyh7FgApoX9K2XXoIGhDTNrQ/BGT/4Lle2i5xa9s5186WAYR4askjWyQTyyS0rkmJySMmHkgTyRF/LqPDrPzpvzPmgdc4YzK+SXnM9vD6ufrA== N N AAACHXicbVDLSsNAFJ3UV62vqEs3g0VoRUsiBd0IRTeuSgX7gCYNk+nUDp08nJkIJeZH3Pgrblwo4sKN+DdO2iy09cAMh3Pu5d573JBRIQ3jW8stLC4tr+RXC2vrG5tb+vZOSwQRx6SJAxbwjosEYdQnTUklI52QE+S5jLTd0WXqt+8JFzTwb+Q4JLaHbn06oBhJJTl69c6xvKh3WFJ/GZ5Dy0NyiBGL60kqwQelRE79CFp8GDj1XnxsJmVHLxoVYwI4T8yMFEGGhqN/Wv0ARx7xJWZIiK5phNKOEZcUM5IUrEiQEOERuiVdRX3kEWHHk+sSeKCUPhwEXD1fwon6uyNGnhBjz1WV6fJi1kvF/7xuJAdndkz9MJLEx9NBg4hBGcA0KtinnGDJxoogzKnaFeIh4ghLFWhBhWDOnjxPWicV06iY19Vi7SKLIw/2wD4oAROcghq4Ag3QBBg8gmfwCt60J+1Fe9c+pqU5LevZBX+gff0Al86gWQ== N | N ⌧ | • (µ , ⇢ )=f(a ,b ) “variational Iterate: AAACBnicbVDLSgMxFM3UV62vUZciBItQQcqMCLoRim5clQr2AZ1hyKSZNjTJDElGKKUrN/6KGxeKuPUb3Pk3ZtpZaOuBCyfn3EvuPWHCqNKO820VlpZXVteK66WNza3tHXt3r6XiVGLSxDGLZSdEijAqSFNTzUgnkQTxkJF2OLzJ/PYDkYrG4l6PEuJz1Bc0ohhpIwX2YcXjaVA/hZ4cxEH9BF7BqIIyITSvwC47VWcKuEjcnJRBjkZgf3m9GKecCI0ZUqrrOon2x0hqihmZlLxUkQThIeqTrqECcaL88fSMCTw2Sg9GsTQlNJyqvyfGiCs14qHp5EgP1LyXif953VRHl/6YiiTVRODZR1HKoI5hlgnsUUmwZiNDEJbU7ArxAEmEtUmuZEJw509eJK2zqutU3bvzcu06j6MIDsARqAAXXIAauAUN0AQYPIJn8ArerCfrxXq3PmatBSuf2Qd/YH3+AAdLlkQ= N N N N (a ,b )=g(µ , ⇢ ) parameters” AAACBnicbVDLSgMxFM3UV62vUZciBItQQcqMCLoRim5clQr2AZ1hyKRpG5pMhiQjlKErN/6KGxeKuPUb3Pk3ZtpZaOuBCyfn3EvuPWHMqNKO820VlpZXVteK66WNza3tHXt3r6VEIjFpYsGE7IRIEUYj0tRUM9KJJUE8ZKQdjm4yv/1ApKIiutfjmPgcDSLapxhpIwX2YQUF9VMYBvUTeAUHFY8n2duTQ2GkwC47VWcKuEjcnJRBjkZgf3k9gRNOIo0ZUqrrOrH2UyQ1xYxMSl6iSIzwCA1I19AIcaL8dHrGBB4bpQf7QpqKNJyqvydSxJUa89B0cqSHat7LxP+8bqL7l35KozjRJMKzj/oJg1rALBPYo5JgzcaGICyp2RXiIZIIa5NcyYTgzp+8SFpnVdepunfn5dp1HkcRHIAjUAEuuAA1cAsaoAkweATP4BW8WU/Wi/VufcxaC1Y+sw/+wPr8Af0glkU= N N N N 8 [Hoff 2009; Grogan, Wirth 1981; MacKay 2003; Bishop 2006] Midge wing length • Catalogued midge wing lengths (mm) y =(y1,...,yN ) • Parameters of interest: population mean and precision ✓ =(µ, ⌧) • Model: AAAB/XicbVDLSsNAFJ3UV62v+Ni5GSxCBSmJCLoRim5cVrAPaEKZTCft0MkkzNwINRR/xY0LRdz6H+78G6dtFtp64MLhnHu5954gEVyD43xbhaXlldW14nppY3Nre8fe3WvqOFWUNWgsYtUOiGaCS9YADoK1E8VIFAjWCoY3E7/1wJTmsbyHUcL8iPQlDzklYKSufeDBgAHBV7jiRekp9oCkJ1277FSdKfAicXNSRjnqXfvL68U0jZgEKojWHddJwM+IAk4FG5e8VLOE0CHps46hkkRM+9n0+jE+NkoPh7EyJQFP1d8TGYm0HkWB6YwIDPS8NxH/8zophJd+xmWSApN0tihMBYYYT6LAPa4YBTEyhFDFza2YDogiFExgJROCO//yImmeVV2n6t6dl2vXeRxFdIiOUAW56ALV0C2qowai6BE9o1f0Zj1ZL9a79TFrLVj5zD76A+vzB2iqk+M= iid 1 p(y ✓): y (µ, ⌧ ),n=1,...,N AAACRHicbVBNaxRBFOxJ/Ijr1xqPXhoXYQPrMiOCIgSCXjyFCG4S2F6XNz1vs026e8bu18IwmR/nxR/gzV/gxYMiXsXezRw0saChqHpFv1d5pZWnNP2SbGxeuXrt+taN3s1bt+/c7d/bPvRlcBInstSlO87Bo1YWJ6RI43HlEEyu8Sg/fbXyjz6g86q0b6mucGbgxKqFkkBRmven1bA+E7REgp0XXLwPUPB6brnwBPLUoW6UKtpGeGVaLgzQUoJu9tuhMGHEBUF41zzO2p1Rl7W72UjooiQ/2p/3B+k4XYNfJllHBqzDwbz/WRSlDAYtSQ3eT7O0olkDjpTU2PZE8FjFteAEp5FaMOhnzbqElj+KSsEXpYvPEl+rfycaMN7XJo+TqzP8RW8l/s+bBlo8nzXKVoHQyvOPFkFzKvmqUV4oh5J0HQlIp+KuXC7BgaTYey+WkF08+TI5fDLO0nH25ulg72VXxxZ7wB6yIcvYM7bHXrMDNmGSfWRf2Xf2I/mUfEt+Jr/ORzeSLnOf/YPk9x+BkrEQ | n ⇠ N p(✓): ⌧ Gamma(a ,b ) ⇠ 0 0 1 µ ⌧ (µ , (⇢ ⌧) ) AAACXnicbVHPaxQxGM2MWuvU2lUvgpfgosyCLplSqHgq9lBP0oLbFjbr8E022w1NZsbki7iM80/2Vnrpn2Jmu4f+8EHg8d77yJeXotbKIWOXUfzo8ZO1p+vPko3nmy+2ei9fHbvKWyFHotKVPS3ASa1KOUKFWp7WVoIptDwpzvc7/+S3tE5V5Q9c1HJi4KxUMyUAg5T3fJ1ynEuEwRfKf3mY0g+UI3jKnTKByT9oTXMAxkCbQs4+0iJnA8p5Qruk8fTv7bwBnAvQzfc2DV4XT7mdVzlbhgY/m09ZSwdJ3uuzIVuCPiTZivTJCod574JPK+GNLFFocG6csRonDVhUQss24d7JGsQ5nMlxoCUY6SbNsp6Wvg/KlM4qG06JdKnenmjAOLcwRUh2+7v7Xif+zxt7nH2eNKqsPcpS3Fw085piRbuu6VRZKVAvAgFhVdiVijlYEBh+pCshu//kh+R4e5ixYXa009/7uqpjnbwl70hKMrJL9sg3ckhGRJCrKIqSaCO6jtfizXjrJhpHq5nX5A7iN/8A2XSwHg== | ⇠ N 0 0 • Exercise: check p(µ, ⌧ y) = f (µ, y)f (⌧,y) AAACFnicbVDLSsNAFJ34rPUVdelmsAgVtCRF0GXRjcsK9gFNCZPppB06mYR5CCH2K9z4K25cKOJW3Pk3TtostPXAwLnn3Mude4KEUakc59taWl5ZXVsvbZQ3t7Z3du29/baMtcCkhWMWi26AJGGUk5aiipFuIgiKAkY6wfg69zv3REga8zuVJqQfoSGnIcVIGcm3z5KqF+lT6Cmk4QNMT6DHCQx9dyabOvTr1dzNC9+uODVnCrhI3IJUQIGmb395gxjriHCFGZKy5zqJ6mdIKIoZmZQ9LUmC8BgNSc9QjiIi+9n0rAk8NsoAhrEwjys4VX9PZCiSMo0C0xkhNZLzXi7+5/W0Ci/7GeWJVoTj2aJQM6himGcEB1QQrFhqCMKCmr9CPEICYWWSLJsQ3PmTF0m7XnOdmnt7XmlcFXGUwCE4AlXgggvQADegCVoAg0fwDF7Bm/VkvVjv1sesdckqZg7AH1ifPwLUnB8= | 6 1 2 [CSIRO 2004] • MFVB approximation: q⇤(µ, ⌧)=q⇤(µ)q⇤(⌧) = argmin KL(q( ) p( y))

AAACYHicbVHRTtswFHUCG10Ho4w3eLGoJjUIVQlC2l6QEEgIaSCBtBakposc1y0WtpPaN4wqzU/yxsNe9iW4SR4YcCVL555zrnx9HKeCG/D9J8ddWv7wcaXxqfl5de3Lemvja98kmaasRxOR6JuYGCa4Yj3gINhNqhmRsWDX8d3JQr++Z9rwRP2CWcqGkkwUH3NKwFJR68/0924nlNkeDoFkHj7E0yi3fVHRXtlapexrRwjsAbTMiZ5Irooon+KQK3y1cFbKxWn/uCgK/PO8M+2EdJSAh+dznFZ4PvO8qNX2u35Z+C0IatBGdV1GrcdwlNBMMgVUEGMGgZ/C0O4AnApWNMPMsJTQOzJhAwsVkcwM8zKgAn+zzAiPE22PAlyyLydyIo2Zydg6JYFb81pbkO9pgwzGP4Y5V2kGTNHqonEmMCR4kTYecc0oiJkFhGpud8X0lmhCwf5J04YQvH7yW9Df7wZ+N7g6aB8d13E00DbaQR0UoO/oCJ2hS9RDFP11lpxVZ8355zbcdXejsrpOPbOJ/it36xn+5bQ2 µ ⌧ q Q 2 MFVB · || ·| • Coordinate descent (derivation shortly) [Bishop 2006, Sec 10.1.3] 1 q⇤(µ)= (µ µ , ⇢ ) q⇤(⌧) = Gamma(⌧ a ,b ) µ N AAACGXicbZBNSwMxEIazftb6VfXoJVgEK1J2RdCLUPSgJ1GwrdDWZTZNNTTZXZNZsaz9G178K148KOJRT/4b04+DVl8IPLwzw2TeIJbCoOt+OWPjE5NT05mZ7Ozc/MJibmm5YqJEM15mkYz0RQCGSxHyMgqU/CLWHFQgeTVoH/bq1VuujYjCc+zEvKHgKhQtwQCt5efcG7+OkFymm92NHhToPq0jv0Ot0iNQCgb2PfgnWzTwTwp+Lu8W3b7oX/CGkCdDnfq5j3ozYoniITIJxtQ8N8ZGChoFk7ybrSeGx8DacMVrFkNQ3DTS/mVdum6dJm1F2r4Qad/9OZGCMqajAtupAK/NaK1n/lerJdjaa6QijBPkIRssaiWSYkR7MdGm0Jyh7FgApoX9K2XXoIGhDTNrQ/BGT/4Lle2i5xa9s5186WAYR4askjWyQTyyS0rkmJySMmHkgTyRF/LqPDrPzpvzPmgdc4YzK+SXnM9vD6ufrA== N N AAACHXicbVDLSsNAFJ3UV62vqEs3g0VoRUsiBd0IRTeuSgX7gCYNk+nUDp08nJkIJeZH3Pgrblwo4sKN+DdO2iy09cAMh3Pu5d573JBRIQ3jW8stLC4tr+RXC2vrG5tb+vZOSwQRx6SJAxbwjosEYdQnTUklI52QE+S5jLTd0WXqt+8JFzTwb+Q4JLaHbn06oBhJJTl69c6xvKh3WFJ/GZ5Dy0NyiBGL60kqwQelRE79CFp8GDj1XnxsJmVHLxoVYwI4T8yMFEGGhqN/Wv0ARx7xJWZIiK5phNKOEZcUM5IUrEiQEOERuiVdRX3kEWHHk+sSeKCUPhwEXD1fwon6uyNGnhBjz1WV6fJi1kvF/7xuJAdndkz9MJLEx9NBg4hBGcA0KtinnGDJxoogzKnaFeIh4ghLFWhBhWDOnjxPWicV06iY19Vi7SKLIw/2wD4oAROcghq4Ag3QBBg8gmfwCt60J+1Fe9c+pqU5LevZBX+gff0Al86gWQ== N | N ⌧ | • (µ , ⇢ )=f(a ,b ) “variational Iterate: AAACBnicbVDLSgMxFM3UV62vUZciBItQQcqMCLoRim5clQr2AZ1hyKSZNjTJDElGKKUrN/6KGxeKuPUb3Pk3ZtpZaOuBCyfn3EvuPWHCqNKO820VlpZXVteK66WNza3tHXt3r6XiVGLSxDGLZSdEijAqSFNTzUgnkQTxkJF2OLzJ/PYDkYrG4l6PEuJz1Bc0ohhpIwX2YcXjaVA/hZ4cxEH9BF7BqIIyITSvwC47VWcKuEjcnJRBjkZgf3m9GKecCI0ZUqrrOon2x0hqihmZlLxUkQThIeqTrqECcaL88fSMCTw2Sg9GsTQlNJyqvyfGiCs14qHp5EgP1LyXif953VRHl/6YiiTVRODZR1HKoI5hlgnsUUmwZiNDEJbU7ArxAEmEtUmuZEJw509eJK2zqutU3bvzcu06j6MIDsARqAAXXIAauAUN0AQYPIJn8ArerCfrxXq3PmatBSuf2Qd/YH3+AAdLlkQ= N N N N (a ,b )=g(µ , ⇢ ) parameters” [board] AAACBnicbVDLSgMxFM3UV62vUZciBItQQcqMCLoRim5clQr2AZ1hyKRpG5pMhiQjlKErN/6KGxeKuPUb3Pk3ZtpZaOuBCyfn3EvuPWHMqNKO820VlpZXVteK66WNza3tHXt3r6VEIjFpYsGE7IRIEUYj0tRUM9KJJUE8ZKQdjm4yv/1ApKIiutfjmPgcDSLapxhpIwX2YQUF9VMYBvUTeAUHFY8n2duTQ2GkwC47VWcKuEjcnJRBjkZgf3k9gRNOIo0ZUqrrOrH2UyQ1xYxMSl6iSIzwCA1I19AIcaL8dHrGBB4bpQf7QpqKNJyqvydSxJUa89B0cqSHat7LxP+8bqL7l35KozjRJMKzj/oJg1rALBPYo5JgzcaGICyp2RXiIZIIa5NcyYTgzp+8SFpnVdepunfn5dp1HkcRHIAjUAEuuAA1cAsaoAkweATP4BW8WU/Wi/VufcxaC1Y+sw/+wPr8Af0glkU= N N N N 8 [Hoff 2009; Grogan, Wirth 1981; MacKay 2003; Bishop 2006] Midge wing length approximation ⌧AAAB63icbVA9SwNBEJ2LXzF+RS1tFoNgFe5E0DJoYxnBxEByhL3NJlmyu3fszgnhyF+wsVDE1j9k579xL7lCEx8MPN6bYWZelEhh0fe/vdLa+sbmVnm7srO7t39QPTxq2zg1jLdYLGPTiajlUmjeQoGSdxLDqYokf4wmt7n/+MSNFbF+wGnCQ0VHWgwFo5hLPaRpv1rz6/4cZJUEBalBgWa/+tUbxCxVXCOT1Npu4CcYZtSgYJLPKr3U8oSyCR3xrqOaKm7DbH7rjJw5ZUCGsXGlkczV3xMZVdZOVeQ6FcWxXfZy8T+vm+LwOsyETlLkmi0WDVNJMCb542QgDGcop45QZoS7lbAxNZShi6fiQgiWX14l7Yt64NeD+8ta46aIowwncArnEMAVNOAOmtACBmN4hld485T34r17H4vWklfMHMMfeJ8/IS2OSA== exact posterior

µAAAB6nicbVDLSgNBEOyNrxhfUY9eBoPgKeyKoMegF48RzQOSJcxOZpMhM7PLTK8QQj7BiwdFvPpF3vwbJ8keNLGgoajqprsrSqWw6PvfXmFtfWNzq7hd2tnd2z8oHx41bZIZxhsskYlpR9RyKTRvoEDJ26nhVEWSt6LR7cxvPXFjRaIfcZzyUNGBFrFgFJ300FVZr1zxq/4cZJUEOalAjnqv/NXtJyxTXCOT1NpO4KcYTqhBwSSflrqZ5SllIzrgHUc1VdyGk/mpU3LmlD6JE+NKI5mrvycmVFk7VpHrVBSHdtmbif95nQzj63AidJoh12yxKM4kwYTM/iZ9YThDOXaEMiPcrYQNqaEMXTolF0Kw/PIqaV5UA78a3F9Wajd5HEU4gVM4hwCuoAZ3UIcGMBjAM7zCmye9F+/d+1i0Frx85hj+wPv8AV1ejdY=

9 [Bishop 2006] Microcredit Experiment

[amcharts.com 2016] 10 Microcredit Experiment

• Simplified from Meager (2018a)

!

!

! • Priors and hyperpriors:

[amcharts.com 2016] 10 Microcredit Experiment

• Simplified from Meager (2018a) • K = 7 microcredit trials (Mexico, Mongolia, Bosnia, India, Morocco, Philippines, Ethiopia)

!

! • Priors and hyperpriors:

[amcharts.com 2016] 10 Microcredit Experiment

• Simplified from Meager (2018a) • K = 7 microcredit trials (Mexico, Mongolia, Bosnia, India, Morocco, Philippines, Ethiopia)

• Nk businesses in kth site (~900 to ~17K) • Profit of nth business at kth site:

!

! • Priors and hyperpriors:

[amcharts.com 2016] 10 Microcredit Experiment

• Simplified from Meager (2018a) • K = 7 microcredit trials (Mexico, Mongolia, Bosnia, India, Morocco, Philippines, Ethiopia)

• Nk businesses in kth site (~900 to ~17K) • Profit of nth business at kth site:

!

! • Priors and hyperpriors:

10 Microcredit Experiment

• Simplified from Meager (2018a) • K = 7 microcredit trials (Mexico, Mongolia, Bosnia, India, Morocco, Philippines, Ethiopia)

• Nk businesses in kth site (~900 to ~17K) • Profit of nth business at kth site:

!

! • Priors and hyperpriors:

10 Microcredit Experiment

• Simplified from Meager (2018a) • K = 7 microcredit trials (Mexico, Mongolia, Bosnia, India, Morocco, Philippines, Ethiopia)

• Nk businesses in kth site (~900 to ~17K) • Profit of nth business at kth site:

! profit indep y (µ + T ⌧ , 2) ! kn ⇠ N k kn k k • Priors and hyperpriors:

10 Microcredit Experiment

• Simplified from Meager (2018a) • K = 7 microcredit trials (Mexico, Mongolia, Bosnia, India, Morocco, Philippines, Ethiopia)

• Nk businesses in kth site (~900 to ~17K) • Profit of nth business at kth site:

! profit indep y (µ + T ⌧ , 2) ! kn ⇠ N k kn k k • Priors and hyperpriors:

10 Microcredit Experiment

• Simplified from Meager (2018a) • K = 7 microcredit trials (Mexico, Mongolia, Bosnia, India, Morocco, Philippines, Ethiopia)

• Nk businesses in kth site (~900 to ~17K) • Profit of nth business at kth site:

! profit indep y (µ + T ⌧ , 2) ! kn ⇠ N k kn k k • Priors and hyperpriors:

10 Microcredit Experiment

• Simplified from Meager (2018a) • K = 7 microcredit trials (Mexico, Mongolia, Bosnia, India, Morocco, Philippines, Ethiopia)

• Nk businesses in kth site (~900 to ~17K) • Profit of nth business at kth site:

! profit indep y (µ + T ⌧ , 2) ! kn ⇠ N k kn k k • Priors and hyperpriors:

10 Microcredit Experiment

• Simplified from Meager (2018a) • K = 7 microcredit trials (Mexico, Mongolia, Bosnia, India, Morocco, Philippines, Ethiopia)

• Nk businesses in kth site (~900 to ~17K) • Profit of nth business at kth site: 1 if microcredit

! profit indep y (µ + T ⌧ , 2) ! kn ⇠ N k kn k k • Priors and hyperpriors:

10 Microcredit Experiment

• Simplified from Meager (2018a) • K = 7 microcredit trials (Mexico, Mongolia, Bosnia, India, Morocco, Philippines, Ethiopia)

• Nk businesses in kth site (~900 to ~17K) • Profit of nth business at kth site: 1 if microcredit

! profit indep y (µ + T ⌧ , 2) ! kn ⇠ N k kn k k • Priors and hyperpriors:

10 Microcredit Experiment

• Simplified from Meager (2018a) • K = 7 microcredit trials (Mexico, Mongolia, Bosnia, India, Morocco, Philippines, Ethiopia)

• Nk businesses in kth site (~900 to ~17K) • Profit of nth business at kth site: 1 if microcredit

! profit indep y (µ + T ⌧ , 2) ! kn ⇠ N k kn k k • Priors and hyperpriors:

10 Microcredit Experiment

• Simplified from Meager (2018a) • K = 7 microcredit trials (Mexico, Mongolia, Bosnia, India, Morocco, Philippines, Ethiopia)

• Nk businesses in kth site (~900 to ~17K) • Profit of nth business at kth site: 1 if microcredit

! profit indep y (µ + T ⌧ , 2) ! kn ⇠ N k kn k k • Priors and hyperpriors:

10 Microcredit Experiment

• Simplified from Meager (2018a) • K = 7 microcredit trials (Mexico, Mongolia, Bosnia, India, Morocco, Philippines, Ethiopia)

• Nk businesses in kth site (~900 to ~17K) • Profit of nth business at kth site: 1 if microcredit

! profit indep y (µ + T ⌧ , 2) ! kn ⇠ N k kn k k • Priors and hyperpriors:

10 Microcredit Experiment

• Simplified from Meager (2018a) • K = 7 microcredit trials (Mexico, Mongolia, Bosnia, India, Morocco, Philippines, Ethiopia)

• Nk businesses in kth site (~900 to ~17K) • Profit of nth business at kth site: 1 if microcredit

! profit indep y (µ + T ⌧ , 2) ! kn ⇠ N k kn k k • Priors and hyperpriors:

µ iid µ k ,C ⌧k ⇠ N ⌧ ✓ ◆ ✓✓ ◆ ◆

10 Microcredit Experiment

• Simplified from Meager (2018a) • K = 7 microcredit trials (Mexico, Mongolia, Bosnia, India, Morocco, Philippines, Ethiopia)

• Nk businesses in kth site (~900 to ~17K) • Profit of nth business at kth site: 1 if microcredit

! profit indep y (µ + T ⌧ , 2) ! kn ⇠ N k kn k k • Priors and hyperpriors:

µ iid µ k ,C ⌧k ⇠ N ⌧ ✓ ◆ ✓✓ ◆ ◆

2 iid k (a, b) 10 ⇠ Microcredit Experiment

• Simplified from Meager (2018a) • K = 7 microcredit trials (Mexico, Mongolia, Bosnia, India, Morocco, Philippines, Ethiopia)

• Nk businesses in kth site (~900 to ~17K) • Profit of nth business at kth site: 1 if microcredit

! profit indep y (µ + T ⌧ , 2) ! kn ⇠ N k kn k k • Priors and hyperpriors:

µk iid µ µ iid µ0 1 ,C , ⇤ ⌧k ⇠ N ⌧ ⌧ ⇠ N ⌧0 ✓ ◆ ✓✓ ◆ ◆ ✓ ◆ ✓✓ ◆ ◆

2 iid k (a, b) C Sep&LKJ(⌘,c,d) 10 ⇠ ⇠ Microcredit

MFVB: How will we know if it’s working?

11 11 Microcredit

MFVB [Giordano, Broderick, Meager, Huggins, Jordan 2016] Microcredit

• One set of 2500 MCMC draws: 45 minutes MFVB

11 [Giordano, Broderick, Meager, Huggins, Jordan 2016] Microcredit

• One set of 2500 MCMC draws: 45 minutes

• MFVB MFVB optimization: <1 min

11 [Giordano, Broderick, Meager, Huggins, Jordan 2016] Microcredit

• One set of 2500 MCMC draws: 45 minutes

• MFVB MFVB optimization: <1 min

11 [Giordano, Broderick, Meager, Huggins, Jordan 2016] Microcredit

• One set of 2500 MCMC draws: 45 minutes

• MFVB MFVB optimization: <1 min

Criteo Online Ads Experiment • Click-through conversion prediction • Q: Will a customer (e.g.) buy a product after clicking?

11 [Giordano, Broderick, Meager, Huggins, Jordan 2016] Microcredit

• One set of 2500 MCMC draws: 45 minutes

• MFVB MFVB optimization: <1 min

Criteo Online Ads Experiment • Click-through conversion prediction • Q: Will a customer (e.g.) buy a product after clicking? • Q: How predictive of conversion are different features?

11 [Giordano, Broderick, Meager, Huggins, Jordan 2016] Microcredit

• One set of 2500 MCMC draws: 45 minutes

• MFVB MFVB optimization: <1 min

Criteo Online Ads Experiment • Click-through conversion prediction • Q: Will a customer (e.g.) buy a product after clicking? • Q: How predictive of conversion are different features? • Logistic GLMM [board] 11 [Giordano, Broderick, Meager, Huggins, Jordan 2016; Giordano, Broderick, Jordan 2017] Microcredit

• One set of 2500 MCMC draws: 45 minutes

• MFVB MFVB optimization: <1 min

Criteo Online Ads Experiment • Click-through conversion prediction • Q: Will a customer (e.g.) buy a product after clicking? • Q: How predictive of conversion are different features? • Logistic GLMM; N = 61,895 subset to compare to MCMC 11 [Giordano, Broderick, Meager, Huggins, Jordan 2016; Giordano, Broderick, Jordan 2017] Criteo Online Ads Experiment

12 [Giordano, Broderick, Jordan 2017] Criteo Online Ads Experiment

• MAP: 12 s

12 [Giordano, Broderick, Jordan 2017] Criteo Online Ads Experiment Global parameters (-τ) Global parameter τ Local parameters MAP MAP MAP

MCMC MCMC MCMC • MAP: 12 s

12 [Giordano, Broderick, Jordan 2017] Criteo Online Ads Experiment Global parameters (-τ) Global parameter τ Local parameters MAP MAP MAP

MCMC MCMC MCMC • MAP: 12 s • MFVB: 57 s

12 [Giordano, Broderick, Jordan 2017] Criteo Online Ads Experiment Global parameters (-τ) Global parameter τ Local parameters MAP MAP MAP

MCMC MCMC MCMC Global parameters (all) Local parameters • MAP: 12 s • MFVB: 57 s MFVB MFVB

MCMC

12 MCMC [Giordano, Broderick, Jordan 2017] Criteo Online Ads Experiment Global parameters (-τ) Global parameter τ Local parameters MAP MAP MAP

MCMC MCMC MCMC Global parameters (all) Local parameters • MAP: 12 s • MFVB: 57 s • MCMC (5K MFVB samples): MFVB 21,066 s (5.85 h) MCMC

12 MCMC [Giordano, Broderick, Jordan 2017] Roadmap

• Bayes & Approximate Bayes review • What is: • Variational Bayes (VB) • Mean-field variational Bayes (MFVB) • Why use MFVB? • When can we trust MFVB? • Where do we go from here? Latent Dirichlet Allocation (LDA)

13 Latent Dirichlet Allocation (LDA)

• Topics

13 Latent Dirichlet Allocation (LDA)

• Topics: two perspectives • Each document can belong to multiple groups • Cluster words in documents

13 Latent Dirichlet Allocation (LDA)

• Topics: two perspectives • Each document can belong to multiple groups • Cluster words in documents

[Blei, Ng, Jordan 13 2003] Latent Dirichlet Allocation (LDA)

• Topics: two perspectives • Each document can belong to multiple groups • Cluster words in documents

[Blei, Ng, Jordan 2003; Pritchard, Stephens, Donnelly 13 2000] Roadmap

• Bayes & Approximate Bayes review • What is: • Variational Bayes (VB) • Mean-field variational Bayes (MFVB) • Why use MFVB? • When can we trust MFVB? • Where do we go from here? Approximate Bayesian inference

qAAAB6nicbVBNS8NAEJ34WetX1aOXxSKIh5KIoMeiF48V7Qe0sWy2k3bpZhN3N0IJ/QlePCji1V/kzX/jts1BWx8MPN6bYWZekAiujet+O0vLK6tr64WN4ubW9s5uaW+/oeNUMayzWMSqFVCNgkusG24EthKFNAoENoPh9cRvPqHSPJb3ZpSgH9G+5CFn1Fjp7vHhtFsquxV3CrJIvJyUIUetW/rq9GKWRigNE1Trtucmxs+oMpwJHBc7qcaEsiHtY9tSSSPUfjY9dUyOrdIjYaxsSUOm6u+JjEZaj6LAdkbUDPS8NxH/89qpCS/9jMskNSjZbFGYCmJiMvmb9LhCZsTIEsoUt7cSNqCKMmPTKdoQvPmXF0njrOK5Fe/2vFy9yuMowCEcwQl4cAFVuIEa1IFBH57hFd4c4bw4787HrHXJyWcO4A+czx/01o2R ⇤ p( y) Use to approximate AAAB83icbVBNS8NAEJ3Ur1q/qh69LBahXkoigh6LXjxWsB/QhLLZbtqlm82yuxFC7N/w4kERr/4Zb/4bt20O2vpg4PHeDDPzQsmZNq777ZTW1jc2t8rblZ3dvf2D6uFRRyepIrRNEp6oXog15UzQtmGG055UFMchp91wcjvzu49UaZaIB5NJGsR4JFjECDZW8mXdJ8PEoCeUnQ+qNbfhzoFWiVeQGhRoDapf/jAhaUyFIRxr3fdcaYIcK8MIp9OKn2oqMZngEe1bKnBMdZDPb56iM6sMUZQoW8Kgufp7Isex1lkc2s4Ym7Fe9mbif14/NdF1kDMhU0MFWSyKUo5MgmYBoCFTlBieWYKJYvZWRMZYYWJsTBUbgrf88irpXDQ8t+HdX9aaN0UcZTiBU6iDB1fQhDtoQRsISHiGV3hzUufFeXc+Fq0lp5g5hj9wPn8A5+eQ7g== ·| Optimization q⇤ = argminq Qf(q( ),p( y)) 2 · ·| Variational Bayes q⇤ = argminq QKL(q( ) p( y)) 2 · || ·| Mean-field variational Bayes

q⇤ = argminq Q KL(q( ) p( y)) 2 MFVB · || ·| • Coordinate descent • Stochastic variational inference (SVI) [Hoffman et al 2013] • Automatic differentiation variational inference (ADVI) [Kucukelbir et al 2015, 2017]

14 Roadmap

• Bayes & Approximate Bayes review • What is: • Variational Bayes (VB) • Mean-field variational Bayes (MFVB) • Why use MFVB? • When can we trust MFVB? • Where do we go from here? Roadmap

• Bayes & Approximate Bayes review • What is: • Variational Bayes (VB) • Mean-field variational Bayes (MFVB) • Why use MFVB? • When can we trust MFVB? • Where do we go from here? References (1/6)

R Bardenet, A Doucet, and C Holmes. "On Markov chain Monte Carlo methods for tall data." The Journal of Machine Learning Research 18.1 (2017): 1515-1557. AG Baydin, BA Pearlmutter, AA Radul, and JM Siskind. "Automatic differentiation in machine learning: a survey." ArXiv:1502.05767v4 (2018). DM Blei, A Kucukelbir, and JD McAuliffe. "Variational inference: A review for statisticians." Journal of the American Statistical Association 112.518 (2017): 859-877. T Broderick, N Boyd, A Wibisono, AC Wilson, and MI Jordan. Streaming variational Bayes. NIPS 2013. CM Bishop. Pattern Recognition and Machine Learning. Springer-Verlag New York, 2006. T Campbell and T Broderick. Automated scalable Bayesian inference via Hilbert coresets. Under review. ArXiv:1710.05053. T Campbell and T Broderick. Bayesian Coreset Construction via Greedy Iterative Geodesic Ascent. ICML 2018, to appear. RJ Giordano, T Broderick, and MI Jordan. "Linear response methods for accurate covariance estimates from mean field variational Bayes." NIPS 2015. R Giordano, T Broderick, R Meager, J Huggins, and MI Jordan. "Fast robustness quantification with variational Bayes." ICML 2016 Workshop on #Data4Good: Machine Learning in Social Good Applications, 2016. R Giordano, T Broderick, and MI Jordan. Covariances, robustness, and variational Bayes, 2017. Under review. ArXiv:1709.02536. References (2/6)

J Gorham and L Mackey. "Measuring sample quality with Stein's method." NIPS 2015. J Gorham, and L Mackey. "Measuring sample quality with kernels." ArXiv:1703.01717 (2017). PD Hoff. A first course in Bayesian statistical methods. Springer Science & Business Media, 2009. MD Hoffman, DM Blei, C Wang, and J Paisley. "Stochastic variational inference." The Journal of Machine Learning Research 14.1 (2013): 1303-1347. JH Huggins, T Campbell, and T Broderick. Coresets for scalable Bayesian logistic regression. NIPS 2016. JH Huggins, RP Adams, and T Broderick. PASS-GLM: Polynomial approximate sufficient for scalable Bayesian GLM inference. NIPS 2017. JH Huggins, M Kasprzak, T Campbell, and T Broderick. Bayesian posterior mean and uncertainty estimates: a non-asymptotic approach. Forthcoming. A Kucukelbir, R Ranganath, A Gelman, and D Blei. "Automatic variational inference in Stan." NIPS 2015. A Kucukelbir, D Tran, R Ranganath, A Gelman, and DM Blei. "Automatic differentiation variational inference." The Journal of Machine Learning Research 18.1 (2017): 430-474. DJC MacKay. Information Theory, Inference, and Learning Algorithms. Cambridge University Press, 2003. Stan (open source software). http://mc-stan.org/ Accessed: 2018. References (3/6)

S Talts, M Betancourt, D Simpson, A Vehtari, and A Gelman. "Validating Bayesian Inference Algorithms with Simulation-Based Calibration." aArXiv:1804.06788 (2018). RE Turner and M Sahani. Two problems with variational expectation maximisation for time- series models. In D Barber, AT Cemgil, and S Chiappa, editors, Bayesian Time Series Models, 2011. Y Yao, A Vehtari, D Simpson, and A Gelman. "Yes, but Did It Work?: Evaluating Variational Inference." ArXiv:1802.02538 (2018). Application References (4/6)

Abbott, Benjamin P., et al. "Observation of gravitational waves from a binary black hole merger." Physical Review Letters 116.6 (2016): 061102. Abbott, Benjamin P., et al. "The rate of binary black hole mergers inferred from advanced LIGO observations surrounding GW150914." The Astrophysical Journal Letters 833.1 (2016): L1. Airoldi, Edoardo M., David M. Blei, Stephen E. Fienberg, and Eric P. Xing. "Mixed membership stochastic blockmodels." Journal of Machine Learning Research 9.Sep (2008): 1981-2014. Blei, David M., Andrew Y. Ng, and Michael I. Jordan. "Latent Dirichlet allocation." Journal of Machine Learning Research 3.Jan (2003): 993-1022. Chati, Yashovardhan Sushil, and Hamsa Balakrishnan. "A Gaussian process regression approach to model aircraft engine fuel flow rate." Cyber-Physical Systems (ICCPS), 2017 ACM/ IEEE 8th International Conference on. IEEE, 2017. Gershman, Samuel J., David M. Blei, Kenneth A. Norman, and Per B. Sederberg. "Decomposing spatiotemporal brain patterns into topographic latent sources." NeuroImage 98 (2014): 91-102. Gillon, Michaël, et al. "Seven temperate terrestrial planets around the nearby ultracool dwarf star TRAPPIST-1." Nature 542.7642 (2017): 456. Grimm, Simon L., et al. "The nature of the TRAPPIST-1 exoplanets." Astronomy & Astrophysics 613 (2018): A68. Application References (5/6) Grogan Jr, William L., and Willis W. Wirth. "A new American genus of predaceous midges related to Palpomyia and Bezzia (Diptera: Ceratopogonidae). Un nuevo género Americano de purrujas depredadoras relacionadas con Palpomyia y Bezzia (Diptera: Ceratopogonidae)." Proceedings of the Biological Society of Washington. 94.4 (1981): 1279-1305. Kuikka, Sakari, Jarno Vanhatalo, Henni Pulkkinen, Samu Mäntyniemi, and Jukka Corander. "Experiences in Bayesian inference in Baltic salmon management." Statistical Science 29.1 (2014): 42-49. Meager, Rachael. “Understanding the impact of microcredit expansions: A Bayesian hierarchical analysis of 7 randomized experiments.” AEJ: Applied, to appear, 2018a. Meager, Rachael. “Aggregating Distributional Treatment Effects: A Bayesian Hierarchical Analysis of the Microcredit Literature.” Working paper, 2018b. Stegle, Oliver, Leopold Parts, Richard Durbin, and John Winn. "A Bayesian framework to account for complex non-genetic factors in gene expression levels greatly increases power in eQTL studies." PLoS computational biology 6.5 (2010): e1000770. Stone, Lawrence D., Colleen M. Keller, Thomas M. Kratzke, and Johan P. Strumpfer. "Search for the wreckage of Air France Flight AF 447." Statistical Science (2014): 69-80. Woodard, Dawn, Galina Nogin, Paul Koch, David Racz, Moises Goldszmidt, and Eric Horvitz. "Predicting travel time reliability using mobile phone GPS data." Transportation Research Part C: Emerging Technologies 75 (2017): 30-44. Xing, Eric P., Wei Wu, Michael I. Jordan, and Richard M. Karp. "LOGOS: a modular Bayesian model for de novo motif detection." Journal of Bioinformatics and Computational Biology 2.01 (2004): 127-154. Additional image references (6/6)

amCharts. Visited Countries Map. https://www.amcharts.com/visited_countries/ Accessed: 2016.

Baltic Salmon Fund. https://www.en.balticsalmonfund.org/about_us Accessed: 2018.

ESO/L. Calçada/M. Kornmesser. 16 October 2017, 16:00:00. Obtained from: https:// commons.wikimedia.org/wiki/File:Artist %E2%80%99s_impression_of_merging_neutron_stars.jpg || Source: https://www.eso.org/ public/images/eso1733a/ (Creative Commons Attribution 4.0 International License)

J. Herzog. 3 June 2016, 17:17:30. Obtained from: https://commons.wikimedia.org/wiki/ File:Airbus_A350-941_F-WWCF_MSN002_ILA_Berlin_2016_17.jpg (Creative Commons Attribution 4.0 International License)

E. Xing. 2003. Slides “LOGOS: a modular Bayesian model for de novo motif detection.” Obtained from: https://www.cs.cmu.edu/~epxing/papers/Old_papers/slide_CSB03/CSB1.pdf Accessed: 2018.