Biostatistics 615/815 Lecture 13: Programming with Matrix
Total Page:16
File Type:pdf, Size:1020Kb
Introduction Power Matrix Matrix Computation Linear System Least square Summary Introduction Power Matrix Matrix Computation Linear System Least square Summary . Annoucements . Homework #3 . .. Biostatistics 615/815 Lecture 13: Homework 3 is due today • Programming with Matrix If you’re using Visual C++ and still have problems in using boost . • .. library, you can ask for another extension . .. Hyun Min Kang . Homework #4 . .. Homework 4 is out • February 17th, 2011 Floyd-Warshall algorithm • Note that some key information was not covered in the class. • Fair/biased coint HMM . • .. Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 1 / 28 Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 2 / 28 Introduction Power Matrix Matrix Computation Linear System Least square Summary Introduction Power Matrix Matrix Computation Linear System Least square Summary . Last lecture - Conditional independence in graphical models Markov Blanket '()!+" !" '()#*!+" #" '()$*#+" '()&*#+" '()%*#+" $" %" &" If conditioned on the variables in the gray area (variables with direct • dependency), A is independent of all the other nodes. Pr(A, C, D, E B) = Pr(A B) Pr(C B) Pr(D B) Pr(E B) A (U A πA) πA • | | | | | • ⊥ − − | Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 3 / 28 Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 4 / 28 Introduction Power Matrix Matrix Computation Linear System Least square Summary Introduction Power Matrix Matrix Computation Linear System Least square Summary . Hidden Markov Models Conditional dependency in forward-backward algorithms Forward : (qt, ot) ot− qt 1. • ⊥ | − Backward : o o+ q . • t+1 t+1 t+1 ()*# "# $# %# !" &# ⊥ | -$"# -%$# -&/&0"1# +,-,*+# !"# !$# !%# !" !&# !" "#$% "% "&$% !" 3!"/'"1# 3!$/'$1# 3 /' 1 3 /' 1 !% % # !& & # !" !"#$% !"% !"&$% !" .-,-# '"# '$# '%# !" '&# 2# !" '"#$% '"% '"&$% !" Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 5 / 28 Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 6 / 28 Introduction Power Matrix Matrix Computation Linear System Least square Summary Introduction Power Matrix Matrix Computation Linear System Least square Summary . Viterbi algorithm - example Today’s lecture When observations were (walk, shop, clean) • Similar to Dijkstra’s or Manhattan tourist algorithm • Calculating Power • Linear algebra 101 • Using Eigen library for linear algebra • Implementing a simple linear regression • Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 7 / 28 Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 8 / 28 Introduction Power Matrix Matrix Computation Linear System Least square Summary Introduction Power Matrix Matrix Computation Linear System Least square Summary . Calculating power More efficient computation of power . Problem . Function fastPower() . .. .. n Computing a , where a R and n N. double fastPower(double a, int n) { • ∈ ∈ if ( n == 1 ) { How many multiplications would be needed? . • return a; .. } . else { Function slowPower() . double x = fastPower(a,n/2); .. if ( n % 2 == 0 ) { double slowPower(double a, int n) { return x * x; double x = a; } for(int i=1; i < n; ++i) { else { x *= a; return x * x * a; } } return x; } .} . .} .. .. Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 9 / 28 Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 10 / 28 Introduction Power Matrix Matrix Computation Linear System Least square Summary Introduction Power Matrix Matrix Computation Linear System Least square Summary . Computational time Summary - fastPower() . main() .. int main(int argc, char** argv) { double a = 1.0000001; int n = 1000000000; clock_t t1 = clock(); double x = slowPower(a,n); Θ(log n) complexity compared to Θ(n) complexity of slowPower() • clock_t t2 = clock(); Similar to binary search vs linear search double y = fastPower(a,n); • clock_t t3 = clock(); Good example to illustrate how the efficiency of numerical • std::cout << "slowPower ans = " << x << ", sec = " computation could change by clever algorithms << (double)(t2-t1)/CLOCKS_PER_SEC << std::endl; std::cout << "fastPower ans = " << y << ", sec = " << (double)(t3-t2)/CLOCKS_PER_SEC << std::endl; .} .. Running examples . .. slowPower ans = 2.6881e+43, sec = 1.88659 .fastPower ans = 2.6881e+43, sec = 3e-06 .. Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 11. / 28 . Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 12 / 28 Introduction Power Matrix Matrix Computation Linear System Least square Summary Introduction Power Matrix Matrix Computation Linear System Least square Summary . Programming with Matrix Ways to Matrix programmming Implementing Matrix libraries on your own • Implementation can well fit to specific need . • Why Matrix matters? . Need to pay for implementation overhead .. • Many statistical models can be well represented as matrix operations Computational efficiency may not be excellent for large matrices • • Linear regression Using BLAS/LAPACK library • • Logistic regression Low-level Fortran/C API • • Mixed models ATLAS implementation for gcc, MKL library for intel compiler (with • • Efficient matrix computation can make difference in the practicality of multithread support) • Used in many statistical packages including R • a statistical method Not user-friendly interface use. • Understanding C++ implementation of matrix operation can expedite boost supports C++ interface for BLAS • • the efficiency by orders of magnitude . Using a third-party library, Eigen package . • .. A convenient C++ interface • Reasonably fast performance • Supports most functions BLAS/LAPACK provides • Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 13 / 28 Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 14 / 28 Introduction Power Matrix Matrix Computation Linear System Least square Summary Introduction Power Matrix Matrix Computation Linear System Least square Summary . Using a third party library Example usages of Eigen library #include <iostream> #include <Eigen/Dense> // For non-sparse matrix . using namespace Eigen; // avoid using Eigen:: Downloading and installing Eigen package . int main() .. { Download at http://eigen.tuxfamily.org/index.php?title=3.0 beta Matrix2d a; // 2x2 matrix type is defined for convenience • a << 1, 2, To install - just uncompress it 3, 4; . • MatrixXd b(2,2); // but you can define the type from arbitrary-size matrix .. b << 2, 3, . 1, 4; Using Eigen package . .. std::cout << "a + b =\n" << a + b << std::endl; // matrix addition Add -I DOWNLOADED PATH/eigen option when compile std::cout << "a - b =\n" << a - b << std::endl; // matrix subtraction • std::cout << "Doing a += b;" << std::endl; No need to install separate library. Including header files is sufficient a += b; . • std::cout << "Now a =\n" << a << std::endl; . .. Vector3d v(1,2,3); // vector operations Vector3d w(1,0,0); std::cout << "-v + w - v =\n" << -v + w - v << std::endl; } Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 15 / 28 Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 16 / 28 Introduction Power Matrix Matrix Computation Linear System Least square Summary Introduction Power Matrix Matrix Computation Linear System Least square Summary . More examples Time complexity of matrix computation . #include <iostream> Square matrix multiplication / inversion . #include <Eigen/Dense> .. Naive algorithm : O(n3) using namespace Eigen; • 2.807 int main() Strassen algorithm : O(n ) { • Coppersmith-Winograd algorithm : O(n2.376) (with very large Matrix2d mat; // 2*2 matrix • mat << 1, 2, constant factor) 3, 4; . Vector2d u(-1,1), v(2,0); // 2D vector .. std::cout << "Here is mat*mat:\n" << mat*mat << std::endl; . Determinant . std::cout << "Here is mat*u:\n" << mat*u << std::endl; .. std::cout << "Here is uˆT*mat:\n" << u.transpose()*mat << std::endl; Laplace expansion : O(n!) std::cout << "Here is uˆT*v:\n" << u.transpose()*v << std::endl; • std::cout << "Here is u*vˆT:\n" << u*v.transpose() << std::endl; LU decomposition : O(n3) std::cout << "Let's multiply mat by itself" << std::endl; • 3 mat = mat*mat; Bareiss algorithm : O(n ) • std::cout << "Now mat is mat:\n" << mat << std::endl; 2.376 } Fast matrix multiplication algorithm : O(n ) . • .. Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 17 / 28 Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 18 / 28 Introduction Power Matrix Matrix Computation Linear System Least square Summary Introduction Power Matrix Matrix Computation Linear System Least square Summary . Computational considerations in matrix operations Quadratic multiplication . Same time complexity, but one is slightly more efficient . .. Avoiding expensive computation . Computing x Ay. .. • ′ Computation of u′ABv O(n2) + O(n) if ordered as (x A)y. • • ′ If the order is (((u′(AB))v) • Can be simplified as i j xiAijyj O(n3) + O(n2) + O(n) operations . • • . O(n2) overall .. ∑ ∑ . • . If the order is (((u A)B)v) A symmetric case . • ′ .. Two O(n2) operations and one O(n) operation • Computing x′Ax where A = LL′ O(n2) overall • . • u = L′x can be computed more efficiently than Ax. .. • x Ax = u u . • ′ ′ .. Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 19 / 28 Hyun Min Kang Biostatistics 615/815 - Lecture 13 February 17th, 2011 20 / 28 Introduction Power Matrix Matrix Computation Linear System Least square Summary Introduction Power Matrix Matrix Computation