Matrix Differentiation
CS5240 Theoretical Foundations in Multimedia
Leow Wee Kheng
Department of Computer Science, School of Computing
National University of Singapore

Linear Fitting Revisited

Linear fitting solves this problem: Given $n$ data points $\mathbf{p}_i = (x_{i1}, \ldots, x_{im})$, $1 \le i \le n$, and their corresponding values $v_i$, find a linear function $f$ that minimizes the error
\[ E = \sum_{i=1}^{n} \bigl(f(\mathbf{p}_i) - v_i\bigr)^2. \tag{1} \]
The linear function $f(\mathbf{p}_i)$ has the form
\[ f(\mathbf{p}) = f(x_1, \ldots, x_m) = a_1 x_1 + \cdots + a_m x_m + a_{m+1}. \tag{2} \]

The data points are organized into a matrix equation
\[ \mathbf{D}\mathbf{a} = \mathbf{v}, \tag{3} \]
where
\[ \mathbf{D} = \begin{bmatrix} x_{11} & \cdots & x_{1m} & 1 \\ \vdots & \ddots & \vdots & \vdots \\ x_{n1} & \cdots & x_{nm} & 1 \end{bmatrix}, \quad
\mathbf{a} = \begin{bmatrix} a_1 \\ \vdots \\ a_m \\ a_{m+1} \end{bmatrix}, \quad
\mathbf{v} = \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix}. \tag{4} \]
The solution of Eq. 3 is
\[ \mathbf{a} = (\mathbf{D}^\top \mathbf{D})^{-1} \mathbf{D}^\top \mathbf{v}. \tag{5} \]

Denote each row of $\mathbf{D}$ as $\mathbf{d}_i^\top$. Then
\[ E = \sum_{i=1}^{n} (\mathbf{d}_i^\top \mathbf{a} - v_i)^2 = \|\mathbf{D}\mathbf{a} - \mathbf{v}\|^2. \tag{6} \]
So the linear least-squares problem can be described very compactly as
\[ \min_{\mathbf{a}} \|\mathbf{D}\mathbf{a} - \mathbf{v}\|^2. \tag{7} \]
To show that the solution in Eq. 5 minimizes the error $E$, we need to differentiate $E$ with respect to $\mathbf{a}$ and set the derivative to zero:
\[ \frac{dE}{d\mathbf{a}} = \mathbf{0}. \tag{8} \]
How do we do this differentiation?

The obvious (but hard) way:
\[ E = \sum_{i=1}^{n} \Bigl( \sum_{j=1}^{m} a_j x_{ij} + a_{m+1} - v_i \Bigr)^2. \tag{9} \]
Expanding the equation explicitly gives
\[ \frac{\partial E}{\partial a_k} =
\begin{cases}
\displaystyle 2 \sum_{i=1}^{n} \Bigl( \sum_{j=1}^{m} a_j x_{ij} + a_{m+1} - v_i \Bigr) x_{ik}, & k \ne m+1, \\[2ex]
\displaystyle 2 \sum_{i=1}^{n} \Bigl( \sum_{j=1}^{m} a_j x_{ij} + a_{m+1} - v_i \Bigr), & k = m+1.
\end{cases} \]
Then set $\partial E / \partial a_k = 0$ and solve for $a_k$. This is slow, tedious and error-prone!

(Cartoon slides: "Which one do you like to be?" and "At least like these?")
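The closed-form solution of Eq. 5 can be sanity-checked numerically. Below is a minimal NumPy sketch; the data sizes, coefficients and noise level are made-up illustrations, not values from the slides:

```python
import numpy as np

# Hypothetical data: n = 50 points in m = 3 dimensions.
rng = np.random.default_rng(0)
n, m = 50, 3
X = rng.normal(size=(n, m))
a_true = np.array([2.0, -1.0, 0.5, 3.0])           # a_1, ..., a_m, a_{m+1}
D = np.hstack([X, np.ones((n, 1))])                # Eq. 4: append a column of ones
v = D @ a_true + rng.normal(scale=0.01, size=n)    # noisy observed values v_i

# Eq. 5: a = (D^T D)^{-1} D^T v  (the normal-equations solution)
a = np.linalg.inv(D.T @ D) @ D.T @ v

# Cross-check against NumPy's least-squares solver for Eq. 7
a_lstsq, *_ = np.linalg.lstsq(D, v, rcond=None)
print(np.allclose(a, a_lstsq))   # True
```

Forming the explicit inverse mirrors Eq. 5 for exposition; in numerical practice `np.linalg.lstsq` (or `np.linalg.solve` on the normal equations) is the preferred route.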
Matrix Derivatives

There are 6 common types of matrix derivatives:

Type                   Scalar $y$                        Vector $\mathbf{y}$                       Matrix $\mathbf{Y}$
Scalar $x$             $\partial y/\partial x$           $\partial\mathbf{y}/\partial x$           $\partial\mathbf{Y}/\partial x$
Vector $\mathbf{x}$    $\partial y/\partial\mathbf{x}$   $\partial\mathbf{y}/\partial\mathbf{x}$
Matrix $\mathbf{X}$    $\partial y/\partial\mathbf{X}$

Derivatives by Scalar

In both notations, $\partial y/\partial x$ is the usual scalar derivative. For a vector $\mathbf{y} = (y_1, \ldots, y_m)^\top$:

Numerator layout notation:
\[ \frac{\partial\mathbf{y}}{\partial x} = \begin{bmatrix} \dfrac{\partial y_1}{\partial x} \\ \vdots \\ \dfrac{\partial y_m}{\partial x} \end{bmatrix} \]
Denominator layout notation:
\[ \frac{\partial\mathbf{y}}{\partial x} = \begin{bmatrix} \dfrac{\partial y_1}{\partial x} & \cdots & \dfrac{\partial y_m}{\partial x} \end{bmatrix} \]
In numerator layout, for a matrix $\mathbf{Y}$:
\[ \frac{\partial\mathbf{Y}}{\partial x} = \begin{bmatrix} \dfrac{\partial y_{11}}{\partial x} & \cdots & \dfrac{\partial y_{1n}}{\partial x} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial y_{m1}}{\partial x} & \cdots & \dfrac{\partial y_{mn}}{\partial x} \end{bmatrix} \]

Derivatives by Vector

Numerator layout notation:
\[ \frac{\partial y}{\partial\mathbf{x}} = \begin{bmatrix} \dfrac{\partial y}{\partial x_1} & \cdots & \dfrac{\partial y}{\partial x_n} \end{bmatrix}, \qquad
\frac{\partial\mathbf{y}}{\partial\mathbf{x}} = \begin{bmatrix} \dfrac{\partial y_1}{\partial x_1} & \cdots & \dfrac{\partial y_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial y_m}{\partial x_1} & \cdots & \dfrac{\partial y_m}{\partial x_n} \end{bmatrix} \]
Denominator layout notation:
\[ \frac{\partial y}{\partial\mathbf{x}} = \begin{bmatrix} \dfrac{\partial y}{\partial x_1} \\ \vdots \\ \dfrac{\partial y}{\partial x_n} \end{bmatrix}, \qquad
\frac{\partial\mathbf{y}}{\partial\mathbf{x}} = \begin{bmatrix} \dfrac{\partial y_1}{\partial x_1} & \cdots & \dfrac{\partial y_m}{\partial x_1} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial y_1}{\partial x_n} & \cdots & \dfrac{\partial y_m}{\partial x_n} \end{bmatrix} \]

Derivative by Matrix

Numerator layout notation:
\[ \frac{\partial y}{\partial\mathbf{X}} = \begin{bmatrix} \dfrac{\partial y}{\partial x_{11}} & \cdots & \dfrac{\partial y}{\partial x_{m1}} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial y}{\partial x_{1n}} & \cdots & \dfrac{\partial y}{\partial x_{mn}} \end{bmatrix} \]
Denominator layout notation:
\[ \frac{\partial y}{\partial\mathbf{X}} = \begin{bmatrix} \dfrac{\partial y}{\partial x_{11}} & \cdots & \dfrac{\partial y}{\partial x_{1n}} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial y}{\partial x_{m1}} & \cdots & \dfrac{\partial y}{\partial x_{mn}} \end{bmatrix} \]

Pictorial Representation

(Figure: pictorial comparison of numerator and denominator layouts.)

Caution

Most books and papers don't state which convention they use. Reference 2 uses both conventions but clearly differentiates them. It is best not to mix the two conventions in your equations. We adopt the numerator layout notation.

Commonly Used Derivatives

Here, the scalar $a$, vector $\mathbf{a}$ and matrix $\mathbf{A}$ are not functions of $x$ and $\mathbf{x}$.
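The adopted numerator layout can be illustrated with a finite-difference Jacobian. This is a sketch with made-up data; the helper name `jacobian_numerator` is my own, not from the slides:

```python
import numpy as np

def jacobian_numerator(f, x, eps=1e-6):
    """Finite-difference Jacobian in numerator layout:
    entry (i, j) is dy_i / dx_j, so the result is m x n."""
    y0 = f(x)
    m, n = y0.size, x.size
    J = np.zeros((m, n))
    for j in range(n):
        dx = np.zeros(n)
        dx[j] = eps
        J[:, j] = (f(x + dx) - y0) / eps
    return J

# y = A x maps R^3 to R^2; numerator layout gives a 2 x 3 Jacobian
# (here equal to A itself); denominator layout would give its 3 x 2 transpose.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
f = lambda x: A @ x
J = jacobian_numerator(f, np.array([1.0, -1.0, 2.0]))
print(J.shape)             # (2, 3)
print(np.allclose(J, A))   # True
```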
(C1) $\dfrac{d\mathbf{a}}{dx} = \mathbf{0}$ (column matrix)

(C2) $\dfrac{da}{d\mathbf{x}} = \mathbf{0}$ (row matrix)

(C3) $\dfrac{da}{d\mathbf{X}} = \mathbf{0}$ (matrix)

(C4) $\dfrac{d\mathbf{a}}{d\mathbf{x}} = \mathbf{0}$ (matrix)

(C5) $\dfrac{d\mathbf{x}}{d\mathbf{x}} = \mathbf{I}$

(C6) $\dfrac{d(\mathbf{a}^\top\mathbf{x})}{d\mathbf{x}} = \dfrac{d(\mathbf{x}^\top\mathbf{a})}{d\mathbf{x}} = \mathbf{a}^\top$

(C7) $\dfrac{d(\mathbf{x}^\top\mathbf{x})}{d\mathbf{x}} = 2\mathbf{x}^\top$

(C8) $\dfrac{d(\mathbf{x}^\top\mathbf{a})^2}{d\mathbf{x}} = 2\,\mathbf{x}^\top\mathbf{a}\,\mathbf{a}^\top$

(C9) $\dfrac{d(\mathbf{A}\mathbf{x})}{d\mathbf{x}} = \mathbf{A}$

(C10) $\dfrac{d(\mathbf{x}^\top\mathbf{A})}{d\mathbf{x}} = \mathbf{A}^\top$

(C11) $\dfrac{d(\mathbf{x}^\top\mathbf{A}\mathbf{x})}{d\mathbf{x}} = \mathbf{x}^\top(\mathbf{A} + \mathbf{A}^\top)$

Derivatives of Scalar by Scalar

(SS1) $\dfrac{\partial(u+v)}{\partial x} = \dfrac{\partial u}{\partial x} + \dfrac{\partial v}{\partial x}$

(SS2) $\dfrac{\partial(uv)}{\partial x} = u\dfrac{\partial v}{\partial x} + v\dfrac{\partial u}{\partial x}$ (product rule)

(SS3) $\dfrac{\partial g(u)}{\partial x} = \dfrac{\partial g(u)}{\partial u}\,\dfrac{\partial u}{\partial x}$ (chain rule)

(SS4) $\dfrac{\partial f(g(u))}{\partial x} = \dfrac{\partial f(g)}{\partial g}\,\dfrac{\partial g(u)}{\partial u}\,\dfrac{\partial u}{\partial x}$ (chain rule)

Derivatives of Vector by Scalar

(VS1) $\dfrac{\partial(a\mathbf{u})}{\partial x} = a\,\dfrac{\partial\mathbf{u}}{\partial x}$, where $a$ is not a function of $x$.

(VS2) $\dfrac{\partial(\mathbf{A}\mathbf{u})}{\partial x} = \mathbf{A}\,\dfrac{\partial\mathbf{u}}{\partial x}$, where $\mathbf{A}$ is not a function of $x$.

(VS3) $\dfrac{\partial\mathbf{u}}{\partial x}$ …
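Identities such as (C7) and (C11) can be sanity-checked with central differences. A minimal sketch in the numerator layout; the helper name `grad_row` and the random test data are my own assumptions:

```python
import numpy as np

def grad_row(f, x, eps=1e-6):
    """Central-difference derivative of scalar f by vector x,
    in numerator layout: a 1 x n row vector."""
    g = np.zeros((1, x.size))
    for j in range(x.size):
        dx = np.zeros(x.size)
        dx[j] = eps
        g[0, j] = (f(x + dx) - f(x - dx)) / (2 * eps)
    return g

rng = np.random.default_rng(1)
x = rng.normal(size=4)
A = rng.normal(size=(4, 4))

# (C7): d(x^T x)/dx = 2 x^T
assert np.allclose(grad_row(lambda z: z @ z, x), 2 * x[None, :], atol=1e-5)

# (C11): d(x^T A x)/dx = x^T (A + A^T)
assert np.allclose(grad_row(lambda z: z @ A @ z, x),
                   (x @ (A + A.T))[None, :], atol=1e-5)
```

Note that when $\mathbf{A}$ is symmetric, (C11) reduces to $2\mathbf{x}^\top\mathbf{A}$, which is the form used when differentiating the least-squares error $E$.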