# The Elements of Statistical Learning (統計的学習の基礎), Chapter 4, First Half

The Python code used in the slides is available here:
https://github.com/matsuken92/Qiita_Contents/blob/master/Castella-book/Castella-chapter-4-first_half.ipynb

### The Elements of Statistical Learning (統計的学習の基礎), Chapter 4, First Half

1. Twitter
2. http://www.slideshare.net/matsukenbook
3. MASAKARI Come On! щ( щ) https://twitter.com/_inundata/status/616658949761302528
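4. Notation: input $X$, quantitative output $Y$, qualitative (class) output $G$; $N$ observations collected in the $N \times p$ matrix $\mathbf{X}$; $K$ classes; $p$ input variables.
5. The $N \times p$ input matrix $\mathbf{X}$ has rows $x_i$, with responses $y_i$ and class labels $g_i$ ($K$ classes, $p$ input variables, $N$ observations).
6. $\mathbf{X}$ stacks the observations $x_i$ (each of length $p$) as rows and the variables $\mathbf{x}_j$ (each of length $N$) as columns, while $X$, $Y$, $G$ denote the variables themselves:
    $$\mathbf{X} = \begin{pmatrix} x_1^T \\ \vdots \\ x_i^T \\ \vdots \\ x_N^T \end{pmatrix} \quad (N \times p)$$
7. The classifier $G(x)$ takes values in the discrete set $\mathcal{G} = \{\text{class}_1, \text{class}_2, \dots, \text{class}_K\}$.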
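8. For each class $k$, fit a linear model $\hat f_k(x) = \hat\beta_{k0} + \hat\beta_k^T x$. The decision boundary between classes $k$ and $\ell$ is the set where $\hat f_k(x) = \hat f_\ell(x)$:
    $$\{x : (\hat\beta_{k0} - \hat\beta_{\ell 0}) + (\hat\beta_k - \hat\beta_\ell)^T x = 0\}$$
    Note: since $x$ is $p$-dimensional, this set is a hyperplane of dimension $p - 1$.
9. For each class $k$, define a discriminant function $\delta_k(x)$ and assign $x$ to the class with the largest value; the posterior probability $\Pr(G = k \mid X = x)$ can play the same role. If $\delta_k(x)$ or $\Pr(G = k \mid X = x)$ is linear in $x$, the decision boundaries are linear.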
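10. Two-class logistic model:
    $$\Pr(G = 1 \mid X = x) = \frac{\exp(\beta_0 + \beta^T x)}{1 + \exp(\beta_0 + \beta^T x)}, \qquad \Pr(G = 2 \mid X = x) = \frac{1}{1 + \exp(\beta_0 + \beta^T x)}$$
    With the logit transformation $\operatorname{logit}(p) = \log\left(\frac{p}{1 - p}\right)$, the log-odds are linear in $x$:
    $$\operatorname{logit}(\Pr(G = 1 \mid X = x)) = \log\frac{\Pr(G = 1 \mid X = x)}{\Pr(G = 2 \mid X = x)} = \beta_0 + \beta^T x$$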
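11. Basis expansion: from the variables $X_1, \dots, X_p$, for example $D = (X_1, X_2)$, build the augmented set $D' = (X_1, X_2, X_1^2, X_2^2, X_1 X_2)$, which adds $p(p+1)/2$ variables. More generally, use a transformation $h(X) : \mathbb{R}^p \mapsto \mathbb{R}^q$ with $q > p$. Linear boundaries in the expanded space correspond to quadratic boundaries in the original space.
12. Applying `sklearn.discriminant_analysis.LinearDiscriminantAnalysis` to the expanded inputs $D' = (X_1, X_2, X_1^2, X_2^2, X_1 X_2)$.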
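13. Linear regression of an indicator matrix: if $G$ has $K$ classes, code class $k$ as the indicator vector
    $$Y = (Y_1, Y_2, \dots, Y_k, \dots, Y_K) = (0, 0, \dots, 1, \dots, 0),$$
    with the 1 in position $k$. Stacking the $N$ training observations gives the $N \times K$ indicator response matrix $\mathbf{Y}$.
14. Least-squares fit with the indicator responses:
    $$\hat{\mathbf{Y}} = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{Y} = \mathbf{X}\hat{\mathbf{B}}, \qquad \hat{\mathbf{B}} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{Y}$$
    Dimensions: $\hat{\mathbf{Y}}$ is $N \times K$, $\mathbf{X}$ is $N \times (p+1)$, $(\mathbf{X}^T\mathbf{X})^{-1}$ is $(p+1) \times (p+1)$, $\mathbf{X}^T$ is $(p+1) \times N$, $\mathbf{Y}$ is $N \times K$, and $\hat{\mathbf{B}}$ is $(p+1) \times K$. $\mathbf{X}$ has $p+1$ columns because a leading column of 1s is added for the intercept.
15. For a new input $x$, compute the fitted $K$-vector $\hat f(x)^T = (1, x^T)\hat{\mathbf{B}}$ and classify to the largest component:
    $$\hat G(x) = \operatorname*{argmax}_{k \in \mathcal{G}} \hat f_k(x)$$
16. Summary of the procedure: $\hat{\mathbf{Y}} = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{Y}$, then $\hat G(x) = \operatorname*{argmax}_{k \in \mathcal{G}} \hat f_k(x)$.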
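17. Rationale: $E(Y_k \mid X = x) = \Pr(G = k \mid X = x)$, so $\hat f_k(x)$ is read as an estimate of $\Pr(G = k \mid X = x)$. Example of a fitted vector: `[ 0.19942274, -0.01553064, 0.8161079 ]`. The components satisfy $\sum_{k \in \mathcal{G}} \hat f_k(x) = 1$, but $0 \le \hat f_k(x) \le 1$ is not guaranteed (the second component here is negative).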
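18. An equivalent view: fit by least squares,
    $$\min_{\mathbf{B}} \sum_{i=1}^{N} \left\| y_i - \left[(1, x_i^T)\mathbf{B}\right]^T \right\|^2,$$
    where $y_i$ is $K \times 1$, $(1, x_i^T)$ is $1 \times (p+1)$ and $\mathbf{B}$ is $(p+1) \times K$. With targets $t_k$ (the $k$-th column of the $K \times K$ identity matrix), classify a new $x$ to the target closest to the fitted vector $\hat f(x)$:
    $$\hat G(x) = \operatorname*{argmin}_{k} \left\| \hat f(x) - t_k \right\|^2$$
19. The closest-target rule $\hat G(x) = \operatorname*{argmin}_{k} \| \hat f(x) - t_k \|^2$ gives the same classification as the largest-component rule $\hat G(x) = \operatorname*{argmax}_{k \in \mathcal{G}} \hat f_k(x)$, both based on the fit $\min_{\mathbf{B}} \sum_{i=1}^{N} \| y_i - [(1, x_i^T)\mathbf{B}]^T \|^2$.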
20. The case where the number of classes is $K \ge 3$.
21. `[ 0.3549 0.5517 0.712 0.8062 0.8631 0.9095 0.9458 0.9739 0.9916 1.0000 ]`
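22. Linear discriminant analysis models the posterior $\Pr(G \mid X)$. Let $f_k(x)$ be the class-conditional density of $X$ in class $G = k$ and $\pi_k$ the prior probability of class $k$, with $\sum_{k=1}^{K} \pi_k = 1$. Bayes' theorem gives
    $$\Pr(G = k \mid X = x) = \frac{f_k(x)\pi_k}{\sum_{\ell=1}^{K} f_\ell(x)\pi_\ell}$$
23. Each class density is modeled as multivariate Gaussian:
    $$f_k(x) = \frac{1}{(2\pi)^{p/2} |\Sigma_k|^{1/2}} \exp\left( -\frac{1}{2} (x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k) \right)$$
    The exponent is a scalar quadratic form, $(1 \times p)(p \times p)(p \times 1)$; in general $x^T A x = \sum_{i,j=1}^{n} a_{ij} x_i x_j$. Example parameters: $\mu = (0, 0)$ and
    $$\Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix} = \begin{pmatrix} 1 & 0.6 \\ 0.6 & 1 \end{pmatrix}$$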
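24. Assume a common covariance matrix, $\Sigma_k = \Sigma \;\; \forall k$. The log-ratio of the posteriors of classes $k$ and $\ell$ is then linear in $x$:
    $$\log\frac{\Pr(G = k \mid X = x)}{\Pr(G = \ell \mid X = x)} = \log\frac{f_k(x)\pi_k}{f_\ell(x)\pi_\ell} = \log\frac{f_k(x)}{f_\ell(x)} + \log\frac{\pi_k}{\pi_\ell}$$
    $$= \log\frac{\pi_k}{\pi_\ell} - \frac{1}{2}(\mu_k + \mu_\ell)^T \Sigma^{-1} (\mu_k - \mu_\ell) + x^T \Sigma^{-1} (\mu_k - \mu_\ell)$$
25. Linear discriminant functions and the resulting classifier:
    $$\delta_k(x) = x^T \Sigma^{-1} \mu_k - \frac{1}{2}\mu_k^T \Sigma^{-1} \mu_k + \log \pi_k, \qquad G(x) = \operatorname*{argmax}_{k} \delta_k(x)$$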
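26. Parameter estimates from the training data:
    $$\hat\pi_k = N_k / N, \qquad \hat\mu_k = \sum_{g_i = k} x_i / N_k, \qquad \hat\Sigma = \sum_{k=1}^{K} \sum_{g_i = k} (x_i - \hat\mu_k)(x_i - \hat\mu_k)^T / (N - K)$$
27. Two-class case. Start from
    $$\delta_1(x) = x^T \Sigma^{-1} \mu_1 - \frac{1}{2}\mu_1^T \Sigma^{-1} \mu_1 + \log \pi_1, \qquad \delta_2(x) = x^T \Sigma^{-1} \mu_2 - \frac{1}{2}\mu_2^T \Sigma^{-1} \mu_2 + \log \pi_2,$$
    and plug in the estimates $\hat\pi_k = N_k / N$, $\mu_k = \hat\mu_k$, $\Sigma^{-1} = \hat\Sigma^{-1}$, where $\hat\mu_k = \sum_{g_i = k} x_i / N_k$ and $\hat\Sigma = \sum_{k=1}^{K} \sum_{g_i = k} (x_i - \hat\mu_k)(x_i - \hat\mu_k)^T / (N - K)$. The condition $\delta_2(x) > \delta_1(x)$ becomes: classify to class 2 if
    $$x^T \hat\Sigma^{-1} (\hat\mu_2 - \hat\mu_1) > \frac{1}{2} (\hat\mu_2 + \hat\mu_1)^T \hat\Sigma^{-1} (\hat\mu_2 - \hat\mu_1) - \log\frac{N_2}{N_1}$$
28. The two-class LDA rule: classify to class 2 if
    $$x^T \hat\Sigma^{-1} (\hat\mu_2 - \hat\mu_1) > \frac{1}{2} (\hat\mu_2 + \hat\mu_1)^T \hat\Sigma^{-1} (\hat\mu_2 - \hat\mu_1) - \log\frac{N_2}{N_1}$$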
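29. Quadratic discriminant analysis: what changes for $f_k(x)$, $f_\ell(x)$ when the covariances are not assumed equal. Recall the equal-covariance log-ratio
    $$\log\frac{\Pr(G = k \mid X = x)}{\Pr(G = \ell \mid X = x)} = \log\frac{f_k(x)}{f_\ell(x)} + \log\frac{\pi_k}{\pi_\ell} = \log\frac{\pi_k}{\pi_\ell} - \frac{1}{2}(\mu_k + \mu_\ell)^T \Sigma^{-1} (\mu_k - \mu_\ell) + x^T \Sigma^{-1} (\mu_k - \mu_\ell)$$
    and the general density with class-specific $\Sigma_k$:
    $$f_k(x) = \frac{1}{(2\pi)^{p/2} |\Sigma_k|^{1/2}} \exp\left( -\frac{1}{2} (x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k) \right)$$
30. Quadratic discriminant function (QDA), compared with the linear one (LDA):
    $$\delta_k(x) = -\frac{1}{2}\log|\Sigma_k| - \frac{1}{2}(x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k) + \log \pi_k \quad \text{(QDA)}$$
    $$\delta_k(x) = x^T \Sigma^{-1} \mu_k - \frac{1}{2}\mu_k^T \Sigma^{-1} \mu_k + \log \pi_k \quad \text{(LDA)}$$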
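31. Number of parameters to estimate: LDA needs $(K - 1) \times (p + 1)$; QDA needs $(K - 1) \times \{p(p + 3)/2 + 1\}$.
32. Regularized discriminant analysis shrinks each class covariance toward the pooled covariance:
    $$\hat\Sigma_k(\alpha) = \alpha \hat\Sigma_k + (1 - \alpha)\hat\Sigma, \qquad \alpha \in [0, 1]$$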
33. R script that fits regularized discriminant analysis on the vowel data over a grid of $\alpha$ values:

        # Download the archived ascrda package from CRAN
        url <- "https://cran.r-project.org/src/contrib/Archive/ascrda/ascrda_1.15.tar.gz"
        pkgFile <- "ascrda_1.15.tar.gz"
        download.file(url = url, destfile = pkgFile)

        # Install package
        install.packages(c("rda", "sfsmisc", "e1071", "pamr"))
        install.packages(pkgs=pkgFile, type="source", repos=NULL)

        # http://www.inside-r.org/packages/cran/ascrda/docs/FitRda
        install.packages("ascrda")
        require(ascrda)

        # Load the vowel training and test sets
        df_vowel_train <- read.table("../vowel.train.csv", sep=",", header=TRUE)
        df_vowel_test <- read.table("../vowel.test.csv", sep=",", header=TRUE)
        df_vowel_train$row.names <- NULL
        df_vowel_test$row.names <- NULL

        # First column is the class label y, the remaining 10 are features
        y <- df_vowel_train$y
        y_test <- df_vowel_test$y
        X <- df_vowel_train[ ,c(F,T,T,T,T,T,T,T,T,T,T)]
        X_test <- df_vowel_test[ ,c(F,T,T,T,T,T,T,T,T,T,T)]

        # The loop visits 101 values (alpha = 0.00, 0.01, ..., 1.00)
        a <- rep(0, 101)
        res_train <- rep(0, 101)
        res_test <- rep(0, 101)
        for(i in 1:101){
          a[i] <- 0.01*(i-1)
          print(a[i])
          startTime <- proc.time()[3]
          # Pass the current alpha, not the whole vector a
          ans <- FitRda(X, y, X_test, y_test, alpha=a[i])
          endTime <- proc.time()[3]
          print(endTime-startTime)
          res_train[i] <- ans[1]
          res_test[i] <- ans[2]
        }

        df_result <- data.frame(train=res_train, test=res_test)
        write.table(df_result, file="df_result.csv", sep=",")
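34. The pooled covariance can itself be shrunk toward a scalar covariance:
    $$\hat\Sigma(\gamma) = \gamma \hat\Sigma + (1 - \gamma)\hat\sigma^2 \mathbf{I}, \qquad \gamma \in [0, 1]$$
35. Computation via the eigendecomposition $\hat\Sigma_k = \mathbf{U}_k \mathbf{D}_k \mathbf{U}_k^T$, where $\mathbf{U}_k$ is a $p \times p$ orthonormal matrix and $\mathbf{D}_k$ is diagonal with positive eigenvalues $d_{k\ell}$. Using $(AB)^{-1} = B^{-1} A^{-1}$, the ingredients of $\delta_k(x)$ become
    $$(x - \hat\mu_k)^T \hat\Sigma_k^{-1} (x - \hat\mu_k) = \left[\mathbf{U}_k^T (x - \hat\mu_k)\right]^T \mathbf{D}_k^{-1} \left[\mathbf{U}_k^T (x - \hat\mu_k)\right], \qquad \log|\hat\Sigma_k| = \sum_{\ell} \log d_{k\ell}$$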
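36. With the estimates $\hat\pi_k = N_k / N$, $\hat\mu_k = \sum_{g_i = k} x_i / N_k$ and $\hat\Sigma = \sum_{k=1}^{K} \sum_{g_i = k} (x_i - \hat\mu_k)(x_i - \hat\mu_k)^T / (N - K)$, LDA can be carried out by sphering the data, $X^* \leftarrow \mathbf{D}^{-1/2} \mathbf{U}^T X$ where $\hat\Sigma = \mathbf{U} \mathbf{D} \mathbf{U}^T$, and then classifying to the closest class centroid in the transformed space, modulo the effect of the class priors $\pi_k$. Sphering function from the slide (it uses the overall sample covariance of `X`):

        import numpy as np

        def sphere(X):
            """Sphere the rows of X (N x p): X* = D^{-1/2} U^T X."""
            S = np.cov(X.T)                         # p x p sample covariance
            d, U = np.linalg.eigh(S)                # S = U diag(d) U^T (S is symmetric)
            D_rt_inv = np.diag(1.0 / np.sqrt(d))    # D^{-1/2}
            return (D_rt_inv @ U.T @ X.T).T         # back to N x p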
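37. Reduced-rank LDA: the $K$ class centroids in the $p$-dimensional input space lie in an affine subspace of dimension at most $K - 1$. When $p$ is much larger than $K$, this is a considerable reduction in dimension. Call this subspace $H_{K-1}$.
38. If $K > 3$, even $K - 1$ dimensions may be too many to visualize, so one looks for an $L < K - 1$ dimensional subspace $H_L \subseteq H_{K-1}$ that is optimal for LDA.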
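39. Fisher's formulation: find the linear combination $Z = a^T X$ that maximizes the between-class variance relative to the within-class variance:
    $$\max_a \frac{a^T \mathbf{B} a}{a^T \mathbf{W} a}$$
40. Numerator of $\max_a \frac{a^T \mathbf{B} a}{a^T \mathbf{W} a}$ in the two-class case. With projected centroids $m_1 = a^T \mu_1$ and $m_2 = a^T \mu_2$,
    $$(m_1 - m_2)^2 = (a^T(\mu_1 - \mu_2))^2 = (a^T(\mu_1 - \mu_2))(a^T(\mu_1 - \mu_2))^T = a^T (\mu_1 - \mu_2)(\mu_1 - \mu_2)^T a,$$
    a product of factors of size $(1 \times p)(p \times 1)(1 \times p)(p \times 1)$. This equals $a^T \mathbf{B} a$ up to a constant factor (written $\times N$ on the slide).
41. Denominator of $\max_a \frac{a^T \mathbf{B} a}{a^T \mathbf{W} a}$. Project the points $x_1, x_2, x_3, \dots$ onto $a$ to get $y_i = a^T x_i$, with projected centroid $m_k = a^T \mu_k$. The within-class variation of the projections is
    $$\sum_{k=1}^{K} \sum_{g_i = k} (y_i - m_k)^2 = \sum_{k=1}^{K} \sum_{g_i = k} (a^T(x_i - \mu_k))^2 = \sum_{k=1}^{K} \sum_{g_i = k} (a^T(x_i - \mu_k))(a^T(x_i - \mu_k))^T = \sum_{k=1}^{K} a^T \left( \sum_{g_i = k} (x_i - \mu_k)(x_i - \mu_k)^T \right) a,$$
    which equals $a^T \mathbf{W} a$ up to a constant factor (written $\times N$ on the slide).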
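42. The ratio $\max_a \frac{a^T \mathbf{B} a}{a^T \mathbf{W} a}$ is invariant to the scale of $a$, so solve instead
    $$\max_a a^T \mathbf{B} a \quad \text{subject to} \quad a^T \mathbf{W} a = 1$$
    by the method of Lagrange multipliers:
    $$\frac{\partial}{\partial a}\left[a^T \mathbf{B} a - \lambda (a^T \mathbf{W} a - 1)\right] = 0, \qquad \frac{\partial}{\partial \lambda}\left[a^T \mathbf{B} a - \lambda (a^T \mathbf{W} a - 1)\right] = 0$$
    Using $\frac{\partial}{\partial x} x^T A x = 2Ax$ for symmetric $A$, the first condition gives $\mathbf{B} a = \lambda \mathbf{W} a$. If $\mathbf{W}$ is nonsingular, $\mathbf{W}^{-1} \mathbf{B} a = \lambda a$.
43. $\mathbf{W}^{-1} \mathbf{B} a = \lambda a$ is an eigenvalue problem: the optimal $a$ is an eigenvector of $\mathbf{W}^{-1} \mathbf{B}$. Because of $\mathbf{W}^{-1}$, $a$ generally does not point along $\mu_2 - \mu_1$; it accounts for the within-class covariance $\mathbf{W}$ when separating the projected centroids $m_1$ and $m_2$.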
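44. Computing the discriminant coordinates. Compute the $K \times p$ matrix of class centroids $\mathbf{M} = \{\mu_1, \dots, \mu_K\}^T$ and the $p \times p$ within-class covariance $\mathbf{W}$. Sphere the centroids with respect to $\mathbf{W}$, $\mathbf{M}^* = \mathbf{M}\mathbf{W}^{-1/2}$, using the eigendecomposition $\mathbf{W} = \mathbf{U}\mathbf{D}_W\mathbf{U}^{-1}$, so that $\mathbf{W}^{-1/2} = \mathbf{U}\mathbf{D}_W^{-1/2}\mathbf{U}^{-1}$. Compute $\mathbf{B}^*$, the covariance matrix of $\mathbf{M}^*$ (the between-class covariance $\mathbf{B}$ in sphered coordinates), and its eigendecomposition $\mathbf{B}^* = \mathbf{V}^* \mathbf{D}_B \mathbf{V}^{*T}$. The columns $v_\ell^*$ of $\mathbf{V}^*$ give the $\ell$-th discriminant variable $Z_\ell = v_\ell^T X$ with $v_\ell = \mathbf{W}^{-1/2} v_\ell^*$.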
45. Computed discriminant directions: $v_1 = [-1.23290493, 1.00301738]$ and $v_2 = [-0.30483077, -0.78126293]$, giving the coordinates $Z_1$ and $Z_2$.
46. $\log \pi_k$, $\pi_k$
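
The two-class logistic model of slide 10 can be checked numerically. The following is a minimal sketch, not code from the slides: the coefficient values `beta0` and `beta` and the input `x` are hypothetical, chosen only to show that the logit of the class-1 posterior recovers the linear predictor.

```python
import numpy as np

def two_class_posteriors(x, beta0, beta):
    """Return (Pr(G=1|X=x), Pr(G=2|X=x)) under the two-class logistic model."""
    eta = beta0 + np.dot(beta, x)        # linear predictor beta0 + beta^T x
    p1 = 1.0 / (1.0 + np.exp(-eta))      # same value as exp(eta) / (1 + exp(eta))
    return p1, 1.0 - p1

def logit(p):
    """Log-odds log(p / (1 - p))."""
    return np.log(p / (1.0 - p))

# Hypothetical coefficients and input, for illustration only.
beta0, beta = -0.5, np.array([1.0, -2.0])
x = np.array([0.3, 0.1])

p1, p2 = two_class_posteriors(x, beta0, beta)
print(p1, p2, logit(p1))   # logit(p1) matches beta0 + beta^T x up to rounding
```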
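
Slides 13 to 17 describe regression on an indicator matrix. A rough NumPy sketch of that procedure follows, assuming integer class labels `0..K-1` and a synthetic three-class data set; the function names `fit_indicator_regression` and `predict` and the toy data are illustrative and not taken from the slides.

```python
import numpy as np

def fit_indicator_regression(X, g, K):
    """Least-squares fit of the N x K indicator matrix Y on (1, X)."""
    N = X.shape[0]
    Y = np.zeros((N, K))
    Y[np.arange(N), g] = 1.0                       # row i has a 1 in column g_i
    X1 = np.hstack([np.ones((N, 1)), X])           # add the intercept column
    B_hat, *_ = np.linalg.lstsq(X1, Y, rcond=None) # (p+1) x K coefficient matrix
    return B_hat

def predict(B_hat, X):
    """Return fitted vectors f_hat(x) and the argmax class for each row of X."""
    X1 = np.hstack([np.ones((X.shape[0], 1)), X])
    F = X1 @ B_hat
    return F, F.argmax(axis=1)

# Synthetic toy data (assumption): three Gaussian clusters in two dimensions.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [3.0, 3.0], [6.0, 0.0]])
g = np.repeat(np.arange(3), 30)
X = centers[g] + rng.normal(size=(90, 2))

B_hat = fit_indicator_regression(X, g, K=3)
F, g_hat = predict(B_hat, X)
print(F[0], F[0].sum())    # components sum to 1 but may fall outside [0, 1]
```

Because the model includes an intercept, each fitted row sums to 1, which is the property stated on slide 17.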
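
The plug-in estimates of slide 26 and the discriminant functions of slide 25 fit in a few lines of NumPy. The sketch below uses synthetic two-class data (an assumption) and compares the argmax rule with the two-class inequality of slides 27 and 28; the helper names `fit_lda` and `lda_discriminants` are hypothetical.

```python
import numpy as np

def fit_lda(X, g, K):
    """Plug-in estimates: priors N_k/N, class means, pooled covariance."""
    N = X.shape[0]
    priors = np.array([(g == k).mean() for k in range(K)])
    means = np.array([X[g == k].mean(axis=0) for k in range(K)])
    centered = X - means[g]                       # subtract each row's class mean
    sigma = centered.T @ centered / (N - K)       # pooled within-class covariance
    return priors, means, sigma

def lda_discriminants(X, priors, means, sigma):
    """delta_k(x) = x^T S^-1 mu_k - 0.5 mu_k^T S^-1 mu_k + log pi_k for each row."""
    sigma_inv_mu = np.linalg.solve(sigma, means.T)          # p x K
    const = -0.5 * np.sum(means.T * sigma_inv_mu, axis=0) + np.log(priors)
    return X @ sigma_inv_mu + const                          # N x K

# Synthetic two-class data (assumption), with unequal class sizes.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal([0, 0], 1.0, size=(40, 2)),
               rng.normal([2, 1], 1.0, size=(60, 2))])
g = np.repeat([0, 1], [40, 60])

priors, means, sigma = fit_lda(X, g, K=2)
delta = lda_discriminants(X, priors, means, sigma)
g_hat = delta.argmax(axis=1)

# Two-class rule: second class (index 1) if lhs > rhs.
diff = np.linalg.solve(sigma, means[1] - means[0])
lhs = X @ diff
rhs = 0.5 * (means[1] + means[0]) @ diff - np.log(60 / 40)
print(np.array_equal(g_hat == 1, lhs > rhs))
```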
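
The two shrinkage formulas of slides 32 and 34 can be combined in one small function. This is a sketch under stated assumptions rather than the `ascrda` implementation: class covariances are normalized by $N_k - 1$, and $\hat\sigma^2$ is taken to be the average eigenvalue `trace(pooled) / p`, a choice the slides do not specify.

```python
import numpy as np

def class_and_pooled_cov(X, g, K):
    """Per-class covariances (K x p x p) and the pooled covariance (p x p)."""
    N, p = X.shape
    class_covs, pooled = [], np.zeros((p, p))
    for k in range(K):
        Xk = X[g == k]
        centered = Xk - Xk.mean(axis=0)
        scatter = centered.T @ centered
        class_covs.append(scatter / (len(Xk) - 1))   # assumption: N_k - 1 normalization
        pooled += scatter
    return np.array(class_covs), pooled / (N - K)

def rda_covariances(class_covs, pooled, alpha, gamma=1.0):
    """Sigma_k(alpha) = alpha Sigma_k + (1 - alpha) Sigma(gamma)."""
    p = pooled.shape[0]
    sigma2 = np.trace(pooled) / p                    # assumption: scalar variance
    pooled_gamma = gamma * pooled + (1.0 - gamma) * sigma2 * np.eye(p)
    return alpha * class_covs + (1.0 - alpha) * pooled_gamma

# Synthetic data (assumption): three classes, four features.
rng = np.random.default_rng(3)
g = np.repeat(np.arange(3), 25)
X = rng.normal(size=(75, 4)) * (1.0 + g[:, None])

class_covs, pooled = class_and_pooled_cov(X, g, K=3)
blended = rda_covariances(class_covs, pooled, alpha=0.5)
print(blended.shape)
```

Setting `alpha=1` gives QDA-style class covariances and `alpha=0` gives the LDA pooled covariance, so `alpha` moves continuously between the two models.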
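
The eigendecomposition trick of slide 35 can be verified against a direct evaluation of the quadratic discriminant function from slide 30. The sketch below reuses the example covariance from slide 23; the test point `x` and the prior `0.5` are hypothetical.

```python
import numpy as np

def qda_discriminant(x, mu, sigma, prior):
    """delta_k(x) computed through the eigendecomposition sigma = U D U^T."""
    d, U = np.linalg.eigh(sigma)          # eigenvalues d, orthonormal columns U
    z = U.T @ (x - mu)                    # rotate the centered point
    quad = np.sum(z ** 2 / d)             # (x - mu)^T sigma^-1 (x - mu)
    log_det = np.sum(np.log(d))           # log |sigma|
    return -0.5 * log_det - 0.5 * quad + np.log(prior)

mu = np.array([0.0, 0.0])
sigma = np.array([[1.0, 0.6], [0.6, 1.0]])   # example covariance from slide 23
x = np.array([0.5, -1.0])                    # hypothetical test point

delta = qda_discriminant(x, mu, sigma, prior=0.5)

# Direct evaluation with an explicit inverse and determinant, for comparison.
direct = (-0.5 * np.log(np.linalg.det(sigma))
          - 0.5 * (x - mu) @ np.linalg.inv(sigma) @ (x - mu)
          + np.log(0.5))
print(delta, direct)
```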
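
The steps of slide 44 translate fairly directly into NumPy. The following sketch assumes an unweighted covariance of the sphered centroids for $\mathbf{B}^*$ and synthetic three-class data; the function name `fisher_directions` is illustrative. It then checks the eigenvalue equation $\mathbf{W}^{-1}\mathbf{B}a = \lambda a$ from slides 42 and 43.

```python
import numpy as np

def fisher_directions(X, g, K):
    """Discriminant directions v_l = W^{-1/2} v*_l, sorted by eigenvalue."""
    N, p = X.shape
    M = np.array([X[g == k].mean(axis=0) for k in range(K)])   # K x p centroids
    centered = X - M[g]
    W = centered.T @ centered / (N - K)                         # within-class covariance
    d_w, U = np.linalg.eigh(W)
    W_inv_sqrt = U @ np.diag(d_w ** -0.5) @ U.T                 # W^{-1/2}
    M_star = M @ W_inv_sqrt                                     # sphered centroids
    B_star = np.cov(M_star, rowvar=False)                       # assumption: unweighted
    d_b, V_star = np.linalg.eigh(B_star)
    order = np.argsort(d_b)[::-1]                               # largest eigenvalue first
    V = W_inv_sqrt @ V_star[:, order]
    return V, d_b[order], W, M

# Synthetic data (assumption): three classes in four dimensions.
rng = np.random.default_rng(2)
centers = rng.normal(scale=3.0, size=(3, 4))
g = np.repeat(np.arange(3), 50)
X = centers[g] + rng.normal(size=(150, 4))

V, d_b, W, M = fisher_directions(X, g, K=3)
Z = X @ V[:, :2]             # first two discriminant coordinates Z_1, Z_2
print(Z.shape, d_b)
```

With $K = 3$ classes only the first two eigenvalues are meaningfully nonzero, which matches the $K - 1$ dimensional subspace of slide 37.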