Hierarchical Table-Based Function Approximation Method
Hierarchical Multipartite
Function Evaluation
Advisor : Prof. Shen-Fu Hsiao (蕭勝夫)
Student : Yi-Hau Chen (陳奕豪)
Outline
•Motivation
•Related Work
•Proposed
•Results & Comparison
•Conclusion
Motivation
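• Special function units are widely used in digital signal processing and multimedia applications, e.g., graphics processing units (GPUs).
• Special function unit (SFU)
• Computes trigonometric, reciprocal, exponential, and logarithm functions.
• Built from lookup tables (LUTs) and a few simple arithmetic units.
• Two main categories:
• piecewise polynomial approximation (PPA)
• table-lookup-and-addition (TA)
• This thesis studies how to effectively reduce the table area of TA while retaining TA's advantage of faster computation.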
Outline
• Motivation
• Related Work
• Category
• Piecewise Polynomial Approximation (PPA)
• Table-Lookup-and-Addition (TA)
• Bipartite Table Methods (BP)
• Symmetric Bipartite Table Methods (SBTM)
• Symmetric Table Addition Methods (STAM)
• Multipartite Table Methods (MP)
• Proposed
• Results & Comparison
• Conclusion
Category
Piecewise Polynomial Approximation (PPA)-(1/2)
$f(x) \cong a_0(x_m) + a_1(x_m) \cdot x_l$
PPA-(2/2) deg-2 Architecture
$f(x) \cong a_0(x_m) + a_1(x_m) \cdot x_l + a_2(x_m) \cdot x_l^2$
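A minimal Python sketch of the degree-2 PPA evaluation, assuming uniform segments addressed by the upper bits $x_m$ and coefficients taken from a Taylor expansion at each segment start. A hardware design would more likely use fitted, quantized coefficients; the function names, the choice of sin, and `m_bits = 6` are illustrative and not from the slides.

```python
import math

def build_ppa_tables(f, df, d2f, m_bits):
    """Per-segment coefficients (a0, a1, a2) for 2**m_bits uniform segments of [0, 1)."""
    tables = []
    for k in range(2 ** m_bits):
        xm = k / 2 ** m_bits                  # segment start addressed by x_m
        tables.append((f(xm), df(xm), d2f(xm) / 2))
    return tables

def ppa_eval(tables, x, m_bits):
    """Evaluate a0(x_m) + a1(x_m)*x_l + a2(x_m)*x_l**2 for 0 <= x < 1."""
    k = int(x * 2 ** m_bits)                  # upper bits select the segment
    xl = x - k / 2 ** m_bits                  # lower bits: offset inside the segment
    a0, a1, a2 = tables[k]
    return a0 + a1 * xl + a2 * xl * xl

# Illustrative run: sin(x) on [0, 1) with 64 segments, sampled at 12-bit inputs.
tables = build_ppa_tables(math.sin, math.cos, lambda t: -math.sin(t), m_bits=6)
max_err = max(abs(ppa_eval(tables, i / 4096, 6) - math.sin(i / 4096))
              for i in range(4096))
```

The table holds three coefficients per segment, and the evaluation needs multipliers, which is the cost that the table-lookup-and-addition methods avoid.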
Table-Lookup-and-Addition (TA)
• Two main categories: add-table-add (ATA) methods and bipartite/multipartite methods.
• The bipartite/multipartite family includes:
• bipartite table methods (BP) [16]
• symmetric bipartite table methods (SBTM) [17]
• symmetric table addition methods (STAM) [18]
• multipartite table methods (MP) [1,19]
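Bipartite Table Methods (BP)-(1/5) Bit partition
To approximate a function $f(x)$, the $n$-bit input $x$ is split into two parts, $x_0$ and $x_1$, with bit widths $\alpha$ and $\beta$ respectively, where $\alpha + \beta = n$. We assume the initial input interval is $0 \le x < 1$, i.e.,
$x = x_0 + x_1$
$0 \le x_0 \le 1 - 2^{-\alpha}$
$0 \le x_1 \le 2^{-\alpha} - 2^{-n}$
$0 \le x_{0,1} \le 1 - 2^{-\gamma}$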
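The bit partition can be sketched in Python as below. The sketch assumes that the input is held as an unsigned integer `x_int` with value `x_int / 2**n`, and that $x_{0,1}$ consists of the $\gamma$ most significant bits of $x_0$ (with $\gamma \le \alpha$), which is what its stated range suggests. The function name and the example widths are illustrative.

```python
def partition(x_int, n, alpha, gamma):
    """Split an n-bit fixed-point input into (x0, x1, x01) as real values."""
    beta = n - alpha
    x0_int = x_int >> beta                    # top alpha bits
    x1_int = x_int & ((1 << beta) - 1)        # low beta bits
    x01_int = x0_int >> (alpha - gamma)       # top gamma bits of x0 (assumption)
    x0 = x0_int / 2 ** alpha
    x1 = x1_int / 2 ** n
    x01 = x01_int / 2 ** gamma
    return x0, x1, x01

# Example with n = 8, alpha = 5, gamma = 3 and the input 0b10110101.
x0, x1, x01 = partition(0b10110101, n=8, alpha=5, gamma=3)
```

For the all-ones input, the three parts reach exactly the upper bounds listed in the ranges.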
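BP-(2/5) Taylor expansion
The function $f(x)$ can therefore be approximated by the first two terms of its Taylor expansion:
$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} \cdot (x - a)^n$ ($a = x_0$ and $x = x_0 + x_1$)
$f(x) = f(x_0) + f'(x_0) \cdot x_1 + \varepsilon_{lin}$
$f(x) = f(x_0) + f'(x_{0,1}) \cdot x_1 + \varepsilon_{lin} + \varepsilon_{slp}$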
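As a worked bound (a standard Taylor-remainder argument, not taken from the slides), assume $f$ is twice differentiable on $[0, 1)$, let $M_2 = \max |f''|$, and assume $x_{0,1}$ holds the $\gamma$ most significant bits of $x_0$. The linearization error and the slope error can then be bounded as:

```latex
\begin{align*}
|\varepsilon_{lin}| &= \left| \tfrac{1}{2} f''(\xi)\, x_1^2 \right|
  \le \tfrac{1}{2} M_2 \cdot 2^{-2\alpha}, \qquad \xi \in [x_0,\, x_0 + x_1] \\
|\varepsilon_{slp}| &= \left| f'(x_0) - f'(x_{0,1}) \right| \cdot x_1
  \le M_2 \, (2^{-\gamma} - 2^{-\alpha}) \cdot 2^{-\alpha}
  < M_2 \cdot 2^{-\alpha - \gamma}
\end{align*}
```

Increasing $\alpha$ shrinks both terms, while $\gamma$ affects only $\varepsilon_{slp}$, so the slope can be addressed with fewer bits than the initial value.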
BP-(3/5) Magnified view of the curve
BP-(4/5) Architecture
$f(x) \cong f(x_0) + f'(x_{0,1}) \cdot x_1 \cong T_I(x_0) + T_O(x_{0,1}, x_1)$
Table of Initial Values: $T_I(x_0) \cong Q[f(x_0)]$
Table of Offset: $T_O(x_{0,1}, x_1) \cong Q[f'(x_{0,1}) \cdot x_1]$
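The two-table architecture can be sketched in Python as follows. This is an illustration under assumptions, not the thesis implementation: $Q[\cdot]$ is modeled as rounding to `out_bits` fractional bits, $x_{0,1}$ is taken to be the top $\gamma$ bits of $x_0$, and the function (sin) and widths (`n = 12`, `alpha = 8`, `gamma = 4`) are chosen only for the example.

```python
import math

def build_bp_tables(f, df, n, alpha, gamma, out_bits):
    """Build T_I (2**alpha entries) and T_O (2**gamma x 2**beta entries)."""
    beta = n - alpha
    scale = 2 ** out_bits
    q = lambda v: round(v * scale) / scale    # Q[.]: round to out_bits fractional bits
    ti = [q(f(a / 2 ** alpha)) for a in range(2 ** alpha)]
    to = [[q(df(c / 2 ** gamma) * (b / 2 ** n)) for b in range(2 ** beta)]
          for c in range(2 ** gamma)]
    return ti, to

def bp_eval(ti, to, x_int, n, alpha, gamma):
    """Approximate f(x_int / 2**n) with two lookups and one addition."""
    beta = n - alpha
    a = x_int >> beta                  # address of x_0
    b = x_int & ((1 << beta) - 1)      # address of x_1
    c = a >> (alpha - gamma)           # address of x_{0,1} (assumed top bits of x_0)
    return ti[a] + to[c][b]

n, alpha, gamma = 12, 8, 4
ti, to = build_bp_tables(math.sin, math.cos, n, alpha, gamma, out_bits=16)
max_err = max(abs(bp_eval(ti, to, i, n, alpha, gamma) - math.sin(i / 2 ** n))
              for i in range(2 ** n))
```

With these widths the two tables hold 256 + 256 entries, compared with 4096 entries for a single direct lookup table over all 12 input bits.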
BP-(5/5) Table decomposition
Symmetric Bipartite Table Methods (SBTM)
$0 \le x_0 \le 1 - 2^{-\alpha}$
$0 \le x_1 \le 2^{-\alpha} - 2^{-n}$
$0 \le x_{0,1} \le 1 - 2^{-\gamma}$
$\delta_1 = 2^{-\alpha} - 2^{-n}$
$\delta_0 = 2^{-\gamma} - 2^{-\alpha}$
$f(x) \cong f(x_0 + \frac{\delta_1}{2}) + f'(x_0 + \frac{\delta_1}{2}) \cdot (x_1 - \frac{\delta_1}{2})$
$f(x) \cong f(x_0 + \frac{\delta_1}{2}) + f'(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}) \cdot (x_1 - \frac{\delta_1}{2})$
$T_I(x_0) = Q[f(x_0 + \frac{\delta_1}{2})]$
$T_O(x_{0,1}, x_1) = Q[f'(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}) \cdot (x_1 - \frac{\delta_1}{2})]$
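A Python sketch of the SBTM tables, written to show where the symmetry comes from. It makes several assumptions for illustration: quantization $Q[\cdot]$ is omitted so that only the approximation error is visible, $x_{0,1}$ is taken to be the top $\gamma$ bits of $x_0$, and sin with `n = 12`, `alpha = 8`, `gamma = 4` is an arbitrary example.

```python
import math

def build_sbtm_tables(f, df, n, alpha, gamma):
    """Midpoint-based T_I and T_O (unquantized, for illustration)."""
    beta = n - alpha
    d1 = 2 ** -alpha - 2 ** -n         # delta_1: largest value of x_1
    d0 = 2 ** -gamma - 2 ** -alpha     # delta_0: span of x_0 bits not in x_{0,1}
    ti = [f(a / 2 ** alpha + d1 / 2) for a in range(2 ** alpha)]
    to = [[df(c / 2 ** gamma + d0 / 2 + d1 / 2) * (b / 2 ** n - d1 / 2)
           for b in range(2 ** beta)] for c in range(2 ** gamma)]
    return ti, to

n, alpha, gamma = 12, 8, 4
beta = n - alpha
ti, to = build_sbtm_tables(math.sin, math.cos, n, alpha, gamma)

def sbtm_eval(x_int):
    a = x_int >> beta                  # address of x_0
    b = x_int & ((1 << beta) - 1)      # address of x_1
    return ti[a] + to[a >> (alpha - gamma)][b]

max_err = max(abs(sbtm_eval(i) - math.sin(i / 2 ** n)) for i in range(2 ** n))

# Each row of T_O is antisymmetric about the midpoint of x_1.
antisymmetric = all(row[b] == -row[2 ** beta - 1 - b]
                    for row in to for b in range(2 ** beta))
```

Because $x_1 - \delta_1/2$ changes sign at the middle of its range, each row of the offset table is antisymmetric, so only half of the entries need to be stored and the other half can be recovered by negation.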
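Symmetric Table Addition Methods (STAM)
$\delta_1 = \sum_{i=1}^{m} \delta_{1,i}$, $\delta_{1,i} = 2^{-p_{i-1}} - 2^{-p_i}$, $i = 1, 2, \ldots, m$
with $p_0 = \alpha$, $p_i = p_{i-1} + \beta_i$, $i = 1, 2, \ldots, m$
$x_1 = \sum_{i=1}^{m} x_{1,i}$
$f(x) \cong f(x_0 + \frac{\delta_1}{2}) + f'(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}) \cdot (x_1 - \frac{\delta_1}{2})$
$f(x) \cong f(x_0 + \frac{\delta_1}{2}) + f'(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}) \cdot (\sum_{i=1}^{m} x_{1,i} - \sum_{i=1}^{m} \frac{\delta_{1,i}}{2})$
$f(x) \cong f(x_0 + \frac{\delta_1}{2}) + f'(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}) \cdot \sum_{i=1}^{m} (x_{1,i} - \frac{\delta_{1,i}}{2})$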
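The decomposition of the offset term into $m$ smaller tables can be sketched in Python as below. As assumptions for the example, quantization is omitted, every offset table shares the same slope address $x_{0,1}$ (the top $\gamma$ bits of $x_0$), and the widths `alpha = 8`, `gamma = 4`, `betas = [2, 2]` and the function sin are illustrative.

```python
import math

def build_stam_tables(f, df, alpha, gamma, betas):
    """T_I plus one offset table per part x_{1,i} of x_1 (unquantized)."""
    n = alpha + sum(betas)
    d1 = 2 ** -alpha - 2 ** -n
    d0 = 2 ** -gamma - 2 ** -alpha
    ti = [f(a / 2 ** alpha + d1 / 2) for a in range(2 ** alpha)]
    tos, p = [], alpha                         # p tracks p_i, starting at p_0 = alpha
    for beta_i in betas:
        p_prev, p = p, p + beta_i
        d1i = 2 ** -p_prev - 2 ** -p           # delta_{1,i}
        tos.append([[df(c / 2 ** gamma + d0 / 2 + d1 / 2) * (b / 2 ** p - d1i / 2)
                     for b in range(2 ** beta_i)] for c in range(2 ** gamma)])
    return ti, tos

def stam_eval(ti, tos, x_int, alpha, gamma, betas):
    """One T_I lookup plus m offset lookups, summed."""
    n = alpha + sum(betas)
    a = x_int >> (n - alpha)
    c = a >> (alpha - gamma)
    y, shift = ti[a], n - alpha
    for to_i, beta_i in zip(tos, betas):
        shift -= beta_i
        b = (x_int >> shift) & ((1 << beta_i) - 1)   # bits of x_{1,i}
        y += to_i[c][b]
    return y

alpha, gamma, betas = 8, 4, [2, 2]
n = alpha + sum(betas)
ti, tos = build_stam_tables(math.sin, math.cos, alpha, gamma, betas)
max_err = max(abs(stam_eval(ti, tos, i, alpha, gamma, betas) - math.sin(i / 2 ** n))
              for i in range(2 ** n))
offset_entries = sum(len(t) * len(t[0]) for t in tos)   # 128 here, versus 256 for one table
```

Splitting $x_1$ leaves the approximation unchanged when all parts share one slope, but it replaces one table of $2^{\gamma + \beta}$ entries with $m$ tables of $2^{\gamma + \beta_i}$ entries at the cost of extra adder inputs.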
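Multipartite Table Methods (MP[1])-(1/5) Bit partition
Multipartite Table Methods (MP[1])-(2/5) Different ways of generating initial values and slopes
MP[1]:
$T_I(x_0) = Q[\frac{f(x_0) + f(x_0 + \delta_1)}{2}]$
$T_O(x_{0,i}, x_{1,i}) = Q[s(x_{0,i}) \cdot (x_{1,i} - \frac{\delta_{1,i}}{2})]$
STAM:
$T_I(x_0) = Q[f(x_0 + \frac{\delta_1}{2})]$
$T_O(x_{0,1}, x_{1,i}) = Q[f'(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}) \cdot (x_{1,i} - \frac{\delta_{1,i}}{2})]$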
Multipartite Table Methods (MP[1])-(3/5) Computing the slope s
$s(x_{0,i}) = \frac{f(\varphi_2) - f(\varphi_1) + f(\varphi_4) - f(\varphi_3)}{2 \cdot \delta_{1,i}}$
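The slope formula averages two secant slopes. The slide defines $\varphi_1, \ldots, \varphi_4$ only in a figure that is not reproduced here, so the Python sketch below rests on an assumption: $\varphi_1$ and $\varphi_3$ are taken as the left-most and right-most starting points of the region that shares the slope address $x_{0,i}$, with $\varphi_2 = \varphi_1 + \delta_{1,i}$ and $\varphi_4 = \varphi_3 + \delta_{1,i}$. The function name and arguments are hypothetical.

```python
def mp_slope(f, phi1, phi3, delta_1i):
    """Average of the secant slopes over [phi1, phi2] and [phi3, phi4].

    Assumes phi2 = phi1 + delta_1i and phi4 = phi3 + delta_1i (an
    interpretation of the slide's figure, not confirmed by the text).
    """
    phi2 = phi1 + delta_1i
    phi4 = phi3 + delta_1i
    return (f(phi2) - f(phi1) + f(phi4) - f(phi3)) / (2 * delta_1i)

# Check on f(x) = x**2, where the secant slope over [a, a + d] is 2a + d.
slope = mp_slope(lambda t: t * t, phi1=0.25, phi3=0.5, delta_1i=0.125)
```

Unlike the STAM slope, which samples $f'$ at one point, this slope uses only values of $f$ and balances the error at both ends of the region.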
Multipartite Table Methods (MP[1])-(4/5) Architecture
Multipartite Table Methods (MP[1])-(5/5) Table decomposition
Outline
• Motivation
• Related Work
• Proposed
• Domain and range of the function
• Sampling method and error budget
• Overview of the HMP method
• Lossless ROM Compression with Low Cost
• Combined error and exhaustive search
• Speeding up the search
• Results & Comparison
• Conclusion
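Domain and range of the function
Sampling method and error budget
$T_I(x_0) = Q[\frac{f(x_0) + f(x_0 + \delta_1)}{2}]$
$\varepsilon_q = (m + 1) \cdot 2^{-n-g-1}$
$\varepsilon_{rnd} = 0.5 \cdot (2^{-n} - 2^{-n-g})$
$\varepsilon_{apx} + \varepsilon_q + \varepsilon_{rnd} = \varepsilon_{total} < 2^{-n}$
$|\varepsilon_1| = |\varepsilon_2| = |\varepsilon_3| = |\varepsilon_4|$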
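The error budget can be explored numerically with the Python sketch below. It assumes that $m + 1$ tables are each rounded to $n + g$ fractional bits, giving $\varepsilon_q = (m+1) \cdot 2^{-n-g-1}$, and that the final rounding term is $0.5 \cdot (2^{-n} - 2^{-n-g})$, the usual bound when $g$ guard bits are rounded away. The helper names are hypothetical.

```python
def error_budget(n, g, m):
    """Return (eps_q, eps_rnd, largest eps_apx allowed) for total error < 2**-n."""
    eps_q = (m + 1) * 2.0 ** (-n - g - 1)            # m + 1 rounded table entries
    eps_rnd = 0.5 * (2.0 ** -n - 2.0 ** (-n - g))    # final rounding (assumed form)
    eps_apx_max = 2.0 ** -n - eps_q - eps_rnd        # what is left for approximation
    return eps_q, eps_rnd, eps_apx_max

def min_guard_bits(n, m):
    """Smallest g that leaves a positive approximation budget."""
    g = 1
    while error_budget(n, g, m)[2] <= 0:
        g += 1
    return g

# Illustrative case: 24-bit precision with m = 3 offset tables.
g = min_guard_bits(24, 3)
eps_q, eps_rnd, eps_apx_max = error_budget(24, g, 3)
```

Under these assumptions the remaining budget simplifies to $2^{-n-1}(1 - m \cdot 2^{-g})$, so at least $2^g > m$ is required, and each extra guard bit trades wider table entries for more room in $\varepsilon_{apx}$.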
Overview of the HMP method
Comparison of MP and HMP
Lossless ROM Compression: principle diagram (s = 3)
Lossless ROM Compression with Low Cost- Table decomposition
Combined error and exhaustive search
Speeding up the search: flowchart
Speeding up the search: verification diagram
Table 4.2: Table decomposition of the 24-bit SIN function using MP [1] and HMP
Comparison of MP, HMP, and HMP_TI
Tool version: ISE 10.1
Device: Xilinx Virtex-II XC2V1000-fg456-5
Conclusion
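• The proposed HMP effectively reduces the table area of MP[1].
• The proposed Lossless ROM Compression not only reduces the table area effectively but also adds very little delay.
• The combined error analysis and exhaustive search, also proposed in this thesis, can be accelerated to finish within a practical time, a large improvement over previous work.
• Future work: extend these methods to higher precision (e.g., 32 bits).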
References
1) F. de Dinechin and A. Tisserand, “Multipartite table methods,” IEEE Transactions on Computers, vol. 54, pp. 319–330, March 2005.
2) Y. J. Kim, H. E. Kim, S. H. Kim, J. S. Park, S. Paek, and L. S. Kim, “Homogeneous stream processors with embedded special function units for high-utilization
programmable shaders,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, pp. 1691–1704, Sept 2012.
3) D. D. Caro, N. Petra, and A. G. M. Strollo, “Reducing lookup-table size in direct digital frequency synthesizers using optimized multipartite table method,” IEEE
Transactions on Circuits and Systems I: Regular Papers, vol. 55, pp. 2116–2127, Aug 2008.
4) B. G. Nam, H. Kim, and H. J. Yoo, “Power and area-efficient unified computation of vector and elementary functions for handheld 3d graphics systems,” IEEE
Transactions on Computers, vol. 57, pp. 490–504, April 2008.
5) D. D. Caro, N. Petra, and A. G. M. Strollo, “High-performance special function unit for programmable 3-d graphics processors,” IEEE Transactions on Circuits and
Systems I: Regular Papers, vol. 56, pp. 1968–1978, Sept 2009.
6) D. D. Caro, N. Petra, and A. G. M. Strollo, “Direct digital frequency synthesizer using nonuniform piecewise-linear approximation,” IEEE Transactions on Circuits and
Systems I: Regular Papers, vol. 58, pp. 2409–2419, Oct 2011.
7) J. A. Pineiro, S. F. Oberman, J. M. Muller, and J. D. Bruguera, “High-speed function approximation using a minimax quadratic interpolator,” IEEE Transactions on
Computers, vol. 54, pp. 304–318, March 2005.
8) D. U. Lee, R. Cheung, W. Luk, and J. Villasenor, “Hardware implementation tradeoffs of polynomial approximations and interpolations,” IEEE Transactions on
Computers, vol. 57, pp. 686–701, May 2008.
9) D. U. Lee and J. D. Villasenor, “Optimized custom precision function evaluation for embedded processors,” IEEE Transactions on Computers, vol. 58, pp. 46–59, Jan 2009.
10) D. U. Lee, R. C. C. Cheung, W. Luk, and J. D. Villasenor, “Hierarchical segmentation for hardware function evaluation,” IEEE Transactions on Very Large Scale
Integration (VLSI) Systems, vol. 17, pp. 103–116, Jan 2009.
11) T. Sasao, S. Nagayama, and J. T. Butler, “Numerical function generators using lut cascades,” IEEE Transactions on Computers, vol. 56, pp. 826–838, June 2007.
12) S. F. Hsiao, H. J. Ko, Y. L. Tseng, W. L. Huang, S. H. Lin, and C. S. Wen, “Design of hardware function evaluators using low-overhead nonuniform segmentation with
address remapping,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 21, pp. 875–886, May 2013.
13) A. G. M. Strollo, D. D. Caro, and N. Petra, “Elementary functions hardware implementation using constrained piecewise-polynomial approximations,” IEEE Transactions
on Computers, vol. 60, pp. 418–432, March 2011.
14) S. F. Hsiao, H. J. Ko, and C. S. Wen, “Two-level hardware function evaluation based on correction of normalized piecewise difference functions,” IEEE Transactions on
Circuits and Systems II: Express Briefs, vol. 59, pp. 292–296, May 2012.
15) M. Chaudhary and P. Lee, “An improved two-step binary logarithmic converter for fpgas,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 62, pp.
476–480, May 2015.
16) D. D. Sarma and D. W. Matula, “Faithful bipartite rom reciprocal tables,” in Computer Arithmetic, 1995., Proceedings of the 12th Symposium on, pp. 17–28, Jul 1995.
17) M. J. Schulte and J. E. Stine, “Approximating elementary functions with symmetric bipartite tables,” IEEE Transactions on Computers, vol. 48, pp. 842–847, Aug 1999.
18) J. E. Stine and M. J. Schulte, “The symmetric table addition method for accurate function approximation,” Journal of VLSI signal processing systems for signal, image and
video technology, vol. 21, no. 2, pp. 167–177, 1999.
19) J.-M. Muller, “A few results on table-based methods,” Reliable Computing, vol. 5, no. 3, pp. 279–288, 1999.
20) P. K. Meher, “Lut optimization for memory-based computation,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 57, pp. 285–289, April 2010.
21) W. F. Wong and E. Goto, “Fast evaluation of the elementary functions in single precision,” IEEE Transactions on Computers, vol. 44, pp. 453–457, Mar 1995.
22) J. Y. L. Low and C. C. Jong, “A memory-efficient tables-and-additions method for accurate computation of elementary functions,” IEEE Transactions on Computers, vol.
62, pp. 858–872, May 2013.
23) D. Wang, J. M. Muller, N. Brisebarre, and M. D. Ercegovac, “(m,p,k) –friendly points: A table-based method to evaluate trigonometric function,” IEEE Transactions on
Circuits and Systems II: Express Briefs, vol. 61, pp. 711–715, Sept 2014.
24) S. F. Hsiao, P. H. Wu, C. S. Wen, and P. K. Meher, “Table size reduction methods for faithfully rounded lookup-table-based multiplierless function evaluation,” IEEE
Transactions on Circuits and Systems II: Express Briefs, vol. 62, pp. 466–470, May 2015.
25) J.-M. Muller, Elementary Functions: Algorithms and Implementation, 2nd ed. Birkhauser, 2006.
26) M. D. Ercegovac and T. Lang, Digital Arithmetic. Morgan Kaufmann Pub, 2004.
27) B. Parhami, Algorithms and Design Methods for Digital Computer Arithmetic, International 2nd ed. Oxford University Press, 2012.
28) S.-F. Hsiao, P.-C. Wei, and C.-P. Lin, “An automatic hardware generator for special arithmetic functions using various rom-based approximation approaches,” in Circuits
and Systems, 2008. ISCAS 2008. IEEE International Symposium on, pp. 468–471, May 2008.
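29) 曾于玲, “Design of an automatic generator for table-lookup function evaluation units using bit truncation,” Master's thesis, Department of Computer Science and Engineering, National Sun Yat-sen University, 2011.
30) 吳柏翰, “Table reduction and optimization for multiplierless table-lookup function evaluation designs,” Master's thesis, Department of Computer Science and Engineering, National Sun Yat-sen University, 2013.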
31) S. F. Hsiao, C. S. Wen, Y. H. Chen, and K. C. Huang, “Hierarchical multipartite function evaluation,” IEEE Transactions on Computers, vol. PP, no. 99, pp. 1–1, 2016.
51

Mais conteúdo relacionado

Mais procurados

論文口試簡報製作技巧
論文口試簡報製作技巧論文口試簡報製作技巧
論文口試簡報製作技巧滄碩 劉
 
Grad-CAMの始まりのお話
Grad-CAMの始まりのお話Grad-CAMの始まりのお話
Grad-CAMの始まりのお話Shintaro Yoshida
 
131111台大論文口試
131111台大論文口試131111台大論文口試
131111台大論文口試貽得 廖
 
AOI智慧升級─AI訓練師在地養成計畫_台灣人工智慧學校
AOI智慧升級─AI訓練師在地養成計畫_台灣人工智慧學校AOI智慧升級─AI訓練師在地養成計畫_台灣人工智慧學校
AOI智慧升級─AI訓練師在地養成計畫_台灣人工智慧學校CHENHuiMei
 
患者安全技能ノンテクニカルスキル向上をあなたの組織で実現するには
患者安全技能ノンテクニカルスキル向上をあなたの組織で実現するには患者安全技能ノンテクニカルスキル向上をあなたの組織で実現するには
患者安全技能ノンテクニカルスキル向上をあなたの組織で実現するにはTakahiro Matsumoto
 
落合陽一前陣速攻のスライド0131 #JILS
落合陽一前陣速攻のスライド0131 #JILS 落合陽一前陣速攻のスライド0131 #JILS
落合陽一前陣速攻のスライド0131 #JILS Yoichi Ochiai
 
SSII2022 [OS3-01] 深層学習のための効率的なデータ収集と活用
SSII2022 [OS3-01] 深層学習のための効率的なデータ収集と活用SSII2022 [OS3-01] 深層学習のための効率的なデータ収集と活用
SSII2022 [OS3-01] 深層学習のための効率的なデータ収集と活用SSII
 
碩士論文簡報 王玠瑛-20130601-final版-組織變革要留「心」!
碩士論文簡報 王玠瑛-20130601-final版-組織變革要留「心」!碩士論文簡報 王玠瑛-20130601-final版-組織變革要留「心」!
碩士論文簡報 王玠瑛-20130601-final版-組織變革要留「心」!tabowang
 
產品企劃與開發Part1 分享版
產品企劃與開發Part1 分享版產品企劃與開發Part1 分享版
產品企劃與開發Part1 分享版Mr PM
 
イノベーションスプリント2011_野中先生
イノベーションスプリント2011_野中先生イノベーションスプリント2011_野中先生
イノベーションスプリント2011_野中先生InnovationSprint2011
 
論文紹介: Fast R-CNN&Faster R-CNN
論文紹介: Fast R-CNN&Faster R-CNN論文紹介: Fast R-CNN&Faster R-CNN
論文紹介: Fast R-CNN&Faster R-CNNTakashi Abe
 
第15回 配信講義 計算科学技術特論A(2021)
第15回 配信講義 計算科学技術特論A(2021)第15回 配信講義 計算科学技術特論A(2021)
第15回 配信講義 計算科学技術特論A(2021)RCCSRENKEI
 
データ中心の時代を生き抜くエンジニアに知ってほしい10?のこと
データ中心の時代を生き抜くエンジニアに知ってほしい10?のことデータ中心の時代を生き抜くエンジニアに知ってほしい10?のこと
データ中心の時代を生き抜くエンジニアに知ってほしい10?のことHideo Terada
 
第18回コンピュータビジョン勉強会@関東「ICCV祭り」発表資料(kanejaki)
第18回コンピュータビジョン勉強会@関東「ICCV祭り」発表資料(kanejaki)第18回コンピュータビジョン勉強会@関東「ICCV祭り」発表資料(kanejaki)
第18回コンピュータビジョン勉強会@関東「ICCV祭り」発表資料(kanejaki)kanejaki
 
外食のマーケティングを進化させる「外食データクラウド」とAIを活用した外食POSデータ「ラベリング技術」の業界を超えた戦略とMLOps活用
外食のマーケティングを進化させる「外食データクラウド」とAIを活用した外食POSデータ「ラベリング技術」の業界を超えた戦略とMLOps活用外食のマーケティングを進化させる「外食データクラウド」とAIを活用した外食POSデータ「ラベリング技術」の業界を超えた戦略とMLOps活用
外食のマーケティングを進化させる「外食データクラウド」とAIを活用した外食POSデータ「ラベリング技術」の業界を超えた戦略とMLOps活用Deep Learning Lab(ディープラーニング・ラボ)
 

Mais procurados (20)

論文口試簡報製作技巧
論文口試簡報製作技巧論文口試簡報製作技巧
論文口試簡報製作技巧
 
Grad-CAMの始まりのお話
Grad-CAMの始まりのお話Grad-CAMの始まりのお話
Grad-CAMの始まりのお話
 
131111台大論文口試
131111台大論文口試131111台大論文口試
131111台大論文口試
 
AOI智慧升級─AI訓練師在地養成計畫_台灣人工智慧學校
AOI智慧升級─AI訓練師在地養成計畫_台灣人工智慧學校AOI智慧升級─AI訓練師在地養成計畫_台灣人工智慧學校
AOI智慧升級─AI訓練師在地養成計畫_台灣人工智慧學校
 
患者安全技能ノンテクニカルスキル向上をあなたの組織で実現するには
患者安全技能ノンテクニカルスキル向上をあなたの組織で実現するには患者安全技能ノンテクニカルスキル向上をあなたの組織で実現するには
患者安全技能ノンテクニカルスキル向上をあなたの組織で実現するには
 
論文口試
論文口試論文口試
論文口試
 
落合陽一前陣速攻のスライド0131 #JILS
落合陽一前陣速攻のスライド0131 #JILS 落合陽一前陣速攻のスライド0131 #JILS
落合陽一前陣速攻のスライド0131 #JILS
 
人工知能概論 15
人工知能概論 15人工知能概論 15
人工知能概論 15
 
SSII2022 [OS3-01] 深層学習のための効率的なデータ収集と活用
SSII2022 [OS3-01] 深層学習のための効率的なデータ収集と活用SSII2022 [OS3-01] 深層学習のための効率的なデータ収集と活用
SSII2022 [OS3-01] 深層学習のための効率的なデータ収集と活用
 
碩士論文簡報 王玠瑛-20130601-final版-組織變革要留「心」!
碩士論文簡報 王玠瑛-20130601-final版-組織變革要留「心」!碩士論文簡報 王玠瑛-20130601-final版-組織變革要留「心」!
碩士論文簡報 王玠瑛-20130601-final版-組織變革要留「心」!
 
產品企劃與開發Part1 分享版
產品企劃與開發Part1 分享版產品企劃與開發Part1 分享版
產品企劃與開發Part1 分享版
 
分析手法のご紹介
分析手法のご紹介分析手法のご紹介
分析手法のご紹介
 
イノベーションスプリント2011_野中先生
イノベーションスプリント2011_野中先生イノベーションスプリント2011_野中先生
イノベーションスプリント2011_野中先生
 
論文紹介: Fast R-CNN&Faster R-CNN
論文紹介: Fast R-CNN&Faster R-CNN論文紹介: Fast R-CNN&Faster R-CNN
論文紹介: Fast R-CNN&Faster R-CNN
 
第15回 配信講義 計算科学技術特論A(2021)
第15回 配信講義 計算科学技術特論A(2021)第15回 配信講義 計算科学技術特論A(2021)
第15回 配信講義 計算科学技術特論A(2021)
 
データ中心の時代を生き抜くエンジニアに知ってほしい10?のこと
データ中心の時代を生き抜くエンジニアに知ってほしい10?のことデータ中心の時代を生き抜くエンジニアに知ってほしい10?のこと
データ中心の時代を生き抜くエンジニアに知ってほしい10?のこと
 
第18回コンピュータビジョン勉強会@関東「ICCV祭り」発表資料(kanejaki)
第18回コンピュータビジョン勉強会@関東「ICCV祭り」発表資料(kanejaki)第18回コンピュータビジョン勉強会@関東「ICCV祭り」発表資料(kanejaki)
第18回コンピュータビジョン勉強会@関東「ICCV祭り」発表資料(kanejaki)
 
外食のマーケティングを進化させる「外食データクラウド」とAIを活用した外食POSデータ「ラベリング技術」の業界を超えた戦略とMLOps活用
外食のマーケティングを進化させる「外食データクラウド」とAIを活用した外食POSデータ「ラベリング技術」の業界を超えた戦略とMLOps活用外食のマーケティングを進化させる「外食データクラウド」とAIを活用した外食POSデータ「ラベリング技術」の業界を超えた戦略とMLOps活用
外食のマーケティングを進化させる「外食データクラウド」とAIを活用した外食POSデータ「ラベリング技術」の業界を超えた戦略とMLOps活用
 
02 FDA 醫材臨床試驗考量
02 FDA 醫材臨床試驗考量02 FDA 醫材臨床試驗考量
02 FDA 醫材臨床試驗考量
 
Semantic segmentation2
Semantic segmentation2Semantic segmentation2
Semantic segmentation2
 

Semelhante a 碩士論文投影片

Positioning Error Analysis and Compensation of Differential Precision Workbench
Positioning Error Analysis and Compensation of Differential Precision WorkbenchPositioning Error Analysis and Compensation of Differential Precision Workbench
Positioning Error Analysis and Compensation of Differential Precision WorkbenchIJRES Journal
 
An Algebraic Method to Check the Singularity-Free Paths for Parallel Robots
An Algebraic Method to Check the Singularity-Free Paths for Parallel RobotsAn Algebraic Method to Check the Singularity-Free Paths for Parallel Robots
An Algebraic Method to Check the Singularity-Free Paths for Parallel RobotsDr. Ranjan Jha
 
Segmentation and recognition of handwritten digit numeral string using a mult...
Segmentation and recognition of handwritten digit numeral string using a mult...Segmentation and recognition of handwritten digit numeral string using a mult...
Segmentation and recognition of handwritten digit numeral string using a mult...ijfcstjournal
 
An efficient hardware logarithm generator with modified quasi-symmetrical app...
An efficient hardware logarithm generator with modified quasi-symmetrical app...An efficient hardware logarithm generator with modified quasi-symmetrical app...
An efficient hardware logarithm generator with modified quasi-symmetrical app...IJECEIAES
 
New Directions in Mahout's Recommenders
New Directions in Mahout's RecommendersNew Directions in Mahout's Recommenders
New Directions in Mahout's Recommenderssscdotopen
 
A Comparative study of K-SVD and WSQ Algorithms in Fingerprint Compression Te...
A Comparative study of K-SVD and WSQ Algorithms in Fingerprint Compression Te...A Comparative study of K-SVD and WSQ Algorithms in Fingerprint Compression Te...
A Comparative study of K-SVD and WSQ Algorithms in Fingerprint Compression Te...IRJET Journal
 
A Method for the Reduction 0f Linear High Order MIMO Systems Using Interlacin...
A Method for the Reduction 0f Linear High Order MIMO Systems Using Interlacin...A Method for the Reduction 0f Linear High Order MIMO Systems Using Interlacin...
A Method for the Reduction 0f Linear High Order MIMO Systems Using Interlacin...IJMTST Journal
 
DataEngConf: Feature Extraction: Modern Questions and Challenges at Google
DataEngConf: Feature Extraction: Modern Questions and Challenges at GoogleDataEngConf: Feature Extraction: Modern Questions and Challenges at Google
DataEngConf: Feature Extraction: Modern Questions and Challenges at GoogleHakka Labs
 
IRJET- Kinematic Analysis of Planar and Spatial Mechanisms using Matpack
IRJET- Kinematic Analysis of Planar and Spatial Mechanisms using MatpackIRJET- Kinematic Analysis of Planar and Spatial Mechanisms using Matpack
IRJET- Kinematic Analysis of Planar and Spatial Mechanisms using MatpackIRJET Journal
 
An Efficient Multiplierless Transform algorithm for Video Coding
An Efficient Multiplierless Transform algorithm for Video CodingAn Efficient Multiplierless Transform algorithm for Video Coding
An Efficient Multiplierless Transform algorithm for Video CodingCSCJournals
 
Palmprint verification using lagrangian decomposition and invariant interest
Palmprint verification using lagrangian decomposition and invariant interestPalmprint verification using lagrangian decomposition and invariant interest
Palmprint verification using lagrangian decomposition and invariant interestDakshina Kisku
 
Palmprint verification using lagrangian decomposition and invariant interest
Palmprint verification using lagrangian decomposition and invariant interestPalmprint verification using lagrangian decomposition and invariant interest
Palmprint verification using lagrangian decomposition and invariant interestDakshina Kisku
 
Bayesian Estimation for Missing Values in Latin Square Design
Bayesian Estimation for Missing Values in Latin Square DesignBayesian Estimation for Missing Values in Latin Square Design
Bayesian Estimation for Missing Values in Latin Square Designinventionjournals
 
Fault detection based on novel fuzzy modelling
Fault detection based on novel fuzzy modelling Fault detection based on novel fuzzy modelling
Fault detection based on novel fuzzy modelling csijjournal
 
Multiple Ant Colony Optimizations for Stereo Matching
Multiple Ant Colony Optimizations for Stereo MatchingMultiple Ant Colony Optimizations for Stereo Matching
Multiple Ant Colony Optimizations for Stereo MatchingCSCJournals
 
Implementation of an arithmetic logic using area efficient carry lookahead adder
Implementation of an arithmetic logic using area efficient carry lookahead adderImplementation of an arithmetic logic using area efficient carry lookahead adder
Implementation of an arithmetic logic using area efficient carry lookahead adderVLSICS Design
 
Design of optimized Interval Arithmetic Multiplier
Design of optimized Interval Arithmetic MultiplierDesign of optimized Interval Arithmetic Multiplier
Design of optimized Interval Arithmetic MultiplierVLSICS Design
 

Semelhante a 碩士論文投影片 (20)

Positioning Error Analysis and Compensation of Differential Precision Workbench
Positioning Error Analysis and Compensation of Differential Precision WorkbenchPositioning Error Analysis and Compensation of Differential Precision Workbench
Positioning Error Analysis and Compensation of Differential Precision Workbench
 
An Algebraic Method to Check the Singularity-Free Paths for Parallel Robots
An Algebraic Method to Check the Singularity-Free Paths for Parallel RobotsAn Algebraic Method to Check the Singularity-Free Paths for Parallel Robots
An Algebraic Method to Check the Singularity-Free Paths for Parallel Robots
 
Unit1 pg math model
Unit1 pg math modelUnit1 pg math model
Unit1 pg math model
 
Segmentation and recognition of handwritten digit numeral string using a mult...
Segmentation and recognition of handwritten digit numeral string using a mult...Segmentation and recognition of handwritten digit numeral string using a mult...
Segmentation and recognition of handwritten digit numeral string using a mult...
 
An efficient hardware logarithm generator with modified quasi-symmetrical app...
An efficient hardware logarithm generator with modified quasi-symmetrical app...An efficient hardware logarithm generator with modified quasi-symmetrical app...
An efficient hardware logarithm generator with modified quasi-symmetrical app...
 
New Directions in Mahout's Recommenders
New Directions in Mahout's RecommendersNew Directions in Mahout's Recommenders
New Directions in Mahout's Recommenders
 
A Comparative study of K-SVD and WSQ Algorithms in Fingerprint Compression Te...
A Comparative study of K-SVD and WSQ Algorithms in Fingerprint Compression Te...A Comparative study of K-SVD and WSQ Algorithms in Fingerprint Compression Te...
A Comparative study of K-SVD and WSQ Algorithms in Fingerprint Compression Te...
 
A Method for the Reduction 0f Linear High Order MIMO Systems Using Interlacin...
A Method for the Reduction 0f Linear High Order MIMO Systems Using Interlacin...A Method for the Reduction 0f Linear High Order MIMO Systems Using Interlacin...
A Method for the Reduction 0f Linear High Order MIMO Systems Using Interlacin...
 
DataEngConf: Feature Extraction: Modern Questions and Challenges at Google
DataEngConf: Feature Extraction: Modern Questions and Challenges at GoogleDataEngConf: Feature Extraction: Modern Questions and Challenges at Google
DataEngConf: Feature Extraction: Modern Questions and Challenges at Google
 
IRJET- Kinematic Analysis of Planar and Spatial Mechanisms using Matpack
IRJET- Kinematic Analysis of Planar and Spatial Mechanisms using MatpackIRJET- Kinematic Analysis of Planar and Spatial Mechanisms using Matpack
IRJET- Kinematic Analysis of Planar and Spatial Mechanisms using Matpack
 
An Efficient Multiplierless Transform algorithm for Video Coding
An Efficient Multiplierless Transform algorithm for Video CodingAn Efficient Multiplierless Transform algorithm for Video Coding
An Efficient Multiplierless Transform algorithm for Video Coding
 
Medial axis transformation based skeletonzation of image patterns using image...
Medial axis transformation based skeletonzation of image patterns using image...Medial axis transformation based skeletonzation of image patterns using image...
Medial axis transformation based skeletonzation of image patterns using image...
 
9.venkata naga vamsi. a
9.venkata naga vamsi. a9.venkata naga vamsi. a
9.venkata naga vamsi. a
 
Palmprint verification using lagrangian decomposition and invariant interest
Palmprint verification using lagrangian decomposition and invariant interestPalmprint verification using lagrangian decomposition and invariant interest
Palmprint verification using lagrangian decomposition and invariant interest
 
Palmprint verification using lagrangian decomposition and invariant interest
Palmprint verification using lagrangian decomposition and invariant interestPalmprint verification using lagrangian decomposition and invariant interest
Palmprint verification using lagrangian decomposition and invariant interest
 
Bayesian Estimation for Missing Values in Latin Square Design
Bayesian Estimation for Missing Values in Latin Square DesignBayesian Estimation for Missing Values in Latin Square Design
Bayesian Estimation for Missing Values in Latin Square Design
 
Fault detection based on novel fuzzy modelling
Fault detection based on novel fuzzy modelling Fault detection based on novel fuzzy modelling
Fault detection based on novel fuzzy modelling
 
Multiple Ant Colony Optimizations for Stereo Matching
Multiple Ant Colony Optimizations for Stereo MatchingMultiple Ant Colony Optimizations for Stereo Matching
Multiple Ant Colony Optimizations for Stereo Matching
 
Implementation of an arithmetic logic using area efficient carry lookahead adder
Implementation of an arithmetic logic using area efficient carry lookahead adderImplementation of an arithmetic logic using area efficient carry lookahead adder
Implementation of an arithmetic logic using area efficient carry lookahead adder
 
Design of optimized Interval Arithmetic Multiplier
Design of optimized Interval Arithmetic MultiplierDesign of optimized Interval Arithmetic Multiplier
Design of optimized Interval Arithmetic Multiplier
 

Último

Analysis and Evaluation of Dal Lake Biomass for Conversion to Fuel/Green fert...
Analysis and Evaluation of Dal Lake Biomass for Conversion to Fuel/Green fert...Analysis and Evaluation of Dal Lake Biomass for Conversion to Fuel/Green fert...
Analysis and Evaluation of Dal Lake Biomass for Conversion to Fuel/Green fert...arifengg7
 
KCD Costa Rica 2024 - Nephio para parvulitos
KCD Costa Rica 2024 - Nephio para parvulitosKCD Costa Rica 2024 - Nephio para parvulitos
KCD Costa Rica 2024 - Nephio para parvulitosVictor Morales
 
Turn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptxTurn leadership mistakes into a better future.pptx
Turn leadership mistakes into a better future.pptxStephen Sitton
 
Comprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfComprehensive energy systems.pdf Comprehensive energy systems.pdf
Comprehensive energy systems.pdf Comprehensive energy systems.pdfalene1
 
March 2024 - Top 10 Read Articles in Artificial Intelligence and Applications...
March 2024 - Top 10 Read Articles in Artificial Intelligence and Applications...March 2024 - Top 10 Read Articles in Artificial Intelligence and Applications...
March 2024 - Top 10 Read Articles in Artificial Intelligence and Applications...gerogepatton
 
AntColonyOptimizationManetNetworkAODV.pptx
AntColonyOptimizationManetNetworkAODV.pptxAntColonyOptimizationManetNetworkAODV.pptx
AntColonyOptimizationManetNetworkAODV.pptxLina Kadam
 
CS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdfCS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdfBalamuruganV28
 
TEST CASE GENERATION GENERATION BLOCK BOX APPROACH
TEST CASE GENERATION GENERATION BLOCK BOX APPROACHTEST CASE GENERATION GENERATION BLOCK BOX APPROACH
TEST CASE GENERATION GENERATION BLOCK BOX APPROACHSneha Padhiar
 
Structural Integrity Assessment Standards in Nigeria by Engr Nimot Muili
Structural Integrity Assessment Standards in Nigeria by Engr Nimot MuiliStructural Integrity Assessment Standards in Nigeria by Engr Nimot Muili
Structural Integrity Assessment Standards in Nigeria by Engr Nimot MuiliNimot Muili
 
Python Programming for basic beginners.pptx
Python Programming for basic beginners.pptxPython Programming for basic beginners.pptx
Python Programming for basic beginners.pptxmohitesoham12
 
Module-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdfModule-1-(Building Acoustics) Noise Control (Unit-3). pdf
Module-1-(Building Acoustics) Noise Control (Unit-3). pdfManish Kumar
 
Cost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionCost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionSneha Padhiar
 
Uk-NO1 Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Exp...
Uk-NO1 Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Exp...Uk-NO1 Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Exp...
Uk-NO1 Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Exp...Amil baba
 
STATE TRANSITION DIAGRAM in psoc subject
STATE TRANSITION DIAGRAM in psoc subjectSTATE TRANSITION DIAGRAM in psoc subject
STATE TRANSITION DIAGRAM in psoc subjectGayathriM270621
 
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMSHigh Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMS
High Voltage Engineering- OVER VOLTAGES IN ELECTRICAL POWER SYSTEMSsandhya757531
 
Module-1-Building Acoustics(Introduction)(Unit-1).pdf
Module-1-Building Acoustics(Introduction)(Unit-1).pdfModule-1-Building Acoustics(Introduction)(Unit-1).pdf
Module-1-Building Acoustics(Introduction)(Unit-1).pdfManish Kumar
 
2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.
2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.
2022 AWS DNA Hackathon 장애 대응 솔루션 jarvis.elesangwon
 
SOFTWARE ESTIMATION COCOMO AND FP CALCULATION
SOFTWARE ESTIMATION COCOMO AND FP CALCULATIONSOFTWARE ESTIMATION COCOMO AND FP CALCULATION
SOFTWARE ESTIMATION COCOMO AND FP CALCULATIONSneha Padhiar
 
"Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ..."Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ...Erbil Polytechnic University
 
Gravity concentration_MI20612MI_________
Gravity concentration_MI20612MI_________Gravity concentration_MI20612MI_________
Gravity concentration_MI20612MI_________Romil Mishra
 

Último (20)

Analysis and Evaluation of Dal Lake Biomass for Conversion to Fuel/Green fert...
Analysis and Evaluation of Dal Lake Biomass for Conversion to Fuel/Green fert...Analysis and Evaluation of Dal Lake Biomass for Conversion to Fuel/Green fert...
Analysis and Evaluation of Dal Lake Biomass for Conversion to Fuel/Green fert...
 

Master's Thesis Slides

  • 1. Hierarchical Multipartite Function Evaluation (分層的表格為主函數近似方法)
      Advisor: Prof. Shen-Fu Hsiao (蕭勝夫)
      Student: Yi-Hau Chen (陳奕豪)
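  • 4. Motivation
      • Special function units are widely used in digital signal processing and multimedia applications, for example in graphics processing units (GPUs).
      • A special function unit evaluates functions such as trigonometric, reciprocal, exponential, and logarithm functions.
      • It is built from lookup tables (LUTs) and a few simple arithmetic units.
      • The approaches fall into two main categories:
          • piecewise polynomial approximation (PPA)
          • table-lookup-and-addition (TA)
      • This thesis investigates how to effectively reduce the table area of TA while keeping TA's advantage of faster evaluation.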
  • 5. Outline
      • Motivation
      • Related Work
          • Category
          • Piecewise Polynomial Approximation (PPA)
          • Table-Lookup-and-Addition (TA)
              • Bipartite Table Methods (BP)
              • Symmetric Bipartite Table Methods (SBTM)
              • Symmetric Table Addition Methods (STAM)
              • Multipartite Table Methods (MP)
      • Proposed
      • Results & Comparison
      • Conclusion
  • 7. Piecewise Polynomial Approximation (PPA) (1/2)
      $f(x) \cong a_0(x_m) + a_1(x_m) \cdot x_l$
  • 8. PPA (2/2): Degree-2 Architecture
      $f(x) \cong a_0(x_m) + a_1(x_m) \cdot x_l + a_2(x_m) \cdot x_l^2$
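The degree-2 PPA formula above can be illustrated with a small software model. This is only a sketch under stated assumptions: the coefficients a0, a1, a2 are taken from a Taylor expansion at the start of each segment (hardware designs typically use minimax or otherwise optimized coefficients), the values are kept in floating point rather than fixed point, and the names `build_ppa_table`, `ppa_eval`, and `M_BITS` are hypothetical.

```python
import math

M_BITS = 6  # assumed number of upper bits x_m used as the table index


def build_ppa_table(f, df, d2f, m_bits=M_BITS):
    """One (a0, a1, a2) triple per segment, from a Taylor expansion at x_m."""
    table = []
    for k in range(1 << m_bits):
        x_m = k / (1 << m_bits)
        table.append((f(x_m), df(x_m), d2f(x_m) / 2.0))
    return table


def ppa_eval(table, x, m_bits=M_BITS):
    """Evaluate a0(x_m) + a1(x_m)*x_l + a2(x_m)*x_l^2 for 0 <= x < 1."""
    k = int(x * (1 << m_bits))   # segment index = upper m_bits of x
    x_l = x - k / (1 << m_bits)  # remaining lower part of x
    a0, a1, a2 = table[k]
    return a0 + x_l * (a1 + x_l * a2)  # Horner form: two multiply-adds


# Example: sin(x) on [0, 1) with 64 segments.
SIN_TABLE = build_ppa_table(math.sin, math.cos, lambda t: -math.sin(t))
PPA_MAX_ERR = max(
    abs(ppa_eval(SIN_TABLE, i / 4096) - math.sin(i / 4096)) for i in range(4096)
)
```

The two multiply-adds in `ppa_eval` are what the TA methods on the following slides avoid, at the cost of larger tables.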
  • 9. Table-Lookup-and-Addition (TA)
      • TA methods fall into two main categories: add-table-add (ATA) methods and bipartite/multipartite methods.
      • The bipartite/multipartite family includes:
          • bipartite table methods (BP) [16]
          • symmetric bipartite table methods (SBTM) [17]
          • symmetric table addition methods (STAM) [18]
          • multipartite table methods (MP) [1,19]
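  • 10. Bipartite Table Methods (BP) (1/5): Bit Partition
      To approximate a function $f(x)$, the $n$-bit input $x$ is split into two parts $x_0$ and $x_1$ of widths $\alpha$ and $\beta$ bits, with $\alpha + \beta = n$. Assuming the initial input interval $0 \le x < 1$, we have
      $x = x_0 + x_1$
      $0 \le x_0 \le 1 - 2^{-\alpha}$
      $0 \le x_1 \le 2^{-\alpha} - 2^{-n}$
      $0 \le x_{0,1} \le 1 - 2^{-\gamma}$
      where $x_{0,1}$ consists of the $\gamma$ most significant bits of $x_0$.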
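  • 11. BP (2/5): Taylor Expansion
      The function $f(x)$ can therefore be approximated by the first two terms of its Taylor expansion
      $\sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} \cdot (x - a)^n$ (with $a = x_0$ and $x = x_0 + x_1$):
      $f(x) = f(x_0) + f'(x_0) \cdot x_1 + \varepsilon_{lin}$
      $f(x) = f(x_0) + f'(x_{0,1}) \cdot x_1 + \varepsilon_{lin} + \varepsilon_{slp}$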
  • 13. BP (4/5): Architecture
      $f(x) \cong f(x_0) + f'(x_{0,1}) \cdot x_1 \cong T_I(x_0) + T_O(x_{0,1}, x_1)$
      $T_I(x_0) \cong Q[f(x_0)]$ (table of initial values)
      $T_O(x_{0,1}, x_1) \cong Q[f'(x_{0,1}) \cdot x_1]$ (table of offsets)
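The bipartite architecture above (one lookup in T_I, one lookup in T_O, one addition) can be modeled in a few lines. This is a minimal sketch, not the thesis implementation: the widths `N`, `ALPHA`, `BETA`, `GAMMA`, the guard-bit count `GUARD`, and the function names are illustrative assumptions, and Q[·] is modeled as round-to-nearest at n + g fractional bits.

```python
import math

# Assumed illustrative widths: n = alpha + beta, gamma <= alpha, g guard bits.
N, ALPHA, BETA, GAMMA, GUARD = 12, 8, 4, 5, 2


def quantize(v, frac_bits=N + GUARD):
    """Model of Q[.]: round to the nearest multiple of 2^-frac_bits."""
    return round(v * (1 << frac_bits)) / (1 << frac_bits)


def build_bipartite(f, df):
    """T_I(x0) = Q[f(x0)] and T_O(x01, x1) = Q[f'(x01) * x1]."""
    ti = [quantize(f(i / (1 << ALPHA))) for i in range(1 << ALPHA)]
    to = [
        [quantize(df(j / (1 << GAMMA)) * (k / (1 << N))) for k in range(1 << BETA)]
        for j in range(1 << GAMMA)
    ]
    return ti, to


def bipartite_eval(ti, to, x_int):
    """x_int is the n-bit input word; its value is x_int * 2^-n."""
    i0 = x_int >> BETA               # x0: upper alpha bits
    i01 = i0 >> (ALPHA - GAMMA)      # x01: upper gamma bits of x0
    i1 = x_int & ((1 << BETA) - 1)   # x1: lower beta bits
    return ti[i0] + to[i01][i1]      # two lookups and one addition


BP_TI, BP_TO = build_bipartite(math.sin, math.cos)
BP_MAX_ERR = max(
    abs(bipartite_eval(BP_TI, BP_TO, x) - math.sin(x / (1 << N)))
    for x in range(1 << N)
)
```

With these widths the two tables hold 256 + 512 entries, compared with 4096 entries for a direct n-bit lookup table.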
  • 15. Symmetric Bipartite Table Methods (SBTM)
      $0 \le x_0 \le 1 - 2^{-\alpha}$
      $0 \le x_1 \le 2^{-\alpha} - 2^{-n}$
      $0 \le x_{0,1} \le 1 - 2^{-\gamma}$
      $\delta_1 = 2^{-\alpha} - 2^{-n}$
      $\delta_0 = 2^{-\gamma} - 2^{-\alpha}$
  • 16. $f(x) \cong f\left(x_0 + \frac{\delta_1}{2}\right) + f'\left(x_0 + \frac{\delta_1}{2}\right) \cdot \left(x_1 - \frac{\delta_1}{2}\right)$
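  • 17. $f(x) \cong f\left(x_0 + \frac{\delta_1}{2}\right) + f'\left(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}\right) \cdot \left(x_1 - \frac{\delta_1}{2}\right)$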
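  • 18. Symmetric Bipartite Table Methods (SBTM)
      $f(x) \cong f\left(x_0 + \frac{\delta_1}{2}\right) + f'\left(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}\right) \cdot \left(x_1 - \frac{\delta_1}{2}\right)$
      $T_I(x_0) = Q\left[f\left(x_0 + \frac{\delta_1}{2}\right)\right]$
      $T_O(x_{0,1}, x_1) = Q\left[f'\left(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}\right) \cdot \left(x_1 - \frac{\delta_1}{2}\right)\right]$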
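Because the factor (x1 − δ1/2) changes sign when the bits of x1 are complemented, T_O is antisymmetric in x1 and only half of it needs to be stored. The sketch below illustrates this under assumptions: the widths, the guard bits, the round-to-nearest model of Q[·], and all function names are hypothetical, and the sign handling is done with a plain `if` rather than the XOR gates a hardware design would use.

```python
import math

# Assumed illustrative widths (not taken from the slides).
N, ALPHA, BETA, GAMMA, GUARD = 12, 8, 4, 5, 2
D1 = 2.0 ** -ALPHA - 2.0 ** -N      # delta_1
D0 = 2.0 ** -GAMMA - 2.0 ** -ALPHA  # delta_0
HALF = 1 << (BETA - 1)


def quantize(v, frac_bits=N + GUARD):
    """Model of Q[.]: round to the nearest multiple of 2^-frac_bits."""
    return round(v * (1 << frac_bits)) / (1 << frac_bits)


def build_sbtm(f, df):
    """T_I at segment midpoints; T_O stored only for the upper half of x1."""
    ti = [quantize(f(i * 2.0 ** -ALPHA + D1 / 2)) for i in range(1 << ALPHA)]
    to = [
        [
            quantize(
                df(j * 2.0 ** -GAMMA + D0 / 2 + D1 / 2)
                * ((HALF + k) * 2.0 ** -N - D1 / 2)
            )
            for k in range(HALF)
        ]
        for j in range(1 << GAMMA)
    ]
    return ti, to


def sbtm_eval(ti, to, x_int):
    i0 = x_int >> BETA
    i01 = i0 >> (ALPHA - GAMMA)
    i1 = x_int & ((1 << BETA) - 1)
    if i1 >= HALF:                           # upper half: stored entry as is
        return ti[i0] + to[i01][i1 - HALF]
    return ti[i0] - to[i01][HALF - 1 - i1]   # lower half: complement index, negate


SB_TI, SB_TO = build_sbtm(math.sin, math.cos)
SB_MAX_ERR = max(
    abs(sbtm_eval(SB_TI, SB_TO, x) - math.sin(x / (1 << N))) for x in range(1 << N)
)
```

With these widths T_O shrinks to 256 entries, and expanding around the segment midpoint also reduces the linearization and slope errors.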
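  • 19. Symmetric Table Addition Methods (STAM)
      $f(x) \cong f\left(x_0 + \frac{\delta_1}{2}\right) + f'\left(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}\right) \cdot \left(x_1 - \frac{\delta_1}{2}\right)$
      $x_1 = \sum_{i=1}^{m} x_{1,i}$
      $\delta_1 = \sum_{i=1}^{m} \delta_{1,i}, \quad \delta_{1,i} = 2^{-p_{i-1}} - 2^{-p_i}, \quad i = 1, 2, \ldots, m$
      with $p_0 = \alpha, \quad p_i = p_{i-1} + \beta_i, \quad i = 1, 2, \ldots, m$
      $f(x) \cong f\left(x_0 + \frac{\delta_1}{2}\right) + f'\left(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}\right) \cdot \left(\sum_{i=1}^{m} x_{1,i} - \sum_{i=1}^{m} \frac{\delta_{1,i}}{2}\right)$
      $f(x) \cong f\left(x_0 + \frac{\delta_1}{2}\right) + f'\left(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}\right) \cdot \sum_{i=1}^{m} \left(x_{1,i} - \frac{\delta_{1,i}}{2}\right)$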
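The STAM partition of δ1 is a telescoping sum, which a few lines of code can check. The helper name `stam_partition` and the example widths are assumptions chosen for illustration only.

```python
def stam_partition(alpha, betas):
    """Return (p, deltas) with p_0 = alpha, p_i = p_{i-1} + beta_i,
    and delta_{1,i} = 2^-p_{i-1} - 2^-p_i."""
    p = [alpha]
    for b in betas:
        p.append(p[-1] + b)
    deltas = [2.0 ** -p[i - 1] - 2.0 ** -p[i] for i in range(1, len(p))]
    return p, deltas


# Example: alpha = 8 and x1 split into two 2-bit fields, so n = 12.
P, DELTAS = stam_partition(8, [2, 2])
DELTA_1 = sum(DELTAS)  # telescopes to 2^-alpha - 2^-n
```

Each δ_{1,i} is the largest value its field x_{1,i} can take, so every term (x_{1,i} − δ_{1,i}/2) is symmetric about zero and each offset table can be halved in the same way as in SBTM.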
  • 20. Multipartite Table Methods (MP [1]) (1/5): Bit Partition
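  • 21. Multipartite Table Methods (MP [1]) (2/5): Different Ways of Generating Initial Values and Slopes
      MP [1]:
      $T_I(x_0) = Q\left[\frac{f(x_0) + f(x_0 + \delta_1)}{2}\right]$
      $T_O(x_{0,i}, x_{1,i}) = Q\left[s(x_{0,i}) \cdot \left(x_{1,i} - \frac{\delta_{1,i}}{2}\right)\right]$
      STAM:
      $T_I(x_0) = Q\left[f\left(x_0 + \frac{\delta_1}{2}\right)\right]$
      $T_O(x_{0,1}, x_{1,i}) = Q\left[f'\left(x_{0,1} + \frac{\delta_0}{2} + \frac{\delta_1}{2}\right) \cdot \left(x_{1,i} - \frac{\delta_{1,i}}{2}\right)\right]$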
  • 22. Multipartite Table Methods (MP [1]) (3/5): Computing the Slope s
      $s(x_{0,i}) = \frac{f(\varphi_2) - f(\varphi_1) + f(\varphi_4) - f(\varphi_3)}{2 \cdot \delta_{1,i}}$
  • 23. Multipartite Table Methods (MP [1]) (4/5): Architecture
  • 24. Multipartite Table Methods (MP [1]) (5/5): Table Decomposition
  • 25. Outline
      • Motivation
      • Related Work
      • Proposed
          • Domain and range of the function
          • Sampling method and error budget
          • Overview of the HMP method
          • Lossless ROM Compression with Low Cost
          • Combined error and exhaustive search
          • Speeding up the search
      • Results & Comparison
      • Conclusion
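  • 27. Sampling Method and Error Budget
      $T_I(x_0) = Q\left[\frac{f(x_0) + f(x_0 + \delta_1)}{2}\right]$
      $\varepsilon_q = (m + 1) \cdot 2^{-n-g-1}$
      $\varepsilon_{rnd} = 0.5 \cdot (2^{-n} - 2^{-n-g})$
      $\varepsilon_{apx} + \varepsilon_q + \varepsilon_{rnd} = \varepsilon_{total} < 2^{-n}$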
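The error budget above can be turned into a small calculation of how much approximation error remains once the quantization and final-rounding terms are paid for. This is a sketch under assumptions: `m` is read as the number of offset tables (so m + 1 tables are rounded in total), `g` as the number of guard bits, and ε_rnd as the error of rounding away those g guard bits, i.e. 0.5·(2^-n − 2^-(n+g)); that reading is an interpretation, since with g < n the form 2^-n − 2^-g would be negative. The function names are hypothetical.

```python
def error_budget(n, g, m):
    """Return (eps_q, eps_rnd, eps_apx_max) for an n-bit result, g guard bits
    and m + 1 tables, so that eps_apx + eps_q + eps_rnd < 2^-n."""
    eps_q = (m + 1) * 2.0 ** (-n - g - 1)             # table rounding errors
    eps_rnd = 0.5 * (2.0 ** -n - 2.0 ** (-n - g))     # final rounding to n bits
    eps_apx_max = 2.0 ** -n - eps_q - eps_rnd         # left for approximation
    return eps_q, eps_rnd, eps_apx_max


def min_guard_bits(n, m, max_g=16):
    """Smallest g that leaves a strictly positive approximation budget."""
    for g in range(max_g + 1):
        if error_budget(n, g, m)[2] > 0:
            return g
    raise ValueError("no feasible guard-bit count up to max_g")


# Example: 16-bit result, 3 guard bits, 2 offset tables.
EPS_Q, EPS_RND, EPS_APX_MAX = error_budget(16, 3, 2)
```

Under this reading the budget simplifies to 2^-(n+1) − m·2^-(n+g+1), so it is positive only when 2^g > m; more guard bits leave more room for approximation error and therefore allow smaller tables to be considered during the search.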
  • 35. Lossless ROM Compression with Low Cost: Table Decomposition
  • 41. Table 4.2: Table decomposition of the 24-bit SIN function using MP [1] and HMP
  • 44. Comparison of MP, HMP, and HMP_TI
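  • 49. Conclusion
      • The proposed HMP effectively reduces the table area of MP [1].
      • The proposed Lossless ROM Compression not only reduces the table area effectively but also adds very little delay.
      • The combined error and exhaustive search techniques proposed alongside them speed the search up enough to finish in a practical amount of time, a substantial improvement over earlier work.
      • Future work: extend these methods to higher precision (e.g., 32 bits).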
  • 50. References
      1) F. de Dinechin and A. Tisserand, “Multipartite table methods,” IEEE Transactions on Computers, vol. 54, pp. 319–330, March 2005.
      2) Y. J. Kim, H. E. Kim, S. H. Kim, J. S. Park, S. Paek, and L. S. Kim, “Homogeneous stream processors with embedded special function units for high-utilization programmable shaders,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, pp. 1691–1704, Sept 2012.
      3) D. D. Caro, N. Petra, and A. G. M. Strollo, “Reducing lookup-table size in direct digital frequency synthesizers using optimized multipartite table method,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 55, pp. 2116–2127, Aug 2008.
      4) B. G. Nam, H. Kim, and H. J. Yoo, “Power and area-efficient unified computation of vector and elementary functions for handheld 3D graphics systems,” IEEE Transactions on Computers, vol. 57, pp. 490–504, April 2008.
      5) D. D. Caro, N. Petra, and A. G. M. Strollo, “High-performance special function unit for programmable 3-D graphics processors,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 56, pp. 1968–1978, Sept 2009.
      6) D. D. Caro, N. Petra, and A. G. M. Strollo, “Direct digital frequency synthesizer using nonuniform piecewise-linear approximation,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 58, pp. 2409–2419, Oct 2011.
      7) J. A. Pineiro, S. F. Oberman, J. M. Muller, and J. D. Bruguera, “High-speed function approximation using a minimax quadratic interpolator,” IEEE Transactions on Computers, vol. 54, pp. 304–318, March 2005.
      8) D. U. Lee, R. Cheung, W. Luk, and J. Villasenor, “Hardware implementation tradeoffs of polynomial approximations and interpolations,” IEEE Transactions on Computers, vol. 57, pp. 686–701, May 2008.
      9) D. U. Lee and J. D. Villasenor, “Optimized custom precision function evaluation for embedded processors,” IEEE Transactions on Computers, vol. 58, pp. 46–59, Jan 2009.
      10) D. U. Lee, R. C. C. Cheung, W. Luk, and J. D. Villasenor, “Hierarchical segmentation for hardware function evaluation,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 17, pp. 103–116, Jan 2009.
      11) T. Sasao, S. Nagayama, and J. T. Butler, “Numerical function generators using LUT cascades,” IEEE Transactions on Computers, vol. 56, pp. 826–838, June 2007.
      12) S. F. Hsiao, H. J. Ko, Y. L. Tseng, W. L. Huang, S. H. Lin, and C. S. Wen, “Design of hardware function evaluators using low-overhead nonuniform segmentation with address remapping,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 21, pp. 875–886, May 2013.
      13) A. G. M. Strollo, D. D. Caro, and N. Petra, “Elementary functions hardware implementation using constrained piecewise-polynomial approximations,” IEEE Transactions on Computers, vol. 60, pp. 418–432, March 2011.
      14) S. F. Hsiao, H. J. Ko, and C. S. Wen, “Two-level hardware function evaluation based on correction of normalized piecewise difference functions,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 59, pp. 292–296, May 2012.
      15) M. Chaudhary and P. Lee, “An improved two-step binary logarithmic converter for FPGAs,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 62, pp. 476–480, May 2015.
  • 51. References
      16) D. D. Sarma and D. W. Matula, “Faithful bipartite ROM reciprocal tables,” in Proceedings of the 12th Symposium on Computer Arithmetic, pp. 17–28, Jul 1995.
      17) M. J. Schulte and J. E. Stine, “Approximating elementary functions with symmetric bipartite tables,” IEEE Transactions on Computers, vol. 48, pp. 842–847, Aug 1999.
      18) J. E. Stine and M. J. Schulte, “The symmetric table addition method for accurate function approximation,” Journal of VLSI Signal Processing Systems for Signal, Image and Video Technology, vol. 21, no. 2, pp. 167–177, 1999.
      19) J.-M. Muller, “A few results on table-based methods,” Reliable Computing, vol. 5, no. 3, pp. 279–288, 1999.
      20) P. K. Meher, “LUT optimization for memory-based computation,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 57, pp. 285–289, April 2010.
      21) W. F. Wong and E. Goto, “Fast evaluation of the elementary functions in single precision,” IEEE Transactions on Computers, vol. 44, pp. 453–457, Mar 1995.
      22) J. Y. L. Low and C. C. Jong, “A memory-efficient tables-and-additions method for accurate computation of elementary functions,” IEEE Transactions on Computers, vol. 62, pp. 858–872, May 2013.
      23) D. Wang, J. M. Muller, N. Brisebarre, and M. D. Ercegovac, “(m,p,k)-friendly points: A table-based method to evaluate trigonometric function,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 61, pp. 711–715, Sept 2014.
      24) S. F. Hsiao, P. H. Wu, C. S. Wen, and P. K. Meher, “Table size reduction methods for faithfully rounded lookup-table-based multiplierless function evaluation,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 62, pp. 466–470, May 2015.
      25) J.-M. Muller, Elementary Functions: Algorithms and Implementation, 2nd ed. Birkhauser, 2006.
      26) M. D. Ercegovac and T. Lang, Digital Arithmetic. Morgan Kaufmann Pub, 2004.
      27) B. Parhami, Algorithms and Design Methods for Digital Computer Arithmetic, International 2nd ed. Oxford University Press, 2012.
      28) S.-F. Hsiao, P.-C. Wei, and C.-P. Lin, “An automatic hardware generator for special arithmetic functions using various ROM-based approximation approaches,” in IEEE International Symposium on Circuits and Systems (ISCAS 2008), pp. 468–471, May 2008.
      29) 曾于玲, “Design of an automatic generator for table-lookup-based function evaluation units using bit truncation” (in Chinese), Master's thesis, Department of Computer Science and Engineering, National Sun Yat-sen University, 2011.
      30) 吳柏翰, “Table reduction and optimization for multiplierless table-lookup function evaluation designs” (in Chinese), Master's thesis, Department of Computer Science and Engineering, National Sun Yat-sen University, 2013.
      31) S. F. Hsiao, C. S. Wen, Y. H. Chen, and K. C. Huang, “Hierarchical multipartite function evaluation,” IEEE Transactions on Computers, vol. PP, no. 99, pp. 1–1, 2016.