A Rough Summary of FPGA Trends

Takefumi Miyoshi
2014.05.13

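Outline of my own speculative predictions
Scaling from 14nm to 10nm to 7nm makes 14nm inexpensive
Usable logic and BRAM expected to be 2x or more of today's
Maximum operating frequency 1.15x to 1.6x (= 1GHz+)
Lower power consumption (performance/W about 10,000x that of a CPU)
Support for high-speed communication
Off-chip communication: 28Gbps - 56Gbps
Memory access: HMC (2.5Tbps), DDR4 (1.3Tbps)
Stronger DSP
Single precision 10 TFLOPS, 100 GFLOPS/W
Tight coupling with processor units
For example, a 1.5GHz quad-core ARM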
In April 2014 a 14nm test chip slated for next-generation devices was reportedly demonstrated (*1), so I tried to imagine what specs might be within reach about five or six years from now.
*1) http://www.altera.co.jp/corporate/news_room/releases/2014/products/nr-14nm-device.html

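Main references
The figures are based on the next-generation device portfolios of Altera and Xilinx
Altera Generation 10 FPGA & SoC
http://www.altera.co.jp/technology/system-tech/next-gen-technologies.html
Xilinx UltraScale architecture
http://japan.xilinx.com/products/technology/ultrascale.html
FPGA2012 co-located workshop: FPGA in 2032
http://tcfpga.org/fpga2012/
April 2014: demo of a 14nm test chip slated for next-generation devices (*1)

→ Probably within reach about five or six years from now?
Developments up to today do not diverge much from what these sources describe
What follows are excerpts from the references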

The basic direction modern FPGAs are heading
Source: Ivo Bolsens, "Programming Modern FPGAs", MPSOC, August 2006, http://www.mpsoc-forum.org/previous/2006/slides/Bolsens.pdf
[Figure: "Xilinx Strategic Directions" (MPSOC 2006, slide 10). Applications and markets, existing and new, plotted over time. Glue Logic (existing): network infrastructure, computing infrastructure, industrial, medical, military. Algorithmic Logic (new): consumer electronics, automotive, portable. Other labels: Embedded Processor, Gb Transceivers, DSP, Integration, Hard IP, System Tools, Cost, Power, Quality.]

Target areas of next-generation FPGAs
[Figure WP435_01_070213: UltraScale architecture requirements for smarter applications. Application areas: OTN networking (100 Gb/s, 400 Gb/s, 1 Tb/s); digital video (1080P, 4k/2k, 8K); wireless communications (3G, LTE, LTE-A); radar (passive array, active element, digital array). Requirements: massive packet processing (>400 Gb/s wire-speed), massive data flow (>>5 Tb/s), massive I/O and memory bandwidth (>5 Tb/s), massive DSP performance (>7 TMACs).]
Xilinx's next-generation FPGA
Source: Xilinx, "White Paper: UltraScale Architecture, WP435 (v1.0)", July 8, 2013

Comparing the next-generation Stratix 10 with today's FPGAs
Note 1 (reconstructed from fragments): a Stratix 10 with six 72-bit DDR4 SDRAM interfaces at 3.2 Gbps reaches 1.382 Tbps.
Table 1. Stratix V FPGA vs. Stratix 10 (some row labels were lost in extraction; labels in brackets are inferred from the units)
                           Stratix V FPGA   Stratix 10     Gain
[Logic elements]           1,000 K LEs      4,000 K LEs    4X
Tera FLOPS                 1                10+            10X+
[Clock frequency]          500 MHz          1 GHz+         2X
[Transceiver data rate]    28 Gbps          56 Gbps        2X
DDR [data rate]            1,866 Mbps       3,200 Mbps     1.7X
Source: Altera Corporation, "Altera's Generation 10 products that meet the performance and power requirements of the zettabyte era", June 2013
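The DDR4 figure quoted above can be checked with simple arithmetic. The sketch below assumes that the fragments "6", "72", and "3.2 Gbps" mean six 72-bit interfaces with every pin running at 3.2 Gbps; the function name is made up for illustration, and the result is a theoretical peak rather than a sustained rate.

```python
# Peak DDR4 bandwidth implied by the Stratix 10 figures quoted above.
# Assumption: six independent 72-bit interfaces, each pin at 3.2 Gbps.

def peak_bandwidth_gbps(interfaces: int, width_bits: int, pin_rate_gbps: float) -> float:
    """Theoretical peak bandwidth in Gbps (no protocol or refresh overhead)."""
    return interfaces * width_bits * pin_rate_gbps

ddr4_gbps = peak_bandwidth_gbps(6, 72, 3.2)
print(f"DDR4 peak: {ddr4_gbps / 1000:.3f} Tbps")

# Generation-over-generation gain in DDR pin rate, from the comparison table.
ddr_ratio = 3200 / 1866
print(f"DDR pin-rate gain: {ddr_ratio:.2f}x")
```

Under these assumptions the product comes out at about 1.38 Tbps, which agrees with the 1.382 Tbps fragment and with the "DDR4 (1.3Tbps)" entry in the prediction outline.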

Silicon roadmap
[Figure: "Silicon Roadmap". Courtesy: IMEC. Copyright 2012 Xilinx.]
Source: Ivo Bolsens, "FPGA2032 Roadmap: A Personal Perspective", FPGAs in 2032: Challenges and Opportunities in the next 20 years, February 22, 2012

http://tcfpga.org/fpga2012/IvoBolsens.pdf
According to Moore's law, 2020 falls around here (annotation on the figure)

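Process node and SRAM bit cell size
Source: Zvi Or-Bach, "28nm – The Last Node of Moore's Law", March 19, 2014, http://www.eetimes.com/author.asp?doc_id=1321536
At the 28nm process used in current FPGAs, the bit cell is about 0.127 um2, so

even with further scaling, I expect the amount of SRAM (roughly, logic cells and memory) that can be built to grow by only about 2x to 2.5x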
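To make the 2x to 2.5x estimate concrete, here is a minimal sketch of the underlying arithmetic. It assumes that SRAM capacity on a fixed die area scales inversely with bit cell area and ignores peripheral circuitry; the function names are hypothetical, and only the 0.127 um2 value comes from the slide.

```python
# Relating SRAM bit cell area to the expected capacity gain.
# Assumption: capacity on a fixed die area scales as 1 / (bit cell area).

CELL_28NM_UM2 = 0.127  # 28nm bit cell size quoted on the slide

def density_gain(cell_old_um2: float, cell_new_um2: float) -> float:
    """Capacity gain when moving from the old cell size to the new one."""
    return cell_old_um2 / cell_new_um2

def cell_area_for_gain(cell_old_um2: float, gain: float) -> float:
    """Bit cell area a newer process would need to reach a given gain."""
    return cell_old_um2 / gain

for gain in (2.0, 2.5):
    area = cell_area_for_gain(CELL_28NM_UM2, gain)
    print(f"{gain:.1f}x gain needs a bit cell of about {area:.4f} um2")
```

Read the other way round, a 2x to 2.5x gain corresponds to a bit cell of roughly 0.05 to 0.064 um2 under this simple model.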

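Gate count over time (plus forecast)
Source: 船田悟史, "FPGA application areas are expanding: an infrastructure technology for big data, financial trading, and web data processing",

TechVillage, March 22, 2013, http://www.kumikomi.net/archives/2013/03/co16fpga.php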

How FPGAs are used
[Figure: "CPUs vs. Stream Processing" (slide 6 of the source)]
In 2020 as well, I expect the approach to remain the same: lay the computation out as a dataflow
Source: Michael J. Flynn, "Using FPGAs for HPC* acceleration: now and in 20 years", FPGAs in 2032: Challenges and Opportunities in the next 20 years,

February 22, 2012, http://tcfpga.org/fpga2012/MichaelFlynn.pdf
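As a loose software analogy for the dataflow style mentioned above, the sketch below chains Python generators so that each sample streams through a fixed sequence of stages. The stage names and constants are invented for illustration and do not come from the cited slides. On an FPGA the stages would be separate hardware units working concurrently on different samples, whereas these generators only interleave on one thread.

```python
# A software analogy for a dataflow pipeline: each stage consumes a stream
# and produces a stream, and samples flow through all stages in order.
from typing import Iterable, Iterator

def scale(xs: Iterable[float], k: float) -> Iterator[float]:
    for x in xs:
        yield x * k          # stage 1: multiply by a constant

def offset(xs: Iterable[float], b: float) -> Iterator[float]:
    for x in xs:
        yield x + b          # stage 2: add a constant

def running_sum(xs: Iterable[float]) -> Iterator[float]:
    total = 0.0
    for x in xs:
        total += x           # stage 3: accumulate
        yield total

def pipeline(samples: Iterable[float]) -> Iterator[float]:
    # Wire the stages together; nothing is computed until a consumer pulls.
    return running_sum(offset(scale(samples, 2.0), 1.0))

result = list(pipeline([1.0, 2.0, 3.0]))
print(result)
```

The point of the analogy is the structure: the computation is expressed as a fixed graph of stages that data passes through, rather than as a sequence of instructions fetched by a processor.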

Speedups in applications that fit FPGAs well
Achieved computational speedup for the entire application (not just the kernel) compared to an Intel server:
RTM with Chevron: VTI 19x and TTI 25x
Sparse Matrix: 20-40x
Seismic Trace Processing: 24x
Lattice Boltzmann Fluid Flow: 30x
Conjugate Gradient Opt: 26x
Credit 32x and Rates 26x
Even in big-data processing, a speedup of 10x or more can be expected when the application is a good fit
Source: Michael J. Flynn, "Using FPGAs for HPC* acceleration: now and in 20 years", FPGAs in 2032: Challenges and Opportunities in the next 20 years,

February 22, 2012, http://tcfpga.org/fpga2012/MichaelFlynn.pdf
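The distinction between whole-application and kernel-only speedup can be illustrated with Amdahl's law. This is a generic sketch, not a calculation from the cited slides: the kernel fractions and the 100x kernel speedup are hypothetical values chosen only to show the effect.

```python
# Amdahl's law: whole-application speedup when only part of the run time
# (the accelerated kernel) gets faster. All input values are hypothetical.

def whole_app_speedup(kernel_fraction: float, kernel_speedup: float) -> float:
    """Overall speedup given the fraction of time spent in the kernel."""
    if not 0.0 <= kernel_fraction <= 1.0:
        raise ValueError("kernel_fraction must be between 0 and 1")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    return 1.0 / ((1.0 - kernel_fraction) + kernel_fraction / kernel_speedup)

for fraction in (0.90, 0.99):
    overall = whole_app_speedup(fraction, 100.0)
    print(f"kernel = {fraction:.0%} of run time, 100x kernel -> {overall:.1f}x overall")
```

Under this model a whole-application figure in the 20x to 40x range requires that nearly all of the run time sits in the accelerated part, which is one way to read "applications that fit well".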

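Hoping for progress in new FPGA development toolchains!?
High-level synthesis
C-based languages, functional-language-based tools (Bluespec)
OpenCL
MaxCompiler (build a DFM in Java)
Domain-specific approaches
For example, translating SQL into logic
And so on
I hope implementing applications becomes at least a little easier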

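The need for design tools
[Figure: "Need for Design Tools" (slide 13 of the source; courtesy: David Thomas). Relative performance (ticks 0.25, 1, 4, 16, 64, 256) against design time (hour, day, week, month, year) for CPU, GPU, and FPGA, with "Initial Design" and "Gap" marked.]
The first-cut implementation (Initial Design) takes far too long, and it does not even perform well...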


FPGAs turning into heterogeneous processors
The Programmable Processing Platform
A heterogeneous multicore
Application processors
– Hard core and soft core
– External and embedded
– Caches and large memory space
– Unified shared memory
– Full OS support
Streaming micro-engines
– Configurable (soft) vector cores
– Tiny memory footprint
– Many, distributed, memories
– Compute kernels, no OS
Fixed function datapaths
– C to Gates generated
– HDL coded
– Library IP component
FPGAs provide a rich set of mapping options for complex algorithms and communication patterns
[Diagram labels: DDR3, MemCon, Interconnect A, SMP CPU, x86 CPU, DSP, high-speed I/O, discrete GPU, micro-engine array, HW datapaths, Interconnect B, FPGA.]

The current trend at both Xilinx and Altera is to integrate CPUs → mixed SW/HW applications as well

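High-speed transceivers are built in as dedicated hardware
PCIe
DDR4 memory controller
100Gbps EMAC
Optical transceivers
And so on
It would be nice if we could build logic that exploits I/O infrastructure on par with an ASIC's...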

Memory bandwidth inside the FPGA (versus a CPU)
[Figure: "Memory Bandwidth Envelope", Intel vs. Xilinx (MPSOC 2006, slide 19). Bandwidth (Tbps) against memory (KB) for 4VLX200, 2V6000, and a 3.5GHz P5; the FPGA curve is annotated with REGISTERS, LUT-RAM, and BRAM.]
• Bandwidth to registers: 500x that of a processor register file
• Bandwidth to LUT-RAMs: 50x that of a processor's L1 cache
• Bandwidth to BRAMs: 5x that of a processor's L1-to-L2 cache path
Source: Ivo Bolsens, "Programming Modern FPGAs", MPSOC, August 2006, http://www.mpsoc-forum.org/previous/2006/slides/Bolsens.pdf
This is somewhat dated material, though

Power efficiency
[Figure 13: power consumption of Stratix 10 relative to Stratix V (Stratix V normalized to 1.0). Standard Stratix 10 device: up to 55% reduction. Stratix 10 using power-reduction techniques: up to 70% reduction.]
[Figure 14 compares Stratix 10 and Arria 10; a "40-50%" figure for Arria 10 survives, but its context was lost in extraction.]
Surviving fragments of the body text cite GPU-relative figures of 148 (Stratix V), 200 (at 18 W), and 660 (Stratix 10 FPGA); these are consistent with the efficiency ratios to the GPU row in Tables 6 and 7.

Table 6. Smith-Waterman results on three platforms, parameters (256, 15M). The parameter labels and column headers were lost in extraction; the headers below are inferred, since the third column equals the first divided by the second.
Platform               Performance (MCUPS)   Power (W)   Efficiency (MCUPS/W)
Intel® Xeon® Quad-     40                    140         0.29
NVIDIA GT620           438                   50          8.76
Stratix V A7 FPGA      32,596                25          1,303

Table 7. Smith-Waterman estimates for Arria 10 and Stratix 10, parameters (256, 15M)
Platform               Performance (MCUPS)   Power (W)   Efficiency (MCUPS/W)
Arria 10               >35,000               18          >1,900
Stratix 10             >70,000               12          >5,800

Even today, FPGAs offer high power efficiency compared with CPUs and GPUs
The next generation can be expected to improve power efficiency further
About 4000x (annotation on the table)
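The efficiency column and the "about 4000x" annotation can be recomputed from the Smith-Waterman figures quoted above. The sketch assumes the three numeric columns are performance in MCUPS, power in watts, and MCUPS per watt, which the numbers themselves support; the dictionary keys are shortened labels chosen for illustration.

```python
# Recomputing power efficiency (MCUPS per watt) from the Smith-Waterman table.
# Assumption: the columns are (performance in MCUPS, power in W).

results = {
    "Xeon CPU": (40, 140),
    "GT620 GPU": (438, 50),
    "Stratix V A7 FPGA": (32596, 25),
}

def mcups_per_watt(mcups: float, watts: float) -> float:
    return mcups / watts

efficiency = {name: mcups_per_watt(*row) for name, row in results.items()}
for name, value in efficiency.items():
    print(f"{name}: {value:.2f} MCUPS/W")

fpga_vs_gpu = efficiency["Stratix V A7 FPGA"] / efficiency["GT620 GPU"]
fpga_vs_cpu = efficiency["Stratix V A7 FPGA"] / efficiency["Xeon CPU"]
print(f"FPGA vs GPU: {fpga_vs_gpu:.0f}x, FPGA vs CPU: {fpga_vs_cpu:.0f}x")
```

With unrounded inputs the FPGA-to-CPU ratio lands somewhat above 4000x, and the FPGA-to-GPU ratio is close to the 148 that appears in the fragments.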

Estimated power efficiency of next-generation devices
[Figure 8: Generation 10 FPGAs and SoCs in ICT applications. Only the labels below survived extraction.]
Arria 10: GPU 148; 100G OTN 40 %; 60 MHz (RRH) 20W; 500 MHz
Stratix 10: GPU 200; 100G OTN 65 %; 60 MHz (RRH) 20W; 736 MHz
