Apu13 cp lu-keynote-final-slideshare
1. How many cores will we need?
Chien-Ping Lu, PhD
Sr. Director, MediaTek Inc.
2. A group of hippos is called …
A Crash
| how many cores will we need? | December 4, 2013 | Confidential
3. A group of crows is called …
A Murder
4. A group of giraffes is called …
A Tower
From Wikipedia
5. So, it is not surprising that we use
“A Parade” of elephants
“A Herd” of sheep
“An Army” of ants
6. From frequency to multicore scaling
[Figure: performance and power over time. Serial computing (frequency scaling) gives way to parallel computing at the power wall in 2005.]
7. How many cores will we need?
[Figure: performance over time, with two curves labeled Moderate and Massive.]
8. Dark silicon (or dark cores)?
[Figure: performance over time. Data labels: 2x; 4x 3x; 8x 4x; 16x 4x.]
9. Light up the cores
Redefine the cores to be heterogeneous
Dark silicon: a concern on power
Amdahl’s law: an argument against parallel computing
[Figure: power vs. degree of parallelism (number of cores), bounded by a power ceiling and a parallelism wall. Core types: big cores, little cores, GPU-style “cores”. Example workloads: body tracking, ray tracing.]
10. The elephants: CPU cores
For multiple-instruction-multiple-data (MIMD) execution
Retrofitted for moderately parallel workloads, and not very efficient for massively parallel workloads
A CPU core runs 1 iteration of the parallel loop
[Figure: a Parallel.For (…) loop with an Else branch mapped onto CPU cores, each with its own front end and ALU. The same color means the same piece of code.]
11. Army of ants: SIMT cores
For SIMT (single-instruction-multiple-thread) execution
A SIMT core runs 1 iteration of the parallel loop
SIMT is the execution model of HSA and is implemented in modern GPUs, with MIMD flexibility and SIMD efficiency
A cluster of SIMT cores shares one front end in a SIMD manner
Can achieve better power efficiency with more specialized function units given the right workload
A branch is emulated through divergence
[Figure: a Parallel.For (…) loop with an Else branch mapped onto clusters of SIMT cores. Each cluster has one front end, many ALUs, and SFU 0 and SFU 1.]
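The statement that a branch is emulated through divergence can be illustrated with a small simulation. This is a minimal sketch, not the talk's or any GPU's actual implementation: the function name `simt_branch` and its parameters are hypothetical, and it assumes a single two-way branch with one shared front end that issues each taken path once for the whole cluster while the lanes on the other path sit idle.

```python
def simt_branch(values, cond, then_fn, else_fn):
    """Run `then_fn(v) if cond(v) else else_fn(v)` for all lanes in lockstep.

    Returns (results, issued_passes). The shared front end issues each
    path at most once for the whole cluster; lanes whose mask bit does
    not match the current path stay idle during that pass.
    """
    mask = [cond(v) for v in values]   # per-lane predicate (hypothetical model)
    results = list(values)
    passes = 0

    if any(mask):                      # at least one lane takes the "then" path
        passes += 1
        for lane, v in enumerate(values):
            if mask[lane]:
                results[lane] = then_fn(v)

    if not all(mask):                  # at least one lane takes the "else" path
        passes += 1
        for lane, v in enumerate(values):
            if not mask[lane]:
                results[lane] = else_fn(v)

    return results, passes


def is_even(v):
    return v % 2 == 0


# Lanes disagree on the branch, so both paths are issued.
diverged = simt_branch([1, 2, 3, 4], is_even, lambda v: v * 10, lambda v: -v)

# All lanes agree, so only one path is issued.
uniform = simt_branch([2, 4, 6, 8], is_even, lambda v: v * 10, lambda v: -v)
```

In this model a diverged cluster pays for both paths, while a uniform cluster pays for one, which is one way to read "MIMD flexibility and SIMD efficiency" given the right workload.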
12. Properties of massively data-parallel workloads
• Problem size N of the parallel workload can keep growing
• Visible serial workload s can be kept constant
• Communication overhead is proportional to log P (by a factor of r)
• Parallel workload is sped up linearly by P, the number of cores
• "Embarrassingly" parallel, when there is no communication overhead (r=0)
[Figure: time on one core, s + N, compared with time on P cores, s + r log P + N/P. The difference is the time saved by P cores.]
13. Revisiting Amdahl's law for trend prediction
$$\text{Speedup} = \frac{s + N}{s + r \log P + N/P}$$
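The speedup on this slide, read as (s + N) / (s + r log P + N/P), can be evaluated numerically to see how the trend depends on problem size. The sketch below is illustrative only: the function name `speedup` is hypothetical, and the logarithm is assumed to be base 2 because the slide does not state a base (a different base only rescales r).

```python
import math


def speedup(s, n, r, p):
    """Speedup of P cores over one core for a data-parallel workload.

    s: serial workload, n: parallel workload (problem size N),
    r: communication factor, p: number of cores.
    Assumption: log base 2; the slide leaves the base unspecified.
    """
    one_core = s + n
    p_cores = s + r * math.log2(p) + n / p
    return one_core / p_cores


# Fixed problem size: the serial part caps the speedup (Amdahl's law).
fixed = [speedup(1, 99, 0, p) for p in (1, 10, 100, 1000)]

# Problem size growing with the core count: the speedup keeps growing.
scaled = [speedup(1, 99 * p, 0, p) for p in (1, 10, 100, 1000)]
```

With a fixed N the values in `fixed` level off below (s + N) / s, while letting N grow with P keeps `scaled` increasing, which is the slide-12 argument that the problem size of massively data-parallel workloads can keep growing.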
14. MediaTek face beautification
When it comes to beauty, there seems to be no limit
Before
Skin tone adjustment
Wrinkle removal
Thinner face, bigger eyes
15. Graphics keeps moving
Pac-Man, 1980
Highest-grossing video game of all time
Recognized by 94% of American consumers
Mobile 3D graphics
GLBenchmark 2.1 Egypt, 2011
GLBenchmark 2.5 Egypt, 2012
GFXBench 2.7 T-Rex, 2013
GFXBench 3.0 Manhattan, 2013
16. High-performance computing (HPC) keeps scaling out
HPC from 1993 to 2012
‒GFLOPS ~ 130,000x
‒Cores ~ 11,000x
‒GHz ~ 10x
More atoms
Higher grid resolution
More time steps
17. Parallel killer apps are just around the corner
Completing the positive feedback loop
[Figure: feedback loop. Nodes: Moore’s law; higher frequency, more cores; more complex software; bigger problems; bigger data-parallel workloads in graphics and HPC; data; mining bigger data with machine learning; better user experience.]
What bigger problems to solve with bigger data?
How does solving bigger problems lead to better user experience?
18. How to distinguish cat photos from dog ones?
ASIRRA
Animal Species Image Recognition for Restricting Access (from Microsoft Research)
19. Why is it hard?
Source: training set of Kaggle.com Dogs vs. Cats competition
20. Is there a solution to relate photos from the same dog?
Prancer, a 5-year-old toy poodle, before and after grooming
21. Mine the solutions from the data
Theory of the differences between dogs and cats?
Learn from many (12,500) photos labeled as dogs or cats
Machine learning
Dog-cat classifier
22. Machine learning: prediction with powerful models
More powerful models have more knobs, which need to be determined with a bigger data set
The explosive growth of data has made very powerful models feasible
[Figure: a 6th-order polynomial over-fits the 4 samples]
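The point that more knobs need more data can be made concrete with the polynomial example. A 6th-order polynomial has 7 coefficients, so 4 samples cannot pin them down. The sketch below is an illustration under stated assumptions, not the construction used on the slide: the sample data, the function names, and the free knob `c` are all hypothetical. It builds a family of 6th-order polynomials that all pass exactly through the same 4 samples yet disagree away from them.

```python
def lagrange(xs, ys, x):
    """Evaluate at x the unique cubic through the 4 samples (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def sixth_order(xs, ys, x, c):
    """A 6th-order polynomial through the samples; c is a free knob.

    The added term c * (x-x0)(x-x1)(x-x2)(x-x3) * x^2 is zero at every
    sample, so the fit to the samples is exact for any value of c.
    """
    vanish = 1.0
    for xi in xs:
        vanish *= (x - xi)
    return lagrange(xs, ys, x) + c * vanish * x * x


# Hypothetical samples drawn from the simple rule y = x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 2.0, 3.0]


def max_sample_error(c):
    """Largest error on the 4 samples for a given setting of the knob."""
    return max(abs(sixth_order(xs, ys, x, c) - y) for x, y in zip(xs, ys))


# Predictions at an unseen point, x = 4, for three settings of the knob.
predictions = [sixth_order(xs, ys, 4.0, c) for c in (-1.0, 0.0, 1.0)]
```

Every setting of `c` reproduces the 4 samples, but the predictions at x = 4 range from about -380 to about 388, while the underlying rule gives 4. Only more samples could determine the extra knobs.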
23. From data to user experience
Bigger data lead to more powerful models
Powerful models with more knobs lead to better user experience
Examples:
x: dog/cat photos; y: dog or cat
x: sensor readings; y: jogging, walking or climbing
x: depth images; y: body motion
[Figure: web-scale data $(x_n, y_n)$ and a model $f(x; \{a_i\})$ with knobs $\{a_i\}$, spanning cloud and client.]
Machine learning: determine $\{a_i\}$ to minimize the error between $f(x_n; \{a_i\})$ and $y_n$
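Determining the knobs {a_i} to minimize the error between the model output f(x_n; {a_i}) and y_n can be sketched for the simplest possible model. The example below assumes a two-knob model f(x) = a0 + a1·x and a sum-of-squares error, neither of which is specified on the slide; the function name `fit_line` and the sample data are hypothetical.

```python
def fit_line(samples):
    """Return (a0, a1) minimizing sum((a0 + a1*x - y)**2) over the samples.

    Assumption: squared error and a straight-line model; the slide only
    says "minimize the error" for a general model f(x; {a_i}).
    Requires at least two samples with distinct x values.
    """
    n = len(samples)
    sx = sum(x for x, _ in samples)
    sy = sum(y for _, y in samples)
    sxx = sum(x * x for x, _ in samples)
    sxy = sum(x * y for x, y in samples)
    a1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # slope knob
    a0 = (sy - a1 * sx) / n                          # intercept knob
    return a0, a1


def predict(knobs, x):
    """Apply the fitted model f(x; a0, a1) to a new input."""
    a0, a1 = knobs
    return a0 + a1 * x


# Hypothetical training pairs (x_n, y_n) drawn from y = 2x + 1.
training_set = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
knobs = fit_line(training_set)
```

Real models for photos, sensor readings, or depth images have far more knobs than this, which is why the slide ties powerful models to bigger data.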
24. Smart clients in the era of data
[Figure: feedback loop between the cloud and the clients. Labels: client / smarter client; sensing / better sensing; data / bigger data; connectivity / better connectivity; big training set / bigger training set; powerful model / more powerful model; user experience / better user experience; data mining; local machine learning; input data; in the cloud or the clients.]
25. Looking forward
The future is here
‒ There are already massively parallel heterogeneous processors
There is no shame in being data-parallel
‒ One of the smartest things achieved in computing is data-parallel
Source: Le et al., Building High-level Features Using Large Scale Unsupervised Learning
Go parallel and go heterogeneous to keep
‒ Mobile devices cool in our palms
‒ Data centers clean for our environment
The carbon footprint of US data centers is at the same level as that of the airline industry