Energy Efficient Computing ... In the early 21C


Abstract:





Opinions expressed are those of the author alone

With the assistance of its global partners, ARM shipped 8.7 billion CPUs in 2012, a number which continues
to grow at around 20% pa. The 40B we have shipped to date outnumber the total of PCs by more than 50
times, and today more than 75% of the things connected to the Internet are ARM based. The dominant
nature of Computing in the 21c is very different to that of the Mainframe era. It is sobering to think that if
each of those 8.7B CPUs were to dissipate just 100 mW, then it would require the output of two modern power
stations to drive them; with 2.4 next year, and 3 the year after that! So Electronic Systems are also defining
where the real Energy Efficient Computing issue is! But with such a small footprint it must be easy to
measure and manage power optimisation? An increasing percentage of these are immensely complex
systems, running significant multi-tasking and multi-threaded operating systems on platforms which include
multi-processor CPU/GPU configurations and GB of memory. Whilst their minimum dissipations are a few
uW, their peak power exceeds the silicon's ability to dissipate it; so the penalty for power-unaware software
design is huge. What has been done to manage this in Electronic Systems design, and can any lessons be
transferred to the Classic Computing domains?
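
The power-station figures in the abstract can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above (8.7B CPUs, 100 mW each, about 20% growth per year). The size of a "modern power station" is not stated in the talk, so it is derived here as an assumption from the abstract's own claim that two stations cover the 2012 total, which implies roughly 435 MW each.

```python
# Back-of-envelope check of the abstract's figures.
CPUS_2012 = 8.7e9        # CPUs shipped in 2012 (from the abstract)
WATTS_PER_CPU = 0.1      # 100 mW each (from the abstract)
GROWTH = 0.20            # about 20% per year (from the abstract)


def fleet_power_watts(years_after_2012: int) -> float:
    """Total dissipation of one year's shipments, in watts."""
    cpus = CPUS_2012 * (1.0 + GROWTH) ** years_after_2012
    return cpus * WATTS_PER_CPU


# Assumption: the abstract equates the 2012 total with two stations,
# which implies about 435 MW per station.
STATION_WATTS = fleet_power_watts(0) / 2.0


def stations_needed(years_after_2012: int) -> float:
    """Number of assumed-size power stations needed for that year's shipments."""
    return fleet_power_watts(years_after_2012) / STATION_WATTS


if __name__ == "__main__":
    for year in range(3):
        megawatts = fleet_power_watts(year) / 1e6
        print(2012 + year, round(megawatts), "MW,",
              round(stations_needed(year), 2), "stations")
```

Under these assumptions the 2012 shipments dissipate 870 MW, and 20% growth gives 2.4 stations the next year and 2.88 (the abstract rounds to 3) the year after.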

Context




1hr talk at The Centre for Robotics and Neural Systems (CRNS) at University of Plymouth, Devon, UK.
The CRNS has a regular seminar series inviting national and international speakers.
http://www.tech.plym.ac.uk/SOCCE/CRNS/

SlideCast and pdf available via http://ianp24.blogspot.co.uk/

1

Prof. Ian Phillips
Principal Staff Eng’r,
ARM Ltd
ian.phillips@arm.com
Visiting Prof. at ...

Contribution to Industry
Award 2008

Centre for Robotics and Neural Systems
Uo.Plymouth
1nov13

2

1v0
Energy Efficient Computing ..?

3
Energy Efficient Computing ..?

4
Energy Efficient Computing ..?

5
The Visible Face of Computing Today

6
The Invisible Face of Computing Today

 100’s of Billions of computers each consuming mW!
 Bringing Embedded Intelligence to the Consumer Market has changed the Face of Computing! (Again)

7
Our 21c World ...

8
Markets provide the Growth Drivers
[Chart: Millions of Units against year, 1960 to 2020]
1st Era: Select work tasks
2nd Era: Broad-based computing for specific tasks
3rd Era: Computing as part of our lives

Today: ~2% of our Energy Use goes on Computing and Electronics!
... Tomorrow: It could easily be 20%!
9
ARM in the Digital World

40+ billion CPUs to date
150+ billion CPUs cumulative by 2020
 8.7B CPUs shipped in 2012 (Growing 20% pa)
 75% of the things connected to the Internet today are ARM Powered! (Gartner)
[Chart: cumulative CPUs shipped, 1998 to 2020, with 2012 marked]
http://www.arm.com/
10
Moore’s Law ...

[Chart: Transistors/Chip (M) and Transistor/PM (K) against Approximate Process Geometry, 100um down to 10nm. Source: ITRS’99]
Gordon Moore. Co-founder of Intel. (1965)
x More Functionality on a Si Chip in 20 yrs!
http://en.wikipedia.org/wiki/Moore’s_law
11
A Machine for Computing ...
Computing: A general term for algebraic manipulation of data ...
Numerated Phenomena IN (x) => y=F(x,t,s) => Processed Data/Information OUT (y)

... State and Time are always factors (variable weight).



It can include phenomena ranging from human thinking to calculations
with a narrower meaning.
Usually used to exercise analogies (models) of real-world situations;
Frequently in real-time (Fast enough to be a stabilising factor in a loop).
(Wikipedia)



... So what part do Hardware and Software play?
... And what about Energy?
12
Antikythera c87BC ... Planet Motion Computer
Mechanical
Technology

• Inventor: Hipparchos (c.190 BC – c.120 BC). Ancient Greek Astronomer, Philosopher and Mathematician.
• Single-Task, Continuous Time, Analogue Mechanical Computing (With backlash!)
See: http://www.youtube.com/watch?v=L1CuR29OajI
13
Orrery c1700 ... Planet Motion Computer
Mechanical
Technology

• Inventor: George Graham (1674-1751). English Clock-Maker.
• Single-Task, Continuous Time, Analogue Mechanical Computing (With backlash!)
14
Babbage's Difference Engine 1837
Mechanical
Technology

(Re)construction
c2000


The difference engine consists of a number of columns, numbered from 1 to N. Each column is able to store one decimal number. The only operation the engine
can do is add the value of a column n + 1 to column n to produce the new value of n. Column N can only store a constant, column 1 displays (and possibly prints)
the value of the calculation on the current iteration.

Computer for Calculating Tables: A Basic ALU Engine
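
The column-addition rule described above is enough to tabulate a polynomial by the method of finite differences. The following is a minimal sketch of that idea in software, not a model of Babbage's mechanism: the function name and the choice to update the lowest column first are assumptions made for illustration.

```python
def difference_engine(initial_columns, steps):
    """Tabulate values using only column additions.

    initial_columns[0] plays the role of column 1 (the displayed value);
    the last entry is column N, which only ever holds a constant.
    """
    cols = list(initial_columns)
    results = []
    for _ in range(steps):
        results.append(cols[0])
        # Add column n+1 into column n, lowest column first, so each
        # addition uses the not-yet-updated value of its higher neighbour.
        for n in range(len(cols) - 1):
            cols[n] += cols[n + 1]
    return results


if __name__ == "__main__":
    # x^2 starts at 0, its first difference starts at 1, and its second
    # difference is the constant 2.
    print(difference_engine([0, 1, 2], 6))
```

Seeding the columns with 0, 1, 2 produces the squares 0, 1, 4, 9, 16, 25, and seeding them with 0, 1, 6, 6 produces the cubes, with no multiplication anywhere in the loop.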

15
“Enigma” c1940
Mechanical
Technology

Data Encryption/Decryption Computer
16
“Colossus” 1944
Valve/Mechanical
Technology

Code-Breaking Computer: A Data Processor
17
“Baby” 1948

(Reconstruction)
Valve/Software
Technology

General Purpose, Quantised Time and Data, (Digital) Electronic Computing
18
Signal Processing

BTH Crystal Set (c1925): 1 Diode
Tele-Verta Radio (c1945): 4 Valves, 1 Rectifier Valve
Bush Radio (c1960): 7 Transistors, 1 Diode
Evoke DAB Radio (c2005): 100 M Transistors, 2-3 Embedded Processors
19
Radio as Computation ...
Vi => Vrf => Vif => Vro
Vrf=Vi*100
Vlo=Cos(t*1^6)
Vif=Vrf*Vlo
Vro='Bandpass'(Vif*1000)

Single-Task (Embedded), Real-Time, Analogue (Close-Enough) Computing
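
The signal chain on this slide can be read as a short program. The sketch below is an illustration under stated assumptions: the gains of 100 and 1000 come from the slide, while the cosine input, the function names and the idealised band-pass (which simply keeps the difference-frequency term) are hypothetical. The local oscillator term is treated as an angular frequency supplied by the caller, since the slide's exponent is not legible.

```python
import math

RF_GAIN = 100.0    # Vrf = Vi * 100 (from the slide)
IF_GAIN = 1000.0   # Vro = 'Bandpass'(Vif * 1000) (from the slide)


def mixer_output(vi, w_sig, w_lo, t):
    """Vif = Vrf * Vlo for a cosine input of amplitude vi (assumed waveform)."""
    vrf = RF_GAIN * vi * math.cos(w_sig * t)
    vlo = math.cos(w_lo * t)
    return vrf * vlo


def sum_and_difference(vi, w_sig, w_lo, t):
    """The same Vif rewritten as difference- and sum-frequency components."""
    amp = RF_GAIN * vi / 2.0
    return (amp * math.cos((w_sig - w_lo) * t)
            + amp * math.cos((w_sig + w_lo) * t))


def bandpassed_output(vi, w_sig, w_lo, t):
    """Idealised band-pass: keep only the difference term, then amplify."""
    return IF_GAIN * (RF_GAIN * vi / 2.0) * math.cos((w_sig - w_lo) * t)
```

Multiplying by the local oscillator shifts the wanted signal down to the difference frequency, which is why a fixed band-pass stage can then select it whatever station is tuned.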
20
Radio as Computation ...
Valve
Technology

21
Radio as Computation ...
‘Integrated Circuit’
Transistor
Valve
Technology

22
Computing is Era and Application Related ...

Computing: Creating Useful Output from Input ...
Architecture: The way this is done on the day.
It is the Most Important Product Decision!
(HW, SW, Digital, Analogue, Optics, Graphene, Mechanics, Steam, etc)

23
Moore's Real Law: x2 Functionality Every 18mth!
 Cascade of Technologies supporting Functional growth ...

[Chart: Functional Density (units), 10^0 to 10^12, against year, 1960 to 2020]
Electronic era: 1975-2005
System era: 2003-2030

... The ‘Law’ started with Wood ⇒ Stone ⇒ Bronze ⇒ Iron
24
Computing in a Cool iCon ...

25
‘A lot’ of Architecture in a Smart Phone ...

... Computation in many forms
26
Take a Look Inside...
Level-1: Modules
The Control Board.

27

http://www.ifixit.com
Inside The Control Board

(a-side)

Level-2: Sub-Assemblies




Visible Computing Contributors ...
 Samsung: Flash Memory - NV-MOS (ARM Partner)
 Cirrus Logic: Audio Codec - Bi-CMOS (ARM Partner)
 AKM: Magnetic Sensor - MEM-CMOS
 Texas Instruments: Touch Screen Controller and mobile DDR - Analogue-CMOS (ARM Partner)
 RF Filters - SAW Filter Technology
Invisible Computing Contributors ...
 OS, Drivers, Stacks, Applications, GSM, Security, Graphics, Video, Sound, etc
 Software Tools, Debug Tools, etc

28

http://www.ifixit.com
Inside The Control Board

(b-side)

Level-2: Sub-Assemblies


More Visible Computing Contributors ...








A4 Processor. Spec: Apple, Design & Mfr: Samsung - Digital-CMOS (nm) ...
 Provides the iPhone 4 with its GP computing power.
 (Said to contain ARM A8 600 MHz CPU and other ARM IP)
ST-Micro: 3 axis Gyroscope - MEM-CMOS (ARM Partner)
Broadcom: Wi-Fi, Bluetooth, and GPS - Analogue-CMOS (ARM Partner)
Skyworks: GSM - Analogue-Bipolar
Triquint: GSM PA - Analogue-GaAs
Infineon: GSM Transceiver - Analogue/Digital-CMOS (ARM Partner)
[Image labels: GPS, Bluetooth, EDR & FM]

29

http://www.ifixit.com
Level-3: Processor

NB: The Tegra 3 is similar to the
A4/5, but not used in the iPhone

30

(Nvidia Tegra 3, Around 1B transistors)
Packing Technology into an iCon
Analogue and Digital Design
Embedded Software
Mechanics, Plastics and Glass
Micro-Machines (MEMs)
Displays and Transducers
Robotics and Test
Knowledge and Know-How
Research, Education and Training
Components, Sub-Systems and Systems;
Design, Assembly and Manufacture
Metrology, Methodology and Tools
... Involving Many Specialist Businesses
... Round and Round the World
...Not-Least from Europe

31
Architecting your Product




: Is the cumulative non-functional choices made to
support the functional need
 A Good Architecture is the one that ‘survives’
 History is written by winners (2nd is for losers)
: Component Performance may be ‘poor’ as long
as System Performance is ‘better’ for its use.

 Architectural Options ...
: Business Model (Cost-of-Ownership, ROI), TTM (Productivity, History, IP Availability, Know-How), Aesthetics (Power, Quality, Behaviour, Appearance)

: Analogue, Digital, Mechanical, Optical, RF, Software, Plastics,
Metal-forming, Manufacturing, Glass, ...

: More than 99% of a Product is Reused from its Predecessor


...

32

is assumed (working is expected!)
... It used to be the only consideration!
Power Philosophy
 Hardware Dissipates Power ...


Choose Underlying Technology for best power efficiency.
One size does not fit all (Products, Applications or Instances)

 ... Software Doesn’t (But it Tells Hardware To!)
Chips can literally melt down under software ‘instruction’
Make computing hardware power as ‘Activity’ dependent as possible
Zero Activity => Zero Power
Make OS/Apps aware of the power/performance situation,
and their options for controlling it (Need Indicators and Levers)

 ... Think System: It’s how the ‘box’ performs, not the components

33
Core Power Management
 For Processor and Peripheral Circuitry...
 Variable/Gated - Clock Domains
 Variable/Switched - Power Domains
 Indicators and Levers

 Allow the software to see and influence what is going on
 Principles of Core Power Efficiency...
 Minimise voltage/frequency (P=CV^2f) so that processor has just enough performance for the current application need
 Maximise ‘Activity Power’ dependence (Zero Activity => Zero Power)
 Management by the OS and the Application SW
 Apply to all on & off-chip zones (not just the CPU) ...




34

Methodology
Retention Flops/Latches, Level Shifters, Power-Switch Cells, PLLs
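
The principle of minimising voltage and frequency to "just enough performance" can be sketched as a lookup over operating points. This is an illustrative sketch only: the operating-point table, the capacitance value and the function names are hypothetical and do not describe any particular ARM device or OS governor.

```python
# Hypothetical operating points: (frequency in MHz, supply voltage in volts).
OPERATING_POINTS = [(200, 0.8), (400, 0.9), (800, 1.0), (1200, 1.2)]
C_EFF = 1.0e-9  # effective switched capacitance in farads (assumed value)


def dynamic_power(c, v, f_hz):
    """Dynamic power P = C * V^2 * f, in watts."""
    return c * v * v * f_hz


def pick_operating_point(required_mhz, points=OPERATING_POINTS):
    """Return the slowest point that still meets the demand.

    No demand returns None, standing in for a gated-off domain
    (Zero Activity => Zero Power). Demand above the fastest point
    returns the fastest point.
    """
    if required_mhz <= 0:
        return None
    for f_mhz, volts in sorted(points):
        if f_mhz >= required_mhz:
            return (f_mhz, volts)
    return max(points)


if __name__ == "__main__":
    for demand in (0, 150, 300, 1000):
        point = pick_operating_point(demand)
        if point is None:
            print(demand, "MHz demand: domain off")
        else:
            f_mhz, volts = point
            watts = dynamic_power(C_EFF, volts, f_mhz * 1e6)
            print(demand, "MHz demand:", point, round(watts, 3), "W")
```

Because voltage enters as a square, dropping to a slower, lower-voltage point saves more power than the frequency reduction alone would suggest. The "Indicators and Levers" are what let software supply the demand figure and act on the result.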
Architectural Energy Efficiency - Parallelism
[Diagram: one Processor clocked at f, compared with two Processors each clocked at f/2 between the same Input and Output]
One processor: Capacitance = C, Voltage = V, Frequency = f
Power = CV^2f
Two processors: Capacitance = 2.2C, Voltage = 0.6V, Frequency = 0.5f
Power = (2.2*0.6*0.6*0.5)CV^2f = 0.4CV^2f

 To a limit determined by Amdahl’s or Gustafson’s Law ...
 Amdahl: Extracted parallelism from existing code (Reuse)
 Gustafson: Some needs only benefit from parallelism (Custom)
... Actual improvement is application specific.
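
The slide's arithmetic, and the two laws it names, can be written out directly. The scaling factors 2.2, 0.6 and 0.5 are from the slide; the function names are chosen here for illustration, and the speed-up formulas are the standard textbook forms of Amdahl's and Gustafson's laws.

```python
def relative_dynamic_power(cap_scale, volt_scale, freq_scale):
    """Power relative to the baseline C*V^2*f after scaling C, V and f."""
    return cap_scale * volt_scale ** 2 * freq_scale


def amdahl_speedup(parallel_fraction, n_processors):
    """Fixed problem size: the serial part limits the speed-up."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_processors)


def gustafson_speedup(parallel_fraction, n_processors):
    """Problem size grows with the machine: scaled speed-up."""
    serial = 1.0 - parallel_fraction
    return serial + parallel_fraction * n_processors


if __name__ == "__main__":
    print(round(relative_dynamic_power(2.2, 0.6, 0.5), 3))
    for p in (0.5, 0.9, 1.0):
        print(p, round(amdahl_speedup(p, 2), 3),
              round(gustafson_speedup(p, 2), 3))
```

The first line gives 0.396, the slide's 0.4. That saving holds at equal throughput only if the work divides perfectly across both processors; with a parallel fraction below 1, the two half-speed processors deliver less than the original single processor, which is the limit the slide refers to.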
35
Architectural Energy Efficiency - Data
 Moving Data takes significant Energy


Becoming the dominant energy consumption in a system

 Data Location

 Avoid moving or copying Data
 Energy ∝ DataVolume x Speed x Distance^n, n > 2 (3)
 Bring the Processing to the Data
 Caching is good (depends on implementation)
 Write back is better than write-through
 Local working memory is good
 Aka Software Caching
... The Arrangement of your Data matters!
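
Why write-back tends to beat write-through can be seen by counting how many stores actually reach main memory. The sketch below is a deliberately tiny model under stated assumptions: a direct-mapped cache with one-word lines, a stream of stores only, and hypothetical names throughout. It does not describe any specific ARM cache.

```python
def memory_writes(addresses, policy, num_lines=4):
    """Count writes that reach main memory for a stream of store addresses."""
    if policy not in ("write-through", "write-back"):
        raise ValueError("unknown policy: " + policy)
    if policy == "write-through":
        # Every store is forwarded to memory immediately.
        return len(addresses)

    dirty = {}   # cache index -> address currently held (always dirty here)
    writes = 0
    for addr in addresses:
        index = addr % num_lines
        held = dirty.get(index)
        if held is not None and held != addr:
            writes += 1          # evicting a dirty line costs one write
        dirty[index] = addr
    writes += len(dirty)         # final flush of lines still dirty
    return writes


if __name__ == "__main__":
    stores = [0, 0, 0, 1, 1, 4]
    print(memory_writes(stores, "write-through"),
          memory_writes(stores, "write-back"))
```

For the store stream 0, 0, 0, 1, 1, 4 the write-through policy sends 6 writes to memory and the write-back policy sends 3, because repeated stores to the same location are absorbed locally. The benefit depends on the access pattern, which is the sense in which the arrangement of data matters.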
36
All ARM Processors are Power Efficient

37
Choose The Horses for The Course
About 50MTr

About 50KTr

... Delivering ~5x speed (Architecture + Process + Clock)
38
Multicore ARM On-Chip ...
 Heterogeneous Multicore Systems have been in ARM for a long time:
Application: Cortex™-A8
UI & 3D Graphics: Mali™-400 MP
Power Manager: Cortex-M3
Interconnect
Memory

39
Coherent Multicore Cluster ...
 Homogeneous Multicore cluster, as part of a heterogeneous system:
Cortex-A9 … Cortex-A9, with Coherency Logic
User Interface and 3D graphics: Mali-400 MP
Power Manager: Cortex-M3
Interconnect

40
Multiple Clusters ...
 Multiple Homogeneous Coherent Clusters

Cluster: Cortex-A15 … Cortex-A15, with Coherency Logic in L2 Cache
Cluster: Cortex-A15 … Cortex-A15, with Coherency Logic in L2 Cache
Coherent Interconnect
41
Computer On a Chip c2010 ...
Today’s Consumer requires a pocket ‘Super-Computer’ ...
 Silicon Technology Provides a Billion transistors ...
 It will be supported with a few GB of memory ...
• Typically 10 Processors ...
• 4 x A9 Processors (2x2)
• 4 x MALI 400 Frag. Proc
• 1 x MALI 400 Vertex Proc
• 1 x MALI Video CoDec
• Software Stacks, OS’s and Design Tools
42
http://www.arm.com/

ARM Technology gives
chip/system designers ...
• Improved Productivity
• Improved TTM
• Improved Quality/Certainty
CoreLink™ CCN-504 and DMC-520
[Block diagram: CoreLink™ CCN-504 Cache Coherent Network. Recoverable labels follow.]
Heterogeneous processors – CPU, GPU, DSP and accelerators
Virtualized Interrupts; Interrupt Control
Up to 4 cores per cluster; up to 4 coherent clusters
4 x Quad Cortex-A15, each with L2 cache (ACE)
Accelerators and I/O: DSP, PCIe, DPI, Crypto, USB, SATA, 10-40 GbE (AHB, ACE, NIC-400)
Up to 18 AMBA interfaces for I/O coherent accelerators and IO
IO Virtualisation with System MMU
Integrated L3 cache with Snoop Filter; 8-16MB L3 cache
Uniform System memory: 2 x CoreLink™ DMC-520, dual channel DDR3/4 x72 (x72 DDR4-3200 each)
NIC-400 Network Interconnect: PHY, Flash, GPIO
Peripheral address space
43
Methodology As Well As Hardware
 C/C++ Development
 Debug & Trace
 Energy Trace Modules
 Middleware

44
big.LITTLE Processing
 For High-Performance systems...
 Tightly coupled combination of two ARM CPU clusters:



Cortex-A15 and Cortex-A7 - functionally identical
Same programmers view, looks the same to OS and applications

 big.LITTLE combines high-performance and low power



Automatically selects the right processor for the right job
Redefines the efficiency/performance trade-off
big: “Demanding tasks”, >2x Performance
LITTLE: “Always on, always connected tasks”, 30% of the Power (select use cases)
[Charts: Current smartphone compared with big.LITTLE]
45
Fine-Tuned to Different Performance Points
LITTLE (Cortex-A7, Cortex-A53): Most energy-efficient applications processor from ARM
 Simple, in-order, 8 stage pipelines
 Performance better than mainstream, high-volume smartphones (Cortex-A8 and Cortex-A9)
big (Cortex-A15, Cortex-A57): Highest performance in mobile power envelope
 Complex, out-of-order, multi-issue pipelines
 Up to 2x the performance of today’s high-end smartphones
[Pipeline diagrams; legible stage labels: Queue, Issue, Integer]
46
big.LITTLE Software
CPU Migration

 Migrate a single processor workload to the appropriate CPU
 Migration = save context then resume on another core
 Also known as Linaro “In Kernel Switcher”
 DVFS driver modifications and kernel modifications
 Based on standard power management routines
 Small modification to OS and DVFS, ~600 lines of code
big.LITTLE MP

 OS scheduler moves threads/tasks to appropriate CPU
 Based on CPU workload
 Based on dynamic thread performance requirements
 Enables highest peak performance by using all cores at once
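
The CPU Migration idea (move a workload to the appropriate CPU based on its load) can be sketched as a threshold rule with hysteresis. This is not the Linaro In Kernel Switcher or any real scheduler code: the two thresholds, the load metric and the function names are hypothetical, chosen only to show the shape of the decision.

```python
UP_THRESHOLD = 0.80     # hypothetical: migrate LITTLE -> big above this load
DOWN_THRESHOLD = 0.30   # hypothetical: migrate big -> LITTLE below this load


def choose_cluster(current, load):
    """Pick the cluster for the next interval from the current one and its load.

    load is the utilisation (0.0 to 1.0) measured on the current cluster.
    The gap between the thresholds avoids bouncing between clusters.
    """
    if current not in ("big", "LITTLE"):
        raise ValueError("unknown cluster: " + current)
    if current == "LITTLE" and load > UP_THRESHOLD:
        return "big"
    if current == "big" and load < DOWN_THRESHOLD:
        return "LITTLE"
    return current


def run_trace(loads, start="LITTLE"):
    """Apply the rule to a sequence of load samples and record placements."""
    cluster = start
    placements = []
    for load in loads:
        cluster = choose_cluster(cluster, load)
        placements.append(cluster)
    return placements


if __name__ == "__main__":
    print(run_trace([0.1, 0.9, 0.5, 0.2]))
```

A moderate load of 0.5 keeps the task wherever it already is. In the real system each migration has a cost (save context, then resume on another core), so a rule of this kind only switches when the load clearly calls for it.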

47
Bringing the Processing to the Data …
Press Claims:

Dell + Marvell, Copper

BaiDu + Marvell, Baserock

 288 server nodes in a 4U rack space
Public Source: http://www.engadget.com/2011/11/02/hp-and-calxedas-moonshot-arm-servers-will-bring-all-the-boys-to/

48
... Refining Data into Information

49
Transferrable Lessons to GP Software

 Moving data is Power Expensive ...

 Don’t move data; use it locally (Cache it)
 Refine it once, use it often (Pre-Process it)
 Your CPU Power is work-load independent ...
 So, get in; get the work done; and get out.
 Maximise the workload of your code; terminate when complete.
 Make your Processing work-load dependent
 Use a Hypervisor and turn off (at least free) processors not in use.
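
The "get in; get the work done; and get out" advice can be illustrated with a simple energy budget over a fixed time window. The power figures below are hypothetical and the model ignores wake-up and shutdown costs. It only contrasts a processor whose power is work-load independent with one that is released or turned off when idle.

```python
def energy_joules(work_seconds, window_seconds, active_watts, idle_watts):
    """Energy used over a window: active while working, idle otherwise."""
    if work_seconds < 0 or work_seconds > window_seconds:
        raise ValueError("work must fit inside the window")
    idle_seconds = window_seconds - work_seconds
    return active_watts * work_seconds + idle_watts * idle_seconds


if __name__ == "__main__":
    # 2 s of work in a 10 s window on a 1 W processor (illustrative values).
    always_on = energy_joules(2.0, 10.0, active_watts=1.0, idle_watts=1.0)
    released = energy_joules(2.0, 10.0, active_watts=1.0, idle_watts=0.05)
    print(always_on, "J if power is work-load independent")
    print(released, "J if the processor is released when the work is done")
```

When idle power equals active power, the energy bill is set by the length of the window rather than by the work, so the saving comes from code that terminates promptly and from a hypervisor or OS that then frees or powers down the processor.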

50
Society’s Challenges in the 21c
 Urbanisation (Smart Cities)
 Health (eHealth)
 Transport
 Energy (Smart Grid)
 Security
 Environment

 Food/Water
 Ageing Society
 Sustainability
 Digital Inclusion
 Economics

And whilst our technologies will be an
essential part of all solutions, they
cannot fix them without Society’s
help and cooperation!
... Energy Efficient Computing will minimise
the impact, not avert the challenges!
51

Having a great time!
Conclusions
 Putting the power of Computation into the hands of the masses has changed the face of Computing (again)
Electronic Systems will become Essential to our Lives and the Economy
 Power Efficient ES are a major issue to Society
Which faces a future with them as a significant energy consumer in themselves
 Power Efficiency must be architected into the System Hardware and Software from the beginning
To realise the maximum potential out of your Silicon (Avoiding Dark Si)
Architect & Design HW as efficiently as possible (reflecting the task)
 Strive for: No Work => No Power
Equip HW with Indicators and Levers so the System/App can manage it
Bring Processing to the Data ...
 Don’t move Data; move Information
 Process data Locally
 Energy ∝ DataVolume x Speed x Distance^n, n > 2 (3)
52
Computing at the heart of the 21c

ARM:

Enabling the Creation of
High-Performance Electronic Systems
• Productively, Economically and Reliably
• Through Hw/Sw Reuse Methodologies
• Based on a family of CPU/GPU cores

53

TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 

Energy Efficient Computing in the 21c

  • 1. Energy Efficient Computing ... In the early 21C. Abstract: With the assistance of its global partners, ARM shipped 8.7 billion CPUs in 2012; a number which continues to grow at around 20% pa. The 40B shipped to date outnumber the total of PCs by more than 50 times; and today more than 75% of the things connected to the Internet are ARM based. The dominant nature of Computing in the 21c is very different to that of the Mainframe era. It is sobering to think that if each of those 8.7B CPUs were to dissipate just 100 mW, then it would require the output of two modern power stations to drive them; with 2.4 next year, and 3 the year after that! So Electronic Systems are also defining where the real Energy Efficient Computing issue is! But with such a small footprint it must be easy to measure and manage power optimisation? An increasing percentage of these are immensely complex systems, running significant multi-tasking and multi-threaded operating systems on platforms which include multi-processor CPU/GPU configurations, and GB of memory. Whilst their minimum dissipations are a few µW, their peak power exceeds the silicon's ability to dissipate it; so the penalty for power-unaware software design is huge. What has been done to manage this in Electronic Systems design, and can any lessons be transferred to the Classic Computing domains? Context: 1hr talk at The Centre for Robotics and Neural Systems (CRNS) at University of Plymouth, Devon, UK. The CRNS has a regular seminar series inviting national and international speakers. http://www.tech.plym.ac.uk/SOCCE/CRNS/ SlideCast and pdf available via http://ianp24.blogspot.co.uk/ Opinions expressed are those of the author alone.
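The power-station estimate in the abstract is simple arithmetic and can be sketched as below. The shipment count, the 100 mW figure, the 20% growth rate and the "two power stations" equivalence are taken from the abstract; the names are illustrative, and the implied station size (roughly 435 MW each) is a consequence of those figures rather than an independently checked value.

```python
# Figures quoted in the abstract.
CPUS_SHIPPED_2012 = 8.7e9   # CPUs shipped in 2012
WATTS_PER_CPU = 0.1         # the "just 100 mW" assumption
ANNUAL_GROWTH = 0.20        # shipments growing ~20% per year
STATIONS_2012 = 2.0         # the talk's "two modern power stations"

# Total power if every CPU shipped in 2012 dissipated 100 mW, in megawatts.
total_mw = CPUS_SHIPPED_2012 * WATTS_PER_CPU / 1e6

def stations_after(years):
    """Power-station equivalents for the shipments made `years` after 2012."""
    return STATIONS_2012 * (1 + ANNUAL_GROWTH) ** years

if __name__ == "__main__":
    print(f"2012 shipments at 100 mW each: {total_mw:.0f} MW")
    for years in (1, 2):
        print(f"+{years} year(s): {stations_after(years):.2f} station equivalents")
```

Compounding at 20% gives 2.4 and then 2.88 station equivalents, which matches the "2.4 next year, and 3 the year after" in the abstract once rounded.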
  • 2. Prof. Ian Phillips, Principal Staff Eng’r, ARM Ltd (ian.phillips@arm.com). Visiting Prof. at ... Contribution to Industry Award 2008. Centre for Robotics and Neural Systems, University of Plymouth, 1 Nov 2013. 1v0
  • 6. The Visible Face of Computing Today
  • 7. The Invisible Face of Computing Today: 100s of Billions of computers, each consuming mW! Bringing Embedded Intelligence to the Consumer Market has changed the Face of Computing! (Again)
  • 9. Markets provide the Growth Drivers. [Chart: Millions of Units against year, 1960 to 2020. 1st Era: Select work tasks. 2nd Era: Broad-based computing for specific tasks. 3rd Era: Computing as part of our lives.] Today: ~2% of our Energy Use goes on Computing and Electronics! ... Tomorrow: It could easily be 20%!
  • 10. ARM in the Digital World. 8.7B CPUs shipped in 2012 (growing 20% pa). 75% of the things connected to the Internet today are ARM Powered! (Gartner) [Chart, 1998 to 2020: 40+ billion CPUs to date (2012); 150+ billion CPUs cumulative by 2020.] http://www.arm.com/
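The cumulative figure on this slide can be sanity-checked by compounding the quoted growth rate. This is a rough sketch only: it assumes the 20% pa growth holds unchanged from 2013 through 2020, which the slide does not claim, and the function name is illustrative.

```python
def cumulative_shipments(shipped_to_date, last_year_units, growth, years):
    """Project cumulative units (in billions) after `years` more years of growth."""
    total = shipped_to_date
    units = last_year_units
    for _ in range(years):
        units *= 1 + growth   # this year's shipments
        total += units        # add them to the running total
    return total

if __name__ == "__main__":
    # 40B to date, 8.7B shipped in 2012, eight further years (2013 to 2020).
    projected = cumulative_shipments(40.0, 8.7, 0.20, 8)
    print(f"Projected cumulative CPUs by 2020: {projected:.0f} billion")
```

Under that assumption the total lands well above 150 billion, so the slide's "150+ billion" reads as a conservative figure.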
  • 11. Moore’s Law ... Gordon Moore, Co-founder of Intel (1965). [Chart: Transistors/Chip (M) and Transistor/PM (K) against Approximate Process Geometry, from 100um down to 10nm; ITRS’99.] ... x More Functionality on a Si Chip in 20 yrs! http://en.wikipedia.org/wiki/Moore’s_law
  • 12. A Machine for Computing ... Computing: A general term for algebraic manipulation of data ... Numerated Phenomena IN (x) → y=F(x,t,s) → Processed Data/Information OUT (y) ... State and Time are always factors (variable weight). It can include phenomena ranging from human thinking to calculations with a narrower meaning. It is usually used to exercise analogies (models) of real-world situations; frequently in real-time (fast enough to be a stabilising factor in a loop). (Wikipedia) ... So what part do Hardware and Software play? ... And what about Energy?
  • 13. Antikythera c87BC ... Planet Motion Computer. Mechanical Technology. Inventor: Hipparchos (c.190 BC - c.120 BC), Ancient Greek Astronomer, Philosopher and Mathematician. Single-Task, Continuous Time, Analogue Mechanical Computing (With backlash!) See: http://www.youtube.com/watch?v=L1CuR29OajI
  • 14. Orrery c1700 ... Planet Motion Computer. Mechanical Technology. Inventor: George Graham (1674-1751), English Clock-Maker. Single-Task, Continuous Time, Analogue Mechanical Computing (With backlash!)
  • 15. Babbage's Difference Engine 1837. Mechanical Technology ((Re)construction c2000). The difference engine consists of a number of columns, numbered from 1 to N. Each column is able to store one decimal number. The only operation the engine can do is add the value of a column n + 1 to column n to produce the new value of n. Column N can only store a constant; column 1 displays (and possibly prints) the value of the calculation on the current iteration. Computer for Calculating Tables: A Basic ALU Engine
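The column-addition rule described on this slide is enough to tabulate a polynomial. A minimal sketch, assuming the columns are held in a Python list with index 0 standing for the slide's "column 1"; the function names and the choice of x² as the example are illustrative, not taken from the talk.

```python
def step(columns):
    """One engine cycle: add column n+1 into column n, lowest column first.

    Working upwards means each addition uses the not-yet-updated value of
    the column above, and the last column (a constant) is never changed.
    """
    for n in range(len(columns) - 1):
        columns[n] += columns[n + 1]

def tabulate(initial, count):
    """Return `count` successive values shown by the first column."""
    columns = list(initial)
    values = []
    for _ in range(count):
        values.append(columns[0])
        step(columns)
    return values

if __name__ == "__main__":
    # f(x) = x^2: f(0) = 0, first difference 1, constant second difference 2.
    print(tabulate([0, 1, 2], 5))
```

Seeding the columns with the function value and its finite differences at x = 0 is all the "programming" the machine needs; every later value comes from additions alone.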
  • 18. “Baby” 1947 (Reconstruction). Valve/Software Technology. General Purpose, Quantised Time and Data, (Digital) Electronic Computing
  • 19. Signal Processing. BTH Crystal Set (c1925): 1 Diode. Tele-Verta Radio (c1945): 4 Valves, 1 Rectifier Valve. Bush Radio (c1960): 7 Transistors, 1 Diode. Evoke DAB Radio (c2005): 100 M Transistors, 2-3 Embedded Processors.
  • 20. Radio as Computation ... Vi → Vrf=Vi*100 → Vif=Vrf*Vlo (with Vlo=Cos(t*1^6)) → Vro='Bandpass'(Vif*1000). Single-Task (Embedded), Real-Time, Analogue (Close-Enough) Computing
  • 21. Radio as Computation ... Valve Technology. Vi → Vrf=Vi*100 → Vif=Vrf*Vlo (with Vlo=Cos(t*1^6)) → Vro='Bandpass'(Vif*1000). Single-Task (Embedded), Real-Time, Analogue (Close-Enough) Computing
  • 22. Radio as Computation ... Valve, Transistor and ‘Integrated Circuit’ Technology. Vi → Vrf=Vi*100 → Vif=Vrf*Vlo (with Vlo=Cos(t*1^6)) → Vro='Bandpass'(Vif*1000). Single-Task (Embedded), Real-Time, Analogue (Close-Enough) Computing
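The mixing stage Vif=Vrf*Vlo is where the "computation" happens: multiplying two cosines produces components at the sum and difference of their frequencies, and the band-pass stage then keeps one of them. A small numerical check of that identity follows; the two frequencies are assumptions chosen for illustration, not values from the slides.

```python
import math

def mixer_output(t, f_rf, f_lo):
    """Vif = Vrf * Vlo for two unit-amplitude cosines at f_rf and f_lo (Hz)."""
    return math.cos(2 * math.pi * f_rf * t) * math.cos(2 * math.pi * f_lo * t)

def sum_and_difference(t, f_rf, f_lo):
    """The same signal written as difference- and sum-frequency components."""
    difference = math.cos(2 * math.pi * (f_rf - f_lo) * t)
    total = math.cos(2 * math.pi * (f_rf + f_lo) * t)
    return 0.5 * (difference + total)

if __name__ == "__main__":
    F_RF = 1.455e6  # assumed incoming carrier, Hz
    F_LO = 1.0e6    # assumed local oscillator, Hz
    for k in range(5):
        t = k * 1e-7
        print(t, mixer_output(t, F_RF, F_LO), sum_and_difference(t, F_RF, F_LO))
```

With these assumed values the difference component sits at 455 kHz, which is the part a band-pass filter would pass on to the next stage.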
  • 23. Computing is Era and Application Related ... Computing: Creating Useful Output from Input ... Architecture: The way this is done on the day. It is the Most Important Product Decision! (HW, SW, Digital, Analogue, Optics, Graphene, Mechanics, Steam, etc)
  • 24. Moore's Real Law: x2 Functionality Every 18mth! Cascade of Technologies supporting Functional growth ... [Chart: Functional Density (units), 10^0 to 10^12, against year, 1960 to 2020. Electronic era: 1975-2005. System era: 2003-2030.] ... The ‘Law’ started with Wood ⇒ Stone ⇒ Bronze ⇒ Iron
  • 25. Computing in a Cool iCon ...
  • 26. ‘A lot’ of Architecture in a Smart Phone ... Computation in many forms
  • 27. Take a Look Inside... Level-1: Modules. The Control Board. http://www.ifixit.com
  • 28. Inside The Control Board (a-side). Level-2: Sub-Assemblies. http://www.ifixit.com
      • Visible Computing Contributors ...
          • Samsung: Flash Memory - NV-MOS (ARM Partner)
          • Cirrus Logic: Audio Codec - Bi-CMOS (ARM Partner)
          • AKM: Magnetic Sensor - MEM-CMOS
          • Texas Instruments: Touch Screen Controller and mobile DDR - Analogue-CMOS (ARM Partner)
          • RF Filters - SAW Filter Technology
      • Invisible Computing Contributors ...
          • OS, Drivers, Stacks, Applications, GSM, Security, Graphics, Video, Sound, etc
          • Software Tools, Debug Tools, etc
  • 29. Inside The Control Board (b-side). Level-2: Sub-Assemblies. http://www.ifixit.com
      • More Visible Computing Contributors ...
          • A4 Processor. Spec: Apple, Design & Mfr: Samsung, Digital-CMOS (nm) ... Provides the iPhone 4 with its GP computing power. (Said to contain ARM A8 600 MHz CPU and other ARM IP)
          • ST-Micro: 3 axis Gyroscope - MEM-CMOS (ARM Partner)
          • Broadcom: Wi-Fi, Bluetooth, and GPS - Analogue-CMOS (ARM Partner)
          • Skyworks: GSM Analogue-Bipolar
          • Triquint: GSM PA Analogue-GaAs
          • Infineon: GSM Transceiver - Anal/Digi-CMOS (ARM Partner)
          • GPS Bluetooth, EDR & FM
  • 30. Level-3: Processor (Nvidia Tegra 3, around 1B transistors). NB: The Tegra 3 is similar to the A4/5, but not used in the iPhone
  • 31. Packing Technology into an iCon
      • Analogue and Digital Design
      • Embedded Software
      • Mechanics, Plastics and Glass
      • Micro-Machines (MEMs)
      • Displays and Transducers
      • Robotics and Test
      • Knowledge and Know-How
      • Research, Education and Training
      • Components, Sub-Systems and Systems; Design, Assembly and Manufacture
      • Metrology, Methodology and Tools
      • ... Involving Many Specialist Businesses ... Round and Round the World ... Not-Least from Europe
  • 32. Architecting your Product
      • ...: Is the cumulative non-functional choices made to support the functional need
          • A Good Architecture is the one that ‘survives’
          • History is written by winners (2nd is for losers)
      • ...: Component Performance may be ‘poor’ as long as System Performance is ‘better’ for its use.
      • Architectural Options ...
          • ...: Business Model (Cost-of Ownership, ROI), TTM (Productivity, History, IP Availability, Know-How), Aesthetics (Power, Quality, Behaviour, Appearance)
          • ...: Analogue, Digital, Mechanical, Optical, RF, Software, Plastics, Metal-forming, Manufacturing, Glass, ...
          • ...: More than 99% of a Product is Reused from its Predecessor
      • ... is assumed (working is expected!) ... It used to be the only consideration!
  • 33. Power Philosophy
      • Hardware Dissipates Power ...
          • Choose Underlying Technology for best power efficiency.
          • One size does not fit all (Products, Applications or Instances)
      • ... Software Doesn’t (But it Tells Hardware To!)
          • Chips can literally melt down under software ‘instruction’
          • Make computing hardware power as ‘Activity’ dependent as possible: Zero Activity => Zero Power
          • Make OS/Apps aware of the power/performance situation, and their options for controlling it (Need Indicators and Levers)
      • ... Think System: It’s how the ‘box’ performs, not the components
  • 34. Core Power Management
      • For Processor and Peripheral Circuitry...
          • Variable/Gated - Clock Domains
          • Variable/Switched - Power Domains
      • Indicators and Levers
          • Allow the software to see and influence what is going on
      • Principles of Core Power Efficiency...
          • Minimise voltage/frequency (P=CV²f) so that the processor has just enough performance for the current application need
          • Maximise ‘Activity Power’ dependence (Zero Activity => Zero Power)
          • Management by the OS and the Application SW
          • Apply to all on & off-chip zones (not just the CPU) ...
      • Methodology: Retention Flops/Latches, Level Shifters, Power-Switch Cells, PLLs
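The "just enough performance" principle amounts to picking the cheapest voltage/frequency operating point that still meets demand. A minimal sketch under stated assumptions: the operating-point table, its values and the function names are hypothetical, and only dynamic power (P=CV²f) is modelled, ignoring leakage.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OperatingPoint:
    freq_mhz: float
    volts: float

# Hypothetical voltage/frequency table; real values are silicon specific.
POINTS = [
    OperatingPoint(200, 0.8),
    OperatingPoint(600, 0.9),
    OperatingPoint(1000, 1.1),
]

def dynamic_power(point, capacitance=1.0):
    """Relative dynamic power, P = C * V^2 * f (arbitrary units)."""
    return capacitance * point.volts ** 2 * point.freq_mhz

def pick_point(required_mhz, points=POINTS):
    """Lowest-power point that still delivers the required frequency.

    If nothing is fast enough, fall back to the fastest point available.
    """
    candidates = [p for p in points if p.freq_mhz >= required_mhz]
    if not candidates:
        return max(points, key=lambda p: p.freq_mhz)
    return min(candidates, key=dynamic_power)

if __name__ == "__main__":
    for demand in (100, 500, 900, 2000):
        chosen = pick_point(demand)
        print(demand, chosen, round(dynamic_power(chosen), 1))
```

The demand figure plays the role of the slide's "Indicator" and the chosen operating point is the "Lever"; how an OS actually estimates demand is outside this sketch.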
  • 35. Architectural Energy Efficiency - Parallelism. [Diagram: one Processor running at f (Input → Output), against two Processors each running at f/2 and sharing the same Input and Output.]
      • Single processor: Capacitance = C, Voltage = V, Frequency = f, Power = CV²f
      • Two processors: Capacitance = 2.2C, Voltage = 0.6V, Frequency = 0.5f, Power = (2.2*0.6*0.6*0.5)CV²f ≈ 0.4CV²f
      • To a limit determined by Amdahl’s or Gustafson’s Law ...
          • Amdahl: Extracted parallelism from existing code (Reuse)
          • Gustafson: Some needs only benefit from parallelism (Custom)
      • ... Actual improvement is application specific.
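The slide's power comparison and the Amdahl limit it mentions can both be written as one-line formulas. A short sketch using the slide's own ratios (2.2C, 0.6V, 0.5f); the function names are illustrative, and the Amdahl formula is the standard textbook form rather than anything specific to this talk.

```python
def relative_dynamic_power(cap_ratio, volt_ratio, freq_ratio):
    """Power relative to the baseline C*V^2*f, given ratios of C, V and f."""
    return cap_ratio * volt_ratio ** 2 * freq_ratio

def amdahl_speedup(parallel_fraction, processors):
    """Amdahl's Law: speedup when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)

if __name__ == "__main__":
    # Two half-speed processors at reduced voltage, as on the slide.
    print(relative_dynamic_power(2.2, 0.6, 0.5))
    # The throughput match only holds if the work really splits in two.
    for fraction in (1.0, 0.9, 0.5):
        print(fraction, amdahl_speedup(fraction, 2))
```

The power saving assumes the two half-speed processors together match the original throughput, which is only true when the workload is fully parallel; for anything less, the Amdahl figure shows how much throughput is given up.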
  • 36. Architectural Energy Efficiency - Data
      • Moving Data takes significant Energy
          • Becoming the dominant energy consumption in a system
      • Data Location
          • Avoid moving or copying Data
          • Energy ∝ DataVolume x Speed x Distance^(>2 (3))
          • Bring the processing to the data
      • Bring the Processing to the Data
          • Caching is good (depends on implementation)
          • Write back is better than write-through
          • Local working memory is good
          • Aka Software Caching
      • ... The Arrangement of your Data matters!
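The "write back is better than write-through" point can be illustrated by counting how many stores actually reach main memory under each policy. This is a toy model under stated assumptions: a tiny fully associative cache with LRU replacement and write-allocate, stores only, and a final flush of dirty lines; none of these details come from the talk.

```python
from collections import OrderedDict

def memory_writes(addresses, policy, capacity=2):
    """Count main-memory writes caused by a sequence of store addresses."""
    if policy not in ("write-through", "write-back"):
        raise ValueError(f"unknown policy: {policy}")
    cache = OrderedDict()  # address -> dirty flag, oldest entry first
    writes = 0
    for addr in addresses:
        if addr in cache:
            cache.move_to_end(addr)          # mark as most recently used
        else:
            if len(cache) >= capacity:
                _, dirty = cache.popitem(last=False)  # evict the LRU line
                if dirty:
                    writes += 1              # write-back pays on eviction
            cache[addr] = False
        if policy == "write-through":
            writes += 1                      # every store goes to memory
        else:
            cache[addr] = True               # just mark the line dirty
    if policy == "write-back":
        writes += sum(1 for dirty in cache.values() if dirty)  # final flush
    return writes

if __name__ == "__main__":
    stores = [0] * 100 + [1] * 100  # repeated updates to two locations
    for policy in ("write-through", "write-back"):
        print(policy, memory_writes(stores, policy))
```

When the same locations are updated repeatedly, write-back collapses many stores into one memory transfer per line; when every store touches a new location the two policies cost the same, which is why the slide adds "depends on implementation".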
  • 37. All ARM Processors are Power Efficient
  • 38. Choose The Horses for The Course: About 50MTr ... About 50KTr ... Delivering ~5x speed (Architecture + Process + Clock)
  • 39. Multicore ARM On-Chip ... Heterogeneous Multicore Systems have been in ARM for a long time: Application (Cortex™-A8), UI & 3D Graphics (Mali™-400 MP), Power Manager (Cortex-M3), Interconnect, Memory
  • 40. Coherent Multicore Cluster ... Homogeneous Multicore cluster, as part of a heterogeneous system: Cortex-A9 … Cortex-A9 with Coherency Logic, Mali-400 MP (User Interface and 3D graphics), Cortex-M3 (Power Manager), Interconnect
  • 41. Multiple Clusters ... Multiple Homogeneous Coherent Clusters: Cortex-A15 … Cortex-A15 with Coherency Logic in L2 Cache; Cortex-A15 … Cortex-A15 with Coherency Logic in L2 Cache; Coherent Interconnect
  • 42. Computer On a Chip c2010 ... Today’s Consumers require a pocket ‘Super-Computer’ ... http://www.arm.com/
      • Silicon Technology Provides a Billion transistors ...
      • It will be supported with a few GB of memory ...
      • Typically 10 Processors ...
          • 4 x A9 Processors (2x2)
          • 4 x MALI 400 Frag. Proc
          • 1 x MALI 400 Vertex Proc
          • 1 x MALI Video CoDec
          • Software Stacks, OSs and Design Tools
      • ARM Technology gives chip/system designers ...
          • Improved Productivity
          • Improved TTM
          • Improved Quality/Certainty
  • 43. CoreLink™ CCN-504 and DMC-520. [Block diagram.]
      • Heterogeneous processors: CPU, GPU, DSP and accelerators
      • Up to 4 coherent clusters of Quad Cortex-A15 with L2 cache; up to 4 cores per cluster
      • Virtualized Interrupts; Interrupt Control
      • CoreLink™ CCN-504 Cache Coherent Network: Integrated L3 cache (8-16MB), Snoop Filter
      • CoreLink™ DMC-520: Dual channel DDR3/4 x72 (x72 DDR4-3200 per channel); Uniform System memory
      • NIC-400 Network Interconnect; IO Virtualisation with System MMU
      • I/O: DSP, PCIe, DPI, Crypto, USB, SATA, 10-40 GbE, Flash, GPIO (ACE and AHB interfaces; Peripheral address space)
      • Up to 18 AMBA interfaces for I/O coherent accelerators and IO
  • 44. Methodology As Well As Hardware: C/C++; Debug & Trace; Development; Energy Trace Modules; Middleware
  • 45. big.LITTLE Processing
      • For High-Performance systems...
      • Tightly coupled combination of two ARM CPU clusters:
          • Cortex-A15 and Cortex-A7 - functionally identical
          • Same programmers view, looks the same to OS and applications
      • big.LITTLE combines high-performance and low power
          • Automatically selects the right processor for the right job
          • Redefines the efficiency/performance trade-off
      • [Charts comparing a current smartphone with big.LITTLE. big, “Demanding tasks”: >2x Performance. LITTLE, “Always on, always connected tasks”: 30% of the Power (select use cases).]
  • 46. Fine-Tuned to Different Performance Points
      • LITTLE (Cortex-A7, Cortex-A53): Most energy-efficient applications processor from ARM
          • Simple, in-order, 8 stage pipelines
          • Performance better than mainstream, high-volume smartphones (Cortex-A8 and Cortex-A9)
      • big (Cortex-A15, Cortex-A57): Highest performance in mobile power envelope
          • Complex, out-of-order, multi-issue pipelines
          • Up to 2x the performance of today’s high-end smartphones
  • 47. big.LITTLE Software
      • CPU Migration
          • Migrate a single processor workload to the appropriate CPU
          • Migration = save context then resume on another core
          • Also known as Linaro “In Kernel Switcher”
          • DVFS driver modifications and kernel modifications
          • Based on standard power management routines
          • Small modification to OS and DVFS, ~600 lines of code
      • big.LITTLE MP
          • OS scheduler moves threads/tasks to appropriate CPU
          • Based on CPU workload
          • Based on dynamic thread performance requirements
          • Enables highest peak performance by using all cores at once
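The CPU-migration idea above can be pictured as a small policy that moves a workload between clusters when its load crosses a threshold. This is an illustrative sketch only, not the Linaro switcher: the thresholds, the load measure (a 0 to 1 utilisation figure) and the function names are all assumptions.

```python
# Hypothetical thresholds; a gap between them avoids bouncing between clusters.
UP_THRESHOLD = 0.85    # migrate LITTLE -> big above this load
DOWN_THRESHOLD = 0.30  # migrate big -> LITTLE below this load

def next_cluster(current, load):
    """Decide which cluster should run the workload for the next interval."""
    if current == "LITTLE" and load > UP_THRESHOLD:
        return "big"
    if current == "big" and load < DOWN_THRESHOLD:
        return "LITTLE"
    return current

def run(loads, start="LITTLE"):
    """Apply the policy to a sequence of load samples and record the choices."""
    cluster = start
    trace = []
    for load in loads:
        cluster = next_cluster(cluster, load)
        trace.append(cluster)
    return trace

if __name__ == "__main__":
    print(run([0.2, 0.9, 0.5, 0.1]))
```

A real implementation also has to pay the cost of saving and restoring context on each switch, which is one reason to keep the two thresholds well apart.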
  • 48. Bringing the Processing to the Data … 288 server nodes in a 4U rack space. Press Claims: Dell + Marvell, Copper; BaiDu + Marvell, Baserock. Public Source: http://www.engadget.com/2011/11/02/hp-and-calxedas-moonshot-arm-servers-will-bring-all-the-boys-to/
  • 49. ... Refining Data into Information
  • 50. Transferable Lessons to GP Software
      • Moving data is Power Expensive ...
          • Don’t move data; use it locally (Cache it)
          • Refine it once, use it often (Pre-Process it)
      • Your CPU Power is work-load independent ...
          • So, get in; get the work done; and get out.
          • Maximise the workload of your code; terminate when complete.
      • Make your Processing work-load dependent
          • Use a Hypervisor and turn off (at least free) processors not in use.
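The "get in; get the work done; and get out" advice follows from a simple energy budget: if the CPU draws much the same power whenever it is awake, the energy bill is set by how long it stays awake. A minimal sketch with made-up power figures (2 W awake, 0.1 W idle); the function name and numbers are assumptions for illustration.

```python
def energy_joules(active_watts, idle_watts, busy_seconds, window_seconds):
    """Energy used over a time window: awake for busy_seconds, idle for the rest."""
    if not 0 <= busy_seconds <= window_seconds:
        raise ValueError("busy time must fit inside the window")
    idle_seconds = window_seconds - busy_seconds
    return active_watts * busy_seconds + idle_watts * idle_seconds

if __name__ == "__main__":
    # Same job, 10 s window: finish in 2 s then idle, or stay awake throughout.
    print("race to idle:", energy_joules(2.0, 0.1, 2.0, 10.0), "J")
    print("always awake:", energy_joules(2.0, 0.1, 10.0, 10.0), "J")
```

With these figures finishing early and idling costs about a quarter of the energy of staying awake for the whole window; the gap narrows as hardware power becomes more activity dependent.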
  • 51. Society’s Challenges in the 21c
      • Urbanisation (Smart Cities)
      • Health (eHealth)
      • Transport
      • Energy (Smart Grid)
      • Security
      • Environment
      • Food/Water
      • Ageing Society
      • Sustainability
      • Digital Inclusion
      • Economics
      • And whilst our technologies will be an essential part of all solutions, they cannot fix them without Society’s help and cooperation! ... Energy Efficient Computing will minimise the impact, not avert the challenges!
      • Having a great time!
  • 52. Conclusions
      • Putting the power of Computation into the hands of the masses has changed the face of Computing (again)
          • Electronic Systems will become Essential to our Lives and the Economy
      • Power Efficient ES are a major issue to Society
          • Which faces a future with them as a significant energy consumer in themselves
      • Power Efficiency must be architected into the System Hardware and Software from the beginning
          • To realise the maximum potential out of your Silicon (Avoiding Dark Si)
          • Architect & Design HW as efficiently as possible (reflecting the task)
          • Strive for: No Work => No Power
          • Equip HW with Indicators and Levers so the System/App can manage it
      • Bring Processing to the Data ...
          • Don’t move Data; move Information
          • Process data Locally
          • Energy ∝ DataVolume x Speed x Distance^(>2 (3))
  • 53. Computing at the heart of the 21c. ARM: Enabling the Creation of High-Performance Electronic Systems
      • Productively, Economically and Reliably
      • Through Hw/Sw Reuse Methodologies
      • Based on a family of CPU/GPU cores