OSCAR Compiler Controlled Multicore Power Reduction on Android Platform
Hideo Yamamoto¹, Tomohiro Hirano¹, Kohei Muto¹, Hiroki Mikami¹, Takashi Goto¹, Dominic Hillenbrand¹, Moriyuki Takamura², Keiji Kimura¹ and Hironori Kasahara¹
¹Green Computing Systems Research and Development Center, Waseda University
²FUJITSU LABORATORIES LTD.
LCPC2013

Presentation Outline
•  Background
–  Power consumption in multicore
–  Power control mechanism of the OSCAR Compiler
–  Power control on the Android™ platform
•  Experimental
–  Evaluation target, power rail and measurement device
–  Precise power measurement method using GPIO
–  Bind mode
–  Clock gating method using WFI instruction
•  Highlight event in data
–  Power consumption of MPEG2 decoder
•  Conclusion

BACKGROUND

A Plethora of Smart Devices
[Figure: timeline (2007, 2011, 2013, 2014) of smart-device processor configurations moving toward high performance devices: Linux on ARM11/Cortex-A8; Linux 2-core SMP (2 x Cortex-A9); Linux 4-core SMP (4 x Cortex-A9); Linux 5-core 4+1 vSMP (5 x Cortex-A9); Linux 8-core HMP (4 x Cortex-A15 + 4 x Cortex-A7); Linux 8-core big.LITTLE (4 x Cortex-A15 + 4 x Cortex-A7).]
Cumulative smart device shipment
–  iOS: 700,000,000
–  Android: 1,000,000,000

Power Consumption in Multicore
•  Uni Core
P = f*c*v^2 (Eq. 1)
•  Multi Core
P = n*f*c*v^2 (Eq. 2)
In the quad-core case, 'f' can be reduced to ¼ while keeping the same performance. If 'v' can be lowered to 0.6 v at ¼ 'f', power consumption is reduced to 0.36 of the uni-core value.

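The 0.36 figure follows directly from Eq. 2. The sketch below is only an illustration of that arithmetic: the helper name `relative_power` and the normalization against the uni-core case of Eq. 1 are assumptions made here, not part of the original slides.

```c
#include <assert.h>

/* Relative dynamic power from Eq. 2, P = n*f*c*v^2, normalized so that one
 * core at full frequency and full voltage (Eq. 1) equals 1.0. The
 * capacitance c cancels out of the ratio.
 * relative_power is a hypothetical helper name used for illustration. */
double relative_power(int n_cores, double f_scale, double v_scale)
{
    assert(n_cores > 0 && f_scale > 0.0 && v_scale > 0.0);
    return n_cores * f_scale * v_scale * v_scale;
}
```

Calling `relative_power(4, 0.25, 0.6)` evaluates 4 x 0.25 x 0.36, which is the 0.36 quoted on the slide.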
OSCAR Compiler
Waseda University
Multigrain Parallel Processing
• Hierarchical and Global Parallelization
• Coarse grain task parallel
• Loop iteration parallel
• Statement level parallel
Data Locality Optimization
• Task (or loop) decomposition considering cache size or local memory size
• Task scheduling considering data affinity
Low power optimization
• Power scheduling with DVFS, clock gating and power gating by software
[Figure labels: Doall loop; Seq. loop; Task level or statement level parallelization]

Power Control Mechanism of the OSCAR Compiler
•  Estimate execution time of each MT and find critical path
•  Determine execution time to satisfy the given deadline
•  Decide optimal frequency and voltage of each MT.
[Figure: static scheduled MTG of MT1 to MT9 on Core0, Core1 and Core2, shown along the time axis before and after power scheduling against the given deadline. After power scheduling, MT5 and MT3 run at low frequency, the margin before the deadline is covered by clock gating, and idle periods are covered by power gating. Labels: Time management; Power scheduling with DVFS, clock gating and power gating by software.]

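The three steps above can be pictured with a much simplified, single-task sketch: given an estimated cycle count and a deadline, choose the slowest operating point that still meets the deadline. This is not the OSCAR compiler's actual scheduling algorithm. The function name `pick_op_point` and the cost model (execution time = cycles / frequency) are assumptions, and the operating points are the frequency and voltage pairs that appear on the MPEG2 waveform slides.

```c
#include <assert.h>
#include <stddef.h>

/* One DVFS operating point: clock frequency in MHz and core voltage in volts. */
struct op_point {
    double freq_mhz;
    double volt;
};

/* Operating points quoted on the MPEG2 waveform slides, slowest first. */
static const struct op_point OP_POINTS[] = {
    { 200.0,  0.92 },
    { 400.0,  1.05 },
    { 1700.0, 1.40 },
};
#define N_OP_POINTS (sizeof(OP_POINTS) / sizeof(OP_POINTS[0]))

/* Return the index of the slowest operating point that finishes `mcycles`
 * (millions of cycles) within `deadline_ms`, or -1 if even the fastest
 * point misses the deadline. Hypothetical helper; the linear
 * cycles / frequency model is an assumed simplification. */
int pick_op_point(double mcycles, double deadline_ms)
{
    assert(mcycles >= 0.0 && deadline_ms > 0.0);
    for (size_t i = 0; i < N_OP_POINTS; i++) {
        /* mcycles [1e6 cycles] / freq [1e6 cycles per second] = seconds */
        double exec_ms = mcycles / OP_POINTS[i].freq_mhz * 1000.0;
        if (exec_ms <= deadline_ms)
            return (int)i;
    }
    return -1;
}
```

For a 33 ms deadline, a task of 10 million cycles would select index 1 (400MHz, 1.05V) under this model, because 200MHz would need 50 ms.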
Power Control on Android
•  CPUFreq
– Frequency and voltage scaling of a target CPU
•  CPUIdle
– Manages the level of idle on each core of the CPU
•  HotPlug (> 10 ms)
– Extended function of CPUFreq and CPUIdle
– Adds another core to distribute the load under high utilization
– Shuts down excess cores under low utilization
– Decides whether cores go online or offline by heuristic adaptation

Problems of Linux power control and parallel processing
•  Hotplug can't bring cores online and bind threads swiftly
–  In the worst case it needs several hundred milliseconds
–  Startup time 440.6ms
•  Non real-time
–  Linux can't control time at a fine resolution below 5-10ms

Background
•  Motivation
–  Parallelization is effective for low power execution with DVFS, power gating and clock gating
–  OSCAR compiler has the capability to generate power control API automatically
•  Obstacle
–  Linux needs a long startup time for distributing load to multicores
–  Lack of fine-resolution time control
•  Challenge
–  Low power execution on the Android platform by parallelization

EXPERIMENTAL

Evaluation board - ODROID-X2
•  Samsung Exynos4412 Prime
– ARM Cortex-A9 Quad core
– Maximum clock frequency 1.7GHz
– Used by Samsung's Galaxy S3
•  DVFS can't be applied to each core independently
•  Android Open Source version is in place
•  Circuit schematic is available on request
[Photo label: SoC Exynos4412]

Power Rail for Exynos4412
•  Exynos4412 is powered by 4 PMIC (Power Management IC) voltage rails
–  VDD_ARM: CORE
–  VDD_INT: Interrupt controller and L2
–  VDD_G3D: GPU
–  VDD_MIF: DDR Memory
•  Power consumption of VDD_ARM (CORE) has been measured
[Figure: the PMIC supplies VDD_ARM to four Cortex-A9 cores (each with 32KB I/D and NEON), VDD_INT to the interrupt controller + L2, VDD_G3D to the GPU, and VDD_MIF to the DDR.]

Modified Circuit Diagram of ODROID-X2
[Figure: measurement setup capturing current and voltage. Voltage (V) x Current (A) = Power (W)]

How to measure CORE power on ODROID-X2
•  Adding a 40 mΩ shunt resistor to VDD_ARM
[Figure: the shunt sits between the PMIC and the SoC; an instrumentation amp measures the voltage drop across it.]

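The measurement chain reduces to Ohm's law plus the Voltage x Current = Power relation from the circuit diagram slide. The sketch below illustrates that conversion. The function names are hypothetical, and `drop_v` is assumed to be the voltage across the shunt itself, with any instrumentation amp gain already divided out.

```c
#include <assert.h>

#define SHUNT_OHMS 0.040  /* 40 milliohm shunt on VDD_ARM, from the slide */

/* Current through the shunt from the measured voltage drop (Ohm's law).
 * Hypothetical helper; drop_v is the drop across the shunt in volts. */
double shunt_current_a(double drop_v)
{
    assert(drop_v >= 0.0);
    return drop_v / SHUNT_OHMS;
}

/* CORE power in watts: rail voltage (V) x current (A). */
double core_power_w(double rail_v, double drop_v)
{
    return rail_v * shunt_current_a(drop_v);
}
```

As an illustrative input, a 20 mV drop corresponds to 0.5 A, and at a 1.4 V rail that is 0.7 W.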
Synchronization between program and waveforms using GPIO

“bind” mode
•  Core assignment logic of Android Linux hotplug is heuristic
•  A new core assignment mode called “bind” mode is developed for efficient parallel execution
•  “bind” mode is integrated in Android Linux as OSCAR runtime and API
•  Specification of OSCAR API for “bind” mode
–  Core 0 is reserved for the Android system and non-OSCAR parallel programs
–  Application can disable hotplug and control the Core ON/OFF line
–  Application can bind Cores 1, 2 and 3 to the OSCAR parallel program
Startup time 7.2ms

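The core partition in the “bind” mode specification can be pictured as an affinity bitmask. The sketch below is not the OSCAR runtime API, whose interface the slides do not show. The names `oscar_core_mask` and `oscar_worker_core` are hypothetical and only encode the rule that Core 0 stays with the Android system while Cores 1, 2 and 3 run the OSCAR parallel program.

```c
#include <assert.h>

#define NUM_CORES   4  /* Exynos4412: quad-core Cortex-A9 */
#define SYSTEM_CORE 0  /* reserved for Android and non-OSCAR programs */

/* Bitmask of cores available to the OSCAR parallel program: bit i set
 * means core i may run OSCAR threads. Core 0 is always excluded.
 * Hypothetical helper for illustration. */
unsigned int oscar_core_mask(void)
{
    unsigned int mask = 0;
    for (int core = 0; core < NUM_CORES; core++) {
        if (core != SYSTEM_CORE)
            mask |= 1u << core;
    }
    return mask;
}

/* Core that OSCAR worker thread `worker` (0, 1 or 2) is bound to. */
int oscar_worker_core(int worker)
{
    assert(worker >= 0 && worker < NUM_CORES - 1);
    return worker + 1;
}
```

On a stock Linux system a mask like this would typically be applied with `sched_setaffinity`; the slides instead describe a dedicated runtime that also disables hotplug.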
Clock gating
•  WFI instruction
– The WFI instruction suspends the execution of the processor core and stops the clock until a timer event
•  Clock gating driver using WFI instruction
– The WFI instruction is a privileged instruction
– The API allows a user program to execute the WFI instruction within a Linux driver

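Fine timing control by WFI driver
Time slot is 500 us. For a request of N slots, wake-up occurs at time T with (N-1) x 500us < T < N x 500us.

while(1) {
  gpio_value(1);      /* raise GPIO: marks the start of clock gating on the waveform */
  call_wfi_api(1);    /* clock gating for 1 slot: 0us < T < 500us */
  gpio_value(0);      /* lower GPIO: core has woken up */
}

while(1) {
  gpio_value(1);
  call_wfi_api(4);    /* clock gating for 4 slots: 1500us (3 slot) < T < 2000us (4 slot) */
  gpio_value(0);
}

[Figure: current waveforms (250mA and 500mA levels) with the GPIO trace marking each clock gating interval and wake up.]
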
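The slot rule (N-1) x 500us < T < N x 500us can be written out as a small helper. This is an illustrative sketch only: `wfi_wake_window` and `wfi_slots_for` are hypothetical names and not part of the WFI driver API shown on the slide.

```c
#include <assert.h>

#define SLOT_US 500u  /* time slot length from the slide */

/* Bounds on the wake-up time T for a request of n_slots:
 * (n_slots - 1) * 500us < T < n_slots * 500us. */
struct wake_window {
    unsigned int min_us;  /* exclusive lower bound */
    unsigned int max_us;  /* exclusive upper bound */
};

/* Hypothetical helper: wake-up window for a given slot count. */
struct wake_window wfi_wake_window(unsigned int n_slots)
{
    assert(n_slots >= 1);
    struct wake_window w = { (n_slots - 1) * SLOT_US, n_slots * SLOT_US };
    return w;
}

/* Smallest slot count whose window lies entirely after min_sleep_us,
 * i.e. (n - 1) * 500 >= min_sleep_us. */
unsigned int wfi_slots_for(unsigned int min_sleep_us)
{
    return (min_sleep_us + SLOT_US - 1) / SLOT_US + 1;
}
```

With these definitions, a request of 4 slots gives the 1500us to 2000us window from the slide, and asking for at least 1500us of gating yields 4 slots.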
Current waveform of busy wait without clock gating
[Figure: current waveform of busy wait in ordinary execution for 1 core, 2 cores, 3 cores and 4 cores; current axis marked at 500mA, 1000mA, 1500mA and 2000mA.]

Current waveform of busy wait with clock gating
[Figure: current waveform of busy wait with clock gating for 1 core, 2 cores, 3 cores and 4 cores; current axis marked at 500mA, 1000mA, 1500mA and 2000mA. Annotations: Wake up all cores; Clock gating all cores.]

 
Comparison of current waveforms
[Figure: busy wait in ordinary execution and busy wait with clock gating shown together, each for 1 core, 2 cores, 3 cores and 4 cores; current axis marked at 500mA, 1000mA, 1500mA and 2000mA. Annotations: Wake up all cores; Clock gating all cores.]

MPEG2 DECODER
Highlight data

Power Consumption of MPEG2 Decoder on ODROID-X2
[Figure: bar chart of power consumption versus number of cores, with power reduction control and without power reduction control; annotated reductions 1/7 (13.3%) and 1/3 (38.1%).]
Demo

Power Waveform of MPEG2 Decoder for 1 Core
(a) Without Power Reduction Control (1.7GHz, 1.4V): MPEG2 decode execution in high clock and voltage, then busy wait execution.
(b) With Power Reduction Control (1.7GHz, 1.4V): MPEG2 decode execution in high clock and voltage, then clock gating by WFI; the busy wait power is reduced by WFI.
[Figure legend: Consumed; Reduced]

Power Waveform of MPEG2 Decoder for 3 Core
(a) Without Power Reduction Control (1.7GHz, 1.4V): MPEG2 decode execution in high clock and voltage, then busy wait execution.
(b) With Power Reduction Control (400MHz, 1.05V and 200MHz, 0.92V): MPEG2 decode execution in low clock and voltage by DVFS (P = n*f*c*V^2), then clock gating by WFI; the remaining wait power is reduced by WFI.
[Figure legend: Consumed; Reduced]

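Power Consumption of MPEG2 Decoder on ODROID-X2
[Figure: bar chart of consumed and reduced power versus number of cores, with reductions attributed to WFI and DVFS; labeled values 2.79, 0.97, 0.63 and 0.37; annotated reduction 1/3 (38.1%).]
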
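The percentages annotated on the MPEG2 charts can be reproduced from the labeled bar values. In the sketch below, `percent_of` is a hypothetical helper, and pairing 0.37 with 0.97 and with 2.79 is an assumption about which bars the annotations compare, based on the values shown.

```c
#include <assert.h>

/* Power after optimization expressed as a percentage of a baseline power.
 * Hypothetical helper for illustration. */
double percent_of(double after_w, double before_w)
{
    assert(before_w > 0.0);
    return 100.0 * after_w / before_w;
}
```

`percent_of(0.37, 0.97)` comes to about 38.1 and `percent_of(0.37, 2.79)` to about 13.3, which agree with the 1/3 (38.1%) and 1/7 (13.3%) annotations.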
Conclusions
•  The ODROID-X2 circuit is modified such that
1.  Precise power waveforms at the output of the PMIC are observed, and
2.  The power waveforms and parallel program events are interrelated in timing for OSCAR compiler optimization.
•  An efficient parallel program execution platform on Android is established by
1.  “bind” mode, and
2.  The WFI instruction by the OSCAR compiler.
•  The newly developed OSCAR compiler power control mechanism has decreased the power to about one third, from 0.97 Watt in 1-core to 0.37 Watt in 3-core, in running the MPEG2 decoder on the Android platform.

BACKUP SLIDE

OPTICAL FLOW
Highlight data

Power Consumption of Optical Flow on ODROID-X2
[Figure: bar chart of power consumption; annotated values 13.4% and 31.5%.]

Power Waveform of Optical Flow for 1 core
[Figure: power waveform showing Optical Flow execution followed by busy wait execution, compared with clock gating by WFI, which reduces the power of wasted CPU cycles.]

Power Waveform of Optical Flow for 3 core
[Figure: Optical Flow execution in high clock and voltage followed by busy wait execution, compared with Optical Flow execution in low clock and voltage (P = n*f*c*V^2) followed by clock gating by WFI.]

Low-power code with OSCAR API
[Figure: compilation flow. Sequential programs (C/Fortran) are input to the OSCAR Compiler (multigrain parallelization, memory optimization, data transfer optimization, DVFS, clock gating), which outputs low-power parallel C/Fortran programs including OSCAR API. The backend compiler (API decoder and native compiler) translates the OSCAR API into library calls and produces the executable object. Scheduled tasks: Proc0: T1, off; Proc1: T2, T4; Proc2: T3, T6 (slow).]
OSCAR API directives shown on the slide:
#pragma oscar fvcontrol(pe, (id, state))
#pragma oscar get_fvstatus(pe, id, state)
#pragma oscar get_current_time(current, timer_no

ODROID Original
[Figure: schematic and layout of the original VDD_ARM supply. Labels: PMIC, L, C, GND, VDD_ARM.]

ODROID After rework
[Figure: schematic and layout of the reworked VDD_ARM supply with the added resistor R. Labels: PMIC, L, R, C, GND, VDD_ARM, Single 5 Pin, Drop Voltage, Voltage.]

How hotplug works
[Figure: timeline of core states under the hotplug governor. Labels: up, down, idle, disable; delays up2g0_delay, up2gn_delay and down_delay between transitions.]

Auto hotplug governor
tegra_cpu_set_speed_cap

int tegra_cpu_set_speed_cap(unsigned int *speed_cap)
{
	/* ... */
	unsigned int new_speed = tegra_cpu_highest_speed();
	/* ... */
	new_speed = tegra_throttle_governor_speed(new_speed);
	new_speed = edp_governor_speed(new_speed);
	new_speed = user_cap_speed(new_speed);
	/* ... */
	ret = tegra_update_cpu_speed(new_speed);
	/* ... */
	tegra_auto_hotplug_governor(new_speed, false);
	/* ... */
}

tegra_auto_hotplug_governor

parameters     LP-mode          GP-mode
up_delay       up2g0_delay      up2gn_delay
down_delay     down_delay       down_delay
top_freq       idle_top_freq    idle_bottom_freq
bottom_freq    0                idle_bottom_freq

Current State   Compare with requested freq   New State   Delay to take effect
IDLE            > top_freq                    UP          up_delay
IDLE            <= bottom_freq                DOWN        down_delay
DOWN            > top_freq                    UP          up_delay
DOWN            > bottom_freq                 IDLE        NA
UP              < bottom_freq                 DOWN        down_delay
UP              <= top_freq                   IDLE        NA

[Figure labels: Throttle_table, throttle_index, Update from user, thermal_cooling_device, Edp_Thermal, Auto Hot plug, Suspend, CpuFreq]
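The state transition table above can be expressed as a small pure function. This is a sketch of the table as printed on the slide, not the kernel's implementation: the enum and function names are hypothetical, and the delays (up_delay, down_delay) that the governor applies before a transition takes effect are left out.

```c
#include <assert.h>

/* Governor states from the transition table (hypothetical names). */
enum hp_state { HP_IDLE, HP_DOWN, HP_UP };

/* Next state for a requested frequency, following the table row by row.
 * If no row matches, the current state is kept. */
enum hp_state hp_next_state(enum hp_state cur, unsigned int freq,
                            unsigned int top_freq, unsigned int bottom_freq)
{
    assert(bottom_freq <= top_freq);
    switch (cur) {
    case HP_IDLE:
        if (freq > top_freq)     return HP_UP;    /* after up_delay */
        if (freq <= bottom_freq) return HP_DOWN;  /* after down_delay */
        break;
    case HP_DOWN:
        if (freq > top_freq)     return HP_UP;    /* after up_delay */
        if (freq > bottom_freq)  return HP_IDLE;
        break;
    case HP_UP:
        if (freq < bottom_freq)  return HP_DOWN;  /* after down_delay */
        if (freq <= top_freq)    return HP_IDLE;
        break;
    }
    return cur;
}
```

Keeping the function free of timers makes the table easy to check in isolation; a real governor would schedule the delayed work around it.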

WordPress Websites for Engineers: Elevate Your Brand
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 

Oscar compiler for power reduction

  • 1. OSCAR Compiler Controlled Multicore Power Reduction on Android Platform
      Hideo Yamamoto¹, Tomohiro Hirano¹, Kohei Muto¹, Hiroki Mikami¹, Takashi Goto¹, Dominic Hillenbrand¹, Moriyuki Takamura², Keiji Kimura¹ and Hironori Kasahara¹
      ¹Green Computing Systems Research and Development Center, Waseda University
      ²FUJITSU LABORATORIES LTD.
      LCPC2013
  • 2. Presentation Outline
      • Background
        – Power consumption in multicore
        – Power control mechanism of the OSCAR Compiler
        – Power control on the Android™ platform
      • Experimental
        – Evaluation target, power rail and measurement device
        – Precise power measurement method using GPIO
        – Bind mode
        – Clock gating method using WFI instruction
      • Highlight event in data
        – Power consumption of MPEG2 decoder
      • Conclusion
  • 4. A Plethora of Smart Devices
      [Figure: Linux core configurations of high-performance devices on a timeline marked 2007, 2011, ..., 2013, 2014]
      • Linux, single core: ARM11 / Cortex-A8
      • Linux, 2-core SMP: 2 × Cortex-A9
      • Linux, 4-core SMP: 4 × Cortex-A9
      • Linux, 8-core HMP: 4 × Cortex-A15 + 4 × Cortex-A7
      • Linux, 8-core big.LITTLE: 4 × Cortex-A15 + 4 × Cortex-A7
      • Linux, 5-core 4+1 vSMP: 5 × Cortex-A9
      • Cumulative smart device shipment: iOS 700,000,000; Android 1,000,000,000
  • 5. Power Consumption in Multicore
      • Uni core: P = f*c*v^2 (Eq. 1)
      • Multi core: P = n*f*c*v^2 (Eq. 2)
      • In the quad-core case (n = 4), 'f' can be reduced to ¼ while keeping the same performance. If 'v' can then be lowered to 0.6 of its original value, Eq. 2 gives 4 × ¼ × 0.6^2 = 0.36, so power consumption is reduced to 0.36 of the uni-core value.
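The quad-core arithmetic on this slide can be checked with a few lines of C. This is only a sketch of Eq. 2 in normalised form: the function name `relative_power` and the idea of passing frequency and voltage as scale factors are illustrative choices, not something taken from the slides.

```c
#include <assert.h>

/* Relative dynamic power from Eq. 2, P = n*f*c*v^2, normalised to one
 * core running at full frequency and full voltage. The capacitance c
 * cancels in the ratio, so only the scale factors are needed.
 * (Hypothetical helper, for illustration only.) */
double relative_power(int n_cores, double f_scale, double v_scale)
{
    assert(n_cores > 0 && f_scale > 0.0 && v_scale > 0.0);
    return n_cores * f_scale * v_scale * v_scale;
}
```

Calling `relative_power(4, 0.25, 0.6)` reproduces the slide's figure of roughly 0.36.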
  • 6. OSCAR Compiler (Waseda University)
      • Multigrain parallel processing
        – Hierarchical and global parallelization
        – Coarse-grain task parallelism
        – Loop iteration parallelism
        – Statement-level parallelism
      • Data locality optimization
        – Task (or loop) decomposition considering cache size or local memory size
        – Task scheduling considering data affinity
      • Low-power optimization
        – Power scheduling with DVFS, clock gating and power gating by software
      [Figure labels: Doall loop, Seq. loop, task-level or statement-level parallelization]
  • 7. Power Control Mechanism of the OSCAR Compiler
      • Estimate the execution time of each MT (macro task) and find the critical path
      • Determine the execution time that satisfies the given deadline
      • Decide the optimal frequency and voltage of each MT
      [Figure: a statically scheduled MTG with MT1 to MT9 on Core0 to Core2, before and after power scheduling. After scheduling, MT5 and MT3 run at low frequency, and the margin before the given deadline is covered by clock gating and power gating. Caption: power scheduling with DVFS, clock gating and power gating by software; time management.]
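The third step above (deciding a frequency and voltage once the deadline is known) can be sketched as a table lookup. This is not the OSCAR compiler's algorithm, only a minimal illustration: the three operating points are the ones quoted on the 3-core MPEG2 waveform slide, while the function name `pick_op_point` and the cycle-count interface are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* One DVFS operating point: clock frequency and supply voltage. */
struct op_point { unsigned freq_mhz; double volt; };

/* Operating points quoted on the 3-core MPEG2 waveform slide,
 * sorted from slowest to fastest. */
static const struct op_point OP_TABLE[] = {
    { 200, 0.92 }, { 400, 1.05 }, { 1700, 1.40 },
};

/* Return the slowest operating point that can execute `cycles` within
 * `deadline_us` microseconds, or NULL if even the fastest point misses
 * the deadline. (Hypothetical helper; real scheduling also has to
 * account for transition overhead and the critical path.) */
const struct op_point *pick_op_point(unsigned long long cycles,
                                     unsigned long long deadline_us)
{
    size_t n = sizeof OP_TABLE / sizeof OP_TABLE[0];
    for (size_t i = 0; i < n; i++) {
        /* f MHz is f cycles per microsecond. */
        unsigned long long capacity =
            (unsigned long long)OP_TABLE[i].freq_mhz * deadline_us;
        if (capacity >= cycles)
            return &OP_TABLE[i];
    }
    return NULL;
}
```

For example, with a made-up workload of 10,000,000 cycles and a 33,000 us deadline, the 200 MHz point is too slow and the function returns the 400 MHz point.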
  • 8. Power Control on Android
      • CPUFreq
        – Frequency and voltage scaling of a target CPU
      • CPUIdle
        – Manages the idle level of each CPU core
      • HotPlug (> 10 ms)
        – Extended function of CPUFreq and CPUIdle
        – Adds another core to distribute the load under high utilization
        – Shuts down excess cores under low utilization
        – Decides whether a core goes online or offline heuristically
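On a stock Linux kernel, the hotplug switch for a core is exposed through sysfs as `/sys/devices/system/cpu/cpuN/online`. The sketch below shows how a user program might toggle it; the helper names are made up for illustration, writing the file normally requires root, and nothing here is claimed to be how the Android hotplug driver or the OSCAR runtime does it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the sysfs path of the hotplug switch for one core.
 * Returns 0 on success, -1 if the buffer is too small. */
int cpu_online_path(char *buf, size_t len, int cpu)
{
    assert(buf != NULL && cpu >= 0);
    int n = snprintf(buf, len, "/sys/devices/system/cpu/cpu%d/online", cpu);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Bring a core online (1) or take it offline (0) by writing to sysfs.
 * Returns 0 on success, -1 on any error (for example, no permission). */
int set_cpu_online(int cpu, int online)
{
    char path[64];
    if (cpu_online_path(path, sizeof path, cpu) != 0)
        return -1;
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    int ok = fputs(online ? "1" : "0", f) >= 0;
    return (fclose(f) == 0 && ok) ? 0 : -1;
}
```

A call such as `set_cpu_online(3, 0)` would request that core 3 be taken offline; the latency of that request is the startup cost discussed on the following slide.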
  • 9. Problems of Linux power control and parallel processing
      • Hotplug can't bring a core online and bind threads swiftly
        – In the worst case it needs several hundred milliseconds (measured startup time: 440.6 ms)
      • Non real-time
        – Linux can't control time at a resolution finer than 5-10 ms
  • 10. Background
      • Motivation
        – Parallelization is effective for low-power execution with DVFS, power gating and clock gating
        – The OSCAR compiler can generate power control API calls automatically
      • Obstacle
        – Linux needs a long startup time to distribute load to multiple cores
        – Lack of fine-resolution time control
      • Challenge
        – Low-power execution on the Android platform by parallelization
  • 12. Evaluation board: ODROID-X2
      • Samsung Exynos4412 Prime
        – ARM Cortex-A9 quad core
        – Maximum clock frequency 1.7 GHz
        – Used by Samsung's Galaxy S3
      • DVFS can't be applied to each core independently
      • The Android Open Source version is available
      • The circuit schematic is available on request
  • 13. Power Rails for Exynos4412
      • The Exynos4412 SoC is powered by four voltage rails from the PMIC (Power Management IC)
        – VDD_ARM: CORE (four Cortex-A9 cores, each with 32 KB I/D caches and NEON)
        – VDD_INT: interrupt controller and L2
        – VDD_G3D: GPU
        – VDD_MIF: DDR memory
      • The power consumption of VDD_ARM (CORE) has been measured
  • 14. Modified Circuit Diagram of ODROID-X2
      [Figure: modified circuit with voltage and current measurement points]
      Voltage (V) × Current (A) = Power (W)
  • 15. How to measure CORE power on ODROID-X2
      • A 40 mΩ shunt resistor is added to VDD_ARM
      [Figure: PMIC → shunt → SoC; an instrumentation amplifier measures the voltage drop across the shunt]
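The measurement chain on the last two slides reduces to Ohm's law followed by P = V × I. A minimal sketch, assuming the voltage drop has already been divided by whatever gain the instrumentation amplifier applies (the slides do not state the gain); the function names are illustrative.

```c
#include <assert.h>

#define SHUNT_OHMS 0.040  /* 40 mOhm shunt on VDD_ARM, from the slide */

/* Current through the shunt, from the voltage drop across it (volts). */
double shunt_current_a(double drop_v)
{
    assert(drop_v >= 0.0);
    return drop_v / SHUNT_OHMS;
}

/* Power drawn on the rail: rail voltage (V) times shunt current (A). */
double rail_power_w(double rail_v, double drop_v)
{
    return rail_v * shunt_current_a(drop_v);
}
```

As a worked example with made-up readings, a 20 mV drop corresponds to 0.5 A, which at a 1.4 V rail is 0.7 W.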
  • 16. Synchronization between program and waveforms using GPIO
  • 17. "bind" mode
      • The core assignment logic of Android Linux hotplug is heuristic
      • A new core assignment mode called "bind" mode was developed for efficient parallel execution
      • "bind" mode is integrated into Android Linux as the OSCAR runtime and API
      • Specification of the OSCAR API for "bind" mode
        – Core 0 is reserved for the Android system and non-OSCAR parallel programs
        – An application can disable hotplug and control whether cores are online or offline
        – An application can bind Cores 1, 2 and 3 to an OSCAR parallel program
      • Startup time: 7.2 ms
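The core split described above can be expressed as a CPU bitmask. The slides do not show the OSCAR "bind" API itself, so the sketch below only illustrates the policy (Core 0 for the system, Cores 1 to 3 for workers); `oscar_core_mask` is a hypothetical name. On Linux, a mask of this kind could be applied to a thread with the standard `sched_setaffinity` call, though whether the OSCAR runtime does so is an assumption.

```c
#include <assert.h>

#define NUM_CORES 4   /* Exynos4412 is a quad-core SoC */

/* Return a bitmask of the cores that OSCAR worker threads may use.
 * Bit i set means core i is available to workers. Core 0 is never
 * included because it is reserved for the Android system.
 * (Hypothetical helper, for illustration only.) */
unsigned oscar_core_mask(int n_workers)
{
    assert(n_workers >= 0 && n_workers < NUM_CORES);
    unsigned mask = 0;
    for (int core = 1; core <= n_workers; core++)
        mask |= 1u << core;
    return mask;
}
```

With three workers the mask is 0xE, which is Cores 1, 2 and 3 with Core 0 left free.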
  • 18. Clock gating
      • WFI instruction
        – The WFI instruction suspends execution of the processor core and stops its clock until a timer event occurs
      • Clock gating driver using the WFI instruction
        – WFI is a privileged instruction
        – The API lets a user program execute WFI inside a Linux driver
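  • 19. Fine timing control by WFI driver
      • The time slot is 500 us; a GPIO pin marks the start and end of each call on the current waveform (250 mA and 500 mA levels)
      • One slot: wake-up after 0 us < T < 500 us
          while (1) { gpio_value(1); call_wfi_api(1); gpio_value(0); }
      • Four slots: the clock is gated for 1500 us (3 slots) and the core wakes up before 2000 us (4 slots), so 1500 us < T < 2000 us
          while (1) { gpio_value(1); call_wfi_api(4); gpio_value(0); }
      • In general, for N slots: (N - 1) × 500 us < T < N × 500 us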
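The wake-up bounds on this slide follow directly from the 500 us slot length. A small sketch that computes the window for a given slot count; `wfi_wake_window` and the struct are illustrative names, and `call_wfi_api` itself is not reproduced because its implementation is not shown in the deck.

```c
#include <assert.h>

#define SLOT_US 500u  /* time slot length from the slide */

/* Bounds on the wake-up time T, in microseconds. */
struct wake_window { unsigned min_us; unsigned max_us; };

/* For a request of n_slots, the slide gives
 * (N - 1) * 500 us < T < N * 500 us. */
struct wake_window wfi_wake_window(unsigned n_slots)
{
    assert(n_slots >= 1);
    struct wake_window w = { (n_slots - 1) * SLOT_US, n_slots * SLOT_US };
    return w;
}
```

For N = 4 this yields the 1500 us to 2000 us window shown in the second waveform.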
  • 20. Current waveform of busy wait without clock gating
      [Figure: current waveform (axis marks at 500, 1000, 1500 and 2000 mA) for busy wait in ordinary execution with 1, 2, 3 and 4 cores]
  • 21. Current waveform of busy wait with clock gating
      [Figure: current waveform (axis marks at 500, 1000, 1500 and 2000 mA) for busy wait with clock gating with 1, 2, 3 and 4 cores; annotations mark where all cores wake up and where all cores are clock gated]
  • 22. Comparison of the current waveforms
      [Figure: the two preceding waveforms side by side, busy wait in ordinary execution versus busy wait with clock gating, for 1 to 4 cores on the same 500 to 2000 mA axis]
  • 23. MPEG2 decoder: highlight data
  • 24. Power Consumption of MPEG2 Decoder on ODROID-X2
      [Figure: bar chart of power against number of cores, with and without power reduction control; annotated reductions of 1/7 (13.3%) and 1/3 (38.1%)]
  • 26. Power Waveform of MPEG2 Decoder for 1 Core
      (a) Without power reduction control: MPEG2 decode execution at high clock and voltage (1.7 GHz, 1.4 V), followed by busy-wait execution
      (b) With power reduction control: MPEG2 decode execution at 1.7 GHz, 1.4 V, with the busy wait replaced by clock gating by WFI
      [Figure annotations: power consumed; power reduced by WFI]
  • 27. Power Waveform of MPEG2 Decoder for 3 Cores
      (a) Without power reduction control: MPEG2 decode execution at high clock and voltage (1.7 GHz, 1.4 V), followed by busy-wait execution
      (b) With power reduction control: MPEG2 decode execution at low clock and voltage (400 MHz, 1.05 V and 200 MHz, 0.92 V), with the busy wait replaced by clock gating by WFI
      DVFS: P = n*f*c*V^2
      [Figure annotations: power consumed; power reduced; reduced by WFI]
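  • 28. Power Consumption of MPEG2 Decoder on ODROID-X2
      [Figure: bar chart of power against number of cores, split into consumed and reduced power. Values shown: 2.79, 0.97, 0.63, 0.37. Reductions are attributed to WFI and DVFS; annotated reduction: 1/3 (38.1%)]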
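The percentages annotated on the MPEG2 charts can be reproduced from the plotted values. The helper below is only a convenience for that arithmetic; pairing 0.37 with 2.79 for the 13.3% figure is inferred from the numbers matching, not stated explicitly on the slides.

```c
#include <assert.h>

/* Express a reduced power figure as a percentage of a baseline.
 * (Illustrative helper; both arguments are in watts.) */
double percent_of(double reduced_w, double baseline_w)
{
    assert(baseline_w > 0.0);
    return 100.0 * reduced_w / baseline_w;
}
```

`percent_of(0.37, 0.97)` gives about 38.1, the "1/3" annotation, and `percent_of(0.37, 2.79)` gives about 13.3, the "1/7" annotation.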
  • 29. Conclusions
      • The ODROID-X2 circuit was modified so that
        1. precise power waveforms at the output of the PMIC can be observed, and
        2. the power waveforms and parallel program events can be correlated in time for OSCAR compiler optimization.
      • An efficient parallel program execution platform on Android was established by
        1. "bind" mode, and
        2. the WFI instruction, driven by the OSCAR compiler.
      • The newly developed OSCAR compiler power control mechanism decreased the power to about one third, from 0.97 W on 1 core to 0.37 W on 3 cores, when running the MPEG2 decoder on the Android platform.
  • 32. Power Consumption of Optical Flow on ODROID-X2
      [Figure: bar chart with annotated reductions of 13.4% and 31.5%]
  • 33. Power Waveform of Optical Flow for 1 Core
      [Figure: Optical Flow execution followed by busy-wait execution; with control, the busy wait is replaced by clock gating by WFI, which reduces the power spent on wasted CPU cycles]
  • 34. Power Waveform of Optical Flow for 3 Cores
      [Figure: Optical Flow execution at high clock and voltage followed by busy-wait execution, compared with Optical Flow execution at low clock and voltage followed by clock gating by WFI]
      P = n*f*c*V^2
  • 35. Low-power code with OSCAR API
      • Flow: sequential C/Fortran programs → OSCAR Compiler (multigrain parallelization, memory optimization, data transfer optimization, DVFS and clock gating) → low-power parallel C/Fortran programs including the OSCAR API → backend compiler (an API decoder that translates the OSCAR API into library calls, followed by the native compiler) → executable object
      • Example schedule: Proc0 runs T1 and is then switched off; Proc1 runs T2 and T4; Proc2 runs T3 and T6 (slow)
      • OSCAR API directives shown:
          #pragma oscar fvcontrol(pe, (id, state))
          #pragma oscar get_fvstatus(pe, id, state)
          #pragma oscar get_current_time(current, timer_no   (truncated on the slide)
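  • 37. ODROID After Rework
      [Figure: reworked circuit between the PMIC and VDD_ARM, with R, L and C components, ground connections and a single 5-pin connector carrying the drop voltage and the rail voltage]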
  • 38. How hotplug works
      [Figure: timing diagram of the hotplug states (up, down, idle, disable) showing transitions after up2g0_delay, up2gn_delay and down_delay]
  • 39. Auto hotplug governor
      • tegra_cpu_set_speed_cap (excerpt; the slide omits intervening lines)
          int tegra_cpu_set_speed_cap(unsigned int *speed_cap)
          {
              unsigned int new_speed = tegra_cpu_highest_speed();
              new_speed = tegra_throttle_governor_speed(new_speed);
              new_speed = edp_governor_speed(new_speed);
              new_speed = user_cap_speed(new_speed);
              ret = tegra_update_cpu_speed(new_speed);
              tegra_auto_hotplug_governor(new_speed, false);
          }
      • tegra_auto_hotplug_governor parameters
          Parameter      LP mode          G mode
          up_delay       up2g0_delay      up2gn_delay
          down_delay     down_delay       down_delay
          top_freq       idle_top_freq    idle_bottom_freq
          bottom_freq    0                idle_bottom_freq
      • State transitions
          Current state   Requested freq     New state   Delay to take effect
          IDLE            > top_freq         UP          up_delay
          IDLE            <= bottom_freq     DOWN        down_delay
          DOWN            > top_freq         UP          up_delay
          DOWN            > bottom_freq      IDLE        NA
          UP              < bottom_freq      DOWN        down_delay
          UP              <= top_freq        IDLE        ND
      [Diagram labels: throttle_table, throttle_index, update from user, thermal_cooling_device, EDP thermal, auto hotplug, suspend, CPUFreq]
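The state table on this slide can be written as a small transition function. This is a sketch of the table only, not the Tegra kernel source: the enum and function names are made up, the delays are left out, and a state is kept unchanged when no row of the table applies.

```c
#include <assert.h>

/* Hotplug governor states named on the slide. */
enum hp_state { HP_IDLE, HP_DOWN, HP_UP };

/* Next state for a requested frequency, following the slide's table.
 * Delays (up_delay, down_delay) are not modelled here.
 * (Hypothetical helper, for illustration only.) */
enum hp_state hp_next_state(enum hp_state cur, unsigned freq,
                            unsigned bottom_freq, unsigned top_freq)
{
    assert(bottom_freq <= top_freq);
    switch (cur) {
    case HP_IDLE:
        if (freq > top_freq)     return HP_UP;
        if (freq <= bottom_freq) return HP_DOWN;
        return HP_IDLE;
    case HP_DOWN:
        if (freq > top_freq)     return HP_UP;
        if (freq > bottom_freq)  return HP_IDLE;
        return HP_DOWN;
    case HP_UP:
        if (freq < bottom_freq)  return HP_DOWN;
        if (freq <= top_freq)    return HP_IDLE;
        return HP_UP;
    }
    return cur;
}
```

With made-up thresholds of 400 and 1000 (in any consistent frequency unit), a request of 1200 moves IDLE to UP, while a request of 700 moves either DOWN or UP back to IDLE.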