1
Dan Reed, Iowa
Sadaf Alam, CSCS/ETH
Bill Gropp, NCSA/Illinois
Satoshi Matsuoka, GSIC/Tokyo Tech
John Shalf, NERSC/LBL
2
Gray’s cost/performance axes
• Networking
• Computation
• Storage
• Access
… still driving change
Facility, systems, operations
Software …
3
4
What system software has been most helpful in improving system energy efficiency?
Which software stack layers should support energy awareness?
Is there user willingness to exploit libraries for energy efficiency?
What might encourage adoption of APIs for measurement/management?
Where are the biggest opportunities for energy efficiency improvements?
5
At a PUE of <1.1, how much more is there to be gained?
Can software-based runtime introspection mechanisms react fast enough?
What is the user reward for developing energy efficient codes?
What is the total cost of ownership (TCO) tradeoff in tuning codes for
(a) improved performance or (b) reduced energy consumption?
What are the fundamental properties of energy efficient applications?
SC17 Panel: Energy Efficiency Gains From Software:
Retrospectives and Perspectives
Sadaf Alam
Chief Technology Officer
Swiss National Supercomputing Centre
November 17, 2017
SC17 Panel 7
Piz Daint and the User Lab
Model: Cray XC40/XC50
XC50 Compute Nodes: Intel® Xeon® E5-2690 v3 @ 2.60GHz (12 cores, 64 GB RAM) and NVIDIA® Tesla® P100 16GB
XC40 Compute Nodes: Intel® Xeon® E5-2695 v4 @ 2.10GHz (18 cores, 64/128 GB RAM)
Interconnect Configuration: Aries routing and communications ASIC, and Dragonfly network topology
Scratch capacity: ~9 + 2.7 PB
http://www.cscs.ch/uploads/tx_factsheet/FSPizDaint_2017_EN.pdf
http://www.cscs.ch/publications/highlights/
http://www.cscs.ch/uploads/tx_factsheet/AR2016_Online.pdf
Measurement Tools on Piz Daint
• Level 3 measurements for official submissions
  – Accuracy for Top500, Green500, …
• Grafana dashboard
  – Overall system and cabinet row
• Cray performance measurement database (PMDB)
  – DB with node, blade, cabinet, job, …
• SLURM output
  – Using data available on compute nodes
SC17 Panel 8
> sacct -j 3****** -o JobID%20,ConsumedEnergyRaw
JobID ConsumedEnergyRaw
-------------------- -----------------
3******
3******.batch 379848.000000
3******.extern 2005970432.000000
3******.0 2001926407.000000
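As an aside on what can be done with this accounting data, here is a minimal Python sketch (not CSCS tooling) that converts a job's ConsumedEnergyRaw counter, which SLURM reports in joules, into kWh; the job ID shown is hypothetical.

import subprocess

def job_energy_kwh(jobid: str) -> float:
    """Query sacct for a job's raw energy counter and convert joules to kWh."""
    out = subprocess.run(
        ["sacct", "-j", jobid, "--noheader", "--parsable2",
         "-o", "JobID,ConsumedEnergyRaw"],
        capture_output=True, text=True, check=True).stdout
    joules = 0.0
    for line in out.splitlines():
        step, _, energy = line.partition("|")
        if energy.strip():
            # Take the largest per-step value as an approximation of the whole
            # job, avoiding double counting of the .batch and .extern steps.
            joules = max(joules, float(energy))
    return joules / 3.6e6   # 1 kWh = 3.6e6 J

if __name__ == "__main__":
    print(f"{job_energy_kwh('3123456'):.1f} kWh")   # hypothetical job ID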
SC17 Panel 9
MeteoSwiss’ Performance Ambitions in 2013
SC17 Panel 10
COSMO: old and new (refactored) code
SC17 Panel 11
SC17 Panel 12
SC17 Panel 13
Only system in top 10 with level 3 submission
Subset of cores with lower CPU frequency for Green500
submission
Green500 top systems contain accelerator
devices
Engaging users and developers
Thank you for your attention.
Four Steps to Energy Efficiency Through Software
William Gropp
Wgropp.cs.illinois.edu
What was the biggest contribution to HPC System Energy
Efficiency?
• Biggest contribution to HPC system energy efficiency was cooling.
System uses air cooling within cabinets, with water removing heat
from each cabinet. Heat exchangers on roof provide “free” cooling
most of the year.
• No impact on running application. No software changes required.
• Problem: Our PUE is between 1.1 and 1.2; little further gain in
efficiency possible here
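A back-of-the-envelope calculation, using nothing beyond the definition of PUE, shows why: the fraction of facility energy that never reaches the IT equipment is (PUE - 1)/PUE, so below ~1.1 there is little left to recover.

# Overhead share (cooling, power delivery) implied by a given PUE.
for pue in (2.0, 1.5, 1.2, 1.1, 1.03):
    overhead = (pue - 1.0) / pue
    print(f"PUE {pue:4.2f}: {overhead:6.1%} of total facility energy is overhead")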
National Petascale Computing Facility
• Only Facility in the world of this
scale on an Academic Campus
• Capable of sustained 24 MW today
• Expandable in space, power and
cooling [50,000 ft2 (4,645+ m2) machine
room gallery and sustained 100 MW]
• Modern Data Center
• 90,000+ ft2 (8,360+ m2) total
• 30,000 ft2 (2,790+ m2) raised floor
20,000 ft2 (1,860+ m2) machine room gallery
• Energy Efficiency
• LEED certified Gold
• Power Usage Effectiveness, PUE = 1.1–1.2
How Does Your Center Contribute Most in Energy Efficiency?
• There are still gaps in the baseline – the efficiency of applications. This is
why “race to halt” is (was?) often the best strategy.
• Our focus remains on improving application efficiency – including moving some
applications to use GPUs on our mixed XE6/XK7 system
• There are many parts to this
• One of the most difficult yet most important is having enough of a performance model to
understand how close the application is to achievable performance
• This is different from efficiency of execution based on counting operations (this often over-estimates the number of necessary data moves, leading to a lower estimate for achievable performance)
• Other steps, such as allocating jobs with topology awareness, are important
Where Should Energy-Awareness Be Added to the Software
Stack?
•Easiest if in runtime, scheduler, or closed application
• That is, easiest if programmers don’t see it
•Answer needs to be in context of goal
• Information gathering (such as we did for I/O performance), including
interconnect contribution and message wait (opportunity to reduce power),
can be done with limited impact using the MPI profiling interface or similar
system intercept interfaces (see the sketch after this slide)
•If runtime can “do no harm”, then do it in the runtime
• If you can’t guarantee that, allow users to opt out.
• See discussion of incentives
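To make the information-gathering idea concrete, here is a minimal Python sketch using mpi4py that times blocking communication calls, the kind of message-wait data mentioned above; a production tool would intercept at the C-level MPI profiling (PMPI) interface instead, and the wrapper names here are ours.

from mpi4py import MPI
import functools

comm = MPI.COMM_WORLD
wait_seconds = 0.0   # time this rank spends inside wrapped MPI calls

def timed(fn):
    """Wrap a blocking MPI call and accumulate the time spent inside it."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        global wait_seconds
        t0 = MPI.Wtime()
        result = fn(*args, **kwargs)
        wait_seconds += MPI.Wtime() - t0
        return result
    return wrapper

barrier = timed(comm.Barrier)   # wrap the calls the application actually uses

if __name__ == "__main__":
    for _ in range(100):
        barrier()               # stand-in for application communication
    total = comm.reduce(wait_seconds, op=MPI.SUM, root=0)
    if comm.rank == 0:
        print(f"aggregate time in barriers: {total:.3f} s")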
Will Users Make Codes Energy Efficient?
•No.
•There is no incentive, and in fact there is usually a negative
incentive if energy efficiency slows down the application.
•Some users may be willing to experiment or to be good “global
citizens,” but most are focused on their science.
•Different charging policies must be approved; in our case, by NSF.
• For example, could charge allocation with a weight that included energy
consumed, providing an incentive. But must include full cost of system
including depreciation
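A minimal sketch of what such an energy-weighted charge could look like (the formula and rates are illustrative assumptions, not an approved NCSA or NSF policy):

def job_charge(node_hours: float, energy_kwh: float,
               rate_per_node_hour: float = 1.0,   # capacity charge: depreciation, staff, facility
               rate_per_kwh: float = 0.05) -> float:
    """Blend a capacity charge with a marginal energy charge so that
    energy-efficient jobs are billed less against the allocation."""
    return node_hours * rate_per_node_hour + energy_kwh * rate_per_kwh

# The same 512 node-hour job, before and after energy tuning.
print(job_charge(512, energy_kwh=300.0))   # 527.0 allocation units
print(job_charge(512, energy_kwh=180.0))   # 521.0 allocation units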
Does the Community Need to Collect and Share Data?
• Yes. To provide better incentives, there is a need to both measure energy use
and provide actionable feedback to users. We need to collect data on
different approaches and reflect them in the context of user incentives
• There are many complexities in this, and it will be hard to provide accurate data
about energy use by application
• This is not the same as having a deterministic way to charge for energy use.
• The problem is that the actions of other applications may impact the energy use of shared
resources (e.g., network, file system) and it may not be possible to determine what each
application really used, as compared with what it was charged
• Yes, this is also true for time charged to applications. That is not the issue; it is whether the
users accept the approximations that are used
Four Steps for Energy Efficiency
1. Performance Efficiency
• Like conservation, first step is to get more from the same (or less)
• Performance modeling
• Performance tuning
2. Recognize Hardware Realities
• Next is to adapt to realities of the hardware
• Rethink the algorithm
• Adapt to memory latency, wide vectors, etc.
• Can’t compare algorithms by comparing floating point counts, or even memory
references
• Avoid BSP-style programs
• Separate communication and computation phases, and synchronization, waste energy
Four Steps (cont.)
3. You Get What You Measure
•You get what you measure (and reward/penalize)
• Again, focus on performance
• Expose the cost of waiting
4. Consider The Total Cost
•Finally, need to consider total workflow costs
• Local and global optimums not necessarily the same
• Consider data motion for a data set – moving data back and forth can be expensive –
even (locally) less efficient in situ analysis may be cheaper overall
• And how that total cost is exposed/charged to the user
• There must be a positive incentive to the user for pursuing energy efficiency
Green Computing in TSUBAME2 and 3
Satoshi Matsuoka
Professor, GSIC, Tokyo Institute of Technology /
Director, AIST-Tokyo Tech. Big Data Open Innovation Lab /
Fellow, Artificial Intelligence Research Center, AIST, Japan /
Vis. Researcher, Advanced Institute for Computational Science, Riken
SC17 Green Panel Talk
2017/11/17
Impacts of Mar 11 Earthquake
• Earthquake itself did not affect
the supercomputer, but…
• Power plant accident caused
power crisis in the capital area
– Rolling outage in March
– University/Government’s direction
to save power
• Although TSUBAME2 is “green”, it consumes ~10% (0.9–1.2 MW) of the campus's power!
Operation of TSUBAME2.0 started in Nov 2010, and The Great
East Japan Earthquake happened 4 months later
TSUBAME Power Operation: 2011 ver. (1)
• University directed us to save
power usage by:
– -50% in early April → # of nodes has to be -70%
– -25% in late April → # of nodes has to be -35%
• Constantly reducing too many nodes is harmful for supercomputer users
• What is the social demand? The most important constraint is to reduce peak power consumption in daytime
→ “Peak shift” operation
[Chart: total power usage of the capital area over one day, highlighting the daytime peak]
TSUBAME Power Operation: 2011 ver. (2)
• It was allowable to increase running nodes
in nighttime and weekend
65% operation in daytime
100% operation in nighttime/weekend
• Tokyo Tech and NEC developed tools to shut down / wake up nodes automatically every day
• The number “65%” is determined by measurement
of power consumption in typical operation
Results of the Operation in 2011
[Chart: CPU usage and power consumption for 5/12–6/8 and from 7/4, against the 787 kW target]
• The power was successfully limited in daytime
• However, it is still sometimes too pessimistic
• Our goal is not capping the # of nodes, but capping power
→ A new power-aware system management method is required
TSUBAME2.0⇒2.5 Thin Node Upgrade in 2013
HP SL390G7 thin node (developed for TSUBAME2.0, modified for 2.5), productized as HP ProLiant SL390s:
• CPU: Intel Westmere-EP 2.93 GHz x2
• GPU: NVIDIA Kepler K20X x3 (1310 GFlops, 6 GByte memory per GPU), replacing the NVIDIA Fermi M2050 (1039/515 GFlops) with the Kepler K20X (3950/1310 GFlops)
• Peak perf. 4.08 TFlops per node, ~800 GB/s memory BW, 80 Gbps NW (InfiniBand QDR x2), ~1 kW max
• System total: 2.4 PFlops → 5.7 PFlops
TSUBAME Power Operation: 2014 ver.
(1) Real-time power consumption of the system is measured, including cooling: is current power sufficiently lower than the specified cap, or are we in a critical zone?
(2) Current power and the target are compared
→ We control the number of working nodes dynamically
Transition of modes of each node (simplified): ON ↔ OFF ↔ Offline (the node is awake, but no new job starts on it)
Dynamic control is introduced based on real-time system power
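The control loop on this slide can be sketched as follows; the 800 kW cap comes from the slides, while the margin, per-node power figure, and function names are illustrative assumptions rather than the actual Tokyo Tech/NEC tool.

POWER_CAP_KW = 800.0      # target from the operation described above
MARGIN_KW = 50.0          # hysteresis band to avoid thrashing (assumed)
NODE_KW = 1.0             # assumed average draw of one busy node

def adjust_nodes(measured_kw: float, working_nodes: int) -> int:
    """Return the new number of job-eligible nodes given the measured power."""
    if measured_kw > POWER_CAP_KW:                      # critical zone: drain nodes
        excess = measured_kw - POWER_CAP_KW
        return max(0, working_nodes - int(excess / NODE_KW) - 1)
    if measured_kw < POWER_CAP_KW - MARGIN_KW:          # headroom: wake nodes up
        headroom = (POWER_CAP_KW - MARGIN_KW) - measured_kw
        return working_nodes + int(headroom / NODE_KW)
    return working_nodes                                # inside the band: hold

print(adjust_nodes(measured_kw=840.0, working_nodes=700))   # -> fewer nodes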
Results of the Operation in 2014
[Charts: number of working CPU cores and daytime power consumption]
The number of nodes is controlled dynamically
The power is successfully capped at 800 kW
TSUBAME Power Operation: 2016 ver.
• In addition to the peak-shift (2014 ver.) operation, we implemented a new mechanism to reduce total energy consumption
→ Idle nodes without running jobs are shut down
– Disadvantage: the start time of new jobs may be delayed by around ten minutes for node boot time
[Diagram: node states Assigned → Idle → Powered off]
Illustration of Power Consumption of Parallel
Operation of TSUBAME3.0+2.5
[Chart: changes of power consumption during a fiscal year (April to the next March), showing TSUBAME3.0 and TSUBAME2.5 power, the peak power in the summer, and the system usage peak in the winter]
We should both (1) limit power in the summer and (2) keep the total energy consumption within the given budget
Controlling Total Electric Fee per FY (plan)
Assumption
• System usage reaches its peak in the winter
• The electric fee per Wh changes every month due to surcharges
Strategy
• All T3 nodes are “ON”, and T2.5 nodes are brought up on demand
• We allocate a budget for each month
• Based on the month's budget, we decide the number of working T2.5 nodes
• When there is a remaining amount, we adjust the following months
            Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec  Jan  Feb  Mar
x100k yen    60   65   70   50   50  120  120  120  120  120  120  120
Total MWh   288  312  360  245  220  480  500    ?    ?    ?    ?    ?
T3 MWh      200  250  300  220  210  350  350    ?    ?    ?    ?    ?
#T2 nodes   400  390  380   40   20  300  360    ?    ?    ?    ?    ?
Numbers are tentative
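As a rough illustration of the strategy (consistent with the tentative April row above; the per-node monthly energy figure is an assumption back-derived from that row, not a published number), the number of T2.5 nodes a month can support falls out of the remaining energy budget:

T2_NODE_MWH_PER_MONTH = 0.22   # assumed average monthly energy per T2.5 node

def t2_nodes(month_budget_mwh: float, t3_mwh: float) -> int:
    """Nodes supportable by the month's budget once TSUBAME3 is accounted for."""
    remaining = max(0.0, month_budget_mwh - t3_mwh)
    return int(remaining / T2_NODE_MWH_PER_MONTH)

print(t2_nodes(288, 200))   # April row: roughly 400 nodes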
Overview of TSUBAME3.0
BYTES-centric architecture, scalability to all 2160 GPUs, all nodes, and the entire memory hierarchy
Full bisection bandwidth: Intel Omni-Path interconnect, 4 ports/node; full bisection / 432 Terabits/s bidirectional (~2x the BW of the entire Internet backbone traffic)
DDN storage (Lustre FS 15.9 PB + Home 45 TB)
540 compute nodes, SGI ICE XA + new blade: Intel Xeon CPU x2 + NVIDIA Pascal GPU x4 (NVLink), 256 GB memory, 2 TB Intel NVMe SSD
47.2 AI-Petaflops, 12.1 Petaflops
Full Operations
Aug. 2017
TSUBAME3.0 Co-Designed SGI ICE-XA Blade (new)
• No exterior cable mess (power, NW, water); plan to become a future HPE product
• Liquid-cooled “hot pluggable” ICE-XA blade, smaller than a 1U server, no cables or pipes
• Xeon x2, PCIe switch, >20 TeraFlops DFP, 256 GByte memory, 100 Gbps x4, liquid-cooled PCIe NVMe drive bay x4
• 144 GPUs & 72 CPUs per rack, integrated 100/200 Gbps fabric backplane
• 15 compute racks, 4 DDN storage racks, 3 peripheral & SW racks; 22 racks total
Power meters, running benchmarks (HPL, HPCG): Tokyo Tech + HPE/SGI Japan team
• Top500 submission = 1.x Petaflops @ 144 nodes
• Green500 submission = 14.11 GFlops/W (Nov. 2016 #1 = 9.46 GFlops/W)
• Announcement at the ISC17 opening, 9 AM Monday, 19th June: World #1!!!
40
Warm Water Cooling Distribution in T3
PUE ~= 1.03 (~1.1 w/storage)
[Diagram: warm-water cooling path from the compute nodes (HPE SGI ICE XA), storage, and interconnect switches to a rooftop free-cooling tower, with on-the-ground chillers (shared with TSUBAME2) serving as a backup heat exchanger and in-room air conditioning for the humans; loop temperatures of 32°C outgoing / 40°C return and 17°C outgoing / 24°C return; cooling capacities of 1 MW and 2 MW, plus 100 kW max]
Smart Data Center Operation for ABCI
(NEDO Project 2017-)
• Started to develop a system that achieves self-sustainable operation and reduces the operation cost of a data center, especially for the ABCI system
• Data storage for storing sensor data from nodes, the cooling system, etc.
• ML/DL algorithms to analyze the data and model data-center behavior
• Reduce power consumption, detect errors, etc.
• Apply the algorithms to improve operation
• Current status
• Started in Aug. 2017
• Designing/developing the sensor-data collector and its storage
Comparing TSUBAME3/ABCI to Classical IDC
AI IDC CAPEX/OPEX acceleration by >100x
43
Traditional Xeon IDC: ~10 kW/rack, PUE 1.5–2; 15–20 1U Xeon servers; 2 Tera AI-FLOPS (SFP) per server; 30–40 Tera AI-FLOPS per rack
TSUBAME3 (+Volta) & ABCI IDC: ~60 kW/rack, PUE 1.0x; ~36 T3-evolution servers; ~500 Tera AI-FLOPS (HFP) per server; ~17 Peta AI-FLOPS per rack
Perf > 400–600x, power efficiency > 200–300x
Energy Efficiency Gains from
Software:
Retrospectives and Perspectives
SC17
November, 2017
John Shalf
Lawrence Berkeley
National Laboratory
Measurement and Metrics
"If you can not measure it,
you can not improve it.”
-Lord Kelvin (1883)
High Data Rate Collection System
Future Data Sources
Syslog
Job Data
Lustre + GPFS statistics
LDMS
Outdoor particle counters
???
- 46 -
Current Data Sources
Substations, panels, PDUs, UPS
Cori & Edison SEDC
Onewire Temp & RH sensors
BMS through BACNET
Indoor Particle counters
Weather station
• RabbitMQ, Elastic, Linux
– Collects ~20K data items per second
– Over 40 TB of data online (100 TB capacity)
– 45 days of SEDC (versus 3 hours on the SMW)
– 180 days of BMS data (6x more than the BMS)
– 260 days of power data
Kibana, Grafana
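For flavor, ingestion into such a pipeline can be as simple as publishing each reading to a message queue; this is a minimal sketch using the pika client for RabbitMQ, where the hostname, queue name, and reading schema are illustrative rather than the NERSC implementation.

import json, time
import pika

# Publish one sensor reading for downstream indexing (Elastic) and dashboards
# (Kibana, Grafana).
connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="sensor-readings", durable=True)

reading = {"source": "pdu-01", "metric": "power_kw",
           "value": 412.6, "timestamp": time.time()}
channel.basic_publish(exchange="", routing_key="sensor-readings",
                      body=json.dumps(reading).encode())
connection.close()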
Liquid Cooling Performance
- 47 -
[Chart: liquid cooling performance, baseline vs. balanced configuration]
Have saved 700,000 kWh/year
Analytics for resource utilization
[Chart: cumulative distribution function (%) of memory per process (GB) on Genepool, Jan–Jun 2016, shown for maxrss and maxvmem, weighted by jobs and by slot-hours]
Introspective business analytics can save you a LOT of MONEY and POWER
Realtime Feedback and Tuning for Energy Efficiency
- 49 -
[Chart: walltime and energy vs. CPU frequency, 1.0–2.6 GHz]
Performance and power increase with frequency, but at different rates.
Brian Austin: NERSC
[Chart: application energy use (normalized vs. 2.4 GHz) vs. CPU frequency (GHz) for GTC, MILC, and Mini-DFT]
GTC: no minimum. MILC: 13% performance penalty. Mini-DFT: 11% performance penalty.
Adjust Frequency & Measure Energy
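The experiment behind these curves reduces to a sweep like the following sketch; run_benchmark, read_energy_counter, and set_cpu_frequency are placeholders for site-specific mechanisms (e.g. a power API or RAPL counters), not NERSC tools.

import time

FREQUENCIES_GHZ = [1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4]

def sweep(run_benchmark, read_energy_counter, set_cpu_frequency):
    """Run the same benchmark at each frequency, recording walltime and energy."""
    results = []
    for f in FREQUENCIES_GHZ:
        set_cpu_frequency(f)
        energy_start, t_start = read_energy_counter(), time.time()
        run_benchmark()
        energy_j = read_energy_counter() - energy_start
        results.append((f, time.time() - t_start, energy_j))
    return results   # inspect for the energy-minimizing frequency per code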
Anatomy of a “Value” Metric
Value = Good Stuff / Bad Stuff
Anatomy of a “Value” Metric
Value = FLOP/s (Bogus!!!) / Watts (Potentially Bogus!!)
Anatomy of a “Value” Metric
Value = Measured Performance / Measured Watts
Have mined this to a limit case
At PUE of 1.1, hard to squeeze out more
But must remain vigilant
Optimizing performance has substantial room for improvement
Roofline Model: “Houston, is there a problem?”
Am I under-performing? (FLOP rate is a poor metric!)
And how bad is it???
54
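The roofline comparison itself is a single line of arithmetic; here is a minimal sketch with placeholder peak numbers (not a particular machine): attainable performance is the lesser of the compute peak and arithmetic intensity times memory bandwidth.

PEAK_GFLOPS = 2600.0   # assumed node compute peak
PEAK_GBS = 450.0       # assumed memory bandwidth

def attainable_gflops(arithmetic_intensity: float) -> float:
    """Roofline ceiling for a kernel with the given flop/byte ratio."""
    return min(PEAK_GFLOPS, arithmetic_intensity * PEAK_GBS)

for ai in (0.1, 1.0, 10.0):
    print(f"AI = {ai:5.1f} flop/byte -> ceiling {attainable_gflops(ai):8.1f} GFLOP/s")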
Roofline Automation in Intel® Advisor 2017
Improving under collaboration with NERSC, LBNL, DOE.
Each dot represents a loop or function in YOUR APPLICATION (profiled)
Each roof (slope) gives the peak CPU/memory throughput of your PLATFORM (benchmarked)
Automatic and integrated – a first-class citizen in Intel® Advisor
Speed-ups from NESAP Investments
55
Comparison is node to node
1 node of Edison = 2 IvyBridge Procs
1 node of Cori = 1 KNL Proc
7x Increase in Performance is nearly 7x increased Energy Efficiency!
ARPA-E PINE Key Concept: Dynamic Bandwidth Steering for energy efficiency (use case for a power API?)
56
[Diagram: node architecture with compute, HBM, and NVRAM multi-chip modules (MCMs) connected through a packet-switching MCM and optical switches, so that bandwidth can be steered between CPUs/GPUs, memory, and intra-server vs. inter-server links]
Team: Bergman, Gaeta, Lipson, Bowers, Coolbaugh, Johansson, Patel, Dennison, Shalf, Ghobadi
Conclusions
• Measuring datacenter operational efficiency has resulted in huge improvements
– At PUE 1.1, it is hard to mine MORE efficiency
– But must stay vigilant with continuous monitoring
• Runtime System Optimizations driven by real-time feedback may work
– Currently many programming structures are fixed at compile time (currently limited)
– Software mediated control loop may be too slow for nanosecond events
– Runtime system will always know less about application than application scientist
• Tuning codes is a HUGE energy benefit! (energy efficiency = performance/watts)
– NESAP code tuning offered largest (3x – 7x) efficiency improvements!
– Need tools that do a better job identifying the performance *opportunity* (roofline)
• Integrated analytics at every level is essential for identifying efficiency opportunities
– But we are collecting data faster than we can formulate questions to pose against it!
– Perhaps an opportunity for machine learning as “anomaly detection”
58
What system software has been most helpful in improving system energy efficiency?
Which software stack layers should support energy awareness?
Is there user willingness to exploit libraries for energy efficiency?
What might encourage adoption of APIs for measurement/management?
Where are the biggest opportunities for energy efficiency improvements?
Can software-based runtime introspection mechanisms react fast enough?
What is the user reward for developing energy efficient codes?
What is the total cost of ownership (TCO) tradeoff in tuning codes for
(a) improved performance or (b) reduced energy consumption?
What are the fundamental properties of energy efficient applications?
Mais conteĂşdo relacionado

Mais procurados

How Utilities can Improve Operations using Data Integration Workflows
How Utilities can Improve Operations using Data Integration WorkflowsHow Utilities can Improve Operations using Data Integration Workflows
How Utilities can Improve Operations using Data Integration WorkflowsSafe Software
 
Google ppt. mis
Google ppt. misGoogle ppt. mis
Google ppt. misjayadevans89
 
Air Flow Presentation Pdf
Air Flow  Presentation PdfAir Flow  Presentation Pdf
Air Flow Presentation PdfDanny Newman
 
Optimization of energy consumption in cloud computing datacenters
Optimization of energy consumption in cloud computing datacenters Optimization of energy consumption in cloud computing datacenters
Optimization of energy consumption in cloud computing datacenters IJECEIAES
 
CloudFuzion solutions for the Smart Grid
CloudFuzion solutions for the Smart GridCloudFuzion solutions for the Smart Grid
CloudFuzion solutions for the Smart GridCloudFuzion
 
Building Simulation, Its Role, Softwares & Their Limitations
Building Simulation, Its Role, Softwares & Their LimitationsBuilding Simulation, Its Role, Softwares & Their Limitations
Building Simulation, Its Role, Softwares & Their LimitationsPrasad Thanthratey
 
Deep Learning Based Integrated Energy Efficiency Optimization for Smart Building
Deep Learning Based Integrated Energy Efficiency Optimization for Smart BuildingDeep Learning Based Integrated Energy Efficiency Optimization for Smart Building
Deep Learning Based Integrated Energy Efficiency Optimization for Smart Buildingeliyart
 
Calient data center presentation Fall 2015
Calient data center presentation Fall 2015Calient data center presentation Fall 2015
Calient data center presentation Fall 2015Nick Howard
 
Fault tolerance on cloud computing
Fault tolerance on cloud computingFault tolerance on cloud computing
Fault tolerance on cloud computingwww.pixelsolutionbd.com
 
IRJET- Research Paper on Energy-Aware Virtual Machine Migration for Cloud Com...
IRJET- Research Paper on Energy-Aware Virtual Machine Migration for Cloud Com...IRJET- Research Paper on Energy-Aware Virtual Machine Migration for Cloud Com...
IRJET- Research Paper on Energy-Aware Virtual Machine Migration for Cloud Com...IRJET Journal
 
Express Computer
Express ComputerExpress Computer
Express ComputerAniket Patange
 
Emerson Energy Logic
Emerson Energy LogicEmerson Energy Logic
Emerson Energy LogicRavi Ayilavarapu
 
Data Center at BNY Mellon
Data Center at BNY Mellon Data Center at BNY Mellon
Data Center at BNY Mellon AbhiJeet Singh
 
Upgrading data centres beyond planned capacity
Upgrading data centres beyond planned capacityUpgrading data centres beyond planned capacity
Upgrading data centres beyond planned capacityNorman Disney & Young
 
Energy Logic: Reducing Data Center Energy Consumption by Creating Savings tha...
Energy Logic: Reducing Data Center Energy Consumption by Creating Savings tha...Energy Logic: Reducing Data Center Energy Consumption by Creating Savings tha...
Energy Logic: Reducing Data Center Energy Consumption by Creating Savings tha...Knurr USA
 
IRJET- A Statistical Approach Towards Energy Saving in Cloud Computing
IRJET-  	  A Statistical Approach Towards Energy Saving in Cloud ComputingIRJET-  	  A Statistical Approach Towards Energy Saving in Cloud Computing
IRJET- A Statistical Approach Towards Energy Saving in Cloud ComputingIRJET Journal
 
DCIM: An Integral Part of the Software Defined Data Centre
DCIM: An Integral Part of the Software Defined Data CentreDCIM: An Integral Part of the Software Defined Data Centre
DCIM: An Integral Part of the Software Defined Data CentreConcurrentThinking
 
[Webinar Presentation] Best Practices for IT/OT Convergence
[Webinar Presentation] Best Practices for IT/OT Convergence[Webinar Presentation] Best Practices for IT/OT Convergence
[Webinar Presentation] Best Practices for IT/OT ConvergenceSchneider Electric
 
Chap 2 classification of parralel architecture and introduction to parllel p...
Chap 2  classification of parralel architecture and introduction to parllel p...Chap 2  classification of parralel architecture and introduction to parllel p...
Chap 2 classification of parralel architecture and introduction to parllel p...Malobe Lottin Cyrille Marcel
 

Mais procurados (20)

How Utilities can Improve Operations using Data Integration Workflows
How Utilities can Improve Operations using Data Integration WorkflowsHow Utilities can Improve Operations using Data Integration Workflows
How Utilities can Improve Operations using Data Integration Workflows
 
Google ppt. mis
Google ppt. misGoogle ppt. mis
Google ppt. mis
 
Air Flow Presentation Pdf
Air Flow  Presentation PdfAir Flow  Presentation Pdf
Air Flow Presentation Pdf
 
Optimization of energy consumption in cloud computing datacenters
Optimization of energy consumption in cloud computing datacenters Optimization of energy consumption in cloud computing datacenters
Optimization of energy consumption in cloud computing datacenters
 
CloudFuzion solutions for the Smart Grid
CloudFuzion solutions for the Smart GridCloudFuzion solutions for the Smart Grid
CloudFuzion solutions for the Smart Grid
 
Building Simulation, Its Role, Softwares & Their Limitations
Building Simulation, Its Role, Softwares & Their LimitationsBuilding Simulation, Its Role, Softwares & Their Limitations
Building Simulation, Its Role, Softwares & Their Limitations
 
Deep Learning Based Integrated Energy Efficiency Optimization for Smart Building
Deep Learning Based Integrated Energy Efficiency Optimization for Smart BuildingDeep Learning Based Integrated Energy Efficiency Optimization for Smart Building
Deep Learning Based Integrated Energy Efficiency Optimization for Smart Building
 
Calient data center presentation Fall 2015
Calient data center presentation Fall 2015Calient data center presentation Fall 2015
Calient data center presentation Fall 2015
 
Fault tolerance on cloud computing
Fault tolerance on cloud computingFault tolerance on cloud computing
Fault tolerance on cloud computing
 
IRJET- Research Paper on Energy-Aware Virtual Machine Migration for Cloud Com...
IRJET- Research Paper on Energy-Aware Virtual Machine Migration for Cloud Com...IRJET- Research Paper on Energy-Aware Virtual Machine Migration for Cloud Com...
IRJET- Research Paper on Energy-Aware Virtual Machine Migration for Cloud Com...
 
Express Computer
Express ComputerExpress Computer
Express Computer
 
Emerson Energy Logic
Emerson Energy LogicEmerson Energy Logic
Emerson Energy Logic
 
Data Center Power Strategies
Data Center Power StrategiesData Center Power Strategies
Data Center Power Strategies
 
Data Center at BNY Mellon
Data Center at BNY Mellon Data Center at BNY Mellon
Data Center at BNY Mellon
 
Upgrading data centres beyond planned capacity
Upgrading data centres beyond planned capacityUpgrading data centres beyond planned capacity
Upgrading data centres beyond planned capacity
 
Energy Logic: Reducing Data Center Energy Consumption by Creating Savings tha...
Energy Logic: Reducing Data Center Energy Consumption by Creating Savings tha...Energy Logic: Reducing Data Center Energy Consumption by Creating Savings tha...
Energy Logic: Reducing Data Center Energy Consumption by Creating Savings tha...
 
IRJET- A Statistical Approach Towards Energy Saving in Cloud Computing
IRJET-  	  A Statistical Approach Towards Energy Saving in Cloud ComputingIRJET-  	  A Statistical Approach Towards Energy Saving in Cloud Computing
IRJET- A Statistical Approach Towards Energy Saving in Cloud Computing
 
DCIM: An Integral Part of the Software Defined Data Centre
DCIM: An Integral Part of the Software Defined Data CentreDCIM: An Integral Part of the Software Defined Data Centre
DCIM: An Integral Part of the Software Defined Data Centre
 
[Webinar Presentation] Best Practices for IT/OT Convergence
[Webinar Presentation] Best Practices for IT/OT Convergence[Webinar Presentation] Best Practices for IT/OT Convergence
[Webinar Presentation] Best Practices for IT/OT Convergence
 
Chap 2 classification of parralel architecture and introduction to parllel p...
Chap 2  classification of parralel architecture and introduction to parllel p...Chap 2  classification of parralel architecture and introduction to parllel p...
Chap 2 classification of parralel architecture and introduction to parllel p...
 

Semelhante a SC17 Panel: Energy Efficiency Gains From HPC Software

Sustainable Development using Green Programming
Sustainable Development using Green ProgrammingSustainable Development using Green Programming
Sustainable Development using Green ProgrammingIRJET Journal
 
Ppt4 london - michael rudgyard ( concurrent thinking ) driving efficiencie...
Ppt4   london -  michael rudgyard ( concurrent thinking ) driving efficiencie...Ppt4   london -  michael rudgyard ( concurrent thinking ) driving efficiencie...
Ppt4 london - michael rudgyard ( concurrent thinking ) driving efficiencie...JISC's Green ICT Programme
 
Cse viii-advanced-computer-architectures-06cs81-solution
Cse viii-advanced-computer-architectures-06cs81-solutionCse viii-advanced-computer-architectures-06cs81-solution
Cse viii-advanced-computer-architectures-06cs81-solutionShobha Kumar
 
Energy-Efficient Task Scheduling in Cloud Environment
Energy-Efficient Task Scheduling in Cloud EnvironmentEnergy-Efficient Task Scheduling in Cloud Environment
Energy-Efficient Task Scheduling in Cloud EnvironmentIRJET Journal
 
apidays London 2023 - API Green Score, Yannick Tremblais & Julien Brun, Green...
apidays London 2023 - API Green Score, Yannick Tremblais & Julien Brun, Green...apidays London 2023 - API Green Score, Yannick Tremblais & Julien Brun, Green...
apidays London 2023 - API Green Score, Yannick Tremblais & Julien Brun, Green...apidays
 
IRJET- Enhance Dynamic Heterogeneous Shortest Job rst (DHSJF): A Task Schedu...
IRJET- Enhance Dynamic Heterogeneous Shortest Job rst (DHSJF): A Task Schedu...IRJET- Enhance Dynamic Heterogeneous Shortest Job rst (DHSJF): A Task Schedu...
IRJET- Enhance Dynamic Heterogeneous Shortest Job rst (DHSJF): A Task Schedu...IRJET Journal
 
Multicore processor technology advantages and challenges
Multicore processor technology  advantages and challengesMulticore processor technology  advantages and challenges
Multicore processor technology advantages and challengeseSAT Journals
 
IRJET- An Energy-Saving Task Scheduling Strategy based on Vacation Queuing & ...
IRJET- An Energy-Saving Task Scheduling Strategy based on Vacation Queuing & ...IRJET- An Energy-Saving Task Scheduling Strategy based on Vacation Queuing & ...
IRJET- An Energy-Saving Task Scheduling Strategy based on Vacation Queuing & ...IRJET Journal
 
UnitOnePresentationSlides.pptx
UnitOnePresentationSlides.pptxUnitOnePresentationSlides.pptx
UnitOnePresentationSlides.pptxBLACKSPAROW
 
SOLUTION MANUAL OF OPERATING SYSTEM CONCEPTS BY ABRAHAM SILBERSCHATZ, PETER B...
SOLUTION MANUAL OF OPERATING SYSTEM CONCEPTS BY ABRAHAM SILBERSCHATZ, PETER B...SOLUTION MANUAL OF OPERATING SYSTEM CONCEPTS BY ABRAHAM SILBERSCHATZ, PETER B...
SOLUTION MANUAL OF OPERATING SYSTEM CONCEPTS BY ABRAHAM SILBERSCHATZ, PETER B...vtunotesbysree
 
Applying Cloud Techniques to Address Complexity in HPC System Integrations
Applying Cloud Techniques to Address Complexity in HPC System IntegrationsApplying Cloud Techniques to Address Complexity in HPC System Integrations
Applying Cloud Techniques to Address Complexity in HPC System Integrationsinside-BigData.com
 
Learning Software Performance Models for Dynamic and Uncertain Environments
Learning Software Performance Models for Dynamic and Uncertain EnvironmentsLearning Software Performance Models for Dynamic and Uncertain Environments
Learning Software Performance Models for Dynamic and Uncertain EnvironmentsPooyan Jamshidi
 
FAULT TOLERANCE OF RESOURCES IN COMPUTATIONAL GRIDS
FAULT TOLERANCE OF RESOURCES IN COMPUTATIONAL GRIDSFAULT TOLERANCE OF RESOURCES IN COMPUTATIONAL GRIDS
FAULT TOLERANCE OF RESOURCES IN COMPUTATIONAL GRIDSMaurvi04
 
Introduction to Parallel Computing
Introduction to Parallel ComputingIntroduction to Parallel Computing
Introduction to Parallel ComputingRoshan Karunarathna
 
IRJET- Smart Tracking System for Healthcare Monitoring using GSM/GPS
IRJET- Smart Tracking System for Healthcare Monitoring using GSM/GPSIRJET- Smart Tracking System for Healthcare Monitoring using GSM/GPS
IRJET- Smart Tracking System for Healthcare Monitoring using GSM/GPSIRJET Journal
 
IRJET- A Review on Aluminium (LM 25) Reinforced with Boron Carbide (B4C) & Tu...
IRJET- A Review on Aluminium (LM 25) Reinforced with Boron Carbide (B4C) & Tu...IRJET- A Review on Aluminium (LM 25) Reinforced with Boron Carbide (B4C) & Tu...
IRJET- A Review on Aluminium (LM 25) Reinforced with Boron Carbide (B4C) & Tu...IRJET Journal
 
IRJET - A Research on Eloquent Salvation and Productive Outsourcing of Massiv...
IRJET - A Research on Eloquent Salvation and Productive Outsourcing of Massiv...IRJET - A Research on Eloquent Salvation and Productive Outsourcing of Massiv...
IRJET - A Research on Eloquent Salvation and Productive Outsourcing of Massiv...IRJET Journal
 
Power Usage Effectiveness (PUE): Status & Directions. By K. Winkler
Power  Usage Effectiveness (PUE): Status & Directions. By K. WinklerPower  Usage Effectiveness (PUE): Status & Directions. By K. Winkler
Power Usage Effectiveness (PUE): Status & Directions. By K. WinklerDCC Mission Critical
 
Supermicro and The Green Grid (TGG)
Supermicro and The Green Grid (TGG)Supermicro and The Green Grid (TGG)
Supermicro and The Green Grid (TGG)Rebekah Rodriguez
 

Semelhante a SC17 Panel: Energy Efficiency Gains From HPC Software (20)

Sustainable Development using Green Programming
Sustainable Development using Green ProgrammingSustainable Development using Green Programming
Sustainable Development using Green Programming
 
Ppt4 london - michael rudgyard ( concurrent thinking ) driving efficiencie...
Ppt4   london -  michael rudgyard ( concurrent thinking ) driving efficiencie...Ppt4   london -  michael rudgyard ( concurrent thinking ) driving efficiencie...
Ppt4 london - michael rudgyard ( concurrent thinking ) driving efficiencie...
 
Cse viii-advanced-computer-architectures-06cs81-solution
Cse viii-advanced-computer-architectures-06cs81-solutionCse viii-advanced-computer-architectures-06cs81-solution
Cse viii-advanced-computer-architectures-06cs81-solution
 
Energy-Efficient Task Scheduling in Cloud Environment
Energy-Efficient Task Scheduling in Cloud EnvironmentEnergy-Efficient Task Scheduling in Cloud Environment
Energy-Efficient Task Scheduling in Cloud Environment
 
apidays London 2023 - API Green Score, Yannick Tremblais & Julien Brun, Green...
apidays London 2023 - API Green Score, Yannick Tremblais & Julien Brun, Green...apidays London 2023 - API Green Score, Yannick Tremblais & Julien Brun, Green...
apidays London 2023 - API Green Score, Yannick Tremblais & Julien Brun, Green...
 
IRJET- Enhance Dynamic Heterogeneous Shortest Job rst (DHSJF): A Task Schedu...
IRJET- Enhance Dynamic Heterogeneous Shortest Job rst (DHSJF): A Task Schedu...IRJET- Enhance Dynamic Heterogeneous Shortest Job rst (DHSJF): A Task Schedu...
IRJET- Enhance Dynamic Heterogeneous Shortest Job rst (DHSJF): A Task Schedu...
 
Multicore processor technology advantages and challenges
Multicore processor technology  advantages and challengesMulticore processor technology  advantages and challenges
Multicore processor technology advantages and challenges
 
IRJET- An Energy-Saving Task Scheduling Strategy based on Vacation Queuing & ...
IRJET- An Energy-Saving Task Scheduling Strategy based on Vacation Queuing & ...IRJET- An Energy-Saving Task Scheduling Strategy based on Vacation Queuing & ...
IRJET- An Energy-Saving Task Scheduling Strategy based on Vacation Queuing & ...
 
UnitOnePresentationSlides.pptx
UnitOnePresentationSlides.pptxUnitOnePresentationSlides.pptx
UnitOnePresentationSlides.pptx
 
Training - What is Performance ?
Training  - What is Performance ?Training  - What is Performance ?
Training - What is Performance ?
 
SOLUTION MANUAL OF OPERATING SYSTEM CONCEPTS BY ABRAHAM SILBERSCHATZ, PETER B...
SOLUTION MANUAL OF OPERATING SYSTEM CONCEPTS BY ABRAHAM SILBERSCHATZ, PETER B...SOLUTION MANUAL OF OPERATING SYSTEM CONCEPTS BY ABRAHAM SILBERSCHATZ, PETER B...
SOLUTION MANUAL OF OPERATING SYSTEM CONCEPTS BY ABRAHAM SILBERSCHATZ, PETER B...
 
Applying Cloud Techniques to Address Complexity in HPC System Integrations
Applying Cloud Techniques to Address Complexity in HPC System IntegrationsApplying Cloud Techniques to Address Complexity in HPC System Integrations
Applying Cloud Techniques to Address Complexity in HPC System Integrations
 
Learning Software Performance Models for Dynamic and Uncertain Environments
Learning Software Performance Models for Dynamic and Uncertain EnvironmentsLearning Software Performance Models for Dynamic and Uncertain Environments
Learning Software Performance Models for Dynamic and Uncertain Environments
 
FAULT TOLERANCE OF RESOURCES IN COMPUTATIONAL GRIDS
FAULT TOLERANCE OF RESOURCES IN COMPUTATIONAL GRIDSFAULT TOLERANCE OF RESOURCES IN COMPUTATIONAL GRIDS
FAULT TOLERANCE OF RESOURCES IN COMPUTATIONAL GRIDS
 
Introduction to Parallel Computing
Introduction to Parallel ComputingIntroduction to Parallel Computing
Introduction to Parallel Computing
 
IRJET- Smart Tracking System for Healthcare Monitoring using GSM/GPS
IRJET- Smart Tracking System for Healthcare Monitoring using GSM/GPSIRJET- Smart Tracking System for Healthcare Monitoring using GSM/GPS
IRJET- Smart Tracking System for Healthcare Monitoring using GSM/GPS
 
IRJET- A Review on Aluminium (LM 25) Reinforced with Boron Carbide (B4C) & Tu...
IRJET- A Review on Aluminium (LM 25) Reinforced with Boron Carbide (B4C) & Tu...IRJET- A Review on Aluminium (LM 25) Reinforced with Boron Carbide (B4C) & Tu...
IRJET- A Review on Aluminium (LM 25) Reinforced with Boron Carbide (B4C) & Tu...
 
IRJET - A Research on Eloquent Salvation and Productive Outsourcing of Massiv...
IRJET - A Research on Eloquent Salvation and Productive Outsourcing of Massiv...IRJET - A Research on Eloquent Salvation and Productive Outsourcing of Massiv...
IRJET - A Research on Eloquent Salvation and Productive Outsourcing of Massiv...
 
Power Usage Effectiveness (PUE): Status & Directions. By K. Winkler
Power  Usage Effectiveness (PUE): Status & Directions. By K. WinklerPower  Usage Effectiveness (PUE): Status & Directions. By K. Winkler
Power Usage Effectiveness (PUE): Status & Directions. By K. Winkler
 
Supermicro and The Green Grid (TGG)
Supermicro and The Green Grid (TGG)Supermicro and The Green Grid (TGG)
Supermicro and The Green Grid (TGG)
 

Mais de inside-BigData.com

Major Market Shifts in IT
Major Market Shifts in ITMajor Market Shifts in IT
Major Market Shifts in ITinside-BigData.com
 
Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...inside-BigData.com
 
Transforming Private 5G Networks
Transforming Private 5G NetworksTransforming Private 5G Networks
Transforming Private 5G Networksinside-BigData.com
 
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...inside-BigData.com
 
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...inside-BigData.com
 
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...inside-BigData.com
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networksinside-BigData.com
 
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean MonitoringBiohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoringinside-BigData.com
 
Machine Learning for Weather Forecasts
Machine Learning for Weather ForecastsMachine Learning for Weather Forecasts
Machine Learning for Weather Forecastsinside-BigData.com
 
HPC AI Advisory Council Update
HPC AI Advisory Council UpdateHPC AI Advisory Council Update
HPC AI Advisory Council Updateinside-BigData.com
 
Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19inside-BigData.com
 
Energy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic TuningEnergy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic Tuninginside-BigData.com
 
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODHPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODinside-BigData.com
 
Versal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud AccelerationVersal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud Accelerationinside-BigData.com
 
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance EfficientlyZettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance Efficientlyinside-BigData.com
 
Scaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's EraScaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's Erainside-BigData.com
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computinginside-BigData.com
 
Introducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi ClusterIntroducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi Clusterinside-BigData.com
 
Overview of HPC Interconnects
Overview of HPC InterconnectsOverview of HPC Interconnects
Overview of HPC Interconnectsinside-BigData.com
 

Mais de inside-BigData.com (20)

Major Market Shifts in IT
Major Market Shifts in ITMajor Market Shifts in IT
Major Market Shifts in IT
 
Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...
 
Transforming Private 5G Networks
Transforming Private 5G NetworksTransforming Private 5G Networks
Transforming Private 5G Networks
 
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
 
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
 
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networks
 
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean MonitoringBiohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
 
Machine Learning for Weather Forecasts
Machine Learning for Weather ForecastsMachine Learning for Weather Forecasts
Machine Learning for Weather Forecasts
 
HPC AI Advisory Council Update
HPC AI Advisory Council UpdateHPC AI Advisory Council Update
HPC AI Advisory Council Update
 
Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19
 
Energy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic TuningEnergy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic Tuning
 
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODHPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
 
State of ARM-based HPC
State of ARM-based HPCState of ARM-based HPC
State of ARM-based HPC
 
Versal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud AccelerationVersal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud Acceleration
 
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance EfficientlyZettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
 
Scaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's EraScaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's Era
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computing
 
Introducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi ClusterIntroducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi Cluster
 
Overview of HPC Interconnects
Overview of HPC InterconnectsOverview of HPC Interconnects
Overview of HPC Interconnects
 

Último

Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vĂĄzquez
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard37
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKJago de Vreede
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 

Último (20)

Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptx
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 

SC17 Panel: Energy Efficiency Gains From HPC Software

  • 1. 1 Dan Reed Reed, Iowa Sadaf Alam, CSCS/ETH Bill Gropp, NCSA/Illinois Satoshi Matsuoka, GSIC/Tokyo Tech John Shalf, NERSC/LBL
  • 2. 2 Gray’s cost/performance axes • Networking • Computation • Storage • Access … still driving change Facility, systems, operations Software …
  • 3. 3
  • 4. 4 What system software has been most helpful in improving system energy efficiency? Which software stack layers should support energy awareness? Is there user willingness to exploit libraries for energy efficiency? What might encourage adoption of APIs for measurement/management? Where are the biggest opportunities for energy efficiency improvements?
  • 5. 5 At a PUE of <1.1, how much more is there to be gained? Can software-based runtime introspection mechanisms react fast enough? What is the user reward for developing energy efficient codes? What is the total cost of ownership (TCO) tradeoff in tuning codes for (a) improved performance or (b) reduced energy consumption What are the fundamental properties of energy efficient applications?
  • 6. SC17 Panel: Energy Efficiency Gains From Software: Retrospectives and Perspectives Sadaf Alam Chief Technology Officer Swiss National Supercomputing Centre November 17, 2017
  • 7. SC17 Panel 7 Piz Daint and the User Lab. Model: Cray XC40/XC50; XC50 compute nodes: Intel® Xeon® E5-2690 v3 @ 2.60GHz (12 cores, 64GB RAM) and NVIDIA® Tesla® P100 16GB; XC40 compute nodes: Intel® Xeon® E5-2695 v4 @ 2.10GHz (18 cores, 64/128 GB RAM); interconnect configuration: Aries routing and communications ASIC, and Dragonfly network topology; scratch capacity: ~9 + 2.7 PB. http://www.cscs.ch/uploads/tx_factsheet/FSPizDaint_2017_EN.pdf http://www.cscs.ch/publications/highlights/ http://www.cscs.ch/uploads/tx_factsheet/AR2016_Online.pdf
  • 8. SC17 Panel 8 Measurement Tools on Piz Daint: – Level 3 measurements for official submissions (accuracy for Top500, Green500, …) – Grafana dashboard (overall system and cabinet row) – Cray performance measurement database (PMDB): DB with node, blade, cabinet, job, … – SLURM output, using data available on compute nodes, e.g.:
    > sacct -j 3****** -o JobID%20,ConsumedEnergyRaw
                   JobID  ConsumedEnergyRaw
    --------------------  -----------------
                 3******
           3******.batch      379848.000000
          3******.extern  2005970432.000000
               3******.0  2001926407.000000
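To show how such accounting data might be consumed programmatically, here is a minimal sketch (not from the slides) that shells out to sacct and converts the ConsumedEnergyRaw counter, which SLURM reports in joules on these systems, into kWh per job step; the job ID in the example is a placeholder.

    import subprocess

    def job_energy_kwh(jobid: str) -> dict:
        """Query SLURM accounting for per-step energy and convert joules to kWh.

        Assumes sacct is on PATH and the site records ConsumedEnergyRaw.
        """
        out = subprocess.run(
            ["sacct", "-j", jobid, "--noheader", "--parsable2",
             "-o", "JobID,ConsumedEnergyRaw"],
            check=True, capture_output=True, text=True).stdout
        energy = {}
        for line in out.splitlines():
            step, raw = line.split("|")
            if raw.strip():
                energy[step] = float(raw) / 3.6e6  # 1 kWh = 3.6e6 J
        return energy

    if __name__ == "__main__":
        for step, kwh in job_energy_kwh("3123456").items():  # hypothetical job ID
            print(f"{step:>20s}  {kwh:10.3f} kWh")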
  • 9. SC17 Panel 9 MeteoSwiss’ Performance Ambitions in 2013
  • 10. SC17 Panel 10 COSMO: old and new (refactored) code
  • 13. SC17 Panel 13 Only system in top 10 with level 3 submission Subset of cores with lower CPU frequency for Green500 submission Green500 top systems contain accelerator devices Engaging users and developers
  • 14. Thank you for your attention.
  • 15. Four Steps to Energy Efficiency Through Software William Gropp Wgropp.cs.illinois.edu
  • 16. What was the biggest contribution to HPC System Energy Efficiency? • Biggest contribution to HPC system energy efficiency was cooling. System uses air cooling within cabinets, with water removing heat from each cabinet. Heat exchangers on roof provide “free” cooling most of the year. • No impact on running application. No software changes required. • Problem: Our PUE is between 1.1 and 1.2; little further gain in efficiency possible here
  • 17. National Petascale Computing Facility • Only Facility in the world of this scale on an Academic Campus • Capable of sustained 24 MW today • Expandable in space, power and cooling [50,000 ft2 (4,645+ m2) machine room gallery and sustained 100 MW] • Modern Data Center • 90,000+ ft2 (8,360+ m2) total • 30,000 ft2 (2,790+ m2) raised floor 20,000 ft2 (1,860+ m2) machine room gallery • Energy Efficiency • LEED certified Gold • Power Utilization Efficiency, PUE = 1.1–1.2
  • 18. How Does Your Center Contribute Most in Energy Efficiency? • There are still gaps in the baseline – the efficiency of applications. This is why “race to halt” is (was?) often the best strategy. • Our focus remains on improving application efficiency – including moving some applications to use GPUs on our mixed XE6/XK7 system • There are many parts to this • One of the most difficult yet most important is having enough of a performance model to understand how close the application is to achievable performance • This is different from efficiency of execution based on counting operations (counting operations often over-estimates the number of necessary data moves, leading to a lower estimate for achievable performance) • Other steps, such as allocating jobs with topology awareness, are important
  • 19. Where Should Energy-Awareness Be Added to the Software Stack? • Easiest if it is in the runtime, the scheduler, or a closed application • That is, easiest if programmers don’t see it • The answer needs to be in the context of the goal • Information gathering (such as we did for I/O performance), including the interconnect contribution and message wait time (an opportunity to reduce power), can be done with limited impact using the MPI profiling interface or similar system intercept interfaces (see the sketch below) • If the runtime can “do no harm”, then do it in the runtime • If you can’t guarantee that, allow users to opt out • See discussion of incentives
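To make the information-gathering idea concrete, below is a minimal sketch of timing message-wait in application code; it uses mpi4py for brevity rather than a C PMPI interposer, and the accumulated wait time could feed a power-reduction policy. All names are illustrative and not from the slides.

    # Run with e.g.: mpiexec -n 2 python wait_time.py
    import time
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    wait_seconds = 0.0  # per-rank time spent blocked on communication

    buf = np.empty(1_000_000, dtype=np.float64)
    if rank == 0:
        buf[:] = 1.0
        comm.Send(buf, dest=1, tag=0)
    elif rank == 1:
        req = comm.Irecv(buf, source=0, tag=0)
        t0 = time.perf_counter()
        req.Wait()                              # time only the blocked portion
        wait_seconds += time.perf_counter() - t0

    total_wait = comm.reduce(wait_seconds, op=MPI.SUM, root=0)
    if rank == 0:
        # A runtime could use this signal to lower CPU frequency during waits.
        print(f"aggregate message-wait time: {total_wait:.6f} s")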
  • 20. Will Users Make Codes Energy Efficient? • No. • There is no incentive, and in fact there is usually a negative incentive if energy efficiency slows down the application • Some users may be willing to experiment or to be good “global citizens,” but most are focused on their science • Different charging policies must be approved; in our case, by NSF • For example, one could charge allocations with a weight that includes energy consumed, providing an incentive – but it must include the full cost of the system, including depreciation (see the sketch below)
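One hypothetical shape such a charging policy could take is sketched below: the charge blends node-hours (covering the full system cost, including depreciation) with metered energy. Every coefficient and name here is an assumption for illustration, not an NCSA or NSF policy.

    def energy_weighted_charge(node_hours: float,
                               energy_kwh: float,
                               tco_per_node_hour: float = 1.0,
                               energy_cost_per_kwh: float = 0.10,
                               energy_weight: float = 0.5) -> float:
        """Hypothetical allocation charge mixing capacity use and metered energy.

        node_hours        -- node-hours consumed by the job
        energy_kwh        -- metered energy for the job (e.g., from sacct)
        tco_per_node_hour -- amortized capital + operations cost per node-hour
        energy_weight     -- fraction of the energy cost passed through as incentive
        """
        capacity_cost = node_hours * tco_per_node_hour
        energy_cost = energy_kwh * energy_cost_per_kwh
        return capacity_cost + energy_weight * energy_cost

    # Same node-hours, different energy use -> different charges (the incentive).
    print(energy_weighted_charge(node_hours=1000, energy_kwh=400))  # 1020.0
    print(energy_weighted_charge(node_hours=1000, energy_kwh=250))  # 1012.5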
  • 21. Does the Community Need to Collect and Share Data? • Yes. To provide better incentives, there is a need to both measure energy use and provide actionable feedback to users. We need to collect data on different approaches and reflect them in the context of user incentives • There are many complexities in this, and it will be hard to provide accurate data about energy use by application • This is not the same as having a deterministic way to charge for energy use • The problem is that the actions of other applications may impact the energy use of shared resources (e.g., network, file system), and it may not be possible to determine what each application really used, as compared with what it was charged • Yes, this is also true for time charged to applications. That is not the issue; the issue is whether users accept the approximations that are used
  • 22. Four Steps for Energy Efficiency 1. Performance Efficiency • Like conservation, the first step is to get more from the same (or less) • Performance modeling • Performance tuning 2. Recognize Hardware Realities • Next is to adapt to the realities of the hardware • Rethink the algorithm • Adapt to memory latency, wide vectors, etc. • You can’t compare algorithms by comparing floating-point counts, or even memory references • Avoid BSP-style programs • Separated communication and computation phases, and the synchronization between them, waste energy
  • 23. Four Steps (cont’d) 3. You Get What You Measure • You get what you measure (and reward/penalize) • Again, focus on performance • Expose the cost of waiting 4. Consider the Total Cost • Finally, we need to consider total workflow costs • Local and global optima are not necessarily the same • Consider data motion for a data set – moving data back and forth can be expensive – even (locally) less efficient in situ analysis may be cheaper overall (a rough comparison is sketched below) • And consider how that total cost is exposed/charged to the user • There must be a positive incentive to the user for pursuing energy efficiency
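The data-motion point can be made with back-of-the-envelope energy bookkeeping, as in the sketch below, which compares writing a dataset out for post-hoc analysis against paying extra time for in situ analysis. Every constant is an assumed, illustrative figure, not a measurement from the slides.

    # Rough energy bookkeeping: post-hoc vs. in situ analysis (all values assumed).
    DATASET_BYTES   = 100e12      # 100 TB of simulation output
    FS_BANDWIDTH    = 100e9       # aggregate file-system bandwidth, bytes/s
    NODE_POWER_W    = 400.0       # average node power
    SIM_NODES       = 100         # nodes held by the simulation
    ANALYSIS_NODES  = 20          # nodes used for post-hoc analysis
    ANALYSIS_S      = 3600.0      # post-hoc analysis time
    IN_SITU_EXTRA_S = 1200.0      # extra simulation time when analyzing in situ
    J_PER_KWH = 3.6e6

    write_s = DATASET_BYTES / FS_BANDWIDTH   # time to write the dataset once
    read_s = write_s                         # and to read it back later

    # Option A: write the data out, read it back, analyze on a smaller partition.
    post_hoc_kwh = (SIM_NODES * NODE_POWER_W * write_s            # simulation waits on the write
                    + ANALYSIS_NODES * NODE_POWER_W * (read_s + ANALYSIS_S)
                    ) / J_PER_KWH

    # Option B: analyze in situ, paying extra time on the full simulation allocation.
    in_situ_kwh = SIM_NODES * NODE_POWER_W * IN_SITU_EXTRA_S / J_PER_KWH

    print(f"post-hoc analysis: {post_hoc_kwh:6.1f} kWh")   # ~21.3 kWh with these numbers
    print(f"in situ analysis:  {in_situ_kwh:6.1f} kWh")    # ~13.3 kWh with these numbers

With these particular assumptions the (locally) less efficient in situ path still wins; the point of the exercise is the bookkeeping, not the specific numbers.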
  • 24. Green Computing in TSUBAME2 and 3 Satoshi Matsuoka Professor, GSIC, Tokyo Institute of Technology / Director, AIST-Tokyo Tech. Big Data Open Innovation Lab / Fellow, Artificial Intelligence Research Center, AIST, Japan / Vis. Researcher, Advanced Institute for Computational Science, RIKEN SC17 Green Panel Talk, 2017/11/17
  • 25. Impacts of the March 11 Earthquake • The earthquake itself did not affect the supercomputer, but… • The power plant accident caused a power crisis in the capital area – rolling outages in March – university/government direction to save power • Although TSUBAME2 is “green”, it consumes ~10% (0.9–1.2 MW) of the campus’s power! Operation of TSUBAME2.0 started in Nov 2010, and the Great East Japan Earthquake happened 4 months later
  • 26. TSUBAME Power Operation: 2011 ver. (1) • The university directed us to reduce power usage: – by 50% in early April → the number of nodes has to drop by ~70% – by 25% in late April → the number of nodes has to drop by ~35% • Constantly shutting down that many nodes is harmful for supercomputer users • What is the social demand? The most important constraint is to reduce peak power consumption in the daytime → “peak shift” operation [Figure: total power usage of the capital area over a day, with the daytime peak highlighted]
  • 27. TSUBAME Power Operation: 2011 ver. (2) • It was allowable to increase the number of running nodes at night and on weekends → 65% operation in the daytime, 100% operation at night and on weekends • Tokyo Tech and NEC developed tools to shut down / wake up nodes automatically every day • The number “65%” was determined by measuring power consumption in typical operation
  • 28. Results of the Operation in 2011 [Plots: CPU usage and power for 5/12–6/8 and from 7/4 onward, against a 787 kW target] • Power was successfully limited in the daytime • However, the node cap is still sometimes too pessimistic • Our goal is not capping the number of nodes but capping power → a new power-aware system management method is required
  • 29. TSUBAME2.0 ⇒ 2.5 Thin Node Upgrade in 2013: HP SL390G7 thin node (developed for TSUBAME2.0, productized as HP ProLiant SL390s, modified for 2.5); CPU: Intel Westmere-EP 2.93 GHz x2; GPU: NVIDIA Kepler K20X x3 (1310 GFlops, 6 GByte memory per GPU), replacing NVIDIA Fermi M2050 (1039/515 GFlops) with Kepler K20X (3950/1310 GFlops); InfiniBand QDR x2 (80 Gbps); per-node peak 4.08 TFlops, ~800 GB/s memory BW, 80 Gbps network, ~1 kW max; system total: 2.4 PFlops ⇒ 5.7 PFlops
  • 30. TSUBAME Power Operation: 2014 ver. • Dynamic control is introduced based on real-time system power (1) Real-time power consumption of the system is measured, including cooling (2) Current power and the target cap are compared: is current power sufficiently below the specified cap, or are we in a critical zone? → the number of working nodes is controlled dynamically • Each node transitions between ON, Offline (awake, but no new job starts on it), and OFF states (simplified; a control-loop sketch follows below)
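A minimal sketch of such a control loop is shown below, assuming a hypothetical read_system_power_watts() meter and set_node_target() hook into the scheduler; the cap, hysteresis band, and step size are illustrative values rather than the actual TSUBAME settings.

    import time

    POWER_CAP_W     = 800_000   # target cap (illustrative; cf. the 800 kW result below)
    CRITICAL_BAND_W = 30_000    # hysteresis band below the cap
    NODE_STEP       = 20        # nodes to offline/online per control interval
    INTERVAL_S      = 60

    def read_system_power_watts() -> float:
        """Hypothetical hook: return current system power, cooling included."""
        raise NotImplementedError

    def set_node_target(delta: int) -> None:
        """Hypothetical hook: ask the scheduler to offline (delta < 0) or online (delta > 0) nodes."""
        raise NotImplementedError

    def control_loop() -> None:
        while True:
            power = read_system_power_watts()
            if power > POWER_CAP_W - CRITICAL_BAND_W:
                set_node_target(-NODE_STEP)       # critical zone: drain and offline nodes
            elif power < POWER_CAP_W - 2 * CRITICAL_BAND_W:
                set_node_target(+NODE_STEP)       # comfortably below the cap: wake nodes
            time.sleep(INTERVAL_S)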
  • 31. Results of the Operation in 2014 [Plots: number of working CPU cores and power consumption over time, daytime highlighted] • The number of nodes is controlled dynamically • Power is successfully capped at 800 kW
  • 32. TSUBAME Power Operation: 2016 ver. • In addition to the peak-shift operation (2014 ver.), we implemented a new mechanism to reduce total energy consumption → idle nodes without running jobs are shut down – Disadvantage: the start of new jobs may be delayed by around ten minutes while nodes boot [Diagram: nodes move between Assigned, Idle, and Powered-off states]
  • 33. Illustration of Power Consumption for Parallel Operation of TSUBAME3.0 + 2.5 [Plot: power consumption over a fiscal year (April to the next March), showing TSUBAME3.0 and TSUBAME2.5 power, the peak power in the summer, and the system usage peak in the winter] • We must do both: (1) limit power in the summer and (2) keep the total energy consumption within the given budget
  • 34. Controlling Total Electric Fee per FY (plan) Assumptions: • System usage reaches its peak in the winter • The electric fee per Wh changes every month because of surcharges Strategy: • All T3 nodes are “ON”, and T2.5 nodes come alive on demand • We allocate a budget for each month • Based on the month’s budget, we decide how many T2.5 nodes to keep working • When there is a remaining amount, we carry it over to the following months (a sizing sketch follows below). Tentative numbers:
               Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec  Jan  Feb  Mar
    x100k yen   60   65   70   50   50  120  120  120  120  120  120  120
    Total MWh  288  312  360  245  220  480  500    ?    ?    ?    ?    ?
    T3 MWh     200  250  300  220  210  350  350    ?    ?    ?    ?    ?
    #T2 nodes  400  390  380   40   20  300  360    ?    ?    ?    ?    ?
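A sketch of the sizing arithmetic implied by the plan is below: given a monthly energy budget and the T3 baseline, the leftover MWh are turned into a T2.5 node count. The per-node power figure and the helper name are assumptions for illustration only, not the values used in operation.

    def t25_nodes_for_month(month_budget_mwh: float,
                            t3_baseline_mwh: float,
                            hours_in_month: float = 720.0,
                            t25_node_kw: float = 1.0) -> int:
        """How many TSUBAME2.5 nodes the leftover monthly energy budget can sustain.

        t25_node_kw is an assumed average power per T2.5 node (including a cooling share).
        """
        leftover_mwh = max(0.0, month_budget_mwh - t3_baseline_mwh)
        node_mwh = t25_node_kw * hours_in_month / 1000.0   # MWh one node uses in a month
        return int(leftover_mwh // node_mwh)

    # Example with the (tentative) April figures: 288 MWh total, 200 MWh for T3.
    print(t25_nodes_for_month(288, 200))   # -> 122 nodes with these assumed values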
  • 35. Overview of TSUBAME3.0: BYTES-centric architecture, scalability to all 2160 GPUs, all nodes, and the entire memory hierarchy • Full-bisection-bandwidth Intel Omni-Path interconnect, 4 ports/node, full bisection / 432 Terabits/s bidirectional (~2x the bandwidth of the entire Internet backbone traffic) • DDN storage (Lustre FS 15.9 PB + home 45 TB) • 540 compute nodes, SGI ICE XA + new blade: Intel Xeon CPU x2 + NVIDIA Pascal GPU x4 (NVLink), 256 GB memory, 2 TB Intel NVMe SSD • 47.2 AI-Petaflops, 12.1 Petaflops • Full operations Aug. 2017
  • 36. TSUBAME3.0 Co-Designed SGI ICE-XA Blade (new) - No exterior cable mess (power, NW, water) - Plan to become a future HPE product
  • 37. Power Meters Running Benchmarks (HPL, HPCG) Tokyo Tech + HPE/SGI Japan Team Top 500 Submission = 1.x Petaflops @ 144 Nodes Green 500 Submission = 14.11 Gflops/W (Nov. 2016 #1 = 9.46 Gflops/W) Announcement ISC17 Opening, 9AM Monday, 19th June World #1 !!!
  • 38. Liquid-cooled, “hot pluggable” ICE-XA blade: smaller than a 1U server, no cables or pipes; 100 Gbps x4; liquid-cooled NVMe PCIe drive bay x4; Xeon x2; PCIe switch; >20 TeraFlops DFP; 256 GByte memory
  • 39. 144 GPUs & 72 CPUs/rack Integrated 100/200Gbps Fabric Backplane 15 Compute Racks 4 DDN Storage Racks 3 Peripheral & SW racks Total 22 Racks
  • 40. 40
  • 41. Warm Water Cooling Distribution in T3: PUE ≈ 1.03 (~1.1 with storage) [Diagram: compute nodes (HPE SGI ICE XA), storage, and interconnect switches; rooftop free-cooling tower (outgoing 32 °C, return 40 °C); backup heat exchanger and on-the-ground chillers shared with TSUBAME2 (outgoing 17 °C, return 24 °C); cooling capacities of 1 MW and 2 MW; in-room air conditioning for humans, 100 kW max]
  • 42. Smart Data Center Operation for ABCI (NEDO Project, 2017–) • Started to develop a system that achieves self-sustainable operation and reduces the operating cost of a data center, especially for the ABCI system • Data storage for sensor data from the nodes, cooling system, etc. • ML/DL algorithms to analyze the data and model data center behavior • Reduce power consumption, detect errors, etc. • Apply the algorithms to improve operation • Current status: started in Aug. 2017; designing/developing the sensor data collector and its storage
  • 43. Comparing TSUBAME3/ABCI to a Classical IDC: AI IDC CAPEX/OPEX acceleration by >100x • Traditional Xeon IDC: ~10 kW/rack, PUE 1.5–2, 15–20 1U Xeon servers, 2 Tera AI-FLOPS (SFP) / server, 30–40 Tera AI-FLOPS / rack • TSUBAME3 (+Volta) & ABCI IDC: ~60 kW/rack, PUE 1.0x, ~36 T3-evolution servers, ~500 Tera AI-FLOPS (HFP) / server, ~17 Peta AI-FLOPS / rack • Performance > 400–600x, power efficiency > 200–300x
  • 44. Energy Efficiency Gains from Software: Retrospectives and Perspectives SC17 November, 2017 John Shalf Lawrence Berkeley National Laboratory
  • 45. Measurement and Metrics "If you can not measure it, you can not improve it.” -Lord Kelvin (1883)
  • 46. High Data Rate Collection System • Current data sources: substations, panels, PDUs, UPS; Cori & Edison SEDC; OneWire temperature & RH sensors; BMS through BACnet; indoor particle counters; weather station • Future data sources: syslog, job data, Lustre + GPFS statistics, LDMS, outdoor particle counters, ??? • RabbitMQ, Elastic, Linux – collects ~20K data items per second – over 40 TB of data online (100 TB capacity) – 45 days of SEDC (versus 3 hours on the SMW) – 180 days of BMS data (6x more than the BMS) – 260 days of power data • Visualization with Kibana and Grafana (a publishing sketch follows below)
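As a rough illustration of how such a pipeline might ingest readings, the sketch below publishes JSON-encoded sensor samples to a RabbitMQ queue using the pika client; the queue name, fields, and broker location are assumptions, not details of the NERSC system.

    import json
    import time
    import pika  # RabbitMQ client library

    def publish_reading(channel, sensor_id: str, value: float, unit: str) -> None:
        """Publish one sensor sample as JSON to a (hypothetical) 'facility.sensors' queue."""
        payload = {
            "sensor": sensor_id,
            "value": value,
            "unit": unit,
            "timestamp": time.time(),
        }
        channel.basic_publish(exchange="",
                              routing_key="facility.sensors",
                              body=json.dumps(payload))

    if __name__ == "__main__":
        connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
        channel = connection.channel()
        channel.queue_declare(queue="facility.sensors", durable=True)
        publish_reading(channel, "pdu-row3-cab12", 48.7, "kW")   # illustrative reading
        connection.close()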
  • 47. Liquid Cooling Performance [Chart: baseline vs. balanced configuration] • Have saved 700,000 kWh/year
  • 48. Analytics for Resource Utilization [Plot: cumulative distribution function (%) of memory per process (GB) on Genepool, Jan–Jun 2016; series: maxrss by jobs, maxrss by slot-hours, maxvmem by jobs, maxvmem by slot-hours] • Introspective business analytics can save you a LOT of MONEY and POWER
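The kind of analysis behind such a plot can be sketched in a few lines: given per-job records of peak resident memory and slot-hours, compute the slot-hour-weighted cumulative distribution to see how much of the workload actually needs large-memory nodes. The record format and values here are assumed for illustration.

    from bisect import bisect_right

    # Hypothetical job records: (max_rss_gb, slot_hours)
    jobs = [(1.2, 10.0), (3.8, 250.0), (7.5, 40.0), (15.0, 5.0), (60.0, 2.0)]

    def weighted_cdf(records, thresholds):
        """Fraction of slot-hours used by jobs whose max RSS is <= each threshold (GB)."""
        records = sorted(records)
        rss = [r for r, _ in records]
        cum_hours, total = [], 0.0
        for _, hours in records:
            total += hours
            cum_hours.append(total)
        out = []
        for t in thresholds:
            i = bisect_right(rss, t)
            out.append((cum_hours[i - 1] / total) if i else 0.0)
        return out

    thresholds = [1, 2, 4, 8, 16, 32, 64, 128]
    for t, frac in zip(thresholds, weighted_cdf(jobs, thresholds)):
        print(f"<= {t:4d} GB : {100 * frac:5.1f}% of slot-hours")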
  • 49. Realtime Feedback and Tuning for Energy Efficiency • Adjust frequency & measure energy: performance and power increase with frequency, but at different rates (Brian Austin, NERSC) [Plot: application energy use normalized vs. 2.4 GHz across CPU frequencies 1.0–2.6 GHz for GTC, MILC, and Mini-DFT] • GTC: no minimum. MILC: 13% performance penalty. Mini-DFT: 11% performance penalty.
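A hedged sketch of such an experiment on a Linux node is below: it sets the CPU frequency with the cpupower tool (requires root and an appropriate governor) and reads package energy from the Intel RAPL sysfs counter before and after a workload. The paths and the workload hook are assumptions, and counter wraparound is ignored for brevity.

    import subprocess
    import time

    RAPL_ENERGY = "/sys/class/powercap/intel-rapl:0/energy_uj"  # package-0 energy counter (microjoules)

    def read_energy_uj() -> int:
        with open(RAPL_ENERGY) as f:
            return int(f.read().strip())

    def run_workload() -> None:
        """Placeholder for the application kernel being tuned."""
        sum(i * i for i in range(10_000_000))

    def measure_at_frequency(ghz):
        """Pin the CPU frequency (needs root), run the workload, return (walltime_s, energy_j)."""
        subprocess.run(["cpupower", "frequency-set", "-f", f"{ghz}GHz"], check=True)
        e0, t0 = read_energy_uj(), time.perf_counter()
        run_workload()
        t1, e1 = time.perf_counter(), read_energy_uj()
        return t1 - t0, (e1 - e0) / 1e6  # counter wraparound ignored for brevity

    if __name__ == "__main__":
        for ghz in (1.0, 1.4, 1.8, 2.2, 2.6):
            walltime, energy = measure_at_frequency(ghz)
            print(f"{ghz:.1f} GHz: {walltime:6.2f} s, {energy:8.1f} J")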
  • 50. Anatomy of a “Value” Metric Good Stuff Bad Stuff
  • 51. Anatomy of a “Value” Metric: FLOP/s (Bogus!!!) / Watts (Potentially Bogus!!)
  • 52. Anatomy of a “Value” Metric: Measured Performance / Measured Watts • The denominator has been mined close to its limit: at a PUE of 1.1 it is hard to squeeze out more, but we must remain vigilant • Optimizing performance still has substantial room for improvement
  • 53. Roofline Model: “Houston, is there a problem?” Am I under-performing? (FLOP rate is a poor metric!) And how bad is it???
  • 54. Roofline Automation in Intel® Advisor 2017 • Improving under collaboration with NERSC, LBNL, DOE • Each dot represents a loop or function in YOUR APPLICATION (profiled) • Each roof (slope) gives the peak CPU/memory throughput of your PLATFORM (benchmarked) • Automatic and integrated – a first-class citizen in Intel® Advisor
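The roofline bound itself is simple to state in code: attainable performance is the minimum of the peak compute rate and arithmetic intensity times memory bandwidth. The sketch below uses made-up machine and kernel numbers purely to show how an operating point compares against that ceiling.

    def roofline_gflops(arith_intensity: float,
                        peak_gflops: float,
                        mem_bw_gbs: float) -> float:
        """Attainable GFLOP/s under the roofline model:
        min(peak compute, arithmetic intensity [flop/byte] * memory bandwidth)."""
        return min(peak_gflops, arith_intensity * mem_bw_gbs)

    # Illustrative machine: 2000 GFLOP/s peak, 100 GB/s DRAM bandwidth (assumed numbers).
    PEAK, BW = 2000.0, 100.0

    for name, ai, measured in [("stream-like kernel", 0.25, 22.0),
                               ("stencil", 1.0, 70.0),
                               ("dense matmul", 30.0, 1500.0)]:
        ceiling = roofline_gflops(ai, PEAK, BW)
        print(f"{name:18s} AI={ai:5.2f}  ceiling={ceiling:7.1f} GFLOP/s  "
              f"measured={measured:7.1f}  ({100 * measured / ceiling:4.0f}% of roof)")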
  • 55. Speed-ups from NESAP Investments • Comparison is node to node: 1 node of Edison = 2 Ivy Bridge processors, 1 node of Cori = 1 KNL processor • A 7x increase in performance is nearly a 7x increase in energy efficiency (at roughly constant node power, energy ≈ power × time)!
  • 56. ARPA-E PINE key concept: dynamic bandwidth steering for energy efficiency (use case for power API?) [Diagram: compute, HBM, and NVRAM multi-chip modules with packet-switching MCMs and an optical switch steering intra-server and inter-server links between CPUs, GPUs, and memory] • Team: Bergman, Gaeta, Lipson, Bowers, Coolbaugh, Johansson, Patel, Dennison, Shalf, Ghobadi
  • 57. Conclusions • Measuring datacenter operational efficiency has resulted in huge improvements – at a PUE of 1.1, it is hard to mine MORE efficiency – but we must stay vigilant with continuous monitoring • Runtime-system optimizations driven by real-time feedback may work – many programming structures are currently fixed at compile time (so this is limited today) – a software-mediated control loop may be too slow for nanosecond events – the runtime system will always know less about the application than the application scientist • Tuning codes is a HUGE energy benefit (energy efficiency = performance/watts)! – NESAP code tuning offered the largest (3x–7x) efficiency improvements – we need tools that do a better job of identifying the performance *opportunity* (roofline) • Integrated analytics at every level is essential for identifying efficiency opportunities – but we are collecting data faster than we can formulate questions to pose against it! – perhaps an opportunity for machine learning as “anomaly detection”
  • 58. 58 What system software has been most helpful in improving system energy efficiency? Which software stack layers should support energy awareness? Is there user willingness to exploit libraries for energy efficiency? What might encourage adoption of APIs for measurement/management? Where are the biggest opportunities for energy efficiency improvements? Can software-based runtime introspection mechanisms react fast enough? What is the user reward for developing energy efficient codes? What is the total cost of ownership (TCO) tradeoff in tuning codes for (a) improved performance or (b) reduced energy consumption What are the fundamental properties of energy efficient applications?