Introduction to
National Supercomputer Center in Tianjin
TH-1A Supercomputer
Agenda

• National Supercomputer Center in Tianjin (NSCC-TJ)
• TH-1A system
    • Hardware sub-system
    • Software sub-system
• Applications
NSCC-TJ

• National Supercomputer Center in Tianjin
    • Sponsored by
        • Chinese Ministry of Science and Technology
        • Tianjin Binhai New Area
    • Public information infrastructure
        • To accelerate the economy, education and industry of Northern China
        • To provide high-performance computing services to the whole of China
    • Open platform for research and education
NSCC-TJ


[Figure: site layout showing the main building, office, computer room (total area: 2,400 m²), and transformer station & air conditioner]
NSCC-TJ

• Central computing room, first floor: 1,200 m²
• Central computing room, second floor: visualization environment, 1,200 m²
• Electric transformer station
• Cooling water station


2011-6-28, TH-1
NSCC-TJ
• Layout of the computing room
TH-1A system
• Enhanced system based on the TH-1 system (Sep. 2009)
• Installed in NSCC-TJ, Aug. 2010
• Debugging and performance testing, Sept. to Oct. 2010
• In service since Nov. 2010

    Items          Configuration
    Processors     14,336 Intel CPUs + 7,168 NVIDIA GPUs + 2,048 FT CPUs
    Memory         262 TB in total
    Interconnect   Proprietary high-speed interconnect network
    Storage        2 PB
    Cabinets       120 compute/service cabinets
                   14 storage cabinets
                   6 communication cabinets
TH-1A system

• TH-1A system architecture
    • Hybrid MPP structure: CPU & GPU
    • Proprietary compute nodes
    • Connected by a proprietary high-speed interconnect network
    • Global shared parallel storage system
    • Custom software stack
TH-1A hardware sub-system

[Figure: TH-1A hardware sub-systems: compute sub-system (CPU + GPU nodes), service sub-system (operation nodes), communication sub-system, storage sub-system (MDS and OSS nodes), and monitor and diagnosis sub-system]
Compute sub-system
• 7,168 compute nodes
    • 2 six-core CPUs and 1 GPU per node
    • CPU
        • Xeon X5670 (Westmere)
        • Processor speed: 2.93 GHz
    • GPU
        • NVIDIA Tesla M2050
        • Connected to the CPUs by PCI-E
    • 32 GB memory per node
    • 2U height
    • Peak performance
        • 4,701,061 Gflops
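The peak figure above can be reproduced with a back-of-the-envelope calculation. The sketch below is not taken from the slides: it assumes 4 double-precision flops per cycle per Westmere core and a Tesla M2050 double-precision peak of 515.2 Gflops (448 cores x 1.15 GHz x 2 flops per fused multiply-add, at half rate for double precision). Under those assumptions the sum agrees with the quoted 4,701,061 Gflops.

```python
# Back-of-the-envelope peak for the compute sub-system.
# Node counts and the CPU clock come from the slides; the per-cycle and
# per-GPU figures below are assumptions, not values stated in the deck.

COMPUTE_NODES = 7_168
CPUS_PER_NODE = 2
GPUS_PER_NODE = 1
CORES_PER_CPU = 6
CPU_CLOCK_GHZ = 2.93

CPU_FLOPS_PER_CYCLE = 4   # assumption: DP flops per core per cycle
GPU_PEAK_GFLOPS = 515.2   # assumption: Tesla M2050 DP peak


def cpu_peak_gflops() -> float:
    """Peak of one six-core Xeon X5670 under the assumptions above."""
    return CORES_PER_CPU * CPU_CLOCK_GHZ * CPU_FLOPS_PER_CYCLE


def compute_subsystem_peak_gflops() -> float:
    """Sum of CPU and GPU peaks over all compute nodes."""
    cpu_total = COMPUTE_NODES * CPUS_PER_NODE * cpu_peak_gflops()
    gpu_total = COMPUTE_NODES * GPUS_PER_NODE * GPU_PEAK_GFLOPS
    return cpu_total + gpu_total


if __name__ == "__main__":
    print(f"per-CPU peak: {cpu_peak_gflops():.2f} Gflops")
    print(f"compute sub-system peak: {compute_subsystem_peak_gflops():,.0f} Gflops")
```

Under these assumptions the GPUs contribute roughly 79% of the peak, which is why the later slides focus on balancing work between CPU and GPU.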
Service sub-system
• 1,024 service nodes
    • 2 eight-core domestic CPUs per node
    • CPU: FT-1000
        • SoC
        • 1.0 GHz
        • Eight cores, eight threads per core
        • Peak performance: 8 Gflops
    • 32 GB memory per node
    • For login, compilation, and applications that need throughput computing
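The service-node numbers can be cross-checked against the system totals quoted earlier. This is a minimal sketch; the one-flop-per-core-per-cycle figure is an assumption chosen because it reproduces the quoted 8 Gflops, and all function names are illustrative.

```python
# Cross-check of the service sub-system figures.
# Counts, clock and memory size come from the slides.

SERVICE_NODES = 1_024
COMPUTE_NODES = 7_168            # from the compute sub-system slide
CPUS_PER_SERVICE_NODE = 2
CORES_PER_FT1000 = 8
THREADS_PER_CORE = 8
FT1000_CLOCK_GHZ = 1.0
MEMORY_PER_NODE_GB = 32

FLOPS_PER_CORE_PER_CYCLE = 1     # assumption, reproduces the quoted 8 Gflops


def ft1000_peak_gflops() -> float:
    """Peak of one FT-1000 chip under the assumption above."""
    return CORES_PER_FT1000 * FT1000_CLOCK_GHZ * FLOPS_PER_CORE_PER_CYCLE


def service_cpu_count() -> int:
    """Total FT-1000 chips in the service sub-system."""
    return SERVICE_NODES * CPUS_PER_SERVICE_NODE


def service_hardware_threads() -> int:
    """Hardware threads available for throughput-oriented work."""
    return service_cpu_count() * CORES_PER_FT1000 * THREADS_PER_CORE


def total_memory_gb() -> int:
    """Memory over compute and service nodes, 32 GB each."""
    return (COMPUTE_NODES + SERVICE_NODES) * MEMORY_PER_NODE_GB


if __name__ == "__main__":
    print(service_cpu_count(), "FT CPUs")
    print(service_hardware_threads(), "hardware threads")
    print(total_memory_gb(), "GB of memory in total")
```

The chip count matches the 2,048 FT CPUs in the configuration table, and the memory total of 262,144 GB is consistent with the 262 TB listed there.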
Proprietary interconnection network
• Interconnection signal speed: 10 Gbps
• Bi-directional bandwidth: 160 Gbps
• Hierarchical fat-tree structure
    • First stage: 16 nodes connected by a 16-port switching board
    • Second stage: all parts connected to eleven 384-port switches
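A rough sizing of the two stages follows from these numbers. The sketch below assumes that every compute and service node attaches to a first-stage board, and it infers 8 signal lanes per direction from 160 Gbps / 2 / 10 Gbps; neither point is stated in the slides. The number of uplinks per board is not given, so it is not modelled.

```python
import math

# Figures from the slides.
NODES_PER_LEAF_BOARD = 16
SECOND_STAGE_SWITCHES = 11
PORTS_PER_SECOND_STAGE_SWITCH = 384
SIGNAL_GBPS = 10
BIDIR_LINK_GBPS = 160


def leaf_boards(nodes: int) -> int:
    """First-stage boards needed if each board serves 16 nodes."""
    return math.ceil(nodes / NODES_PER_LEAF_BOARD)


def second_stage_ports() -> int:
    """Total ports offered by the eleven 384-port switches."""
    return SECOND_STAGE_SWITCHES * PORTS_PER_SECOND_STAGE_SWITCH


def lanes_per_direction() -> int:
    """Inferred lane count: half the bi-directional rate over 10 Gbps signals."""
    return BIDIR_LINK_GBPS // 2 // SIGNAL_GBPS


if __name__ == "__main__":
    print("compute leaf boards:", leaf_boards(7_168))
    print("service leaf boards:", leaf_boards(1_024))
    print("second-stage ports:", second_stage_ports())
    print("lanes per direction:", lanes_per_direction())
```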
Proprietary interconnection network
• High-radix router ASIC: NRC
    • Feature size: 90 nm
    • Die size: 17.16 mm x 17.16 mm
    • Package: FC-PBGA
    • 2,577 pins
    • Throughput of a single NRC: 2.56 Tbps
• Network interface ASIC: NIC
    • Same feature size and package as NRC
    • Die size: 10.76 mm x 10.76 mm
    • 675 pins
Proprietary interconnection network
[Figure: 16-port switch board in a cabinet; leaf switch blade and root switch blade of the 384-port switch; backplane of the 384-port switch, about 700 mm x 600 mm]
Proprietary interconnecting network
• Switching board and high-radix switch
    • Based on the network interface ASIC and router ASIC
• Reduced user communication protocol
• Throughput: 61.44 Tbps

[Figure: front and back views of two 384-port high-radix switches]
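Both throughput figures quoted for the interconnect (2.56 Tbps for a single NRC and 61.44 Tbps here) are consistent with multiplying a port count by the 160 Gbps bi-directional port bandwidth. The sketch below checks that arithmetic. It assumes the NRC exposes 16 ports, which the slides imply through the 16-port switching board but do not state outright.

```python
PORT_BIDIR_GBPS = 160   # bi-directional bandwidth per port, from the slides


def aggregate_tbps(ports: int, port_gbps: float = PORT_BIDIR_GBPS) -> float:
    """Aggregate throughput in Tbps for a switch with `ports` ports."""
    return ports * port_gbps / 1000


if __name__ == "__main__":
    # Assumption: one NRC router ASIC has 16 ports.
    print(f"NRC (16 ports): {aggregate_tbps(16):.2f} Tbps")
    print(f"384-port switch: {aggregate_tbps(384):.2f} Tbps")
```

Read this way, the 61.44 Tbps figure describes a single 384-port switch rather than the whole network.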
Storage sub-system
• Capacity: 2 PB
• Connected by the proprietary interconnection network
• Lustre-based parallel file system
Monitor and diagnosis sub-system
• Rich monitoring and control functions
    • Real-time monitoring of hardware parameters
    • Precise fault location
    • Alarm and immediate action in emergencies
    • Self-feedback cooling adjustment based on environment status
    • I2C and JTAG diagnosis mechanism
    • Large-scale console
    • Remote monitoring and management
Computing cabinet
• Node: 2 CPUs and 1 GPU
• Blade: 2 nodes
• Frame
    • 8 computing blades
    • 16-port switching board
    • 1 monitor and diagnosis board
• Cabinet
    • 4 frames, 64 nodes
    • 128 CPUs, 64 GPUs
• Close-coupled chilled water cooling
    • 56 kW cooling capacity per cabinet
• Footprint
    • 700 m²
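The packaging hierarchy can be written down as a small model to check that the per-cabinet totals follow from the per-blade and per-frame counts. This is an illustrative sketch; the class and function names are hypothetical, and the cabinet count it derives for the compute nodes is an inference, not a figure from the slides.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Packaging:
    """Packaging counts taken from the computing-cabinet slide."""
    cpus_per_node: int = 2
    gpus_per_node: int = 1
    nodes_per_blade: int = 2
    blades_per_frame: int = 8
    frames_per_cabinet: int = 4

    @property
    def nodes_per_frame(self) -> int:
        return self.nodes_per_blade * self.blades_per_frame

    @property
    def nodes_per_cabinet(self) -> int:
        return self.nodes_per_frame * self.frames_per_cabinet

    @property
    def cpus_per_cabinet(self) -> int:
        return self.nodes_per_cabinet * self.cpus_per_node

    @property
    def gpus_per_cabinet(self) -> int:
        return self.nodes_per_cabinet * self.gpus_per_node


def cabinets_needed(total_nodes: int, packaging: Packaging) -> int:
    """Cabinets required to house `total_nodes` compute nodes."""
    return math.ceil(total_nodes / packaging.nodes_per_cabinet)


if __name__ == "__main__":
    p = Packaging()
    print("nodes per frame:", p.nodes_per_frame)
    print("nodes per cabinet:", p.nodes_per_cabinet)
    print("CPUs / GPUs per cabinet:", p.cpus_per_cabinet, "/", p.gpus_per_cabinet)
    print("cabinets for 7,168 compute nodes:", cabinets_needed(7_168, p))
```

The 16 nodes per frame match the 16-port switching board in each frame. The 7,168 compute nodes would fill 112 cabinets, which would leave 8 of the 120 compute/service cabinets for service nodes if the two kinds are housed separately.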
TH-1A software sub-system
• Software stack
Operating system

• Kylin Linux
• Compute node kernel
• Provides virtual running environments
    • Isolated running environments for different users
    • Custom software package installation
• QoS support
• Power-aware computing
Compiler system
• C, C++, Fortran, Java
• OpenMP, MPI, OpenMP/MPI
• CUDA, OpenCL
• Heterogeneous programming framework
    • Accelerates large-scale, complex applications, especially applications that are still under development or whose full source code is not available
    • Uses the computing power of both CPUs and GPUs while hiding GPU programming from users
        • Inter-node homogeneous parallel programming (users)
        • Intra-node heterogeneous parallel computing (computer experts)
Compiler system
• Heterogeneous programming framework
    • Inter-node homogeneous parallel programming (JASMIN)
        • Patch-based object data structures
        • MPI communication, dynamic load balancing support
        • Zero-copy optimization in the communication library
Compiler system
• Heterogeneous programming framework
    • Intra-node heterogeneous parallel computing
        • Compiler-optimized / hand-tuned threaded code
        • Optimizations include
            • Adaptive partitioning: balances the workload between CPUs and GPU
            • Asynchronous data transfer / computing: overlaps CPU operations with GPU operations
            • Software pipelining: overlaps GPU computing with data transfer between host and GPU device memory
            • ……
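The benefit of software pipelining can be estimated with a simple timing model. The sketch below is an idealized model, not code from the framework: it assumes each chunk goes through three stages (host-to-device copy, kernel, device-to-host copy) with constant durations, and that the three stages can run concurrently on different chunks. The stage times in the usage example are made-up illustrative values.

```python
def serial_time(n_chunks: int, h2d: float, compute: float, d2h: float) -> float:
    """Total time when every chunk is copied in, computed and copied out in turn."""
    return n_chunks * (h2d + compute + d2h)


def pipelined_time(n_chunks: int, h2d: float, compute: float, d2h: float) -> float:
    """Total time for an ideal three-stage pipeline with constant stage times.

    The first chunk pays for all three stages; each later chunk adds only
    the duration of the slowest stage.
    """
    if n_chunks <= 0:
        return 0.0
    stages = (h2d, compute, d2h)
    return sum(stages) + (n_chunks - 1) * max(stages)


if __name__ == "__main__":
    # Illustrative values only: 8 chunks, times in milliseconds.
    n, h2d, compute, d2h = 8, 1.0, 3.0, 1.0
    print("serial:   ", serial_time(n, h2d, compute, d2h), "ms")
    print("pipelined:", pipelined_time(n, h2d, compute, d2h), "ms")
```

In this model the transfers are hidden almost completely once the kernel is the slowest stage, so total time approaches the pure compute time as the number of chunks grows.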
Compiler system
• Heterogeneous programming framework
    • An example: 3-D short-range molecular simulations
    • For each time step
        • Split the workload (force calculation) between CPU and GPU
            • For each patch allocated to the GPU
                • Start asynchronous operations: transfer the patch data to the GPU, compute the patch, get the results from the GPU
            • For each patch allocated to the CPU
                • Launch threads on CPU cores to compute the patch
            • CPU waits for the GPU completion event
            • Adjust the split value according to the CPU/GPU performance (patches per second + empirical)
        • Other workload (velocity, position) is computed on the CPU
    • Performance: one NVIDIA M2050 GPU is 3 times faster than one Intel X5670 CPU
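The time-step loop above can be sketched in a few lines. This is a minimal illustration and not the framework's implementation. Standard-library thread pools stand in for the GPU stream and the CPU worker threads, and `compute_patch`, `time_step`, `adjust_split` and the damping factor are hypothetical names. The throughput rates in the usage example are fixed illustrative values rather than measurements.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_patch(patch):
    """Stand-in for the force calculation on one patch."""
    return sum(x * x for x in patch)


def time_step(patches, split, gpu_pool, cpu_pool):
    """Run one step; the first `split` fraction of patches goes to the 'GPU' pool."""
    n_gpu = round(split * len(patches))
    # Start the asynchronous 'GPU' work first so it overlaps with the CPU work.
    gpu_futures = [gpu_pool.submit(compute_patch, p) for p in patches[:n_gpu]]
    # Compute the remaining patches on the CPU worker threads.
    cpu_results = list(cpu_pool.map(compute_patch, patches[n_gpu:]))
    # Wait for the 'GPU' completion before combining results.
    gpu_results = [f.result() for f in gpu_futures]
    return gpu_results + cpu_results


def adjust_split(split, cpu_rate, gpu_rate, damping=0.5):
    """Move the GPU share toward the ratio of measured patches per second.

    `damping` is an assumed smoothing factor standing in for the empirical
    term mentioned on the slide.
    """
    target = gpu_rate / (cpu_rate + gpu_rate)
    return split + damping * (target - split)


if __name__ == "__main__":
    patches = [[float(i), float(i + 1)] for i in range(16)]
    split = 0.5
    with ThreadPoolExecutor(max_workers=1) as gpu_pool, \
            ThreadPoolExecutor(max_workers=4) as cpu_pool:
        for step in range(5):
            forces = time_step(patches, split, gpu_pool, cpu_pool)
            # Illustrative fixed rates: the 'GPU' is 3 times faster than the CPU side.
            split = adjust_split(split, cpu_rate=1.0, gpu_rate=3.0)
            print(f"step {step}: {len(forces)} patches done, next GPU share {split:.3f}")
```

With a GPU that processes patches three times as fast as the CPU side, the share converges toward 0.75, so both sides finish a step at about the same time.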
Programming environment
• Virtual running environments
    • Provide services on demand
• Parallel toolkits
    • Based on Eclipse
    • Integrate all kinds of tools
    • Editor, debugger, profiler
• Workflow support
    • Supports QoS negotiation
    • Reserves resources for future requirements
Visualization system
• Application areas
    • Numerical weather forecasting
    • Computational fluid dynamics
    • Oil exploration
    • Other large-scale data
• Computing platform
    • Tianhe-1A
• Render server
    • 128 CPUs + 64 GPUs
• Display device
    • 3x6 multi-channel display wall
Applications
• Oil exploration
• High-end equipment development
• Bio-medical research
• Animation design
• New energy research
• New material research
• Weather and climate forecasting
• Engineering design, simulation and analysis
• Remote sensing data processing
• Financial risk analysis
Thanks

Mais conteúdo relacionado

Mais procurados

Spine net learning scale permuted backbone for recognition and localization
Spine net learning scale permuted backbone for recognition and localizationSpine net learning scale permuted backbone for recognition and localization
Spine net learning scale permuted backbone for recognition and localizationDevansh16
 
C++ neural networks and fuzzy logic
C++ neural networks and fuzzy logicC++ neural networks and fuzzy logic
C++ neural networks and fuzzy logicJamerson Ramos
 
Stefano Giordano
Stefano GiordanoStefano Giordano
Stefano GiordanoGoWireless
 
Scalable Parallel Computing on Clouds
Scalable Parallel Computing on CloudsScalable Parallel Computing on Clouds
Scalable Parallel Computing on CloudsThilina Gunarathne
 
MPEG-21-based Cross-Layer Optimization Techniques for enabling Quality of Exp...
MPEG-21-based Cross-Layer Optimization Techniques for enabling Quality of Exp...MPEG-21-based Cross-Layer Optimization Techniques for enabling Quality of Exp...
MPEG-21-based Cross-Layer Optimization Techniques for enabling Quality of Exp...Alpen-Adria-Universität
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...ijceronline
 
High Speed and Area Efficient 2D DWT Processor Based Image Compression
High Speed and Area Efficient 2D DWT Processor Based Image CompressionHigh Speed and Area Efficient 2D DWT Processor Based Image Compression
High Speed and Area Efficient 2D DWT Processor Based Image Compressionsipij
 
High Performance Parallel Computing with Clouds and Cloud Technologies
High Performance Parallel Computing with Clouds and Cloud TechnologiesHigh Performance Parallel Computing with Clouds and Cloud Technologies
High Performance Parallel Computing with Clouds and Cloud Technologiesjaliyae
 
Device Data Directory and Asynchronous execution: A path to heterogeneous com...
Device Data Directory and Asynchronous execution: A path to heterogeneous com...Device Data Directory and Asynchronous execution: A path to heterogeneous com...
Device Data Directory and Asynchronous execution: A path to heterogeneous com...LEGATO project
 
"Approaches for Energy Efficient Implementation of Deep Neural Networks," a P...
"Approaches for Energy Efficient Implementation of Deep Neural Networks," a P..."Approaches for Energy Efficient Implementation of Deep Neural Networks," a P...
"Approaches for Energy Efficient Implementation of Deep Neural Networks," a P...Edge AI and Vision Alliance
 
Image transmission in wireless sensor networks
Image transmission in wireless sensor networksImage transmission in wireless sensor networks
Image transmission in wireless sensor networkseSAT Publishing House
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...ijceronline
 
ViTeNA: An SDN-Based Virtual Network Embedding Algorithm for Multi-Tenant Dat...
ViTeNA: An SDN-Based Virtual Network Embedding Algorithm for Multi-Tenant Dat...ViTeNA: An SDN-Based Virtual Network Embedding Algorithm for Multi-Tenant Dat...
ViTeNA: An SDN-Based Virtual Network Embedding Algorithm for Multi-Tenant Dat...Pradeeban Kathiravelu, Ph.D.
 
Video complexity analyzer (VCA) for streaming applications
 Video complexity analyzer (VCA) for streaming applications Video complexity analyzer (VCA) for streaming applications
Video complexity analyzer (VCA) for streaming applicationsAlpen-Adria-Universität
 
Cost-Efficient Rule Management and Traffic Engineering for Software Defined N...
Cost-Efficient Rule Management and Traffic Engineering for Software Defined N...Cost-Efficient Rule Management and Traffic Engineering for Software Defined N...
Cost-Efficient Rule Management and Traffic Engineering for Software Defined N...Huawei Huang
 
Globe2Train: A Framework for Distributed ML Model Training using IoT Devices ...
Globe2Train: A Framework for Distributed ML Model Training using IoT Devices ...Globe2Train: A Framework for Distributed ML Model Training using IoT Devices ...
Globe2Train: A Framework for Distributed ML Model Training using IoT Devices ...Bharath Sudharsan
 
Delay Constrained Energy Efficient Data Transmission over WSN
Delay Constrained Energy Efficient Data Transmission over WSNDelay Constrained Energy Efficient Data Transmission over WSN
Delay Constrained Energy Efficient Data Transmission over WSNpaperpublications3
 

Mais procurados (20)

Spine net learning scale permuted backbone for recognition and localization
Spine net learning scale permuted backbone for recognition and localizationSpine net learning scale permuted backbone for recognition and localization
Spine net learning scale permuted backbone for recognition and localization
 
C++ neural networks and fuzzy logic
C++ neural networks and fuzzy logicC++ neural networks and fuzzy logic
C++ neural networks and fuzzy logic
 
Coca1
Coca1Coca1
Coca1
 
Stefano Giordano
Stefano GiordanoStefano Giordano
Stefano Giordano
 
Scalable Parallel Computing on Clouds
Scalable Parallel Computing on CloudsScalable Parallel Computing on Clouds
Scalable Parallel Computing on Clouds
 
Lec06 memory
Lec06 memoryLec06 memory
Lec06 memory
 
MPEG-21-based Cross-Layer Optimization Techniques for enabling Quality of Exp...
MPEG-21-based Cross-Layer Optimization Techniques for enabling Quality of Exp...MPEG-21-based Cross-Layer Optimization Techniques for enabling Quality of Exp...
MPEG-21-based Cross-Layer Optimization Techniques for enabling Quality of Exp...
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
 
High Speed and Area Efficient 2D DWT Processor Based Image Compression
High Speed and Area Efficient 2D DWT Processor Based Image CompressionHigh Speed and Area Efficient 2D DWT Processor Based Image Compression
High Speed and Area Efficient 2D DWT Processor Based Image Compression
 
High Performance Parallel Computing with Clouds and Cloud Technologies
High Performance Parallel Computing with Clouds and Cloud TechnologiesHigh Performance Parallel Computing with Clouds and Cloud Technologies
High Performance Parallel Computing with Clouds and Cloud Technologies
 
Device Data Directory and Asynchronous execution: A path to heterogeneous com...
Device Data Directory and Asynchronous execution: A path to heterogeneous com...Device Data Directory and Asynchronous execution: A path to heterogeneous com...
Device Data Directory and Asynchronous execution: A path to heterogeneous com...
 
"Approaches for Energy Efficient Implementation of Deep Neural Networks," a P...
"Approaches for Energy Efficient Implementation of Deep Neural Networks," a P..."Approaches for Energy Efficient Implementation of Deep Neural Networks," a P...
"Approaches for Energy Efficient Implementation of Deep Neural Networks," a P...
 
Image transmission in wireless sensor networks
Image transmission in wireless sensor networksImage transmission in wireless sensor networks
Image transmission in wireless sensor networks
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
 
I017425763
I017425763I017425763
I017425763
 
ViTeNA: An SDN-Based Virtual Network Embedding Algorithm for Multi-Tenant Dat...
ViTeNA: An SDN-Based Virtual Network Embedding Algorithm for Multi-Tenant Dat...ViTeNA: An SDN-Based Virtual Network Embedding Algorithm for Multi-Tenant Dat...
ViTeNA: An SDN-Based Virtual Network Embedding Algorithm for Multi-Tenant Dat...
 
Video complexity analyzer (VCA) for streaming applications
 Video complexity analyzer (VCA) for streaming applications Video complexity analyzer (VCA) for streaming applications
Video complexity analyzer (VCA) for streaming applications
 
Cost-Efficient Rule Management and Traffic Engineering for Software Defined N...
Cost-Efficient Rule Management and Traffic Engineering for Software Defined N...Cost-Efficient Rule Management and Traffic Engineering for Software Defined N...
Cost-Efficient Rule Management and Traffic Engineering for Software Defined N...
 
Globe2Train: A Framework for Distributed ML Model Training using IoT Devices ...
Globe2Train: A Framework for Distributed ML Model Training using IoT Devices ...Globe2Train: A Framework for Distributed ML Model Training using IoT Devices ...
Globe2Train: A Framework for Distributed ML Model Training using IoT Devices ...
 
Delay Constrained Energy Efficient Data Transmission over WSN
Delay Constrained Energy Efficient Data Transmission over WSNDelay Constrained Energy Efficient Data Transmission over WSN
Delay Constrained Energy Efficient Data Transmission over WSN
 

Destaque

Tech Vision 2015 Trend 1: Internet of me
Tech Vision 2015 Trend 1: Internet of meTech Vision 2015 Trend 1: Internet of me
Tech Vision 2015 Trend 1: Internet of meaccenture
 
スナック感覚で楽しめる和菓子
スナック感覚で楽しめる和菓子スナック感覚で楽しめる和菓子
スナック感覚で楽しめる和菓子stucon
 
MOLTO poster for META Forum, Brussels 2010, Belgium.
MOLTO poster for META Forum, Brussels 2010, Belgium.MOLTO poster for META Forum, Brussels 2010, Belgium.
MOLTO poster for META Forum, Brussels 2010, Belgium.Olga Caprotti
 
Connecting Ecommerce & Centralized Analytics to Cascade Server
Connecting Ecommerce & Centralized Analytics to Cascade ServerConnecting Ecommerce & Centralized Analytics to Cascade Server
Connecting Ecommerce & Centralized Analytics to Cascade Serverhannonhill
 
digital marketing certificate programs in Bangalore
digital marketing certificate programs in Bangaloredigital marketing certificate programs in Bangalore
digital marketing certificate programs in Bangalorevinuthna58
 
Ο κόσμος από ψηλά
Ο κόσμος από ψηλάΟ κόσμος από ψηλά
Ο κόσμος από ψηλάnicodimosnis
 
Staying Productive On the Road, At Home, and Everywhere Else
Staying Productive On the Road, At Home, and Everywhere ElseStaying Productive On the Road, At Home, and Everywhere Else
Staying Productive On the Road, At Home, and Everywhere ElseLinkedIn Learning Solutions
 
Charting a UX Strategist Course in 90 Days
Charting a UX Strategist Course in 90 DaysCharting a UX Strategist Course in 90 Days
Charting a UX Strategist Course in 90 DaysJon Kohrs
 
Zoekmachines weten het antwoord
Zoekmachines weten het antwoordZoekmachines weten het antwoord
Zoekmachines weten het antwoordEric Sieverts
 
Moms: Faking it While Freaking Out
Moms: Faking it While Freaking OutMoms: Faking it While Freaking Out
Moms: Faking it While Freaking OutAbelsonTaylor
 
Nuevas tecnologías de TV y su desarrollo e implementación en la Argentina
Nuevas tecnologías de TV y su desarrollo e implementación en la ArgentinaNuevas tecnologías de TV y su desarrollo e implementación en la Argentina
Nuevas tecnologías de TV y su desarrollo e implementación en la ArgentinaLuis Valle
 
Antisocial powerpoin txxxx
Antisocial powerpoin txxxxAntisocial powerpoin txxxx
Antisocial powerpoin txxxxMilen Ramos
 
10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConf10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConfXavier Amatriain
 
Getting Information through HTML Forms
Getting Information through HTML FormsGetting Information through HTML Forms
Getting Information through HTML FormsMike Crabb
 
INFOGRAPHIC: The Internet of Things: Are Organizations Ready For A Multi-Tril...
INFOGRAPHIC: The Internet of Things: Are Organizations Ready For A Multi-Tril...INFOGRAPHIC: The Internet of Things: Are Organizations Ready For A Multi-Tril...
INFOGRAPHIC: The Internet of Things: Are Organizations Ready For A Multi-Tril...Capgemini
 
Deel communicatiebudget FOD Justitie ging naar hotel
Deel communicatiebudget FOD Justitie ging naar hotelDeel communicatiebudget FOD Justitie ging naar hotel
Deel communicatiebudget FOD Justitie ging naar hotelThierry Debels
 

Destaque (20)

Tech Vision 2015 Trend 1: Internet of me
Tech Vision 2015 Trend 1: Internet of meTech Vision 2015 Trend 1: Internet of me
Tech Vision 2015 Trend 1: Internet of me
 
El capital
El capitalEl capital
El capital
 
スナック感覚で楽しめる和菓子
スナック感覚で楽しめる和菓子スナック感覚で楽しめる和菓子
スナック感覚で楽しめる和菓子
 
MOLTO poster for META Forum, Brussels 2010, Belgium.
MOLTO poster for META Forum, Brussels 2010, Belgium.MOLTO poster for META Forum, Brussels 2010, Belgium.
MOLTO poster for META Forum, Brussels 2010, Belgium.
 
Connecting Ecommerce & Centralized Analytics to Cascade Server
Connecting Ecommerce & Centralized Analytics to Cascade ServerConnecting Ecommerce & Centralized Analytics to Cascade Server
Connecting Ecommerce & Centralized Analytics to Cascade Server
 
digital marketing certificate programs in Bangalore
digital marketing certificate programs in Bangaloredigital marketing certificate programs in Bangalore
digital marketing certificate programs in Bangalore
 
Ο κόσμος από ψηλά
Ο κόσμος από ψηλάΟ κόσμος από ψηλά
Ο κόσμος από ψηλά
 
Staying Productive On the Road, At Home, and Everywhere Else
Staying Productive On the Road, At Home, and Everywhere ElseStaying Productive On the Road, At Home, and Everywhere Else
Staying Productive On the Road, At Home, and Everywhere Else
 
Charting a UX Strategist Course in 90 Days
Charting a UX Strategist Course in 90 DaysCharting a UX Strategist Course in 90 Days
Charting a UX Strategist Course in 90 Days
 
ㅣㅣ
ㅣㅣㅣㅣ
ㅣㅣ
 
Zoekmachines weten het antwoord
Zoekmachines weten het antwoordZoekmachines weten het antwoord
Zoekmachines weten het antwoord
 
Moms: Faking it While Freaking Out
Moms: Faking it While Freaking OutMoms: Faking it While Freaking Out
Moms: Faking it While Freaking Out
 
Nuevas tecnologías de TV y su desarrollo e implementación en la Argentina
Nuevas tecnologías de TV y su desarrollo e implementación en la ArgentinaNuevas tecnologías de TV y su desarrollo e implementación en la Argentina
Nuevas tecnologías de TV y su desarrollo e implementación en la Argentina
 
Antisocial powerpoin txxxx
Antisocial powerpoin txxxxAntisocial powerpoin txxxx
Antisocial powerpoin txxxx
 
Natureview frame
Natureview frameNatureview frame
Natureview frame
 
Paris ML meetup
Paris ML meetupParis ML meetup
Paris ML meetup
 
10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConf10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConf
 
Getting Information through HTML Forms
Getting Information through HTML FormsGetting Information through HTML Forms
Getting Information through HTML Forms
 
INFOGRAPHIC: The Internet of Things: Are Organizations Ready For A Multi-Tril...
INFOGRAPHIC: The Internet of Things: Are Organizations Ready For A Multi-Tril...INFOGRAPHIC: The Internet of Things: Are Organizations Ready For A Multi-Tril...
INFOGRAPHIC: The Internet of Things: Are Organizations Ready For A Multi-Tril...
 
Deel communicatiebudget FOD Justitie ging naar hotel
Deel communicatiebudget FOD Justitie ging naar hotelDeel communicatiebudget FOD Justitie ging naar hotel
Deel communicatiebudget FOD Justitie ging naar hotel
 

Semelhante a Introduction to National Supercomputer center in Tianjin TH-1A Supercomputer

Stream Processing
Stream ProcessingStream Processing
Stream Processingarnamoy10
 
Maxwell siuc hpc_description_tutorial
Maxwell siuc hpc_description_tutorialMaxwell siuc hpc_description_tutorial
Maxwell siuc hpc_description_tutorialmadhuinturi
 
Intel new processors
Intel new processorsIntel new processors
Intel new processorszaid_b
 
A Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural NetworksA Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural Networksinside-BigData.com
 
Os Madsen Block
Os Madsen BlockOs Madsen Block
Os Madsen Blockoscon2007
 
并行计算与分布式计算的区别
并行计算与分布式计算的区别并行计算与分布式计算的区别
并行计算与分布式计算的区别xiazdong
 
APSys Presentation Final copy2
APSys Presentation Final copy2APSys Presentation Final copy2
APSys Presentation Final copy2Junli Gu
 
Overview of ST7 8-bit Microcontrollers
Overview of ST7 8-bit MicrocontrollersOverview of ST7 8-bit Microcontrollers
Overview of ST7 8-bit MicrocontrollersPremier Farnell
 
SoM with Zynq UltraScale device
SoM with Zynq UltraScale deviceSoM with Zynq UltraScale device
SoM with Zynq UltraScale devicenie, jack
 
Modern processor art
Modern processor artModern processor art
Modern processor artwaqasjadoon11
 
Steen_Dissertation_March5
Steen_Dissertation_March5Steen_Dissertation_March5
Steen_Dissertation_March5Steen Larsen
 

Semelhante a Introduction to National Supercomputer center in Tianjin TH-1A Supercomputer (20)

Stream Processing
Stream ProcessingStream Processing
Stream Processing
 
Current Trends in HPC
Current Trends in HPCCurrent Trends in HPC
Current Trends in HPC
 
Maxwell siuc hpc_description_tutorial
Maxwell siuc hpc_description_tutorialMaxwell siuc hpc_description_tutorial
Maxwell siuc hpc_description_tutorial
 
Intel new processors
Intel new processorsIntel new processors
Intel new processors
 
Userspace networking
Userspace networkingUserspace networking
Userspace networking
 
Ac922 cdac webinar
Ac922 cdac webinarAc922 cdac webinar
Ac922 cdac webinar
 
uCluster
uClusteruCluster
uCluster
 
A Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural NetworksA Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural Networks
 
Os Madsen Block
Os Madsen BlockOs Madsen Block
Os Madsen Block
 
并行计算与分布式计算的区别
并行计算与分布式计算的区别并行计算与分布式计算的区别
并行计算与分布式计算的区别
 
APSys Presentation Final copy2
APSys Presentation Final copy2APSys Presentation Final copy2
APSys Presentation Final copy2
 
Overview of ST7 8-bit Microcontrollers
Overview of ST7 8-bit MicrocontrollersOverview of ST7 8-bit Microcontrollers
Overview of ST7 8-bit Microcontrollers
 
DSP Processor.pptx
DSP Processor.pptxDSP Processor.pptx
DSP Processor.pptx
 
NWU and HPC
NWU and HPCNWU and HPC
NWU and HPC
 
SoM with Zynq UltraScale device
SoM with Zynq UltraScale deviceSoM with Zynq UltraScale device
SoM with Zynq UltraScale device
 
Tos tutorial
Tos tutorialTos tutorial
Tos tutorial
 
Distributed Computing
Distributed ComputingDistributed Computing
Distributed Computing
 
Modern processor art
Modern processor artModern processor art
Modern processor art
 
processor struct
processor structprocessor struct
processor struct
 
Steen_Dissertation_March5
Steen_Dissertation_March5Steen_Dissertation_March5
Steen_Dissertation_March5
 

Mais de Förderverein Technische Fakultät

The Digital Transformation of Education: A Hyper-Disruptive Era through Block...
The Digital Transformation of Education: A Hyper-Disruptive Era through Block...The Digital Transformation of Education: A Hyper-Disruptive Era through Block...
Introduction to National Supercomputer center in Tianjin TH-1A Supercomputer

  • 1. Introduction to National Supercomputer center in Tianjin TH-1A Supercomputer
  • 2. Agenda
      ◦ National Supercomputer Center in Tianjin (NSCC-TJ)
      ◦ TH-1A system
          ▪ Hardware sub-system
          ▪ Software sub-system
      ◦ Applications
  • 3. NSCC-TJ
      ◦ National SuperComputer Center in Tianjin
          ▪ Sponsored by
              - Chinese Ministry of Science and Technology
              - Tianjin Binhai New Area
          ▪ Public information infrastructure
              - To accelerate the economy, education and industry of Northern China
              - To provide high performance computing service to the whole of China
          ▪ Open platform for research and education
  • 4. NSCC-TJ (site plan): main building, office, computer room (total area: 2400 m²), transformer station and air conditioner
  • 5. NSCC-TJ The first floor of central computing room: 1200m2
  • 6. NSCC-TJ The second floor of central computing room: Visualization environment, 1200m2
  • 7. NSCC-TJ Electric transformer station
  • 8. NSCC-TJ Cooling water station (slide footer: 2011-6-28, TH-1)
  • 9. NSCC-TJ Layout of computing room
  • 11. TH-1A system
      ◦ Enhanced system based on TH-1 system (Sep. 2009)
      ◦ Installed in NSCC-TJ, Aug. 2010
      ◦ Debugging and performance testing, Sept.~Oct. 2010
      ◦ In service after Nov. 2010

      Items          Configuration
      Processors     14,336 Intel CPUs + 7,168 NVIDIA GPUs + 2,048 FT CPUs
      Memory         262 TB in total
      Interconnect   Proprietary high-speed interconnecting network
      Storage        2 PB
      Cabinets       120 compute/service cabinets, 14 storage cabinets, 6 communication cabinets
  • 12. TH-1A system
      ◦ TH-1A system architecture
          ▪ Hybrid MPP structure: CPU & GPU
          ▪ Proprietary compute nodes
          ▪ Connected by proprietary high-speed interconnect network
          ▪ Global shared parallel storage system
          ▪ Custom software stack
  • 13. TH-1A hardware sub-system (block diagram): compute sub-system (CPU + GPU nodes), service sub-system (operation nodes), monitor and diagnosis sub-system, communication sub-system, storage sub-system (MDS, OSS)
  • 14. Compute sub-system
      ◦ 7,168 compute nodes
      ◦ 2 six-core CPUs and 1 GPU per node
      ◦ CPU
          ▪ Xeon X5670 (Westmere)
          ▪ Processor speed: 2.93 GHz
      ◦ GPU
          ▪ NVIDIA Tesla M2050
          ▪ Connected with CPU by PCI-E
      ◦ 32 GB memory per node
      ◦ 2U height
      ◦ Peak performance
          ▪ 4,701,061 Gflops
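The peak figure on slide 14 can be reproduced from the per-node configuration. The sketch below is a back-of-the-envelope check, not the center's own accounting. It assumes 4 double-precision flops per cycle per Westmere core, and it models the Tesla M2050 as 448 CUDA cores at 1.15 GHz delivering one double-precision flop per core per cycle (about 515.2 Gflops). Neither assumption is stated on the slides.

```python
# Back-of-the-envelope peak performance of the TH-1A compute sub-system.
# Values marked "assumption" are not taken from the slides.

NODES = 7168                 # slide 14
CPUS_PER_NODE = 2            # slide 14
CORES_PER_CPU = 6            # slide 14 (six-core Xeon X5670)
CPU_GHZ = 2.93               # slide 14
CPU_FLOPS_PER_CYCLE = 4      # assumption: double-precision flops per core per cycle

GPU_CORES = 448              # assumption: CUDA cores on a Tesla M2050
GPU_GHZ = 1.15               # assumption: M2050 shader clock
GPU_DP_FLOPS_PER_CORE = 1    # assumption: double-precision flops per core per cycle


def cpu_gflops():
    """Peak double-precision Gflops of both CPUs in one node."""
    return CPUS_PER_NODE * CORES_PER_CPU * CPU_GHZ * CPU_FLOPS_PER_CYCLE


def gpu_gflops():
    """Peak double-precision Gflops of the single GPU in one node."""
    return GPU_CORES * GPU_GHZ * GPU_DP_FLOPS_PER_CORE


def node_gflops():
    return cpu_gflops() + gpu_gflops()


def system_gflops():
    return NODES * node_gflops()


print(f"per node: {node_gflops():.2f} Gflops")
print(f"system:   {system_gflops():,.0f} Gflops")
```

Under these assumptions the GPUs contribute roughly 79% of the peak, which is why the software slides concentrate on keeping both processor types busy.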
  • 15. Service sub-system
      ◦ 1,024 service nodes
      ◦ 2 eight-core domestic CPUs per node
      ◦ CPU: FT-1000
          ▪ SoC
          ▪ 1.0 GHz
          ▪ Eight cores, eight threads per core
          ▪ Peak performance: 8 Gflops
      ◦ 32 GB memory per node
      ◦ Used for login, compilation, and applications that need throughput computing
  • 16. Proprietary interconnection network
      ◦ Interconnection signal speed: 10 Gbps
      ◦ Bi-directional bandwidth: 160 Gbps
      ◦ Hierarchical fat-tree structure
          ▪ First stage: 16 nodes connected by a 16-port switching board
          ▪ Second stage: all parts connected to eleven 384-port switches
  • 17. Proprietary interconnection network
      ◦ High-radix router ASIC: NRC
          ▪ Feature size: 90 nm
          ▪ Die size: 17.16 mm x 17.16 mm
          ▪ Package: FC-PBGA
          ▪ 2577 pins
          ▪ Throughput of single NRC: 2.56 Tbps
      ◦ Network interface ASIC: NIC
          ▪ Same feature size and package as NRC
          ▪ Die size: 10.76 mm x 10.76 mm
          ▪ 675 pins
  • 18. Proprietary interconnection network (photos)
      ◦ 16-port switch board in cabinet
      ◦ Leaf switch blade and root switch blade of 384-port switch
      ◦ Back plane of 384-port switch, about 700 mm x 600 mm
  • 19. Proprietary interconnecting network
      ◦ Switching board and high-radix switch
          ▪ Based on network interface ASIC and router ASIC
          ▪ Reduced user communication protocol
          ▪ Throughput: 61.44 Tbps
      ◦ Photos: front and back of two 384-port high-radix switches
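The throughput figures on slides 17 and 19 are consistent with multiplying a port count by the 160 Gbps bi-directional port bandwidth from slide 16. A minimal sketch of that arithmetic follows. The 16-port count for a single NRC is an assumption inferred from the 16-port switching board, and the 8-lane link width is inferred from 160 Gbps / (2 x 10 Gbps); neither is stated on the slides. Raw signalling rates are used, with no allowance for line-coding overhead.

```python
# Aggregate throughput = ports x bi-directional bandwidth per port.

SIGNAL_GBPS = 10          # slide 16: interconnection signal speed
PORT_BIDIR_GBPS = 160     # slide 16: bi-directional bandwidth per port


def lanes_per_direction():
    """Lanes per direction implied by the two figures above (inferred)."""
    return PORT_BIDIR_GBPS // (2 * SIGNAL_GBPS)


def aggregate_tbps(ports):
    """Raw aggregate throughput in Tbps for a switch with `ports` ports."""
    return ports * PORT_BIDIR_GBPS / 1000


NRC_PORTS = 16            # assumption: one NRC drives a 16-port switching board
SWITCH_PORTS = 384        # slide 19: high-radix switch

print(lanes_per_direction())          # 8
print(aggregate_tbps(NRC_PORTS))      # 2.56, matches slide 17
print(aggregate_tbps(SWITCH_PORTS))   # 61.44, matches slide 19
```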
  • 20. Storage sub-system
      ◦ Capacity: 2 PB
      ◦ Connected by proprietary interconnection network
      ◦ Lustre-based parallel file system
  • 21. Monitor and diagnosis sub-system
      ◦ Rich monitoring and control functions
      ◦ Real-time monitoring of hardware parameters
      ◦ Precise fault location
      ◦ Alarm and immediate action in an emergency
      ◦ Self-adjusting cooling driven by feedback on environment status
      ◦ I2C & JTAG diagnosis mechanism
      ◦ Large-scale console
      ◦ Remote monitoring and management
  • 22. Computing cabinet
      ◦ Node: 2 CPUs and 1 GPU
      ◦ Blade: 2 nodes
      ◦ Frame
          ▪ 8 computing blades
          ▪ 16-port switching board
          ▪ 1 monitor and diagnosis board
      ◦ Cabinet
          ▪ 4 frames, 64 nodes
          ▪ 128 CPUs, 64 GPUs
          ▪ Close-coupled chilled water cooling
          ▪ 56 kW cooling capacity per cabinet
      ◦ Footprint
          ▪ 700 m²
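The packaging hierarchy on slide 22 multiplies out to the per-cabinet totals it lists, and each frame holds exactly the 16 nodes served by one 16-port switching board. As a rough sketch, the same numbers also give the count of compute cabinets needed for the 7,168 compute nodes of slide 14. That cabinet count is derived here and is not stated in the slides.

```python
import math

# Packaging hierarchy from slide 22.
NODES_PER_BLADE = 2
BLADES_PER_FRAME = 8
FRAMES_PER_CABINET = 4
CPUS_PER_NODE = 2
GPUS_PER_NODE = 1

COMPUTE_NODES = 7168      # slide 14


def nodes_per_frame():
    return NODES_PER_BLADE * BLADES_PER_FRAME


def cabinet_totals():
    """Nodes, CPUs and GPUs in one fully populated compute cabinet."""
    nodes = nodes_per_frame() * FRAMES_PER_CABINET
    return {
        "nodes": nodes,
        "cpus": nodes * CPUS_PER_NODE,
        "gpus": nodes * GPUS_PER_NODE,
    }


def cabinets_needed(total_nodes):
    """Cabinets required to house `total_nodes` nodes (derived, not from the slides)."""
    return math.ceil(total_nodes / cabinet_totals()["nodes"])


print(nodes_per_frame())               # 16
print(cabinet_totals())                # {'nodes': 64, 'cpus': 128, 'gpus': 64}
print(cabinets_needed(COMPUTE_NODES))  # 112
```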
  • 24. Operating system
      ◦ Kylin Linux
      ◦ Compute node kernel
      ◦ Provides virtual running environment
          ▪ Isolated running environments for different users
          ▪ Custom software package installation
      ◦ QoS support
      ◦ Power-aware computing
  • 25. Compiler system
      ◦ C, C++, Fortran, Java
      ◦ OpenMP, MPI, OpenMP/MPI
      ◦ CUDA, OpenCL
      ◦ Heterogeneous programming framework
          ▪ Accelerates large-scale, complex applications, especially applications still under development or whose full source code is not available
          ▪ Uses the computing power of CPUs and GPUs and hides GPU programming from users
          ▪ Inter-node homogeneous parallel programming (users)
          ▪ Intra-node heterogeneous parallel computing (computer experts)
  • 26. Compiler system
      ◦ Heterogeneous programming framework
          ▪ Inter-node homogeneous parallel programming (JASMIN)
              - Patch-based object data structures
              - MPI communication, dynamic load balancing support
              - Zero-copy optimization in communication library
  • 27. Compiler system
      ◦ Heterogeneous programming framework
          ▪ Intra-node heterogeneous parallel computing
              - Compiler-optimized / hand-tuned threaded code
          ▪ Optimizations include
              - Adaptive partitioning: balance the workloads between CPUs and GPU
              - Asynchronous data transfer / computing: overlap CPU operations with GPU operations
              - Software pipelining: overlap GPU computing with data transfer between host and GPU device memory
              - ……
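The benefit of the software pipelining listed on slide 27 can be estimated with a simple timing model. The sketch below is illustrative only. It assumes every patch takes the same time in each stage (host-to-device copy, kernel, device-to-host copy) and that the stages can run concurrently on different patches. The stage times are made-up numbers, not measurements from TH-1A.

```python
# Timing model for overlapping GPU computing with host/device transfers.

def serial_time(n_patches, stage_times):
    """Total time when each patch runs all stages before the next patch starts."""
    return n_patches * sum(stage_times)


def pipelined_time(n_patches, stage_times):
    """Total time for a linear pipeline with one patch in flight per stage.

    The first patch pays for every stage; each later patch adds only the
    time of the slowest stage, which becomes the bottleneck.
    """
    if n_patches == 0:
        return 0.0
    return sum(stage_times) + (n_patches - 1) * max(stage_times)


# Hypothetical per-patch times in milliseconds: copy in, compute, copy out.
stages = (1.0, 2.0, 1.0)
patches = 10

print(serial_time(patches, stages))     # 40.0
print(pipelined_time(patches, stages))  # 22.0
```

In this model the speed-up approaches sum(stages) / max(stages) as the number of patches grows, so pipelining pays off most when transfer and compute times are of similar size.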
  • 28. Compiler system
      ◦ Heterogeneous programming framework
          ▪ An example: 3-D short-range molecular simulations
          ▪ For each time step
              - Split workload (force calculation) between CPU and GPU
              - For each patch allocated to GPU: start asynchronous operations (transfer the patch data to GPU, compute the patch, get results from GPU)
              - For each patch allocated to CPU: launch threads on CPU cores to compute the patch
              - CPU waits for GPU completion event
              - Adjust the split value according to the CPU/GPU performance (patches per second + empirical)
              - Other workload (velocity, position) computed on CPU
          ▪ Performance: one NVIDIA M2050 GPU is 3 times faster than one Intel X5670 CPU
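The load-balancing loop in the slide-28 example can be sketched as a small simulation. This is not the TH-1A framework's code. The function names, the damping factor, and the update rule (move the GPU share toward the measured throughput ratio) are assumptions chosen for illustration, and the "empirical" term mentioned on the slide is not modelled. Device speeds are simulated constants, using the 3:1 GPU-to-CPU ratio reported on the slide.

```python
# Adaptive CPU/GPU workload split, simulated with constant device speeds.

def adjust_split(gpu_fraction, gpu_rate, cpu_rate, damping=0.5):
    """Move the GPU share toward the measured throughput ratio.

    gpu_rate and cpu_rate are patches per second from the last time step.
    damping (hypothetical parameter) limits how far one step can move.
    """
    target = gpu_rate / (gpu_rate + cpu_rate)
    return gpu_fraction + damping * (target - gpu_fraction)


def simulate(num_patches=1000, gpu_speed=3.0, cpu_speed=1.0, steps=20):
    """Run `steps` time steps and return (final GPU fraction, step times)."""
    gpu_fraction = 0.5                      # start with an even split
    step_times = []
    for _ in range(steps):
        n_gpu = round(num_patches * gpu_fraction)
        n_cpu = num_patches - n_gpu
        gpu_time = n_gpu / gpu_speed        # stands in for a measured GPU time
        cpu_time = n_cpu / cpu_speed        # stands in for a measured CPU time
        # CPU and GPU work concurrently; the step ends when the slower finishes.
        step_times.append(max(gpu_time, cpu_time))
        # Throughput seen in this step; simulation-only fallback if a device idled.
        gpu_rate = n_gpu / gpu_time if gpu_time > 0 else gpu_speed
        cpu_rate = n_cpu / cpu_time if cpu_time > 0 else cpu_speed
        gpu_fraction = adjust_split(gpu_fraction, gpu_rate, cpu_rate)
    return gpu_fraction, step_times


fraction, times = simulate()
print(f"GPU share after balancing: {fraction:.3f}")
print(f"step time: first {times[0]:.1f}, last {times[-1]:.1f}")
```

With these simulated speeds the GPU share settles near three quarters of the patches and the step time halves relative to the even split. A real implementation would take the rates from timers around the GPU completion event and the CPU thread join.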
  • 29. Programming environment
      ◦ Virtual running environments
          ▪ Provide services on demand
      ◦ Parallel toolkits
          ▪ Based on Eclipse
          ▪ Integrate all kinds of tools
          ▪ Editor, debugger, profiler
      ◦ Workflow support
          ▪ Supports QoS negotiation
          ▪ Reserves resources for future requirements
  • 30. Visualization system
      ◦ Application area
          ▪ Numerical weather forecast
          ▪ Computational fluid dynamics
          ▪ Oil exploration
          ▪ Other large-scale data
      ◦ Computing platform
          ▪ Tianhe-1A
      ◦ Render server
          ▪ 128 CPU + 64 GPU
      ◦ Display device
          ▪ 3x6 multi-channel display wall
  • 31. Applications
      ◦ Oil exploration
      ◦ High-end equipment development
      ◦ Bio-medical research
      ◦ Animation design
      ◦ New energy research
      ◦ New material research
      ◦ Weather and climate forecasting
      ◦ Engineering design, simulation and analysis
      ◦ Remote sensing data processing
      ◦ Financial risk analysis