Intel Multi Core Micro Architecture
and software tools




Nir Arazy
Field Application Engineer
Eastronics
June 2010
Nir.arazy@easx.co.il
Legal Disclaimer
•   Intel Corporation may have patents or pending patent applications, trademarks, copyrights, or other intellectual property
    rights that relate to the presented subject matter. The furnishing of documents and other materials and information does
    not provide any license, express or implied, by estoppel or otherwise, to any such patents, trademarks, copyrights, or other
    intellectual property rights.
•   The Intel product(s) referred to in this document are intended for standard commercial use only. Customers are solely
    responsible for assessing the suitability of the product for use in particular applications. Intel products are not intended for
    use in medical, life saving, life sustaining, critical control or safety systems, or in nuclear facility applications.
•   Performance tests and ratings are measured using specific computer systems and/or components and reflect the
    approximate performance of Intel® products as measured by those tests. Any difference in system hardware or software
    design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the
    performance of systems or components they are considering purchasing. For more information on performance tests and on
    the performance of Intel products, visit http://www.intel.com/performance/resources/limits.htm or call (U.S.)
    1-800-628-8686 or 1-916-356-3104.
•   All information provided related to future Intel products and plans is preliminary and subject to change at any time, without
    notice. All dates provided are subject to change without notice. Intel may make changes to specifications and product
    descriptions at any time, without notice.
•   Celeron, Intel, Intel logo, Intel Core, Intel Inside, Intel Inside logo, Intel. Leap ahead., Intel. Leap ahead. logo, Intel
    NetBurst, Intel SpeedStep, Intel XScale, Itanium, Pentium, Pentium Inside, VTune, Xeon, and Xeon Inside are trademarks
    or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
•   * Other names and brands may be claimed as the property of others.
•   Other vendors are listed by Intel as a convenience to Intel's general customer base, but Intel does not make any
    representations or warranties whatsoever regarding quality, reliability, functionality, or compatibility of these devices. This
    list and/or these devices may be subject to change without notice.
•   Copyright © 2009, Intel Corporation. All rights reserved.




    2                                                                                               Document# 408075
                                                                  Intel Confidential
[Figure: timeline spanning 2007 to 2011]


Moore’s Law – GHz to Multi-Core

    [Figure: performance over time, with 2006 marked as the turning point from "Performance Through frequency" to "Performance Through Multi-Core"]

    “Concurrency is the next major revolution in how we write software”
        - Herb Sutter, Dr Dobb’s Journal, March 2005

    Intel MC Assistance
      • Threading
      • Multi-tasking
      • Training
      • Tools


Multi-core is Mainstream
    Is Your Software Ready?




        Multiple execution cores ramping
              across Intel platforms


Agenda

    • HW based parallelism
      – Multi-Cores
      – Turbo boost
      – SMT
      – SSE


    • SW tools to enable efficient parallelism
      – IPP
      – TBB
      – Thread Checker
      – Thread Profiler



Simultaneous Multi-Threading (SMT)
    • SMT
      – Run 2 threads at the same time per core
    • Take advantage of 4-wide execution engine
      – Keep it fed with multiple threads
      – Hide latency of a single thread
    • Most power efficient performance feature
      – Very low die area cost
      – Can provide significant performance benefit depending on application
      – Much more efficient than adding an entire core
    • Nehalem/Westmere advantages
      – Larger caches
      – Massive memory BW

    [Figure: execution-unit usage over time (processor cycles), without SMT vs. with SMT. Note: each box represents a processor execution unit.]

    Simultaneous multi-threading enhances performance and energy efficiency
Enhanced Cache Subsystem

    • 3-level cache hierarchy
      – First Level Cache (FLC)
        – 32 KB Instruction & 32 KB Data per core
        – Equivalent to L1 Cache in Intel® Core™ microarchitecture
      – Mid Level Cache (MLC)
        – 256 KB per core
      – Last Level Cache
        – Up to 4 MB shared across both cores
        – Inclusive cache policy – minimize snoop traffic
        – Equivalent to L2 Cache in Intel® Core™2 Duo microarchitecture

    [Figure: Processor Cache Subsystem. Core 0 and Core 1 each have a 32 KB FLC for instructions, a 32 KB FLC for data, and a 256 KB MLC; both cores share a ≤ 4 MB Last Level Cache.]

All New 2010 Intel® Core™ Performance-Based Technology Overview

      Core 2010 Features

      Intel® Hyper-Threading Technology
        • Smart multitasking by doubling the number of processor threads per core with Intel® Hyper-Threading Technology

      Intel® Turbo Boost Technology(1)
        • Intelligently and seamlessly delivers improved CPU performance to match your workload when thermal and power headroom exist

      Intel® HD Graphics with Dynamic Frequency (available on mobile only)
        • Delivers graphics performance boost to graphics intensive applications provided thermal and power headroom exist

      [Figure: CPU cores, CPU threads, and GFX core, labeled "Intel® Turbo Boost and Hyper-Threading Technologies" and "Intel HD Graphics with Dynamic Frequency (mobile only)"]

      New Intel® Core processors with Intel® Turbo Boost Technology and Dynamic Frequency to maximize performance of CPU and graphics intensive tasks

      Note 1: See Intel® Turbo Boost Technology disclaimer in the back-up
Intel® Turbo Boost Technology
     [Figure: Turbo Boost behavior by generation.
       Previous generation: single-core CPU turbo, +1 speed bin, with the other core in C3 state or lower.
       Current platform: single-core CPU turbo, + multiple speed bins, with the other core in C3 state or lower; dual-core CPU turbo, + multiple speed bins.
       Intel® Intelligent Power Sharing: dynamically trade TDP budget between CPU and GFX. Scenario 1: CPU intensive load. Scenario 2: GFX intensive load (GFX turbo).]

     Note: CPU and GFX can turbo simultaneously

     Strategy: Maximize CPU and GFX performance while staying within the processor TDP and Tjmax

     Note: Some features may be available only on certain SKUs


Intel® Turbo Boost Technology




     [Figure: performance comparison of a processor with Turbo and a processor without Turbo]

     Intel® Turbo Boost Technology is targeted to deliver additional performance gains on Platform
Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any
difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or
components they are considering purchasing. For more information on performance tests and on the performance of Intel products, visit Intel Performance Benchmark Limitations
Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect
actual performance.
Results have been simulated and are provided for informational purposes only. Results were derived using simulations run on an architecture simulator or model. Any difference in system
hardware or software design or configuration may affect actual performance.




Intel® Advanced Digital Media Boost
Single Cycle SSE
     [Figure: SSE operation (SSE/SSE2/SSE3) on 128-bit operands, bits 127 to 0, passing through DECODE and EXECUTE. SOURCE holds X4 X3 X2 X1; DEST holds Y4 Y3 Y2 Y1.
       Previous: clock cycle 1 produces X2opY2 and X1opY1; clock cycle 2 produces X4opY4 and X3opY3.
       Intel® Core™ Microarchitecture: clock cycle 1 produces X4opY4, X3opY3, X2opY2, and X1opY1.]

     128 bit Single Cycle in each core
Single Instruction Multiple Data
     (SIMD)
     • Anything that fits into 16 bytes…
       – 4x floats
       – 2x doubles
       – 16x bytes
       – 8x words
       – 4x dwords
       – 2x qwords
       – 1x dqword
     • and all conversions!
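
The lane counts on this slide follow from dividing the 16-byte register width by the element size. A minimal Python sketch of that arithmetic, for illustration only: the names `REGISTER_BYTES`, `ELEMENT_BYTES`, `lanes`, and `packed_op` are hypothetical, and real SSE code would use compiler intrinsics or assembly rather than Python lists.

```python
import operator

REGISTER_BYTES = 16  # one 128-bit SSE register

# Element sizes in bytes for the packed types listed on the slide.
ELEMENT_BYTES = {
    "float": 4, "double": 8, "byte": 1, "word": 2,
    "dword": 4, "qword": 8, "dqword": 16,
}

def lanes(element_type):
    """Number of elements of this type that fit in one register."""
    return REGISTER_BYTES // ELEMENT_BYTES[element_type]

def packed_op(op, xs, ys, element_type):
    """Model one SIMD instruction: apply op to every pair of lanes."""
    n = lanes(element_type)
    if len(xs) != n or len(ys) != n:
        raise ValueError(f"expected {n} lanes of {element_type}")
    return [op(x, y) for x, y in zip(xs, ys)]

# Lane count per type, then one packed add over four float lanes.
print({t: lanes(t) for t in ELEMENT_BYTES})
print(packed_op(operator.add, [1.0, 2.0, 3.0, 4.0],
                [10.0, 20.0, 30.0, 40.0], "float"))
```

The model only captures the "one instruction, many lanes" idea; it says nothing about latency or throughput on real hardware.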


Intel® Advanced Vector Extensions (Intel® AVX)
     • Features:
       – New 256-bit Instruction Set Architecture (ISA)
       – Built on legacy 128-bit SIMD (SSEx) and 64-bit SIMD (MMX) ISA extensions
       – Enhancements to 128-bit SIMD instructions
       – Support for 3- and 4-operand syntax
     • Expected Intel® AVX benefits:
       – Image, video and audio processing
       – CNC* & PLC compute performance
       – High performance Digital Signal & Image Processing (DSIP) within small Size, Weight and total Power (SW&P)
     • Targeted segments:
       – Military/Aerospace/Government
       – Medical Imaging
       – Comms, Industrial Controllers & Digital Signage

     Source: http://software.intel.com/en-us/avx/

     Performance Improvements for Floating Point Intensive Applications

Agenda

     • HW based parallelism
       – Multi-Cores
       – SMT
       – Turbo boost
       – SSE


     • SW tools to enable efficient parallelism
       – IPP
       – TBB
       – Thread Checker
       – Thread Profiler



Simplified Threaded Development with Intel® Tools

     • Architectural Analysis (Analyzers)
       – Find the code that can benefit from threading
       – Find hotspots that limit performance
     • Introduce Threads (Compilers)
       – Built-in optimization
       – OpenMP
     • Introduce Threads (Libraries)
       – Multimedia & data processing
       – Math Processing
       – Threading
     • Confidence/Correctness (Checkers)
       – Find deadlocks and race conditions
     • Optimize / Tune (Analyzers)
       – Tune for performance and scalability
       – Visualize efficiency of threaded code



Intel® Integrated Performance Primitives (Intel® IPP): Overview and Benefits

     Application Source Code

     • Free Code Samples: Intel IPP Usage Code Samples (benefit: Rapid Application Development)
       – Sample video/audio/speech codecs
       – Image processing and JPEG
       – Signal processing
       – Data compression
       – .NET and Java integration

     (API calls)

     • Cross-platform API: Intel IPP Library C/C++ API (benefit: Compatibility and Code Re-Use)
       – Cryptography
       – Image processing
       – Image color conversion
       – JPEG / JPEG2000
       – Computer Vision
       – Video coding
       – Audio coding
       – Data Compression
       – Data Integrity
       – Signal processing
       – Matrix mathematics
       – Vector mathematics
       – String processing
       – Speech coding
       – Speech recognition

     (Static/Dynamic Link)

     • Processor-Optimized Implementation: Intel IPP Processor-Optimized Binaries (benefit: Outstanding Performance)
       – Intel® Atom™ Processors
       – Intel® Core™ i7 Processors
       – Intel® Core™ 2 Duo and Core™ Extreme Processors
       – Intel® Core™ Duo and Core™ Solo Processors
       – Intel® Pentium® D Dual-Core Processors
       – Intel® Xeon® 64-bit Dual-Core Processors
       – Intel® Pentium® M and Pentium® 4 Processors
       – Intel® Itanium® 64-bit Processor Family
       – Intel® Xeon® DP and MP Processors




Intel® IPP Function Library
     • Over 11,000 functions in 15 domains

     • Threaded application support
       – all functions are fully thread-safe
       – many functions internally threaded


     • Multiple data type support
       – Fixed and floating point data type support
       – 8, 16, 32 and 64-bit


     • Supports both static and dynamic linking
       – Maximize performance while balancing application size



Intel® Integrated Performance Primitives
(IPP)
 Intel IPP vs. C on single processor
 • 200% faster (average over all domains)
 • Optimized C performance normalized to 1




 System configuration: Intel® Xeon® 4 Processor, 2.8GHz, 2GB using Windows* XP




Threading In Application




Threading Inside Intel IPP




Intel® IPP Code Samples:
     Multithreaded H.264 Video Decode




      Measured using a Dell* Inspiron* 9400 PC with an Intel® Core™ Duo Processor 2.2GHz, 512MB RAM using Microsoft Windows* XP SP2. Codec samples compiled using
          Intel® C++ Compiler 9.1 using compilation options $(ICL_OMPLIB_OPT) /Qwd9,171,188,593,810,981,1125,1418 -D_OMP_KARABAS -D_OPENMP -Qopenmp



Intel® Threading Building Blocks
     Extend C++ for parallelism

     Highlights
     •  A C++ runtime library that does thread management, letting
        developers focus on proven parallel patterns
     • Appropriately scales to the number of HW threads available
     • Supports nested parallelism
     • The thread library API is portable across Linux, Windows,
        and Mac OS* platforms. Open Source community extended
        support to FreeBSD*, IA Solaris* and XBox* 360
     • Run-time library provides optimal size thread pool, task
        granularity and performance oriented scheduling
         • Automatic load balancing through task stealing
         • Cache efficiency and memory reuse
     • Committed to:
         • compiler independence
         • processor independence
         • OS independence
     Both GPL and commercial licenses are available.
                    http://threadingbuildingblocks.org
         *Other names and brands may be claimed as the property of others
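
TBB itself is a C++ library, so the following is not TBB code. It is a rough Python analogy for two of the highlights above: a pool sized to the number of hardware threads, and work expressed as tasks rather than as hand-managed threads. The function `parallel_sum` and its `chunk_size` parameter are hypothetical names, and the standard-library pool used here does no task stealing.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def parallel_sum(values, chunk_size=1000):
    """Sum values by splitting them into tasks that run on a shared pool."""
    values = list(values)
    chunks = [values[i:i + chunk_size]
              for i in range(0, len(values), chunk_size)]
    # One worker per logical CPU reported by the OS (assumption: this
    # approximates "the number of HW threads available").
    workers = os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = pool.map(sum, chunks)  # one task per chunk
        return sum(partial_sums)

print(parallel_sum(range(10_000)))
```

In CPython the global interpreter lock limits the speedup for CPU-bound work like this, so the sketch shows the structure of task-based code, not a performance result.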



Check Intel® TBB online
         www.threadingbuildingblocks.org



        Active user forums, FAQs, technical
            blogs, latest documentation



     Open Source Package License information.



       Several very important contributions
         were made by the OS community
      allowing TBB 2.1 to build and work on:
          XBox* 360, Sun Solaris*, AIX*



     TBB news column and introductory videos




      *Other names and brands may be claimed as the property of others



Threading Tools
      Intel Software Solutions Group:
      http://www.intel.com/software


     Intel® Thread Checker
     –Used to create correct multi-threaded code


     Intel® Thread Profiler
     –Used to analyze performance




Data Race
      • Suppose a = 1, b = 2

         Thread1               Thread2
         x = a + b             b = 42

      • What is the value of x if:
        – Thread1 runs before Thread2? x = 3
        – Thread2 runs before Thread1? x = 43

      Execution order is not guaranteed
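
The two outcomes can be reproduced deterministically by forcing each order in turn. This is a minimal Python sketch: the helper `run_in_order` and the `state` dictionary are hypothetical names, and joining each thread before starting the next is used here only to pin down the order that a real program would leave to the scheduler.

```python
import threading

def run_in_order(order):
    """Run the two thread bodies in the given order and return x."""
    state = {"a": 1, "b": 2, "x": None}

    def thread1():
        state["x"] = state["a"] + state["b"]  # reads b

    def thread2():
        state["b"] = 42                       # writes b

    bodies = {"Thread1": thread1, "Thread2": thread2}
    for name in order:
        t = threading.Thread(target=bodies[name])
        t.start()
        t.join()  # finish this thread before starting the next one
    return state["x"]

x_thread1_first = run_in_order(["Thread1", "Thread2"])
x_thread2_first = run_in_order(["Thread2", "Thread1"])
print(x_thread1_first, x_thread2_first)
```

Without the joins, both threads could be started together and either result would be possible, which is the race the slide describes.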

Intel® Thread Checker Diagnostics




Source Code Viewer




Performance Profile

     [Figure: speedup (axis from 0 to 3) plotted against number of threads (1 to 4)]

     Possible causes for this scalability profile:
                    1. Insufficient parallel work
                    2. Load imbalance
                    3. Synchronization overhead
                    4. Memory bandwidth limitations
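
Cause 1 can be put in numbers with Amdahl's law. The slide does not name it, but it is the standard model for how the serial part of a program caps speedup. A minimal Python sketch, with `amdahl_speedup` and the sample fraction of 0.5 chosen purely for illustration:

```python
def amdahl_speedup(parallel_fraction, threads):
    """Ideal speedup when only parallel_fraction of the run time scales."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)

# With half of the work serial, adding threads helps less and less.
for n in (1, 2, 3, 4):
    print(n, round(amdahl_speedup(0.5, n), 3))
```

The model ignores load imbalance, synchronization, and memory bandwidth, so measured speedup is usually lower than what it predicts.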

Finding Serial and Parallel Time




Load Imbalance

     • Unequal work loads lead to idle threads and wasted time

     [Figure: timeline for Thread 0 to Thread 3 from "Start threads" to "Join threads"; each thread is shown as busy or idle]
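
The cost of imbalance can be estimated from the per-thread busy times: the parallel region lasts as long as the slowest thread, and every other thread idles for the difference. A minimal Python sketch, where `imbalance_report` and the sample busy times are hypothetical:

```python
def imbalance_report(busy_times):
    """Return (wall_time, idle_per_thread, efficiency) for one parallel region."""
    wall_time = max(busy_times)  # the join waits for the slowest thread
    idle = [wall_time - t for t in busy_times]
    efficiency = sum(busy_times) / (wall_time * len(busy_times))
    return wall_time, idle, efficiency

# Four threads with unequal work, in arbitrary time units.
wall, idle, eff = imbalance_report([4, 2, 3, 1])
print(wall, idle, eff)
```

An efficiency of 1.0 would mean every thread was busy for the whole region; lower values indicate time lost to idle threads.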



Synchronization


     • By definition, synchronization serializes execution
     • Lock contention means more idle time for threads


     [Figure: timeline for Thread 0 to Thread 3; each thread is shown as busy, idle, or in a critical section]
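
A lock is the simplest way to see this serialization: only one thread at a time can be inside the critical section, and the others wait. A minimal Python sketch, with `locked_increments` as a hypothetical helper:

```python
import threading

def locked_increments(num_threads, increments_per_thread):
    """Increment a shared counter from several threads under one lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments_per_thread):
            with lock:        # critical section: one thread at a time
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

print(locked_increments(4, 1000))
```

The lock keeps the count correct, but every increment is serialized, so the more time threads spend contending for it, the more of the run looks like the idle segments in the figure.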



Real example: Before fix (Thread Profiler)

     [Figure: Thread Profiler timeline showing serial time, switching overhead, and parallel time]

Real example: After fix




     2X Speed Up

     [Figure: Thread Profiler timeline showing serial time and parallel time]

Summary
     • If the hardware doesn’t win outright (unlikely), then it is the SW’s fault
       – And we can fix the SW
     • Parallelization is an imperative
     • Intel offers a set of tools, world-wide experience and online support.
     • Questions to be asked:
       – Have we enabled SMT?
       – Have we investigated the capabilities of SSE?
       – Did we license Intel SW tools? (IPP/TBB/Thread Checker…)
       – Where can I find an Intel acronym dictionary?




Thank You!



36                           Document# 408075
        Intel Confidential

Embedded Solutions 2010: Intel Multicore by Eastronics

  • 1. Intel Multi Core Micro Architecture and software tools
      Nir Arazy, Field Application Engineer, Eastronics
      June 2010
      Nir.arazy@easx.co.il
  • 2. Legal Disclaimer
      • Intel Corporation may have patents or pending patent applications, trademarks, copyrights, or other intellectual property rights that relate to the presented subject matter. The furnishing of documents and other materials and information does not provide any license, express or implied, by estoppel or otherwise, to any such patents, trademarks, copyrights, or other intellectual property rights.
      • The Intel product(s) referred to in this document are intended for standard commercial use only. Customers are solely responsible for assessing the suitability of the product for use in particular applications. Intel products are not intended for use in medical, life saving, life sustaining, critical control or safety systems, or in nuclear facility applications.
      • Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel® products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, visit http://www.intel.com/performance/resources/limits.htm or call (U.S.) 1-800-628-8686 or 1-916-356-3104.
      • All information provided related to future Intel products and plans is preliminary and subject to change at any time, without notice. All dates provided are subject to change without notice. Intel may make changes to specifications and product descriptions at any time, without notice.
      • Celeron, Intel, Intel logo, Intel Core, Intel Inside, Intel Inside logo, Intel. Leap ahead., Intel. Leap ahead. logo, Intel NetBurst, Intel SpeedStep, Intel XScale, Itanium, Pentium, Pentium Inside, VTune, Xeon, and Xeon Inside are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
      • * Other names and brands may be claimed as the property of others.
      • Other vendors are listed by Intel as a convenience to Intel's general customer base, but Intel does not make any representations or warranties whatsoever regarding quality, reliability, functionality, or compatibility of these devices. This list and/or these devices may be subject to change without notice.
      • Copyright © 2009, Intel Corporation. All rights reserved.
  • 3. [Figure: timeline spanning 2007, 2008, 2009, 2010, 2011]
  • 4. Moore’s Law – GHz to Multi-Core
      “Concurrency is the next major revolution in how we write software” - Dr Dobb’s Journal, Herb Sutter, March 2005
      • Intel MC Assistance
          – Threading
          – Multi-tasking
          – Training
          – Tools
      [Figure: performance chart with a 2006 marker; curve labels "Performance Through frequency" and "Performance Through Multi-Core"]
  • 5. Multi-core is Mainstream: Is Your Software Ready?
      Multiple execution cores ramping across Intel platforms
  • 6. Agenda
      • HW based parallelism
          – Multi-Cores
          – Turbo boost
          – SMT
          – SSE
      • SW tools to enable efficient parallelism
          – IPP
          – TBB
          – Thread Checker
          – Thread Profiler
  • 7. Simultaneous Multi-Threading (SMT)
      • SMT – Run 2 threads at the same time per core
      • Take advantage of 4-wide execution engine
          – Keep it fed with multiple threads
          – Hide latency of a single thread
      • Most power efficient performance feature
          – Very low die area cost
          – Can provide significant performance benefit depending on application
          – Much more efficient than adding an entire core
      • Nehalem/Westmere advantages
          – Larger caches
          – Massive memory BW
      [Figure: execution-unit occupancy w/o SMT versus with SMT; axis: Time (proc. cycles). Note: Each box represents a processor execution unit]
      Simultaneous multi-threading enhances performance and energy efficiency
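
The SMT slide's diagram can be mimicked with a small toy model. This is only an illustrative sketch, not how the hardware actually schedules work: each list entry is a hypothetical number of execution units a thread could use in one cycle (0 stands for a stall), `WIDTH = 4` follows the slide's "4-wide execution engine", and the function names are made up for this example.

```python
WIDTH = 4  # execution units available per cycle (the "4-wide execution engine")


def cycles_without_smt(thread_a, thread_b):
    """Threads run back to back, so every entry costs one full cycle."""
    return len(thread_a) + len(thread_b)


def cycles_with_smt(thread_a, thread_b, width=WIDTH):
    """Thread A issues first each cycle; thread B shares the cycle if its work fits."""
    if any(d < 0 or d > width for d in list(thread_a) + list(thread_b)):
        raise ValueError("per-cycle demand must be between 0 and width")
    i = j = cycles = 0
    while i < len(thread_a) or j < len(thread_b):
        used = 0
        if i < len(thread_a):
            used += thread_a[i]
            i += 1
        # Fill the execution units left idle by thread A with thread B's work.
        if j < len(thread_b) and used + thread_b[j] <= width:
            j += 1
        cycles += 1
    return cycles
```

In this model two threads that each leave units idle overlap well, while two threads that each saturate the engine gain nothing, which is one way to read the slide's "depending on application" caveat.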
  • 8. Enhanced Cache Subsystem
      • 3-level cache hierarchy
          – First Level Cache (FLC)
              – 32 KB Instruction & 32 KB Data per core
              – Equivalent to L1 Cache in Intel® Core™ microarchitecture
          – Mid Level Cache (MLC)
              – 256 KB per core
          – Last Level Cache
              – Up to 4MB shared across both cores
              – Inclusive cache policy – minimize snoop traffic
              – Equivalent to L2 Cache in Intel® Core™2 Duo microarchitecture
      [Figure: Processor Cache Subsystem. Core 0 and Core 1 each have a 32KB FLC Instruction, a 32KB FLC Data and a 256KB MLC; both share a ≤ 4MB Last Level Cache]
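
As a rough illustration of the three levels on the cache slide, the sketch below reports the smallest level that a working set of a given size would fit in. The sizes come from the slide; the 4 MB figure is the slide's upper bound for the shared last level cache, and the table and function names are hypothetical. Real placement also depends on associativity, sharing between cores and access pattern, which this ignores.

```python
KB = 1024
MB = 1024 * KB

# (level name, capacity in bytes), ordered from closest to the core outward.
CACHE_LEVELS = [
    ("FLC data", 32 * KB),   # per core
    ("MLC", 256 * KB),       # per core
    ("LLC", 4 * MB),         # shared across cores, "up to" 4 MB on the slide
]


def smallest_cache_level(working_set_bytes):
    """Return the name of the first level large enough, or 'memory' if none is."""
    for name, capacity in CACHE_LEVELS:
        if working_set_bytes <= capacity:
            return name
    return "memory"
```

Because the last level cache is inclusive, it also holds copies of the lines in the inner levels, so the capacities in the table should not be added together.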
  • 9. All New 2010 Intel® Core™ Performance-Based Technology Overview
      Core 2010 Features: Intel® Turbo Boost and Hyper-Threading Technologies
      • Intel® Hyper-Threading Technology
          – Smart multitasking by doubling the number of processor threads per core with Intel® Hyper-Threading Technology
      • Intel® Turbo Boost Technology (Note 1)
          – Intelligently and seamlessly delivers improved CPU performance to match your workload when thermal and power headroom exists
      • Intel® HD Graphics with Dynamic Frequency (available on Mobile only)
          – Delivers graphics performance boost to graphics intensive applications provided thermal and power headroom exists
      New Intel® Core processors with Intel® Turbo Boost Technology and Dynamic Frequency to maximize performance of CPU and graphics intensive tasks
      Note 1: See Intel® Turbo Boost Technology disclaimer in the back-up
      [Figure: block diagram of CPU cores, CPU threads and GFX core]
  • 10. Intel® Turbo Boost Technology
      Figure: Turbo Boost in the previous generation vs. the current platform. Panel labels: Single Core CPU Turbo, Dual Core CPU Turbo, Intel® Intelligent Power Sharing. Annotations: +1 Speed Bin, +Multiple Speed Bins, C3 State or lower, GFX Turbo. Intelligent Power Sharing: dynamically trade TDP budget (Scenario 1: CPU Intensive Load; Scenario 2: GFX Intensive Load).
      Note: CPU and GFX can turbo simultaneously
      Strategy: Maximize CPU and GFX performance while staying within the processor TDP and Tjmax
      Note: Some features may be available only on certain SKUs
  • 11. Intel® Turbo Boost Technology
      Figure: Processor w/Turbo vs. Processor w/out Turbo.
      Intel® Turbo Boost Technology is targeted to deliver additional performance gains on Platform
      Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, visit Intel Performance Benchmark Limitations.
      Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance.
      Results have been simulated and are provided for informational purposes only. Results were derived using simulations run on an architecture simulator or model. Any difference in system hardware or software design or configuration may affect actual performance.
  • 12. Intel® Advanced Digital Media Boost
      Single Cycle SSE: 128-bit single-cycle SSE operation (SSE/SSE2/SSE3) in each core
      Figure: an SSE/2/3 OP on a 128-bit SOURCE (bits 127..0) holding X4 X3 X2 X1 and a DEST holding Y4 Y3 Y2 Y1.
          – Previous: decode, then execute over two clock cycles (cycle 1: X2opY2, X1opY1; cycle 2: X4opY4, X3opY3)
          – Intel® Core™ Microarchitecture: decode, then execute in one clock cycle (cycle 1: X4opY4, X3opY3, X2opY2, X1opY1)
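The four-lane operation in the figure (X1..X4 op Y1..Y4 in one instruction) can be sketched with SSE intrinsics. This is a minimal illustration, assuming an x86 or x86-64 compiler that provides `<xmmintrin.h>`; the function name `add4` is hypothetical, and addition stands in for the generic "op" on the slide. It shows the programming model only and makes no claim about cycle counts on any particular processor.

```cpp
#include <array>
#include <xmmintrin.h>  // SSE intrinsics (x86 / x86-64 only)

// Adds four packed single-precision floats with one 128-bit SSE instruction:
// result[i] = x[i] + y[i] for i = 0..3.
// `add4` is a hypothetical helper name used only for this sketch.
std::array<float, 4> add4(const std::array<float, 4>& x,
                          const std::array<float, 4>& y) {
    const __m128 vx = _mm_loadu_ps(x.data());  // load X1..X4 (unaligned load)
    const __m128 vy = _mm_loadu_ps(y.data());  // load Y1..Y4 (unaligned load)
    const __m128 vr = _mm_add_ps(vx, vy);      // one ADDPS performs all four adds
    std::array<float, 4> result{};
    _mm_storeu_ps(result.data(), vr);          // store the four sums
    return result;
}
```

Calling `add4` on {1, 2, 3, 4} and {10, 20, 30, 40} yields {11, 22, 33, 44}; the scalar equivalent would need four separate additions.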
  • 13. Single Instruction Multiple Data (SIMD)
      • Anything that fits into 16 bytes…
          – 4x floats
          – 2x doubles
          – 16x bytes
          – 8x words
          – 4x dwords
          – 2x qwords
          – 1x dqword
      • and all conversions!
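The element counts on this slide follow directly from dividing the 16-byte register width by the element size. A small compile-time sketch, where the names `kSseRegisterBytes` and `lanes` are hypothetical and the fixed-width integer types stand in for the slide's bytes, words, dwords and qwords:

```cpp
#include <cstddef>
#include <cstdint>

// Width of one 128-bit SSE register in bytes (hypothetical constant name).
constexpr std::size_t kSseRegisterBytes = 16;

// Number of elements of type T that fit into one 128-bit register.
template <typename T>
constexpr std::size_t lanes() {
    return kSseRegisterBytes / sizeof(T);
}

// These checks assume the usual 4-byte float and 8-byte double.
static_assert(lanes<float>() == 4, "4x floats");
static_assert(lanes<double>() == 2, "2x doubles");
static_assert(lanes<std::uint8_t>() == 16, "16x bytes");
static_assert(lanes<std::uint16_t>() == 8, "8x words");
static_assert(lanes<std::uint32_t>() == 4, "4x dwords");
static_assert(lanes<std::uint64_t>() == 2, "2x qwords");
```

The 1x dqword case is the whole register treated as a single 128-bit value, so it has no standard C++ element type in this sketch.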
  • 14. Intel® Advanced Vector Extensions (Intel® AVX)
      • Features:
          – New 256-bit Instruction Set Architecture (ISA)
          – Built on legacy 128-bit SIMD (SSEx) and 64-bit SIMD (MMX) ISA extensions
          – Enhancements to 128-bit SIMD instructions
          – Support for 3- and 4-operand syntax
      • Expected Intel® AVX benefits:
          – Image, video and audio processing
          – CNC* & PLC compute performance
          – High performance Digital Signal & Image Processing (DSIP) within small Size, Weight and total Power (SW&P)
      • Targeted segments:
          – Military/Aerospace/Government
          – Medical Imaging
          – Comms, Industrial Controllers & Digital Signage
      Source: http://software.intel.com/en-us/avx/
      Performance Improvements for Floating Point Intensive Applications
  • 15. Agenda
      • HW based parallelism
          – Multi-Cores
          – SMT
          – Turbo boost
          – SSE
      • SW tools to enable efficient parallelism
          – IPP
          – TBB
          – Thread Checker
          – Thread Profiler
  • 16. Simplified Threaded Development with Intel® Tools
      • Architectural Analysis (Analyzers)
          – Find the code that can benefit from threading
          – Find hotspots that limit performance
      • Introduce Threads (Compilers, Libraries)
          – Compilers: built-in optimization, OpenMP
          – Libraries: multimedia & data processing, math processing, threading
      • Confidence/Correctness (Checkers)
          – Find deadlocks and race conditions
      • Optimize / Tune (Analyzers)
          – Tune for performance and scalability
          – Visualize efficiency of threaded code
  • 17. Intel® Integrated Performance Primitives (Intel® IPP): Overview and Benefits
      Figure: layered stack, from Application Source Code through API calls down to statically or dynamically linked binaries.
      • Intel IPP Usage Code Samples (Free Code Samples: Rapid Application Development)
          – Sample video/audio/speech codecs
          – Image processing and JPEG
          – Signal processing
          – Data compression
          – .NET and Java integration
      • Intel IPP Library C/C++ API (Cross-platform API: Compatibility and Code Re-Use)
          – Cryptography
          – Image processing
          – Data Compression
          – Data Integrity
          – Image color conversion
          – JPEG / JPEG2000
          – Signal processing
          – Matrix mathematics
          – Computer Vision
          – Video coding
          – Vector mathematics
          – String processing
          – Audio coding
          – Speech coding
          – Speech recognition
      • Intel IPP Processor-Optimized Binaries, Static/Dynamic Link (Processor-Optimized Implementation: Outstanding Performance)
          – Intel® Atom™ Processors
          – Intel® Core™ i7 Processors
          – Intel® Core™ 2 Duo and Core™ Extreme Processors
          – Intel® Core™ Duo and Core™ Solo Processors
          – Intel® Pentium® D Dual-Core Processors
          – Intel® Xeon® 64-bit Dual-Core Processors
          – Intel® Pentium® M and Pentium® 4 Processors
          – Intel® Itanium® 64-bit Processor Family
          – Intel® Xeon® DP and MP Processors
  • 18. Intel® IPP Function Library
      • Over 11,000 functions in 15 domains
      • Threaded application support
          – All functions are fully thread-safe
          – Many functions internally threaded
      • Multiple data type support
          – Fixed and floating point data type support
          – 8, 16, 32 and 64-bit
      • Supports both static and dynamic linking
          – Maximize performance while balancing application size
  • 19. Intel® Integrated Performance Primitives (IPP)
      Intel IPP vs. C on single processor
      • 200% faster (average over all domains)
      • Optimized C performance normalized to 1
      System configuration: Intel® Xeon® 4 Processor, 2.8GHz, 2GB using Windows* XP
  • 20. Threading In Application
  • 21. Threading Inside Intel IPP
  • 22. Intel® IPP Code Samples: Multithreaded H.264 Video Decode
      Measured using a Dell* Inspiron* 9400 PC with an Intel® Core™ Duo Processor 2.2GHz, 512MB RAM using Microsoft Windows* XP SP2.
      Codec samples compiled using Intel® C++ Compiler 9.1 using compilation options $(ICL_OMPLIB_OPT) /Qwd9,171,188,593,810,981,1125,1418 -D_OMP_KARABAS -D_OPENMP -Qopenmp
  • 23. Intel® Threading Building Blocks
      Extend C++ for parallelism
      Highlights
      • A C++ runtime library that does thread management, letting developers focus on proven parallel patterns
      • Appropriately scales to the number of HW threads available
      • Supports nested parallelism
      • The thread library API is portable across Linux, Windows, and Mac OS* platforms. Open Source community extended support to FreeBSD*, IA Solaris* and XBox* 360
      • Run-time library provides optimal size thread pool, task granularity and performance oriented scheduling
      • Automatic load balancing through task stealing
      • Cache efficiency and memory reuse
      • Committed to:
          – compiler independence
          – processor independence
          – OS independence
      Both GPL and commercial licenses are available.
      http://threadingbuildingblocks.org
      *Other names and brands may be claimed as the property of others
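To make "scales to the number of HW threads available" concrete, here is a minimal sketch that uses only the C++ standard library, not TBB itself. It sizes a set of worker threads from `std::thread::hardware_concurrency()` and gives each a fixed slice of the input. The function name `parallel_sum` is hypothetical. Unlike the task stealing described above, this static split does not rebalance work at run time; a runtime library such as TBB takes care of that part.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sums `data` using one contiguous chunk per worker thread.
// `parallel_sum` is a hypothetical name for this standard-library sketch.
long long parallel_sum(const std::vector<int>& data) {
    // hardware_concurrency() may return 0 when the count is unknown,
    // so fall back to a single worker in that case.
    std::size_t workers = std::max(1u, std::thread::hardware_concurrency());
    // Never start more workers than there are elements (but at least one).
    workers = std::min(workers, std::max<std::size_t>(1, data.size()));

    const std::size_t chunk = (data.size() + workers - 1) / workers;
    std::vector<long long> partial(workers, 0);  // one result slot per worker
    std::vector<std::thread> pool;

    for (std::size_t w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(data.size(), w * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        pool.emplace_back([&data, &partial, w, begin, end] {
            // Each worker writes only its own slot, so no lock is needed.
            partial[w] = std::accumulate(
                data.begin() + static_cast<std::ptrdiff_t>(begin),
                data.begin() + static_cast<std::ptrdiff_t>(end), 0LL);
        });
    }
    for (std::thread& t : pool) t.join();
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

The same source runs with one worker on a single-thread machine and with more workers where more hardware threads are reported, without changing the caller.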
  • 24. Check Intel® TBB online
      www.threadingbuildingblocks.org
      • Active user forums, FAQs, technical blogs, latest documentation
      • Open Source Package
      • License information
      • Several very important contributions were made by the open source community allowing TBB 2.1 to build and work on: XBox* 360, Sun Solaris*, AIX*
      • TBB news column and introductory videos
  • 25. Threading Tools
      Intel Software Solutions Group: http://www.intel.com/software
      • Intel® Thread Checker
          – Used to create correct multi-threaded code
      • Intel® Thread Profiler
          – Used to analyze performance
  • 26. Data Race
      • Suppose a=1, b=2
          – Thread1: x = a + b
          – Thread2: b = 42
      • What is the value of x if:
          – Thread1 runs before Thread2? x = 3
          – Thread2 runs before Thread1? x = 43
      Execution order is not guaranteed
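The two outcomes on this slide can be reproduced deterministically by forcing each order in turn. In the sketch below, the names `Shared`, `thread1_body`, `thread2_body` and `run_in_order` are hypothetical; each thread is joined before the next one starts, so there is no actual race in this code. Starting both threads at once without synchronization would be a data race, and the result would depend on scheduling.

```cpp
#include <functional>
#include <thread>

// State shared by the two threads, initialised as on the slide.
struct Shared {
    int a = 1;
    int b = 2;
    int x = 0;
};

void thread1_body(Shared& s) { s.x = s.a + s.b; }  // Thread1: x = a + b
void thread2_body(Shared& s) { s.b = 42; }         // Thread2: b = 42

// Runs both bodies on real threads in a forced order: the first thread is
// joined before the second starts, so the outcome is deterministic.
int run_in_order(bool thread1_first) {
    Shared s;
    if (thread1_first) {
        std::thread t1(thread1_body, std::ref(s));
        t1.join();
        std::thread t2(thread2_body, std::ref(s));
        t2.join();
    } else {
        std::thread t2(thread2_body, std::ref(s));
        t2.join();
        std::thread t1(thread1_body, std::ref(s));
        t1.join();
    }
    return s.x;
}
```

`run_in_order(true)` returns 3 and `run_in_order(false)` returns 43, matching the two cases on the slide.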
  • 27. Intel® Thread Checker Diagnostics
  • 28. Source Code Viewer
  • 29. Performance Profile
      Figure: speedup (axis 0 to 3) vs. number of threads (1 to 4).
      Possible causes for this scalability profile:
          1. Insufficient parallel work
          2. Load imbalance
          3. Synchronization overhead
          4. Memory bandwidth limitations
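The first cause, insufficient parallel work, can be quantified with Amdahl's law, which the slide does not name but which is the standard model for it: if a fraction p of the run time parallelizes perfectly over n threads, the speedup is 1 / ((1 - p) + p / n). The helper name `amdahl_speedup` is hypothetical, and the model ignores the other three causes on the list.

```cpp
#include <cassert>

// Amdahl's law: ideal speedup on `threads` threads when `parallel_fraction`
// (between 0 and 1) of the single-threaded run time can be parallelized.
// Load imbalance, locking and memory bandwidth are not modelled.
double amdahl_speedup(double parallel_fraction, int threads) {
    assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
    assert(threads >= 1);
    const double serial_fraction = 1.0 - parallel_fraction;
    return 1.0 / (serial_fraction + parallel_fraction / threads);
}
```

For example, with half of the run time serial (p = 0.5), four threads give a speedup of 1.6, and no number of threads can reach 2.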
  • 30. Finding Serial and Parallel Time
  • 31. Load Imbalance
      • Unequal work loads lead to idle threads and wasted time
      Figure: timeline of Thread 0 to Thread 3 between "Start threads" and "Join threads", showing busy and idle time per thread.
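The wasted time in the figure can be computed from per-thread busy times: the parallel region lasts as long as the slowest thread, and every other thread idles for the difference. A minimal sketch, with the hypothetical names `ImbalanceReport` and `analyze`, assuming all threads start together and meet at a single join:

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

struct ImbalanceReport {
    double wall_time;   // time until the slowest thread reaches the join
    double idle_time;   // total time all threads spend waiting at the join
    double efficiency;  // busy time divided by (threads * wall_time)
};

// `busy` holds the busy time of each thread between start and join.
ImbalanceReport analyze(const std::vector<double>& busy) {
    assert(!busy.empty());
    const double wall = *std::max_element(busy.begin(), busy.end());
    const double total_busy = std::accumulate(busy.begin(), busy.end(), 0.0);
    const double capacity = wall * static_cast<double>(busy.size());
    const double efficiency = wall > 0.0 ? total_busy / capacity : 1.0;
    return {wall, capacity - total_busy, efficiency};
}
```

With busy times of 4, 2, 1 and 1 time units, the region takes 4 units, the threads idle for 8 units in total, and efficiency is 0.5; an even split of the same 8 units of work would finish in 2.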
  • 32. Synchronization
      • By definition, synchronization serializes execution
      • Lock contention means more idle time for threads
      Figure: timeline of Thread 0 to Thread 3 over time, showing busy, idle and in-critical-section periods.
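One common way to reduce lock contention is to shrink how often each thread enters the critical section. The sketch below, using only the C++ standard library with hypothetical function names, computes the same count twice: once taking the lock on every increment, and once accumulating privately and taking the lock a single time per thread. Both return the same result; how much faster the second is depends on the machine and is not claimed here.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Every increment enters the critical section, so threads serialize on `m`.
long long count_with_fine_locking(int threads, int per_thread) {
    long long total = 0;
    std::mutex m;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) {
                std::lock_guard<std::mutex> lock(m);
                ++total;
            }
        });
    }
    for (std::thread& th : pool) th.join();
    return total;
}

// Each thread counts into a private variable and locks once to publish it.
long long count_with_local_sums(int threads, int per_thread) {
    long long total = 0;
    std::mutex m;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            long long local = 0;
            for (int i = 0; i < per_thread; ++i) ++local;
            std::lock_guard<std::mutex> lock(m);
            total += local;
        });
    }
    for (std::thread& th : pool) th.join();
    return total;
}
```

With 4 threads and 10,000 increments each, the first version acquires the mutex 40,000 times and the second only 4 times, which is the kind of difference a thread profiler would show as reduced time in critical sections.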
  • 33. Real example: Before fix (Thread Profiler)
      Figure labels: Switching Overhead, Serial, Parallel
  • 34. Real example: After fix
      2X Speed Up
      Figure labels: Serial, Parallel
  • 35. Summary
      • If the hardware doesn’t win outright (unlikely), then it is the SW’s fault
          – And we can fix the SW
      • Parallelization is an imperative
      • Intel offers a set of tools, world-wide experience and online support
      • Questions to be asked:
          – Have we enabled SMT?
          – Have we investigated the capabilities of SSE?
          – Did we license Intel SW tools? (IPP/TBB/Thread Checker…)
          – Where can I find an Intel acronym dictionary?
  • 36. Thank You!