SlideShare uma empresa Scribd logo
1 de 33
Baixar para ler offline
SF 2009




    PCI Express* 3.0 Technology:
    Device Architecture
    Optimizations on Intel
    Platforms
     Mahesh Wagh
     IO Architect

    TCIS006
Agenda

    • Next Generation PCI Express* (PCIe*)
      Protocol Extensions Summary
    • Device Architecture Considerations
      – Energy Efficient Performance
      – Power Management
    • Software Development
    • Summary
    • Call to action



2
PCI Express* (PCIe*) 2.1 Protocol Extensions Summary

    Extensions     Explanation                                      Benefit
 Transaction     Request hints to      Reduce access latency to system memory. Reduce System
                 enable optimized      Interconnect & Memory Bandwidth & Associated Power
 Layer Packet    processing within     Consumption
 (TLP)           host memory/cache     Application Class: NIC, Storage, Accelerators/GP-GPU
 Processing      hierarchy
 Hints
 Latency         Mechanisms for        Reducing Platform Power based on device service requirements,
                 platform to tune PM   Application Class: All devices/Segments
 Tolerance
 Reporting
 Opportunistic   Mechanisms for        Reducing Platform Power based aligning device activity with
                 platform to tune PM   platform PM events to further reduce platform power
 Buffer Flush    and to align device   Application Class: All devices/Segments
 and Fill        activities

 Atomics         Atomic Read-          Reduced Synchronization overhead, software library algorithm and
                 Modify-Write          data structure re-use across core and accelerators/devices.
                 mechanism             Application Class: (Graphics, Accelerators/GP-GPU))

 Resizable       Mechanism to          System Resource optimizations - breakaway from “All or Nothing
                 negotiate BAR size    device address space allocation”
 BAR
                                       Application Class: – Any Device with large local memory
                                       (Example: Graphics)

 Multicast       Address Based         Significant gain in efficiency compared to multiple unicasts
                 Multicast             Application Class: (Embedded, Storage & Multiple Graphics
                                       adapters)


3
Continued…
    Extensions         Explanation                                       Benefit

I/O Page         Extends IO address              System Memory Management optimizations
                 remapping for page faults –     Application Class: Accelerators, GP-GPU usage models
Faults
                 (Address Translation Services
                 1.1)

Ordering         New ordering semantic to        Improved performance (latency reduction) ~ (IDO) 20%
                 improve performance             read latency improvement by permitting unrelated reads
Enhancements
                                                 to pass writes. Application Class: All Devices with two
                                                 party communication

Dynamic          Mechanisms to allow             Dynamic component power/thermal control, manage
                 dynamic power/performance       endpoint function power usage to meet new customer or
Power
                 management of D0 (active)       regulatory operation requirements
Allocation       substates.                      Application Class: GP-GPU
(DPA)
Internal Error   Extend AER to report            Enables software to implement common and interoperable
                 component internal errors       error handling services. Improved error containment and
Reporting
                 (Correctable/ uncorrectable)    recovery.
                 and multiple error logs         Application Class: RAS for Switches


TLP Prefix       Mechanism to extend TLP         Scalable Architecture headroom for TLP headers to grow
                 headers                         with minimal impact to routing elements. Support Vendor
                                                 Specific header formats.
                                                 Application Class: MR-IOV, Large Topologies and
                                                 provisioning for future use models



          New Features with Broad Applicability
4
Device Architecture Considerations

                    • TLP Processing Hints (TPH)

                    • Power Management Extensions




5
TLP Processing Hints (TPH)

    Processor
                                             TLP Processing Hints (PCIe Base
                                                2.1 specification)
                $                               – Memory Read, Memory Write
                                                  and Atomic Operations
                                             System Specific Steering Tags
                $                              (ST)
                                    Memory      – Identify targeted resource e.g.
                                                  System Cache Location
                                                – 256 unique Steering Tags

      Root Complex                            Benefits
                                              Effective Use of System Resources
                                              • Reduce Access latency to system
                                                memory
                                              • Reduce Memory & system
                                                interconnect BW & Power
                     PCI Express* (PCIe*)
                           Device

                    Improves System Efficiency
     Effective use of System Fabric and Resources
6
Requirements Checklist
    Ecosystem                                                   Device Driver
     Platform support (Root Complex,
     Routing Elements)                                          Device Specific
                                                                 Assignments
     System specific Steering Tag
     advertisement


                                                                                            Platform
    Device Architecture spec. compliance
                      PCIe
                                      PCIe Capability support
                                                                                               ST
     Characterize workloads                                                                 Mapping
     Application processor Affinity
                                                                       OS/VMM
     Steering Tag to Workload
     association                                                                            Firmware
     Select Modes of Operation
                                                                Root Complex
    Software Development                                        TPH capability   Platform
    Basic Capability Discovery,
    Identification and Management
                                                                        ST Table            TPH
    System specific Steering Tag                                     TPH capability
    advertisement and assignment
    Device Driver enhancements
                                                                            PCIe Device



7   PCI Express* (PCIe*) Technology
No ST Mode


    No Steering Tags used

    Request Steering is Platform
    Specific                                            OS/VMM

    Basic Capability Enablement
                                                Root Complex

    Minimal Implementation                      TPH capability   Platform
    cost and complexity                                                   TPH
                                                      TPH capability    (No ST)
                                                            PCIe Device



                                      Basic support

8   PCI Express* (PCIe*) Technology
Interrupt Vector Mode
    ST associated with Interrupt
    (MSI/MSI-X)
                                                                                  Platform
    Firmware provides Platform                                                       ST
    specific ST information to                                                    Mapping
    OS/Hypervisor                                            OS/VMM
                                                                                  Firmware
    OS/Hypervisor assigns ST
    along with Interrupt vector
                                                      Root Complex
    assignment                                        TPH capability   Platform

    Suitable for devices with                                 ST Table            TPH
    workload/Interrupt affinity                            TPH capability

    to cores                                                      PCIe Device




                                      TTM Advantage
9   PCI Express* (PCIe*) Technology
Device Specific Mode
                                               Device Driver
                                               Device Specific
     Device Specific ST                         Assignments
     association

     Builds upon Interrupt vector                                          Platform
     mode s/w support                                                         ST
                                                                           Mapping
     Device Driver determines                         OS/VMM
     processor affinity                                                    Firmware


     New API required to request               Root Complex

     ST assignments                            TPH capability   Platform

     Independent of Interrupt                          ST Table
                                                    TPH capability
                                                                           TPH
     association                                           PCIe Device

                          Scalable & Flexible Solution
10   PCI Express* (PCIe*) Technology
TPH Aware Device Architecture
          Classify device initiated             Steering Tag Modes
               transactions:
              Bulk vs. Control                No ST: Basic Hints only, No ST
                                               used
          Select hints based on
         Data Struct. Use models              Interrupt Vector Mode:
                                               Faster TTM with Interrupt
                                               association
          Control Struct. (Descriptors)       Device Specific Mode:
                                               Scalable, Flexible & Dynamic,
          Headers for Pkt. Processing
                                               can provide TTM advantage
          Data Payload (Copies)



                           Software Development
        Basic TPH Capability Identification, Discovery and Management
        Firmware Support to advertise ST assignments
        OS/Hypervisor ST assignment support
        Optional API support


             TPH permits Device Architecture Specific Trade-Offs

11
Device Architecture Considerations

                 • TLP Processing Hints


                 • Power Management Extensions




12
System Perspective

     • Device behavior impacts power consumption of other
       system components
        – Devices should consider their system power impact, not just
          their own device level power consumption
           • Extreme example: Enhanced Host Controller Interface (EHCI)
     • Systems (and devices) are idle most of the time
        – There’s a big opportunity for devices and systems to take
          advantage of that
     • Latency Tolerance Reporting (LTR) enables lower
       power, longer exit latency system power states when
       devices can tolerate it
     • Optimized Buffer Flush/Fill (OBFF) enables platform
       activity alignment, resulting in system power savings

              Opportunity for Devices to Differentiate on
                        Platform Power Savings
13
Platform Power Savings Opportunity

                                40       Average power scenario running MM07 (Office* Productivity suite)
      Platform Power in Watts


                                                      On an Intel® Core™2 Duo platform running Windows* Vista*
                                35
                                                                                                             Idle workload power
                                30                                                                                (~8W – 10W)
                                25
                                                                                                                      PM
                                20                                                                                 Opportunity
                                15

                                10

                                 5

                                 0
                                     0     10    20    30      40      50      60      70      80     90    100    110
                                                               Time in Minutes                       Source: Intel Corporation

     • Usage Analysis: Typical mobile platform in S0 state is ~90% idle
     • When idle, platform components are kept in high power state to
       meet the service latency requirements of devices & applications

                    Power consumption for idle workload is high
14
Optimized Buffer Flush/Fill

     • Next several foils describe:
       – Platform activity alignment
       – PCI Express* (PCIe*) OBFF mechanisms
       – OBFF within context of platform activity
         alignment
       – Device implementation impacts for OBFF




15
Platform Activity Alignment

     Current Platforms



                                        Power Management Opportunities
     Platforms with Activity Alignment




      OS tick      Device         Device interrupts       Device       Device traffic
      inter-     interrupts         (deferrable)          traffic      (deferrable)
       rupts      (critical)      • Not time critical
                                                         (critical)
                • Time Critical                                         • No Buffering
                                  • Status Notifications • Buffer over-   constraints
                • Buffer          • User Command           (under-) run • Debug
                  replenish         Completions          • Throughput/ dumps
                • Performance/    • Debug, Statistics      Performance
                  Throughput

          Creates PM Opportunities for Semi-active workloads
16
Optimized Buffer Flush/Fill (OBFF)
                                                        OBFF
      Processor                                         • Notify all Endpoints of
                                                           optimal windows with
                                                           minimal power impact
                                                        Solution1: When possible,
                                                          use WAKE# with new
                    Optional           Memory             wire semantics
                    OBFF    2                           Solution2: WAKE# not
                    Message                               available – Use PCIe
     Wake#                                                Message
                                Optimal Windows
         Root Complex                                   WAKE# Waveforms
                                • Processor Active
                                  – Platform fully      Transition Event           WAKE#
                                  active. Optimal for
                                  bus mastering and     Idle  OBFF
                                  interrupts
                                • OBFF – Platform       Idle  Proc. Active
                                  memory path
                                  available for         OBFF/Proc. Active  Idle
                                  memory read and
                                  writes                OBFF  Proc. Active
                                • Idle – Platform is
         PCI Express* (PCIe*)     in low power state    Proc. Active  OBFF
               Device

              Greatest Potential Improvement When
17             Implemented by All Platform Devices
OBFF and Activity Alignment
                                                                                   Interrupts   DMAs
                                       PCIe     Dev A
                                                                   Device A
                                       PCIe     Dev B
             Platform                                              Device B
                                       PCIe     Dev C
                                                                   Device C
                                       WAKE#
                                                                   OS Timer
                                                                   Tick (intrpt)
       Traffic Pattern with No OBFF


                                                                                                  Time
       Traffic Pattern with OBFF


            Idle  Active                         Idle    Active    Idle     OBFF  Idle          Active
                         Idle            OBFF
           Window Wndw                          Window   Wndw     Window    Wndw Window         Window




18   PCI Express* (PCIe*) Technology                                                                     18
OBFF Device Implementation
 Impacts
                                        Classify device initiated
       Maximize idle window                   transactions:
       duration for platform             critical vs. deferrable
     •Align transactions with         •Perform critical
     other devices in system          transactions as necessary
     •Coalesce transactions into      •Defer other transactions
     groups where possible            to align with platform
         – Perform groups of          activity
           transactions all at            – Decode platform idle
           once, don’t trickle all          / active / OBFF
           the time                         window signaling


     Select data buffer depths to tolerate platform activity
     alignment
      • 300µs of deferral buffering recommended for Intel mobile
        platforms



19                                                                  19
Latency Tolerance Reporting

     • The next several foils describe:
       – Power vs. response latency
       – PCI Express* (PCIe*) LTR mechanisms,
         semantics
       – Examples of device implementation schemes
          • Application state driven LTR reporting
          • Data buffer depth driven LTR reporting
          • Software guided LTR reporting
       – Device implementation impact summary for
         LTR



20                                                   20
Power Vs Response Latency
 (Mobile)
                                                    System Active State (S0)                     System Inactive State (Sx)
 Platform Response Latency
                             ~10sec




                                           Max              Workload            Idle

                                                    Low Service Latency                                                        S4
                                                    requirements during                 High Service Latency
                             500msec




                                                   active workloads gives               requirements during
                                                    reliable performance                idle workloads gives
                                                                                        more PM opportunity
                                                                                                                      S3
                                                                     Devices and
                                                                  Applications have                            Response Latency
                                                                    Fixed Service                              Penalty increases
                                                                       Latency                                 with lower power
                                                                  expectations from                                  states
                                                                   a platform in S0
                             ~100usec




                                        Proc. C0           Proc. CX 



                                        Max power                                     8-10W                         600mW     400mW
                                                   Platform Power Consumption  Decreasing
                                   Variable Service Latency requirements in S0 is Optimal
21                                                                                                                                  21
Latency Tolerance Reporting (LTR)
                                         LTR Mechanism
     Processor                       •    PCI Express* (PCIe*) Message sent by
                                          Endpoint with tolerable latency
                                            –   Capability to report both snooped & non-
                                                snooped values
                                            –   “Terminate at Receiver” routing, MFD &
                                                Switch send aggregated message

                 LTR        Memory       Benefits
                 Message             •    Provides Device Benefit: Dynamically tune
                                          platform PM state as a function of Device
                                          activity level
                                     •    Platform benefit: Enables greater power
                                          savings without impact to
       Root Complex                       performance/functionality

                                     Dynamic LTR
                   1
                                          LTR (Max)
                                                            Buffer     1    Idle
                       2
           Buffer
                                          LTR (Activity
                                           Adjusted)        Buffer     2    Active
      PCI Express* Device

          LTR enables dynamic power vs. performance
22              tradeoffs at minimal cost impact
Application State Driven LTR
 Example: WLAN Device Sending LTR


       Latency     Latency                                 Latency
      (100usec)    (1msec)                                (100usec)




           Client Awake             Client Dozing


          Beacon with TIM    Data        PS-Poll    ACK


        Latency information with Wi-Fi Legacy Power Save


     Example use of device PM states to give latency guidance

23                                                                    23
Data Buffer Utilization Guided LTR
  Example: Active Ethernet NIC Sending LTR
  Initial         A new LTR value is in effect no later than the value sent in the previous LTR msg
 Latency                                                                                              Latency
  500µs                                                                                               500µsec
                                             Platform Components Power State
 Very Low                            High       Low                  High    Low                Low Very Low
  Power                             Power      Power                Power   Power              Power Power


                 500µs                       (PCIe) Bus
                                            Transactions

                               100µs                          100µs
     600µs                                                                            100us
                             Device
                             Buffer                        Device                       Timeout
                               LTR Threshold               Buffer

                               100µs


     Idle                                                                               Idle

                       Network Data
                                                                              Timeout
      LTR = 100µs                                                                               LTR = 500µs

              Example use of buffering to give latency guidance
24    PCI Express* (PCIe*) Technology                                                                         24
Software Guided Latency

     SW Guided Latency
      Three device categories                                        Platform
         Static: Device can always                                    Power
          support max platform latency
                                                Device Driver           Mgt
         Slow Dynamic: Latency
          requirements change                   LTR Policy Logic       Policy
          infrequently                                                 Engine
         Fast Dynamic: Latency
          requirements change frequently
      Static and Slow Dynamic types of
       devices may choose SW guided
       messaging
         Policy logic for determination of           MMIO Register    LTR
          when to send latency messages
          (and what values) in software
                                                       PCI Express* Device
         E.g. use an MMIO register
               A write to the register would
                trigger an LTR message




25                                                                               25
LTR Device Implementation Impacts
     When idle, let platform enter deep power saving states
       Use MaxLatency (LTR Extended Capabilities field) when idle
       Require low latencies only when necessary – don’t keep
        platform in high power state longer than necessary


       Dynamic, hardware                             Software guided
          driven LTR                                       LTR
      Leverage application                          Implement
       based opportunities to          SW             simple MMIO
       tolerate more latency                          register
                                      Guided
         E.g. WLAN radio off                         interface
                                        ?
          between beacons                               Register
      Implement data                                    writes cause
       buffering mechanism                               LTR message
       to comprehend LTR                                 to be sent




26                                                                      26
Software Enabling

     Features requiring basic software support
                             8GT/s speed upgrade
     Capability Discovery,   Atomics
     Identification and      Transaction Ordering Relaxations
                             Internal Error Reporting
     Management              TLP Prefix




       Features requiring additional support
     Above and beyond           Transaction Processing Hints
     capability enablement      LTR & OBFF
                                Resizable BAR
     Resource Allocation,       IO Page Faults
     Enumeration & API          Dynamic Power Allocation
                                Multicast




27
Summary
 • Next Generation PCI Express* (PCIe*)
   Protocol Extensions Deliver Energy Efficient
   Performance
     – Protocol Extensions with Broad Applicability

 • Ecosystem Development is essential
     – Platform Support
     – Device architectures optimized around protocol
       features
     – Software support and Enabling




28
Call to Action
 • Device Architecture Considerations
     – Develop Device Architecture to make the most of the most of
       proposed protocol extensions
 • Differentiate products utilizing TPH
     – Select Hints/ST modes based on device/market segment
       requirements
 • Differentiate products utilizing LTR and OBFF
     – Can differentiate by platform power impact not just device power
     – Power Savings opportunity is huge

 • Keep track of Next Generation PCI Express* Technology
   development
     – PCI-SIG www.pcisig.com

 • Engage with Intel on Next Generation PCI Express product
   development
     – www.intel.com/technology/pciexpress/devnet


29
Acknowledgements / Disclaimer
  • I would like to acknowledge the contributions of the following
    Intel employees
     –   Stephen Whalley, Intel Corporation
     –   Jasmin Ajanovic, Intel Corporation
     –   Anil Vasudevan, Intel Corporation
     –   Prashant Sethi, Intel Corporation
     –   Simer Singh, Intel Corporation
     –   Miles Penner, Intel Corporation
     –   Mahesh Natu, Intel Corporation
     –   Rob Gough, Intel Corporation
     –   Eric Wehage, Intel Corporation
     –   Jim Walsh, Intel Corporation
     –   Jaya Jeyaseelan, Intel Corporation
     –   Neil Songer, Intel Corporation
     –   Barnes Cooper, Intel Corporation

 All opinions, judgments, recommendations, etc. that
 are presented herein are the opinions of the presenter
 of the material and do not necessarily reflect the
 opinions of the PCI-SIG*.
30                                                                   30
Additional Sources of Information
     on This Topic
     • Other Sessions
       – TSIS007 –PCI Express* 3.0 Technology: PHY
         Implementation Considerations on Intel
         Platforms
       – TSIS008 – PCI Express* 3.0 Technology:
         Electrical Requirements for Designing ASICs on
         Intel Platforms
       – TCIQ002: Q&A: PCI Express* 3.0 Technology

     • www.intel.com/technology/pciexpress/devn
       et



31
Legal Disclaimer
• INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO
  LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL
  PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS
  AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER,
  AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF
  INTEL® PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A
  PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR
  OTHER INTELLECTUAL PROPERTY RIGHT. INTEL PRODUCTS ARE NOT INTENDED FOR USE IN
  MEDICAL, LIFE SAVING, OR LIFE SUSTAINING APPLICATIONS.
• Intel may make changes to specifications and product descriptions at any time, without notice.
• All products, dates, and figures specified are preliminary based on current expectations, and are subject to
  change without notice.
• Intel, processors, chipsets, and desktop boards may contain design defects or errors known as errata, which
  may cause the product to deviate from published specifications. Current characterized errata are available on
  request.
• Code names featured are used internally within Intel to identify products that are in development and not yet
  publicly announced for release. Customers, licensees and other third parties are not authorized by Intel to
  use code names in advertising, promotion or marketing of any product or services and any such use of Intel's
  internal code names is at the sole risk of the user
• Performance tests and ratings are measured using specific computer systems and/or components and reflect
  the approximate performance of Intel products as measured by those tests. Any difference in system
  hardware or software design or configuration may affect actual performance.
• Intel, Intel Inside, and the Intel logo are trademarks of Intel Corporation in the United States and other
  countries.
• *Other names and brands may be claimed as the property of others.
• Copyright © 2009 Intel Corporation.



32
Risk Factors
The above statements and any others in this document that refer to plans and expectations for the third quarter, the year and the
future are forward-looking statements that involve a number of risks and uncertainties. Many factors could affect Intel’s actual
results, and variances from Intel’s current expectations regarding such factors could cause actual results to differ materially from
those expressed in these forward-looking statements. Intel presently considers the following to be the important factors that could
cause actual results to differ materially from the corporation’s expectations. Ongoing uncertainty in global economic conditions pose
a risk to the overall economy as consumers and businesses may defer purchases in response to tighter credit and negative financial
news, which could negatively affect product demand and other related matters. Consequently, demand could be different from
Intel's expectations due to factors including changes in business and economic conditions, including conditions in the credit market
that could affect consumer confidence; customer acceptance of Intel’s and competitors’ products; changes in customer order
patterns including order cancellations; and changes in the level of inventory at customers. Intel operates in intensely competitive
industries that are characterized by a high percentage of costs that are fixed or difficult to reduce in the short term and product
demand that is highly variable and difficult to forecast. Additionally, Intel is in the process of transitioning to its next generation of
products on 32nm process technology, and there could be execution issues associated with these changes, including product defects
and errata along with lower than anticipated manufacturing yields. Revenue and the gross margin percentage are affected by the
timing of new Intel product introductions and the demand for and market acceptance of Intel's products; actions taken by Intel's
competitors, including product offerings and introductions, marketing programs and pricing pressures and Intel’s response to such
actions; and Intel’s ability to respond quickly to technological developments and to incorporate new features into its products. The
gross margin percentage could vary significantly from expectations based on changes in revenue levels; capacity utilization; start-up
costs, including costs associated with the new 32nm process technology; variations in inventory valuation, including variations
related to the timing of qualifying products for sale; excess or obsolete inventory; product mix and pricing; manufacturing yields;
changes in unit costs; impairments of long-lived assets, including manufacturing, assembly/test and intangible assets; and the
timing and execution of the manufacturing ramp and associated costs. Expenses, particularly certain marketing and compensation
expenses, as well as restructuring and asset impairment charges, vary depending on the level of demand for Intel's products and
the level of revenue and profits. The current financial stress affecting the banking system and financial markets and the going
concern threats to investment banks and other financial institutions have resulted in a tightening in the credit markets, a reduced
level of liquidity in many financial markets, and heightened volatility in fixed income, credit and equity markets. There could be a
number of follow-on effects from the credit crisis on Intel’s business, including insolvency of key suppliers resulting in product
delays; inability of customers to obtain credit to finance purchases of our products and/or customer insolvencies; counterparty
failures negatively impacting our treasury operations; increased expense or inability to obtain short-term financing of Intel’s
operations from the issuance of commercial paper; and increased impairments from the inability of investee companies to obtain
financing. The majority of our non-marketable equity investment portfolio balance is concentrated in companies in the flash memory
market segment, and declines in this market segment or changes in management’s plans with respect to our investments in this
market segment could result in significant impairment charges, impacting restructuring charges as well as gains/losses on equity
investments and interest and other. Intel's results could be impacted by adverse economic, social, political and
physical/infrastructure conditions in countries where Intel, its customers or its suppliers operate, including military conflict and other
security risks, natural disasters, infrastructure disruptions, health concerns and fluctuations in currency exchange rates. Intel's
results could be affected by adverse effects associated with product defects and errata (deviations from published specifications),
and by litigation or regulatory matters involving intellectual property, stockholder, consumer, antitrust and other issues, such as the
litigation and regulatory matters described in Intel's SEC reports. A detailed discussion of these and other risk factors that could
affect Intel’s results is included in Intel’s SEC filings, including the report on Form 10-Q for the quarter ended June 27, 2009.


 Rev. 7/27/09

Mais conteúdo relacionado

Mais procurados

Revisit DCA, PCIe TPH and DDIO
Revisit DCA, PCIe TPH and DDIORevisit DCA, PCIe TPH and DDIO
Revisit DCA, PCIe TPH and DDIOHisaki Ohara
 
PCIe Error Status (1).pptx
PCIe Error Status (1).pptxPCIe Error Status (1).pptx
PCIe Error Status (1).pptxssuser1c8ca21
 
44CON 2014 - Stupid PCIe Tricks, Joe Fitzpatrick
44CON 2014 - Stupid PCIe Tricks, Joe Fitzpatrick44CON 2014 - Stupid PCIe Tricks, Joe Fitzpatrick
44CON 2014 - Stupid PCIe Tricks, Joe Fitzpatrick44CON
 
Uboot startup sequence
Uboot startup sequenceUboot startup sequence
Uboot startup sequenceHoucheng Lin
 
PCI Express Verification using Reference Modeling
PCI Express Verification using Reference ModelingPCI Express Verification using Reference Modeling
PCI Express Verification using Reference ModelingDVClub
 
LAS16-402: ARM Trusted Firmware – from Enterprise to Embedded
LAS16-402: ARM Trusted Firmware – from Enterprise to EmbeddedLAS16-402: ARM Trusted Firmware – from Enterprise to Embedded
LAS16-402: ARM Trusted Firmware – from Enterprise to EmbeddedLinaro
 
APB protocol v1.0
APB protocol v1.0APB protocol v1.0
APB protocol v1.0Azad Mishra
 
03_03_Implementing_PCIe_ATS_in_ARM-based_SoCs_Final
03_03_Implementing_PCIe_ATS_in_ARM-based_SoCs_Final03_03_Implementing_PCIe_ATS_in_ARM-based_SoCs_Final
03_03_Implementing_PCIe_ATS_in_ARM-based_SoCs_FinalGopi Krishnamurthy
 
PCI Express* based Storage: Data Center NVM Express* Platform Topologies
PCI Express* based Storage: Data Center NVM Express* Platform TopologiesPCI Express* based Storage: Data Center NVM Express* Platform Topologies
PCI Express* based Storage: Data Center NVM Express* Platform TopologiesOdinot Stanislas
 
Q4.11: Introduction to eMMC
Q4.11: Introduction to eMMCQ4.11: Introduction to eMMC
Q4.11: Introduction to eMMCLinaro
 

Mais procurados (20)

Pci express technology 3.0
Pci express technology 3.0Pci express technology 3.0
Pci express technology 3.0
 
Revisit DCA, PCIe TPH and DDIO
Revisit DCA, PCIe TPH and DDIORevisit DCA, PCIe TPH and DDIO
Revisit DCA, PCIe TPH and DDIO
 
Pci express modi
Pci express modiPci express modi
Pci express modi
 
PCIe Error Status (1).pptx
PCIe Error Status (1).pptxPCIe Error Status (1).pptx
PCIe Error Status (1).pptx
 
44CON 2014 - Stupid PCIe Tricks, Joe Fitzpatrick
44CON 2014 - Stupid PCIe Tricks, Joe Fitzpatrick44CON 2014 - Stupid PCIe Tricks, Joe Fitzpatrick
44CON 2014 - Stupid PCIe Tricks, Joe Fitzpatrick
 
Uboot startup sequence
Uboot startup sequenceUboot startup sequence
Uboot startup sequence
 
Pcie basic
Pcie basicPcie basic
Pcie basic
 
PCI Express Verification using Reference Modeling
PCI Express Verification using Reference ModelingPCI Express Verification using Reference Modeling
PCI Express Verification using Reference Modeling
 
PCIe
PCIePCIe
PCIe
 
Ambha axi
Ambha axiAmbha axi
Ambha axi
 
PCI express
PCI expressPCI express
PCI express
 
Linux I2C
Linux I2CLinux I2C
Linux I2C
 
LAS16-402: ARM Trusted Firmware – from Enterprise to Embedded
LAS16-402: ARM Trusted Firmware – from Enterprise to EmbeddedLAS16-402: ARM Trusted Firmware – from Enterprise to Embedded
LAS16-402: ARM Trusted Firmware – from Enterprise to Embedded
 
axi protocol
axi protocolaxi protocol
axi protocol
 
AMBA Ahb 2.0
AMBA Ahb 2.0AMBA Ahb 2.0
AMBA Ahb 2.0
 
APB protocol v1.0
APB protocol v1.0APB protocol v1.0
APB protocol v1.0
 
03_03_Implementing_PCIe_ATS_in_ARM-based_SoCs_Final
03_03_Implementing_PCIe_ATS_in_ARM-based_SoCs_Final03_03_Implementing_PCIe_ATS_in_ARM-based_SoCs_Final
03_03_Implementing_PCIe_ATS_in_ARM-based_SoCs_Final
 
PCI Express* based Storage: Data Center NVM Express* Platform Topologies
PCI Express* based Storage: Data Center NVM Express* Platform TopologiesPCI Express* based Storage: Data Center NVM Express* Platform Topologies
PCI Express* based Storage: Data Center NVM Express* Platform Topologies
 
Q4.11: Introduction to eMMC
Q4.11: Introduction to eMMCQ4.11: Introduction to eMMC
Q4.11: Introduction to eMMC
 
SOC design
SOC design SOC design
SOC design
 

Semelhante a PCI Express 3.0 Technology Optimizations on Intel Platforms

Heterogeneous Computing : The Future of Systems
Heterogeneous Computing : The Future of SystemsHeterogeneous Computing : The Future of Systems
Heterogeneous Computing : The Future of SystemsAnand Haridass
 
BeRTOS: Free Embedded RTOS
BeRTOS: Free Embedded RTOSBeRTOS: Free Embedded RTOS
BeRTOS: Free Embedded RTOSDeveler S.r.l.
 
Mirabilis_Design AMD Versal System-Level IP Library
Mirabilis_Design AMD Versal System-Level IP LibraryMirabilis_Design AMD Versal System-Level IP Library
Mirabilis_Design AMD Versal System-Level IP LibraryDeepak Shankar
 
Ceph Day Taipei - Accelerate Ceph via SPDK
Ceph Day Taipei - Accelerate Ceph via SPDK Ceph Day Taipei - Accelerate Ceph via SPDK
Ceph Day Taipei - Accelerate Ceph via SPDK Ceph Community
 
Track B- Advanced ESL verification - Mentor
Track B- Advanced ESL verification - MentorTrack B- Advanced ESL verification - Mentor
Track B- Advanced ESL verification - Mentorchiportal
 
5 pipeline arch_rationale
5 pipeline arch_rationale5 pipeline arch_rationale
5 pipeline arch_rationalevideos
 
directCell - Cell/B.E. tightly coupled via PCI Express
directCell - Cell/B.E. tightly coupled via PCI ExpressdirectCell - Cell/B.E. tightly coupled via PCI Express
directCell - Cell/B.E. tightly coupled via PCI ExpressHeiko Joerg Schick
 
IBM i Technology Refreshes Overview 2012 06-04
IBM i Technology Refreshes Overview 2012 06-04IBM i Technology Refreshes Overview 2012 06-04
IBM i Technology Refreshes Overview 2012 06-04COMMON Europe
 
Oracle rac 10g best practices
Oracle rac 10g best practicesOracle rac 10g best practices
Oracle rac 10g best practicesHaseeb Alam
 
Configuration management benefits for everyone - Rudder @ FLOSSUK Spring Conf...
Configuration management benefits for everyone - Rudder @ FLOSSUK Spring Conf...Configuration management benefits for everyone - Rudder @ FLOSSUK Spring Conf...
Configuration management benefits for everyone - Rudder @ FLOSSUK Spring Conf...RUDDER
 
From Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computersFrom Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computersRyousei Takano
 
NVMe_Infrastructure_final1.pdf
NVMe_Infrastructure_final1.pdfNVMe_Infrastructure_final1.pdf
NVMe_Infrastructure_final1.pdfIrfanBroadband
 
Red hat Enterprise Linux 6.4 for IBM System z Technical Highlights
Red hat Enterprise Linux 6.4 for IBM System z Technical HighlightsRed hat Enterprise Linux 6.4 for IBM System z Technical Highlights
Red hat Enterprise Linux 6.4 for IBM System z Technical HighlightsFilipe Miranda
 
S104878 nvme-revolution-jburg-v1809b
S104878 nvme-revolution-jburg-v1809bS104878 nvme-revolution-jburg-v1809b
S104878 nvme-revolution-jburg-v1809bTony Pearson
 
System_on_Chip_SOC.ppt
System_on_Chip_SOC.pptSystem_on_Chip_SOC.ppt
System_on_Chip_SOC.pptzahixdd
 
Yashi dealer meeting settembre 2016 tecnologie xeon intel italia
Yashi dealer meeting settembre 2016 tecnologie xeon intel italiaYashi dealer meeting settembre 2016 tecnologie xeon intel italia
Yashi dealer meeting settembre 2016 tecnologie xeon intel italiaYashi Italia
 
Infrastruttura Efficiente Di Sun E Amd -Virtualise with Confidence
Infrastruttura Efficiente Di Sun E Amd -Virtualise with ConfidenceInfrastruttura Efficiente Di Sun E Amd -Virtualise with Confidence
Infrastruttura Efficiente Di Sun E Amd -Virtualise with ConfidenceWalter Moriconi
 

Semelhante a PCI Express 3.0 Technology Optimizations on Intel Platforms (20)

Heterogeneous Computing : The Future of Systems
Heterogeneous Computing : The Future of SystemsHeterogeneous Computing : The Future of Systems
Heterogeneous Computing : The Future of Systems
 
BeRTOS: Free Embedded RTOS
BeRTOS: Free Embedded RTOSBeRTOS: Free Embedded RTOS
BeRTOS: Free Embedded RTOS
 
Mirabilis_Design AMD Versal System-Level IP Library
Mirabilis_Design AMD Versal System-Level IP LibraryMirabilis_Design AMD Versal System-Level IP Library
Mirabilis_Design AMD Versal System-Level IP Library
 
Ceph Day Taipei - Accelerate Ceph via SPDK
Ceph Day Taipei - Accelerate Ceph via SPDK Ceph Day Taipei - Accelerate Ceph via SPDK
Ceph Day Taipei - Accelerate Ceph via SPDK
 
Track B- Advanced ESL verification - Mentor
Track B- Advanced ESL verification - MentorTrack B- Advanced ESL verification - Mentor
Track B- Advanced ESL verification - Mentor
 
5 pipeline arch_rationale
5 pipeline arch_rationale5 pipeline arch_rationale
5 pipeline arch_rationale
 
directCell - Cell/B.E. tightly coupled via PCI Express
directCell - Cell/B.E. tightly coupled via PCI ExpressdirectCell - Cell/B.E. tightly coupled via PCI Express
directCell - Cell/B.E. tightly coupled via PCI Express
 
IBM i Technology Refreshes Overview 2012 06-04
IBM i Technology Refreshes Overview 2012 06-04IBM i Technology Refreshes Overview 2012 06-04
IBM i Technology Refreshes Overview 2012 06-04
 
Oracle rac 10g best practices
Oracle rac 10g best practicesOracle rac 10g best practices
Oracle rac 10g best practices
 
PowerAI Deep dive
PowerAI Deep divePowerAI Deep dive
PowerAI Deep dive
 
PROSE
PROSEPROSE
PROSE
 
Configuration management benefits for everyone - Rudder @ FLOSSUK Spring Conf...
Configuration management benefits for everyone - Rudder @ FLOSSUK Spring Conf...Configuration management benefits for everyone - Rudder @ FLOSSUK Spring Conf...
Configuration management benefits for everyone - Rudder @ FLOSSUK Spring Conf...
 
Choosing the right processor
Choosing the right processorChoosing the right processor
Choosing the right processor
 
From Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computersFrom Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computers
 
NVMe_Infrastructure_final1.pdf
NVMe_Infrastructure_final1.pdfNVMe_Infrastructure_final1.pdf
NVMe_Infrastructure_final1.pdf
 
Red hat Enterprise Linux 6.4 for IBM System z Technical Highlights
Red hat Enterprise Linux 6.4 for IBM System z Technical HighlightsRed hat Enterprise Linux 6.4 for IBM System z Technical Highlights
Red hat Enterprise Linux 6.4 for IBM System z Technical Highlights
 
S104878 nvme-revolution-jburg-v1809b
S104878 nvme-revolution-jburg-v1809bS104878 nvme-revolution-jburg-v1809b
S104878 nvme-revolution-jburg-v1809b
 
System_on_Chip_SOC.ppt
System_on_Chip_SOC.pptSystem_on_Chip_SOC.ppt
System_on_Chip_SOC.ppt
 
Yashi dealer meeting settembre 2016 tecnologie xeon intel italia
Yashi dealer meeting settembre 2016 tecnologie xeon intel italiaYashi dealer meeting settembre 2016 tecnologie xeon intel italia
Yashi dealer meeting settembre 2016 tecnologie xeon intel italia
 
Infrastruttura Efficiente Di Sun E Amd -Virtualise with Confidence
Infrastruttura Efficiente Di Sun E Amd -Virtualise with ConfidenceInfrastruttura Efficiente Di Sun E Amd -Virtualise with Confidence
Infrastruttura Efficiente Di Sun E Amd -Virtualise with Confidence
 

Último

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 

Último (20)

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 

PCI Express 3.0 Technology Optimizations on Intel Platforms

  • 1. SF 2009 PCI Express* 3.0 Technology: Device Architecture Optimizations on Intel Platforms Mahesh Wagh IO Architect TCIS006
  • 2. Agenda • Next Generation PCI Express* (PCIe*) Protocol Extensions Summary • Device Architecture Considerations – Energy Efficient Performance – Power Management • Software Development • Summary • Call to action 2
  • 3. PCI Express* (PCIe*) 2.1 Protocol Extensions Summary Extensions Explanation Benefit Transaction Request hints to Reduce access latency to system memory. Reduce System enable optimized Interconnect & Memory Bandwidth & Associated Power Layer Packet processing within Consumption (TLP) host memory/cache Application Class: NIC, Storage, Accelerators/GP-GPU Processing hierarchy Hints Latency Mechanisms for Reducing Platform Power based on device service requirements, platform to tune PM Application Class: All devices/Segments Tolerance Reporting Opportunistic Mechanisms for Reducing Platform Power based aligning device activity with platform to tune PM platform PM events to further reduce platform power Buffer Flush and to align device Application Class: All devices/Segments and Fill activities Atomics Atomic Read- Reduced Synchronization overhead, software library algorithm and Modify-Write data structure re-use across core and accelerators/devices. mechanism Application Class: (Graphics, Accelerators/GP-GPU)) Resizable Mechanism to System Resource optimizations - breakaway from “All or Nothing negotiate BAR size device address space allocation” BAR Application Class: – Any Device with large local memory (Example: Graphics) Multicast Address Based Significant gain in efficiency compared to multiple unicasts Multicast Application Class: (Embedded, Storage & Multiple Graphics adapters) 3
  • 4. Continued… Extensions Explanation Benefit I/O Page Extends IO address System Memory Management optimizations remapping for page faults – Application Class: Accelerators, GP-GPU usage models Faults (Address Translation Services 1.1) Ordering New ordering semantic to Improved performance (latency reduction) ~ (IDO) 20% improve performance read latency improvement by permitting unrelated reads Enhancements to pass writes. Application Class: All Devices with two party communication Dynamic Mechanisms to allow Dynamic component power/thermal control, manage dynamic power/performance endpoint function power usage to meet new customer or Power management of D0 (active) regulatory operation requirements Allocation substates. Application Class: GP-GPU (DPA) Internal Error Extend AER to report Enables software to implement common and interoperable component internal errors error handling services. Improved error containment and Reporting (Correctable/ uncorrectable) recovery. and multiple error logs Application Class: RAS for Switches TLP Prefix Mechanism to extend TLP Scalable Architecture headroom for TLP headers to grow headers with minimal impact to routing elements. Support Vendor Specific header formats. Application Class: MR-IOV, Large Topologies and provisioning for future use models New Features with Broad Applicability 4
  • 5. Device Architecture Considerations • TLP Processing Hints (TPH) • Power Management Extensions 5
  • 6. TLP Processing Hints (TPH) Processor TLP Processing Hints (PCIe Base 2.1 specification) $ – Memory Read, Memory Write and Atomic Operations System Specific Steering Tags $ (ST) Memory – Identify targeted resource e.g. System Cache Location – 256 unique Steering Tags Root Complex Benefits Effective Use of System Resources • Reduce Access latency to system memory • Reduce Memory & system interconnect BW & Power PCI Express* (PCIe*) Device Improves System Efficiency Effective use of System Fabric and Resources 6
  • 7. Requirements Checklist Ecosystem Device Driver Platform support (Root Complex, Routing Elements) Device Specific Assignments System specific Steering Tag advertisement Platform Device Architecture spec. compliance PCIe PCIe Capability support ST Characterize workloads Mapping Application processor Affinity OS/VMM Steering Tag to Workload association Firmware Select Modes of Operation Root Complex Software Development TPH capability Platform Basic Capability Discovery, Identification and Management ST Table TPH System specific Steering Tag TPH capability advertisement and assignment Device Driver enhancements PCIe Device 7 PCI Express* (PCIe*) Technology
  • 8. No ST Mode No Steering Tags used Request Steering is Platform Specific OS/VMM Basic Capability Enablement Root Complex Minimal Implementation TPH capability Platform cost and complexity TPH TPH capability (No ST) PCIe Device Basic support 8 PCI Express* (PCIe*) Technology
  • 9. Interrupt Vector Mode ST associated with Interrupt (MSI/MSI-X) Platform Firmware provides Platform ST specific ST information to Mapping OS/Hypervisor OS/VMM Firmware OS/Hypervisor assigns ST along with Interrupt vector Root Complex assignment TPH capability Platform Suitable for devices with ST Table TPH workload/Interrupt affinity TPH capability to cores PCIe Device TTM Advantage 9 PCI Express* (PCIe*) Technology
  • 10. Device Specific Mode Device Driver Device Specific Device Specific ST Assignments association Builds upon Interrupt vector Platform mode s/w support ST Mapping Device Driver determines OS/VMM processor affinity Firmware New API required to request Root Complex ST assignments TPH capability Platform Independent of Interrupt ST Table TPH capability TPH association PCIe Device Scalable & Flexible Solution 10 PCI Express* (PCIe*) Technology
  • 11. TPH Aware Device Architecture Classify device initiated Steering Tag Modes transactions: Bulk vs. Control No ST: Basic Hints only, No ST used  Select hints based on Data Struct. Use models Interrupt Vector Mode: Faster TTM with Interrupt association Control Struct. (Descriptors) Device Specific Mode: Scalable, Flexible & Dynamic, Headers for Pkt. Processing can provide TTM advantage Data Payload (Copies) Software Development  Basic TPH Capability Identification, Discovery and Management  Firmware Support to advertise ST assignments  OS/Hypervisor ST assignment support  Optional API support TPH permits Device Architecture Specific Trade-Offs 11
  • 12. Device Architecture Considerations • TLP Processing Hints • Power Management Extensions 12
  • 13. System Perspective • Device behavior impacts power consumption of other system components – Devices should consider their system power impact, not just their own device level power consumption • Extreme example: Enhanced Host Controller Interface (EHCI) • Systems (and devices) are idle most of the time – There’s a big opportunity for devices and systems to take advantage of that • Latency Tolerance Reporting (LTR) enables lower power, longer exit latency system power states when devices can tolerate it • Optimized Buffer Flush/Fill (OBFF) enables platform activity alignment, resulting in system power savings Opportunity for Devices to Differentiate on Platform Power Savings 13
  • 14. Platform Power Savings Opportunity 40 Average power scenario running MM07 (Office* Productivity suite) Platform Power in Watts On an Intel® Core™2 Duo platform running Windows* Vista* 35 Idle workload power 30 (~8W – 10W) 25 PM 20 Opportunity 15 10 5 0 0 10 20 30 40 50 60 70 80 90 100 110 Time in Minutes Source: Intel Corporation • Usage Analysis: Typical mobile platform in S0 state is ~90% idle • When idle, platform components are kept in high power state to meet the service latency requirements of devices & applications Power consumption for idle workload is high 14
  • 15. Optimized Buffer Flush/Fill • Next several foils describe: – Platform activity alignment – PCI Express* (PCIe*) OBFF mechanisms – OBFF within context of platform activity alignment – Device implementation impacts for OBFF 15
  • 16. Platform Activity Alignment Current Platforms Power Management Opportunities Platforms with Activity Alignment OS tick Device Device interrupts Device Device traffic inter- interrupts (deferrable) traffic (deferrable) rupts (critical) • Not time critical (critical) • Time Critical • No Buffering • Status Notifications • Buffer over- constraints • Buffer • User Command (under-) run • Debug replenish Completions • Throughput/ dumps • Performance/ • Debug, Statistics Performance Throughput Creates PM Opportunities for Semi-active workloads 16
  • 17. Optimized Buffer Flush/Fill (OBFF) OBFF Processor • Notify all Endpoints of optimal windows with minimal power impact Solution1: When possible, use WAKE# with new Optional Memory wire semantics OBFF 2 Solution2: WAKE# not Message available – Use PCIe Wake# Message Optimal Windows Root Complex WAKE# Waveforms • Processor Active – Platform fully Transition Event WAKE# active. Optimal for bus mastering and Idle  OBFF interrupts • OBFF – Platform Idle  Proc. Active memory path available for OBFF/Proc. Active  Idle memory read and writes OBFF  Proc. Active • Idle – Platform is PCI Express* (PCIe*) in low power state Proc. Active  OBFF Device Greatest Potential Improvement When 17 Implemented by All Platform Devices
  • 18. OBFF and Activity Alignment Interrupts DMAs PCIe Dev A Device A PCIe Dev B Platform Device B PCIe Dev C Device C WAKE# OS Timer Tick (intrpt) Traffic Pattern with No OBFF Time Traffic Pattern with OBFF Idle Active Idle Active Idle OBFF Idle Active Idle OBFF Window Wndw Window Wndw Window Wndw Window Window 18 PCI Express* (PCIe*) Technology 18
  • 19. OBFF Device Implementation Impacts Classify device initiated Maximize idle window transactions: duration for platform critical vs. deferrable •Align transactions with •Perform critical other devices in system transactions as necessary •Coalesce transactions into •Defer other transactions groups where possible to align with platform – Perform groups of activity transactions all at – Decode platform idle once, don’t trickle all / active / OBFF the time window signaling Select data buffer depths to tolerate platform activity alignment • 300µs of deferral buffering recommended for Intel mobile platforms 19 19
  • 20. Latency Tolerance Reporting • The next several foils describe: – Power vs. response latency – PCI Express* (PCIe*) LTR mechanisms, semantics – Examples of device implementation schemes • Application state driven LTR reporting • Data buffer depth driven LTR reporting • Software guided LTR reporting – Device implementation impact summary for LTR 20 20
  • 21. Power Vs Response Latency (Mobile) System Active State (S0) System Inactive State (Sx) Platform Response Latency ~10sec Max Workload Idle Low Service Latency S4 requirements during High Service Latency 500msec active workloads gives requirements during reliable performance idle workloads gives more PM opportunity S3 Devices and Applications have Response Latency Fixed Service Penalty increases Latency with lower power expectations from states a platform in S0 ~100usec Proc. C0 Proc. CX  Max power 8-10W 600mW 400mW Platform Power Consumption  Decreasing Variable Service Latency requirements in S0 is Optimal 21 21
  • 22. Latency Tolerance Reporting (LTR) LTR Mechanism Processor • PCI Express* (PCIe*) Message sent by Endpoint with tolerable latency – Capability to report both snooped & non- snooped values – “Terminate at Receiver” routing, MFD & Switch send aggregated message LTR Memory Benefits Message • Provides Device Benefit: Dynamically tune platform PM state as a function of Device activity level • Platform benefit: Enables greater power savings without impact to Root Complex performance/functionality Dynamic LTR 1 LTR (Max) Buffer 1 Idle 2 Buffer LTR (Activity Adjusted) Buffer 2 Active PCI Express* Device LTR enables dynamic power vs. performance 22 tradeoffs at minimal cost impact
  • 23. Application State Driven LTR Example: WLAN Device Sending LTR Latency Latency Latency (100usec) (1msec) (100usec) Client Awake Client Dozing Beacon with TIM Data PS-Poll ACK Latency information with Wi-Fi Legacy Power Save Example use of device PM states to give latency guidance 23 23
  • 24. Data Buffer Utilization Guided LTR Example: Active Ethernet NIC Sending LTR Initial A new LTR value is in effect no later than the value sent in the previous LTR msg Latency Latency 500µs 500µsec Platform Components Power State Very Low High Low High Low Low Very Low Power Power Power Power Power Power Power 500µs (PCIe) Bus Transactions 100µs 100µs 600µs 100us Device Buffer Device Timeout LTR Threshold Buffer 100µs Idle Idle Network Data Timeout LTR = 100µs LTR = 500µs Example use of buffering to give latency guidance 24 PCI Express* (PCIe*) Technology 24
  • 25. Software Guided Latency SW Guided Latency  Three device categories Platform  Static: Device can always Power support max platform latency Device Driver Mgt  Slow Dynamic: Latency requirements change LTR Policy Logic Policy infrequently Engine  Fast Dynamic: Latency requirements change frequently  Static and Slow Dynamic types of devices may choose SW guided messaging  Policy logic for determination of MMIO Register LTR when to send latency messages (and what values) in software PCI Express* Device  E.g. use an MMIO register  A write to the register would trigger an LTR message 25 25
  • 26. LTR Device Implementation Impacts When idle, let platform enter deep power saving states  Use MaxLatency (LTR Extended Capabilities field) when idle  Require low latencies only when necessary – don’t keep platform in high power state longer than necessary Dynamic, hardware Software guided driven LTR LTR  Leverage application  Implement based opportunities to SW simple MMIO tolerate more latency register Guided  E.g. WLAN radio off interface ? between beacons  Register  Implement data writes cause buffering mechanism LTR message to comprehend LTR to be sent 26 26
  • 27. Software Enabling Features requiring basic software support 8GT/s speed upgrade Capability Discovery, Atomics Identification and Transaction Ordering Relaxations Internal Error Reporting Management TLP Prefix Features requiring additional support Above and beyond Transaction Processing Hints capability enablement LTR & OBFF Resizable BAR Resource Allocation, IO Page Faults Enumeration & API Dynamic Power Allocation Multicast 27
  • 28. Summary • Next Generation PCI Express* (PCIe*) Protocol Extensions Deliver Energy Efficient Performance – Protocol Extensions with Broad Applicability • Ecosystem Development is essential – Platform Support – Device architectures optimized around protocol features – Software support and Enabling 28
  • 29. Call to Action • Device Architecture Considerations – Develop Device Architecture to make the most of the most of proposed protocol extensions • Differentiate products utilizing TPH – Select Hints/ST modes based on device/market segment requirements • Differentiate products utilizing LTR and OBFF – Can differentiate by platform power impact not just device power – Power Savings opportunity is huge • Keep track of Next Generation PCI Express* Technology development – PCI-SIG www.pcisig.com • Engage with Intel on Next Generation PCI Express product development – www.intel.com/technology/pciexpress/devnet 29
  • 30. Acknowledgements / Disclaimer • I would like to acknowledge the contributions of the following Intel employees – Stephen Whalley, Intel Corporation – Jasmin Ajanovic, Intel Corporation – Anil Vasudevan, Intel Corporation – Prashant Sethi, Intel Corporation – Simer Singh, Intel Corporation – Miles Penner, Intel Corporation – Mahesh Natu, Intel Corporation – Rob Gough, Intel Corporation – Eric Wehage, Intel Corporation – Jim Walsh, Intel Corporation – Jaya Jeyaseelan, Intel Corporation – Neil Songer, Intel Corporation – Barnes Cooper, Intel Corporation All opinions, judgments, recommendations, etc. that are presented herein are the opinions of the presenter of the material and do not necessarily reflect the opinions of the PCI-SIG*. 30 30
  • 31. Additional Sources of Information on This Topic • Other Sessions – TSIS007 –PCI Express* 3.0 Technology: PHY Implementation Considerations on Intel Platforms – TSIS008 – PCI Express* 3.0 Technology: Electrical Requirements for Designing ASICs on Intel Platforms – TCIQ002: Q&A: PCI Express* 3.0 Technology • www.intel.com/technology/pciexpress/devn et 31
  • 32. Legal Disclaimer • INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL® PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. INTEL PRODUCTS ARE NOT INTENDED FOR USE IN MEDICAL, LIFE SAVING, OR LIFE SUSTAINING APPLICATIONS. • Intel may make changes to specifications and product descriptions at any time, without notice. • All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice. • Intel, processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request. • Code names featured are used internally within Intel to identify products that are in development and not yet publicly announced for release. Customers, licensees and other third parties are not authorized by Intel to use code names in advertising, promotion or marketing of any product or services and any such use of Intel's internal code names is at the sole risk of the user • Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. • Intel, Intel Inside, and the Intel logo are trademarks of Intel Corporation in the United States and other countries. • *Other names and brands may be claimed as the property of others. • Copyright © 2009 Intel Corporation. 32
  • 33. Risk Factors The above statements and any others in this document that refer to plans and expectations for the third quarter, the year and the future are forward-looking statements that involve a number of risks and uncertainties. Many factors could affect Intel’s actual results, and variances from Intel’s current expectations regarding such factors could cause actual results to differ materially from those expressed in these forward-looking statements. Intel presently considers the following to be the important factors that could cause actual results to differ materially from the corporation’s expectations. Ongoing uncertainty in global economic conditions pose a risk to the overall economy as consumers and businesses may defer purchases in response to tighter credit and negative financial news, which could negatively affect product demand and other related matters. Consequently, demand could be different from Intel's expectations due to factors including changes in business and economic conditions, including conditions in the credit market that could affect consumer confidence; customer acceptance of Intel’s and competitors’ products; changes in customer order patterns including order cancellations; and changes in the level of inventory at customers. Intel operates in intensely competitive industries that are characterized by a high percentage of costs that are fixed or difficult to reduce in the short term and product demand that is highly variable and difficult to forecast. Additionally, Intel is in the process of transitioning to its next generation of products on 32nm process technology, and there could be execution issues associated with these changes, including product defects and errata along with lower than anticipated manufacturing yields. Revenue and the gross margin percentage are affected by the timing of new Intel product introductions and the demand for and market acceptance of Intel's products; actions taken by Intel's competitors, including product offerings and introductions, marketing programs and pricing pressures and Intel’s response to such actions; and Intel’s ability to respond quickly to technological developments and to incorporate new features into its products. The gross margin percentage could vary significantly from expectations based on changes in revenue levels; capacity utilization; start-up costs, including costs associated with the new 32nm process technology; variations in inventory valuation, including variations related to the timing of qualifying products for sale; excess or obsolete inventory; product mix and pricing; manufacturing yields; changes in unit costs; impairments of long-lived assets, including manufacturing, assembly/test and intangible assets; and the timing and execution of the manufacturing ramp and associated costs. Expenses, particularly certain marketing and compensation expenses, as well as restructuring and asset impairment charges, vary depending on the level of demand for Intel's products and the level of revenue and profits. The current financial stress affecting the banking system and financial markets and the going concern threats to investment banks and other financial institutions have resulted in a tightening in the credit markets, a reduced level of liquidity in many financial markets, and heightened volatility in fixed income, credit and equity markets. There could be a number of follow-on effects from the credit crisis on Intel’s business, including insolvency of key suppliers resulting in product delays; inability of customers to obtain credit to finance purchases of our products and/or customer insolvencies; counterparty failures negatively impacting our treasury operations; increased expense or inability to obtain short-term financing of Intel’s operations from the issuance of commercial paper; and increased impairments from the inability of investee companies to obtain financing. The majority of our non-marketable equity investment portfolio balance is concentrated in companies in the flash memory market segment, and declines in this market segment or changes in management’s plans with respect to our investments in this market segment could result in significant impairment charges, impacting restructuring charges as well as gains/losses on equity investments and interest and other. Intel's results could be impacted by adverse economic, social, political and physical/infrastructure conditions in countries where Intel, its customers or its suppliers operate, including military conflict and other security risks, natural disasters, infrastructure disruptions, health concerns and fluctuations in currency exchange rates. Intel's results could be affected by adverse effects associated with product defects and errata (deviations from published specifications), and by litigation or regulatory matters involving intellectual property, stockholder, consumer, antitrust and other issues, such as the litigation and regulatory matters described in Intel's SEC reports. A detailed discussion of these and other risk factors that could affect Intel’s results is included in Intel’s SEC filings, including the report on Form 10-Q for the quarter ended June 27, 2009. Rev. 7/27/09