Branch Prediction Contest: Implementation of Piecewise Linear Prediction Algorithm

Prosunjit Biswas
Department of Computer Science
University of Texas at San Antonio

Abstract

Branch predictor accuracy is very important for harnessing the instruction-level parallelism (ILP) available in a program and thus for improving the performance of today’s microprocessors, especially superscalar processors. Among branch predictors, neural branch predictors, including the Scaled Neural Analog Predictor (SNAP) and the Piecewise Linear Branch Predictor, outperform other state-of-the-art predictors. In this final project for the Computer Architecture course (CS-5513), I studied various neural predictors and implemented the Piecewise Linear Branch Predictor following the algorithm given in a research paper by Dr. Daniel A. Jimenez [1]. The hardware budget for this project is restricted: I implemented the predictor within a predefined budget of 64K of memory. I am also competing in the branch prediction contest.

Keywords: Piecewise Linear, Neural Network, Branch Prediction.

I. INTRODUCTION

Neural branch predictors are the most accurate predictors in the literature, but they were impractical because of the high latency associated with prediction. This latency comes from the complex computation that must be carried out to determine the excitation of an artificial neuron [3].

Piecewise Linear Branch Prediction [1] improved both accuracy and latency over previous neural predictors. The predictor works by developing a set of linear functions, one for each program path to the branch to be predicted, that separate predicted-taken from predicted-not-taken branches.

In the paper “Piecewise Linear Branch Prediction” [1], Daniel A. Jimenez proposed two versions of the prediction algorithm: i) the Idealized Piecewise Linear Branch Predictor and ii) a Practical Piecewise Linear Branch Predictor. In this project, I have focused on the idealized predictor.

II. RELATED WORKS

Perceptron prediction is one of the first attempts in the history of branch prediction to predict branches with a neural network. This predictor improved the misprediction rate on a composite trace of SPEC2000 benchmarks by 14.7% [2]. Unfortunately, it was impractical due to its high latency.

Fast Path-Based Neural Branch Prediction [4] is another attempt, which combines path and pattern history to overcome the limitations of earlier neural predictors. It improved accuracy over previous neural predictors and achieved significantly lower latency. This predictor improved the IPC of an aggressively clocked microarchitecture by 16% over the former perceptron predictor.

The Scaled Neural Analog Predictor, or SNAP, is another recently proposed neural branch predictor. It uses the concept of piecewise linear branch prediction and relies on a mixed analog/digital implementation. This predictor reduces latency and power consumption relative to other available neural predictors [5]. Fig. 1 (courtesy of “An Optimized Scaled Neural Branch Predictor” by Daniel A. Jimenez) shows the comparative performance of notable branch prediction approaches on a set of SPEC CPU 2000 and 2006 integer benchmarks.

    Fig. 1. Performance of different branch predictors over SPEC CPU 2000 and 2006 integer benchmarks (courtesy of “An Optimized Scaled Neural Branch Predictor” by Daniel A. Jimenez)

III. THE ALGORITHM

The branch predictor algorithm has two major parts: i) the prediction algorithm and ii) the train/update algorithm. Before going into the implementation of these two algorithms, we discuss the state and variables they use. The three-dimensional array W is the data structure that stores the weights of the branches; it is used in both the prediction and the update algorithm.
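As a rough illustration of the three-dimensional weight array W, the sketch below models it as a table of one-byte signed weights with saturating updates. The type name WeightTable, the helper names at and nudge, the dimension constants and the saturation value 127 are assumptions chosen for illustration; they are not taken from the project's source.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical dimensions: branch-address entries, path-address entries
// and history positions. Illustrative values, not the author's settings.
constexpr std::size_t N_ADDR = 64;
constexpr std::size_t N_PATH = 16;
constexpr std::size_t N_HIST = 64;
constexpr int SAT_VAL = 127;  // assumed limit for a one-byte signed weight

struct WeightTable {
    // Flat storage for W[N_ADDR][N_PATH][N_HIST], zero-initialized.
    std::array<std::int8_t, N_ADDR * N_PATH * N_HIST> w{};

    // Access W[addr][path][pos] in row-major order.
    std::int8_t& at(std::size_t addr, std::size_t path, std::size_t pos) {
        return w[(addr * N_PATH + path) * N_HIST + pos];
    }

    // Move one weight a single step toward +SAT_VAL (up == true) or
    // -SAT_VAL, stopping at the limit instead of wrapping around.
    void nudge(std::size_t addr, std::size_t path, std::size_t pos, bool up) {
        std::int8_t& x = at(addr, path, pos);
        if (up) {
            if (x < SAT_VAL) ++x;
        } else {
            if (x > -SAT_VAL) --x;
        }
    }
};
```

With one byte per weight, a 64 x 16 x 64 table of this kind occupies 65,536 bytes, so the shape of W largely determines how much of a 64K budget is used.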
    Fig. 2: The array W with its corresponding indices

The branch address is generally taken as the last 8 to 10 bits of the instruction address. For each branch being predicted, the algorithm keeps the history of all other branches that precede it in the dynamic path taken to reach the branch. The second dimension, indexed by the variable GA, keeps track of this per-branch dynamic path history. The third dimension, shown as GHR[i], keeps track of the position of the address GA[i] in the global branch history register, GHR.

Some of the important variables of the algorithm are given here for clarity.

GA: An array of addresses. This array keeps the path history associated with each branch address. As a new branch is executed, its address is shifted into the first position of the array.

GHR: An array of Boolean true/false values. This array keeps track of the taken / not-taken status of the branches.

H: Length of the history register.

Output: An integer value computed by the prediction algorithm to predict the current branch.

Table I: The prediction algorithm

    branch_update *predict (branch_info & b) {
        bi = b;
        if (b.br_flags & BR_CONDITIONAL) {
            // Index: drop the 2 low bits, keep address bits 2..7 (6 bits).
            address = (((b.address >> 4) & 0x0F) << 2) | ((b.address >> 2) & 0x03);
            // Start from the bias weight, then add or subtract one weight
            // per history position depending on that branch's outcome.
            output = W[address][0][0];
            for (int i = 0; i < H; i++) {
                if (GHR[i])
                    output += W[address][GA[i]][i];
                else
                    output -= W[address][GA[i]][i];
            }
            u.direction_prediction (output >= 0);
        } else {
            u.direction_prediction (false);
        }
        u.target_prediction (0);
        return &u;
    }

Table II: The update/train algorithm

    void update (branch_update *u, bool taken, unsigned int target) {
        if (bi.br_flags & BR_CONDITIONAL) {
            // Train only on a misprediction or a low-confidence output.
            if (abs(output) < theta || ((output >= 0) != taken)) {
                // Bias weight: saturating step toward the outcome.
                if (taken) {
                    if (W[address][0][0] < SAT_VAL)
                        W[address][0][0]++;
                } else {
                    if (W[address][0][0] > -SAT_VAL)
                        W[address][0][0]--;
                }
                // History weights: same loop bound H as in predict().
                for (int i = 0; i < H; i++) {
                    if (GHR[i] == taken) {
                        if (W[address][GA[i]][i] < SAT_VAL)
                            W[address][GA[i]][i]++;
                    } else {
                        if (W[address][GA[i]][i] > -SAT_VAL)
                            W[address][GA[i]][i]--;
                    }
                }
            }
            // Record this branch in the path and outcome histories.
            shift_update_GA(address);
            shift_update_GHR(taken);
        }
    }

IV. TUNING PERFORMANCE

Besides the algorithm itself, the MPKI (mispredictions per kilo-instruction) of the predictor depends on the sizes of the dimensions of the array W. I measured MPKI for several shapes of W. Table III shows the results of this experiment.

Table III: MPKI of the piecewise linear algorithm with a limited budget of 64K

    W[i][GA[i]][GHR[i]]      MPKI
    W[64][16][64]            3.982
    W[128][16][32]           4.217
    W[64][8][128]            4.292
    W[32][16][128]           5.807
    W[64][64][16]            4.826

The table shows that the predictor performs best when the dimensions i, GA[i] and GHR[i] have 64, 16 and 64 entries, respectively.

V. TWEAKING INSTRUCTION ADDRESS

I found that, rather than taking the last bits of the address directly, discarding the 2 least significant bits of the address and then taking bits 3 to 8 makes the predictor more accurate. This decreases aliasing and thus improves the prediction rate slightly.
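The index computation described above can be sketched as follows. The function names and the 6-bit index width (matching a 64-entry first dimension of W) are assumptions for illustration. The third function mirrors the shift-and-mask expression used in the prediction routine, which selects the same bits as the simpler second form.

```cpp
#include <cstdint>

// Naive index: the lowest 6 bits of the instruction address. If
// instructions happen to be 4-byte aligned, the two lowest bits are
// always zero and only 16 of the 64 entries are ever used.
constexpr unsigned index_low_bits(std::uint32_t pc) {
    return pc & 0x3Fu;
}

// Tweaked index: drop the 2 least significant bits, then keep the next 6
// (bits 3 to 8 when counting from 1).
constexpr unsigned index_skip_two(std::uint32_t pc) {
    return (pc >> 2) & 0x3Fu;
}

// The same selection written as two fields: bits 4..7 form the upper
// four index bits and bits 2..3 the lower two.
constexpr unsigned index_two_fields(std::uint32_t pc) {
    return (((pc >> 4) & 0x0Fu) << 2) | ((pc >> 2) & 0x03u);
}
```

For example, the addresses 0x1000 and 0x1040 both map to entry 0 under index_low_bits, while index_skip_two maps them to entries 0 and 16, so the two branches no longer share weights.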
    Fig. 3: Tweaking the branch address for a performance improvement.

VI. RESULT

The misprediction rate of each benchmark under the piecewise linear algorithm is shown in Fig. 4. Fig. 5 compares different prediction algorithms (piecewise linear, perceptron and gshare) on the given benchmarks.

    [Figure: bar chart, vertical axis from 0 to 14; benchmarks: 164.gzip, 175.vpr, 176.gcc, 181.mcf, 186.crafty, 197.parser, 201.compress, 202.jess, 205.raytrace, 209.db, 213.javac, 222.mpegaudio, 227.mtrt, 228.jack, 252.eon, 253.perlbmk, 254.gap, 255.vortex, 256.bzip2, 300.twolf]

    Fig. 4: Misprediction rate of different benchmarks using the piecewise linear prediction algorithm

    Fig. 5: Comparison of prediction algorithms against different benchmarks on the given 64K budget.

VII. 64K BUDGET CALCULATION

I have limited the implementation of the piecewise linear prediction algorithm to 64K + 256 bytes of memory. The algorithm performs better as the memory limit increases. Table IV shows the calculation of the 64K + 256 byte budget.

Table IV: 64K (65,536 byte) memory budget limit calculation

    Data structure / array / variable         Memory calculation
    W[64][16][63], 1 byte per entry           64,512 bytes
    Constants (SIZE, H, SAT_VAL, theta, N)    5 * 1 byte (each value < 128)
    GA[63] * 6 bits / 8                       48 bytes
    GHR[63] * 1 bit / 8                       8 bytes
    Variables (address, output) * 4 bytes     8 bytes
    Total                                     64,581 bytes

VIII. CONCLUSION

In this individual course final project, I have implemented the piecewise linear branch prediction algorithm. My implementation achieved an MPKI of 3.988 at best. I think it is possible to improve the performance of this algorithm further with better implementation tricks. I have also compared the performance of the piecewise linear prediction algorithm with the perceptron and gshare algorithms. With the same memory limit, piecewise linear prediction performs significantly better than the other two.

REFERENCES

[1] Daniel A. Jimenez. Piecewise linear branch prediction. In Proceedings of the 32nd Annual International Symposium on Computer Architecture (ISCA-32), June 2005.

[2] D. Jimenez and C. Lin. Dynamic branch prediction with perceptrons. In Proceedings of the Seventh International Symposium on High Performance Computer Architecture, January 2001.

[3] Lakshminarayanan, Arun; Shriraghavan, Sowmya, “Neural Branch Prediction”, available at http://webspace.ulbsibiu.ro/lucian.vintan/html/neuralpredictors.pdf

[4] D.A. Jimenez, “Fast Path-Based Neural Branch Prediction,” Proc. 36th Ann. Int’l Symp. Microarchitecture, pp. 243-252, Dec. 2003.

[5] D.A. Jimenez, “An optimized scaled neural branch predictor,” Computer Design (ICCD), 2011 IEEE 29th International Conference, pp. 113-118, Oct. 2011.
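The arithmetic of the memory budget calculation in Section VII can be cross-checked at compile time with a short sketch. The constant names and the helper bits_to_bytes are hypothetical; the sizes are the ones listed in the budget table, with bit counts rounded up to whole bytes.

```cpp
#include <cstddef>

// Round a bit count up to whole bytes.
constexpr std::size_t bits_to_bytes(std::size_t bits) {
    return (bits + 7) / 8;
}

constexpr std::size_t W_BYTES     = 64 * 16 * 63 * 1;       // one byte per weight
constexpr std::size_t CONST_BYTES = 5 * 1;                  // SIZE, H, SAT_VAL, theta, N
constexpr std::size_t GA_BYTES    = bits_to_bytes(63 * 6);  // 63 path entries, 6 bits each
constexpr std::size_t GHR_BYTES   = bits_to_bytes(63 * 1);  // 63 history bits
constexpr std::size_t VAR_BYTES   = 2 * 4;                  // address and output, 4 bytes each

constexpr std::size_t TOTAL_BYTES =
    W_BYTES + CONST_BYTES + GA_BYTES + GHR_BYTES + VAR_BYTES;
constexpr std::size_t BUDGET_BYTES = 64 * 1024 + 256;       // 64K + 256 bytes

static_assert(TOTAL_BYTES == 64581, "matches the total in the budget table");
static_assert(TOTAL_BYTES <= BUDGET_BYTES, "fits in 64K + 256 bytes");
```

Under these figures the state leaves 1,211 bytes of headroom below the 65,792-byte limit, and the weight array W accounts for almost all of the usage.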

Mais conteúdo relacionado

Mais procurados

Clustering techniques
Clustering techniquesClustering techniques
Clustering techniquestalktoharry
 
FINITE DIFFERENCE MODELLING FOR HEAT TRANSFER PROBLEMS
FINITE DIFFERENCE MODELLING FOR HEAT TRANSFER PROBLEMSFINITE DIFFERENCE MODELLING FOR HEAT TRANSFER PROBLEMS
FINITE DIFFERENCE MODELLING FOR HEAT TRANSFER PROBLEMSroymeister007
 
Analysing and combining partial problem solutions for properly informed heuri...
Analysing and combining partial problem solutions for properly informed heuri...Analysing and combining partial problem solutions for properly informed heuri...
Analysing and combining partial problem solutions for properly informed heuri...Alexander Decker
 
Parallel kmeans clustering in Erlang
Parallel kmeans clustering in ErlangParallel kmeans clustering in Erlang
Parallel kmeans clustering in ErlangChinmay Patel
 
"FingerPrint Recognition Using Principle Component Analysis(PCA)”
"FingerPrint Recognition Using Principle Component Analysis(PCA)”"FingerPrint Recognition Using Principle Component Analysis(PCA)”
"FingerPrint Recognition Using Principle Component Analysis(PCA)”Er. Arpit Sharma
 
Alternating direction-implicit-finite-difference-method-for-transient-2 d-hea...
Alternating direction-implicit-finite-difference-method-for-transient-2 d-hea...Alternating direction-implicit-finite-difference-method-for-transient-2 d-hea...
Alternating direction-implicit-finite-difference-method-for-transient-2 d-hea...Abimbola Ashaju
 
Numerical methods for 2 d heat transfer
Numerical methods for 2 d heat transferNumerical methods for 2 d heat transfer
Numerical methods for 2 d heat transferArun Sarasan
 
On selection of periodic kernels parameters in time series prediction
On selection of periodic kernels parameters in time series predictionOn selection of periodic kernels parameters in time series prediction
On selection of periodic kernels parameters in time series predictioncsandit
 
Cluster analysis using k-means method in R
Cluster analysis using k-means method in RCluster analysis using k-means method in R
Cluster analysis using k-means method in RVladimir Bakhrushin
 
Data scientist training in bangalore
Data scientist training in bangaloreData scientist training in bangalore
Data scientist training in bangaloreprathyusha1234
 

Mais procurados (14)

Clustering techniques
Clustering techniquesClustering techniques
Clustering techniques
 
FINITE DIFFERENCE MODELLING FOR HEAT TRANSFER PROBLEMS
FINITE DIFFERENCE MODELLING FOR HEAT TRANSFER PROBLEMSFINITE DIFFERENCE MODELLING FOR HEAT TRANSFER PROBLEMS
FINITE DIFFERENCE MODELLING FOR HEAT TRANSFER PROBLEMS
 
Analysing and combining partial problem solutions for properly informed heuri...
Analysing and combining partial problem solutions for properly informed heuri...Analysing and combining partial problem solutions for properly informed heuri...
Analysing and combining partial problem solutions for properly informed heuri...
 
Clustering: A Survey
Clustering: A SurveyClustering: A Survey
Clustering: A Survey
 
Parallel kmeans clustering in Erlang
Parallel kmeans clustering in ErlangParallel kmeans clustering in Erlang
Parallel kmeans clustering in Erlang
 
"FingerPrint Recognition Using Principle Component Analysis(PCA)”
"FingerPrint Recognition Using Principle Component Analysis(PCA)”"FingerPrint Recognition Using Principle Component Analysis(PCA)”
"FingerPrint Recognition Using Principle Component Analysis(PCA)”
 
Alternating direction-implicit-finite-difference-method-for-transient-2 d-hea...
Alternating direction-implicit-finite-difference-method-for-transient-2 d-hea...Alternating direction-implicit-finite-difference-method-for-transient-2 d-hea...
Alternating direction-implicit-finite-difference-method-for-transient-2 d-hea...
 
K mean-clustering
K mean-clusteringK mean-clustering
K mean-clustering
 
Numerical methods for 2 d heat transfer
Numerical methods for 2 d heat transferNumerical methods for 2 d heat transfer
Numerical methods for 2 d heat transfer
 
On selection of periodic kernels parameters in time series prediction
On selection of periodic kernels parameters in time series predictionOn selection of periodic kernels parameters in time series prediction
On selection of periodic kernels parameters in time series prediction
 
Data miningpresentation
Data miningpresentationData miningpresentation
Data miningpresentation
 
Icmtea
IcmteaIcmtea
Icmtea
 
Cluster analysis using k-means method in R
Cluster analysis using k-means method in RCluster analysis using k-means method in R
Cluster analysis using k-means method in R
 
Data scientist training in bangalore
Data scientist training in bangaloreData scientist training in bangalore
Data scientist training in bangalore
 

Destaque (6)

Cyber Security Exam 2
Cyber Security Exam 2Cyber Security Exam 2
Cyber Security Exam 2
 
Recitation
RecitationRecitation
Recitation
 
Recitation
RecitationRecitation
Recitation
 
Transcription Factor DNA Binding Prediction
Transcription Factor DNA Binding PredictionTranscription Factor DNA Binding Prediction
Transcription Factor DNA Binding Prediction
 
Ksi
KsiKsi
Ksi
 
Attribute Based Encryption
Attribute Based EncryptionAttribute Based Encryption
Attribute Based Encryption
 

Semelhante a Branch prediction contest_report

A Fast Near Optimal Vertex Cover Algorithm (NOVCA)
A Fast Near Optimal Vertex Cover Algorithm (NOVCA)A Fast Near Optimal Vertex Cover Algorithm (NOVCA)
A Fast Near Optimal Vertex Cover Algorithm (NOVCA)Waqas Tariq
 
EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171Yaxin Liu
 
Expert system design for elastic scattering neutrons optical model using bpnn
Expert system design for elastic scattering neutrons optical model using bpnnExpert system design for elastic scattering neutrons optical model using bpnn
Expert system design for elastic scattering neutrons optical model using bpnnijcsa
 
Fast Object Recognition from 3D Depth Data with Extreme Learning Machine
Fast Object Recognition from 3D Depth Data with Extreme Learning MachineFast Object Recognition from 3D Depth Data with Extreme Learning Machine
Fast Object Recognition from 3D Depth Data with Extreme Learning MachineSoma Boubou
 
Learning Convolutional Neural Networks for Graphs
Learning Convolutional Neural Networks for GraphsLearning Convolutional Neural Networks for Graphs
Learning Convolutional Neural Networks for Graphspione30
 
Wiener Filter Hardware Realization
Wiener Filter Hardware RealizationWiener Filter Hardware Realization
Wiener Filter Hardware RealizationSayan Chaudhuri
 
Instance Based Learning in Machine Learning
Instance Based Learning in Machine LearningInstance Based Learning in Machine Learning
Instance Based Learning in Machine LearningPavithra Thippanaik
 
Conference_paper.pdf
Conference_paper.pdfConference_paper.pdf
Conference_paper.pdfNarenRajVivek
 
Landmark Retrieval & Recognition
Landmark Retrieval & RecognitionLandmark Retrieval & Recognition
Landmark Retrieval & Recognitionkenluck2001
 
From RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphsFrom RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphstuxette
 
Machine Status Prediction for Dynamic and Heterogenous Cloud Environment
Machine Status Prediction for Dynamic and Heterogenous Cloud EnvironmentMachine Status Prediction for Dynamic and Heterogenous Cloud Environment
Machine Status Prediction for Dynamic and Heterogenous Cloud Environmentjins0618
 
Number of sources estimation using a hybrid algorithm for smart antenna
Number of sources estimation using a hybrid algorithm for  smart antennaNumber of sources estimation using a hybrid algorithm for  smart antenna
Number of sources estimation using a hybrid algorithm for smart antennaIJECEIAES
 
Instance based learning
Instance based learningInstance based learning
Instance based learningswapnac12
 
Disease Classification using ECG Signal Based on PCA Feature along with GA & ...
Disease Classification using ECG Signal Based on PCA Feature along with GA & ...Disease Classification using ECG Signal Based on PCA Feature along with GA & ...
Disease Classification using ECG Signal Based on PCA Feature along with GA & ...IRJET Journal
 

Semelhante a Branch prediction contest_report (20)

A Fast Near Optimal Vertex Cover Algorithm (NOVCA)
A Fast Near Optimal Vertex Cover Algorithm (NOVCA)A Fast Near Optimal Vertex Cover Algorithm (NOVCA)
A Fast Near Optimal Vertex Cover Algorithm (NOVCA)
 
EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171EE660_Report_YaxinLiu_8448347171
EE660_Report_YaxinLiu_8448347171
 
Expert system design for elastic scattering neutrons optical model using bpnn
Expert system design for elastic scattering neutrons optical model using bpnnExpert system design for elastic scattering neutrons optical model using bpnn
Expert system design for elastic scattering neutrons optical model using bpnn
 
Fast Object Recognition from 3D Depth Data with Extreme Learning Machine
Fast Object Recognition from 3D Depth Data with Extreme Learning MachineFast Object Recognition from 3D Depth Data with Extreme Learning Machine
Fast Object Recognition from 3D Depth Data with Extreme Learning Machine
 
Learning Convolutional Neural Networks for Graphs
Learning Convolutional Neural Networks for GraphsLearning Convolutional Neural Networks for Graphs
Learning Convolutional Neural Networks for Graphs
 
Wiener Filter Hardware Realization
Wiener Filter Hardware RealizationWiener Filter Hardware Realization
Wiener Filter Hardware Realization
 
Instance Based Learning in Machine Learning
Instance Based Learning in Machine LearningInstance Based Learning in Machine Learning
Instance Based Learning in Machine Learning
 
Lec09 nbody-optimization
Lec09 nbody-optimizationLec09 nbody-optimization
Lec09 nbody-optimization
 
Conference_paper.pdf
Conference_paper.pdfConference_paper.pdf
Conference_paper.pdf
 
Hs3613611366
Hs3613611366Hs3613611366
Hs3613611366
 
Hs3613611366
Hs3613611366Hs3613611366
Hs3613611366
 
Bj4103381384
Bj4103381384Bj4103381384
Bj4103381384
 
Landmark Retrieval & Recognition
Landmark Retrieval & RecognitionLandmark Retrieval & Recognition
Landmark Retrieval & Recognition
 
From RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphsFrom RNN to neural networks for cyclic undirected graphs
From RNN to neural networks for cyclic undirected graphs
 
Machine Status Prediction for Dynamic and Heterogenous Cloud Environment
Machine Status Prediction for Dynamic and Heterogenous Cloud EnvironmentMachine Status Prediction for Dynamic and Heterogenous Cloud Environment
Machine Status Prediction for Dynamic and Heterogenous Cloud Environment
 
Data analysis of weather forecasting
Data analysis of weather forecastingData analysis of weather forecasting
Data analysis of weather forecasting
 
Number of sources estimation using a hybrid algorithm for smart antenna
Number of sources estimation using a hybrid algorithm for  smart antennaNumber of sources estimation using a hybrid algorithm for  smart antenna
Number of sources estimation using a hybrid algorithm for smart antenna
 
Instance based learning
Instance based learningInstance based learning
Instance based learning
 
ECE 565 presentation
ECE 565 presentationECE 565 presentation
ECE 565 presentation
 
Disease Classification using ECG Signal Based on PCA Feature along with GA & ...
Disease Classification using ECG Signal Based on PCA Feature along with GA & ...Disease Classification using ECG Signal Based on PCA Feature along with GA & ...
Disease Classification using ECG Signal Based on PCA Feature along with GA & ...
 

Mais de UT, San Antonio

digital certificate - types and formats
digital certificate - types and formatsdigital certificate - types and formats
digital certificate - types and formatsUT, San Antonio
 
Static Analysis with Sonarlint
Static Analysis with SonarlintStatic Analysis with Sonarlint
Static Analysis with SonarlintUT, San Antonio
 
Shellshock- from bug towards vulnerability
Shellshock- from bug towards vulnerabilityShellshock- from bug towards vulnerability
Shellshock- from bug towards vulnerabilityUT, San Antonio
 
Big Data Processing: Performance Gain Through In-Memory Computation
Big Data Processing: Performance Gain Through In-Memory ComputationBig Data Processing: Performance Gain Through In-Memory Computation
Big Data Processing: Performance Gain Through In-Memory ComputationUT, San Antonio
 
Enumerated authorization policy ABAC (EP-ABAC) model
Enumerated authorization policy ABAC (EP-ABAC) modelEnumerated authorization policy ABAC (EP-ABAC) model
Enumerated authorization policy ABAC (EP-ABAC) modelUT, San Antonio
 
Where is my Privacy presentation slideshow (one page only)
Where is my Privacy presentation slideshow (one page only)Where is my Privacy presentation slideshow (one page only)
Where is my Privacy presentation slideshow (one page only)UT, San Antonio
 
Security_of_openstack_keystone
Security_of_openstack_keystoneSecurity_of_openstack_keystone
Security_of_openstack_keystoneUT, San Antonio
 
Research seminar group_1_prosunjit
Research seminar group_1_prosunjitResearch seminar group_1_prosunjit
Research seminar group_1_prosunjitUT, San Antonio
 
Final Project Transciption Factor DNA binding Prediction
Final Project Transciption Factor DNA binding Prediction Final Project Transciption Factor DNA binding Prediction
Final Project Transciption Factor DNA binding Prediction UT, San Antonio
 
Transcription Factor DNA Binding Prediction
Transcription Factor DNA Binding PredictionTranscription Factor DNA Binding Prediction
Transcription Factor DNA Binding PredictionUT, San Antonio
 
On the incoherencies in web browser access control
On the incoherencies in web browser access controlOn the incoherencies in web browser access control
On the incoherencies in web browser access controlUT, San Antonio
 

Mais de UT, San Antonio (20)

digital certificate - types and formats
digital certificate - types and formatsdigital certificate - types and formats
digital certificate - types and formats
 
Saml metadata
Saml metadataSaml metadata
Saml metadata
 
Static Analysis with Sonarlint
Static Analysis with SonarlintStatic Analysis with Sonarlint
Static Analysis with Sonarlint
 
Shellshock- from bug towards vulnerability
Shellshock- from bug towards vulnerabilityShellshock- from bug towards vulnerability
Shellshock- from bug towards vulnerability
 
Abac17 prosun-slides
Abac17 prosun-slidesAbac17 prosun-slides
Abac17 prosun-slides
 
Abac17 prosun-slides
Abac17 prosun-slidesAbac17 prosun-slides
Abac17 prosun-slides
 
Big Data Processing: Performance Gain Through In-Memory Computation
Big Data Processing: Performance Gain Through In-Memory ComputationBig Data Processing: Performance Gain Through In-Memory Computation
Big Data Processing: Performance Gain Through In-Memory Computation
 
Enumerated authorization policy ABAC (EP-ABAC) model
Enumerated authorization policy ABAC (EP-ABAC) modelEnumerated authorization policy ABAC (EP-ABAC) model
Enumerated authorization policy ABAC (EP-ABAC) model
 
Where is my Privacy presentation slideshow (one page only)
Where is my Privacy presentation slideshow (one page only)Where is my Privacy presentation slideshow (one page only)
Where is my Privacy presentation slideshow (one page only)
 
Three month course
Three month courseThree month course
Three month course
 
One month-syllabus
One month-syllabusOne month-syllabus
One month-syllabus
 
Zerovm backgroud
Zerovm backgroudZerovm backgroud
Zerovm backgroud
 
Security_of_openstack_keystone
Security_of_openstack_keystoneSecurity_of_openstack_keystone
Security_of_openstack_keystone
 
Research seminar group_1_prosunjit
Research seminar group_1_prosunjitResearch seminar group_1_prosunjit
Research seminar group_1_prosunjit
 
Final Project Transciption Factor DNA binding Prediction
Final Project Transciption Factor DNA binding Prediction Final Project Transciption Factor DNA binding Prediction
Final Project Transciption Factor DNA binding Prediction
 
Transcription Factor DNA Binding Prediction
Transcription Factor DNA Binding PredictionTranscription Factor DNA Binding Prediction
Transcription Factor DNA Binding Prediction
 
Secure webbrowsing 1
Secure webbrowsing 1Secure webbrowsing 1
Secure webbrowsing 1
 
On the incoherencies in web browser access control
On the incoherencies in web browser access controlOn the incoherencies in web browser access control
On the incoherencies in web browser access control
 
Cultural conflict
Cultural conflictCultural conflict
Cultural conflict
 
Pair programming
Pair programmingPair programming
Pair programming
 

Último

Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 

Último (20)

Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 

Branch prediction contest_report

Branch Prediction Contest: Implementation of Piecewise Linear Prediction Algorithm

Prosunjit Biswas
Department of Computer Science, University of Texas at San Antonio

Abstract

Branch predictor accuracy is very important for harnessing the parallelism available in ILP and thus improving the performance of today's microprocessors, especially superscalar processors. Among branch predictors, neural branch predictors such as the Scaled Neural Analog Predictor (SNAP) and the Piecewise Linear Branch Predictor outperform other state-of-the-art predictors. In this final project for the Computer Architecture course (CS-5513), I studied various neural predictors and implemented the Piecewise Linear Branch Predictor following the algorithm given in a research paper by Dr. Daniel A. Jimenez. The hardware budget for this project is restricted, and I implemented the predictor within a predefined hardware budget of 64K of memory. I am also competing in the branch prediction contest.

Keywords: Piecewise Linear, Neural Network, Branch Prediction.

I. INTRODUCTION

Neural branch predictors are the most accurate predictors in the literature, but they were impractical because of the high latency associated with prediction. This latency is due to the complex computation that must be carried out to determine the excitation of an artificial neuron [3]. Piecewise Linear Branch Prediction [1] improved both accuracy and latency over previous neural predictors. This predictor works by developing a set of linear functions, one for each program path to the branch to be predicted, that separate predicted taken from predicted not taken. In the paper Piecewise Linear Branch Prediction, Daniel A. Jimenez proposed two versions of the prediction algorithm: i) the Idealized Piecewise Linear Branch Predictor and ii) a Practical Piecewise Linear Branch Predictor. In this project, I have focused on the idealized predictor.

II. RELATED WORKS

Perceptron prediction is one of the first attempts in branch prediction history to perform branch prediction with a neural network. This predictor improved the misprediction rate on a composite trace of SPEC2000 benchmarks by 14.7% [2]. Unfortunately, it was impractical because of its high latency.

Fast Path-Based Neural Branch Prediction [4] is another attempt; it combines path and pattern history to overcome the limitations of earlier neural predictors. It improved accuracy over previous neural predictors and achieved significantly lower latency. This predictor improved the IPC of an aggressively clocked microarchitecture by 16% over the earlier perceptron predictor.

The Scaled Neural Analog Predictor, or SNAP, is another recently proposed neural branch predictor. It uses the concept of piecewise-linear branch prediction and relies on a mixed analog/digital implementation. This predictor reduces latency and power consumption compared with other available neural predictors [5]. Fig. 1 (courtesy of "An Optimized Scaled Neural Branch Predictor" by Daniel A. Jimenez) shows the comparative performance of these branch prediction approaches on a set of SPEC CPU 2000 and 2006 integer benchmarks.

Fig. 1. Performance of different branch predictors over SPEC CPU 2000 and 2006 integer benchmarks (courtesy of "An Optimized Scaled Neural Branch Predictor" by Daniel A. Jimenez)

III. THE ALGORITHM

The branch predictor algorithm has two major parts: i) the prediction algorithm and ii) the train/update algorithm. Before going into the implementation of these two algorithms, we discuss the state and variables they use. The three-dimensional array W is the data structure that stores the weights of the branches; it is used in both the prediction and the update algorithm.
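The state introduced above can be sketched as a small C++ structure. This is only an illustrative sketch, not the report's actual declarations: the names PredictorState, saturating_step and shift_history are hypothetical, the dimensions follow the W[64][16][64] configuration reported in Section IV, and the saturation bound of 127 is an assumption based on the weights being one byte long.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical dimensions, chosen to match the W[64][16][64] configuration.
constexpr int N_BRANCH = 64;   // first dimension: hashed branch address
constexpr int N_PATH   = 16;   // second dimension: hashed path address GA[i]
constexpr int H        = 64;   // third dimension: history position i
constexpr int SAT_VAL  = 127;  // assumed saturation bound for 1-byte weights

struct PredictorState {
    // W[address][GA[i]][i]: one signed 8-bit weight per
    // (branch, path address, history position) triple.
    std::array<std::array<std::array<std::int8_t, H>, N_PATH>, N_BRANCH> W{};
    std::array<std::uint8_t, H> GA{};  // path history: hashed addresses of recent branches
    std::array<bool, H> GHR{};         // global history: taken / not-taken outcomes
};

// Move a weight one step towards +SAT_VAL or -SAT_VAL without overflowing.
inline void saturating_step(std::int8_t &w, bool up) {
    if (up) {
        if (w < SAT_VAL) ++w;
    } else {
        if (w > -SAT_VAL) --w;
    }
    assert(w >= -SAT_VAL && w <= SAT_VAL);
}

// Shift a new (address, outcome) pair into position 0 of both histories.
inline void shift_history(PredictorState &s, unsigned addr, bool taken) {
    for (int i = H - 1; i > 0; --i) {
        s.GA[i]  = s.GA[i - 1];
        s.GHR[i] = s.GHR[i - 1];
    }
    s.GA[0]  = static_cast<std::uint8_t>(addr % N_PATH);
    s.GHR[0] = taken;
}
```

Keeping the weights as signed bytes is what makes the saturating step necessary: without the bound check, repeated training in one direction would wrap around and flip the sign of a strongly biased weight.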
Fig. 2: The array W with its corresponding indices

The branch address is generally taken as the last 8 or 10 bits of the instruction address. For each branch being predicted, the algorithm keeps a history of all other branches that precede this branch in the dynamic path taken to reach it. The second dimension, indexed by the variable GA, keeps track of this per-branch dynamic path history. The third dimension, shown as GHR[i], keeps track of the position of the address GA[i] in the global branch history register, GHR.

Some of the important variables of the algorithm are listed here for clarity.

GA: An array of addresses. This array keeps the path history associated with each branch address. As a new branch is executed, the address of the branch is shifted into the first position of the array.

GHR: An array of Boolean true/false values. This array keeps track of the taken / not-taken status of the branches.

H: Length of the history register.

Output: An integer value generated by the prediction algorithm to predict the current branch.

Table I: The prediction algorithm.

    branch_update *predict (branch_info & b) {
        bi = b;
        if (b.br_flags & BR_CONDITIONAL) {
            // 6-bit index: address bits 4-7 shifted up by two, OR-ed with bits 2-3
            address = (((b.address >> 4) & 0x0F) << 2) | ((b.address >> 2) & 0x03);
            output = W[address][0][0];            // start from the bias weight
            for (int i = 0; i < H; i++) {
                if (GHR[i])
                    output += W[address][GA[i]][i];   // branch i was taken
                else
                    output -= W[address][GA[i]][i];   // branch i was not taken
            }
            u.direction_prediction(output >= 0);
        } else {
            u.direction_prediction(false);
        }
        u.target_prediction(0);
        return &u;
    }

Table II: The update/train algorithm

    void update (branch_update *u, bool taken, unsigned int target) {
        if (bi.br_flags & BR_CONDITIONAL) {
            // train on a misprediction or when the output magnitude is below theta
            if (abs(output) < theta || ((output >= 0) != taken)) {
                if (taken) {
                    if (W[address][0][0] < SAT_VAL)
                        W[address][0][0]++;
                } else {
                    if (W[address][0][0] > (-1) * SAT_VAL)
                        W[address][0][0]--;
                }
                for (int i = 0; i < H-1; i++) {
                    if (GHR[i] == taken) {
                        if (W[address][GA[i]][i] < SAT_VAL)
                            W[address][GA[i]][i]++;
                    } else {
                        if (W[address][GA[i]][i] > (-1) * SAT_VAL+1)
                            W[address][GA[i]][i]--;
                    }
                }
            }
            shift_update_GA(address);
            shift_update_GHR(taken);
        }
    }

IV. TUNING PERFORMANCE

Besides the algorithm itself, the MPKI (Misses Per Kilo Instructions) rate depends on the sizes of the dimensions of the array W. I measured the MPKI for several dimensions of W. The results are shown in Table I.

Table I: MPKI rate of the Piecewise Linear Algorithm with a limited budget of 64K

    Dimensions of W (W[i][GA[i]][GHR[i]])    MPKI
    W[64][16][64]                            3.982
    W[128][16][32]                           4.217
    W[64][8][128]                            4.292
    W[32][16][128]                           5.807
    W[64][64][16]                            4.826

The table shows that the predictor performs best when i, GA[i] and GHR[i] have 64, 16 and 64 entries respectively.

V. TWEAKING INSTRUCTION ADDRESS

I have found that, rather than taking the last bits of the address, discarding the 2 least significant bits of the address and then taking bits 3-8 makes the predictor more accurate. It decreases aliasing and thus improves the prediction rate slightly.
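The index computation of Section V and the predict/update routines of Section III can be exercised outside the contest framework with a small standalone sketch. Everything here is an assumption made for illustration: the struct name Predictor, the reduced history length of 16, and the threshold of 30 are hypothetical and are not the values used in the report. Unlike the routines above, the sketch indexes history positions from 1 so that the bias weight W[address][0][0] never shares a slot with a history weight. The helper branch_index_as_written reproduces the index expression from predict() and shows that it selects the same six bits as a plain shift and mask.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Six-bit table index: drop the 2 low bits, keep the next 6 (bits 2..7).
inline unsigned branch_index(unsigned pc) { return (pc >> 2) & 0x3F; }

// The index expression from predict(), with explicit parentheses.
inline unsigned branch_index_as_written(unsigned pc) {
    return (((pc >> 4) & 0x0F) << 2) | ((pc >> 2) & 0x03);
}

struct Predictor {
    static constexpr int N = 64;      // branch entries (6-bit index)
    static constexpr int P = 16;      // path-address entries
    static constexpr int H = 16;      // history length (reduced, hypothetical)
    static constexpr int SAT = 127;   // assumed saturation bound
    static constexpr int THETA = 30;  // assumed training threshold

    std::int8_t W[N][P][H + 1] = {};  // position 0 is reserved for the bias
    std::uint8_t GA[H + 1] = {};      // path history, positions 1..H
    bool GHR[H + 1] = {};             // outcome history, positions 1..H
    int output = 0;
    unsigned address = 0;

    // Sum the bias and the signed history weights; predict taken if >= 0.
    bool predict(unsigned pc) {
        address = branch_index(pc);
        assert(address < static_cast<unsigned>(N));
        output = W[address][0][0];
        for (int i = 1; i <= H; ++i)
            output += GHR[i] ? W[address][GA[i]][i] : -W[address][GA[i]][i];
        return output >= 0;
    }

    // Train on a misprediction or a weak output, then shift the histories.
    void update(bool taken) {
        if (std::abs(output) < THETA || (output >= 0) != taken) {
            step(W[address][0][0], taken);
            for (int i = 1; i <= H; ++i)
                step(W[address][GA[i]][i], GHR[i] == taken);
        }
        for (int i = H; i > 1; --i) {
            GA[i] = GA[i - 1];
            GHR[i] = GHR[i - 1];
        }
        GA[1] = static_cast<std::uint8_t>(address % P);
        GHR[1] = taken;
    }

    // Saturating increment or decrement of one weight.
    static void step(std::int8_t &w, bool up) {
        if (up) { if (w < SAT) ++w; }
        else    { if (w > -SAT) --w; }
    }
};
```

A typical use is to call predict(pc) for each conditional branch and then update(outcome) once the real direction is known. Because training stops once the output magnitude reaches the threshold and the prediction is correct, weights for a strongly biased branch stay well inside the saturation range.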
Fig. 3: Tweaking the branch address for a performance speed-up.

VI. RESULT

The misprediction rate of each benchmark under the piecewise linear algorithm is shown in Fig. 4. Fig. 5 compares different prediction algorithms (piecewise linear, perceptron and gshare) on the given benchmarks.

Fig. 4: Misprediction rate of different benchmarks using the piecewise linear prediction algorithm. (Benchmarks plotted: 253.perlbmk, 222.mpegaudio, 300.twolf, 205.raytrace, 255.vortex, 227.mtrt, 256.bzip2, 164.gzip, 181.mcf, 197.parser, 201.compress, 209.db, 186.crafty, 176.gcc, 213.javac, 175.vpr, 202.jess, 252.eon, 254.gap, 228.jack; vertical axis from 0 to 14.)

Fig. 5: Comparison of prediction algorithms on different benchmarks with the given 64K budget.

VII. 64K BUDGET CALCULATION

I have limited the implementation of the piecewise linear prediction algorithm to 64K + 256 bytes of memory. The algorithm performs better as I increase the memory limit. Table II shows the calculation for the 64K + 256 byte budget.

Table II: 64K (65,536 byte) memory budget limit calculation

    Data structure / array / variable           Memory calculation
    W[64][16][63], each entry 1 byte long       64,512 byte
    Constants (SIZE, H, SAT_VAL, theta, N)      5 * 1 byte (each value < 128)
    GA[63] * 6 bits / 8                         48 byte
    GHR[63] * 1 bit / 8                         8 byte
    Variables (address, output) * 4 byte        8 byte
    Total                                       64,581 byte

VIII. CONCLUSION

In this individual final course project, I implemented the piecewise linear branch prediction algorithm. My implementation achieved an MPKI of 3.988 at best. I think it is also possible to improve the performance of this algorithm with better implementation tricks. I also compared the performance of the piecewise linear prediction algorithm with the perceptron and gshare algorithms. With the same memory limit, piecewise linear prediction performs significantly better than the other two.

REFERENCES

[1] D. A. Jimenez, "Piecewise linear branch prediction," in Proceedings of the 32nd Annual International Symposium on Computer Architecture (ISCA-32), June 2005.

[2] D. Jimenez and C. Lin, "Dynamic branch prediction with perceptrons," in Proceedings of the Seventh International Symposium on High Performance Computer Architecture, January 2001.

[3] A. Lakshminarayanan and S. Shriraghavan, "Neural Branch Prediction," available at http://webspace.ulbsibiu.ro/lucian.vintan/html/neuralpredictors.pdf

[4] D. A. Jimenez, "Fast Path-Based Neural Branch Prediction," Proc. 36th Ann. Int'l Symp. Microarchitecture, pp. 243-252, Dec. 2003.

[5] D. A. Jimenez, "An optimized scaled neural branch predictor," Computer Design (ICCD), 2011 IEEE 29th International Conference, pp. 113-118, Oct. 2011.