Insight

Understand the Machine
to Write Efficient Code
How many programmers can actually write assembly programs? With the rising popularity
of high-level languages (like Java, VB.Net, etc), there is rarely any need for programmers
to learn assembly or low-level programming. However, there are domains where writing
efficient code is very important, for example, game programming and scientific computing.




When can we write highly efficient code? It is when we understand how the
underlying machine works and make the best use of that knowledge. One
well-known way to write highly efficient code is to write it in assembly.
This has many disadvantages: for example, the programs cannot be ported
easily and the code is difficult to maintain. So, we need to look for
alternative ways to write efficient code.
    We can write code in low-level programming languages like C, and its
efficiency is often comparable to that of the equivalent code written in
assembly. For this reason, C is often referred to as a ‘high-level
assembler’. In this article, we’ll look at various programming constructs
from the perspective of efficiency. We’ll consider a general machine
architecture for illustration, and for specific


12   February 2008   |    LINuX For you   |   www.openITis.com



examples, the x86 architecture will be used. A word of caution before we
proceed: the techniques and issues covered here are not meant for
general-purpose programming.

Basic types
The machine, in general, understands only three types of values: addresses,
integers and floating-point values. Here is the correspondence between the
types that the machine understands and what C supports. Addresses
correspond to the pointer construct; integers, both signed and unsigned,
correspond to short, int, long, long long, char, etc. (yes, char is an
integer type, and a char value is promoted to int in expressions!);
floating-point types correspond to float, double, long double, etc.
    The most efficient data type that a processor handles is a ‘word’,
which typically corresponds to the ‘int’ type in C. For floating-point
types, computation is often done in a larger floating-point type. For
example, on x86 machines that use the x87 floating-point unit, all
floating-point computation is done in ‘extended precision’, which is 80
bits in size and usually corresponds to ‘long double’ in C. If float
expressions are used in the code, the operands are internally converted to
extended precision by the processor and the results are converted back to
float (which occupies 32 bits). The processor does these computations in a
separate ‘co-processor’ unit.
    Address computation (such as array index access) is done using
pointers in C. Pointer operations correspond directly to memory access
operations in the underlying machine (such as indexed addressing, in this
case).
    The ‘int’ type is the most efficient for computations. Unsigned types
and operations on them are as efficient as their signed counterparts.
Floating-point types and operations are slow compared to integral types.
For example, in an imaginary processor, if integer division takes four
cycles, a floating-point division might take 50 to 100 cycles. Memory
access operations are very slow: if the desired memory location is not in
cache, it might take hundreds of cycles to fetch the data from main
memory.

Operators
C supports a rich set of operators. Some operators are directly supported
by the processor and a few are simulated in software.
    For integral types, bit-manipulation operators are generally at least
as fast as, and often faster than, other operators such as arithmetic,
logical or relational operators. One way to write efficient code is to use
bitwise operations instead of slower operations. Here is a well-known
example: using the ‘>>’ operator to shift a non-negative integer right by
one bit is more efficient than dividing it by 2. We’ll look at a different
example for illustration here.
    A typical code segment for toggling a character’s case uses relational
operators, as in:

    // precondition: the char ch provided is in range ['a'-'z'] or ['A'-'Z']
    char toggle_ascii_char_case(char ch) {
        if ((ch >= 'a') && (ch <= 'z'))         // lower case
            ch = ch - 0x20;
        else if ((ch >= 'A') && (ch <= 'Z'))    // upper case
            ch = ch + 0x20;
        return ch;
    }

    The code works on the following assumption: the given char ch is
within the range [a-z] or [A-Z]. If the char is in [A-Z], it returns the
corresponding char in [a-z] and vice versa; that is, it toggles the case of
the character. The value 0x20 is added or subtracted because the upper-case
and lower-case forms of each letter are separated by 0x20 in the ASCII
table.
    But this function is slow. If this is a library function used for
toggling the case of characters in a string, then it is not a good
implementation.
    The code can be improved as follows. The comparison operators are not
required, since the function precondition says that the ch passed to the
function is in the given range [‘a’-‘z’] or [‘A’-‘Z’]; following C
tradition, we need not check that the given char is in fact in this range.
Also, 0x20 is a single bit that is set in every lower-case ASCII letter and
clear in every upper-case one, so instead of performing either the ‘-’ or
the ‘+’ arithmetic operation we can toggle that bit using the exclusive-or
(XOR) operator. With this, the code becomes efficient and simple:

    // precondition: the char ch provided is in range ['a'-'z'] or ['A'-'Z']
    char toggle_ascii_char_case(char ch)
    {
        return (char)(ch ^ 0x20);   // flip the case bit
    }

    This example is just for illustration purposes. Using bit-wise
operators can obscure the code, but it often improves the efficiency of
the code.

Control flow
C has various conditional and looping constructs. A C compiler transforms
such constructs into branching (also known as ‘jump’) instructions. So,
goto is the construct that maps most directly onto the machine. It is
possible to take any C program and create an equivalent program by
replacing all conditional and looping constructs with conditional goto
statements. Though everything ultimately becomes branching instructions,
there are subtle differences between constructs when it comes to
efficiency.
    Which one is more efficient: nested if conditions or a switch
statement? In general, a switch is more efficient than nested if
statements. If both are implemented using branching, why is switch more
efficient than nested if conditions? Recall that, in a switch statement,
all cases are constants. So, a
compiler can transform a switch statement into a range check or a look-up
table, which is more efficient than a long list of jumps.
    Note that executing jump instructions is not costly in itself, but
unpredictable jumps can result in considerably slower execution. For
example, mispredicted jumps can result in the flushing of the pipeline.
Similarly, processors typically look ahead in the instruction stream,
pre-fetch the memory that will be needed and put it in cache. So,
unpredictable jumps can result in cache misses, which waste hundreds of
cycles because the processor has to wait for the value to arrive from
memory before it can continue execution. In general, a program with fewer
branches is faster than one with a large number of branches.

Memory access
As said earlier, memory access is a costly operation. Let us take a
specific example to illustrate this. Typically, it takes the same time to
access global data or local (stack-allocated) data. However, it is
preferable to use local data instead of global data because of the
well-known ‘principle of locality’. When a memory location is accessed,
the processor doesn’t fetch the value in just that location; it fetches a
block of adjacent values as well, since the program is likely to access
variables located near that memory location. Fetching a block of data and
putting it in cache is not time consuming, but a cache miss can leave the
processor waiting for hundreds of cycles for the memory access to
complete. If there are a large number of global variables and accesses to
them are spread throughout the program, program execution becomes
considerably slower.
    Let us consider another (well-known) example to illustrate this
important ‘principle of locality’. Of the following two loops, which one
is more efficient?

    // first loop: the first index varies fastest
    for (i = 0; i < 50; i++)
        for (j = 0; j < 50; j++)
            for (k = 0; k < 50; k++)
                printf("%d ", a[k][j][i]);

    // second loop: the last index varies fastest
    for (i = 0; i < 50; i++)
        for (j = 0; j < 50; j++)
            for (k = 0; k < 50; k++)
                printf("%d ", a[i][j][k]);

    C stores arrays in row-major order; that is, the elements are laid out
in memory with the last index varying fastest. The second loop is more
efficient because it accesses memory locations sequentially, in row-major
order. The processor fetches memory blocks into cache, and since the
memory access is also sequential, this is efficient. In the first loop,
however, the memory access is not sequential, so there might be many cache
misses and it will be considerably slower than the second loop.
    From these examples we learn that cache misses are costly and that we
need to minimise them to write efficient code.

Compound types
C supports compound types like structs, unions and bit-fields that are
implemented in terms of other primitive or compound types. The hardware
does not understand compound types; all processing is done on primitive
types only. Many aspects of using compound types, such as padding and
alignment, can affect performance. Here, we’ll look at bit-fields to
understand how they are supported by the hardware.
    In C, we can manipulate and access bits using bit-fields. It is a
compile-time error to attempt to take the address of a bit-field member.
Why? The granularity of addressing in modern computers is the byte, and it
is not possible to take the address of specific bits within a byte. Though
bit-fields are space-efficient, they are not time-efficient, since
individual bits cannot be accessed directly; the compiler emits code to
access whole ‘words’ and then does bit-manipulation to extract or update
individual bit-field member values.
    The following bit-field struct represents the time of day in HH:MM:SS
format. For the hour, the range is 0-23 and for minutes and seconds, the
range is 0-59. We can use the following struct:

    struct time {
        unsigned int hour   : 5;   // 0-23 fits in 5 bits
        unsigned int minute : 6;   // 0-59 fits in 6 bits
        unsigned int second : 6;   // 0-59 fits in 6 bits
    } tm1, tm2;

    To access tm1.minute, the compiler has to generate code to access the
word (4 bytes), do bit-manipulation and extract only the six bits that
hold the minute (bits 5 to 10, counting from 0, in a typical layout; the
exact layout is implementation-defined), which is slow. So, avoid using
bit-fields extensively if performance is important for your software. A
better option in this case is to use a struct (without bit-fields) with a
byte each for the hour, minute and second.
    In this article, we explored some fundamental issues to understand how
various programming constructs can affect the efficiency of programs.
There are many other issues, such as unaligned memory access and the cost
of I/O operations, that are not covered in this article. This article is
just a starting point to understand such problems. If you are interested,
you can read books on assembly language, computer architecture and
compiler optimisation to get a better understanding of the issues related
to writing efficient programs.

By: S G Ganesh. The author is a research engineer at Siemens (Corporate
Technology), Bangalore. His latest book is ‘60 Tips on Object Oriented
Programming’, published by Tata McGraw-Hill in December 2007. You can
reach him at sgganesh@gmail.com
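
To make the Operators discussion concrete, here is a minimal sketch that puts the relational and the XOR case-toggling versions side by side and applies the XOR version to a whole string. The names toggle_with_compare, toggle_with_xor, is_ascii_letter and toggle_string_case are hypothetical, chosen only for this illustration, and the code assumes an ASCII character set.

```c
#include <assert.h>

/* Hypothetical helper: non-zero if ch is an ASCII letter. */
int is_ascii_letter(char ch)
{
    return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
}

/* Relational version: up to four comparisons and a branch per call. */
char toggle_with_compare(char ch)
{
    if (ch >= 'a' && ch <= 'z')
        return (char)(ch - 0x20);   /* lower case to upper case */
    if (ch >= 'A' && ch <= 'Z')
        return (char)(ch + 0x20);   /* upper case to lower case */
    return ch;
}

/* XOR version: flips bit 5 (0x20), the only bit that differs between the
 * upper-case and lower-case forms of an ASCII letter. The precondition is
 * checked only in debug builds; building with -DNDEBUG removes the check. */
char toggle_with_xor(char ch)
{
    assert(is_ascii_letter(ch));
    return (char)(ch ^ 0x20);
}

/* Toggles the case of every letter in s in place, skipping other
 * characters, and returns the number of letters toggled. */
int toggle_string_case(char *s)
{
    int toggled = 0;
    for (; *s != '\0'; ++s) {
        if (is_ascii_letter(*s)) {
            *s = toggle_with_xor(*s);
            ++toggled;
        }
    }
    return toggled;
}
```

Note that toggle_string_case has to test each character before calling the XOR version, because a string may contain non-letters. The saving described in the article applies only where the precondition is already known to hold.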

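The Control flow section says that a compiler can turn a switch over constant cases into a range check or a look-up table. The sketch below shows what that transformation could look like if written by hand. The function names and the days-in-month example are assumptions made for illustration, and whether a given compiler actually emits a table for this switch depends on the compiler and its optimisation settings.

```c
#include <assert.h>

/* Dense constant case labels (1..12). Because every label is a compile-time
 * constant, a compiler may implement this switch as a range check plus an
 * indexed table rather than a chain of comparisons. */
int days_in_month_switch(int month)
{
    switch (month) {
    case 1: case 3: case 5: case 7: case 8: case 10: case 12:
        return 31;
    case 4: case 6: case 9: case 11:
        return 30;
    case 2:
        return 28;      /* leap years are ignored in this sketch */
    default:
        return -1;      /* not a valid month */
    }
}

/* The same mapping written by hand as a look-up table:
 * one range check, then one indexed load. */
int days_in_month_table(int month)
{
    static const int days[12] = {
        31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
    };
    if (month < 1 || month > 12)
        return -1;
    return days[month - 1];
}

/* Asserts that both versions agree for every month value in [lo, hi]. */
void check_versions_agree(int lo, int hi)
{
    for (int m = lo; m <= hi; ++m)
        assert(days_in_month_switch(m) == days_in_month_table(m));
}
```

A chain of nested if statements for the same mapping would need up to twelve comparisons in the worst case, whereas the table version needs the same two comparisons for every input.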

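For the two loops in the Memory access section, the following sketch measures how far apart consecutive accesses are in each loop order. It assumes an int array with the same 50 x 50 x 50 shape as in the article. The helper names (element_offset, fill_array, sum_sequential, sum_strided) are hypothetical. It sums the elements instead of printing them so that the two traversals can be compared.

```c
#include <assert.h>
#include <stddef.h>

#define N 50

int a[N][N][N];   /* same shape as the array in the two loops */

/* Position of a[i][j][k], counted in elements from the start of the array. */
size_t element_offset(int i, int j, int k)
{
    assert(i >= 0 && i < N && j >= 0 && j < N && k >= 0 && k < N);
    return (size_t)((const char *)&a[i][j][k] - (const char *)a) / sizeof(int);
}

/* Gives every element a known value so the two traversals can be compared. */
void fill_array(void)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                a[i][j][k] = i + j + k;
}

/* Second loop from the article: the last index varies fastest, so
 * consecutive iterations touch adjacent elements. */
long sum_sequential(void)
{
    long sum = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                sum += a[i][j][k];
    return sum;
}

/* First loop from the article: the first index varies fastest, so
 * consecutive iterations are N * N elements apart. */
long sum_strided(void)
{
    long sum = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                sum += a[k][j][i];
    return sum;
}
```

Both functions return the same sum, but element_offset shows that a step in the last index moves 1 element while a step in the first index moves 2,500 elements (10,000 bytes if int is 4 bytes). A jump of that size will usually land outside the cache line that was just fetched.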



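The Compound types section describes the shift-and-mask work that bit-field access implies. The sketch below writes that work out by hand next to the byte-per-member alternative. It assumes a hypothetical packing with the hour in bits 0-4, the minute in bits 5-10 and the second in bits 11-16. The layout a compiler really chooses for a bit-field struct is implementation-defined, and the names used here are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Bit-field layout from the article: 5 + 6 + 6 = 17 bits. */
struct time_bits {
    unsigned int hour   : 5;
    unsigned int minute : 6;
    unsigned int second : 6;
};

/* Byte-per-member alternative suggested in the article:
 * each member can be loaded or stored directly. */
struct time_bytes {
    unsigned char hour;
    unsigned char minute;
    unsigned char second;
};

/* Packs the three values into one word using the assumed layout. */
uint32_t pack_time(unsigned hour, unsigned minute, unsigned second)
{
    assert(hour < 24 && minute < 60 && second < 60);
    return (uint32_t)hour
         | ((uint32_t)minute << 5)
         | ((uint32_t)second << 11);
}

/* Each read needs a shift and a mask on the whole word. */
unsigned unpack_hour(uint32_t word)   { return word & 0x1Fu; }
unsigned unpack_minute(uint32_t word) { return (word >> 5) & 0x3Fu; }
unsigned unpack_second(uint32_t word) { return (word >> 11) & 0x3Fu; }

/* Updating one member is a read-modify-write of the whole word:
 * clear the six minute bits, then OR in the new value. */
uint32_t set_minute(uint32_t word, unsigned minute)
{
    assert(minute < 60);
    return (word & ~((uint32_t)0x3Fu << 5)) | ((uint32_t)minute << 5);
}
```

With struct time_bytes, reading or writing minute is a single byte access. With the packed word, every read costs a shift and a mask, and every write costs a read-modify-write. That is the trade-off between space and time that the article describes.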

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 

Último (20)

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 

Writing Efficient Code Feb 08

Insight

Understand the Machine to Write Efficient Code

LINUX For You | February 2008 | www.openITis.com

How many programmers can actually write assembly programs? With the rising popularity of high-level languages (like Java, VB.Net, etc), there is rarely any need for programmers to learn assembly or low-level programming. However, there are domains where writing efficient code is very important, for example, game programming and scientific computing.

When can we write highly efficient code? It is when we understand how the underlying machine works and make the best use of that knowledge. One well-known way to write highly efficient code is to write it in assembly. There are many disadvantages with this; for example, we cannot port the programs easily, it is difficult to maintain the code, etc. So, we need to look for alternative ways to write efficient code.

We can write in a low-level programming language like C to get code whose efficiency is often comparable to that of the equivalent code written in assembly. For this reason, C is often referred to as a 'high-level assembler'. In this article, we'll look at various programming constructs from the perspective of efficiency. We'll consider general machine architecture for illustration; and for specific examples, x86 architecture will be used. A word of caution before we proceed: the techniques and issues covered are not for general-purpose programming.

Basic types

The machine, in general, understands only three types of values: address, integer and floating-point values. For representation and manipulation, here is the correspondence between the types that the machine can understand and what C supports. Addresses correspond to the pointer construct; integers (both signed and unsigned) correspond to short, int, long, long long, char (yes, a char is represented as an int internally!) etc; floating-point types correspond to float, double, long double, etc.

The most efficient data type that a processor handles is a 'word', which corresponds to the 'int' type in C. For floating-point types, all computation is typically done in a larger floating-point type. For example, in x86 machines, all floating-point computation is done in 'extended precision', which is 80 bits in size and usually corresponds to 'long double' in C; if float expressions are used in the code, they are internally converted to extended precision by the processor and the results are converted back to float (which occupies 32 bits). The processor does floating-point computations in a separate 'co-processor' unit.

Address computation (such as array index access) is done using pointers in C. They directly correspond to memory access operations in the underlying machine (such as index addressing, in this case).

The 'int' type is the most efficient for computations. Unsigned types and operations on them are as efficient as signed types. Floating-point types and operations are slow compared to integral types. For example, in an imaginary processor, if integer division takes four cycles, a floating-point division operation might take 50 to 100 cycles. Memory access operations are very slow: if the desired memory location is not in cache, then it might take hundreds of cycles to fetch the data from the main memory.

Operators

C supports a rich set of operators. A few operators are directly supported by the processor and a few are simulated in software.

For integral types, bit-manipulation operators are faster than other operators such as arithmetic, logical or relational operators. One of the ways to write efficient code is to use bitwise operations instead of other, slower operations. Here is a well-known example: using the '>>' operator to shift right by one bit is more efficient than dividing an unsigned integer value by 2. We'll look at a different example for illustration here.

A typical code segment for toggling a character's case uses relational operators, as in:

    // precondition: the char ch provided is in range ['a'-'z'] or ['A'-'Z']
    char toggle_ascii_char_case(char ch) {
        if( (ch >= 'a') && (ch <= 'z') )        // lower case
            ch = ch - 0x20;
        else if( (ch >= 'A') && (ch <= 'Z') )   // upper case
            ch = ch + 0x20;
        return ch;
    }

The code works on the following assumption: the given char ch is within the range [a-z] or [A-Z]. If the char is in [A-Z], it returns the corresponding char in [a-z] and vice versa, that is, it toggles the case of the character. The value 0x20 is added or subtracted based on the fact that the upper-case and lower-case forms of a letter are separated by the hex value 0x20 in the ASCII table.

But this function is slow. If this is a library function used for toggling the case of characters in a string, then it is not a good implementation.

The code can be improved as follows. The comparison operators are not required, since the function precondition says that the ch passed to the function is in the given range ['a'-'z'] or ['A'-'Z']. Based on C tradition, we need not check to ensure that the given char is in fact in this range. Also, since we are performing either the '-' or the '+' arithmetic operation with 0x20, we can replace it with a bitwise operation that toggles that bit using the ex-or operator. With this the code becomes efficient and simple:

    // precondition: the char ch provided is in range ['a'-'z'] or ['A'-'Z']
    char toggle_ascii_char_case(char ch)
    {
        return (ch ^= 0x20);
    }

This example is just for illustration purposes. Using bitwise operators obscures the code, but it usually significantly improves the efficiency of the code.

Control flow

C has various conditional and looping constructs. A C compiler transforms such code constructs to branching (also known as 'jump') instructions. So, goto is the construct that corresponds most directly to what the machine executes. It is possible to take any C program and create an equivalent program by replacing all conditional and looping constructs with simple tests and goto statements. Though all of these constructs ultimately become branching instructions, there are subtle differences between them when it comes to efficiency.

Which one is more efficient: nested if conditions or switch statements? In general, a switch is more efficient than nested if statements. If both are implemented using branching, why is switch more efficient than nested if conditions? Recall that, in a switch statement, all case labels are constants. So, a compiler can transform a switch statement to a range check or a look-up table, which is more efficient than a long list of jumps.

Note that executing jump instructions is not costly, but unpredictable jumps can result in considerably slower execution. For example, frequent jumps can result in the flushing of the pipeline. Similarly, processors typically look ahead in the instruction stream, pre-fetch the memory they expect to need and put it in cache. So, unpredictable jumps can result in memory faults (cache misses), which waste hundreds of cycles since the processor has to wait for the memory value to be available before it can continue execution. In general, a program with fewer branches is faster than one with a large number of branches.

Memory access

As said earlier, memory access is a costly operation. Let us take a specific example to illustrate this. Typically, it takes the same time to access global data or local (stack allocated) data. However, it is preferable to use local data instead of global data because of the well-known 'principle of locality'. If a memory location is accessed, the processor doesn't fetch the value in just that memory location; it fetches many values adjacent to that memory location, since the program is likely to access variables that are located near that memory location. Fetching a block of data and putting it in cache is not time consuming, but if there is a memory fault, it can lead to the processor waiting for hundreds of cycles for that memory access to happen. If there are large numbers of global variables and their accesses are spread throughout the program, then the program execution becomes considerably slower.

Let us consider another (well-known) example to illustrate this important 'principle of locality'. Of the following two loops, which one is more efficient?

    for(i = 0; i < 50; i++)
        for(j = 0; j < 50; j++)
            for(k = 0; k < 50; k++)
                printf("%d ", a[k][j][i]);

    for(i = 0; i < 50; i++)
        for(j = 0; j < 50; j++)
            for(k = 0; k < 50; k++)
                printf("%d ", a[i][j][k]);

C has arrays implemented in row-major order, that is, the same way they are organised in the hardware. The second loop is more efficient because it accesses memory locations sequentially, in row-major order. The processor fetches the memory blocks into cache, and since the memory access is also sequential, this is efficient. However, in the first loop, the memory access is not sequential, so there might be a lot of memory faults and it will be considerably slower than the second loop.

From these examples we learn that memory faults are costly and that we need to minimise them to write efficient code.

Compound types

C supports compound types like structs, unions and bit-fields that are implemented in terms of other primitive or compound types. The hardware does not understand any compound types, and all processing is done on primitive types only. There are many aspects of using compound types, such as padding and alignment, that can affect performance. Here, we'll look at an example of bit-fields to understand how they are supported by the hardware.

In C, we can manipulate and access bits using bit-fields. It is a compile-time error to attempt to take the address of a bit-field member. Why? The granularity of addressing in modern computers is in bytes, and it is not possible to take the address of specific bits in a byte. Though it is space-efficient to use bit-fields, it is not time-efficient, since it is not possible to access individual bits directly; so the compiler emits code to access 'word's and then does bit-manipulation to access individual bit-field member values.

The following bit-field struct represents the time of day in HH:MM:SS format. For the hour, the range is 0-23 and for minutes and seconds, the range is 0-59. We can use the following struct:

    struct time {
        unsigned int hour  : 5;
        unsigned int minute: 6;
        unsigned int second: 6;
    } tm1, tm2;

To access tm1.minute, the compiler has to generate code to access the word (4 bytes), do bit-manipulation and extract only the 5th to 10th bits in that word, which is slow. So, avoid using bit-fields extensively if performance is important for your software. A better option in this case is to use a struct (without bit-fields) with a byte each for the hour, minute and second, respectively.

In this article, we explored some fundamental issues to understand how various programming constructs can affect the efficiency of programs. There are many other issues, such as unaligned memory access, cost of I/O operations, etc, that are not covered in this article. This article is just a starting point to understand such problems. If you are interested, you can read books on assembly language, computer architecture and compiler optimisation to get a better understanding of the issues related to writing efficient programs.

By: S G Ganesh is a research engineer in Siemens (Corporate Technology), Bangalore. His latest book is '60 Tips on Object Oriented Programming', published by Tata McGraw-Hill in December 2007. You can reach him at sgganesh@gmail.com
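As a small check of the case-toggling example in the Operators section, the sketch below puts the relational version and the ex-or version side by side and compares them over every ASCII letter. The function names toggle_case_relational, toggle_case_xor and toggle_versions_agree are made up for this illustration, and the code assumes an ASCII character set, as the article does.

```c
#include <assert.h>

/* Relational version from the article: two range checks, then add or subtract 0x20. */
char toggle_case_relational(char ch)
{
    if (ch >= 'a' && ch <= 'z')          /* lower case */
        return (char)(ch - 0x20);
    if (ch >= 'A' && ch <= 'Z')          /* upper case */
        return (char)(ch + 0x20);
    return ch;
}

/* Bitwise version: flip bit 5 (value 0x20).
 * Precondition (checked only in debug builds): ch is an ASCII letter. */
char toggle_case_xor(char ch)
{
    assert((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z'));
    return (char)(ch ^ 0x20);
}

/* Returns 1 if both versions give the same result for every ASCII letter, else 0. */
int toggle_versions_agree(void)
{
    for (int c = 'A'; c <= 'Z'; c++) {
        char upper = (char)c;
        char lower = (char)(c + 0x20);
        if (toggle_case_xor(upper) != toggle_case_relational(upper))
            return 0;
        if (toggle_case_xor(lower) != toggle_case_relational(lower))
            return 0;
    }
    return 1;
}
```

The assert documents the precondition without adding cost to a release build compiled with NDEBUG, which keeps the spirit of the 'C tradition' mentioned in the article. Whether the ex-or version is measurably faster depends on the compiler and processor, so it is worth timing on the target machine.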
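The row-major argument in the Memory access section can be made concrete by computing where a[i][j][k] sits relative to the start of the array. The sketch below is illustrative only; the helper names row_major_offset and measured_offset are assumptions, and N is fixed at 50 to match the loops in the article.

```c
#include <assert.h>
#include <stddef.h>

#define N 50

/* Offset, in elements, of a[i][j][k] from a[0][0][0] for int a[N][N][N]
 * stored in row-major order. */
size_t row_major_offset(size_t i, size_t j, size_t k)
{
    assert(i < N && j < N && k < N);
    return (i * N + j) * N + k;
}

/* The same offset measured from real addresses, to check the formula. */
size_t measured_offset(size_t i, size_t j, size_t k)
{
    static int a[N][N][N];
    assert(i < N && j < N && k < N);
    /* Byte distance from the start of the whole array, converted to elements. */
    return (size_t)((const char *)&a[i][j][k] - (const char *)a) / sizeof(int);
}
```

With this formula, stepping the last index moves one element at a time, while stepping the first index moves 2,500 elements at a time. That is why the loop whose innermost variable is the last subscript walks memory sequentially, and the loop whose innermost variable is the first subscript jumps far ahead on every iteration.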
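To show the kind of work hidden behind a bit-field access, the sketch below does the shifting and masking by hand on a 32-bit word and contrasts it with the byte-per-field struct recommended in the Compound types section. The bit positions used here (hour in bits 0-4, minute in bits 5-10, second in bits 11-16) are an assumption made for illustration; the real layout of a bit-field is chosen by the compiler. The names pack_time, packed_minute, unpack_time and struct time_bytes are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Plain struct suggested by the article: one byte per field, no masking needed. */
struct time_bytes {
    unsigned char hour;
    unsigned char minute;
    unsigned char second;
};

/* Pack HH:MM:SS into one word using the assumed layout:
 * hour in bits 0-4, minute in bits 5-10, second in bits 11-16. */
uint32_t pack_time(uint32_t hour, uint32_t minute, uint32_t second)
{
    assert(hour < 24 && minute < 60 && second < 60);
    return hour | (minute << 5) | (second << 11);
}

/* Reading one field takes a word load, a shift and a mask. */
uint32_t packed_minute(uint32_t word)
{
    return (word >> 5) & 0x3Fu;
}

/* Convert the packed word to the byte-per-field form. */
struct time_bytes unpack_time(uint32_t word)
{
    struct time_bytes t;
    t.hour   = (unsigned char)(word & 0x1Fu);
    t.minute = (unsigned char)((word >> 5) & 0x3Fu);
    t.second = (unsigned char)((word >> 11) & 0x3Fu);
    return t;
}
```

Once the time is held in struct time_bytes, reading minute is a single byte load. The trade-off is the one the article describes: the packed form saves space, the byte form saves the shift and mask on each access.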