AMD64 (EM64T) architecture
Authors: Evgeniy Ryzhkov, Andrey Karpov

Date: 02.10.2008


Abstract
This article briefly describes the AMD64 architecture developed by AMD and its implementation by
Intel, EM64T. It covers the architecture's features, advantages and disadvantages.


Introduction
The tasks solved on computers keep growing and demand more and more from the hardware they run on.
The requirements for personal-computer-class systems have been growing year by year for 20 years
already. This happens because people want to solve ever more complex tasks on their personal
computers, tasks that could previously be solved only on high-performance mainframes.

What requirements do complex tasks place on a personal computer? First of all, these are requirements
for main-memory size and processor performance (not to be confused with clock frequency!). The IA-32
architecture (Intel Architecture 32), which has dominated the last decade, offers 4Gb (2^32 bytes) of
address space, of which only 2Gb are usually available to an application. It also offers various register
sets and a number of techniques, such as the branch prediction unit, which are meant to increase the
system's performance without increasing such an abstract parameter as processor frequency [1].

Modern personal-computer tasks are approaching the 2Gb limit, while raising processor frequency no
longer helps to increase performance.

The 64-bit architectures SPARC64 and Intel Itanium can to some extent overcome the limitations of
modern 32-bit computers. But they are intended for high-end systems and are not available as cheap
solutions. It is the AMD64 architecture by AMD and its implementation by Intel, EM64T, that are set to
become really popular. These architectures are twins, and programs compiled for one of them can run on
the other as well. The AMD solution appeared first historically; EM64T is essentially Intel's
implementation of AMD64. The AMD64 architecture is now implemented in processors of all classes:
mobile, workstation and server.

Despite the evident advantages of the AMD64 platform (described in detail in this article), it doesn't
introduce anything revolutionary into computing. The move from 32 bits to 64 bits didn't bring
qualitative improvements, whereas the earlier move from 16 bits to 32 bits had significantly increased
systems' safety and performance.


1. AMD64 architecture
The AMD64 architecture is fully described in five documentation volumes provided by AMD. This
chapter gives a brief description based on the first volume [2]. Note that the official documentation
calls this architecture AMD x86-64, which underlines its backward compatibility.

1.1. The architecture's description
The AMD x86-64 architecture is a simple but powerful backward-compatible extension of the legacy
industry-standard x86 architecture [1]. It adds a 64-bit address space and extends the register resources
to give recompiled 64-bit programs higher performance, while still supporting legacy 16-bit and 32-bit
applications and operating systems without modifying or recompiling them.

The need for a 64-bit x86 architecture is driven by applications that require a large address space:
high-performance servers, database management systems, CAD systems and, of course, games. Such
applications benefit from the 64-bit address space and the larger number of registers. The small number
of registers available in the legacy x86 architecture limits the performance of computing tasks. More
registers provide sufficient performance for most applications.

The x86-64 architecture introduces two main new features:

1. Extended registers (Picture 1):

    •   8 new general-purpose registers;
    •   all 16 general-purpose registers are 64-bit;
    •   8 new 128-bit XMM registers;
    •   a new instruction prefix (REX) for access to the extended registers.

2. A special mode, "Long Mode", shown in Table 1:

    •   up to 64-bit virtual addresses;
    •   64-bit instruction pointer (RIP);
    •   flat address space.

Picture 1. Set of x86-64 registers

Table 1. Processor operating modes.

Table 2 compares the register and stack resources available to an application in different modes. The
left columns show the resources provided by the legacy x86 architecture, which are available in legacy
and compatibility modes. The right columns show the resources available in 64-bit mode. The differences
between the modes are marked in grey.

Table 2. Registers and stack available in different modes

As Table 2 shows, the legacy x86 architecture (called legacy mode in x86-64) supports 8
general-purpose registers. In practice, though, only 4 of them are usually used freely: EAX, EBX, ECX
and EDX. The registers EBP, ESI, EDI and ESP have special purposes. The x86-64 architecture adds 8
general-purpose registers and widens all of them from 32 bits to 64 bits. This allows compilers to
generate faster code: a 64-bit compiler can keep more variables in registers and minimize memory
accesses by performing operations inside general-purpose registers.

The x86-64 architecture supports the whole x86 instruction set and adds some new instructions for
supporting long mode. The instructions are divided into several subsets:

    •   General-purpose instructions. These are the main x86 integer instructions used in all programs.
        Most of them load, store and process data located in general-purpose registers or memory.
        Some of them control the instruction flow, transferring execution from one program section
        to another.
    •   128-bit media instructions. These are the SSE and SSE2 (streaming SIMD extensions)
        instructions that load, store or process data located in the 128-bit XMM registers. They perform
        integer or floating-point operations on vector (packed) and scalar data types. Since vector
        instructions can apply one operation to a whole set of data items independently, they are called
        single-instruction, multiple-data (SIMD) instructions. They are used in media and scientific
        applications for processing blocks of data.
    •   64-bit media instructions. These are the multimedia extension (MMX) and 3DNow!
        instructions. They save, restore and process data located in the 64-bit MMX registers. Like the
        128-bit instructions described above, they perform integer and floating-point operations on
        vector (packed) and scalar data.
    •   x87 instructions. They are intended for floating-point computations in legacy x87
        applications. They process data in the x87 registers.

Some instructions span two or more of the subsets described above. Examples are the instructions
that move data between general-purpose registers and XMM or MMX registers.
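
To make the 128-bit media instructions more concrete, here is a minimal C++ sketch that adds four
32-bit integers with one packed SSE2 addition. The function name add4_i32 and the macro
EXAMPLE_HAVE_SSE2 are made up for this illustration. The intrinsics come from the <emmintrin.h> header
shipped with common x86 compilers, and a plain loop is used as a fallback on targets where SSE2 is
assumed to be unavailable.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>          // SSE2 intrinsics operating on XMM registers
#define EXAMPLE_HAVE_SSE2 1     // hypothetical macro, used only in this sketch
#endif

// Adds four 32-bit integers element by element: out[i] = a[i] + b[i].
// add4_i32 is a hypothetical helper name, not part of any library.
void add4_i32(const std::int32_t* a, const std::int32_t* b, std::int32_t* out) {
    assert(a && b && out);
#ifdef EXAMPLE_HAVE_SSE2
    // Each unaligned load fills one 128-bit value with four packed integers.
    const __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    const __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));
    // A single packed addition processes all four elements (SIMD).
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), _mm_add_epi32(va, vb));
#else
    // Scalar fallback for targets without SSE2.
    for (int i = 0; i < 4; ++i) out[i] = a[i] + b[i];
#endif
}
```

Calling add4_i32 with {1, 2, 3, 4} and {10, 20, 30, 40} stores {11, 22, 33, 44} in the output array on
either code path.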

Let's consider in detail the operating modes supported by x86-64 and shown in Table 1. In most cases
the default address and operand sizes can be overridden with an instruction prefix.

Let's describe long mode first. It is an extension of the legacy protected mode. Long mode consists of
two submodes: 64-bit mode and compatibility mode. 64-bit mode supports all the new features and
register extensions introduced in x86-64. Compatibility mode supports binary compatibility with existing
16-bit and 32-bit code. Long mode supports neither the legacy real mode nor the legacy virtual-8086
mode, and it doesn't support hardware task switching either.

Since 64-bit mode supports a 64-bit address space, it requires a new 64-bit operating system. Existing
applications, meanwhile, can run without recompilation in compatibility mode under an OS running in
64-bit mode. Instructions are addressed through the 64-bit instruction pointer (RIP), and a new
addressing mode with a single flat address space for code, stack and data is used.

64-bit mode provides access to the extended registers through a new group of instruction prefixes, REX.

In 64-bit mode the address size is 64 bits by default, although implementations of x86-64 may support
fewer address bits. The operand size is 32 bits by default. For most instructions the operand size can be
overridden with a REX prefix.

64-bit mode supports data addressing relative to the 64-bit RIP register. The legacy x86 architecture
supported IP-relative addressing only in control-transfer instructions. RIP-relative addressing makes
position-independent code, and code that addresses global data, more efficient.

Some instruction opcodes were redefined to support the extended registers and 64-bit addressing.

Compatibility mode is intended for running existing 16-bit and 32-bit programs under a 64-bit OS.
Applications run in compatibility mode with a 16- or 32-bit address space and can access 4Gb of virtual
address space. Instruction prefixes can switch between 16- and 32-bit address and operand sizes.

From the application's viewpoint, compatibility mode looks like the legacy x86 protected mode, but
from the viewpoint of the OS (address translation, interrupt and exception handling) the 64-bit
mechanisms are used.

Legacy mode provides binary compatibility not only with 16- and 32-bit applications but with 16- and
32-bit operating systems as well. It includes three modes:

    •   Protected mode. 16- and 32-bit programs with segmented memory, privilege levels and
        virtual memory support. The address space is 4Gb.
    •   Virtual-8086 mode. Supports 16-bit applications running as tasks in protected mode. The address
        space is 1Mb.
    •   Real mode. Supports 16-bit programs with simple register-based addressing of segmented
        memory. Virtual memory and privilege levels are not supported. 1Mb of memory is available.

Legacy mode is used only when a 16- or 32-bit OS is running.

1.2. The architecture's advantages
Let's outline the main advantages of the AMD x86-64 architecture.

    •   64-bit address space.
    •   Extended register set.
    •   An instruction set familiar to developers.
    •   Possibility of running legacy 32-bit applications under a 64-bit OS.
    •   Possibility of using a 32-bit OS.

1.3. The architecture's disadvantages
The new AMD x86-64 architecture hasn't introduced any crucial disadvantages compared with the 32-bit
architecture. We can point out only slightly higher memory requirements of programs because of the
larger addresses and operands. This, however, doesn't significantly affect either the code size or the
requirements for available main memory.

But the fact is that AMD x86-64 hasn't introduced anything significantly new either. There is no
dramatic performance gain: on average, you can expect a 5-15% speedup after recompiling a program.


2. AMD64 program model
Nearly all modern operating systems now have versions for the AMD64 architecture. Microsoft offers
Windows XP 64-bit, Windows Server 2003 64-bit and Windows Vista 64-bit. The leading UNIX system
developers also provide 64-bit versions, for example Debian Linux 3.1 x86-64. But this doesn't mean that
all the code of such a system is 64-bit. Some OS code and many applications can still remain 32-bit
because AMD64 provides backward compatibility.

The 64-bit version of Windows, for example, uses a special subsystem, WoW64 (Windows-on-Windows 64),
which translates calls from 32-bit applications to the resources of the 64-bit OS. Let's consider in detail
the AMD64 program model available to a programmer in 64-bit Windows, or Win64 for short [3, 4].

Let's begin with the address space. Although a 64-bit processor can theoretically address 16 exabytes
(2^64 bytes), Win64 currently supports 16 terabytes (2^44 bytes). There are several reasons for this.
Existing processors can access only 1 terabyte (2^40 bytes) of physical memory. The architecture (but not
the current hardware) can extend this space up to 4 petabytes (2^52 bytes). In any case, a great amount
of memory is needed for the page tables that describe this memory (see Table 3).

                                       32-bit mode                  64-bit mode
Process's total address space          4Gb                          16Tb
Address space available to a           2Gb (3Gb if the system is    4Gb if the application is compiled with the
32-bit process                         booted with the /3GB key)    /LARGEADDRESSAWARE key (2Gb otherwise)
Address space available to a           Impossible                   8Tb
64-bit process
Paged pool                             470Mb                        128Gb
Non-paged pool                         256Mb                        128Gb
System Page Table Entries (PTE)        660Mb - 900Mb                128Gb
Table 3. Main memory limitations in Windows
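
The sizes quoted above are all powers of two, so they are easy to cross-check. The following C++
sketch does this arithmetic at compile time. The helper bytes_for_bits and the unit constants are
names invented for this illustration, and the units are taken as binary (1Gb = 2^30 bytes,
1Tb = 2^40 bytes), which is an assumption about how the article uses them.

```cpp
#include <cstdint>

// Number of bytes addressable with the given number of address bits.
// Valid only for bits < 64, because 2^64 does not fit in a 64-bit integer.
// (bytes_for_bits is a hypothetical helper name used only in this sketch.)
constexpr std::uint64_t bytes_for_bits(unsigned bits) {
    return std::uint64_t{1} << bits;
}

// Binary units, assumed to match the article's Gb/Tb notation.
constexpr std::uint64_t kGb = std::uint64_t{1} << 30;
constexpr std::uint64_t kTb = std::uint64_t{1} << 40;
constexpr std::uint64_t kPb = std::uint64_t{1} << 50;

static_assert(bytes_for_bits(32) == 4 * kGb,  "32-bit address space is 4Gb");
static_assert(bytes_for_bits(40) == 1 * kTb,  "40 address bits give 1Tb");
static_assert(bytes_for_bits(44) == 16 * kTb, "Win64 address space is 16Tb");
static_assert(bytes_for_bits(52) == 4 * kPb,  "52 address bits give 4Pb");

// Halving the 16Tb space gives the 8Tb available to a 64-bit process.
static_assert(bytes_for_bits(44) / 2 == 8 * kTb, "user half is 8Tb");
```

If any of the figures were inconsistent with the stated powers of two, the corresponding
static_assert would stop the compilation.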

As in Win32, the addressable memory range is divided into user and system addresses. Each process
receives 8Tb, and 8Tb remain for the system (as opposed to 2Gb and 2Gb in Win32). Different
Windows versions have different limits, shown in Table 4.

Physical memory and number of CPUs          32-bit versions      64-bit versions
Windows XP Home                             4 Gb, 1 CPU          Not present
Windows XP Professional                     4 Gb, 1-2 CPU        128 Gb, 1-2 CPU
Windows Server 2003, Standard               4 Gb, 1-4 CPU        32 Gb, 1-4 CPU
Windows Server 2003, Enterprise             64 Gb, 1-8 CPU       1 Tb, 1-8 CPU
Windows Server 2003, Datacenter             64 Gb, 8-32 CPU      1 Tb, 8-64 CPU
Windows Server 2008, Datacenter             64 Gb, 2-64 CPU      2 Tb, 2-64 CPU
Windows Server 2008, Enterprise             64 Gb, 1-8 CPU       2 Tb, 1-8 CPU
Windows Server 2008, Standard               4 Gb, 1-4 CPU        32 Gb, 1-4 CPU
Windows Server 2008, Web Server             4 Gb, 1-4 CPU        32 Gb, 1-4 CPU
Vista Home Basic                            4 Gb, 1 CPU          8 Gb, 1 CPU
Vista Home Premium                          4 Gb, 1-2 CPU        16 Gb, 1-2 CPU
Vista Business                              4 Gb, 1-2 CPU        128 Gb, 1-2 CPU
Vista Enterprise                            4 Gb, 1-2 CPU        128 Gb, 1-2 CPU
Vista Ultimate                              4 Gb, 1-2 CPU        128 Gb, 1-2 CPU
Table 4. Limitations of different Windows versions

As in Win32, the page size is 4Kb. The first 64Kb of the address space are never mapped, i.e. the lowest
valid address is 0x10000. Unlike in Win32, system DLLs are loaded above 4Gb.

All processors implementing AMD64 support the "No Execute" (NX) bit, which Windows uses to
implement the hardware technology "Data Execution Prevention" (DEP). It forbids executing user data
as if it were code. This increases programs' safety by ruling out errors such as executing a data buffer
as code.

A peculiarity of AMD64 compilers is that they can use registers very efficiently for passing
parameters into functions instead of using the stack. This allowed the developers of Win64 to get rid
of the variety of calling conventions. In Win32 you can use different conventions (ways of passing
parameters): __stdcall, __cdecl, __fastcall etc. In Win64 there is only one calling convention. Let's
consider how four integer arguments are passed in registers:

    •   RCX: first argument
    •   RDX: second argument
    •   R8: third argument
    •   R9: fourth argument

Integer arguments after the first four are passed on the stack. Floating-point arguments are passed in
the XMM0-XMM3 registers, and any further ones are passed on the stack as well.
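
As an illustration, the C++ sketch below annotates two ordinary functions with the registers a Win64
compiler is expected to use according to the rules above. The function names sum5, product and
check_calls are invented for this example. The register assignment cannot be observed from portable
C++, so the comments simply restate the convention described in the text; on other 64-bit platforms
the registers differ, while the code itself stays the same.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical function with five integer parameters.
// Expected placement under the Win64 calling convention:
//   a -> RCX, b -> RDX, c -> R8, d -> R9, e -> stack
std::int64_t sum5(std::int64_t a, std::int64_t b, std::int64_t c,
                  std::int64_t d, std::int64_t e) {
    return a + b + c + d + e;
}

// Hypothetical function with floating-point parameters.
// Expected placement under the Win64 calling convention:
//   x -> XMM0, y -> XMM1
double product(double x, double y) {
    return x * y;
}

// The caller does not name a convention anywhere: in Win64 there is
// only one, so no __stdcall/__cdecl/__fastcall annotations are needed.
void check_calls() {
    assert(sum5(1, 2, 3, 4, 5) == 15);
    assert(product(2.0, 3.0) == 6.0);
}
```

To see the actual registers, you can compile such a file for x64 and inspect the generated assembly
listing.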

Because of the difference in calling conventions, you cannot use both 64-bit and 32-bit code in one
program. In other words, if an application is compiled for 64-bit mode, all the DLLs it uses must be
64-bit too.

When writing 64-bit code you can get an additional performance gain through special optimizations.
This topic is covered in detail in the optimization guide [5].


3. Porting applications to AMD64
One of the purposes of high-level languages is to reduce, as far as possible, the dependence of program
code on the architecture and to provide maximum portability between hardware platforms. For example,
correctly written C++ programs are theoretically independent of the hardware platform. Ideally, to
build a 32-bit application for the AMD64 platform it is enough to switch the compiler [6] and
recompile the program. In practice, everything is more complicated.

Software containing assembler code for 32-bit processors still exists, and many programs written in
high-level languages contain assembler blocks. That's why it is often impossible simply to recompile a
large project. The possible solutions are clear. Firstly, you can decide not to port the application to the
new platform. This can be a very reasonable decision because, for example, Windows-family operating
systems provide good backward compatibility thanks to the WoW64 technology. The second option is to
rewrite the program code, and it seems reasonable to rewrite it in a high-level language. Note, by the
way, that the Visual C++ compiler no longer supports inline assembler blocks in 64-bit compilation
mode [7].
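
One possible way to carry out such a rewrite step by step is sketched below: the old inline assembler
block is kept only for 32-bit Visual C++ builds, and every other build uses an equivalent high-level
implementation. The function add_u32 is a deliberately trivial, hypothetical example; the predefined
macros _MSC_VER and _M_IX86 are the ones Visual C++ defines for 32-bit x86 targets.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical routine that used to be written with inline assembler.
// Returns a + b with the usual 32-bit wrap-around.
std::uint32_t add_u32(std::uint32_t a, std::uint32_t b) {
#if defined(_MSC_VER) && defined(_M_IX86)
    // 32-bit Visual C++ only: the legacy inline assembler block.
    std::uint32_t result;
    __asm {
        mov eax, a
        add eax, b
        mov result, eax
    }
    return result;
#else
    // 64-bit builds and other compilers: plain C++ replacement.
    return a + b;
#endif
}

// Both branches must give the same results, including overflow behaviour.
void check_add_u32() {
    assert(add_u32(2u, 3u) == 5u);
    assert(add_u32(0xFFFFFFFFu, 1u) == 0u);
}
```

Keeping both branches behind one function makes it easy to compare the old and the new
implementation with the same tests before the assembler block is finally removed.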

Assembler code is not the only obstacle we face when moving to 64-bit systems. While porting
programs to 64-bit systems, various errors appear that are related to the change of the data model
(type sizes). What's more, some errors show up only when using amounts of memory that were
unavailable on 32-bit systems. Such errors are described in detail in the article "20 issues of porting C++
code on the 64-bit platform" [8].
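
A typical error of this kind is storing a size or an address in a 32-bit variable. The C++ sketch
below models it with fixed-width types so that the effect is the same on any platform. All function
names are invented for this illustration, and the 5Gb figure is just a sample value above the 4Gb
limit of 32-bit systems.

```cpp
#include <cassert>
#include <cstdint>

// Models what happens when a 64-bit size or address is stored in a
// 32-bit variable: the upper 32 bits are silently discarded.
constexpr std::uint32_t truncate_to_32(std::uint64_t value) {
    return static_cast<std::uint32_t>(value);
}

// True if the value survives a round trip through a 32-bit variable.
constexpr bool fits_in_32_bits(std::uint64_t value) {
    return truncate_to_32(value) == value;
}

// A 2Gb size still fits, so the bug stays hidden on small data sets.
static_assert(fits_in_32_bits(std::uint64_t{2} << 30), "2Gb fits in 32 bits");
// A 5Gb size does not fit: it wraps around to 1Gb.
static_assert(!fits_in_32_bits(std::uint64_t{5} << 30), "5Gb does not fit");
static_assert(truncate_to_32(std::uint64_t{5} << 30) == (std::uint64_t{1} << 30),
              "5Gb is truncated to 1Gb");

// Safe alternative for addresses: std::uintptr_t is wide enough for a pointer.
std::uintptr_t pointer_to_integer(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

const void* integer_to_pointer(std::uintptr_t value) {
    return reinterpret_cast<const void*>(value);
}

// The pointer survives the round trip regardless of the platform's bitness.
void check_round_trip() {
    int object = 0;
    assert(integer_to_pointer(pointer_to_integer(&object)) == &object);
}
```

The example also shows why such errors are hard to catch: the truncation does no harm until the
program actually works with more than 4Gb of data.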

All said above relates mostly to C/C++ applications. The situation is better with managed code (C#),
although small problems may occur here as well. Unfortunately, large software systems are often built
using libraries written in C/C++. That's why a large C# project most likely contains C/C++ modules or
libraries, which can be unsafe and contain vulnerabilities.

To test and check program code ported to a 64-bit platform you can use various special methods and
tools [9]. For example, static analyzers such as Viva64 (for Windows systems) and PC-Lint (for Unix
systems) can provide good results. To learn more about these tools, read the article
"Comparison of analyzers' diagnostic abilities while testing 64-bit code" [10].


Conclusion
Undoubtedly, the AMD64 architecture offered by AMD has turned out to be in demand on the market.
Its advantage is that it allows a smooth switch to 64-bit programs without losing compatibility with
legacy 32-bit applications. But there is nothing revolutionary in AMD64.

As experiments demonstrate, migrating 32-bit programs to AMD64 allows you, firstly, to solve tasks
that are much more memory-demanding and, secondly, to get about a 10% performance gain "for free",
without changing the code, thanks to the compiler optimizing the application for the new architecture.

We may conclude that the AMD64 architecture has postponed the problem of limited main-memory size
for many years but hasn't solved the problem of increasing the performance of modern personal
computers. The future still lies with multi-core and multi-processor systems.


References
   1. Intel Software Developer's Manual. Volume 1: Basic Architecture.
       http://www.viva64.com/go.php?url=212
   2. AMD x86-64 Architecture Programmer's Manual. Volume 1: Application Programming.
       http://www.viva64.com/go.php?url=213
   3. Mike Wall. Tricks for Porting Applications to 64-Bit Windows on AMD64 Architecture.
       http://www.viva64.com/go.php?url=214
   4. Matt Pietrek. Everything You Need To Know To Start Programming 64-Bit Windows Systems.
       http://www.viva64.com/go.php?url=215
   5. Software Optimization Guide for AMD Athlon 64 and AMD Opteron Processors.
       http://www.viva64.com/go.php?url=59
   6. Compiler Usage Guidelines for 64-Bit Operating Systems on AMD64 Platforms.
       http://www.viva64.com/go.php?url=216
   7. Daniel Pistelli. Moving to Windows Vista x64. http://www.viva64.com/go.php?url=217
   8. Andrey Karpov, Evgeniy Ryzhkov. 20 issues of porting C++ code on the 64-bit platform.
       http://www.viva64.com/art-1-2-599168895.html
   9. Andrey Karpov. Problems of testing 64-bit applications.
       http://www.viva64.com/art-1-2-1289354852.html
   10. Andrey Karpov. Comparison of analyzers' diagnostic abilities while testing 64-bit code.
       http://www.viva64.com/art-1-2-914146540.html

[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 

Recently uploaded (20)

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 

AMD64 (EM64T) architecture
Authors: Evgeniy Ryzhkov, Andrey Karpov

Date: 02.10.2008

Abstract
The article briefly describes the AMD64 architecture by AMD and its implementation, EM64T, by Intel. The architecture's peculiarities, advantages and disadvantages are described.

Introduction
The growing complexity of computing tasks demands more and more from the hardware they run on. The requirements for personal-computer-class systems have been growing year by year for 20 years already, because people want to solve on their personal computers ever more complex tasks that were previously solved only on high-performance mainframes.

What are these requirements? First of all, main-memory size and processor performance (not to be confused with clock frequency!). The IA32 architecture (Intel Architecture 32), dominant during the last decade, offers 4 GB (2^32 bytes) of main memory, of which only 2 GB are usually available to an application, plus various register blocks and tricks such as the branch prediction unit, which are meant to increase performance without increasing such an abstract parameter as processor frequency [1].

Modern personal-computer tasks are approaching the 2 GB limit, while raising processor frequency can no longer help increase performance.

The newer 64-bit architectures SPARC64 and Intel Itanium can to some extent solve the limitations of modern 32-bit computers, but they are intended for high-end systems and are not available as cheap solutions. It is the AMD64 architecture by AMD and its implementation EM64T by Intel that are set to become really popular. These architectures are twins: programs compiled for one of them can be run on the other as well. Historically, though, the AMD solution appeared first; EM64T is in fact only Intel's implementation of AMD64.
The AMD64 architecture is now implemented in processors of all classes: mobile, workstation and server. Despite the evident advantages of the AMD64 platform (described in detail in this article), it does not introduce anything revolutionary into computing. The move from 32 bits to 64 bits did not bring a qualitative improvement, whereas the earlier move from 16 bits to 32 bits had significantly increased systems' safety and performance.

1. AMD64 architecture
The AMD64 architecture is fully described in five documentation volumes provided by AMD. This chapter gives a brief description based on the first volume [2]. Note that the official documentation calls this architecture AMD x86-64, which underlines its backward compatibility.
1.1. The architecture's description
The AMD x86-64 architecture is a simple but powerful backward-compatible extension of the legacy industry-standard x86 architecture [1]. It adds a 64-bit address space and extends the register resources to give recompiled 64-bit programs higher performance, while still supporting legacy 16-bit and 32-bit application and operating-system code without modifying or recompiling it.

The need for a 64-bit x86 architecture comes from applications that require a large address space: high-performance servers, database management systems, CAD systems and, of course, games. Such applications benefit from the 64-bit address space and from the larger number of registers. The small number of registers available in the legacy x86 architecture limits the performance of computing tasks. More registers provide sufficient performance for most applications.

The x86-64 architecture introduces two new features:
1. Extended registers (Picture 1):
• 8 additional general-purpose registers;
• all 16 general-purpose registers are 64-bit;
• 8 new 128-bit XMM registers;
• a new instruction prefix (REX) for access to the extended registers.
2. A special mode, "Long Mode", which is shown in Table 1:
• up to 64-bit virtual addresses;
• 64-bit instruction pointer (RIP);
• flat address space.
Picture 1. Set of x86-64 registers
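To make the role of the REX prefix listed above more concrete, here is a minimal Python sketch of how such a prefix byte is composed. The bit layout (0100WRXB) and the opcode 0x89 for a register-to-register MOV come from the AMD and Intel manuals rather than from this article, and the function names rex_prefix and encode_mov_reg_reg are hypothetical helpers introduced only for illustration.

```python
def rex_prefix(w: bool = False, r: bool = False, x: bool = False, b: bool = False) -> int:
    """Compose a REX prefix byte with the layout 0100WRXB.

    w - selects a 64-bit operand size
    r - extends the ModRM 'reg' field (fourth bit of the register number)
    x - extends the SIB 'index' field
    b - extends the ModRM 'rm' field (or SIB 'base')
    """
    return 0b0100_0000 | (int(w) << 3) | (int(r) << 2) | (int(x) << 1) | int(b)


def encode_mov_reg_reg(dst: int, src: int) -> bytes:
    """Encode 'mov dst, src' for two 64-bit registers numbered 0..15.

    Uses opcode 0x89 (MOV r/m64, r64): src goes into the ModRM 'reg' field,
    dst into the 'rm' field. Register numbers 8..15 (R8-R15) need the
    extra bit carried by the REX prefix.
    """
    rex = rex_prefix(w=True, r=src >= 8, b=dst >= 8)
    modrm = 0b11_000_000 | ((src & 7) << 3) | (dst & 7)
    return bytes([rex, 0x89, modrm])


# Register numbers assumed here: RAX = 0, RCX = 1, RBX = 3, R8 = 8, R9 = 9.
print(encode_mov_reg_reg(0, 8).hex(" "))  # mov rax, r8
```

Only the low three bits of a register number fit into the ModRM byte; the fourth bit travels in the prefix, which is why the eight new registers are reachable only with REX.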
Table 1. Processor operating modes.

Table 2 compares the register and stack resources available to an application in different modes. The left columns show the resources provided by the legacy x86 architecture, which are available in legacy and compatibility modes. The right columns show the resources available in 64-bit mode. Differences between the modes are shaded grey.
Table 2. Registers and stack available in different modes

As shown in Table 2, the legacy x86 architecture (called legacy mode in x86-64) supports 8 general-purpose registers. In practice, however, only 4 of them are usually used freely: EAX, EBX, ECX and EDX. The registers EBP, ESI, EDI and ESP have special purposes. The x86-64 architecture adds 8 general-purpose registers and widens the registers from 32 bits to 64 bits. This allows compilers to generate faster code: a 64-bit compiler can keep variables in registers more efficiently and can minimize memory accesses by keeping operations inside the general-purpose registers.

The x86-64 architecture supports the whole x86 instruction set and adds some new instructions for supporting long mode. The instructions are divided into several subsets:
• General-purpose instructions. These are the basic x86 integer instructions used in all programs. Most of them load, store and process data located in general-purpose registers or memory. Some of them control the instruction flow, transferring execution from one program section to another.
• 128-bit media instructions. These are the SSE and SSE2 (Streaming SIMD Extensions) instructions, which load, store or process data located in the 128-bit XMM registers. They perform integer or floating-point operations on vector (packed) and scalar data types. Because a vector instruction performs the same operation on every element of a data set independently, these are called single-instruction, multiple-data (SIMD) instructions. They are used in media and scientific applications for processing blocks of data.
• 64-bit media instructions. These are the multimedia extension (MMX) and 3DNow! instructions. They load, store and process data located in the 64-bit MMX registers. Like the 128-bit instructions described above, they perform integer and floating-point operations on vector (packed) and scalar data.
• x87 instructions. They are intended for floating-point work in legacy x87 applications and process data in the x87 registers.

Some instructions bridge two or more of the subsets described above, for example the instructions that move data between general-purpose registers and XMM or MMX registers.

Let us consider in detail the operating modes supported by x86-64 and shown in Table 1. In most cases the default address and operand sizes can be overridden by an instruction prefix.

Let us describe long mode first. It is an extension of the legacy protected mode and consists of two submodes: 64-bit mode and compatibility mode. 64-bit mode supports all the new features and register extensions introduced in x86-64. Compatibility mode supports binary compatibility with existing 16-bit and 32-bit code.
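The split into long mode (with its two submodes) and legacy mode can be sketched as a small decision function. This is only an illustrative model: the control bits it takes as input (EFER.LMA, the L bit of the code-segment descriptor, CR0.PE and EFLAGS.VM) are described in the AMD manuals, not in this article, and the function name operating_mode is a hypothetical helper.

```python
def operating_mode(lma: bool, cs_l: bool = False, pe: bool = True, vm: bool = False) -> str:
    """Return the name of the x86-64 operating mode for the given control bits.

    lma  - EFER.LMA, set when long mode is active
    cs_l - L bit of the current code-segment descriptor
    pe   - CR0.PE, protection enable
    vm   - EFLAGS.VM, virtual-8086 mode flag
    """
    if lma:
        # Long mode: the code segment decides between its two submodes.
        return "64-bit mode" if cs_l else "compatibility mode"
    # Legacy mode: the classic x86 rules apply.
    if not pe:
        return "real mode"
    return "virtual-8086 mode" if vm else "protected mode"


for bits in [(True, True), (True, False), (False, False, True, False),
             (False, False, True, True), (False, False, False, False)]:
    print(bits, "->", operating_mode(*bits))
```

Because the submode is chosen per code segment, a 64-bit OS can run 64-bit and 32-bit applications side by side without leaving long mode.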
Long mode does not support the legacy real mode or virtual-8086 mode, nor does it support hardware task switching.

Because 64-bit mode supports a 64-bit address space, it requires a new 64-bit operating system. Existing applications, meanwhile, can be run without recompilation in compatibility mode under an OS working in 64-bit mode. Instructions are addressed through the 64-bit instruction pointer (RIP), and a new addressing mode with a single flat address space for code, stack and data is used. 64-bit mode provides access to the extended registers through a new group of instruction prefixes, REX. In 64-bit mode the default address size is 64 bits, although implementations of x86-64 may support fewer address bits. The default operand size is 32 bits. For most instructions the operand size can be overridden with a REX prefix.

64-bit mode provides data addressing relative to the 64-bit RIP register. The legacy x86 architecture provided IP-relative addressing only in control-transfer instructions. RIP-relative addressing makes position-independent code, and code that addresses global data, more efficient. Some opcodes were redefined to support the extended registers and 64-bit addressing.

Compatibility mode is intended for executing existing 16-bit and 32-bit programs under a 64-bit OS. Applications run in compatibility mode use a 16- or 32-bit address space and can access 4 GB of virtual address space. Instruction prefixes can switch between 16- and 32-bit address and operand sizes. From the application's viewpoint, compatibility mode looks like the legacy x86 protected mode, but from the viewpoint of the OS (address translation, interrupt and exception handling) the 64-bit mechanisms are used.

Legacy mode provides binary compatibility not only with 16- and 32-bit applications but with 16- and 32-bit operating systems as well. It includes three modes:
• Protected mode. 16- and 32-bit programs with segmented memory organization, privilege levels and virtual memory support. The address space is 4 GB.
• Virtual-8086 mode. Supports 16-bit applications run as tasks in protected mode. The address space is 1 MB.
• Real mode. Supports 16-bit programs with simple register-based segmentation. Virtual memory and privilege levels are not supported. 1 MB of memory is available.

Legacy mode is used only when a 16- or 32-bit OS is running.

1.2. The architecture's advantages
Let us outline the main advantages of the AMD x86-64 architecture:
• 64-bit address space.
• Extended register set.
• An instruction set familiar to developers.
• The ability to run legacy 32-bit applications under a 64-bit OS.
• The ability to use a 32-bit OS.

1.3. The architecture's disadvantages
The new AMD x86-64 architecture has not added any crucial disadvantages to the 32-bit architecture. We can point out only slightly higher memory requirements of programs because of the larger addresses and operands, but this does not significantly affect either code size or the amount of main memory required.

The fact remains, however, that AMD x86-64 has not introduced anything significantly new. There is no dramatic performance gain: on average, you can expect a 5-15% improvement after recompiling a program.

2. AMD64 program model
Nearly all modern operating systems now have versions for the AMD64 architecture. For example, Microsoft offers Windows XP 64-bit, Windows Server 2003 64-bit and Windows Vista 64-bit. The leading UNIX system developers also provide 64-bit versions, for example Debian Linux 3.1 x86-64. This does not mean, however, that all the code of such a system is 64-bit. Some OS code and many applications can still remain 32-bit, because AMD64 provides backward compatibility. 64-bit Windows, for example, uses a special subsystem, WOW64 (Windows-on-Windows 64), which translates the calls of 32-bit applications to the resources of the 64-bit OS.

Let us consider in detail the AMD64 program model available to a programmer in 64-bit Windows [3, 4], called Win64 for short.
Let us begin with the address space. Although a 64-bit processor can theoretically address 16 exabytes (2^64 bytes), Win64 currently supports 16 terabytes (2^44 bytes). There are several reasons for this. Existing processors can access only 1 terabyte (2^40 bytes) of physical memory. The architecture (but not the current hardware) can extend this space up to 4 petabytes (2^52 bytes). In any case, a large amount of memory is needed for the page tables that describe this memory (see Table 3).

Parameter | 32-bit mode | 64-bit mode
Total process address space | 4 GB | 16 TB
Address space available to a 32-bit process | 2 GB (3 GB if the system is booted with the /3GB switch) | 4 GB if the application is built with the /LARGEADDRESSAWARE option (2 GB otherwise)
Address space available to a 64-bit process | Impossible | 8 TB
Paged pool | 470 MB | 128 GB
Non-paged pool | 256 MB | 128 GB
System Page Table (PTE) | 660 MB - 900 MB | 128 GB
Table 3. Main memory limitations in Windows

As in Win32, the addressable memory range is divided into user and system addresses. Each process receives 8 TB and 8 TB remain for the system (compared with 2 GB and 2 GB respectively in Win32).

Different Windows versions have different limitations, shown in Table 4.

Physical memory and number of processors | 32-bit versions | 64-bit versions
Windows XP Home | 4 GB, 1 CPU | Not present
Windows XP Professional | 4 GB, 1-2 CPU | 128 GB, 1-2 CPU
Windows Server 2003, Standard | 4 GB, 1-4 CPU | 32 GB, 1-4 CPU
Windows Server 2003, Enterprise | 64 GB, 1-8 CPU | 1 TB, 1-8 CPU
Windows Server 2003, Datacenter | 64 GB, 8-32 CPU | 1 TB, 8-64 CPU
Windows Server 2008, Datacenter | 64 GB, 2-64 CPU | 2 TB, 2-64 CPU
Windows Server 2008, Enterprise | 64 GB, 1-8 CPU | 2 TB, 1-8 CPU
Windows Server 2008, Standard | 4 GB, 1-4 CPU | 32 GB, 1-4 CPU
Windows Server 2008, Web Server | 4 GB, 1-4 CPU | 32 GB, 1-4 CPU
Vista Home Basic | 4 GB, 1 CPU | 8 GB, 1 CPU
Vista Home Premium | 4 GB, 1-2 CPU | 16 GB, 1-2 CPU
Vista Business | 4 GB, 1-2 CPU | 128 GB, 1-2 CPU
Vista Enterprise | 4 GB, 1-2 CPU | 128 GB, 1-2 CPU
Vista Ultimate | 4 GB, 1-2 CPU | 128 GB, 1-2 CPU
Table 4. Limitations of different Windows versions

As in Win32, the page size is 4 KB. The first 64 KB of the address space are never mapped, i.e. the lowest valid address is 0x10000. Unlike in Win32, system DLLs are loaded above the 4 GB boundary.

All processors implementing AMD64 support the "No Execute" (NX) bit, which Windows uses to implement the hardware "Data Execution Prevention" (DEP) technology, which forbids executing user data as if it were code. This increases program safety by ruling out errors such as executing a data buffer as code.

A peculiarity of AMD64 compilers is that they can pass function parameters in registers very efficiently instead of using the stack. This allowed the Win64 designers to get rid of the variety of calling conventions. In Win32 you can use different conventions (ways of passing parameters): __stdcall, __cdecl, __fastcall, etc. In Win64 there is only one calling convention. Consider how four integer arguments are passed in registers:
• RCX: first argument
• RDX: second argument
• R8: third argument
• R9: fourth argument
Arguments after the first four integers are passed on the stack. For floating-point arguments, the registers XMM0-XMM3 are used, along with the stack.

The difference in calling conventions means that you cannot mix 64-bit and 32-bit code in one program. In other words, if an application is compiled for 64-bit mode, all the DLLs it uses must be 64-bit too.

When writing 64-bit code you can get an additional performance gain through special optimization. This question is considered in detail in the optimization guide [5].

3. Porting applications to AMD64
One of the purposes of high-level languages is to reduce as far as possible the dependence of program code on the architecture and to provide the greatest possible portability between hardware platforms. For example, correctly written C++ programs are theoretically independent of the hardware platform. Ideally, to build a 32-bit application for the AMD64 platform it is enough to change the compiler [6] and simply recompile the program.

In practice everything is more complicated. Software using assembler code for 32-bit processors still exists, and many programs written in high-level languages contain assembler blocks. That is why it is often impossible simply to recompile a large project. The ways out are clear. Firstly, you can decline to port the application to the new platform. This can be a very reasonable solution because, for example, the Windows family of operating systems provides good backward compatibility thanks to the WOW64 technology. The second option is to rewrite the program code.
Moreover, it seems reasonable to rewrite it in high-level languages. Note, by the way, that the Visual C++ compiler no longer supports inline assembler blocks when compiling 64-bit code [7].

The presence of assembler code is not the only obstacle we face when mastering 64-bit systems. When programs are ported to 64-bit systems, various errors occur that are related to the change of the data model (the sizes of types). What is more, some errors show up only when using large amounts of memory that were unavailable on 32-bit systems. Such errors are described in detail in the article "20 issues of porting C++ code on the 64-bit platform" [8].

Everything said above relates mostly to C/C++ applications. The situation is better with managed code (C#), although some small problems can be met here as well. Unfortunately, large software systems are often built using libraries written in C/C++. That is why a large C# project most likely contains C/C++ modules or libraries which can be unsafe and contain vulnerabilities.

For testing and checking program code ported to a 64-bit platform you can use various special methods and tools [9]. For example, static analyzers such as Viva64 (for Windows systems) and PC-Lint (for Unix systems) can give good results. To learn more about these tools, read the article "Comparison of analyzers' diagnostic abilities while testing 64-bit code" [10].
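A typical data-model error of the kind mentioned above is storing a pointer in a 32-bit integer type. The following Python sketch only models the arithmetic of such a cast. The data-model names (ILP32 for Win32, LLP64 for Win64, LP64 for typical 64-bit Unix systems) and the type sizes are general knowledge rather than something stated in this article, and the helper names and sample addresses are hypothetical.

```python
# Sizes in bytes of basic C/C++ types under common data models.
# Assumption: ILP32 corresponds to Win32, LLP64 to Win64; LP64 is listed for comparison.
DATA_MODELS = {
    "ILP32": {"int": 4, "long": 4, "pointer": 4},
    "LLP64": {"int": 4, "long": 4, "pointer": 8},
    "LP64":  {"int": 4, "long": 8, "pointer": 8},
}


def store_pointer_in(type_name: str, address: int, model: str) -> int:
    """Model casting a pointer to an unsigned integer type: bits that do not fit are lost."""
    bits = 8 * DATA_MODELS[model][type_name]
    return address & ((1 << bits) - 1)


def survives_round_trip(type_name: str, address: int, model: str) -> bool:
    """True if the address is unchanged after being stored in the given type."""
    return store_pointer_in(type_name, address, model) == address


low_address = 0x7FFE0000     # hypothetical address below 4 GB
high_address = 0x140001000   # hypothetical address above 4 GB

for address in (low_address, high_address):
    stored = store_pointer_in("int", address, "LLP64")
    print(hex(address), "->", hex(stored), survives_round_trip("int", address, "LLP64"))
```

The cast is harmless while all addresses stay below 4 GB, which is why such code may pass tests on small data and fail only once the program starts using the large memory that the 64-bit platform makes available.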
Conclusion
Undoubtedly, the AMD64 architecture offered by AMD turned out to be in demand on the market. Its advantage is that it lets you switch smoothly to 64-bit programs without losing compatibility with legacy 32-bit applications. But there is nothing revolutionary in AMD64. As experiments demonstrate, migrating 32-bit programs to AMD64 allows you, firstly, to solve much more memory-demanding tasks and, secondly, to get about a 10% performance gain "for free", without changing the code, thanks to the compiler optimizing the application for the new architecture.

We may conclude that the AMD64 architecture has postponed the problem of limited main-memory size for many years but has not solved the problem of increasing the performance of modern personal computers. The future still belongs to multi-core and multi-processor systems.

References
1. Intel Software Developer's Manual. Volume 1: Basic Architecture. http://www.viva64.com/go.php?url=212
2. AMD x86-64 Architecture Programmer's Manual. Volume 1: Application Programming. http://www.viva64.com/go.php?url=213
3. Mike Wall. Tricks for Porting Applications to 64-Bit Windows on AMD64 Architecture. http://www.viva64.com/go.php?url=214
4. Matt Pietrek. Everything You Need To Know To Start Programming 64-Bit Windows Systems. http://www.viva64.com/go.php?url=215
5. Software Optimization Guide for AMD Athlon 64 and AMD Opteron Processors. http://www.viva64.com/go.php?url=59
6. Compiler Usage Guidelines for 64-Bit Operating Systems on AMD64 Platforms. http://www.viva64.com/go.php?url=216
7. Daniel Pistelli. Moving to Windows Vista x64. http://www.viva64.com/go.php?url=217
8. Andrey Karpov, Evgeniy Ryzhkov. 20 issues of porting C++ code on the 64-bit platform. http://www.viva64.com/art-1-2-599168895.html
9. Andrey Karpov. Problems of testing 64-bit applications. http://www.viva64.com/art-1-2-1289354852.html
10. Andrey Karpov. Comparison of analyzers' diagnostic abilities while testing 64-bit code. http://www.viva64.com/art-1-2-914146540.html