The document discusses the evolution of computer architectures from early technological achievements such as the transistor and the integrated circuit, and describes transistor densities increasing in line with Moore's Law. Because heat dissipation limits further clock-speed increases, future performance gains will come from parallelism: rising core counts at lower supply voltages rather than faster clocks. The document also outlines the challenges of scaling to exascale systems by 2018.
7. Power Density
[Chart: power density in Watts/cm², log scale from 1 to 1000, across process generations for the i386, i486, Pentium®, Pentium® Pro, Pentium® II, Pentium® III and Pentium® 4, with reference lines for a hot plate, a rocket nozzle, a nuclear reactor and the Sun's surface.]
* “New Microarchitecture Challenges in the Coming Generations of CMOS Process Technologies” – Fred Pollack, Intel Corp. Micro32 conference keynote, 1999.
9. Technology Outlook (Shekhar Borkar, Micro37)

High Volume Manufacturing   2004   2006   2008   2010   2012   2014   2016   2018
Technology Node (nm)          90     65     45     32     22     16     11      8
Integration Capacity (BT)      2      4      8     16     32     64    128    256
Delay = CV/I scaling         0.7   ~0.7   >0.7   (delay scaling will slow down)
Energy/Logic Op scaling    >0.35   >0.5   >0.5   (energy scaling will slow down)
Bulk Planar CMOS            High Probability  →  Low Probability
Alternate, 3G etc.          Low Probability   →  High Probability
Variability                 Medium  →  High  →  Very High
ILD (K)                       ~3     <3    (reduce slowly towards 2-2.5)
RC Delay                       1      1      1      1      1      1      1      1
Metal Layers                 6-7    7-8    8-9    (0.5 to 1 layer per generation)
10. We have seen increasing numbers of gates on a chip and increasing clock speeds. Heat is becoming an unmanageable problem: Intel processors now dissipate more than 100 watts. We will not see dramatic increases in clock speed in the future; however, the number of gates on a chip will continue to grow. Rather than packing more transistors into a single core and pushing its clock rate higher, designs will lower the voltage and spend the extra transistors on more cores. [Diagram: evolution from a single core with cache, to dual- and quad-core chips sharing a cache, to many-core chips.]
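The voltage-versus-clock trade-off above follows from the standard dynamic-power relation for CMOS, P ≈ C·V²·f. A minimal sketch (all numbers are illustrative assumptions, not from the slides) of why two slower, lower-voltage cores can match one fast core's throughput at lower power:

```python
# Back-of-envelope sketch: dynamic CMOS switching power scales as
# P ~ C * V^2 * f, so spending transistors on extra cores at lower
# voltage/frequency beats raising the clock. Numbers are illustrative.

def dynamic_power(capacitance, voltage, frequency):
    """Dynamic switching power: P = C * V^2 * f (normalized units)."""
    return capacitance * voltage ** 2 * frequency

# One core at full voltage and full clock.
single = dynamic_power(capacitance=1.0, voltage=1.0, frequency=1.0)

# Two cores, each at 85% voltage and half the clock: the same
# aggregate throughput under ideal parallel scaling (2 * 0.5 = 1.0).
dual = 2 * dynamic_power(capacitance=1.0, voltage=0.85, frequency=0.5)

print(f"single fast core : {single:.4f}")  # 1.0000
print(f"two slower cores : {dual:.4f}")    # 0.7225
```

The catch, of course, is that the workload must actually be parallelizable to realize the "2 × 0.5" aggregate throughput.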
17. BSC-CNS and international initiatives: IESP. Build an international plan for developing the next generation of open-source software for scientific high-performance computing. Improve the world's simulation and modeling capability by improving the coordination and development of the HPC software environment.
19. Education for Parallel Programming: from multi-core to many-core programming, and eventually massively parallel programming for everyone — as the games industry already practices. [Image: a multicore-based pacifier.]
22. In 50 Years ... ENIAC, Eckert & Mauchly, 1946 ... 18,000 vacuum tubes. Pentium III playing a DVD, 1998 ... 24 M transistors.
23. Technology Trends: Microprocessor Capacity. Gordon Moore (co-founder of Intel) predicted in 1965 that the transistor density of semiconductor chips would double roughly every 18 months — a trend now called “Moore's Law”: 2x transistors per chip every 1.5 years. Microprocessors have become smaller, denser, and more powerful, and the same exponential trends apply not just to processors but also to bandwidth, storage, and more.
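The doubling arithmetic can be checked against the deck's own data points. A hedged sketch — the 1971 Intel 4004 baseline (~2,300 transistors) is an assumption added for illustration, and the comparison target is the 24 M-transistor Pentium III of 1998 mentioned two slides earlier:

```python
# Sketch of Moore's Law doubling arithmetic. The slide states 2x
# transistors every 1.5 years; the 1971 Intel 4004 baseline (~2,300
# transistors) is an illustrative assumption, not from the deck.

def transistors(start_count, start_year, year, doubling_years):
    """Project a transistor count under a fixed doubling period."""
    return start_count * 2 ** ((year - start_year) / doubling_years)

# Projecting the 4004 forward to 1998 (Pentium III era, 24 M
# transistors per the deck): an 18-month doubling overshoots badly,
# while the ~2-year doubling Moore settled on in 1975 lands close.
print(f"1.5-year doubling: {transistors(2300, 1971, 1998, 1.5):,.0f}")
print(f"2-year doubling  : {transistors(2300, 1971, 1998, 2.0):,.0f}")
```

This is a useful sanity check on the "every 18 months" figure often quoted: for logic transistor counts, the two-year period fits the historical record better.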
29. MareIncognito: Project structure
Applications — 4 relevant apps: Materials: SIESTA; Geophysics imaging: RTM; Comp. Mechanics: ALYA; Plasma: EUTERPE; plus general kernels.
Performance analysis tools — automatic analysis; coarse/fine grain prediction; sampling; clustering; integration with Peekperf.
Interconnect — contention, collectives; overlap of computation/communication; slimmed networks; direct versus indirect networks.
Processor and node — contribution to new Cell design; support for programming model; support for load balancing; support for performance tools; issues for future processors.
Load balancing — coordinated scheduling: run time, process, job; power efficiency.
Programming models — StarSs: CellSs, SMPSs; OpenMP++; MPI + OpenMP/StarSs.
Models and prototype.
Access latency for main memory, even using a modern SDRAM with a CAS latency of 2, will typically be around 9 cycles of the **memory system clock** -- the sum of:
- the latency between the FSB and the chipset (Northbridge) (± 1 clock cycle)
- the latency between the chipset and the DRAM (± 1 clock cycle)
- the RAS-to-CAS latency (2-3 clocks, charging the right row)
- the CAS latency (2-3 clocks, getting the right column)
- 1 cycle to transfer the data
- the latency to get the data back from the DRAM output buffer to the CPU (via the chipset) (± 2 clock cycles)
Assuming a typical 133 MHz SDRAM memory system (e.g. PC133 or DDR266/PC2100) and a 1.3 GHz processor, this makes 9 × 10 = 90 cycles of the CPU clock to access main memory! Yikes, you say! And it gets worse – a 1.6 GHz processor would take it to 108 cycles, a 2.0 GHz processor to 135 cycles, and even if the memory system were increased to 166 MHz (and still stayed CL2), a 3.0 GHz processor would wait a staggering 162 cycles! Caches make the memory system seem almost as fast as the L1 cache, yet as large as main memory. A modern primary (L1) cache has a latency of just two or three **processor cycles**, dozens of times faster than accessing main memory, and modern primary caches achieve hit rates of around 90% for most applications. So 90% of the time, accessing memory takes only a couple of cycles. Good overview: http://www.pattosoft.com.au/Articles/ModernMicroprocessors/
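The latency arithmetic above can be sketched directly. The component breakdown and clock ratios come from the text; the final AMAT line uses the standard average-memory-access-time model (hit time + miss rate × miss penalty), added here as an illustration and not taken from the source:

```python
# Sketch of the main-memory latency arithmetic from the text:
# ~9 memory-bus cycles per access, converted to CPU cycles via the
# CPU-to-memory clock ratio. (The text rounds 1300/133 to 10.)

def memory_latency_cpu_cycles(cpu_mhz, mem_mhz, mem_cycles=9):
    """CPU cycles spent on one main-memory access (CL2 SDRAM)."""
    return mem_cycles * cpu_mhz / mem_mhz

# The examples from the text (exact ratios, before rounding):
print(memory_latency_cpu_cycles(1300, 133))   # ~88, rounded to 90
print(memory_latency_cpu_cycles(2000, 133))   # ~135
print(memory_latency_cpu_cycles(3000, 166))   # ~163

# Average memory access time with a 2-cycle L1 and a 90% hit rate,
# using the standard model AMAT = hit_time + miss_rate * miss_penalty:
amat = 2 + 0.10 * memory_latency_cpu_cycles(1300, 133)
print(f"AMAT: {amat:.1f} CPU cycles")
```

This makes the text's closing point concrete: with a 90% hit rate, the *average* access costs an order of magnitude fewer cycles than a raw main-memory access.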
It is the conclusion of this TTA that, in the very near future (in fact some early examples are clearly in evidence right now), virtual worlds will extend their reach well beyond their current subject matter of online fantasy gaming to incorporate all manner of business and commerce. This evolution will quickly encompass many industries and business processes in which IBM has traditionally had a significant business interest. In the education industry, it is not at all a stretch to imagine a university physics professor convening a kinematics lecture in a virtual world in which the professor could alter the force of gravity and move large, virtual objects to demonstrate environments on other planets. Closer to our industry, an IBM Industry Solution sales specialist could arrange to meet a client in a virtual world populated by highly realistic (virtual) world venues containing software solutions created by IBM and select business partners. In these virtual sales worlds, clients would interact with the solutions in the same manner as real-world users, exploiting all the solution's functional capacities. For example, a virtual mobile workforce solution could be demonstrated from multiple perspectives in the context of real business scenarios - the control center, the mobile vehicle, etc. The solution demonstration would totally immerse the client in the solution experience, thereby creating an unparalleled selling tool. The possibilities are limitless. From top left, clockwise: (1) World of Warcraft: A Tavern. This is a symbolic representation of commerce and advertising within games. Many people run their own businesses within virtual worlds, trading both virtual and real items for virtual and real currencies. Microsoft's acquisition of Massive Inc.
has also now secured them a huge advertising ecosystem of game development companies, advertising agencies and leading brands, using online video games as another advertising channel for directed and personalized ads and product-placement deals. The tavern represents the real-world metaphors that build community within virtual worlds, much as the 18th-century coffee houses led to the formation of stock exchanges. Incidentally, there is a game advertising summit in San Francisco, June 9th, 2006. (2) Hazmat: Hotzone: a project based at the Entertainment Technology Center at Carnegie Mellon University, is one of the earliest serious-game projects and now has several scenarios up and running using Unreal Tournament-based graphics and gameplay. Intended users: fire-department personnel who handle HazMat response. HazMat uses multiplayer gaming technology and augmented communication practices to assist with the team-based training vital to HazMat and other disaster-response practices. (3) Virtual Iraq: Not only is the army using virtual-world simulations for the training of troops and engagement planning, but also for the treatment of Post-Traumatic Stress Disorder (PTSD), through the ability to “relive” traumatic events in simulation. (http://www.washingtonpost.com/ac2/wp-dyn/A58360-2005Mar22?language=printer) (4) Simulation of forest-fire disasters and how to combat them. (5) Virtual Acropolis: an example of using virtual environments as an educational and research tool for the humanities, in this case ancient history. Highly detailed models, created collaboratively by historians and researchers, capture world heritage sites for a variety of uses, including tourism, education, and simulation of “what-if” scenarios. Imagine teaching the history of a famous era or battle by immersing the student in a highly realistic simulation complete with the architecture, artifacts and even populace of the period.
These may also help the study of social history and sociological development and evolution via large-scale community participation. (6) Food Force: From the United Nations World Food Programme (WFP), Food Force is an educational video game telling the story of a hunger crisis on the fictitious island of Sheylan. Comprising 6 mini-games or “missions”, the game takes young players from an initial crisis assessment through to delivery and distribution of food aid, with each sequential mission addressing a particular aspect of this challenging process. (http://www.food-force.com/) (7) Yourself!Fitness: a complete fitness program on a disc - exercise, diet, motivation, and fitness tracking are all included. Your host is Maya, a dynamically generated digital personality who guides you through all aspects of the application. You need nothing more than an Xbox and a television set to partake. (http://www.yourselffitness.com/) (8) Pulse!!: a virtual clinical learning lab and simulation for training first responders, and medical and nursing students, in treatments. (http://www.businessweek.com/innovate/content/apr2006/id20060410_051875.htm?chan=innovation_game+room_features) (10) Another picture of World of Warcraft, to illustrate the breadth, diversity and scale of virtual environments. It is easy to take for granted that this huge architectural vista and the tavern above are parts of a single virtual world; rendering both within WoW challenges the engine to deal with a broad spectrum of conditions. Why is this important? It means that the same middleware engine can be used for a broad variety of simulation environments and applications these days, rather than purpose-built or specialized simulations for specific scenarios, and these engines are configurable through XML and scripting mechanisms.
(centre) Google Earth: now being offered as Enterprise Services for a variety of applications including real estate, architecture and engineering, insurance, and media. Google's provision of 3D modelling tools and an open repository for free is a significant step toward making Google Earth a platform for application development, using it as a visualization engine and the MySpace of the future. NEED FOR STANDARDS: Multiple virtual worlds - interconnected and interdependent, yet independently operated - require open standard interfaces to allow: avatar portability; property portability; security; metering, billing, separations, settlements; distributed problem determination; distributed systems management.
(Please note - this slide includes 2 animation steps) An exciting question to ask is: where is this research heading? In this slide you can see what is probably a familiar chart depicting the progress that has been made in supercomputing since the early 90s. (At each time point, the green line shows the 500th-fastest supercomputer, the dark blue line the fastest supercomputer, and the light blue line the summed power of the top 500 machines.) These lines show a clear trend, which we've extrapolated out 10 years. [ANIMATE SLIDE] The IBM team's latest simulation results fall here on the graph. These latest results represent a model at about 4.5 percent of the scale of the cerebral cortex, which was run at 1/83 of real time. The machine used provided 144 TB of memory and 0.5 PFLOP/s. [ANIMATE SLIDE] Turning to the future, you can see that running human-scale cortical simulations will require 4 PB of memory, and running them in real time will require over 1 EFLOP/s. If the current trends in supercomputing continue, however, the IBM team believes they will have the ability to perform such simulations in the not-too-distant future.
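The slide's extrapolation can be checked with back-of-envelope arithmetic using only the figures quoted in the notes (4.5% of cortex scale, 1/83 of real time, 144 TB, 0.5 PFLOP/s), assuming memory and compute scale roughly linearly with model size and simulation speed:

```python
# Back-of-envelope check of the slide's extrapolation, using only
# the figures quoted in the notes. Assumes memory and compute scale
# linearly with model size and simulation speed (a simplification).

scale = 0.045     # fraction of the cerebral cortex simulated
slowdown = 83     # the run executed at 1/83 of real time
mem_tb = 144      # memory used by the reported run (TB)
pflops = 0.5      # sustained compute of the reported run (PFLOP/s)

# Full-scale memory: 144 TB / 0.045 ~ 3.2 PB (the slide rounds up
# to 4 PB).
full_mem_pb = mem_tb / scale / 1000

# Full-scale, real-time compute: 0.5 * 83 / 0.045 ~ 922 PFLOP/s,
# i.e. on the order of the 1 EFLOP/s the slide states.
full_pflops = pflops * slowdown / scale

print(f"memory : {full_mem_pb:.1f} PB")
print(f"compute: {full_pflops:.0f} PFLOP/s")
```

The linear-scaling assumption is generous (communication costs typically grow faster than linearly), which is consistent with the slide quoting "over" 1 EFLOP/s.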