SlideShare a Scribd company logo
1 of 40
Download to read offline
GPU-Assisted
Password Cracking
Who may need
 Password Recovery?
 Ordinary users (own passwords)
 IT Departments (employee’s passwords)
 Security auditors, consultants and
  penetration testers
 Law enforcement & government


 Hackers usually don’t!
Why speed counts?

Users and IT Departments:
ÂŤWe needed those passwords
yesterdayÂť
                Auditors, consultants
                     and pentesters:
                  ÂŤTime is MoneyÂť
How to increase speed?
Traditional way is to network together
many computers to form a cluster

• Communication
  overhead
• Difficult to manage
• Not power-efficient
Any other options?
Yes!
For many HPC applications GPUs
are many times faster
than CPUs
                But they’re not only
                    faster, they are
                           greener!
Why?
CPUs are designed to
be efficient at serial
computing…



                     …while GPU’s
                    main concern is
                  parallel computing
Intel® Core™ i7-965
   “The highest performing
  desktop processor on the
                   planet.”

                        4 cores
                       3,2 GHz
        731 million transistors
                      263 mm2
Memory Controller
IO




                                               IO
                        Q
      Core   Core                Core   Core
                        u
                        e
                        u
                        e
QPI




                                               QPI
                   L3 cache
                      8 Мb
             >384 million transistors
Out-of-
                  Memory Controller
                                Order
             Execution Units Scheduling
                                  &
                             Retirement
IO




                                                                                  IO
                                  Q




                      Ordering &
                      Execution
                                  u Instruction
      Core     Core                       Core                             Core


                       Memory
              L1
                                  e Decode &
             Data
                                    Microcode
                                  u
                                  e


                                     Branch Prediction

                                                         Inst Fetch & L1
QPI




                                                                                  QPI
                         Paging


                      L3 cache
               L2        8 Мb
                >384 million transistors
Memory Controller
IO




                                               IO
                        Q
      Core   Core                Core   Core
                        u
                        e
                        u
                        e
QPI




                                               QPI
                   L3 cache
                      8 Мb
             >384 million transistors
CPU dedicates only
 about 10% to the
   execution units!




    1/10
CPU dedicates only
 about 10% to the
   execution units!
NVIDIAÂŽ
 GeForceÂŽ GTX 285


240 cores
1.476 GHz
1.4 billion transistors
470 mm2
PCIe &
TPC         TPC   TPC      Memory      TPC         TPC
                          Controller



                           Thread
      ROP         Setup                      ROP
                          Dispatch




                           Memory
TPC         TPC   TPC                  TPC         TPC
                          Controller
PCIe &




                  Multiprocessor

                                   Multiprocessor

                                                    Multiprocessor
TPC         TPC       TPC             Memory                         TPC         TPC
                                     Controller



                                        Thread
      ROP         Setup                                                    ROP
                                       Dispatch




                      Texture Memory
                              Fetch &
TPC         TPC       TPC                                            TPC         TPC
                             Controller
                           Other
PCIe &
TPC         TPC   TPC      Memory      TPC         TPC
                          Controller



                           Thread
      ROP         Setup                      ROP
                          Dispatch




                           Memory
TPC         TPC   TPC                  TPC         TPC
                          Controller
GPU dedicates about
30% to the execution
units!

             1/3
GPU dedicates 6 times as many
 resources to the execution units
                        as CPU!




 183 Watts        6x130=780 Watts
  full load           full load
Performance
                          680
                  250
 LM
                 195
           32                                                                  S1070

                                                                               GTX 295
                                                                  2 600
                                        1 330
NTLM                                                                           GTX 285
                              795
            87
                                                                               Q6600
                                                 1 920
                                 920
MD5
                        570
           70


       0                        1 000           2 000                    3 000
                                                         Millions passwords per second
Performance per $
           85
                             521
 LM
                              557
                                                                                           S1070
                178

                                                                                           GTX 295
                      325
                                                                                  2 771
NTLM                                                                                       GTX 285
                                                                      2 271
                            483
                                                                                           Q6600
                 240
                                                              1 917
MD5
                                                     1 629
                       389


       0               500          1 000    1 500           2 000      2 500       3 000
                                                              Thousands passwords per $ per second
Performance per Watt
                   850
                   865
 LM
                      1 066
           305
                                                                                   S1070

                                                                                   GTX 295
                                                      3 250
                                                                          4 602
NTLM
                                                                                   GTX 285
                                                                      4 344
                   829
                                                                                   Q6600
                                      2 400
                                                   3 183
MD5
                                                  3 115
                 667


       0          1 000       2 000           3 000          4 000         5 000
                                                      Thousands passwords per watt per second
Bad News:
Not every algorithm is worth
offloading to GPU
                              MD4 / MD5 
GPU is good at computing   SHA-1 / SHA-2 
                                RIPEMD 
                                       
          but                      MD2
                                   AES 
                                   DES 
 GPU is bad at accessing           RC4 
random memory locations
Good News:


Humans love
  repetition
WPA-PSK
      Tesla S1070                                                                              52400


                                                                       31500
       HD4870x2


                                                       21700
         GTX 295


                                               15750
          HD4870


                                       12500
         GTX 285


                                       11800
         GTX 280


                        3100
Core 2 Quad Q6600


                    0          10000           20000           30000           40000   50000           60000
Other Accelerators?


Based on FPGA (Xilinx)
FireWire
Proprietary SDK




US $3’995
Single Unit Performance
45000
                40000
40000

35000

30000
        27000
25000
                                                      TACC1441
20000
                                                      GTX 285
                                16000
15000                   13500

10000
                                               5050
 5000                                   3050

    0
        PGPdisk 128     PGPdisk 256     Office 2007
US $3’995
US $3’995
Performance for $4K
500000
                 440000
450000

400000

350000

300000

250000                                                     TACC1441
                                                           11x GTX 285
200000                             176000

150000

100000
                                                   55550
 50000   27000
                           13500            3050
     0
         PGPdisk 128       PGPdisk 256      Office 2007
Greener Computing

• Consider a cluster of 25
  dual-CPU quad-core
  computers
• 400 watts full load each
• 10’000 watts total
Greener Computing

• Two Tesla S1070 provide
  same performance
• 800 watts full load each
• One computer for
  management
• 2’000 watts total
Greener Computing
• 8’000 watts saved
• 49’090 kWh a year (at 70% utilization)
• € 5’890 savings on electricity a year
  (at 0.12€ per kWh average rate)
• Prevents 27’500 kg CO2 emission
• Takes 5 cars off the roads
• Saves 2’300 trees/year
Thank You!
       Andrey Belenko
 a.belenko@elcomsoft.com

More Related Content

Viewers also liked

GPU Cracking - On the Cheap
GPU Cracking - On the CheapGPU Cracking - On the Cheap
GPU Cracking - On the Cheap
NetSPI
 
Attack All The Layers - What's Working in Penetration Testing
Attack All The Layers - What's Working in Penetration TestingAttack All The Layers - What's Working in Penetration Testing
Attack All The Layers - What's Working in Penetration Testing
NetSPI
 
Introduction to Windows Dictionary Attacks
Introduction to Windows Dictionary AttacksIntroduction to Windows Dictionary Attacks
Introduction to Windows Dictionary Attacks
NetSPI
 
Thick Application Penetration Testing - A Crash Course
Thick Application Penetration Testing - A Crash CourseThick Application Penetration Testing - A Crash Course
Thick Application Penetration Testing - A Crash Course
NetSPI
 
WTF is Penetration Testing
WTF is Penetration TestingWTF is Penetration Testing
WTF is Penetration Testing
NetSPI
 

Viewers also liked (12)

GPU Cracking - On the Cheap
GPU Cracking - On the CheapGPU Cracking - On the Cheap
GPU Cracking - On the Cheap
 
BSides MCR 2016: From CSV to CMD to qwerty
BSides MCR 2016: From CSV to CMD to qwertyBSides MCR 2016: From CSV to CMD to qwerty
BSides MCR 2016: From CSV to CMD to qwerty
 
Password Cracking with Rainbow Tables
Password Cracking with Rainbow TablesPassword Cracking with Rainbow Tables
Password Cracking with Rainbow Tables
 
Cyber security and ethical hacking 9
Cyber security and ethical hacking 9Cyber security and ethical hacking 9
Cyber security and ethical hacking 9
 
Attack All The Layers - What's Working in Penetration Testing
Attack All The Layers - What's Working in Penetration TestingAttack All The Layers - What's Working in Penetration Testing
Attack All The Layers - What's Working in Penetration Testing
 
Introduction to Windows Dictionary Attacks
Introduction to Windows Dictionary AttacksIntroduction to Windows Dictionary Attacks
Introduction to Windows Dictionary Attacks
 
Thick Application Penetration Testing - A Crash Course
Thick Application Penetration Testing - A Crash CourseThick Application Penetration Testing - A Crash Course
Thick Application Penetration Testing - A Crash Course
 
Application Risk Prioritization - Overview - Secure360 2015 - Part 1 of 2
Application Risk Prioritization - Overview - Secure360 2015 - Part 1 of 2Application Risk Prioritization - Overview - Secure360 2015 - Part 1 of 2
Application Risk Prioritization - Overview - Secure360 2015 - Part 1 of 2
 
Thick client application security assessment
Thick client  application security assessmentThick client  application security assessment
Thick client application security assessment
 
Password Attack
Password Attack Password Attack
Password Attack
 
Password Cracking
Password Cracking Password Cracking
Password Cracking
 
WTF is Penetration Testing
WTF is Penetration TestingWTF is Penetration Testing
WTF is Penetration Testing
 

Recently uploaded

Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Recently uploaded (20)

Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 

GPU Assisted Password Cracking (Andrey Belenko, Elcomsoft)

  • 2. Who may need Password Recovery?  Ordinary users (own passwords)  IT Departments (employee’s passwords)  Security auditors, consultants and penetration testers  Law enforcement & government  Hackers usually don’t!
  • 3. Why speed counts? Users and IT Departments: ÂŤWe needed those passwords yesterdayÂť Auditors, consultants and pentesters: ÂŤTime is MoneyÂť
  • 4. How to increase speed? Traditional way is to network together many computers to form a cluster • Communication overhead • Difficult to manage • Not power-efficient
  • 5.
  • 6.
  • 8. Yes! For many HPC applications GPUs are many times faster than CPUs But they’re not only faster, they are greener!
  • 10. CPUs are designed to be efficient at serial computing… …while GPU’s main concern is parallel computing
  • 11. IntelÂŽ Core™ i7-965 “The highest performing desktop processor on the planet.” 4 cores 3,2 GHz 731 million transistors 263 mm2
  • 12. Memory Controller IO IO Q Core Core Core Core u e u e QPI QPI L3 cache 8 Мb >384 million transistors
  • 13. Out-of- Memory Controller Order Execution Units Scheduling & Retirement IO IO Q Ordering & Execution u Instruction Core Core Core Core Memory L1 e Decode & Data Microcode u e Branch Prediction Inst Fetch & L1 QPI QPI Paging L3 cache L2 8 Мb >384 million transistors
  • 14. Memory Controller IO IO Q Core Core Core Core u e u e QPI QPI L3 cache 8 Мb >384 million transistors
  • 15. CPU dedicates only about 10% to the execution units! 1/10
  • 16. CPU dedicates only about 10% to the execution units!
  • 17. NVIDIAÂŽ GeForceÂŽ GTX 285 240 cores 1.476 GHz 1.4 billion transistors 470 mm2
  • 18. PCIe & TPC TPC TPC Memory TPC TPC Controller Thread ROP Setup ROP Dispatch Memory TPC TPC TPC TPC TPC Controller
  • 19. PCIe & Multiprocessor Multiprocessor Multiprocessor TPC TPC TPC Memory TPC TPC Controller Thread ROP Setup ROP Dispatch Texture Memory Fetch & TPC TPC TPC TPC TPC Controller Other
  • 20. PCIe & TPC TPC TPC Memory TPC TPC Controller Thread ROP Setup ROP Dispatch Memory TPC TPC TPC TPC TPC Controller
  • 21. GPU dedicates about 30% to the execution units! 1/3
  • 22. GPU dedicates 6 times as many resources to the execution units as CPU! 183 Watts 6x130=780 Watts full load full load
  • 23. Performance 680 250 LM 195 32 S1070 GTX 295 2 600 1 330 NTLM GTX 285 795 87 Q6600 1 920 920 MD5 570 70 0 1 000 2 000 3 000 Millions passwords per second
  • 24. Performance per $ 85 521 LM 557 S1070 178 GTX 295 325 2 771 NTLM GTX 285 2 271 483 Q6600 240 1 917 MD5 1 629 389 0 500 1 000 1 500 2 000 2 500 3 000 Thousands passwords per $ per second
  • 25. Performance per Watt 850 865 LM 1 066 305 S1070 GTX 295 3 250 4 602 NTLM GTX 285 4 344 829 Q6600 2 400 3 183 MD5 3 115 667 0 1 000 2 000 3 000 4 000 5 000 Thousands passwords per watt per second
  • 26. Bad News: Not every algorithm is worth offloading to GPU MD4 / MD5  GPU is good at computing SHA-1 / SHA-2  RIPEMD   but MD2 AES  DES  GPU is bad at accessing RC4  random memory locations
  • 28.
  • 29. WPA-PSK Tesla S1070 52400 31500 HD4870x2 21700 GTX 295 15750 HD4870 12500 GTX 285 11800 GTX 280 3100 Core 2 Quad Q6600 0 10000 20000 30000 40000 50000 60000
  • 30. Other Accelerators? Based on FPGA (Xilinx) FireWire Proprietary SDK US $3’995
  • 31. Single Unit Performance 45000 40000 40000 35000 30000 27000 25000 TACC1441 20000 GTX 285 16000 15000 13500 10000 5050 5000 3050 0 PGPdisk 128 PGPdisk 256 Office 2007
  • 34. Performance for $4K 500000 440000 450000 400000 350000 300000 250000 TACC1441 11x GTX 285 200000 176000 150000 100000 55550 50000 27000 13500 3050 0 PGPdisk 128 PGPdisk 256 Office 2007
  • 35. Greener Computing • Consider a cluster of 25 dual-CPU quad-core computers • 400 watts full load each • 10’000 watts total
  • 36. Greener Computing • Two Tesla S1070 provide same performance • 800 watts full load each • One computer for management • 2’000 watts total
  • 37.
  • 38.
  • 39. Greener Computing • 8’000 watts saved • 49’090 kWh a year (at 70% utilization) • € 5’890 savings on electricity a year (at 0.12€ per kWh average rate) • Prevents 27’500 kg CO2 emission • Takes 5 cars off the roads • Saves 2’300 trees/year
  • 40. Thank You! Andrey Belenko a.belenko@elcomsoft.com