SlideShare a Scribd company logo
1 of 35
Download to read offline
Analyzing Pwned
Passwords with Spark
Kelley Robinson
@kelleyrobinson
Developer Evangelist
+
BIG DATA & SECURITY @KELLEYROBINSON
Spark: then and now
The state of passwords
Spark in action
Big Data ∩ Security
BIG DATA & SECURITY @KELLEYROBINSON
BIG DATA & SECURITY @KELLEYROBINSON
BIG DATA & SECURITY @KELLEYROBINSON
Apache Spark Ecosystem
BIG DATA & SECURITY @KELLEYROBINSON
Spark Abstractions
Then
Now
RDD (Resilient Distributed Dataset)
DataFrames / Datasets
https://databricks.com/blog/2016/07/14/a-tale-of-three-apache-spark-apis-rdds-dataframes-and-datasets.html
@KELLEYROBINSONBIG DATA & SECURITY
RDDs
• Immutable & distributed
collection
• Unstructured data
• Low-level transformation
and control
BIG DATA & SECURITY @KELLEYROBINSON
BIG DATA & SECURITY
https://databricks.gitbooks.io/databricks-spark-knowledge-base/content/best_practices/prefer_reducebykey_over_groupbykey.html
@KELLEYROBINSON
https://databricks.com/blog/2016/07/14/a-tale-of-three-apache-spark-apis-rdds-dataframes-and-datasets.html
@KELLEYROBINSONBIG DATA & SECURITY
Datasets
• Structured data
• Strongly typed
• Fast
@KELLEYROBINSONBIG DATA & SECURITY
Datasets
• Structured data
• Strongly typed
• Fast
• SQL DSLs
BIG DATA & SECURITY @KELLEYROBINSON
Apache Spark Ecosystem
BIG DATA & SECURITY @KELLEYROBINSON
Scala has the most
robust language API
BIG DATA & SECURITY
https://www.slideshare.net/databricks/composable-parallel-processing-in-apache-spark-and-weld
@KELLEYROBINSON
BIG DATA & SECURITY @KELLEYROBINSON
Spark: then and now
The state of passwords
Spark in action
Big Data ∩ Security
@KELLEYROBINSONBIG DATA & SECURITY
Spark: then and now
The state of passwords
Spark in action
Big Data ∩ Security
https://twitter.com/dog_rates/status/986762231290490881
Benefits
Fast
Flexible
Good for exploration
Proven for large systems
BIG DATA & SECURITY @KELLEYROBINSON
Challenges
Opaque error messages
Operationalizing
Documentation
http://heather.miller.am/blog/launching-a-spark-cluster-part-1.html
BIG DATA & SECURITY @KELLEYROBINSON
BIG DATA & SECURITY
https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/
@KELLEYROBINSON
👍💯
The missing Spark documentation
BIG DATA & SECURITY @KELLEYROBINSON
Spark: then and now
The state of passwords
Spark in action
Big Data ∩ Security
BIG DATA & SECURITY @KELLEYROBINSON
@KELLEYROBINSON
BIG DATA & SECURITY
THANK YOU!
@kelleyrobinson
Spark Resources
• Apache Spark
• Jacek's Spark Documentation
• Zeppelin
• RDDs vs. Datasets
• Running Spark on a Cluster
Security Resources
• Pwned Passwords
• Reverse SHA1 hashes
• LastPass and 1Password
• 2FA Guides
@KELLEYROBINSONBIG DATA & SECURITY
Analyzing Pwned Passwords with Spark and Scala

More Related Content

What's hot

What's hot (7)

Connecting the Unconnected using GraphDB - Tel Aviv Summit 2018
Connecting the Unconnected using GraphDB - Tel Aviv Summit 2018Connecting the Unconnected using GraphDB - Tel Aviv Summit 2018
Connecting the Unconnected using GraphDB - Tel Aviv Summit 2018
 
It's all about the data - Tel Aviv Summit 2018
It's all about the data - Tel Aviv Summit 2018It's all about the data - Tel Aviv Summit 2018
It's all about the data - Tel Aviv Summit 2018
 
Build Machine Learning Models Quickly & Easily with Amazon SageMaker & Perisc...
Build Machine Learning Models Quickly & Easily with Amazon SageMaker & Perisc...Build Machine Learning Models Quickly & Easily with Amazon SageMaker & Perisc...
Build Machine Learning Models Quickly & Easily with Amazon SageMaker & Perisc...
 
Rapid Development using Serverless Infrastructure - Tel Aviv Summit 2018
Rapid Development using Serverless Infrastructure - Tel Aviv Summit 2018Rapid Development using Serverless Infrastructure - Tel Aviv Summit 2018
Rapid Development using Serverless Infrastructure - Tel Aviv Summit 2018
 
Nosql
NosqlNosql
Nosql
 
How to Extend your Office 365 Investment to AWS - WIN404 - re:Invent 2017
How to Extend your Office 365 Investment to AWS - WIN404 - re:Invent 2017How to Extend your Office 365 Investment to AWS - WIN404 - re:Invent 2017
How to Extend your Office 365 Investment to AWS - WIN404 - re:Invent 2017
 
Matt Badanes' talk from AWS + OWASP "Event-based Scanning for AMI enforcement...
Matt Badanes' talk from AWS + OWASP "Event-based Scanning for AMI enforcement...Matt Badanes' talk from AWS + OWASP "Event-based Scanning for AMI enforcement...
Matt Badanes' talk from AWS + OWASP "Event-based Scanning for AMI enforcement...
 

Similar to Analyzing Pwned Passwords with Spark and Scala

Jump Start on Apache® Spark™ 2.x with Databricks
Jump Start on Apache® Spark™ 2.x with Databricks Jump Start on Apache® Spark™ 2.x with Databricks
Jump Start on Apache® Spark™ 2.x with Databricks
Databricks
 
Jumpstart on Apache Spark 2.2 on Databricks
Jumpstart on Apache Spark 2.2 on DatabricksJumpstart on Apache Spark 2.2 on Databricks
Jumpstart on Apache Spark 2.2 on Databricks
Databricks
 
2018 02-08-what's-new-in-apache-spark-2.3
2018 02-08-what's-new-in-apache-spark-2.3 2018 02-08-what's-new-in-apache-spark-2.3
2018 02-08-what's-new-in-apache-spark-2.3
Chester Chen
 

Similar to Analyzing Pwned Passwords with Spark and Scala (20)

Don't Let the Spark Burn Your House: Perspectives on Securing Spark
Don't Let the Spark Burn Your House: Perspectives on Securing SparkDon't Let the Spark Burn Your House: Perspectives on Securing Spark
Don't Let the Spark Burn Your House: Perspectives on Securing Spark
 
Jump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on DatabricksJump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on Databricks
 
Data Con LA 2018 - A tale of two BI standards: Data warehouses and data lakes...
Data Con LA 2018 - A tale of two BI standards: Data warehouses and data lakes...Data Con LA 2018 - A tale of two BI standards: Data warehouses and data lakes...
Data Con LA 2018 - A tale of two BI standards: Data warehouses and data lakes...
 
Oracle Big Data SQL
Oracle Big Data SQLOracle Big Data SQL
Oracle Big Data SQL
 
Jump Start on Apache® Spark™ 2.x with Databricks
Jump Start on Apache® Spark™ 2.x with Databricks Jump Start on Apache® Spark™ 2.x with Databricks
Jump Start on Apache® Spark™ 2.x with Databricks
 
Jumpstart on Apache Spark 2.2 on Databricks
Jumpstart on Apache Spark 2.2 on DatabricksJumpstart on Apache Spark 2.2 on Databricks
Jumpstart on Apache Spark 2.2 on Databricks
 
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
 
Vote Early, Vote Often: From Napkin to Canvassing Application in a Single Wee...
Vote Early, Vote Often: From Napkin to Canvassing Application in a Single Wee...Vote Early, Vote Often: From Napkin to Canvassing Application in a Single Wee...
Vote Early, Vote Often: From Napkin to Canvassing Application in a Single Wee...
 
Spark + Hadoop Perfect together
Spark + Hadoop Perfect togetherSpark + Hadoop Perfect together
Spark + Hadoop Perfect together
 
Sparkler—Crawler on Apache Spark: Spark Summit East talk by Karanjeet Singh a...
Sparkler—Crawler on Apache Spark: Spark Summit East talk by Karanjeet Singh a...Sparkler—Crawler on Apache Spark: Spark Summit East talk by Karanjeet Singh a...
Sparkler—Crawler on Apache Spark: Spark Summit East talk by Karanjeet Singh a...
 
Sparkler at spark summit east 2017
Sparkler at spark summit east 2017Sparkler at spark summit east 2017
Sparkler at spark summit east 2017
 
Sparkler Presentation for Spark Summit East 2017
Sparkler Presentation for Spark Summit East 2017Sparkler Presentation for Spark Summit East 2017
Sparkler Presentation for Spark Summit East 2017
 
Chicago AWS user group meetup - May 2014 at Cohesive
Chicago AWS user group meetup - May 2014 at CohesiveChicago AWS user group meetup - May 2014 at Cohesive
Chicago AWS user group meetup - May 2014 at Cohesive
 
Chicago AWS user group meetup - May 2014 at Cohesive
Chicago AWS user group meetup - May 2014 at CohesiveChicago AWS user group meetup - May 2014 at Cohesive
Chicago AWS user group meetup - May 2014 at Cohesive
 
Data infrastructure architecture for medium size organization: tips for colle...
Data infrastructure architecture for medium size organization: tips for colle...Data infrastructure architecture for medium size organization: tips for colle...
Data infrastructure architecture for medium size organization: tips for colle...
 
The Great Lakes: How to Approach a Big Data Implementation
The Great Lakes: How to Approach a Big Data ImplementationThe Great Lakes: How to Approach a Big Data Implementation
The Great Lakes: How to Approach a Big Data Implementation
 
Rapid RESTful Web Applications with Apache Sling and Jackrabbit
Rapid RESTful Web Applications with Apache Sling and JackrabbitRapid RESTful Web Applications with Apache Sling and Jackrabbit
Rapid RESTful Web Applications with Apache Sling and Jackrabbit
 
EBS on Oracle Cloud
EBS on Oracle CloudEBS on Oracle Cloud
EBS on Oracle Cloud
 
2018 02-08-what's-new-in-apache-spark-2.3
2018 02-08-what's-new-in-apache-spark-2.3 2018 02-08-what's-new-in-apache-spark-2.3
2018 02-08-what's-new-in-apache-spark-2.3
 
2017 OpenWorld Keynote for Data Integration
2017 OpenWorld Keynote for Data Integration2017 OpenWorld Keynote for Data Integration
2017 OpenWorld Keynote for Data Integration
 

More from Kelley Robinson

More from Kelley Robinson (20)

Protecting your phone verification flow from fraud & abuse
Protecting your phone verification flow from fraud & abuseProtecting your phone verification flow from fraud & abuse
Protecting your phone verification flow from fraud & abuse
 
Preventing phone verification fraud (SMS pumping)
Preventing phone verification fraud (SMS pumping)Preventing phone verification fraud (SMS pumping)
Preventing phone verification fraud (SMS pumping)
 
Auth on the web: better authentication
Auth on the web: better authenticationAuth on the web: better authentication
Auth on the web: better authentication
 
WebAuthn
WebAuthnWebAuthn
WebAuthn
 
Introduction to Public Key Cryptography
Introduction to Public Key CryptographyIntroduction to Public Key Cryptography
Introduction to Public Key Cryptography
 
2FA in 2020 and Beyond
2FA in 2020 and Beyond2FA in 2020 and Beyond
2FA in 2020 and Beyond
 
Identiverse 2020 - Account Recovery with 2FA
Identiverse 2020 - Account Recovery with 2FAIdentiverse 2020 - Account Recovery with 2FA
Identiverse 2020 - Account Recovery with 2FA
 
Designing customer account recovery in a 2FA world
Designing customer account recovery in a 2FA worldDesigning customer account recovery in a 2FA world
Designing customer account recovery in a 2FA world
 
Introduction to SHAKEN/STIR
Introduction to SHAKEN/STIRIntroduction to SHAKEN/STIR
Introduction to SHAKEN/STIR
 
Intro to SHAKEN/STIR
Intro to SHAKEN/STIRIntro to SHAKEN/STIR
Intro to SHAKEN/STIR
 
PSD2, SCA, WTF?
PSD2, SCA, WTF?PSD2, SCA, WTF?
PSD2, SCA, WTF?
 
Building a Better Scala Community
Building a Better Scala CommunityBuilding a Better Scala Community
Building a Better Scala Community
 
BSides SF - Contact Center Authentication
BSides SF - Contact Center AuthenticationBSides SF - Contact Center Authentication
BSides SF - Contact Center Authentication
 
Communication @ Startups
Communication @ StartupsCommunication @ Startups
Communication @ Startups
 
Contact Center Authentication
Contact Center AuthenticationContact Center Authentication
Contact Center Authentication
 
Authentication Beyond SMS
Authentication Beyond SMSAuthentication Beyond SMS
Authentication Beyond SMS
 
BSides PDX - Threat Modeling Authentication
BSides PDX - Threat Modeling AuthenticationBSides PDX - Threat Modeling Authentication
BSides PDX - Threat Modeling Authentication
 
SIGNAL - Practical Cryptography
SIGNAL - Practical CryptographySIGNAL - Practical Cryptography
SIGNAL - Practical Cryptography
 
2FA Best Practices
2FA Best Practices2FA Best Practices
2FA Best Practices
 
Practical Cryptography
Practical CryptographyPractical Cryptography
Practical Cryptography
 

Recently uploaded

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
ssuser89054b
 
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
Epec Engineered Technologies
 
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills KuwaitKuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
jaanualu31
 
DeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesDeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakes
MayuraD1
 

Recently uploaded (20)

Rums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdfRums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdf
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
 
Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 
Computer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersComputer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to Computers
 
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
Hazard Identification (HAZID) vs. Hazard and Operability (HAZOP): A Comparati...
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
Air Compressor reciprocating single stage
Air Compressor reciprocating single stageAir Compressor reciprocating single stage
Air Compressor reciprocating single stage
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network Devices
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
 
Learn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic MarksLearn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic Marks
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torque
 
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptxA CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills KuwaitKuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
 
AIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech studentsAIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech students
 
DeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesDeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakes
 

Analyzing Pwned Passwords with Spark and Scala