Invalidating Copyright Infringement Claims with Python and Fuzzy Hashing
Joe T. Sylve, M.S.

Managing Partner
504ENSICS Labs
Background
• Client was being sued for Copyright Infringement
• Client’s lawyer wanted two questions answered
• Does the code contain any open source or GPL code?
• When was the code in question written?

• Code was written in PHP (web-based application)
• Code had absolutely no comments
• No copyright headers
• No dates of any kind

www.504ensics.com
Goal
• If it can be proven that the code contains open source or GPL code with restrictive licenses, then the claim is invalid
• If it can be proven that the copyrighted code on file was written after the author’s claimed “creation date”, the copyright is invalid

Is code original?
• No comments or headers that would imply authorship
• Code didn’t look familiar
• Code was kind of crappy

Step 1 – Acquire Samples
• Wrote Python script to download all projects written in PHP from GitHub
• Scraped from search feature
• Limited to 50 pages of search

• Got something like 10GB of compressed code
• ~100,000 files
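
A minimal sketch of how the paged search could be enumerated. The slides do not show the original script, so everything here is an assumption: GitHub's REST search endpoint stands in for the scraped search pages, and `search_page_urls`, `archive_url`, `per_page`, and the branch name are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: the REST search endpoint replaces the scraped search feature.
SEARCH_ENDPOINT = "https://api.github.com/search/repositories"
MAX_PAGES = 50  # page limit mentioned on the slide


def search_page_urls(language="PHP", per_page=20, max_pages=MAX_PAGES):
    """Return one search URL per result page, up to max_pages."""
    urls = []
    for page in range(1, max_pages + 1):
        query = urlencode(
            {"q": f"language:{language}", "page": page, "per_page": per_page}
        )
        urls.append(f"{SEARCH_ENDPOINT}?{query}")
    return urls


def archive_url(full_name, branch="master"):
    """Build a zip download URL for a repository such as 'owner/project'.

    The branch name is a guess; real repositories may use another default.
    """
    return f"https://github.com/{full_name}/archive/refs/heads/{branch}.zip"
```

Each search URL would be fetched in turn, and every repository named in the results downloaded through `archive_url` for later comparison.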

Step 2 – Compare Code
• Three Options
• Manual Verification
• Grad Students, Interns, etc

• Cryptographic Hashing
• MD5, SHA-1, etc

• “Fuzzy” Hashing
• ssdeep, sdhash

Fuzzy Hashing
• Vassil says I have to call it “Approximate Matching”
• sdhash
• Vassil Roussev & Candice Quates
• Free, Open Source
• Awesome

• Traditional hashing
• If a single bit of the input changes, the whole hash
changes

• Fuzzy Hashing
• Compares files and gives similarity index
• Can find “similar” files
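
The contrast between the two kinds of hashing can be sketched with the standard library alone. This is an illustration, not the tooling used in the case: `difflib.SequenceMatcher` is swapped in as a stand-in for a fuzzy-hash comparison, and the two PHP snippets and the helper names are made up for the example.

```python
import hashlib
from difflib import SequenceMatcher

# Two hypothetical files that differ only in a variable name.
original = "<?php function plus_one($x) { return $x + 1; } ?>"
modified = "<?php function plus_one($y) { return $y + 1; } ?>"


def sha1_hex(text):
    """Cryptographic hash: any change to the input yields a different digest."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def similarity(a, b):
    """Similarity score from 0 to 100.

    Stand-in for a fuzzy-hash comparison; this is not ssdeep or sdhash.
    """
    return round(SequenceMatcher(None, a, b).ratio() * 100)


exact_match = sha1_hex(original) == sha1_hex(modified)  # digests differ
score = similarity(original, modified)                  # high, but below 100
```

The cryptographic digests only say "different", while the similarity score still flags the two files as near-copies worth a manual look.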
When was code written?
• We can invalidate copyright if the sample on file
was written after the claimed authorship date
• No comments or dates of any kind in the code!
• No access to developer’s workstation to do
traditional forensics
• ???

PHP
• Web-based language
• Updated reasonably frequently
• New Features added often
• Goal
• Determine which features were used in the code
• Correlate features with PHP release date
• Code couldn’t have been written before this date

Step 1 – Function Use
• Programmer can create own functions or use ones
available in the language
• Example:
• function plus_one($x) { return $x + 1; }

• Python script to find all function declarations and
calls
• Ignore declared functions
• Left with a list of language “features” used
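
A rough sketch of this step, assuming a regex-based scan is good enough (the slides do not show the original script). The patterns, the `CONSTRUCTS` list, and `builtin_calls` are hypothetical, and the scan does not skip string literals, so call-like text inside strings would need extra filtering.

```python
import re

# Matches "function name(" declarations, including by-reference returns.
DECLARATION = re.compile(r"\bfunction\s+&?\s*([A-Za-z_]\w*)\s*\(")
# Matches "name(" unless it is a variable, a method call, or a static call.
CALL = re.compile(r"(?<![\w$>:])([A-Za-z_]\w*)\s*\(")
# Language constructs that look like calls but are not functions.
CONSTRUCTS = {
    "if", "elseif", "while", "for", "foreach", "switch", "catch", "function",
    "array", "list", "isset", "empty", "unset", "echo", "print", "return",
}


def builtin_calls(php_source):
    """Return called function names that the source does not declare itself."""
    # PHP function names are case-insensitive, so compare in lower case.
    declared = {name.lower() for name in DECLARATION.findall(php_source)}
    called = {name.lower() for name in CALL.findall(php_source)}
    return sorted(called - declared - CONSTRUCTS)


sample = """<?php
function plus_one($x) { return $x + 1; }
$data = json_encode(array('n' => plus_one(strlen($s))));
if (isset($data)) { echo strtoupper($data); }
"""
features = builtin_calls(sample)
```

Here `plus_one` is dropped because the sample declares it, leaving only functions that must come from the language or its extensions.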

Step 2 – Version Detection
• PHP comes with auto-generated documentation
about each built-in function
• Documentation states the PHP version in which each function first became available
• Wrote Python script to scrape PHP documentation
• Correlate functions with PHP versions
• We only care about the function with the newest
version
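
Once the documentation has been scraped, picking the newest requirement is a small comparison. The sketch below assumes the scrape produced a name-to-version table; the `INTRODUCED_IN` entries are illustrative placeholders for that table, and `parse_version` and `newest_requirement` are hypothetical helpers.

```python
# Illustrative placeholder for the scraped table: function name -> first version.
INTRODUCED_IN = {
    "strlen": "4.0.0",
    "json_encode": "5.2.0",
    "array_column": "5.5.0",
}


def parse_version(text):
    """Turn '5.1.5' into (5, 1, 5) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def newest_requirement(functions_used, introduced_in=INTRODUCED_IN):
    """Return (function, version) for the most recent version any call needs.

    Functions missing from the table are skipped; None means nothing matched.
    """
    known = [
        (parse_version(introduced_in[name]), name)
        for name in functions_used
        if name in introduced_in
    ]
    if not known:
        return None
    version, name = max(known)
    return name, ".".join(str(part) for part in version)


requirement = newest_requirement(["strlen", "json_encode", "my_helper"])
```

Comparing tuples rather than strings matters because "5.10.0" sorts before "5.9.0" as text but after it as a version.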

Step 3 – Date the code
• PHP has an archive of release notes on its website
• Contains release versions and dates
• Python script scrapes release notes for the PHP
version of interest and gives us the release date
• Reasonably, the code couldn’t have been written
before that date
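
The final comparison can be sketched as a date lookup. The single `RELEASE_DATES` entry is the PHP 5.1.5 release date reported in this deck; the table itself stands in for the scraped release notes, and the claimed creation dates used below are hypothetical, since the slides do not state the real one.

```python
from datetime import date

# Stand-in for the scraped release notes; 5.1.5 is the date given in the deck.
RELEASE_DATES = {"5.1.5": date(2006, 8, 17)}


def earliest_possible(version, table=RELEASE_DATES):
    """Code using a feature from this version cannot predate its release."""
    return table[version]


def contradicts_claim(version, claimed_creation):
    """True when the required PHP release came out after the claimed date."""
    return earliest_possible(version) > claimed_creation


# Hypothetical claimed creation dates, for illustration only.
claim_before_release = contradicts_claim("5.1.5", date(2005, 1, 1))
claim_after_release = contradicts_claim("5.1.5", date(2007, 1, 1))
```

A `True` result means the code depends on a release that did not yet exist on the claimed date, which is the argument the slides make.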

Step 4 – Profit
• Win!
• Code in question used features first available in
PHP 5.1.5
• Release date 17-Aug-2006
• This was after the claimed creation date

Conclusion
• Sometimes you can’t depend solely on existing
tools
• Learn to program even if you’re not a
“programmer”
• PHP sucks
• Fuzzy hashing and Python are cool
