Cantina content based approach to detect phishing websites


thestarlight92


CANTINA
A Content-Based Approach to Detecting Phishing Web Sites

ABSTRACT
• CANTINA is a content-based approach.
• Examines whether the content is legitimate or not.
• Detects phishing URLs and links.

INTRODUCTION
• Phishing
A kind of attack in which victims are tricked by spoofed emails and fraudulent web sites into giving up personal information.
• How many phishing sites are there?
9,255 unique phishing sites were reported in June 2006 alone.
• How much does phishing cost each year?
$1 billion to $2.8 billion per year.

EXISTING SYSTEM
• NetCraft (Surface Characteristics)
• SpoofGuard (Surface Characteristics and Blacklist)
• Cloudmark (Blacklist)

PROPOSED SYSTEM
• Detects phishing websites.
• Examines text-based content along with surface characteristics.
• Heuristics used alongside the text-based content include:
- Age of domain.
- Known images.
- Suspicious URL.
- Suspicious links.
• Detects phishing links in users' email.

TF-IDF ALGORITHM
• Term Frequency (TF)
– The number of times a given term appears in a specific document
– A measure of the importance of the term within that document
• Inverse Document Frequency (IDF)
– A measure of how rare a term is across an entire collection of documents: the more documents contain the term, the lower its IDF
• A high TF-IDF weight means a high TF in the given document and a low document frequency across the collection
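
The two measures above can be combined in a few lines. The following is a minimal sketch, not the formula used by CANTINA itself: it assumes the common formulation TF × log(N / DF), and the function names and the toy corpus are illustrative only.

```python
import math
from collections import Counter


def term_frequency(term, document):
    """Number of times `term` appears in a tokenized document."""
    return Counter(document)[term]


def inverse_document_frequency(term, corpus):
    """log(N / df): large when few documents in the corpus contain `term`."""
    containing = sum(1 for document in corpus if term in document)
    if containing == 0:
        return 0.0  # term never seen in the corpus; weighted 0 in this sketch
    return math.log(len(corpus) / containing)


def tf_idf(term, document, corpus):
    """TF-IDF weight of `term` in `document`, relative to `corpus`."""
    return term_frequency(term, document) * inverse_document_frequency(term, corpus)


# Toy corpus of already-tokenized documents (illustrative data only).
corpus = [
    ["ebay", "sign", "in", "ebay", "account"],
    ["bank", "account", "sign", "in"],
    ["news", "weather", "sign", "in"],
]
page = corpus[0]

print(tf_idf("ebay", page, corpus))     # frequent on this page, absent elsewhere
print(tf_idf("account", page, corpus))  # appears in two of the three documents
print(tf_idf("sign", page, corpus))     # appears in every document
```

In this toy corpus "ebay" gets the highest weight because it is frequent on the page and absent elsewhere, while "sign" gets a weight of zero because every document contains it.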

[Slides 7 and 8 are screenshots: REAL EBAY WEBPAGE and FAKE EBAY WEBPAGE]

MODULES
• Parsing the web pages
• Generating the lexical signature
• Testing Process
• Report Generation

Parsing the web pages
• Links, anchor tags, form tags, and attachments in the web page are turned into the corresponding Text Link, HTML Link, etc.
• Done by parsing the text of each page
• Uses the HTML Parser API, which extracts information from HTML code
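
As a rough illustration of this parsing step, the sketch below pulls links, form targets, and visible text out of a page. The slides name only an "HTML Parser API", so Python's standard `html.parser` module is used here as a stand-in; the class name `PageParser` and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Collects link targets, form actions, and visible text from a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.form_actions = []
        self.text_parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.links.append(attributes["href"])
        elif tag == "form":
            self.form_actions.append(attributes.get("action") or "")
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not script or style content.
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


# Hypothetical page used only to exercise the parser.
sample_html = """
<html><body>
  <h1>Sign in to your account</h1>
  <a href="http://example.com/help">Help</a>
  <form action="http://example.com/login"><input name="user"></form>
  <script>var x = 1;</script>
</body></html>
"""

parser = PageParser()
parser.feed(sample_html)
print(parser.links)
print(parser.form_actions)
print(" ".join(parser.text_parts))
```

The collected text is what a later step would tokenize for TF-IDF, while the links and form actions are the raw material for URL and link checks.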

Generating the lexical signature
• The TF-IDF algorithm is used to generate lexical signatures.
• The TF-IDF value is calculated for each word in a document.
• The words with the highest values are selected as the signature.
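
A lexical signature can be sketched as "rank the page's words by TF-IDF and keep the top few". The example below is one possible implementation under stated assumptions: the tokenizer, the add-one smoothing in the IDF term, the signature length of five, and the reference corpus are all choices made for illustration, not details given in the slides.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def lexical_signature(page_text, corpus_texts, size=5):
    """Return the `size` words of the page with the highest TF-IDF weight."""
    page_counts = Counter(tokenize(page_text))
    corpus_sets = [set(tokenize(text)) for text in corpus_texts]
    total = len(corpus_sets)

    def weight(word):
        document_frequency = sum(1 for words in corpus_sets if word in words)
        # Add-one smoothing so words absent from the corpus do not divide by zero.
        return page_counts[word] * math.log((1 + total) / (1 + document_frequency))

    # Sort by weight (descending), then alphabetically so ties are deterministic.
    ranked = sorted(page_counts, key=lambda word: (-weight(word), word))
    return ranked[:size]


# Illustrative page text and reference corpus.
page_text = "eBay sign in. Welcome to eBay. Enter your eBay user ID and password."
corpus_texts = [
    "Sign in to your bank account with your password",
    "Welcome to the weather and news site",
    "Enter your email and sign in",
]

print(lexical_signature(page_text, corpus_texts, size=3))
```

Words that are frequent on the page but rare in the reference corpus, such as the brand name, rise to the top, which is what makes the signature useful as a search query.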

Testing Process
• Feed the lexical signature to a search engine.
• Check whether the domain name of the current web page matches the domain name of any of the top N search results.
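
The domain check can be sketched as follows. The slides do not name a search engine or API, so the search results are passed in as a plain list of URLs; `domain_of`, `classify`, the default `top_n`, and the sample URLs are hypothetical. For simplicity the sketch compares full host names (minus a leading "www."), which is stricter than comparing registered domains.

```python
from urllib.parse import urlparse


def domain_of(url):
    """Return the lowercased host name of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def classify(page_url, search_result_urls, top_n=10):
    """Label a page by whether its domain appears in the top N search results."""
    page_domain = domain_of(page_url)
    top_domains = {domain_of(url) for url in search_result_urls[:top_n]}
    if page_domain and page_domain in top_domains:
        return "legitimate"
    return "phishing"


# Hypothetical search results for a signature such as "ebay user sign".
results = [
    "https://www.ebay.com/",
    "https://en.wikipedia.org/wiki/EBay",
]

print(classify("http://www.ebay.com/signin", results))
print(classify("http://ebay-secure-login.example.net/signin", results))
```

A page whose domain is among the top results is labelled "legitimate"; a look-alike page on an unrelated domain is not, because search engines are unlikely to rank it for the brand's own signature.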

Report Generation
• If a page is legitimate, it returns “legitimate”.
• If a page is phishing, it returns “phishing”.

APPLICATIONS
• Used to detect fraudulent websites and emails.
• Protects users from giving up personal information such as credit card numbers, bank details, and account passwords.
• Used to detect suspicious links in email.

CONCLUSION
• A content-based approach for detecting phishing websites.
• A user-friendly interface.
• An anti-phishing website that protects users from giving away their personal information.
