Journal of Information Engineering and Applications www.iiste.org
ISSN 2224-5782 (print) ISSN 2225-0506 (online)
Vol.3, No.6, 2013- Selected from International Conference on Recent Trends in Applied Sciences with Engineering Applications
A Review of Spam Filtering and Measures of Antispam
Ms. Rachana Mishra
Asst. Prof. (Department of IT), Oriental College of Technology, Bhopal (M.P.)
Dr. Ramjeevan Singh Thakur
Associate Prof. (Department of Computer Application), M.A.N.I.T., Bhopal (M.P.)
rachanamishra812@gmail.com
Abstract: Spam is commonly defined as unsolicited email messages, and the goal of spam categorization is to
distinguish between spam and legitimate email messages. Our main aim is the classification of spam mail and
the solution of various problems related to web space. We therefore discuss the measuring parameters that are
helpful for reducing spam or junk mail. In this paper, we describe procedures that can help eliminate unsolicited
commercial e-mail, viruses, Trojans, and worms, as well as frauds perpetrated electronically and other undesired
and troublesome e-mail. This research is thus helpful in identifying the best classification and categorization
methods for email spam detection.
Keywords: spam, anti spam factor.
INTRODUCTION:
The Internet is gradually becoming an integral part of everyday life. Internet usage is expected to continue
growing and e-mail has become a powerful tool intended for idea and information exchange, as well as for
commercial and social lives. Along with the growth of the Internet and e-mail, there has been a dramatic growth
in spam in recent years [1,2,6]. The majority of spam solutions deal with the flood of spam. However, it is
striking that, despite the increasing development of anti-spam services and technologies, the number of spam
messages continues to increase rapidly.
The increasing volume of spam has become a serious threat not only to the Internet, but also to society. For the
business and educational environment, spam has become a security issue. Spam has gone from being annoying
to being expensive and risky. The enigma is that spam is difficult to define. What is spam to one person is not
necessarily spam to another. Fortunately or unfortunately, spam is here to stay and destined to increase its impact
around the world. It has become an issue that can no longer be ignored; an issue that needs to be addressed in a
multi-layered approach: at the source, on the network, and with the end-user. Consequently, spam filtering can
address the problem in a variety of ways. Identifying and removing spam from the e-mail delivery
system allows end-users to regain a useful means of communication. Much research on spam filtering has
centered on the more sophisticated classifier-related issues. Currently, machine learning for spam
classification is an important research issue. The success of machine learning techniques in text categorization
has led researchers to explore learning algorithms in spam filtering [1]. In particular, Bayesian techniques and
support vector machines (SVM) have been used effectively for text categorization, which has led researchers to
treat email classification as a special case of text categorization (TC), with the categories being spam and non-spam.
SPAM PROBLEMS
Spam is difficult to define. In fact, there is no widely agreed or clearly workable definition. We all recognize spam
when we see it, but the truth is that what is spam to one person may not be spam to another. So, the notion of
spam is subjective [1,3]. The best-known definitions seem to be ‘unsolicited commercial email’ (UCE) and
‘unsolicited bulk email’ (UBE). The content of spam ranges enormously, from advertisements for goods and
services to pornographic material, financial advertisements, information on illegal copies of software, fraudulent
advertisements and/or fraudulent attempts to solicit money. More recently, spam has been spreading at an
increasingly rapid rate, and while groups of spammers were relatively small in the past, the wide availability of
‘spam kits’ over the Internet has spread the practice from the United States to China, Russia and South
America [3]. The scale of the problem is perhaps best highlighted by the growth of spam since 2001,
when spam made up 7 per cent of all received e-mail. By 2002 this had
grown to 29 per cent, and by the end of 2003 the total stood at 54 per cent. In March 2004 the percentage had
increased to 63 per cent, and this is set to continue rising. According to Message Labs, a US consultancy firm,
spam now accounts for around 65% of all e-mail traffic. The following subsection outlines a number of reasons
that explain why spam has become a serious problem.
“SPAM”: A Serious Problem
“Cost-shifting”: Sending spam is extremely cheap for the sender. The costs of spamming are paid by others: network
maintainers, recipients, etc.
“Fraud”: Often, spam messages pretend to be replies or follow-ups to previous inquiries in order to trick
people into opening them.
“Disguised origin”: Spammers can easily disguise the origin of their spam messages. Otherwise it would be far too
easy to filter spam, and spamming would be rendered useless.
Problem formulation
The problem with spam is that it tends to swamp desirable e-mail. In my own experience, a few years ago I
occasionally received an inappropriate message, perhaps one or two each day. Every day of this month, in
contrast, I received many times more spam than legitimate correspondence. On average, I probably get 10
spam messages for every appropriate e-mail. In some ways I am unusual -- as a public writer, I maintain a widely
published e-mail address; moreover, I both welcome and receive frequent correspondence from strangers related
to my published writing and to my software libraries. Unfortunately, a letter from a stranger -- with who-knows-which
e-mail application, OS, native natural language, and so on -- is not immediately obvious in its purpose, and
spammers try to slip their messages underneath such ambiguities. My seconds are valuable to me, especially
when they are claimed many times during every hour of a day. Not all mail servers control spam, so
I intend to do further work on spam data analysis and to identify accurate measures for controlling it.
Sample data of spam:
ANTI-SPAM MEASURES
Why finding out secondary impacts of spam filtering is so difficult
From a spam filtering point of view we can use two main information resources for studying secondary impacts
of spam filtering. First, we can analyze block lists and the IP addresses they list. Technically this is rather
straightforward and we will cover results in the next section. Second, we can analyze spam messages and where
they (allegedly) come from. In addition, it is necessary to take into account information from other, non-
technical sources to understand how spamming is affecting the networked world.
Analyzing email in the context of spam filtering: For a number of reasons including avoiding legal liability
and protecting their valuable infrastructure, spammers try to disguise the true origin of their messages. Naturally
this behavior means that collecting data regarding secondary, unwanted effects of anti-spam measures beyond
false-positive rates (which are computed locally) is rather difficult.
Identifying fake Received headers in email
For example, imagine that we receive an email with header information including the following (for a more
detailed discussion see 2006):
Received: from Echo by Destination
Received: from Delta by Echo
Received: from Charlie by Delta
Received: from Bravo by Charlie
Received: from Alpha by Bravo
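The source does not spell out how a fake header is recognized. One common consistency check, sketched here as an assumption and not as the authors' method, reads the Received lines from newest to oldest and verifies that the host each header was written "by" is the host the next newer header reports receiving "from". The function names and the simplified "from X by Y" pattern are illustrative; real Received headers also carry IP addresses, protocols, and timestamps.

```python
import re

# Matches only the simplified "Received: from X by Y" form used in the example.
RECEIVED_RE = re.compile(r"^Received:\s+from\s+(\S+)\s+by\s+(\S+)", re.IGNORECASE)


def parse_hops(header_lines):
    """Return (from_host, by_host) pairs in header order (newest hop first)."""
    hops = []
    for line in header_lines:
        match = RECEIVED_RE.match(line.strip())
        if match:
            hops.append((match.group(1), match.group(2)))
    return hops


def first_inconsistent_hop(header_lines):
    """Return the index of the first header whose 'by' host differs from the
    'from' host of the header directly above it, or None if the chain is consistent."""
    hops = parse_hops(header_lines)
    for i in range(1, len(hops)):
        newer_from = hops[i - 1][0]
        older_by = hops[i][1]
        if older_by.lower() != newer_from.lower():
            return i
    return None


headers = [
    "Received: from Echo by Destination",
    "Received: from Delta by Echo",
    "Received: from Charlie by Delta",
    "Received: from Bravo by Charlie",
    "Received: from Alpha by Bravo",
]
print(first_inconsistent_hop(headers))  # None: every hop links to the next
```

A consistent chain does not prove that the headers are genuine, since a spammer can forge a consistent chain; only the headers added by servers the recipient trusts can be relied on.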
Type of spam
Text spam: The simplest filters sort incoming emails based on simple strings found in specific header fields. This
capability is very basic and does not even include regular expression matching. For text spam, we filter on the
particular keywords that are unwanted.
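The keyword filtering described for text spam can be sketched in a few lines using Python's standard email package. This is a minimal illustration only: the keyword list, the choice of header fields, and the function name are all hypothetical and are not taken from the paper.

```python
from email import message_from_string

# Hypothetical keyword list and header fields, chosen only for illustration.
BLOCKED_KEYWORDS = ("free money", "lottery", "act now")
CHECKED_HEADERS = ("Subject", "From")


def is_text_spam(raw_message, keywords=BLOCKED_KEYWORDS, headers=CHECKED_HEADERS):
    """Return True if any keyword occurs as a plain substring in a checked header."""
    msg = message_from_string(raw_message)
    for name in headers:
        value = (msg.get(name) or "").lower()
        if any(keyword in value for keyword in keywords):
            return True
    return False
```

Because the match is a plain substring test with no regular expressions, trivial obfuscation such as "l0ttery" defeats it, which is one reason such filters are usually combined with other techniques.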
Image spam: Image spam is a kind of e-mail spam where the message text of the spam is presented as a picture
in an image file. Since most modern graphical e-mail client software will render the image file by default,
presenting the message image directly to the user, it is highly effective at circumventing normal e-mail filtering
software. Currently, the surest known countermeasure for image spam is to discard all messages containing
images which do not appear to come from an already whitelisted e-mail address. However, this has the
disadvantage that valid messages containing images from new correspondents must be silently
discarded.
Content based spam: This involves a number of methods, such as repeating unrelated phrases, to manipulate
the relevance or prominence of resources indexed by a search engine, in a manner inconsistent with the purpose
of the indexing system.
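The image spam countermeasure described above (discard any message that contains an image unless the sender is whitelisted) might look roughly like the following. The whitelist contents and function names are assumptions for illustration; the sketch trusts the From header as written, which a spammer can forge.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical whitelist; in practice it might come from the user's address book.
WHITELIST = {"colleague@example.com"}


def has_image_part(msg):
    """True if any MIME part of the message declares an image/* content type."""
    return any(part.get_content_maintype() == "image" for part in msg.walk())


def should_discard(raw_message, whitelist=WHITELIST):
    """Discard messages that contain images unless the sender is whitelisted."""
    msg = message_from_string(raw_message)
    sender = parseaddr(msg.get("From", ""))[1].lower()
    return has_image_part(msg) and sender not in whitelist
```

As the text notes, this policy also drops legitimate image mail from new correspondents, so it trades false positives for coverage.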
How to filter the spam:
Step 1: Collect the spam and non-spam data.
Step 2: Preprocess the data to reduce noise, that is, remove unwanted fields and clean
the data.
Step 3: Identify specific bad senders by reading the message header and body.
Step 4: Apply the appropriate classification tools or techniques for the specified content, text, and images,
as the requirements dictate.
Step 5: Store the result as content-based, text-based, or image-based.
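Steps 1, 2, and 4 can be illustrated with a small multinomial naive Bayes classifier, one of the Bayesian techniques mentioned in the introduction. This is a sketch under stated assumptions, not the authors' implementation: the tokenizer, the add-one smoothing, the class and label names, and the toy training messages are all hypothetical, and steps 3 and 5 are not covered.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Step 2 (preprocessing): lowercase and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesSpamFilter:
    """Multinomial naive Bayes with add-one (Laplace) smoothing.

    Needs at least one training message per label before classify() is called.
    """

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Step 1 (collection): add one labelled message to the model."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def classify(self, text):
        """Step 4 (classification): return the label with the higher log score."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in ("spam", "ham"):
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


# Toy training data, invented purely for illustration.
nb = NaiveBayesSpamFilter()
nb.train("win free money now", "spam")
nb.train("free lottery prize win", "spam")
nb.train("meeting agenda for monday", "ham")
nb.train("project report attached for review", "ham")
print(nb.classify("free money prize"))  # spam
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the smoothing keeps an unseen word from driving a class score to zero.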
Techniques for anti-spam filtering:
Whitelist filter
Rule based filter
Content based filter
Pattern detection
In our study we conclude that spam can be classified into two broad categories. The first category is spam with
an attachment and the second is spam without an attachment. Spam with an attachment falls into four types:
spam containing an image file (.gif); spam containing a text message or URL; spam containing both image and
text; and spam containing worms, viruses, or Trojans as an attachment. Spam without an attachment is small in
size, whereas spam with an attachment is larger. Our aim is to introduce a survey of spam techniques.
Conclusion
Spam measures are helpful for the categorization of spam mail. Our main aim is to find the best filtering techniques
for spam detection, and the measures given here are helpful for feature selection. We conclude that spam is
a very serious and widespread problem. In future, we plan to propose and implement a new spam
classification and detection approach.
References:
[1] Fulu Li, Mo-han Hsieh, “An empirical study of clustering behavior of spammers and Group based Anti-spam
strategies”, CEAS 2006, pp. 21-28, 2006.
[2] Dhinaharan Nagamalai, Cynthia D., Jae Kwang Lee, “A Novel Mechanism to defend DDoS attacks caused by
spam”, International Journal of Smart Home, SERSC, Seoul, July 2007, pp. 83-96.
[3] Calton Pu, Steve Webb, “Observed trends in spam construction techniques: A case study of spam evolution”,
CEAS 2006, pp. 104-112, July 27-28, 2006.
[4] Anirudh Ramachandran, David Dagon, Nick Feamster, “Can DNS-based Blacklists keep up with Bots”, CEAS
2006, CA, USA, July 27-28, 2006.
[5] SpamCop, http://spamcop.net.
[6] Internet User Forecasts by Country, http://www.etforecasts.com.
[7] Nigerian fraud mail Gallery, http://www.potifos.com/fraud/.
[8] Fairfax Digital, http://www.smh.com.au/articles/2004/10/18.
This academic article was published by The International Institute for Science,
Technology and Education (IISTE). The IISTE is a pioneer in the Open Access
Publishing service based in the U.S. and Europe. The aim of the institute is
Accelerating Global Knowledge Sharing.
More information about the publisher can be found in the IISTE’s homepage:
http://www.iiste.org
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 

A review of spam filtering and measures of antispam

Journal of Information Engineering and Applications www.iiste.org
ISSN 2224-5782 (print) ISSN 2225-0506 (online)
Vol.3, No.6, 2013 - Selected from International Conference on Recent Trends in Applied Sciences with Engineering Applications
A Review of Spam Filtering and Measures of Antispam
Ms. Rachana Mishra
Asst. Prof. (Department of IT), Oriental College of Technology, Bhopal (M.P.)
Dr. Ramjeevan Singh Thakur
Associate Prof. (Department of Computer Application), M.A.N.I.T., Bhopal (M.P.)
rachanamishra812@gmail.com
Abstract: Spam is commonly defined as unsolicited email messages, and the goal of spam categorization is to distinguish between spam and legitimate email messages. Our main aim is the classification of spam mail and the solution of various related problems of web space. We therefore discuss the measures that help reduce spam or junk mail. In this paper, we describe procedures that can help eliminate unsolicited commercial e-mail, viruses, Trojans, and worms, as well as frauds perpetrated electronically and other undesired and troublesome e-mail. This research is intended to help identify the best classification and categorization method for email spam detection.
Keywords: spam, anti-spam factor.
INTRODUCTION:
The Internet is gradually becoming an integral part of everyday life. Internet usage is expected to continue growing, and e-mail has become a powerful tool for the exchange of ideas and information, as well as for commercial and social life. Along with the growth of the Internet and e-mail, there has been a dramatic growth in spam in recent years [1,2,6]. The majority of spam solutions deal with the flood of spam. However, despite the increasing development of anti-spam services and technologies, the number of spam messages continues to increase rapidly.
The increasing volume of spam has become a serious threat not only to the Internet, but also to society. For the business and educational environment, spam has become a security issue.
Spam has gone from being annoying to being expensive and risky. The enigma is that spam is difficult to define. What is spam to one person is not necessarily spam to another. Fortunately or unfortunately, spam is here to stay and destined to increase its impact around the world. It has become an issue that can no longer be ignored; an issue that needs to be addressed in a multi-layered approach: at the source, on the network, and with the end-user.
Consequently, spam filtering can control the problem in a variety of ways. Identifying spam and removing it from the e-mail delivery system allows end-users to regain a useful means of communication. Much research on spam filtering has centered on the more sophisticated classifier-related issues. Currently, machine learning for spam classification is an important research issue. The success of machine learning techniques in text categorization has led researchers to explore learning algorithms in spam filtering [1]. In particular, Bayesian techniques and support vector machines (SVM) have been used effectively for text categorization, which has led researchers to treat email classification as a special case of text categorization (TC), with the categories being spam and non-spam.
SPAM PROBLEMS
Spam is difficult to define. In fact, there is no widely agreed or clear workable definition. We all recognize spam when we see it, but the truth is that what is spam to one person may not be spam to another. So, the notion of spam is subjective [1,3]. The most well-known definitions seem to be ‘unsolicited commercial email’ (UCE) and ‘unsolicited bulk email’ (UBE). The content of spam ranges enormously, from advertisements for goods and services to pornographic material, financial advertisements, information on illegal copies of software, fraudulent advertisements and/or fraudulent attempts to solicit money.
More recently, spam has been spreading at an increasingly rapid rate, and while groups of spammers were relatively small in the past, the wide availability of ‘spam kits’ over the Internet has spread the practice from the United States to China, Russia and South America [3]. The scale of the problem is perhaps best highlighted by the growth of spam since 2001, when it accounted for 7 per cent of all received e-mail. By 2002 this had grown to 29 per cent, and by the end of 2003 the total stood at 54 per cent. In March 2004 the percentage had increased to 63 per cent, and this is set to continue rising. According to Message Labs, a US consultancy firm, spam now accounts for around 65 per cent of all e-mail traffic. The following subsections outline a number of reasons which can explain why spam has become a serious problem.
“SPAM”: A Serious Problem
“Cost-shifting”: Sending spam is extremely cheap for the sender. The costs of spamming are paid by others: network maintainers, recipients, etc.
“Fraud”: Often, spam messages pretend to be replies or follow-ups to previous inquiries to get
people into opening messages.
“Disguise origin”: Spammers can easily disguise the origin of their spam messages. Otherwise it would be just too easy to filter spam, and spamming would be rendered useless.
Problem formulation
The problem with spam is that it tends to swamp desirable e-mail. In my own experience, a few years ago I occasionally received an inappropriate message, perhaps one or two each day. Every day of this month, in contrast, I received many times more spam messages than legitimate correspondence. On average, I probably get 10 spam messages for every appropriate e-mail. In some ways I am unusual: as a public writer, I maintain a widely published e-mail address; moreover, I both welcome and receive frequent correspondence from strangers related to my published writing and to my software libraries. Unfortunately, a letter from a stranger (with who-knows-which e-mail application, OS, native natural language, and so on) is not immediately obvious in its purpose, and spammers try to slip their messages underneath such ambiguities. My seconds are valuable to me, especially when they are claimed many times during every hour of a day. Not all mail servers control spam mail, so I try to do some more work on spam data analysis and to give some accurate points for controlling it.
Sample data of spam:
ANTI-SPAM MEASURES
Why finding out secondary impacts of spam filtering is so difficult
From a spam filtering point of view, we can use two main information resources for studying secondary impacts of spam filtering. First, we can analyze block lists and the IP addresses they list. Technically this is rather straightforward, and we will cover results in the next section.
Second, we can analyze spam messages and where they (allegedly) come from. In addition, it is necessary to take into account information from other, non-technical sources to understand how spamming is affecting the networked world.
Analyzing email in the context of spam filtering:
For a number of reasons, including avoiding legal liability and protecting their valuable infrastructure, spammers try to disguise the true origin of their messages. Naturally, this behavior means that collecting data regarding secondary, unwanted effects of anti-spam measures beyond false-positive rates (which are computed locally) is rather difficult.
Identifying fake Received headers in email
For example, imagine that we receive an email with header information including the following (for a more detailed discussion see 2006):
Received: from Echo by Destination
Received: from Delta by Echo
Received: from Charlie by Delta
Received: from Bravo by Charlie
Received: from Alpha by Bravo
Type of spam
Text spam: Text filtering is the capability to sort incoming emails based on simple strings found in specific header fields. This capability is very simple and does not even include regular expression matching. For text spam, we filter on the particular keywords that are unwanted.
Image spam: Image spam is a kind of e-mail spam where the message text of the spam is presented as a picture
in an image file. Since most modern graphical e-mail client software will render the image file by default, presenting the message image directly to the user, it is highly effective at circumventing normal e-mail filtering software. Currently, the surest known countermeasure for image spam is to discard all messages containing images which do not appear to come from an already whitelisted e-mail address. However, this has the disadvantage that valid messages containing images from new correspondents must be silently discarded.
Content based spam: This involves a number of methods, such as repeating unrelated phrases, to manipulate the relevance or prominence of resources indexed by a search engine, in a manner inconsistent with the purpose of the indexing system.
How to filter the spam:
Step 1: Collect the spam and non-spam data.
Step 2: Preprocess the data to reduce the noise, that is, remove the unwanted fields and clean the data.
Step 3: Identify the specific bad sender by reading the header and body.
Step 4: Apply particular classification tools or techniques for the specified content, text and images. These techniques are applied as required.
Step 5: Store the result as content based, text based or image based.
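The five filtering steps can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' implementation: the sender list, whitelist, keyword list and category names are hypothetical, text filtering is reduced to plain substring matching (in line with the description of text spam as simple string matching without regular expressions), and image handling follows the whitelist countermeasure described for image spam.

```python
from email import message_from_string
from email.utils import parseaddr

# Illustrative values only; the paper does not specify any of these lists.
BAD_SENDERS = {"promo@bulk-offers.example"}
WHITELIST = {"colleague@university.example"}
SPAM_KEYWORDS = ("act now", "free money")


def preprocess(raw):
    """Step 2: parse a raw message and keep only the fields used later."""
    msg = message_from_string(raw)
    texts, has_image = [], False
    for part in msg.walk():
        ctype = part.get_content_type()
        if ctype.startswith("image/"):
            has_image = True
        elif ctype == "text/plain":
            payload = part.get_payload(decode=True) or b""
            texts.append(payload.decode("utf-8", errors="replace"))
    return {
        "sender": parseaddr(msg.get("From", ""))[1].lower(),
        "subject": (msg.get("Subject") or "").lower(),
        # Lowercase and collapse whitespace so keyword matching is uniform.
        "body": " ".join(" ".join(texts).lower().split()),
        "has_image": has_image,
    }


def classify(fields):
    """Steps 3 and 4: check the sender first, then text, then images."""
    if fields["sender"] in BAD_SENDERS:
        return "bad_sender"
    if any(k in fields["subject"] or k in fields["body"] for k in SPAM_KEYWORDS):
        return "text"
    if fields["has_image"] and fields["sender"] not in WHITELIST:
        return "image"
    return "legitimate"


def filter_messages(raw_messages):
    """Steps 1 and 5: take the collected messages, store them by category."""
    results = {}
    for raw in raw_messages:
        results.setdefault(classify(preprocess(raw)), []).append(raw)
    return results
```

Calling `filter_messages` on a list of raw message strings returns a dictionary keyed by category, so each class of spam can be stored or inspected separately. A real filter would replace the keyword test in `classify` with a trained classifier such as the Bayesian or SVM techniques mentioned in the introduction.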
Techniques for anti-spam filtering:
White list filter
Rule based filter
Content based filter
Pattern detection
In our study we conclude that spam can be classified into two broad categories. The first category is spam with an attachment and the second category is spam without an attachment. Spam with an attachment falls into four types: spam mail containing an image file (.gif), a text message or URL, both image and text, or worms, viruses or Trojans as an attachment. Spam without an attachment is small in size, whereas spam with an attachment is larger. Our aim is to survey spam filtering techniques.
Conclusion
Spam measures are helpful for the categorization of spam mail. Our main aim is to find the best filtering techniques for spam detection, so the given measures are helpful for feature selection. We can conclude that spam is a very serious and drastically widespread problem. In future, we plan to propose and implement a new spam classification and spam detection approach.
References:
[1] Fulu Li, Mo-han Hsieh, “An empirical study of clustering behavior of spammers and group based anti-spam strategies”, CEAS 2006, pp 21-28, 2006.
[2] Dhinaharan Nagamalai, Cynthia D., Jae Kwang Lee, “A novel mechanism to defend DDoS attacks caused by spam”, International Journal of Smart Home, SERSC, Seoul, July 2007, pp 83-96.
[3] Calton Pu, Steve Webb, “Observed trends in spam construction techniques: A case study of spam evolution”, CEAS 2006, pp 104-112, July 27-28, 2006.
[4] Anirudh Ramachandran, David Dagon, Nick Feamster, “Can DNS-based blacklists keep up with bots”, CEAS 2006, CA, USA, July 27-28, 2006.
[5] SpamCop http://spamcop.net.
[6] Internet User Forecasts by Country http://www.etforecasts.Com.
[7] Nigerian fraud mail Gallery http://www.potifos.com/fraud/.
[8] Fairfax Digital http://www.smh.com.au/articles/2004/10/18.
This academic article was published by The International Institute for Science, Technology and Education (IISTE). The IISTE is a pioneer in the Open Access Publishing service based in the U.S. and Europe. More information about the publisher can be found on the IISTE’s homepage: http://www.iiste.org