CRIME ANALYSIS AND PREDICTION USING
DATA MINING
CHETAN HIREHOLI,
M.TECH, SOFTWARE ENGINEERING
Data Mining, what is it?
Data mining is about finding
new information in a lot of data.
• Generally, data mining (sometimes called data
or knowledge discovery) is the process of
analyzing data from different perspectives and
summarizing it into useful information -
information that can be used to increase
revenue, cut costs, or both.
• Data mining software is one of a
number of analytical tools for analyzing
data.
Timeline
• 1962: John W. Tukey, "The Future of Data Analysis"
• 1989: Gregory Piatetsky-Shapiro organizes and chairs the first Knowledge Discovery in Databases (KDD) workshop
• 1994: BusinessWeek publishes a cover story on “Database Marketing”
• 1996: For the first time, the term “data science” is included in the title of a conference (“Data science, classification, and related methods”), held by the IFCS
The ability to take data—to be able to
understand it, to process it, to extract
value from it, to visualize it, to
communicate it—that’s going to be a
hugely important skill in the next
decades… - Hal Varian, Google’s Chief
Economist, 2009
Applications and Trends
• Financial Data Analysis
• Retail Industry
• Telecommunication Industry
• Biological Data Analysis
• Other Scientific Applications
• Intrusion Detection
Feel Good, Do Good!
“Crime Analysis and Prediction Using Data Mining”
Shiju Sathyadevan, Devan M.S and Surya Gangadharan. S, 2014 IEEE
Abstract
• What is crime analysis? Crime analysis is a law enforcement function that involves the systematic analysis of patterns and trends in crime and disorder.
• The proposed system combines computer science and criminal justice to develop a data mining procedure that can help solve crimes faster.
Introduction
• Only within the last few decades has technology made spatial data mining a practical, affordable, and available solution for a wide audience of law enforcement officials.
• Huge volumes of data must be collected from web sites, news sites, blogs, social media, RSS feeds, etc.
• The main challenge is therefore to develop a better, more efficient tool for detecting crime patterns.
Doing analysis is a hard job!
• The reasons for choosing this approach (clustering):
  • Only known data is available to us.
  • A classification technique, which relies on that known data, will not predict new kinds of crime well.
  • The nature of crimes also changes over time.
• So, to be able to detect newer and unknown patterns in the future, clustering techniques work better.
Steps in doing Crime Analysis
Data Collection
Classification
Pattern Identification
Prediction
Visualization
Related Work
Using Series Finder
will get me more
Films!
• Series Finder is an algorithm for finding patterns in burglaries.
• To achieve this, its authors used the offender's modus operandi to extract crime patterns that the offender followed.
• The algorithm constructs the modus operandi of the offender.
In your dreams…
You can’t catch
me!,
I’m KRISHH!
Methodology
• Data Collection
  • Data is collected from various sources such as news sites, blogs, social media, RSS feeds, etc.
  • But the data we got is 'VERY UNSTRUCTURED'! So how do we store it?
  • The advantage of a NoSQL database over an SQL database is that it allows data to be inserted without a predefined schema.
  • It is object-oriented, and hence easy to use and flexible.
  • Unlike an SQL database, it does not need to know in advance what we are storing, its size, etc.
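The idea of inserting records without a predefined schema can be illustrated with a small sketch. This is not the database used in the paper; `DocumentStore` is a hypothetical in-memory stand-in, and the three records are made-up examples of what different sources might yield.

```python
import json


class DocumentStore:
    """Hypothetical in-memory stand-in for a schema-less document collection."""

    def __init__(self):
        self._docs = []

    def insert(self, doc):
        # Round-trip through JSON to check the record is a plain document
        # and to store an independent copy of it.
        self._docs.append(json.loads(json.dumps(doc)))
        return len(self._docs) - 1

    def find(self, **criteria):
        # Return every document whose fields match all given criteria.
        return [d for d in self._docs
                if all(d.get(k) == v for k, v in criteria.items())]


store = DocumentStore()
# Each source contributes different fields; no table definition is needed.
store.insert({"source": "news", "title": "Chain snatching in Kharghar",
              "crime": "snatching"})
store.insert({"source": "blog", "text": "Shop burgled overnight",
              "tags": ["burglary"]})
store.insert({"source": "rss", "title": "Robbery case registered",
              "crime": "robbery", "location": "Kharghar"})

snatchings = store.find(crime="snatching")
fields = sorted({key for doc in store.find() for key in doc})
```

In a relational table, the `tags` and `location` fields would each require a schema change before the second and third records could be stored.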
Okay! Enough of humor,
come, let's get serious, and
look into how it
actually works!
Methodology
• Classification
  • Naïve Bayes: a supervised learning method as well as a statistical method.
  • The algorithm classifies a news article into the crime type it fits best, e.g. "What is the probability that a crime document D belongs to a given class C?"
Thomas Bayes
Methodology
• Classification
  • Naïve Bayes has its advantages:
    • It is simple, and converges more quickly than logistic regression.
    • Compared to SVM (Support Vector Machine), it is easy to implement and performs well. With SVM, execution also slows down as the size of the training set increases.
    • It needs only a small amount of training data to estimate the classification parameters.
    • The zero-frequency problem (a word never seen with a class makes the whole probability zero) is fixed by adding one to every count.
Methodology
• Classification
  • Using the Naïve Bayes algorithm, we create a model by training on crime data related to vandalism, murder, robbery, burglary, sex abuse, gang rape, arson, armed robbery, highway robbery, snatching, etc.
  • Test results show that Naïve Bayes achieves more than 90% accuracy.
Pseudo code for Naïve Bayes
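The original pseudo code was an image and is not available in this transcript. As a stand-in, here is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one (Laplace) smoothing, a common remedy for the zero-frequency problem mentioned earlier. The function names and the four training sentences are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """docs is a list of (text, label) pairs; returns the counts needed later."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        words = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab


def classify(text, model):
    """Return the label with the highest log posterior for the given text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # log P(C): prior from the share of training documents in this class
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # skip words never seen during training
            # log P(word | C) with add-one smoothing to avoid zero probabilities
            score += math.log((word_counts[label][word] + 1)
                              / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training sentences, for illustration only.
training = [
    ("masked men robbed a bank at gunpoint", "robbery"),
    ("thief broke into a house and stole jewellery", "burglary"),
    ("gang robbed a jewellery shop owner", "robbery"),
    ("house burgled while family was away", "burglary"),
]
model = train(training)
label_1 = classify("armed men robbed a shop", model)
label_2 = classify("thief stole from a house", model)
```

Working in log space avoids numerical underflow when many small probabilities are multiplied together.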
Methodology
• Classification
  • Named Entity Recognition (NER), also known as entity extraction, finds and classifies elements in text into predefined categories such as person names, organizations, locations, dates, times, etc.
Sample NER
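The sample output was an image and is not available in this transcript. The sketch below only illustrates the input and output shape of NER; it uses hand-written patterns and a tiny name list, whereas practical NER systems rely on trained statistical models. The patterns, the `GAZETTEER` entries, and the function name are all assumptions, with names taken from the snatching report used as an example in this deck.

```python
import re

# Hand-written patterns: an assumption for illustration, not a trained model.
PATTERNS = {
    "MONEY": re.compile(r"Rs\.?\s?\d[\d,]*"),
    "TIME": re.compile(r"\b\d{1,2}[.:]\d{2}\s?(?:am|pm)\b", re.IGNORECASE),
    "DATE": re.compile(r"\b(?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day\b"),
}
# Tiny lookup list of known names (hypothetical gazetteer).
GAZETTEER = {"LOCATION": ["Kharghar"], "PERSON": ["Shakuntala Mande"]}


def extract_entities(text):
    """Return (entity text, category) pairs in order of appearance."""
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((match.start(), match.group(), label))
    for label, names in GAZETTEER.items():
        for name in names:
            for match in re.finditer(re.escape(name), text):
                found.append((match.start(), match.group(), label))
    return [(entity, label) for _, entity, label in sorted(found)]


report = ("A pillion bike rider snatched away a gold mangalsutra worth "
          "Rs 85,000 of a 60-year-old woman pedestrian in sector 19, "
          "Kharghar on Friday. The victim, Shakuntala Mande, was walking "
          "towards a vegetable outlet around 9.40am.")
entities = extract_entities(report)
```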
Methodology
• Classification
  • Coreference Resolution: finds the expressions in a text that refer to the same entity.
Input, e.g.: A pillion bike rider snatched away a gold mangalsutra worth Rs 85,000 of a 60-year-old woman pedestrian in sector 19, Kharghar on Friday. The victim, Shakuntala Mande, was walking towards a vegetable outlet around 9.40am, when a bike came close to her and the pillion rider snatched her mangalsutra. A robbery case has been registered at Kharghar police station.
Here "a 60-year-old woman pedestrian", "The victim", "Shakuntala Mande", and "her" all refer to the same person.
Methodology
• Pattern Identification
  • The Apriori algorithm is used to determine association rules that highlight general trends.
  • The result of this phase is the crime pattern for a particular place.
  • Once a general crime pattern for a place is known, a new case that follows the same pattern indicates that the area has a chance of crime occurrence.
  • Information about patterns helps police officials allocate resources effectively.
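A minimal sketch of the Apriori idea follows: frequent itemsets are grown level by level, and a candidate is kept only if all of its smaller subsets are already frequent. The five crime records and the 0.4 support threshold are hypothetical; the attributes used in the paper may differ.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        level = {}
        for candidate in current:
            s = support(candidate)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)
        k += 1
        # Join step: merge frequent itemsets into candidates of size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {c for c in candidates
                   if all(frozenset(sub) in level
                          for sub in combinations(c, k - 1))}
    return frequent


# Hypothetical crime records described by time, place and crime type.
records = [
    {"night", "ATM", "robbery"},
    {"night", "ATM", "robbery"},
    {"day", "bus station", "snatching"},
    {"night", "highway", "robbery"},
    {"day", "ATM", "snatching"},
]
frequent = apriori(records, min_support=0.4)

# Confidence of the rule {night, ATM} -> {robbery}.
antecedent = frozenset({"night", "ATM"})
rule_itemset = frozenset({"night", "ATM", "robbery"})
confidence = frequent[rule_itemset] / frequent[antecedent]
```

In this toy data every night-time ATM incident is a robbery, so the rule has a confidence of 1.0. A pattern of this kind is what the phase would report for a place.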
Methodology
• Prediction
  • Decision tree: it is simple to understand and interpret.
  • It is robust and works well with large data sets.
[Figure: decision tree showing the root node, splitting, and leaf nodes]
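The slides do not say which splitting criterion is used. As a minimal sketch, assuming an ID3-style tree that splits on information gain, the attribute chosen for the root node can be computed as follows. The four rows and the attribute names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of the target after splitting on the attribute."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def best_split(rows, attributes, target):
    """Attribute with the highest information gain, used as the split node."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical records: did a crime occur for this time and place?
rows = [
    {"time": "night", "place": "ATM", "crime": "yes"},
    {"time": "night", "place": "temple", "crime": "yes"},
    {"time": "day", "place": "ATM", "crime": "no"},
    {"time": "day", "place": "temple", "crime": "no"},
]
gain_time = information_gain(rows, "time", "crime")
gain_place = information_gain(rows, "place", "crime")
root = best_split(rows, ["time", "place"], "crime")
```

Here `time` separates the classes perfectly and `place` tells us nothing, so `time` becomes the root node and both of its branches end in leaf nodes.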
Methodology
• Visualization
  • A heat map indicates the level of activity, usually with darker colors for low activity and brighter colors for high activity.
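One way to obtain the activity levels behind such a heat map is to bin incident coordinates into grid cells and scale the counts. This is a sketch under assumptions: the coordinates, the cell size, and the function names are hypothetical, and the paper's own tooling is not shown in the slides.

```python
from collections import Counter


def grid_counts(points, cell_size):
    """Count incidents per grid cell; points are (x, y) in arbitrary map units."""
    return Counter((int(x // cell_size), int(y // cell_size))
                   for x, y in points)


def intensity(counts):
    """Scale counts to the range 0..1 so the busiest cell is the brightest."""
    peak = max(counts.values())
    return {cell: n / peak for cell, n in counts.items()}


# Hypothetical incident coordinates.
incidents = [(0.5, 0.5), (0.7, 0.2), (0.9, 0.9), (1.5, 0.5), (2.5, 2.5)]
counts = grid_counts(incidents, cell_size=1.0)
heat = intensity(counts)
```

A plotting library can then map each intensity value to a color, from dark for values near 0 to bright for values near 1.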
Methodology
• Visualization
  • All the main locations in India are plotted on the x-axis, and the crime rate is plotted on the y-axis.
  • The graph shows the regions that have the maximum crime rate.
  • The data plotted here is based on historical records.
Methodology
• Visualization
  • Shows the rate (percentage) of crime occurrence in places such as airports, temples, bus stations, railway stations, banks, casinos, jewelry shops, bars, ATMs, highways, etc.
  • The main spots (temple, bank, bus station, railway station, ATM, etc.) are plotted on the x-axis, and the crime rate is plotted on the y-axis.
Future Work
• Criminal Profiling
  • Helps crime investigators record the characteristics of criminals.
  • The main goals of criminal profiling are:
    • To provide crime investigators with a social and psychological assessment of the offender.
    • To evaluate belongings found in the possession of the offender.
  • To do this, as much detail as possible about each criminal is collected from criminal records, and the modus operandi is determined.
Future Work
• Criminal Profiling
  • Sifting through each crime record after a particular crime occurs is a tedious task.
  • Instead, we can use visualization mechanisms to represent the criminal details in a human-understandable form.
Future Work
 Criminal Profiling
Conclusion
• Data Collection: web sites, news channels, blogs, etc.
• Classification: a predictor is created using Naïve Bayes
• Pattern Identification: Apriori algorithm
• Prediction: decision tree
• Visualization: Neo4j, GraphDB

1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 

Crime Analysis using Data Analysis

  • 1. CRIME ANALYSIS AND PREDICTION USING DATA MINING
      CHETAN HIREHOLI, M.TECH, SOFTWARE ENGINEERING
  • 2. Data Mining, what is it?
      Data mining is about finding new information in a lot of data.
      ◦ Generally, data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information: information that can be used to increase revenue, cut costs, or both.
      ◦ Data mining software is one of a number of analytical tools for analyzing data.
  • 3. Timeline
      ◦ 1962: John W. Tukey, Exploratory Data Analysis
      ◦ 1989: Gregory Piatetsky-Shapiro organizes and chairs the first Knowledge Discovery in Databases (KDD) workshop
      ◦ 1994: BusinessWeek publishes a cover story on “Database Marketing”
      ◦ 1996: For the first time, the term “data science” is included in the title of a conference (“Data science, classification, and related methods”), held by the IFCS
      ◦ 2009: “The ability to take data - to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it - that’s going to be a hugely important skill in the next decades…” (Hal Varian, Google’s Chief Economist)
  • 4. Application and Trends…
      ◦ Financial Data Analysis
      ◦ Retail Industry
      ◦ Telecommunication Industry
      ◦ Biological Data Analysis
      ◦ Other Scientific Applications
      ◦ Intrusion Detection
  • 5. Feel Good, Do Good!
      “Crime Analysis and Prediction Using Data Mining”, Shiju Sathyadevan, Devan M.S and Surya Gangadharan. S, 2014 IEEE
  • 6. Abstract
      ◦ What is crime analysis? Crime analysis is a law enforcement function that involves systematic analysis for identifying and analyzing patterns and trends in crime and disorder.
      ◦ The proposed system takes an approach that combines computer science and criminal justice to develop a data mining procedure that can help solve crimes faster.
  • 7. Introduction
      ◦ It is only within the last few decades that technology has made spatial data mining a practical, affordable, and available solution for a wide audience of law enforcement officials.
      ◦ Huge volumes of data have to be collected from web sites, news sites, blogs, social media, RSS feeds, etc.
      ◦ So the main challenge in front of us is developing a better, more efficient tool that identifies crime patterns effectively.
  • 8. Doing analysis is a hard job!
      The reasons for choosing clustering:
      ◦ Only known data is available to us.
      ◦ A classification technique will not predict well.
      ◦ The nature of crimes also changes over time.
      ◦ So in order to detect newer, unknown patterns in the future, clustering techniques work better.
  • 9. Steps in doing Crime Analysis
      Data Collection → Classification → Pattern Identification → Prediction → Visualization
  • 10. Related Work
      “Using Series Finder will get me more Films!”
      ◦ Series Finder is used for finding patterns in burglary.
      ◦ To achieve this, its authors used the offender's modus operandi and extracted crime patterns that the offender followed.
      ◦ The algorithm constructs the modus operandi of the offender.
      “In your dreams… You can’t catch me! I’m KRISHH!”
  • 11. Methodology: Data Collection
      ◦ Data is collected from various sources such as news sites, blogs, social media, RSS feeds, etc.
      ◦ But the data we got is ‘VERY UNSTRUCTURED’! So how do we store it?
      ◦ The advantage of a NoSQL database over an SQL database is that it allows insertion of data without a predefined schema.
      ◦ It is object-oriented, and hence easy to use and flexible.
      ◦ Unlike an SQL database, it does not need to know in advance what we are storing, its size, etc.
      Okay! Enough of humor. Come, let's get serious and look into how it actually works!
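The schema-less insertion described on this slide can be illustrated with a minimal sketch. The slide does not say how records are stored, so the `DocumentStore` class below is a hypothetical in-memory stand-in, not the API of any real NoSQL product, and the two sample records are made up. It only shows that documents with different fields can be stored side by side without declaring their shape first.

```python
import json


class DocumentStore:
    """Hypothetical in-memory stand-in for a document (NoSQL) store."""

    def __init__(self):
        self._docs = []

    def insert(self, doc):
        """Store any JSON-serializable dict; no schema is declared up front."""
        record = json.loads(json.dumps(doc))  # deep copy; rejects non-JSON values
        record["_id"] = len(self._docs) + 1
        self._docs.append(record)
        return record["_id"]

    def find(self, **criteria):
        """Return the documents whose fields equal every given criterion."""
        return [d for d in self._docs
                if all(d.get(k) == v for k, v in criteria.items())]


store = DocumentStore()
# Two made-up records with different fields: neither shape was declared in advance.
store.insert({"source": "news", "headline": "Chain snatched in Kharghar",
              "crime_type": "snatching"})
store.insert({"source": "rss", "text": "Shop burgled overnight",
              "crime_type": "burglary", "tags": ["night", "shop"]})

snatchings = store.find(crime_type="snatching")
```

A relational table would need both record shapes reconciled into one set of columns before the first insert; here the second record simply carries an extra `tags` field.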
  • 12. Methodology: Classification
      ◦ Naïve Bayes: a supervised learning method as well as a statistical method.
      ◦ The algorithm classifies a news article into the crime type it fits best.
      E.g. “What is the probability that a crime document D belongs to a given class C?”
      (Portrait: Thomas Bayes)
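The question on this slide is the quantity Bayes' theorem gives. As a sketch of the standard formulation (the symbols $w_1, \dots, w_n$ for the words of document $D$ are an assumption, since the slide does not name them), the "naïve" step treats the words as independent given the class, and the classifier picks the class with the largest score:

```latex
P(C \mid D) = \frac{P(D \mid C)\, P(C)}{P(D)}
\qquad\Longrightarrow\qquad
\hat{C} = \arg\max_{C} \; P(C) \prod_{i=1}^{n} P(w_i \mid C)
```

The denominator $P(D)$ is the same for every class, so it can be dropped when only the best class is needed.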
  • 13. Methodology: Classification
      Naïve Bayes has its advantages:
      ◦ It is simple, and converges quicker than logistic regression.
      ◦ Compared to an SVM (Support Vector Machine), it is easy to implement and comes with high performance. Also, with an SVM the speed of execution decreases as the size of the training set increases.
      ◦ It works well with a small amount of training data for calculating the classification parameters.
      ◦ It also fixes the zero-frequency problem!
  • 14. Methodology: Classification
      ◦ Using the Naive Bayes algorithm, we create a model by training on crime data related to vandalism, murder, robbery, burglary, sex abuse, gang rape, arson, armed robbery, highway robbery, snatching, etc.
      ◦ Test results show that Naive Bayes achieves more than 90% accuracy!
  • 15. Pseudo code for Naïve Bayes
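The pseudocode from this slide is not reproduced in the transcript. As a stand-in, here is a minimal sketch of a multinomial Naive Bayes text classifier, not the authors' implementation. The function names, the whitespace tokenization, and the four training sentences are all assumptions made for illustration. Add-one (Laplace) smoothing is used as one common way to handle the zero-frequency problem.

```python
import math
from collections import Counter, defaultdict


def train(labeled_docs):
    """labeled_docs: list of (text, label). Returns the counts the classifier needs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_docs:
        words = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab


def classify(text, model):
    """Return the label with the highest log-probability score for the text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        # log P(C) + sum of log P(w | C), with add-one (Laplace) smoothing
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for w in text.lower().split():
            score += math.log((word_counts[label][w] + 1)
                              / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, made-up training sentences (assumption: real input would be news articles).
training = [
    ("thief broke into house and stole jewelry", "burglary"),
    ("house broken into at night valuables stolen", "burglary"),
    ("rider snatched gold chain from pedestrian", "snatching"),
    ("bike rider snatched woman's chain", "snatching"),
]
model = train(training)
label = classify("pillion rider snatched a gold chain", model)
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied, and the smoothing keeps an unseen word such as "pillion" from zeroing out a class.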
  • 16. Methodology: Classification
      ◦ Named Entity Recognition (NER), also known as Entity Extraction, finds and classifies elements in text into predefined categories such as person names, organizations, locations, dates, times, etc.
      (Figure: sample NER output)
  • 17. Methodology: Classification
      ◦ Coreference Resolution: find the referenced entities in a text.
      Example input: “A pillion bike rider snatched away a gold mangalsutra worth Rs 85,000 of a 60-year-old woman pedestrian in sector 19, Kharghar on Friday. The victim, Shakuntala Mande, was walking towards a vegetable outlet around 9.40am, when a bike came close to her and the pillion rider snatched her mangalsutra. A robbery case has been registered at Kharghar police station.”
  • 18. Methodology: Pattern Identification
      ◦ The Apriori algorithm is used to determine association rules that highlight general trends.
      ◦ The result of this phase is the crime pattern for a particular place.
      ◦ Once a general crime pattern for a place is known, a new case that follows the same pattern indicates that the area has a chance of crime occurrence.
      ◦ Information about patterns helps police officials allocate resources effectively.
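The pattern-identification step can be sketched with a small, unoptimized Apriori implementation. This is an illustration only: the attribute sets in `reports`, the `min_support` threshold of 0.4, and the function names are assumptions, not values from the paper.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all itemsets meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}
    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent
        candidates = {c for c in candidates
                      if all(frozenset(sub) in frequent
                             for sub in combinations(c, k - 1))}
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


def confidence(frequent, antecedent, consequent):
    """confidence(A -> B) = support(A and B together) / support(A)."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Made-up crime reports: each is the set of attributes extracted from one case.
reports = [
    {"Kharghar", "snatching", "morning"},
    {"Kharghar", "snatching", "evening"},
    {"Kharghar", "snatching", "morning"},
    {"Kharghar", "burglary", "night"},
    {"Panvel", "burglary", "night"},
]
frequent = apriori(reports, min_support=0.4)
rule_conf = confidence(frequent, {"Kharghar"}, {"snatching"})
```

In this toy data the rule "Kharghar → snatching" holds in three of the four Kharghar reports, so its confidence is about 0.75. A rule like this is the kind of place-level crime pattern the slide refers to.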
  • 19. Methodology: Prediction
      ◦ Decision tree: it is simple to understand and interpret!
      ◦ It is robust, and it also works well with large data sets.
      (Diagram labels: root node, splitting, leaf node)
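The root node, splitting, and leaf nodes on this slide can be sketched with a tiny entropy-based (ID3-style) decision tree over categorical features. The slides do not say which tree algorithm or which features the authors used, so the features `place` and `time`, the risk labels, and the six training rows below are made up for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def build_tree(rows, labels, features):
    """rows: list of dicts. Returns a label (leaf) or (feature, {value: subtree})."""
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]  # leaf node: majority label

    def split_entropy(f):
        # Weighted entropy of the labels after splitting on feature f
        groups = {}
        for row, y in zip(rows, labels):
            groups.setdefault(row[f], []).append(y)
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())

    best = min(features, key=split_entropy)  # lowest entropy = highest gain
    remaining = [f for f in features if f != best]
    branches = {}
    for value in {row[best] for row in rows}:
        subset = [(r, y) for r, y in zip(rows, labels) if r[best] == value]
        sub_rows, sub_labels = zip(*subset)
        branches[value] = build_tree(list(sub_rows), list(sub_labels), remaining)
    return (best, branches)


def predict(tree, row, default=None):
    """Walk from the root to a leaf; unseen feature values yield `default`."""
    while isinstance(tree, tuple):
        feature, branches = tree
        if row[feature] not in branches:
            return default
        tree = branches[row[feature]]
    return tree


# Made-up training rows: place and time of day, labeled with a risk level.
rows = [
    {"place": "ATM", "time": "night"},
    {"place": "ATM", "time": "day"},
    {"place": "temple", "time": "night"},
    {"place": "temple", "time": "day"},
    {"place": "bus station", "time": "night"},
    {"place": "bus station", "time": "day"},
]
labels = ["high", "low", "low", "low", "high", "high"]
tree = build_tree(rows, labels, ["place", "time"])
risk = predict(tree, {"place": "ATM", "time": "night"})
```

On this data the root splits on `place`, because that split leaves the least label uncertainty. Only the `ATM` branch needs a second split, on `time`.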
  • 20. Methodology: Visualization
      ◦ A heat map indicates the level of activity, usually with darker colors for low activity and brighter colors for high activity.
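The data behind such a heat map is typically a count of incidents per map cell. A minimal sketch, assuming incidents arrive as plain `(x, y)` coordinates and are binned into square cells (the coordinates and the cell size are made up, and rendering is left to whatever plotting layer is used):

```python
from collections import Counter


def heat_grid(points, cell_size):
    """Count incidents per square grid cell; points are (x, y) coordinates."""
    counts = Counter()
    for x, y in points:
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return counts


def intensities(counts):
    """Scale counts to the range 0..1 so a plotting layer can map them to colors."""
    peak = max(counts.values())
    return {cell: n / peak for cell, n in counts.items()}


# Made-up incident coordinates in arbitrary units.
incidents = [(0.2, 0.3), (0.8, 0.1), (0.4, 0.9), (1.5, 0.5), (2.7, 2.2)]
grid = heat_grid(incidents, cell_size=1.0)
shades = intensities(grid)
```

Cells with an intensity near 1 would be drawn in the brightest color and cells near 0 in the darkest, matching the convention on the slide.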
  • 21. Methodology: Visualization
      ◦ The main locations in India are plotted on the x-axis, and the crime rate is plotted on the y-axis.
      ◦ The graph shows the regions that have the maximum crime rate.
      ◦ The data plotted here is based on historical records.
  • 22. Methodology: Visualization
      ◦ Shows the rate/percentage of crime occurrence in places such as airports, temples, bus stations, railway stations, banks, casinos, jewelry shops, bars, ATMs, highways, etc.
      ◦ The main spots (temple, bank, bus station, railway station, ATM, etc.) are plotted on the x-axis, and the rate of crime is plotted on the y-axis.
  • 23. Future Work: Criminal Profiling
      ◦ Helps crime investigators record the characteristics of criminals.
      ◦ The main goals of criminal profiling are:
          ▪ To provide crime investigators with a social and psychological assessment of the offender.
          ▪ To evaluate belongings found in the possession of the offender.
      ◦ To do this, as much detail as possible about each criminal is collected from criminal records, and the modus operandi is determined.
  • 24. Future Work: Criminal Profiling
      ◦ Sifting through each crime record after a particular crime occurs is a tedious task.
      ◦ So instead we can use visualization mechanisms to represent the criminal details in a human-understandable form.
  • 26. Conclusion
      ◦ Data Collection: web sites, news channels, blogs, etc.
      ◦ Classification: using the Naïve Bayes theorem, a predictor is created
      ◦ Pattern Identification: Apriori algorithm
      ◦ Prediction: decision tree
      ◦ Visualization: Neo4j, GraphDB