Devil Crime Rate Prediction Using K-Means
CLUSTERING
Submitted by
SHARIQUE SALAM (15SCSE101738)
YASH JAISWAL (15SCSE105066)
Under the supervision of
Prof. HIMANSHU SHARMA
SCHOOL OF COMPUTER SCIENCE AND SYSTEM ENGINEERING
GALGOTIAS UNIVERSITY
GREATER NOIDA, GAUTAM BUDH NAGAR,
UTTAR PRADESH, INDIA
CONTENT
S.NO. TOPIC/CONTENT
1. Abstract
2. Introduction
(i) Overall description
(ii) Purpose
(iii) Motivations and scope
3. Literature Survey
4. Proposed Model
5. Implementation
6. References
1. Abstract
Crime has a negative effect on any society, both socially and economically. We propose Crime Rate
Prediction, a platform that assists law enforcement bodies in performing descriptive, predictive, and
prescriptive analysis of crime data. Crime Rate Prediction has a modular architecture in which each
component is built independently of the others. It also supports plugins, enabling future feature
expansion. The platform can ingest any crime dataset whose fields can be mapped to the attributes
the platform requires. Crime Rate Prediction also combines census data with crime data to achieve a
more comprehensive analysis of crime and its impact on society. We demonstrate the utility of the
platform by visualizing spatial and temporal relationships in a set of real-world crime datasets.
2. Introduction
Overall description
Crime is a social nuisance with a direct effect on society. Governments spend large sums through law
enforcement agencies to try to stop crimes from taking place. Crime data are complex because they have many
dimensions and come in different formats; for example, most datasets contain string records and narrative records.
Because of this diversity, it is difficult to mine them using off-the-shelf statistical and machine learning data
analytics tools, which is the primary reason for the lack of a general platform for crime data mining. The predictive
capabilities of the platform are demonstrated by predicting crime categories using a machine learning approach.
Data mining is well suited to high-volume crime datasets, and the knowledge gained from it can support the
police force. In this project, crime analysis is therefore performed by applying k-means clustering to a crime
dataset using the RapidMiner tool.
Purpose
How can a software platform be developed to conduct descriptive, predictive, and prescriptive analysis of
diverse crime data?
Descriptive analysis focuses on identifying spatial and temporal relationships in crime data. Predictive
analytics methods are mainly used to predict the category of crime likely to occur at a given place and
time. To achieve this, the system integrates census data with the crime data and feeds the result to machine
learning algorithms. The prescriptive analyzer suggests process re-engineering steps to allocate police
resources optimally, with the intention of reducing crime and its impact on the general public.
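The descriptive step and the census integration described above can be illustrated with a minimal Python sketch. It is not the platform's actual code: the record layout, the district names, the population figures, and the function names `counts_by_district_and_period` and `rate_per_1000` are all hypothetical, chosen only to show how incident counts can be grouped by place and time and then normalized by census population.

```python
from collections import Counter

# Hypothetical crime records: (district, hour of day, crime category).
crimes = [
    ("North", 22, "theft"),
    ("North", 23, "theft"),
    ("North", 9, "assault"),
    ("South", 22, "burglary"),
    ("South", 14, "theft"),
]

# Hypothetical census data: resident population per district.
population = {"North": 20000, "South": 5000}


def counts_by_district_and_period(records):
    """Count incidents per (district, 'day' or 'night') pair."""
    counts = Counter()
    for district, hour, _category in records:
        # Assumption: 20:00 to 05:59 is treated as night.
        period = "night" if hour >= 20 or hour < 6 else "day"
        counts[(district, period)] += 1
    return counts


def rate_per_1000(records, census):
    """Join crime counts with census data: incidents per 1,000 residents."""
    totals = Counter(district for district, _hour, _category in records)
    return {d: 1000 * totals[d] / census[d] for d in totals}


spatio_temporal = counts_by_district_and_period(crimes)
rates = rate_per_1000(crimes, population)

for key, count in sorted(spatio_temporal.items()):
    print(key, count)
for district, rate in sorted(rates.items()):
    print(district, round(rate, 2))
```

In this toy data the North district has more incidents in total, but the South district has the higher rate once population is taken into account, which is the kind of effect the census join is meant to expose.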
Motivation
High or rising crime levels cause communities to decline, as crime lowers house prices, neighborhood
satisfaction, and people's desire to move into the area. To reduce and prevent crime, it is important to
identify the reasons behind crimes, predict crimes, and prescribe solutions. Because of the large volumes of
data and the number of algorithms that need to be applied to crime data, manual analysis is unrealistic.
It is therefore necessary to have a platform capable of applying any algorithm required for
descriptive, predictive, and prescriptive analysis of large volumes of crime data. Through these three
methodologies, law enforcement authorities will be able to take suitable actions to prevent crimes.
Moreover, by predicting the targets most likely to be attacked during a specific period of time and in a specific
geographical location, police will be able to identify better ways to deploy their limited resources and to
find and fix the problems that lead to crime. A tool that is easy to use with minimal training would
help law enforcement bodies all around the world to reduce crime.
3. Literature Survey
Data mining in the study and analysis of criminology can be categorized into two main areas: crime control
and crime suppression. Crime control uses knowledge from the analyzed data to control and prevent the
occurrence of crime, while crime suppression tries to catch a criminal by mining his or her recorded
history.
Brown (1998) constructed a software framework called ReCAP (Regional Crime Analysis Program) for mining
data in order to catch professional criminals using data mining and data fusion techniques. Data fusion was used to
manage, fuse, and interpret information from multiple sources. Its main purpose was to overcome confusion caused by
conflicting reports and cluttered or noisy backgrounds. Data mining was used to automatically discover patterns
and relationships in large databases.
Crime detection and prevention techniques are applied to a range of applications, from cross-border
security and Internet security to household crimes. Abraham et al. (2006) proposed a method that uses
computer log files as historical data and searches for relationships based on the frequency of incidents.
They then analyzed the results to produce profiles, which can be used to understand criminal behavior.
4. Proposed System and Architecture
The proposed system uses an open-source data mining tool that is easy to set up and makes the analysis easy
to carry out. Crime analysis is performed on a crime dataset by applying the k-means clustering algorithm
using the RapidMiner tool.
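The project itself runs k-means inside RapidMiner. To make the algorithm concrete, the following is a minimal pure-Python sketch of standard k-means (Lloyd's algorithm), not the RapidMiner operator. The `kmeans` function, its parameters, and the per-district feature values are hypothetical and serve only as an illustration.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Cluster points (tuples of floats) into k groups with Lloyd's algorithm."""
    rng = random.Random(seed)
    # Initialize centroids by picking k distinct data points at random.
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                mean = tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
                new_centroids.append(mean)
            else:
                # Keep the old centroid if no point was assigned to it.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break  # Converged: assignments will no longer change.
        centroids = new_centroids
    labels = [min(range(k), key=lambda i: math.dist(p, centroids[i]))
              for p in points]
    return centroids, labels


# Hypothetical features per district: (thefts, assaults) per 1,000 residents.
districts = [
    (1.0, 1.2), (1.5, 0.8), (0.9, 1.1),   # low-crime districts
    (9.0, 8.5), (8.7, 9.1), (9.4, 8.8),   # high-crime districts
]

centroids, labels = kmeans(districts, k=2)
print(labels)
```

On this toy data the first three districts end up in one cluster and the last three in the other. Which cluster receives label 0 depends on the random initialization, so only the grouping is meaningful, not the label values.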
5. Implementation
We implemented a web application to present the project output to the user. This web
application gives the user access to all the implemented features of the platform, including
descriptive, predictive, and prescriptive analytics.
Features provided by the platform
6. References
1. D. E. Brown, “The regional crime analysis program (ReCAP): a framework for mining data to catch
criminals,” IEEE Intl. Conf. on Systems, Man, and Cybernetics, vol. 3, pp. 2848–2853, 1998.
2. H. Chen, D. Zeng, H. Atabakhsh, W. Wyzga, and J. Schroeder, “Coplink: managing law enforcement
data and knowledge,” Communications of the ACM, vol. 46, no. 1, pp. 28–34, 2003.
3. A. Verma, R. Ramyaa, S. Marru, Y. Fan, and R. Singh, “Rationalizing police patrol beats using
Voronoi tessellations,” IEEE Intl. Conf. on Intelligence and Security Informatics (ISI), pp. 165–167,
2010.