Deep Learning Based Fusion Approach For Hate Speech Detection
on
“Deep Learning Based Fusion Approach for Hate Speech Detection”
Bachelor of Technology
in
Computer Science and Engineering
by
P.S.SHIVA PRASAD (17FH1A0546)
S.RAFEEQ AHMED (17FH1A0553)
A.UDAY SAI AKHIL (17FH1A0559)
V.VENKATA RAMANA (16FH1A0553)
CERTIFICATE
This is to certify that the project entitled “Deep Learning Based Fusion Approach for Hate
Speech Detection”, being submitted by P.S.SHIVA PRASAD (17FH1A0546), S.RAFEEQ
AHMED (17FH1A0553), A.UDAY SAI AKHIL (17FH1A0559), V.VENKATA RAMANA
(16FH1A0517) in partial fulfillment of the requirements for the award of the degree of
Bachelor of Technology in Computer Science and Engineering to the Dr. K.V. Subba Reddy
Institute of Technology, affiliated to JNTU Anantapur, is a record of bona fide work carried out
during the year 2020-2021.
The results presented in this thesis have been verified and found to be satisfactory.
The results embodied in this thesis report have not been submitted to any other University for
the award of any other degree or diploma.
Project Guide                          HOD
H. Ateeq Ahmed, M.Tech                 Dr. C. Mohammed Gulzar, M.Tech, Ph.D
Assistant Professor                    Associate Professor
Department of CSE                      Department of CSE
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY,
ANANTAPUR
CERTIFICATE OF EXAMINER
The dissertation work entitled “Deep Learning Based Fusion Approach for Hate Speech Detection”,
submitted by P.S.Shiva Prasad (17FH1A0546), S.Rafeeq Ahmed (17FH1A0553), A.Uday Sai Akhil
(17FH1A0559), V.Venkata Ramana (16FH1A0553), is approved for the degree of Bachelor of Technology
in Computer Science and Engineering.
Examiners:
1.
2.
ACKNOWLEDGEMENT
It is our privilege and pleasure to express our profound sense of respect,
gratitude and indebtedness to our guide, H. Ateeq Ahmed, M.Tech, Assistant
Professor, Department of Computer Science and Engineering, Dr. K.V. Subba
Reddy Institute of Technology, for his indefatigable inspiration, guidance,
cogent discussion and encouragement throughout this dissertation work.
Last but not least, we wish to acknowledge our friends, family members
and colleagues for giving us moral strength and helping us to complete this
dissertation.
PROJECT TEAM:
List of Figures
List of Tables
Abstract
CHAPTER 1
Introduction
CHAPTER 2
System Environment
2.1 Python
2.2 History of Python
2.3 Python Features
2.4 Python Methods and Functions
CHAPTER 3
UML Models
3.1 Class Diagram
3.2 Data Flow Diagram
3.3 Sequence Diagram
3.4 Use Case Diagram
3.5 Flow Chart: Remote User and Service Provider
3.6 Preliminary Investigation
3.7 Input and Output
CHAPTER 4
Architecture
CHAPTER 5
System Testing
5.1 Types of Testing
5.2 User Training
5.3 Testing Methodologies
5.4 Maintenance
CHAPTER 6
Result
CHAPTER 7
Conclusion
CHAPTER 8
References
Bibliography
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
In recent years, the increasing prevalence of hate speech on social media has come to be regarded
as a serious problem worldwide. Many governments and organizations have invested significantly
in hate speech detection techniques, which have also attracted the attention of the
scientific community. Although plenty of literature on this issue is available, it remains
difficult to assess the performance of each proposed method, as each has its own advantages and
disadvantages. Fusing the results of several classifiers is a general way to improve the overall
classification results and is therefore worth attempting. We first focus on several well-known
machine learning methods for text classification, namely Embeddings from Language Models (ELMo),
Bidirectional Encoder Representations from Transformers (BERT) and Convolutional Neural Network
(CNN), and apply these methods to the data sets of SemEval 2019 Task 5. We then adopt some fusion
strategies to combine the classifiers and improve the overall classification performance. The results
show that the accuracy and F1-score of the classification are significantly improved.
EXISTING SYSTEM
Deep learning methods can be roughly divided into two categories: one
focuses on front-end processing, which optimizes the word embedding
technology; the other focuses on mid-end processing, which usually uses simple
word- or character-based embeddings and pays more attention to the
intermediate neural network processing. The best-known methods focused on
front-end processing are Embeddings from Language Models (ELMo) [6][13],
which trains word vectors with context, and Bidirectional Encoder
Representations from Transformers (BERT) [14][15]. BERT is the first deeply
bidirectional, unsupervised language representation, pre-trained on unlabeled text by
jointly conditioning on both left and right context in all layers. It shows
very strong performance and has attracted great attention. The most
popular network architectures focused on neural network processing are
the Convolutional Neural Network (CNN) [16][17], the Recurrent Neural Network (RNN)
and variants of them, typically based on long short-term memory (LSTM)
networks [18]. As mentioned above, we first focus on
several well-performing deep learning methods in this paper.
Text representation is the first pivotal step in NLP because the digitization of
text features is fundamental to enabling automated processing. At first,
discrete representations such as one-hot coding and the bag-of-words
model were used. They are simple and easy to implement. However, the
resulting representation is sparse and high-dimensional, and it does not consider the
semantic information of words in the sentence. Word embeddings, which
are obtained by training a language model on a large-scale corpus, then came into
wide use. One of the most well-known representative works is Word2Vec [19].
Word2Vec showed that we can use a vector to properly represent words in a
PROPOSED SYSTEM
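Although great contributions have been made in this area of work,
each research method has its own advantages and disadvantages. It is still
difficult to compare their performance, largely due to the use of different
datasets and different feature extraction techniques. Different methods usually
suit different feature sets, even for the same datasets. The
most important question, perhaps, is not which method is the best, but how the
results can be better used in general. It is therefore more worthwhile to find a way to
improve the results of each classification. Ensemble learning tries to improve the
overall performance of the system efficiently by combining the outputs from various
candidate systems [7].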
The underlying idea of ensemble learning is that even if one weak classifier
makes a wrong prediction, the other classifiers can correct the error to some
extent. The two most common forms of ensemble learning are bagging and
boosting [8]. However, these two methods are unsuitable for ensemble
learning across different classifiers. Applying some simple algebraic fusion rules
to the results of multiple classifiers may therefore prove meaningful.
Advantages
The idea of ensemble learning in machine learning can advance the overall
classification performance and improve the overall prediction accuracy.
Bagging and boosting are based on a single classification algorithm and focus on
the diversity of the data samples; they lack the diversity that comes from combining
different algorithms.
SYSTEM REQUIREMENTS
SOFTWARE REQUIREMENTS:
Front-End : Python.
Back-End : Django-ORM
Chapter 1
INTRODUCTION
Popular social media platforms such as Facebook, Twitter and YouTube
provide channels for internet users to express their opinions and share comments
that are visible to all. Some people use them to post aggressive, hateful or threatening
speech. Hate speech is commonly defined as any public speech that
expresses disparagement of a person or a group on the basis of some characteristic
such as race, color, ethnicity, gender, sexual orientation, nationality, religion, or
other characteristics [1][2]. Social networks make interactions between
people more indirect and anonymous, and this anonymity makes some
people feel safe when they express hate speech. Such speech can easily
lead to disruptive anti-social outcomes if it continues to be unregulated and
uncontrolled. Hate speech is therefore considered a serious problem worldwide,
and many countries and organizations resolutely resist it [3].
Detecting the polarity of speech on these platforms is the first step and is critical to
government departments, social security services, law enforcement and social media
companies that want to remove accounts with offensive content from their
websites [4]. Compared with manual filtering, which is very time consuming,
automatic identification of hate speech enables a platform to detect hate
speech and remove it much more quickly and efficiently. The problem of online
hate speech detection has raised interest in both the scientific community and the
business world. There have been many research efforts aimed at automating the
process, which is usually modeled as a supervised classification problem. Recently,
the machine learning approach has become popular in scientific studies of hate speech
detection: using pre-labeled examples as training data, it learns the associations between
pieces of text and the output expected for a particular input.
Among the various machine learning methods, deep learning, which is a
subset of machine learning, is very prominent in Natural Language Processing (NLP)
for tackling text classification [5][6].
Although great contributions have been made in this area of work, each research
method has its own advantages and disadvantages. It is still difficult to compare their
performance, largely due to the use of different datasets and different feature
extraction techniques. Different methods usually provide disparate suitability on
feature sets, even for the same datasets.
The most important question, perhaps, is not which method is the best, but how the
results can be better used in general. It is therefore more worthwhile to find a way to
improve the results of each classification. Ensemble learning tries to improve the
overall performance of the system efficiently by combining the outputs from various
candidate systems [7]. The underlying idea of ensemble learning is that even if one
weak classifier makes a wrong prediction, the other classifiers can correct the error
to some extent. The two most common forms of ensemble learning are bagging and
boosting [8]. However, these two methods are unsuitable for ensemble learning
across different classifiers. Applying some simple algebraic fusion rules to the
results of multiple classifiers may therefore prove meaningful.
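The idea of fusing classifier outputs with simple algebraic rules can be sketched as follows. This is only a minimal illustration, not the fusion method evaluated in this report: the function name fuse_probabilities, the example scores and the 0.5 decision threshold are assumptions made for the example. Each classifier is assumed to output, for one post, the probability of the "hateful" class.

```python
def fuse_probabilities(probs, rule="mean"):
    """Fuse per-classifier probabilities of the positive (hateful) class.

    probs: list of floats in [0, 1], one per classifier (hypothetical values).
    rule:  'mean', 'max', 'min', 'product' or 'vote'.
    Returns the fused label: 1 (hateful) or 0 (not hateful).
    """
    if rule == "mean":
        return int(sum(probs) / len(probs) >= 0.5)
    if rule == "max":
        return int(max(probs) >= 0.5)
    if rule == "min":
        return int(min(probs) >= 0.5)
    if rule == "product":
        # Compare the product of the positive-class scores with the
        # product of the negative-class scores.
        pos, neg = 1.0, 1.0
        for p in probs:
            pos *= p
            neg *= 1.0 - p
        return int(pos >= neg)
    if rule == "vote":
        # Majority voting over the individual hard decisions.
        votes = sum(1 for p in probs if p >= 0.5)
        return int(votes * 2 > len(probs))
    raise ValueError("unknown fusion rule: " + rule)


# Hypothetical scores from three classifiers (for example ELMo, BERT and CNN).
scores = [0.9, 0.4, 0.45]
labels = {rule: fuse_probabilities(scores, rule)
          for rule in ("mean", "max", "min", "product", "vote")}
print(labels)
```

The rules can disagree on the same input: here one confident classifier outweighs two mildly negative ones under the mean and product rules, but not under majority voting.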
Chapter 2
2.1 PYTHON
Python is Interactive: You can sit at a Python prompt and interact with
the interpreter directly to write your programs.
2.2 HISTORY OF PYTHON
Python was developed by Guido van Rossum in the late eighties and early nineties at
the National Research Institute for Mathematics and Computer Science in the
Netherlands.
Python is derived from many other languages, including ABC, Modula-3, C, C++, Algol-68,
SmallTalk, and Unix shell and other scripting languages.
Python is copyrighted. Its source code is available under the Python Software Foundation
License, an open-source license that is compatible with the GNU General Public License (GPL).
Python is now maintained by a core development team, in which Guido
van Rossum long held a vital role in directing its progress.
2.3 PYTHON FEATURES
Easy-to-learn: Python has few keywords, a simple structure, and a clearly defined
syntax. This allows the student to pick up the language quickly.
Easy-to-read: Python code is clearly defined and easy on the eyes.
A broad standard library: The bulk of Python's library is very portable and
cross-platform compatible on UNIX, Windows, and Macintosh.
Interactive Mode: Python has support for an interactive mode which allows
interactive testing and debugging of snippets of code.
Portable: Python can run on a wide variety of hardware platforms and has the
same interface on all platforms.
Extendable: You can add low-level modules to the Python interpreter. These
modules enable programmers to add to or customize their tools to be more
efficient.
GUI Programming: Python supports GUI applications that can be created and
ported to many system calls, libraries and window systems, such as Windows
MFC, Macintosh, and the X Window system of Unix.
Scalable: Python provides a better structure and support for large programs than
shell scripting.
It provides very high-level dynamic data types and supports dynamic type
checking.
It can be easily integrated with C, C++, COM, ActiveX, CORBA, and Java.
The arithmetic examples below are consistent with a = 10 and b = 20, and the bitwise examples with a = 60 and b = 13.
Operator  Name                    Description                                                                                        Example
-         Subtraction             Subtracts right hand operand from left hand operand.                                               a - b = -10
%         Modulus                 Divides left hand operand by right hand operand and returns the remainder.                         b % a = 0
&         Binary AND              Copies a bit to the result if it exists in both operands.                                          (a & b) = 12 (means 0000 1100)
^         Binary XOR              Copies the bit if it is set in one operand but not both.                                           (a ^ b) = 49 (means 0011 0001)
~         Binary Ones Complement  It is unary and has the effect of 'flipping' bits.                                                 (~a) = -61 (means 1100 0011 in 2's complement form, as a signed binary number)
<<        Binary Left Shift       The left operand's value is moved left by the number of bits specified by the right operand.      a << 2 = 240 (means 1111 0000)
>>        Binary Right Shift      The left operand's value is moved right by the number of bits specified by the right operand.     a >> 2 = 15 (means 0000 1111)
and       Logical AND             If both the operands are true then the condition becomes true.                                     (a and b) is true.
not       Logical NOT             Used to reverse the logical state of its operand.                                                  not (a and b) is false.
not in    Membership              Evaluates to true if it does not find a variable in the specified sequence and false otherwise.   x not in y results in True if x is not a member of sequence y.

Operator  Description
~ + -     Complement, unary plus and minus (the special methods for the last two are __pos__ and __neg__)
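The operator examples above can be checked with a short script. This is a sketch written for Python 3; the values a = 60 and b = 13 are inferred from the binary patterns shown in the table, while x = 10 and y = 20 are assumed values chosen to match the arithmetic rows.

```python
a, b = 60, 13            # 0011 1100 and 0000 1101 in binary

bit_and = a & b          # 0000 1100
bit_xor = a ^ b          # 0011 0001
bit_not = ~a             # equals -(a + 1) in two's complement
left = a << 2            # 1111 0000
right = a >> 2           # 0000 1111

x, y = 10, 20            # assumed values for the arithmetic rows
difference = x - y
remainder = y % x

absent = 3 not in [1, 2, 4]   # membership test returns a bool

print(bit_and, bit_xor, bit_not, left, right)
print(difference, remainder, absent)
```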
LIST
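The list is the most versatile data type available in Python. It is written as a list of
comma-separated values (items) between square brackets. An important property of a list is
that its items need not be of the same type.
list2 = [1, 2, 3, 4, 5]
Built-in List Functions
1 cmp(list1, list2): Compares elements of both lists (Python 2 only).
2 len(list): Gives the total length of the list.
3 max(list): Returns the item from the list with the maximum value.
4 min(list): Returns the item from the list with the minimum value.
5 list(seq): Converts a sequence into a list.
List Methods
1 list.append(obj): Appends obj to the list.
2 list.count(obj): Returns how many times obj occurs in the list.
3 list.extend(seq): Appends the contents of seq to the list.
4 list.index(obj): Returns the lowest index at which obj appears.
5 list.insert(index, obj): Inserts obj into the list at position index.
6 list.pop([index]): Removes and returns the item at index (the last item by default).
7 list.remove(obj): Removes the first occurrence of obj from the list.
8 list.reverse(): Reverses the items of the list in place.
9 list.sort([func]): Sorts the items of the list in place, using the comparison function func if given (Python 2 signature).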
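A short sketch of the list functions and methods listed above, written for Python 3 (so cmp() is omitted). The list contents are arbitrary example values.

```python
numbers = [1, 2, 3, 4, 5]
mixed = ["physics", 1997, 3.5]     # items need not share a type

numbers.append(6)                  # [1, 2, 3, 4, 5, 6]
numbers.extend([7, 8])             # [1, 2, 3, 4, 5, 6, 7, 8]
numbers.insert(0, 0)               # [0, 1, 2, 3, 4, 5, 6, 7, 8]
last = numbers.pop()               # removes and returns the last item
numbers.remove(3)                  # deletes the first occurrence of 3
numbers.reverse()                  # reverses in place
numbers.sort()                     # sorts in place, ascending

print(len(numbers), max(numbers), min(numbers))
print(numbers.index(4), numbers.count(4))
print(list("abc"))                 # list(seq) converts a sequence to a list
```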
TUPLES
A tuple is a sequence of immutable Python objects. Tuples are sequences, just like lists.
The differences between tuples and lists are that tuples cannot be changed, unlike lists,
and that tuples use parentheses, whereas lists use square brackets.
tup2 = (1, 2, 3, 4, 5)
The empty tuple is written as two parentheses containing nothing −
tup1 = ()
To write a tuple containing a single value you have to include a comma, even though
there is only one value −
tup1 = (50,)
Like string indices, tuple indices start at 0, and tuples can be sliced, concatenated, and so
on.
tup2 = (1, 2, 3, 4, 5, 6, 7)
tup1[0]: physics
tup2[1:5]: (2, 3, 4, 5)
Updating Tuples
Tuples are immutable, which means you cannot update or change the values of tuple
elements. You can, however, take portions of existing tuples to create new tuples, as the
following example demonstrates −
print(tup3)
To explicitly remove an entire tuple, use the del statement. For example:
print(tup)
del tup
print("After deleting tup : ")
print(tup)  # raises NameError, because tup no longer exists
Built-in Tuple Functions
1 cmp(tuple1, tuple2): Compares elements of both tuples (Python 2 only).
2 len(tuple): Gives the total length of the tuple.
3 max(tuple): Returns the item from the tuple with the maximum value.
4 min(tuple): Returns the item from the tuple with the minimum value.
5 tuple(seq): Converts a list into a tuple.
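The tuple operations described in this section can be sketched as follows (Python 3). The contents of tup1 beyond its first element are hypothetical values chosen for the example.

```python
tup1 = ("physics", "chemistry", 1997, 2000)   # hypothetical values
tup2 = (1, 2, 3, 4, 5, 6, 7)
single = (50,)                 # the trailing comma makes this a tuple

first = tup1[0]                # indexing starts at 0
middle = tup2[1:5]             # slicing returns a new tuple
tup3 = tup1 + tup2             # concatenation builds a new tuple

try:
    tup1[0] = "maths"          # tuples do not support item assignment
except TypeError as err:
    error_message = str(err)

print(first, middle, len(tup3), max(tup2), min(tup2))
print(tuple([1, 2, 3]))        # tuple(seq) converts a list into a tuple
```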
DICTIONARY
Each key is separated from its value by a colon (:), the items are separated by commas,
and the whole thing is enclosed in curly braces. An empty dictionary without any items is
written with just two curly braces, like this: {}.
Keys are unique within a dictionary while values may not be. The values of a dictionary
can be of any type, but the keys must be of an immutable data type such as strings,
numbers, or tuples.
Result –
dict['Name']: Zara
dict['Age']: 7
Updating Dictionary
We can update a dictionary by adding a new entry or a key-value pair, modifying an
existing entry, or deleting an existing entry as shown below in the simple example −
Result −
dict['Age']: 8
dict['School']: DPS School
To explicitly remove an entire dictionary, just use the del statement. Following is a
simple example –
Built-in Dictionary Functions
1 cmp(dict1, dict2): Compares elements of both dictionaries (Python 2 only).
2 len(dict): Gives the total length of the dictionary. This is equal to the number of items in the dictionary.
3 str(dict): Produces a printable string representation of the dictionary.
4 type(variable): Returns the type of the passed variable. If the passed variable is a dictionary, it returns the dictionary type.
Dictionary Methods
3 dict.fromkeys(seq[, value]): Creates a new dictionary with keys from seq and values set to value.
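A minimal sketch of the dictionary operations described above, written for Python 3. The keys and values follow the results shown earlier in this section; the variable name student is an assumption.

```python
student = {"Name": "Zara", "Age": 7}

name = student["Name"]                    # access a value by its key
student["Age"] = 8                        # modify an existing entry
student["School"] = "DPS School"          # add a new entry

del student["Name"]                       # remove a single entry
size = len(student)                       # number of remaining items
text = str(student)                       # printable representation

defaults = dict.fromkeys(["a", "b"], 0)   # new dict with the same value per key
student.clear()                           # remove all entries, keep the object
```

Using del on the dictionary name itself (del student) would remove the whole object, after which any further access raises a NameError.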
A function is a block of organized, reusable code that is used to perform a single, related action.
Functions provide better modularity for your application and a high degree of code reusing. Python
gives you many built-in functions like print(), etc. but you can also create your own functions.
These functions are called user-defined functions.
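Defining a Function
Simple rules to define a function in Python:
Function blocks begin with the keyword def followed by the function name and
parentheses ( ).
The code block within every function starts with a colon (:) and is indented.
The statement return [expression] exits a function, optionally passing back an expression
to the caller. A return statement with no arguments is the same as return None.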
Calling a Function
Defining a function only gives it a name, specifies the parameters that are to be included
in the function and structures the blocks of code. Once the basic structure of a function is
finalized, you can execute it by calling it from another function or directly from the Python
prompt. Following is the example to call printme() function −
def printme(text):
    print(text)
    return
# Now you can call printme function
printme("I'm first call to user defined function!")
printme("Again second call to the same function")
Function Arguments
You can call a function by using the following types of formal arguments:
Required arguments
Keyword arguments
Default arguments
Variable-length arguments
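The four kinds of arguments listed above can be illustrated with one small function. This is a sketch for Python 3; the function name describe, its parameters and the default value 35 are assumptions made for the example.

```python
def describe(name, age=35, *hobbies):
    """Return a one-line description.

    name    -- required argument
    age     -- default argument (35 is an arbitrary example value)
    hobbies -- variable-length arguments, collected into a tuple
    """
    return "%s, %d, %d hobbies" % (name, age, len(hobbies))


required = describe("Zara")                      # only the required argument
keyword = describe(age=7, name="Zara")           # keyword arguments, any order
variable = describe("Zara", 7, "chess", "art")   # extra values go to hobbies

print(required)
print(keyword)
print(variable)
```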
Scope of Variables
All variables in a program may not be accessible at all locations in that program. This
depends on where you have declared a variable.
The scope of a variable determines the portion of the program where you can access a
particular identifier. There are two basic scopes of variables in Python: global variables
and local variables. Variables defined inside a function body have a local scope, and those
defined outside have a global scope.
This means that local variables can be accessed only inside the function in which they
are declared, whereas global variables can be accessed throughout the program body
by all functions. When you call a function, the variables declared inside it are brought
into scope. Following is a simple example −
return total;
sum( 10, 20 );
Result −
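A minimal sketch of the local-versus-global distinction, written for Python 3. The function name add is an assumption; it is used instead of sum to avoid hiding the built-in function of that name.

```python
total = 0                      # global variable


def add(arg1, arg2):
    total = arg1 + arg2        # local variable; it shadows the global name
    return total


inside = add(10, 20)           # value of the local total
outside = total                # the global total was never modified

print("Inside the function, local total:", inside)
print("Outside the function, global total:", outside)
```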
A module allows you to logically organize your Python code. Grouping related code into
a module makes the code easier to understand and use. A module is a Python object
with arbitrarily named attributes that you can bind and reference. Simply put, a module is a
file consisting of Python code. A module can define functions, classes and variables. A
module can also include runnable code.
Example:
The Python code for a module named aname normally resides in a file named aname.py.
Here's an example of a simple module, support.py
return
When the interpreter encounters an import statement, it imports the module if the module
is present in the search path. A search path is a list of directories that the interpreter
searches before importing a module. For example, to import the module support.py, you
need to put the following command at the top of the script −
import support
A module is loaded only once, regardless of the number of times it is imported. This
prevents the module execution from happening over and over again if multiple imports
occur.
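That a module is loaded only once can be observed through the interpreter's module cache. The sketch below uses the standard-library module math as a stand-in for a user module such as support.py, which is an assumption made so that the example is self-contained.

```python
import sys
import math                    # first import loads and executes the module
import math as math_again      # second import reuses the cached module object

same_object = math_again is math
cached = sys.modules["math"] is math   # sys.modules is the module cache

print(same_object, cached)
```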
Packages in Python
A package is a hierarchical file directory structure that defines a single Python
application environment consisting of modules, subpackages, sub-subpackages,
and so on.
Consider a file Pots.py available in the Phone directory. This file has the following line of source
code −
def Pots():
In a similar way, we have two more files, each defining a different function named after
its file, as above −
Phone/__init__.py
To make all of your functions available when you import Phone, you need to put explicit import
statements in __init__.py as follows (in Python 3, write from .G3 import G3) −
from G3 import G3
After you add these lines to __init__.py, all of these functions are available when you
import the Phone package.
import Phone
Phone.Pots()
Phone.Isdn()
Phone.G3()
RESULT:
I'm 3G Phone
In the above example, we have used a single function in each file, but you
can keep multiple functions in your files. You can also define different Python classes in
those files and then create your packages out of those classes.
This chapter covers all the basic I/O functions available in Python.
Printing to the Screen
The simplest way to produce output is to use the print statement (a function, print(), in
Python 3), to which you can pass zero or more expressions separated by commas. It converts
the expressions you pass into a string and writes the result to standard output as follows −
Result:
Reading Keyboard Input
Python 2 provides two built-in functions to read a line of text from standard input, which by
default comes from the keyboard. These functions are −
raw_input
input
The raw_input Function
The raw_input([prompt]) function reads one line from standard input and returns it as a
string (removing the trailing newline). In Python 3 this function is named input().
This prompts you to enter any string and displays the same string on the screen.
When I typed "Hello Python!", its output is like this −
The input Function
In Python 2, the input([prompt]) function is equivalent to raw_input, except that it assumes
the input is a valid Python expression and returns the evaluated result to you.
This would produce the following result against the entered input −
Opening and Closing Files
Until now, you have been reading and writing to the standard input and output. Now, we
will see how to use actual data files.
Python provides basic functions and methods necessary to manipulate files by default.
You can do most of the file manipulation using a file object.
The open Function
Before you can read or write a file, you have to open it using Python's built-in
open() function. This function creates a file object, which is then used to call other
support methods associated with it.
Syntax
file object = open(file_name [, access_mode][, buffering])
file_name: The file_name argument is a string value that contains the name of the
file that you want to access.
access_mode: The access_mode determines the mode in which the file has to be
opened, i.e., read, write, append, etc. A complete list of possible values is given
below in the table. This is an optional parameter and the default file access mode is
read (r).
Modes  Description
r      Opens a file for reading only. The file pointer is placed at the beginning of the file. This is the default mode.
rb     Opens a file for reading only in binary format. The file pointer is placed at the beginning of the file.
r+     Opens a file for both reading and writing. The file pointer is placed at the beginning of the file.
rb+    Opens a file for both reading and writing in binary format. The file pointer is placed at the beginning of the file.
w      Opens a file for writing only. Overwrites the file if the file exists. If the file does not exist, creates a new file for writing.
wb     Opens a file for writing only in binary format. Overwrites the file if the file exists. If the file does not exist, creates a new file for writing.
w+     Opens a file for both writing and reading. Overwrites the existing file if the file exists. If the file does not exist, creates a new file for reading and writing.
wb+    Opens a file for both writing and reading in binary format. Overwrites the existing file if the file exists. If the file does not exist, creates a new file for reading and writing.
a      Opens a file for appending. The file pointer is at the end of the file if the file exists. That is, the file is in the append mode. If the file does not exist, it creates a new file for writing.
ab     Opens a file for appending in binary format. The file pointer is at the end of the file if the file exists. That is, the file is in the append mode. If the file does not exist, it creates a new file for writing.
a+     Opens a file for both appending and reading. The file pointer is at the end of the file if the file exists. The file opens in the append mode. If the file does not exist, it creates a new file for reading and writing.
ab+    Opens a file for both appending and reading in binary format. The file pointer is at the end of the file if the file exists. The file opens in the append mode. If the file does not exist, it creates a new file for reading and writing.
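The effect of a few of the access modes in the table can be seen in the following sketch (Python 3). The file is created in a temporary directory so the example does not touch existing data; the file name and contents are arbitrary, and the with statement is used here only to close each file automatically.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "foo.txt")   # scratch location

with open(path, "w") as fo:        # w: create or truncate, write only
    fo.write("first\n")

with open(path, "a") as fo:        # a: append at the end of the file
    fo.write("second\n")

with open(path, "r") as fo:        # r: read only, pointer at the beginning
    lines = fo.readlines()

with open(path, "w+") as fo:       # w+: truncate, then read and write
    fo.write("new")
    fo.seek(0)                     # move back to the start before reading
    replaced = fo.read()

print(lines, replaced)
```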
The file Object Attributes
Once a file is opened and you have one file object, you can get various information
related to that file.
Attribute Description
file.softspace Returns false if space explicitly required with print, true otherwise.
Example
# Open a file
fo = open("foo.txt", "wb")
The close() Method
The close() method of a file object flushes any unwritten information and closes the file
object, after which no more writing can be done. Python automatically closes a file when
the reference object of a file is reassigned to another file. It is a good practice to use the
close() method to close a file.
Syntax
fileObject.close()
Example
# Open a file
fo = open("foo.txt", "wb")
Result −
Reading and Writing Files
The file object provides a set of access methods to make our lives easier. We will see
how to use the read() and write() methods to read and write files.
The write() Method
The write() method writes any string to an open file. It is important to note that Python 2
strings can hold binary data and not just text (in Python 3, files opened in binary mode
take bytes objects instead). The write() method does not add a newline
character ('\n') to the end of the string.
Syntax
fileObject.write(string)
Here, the passed parameter is the content to be written into the opened file.
Example
# Open a file
fo = open("foo.txt", "wb")
The above method would create the foo.txt file, write the given content to that file and
finally close that file. If you opened this file, it would have the following content.
The read() Method
The read() method reads a string from an open file. It is important to note that Python
strings can have binary data apart from text data.
Syntax
fileObject.read([count])
Here, the passed parameter is the number of bytes to be read from the opened file. This
method starts reading from the beginning of the file and, if count is missing, it tries to
read as much as possible, maybe until the end of the file.
Example
# Open a file
fo = open("foo.txt", "r+")
str = fo.read(10)
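A self-contained sketch of write() followed by read(), written for Python 3. The sample sentence and the temporary location are assumptions made for the example; note that in text mode the count passed to read() is a number of characters.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "foo.txt")

fo = open(path, "w")
fo.write("Python is a great language.")   # write() adds no newline itself
fo.close()

fo = open(path, "r+")
first_ten = fo.read(10)      # at most 10 characters from the current position
rest = fo.read()             # no count: read up to the end of the file
at_end = fo.read()           # nothing left, so an empty string is returned
fo.close()

print(repr(first_ten), repr(rest), repr(at_end))
```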
File Positions
The tell() method tells you the current position within the file; in other words, the next
read or write will occur at that many bytes from the beginning of the file.
The seek(offset[, from]) method changes the current file position. The offset argument
indicates the number of bytes to be moved. The from argument specifies the reference
position from where the bytes are to be moved.
If from is set to 0, it means use the beginning of the file as the reference position and 1
means use the current position as the reference position and if it is set to 2 then the end
of the file would be taken as the reference position.
Example
# Open a file
fo = open("foo.txt", "r+")
str = fo.read(10);
str = fo.read(10);
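The behaviour of tell() and seek() can be sketched as follows (Python 3). The file is opened in binary mode so that offsets are plain byte counts; the sample content and the temporary path are assumptions made for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "foo.txt")
with open(path, "wb") as fo:
    fo.write(b"Python is a great language.")   # 27 bytes of sample data

fo = open(path, "rb")
first = fo.read(10)
position = fo.tell()         # current position: 10 bytes from the beginning
fo.seek(0, 0)                # from = 0: offset is relative to the beginning
again = fo.read(10)
fo.seek(-9, 2)               # from = 2: offset is relative to the end
tail = fo.read()
fo.close()

print(position, first, again, tail)
```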
Renaming and Deleting Files
Python's os module provides methods that help you perform file-processing operations,
such as renaming and deleting files.
To use this module you need to import it first and then you can call any related functions.
The rename() Method
The rename() method takes two arguments, the current filename and the new filename.
Syntax
os.rename(current_file_name, new_file_name)
Example
import os
The remove() Method
You can use the remove() method to delete files by supplying the name of the file to be
deleted as the argument.
Syntax
os.remove(file_name)
Example
Following is the example to delete an existing file test2.txt −
#!/usr/bin/python
import os
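A runnable sketch of os.rename() and os.remove() (Python 3). The files live in a temporary directory, and the name test1.txt is an assumption chosen to pair with the test2.txt mentioned above.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
old_name = os.path.join(folder, "test1.txt")
new_name = os.path.join(folder, "test2.txt")

open(old_name, "w").close()          # create an empty file to work on

os.rename(old_name, new_name)        # test1.txt becomes test2.txt
renamed = os.path.exists(new_name) and not os.path.exists(old_name)

os.remove(new_name)                  # delete test2.txt
removed = not os.path.exists(new_name)

print(renamed, removed)
```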
Directories in Python
All files are contained within various directories, and Python has no problem handling
these too. The os module has several methods that help you create, remove, and
change directories.
The mkdir() Method
You can use the mkdir() method of the os module to create directories in the current
directory. You need to supply an argument to this method which contains the name of
the directory to be created.
Syntax
os.mkdir("newdir")
Example
#!/usr/bin/python
import os
The chdir() Method
You can use the chdir() method to change the current directory. The chdir() method
takes an argument, which is the name of the directory that you want to make the current
directory.
Syntax
os.chdir("newdir")
Example
Following is the example to go into "/home/newdir" directory −
#!/usr/bin/python
import os
The getcwd() Method
The getcwd() method returns the current working directory.
Syntax
os.getcwd()
Example
Following is the example to give current directory −
import os
The rmdir() Method
The rmdir() method deletes the directory, which is passed as an argument in the method.
Syntax:
os.rmdir('dirname')
Example
Following is the example to remove "/tmp/test" directory. It is required to give fully
qualified name of the directory, otherwise it would search for that directory in the current
directory.
import os
# This would remove "/tmp/test" directory.
os.rmdir( "/tmp/test" )
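The four directory methods can be combined in one short sketch (Python 3). A temporary directory is used as the working area instead of /tmp/test or /home/newdir, which is an assumption made so that the example runs anywhere.

```python
import os
import tempfile

start = os.getcwd()                       # remember the starting directory
base = tempfile.mkdtemp()                 # scratch area for the example
newdir = os.path.join(base, "newdir")

os.mkdir(newdir)                          # create the directory
os.chdir(newdir)                          # make it the current directory
inside = os.path.basename(os.getcwd())    # name of the current directory

os.chdir(start)                           # leave the directory before removing it
os.rmdir(newdir)                          # rmdir() works only on an empty directory
gone = not os.path.exists(newdir)

print(inside, gone)
```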
File & Directory Related Methods
There are three important sources, which provide a wide range of utility methods to
handle and manipulate files & directories on Windows and Unix operating systems. They
are as follows −
File Object Methods: The file object provides functions to manipulate files.
Python provides two very important features to handle any unexpected error
in your Python programs and to add debugging capabilities in them −
EXCEPTION NAME     DESCRIPTION
StopIteration      Raised when the next() method of an iterator does not point to any object.
StandardError      Base class for all built-in exceptions except StopIteration and SystemExit (Python 2 only).
ArithmeticError    Base class for all errors that occur for numeric calculation.
ZeroDivisionError  Raised when division or modulo by zero takes place for all numeric types.
EOFError           Raised when there is no input from either the raw_input() or input() function and the end of file is reached.
KeyError           Raised when the specified key is not found in the dictionary.
NameError          Raised when an identifier is not found in the local or global namespace.
IOError            Raised when an input/output operation fails, such as the print statement or the open() function when trying to open a file that does not exist.
SystemError        Raised when the interpreter finds an internal problem, but when this error is encountered the Python interpreter does not exit.
ValueError         Raised when the built-in function for a data type has the valid type of arguments, but the arguments have invalid values specified.
RuntimeError       Raised when a generated error does not fall into any category.
What is Exception?
An exception is an event, which occurs during the execution of a program that disrupts
the normal flow of the program's instructions. In general, when a Python script
encounters a situation that it cannot cope with, it raises an exception. An exception is a
Python object that represents an error.
When a Python script raises an exception, it must either handle the exception
immediately or terminate and quit.
Handling an Exception
If you have some suspicious code that may raise an exception, you can defend your
program by placing the suspicious code in a try: block. After the try: block, include
an except: statement, followed by a block of code which handles the problem as
elegantly as possible.
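The try/except pattern described above can be sketched as follows (Python 3). The function name safe_divide and the sample values are assumptions made for the example.

```python
def safe_divide(numerator, denominator):
    """Return the quotient, or None when the division cannot be performed."""
    try:
        result = numerator / denominator
    except ZeroDivisionError:
        # Handle the specific problem instead of letting the script terminate.
        return None
    else:
        # Runs only when the try block raised no exception.
        return result


ok = safe_divide(10, 4)
failed = safe_divide(10, 0)

try:
    {"Name": "Zara"}["Age"]          # looking up a missing key
except KeyError as err:
    missing = err.args[0]            # the key that was not found

print(ok, failed, missing)
```

Catching a specific exception class, rather than using a bare except:, keeps unrelated errors visible.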
The Python standard for database interfaces is the Python DB-API. Most Python
database interfaces adhere to this standard.
You can choose the right database for your application. Python Database API supports a
wide range of database servers such as −
GadFly
mSQL
MySQL
PostgreSQL
Informix
Interbase
Oracle
Sybase
The DB API provides a minimal standard for working with databases using Python
structures and syntax wherever possible. This API includes the following:
Importing the API module.
Acquiring a connection with the database.
Issuing SQL statements and stored procedures.
Closing the connection.
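A DB-API session can be sketched with the sqlite3 module from the Python standard library, which follows the DB-API 2.0 conventions. This is only an illustration: the table name posts, the column names, and the helper count_posts are assumptions, and the database actually used by the project is not stated here.

```python
import sqlite3


def count_posts(texts):
    """Store the given texts in a temporary table and return the row count.

    Hypothetical helper; the schema is assumed for illustration.
    """
    # Acquire a connection (an in-memory database, discarded on close).
    conn = sqlite3.connect(":memory:")
    try:
        cur = conn.cursor()
        # Issue SQL statements through a cursor.
        cur.execute(
            "CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
        )
        # Parameter placeholders (?) avoid building SQL by string concatenation.
        cur.executemany(
            "INSERT INTO posts (body) VALUES (?)", [(t,) for t in texts]
        )
        conn.commit()
        cur.execute("SELECT COUNT(*) FROM posts")
        return cur.fetchone()[0]
    finally:
        # Close the connection even if a statement fails.
        conn.close()
```

Other DB-API drivers expose the same connect/cursor/execute/commit/close shape, although the placeholder style differs between drivers.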
Chapter 3
UML Models
[Use case diagram. Labels recovered from the figure: Service Provider; View Post student data sets; View Hate Speech.]
PRELIMINARY INVESTIGATION
The development of a project starts from an idea, here that of designing a mail-enabled
platform for a small firm in which sending and receiving messages is easy and convenient, with a
search engine, an address book and some entertaining games. Once the idea is approved by the
organization and our project guide, the first activity, i.e. the preliminary investigation, begins.
The activity has three parts:
Request Clarification
Feasibility Study
Request Approval
REQUEST CLARIFICATION
After the request is approved by the organization and the project guide, and with an
investigation being considered, the project request must be examined to determine precisely what
the system requires.
Here our project is basically meant for users within the company whose systems can
be interconnected by a Local Area Network (LAN). In today's busy schedule, people need
everything to be provided in a ready-made manner. Taking into consideration the wide use of
the net in day-to-day life, the corresponding portal was developed.
FEASIBILITY ANALYSIS
Operational Feasibility
Economic Feasibility
Technical Feasibility
Operational Feasibility
Operational feasibility deals with the study of the prospects of the system to be developed.
This system operationally relieves the Admin of manual effort and helps him track project
progress effectively. This kind of automation will reduce the time and energy that were
previously consumed by manual work. Based on the study, the system is found to be
operationally feasible.
Economic Feasibility
Technical Feasibility
According to Roger S. Pressman, technical feasibility is the assessment of the technical
resources of the organization. The organization needs IBM-compatible machines with a graphical
web browser connected to the Internet and intranet. The system is developed for a
platform-independent environment. Java Server Pages, JavaScript, HTML, SQL Server and WebLogic
Server are used to develop the system. The technical feasibility study has been carried out. The
system is technically feasible and can be developed with the existing facilities.
REQUEST APPROVAL
Not all requested projects are desirable or feasible. Some organizations receive so many
project requests from client users that only a few of them are pursued. However, those projects
that are both feasible and desirable should be put into the schedule. After a project request is
approved, its cost, priority, completion time and personnel requirements are estimated and used
to determine where to add it to the project list. Once these factors are approved, development
work can be launched.
INPUT DESIGN
Input design plays a vital role in the software development life cycle and requires very
careful attention from developers. The aim of input design is to feed data to the application as
accurately as possible, so inputs must be designed effectively to minimize the errors that occur
during data entry. According to software engineering concepts, the input forms or screens
are designed to provide validation control over the input limit, range and other related
validations.
This system has input screens in almost all the modules. Error messages are developed to
alert the user whenever he makes a mistake and to guide him in the right way so that invalid
entries are not made. This is discussed in more detail under module design.
Input design is the process of converting user-created input into a computer-based
format. The goal of input design is to make data entry logical and free from errors. Errors
in the input are controlled by the input design. The application has been developed in a
user-friendly manner. The forms have been designed in such a way that, during processing, the
cursor is placed in the position where data must be entered. In certain cases the user is also
provided with an option to select an appropriate input from various alternatives related to the field.
Validations are required for each data item entered. Whenever a user enters erroneous data,
an error message is displayed, and the user can move on to the subsequent pages only after
completing all the entries in the current page.
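The limit and range validations described in this section can be sketched as a function that returns one error message per invalid field. The field names (username, age), the limits, and the message texts are assumptions chosen for illustration; they are not taken from the project's forms.

```python
MAX_NAME_LENGTH = 50        # assumed limit on the text field
MIN_AGE, MAX_AGE = 1, 120   # assumed valid range for the numeric field


def validate_entry(form):
    """Return a dict mapping field names to error messages (empty when valid).

    `form` is assumed to be a dict of strings, as submitted by an input screen.
    """
    errors = {}

    name = form.get("username", "").strip()
    if not name:
        errors["username"] = "Username is required."
    elif len(name) > MAX_NAME_LENGTH:
        errors["username"] = f"Username must be at most {MAX_NAME_LENGTH} characters."
    elif not name.isalnum():
        errors["username"] = "Username must be alphanumeric."

    age = form.get("age", "").strip()
    # isascii() guards against non-ASCII digit characters that int() rejects.
    if not (age.isascii() and age.isdigit()):
        errors["age"] = "Age must contain digits only."
    elif not MIN_AGE <= int(age) <= MAX_AGE:
        errors["age"] = f"Age must be between {MIN_AGE} and {MAX_AGE}."

    return errors
```

A caller would let the user proceed to the next page only when the returned dict is empty, and would otherwise display the messages next to the corresponding fields.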
OUTPUT DESIGN
The output from the computer is required mainly to create an efficient method of
communication within the company, primarily between the project leader and his team members, in
other words, the administrator and the clients. The output of VPN is a system that allows the
project leader to manage his clients by creating new clients and assigning new projects to
them, maintaining a record of project validity, and providing folder-level access to each client
on the user side depending on the projects allotted to him. After completion of a project, a new
project may be assigned to the client. User authentication procedures are maintained from the
initial stages. A new user may be created by the administrator, or a user can register himself
as a new user, but the task of assigning projects and validating a new user rests with the
administrator only.
The application starts running when it is executed for the first time. The server has to be started,
and then Internet Explorer is used as the browser. The project runs on the local area network,
so the server machine serves as the administrator while the other connected systems act as
the clients. The developed system is highly user friendly and can be easily understood by anyone
using it, even for the first time.
The feasibility of the project is analyzed in this phase, and a business proposal
is put forth with a very general plan for the project and some cost estimates.
carried out. This is to ensure that the proposed system is not a burden to the
ECONOMICAL FEASIBILITY
TECHNICAL FEASIBILITY
SOCIAL FEASIBILITY
ECONOMICAL FEASIBILITY
This study is carried out to check the economic impact that the system will
have on the organization. The amount of fund that the company can pour into the
justified. Thus the developed system was well within the budget, and this was achieved
because most of the technologies used are freely available. Only the customized
TECHNICAL FEASIBILITY
This study is carried out to check the technical feasibility, that is, the
technical requirements of the system. Any system developed must not place a high
demand on the available technical resources, as this would lead to high demands
being placed on the client. The developed system must have a modest requirement,
as only minimal or
SOCIAL FEASIBILITY
This aspect of the study checks the level of acceptance of the system by the
user. This includes the process of training the user to use the system efficiently. The
user must not feel threatened by the system, instead must accept it as a necessity.
The level of acceptance by the users solely depends on the methods that are
employed to educate the user about the system and to make him familiar with it. His
level of confidence must be raised so that he is also able to make some constructive
Chapter 4
Architecture Diagram
[Architecture diagram. Labels recovered from the figure:]
Service Provider: View All Person's Speech Data Set Details; Search Person Speech Review Data
Set Details; View Hate Speech; View Positive Speech; View Negative Speech; View Score Results by
Line Chart; View Score Results by Bar Chart; View All Remote Users.
Admin: Accepting all user information; View user data details; Authorize the Admin; Process all
user queries; Store and retrievals; Registering the User.
Web Database.
Remote User: Register and Login; Post person speech data set; Search on person speech data set
details; View Your Profile.
Tweet Server.
Chapter 5
SYSTEM TESTING
Integration testing
Functional test
System Test
System testing ensures that the entire integrated software system meets
requirements. It tests a configuration to ensure known and predictable results. An
example of system testing is the configuration oriented system integration test.
System testing is based on process descriptions and flows, emphasizing pre-driven
process links and integration points.
Unit Testing:
Unit testing is usually conducted as part of a combined code and unit test
phase of the software lifecycle, although it is not uncommon for coding and unit
testing to be conducted as two distinct phases.
Test objectives
All field entries must work properly.
Pages must be activated from the identified link.
The entry screen, messages and responses must not be delayed.
Features to be tested
Verify that the entries are of the correct format
No duplicate entries should be allowed
All links should take the user to the correct page.
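Two of the features listed above (correct entry format and no duplicate entries) can be expressed as automated unit tests. The sketch below uses the standard unittest module; the UserRegistry class and the username pattern are hypothetical stand-ins for the project's registration module, which is not shown in this report.

```python
import re
import unittest

# Assumed format: 3 to 20 letters, digits or underscores.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,20}")


class UserRegistry:
    """Hypothetical in-memory stand-in for the registration module."""

    def __init__(self):
        self._users = set()

    def register(self, username):
        if not USERNAME_PATTERN.fullmatch(username):
            raise ValueError("invalid username format")
        if username in self._users:
            raise ValueError("duplicate entry")
        self._users.add(username)
        return True


class RegistrationTests(unittest.TestCase):
    def setUp(self):
        # A fresh registry for every test keeps the tests independent.
        self.registry = UserRegistry()

    def test_entry_with_correct_format_is_accepted(self):
        self.assertTrue(self.registry.register("shiva_01"))

    def test_entry_with_wrong_format_is_rejected(self):
        with self.assertRaises(ValueError):
            self.registry.register("a b!")

    def test_duplicate_entry_is_rejected(self):
        self.registry.register("shiva_01")
        with self.assertRaises(ValueError):
            self.registry.register("shiva_01")
```

Saved in a file such as test_registration.py (an assumed name), the tests can be run with python -m unittest.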
Integration Testing
Test Results: All the test cases mentioned above passed successfully. No defects
encountered.
Acceptance Testing
Test Results: All the test cases mentioned above passed successfully. No defects
encountered.
In due course, the latest technology advancements will be taken into consideration. As part
of the technical build-up, many components of the networking system will be generic in nature so
that future projects can either use or interact with them. The future holds a lot to offer for the
development and refinement of this project.
SYSTEM TESTING
TESTING METHODOLOGIES
The following are the testing methodologies:
o Unit Testing.
o Integration Testing.
o User Acceptance Testing.
o Output Testing.
o Validation Testing.
Unit Testing
Unit testing focuses verification effort on the smallest unit of software design, that is, the
module. Unit testing exercises specific paths in a module’s control structure to
ensure complete coverage and maximum error detection. This test focuses on each module
individually, ensuring that it functions properly as a unit; hence the name unit testing.
During this testing, each module is tested individually and the module interfaces are
verified for consistency with the design specification. All important processing paths are tested
for the expected results. All error handling paths are also tested.
Integration Testing
Integration testing addresses the issues associated with the dual problems of verification
and program construction. After the software has been integrated, a set of high-order tests is
conducted. The main objective in this testing process is to take unit-tested modules and build a
program structure that has been dictated by design.
1. Top-down Integration
In this method, the software is tested starting from the main module, and individual stubs are
replaced as the test proceeds downwards.
2. Bottom-up Integration
This method begins the construction and testing with the modules at the lowest level in the
program structure. Since the modules are integrated from the bottom up, processing required for
modules subordinate to a given level is always available and the need for stubs is eliminated. The
bottom-up integration strategy may be implemented with the following steps:
o The low-level modules are combined into clusters that perform a specific software
sub-function.
o A driver (i.e. the control program for testing) is written to coordinate test case
input and output.
o The cluster is tested.
o Drivers are removed and clusters are combined moving upward in the program
structure.
The bottom-up approach tests each module individually, and then each module is
integrated with a main module and tested for functionality.
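The bottom-up steps can be sketched in Python as a small cluster plus a driver. The modules shown here (normalize, tokenize) and the cluster preprocess are hypothetical text-handling helpers chosen for illustration; they are not the project's actual modules.

```python
def normalize(text):
    """Low-level module: lowercase the text and collapse whitespace."""
    return " ".join(text.lower().split())


def tokenize(text):
    """Low-level module: split normalized text into word tokens."""
    return text.split(" ") if text else []


def preprocess(text):
    """Cluster: combines the two low-level modules into one sub-function."""
    return tokenize(normalize(text))


def run_driver(cases):
    """Driver: feeds test-case input to the cluster and checks the output.

    `cases` is a list of (input, expected_output) pairs. The return value is
    a list of (input, expected, actual) tuples for the cases that failed.
    """
    failures = []
    for raw, expected in cases:
        actual = preprocess(raw)
        if actual != expected:
            failures.append((raw, expected, actual))
    return failures
```

Once the cluster passes, the driver is discarded and preprocess is called by the next module up in the program structure, which is then tested in the same way.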
User Acceptance Testing
User acceptance of a system is the key factor for the success of any system. The system
under consideration is tested for user acceptance by constantly keeping in touch with the
prospective system users during development and making changes wherever required. The
system developed provides a friendly user interface that can easily be understood even by a person
who is new to the system.
Output Testing
After performing the validation testing, the next step is output testing of the proposed
system, since no system can be useful if it does not produce the required output in the specified
format. The outputs generated or displayed by the system under consideration are tested by asking
the users about the format they require. The output format is considered in two ways: on screen
and in printed form.
Validation Checking
Validation checks are performed on the following fields.
Text Field:
The text field can contain only a number of characters less than or equal to its size. The
text fields are alphanumeric in some tables and alphabetic in other tables. An incorrect entry
always flashes an error message.
Numeric Field:
The numeric field can contain only digits from 0 to 9. An entry of any other character flashes
an error message.
The individual modules are checked for accuracy and for what they have to perform.
Each module is subjected to a test run with sample data. The individually tested modules
are integrated into a single system. Testing involves executing the program with real data;
the existence of any program defect is inferred from the output. The testing should be
planned so that all the requirements are individually tested.
A successful test is one that brings out the defects for inappropriate data and produces
an output revealing the errors in the system.
The above testing is done using various kinds of test data. Preparation of test data plays a
vital role in system testing. After the test data are prepared, the system under study is tested
using them. While testing the system with test data, errors are again uncovered and
corrected using the above testing steps, and the corrections are noted for future use.
Live test data are those that are actually extracted from organization files. After a system is
partially constructed, programmers or analysts often ask users to key in a set of data from their
normal activities. Then, the systems person uses this data as a way to partially test the system. In
other instances, programmers or analysts extract a set of live data from the files and have them
entered themselves.
It is difficult to obtain live data in sufficient amounts to conduct extensive testing. And,
although it is realistic data that will show how the system will perform for the typical processing
requirement, assuming that the live data entered are in fact typical, such data generally will not test
all combinations or formats that can enter the system. This bias toward typical values then does not
provide a true systems test and in fact ignores the cases most likely to cause system failure.
Artificial test data are created solely for test purposes, since they can be generated to test all
combinations of formats and values. In other words, the artificial data, which can quickly be
prepared by a data-generating utility program in the information systems department, make
possible the testing of all logic and control paths through the program.
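A data-generating utility of the kind mentioned above can be sketched with itertools.product, which enumerates every combination of candidate values. The function name generate_cases and the sample field values are assumptions made for illustration.

```python
from itertools import product


def generate_cases(field_values):
    """Yield one dict per combination of the candidate values for each field.

    `field_values` maps a field name to the list of values to try for it.
    """
    names = list(field_values)
    for combo in product(*(field_values[name] for name in names)):
        yield dict(zip(names, combo))


# Hypothetical candidates covering empty, typical and oversized or malformed input.
candidates = {
    "username": ["", "shiva01", "x" * 51],
    "age": ["0", "21", "abc"],
}
cases = list(generate_cases(candidates))  # 3 x 3 = 9 artificial test cases
```

Because the number of cases is the product of the list lengths, the candidate lists are best kept to a few representative values per field (valid, boundary, and invalid).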
The most effective test programs use artificial test data generated by persons other than
those who wrote the programs. Often, an independent team of testers formulates a testing plan,
using the systems specifications.
The package “Virtual Private Network” has satisfied all the requirements specified as per
software requirement specification and was accepted.
USER TRAINING
Whenever a new system is developed, user training is required to educate users about the
working of the system so that it can be put to efficient use by those for whom the system has been
primarily designed. For this purpose the normal working of the project was demonstrated to the
prospective users. Its working is easily understandable, and since the expected users are people
who have good knowledge of computers, the system is very easy for them to use.
5.4 MAINTENANCE
This covers a wide range of activities including correcting code and design errors. To
reduce the need for maintenance in the long run, we have more accurately defined the user’s
requirements during the process of system development. Depending on the requirements, this
system has been developed to satisfy the needs to the largest possible extent. With development in
technology, it may be possible to add many more features based on the requirements in future. The
coding and designing is simple and easy to understand which will make maintenance easier.
TESTING STRATEGY :
A strategy for system testing integrates system test cases and design techniques into a
well-planned series of steps that results in the successful construction of software. The testing
strategy must incorporate test planning, test case design, test execution, and the resultant data
collection and evaluation. A strategy for software testing must accommodate low-level tests that
are necessary to verify that a small source code segment has been correctly implemented as well as
high-level tests that validate major system functions against user requirements.
Software testing is a critical element of software quality assurance and represents the ultimate
review of specification, design and coding. Testing represents an interesting anomaly for the
software. Thus, a series of tests is performed for the proposed system before the system is
ready for user acceptance testing.
SYSTEM TESTING:
Software, once validated, must be combined with other system elements (e.g. hardware,
people, database). System testing verifies that all the elements are proper and that overall system
function and performance are achieved. It also tests to find discrepancies between the system and
its original objective, current specifications and system documentation.
UNIT TESTING:
In unit testing, different modules are tested against the specifications produced during
the design of the modules. Unit testing is essential for verification of the code produced during
the coding phase, and hence the goal is to test the internal logic of the modules. Using the
detailed design description as a guide, important control paths are tested to uncover errors
within the boundary of the modules. This testing is carried out during the programming stage
itself. In this testing step, each module was found to be working satisfactorily with regard to
the expected output from the module.
Chapter 6
Program in PyCharm and user login in web browser:
PyCharm executes the Python program with the command python manage.py runserver,
which gives the URL http://127.0.0.1:8000/ that we can open in a browser.
We need to register and log in as a user; we have another option that we need to
1. Register
2. Login as User
2. Control as an Administrator
2. Line chart
Chapter 7
CONCLUSIONS
This paper presented the principles of three text classification methods,
ELMo, BERT and CNN, and applied them to hate speech detection. It then improved
performance by fusion from two perspectives: fusing the classification
results of ELMo, BERT and CNN, and fusing the classification results of
three CNN classifiers with different parameters. The results showed that fusion
processing is a viable way to improve the performance of hate speech detection, and
the improvement is of practical significance because it comes at little extra
cost. This paper focuses on fusion after separate classification, so the
degree of integration is not deep enough. In the future we will pay more attention to
early cooperation before classification. We will try to replace the basic word
vector expression in CNN with the embedding technologies in ELMo or BERT, which
can deeply integrate the advantages of excellent word embeddings and powerful
neural networks.
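Fusion of separately produced classification results can be sketched as follows. The fusion rule is not stated in this chapter, so the example assumes one simple choice, the mean of the per-class probabilities from each classifier; the function names and the probability values are illustrative only and are not results from the project.

```python
def fuse_mean(prob_lists):
    """Average per-class probabilities from several classifiers.

    `prob_lists` holds one probability list per classifier, all of the same
    length. Mean fusion is an assumed rule, used here only for illustration.
    """
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    return [
        sum(probs[c] for probs in prob_lists) / n_models
        for c in range(n_classes)
    ]


def predict(prob_lists):
    """Return the index of the class with the highest fused probability."""
    fused = fuse_mean(prob_lists)
    return max(range(len(fused)), key=fused.__getitem__)


# Illustrative [not hate, hate] probabilities for one post (made-up values).
elmo, bert, cnn = [0.6, 0.4], [0.3, 0.7], [0.2, 0.8]
label = predict([elmo, bert, cnn])  # two of three models favour class 1
```

Other late-fusion rules, such as majority voting or a weighted mean, fit the same interface by replacing fuse_mean.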
Chapter 8
Bibliography
[18] E. O. Olaniyi, O. K. Oyedotun, and K. Adnan, ``Heart diseases diagnosis
using neural networks arbitration,'' Int. J. Intell. Syst. Appl., vol. 7, no. 12,
p. 72, 2015.
[19] R. Das, I. Turkoglu, and A. Sengur, ``Effective diagnosis of heart disease
through neural networks ensembles,'' Expert Syst. Appl., vol. 36, no. 4,
pp. 76757680, May 2009.
[20] O. W. Samuel, G. M. Asogbon, A. K. Sangaiah, P. Fang, and G. Li,
``An integrated decision support system based on ANN and Fuzzy_AHP
for heart failure risk prediction,'' Expert Syst. Appl., vol. 68, pp. 163172,
Feb. 2017.
[21] A. V. S. Kumar, ``Diagnosis of heart disease using fuzzy resolution
mechanism,''
J. Artif. Intell., vol. 5, no. 1, pp. 4755, Jan. 2012.
[22] M. Gudadhe, K. Wankhade, and S. Dongre, ``Decision support system forheart
disease based on support vector machine and arti_cial neural network,''
in Proc. Int. Conf. Comput. Commun. Technol. (ICCCT), Sep. 2010,
pp. 741_745.
[23] H. Kahramanli and N. Allahverdi, ``Design of a hybrid system for the diabetes
and heart diseases,'' Expert Syst. Appl., vol. 35, nos. 1_2, pp. 82_89,
Jul. 2008.
[24] M. A. Jabbar, B. Deekshatulu, and P. Chandra, ``Classi_cation of heart disease
using arti_cial neural network and feature subset selection,'' Global
J. Comput. Sci. Technol. Neural Artif. Intell., vol. 13, no. 3, pp. 4_8, 2013.
[25] X. Liu, X. Wang, Q. Su, M. Zhang, Y. Zhu, Q. Wang, and Q. Wang,
``A hybrid classi_cation system for heart disease diagnosis based on
the RFRS method,'' Comput. Math. Methods Med., vol. 2017, pp. 1_11,
Jan. 2017.