Wart Treatment Using Machine Learning Support Vector Algorithm
Abstract
Support vector machines (SVMs) have become powerful tools for problem solving in machine learning.
The SVM algorithm can be used for face detection, image classification, text categorization, and
similar tasks. In this paper, we predict the result of wart treatment of patients using immunotherapy
with a support vector machine model. Immunotherapy is a new class of cancer treatment that harnesses
the innate power of the patient's own immune system to fight cancer. We ran several algorithms and
compared their performance in terms of the wart treatment result of patients using immunotherapy.
The different algorithms are introduced step by step. The treatment of a patient's warts using
immunotherapy is treated as a classification problem and evaluated using several machine learning
algorithms. The evaluations were performed on diverse feature sets and with different classification
methods. A comparison of the results is also presented, and the evaluation shows that random forest
achieves the highest accuracy for wart treatment using immunotherapy. The immunotherapy data set has
90 instances and 8 attributes of integer and real type. The data mining tool used for this data set
was sklearn.
1. Introduction
SVM is a supervised machine learning algorithm that can be used for classification or regression
problems. It uses a technique called the kernel trick to transform the data and then, based on these
transformations, finds an optimal boundary between the possible outputs. Simply put, it performs
complex data transformations and then works out how to separate the data based on the labels or
outputs we have defined. The goal of the SVM algorithm is to find the best line or decision boundary
that divides the n-dimensional feature space into classes, so that a new data point can easily be
assigned to the correct category. This best decision boundary is called a hyperplane. SVM chooses the
extreme points (vectors) that help in creating the hyperplane. These extreme cases are called support
vectors, and hence the algorithm is termed Support Vector Machine. Consider the diagram below, in
which two different categories are separated by a decision boundary or hyperplane.
Lung cancer is one of the most common and leading causes of cancer death in humans. Early detection
of cancer plays the main role in increasing a patient's probability of surviving the disease. Other
researchers have demonstrated the performance of support vector machine (SVM) and logistic regression
(LR) algorithms in predicting the survival rate of lung cancer patients and compared the effectiveness
of the two algorithms through accuracy, precision, recall, F1 score and confusion matrix. These
techniques have been applied to estimate the survival chances of lung cancer patients and to help
physicians make decisions on the prognosis of the disease.
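The measures named above can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch in plain Python; the function name `binary_metrics` and the counts passed to it are hypothetical values chosen only for illustration and are not results from any study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts (90 cases in total), for illustration only.
metrics = binary_metrics(tp=40, fp=10, fn=5, tn=35)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Accuracy alone can hide an imbalance between the two kinds of error, which is why precision, recall and F1 are usually reported alongside it.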
A support vector machine represents the training data as points in space, separated into categories
by a clear gap that is as wide as possible. New examples are then mapped into that same space and
predicted to belong to a category based on which side of the gap they fall. SVM is capable of both
classification and regression. In this paper I focus on using SVM for classification, and in
particular on non-linear SVM, that is, SVM with a non-linear kernel. Non-linear SVM means that the
boundary the algorithm calculates does not have to be a straight line. The benefit is that it can
capture much more complex relationships between data points without requiring difficult manual
transformations. The downside is that training takes much longer, because it is much more
computationally intensive.
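The difference between a linear and a non-linear kernel can be illustrated with a small sketch in sklearn. The data here is synthetic (two concentric rings from `make_circles`), not the immunotherapy dataset, and the parameter values are assumptions chosen for illustration.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Synthetic two-class data: one ring of points inside another,
# so no straight line can separate the classes.
X, y = make_circles(n_samples=200, noise=0.05, factor=0.4, random_state=0)

# Same algorithm, two kernels: a straight-line boundary vs. an RBF boundary.
linear_svm = SVC(kernel="linear").fit(X, y)
rbf_svm = SVC(kernel="rbf", gamma="scale").fit(X, y)

# Accuracy on the training points, only to show what each boundary can express.
linear_acc = linear_svm.score(X, y)
rbf_acc = rbf_svm.score(X, y)

# The support vectors are the training points that define the boundary.
n_support = rbf_svm.support_vectors_.shape[0]

print(f"linear kernel accuracy: {linear_acc:.2f}")
print(f"RBF kernel accuracy:    {rbf_acc:.2f}")
print(f"support vectors used by the RBF model: {n_support}")
```

On data like this the RBF kernel should fit the ring-shaped boundary that the linear kernel cannot express; on larger datasets the non-linear kernel is also the slower of the two to train.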
A decision tree is a tree-like graph in which each node represents the place where we pick an
attribute and ask a question, each edge represents an answer to that question, and each leaf
represents the actual output or class label. Decision trees are used for non-linear decision making
with simple linear decision surfaces.
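The node, edge and leaf structure described above can be sketched as a small hand-built tree in plain Python. The attribute names `x1` and `x2`, the thresholds and the class labels are hypothetical; a real tree would be learned from the data.

```python
# Internal nodes are dicts that name an attribute and a threshold (the question);
# the "yes"/"no" keys are the edges (the answers);
# leaves are plain strings holding the class label.
tree = {
    "attribute": "x1", "threshold": 5.0,
    "yes": "class A",
    "no": {
        "attribute": "x2", "threshold": 2.0,
        "yes": "class B",
        "no": "class A",
    },
}


def predict(node, sample):
    """Walk from the root to a leaf, answering one question per node."""
    while isinstance(node, dict):
        is_below = sample[node["attribute"]] <= node["threshold"]
        node = node["yes" if is_below else "no"]
    return node


print(predict(tree, {"x1": 3.0, "x2": 10.0}))
print(predict(tree, {"x1": 8.0, "x2": 1.0}))
print(predict(tree, {"x1": 8.0, "x2": 4.0}))
```

Each question is a simple threshold on one attribute, yet the region assigned to "class B" (x1 above 5 and x2 at most 2) is a box rather than a half-plane, which is how simple linear splits combine into a non-linear decision.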
RangeIndex: 90 entries, 0 to 89
In this paper I compare three different classification algorithms on the same dataset to see the
performance of each model, and I provide comparison results in terms of accuracy, confusion matrix
and classification report based on the experimental results.
Result 1: Accuracy, confusion matrix and classification report for decision tree
Result 3: Accuracy, confusion matrix and classification report for random forest
As shown in the bar chart below, I compared the different classification techniques in terms of
accuracy, confusion matrix and classification report by applying 10-fold cross validation; random
forest achieves an accuracy of around 86.6%.
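To make the 10-fold procedure concrete, the sketch below splits 90 row indices into 10 folds in plain Python. The helper name `k_fold_indices` is hypothetical, and the folds are contiguous and unshuffled for readability; in practice the rows are usually shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(k_fold_indices(n_samples=90, k=10))
for i, (train, test) in enumerate(folds, start=1):
    print(f"fold {i}: {len(train)} training rows, {len(test)} test rows")
```

With 90 instances each fold holds out 9 rows for testing and trains on the remaining 81, so every instance is used for testing exactly once.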
6. Conclusion
In this paper, I compared support vector machine, random forest and decision tree classifiers on the
immunotherapy dataset. I used 10-fold cross validation (90% for training and 10% for testing in each
fold) for all algorithms. Based on these classifications, the accuracy of decision tree is 83.33%,
the accuracy of random forest is 86.66% and the accuracy of SVM is 78.88%. These results show that
different algorithms reach different accuracies on the same dataset, so the most effective one must
be selected by comparing them.
7. References
https://www.researchgate.net/publication/319870836_Predicting_Lung_Cancer_Survivability_using_SVM_and_Logistic_Regression_Algorithms
https://archive.ics.uci.edu/ml/datasets/Immunotherapy+Dataset
https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code/