Disease Drug Prediction Using ML
CHAPTER 1
INTRODUCTION
1.1 Problem Statement:
With the advent of sophisticated computing, doctors have come to rely on technology in a
number of areas, such as surgical imaging and X-rays. Treatment often depends on the
physician's experience and awareness of different factors, from medical history to
temperature, environment, blood pressure and several other variables. This vast number of
variables is assumed to cover the whole workflow, yet no model has been tested successfully
against all of them. To counter this drawback, clinical decision support systems must be used.
Such a system can help physicians make the correct decision. A medical decision support
system covers both the procedure of detecting or identifying suspected diseases or symptoms
and the opinion reached through that process.
1.3 Objective:
Reduce the number of variables and identify diseases using the K-means algorithm where
feasible. This algorithm is well suited to grouping additional pathogens. K-means is one of
the algorithms that solve the clustering problem. Its main principle is to hold k centroids, one
per cluster. Different patient checks can serve as clustering features. The algorithm reduces
the number of iterations and establishes cluster boundaries without overlapping, so that each
diagnosis delivers a precise outcome. The system uses a service-oriented architecture (SOA),
so anyone with internet connectivity can log in, and the LAMSTAR network can adjust the
weights, improve the algorithm's precision, increase overall performance and achieve the best
results.
1.3.1 Proposed System:
Reduce the number of variables and classify the most probable diseases using the K-means
algorithm. This algorithm is well adapted to grouping further pathogens. K-means is one of
the unsupervised learning algorithms that address the clustering problem. The key concept is
to create k centroids, one per cluster. Various checks on the patients are used as clustering
attributes.
The algorithm decreases the number of iterations and establishes cluster boundaries without
overlapping, so that any diagnosis produces a consistent outcome. The system uses a
service-oriented architecture (SOA), so anybody can access it over the web, and the
LAMSTAR network can adjust the weights, improve the algorithm's precision, speed up the
system overall and deliver better results.
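As an illustration of the clustering step, the sketch below groups patients by binary symptom vectors with scikit-learn's KMeans. The feature matrix and the choice of k are hypothetical and stand in for the project's actual dataset.

# Hypothetical sketch: clustering patients by binary symptom vectors with K-means.
import numpy as np
from sklearn.cluster import KMeans

# Each row is one patient; each column is one symptom check (1 = present, 0 = absent).
patients = np.array([
    [1, 0, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
])

# Hold k centroids, one per cluster; k = 2 is an assumption for this sketch.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(patients)

print(kmeans.labels_)           # cluster assignment for each patient
print(kmeans.cluster_centers_)  # one centroid per cluster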
CHAPTER 2
TECHNOLOGIES LEARNT
What is Python :-
Python requires programmers to type relatively little, and its indentation requirement keeps
the code readable at all times.
Python's greatest asset is the vast collection of standard libraries that can be used for the
following –
• Machine Learning
• Multimedia
Advantages of Python :-
1. Extensive Libraries
Python ships with a wide repository of code for different uses such as regular expressions,
documentation generation, unit testing, web browsing, threading, databases, CGI, email,
image handling and more, so we do not have to write all of that code manually.
2. Extensible
Python can be extended to other languages; parts of your code can be written in languages
like C++ or C.
3. Embeddable
Complementing its extensibility, Python is also embeddable. You can place Python code
inside the source code of another language, like C++. This allows us to add scripting
capabilities to code written in other languages.
4. Improved Productivity
The simplicity of the language and its large libraries make programmers more productive
than languages such as Java and C++. You simply have to write less to get more done.
5. IoT Opportunities
Since Python forms the backbone of platforms such as the Raspberry Pi, it has a promising
future in the Internet of Things. This is one way the language connects to the physical world.
6. Simple and Easy
When working with Java, you need to create a class just to print 'Hello World'. In Python, a
single print statement will do. Python is also very easy to read, understand and write. This is
why, once people pick up Python, they can find it hard to adjust to more verbose languages
like Java.
7. Readable
Because it is not very verbose, reading Python is quite close to reading English. This is why
it is so simple to learn, understand and write. No curly braces are required to delimit blocks,
and indentation is compulsory, which also helps keep the code readable.
8. Object-Oriented
The language supports both the procedural and the object-oriented paradigms. While
functions help us with code reusability, classes and objects let us model the real world. A
class encapsulates data and functions into a single unit.
9. Free and Open-Source
As mentioned before, Python is free to use. You can not only use Python for free, but can
also download its source code, modify it, and even distribute it. It ships with an extensive
set of libraries to support you in your work.
10. Portable
If you code a project in a language such as C++, you may have to modify it to run it on
another platform. Python is not like that. Here you only need to code once, and you can run
it anywhere. This is called Write Once, Run Anywhere (WORA). However, you must be
careful not to use system-dependent features.
11. Interpreted
Finally, Python is an interpreted language. Since statements are executed one by one,
debugging is easier than in compiled languages.
Advantages of Python Over Other Languages
1. Less Coding
Almost any task takes less code in Python than the same task in other languages. Python
also has excellent standard library support, so you rarely need to hunt for third-party
libraries to get the job done. This is why so many people recommend Python to newcomers.
2. Affordable
Python is free and open source, so individuals, small businesses and large corporations can
create applications with the freely available tools. Python is popular and widely used, so it
enjoys strong community support.
GitHub's 2019 annual survey found that Python had overtaken Java in popularity among
programming languages.
Disadvantages of Python
So far, we have seen why Python is a great option for your project. But if you choose it, you
should also be aware of its limitations. Let's now look at Python's downsides compared with
other languages.
1. Speed Limitations
We saw that Python code is executed line by line. Because Python is interpreted, execution
is slow. This is not a concern unless speed is a priority for the project. In other words, unless
high speed is required, the benefits offered by Python are enough to distract us from its
speed limitations.
3. Design Restrictions
As you know, Python is dynamically typed. This means you do not have to declare the type
of a variable when you write the code; it uses duck typing. But wait, what is that? It simply
means that if it looks like a duck, it must be a duck. While this makes coding easy for
programmers, it can increase runtime errors.
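A minimal sketch of what duck typing means in practice (the function below is purely illustrative): the type error only appears when the offending line actually runs.

# Dynamic typing: a name can hold any type; type errors surface only at runtime.
def double(x):
    return x * 2

print(double(4))        # 8
print(double("quack"))  # "quackquack" - strings also support *
print(double(None))     # raises TypeError, but only when this line executes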
4. Underdeveloped Database Access Layers
Python's database access layers are rather underdeveloped compared with more commonly
used technologies such as JDBC (Java DataBase Connectivity) and ODBC (Open DataBase
Connectivity). Consequently, it is less often used in large enterprises.
5. Easy to Use
No, we're not joking. Python's simplicity can actually be a problem. Take my example: I'm
mainly a Python user and I don't do much Java. I find the syntax so plain that the verbosity
of Java code seems excessive by comparison.
That was a look at the advantages and disadvantages of the Python programming language.
Python History :-
What do the alphabet and the Python programming language have in common? Right, both
start with ABC. If we talk about ABC in the context of Python, it is clear that ABC is a
programming language. ABC is a general-purpose programming language and programming
environment built at CWI (Centrum Wiskunde & Informatica) in Amsterdam, the Netherlands.
ABC's biggest success was shaping Python's design. Python was conceptualised in the late
1980s, when Guido van Rossum was working on the distributed operating system Amoeba at
CWI. In an interview with Bill Venners, Guido van Rossum said: "In the early 1980s, I
worked as an implementer on a team building a language called ABC at Centrum voor
Wiskunde en Informatica (CWI). I'm not sure how well people know the influence ABC had
on Python. I try to mention ABC's influence because I'm indebted to everything I learned
during that project and to the people who worked on it."
Later in the same interview, Guido van Rossum added: "I remembered all my experience and
some of my frustration with ABC. I decided to try to design a simple scripting language that
possessed some of ABC's better properties, but without its problems. So I started typing. I
created a simple virtual machine, a simple parser and a simple runtime. I made my own
version of the various ABC parts that I liked. Instead of curly braces or begin-end blocks, I
built a simple syntax, used indentation for grouping statements, and developed a small
number of powerful data types: a hash table (or dictionary), a list, strings, and numbers."
Before we get into the specifics of the different approaches, let's begin by looking at what
machine learning is and what it is not. Machine learning is often classified as a subfield of
artificial intelligence, but at first brush that categorisation can be misleading. Research in AI
has undoubtedly contributed to the study of machine learning, but it is more useful to think
of machine learning as a means of building models that 'learn' from data. Once these models
have been fitted to previously observed data, they can be used to predict and explain newly
observed data. I will leave to the reader the more philosophical digression about how close
this form of model-based 'learning' is to the 'learning' performed by the human brain. To use
these methods effectively, it is crucial to understand the problem setting in machine
learning, so we will start with some broad categorisations of the types of approaches
discussed here.
At the most basic level, machine learning can be divided into two major forms: supervised
learning and unsupervised learning.
Unsupervised learning involves modelling the features of a dataset without reference to any
label, and is sometimes described as "letting the dataset speak for itself." These models
include tasks such as clustering and dimensionality reduction. Clustering algorithms identify
distinct groups of data, while dimensionality reduction algorithms search for more concise
representations of the data. In the following section, we will see examples of both forms of
unsupervised learning.
Human beings are currently the most intelligent and evolved species on earth, because they
can think, evaluate and solve complex problems. AI, on the other hand, is still in its initial
stage and has not surpassed human intelligence in several respects. The question, then, is
why does the machine need to learn? The most suitable answer is: "to make decisions, with
efficiency and at scale, based on data." Organisations increasingly apply knowledge extracted
from data to various real-world challenges and problem solving. We can call these
data-driven machine decisions, particularly when automating a process. Rather than using
programming logic, such data-driven decisions are needed for problems that cannot be
coded inherently. In reality we cannot do without human intelligence, but another point is
that we all need to tackle problems efficiently at a vast scale in the modern world. That is
why machine learning is required.
Although machine learning is advancing rapidly, with cryptography and autonomous cars
making major strides, this branch of AI as a whole still has a long way to go. The reason is
that ML has not yet been able to overcome a number of obstacles. The challenges that ML
currently faces are:
Data quality − One of the greatest problems is providing good data to ML algorithms.
Poor-quality data leads to problems with data preprocessing and feature extraction.
Time-consuming task − Another difficulty for ML models is the time it takes to collect,
extract and prepare the data.
No clear objective when formulating business problems − Having no definite target and
established goal is another main challenge for ML, as this technology is not yet mature.
Overfitting and underfitting − If the model overfits or underfits, it cannot represent the
problem adequately.
Curse of dimensionality − Too many features per data point is another obstacle for an ML
model and can be a real hurdle.
Machine Learning Applications:-
Machine learning is the fastest-growing technology, and according to researchers we are in
the golden era of AI and ML. It is used to solve many complex real-world problems that
cannot be solved with traditional approaches. Some real-world ML applications are:
• Sentiment analysis
• Emotion analysis
• Speech synthesis
• Speech recognition
• Customer segmentation
• Object detection
• Fraud detection
• Fraud prevention
The term "machine learning" was coined by Arthur Samuel in 1959 and described as "a field
of study that gives computers the ability to learn without being explicitly programmed."
And that was the beginning of machine learning! In modern times, machine learning is one
of the most popular (if not the most popular) career choices. Machine Learning Engineer was
named the best job of 2019, with 344% growth and an average annual base salary of
$146,085.
But there is still a lot of doubt about what machine learning is and how to get started. This
section covers the basics of machine learning and the path to eventually becoming a
full-fledged Machine Learning Engineer. Let's get started!
Here is a rough roadmap you can follow on your journey to becoming a professional ML
engineer. Of course, you can always adjust the steps to reach your ultimate goal!
If you are a genius, you could start ML right away, but normally there are some prerequisites
you need to know first: linear algebra, multivariate calculus, statistics and Python. And
don't be scared if you don't know them yet! You don't need a PhD in these subjects, but you
do need a working knowledge.
Both linear algebra and multivariate calculus are important in machine learning. However,
how much you need them depends on your role as a data scientist. If you focus mainly on
application-heavy machine learning, you will not need to focus too deeply on the
mathematics, as many convenient libraries are available. But if you want to work on
research and development in machine learning, mastery of linear algebra and multivariate
calculus is very important, since ML algorithms often have to be implemented from scratch.
Data plays a large part in machine learning. In fact, around 80% of your time as an ML
expert will be spent collecting and cleaning data. And statistics is the field that handles the
collection, analysis and presentation of data. So it is no surprise that you need to learn it!
Some of the key concepts in statistics are statistical significance, probability distributions,
hypothesis testing, regression, and so on. Bayesian thinking is also a very important part of
ML; it covers topics such as conditional probability, priors and posteriors, maximum
likelihood, etc.
Some people choose to skip linear algebra, multivariate calculus and statistics, and learn
them through trial and error as they go along. But the one thing you absolutely cannot skip
is Python! While you can use other languages for machine learning, such as R or Scala,
Python is currently the most popular ML language. There are many Python libraries that are
particularly useful for artificial intelligence and machine learning, including Keras,
TensorFlow, scikit-learn, and so on.
So if you want to learn ML, the best thing to do is to learn Python! You can do that using
various online resources and courses, such as the free Python material on GeeksforGeeks.
Now that the prerequisites are covered, you can finally start learning ML (which is the fun
part!!!). It is best to begin with the basics and then move on to the more advanced material.
Some of the fundamental concepts in ML are:
• Model – A model is a specific representation learned from data by applying a machine
learning algorithm. A model is also called a hypothesis.
• Target (Label) – A target attribute or label is the value our model is expected to predict.
For the fruit example mentioned in the features section, the label for each input set would be
the name of the fruit, such as apple, orange, banana, etc.
• Training – The idea is to provide a set of inputs along with the expected outputs (labels).
After training, we obtain a model (hypothesis) that maps new data to one of the learned
categories.
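As a small hedged sketch of these concepts (the fruit features below are invented for illustration), training maps feature rows to target labels, and the fitted model then assigns new inputs to one of the learned categories.

# Illustrative only: toy fruit features (weight in grams, colour code) and labels.
from sklearn.neighbors import KNeighborsClassifier

X_train = [[150, 0], [170, 0], [120, 1], [110, 1]]  # inputs (features)
y_train = ["apple", "apple", "banana", "banana"]     # targets (labels)

model = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)  # training
print(model.predict([[160, 0]]))  # maps new data to a learned label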
Machine learning can examine vast amounts of data and uncover trends and patterns that are
not obvious to humans. For example, for an e-commerce platform like Amazon, it helps to
understand its customers' browsing and purchase histories in order to serve them suitable
products, deals and reminders. It uses this data to show them relevant advertisements.
Advantages of Machine Learning
2. No Human Intervention Needed
With ML you don't have to babysit your project every step of the way. Since it gives
machines the ability to learn, it lets them make predictions and improve the algorithms on
their own. A typical example is anti-virus software: it learns to filter out new threats as they
are identified. ML is also good at recognising spam.
3. Continuous Improvement
5. Wide Applications
Whether you are an e-tailer or a healthcare provider, you can make ML work for you. It has
the potential to give customers a far more personal experience and therefore to target the
right customers.
Disadvantages of Machine Learning
1. Data Acquisition
Machine learning requires training on massive data sets, which should be inclusive/unbiased
and of good quality. There can also be times when we must wait for new data to be
generated.
2. Time and Resources
ML needs enough time to let the algorithms learn and develop enough to fulfil their purpose
with a considerable amount of accuracy and relevance. It also needs massive resources to
run. This can mean additional requirements of computing power for you.
3. Interpretation of Results
Another major challenge is the ability to interpret the results generated by the algorithms
accurately. You must also carefully choose the algorithms for your purpose.
In February 1991, Guido van Rossum published the first version of the Python code
(version 0.9.0) at alt.sources. This release already included exception handling, functions,
and the core data types list, dict, str and so on. It was also object-oriented and had a module
system.
Python version 1.0 was released in January 1994. The major new features included in this
release were the functional programming tools lambda, map, filter and reduce, which Guido
van Rossum never liked. Six and a half years later, in October 2000, Python 2.0 was
introduced. This release included list comprehensions, a full garbage collector and Unicode
support. Python flourished for another 8 years in the 2.x versions until the next big release,
Python 3.0 (also known as "Python 3000" and "Py3K"), appeared. Python 3 is not backwards
compatible with Python 2.x. The main focus of Python 3 was to remove duplicate
programming constructs and modules, in keeping with the rule from the Zen of Python:
"There should be one – and preferably only one – obvious way to do it." Some changes in
Python 3.0:
• There is only one integer type left, i.e. int; long has been merged into int.
• Dividing two integers returns a float instead of an integer. "//" can be used to get the
"old" behaviour.
Python
Python has a dynamic type system and automatic memory management. It supports a
number of programming paradigms, including object-oriented, imperative, procedural and
functional programming.
• Python is interpreted − Python is processed at runtime by the interpreter. You do not need
to compile your program before executing it. This is similar to PERL and PHP.
• Python is interactive − You can actually sit at a Python prompt and interact with the
interpreter directly as you write your programs.
Python also recognises the importance of development speed. This includes readable and
terse code, and access to powerful constructs that avoid tedious repetition of code.
Maintainability may be an almost meaningless metric, but it does say something about how
much code you have to scan, read and/or understand to troubleshoot a problem or tweak a
behaviour. This speed of development, the ease with which a programmer coming from
another language can pick up basic Python skills, and the huge standard library are key to
another area in which Python excels: all of its tools have been quick to integrate, saved a lot
of time, and several people with no Python experience have later patched and extended them
without anything breaking.
TensorFlow
TensorFlow is a free and open-source software library for dataflow programming across a
range of tasks. It is a symbolic math library, and is also used for machine learning
applications such as neural networks. It is used at Google for both research and production.
TensorFlow was developed by the Google Brain team for internal Google use. It was
released under the Apache 2.0 open-source licence on 9 November 2015.
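A minimal sketch in the TensorFlow 1.x graph-and-session style that the implementation chapter later uses; the values are arbitrary.

# TensorFlow 1.x sketch: build a dataflow graph, then run it in a session.
import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    a = tf.constant(2.0)
    b = tf.constant(3.0)
    total = a + b

with tf.Session(graph=graph) as sess:
    print(sess.run(total))  # 5.0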
NumPy
NumPy is a general-purpose array-processing package. It provides a high-performance
multidimensional array object and tools for working with these arrays.
It is the fundamental package for scientific computing with Python. It contains, among other
things, a powerful N-dimensional array object and useful numerical routines, and it can also
integrate with a wide variety of databases easily and rapidly.
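A short sketch of the array object and the vectorised operations described above.

# NumPy sketch: create an N-dimensional array and apply vectorised operations.
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])  # 2-D array object
print(a.shape)         # (2, 3)
print(a * 2)           # element-wise arithmetic, no explicit loops
print(a.mean(axis=0))  # column means: [2.5 3.5 4.5]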
Pandas
Pandas is an open-source Python library providing high-performance data manipulation and
analysis through its versatile data structures. Python was previously used mainly for data
munging and preparation and contributed very little to data analysis itself; Pandas solved
this problem. Using Pandas we can accomplish five typical steps in the processing and
analysis of data, regardless of the origin of the data: load, prepare, manipulate, model and
analyse. Python with Pandas is used in a wide range of fields including finance, economics,
statistics, analytics and more.
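A short sketch of the load-prepare-analyse workflow with a DataFrame; the table contents are invented for illustration.

# pandas sketch: build a small DataFrame, then summarise and group it.
import pandas as pd

df = pd.DataFrame({
    "patient": ["A", "B", "C"],
    "symptom_count": [3, 5, 2],
    "disease": ["Hypertension", "Diabetes", "Hypertension"],
})

print(df.describe())                                  # quick numeric summary
print(df.groupby("disease")["symptom_count"].mean())  # analyse by group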
Matplotlib
The pyplot module provides a MATLAB-like interface for simple plotting, particularly in
combination with IPython. For the power user, you have full control of line styles, font
properties, axis properties and so on, via an object-oriented interface or a set of
MATLAB-style functions.
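A short pyplot sketch of the MATLAB-like plotting interface.

# matplotlib.pyplot sketch: plot a curve, label the axes and save the figure.
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [1, 4, 9, 16]
plt.plot(x, y, "o-", label="y = x^2")  # line style, markers and a legend label
plt.xlabel("x")
plt.ylabel("y")
plt.legend()
plt.savefig("plot.png")  # or plt.show() in an interactive session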
Scikit-learn
Scikit-learn is a free machine learning library for the Python programming language. It
provides a range of supervised and unsupervised learning algorithms, including
classification, regression and clustering methods such as support vector machines, logistic
regression, random forests and k-means, and it is built on top of NumPy, SciPy and
matplotlib.
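A short hedged sketch of a typical scikit-learn workflow on a bundled example dataset (not this project's data): split, fit, score.

# scikit-learn sketch: train/test split, fit a classifier, measure accuracy.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SVC(kernel="linear").fit(X_train, y_train)
print(clf.score(X_test, y_test))  # accuracy on held-out data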
Python may not come pre-installed on your machine. First released in 1991, Python remains
a highly successful high-level programming language today. Its design philosophy
emphasises code readability through its prominent use of whitespace.
How to install Python on Windows and Mac:
Over the years there have been many releases of Python. The question is, how should
Python be installed? It can be confusing for a beginner starting to study Python, but this
tutorial will solve that problem. The current (latest) release of Python is version 3.7.4, i.e.
Python 3.
Before beginning the Python installation process, you must first know your device
specifications. You must download the Python version appropriate for your machine, i.e.
your operating system and processor. My machine is a 64-bit Windows system. The
following steps install Python version 3.7.4 (Python 3) on Windows; the procedure for
Windows 10, 8 and 7 is divided into 4 parts for easier understanding.
Step 1: Download and install Python from the official website using Google Chrome or any
other web browser, or click on the link below: https://www.python.org
Now look for the latest and correct release for your operating system.
Step 3: You can either click the yellow "Download Python 3.7.4" button for Windows, or
scroll down and click the download link for a specific version. Here we download the most
recent Python release for Windows, 3.7.4.
Step 4: Scroll down the page until you find the Files section.
Step 5: Here you can see a different Python download for each OS.
• To download 32-bit Python for Windows you can use any of three options: Windows x86
embeddable zip file, Windows x86 executable installer, or Windows x86 web-based
installer.
• To download 64-bit Python you can likewise use any of three options: Windows x86-64
embeddable zip file, Windows x86-64 executable installer, or Windows x86-64 web-based
installer.
Here we install the Windows x86-64 web-based installer. This completes the first part,
choosing which version of Python to download. Now we continue with the second part,
installing Python.
Note: You can click on the Release Notes option to see the changes or improvements made
in this release.
Python installation
Step 1: Go to Downloads and open the downloaded Python version to start the installation.
Step 2: Before clicking Install Now, make sure to tick "Add Python 3.7 to PATH".
With these steps you have installed Python successfully and properly. Now it is time to
verify the installation.
Step 3: Open the Command Prompt.
Step 4: Let us check whether Python is configured correctly. Type python -V and press
Enter.
Note: If you have previously installed an older version of Python, uninstall the previous
version first and then install the current one.
Checking how the Python IDLE works
Step 3: Click on IDLE (Python 3.7 64-bit) and open the program.
Step 4: To work in IDLE you must first save the file. Click File > Save.
Step 5: Name the file and save it as a Python file by clicking SAVE. Here I named the file
Hey World.
Step 6: Now type, for example, print("Hey World") and press Enter.
You will see that the given command runs. That concludes our tutorial on installing Python:
you have learned how to download and install Python for your Windows operating system.
Note: Unlike Java, Python does not require semicolons at the end of statements.
Advantages of Django
Here are a few advantages of using Django:
As you know, Django is a web framework for Python, and like most modern frameworks it
supports the MVC pattern. Let's first look at what the Model-View-Controller (MVC)
pattern is, and then at how Django's MVT pattern differs.
MVT Pattern
The Model-View-Template (MVT) pattern differs slightly from MVC. The main difference
between the two patterns is that the controller part (the code that controls the interaction
between the model and the view) is handled by Django itself, leaving us with the template.
The template is an HTML file mixed with the Django Template Language (DTL).
The diagram below shows the interaction between each component of the MVT pattern and
the user's request. The developer provides the model, the view and the template, then simply
maps them to a URL, and Django does the magic of serving the page to the user.
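A minimal hedged sketch of the MVT split; the app, model and template names below are hypothetical and are not taken from this project.

# Hypothetical Django sketch: the model and the view; Django itself plays the
# controller role, and the template is an HTML file written in the DTL.

# models.py
from django.db import models

class Patient(models.Model):
    name = models.CharField(max_length=100)

# views.py
from django.shortcuts import render

def patient_list(request):
    patients = Patient.objects.all()
    # "patients.html" is a Django Template Language (DTL) template.
    return render(request, "patients.html", {"patients": patients})

# urls.py would map a URL to the view, e.g.:
# urlpatterns = [path("patients/", patient_list)]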
Jupyter Notebook
Jupyter Notebook is an open-source web application for creating and sharing documents
containing live code, equations, visualisations and text. It is maintained by the people at
Project Jupyter.
Jupyter Notebook is a spin-off of the IPython project, which used to be called the IPython
Notebook. The name Jupyter comes from the core languages it supports: Julia, Python, and
R. Jupyter ships with the IPython kernel, which lets you write programs in Python, but there
are currently more than 100 other kernels you can use as well.
Anaconda
Python distributions include the Python interpreter together with a list of Python packages
and tools such as editors. Anaconda is one of several distributions of Python.
Anaconda is a Python and R distribution for data science, from the company formerly
known as Continuum Analytics. Anaconda ships with more than 100 new packages. It is
used for scientific computing, data science, statistical analysis and machine learning. The
latest version at the time, Anaconda 5.0.1, was released in October 2017.
The 5.0.1 release addresses a few minor bugs and adds useful features, such as updated
support for the R language, that were not available in the original 5.0.0 release.
The distribution also includes a package manager and an open-source package collection
containing over 1000 R and Python data science packages.
If you are happy with plain Python, there is no big reason to switch to Anaconda. But some
people, such as data scientists who are not full-time developers, find Anaconda very useful
because it simplifies a lot of the common problems a beginner runs into.
CHAPTER 3
SYSTEM DESIGN
3.1 System Architecture
3.2 Module Description
1. Chemical structure
2. Drug targets
Chemical structure: At the molecular level, the structure of a drug describes its binding
activity. The most commonly used structural profiling markers for drugs are chemical
fingerprints [13]. Fingerprints are bit vectors indicating the presence (1) or absence (0) of
certain chemical characteristics (e.g. a C=N group, a 6-membered ring). We take an input
chemical formula (SMILES) and use the OpenBabel 2.3 library to generate binary Molecular
ACCess System (MACCS) fingerprints of length 166.
Drug targets: A set of drug targets can highlight the affected biological processes. We
represent the set of drug targets from DrugBank and KEGG as a bit vector in which 1
indicates a drug target and 0 indicates a non-target. This leads to a sparse matrix, as the
median number of putative targets per drug is one.
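The report generates MACCS fingerprints with OpenBabel 2.3; as a hedged stand-in, the sketch below uses RDKit instead (an assumption, not the project's actual tool) to compute the 166 informative MACCS bits, and builds a drug-target bit vector from a hypothetical target list.

# Sketch only: RDKit stands in for OpenBabel; the SMILES and target names are examples.
import numpy as np
from rdkit import Chem
from rdkit.Chem import MACCSkeys

# Structural profile: MACCS fingerprint from a SMILES string (aspirin as an example).
mol = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")
fp = MACCSkeys.GenMACCSKeys(mol)  # 167-bit vector; bit 0 is unused
structure_bits = np.array([int(fp[i]) for i in range(fp.GetNumBits())])[1:]

# Target profile: 1 where the drug hits a known target, 0 elsewhere (hypothetical lists).
all_targets = ["PTGS1", "PTGS2", "ACE", "ADRB1"]
drug_targets = {"PTGS1", "PTGS2"}
target_bits = np.array([1 if t in drug_targets else 0 for t in all_targets])

feature_vector = np.concatenate([structure_bits, target_bits])
print(feature_vector.shape)  # 166 structure bits + 4 target bits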
3.3 System Specification
1. The service should be able to store the user data;
2. The data should be available via any Internet-connected device;
3. The service should be able to sync user data between several devices (notebooks, smart
phones, etc.);
7. Interoperability with other cloud storage services should be permitted, allowing data
migration from one CSP to another.
3.3.1 Software Requirements:
• Operating System : Windows
• Script : Python
• Database :
3.3.2 Hardware Requirements:
• Processor - Pentium III
• Speed - 2.4 GHz
• Hard Disk - 20 GB
• Monitor - 15'' VGA Colour
3.4 Detailed Design
The UML was developed by three software engineers working for Rational Software. It was
adopted as a standard in 1997 and has remained the norm since then, with only a few
updates.
GOALS:
The main purposes of the UML design are to:
Give users a ready-to-use, expressive visual modelling language so they can develop and
exchange meaningful models.
Provide extensibility and specialisation mechanisms to extend the core concepts.
i. USE CASE DIAGRAM:
[Use case diagram: the admin and patient actors are connected to the use cases login,
symptom 1 to symptom 5, logistic regression and drug prediction.]
ii. SEQUENCE DIAGRAM:
[Sequence diagram: the patient logs in, submits symptom 1 to symptom 5, the system runs
logistic regression, and the drug prediction is returned.]
iii. CLASS DIAGRAM:
In software engineering, a Unified Modeling Language (UML) class diagram is a type of
static structure diagram that describes the structure of a system by showing its classes, their
attributes, their operations (or methods), and the relationships among the classes. It shows
which class contains which information.
[Class diagram: a patient class with the attributes user name and password and the
operations name of the patient(), symptom 1() to symptom 5(), logistic regression(),
support vector machine() and drug prediction(); and an admin class with the attributes
user name and password and the operations logistic regression(), support vector machine()
and drug prediction().]
iv. DATA FLOW DIAGRAM:
Data flow diagrams represent the flow of data in a business information system graphically.
A DFD describes how data is transferred through the system, from input to file storage and
report generation.
Data flow diagrams can be divided into logical and physical diagrams. The logical data flow
diagram describes how data flows through a system to perform certain business functions.
The physical data flow diagram describes how that logical flow is implemented.
A DFD graphically depicts the functions or processes that capture, manipulate, store and
distribute data within a system, between the system and its environment, and between the
components of the system. Its visual representation makes it a good communication tool
between the user and the system designer. The structure of DFDs allows starting from a
broad overview and expanding it into a hierarchy of detailed diagrams. DFDs are widely
used for these reasons.
CHAPTER 4
IMPLEMENTATION
import argparse
import collections
from datetime import datetime
import hashlib
import os.path
import random
import re
import sys
import tarfile
import numpy as np
from six.moves import urllib
import tensorflow as tf
# gfile and compat are used below but were missing from the original imports.
from tensorflow.python.platform import gfile
from tensorflow.python.util import compat
FLAGS = None
MAX_NUM_IMAGES_PER_CLASS = 2 ** 27 - 1 # ~134M
def create_image_lists(image_dir, testing_percentage, validation_percentage):
if not gfile.Exists(image_dir):
tf.logging.error("Image directory '" + image_dir + "' not found.")
return None
result = collections.OrderedDict()
sub_dirs = [
os.path.join(image_dir,item)
for item in gfile.ListDirectory(image_dir)]
sub_dirs = sorted(item for item in sub_dirs
if gfile.IsDirectory(item))
for sub_dir in sub_dirs:
extensions = ['jpg', 'jpeg', 'JPG', 'JPEG']
file_list = []
dir_name = os.path.basename(sub_dir)
if dir_name == image_dir:
continue
tf.logging.info("Looking for images in '" + dir_name + "'")
for extension in extensions:
file_glob = os.path.join(image_dir, dir_name, '*.' + extension)
file_list.extend(gfile.Glob(file_glob))
if not file_list:
tf.logging.warning('No files found')
continue
if len(file_list) < 20:
tf.logging.warning(
'WARNING: Folder has less than 20 images, which may cause issues.')
elif len(file_list) > MAX_NUM_IMAGES_PER_CLASS:
tf.logging.warning(
'WARNING: Folder {} has more than {} images. Some images will '
'never be selected.'.format(dir_name, MAX_NUM_IMAGES_PER_CLASS))
label_name = re.sub(r'[^a-z0-9]+', ' ', dir_name.lower())
training_images = []
testing_images = []
validation_images = []
for file_name in file_list:
base_name = os.path.basename(file_name)
hash_name = re.sub(r'_nohash_.*$', '', file_name)
hash_name_hashed = hashlib.sha1(compat.as_bytes(hash_name)).hexdigest()
percentage_hash = ((int(hash_name_hashed, 16) %
(MAX_NUM_IMAGES_PER_CLASS + 1)) *
(100.0 / MAX_NUM_IMAGES_PER_CLASS))
if percentage_hash < validation_percentage:
validation_images.append(base_name)
elif percentage_hash < (testing_percentage + validation_percentage):
testing_images.append(base_name)
else:
training_images.append(base_name)
result[label_name] = {
'dir': dir_name,
'training': training_images,
'testing': testing_images,
'validation': validation_images,
}
return result
def create_model_graph(model_info):
with tf.Graph().as_default() as graph:
model_path = os.path.join(FLAGS.model_dir, model_info['model_file_name'])
with gfile.FastGFile(model_path, 'rb') as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
bottleneck_tensor, resized_input_tensor = (tf.import_graph_def(
graph_def,
name='',
return_elements=[
model_info['bottleneck_tensor_name'],
model_info['resized_input_tensor_name'],
]))
return graph, bottleneck_tensor, resized_input_tensor
def maybe_download_and_extract(data_url):
  dest_directory = FLAGS.model_dir
  if not os.path.exists(dest_directory):
    os.makedirs(dest_directory)
  filename = data_url.split('/')[-1]
  filepath = os.path.join(dest_directory, filename)
  if not os.path.exists(filepath):
    # Download the model archive if it is not already cached locally.
    filepath, _ = urllib.request.urlretrieve(data_url, filepath)
  # Unpack the downloaded archive into the model directory.
  tarfile.open(filepath, 'r:gz').extractall(dest_directory)
def ensure_dir_exists(dir_name):
  if not os.path.exists(dir_name):
    os.makedirs(dir_name)
bottleneck_path_2_bottleneck_values = {}
def create_bottleneck_file(bottleneck_path, image_lists, label_name, index,
image_dir, category, sess, jpeg_data_tensor,
decoded_image_tensor, resized_input_tensor,
bottleneck_tensor):
tf.logging.info('Creating bottleneck at ' + bottleneck_path)
image_path = get_image_path(image_lists, label_name, index,
image_dir, category)
if not gfile.Exists(image_path):
tf.logging.fatal('File does not exist %s', image_path)
image_data = gfile.FastGFile(image_path, 'rb').read()
try:
bottleneck_values = run_bottleneck_on_image(
sess, image_data, jpeg_data_tensor, decoded_image_tensor,
resized_input_tensor, bottleneck_tensor)
except Exception as e:
raise RuntimeError('Error during processing file %s (%s)' % (image_path,
str(e)))
bottleneck_string = ','.join(str(x) for x in bottleneck_values)
with open(bottleneck_path, 'w') as bottleneck_file:
bottleneck_file.write(bottleneck_string)
  # Progress counter from the enclosing cache_bottlenecks() loop in the original
  # retraining script, kept here as excerpted.
  how_many_bottlenecks += 1
  if how_many_bottlenecks % 100 == 0:
    tf.logging.info(
        str(how_many_bottlenecks) + ' bottleneck files created.')
def get_random_distorted_bottlenecks(
sess, image_lists, how_many, category, image_dir, input_jpeg_tensor,
distorted_image, resized_input_tensor, bottleneck_tensor):
class_count = len(image_lists.keys())
bottlenecks = []
ground_truths = []
for unused_i in range(how_many):
label_index = random.randrange(class_count)
label_name = list(image_lists.keys())[label_index]
image_index = random.randrange(MAX_NUM_IMAGES_PER_CLASS + 1)
image_path = get_image_path(image_lists, label_name, image_index, image_dir,
category)
if not gfile.Exists(image_path):
tf.logging.fatal('File does not exist %s', image_path)
jpeg_data = gfile.FastGFile(image_path, 'rb').read()
distorted_image_data = sess.run(distorted_image,
{input_jpeg_tensor: jpeg_data})
bottleneck_values = sess.run(bottleneck_tensor,
{resized_input_tensor: distorted_image_data})
bottleneck_values = np.squeeze(bottleneck_values)
ground_truth = np.zeros(class_count, dtype=np.float32)
ground_truth[label_index] = 1.0
bottlenecks.append(bottleneck_values)
ground_truths.append(ground_truth)
return bottlenecks, ground_truths
def variable_summaries(var):
with tf.name_scope('summaries'):
mean = tf.reduce_mean(var)
tf.summary.scalar('mean', mean)
with tf.name_scope('stddev'):
stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
tf.summary.scalar('stddev', stddev)
tf.summary.scalar('max', tf.reduce_max(var))
tf.summary.scalar('min', tf.reduce_min(var))
tf.summary.histogram('histogram', var)
def add_final_training_ops(class_count, final_tensor_name, bottleneck_tensor,
                           bottleneck_tensor_size):
  with tf.name_scope('input'):
    bottleneck_input = tf.placeholder_with_default(
        bottleneck_tensor,
        shape=[None, bottleneck_tensor_size],
        name='BottleneckInputPlaceholder')
  # The rest of this function (the new softmax layer and its training ops) is not
  # reproduced; the excerpt continues inside the main training loop, which
  # periodically logs training accuracy and cross entropy.
  if (i % FLAGS.eval_step_interval) == 0 or is_last_step:
    train_accuracy, cross_entropy_value = sess.run(
        [evaluation_step, cross_entropy],
        feed_dict={bottleneck_input: train_bottlenecks,
                   ground_truth_input: train_ground_truth})
tf.logging.info('%s: Step %d: Train accuracy = %.1f%%' %
(datetime.now(), i, train_accuracy * 100))
tf.logging.info('%s: Step %d: Cross entropy = %f' %
(datetime.now(), i, cross_entropy_value))
validation_bottlenecks, validation_ground_truth, _ = (
get_random_cached_bottlenecks(
sess, image_lists, FLAGS.validation_batch_size, 'validation',
FLAGS.bottleneck_dir, FLAGS.image_dir, jpeg_data_tensor,
decoded_image_tensor, resized_image_tensor, bottleneck_tensor,
FLAGS.architecture))
# Run a validation step and capture training summaries for TensorBoard
# with the `merged` op.
validation_summary, validation_accuracy = sess.run(
[merged, evaluation_step],
feed_dict={bottleneck_input: validation_bottlenecks,
ground_truth_input: validation_ground_truth})
validation_writer.add_summary(validation_summary, i)
tf.logging.info('%s: Step %d: Validation accuracy = %.1f%% (N=%d)' %
(datetime.now(), i, validation_accuracy * 100,
len(validation_bottlenecks)))
# Command-line flags; the opening lines of this definition are restored so the
# excerpt parses.
parser = argparse.ArgumentParser()
parser.add_argument(
    '--eval_step_interval',
    type=int,
    default=10,
    help='How often to evaluate the training results.'
)
parser.add_argument(
'--train_batch_size',
type=int,
default=100,
help='How many images to train on at a time.'
)
parser.add_argument(
'--test_batch_size',
type=int,
default=-1,
help="""\
How many images to test on. This test set is only used once, to evaluate
the final accuracy of the model after training completes.
A value of -1 causes the entire test set to be used, which leads to more
stable results across runs.\
"""
)
parser.add_argument(
'--validation_batch_size',
type=int,
default=100,
help="""\
How many images to use in an evaluation batch. This validation set is
used much more often than the test set, and is an early indicator of how
accurate the model is during training.
A value of -1 causes the entire validation set to be used, which leads to
more stable results across training iterations, but may be slower on large
training sets.\
"""
CHAPTER 5
SYSTEM TESTING
The aim of testing is to discover errors. Testing is the process of trying to discover every
conceivable fault or weakness in a work product. It provides a way to check the
functionality of components, subassemblies, assemblies and/or the finished product. It is the
process of exercising software to ensure that the software system meets its requirements and
user expectations and does not fail in an unacceptable manner. There are various types of
tests. Each test type addresses a specific testing requirement.
TYPES OF TESTS
Unit testing
Unit testing involves designing test cases that validate that the internal program logic is
functioning properly and that program inputs produce valid outputs. All decision branches
and internal code flow should be validated. It is the testing of the individual software units
of the application. It is done after the completion of an individual unit and before
integration. This is a structural test that relies on knowledge of the construction and is
invasive. Unit tests perform basic tests at the component level and test a specific business
process, application and/or system configuration. Unit tests ensure that each unique path of
a business process performs accurately to the documented specifications and contains
clearly defined inputs and expected results.
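As a small hedged sketch (the helper under test is hypothetical, not part of this project's code), a unit test checks one unit of logic against clearly defined inputs and expected results.

# Hypothetical sketch of a unit test using Python's built-in unittest module.
import unittest

def has_symptom(symptoms, name):
    # Unit under test: case-insensitive symptom lookup (illustrative helper).
    return name.strip().lower() in {s.strip().lower() for s in symptoms}

class TestHasSymptom(unittest.TestCase):
    def test_present(self):
        self.assertTrue(has_symptom(["headache", "chills"], "Headache"))

    def test_absent(self):
        self.assertFalse(has_symptom(["headache"], "cough"))

if __name__ == "__main__":
    unittest.main()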
Integration testing
Functional test
System Test
System testing ensures that the entire integrated software system meets the requirements. It
tests a configuration to ensure known and predictable results. An example of system testing
is the configuration-oriented system integration test. System testing is based on process
descriptions and flows, emphasising pre-driven process links and integration points.
Black Box Testing is testing the software without any knowledge of the inner workings,
structure or language of the module being tested. Black box tests, like most other kinds of
tests, must be written from a definitive source document, such as a specification or
requirements document. It is a test in which the software under test is treated as a black box:
you cannot "see" into it. The test provides inputs and responds to outputs without
considering how the software works.
5.1 Unit Testing:
Unit testing is usually conducted as part of a combined code and unit test phase of the
software life cycle, although it is not uncommon for coding and unit testing to be conducted
as two distinct phases.
Field tests will be carried out manually, and functional tests will be written in detail.
• The entry screen, messages and responses must not be delayed.
Features to be tested
• Verify that the entries are of the correct format
Test results: all the test cases mentioned above passed successfully. No defects were
encountered.
RESULTS
itching,
skin_rash
nodal_skin_eruptions
continuous_sneezing
shivering
chills
joint_pain
stomach_pain
acidity
ulcers_on_tongue
muscle_wasting
vomiting
burning_micturition
spotting_ urination
fatigue
weight_gain
anxiety
cold_hands_and_feets
mood_swings
weight_loss
restlessness
lethargy
patches_in_throat
irregular_sugar_level
cough
high_fever
sunken_eyes
breathlessness
sweating
dehydration
indigestion
headache
yellowish_skin
dark_urine
nausea
loss_of_appetite
pain_behind_the_eyes
back_pain
constipation
abdominal_pain
diarrhoea
mild_fever
yellow_urine
yellowing_of_eyes
acute_liver_failure
fluid_overload
swelling_of_stomach
swelled_lymph_nodes
malaise
blurred_and_distorted_vision
phlegm
throat_irritation
redness_of_eyes
sinus_pressure
runny_nose
congestion
chest_pain
weakness_in_limbs
fast_heart_rate
pain_during_bowel_movements
pain_in_anal_region
bloody_stool
irritation_in_anus
neck_pain
dizziness
cramps
bruising
obesity
swollen_legs
swollen_blood_vessels
puffy_face_and_eyes
enlarged_thyroid
brittle_nails
swollen_extremeties
excessive_hunger
extra_marital_contacts
drying_and_tingling_lips
slurred_speech
knee_pain
hip_joint_pain
muscle_weakness
stiff_neck
swelling_joints
movement_stiffness
spinning_movements
loss_of_balance
unsteadiness
weakness_of_one_body_side
loss_of_smell
bladder_discomfort
foul_smell_of urine
continuous_feel_of_urine
passage_of_gases
internal_itching
toxic_look_(typhos)
depression
irritability
muscle_pain
altered_sensorium
red_spots_over_body
We predict the disease from the dataset, and we have implemented drug suggestions for the
diabetes and hypertension problems.
Execution:
Click on the run.bat file in your project directory.
Enter the name of the patient and the symptoms of the patient to predict the disease, and
then click on the algorithm you want to use for the prediction.
In the figure above, for the given symptoms, the system predicted Hypertension using
logistic regression.
Now we test SVM as well. For the given symptoms, SVM also predicted Hypertension.
Next we predict the drug for the disease.
For Hypertension it suggests 2 drugs to relieve the condition.
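As a hedged sketch of this prediction step (the toy symptom rows, labels and drug table below are illustrative and stand in for the project's dataset and GUI code), both logistic regression and an SVM are fitted on binary symptom vectors, and a drug suggestion is looked up for the predicted disease.

# Illustrative sketch only: toy data standing in for the project's symptom dataset.
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Columns: [headache, dizziness, chest_pain, excessive_hunger, fatigue]
X = [[1, 1, 1, 0, 0],
     [1, 1, 0, 0, 1],
     [0, 0, 0, 1, 1],
     [0, 1, 0, 1, 1]]
y = ["Hypertension", "Hypertension", "Diabetes", "Diabetes"]

new_patient = [[1, 1, 1, 0, 1]]

lr = LogisticRegression().fit(X, y)
svm = SVC(kernel="linear").fit(X, y)
print("Logistic regression:", lr.predict(new_patient)[0])
print("SVM:", svm.predict(new_patient)[0])

# Hypothetical drug lookup for the predicted disease.
drugs = {"Hypertension": ["drug_1", "drug_2"], "Diabetes": ["drug_3"]}
print("Suggested drugs:", drugs[lr.predict(new_patient)[0]])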
CHAPTER 9
CONCLUSIONS AND FUTURE SCOPE
Researchers have used publicly accessible data sets to validate their drug prediction
hypotheses. However, the data sets differ and can change over time, which can lead to
different conclusions for the same hypotheses. We use Semantic Web technologies, in
particular Linked Data, to represent, link and access drug and disease data based on
Bio2RDF. We use SPARQL queries for the classification of drugs and diseases. If a new
version of the data is released, the queries can simply be executed again to obtain the
updated data. We have collected a broader data set containing 816 drugs and 1393 diseases.
Predictions were evaluated against gold standard data generated by combining multiple
drug data sources. We also tested our method on a separate dataset [23], which demonstrates
the predictive power of our method independently of the compiled data. A crucial flaw of
typical evaluation schemes for drug prediction, which leads to unrealistically optimistic
results, is that the paired nature of the inputs is not considered [15]. We therefore divided
the data into different train and test sets in which neither the pairs nor the drugs/diseases
overlap, as proposed in [14] for drug interaction prediction. We tested several classifiers
under various cross-validation schemes and compared our approach with the existing
methods PREDICT and SLAMS. We found that, in the disjoint cross-validation settings, our
approach achieved better predictive performance than PREDICT and SLAMS.
BIBLIOGRAPHY
1. Brown, A.S., Patel, C.J.: A standard database for drug repositioning. Scientific Data 4,
170029 (2017)
2. Callahan, A., Cruz-Toledo, J., Ansell, P., Dumontier, M.: Bio2RDF release 2: improved
coverage, interoperability and provenance of life science linked data. In: Extended Semantic
Web Conference, pp. 200-212. Springer (2013)
3. Campillos, M., Kuhn, M., Gavin, A.C., Jensen, L.J., Bork, P.: Drug target identification
using side-effect similarity. Science 321(5886), 263-266 (2008)
4. Chiang, A.P., Butte, A.J.: Systematic evaluation of drug-disease relationships to identify
leads for novel drug uses. Clinical Pharmacology & Therapeutics 86(5), 507-510 (2009)
5. Gottlieb, A., Stein, G.Y., Ruppin, E., Sharan, R.: PREDICT: a method for inferring novel
drug indications with application to personalized medicine. Molecular Systems Biology
7(1), 496 (2011)
6. Guney, E.: Reproducible drug repurposing: When similarity does not suffice. In: Pacific
Symposium on Biocomputing 2017, pp. 132-143 (2017)
7. Hay, P.J., Claudino, A.M.: Bulimia nervosa: online interventions. BMJ Clinical Evidence
2015 (2015)
8. Hu, G., Agarwal, P.: Human disease-drug network based on genomic expression profiles.
PLoS ONE 4(8), e6536 (2009)
9. Kuhn, M., Letunic, I., Jensen, L.J., Bork, P.: The SIDER database of drugs and side
effects. Nucleic Acids Research 44(D1), D1075-D1079 (2015)
10. Lamb, J., Crawford, E.D., Peck, D., Modell, J.W., Blat, I.C., Wrobel, M.J., Lerner, J.,
Brunet, J.P., Subramanian, A., Ross, K.N., et al.: The Connectivity Map: using
gene-expression signatures to connect small molecules, genes, and disease. Science
313(5795), 1929-1935 (2006)
11. Larrosa, O., de la Llave, Y., Barrio, S., Granizo, J.J., Garcia-Borreguero, D.: Stimulant
and anticataplectic effects of reboxetine in patients with narcolepsy: a pilot study. Sleep
24(3), 282-285 (2001)
12. Lemke, M.R.: Effect of reboxetine on depression in Parkinson's disease patients. The
Journal of Clinical Psychiatry 63(4), 300-304 (2002)
13. Melville, J.L., Hirst, J.D.: TMACC: Interpretable correlation descriptors for quantitative
structure-activity relationships. J. Chem. Inf. Model. 47(2), 626-634 (Mar 2007),
http://dx.doi.org/10.1021/ci6004178
14. Pahikkala, T., Airola, A., Pietila, S., Shakyawar, S., Szwajda, A., Tang, J., Aittokallio,
T.: Toward more realistic drug-target interaction predictions. Briefings in Bioinformatics
16(2), 325-337 (2014)
15. Park, Y., Marcotte, E.M.: Flaws in evaluation schemes for pair-input computational
predictions. Nature Methods 9(12), 1134-1136 (2012)
16. Ratner, S., Laor, N., Bronstein, Y., Weizman, A., Toren, P.: Six-week open-label
reboxetine treatment in children and adolescents with attention-deficit/hyperactivity
disorder. Journal of the American Academy of Child & Adolescent Psychiatry 44(5),
428-433 (2005)
17. Schmidt, C., Leibiger, J., Fendt, M.: The norepinephrine reuptake inhibitor reboxetine is
more potent in treating murine narcoleptic episodes than the serotonin reuptake inhibitor
escitalopram. Behavioural Brain Research 308, 205-210 (2016)
18. Silveira, R.O., Zanatto, V., Appolinario, J., Kapczinski, F.: An open trial of reboxetine
in obese patients with binge eating disorder. Eating and Weight Disorders - Studies on
Anorexia, Bulimia and Obesity 10(4), e93-e96 (2005)
19. Tehrani-Doost, M., Moallemi, S., Shahrivar, Z.: An open-label trial of reboxetine in
children and adolescents with attention-deficit/hyperactivity disorder. Journal of Child and
Adolescent Psychopharmacology 18(2), 179-184 (2008)
20. Versiani, M., Cassano, G., Perugi, G., Benedetti, A., Mastalli, L., Nardi, A., Savino, M.:
Reboxetine, a selective norepinephrine reuptake inhibitor, is an effective and well-tolerated
treatment for panic disorder. The Journal of Clinical Psychiatry (2002)
21. Wilkinson, M., Dumontier, M., Aalbersberg, I., Appleton, G., Axton, M., Baak, A.,
Blomberg, N., Boiten, J., da Silva Santos, L., Bourne, P., Bouwman, J., Brookes, A., Clark,
T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S., Evelo, C., Finkers, R., Gonzalez-Beltran,
A., Gray, A., Groth, P., Goble, C., Grethe, J., Heringa, J., 't Hoen, P., Hooft, R., Kuhn, T.,
Kok, R., Kok, J., Lusher, S., Martone, M., Mons, A., Packer, A., Persson, B., Rocca-Serra,
P., Roos, M., van Schaik, R., Sansone, S., Schultes, E., Sengstag, T., Slater, T., Strawn, G.,
Swertz, M., Thompson, M., Van Der Lei, J., Van Mulligen, E., Velterop, J., Waagmeester,
A., Wittenburg, P., Wolstencroft, K., Zhao, J., Mons, B.: The FAIR guiding principles for
scientific data management and stewardship. Scientific Data 3 (2016)
22. Yang, L., Agarwal, P.: Systematic drug repositioning based on clinical side-effects.
PLoS ONE 6(12), e28025 (2011)
23. Zhang, P., Agarwal, P., Obradovic, Z.: Computational drug repositioning by ranking and
integrating multiple data sources. In: Joint European Conference on Machine Learning and
Knowledge Discovery in Databases, pp. 579-594. Springer (2013)