
Worksheet 8


Experiment Title.

Student Name: Ankit Ram    UID: 21MSM3021
Branch: Mathematics    Section/Group: 1-B
Semester: 2nd    Date of Performance: 28/04
Subject Name: LaTeX    Subject Code:

1. Overview of the practical:


What will I learn in this experiment?

We will see how to write a research/review paper in LaTeX.

2. Task to be done:

i. Write the research/review paper that you have prepared (completed or in progress) in your
SEMINAR classes under the supervision of your assigned mentors in Semesters 1 and 2.
ii. Attach your seminar paper along with your LaTeX file (i.e., a merged PDF file or
merged Word file). Also state your supervisor's name clearly in your manuscript.
Short Review on Machine Learning and Its Applications
Ankit Ram¹, Meenakshi¹
¹Department of Mathematics, Chandigarh University, Mohali, Punjab

Email: ramankit50@gmail.com, chawlameenakshi7@gmail.com

Abstract
This study describes various machine learning approaches. Data categorization, prediction, and pattern
recognition are just a few of the uses for these methods. The main purpose of machine learning is to train an
algorithm on relevant data so that tasks which would otherwise need human assistance can be automated.

KEYWORDS: Data Analysis, Regression, Decision Tree, Random Sampling, Classification.

1. INTRODUCTION
Machine learning (ML) is the capacity of a system to acquire and integrate information based on large-scale
observations, as well as to develop and expand itself by acquiring new knowledge rather than being programmed
with it. In intelligent tutors, machine learning techniques are used to get new knowledge about students, recognize
their talents, and develop new teaching methods [1]. Unlike early work on AI, which was dominated by logic-
based expert systems, the newfound faith in data-driven methodologies is fueled by the success of pattern
recognition [2] technologies based on machine learning.

The nature and characteristics of the data, together with the performance of the learning algorithms, determine the
effectiveness and efficiency of a machine learning solution [3]. Machine learning techniques such as data analysis,
regression, and clustering are available to rapidly build data-driven systems [4].

Machine learning is a subfield of computer science that differs from conventional computing approaches [5].
In traditional computing, algorithms are sets of explicitly programmed instructions that computers use to calculate
or solve problems. Machine learning algorithms, by contrast, allow computers to train on data inputs and then use
statistical analysis to produce outputs that fall within a specific range. Machine learning therefore makes it
easier for computers to build models from sample data and to automate decision-making processes based on data
inputs. In this work, we present a broad view of several types of machine learning algorithms that can be used to
enhance the intelligence and capabilities of an application, based on the importance and potential of machine
learning for analyzing data. The main contribution of this study is thus to explain the principles and potential
of several machine learning techniques and algorithms, as well as their relevance in a variety of real-world
applications.

2. TYPES OF REAL-WORLD DATA

• Structured: Structured data is organized and easily accessible. It has a well-defined structure, conforms to
a data model in a regular order, and can be used by an entity or a computer program [6]. In well-defined
schemes such as relational databases, structured data is typically stored in a tabular format. Examples
include names, dates, addresses, credit card details, stock information, and geolocation.

• Unstructured: Unstructured data, on the other hand, has no predefined format or organization, which makes
it much more difficult to capture, process, and analyze. Most of it consists of text and multimedia
content [6]. Examples include sensor data, letters, blog posts, forums, word-processing documents, PDF
files, audio recordings, videos, images, presentations, web pages, and many other kinds of business
documents.

• Semi-structured: Unlike structured data, semi-structured data is not stored in a relational database, but it
does have organizational properties that make it easier to analyze [6]. Examples include HTML, XML, and
JSON documents, NoSQL databases, and so on.

• Metadata: This is "data about data" rather than ordinary data. The main difference between data and
metadata is that data is simply material that can classify, measure, or describe something in terms of an
organization's data properties, whereas metadata describes the relevant data information and so makes
it more meaningful to data users [7]. For a document, examples of metadata include the author, the file
size, the creation date, and keywords that describe the document.
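The four kinds of data can be illustrated with a small sketch that uses only Python's standard library. All sample values, field names, and variable names below are invented for illustration and do not come from any dataset discussed in this paper.

```python
import csv
import io
import json

# Structured: tabular rows with a fixed schema (hypothetical sample values).
structured_csv = "name,city,amount\nAsha,Mohali,120.5\nRavi,Delhi,80.0\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: JSON carries organizational markers (keys, nesting)
# but is not bound to a fixed table schema.
semi_structured = json.loads(
    '{"user": "Asha", "tags": ["ml", "review"], "profile": {"age": 30}}'
)

# Unstructured: free text with no predefined format.
unstructured = "Machine learning lets a system improve from data rather than explicit rules."

# Metadata: information about the text above, not the text itself.
metadata = {
    "author": "unknown",
    "length_in_characters": len(unstructured),
    "keywords": ["machine learning"],
}

# Structured data can be aggregated directly because every row has the same fields.
total_amount = sum(float(row["amount"]) for row in rows)
print(total_amount)
print(semi_structured["profile"]["age"])
print(metadata["length_in_characters"])
```

The point of the sketch is that structured rows can be queried by column name, semi-structured records must be navigated key by key, and unstructured text needs further processing before it can be analyzed.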

3. Types of Machine Learning Techniques

Image source: Javatpoint


As illustrated in the diagram, machine learning algorithms are categorized into four types: supervised,
unsupervised, semi-supervised, and reinforcement learning [8]. We briefly discuss each type of learning
technique and the kinds of problems it can be used to solve.

• Supervised: Supervised learning is the machine learning task of learning a function that maps an input to
an output based on example input-output pairs [9]. It uses labeled training data and a collection of
training examples to infer a function. Supervised learning, a task-driven approach, is used when specific
goals are to be reached from a given set of inputs [10]. The most common supervised tasks are
"classification," which separates the data, and "regression," which fits the data. An example of
supervised learning is predicting land prices based on location type.

• Unsupervised: Unsupervised learning [11] is a data-driven approach for analyzing unlabeled datasets
without requiring human interaction. It is commonly used for extracting features, identifying
meaningful patterns and structures, grouping results, and exploratory analysis. The most common
unsupervised learning tasks include clustering, density estimation, feature learning, dimensionality
reduction, finding association rules, and anomaly detection.

• Semi-supervised: Semi-supervised learning applies when some of the samples in the training data are not
labeled. Semi-supervised methods, such as the semi-supervised estimators in scikit-learn, can make use of
the additional unlabeled data to better capture the structure of the underlying data distribution and
generalize to new samples. Such algorithms can perform well when we have a small number of labeled
points and a large number of unlabeled points.

• Reinforcement: Reinforcement learning takes a different approach altogether. It places an agent in an
environment with clear parameters that define beneficial and nonbeneficial behavior, together with an
overall goal to reach. It resembles supervised learning in that developers must give the algorithm
well-defined goals and define rewards and punishments [8]. As a result, reinforcement learning calls for
more explicit programming than unsupervised learning. However, once these parameters are set, the
algorithm operates on its own, which makes it more self-directed than supervised learning systems. For
this reason, reinforcement learning is sometimes considered a subset of semi-supervised learning [11],
although it is more commonly regarded as a distinct type of machine learning.

4. Machine Learning Tasks and Algorithms

This section reviews a range of machine learning techniques, including classification, regression
analysis, clustering, association rule learning, feature engineering for dimensionality reduction, and deep
learning models.
Figure: general architecture of a machine learning-based prediction model.
Classification Algorithm

The classification algorithm is a supervised learning technique that uses training data to classify new
observations. The program learns from a given dataset or set of observations and then assigns new observations
to one of several classes or groups [12], such as yes or no, 0 or 1, spam or not spam, cat or dog, etc. A class
may also be called a target, label, or category. Unlike regression, classification produces a category rather
than a value, such as "green or blue" or "fruit or animal". Because classification is a supervised learning
technique, it uses labeled input data, that is, data that contains both the inputs and the corresponding outputs.
A discrete output function (y) is mapped to an input variable (x):

y = f(x), where y is a categorical output

• Binary classification: This refers to classification tasks with two class labels, such as "true" or "false"
[13]. In such tasks, one class may represent the normal state while the other represents the abnormal
state. For example, "cancer not detected" is the normal state of a task involving a medical test, while
"cancer detected" is the abnormal state. Similarly, in the earlier e-mail example, "spam" and "not spam"
are binary classes.

• Multiclass classification: This refers to classification tasks with more than two class labels. Unlike
binary classification, multiclass classification does not use the notion of normal and abnormal outcomes.
Instead, samples are assigned to one of several classes within a known range. For example, classifying
the various types of network attacks in a dataset [14] would be a multiclass classification task.

• Multi-label classification: Multi-label classification is an important consideration in machine learning
when an example is associated with several classes or labels. It is thus a generalization of multiclass
classification in which the classes involved in the problem are hierarchically structured and each sample
may belong to more than one class at each hierarchical level at the same time, as in multi-level text
classification. For example, a Google News story might be filed under "city name," "technology," or
"latest news," among other categories. Unlike traditional classification tasks, in which class labels are
mutually exclusive, multi-label classification uses advanced machine learning algorithms that can
predict several mutually non-exclusive classes or labels [15].
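The difference between the three settings is easiest to see in how the labels are stored. The sketch below is only an illustration: the class names ("dos", "probe", and so on) and the helper `to_indicator_matrix` are hypothetical and are not taken from any cited dataset or library.

```python
# Binary: every sample carries one of exactly two labels.
binary_labels = ["spam", "not spam", "spam"]

# Multiclass: every sample carries exactly one label out of more than two.
multiclass_labels = ["dos", "probe", "normal", "dos"]  # hypothetical attack classes

# Multi-label: every sample may carry several labels at once (or none).
multilabel_sets = [{"technology", "latest news"}, {"city name"}, set()]


def to_indicator_matrix(label_sets, classes):
    """Encode each label set as a row of 0/1 flags, one column per class."""
    return [[1 if c in labels else 0 for c in classes] for labels in label_sets]


classes = ["city name", "latest news", "technology"]
indicator = to_indicator_matrix(multilabel_sets, classes)
for row in indicator:
    print(row)
```

In the multi-label case a row may contain several ones, which is exactly what the mutually exclusive binary and multiclass encodings cannot express.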

Numerous classification methods have been proposed in the machine learning and data science literature. The
most common and popular approaches, which are widely used in many application areas, are summarized
below.

• Naive Bayes: The naive Bayes (NB) algorithm is based on Bayes' theorem and assumes that every pair of
features is conditionally independent given the class [16]. It works well in many real-world situations,
such as document or text classification and spam filtering, and can be used for both binary and
multi-class classification. The NB classifier can be used to classify noisy instances in the data
effectively and to build a robust prediction model [17]. Its main advantage is that, compared with more
sophisticated algorithms, it needs only a small amount of training data to estimate the necessary
parameters quickly. However, its performance may suffer because of its strong assumption of feature
independence. The most common NB variants are Gaussian, Multinomial, Complement, Bernoulli, and
Categorical.

• Logistic regression: Logistic regression is a common probabilistic statistical model for solving
classification problems in machine learning. It uses a logistic function, also known as the sigmoid
function and defined mathematically in Eq. 1, to estimate probabilities. It works well when the dataset is
linearly separable, but it may overfit high-dimensional datasets; regularization techniques are typically
used in these situations to avoid overfitting. A major drawback of logistic regression is the assumption
of linearity between the dependent and independent variables (more precisely, between the log-odds of
the outcome and the independent variables). It can be used for both classification and regression
problems, but classification is the most common use.

\[ g(z) = \frac{1}{1 + \exp(-z)} \tag{1} \]
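A minimal sketch of how Eq. 1 turns a linear score into a class prediction is given here. The weights, bias, and the 0.5 threshold are illustrative assumptions, not fitted values, and the function names are our own.

```python
import math


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + exp(-z)) from Eq. 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the algebraically equal form exp(z) / (1 + exp(z))
    # so that exp() is never called with a large positive argument.
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, bias, x):
    """Estimated probability of class 1 for the feature vector x."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return sigmoid(z)


def predict(weights, bias, x, threshold=0.5):
    """Assign class 1 when the estimated probability reaches the threshold."""
    return 1 if predict_proba(weights, bias, x) >= threshold else 0


# Hypothetical, hand-picked parameters for a two-feature problem.
weights, bias = [1.0, -1.0], 0.0
print(predict_proba(weights, bias, [2.0, 1.0]), predict(weights, bias, [2.0, 1.0]))
print(predict_proba(weights, bias, [1.0, 2.0]), predict(weights, bias, [1.0, 2.0]))
```

In practice the weights would be learned from labeled data, usually with a regularization term to limit overfitting.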

• K-nearest neighbors: K-nearest neighbors (KNN) is an "instance-based learning" or non-generalizing
learning algorithm, sometimes referred to as a "lazy learning" approach. Rather than building a general
internal model, it stores all instances corresponding to the training data in n-dimensional space. KNN
uses the data to classify new data points based on similarity measures such as the Euclidean distance
function [9]. A point is classified by a simple majority vote of its k nearest neighbors. KNN is fairly
robust to noisy training data, and its accuracy depends on data quality. The most challenging aspect of
KNN is choosing the appropriate number of neighbors to consider. KNN can be used for both
classification and regression.
Image source: https://www.geeksforgeeks.org/
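The majority-vote rule of KNN can be sketched in a few lines of standard-library Python. The training points, labels, and the choice k = 3 are made up for illustration; a real application would select k by validation.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by a simple majority vote of its k nearest training points."""
    # "Lazy learning": nothing is fitted in advance, distances are computed at query time.
    order = sorted(
        range(len(train_points)),
        key=lambda i: euclidean(train_points[i], query),
    )
    nearest_labels = [train_labels[i] for i in order[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-dimensional training data with two classes.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.0), (6.5, 8.0)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (1.2, 1.1), k=3))
print(knn_predict(points, labels, (6.4, 6.9), k=3))
```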

• Decision tree: The decision tree is a supervised learning technique that can be used for both classification
and regression problems, although it is most often used for classification. In this tree-structured
classifier, internal nodes represent the features of a dataset, branches represent the decision rules, and
each leaf node represents the outcome. A decision tree has two kinds of nodes: decision nodes and leaf
nodes. Decision nodes are used to make decisions and have multiple branches, whereas leaf nodes are the
results of those decisions and have no further branches. The decisions or tests are based on the features
of the given dataset. The tree is a graphical representation for obtaining all possible solutions to a
problem or decision based on given conditions. It is called a decision tree because, like a tree, it starts
from the root node and expands from there, building a tree-like structure [18], as shown in the figure.

Entropy:
\[ H(x) = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i) \tag{2} \]

Gini index:
\[ \mathrm{Gini}(E) = 1 - \sum_{i=1}^{c} p_i^2 \tag{3} \]
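Eq. 2 and Eq. 3 are the impurity measures commonly used to choose a split at a decision node. The sketch below evaluates both on small, invented label lists; `information_gain` is our own helper name and shows one common way of using entropy to score a split, under the assumption of a binary split.

```python
import math
from collections import Counter


def class_probabilities(labels):
    """Relative frequency of each class in a list of labels."""
    counts = Counter(labels)
    total = len(labels)
    return [count / total for count in counts.values()]


def entropy(labels):
    """Entropy in bits, as in Eq. 2."""
    return -sum(p * math.log2(p) for p in class_probabilities(labels))


def gini(labels):
    """Gini index, as in Eq. 3."""
    return 1.0 - sum(p * p for p in class_probabilities(labels))


def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


# Hypothetical node with two classes in equal proportion.
parent = ["yes", "yes", "no", "no"]
print(entropy(parent), gini(parent))
print(information_gain(parent, ["yes", "yes"], ["no", "no"]))
```

A perfectly mixed two-class node has the highest impurity, and a split that separates the classes completely recovers all of it as information gain.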

Image source: Javatpoint
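Returning to the naive Bayes classifier described above, the following sketch shows a multinomial variant with add-one (Laplace) smoothing applied to a toy spam-filtering task. The documents, the "spam"/"ham" labels, and the function names are invented for illustration; the smoothing choice is an assumption rather than something prescribed by the references.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for words, label in zip(documents, labels):
        word_counts[label].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary


def predict_naive_bayes(model, words):
    """Return the class with the highest log posterior score."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in class_counts.items():
        # log P(class) plus the sum of log P(word | class); the sum over words
        # is where the feature-independence assumption enters.
        score = math.log(doc_count / total_docs)
        total_words = sum(word_counts[label].values())
        for word in words:
            count = word_counts[label][word]
            # Add-one smoothing keeps unseen words from zeroing the probability.
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training messages, already split into words.
docs = [
    ["win", "money", "now"],
    ["free", "money"],
    ["meeting", "at", "noon"],
    ["project", "meeting", "notes"],
]
doc_labels = ["spam", "spam", "ham", "ham"]
model = train_naive_bayes(docs, doc_labels)

print(predict_naive_bayes(model, ["free", "money"]))
print(predict_naive_bayes(model, ["meeting", "notes"]))
```

Even with four training messages the parameters can be estimated, which illustrates why the method needs comparatively little training data.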

Regression Analysis

Regression analysis refers to a collection of machine learning methods that allow us to predict a continuous
outcome variable (y) based on the values of one or more predictor variables (x). The most important difference
between classification and regression is that classification predicts discrete class labels, while regression
predicts a continuous quantity. The figure [18] illustrates how classification differs from regression models.
There are also numerous similarities between the two types of machine learning algorithms. Regression models
are increasingly used in financial forecasting or prediction, cost estimation, trend analysis, marketing, time
series estimation, drug response modeling, and several other fields. Linear, polynomial, lasso, and ridge
regression are examples of common regression techniques.
Image source: https://www.tutorialspoint.com/
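As a concrete instance of regression, the sketch below fits a straight line to a handful of points using the closed-form least-squares solution for a single predictor. The data are synthetic (generated from y = 2x + 1) and the function name `fit_line` is our own.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic, noise-free data lying on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # continuous prediction for an unseen x
print(slope, intercept, prediction)
```

Unlike a classifier, the fitted model returns a continuous value for any input rather than one of a fixed set of labels.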

Cluster Analysis

Cluster analysis, commonly called clustering, is an unsupervised machine learning technique for identifying
and grouping similar data points in large datasets without regard for a specific outcome [20]. It arranges a set
of objects so that those in the same group, called a cluster, are in some sense more similar to each other than to
objects in other groups. It is a data analysis technique that is often used to discover interesting trends
or patterns in data, such as groups of customers based on their behavior. Clustering can be used in a
variety of application areas, such as cybersecurity, e-commerce, mobile data processing, health analytics, user
modeling, and behavioral analytics.
Image source: https://www.analyticsvidhya.com/
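One widely used clustering method is k-means, which is not discussed in detail in this paper but gives a compact illustration of grouping unlabeled points. The sketch below is a simplified version: the data are invented, and the initial centroids are passed in explicitly (an assumption made to keep the run deterministic, where practical implementations usually pick them at random).

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, initial_centroids, max_iterations=100):
    """Alternate between assigning points to the nearest centroid and re-centering."""
    centroids = list(initial_centroids)
    assignments = None
    for _ in range(max_iterations):
        new_assignments = [
            min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # assignments are stable, so the centroids will not move again
        assignments = new_assignments
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the previous centroid if a cluster becomes empty
                centroids[j] = mean_point(members)
    return centroids, assignments


# Hypothetical unlabeled points forming two visible groups.
data = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, assignments = kmeans(data, initial_centroids=[data[0], data[3]])
print(assignments)
print(centroids)
```

No labels are supplied at any point; the grouping emerges only from the distances between the points.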

Reinforcement Learning

Reinforcement learning (RL) is a machine learning technique that allows an agent to learn in an
interactive environment by trial and error, using feedback from its own actions and experiences. Unlike
supervised learning, which is based on given sample data or examples, the RL method is based on interacting
with the environment. The problem to be solved in RL is defined as a Markov decision process (MDP), which is
all about making decisions sequentially. A typical RL problem has four elements: agent, environment, rewards,
and policy.

RL can be loosely divided into model-based and model-free techniques. Model-based RL is the process of
inferring optimal behavior from a model of the environment by performing actions and observing the results,
which include the next state and the immediate reward [21]. AlphaZero and AlphaGo are examples of
model-based approaches. A model-free approach, by contrast, does not use the transition probability
distribution or the reward function of the MDP.
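To make the model-free idea concrete, the sketch below applies the tabular Q-learning update rule to a tiny, invented chain environment. Everything here is an assumption made for illustration: the five-state chain, the reward of 1 at the right end, the values of the discount factor and learning rate, and, in particular, the fixed sweep over every state-action pair, which replaces the usual random exploration so that the run is deterministic.

```python
N_STATES = 5        # states 0..4; state 4 is the terminal goal (hypothetical environment)
ACTIONS = (-1, +1)  # move left or move right
GAMMA = 0.9         # discount factor
ALPHA = 0.5         # learning rate


def step(state, action):
    """Environment response: (next_state, reward, done). The learner only samples it."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(sweeps=200):
    """Tabular Q-learning updates applied to every state-action pair in a fixed order."""
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(sweeps):
        for state in range(N_STATES - 1):  # the terminal state needs no update
            for a_index, action in enumerate(ACTIONS):
                next_state, reward, done = step(state, action)
                # Target: immediate reward plus discounted best value of the next state.
                target = reward if done else reward + GAMMA * max(q[next_state])
                q[state][a_index] += ALPHA * (target - q[state][a_index])
    return q


q_table = q_learning()

# Greedy policy: in each non-terminal state, pick the action with the larger Q-value.
policy = [
    ACTIONS[max(range(len(ACTIONS)), key=lambda i: q_table[s][i])]
    for s in range(N_STATES - 1)
]
print(policy)
```

The four elements named above all appear in the sketch: the update loop plays the agent, `step` is the environment, the returned reward drives learning, and `policy` is the behavior that results.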

Applications of Machine Learning

• Image Recognition: Image recognition is one of the most common applications of machine learning. It is
used to identify objects, persons, places, and digital images, among other things. A popular use case for
image recognition and face detection is the automatic friend tagging suggestion provided by Facebook.
When we upload a photo with our Facebook friends, we immediately receive a tagging suggestion with
their names, and the technology behind it is a machine learning based face detection and recognition
system. It is based on the Facebook project "DeepFace," which performs face recognition and person
identification in photos.

• Cybersecurity: The field of protecting networks, systems, hardware, and data from digital attacks is
known as cybersecurity [22]. Machine learning has become a critical cybersecurity technology that
constantly learns from data to uncover patterns, better detect malware in encrypted traffic, identify
insider threats, predict where dangerous areas of the internet are, keep people safe while browsing, and
protect data in the cloud by exposing suspicious behavior. For example, clustering algorithms may be
used to detect cyber-anomalies, security breaches, and other problems.

• Traffic prediction: When we need to travel to a new location, we use Google Maps, which shows us the
best route with the shortest distance and predicts traffic conditions. It predicts whether traffic is
clear, slow-moving, or heavily congested in two ways: from the real-time location of vehicles, obtained
from the Google Maps app and sensors, and from the average time taken on previous days at the same
time of day. Everyone who uses Google Maps helps to improve the app: it takes information from the
user and sends it back to its database to improve performance.

• Email Spam Filtering: Every new e-mail is automatically filtered into one of three categories: important,
normal, or spam. Machine learning is the technology that delivers important emails, marked with the
important symbol, to our inbox and spam emails to our spam folder. Gmail uses machine learning
methods such as the multi-layer perceptron, decision tree, and naive Bayes classifier for e-mail spam
filtering and malware detection.

• Online Fraud Detection: Machine learning makes our online transactions safer and more secure by
detecting fraud. When we carry out an online transaction, fraud can occur in several ways, including
fake accounts, fake IDs, and theft of money in the middle of a transaction. To detect whether a
transaction is genuine or fraudulent, a feed-forward neural network is used. For each genuine
transaction, the output is converted into a set of hash values, which then become the input for the
next round. Each genuine transaction follows a specific pattern that changes for a fraudulent
transaction, which allows the fraud to be detected and makes our online transactions safer.

• Automatic Language Translation: It is frustrating to visit a new place without knowing the local
language; machine learning can help here by converting text into a language we know. This feature is
offered by Google's GNMT (Google Neural Machine Translation), a neural machine translation system
that automatically translates text into our own language. The automatic translation is powered by a
sequence-to-sequence learning algorithm, which is used together with image recognition to translate
text from one language to another.

• Internet of Things and smart cities: The Internet of Things (IoT) turns everyday objects into smart objects
by allowing them to communicate data and perform activities without requiring human interaction. As a
result, IoT is regarded as a major step forward that may improve practically every aspect of our lives,
including smart governance, smart homes, education, communication, transportation, retail, agriculture,
health care, business, and many others [23]. The smart city, which uses technology to improve municipal
services and citizens' quality of life [24], is one of the most important application areas for IoT.
Machine learning has emerged as a crucial technology for IoT applications, since it uses experience to
discover trends and build models that help predict future behavior and events.

Conclusion

In this paper, we have presented a review of machine learning algorithms for data analysis and
applications. We have briefly discussed how various types of machine learning methods can be used to build
solutions to various real-world problems. A successful machine learning model depends on both the data
and the performance of the learning algorithms. The learning algorithms must first be trained on the
collected real-world data and knowledge related to the target application before the system can assist
with intelligent decision-making.

REFERENCES

[1]. Beverly Park Woolf, in Building Intelligent Interactive Tutors, 2009

[2]. Christopher M. Bishop https://link.springer.com/book/9780387310732

[3]. Han J, Pei J, Kamber M. Data mining: concepts and techniques. Amsterdam: Elsevier; 2011

[4]. Witten IH, Frank E. Data Mining: Practical machine learning tools and techniques. Morgan Kaufmann; 2005.

[5]. Lisa Tagliaferri https://www.digitalocean.com/community/tutorials/an-introduction-to-machine-learning

[6]. https://www.geeksforgeeks.org/difference-between-structured-semi-structured-and-unstructured-data/

[7]. Cit. Jason Scott's Weblog. https://www.ontotext.com/knowledgehub/fundamentals/metadata

[8]. Mohammed M, Khan MB, Bashier Mohammed BE. Machine learning: algorithms and applications. CRC Press; 2016.

[9]. Han J, Pei J, Kamber M. Data mining: concepts and techniques. Amsterdam: Elsevier; 2011

[10]. Sarker IH, Kayes ASM, Badsha S, Alqahtani H, Watters P, Ng A. Cybersecurity data science

[11]. Kaelbling LP, Littman ML, Moore AW. Reinforcement learning

[12]. http://surl.li/bntww

[13]. https://machinelearningmastery.com/types-of-classification-in-machine-learning/

[14]. Tavallaee M, Bagheri E, Lu W, Ghorbani AA. A detailed analysis of the kdd cup 99 data set.

[15]. Pedregosa F, Varoquaux G, Scikit-learn: machine learning in python

[16]. Kaelbling LP, Littman ML, Moore AW. Reinforcement learning

[17]. Sarker IH. A machine learning based robust prediction model

[18]. https://www.analyticsvidhya.com/blog/2021/08/decision-tree-algorithm/

[19]. https://www.javatpoint.com/types-of-machine-learning

[20]. https://training.galaxyproject.org/training-material/

[21]. Polydoros AS, Nalpantidis L. Survey of model-based reinforcement learning

[22]. Ślusarczyk B. Industry 4.0: Polish J Manag Stud. 17, 2018

[23]. Mahdavinejad MS, Rezvan M, Barekatain M, Adibi P, Barnaghi P, Sheth AP. Machine learning for internet of things data analysis

[24]. Zanella A, Bui N, Castellani A, Vangelista L, Zorzi M. Internet of things for smart cities. IEEE Internet Things
