Tools Machine Learning

Machine learning tools are algorithmic applications that allow software to improve predictions without human programming. They examine data, learn from it, and make decisions. Some key machine learning tools include Scikit-learn, PyTorch, TensorFlow, Jupyter Notebook, and Knime. Scikit-learn is useful for tasks like classification, regression, and preprocessing. PyTorch and TensorFlow are deep learning frameworks good for building neural networks. Jupyter Notebook allows sharing live code notebooks while Knime uses a graphical interface requiring no code. These tools help speed up and simplify applied machine learning work.

Uploaded by

Maria Lavanya
Copyright
© All Rights Reserved

Tools used in Machine Learning

M. Maria Lavanya
St. Ann’s College for Women, Mehdipatnam, Hyderabad, Telangana, India
lavanya.mm@gmail.com

Abstract

Machine learning is the field of study that gives computers the ability to learn without being explicitly
programmed. Unlike rule-based systems, which require a human expert to hard-code domain knowledge
directly into the system, a machine learning algorithm learns how to make decisions from data alone. In
simple terms, machine learning means making predictions based on data. Used correctly, it is a remarkably
powerful technology that lets us build machines that behave, to a great extent, like a human being.
Mastering machine learning tools helps us explore data, train our models, discover new methods, and create
our own algorithms. Tools are a big part of machine learning, and choosing the right tool can be as
important as working with the best algorithms. Machine learning comes with an extensive collection of
tools, platforms, and software, and the technology is evolving continuously. From this large pool of tools,
we need to choose a few in which to gain expertise. This paper discusses a list of popular machine learning
tools that are widely used by experts.

Keywords: Machine learning, algorithm, platforms and software, Machine learning tools.

1. Introduction

1.1 What is Machine Learning?

At a very high level, machine learning is the process of teaching a computer system how to make accurate
predictions when fed data [1]. Those predictions could be answering whether a piece of fruit in a photo is a
banana or an apple, spotting people crossing the road in front of a self-driving car, deciding whether the
word "book" in a sentence relates to a paperback or a hotel reservation, deciding whether an email is spam,
or recognizing speech accurately enough to generate captions for a YouTube video. The key difference from
traditional computer software is that a human developer hasn't written code that instructs the system how to
tell the difference between the banana and the apple [1][2]. Instead, a machine learning model has been
taught how to reliably discriminate between the fruits by being trained on a large amount of data, in this
instance likely a huge number of images labelled as containing a banana or an apple. With the help of
machine learning systems, we can examine data, learn from that data, and make decisions. A machine learning
library is a bundle of algorithms.

Fig 1.1 Machine Learning
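
To make the contrast with hand-written rules concrete, the following minimal Python sketch learns to tell bananas from apples purely from labelled examples. The two features (length in centimetres and a roundness score), the sample values, and the function names train and predict are illustrative assumptions rather than part of any library, and a simple nearest-centroid rule stands in for the much larger image models described above.

```python
from statistics import mean

# Hypothetical labelled training data: (length_cm, roundness between 0 and 1) -> fruit.
training_data = [
    ((18.0, 0.20), "banana"),
    ((20.0, 0.25), "banana"),
    ((17.0, 0.30), "banana"),
    ((7.0, 0.90), "apple"),
    ((8.0, 0.95), "apple"),
    ((7.5, 0.85), "apple"),
]


def train(examples):
    """Learn one centroid (mean feature vector) per label from the examples."""
    rows_by_label = {}
    for features, label in examples:
        rows_by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in rows_by_label.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# No rule such as "long and curved means banana" is written anywhere;
# the decision comes entirely from the labelled data.
centroids = train(training_data)
prediction_1 = predict(centroids, (19.0, 0.22))
prediction_2 = predict(centroids, (7.2, 0.92))
print(prediction_1, prediction_2)
```

Changing the training data changes the predictions without touching the code, which is the property that separates a learned model from a rule-based system.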

2. What are Machine Learning Tools?

Machine learning tools are algorithmic applications of artificial intelligence that give systems the ability
to learn and improve without extensive human input [4]; data mining and predictive modeling are similar
concepts. They allow software to become more accurate in predicting outcomes without being explicitly
programmed. The idea is that a model or algorithm takes in data from the world, and that data is fed back
into the model so that it improves over time. It is called machine learning because the model "learns" as it
is fed more and more data. Such tools can be used, for example, to build recommendation engines, predict
search patterns, filter spam, build news feeds, and detect fraud and security threats.

2.1 Why Use Tools?


Machine learning tools make applied machine learning faster, easier and more fun [5].

Faster: Good tools can automate each step in the applied machine learning process. This means that the time
from ideas to results is greatly shortened. The alternative is that we have to implement each capability
ourselves, which can take significantly longer than choosing a tool to use off the shelf.

Easier: We can spend our time choosing good tools instead of researching and implementing techniques
ourselves [5]. The alternative is that we have to be an expert in every step of the process in order to
implement it. This requires research, deeper study in order to understand the techniques, and a higher
level of engineering to ensure they are implemented efficiently.

Fun: There is a lower barrier for beginners to get good results. We can use the extra time to get better results
or work on more projects. The alternative is that we will spend most of our time building our tools rather
than on getting results.

2.2 How does the tool serve in delivering results in a machine learning project?

Machine learning tools are not just implementations of machine learning algorithms [1]. They can also
provide capabilities that we can use at any step in the process of working through a machine learning
problem.

2.3 When to Use Machine Learning Tools?


Machine learning tools can save us time and help us consistently deliver good results across projects [1][5].
The situations in which machine learning tools bring the most benefit include:

Getting Started: When we are just getting started, machine learning tools can guide us through the
process of delivering good results quickly and give us the confidence to continue on to our next project.

Day-to-Day: When we need to get good results to a question quickly, machine learning tools can allow us to
focus on the specifics of our problem rather than on the depths of the techniques we need to use to get an
answer.

Project Work: When we are working on a large project, machine learning tools can help us to prototype a
solution, figure out the requirements and give us a template for the system that we may want to implement.

3. Best machine learning tools

Fig 3.1 Machine learning tools

3.1 Scikit-learn

Scikit-learn is an open-source machine learning package for Python [3][4]. It is a unified platform that
serves multiple purposes: it provides models and algorithms for classification, regression, clustering,
dimensionality reduction, model selection, and preprocessing. Scikit-learn is built on top of three main
Python libraries, namely NumPy, SciPy, and Matplotlib. It also helps us with training and testing our
models, and it is widely used for data mining and data analysis.
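
As a rough illustration of how these pieces fit together, the sketch below chains preprocessing, dimensionality reduction, and classification into one scikit-learn pipeline and then evaluates it with a held-out test set and cross-validation. The choice of the bundled Iris dataset, the two principal components, and the logistic regression classifier are assumptions made for the example, not recommendations from the sources cited above.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load a small labelled dataset that ships with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Preprocessing -> dimensionality reduction -> classification.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=2),
    LogisticRegression(max_iter=200),
)
model.fit(X_train, y_train)

# Accuracy on the held-out data.
test_accuracy = model.score(X_test, y_test)

# Model selection support: 5-fold cross-validation of the whole pipeline.
cv_scores = cross_val_score(model, X, y, cv=5)

print(f"test accuracy: {test_accuracy:.2f}")
print(f"mean cross-validation accuracy: {cv_scores.mean():.2f}")
```

Because every step shares the same fit and predict interface, swapping the classifier or removing the PCA step is a one-line change.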

3.2 PyTorch

PyTorch is a deep learning framework that is fast as well as flexible to use [1][2][3], in part because it
makes good use of the GPU. It is one of the most important machine learning tools because it is used for
vital aspects of ML, including building deep neural networks and performing tensor calculations. PyTorch
offers a Python interface on top of a core written in C++ and CUDA, and its tensors can serve as a
GPU-capable alternative to NumPy arrays. PyTorch is a Python machine learning library based on Torch, a
scientific computing framework and machine learning library with a Lua scripting interface.
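
The two capabilities mentioned above, tensor calculations and the gradient machinery used to train neural networks, can be sketched in a few lines. This is a minimal example assuming PyTorch is installed; the tensor values are arbitrary, and the code falls back to the CPU when no GPU is available.

```python
import torch

# Use the GPU when one is available, otherwise the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Tensor calculation: a NumPy-like matrix product on the chosen device.
a = torch.tensor([[1.0, 2.0], [3.0, 4.0]], device=device)
b = torch.ones(2, 2, device=device)
product = a @ b

# Automatic differentiation: PyTorch records the operations applied to x
# and computes dy/dx when backward() is called.
x = torch.tensor(3.0, requires_grad=True)
y = x ** 2 + 2 * x
y.backward()

print(product)
print(x.grad)  # dy/dx = 2x + 2 evaluated at x = 3
```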

3.3 TensorFlow

TensorFlow is an open-source machine learning library developed by the Google team [3]. It offers a flexible
ecosystem of tools, libraries, and resources that allows researchers and developers to build and deploy
machine learning applications [5]. Its APIs help us to create and train our models. It also offers a
JavaScript library, TensorFlow.js, for machine learning development; with the model converter that
TensorFlow.js provides, we can also run our existing models in JavaScript. TensorFlow comes in handy for
large-scale numerical computation and ML, and it supports both classical machine learning and neural
network models. It works well with Python, and one of its most prominent features is that it runs on CPUs
as well as GPUs. Natural language processing and image classification are typical applications of this tool.
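
A small sketch of creating and training a model with TensorFlow's low-level API follows. It fits a straight line to four points using tf.GradientTape and plain gradient descent; the data, the learning rate, and the number of steps are assumptions chosen so that the example converges quickly, and a TensorFlow 2.x installation is assumed.

```python
import tensorflow as tf

# Toy data generated from the line y = 2x + 1.
x = tf.constant([0.0, 1.0, 2.0, 3.0])
y = tf.constant([1.0, 3.0, 5.0, 7.0])

# Trainable parameters of the model y_hat = w * x + b.
w = tf.Variable(0.0)
b = tf.Variable(0.0)

learning_rate = 0.05
for _ in range(300):
    # Record the forward pass so TensorFlow can differentiate the loss.
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean((w * x + b - y) ** 2)
    grad_w, grad_b = tape.gradient(loss, [w, b])
    # Plain gradient descent update.
    w.assign_sub(learning_rate * grad_w)
    b.assign_sub(learning_rate * grad_b)

print(float(w.numpy()), float(b.numpy()))
```

The same code runs unchanged on a CPU or a GPU, since TensorFlow places the operations on whichever device is available.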

3.4 Jupyter Notebook

Jupyter Notebook is one of the most widely used machine learning tools. It is a fast and efficient
platform [6][10]. Its name is formed from the three core programming languages it supports, namely Julia,
Python, and R, and further languages are available through additional kernels. Jupyter Notebook allows the
user to store and share live code in the form of notebooks. It can also be launched through a GUI, for
example WinPython or Anaconda Navigator.

3.5 Knime

Knime is an open-source machine learning tool with a GUI [3][4]. The best thing about Knime is that it
doesn't require any knowledge of programming to use the facilities it provides. It is generally used for
data-related purposes, for example data manipulation and data mining. It processes data by creating
workflows and then executing them. It comes with repositories that are full of different nodes; these nodes
are brought into the Knime portal, and finally a workflow of nodes is created and executed. Knime lets us
create entire data science workflows using a drag-and-drop interface. We can implement everything from
feature engineering to feature selection and even add predictive machine learning models to our workflow
this way. This approach of visually implementing the entire model workflow is intuitive and can be really
useful when working on complex problem statements.

3.6 Colab

Google Colab is a cloud service that supports Python [4]. It helps us build machine learning applications
using libraries such as PyTorch, Keras, TensorFlow, and OpenCV. Colaboratory, or "Colab" for short, allows
us to write and execute Python in our browser, with zero configuration required, free access to GPUs, and
easy sharing. It also assists in machine learning education and research.

3.7 Apache Mahout

Mahout is an open-source platform from Apache that is based on Hadoop [1]. It is generally used for machine
learning and data mining, supporting techniques such as regression, classification, and clustering, and it
makes use of mathematical constructs such as vectors. Apache Mahout is a mathematically expressive Scala
DSL and distributed linear algebra framework. It is a free and open-source project of the Apache Software
Foundation [10]. The main goal of the framework is to let mathematicians, data scientists, and
statisticians implement their own algorithms quickly. It is an extensible framework for building scalable
algorithms, it includes matrix and vector libraries, and it runs on top of Apache Hadoop using the
MapReduce paradigm.

3.8 Google Cloud ML Engine

The objective of Google Cloud AutoML is to make artificial intelligence accessible to everyone [4]. It
provides users with pre-trained models for creating various services, for example text recognition and
speech recognition. Google Cloud AutoML has become very popular among companies that want to apply
artificial intelligence in every sector of industry. If we are training our classifier on a modest amount
of data, our PC or laptop might work quite well. However, if we have millions or billions of training
examples, or the algorithm is quite sophisticated, training takes a long time to execute. In that case we
can turn to Google Cloud ML Engine. It is a hosted platform where machine learning app developers and data
scientists create and run high-quality machine learning models.

3.9 Keras

Keras is an API for neural networks [5][6]. It is written in Python and helps in doing quick research. Keras
follows best practices for reducing cognitive load: it offers consistent and simple APIs, it minimizes the
number of user actions required for common use cases, and it provides clear and actionable error messages.
It also has extensive documentation and developer guides. Keras is also used as a deep learning framework.
Because Keras makes it easier to run new experiments, it empowers us to try more ideas than the
competition, faster. Built on top of TensorFlow 2.0, Keras is an industry-strength framework that can scale
to large clusters of GPUs or an entire TPU pod. We can export Keras models to JavaScript to run directly in
the browser, or to TF Lite to run on iOS, Android, and embedded devices [6]. It is also easy to serve Keras
models via a web API. Keras is used by CERN, NASA, NIH, and many more scientific organizations around the
world. Keras has the low-level flexibility to implement arbitrary research ideas while offering optional
high-level convenience features to speed up experimentation cycles [7]. Keras is a central part of the
tightly connected TensorFlow 2.0 ecosystem, which covers every step of the machine learning workflow, from
data management to hyperparameter training to deployment solutions. Because of its ease of use and focus on
user experience, Keras is the deep learning solution of choice for many university courses. It is widely
recommended as one of the best ways to learn deep learning.
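
The small number of user actions needed for a common use case can be seen in the following sketch, which defines, compiles, trains, and queries a tiny classifier. It assumes a TensorFlow 2.x installation that exposes Keras as tensorflow.keras; the layer sizes, the two training epochs, and the random synthetic data are placeholders chosen only to keep the example short, so the resulting predictions carry no meaning.

```python
import numpy as np
from tensorflow import keras

# Synthetic placeholder data: 60 samples, 4 features, 3 classes.
rng = np.random.default_rng(0)
features = rng.normal(size=(60, 4)).astype("float32")
labels = rng.integers(0, 3, size=60)

# Define the network as a simple stack of layers.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(3, activation="softmax"),
])

# Choose the optimizer, loss, and metric in a single call.
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Train briefly, then predict class probabilities for five samples.
model.fit(features, labels, epochs=2, batch_size=16, verbose=0)
probabilities = model.predict(features[:5], verbose=0)

print(probabilities.shape)
```

Each row of the output holds one probability per class, produced by the softmax layer.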

3.10 Rapid Miner

Rapid Miner is good news for non-programmers. It is a data science platform with a very user-friendly
drag-and-drop interface [9][10], which is the major reason why it is beneficial for non-programmers as
well. Rapid Miner is platform-independent, as it works across operating systems. With the help of this
tool, one can use their own data as well as test their own models. Rapid Miner provides a platform for
machine learning, deep learning, data preparation, text mining, and predictive analytics. It can be used
for research, education, and application development. Through its GUI, it helps in designing and
implementing analytical workflows. It also helps with result visualization and with model validation and
optimization. It is extensible through plugins and easy to use.

4. Comparison Chart

Tool | Platform | Cost | Written in language | Algorithms or Features
---- | -------- | ---- | ------------------- | ----------------------
Scikit Learn | Linux, Mac OS, Windows | Free | Python, Cython, C, C++ | Classification; Regression; Clustering; Preprocessing; Model Selection; Dimensionality reduction
PyTorch | Linux, Mac OS, Windows | Free | Python, C++, CUDA | Autograd Module; Optim Module; nn Module
TensorFlow | Linux, Mac OS, Windows | Free | Python, C++, CUDA | Provides a library for dataflow programming
Jupyter Notebook | Linux, Mac OS, Windows | Free | Julia, R, Python | A web application; Notebook documents
KNIME | Linux, Mac OS, Windows | Free | Java | Can work with large data volume; Supports text mining & image mining through plugins
Colab | Cloud Service | Free | - | Supports libraries of PyTorch, Keras, TensorFlow, and OpenCV
Apache Mahout | Cross-platform | Free | Java, Scala | Preprocessors; Regression; Clustering; Recommenders; Distributed Linear Algebra
Google Cloud ML Engine | Cloud Service | Free | Python | Logistic and linear regression; Classification; Neural networks
Keras.io | Cross-platform | Free | Python | API for neural networks
Rapid Miner | Cross-platform | Free plan; Small: $2500 per year; Medium: $5000 per year; Large: $10000 per year | Java | Data loading & Transformation; Data pr

5. Conclusion

In this paper, we have explored some of the most popular and widely used machine learning tools. Together,
they show how far machine learning has advanced. These tools are written in, and used from, different
programming languages: for example, some are based on Python, some on C++, and some on Java. Selection of a
tool depends on our requirements for the algorithm, our level of expertise, and the price of the tool. A
machine learning library should be easy to use. Most of these tools are free, except Rapid Miner, which
also has paid plans. TensorFlow is among the more popular machine learning tools, but it has a learning
curve. Scikit-learn and PyTorch are also popular tools for machine learning, and both support the Python
programming language. Keras.io and TensorFlow are good for neural networks.

References:

[1]. Introduction to Machine Learning with Python: A Guide for Data Scientists, 1st Edition, by Andreas C.
Müller and Sarah Guido.

[2]. AutoML: Methods, Systems, Challenges. Editors: Frank Hutter, Lars Kotthoff, Joaquin Vanschoren.

[3]. Machine Learning with R: Expert Techniques for Predictive Modeling, 3rd Edition.

[4]. Machine Learning Mastery with Python by Jason Brownlee; Tom M. Mitchell, "Machine Learning", 1 Jul
2017.

[5]. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and
Techniques to Build Intelligent Systems, 2nd Edition.

[6]. Data Mining: Practical Machine Learning Tools and Techniques (Morgan Kaufmann Series in Data
Management Systems), 4th Edition.

[7]. Learning Jupyter, November 30, 2016, by Dan Toomey.

[8]. Understanding Machine Learning: From Theory to Algorithms, Kindle Edition, by Shai Shalev-Shwartz.

[9]. RapidMiner: Data Mining Use Cases and Business Analytics Applications (2013).

[10]. Most Popular Machine Learning Software Tools.
https://towardsdatascience.com/10-most-popular-machine-learning-software-tools
