Deep learning uses a complex structure of algorithms modeled on the human brain.
This enables the processing of unstructured data such as documents, images, and text.
What is Data Science
• It is about data gathering, analysis, and decision making
• It helps to find patterns in data through analysis and prediction
• It helps to discover patterns and make better decisions
• Statistics is the science of analyzing data
• Data science is an interdisciplinary field that uses scientific
methods, processes, algorithms and systems to extract or
extrapolate knowledge and insights from noisy, structured and
unstructured data, and apply knowledge from data across a
broad range of application domains.
• Science is the intellectual and practical activity dealing with the
systematic study of the structure and behaviour of the physical
and natural world through observation and experiment
Data
• It is a collection of information
• Data can be categorized as structured or
unstructured
• A variable can be measured or counted
• Data is categorized as numerical (discrete or
continuous), categorical (color or type), or
ordinal (grading system)
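The categories above can be sketched in R. This is a minimal illustration, not part of the original slides; the variable names (`visits`, `height_cm`, `color`, `grade`) and their values are made up for the example.

```r
# Numerical data: discrete (counts) and continuous (measurements)
visits    <- c(3L, 7L, 2L)              # discrete, stored as integer
height_cm <- c(172.5, 160.2, 181.0)     # continuous, stored as double

# Categorical data: unordered labels, such as color
color <- factor(c("red", "blue", "red"))

# Ordinal data: labels with a meaningful order, such as a grading system
grade <- factor(c("B", "A", "C"), levels = c("C", "B", "A"), ordered = TRUE)

class(visits)        # "integer"
class(height_cm)     # "numeric"
levels(color)        # "blue" "red"
grade[1] > grade[3]  # TRUE, because B ranks above C
```

Comparison operators such as `>` are only meaningful for the ordered factor; an unordered factor like `color` has no ranking.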
By understanding the various techniques, methods, tools, and analytical
approaches, data scientists can help the organizations that employ them achieve
strategic and competitive benefits.
The three types of machine learning methods
Statistics: Statistics is one of the most important components of data science. ...
Domain Expertise: In data science, domain expertise binds data science together. ...
Data engineering: Data engineering is a part of data science, which involves
acquiring, storing, retrieving, and transforming the data.
6 Types of Data in Statistics & Research: Key in Data Science
Object Attributes
Objects can have attributes. Attributes are part of the object. These include:
• names
• dimnames
• dim
• class: used to check the class
• attributes (contain metadata)
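The attributes listed above can be inspected with base R functions. The following is a small sketch; the objects `v` and `m` and their values are illustrative assumptions.

```r
# A named vector carries a "names" attribute
v <- c(a = 1, b = 2, c = 3)
names(v)               # "a" "b" "c"

# A matrix carries "dim" and, once set, "dimnames"
m <- matrix(1:6, nrow = 2)
dimnames(m) <- list(c("R1", "R2"), c("C1", "C2", "C3"))
dim(m)                 # 2 3
class(m)               # reports the class of the object

# attributes() returns all attributes of an object as a list
names(attributes(m))   # includes "dim" and "dimnames"
```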
R Data Structures
• R Vectors
• R Matrix
• R Array
• List in R
• R Data Frame
• R Factor
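As a quick orientation, each of the structures listed above can be created with a base R constructor. This is a minimal sketch with made-up values, not an exhaustive tour.

```r
v  <- c(10, 20, 30)                                   # vector: elements of one type
m  <- matrix(1:6, nrow = 2, ncol = 3)                 # matrix: two dimensions
a  <- array(1:24, dim = c(2, 3, 4))                   # array: any number of dimensions
l  <- list(id = 1L, name = "x", ok = TRUE)            # list: elements of mixed types
df <- data.frame(id = 1:3, score = c(9.5, 7.0, 8.2))  # data frame: columns of equal length
f  <- factor(c("low", "high", "low"))                 # factor: categorical values

length(v)   # 3
dim(m)      # 2 3
dim(a)      # 2 3 4
names(l)    # "id" "name" "ok"
nrow(df)    # 3
nlevels(f)  # 2
```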
A matrix is a two-dimensional data structure in R.
A matrix is similar to a vector but additionally contains the dimension attribute.
All attributes of an object can be checked with the attributes() function
(the dimension can be checked directly with the dim() function).
We can check whether a variable is a matrix with the class() function.
Matrices
> x <- matrix(1:9, nrow = 3, ncol = 3)
> x
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
> # It is also possible to change names
> colnames(x) <- c("C1","C2","C3")
> rownames(x) <- c("R1","R2","R3")
> matrix(1:9, nrow=3, byrow=TRUE)
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
[3,] 7 8 9
> # Setting dim on a vector of length 6 turns it into a 2 x 3 matrix
> x <- 1:6
> dim(x) <- c(2,3)
> print(x)
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
> class(x)
[1] "matrix" "array"
Let us consider two points X = (X1, X2, X3) and Y = (Y1, Y2, Y3) in a 3-dimensional space.
The Minkowski distance of order p between them is given by
$$D(X, Y) = \left( |X_1 - Y_1|^p + |X_2 - Y_2|^p + |X_3 - Y_3|^p \right)^{1/p}$$
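The Minkowski distance can be sketched in R as follows. The function name `minkowski` and the sample vectors are assumptions made for illustration; the function sums the p-th powers of the absolute coordinate differences and takes the p-th root.

```r
# Minkowski distance of order p between two numeric vectors of equal length
minkowski <- function(x, y, p) {
  stopifnot(length(x) == length(y), p >= 1)
  sum(abs(x - y)^p)^(1 / p)
}

x <- c(1, 2, 3)
y <- c(4, 6, 3)
minkowski(x, y, p = 1)  # 7: Manhattan distance, 3 + 4 + 0
minkowski(x, y, p = 2)  # 5: Euclidean distance, sqrt(9 + 16 + 0)
```

Base R's dist() function also accepts method = "minkowski" together with a p argument, which may be more convenient for whole data sets.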
What is Euclidean Distance? In mathematics, the Euclidean distance is defined as the
length of the straight line segment between two points.
Thus, the Euclidean distance formula is given by:
$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$
Where,
“d” is the Euclidean distance
(x1, y1) is the coordinate of the first point
(x2, y2) is the coordinate of the second point.
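The Euclidean distance formula can be checked with a short R sketch. The function name `euclidean` and the two sample points are illustrative assumptions; each point is stored as a vector c(x, y).

```r
# Euclidean distance between two points given as numeric vectors c(x, y)
euclidean <- function(p1, p2) {
  sqrt(sum((p2 - p1)^2))
}

p1 <- c(1, 2)      # first point (x1, y1)
p2 <- c(4, 6)      # second point (x2, y2)
euclidean(p1, p2)  # 5, since sqrt(3^2 + 4^2) = 5
```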