A data structure is a particular way of organizing data in a computer so that it can be used effectively. The idea is to reduce the space and time complexities of different tasks.
The two most important data structures in R are the matrix and the data frame. They look alike but differ in nature.
Matrix in R –
A matrix is a homogeneous collection of elements arranged in a two-dimensional rectangular layout. It is an m*n array whose elements all share the same data type. It is created from a vector input, and its numbers of rows and columns are fixed when it is created. You can perform many arithmetic operations on an R matrix, such as addition, subtraction, multiplication, and division.
Example:
R
# Fill a 2 x 3 matrix row by row from a vector
A <- matrix(c(11, 22, 33, 44, 55, 66),
            nrow = 2, ncol = 3, byrow = TRUE)
print(A)
Output:
     [,1] [,2] [,3]
[1,]   11   22   33
[2,]   44   55   66
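The arithmetic operations mentioned above can be sketched as follows. This is a minimal illustration, not a complete tour: the second matrix `B` and the result names are made up for the example. Note that `*` works element by element, while the matrix product uses `%*%`.

```r
# A is the matrix from the example above; B is a hypothetical second matrix
A <- matrix(c(11, 22, 33, 44, 55, 66), nrow = 2, ncol = 3, byrow = TRUE)
B <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)

S <- A + B   # element-wise addition
D <- A - B   # element-wise subtraction
M <- A * B   # element-wise multiplication (not the matrix product)
Q <- A / B   # element-wise division

# The matrix product needs conformable dimensions:
# (2 x 3) %*% (3 x 2) gives a 2 x 2 matrix
P <- A %*% t(B)

print(S)
print(Q)
print(dim(P))
```

Element-wise operators require both matrices to have the same dimensions; R reports a "non-conformable arrays" error otherwise.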
Application & Usage
- Matrices are widely used in economics for calculating figures such as GDP (Gross Domestic Product) or PI (per capita income).
- They are also helpful in the study of electrical and electronic circuits.
- Matrices are used in the analysis of survey data, e.g. for plotting graphs.
- Helpful in probability and statistics.
DataFrames in R –
A data frame is used for storing data tables. It can hold a different data type in each column; columns are also called fields. Internally it is a list of vectors of equal length, which makes it a generalized form of a matrix, much like a table in an Excel sheet. It has column names and row names: row names must be unique, and column names should not be empty. Columns commonly hold numeric, character, or factor data, although other types such as logical and Date are also allowed. Data frames are heterogeneous.
Example:
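R
comp.data <- data.frame(
  comp_id = 1:3,
  comp_name = c("Geeks", "For", "Geeks"),
  growth = c(16000, 14000, 12000),
  # The date format is assumed to be day/month/year; use "%m/%d/%y" if it is month/day/year
  comp_start_date = as.Date(c("02/05/10", "04/04/10", "05/03/10"), format = "%d/%m/%y")
)
print(comp.data)
Output:
  comp_id comp_name growth comp_start_date
1       1     Geeks  16000      2010-05-02
2       2       For  14000      2010-04-04
3       3     Geeks  12000      2010-03-05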
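To make the "list of vectors of equal length" description concrete, here is a small sketch that inspects a cut-down version of the data frame above. The date column is left out for brevity, and `stringsAsFactors = FALSE` is set explicitly so the result does not depend on the R version (before R 4.0, strings became factors by default).

```r
# Same company data as above, without the date column
comp.data <- data.frame(
  comp_id   = 1:3,
  comp_name = c("Geeks", "For", "Geeks"),
  growth    = c(16000, 14000, 12000),
  stringsAsFactors = FALSE
)

print(is.list(comp.data))        # a data frame is a list of columns
print(sapply(comp.data, class))  # each column keeps its own class
print(dim(comp.data))            # number of rows, number of columns

# Columns are reached by name; rows can be filtered by a condition
high <- comp.data[comp.data$growth > 13000, ]
print(high)
```

Because every column is a vector of the same length, a logical condition on one column (here `growth > 13000`) can be used to select whole rows.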
Application & Usage
- Data frames support many kinds of work, such as fitting statistical models specified by formulas.
- Processing data: many data-processing functions expect a data frame, so a matrix often has to be converted to a data frame first.
- Transposing is possible, i.e. swapping rows and columns, which is useful in data science.
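The points above can be sketched in a few lines of base R. The numbers and the names `m`, `df`, `fit`, and `tdf` are invented for illustration only. The sketch converts a matrix to a data frame with `as.data.frame()`, fits a linear model with a formula using `lm()`, and transposes with `t()`.

```r
# Hypothetical measurements stored in a 4 x 2 matrix with named columns
m <- matrix(c(1, 2, 3, 4, 2.1, 3.9, 6.2, 7.8), nrow = 4,
            dimnames = list(NULL, c("x", "y")))

# Convert the matrix to a data frame so columns can be used by name
df <- as.data.frame(m)

# Fit a simple linear model described by a formula
fit <- lm(y ~ x, data = df)
print(coef(fit))

# Transpose: rows become columns and vice versa
tdf <- t(df)
print(dim(tdf))
print(is.matrix(tdf))
```

One detail worth knowing: `t()` applied to a data frame returns a matrix, so wrap the result in `as.data.frame()` if a data frame is needed afterwards.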
Matrix v/s Data Frames in R
| Matrix | Data frame |
|---|---|
| A collection of elements arranged in a two-dimensional rectangular layout. | Stores data tables that can contain multiple data types across multiple columns, called fields. |
| An m*n array of a single data type. | A list of vectors of equal length; a generalized form of a matrix. |
| Has a fixed number of rows and columns. | Has a variable number of rows and columns. |
| All stored data must be of the same data type. | Stored data is typically of numeric, character, or factor type and may differ from column to column. |
| Homogeneous. | Heterogeneous. |
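The homogeneous versus heterogeneous distinction is easiest to see by putting the same mixed values into both structures. This is a small sketch with made-up values; `stringsAsFactors = FALSE` is passed so the result is the same across R versions.

```r
# Mixed values in a matrix are coerced to one common type (character here)
cm <- cbind(id = 1:2, name = c("a", "b"))
print(typeof(cm))   # the whole matrix is character

# A data frame keeps a separate type for each column
df <- data.frame(id = 1:2, name = c("a", "b"), stringsAsFactors = FALSE)
print(sapply(df, typeof))

# Arithmetic still works on the numeric column of the data frame
total <- sum(df$id)
print(total)
```

In the matrix the numeric ids have silently become the strings "1" and "2", so they would need to be converted back before any arithmetic.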