DA Lab 1-7

Uploaded by

vishwas gupta

KIT-651 DATA ANALYTICS LAB

Experiment-01
Object:- To get input from the user and perform numerical operations (MAX, MIN,
AVG, SUM, SQRT, ROUND) in R.

#Program
numbers <- c(2,4,6,8,10)

characters <- c("s", "a", "p", "b")

max(numbers)

max(characters)

min(numbers)

sum(numbers)

sqrt(numbers)

mean(numbers)

OUTPUT:
[1] 10

[1] "s"

[1] 2

[1] 30

[1] 1.414214 2.000000 2.449490 2.828427 3.162278

[1] 6
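
The object of this experiment mentions user input and ROUND, which the program above does not use. A minimal sketch of both follows. The helper name parse_numbers and the fallback string are assumptions made for illustration; readline() only prompts in an interactive session, so a fixed line is used when the script runs under Rscript.

```r
# Hypothetical helper: turn a line of space-separated text into a numeric vector.
parse_numbers <- function(line) {
  scan(text = line, what = numeric(), quiet = TRUE)
}

# Ask the user in an interactive session; otherwise fall back to sample input.
line <- if (interactive()) {
  readline("Enter numbers separated by spaces: ")
} else {
  "2 4 6 8 10"
}
numbers <- parse_numbers(line)

print(max(numbers))
print(min(numbers))
print(sum(numbers))
print(mean(numbers))

# ROUND: keep two decimal places of the square roots.
print(round(sqrt(numbers), 2))
```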

Experiment-02
Object:- To perform data import and export operations (.csv, .xlsx, .txt) using
data frames in R.

#Program to import .csv file


> data<- read.csv("salaries.csv")
> print(data)

Output
   work_year experience_level employment_type
1       2023               SE              FT
2       2023               MI              CT
3       2023               MI              CT
4       2023               SE              FT
5       2023               SE              FT
6       2023               SE              FT
7       2023               SE              FT
8       2023               SE              FT
9       2023               SE              FT
10      2023               SE              FT

#Program to import .txt file

> data <- read.table("salaries.txt", sep = "\t", header = TRUE)


> print(data)

Output

   work_year experience_level employment_type
1       2023               SE              FT
2       2023               MI              CT
3       2023               MI              CT
4       2023               SE              FT
5       2023               SE              FT
6       2023               SE              FT
7       2023               SE              FT
8       2023               SE              FT
9       2023               SE              FT
10      2023               SE              FT

#Program to import .xlsx file

> library(readxl)  # read_excel() comes from the readxl package
> sheet1 <- read_excel("sheet1.xlsx")
> View(sheet1)
> print(sheet1)

Output:

  id name     salary start_date dept
1  1 Rick     623.3  1/1/2012   IT
2  2 Dan      515.2  9/23/2013  Operations
3  3 Michelle 611    11/15/2014 IT
4  4 Ryan     729    5/11/2014  HR
5  5 Gary     43.25  3/27/2015  Finance
6  6 Nina     578    5/21/2013  IT
7  7 Simon    632.8  7/30/2013  Operations
8  8 Guru     722.5  6/17/2014  Finance
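
The object of this experiment also covers export, which the programs above do not show. The sketch below writes a small data frame to .csv and tab-separated .txt files and reads the .csv back. The three rows are copied from the sheet above; the temporary file paths are an assumption so that the example does not overwrite any real file.

```r
# Small data frame built from the first three rows shown above.
df <- data.frame(
  id     = 1:3,
  name   = c("Rick", "Dan", "Michelle"),
  salary = c(623.3, 515.2, 611)
)

# Export to .csv (temporary path used for illustration only).
csv_path <- tempfile(fileext = ".csv")
write.csv(df, csv_path, row.names = FALSE)

# Export to a tab-separated .txt file.
txt_path <- tempfile(fileext = ".txt")
write.table(df, txt_path, sep = "\t", row.names = FALSE, quote = FALSE)

# Re-import the .csv to confirm that the export round-trips.
df_back <- read.csv(csv_path)
print(df_back)
```

Setting row.names = FALSE keeps R's row numbers out of the file, so a later import does not gain an extra unnamed column.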

Experiment-03
Object:- To get input matrices from the user and perform matrix addition, subtraction,
multiplication, inverse, transpose and division operations using the vector concept in R.

#Program

# Create two 2x3 matrices.

m1 = matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)

print("Matrix-1:")

print(m1)

m2 = matrix(c(0, 1, 2, 3, 0, 2), nrow = 2)

print("Matrix-2:")

print(m2)

result = m1 + m2

print("Result of addition")

print(result)

result = m1 - m2

print("Result of subtraction")

print(result)

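# The * operator multiplies element by element; it is not the matrix product.
result = m1 * m2

print("Result of multiplication")

print(result)

# The / operator also works element by element; a zero in m2 gives Inf.
result = m1 / m2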

print("Result of division:")

print(result)

OUTPUT :

[1] "Matrix-1:"
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
[1] "Matrix-2:"
     [,1] [,2] [,3]
[1,]    0    2    0
[2,]    1    3    2
[1] "Result of addition"
     [,1] [,2] [,3]
[1,]    1    5    5
[2,]    3    7    8
[1] "Result of subtraction"
     [,1] [,2] [,3]
[1,]    1    1    5
[2,]    1    1    4
[1] "Result of multiplication"
     [,1] [,2] [,3]
[1,]    0    6    0
[2,]    2   12   12
[1] "Result of division:"
     [,1]     [,2] [,3]
[1,]  Inf 1.500000  Inf
[2,]    2 1.333333    3
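
The object also lists transpose and inverse, and the operators used above work element by element. The sketch below shows the linear-algebra counterparts: t() for the transpose, %*% for the matrix product and solve() for the inverse. The square matrices a and b are hypothetical values chosen so that a is invertible; they do not come from the program above.

```r
m1 <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)

# Transpose: the 2x3 matrix becomes 3x2.
m1_t <- t(m1)
print(m1_t)

# Hypothetical 2x2 matrices for the product and the inverse.
a <- matrix(c(2, 1, 1, 3), nrow = 2)
b <- matrix(c(1, 0, 2, 1), nrow = 2)

# Matrix product (rows of a times columns of b).
print(a %*% b)

# Inverse of a; solve() stops with an error if a is singular.
a_inv <- solve(a)
print(a_inv)

# Matrix "division" of b by a, read as b multiplied by the inverse of a.
print(b %*% a_inv)
```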

Experiment-04

Object:-To perform statistical operations (Mean, Median, Mode) using R.

#Program
marks <- c(97, 78, 57, 64, 87)
result <- mean(marks)
print(result)

result <- median(marks)
print(result)

marks <- c(97, 78, 57, 78, 97, 66, 87, 64, 87, 78)

# R has no built-in statistical mode; base mode() reports the storage type,
# so the helper gets its own name and takes the data as an argument.
get_mode <- function(v) {
  return(names(sort(-table(v)))[1])
}
get_mode(marks)

OUTPUT:

[1] 76.6

[1] 78

[1] "78"
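
The mode above is returned as a character string, and only the first value is reported when several values tie. A possible variant, sketched under the assumption that the input is numeric, returns every most-frequent value as a number. The name stat_modes is hypothetical.

```r
# Hypothetical helper: return all most-frequent values of a numeric vector.
stat_modes <- function(v) {
  counts <- table(v)
  is_top <- as.vector(counts) == max(counts)
  as.numeric(names(counts)[is_top])
}

marks <- c(97, 78, 57, 78, 97, 66, 87, 64, 87, 78)
print(stat_modes(marks))

# Two values tie here, so both are returned.
print(stat_modes(c(1, 1, 2, 2, 3)))
```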

Experiment-05

Object: To perform data preprocessing operations in R: 1. handling missing
data, 2. min-max normalization.

#Program

# Create a data frame with missing values

mydata <- data.frame(x = c(1, 2, NA, 4, 5), y = c(6, 7, 8, NA, 10))

print("Original Data:")

print(mydata)

# Remove missing values

mydata <- na.omit(mydata)

print("Data after removing missing values:")

print(mydata)

Output:

[1] "Original Data:"
   x  y
1  1  6
2  2  7
3 NA  8
4  4 NA
5  5 10
[1] "Data after removing missing values:"
  x  y
1 1  6
2 2  7
5 5 10
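
Removing rows with na.omit() discards two of the five observations here. An alternative way of handling missing data, shown as a sketch rather than as part of the original program, is to replace each NA with the mean of its column (mean imputation).

```r
mydata <- data.frame(x = c(1, 2, NA, 4, 5), y = c(6, 7, 8, NA, 10))

# Replace each missing value with the mean of the observed values in its column.
imputed <- mydata
for (col in names(imputed)) {
  col_mean <- mean(imputed[[col]], na.rm = TRUE)
  imputed[[col]][is.na(imputed[[col]])] <- col_mean
}

print(imputed)
```

With this data the missing x becomes 3 and the missing y becomes 7.75, and all five rows are kept.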

Min-Max Normalization:
Program:
# Create a vector of data

mydata <- c(1, 2, 3, 4, 5)

print("Original Data:")

print(mydata)

# Perform min-max normalization

mydata_normalized <- scale(mydata, center = min(mydata), scale = max(mydata) - min(mydata))

print("Data after min-max normalization:")

print(mydata_normalized)

Output:

[1] "Original Data:"
[1] 1 2 3 4 5
[1] "Data after min-max normalization:"
     [,1]
[1,] 0.00
[2,] 0.25
[3,] 0.50
[4,] 0.75
[5,] 1.00
attr(,"scaled:center")
[1] 1
attr(,"scaled:scale")
[1] 4
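
scale() returns a one-column matrix with attributes. The same normalization can be written directly from the formula (x - min) / (max - min), which returns a plain vector. The function name min_max and the handling of a constant vector (all zeros) are assumptions made for this sketch.

```r
# Hypothetical helper: rescale a numeric vector to the range [0, 1].
min_max <- function(v) {
  rng <- range(v, na.rm = TRUE)
  # Assumption: a constant vector maps to all zeros to avoid division by zero.
  if (rng[1] == rng[2]) {
    return(rep(0, length(v)))
  }
  (v - rng[1]) / (rng[2] - rng[1])
}

mydata <- c(1, 2, 3, 4, 5)
print(min_max(mydata))
```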

Experiment-06
Object: To perform dimensionality reduction using PCA in R, with output
(the program below uses the built-in mtcars dataset).
#Program
# Loading Data
data(mtcars)

# Apply PCA using the prcomp function.
# The variables are centered and scaled first, because PCA is
# sensitive to the units (variances) of the variables.
my_pca <- prcomp(mtcars, scale. = TRUE,
                 center = TRUE, retx = TRUE)
names(my_pca)

# Summary
summary(my_pca)
my_pca

# View the principal component loadings
# my_pca$rotation[1:5, 1:4]
my_pca$rotation

# See the principal components
dim(my_pca$x)
my_pca$x

# Plot the resulting principal components
# The parameter scale = 0 ensures that arrows
# are scaled to represent the loadings
biplot(my_pca, main = "Biplot", scale = 0)

# Compute standard deviation
my_pca$sdev

# Compute variance
my_pca.var <- my_pca$sdev ^ 2
my_pca.var

# Proportion of variance for a scree plot
propve <- my_pca.var / sum(my_pca.var)
propve

# Plot variance explained for each principal component
plot(propve, xlab = "Principal Component",
     ylab = "Proportion of Variance Explained",
     ylim = c(0, 1), type = "b",
     main = "Scree Plot")

# Plot the cumulative proportion of variance explained
plot(cumsum(propve),
     xlab = "Principal Component",
     ylab = "Cumulative Proportion of Variance Explained",
     ylim = c(0, 1), type = "b")

# Find the smallest number of principal components
# that together cover at least 90% of the variance
which(cumsum(propve) >= 0.9)[1]

# Predict disp using the first 4 principal components
# Build a training set from disp and the component scores
train.data <- data.frame(disp = mtcars$disp, my_pca$x[, 1:4])

# Run a decision tree algorithm

## Install (needed only once) and load the packages
install.packages("rpart")
install.packages("rpart.plot")
library(rpart)
library(rpart.plot)

rpart.model <- rpart(disp ~ .,
                     data = train.data, method = "anova")

rpart.plot(rpart.model)
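
The variances computed in the program (sdev squared) can be cross-checked: for centered and scaled data they equal the eigenvalues of the correlation matrix. The sketch below performs that check and repeats the 90% cut-off rule. It is an illustration of what prcomp() returns, not a replacement for the program; the names pca_var, eig and n_keep are chosen here.

```r
data(mtcars)
my_pca <- prcomp(mtcars, scale. = TRUE, center = TRUE)

# Variance of each principal component.
pca_var <- my_pca$sdev^2

# Eigenvalues of the correlation matrix, sorted in decreasing order by eigen().
eig <- eigen(cor(mtcars))
print(all.equal(pca_var, eig$values))

# Cumulative proportion of variance and the smallest number of
# components that reaches at least 90%.
propve <- pca_var / sum(pca_var)
print(round(cumsum(propve), 4))
n_keep <- which(cumsum(propve) >= 0.9)[1]
print(n_keep)
```

For mtcars the cut-off gives 4 components, which matches the cumulative proportions in the output (0.89873 after PC3, 0.92324 after PC4).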

OUTPUT

Rscript /tmp/pkYpkkGJLy.r
[1] "sdev" "rotation" "center" "scale" "x"
Importance of components:
                          PC1    PC2     PC3     PC4     PC5     PC6    PC7
Standard deviation     2.5707 1.6280 0.79196 0.51923 0.47271 0.46000 0.3678
Proportion of Variance 0.6008 0.2409 0.05702 0.02451 0.02031 0.01924 0.0123
Cumulative Proportion  0.6008 0.8417 0.89873 0.92324 0.94356 0.96279 0.9751
                           PC8    PC9    PC10   PC11
Standard deviation     0.35057 0.2776 0.22811 0.1485
Proportion of Variance 0.01117 0.0070 0.00473 0.0020
Cumulative Proportion  0.98626 0.9933 0.99800 1.0000
Standard deviations (1, .., p=11):
 [1] 2.5706809 1.6280258 0.7919579 0.5192277 0.4727061 0.4599958 0.3677798
 [8] 0.3505730 0.2775728 0.2281128 0.1484736

Rotation (n x k) = (11 x 11):


PC1 PC2 PC3 PC4 PC5 PC6
mpg -0.3625305 0.01612440 -0.22574419 -0.022540255 -0.10284468 -0.10879743
cyl 0.3739160 0.04374371 -0.17531118 -0.002591838 -0.05848381 0.16855369
disp 0.3681852 -0.04932413 -0.06148414 0.256607885 -0.39399530 -0.33616451
hp 0.3300569 0.24878402 0.14001476 -0.067676157 -0.54004744 0.07143563
drat -0.2941514 0.27469408 0.16118879 0.854828743 -0.07732727 0.24449705
wt 0.3461033 -0.14303825 0.34181851 0.245899314 0.07502912 -0.46493964
qsec -0.2004563 -0.46337482 0.40316904 0.068076532 0.16466591 -0.33048032
vs -0.3065113 -0.23164699 0.42881517 -0.214848616 -0.59953955 0.19401702
am -0.2349429 0.42941765 -0.20576657 -0.030462908 -0.08978128 -0.57081745
gear -0.2069162 0.46234863 0.28977993 -0.264690521 -0.04832960 -0.24356284
carb 0.2140177 0.41357106 0.52854459 -0.126789179 0.36131875 0.18352168
PC7 PC8 PC9 PC10 PC11
mpg 0.367723810 0.754091423 -0.235701617 -0.13928524 -0.124895628
cyl 0.057277736 0.230824925 -0.054035270 0.84641949 -0.140695441
disp 0.214303077 -0.001142134 -0.198427848 -0.04937979 0.660606481
hp -0.001495989 0.222358441 0.575830072 -0.24782351 -0.256492062
drat 0.021119857 -0.032193501 0.046901228 0.10149369 -0.039530246
wt -0.020668302 0.008571929 -0.359498251 -0.09439426 -0.567448697
qsec 0.050010522 0.231840021 0.528377185 0.27067295 0.181361780
vs -0.265780836 -0.025935128 -0.358582624 0.15903909 0.008414634
am -0.587305101 0.059746952 0.047403982 0.17778541 0.029823537
gear 0.605097617 -0.336150240 0.001735039 0.21382515 -0.053507085
carb -0.174603192 0.395629107 -0.170640677 -0.07225950 0.319594676
PC1 PC2 PC3 PC4 PC5 PC6
mpg -0.3625305 0.01612440 -0.22574419 -0.022540255 -0.10284468 -0.10879743
cyl 0.3739160 0.04374371 -0.17531118 -0.002591838 -0.05848381 0.16855369
disp 0.3681852 -0.04932413 -0.06148414 0.256607885 -0.39399530 -0.33616451
hp 0.3300569 0.24878402 0.14001476 -0.067676157 -0.54004744 0.07143563
drat -0.2941514 0.27469408 0.16118879 0.854828743 -0.07732727 0.24449705
wt 0.3461033 -0.14303825 0.34181851 0.245899314 0.07502912 -0.46493964
qsec -0.2004563 -0.46337482 0.40316904 0.068076532 0.16466591 -0.33048032
vs -0.3065113 -0.23164699 0.42881517 -0.214848616 -0.59953955 0.19401702
am -0.2349429 0.42941765 -0.20576657 -0.030462908 -0.08978128 -0.57081745
gear -0.2069162 0.46234863 0.28977993 -0.264690521 -0.04832960 -0.24356284
carb 0.2140177 0.41357106 0.52854459 -0.126789179 0.36131875 0.18352168
PC7 PC8 PC9 PC10 PC11
mpg 0.367723810 0.754091423 -0.235701617 -0.13928524 -0.124895628
cyl 0.057277736 0.230824925 -0.054035270 0.84641949 -0.140695441
disp 0.214303077 -0.001142134 -0.198427848 -0.04937979 0.660606481
hp -0.001495989 0.222358441 0.575830072 -0.24782351 -0.256492062
drat 0.021119857 -0.032193501 0.046901228 0.10149369 -0.039530246
wt -0.020668302 0.008571929 -0.359498251 -0.09439426 -0.567448697
qsec 0.050010522 0.231840021 0.528377185 0.27067295 0.181361780
vs -0.265780836 -0.025935128 -0.358582624 0.15903909 0.008414634
am -0.587305101 0.059746952 0.047403982 0.17778541 0.029823537
gear 0.605097617 -0.336150240 0.001735039 0.21382515 -0.053507085
carb -0.174603192 0.395629107 -0.170640677 -0.07225950 0.319594676
[1] 32 11
PC1 PC2 PC3 PC4
Mazda RX4 -0.6468627420 1.7081142 -0.5917309 0.113702214
Mazda RX4 Wag -0.6194831460 1.5256219 -0.3763013 0.199121210
Datsun 710 -2.7356242748 -0.1441501 -0.2374391 -0.245215450
Hornet 4 Drive -0.3068606268 -2.3258038 -0.1336213 -0.503800355
Hornet Sportabout 1.9433926844 -0.7425211 -1.1165366 0.074461963
Valiant -0.0552534228 -2.7421229 0.1612456 -0.975167425
Duster 360 2.9553851233 0.3296133 -0.3570461 -0.051529216
Merc 240D -2.0229593244 -1.4421056 0.9290295 -0.142129082
Merc 230 -2.2513839535 -1.9522879 1.7689364 0.287210957
Merc 280 -0.5180912217 -0.1594610 1.4692603 0.066263362
Merc 280C -0.5011860079 -0.3187934 1.6570701 0.094357222
Merc 450SE 2.2124096339 -0.6727099 -0.3694707 -0.129797905
Merc 450SL 2.0155715693 -0.6724606 -0.4768341 -0.210991001
Merc 450SLC 2.1147047372 -0.7891129 -0.2904620 -0.175332868
Cadillac Fleetwood 3.8383725118 -0.8149087 0.6370972 0.290505877
Lincoln Continental 3.8918495626 -0.7218314 0.7092612 0.405336898
Chrysler Imperial 3.5363862158 -0.4145024 0.5402468 0.665665306
Fiat 128 -3.7955510831 -0.2920783 -0.4161681 0.055191058
Honda Civic -4.1870356784 0.6775721 -0.2035831 1.167526096
Toyota Corolla -4.1675359344 -0.2748890 -0.4589124 0.183313028
Toyota Corona -1.8741790870 -2.0864529 0.1543265 0.050514126
Dodge Challenger 2.1504414942 -0.9982442 -1.1503639 -0.584982249
AMC Javelin 1.8340369797 -0.8921886 -0.9472872 0.005694071
Camaro Z28 2.8434957523 0.6701037 -0.1605593 0.814340105
Pontiac Firebird 2.2105479148 -0.8600504 -1.0279577 0.146420497
Fiat X1-9 -3.5176818134 -0.1192950 -0.4464716 -0.013427353
Porsche 914-2 -2.6095003965 2.0141425 -0.8172519 0.568564789
Lotus Europa -3.3323844512 1.3568877 -0.4467167 -1.153197531
Ford Pantera L 1.3513346957 3.4448780 -0.1343943 0.590098358
Ferrari Dino -0.0009743305 3.1683750 0.3957610 -0.938933017
Maserati Bora 2.6270897605 4.3107016 1.3315940 -0.877332804
Volvo 142E -2.3824711412 0.2299603 0.4052798 0.223549117
PC5 PC6 PC7 PC8
Mazda RX4 0.945523363 -0.0169873733 -0.42648652 0.009631217
Mazda RX4 Wag 1.016680740 -0.2417246434 -0.41620046 0.084520213
Datsun 710 -0.398762288 -0.3487678138 -0.60884146 -0.585255765
Hornet 4 Drive -0.549208936 0.0192969984 -0.04036075 0.049583029
Hornet Sportabout -0.207515698 0.1491927606 0.38350816 0.160297757
Valiant -0.211665375 -0.2438358546 -0.29464160 -0.256612420
Duster 360 -0.343847875 0.7126920868 -0.13607792 0.171103449
Merc 240D 0.316651386 -0.0009889391 0.63946214 -0.163156195
Merc 230 0.333682355 -0.3338703384 0.62201034 0.105779936
Merc 280 0.069624161 0.8165308365 0.16117090 -0.099983313
Merc 280C 0.148803650 0.7308383757 0.09254430 -0.197306566
Merc 450SE 0.378611141 0.1317014762 -0.01645498 0.194092435
Merc 450SL 0.355611763 0.2400263805 0.05123623 0.329669990
Merc 450SLC 0.432140303 0.1801997325 -0.06675316 0.119252582
Cadillac Fleetwood 0.048245223 -0.8844735483 -0.16615296 -0.138398783
Lincoln Continental -0.003899176 -0.8625868981 -0.19250873 -0.129305868
Chrysler Imperial -0.208027112 -0.6536447300 0.03449804 0.391104141
Fiat 128 -0.219981109 -0.4675796343 -0.03749941 0.625278746
Honda Civic -0.097674091 0.5180554279 -0.25316291 0.395045565
Toyota Corolla -0.222152228 -0.3171521124 0.06617540 0.853947085
Toyota Corona -0.039299002 0.7236992559 -0.28027808 -0.207237627
Dodge Challenger 0.226237802 0.1062181942 0.09489585 -0.316055390
AMC Javelin 0.252565496 0.2888101997 0.08161916 -0.321900593
Camaro Z28 -0.389118986 0.9468795171 -0.21157976 -0.038657331
Pontiac Firebird -0.299261925 -0.1983310387 0.47269865 0.234144182
Fiat X1-9 -0.206753365 -0.1449905641 -0.35850305 -0.089109764
Porsche 914-2 0.597313744 -0.3394265065 0.82032965 -0.634987241
Lotus Europa -0.694667640 0.0165037718 0.51018011 -0.004140777
Ford Pantera L -1.101648091 -0.1746156635 0.41358868 -0.609167214
Ferrari Dino 0.848833976 -0.0097569921 0.02967883 -0.014187801
Maserati Bora -0.455265189 -0.0156094416 -0.18813730 0.558646792
Volvo 142E -0.321777017 -0.3263029217 -0.77995741 -0.476634473
PC9 PC10 PC11
Mazda RX4 -0.14642303 0.06670350 0.179693570
Mazda RX4 Wag -0.07452829 0.12692766 0.088644265
Datsun 710 0.13122859 -0.04573787 -0.094632914
Hornet 4 Drive -0.22021812 0.06039981 0.147611269
Hornet Sportabout 0.02117623 0.05983003 0.146406899
Valiant 0.03222907 0.20165466 0.019545064
Duster 360 0.17844547 -0.36086641 0.171863162
Merc 240D -0.37698418 -0.29086529 -0.019090358
Merc 230 0.86455356 0.11597058 0.159688512
Merc 280 -0.54092449 0.22093750 -0.124486227
Merc 280C -0.30876072 0.34417564 -0.034578568
Merc 450SE 0.05614966 0.06531727 -0.396445135
Merc 450SL 0.20501055 0.10761308 -0.197616838
Merc 450SLC 0.38704169 0.21191036 -0.142498830
Cadillac Fleetwood -0.19333387 -0.06184979 0.262886205
Lincoln Continental -0.19523562 -0.12094849 0.039191100
Chrysler Imperial -0.27447514 -0.27588169 -0.224420191
Fiat 128 -0.10550311 0.02717077 -0.208865888
Honda Civic -0.23711675 0.15433928 0.246835364
Toyota Corolla 0.11313627 0.12606845 -0.031747839
Toyota Corona 0.44646972 -0.51147635 0.063679725
Dodge Challenger -0.10435633 0.13641143 0.049594456
AMC Javelin 0.12237636 0.29628634 0.045293027
Camaro Z28 0.05282991 -0.32624525 -0.099386307
Pontiac Firebird -0.20849043 -0.01547674 0.122593248
Fiat X1-9 0.02228967 0.08414018 -0.005746448
Porsche 914-2 0.12999660 -0.34968156 -0.111596656
Lotus Europa -0.29680350 -0.23980308 0.030015592
Ford Pantera L 0.23280792 0.50262890 -0.042242570
Ferrari Dino -0.09813571 -0.14491815 0.043006835
Maserati Bora 0.34081133 -0.04706368 0.062135486
Volvo 142E 0.04473670 -0.11767108 -0.145329008
Error in (function (file = if (onefile) "Rplots.pdf" else "Rplot%03d.pdf", :
cannot open file 'Rplots.pdf'
Calls: biplot ... biplot.prcomp -> biplot.default -> par -> <Anonymous>
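
The run stops at biplot() because the default graphics device could not create Rplots.pdf. A likely cause, stated here as an assumption, is that the script ran in a directory where files cannot be written. One possible workaround is to open a device explicitly on a writable path before plotting. The sketch below uses pdf() with a file in tempdir(); the file name pca_biplot.pdf is hypothetical.

```r
data(mtcars)
my_pca <- prcomp(mtcars, scale. = TRUE, center = TRUE)

# Open a PDF device on a path that is known to be writable.
plot_path <- file.path(tempdir(), "pca_biplot.pdf")
pdf(plot_path)

biplot(my_pca, main = "Biplot", scale = 0)

# Close the device so the file is written to disk.
invisible(dev.off())
print(file.exists(plot_path))
```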

Experiment-07

Object:- To perform simple linear regression in R.

1. Create Relationship Model & get the Coefficients

> x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
> y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
>
> # Apply the lm() function.
> relation <- lm(y~x)
>
> print(relation)

Output
Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
   -38.4551       0.6746
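
For simple linear regression the two coefficients have closed-form least-squares formulas: slope = cov(x, y) / var(x) and intercept = mean(y) - slope * mean(x). The sketch below computes them by hand and compares them with lm(). The x values are taken from the later steps of this experiment; the names slope, intercept and fit are chosen for illustration.

```r
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)

# Least-squares estimates from the closed-form formulas.
slope <- cov(x, y) / var(x)
intercept <- mean(y) - slope * mean(x)
print(c(intercept = intercept, slope = slope))

# The same estimates from lm().
fit <- lm(y ~ x)
print(all.equal(unname(coef(fit)), c(intercept, slope)))
```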

2. Get the Summary of the Relationship


> x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
> y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
>
> # Apply the lm() function.
> relation <- lm(y~x)
>
> print(summary(relation))

Output
Call:
lm(formula = y ~ x)

Residuals:
    Min      1Q  Median      3Q     Max
-6.3002 -1.6629  0.0412  1.8944  3.9775

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -38.45509    8.04901  -4.778  0.00139 **
x             0.67461    0.05191  12.997 1.16e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.253 on 8 degrees of freedom
Multiple R-squared: 0.9548, Adjusted R-squared: 0.9491
F-statistic: 168.9 on 1 and 8 DF, p-value: 1.164e-06
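
Several figures in this summary can be reproduced from the residuals. The sketch below recomputes R-squared, adjusted R-squared and the residual standard error for one predictor (n - 2 degrees of freedom) and compares them with the values stored in summary(). Variable names such as ss_res and rse are chosen for illustration.

```r
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
fit <- lm(y ~ x)
s <- summary(fit)

n <- length(y)
ss_res <- sum(residuals(fit)^2)     # unexplained variation
ss_tot <- sum((y - mean(y))^2)      # total variation

r2 <- 1 - ss_res / ss_tot
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - 2)
rse <- sqrt(ss_res / (n - 2))

print(round(c(r2 = r2, adj_r2 = adj_r2, rse = rse), 4))
print(all.equal(r2, s$r.squared))
```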

3. Predict the weight of new persons


> # The predictor vector.
> x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
>
> # The response vector.
> y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
>
> # Apply the lm() function.
> relation <- lm(y~x)
>
> # Find weight of a person with height 170.
> a <- data.frame(x = 170)
> result <- predict(relation,a)
> print(result)

Output

       1
76.22869
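
predict() also accepts several new values at once and can return an interval around each prediction. In the sketch below, the heights 160 and 180 and the 95% level are assumptions added for illustration; only the height of 170 comes from the program above.

```r
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
fit <- lm(y ~ x)

# New heights; the column must be named x to match the model formula.
new_heights <- data.frame(x = c(160, 170, 180))

# Point predictions with 95% prediction intervals (columns fit, lwr, upr).
pred <- predict(fit, new_heights, interval = "prediction", level = 0.95)
print(pred)

# The point prediction is just intercept + slope * height.
manual <- unname(coef(fit)[1] + coef(fit)[2] * 170)
print(manual)
```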

4. Visualize the Regression Graphically


# Create the predictor and response variable.
x <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
y <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
relation <- lm(y~x)

# Give the chart file a name.
jpeg(file = "linearregression.jpeg")

# Plot the chart (weight on the x-axis, height on the y-axis).
plot(y, x, col = "blue", main = "Height & Weight Regression",
     cex = 1.3, pch = 16, xlab = "Weight in Kg", ylab = "Height in cm")

# Add the fitted line with a separate call after plot().
# The axes are swapped in this chart, so the line comes from lm(x ~ y).
abline(lm(x ~ y))

# Save the file.
dev.off()

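Output

[Figure: scatter plot "Height & Weight Regression" with the fitted line, weight in kg on the x-axis and height in cm on the y-axis, saved as linearregression.jpeg]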