Loading Datasets From Excel/CSV: A) Local R Database Dataset
a) Local R database
Dataset
Dataset name: iris
This famous (Fisher's or Anderson's) iris data set gives the measurements in centimeters of the
variables sepal length and width and petal length and width, respectively, for 50 flowers from
each of 3 species of iris. The species are Iris setosa, versicolor, and virginica.
R commands used:
data()
head()
R Script
> data(iris)
> head(iris)
Output:
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
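The description above (50 flowers from each of 3 species, four measurements per flower) can be checked directly against the loaded data. The following is a minimal sketch using only base R; the variable name species_counts is introduced here for illustration.

```r
# Load the built-in iris data set into the workspace
data(iris)

# Dimensions: number of rows (flowers) and columns (variables)
dim(iris)

# Number of flowers recorded for each species
species_counts <- table(iris$Species)
species_counts

# Column types, then a per-column summary
str(iris)
summary(iris)
```

dim() should report 150 rows and 5 columns, and the table should show 50 flowers for each of setosa, versicolor, and virginica.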
b) xls, csv files
Dataset
Dataset: table1
Contains student details.
R commands used:
read.csv()
install.packages("")
library()
read_xls()
R Script:
> data<-read.csv("D:/cloud/table1.csv")
> data
Output:
> library(readxl)
> student<-read_xls("D:/cloud/R/students.xls")
> student
Output:
# A tibble: 30 x 14
     ID `Last Name` `First Name` City  State Gender `Student Status` Major Country   Age   SAT `Average score ~
  <dbl> <chr>       <chr>        <chr> <chr> <chr>  <chr>            <chr> <chr>   <dbl> <dbl>            <dbl>
1     1 DOE01       JANE01       Los ~ Cali~ Female Graduate         Poli~ US         30  2263             67
2     2 DOE02       JANE02       Sedo~ Ariz~ Female Undergraduate    Math  US         19  2006             63
3     3 DOE01       JOE01        Elmi~ New ~ Male   Graduate         Math  US         26  2221             78.1
4     4 DOE02       JOE02        Lack~ New ~ Male   Graduate         Econ  US         33  1716             77.8
5     5 DOE03       JOE03        Defi~ Ohio  Male   Graduate         Econ  US         37  1701             65
6     6 DOE04       JOE04        Tel ~ Isra~ Male   Graduate         Econ  Israel     25  1786             69
7     7 DOE05       JOE05        Cimax Nort~ Male   Graduate         Poli~ US         39  1577             95.9
8     8 DOE03       JANE03       Libe~ Kans~ Female Undergraduate    Poli~ US         21  1842             87
# ... with 20 more rows, and 2 more variables: `Height (in)` <dbl>,
#   `Newspaper readership (times/wk)` <dbl>
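The two loading steps above depend on files that exist only on the author's machine. As a sketch that runs anywhere, the example below writes a small made-up table to a temporary CSV file and reads it back with read.csv(), then reads a sample workbook that ships with the readxl package. The data frame contents and the names students, csv_path, csv_data, xls_path, and xls_data are assumptions made for illustration; the Excel part is skipped if readxl is not installed.

```r
# Write a small, made-up table to a temporary CSV file (hypothetical data)
students <- data.frame(
  ID    = 1:3,
  Major = c("Math", "Econ", "Politics"),
  Age   = c(19, 33, 30)
)
csv_path <- tempfile(fileext = ".csv")
write.csv(students, csv_path, row.names = FALSE)

# Read it back; stringsAsFactors = FALSE keeps text columns as character
csv_data <- read.csv(csv_path, stringsAsFactors = FALSE)
head(csv_data)

# Excel files need the readxl package; install it once if it is missing:
# install.packages("readxl")
if (requireNamespace("readxl", quietly = TRUE)) {
  # readxl_example() returns the path of a sample workbook bundled with readxl
  xls_path <- readxl::readxl_example("datasets.xls")
  xls_data <- readxl::read_xls(xls_path)  # reads the first sheet by default
  print(head(xls_data))
}
```

Forward slashes in paths, as in "D:/cloud/table1.csv", work on Windows as well and avoid having to escape backslashes.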
Descriptive Statistics (Dataset characteristics using R commands)
Dataset:
Dataset: Flights2008
The 2008 US flight data is a reasonably large data set that is well suited to exploration
by beginners. It can be used to answer questions such as which months have the most or
the fewest flights, and why flight arrivals and departures are delayed.
Packages used:
stringr
devtools
chron
R Commands used:
read.csv() str()
summary() dim()
names() is.na()
ncol() library()
str_pad() substring()
chron() head()
tail()
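Most of the base R commands listed above can be tried without the full 2008.csv file. The sketch below applies them to a tiny stand-in data frame; the name flights_demo, its values, and the column names Month, DepTime, and ArrDelay are assumptions made for illustration and may not match the real file. sprintf() is used here as a base R substitute for stringr's str_pad().

```r
# Hypothetical stand-in for the flights data (all values are made up)
flights_demo <- data.frame(
  Month    = c(1, 1, 2, 3, 3, 3),
  DepTime  = c(5, 930, 1545, NA, 2359, 615),
  ArrDelay = c(-3, 12, NA, NA, 45, 0)
)

str(flights_demo)      # structure: column names, types, first values
dim(flights_demo)      # number of rows and columns
ncol(flights_demo)     # number of columns only
names(flights_demo)    # column names
summary(flights_demo)  # per-column summary, including NA counts
head(flights_demo, 3)  # first three rows
tail(flights_demo, 3)  # last three rows

# Count missing values in each column
na_counts <- colSums(is.na(flights_demo))
na_counts

# Number of flights in each month
flights_per_month <- table(flights_demo$Month)
flights_per_month

# Pad departure times to four digits (e.g. 5 becomes "0005"),
# then split them into hour and minute with substring()
dep <- flights_demo$DepTime[!is.na(flights_demo$DepTime)]
dep_padded <- sprintf("%04d", as.integer(dep))
dep_hour   <- substring(dep_padded, 1, 2)
dep_minute <- substring(dep_padded, 3, 4)
data.frame(dep_padded, dep_hour, dep_minute)
```

Padding first keeps the hour and minute in fixed positions, so a departure time stored as 5 is read as 00:05 and not as hour 5.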
R Script:
> flights<-read.csv("D:/cloud/R/2008.csv")
> str(flights)
Output: