Unit 1

TRICHY ENGINEERING COLLEGE

(A Unit of SS Group of Institutions)


Approved by AICTE & Affiliated to Anna University, Chennai
An ISO 9001:2015 Certified Institution
Sivagnanam Nagar, Trichy-Chennai NH, Konalai, Trichy - 621 105.

UNIT I
EXPLORATORY DATA ANALYSIS
1. What is the primary purpose of Exploratory Data Analysis (EDA)?
A) To perform hypothesis testing
B) To create a summary of the main characteristics of the data
C) To make final decisions about the data
D) To forecast future values
Answer: B) To create a summary of the main characteristics of the data

2. Which of the following is NOT a method used in EDA?
A) Box plots
B) Histograms
C) Scatter plots
D) Neural Networks
Answer: D) Neural Networks

3. In EDA, which plot is useful for detecting outliers in the data?
A) Line plot
B) Box plot
C) Bar plot
D) Pie chart
Answer: B) Box plot
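
A box plot usually flags a point as an outlier when it falls more than 1.5 times the interquartile range (IQR) beyond the quartiles. The sketch below applies that rule directly; the function name, the sample data, and the choice of the "inclusive" quartile method are assumptions made for illustration, and plotting libraries may compute quartiles slightly differently.

```python
import statistics

def box_plot_outliers(values, k=1.5):
    """Return the values outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    # "inclusive" interpolates linearly between the observed data points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

sample = [2, 3, 4, 5, 6, 7, 8, 50]  # made-up data with one extreme value
print(box_plot_outliers(sample))    # [50]
```

Here Q1 = 3.75 and Q3 = 7.25, so the fences sit at -1.5 and 12.5, and only the value 50 lies outside them.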

4. Which of the following is NOT typically a goal of EDA?
A) Detecting outliers
B) Understanding data patterns and relationships
C) Assessing data distributions
D) Building machine learning models
Answer: D) Building machine learning models

5. A scatter plot is useful for:
A) Showing the distribution of a single variable
B) Summarizing the frequency of categories
C) Visualizing the relationship between two variables
D) Detecting missing data
Answer: C) Visualizing the relationship between two variables
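
The relationship a scatter plot shows visually is often summarized numerically with the Pearson correlation coefficient. The following is a minimal sketch of that calculation; the function name and the sample values are hypothetical and not part of the original question.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

hours = [1, 2, 3, 4, 5]     # made-up x values
scores = [2, 4, 6, 8, 10]   # made-up y values, exactly 2 * x
print(pearson_r(hours, scores))  # 1.0
```

A value near +1 or -1 corresponds to points lying close to a straight line in the scatter plot, while a value near 0 suggests no linear relationship.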

6. Which statistic is most commonly used to summarize the central tendency of a dataset in EDA?
A) Median
B) Mode
C) Mean
D) Variance
Answer: C) Mean

7. When performing EDA, what does a histogram display?
A) The correlation between two variables
B) The distribution of a single continuous variable
C) The frequency of categorical variables
D) The relationship between variables over time
Answer: B) The distribution of a single continuous variable
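
A histogram is built by splitting the range of a continuous variable into equal-width bins and counting how many values land in each one. The sketch below shows that counting step in plain Python; the function name, the bin count, and the data are assumptions chosen for illustration.

```python
def histogram_counts(values, bins):
    """Count how many values fall into each of `bins` equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        # All values are identical, so everything goes in the first bin.
        return [len(values)] + [0] * (bins - 1)
    counts = [0] * bins
    for v in values:
        # The maximum value would index one past the end, so clamp it.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts

data = [1, 1, 2, 2, 2, 3, 3, 4, 10]  # made-up measurements
print(histogram_counts(data, bins=3))  # [7, 1, 1]
```

Drawing one bar per bin with these counts as heights gives the histogram.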

8. Which EDA technique would be most useful to show how the data is skewed?
A) Scatter plot
B) Pie chart
C) Histogram
D) Bar chart
Answer: C) Histogram
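
The skew visible in a histogram can also be expressed as a number. The sketch below computes the moment-based skewness coefficient (third central moment divided by the 1.5 power of the second); the function name and both data sets are hypothetical.

```python
def skewness(values):
    """Moment-based skewness: positive for a long right tail, negative for a long left tail."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

right_skewed = [1, 1, 2, 2, 2, 3, 3, 4, 10]  # made-up data with a long right tail
symmetric = [1, 2, 3, 4, 5]                  # made-up symmetric data

print(skewness(right_skewed) > 0)  # True
print(skewness(symmetric))         # 0.0
```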

9. If a dataset has many missing values, what is the first step in the EDA process?
A) Visualizing the missing data
B) Ignoring the missing data
C) Removing all rows with missing data
D) Replacing missing values with zeros
Answer: A) Visualizing the missing data
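
Before visualizing missing data, it helps to count the missing entries in each column. The sketch below does this for a list of records and prints a simple text bar per column; the record layout, the column names, and the use of None to mark a missing value are all assumptions made for illustration.

```python
def missing_counts(rows):
    """Count None entries per column in a list of dict records."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts.setdefault(column, 0)
            if value is None:
                counts[column] += 1
    return counts

def print_missing_chart(rows):
    """Print one text bar per column, one '#' per missing value."""
    total = len(rows)
    for column, missing in missing_counts(rows).items():
        print(f"{column:<8} {'#' * missing:<{total}} {missing}/{total}")

records = [  # made-up records; None marks a missing value
    {"age": 25, "grade": "A"},
    {"age": None, "grade": "B"},
    {"age": 31, "grade": None},
    {"age": None, "grade": "A"},
]
print_missing_chart(records)
```

A chart like this shows which columns are affected and how badly, which informs the later choice between dropping rows and imputing values.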

10. Which of the following measures is least affected by extreme values (outliers)?
A) Mean
B) Median
C) Standard deviation
D) Range
Answer: B) Median
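
A small numerical check illustrates why the median is preferred when outliers are present. The sketch below compares the mean and median before and after adding one extreme value; the function name and the data are made up for illustration.

```python
import statistics

def center(values):
    """Return the (mean, median) pair for a sequence of numbers."""
    return statistics.mean(values), statistics.median(values)

base = [10, 12, 13, 15, 16]    # made-up data without outliers
with_outlier = base + [1000]   # same data plus one extreme value

print(center(base))            # mean 13.2, median 13
print(center(with_outlier))    # mean jumps above 177, median is 14
```

Adding a single extreme value shifts the mean by more than 160 units, while the median moves only from 13 to 14.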
