Cognitive Class - Answers Data Analysis With Python

This document contains questions from a data analysis with Python certification exam. It covers topics like data wrangling, exploratory data analysis, model development and evaluation. The questions test concepts like CSV files, dataframes, feature engineering, linear regression, and model performance metrics.

Uploaded by

Sloan Ian Ariff


Answers Data Analysis with Python Cognitive Class

Clear My Certification, September 18, 2020, Cognitive Class

Module 1 – Introduction
Question 1: What does CSV stand for?
• Comma Separated Values
• Car Sold values
• Car State values
• None of the above

Question 2: In the data set, what represents an attribute or feature?
• Row
• Column
• Each element in the data set

Question 3: What is another name for the variable that we want to predict?
• Target
• Feature
• Dataframe

Question 4: What is the command to display the first five rows of a dataframe df?
• df.head()
• df.tail()

Question 5: What command do you use to get the data type of each column of the dataframe df?
• df.dtypes
• df.head()
• df.tail()

Question 6: How do you get a statistical summary of a dataframe df?
• df.describe()
• df.head()
• df.tail()

Question 7: If you use the method describe() without changing any of the arguments, you will get a statistical summary of all the columns of type object?
• False
• True
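
The inspection methods from Questions 4 to 7 can be tried on a small dataframe. This is only a minimal sketch: the column names and values below are made up for illustration and do not come from the course data set.

```python
import pandas as pd

# Hypothetical toy data: two numeric columns and one text column.
df = pd.DataFrame({
    "price": [13495, 16500, 13950, 17450, 15250, 17710],
    "horsepower": [111, 154, 102, 115, 110, 140],
    "make": ["audi", "bmw", "audi", "bmw", "audi", "bmw"],
})

first_five = df.head()                      # first five rows by default
column_types = df.dtypes                    # one data type per column
numeric_summary = df.describe()             # numeric columns only by default
full_summary = df.describe(include="all")   # also summarizes the text column

print(first_five)
print(column_types)
print(numeric_summary.columns.tolist())
print(full_summary.columns.tolist())
```

With default arguments, describe() skips the text column here, which is why include="all" is needed to see it in the summary.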

Module 2 – Data Wrangling

Question 1: Consider the dataframe “df”. What is the result of the following operation: df['symbolling'] = df['symbolling'] + 1?
• Every element in the column “symbolling” will increase by one
• Every element in the row “symbolling” will increase by one
• Every element in the dataframe will increase by one

Question 2: Consider the dataframe “df”. What does the command df.rename(columns={'a':'b'}) change about the dataframe “df”?
• rename column “a” of the dataframe to “b”
• rename the row “a” to “b”
• nothing, as you must set the parameter “inplace=True”
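
The behavior of rename can be checked directly. The sketch below assumes a one-column toy dataframe (made up for illustration) and shows that rename returns a new dataframe and leaves the original untouched unless the result is assigned back or inplace=True is passed.

```python
import pandas as pd

# Hypothetical toy dataframe with a single column "a".
df = pd.DataFrame({"a": [1, 2, 3]})

renamed = df.rename(columns={"a": "b"})   # returns a new dataframe
original_columns = df.columns.tolist()    # df itself still has column "a"
renamed_columns = renamed.columns.tolist()

print(original_columns)
print(renamed_columns)

# To keep the change, assign the result back (or pass inplace=True).
df = df.rename(columns={"a": "b"})
print(df.columns.tolist())
```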

Question 3: Consider the dataframe “df”. What is the result of the following operation: df['price'] = df['price'].astype(int)?
• convert or cast the row ‘price’ to an integer value
• convert or cast the column ‘price’ to an integer value
• convert or cast the entire dataframe to an integer value

Question 4: Consider the column of the dataframe df['a']. The column has been standardized. What is the standard deviation of the values, i.e. the result of applying the following operation: df['a'].std()?
• 1
• 0
• 3

Question 5: Consider the column of the dataframe df['Fuel'], with two values ‘gas’ and ‘diesel’. What will be the names of the new columns from pd.get_dummies(df['Fuel'])?
• 1 and 0
• Just diesel
• Just gas
• Gas and diesel

Question 6: What are the values of the new columns from Question 5?
• 1 and 0
• Just diesel
• Just gas
• Gas and diesel
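
Questions 5 and 6 can be reproduced with a few rows. The sketch below uses made-up values for the ‘Fuel’ column. It passes dtype=int explicitly because recent pandas versions return boolean indicator columns (True and False) by default rather than 1 and 0.

```python
import pandas as pd

# Hypothetical values for the 'Fuel' column.
df = pd.DataFrame({"Fuel": ["gas", "diesel", "gas", "gas"]})

# One indicator column per category; dtype=int forces 1/0 values.
dummies = pd.get_dummies(df["Fuel"], dtype=int)

print(dummies.columns.tolist())
print(dummies)
```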

Module 3 – Exploratory Data Analysis

Question 1: Consider the dataframe “df”. Which method provides the summary statistics?
• df.describe()
• df.head()
• df.tail()
• df.summary()

Question 2: Consider the following dataframe:
df_test = df[['body-style', 'price']]
The following operation is applied:
df_grp = df_test.groupby(['body-style'], as_index=False).mean()
What are the resulting values of df_grp['price']?
• The average price for each body style
• The average price
• The average body style
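
The grouping in Question 2 can be sketched on toy data. The rows below are invented for illustration. The column selection is written with double brackets, since selecting several columns at once requires a list of names.

```python
import pandas as pd

# Hypothetical rows; only 'body-style' and 'price' matter here.
df = pd.DataFrame({
    "body-style": ["sedan", "sedan", "wagon", "wagon", "hatchback"],
    "price": [10000.0, 20000.0, 12000.0, 18000.0, 9000.0],
    "make": ["audi", "bmw", "volvo", "audi", "honda"],
})

df_test = df[["body-style", "price"]]
# One row per body style, holding the mean price of that group.
df_grp = df_test.groupby(["body-style"], as_index=False).mean()

print(df_grp)
```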

Question 3: Correlation implies causation:
• False
• True

Question 4: What is the minimum possible value of Pearson’s correlation?
• 1
• -100
• -1

Question 5: What is the Pearson correlation between variables X and Y, if X=Y?
• -1
• 1
• 0
• X
• Y
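
The two extreme cases of the Pearson correlation from Questions 4 and 5 can be checked numerically. This is a small sketch with arbitrary values: a series compared with itself, and with its negation.

```python
import pandas as pd

# Arbitrary sample values.
x = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

same = x.corr(x)        # Pearson correlation of X with itself
opposite = x.corr(-x)   # Pearson correlation of X with -X

print(same, opposite)
```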

Module 4 – Model Development

Question 1: Let X be a dataframe with 100 rows and 5 columns, and let y be the target with 100 samples. Assuming all the relevant libraries and data have been imported, the following lines of code have been executed:
LR = LinearRegression()
LR.fit(X, y)
yhat = LR.predict(X)
How many samples does yhat contain?
• 5
• 500
• 100
• 0

Question 2: What value of R^2 (coefficient of determination) indicates your model performs best?
• -100
• -1
• 0
• 1

Question 3: What statement is true about polynomial linear regression?
• Polynomial linear regression is not linear in any way
• Although the predictor variables of polynomial linear regression are not linear, the relationship between the parameters or coefficients is linear.
• Polynomial linear regression uses wavelets

Question 4: The larger the mean square error, the better your model has performed.
• False
• True

Question 5: Assume all the libraries are imported, y is the target and X is the features or independent variables. Consider the following lines of code:
Input = [('scale', StandardScaler()), ('model', LinearRegression())]
pipe = Pipeline(Input)
pipe.fit(X, y)
ypipe = pipe.predict(X)
What have we just done in the above code?
• Polynomial transform, standardize the data, then perform a prediction using a linear regression model
• Standardize the data, then perform prediction using a linear regression model
• Polynomial transform, then standardize the data
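
A runnable version of the pipeline from Question 5 is sketched below. It assumes scikit-learn is installed and uses synthetic features with a noise-free linear target, so the data is an assumption rather than part of the quiz.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic features and an exactly linear target (no noise).
rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=10.0, size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 4.0

# Two steps only: scaling, then linear regression.
Input = [("scale", StandardScaler()), ("model", LinearRegression())]
pipe = Pipeline(Input)
pipe.fit(X, y)
ypipe = pipe.predict(X)

step_names = [name for name, _ in pipe.steps]
print(step_names)
print(ypipe[:3])
```

The step list contains no polynomial transform, so the pipeline only standardizes and then fits and predicts.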

Module 5 – Model Evaluation

Question 1: In the following plot, the vertical axis shows the mean square error and the horizontal axis represents the order of the polynomial. The red line represents the training error and the blue line is the test error. What is the best order of the polynomial given the possible choices on the horizontal axis?
• 2
• 8
• 16

Question 2: What is the correct use of the “train_test_split” function such that 40% of the data samples will be utilized for testing, the parameter “random_state” is set to zero, and the input variables for the features and targets are x_data and y_data respectively?
• train_test_split(x_data, y_data, test_size=0, random_state=0.4)
• train_test_split(x_data, y_data, test_size=0.4, random_state=0)
• train_test_split(x_data, y_data)
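
The effect of test_size and random_state in Question 2 can be seen on a small synthetic data set. The sketch below assumes scikit-learn is installed; the arrays are placeholders with 100 samples.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 100 samples with 2 features each.
x_data = np.arange(200).reshape(100, 2)
y_data = np.arange(100)

# test_size=0.4 holds out 40% of the samples; random_state=0 fixes the shuffle.
x_train, x_test, y_train, y_test = train_test_split(
    x_data, y_data, test_size=0.4, random_state=0
)

print(len(x_train), len(x_test))
```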

Question 3: What is the output of cross_val_score(lre, x_data, y_data, cv=2)?
• The predicted values of the test data using cross validation.
• The average R^2 on the test data for each of the two folds
• This function finds the free parameter alpha
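
To see what cross_val_score returns in Question 3, the sketch below runs it on synthetic data, assuming scikit-learn is installed. For a regressor the default score is R^2, and the function returns one score per fold; averaging them is a separate step.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic data with a nearly linear relationship.
rng = np.random.default_rng(0)
x_data = rng.normal(size=(60, 2))
y_data = 3.0 * x_data[:, 0] - x_data[:, 1] + rng.normal(scale=0.1, size=60)

lre = LinearRegression()
scores = cross_val_score(lre, x_data, y_data, cv=2)  # array with one R^2 per fold
mean_score = scores.mean()                           # average over the two folds

print(scores)
print(mean_score)
```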

Question 4: What is the code to create a ridge regression object “RR” with an alpha term equal to 10?
• RR=LinearRegression(alpha=10)
• RR=Ridge(alpha=10)
• RR=Ridge(alpha=1)

Question 5: What dictionary value would we use to perform a grid search for the following values of alpha: 1, 10, 100? No other parameter values should be tested.
• alpha=[1,10,100]
• [{'alpha': [1,10,100]}]
• [{'alpha': [0.001,0.1,1, 10, 100, 1000,10000,100000,100000],'normalize':[True,False]}]

Data Analysis with Python Final Exam Answers

Question 1: What does the following command do:
df.dropna(subset=["price"], axis=0)
• Drop the “not a number” from the column price
• Drop the row price
• Rename the data frame price
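
The effect of dropna with subset can be checked on a toy dataframe. The values below are made up. Only rows whose ‘price’ is missing are dropped, and the call returns a new dataframe rather than modifying df.

```python
import numpy as np
import pandas as pd

# Hypothetical rows with missing values in two different columns.
df = pd.DataFrame({
    "make": ["audi", "bmw", "volvo"],
    "price": [13495.0, np.nan, 12940.0],
    "peak-rpm": [np.nan, 5500.0, 5400.0],
})

# Drop rows (axis=0) where 'price' is NaN; other columns are not checked.
cleaned = df.dropna(subset=["price"], axis=0)

print(cleaned)
```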

Question 2: How would you provide many of the summary statistics for all the columns in the dataframe “df”?
• df.describe(include="all")
• df.head()
• type(df)
• df.shape

Question 3: How would you find the shape of the dataframe df?
• df.describe()
• df.head()
• type(df)
• df.shape

Question 4: What task does the command df.to_csv("A.csv") perform?
• change the name of the column to “A.csv”
• load the data from a csv file called “A” into a dataframe
• Save the dataframe df to a csv file called “A.csv”

Question 5: What task does the following line of code perform:
df['peak-rpm'].replace(np.nan, 5, inplace=True)
• replace the not a number values with 5 in the column ‘peak-rpm’
• rename the column ‘peak-rpm’ to 5
• add 5 to the data frame

Question 6: What task does the following line of code perform:
df['peak-rpm'].replace(np.nan, 5, inplace=True)
• replace the not a number values with 5 in the column ‘peak-rpm’
• rename the column ‘peak-rpm’ to 5
• add 5 to the data frame
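
The replacement in Questions 5 and 6 is sketched below on made-up values. It assigns the result back to the column instead of using inplace=True on the selected column, because in newer pandas versions an in-place call on df['peak-rpm'] may act on a temporary copy and leave df unchanged.

```python
import numpy as np
import pandas as pd

# Hypothetical column with two missing entries.
df = pd.DataFrame({"peak-rpm": [5000.0, np.nan, 5500.0, np.nan]})

# Replace NaN with 5 and assign the result back to the column.
df["peak-rpm"] = df["peak-rpm"].replace(np.nan, 5)

print(df["peak-rpm"].tolist())
```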

Question 7: How do you “one hot encode” the column ‘fuel-type’ in the dataframe df?
• pd.get_dummies(df["fuel-type"])
• df.mean(["fuel-type"])
• df[df["fuel-type"]==1]=1

Question 8: What does the vertical axis in a scatter plot represent?
• independent variable
• dependent variable

Question 9: What does the horizontal axis in a scatter plot represent?
• independent variable
• dependent variable

Question 10: If we have 10 columns and 100 samples, how large is the output of df.corr()?
• 10 x 100
• 10 x 10
• 100 x 100
• 100 x 100

Question 11: What is the largest possible element resulting from the following operation: df.corr()?
• 100
• 1000
• 1

Question 12: If the Pearson correlation of two variables is zero:
• the two variables have zero mean
• the two variables are not correlated

Question 13: If the p-value of the Pearson correlation is 1:
• the variables are correlated
• the variables are not correlated
• none of the above

Question 14: What does the following line of code do: lm = LinearRegression()
• fit a regression object lm
• create a linear regression object
• predict a value

Question 15: If the predicted function is:
Yhat = a + b1 X1 + b2 X2 + b3 X3 + b4 X4
The method is:
• Polynomial Regression
• Multiple Linear Regression

Question 16: What steps do the following lines of code perform:
Input = [('scale', StandardScaler()), ('model', LinearRegression())]
pipe = Pipeline(Input)
pipe.fit(Z, y)
ypipe = pipe.predict(Z)
• Standardize the data, then perform a polynomial transform on the features Z
• find the correlation between Z and y
• Standardize the data, then perform a prediction using a linear regression model using the features Z and targets y

Question 17: What is the maximum value of R^2 that can be obtained?
• 10
• 1
• 0

Question 18: We create a polynomial feature as follows: “PolynomialFeatures(degree=2)”. What is the order of the polynomial?
• 0
• 1
• 2

Question 19: You have a linear model. The average R^2 value on your training data is 0.5. You perform a 100th-order polynomial transform on your data, then use these values to train another model, and your average R^2 is 0.99. Which comment is correct?
• 100th-order polynomial will work better on unseen data
• You should always use the simplest model
• the results on your training data are not the best indicator of how your model performs; you should use your test data to get a better idea

Question 20: You train a ridge regression model. You get an R^2 of 1 on your training data and an R^2 of 0 on your validation data. What should you do?
• Nothing, your model performs flawlessly on your test data
• your model is underfitting, perform a polynomial transform
• your model is overfitting, increase the parameter alpha
