Python Practical Questions@Subas

Uploaded by

dhanshreeg367
Python Practical Questions

BCA-1, 2024 Based on eos

S.N.  Practical Subjects  Sign  Grade  Date


1 1. Write a Python program that reads a CSV file and
renames one column of the existing file to
your name.
2. Write a Python program that shows the first
five observations of the renamed column.

2 1. Write a Python program that shows the last
five observations of the renamed column.
2. Write a Python program that shows country,
continent and year (or any three columns) in
your data set.
3 1. Create a range of integers from 0 to 4 inclusive.
2. Create a range of integers from 0 to 50
inclusive.
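Python's built-in range excludes its stop value, so an inclusive upper bound needs a stop of "upper + 1". A minimal sketch (the variable names are only illustrative):

```python
# range(stop) yields 0 .. stop-1, so add 1 to include the upper bound.
zero_to_four = list(range(5))    # 0, 1, 2, 3, 4
zero_to_fifty = list(range(51))  # 0, 1, ..., 50

print(zero_to_four)
print(zero_to_fifty[0], zero_to_fifty[-1], len(zero_to_fifty))
```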
4 1. Write a Python program to get the first
row of the data.
2. Write a Python program to get/read the 7th
row of the data set.
5 1. Write a Python program to get the last row
of the data set.
2. Write a Python program to select and read
the 4th, 7th and 10th rows.
6 1. Write a Python program to get the 2nd row of your
data set.
2. Write a Python program to get the 10th row.

7 Write a Python program using ":" and "loc" or "iloc" to
read a column.
Example: df.loc[:, [columns]]
subset = df.loc[:, ['year', 'pop']]
print(subset.head())
1. Write a Python program to subset rows and
columns using loc.
2. Write a Python program to get the data from the 1st, 3rd,
and 5th rows and from the 1st, 4th and 6th
columns.
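The difference between label-based and position-based selection can be sketched as follows. The small DataFrame here is hypothetical stand-in data (the column names follow the example above; the values are made up), so swap in your own CSV with pd.read_csv.

```python
import pandas as pd

# Hypothetical toy data standing in for the real data set.
df = pd.DataFrame({
    "country": ["A", "B", "C", "D", "E"],
    "continent": ["X", "X", "Y", "Y", "Z"],
    "year": [1952, 1957, 1962, 1967, 1972],
    "lifeExp": [28.8, 30.3, 32.0, 34.0, 36.1],
    "pop": [8, 9, 10, 11, 13],
    "gdp": [779.4, 820.9, 853.1, 836.2, 740.0],
})

# loc is label-based: ":" keeps every row, the list names the columns.
subset = df.loc[:, ["year", "pop"]]

# iloc is position-based and 0-indexed: 1st, 3rd, 5th rows
# and 1st, 4th, 6th columns are positions 0, 2, 4 and 0, 3, 5.
picked = df.iloc[[0, 2, 4], [0, 3, 5]]

print(subset.head())
print(picked)
```

Because iloc counts from zero, the "1st, 3rd and 5th" rows become positions 0, 2 and 4.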
8 1. Write a Python program to get the data from the 1st, 3rd,
and 5th rows and from the 1st, 4th and 6th columns.
2. Write a Python program to print the first 10 rows.
9 Write a Python script that calculates and displays
descriptive statistics (mean, median, standard
deviation, quartiles, etc.) for numerical columns and
value counts for categorical columns in a dataset.

10 Write a Python program that reads a CSV file
containing missing values (e.g., represented by NaN or
empty strings). The program should:
● Identify the types of missing values present.
● Impute missing values using appropriate
techniques (e.g., mean/median for numerical
data, mode for categorical data).
● Optionally, handle outliers before imputation
(e.g., using capping or winsorization).
11 Develop a Python script that analyzes a numerical
column for outliers. It should:
● Calculate descriptive statistics (e.g., mean,
standard deviation, quartiles).
● Identify potential outliers based on methods
like IQR (Interquartile Range) or z-scores.
● Visualize the distribution of the data (e.g.,
boxplots) to inspect for outliers visually.
● Provide options for handling outliers (e.g.,
removal, capping) based on domain knowledge.
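The IQR rule from the list above can be sketched with the standard library alone. This is one possible approach, not the required one: the function name, the sample data and the conventional multiplier k = 1.5 are assumptions, and the "inclusive" quantile method is just one of several quartile conventions.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] and return a capped copy."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in values if v < low or v > high]
    # Capping: pull extreme values back to the nearest fence.
    capped = [min(max(v, low), high) for v in values]
    return outliers, capped, (low, high)

# Hypothetical sample with one obviously extreme value.
data = [10, 12, 11, 13, 12, 14, 11, 95]
outliers, capped, fences = iqr_outliers(data)
print("fences:", fences)
print("outliers:", outliers)
print("capped:", capped)
```

Whether to remove or cap the flagged values remains a domain decision; the function only reports them.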

12 1. Write a Python program that calculates the
harmonic mean of a list of numbers.
2. Create a Python program that computes the
combined mean of two datasets. The function
should take two lists of numbers as input and
return the combined mean using the formula.
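The question does not show the formula itself, so the sketch below assumes the standard definitions: harmonic mean = n / sum(1/x) for positive values, and combined mean = (n1*mean1 + n2*mean2) / (n1 + n2). Function names are illustrative.

```python
def harmonic_mean(values):
    """n divided by the sum of reciprocals; defined here for positive numbers only."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("harmonic mean needs a non-empty list of positive numbers")
    return len(values) / sum(1 / v for v in values)

def combined_mean(first, second):
    """Weighted average of the two group means, weighted by group size."""
    n1, n2 = len(first), len(second)
    if n1 == 0 or n2 == 0:
        raise ValueError("both lists must be non-empty")
    mean1, mean2 = sum(first) / n1, sum(second) / n2
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)

print(harmonic_mean([1, 2, 4]))
print(combined_mean([1, 2, 3], [4, 5]))
```

The combined mean equals the plain mean of the two lists joined together, which is a handy check on the result.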
13 Write a Python program that finds the mode of a list of
numbers. If there are multiple modes, the function
should return all of them. If no mode exists, the
function should return a message indicating that.
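One way to handle the three cases (single mode, several modes, no mode) is to count frequencies first. The sketch assumes that "no mode" means every value occurs exactly once; other conventions exist, and the function name is illustrative.

```python
from collections import Counter

def find_modes(values):
    """Return a sorted list of all modes, or a message when no mode exists."""
    if not values:
        return "No mode exists: the list is empty."
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        # Assumption: if nothing repeats, we report that there is no mode.
        return "No mode exists: every value occurs exactly once."
    return sorted(v for v, c in counts.items() if c == top)

print(find_modes([1, 2, 2, 3, 3]))
print(find_modes([4, 4, 1]))
print(find_modes([1, 2, 3]))
```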
14 Implement a script that calculates both the harmonic
mean and the arithmetic mean of a list of numbers, and
then compares the two. Print both means along with a
message indicating which is greater and the
implications of the comparison.
15 Create a Python program that takes a list of numbers as
input and provides a summary of statistics, including:
• Arithmetic Mean
• Harmonic Mean
• Combined Mean with a second list provided by
the user
• Mode

16 Write a Python program that takes a list of numbers as
input and returns the variance of those numbers.
17 Create a Python program that calculates the standard
deviation of a given list of numbers.
The function should return both the variance and the
standard deviation.
How do you derive standard deviation from variance?
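The standard deviation is the square root of the variance. A minimal sketch, assuming the population formula (divide by n) unless sample=True is passed; the function name and the sample flag are illustrative choices.

```python
import math

def variance_and_std(values, sample=False):
    """Return (variance, standard deviation) for a list of numbers."""
    n = len(values)
    if n < (2 if sample else 1):
        raise ValueError("not enough values")
    mean = sum(values) / n
    squared_devs = sum((v - mean) ** 2 for v in values)
    # Population variance divides by n, sample variance by n - 1.
    variance = squared_devs / (n - 1 if sample else n)
    return variance, math.sqrt(variance)

variance, std = variance_and_std([2, 4, 4, 4, 5, 5, 7, 9])
print(variance, std)
```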
18 Implement a Python program that calculates the
interquartile range (IQR) of a list of numbers.
The IQR is defined as the difference between the 75th
percentile (Q3) and the 25th percentile (Q1).
Use NumPy for this task.
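With NumPy, both percentiles can be requested in one call. A short sketch (the function name and sample data are illustrative; np.percentile uses linear interpolation by default):

```python
import numpy as np

def interquartile_range(values):
    """IQR = Q3 - Q1, using NumPy's default percentile interpolation."""
    q1, q3 = np.percentile(values, [25, 75])
    return float(q3 - q1)

# Hypothetical sample data.
print(interquartile_range([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```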
19 Using Matplotlib and NumPy, write a Python script
that generates a boxplot for a dataset of your choice.
Explain how the boxplot visualizes dispersion and what
insights can be drawn from it.
20 Create a Python function that takes two lists of
numbers and compares their dispersions using both
variance and standard deviation.
Based on the results, explain how you can interpret the
differences in dispersion between the two datasets.
21 Implement a Python function that normalizes or
standardizes a numerical dataset. The function should:
● Understand the difference between
normalization and standardization.
● Apply appropriate scaling techniques (e.g.,
Min-Max scaling, z-score normalization).
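The two scaling techniques differ in what they fix: Min-Max scaling maps the data onto [0, 1], while z-score standardization gives mean 0 and standard deviation 1. A plain-Python sketch, assuming the population standard deviation for the z-scores; function names are illustrative.

```python
import statistics

def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; Min-Max scaling is undefined")
    return [(v - lo) / (hi - lo) for v in values]

def z_score_standardize(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("standard deviation is zero; z-scores are undefined")
    return [(v - mean) / sd for v in values]

data = [10, 20, 30]  # hypothetical sample
print(min_max_scale(data))
print(z_score_standardize(data))
```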

22 Create a Python program that discretizes a continuous
numerical column into bins (categories). The function
could:
● Use equal-width binning or quantile-based
binning.
● Optionally, apply techniques like chi-square
testing to determine the optimal number of
bins.

23 Develop a Python program that creates various
visualizations for a dataset using libraries like
Matplotlib or Seaborn. Examples:
● Histograms for continuous data distribution.
● Scatter plots for relationships between two
variables.
● Boxplots to compare group distributions.
● Pie charts for categorical data proportions.

24 Create a Python program that calculates the correlation
matrix for a dataset. The function should:
● Handle different data types (numerical vs.
categorical).
● Choose appropriate correlation coefficients.
● Visualize the correlation matrix using
heatmaps or other techniques.
25 Write a function that takes a DataFrame and returns a
new DataFrame with missing values replaced using
appropriate methods (e.g., mean, median, mode, or
custom logic).
● Implement different strategies for handling
missing values based on data type (numerical vs.
categorical) and column importance.
26 Create a function to identify outliers in a DataFrame
using techniques like Interquartile Range (IQR) or
standard deviation.
● Provide options to remove outliers, cap them to a
specific value, or transform them (e.g., using log
transformation).
27 Write code to clean inconsistent date formats in a
DataFrame, converting them to a standard format.
● Handle mixed case (uppercase/lowercase) in
categorical columns by converting them to a
consistent format (e.g., lowercase).
28 Implement functions to scale numerical features in a
DataFrame using methods like standardization (z-
score) or normalization (min-max scaling).
● Explain the benefits of scaling and when to use
each method.
29 Create functions to encode categorical features in a
DataFrame using techniques like one-hot encoding or
label encoding.
● Discuss the advantages and disadvantages of each
encoding method.
30 Write code to identify and address data
inconsistencies, such as negative values in columns
that should be positive.
● Perform domain-specific checks (e.g.,
validating email addresses, phone numbers).

31 Write code to calculate summary statistics for
numerical columns (mean, median, standard
deviation) and categorical columns (frequency counts,
proportions).
● Use these statistics to understand the central
tendency, spread, and distribution of data.
32 Create Python visualizations (using libraries like
Matplotlib, Seaborn) to explore data relationships.
Examples:
● Histograms to visualize feature distributions.
● Scatter plots to identify correlations between
features.
● Box plots to compare distributions across groups.
Explain the insights gained from each
visualization.

33 Generate a dataset containing the number of hours
studied by students and their corresponding test scores
for 10 students in Excel.
Write a Python program to perform a simple linear
regression that predicts test scores based on
hours studied.
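Simple linear regression fits score = intercept + slope * hours by least squares. The sketch below computes the two coefficients directly from the closed-form formulas; the ten data points are invented for illustration and stand in for the Excel file, which you would normally load with pandas.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data for 10 students (not taken from any real file).
hours = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
scores = [52, 55, 61, 64, 70, 72, 78, 83, 87, 92]

slope, intercept = fit_line(hours, scores)
predicted = intercept + slope * 6.5  # predicted score for 6.5 hours of study
print(slope, intercept, predicted)
```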

34 Generate a dataset containing house prices, square
footage, number of bedrooms, and age of the house.
Then write a Python program that builds a multiple linear
regression model to predict house prices.

35 Generate a dataset with daily temperatures and
ice cream sales.
Write a Python program to calculate the Pearson
correlation coefficient to determine the strength and
direction of the relationship between temperature and
sales.

36 Generate a dataset with 15 responses for six
independent variables and one dependent variable.
Then fit a regression model to the dataset and
check for multicollinearity using the Variance Inflation Factor (VIF).
Interpret the results.
37 Use a dataset with five categorical variables (e.g., gender,
region) and perform a regression analysis of one
dependent variable on the categorical variables.
38 Write a Python program to calculate the Pearson
correlation coefficient between two lists of numbers.
For example, given two lists:
• List A: [10, 20, 30, 40, 50]
• List B: [15, 25, 35, 45, 55]
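Pearson's r is the covariance of the two lists divided by the product of their standard deviations. A plain-Python sketch using the two lists above (the function name is illustrative):

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("lists must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

list_a = [10, 20, 30, 40, 50]
list_b = [15, 25, 35, 45, 55]
r = pearson_r(list_a, list_b)
print(round(r, 6))
```

Since every value in List B is the matching value in List A plus 5, the relationship is perfectly linear and r comes out as 1 up to floating-point rounding.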
39 Using the Pandas library, create a DataFrame from the
following data and compute the correlation matrix:
Height (cm): [150, 160, 170, 180, 190]
Weight (kg): [50, 60, 70, 80, 90]
Age (years): [20, 25, 30, 35, 40]

40 Write a Python program that generates a scatter plot to
visualize the correlation between two variables using
Matplotlib. Use the following data:
• X: [1, 2, 3, 4, 5]
• Y: [2, 4, 6, 8, 10]
• Explain how the plot illustrates the correlation.

41 Load a real dataset (e.g., from a CSV file) using Pandas.
Calculate and display the correlation between different
numerical columns in the dataset.
For example, use the famous Iris dataset to find the
correlation between sepal length, sepal width, petal
length, and petal width.

42 Write a Python program that calculates the Spearman
correlation coefficient between two lists of ranks. For
example:
• List X: [1, 2, 3, 4, 5]
• List Y: [5, 6, 7, 8, 7]
• Explain the difference between Pearson and
Spearman correlation.
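Spearman's coefficient is Pearson's r applied to ranks, which is why it measures monotonic rather than strictly linear association. List Y contains a tie (7 appears twice), so this sketch assigns tied values their average rank, a common convention assumed here. The helper names are illustrative.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the block of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed data."""
    return pearson_r(average_ranks(x), average_ranks(y))

list_x = [1, 2, 3, 4, 5]
list_y = [5, 6, 7, 8, 7]
rho = spearman_rho(list_x, list_y)
print(round(rho, 4))
```

For these lists the result is about 0.82: Y mostly rises with X, but the final drop from 8 to 7 breaks the perfect monotonic order.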
43 Write a Python program that performs a one-way
ANOVA test on three different groups of data. For
example:
• Group 1: [5, 7, 8, 6, 9]
• Group 2: [10, 12, 11, 13, 12]
• Group 3: [15, 17, 14, 16, 18]
Use the scipy.stats library to perform the test and
interpret the results.
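To see what the test computes, the F statistic can be worked out by hand: the between-group mean square divided by the within-group mean square. This sketch uses only plain Python and does not produce a p-value; for the exercise itself, scipy.stats.f_oneway returns both the statistic and the p-value. The function name is illustrative.

```python
def one_way_anova_f(*groups):
    """Return (F statistic, df between, df within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum(sum((v - sum(g) / len(g)) ** 2 for v in g) for g in groups)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

group1 = [5, 7, 8, 6, 9]
group2 = [10, 12, 11, 13, 12]
group3 = [15, 17, 14, 16, 18]
f_stat, df_b, df_w = one_way_anova_f(group1, group2, group3)
print(round(f_stat, 2), df_b, df_w)
```

A large F (here roughly 48 on 2 and 12 degrees of freedom) means the group means differ far more than the spread inside each group would explain.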
44 Download a real dataset (e.g., the Iris dataset from
sklearn) and perform a one-way ANOVA to compare
the means of different species based on petal length.
Print the ANOVA table and interpret the p-value.

45 Write a Python program to perform a two-way ANOVA.
Create a dataset that includes two categorical variables
(e.g., Treatment Type and Gender) and a numerical
outcome variable (e.g., Test Scores). Use the statsmodels
library to conduct the analysis and interpret the
results.
46 After performing a one-way ANOVA, visualize the
group means and their confidence intervals using a box
plot. Use Matplotlib or Seaborn to create the plot based
on the following data:
• Group A: [20, 21, 22, 19, 23]
• Group B: [25, 26, 24, 27, 26]
• Group C: [30, 29, 31, 32, 30]
Explain how the visualization helps in understanding the
results.
47 Create a dataset for a two-way ANOVA test that
includes two categorical independent variables (e.g.,
"Diet" and "Exercise") and one continuous dependent
variable (e.g., "Weight Loss").
Here's an example dataset:
• Diet A (No Exercise): [2, 3, 4, 5]
• Diet A (Exercise): [5, 6, 7, 8]
• Diet B (No Exercise): [1, 2, 1, 3]
• Diet B (Exercise): [6, 7, 5, 6]
Use the statsmodels library to perform the two-way
ANOVA. Print the ANOVA table and interpret the
results to see if there is a significant effect from either
of the independent variables and their interaction.

48 Write a Python program to calculate the Quartile
Deviation (QD) for a given dataset. Use the following
data:
• Data: [12, 15, 14, 10, 18, 22, 20, 16, 17, 19]
Print the QD along with the first and third quartiles.
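The quartile deviation is half the interquartile range: QD = (Q3 - Q1) / 2. Quartile conventions differ between textbooks and libraries; this sketch assumes the "inclusive" method of statistics.quantiles, which uses the same linear interpolation as NumPy's default percentile. The function name is illustrative.

```python
import statistics

def quartile_deviation(values):
    """Return (Q1, Q3, QD) where QD = (Q3 - Q1) / 2."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, q3, (q3 - q1) / 2

data = [12, 15, 14, 10, 18, 22, 20, 16, 17, 19]
q1, q3, qd = quartile_deviation(data)
print("Q1 =", q1, "Q3 =", q3, "QD =", qd)
```

A different quartile convention (for example the default "exclusive" method) would give slightly different Q1 and Q3, so state which one you used.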
49 Implement a Python program to calculate the Mean
Deviation (MD) of a dataset. Given the data:
• Data: [5, 7, 8, 9, 10]
Calculate and print the Mean Deviation from the mean.
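The mean deviation from the mean is the average absolute distance of each value from the arithmetic mean. A minimal sketch with the data above (the function name is illustrative):

```python
def mean_deviation(values):
    """Average of |x - mean| over all values."""
    if not values:
        raise ValueError("need at least one value")
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)

data = [5, 7, 8, 9, 10]
md = mean_deviation(data)
print(round(md, 2))
```

Here the mean is 7.8 and the absolute deviations add up to 7.2, so the mean deviation is 7.2 / 5 = 1.44.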
50 Implement a Python program to calculate the Mean
Deviation (MD) of a dataset. Given the data:
• Data: [5, 7, 8, 9, 10]
Calculate and print the Mean Deviation from the mean.

51 Create a Python program that computes the Skewness
and Kurtosis of a given dataset. Use the following data:
• Data: [1, 2, 2, 3, 4, 4, 4, 5, 6, 8]
Print the values of skewness and kurtosis and interpret what
these values mean in terms of the distribution
shape.
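Skewness and kurtosis can be computed from central moments: skewness = m3 / m2^1.5 and kurtosis = m4 / m2^2, where mk is the average of (x - mean)^k. The sketch assumes these population (biased) moment formulas and reports excess kurtosis (kurtosis minus 3); libraries may apply bias corrections and give slightly different numbers. The function name is illustrative.

```python
def skewness_and_kurtosis(values):
    """Return (skewness, excess kurtosis) from population central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("all values are equal; moments ratios are undefined")
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3  # 0 for a normal distribution
    return skewness, excess_kurtosis

data = [1, 2, 2, 3, 4, 4, 4, 5, 6, 8]
skew, kurt = skewness_and_kurtosis(data)
print(round(skew, 4), round(kurt, 4))
```

For this data the skewness is positive (about 0.53), pointing to a longer right tail, and the excess kurtosis is negative (about -0.35), pointing to lighter tails than a normal distribution.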

52 Write a Python program that takes a dataset (e.g., from
a CSV file) and computes the Quartile Deviation, Mean
Deviation, Standard Deviation, Variance, Skewness,
and Kurtosis. Print a summary report for the dataset.
You can use the following sample data:
• Sample Data: [5, 10, 15, 20, 25, 30, 35, 40]

Prepared By
