Python

Uploaded by

Tanvir Arefin

Python is a high-level, interpreted programming language known for its simplicity and
readability. Its clear, expressive syntax makes it well suited to statistical computing
and data analysis. Python offers:
• Libraries: Python boasts powerful libraries like NumPy, SciPy, and Pandas, which
provide efficient data structures and functions for numerical computing, scientific
computing, and data manipulation, respectively.
• Data Visualization: Libraries such as Matplotlib, Seaborn, and Plotly enable users
to create visually appealing plots and charts to explore and represent statistical
data effectively.
• Statistical Modeling: Python offers libraries like Statsmodels and Scikit-learn for
statistical modeling, hypothesis testing, regression analysis, machine learning, and
predictive analytics.
• Data Analysis: Pandas, a widely-used library in Python, provides data structures
like DataFrames that facilitate data manipulation, cleaning, and analysis. It allows
statisticians to perform various operations such as filtering, grouping, aggregating,
and merging datasets efficiently.
• Probability Distributions: Python's standard library (e.g., the random module) and
libraries such as scipy.stats support working with probability distributions:
generating random numbers, evaluating cumulative distribution functions (CDFs) and
probability density functions (PDFs), and conducting statistical tests based on
different distributions.
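The probability-distribution bullet above can be illustrated with a minimal sketch. It assumes SciPy is installed and uses the standard normal distribution purely as an example; the seed value is an arbitrary choice.

```python
from scipy import stats

# Standard normal distribution (mean 0, standard deviation 1)
dist = stats.norm(loc=0, scale=1)

pdf_at_zero = dist.pdf(0.0)   # density at x = 0
cdf_at_zero = dist.cdf(0.0)   # P(X <= 0)
# Five reproducible random draws (the seed 0 is an arbitrary choice)
draws = dist.rvs(size=5, random_state=0)

print("PDF at 0:", pdf_at_zero)
print("CDF at 0:", cdf_at_zero)
print("Random draws:", draws)
```

Other distributions in scipy.stats expose the same pdf, cdf, and rvs methods.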

DataFrame: A DataFrame is a two-dimensional labeled data structure in pandas that
resembles a table, where each column can be of a different data type (e.g., integer, float,
string) and is indexed. It can be thought of as a dictionary of Series objects, where each
Series represents a column of the DataFrame. Code-

import pandas as pd

# Create a DataFrame from a dictionary of lists
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}
df = pd.DataFrame(data)
print(df)
Output-
      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
3    David   40      Houston
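The "dictionary of Series" description can be checked directly. The following is a small sketch that reuses the same example data; the variable names ages and column_labels are illustrative only.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}
df = pd.DataFrame(data)

# Selecting one column by its label returns a Series
ages = df['Age']
print(type(ages).__name__)

# Each column keeps its own data type
print(df.dtypes)

# Like a dictionary, the DataFrame can be iterated over as (label, Series) pairs
column_labels = [label for label, column in df.items()]
print(column_labels)
```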
Data Types:
o Integer variable = 10
o Float variable = 3.14
o String variable = "Hello, world!"
o Boolean variable = True
o List variable = [1, 2, 3, 4, 5]
o Tuple variable = (1, 2, 3)
o Dictionary variable = {"key": "value"}
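As a quick check of the types listed above, the built-in type() function reports the class of each value. This is only a sketch; the list name examples is illustrative.

```python
# One example value for each of the types listed above
examples = [10, 3.14, "Hello, world!", True, [1, 2, 3, 4, 5], (1, 2, 3), {"key": "value"}]

# type() returns the class of a value; __name__ gives its short name
type_names = [type(value).__name__ for value in examples]
print(type_names)
```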
Python Modules:
• NumPy: NumPy is the foundational package for numerical computing in Python.
It provides support for arrays, matrices, and a wide range of mathematical
functions to operate on these data structures. Commonly used for statistical
calculations, linear algebra, and random number generation. Code-
import numpy as np
data = np.array([1, 2, 3, 4, 5])
mean = np.mean(data)
std_dev = np.std(data)

• Pandas: Pandas is a powerful library for data manipulation and analysis. It
provides data structures like Series and DataFrame that are ideal for handling
tabular data. Commonly used for data cleaning, transformation, aggregation, and
visualization. Code-
import pandas as pd
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)
mean_A = df['A'].mean()

• SciPy: SciPy builds on NumPy and provides a large number of functions that
operate on NumPy arrays and are useful for scientific and technical computing.
Commonly used for advanced statistical functions, optimization, integration, and
signal processing. Code-
from scipy import stats
data = [1, 2, 3, 4, 5]
t_stat, p_val = stats.ttest_1samp(data, 3)

• Statsmodels: Statsmodels is a module that provides classes and functions for the
estimation of many different statistical models, as well as for conducting statistical
tests and data exploration. Commonly used for linear regression, logistic
regression, and time series analysis. Code-
import statsmodels.api as sm
data = sm.datasets.get_rdataset("mtcars").data
X = data[['mpg', 'hp']]
Y = data['wt']
X = sm.add_constant(X)
model = sm.OLS(Y, X).fit()
print(model.summary())

• Matplotlib: Matplotlib is a plotting library for creating static, interactive, and
animated visualizations in Python. Commonly used for data visualization and for
creating plots and charts. Code-
import matplotlib.pyplot as plt
data = [1, 2, 3, 4, 5]
plt.plot(data)
plt.show()

• Seaborn: Seaborn is built on top of Matplotlib and provides a high-level interface
for drawing attractive and informative statistical graphics. Commonly used for
statistical data visualization and for creating complex plots easily. Code-
import seaborn as sns
import matplotlib.pyplot as plt
data = sns.load_dataset("iris")
sns.pairplot(data, hue="species")
plt.show()
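One detail worth noting about the NumPy example in the module list above: np.std divides by n by default (the population formula), whereas the pandas Series.std method divides by n - 1 by default. A minimal sketch of the difference, using the same five values:

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5])

population_std = np.std(data)        # default ddof=0: divide by n
sample_std = np.std(data, ddof=1)    # ddof=1: divide by n - 1

print("Population standard deviation:", population_std)
print("Sample standard deviation:", sample_std)
```

Here the squared deviations sum to 10, so the two results are the square roots of 10/5 and 10/4 respectively.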
In Python, one can import files using-
CSV- Code:
import pandas as pd
# Read the CSV file into a DataFrame
df = pd.read_csv('file.csv')
# Display the DataFrame
print(df)
Excel- Code:
import pandas as pd
# Read Excel file
df_excel = pd.read_excel('filename.xlsx')
# Display the DataFrame
print(df_excel.head())
STATA- Code:
import pyreadstat
# Read Stata file
df_stata, meta = pyreadstat.read_dta('filename.dta')
# Display the DataFrame
print(df_stata.head())
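The snippets above need real files on disk. To try pd.read_csv without one, a file-like object can stand in for the path. The CSV text below is hypothetical and exists only for illustration.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for 'file.csv'
csv_text = "name,age,city\nAlice,25,New York\nBob,30,Chicago\n"

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)    # (rows, columns)
print(df.dtypes)   # inferred type of each column
print(df.head())
```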
Loops and Conditional Statements:
1. For Loop: for loops in Python are used to iterate over a sequence (such as a list,
tuple, dictionary, set, or string) and execute a block of code for each element in the
sequence.
Code:
for char in "Python":
    print(char)
Output:
P
y
t
h
o
n
2. While Loop: while loops in Python repeatedly execute a block of code as long as
a specified condition is true.
Code:
count = 0
while count < 5:
    print(count)
    count += 1
Output:
0
1
2
3
4
3. If Statement: The if statement is a conditional rather than a loop. It checks a
condition, and if the condition is True, it executes the corresponding block of code.
Code:
x = 10
if x > 5:
    print("x is greater than 5")
Output: x is greater than 5
i) if-else Statement: The if-else statement adds an alternative block of code
that is executed if the condition is False.
Code:
x = 3
if x > 5:
    print("x is greater than 5")
else:
    print("x is not greater than 5")
Output: x is not greater than 5
ii) if-elif-else Statement: The if-elif-else statement allows you to check
multiple conditions sequentially. The first condition that evaluates to True
will have its block executed.
Code:
x = 7
if x > 10:
    print("x is greater than 10")
elif x > 5:
    print("x is greater than 5 but less than or equal to 10")
else:
    print("x is 5 or less")
Output: x is greater than 5 but less than or equal to 10
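Loops and conditionals are often combined. The sketch below applies the same if-elif-else test to each element of a list; the names values and labels are illustrative only.

```python
values = [3, 7, 12]
labels = []

# The for loop visits each value; if-elif-else picks exactly one branch per value
for x in values:
    if x > 10:
        labels.append("greater than 10")
    elif x > 5:
        labels.append("greater than 5 but at most 10")
    else:
        labels.append("5 or less")

# enumerate yields (position, item) pairs while looping
for position, label in enumerate(labels):
    print(position, values[position], label)
```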

• Input the variable ex = 'Python is an interesting and useful language for
numerical computing!' Using slicing, extract the text strings below. Note: There
are multiple answers for all of the problems.
(a) Python
(b) !
(c) computing
(d) in
(e) !gnitupmoc laciremun rof egaugnal lufesu dna gnitseretni na si nohtyP
(Reversed)
(f) nohtyP
(g) Pto sa neetn n sfllnug o ueia optn!
Ans: Code:
ex = 'Python is an interesting and useful language for numerical computing!'
# (a) Python
a1 = ex[0:6] # Using the index range
a2 = ex[:6] # Omitting the start index, which defaults to 0
# (b) !
b1 = ex[-1] # Using negative indexing to get the last character
# (c) computing
c1 = ex[-10:-1] # Negative indices; stopping at -1 excludes the final '!'
# (d) in
d1 = ex[13:15] # The "in" that starts "interesting", using positive indices
d2 = ex[-56:-54] # The same two characters, using negative indices
# (e) !gnitupmoc laciremun rof egaugnal lufesu dna gnitseretni na si nohtyP (Reversed)
e = ex[::-1] # Using the step parameter with -1 to reverse the entire string
# (f) nohtyP
f1 = ex[5::-1] # Reversing the first 6 characters
f2 = ex[5: -1 * len(ex) - 1 : -1] # Reversing using negative indices

# (g) Pto sa neetn n sfllnug o ueia optn!
g = ex[::2] # Using a step of 2 to get every second character
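Two facts make slices like these easier to verify: a negative index i refers to position len(ex) + i, and str.find reports where a substring first occurs. A short sketch (variable names are illustrative):

```python
ex = 'Python is an interesting and useful language for numerical computing!'
n = len(ex)

# A negative index i refers to position n + i
last_char_matches = ex[-1] == ex[n - 1]
word_matches = ex[-10:-1] == ex[n - 10:n - 1]

# str.find returns the index of the first occurrence of a substring
start = ex.find('in')
first_in = ex[start:start + 2]

print(n, last_char_matches, word_matches)
print(start, first_in)
```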
• Construct a nested list to hold the array
$\begin{bmatrix} 1 & 0.5 \\ 0.5 & 1 \end{bmatrix}$
so that item [i][j] corresponds to the position in the array (remember that
Python uses 0 indexing).
Ans: Code:
# Define the nested list to hold the array
array = [
[1, 0.5],
[0.5, 1]
]
# Print the nested list
print(array)
# Accessing elements to verify
print("Element at [0][0]:", array[0][0])
print("Element at [0][1]:", array[0][1])
print("Element at [1][0]:", array[1][0])
print("Element at [1][1]:", array[1][1])
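For comparison, the same data can be held in a NumPy array, which accepts both chained and comma-separated indices. A minimal sketch, assuming NumPy is available:

```python
import numpy as np

nested = [[1, 0.5],
          [0.5, 1]]
arr = np.array(nested)

# A nested list needs one pair of brackets per level
from_list = nested[0][1]
# A NumPy array also accepts a single bracket with comma-separated indices
from_array = arr[0, 1]

print(from_list, from_array, arr.shape)
```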
• Input the following mathematical expressions into Python as arrays.

Code:
import numpy as np
# Vector u
u = np.array([1, 1, 2, 3, 5, 8])
# Column vector v
v = np.array([[1], [1], [2], [3], [5], [8]])
# Identity matrix x
x = np.array([[1, 0], [0, 1]])
# Matrix y
y = np.array([[1, 2], [3, 4]])
# Matrix z
z = np.array([
[1, 2, 1, 2],
[3, 4, 3, 4],
[1, 2, 1, 2]
])
# Matrix w: a block matrix with x repeated across the top row
# and y repeated across the bottom row
w = np.block([
[x, x],
[y, y]
])
# Print the arrays to verify
print("u:", u)
print("v:\n", v)
print("x:\n", x)
print("y:\n", y)
print("z:\n", z)
print("w:\n", w)
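The difference between the vector u and the column vector v shows up in their shapes. The sketch below also shows one way to turn a 1-D array into a column; the name u_as_column is illustrative.

```python
import numpy as np

u = np.array([1, 1, 2, 3, 5, 8])                 # shape (6,)
v = np.array([[1], [1], [2], [3], [5], [8]])     # shape (6, 1)

print(u.shape, v.shape)

# reshape(-1, 1) turns the 1-D vector into a column; -1 lets NumPy infer the length
u_as_column = u.reshape(-1, 1)
same = np.array_equal(u_as_column, v)
print(same)
```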
• What command would select x from w?
Code:
import numpy as np
# Identity matrix x
x = np.array([[1, 0], [0, 1]])
# Matrix y
y = np.array([[1, 2], [3, 4]])
# Constructing matrix w
w = np.block([
[x, x],
[y, y]
])
# Selecting x from w (top left 2x2 submatrix)
x_from_w = w[:2, :2]
# Print the extracted x to verify
print("w:\n", w)
print("Extracted x from w:\n", x_from_w)
Output:
w:
 [[1 0 1 0]
 [0 1 0 1]
 [1 2 1 2]
 [3 4 3 4]]
Extracted x from w:
 [[1 0]
 [0 1]]
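A point to keep in mind when selecting blocks this way: a basic slice of a NumPy array is a view, so writing to it also changes the original array. The sketch below contrasts a view with an explicit copy.

```python
import numpy as np

x = np.array([[1, 0], [0, 1]])
y = np.array([[1, 2], [3, 4]])
w = np.block([[x, x], [y, y]])

view = w[:2, :2]            # basic slice: shares memory with w
copy = w[:2, :2].copy()     # independent copy

view[0, 0] = 99             # also changes w[0, 0]
copy[1, 1] = -1             # leaves w untouched

print(w[0, 0], w[1, 1])
print(np.shares_memory(w, view), np.shares_memory(w, copy))
```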
• What command would select [x′ y′]′ from w? Is there more than one? If there
are, list all alternatives.
Code:
import numpy as np
# Identity matrix x
x = np.array([[1, 0], [0, 1]])
# Matrix y
y = np.array([[1, 2], [3, 4]])
# Constructing matrix w
w = np.block([
[x, x],
[y, y]
])
# Selecting [x' y']' (x stacked on top of y) from w: the first two columns
xy_prime = w[:, :2]
# Print the extracted [x' y']' to verify
print("w:\n", w)
print("Extracted [x' y']' from w:\n", xy_prime)
Output:
w:
 [[1 0 1 0]
 [0 1 0 1]
 [1 2 1 2]
 [3 4 3 4]]
Extracted [x' y']' from w:
 [[1 0]
 [0 1]
 [1 2]
 [3 4]]

Alternatives: because w repeats each block, column 0 equals column 2 and column 1
equals column 3, so several selections return the same result.

1. Slicing the last two columns:

xy_prime = w[:, 2:]

2. Using advanced indexing with any matching column pair:

xy_prime = w[:, [0, 1]]  # also [0, 3], [2, 1], or [2, 3]

3. Stacking the two left-hand blocks explicitly:

xy_prime = np.vstack((w[:2, :2], w[2:, :2]))
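The question of how many selections work can also be answered by brute force. The sketch below reads [x′ y′]′ as x stacked on top of y and tries every ordered pair of column indices; this enumeration is an illustration, not part of the original exercise.

```python
from itertools import product
import numpy as np

x = np.array([[1, 0], [0, 1]])
y = np.array([[1, 2], [3, 4]])
w = np.block([[x, x], [y, y]])

# Target: x stacked on top of y (a 4 by 2 array)
target = np.vstack([x, y])

# Try every ordered pair of column indices and keep those that match the target
matches = [(a, b) for a, b in product(range(w.shape[1]), repeat=2)
           if np.array_equal(w[:, [a, b]], target)]
print(matches)
```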

• What command would select y from z?
Code:
import numpy as np
# Define matrix z
z = np.array([
[1, 2, 1, 2],
[3, 4, 3, 4],
[1, 2, 1, 2]
])
# Selecting y from z (first 2 rows and first 2 columns)
y_from_z = z[:2, :2]
# Print the extracted y to verify
print("z:\n", z)
print("Extracted y from z:\n", y_from_z)
• Compute the values (x+y)**2 and x**2+x*y+y*x+y**2. Are they the same? If
not, why not? How could you define these to be the same?
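Ans: Let's compute the values $(x + y)^2$ and $x^2 + x \cdot y + y \cdot x + y^2$ and compare them. We'll
use NumPy to perform these matrix operations. Note that on NumPy arrays the operators
** and * act element by element, so the code uses np.dot to form matrix products.

Explanation

Under matrix multiplication, the two expressions $(x + y)^2$ and $x^2 + x \cdot y + y \cdot x + y^2$ are
the same. This is because matrix multiplication distributes over matrix addition, just as
in scalar arithmetic. Here's why:

1. $(x + y)^2$ expands to $(x + y) \cdot (x + y)$.
2. Distributing the multiplication gives $(x + y) \cdot (x + y) = x \cdot x + x \cdot y + y \cdot x + y \cdot y$.
3. This is exactly $x^2 + x \cdot y + y \cdot x + y^2$.
4. Thus both expressions yield the same result for any two square matrices of the same
size. The cross terms must stay as $x \cdot y + y \cdot x$ rather than $2 \cdot x \cdot y$, because matrix
multiplication is not commutative in general.

First, let's define x and y: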

Code:
import numpy as np
# Define the matrices x and y
x = np.array([[1, 0], [0, 1]], dtype=int)
y = np.array([[1, -2], [-3, 4]], dtype=int)
# Compute (x + y) ** 2
sum_xy = x + y
sum_xy_squared = np.dot(sum_xy, sum_xy)
# Compute x ** 2 + x * y + y * x + y ** 2
x_squared = np.dot(x, x)
y_squared = np.dot(y, y)
xy = np.dot(x, y)
yx = np.dot(y, x)
expression = x_squared + xy + yx + y_squared
# Print the results
print("(x + y) ** 2:\n", sum_xy_squared)
print("x ** 2 + x * y + y * x + y ** 2:\n", expression)
# Check if they are the same
are_same = np.array_equal(sum_xy_squared, expression)
print("Are they the same?", are_same)
Output:
(x + y) ** 2:
 [[ 10 -14]
 [-21  31]]
x ** 2 + x * y + y * x + y ** 2:
 [[ 10 -14]
 [-21  31]]
Are they the same? True
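To make the distinction between element-wise and matrix operations concrete, the sketch below computes the square both ways for the same x and y. It uses the @ operator and np.linalg.matrix_power as alternatives to np.dot.

```python
import numpy as np

x = np.array([[1, 0], [0, 1]])
y = np.array([[1, -2], [-3, 4]])
s = x + y

elementwise_square = s ** 2                       # squares each entry separately
matrix_square = np.linalg.matrix_power(s, 2)      # matrix product s @ s

# Matrix version of the expanded expression, using the @ operator
expanded = x @ x + x @ y + y @ x + y @ y

print(elementwise_square)
print(matrix_square)
print(np.array_equal(matrix_square, expanded))
```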
Code:
import numpy as np
# Define the array x
x = np.arange(12.0)
# Use reshape to produce different versions of the array
versions = [(1, 12), (2, 6), (3, 4), (4, 3), (6, 2), (2, 2, 3)]
for shape in versions:
    print("Shape:", shape)
    print(np.reshape(x, shape))
# np.reshape returns a new array object, so x itself is still 1-D at this point.
# Reshaping any version to (12,) returns it to the original size.
x = np.reshape(np.reshape(x, (3, 4)), (12,))
print("Original shape of x:", x.shape)
print(x)
Output:
Shape: (1, 12)
[[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11.]]
Shape: (2, 6)
[[ 0.  1.  2.  3.  4.  5.]
 [ 6.  7.  8.  9. 10. 11.]]
Shape: (3, 4)
[[ 0.  1.  2.  3.]
 [ 4.  5.  6.  7.]
 [ 8.  9. 10. 11.]]
Shape: (4, 3)
[[ 0.  1.  2.]
 [ 3.  4.  5.]
 [ 6.  7.  8.]
 [ 9. 10. 11.]]
Shape: (6, 2)
[[ 0.  1.]
 [ 2.  3.]
 [ 4.  5.]
 [ 6.  7.]
 [ 8.  9.]
 [10. 11.]]
Shape: (2, 2, 3)
[[[ 0.  1.  2.]
  [ 3.  4.  5.]]

 [[ 6.  7.  8.]
  [ 9. 10. 11.]]]
Original shape of x: (12,)
[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11.]
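The code comment above mentions both shape and reshape. As a sketch of the difference: reshape returns a new array object, while assigning to the shape attribute changes the array in place. The NumPy documentation discourages the in-place form, so treat it as illustrative only.

```python
import numpy as np

x = np.arange(12.0)

# reshape returns a new array object and leaves x.shape alone
reshaped = x.reshape(3, 4)
shape_after_reshape = x.shape

# Assigning to the shape attribute changes x in place
x.shape = (2, 6)
shape_after_assignment = x.shape

# -1 asks NumPy to infer that dimension; reshape(-1) gives a 1-D array again
flat_again = x.reshape(-1)

print(reshaped.shape, shape_after_reshape, shape_after_assignment, flat_again.shape)
```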
Code:
import numpy as np
# Define the array x
x = np.reshape(np.arange(12.0), (4, 3))
# Use ravel, flatten, and flat to extract elements 1, 3, ..., 11
from_ravel = x.ravel()[1::2]
from_flatten = x.flatten()[1::2]
from_flat = x.flat[1::2]
print("Using ravel:", from_ravel)
print("Using flatten:", from_flatten)
print("Using flat:", from_flat)
Output:
Using ravel: [ 1.  3.  5.  7.  9. 11.]
Using flatten: [ 1.  3.  5.  7.  9. 11.]
Using flat: [ 1.  3.  5.  7.  9. 11.]
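ravel and flatten give the same values but differ in memory behavior: ravel returns a view when it can, while flatten always returns a copy. A minimal sketch of the consequence:

```python
import numpy as np

x = np.reshape(np.arange(12.0), (4, 3))

raveled = x.ravel()       # a view here, because x is contiguous in memory
flattened = x.flatten()   # always a copy

raveled[0] = -1.0         # writes through to x
flattened[1] = -2.0       # does not affect x

print(x[0, 0], x[0, 1])
print(np.shares_memory(x, raveled), np.shares_memory(x, flattened))
```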

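Code:
import numpy as np
# Define arrays x, y, and z
x = np.array([[1, 2], [3, 4]])
y = np.array([[5]])
z = np.array([[6, 7], [8, 9], [10, 11]])
# Construct w using hstack, vstack, and tile.
# The target layout of w is not reproduced in this document, so the 6 by 5 layout
# below is an assumption; every row block must have the same number of columns.
# Top block (2 by 5): x next to a 2 by 3 tile of y
w_top = np.hstack([x, np.tile(y, (2, 3))])
# Middle block (3 by 5): z next to z' stacked on a 1 by 3 tile of y
w_middle = np.hstack([z, np.vstack([np.transpose(z), np.tile(y, (1, 3))])])
# Bottom block (1 by 5): a row of y
w_bottom = np.tile(y, (1, 5))
w = np.vstack([w_top, w_middle, w_bottom])
print("Array w:")
print(w)
Output:
Array w:
[[ 1  2  5  5  5]
 [ 3  4  5  5  5]
 [ 6  7  6  8 10]
 [ 8  9  7  9 11]
 [10 11  5  5  5]
 [ 5  5  5  5  5]]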

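Code:
import numpy as np
# Define the array x
x = np.reshape(np.arange(12.0), (2, 2, 3))
# Use squeeze on x; squeeze only removes axes of length 1, and x has none,
# so the shape is unchanged
x_squeezed = np.squeeze(x)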
print("Original shape of x:", x.shape)
print("Shape of x after squeezing:", x_squeezed.shape)
Output:
Original shape of x: (2, 2, 3)
Shape of x after squeezing: (2, 2, 3)
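squeeze has no effect above because the array has no axes of length 1. The sketch below uses a hypothetical array that does have such axes, to show what squeeze removes.

```python
import numpy as np

a = np.arange(12.0).reshape(1, 3, 1, 4)

all_removed = np.squeeze(a)            # drops every length-1 axis
first_removed = np.squeeze(a, axis=0)  # drops only axis 0

print(a.shape, all_removed.shape, first_removed.shape)
```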
Code:
import numpy as np
# Define the array y
y = np.array([[2, 0.5], [0.5, 4]])
# Construct the diagonal array containing the diagonal elements of y
diagonal_array = np.diag(np.diag(y))
print("Diagonal array:")
print(diagonal_array)
Output:
Diagonal array:
[[2. 0.]
 [0. 4.]]
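The nested call works because np.diag behaves differently depending on its input: given a 2-D array it extracts the diagonal, and given a 1-D array it builds a diagonal matrix. A short sketch of the two steps taken separately:

```python
import numpy as np

y = np.array([[2, 0.5], [0.5, 4]])

diagonal_values = np.diag(y)                 # 2-D input: extract the diagonal as 1-D
diagonal_matrix = np.diag(diagonal_values)   # 1-D input: build a diagonal matrix
off_diagonal = y - diagonal_matrix           # what remains once the diagonal is removed

print(diagonal_values)
print(diagonal_matrix)
print(off_diagonal)
```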

Code:
import numpy as np
# Define the array y
y = np.array([[2, 0.5], [0.5, 4]])
# Compute the eigenvalues and eigenvectors of y
eigenvalues, eigenvectors = np.linalg.eig(y)
# Construct the diagonal array containing the reciprocals of the eigenvalues (D^-1)
D_inv = np.diag(1 / eigenvalues)
# Compute V D^-1 V'; note that V D V' would reproduce y itself, not its inverse
VDinvV_prime = np.dot(np.dot(eigenvectors, D_inv), np.transpose(eigenvectors))
# Verify that V D^-1 V' equals the inverse of y
inverse_y = np.linalg.inv(y)
print("V D^-1 V' equals the inverse of y:", np.allclose(VDinvV_prime, inverse_y))
Output:
V D^-1 V' equals the inverse of y: True
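For a symmetric matrix such as y, np.linalg.eigh is an alternative to np.linalg.eig. The sketch below checks the two standard identities for a symmetric matrix with orthonormal eigenvectors V and eigenvalue matrix D: V D V' reproduces y, and V D^-1 V' reproduces the inverse of y.

```python
import numpy as np

y = np.array([[2, 0.5], [0.5, 4]])

# eigh is intended for symmetric matrices; eigenvalues come back in ascending order
eigenvalues, V = np.linalg.eigh(y)

reconstructed_y = V @ np.diag(eigenvalues) @ V.T
reconstructed_inverse = V @ np.diag(1 / eigenvalues) @ V.T

matches_y = np.allclose(reconstructed_y, y)
matches_inverse = np.allclose(reconstructed_inverse, np.linalg.inv(y))
orthonormal = np.allclose(V.T @ V, np.eye(2))  # columns of V are orthonormal

print(matches_y, matches_inverse, orthonormal)
```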

Code:
import numpy as np
from numpy.random import randn
# Simulate data
x = randn(100, 2)
e = randn(100, 1)
B = np.array([[1], [0.5]])
y = np.dot(x, B) + e
# Use lstsq to estimate beta from x and y
beta_estimated, residuals, rank, singular_values = np.linalg.lstsq(x, y, rcond=None)
print("Estimated beta:")
print(beta_estimated)
Output (the exact values vary from run to run because the data are random):
Estimated beta:
[[1.00699029]
[0.50443957]]
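A reproducible variant can help when comparing results. The sketch below seeds a random generator (the seed 0 is an arbitrary choice) and cross-checks lstsq against the normal equations (x'x) beta = x'y.

```python
import numpy as np

rng = np.random.default_rng(0)  # the seed 0 is an arbitrary choice
x = rng.standard_normal((100, 2))
e = rng.standard_normal((100, 1))
B = np.array([[1], [0.5]])
y = x @ B + e

beta_lstsq, residuals, rank, singular_values = np.linalg.lstsq(x, y, rcond=None)
# Normal equations: solve (x'x) beta = x'y
beta_normal = np.linalg.solve(x.T @ x, x.T @ y)

print(beta_lstsq.ravel())
print(np.allclose(beta_lstsq, beta_normal))
```

Both approaches should agree here; lstsq is generally the safer choice when x is close to rank deficient.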
Code:
import numpy as np
# Define the array y (each row sums to zero, so y is singular)
y = np.array([[5, -1.5, -3.5],
              [-1.5, 2, -0.5],
              [-3.5, -0.5, 4]])
# Determine the rank of y
rank_y = np.linalg.matrix_rank(y)
# Compute the eigenvalues of y
eigenvalues_y, _ = np.linalg.eig(y)
# Compute the determinant of y
det_y = np.linalg.det(y)
print("Rank of y:", rank_y)
print("Eigenvalues of y:", eigenvalues_y)
print("Determinant of y:", det_y)
Output (up to floating-point rounding; the order of the eigenvalues may differ):
Rank of y: 2
Eigenvalues of y: approximately 8.1458, 2.8542, and 0
Determinant of y: approximately 0
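The singularity of y can be seen without computing a determinant: every row sums to zero, so y maps the vector of ones to the zero vector. The sketch below checks this and compares the eigenvalues with the roots of the characteristic polynomial, λ(λ² − 11λ + 23.25) = 0.

```python
import numpy as np

y = np.array([[5, -1.5, -3.5],
              [-1.5, 2, -0.5],
              [-3.5, -0.5, 4]])

row_sums = y.sum(axis=1)      # each row adds up to zero
product = y @ np.ones(3)      # so y maps the vector of ones to zero

rank_y = np.linalg.matrix_rank(y)
# eigvalsh is for symmetric matrices and returns eigenvalues in ascending order
eigenvalues = np.linalg.eigvalsh(y)
# Roots of the characteristic polynomial, in ascending order
expected = np.array([0.0, (11 - 28 ** 0.5) / 2, (11 + 28 ** 0.5) / 2])

print(row_sums, rank_y)
print(np.allclose(eigenvalues, expected))
```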
