Python Cheat Sheet For Excel Users

Python Cheat Sheet

Python | Pandas
Data Analysis
Data Visualization

Artificial Corner
Python Basics Cheat Sheet

Here you will find all the Python core concepts you need to know before learning any third-party library.

Data Types

Integers (int): 1
Float (float): 1.2
String (str): "Hello World"
Boolean: True/False
List: [value1, value2]
Dictionary: {key1:value1, key2:value2, ...}

Numeric Operators

+    Addition
-    Subtraction
*    Multiplication
/    Division
**   Exponent
%    Modulus
//   Floor division

Comparison Operators

==   Equal to
!=   Different
>    Greater than
<    Less than
>=   Greater than or equal to
<=   Less than or equal to

String methods

string.upper(): converts to uppercase
string.lower(): converts to lowercase
string.title(): converts to title case
string.count('l'): counts how many times "l" appears
string.find('h'): position of the first occurrence of "h"
string.replace('o', 'u'): replaces "o" with "u"

Variables

Variable assignment:
message_1 = "I'm learning Python"
message_2 = "and it's fun!"

String concatenation (+ operator):
message_1 + ' ' + message_2

String concatenation (f-string):
f'{message_1} {message_2}'

List

Creating a list:
countries = ['United States', 'India',
             'China', 'Brazil']

Create an empty list:
my_list = []

Indexing:
>>> countries[0]
'United States'
>>> countries[3]
'Brazil'
>>> countries[-1]
'Brazil'

Slicing:
>>> countries[0:3]
['United States', 'India', 'China']
>>> countries[1:]
['India', 'China', 'Brazil']
>>> countries[:2]
['United States', 'India']

Adding elements to a list:
countries.append('Canada')
countries.insert(0,'Canada')

Nested list:
nested_list = [countries, countries_2]

Remove element:
countries.remove('United States')
countries.pop(0)  # removes and returns value
del countries[0]

Creating a new list:
numbers = [4, 3, 10, 7, 1, 2]

Sorting a list (sort() works in place and returns None):
>>> numbers.sort()
>>> numbers
[1, 2, 3, 4, 7, 10]
>>> numbers.sort(reverse=True)
>>> numbers
[10, 7, 4, 3, 2, 1]

Update value on a list:
>>> numbers[0] = 1000
>>> numbers
[1000, 7, 4, 3, 2, 1]

Copying a list:
new_list = countries[:]
new_list_2 = countries.copy()

Built-in Functions

Print an object:
print("Hello World")

Return the length of x:
len(x)

Return the minimum value:
min(x)

Return the maximum value:
max(x)

Return a sequence of numbers:
range(x1,x2,n)  # from x1 up to, but not including, x2 (increments by n)

Convert x to a string:
str(x)

Convert x to an integer/float:
int(x)
float(x)

Convert x to a list:
list(x)
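A few of the list and built-in function entries above are easy to misread, so here is a minimal runnable sketch. It reuses the `numbers` list from the sheet; the names `result`, `backup` and `evens` are only illustrative. It shows that `sort()` changes the list in place and returns `None`, that a slice copy is independent of the original, and that `range()` stops before its second argument.

```python
# Sample data mirrors the `numbers` list used in the cheat sheet.
numbers = [4, 3, 10, 7, 1, 2]

# list.sort() sorts in place and returns None, so inspect the list afterwards.
result = numbers.sort()

# A slice copy is a separate list: later changes to `numbers` do not affect it.
backup = numbers[:]
numbers[0] = 1000

# range() stops before its second argument.
evens = list(range(0, 10, 2))

print(result)   # None
print(numbers)  # [1000, 2, 3, 4, 7, 10]
print(backup)   # [1, 2, 3, 4, 7, 10]
print(evens)    # [0, 2, 4, 6, 8]
print(len(numbers), min(numbers), max(numbers))  # 6 2 1000
```

If a sorted copy is wanted without touching the original, the built-in `sorted(numbers)` returns a new list instead.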
Dictionary

Creating a dictionary:
my_data = {'name':'Frank', 'age':26}

Create an empty dictionary:
my_dict = {}

Get value of key "name":
>>> my_data["name"]
'Frank'

Get the keys:
>>> my_data.keys()
dict_keys(['name', 'age'])

Get the values:
>>> my_data.values()
dict_values(['Frank', 26])

Get the pair key-value:
>>> my_data.items()
dict_items([('name', 'Frank'), ('age', 26)])

Adding/updating items in a dictionary:
my_data['height']=1.7
my_data.update({'height':1.8,
                'languages':['English', 'Spanish']})
>>> my_data
{'name': 'Frank',
 'age': 26,
 'height': 1.8,
 'languages': ['English', 'Spanish']}

Remove an item:
my_data.pop('height')
del my_data['languages']
my_data.clear()  # removes all items

Copying a dictionary:
new_dict = my_data.copy()

If Statement

Conditional test:
if <condition>:
    <code>
elif <condition>:
    <code>
...
else:
    <code>

Example:
if age>=18:
    print("You're an adult!")

Conditional test with list:
if <value> in <list>:
    <code>

Loops

For loop:
for <variable> in <list>:
    <code>

For loop and enumerate list elements:
for i, element in enumerate(<list>):
    <code>

For loop and obtain dictionary elements:
for key, value in my_dict.items():
    <code>

While loop:
while <condition>:
    <code>

Data Validation

Try-except:
try:
    <code>
except <error>:
    <code>

Loop control statement:
break: stops loop execution
continue: jumps to next iteration
pass: does nothing

Functions

Create a function:
def function(<params>):
    <code>
    return <data>

Modules

Import module:
import module
module.method()

OS module:
import os
os.getcwd()
os.listdir()
os.makedirs(<path>)

Special Characters

#    Comment
\n   New Line

Boolean Operators

Python   Pandas   Meaning
and      &        logical AND
or       |        logical OR
not      ~        logical NOT
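As a small combined sketch of the dictionary, loop, try-except and loop control entries above, the snippet below walks over a dictionary, validates each value, and skips the ones that cannot be converted. The `raw_ages` data and the `adults` list are hypothetical and exist only for this illustration.

```python
# Hypothetical sample data: raw age strings keyed by name.
raw_ages = {'Frank': '26', 'Ana': 'unknown', 'Li': '17', 'Sam': '41'}

adults = []
for name, text in raw_ages.items():
    try:
        age = int(text)      # raises ValueError for non-numeric text
    except ValueError:
        continue             # skip invalid entries and move to the next item
    if age >= 18:
        adults.append(name)

print(adults)  # ['Frank', 'Sam']
```

Catching the specific `ValueError` rather than every exception keeps unrelated bugs visible.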
Made by Frank Andrade: artificialcorner.com
Pandas Cheat Sheet

Pandas provides data analysis tools for Python. All of the following code examples refer to the dataframe below.

df =
         col1  col2
    A       1     4
    B       2     5
    C       3     6

axis 0 runs down the rows (A, B, C); axis 1 runs across the columns (col1, col2).

Getting Started

Import pandas:
import pandas as pd

Create a series:
s = pd.Series([1, 2, 3],
              index=['A', 'B', 'C'],
              name='col1')

Create a dataframe:
data = [[1, 4], [2, 5], [3, 6]]
index = ['A', 'B', 'C']
df = pd.DataFrame(data, index=index,
                  columns=['col1', 'col2'])

Read a csv file with pandas:
df = pd.read_csv('filename.csv')

Advanced parameters:
df = pd.read_csv('filename.csv', sep=',',
                 names=['col1', 'col2'],
                 index_col=0,
                 encoding='utf-8',
                 nrows=3)

Selecting rows and columns

Select single column:
df['col1']

Select multiple columns:
df[['col1', 'col2']]

Show first n rows:
df.head(2)

Show last n rows:
df.tail(2)

Select rows by index values:
df.loc['A']
df.loc[['A', 'B']]

Select rows by position:
df.iloc[1]
df.iloc[1:]

Data wrangling

Filter by value:
df[df['col1'] > 1]

Sort by one column:
df.sort_values('col1')

Sort by columns:
df.sort_values(['col1', 'col2'],
               ascending=[False, True])

Identify duplicate rows:
df.duplicated()

Identify unique values in a column:
df['col1'].unique()

Swap rows and columns:
df = df.transpose()
df = df.T

Drop a column:
df = df.drop('col1', axis=1)

Clone a data frame:
clone = df.copy()

Concatenate multiple dataframes vertically:
df2 = df + 5  # new dataframe
pd.concat([df,df2])

Concatenate multiple dataframes horizontally:
df3 = pd.DataFrame([[7],[8], [9]],
                   index=['A','B', 'C'],
                   columns=['col3'])
pd.concat([df,df3], axis=1)

Merging (by default merge() joins on the column names both dataframes share;
df and df3 above share none, so with these two frames use the index-based form):

Only merge complete rows (INNER JOIN):
df.merge(df3)

Left dataframe stays complete (LEFT OUTER JOIN):
df.merge(df3, how='left')

Right dataframe stays complete (RIGHT OUTER JOIN):
df.merge(df3, how='right')

Preserve all values (OUTER JOIN):
df.merge(df3, how='outer')

Merge rows by index:
df.merge(df3, left_index=True,
         right_index=True)

Fill NaN values:
df.fillna(0)

Apply your own function:
def func(x):
    return 2**x
df.apply(func)

Arithmetics and statistics

Add to all values:
df + 10

Sum over columns:
df.sum()

Cumulative sum over columns:
df.cumsum()

Mean over columns:
df.mean()

Standard deviation over columns:
df.std()

Count how often each value occurs:
df['col1'].value_counts()

Summarize descriptive statistics:
df.describe()
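To make the join types listed above concrete, here is a minimal sketch that merges on the index. The `df` frame is the one from the sheet. The `other` frame is a hypothetical stand-in, chosen so that the two indexes overlap only partly and the join types give visibly different results.

```python
import pandas as pd

# The reference dataframe from the cheat sheet.
df = pd.DataFrame([[1, 4], [2, 5], [3, 6]],
                  index=['A', 'B', 'C'],
                  columns=['col1', 'col2'])

# Hypothetical right-hand frame: overlaps df on 'B' and 'C', adds 'D'.
other = pd.DataFrame([[8], [9], [10]],
                     index=['B', 'C', 'D'],
                     columns=['col3'])

# Inner join keeps only index labels present in both frames (B, C).
inner = df.merge(other, left_index=True, right_index=True)

# Left join keeps every row of df (A, B, C); A has no match, so col3 is NaN.
left = df.merge(other, how='left', left_index=True, right_index=True)

# Outer join keeps every label from both frames (A, B, C, D).
outer = df.merge(other, how='outer', left_index=True, right_index=True)

# fillna(0) replaces the NaN values left by unmatched rows.
filled = outer.fillna(0)

print(inner)
print(left)
print(filled)
```

Unmatched rows turn integer columns into floats, because NaN is a floating-point value.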
Hierarchical indexing

Create hierarchical index:
df.stack()

Dissolve hierarchical index:
df.unstack()

Aggregation

Create group object:
g = df.groupby('col1')

Iterate over groups:
for i, group in g:
    print(i, group)

Aggregate groups:
g.sum()
g.prod()
g.mean()
g.std()
g.describe()

Select columns from groups:
g['col2'].sum()
g[['col2', 'col3']].sum()

Transform values:
import numpy as np
g.transform(np.log)  # np.log works element-wise; math.log only accepts single numbers

Apply a list function on each group:
def strsum(group):
    return ''.join([str(x) for x in group.values])
g['col2'].apply(strsum)

Data export

Data as NumPy array:
df.values

Save data as CSV file:
df.to_csv('output.csv', sep=",")

Format a dataframe as tabular string:
df.to_string()

Convert a dataframe to a dictionary:
df.to_dict()

Save a dataframe as an Excel table:
df.to_excel('output.xlsx')

Pivot and Pivot Table

Read csv file:
df_gdp = pd.read_csv('gdp.csv')

The pivot() method:
df_gdp.pivot(index="year",
             columns="country",
             values="gdppc")

Read Excel file:
df_sales = pd.read_excel(
    'supermarket_sales.xlsx')

Make pivot table:
df_sales.pivot_table(index='Gender',
                     aggfunc='sum')

Make a pivot table that shows how much male and female customers spend in each category:
df_sales.pivot_table(index='Gender',
                     columns='Product line',
                     values='Total',
                     aggfunc='sum')

Visualization

The plots below are made with a dataframe with the shape of df_gdp (pivot() method).

Import matplotlib:
import matplotlib.pyplot as plt

Start a new diagram:
plt.figure()

Scatter plot:
df.plot(kind='scatter', x='col1', y='col2')  # scatter needs an x and a y column

Bar plot:
df.plot(kind='bar',
        xlabel='data1',
        ylabel='data2')

Lineplot:
df.plot(kind='line',
        figsize=(8,4))

Boxplot:
df['col1'].plot(kind='box')

Histogram over one column:
df['col1'].plot(kind='hist',
                bins=3)

Piechart:
df.plot(kind='pie',
        y='col1',
        title='Population')

Set tick marks:
labels = ['A', 'B', 'C', 'D']
positions = [1, 2, 3, 4]
plt.xticks(positions, labels)
plt.yticks(positions, labels)

Label diagram and axes:
plt.title('Correlation')
plt.xlabel('Nunstück')
plt.ylabel('Slotermeyer')

Save most recent diagram:
plt.savefig('plot.png')
plt.savefig('plot.png',dpi=300)
plt.savefig('plot.svg')
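The aggregation and pivot table entries above rely on files that are not included here, so the following is a minimal sketch with made-up data. Only the column names 'Gender', 'Product line' and 'Total' come from the sheet. The rows themselves are hypothetical and stand in for supermarket_sales.xlsx.

```python
import pandas as pd

# Hypothetical stand-in for supermarket_sales.xlsx; only the column names
# 'Gender', 'Product line' and 'Total' are taken from the cheat sheet.
df_sales = pd.DataFrame({
    'Gender': ['Female', 'Male', 'Female', 'Male', 'Female'],
    'Product line': ['Food', 'Food', 'Sports', 'Sports', 'Food'],
    'Total': [10.0, 20.0, 30.0, 40.0, 5.0],
})

# groupby + sum: one total per gender.
by_gender = df_sales.groupby('Gender')['Total'].sum()

# pivot_table: the same totals, split into one column per product line.
table = df_sales.pivot_table(index='Gender',
                             columns='Product line',
                             values='Total',
                             aggfunc='sum')

print(by_gender)
print(table)
```

For Excel users, `pivot_table` plays roughly the role of a PivotTable with 'Gender' in Rows, 'Product line' in Columns and Sum of 'Total' in Values.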