Data Analysis Using Python

The document analyzes employee data from a CSV file with 24 entries. It performs tasks like finding number of employees by governorate, department metrics like average age and salary, filtering data for specific departments, and calculating bonuses based on hire date. Visualizations are also planned to represent the data.

Uploaded by

talithasyahda.ts


3/27/24, 11:20 PM Untitled11 - Jupyter Notebook

Import libraries
In [1]: import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import datetime
%matplotlib inline
sns.set()

In [2]: Hr = pd.read_csv(r'C:\Users\compucity\Downloads\Book1.csv')
Hr.head()
Out[2]:
   E_ID    Name  Age Address  Telephone  Salary Department Hire Date
0    11   Aleya   46   Cairo    4218483    3000    Account  24-02-03
1     9  Hassan   25   Cairo    3578283    2000      Sales  21-01-06
2    15    Ramy   57    Alex    3674313    5000   Computer  21-03-00
3    18     Ola   28   Milan    4186473    5000      Sales  04-02-07
4    22   Zeiad   29   Milan    3642303    2000      Sales  01-03-98

Tasks
1- Find the number of employees in each governorate
2- Find the number of employees in each department
3- Find the average age of employees in each department
4- Find the average salary of employees in each department
5- Retrieve the data of employees who work in the Computer department only,
as well as the employees in each of the other departments
6- Find the number of employees in each department who work in Cairo Governorate only
7- Search for the employee who receives the highest salary and retrieve their complete data
8- Find the number of employees by department in each governorate
9- Calculate a bonus based on Hire Date:
Hire Date >= 1-1-2005: 5% of Salary
Hire Date >= 1-1-2003: 10%
Hire Date >= 1-1-2000: 15%
Hire Date >= 1-1-1995: 20%
Hire Date >= 1-1-1990: 25%
Else: 30%

10- Find some suitable graphs for the data

localhost:8889/notebooks/mahmoud1/Untitled11.ipynb 1/12

In [4]: Hr.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 24 entries, 0 to 23
Data columns (total 8 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 E_ID 24 non-null int64
1 Name 24 non-null object
2 Age 24 non-null int64
3 Address 24 non-null object
4 Telephone 24 non-null int64
5 Salary 24 non-null int64
6 Department 24 non-null object
7 Hire Date 24 non-null object
dtypes: int64(4), object(4)
memory usage: 1.6+ KB

In [5]: Hr.dtypes

Out[5]: E_ID int64

Name object
Age int64
Address object
Telephone int64
Salary int64
Department object
Hire Date object
dtype: object
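
The dtypes above show that Hire Date is stored as plain text (object). A minimal sketch of converting it to a real datetime column, assuming every value follows the day-month-two-digit-year pattern seen in Hr.head(). The frame `sample` and the column name `Hire Date Parsed` are illustrative and not part of the notebook.

```python
import pandas as pd

# Small excerpt of the notebook's data (the five rows shown by Hr.head()).
sample = pd.DataFrame({
    "Name": ["Aleya", "Hassan", "Ramy", "Ola", "Zeiad"],
    "Hire Date": ["24-02-03", "21-01-06", "21-03-00", "04-02-07", "01-03-98"],
})

# Parse day-month-year strings with a two-digit year.
# With %y, values 69-99 are read as 19xx and 00-68 as 20xx.
sample["Hire Date Parsed"] = pd.to_datetime(sample["Hire Date"], format="%d-%m-%y")

print(sample.dtypes)
print(sample["Hire Date Parsed"].dt.year.tolist())
```

Once the column is a datetime, comparisons against cut-off dates no longer need a per-row strptime call.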

In [6]: # Total number of missing values per column

Hr.isnull().sum()

Out[6]: E_ID 0
Name 0
Age 0
Address 0
Telephone 0
Salary 0
Department 0
Hire Date 0
dtype: int64

In [7]: # Number of duplicated rows
Hr.duplicated().sum()
Out[7]: 0

In [8]: Hr['Address'].value_counts()

Out[8]: Cairo 8
Alex 6
Giza 5
Milan 2
Alexandria 1
Alixandria 1
milan 1
Name: Address, dtype: int64
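
The counts above split one city across several spellings (Alex, Alexandria, Alixandria) and two capitalizations (Milan, milan), so the per-governorate totals are understated. The notebook does not clean these values. The following is only a sketch of one way to normalize them, assuming all three Alexandria spellings refer to the same governorate. The `variants` mapping is an assumption made for illustration.

```python
import pandas as pd

# Address values rebuilt from the value_counts() output above.
addresses = pd.Series(
    ["Cairo"] * 8 + ["Alex"] * 6 + ["Giza"] * 5 + ["Milan"] * 2
    + ["Alexandria", "Alixandria", "milan"],
    name="Address",
)

# Assumed mapping: every Alexandria spelling collapses to "alex".
variants = {"alexandria": "alex", "alixandria": "alex"}

cleaned = (
    addresses.str.strip()   # drop stray whitespace
    .str.lower()            # make the comparison case-insensitive
    .replace(variants)      # merge the known spelling variants
    .str.title()            # restore capitalized names
)

print(cleaned.value_counts())
```

With this mapping the seven labels collapse to four: Cairo 8, Alex 8, Giza 5 and Milan 3.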

In [9]: Hr['Department'].value_counts()

Out[9]: Sales 9
Computer 8
Account 7
Name: Department, dtype: int64

In [10]: # AVERAGE Age

Hr.groupby('Department')['Age'].mean()
Out[10]: Department
Account 46.142857
Computer 41.625000
Sales 33.222222
Name: Age, dtype: float64

In [11]: # AVERAGE Salary of Department

Hr.groupby('Department')['Salary'].mean().round()

Out[11]: Department
Account 2143.0
Computer 4375.0
Sales 2778.0
Name: Salary, dtype: float64
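
The head count, average age and average salary per department can also be computed in a single pass with named aggregation. This is a sketch rather than the notebook's own method. The frame `emp` holds only the Account and Computer rows copied from the notebook's output, and the result column names are illustrative.

```python
import pandas as pd

# Age and Salary for the 7 Account and 8 Computer employees listed in the notebook.
emp = pd.DataFrame({
    "Department": ["Account"] * 7 + ["Computer"] * 8,
    "Age": [46, 29, 48, 55, 44, 53, 48, 57, 34, 46, 43, 37, 33, 28, 55],
    "Salary": [3000, 3000, 3000, 1000, 2000, 2000, 1000,
               5000, 4000, 7000, 4000, 3000, 6000, 4000, 2000],
})

# One groupby, three named result columns.
summary = emp.groupby("Department").agg(
    employees=("Age", "size"),
    avg_age=("Age", "mean"),
    avg_salary=("Salary", "mean"),
).round({"avg_age": 2, "avg_salary": 0})

print(summary)
```

The average salaries should agree with the rounded figures above (2143 for Account, 4375 for Computer).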

In [12]: Hr[Hr['Department']== "Computer"]

Out[12]:
    E_ID    Name  Age     Address  Telephone  Salary Department Hire Date
2     15    Ramy   57        Alex    3674313    5000   Computer  21-03-00
5     12   Salwa   34  Alexandria    4090443    4000   Computer  06-12-05
8     27  Yousef   46  Alixandria    3706323    7000   Computer  10-04-03
16    58  Neveen   43        Alex    3834363    4000   Computer  29-06-03
19    61  Yasser   37       Cairo    3962403    3000   Computer  17-09-06
21    79   Maged   33       Cairo    3930393    6000   Computer  28-08-98
22    90   Ahmed   28       Cairo    4250493    4000   Computer  16-03-90
23    94    Dina   55       Cairo    4282503    2000   Computer  05-04-99

In [13]: Hr[Hr['Department']== "Sales"]

Out[13]:
    E_ID     Name  Age Address  Telephone  Salary Department Hire Date
1      9   Hassan   25   Cairo    3578283    2000      Sales  21-01-06
3     18      Ola   28   Milan    4186473    5000      Sales  04-02-07
4     22    Zeiad   29   Milan    3642303    2000      Sales  01-03-98
6     24      Ali   24    Giza    4154463    1000      Sales  15-01-91
7     25   Tahany   39    Alex    3546273    3000      Sales  01-01-00
10    35  Mahmoud   57    Giza    3610293    4000      Sales  10-02-99
12    48    Wagdy   24   Cairo    4058433    5000      Sales  16-11-07
14    55    Samah   38   milan    4026423    1000      Sales  27-10-01
15    57    Rawan   35    Alex    3994413    2000      Sales  07-10-02

In [14]: Hr[Hr['Department']== "Account"]

Out[14]:
    E_ID    Name  Age Address  Telephone  Salary Department Hire Date
0     11   Aleya   46   Cairo    4218483    3000    Account  24-02-03
9     29  Khaled   29    Alex    3770343    3000    Account  20-05-01
11    47  Talaat   48    Giza    3866373    3000    Account  19-07-01
13    54    Samy   55    Giza    3738333    1000    Account  30-04-04
17    62     Amr   44   Cairo    4122453    2000    Account  26-12-95
18    65    Hala   53    Alex    3802353    2000    Account  09-06-90
20    71   Radwa   48    Giza    3898383    1000    Account  08-08-05

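In [15]: # Find employees in the Computer department whose Address is "Alex"
# (rows spelled "Alexandria" or "Alixandria" are not matched by this filter)

Hr[(Hr['Department'] == "Computer") & (Hr['Address'] == "Alex")]
Out[15]:
    E_ID    Name  Age Address  Telephone  Salary Department Hire Date
2     15    Ramy   57    Alex    3674313    5000   Computer  21-03-00
16    58  Neveen   43    Alex    3834363    4000   Computer  29-06-03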

In [16]: # Find the number of employees in each department who work in Cairo Governorate only
Hr[Hr['Address']=='Cairo']['Department'].value_counts()

Out[16]: Computer 4
Account 2
Sales 2
Name: Department, dtype: int64

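In [36]: # Search for the employee who receives the highest salary and retrieve their complete data
# Compare each salary with the column maximum so that a real row is returned,
# rather than taking the maximum of every column independently
Hr[Hr['Salary'] == Hr['Salary'].max()][['Name', 'Address', 'Department', 'Age', 'Salary']]
Out[36]:
     Name     Address Department  Age  Salary
8  Yousef  Alixandria   Computer   46    7000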

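In [37]: # Search for the employees who receive the lowest salary and retrieve their complete data
# Four employees share the minimum salary, so four rows are returned
Hr[Hr['Salary'] == Hr['Salary'].min()][['Name', 'Address', 'Department', 'Age', 'Salary']]
Out[37]:
     Name Address Department  Age  Salary
6     Ali    Giza      Sales   24    1000
13   Samy    Giza    Account   55    1000
14  Samah   milan      Sales   38    1000
20  Radwa    Giza    Account   48    1000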

In [18]: # Number of employees by department in each governorate

Hr.groupby('Address')['Department'].value_counts()
Out[18]: Address Department
Alex Account 2
Computer 2
Sales 2
Alexandria Computer 1
Alixandria Computer 1
Cairo Computer 4
Account 2
Sales 2
Giza Account 3
Sales 2
Milan Sales 2
milan Sales 1
Name: Department, dtype: int64
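
The same department-by-governorate counts can be laid out as a two-way table, which makes empty combinations visible as zeros. A minimal sketch using pd.crosstab follows. The frame `emp` is rebuilt from the Address and Department values listed in the notebook's department tables, and the label "Total" is an arbitrary choice.

```python
import pandas as pd

# Department and Address for all 24 employees, in the order of the department tables.
emp = pd.DataFrame({
    "Department": ["Computer"] * 8 + ["Sales"] * 9 + ["Account"] * 7,
    "Address": [
        "Alex", "Alexandria", "Alixandria", "Alex", "Cairo", "Cairo", "Cairo", "Cairo",
        "Cairo", "Milan", "Milan", "Giza", "Alex", "Giza", "Cairo", "milan", "Alex",
        "Cairo", "Alex", "Giza", "Giza", "Cairo", "Alex", "Giza",
    ],
})

# Rows are addresses, columns are departments; margins add row and column totals.
table = pd.crosstab(emp["Address"], emp["Department"],
                    margins=True, margins_name="Total")

print(table)
```

Because the raw Address values are used, the spelling variants still appear as separate rows.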

# Two different ways to compute the bonus, showing how to handle dates in the calculation

In [14]: # Approach 1: loop over the rows and collect each bonus in a list
bonus = []

for r in range(len(Hr['Hire Date'])):
    # Parse one date string at a time, e.g. '24-02-03' -> 2003-02-24
    hire_date = datetime.datetime.strptime(Hr['Hire Date'][r], '%d-%m-%y')
    salary = Hr['Salary'][r]
    if hire_date >= datetime.datetime(2005, 1, 1):
        bonus.append(5/100 * salary)
    elif hire_date >= datetime.datetime(2003, 1, 1):
        bonus.append(10/100 * salary)
    elif hire_date >= datetime.datetime(2000, 1, 1):
        bonus.append(15/100 * salary)
    elif hire_date >= datetime.datetime(1995, 1, 1):
        bonus.append(20/100 * salary)
    elif hire_date >= datetime.datetime(1990, 1, 1):
        bonus.append(25/100 * salary)
    else:
        bonus.append(30/100 * salary)

Hr['Bonus'] = bonus

print(Hr['Bonus'])

0 300.0
1 100.0
2 750.0
3 250.0
4 400.0
5 200.0
6 250.0
7 450.0
8 700.0
9 450.0
10 800.0
11 450.0
12 250.0
13 100.0
14 150.0
15 300.0
16 400.0
17 400.0
18 500.0
19 150.0
20 50.0
21 1200.0
22 1000.0
23 400.0
Name: Bonus, dtype: float64

In [7]: def an(row):

    hire_date = datetime.datetime.strptime(row['Hire Date'], '%d-%m-%y')

    if hire_date >= datetime.datetime(2005, 1, 1):
        return 5/100 * row['Salary']
    elif hire_date >= datetime.datetime(2003, 1, 1):
        return 10/100 * row['Salary']
    elif hire_date >= datetime.datetime(2000, 1, 1):
        return 15/100 * row['Salary']
    elif hire_date >= datetime.datetime(1995, 1, 1):
        return 20/100 * row['Salary']
    elif hire_date >= datetime.datetime(1990, 1, 1):
        return 25/100 * row['Salary']
    else:
        return 30/100 * row['Salary']

Hr['Bonus'] = Hr.apply(an, axis=1)

print(Hr['Bonus'])
0 300.0
1 100.0
2 750.0
3 250.0
4 400.0
5 200.0
6 250.0
7 450.0
8 700.0
9 450.0
10 800.0
11 450.0
12 250.0
13 100.0
14 150.0
15 300.0
16 400.0
17 400.0
18 500.0
19 150.0
20 50.0
21 1200.0
22 1000.0
23 400.0
Name: Bonus, dtype: float64
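
The bonus tiers can also be evaluated for the whole column at once instead of row by row. The sketch below is not part of the notebook: it uses pd.to_datetime and np.select on the five rows shown by Hr.head(), and the names `sample`, `hired`, `conditions` and `rates` are illustrative.

```python
import numpy as np
import pandas as pd

# The five rows shown by Hr.head(), reduced to the columns the bonus needs.
sample = pd.DataFrame({
    "Salary": [3000, 2000, 5000, 5000, 2000],
    "Hire Date": ["24-02-03", "21-01-06", "21-03-00", "04-02-07", "01-03-98"],
})

# Parse the whole column in one call.
hired = pd.to_datetime(sample["Hire Date"], format="%d-%m-%y")

# np.select picks the first condition that is True for each row,
# which mirrors the order of the if/elif chain.
conditions = [
    hired >= pd.Timestamp(2005, 1, 1),
    hired >= pd.Timestamp(2003, 1, 1),
    hired >= pd.Timestamp(2000, 1, 1),
    hired >= pd.Timestamp(1995, 1, 1),
    hired >= pd.Timestamp(1990, 1, 1),
]
rates = [0.05, 0.10, 0.15, 0.20, 0.25]

# Rows matching no condition fall back to the 30% rate.
sample["Bonus"] = np.select(conditions, rates, default=0.30) * sample["Salary"]

print(sample["Bonus"].tolist())
```

For these five rows the result should match the first five values printed above (300, 100, 750, 250, 400).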

In [9]: # DataFrame after adding the Bonus column

Hr.head()
Out[9]:
   E_ID    Name  Age Address  Telephone  Salary Department Hire Date  Bonus
0    11   Aleya   46   Cairo    4218483    3000    Account  24-02-03  300.0
1     9  Hassan   25   Cairo    3578283    2000      Sales  21-01-06  100.0
2    15    Ramy   57    Alex    3674313    5000   Computer  21-03-00  750.0
3    18     Ola   28   Milan    4186473    5000      Sales  04-02-07  250.0
4    22   Zeiad   29   Milan    3642303    2000      Sales  01-03-98  400.0

In [30]: # Box plot of the age distribution (the center line marks the median)

sns.boxplot(x=Hr['Age'],palette='YlGn')
Out[30]: <AxesSubplot:xlabel='Age'>

In [18]: xa = sns.countplot(x=Hr['Department'],palette='PuBu')
for bar in xa.containers:
    xa.bar_label(bar)

In [22]: xa = sns.countplot(x=Hr['Address'],palette='Set1')
for bar in xa.containers:
    xa.bar_label(bar)

In [19]: plt.figure(figsize=(10,5))
sns.countplot(x='Department',hue='Address',data=Hr,palette='hsv')
Out[19]: <AxesSubplot:xlabel='Department', ylabel='count'>

In [95]: # Age by governorate and department

# Compare age distributions across governorates, split by department
plt.figure(figsize=(8,5))
sns.boxplot(x=Hr['Address'],hue=Hr['Department'],y=Hr['Age'],palette='Set1')
Out[95]: <AxesSubplot:xlabel='Address', ylabel='Age'>

In [96]: #Compare ages with salaries in different departments

sns.jointplot(x='Salary',y='Age',hue='Department',data=Hr)
Out[96]: <seaborn.axisgrid.JointGrid at 0x2043cd52d30>

In [41]: #Compare ages with salaries across addresses

sns.jointplot(x='Salary',y='Age',hue='Address',data=Hr)
Out[41]: <seaborn.axisgrid.JointGrid at 0x1a1bade56d0>
