Data Exploration with SQL & Python


Installing Required Libraries
# Installing SQLAlchemy, PandaSQL
!pip install SQLAlchemy
!pip install PandaSQL
!pip install pymysql
# sqlite3 is already part of python.

Requirement already satisfied: SQLAlchemy in /usr/local/lib/python3.10/dist-packages (2.0.36)
Collecting PandaSQL
  Downloading pandasql-0.7.3.tar.gz (26 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: PandaSQL
  Building wheel for PandaSQL (setup.py) ... done
Successfully built PandaSQL
Installing collected packages: PandaSQL
Successfully installed PandaSQL-0.7.3
Collecting pymysql
  Downloading PyMySQL-1.1.1-py3-none-any.whl.metadata (4.4 kB)
  Downloading PyMySQL-1.1.1-py3-none-any.whl (44 kB)
Installing collected packages: pymysql
Successfully installed pymysql-1.1.1

SQLAlchemy
## Endpoint creation:
# Format & Syntax: mysql+pymysql://username:password@host:port/database

# host: learning-activity-rr.cejogcrmn6il.ap-south-1.rds.amazonaws.com
# port: 3306
# Database: assignment
# username: almafolk
# password: 8l39zk60
# Access: Read-only
# Final End Point : mysql+pymysql://almafolk:8l39zk60q@learning-activity-rr.cejogcrmn6il.ap-south-1.rds.a

import pandas as pd
from sqlalchemy import create_engine
from sqlalchemy.pool import NullPool

def mysql(query: str):
    '''
    Fetches data from the database for the given query and returns the result as a DataFrame.
    '''
    try:
        # Build the engine from the endpoint described above
        engine_db = create_engine('mysql+pymysql://almafolk:8l39zk60q@learning-activity-rr.cejogcrmn6il.ap-south-1.rds.amazonaws.com:3306/assignment')
        conn = engine_db.connect()  # Connection object

        # Reading data
        df = pd.read_sql_query(query, conn)

        # Close the connection and dispose of the engine once the data is fetched
        if not conn.closed:
            conn.close()
        engine_db.dispose()
        return df
    except Exception as e:
        print(e)

mysql('''show tables''')

    Tables_in_assignment
0   abcnews-date-text
1   abcnews_date_text
2   almax_job_scraping
3   campaign_identifier
4   customer_nodes
..  ...
77  subscriptions
78  telecom_churn
79  test_store_csv
80  users
81  weekly_sales

82 rows × 1 columns

Table 1: plans


mysql('''select * from plans''')

   plan_id      plan_name  price
0        0          trial    0.0
1        1  basic monthly    9.9
2        2    pro monthly   19.9
3        3     pro annual  199.0
4        4          churn    NaN

mysql('''select * from plans''').info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 3 columns):

 #   Column     Non-Null Count  Dtype
---  ------     --------------  -----
 0   plan_id    5 non-null      int64
 1   plan_name  5 non-null      object
 2   price      4 non-null      float64
dtypes: float64(1), int64(1), object(1)
memory usage: 248.0+ bytes

Table 2: subscriptions


mysql('''select * from subscriptions''')

      customer_id  plan_id  start_date
0               1        0  2020-08-01
1               1        1  2020-08-08
2               2        0  2020-09-20
3               2        3  2020-09-27
4               3        0  2020-01-13
...           ...      ...         ...
2645          999        2  2020-10-30
2646          999        4  2020-12-01
2647         1000        0  2020-03-19
2648         1000        2  2020-03-26
2649         1000        4  2020-06-04

2650 rows × 3 columns

mysql('''select * from subscriptions''').info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2650 entries, 0 to 2649
Data columns (total 3 columns):
 #   Column       Non-Null Count  Dtype
---  ------       --------------  -----
 0   customer_id  2650 non-null   int64
 1   plan_id      2650 non-null   int64
 2   start_date   2650 non-null   object
dtypes: int64(2), object(1)
memory usage: 62.2+ KB
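
Note that start_date comes back with an object dtype. If you want to filter or group by date on the pandas side, a small sketch (reusing the mysql helper defined above; subs is just an illustrative name) converts it explicitly:

# Convert start_date from object to datetime for date arithmetic in pandas
subs = mysql('''select * from subscriptions''')
subs['start_date'] = pd.to_datetime(subs['start_date'])
subs['start_date'].dt.year.value_counts()  # count of subscription events per year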

1. How many customers has the restaurant ever had?

(2 cells hidden)
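
The answer cells for question 1 are hidden in this export. As a minimal sketch (assuming "customers" means distinct customer_id values in the subscriptions table), the count could be fetched with the mysql helper:

mysql('''
SELECT COUNT(DISTINCT customer_id) AS total_customers
FROM subscriptions;
''')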

2. What plan start_date values occur after the year 2020 for our dataset?

Show the breakdown by count of events for each plan_name.

(9 cells hidden)
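
Those answer cells are hidden as well. One way to approach question 2 (a sketch, not the notebook's hidden solution) is to join subscriptions to plans, keep only the start_date values after 2020, and count events per plan_name:

mysql('''
SELECT p.plan_name, COUNT(*) AS num_events
FROM subscriptions s
JOIN plans p ON s.plan_id = p.plan_id
WHERE s.start_date > '2020-12-31'
GROUP BY p.plan_name
ORDER BY num_events DESC;
''')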


SQLite3
SQLite is commonly used in scenarios where lightweight, serverless, and self-contained database capabilities are required. Here are some common use cases:

1. Embedded Systems: SQLite is often used in embedded systems and IoT devices where a compact, self-contained database engine is needed to store and retrieve data locally.
2. Mobile Applications: Many mobile applications use SQLite as their local database engine due to its small footprint, simplicity, and ease of integration with mobile platforms like Android and iOS.
3. Prototyping and Testing: SQLite is frequently used during the prototyping and testing phases of software development due to its simplicity and convenience. Developers can quickly set up and interact with databases without the need for a separate database server.

import sqlite3

# Connecting to the database
conn = sqlite3.connect('coffee_shop.db')  # Creates the DB file if it does not already exist

c = conn.cursor()  # Cursor object, used to run DDL/DML statements

c.execute('''DROP TABLE IF EXISTS transactions;''')

## Cursor object
## A cursor is created from an open connection and is used to execute SQL queries.
## It acts as the middleman between your SQL statements and the SQLite database connection,
## and it only becomes available after the connection has been established.

c.execute('''
CREATE TABLE transactions (
    transaction_id INTEGER PRIMARY KEY,
    date DATE,
    time TIME,
    item TEXT,
    price REAL,
    quantity INTEGER,
    total_amount REAL
);
''')

c.execute('''
INSERT INTO transactions (
transaction_id, date, time, item, price, quantity, total_amount)
VALUES
(1, '2022-01-01', '09:00', 'Coffee', 2.50, 1, 2.50),
(2, '2022-01-01', '10:30', 'Croissant', 1.75, 2, 3.50),
(3, '2022-01-01', '11:15', 'Cappuccino', 3.00, 1, 3.00),
(4, '2022-01-02', '08:45', 'Latte', 3.50, 2, 7.00),
(5, '2022-01-02', '10:00', 'Muffin', 2.25, 1, 2.25),
(6, '2022-01-02', '12:15', 'Espresso', 2.75, 1, 2.75),
(7, '2022-01-03', '09:30', 'Croissant', 1.75, 3, 5.25),
(8, '2022-01-03', '11:00', 'Iced Coffee', 3.50, 2, 7.00),
(9, '2022-01-03', '12:45', 'Hot Chocolate', 3.25, 1, 3.25),
(10, '2022-01-04', '10:15', 'Cappuccino', 3.00, 2, 6.00);
''')
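
Note that sqlite3 wraps these statements in an implicit transaction: the inserted rows are visible to queries on this same connection, but they are only persisted to coffee_shop.db for other connections once the transaction is committed.

conn.commit()  # Persist the CREATE TABLE and INSERT statements to coffee_shop.db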


cursor = c.execute('''
SELECT * from transactions
''')
# Display all rows from the transactions table
for row in cursor:
    print(row)

df1= pd.read_sql_query('''
SELECT * from transactions
''',conn)

df1.head()

df1_g=df1.groupby(["item"])["price"].mean().reset_index()

df1_g.head()

df = pd.read_sql_query('''
SELECT item, avg(price) AS avg_price
FROM transactions
group by item;
''',conn)

df.head()

# DF to SQL table
df.to_sql("item_price", conn, if_exists="replace")
# item_price is the table name
## if_exists options: 'append','replace'

pd.read_sql_query('''
SELECT * from item_price
''',conn)
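
By default, to_sql also writes the DataFrame index as an extra column of the new table. If only the item and avg_price columns are wanted, a small variation of the cell above passes index=False:

# Write the aggregated DataFrame without the pandas index column
df.to_sql("item_price", conn, if_exists="replace", index=False)

pd.read_sql_query('''
SELECT * from item_price
''',conn)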

### Iterative Execution of Query


df = pd.DataFrame()  # Empty DataFrame to collect results

for item in ['Coffee','Croissant','Cappuccino']:
    df_item = pd.read_sql_query(f'''
    SELECT *
    FROM transactions where item = '{item}';
    ''', conn)
    df = pd.concat([df, df_item], axis=0).reset_index(drop=True)
    print(f"Execution of item {item} is done!")

df
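
Interpolating item into the SQL string with an f-string is fine for this hard-coded list, but for values coming from users it is safer to let the driver bind the parameters. A minimal sketch of the same loop using sqlite3's ? placeholder and the params argument of read_sql_query (df_safe is just an illustrative name):

df_safe = pd.DataFrame()
for item in ['Coffee','Croissant','Cappuccino']:
    df_item = pd.read_sql_query(
        'SELECT * FROM transactions WHERE item = ?;',  # ? is the sqlite3 placeholder
        conn,
        params=(item,),
    )
    df_safe = pd.concat([df_safe, df_item], axis=0).reset_index(drop=True)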

df.loc[df.item == 'Coffee',:]

## Iterative
# DF to SQL table
for i in ['Coffee','Croissant','Cappuccino']:
    df.loc[df.item == i, :].to_sql("new_table", conn, if_exists="append")


pd.read_sql_query('''
SELECT * from new_table
''',conn)

# Close the connection


conn.close()

PandaSQL

(7 cells hidden)
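
The PandaSQL cells are hidden in this export. As a rough sketch of what the library offers (assuming the df1 DataFrame built from the transactions table earlier is still in memory), pandasql's sqldf runs SQL directly against in-memory DataFrames:

from pandasql import sqldf

# sqldf resolves table names like df1 from the supplied namespace
q = '''
SELECT item, AVG(price) AS avg_price
FROM df1
GROUP BY item;
'''
sqldf(q, locals())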
