Script Scrapping

This Python script uses Selenium to scrape reviews from the Google Play Store. It sets up the Chrome webdriver, opens the listing for one app (com.kai.kaiticketing), and steps through the reviews, parsing each with BeautifulSoup to extract the date, rating, title, and text. Each review becomes a one-row pandas DataFrame; the rows are concatenated and exported to a CSV file.

Uploaded by

NormaFikria

# Load the webdriver module from Selenium (Selenium 4 API)
from time import sleep

import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# Set up Chrome webdriver options
chrome_options = webdriver.ChromeOptions()

# Local path of the chromedriver executable
chrome_path = r"C:\Users\USER\Documents\chromedriver_win32\chromedriver.exe"

# XPath of the button the script clicks before sorting and after each pass over the reviews
button_xpath = '//*[@id="body-content"]/div/div/div[1]/div[2]/div[2]/div[1]/div[4]/button[2]/div[2]/div/div'

# Create the Chrome webdriver instance with the chrome_options set above
driver = webdriver.Chrome(service=Service(executable_path=chrome_path), options=chrome_options)
link = "https://play.google.com/store/apps/details?id=com.kai.kaiticketing&hl=in"
driver.get(link)
#driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")

# App title with spaces removed; used later in the CSV file name
Ptitle = driver.find_element(By.CLASS_NAME, 'id-app-title').text.replace(' ', '')
print(Ptitle)
#driver.find_element(By.XPATH, '//*[@id="body-content"]/div/div/div[1]/div[2]/div[2]/div[1]/div[4]/button[2]/div[2]').click()

sleep(1)
driver.find_element(By.XPATH, button_xpath).click()
#select_newest.select_by_visible_text('Newest')
driver.find_element(By.XPATH, button_xpath).click()
sleep(2)

# Open the sort dropdown and pick its first entry
#driver.find_element(By.CSS_SELECTOR, '.review-filter.id-review-sort-filter.dropdown-menu-container').click()
driver.find_element(By.CSS_SELECTOR, '.displayed-child').click()
#driver.find_element(By.XPATH, "//button[@data-dropdown-value='1']").click()
driver.execute_script("document.querySelectorAll('button.dropdown-child')[0].click()")

# One single-row DataFrame per scraped review
review_frames = []
for i in range(1, 100):
    try:
        for elem in driver.find_elements(By.CLASS_NAME, 'single-review'):
            print(str(i))
            content = elem.get_attribute('outerHTML')
            soup = BeautifulSoup(content, "html.parser")
            #print(soup.prettify())
            date = soup.find('span', class_='review-date').get_text()
            # The character at index 6 of the aria-label is taken as the star rating
            rating = soup.find('div', class_='tiny-star')['aria-label'][6:7]
            title = soup.find('span', class_='review-title').get_text()
            # Drop the "Full Review" link text, then skip the leading title
            txt = soup.find('div', class_='review-body').get_text().replace('Full Review', '')[len(title)+1:]
            print(soup.get_text())
            temp = pd.DataFrame({'Date': date, 'Rating': rating, 'Review Title': title, 'Review Text': txt}, index=[0])
            print('-'*10)
            review_frames.append(temp)
            #print(elem)
    except Exception:
        # A review missing one of the expected elements ends this pass
        print('s')
    driver.find_element(By.XPATH, button_xpath).click()

# Combine all reviews into one DataFrame and export it
reviews_df = pd.concat(review_frames, ignore_index=True)

reviews_df.to_csv(Ptitle+'_reviews_kaiaccess.csv', encoding='utf-8')

#driver.close()
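
The two string slices used in the loop above (the `[6:7]` slice for the rating and the `[len(title)+1:]` slice for the review text) can be tried offline without a browser. The sketch below is only an illustration: the helper names `parse_rating`, `first_digit`, and `clean_review_text` are hypothetical, and the sample strings are made up, assuming an English label of the form "Rated 4 stars out of five stars". The real label text depends on the page language (the script requests `hl=in`), so the fixed index may not hold there; `first_digit` is a swapped-in alternative that does not rely on the position.

```python
import re


def parse_rating(aria_label: str) -> str:
    """Return the character at index 6, exactly as the script's [6:7] slice does."""
    return aria_label[6:7]


def first_digit(aria_label: str) -> str:
    """Alternative (not in the original script): first digit anywhere in the label."""
    match = re.search(r"\d", aria_label)
    return match.group(0) if match else ""


def clean_review_text(body: str, title: str) -> str:
    """Remove the 'Full Review' link text, then skip the title plus one separator character."""
    return body.replace("Full Review", "")[len(title) + 1:]


# Hypothetical sample values, not taken from a real Play Store page.
label = "Rated 4 stars out of five stars"
title = "Great app"
body = "Great app Booking tickets is quick. Full Review"

print(parse_rating(label))
print(first_digit(label))
print(repr(clean_review_text(body, title)))
```

Note that `clean_review_text` assumes the review body starts with the title followed by exactly one character; if a review has no title, the slice removes only the first character of the body.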
