Pig in Big Data

Uploaded by

mayurkshirsat

Presentation by Mayur Shirsat

PRN: 23070243046
INTRODUCTION

• Apache Pig is an abstraction over MapReduce.

• It is a tool/platform used to analyze large data sets by representing them as data flows.
• It is generally used with Hadoop.
• To analyze data with Apache Pig, programmers write scripts in Pig Latin.
WHY DO WE NEED PIG?

• Programmers who are not comfortable with Java often struggle when working with Hadoop, especially when writing MapReduce tasks.
• Pig Latin is an SQL-like language and is easy to learn.
• Apache Pig provides many built-in operators to support data operations such as joins, filters, and ordering.
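
As a rough illustration of the built-in operators mentioned above, the following Pig Latin sketch filters, joins, and orders two small relations. The file names (students.txt, marks.txt), the field names, and the pass mark of 40 are hypothetical and chosen only for this example.

```pig
-- Hypothetical comma-separated input files; names and fields are assumptions.
students = LOAD 'students.txt' USING PigStorage(',')
           AS (id:int, name:chararray, city:chararray);
marks    = LOAD 'marks.txt' USING PigStorage(',')
           AS (id:int, score:int);

-- FILTER: keep only the rows that satisfy a condition.
passed = FILTER marks BY score >= 40;

-- JOIN: combine the two relations on the shared id field.
joined = JOIN students BY id, passed BY id;

-- ORDER: sort the joined rows by score, highest first.
ranked = ORDER joined BY passed::score DESC;

-- Print the result to the console.
DUMP ranked;
```

Each statement names a new relation built from the previous one, which is what makes a Pig script read as a data flow.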
PIG VS MAPREDUCE

Apache Pig:
• Pig is a data flow language.
• Any novice programmer with a basic knowledge of SQL can work conveniently with Apache Pig.
• Apache Pig uses a multi-query approach, thereby reducing the length of the code to a great extent.
• There is no need for compilation. On execution, every Apache Pig operator is converted internally into a MapReduce job.

MapReduce:
• MapReduce is a data processing paradigm.
• Exposure to Java is a must to work with MapReduce.
• MapReduce requires almost 20 times as many lines of code to perform the same task.
• MapReduce jobs have a long compilation process.
Word count code for Map Reduce in Java
Word count code in Apache Pig

This Apache Pig code takes a text file as input, splits each line into words, groups the words by
their value, counts the occurrences of each word, and finally stores the word counts in a
separate file.
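
The steps described above can be sketched in Pig Latin roughly as follows. This is a minimal version of the usual word count script, not necessarily the exact code from the slide; the paths 'input.txt' and 'wordcount_output' are placeholders.

```pig
-- 'input.txt' and 'wordcount_output' are placeholder paths (assumptions).

-- Read the text file, one line per record.
lines   = LOAD 'input.txt' USING TextLoader() AS (line:chararray);

-- Split each line into words and flatten the result to one word per record.
words   = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;

-- Group identical words together.
grouped = GROUP words BY word;

-- Count the occurrences of each word.
counts  = FOREACH grouped GENERATE group AS word, COUNT(words) AS total;

-- Write the word counts to a separate output location.
STORE counts INTO 'wordcount_output';
```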
PIG VS OTHER TOOLS
PIG ARCHITECTURE
HOW TO INSTALL PIG
Step 1: Visit the website and download the pig-0.17.0.tar.gz file.
Step 2: After the download, extract the file.
Step 3: Open the environment settings.
Step 4: Set the path.
Step 5: Open the bin folder in the Pig directory.
Step 6: Edit the Windows command script file.
Step 7: Open Windows PowerShell as administrator.
APACHE PIG EXECUTION MODES
You can run Apache Pig in two modes: Local Mode and MapReduce Mode (also called HDFS mode).

Local Mode
In this mode, all the files are installed and run from your local host and local file system.
There is no need for Hadoop or HDFS. This mode is generally used for testing purposes.

MapReduce Mode
In MapReduce mode, we use Apache Pig to load or process data that exists in the Hadoop
Distributed File System (HDFS). Whenever we execute Pig Latin statements in this mode, a
MapReduce job is invoked in the back end to perform the corresponding operation on the
data in HDFS.
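
The mode is chosen when Pig is started, using the -x option of the pig command. The sketch below shows a small script with the two invocations noted in its comments; the script name modes_demo.pig, the file sample.txt, and its fields are hypothetical.

```pig
-- modes_demo.pig (hypothetical script name)
--
-- Local mode:     pig -x local modes_demo.pig
--                 'sample.txt' is read from the local file system.
-- MapReduce mode: pig -x mapreduce modes_demo.pig
--                 'sample.txt' is read from HDFS.

-- The same statements work in both modes; only the file system differs.
data = LOAD 'sample.txt' USING PigStorage(',') AS (id:int, name:chararray);
DUMP data;
```

Because the script itself does not change, it can be tested locally on a small file first and then run against the full data in HDFS.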
READING DATA
function − We have to choose a function from the set of load functions provided by Apache Pig (BinStorage, JsonLoader, PigStorage,
TextLoader).
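
As a sketch of how the load functions listed above are used, each LOAD statement below names a file, a load function after USING, and (optionally) a schema after AS. All file names and field names here are hypothetical.

```pig
-- PigStorage: delimited text; the delimiter is tab by default, comma here.
csv_rows  = LOAD 'data.csv' USING PigStorage(',')
            AS (id:int, name:chararray);

-- TextLoader: each line is loaded as a single chararray field.
raw_lines = LOAD 'notes.txt' USING TextLoader() AS (line:chararray);

-- JsonLoader: JSON records, with the schema passed as a string argument.
json_rows = LOAD 'data.json' USING JsonLoader('id:int, name:chararray');

-- BinStorage: Pig's binary format, typically for data that an earlier
-- Pig script wrote with BinStorage.
bin_rows  = LOAD 'intermediate.bin' USING BinStorage();
```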
BASIC COMMANDS
PIG VS SQL

Apache Pig:
• Pig Latin is a procedural language.
• In Apache Pig, schema is optional. We can load and store data without designing a schema (fields are then referenced by position as $0, $1, etc.).
• The data model in Apache Pig is nested relational.
• Apache Pig provides limited opportunity for query optimization.

SQL:
• SQL is a declarative language.
• Schema is mandatory in SQL.
• The data model used in SQL is flat relational.
• There is more opportunity for query optimization in SQL.
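
Two of the points above, the optional schema and the nested data model, can be illustrated with a short Pig Latin sketch. The file employees.txt and its three fields are assumptions made for this example.

```pig
-- No schema declared: fields are referenced by position ($0, $1, $2, ...).
raw  = LOAD 'employees.txt' USING PigStorage(',');
cols = FOREACH raw GENERATE $1, $2;

-- Same file with a schema: fields can now be referenced by name.
emp  = LOAD 'employees.txt' USING PigStorage(',')
       AS (id:int, name:chararray, dept:chararray);

-- GROUP produces one record per department whose second field is a bag
-- of the matching emp tuples, i.e. a nested (non-flat) structure.
by_dept = GROUP emp BY dept;

-- Show the nested schema of the grouped relation.
DESCRIBE by_dept;
```

In SQL, the grouped rows would have to be aggregated immediately; in Pig, the bag of tuples is itself a value that later statements can process.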
PIG PHILOSOPHY
Pigs eat anything

Pigs live anywhere

Pigs are domestic animals

THANK YOU
