Big Data Computing - Assignment 3

This document contains a 10-question quiz on Spark, distributed computing concepts, and NoSQL databases for the Week 3 assignment of the Big Data Computing course. The questions cover RDDs, Spark APIs, Spark Streaming, GraphX, Cassandra, and scaling strategies. Each question is listed with its answer options; responses could be submitted any number of times before the due date.

Uploaded by

VarshaMega

9/6/21, 6:41 AM Big Data Computing - - Unit 5 - Week-3

Assessment submitted.

Thank you for taking the Week 3: Assignment-3.

Week 3: Assignment-3
Your last recorded submission was on 2021-09-06, 06:41 IST. Due date: 2021-09-15, 23:59 IST.

1) In Spark, a ______________________ is a read-only collection of objects partitioned across a set of machines that can be rebuilt if a partition is lost. 1 point

Spark Streaming

FlatMap

Driver

Resilient Distributed Dataset (RDD)

2) Given the following definition of the join transformation in Apache Spark: 1 point

def join[W](other: RDD[(K, W)]): RDD[(K, (V, W))]

The join operation joins two datasets. When it is called on datasets of type (K, V) and (K, W), it returns a dataset of (K, (V, W)) pairs with all pairs of elements for each key.

What is the output of joinrdd.collect when the following code is run?

val rdd1 = sc.parallelize(Seq(("m",55),("m",56),("e",57),("e",58),("s",59),("s",54)))
val rdd2 = sc.parallelize(Seq(("m",60),("m",65),("s",61),("s",62),("h",63),("h",64)))
val joinrdd = rdd1.join(rdd2)
joinrdd.collect

Array[(String, (Int, Int))] = Array((m,(55,60)), (m,(55,65)), (m,(56,60)), (m,(56,65)), (s,(59,61)), (s,(59,62)), (h,(63,64)), (s,(54,61)), (s,(54,62)))

Array[(String, (Int, Int))] = Array((m,(55,60)), (m,(55,65)), (m,(56,60)), (m,(56,65)), (s,(59,61)), (s,(59,62)), (e,(57,58)), (s,(54,61)), (s,(54,62)))

Array[(String, (Int, Int))] = Array((m,(55,60)), (m,(55,65)), (m,(56,60)), (m,(56,65)), (s,(59,61)), (s,(59,62)), (s,(54,61)), (s,(54,62)))

None of the mentioned
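
The join semantics described in question 2 can be explored without a cluster. The following is a minimal sketch in plain Scala collections, assuming an inner join on keys; `JoinSketch` and `localJoin` are hypothetical names introduced for illustration and are not part of the Spark API. It reuses the input pairs from the question.

```scala
object JoinSketch {
  // Hypothetical helper, not a Spark API: an inner join on keys over local
  // sequences, pairing every left value with every right value per key.
  def localJoin[K, V, W](left: Seq[(K, V)], right: Seq[(K, W)]): Seq[(K, (V, W))] =
    for {
      (k, v)  <- left
      (rk, w) <- right
      if rk == k
    } yield (k, (v, w))

  // Same input pairs as rdd1 and rdd2 in the question.
  val left: Seq[(String, Int)] =
    Seq(("m", 55), ("m", 56), ("e", 57), ("e", 58), ("s", 59), ("s", 54))
  val right: Seq[(String, Int)] =
    Seq(("m", 60), ("m", 65), ("s", 61), ("s", 62), ("h", 63), ("h", 64))

  val joined: Seq[(String, (Int, Int))] = localJoin(left, right)

  def main(args: Array[String]): Unit = {
    // Keys that appear on only one side contribute no pairs to an inner join.
    joined.foreach(println)
    println(s"pairs: ${joined.size}, keys: ${joined.map(_._1).distinct.mkString(",")}")
  }
}
```

On a real cluster, the order of elements returned by collect after a join depends on partitioning, so results are best compared as sets rather than position by position.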

3) Consider the following statements in the context of Spark:

1 point

Statement 1: Spark improves efficiency through in-memory computing primitives and general
computation graphs.

Statement 2: Spark improves usability through high-level APIs in Java, Scala, Python and also
provides an interactive shell.

Only statement 1 is true

Only statement 2 is true

Both statements are true

Both statements are false

4) True or False?
1 point

Resilient Distributed Datasets (RDDs) are fault-tolerant and immutable.

True

False

5) Which of the following is not a NoSQL database? 1 point

HBase

Cassandra

SQL Server

None of the mentioned

6) True or False?
1 point

Apache Spark can potentially run batch-processing programs up to 100 times faster than Hadoop
MapReduce in memory, or 10 times faster on disk.

True

False

7) ______________ leverages Spark Core's fast scheduling capability to perform streaming analytics. 1 point

MLlib

Spark Streaming

GraphX

RDDs

8) ____________________ is a distributed graph processing framework on top of Spark. 1 point

MLlib

Spark Streaming

GraphX

All of the mentioned

9) Point out the incorrect statement in the context of Cassandra: 1 point

It is a centralized key-value store

It was originally designed at Facebook

It is designed to handle large amounts of data across many commodity servers,
providing high availability with no single point of failure

It uses a ring-based DHT (Distributed Hash Table) but without finger tables or routing
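
The last option refers to a ring-based DHT without finger tables. A minimal sketch of what that lookup can look like, assuming every node holds the full token-to-node map and can therefore find a key's owner locally in one step: `RingSketch`, the node names, the token values, and the toy hash below are all hypothetical and chosen only for illustration; a real deployment uses a partitioner over a much larger token space.

```scala
import scala.collection.immutable.TreeMap

object RingSketch {
  // Hypothetical toy ring: token -> node name (values are made up).
  val ring: TreeMap[Int, String] =
    TreeMap(0 -> "node-A", 25 -> "node-B", 50 -> "node-C", 75 -> "node-D")

  // Toy hash into [0, 100); an assumption for illustration only.
  def token(key: String): Int = java.lang.Math.floorMod(key.hashCode, 100)

  // The owner is the first node whose token is >= the key's token,
  // wrapping around to the lowest token past the end of the ring.
  def ownerOfToken(t: Int): String = {
    val it = ring.iteratorFrom(t)
    if (it.hasNext) it.next()._2 else ring.head._2
  }

  // No hop-by-hop routing: the full map is consulted directly.
  def owner(key: String): String = ownerOfToken(token(key))

  def main(args: Array[String]): Unit = {
    Seq(10, 25, 60, 90).foreach(t => println(s"token $t -> ${ownerOfToken(t)}"))
  }
}
```

Because the whole map is available to the node handling a request, no chain of intermediate nodes is needed to locate a key, which is the contrast with finger-table designs.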

10) Consider the following statements:

1 point

Statement 1: Scale out means grow your cluster capacity by replacing with more powerful
machines.

Statement 2: Scale up means incrementally grow your cluster capacity by adding more COTS
machines (Components Off the Shelf).

Only statement 1 is true

Only statement 2 is true

Both statements are false

Both statements are true

You may submit any number of times before the due date. The final submission will be
considered for grading.

https://onlinecourses.nptel.ac.in/noc21_cs86/unit?unit=33&assessment=94
