Big Data Multiple-Choice Quiz
1. Question
Which of the following is a platform for analyzing large data sets that consists of a high-level language
for expressing data analysis programs?
Pig Latin
Oozie
Pig
Hive
2. Question
Pig Latin scripting language is not only a higher-level data flow language but also has operators
similar to
JSON
SQL
XML
3. Question
Which of the following is data flow scripting language for analyzing unstructured data?
Mahout
Hive
Pig
4. Question
Which of the following commands is used to show the values assigned to keys used in Pig?
Set
Declare
Display
5. Question
Use the __________ command to run a Pig script that can interact with the Grunt shell (interactive
mode).
Fetch
Declare
Run
6. Question
Which of the following command can be used for debugging?
Exec
Execute
Error
Throw
7. Question
____________ method will be called by Pig both in the front end and back end to pass a unique
signature to the Loader.
relativeToAbsolutePath()
setUdfContextSignature()
getCacheFiles()
getShipFiles()
8. Question
Which of the following is a framework for collecting and storing script-level statistics for Pig Latin.
Pig Stats
PStatistics
Pig Statistics
9. Question
Which among the following is simple xUnit framework that enables you to easily test your Pig scripts.
PigUnit
PigXUnit
PigUnitX
10. Question
Which of the following will compile the Pigunit?
$pig_trunk ant pigunit-jar
$pig_tr ant pigunit-jar
$pig_ ant pigunit-jar
11. Question
PigUnit runs in Pig’s _______ mode by default.
Local
Tez
MapReduce
12. Question
Pig operates in mainly how many modes?
2
3
4
5
13. Question
You can run Pig in batch mode using
Pig shell command
Pig scripts
Pig options
14. Question
Which of the following function is used to read data in PIG?
WRITE
READ
LOAD
15. Question
You can run Pig in interactive mode using which of the following shells?
Grunt
FS
HDFS
1. Question
Which of the following will run pig in local mode?
$ pig -x tez_local
$ pig -x local
$ pig
None of the above
2. Question
Which of the following platform is used for constructing data flows for extract, transform, and load
(ETL) processing and analysis of large datasets.
Pig Latin
Pig
Oozie
Hive
3. Question
Which of the following component is of Pig Execution Environment?
Pig Scripts
Parser
Optimizer
All of the above
4. Question
Which among the following is the way of executing Pig script
Embedded Script
Grunt Shell
Script File
All of the above
5. Question
Which of the following is diagnostic operators in Pig
DUMP
DESCRIBE
EXPLAIN
All of the above
6. Question
'ILLUSTRATE' runs a MapReduce job
False
True
7. Question
Which of the following is relational operators in Pig.
DUMP
DISTINCT
DESCRIBE
All of the above
8. Question
Which of the following is execution modes available in Pig
Local Mode
Map Mode
Reduce Mode
None of the above
9. Question
Pig script is
Case sensitive
Case insensitive
Both the above
None of the above
10. Question
Collection of Tuples is called
Map
Bag
Tuples
All of the above
11. Question
Apache Pig reduces the length of code by using a multi-query approach
True
False
12. Question
Which of the following is the feature of PIG
1. You can run Pig in interactive mode using the ______ shell
Grunt
HDFS
FS
Hadoop
bag
All of these
5. Multiple choice
What are the different complex data types in PIG
map
tuple
bag
All of these
pig
hive
9. Multiple choice
Pig operates in mainly how many modes?
2
3
4
5
oozie
hive
pig latin
14. Multiple choice
Which of the following component is of Pig Execution Environment?
pig script
parser
optimizer
all of mentioned
15. Multiple choice
Which among the following is the way of executing Pig script
embedded script
grunt shell
script file
all of the above
16. Multiple choice
Which of the following is execution modes available in Pig
local mode
map mode
reduce mode
none of the above
17. Multiple choice
Collection of Tuples is called
TUPLE
MAP
BAG
ALL OF THE ABOVE
Extensibility
Optimization opportunities
All of the above
19. Multiple choice
Which among the following is complex data types supported by Pig Latin.
TUPLE
BAG
MAP
ALL OF THE ABOVE
1. Question
The results of a hive query can be stored as
Local File
HDFS file
Both the above
Can not be stored
2. Question
If the database contains some tables then it can be forced to drop without dropping the tables by using
the keyword
RESTRICT
OVERWRITE
F DROP
CASCADE
3. Question
Users can pass configuration information to the SerDe using
SET SERDEPROPERTIES
WITH SERDEPROPERTIES
BY SERDEPROPERTIES
CONFIG SERDEPROPERTIES
4. Question
The property set to run hive in local mode as true so that it runs without creating a mapreduce job is
hive.exec.mode.local.auto
hive.exec.mode.local.override
hive.exec.mode.local.settings
hive.exec.mode.local.config
5. Question
Which kind of keys(CONSTRAINTS) Hive can have?
Primary Keys
Foreign Keys
Unique Keys
None of the above
6. Question
What is the disadvantage of using too many partitions in Hive tables?
It slows down the namenode
Storage space is wasted
Join queries become slow
All of the above
7. Question
The default delimiter in hive to separate the element in STRUCT is
'\001'
'\002'
'\003'
'\004'
8. Question
By default when a database is dropped in Hive
The tables are also deleted
The directory is deleted if there are no tables
The HDFS blocks are formatted
None of the above
9. Question
The main advantage of creating table partition is
Effective storage memory utilization
Faster query performance
Less RAM required by namenode
Simpler query syntax
10. Question
If the schema of the table does not match with the data types present in the file containing the table
then Hive
Automatically drops the file
Automatically corrects the data
Reports Null values for mismatched data
Does not allow any query to run on the table
11. Question
A view in Hive can be seen by using
SHOW TABLES
SHOW VIEWS
DESCRIBE VIEWS
VIEW VIEWS
12. Question
If an Index is dropped then
The underlying table is also dropped
The directory containing the index is deleted
The underlying table is not dropped
Error is thrown by hive
13. Question
Which file controls the logging of Mapreduce Tasks?
hive-log4j.properties
hive-exec-log4j.properties
hive-cli-log4j.properties
hive-create-log4j.properties
14. Question
What Hive can not offer
Storing data in tables and columns
Online transaction processing
Handling date time data
Partitioning stored data
15. Question
To see the partitions keys present in a Hive table the command used is
Describe
Describe extended
Show
Show extended
1. Question
For optimizing join of three tables, the largest sized tables should be placed as
Remote
HTTP
Embedded
Interactive
7. Question
Which of the following data type is supported by Hive?
map
record
string
enum
8. Question
Which of the following is not a complex data type in Hive?
Matrix
Array
Map
STRUCT
9. Question
Each database created in hive is stored as
A file
A directory
A HDFS block
A jar file
10. Question
When a partition is archived in Hive it
Reduces space through compression
Reduces the length of records
Reduces the number of files stored
Reduces the block size
11. Question
When a Hive query joins 3 tables, How many mapreduce jobs will be started?
1
2
3
12. Question
The reverse() function reverses a string passed to it in a Hive query. This is an example of
Standard UDF
Aggregate UDF
Table Generating UDF
None of the above
13. Question
Hive can be accessed remotely by using programs written in C++, Ruby etc, over a single port. This is
achieved by using
HiveServer
HiveMetaStore
HiveWeb
Hive Streaming
14. Question
The thrift service component in hive is used for
Moving hive data files between different servers
Use multiple hive versions
Submit hive queries from a remote client
Installing hive
15. Question
The query "SHOW DATABASES LIKE 'h.*';" gives the output with database names
Containing h in their name
Starting with h
Ending with h
Containing 'h.'
1. Question
Is it possible to change the default location of Managed Tables in Hive
Yes
No
2. Question
Which among the following command is used to change the settings within Hive session
RESET
SET
3. Question
How to change the column data type in Hive
ALTER and CHANGE
ALTER
CHANGE
4. Question
Which of the following is the data types in Hive
ARRAY
STRUCT
MAP
All of the above
5. Question
Which of the following is the Key components of Hive Architecture
User Interface
Metastore
Driver
All of the above
6. Question
Are multiline comments supported in Hive?
Yes
No
7. Question
Can we run UNIX shell commands from Hive?
Yes
No
8. Question
Which of the following is the commonly used Hive services
Command Line Interface (cli)
Hive Web Interface (hwi)
HiveServer (hiveserver)
All of the above
9. Question
Explode in Hive is used to convert complex data types into desired table formats.
True
False
10. Question
Is it possible to overwrite Hadoop MapReduce configuration in Hive?
Yes
No
11. Question
Point out the correct statement
Hive is not a relational database, but a query engine that supports the parts of SQL
Hive is a relational database with SQL support
Pig is a relational database with SQL support
None of the above
12. Question
Which of the following is used to analyse data stored in Hadoop cluster using SQL like query
Mahout
Hive
Pig
All of the above
13. Question
If an Index is dropped then
HiveServer2
HiveServer3
HiveServer4
None of the mentioned
22. Multiple choice
The below expression in the where clause RLIKE '.*(Chicago|Ontario).*'; gives the result which
match
words containing both Chicago and Ontario
words containing either Chicago or Ontario
words Ending with Chicago or Ontario
words starting with Chicago or Ontario
All of these
https://www.tutorialspoint.com/hive/hive_online_quiz.htm
Q 1 - In Hive, when the schema does not match the file content
A - It cannot read the file
B - It reads only the string data type
C - it throws an error and stops reading the file
D - It returns null values for mismatched fields.
Answer : D
Explanation
Instead of returning error, Hive returns null values for mismatch between schema and actual data.
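The schema-on-read behavior described in this explanation can be sketched in Python. This is only an illustrative model, not Hive's implementation: the `read_row` helper, the comma delimiter, and the two-column schema are hypothetical names and choices made for the example.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical schema: (column name, converter) pairs standing in for a table definition.
Schema = Sequence[Tuple[str, Callable[[str], object]]]


def read_row(line: str, schema: Schema, delimiter: str = ",") -> List[Optional[object]]:
    """Parse one delimited line, yielding None (NULL) for fields that do not fit the schema."""
    fields = line.split(delimiter)
    row: List[Optional[object]] = []
    for index, (_name, convert) in enumerate(schema):
        if index >= len(fields):
            row.append(None)  # missing field -> NULL
            continue
        try:
            row.append(convert(fields[index]))
        except ValueError:
            row.append(None)  # value does not match the declared type -> NULL
    return row


schema = [("id", int), ("price", float)]
print(read_row("1,9.5", schema))    # [1, 9.5]
print(read_row("abc,9.5", schema))  # [None, 9.5]
```

The second call shows the point of the question: the bad `id` value does not stop the read, it simply comes back as a null.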
Q 2 - If the database contains some tables then it can be forced to drop without dropping the
tables by using the keyword
A - RESTRICT
B - OVERWRITE
C - F DROP
D - CASCADE
Answer : D
Explanation
CASCADE clause drops the table first before dropping the database
Q 3 - The "strict" mode when querying a partitioned table is used to
A - stop queries of partitioned tables without a where clause
B - automatically add a where clause to the queries on a partitioned table
C - Limit the result of a query on partitioned table to 100
D - Ignore any error in the name of the partitioned table
Answer : A
Explanation
The strict mode is designed to avoid long running jobs.
Q 4 - When a partition is archived in Hive it
A - Reduces space through compression
B - Reduces the block size
C - reduces the length of records
D - reduces the number of files stored
Answer : D
Explanation
Archiving merges the files into one directory.
Q 5 - To select all columns starting with the word 'Sell' from the table GROSS_SELL the query
is
A - select '$Sell*' from GROSS_SELL
B - select 'Sell*' from GROSS_SELL
C - select 'sell.*' from GROSS_SELL
D - select 'sell[*]' from GROSS_SELL
Answer : C
Explanation
Hive supports java based regular expression for querying its metadata.
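As a rough illustration of regex-based column selection, the sketch below filters a list of column names with a pattern. It makes several assumptions: Python's `re` module stands in for Java regular expressions (the two agree for a pattern as simple as `sell.*`), the pattern is assumed to match the whole column name case-insensitively, and the `select_columns` helper and the column names are hypothetical.

```python
import re
from typing import List


def select_columns(columns: List[str], pattern: str) -> List[str]:
    """Return the column names that the whole pattern matches, ignoring case."""
    compiled = re.compile(pattern, re.IGNORECASE)
    return [name for name in columns if compiled.fullmatch(name)]


# Hypothetical column names for a table such as GROSS_SELL.
columns = ["sell_date", "sell_amount", "buyer", "resell_flag"]
print(select_columns(columns, "sell.*"))  # ['sell_date', 'sell_amount']
```

Because the whole name must match, `resell_flag` is excluded even though it contains "sell".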
Q 6 - The name of a view in Hive
A - can be same as the name of another table in the same database
B - cannot be same as the name of another table in the same database
C - cannot contain a number
D - cannot be more than 10 character long
Answer : B
Explanation
Views and tables are treated similarly in the hive metadata
Q 7 - The identifiers in HiveQL are
A - case sensitive
B - case insensitive
C - sometimes case sensitive
D - Depends on the Hadoop environment
Answer : B
Explanation
Hive is case insensitive
Q 8 - Setting the local mode execution to true causes
A - All tasks are executed on data available closest to the namenode
B - All tasks are executed only on a single machine
C - All the data files are cached on a datanode before query execution
D - Random data is used for query execution
Answer : B
Explanation
Local mode avoids creating a MapReduce job by running the whole job on a single machine.
Q 9 - A Table Generating Function is a Function that
A - Takes one or more columns from a row and returns a single value
B - Takes one or more columns from many rows and returns a single value
C - Take zero or more inputs and produce multiple columns or rows of output
D - Detects the type of input programmatically and provides appropriate response.
Q 10 - To add a new user defined Function permanently to Hive, we need to
A - Create a new version of Hive
B - Add the .class Java code to FunctionRegistry
C - Add the .jar Java code to FunctionRegistry
D - Add the .jar java code to $HOME/.hiverc
Answer : B
Explanation
FunctionRegistry holds the list of all permanent functions
https://www.freshersnow.com/hive-quiz/
Top 60 Hive Multiple Choice Questions | Practice Online Quiz
1. What is Hive?
A. A data processing tool
B. A database management system
C. A distributed computing system
D. A cloud computing service
Answer: A. A data processing tool
Explanation: Hive is a data processing tool that provides an SQL-like interface to Hadoop, allowing
users to query and analyze large datasets stored in Hadoop Distributed File System (HDFS).
2. Which of the following is NOT a data warehouse system that can be integrated with Hive?
A. Apache HBase
B. Apache Cassandra
C. Apache Druid
D. Apache Kylin
Answer: B. Apache Cassandra
Explanation: Hive can integrate with various data warehouse systems, including Apache HBase,
Apache Druid, and Apache Kylin, but not Apache Cassandra, which is a NoSQL database.
3. What is the language used to write Hive queries?
A. Java
B. Python
C. SQL
D. HiveQL
Answer: D. HiveQL
Explanation: Hive provides a SQL-like interface called HiveQL, which allows users to write queries
to analyze data stored in Hadoop.
4. Which of the following is a Hive built-in function for filtering data based on multiple
conditions?
A. BETWEEN
B. IN
C. LIKE
D. CASE
Answer: D. CASE
Explanation: The CASE function in Hive allows users to filter data based on multiple conditions. It
works like a switch statement in other programming languages.
5. Which of the following commands is used to create a new database in Hive?
A. CREATE TABLE
B. CREATE PARTITION
C. CREATE DATABASE
D. CREATE VIEW
Answer: C. CREATE DATABASE
Explanation: The CREATE DATABASE command is used to create a new database in Hive.
6. What is the default file format used by Hive to store data in HDFS?
A. CSV
B. Avro
C. Parquet
D. ORC
Answer: D. ORC
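Explanation: ORC (Optimized Row Columnar) is the expected answer among these options. In a stock
installation, however, the hive.default.fileformat property is set to TextFile, so plain text is the
default unless that property is changed.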
7. What is a Hive partition?
A. A subset of data in a Hive table
B. A type of Hive table
C. A directory in HDFS
D. A Hive database
Answer: A. A subset of data in a Hive table
Explanation: A Hive partition is a subset of data in a Hive table that is based on a specific column
value.
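To make the idea of a partition concrete, the sketch below groups rows by the value of a partition column, using the `key=value` directory naming that Hive uses for partition subdirectories. It is a simplified in-memory model; the `partition_rows` helper, the `date_col` column, and the sample rows are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def partition_rows(rows: List[Tuple[int, str, str]]) -> Dict[str, List[Tuple[int, str]]]:
    """Group (col1, col2, date_col) rows under a key=value directory name per partition."""
    partitions: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for col1, col2, date_col in rows:
        # The partition value is encoded in the directory name, not stored with the row.
        partitions[f"date_col={date_col}"].append((col1, col2))
    return dict(partitions)


rows = [(1, "a", "2024-01-01"), (2, "b", "2024-01-02"), (3, "c", "2024-01-01")]
layout = partition_rows(rows)
for directory in sorted(layout):
    print(directory, layout[directory])
```

A query that filters on `date_col` only needs to read the matching directory, which is why partitioning speeds up such queries.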
8. Which of the following commands is used to create a Hive table?
A. CREATE DATABASE
B. CREATE PARTITION
C. CREATE VIEW
D. CREATE TABLE
Answer: D. CREATE TABLE
Explanation: The CREATE TABLE command is used to create a new table in Hive.
9. Which of the following is NOT a supported file format for storing data in Hive?
A. CSV
B. JSON
C. XML
D. YAML
Answer: D. YAML
Explanation: Hive supports various file formats for storing data, including CSV, JSON, and XML,
but not YAML.
10. What is Hive metastore?
A. A tool for managing Hive databases
B. A file format for storing Hive metadata
C. A component that stores metadata for Hive tables and partitions
D. A Hive server that processes queries
Answer: C. A component that stores metadata for Hive tables and partitions
Explanation: Hive metastore is a component that stores metadata for Hive tables and partitions,
including table schemas, column definitions, and partition locations.
11. Which of the following commands is used to load data into a Hive table?
A. INSERT INTO
B. LOAD DATA
C. CREATE TABLE
D. ALTER TABLE
Answer: B. LOAD DATA
Explanation: The LOAD DATA command is used to load data into a Hive table from an external file.
12. Which of the following is NOT a data type supported by Hive?
A. BOOLEAN
B. CHAR
C. ARRAY
D. FLOAT
Answer: B. CHAR
Explanation: Hive supports various data types, including BOOLEAN, ARRAY, and FLOAT. CHAR is the
expected answer, although it holds only for releases before Hive 0.13.0, which added the CHAR type.
13. What is the purpose of Hive’s EXPLAIN command?
A. To execute a Hive query
B. To display the query plan for a Hive query
C. To debug a Hive query
D. To optimize a Hive query
Answer: B. To display the query plan for a Hive query
Explanation: The EXPLAIN command in Hive is used to display the query plan for a Hive query,
showing how the query will be executed and which operations will be used.
14. Which of the following commands is used to remove a Hive table?
A. DROP DATABASE
B. DROP PARTITION
C. DROP VIEW
D. DROP TABLE
Answer: D. DROP TABLE
Explanation: The DROP TABLE command is used to remove a Hive table.
15. Which of the following is NOT a Hive function for manipulating strings?
A. SUBSTRING
B. LENGTH
C. CONCAT
D. ADD
Answer: D. ADD
Explanation: Hive provides various built-in functions for manipulating strings, including
SUBSTRING, LENGTH, and CONCAT, but not ADD.
16. Which of the following commands is used to create an external table in Hive?
A. CREATE TABLE
B. CREATE EXTERNAL TABLE
C. CREATE MANAGED TABLE
D. CREATE TEMPORARY TABLE
Answer: B. CREATE EXTERNAL TABLE
Explanation: The CREATE EXTERNAL TABLE command is used to create an external table in
Hive, which points to data stored outside of Hive.
17. What is the purpose of Hive’s GROUP BY clause?
A. To group data based on specific column values
B. To sort data based on specific column values
C. To filter data based on specific column values
D. To join multiple tables based on specific column values
Answer: A. To group data based on specific column values
Explanation: The GROUP BY clause in Hive is used to group data based on specific column values,
allowing users to aggregate and summarize data.
18. Which of the following commands is used to rename a Hive table?
A. RENAME TABLE
B. ALTER TABLE
C. UPDATE TABLE
D. MODIFY TABLE
Answer: B. ALTER TABLE
Explanation: A Hive table is renamed with ALTER TABLE old_name RENAME TO new_name. Hive has no
standalone RENAME TABLE command.
19. Which of the following is NOT a supported join type in Hive?
A. INNER JOIN
B. LEFT OUTER JOIN
C. RIGHT OUTER JOIN
D. FULL OUTER JOIN
Answer: D. FULL OUTER JOIN
Explanation: FULL OUTER JOIN is the answer this quiz expects. Note, however, that Hive supports INNER
JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, and FULL OUTER JOIN, so none of the listed options is actually
unsupported.
20. Which of the following commands is used to add a new column to a Hive table?
A. ADD COLUMN
B. ALTER COLUMN
C. MODIFY COLUMN
D. CHANGE COLUMN
Answer: A. ADD COLUMN
Explanation: New columns are added with the ADD COLUMNS clause of ALTER TABLE, for example
ALTER TABLE my_table ADD COLUMNS (col3 INT).
21. Which of the following is NOT a Hive data format for storing data in HDFS?
A. ORC
B. Parquet
C. Avro
D. JSON
Answer: D. JSON
Explanation: Hive supports various data formats for storing data in HDFS, including ORC, Parquet,
and Avro, but not JSON.
22. What is the purpose of Hive’s HAVING clause?
A. To group data based on specific column values
B. To sort data based on specific column values
C. To filter data based on specific column values
D. To limit the number of results returned by a query
Answer: C. To filter data based on specific column values
Explanation: The HAVING clause in Hive is used to filter data based on specific column values after
the GROUP BY clause has been applied.
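The order of operations described here, grouping first and filtering the groups afterwards, can be sketched in Python. This is a minimal model under stated assumptions: `sum_by_group`, the threshold `min_total`, and the sample rows are hypothetical and only mirror a query of the form SELECT col1, SUM(col2) FROM my_table GROUP BY col1 HAVING SUM(col2) > 10.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def sum_by_group(rows: List[Tuple[str, int]], min_total: int) -> Dict[str, int]:
    """Model GROUP BY col1 with SUM(col2), then keep groups whose total exceeds min_total."""
    totals: Dict[str, int] = defaultdict(int)
    for col1, col2 in rows:
        totals[col1] += col2  # GROUP BY col1, SUM(col2)
    # HAVING runs on the aggregated groups, not on the input rows.
    return {key: total for key, total in totals.items() if total > min_total}


rows = [("a", 5), ("b", 20), ("a", 8), ("c", 1)]
print(sum_by_group(rows, 10))  # {'a': 13, 'b': 20}
```

Neither row of group `a` exceeds 10 on its own, yet the group passes because its sum does. A WHERE clause, which filters rows before grouping, could not express that condition.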
23. Which of the following is a valid way to insert data into a Hive table?
A. INSERT INTO my_table VALUES (1, 'hello', true)
B. LOAD DATA INPATH '/path/to/data' INTO TABLE my_table
C. COPY FROM '/path/to/data' TO TABLE my_table
D. IMPORT DATA '/path/to/data' INTO TABLE my_table
Answer: B. LOAD DATA INPATH '/path/to/data' INTO TABLE my_table
Explanation: The LOAD DATA INPATH command is used to insert data into a Hive table from an
external file.
24. Which of the following commands is used to list all of the tables in a Hive database?
A. SHOW DATABASES
B. SHOW TABLES
C. DESCRIBE DATABASE
D. DESCRIBE TABLE
Answer: B. SHOW TABLES
Explanation: The SHOW TABLES command is used to list all of the tables in a Hive database.
25. Which of the following is NOT a Hive function for working with dates and times?
A. YEAR
B. MONTH
C. HOUR
D. CONCAT
Answer: D. CONCAT
Explanation: Hive provides various built-in functions for working with dates and times, including
YEAR, MONTH, and HOUR, but not CONCAT.
26. Which of the following is a valid Hive query to select all of the columns from a table called
my_table?
A. SELECT * FROM my_table
B. SELECT ALL FROM my_table
C. SELECT COLUMNS FROM my_table
D. SELECT DATA FROM my_table
Answer: A. SELECT * FROM my_table
Explanation: The SELECT * FROM command is used to select all of the columns from a table in
Hive.
27. Which of the following commands is used to add a new partition to a Hive table?
A. ADD PARTITION
B. ALTER PARTITION
C. MODIFY PARTITION
D. CHANGE PARTITION
Answer: A. ADD PARTITION
Explanation: The ADD PARTITION command is used to add a new partition to a Hive table.
28. Which of the following is a valid way to create a Hive table with a custom delimiter?
A. CREATE TABLE my_table (col1 INT, col2 STRING) DELIMITER ','
B. CREATE TABLE my_table (col1 INT, col2 STRING) ROW FORMAT DELIMITED FIELDS
TERMINATED BY ','
C. CREATE TABLE my_table (col1 INT, col2 STRING) TERMINATED BY ','
D. CREATE TABLE my_table (col1 INT, col2 STRING) DELIMITED BY ','
Answer: B. CREATE TABLE my_table (col1 INT, col2 STRING) ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
Explanation: The ROW FORMAT DELIMITED FIELDS TERMINATED BY command is used to
create a Hive table with a custom delimiter.
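What the field delimiter means for a line of text data can be sketched as follows. The example assumes the two-column table from the question (col1 INT, col2 STRING); `parse_line` is a hypothetical helper, and the default of `\x01` reflects the '\001' (Ctrl-A) character that Hive uses as its field separator when none is specified.

```python
from typing import Tuple


def parse_line(line: str, field_delimiter: str = "\x01") -> Tuple[int, str]:
    """Split one text line into (col1 INT, col2 STRING) using the given field delimiter."""
    fields = line.split(field_delimiter)
    # Only the first two fields are used; extra fields are ignored in this sketch.
    return int(fields[0]), fields[1]


print(parse_line("42,hello", field_delimiter=","))  # (42, 'hello')
print(parse_line("42\x01hello"))                    # (42, 'hello')
```

Declaring FIELDS TERMINATED BY ',' corresponds to the first call: the same bytes on disk are interpreted differently depending on the delimiter in the table definition.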
29. Which of the following is a valid Hive query to select the top 10 rows from a table called
my_table?
A. SELECT * FROM my_table LIMIT 10
B. SELECT TOP 10 FROM my_table
C. SELECT FIRST 10 FROM my_table
D. SELECT ROW
Answer: A. SELECT * FROM my_table LIMIT 10
Explanation: The LIMIT clause is used to limit the number of rows returned by a Hive query, and it
can be used with the SELECT statement to select the top N rows from a table.
30. Which of the following commands is used to drop a Hive table?
A. DROP TABLE my_table
B. REMOVE TABLE my_table
C. DELETE TABLE my_table
D. DESTROY TABLE my_table
Answer: A. DROP TABLE my_table
Explanation: The DROP TABLE command is used to drop a Hive table.
31. Which of the following commands is used to list all of the databases in Hive?
A. SHOW DATABASES
B. LIST DATABASES
C. DESCRIBE DATABASES
D. DISPLAY DATABASES
Answer: A. SHOW DATABASES
Explanation: The SHOW DATABASES command is used to list all of the databases in Hive.
32. Which of the following is a valid way to create a Hive table that is partitioned by date?
A. CREATE TABLE my_table (col1 INT, col2 STRING) PARTITIONED BY (date_col DATE)
B. CREATE TABLE my_table (col1 INT, col2 STRING) PARTITIONED ON date_col
C. CREATE TABLE my_table (col1 INT, col2 STRING) DATE PARTITIONED
D. CREATE TABLE my_table (col1 INT, col2 STRING) PARTITIONED BY date_col
Answer: A. CREATE TABLE my_table (col1 INT, col2 STRING) PARTITIONED BY (date_col DATE)
Explanation: The PARTITIONED BY clause takes the partition column and its type in parentheses and
creates a Hive table that is partitioned by that column, such as a date column.
33. Which of the following commands is used to modify the structure of a Hive table?
A. MODIFY TABLE
B. ALTER TABLE
C. CHANGE TABLE
D. UPDATE TABLE
Answer: B. ALTER TABLE
Explanation: The ALTER TABLE command is used to modify the structure of a Hive table, such as
adding or dropping columns.
34. Which of the following is a valid Hive query to select the distinct values of a column from a
table called my_table?
A. SELECT DISTINCT col1 FROM my_table
B. SELECT UNIQUE col1 FROM my_table
C. SELECT ALL DISTINCT col1 FROM my_table
D. SELECT DISTINCT ALL col1 FROM my_table
Answer: A. SELECT DISTINCT col1 FROM my_table
Explanation: The SELECT DISTINCT command is used to select the distinct values of a column
from a table in Hive.
35. Which of the following commands is used to set the delimiter for a Hive query output file?
A. SET DELIMITER
B. SET TERMINATOR
C. SET OUTPUT DELIMITER
D. SET OUTPUT TERMINATOR
Answer: C. SET OUTPUT DELIMITER
Explanation: The SET OUTPUT DELIMITER command is used to set the delimiter for a Hive query
output file.
36. Which of the following is a valid Hive query to join two tables called table1 and table2 on a
common column called col1?
A. SELECT * FROM table1, table2 WHERE table1.col1 = table2.col1
B. SELECT * FROM table1 JOIN table2 ON table1.col1 = table2.col1
C. SELECT * FROM table1 INNER JOIN table2 ON table1.col1 = table2.col1
D. All of the above
Answer: D. All of the above
Explanation: All of the above options are valid ways to join two tables in Hive.
37. Which of the following is a valid Hive query to filter rows in a table called my_table where
the value of col1 is greater than 10?
A. SELECT * FROM my_table WHERE col1 > 10
B. SELECT * FROM my_table HAVING col1 > 10
C. SELECT * FROM my_table FILTER col1 > 10
D. All of the above
Answer: A. SELECT * FROM my_table WHERE col1 > 10
Explanation: The WHERE clause is used to filter rows in Hive, and the > operator can be used to
compare the value of a column to a specific value.
38. Which of the following is a valid Hive query to group the rows in a table called my_table by
the values in col1 and calculate the sum of col2 for each group?
A. SELECT col1, SUM(col2) FROM my_table GROUP BY col1
B. SELECT col1, AVG(col2) FROM my_table GROUP BY col1
C. SELECT col1, MAX(col2) FROM my_table GROUP BY col1
D. All of the above
Answer: A. SELECT col1, SUM(col2) FROM my_table GROUP BY col1
Explanation: The GROUP BY clause is used to group the rows in Hive by the values in one or more
columns, and aggregate functions like SUM can be used to calculate the sum of another column for
each group.
39. Which of the following commands is used to create a Hive database?
A. CREATE DATABASE my_db
B. MAKE DATABASE my_db
C. ADD DATABASE my_db
D. BUILD DATABASE my_db
Answer: A. CREATE DATABASE my_db
Explanation: The CREATE DATABASE command is used to create a Hive database.
40. Which of the following is a valid Hive query to order the rows in a table called my_table by
the values in col1 in descending order?
A. SELECT * FROM my_table ORDER BY col1 DESC
B. SELECT * FROM my_table SORT BY col1 DESC
C. SELECT * FROM my_table ARRANGE BY col1 DESC
D. SELECT * FROM my_table GROUP BY col1 DESC
Answer: A. SELECT * FROM my_table ORDER BY col1 DESC
Explanation: The ORDER BY clause is used to order the rows in Hive by the values in one or more
columns, and the DESC keyword can be used to order the rows in descending order.
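The effect of ORDER BY col1 DESC, optionally combined with a LIMIT, can be modeled in a few lines of Python. The `order_desc` helper and the sample rows are hypothetical and serve only to illustrate the ordering semantics.

```python
from typing import List, Optional, Tuple


def order_desc(rows: List[Tuple[int, str]], limit: Optional[int] = None) -> List[Tuple[int, str]]:
    """Sort rows by col1 (the first field) in descending order, then apply an optional limit."""
    ordered = sorted(rows, key=lambda row: row[0], reverse=True)
    return ordered if limit is None else ordered[:limit]


rows = [(3, "c"), (10, "a"), (7, "b")]
print(order_desc(rows))           # [(10, 'a'), (7, 'b'), (3, 'c')]
print(order_desc(rows, limit=2))  # [(10, 'a'), (7, 'b')]
```

The limit is applied after sorting, which is what makes ORDER BY ... DESC LIMIT N a "top N" query rather than an arbitrary sample of N rows.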
41. Which of the following commands is used to load data into a Hive table from a file?
A. LOAD DATA my_table FROM '/path/to/file'
B. INSERT DATA my_table FROM '/path/to/file'
C. LOAD DATA INPATH '/path/to/file' INTO TABLE my_table
D. INSERT INTO my_table FROM '/path/to/file'
Answer: C. LOAD DATA INPATH '/path/to/file' INTO TABLE my_table
Explanation: The LOAD DATA INPATH command is used to load data into a Hive table from a file.
42. Which of the following is a valid Hive query to select the top 10 rows from a table called
my_table, ordered by the values in col1 in descending order?
A. SELECT * FROM my_table ORDER BY col1 DESC LIMIT 10
B. SELECT * FROM my_table ORDER BY col1 DESC FETCH FIRST 10 ROWS ONLY
C. SELECT * FROM my_table ORDER BY col1 DESC ROWS 10
D. SELECT * FROM my_table ORDER BY col1 DESC TOP 10
Answer: A. SELECT * FROM my_table ORDER BY col1 DESC LIMIT 10
Explanation: The LIMIT clause can be used with the SELECT statement to select the top N rows
from a table in Hive, and the ORDER BY clause can be used to order the rows by the values in col1.
43. Which of the following Hive functions is used to calculate the average value of a column?
A. SUM()
B. COUNT()
C. AVG()
D. MAX()
Answer: C. AVG()
Explanation: The AVG() function is used to calculate the average value of a column in Hive.
44. Which of the following commands is used to create a Hive table that is partitioned by the
values in a specific column?
A. CREATE TABLE my_table (col1 INT, col2 STRING) PARTITIONED BY (col3 INT)
B. CREATE TABLE my_table (col1 INT, col2 STRING) PARTITION col3 BY (INT)
C. CREATE TABLE my_table (col1 INT, col2 STRING) PARTITION BY col3 INT
D. CREATE TABLE my_table (col1 INT, col2 STRING) PARTITION (col3 INT)
Answer: A. CREATE TABLE my_table (col1 INT, col2 STRING) PARTITIONED BY (col3 INT)
Explanation: The PARTITIONED BY clause is used to create a Hive table that is partitioned by the
values in a specific column.
45. Which of the following Hive functions is used to calculate the maximum value of a column?
A. SUM()
B. COUNT()
C. AVG()
D. MAX()
Answer: D. MAX()
Explanation: The MAX() function is used to calculate the maximum value of a column in Hive.
46. Which of the following commands is used to drop a Hive database?
A. DROP DATABASE my_db
B. DELETE DATABASE my_db
C. REMOVE DATABASE my_db
D. ERASE DATABASE my_db
Answer: A. DROP DATABASE my_db
Explanation: The DROP DATABASE command is used to drop a Hive database.
47. Which of the following is a valid Hive query to join two tables called table1 and table2 on the
values in col1?
A. SELECT * FROM table1 JOIN table2 ON table1.col1 = table2.col1
B. SELECT * FROM table1 INNER JOIN table2 ON table1.col1 = table2.col1
C. SELECT * FROM table1 LEFT OUTER JOIN table2 ON table1.col1 = table2.col1
D. All of the above
Answer: D. All of the above
Explanation: All three of these queries are valid ways to join two tables in Hive.
48. Which of the following Hive functions is used to calculate the total number of rows in a
table?
A. SUM()
B. COUNT()
C. AVG()
D. MAX()
Answer: B. COUNT()
Explanation: The COUNT() function is used to calculate the total number of rows in a table in Hive.
49. Which of the following commands is used to insert data into a Hive table?
A. INSERT DATA INTO my_table VALUES (1, 'value1'), (2, 'value2')
B. INSERT INTO my_table VALUES (1, 'value1'), (2, 'value2')
C. INSERT my_table VALUES (1, 'value1'), (2, 'value2')
D. None of the above
Answer: B. INSERT INTO my_table VALUES (1, 'value1'), (2, 'value2')
Explanation: The INSERT INTO command is used to insert data into a Hive table.
50. Which of the following Hive functions is used to calculate the minimum value of a column?
A. SUM()
B. COUNT()
C. AVG()
D. MIN()
Answer: D. MIN()
Explanation: The MIN() function is used to calculate the minimum value of a column in Hive.
51. Which of the following commands is used to view the data in a Hive table?
A. SHOW DATA my_table
B. SELECT * FROM my_table
C. VIEW DATA my_table
D. DESCRIBE my_table
Answer: B. SELECT * FROM my_table
Explanation: The SELECT command is used to view the data in a Hive table.
52. Which of the following is a valid Hive query to filter rows in a table where col1 is equal to 1?
A. SELECT * FROM my_table WHERE col1 = 1
B. SELECT * FROM my_table HAVING col1 = 1
C. SELECT * FROM my_table GROUP BY col1 HAVING col1 = 1
D. None of the above
Answer: A. SELECT * FROM my_table WHERE col1 = 1
Explanation: The WHERE clause is used to filter rows in a Hive table based on a condition.
53. Which of the following Hive functions is used to concatenate two or more strings together?
A. CONCAT()
B. SUBSTR()
C. UPPER()
D. LOWER()
Answer: A. CONCAT()
Explanation: The CONCAT() function is used to concatenate two or more strings together in Hive.
54. Which of the following commands is used to view the structure of a Hive table?
A. SHOW my_table STRUCTURE
B. DESCRIBE my_table
C. VIEW my_table STRUCTURE
D. None of the above
Answer: B. DESCRIBE my_table
Explanation: The DESCRIBE command is used to view the structure of a Hive table.
55. Which of the following Hive functions is used to return a substring of a string?
A. CONCAT()
B. SUBSTR()
C. UPPER()
D. LOWER()
Answer: B. SUBSTR()
Explanation: The SUBSTR() function is used to return a substring of a string in Hive.
56. Which of the following commands is used to view the list of tables in a Hive database?
A. SHOW TABLES IN my_db
B. LIST TABLES my_db
C. DESCRIBE DATABASE my_db
D. None of the above
Answer: A. SHOW TABLES IN my_db
Explanation: The SHOW TABLES command lists the tables in a Hive database. The IN (or FROM) keyword names the database; without it, the current database is used.
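As a sketch, there are two common ways to list the tables of a database, where my_db is a placeholder name. Hive's documented form names the database with IN (or FROM).

```sql
-- List tables in a named database.
SHOW TABLES IN my_db;

-- Or switch to the database first, then list its tables.
USE my_db;
SHOW TABLES;
```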
57. Which of the following Hive functions is used to convert a string to uppercase?
A. CONCAT()
B. SUBSTR()
C. UPPER()
D. LOWER()
Answer: C. UPPER()
Explanation: The UPPER() function is used to convert a string to uppercase in Hive.
58. Which of the following Hive functions is used to convert a string to lowercase?
A. CONCAT()
B. SUBSTR()
C. UPPER()
D. LOWER()
Answer: D. LOWER()
Explanation: The LOWER() function is used to convert a string to lowercase in Hive.
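The four string functions from questions 53, 55, 57, and 58 can be tried side by side. This sketch uses literal strings, and assumes a Hive version that allows SELECT without a FROM clause (0.13 or later).

```sql
SELECT
  CONCAT('big', 'data')  AS joined,   -- 'bigdata'
  SUBSTR('hadoop', 1, 3) AS prefix,   -- 'had' (positions start at 1)
  UPPER('hive')          AS shouted,  -- 'HIVE'
  LOWER('HIVE')          AS lowered;  -- 'hive'
```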
59. Which of the following commands is used to create a new Hive table?
A. CREATE my_table
B. ADD my_table
C. CREATE TABLE my_table
D. None of the above
Answer: C. CREATE TABLE my_table
Explanation: The CREATE TABLE command is used to create a new Hive table.
60. Which of the following commands is used to load data into a Hive table from an external file?
A. LOAD DATA INPATH 'file_path' INTO TABLE my_table
B. LOAD DATA INTO TABLE my_table FROM 'file_path'
C. INSERT DATA INTO my_table FROM 'file_path'
D. None of the above
Answer: A. LOAD DATA INPATH 'file_path' INTO TABLE my_table
Explanation: The LOAD DATA INPATH command loads data into a Hive table from a file. Add LOCAL (LOAD DATA LOCAL INPATH) to read from the local file system instead of HDFS. Note that INFILE is MySQL syntax, not Hive syntax.
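A short sketch of loading a file into a table. Hive's LOAD DATA statement takes an INPATH clause, with LOCAL for files on the local file system. The paths and the table name my_table are placeholders.

```sql
-- Load from the local file system.
LOAD DATA LOCAL INPATH '/tmp/data.csv' INTO TABLE my_table;

-- Load from HDFS; without LOCAL the file is moved into the table's location.
LOAD DATA INPATH '/user/hive/staging/data.csv' INTO TABLE my_table;

-- OVERWRITE replaces the table's existing contents instead of appending.
LOAD DATA INPATH '/user/hive/staging/data.csv' OVERWRITE INTO TABLE my_table;
```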
https://www.sanfoundry.com/hadoop-questions-answers-introduction-hive/
1. Which of the following commands sets the value of a particular configuration variable (key)?
a) set -v
b) set <key>=<value>
c) set
d) reset
Answer: b
Explanation: set <key>=<value> sets the value of a configuration variable. Note that if you misspell the variable name, the CLI will not show an error.
2. Point out the correct statement.
a) Hive commands are non-SQL statements such as setting a property or adding a resource
b) Set -v prints a list of configuration variables that are overridden by the user or Hive
c) Set sets a list of variables that are overridden by the user or Hive
d) None of the mentioned
Answer: a
Explanation: Commands can be used in HiveQL scripts or directly in the CLI or Beeline.
3. Which of the following operators executes a shell command from the Hive shell?
a) |
b) !
c) ^
d) +
Answer: b
Explanation: The exclamation mark (!) placed before a command executes it as a shell command from the Hive shell.
4. Which of the following will remove the resource(s) from the distributed cache?
a) delete FILE[S] <filepath>*
b) delete JAR[S] <filepath>*
c) delete ARCHIVE[S] <filepath>*
d) all of the mentioned
Answer: d
Explanation: Delete command is used to remove existing resource.
5. Point out the wrong statement.
a) source FILE <filepath> executes a script file inside the CLI
b) bfs <bfs command> executes a dfs command from the Hive shell
c) hive is Query language similar to SQL
d) none of the mentioned
Answer: b
Explanation: dfs <dfs command> executes a dfs command from the Hive shell.
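The commands from questions 1, 3, and 5 might look like this in a Hive CLI session. This is a sketch only: the configuration key is a real Hive setting chosen for illustration, and the HDFS path is a placeholder.

```sql
-- Set a configuration variable, then print its current value.
set hive.cli.print.header=true;
set hive.cli.print.header;

-- Run a shell command from the Hive CLI.
!pwd;

-- Run a dfs command from the Hive CLI.
dfs -ls /user/hive/warehouse;
```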
6. _________ is a shell utility which can be used to run Hive queries in either interactive or batch
mode.
a) $HIVE/bin/hive
b) $HIVE_HOME/hive
c) $HIVE_HOME/bin/hive
d) All of the mentioned
Answer: c
Explanation: Various types of command line operations are available in the shell utility.
7. Which of the following is a command line option?
a) -d, --define <key=value>
b) -e, --define <key=value>
c) -f, --define <key=value>
d) None of the mentioned
Answer: a
Explanation: Variable substitution to apply to Hive commands, e.g. -d A=B or --define A=B.
8. Which additional command line option is available in Hive 0.10.0?
a) --database <dbname>
b) --db <dbname>
c) --dbase <dbname>
d) All of the mentioned
Answer: a
Explanation: The --database <dbname> option specifies the database to use.
9. The CLI when invoked without the -i option will attempt to load $HIVE_HOME/bin/.hiverc and
$HOME/.hiverc as _______ files.
a) processing
b) termination
c) initialization
d) none of the mentioned
Answer: c
Explanation: When -i is not given, the CLI loads these .hiverc files as initialization files before starting.
10. When $HIVE_HOME/bin/hive is run without either the -e or -f option, it enters _______ mode.
a) Batch
b) Interactive shell
c) Multiple
d) None of the mentioned
11.