Python Coding Interview Questions On DataFrame and Zip
Introduction
Python is an interpreted programming language widely used to build Machine Learning and Deep Learning
models. Data science aspirants looking to work in the field of artificial intelligence need a good
understanding of Python.
So, if you are a fresher looking for a job role in data science, you must be well prepared to answer
Python-based interview questions. This article covers coding questions on two main topics that are
frequently asked in interviews: the zip() function and the DataFrame.
Question 1: Given two lists, generate a list of pairs (one element from each list).
list1 = [1, 2, 3, 4]
Pairs can be created using the zip() function.
The zip() function takes elements from each list one by one and pairs them up, i.e., 1 from list1 is paired
with 1 from list2, and so on.
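A minimal sketch of this pairing is given below. Note that list2 is not shown in the question above, so the values used here are an assumption borrowed from the dictionary example later in this article.

```python
list1 = [1, 2, 3, 4]
# Assumed values: list2 is not shown in Question 1, so it is taken
# from the dictionary example later in the article.
list2 = [1, 4, 9, 16]

# zip() yields one tuple per position; list() collects them into a list
pairs = list(zip(list1, list2))
print(pairs)
```

Wrapping zip() in list() is needed because, in Python 3, zip() returns an iterator rather than a list.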
Output:
0 Ram 24
1 Shyam 56
2 Mohan 34
According to the given input and output, the task is to print each index together with the corresponding
elements of list1 and list2.
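list1 = ["Ram", "Shyam", "Mohan"]
list2 = [24, 56, 34]
# enumerate() supplies the index; zip() supplies one element from each list
for x, (name, age) in enumerate(zip(list1, list2)):
    print(x, name, age)
The zip() function makes pairs, taking elements index-wise from each list, and enumerate() numbers those
pairs starting from 0. Thus, the above code prints the index and the elements from each list.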
To design a dictionary, we need to take one element each from two lists and combine them iteratively as
shown below:
list1 = [1, 2, 3, 4]
list2 = [1, 4, 9, 16]
dict1 = {key: value for key, value in zip(list1, list2)}
print(dict1)
The above code works at the element level: it picks the first element of each list and stores them as a
key-value pair, then moves to the second element of each list, and so on. The finished dictionary is
printed once at the end.
Output:
{1: 1, 2: 4, 3: 9, 4: 16}
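As a possible alternative, the same dictionary can be built without a comprehension, since dict() accepts an iterable of key-value pairs. The sketch below reuses the two lists from this question.

```python
list1 = [1, 2, 3, 4]
list2 = [1, 4, 9, 16]

# dict() consumes the (key, value) tuples produced by zip()
dict1 = dict(zip(list1, list2))
print(dict1)  # {1: 1, 2: 4, 3: 9, 4: 16}
```

The comprehension form is still the better fit when the keys or values need to be transformed while the dictionary is built.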
Question 4: Given a list of pairs, say, [(1, 1), (2, 4), (3, 9), (4, 16)], write a code snippet to split it
into two sequences.
Output:
(1, 2, 3, 4) (1, 4, 9, 16)
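One common way to get this result is to unpack the list into zip() with the * operator. The snippet below is a sketch of that idiom, not necessarily the code the question originally showed; the variable names are illustrative.

```python
pairs = [(1, 1), (2, 4), (3, 9), (4, 16)]

# *pairs passes each tuple as a separate argument, so zip() regroups
# all first elements together and all second elements together
first, second = zip(*pairs)
print(first, second)  # (1, 2, 3, 4) (1, 4, 9, 16)
```

Each result is a tuple; wrap it in list() if a list is needed.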
zip() is a built-in Python function that combines two or more iterables (lists, tuples) element-wise into
a single iterator of tuples. For instance,
print(list(zip(data1, data2)))
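The sketch below shows this with concrete values. The contents of data1 and data2 are hypothetical, since the article does not define them. It also shows that the object returned by zip() can only be consumed once.

```python
# Hypothetical sample data; the article does not define data1 or data2
data1 = [1, 2, 3]
data2 = ("a", "b", "c")

zipped = zip(data1, data2)   # a lazy iterator, nothing is paired yet
combined = list(zipped)      # consuming the iterator builds the pairs
print(combined)              # [(1, 'a'), (2, 'b'), (3, 'c')]

leftover = list(zipped)      # the iterator is now exhausted
print(leftover)              # []
```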
For instance, consider the following dataframe:
1 Ram 26
2 Shyam 28
3 Neha 36
Here, the first row shows that Ram is 26 years old, and so on.
import pandas as pd
data = [['Mumbai', 6500], ['Delhi', 7000], ['Pune', 4000]]
df = pd.DataFrame(data, columns=['City', 'Distance'])
print(df)
Output:
     City  Distance
0  Mumbai      6500
1   Delhi      7000
2    Pune      4000
A dictionary consists of key-value pairs. When a DataFrame is created from a dictionary, each key becomes
a column name, and the values stored under that key fill the cells of that column.
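A minimal sketch of creating a DataFrame from a dictionary follows. The city and distance values are an assumption, reused from the list-based example in this article.

```python
import pandas as pd

# Assumed sample data, reused from the list-based example
data = {'City': ['Mumbai', 'Delhi', 'Pune'],
        'Distance': [6500, 7000, 4000]}

# Each key becomes a column name; each list becomes that column's values
df = pd.DataFrame(data)
print(df)
```

All lists in the dictionary must have the same length; otherwise pandas raises a ValueError.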
Output:
   Roll no    Name
0     1001   Geeta
1     1002    Sita
2     1003  Anjali

   S.no.    Name
0   1001   Geeta
1   1002    Sita
2   1003  Anjali
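The two tables above differ only in the first column name, which suggests that the column Roll no was renamed to S.no. One way to do this, assuming that is the intended task, is the rename() method, as sketched below.

```python
import pandas as pd

df = pd.DataFrame({'Roll no': [1001, 1002, 1003],
                   'Name': ['Geeta', 'Sita', 'Anjali']})
print(df)

# rename() maps old column names to new ones and returns a new DataFrame
renamed = df.rename(columns={'Roll no': 'S.no.'})
print(renamed)
```

The original df is left unchanged here because rename() returns a copy by default.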
loc: It is label-based, i.e., row and column labels (names) have to be provided to access an element of a
dataframe.
iloc: It is integer location-based, i.e., row and column positions (counted from 0) have to be provided to
access an element of a dataframe.
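The difference is easiest to see on a DataFrame whose row labels are not plain integers. The sketch below uses hypothetical row labels 'a', 'b', and 'c'; the names and ages are taken from the earlier example in this article.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Ram', 'Shyam', 'Neha'], 'Age': [26, 28, 36]},
    index=['a', 'b', 'c'],  # hypothetical row labels for illustration
)

# loc: row label 'b' and column label 'Age'
age_by_label = df.loc['b', 'Age']

# iloc: second row and second column, both counted from 0
age_by_position = df.iloc[1, 1]

print(age_by_label, age_by_position)  # 28 28
```

Both lines access the same cell; only the way of addressing it differs.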
Question 11: How will you delete a row or column from the DataFrame?
A row or column of a DataFrame can be deleted using the drop() function. If the name of the dataframe
is df, a row can be deleted using:
df.drop(['row_name'])
and a column can be deleted using:
df.drop(['column_name'], axis=1)
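A short sketch of both cases is given below, using the roll number data that appears later in this article. The choice of which row and which column to drop is arbitrary.

```python
import pandas as pd

df = pd.DataFrame({'Roll no': [101, 102, 103],
                   'Name': ['Gorav', 'Riddhi', 'Shyama']})

# Rows are dropped by index label (axis=0 is the default)
without_row = df.drop([0])

# Columns need axis=1 (or the equivalent columns=['Name'])
without_col = df.drop(['Name'], axis=1)

print(without_row)
print(without_col)
```

drop() returns a new DataFrame and leaves df untouched unless the result is assigned back to df.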
df.sort_values(by=['Name'])
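The call above sorts the rows of df by the values in the Name column. A small sketch follows, with sample rows deliberately placed out of alphabetical order; the data is an assumption for illustration.

```python
import pandas as pd

# Assumed sample data, listed out of alphabetical order on purpose
df = pd.DataFrame({'Roll no': [101, 102, 103],
                   'Name': ['Riddhi', 'Shyama', 'Gorav']})

# Sorts in ascending order by default and returns a new DataFrame
sorted_df = df.sort_values(by=['Name'])
print(sorted_df)
```

The original index labels travel with their rows, so the sorted result here has the index order 2, 0, 1.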
import pandas as pd
dataframe = {'Roll no': [101, 102, 103], 'Name': ['Gorav', 'Riddhi', 'Shyama']}
df = pd.DataFrame(dataframe)
print(df)
To add a new column, one needs the column name and its values. The column can then be added by assigning
the values to df['column name'].
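For example, a new column could be added to the roll number DataFrame as sketched below. The column name Marks and its values are hypothetical and do not come from the original article.

```python
import pandas as pd

df = pd.DataFrame({'Roll no': [101, 102, 103],
                   'Name': ['Gorav', 'Riddhi', 'Shyama']})

# Hypothetical new column: one value per existing row
df['Marks'] = [78, 85, 91]
print(df)
```

The assigned list must contain exactly as many values as the DataFrame has rows.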
Helpful Tips
DataFrame and zip() are two topics that come up frequently in data science interviews.
One should be comfortable with the basics of DataFrame before moving on to advanced topics.
Creating, updating, and manipulating a DataFrame are important topics, so any data science aspirant
should have a good grasp of them.
Practice the basic questions on zip() before attempting tougher ones.
Conclusion
zip() combines two or more iterables, such as lists, and is frequently used together with dictionaries.
A DataFrame is a two-dimensional data structure consisting of rows, columns, and cells.
The difference between loc and iloc is a must-know: with loc, row and column names are provided, whereas
with iloc, row and column numbers are provided to access an element of the DataFrame.
If you get a good grasp of these questions, you can also answer other Python coding questions that
require either of these two features.
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.
Saumyab271