Technical Interview Questions Technical Interview Questions
Technical Interview Questions Technical Interview Questions
Technical Interview Questions Technical Interview Questions
The answers here are given by the community. Be careful and double check the answers before using
⚠️ them.
If you see an error, please create a PR with a fix
Table of contents
SQL
Coding (Python)
Algorithmic Questions
SQL
Suppose we have the following schema with two tables: Ads and Events
2) All active campaigns. A campaign is active if there’s at least one active ad.
5) The number of events over the last week per each active ad — broken down by event type and date (most recent first).
SELECT a.ad_id, e.event_type, e.date, count(*) as "count"
FROM Ads AS a
JOIN Events AS e
ON a.ad_id = e.ad_id
WHERE a.status = 'active'
GROUP BY a.ad_id, e.event_type, e.date
ORDER BY e.date ASC, "count" DESC;
7) The number of events over the last week per each campaign — broken down by date (most recent first).
-- for Postgres
8) CTR (click-through rate) for each ad. CTR = number of impressions / number of clicks.
-- for Postgres
SELECT impressions_clicks_table.campaign_id,
(impressions_clicks_table.impressions * 100 / impressions_clicks_table.clicks)::FLOAT || '%' AS CTR
FROM
(
SELECT a.campaign_id,
SUM(CASE e.event_type WHEN 'impression' THEN 1 ELSE 0 END) impressions,
SUM(CASE e.event_type WHEN 'click' THEN 1 ELSE 0 END) clicks
FROM Ads AS a
INNER JOIN Events AS e
ON a.ad_id = e.ad_id
GROUP BY a.campaign_id
) AS impressions_clicks_table
ORDER BY impressions_clicks_table.campaign_id;
9) CVR (conversion rate) for each ad. CVR = number of clicks / number of installs.
-- for Postgres
SELECT conversions_clicks_table.campaign_id,
(conversions_clicks_table.conversions * 100 / conversions_clicks_table.clicks)::FLOAT || '%' AS CVR
FROM
(
SELECT a.campaign_id,
SUM(CASE e.event_type WHEN 'conversion' THEN 1 ELSE 0 END) conversions,
SUM(CASE e.event_type WHEN 'click' THEN 1 ELSE 0 END) clicks
FROM Ads AS a
INNER JOIN Events AS e
ON a.ad_id = e.ad_id
GROUP BY a.campaign_id
) AS conversions_clicks_table
ORDER BY conversions_clicks_table.campaign_id;
10) CTR and CVR for each ad broken down by day and hour (most recent first).
Answer here
Answer here
Coding (Python)
1) FizzBuzz. Print numbers from 1 to 100
Example of output: 1, 2, Fizz, 4, Buzz, Fizz, 7, 8, Fizz, Buzz, 11, Fizz, 13, 14, Fizz Buzz, 16, 17, Fizz, 19, Buzz, Fizz, 22, 23, Fizz, Buzz, 26, Fizz, 28, 29, Fizz Buzz, 31, 32,
Fizz, 34, Buzz, Fizz, ...
factorial(5) = 5! = 1 * 2 * 3 * 4 * 5 = 120
factorial(10) = 10! = 1 * 2 * 3 * 4 * 5 * 6 * 7 * 8 * 9 * 10 = 3628800
def factorial(n):
result = 1
for i in range(2, n + 1):
result *= i
return result
def mean(numbers):
if len(numbers) > 0:
return sum(numbers) / len(numbers)
return float('NaN')
def std_dev(numbers):
if len(numbers) > 0:
avg = mean(numbers)
var = sum([(i - avg) ** 2 for i in numbers]) / len(numbers)
ans = sqrt(var)
return ans
return float('NaN')
5) RMSE . Calculate the RMSE (root mean squared error) of a model. The function takes in two lists: one with actual values, one with predictions.
import math
6) Remove duplicates. Remove duplicates in list. The list is not sorted and the order of elements from the original list should be preserved.
[1, 2, 3, 1] ⇒ [1, 2, 3]
[1, 3, 2, 1, 5, 3, 5, 1, 4] ⇒ [1, 3, 2, 5, 4]
def remove_duplicates(lst):
new_list = []
mentioned_values = set()
for elem in lst:
if elem not in mentioned_values:
new_list.append(elem)
mentioned_values.add(elem)
return new_list
7) Count . Count how many times each element in a list occurs.
[1, 3, 2, 1, 5, 3, 5, 1, 4] ⇒
* 1: 3 times * 2: 1 time * 3: 2 times * 4: 1 time * 5: 2 times
numbers = [1, 3, 2, 1, 5, 3, 5, 1, 4]
counter = dict()
for elem in numbers:
counter[elem] = counter.get(elem, 0) + 1
or
numbers = [1, 3, 2, 1, 5, 3, 5, 1, 4]
counter = Counter(numbers)
8) Palindrome. Is string a palindrome? A palindrome is a word which reads the same backward as forwards.
“ololo” ⇒ Yes
“cafe” ⇒ No
def is_palindrome(s):
return s == s[::-1]
or
def is_palindrome(s):
for i in range(len(s) // 2):
if s[i] != s[-i - 1]:
return False
return True
9) Counter. We have a list with identifiers of form “id-SITE”. Calculate how many ids we have per site.
def counter(lst):
ans = {}
for i in lst:
site = i[-2:]
ans[site] = ans.get(site, 0) + 1
return ans
10) Top counter . We have a list with identifiers of form “id-SITE”. Show the top 3 sites. You can break ties in any way you want.
def top_counter(lst):
site_dict = counter(lst) # using last problem's solution
top_keys = sorted(site_dict, reverse=True, key=site_dict.get)[:3]
return {key: site_dict[key] for key in top_keys}
11) RLE . Implement RLE (run-length encoding): encode each character by the number of times it appears consecutively.
Using itertools.groupby
import itertools
def rle(s):
return [(l, len(list(g))) for l, g in itertools.groupby(s)]
12) Jaccard . Calculate the Jaccard similarity between two sets: the size of intersection divided by the size of union.
13) IDF . Given a collection of already tokenized texts, calculate the IDF (inverse document frequency) of each token.
Where:
t is the token,
n(t) is the number of documents that t occurs in,
N is the total number of documents
def idf1(docs):
docs = [set(doc) for doc in docs]
n_tokens = {}
for doc in docs:
for token in doc:
n_tokens[token] = n_tokens.get(token, 0) + 1
ans = {}
for token in n_tokens:
ans[token] = log10(len(docs) / (1 + n_tokens[token]))
return ans
import math
def idf2(docs):
n_docs = len(docs)
idf_coefficients = {}
for token in all_tokens:
n_docs_w_token = sum(token in doc for doc in docs)
idf_c = math.log10(n_docs / (1 + n_docs_w_token))
idf_coefficients[token] = idf_c
return idf_coefficients
14) PMI. Given a collection of already tokenized texts, find the PMI (pointwise mutual information) of each pair of tokens. Return top 10 pairs according to PMI.
PMI is used for finding collocations in text — things like “New York” or “Puerto Rico”. For two consecutive words, the PMI between them is:
The higher the PMI, the more likely these two tokens form a collection. We can estimate PMI by counting:
Where: * N is the total number of tokens in the text, * c(t1, t2) is the number of times t1 and t2 appear together, * c(t1) and c(t2) — the number of times they appear
separately.
Answer here
Algorithmic Questions
1) Two sum. Given an array and a number N, return True if there are numbers A, B in the array such that A + B = N. Otherwise, return False.
for i in range(n):
for j in range(i + 1, n):
if numbers[i] + numbers[j] == target:
return True
return False
Linear, O(n):
def two_sum(numbers, target):
index = {num: i for (i, num) in enumerate(numbers)}
n = len(numbers)
for i in range(n):
a = numbers[i]
b = target - a
if b in index:
j = index[b]
if i != j:
return True
return False
Using itertools.combinations
2) Fibonacci . Return the n-th Fibonacci number, which is computed using this formula:
F(0) = 0
F(1) = 1
F(n) = F(n-1) + F(n-2)
The sequence is: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...
def fibonacci1(n):
'''naive, complexity = O(2 ** n)'''
if n == 0 or n == 1:
return n
else:
return fibonacci1(n - 1) + fibonacci1(n - 2)
def fibonacci2(n):
'''dynamic programming, complexity = O(n)'''
base1, base2 = 0, 1
for i in range(n):
base1, base2 = base2, base1 + base2
return base1
def fibonacci3(n):
'''matrix multiplication, complexity = O(log(n))'''
def mx_mul(m1, m2):
ans = [[0 for i in range(len(m2[0]))] for j in range(len(m1))]
for i in range(len(m1)):
for j in range(len(m2[0])):
for k in range(len(m2)):
ans[i][j] += m1[i][k] * m2[k][j]
return ans
def pow(a, b):
ans = [[1, 0], [0, 1]]
while b > 0:
if b % 2 == 1:
ans = mx_mul(ans, a)
a = mx_mul(a, a)
b //= 2
return ans
ans = mx_mul(pow([[1, 1], [1, 0]], n), [[1], [0]])[1][0]
return ans
memo = {0: 0, 1: 1}
def fibonacci4(n):
'''Top down + memorization (dictionary), complexity = O(n)'''
if n not in memo:
memo[n] = fibonacci4(n-1) + fibonacci4(n-2)
return memo[n]
@lru_cache()
def fibonacci4(n):
if n == 0 or n == 1:
return n
return fibonacci4(n - 1) + fibonacci4(n - 2)
def fibonacci5(n):
'''Top down + memorization (list), complexity = O(n) '''
if n == 1:
return 1
dic = [-1 for i in range(n)]
dic[0], dic[1] = 1, 2
def helper(n, dic):
if dic[n] < 0:
dic[n] = helper(n-1, dic) + helper(n-2, dic)
return dic[n]
return helper(n-1, dic)
3) Most frequent outcome. We have two dice of different sizes (D1 and D2). We roll them and sum their face values. What are the most probable outcomes?
6, 6 ⇒ [7]
2, 4 ⇒ [3, 4, 5]
def reverse_ll(head):
if head.next is not None:
last = None
point = head
while point is not None:
point.next, point, last = last, point.next, point
def flip_bt(head):
if head is not None:
head.left, head.right = head.right, head.left
flip_bt(head.left)
flip_bt(head.right)
6) Binary search. Return the index of a given number in a sorted array or -1 if it’s not there.
[1, 4, 6, 10], 4 ⇒ 1
[1, 4, 6, 10], 3 ⇒ -1
[1, 1, 1, 2, 3, 4, 4, 4, 5, 6, 6] ⇒ [1, 2, 3, 4, 5, 6]
def deduplication1(lst):
'''manual'''
ans = []
last = None
for i in lst:
if last != i:
ans.append(i)
last = i
return ans
def deduplication2(lst):
return list(set(lst))
10) Addition . Implement the addition algorithm from school. Suppose we represent numbers by a list of integers from 0 to 9:
12 is [1, 2]
1000 is [1, 0, 0, 0]
Example:
Output:
12) Check if a tree is a binary search tree . In BST, the element in the root is:
if min_val:
check_val = check_val and (head.val >= min_val)
min_new = min(min_val, head.val)
else:
min_new = head.val
if max_val:
check_val = check_val and (head.val <= max_val)
max_new = max(max_val, head.val)
else:
max_new = head.val
if head.left:
check_left = check_is_bst(head.left, min_val, max_new)
if head.right:
check_right = check_is_bst(head.right, min_new, max_val)
13) Maximum Sum Contiguous Subarray. You are given an array A of length N , you have to find the largest possible sum of an Subarray, of array A .
* [-2, 1, -3, 4, -1, 2, 1, -5, 4] gives 6 as largest sum (from the subarray [4, -1, 2, -1]
from sys import maxsize
def max_sum_subarr(list1, size):
"""Use Kadane's Algorithm for a optimal solution
Time Complexity: O(n)
Desciption: Use one variable for current sum, and one for Overall sum at an index.
So here, the global_max will keep on updating the max sum at any index-1,
and curr_max will check the max value at an index.
And finally after iterating through the list, return the value of global_max variable which contains Maximum sum.
"""
curr_max=list1[0]
global_max=list1[0]
for each in range(1, size):
curr_max = max(list1[each], curr_max+list1[each])
global_max = max(global_max, curr_max)
return global_max
n = int(input())
list1 = []
for i in range(0,n):
num = int(input())
list1.append(num)
print(max_sum_subarr(list1, len(list1)))