
Leetcode

The document provides a collection of coding problems and their respective solutions in Python. Each problem includes an input example, expected output, and a function definition to solve the problem. Topics covered include algorithms for arrays, strings, and other data structures.

Uploaded by

Mohit Kumar

1.Two Sum:
Input: nums = [2,7,11,15], target = 9
Output: [0,1]

def twoSum(self, nums: List[int], target: int) -> List[int]:
    numdict = {}
    n = len(nums)

    for i in range(n):
        x = target - nums[i]

        if x in numdict:
            return [numdict[x], i]
        numdict[nums[i]] = i

    return []

2.Roman to Integer:
Input: s = "LVIII"
Output: 58

def romanToInt(self, s: str) -> int:
    dictnum = {'I':1, 'V':5, 'X':10, 'L':50, 'C':100, 'D':500,
               'M':1000}
    list1 = list(s)
    n = len(list1)
    ans = 0

    for i in range(n):
        if i < n-1 and dictnum[list1[i]] < dictnum[list1[i+1]]:
            ans = ans - dictnum[list1[i]]
        else:
            ans = ans + dictnum[list1[i]]

    return ans

3.Longest Common Prefix:
Input: strs = ["flower","flow","flight"]
Output: "fl"

def longestCommonPrefix(self, strs: List[str]) -> str:
    strs = sorted(strs)
    first = strs[0]
    last = strs[-1]

    ans = ''

    for i in range(min(len(first), len(last))):
        if first[i] != last[i]:
            return ans
        ans = ans+first[i]
    return ans

4.Valid Parentheses:
Input: s = "()[]{}"
Output: true

def isValid(self, s: str) -> bool:
    list1 = []

    for i in s:
        if i in '([{':
            list1.append(i)
        else:
            if not list1 or \
                    (i == ')' and list1[-1] != '(') or \
                    (i == ']' and list1[-1] != '[') or \
                    (i == '}' and list1[-1] != '{'):
                return False
            list1.pop()
    return not list1

5.Remove Duplicates from Sorted Array:


Input: nums = [1,1,2]
Output: 2, nums = [1,2,_]

def removeDuplicates(self, nums: List[int]) -> int:
    j = 1
    for i in range(1,len(nums)):
        if nums[i] != nums[i-1]:
            nums[j] = nums[i]
            j += 1
    return j

6.Remove Element
Input: nums = [0,1,2,2,3,0,4,2], val = 2
Output: 5, nums = [0,1,4,0,3,_,_,_]

def removeElement(self, nums: List[int], val: int) -> int:
    j = 0
    for i in range(len(nums)):
        if nums[i] != val:
            nums[j] = nums[i]
            j += 1
    return j

7.Find the Index of the First Occurrence in a String:


Input: haystack = "sadbutsad", needle = "sad"
Output: 0

def strStr(self, haystack: str, needle: str) -> int:
    return haystack.find(needle)

8.Search Insert Position
Given a sorted array of distinct integers and a target value, return the index if the target is found. If
not, return the index where it would be if it were inserted in order.
Input: nums = [1,3,5,6], target = 2
Output: 1

def searchInsert(self, nums: List[int], target: int) -> int:
    if target in nums:
        return nums.index(target)
    else:
        nums.append(target)
        nums = sorted(nums)
        return nums.index(target)
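
Because nums is already sorted and its values are distinct, the insert position can also be found with a binary search, without appending to or re-sorting the caller's list. The following is a minimal sketch of that alternative, not the approach used above; the function name search_insert is made up for this example, and bisect_left comes from Python's standard library.

```python
from bisect import bisect_left
from typing import List


def search_insert(nums: List[int], target: int) -> int:
    # bisect_left returns the leftmost index at which target can be
    # inserted while keeping nums sorted. If target is already present,
    # that index is the position of target itself.
    return bisect_left(nums, target)


if __name__ == "__main__":
    print(search_insert([1, 3, 5, 6], 2))
```

This runs in O(log n) time and leaves nums unchanged.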

9.Length of Last Word:


Input: s = "Hello World"
Output: 5

def lengthOfLastWord(self, s: str) -> int:
    return len(s.split()[-1])

10.Plus One:
Input: digits = [1,2,3]
Output: [1,2,4]

def plusOne(self, digits: List[int]) -> List[int]:
    li = int(''.join(map(str, digits)))
    return [int(i) for i in str(li+1)]

11.Best Time to Buy and Sell Stock:


Input: prices = [7,1,5,3,6,4]
Output: 5
Explanation: Buy on day 2 (price = 1) and sell on day 5 (price = 6), profit = 6-1 = 5.
Note that buying on day 2 and selling on day 1 is not allowed because you must buy before you sell.

def maxProfit(self, prices: List[int]) -> int:
    min_p = prices[0]
    max_p = 0

    for i in prices[1:]:
        max_p = max(max_p, i-min_p)
        min_p = min(min_p, i)
    return max_p

12.Valid Palindrome:
Input: s = "A man, a plan, a canal: Panama"
Output: true
Explanation: "amanaplanacanalpanama" is a palindrome.

def isPalindrome(self, s: str) -> bool:
    li = [i for i in s if i.isalnum()]
    a = ''.join(li).lower()
    return a == a[::-1]

13.Single Number:
Given a non-empty array of integers nums, every element appears twice except for one. Find that
single one.
Input: nums = [2,2,1]
Output: 1

from collections import Counter

def singleNumber(self, nums: List[int]) -> int:
    dictcnt = Counter(nums)
    return [i for i in dictcnt.keys() if dictcnt[i] == 1][0]
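
Since every value except one appears exactly twice, the pairs cancel out under XOR (x ^ x == 0 and x ^ 0 == x), leaving only the single value. The sketch below illustrates that idea as an alternative to counting; the name single_number_xor is made up for this example.

```python
from functools import reduce
from operator import xor
from typing import List


def single_number_xor(nums: List[int]) -> int:
    # XOR all values together. Each duplicated value cancels itself,
    # so the running result ends up equal to the unpaired value.
    return reduce(xor, nums, 0)


if __name__ == "__main__":
    print(single_number_xor([2, 2, 1]))
```

This uses constant extra space, whereas the Counter version stores one entry per distinct value.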

14.Majority Element:
The majority element is the element that appears more than ⌊n / 2⌋ times. You may assume that the
majority element always exists in the array.
Input: nums = [2,2,1,1,1,2,2]
Output: 2

def majorityElement(self, nums: List[int]) -> int:
    dictcnt = Counter(nums)
    return max(dictcnt.keys(), key = dictcnt.get)

15.Happy Number:
Input: n = 19
Output: true
Explanation:
1^2 + 9^2 = 82
8^2 + 2^2 = 68
6^2 + 8^2 = 100
1^2 + 0^2 + 0^2 = 1

def isHappy(self, n: int) -> bool:
    seen = []
    while n != 1 and n not in seen:
        seen.append(n)
        n = sum([int(i)**2 for i in str(n)])
    return n == 1
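
To make the explanation above concrete, the sketch below records the whole chain of digit-square sums instead of only the final verdict. It is an illustration rather than part of the original solution; the helper name happy_chain is made up, and a set is used for the repeat check.

```python
from typing import List


def happy_chain(n: int) -> List[int]:
    # Collect every value visited, stopping at 1 or at the first repeat.
    chain = [n]
    seen = {n}
    while n != 1:
        n = sum(int(d) ** 2 for d in str(n))
        if n in seen:
            break  # a cycle that never reaches 1
        seen.add(n)
        chain.append(n)
    return chain


if __name__ == "__main__":
    print(happy_chain(19))
```

A number is happy exactly when the returned chain ends in 1.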

16.Isomorphic Strings:
Two strings s and t are isomorphic if the characters in s can be replaced to get t.
Input: s = "egg", t = "add"
Output: true

def isIsomorphic(self, s: str, t: str) -> bool:
    li1 = []
    li2 = []

    for i in s:
        li1.append(s.index(i))
    for j in t:
        li2.append(t.index(j))
    # The strings are isomorphic when their first-occurrence index
    # patterns match.
    return li1 == li2

17.Contains Duplicate:
Given an integer array nums, return true if any value appears at least twice in the array, and
return false if every element is distinct.
Input: nums = [1,1,1,3,3,4,3,2,4,2]
Output: true

def containsDuplicate(self, nums: List[int]) -> bool:
    dictcnt = Counter(nums)
    return max(dictcnt.values()) >= 2

18.Contains Duplicate II:


Given an integer array nums and an integer k, return true if there are two distinct indices i and j in the
array such that nums[i] == nums[j] and abs(i - j) <= k.
Input: nums = [1,0,1,1], k = 1
Output: true

def containsNearbyDuplicate(self, nums: List[int], k: int) -> bool:
    dictcnt = {}
    for i in range(len(nums)):
        if nums[i] in dictcnt:
            if abs(dictcnt[nums[i]] - i) <= k:
                return True
        dictcnt[nums[i]] = i
    return False

19.Summary Ranges:
Input: nums = [0,1,2,4,5,7]
Output: ["0->2","4->5","7"]
Explanation: The ranges are:
[0,2] --> "0->2"
[4,5] --> "4->5"
[7,7] --> "7"

def summaryRanges(self, nums: List[int]) -> List[str]:
    arr=[]
    # Sentinel so that nums[j] is always a valid index below.
    nums.append(0)
    i=0
    while i<len(nums)-1:
        a=i
        j=i+1
        while nums[j]-nums[a]==1 and j<len(nums)-1:
            a+=1
            j+=1
        if j-i==1:
            arr.append(str(nums[i]))
        else:
            arr.append(str(nums[i])+"->"+str(nums[a]))
        i=j
    return arr

20.Valid Anagram:
An Anagram is a word or phrase formed by rearranging the letters of a different word or phrase,
typically using all the original letters exactly once.
Input: s = "anagram", t = "nagaram"
Output: true

def isAnagram(self, s: str, t: str) -> bool:
    return sorted(s) == sorted(t)

21.Missing Number:
Input: nums = [3,0,1]
Output: 2
Explanation: n = 3 since there are 3 numbers, so all numbers are in the range [0,3]. 2 is the missing
number in the range since it does not appear in nums.

def missingNumber(self, nums: List[int]) -> int:
    n = len(nums)
    li = [i for i in range(n+1)]
    return list(set(li)-set(nums))[0]

22.Move Zeroes:
Input: nums = [0,1,0,3,12]
Output: [1,3,12,0,0]

def moveZeroes(self, nums: List[int]) -> None:
    j = 0
    for i in range(len(nums)):
        if nums[i] != 0:
            nums[i], nums[j] = nums[j], nums[i]
            j += 1

23.Word Pattern:
Input: pattern = "abba", s = "dog cat cat fish"
Output: false

from itertools import zip_longest

def wordPattern(self, pattern: str, s: str) -> bool:
    li = s.split()
    return (len(set(li)) == len(set(pattern)) ==
            len(set(zip_longest(li,pattern))))

24.Reverse String:
Input: s = ["h","e","l","l","o"]
Output: ["o","l","l","e","h"]

def reverseString(self, s: List[str]) -> None:
    l = 0
    while(l < len(s)//2):
        s[l], s[-l-1] = s[-l-1],s[l]
        l += 1

25.Reverse Vowels of a String:


Input: s = "leetcode"
Output: "leotcede"

def reverseVowels(self, s: str) -> str:
    li = 'aeiouAEIOU'
    nums = []
    for i in s:
        if i in li:
            nums.append(i)
    nums = nums[::-1]
    li_1 = list(s)
    j = 0
    for i in range(len(li_1)):
        if li_1[i] in li:
            li_1[i] = nums[j]
            j += 1
    return ''.join(li_1)

26.Intersection of Two Arrays:
Input: nums1 = [4,9,5], nums2 = [9,4,9,8,4]
Output: [9,4]

def intersection(self, nums1: List[int], nums2: List[int]) -> List[int]:
    return list(set(nums1) & set(nums2))

27.Intersection of Two Arrays II:
Input: nums1 = [1,2,2,1], nums2 = [2,2]
Output: [2,2]

def intersect(self, nums1: List[int], nums2: List[int]) -> List[int]:
    li = list(set(nums1)& set(nums2))
    dictcnt_1 = Counter(nums1)
    dictcnt_2 = Counter(nums2)
    res = []
    for i in li:
        a = min(dictcnt_1[i],dictcnt_2[i])
        res = res+[i]*a
    return res

28.Ransom Note:
Given two strings ransomNote and magazine, return true if ransomNote can be constructed by using
the letters from magazine and false otherwise.
Each letter in magazine can only be used once in ransomNote.
Input: ransomNote = "aa", magazine = "aab"
Output: true

def canConstruct(self, ransomNote: str, magazine: str) -> bool:
    dict_ran = Counter(ransomNote)
    dict_mag = Counter(magazine)
    for i in dict_ran.keys():
        if dict_ran[i] > dict_mag[i]:
            return False
    return True

29.First Unique Character in a String:


Input: s = "loveleetcode"
Output: 2

def firstUniqChar(self, s: str) -> int:
    dictcnt = Counter(list(s))
    for i in s:
        if dictcnt[i] == 1:
            return list(s).index(i)
    return -1

30.Find the Difference:
String t is generated by randomly shuffling string s and then adding one more letter at a random position.
Return the letter that was added to t.
Input: s = "abcd", t = "abcde"
Output: "e"
Explanation: 'e' is the letter that was added.

def findTheDifference(self, s: str, t: str) -> str:
    dict_s = Counter(s)
    dict_t = Counter(t)
    for i in dict_t.keys():
        if dict_t[i]-dict_s[i] == 1:
            return i

31.Is Subsequence:
A subsequence of a string is a new string that is formed from the original string by deleting some (can
be none) of the characters without disturbing the relative positions of the remaining characters.
(i.e., "ace" is a subsequence of "abcde" while "aec" is not).
Input: s = "abc", t = "ahbgdc"
Output: true

def isSubsequence(self, s: str, t: str) -> bool:
    li = list(s)
    j = 0
    for i in t:
        # Advance j only while unmatched characters of s remain.
        if j < len(li) and i == li[j]:
            j += 1

    return j == len(li)
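
The definition above (characters of s must appear in t in the same relative order) can also be expressed with a single iterator over t. The sketch below is an alternative illustration, not the solution used above; the function name is_subsequence is made up for this example.

```python
def is_subsequence(s: str, t: str) -> bool:
    # "ch in it" consumes the iterator up to and including the first
    # match, so each later character of s is searched for only in the
    # remaining part of t. That is what preserves relative order.
    it = iter(t)
    return all(ch in it for ch in s)


if __name__ == "__main__":
    print(is_subsequence("abc", "ahbgdc"))
    print(is_subsequence("aec", "abcde"))
```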

32.Longest Palindrome:
Given a string s which consists of lowercase or uppercase letters, return the length of the longest
palindrome that can be built with those letters.
Input: s = "abccccdd"
Output: 7
Explanation: One longest palindrome that can be built is "dccaccd", whose length is 7.

def longestPalindrome(self, s: str) -> int:
    dictcnt = Counter(s)
    even = [i for i in dictcnt.values() if i%2==0]
    odd = [i for i in dictcnt.values() if i%2==1]

    if odd:
        a = sum(odd)-len(odd)+1
        return a + sum(even)
    else:
        return sum(even)

33.Number of Segments in a String:


Input: s = "Hello, my name is John"
Output: 5
Explanation: The five segments are ["Hello,", "my", "name", "is", "John"]

def countSegments(self, s: str) -> int:
    return len(s.split())

34.Find All Numbers Disappeared in an Array:


Given an array nums of n integers where nums[i] is in the range [1, n], return an array of all the
integers in the range [1, n] that do not appear in nums.
Input: nums = [4,3,2,7,8,2,3,1]
Output: [5,6]

def findDisappearedNumbers(self, nums: List[int]) -> List[int]:
    # Mark each seen value v by making nums[v-1] negative.
    for num in nums:
        i = abs(num)-1
        nums[i] = -1*abs(nums[i])
    res = []
    for i in range(len(nums)):
        if nums[i]>0:
            res.append(i+1)
    return res

35.Assign Cookies:
Input: g = [1,2], s = [1,2,3]
Output: 2
Explanation: You have 2 children and 3 cookies. The greed factors of 2 children are 1, 2.
You have 3 cookies and their sizes are big enough to gratify all of the children,
You need to output 2.

def findContentChildren(self, g: List[int], s: List[int]) -> int:
    i = 0
    s = sorted(s)
    g = sorted(g)
    res = 0
    for j in range(len(s)):
        if s[j] >= g[i]:
            i += 1
            res += 1
            if i == len(g):
                break
    return res

36.Repeated Substring Pattern:
Given a string s, check if it can be constructed by taking a substring of it and appending multiple
copies of the substring together.
Input: s = "abab"
Output: true
Explanation: It is the substring "ab" twice.

def repeatedSubstringPattern(self, s: str) -> bool:
    return s in (s+s)[1:-1]
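
The one-line check relies on a known property: s occurs inside s+s with the first and last characters removed exactly when s is built from a repeated block. The sketch below places a direct check (try every block length that divides len(s)) next to the doubling trick so the two can be compared; both function names are made up for this example.

```python
def repeated_by_divisor(s: str) -> bool:
    # Try every candidate block length that divides len(s) evenly.
    n = len(s)
    for size in range(1, n // 2 + 1):
        if n % size == 0 and s[:size] * (n // size) == s:
            return True
    return False


def repeated_by_doubling(s: str) -> bool:
    # Same expression as the solution above.
    return s in (s + s)[1:-1]


if __name__ == "__main__":
    for word in ["abab", "aba", "abcabcabcabc", "a"]:
        print(word, repeated_by_divisor(word), repeated_by_doubling(word))
```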

37.License Key Formatting:
Input: s = "2-5g-3-J", k = 2
Output: "2-5G-3J"
Explanation: The string s has been split into three parts, each part has 2 characters except the first
part as it could be shorter as mentioned above.

def licenseKeyFormatting(self, s: str, k: int) -> str:
    str_up = s.replace('-','').upper()
    # Work on the reversed string so that only the first group can be short.
    str_up = str_up[::-1]
    res = ''
    for i in range(0, len(str_up),k):
        res += str_up[i:i+k]
        res += '-'
    res = res[::-1]

    res = res.replace('-','',1)
    return res

38.Max Consecutive Ones:
Input: nums = [1,1,0,1,1,1]
Output: 3
Explanation: The first two digits or the last three digits are consecutive 1s. The maximum number of
consecutive 1s is 3.

def findMaxConsecutiveOnes(self, nums: List[int]) -> int:
    # Turn each 1 into the length of the run of 1s ending at that index.
    for i in range(1,len(nums)):
        if nums[i]:
            nums[i] = nums[i] + nums[i-1]
    return max(nums)

39.Next Greater Element I:


Input: nums1 = [4,1,2], nums2 = [1,3,4,2]
Output: [-1,3,-1]
Explanation: The next greater element for each value of nums1 is as follows:
- 4 is underlined in nums2 = [1,3,4,2]. There is no next greater element, so the answer is -1.
- 1 is underlined in nums2 = [1,3,4,2]. The next greater element is 3.
- 2 is underlined in nums2 = [1,3,4,2]. There is no next greater element, so the answer is -1.

def nextGreaterElement(self, nums1: List[int], nums2: List[int]) -> List[int]:
    stack = []
    res = []
    dict_i = {}

    stack.append(nums2[0])

    for i in range(1,len(nums2)):
        while stack and nums2[i] > stack[-1]:
            dict_i[stack[-1]] = nums2[i]
            stack.pop()
        stack.append(nums2[i])

    for i in stack:
        dict_i[i] = -1

    for i in nums1:
        res.append(dict_i[i])
    return res

40.Keyboard Row:
Given an array of strings words, return the words that can be typed using letters of the alphabet on
only one row of an American keyboard.
Input: words = ["Hello","Alaska","Dad","Peace"]
Output: ["Alaska","Dad"]

def findWords(self, words: List[str]) -> List[str]:
    str1 = set("qwertyuiop")
    str2 = set("asdfghjkl")
    str3 = set("zxcvbnm")
    res = []
    for i in words:
        a = set(i.lower())
        if (a.issubset(str1) or a.issubset(str2) or
                a.issubset(str3)):
            res.append(i)
    return res

41.Relative Ranks:
Input: score = [10,3,8,9,4]
Output: ["Gold Medal","5","Bronze Medal","Silver Medal","4"]
Explanation: The placements are [1st, 5th, 3rd, 2nd, 4th].

def findRelativeRanks(self, score: List[int]) -> List[str]:
    dict_rnk = {}
    li = sorted(score, reverse= True)
    res = []
    for i in range(len(li)):
        if i==0:
            dict_rnk[li[i]]="Gold Medal"
        elif i==1:
            dict_rnk[li[i]]="Silver Medal"
        elif i==2:
            dict_rnk[li[i]]="Bronze Medal"
        else:
            dict_rnk[li[i]]=str(i+1)

    for i in score:
        res.append(dict_rnk[i])
    return res

42.Detect Capital:
Input: word = "FlaG"
Output: false

def detectCapitalUse(self, word: str) -> bool:
    if word == word.upper():
        return True
    elif word == word.lower():
        return True
    elif (word[0] == word[0].upper() and word[1:] ==
            word[1:].lower()):
        return True
    else:
        return False

43.Longest Uncommon Subsequence I:


Given two strings a and b, return the length of the longest uncommon subsequence between a and b.
If no such uncommon subsequence exists, return -1.
An uncommon subsequence between two strings is a string that is a subsequence of exactly one of them.
Input: a = "aba", b = "cdc"
Output: 3
Explanation: One longest uncommon subsequence is "aba" because "aba" is a subsequence of "aba"
but not "cdc".
Note that "cdc" is also a longest uncommon subsequence.

def findLUSlength(self, a: str, b: str) -> int:
    return max(len(a),len(b)) if a != b else -1
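
The one-line answer follows from the definition: when a and b differ, the longer of the two (or either one, if the lengths are equal) is a subsequence of itself but cannot be a subsequence of the other string. The sketch below checks that reasoning on small inputs by enumerating every subsequence; it is exponential and meant only as an illustration, and all names in it are made up for this example.

```python
from itertools import combinations
from typing import Set


def subsequences(s: str) -> Set[str]:
    # combinations() keeps the characters in their original order,
    # so every tuple it yields is a subsequence of s.
    return {"".join(c) for r in range(len(s) + 1) for c in combinations(s, r)}


def lus_brute_force(a: str, b: str) -> int:
    # Strings that are a subsequence of exactly one of a and b.
    only_one = subsequences(a) ^ subsequences(b)
    return max((len(x) for x in only_one), default=-1)


def lus_closed_form(a: str, b: str) -> int:
    return max(len(a), len(b)) if a != b else -1


if __name__ == "__main__":
    print(lus_brute_force("aba", "cdc"), lus_closed_form("aba", "cdc"))
```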

44.Array Partition:
Given an integer array nums of 2n integers, group these integers into n pairs (a1, b1), (a2, b2), ..., (an,
bn) such that the sum of min(ai, bi) for all i is maximized. Return the maximized sum.
Input: nums = [1,4,3,2]
Output: 4
Explanation: All possible pairings (ignoring the ordering of elements) are:
1. (1, 4), (2, 3) -> min(1, 4) + min(2, 3) = 1 + 2 = 3
2. (1, 3), (2, 4) -> min(1, 3) + min(2, 4) = 1 + 2 = 3
3. (1, 2), (3, 4) -> min(1, 2) + min(3, 4) = 1 + 3 = 4
So the maximum possible sum is 4.

def arrayPairSum(self, nums: List[int]) -> int:
    nums = sorted(nums)
    res = 0
    for i in range(0, len(nums),2):
        res += nums[i]
    return res

45.Distribute Candies:
Alice has n candies, where the ith candy is of type candyType[i]. Alice noticed that she started to gain
weight, so she visited a doctor.
The doctor advised Alice to only eat n / 2 of the candies she has (n is always even). Alice likes her
candies very much, and she wants to eat the maximum number of different types of candies while still
following the doctor's advice.
Given the integer array candyType of length n, return the maximum number of different types of
candies she can eat if she only eats n / 2 of them.
Input: candyType = [1,1,2,2,3,3]
Output: 3
Explanation: Alice can only eat 6 / 2 = 3 candies. Since there are only 3 types, she can eat one of each
type.

def distributeCandies(self, candyType: List[int]) -> int:
    a = len(candyType)//2
    set_1 = set(candyType)

    return min(a, len(set_1))

46.Longest Harmonious Subsequence:


We define a harmonious array as an array where the difference between its maximum value and its
minimum value is exactly 1.
Input: nums = [1,3,2,2,5,2,3,7]
Output: 5
Explanation: The longest harmonious subsequence is [3,2,2,2,3].

def findLHS(self, nums: List[int]) -> int:
    li = list(set(nums))
    res = 0
    for i in li:
        if i+1 in li:
            res = max(res, (nums.count(i)+nums.count(i+1)))

    return res

47.Minimum Index Sum of Two Lists:


Input: list1 = ["happy","sad","good"], list2 = ["sad","happy","good"]
Output: ["sad","happy"]
Explanation: There are three common strings:
"happy" with index sum = (0 + 1) = 1.
"sad" with index sum = (1 + 0) = 1.
"good" with index sum = (2 + 2) = 4.
The strings with the least index sum are "sad" and "happy".

def findRestaurant(self, list1: List[str], list2: List[str]) -> List[str]:
    li = list(set(list1)&set(list2))
    a = len(list1)+len(list2)
    res = []
    for i in li:
        a = min(a, list1.index(i)+list2.index(i))

    for i in li:
        if list1.index(i)+list2.index(i) == a:
            res.append(i)
    return res

48.Can Place Flowers:


You have a long flowerbed in which some of the plots are planted, and some are not. However,
flowers cannot be planted in adjacent plots.
Given an integer array flowerbed containing 0's and 1's, where 0 means empty and 1 means not
empty, and an integer n, return true if n new flowers can be planted in the flowerbed without
violating the no-adjacent-flowers rule and false otherwise.
Input: flowerbed = [1,0,0,0,1], n = 2
Output: false

def canPlaceFlowers(self, flowerbed: List[int], n: int) -> bool:
    if n==0:
        return True

    for i in range(len(flowerbed)):
        if (flowerbed[i] == 0 and (i ==0 or flowerbed[i-1]==0) and
                (i==len(flowerbed)-1 or flowerbed[i+1]==0)):
            flowerbed[i] = 1
            n -= 1
            if n == 0:
                return True

    return False

49.Maximum Product of Three Numbers:


Input: nums = [-1,-2,-3]
Output: -6

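def maximumProduct(self, nums: List[int]) -> int:
    nums_1 = sorted(nums)
    # Either the three largest values, or the two smallest (possibly
    # negative) values times the largest.
    return max(nums_1[-1]*nums_1[-2]*nums_1[-3],
               nums_1[0]*nums_1[1]*nums_1[-1])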

50.Set Mismatch:
Find the number that occurs twice and the number that is missing and return them in the form of an
array.
Input: nums = [1,2,2,4]
Output: [2,3]

def findErrorNums(self, nums: List[int]) -> List[int]:
    dict_cnt = Counter(nums)
    li = list(range(1,len(nums)+1))
    res = []
    for i in dict_cnt.keys():
        if dict_cnt[i] == 2:
            res.append(i)
    res.append(list(set(li)-set(nums))[0])
    return res

51.Degree of an Array:
Input: nums = [1,2,2,3,1,4,2]
Output: 6
Explanation:
The degree is 3 because the element 2 is repeated 3 times.
So [2,2,3,1,4,2] is the shortest subarray, therefore returning 6.

def findShortestSubArray(self, nums: List[int]) -> int:
    dict_cnt = Counter(nums)
    a = max(dict_cnt.values())
    # Values whose count equals the degree of the array.
    li = []
    for i in dict_cnt.keys():
        if dict_cnt[i] == a:
            li.append(i)
    # Last index of each such value.
    dict_li_1 = {}
    for i in range(len(nums)):
        if nums[i] in li:
            dict_li_1[nums[i]] = i
    # First index of each such value.
    dict_li_2 = {}
    for i in li:
        dict_li_2[i] = nums.index(i)
    li_1 = []
    for i in dict_li_1.keys():
        li_1.append(dict_li_1[i]-dict_li_2[i])
    return min(li_1)+1

Write a SQL query to get the second highest salary from the Employee table.

+----+--------+
| Id | Salary |
+----+--------+
| 1 | 100 |
| 2 | 200 |
| 3 | 300 |
+----+--------+

For example, given the above Employee table, the query should return 200 as the second highest
salary. If there is no second highest salary, then the query should return null.

+---------------------+
| SecondHighestSalary |
+---------------------+
| 200 |
+---------------------+

SELECT
    (SELECT distinct Salary
     FROM Employee
     ORDER BY Salary DESC
     LIMIT 1 OFFSET 1) AS SecondHighestSalary

import pandas as pd

def second_highest_salary(employee: pd.DataFrame) -> pd.DataFrame:
    # Distinct salaries, highest first.
    salaries = employee['salary'].sort_values(ascending=False).drop_duplicates()

    if len(salaries) < 2:
        return pd.DataFrame({'SecondHighestSalary': [None]})

    second_highest = salaries.iloc[1]

    return pd.DataFrame({'SecondHighestSalary': [second_highest]})

Write a SQL query to rank scores. If there is a tie between two scores, both should have the same
ranking. Note that after a tie, the next ranking number should be the next consecutive integer value.
In other words, there should be no "holes" between ranks.

+----+-------+
| Id | Score |
+----+-------+
| 1 | 3.50 |
| 2 | 3.65 |
| 3 | 4.00 |
| 4 | 3.85 |
| 5 | 4.00 |
| 6 | 3.65 |
+----+-------+

For example, given the above Scores table, your query should generate the following report (order
by highest score):

+-------+---------+
| score | Rank |
+-------+---------+
| 4.00 | 1 |
| 4.00 | 1 |
| 3.85 | 2 |
| 3.65 | 3 |
| 3.65 | 3 |
| 3.50 | 4 |
+-------+---------+

Important Note: For MySQL solutions, to escape reserved words used as column names, you can
use a backtick before and after the keyword. For example `Rank`.

select score, dense_rank() over(order by score desc) as "rank"
from Scores ;

import pandas as pd

def order_scores(scores: pd.DataFrame) -> pd.DataFrame:
    scores['rank'] = scores['score'].rank(method='dense',
                                          ascending=False)
    ordered = scores.sort_values(by=['score', 'rank'],
                                 ascending=False)
    return ordered[['score', 'rank']]

Table: Logs

+-------------+---------+
| Column Name | Type |
+-------------+---------+
| id | int |
| num | varchar |
+-------------+---------+
id is the primary key for this table.

Write an SQL query to find all numbers that appear at least three times consecutively.

Return the result table in any order.

The query result format is in the following example:

Logs table:
+----+-----+
| Id | Num |
+----+-----+
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | 1 |
| 6 | 2 |
| 7 | 2 |
+----+-----+

Result table:
+-----------------+
| ConsecutiveNums |
+-----------------+
| 1 |
+-----------------+
1 is the only number that appears consecutively for at least three times.

with cte as (
    select num,
           lead(num,1) over(order by id) as next,
           lead(num,2) over(order by id) as next_2
    from Logs
)

select distinct num as ConsecutiveNums
from cte
where num = next
and num = next_2
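
The query compares each row with the next two rows. The sketch below expresses the same comparison in plain Python over (id, num) tuples. It is an illustration under the assumption that rows are compared in id order; the function name consecutive_nums and the tuple layout are made up for this example.

```python
from typing import List, Tuple


def consecutive_nums(logs: List[Tuple[int, str]]) -> List[str]:
    # Order the rows by id, then keep only the num column.
    ordered = [num for _, num in sorted(logs)]
    found = []
    # zip over three shifted views plays the role of num, next, next_2.
    for a, b, c in zip(ordered, ordered[1:], ordered[2:]):
        if a == b == c and a not in found:
            found.append(a)
    return found


if __name__ == "__main__":
    sample = [(1, "1"), (2, "1"), (3, "1"), (4, "2"),
              (5, "1"), (6, "2"), (7, "2")]
    print(consecutive_nums(sample))
```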

The Employee table holds all employees. Every employee has an Id, a salary, and there is also a
column for the department Id.

+----+-------+--------+--------------+
| Id | Name | Salary | DepartmentId |
+----+-------+--------+--------------+
| 1 | Joe | 70000 | 1 |
| 2 | Jim | 90000 | 1 |
| 3 | Henry | 80000 | 2 |
| 4 | Sam | 60000 | 2 |
| 5 | Max | 90000 | 1 |
+----+-------+--------+--------------+

The Department table holds all departments of the company.

+----+----------+
| Id | Name |
+----+----------+
| 1 | IT |
| 2 | Sales |
+----+----------+

Write a SQL query to find employees who have the highest salary in each of the departments. For the
above tables, your SQL query should return the following rows (order of rows does not matter).

+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT | Max | 90000 |
| IT | Jim | 90000 |
| Sales | Henry | 80000 |
+------------+----------+--------+

select Department, Employee, Salary
from
    (select Department, Employee, Salary,
            dense_rank() over(partition by Department order by Salary desc) as rnk
     from
         (select emp.name as Employee, emp.salary as Salary, dpt.name as Department
          from Employee emp inner join
               Department dpt
               on emp.departmentId = dpt.id) res ) dns
where rnk = 1;

import pandas as pd

def department_highest_salary(employee: pd.DataFrame, department: pd.DataFrame) -> pd.DataFrame:
    merged = employee.merge(department, left_on='departmentId',
                            right_on='id', how='inner')
    max_salaries = merged.groupby('name_y')['salary'].transform('max')
    highest_salary = merged[merged['salary'] == max_salaries][['name_y', 'name_x', 'salary']]
    highest_salary.columns = ['department', 'employee', 'salary']
    return highest_salary

Table: Activity

+--------------+---------+
| Column Name | Type |
+--------------+---------+
| player_id | int |
| device_id | int |
| event_date | date |
| games_played | int |
+--------------+---------+
(player_id, event_date) is the primary key of this table.
This table shows the activity of players of some game.
Each row is a record of a player who logged in and played a number of games
(possibly 0) before logging out on some day using some device.

Write an SQL query that reports the fraction of players that logged in again on the day after the day
they first logged in, rounded to 2 decimal places. In other words, you need to count the number of
players that logged in for at least two consecutive days starting from their first login date, then divide
that number by the total number of players.

The query result format is in the following example:

Activity table:
+-----------+-----------+------------+--------------+
| player_id | device_id | event_date | games_played |
+-----------+-----------+------------+--------------+
| 1 | 2 | 2016-03-01 | 5 |
| 1 | 2 | 2016-03-02 | 6 |
| 2 | 3 | 2017-06-25 | 1 |
| 3 | 1 | 2016-03-02 | 0 |
| 3 | 4 | 2018-07-03 | 5 |
+-----------+-----------+------------+--------------+

Result table:
+-----------+
| fraction |
+-----------+
| 0.33 |
+-----------+
Only the player with id 1 logged back in after the first day he had logged
in so the answer is 1/3 = 0.33

select round(count(distinct player_id)/(select count(distinct player_id) from Activity),2) as fraction
from Activity
where (player_id, date_sub(event_date, interval 1 day))
    in (select player_id, min(event_date) from Activity
        group by player_id);
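
The query counts players who have a login exactly one day after their first login and divides by the number of distinct players. The sketch below follows the same two steps in plain Python. It assumes the activity rows are given as (player_id, event_date) tuples and that at least one player exists; the function name day_one_retention is made up for this example.

```python
from datetime import date, timedelta
from typing import List, Tuple


def day_one_retention(activity: List[Tuple[int, date]]) -> float:
    # Step 1: first login date per player (the min(event_date) subquery).
    first_login = {}
    for player_id, event_date in activity:
        if player_id not in first_login or event_date < first_login[player_id]:
            first_login[player_id] = event_date

    # Step 2: count players with a login on the day after their first one.
    logins = set(activity)
    returned = sum(
        1 for player_id, first in first_login.items()
        if (player_id, first + timedelta(days=1)) in logins
    )
    return round(returned / len(first_login), 2)


if __name__ == "__main__":
    sample = [
        (1, date(2016, 3, 1)), (1, date(2016, 3, 2)),
        (2, date(2017, 6, 25)),
        (3, date(2016, 3, 2)), (3, date(2018, 7, 3)),
    ]
    print(day_one_retention(sample))
```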

The Employee table holds all employees including their managers. Every employee has an Id, and
there is also a column for the manager Id.

+------+----------+-----------+----------+
|Id |Name |Department |ManagerId |
+------+----------+-----------+----------+
|101 |John |A |null |
|102 |Dan |A |101 |
|103 |James |A |101 |
|104 |Amy |A |101 |
|105 |Anne |A |101 |
|106 |Ron |B |101 |
+------+----------+-----------+----------+

Given the Employee table, write a SQL query that finds managers with at least 5 direct reports.
For the above table, your SQL query should return:

+-------+
| Name |
+-------+
| John |
+-------+

select name from Employee
where id in
    (select managerId from Employee
     group by managerId
     having count(*) >= 5)

import pandas as pd

def find_managers(employee: pd.DataFrame) -> pd.DataFrame:
    direct_reports_count = employee.groupby('managerId')['id'].count().reset_index(name='report_count')
    managers_with_five_reports = direct_reports_count[direct_reports_count['report_count'] >= 5]
    manager = pd.merge(managers_with_five_reports, employee,
                       left_on='managerId', right_on='id', how='inner')

    return manager[['name']]

Write a query to print the sum of all total investment values in 2016 (TIV_2016), to a scale of 2
decimal places, for all policy holders who meet the following criteria:

1. Have the same TIV_2015 value as one or more other policyholders.
2. Are not located in the same city as any other policyholder (i.e.: the (latitude, longitude) attribute
pairs must be unique).

Input Format: The insurance table is described as follows:

| Column Name | Type |
|-------------|---------------|
| PID | INTEGER(11) |
| TIV_2015 | NUMERIC(15,2) |
| TIV_2016 | NUMERIC(15,2) |
| LAT | NUMERIC(5,2) |
| LON | NUMERIC(5,2) |

where PID is the policyholder’s policy ID, TIV_2015 is the total investment value in 2015, TIV_2016 is
the total investment value in 2016, LAT is the latitude of the policy holder’s city, and LON is the
longitude of the policy holder’s city.

Sample Input

| PID | TIV_2015 | TIV_2016 | LAT | LON |
|-----|----------|----------|-----|-----|
| 1 | 10 | 5 | 10 | 10 |
| 2 | 20 | 20 | 20 | 20 |
| 3 | 10 | 30 | 20 | 20 |
| 4 | 10 | 40 | 40 | 40 |

Sample Output

| TIV_2016 |
|----------|
| 45.00 |

select round(sum(tiv_2016),2) as tiv_2016
from Insurance
where tiv_2015 in (
    select tiv_2015
    from Insurance
    group by tiv_2015
    having count(*) >1
)
and (lat, lon) in (
    select lat, lon
    from Insurance
    group by lat, lon
    having count(*) <=1
)

import pandas as pd

def find_investments(insurance: pd.DataFrame) -> pd.DataFrame:
    # Rows whose (lat, lon) pair is unique.
    resq = insurance.drop_duplicates(subset=['lat', 'lon'],
                                     keep=False)
    # tiv_2015 values shared by at least two policyholders.
    need = insurance.groupby('tiv_2015').pid.count().reset_index()
    need = need[need['pid'] >= 2]
    need['pid'] = 1
    resq = resq.merge(need.rename(columns={'pid' :
        'coef'})).fillna(0)[['tiv_2016', 'coef']]
    resq['tiv_2016'] *= resq['coef']
    return pd.DataFrame([resq['tiv_2016'].sum()],
                        columns=['tiv_2016']).round(2)

In a social network like Facebook or Twitter, people send friend requests and accept others’ requests as
well. Table request_accepted holds the data of friend acceptance, while requester_id and
accepter_id both are the id of a person.

| requester_id | accepter_id | accept_date|
|--------------|-------------|------------|
| 1 | 2 | 2016-06-03 |
| 1 | 3 | 2016-06-08 |
| 2 | 3 | 2016-06-08 |
| 3 | 4 | 2016-06-09 |

Write a query to find the person who has the most friends and the number of friends they have. For the
sample data above, the result is:

| id | num |
|----|-----|
| 3 | 3 |

Note:
It is guaranteed there is only 1 person having the most friends. A friend request can only be
accepted once, which means there are no multiple records with the same requester_id and accepter_id
value.
Explanation: The person with id ‘3’ is a friend of people ‘1’, ‘2’ and ‘4’, so he has 3 friends in
total, which is more than anyone else.

Follow-up: In the real world, multiple people could have the same most number of friends, can you
find all these people in this case?

with cte as (
    select requester_id as id from RequestAccepted
    union all
    select accepter_id as id from RequestAccepted
)

select id, count(*) as num from cte group by id
order by num desc limit 1

import pandas as pd
from statistics import mode

def most_friends(request_accepted: pd.DataFrame) -> pd.DataFrame:
    res = pd.concat([request_accepted["requester_id"],
                     request_accepted["accepter_id"]]).tolist()
    r = mode(res)
    return pd.DataFrame({"id" : [r], "num" : [res.count(r)]})
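
For the follow-up, where several people may share the highest friend count, one possible approach is to count every appearance of an id on either side of a request and keep all ids that reach the maximum. The sketch below works on plain (requester_id, accepter_id) tuples rather than a DataFrame; that input layout and the function name all_most_friends are assumptions made for this example.

```python
from collections import Counter
from typing import List, Tuple


def all_most_friends(request_accepted: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    # Each accepted request adds one friend to both people involved.
    counts = Counter()
    for requester_id, accepter_id in request_accepted:
        counts[requester_id] += 1
        counts[accepter_id] += 1
    if not counts:
        return []
    best = max(counts.values())
    # Keep every id tied for the maximum, as (id, num) pairs.
    return sorted((pid, num) for pid, num in counts.items() if num == best)


if __name__ == "__main__":
    print(all_most_friends([(1, 2), (1, 3), (2, 3), (3, 4)]))
```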

Mary is a teacher in a middle school and she has a table seat storing students' names and their
corresponding seat ids.

The id column is a continuous increment.

Mary wants to change seats for the adjacent students.

Can you write a SQL query to output the result for Mary?

+---------+---------+
| id | student |
+---------+---------+
| 1 | Abbot |
| 2 | Doris |
| 3 | Emerson |
| 4 | Green |
| 5 | Jeames |
+---------+---------+

For the sample input, the output is:

+---------+---------+
| id | student |
+---------+---------+
| 1 | Doris |
| 2 | Abbot |
| 3 | Green |
| 4 | Emerson |
| 5 | Jeames |
+---------+---------+

select case
when id = (select max(id) from Seat) and id%2 = 1
then id
when id%2 = 1
then id+1
else
id-1
end as id,
student from Seat
order by id
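
The case expression above can be mirrored in plain Python to check the swap rule on the sample data. This is a minimal sketch, assuming the ids run from 1 to n with no gaps; the function name swap_seats is hypothetical.

```python
def swap_seats(seats):
    """seats: list of (id, student) with ids 1..n. Returns rows after swapping neighbours."""
    n = len(seats)
    result = []
    for seat_id, student in seats:
        if seat_id % 2 == 1 and seat_id == n:
            new_id = seat_id        # an odd last seat has no partner, so it stays
        elif seat_id % 2 == 1:
            new_id = seat_id + 1    # odd seat moves one place forward
        else:
            new_id = seat_id - 1    # even seat moves one place back
        result.append((new_id, student))
    return sorted(result)

sample = [(1, "Abbot"), (2, "Doris"), (3, "Emerson"), (4, "Green"), (5, "Jeames")]
for row in swap_seats(sample):
    print(row)
```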

Table: Customer

+-------------+---------+
| Column Name | Type |
+-------------+---------+
| customer_id | int |
| product_key | int |
+-------------+---------+

product_key is a foreign key to Product table.

Table: Product

+-------------+---------+
| Column Name | Type |
+-------------+---------+
| product_key | int |
+-------------+---------+
product_key is the primary key column for this table.

Write an SQL query for a report that provides the customer ids from the Customer table that bought
all the products in the Product table.

For example:

Customer table:
+-------------+-------------+
| customer_id | product_key |
+-------------+-------------+
| 1 | 5 |
| 2 | 6 |
| 3 | 5 |
| 3 | 6 |
| 1 | 6 |
+-------------+-------------+

Product table:
+-------------+
| product_key |
+-------------+
| 5 |
| 6 |
+-------------+
Result table:
+-------------+
| customer_id |
+-------------+
| 1 |
| 3 |
+-------------+
The customers who bought all the products (5 and 6) are customers with id 1
and 3.

select customer_id from (select distinct * from Customer) a
group by customer_id
having count(*) = (select count(product_key) from Product)

import pandas as pd

def find_customers(customer: pd.DataFrame, product: pd.DataFrame) -> pd.DataFrame:
    # Count distinct products per customer
    df = customer.drop_duplicates(keep='first').groupby('customer_id').count().reset_index()
    # Keep customers whose distinct product count equals the number of products
    return df[df.product_key == len(product)][['customer_id']]
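
The counting approach above relies on every product_key in Customer also existing in Product. An alternative sketch in plain Python compares sets directly, which makes the "bought all products" condition explicit. The function name customers_with_all_products and the tuple-based input are assumptions for illustration.

```python
from collections import defaultdict

def customers_with_all_products(customer_rows, product_keys):
    """customer_rows: iterable of (customer_id, product_key) pairs."""
    all_products = set(product_keys)
    bought = defaultdict(set)
    for customer_id, product_key in customer_rows:
        # A set ignores duplicate purchases of the same product
        bought[customer_id].add(product_key)
    # Keep customers whose purchases cover every product
    return sorted(cid for cid, keys in bought.items() if keys >= all_products)

rows = [(1, 5), (2, 6), (3, 5), (3, 6), (1, 6)]
print(customers_with_all_products(rows, [5, 6]))
```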

Table: Product

+--------------+---------+
| Column Name | Type |
+--------------+---------+
| product_id | int |
| product_name | varchar |
+--------------+---------+
product_id is the primary key of this table.

Write an SQL query that selects the product id, year, quantity, and price for the first year of every
product sold.

The query result format is in the following example:

Sales table:
+---------+------------+------+----------+-------+
| sale_id | product_id | year | quantity | price |
+---------+------------+------+----------+-------+
| 1 | 100 | 2008 | 10 | 5000 |
| 2 | 100 | 2009 | 12 | 5000 |
| 7 | 200 | 2011 | 15 | 9000 |
+---------+------------+------+----------+-------+

Product table:
+------------+--------------+
| product_id | product_name |
+------------+--------------+
| 100 | Nokia |
| 200 | Apple |
| 300 | Samsung |
+------------+--------------+

Result table:
+------------+------------+----------+-------+
| product_id | first_year | quantity | price |
+------------+------------+----------+-------+
| 100 | 2008 | 10 | 5000 |
| 200 | 2011 | 15 | 9000 |
+------------+------------+----------+-------+

select product_id, year as first_year, quantity, price
from Sales s
where (product_id,year) in (
select product_id, min(year) from Sales
group by product_id
)

import pandas as pd

def sales_analysis(sales: pd.DataFrame, product: pd.DataFrame) -> pd.DataFrame:
    # Merge sales and product tables
    merged_df = sales.merge(product, how='left', on='product_id')

    # Find the first year of sale for each product
    first_year_df = merged_df.groupby('product_id')['year'].min().reset_index()

    # Merge the first year information back to the merged dataframe
    # to get the corresponding quantity and price
    result_df = first_year_df.merge(
        merged_df, how='inner', on=['product_id', 'year']
    ).rename(columns={'year': 'first_year'})

    return result_df[['product_id', 'first_year', 'quantity', 'price']]
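
The two-step idea (find each product's earliest year, then keep the sales rows from that year) can also be sketched without pandas. The dictionary-based row format and the function name first_year_sales are assumptions made for this example.

```python
def first_year_sales(sales):
    """sales: list of dicts with keys product_id, year, quantity, price."""
    first_year = {}
    for row in sales:
        pid = row["product_id"]
        # Track the smallest year seen so far for each product
        first_year[pid] = min(first_year.get(pid, row["year"]), row["year"])
    return [
        {"product_id": r["product_id"], "first_year": r["year"],
         "quantity": r["quantity"], "price": r["price"]}
        for r in sales
        if r["year"] == first_year[r["product_id"]]
    ]

sample = [
    {"sale_id": 1, "product_id": 100, "year": 2008, "quantity": 10, "price": 5000},
    {"sale_id": 2, "product_id": 100, "year": 2009, "quantity": 12, "price": 5000},
    {"sale_id": 7, "product_id": 200, "year": 2011, "quantity": 15, "price": 9000},
]
for row in first_year_sales(sample):
    print(row)
```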

Table: Users

+----------------+---------+
| Column Name | Type |
+----------------+---------+
| user_id | int |
| join_date | date |
| favorite_brand | varchar |
+----------------+---------+
user_id is the primary key of this table.
This table has the info of the users of an online shopping website where
users can sell and buy items.

Table: Orders

+---------------+---------+
| Column Name | Type |
+---------------+---------+
| order_id | int |
| order_date | date |
| item_id | int |
| buyer_id | int |
| seller_id | int |
+---------------+---------+
order_id is the primary key of this table.
item_id is a foreign key to the Items table.
buyer_id and seller_id are foreign keys to the Users table.

Table: Items

+---------------+---------+
| Column Name | Type |
+---------------+---------+
| item_id | int |
| item_brand | varchar |
+---------------+---------+
item_id is the primary key of this table.

Write an SQL query to find for each user, the join date and the number of orders they made as a
buyer in 2019.

The query result format is in the following example:

Users table:
+---------+------------+----------------+
| user_id | join_date | favorite_brand |
+---------+------------+----------------+
| 1 | 2018-01-01 | Lenovo |
| 2 | 2018-02-09 | Samsung |
| 3 | 2018-01-19 | LG |
| 4 | 2018-05-21 | HP |
+---------+------------+----------------+

Orders table:
+----------+------------+---------+----------+-----------+
| order_id | order_date | item_id | buyer_id | seller_id |
+----------+------------+---------+----------+-----------+
| 1 | 2019-08-01 | 4 | 1 | 2 |
| 2 | 2018-08-02 | 2 | 1 | 3 |
| 3 | 2019-08-03 | 3 | 2 | 3 |
| 4 | 2018-08-04 | 1 | 4 | 2 |
| 5 | 2018-08-04 | 1 | 3 | 4 |
| 6 | 2019-08-05 | 2 | 2 | 4 |
+----------+------------+---------+----------+-----------+

Items table:
+---------+------------+
| item_id | item_brand |
+---------+------------+
| 1 | Samsung |
| 2 | Lenovo |
| 3 | LG |
| 4 | HP |
+---------+------------+

Result table:
+-----------+------------+----------------+
| buyer_id | join_date | orders_in_2019 |
+-----------+------------+----------------+
| 1 | 2018-01-01 | 1 |
| 2 | 2018-02-09 | 2 |
| 3 | 2018-01-19 | 0 |
| 4 | 2018-05-21 | 0 |
+-----------+------------+----------------+

select u.user_id as buyer_id, u.join_date, ifnull(a.orders_in_2019, 0) as orders_in_2019
from Users u
left join
(select buyer_id, count(*) as orders_in_2019 from Orders
where year(order_date) = 2019
group by buyer_id) a
on u.user_id = a.buyer_id

import pandas as pd

def market_analysis(users: pd.DataFrame, orders: pd.DataFrame, items: pd.DataFrame) -> pd.DataFrame:
    orders = orders[["order_date", "buyer_id"]]
    users = users[["user_id", "join_date"]].rename(columns={"user_id": "buyer_id"})
    # Count each buyer's orders placed in 2019 (order_date must be a datetime column)
    orders = (orders.loc[orders["order_date"].dt.year == 2019]
              .groupby(["buyer_id"])["order_date"].count().reset_index())
    # The left merge keeps users without 2019 orders; their missing counts become 0
    return users.merge(orders.rename(columns={"order_date": "orders_in_2019"}),
                       how="left").fillna(0)

Table: Products

+---------------+---------+
| Column Name | Type |
+---------------+---------+
| product_id | int |
| new_price | int |
| change_date | date |
+---------------+---------+
(product_id, change_date) is the primary key of this table.
Each row of this table indicates that the price of some product was changed
to a new price at some date.

Write an SQL query to find the prices of all products on 2019-08-16. Assume the price of all products
before any change is 10.

The query result format is in the following example:

Products table:
+------------+-----------+-------------+
| product_id | new_price | change_date |
+------------+-----------+-------------+
| 1 | 20 | 2019-08-14 |
| 2 | 50 | 2019-08-14 |
| 1 | 30 | 2019-08-15 |
| 1 | 35 | 2019-08-16 |
| 2 | 65 | 2019-08-17 |
| 3 | 20 | 2019-08-18 |
+------------+-----------+-------------+

Result table:
+------------+-------+
| product_id | price |
+------------+-------+
| 2 | 50 |
| 1 | 35 |
| 3 | 10 |
+------------+-------+

select product_id, new_price as price
from Products
where (product_id, change_date) in
    (select product_id, max(change_date)
    from Products
    where change_date <= '2019-08-16'
    group by product_id)
union
select product_id, 10 as price
from Products
where product_id not in
    (select product_id
    from Products
    where change_date <= '2019-08-16')

import pandas as pd

def price_at_given_date(products: pd.DataFrame) -> pd.DataFrame:
    # Latest change on or before 2019-08-16 for each product
    before = (products[products.change_date <= '2019-08-16']
              .groupby('product_id').change_date.max().reset_index())
    resbefore = products.merge(before, how='right')
    # Products with no change by that date keep the default price of 10
    after = (products[~products.product_id.isin(resbefore.product_id)]
             .drop_duplicates(subset='product_id').assign(new_price=10))
    return pd.concat([resbefore, after])[['product_id', 'new_price']].rename(
        columns={'new_price': 'price'})
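
The rule behind both solutions is: take the most recent change on or before the target date, and fall back to 10 when no such change exists. A minimal plain-Python sketch of that rule follows. It assumes dates are ISO strings (YYYY-MM-DD), which compare correctly as text; the function name prices_on is hypothetical.

```python
def prices_on(changes, day="2019-08-16", default_price=10):
    """changes: list of (product_id, new_price, change_date) with ISO date strings."""
    latest = {}   # product_id -> (change_date, new_price) of the newest change up to `day`
    prices = {}
    for product_id, new_price, change_date in changes:
        # Every product starts at the default price
        prices.setdefault(product_id, default_price)
        if change_date <= day and (
            product_id not in latest or change_date > latest[product_id][0]
        ):
            latest[product_id] = (change_date, new_price)
    for product_id, (_, new_price) in latest.items():
        prices[product_id] = new_price
    return prices

sample = [
    (1, 20, "2019-08-14"), (2, 50, "2019-08-14"), (1, 30, "2019-08-15"),
    (1, 35, "2019-08-16"), (2, 65, "2019-08-17"), (3, 20, "2019-08-18"),
]
print(prices_on(sample))
```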

Table: Delivery

+-----------------------------+---------+
| Column Name | Type |
+-----------------------------+---------+
| delivery_id | int |
| customer_id | int |
| order_date | date |
| customer_pref_delivery_date | date |
+-----------------------------+---------+
delivery_id is the primary key of this table.
The table holds information about food delivery to customers that make
orders at some date and specify a preferred delivery date (on the same order
date or after it).

If the preferred delivery date of the customer is the same as the order date, then the order is called
immediate; otherwise it's called scheduled.

The first order of a customer is the order with the earliest order date that customer made. It is
guaranteed that a customer has exactly one first order.

Write an SQL query to find the percentage of immediate orders in the first orders of all customers,
rounded to 2 decimal places.

The query result format is in the following example:

Delivery table:
+-------------+-------------+------------+-----------------------------+
| delivery_id | customer_id | order_date | customer_pref_delivery_date |
+-------------+-------------+------------+-----------------------------+
| 1 | 1 | 2019-08-01 | 2019-08-02 |
| 2 | 2 | 2019-08-02 | 2019-08-02 |
| 3 | 1 | 2019-08-11 | 2019-08-12 |
| 4 | 3 | 2019-08-24 | 2019-08-24 |
| 5 | 3 | 2019-08-21 | 2019-08-22 |
| 6 | 2 | 2019-08-11 | 2019-08-13 |
| 7 | 4 | 2019-08-09 | 2019-08-09 |
+-------------+-------------+------------+-----------------------------+

Result table:
+----------------------+
| immediate_percentage |
+----------------------+
| 50.00 |
+----------------------+
The customer id 1 has a first order with delivery id 1 and it is scheduled.
The customer id 2 has a first order with delivery id 2 and it is immediate.
The customer id 3 has a first order with delivery id 5 and it is scheduled.
The customer id 4 has a first order with delivery id 7 and it is immediate.
Hence, half the customers have immediate first orders.

select round(
(select count(distinct customer_id) from
(select customer_id, min(order_date) as a,
min(customer_pref_delivery_date) as b from Delivery
group by customer_id
having a = b) c)
/ (select count(distinct customer_id) from Delivery) * 100, 2) as immediate_percentage

import pandas as pd

def immediate_food_delivery(delivery: pd.DataFrame) -> pd.DataFrame:
    # Per customer: earliest order date and earliest preferred delivery date
    df = delivery.groupby("customer_id").min()
    den = len(df)
    # The first order is immediate exactly when the two minimums coincide
    num = len(df[df.order_date == df.customer_pref_delivery_date])
    return pd.DataFrame({"immediate_percentage": [num / den * 100]}).round(2)
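
A more literal reading of the problem is to pick each customer's first order explicitly and then test whether it is immediate. The sketch below does that in plain Python, assuming ISO date strings and at least one delivery row; the function name immediate_percentage is hypothetical.

```python
def immediate_percentage(deliveries):
    """deliveries: list of (customer_id, order_date, pref_date) with ISO date strings."""
    first = {}  # customer_id -> (order_date, pref_date) of the earliest order
    for customer_id, order_date, pref_date in deliveries:
        if customer_id not in first or order_date < first[customer_id][0]:
            first[customer_id] = (order_date, pref_date)
    # An order is immediate when the preferred date equals the order date
    immediate = sum(1 for order_date, pref_date in first.values() if order_date == pref_date)
    return round(100 * immediate / len(first), 2)

sample = [
    (1, "2019-08-01", "2019-08-02"), (2, "2019-08-02", "2019-08-02"),
    (1, "2019-08-11", "2019-08-12"), (3, "2019-08-24", "2019-08-24"),
    (3, "2019-08-21", "2019-08-22"), (2, "2019-08-11", "2019-08-13"),
    (4, "2019-08-09", "2019-08-09"),
]
print(immediate_percentage(sample))
```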

Table: Transactions

+---------------+---------+
| Column Name | Type |
+---------------+---------+
| id | int |
| country | varchar |
| state | enum |
| amount | int |
| trans_date | date |
+---------------+---------+
id is the primary key of this table.
The table has information about incoming transactions.
The state column is an enum of type ["approved", "declined"].

Write an SQL query to find for each month and country, the number of transactions and their total
amount, the number of approved transactions and their total amount.
The query result format is in the following example:

Transactions table:
+------+---------+----------+--------+------------+
| id | country | state | amount | trans_date |
+------+---------+----------+--------+------------+
| 121 | US | approved | 1000 | 2018-12-18 |
| 122 | US | declined | 2000 | 2018-12-19 |
| 123 | US | approved | 2000 | 2019-01-01 |
| 124 | DE | approved | 2000 | 2019-01-07 |
+------+---------+----------+--------+------------+

Result table:
+----------+---------+-------------+----------------+--------------------+-----------------------+
| month    | country | trans_count | approved_count | trans_total_amount | approved_total_amount |
+----------+---------+-------------+----------------+--------------------+-----------------------+
| 2018-12  | US      | 2           | 1              | 3000               | 1000                  |
| 2019-01  | US      | 1           | 1              | 2000               | 2000                  |
| 2019-01  | DE      | 1           | 1              | 2000               | 2000                  |
+----------+---------+-------------+----------------+--------------------+-----------------------+

select date_format(trans_date, '%Y-%m') as month,
country,
count(id)as trans_count,
sum(state= 'approved') as approved_count,
sum(amount) as trans_total_amount,
sum(if(state='approved', amount,0)) as approved_total_amount
from Transactions
group by country, month

import pandas as pd

def monthly_transactions(transactions: pd.DataFrame) -> pd.DataFrame:
    transactions['month'] = pd.to_datetime(transactions['trans_date']).dt.strftime('%Y-%m')
    # Encode state as 1 (approved) or 0 (declined) so sums give approved counts and amounts
    transactions['state'] = (transactions['state'] == 'approved').astype(int)
    transactions['approved_total_amount'] = transactions['amount'] * transactions['state']
    resq = transactions.groupby(['month', 'country'], dropna=False).agg(
        {'id': 'count', 'state': 'sum', 'amount': 'sum',
         'approved_total_amount': 'sum'}).reset_index()
    return resq.rename(columns={'id': 'trans_count', 'state': 'approved_count',
                                'amount': 'trans_total_amount'})
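
The aggregation is a single pass that keeps four running numbers per (month, country) key. The plain-Python sketch below shows that structure, assuming ISO date strings so the month is the first seven characters; the function name monthly_summary and the tuple layout are assumptions for illustration.

```python
from collections import defaultdict

def monthly_summary(transactions):
    """transactions: list of (country, state, amount, trans_date).

    Returns {(month, country): (trans_count, approved_count,
                                trans_total_amount, approved_total_amount)}.
    """
    summary = defaultdict(lambda: [0, 0, 0, 0])
    for country, state, amount, trans_date in transactions:
        row = summary[(trans_date[:7], country)]  # 'YYYY-MM' prefix is the month
        row[0] += 1
        row[2] += amount
        if state == "approved":
            row[1] += 1
            row[3] += amount
    return {key: tuple(values) for key, values in summary.items()}

sample = [
    ("US", "approved", 1000, "2018-12-18"), ("US", "declined", 2000, "2018-12-19"),
    ("US", "approved", 2000, "2019-01-01"), ("DE", "approved", 2000, "2019-01-07"),
]
for key, values in monthly_summary(sample).items():
    print(key, values)
```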

Table: Queue

+-------------+---------+
| Column Name | Type |
+-------------+---------+
| person_id | int |
| person_name | varchar |
| weight | int |
| turn | int |
+-------------+---------+
person_id is the primary key column for this table.
This table has the information about all people waiting for an elevator.
The person_id and turn columns will contain all numbers from 1 to n, where n
is the number of rows in the table.

The maximum weight the elevator can hold is 1000.

Write an SQL query to find the person_name of the last person who will fit in the elevator without
exceeding the weight limit. It is guaranteed that the person who is first in the queue can fit in the
elevator.

The query result format is in the following example:

Queue table
+-----------+-------------------+--------+------+
| person_id | person_name | weight | turn |
+-----------+-------------------+--------+------+
| 5 | George Washington | 250 | 1 |
| 3 | John Adams | 350 | 2 |
| 6 | Thomas Jefferson | 400 | 3 |
| 2 | Will Johnliams | 200 | 4 |
| 4 | Thomas Jefferson | 175 | 5 |
| 1 | James Elephant | 500 | 6 |
+-----------+-------------------+--------+------+

Result table
+-------------------+
| person_name |
+-------------------+
| Thomas Jefferson |
+-------------------+
Queue table is ordered by turn in the example for simplicity.
In the example George Washington(id 5), John Adams(id 3) and Thomas
Jefferson(id 6) will enter the elevator as their weight sum is 250 + 350 +
400 = 1000.
Thomas Jefferson(id 6) is the last person to fit in the elevator because he
has the last turn among these three people.

select person_name from (select person_name, turn, sum(weight)
over(order by turn) as a
from Queue) b
where a <= 1000 order by turn desc limit 1

import pandas as pd

def last_passenger(queue: pd.DataFrame) -> pd.DataFrame:
    # Board people in turn order and track the running total weight
    sort_queue = queue.sort_values(by='turn', ascending=True)
    sort_queue['Total Weight'] = sort_queue['weight'].cumsum()
    # The last row still within the limit is the last person to board
    return sort_queue[sort_queue['Total Weight'] <= 1000][['person_name']].tail(1)
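
Both solutions rest on a running total over the queue sorted by turn. A minimal plain-Python sketch of the same idea uses itertools.accumulate; it assumes positive weights, so the running total never decreases and the loop can stop at the first person who exceeds the limit. The function name last_passenger_name is hypothetical.

```python
from itertools import accumulate

def last_passenger_name(queue, limit=1000):
    """queue: list of (person_name, weight, turn)."""
    ordered = sorted(queue, key=lambda row: row[2])           # board by turn
    totals = accumulate(weight for _, weight, _ in ordered)    # running total weight
    last = None
    for (name, _, _), total in zip(ordered, totals):
        if total > limit:
            break
        last = name
    return last

sample = [
    ("George Washington", 250, 1), ("John Adams", 350, 2),
    ("Thomas Jefferson", 400, 3), ("Will Johnliams", 200, 4),
    ("Thomas Jefferson", 175, 5), ("James Elephant", 500, 6),
]
print(last_passenger_name(sample))
```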

Table: Customer

+---------------+---------+
| Column Name | Type |
+---------------+---------+
| customer_id | int |
| name | varchar |
| visited_on | date |
| amount | int |
+---------------+---------+
(customer_id, visited_on) is the primary key for this table.
This table contains data about customer transactions in a restaurant.
visited_on is the date on which the customer with ID (customer_id) has
visited the restaurant.
amount is the total paid by a customer.

You are the restaurant owner and you want to analyze a possible expansion (there will be at least one
customer every day).

Write an SQL query to compute the moving average of how much customers paid in a 7-day window
(current day + 6 days before).

The query result format is in the following example:

Return result table ordered by visited_on.

average_amount should be rounded to 2 decimal places, all dates are in the format ('YYYY-MM-DD').

Customer table:
+-------------+--------------+--------------+-------------+
| customer_id | name | visited_on | amount |
+-------------+--------------+--------------+-------------+
| 1 | Jhon | 2019-01-01 | 100 |
| 2 | Daniel | 2019-01-02 | 110 |
| 3 | Jade | 2019-01-03 | 120 |
| 4 | Khaled | 2019-01-04 | 130 |
| 5 | Winston | 2019-01-05 | 110 |
| 6 | Elvis | 2019-01-06 | 140 |
| 7 | Anna | 2019-01-07 | 150 |
| 8 | Maria | 2019-01-08 | 80 |
| 9 | Jaze | 2019-01-09 | 110 |
| 1 | Jhon | 2019-01-10 | 130 |
| 3 | Jade | 2019-01-10 | 150 |
+-------------+--------------+--------------+-------------+

Result table:
+--------------+--------------+----------------+
| visited_on | amount | average_amount |
+--------------+--------------+----------------+
| 2019-01-07 | 860 | 122.86 |
| 2019-01-08 | 840 | 120 |
| 2019-01-09 | 840 | 120 |
| 2019-01-10 | 1000 | 142.86 |
+--------------+--------------+----------------+

1st moving average from 2019-01-01 to 2019-01-07 has an average_amount of
(100 + 110 + 120 + 130 + 110 + 140 + 150)/7 = 122.86
2nd moving average from 2019-01-02 to 2019-01-08 has an average_amount of
(110 + 120 + 130 + 110 + 140 + 150 + 80)/7 = 120
3rd moving average from 2019-01-03 to 2019-01-09 has an average_amount of
(120 + 130 + 110 + 140 + 150 + 80 + 110)/7 = 120
4th moving average from 2019-01-04 to 2019-01-10 has an average_amount of
(130 + 110 + 140 + 150 + 80 + 110 + 130 + 150)/7 = 142.86

select visited_on,
sum(amount) over (
order by visited_on
rows between 6 preceding AND 0 following
) as amount,
round(avg(amount) over (
order by visited_on
rows between 6 preceding AND 0 following
),2) as average_amount
from (
select visited_on, sum(amount) as amount
from Customer
group by visited_on
) a
order by visited_on
limit 18446744073709551610
offset 6
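
No pandas version accompanies this query, so here is a plain-Python sketch of the same computation: sum amounts per day, then slide a 7-day window over the sorted days. It relies on the stated guarantee of at least one customer every day, so seven consecutive rows equal seven consecutive dates. The function name moving_average is an assumption for illustration.

```python
from collections import defaultdict

def moving_average(visits, window=7):
    """visits: list of (visited_on, amount) with ISO date strings."""
    daily = defaultdict(int)
    for visited_on, amount in visits:
        daily[visited_on] += amount       # several customers may visit on one day
    days = sorted(daily)
    rows = []
    # The first full window ends on the 7th day, mirroring "offset 6" in the query
    for i in range(window - 1, len(days)):
        total = sum(daily[d] for d in days[i - window + 1 : i + 1])
        rows.append((days[i], total, round(total / window, 2)))
    return rows

sample = [
    ("2019-01-01", 100), ("2019-01-02", 110), ("2019-01-03", 120),
    ("2019-01-04", 130), ("2019-01-05", 110), ("2019-01-06", 140),
    ("2019-01-07", 150), ("2019-01-08", 80), ("2019-01-09", 110),
    ("2019-01-10", 130), ("2019-01-10", 150),
]
for row in moving_average(sample):
    print(row)
```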

Table: Movies

+---------------+---------+
| Column Name | Type |
+---------------+---------+
| movie_id | int |
| title | varchar |
+---------------+---------+
movie_id is the primary key for this table.
title is the name of the movie.

Table: Users
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| user_id | int |
| name | varchar |
+---------------+---------+
user_id is the primary key for this table.

Table: Movie_Rating

+---------------+---------+
| Column Name | Type |
+---------------+---------+
| movie_id | int |
| user_id | int |
| rating | int |
| created_at | date |
+---------------+---------+
(movie_id, user_id) is the primary key for this table.
This table contains the rating of a movie by a user in their review.
created_at is the user's review date.

Write an SQL query that returns the following two results:

Find the name of the user who has rated the greatest number of movies.
In case of a tie, return the lexicographically smaller user name.

Find the movie name with the highest average rating in February 2020.
In case of a tie, return the lexicographically smaller movie name.

The result is returned in 2 rows; the query result format is in the following example:

Movies table:
+-------------+--------------+
| movie_id | title |
+-------------+--------------+
| 1 | Avengers |
| 2 | Frozen 2 |
| 3 | Joker |
+-------------+--------------+

Users table:
+-------------+--------------+
| user_id | name |
+-------------+--------------+
| 1 | Daniel |
| 2 | Monica |
| 3 | Maria |
| 4 | James |
+-------------+--------------+

Movie_Rating table:
+-------------+--------------+--------------+-------------+
| movie_id | user_id | rating | created_at |
+-------------+--------------+--------------+-------------+
| 1 | 1 | 3 | 2020-01-12 |
| 1 | 2 | 4 | 2020-02-11 |
| 1 | 3 | 2 | 2020-02-12 |
| 1 | 4 | 1 | 2020-01-01 |
| 2 | 1 | 5 | 2020-02-17 |
| 2 | 2 | 2 | 2020-02-01 |
| 2 | 3 | 2 | 2020-03-01 |
| 3 | 1 | 3 | 2020-02-22 |
| 3 | 2 | 4 | 2020-02-25 |
+-------------+--------------+--------------+-------------+

Result table:
+--------------+
| results |
+--------------+
| Daniel |
| Frozen 2 |
+--------------+

Daniel and Monica have rated 3 movies ("Avengers", "Frozen 2" and "Joker")
but Daniel is smaller lexicographically.
Frozen 2 and Joker have a rating average of 3.5 in February but Frozen 2 is
smaller lexicographically.

(select name as results from
Users
where user_id in
(select user_id
from
(select user_id, count(rating) as b
from MovieRating
group by user_id) a
where b = (
select max(b)
from (select user_id, count(rating) as b
from MovieRating
group by user_id) a
))
order by name
limit 1)
union all
(select title as results from
Movies
where movie_id in
(select movie_id
from
(select movie_id, avg(rating) as b
from MovieRating
where date_format(created_at,'%Y-%m') = '2020-02'
group by movie_id) a
where b = (
select max(b)
from (select movie_id, avg(rating) as b
from MovieRating
where date_format(created_at,'%Y-%m') = '2020-02'
group by movie_id) a
))
order by title
limit 1)

import pandas as pd

def movie_rating(movies: pd.DataFrame, users: pd.DataFrame, movie_rating: pd.DataFrame) -> pd.DataFrame:
    # Merge name/movie info
    movies_m1 = movie_rating.merge(users, on='user_id', how='left')
    movies_m2 = movies_m1.merge(movies, on='movie_id', how='left')

    # Most frequent rater; ties go to the lexicographically smaller name
    fr = movies_m2['name'].value_counts()
    m_user = fr[fr == fr.max()].sort_index(ascending=True).index[0]

    # Filter for ratings created in February 2020
    movies_feb = movies_m2[(movies_m2['created_at'] >= '2020-02-01') &
                           (movies_m2['created_at'] <= '2020-02-29')]

    # Average for each title, then sort to find the highest rating
    avg_rating = (movies_feb.groupby(by='title').agg({'rating': 'mean'})
                  .reset_index().sort_values(by=['rating'], ascending=False))
    max_rating = (avg_rating[avg_rating['rating'] == avg_rating['rating'].max()]
                  .sort_values(by='title', ascending=True)['title'].iloc[0])

    return pd.DataFrame([m_user, max_rating], columns=['results'])

Table: Stocks

+---------------+---------+
| Column Name | Type |
+---------------+---------+
| stock_name | varchar |
| operation | enum |
| operation_day | int |
| price | int |
+---------------+---------+
(stock_name, operation_day) is the primary key for this table.
The operation column is an ENUM of type ('Sell', 'Buy')
Each row of this table indicates that the stock which has stock_name had an
operation on the day operation_day with the price.
It is guaranteed that each 'Sell' operation for a stock has a corresponding
'Buy' operation in a previous day.

Write an SQL query to report the Capital gain/loss for each stock.

The capital gain/loss of a stock is total gain or loss after buying and selling the stock one or many
times.

Return the result table in any order.

The query result format is in the following example:

Stocks table:
+---------------+-----------+---------------+--------+
| stock_name | operation | operation_day | price |
+---------------+-----------+---------------+--------+
| Leetcode | Buy | 1 | 1000 |
| Corona Masks | Buy | 2 | 10 |
| Leetcode | Sell | 5 | 9000 |
| Handbags | Buy | 17 | 30000 |
| Corona Masks | Sell | 3 | 1010 |
| Corona Masks | Buy | 4 | 1000 |
| Corona Masks | Sell | 5 | 500 |
| Corona Masks | Buy | 6 | 1000 |
| Handbags | Sell | 29 | 7000 |
| Corona Masks | Sell | 10 | 10000 |
+---------------+-----------+---------------+--------+

Result table:
+---------------+-------------------+
| stock_name | capital_gain_loss |
+---------------+-------------------+
| Corona Masks | 9500 |
| Leetcode | 8000 |
| Handbags | -23000 |
+---------------+-------------------+

Leetcode stock was bought at day 1 for 1000$ and was sold at day 5 for
9000$. Capital gain = 9000 - 1000 = 8000$.
Handbags stock was bought at day 17 for 30000$ and was sold at day 29 for
7000$. Capital loss = 7000 - 30000 = -23000$.
Corona Masks stock was bought at day 2 for 10$ and was sold at day 3 for
1010$. It was bought again at day 4 for 1000$ and was sold at day 5 for
500$. At last, it was bought at day 6 for 1000$ and was sold at day 10 for
10000$. Capital gain/loss is the sum of capital gains/losses for each ('Buy'
--> 'Sell') operation = (1010 - 10) + (500 - 1000) + (10000 - 1000) = 1000 -
500 + 9000 = 9500$.

select stock_name,
sum(case when operation = 'Buy' then -1*price
when operation = 'Sell' then price end) as capital_gain_loss
from Stocks
group by stock_name

import pandas as pd

def capital_gainloss(stocks: pd.DataFrame) -> pd.DataFrame:
    # Treat each 'Buy' price as negative so a per-stock sum yields the net gain/loss
    stocks.price = stocks.price.mask(stocks.operation == "Buy", -stocks.price)
    return (stocks.groupby(["stock_name"]).price.sum().reset_index()
            [["stock_name", "price"]].rename(columns={"price": "capital_gain_loss"}))
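
Because every 'Sell' is guaranteed a matching earlier 'Buy', the pairing never has to be made explicit: summing sell prices and subtracting buy prices per stock gives the same total. A plain-Python sketch of that signed sum follows; the function name capital_gain_loss and the tuple layout are assumptions for illustration.

```python
from collections import defaultdict

def capital_gain_loss(operations):
    """operations: list of (stock_name, operation, price)."""
    totals = defaultdict(int)
    for stock_name, operation, price in operations:
        # Selling adds to the total, buying subtracts from it
        totals[stock_name] += price if operation == "Sell" else -price
    return dict(totals)

sample = [
    ("Leetcode", "Buy", 1000), ("Corona Masks", "Buy", 10),
    ("Leetcode", "Sell", 9000), ("Handbags", "Buy", 30000),
    ("Corona Masks", "Sell", 1010), ("Corona Masks", "Buy", 1000),
    ("Corona Masks", "Sell", 500), ("Corona Masks", "Buy", 1000),
    ("Handbags", "Sell", 7000), ("Corona Masks", "Sell", 10000),
]
print(capital_gain_loss(sample))
```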

Table: Accounts
+-------------+------+
| Column Name | Type |
+-------------+------+
| account_id | int |
| income | int |
+-------------+------+
account_id is the primary key (column with unique values) for this table.
Each row contains information about the monthly income for one bank account.

Write a solution to calculate the number of bank accounts for each salary category. The salary
categories are:
"Low Salary": All the salaries strictly less than $20000.
"Average Salary": All the salaries in the inclusive range [$20000, $50000].
"High Salary": All the salaries strictly greater than $50000.
The result table must contain all three categories. If there are no accounts in a category, return 0.
Return the result table in any order.
The result format is in the following example.

Example 1:
Input:
Accounts table:
+------------+--------+
| account_id | income |
+------------+--------+
| 3 | 108939 |
| 2 | 12747 |
| 8 | 87709 |
| 6 | 91796 |
+------------+--------+
Output:
+----------------+----------------+
| category | accounts_count |
+----------------+----------------+
| Low Salary | 1 |
| Average Salary | 0 |
| High Salary | 3 |
+----------------+----------------+
Explanation:
Low Salary: Account 2.
Average Salary: No accounts.
High Salary: Accounts 3, 6, and 8.

select c.category, ifnull(b.accounts_count,0) as accounts_count
from
(
select 'Low Salary' as category
from Accounts
union
select 'Average Salary' as category
from Accounts
union
select 'High Salary' as category
from Accounts
) c
left join
(select category , count(*) as accounts_count
from
(select account_id,
case when income < 20000 then 'Low Salary'
when income >= 20000 and income <=50000 then 'Average Salary'
when income > 50000 then 'High Salary' end as category
from Accounts ) a
group by category) b
on c.category = b.category

import pandas as pd

def count_salary_categories(accounts: pd.DataFrame) -> pd.DataFrame:
    # Count accounts in each income band
    low = accounts[accounts['income'] < 20000].shape[0]
    average = accounts[(accounts['income'] >= 20000) & (accounts['income'] <= 50000)].shape[0]
    high = accounts[accounts['income'] > 50000].shape[0]
    # Listing all three categories explicitly guarantees that empty ones appear with 0
    result = pd.DataFrame({
        'category': ['High Salary', 'Low Salary', 'Average Salary'],
        'accounts_count': [high, low, average]
    })
    return result
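
The requirement that all three categories appear, even with zero accounts, is easy to meet by starting from a dictionary that already holds every category. A plain-Python sketch, assuming the incomes arrive as a simple list; the function name salary_category_counts is hypothetical.

```python
def salary_category_counts(incomes):
    """incomes: iterable of monthly incomes, one per account."""
    # Pre-fill every category so empty ones are reported as 0
    counts = {"Low Salary": 0, "Average Salary": 0, "High Salary": 0}
    for income in incomes:
        if income < 20000:
            counts["Low Salary"] += 1
        elif income <= 50000:
            counts["Average Salary"] += 1   # inclusive range [20000, 50000]
        else:
            counts["High Salary"] += 1
    return counts

print(salary_category_counts([108939, 12747, 87709, 91796]))
```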
