SQL For Product Managers - HelloPM - Co

SQL is a language used to retrieve and manage data from relational databases. As a product manager, learning SQL allows you to more deeply understand product data, uncover insights not found in standard tools, and make faster decisions by reducing dependence on data teams. Key SQL concepts include using SELECT statements to retrieve data, filtering with WHERE clauses, and using ORDER BY, LIMIT, and aggregation functions like COUNT to further analyze the data.

Uploaded by

Purnesh Prabhu
SQL for product managers – the definitive guide

© HELLOPM.CO
Product management is about taking your product in the right
direction as quickly as possible. Data will always be an integral
part of your decision-making, and being able to do the primary
analysis yourself can noticeably boost your speed as a product
manager.

In this guide, we will look at what SQL is and how product managers
can use it to make better decisions, faster.

Here is the index if you want to jump to a particular section:


– Introduction
– Do product managers need to know SQL?
– SQL & Relational databases
– Retrieving data with SQL
– Filter data with SQL
– Using LIMITS & SORTING in SQL
– Using Aggregate & Group BY
– Introduction to Joins
– Where to write SQL queries?
– Best practices for using SQL for product managers
– Common questions you can try answering with SQL

Depending on the type and scale of the organization you work for, as
a product manager you’ll have access to many free and paid tools for
data analysis, ranging from Google Analytics and Mixpanel to Tableau
or Power BI.

While these tools will add to your productivity and analysis skills,
it’s better as a product manager to be as close to your data as
possible. In most of the applications you’re going to work with, the
data is stored in something known as a relational database.

Once you develop a decent understanding of how these databases work
and how you can fetch data and insights from them, you’ll get a
better command of your product’s usage data.

This understanding will help you make decisions faster as a product
manager, because you no longer have to depend on your data team to
carry out basic analysis for you. You can also guide your development
team to record any additional data that would help you improve the
product.

Do product managers need to know SQL?

In summary, here are 3 reasons why product managers should learn SQL:

● In-depth understanding of your data, so you can meaningfully
contribute to data decisions.
● Ability to uncover insights that are not available in standard
tools such as Google Analytics, Mixpanel, etc.
● Faster decisions, because you depend less on your data team for
basic analysis and free them up for harder problems.

What is SQL?

SQL, which stands for ‘Structured Query Language’, is a computer
language designed to retrieve and manage data in a relational
database.

Thus, to understand SQL you need to learn three things:

● What a relational database is
● How data is stored in these databases
● How to get the data you want from these databases

A relational database is a collection of tables in which data is
stored in rows. Think of it like a Google spreadsheet: the whole
spreadsheet is the relational database, and the individual sheets
within it are the tables. To enter new data we insert a row into one
of the sheets, and the column names tell us what each piece of
information in that row means.

[Figure: a simple visualisation of spreadsheet sheets as tables in a relational database]

These tables can be related to each other through common columns
(also known as foreign keys). Let’s consider a very basic example
from Instagram to understand this. A simplified database structure
(also known as a data-schema) for Instagram could look like this:

Table 1: Users
Columns: id, first_name, last_name, handle, registered_at, email_id

Table 2: Photos
Columns: id, user_id, photo_url, created_at, caption

Table 3: Likes
Columns: id, user_id, photo_id, created_at

As per this schema, whenever someone registers on Instagram, a new
row is added to the users table. The person is assigned an
automatically generated unique id (id), and other information about
them (first_name, last_name, handle, registered_at, email_id) is
recorded in the same row.

Whenever someone uploads a photo on Instagram, a row is added to the
photos table. It contains the photo URL, the time of posting and the
caption, along with the user_id to keep track of which user uploaded
the photo.

When someone likes this photo, a new row is entered in the likes
table with the user_id of the person who liked the photo, the id of
the photo that was liked (photo_id) and the time when it was liked.

If you have been following closely, you will have noticed that the
photos table is connected to the users table through user_id, and the
likes table is connected to the users and photos tables through
user_id and photo_id respectively.

The user_id in the photos table, and the photo_id and user_id in the
likes table, are examples of foreign keys.
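The schema above can be sketched as runnable code. This is only an illustration under stated assumptions: it uses Python’s built-in sqlite3 module with a throwaway in-memory database, the column types are guesses, and the sample rows are made up. A real product database would use a different engine and many more columns.

```python
import sqlite3

# Throwaway in-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Table and column names follow the simplified Instagram schema;
# the column types are assumptions made for this sketch.
conn.executescript("""
CREATE TABLE users (
    id            INTEGER PRIMARY KEY,
    first_name    TEXT,
    last_name     TEXT,
    handle        TEXT,
    registered_at TEXT,
    email_id      TEXT
);
CREATE TABLE photos (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER REFERENCES users(id),   -- foreign key to users
    photo_url  TEXT,
    created_at TEXT,
    caption    TEXT
);
CREATE TABLE likes (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER REFERENCES users(id),   -- who liked
    photo_id   INTEGER REFERENCES photos(id),  -- what was liked
    created_at TEXT
);
""")

# Made-up sample rows: one user, one photo, one like.
conn.execute("INSERT INTO users (id, first_name, handle) VALUES (1, 'Asha', 'asha')")
conn.execute("INSERT INTO photos (id, user_id, photo_url) VALUES (10, 1, 'https://example.com/p/10.jpg')")
conn.execute("INSERT INTO likes (user_id, photo_id) VALUES (1, 10)")

# A like that points at a photo that does not exist is rejected by the foreign key.
try:
    conn.execute("INSERT INTO likes (user_id, photo_id) VALUES (1, 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

like_count = conn.execute("SELECT COUNT(*) FROM likes").fetchone()[0]
```

The rejected insert is the point of the sketch: a foreign key is not just a naming convention, the database can use it to refuse rows that refer to something that is not there.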

The ability to create these kinds of relationships between tables is
the real superpower of relational databases. You can model the data
for almost any product in a relational database using multiple tables
with such relations between them.

The example we have taken here is extremely simplistic; in real life
Instagram might be using hundreds if not thousands of tables to model
and store its data.

While deciding or creating the data-schema may not be your
responsibility as a product manager, it is an added advantage to be
clear about what data you store, so that you know which insights you
can derive from it.

Now that we know how data is stored in these tables, let’s understand
how we can retrieve data from them to derive insights.

This is where SQL comes into play. Simply put, SQL is the high-level
language for giving commands or asking questions to your database.

Retrieving data with SQL

While you can use SQL to read, insert, update or delete data in
relational databases, as a product manager your scope of work will
almost always be limited to reading data.

Now let us understand how we can make some basic read queries to our
database through SQL.

We will take all examples from our Instagram schema.

If you want to see some data from any table you will use the popular
‘SELECT’ query.

The syntax of the query is as follows:

SELECT {column name 1}, {column name 2} FROM {table name}

If you want to retrieve all the columns you can replace the column
names with “*”.

Using * can be particularly useful for product managers when they
want to see what information is stored in an existing table.

Thus, to retrieve data from the users table in our Instagram
database, we will use:

SELECT * FROM users;
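As a minimal sketch of both forms of SELECT, the snippet below runs them against a throwaway in-memory SQLite database using Python’s built-in sqlite3 module. The sample rows are invented for illustration, and your company’s database engine and query tool will likely differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, "
    "handle TEXT, registered_at TEXT, email_id TEXT)"
)
# Made-up sample users.
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "John", "Doe", "johnd", "2021-03-05 10:00:00", "john.doe@gmail.com"),
        (2, "Ankit", "Shah", "ankit", "2021-03-31 18:30:00", "ankit@example.com"),
    ],
)

# SELECT * returns every column, which is handy for discovering what a table stores.
cursor = conn.execute("SELECT * FROM users")
column_names = [col[0] for col in cursor.description]
all_rows = cursor.fetchall()

# Naming the columns returns only those columns.
names = conn.execute("SELECT first_name, handle FROM users").fetchall()

print(column_names)
print(names)
```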

Filtering data

If you want to filter your data by some criteria, you can use the
‘WHERE’ clause. For example, suppose you want the name and date of
registration of the user whose email address is
‘john.doe@gmail.com’. In such a case you will use:

SELECT first_name, last_name, registered_at, email_id
FROM users
WHERE email_id = 'john.doe@gmail.com'
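Here is a small, hedged sketch of the same filter run from Python’s built-in sqlite3 module against an in-memory database. The sample rows are made up, and the `?` placeholder is sqlite3’s way of passing a value into a query safely; other tools may let you type the value directly into the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, "
    "registered_at TEXT, email_id TEXT)"
)
# Made-up sample users.
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?, ?)",
    [
        (1, "John", "Doe", "2021-03-05 10:00:00", "john.doe@gmail.com"),
        (2, "Ankit", "Shah", "2021-03-31 18:30:00", "ankit@example.com"),
    ],
)

# WHERE keeps only the rows whose email_id equals the given value.
row = conn.execute(
    "SELECT first_name, last_name, registered_at FROM users WHERE email_id = ?",
    ("john.doe@gmail.com",),
).fetchone()

print(row)
```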

Apart from the = condition, you can also use comparison operators
such as greater than (>), greater than or equal to (>=), less than
(<), less than or equal to (<=), or LIKE for pattern matching, to
further define your filters.

You can also combine multiple WHERE conditions to further define your
data requirements. Use ‘AND’ when every condition must be true, or
‘OR’ when it is enough for any one of the conditions to be true.

Here are some example queries to help you understand this better:

Get the list of people who registered in March 2021 (the upper bound
is written as < '2021-04-01' so that registrations made at any time
on 31 March are included)

SELECT * FROM users WHERE registered_at >= '2021-03-01' AND
registered_at < '2021-04-01'

Get the list of people whose first name is either Ankit or Ankur

SELECT * FROM users WHERE first_name = 'Ankit' OR first_name = 'Ankur'

Get the list of people who use gmail.com as their email provider (%
matches any sequence of characters)

SELECT * FROM users WHERE email_id LIKE '%@gmail.com'
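The three kinds of filter above (a date range with AND, alternatives with OR, and a LIKE pattern) can be tried out with the sketch below. It assumes an in-memory SQLite database accessed through Python’s sqlite3 module, timestamps stored as text in `YYYY-MM-DD HH:MM:SS` form, and made-up users. The date range uses an exclusive upper bound (`< '2021-04-01'`) so that a registration on the evening of 31 March is still counted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, "
    "registered_at TEXT, email_id TEXT)"
)
# Made-up sample users.
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?)",
    [
        (1, "John", "2021-03-05 10:00:00", "john.doe@gmail.com"),
        (2, "Ankit", "2021-03-31 18:30:00", "ankit@example.com"),
        (3, "Ankur", "2021-04-02 09:15:00", "ankur@gmail.com"),
        (4, "Meera", "2021-02-27 08:00:00", "meera@yahoo.com"),
    ],
)

def ids(sql):
    """Run a query that selects id and return the ids as a list."""
    return [row[0] for row in conn.execute(sql)]

# AND: both conditions must hold (registered in March 2021).
march = ids(
    "SELECT id FROM users "
    "WHERE registered_at >= '2021-03-01' AND registered_at < '2021-04-01' "
    "ORDER BY id"
)

# OR: either condition is enough.
ankit_or_ankur = ids(
    "SELECT id FROM users WHERE first_name = 'Ankit' OR first_name = 'Ankur' ORDER BY id"
)

# LIKE: % stands for any sequence of characters before '@gmail.com'.
gmail = ids("SELECT id FROM users WHERE email_id LIKE '%@gmail.com' ORDER BY id")

print(march, ankit_or_ankur, gmail)
```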

If you have a list of values to check against, you can use the
‘WHERE IN’ clause. For example, to get the list of people whose email
addresses are ‘a@gmail.com’, ‘b@gmail.com’ and ‘c@gmail.com’, you can
use the following query:

SELECT * FROM users WHERE email_id IN ('a@gmail.com',
'b@gmail.com', 'c@gmail.com')
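A quick way to see IN at work is the sketch below, which assumes an in-memory SQLite database (through Python’s sqlite3 module) and four made-up users. Only the users whose email address appears in the list are returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email_id TEXT)")
# Made-up sample users.
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "a@gmail.com"), (2, "b@gmail.com"), (3, "c@gmail.com"), (4, "d@gmail.com")],
)

# IN is shorthand for several OR conditions on the same column.
matched = conn.execute(
    "SELECT id, email_id FROM users "
    "WHERE email_id IN ('a@gmail.com', 'b@gmail.com', 'c@gmail.com') "
    "ORDER BY id"
).fetchall()

print(matched)
```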

That covers about 80% of what you need to know about filtering data.
If you are interested in learning more, please go to Tutorialspoint
to expand your SQL knowledge.

Apart from filters there are two more important things you should
know as a product manager: how to sort the data and how to limit the
amount of data you retrieve.

LIMITS & Sorting

1. You can sort the data by using the ORDER BY clause.
2. You can limit the number of rows you want to see by using the
LIMIT clause.

Here are some examples:

This query will give you the list of users, with the most recently
registered users at the top (descending order of date of
registration):

SELECT first_name, email_id, registered_at
FROM users
ORDER BY registered_at DESC

This query will give you the 10 most recently registered users:

SELECT first_name, email_id, registered_at
FROM users
ORDER BY registered_at DESC
LIMIT 10
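To check how ORDER BY and LIMIT behave together, here is a small sketch using Python’s sqlite3 module and an in-memory database. The twelve users are generated for illustration only. Note that LIMIT is the syntax used by SQLite, MySQL and PostgreSQL; some other databases spell it differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, "
    "email_id TEXT, registered_at TEXT)"
)
# Twelve made-up users, one registering on each of the first 12 days of March 2021.
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?)",
    [
        (day, f"user{day}", f"user{day}@example.com", f"2021-03-{day:02d}")
        for day in range(1, 13)
    ],
)

# Newest registrations first, and only the first 10 rows of that ordering.
recent = conn.execute(
    """
    SELECT first_name, email_id, registered_at
    FROM users
    ORDER BY registered_at DESC
    LIMIT 10
    """
).fetchall()

print(len(recent), recent[0], recent[-1])
```

Sorting happens before the limit is applied, so the 10 rows returned are the 10 newest users, not an arbitrary 10 that are then sorted.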

As a product manager you should almost always use LIMIT in your
queries. Without LIMIT the database returns every matching row,
which on a large table can slow the database down noticeably and, in
the worst case, make it unresponsive for other users.

AGGREGATE & GROUP BY functions in SQL

SQL also offers AGGREGATE functions, which help you perform
calculations on the data you retrieve. The common AGGREGATE functions
are AVG, SUM, MIN, MAX and COUNT. What they do is clear from their
names, e.g. COUNT gives you the number of rows that satisfy a
particular condition, and SUM gives you the sum of all values of a
particular column in a result-set.

Here is an example to illustrate the usage of the aggregate function
COUNT:

To get the number of likes on a particular photo on Instagram, you
will use this query:

SELECT COUNT(*) FROM likes WHERE photo_id = {photo_id}

To get the number of photos liked by a particular user (let’s say
user_id = 20) you will use a query like this:

SELECT COUNT(*) FROM likes WHERE user_id = 20

You can read more about different types of aggregate functions here
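The two COUNT queries can be tried with the sketch below. It assumes an in-memory SQLite database accessed through Python’s sqlite3 module, and the likes are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE likes (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "photo_id INTEGER, created_at TEXT)"
)
# Made-up likes as (user_id, photo_id) pairs.
conn.executemany(
    "INSERT INTO likes (user_id, photo_id) VALUES (?, ?)",
    [(20, 1), (20, 2), (21, 1), (22, 1), (20, 3), (23, 1)],
)

# COUNT(*) collapses all matching rows into a single number.
likes_on_photo_1 = conn.execute(
    "SELECT COUNT(*) FROM likes WHERE photo_id = ?", (1,)
).fetchone()[0]

photos_liked_by_user_20 = conn.execute(
    "SELECT COUNT(*) FROM likes WHERE user_id = ?", (20,)
).fetchone()[0]

print(likes_on_photo_1, photos_liked_by_user_20)
```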

Another important clause to learn in SQL is GROUP BY. With GROUP BY
you can arrange rows that share the same value into groups. When used
together with AGGREGATE functions, GROUP BY can uncover some really
helpful insights for you.

Let’s understand this with a few examples:

To know the number of photos liked by each user on Instagram we’ll
use:

SELECT user_id, COUNT(*) FROM likes GROUP BY user_id

To know the number of photos posted by each user on Instagram we’ll
use:

SELECT user_id, COUNT(*) FROM photos GROUP BY user_id
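As an illustration of GROUP BY combined with COUNT, the sketch below counts likes per user in an in-memory SQLite database (via Python’s sqlite3 module). The likes are made-up sample data, and the `like_count` alias and the ORDER BY are additions for readability, not something the queries above require.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE likes (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "photo_id INTEGER, created_at TEXT)"
)
# Made-up likes as (user_id, photo_id) pairs.
conn.executemany(
    "INSERT INTO likes (user_id, photo_id) VALUES (?, ?)",
    [(20, 1), (20, 2), (21, 1), (22, 1), (20, 3), (23, 1)],
)

# One output row per user_id, with the number of likes in that group.
likes_per_user = conn.execute(
    """
    SELECT user_id, COUNT(*) AS like_count
    FROM likes
    GROUP BY user_id
    ORDER BY like_count DESC, user_id
    """
).fetchall()

print(likes_per_user)
```

Sorting by the count puts the most active users first, which is often the first thing you want to look at.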

JOINs in SQL – Using multiple tables together

So far we have understood how you can filter, sort, limit, aggregate
and group data. Now comes the next and most interesting part of SQL:
JOINs. Joins help you connect two or more tables and unlock insights
that are spread across multiple tables.

To join two or more tables, you’ll need some common values (columns)
between them. For example, to know the names of all the people who
have liked photos, we have to join the two tables users and likes.
The common columns are the id column in the users table and the
user_id column in the likes table.

Here is what the query looks like:

SELECT first_name, last_name, photo_id
FROM users JOIN likes ON users.id = likes.user_id
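The join between users and likes can be sketched as follows, assuming an in-memory SQLite database accessed through Python’s sqlite3 module and made-up sample rows. The ORDER BY is added only so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE likes (id INTEGER PRIMARY KEY, user_id INTEGER, photo_id INTEGER);
""")
# Made-up users; Meera has not liked anything.
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "John", "Doe"), (2, "Ankit", "Shah"), (3, "Meera", "Iyer")],
)
# Made-up likes as (user_id, photo_id) pairs.
conn.executemany(
    "INSERT INTO likes (user_id, photo_id) VALUES (?, ?)",
    [(1, 10), (2, 10), (1, 11)],
)

# Each like row is paired with the user row whose id matches likes.user_id.
who_liked_what = conn.execute(
    """
    SELECT users.first_name, users.last_name, likes.photo_id
    FROM users
    JOIN likes ON users.id = likes.user_id
    ORDER BY likes.photo_id, users.id
    """
).fetchall()

print(who_liked_what)
```

Meera does not appear in the result because a plain JOIN only returns rows that have a match in both tables. This is one reason sanity-checking row counts after a join is worthwhile.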

While this definition is enough for some of the common use cases,
you should definitely spend some time learning about different types
of joins here.

You are now aware of the most common SQL queries and terms that a
product manager should know.

You now have practical knowledge of:

● How to understand a data-schema
● How to retrieve data from a relational database
● How to filter data as per your requirements with WHERE, IN and
conditional statements
● How to limit and order your data for better efficiency
● How to aggregate data with aggregate functions and use GROUP BY to
uncover important insights
● How to use simple joins to get data that is spread across multiple
tables

Where do I write all of these queries?

One pressing question that will come to the minds of some readers is
‘where do I write all of these queries to get the data I want?’. For
this you need to talk to your developers. If you are working for an
established tech company, you might already have a read-only
database exposed for analytical usage, along with a tool that gives
you a window to write custom SQL queries like the ones we have
studied above.

Some common free tools you can use to query read-only databases are:

● Redash
● Metabase
● phpMyAdmin
● MySQL Workbench
● HeidiSQL, etc.

Your developers should be able to help you set up an environment, or
you can read the documentation for these tools and work with your
developers to get it done.

[Figure: screenshot of the Metabase interface]

Some analytics tools, such as Google’s Firebase (through BigQuery)
and Mode Analytics, give you the ability to run native SQL queries on
their data-sets.

Best practices for product managers to work with SQL and native data

● Invest time in understanding your data schema: if you don’t
understand your data, your knowledge of SQL will do no good to you or
the company. You can ask an engineering manager or the CTO to help
create documentation around this, as it would be useful for other
product managers and new developers as well.
● Use LIMIT in your queries: your queries will be faster and you’ll
save resources.
● Do some sanity checks on the output of your queries, especially
when you are using joins. Take help from the data team when in doubt.
● Ensure that your data analysis is correct before presenting it to
stakeholders; presenting wrong data is a sure-shot way to lose trust.
Take help from the data team and do sanity checks again.
● Learn the common functions for working with dates in SQL. Dates
are going to be an important part of your data analysis process and
they behave differently from other data types. You can learn about
common date functions here. Also, keep in mind the time zone in which
your database records data.

Here are some questions which you can practice answering through SQL
for your product:

● Number of user registrations every month.
● How many users have been acquired from each marketing campaign.
● What is the distribution of transactions across your users.
● How many users have filled in their contact information.
● How much money/time your power users spend on the app compared to
new users.
● On which day of the week you get the most users.

There could be lots of such questions for your own product through
which you can practise your newly acquired SQL skills and build a
solid understanding of your data and customer experience.
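As one possible starting point, the first question (registrations per month) can be sketched by combining GROUP BY with a date function. The snippet assumes an in-memory SQLite database accessed through Python’s sqlite3 module, timestamps stored as text, and made-up users. `strftime` is SQLite’s date-formatting function; other databases offer different functions for the same job, so treat the exact syntax as an assumption to verify for your own database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, registered_at TEXT)")
# Made-up registration timestamps.
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [
        (1, "2021-01-15 09:00:00"),
        (2, "2021-01-20 14:30:00"),
        (3, "2021-02-03 11:00:00"),
        (4, "2021-03-05 10:00:00"),
        (5, "2021-03-06 16:45:00"),
        (6, "2021-03-31 18:30:00"),
    ],
)

# Reduce each timestamp to its year and month, then count the rows in each month.
monthly = conn.execute(
    """
    SELECT strftime('%Y-%m', registered_at) AS month, COUNT(*) AS registrations
    FROM users
    GROUP BY month
    ORDER BY month
    """
).fetchall()

print(monthly)
```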

That’s all folks!

I hope this guide will help you make better and quicker decisions as a
product manager. Please share this guide with aspiring or existing
product managers in your circle, I am sure they are going to thank you
for this!
