v.1

SQL Knowledge Levels: Beginner, Middle, Advanced

EN
In this article, you will learn what SQL operations beginner, intermediate, and advanced analysts need to know.
SQL Knowledge Levels: Beginner, Middle, Advanced
Published
15.06.2023
|
devtodev

Growing any project, one day developers will certainly need information about users, their payments and any activities that they make in the app. For the most part, these tasks can be covered by an analytics platform. But it also happens that there are not enough ready-made reports, and analysts (and sometimes developers) need access to raw data. This is where SQL comes into play. 

The query language is used to work with large databases that store all the information about users, including their payments, country, device version, and so on. 

In devtodev you will find a large number of reports ready to give you most of the data you may need. However, you can always turn to the built-in SQL Report editor and get access to any information from your project.

That is why SQL skills are essential for any analyst. As they immerse themselves in the profession, they turns to more and more complex queries. We at devtodev have prepared a list of SQL statements that are most commonly used by analysts and developers of different levels. Check it out and get ready to grow your skills!!

Beginner

SELECT is an SQL statement that accesses the database to create a selection of the columns we need in a given format.

WHERE adds additional conditions filter to the query.  

Let's see how a simple query with the described statements will look like. Here and below we will use the devtodev demo for this.

select to_char(eventtime, 'yyyy-mm') as month
from p102968.payments p
where eventtime>=current_date - interval '3 month' and eventtime < current_date
group by 1
order by 1 asc

Two basic SQL statements that help sort data are GROUP BY and ORDER BY. The GROUP BY clause sorts data by grouping it based on the columns specified in the query and is used with aggregate functions. ORDER BY allows you to organize the results alphabetically, numerically, and in ascending (ASC) or descending order (DESC).

select p.devtodevid::text as devtodevid, p.eventtime, p.product, u.campaign, u.publisher
from p102968.payments p
left join 
p102968.users u on p.devtodevi= u.devtodevid
where eventtime>current_date - interval '7 day'
order by devtodevid, eventtime

SQL ORDER BY
Query result. Screenshot from devtodev demo

JOIN is a basic database tool that is used to link tables based on a specific set of fields. This operation always uses two tables, usually called left and right.

Types of JOINs
There are four main types of JOINs in SQL

Intermediate

The HAVING statement is a pointer to the result of executing aggregate functions. It is similar to the WHERE clause in that it binds not only to the set of table columns, but also to the set of the generated GROUP BY clause.

The CASE statement is a way of handling IF/THEN logic in SQL. It is followed by at least one pair of WHEN and THEN statements (the SQL equivalent of the IF/THEN statement in Excel). 

The SQL LIKE operator is used in conjunction with the WHERE clause of the UPDATE, DELETE, and SELECT statements to filter rows based on a given pattern.

The IN statement is one of the most useful operations for filtering data in SQL. It allows you to determine whether the value of an object matches the value in the list.

The UNION operator combines data from several tables into one while leaving unique rows. If you join two tables with the same row, the duplicate will not be included in the result (distinct by default). But if you need all data, including duplicates (for example, you count the number of events), then it is better to use UNION ALL– this operator will return all rows from the joined tables.

select column_names (1...N)
from table_name
union

select column_names (1...N)
from table_name

The TOP/LIMIT clause is used to specify the number of records to return. The TOP clause is necessary when you are working with large tables. In this case, using the TOP clause may affect performance.

Read more: SQL for Beginners: How to Calculate Average Check, Transactions, ARPPU

Advanced

The WITH clause in SQL is used to simplify complex long queries (especially with JOINs and subqueries). The WITH clause, often referred to interchangeably as CTE (Common Table Expression), defines a temporary set of data which results are available for reference in subsequent queries.
OVER is an important clause for window functions. The OVER clause defines a window or user-defined set of rows in the result set of a query.

LAG/LEAD are both window functions. The LEAD function is used to access data from the next rows along with the data from the current row. The LAG function is used to access data from previous rows along with data from the current row.

PARTITION BY allows you to group rows by the value of a particular column. This statement only works with window functions such as LAG/LEAD.

select devtodevid, eventtime, item
, row_number() over(partition by devtodevid order by eventtimerow_number
, sum("_Coins") over(partition by devtodevid order by eventtimecum_coins
from p102968.purchases
where created >current_date-interval '30 day'
order by 1,2

SQL PARTITION BY
Query result. Screenshot from devtodev demo

PIVOT/UNPIVOT are important operations that help change the way data is displayed in a database so that it is easier to analyze.  PIVOT is an operator that allows you to convert query results from rows to columns. UNPIVOT is used to rotate data from columns to rows.

Read more: SQL for Beginners: How to Generate Dates and Numbers 

Read more